Advent of Code 2024 day 9

Day 9 was the first challenge that needed a real change of representation from part 1 to part 2.

Reading the input is just opening the file and converting each character to a number.

let diskMap = fmap digitToInt text

That's the same for both parts.

Part 1

This uses a block-based representation of the disk. I keep track of the files and the free spaces. The free space is a Set of free block numbers; the files are a Map from block number to the file ID in that block.

type Disk = M.IntMap Int
type Free = S.IntSet

First, I expand the disk map into the initial block representation. I walk over the elements in the disk map, building up the disk and free as I go: that's a fold over the disk map.

Each step in the fold needs to know a few things:

is the current element of the map a file (or free space)?
what is the current block?
what is the current file ID?
what's the disk and free found so far?

That means I pass a 5-tuple around to keep track of these things. There's a lot to keep track of, but the expandMapItem function is mostly plumbing.

expand :: [Int] -> (Disk, Free)
expand diskMap = (disk, free)
  where (_, _, _, disk, free) = foldl' expandMapItem (True, 0, 0, M.empty, S.empty) diskMap

expandMapItem :: (Bool, Int, Int, Disk, Free) -> Int -> (Bool, Int, Int, Disk, Free)
expandMapItem (True, pos, fileID, disk, free) size =
  (False, pos + size, fileID + 1, disk', free)
  where
    fileExtent = take size $ zip [pos..] (repeat fileID)
    file = M.fromList fileExtent
    disk' = M.union disk file
expandMapItem (False, pos, fileID, disk, free) size =
  (True, pos + size, fileID, disk, free')
  where
    gap = S.fromList $ take size [pos..]
    free' = S.union free gap

For my own peace of mind, I have a little function that shows the layout of the disk. I can compare this with the examples in the problem description.

showDiskFree :: Disk -> Free -> String
showDiskFree disk free = showDisk disk ++ "\n" ++ showFree free

showDisk :: Disk -> String
showDisk disk = [showBlock i | i <- [0..pMax]]
  where (pMax, _) = M.findMax disk
        showBlock i = maybe '.' intToDigit $ M.lookup i disk

showFree :: Free -> String  
showFree free = [if S.member i free then '+' else '.' | i <- [0..pMax]]
  where pMax = S.findMax free

Once I've got the disk map in the block-based form I want, it's time to shuffle blocks!

First, I need to work out when to stop. The disk is packed when the highest-location file block is before the lowest-location free block.

ispackedBlock :: (Disk, Free) -> Bool
ispackedBlock (disk, free) = dMax < fMin
  where (dMax, _) = M.findMax disk
        fMin = S.findMin free

I pack one block (the highest location) by finding its location, finding the location of the smallest free block, and swapping them.

packBlocksStep :: (Disk, Free) -> (Disk, Free)
packBlocksStep (disk, free) 
  | ispackedBlock (disk, free) = (disk, free)
  | otherwise = (M.insert to fID disk1, S.insert from free1)
  where ((from, fID), disk1) = M.deleteFindMax disk
        (to, free1) = S.deleteFindMin free

I pack all the blocks by calling iterate packBlockStep (disk, free), pulling out the first arrangment that isPackedBlock.

packBlocks :: (Disk, Free) -> (Disk, Free)
packBlocks (disk, free) = head $ dropWhile (not . ispackedBlock) $ iterate packBlocksStep (disk, free)

Calculating the checksum follows the definition in the task.

checksum :: Disk -> Int
checksum disk = sum $ fmap (uncurry (*)) $ M.toAscList disk

Part 2

Part 2 operates on whole files rather than blocks, so I need a representation at that level. I settled in having a list of Region, with each Region being either Free or Used.

data Region = Free Int -- size
            | Used Int Int -- size, fileID
            deriving (Show, Eq)

type RDisk = [Region] -- key is start of region

I define a couple of "getter" utility functions.

fileID :: Region -> Int
fileID (Used _ fID) = fID
fileID _ = -1

freeSize :: Region -> Int
freeSize (Free size) = size
freeSize _ = 0

Converting the disk map to the region-based form follows a similar pattern to the previous version of expand.

expand :: [Int] -> RDisk
expand diskMap = reverse disk
  where (_, _, _, disk) = foldl' expandRegion (True, 0, 0, []) diskMap

expandRegion :: (Bool, Int, Int, RDisk) -> Int -> (Bool, Int, Int, RDisk)
expandRegion (True, pos, fID, disk) size =
  (False, pos + size, fID + 1, (Used size fID) : disk)
expandRegion (False, pos, fID, disk) 0 =
  (True, pos, fID, disk)
expandRegion (False, pos, fID, disk) size =
  (True, pos + size, fID, (Free size) : disk)

Note that I don't create zero-sized free spaces.

I also need the same block-based representation for part 1, and for calculating the checksum. That's handled by toBlocks.

toBlocks :: RDisk -> (Disk, Free)
toBlocks rdisk = (disk, free)
  where (_, disk, free) = foldl' toBlock (0, M.empty, S.empty) rdisk

toBlock :: (Int, Disk, Free) -> Region -> (Int, Disk, Free)
toBlock (pos, disk, free) (Free size) = (pos + size, disk, free')
  where
    gap = S.fromList $ take size [pos..]
    free' = S.union free gap
toBlock (pos, disk, free) (Used size fileID) = (pos + size, disk', free)
  where
    fileExtent = take size $ zip [pos..] (repeat fileID)
    file = M.fromList fileExtent
    disk' = M.union disk file

Now I've got the region-based representation, time to pack files. Following the puzzle definition, I work in decreasing order of file ID, making one attempt to pack each file.

packFiles :: RDisk -> RDisk
packFiles disk = packBelow maxID disk
  where maxID = maximum $ fmap fileID disk

packBelow :: Int -> RDisk -> RDisk
packBelow _ [] = []
packBelow 0 disk = disk
packBelow fid disk = packBelow (fid - 1) disk'
  where disk' = tidy $ packFile fid disk

packFile does the work. I need to identify two regions: the file that's moving and the free space it's moving into. Once I've got those two, I can split the list of regions into three parts: the chunk before the free space, the chunk between the free space and the file, and the chunk after the file.

From the puzzle description, you can see the chunks in the diagram below.

0099.111...2...333.44.5555.6666.777.8888..
       |^^^|                   |^^^|
 prefix            middle            suffix

Diagram showing how the regions are split

I use span to split the list into the prefix-and-middle and suffix. findFree finds the gap to put the file in, if it exists. Once I've got all the bits, I move the file and add the new Free chunks as needed.

packFile :: Int -> RDisk -> RDisk
packFile fid disk
  | isNothing gap = disk
  | otherwise = prefix ++ [Used fSize fid, Free (gapSize - fSize)] ++ mid ++ [Free fSize] ++ suffix
  where
    (prefixMid, ((Used fSize _) : suffix)) = span ((/= fid) . fileID) disk
    gap = findFree fSize prefixMid
    (prefix, Free gapSize, mid) = fromJust gap

findFree :: Int -> RDisk -> Maybe (RDisk, Region, RDisk)
findFree size disk
  | null suffix = Nothing
  | otherwise = Just (prefix, head suffix, tail suffix)
  where (prefix, suffix) = break ((>= size) . freeSize) disk

The final utility is tidy, which removes zero-sized Free regions, and combines adjacent Free regions.

tidy :: RDisk -> RDisk
tidy = foldr tidyRegion []

tidyRegion :: Region -> RDisk -> RDisk
tidyRegion (Free 0) rdisk = rdisk
tidyRegion (Free size) ((Free size1) : rdisk) = tidyRegion (Free (size + size1)) rdisk
tidyRegion region rdisk = region : rdisk

This is quite slow, but I'm doing a lot of list copying with all the splitting and joining. I'm sure it would be much faster using a Sequence, or even using another Map-based representation. But it's not so slow that I can be bothered to make the change.

Code

You can get the code from my locally-hosted Git repo, or from Codeberg.