# Advent of Code 2022 day 7

I didn't expect to be using zippers as early as day 7! But at least I got to learn about the standard Data.Tree and Data.Tree.Zipper libraries.

A zipper is a data structure with the notion of a "current element". It's easy to inspect and alter the current element, and to move in relative steps around the structure. For instance, a list may have the focus on a particular element and allow you to move left and right to adjacent positions.

I used this to navigate around the directory tree, using the cd commands to change the focus.

## Data types

First, I defined a type to hold the results of the parsing, with each line parsing into some object. I also needed types to hold the nodes in the trees I was generating (one for the full directory structure, one for just the sizes of directories). I also needed types for the various trees.

data ParsedObject = CD String
| LS
| PDirectory String
| PFile Int String
deriving Show

data Directory = Dir String (M.Map String Int)
deriving (Show, Eq)

data ContainedSize = CSize String Integer
deriving (Show, Eq)

type DTree = Tree Directory
type ZDTree = TreePos Full Directory
type STree = Tree ContainedSize

A Directory holds the name of the directory and the files it holds, stored as a Map of file names to sizes.

The Data.Tree library uses  the Node constructor to hold the details of each node in the tree (a rose tree), where each node also holds a list of subtrees, like this empty tree.

emptyTree :: DTree
emptyTree = Node {rootLabel = (Dir "/" M.empty), subForest = []}

## Parsing

There wasn't much to this. Each line was parsed into a separate object, and parsing returned a list of them. The only wrinkle was defining a parser for names, being a sequence of non-space characters.

traceP = lineP sepBy endOfLine

lineP = cdP <|> lsP <|> directoryP <|> fileP
cdP = CD <$> ("$ cd " *> nameP)
lsP = LS <$"$ ls"
fileP = PFile <$> (decimal <* " ") <*> nameP directoryP = PDirectory <$> ("dir " *> nameP)

nameP = many1 letterP
letterP = satisfy (not . isSpace)

## Building the tree

This was where the bulk of the effort went. mkTree handles the conversion to and from zipped trees and makeTree does the work of assembling the tree.

mkTree :: [ParsedObject] -> DTree -> DTree
mkTree trace tree = toTree $root$ makeTree trace $fromTree tree makeTree :: [ParsedObject] -> ZDTree -> ZDTree makeTree trace tree = foldl' processCommand tree trace ProcessCommand takes each element from the command trace and updates the tree accordingly. processCommand :: ZDTree -> ParsedObject -> ZDTree processCommand tree (CD name) | name == "/" = root tree | name == ".." = fromJust$ parent tree
| otherwise = fromJust $childWithName name tree processCommand tree LS = tree processCommand tree (PFile size name) = modifyLabel (\ (Dir n fs) -> Dir n (M.insert name size fs)) tree processCommand tree (PDirectory name) = if (isJust$ childWithName name tree)
then tree
else fromJust $parent$ insert (Node { rootLabel = (Dir name M.empty)
, subForest = []
}) $children tree Processing cd commands moves up and down the tree, moving into parents or children. Finding out about a new file means I update the file listing of the current node. Finding out about a new directory means checking if the directory exists. If it does, leave it. If it doesn't, create a new empty directory node. (This moves the focus to the new node, so I use parent to return to the current directory.) This relies on the utility function childWithName that searches for the directory with a given name, while the library uses finding by position. It starts with the first child and keeps taking the next one until it finds the directory with the right name. childWithName :: String -> ZDTree -> Maybe ZDTree childWithName name tree = searchForChild name (firstChild tree) searchForChild :: String -> Maybe ZDTree -> Maybe ZDTree searchForChild _name Nothing = Nothing searchForChild name (Just tree) | name == labelName = Just tree | otherwise = searchForChild name (next tree) where (Dir labelName _) = label tree ## Solving the puzzle Now I have the directory tree, actually solving the puzzle is easy! containingSizes finds the sizes of the files directly in each directory. It adds up the file sizes in the Map then fmaps the same function to all the subtrees. transitiveSizes takes those sized directories and adds up the sizes of the subdirectories. containingSizes :: DTree -> STree containingSizes (Node {rootLabel = (Dir name files), subForest = sf}) = (Node {rootLabel = (CSize name sizeHere), subForest = sizedTrees}) where sizeHere = M.foldl (+) 0$ M.map fromIntegral files
sizedTrees = fmap containingSizes sf

transitiveSizes :: STree -> STree
transitiveSizes (Node {rootLabel = (CSize name sizeHere), subForest = sf}) =
(Node {rootLabel = (CSize name (sizeHere + subSizes)), subForest = sizedTrees })
where sizedTrees = fmap transitiveSizes sf
subSizes = sum $fmap (extractCSize . rootLabel) sizedTrees extractCSize, cancelLarge :: ContainedSize -> Integer extractCSize (CSize _ s) = s Part 1 throws away the large sizes and adds up the rest. Part 2 starts by flattening the lovely tree to just a list of nodes, then does some arithmetic to find how much space is needed, then finds the smallest directory larger than that size. part1, part2 :: STree -> Integer part1 = foldTree (\x xs -> sum (x:xs)) . fmap cancelLarge cancelLarge (CSize _ s) | s <= reportingThreshold = s | otherwise = 0 part2 tree = spaceFreed where nodes = fmap extractCSize$ flatten tree
spaceUsed = extractCSize $rootLabel tree spaceUnused = spaceAvailable - spaceUsed spaceToFree = spaceRequired - spaceUnused viableNodes = filter (>= spaceToFree) nodes spaceFreed = head$ sort viableNodes

And that's it! Over-engineered, but thorough.

## Code

You can get the code from my locally-hosted Git repo, or from Gitlab.