Optimising Haskell: quick wins

A couple of quick optimisations that greatly sped up two of the three slow solutions for Advent of Code 2024. (See also my writeup of optimising the Day 12 solution.)

Day 16: using the right queue type

Speeding up the day 16 solution was so simple, it's barely worth mentioning. Quickly profiling the code showed that all the time was spent sorting the agenda of items in the search function.

bfs ::  Agenda -> Closed -> [Agendum] -> MazeContext [Agendum]
bfs [] _ founds = return founds
bfs (currentAgendum : restAgenda) closed founds = 
  do  let reached = currentAgendum.current
      nexts <- candidates currentAgendum closed
      let newAgenda = if viable currentAgendum founds
                      then {-# SCC agendaSort #-} sortOn cost $ restAgenda ++ nexts
                      else restAgenda
      reachedGoal <- isGoal reached
      let founds' = if reachedGoal
                    then updateFounds currentAgendum founds
                    else founds
      bfs newAgenda (S.insert reached closed) founds'

Rather than sorting a whole list to find the smallest, I decided to use a priority queue instead.

I imported the PQueue package and made a couple of changes. The main one was to make bfs use the queue, using the exported patterns for pattern matching.

type Agenda = P.MinPQueue Int Agendum

bfs ::  Agenda -> Closed -> [Agendum] -> MazeContext [Agendum]
bfs Empty _ founds = return founds
bfs ((_, currentAgendum) :< restAgenda) closed founds = 
  do  let reached = currentAgendum.current
      nexts <- candidates currentAgendum closed
      let newAgenda = if viable currentAgendum founds
                      then P.union restAgenda nexts
                      else Empty
      reachedGoal <- isGoal reached
      let founds' = if reachedGoal
                    then updateFounds currentAgendum founds
                    else founds
      bfs newAgenda (S.insert reached closed) founds'

While I was there, I put an extra line in candidates to make it return a queue rather than a list.

candidates :: Agendum -> Closed -> MazeContext Agenda
candidates agendum closed = 
  do  let here = agendum.current
      succs <- successors here
      let viableSuccs = filter (\(r, _) -> not $ r `S.member` closed) succs
      cands <- mapM (makeAgendum agendum.trail agendum.cost) viableSuccs 
      return $ P.fromList $ fmap (\a -> (cost a, a)) cands

That one change took the time from 27 seconds to 0.3 seconds, a speed up of about ninety times.

Day 18: doing less

My day 18 solution was correct, but just did too much work. The goal of the problem is to walk across a grid while obstacles fall from the sky onto it. I have to find which obstacle finally makes the trip impossible.

The first thing I did was tidying up, where I moved the repeated call to aStar into a separate function.

part1 :: [Position] -> Int
part1 bytes = fst $ fromJust $ search $ take 1024 bytes

part2 :: [Position] -> String
part2 bytes = showResult $ head $ snd $ head results
  where 
    (goods, poss) = splitAt 1024 bytes
    results = dropWhile ((== True) . fst) $ scanl' go (True, goods) poss
    go (_, acc) byte = (escapePossible (byte : acc), (byte : acc))
    showResult (V2 x y) = show x ++ "," ++ show y

escapePossible :: [Position] -> Bool
escapePossible = isJust . search 

search :: [Position] -> Maybe (Int, [Explorer])
search bytes = aStar (neighbours memory) 
                      (transitionCost)
                      (estimateCost memory) 
                      (isGoal memory) 
                      (initial memory)
  where memory = Memory (S.fromList bytes) (fst memoryBounds) (snd memoryBounds)

My initial solution used scanl' to try all the inputs in a linear search. That was a lot of repeated work.

Finding blockers

A better solution, as described in the Reddit community, was to find the shortest path across the grid and wait until an obstacle falls on that path. Only then should I test to see if there is still a valid path.

That prompted me to rewrite part 2 as a foldl', maintaining a state of what was needed. The state should contain the current path and the obstacles dropped so far. findBlocker handled the updates of the state, in three conditions:

If there is already no path, do nothing.
If this obstacle falls on the current path, look for another path
Otherwise, remember this obstacle.

part2 :: [Position] -> String
part2 bytes = showResult blocker
  where 
    (_, (blocker : _)) = foldl' findBlocker (snd <$> search [], []) bytes
    showResult (V2 x y) = show x ++ "," ++ show y

findBlocker :: (Maybe [Position], [Position]) -> Position -> (Maybe [Position], [Position])
findBlocker (Nothing, dropped) _ = (Nothing, dropped)
findBlocker (Just path, dropped) byte 
  | byte `elem` path = (path', dropped')
  | otherwise = (Just path, dropped')
  where path' = snd <$> search dropped'
        dropped' = byte : dropped

That moved the runtime from nineteen seconds to less than one second.

Version	Runtime	Fraction of original
`Original`	18.74	1.00
`findBlocker`	0.91	0.049

Binary search

The more generic approach is a binary search. I know that dropping 1024 obstacles doesn't block the route, but dropping all of them does. A binary search would quickly allow me to find the obstacle that closes off all routes.

In binarySearch, the lower bound is known to have a route, while the upper bound is known to have no routes. I stop when the two are adjacent, then correct the off-by-one error between take and !!.

part2 :: [Position] -> String
part2 bytes = showResult (bytes !! (upper - 1))
  where 
    upper = binarySearch bytes 1024 (length bytes)
    showResult (V2 x y) = show x ++ "," ++ show y

binarySearch :: [Position] -> Int -> Int -> Int
binarySearch bytes lower upper 
  | lower + 1 == upper = upper
  | escapePossible dropped = binarySearch bytes mid upper
  | otherwise = binarySearch bytes lower mid
  where
    mid = (lower + upper) `div` 2
    dropped = take mid bytes

That drops the runtime down to less than a quarter of a second, another ninety-fold increase in speed.

Version	Runtime	Fraction of original
`Original`	18.74	1.00
`findBlocker`	0.91	0.049
`binarySearch`	0.22	0.012

Code

You can get the code from my locally-hosted Git repo, or from Codeberg.