Optimising Haskell, example 4

    This is the final optimisation post for my solutions to the Advent of Code 2022. My review of my solutions showed that day 19 was by far the slowest, taking about 20 minutes to find a result. Surely I could do better.

    Most of the speedup came from reframing the problem, but there's some actual low-level Haskell optimisation at the end!

    Factory, not time

    My original solution was an A* search, stepping forward in time. At each minute of the simulation, I generated a state of what could happen in the next minute. The A* approach guaranteed that the first solution found would be the optimal one, but the heuristic I was using (assuming a geode-cracking robot could be built each minute after the current time) vastly overestimated the value of early states, leading to what wasn't far off a breadth-first search of the space.

    No better heuristic immediately came to mind, so I instead decided to change how I represented the problem. My first attempt revolved around time; instead, I would recast the problem around the actions of the robot-making factory.

    At any given time, there are options for what the factory can do. For each type of robot, one of three cases applies:

    1. If I have enough of that type of robot, do nothing.
    2. If I have enough resources on hand, make that robot now.
    3. If I don't have the resources, but will have them if I wait for a bit, then wait until I have the resources and build the robot.

    (These choices are mutually exclusive for each robot type.)

    Options 2 and 3 collapse to the same thing if I recast it as "wait until I've got enough resources, then build" on the understanding that the wait may be zero minutes.

    There's a global fourth action, which is just to run out the clock and have the existing robots gather what they will. This option will handle the last fragment of time for each run.
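    The resulting action set can be sketched as a small data type (a standalone sketch; these names are illustrative, not the post's actual code):

```haskell
-- Illustrative sketch of the factory's choices; names are hypothetical.
data Resource = Ore | Clay | Obsidian | Geode
  deriving (Show, Eq, Enum, Bounded)

-- Options 2 and 3 collapse into WaitThenBuild (a wait of zero or more
-- minutes, then build); RunOutClock is the global fourth action.
data Action = WaitThenBuild Resource | RunOutClock
  deriving Show

-- Every candidate action from a state: one per robot type, plus the clock.
allActions :: [Action]
allActions = map WaitThenBuild [minBound .. maxBound] ++ [RunOutClock]

main :: IO ()
main = mapM_ print allActions
```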

    When waiting to build a robot, I can calculate how much time I need to wait. That suggests I should store the current time in the SearchState:

    data SearchState = SearchState
        { _resources :: Collection
        , _robots :: Collection
        , _currentTime :: Int
        } deriving (Eq, Show, Ord)
    makeLenses ''SearchState

    Next is a function that handles the above decision making, for a particular robot at a particular state. Note that this returns a Maybe SearchState, as there may be nothing to do for this type of robot. maxRobots is a collection of the maximum quantity needed of each type of robot; recipe is the requirements for this type of robot.

    handleRobot :: SearchState -> Collection -> Int -> Resource -> Collection -> Maybe SearchState
    handleRobot state maxRobots timeLimit robot recipe
      | sufficientRobots robot state maxRobots = Nothing
      | otherwise = buildWhenReady robot state recipe timeLimit

    I then wrote little functions to do each of the steps.

    -- do I already have enough of this robot?
    sufficientRobots :: Resource -> SearchState -> Collection -> Bool
    sufficientRobots robot state maxRobots = 
        (robot `MS.member` maxRobots)
        && 
        ((MS.occur robot (state ^. robots)) >= (MS.occur robot maxRobots))

    The case of delaying for a while until building a robot is a little more convoluted. This case is viable if, for each resource I'm short of, I have at least one robot that can gather that resource. In that case, I calculate the delay needed to get enough of each resource, and the final delay is the largest of them. (If I have the resources, there is no shortfall, so the final delay is zero; hence maximum0.) If waiting will never generate enough resources (because a particular type of robot doesn't exist yet), return Nothing.

    -- assuming I can't build this robot, how long do I have to wait for the current
    -- robots to gather enough to build it?
    buildDelay :: SearchState -> Collection -> Maybe Int
    buildDelay state recipe 
      | all (\r -> MS.member r rbts) (MS.distinctElems shortfall) = Just $ maximum0 $ fmap snd $ MS.toOccurList delay
      | otherwise = Nothing
      where shortfall = recipe `MS.difference` (state ^. resources)
            delay = MS.foldOccur calcOneDelay MS.empty shortfall
            rbts = state ^. robots
            calcOneDelay resource count acc = 
              MS.insertMany resource 
                            (ceiling $ (fromIntegral count) / (fromIntegral $ MS.occur resource rbts))
                            acc
            maximum0 xs = if (null xs) then 0 else maximum xs
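    As a standalone check of the ceiling division at the heart of buildDelay (plain Ints rather than Multisets; minutesToGather is an illustrative name):

```haskell
-- Minutes for `robots` gatherers to close a shortfall of `needed` resources,
-- rounding up: being 7 ore short with 2 ore robots means a 4-minute wait.
minutesToGather :: Int -> Int -> Int
minutesToGather needed robots =
  ceiling (fromIntegral needed / fromIntegral robots :: Double)

main :: IO ()
main = print [minutesToGather 7 2, minutesToGather 6 2, minutesToGather 0 3]
-- [4,3,0]
```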

    Using that, updating the state for that delay is a little simpler, but it still has to handle a delay of Nothing and a delay that would push past the allowed time limit. That's easier in the Maybe monad.

    buildWhenReady :: Resource -> SearchState -> Collection -> Int -> Maybe SearchState
    buildWhenReady robot state recipe timeLimit =
      do  waitDelay <- buildDelay state recipe
          delay <- tooLate (state ^. currentTime) (waitDelay + 1) timeLimit
          let gathered = MS.foldOccur (\res n acc -> MS.insertMany res (n * delay) acc) 
                                    MS.empty 
                                    (state ^. robots)
          return (state & robots %~ MS.insert robot
                        & resources %~ (MS.union gathered)
                        & resources %~ ( `MS.difference` recipe )
                        & currentTime %~ (+ delay)
                 )
    
    tooLate :: Int -> Int -> Int -> Maybe Int
    tooLate current delay timeLimit
      | (current + delay) <= timeLimit = Just delay
      | otherwise = Nothing

    The body of buildWhenReady demonstrates how lenses can make this kind of update much easier.

    The per-robot-type processing is called from successors, using catMaybes to retain just the valid successors. It also handles the "wait until the end" case.

    successors :: SearchState -> BlueprintContext (Q.Seq SearchState)
    successors state = 
      do blueprint <- asks getBlueprint
         maxRobots <- asks getMaxRobots
         timeLimit <- asks getTimeLimit
    
         let robotSuccessors = Q.fromList $ catMaybes $ M.elems $ M.mapWithKey (handleRobot state maxRobots timeLimit) blueprint
         
         let timeRemaining = timeLimit - (state ^. currentTime)
         let gathered = MS.foldOccur (\res n acc -> MS.insertMany res (n * timeRemaining) acc) 
                                    MS.empty 
                                    (state ^. robots)
         let delayUntilEnd = (state & currentTime .~ timeLimit
                                    & resources %~ (MS.union gathered)
                             )
         return ( robotSuccessors |> delayUntilEnd )

    With that, finding the overall solution still uses the A* search from before.

    The final change was to use parallel processing properly, and explore the different blueprints in parallel. This speeds up part 1, where I process all 30 blueprints, but has less effect on part 2, where I only process three.

    part1, part2 :: [(Int, Blueprint)] -> Int
    part1 blueprints = sum [n * (MS.occur Geode (r ^. resources)) | (n, r) <- results]
      where results = parMap rdeepseq (scoreBlueprint 24) blueprints
    
    part2 blueprints = product [MS.occur Geode (r ^. resources) | (_, r) <- results]
        where results = parMap rdeepseq (scoreBlueprint 32) $ take 3 blueprints

    Performance

    How well did this work?

    Program        Wall time   Memory used (kb)
    Original        15:09.53          1,562,024
    Factory-based    0:14.02          1,031,664

    Enormously well. The time went from 15-20 minutes to about 14 seconds. But it can go quicker.

    Actual Haskell optimisation

    Profiling the program showed something interesting. About two-thirds of the runtime was spent in the compare function defined for the SearchState record. This comes from the Data.Set library, which uses comparisons to store elements efficiently. It's used when checking whether the current SearchState is in the closed set. As a SearchState contains a couple of Multisets, each comparison takes time.

    I can reduce this by having a hashed representation of the SearchState that's good enough for checking equality, even if it's not so useful in the rest of the program.

    I could manually redefine the Eq and Ord functions for SearchState, but that would require that every element of the closed set be rehashed for every membership check; that's likely to get expensive. An alternative is to store only the hashes in the closed set, and hash the new member when I want to check for membership.

    What should the hash be? Integers are fast, so let's use them. I want to hash a Multiset to an integer. That will allow me to hash a SearchState into a 3-tuple of the hashes of the resources and robots fields, together with the currentTime.

    I can quickly hash a Multiset by adding the amounts of each element, scaling each one so that different Multisets will give different values. A quick inspection of some traces shows that we rarely get to over 100 of any resource, so let's be safe and use 200 as a scaling constant. I derive the Enum and Bounded classes for Resource to make it easy to enumerate all the resource types.

    data Resource = Ore | Clay | Obsidian | Geode
        deriving (Show, Eq, Ord, Generic, Enum, Bounded)
    
    type Collection = MS.MultiSet Resource
    
    hashCollection :: Collection -> Int
    hashCollection c = 
        let k = 200 
        in sum $ zipWith (\r n -> (MS.occur r c) * n) [minBound .. maxBound] $ iterate (* k) 1
    
    type StateHash = (Int, Int, Int)
    hashSearchState :: SearchState -> StateHash
    hashSearchState s = (hashCollection (s ^. resources), hashCollection (s ^. robots), s ^. currentTime)
    
    type ExploredStates = S.Set StateHash
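    To sanity-check the scheme, here's the same hash computed over a plain association list standing in for the Multiset (a standalone sketch; occur and hashAssoc are my stand-in names for MS.occur and hashCollection):

```haskell
import Data.Maybe (fromMaybe)

data Resource = Ore | Clay | Obsidian | Geode
  deriving (Show, Eq, Enum, Bounded)

-- Association-list stand-in for MS.occur.
occur :: Resource -> [(Resource, Int)] -> Int
occur r = fromMaybe 0 . lookup r

-- Same scheme as hashCollection: scale each count by a power of 200.
hashAssoc :: [(Resource, Int)] -> Int
hashAssoc c =
  sum $ zipWith (\r n -> occur r c * n) [minBound .. maxBound] (iterate (* 200) 1)

main :: IO ()
main = print (hashAssoc [(Ore, 3), (Clay, 2)])  -- 3*1 + 2*200 = 403
```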

    All that's left is to call hashSearchState the couple of times I examine the closed set, in aStar and candidates.
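    The membership check itself then looks something like this (a minimal sketch; isExplored is my name, not the post's):

```haskell
import qualified Data.Set as S

type StateHash = (Int, Int, Int)
type ExploredStates = S.Set StateHash

-- The closed set stores only hashes: a candidate state is hashed once per
-- membership test, and existing members are never rehashed.
isExplored :: StateHash -> ExploredStates -> Bool
isExplored = S.member

main :: IO ()
main = do
  let closed = S.fromList [(403, 2, 5), (3, 1, 0)] :: ExploredStates
  print (isExplored (403, 2, 5) closed)  -- True
  print (isExplored (0, 0, 0) closed)    -- False
```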

    Performance

    This step makes another large improvement.

    Program                  Wall time   Memory used (kb)
    Original                  15:09.53          1,562,024
    Factory-based              0:14.02          1,031,664
    Hashing                    0:06.77          1,073,516
    Hashing, single-thread     0:13.43            508,552

    The version with the hashed closed set is twice as quick as before, taking less than seven seconds to complete.

    Out of curiosity, I removed the parallel processing of the blueprints. This only doubled the runtime, but halved the memory consumption. It seems there is overhead in running in parallel!

    Code

    You can get the code from my locally-hosted Git repo, or from Gitlab.

    Neil Smith
