Advent of Code 2020 day 11

    Day 11 was, like day 8, another return to a stalwart of AoC puzzles, but this time it was the cellular automaton. It was also a good example where a bit of optimisation paid off, using GHC's profiler to identify where the problems were.

    Part 1

    I stared with a similar approach of to part 2 of 2019 day 24, where I developed the automaton directly. First off, a few data structures. A Position is a (row, column) pair; a Seat is a sum type of states, and the collection of Seats is a Map from position to what's at that position. Note that the Seats only contains the positions with a seats, and not the positions that are floor.

    type Position = (Int, Int)
    data Seat = Floor | Empty | Occupied deriving (Eq, Ord)
    type Seats = M.Map Position Seat
    instance Show Seat where
      show Floor = "."
      show Empty = "L"
      show Occupied = "#"

    Note the custom Show instance, to make the display of the grid neat.

    Reading and showing grids are just cycling over the rows and columns. I don't use showGrid in the code, but it's useful to have for debugging.

    readGrid :: String -> (Seats, Position)
    readGrid input = (seats, (maxR, maxC))
      where seats = M.fromList $ concat 
                      [ [((r, c), Empty) | (t, c) <- zip row [0..], t == 'L']
                      | (row, r) <- zip rows [0..] ]
            rows = lines input
            maxC = (length $ head rows) - 1
            maxR = (length rows) - 1
    showGrid seats (maxR, maxC) = 
      unlines $ [ concat [showSeat (r, c) | c <- [0..maxC] ] | r <- [0..maxR]]
      where showSeat here = show $ M.findWithDefault Floor here seats

    And now for the code the drive the automaton. The key part is the rule. It's applied to every seat in the sets and returns the new state of that seat.

    ruleA seats here thisSeat
      | thisSeat == Empty && nOccs == 0 = Occupied
      | thisSeat == Occupied && nOccs >= 4 = Empty
      | otherwise = thisSeat
      where nOccs = M.size $ occupiedNeighbours seats here

    It relies on counting the occupied neighbours, and that in turn means finding all the neighbours.

    neighbours (r, c) = S.delete (r, c) 
      $ S.fromList [ (r + dr, c + dc) 
                   | dr <- [-1, 0, 1], dc <- [-1, 0, 1]]
    neighbourhood seats here = M.restrictKeys seats (neighbours here)
    occupiedNeighbours seats here = M.filter (== Occupied) $ neighbourhood seats here

    To run the simulation, I iterate through all the steps and stop when the state doesn't change. I rely on GHC's optimisations to notice that the two occurrences of runSteps rule seats are the same.

    part1 seats = M.size $ M.filter (== Occupied) $ runUntilSame ruleA seats
    step rule seats = M.mapWithKey (rule seats) seats
    runSteps rule seats = iterate (step rule) seats
    seatDifferences rule seats = zip (runSteps rule seats) (tail $ runSteps rule seats)
    runUntilSame rule seats = fst $ head $ dropWhile (uncurry (/=)) $ seatDifferences rule seats

    Part 2

    This followed the same logic as part 1, but with a different rule and a different definition of the neighbourhood: a chair is a neighbour of this one if it's the closest in a given direction, not necessarily adjacent.

    One way to do that (for one direction) is to generate all the locations in that direction outwards from the starting point, and stop either when you find a seat or the edge of the grid. But that requires passing the grid size into the lowest-level neighbour-finding routines, and I couldn't be bothered to do that. Instead, I wrote a predicate onSightLine that defined whether a seat was on a particular sight line from a location:

    data Direction = Up | UpRight | Right | DownRight | Down 
      | DownLeft | Left | UpLeft
      deriving (Eq, Ord, Show, Enum)
    onSightLine :: Position -> Direction -> Position -> Bool
    onSightLine (r0, c0) Down      (r, c) = (c0 == c) && (r > r0)
    onSightLine (r0, c0) Up        (r, c) = (c0 == c) && (r < r0)
    onSightLine (r0, c0) Right     (r, c) = (r0 == r) && (c > c0)
    onSightLine (r0, c0) Left      (r, c) = (r0 == r) && (c < c0)
    onSightLine (r0, c0) DownRight (r, c) = ((r - r0) > 0) && ((r - r0) == (c - c0))
    onSightLine (r0, c0) UpLeft    (r, c) = ((r - r0) < 0) && ((r - r0) == (c - c0))
    onSightLine (r0, c0) DownLeft  (r, c) = ((r - r0) > 0) && ((r - r0) == (c0 - c))
    onSightLine (r0, c0) UpRight   (r, c) = ((r - r0) < 0) && ((r - r0) == (c0 - c))

    I could then filter all the seats known by whether they were in a particular direction and use a Manhattan distance metric to find the closest one.

    manhattan (r1, c1) (r2, c2) = abs (r1 - r2) + abs (c1 - c2)
    closestInDirection seats here direction = take 1 sortedSeats
      where seatsInDirection = filter (onSightLine here direction) $ M.keys seats
            sortedSeats = sortOn (manhattan here) seatsInDirection 
    closestInSight :: Seats -> Position -> (S.Set Position)
    closestInSight seats here = 
      S.fromList $ concatMap (closestInDirection seats here) 
                             [d | d <- [Up .. UpLeft]]

    I could then filter those seats by whether they were occupied and hence count them and update the rule.

    ruleB seats here thisSeat
      | thisSeat == Empty && nOccs == 0 = Occupied
      | thisSeat == Occupied && nOccs >= 5 = Empty
      | otherwise = thisSeat
      where nOccs = M.size $ occupiedInSight seats here
    occupiedInSight :: Seats -> Position -> Seats
    occupiedInSight seats here = M.filter (== Occupied) 
      $ M.restrictKeys seats 
      $ closestInSight seats here

    It worked, but performance was terrible: it took over half an hour to find a solution.

    Profiling and optimisation

    Running the code with profiling immediately highlighted the problem:

    COST CENTRE                         MODULE SRC                                  %time %alloc
    closestInDirection.seatsInDirection Main   src/advent11naive.hs:79:9-77          95.7   98.4
    closestInDirection.sortedSeats      Main   src/advent11naive.hs:80:9-62           1.8    1.4
    onSightLine                         Main   src/advent11naive.hs:(66,1)-(73,80)    1.3    0.0

    The problem was in the creation of seatsInDirection in the closestInDirection function. This took 95.7% of the runtime. This was a clear target for optimisation.

    The neighbourhood of each seat was the same throughout the simulation. Rather than recalculating it for every cell in every step of the simulation, I could calculate the neighbourhoods at the start of the run, and just look them up when needed. But that would require passing in yet another parameter to the simulation functions, which would get messy, especually as the parameter is read-only.

    Time for a Reader monad to hid the plumbing. And as I'm passing in the neighbourhoods, I might as well pass in the other unchanging part of each simulation, the rule.

    type Neighbourhood = M.Map Position (S.Set Position)
    type Rule = Seats -> Neighbourhood -> Position -> Seat -> Seat
    type CachedSeats a = Reader (Neighbourhood, Rule) a

    That allowed me to keep everything the same for the two parts, just creating a different environment for runSteps to work on.

    part1 seats = M.size $ M.filter (== Occupied) stableSeats
      where cachedNeighbours = allNeighbourhoods seats
            env = (cachedNeighbours, ruleA)
            stableSeats = snd $ runReader (runSteps seats) env
    part2 seats = M.size $ M.filter (== Occupied) stableSeats
      where cachedNeighbours = allSightNeighbourhoods seats
            env = (cachedNeighbours, ruleB)
            stableSeats = snd $ runReader (runSteps seats) env

    Doing the simulation also became easier, using iterateUntilM from Control.Monad.Loops.

    runSteps :: Seats -> CachedSeats (Seats, Seats)
    runSteps seats = iterateUntilM (uncurry (==)) seatChanges (M.empty, seats)
    seatChanges :: (Seats, Seats) -> CachedSeats (Seats, Seats)
    seatChanges (_, seats0) = 
      do seats <- step seats0
         return (seats0, seats)
    step :: Seats -> CachedSeats Seats
    step seats = 
      do  (nbrs, rule) <- ask
          return $ M.mapWithKey (rule seats nbrs) seats

    That dropped the time from 30 minutes to 17 seconds. Not brilliant, but must better than it was.

    Profiling the new version still showed that seatsInDirection was expensive to calculate, taking 63% of the runtime. But at least I was only doing it once!

    COST CENTRE                         MODULE    SRC                       %time %alloc
    closestInDirection.seatsInDirection Main      src/advent11.hs:105:9-77   63.4   87.5
    occupiedNeighbours                  Main      src/advent11.hs:88:1-94    32.1   10.5
    closestInDirection.sortedSeats      Main      src/advent11.hs:106:9-62    1.2    1.2


    You can find the code here or on Gitlab.

    Neil Smith

    Read more posts by this author.