### Advent of Code 2020 review

A look back on the event

Cellular automata, and being inefficient

Day 11 was, like day 8, another return to a stalwart of AoC puzzles, but this time it was the cellular automaton. It was also a good example where a bit of optimisation paid off, using GHC's profiler to identify where the problems were.

I stared with a similar approach of to part 2 of 2019 day 24, where I developed the automaton directly. First off, a few data structures. A `Position`

is a (row, column) pair; a `Seat`

is a sum type of states, and the collection of `Seats`

is a `Map`

from position to what's at that position. Note that the `Seats`

only contains the positions with a seats, and not the positions that are floor.

```
type Position = (Int, Int)
data Seat = Floor | Empty | Occupied deriving (Eq, Ord)
type Seats = M.Map Position Seat
instance Show Seat where
show Floor = "."
show Empty = "L"
show Occupied = "#"
```

Note the custom `Show`

instance, to make the display of the grid neat.

Reading and showing grids are just cycling over the rows and columns. I don't use `showGrid`

in the code, but it's useful to have for debugging.

```
readGrid :: String -> (Seats, Position)
readGrid input = (seats, (maxR, maxC))
where seats = M.fromList $ concat
[ [((r, c), Empty) | (t, c) <- zip row [0..], t == 'L']
| (row, r) <- zip rows [0..] ]
rows = lines input
maxC = (length $ head rows) - 1
maxR = (length rows) - 1
showGrid seats (maxR, maxC) =
unlines $ [ concat [showSeat (r, c) | c <- [0..maxC] ] | r <- [0..maxR]]
where showSeat here = show $ M.findWithDefault Floor here seats
```

And now for the code the drive the automaton. The key part is the `rule`

. It's applied to every seat in the sets and returns the new state of that seat.

```
ruleA seats here thisSeat
| thisSeat == Empty && nOccs == 0 = Occupied
| thisSeat == Occupied && nOccs >= 4 = Empty
| otherwise = thisSeat
where nOccs = M.size $ occupiedNeighbours seats here
```

It relies on counting the occupied neighbours, and that in turn means finding all the neighbours.

```
neighbours (r, c) = S.delete (r, c)
$ S.fromList [ (r + dr, c + dc)
| dr <- [-1, 0, 1], dc <- [-1, 0, 1]]
neighbourhood seats here = M.restrictKeys seats (neighbours here)
occupiedNeighbours seats here = M.filter (== Occupied) $ neighbourhood seats here
```

To run the simulation, I `iterate`

through all the steps and stop when the state doesn't change. I rely on GHC's optimisations to notice that the two occurrences of `runSteps rule seats`

are the same.

```
part1 seats = M.size $ M.filter (== Occupied) $ runUntilSame ruleA seats
step rule seats = M.mapWithKey (rule seats) seats
runSteps rule seats = iterate (step rule) seats
seatDifferences rule seats = zip (runSteps rule seats) (tail $ runSteps rule seats)
runUntilSame rule seats = fst $ head $ dropWhile (uncurry (/=)) $ seatDifferences rule seats
```

This followed the same logic as part 1, but with a different rule and a different definition of the neighbourhood: a chair is a neighbour of this one if it's the closest in a given direction, not necessarily adjacent.

One way to do that (for one direction) is to generate all the locations in that direction outwards from the starting point, and stop either when you find a seat or the edge of the grid. But that requires passing the grid size into the lowest-level neighbour-finding routines, and I couldn't be bothered to do that. Instead, I wrote a predicate `onSightLine`

that defined whether a seat was on a particular sight line from a location:

```
data Direction = Up | UpRight | Right | DownRight | Down
| DownLeft | Left | UpLeft
deriving (Eq, Ord, Show, Enum)
onSightLine :: Position -> Direction -> Position -> Bool
onSightLine (r0, c0) Down (r, c) = (c0 == c) && (r > r0)
onSightLine (r0, c0) Up (r, c) = (c0 == c) && (r < r0)
onSightLine (r0, c0) Right (r, c) = (r0 == r) && (c > c0)
onSightLine (r0, c0) Left (r, c) = (r0 == r) && (c < c0)
onSightLine (r0, c0) DownRight (r, c) = ((r - r0) > 0) && ((r - r0) == (c - c0))
onSightLine (r0, c0) UpLeft (r, c) = ((r - r0) < 0) && ((r - r0) == (c - c0))
onSightLine (r0, c0) DownLeft (r, c) = ((r - r0) > 0) && ((r - r0) == (c0 - c))
onSightLine (r0, c0) UpRight (r, c) = ((r - r0) < 0) && ((r - r0) == (c0 - c))
```

I could then filter all the seats known by whether they were in a particular direction and use a Manhattan distance metric to find the closest one.

```
manhattan (r1, c1) (r2, c2) = abs (r1 - r2) + abs (c1 - c2)
closestInDirection seats here direction = take 1 sortedSeats
where seatsInDirection = filter (onSightLine here direction) $ M.keys seats
sortedSeats = sortOn (manhattan here) seatsInDirection
closestInSight :: Seats -> Position -> (S.Set Position)
closestInSight seats here =
S.fromList $ concatMap (closestInDirection seats here)
[d | d <- [Up .. UpLeft]]
```

I could then filter those seats by whether they were occupied and hence count them and update the rule.

```
ruleB seats here thisSeat
| thisSeat == Empty && nOccs == 0 = Occupied
| thisSeat == Occupied && nOccs >= 5 = Empty
| otherwise = thisSeat
where nOccs = M.size $ occupiedInSight seats here
occupiedInSight :: Seats -> Position -> Seats
occupiedInSight seats here = M.filter (== Occupied)
$ M.restrictKeys seats
$ closestInSight seats here
```

It worked, but performance was terrible: it took over half an hour to find a solution.

Running the code with profiling immediately highlighted the problem:

```
COST CENTRE MODULE SRC %time %alloc
closestInDirection.seatsInDirection Main src/advent11naive.hs:79:9-77 95.7 98.4
closestInDirection.sortedSeats Main src/advent11naive.hs:80:9-62 1.8 1.4
onSightLine Main src/advent11naive.hs:(66,1)-(73,80) 1.3 0.0
```

The problem was in the creation of `seatsInDirection`

in the `closestInDirection`

function. This took 95.7% of the runtime. This was a clear target for optimisation.

The neighbourhood of each seat was the same throughout the simulation. Rather than recalculating it for every cell in every step of the simulation, I could calculate the neighbourhoods at the start of the run, and just look them up when needed. But that would require passing in yet another parameter to the simulation functions, which would get messy, especually as the parameter is read-only.

Time for a `Reader`

monad to hid the plumbing. And as I'm passing in the neighbourhoods, I might as well pass in the other unchanging part of each simulation, the rule.

```
type Neighbourhood = M.Map Position (S.Set Position)
type Rule = Seats -> Neighbourhood -> Position -> Seat -> Seat
type CachedSeats a = Reader (Neighbourhood, Rule) a
```

That allowed me to keep everything the same for the two parts, just creating a different environment for `runSteps`

to work on.

```
part1 seats = M.size $ M.filter (== Occupied) stableSeats
where cachedNeighbours = allNeighbourhoods seats
env = (cachedNeighbours, ruleA)
stableSeats = snd $ runReader (runSteps seats) env
part2 seats = M.size $ M.filter (== Occupied) stableSeats
where cachedNeighbours = allSightNeighbourhoods seats
env = (cachedNeighbours, ruleB)
stableSeats = snd $ runReader (runSteps seats) env
```

Doing the simulation also became easier, using `iterateUntilM`

from Control.Monad.Loops.

```
runSteps :: Seats -> CachedSeats (Seats, Seats)
runSteps seats = iterateUntilM (uncurry (==)) seatChanges (M.empty, seats)
seatChanges :: (Seats, Seats) -> CachedSeats (Seats, Seats)
seatChanges (_, seats0) =
do seats <- step seats0
return (seats0, seats)
step :: Seats -> CachedSeats Seats
step seats =
do (nbrs, rule) <- ask
return $ M.mapWithKey (rule seats nbrs) seats
```

That dropped the time from 30 minutes to 17 seconds. Not brilliant, but must better than it was.

Profiling the new version still showed that `seatsInDirection`

was expensive to calculate, taking 63% of the runtime. But at least I was only doing it once!

```
COST CENTRE MODULE SRC %time %alloc
closestInDirection.seatsInDirection Main src/advent11.hs:105:9-77 63.4 87.5
occupiedNeighbours Main src/advent11.hs:88:1-94 32.1 10.5
closestInDirection.sortedSeats Main src/advent11.hs:106:9-62 1.2 1.2
```

You can find the code here or on Gitlab.