The day 23 solution takes too long (around seven minutes on my machine) so this post shows how a different representation can make a big difference in performance.
Faster sets
I originally decided to represent the set of moving elves as a Set
, where each Elf
knew its position. The problem is that the standard Set
implementation, in the containers
package, isn't the fastest in the world. (Internally, it uses tree structures to store the sets.)
The unordered-containers
package has another implementation of Set
, called HashSet
. This package is based on hastables of elements and is stresses performance of the data structures.
The good news is that the interface to HashSet
s is just about the same as Set
s, so very little had to change in the code. The main difference was that conversion between a HashSet
and a MultiSet
now had to go via an intermediate List
. You can see the code as advent23/MainUnordered.hs
in the repos linked below.
The bad news is that the performance didn't improve much, shaving only a few seconds off the six-minute runtime.
Related to the Set
is the Map
data structure. This allows faster lookup of the elves in a particular situation, but at the cost of a more involved program. Over on Reddit, /u/ephrion/ did a great writeup of this approach. The upshot is that the extra work yielded less benefit than the approach I describe below.
Another approach is needed.
Array, not Set
The representation elves as a Set
has the advantage of easy bounds-maintaining: as the elves moved apart, I could just change the elements of the Set
of elves and not have to keep track of the range of their positions.
This was easy to implement, but led to very slow performance. Each update to each elf had to find the elements of the set that were nearby.
A better representation of the elves would be an Array
, which gives fast, constant-time access to each position in the grid. The downside is that I'd have to check and possibly change the bounds of the array each generation.
Array representation and interface
As I was making lots of updates to the array of elves each simulation round, I thought that using mutable arrays would be the best bet. That should avoid lots of copying of immutable arrays, even if the compiler was able to optimise most of it away.
The (mutable) array interface is split across a few libraries: Data.Array.MArray
is the immutable array interface, and Data.Array.ST
wraps them in an ST
monad, which itself is provided by Control.Monad.ST
. The point of the ST
monad is that it keeps all the impure mutating code behind a fence, where it can't infect the pure functional code.
While I was updating the elf positions, I decided to also track how many elves were proposing to move to a particular space. That would make the identification of clashes easier.
Data structures
That led to these new data structures. The Population
is an array, indexed by position. The cell holds Nothing
if it is vacant, and Just Elf
if it is occupied.
The Elf
type becomes a newtype
, wrapping a Position
. This stores where the Elf
is moving to in the next round. Elves that stay still with contain their current position.
The mutable arrays contain the reference to the ST
monad they were in, a dummy type parameter to keep things hygienic.
newtype Elf = Elf Position
deriving (Eq, Ord, Show)
type Population = A.Array Position (Maybe Elf)
type MPopulation s = STArray s Position (Maybe Elf)
type MClashCounts s = STArray s Position Int
I decided to maintain a border of at least one unoccupied cell around the elves. That meant each elf could check where it was going without having to worry about bounds checking.
Using arrays
The top-level change was localised to the simulateOnce
function, that did one round of the simulation. This changed from
simulateOnce =
do proposeMoves
removeClashes
moveElves
updateDirections
updateCount
to
simulateOnce =
do updateGrove
growGrove
updateDirections
updateCount
where updateGrove
handles all the array manipulation and growGrove
updates the size of the array after each round.
Below that level, a lot of things changed!
updateGrove
uses the runSTArray
function to handle the conversion of the Array
of elves into a mutable one, the updates, then the conversion back to the immutable array.
updateGrove :: GroveState ()
updateGrove =
do grove <- gets currentGrove
proposalsInf <- gets proposalDirections
let proposals = take 4 proposalsInf
let newGrove =
runSTArray $
do mPopulation <- M.thaw grove
mCounts <- M.mapArray (const 0) mPopulation
proposeMoves mPopulation mCounts proposals
removeClashes mPopulation mCounts
moveElves mPopulation
return mPopulation
modify' (\g -> g { currentGrove = newGrove})
It maintains two arrays: one of Maybe Elf
, showing where the elves are; and one of Int
, showing how many elves will move into a particular position. proposeMoves
updates both of these as each elf decides where to move.
The functions below here aren't that interesting to explain (you can see them yourself). The main consequence is that some of the code has to be split to have explicit intermediate results, if different parts do and don't use the arrays. For instance, the noElves
function returns whether there are no elves in a set of locations. The Set
-based version looks like this:
noElves :: Population -> S.Set Position -> Bool
noElves elves tests = S.null $ S.intersection tests $ S.map _current elves
but the Array
-based version looks like this:
noElves :: MPopulation s -> [Position] -> ST s Bool
noElves elves tests =
do others <- mapM (M.readArray elves) tests
return $ all isNothing others
Performance
Using Array
s reduces the runtime from seven minutes to seven seconds, but at the cost of now using 18Mb of RAM rather than the original 14Mb.
Program | Wall time | Memory used (kb) |
---|---|---|
Original | 6:10.18 | 13,408 |
Hash sets | 6:00.52 | 14,208 |
Arrays | 0:06:49 | 17,604 |
The main downside of this approach is the change in programming style it mandates. This program is fundamentally about the mutation of a shared piece of state (the grove of elves). As such, it dictates the shape of the program that does that mutation. The code that solves this problem has a very imperative style, where I've outlined the steps to take and the order to take them, rather than expressing what I want done and allowing results to come together as needed.
Code
You can get the code from my locally-hosted Git repo, or from Gitlab.