6 December 2019 Tagged in: advent of code | haskell

Advent of Code 2019 day 5

Monad stacks, and debugging when things go wrong

Advent of Code 2019 day 5

Day 5 was the expected return to the Intcode virtual machine introduced on Day 2 and, as predicted, it involved taking a lot of code from 2018 day 21.

The monad stack

This version of the Intcode computer has input and output, as well as the state of the machine's memory. I could fiddle around with including the input and output inside the Machine record, but I think its cleaner to keep these concerns separate. That means the ProgrammedMachine type is a combination of three monads: the State monad handles the updates to the machine's memory; the Writer monad collects all the output; and the Reader monad holds all the machine's input. The only change to the machine is have it include the _inputIndex for how far the machine has got through the input.

data Machine = Machine { _memory :: Memory
                       , _ip :: Int
                       , _inputIndex :: Int
                       } 
               deriving (Show, Eq)

type ProgrammedMachine = WriterT [Int] (ReaderT ([Int]) (State Machine)) ()

Executing this thing and extracting the desired result is a gnarly mess of indentation and brackets; something not for the easily-shocked.

findMachineOutput inputs program = last output
    where   finalStack = 
                runState (
                    runReaderT (
                         runWriterT runAll
                               ) 
                         inputs
                         ) 
                         (makeMachine program)
            ((_retval, output), _machine) = finalStack

(If you want to know more about monad transformer stacks, there's a good chapter on them in Real World Haskell and gentle introduction on Monday Morning Haskell.)

Update: using the RWS monad

This combination of reader-writer-state monads is a common one, and exists in the standard library to make things easier (as I discovered while reading brandonchinn's code). Using the standard one involves an extra import line and gives a much neater invocation of the combined monad stack. Rather than the above, the code is now:

type ProgrammedMachine = RWS [Int] [Int] Machine ()

findMachineOutput :: [Int] -> [Int] -> Int
findMachineOutput inputs program = last output
    where (_machine, output) = execRWS runAll inputs (makeMachine program)

You can see this update in advent05rws.hs.

Doing input, process, and output

The three monads for input, processing, output lead naturally to three functions for dealing with them: fetchInput gets any input for this operation, perform updates the machine memory, and putOutput handles any output.

runStep :: ProgrammedMachine
runStep = 
    do mem <- gets _memory
       ip <- gets _ip
       let opcode = (mem!ip) `mod` 100
       let modes = parameterModes  ((mem!ip) `div` 100)
       fetchInput opcode
       putOutput opcode modes
       mem' <- gets _memory
       let (mem'', ip') = perform opcode ip modes mem'
       modify (\m -> m {_ip = ip', _memory = mem''})

(Note that this is a bit tidier than before, due to the use of gets and modify rather than bald get and put.)

In turn, fetchInput and putOutput only act on their respective operations, as defined by pattern matching in the opcode. ask reads from the Reader; tell writes to the Writer.

fetchInput 3 =
    do mem <- gets _memory
       ip <- gets _ip
       inputIndex <- gets _inputIndex
       inputs <- ask 
       let mem' = iInsert (ip + 1) (inputs!!inputIndex) mem
       modify (\m -> m {_inputIndex = inputIndex + 1, _memory = mem'})
fetchInput _ = return ()

putOutput 4 modes =
    do mem <- gets _memory
       ip <- gets _ip
       let v = getMemoryValue (ip + 1) (modes!!0) mem
       tell [v]
putOutput _ _ = return ()     

For part 2, the comparison and jump operations live purely in perform.

perform 1 ip modes mem = (iInsert (ip + 3) (a + b) mem, ip + 4)
    where a = getMemoryValue (ip + 1) (modes!!0) mem
          b = getMemoryValue (ip + 2) (modes!!1) mem
perform 2 ip modes mem = (iInsert (ip + 3) (a * b) mem, ip + 4)
    where a = getMemoryValue (ip + 1) (modes!!0) mem
          b = getMemoryValue (ip + 2) (modes!!1) mem
perform 3 ip _ mem = (mem, ip + 2)
perform 4 ip _ mem = (mem, ip + 2)
perform 5 ip modes mem = (mem, ip')
    where a = getMemoryValue (ip + 1) (modes!!0) mem
          b = getMemoryValue (ip + 2) (modes!!1) mem
          ip' = if a /= 0 then b else ip + 3
perform 6 ip modes mem = (mem, ip')
    where a = getMemoryValue (ip + 1) (modes!!0) mem
          b = getMemoryValue (ip + 2) (modes!!1) mem
          ip' = if a == 0 then b else ip + 3
perform 7 ip modes mem = (iInsert (ip + 3) res mem, ip + 4)
    where a = getMemoryValue (ip + 1) (modes!!0) mem
          b = getMemoryValue (ip + 2) (modes!!1) mem
          res = if a < b then 1 else 0
perform 8 ip modes mem = (iInsert (ip + 3) res mem, ip + 4)
    where a = getMemoryValue (ip + 1) (modes!!0) mem
          b = getMemoryValue (ip + 2) (modes!!1) mem
          res = if a == b then 1 else 0
perform _ ip _ mem = (mem, ip)
  • Operations 1 and 2 are the same as before.
  • Operations 3 and 4 do nothing but move the instruction pointer (ip) the correct amount.
  • Operations 5 and 6 change the ip depending on the test, setting it to the parameter value if the test passes, or updating it to the next instruction if it fails.
  • Operations 7 and 8 write values depending on the test they perform.

Parameter modes

The first part of the problem also introduced parameter modes: whether operand values were taken as literal numbers or as addresses of where to find the value.

My solution for handling these is a bit over-engineered for what's needed at the moment. But I fully expect to come back to this machine later in the Advent of Code, so I want something that will have some flexibility to expand as the requirements extend.

Parameter modes are a thing, so I promote them to being a concrete object in the form of a new data type. When more modes turn up, I can add them here.

The generation of the parameter modes from the opcode prefix is an example of an unfold operation. Just as a fold takes a list of values and collapses them into a single summary (such as the product of a list of numbers, or forming a sentence from a list of words), an unfold takes a single value and expands it into a list of sub-parts (such as finding the prime factors of a number, or splitting a sentence into its constituent words).

In this case, the parameter modes are encoded as the prefix of the operation code. The last digit of the prefix gives the first mode; earlier digits give other modes. As the prefix has, essentially, an infinite number of leading zeroes, the list of parameter codes is infinite. That's not a problem for us: Haskell's laziness means it only generates the parameter modes that are actually used. But it means that when we have operations that take more parameters, nothing in this code needs to change.

data ParameterMode = Position | Immediate deriving (Ord, Eq, Show)

parameterModes :: Int -> [ParameterMode]
parameterModes modeCode = unfoldr generateMode modeCode

generateMode :: Int -> Maybe (ParameterMode, Int)
generateMode modeCode = Just (mode, modeCode `div` 10)
    where mode = case (modeCode `mod` 10) of 
                    0 -> Position
                    1 -> Immediate

The parameter mode gets used in the getMemoryValue function, which takes a location and a mode and returns the correct value, depending on the appropriate mode. (And still using the !> syntactic sugar for indirect fetching from before.)

getMemoryValue loc Position mem = mem!>loc
getMemoryValue loc Immediate mem = mem!loc

Debugging

Of course, when I first ran this program, I got the wrong answer. In fact, I got no answer at all. The Intcode program ran, but the machine generated no output and it didn't seem to read any of the input.

My first thought was that I'd got some part of the monad stack plumbing wrong. The way I normally see inside Python scripts is to litter them with print statements and use them to get some idea of what's actually happening.

That's a bit more difficult in Haskell, with print being in the IO monad and all. Luckily, the trace function from the Debug.Trace library does much the same thing. The Haskell wiki page on debugging shows a standard idiom for using it. In this challenge, I used these trace calls.

fetchInput opcode | trace ("Input with opcode " ++ show opcode) False = undefined
fetchInput 3 =
    do mem <- gets _memory
    …

putOutput opcode _modes | trace ("Output with opcode " ++ show opcode) False = undefined
putOutput 4 modes =
    do mem <- gets _memory
    …

perform :: Int -> Int -> [ParameterMode] -> Memory -> (Memory, Int)
perform instr ip modes mem | trace ("Perform ip " ++ show ip ++ " opcode " ++ show instr ++ " modes " ++ (show (take 3 modes)) ++ " args " ++ (intercalate ", " (map show [(mem!(ip+1)), (mem!(ip+2)), (mem!(ip+3))]))) False = undefined
perform 1 ip modes mem = (iInsert (ip + 3) (a + b) mem, ip + 4)
    where a = getMemoryValue (ip + 1) (modes!!0) mem
          b = getMemoryValue (ip + 2) (modes!!1) mem
…

That showed that the machine wasn't executing any input or output operations. In fact, it wasn't even seeing any. After a few moments of blank staring, I realised what I'd done. I hard-code the input file names into the programs. As I'd copied this program from day 2, it was still reading day 2's input file. Doh!

I fixed that and ran into another problem, where the final operation was attempting to read memory location 50,000-odd and falling over. Eventually I tracked it down to this version of runStep, which follows the input-process-output pattern.

runStep :: ProgrammedMachine
runStep = 
    do mem <- gets _memory
       ip <- gets _ip
       let opcode = (mem!ip) `mod` 100
       let modes = parameterModes  ((mem!ip) `div` 100)
       -- input
       fetchInput opcode
       -- process
       mem' <- gets _memory
       let (mem'', ip') = perform opcode ip modes mem'
       modify (\m -> m {_ip = ip', _memory = mem''})
       -- output
       putOutput opcode modes

Unfortunately, the perform step both stores the result of the operation and updates the ip. putOutput uses the value of ip to determine which memory location to write. Because perform went first, putOutput was always writing the contents of the wrong location.

The fix was to reorder the steps to be input-output-process, like this:

runStep :: ProgrammedMachine
runStep = 
    do mem <- gets _memory
       ip <- gets _ip
       let opcode = (mem!ip) `mod` 100
       let modes = parameterModes  ((mem!ip) `div` 100)
       fetchInput opcode
       putOutput opcode modes
       mem' <- gets _memory
       let (mem'', ip') = perform opcode ip modes mem'
       modify (\m -> m {_ip = ip', _memory = mem''})

And everything worked fine.

Full code and refactoring

Around day 11, I decided to extract the "complete" Intcode interpreter into its own module. You can see the code I wrote on day 5 (and on Github),  and what it became once the interpreter was extracted (and on Github).