### Git for beginners 4b: Working with remote branches

Moving code around with branches

Going back to the source for parser combinators

Day 18 was all about the parsing, rather than the coding. It also meant going back to the original papers that prompted the development of `attoparsec`

and other parsing libraries.

This is about representing arithmetic expressions then evaluating them. As I didn't know what part 2 would bring, I thought I'd develop a data structure to hold a parsed expression, then evaluate it in a separate stage. It's just an algebraic type to hold the different operators. There's no need for a "bracket" option, as the final expression precedence is given by the shape of the `Expression`

tree.

```
data Expression
= Number Integer
| Mul Expression Expression
| Add Expression Expression
deriving (Eq)
instance Show Expression where
show (Number x) = show x
show (Mul a b) = "(" ++ show a ++ " * " ++ show b ++ ")"
show (Add a b) = "(" ++ show a ++ " + " ++ show b ++ ")"
evaluate (Number x) = x
evaluate (Mul a b) = (evaluate a) * (evaluate b)
evaluate (Add a b) = (evaluate a) + (evaluate b)
```

(In my code, I decided not to use a custom `Show`

instance, as I wanted to be clear what the parsers were generating.

The parsing and association rules need to turn an expression like `1 + 2 * 3 + 4`

into an `Expression`

tree like this:

```
+
/ \
* 4
/ \
+ 3
/ \
1 2
```

A first attempt at parsing looked like this.

```
expressionP = mulExpression <|> addExpression
mulExpressionP = Mul <$> expressionP <* " * " *> elementP
addExpressionP = Add <$> expressionP <* " + " *> elementP
elementP = numberP <|> bracketedP
bracketedP = "(" *> exprP <* ")"
```

Which looks good, but fails instantly with an infinite loop. A little thought reveals why: the first `+`

needs to be a few levels down in the tree. The parser needs to parse the whole expression in order to know *how many* levels down it needs to be. But the structure of the parser means it needs to make a decision *now* about this first `+`

, before it's seen the rest of the expression. So it keeps creating more layers in the tree, hoping they'll be enough but never terminating.

This problem was solved in the original paper that introduced parser combinators. That paper illustrated the use of combinators with a calculator example, correctly parsing expressions like `1 + 2 + 3`

into `(1 + 2) + 3`

, which is what we want.

(Hutton, G., Meijer, E. (1998) “Monadic parsing in Haskell,”Journal of Functional Programming8(4): 437–444).

The trick is to use the `chainl1`

combinator to chain, from the left, a bunch of operations. It's easier to use a more Applicative style for this combinator (Applicative hadn't been identified in 1998!), as described by Vibhav Sagar and Stephen Diehl. I'm slightly prefer Diehl's verison, so I'm using it here.

```
mulExprP = string " * " *> pure Mul
addExprP = string " + " *> pure Add
expessionrP = elementP `chainl1` (mulExprP <|> addExprP)
chainl1 :: Parser a -> Parser (a -> a -> a) -> Parser a
p `chainl1` op = do a <- p
rest a
where rest a = (do f <- op
b <- p
rest (f a b))
<|> return a
```

In this instance of the `chainl1`

parser, it first reads an `element`

(a number or a bracketed term) and calls it `a`

. If it can read the `op`

(e.g. `*`

), it combines `a`

and the `op`

with the result of parsing the `rest`

of the input. If parsing `rest`

fails, it just returns `a`

.

That gives us the parse we want.

(`chainl1`

exists in the `Text.ParserCombinators.ReadP`

library, but its somehow missing from `attoparsec`

. But I'm using `attoparsec`

now, so I'll just reimplement it.)

Given the ability to parse inputs, evaluating them is easy.

```
main =
do text <- TIO.readFile "data/advent18.txt"
print $ part1 text
part1 = parseAndEval expressionsP
parseAndEval parser text = sumEval $ successfulParse parser text
sumEval = sum . map evaluate
```

Using operator precedence takes my code *closer* to the original Functional Pearl, even if the precedence rules are reversed from convention. All that's needed are a couple of changes to the parser, and new names to avoid confusion.

```
pExpressionsP = pExprP `sepBy` endOfLine
pExprP = pTermP `chainl1` mulExprP
pTermP = pElementP `chainl1` addExprP
pElementP = numberP <|> pBracketedP
pBracketedP = "(" *> pExprP <* ")"
```

That's all that's needed!

You can find the code here or on GitLab.