11 December 2020 ; tagged in: advent of code , haskell

Advent of Code 2020 day 4

Persuading Megaparsec about the importance of space.

Advent of Code 2020 day 4

The hardest part of day 4 turned out to be reading the input file!


The problem is that I've been using the Megaparsec library to parse the input. This is a powerful parsing library, but it's been developed to assist the parsing of the languages that humans write. One of the key ideas is that tokens in the parsed filed are separated by spaces, and humans aren't very good at being precise when it comes to what they mean by "space". People can put different amounts of space in places, mix spaces and tabs, throw random line-breaks in the middle of expressions, and so on. That means that Megaparsec is very good at consuming all this white space and allowing you to focus on the actual text in the file.

But the input for today had fields separated by spaces (whether space characters, newlines, or both) and records separated by blank lines.

ecl:gry pid:860033327 eyr:2020 hcl:#fffffd
byr:1937 iyr:2017 cid:147 hgt:183cm

iyr:2013 ecl:amb cid:350 eyr:2023 pid:028048884
hcl:#cfa07d byr:1929

hcl:#ae17e1 iyr:2013
ecl:brn pid:760753108 byr:1931

hcl:#cfa07d eyr:2025 pid:166559648
iyr:2011 ecl:brn hgt:59in

The standard Megaparsec approach would be to treat all of that as "whitespace" and consume the lot, collapsing all the records into one.

I ended up with a complicated whitespace consumer that only consumed one newline character at a time (using try... notFollowedBy to make sure there's only one newline). This combined with a blankLines parser that accepts only blank lines.

sc :: Parser ()
sc = L.space (skipSome (    char ' ' 
                       <|> char '\t' 
                       <|> (try (newline <* notFollowedBy newline))
             ) CA.empty CA.empty

blankLines = skipSome newline

kvP = (,) <$> stringP <* colonP <*> stringP

passportsP = passportP `sepBy` blankLines
passportP = M.fromList <$> many kvP

This does the job, but it's pretty fragile in not handling "blank" lines that contain some space characters. But it's enough to do the job for this task.

Part 1

Now I could read the data, time to validate it. Not knowing what I'd be using the data for, I decided to store each passport as a Map of key-value pairs.

The validation rules were a bit unclear (what if there are fields outside the specified ones? what if there are duplicates of a field?), but I decided to take the most obvious approach. I defined a Set of required fields. To validate a passport, find the set of fields defined in it, subtracted that from the set of required fields. If the result is emtpy, all the required fields are present and the passport passes that stage of validation.

The code is much shorter than the explanation.

requiredFields = S.fromList ["byr", "iyr", "eyr", "hgt", "hcl", 
    "ecl", "pid"]

part1 = length . filter hasRequiredFields

hasRequiredFields passport = S.null $ requiredFields `S.difference` (M.keysSet passport)

Part 2

This was a bit fiddly. A passport validates if it has all the required fields, and each field passes validation. (As all I'm doing is validating key-value pairs, there was no need to convert the set of key-value pairs from the parser into a Map only to convert it back. Oh well.)

part2 = length . filter validPassport

validPassport :: Passport -> Bool
validPassport passport = (hasRequiredFields passport) && (all validField $ M.assocs passport)

validField is just a big case statement. Note that cid always validates, and unknown fields always fail.

validField :: (String, String) -> Bool
validField (key, value) =
  case key of
    "byr" -> validRanged 1920 2002 value
    "iyr" -> validRanged 2010 2020 value
    "eyr" -> validRanged 2020 2030 value
    "hgt" -> validHeight value
    "hcl" -> validHex value
    "ecl" -> validEye value
    "pid" -> validPid value
    "cid" -> True
    _ -> False

The rest of the validations are pretty much implementing the rules. Nothing much to see here.

validRanged lower upper value = 
  if all isDigit value
  then v >= lower && v <= upper
  else False
  where v = read @Int value 

validHeight value = 
  if u == "cm"
  then validRanged 150 193 v
  else if u == "in"
       then validRanged 59 76 v
       else False
  where (v, u) = span isDigit value

validHex value = (length value == 7) && (head value == '#') && (all isHexDigit $ tail value)

validEye value = value `S.member` eyeColours
eyeColours = S.fromList ["amb", "blu", "brn", "gry", "grn", "hzl", "oth"]

validPid value = (length value == 9) && (all isDigit value)


You can find the code here or on Gitlab.