Advent of Code 2020 day 4

    The hardest part of day 4 turned out to be reading the input file!

    Parsing

    The problem is that I've been using the Megaparsec library to parse the input. This is a powerful parsing library, but it's been developed to assist the parsing of the languages that humans write. One of the key ideas is that tokens in the parsed filed are separated by spaces, and humans aren't very good at being precise when it comes to what they mean by "space". People can put different amounts of space in places, mix spaces and tabs, throw random line-breaks in the middle of expressions, and so on. That means that Megaparsec is very good at consuming all this white space and allowing you to focus on the actual text in the file.

    But the input for today had fields separated by spaces (whether space characters, newlines, or both) and records separated by blank lines.

    ecl:gry pid:860033327 eyr:2020 hcl:#fffffd
    byr:1937 iyr:2017 cid:147 hgt:183cm
    
    iyr:2013 ecl:amb cid:350 eyr:2023 pid:028048884
    hcl:#cfa07d byr:1929
    
    hcl:#ae17e1 iyr:2013
    eyr:2024
    ecl:brn pid:760753108 byr:1931
    hgt:179cm
    
    hcl:#cfa07d eyr:2025 pid:166559648
    iyr:2011 ecl:brn hgt:59in
    

    The standard Megaparsec approach would be to treat all of that as "whitespace" and consume the lot, collapsing all the records into one.

    I ended up with a complicated whitespace consumer that only consumed one newline character at a time (using try... notFollowedBy to make sure there's only one newline). This combined with a blankLines parser that accepts only blank lines.

    sc :: Parser ()
    sc = L.space (skipSome (    char ' ' 
                           <|> char '\t' 
                           <|> (try (newline <* notFollowedBy newline))
                           )
                 ) CA.empty CA.empty
    
    blankLines = skipSome newline
    
    kvP = (,) <$> stringP <* colonP <*> stringP
    
    passportsP = passportP `sepBy` blankLines
    passportP = M.fromList <$> many kvP
    

    This does the job, but it's pretty fragile in not handling "blank" lines that contain some space characters. But it's enough to do the job for this task.

    Part 1

    Now I could read the data, time to validate it. Not knowing what I'd be using the data for, I decided to store each passport as a Map of key-value pairs.

    The validation rules were a bit unclear (what if there are fields outside the specified ones? what if there are duplicates of a field?), but I decided to take the most obvious approach. I defined a Set of required fields. To validate a passport, find the set of fields defined in it, subtracted that from the set of required fields. If the result is emtpy, all the required fields are present and the passport passes that stage of validation.

    The code is much shorter than the explanation.

    requiredFields = S.fromList ["byr", "iyr", "eyr", "hgt", "hcl", 
        "ecl", "pid"]
    
    part1 = length . filter hasRequiredFields
    
    hasRequiredFields passport = S.null $ requiredFields `S.difference` (M.keysSet passport)
    

    Part 2

    This was a bit fiddly. A passport validates if it has all the required fields, and each field passes validation. (As all I'm doing is validating key-value pairs, there was no need to convert the set of key-value pairs from the parser into a Map only to convert it back. Oh well.)

    part2 = length . filter validPassport
    
    validPassport :: Passport -> Bool
    validPassport passport = (hasRequiredFields passport) && (all validField $ M.assocs passport)
    

    validField is just a big case statement. Note that cid always validates, and unknown fields always fail.

    validField :: (String, String) -> Bool
    validField (key, value) =
      case key of
        "byr" -> validRanged 1920 2002 value
        "iyr" -> validRanged 2010 2020 value
        "eyr" -> validRanged 2020 2030 value
        "hgt" -> validHeight value
        "hcl" -> validHex value
        "ecl" -> validEye value
        "pid" -> validPid value
        "cid" -> True
        _ -> False
    

    The rest of the validations are pretty much implementing the rules. Nothing much to see here.

    validRanged lower upper value = 
      if all isDigit value
      then v >= lower && v <= upper
      else False
      where v = read @Int value 
    
    validHeight value = 
      if u == "cm"
      then validRanged 150 193 v
      else if u == "in"
           then validRanged 59 76 v
           else False
      where (v, u) = span isDigit value
    
    validHex value = (length value == 7) && (head value == '#') && (all isHexDigit $ tail value)
    
    validEye value = value `S.member` eyeColours
    eyeColours = S.fromList ["amb", "blu", "brn", "gry", "grn", "hzl", "oth"]
    
    validPid value = (length value == 9) && (all isDigit value)
    

    Code

    You can find the code here or on Gitlab.

    Neil Smith

    Read more posts by this author.