Advent of Rust, Day 20 and 21: Stumped by Sea Monsters

Welcome back to another two days’ worth of the ongoing chronicle of me trying to teach myself the Rust programming language by doing programming puzzles from Advent of Code 2020.

Day 20, Part 1

Unlike in the past puzzles, I have no idea how to tackle this problem, so I start just by reading in the data. I’ll create a struct Tile to hold the data. I don’t have to store the actual image data, just the borders, so I can compare them to the borders of the other tiles.

There are eight borders — one on each of the four sides, plus the tiles may also be flipped, so the same borders again but reversed. I’ll store each border as a u16 bit pattern for easy comparing, in an array of length 8.

I do start out with the vague idea that I should use tuple_combinations() for this. But anyway, I’ll read in the data first.

#[macro_use]
extern crate scan_fmt;

use ndarray::{s, Array2, ArrayView, Ix1};
use std::convert::TryInto;

struct Tile {
    id: u16,
    borders: [u16; 8],
}

impl Tile {
    fn new(id: u16, grid: &Array2<u8>) -> Self {
        let borders = [
            s![0, ..],
            s![0, ..;-1],
            s![grid.nrows() - 1, ..],
            s![grid.nrows() - 1, ..;-1],
            s![.., 0],
            s![..;-1, 0],
            s![.., grid.ncols() - 1],
            s![..;-1, grid.ncols() - 1],
        ]
        .iter()
        .map(|&slice| to_bits(grid.slice(slice)))
        .collect::<Vec<u16>>()
        .try_into()
        .unwrap();
        Tile { id, borders }
    }
}

fn to_bits(slice: ArrayView<u8, Ix1>) -> u16 {
    let mut retval = 0;
    for (ix, &cell) in slice.iter().enumerate() {
        if cell > 0 {
            retval |= 2_u16.pow(ix.try_into().unwrap());
        }
    }
    retval
}

fn read_grid(lines: Vec<&str>) -> Array2<u8> {
    let rows = lines.len();
    let cols = lines[0].len();
    let mut cells = Array2::zeros((rows, cols));
    for (y, line) in lines.iter().enumerate() {
        for (x, tile) in line.bytes().enumerate() {
            cells[[y, x]] = match tile {
                b'#' => 1,
                b'.' => 0,
                _ => panic!("Bad tile '{}'", tile),
            };
        }
    }
    cells
}

fn read_tile(input: &str) -> Tile {
    let mut lines = input.lines();
    let header = lines.next().unwrap();
    let id = scan_fmt!(header, "Tile {}:", u16).unwrap();
    let grid = read_grid(lines.collect::<Vec<&str>>());
    Tile::new(id, &grid)
}

fn read_input(input: &'static str) -> impl Iterator<Item = Tile> {
    input.split("\n\n").map(read_tile)
}

I write the slice things (s![0, ..;-1]) after reading a bit more of the documentation about slicing ndarrays. I think this notation is an improvement on the Conway’s Game of Life code that I wrote a few days ago.

Maybe if there is only ever one possibility for each edge to match another tile’s edge, we can iterate over tuple_combinations() and check that there are four tiles that have only two possible connections to other tiles?

I add a HashSet<u16> of connections to the Tile type, and add these two methods:

fn connect(&mut self, other: &mut Self) {
    let borders1: HashSet<_> = self.borders.iter().collect();
    let borders2: HashSet<_> = other.borders.iter().collect();
    let intersection: HashSet<_> = borders1.intersection(&borders2).collect();
    match intersection.len() {
        0 => (),
        1 => {
            self.connections.insert(other.id);
            other.connections.insert(self.id);
        }
        _ => panic!(
            "Tile {} and {} can connect in more than one way",
            self.id, other.id
        ),
    }
}

fn n_connections(&self) -> usize {
    self.connections.len()
}

I try this to set up the network of tiles:

fn network(tiles: &mut Vec<Tile>) {
    for (&mut tile1, &mut tile2) in tiles.iter_mut().tuple_combinations() {
        tile1.connect(&mut tile2);
    }
}

tuple_combinations() doesn’t work, for one thing:

error[E0277]: the trait bound `std::slice::IterMut<'_, Tile>: std::clone::Clone` is not satisfied
  --> puzzle20.rs:81:54
   |
81 |     for (&mut tile1, &mut tile2) in tiles.iter_mut().tuple_combinations() {
   |                                                      ^^^^^^^^^^^^^^^^^^ the trait `std::clone::Clone` is not implemented for `std::slice::IterMut<'_, Tile>`

error[E0277]: the trait bound `&mut Tile: std::clone::Clone` is not satisfied
  --> puzzle20.rs:81:54
   |
81 |     for (&mut tile1, &mut tile2) in tiles.iter_mut().tuple_combinations() {
   |                                                      ^^^^^^^^^^^^^^^^^^ the trait `std::clone::Clone` is not implemented for `&mut Tile`
   |
   = note: `std::clone::Clone` is implemented for `&Tile`, but not for `&mut Tile`

If I understand the error message correctly, this is because tuple_combinations() makes copies of the elements, and I don’t think I can copy the tiles — for one thing, I want the number of connections to be updated in the same copy of the tile, I don’t want different copies of the same tile with different connections!

I try indexing the array and iterating over tuple combinations of indices:

fn network(tiles: &mut Vec<Tile>) {
    let indices = 0..tiles.len();
    for (ix1, ix2) in indices.tuple_combinations() {
        tiles[ix1].connect(&mut tiles[ix2]);
    }
}

This also doesn’t work:

error[E0499]: cannot borrow `*tiles` as mutable more than once at a time
  --> puzzle20.rs:80:33
   |
80 |         tiles[ix1].connect(&mut tiles[ix2]);
   |         -----      -------      ^^^^^ second mutable borrow occurs here
   |         |          |
   |         |          first borrow later used by call
   |         first mutable borrow occurs here

This I don’t understand. I don’t want to borrow the whole array twice, I just want to modify elements in it.

Eventually, after trying a lot more things, I give up, and store the connections in a separate collection:

impl Tile {
    fn connects_to(&self, other: &Self) -> bool {
        let borders1: HashSet<_> = self.borders.iter().collect();
        let borders2: HashSet<_> = other.borders.iter().collect();
        let intersection: HashSet<_> = borders1.intersection(&borders2).collect();
        match intersection.len() {
            0 => false,
            1 => true,
            _ => panic!(
                "Tile {} and {} can connect in more than one way",
                self.id, other.id
            ),
        }
    }
}

fn network(tiles: Vec<Tile>) -> HashMap<u16, HashSet<u16>> {
    let mut connections = HashMap::new();
    for (tile1, tile2) in tiles.iter().tuple_combinations() {
        if tile1.connects_to(tile2) {
            let links1 = connections.entry(tile1.id).or_insert(HashSet::new());
            links1.insert(tile2.id);
            let links2 = connections.entry(tile2.id).or_insert(HashSet::new());
            links2.insert(tile1.id);
        }
    }
    connections
}

This works, but running it on the example input I get ‘Tile 2311 and 1951 can connect in more than one way’! Then I realize that yes, they could both be connected, but if you flip both of them, then they can still be connected. So we should expect that tiles connect in either 0 or 2 possible ways in connects_to().

For the array of connections in the example input, I get 4 tiles with 2 connections, 4 tiles with 3 connections, and one tile with 4 connections, which is consistent with the tiles being arranged in a 3×3 square.

Running it on the real input I get a panic, because a zero-length string is being passed to read_tile(). Sure enough, there is an extra newline at the end of the file. I wonder if instead of splitting on "\n\n" I can split on "\n{2,}", but that requires adding a dependency on the regex package which I’m not using anywhere else, so I add a .filter(|s| s.len() > 0) to read_input().

Finally I write the main() function which looks like this:

fn main() {
    let input = include_str!("input");
    let tiles = read_input(input);
    let connections = network(&tiles);

    let answer: u32 = connections
        .iter()
        .filter(|(_, links)| links.len() == 2)
        .map(|(id, _)| id)
        .product();
    println!("{}", answer);
}

I get an overflow in the call to product() so I change the type of the IDs to u64. This gives me the right answer.

Day 20, Part 2

Next we have to remove the borders of the tiles, and arrange them in the correct configuration, and then search for “sea monsters” that look like this:

                  #
#    ##    ##    ###
 #  #  #  #  #  #

From experience at a job a long time ago, I know how to search for the sea monster, by using cross-correlation, which I would be willing to bet that ndarray is capable of doing. Before I get to that part, however, I’m a bit at a loss for how to stitch the pictures together. I feel that I could do it, but without a good plan it would probably be very messy and take me a lot of time to finish.

I decide once again to start by storing the data that I will need, and then see if I get any ideas while doing that.

First of all, I will need to store the tile, as an ndarray::Array2, without the borders. This can be done by taking a slice and copying it into an owned array, in the constructor:

let image = grid
    .slice(s![1..grid.nrows() - 1, 1..grid.ncols() - 1])
    .into_owned();
Tile { id, borders, image }

Now I’m not sure what to do. I take a break and work on Day 21 instead. When I come back to it, I don’t really have any more ideas.

I decide that since the orientation of each tile is now important, I should do some refactors to Tile. I will add rot90(), fliplr(), and flipud() methods (named after the NumPy functions), and I won’t store eight borders in an arbitrary order, but instead four borders in the order top, right, bottom, and left. The rotating and flipping operations will manipulate the array of borders so that they are correct for the new orientation.

#[derive(Debug, PartialEq)]
struct Tile {
    id: u64,
    // top, right, bottom, left borders with bits counted from LSB to MSB in
    // clockwise direction
    borders: [u16; 4],
    border_bits: u8, // number of bits in border
    image: Array2<u8>,
}

impl Tile {
    fn new(id: u64, grid: &Array2<u8>) -> Self {
        let borders = [
            s![0, ..],
            s![.., grid.ncols() - 1],
            s![grid.nrows() - 1, ..;-1],
            s![..;-1, 0],
        ]
        .iter()
        .map(|&slice| to_bits(grid.slice(slice)))
        .collect::<Vec<u16>>()
        .try_into()
        .unwrap();
        let image = grid
            .slice(s![1..grid.nrows() - 1, 1..grid.ncols() - 1])
            .into_owned();
        Tile {
            id,
            borders,
            image,
            border_bits: grid.ncols() as u8,
        }
    }

    // rotate counterclockwise n times
    fn rot90(&self, n: usize) -> Self {
        let rotated_view = match n % 4 {
            0 => self.image.view(),
            1 => self.image.slice(s![.., ..;-1]).reversed_axes(),
            2 => self.image.slice(s![..;-1, ..;-1]),
            3 => self.image.slice(s![..;-1, ..]).reversed_axes(),
            _ => panic!("Impossible"),
        };
        let rotated_image = rotated_view.into_owned();
        let mut borders = self.borders.clone();
        borders.rotate_left(n);
        Tile {
            id: self.id,
            borders,
            image: rotated_image,
            border_bits: self.border_bits,
        }
    }

    fn fliplr(&self) -> Self {
        let flipped_image = self.image.slice(s![.., ..;-1]).into_owned();
        let mut borders: Vec<_> = self
            .borders
            .iter()
            .map(|&b| flip_bits(b, self.border_bits))
            .collect();
        borders.reverse();
        borders.rotate_right(1);
        Tile {
            id: self.id,
            borders: borders.try_into().unwrap(),
            image: flipped_image,
            border_bits: self.border_bits,
        }
    }

    fn flipud(&self) -> Self {
        let flipped_image = self.image.slice(s![..;-1, ..]).into_owned();
        let mut borders: Vec<_> = self
            .borders
            .iter()
            .map(|&b| flip_bits(b, self.border_bits))
            .collect();
        borders.reverse();
        borders.rotate_left(1);
        Tile {
            id: self.id,
            borders: borders.try_into().unwrap(),
            image: flipped_image,
            border_bits: self.border_bits,
        }
    }

    fn connects_to(&self, other: &Self) -> bool {
        let borders1: HashSet<_> = self.borders.iter().cloned().collect();
        let mut borders2: HashSet<_> = other.borders.iter().cloned().collect();
        borders2.extend(
            other
                .borders
                .iter()
                .map(|&b| flip_bits(b, self.border_bits)),
        );
        let intersection: HashSet<_> = borders1.intersection(&borders2).collect();
        match intersection.len() {
            0 => false,
            1 => true,
            _ => panic!(
                "Tile {} and {} can connect in more than one way",
                self.id, other.id
            ),
        }
    }
}

fn flip_bits(border: u16, n_bits: u8) -> u16 {
    assert!(n_bits <= 16);
    border.swap_bits() >> (16 - n_bits)
}

fn to_bits(slice: ArrayView<u8, Ix1>) -> u16 {
    let mut retval = 0;
    for (ix, &cell) in slice.iter().enumerate() {
        if cell > 0 {
            retval |= 2_u16.pow(ix.try_into().unwrap());
        }
    }
    retval
}

I also write some tests for this code, and debug it until the tests pass. The full code is on the repository. This, at least, still gives me the correct answer for Part 1. However, I’m really tired of doing this puzzle, so I think I’ll stop for now and leave it for another day.

Day 21, Part 1

Here, we have ingredient lists in a language we don’t understand, of foods containing ingredients such as “sqjhc”, “mxmxvkd”, and “kfcds”. These foods also have allergen lists in English. Each allergen is present in one ingredient and each ingredient contains either zero or one allergen. We have to determine which ingredients contain zero allergens, and count how many times they appear in our input.

Here’s another puzzle where I’m not sure how to get started, and so I decide once more to at least write code to read in the data.

use std::collections::HashSet;

struct Food {
    ingredients: HashSet<String>,
    allergens: HashSet<String>,
}

impl Food {
    fn from_string(s: &str) -> Self {
        let mut split1 = s[0..s.len() - 1].split(" (contains ");
        let (ingredients_list, allergens_list) = (split1.next().unwrap(), split1.next().unwrap());
        let ingredients = ingredients_list.split(' ').map(String::from).collect();
        let allergens = allergens_list.split(", ").map(String::from).collect();
        Food {
            ingredients,
            allergens,
        }
    }
}

I also write a test for this, which I’ll copy here since I found the way to assert the contents of a HashSet to be unusually verbose:

#[test]
fn test_parse_food() {
    let food = Food::from_string("mxmxvkd kfcds sqjhc nhms (contains dairy, fish)");
    assert_eq!(
        food.ingredients,
        ["mxmxvkd", "kfcds", "sqjhc", "nhms"]
            .iter()
            .cloned()
            .map(String::from)
            .collect()
    );
    assert_eq!(
        food.allergens,
        ["dairy", "fish"]
            .iter()
            .cloned()
            .map(String::from)
            .collect()
    );
}

It would be nice if we could say something like assert_contains!(food.ingredients, ["mxmxvkd", "kfcds", "sqjhc", "nhms"]).

I’m having trouble figuring out how to get the answer not because I’m not sure about how something works in Rust, but just because it’s not even clear to me how to solve the example problem.

I take a break and think about it some more. I wonder if we could take the tuple combinations of all foods, and compare the foods in each tuple; if they have a common allergen, then all the ingredients that are in one of the foods but not both, are non-allergens.

I write this code:

fn find_non_allergens(foods: &[Food]) -> HashSet<String> {
    foods
        .iter()
        .tuple_combinations()
        .flat_map(|(food1, food2)| {
            if food1.allergens.union(&food2.allergens).count() > 0 {
                food1
                    .ingredients
                    .symmetric_difference(&food2.ingredients)
                    .map(String::from)
                    .collect()
            } else {
                vec![]
            }
        })
        .collect()
}

That doesn’t work, though! The list of non-allergens that it yields on the example problem is too long, including ingredients that might have allergens. This is because of something in the puzzle description that I forgot to take into account: foods might not list all the allergens that they have.

My second attempt makes a list of ingredients possible_allergens that initially could contain any allergen, and removes them as they are found to be impossible:

fn find_non_allergens(foods: &[Food]) -> HashSet<String> {
    let all_allergens: HashSet<_> = foods
        .iter()
        .flat_map(|food| food.allergens.clone())
        .collect();
    let all_ingredients: Vec<_> = foods
        .iter()
        .flat_map(|food| food.ingredients.clone())
        .collect();
    let mut possible_allergens: HashMap<_, HashSet<_>> = all_ingredients
        .iter()
        .map(|ingredient| (ingredient.clone(), all_allergens.clone()))
        .collect();
    for (food1, food2) in foods.iter().tuple_combinations() {
        let allergens_in_both: Vec<_> = food1.allergens.intersection(&food2.allergens).collect();
        let ingredients_not_in_both: Vec<_> = food1
            .ingredients
            .symmetric_difference(&food2.ingredients)
            .map(String::from)
            .collect();
        for ingredient in &ingredients_not_in_both {
            for &allergen in &allergens_in_both {
                possible_allergens
                    .get_mut(ingredient)
                    .unwrap()
                    .remove(allergen);
            }
        }
    }
    possible_allergens
        .iter()
        .filter(|(_, allergens)| allergens.len() == 0)
        .map(|(ingredient, _)| ingredient.clone())
        .collect()
}

This doesn’t work either; “soy” is not ruled out for any of the ingredients, because there are no two foods in the example data that both contain “soy”.

I note that the compiler’s error message when trying to mutate a HashMap value gotten with get() is not very enlightening:

error[E0596]: cannot borrow data in a `&` reference as mutable
  --> puzzle21.rs:45:17
   |
45 |                 possible_allergens.get(ingredient).unwrap().remove(allergen);
   |                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ cannot borrow as mutable

Maybe it should suggest using get_mut() here!

While writing this I notice that I mistakenly used union() in the previous example instead of intersection(), so I quickly go back and try that, but it doesn’t work either.

Then I have a brainwave and realize that I don’t need to compare pairs of foods at all. If a food is listed as containing allergens, then the ingredients that are not in that food cannot contain those allergens. I write this code and get the correct answer from both the example data and the real puzzle input:

fn find_non_allergens(foods: &[Food]) -> HashSet<String> {
    let all_allergens: HashSet<_> = foods
        .iter()
        .flat_map(|food| food.allergens.clone())
        .collect();
    let all_ingredients: Vec<_> = foods
        .iter()
        .flat_map(|food| food.ingredients.clone())
        .collect();
    let mut possible_allergens: HashMap<_, HashSet<_>> = all_ingredients
        .iter()
        .map(|ingredient| (ingredient.clone(), all_allergens.clone()))
        .collect();
    for food in foods {
        for ingredient in all_ingredients
            .iter()
            .filter(|&ingredient| !food.ingredients.contains(ingredient))
        {
            for allergen in &food.allergens {
                possible_allergens
                    .get_mut(ingredient)
                    .unwrap()
                    .remove(allergen);
            }
        }
    }
    possible_allergens
        .iter()
        .filter(|(_, allergens)| allergens.is_empty())
        .map(|(ingredient, _)| ingredient.clone())
        .collect()
}

fn main() {
    let input = include_str!("input");
    let foods: Vec<Food> = input.lines().map(|s| Food::from_string(s)).collect();
    let non_allergens = find_non_allergens(&foods);
    let count: usize = non_allergens
        .iter()
        .map(|ingredient| {
            foods
                .iter()
                .filter(|food| food.ingredients.contains(ingredient))
                .count()
        })
        .sum();
    println!("{}", count);
}

Day 21, Part 2

Part 2 of the puzzle, as one might have expected, is to figure out which ingredients do contain which allergens. The answer to the puzzle is a comma-separated list of ingredients, sorted by allergen alphabetically.

For this we need to reuse possible_allergens, so I first split find_non_allergens() into a find_possible_allergens() function that returns possible_allergens, and a new find_non_allergens() function that does the filtering and mapping on that.

Then we can copy part of the solution from Day 16:

fn determine_allergens(
    possible_allergens: &HashMap<String, HashSet<String>>,
) -> HashMap<String, String> {
    let mut to_be_determined: Vec<_> = possible_allergens
        .iter()
        .map(|(s, set)| (s.clone(), set.clone()))
        .collect();

    let mut dangerous_ingredient_list = HashMap::new();
    while !to_be_determined.is_empty() {
        to_be_determined.sort_by_key(|(_, set)| set.len());
        to_be_determined.reverse();

        let (ingredient, allergens) = to_be_determined.pop().unwrap();
        if allergens.is_empty() {
            continue;
        }
        assert!(allergens.len() == 1, "unable to determine allergens");
        let allergen = allergens.iter().next().unwrap();
        dangerous_ingredient_list.insert(allergen.clone(), ingredient.clone());
        for (_, remaining_allergens) in &mut to_be_determined {
            remaining_allergens.remove(allergen);
        }
    }
    dangerous_ingredient_list
}

The difference is that we have to re-sort the list every iteration, since in one iteration you might rule out soy, but the next item might not have soy as a possibility because it was ruled out from a later item — as I find out by a panic when trying to run it.

Then to get the answer, I write:

let mut dangerous_ingredient_list = determine_allergens(&possible_allergens);
let mut dangerous_ingredients = dangerous_ingredient_list.drain().collect::<Vec<_>>();
dangerous_ingredients.sort_by(|(allergen1, _), (allergen2, _)| allergen1.cmp(allergen2));
let list = dangerous_ingredients
    .drain(..)
    .map(|(_, ingredient)| ingredient)
    .collect::<Vec<_>>()
    .join(",");
println!("{}", list);

The full code is on the repository.

Afterword

The puzzles are definitely getting more difficult! In many of the previous puzzles, I’ve felt like I had a general idea about how to proceed, but I didn’t feel that on Day 20. Instead, this is the first time that I actually gave up on a puzzle and went on to the next one.

There certainly is a lot of .iter()....collect() in the code from both days, to convert between Vec, HashSet, and HashMap. This seems unnecessarily verbose, I wonder if there’s a more idiomatic way that I’m missing?

Advent of Rust 16 and 17: Really, That’s More Iterators Than I Wanted; and More Adventures With NumRust

The last couple of days I took a break from this chronicle of my attempt to teach myself the Rust programming language by solving the programming puzzles on Advent of Code 2020. But now I’m back with another two days’ worth of puzzles!

One thing that I read in the meantime, thanks to a tip from Federico was the first installment of someone else’s blog who’s doing the same thing as I am. That blog is a really good read, and I think the main difference with my series is that the author is already a Rust expert! The style is also very different, as well; I am mostly trying to emphasize the things that I found surprising, struggled with, or didn’t understand. Their blog is much more didactic.1

One really cool thing that I picked up from that blog post is the include_str!() macro, which makes it possible to read the input file into a string at compile time, and dispense with the read_lines() boilerplate and most of the error handling. I think I will be using this from now on. This way also makes it easy to substitute in the example inputs from the puzzle descriptions.

Also in the meantime, I spent a lot of time trying to configure either of the Rust Enhanced or SublimeLinter-contrib-rustc plugins for my Sublime Text editor. I had a surprisingly hard time with this, neither of them seemed to work. In the latter case it was because the plugin was out of sync with the API of the newest version of SublimeLinter. I thought the former plugin, Rust Enhanced, would be a smooth experience since it claims to be the officially sanctioned Rust plugin for Sublime Text, but it didn’t work either! Whenever I would try to run any of the Rust commands, I would get this message:

Blocking waiting for file lock on build directory

I finally figured out that this error goes away if you run cargo clean. My guess, it’s probably because I copied the puzzle15 folder to puzzle16 today, instead of creating it with cargo new! My bad!

The editor plugin makes development much smoother, since I can get the compiler’s answer directly when I save the file, without even leaving my editor. This makes the endless rounds of compile, add a &, compile, add a *, go much faster.

It is mildly annoying that the plugin shows the compiler messages inline, instead of in the editor gutter like all the SublimeLinter plugins do. One day when I have time, I’ll investigate to see if there is a setting that controls this.

On to the puzzle.

Day 16, Part 1

This puzzle is about train tickets. We get a more free-form input file than usual, and we have to use it to deduce the meanings of fields on train tickets that we’ve scanned. In the input file there is a list of valid ranges for each field, and lists of fields on train tickets. Some of the train tickets are invalid, and that’s what Part 1 of the puzzle is about. The answer is the sum of all the values on any ticket that are not valid no matter what field they belong to.

I start by thinking about what data structure I would use to store the ranges. Some of them are overlapping and it would be nice to be able to collapse those. I google “rust collapse range” and land in the intervallum package, where the IntervalSet type seems like exactly what I want for this. It can collapse two intervals into a single interval set which covers the two intervals, and it is allowed to have one or more holes in the middle. So I can collapse all the valid ranges together, and then check each ticket field value to see if it is contained within the any of the valid ranges.

Here’s the program that I write. I split the input into three blocks, and process the first one (the one with the valid ranges in it.)

extern crate gcollections;
extern crate interval;
#[macro_use]
extern crate scan_fmt;

use gcollections::ops::set::Union;
use interval::interval_set::ToIntervalSet;

fn main() {
    let input = "class: 1-3 or 5-7\n\
row: 6-11 or 33-44\n\
seat: 13-40 or 45-50\n\
\n\
your ticket:\n\
7,1,14\n\
\n\
nearby tickets:\n\
7,3,47\n\
40,4,50\n\
55,2,20\n\
38,6,12\n";
    // let input = include_str!("input");
    let mut blocks = input.split("\n\n");

    let mut constraints = vec![].to_interval_set();
    blocks
        .next()
        .unwrap()
        .lines()
        .map(|line| {
            let mut parts = line.split(": ");
            (parts.next().unwrap(), parts.next().unwrap())
        })
        .flat_map(|(_, intervals)| intervals.split(" or "))
        .map(|interval| {
            scan_fmt!(interval, "{d}-{d}", u16, u16)
                .unwrap()
                .to_interval_set()
        })
        .for_each(|interval| constraints = constraints.union(&interval));
    println!("{}", constraints);

    let my_ticket_block = blocks.next().unwrap();
    assert!(my_ticket_block.starts_with("your ticket:\n"));

    let other_tickets_block = blocks.next().unwrap();
    assert!(other_tickets_block.starts_with("nearby tickets:\n"));
}

I initially thought .collect::<Vec<(u16, u16)>>().to_interval_set() would work, but that gives me a panic: “This operation is only for pushing interval to the back of the array, possibly overlapping with the last element.” so apparently we have to use union() on each item. Having a bit of experience with Rust and its reputation for safety, I’d expect it to return a Result or something if you tried this, instead of crashing. This seems consistent with my impression that some Rust packages are high quality and others are a bit rough around the edges, similar to NPM packages. This one seems a bit rough and underdocumented, but good enough that we can still use it.

To use the union() method, I inexplicably have to include the gcollections package as well.

Once that’s done I can process the third block as well, and calculate the answer with one long iterator chain:

let other_tickets_block = blocks.next().unwrap();
assert!(other_tickets_block.starts_with("nearby tickets:\n"));
let error_rate: u16 = other_tickets_block
    .lines()
    .skip(1)
    .flat_map(|line| line.split(','))
    .map(|s| s.parse().unwrap())
    .filter(|val| !constraints.contains(val))
    .sum();

println!("Error rate: {}", error_rate);

Day 16, Part 2

In Part 2 of the puzzle we have to discard all the tickets that had an invalid value, and with the remaining tickets figure out which field belongs to which field name, by comparing the fields with the valid ranges for each field name. The answer is the product of all the values on our own ticket that correspond to fields whose name starts with “departure”.

A lot of the logic will be the same as in Part 1, so I start by rewriting Part 1 to save the intermediate steps that we’ll need for Part 2 as well:

let mut blocks = input.split("\n\n");

let constraints_block = blocks.next().unwrap();
let field_descriptions: HashMap<&str, interval::interval_set::IntervalSet<u16>> =
    constraints_block
        .lines()
        .map(|line| {
            let mut parts = line.split(": ");
            let field_name = parts.next().unwrap();
            let intervals = parts.next().unwrap();
            let interval_set = intervals
                .split(" or ")
                .map(|interval| scan_fmt!(interval, "{d}-{d}", u16, u16).unwrap())
                .collect::<Vec<(u16, u16)>>()
                .to_interval_set();
            (field_name, interval_set)
        })
        .collect();

let mut all_valid_values = vec![].to_interval_set();
for interval in field_descriptions.values() {
    all_valid_values = all_valid_values.union(interval);
}

let my_ticket_block = blocks.next().unwrap();
assert!(my_ticket_block.starts_with("your ticket:\n"));

let other_tickets_block = blocks.next().unwrap();
assert!(other_tickets_block.starts_with("nearby tickets:\n"));
let (valid_tickets, invalid_tickets): (Vec<Vec<u16>>, Vec<Vec<u16>>) = other_tickets_block
    .lines()
    .skip(1)
    .map(|line| {
        line.split(',')
            .map(|s| s.parse().unwrap())
            .collect::<Vec<u16>>()
    })
    .partition(|ticket| ticket.iter().all(|val| all_valid_values.contains(val)));

if is_part2() {
} else {
    let error_rate: u16 = invalid_tickets
        .iter()
        .flat_map(|ticket| ticket.iter().filter(|val| !all_valid_values.contains(val)))
        .sum();
    println!("Error rate: {}", error_rate);
}

I am excited to use the partition() method, that I found by browsing the documentation, to split the iterator into two vectors, one of valid and one of invalid tickets.

I now have vectors of the numbers from each ticket. However, to solve this puzzle I’ll have to transform those into vectors of all the numbers in each position on the tickets (for example, a vector of all numbers coming first on each ticket, coming second, etc.) instead of vectors of each ticket. This sounds like something I should be able to do with a zip() method which I’m familiar with from Python.

Rust’s zip() takes only two iterators. At first I think that itertools::multizip will solve my problem, but that takes a number of iterators that must be known at compile time, which is not the case here. After some more googling I find this Stack Overflow answer and find a code snippet which I simply copy as-is into my program. I don’t really understand it, but I understand enough to see approximately what it does.

I decide to store the possible fields for each position on the ticket as a vector of hashsets, possible_fields_by_position. The order of the vector is by position of the number on the ticket. The elements of the vector are hash sets consisting of zero or more field numbers, which the numbers in this position might be valid for.

To do this, I have to pre-fill the vector with empty hashsets, and I’m not sure how to do this. Googling “rust fill vector of hashset” sends me here.

If the problem is solvable, then there will be at least one element of the vector where the hash set has only one item in it, and then we will know what field corresponds with that position on the ticket. We can then consider that field “determined” and remove it from all the other hash sets in the vector, which hopefully leaves at least one more hash set with only one item, allowing is to uniquely determine another field, and so on.

One interesting thing to note is that I first had this:

for (_, remaining_fields) in possible_fields_by_position {
    remaining_fields.remove(field_ix);
}

Here, for the first time that I’ve seen, the compiler gives a plain wrong suggestion:

error[E0382]: borrow of moved value: `possible_fields_by_position`
   --> puzzle16.rs:97:15
    |
78  |         let mut possible_fields_by_position: Vec<(usize, HashSet<usize>)> = (0..valid_tickets[0]
    |             ------------------------------- move occurs because `possible_fields_by_position` has type `Vec<(usize, std::collections::HashSet<usize>)>`, which does not implement the `Copy` trait
...
97  |         while possible_fields_by_position.len() > 0 {
    |               ^^^^^^^^^^^^^^^^^^^^^^^^^^^ value borrowed here after move
...
102 |             for (_, remaining_fields) in possible_fields_by_position {
    |                                          ---------------------------
    |                                          |
    |                                          value moved here, in previous iteration of loop
    |                                          help: consider borrowing to avoid moving into the for loop: `&possible_fields_by_position`

error[E0596]: cannot borrow `remaining_fields` as mutable, as it is not declared as mutable
   --> puzzle16.rs:103:17
    |
102 |             for (_, remaining_fields) in possible_fields_by_position {
    |                     ---------------- help: consider changing this to be mutable: `mut remaining_fields`
103 |                 remaining_fields.remove(field_ix);
    |                 ^^^^^^^^^^^^^^^^ cannot borrow as mutable

Even without the above, I certainly had more trouble than usual with getting the & operators right! Maybe it’s time for a re-read of the chapter on ownership in the Rust book.

For the last step I am also left wondering how to pre-fill a vector with zeros. The syntax is [0; number_of_zeros]. I do actually stumble upon a Github issue complaining about it not being documented well enough.

Here is the very long full program:

extern crate gcollections;
extern crate interval;
#[macro_use]
extern crate scan_fmt;

use gcollections::ops::set::{Contains, Union};
use interval::interval_set::{IntervalSet, ToIntervalSet};
use std::collections::HashSet;
use std::env;

// https://stackoverflow.com/a/55292215/172999
struct Multizip<T>(Vec<T>);

impl<T> Iterator for Multizip<T>
where
    T: Iterator,
{
    type Item = Vec<T::Item>;

    fn next(&mut self) -> Option<Self::Item> {
        self.0.iter_mut().map(Iterator::next).collect()
    }
}

fn main() {
    let input = include_str!("input");
    let mut blocks = input.split("\n\n");

    let constraints_block = blocks.next().unwrap();
    let field_descriptions: Vec<(&str, IntervalSet<u16>)> = constraints_block
        .lines()
        .map(|line| {
            let mut parts = line.split(": ");
            let field_name = parts.next().unwrap();
            let interval_set = parts
                .next()
                .unwrap()
                .split(" or ")
                .map(|interval| scan_fmt!(interval, "{d}-{d}", u16, u16).unwrap())
                .collect::<Vec<(u16, u16)>>()
                .to_interval_set();
            (field_name, interval_set)
        })
        .collect();

    let mut all_valid_values = vec![].to_interval_set();
    for (_, interval) in &field_descriptions {
        all_valid_values = all_valid_values.union(interval);
    }

    let my_ticket_block = blocks.next().unwrap();
    assert!(my_ticket_block.starts_with("your ticket:\n"));

    let other_tickets_block = blocks.next().unwrap();
    assert!(other_tickets_block.starts_with("nearby tickets:\n"));
    let (valid_tickets, invalid_tickets): (Vec<Vec<u16>>, Vec<Vec<u16>>) = other_tickets_block
        .lines()
        .skip(1)
        .map(read_csv_numbers)
        .partition(|ticket| ticket.iter().all(|val| all_valid_values.contains(val)));

    if is_part2() {
        let mut possible_fields_by_position: Vec<_> = (0..valid_tickets[0].len())
            .map(|_| HashSet::new())
            .enumerate()
            .collect();
        for (position, position_values) in
            Multizip(valid_tickets.iter().map(|ticket| ticket.iter()).collect()).enumerate()
        {
            for (field_ix, (_, interval)) in field_descriptions.iter().enumerate() {
                if position_values.iter().all(|val| interval.contains(val)) {
                    possible_fields_by_position[position].1.insert(field_ix);
                }
            }
        }

        possible_fields_by_position.sort_by_key(|(_, set)| set.len());
        possible_fields_by_position.reverse();

        let mut determined_fields_by_position = vec![0; possible_fields_by_position.len()];
        while !possible_fields_by_position.is_empty() {
            let (position, possible_fields) = possible_fields_by_position.pop().unwrap();
            assert!(possible_fields.len() == 1, "unable to determine fields");
            let field_ix = possible_fields.iter().next().unwrap();
            determined_fields_by_position[position] = *field_ix;
            for (_, remaining_fields) in &mut possible_fields_by_position {
                remaining_fields.remove(field_ix);
            }
        }

        let my_ticket_values: Vec<u16> = my_ticket_block
            .lines()
            .skip(1)
            .flat_map(read_csv_numbers)
            .collect();

        let answer: u64 = determined_fields_by_position
            .iter()
            .map(|field_ix| field_descriptions[*field_ix].0)
            .zip(my_ticket_values.iter())
            .filter(|(field_name, _)| field_name.starts_with("departure"))
            .map(|(_, value)| *value as u64)
            .product();

        println!("My ticket values {:?}", answer);
    } else {
        let error_rate: u16 = invalid_tickets
            .iter()
            .flat_map(|ticket| ticket.iter().filter(|val| !all_valid_values.contains(val)))
            .sum();
        println!("Error rate: {}", error_rate);
    }
}

fn read_csv_numbers(line: &str) -> Vec<u16> {
    line.split(',').map(|s| s.parse().unwrap()).collect()
}

Day 17, Part 1

I didn’t manage to finish the Day 16 blog post before I started working on Day 17’s puzzle, so I’m tacking it on here. Day 17 brings us a three-dimensional Conway’s Game of Life!

I remember very well implementing Game of Life with a different twist on Day 11, so today I will write some more code using ndarray, copying from Day 11 where I can.

In three dimensions we have 33 − 1 = 26 neighbours instead of the 32 − 1 = 8 that we have in two dimensions. I “hand-unrolled”2 the loop over the 8 neighbours in the calc_neighbours() function from Day 11:

fn calc_neighbours(seats: &Array2<i8>) -> Array2<i8> {
    let shape = seats.shape();
    let width = shape[0];
    let height = shape[1];
    let mut neighbours = Array2::<i8>::zeros(seats.raw_dim());
    // Add slices of the occupied seats shifted one space in each direction
    let mut slice = neighbours.slice_mut(s![1.., 1..]);
    slice += &seats.slice(s![..width - 1, ..height - 1]);
    slice = neighbours.slice_mut(s![.., 1..]);
    slice += &seats.slice(s![.., ..height - 1]);
    slice = neighbours.slice_mut(s![..width - 1, 1..]);
    slice += &seats.slice(s![1.., ..height - 1]);
    slice = neighbours.slice_mut(s![1.., ..]);
    slice += &seats.slice(s![..width - 1, ..]);
    slice = neighbours.slice_mut(s![..width - 1, ..]);
    slice += &seats.slice(s![1.., ..]);
    slice = neighbours.slice_mut(s![1.., ..height - 1]);
    slice += &seats.slice(s![..width - 1, 1..]);
    slice = neighbours.slice_mut(s![.., ..height - 1]);
    slice += &seats.slice(s![.., 1..]);
    slice = neighbours.slice_mut(s![..width - 1, ..height - 1]);
    slice += &seats.slice(s![1.., 1..]);
    neighbours
}

That will simply not do when we have 26 neighbours! I decide to first rewrite it in the Day 11 code with the outer product of two vectors [-1, 0, 1] and make sure that it still gives the same answer. I find that the cartesian_product() method from Itertools is ideal for this:

fn calc_neighbours(seats: &Array2<i8>) -> Array2<i8> {
    let shape = seats.shape();
    let width = shape[0] as isize;
    let height = shape[1] as isize;
    let mut neighbours = Array2::<i8>::zeros(seats.raw_dim());
    // Add slices of the occupied seats shifted one space in each direction
    for (xstart, ystart) in (-1..=1).cartesian_product(-1..=1) {
        if xstart == 0 && ystart == 0 {
            continue;
        }
        let xdest = xstart.max(0)..(width + xstart).min(width);
        let ydest = ystart.max(0)..(height + ystart).min(height);
        let xsource = (-xstart).max(0)..(width - xstart).min(width);
        let ysource = (-ystart).max(0)..(height - ystart).min(height);
        let mut slice = neighbours.slice_mut(s![xdest, ydest]);
        slice += &seats.slice(s![xsource, ysource]);
    }
    neighbours
}

It’s still pretty verbose, and I suspect it could be done more cleverly, but this is good enough to be straightforwardly adapted to three dimensions for Day 17:

fn calc_neighbours(grid: &Array3<i8>) -> Array3<i8> {
    let shape = grid.shape();
    let width = shape[0] as isize;
    let height = shape[1] as isize;
    let depth = shape[2] as isize;
    let mut neighbours = Array3::<i8>::zeros(grid.raw_dim());
    // Add slices of the occupied grid shifted one space in each direction
    for starts in iter::repeat(-1..=1).take(3).multi_cartesian_product() {
        if starts.iter().all(|start| *start == 0) {
            continue;
        }
        let (xstart, ystart, zstart) = (starts[0], starts[1], starts[2]);
        let xdest = xstart.max(0)..(width + xstart).min(width);
        let ydest = ystart.max(0)..(height + ystart).min(height);
        let zdest = zstart.max(0)..(depth + zstart).min(depth);
        let xsource = (-xstart).max(0)..(width - xstart).min(width);
        let ysource = (-ystart).max(0)..(height - ystart).min(height);
        let zsource = (-zstart).max(0)..(depth - zstart).min(depth);
        let mut slice = neighbours.slice_mut(s![xdest, ydest, zdest]);
        slice += &grid.slice(s![xsource, ysource, zsource]);
    }
    neighbours
}

Unlike on Day 11, the game board is infinite and has no walls. However, also unlike on Day 11 where we had a termination condition for the game, today we have to simulate exactly 6 iterations of the game, so we can actually pretend the board is finite. The occupied cells can expand at by 1 every turn at most, so the board can be bounded at the size of the input board plus the number of iterations in every direction.

In the example given in the puzzle description, the board is 3×3×1 and we simulate 3 iterations, so the maximum board size would be 9×9×7 in the example. (The example’s actual output fits in 7×7×5, but we need an upper bound.)

I’m able to reuse a lot of the code from Day 11, so this should look very familiar if you’ve been following along:

use itertools::Itertools;
use ndarray::{s, Array3};
use std::iter;

fn main() {
    // let input = ".#.\n..#\n###\n";
    let input = include_str!("input");
    let n_turns = 6;
    let mut grid = read_grid(input, n_turns);

    for _ in 0..n_turns {
        let neighbours = calc_neighbours(&grid);
        let activations = &neighbours.mapv(|count| (count == 3) as i16)
            * &grid.mapv(|active| (active == 0) as i16);
        let deactivations = &neighbours.mapv(|count| (count < 2 || count > 3) as i16) * &grid;
        grid = grid + activations - deactivations;
    }

    dump_grid(&grid);
    println!("{}", grid.sum());
}

fn calc_neighbours(grid: &Array3<i16>) -> Array3<i16> {
    let shape = grid.shape();
    let width = shape[0] as isize;
    let height = shape[1] as isize;
    let depth = shape[2] as isize;
    let mut neighbours = Array3::<i16>::zeros(grid.raw_dim());
    // Add slices of the occupied grid shifted one space in each direction
    for starts in iter::repeat(-1..=1).take(3).multi_cartesian_product() {
        if starts.iter().all(|start| *start == 0) {
            continue;
        }
        let (xstart, ystart, zstart) = (starts[0], starts[1], starts[2]);
        let xdest = xstart.max(0)..(width + xstart).min(width);
        let ydest = ystart.max(0)..(height + ystart).min(height);
        let zdest = zstart.max(0)..(depth + zstart).min(depth);
        let xsource = (-xstart).max(0)..(width - xstart).min(width);
        let ysource = (-ystart).max(0)..(height - ystart).min(height);
        let zsource = (-zstart).max(0)..(depth - zstart).min(depth);
        let mut slice = neighbours.slice_mut(s![xdest, ydest, zdest]);
        slice += &grid.slice(s![xsource, ysource, zsource]);
    }
    neighbours
}

fn read_grid(input: &str, padding: usize) -> Array3<i16> {
    let lines: Vec<&str> = input.lines().collect();
    let height = lines.len();
    let width = lines[0].len();
    let mut cells = Array3::zeros((width + 2 * padding, height + 2 * padding, 2 * padding + 1));
    for (y, line) in lines.iter().enumerate() {
        for (x, tile) in line.bytes().enumerate() {
            cells[[x + padding, y + padding, padding]] = match tile {
                b'#' => 1,
                b'.' => 0,
                _ => panic!("Bad tile '{}'", tile),
            };
        }
    }
    cells
}

fn dump_grid(grid: &Array3<i16>) {
    for xy in grid.axis_iter(ndarray::Axis(2)) {
        for x in xy.axis_iter(ndarray::Axis(1)) {
            println!(
                "{}",
                x.mapv(|active| if active != 0 { '#' } else { '.' })
                    .iter()
                    .collect::<String>()
            )
        }
        println!("");
    }
}

The main difference is that I’ve written a dump_grid() function as well.

I remarked on Day 11 that in my opinion the ndarray package suffers from a few deficiencies compared to NumPy, and I ran into those same problems today:

  • The return value of sum() is limited to the data type of the array it is being called on.
  • You can’t easily use boolean arrays as a mask.

Day 17, Part 2

Part 2 of the puzzle is exactly the same as Part 1, only four-dimensional! I briefly wish that I had spent the time on generalizing calc_neighbours() to work with any number of dimensions. But on the other hand, I really don’t feel confident enough with either Rust or ndarray that I could anticipate being able to do that without spending all day on it.

So instead I decide to do the simplest thing: copy the file to another puzzle17-2.rs and add another section to Cargo.toml:

[[bin]]
name = "puzzle17-2"
path = "puzzle17-2.rs"

I notice that you now need to specify which binary to run, which is nice, because I wasn’t sure how I would choose between the two:

error: `cargo run` could not determine which binary to run. Use the `--bin` option to specify a binary, or the `default-run` manifest key.
available binaries: puzzle17, puzzle17-2

Then I just change all the Array3 to Array4 and add an extra index where needed:

use itertools::Itertools;
use ndarray::{s, Array4};
use std::iter;

fn main() {
    let input = include_str!("input");
    let n_turns = 6;
    let mut grid = read_grid(input, n_turns);

    for _ in 0..n_turns {
        let neighbours = calc_neighbours(&grid);
        let activations = &neighbours.mapv(|count| (count == 3) as i16)
            * &grid.mapv(|active| (active == 0) as i16);
        let deactivations = &neighbours.mapv(|count| (count < 2 || count > 3) as i16) * &grid;
        grid = grid + activations - deactivations;
    }

    println!("{}", grid.sum());
}

fn calc_neighbours(grid: &Array4<i16>) -> Array4<i16> {
    let shape = grid.shape();
    let width = shape[0] as isize;
    let height = shape[1] as isize;
    let depth = shape[2] as isize;
    let limit4 = shape[3] as isize;
    let mut neighbours = Array4::<i16>::zeros(grid.raw_dim());
    // Add slices of the occupied grid shifted one space in each direction
    for starts in iter::repeat(-1..=1).take(4).multi_cartesian_product() {
        if starts.iter().all(|start| *start == 0) {
            continue;
        }
        let (xstart, ystart, zstart, wstart) = (starts[0], starts[1], starts[2], starts[3]);
        let xdest = xstart.max(0)..(width + xstart).min(width);
        let ydest = ystart.max(0)..(height + ystart).min(height);
        let zdest = zstart.max(0)..(depth + zstart).min(depth);
        let wdest = wstart.max(0)..(limit4 + wstart).min(limit4);
        let xsource = (-xstart).max(0)..(width - xstart).min(width);
        let ysource = (-ystart).max(0)..(height - ystart).min(height);
        let zsource = (-zstart).max(0)..(depth - zstart).min(depth);
        let wsource = (-wstart).max(0)..(limit4 - wstart).min(limit4);
        let mut slice = neighbours.slice_mut(s![xdest, ydest, zdest, wdest]);
        slice += &grid.slice(s![xsource, ysource, zsource, wsource]);
    }
    neighbours
}

fn read_grid(input: &str, padding: usize) -> Array4<i16> {
    let lines: Vec<&str> = input.lines().collect();
    let height = lines.len();
    let width = lines[0].len();
    let mut cells = Array4::zeros((
        width + 2 * padding,
        height + 2 * padding,
        2 * padding + 1,
        2 * padding + 1,
    ));
    for (y, line) in lines.iter().enumerate() {
        for (x, tile) in line.bytes().enumerate() {
            cells[[x + padding, y + padding, padding, padding]] = match tile {
                b'#' => 1,
                b'.' => 0,
                _ => panic!("Bad tile '{}'", tile),
            };
        }
    }
    cells
}

This gives me the right answer.

Afterword

The puzzles seem to be getting harder as we go along, particularly Day 16 more than Day 17. I am almost certain that Day 16 could be solved much more elegantly than the solution that I had. Day 17, on the other hand, gave me a chance to improve on the solution that I already had for Day 11.

Having the compiler messages directly in my editor is a mixed blessing! It’s definitely streamlined things, but on the other hand I think it has actually made me more lazy and less inclined to learn about when the & operator is necessary — I’ve caught myself just adding them here and there in likely places without bothering to think, and letting the compiler sort the rest out. I’m definitely not this careless with pointers in C!


[1] And therefore probably more useful if you actually want to learn Rust yourself

[2] Or more like, the loop was never rolled in the first place because it was too complicated for me to write at the time

Advent of Rust 10: When All You Have is Itertools, Every Problem Looks like an Iter

Welcome back to episode 10 of the previously-stream-of-consciousness-now-abbreviated log of me trying to teach myself the Rust programming language, stumbling my way through the puzzles from Advent of Code 2020.

I have noticed that by now doing a puzzle each day and writing about it is somewhat tiring. I’ve found myself eager to solve the puzzle in the quickest way possible rather than to use each day’s puzzle as an excuse to learn a new language feature in Rust. But on the other hand I think that’s OK! Despite that, I am still enjoying it and looking forward each day to seeing what the new puzzle is.

It’s just fine to want to solve puzzles in a hacky way; once the puzzle is solved, it’s done!

Day 10, Part 1

The first part of the puzzle is to figure out how to chain “joltage” (a made-up kind of voltage?) adapters together. Each adapter is rated for a different number of “jolts” and they all have to be connected in increasing order such that there is no more than a difference of 3 jolts between them, and all the adapters are used. I think this is a roundabout way of describing that the list has to be sorted in ascending numerical order?!

The answer to the puzzle is the number of adapters with 1 jolt difference between them, times the number of adapters with 3 jolts difference between them. This reminds me of Day 5 when I wanted to find which seat number was missing from a list and so looked for the item that did not have a number one higher than its predecessor.

This sounds like a job for … Itertools!

The initial approach of forging ahead and writing the code that I think will work, compiling it until it runs, and then trying the answer on the website, has worked many times so far, so I try it again now. I will sort the input array, add the device’s built-in adapter which is always 3 jolts higher than the highest adapter in the list, calculate the differences between consecutive elements using tuple_windows(), and then count the number of occurrences of each of those differences.

It’s for the latter step that I spend some time browsing the Itertools documentation to see if there is a method that already does this. The closest I come is to sort the vector of differences and then use group_by() on it, helpfully copying the code from the example usage of group_by() in the documentation, and then count the length of each of the groups.

Here’s the code:

let mut adapters: Vec<u8> = read_lines(file).map(|s| s.parse().unwrap()).collect();
adapters.sort_unstable();
adapters.push(adapters.last().unwrap() + 3); // add built-in adapter

let mut differences: Vec<u8> = adapters
    .iter()
    .tuple_windows()
    .map(|(j1, j2)| j2 - j1)
    .collect();
differences.sort_unstable();

let mut occurrences = vec![0, 0, 0];
for (key, group) in &differences.iter().group_by(|d| *d) {
    occurrences[*key as usize - 1] = group.count();
}

println!("{}", occurrences[0] * occurrences[2]);

Clippy helpfully reminds me to use sort_unstable() instead of sort() for u8 which doesn’t need to preserve the order of equal elements, and to use count() instead of the much less readable collect::<Vec<&u8>>().len(). Also, the * operators that you see in the code above were definitely the compiler’s idea, not mine.

I get an answer and put it into the Advent of Code website, and it’s too low. I print out adapters, differences, and occurrences to see if they make sense, and they do, but I notice the problem: I’ve added the built-in adapter on one end, which is 3 jolts higher than the highest one in the list, but I’ve forgotten to add the charging outlet on the other end, which is always rated 0 jolts. I do this (add adapters.push(0); before the sort) and this time get the right answer.

Before going on to Part 2, I look at this code again and think that maybe this is maybe just a tiny a bit too much Itertools, and I’m trying too hard to be clever. I refactor it into this, which still gives me the correct answer but only iterates through the sorted adapters once:

fn main() -> Result<(), io::Error> {
    let file = fs::File::open("input")?;
    let mut adapters: Vec<u8> = read_lines(file).map(|s| s.parse().unwrap()).collect();
    adapters.push(0); // add charging outlet
    adapters.sort_unstable();
    adapters.push(adapters.last().unwrap() + 3); // add built-in adapter

    let mut ones = 0;
    let mut threes = 0;
    for difference in adapters.iter().tuple_windows().map(|(j1, j2)| j2 - j1) {
        match difference {
            1 => ones += 1,
            3 => threes += 1,
            _ => (),
        }
    }

    println!("{}", ones * threes);
    Ok(())
}

Day 10, Part 2

Part 2 is to figure out the number of possible configurations of adapters with which you can connect the charging outlet to the device. Each adapter must have a higher number than the one before it, but no more than 3 higher.

The way they describe the puzzle, it sounds like brute force is not an option here! I first start thinking of a tree structure, but quickly discard that idea, because if there are trillions of possible configurations, then the tree would need to have trillions of branches.

For quite a long time I stare at the worked example in the puzzle description which has 8 possible configurations, scrolling back and forth between the list of adapters and the list of configurations. In the list of configurations I notice that many of the numbers remain the same because you have only one choice for the adapter that goes there. Looking at all 8 configurations, they all follow this pattern:

0, 1, 4, 5?, 6?, 7, 10, 11?, 12, 15, 16, 19, 22

Only 5, 6 and 11 are optional. Each one may occur or it may not, so that’s 23 = 8 combinations. And in fact, when you have reached adapter 4 you have 3 choices for the next adapter (5, 6, or 7), when you have reached adapter 10 you have 2 choices for the next adapter (11 or 12), and all the other adapters only have one possible choice for the next adapter. So the number of combinations is the product of 2n − 1 for each adapter where n is the number of possible choices for the next adapter.

For a little while I think this is how to get the answer, but then I realize this can’t be true, because according to this formula the answer can only be a power of 2 (since it’s calculated by multiplying powers of 2 together.) The answer to the second worked example, however, is 19280, not a power of 2.

I suppose if you had the sequence 0, 3, 4, 5, 6, 7, 10 then things would be a bit different! (Unlike the first worked example, there are five adapters next to each other in the list with a difference of 1 between them, not four.)

These would be the valid configurations for that sequence:

0, 3, 4, 5, 6, 7, 10
0, 3, 4, 5,    7, 10
0, 3, 4,    6, 7, 10
0, 3, 4,       7, 10
0, 3,    5, 6, 7, 10
0, 3,    5,    7, 10
0, 3,       6, 7, 10

There are 7 of them, clearly not a power of 2. My scheme was thrown off because the number of choices at one adapter can affect the number of choices at another adapter. At 5 you can choose to skip 6, but you can’t skip 6 if you’ve already skipped 4.

I did notice when I debug-printed occurrences in Part 11, that there were no differences of 2 in the list. They were all differences of 1 or 3. My hunch is that a difference of 3 “resets” things, since the two numbers on either side of it must always occur. So we can consider each “run” of ones in isolation.

We know from the original worked example (differences of 1, 3, 1, 1, 1, 3, 1, 1, 3, 1, 3, 3) that a run of one one yields only one choice, a run of two ones yields two choices (10, 11?, 12), a run of three ones yields four choices (4, 5?, 6?, 7), and now we know from my new example above that a run of four ones yields seven choices.

I check what the valid configurations are for a run of five ones, let’s say 0, 3, 4, 5, 6, 7, 8, 11:

0, 3, 4, 5, 6, 7, 8, 11
0, 3, 4, 5, 6,    8, 11
0, 3, 4, 5,    7, 8, 11
0, 3, 4, 5,       8, 11
0, 3, 4,    6, 7, 8, 11
0, 3, 4,    6,    8, 11
0, 3, 4,       7, 8, 11
0, 3,    5, 6, 7, 8, 11
0, 3,    5, 6,    8, 11
0, 3,    5,    7, 8, 11
0, 3,    5,       8, 11
0, 3,       6, 7, 8, 11
0, 3,       6,    8, 11

There are 13 of them. And a run of six ones, let’s say 0, 3, 4, 5, 6, 7, 8, 9, 12:

0, 3, 4, 5, 6, 7, 8, 9, 12
0, 3, 4, 5, 6, 7,    9, 12
0, 3, 4, 5, 6,    8, 9, 12
0, 3, 4, 5, 6,       9, 12
0, 3, 4, 5,    7, 8, 9, 12
0, 3, 4, 5,    7,    9, 12
0, 3, 4, 5,       8, 9, 12
0, 3, 4,    6, 7, 8, 9, 12
0, 3, 4,    6, 7,    9, 12
0, 3, 4,    6,    8, 9, 12
0, 3, 4,    6,       9, 12
0, 3, 4,       7, 8, 9, 12
0, 3, 4,       7,    9, 12
0, 3,    5, 6, 7, 8, 9, 12
0, 3,    5, 6, 7,    9, 12
0, 3,    5, 6,    8, 9, 12
0, 3,    5, 6,       9, 12
0, 3,    5,    7, 8, 9, 12
0, 3,    5,    7,    9, 12
0, 3,    5,       8, 9, 12
0, 3,       6, 7, 8, 9, 12
0, 3,       6, 7,    9, 12
0, 3,       6,    8, 9, 12
0, 3,       6,       9, 12

Here, there are 24 possible combinations. Each time we start with 2n − 1 combinations, but eliminate a few due to having gaps larger than 3.

I don’t really see a pattern in the number of choices that have to be eliminated because of gaps larger than 3:

run lengthchoices
120 = 1
221 = 2
322 = 4
423 − 1 = 7
524 − 3 = 13
625 − 8 = 24
726 − 20 = 44
827 − 47 = 81

I do notice that any combinations eliminated in one row are also eliminated in all subsequent rows, so maybe it could be defined recursively in terms of doubling the previous entry and removing some more rows:

run length nchoices f(n)
11 = 1
22 × f(1) = 2
32 × f(2) = 4
42 × f(3) − 1 = 7
52 × f(4) − 1 = 13
62 × f(5) − 2 = 24
72 × f(6) − 4 = 44
82 x f(7) − 7 = 81

This is starting to make sense. When I write out the combinations in order to count the rows, the number of rows I have to remove each time consists of a pattern from one of the earlier entries but with three missing adapters in front of it. (I’m not sure if this is a good way to explain it, but maybe more visually: to get from the combinations for five ones to the combinations for six ones that I wrote out above, I add one to all the numbers except zero, insert a column of threes, copy all the lines and paste them below the existing lines, delete the threes from the newly-pasted lines, and then delete the last two rows since they start with 0, 3, 7 which is not allowed. The number of rows that I delete at the end corresponds to the number of rows in an earlier iteration of the pattern.) I predict that a run of nine ones is going to produce 149 choices, or 2 × 81 − 13, and when I write them out to make sure, that turns out to be correct: so I’m guessing that for larger n, f(n) = 2 × f(n − 1) − f(n − 4).

This is by no means a mathematical-level rigorous proof, but I have a working hypothesis that successfully predicted the next result, which is good enough for me to start coding. If this were not a one-off puzzle I would start by verifying the worked examples in the puzzle description, but I will just write the code and see if the first answer I get is correct.

I then laugh out loud as I realize that the totally unnecessary use of group_by() in Part 1 is actually a good way to implement Part 2.

let differences = adapters.iter().tuple_windows().map(|(j1, j2)| j2 - j1);
let groups = differences.group_by(|d| *d);
let total: u64 = groups
    .into_iter()
    .filter(|(key, _)| *key == 1)
    .map(|(_, group)| possible_configurations(group.count()))
    .product();
println!("{}", total);

// ...

fn possible_configurations(run_length: usize) -> u64 {
    match run_length {
        n if n < 1 => panic!("Bad value"),
        1 => 1,
        2 => 2,
        3 => 4,
        4 => 7,
        n => 2 * possible_configurations(n - 1) - possible_configurations(n - 4),
    }
}

I stumble over the return value of group_by() not being an iterator, and not having an iter() method like Vec but instead an into_iter() method. And as in Part 1, the *s are all the compiler’s idea.

Running this gives me the correct answer.

Afterword

I am now curious whether all the trouble I went to, to figure out the number of configurations for long runs of adapters with differences of 1 between them, was actually necessary. With a debug print I find that it wasn’t: the longest run of ones in the input is four. (And in hindsight I can tell that the n => line was never reached, because I initially forgot to multiply it by 2, but still got the correct answer!) I guess the lesson here is, when doing programming puzzles, check the input first, and see if you can get away with not handling cases that aren’t present!

That makes two paths that I took, and spent a lot of time on, that weren’t wrong as such, but were unnecessary. (The second one being, all the time I spent in Part 1 looking for an Itertools method that would count the occurrences in differences even though I knew how to write it with a simple for loop.) Both of them were caused by trying to be too clever. I should learn that for these Advent of Code puzzles, a clever solution doesn’t matter, it only matters that it’s correct! I should have taken my own advice that I wrote in the introduction to this blog post.

I also notice that today’s episode has been much more about figuring out what code to write, than about learning Rust. Once I had figured that out, I wrote the code, added whatever & or * operators the compiler told me to, and for the most part it worked.

I still wish I would understand enough to get the & and * operators right the first time. I suspect that if I set up my editor to show Rust types when hovering over an identifier, or something like that, then I’d be more aware of where I was working with a borrowed type, and be able to write the code accordingly.


[1] Debug printing is good for something, after all

Advent of Rust 9: Find That Number, (F)or Else

Welcome again to the not-so-stream-of-consciousness log, where the puns in the titles get worse every day. Today’s topic is the same as every day’s topic: me teaching myself the Rust programming language by doing programming puzzles from Advent of Code 2020.

Day 9, Part 1

Today’s puzzle is to analyze a long list of numbers, pretending that we are breaking an encryption scheme! The answer to the puzzle is the first number in the list (starting at position 26) that is not the sum of two numbers out of the “preamble” consisting of the previous 25 numbers in the list. Wow, that’s a technically correct but very confusing way to say it. Imagine, if you will, a snake moving through the list whose head is one number and whose tail is 25 numbers. If the snake’s head does not equal the sum of any two of the numbers from the snake’s tail, then the head number is the answer.1

As usual I start with cargo new puzzle9, download the input file, and set up puzzle9.rs with the boilerplate, as I did with puzzle8.rs yesterday.

Someone else might know a more efficient way to solve this,2 but I’m going to try to solve it with brute force and Itertools! I will use tuple_windows() to iterate through windows of length 26, and check the tuple_combinations() of the first 25 numbers of the window. I add itertools to Cargo.toml and use itertools::Itertools; to the top of the file.

First I write a map(|s| s.parse().unwrap()) that parses the numbers, then I try to iterate through windows. I’m soon disappointed that tuple_windows() only supports up to a window length of 4! I put the items into a vector instead and implement the window iteration myself, by taking a slice of &numbers[from..to].

Since I need to stop the tuple_combinations() loop when a sum is found, but stop the tuple_windows() loop when the inner loop is not stopped, I am wondering if Rust has a for-else construct like Python does. I google “rust for else” but find that it does not, although I do read that the loop expression (which I am seeing now for the first time) can actually return a value by using break value;, which is something I’ve never encountered before in a programming language! I wonder if the for loop can do this too? I read a bit more and find that no, it cannot.

Ugh, I don’t much like the found = true idiom ever since I learned about Python’s for-else, but nothing to do I guess, unless I want to refactor this into a separate function. And I don’t! I just cram it all in one main() function:

fn main() -> Result<(), io::Error> {
    let file = fs::File::open("input")?;
    let numbers: Vec<u64> = read_lines(file).map(|s| s.parse().unwrap()).collect();

    let window_size = 25;
    let mut answer = None;
    for ix in window_size..numbers.len() {
        let preamble = &numbers[(ix - window_size)..ix];
        let mut valid = false;
        for (a, b) in preamble.iter().tuple_combinations() {
            if a + b == numbers[ix] {
                valid = true;
                break;
            }
        }
        if !valid {
            answer = Some(numbers[ix]);
            break;
        }
    }

    println!("{}", answer.unwrap());

    Ok(())
}

Some interesting mistakes that I made:

  • I originally tried to parse the file as a list of u32 because I didn’t scroll down far enough and see that there were huge numbers, so I got a panic on parse().
  • I made the window_size 5 initially instead of 25, because I was confused with the example given in the text.
  • I forgot all the muts on my variables.
  • I thought I was being clever by making answer an Option, but then I was not clever enough to remember to assert that the answer was actually found, with unwrap().

Despite the mistakes I was happy to see that the compiler could infer the type of parse() without having to specify parse::<u64>().

This program gives me the answer, and so I can unlock Part 2 of the puzzle.

Day 9, Part 2

This is the first Part 2 of a puzzle that actually uses the answer from Part 1! So I don’t need the is_part2() function today.

This time, I must find a contiguous set of at least two numbers in the list that add up to the answer of Part 1. The answer to Part 2 is the sum of the smallest and largest number in that set.

I’m going to have to do a lot more iteration over sliding windows of the list in this one! That tuple_windows() iterator sure would be nice. I do a bit more searching and find what I hadn’t realized before: you can do sliding windows of arbitrary size with windows(), on a slice, just not on an iterator. I’d really prefer to use windows(),3 so maybe I can refactor Part 1 to do that before I write Part 2.

for window in numbers.windows(window_size + 1) {
    let preamble = &window[..window_size];
    let number = *window.last().unwrap();
    let mut valid = false;
    for (a, b) in preamble.iter().tuple_combinations() {
        if a + b == number {
            valid = true;
            break;
        }
    }
    if !valid {
        answer = Some(number);
        break;
    }
}

This is a bit more readable!

Now to write Part 2. I will start with sliding windows of 2 contiguous elements and keep increasing the window size until I find the answer:

fn encryption_weakness(numbers: &[u64], invalid_number: u64) -> Option<u64> {
    for window_size in 2..numbers.len() {
        for window in numbers.windows(window_size) {
            if window.iter().sum::<u64>() == invalid_number {
                return Some(window.first().unwrap() + window.last().unwrap());
            }
        }
    }
    None
}

I get this to compile without too much trouble. I have to specify the type of sum::<u64> and I don’t understand why, similar to an error I got a few days ago. Helpfully, cargo clippy tells me to use &[u64] as the type of the first parameter instead of &Vec<u64>. Nonetheless, it doesn’t work. The answer is wrong, and this time the website doesn’t say if it’s too high or too low (probably because it’s not relevant.) I debug-print the window of numbers to try to get a clue about why the answer is wrong, and I find out that I added the first and last number in the window, not the largest and smallest number as I was supposed to.

This is a nice excuse to use the Itertools minmax() method, which I scrolled past a few days ago and thought was cool! I now read that it has an enum return type, which luckily I learned about yesterday, so I think I should know how to get an answer out of it. I try this:

let itertools::MinMaxResult::MinMax(largest, smallest) = window.iter().minmax();

I wasn’t sure if let with an enum-pattern would even work, but I decided to try it, as I thought I had seen it elsewhere in an example. I originally thought this would assert that the result was not one of the other variants of the enum (NoElements or OneElement.) But instead, the compiler tells me “let bindings require an ‘irrefutable pattern’, like a struct or an enum with only one variant”, and suggests using if let. That seems like a good suggestion, but instead I peeked into the documentation for itertools::MinMaxResult and decided to use into_option() instead:

let (largest, smallest) = window.iter().minmax().into_option().unwrap();

Other than that, it worked well! I got the right answer and solved the puzzle.

Now that I have a nice-looking encryption_weakness() function, I’d really like to refactor Part 1 to remove this horrible valid = true thing. I create a function invalid_number and that lets me return instead of the outer break statement. In addition, I realize that I can use Iterator::any() to get rid of the inner break statement.

Full code:

use itertools::Itertools;
use std::fs;
use std::io::{self, BufRead};

fn main() -> Result<(), io::Error> {
    let file = fs::File::open("input")?;
    let numbers: Vec<u64> = read_lines(file).map(|s| s.parse().unwrap()).collect();

    let answer = invalid_number(&numbers, 25).unwrap();

    println!("Part 1: {}", answer);
    println!("Part 2: {}", encryption_weakness(&numbers, answer).unwrap());

    Ok(())
}

fn invalid_number(numbers: &[u64], window_size: usize) -> Option<u64> {
    for window in numbers.windows(window_size + 1) {
        let preamble = &window[..window_size];
        let number = window[window_size];
        if !preamble
            .iter()
            .tuple_combinations()
            .any(|(a, b)| a + b == number)
        {
            return Some(number);
        }
    }
    None
}

fn encryption_weakness(numbers: &[u64], invalid_number: u64) -> Option<u64> {
    for window_size in 2..numbers.len() {
        for window in numbers.windows(window_size) {
            if window.iter().sum::<u64>() == invalid_number {
                let (largest, smallest) = window.iter().minmax().into_option().unwrap();
                return Some(largest + smallest);
            }
        }
    }
    None
}

fn read_lines(file: fs::File) -> impl Iterator<Item = String> {
    io::BufReader::new(file).lines().map(|res| res.unwrap())
}

Afterword

I solved this puzzle fairly quickly. I can think of two reasons why: I was relatively familiar with Itertools because I’ve been using it often in the past few days, and I also managed to think of a solution for both parts of the puzzle that was able to be implemented by composing parts of Itertools and the standard library that I was already familiar with, or at least knew what to look for.

What can I say? I just love Itertools. I think having practiced solving Advent of Code puzzles with it, will also make me reach for Python’s itertools more often when I’m programming in Python.

Having now written two functions that return Option<u64>, I do wonder how Rust stores this type internally! Is it two 8-byte words with 63 wasted bits, or is it 9 bytes with 7 wasted bits?

I’m liking the abbreviated style that I’ve been using for the past couple of days and I think I’ll continue it.


[1] I’m not sure that’s clear either, but hopefully it will become clear from the code. Code might be difficult to read, but at least it’s unambiguous

[2] Writing this later, I wonder if a more efficient solution would involve sorting the arrays to eliminate some of the combinations? I don’t plan to spend time on it though

[3] Never thought I’d be writing those words on this blog

Advent of Rust 7: What Type is a sum()?

Welcome back to the stream-of-consciousness log of me solving the puzzles at Advent of Code 2020 in order to teach myself the Rust programming language.

This is the end of the first week of learning. The posts are getting shorter, and while I found in the first days that writing about each compiler error helped me think about what might be causing it, now I have usually figured out what’s going on before I finish writing.

So I think today’s might be the last installment in which I go into so much detail about compiler errors; starting this week I’ll post shorter entries, maybe bunched into a few days at a time, that write about the approach I took and what errors were still surprising to me, instead of writing about every time I have to add or delete a & operator. (But if hearing about those details would still be useful to you, now is a good time to let me know!)

Day 7, Part 1

Today’s puzzle is complicated, I have to read the description several times. The input file contains a number of rules about which color bags can hold which other color bags, written in prose. The answer to the puzzle is how many colors of bags can contain a “shiny gold” bag, either directly or by containing other bags which contain the shiny gold bag.

I start with cargo new puzzle7 and copying over the boilerplate from yesterday. Also, I’m not sure I understand exactly what the rules look like, so I download the input file and check what it contains. A lot of lines like this:

light green bags contain 2 pale cyan bags.
dim tan bags contain 3 shiny teal bags, 5 bright white bags, 4 striped bronze bags.
dotted magenta bags contain 2 faded cyan bags, 4 bright yellow bags.
dull silver bags contain 3 striped green bags.

For this puzzle I can ignore how many of each color of bag can be contained (I suspect it will come back in Part 2, though) and focus on which colors can be contained by which other colors.

I will need to figure out how to obtain the rules out of the prose-like sentences in the input file, and I think I will use scan_fmt! for this, so I add it to Cargo.toml.

I don’t immediately get an idea for a data structure to hold this information, so maybe I will work on the string scanning first, and see if I get any ideas about a data structure while I do that. The one thing that I do notice (after thinking about it for a little bit) is that the input file provides information about which colors can contain which other colors, but to answer the puzzle, I need to look up for a given color which colors it can be contained by. So in any case, I will need to reverse the relationships given by the rules.

But first to the string scanning!

for line in read_lines("input")?.map(|s| s.unwrap()) {
    let (container, contents) =
        scan_fmt!(&line, "{[a-z ]} bags contain {[a-z ]}.", String, String)
            .expect(&format!("{} didn't match pattern", line));
    let contained_bags = if contents == "no other bags" {
        vec![]
    } else {
        contents
            .split(", ")
            .map(|bag| {
                scan_fmt!(bag, "{[a-z ]} bag", String)
                    .expect(&format!("{} didn't match pattern", bag))
            })
            .collect::<Vec<String>>()
    };
    println!("{} -> {:?}", container, contained_bags);
}

I googled “rust format string” (as I haven’t actually done that yet, I’ve only formatted strings with println!()) and discover in Rust by Example that there is a format!() macro which works just like println!(). I use this to provide better error messages, since I do expect that I probably haven’t got the scanning patterns correct the first time.

Running cargo clippy suggests using .unwrap_or_else(|_| panic!("{} didn't match pattern", line)) instead of .expect(&format!("{} didn't match pattern", line)) and it even prints a link that I can read which explains why that’s better!

I made a few other typos and left out some & operators, but that stuff is probably boring by now. The one interesting error that I did make is that I originally tried collect::<Vec<&str>>(). I think this was probably due to not thinking hard enough about the type of the vector, rather than misunderstanding something. The compiler error was clear enough: “a value of type Vec<&str> cannot be built from an iterator over elements of type String“.

When I run this code, it panics on the first line of the input file! The line light green bags contain 2 pale cyan bags. doesn’t match the pattern that I gave to scan_fmt!(). Here’s where I would really like to be able to try a few things in a Rust REPL until I get it right. But I have an idea of what might be going wrong. Maybe the matchers in scan_fmt!() are eager, so the first {[a-z ]} matches everything until the end of the string? I try it like this:

let (adjective, color, contents) = scan_fmt!(
    &line,
    "{} {} bags contain {[a-z ]}.",
    String,
    String,
    String
)
.unwrap_or_else(|_| panic!("{} didn't match pattern", line));
let container = format!("{} {}", adjective, color);

(and the same change to the second scan_fmt!() call as well)

Having changed that I get another panic when trying to scan the contained bags. This time it tells me that the empty string didn’t match the pattern. I wonder how I got an empty string? I look at the scan_fmt!() documentation again figure out that {[a-z ]} is probably only matching one character; what I really want is {/[a-z ]+/}. I make this change, but the error still occurs.

I add a println!("{} {} {}", adjective, color, contents); after the first scan_fmt!() call to see what I’m getting. It seems that contents is blank. But, I see that I have forgotten the comma in the regular expression, and what’s more, even though I figured I could ignore the digits for Part 1 of the puzzle, I cannot ignore them when scanning the string, because of course they are still there! 😳

So, I use "{} {} bags contain {/[0-9a-z, ]+/}." for the first pattern, and "{*d} {} {} bag" for the second pattern, and the program runs. I get output such as:

light green -> ["pale cyan"]
dim tan -> ["shiny teal", "bright white", "striped bronze"]
dull aqua -> []

Now, I still have to figure out what data structure to store this in! I’m thinking that if I use a tree, where each node points to the other nodes that can contain that color of bag, then I can start at the node for shiny gold bags and just count the total number of nodes below it. I google “rust tree” and land on the trees package, but that seems more advanced than what I want, and I don’t understand the documentation very well. I also read this article which says it is more aimed at beginners, and from reading it, it seems like trees are fairly difficult to implement in Rust the way that you would in C, but there is a more convenient way to do it using Vec.1

It also occurs to me that I would like to be able to look up bag rules by the color of the bag. If I had a tree, I’d have to search the whole tree to find my shiny gold bag starting point. So maybe a HashMap<String, Vec<String>> is the appropriate data structure, with the key being the color of the bag, and the value being a vector of other bag colors that can contain this bag.

I decide to use the Entry API that I used yesterday to build the tree. I add use std::collections::HashMap; and let mut rules = HashMap::new(); in the appropriate places, and then write this code:

if contents != "no other bags" {
    contents
        .split(", ")
        .map(|bag| {
            let (adjective, color) = scan_fmt!(bag, "{*d} {} {} bag", String, String)
                .unwrap_or_else(|_| {
                    panic!("'{}' ({}) didn't match pattern", bag, contents)
                });
            format!("{} {}", adjective, color)
        })
        .for_each(|color| {
            let mut entry = rules.entry(color).or_insert(vec![]);
            entry.push(container);
        });
};
println!("{:?}", rules["shiny gold"]);

(Yes, unlike a few days ago I now know that iterators have a for_each() method because I found out that there is documentation for the Iterator trait where I can look…) This gives me a few errors:

warning: variable does not need to be mutable
  --> src/main.rs:31:25
   |
31 |                     let mut entry = rules.entry(color).or_insert(vec![]);
   |                         ----^^^^^
   |                         |
   |                         help: remove this `mut`
   |
   = note: `#[warn(unused_mut)]` on by default

error[E0507]: cannot move out of `container`, a captured variable in an `FnMut` closure
  --> src/main.rs:32:32
   |
21 |         let container = format!("{} {}", adjective, color);
   |             --------- captured outer variable
...
32 |                     entry.push(container);
   |                                ^^^^^^^^^ move occurs because `container` has type `String`, which does not implement the `Copy` trait

I remove the mut from entry (I wrote it because it seemed like an entry should be mutable, but then I realize I didn’t have mut yesterday either). The second error actually makes sense. I need to have a different copy of the string for the containing bag in each vector. I can’t remember off the top of my head whether this would be achieved by String::from() so I quickly google “rust copy string” and land right away on the Clone trait, so I add container.clone() to my code.

Now the code runs, but I get a panic in my debug-print of rules["shiny gold"]. This key doesn’t exist in the map! I change the line to print out the entirety of rules to check that the general shape of the data is what I expected; it is. I wonder why there’s no shiny gold?

I try to print rules["clear gray"] which is the last entry in my debug output, so I know it must exist in the map. I still get the panic, so therefore I must be indexing the map wrong. I think I am actually supposed to use get() and not the [] operator (though then why does the code compile?) I change it to rules.get(&"clear gray").expect("Key not present") and get another interesting error:

error[E0277]: the trait bound `String: Borrow<&str>` is not satisfied
  --> src/main.rs:37:32
   |
37 |         println!("{:?}", rules.get(&"clear gray").expect("Key not present"));
   |                                ^^^ the trait `Borrow<&str>` is not implemented for `String`
   |
   = help: the following implementations were found:
             <String as Borrow<str>>

If I’m reading this error message correctly, then I think it says that I can use "clear gray" here and it will be automatically converted to a temporary String to pass to get(), but the same does not happen for &"clear gray". So I remove the &, but then I get back to the panic that I got earlier: “Key not present”.

Maybe it’s get(&String::from("clear gray"))? Same result, that doesn’t work either.

Then I realize it! I accidentally have my debug-print inside the loop, so it’s printing out the rule after every line in the input file, instead of once at the end! So the first time I try to access rules["clear gray"], it hasn’t been inserted yet. I change the print statement back to rules["shiny gold"], move it down one line and indeed, I get a list of bag colors that can contain a shiny gold bag. One step closer to the solution!

Now I need to count the number of bag colors that can contain a shiny gold bag, plus the number of bag colors that can contain those colors of bags, and so on. The downside is that I don’t know whether there are bag colors that can contain each other (or in math-speak, whether the graph is cyclic.) If I knew for sure the graph was acyclic (bag colors can only contain other bag colors that they are not contained by) then I could implement this by just counting the number of elements in each Vec in the node for "shiny gold" and below. There’s an easy way to find out, though; if I try that and the graph is cyclic, then my program will go into an infinite loop, and I’ll notice soon enough! So I decide to try this. I write this function:

fn total_containers_for(rules: &HashMap<String, Vec<String>>, bag_color: &str) -> usize {
    let containers = &rules[bag_color];
    containers.len()
        + containers
            .iter()
            .map(|color| total_containers_for(rules, color))
            .sum()
}

and I try to print out total_containers_for(&rules, "shiny gold") at the end of the program. I get an error that I don’t understand:

error[E0284]: type annotations needed: cannot satisfy `<usize as std::ops::Add<_>>::Output == usize`
  --> src/main.rs:46:9
   |
46 |         + containers
   |         ^ cannot satisfy `<usize as std::ops::Add<_>>::Output == usize`

I stare at this for a while, but I really just don’t get any closer to understanding it. I refactor it into this, to see if the error is in the first or second half of the expression:

let n_direct_containers = containers.len();
let n_indirect_containers = containers
    .iter()
    .map(|color| total_containers_for(rules, color))
    .sum();
n_direct_containers + n_indirect_containers

That changes the error message into something more understandable:

error[E0282]: type annotations needed
  --> src/main.rs:46:9
   |
46 |     let n_indirect_containers = containers
   |         ^^^^^^^^^^^^^^^^^^^^^ consider giving `n_indirect_containers` a type

OK, that I can do! I add : usize to the variable and it compiles. (I’d like to refactor it back into one expression, but I’m not sure how to add a type annotation to only part of an expression, so I leave it like this.)

Now the program runs, but it panics again, in total_containers_for(), at the &rules[bag_color] expression. I add the following debug-prints:2

let mut debug: Vec<&String> = rules.keys().collect();
debug.sort();
println!("{:?}", debug);
let containers = &rules
    .get(bag_color)
    .unwrap_or_else(|| panic!("{} not found in rules", bag_color));

I find that light fuchsia is the key that’s not found, and indeed it’s not there in the sorted list of keys. So now at least I know I am indexing the map correctly, but there’s a bug in my code that puts all the rules into the map. I open the input file and do a search for light fuchsia. It’s only present in one line:

light fuchsia bags contain 1 shiny gold bag, 5 faded beige bags, 2 dark chartreuse bags, 3 light brown bags.

Apparently, nothing is allowed to contain light fuchsia bags! Since it never appears in the second half of a rule, it’s never added to the map. So what I need to do is add all of the colors in the first half of the rule as well. I add this line to the appropriate place:3

rules.entry(container).or_insert(vec![]);

But this is a mistake: it changes the inferred key type of the HashMap from String to &String, making the subsequent inserts invalid! Luckily I learned above that I can clone() the string, so I change it again to:

rules.entry(container.clone()).or_insert(vec![]);

With this, the program runs, and gives me an answer. It’s not over yet, though: the Advent of Code website tells me my answer is too high.

What to do next? I don’t have any immediate ideas about what could be going wrong. I decide to try running the program on the example input given in the puzzle description, to see if I get the right answer on that. I mv input real_input to back up the input, and then change the input file to have the following contents:

light red bags contain 1 bright white bag, 2 muted yellow bags.
dark orange bags contain 3 bright white bags, 4 muted yellow bags.
bright white bags contain 1 shiny gold bag.
muted yellow bags contain 2 shiny gold bags, 9 faded blue bags.
shiny gold bags contain 1 dark olive bag, 2 vibrant plum bags.
dark olive bags contain 3 faded blue bags, 4 dotted black bags.
vibrant plum bags contain 5 faded blue bags, 6 dotted black bags.
faded blue bags contain no other bags.
dotted black bags contain no other bags.

I then run the program again, and get the answer 6, whereas the correct answer is 4. So I add println!("{:?}", rules); to examine the rules map that I’ve generated. It all seems correct with what is in the file, but I think I’ve found the bug: I’m not removing duplicates from the list of bags that can contain shiny gold bags, so I’m counting light red and dark orange twice.

This will take a bit of refactoring! I change total_containers_for() to a new function all_containers_for() which returns a HashSet of bag colors that can contain the requested bag:

fn all_containers_for(rules: &HashMap<String, Vec<String>>, bag_color: &str) -> HashSet<String> {
    let containers = &rules[bag_color];
    let mut colors: HashSet<String> = containers.iter().map(|s| s.clone()).collect();
    for color in containers.iter() {
        let mut indirect_colors = all_containers_for(rules, color);
        colors.extend(indirect_colors.drain());
    }
    colors
}

Then to get my answer I do println!("{}", all_containers_for(&rules, "shiny gold").len());

I also get a suggestion from cargo clippy:

warning: you are using an explicit closure for cloning elements
  --> src/main.rs:46:39
   |
46 |     let mut colors: HashSet<String> = containers.iter().map(|s| s.clone()).collect();
   |                                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ help: consider calling the dedicated `cloned` method: `containers.iter().cloned()`

Well that’s useful!

This runs (except that I forgot the mut on both of the HashSet variables at first…) and I get the answer 4, as it should be. Hopefully if I run this again on the real input, I’ll get the real answer! I bring back the real input file with mv real_input input, run the program again, and get an answer. This time it’s correct. Phew!

Day 7, Part 2

Let’s hope Part 2 goes a bit faster than Part 1, this has been more difficult than the other days and I’m a bit tired of it!

Part 2 of the puzzle is to consider the rules the other way around. We have to once again start from a shiny gold bag, but this time say how many bags it must contain.

I think it’s best to assemble a different set of rules, since we are looking up the bags in the other direction. I’ll rename the existing rules to container_rules, and introduce a new contains_rules of type HashMap<String, Vec<(String, usize)>>. Then once I have a database of contains_rules, I can write a function total_contained_by() that will give me the total number of bags that must be contained by the given bag; recursively, similar to all_containers_for().

After a few typos and forgotten .clone()s, my main() function looks like this:

fn main() -> Result<(), io::Error> {
    let mut container_rules = HashMap::new();
    let mut contents_rules = HashMap::new();
    for line in read_lines("input")?.map(|s| s.unwrap()) {
        let (adjective, color, contents) = scan_fmt!(
            &line,
            "{} {} bags contain {/[0-9a-z, ]+/}.",
            String,
            String,
            String
        )
        .unwrap_or_else(|_| panic!("{} didn't match pattern", line));
        let container = format!("{} {}", adjective, color);
        container_rules.entry(container.clone()).or_insert(vec![]);
        if contents != "no other bags" {
            contents
                .split(", ")
                .map(|bag| {
                    let (num, adjective, color) =
                        scan_fmt!(bag, "{d} {} {} bag", usize, String, String).unwrap_or_else(
                            |_| panic!("'{}' ({}) didn't match pattern", bag, contents),
                        );
                    (num, format!("{} {}", adjective, color))
                })
                .for_each(|(num, color)| {
                    let container_entry = container_rules.entry(color.clone()).or_insert(vec![]);
                    container_entry.push(container.clone());
                    let contents_entry = contents_rules.entry(container.clone()).or_insert(vec![]);
                    contents_entry.push((num, color));
                });
        };
    }

    if is_part2() {
        println!("{:?}", contents_rules);
    } else {
        println!(
            "{}",
            all_containers_for(&container_rules, "shiny gold").len()
        );
    }

    Ok(())
}

I print out the new contents_rules map with cargo run 2 and it looks like what I might expect. Now I try to write total_contained_by():

fn total_contained_by(rules: &HashMap<String, Vec<(usize, String)>>, bag_color: &str) -> usize {
    rules[bag_color]
        .iter()
        .map(|(num, color)| num * total_contained_by(rules, color))
        .sum()
}

This compiles, but again crashes with a missing key when I run it. I do get a pretty quick idea of what the problem is, though: I’m forgetting to add the entry to contents_rules if the second part of the rule is contains no other bags. I move the let contents_entry = contents_rules.entry(container.clone()).or_insert(vec![]); line up above the if statement and try again. This time, I get an answer of zero! Clearly the wrong answer, because there is this line in the input file:

shiny gold bags contain 5 mirrored lavender bags, 4 shiny maroon bags, 4 striped yellow bags.

So even if those bags contain nothing else, the answer has to be at least 13.

I do a bit of debug-printing and find out that total_contained_by() is always returning zero, no matter what, so that’s why the total answer is zero. It should return zero for bags that don’t contain other bags. But if a bag contains no other bags, then I still have to count that bag itself! So I change the map() line in total_contained_by() to this:

.map(|(num, color)| num * total_contained_by(rules, color) + 1)

Now I get a more realistic answer from the program. I put it into the puzzle website, but it’s still too low. Then I realize that if my bag contains five other bags that each contain no other bags, I still have to count five bags, not one! In other words, I’ve forgotten the parentheses:

.map(|(num, color)| num * (total_contained_by(rules, color) + 1))

(I also check the puzzle description again to make sure I don’t need to count the original shiny gold bag, but I don’t!) Running the program again gives me a much higher answer which is, finally, correct according to the website.

Here’s the full code, but without the boilerplate:

use std::collections::{HashMap, HashSet};

#[macro_use]
extern crate scan_fmt;

fn main() -> Result<(), io::Error> {
    let mut container_rules = HashMap::new();
    let mut contents_rules = HashMap::new();
    for line in read_lines("input")?.map(|s| s.unwrap()) {
        let (adjective, color, contents) = scan_fmt!(
            &line,
            "{} {} bags contain {/[0-9a-z, ]+/}.",
            String,
            String,
            String
        )
        .unwrap_or_else(|_| panic!("{} didn't match pattern", line));
        let container = format!("{} {}", adjective, color);
        container_rules.entry(container.clone()).or_insert(vec![]);
        let contents_entry = contents_rules.entry(container.clone()).or_insert(vec![]);
        if contents != "no other bags" {
            contents
                .split(", ")
                .map(|bag| {
                    let (num, adjective, color) =
                        scan_fmt!(bag, "{d} {} {} bag", usize, String, String).unwrap_or_else(
                            |_| panic!("'{}' ({}) didn't match pattern", bag, contents),
                        );
                    (num, format!("{} {}", adjective, color))
                })
                .for_each(|(num, color)| {
                    let container_entry = container_rules.entry(color.clone()).or_insert(vec![]);
                    container_entry.push(container.clone());
                    contents_entry.push((num, color));
                });
        };
    }

    if is_part2() {
        println!("{}", total_contained_by(&contents_rules, "shiny gold"));
    } else {
        println!(
            "{}",
            all_containers_for(&container_rules, "shiny gold").len()
        );
    }

    Ok(())
}

fn total_contained_by(rules: &HashMap<String, Vec<(usize, String)>>, bag_color: &str) -> usize {
    rules[bag_color]
        .iter()
        .map(|(num, color)| num * (total_contained_by(rules, color) + 1))
        .sum()
}

fn all_containers_for(rules: &HashMap<String, Vec<String>>, bag_color: &str) -> HashSet<String> {
    let containers = &rules[bag_color];
    let mut colors: HashSet<String> = containers.iter().cloned().collect();
    for color in containers.iter() {
        let mut indirect_colors = all_containers_for(rules, color);
        colors.extend(indirect_colors.drain());
    }
    colors
}

Afterword

I had a more difficult time today than I did on the previous days! First I went down a wrong path while working with scan_fmt!() because I didn’t realize the matchers were greedy instead of lazy. Then I had that weird error “cannot satisfy <usize as std::ops::Add<_>>::Output == usize” which threw me for a loop. Finally, I made several mistakes in the logic of the program which I could have made just as easily in other programming languages. I don’t know if this was just a particularly difficult puzzle, or if the puzzles are getting more difficult as we go along, or if I was just having a bad day.

Yesterday and today I’ve not been writing down the instances where I misspelled something or forgot some & operators. Earlier in the week, writing everything out, as it happened, really helped me think about what to do next, but as I mentioned in the beginning it’s not so helpful at this point.

Lastly, I did find that when I got the wrong answer, it was really useful to run the program on the example input in the puzzle, and make sure that I compute the example answer. Thinking about it more, I’m surprised that today is the first time I’ve done this. I think I should do this more often.


[1] Maybe less efficient, but for a one-day programming puzzle, I don’t care

[2] I know, I know, one of these days I will google “rust debugger” instead of printing things. Today is not that day

[3] It seems like there ought to be a more idiomatic way to say “insert if not already existing”, but I don’t know what it is

Advent of Rust 4: It’s Hard to Return an Iterator

Welcome again to this stream-of-consciousness log about learning the Rust programming language by doing the programming puzzles in Advent of Code 2020, or as I like to call it, On the Code by Jack Kerouac.1 Let’s get on with it!

Day 4, Part 1

I start with cargo new puzzle4 and copying over the code from yesterday, but this time I’d like to refactor it a bit. I would like to write a function is_part2() that tells you whether the solution for Part 2 of the puzzle was requested or not, and I would like to change read_lines() so that it already does the expect() calls that I am copy-pasting every day, rather than returning a Result of an iterator of Results of strings.

Changing read_lines() really does not work for me. I can’t figure out how to return an iterator from the function! Well, clearly the function is already returning an iterator, but I can’t figure out how to express the iterator that I want to return, as a return type.

What I want is this:

fn read_lines<P>(filename: P) -> ???
where
    P: AsRef<path::Path>,
{
    let file = fs::File::open(filename).expect("Bad file");
    io::BufReader::new(file)
        .lines()
        .map(|s| s.expect("Bad line in file"))
}

where ??? ought to be something like Iterator<String>. But I cannot figure out how to write an iterator type. The iterator types that are returned from standard library functions all seem to be type aliases of some sort. For example, io::Lines is the iterator that lines() returns, and I know it is an iterator of Results of strings, because the documentation says so, but I don’t know how to build my own iterator type.

I google “rust return iterator” and the first result is encouragingly titled “Returning Rust Iterators”. This article suggests asking the compiler by returning the empty type () and letting it suggest what to return instead, so I do that.

Unfortunately, I get a type that I don’t think I can put into my program! The compiler says “expected unit type (), found struct Map<std::io::Lines<BufReader<File>>, [closure@src/main.rs:31:14: 31:46]>” I am guessing that referring to a type by the file in which it’s found and the lines and columns that it spans, is not legal Rust syntax!

So I try poking around with no success. By trying things and following the compiler’s suggestions, I end up with the awkward return type of iter::Map<io::Lines<io::BufReader<fs::File>>, dyn FnMut(io::Result<&String>) -> &String>, but this is still producing errors about lifetimes that I don’t understand. So I give up on this idea.

However, maybe I can write read_lines() so that it at least calls expect() on the io::Lines iterator that it returns:

fn read_lines<P>(filename: P) -> io::Lines<io::BufReader<fs::File>>
where
    P: AsRef<path::Path>,
{
    let file = fs::File::open(filename).expect("Bad file");
    io::BufReader::new(file).lines()
}

To be clear, I don’t really understand the P: AsRef<path::Path> part either. I guess I will try to refactor this again in a few days when I have learned a bit more.

Writing is_part2(), on the other hand, is quite straightforward:

fn is_part2() -> bool {
    let args: Vec<String> = env::args().collect();
    let default = String::from("1");
    let arg = args.get(1).unwrap_or(&default);
    arg == "2"
}

This works, but I actually think that I might be able to make this even nicer by using a match statement. I haven’t encountered the match statement directly yet, but I’ve seen it in several google results over the past few days. I google “rust match syntax”, land here, and come up with this:

fn is_part2() -> bool {
    match env::args().nth(1) {
        Some("2") => true,
        _ => false,
    }
}

This almost works, but I have to replace "2" with String::from("2"), and then the compiler tells me that I have to take it out of the match statement, because function calls are not allowed in pattern matching. So in the end it looks like this:

fn is_part2() -> bool {
    let string2 = String::from("2");
    match env::args().nth(1) {
        Some(string2) => true,
        _ => false,
    }
}

The funny thing is, though, that the compiler warns that string2 is an unused variable in both of the places that it is used! It seems like I haven’t understood this well enough. I google “rust match string” and land on a Stack Overflow question titled “How to match a String against string literals in Rust?”, but that is actually only good for when you know that you already have a String! But I have an Option<String>, so I google “rust match option string”, and find another Stack Overflow post with the topic “How can I pattern match against an Option<String>?”. The helpful answer says this is a known limitation of pattern-matching, but suggests two things to do instead. The second solution looks good to me, so I implement it:

fn is_part2() -> bool {
    match env::args().nth(1) {
        Some(s) if s == "2" => true,
        _ => false,
    }
}

This works, and looks like what I had in mind. Time to start on the puzzle!

Today’s puzzle is processing records that represent “passports”. These records consist of whitespace-separated key:value pairs, and records are separated by blank lines. There are eight possible keys, each consisting of three letters. The cid key is optional, and all the others are required. A record is valid if it contains all the required keys. The answer to the puzzle is the number of valid passports in the input file.

I start out by downloading the input file and put it in the project directory, as usual.

So first I think a bit about how I will process the data into records. My usual approach so far has been to split the file into lines and process each line using a chain of iterators, but maybe it’s better to let go of that approach for today. I could read the whole file into a string and split it on the blank lines in order to get an array where each element is a record, and then process each record. Or I could still process the file line-by-line, and use a for loop while keeping the notion of a “current” record, which I push into a vector when it is completed. I think I like that idea best so far.

The records can be hash sets, where I store the keys. That’s a bit wasteful since really all I need is one bit for each of the 8 possible keys, to tell whether the record has it or not! (I can ignore the values for now, since I only need to look at the keys.) But I decide to be wasteful nonetheless, for two reasons: first, I suspect Part 2 of the puzzle will require me to do something with the values, but if Rust is like other languages, a hash set will be easy enough to refactor into a hash map. Second, stuffing the string keys into a hash set or hash map is simple, and I won’t have to write code that translates key strings into record fields.

To make a first attempt, I google “rust hash set” and read the Rust by Example page. It’s interesting that you don’t have to specify the type of a HashSet, the compiler can figure it out by what you insert into it!

I also have to look in the API documentation for how to split a string on whitespace, but it seems there is conveniently a split_whitespace() method.

Here’s my attempt:

use std::collections::HashSet;
use std::env;
use std::fs;
use std::io::{self, BufRead};
use std::path;

fn main() {
    let mut passports = vec![];
    let mut current = HashSet::new();
    for line in read_lines("input").map(|s| s.expect("Bad line in file")) {
        if line == "" {
            passports.push(current);
            current = HashSet::new();
        }
        current = current.union(get_keys_from_line(&line));
    }

    if is_part2() {
        println!("part 2");
    } else {
        println!("the first passport is {:?}", passports[0]);
    }
}

fn get_keys_from_line(line: &str) -> HashSet<String> {
    let mut new_keys = HashSet::new();
    for pair in line.split_whitespace() {
        new_keys.insert(String::from(&pair[..3]));
    }
    new_keys
}

It looks like I got several & operators right this time, even though I still got one wrong:

error[E0308]: mismatched types
  --> src/main.rs:15:33
   |
15 |         current = current.union(get_keys_from_line(&line));
   |                                 ^^^^^^^^^^^^^^^^^^^^^^^^^
   |                                 |
   |                                 expected `&HashSet<_>`, found struct `HashSet`
   |                                 help: consider borrowing here: `&get_keys_from_line(&line)`
   |
   = note: expected reference `&HashSet<_>`
                 found struct `HashSet<String>`

error[E0308]: mismatched types
  --> src/main.rs:15:19
   |
15 |         current = current.union(get_keys_from_line(&line));
   |                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected struct `HashSet`, found struct `std::collections::hash_set::Union`
   |
   = note: expected struct `HashSet<_>`
              found struct `std::collections::hash_set::Union<'_, _, RandomState>`

I need a & on the return value of get_keys_from_line(), and it appears I didn’t read the documentation of union() well enough because it returns an iterator, not another hash set, so I need to add collect() after it. I’m momentarily confused since I thought collect() created a vector from an iterator, not a hash set! But I see that the examples in the documentation for union() are doing it like that, so I assume it must be OK. I’m also wondering if there isn’t a more idiomatic way to do what I’m trying to do (I’m thinking of Python’s dict.update()) but I decide to leave it for now.

Now I get this error:

error[E0277]: a value of type `HashSet<String>` cannot be built from an iterator over elements of type `&String`
  --> src/main.rs:15:61
   |
15 |         current = current.union(&get_keys_from_line(&line)).collect();
   |                                                             ^^^^^^^ value of type `HashSet<String>` cannot be built from `std::iter::Iterator<Item=&String>`
   |
   = help: the trait `FromIterator<&String>` is not implemented for `HashSet<String>`

At least I think I know enough to understand this error message now; the union() iterator is giving me the &String values that are owned by the original two hash sets (current and the temporary object returned from get_keys_from_line()). I don’t own them, so I can’t insert them into a hash set of values that I’m supposed to own.

Hmm, this is not what I wanted at all. What I actually wanted was an update() method that would move the elements from the second set into the first set. I start to think this might have been easier after all if I’d used 8 bits to store whether the keys were present… 😝

I google “rust hashset update” and land on this encouragingly titled Stack Overflow post: “How can I insert all values of one HashSet into another HashSet?” It comes as a surprise to me that there is an extend() method for this! I guess as the commenter says under that post,

Plus it teaches me that I should not only look at the ‘methods’ section of the doc to find methods, but also into the traits a struct implements.

I guess I’ve learned that too now! Wow, that really is confusing, that a note down at the bottom of the API documentation page saying only impl<T, S> Extend<T> for HashSet<T, S> is actually telling you that there is an extend() method.

Regardless, I change the line to read current.extend(&get_keys_from_line(&line)); and I don’t quite get what I want either:

error[E0716]: temporary value dropped while borrowed
  --> src/main.rs:15:25
   |
12 |             passports.push(current);
   |                            ------- borrow later used here
...
15 |         current.extend(&get_keys_from_line(&line));
   |                         ^^^^^^^^^^^^^^^^^^^^^^^^^ - temporary value is freed at the end of this statement
   |                         |
   |                         creates a temporary which is freed while still in use
   |
   = note: consider using a `let` binding to create a longer lived value

This is another head-scratcher. I don’t see why it should matter that the temporary value is freed, when I thought I was moving all the elements out of it and into current! From reading about ownership yesterday, I thought I understood that ownership of values is always moved into function calls, unless they are borrowed with the & operator.

But while I was browsing the HashSet documentation I do remember coming across the drain() method and wondering if that would be useful. Maybe the problem is that the ownership of the values isn’t getting transferred, and I need to drain() them out of the temporary object so that current owns them. I change the line to current.extend(get_keys_from_line(&line).drain()); and it works!

So, I file away the knowledge that the operation I think of as a.update(b) is written as a.extend(b.drain()) in Rust. I wonder if that is actually the idiomatic way to do it?

Now that I have the data in the form that I wanted, I can write the code to get the answer:

let count = passports.iter().filter(passport_is_valid).count();
println!("{}", count);

// [...]

fn passport_is_valid(passport: &HashSet<String>) -> bool {
    let n_keys = passport.len();
    n_keys == 8 || (n_keys == 7 && !passport.contains("cid"))
}

But this doesn’t work:

error[E0631]: type mismatch in function arguments
  --> src/main.rs:21:45
   |
21 |         let count = passports.iter().filter(passport_is_valid).count();
   |                                             ^^^^^^^^^^^^^^^^^ expected signature of `for<'r> fn(&'r &HashSet<String>) -> _`
...
26 | fn passport_is_valid(passport: &HashSet<String>) -> bool {
   | -------------------------------------------------------- found signature of `for<'r> fn(&'r HashSet<String>) -> _`

error[E0599]: no method named `count` found for struct `Filter<std::slice::Iter<'_, HashSet<String>>, for<'r> fn(&'r HashSet<String>) -> bool {passport_is_valid}>` in the current scope
    --> src/main.rs:21:64
     |
21   |           let count = passports.iter().filter(passport_is_valid).count();
     |                                                                  ^^^^^ method not found in `Filter<std::slice::Iter<'_, HashSet<String>>, for<'r> fn(&'r HashSet<String>) -> bool {passport_is_valid}>`
     |
     = note: the method `count` exists but the following trait bounds were not satisfied:
             `<for<'r> fn(&'r HashSet<String>) -> bool {passport_is_valid} as FnOnce<(&&HashSet<String>,)>>::Output = bool`
             which is required by `Filter<std::slice::Iter<'_, HashSet<String>>, for<'r> fn(&'r HashSet<String>) -> bool {passport_is_valid}>: Iterator`
             `for<'r> fn(&'r HashSet<String>) -> bool {passport_is_valid}: FnMut<(&&HashSet<String>,)>`
             which is required by `Filter<std::slice::Iter<'_, HashSet<String>>, for<'r> fn(&'r HashSet<String>) -> bool {passport_is_valid}>: Iterator`
             `Filter<std::slice::Iter<'_, HashSet<String>>, for<'r> fn(&'r HashSet<String>) -> bool {passport_is_valid}>: Iterator`
             which is required by `&mut Filter<std::slice::Iter<'_, HashSet<String>>, for<'r> fn(&'r HashSet<String>) -> bool {passport_is_valid}>: Iterator`

Well, I saw an error just like this yesterday, and it was a missing & operator on the type of the function parameter. But I already have a & there! I try putting a second one (since the error message also has two of them) and sure enough, it works. I get an answer, but it’s the wrong answer.

According to the Advent of Code website, my answer is too low. I look over my code again and I see that I forgot to add the last passport to my vector of passports! Maybe that’s the problem? I add passports.push(current); below the loop, run the program again, and yes, I get an answer that’s one more than the previous answer. This time, it’s correct according to the website.

Here’s the full code (I’ll start leaving out the definitions of is_part2() and read_lines() each time unless they change, though):

use std::collections::HashSet;

fn main() {
    let mut passports = vec![];
    let mut current = HashSet::new();
    for line in read_lines("input").map(|s| s.expect("Bad line in file")) {
        if line == "" {
            passports.push(current);
            current = HashSet::new();
        }
        current.extend(get_keys_from_line(&line).drain());
    }
    passports.push(current);

    if is_part2() {
        println!("part 2");
    } else {
        let count = passports.iter().filter(passport_is_valid).count();
        println!("{}", count);
    }
}

fn passport_is_valid(passport: &&HashSet<String>) -> bool {
    let n_keys = passport.len();
    n_keys == 8 || (n_keys == 7 && !passport.contains("cid"))
}

fn get_keys_from_line(line: &str) -> HashSet<String> {
    let mut new_keys = HashSet::new();
    for pair in line.split_whitespace() {
        new_keys.insert(String::from(&pair[..3]));
    }
    new_keys
}

Day 4, Part 2

Well, as I suspected, the second part of the puzzle does require looking at the values. So I will start by refactoring the hash set into a hash map, that saves the values as well as the keys.

This refactor is pretty straightforward, I look up the documentation for HashMap to check if the methods are different (contains() has to change to contains_key()) but other than that, I just have to change HashSet to HashMap throughout. (I also originally overlooked that the correct type is HashMap<String, String>, not HashMap<String>, but the compiler helpfully reminded me.)

use std::collections::HashMap;

fn main() {
    let mut passports = vec![];
    let mut current = HashMap::new();
    for line in read_lines("input").map(|s| s.expect("Bad line in file")) {
        if line == "" {
            passports.push(current);
            current = HashMap::new();
        }
        current.extend(get_pairs_from_line(&line).drain());
    }
    passports.push(current);

    if is_part2() {
        println!("part 2");
    } else {
        let count = passports.iter().filter(passport_is_valid).count();
        println!("{}", count);
    }
}

fn passport_is_valid(passport: &&HashMap<String, String>) -> bool {
    let n_keys = passport.len();
    n_keys == 8 || (n_keys == 7 && !passport.contains_key("cid"))
}

fn get_pairs_from_line(line: &str) -> HashMap<String, String> {
    let mut new_pairs = HashMap::new();
    for pair in line.split_whitespace() {
        new_pairs.insert(String::from(&pair[..3]), String::from(&pair[4..]));
    }
    new_pairs
}

Now I can start changing passport_is_valid() to implement the rules for each field. I’ll make it look something like this:

    (n_keys == 8 || (n_keys == 7 && !passport.contains_key("cid")))
        && valid_birth_year(&passport["byr"])
        && valid_issue_year(&passport["iyr"])
        && valid_expiry_year(&passport["eyr"])
        && valid_height(&passport["hgt"])
        && valid_hair_color(&passport["hcl"])
        && valid_eye_color(&passport["ecl"])
        && valid_passport_id(&passport["pid"])

Then I can write a function for each validation criterion. To start with I write all these functions but make them only contain true, so I can write them one by one but still run the program. (This went pretty smoothly although the compiler did have to remind me to add the & operators before passing each passport data field into each function.)

I wonder for a bit whether I should use scan_fmt! like I did on Day 2, or regular expressions, or parse(). I decide that maybe it’s time for me to use some regular expressions in Rust. It’s new for me, would be useful to learn, and these regular expressions will be simple. But for the fields where I need to check that the numbers are between certain ranges, I’ll also need to use parse().

Let’s start at the top. Birth year must be a 4-digit number between 1920 and 2002 inclusive. (What happens to the 17-year-old or 101-year-old travelers?) I google “rust regex” and find out that you need to install a package for regular expressions, so I add regex = "1" to Cargo.toml and use regex::Regex; to my program. I start reading the documentation for the regex package.

I already know regular expression syntax, which will be very helpful for completing this task. The one thing that I should take note of is that patterns match anywhere in the string, so I need to use ^ and $ to anchor them if I only want them to match the whole string. (That is, matching is like re.search() and not re.match() in Python.)

I come up with this as a first attempt:

fn valid_birth_year(byr: &str) -> bool {
    let re = Regex::new(r"^[12]\d{3}$").unwrap();
    if !re.is_match(byr) {
        return false;
    }
    let year = byr.parse::<u16>();
    year >= 1920 && year <= 2002
}

It seems the only thing I’ve forgotten is to unwrap() the result of parse(), which I had learned on Day 1 but forgot about. So I add that. Then, since issue year and expiry year are validated in a very similar way, just with different maximum and minimum years, I extract that into a valid_year() function:

fn valid_birth_year(byr: &str) -> bool {
    valid_year(byr, 1920, 2002)
}

fn valid_issue_year(iyr: &str) -> bool {
    valid_year(iyr, 2010, 2020)
}

fn valid_expiry_year(eyr: &str) -> bool {
    valid_year(eyr, 2020, 2030)
}

fn valid_year(y: &str, min: u16, max: u16) -> bool {
    let re = Regex::new(r"^[12]\d{3}$").unwrap();
    if !re.is_match(y) {
        return false;
    }
    let year = y.parse::<u16>().unwrap();
    year >= min && year <= max
}

(I initially forgot the -> bool on the new function. I’m so used to either not having a return type on functions (Python, JavaScript) or having the return type come before the name (C, C++) that I forget this a lot!)

Next I write a first attempt at valid_height(), using the match statement that I learned earlier today, and browsing the regex documentation to find that I have to use captures() to get the values of the regular expression’s capture groups:

fn valid_height(hgt: &str) -> bool {
    let re = Regex::new(r"^(\d{2,3})(cm|in)$").unwrap();
    let groups = re.captures(hgt);
    let height = groups[1].parse::<u8>().unwrap();
    match &groups[2] {
        "cm" => height >= 150 && height <= 193,
        "in" => height >= 59 && height <= 76,
    }
}

The compiler tells me that I forgot to unwrap() the result from captures(), and after I do that, it tells me that my match pattern is “non-exhaustive” — I guess this means that I don’t provide any instructions for what to do if the unit is neither cm nor in. I remember from earlier that a default case is written as _ =>, so I add _ => false to the match statement.

Now it runs! But it panics at unwrapping the return value of captures(). I look at the captures() documentation again, and it says that it will return None if the regex doesn’t match. So I decide to do some quick’n’dirty println!("{}", hgt) debugging to see what the string is that doesn’t match. It’s 97, without a unit after it, so indeed it doesn’t match. A string that doesn’t match should make the function return false, not panic.

I’m sure there’s a nice idiom for “return if None, unwrap otherwise” that I’m not yet aware of. I google “rust option return if none” and I learn a bit more about the ? operator, but since I’m not returning a Result from this function that doesn’t help me. However, another thing I learn is that the body of a match statement doesn’t actually have to be the same type as the expression! You can put return false in there and it will return from the function. How nice! So, the captures() call becomes:

let groups = match re.captures(hgt) {
    Some(groups) => groups,
    None => return false,
};

Next comes hair color. This is very similar to the year validation that I’ve done already, only without having to parse the string:

fn valid_hair_color(hcl: &str) -> bool {
    let re = Regex::new(r"^#[0-9a-f]{6}$").unwrap();
    re.is_match(hcl)
}

Eye color doesn’t actually need to use a regular expression:

fn valid_eye_color(ecl: &str) -> bool {
    ecl == "amb"
        || ecl == "blu"
        || ecl == "brn"
        || ecl == "gry"
        || ecl == "grn"
        || ecl == "hzl"
        || ecl == "oth"
}

This works, but I wonder if I could take this opportunity to apply something that I think I remember, from reading about match earlier:

fn valid_eye_color(ecl: &str) -> bool {
    match ecl {
        "amb" | "blu" | "brn" | "gry" | "grn" | "hzl" | "oth" => true,
        _ => false,
    }
}

This also works, and looks much nicer!

Finally, there is the passport ID, which is a nine-digit number:

fn valid_passport_id(pid: &str) -> bool {
    let re = Regex::new(r"^\d{9}$").unwrap();
    re.is_match(pid)
}

I should now be able to get my answer! Before I check whether it’s correct on the Advent of Code website, I would like to make one improvement that was mentioned in the regex documentation. Running the program takes an almost noticeably long time, and it’s probably because I’m rebuilding the same Regex objects once for each passport in the file, several hundred times. The documentation mentions you can use the lazy_static package to build them only once, and so I add that to Cargo.toml and follow the example in the regex documentation.

According to the example, I need to add this to the top of the file:

#[macro_use]
extern crate lazy_static;

And change this:

let re = Regex::new(r"^[12]\d{3}$").unwrap();

to this:

lazy_static! {
    static ref YEAR_REGEX: Regex = Regex::new(r"^[12]\d{3}$").unwrap();
}

I initially forget to add the type : Regex and the compiler’s error message isn’t so helpful:

error: no rules expected the token `=`
  --> src/main.rs:57:23
   |
57 |         static ref re = Regex::new(r"^[12]\d{3}$").unwrap();
   |                       ^ no rules expected this token in macro call

I still don’t quite know how macros work in Rust, or why lazy_static! is a macro and not a function, but my guess is that it’s harder to generate good error messages for macros.

I also try to keep the name re for the regular expressions, but the compiler helpfully tells me that it’s bad style for static variables to have lower case names! So I rename them.

My program seems to run faster now, and still gives me the same answer. So I put that answer into the Advent of Code website, and it’s correct! Hooray!

Here’s the program, quite long this time because of all the validation rules:

use regex::Regex;
use std::collections::HashMap;

#[macro_use]
extern crate lazy_static;

fn main() {
    let mut passports = vec![];
    let mut current = HashMap::new();
    for line in read_lines("input").map(|s| s.expect("Bad line in file")) {
        if line == "" {
            passports.push(current);
            current = HashMap::new();
        }
        current.extend(get_pairs_from_line(&line).drain());
    }
    passports.push(current);

    let count = passports.iter().filter(passport_is_valid).count();
    println!("{}", count);
}

fn passport_is_valid(passport: &&HashMap<String, String>) -> bool {
    let n_keys = passport.len();
    (n_keys == 8 || (n_keys == 7 && !passport.contains_key("cid")))
        && (!is_part2()
            || valid_birth_year(&passport["byr"])
                && valid_issue_year(&passport["iyr"])
                && valid_expiry_year(&passport["eyr"])
                && valid_height(&passport["hgt"])
                && valid_hair_color(&passport["hcl"])
                && valid_eye_color(&passport["ecl"])
                && valid_passport_id(&passport["pid"]))
}

fn valid_birth_year(byr: &str) -> bool {
    valid_year(byr, 1920, 2002)
}

fn valid_issue_year(iyr: &str) -> bool {
    valid_year(iyr, 2010, 2020)
}

fn valid_expiry_year(eyr: &str) -> bool {
    valid_year(eyr, 2020, 2030)
}

fn valid_year(y: &str, min: u16, max: u16) -> bool {
    lazy_static! {
        static ref YEAR_REGEX: Regex = Regex::new(r"^[12]\d{3}$").unwrap();
    }
    if !YEAR_REGEX.is_match(y) {
        return false;
    }
    let year = y.parse::<u16>().unwrap();
    year >= min && year <= max
}

fn valid_height(hgt: &str) -> bool {
    lazy_static! {
        static ref HEIGHT_REGEX: Regex = Regex::new(r"^(\d{2,3})(cm|in)$").unwrap();
    }
    let groups = match HEIGHT_REGEX.captures(hgt) {
        Some(groups) => groups,
        None => return false,
    };
    let height = groups[1].parse::<u8>().unwrap();
    match &groups[2] {
        "cm" => height >= 150 && height <= 193,
        "in" => height >= 59 && height <= 76,
        _ => false,
    }
}

fn valid_hair_color(hcl: &str) -> bool {
    lazy_static! {
        static ref HAIR_REGEX: Regex = Regex::new(r"^#[0-9a-f]{6}$").unwrap();
    }
    HAIR_REGEX.is_match(hcl)
}

fn valid_eye_color(ecl: &str) -> bool {
    match ecl {
        "amb" | "blu" | "brn" | "gry" | "grn" | "hzl" | "oth" => true,
        _ => false,
    }
}

fn valid_passport_id(pid: &str) -> bool {
    lazy_static! {
        static ref ID_REGEX: Regex = Regex::new(r"^\d{9}$").unwrap();
    }
    ID_REGEX.is_match(pid)
}

fn get_pairs_from_line(line: &str) -> HashMap<String, String> {
    let mut new_pairs = HashMap::new();
    for pair in line.split_whitespace() {
        new_pairs.insert(String::from(&pair[..3]), String::from(&pair[4..]));
    }
    new_pairs
}

Afterword

I don’t have that much to reflect on, this time. I did still forget the & operators all over the place, but less often than I did on the previous days, and I had a better understanding of the errors when they occurred.

I’m still a bit mystified about why it’s so difficult to express the type of an iterator in Rust. It seems like it shouldn’t be that hard, so maybe I’m missing something or approaching it the wrong way?

Finally, it seems like I’m reading the API documentation more often, and googling less. I think this is a sign that I have a better idea of where to look and what I’m looking for in the API documentation, but googling is still useful when I need to figure out how to do something that’s not covered by the examples in the API documentation, like a.extend(b.drain()).


[1] I don’t, in fact, like to call it that

Advent of Rust 3: Once in ‘a Lifetime

Well, this is another long stream-of-consciousness chronicle of my attempt to learn how to program in Rust by doing the daily programming puzzles on Advent of Code 2020. It’s long, but on the plus side, this is the first time ever that I’ve published two blog posts in two days, let alone three in three days. And you know what they say, if I’d had more time, I would’ve written you a shorter letter.

I thought a bit about why it should even be interesting or valuable to write about my mistakes and thought process. Or put more bluntly, isn’t this just a waste of time? Who is this useful for?

Well, for one thing, at my job I’ve been working on Temporal, a standards-track proposal to add a modern API for dates and times to the JavaScript language. Earlier this year I conducted a survey of people who had tried out the proposed Temporal API, and one of the purposes was to try to figure out what was easy to understand and what was hard, for someone coming to Temporal with no prior knowledge. Even though I had been in exactly that position myself only a few months before, I had become so accustomed to using Temporal that I could literally remember nothing of my own experience.

It’s sometimes called the curse of knowledge. I’m sure I will look back in a year, or two years, when I have written lots of Rust code, and not remember any of this either, and I’ll be glad that I wrote it down. But maybe it’ll be valuable in the meantime to someone else!

Day 3, Part 1

Today we get a roguelike puzzle! We are sliding on a toboggan down a horizontally repeating grid of open spaces (.) and trees (#) that is defined in our input file. We slide at a certain angle (3 cells across for every 1 cell down), and the answer to Part 1 is how many trees we crash into by the time we get to the bottom.

By this time I’m familiar with how to start — cargo new puzzle3, download the input file and put it in the project directory, and copy the relevant code from yesterday. This time I’m not only copying the read_lines() function, but also the code that determines from the command line argument whether I want to run the code for Part 1 or Part 2.

Here’s what I start out with:

use std::env;
use std::fs;
use std::io::{self, BufRead};
use std::path;

fn main() {
    let args: Vec<String> = env::args().collect();
    let mut part2 = false;
    let default = String::from("1");
    let arg = args.get(1).unwrap_or(&default);
    if arg == "2" {
        part2 = true;
    }
    let lines: Vec<String> = read_lines("input")
        .expect("Bad file")
        .map(|s| s.expect("Bad line in file"))
        .collect();
    println!("grid is {} × {}", lines[0].len(), lines.len());
}

fn read_lines<P>(filename: P) -> io::Result<io::Lines<io::BufReader<fs::File>>>
where
    P: AsRef<path::Path>,
{
    let file = fs::File::open(filename)?;
    Ok(io::BufReader::new(file).lines())
}

I still made a few minor mistakes when adapting this code from yesterday’s code: I forgot to add the Vec<String> type to lines, I forgot to handle errors with .map(|s| s.expect("Bad line in file")), and I guessed wrong that the length of a vector was length() and not len(). Happily, I am getting accustomed enough to this, that I’m able to correct those mistakes fairly quickly.

I will once again use my approach from the last two days, where I first try to get the data into the right data structure. Now that I’m no longer struggling with knowing what data structures even exist in Rust, and how do I even write Rust code, I can afford to think about this before I start writing any code.

It seems like the appropriate data structure is a two-dimensional array of booleans – true if there is a tree in the cell, false if the cell is an open space. I don’t know if Rust has two-dimensional arrays or vectors, so I first start by googling “rust two dimensional array”. I find several results (1, 2, 3, and the array2d package) but none of them really stand out as saying “this is the way to do it.”

What I do understand so far is that I’d use an array if the size was known at compile time, and a Vec of Vecs if I wanted to have variable-length rows or columns. The array2d package from my search results does look interesting because it seems like it gives you a rectangular array, where the length of each row and column is the same for the whole array, but doesn’t necessarily have to be known at compile time. In theory I do know the width and length at compile time, because I can look in the file and count the number of lines and the length of each line! I have a feeling that that would make my code too tied-up with this particular data file, though, and might mean that I would have to refactor the whole thing for Part 2, so I’d prefer not to take that approach.

(I think briefly about using a one-dimensional array and checking through it with a stride of width + 3, but because the grid repeats in the horizontal direction, I think that wouldn’t work.)

Taking the two-dimensional array approach would mean that unlike yesterday, I would not use iterators very much. Hmm, I feel like I’ve just gotten comfortable with iterators in Rust, I wonder what a solution would look like if it were based on iterators?

Maybe that would actually be easier! Thinking about it some more, I don’t actually need to store the entire grid in a 2-D array. I just need to read in each row, figure out my horizontal position on that row, and store a boolean indicating whether I have hit a tree on that row. I believe it can be done just by operating on the string representing each row, with a map(), filter(), and count().

Let’s see if my guess about how to do it is correct. Here’s the first version that I come up with:

fn main() {
    // [...args...]
    let trees_hit = read_lines("input")
        .expect("Bad file")
        .enumerate()
        .filter(hits_tree)
        .count();
    println!("{}", trees_hit);
}

fn hits_tree(row_index: usize, row: String) -> bool {
    let col_index = row_index * 3 % row.len();
    row.as_bytes()[col_index] == '#'
}

Here, I’m assuming that it’s OK to index the string as bytes, because the only characters allowed are # and .. (I learned from yesterday’s exercise what the difference was between indexing a string by byte and iterating through it by character.) I’m taking a guess that '#' (with single quotes instead of double quotes) will give me a single byte rather than a string, like it does in C.

The compiler gives me quite a lot of errors, but I think I know what most of them mean:

error[E0593]: function is expected to take 1 argument, but it takes 2 arguments
  --> src/main.rs:17:17
   |
17 |         .filter(hits_tree)
   |                 ^^^^^^^^^ expected function that takes 1 argument
...
22 | fn hits_tree(row_index: usize, row: String) -> bool {
   | --------------------------------------------------- takes 2 arguments

error[E0599]: no method named `count` found for struct `Filter<Enumerate<std::io::Lines<BufReader<File>>>, fn(usize, String) -> bool {hits_tree}>` in the current scope
    --> src/main.rs:18:10
     |
18   |           .count();
     |            ^^^^^ method not found in `Filter<Enumerate<std::io::Lines<BufReader<File>>>, fn(usize, String) -> bool {hits_tree}>`
     |
     = note: the method `count` exists but the following trait bounds were not satisfied:
             `<fn(usize, String) -> bool {hits_tree} as FnOnce<(&(usize, std::result::Result<String, std::io::Error>),)>>::Output = bool`
             which is required by `Filter<Enumerate<std::io::Lines<BufReader<File>>>, fn(usize, String) -> bool {hits_tree}>: Iterator`
             `fn(usize, String) -> bool {hits_tree}: FnMut<(&(usize, std::result::Result<String, std::io::Error>),)>`
             which is required by `Filter<Enumerate<std::io::Lines<BufReader<File>>>, fn(usize, String) -> bool {hits_tree}>: Iterator`
             `Filter<Enumerate<std::io::Lines<BufReader<File>>>, fn(usize, String) -> bool {hits_tree}>: Iterator`
             which is required by `&mut Filter<Enumerate<std::io::Lines<BufReader<File>>>, fn(usize, String) -> bool {hits_tree}>: Iterator`

error[E0308]: mismatched types
  --> src/main.rs:24:34
   |
24 |     row.as_bytes()[col_index] == '#'
   |                                  ^^^ expected `u8`, found `char`

The first error is probably because I need to make hits_tree() take a tuple of (row_index, row) as its one argument, instead of two arguments. The second error (reminiscent of those long template errors that you get from C++ compilers) I don’t understand, but it seems like it might be a consequence of the first error, and might go away if I solve that one? The third error shows that my guess that '#' is a byte, is wrong. I was close, it is a char, but a byte is type u8.

The third error seems like the easiest to solve, so first I google “rust byte literal” and land here, which makes me think that b'#' might be correct — and it is!

Next I need to make hits_tree() take a tuple as an argument. I wonder if I can destructure this tuple as I would be able to in JavaScript: function hits_tree([row_index, row]) { ... } I google “rust tuple function argument”, and I get this Stack Overflow answer which makes me think that I should be able to do (row_index, row): (usize, String).

But before I run that, I would like to try to get the & operator right this time. After yesterday, when every time I compiled my code, the compiler told me to add a &, I did spend some time reading the “Understanding Ownership” page. Now I feel like I should be better equipped to avoid these mistakes, or at least understand them when I do make them.

So for the first time in this series, I actually go to the API documentation for a function intentionally, instead of googling it and landing there! First I check enumerate() to check that usize is really the correct type for the index (it is,) then I check lines() to check that String is really the correct type for the row (it isn’t.) The type is actually io::Result<String> which reminds me that I need to expect() the value, so I add .map(|s| s.expect("Bad line in file")) before the call to enumerate().

Then there’s the question of whether the row parameter should be a reference or not. I remember reading this:

A more experienced Rustacean would write [s: &str] instead [of s: &String] because it allows us to use the same function on both &String values and &str values.

But on the other hand, I’m not sure that this parameter is actually a reference! It seems to me that since we get a io::Result<String> from read_lines(), the ownership of the string is passed from function to function in the iterator chain, so I would guess (but with only a little more than 50% confidence)1 that String is correct and &str is not. So I make the change of (row_index, row): (usize, String) and run it.

Unfortunately, sitting down and thinking about it is rewarded with an error that I really don’t understand:

error[E0631]: type mismatch in function arguments
  --> src/main.rs:18:17
   |
18 |         .filter(hits_tree)
   |                 ^^^^^^^^^ expected signature of `for<'r> fn(&'r (usize, String)) -> _`
...
23 | fn hits_tree((row_index, row): (usize, String)) -> bool {
   | ------------------------------------------------------- found signature of `fn((usize, String)) -> _`

I know from the reading I’ve done that 'r is a “lifetime”, but the page that I read told me that lifetimes were an advanced feature that would be explained later in the Rust book. Oops, I guess I should have read that chapter too!

I wonder if I can just fake it ’til I make it, by adding for<'r> and &'r to the function signature as the compiler is suggesting. I try that, but it’s a syntax error, “expected item, found keyword for“. I try removing the for<'r>, and I get another error about “unexpected lifetime” which makes me think that I should put the &'r on (usize, String) instead of (row_index, row). (That makes sense, somewhat, because it’s part of the type, not part of the destructured function arguments.) So I try that, and I get another error, but this one tells me exactly what to do:

error[E0261]: use of undeclared lifetime name `'r`
  --> src/main.rs:23:33
   |
23 | fn hits_tree((row_index, row): &'r (usize, String)) -> bool {
   |             -                   ^^ undeclared lifetime
   |             |
   |             help: consider introducing lifetime `'r` here: `<'r>`

Sure enough, when I add <'r>, it works (with some warnings about not using the part2 variable, which I’ll use soon enough) and I get a number out! I put the number into the Advent of Code website and it’s correct!

The lifetime thing is fairly unsatisfying: I have no idea what it does, why it’s necessary, or why it’s even called r!

I’ll try one more thing. Maybe I was wrong about the row needing to be a String. For example, I look back at yesterday’s code and see fn parse_line(line: &str), so maybe in today’s code I also need &str. So I try changing the function signature to fn hits_tree((row_index, row): (usize, &str)) -> bool, without any lifetimes. But I get the old errors back. I do notice one interesting thing in the error message this time:

error[E0631]: type mismatch in function arguments
  --> src/main.rs:18:17
   |
18 |         .filter(hits_tree)
   |                 ^^^^^^^^^ expected signature of `for<'r> fn(&'r (usize, String)) -> _`
...
23 | fn hits_tree((row_index, row): (usize, &str)) -> bool {
   | ----------------------------------------------------- found signature of `for<'r> fn((usize, &'r str)) -> _`

Now, for some reason, it thinks that my function signature already has a lifetime in it, even though I didn’t put one in! I wonder if the lifetime is implicitly when you include a reference. But it definitely wants a String and not a str. So I try changing the type of the argument to (usize, &String) without the lifetime. That doesn’t work either, but interestingly it does continue to think that the type of my function includes a lifetime. It’s telling me that the reference needs to be on the tuple, not on the string, so I try &(usize, String), once more without a lifetime, and that works.

So the message about having to add a lifetime was strange and confusing, but it turns out that I didn’t need to have it after all. Unfortunately I had to discover this by putting & operators in random places until my code compiled, which I had intended to stop doing today ☹️

On the plus side, I got the correct answer, so that’s the end of Part 1! Here’s the full code:

use std::env;
use std::fs;
use std::io::{self, BufRead};
use std::path;

fn main() {
    let args: Vec<String> = env::args().collect();
    let mut part2 = false;
    let default = String::from("1");
    let arg = args.get(1).unwrap_or(&default);
    if arg == "2" {
        part2 = true;
    }
    let trees_hit = read_lines("input")
        .expect("Bad file")
        .map(|s| s.expect("Bad line in file"))
        .enumerate()
        .filter(hits_tree)
        .count();
    println!("{}", trees_hit);
}

fn hits_tree((row_index, row): &(usize, String)) -> bool {
    let col_index = row_index * 3 % row.len();
    row.as_bytes()[col_index] == b'#'
}

fn read_lines<P>(filename: P) -> io::Result<io::Lines<io::BufReader<fs::File>>>
where
    P: AsRef<path::Path>,
{
    let file = fs::File::open(filename)?;
    Ok(io::BufReader::new(file).lines())
}

Day 3, Part 2

My guess was that Part 2 of the puzzle might be to do the same thing but with the toboggan at a sharper angle such that you would have to skip rows entirely. (For example, 1 right for every 3 down.) I was both right and wrong. It’s actually that you have to check a list of five angles including a sharper one that makes you skip rows, and the answer to the puzzle is the results from those angles all multiplied together. So, my approach of using iterators for the whole thing should actually be refactored, because I don’t want to read the file five times!

I briefly think about using array2d to store the map of the trees after all, but I decide that Vec<String> would be more convenient since I can get it from my existing iterator pipeline by capping it after the expect() call with collect(), and I expect my existing code will continue to work on that with few modifications.

So before I start on Part 2 I decide to refactor it this way:

let landscape: Vec<String> = read_lines("input")
    .expect("Bad file")
    .map(|s| s.expect("Bad line in file"))
    .collect();

let trees_hit = landscape.iter().enumerate().filter(hits_tree).count();

But oh no…

error[E0631]: type mismatch in function arguments
  --> src/main.rs:19:57
   |
19 |     let trees_hit = landscape.iter().enumerate().filter(hits_tree).count();
   |                                                         ^^^^^^^^^ expected signature of `for<'r> fn(&'r (usize, &String)) -> _`
...
23 | fn hits_tree((row_index, row): &(usize, String)) -> bool {
   | -------------------------------------------------------- found signature of `for<'r> fn(&'r (usize, String)) -> _`

OK, I’m back to adding & operators in random places! At least as far as I can tell from the error message, I will need to keep the one on the tuple, and add a second one to &String. And indeed that works.

Next I want to refactor the second part of that into a function count_trees_hit(landscape, right, down) so I can call it several times on the list of angles expressed as (right, down) pairs. I’ll start by omitting the down parameter and just making sure that the existing code works:

fn main() {
    // ...
    let trees_hit = count_trees_hit(&landscape, 3);
    println!("{}", trees_hit);
}

fn count_trees_hit(landscape: &Vec<String>, right: usize) -> usize {
    landscape
        .iter()
        .enumerate()
        .filter(|(row_index, row): &(usize, &String)| hits_tree(row_index, row, right))
        .count()
}

fn hits_tree(row_index: usize, row: &String, right: usize) -> bool {
    let col_index = row_index * right % row.len();
    row.as_bytes()[col_index] == b'#'
}

It looks like this almost works, but I get one error:

error[E0308]: mismatched types
  --> src/main.rs:27:65
   |
27 |         .filter(|(row_index, row): &(usize, &String)| hits_tree(row_index, row, right))
   |                                                                 ^^^^^^^^^
   |                                                                 |
   |                                                                 expected `usize`, found `&usize`
   |                                                                 help: consider dereferencing the borrow: `*row_index`

Since usize is an integer type I’m not sure why it has a reference, but I try changing the type of the row_index parameter to &usize and it works. It also works if I do what the compiler says and add *, but that’s an operator that I haven’t learned about (the page I read mentioned that it existed, but I don’t have a good idea of how it works.)

Next I want to add a down parameter. I think if we move multiple rows downwards for every column that we move to the right, then we can simply skip those rows. So it seems to me that I should be able to add another filter() step that skips the row if its index modulo down is not zero.

This new step doesn’t even need access to the row data so I wonder if I can simplify my code by ignoring the string parameter. I google “rust ignore parameter” and land on “Pattern Syntax” which I skim, and it looks like I should be able to use either .. or _ to ignore that parameter. I’ll try both and see what the compiler says. After a couple of tries, what works is: .filter(|(row_index, ..): &(usize, _)| *row_index % down == 0).

Now I have this code:

    let trees_hit = count_trees_hit(&landscape, 1, 2);
    println!("{}", trees_hit);
}

fn count_trees_hit(landscape: &Vec<String>, right: usize, down: usize) -> usize {
    landscape
        .iter()
        .enumerate()
        .filter(|(row_index, ..): &(usize, _)| *row_index % down == 0)
        .filter(|(row_index, row): &(usize, &String)| hits_tree(*row_index, row, right))
        .count()
}

fn hits_tree(row_index: usize, row: &String, right: usize) -> bool {
    let col_index = row_index * right % row.len();
    println!("{}, {}", col_index, row_index);
    row.as_bytes()[col_index] == b'#'
}

I am a bit suspicious that I might be getting the wrong row_index in the second filter() step, because I am filtering out the already-enumerated rows. So for testing, I try it with right 1 and down 2, and add the println!() statement in hits_tree() to print out the row index and column index.

It turns out my suspicion was correct. The indices that are printed out, are 0, 0, 2, 2, 4, 4, 6, 6 etc., while they should be 0, 0, 1, 2, 2, 4, 3, 6. I think what I need to do to fix this, is un-enumerate the items after the first filter() step and then re-enumerate them. I could write the un-enumerate step with map, but I have a feeling that there might be a built-in iterator method to take only the second element of a tuple, so I look at the API documentation for iterators again. I don’t see anything likely there. I look at the API documentation for Itertools as well, but don’t see anything there either. I try googling a few things (such as “rust iterator drop tuple element”) but that doesn’t give me any likely-looking results, so I decide to just write it myself. I insert .map(unenumerate).enumerate() into the iterator pipeline, after the first filter(), and define my unenumerate() function as follows:

fn unenumerate((.., row): &(_, &String)) -> &String {
    row
}

I get two errors that I haven’t seen before:

error[E0106]: missing lifetime specifier
  --> src/main.rs:34:45
   |
34 | fn unenumerate((.., row): &(_, &String)) -> &String {
   |                           -------------     ^ expected named lifetime parameter
   |
   = help: this function's return type contains a borrowed value, but the signature does not say which one of argument 1's 2 lifetimes it is borrowed from
help: consider introducing a named lifetime parameter
   |
34 | fn unenumerate<'a>((.., row): &'a (_, &String)) -> &'a String {
   |               ^^^^            ^^^^^^^^^^^^^^^^     ^^^

error[E0121]: the type placeholder `_` is not allowed within types on item signatures
  --> src/main.rs:34:29
   |
34 | fn unenumerate((.., row): &(_, &String)) -> &String {
   |                             ^ not allowed in type signatures
   |
help: use type parameters instead
   |
34 | fn unenumerate<T>((.., row): &(T, &String)) -> &String {
   |               ^^^              ^

From what I can understand from these errors, first it’s asking me to add a lifetime parameter, and second it seems like the .. trick is not allowed in named functions, only in inline functions?

The suggestion to use type parameters makes me think of how this might work in C++, so I try the following instead:

fn unenumerate<First, Second>((.., second): &(First, Second)) -> Second {
    second
}

Here, confusingly, the compiler does not want the & on the tuple type, so after a few tries of adding and removing & operators I do get this to compile and produce the debug output that I was expecting.

A bit too late, I realize that I could have done this much quicker with only one filter() step, skipping the rows by returning false in hits_tree(). Although what I have already works, this seems like it’s potentially a big enough improvement that I should try it:

fn count_trees_hit(landscape: &Vec<String>, right: usize, down: usize) -> usize {
    landscape
        .iter()
        .enumerate()
        .filter(|(row_index, row): &(usize, &String)| hits_tree(*row_index, row, right, down))
        .count()
}

fn hits_tree(row_index: usize, row: &String, right: usize, down: usize) -> bool {
    if row_index % down != 0 {
        return false;
    }
    let col_index = row_index / down * right % row.len();
    println!("{}, {}", col_index, row_index);
    row.as_bytes()[col_index] == b'#'
}

Apart from forgetting the braces on the if statement (apparently, you cannot omit them in Rust for a single-statement block as you can in C-like languages) I get the same result as my previous code that used unenumerate(), so I feel good about this.

It’s time to remove the debug println!() and actually write the Part 2 code, that tries several different angles and multiplies the answers together:

if part2 {
    let total = count_trees_hit(&landscape, 1, 1)
        * count_trees_hit(&landscape, 3, 1)
        * count_trees_hit(&landscape, 5, 1)
        * count_trees_hit(&landscape, 7, 1)
        * count_trees_hit(&landscape, 1, 2);
    println!("{}", total);
} else {
    let trees_hit = count_trees_hit(&landscape, 3, 1);
    println!("{}", trees_hit);
}

With this, cargo run 2 gives me an answer, which I put into the Advent of Code website, and it’s correct. And as a nice note to end on, this last change marks the first time in this series that I actually write some Rust code that compiles right away!

The full code, also pushed to GitHub:

use std::env;
use std::fs;
use std::io::{self, BufRead};
use std::path;

fn main() {
    let args: Vec<String> = env::args().collect();
    let mut part2 = false;
    let default = String::from("1");
    let arg = args.get(1).unwrap_or(&default);
    if arg == "2" {
        part2 = true;
    }
    let landscape: Vec<String> = read_lines("input")
        .expect("Bad file")
        .map(|s| s.expect("Bad line in file"))
        .collect();

    if part2 {
        let total = count_trees_hit(&landscape, 1, 1)
            * count_trees_hit(&landscape, 3, 1)
            * count_trees_hit(&landscape, 5, 1)
            * count_trees_hit(&landscape, 7, 1)
            * count_trees_hit(&landscape, 1, 2);
        println!("{}", total);
    } else {
        let trees_hit = count_trees_hit(&landscape, 3, 1);
        println!("{}", trees_hit);
    }
}

fn count_trees_hit(landscape: &Vec<String>, right: usize, down: usize) -> usize {
    landscape
        .iter()
        .enumerate()
        .filter(|(row_index, row): &(usize, &String)| hits_tree(*row_index, row, right, down))
        .count()
}

fn hits_tree(row_index: usize, row: &String, right: usize, down: usize) -> bool {
    if row_index % down != 0 {
        return false;
    }
    let col_index = row_index / down * right % row.len();
    row.as_bytes()[col_index] == b'#'
}

fn read_lines<P>(filename: P) -> io::Result<io::Lines<io::BufReader<fs::File>>>
where
    P: AsRef<path::Path>,
{
    let file = fs::File::open(filename)?;
    Ok(io::BufReader::new(file).lines())
}

Afterword

I can tell that I’m starting to get more comfortable in Rust, which feels good. But I’m a bit disappointed that even though I did some extra studying about ownership and the & operator, I was still confused, still couldn’t understand some errors, and basically had to either do exactly what the compiler said without understanding why, or else add or delete & operators arbitrarily until the code compiled.

Maybe what I should do next is read the page in the Rust book on lifetimes, and read about the * operator. Even though the page on ownership said those were advanced topics, both of those came up today in error messages that I didn’t understand.


[1] It occurs to me that — at least for this series — I’m glad I haven’t set up my editor to show Rust compiler errors inline, because otherwise I wouldn’t consciously think about these things