About Philip Chimento

Expect less than the unexpected.

December of Rust Project, Part 2: The Assembler Macro

Posted on May 19, 2023 by Philip Chimento

Welcome back! This is the second post of a series which turned out to be more occasional than I thought it would be. You might remember that I originally called it December of Rust 2021. Look how that worked out! Not only is it not December 2021 anymore, but also it is not December 2022 anymore.

In the previous installment, I wrote about writing a virtual machine for the LC-3 architecture in Rust, that was inspired by Andrei Ciobanu’s blog post Writing a simple 16-bit VM in less than 125 lines of C. The result was a small Rust VM with a main loop based on the bitmatch crate.

By the end of the previous post, the VM seemed to be working well. I had tested it with two example programs from Andrei Ciobanu’s blog post, and I wanted to write more programs to see if they would work as well. Unfortunately, the two test programs were tedious to create; as you might remember, I had to write a program that listed the hex code of each instruction and wrote the machine code to a file. And that was even considering that Andrei had already done the most tedious part, of assembling the instructions by hand into hex codes! Not to mention that when I wanted to add one instruction to one of the test programs, I had to adjust the offset in a neighbouring LEA instruction, which is a nightmare for debugging. This was just not going to be good enough to write any more complicated programs, because (1) I don’t have time for that, (2) I don’t have time for that, and (3) computers are better at this sort of work anyway.

Close-up photo of a vintage-looking Commodore computer — The real question is, can the LC-3 running on modern hardware emulate this badboy in all its Eurostile Bold Extended glory?? (Image by Viktorya Sergeeva, Pexels)

In this post, I will tell the story of how I wrote an assembler for the LC-3. The twist is, instead of making a standalone program that processes a text file of LC-3 assembly language into machine code¹, I wanted to write it as a Rust macro, so that I could write code like this, and end up with an array of u16 machine code instructions:

asm! {
            .ORIG x3000
            LEA R0, hello  ; load address of string
            PUTS           ; output string to console
            HALT
    hello:  .STRINGZ "Hello World!\n"
            .END
}

(This example is taken from the LC-3 specification, and I’ll be using it throughout the post as a sample program.)

The sample already illustrated some features I wanted to have in the assembly language: instructions, of course; the assembler directives like .ORIG and .STRINGZ that don’t translate one-to-one to instructions; and most importantly, labels, so that I don’t have to compute offsets by hand.

Learning about Rust macros

This could be a foolish plan. I’ve written countless programs over the years that process text files, but I have never once written a Rust macro and have no idea how they work. But I have heard they are powerful and can be used to create domain-specific languages. That sounds like it could fit this purpose.

Armed with delusional it’s-all-gonna-work-out overconfidence, I searched the web for “rust macros tutorial” and landed on The Little Book of Rust Macros originally by Daniel Keep and updated by Lukas Wirth. After browsing this I understood a few facts about Rust macros:

They consist of rules, which match many types of single or repeated Rust tokens against patterns.² So, I should be able to define rules that match the tokens that form the LC-3 assembly language.

They can pick out their inputs from among any other tokens. You provide these other tokens in the input matching rule, so you could do for example:

macro_rules! longhand_add {
    ($a:literal plus $b:literal) => { $a + $b };
}

let two = longhand_add!{ 1 plus 1 };

This is apparently how you can create domain-specific languages with Rust macros, because the tokens you match don’t have to be legal Rust code; they just have to be legal tokens. In other words, plus is fine and doesn’t have to be the name of anything in the program; but foo[% is not.

They substitute their inputs into the Rust code that is in the body of the rule. So really, in the end, macros are a way of writing repetitive code without repeating yourself.

A tangent about C macros

This last fact actually corresponds with one of the uses of C macros. C macros are useful for several purposes, for which the C preprocessor is a drastically overpowered and unsafe tool full of evil traps. Most of these purposes have alternative, less overpowered, techniques for achieving them in languages like Rust or even C++. First, compile-time constants:

#define PI 3.1416

for which Rust has constant expressions:

const PI: f64 = 3.1416;

Second, polymorphic “functions”:

#define MIN(a, b) (a) <= (b) ? (a) : (b)

for which Rust has… well, actual polymorphic functions:

fn min<T: std::cmp::PartialOrd>(a: T, b: T) -> T {
    if a <= b { a } else { b }
}

Third, conditional compilation:

#ifdef WIN32
int read_key(void) {
    // ...
}
#endif  // WIN32

for which Rust has… also conditional compilation:

#[cfg(windows)]
fn read_key() -> i32 {
    // ...
}

Fourth, redefining syntax, as mentioned above:

#define plus +
int two = 1 plus 1;

which in C you should probably never do except as a joke. But in Rust (as in the longhand_add example from earlier) at least you get a clue about what is going on because of the the longhand_add!{...} macro name surrounding the custom syntax; and the plus identifier doesn’t leak out into the rest of your program.

Lastly, code generation, which is what we want to do here in the assembler. In C it’s often complicated and this tangent is already long enough, but if you’re curious, here and here is an example of using a C preprocessor technique called X Macros to generate code that would otherwise be repetitive to write. In C, code generation using macros is a way of trading off less readability (because macros are complicated) for better maintainability (because in repeated blocks of very similar code it’s easy to make mistakes without noticing.) I imagine in Rust the tradeoff is much the same.

Designing an LC-3 assembler macro

You may remember in the previous post, in order to run a program with the VM, I had to write a small, separate Rust program to write the hand-assembled words to a file consisting of LC-3 bytecode. I could then load and run the file with the VM’s ld_img() method.

I would like to be able to write a file with the assembler macro, but I would also like to be able to write assembly language directly inline and execute it with the VM, without having to write it to a file. Something like this:

fn run_program() -> Result<()> {
    let mut vm = VM::new();
    vm.ld_asm(&asm! {
                .ORIG x3000
                LEA R0, hello  ; load address of string
                PUTS           ; output string to console
                HALT
        hello:  .STRINGZ "Hello World!\n"
                .END
    }?);
    vm.start()
}

My first thought was that I could have the asm macro expand to an array of LC-3 bytecodes. However, writing out a possible implementation for the VM.ld_asm() method shows that the asm macro needs to give two pieces of data: the origin address as well as the bytecodes.

pub fn ld_asm(&mut self, ???) {
    let mut addr = ???origin???;
    for inst in ???bytecodes??? {
        self.mem[addr] = Wrapping(*inst);
        addr += 1;
    }
}

So, it seemed better to have the asm macro expand to an expression that creates a struct with these two pieces of data in it. I started an assembler.rs submodule and called this object assembler::Program.

#[derive(Debug)]
pub struct Program {
    origin: u16,
    bytecode: Vec<u16>,
}

impl Program {
    pub fn origin(&self) -> usize {
        self.origin as usize
    }

    pub fn bytecode(&self) -> &[u16] {
        &self.bytecode
    }
}

Next, I needed to figure out how to get from LC-3 assembly language to the data model of Program. Obviously I needed the address to load the program into (origin), which is set by the .ORIG directive. But I also needed to turn the assembly language text into bytecodes somehow. Maybe the macro could do this … but at this point, my hunch from reading about Rust macros was that the macro should focus on transforming the assembly language into valid Rust code, and not so much on processing. Processing can be done in a method of Program using regular Rust code, not macro code. So the macro should just extract the information from the assembly language: a list of instructions and their operands, and a map of labels to their addresses (“symbol table”).³

#[derive(Clone, Copy, Debug)]
pub enum Reg { R0, R1, R2, R3, R4, R5, R6, R7 }

#[derive(Debug)]
pub enum Inst {
    Add1(/* dst: */ Reg, /* src1: */ Reg, /* src2: */ Reg),
    Add2(/* dst: */ Reg, /* src: */ Reg, /* imm: */ i8),
    And1(/* dst: */ Reg, /* src1: */ Reg, /* src2: */ Reg),
    And2(/* dst: */ Reg, /* src: */ Reg, /* imm: */ i8),
    // ...etc
    Trap(u8),
}

pub type SymbolTable = MultiMap<&'static str, u16>;

With all this, here’s effectively what I want the macro to extract out of the sample program I listed near the beginning of the post:

let origin: u16 = 0x3000;
let instructions = vec![
    Inst::Lea(Reg::R0, "hello"),
    Inst::Trap(0x22),
    Inst::Trap(0x25),
    Inst::Stringz("Hello world!\n"),
];
let symtab: SymbolTable = multimap!(
    "hello" => 0x3003,
);

(Remember from Part 1 that PUTS and HALT are system subroutines, called with the TRAP instruction.)

As the last step of the macro, I’ll then pass these three pieces of data to a static method of Program which will create an instance of the Program struct with the origin and bytecode in it:

Program::assemble(origin, &instructions, &symtab)

You may be surprised that I picked a multimap for the symbol table instead of just a map. In fact I originally used a map. But it’s possible for the assembly language code to include the same label twice, which is an error. I found that handling duplicate labels inside the macro made it much more complicated, whereas it was easier to handle errors in the assemble() method. But for that, we have to store two copies of the label in the symbol table so that we can determine later on that it is a duplicate.

Demystifying the magic

At this point I still hadn’t sat down to actually write a Rust macro. Now that I knew what I want the macro to achieve, I could start.⁴

The easy part was that the assembly language code should start with the .ORIG directive, to set the address at which to load the assembled bytecode; and end with the .END directive. Here’s a macro rule that does that:

(
    .ORIG $orig:literal
    $(/* magic happens here: recognize at least one asm instruction */)+
    .END
) => {{
    use $crate::assembler::{Inst::*, Program, Reg::*, SymbolTable};
    let mut instructions = Vec::new();
    let mut symtab: SymbolTable = Default::default();
    let origin: u16 = $orig;
    $(
        // more magic happens here: push a value into `instructions` for
        // each instruction recognized by the macro, and add a label to
        // the symbol table if there is one
    )*
    Program::assemble(origin, &instructions, &symtab)
}};

Easy, right? The hard part is what happens in the “magic”!

You might notice that the original LC-3 assembly language’s .ORIG directive looks like .ORIG x3000, and x3000 is decidedly not a Rust numeric literal that can be assigned to a u16.

At this point I had to decide what tradeoffs I wanted to make in the macro. Did I want to support the LC-3 assembly language from the specification exactly? It looked like I might be able to do that, x3000-formatted hex literals and all, if I scrapped what I had so far and instead wrote a procedural macro⁵, which operates directly on a stream of tokens from the lexer. But instead, I decided that my goal would be to support a DSL that looks approximately like the LC-3 assembly language, without making the macro too complicated.

In this case, “not making the macro too complicated” means that hex literals are Rust hex literals (0x3000 instead of x3000) and decimal literals are Rust decimal literals (15 instead of #15). That was good enough for me.

Next I had to write a matcher that would match each instruction. A line of LC-3 assembly language looks like this:⁶

instruction := [ label : ] opcode [ operand [ , operand ]* ] [ ; comment ] \n

So I first tried a matcher like this:

$($label:ident:)? $opcode:ident $($operands:expr),* $(; $comment:tt)

There are a few problems with this. The most pressing one is that “consume tokens until newline” is just not a thing in Rust macro matchers, so it’s not possible to ignore comments like this. Newlines are just treated like any other whitespace. There’s also no fragment specifier⁷ for “any token”; the closest is tt but that matches a token tree, which is not actually what I want here — I think it would mean the comment has to be valid Rust code, for one thing!

Keeping my tradeoff philosophy in mind, I gave up quickly on including semicolon-delimited comments in the macro. Regular // and /* comments would work just fine without even having to match them in the macro. Instead, I decided that each instruction would end with a semicolon, and that way I’d also avoid the problem of not being able to match newlines.

$($label:ident:)? $opcode:ident $($operands:expr),*;

The next problem is that macro matchers cannot look ahead or backtrack, so $label and $opcode are ambiguous here. If we write an identifier, it could be either a label or an opcode and we won’t know until we read the next token to see if it’s a colon or not; which is not allowed. So I made another change to the DSL, to make the colon come before the label.

With this matcher expression, I could write more of the body of the macro rule:⁸

(
    .ORIG $orig:literal;
    $($(:$label:ident)? $opcode:ident $($operands:expr),*;)+
    .END;
) => {{
    use $crate::assembler::{Inst::*, Program, Reg::*, SymbolTable};
    let mut instructions = Vec::new();
    let mut symtab: SymbolTable = Default::default();
    let origin: u16 = $orig;
    $(
        $(symtab.insert(stringify!($label), origin + instructions.len() as u16);)*
        // still some magic happening here...
    )*
    Program::assemble(origin, &instructions, &symtab)
}};

For each instruction matched by the macro, we insert its label (if there is one) into the symbol table to record that it should point to the current instruction. Then at the remaining “magic”, we have to insert an instruction into the code vector. I originally thought that I could do something like instructions.push($opcode($($argument),*));, in other words constructing a value of Inst directly. But that turned out to be impractical because the ADD and AND opcodes actually have two forms, one to do the operation with a value from a register, and one with a literal value. This means we actually need two different arms of the Inst enum for each of these instructions, as I listed above:

Add1(Reg, Reg, Reg),
Add2(Reg, Reg, i8),

I could have changed it so that we have to write ADD1 and ADD2 inside the asm! macro, but that seemed to me too much of a tradeoff in the wrong direction; it meant that if you wanted to copy an LC-3 assembly language listing into the asm! macro, you’d need to go over every ADD instruction and rename it to either ADD1 or ADD2, and same for AND. This would be a bigger cognitive burden than just mechanically getting the numeric literals in the right format.

Not requiring a 1-to-1 correspondence between assembly language opcodes and the Inst enum also meant I could easily define aliases for the trap routines. For example, HALT could translate to Inst::Trap(0x25) without having to define a separate Inst::Halt.

But then what to put in the “magic” part of the macro body? It seemed to me that another macro expansion could transform LEA R0, hello into Inst::Lea(Reg::R0, "hello")! I read about internal rules in the Little Book, and they seemed like a good fit for this.

So, I replaced the magic with this call to an internal rule @inst:

instructions.push(asm! {@inst $opcode $($operands),*});

And I wrote wrote a series of @inst internal rules, each of which constructs an arm of the Inst enum, such as:

(@inst ADD $dst:expr, $src:expr, $imm:literal) => { Add2($dst, $src, $imm) };
(@inst ADD $dst:expr, $src1:expr, $src2:expr)  => { Add1($dst, $src1, $src2) };
// ...
(@inst HALT)                                   => { Trap(0x25) };
// ...
(@inst LEA $dst:expr, $lbl:ident)              => { Lea($dst, stringify!($lbl)) };

Macro rules have to be written from most specific to least specific, so the rules for ADD first try to match against a literal in the third operand (e.g. ADD R0, R1, -1) and construct an Inst::Add2, and otherwise fall back to an Inst::Add1.

But unfortunately I ran into another problem here. The $lbl:ident in the LEA rule is not recognized. I’m still not 100% sure why this is, but the Little Book’s section on fragment specifiers says,

Capturing with anything but the ident, lifetime and tt fragments will render the captured AST opaque, making it impossible to further match it with other fragment specifiers in future macro invocations.

So I suppose this is because we capture the operands with $($operands:expr),*. I tried capturing them as token trees (tt) but then the macro becomes ambiguous because token trees can include the commas and semicolons that I’m using for delimitation. So, I had to rewrite the rules for opcodes that take a label as an operand, like this:

(@inst LEA $dst:expr, $lbl:literal) => { Lea($dst, $lbl) };

and now we have to write them like LEA R0, "hello" (with quotes). This is the one thing I wasn’t able to puzzle out to my satisfaction, that I wish I had been.

Finally, after writing all the @inst rules I realized I had a bug. When adding the address of a label into the symbol table, I calculated the current value of the program counter (PC) with origin + code.len(). But some instructions will translate into more than one word of bytecode: BLKW and STRINGZ.⁹ BLKW 8, for example, reserves a block of 8 words. This would give an incorrect address for a label occurring after any BLKW or STRINGZ instruction.

To fix this, I wrote a method for Inst to calculate the instruction’s word length:

impl Inst {
    pub fn word_len(&self) -> u16 {
        match *self {
            Inst::Blkw(len) => len,
            Inst::Stringz(s) => u16::try_from(s.len()).unwrap(),
            _ => 1,
        }
    }
}

and I changed the macro to insert the label into the symbol table pointing to the correct PC:¹⁰

$(
    symtab.insert(
        stringify!($lbl),
        origin + instructions.iter().map(|i| i.word_len()).sum::<u16>(),
    );
)*

At this point I had something that looked and worked quite a lot like how I originally envisioned the inline assembler. For comparison, the original idea was to put the LC-3 assembly language directly inside the macro:

asm! {
            .ORIG x3000
            LEA R0, hello  ; load address of string
            PUTS           ; output string to console
            HALT
    hello:  .STRINGZ "Hello World!\n"
            .END
}

Along the way I needed a few tweaks to avoid making the macro too complicated, now I had this:

asm! {
            .ORIG 0x3000;
            LEA R0, "hello";  // load address of string
            PUTS;             // output string to console
            HALT;
    :hello  STRINGZ "Hello World!\n";
            .END;
}

Just out of curiosity, I used cargo-expand (as I did in Part One) to expand the above use of the asm! macro, and I found it was quite readable:

A retro-looking cassette tape labeled "COMPUTAPE" — Now we’re ready to save the bytecode to tape. (Image by Bruno /Germany from Pixabay)

{
    use crate::assembler::{Inst::*, Program, Reg::*, SymbolTable};
    let mut instructions = Vec::new();
    let mut symtab: SymbolTable = Default::default();
    let origin: u16 = 0x3000;
    instructions.push(Lea(R0, "hello"));
    instructions.push(Trap(0x22));
    instructions.push(Trap(0x25));
    symtab.insert("hello", origin + instructions.iter().map(|i| i.word_len()).sum::<u16>());
    code.push(Stringz("Hello World!\n"));
    Program::assemble(origin, &instructions, &symtab)
}

Assembling the bytecode

I felt like the hard part was over and done with! Now all I needed was to write the Program::assemble() method. I knew already that the core of it would work like the inner loop of the VM in Part One of the series, only in reverse. Instead of using bitmatch to unpack the instruction words, I matched on Inst and used bitpack to pack the data into instruction words. Most of them were straightforward:

match inst {
    Add1(dst, src1, src2) => {
        let (d, s, a) = (*dst as u16, *src1 as u16, *src2 as u16);
        words.push(bitpack!("0001_dddsss000aaa"));
    }
    // ...etc

The instructions that take a label operand needed a bit of extra work. I had to look up the label in the symbol table, compute an offset relative to the PC, and pack that into the instruction word. This process may produce an error: the label might not exist, or might have been given more than once, or the offset might be too large to pack into the available bits (in other words, the instruction is trying to reference a label that’s too far away.)

fn integer_fits(integer: i32, bits: usize) -> Result<u16, String> {
    let shift = 32 - bits;
    if integer << shift >> shift != integer {
        Err(format!(
            "Value x{:04x} is too large to fit in {} bits",
            integer, bits
        ))
    } else {
        Ok(integer as u16)
    }
}

fn calc_offset(
    origin: u16,
    symtab: &SymbolTable,
    pc: u16,
    label: &'static str,
    bits: usize,
) -> Result<u16, String> {
    if let Some(v) = symtab.get_vec(label) {
        if v.len() != 1 {
            return Err(format!("Duplicate label \"{}\"", label));
        }
        let addr = v[0];
        let offset = addr as i32 - origin as i32 - pc as i32 - 1;
        Self::integer_fits(offset, bits).map_err(|_| {
            format!(
                "Label \"{}\" is too far away from instruction ({} words)",
                label, offset
            )
        })
    } else {
        Err(format!("Undefined label \"{}\"", label))
    }
}

The arm of the core match expression for such an instruction, for example LEA, looks like this:

Lea(dst, label) => {
    let d = *dst as u16;
    let o = Self::calc_offset(origin, symtab, pc, label, 9)
        .unwrap_or_else(append_error);
    words.push(bitpack!("1110_dddooooooooo"));
}

Here, append_error is a closure that pushes the error message returned by calc_offset() into an array: |e| { errors.push((origin + pc, e)); 0 }

Lastly, a couple of arms for the oddball instructions that define data words, not code:

Blkw(len) => words.extend(vec![0; *len as usize]),
Fill(v) => words.push(*v),
Stringz(s) => {
    words.extend(s.bytes().map(|b| b as u16));
    words.push(0);
}

At the end of the method, if there weren’t any errors, then we successfully assembled the program:

if words.len() > (0xffff - origin).into() {
    errors.push((0xffff, "Program is too large to fit in memory".to_string()));
}
if errors.is_empty() {
    Ok(Self {
        origin,
        bytecode: words,
    })
} else {
    Err(AssemblerError { errors })
}

Bells and whistles

Next thing was to add a few extensions to the assembly language to make writing programs easier. (Writing programs is what I’m going to cover in Part 3 of the series.)

While researching the LC-3 for Part 1, I found a whole lot of lab manuals and other university course material. No surprise, since the LC-3 is originally from a textbook. One document I stumbled upon was “LC3 Language Extensions” from Richard Squier’s course material at Georgetown. In it are a few handy aliases for opcodes:

MOV R3, R5 – copy R3 into R5; can be implemented as ADD R3, R5, 0, i.e. R5 = R3 + 0
ZERO R2 – clear (store zero into) R2; can be implemented as AND R2, R2, 0
INC R4 – increment (add one to) R4; can be implemented as ADD R4, R4, 1
DEC R4 – decrement (subtract one from) R4; can be implemented as ADD R4, R4, -1

Besides these, the LC-3 specification itself names RET as an alias for JMP R7. Finally, an all-zero instruction word is a no-op so I defined an alias NOP for it.¹¹

These are straightforward to define in the macro:

(@inst MOV $dst:expr, $src:expr) => { Add2($dst, $src, 0) };
(@inst ZERO $dst:expr)           => { And2($dst, $dst, 0) };
(@inst INC $dst:expr)            => { Add2($dst, $dst, 1) };
(@inst DEC $dst:expr)            => { Add2($dst, $dst, -1) };
(@inst RET)                      => { Jmp(R7) };
(@inst NOP)                      => { Fill(0) };

I wrote the last one as Fill(0) and not as Br(false, false, false, 0) which might have been more instructive, because Br takes a &' static str for its last parameter, not an address. So I would have had to make a dummy label in the symbol table pointing to address 0. Filling a zero word seemed simpler and easier.

The final improvement I wanted was to have AssemblerError print nice error messages. I kind of glossed over AssemblerError earlier, but it is an implementation of Error that contains an array of error messages with their associated PC value:

#[derive(Debug, Clone)]
pub struct AssemblerError {
    errors: Vec<(u16, String)>,
}
impl error::Error for AssemblerError {}

I implemented Display such that it would display each error message alongside a nice hex representation of the PC where the instruction failed to assemble:

impl fmt::Display for AssemblerError {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        writeln!(f, "Error assembling program")?;
        for (pc, message) in &self.errors {
            writeln!(f, "  x{:04x}: {}", pc, message)?;
        }
        Ok(())
    }
}

This still left me with a somewhat unsatisfying mix of kinds of errors. Ideally, the macro would catch all possible errors at compile time!

At compile time we can catch several kinds of errors:

// Unsyntactical assembly language program
asm!{ !!! 23 skidoo !!! }
// error: no rules expected the token `!`

HCF;  // Nonexistent instruction mnemonic
// error: no rules expected the token `HCF`

ADD R3, R2;  // Wrong number of arguments
// error: unexpected end of macro invocation
// note: while trying to match `,`
//     (@inst ADD $dst:expr, $src:expr, $imm:literal)  => { Add2($dst, $sr...
//                                    ^

ADD R3, "R2", 5;  // Wrong type of argument
// error[E0308]: mismatched types
//     ADD R3, "R2", 5;
//             ^^^^ expected `Reg`, found `&str`
//     ...al)  => { Add2($dst, $src, $imm) };
//                  ---- arguments to this enum variant are incorrect
// note: tuple variant defined here
//     Add2(/* dst: */ Reg, /* src: */ Reg, /* imm: */ i8),
//     ^^^^

BR R7;  // Wrong type of argument again
// error: no rules expected the token `R7`
// note: while trying to match meta-variable `$lbl:literal`
//     (@inst BR $lbl:literal)                         => { Br(true, true,...
//               ^^^^^^^^^^^^

These error messages from the compiler are not ideal — if I had written a dedicated assembler, I’d have made it output better error messages — but they are not terrible either.

Then there are some errors that could be caught at compile time, but not with this particular design of the macro. Although note that saying an error is caught at runtime is ambiguous here. Even if the Rust compiler doesn’t flag the error while processing the macro, we can still flag it at the time of the execution of assemble() — this is at runtime for the Rust program, but at compile time for the assembly language. It’s different from a LC-3 runtime error where the VM encounters an illegal opcode such as 0xDEAD during execution.

Anyway, this sample program contains one of each such error and shows the nice output of AssemblerError:

asm! {
            .ORIG 0x3000;
            LEA R0, "hello";   // label is a duplicate
            PUTS;
            LEA R0, "greet";   // label doesn't exist
            PUTS;
            LEA R0, "salute";  // label too far away to fit in offset
            ADD R3, R2, 0x7f;  // immediate value is out of range
            HALT;
    :hello  STRINGZ "Hello World!\n";
    :hello  STRINGZ "Good morning, planet!\n";
            BLKW 1024;
    :salute STRINGZ "Regards, globe!\n";
            BLKW 0xffff;  // extra space makes the program too big
            .END;
}?
// Error: Error assembling program
//   x3000: Duplicate label "hello"
//   x3002: Undefined label "greet"
//   x3004: Label "salute" is too far away from instruction (1061 words)
//   x3005: Value x007f is too large to fit in 5 bits
//   xffff: Program is too large to fit in memory

To check that all the labels are present only once, you need to do two passes on the input. In fact, the macro effectively does do two passes: one in the macro rules where it populates the symbol table, and one in assemble() where it reads the values back out again. But I don’t believe it’d be possible to do two passes in the macro rules themselves, to get compile time checking for this.

The out-of-range value in ADD R3, R2, 0x7f is an interesting case though! This could be caught at compile time if Rust had arbitrary bit-width integers.¹² After all, TRAP -1 and TRAP 0x100 are caught at compile time because the definition of Inst::Trap(u8) does not allow you to construct one with those literal values.

I tried using types from the ux crate for this, e.g. Add2(Reg, Reg, ux::i5). But there is no support for constructing custom integer types from literals, so I would have had to use try_from() in the macro — in which case I wouldn’t get compile time errors anyway, so I didn’t bother.

My colleague Andreu Botella suggested that I could make out-of-range immediate values a compile time error by using a constant expression — something I didn’t know existed in Rust.

(@int $bits:literal $num:literal) => {{
    const _: () = {
        let converted = $num as i8;
        let shift = 8 - $bits;
        if converted << shift >> shift != converted {
            panic!("Value is too large to fit in bitfield");
        }
    };
    $num as i8
}};

(@inst ADD $dst:expr, $src:expr, $imm:literal)  => { Add2($dst, $src, asm!(@int 5 $imm)) };

I found this really clever! But on the other hand, it made having a good error message quite difficult. panic! in a constant expression cannot format values, it can only print a string literal. So you get a compile time error like this:

error[E0080]: evaluation of constant value failed
    asm! { .ORIG 0x3000; ADD R0, R0, 127; .END; }.unwrap_err();
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ the evaluated program panicked at 'Value is too large to fit in bitfield'

The whole assembly language program is highlighted as the source of the error, and the message doesn’t give any clue which value is problematic. This would make it almost impossible to locate the error in a large program with many immediate values. For this reason, I decided not to adopt this technique. I found the runtime error preferable because it gives you the address of the instruction, as well as the offending value. But I did learn something!

Conclusion

At this point I was quite happy with the inline assembler macro! I was able to use it to write programs for the LC-3 without having to calculate label offsets by hand, which is all I really wanted to do in the first place. Part 3 of the series will be about some programs that I wrote to test the VM (and test my understanding of it!)

I felt like I had successfully demystified Rust macros for myself now, and would be able to write another macro if I needed to. I appreciated having the chance to gain that understanding while working on a personal project that caught my interest.¹³ This is — for my learning style, at least — my ideal way to learn. I hope that sharing it helps you too.

Finally, you should know that this writeup presents an idealized version of the process. In 2020 I wrote a series of posts where I journalled writing short Rust programs, including mistakes and all, and those got some attention. This post is not that. I tried many different dead ends while writing this, and if I’d chronicled all of them this post would be even longer. Here, I’ve tried to convey approximately the journey of understanding I went through, while smoothing it over so that it makes — hopefully — a good read.

Many thanks to Andreu Botella, Angelos Oikonomopoulos, and Federico Mena Quintero who read a draft of this and with their comments made it a better read than it was before.

[1] As might be the smarter, and certainly the more conventional, thing to do ↩

[2] Tokens as in what a lexer produces: identifiers, literals, … ↩

[3] A symbol table usually means something different, a table that contains information about a program’s variables and other identifiers. We don’t have variables in this assembly language; labels are the only symbols there are ↩

[4] What actually happened, I just sat down and started writing and deleting and rewriting macros until I got a feel for what I needed. But this blog post is supposed to be a coherent story, not a stream-of-consciousness log, so let’s pretend that I thought about it like this first ↩

[5] Conspicuously, still under construction in the Little Book ↩

[6] This isn’t a real grammar for the assembly language, just an approximation ↩

[7] Fragment specifiers, according to the Little Book, are what the expr part of $e:expr is called ↩

[8] Notice that .ORIG and .END now have semicolons too, but no labels; they aren’t instructions, so they can’t have an address ↩

[9] Eagle-eyed readers may note that in the LC-3 manual these are called .BLKW and .STRINGZ, with leading dots. I elided the dots, again to make the macro less complicated ↩

[10] This now seems wasteful rather than keeping the PC in a variable. On the other hand, it seems complicated to add a bunch of local variables to the repeated part of the macro ↩

[11] In fact, any instruction word with a value between 0x0000-0x01ff is a no-op. This is a consequence of BR‘s opcode being 0b0000, which I can only assume was chosen for exactly this reason. BR, with the three status register bits also being zero, never branches and always falls through to the next instruction ↩

[12] Or even just i5 and i6 types ↩

[13] Even if it took 1.5 years to complete it ↩

Comparing Apples and AppleOranges

Posted on March 3, 2022 by Philip Chimento

Via Zach Holman’s blog post I found an interesting Twitter discussion that kicked off with these questions:

A couple of tough questions for all of you:
1. Is the date 2022-06-01 equal to the time 2022-06-01 12:00:00?
2. Is the date 2022-06-01 between the time 2022-06-01 12:00:00 and the time 2022-12-31 12:00:00?
3. Is the time 2022-06-01 12:00:00 after the date 2022-06-01?

I’ve been involved for two years and counting¹ in the design of Temporal, an enhancement for the JavaScript language that adds modern facilities for handling dates and times. One of the principles of Temporal that was established long before I got involved, is that we should use different objects to represent different concepts. For example, if you want to represent a calendar date that’s not associated with any specific time of day, you use a class that doesn’t require you to make up a bogus time of day.² Each class has a definition for equality, comparison, and other operations that are appropriate to the concept it represents, and you get to specify which one is appropriate for your use case by your choice of which one you use. In other, more jargony, words, Temporal offers different data types with different semantics.³

For me these questions all boil down to, when we consider a textual representation like 2022-06-01, what concept does it represent? I would say that each of these strings can represent more than one concept, and to get a good answer, you need to specify which concept you are talking about.

So, my answers to the three questions are “it depends”, “no but maybe yes”, and “it depends.” I’ll walk through why I think this, and how I would solve it with Temporal, for each question.

You can follow along or try out your own answers by going the Temporal documentation page, and opening your browser console. That will give you an environment where you can try these examples and experiment for yourself.

Question 1

Is the date 2022-06-01 equal to the time 2022-06-01 12:00:00?

As I mentioned above, Temporal has different data types with different semantics. In the case of this question, what the question refers to as a “time” we call a “date-time” in Temporal⁴, and the “date” is still a date. The specific types we’d use are PlainDateTime and PlainDate, respectively. PlainDate is a calendar date that doesn’t have a time associated with it: a single square on a wall calendar. PlainDateTime is a calendar date with a wall-clock time. In both cases, “plain” refers to not having a time zone attached, so we know we’re not dealing with any 23-hour or 25-hour or even more unusual day lengths.

The reason I say that the answer depends, is that you simply can’t say whether a date is equal to a date-time. They are two different concepts, so the answer is not well-defined. If you want to do that, you have to convert one to the other so that you either compare two dates, or two date-times, each with their accompanying definition of equality.

You do this in Temporal by choosing the type of object to create, PlainDate or PlainDateTime, and the resulting object’s equals() method will do the right thing:

> Temporal.PlainDate.from('2022-06-01').equals('2022-06-01 12:00:00')
true
> Temporal.PlainDateTime.from('2022-06-01').equals('2022-06-01 12:00:00')
false

I think either of PlainDate or PlainDateTime semantics could be valid based on your application, so it seems important that both are within reach of the programmer. I will say that I don’t expect PlainDateTime will get used very often in practice.⁵ But I can think of a use case for either one of these:

If you have a list of PlainDateTime events to present to a user, and you want to filter them by date. Let’s say we have data from a pedometer, where we care about what local time it was in the user’s time zone when they got their exercise, and the user has asked to see all the exercise they got yesterday. In this case I’d use date semantics: convert the PlainDateTime data to PlainDate data.
On the other hand, if the 2022-06-01 input comes from a date picker widget where the user could have input a time but didn’t, then we might decide that it makes sense to default the time of day to midnight, and therefore use date-time semantics.

Question 2

Is the date 2022-06-01 between the time 2022-06-01 12:00:00 and the time 2022-12-31 12:00:00?

I think the answer to this one is more unambiguously a no. If we use date-time semantics (in Temporal, PlainDateTime.compare()) the date implicitly converts to midnight on that day, so it comes before both of the date-times. If we use date semantics (PlainDate.compare()), 2022-06-01 and 2022-06-01 12:00:00 are equal as we determined in Question 1, so I wouldn’t say it’s “between” the two date-times.

> Temporal.PlainDateTime.compare('2022-06-01', '2022-06-01 12:00:00')
-1
> Temporal.PlainDateTime.compare('2022-06-01', '2022-12-31 12:00:00')
-1
> Temporal.PlainDate.compare('2022-06-01', '2022-06-01 12:00:00')
0
> Temporal.PlainDate.compare('2022-06-01', '2022-12-31 12:00:00')
-1

(Why these numbers?⁶ The compare methods return −1, 0, or 1, according to the convention used by Array.prototype.sort, so that you can do things like arr.sort(Temporal.PlainDate.compare). 0 means the arguments are equal and −1 means the first comes before the second.)

But maybe the answer still depends a little bit on what your definition of “between” is. If it means the date-times form a closed interval instead of an open interval, and we are using date semantics, then the answer is yes.⁷

Question 3

Is the time 2022-06-01 12:00:00 after the date 2022-06-01?

After thinking about the previous two questions, this should be clear. If we’re using date semantics, the two are equal, so no. If we’re using date-time semantics, and we choose to convert a date to a date-time by assuming midnight as the time, then yes.

Other people’s answers

I saw a lot of answers saying that you need more context to be able to compare the two, so I estimate that the way Temporal requires that you give that context, instead of assuming one or the other, does fit with the way that many people think. However, that wasn’t the only kind of reply I saw. (Otherwise the discussion wouldn’t have been that interesting!) I’ll discuss some of the other common replies that I saw in the Twitter thread.

“Yes, no, no: truncate to just the dates and compare those, since that’s the data you have in common.” People who said this seem like they might naturally gravitate towards date semantics. I’d estimate that date semantics are probably correct for more use cases. But maybe not your use case!

“No, no, yes: a date with no time means midnight is implicit.” People who said this seem like they might naturally gravitate towards date-time semantics. It makes sense to me that programmers think this way; if you’re missing a piece of data, just fill in 0 and keep going. I’d estimate that this isn’t how a lot of nontechnical users think of dates, though.

In this whole post I’ve assumed we assume the time is midnight when we convert a date to a date-time, but in the messy world of dates and times, it can make sense to assume other times than midnight, as well. This comes up especially if time zones are involved. For example, you might assume noon, or start-of-day, instead. Start-of-day is often, but not always midnight:

Temporal.PlainDateTime.from('2018-11-04T12:00')
  .toZonedDateTime('America/Sao_Paulo')
  .startOfDay()
  .toPlainTime()  // -> 01:00

“These need to have time zones attached for the question to make sense.” If this is your first reaction when you see a question like this, great! If you write JavaScript code, you probably make fewer bugs just by being aware that JavaScript’s Date object makes it really easy to confuse time zones.

I estimate that Temporal’s ZonedDateTime type is going to fit more use cases in practice than either PlainDate or PlainDateTime. In that sense, if you find yourself with this data and these questions in your code, it makes perfect sense to ask yourself whether you should be using a time-zone-aware type instead. But, I think I’ve given some evidence above that sometimes the answer to that is no: for example, the pedometer data that I mentioned above.

“Dates without times are 24-hour intervals.” Also mentioned as “all-day events”. I can sort of see where this comes from, but I’m not sure I agree with it. In the world where JavaScript Date is the only tool you have, it probably makes sense to think of a date as an interval. But I’d estimate that a lot of non-programmers don’t think of dates this way: instead, it’s a square on your calendar!

It’s also worth noting that in some calendar software, you can create an all-day event that lasts from 00:00 until 00:00 the following day, and you can also create an event for just the calendar date, and these are separate things.

A screenshot of calendar software showing a visual difference between one calendar event spanning 24 hours, and a second all-day event the next day. — A 24-hour interval and a calendar date. Although notably, Google Calendar collapses the 24-hour event into a calendar-date event if you do this.

“Doesn’t matter, just pick one convention and stick with it.” I hope after reading this post you’re convinced that it does matter, depending on your use case.

“Ugh!” That’s how I feel too and why I wrote a whole blog post about it!

How do I feel about the choices we made in Temporal?

I’m happy with how Temporal encourages the programmer to handle these cases. When I went to try out the comparisons that were suggested in the original tweet, I found it was natural to pick either PlainDate or PlainDateTime to represent the data.

One thing that Temporal could have done instead (and in fact, we went back and forth on this a few times before the proposal reached its currently frozen stage in the JS standardization process) would be to make the choice of data type, and therefore of comparison semantics, more explicit.

For example, one might make a case that it’s potentially confusing that the 12:00:00 part of the string in Temporal.PlainDate.from('2022-06-01').equals('2022-06-01 12:00:00') is ignored when the string is converted to a PlainDate. We could have chosen, for example, to throw if the argument to PlainDate.prototype.equals() was a string with a time in it, or if it was a PlainDateTime. That would make the code for answering question 1 look like this:

> Temporal.PlainDate.from('2022-06-01').equals(
... Temporal.PlainDateTime.from('2022-06-01 12:00:00')
... .toPlainDate())
true

This approach seems like it’s better at forcing the programmer to make a choice consciously by throwing exceptions when there is any doubt, but at the cost of writing such long-winded code that I find it difficult to follow. In the end, I prefer the more balanced approach we took.

Conclusion

This was a really interesting problem to dig into. I always find it good to be reminded that no matter what I think is correct about date-time handling, someone else is going to have a different opinion, and they won’t necessarily be wrong.

I said in the beginning of the post: “to get a good answer, you need to specify which concept you are talking about.” Something we’ve tried hard to achieve in Temporal is to make it easy and natural, but not too obtrusive, to specify this. When I went to answer the questions using Temporal code, I found it pretty straightforward, and I think that validates some of the design choices we made in Temporal.

I’d like to acknowledge my employer Igalia for letting me spend work time writing this post, as well as Bloomberg for sponsoring Igalia’s work on Temporal. Many thanks to my colleagues Tim Chevalier, Jesse Alama, and Sarah Groff Hennigh-Palermo for giving feedback on a draft of this post.

[1] 777 days at the time of writing, according to Temporal.PlainDate.from('2020-01-13').until(Temporal.Now.plainDateISO()) ↩

[2] A common source of bugs with JavaScript’s legacy Date when the made-up time of day doesn’t exist due to DST ↩

[3] “Semantics” is, unfortunately, a word I’m going to use a lot in this post ↩

[4] “Time” in Temporal refers to a time on a clock face, with no date associated with it ↩

[5] We even say this on the PlainDateTime documentation page ↩

[6] We don’t have methods like isBefore()/isAfter() in Temporal, but this is a place where they’d be useful. These methods seem like good contenders for a follow-up proposal in the future ↩

[7] Intervals bring all sorts of tricky questions too! Some other date-time libraries have interval objects. We also don’t have these in Temporal, but are likewise open to a follow-up proposal in the future ↩

December of Rust 2021, Part 1: A Little Computer

Posted on January 10, 2022 by Philip Chimento

In the beginning of December I read Andrei Ciobanu’s Writing a simple 16-bit VM in less than 125 lines of C. Now, I’ve been interested in virtual machines and emulators for a long time, and I work tangential to VMs as part of my day job as a JavaScript engine developer for Igalia. I found this post really interesting because, well, it does what it says on the tin: A simple VM in less than 125 lines of C.

Readers of this blog, if I have any left at this point, might remember that in December 2020 I did a series of posts solving that year’s Advent of Code puzzles in order to try to teach myself how to write programs in the Rust programming language. I did say last year that if I were to do these puzzles again, I would do them at a slower pace and wouldn’t blog about them. Indeed, I started again this year, but it just wasn’t as interesting, having already gone through the progression from easy to hard puzzles and now having some idea already of the kinds of twists that they like to do in between the first and second halves of each puzzle.

So, instead of puzzles, I decided to see if I could write a similarly simple VM in Rust this December, as a continuation of my Rust learning exercise last year¹.

Andrei Ciobanu’s article, as well as some other articles he cites, write VMs that simulate the LC-3 architecture ². What I liked about this one is that it was really concise and no-nonsense, and did really well at illustrating how a VM works. There are already plenty of other blog posts and GitHub repositories that implement an LC-3 VM in Rust, and I can’t say I didn’t incorporate any ideas from them, but I found many of them to be a bit verbose. I wanted to see if I could create something in the same concise spirit as Andrei’s, but still idiomatic Rust.

Over the course of a couple of weeks during my end-of-year break from work, I think I succeeded somewhat at that, and so I’m going to write a few blog posts about it.

About the virtual machine

This post is not a tutorial about virtual machines. There are already plenty of those, and Andrei’s article is already a great one so it doesn’t make sense for me to duplicate it. Instead, In this section I’ll note some things about the LC-3 architecture before we start.

First of all, it has a very spartan instruction set. Each instruction is 16 bits, and there are no variable length instructions. The opcode is packed in the topmost 4 bits of each word. That means there are at most 16 instructions. And one opcode (1101) is not even used!

Only three instructions are arithmetic-type ones: addition, bitwise AND, and bitwise NOT. If you’re used to x86 assembly language you’ll notice that other operations like subtraction, multiplication, bitwise OR, are missing. We only need these three to do all the other operations in 2’s-complement arithmetic, although it is somewhat tedious! As I started writing some LC-3 assembly language to test the VM, I learned how to implement some other arithmetic operations in terms of ADD, AND, and NOT.³ I’ll go into this in a following post.

The LC-3 does not have a stack. All the operations take place in registers. If you are used to thinking in terms of a stack machine (for example, SpiderMonkey is one), this takes some getting used to.

First steps

I started out by trying to port Andrei’s C code to Rust code in the most straightforward way possible, not worrying about whether it was idiomatic or not.

The first thing I noticed is that whereas in C it’s customary to use a mutable global, such as reserving storage for the VM’s memory and registers at the global level with declarations such as uint16_t mem[UINT16_MAX] = {0};, the Rust compiler makes this very difficult. You can use a mutable static variable, but accessing it needs to be marked as unsafe. In this way, the Rust compiler nudges you to encapsulate the storage inside a class:

struct VM {
    mem: [u16; MEM_SIZE],
    reg: [u16; NREGS],
    running: bool,
}

Next we write functions to access the memory. In the C code these are:

static inline uint16_t mr(uint16_t address) { return mem[address];  }
static inline void mw(uint16_t address, uint16_t val) { mem[address] = val; }

In Rust, we have to cast the address to a usize, since usize is the type that we index arrays with:

#[inline]
fn ld(&mut self, addr: u16) -> u16 {
    self.mem[addr as usize]
}

#[inline]
fn st(&mut self, addr: u16, val: u16) {
    self.mem[addr as usize] = val;
}

(I decide to name them ld and st for “load” and “store” instead of mr and mw, because the next thing I do is write similar functions for reading and writing the VM’s registers, which I’ll call r and rw for “register” and “register write”. These names look less similar, so I find that makes the code more readable yet still really concise.)

The next thing in the C code is a bunch of macros that do bit-manipulation operations to unpack the instructions. I decide to turn these into #[inline] functions in Rust. For example,

#define OPC(i) ((i)>>12)
#define FIMM(i) ((i>>5)&1)

from the C code, become, in Rust,

#[inline] #[rustfmt::skip] fn opc(i: u16) -> u16 { i >> 12 }
#[inline] #[rustfmt::skip] fn fimm(i: u16) -> bool { (i >> 5) & 1 != 0 }

I put #[rustfmt::skip] because I think it would be nicer if the rustfmt tool would allow you to put super-trivial functions on one line, so that they don’t take up more visual space than they deserve.

You might think that the return type of opc should be an enum. I originally tried making it that way, but Rust doesn’t make it very easy to convert between enums and integers. The num_enum crate provides a way to do this, but I ended up not using it, as you will read below.

We also need a way to load and run programs in LC-3 machine code. I made two methods of VM patterned after the ld_img() and start() functions from the C code.

First I’ll talk about ld_img(). What I really wanted to do is read the bytes of the file directly into the self.mem array, without copying, as the C code does. This is not easy to do in Rust. Whereas in C all pointers are essentially pointers to byte arrays in the end, this is not the case in Rust. It’s surprisingly difficult to express that I want to read u16s into an array of u16s! I finally found a concise solution, using both the byteorder and bytemuck crates. For this to work, you have to import the byteorder::ReadBytesExt trait into scope.

pub fn ld_img(&mut self, fname: &str, offset: u16) -> io::Result<()> {
    let mut file = fs::File::open(fname)?;
    let nwords = file.metadata()?.len() as usize / 2;
    let start = (PC_START + offset) as usize;
    file.read_u16_into::<byteorder::NetworkEndian>(bytemuck::cast_slice_mut(
        &mut self.mem[start..(start + nwords)],
    ))
}

What this does is read u16s, minding the correct byte order, into an array of u8. But we have an array of u16 that we want to store it in, not u8. So bytemuck::cast_slice_mut() treats the &mut [u16] slice as a &mut [u8] slice, essentially equivalent to casting it as (uint8_t*) in C. It does seem like this ought to be part of the Rust standard library, but the only similar facility is std::mem::transmute(), which does the same thing. But it also much more powerful things as well, and is therefore marked unsafe. (I’m trying to avoid having any code that needs to be marked unsafe in this project.)

For running the loaded machine code, I wrote this method:

pub fn start(&mut self, offset: u16) {
    self.running = true;
    self.rw(RPC, PC_START + offset);
    while self.running {
        let i = self.ld(self.r(RPC));
        self.rw(RPC, self.r(RPC) + 1);
        self.exec(i);
    }
}

I’ll talk more about what happens in self.exec() in the next section.

The basic execute loop

In the C code, Andrei cleverly builds a table of function pointers and indexes it with the opcode, in order to execute each instruction:

typedef void (*op_ex_f)(uint16_t instruction);
op_ex_f op_ex[NOPS] = { 
    br, add, ld, st, jsr, and, ldr, str, rti, not, ldi, sti, jmp, res, lea, trap 
};

// ...in main loop:
op_ex[OPC(i)](i);

Each function, such as add(), takes the instruction as a parameter, decodes it, and mutates the global state of the VM. In the main loop, at the point where I have self.exec(i) in my code, we have op_ex[OPC(i)](i) which decodes the opcode out of the instruction, indexes the table, and calls the function with the instruction as a parameter. A similar technique is used to execute the trap routines.

This approach of storing function pointers in an array and indexing it by opcode or trap vector is great in C, but is slightly cumbersome in Rust. You would have to do something like this in order to be equivalent to the C code:

type OpExF = fn(&mut VM, u16) -> ();

// in VM:
const OP_EX: [OpExF; NOPS] = [
    VM::br, VM::add, ..., VM::trap,
];

// ...in main loop:
OP_EX[opc(i) as usize](self, i);

Incidentally, this is why I decided above not to use an enum for the opcodes. Not only would you have to create it from a u16 when you unpack it from the instruction, you would also have to convert it to a usize in order to index the opcode table.

In Rust, a match expression is a much more natural fit:

match opc(i) {
    BR => {
        if (self.r(RCND) & fcnd(i) != 0) {
            self.rw(RPC, self.r(RPC) + poff9(i));
        }
    }
    ADD => {
        self.rw(dr(i), self.r(sr(i)) +
            if fimm(i) {
                sextimm(i)
            } else {
                self.r(sr2(i))
            });
        self.uf(dr(i));
    }
    // ...etc.
}

However, there is an even better alternative that makes the main loop much more concise, like the one in the C code! We can use the bitmatch crate to simultaneously match against bit patterns and decode parts out of them.

#[bitmatch]
fn exec(&mut self, i: u16) {
    #[bitmatch]
    match i {
        "0000fffooooooooo" /* BR */ => {
            if (self.r(RCND) & f != 0) {
                self.rw(RPC, self.r(RPC) + sext(o, 9));
            }
        }
        "0001dddsss0??aaa" /* ADD register */ => {
            self.rw(d, self.r(s) + self.r(a));
            self.uf(d);
        }
        "0001dddsss1mmmmm" /* ADD immediate */ => {
            self.rw(d, self.r(s) + sext(m, 5));
            self.uf(d);
        }
        // ...etc.
    }
}

This actually gets rid of the need for all the bit-manipulation functions that I wrote in the beginning, based on the C macros, such as opc(), fimm(), and poff9(), because bitmatch automatically does all the unpacking. The only bit-manipulation we still need to do is sign-extension when we unpack immediate values and offset values from the instructions, as we do above with sext(o, 9) and sext(m, 5).

I was curious what kind of code the bitmatch macros generate under the hood and whether it’s as performant as writing out all the bit-manipulations by hand. For that, I wrote a test program that matches against the same bit patterns as the main VM loop, but with the statements in the match arms just replaced by constants, in order to avoid cluttering the output:

#[bitmatch]
pub fn test(i: u16) -> u16 {
    #[bitmatch]
    match i {
        "0000fffooooooooo" => 0,
        "0001dddsss0??aaa" => 1,
        "0001dddsss1mmmmm" => 2,
        // ...etc.
    }
}

There is a handy utility for viewing expanded macros that you can install with cargo install cargo-expand, and then run with cargo expand --lib test (I put the test function in a dummy lib.rs file.)

Here’s what we get!

pub fn test(i: u16) -> u16 {
    match i {
        bits if bits & 0b1111000000000000 == 0b0000000000000000 => {
            let f = (bits & 0b111000000000) >> 9usize;
            let o = (bits & 0b111111111) >> 0usize;
            0
        }
        bits if bits & 0b1111000000100000 == 0b0001000000000000 => {
            let a = (bits & 0b111) >> 0usize;
            let d = (bits & 0b111000000000) >> 9usize;
            let s = (bits & 0b111000000) >> 6usize;
            1
        }
        bits if bits & 0b1111000000100000 == 0b0001000000100000 => {
            let d = (bits & 0b111000000000) >> 9usize;
            let m = (bits & 0b11111) >> 0usize;
            let s = (bits & 0b111000000) >> 6usize;
            2
        }
        // ...etc.
        _ => // ...some panicking code
    }
}

It’s looking a lot like what I’d written anyway, but with all the bit-manipulation functions inlined. The main disadvantage is that you have to AND the value with a bitmask at each arm of the match expression. But maybe that isn’t such a problem? Let’s look at the generated assembly to see what the computer actually executes. There is another Cargo tool for this, which you can install with cargo install cargo-asm and run with cargo asm --lib lc3::test⁴. In the result, there are actually only three AND instructions, because there are only three unique bitmasks tested among all the arms of the match expression (0b1111000000000000, 0b1111100000000000, and 0b1111000000100000). So it seems like the compiler is quite able to optimize this into something good.

First test runs

By the time I had implemented all the instructions except for TRAP, at this point I wanted to actually run a program on the LC-3 VM! Andrei has one program directly in his blog post, and another one in his GitHub repository, so those seemed easiest to start with.

Just like in the blog post, I wrote a program (in my examples/ directory so that it could be run with cargo r --example) to output the LC-3 machine code. It looked something like this:

let program: [u16; 7] = [
    0xf026,  // TRAP 0x26
    0x1220,  // ADD R1, R0, 0
    0xf026,  // TRAP 0x26
    0x1240,  // ADD R1, R1, R0
    0x1060,  // ADD R0, R1, 0
    0xf027,  // TRAP 0x27
    0xf025,  // TRAP 0x25
];
let mut file = fs::File::create(fname)?;
for inst in program {
    file.write_u16::<byteorder::NetworkEndian>(inst)?;
}
Ok(())

In order for this to work, I still needed to implement some of the TRAP routines. I had left those for last, and at that point my match expression for TRAP instructions looked like "1111????tttttttt" => self.trap(t), and my trap() method looked like this:

fn trap(&mut self, t: u8) {
    match t {
        _ => self.crash(&format!("Invalid TRAP vector {:#02x}", t)),
    }
}

For this program, we can see that three traps need to be implemented: 0x25 (HALT), 0x26 (INU16), and 0x27 (OUTU16). So I was able to add just three arms to my match expression:

0x25 => self.running = false,
0x26 => {
    let mut input = String::new();
    io::stdin().read_line(&mut input).unwrap_or(0);
    self.rw(0, input.trim().parse().unwrap_or(0));
}
0x27 => println!("{}", self.r(0)),

With this, I could run the sample program, type in two numbers, and print out their sum.

The second program sums an array of numbers. In this program, I added a TRAP 0x27 instruction right before the HALT in order to print out the answer, otherwise I couldn’t see if it was working! This also required changing R1 to R0 so that the sum is in R0 when we call the trap routine, and adjusting the offset in the LEA instruction.⁵

When I tried running this program, it crashed the VM! This is due to the instruction ADD R4, R4, x-1 which decrements R4 by adding -1 to it. R4 is the counter for how many array elements we have left to process, so initially it holds 10, and when we get to that instruction for the first time, we decrement it to 9. But if you look at the implementation of the ADD instruction that I wrote above, we are actually doing an unsigned addition of the lowest 5 bits of the instruction, sign-extended to 16 bits, so we are not literally decrementing it. We are adding 0xffff to 0x000a and expecting it to wrap around to 0x0009 like it does in C. But integer arithmetic doesn’t wrap in Rust!

Unless you specifically tell it to, that is. So we could use u16::wrapping_add() instead of the + operator to do the addition. But I got what I thought was a better idea, to use std::num::Wrapping! I rewrote the definition of VM at the top of the file:

type Word = Wrapping<u16>;

struct VM {
    mem: [Word; MEM_SIZE],
    reg: [Word; NREGS],
    running: bool,
}

This did require adding Wrapping() around some integer literals and adding .0 to unwrap to the bare u16 in some places, but on the whole it made the code more concise and readable. As an added bonus, this way, we are using the type system to express that the LC-3 processor does wrapping unsigned arithmetic. (I do wish that there were a nicer way to express literals of the Wrapping type though.)

And with that, the second example program works. It outputs 16, as expected. In a following post I’ll go on to explain some of the other test programs that I wrote.

Other niceties

At this point I decided to do some refactoring to make the Rust code more readable and hopefully more idiomatic as well. Inspired by the type alias for Word, I added several more, one for addresses and one for instructions, as well as a function to convert a word to an address:

type Addr = usize;
type Inst = u16;
type Reg = u16;
type Flag = Word;
type TrapVect = u8;

#[inline] fn adr(w: Word) -> Addr { w.0.into() }

Addr is convenient to alias to usize because that’s the type that we use to index the memory array. And Inst is convenient to alias to u16 because that’s what bitmatch works with.

In fact, using types in this way actually allowed me to catch two bugs in Andrei’s original C code, where the index of the register was used instead of the value contained in the register.

I also added a few convenience methods to VM for manipulating the program counter:

#[inline] fn pc(&self) -> Word { self.r(RPC) }
#[inline] fn jmp(&mut self, pc: Word) { self.rw(RPC, pc); }
#[inline] fn jrel(&mut self, offset: Word) { self.jmp(self.pc() + offset); }
#[inline] fn inc_pc(&mut self) { self.jrel(Wrapping(1)); }

With these, I could write a bunch of things to be more expressive and concise. For example, the start() method now looked like this:

pub fn start(&mut self, offset: u16) {
    self.running = true;
    self.jmp(PC_START + Wrapping(offset));
    while self.running {
        let i = self.ld(adr(self.pc())).0;
        self.inc_pc();
        self.exec(i);
    }
    Ok(())
}

I also added an iaddr() method to load an address indirectly from another address given as an offset relative to the program counter, to simplify the implementation of the LDI and STI instructions.

While I was at it, I noticed that the uf() (update flags) method always followed a store into a destination register, and I decided to rewrite it as one method dst(), which stores a value into a destination register and updates the flags register based on that value:

#[inline]
fn dst(&mut self, r: Reg, val: Word) {
    self.rw(r, val);
    self.rw(
        RCND,
        match val.0 {
            0 => FZ,
            1..=0x7fff => FP,
            0x8000..=0xffff => FN,
        },
    );
}

At this point, the VM’s main loop looked just about as concise and simple as the original C code did! The original:

static inline void br(uint16_t i)   { if (reg[RCND] & FCND(i)) { reg[RPC] += POFF9(i); } }
static inline void add(uint16_t i)  { reg[DR(i)] = reg[SR1(i)] + (FIMM(i) ? SEXTIMM(i) : reg[SR2(i)]); uf(DR(i)); }
static inline void ld(uint16_t i)   { reg[DR(i)] = mr(reg[RPC] + POFF9(i)); uf(DR(i)); }
static inline void st(uint16_t i)   { mw(reg[RPC] + POFF9(i), reg[DR(i)]); }
static inline void jsr(uint16_t i)  { reg[R7] = reg[RPC]; reg[RPC] = (FL(i)) ? reg[RPC] + POFF11(i) : reg[BR(i)]; }
static inline void and(uint16_t i)  { reg[DR(i)] = reg[SR1(i)] & (FIMM(i) ? SEXTIMM(i) : reg[SR2(i)]); uf(DR(i)); }
static inline void ldr(uint16_t i)  { reg[DR(i)] = mr(reg[SR1(i)] + POFF(i)); uf(DR(i)); }
static inline void str(uint16_t i)  { mw(reg[SR1(i)] + POFF(i), reg[DR(i)]); }
static inline void res(uint16_t i) {} // unused
static inline void not(uint16_t i)  { reg[DR(i)]=~reg[SR1(i)]; uf(DR(i)); }
static inline void ldi(uint16_t i)  { reg[DR(i)] = mr(mr(reg[RPC]+POFF9(i))); uf(DR(i)); }
static inline void sti(uint16_t i)  { mw(mr(reg[RPC] + POFF9(i)), reg[DR(i)]); }
static inline void jmp(uint16_t i)  { reg[RPC] = reg[BR(i)]; }
static inline void rti(uint16_t i) {} // unused
static inline void lea(uint16_t i)  { reg[DR(i)] =reg[RPC] + POFF9(i); uf(DR(i)); }
static inline void trap(uint16_t i) { trp_ex[TRP(i)-trp_offset](); }

My implementation:

#[bitmatch]
match i {
    "0000fffooooooooo" /* BR   */ => if (self.r(RCND).0 & f) != 0 { self.jrel(sext(o, 9)); },
    "0001dddsss0??aaa" /* ADD1 */ => self.dst(d, self.r(s) + self.r(a)),
    "0001dddsss1mmmmm" /* ADD2 */ => self.dst(d, self.r(s) + sext(m, 5)),
    "0010dddooooooooo" /* LD   */ => self.dst(d, self.ld(adr(self.pc() + sext(o, 9)))),
    "0011sssooooooooo" /* ST   */ => self.st(adr(self.pc() + sext(o, 9)), self.r(s)),
    "01000??bbb??????" /* JSRR */ => { self.rw(R7, self.pc()); self.jmp(self.r(b)); }
    "01001ooooooooooo" /* JSR  */ => { self.rw(R7, self.pc()); self.jrel(sext(o, 11)); }
    "0101dddsss0??aaa" /* AND1 */ => self.dst(d, self.r(s) & self.r(a)),
    "0101dddsss1mmmmm" /* AND2 */ => self.dst(d, self.r(s) & sext(m, 5)),
    "0110dddbbboooooo" /* LDR  */ => self.dst(d, self.ld(adr(self.r(b) + sext(o, 6)))),
    "0111sssbbboooooo" /* STR  */ => self.st(adr(self.r(b) + sext(o, 6)), self.r(s)),
    "1000????????????" /* n/a  */ => self.crash(&format!("Illegal instruction {:#04x}", i)),
    "1001dddsss??????" /* NOT  */ => self.dst(d, !self.r(s)),
    "1010dddooooooooo" /* LDI  */ => self.dst(d, self.ld(self.iaddr(o))),
    "1011sssooooooooo" /* STI  */ => self.st(self.iaddr(o), self.r(s)),
    "1100???bbb??????" /* JMP  */ => self.jmp(self.r(b)),
    "1101????????????" /* RTI  */ => self.crash("RTI not available in user mode"),
    "1110dddooooooooo" /* LEA  */ => self.dst(d, self.pc() + sext(o, 9)),
    "1111????tttttttt" /* TRAP */ => self.trap(t as u8),
}

Of course, you could legitimately complain that both are horrible soups of one- and two-letter identifiers.⁶ But I think in both of them, if you have the abbreviations close to hand (and you do, since the program is so small!) it’s actually easier to follow because everything fits well within one vertical screenful of text. The Rust version has the added bonus of the bitmatch patterns being very visual, and the reader not having to think about bit shifting in their head.

Here’s the key for abbreviations:

i — Instruction
r — Register
d — Destination register (“DR” in the LC-3 specification)
s — Source register (“SR1”)
a — Additional source register (“SR2”)
b — Base register (“BASER”)
m — iMmediate value
o — Offset (6, 9, or 11 bits)
f — Flags
t — Trap vector⁷
pc — Program Counter
st — STore in memory
ld — LoaD from memory
rw — Register Write
adr — convert machine word to memory ADdRess
jmp — JuMP
dst — Destination register STore and update flags
jrel — Jump RELative to the PC
sext — Sign EXTend
iaddr — load Indirect ADDRess

At this point the only thing left to do, to get the program to the equivalent level of functionality as the one in Andrei’s blog post, was to implement the rest of the trap routines.

Two of the trap routines involve waiting for a key press. This is actually surprisingly difficult in Rust, as far as I can tell, and definitely not as straightforward as the reg[R0] = getchar(); which you can do in C. You can use the libc crate, but libc::getchar() is marked unsafe.⁸ Instead, I ended up pulling in another dependency, the console crate, and adding a term: console::Term member to VM. With that, I could implement a getc() method that reads a character, stores its ASCII code in the R0 register, and returns the character itself:

fn getc(&mut self) -> char {
    let ch = self.term.read_char().unwrap_or('\0');
    let res = Wrapping(if ch.is_ascii() { ch as u16 & 0xff } else { 0 });
    self.rw(R0, res);
    ch
}

This by itself was enough to implement the GETC trap routine (which waits for a key press and stores its ASCII code in the lower 8 bits of R0) and the IN trap routine (which does the same thing but first prints a prompt, and echoes the character back to stdout) was not much more complicated:

0x20 => {
    self.getc();
}
0x23 => {
    print!("> ");
    let ch = self.getc();
    print!("{}", ch);
}

Next I wrote the OUT trap routine, which prints the lower 8 bits of R0 as an ASCII character. I wrote an ascii() function that converts the lower 8 bits of a machine word into a char:

#[inline]
fn ascii(val: Word) -> char {
    char::from_u32(val.0 as u32 & 0xff).unwrap_or('?')
}

// In the TRAP match expression:
0x21 => print!("{}", ascii(self.r(R0))),

Now the two remaining traps were PUTS (print a zero-terminated string starting at the address in R0) and PUTSP (same, but the string is packed two bytes per machine word). These two routines are very similar in that they both access a variable-length area of memory, starting at the address in R0 and ending with the next memory location that contains zero. I found a nice solution that feels very Rust-like to me, a strz_words() method that returns an iterator over exactly this region of memory:

fn strz_words(&self) -> impl Iterator<Item = &Word> {
    self.mem[adr(self.r(R0))..].iter().take_while(|&v| v.0 != 0)
}

The two trap routines differ in what they do with the items coming out of this iterator. For PUTS we convert each machine word to a char with ascii():

0x22 => print!("{}", self.strz_words().map(|&v| ascii(v)).collect::<String>()),

(It’s too bad that we have a borrowed value in the closure, otherwise we could just do map(ascii). On the positive side, collect::<String>() is really nice.)

PUTSP is a bit more complicated. It’s a neat trick to use flat_map() to convert our iterator over machine words into a twice-as-long iterator over bytes. However, we still have to collect the bytes into an intermediate vector so we can check if the last byte is zero, because the string might have an odd number of bytes. In that case we’d still have a final zero byte which we have to pop off the end, because the iterator doesn’t finish until we get a whole memory location that is zero.

0x24 => {
    let mut bytes = self
        .strz_words()
        .flat_map(|&v| v.0.to_ne_bytes())
        .collect::<Vec<_>>();
    if bytes[bytes.len() - 1] == 0 {
        bytes.pop();
    };
    print!("{}", String::from_utf8_lossy(&bytes));
}

Conclusion

At this point, what I had was mostly equivalent to what you have if you follow along with Andrei’s blog post until the end, so I’ll end this post here. Unlike the C version, it is not under 125 lines long, but it does clock in at just under 200 lines.⁹

After I had gotten this far, I spent some time improving what I had, making the VM a bit fancier and writing some tools to use with it. I intend to make this article into a series, and I’ll cover these improvements in following posts, starting with an assembler.

You can find the code in a GitHub repo. I have kept this first version apart, in an examples/first.rs file, but you can browse the other files in that repository if you want a sneak preview of some of the other things I’ll write about.

Many thanks to Federico Mena Quintero who gave some feedback on a draft of this post.

[1] In the intervening time, I didn’t write any Rust code at all ↩

[2] “LC” stands for “Little Computer”. It’s a computer architecture described in a popular textbook, and as such shows up in many university courses on low-level programming. I hadn’t heard of it before, but after all, I didn’t study computer science in university ↩

[3] I won’t lie, mostly by googling terms like “lc3 division” on the lazyweb ↩

[4] This tool is actually really bad at finding functions unless they’re in particular locations that it expects, such as lib.rs, which is the real reason why I stuck the test function in a dummy lib.rs file ↩

[5] Note that the offset is calculated relative to the following instruction. I assume this is so that branching to an offset of 0 takes you to the next instruction instead of looping back to the current one ↩

[6] Especially since bitmatch makes you use one-letter variable names when unpacking ↩

[7] i.e., memory address of the trap routine. I have no idea why the LC-3 specification calls it a vector, since it is in fact a scalar ↩

[8] I’m not sure why. Federico tells me it is possibly because it doesn’t acquire a lock on stdin. ↩

[9] Not counting comments, and provided we cheat a bit by sticking a few trivial functions onto one line with #[rustfmt::skip] ↩

Advent of Rust 25: Baby Steps

Posted on December 28, 2020 by Philip Chimento

It’s the final post in the series chronicling my attempt to teach myself the Rust programming language by solving programming puzzles from Advent of Code 2020.

Day 25, Part 1

Today’s puzzle is about cracking an encryption key, in order to get at a piece of secret information (called loop size in the puzzle) by taking a piece of known public information (public key) and reversing the algorithm used to generate it. Of course, the algorithm (called transform subject number) is not easy to reverse, and that’s what the puzzle is about.

The puzzle description suggests guessing the loop size by trial and error which I am skeptical about, but this is the code that would do that by brute force:

fn transform_subject_number(subject_number: u64, loop_size: usize) -> u64 {
    let mut value = 1;
    for _ in 0..loop_size {
        value *= subject_number;
        value %= 20201227;
    }
    value
}

fn guess_loop_size(public_key: u64) -> usize {
    for loop_size in 1.. {
        if transform_subject_number(7, loop_size) == public_key {
            return loop_size;
        }
    }
    panic!("Not reachable");
}

#[derive(Debug)]
struct Party {
    loop_size: usize,
}

impl Party {
    fn public_key(&self) -> u64 {
        transform_subject_number(7, self.loop_size)
    }

    fn encryption_key(&self, other_public_key: u64) -> u64 {
        transform_subject_number(other_public_key, self.loop_size)
    }
}

fn main() {
    let card_public_key = 2084668;
    let door_public_key = 3704642;
    let card = Party {
        loop_size: guess_loop_size(card_public_key),
    };
    let door = Party {
        loop_size: guess_loop_size(door_public_key),
    };
    println!("{}", card.encryption_key(door.public_key()));
}

This is taking a long time. I’m guessing that this is not the way to do it.

I notice that, were it not for integer overflow, we’d be able to write the transform subject number result as S^L (mod 20201227) (where S is the subject number and L is the loop size.) So, the total of what we know is this:

P_c ≡ 7^L_c (mod 20201227)
P_d ≡ 7^L_d (mod 20201227)
P_c^L_d ≡ P_d^L_c (mod 20201227)

where P is the public key, and the subscript c or d indicates card or door. The symbol “≡” means “congruent with” although I had to look it up on Wikipedia. Since I’m not even using all of this information in the trial and error implementation, I’m not surprised that it isn’t working.

I’m sure this is a solved problem, just like the hexagons yesterday, so I start searching (although I’m not sure what to search for, I try things like “modulo inverse”) and eventually land on the Wikipedia article for Modular exponentiation, which explains “the task of finding the exponent […] when given [the base, modular exponentiation, and modulus] is believed to be difficult.” Thanks, I guess…

These pages contain a little too much math all at once for my holiday-addled brain, especially because they confusingly use the notation a⁻¹ to mean… something that’s not 1/a… so I decide to read a bit more on the topic hoping that something will sink in.

Eventually I read the “Modular arithmetic” section on the Wikipedia article for Discrete logarithm and a light begins to dawn. They give an example of how to calculate the possible solutions using Fermat’s Little Theorem.¹ However, this approach turns out not to be useful for me because it already requires knowing one possible solution, and that’s exactly what I don’t have.

I do some more searching and find the Baby-step giant-step algorithm. I would probably have skipped over this if I had just stumbled upon the Wikipedia article without any context (because I don’t know what a finite abelian group is) but I reached it via another site with a bit more explanation of the missing link connecting the problem at hand to the Wikipedia article.²

The problem is of the form a^k ≡ b (mod n), where we have to find k. For the example in the puzzle description, we can fill in: a = 7, b = 17807724, n = 20201227.

The first thing I do is replace transform_subject_number() with a more general pow_m() function using the algorithm that I already have, called the “memory-efficient algorithm” in the Wikipedia article, and check that the tests still pass:

fn pow_m(base: u64, exponent: usize, modulus: u64) -> u64 {
    if modulus == 1 {
        return 0;
    }
    let mut value = 1;
    for _ in 0..exponent {
        value *= base;
        value %= modulus;
    }
    value
}

fn transform_subject_number(subject_number: u64, loop_size: usize) -> u64 {
    pow_m(subject_number, loop_size, 20201227)
}

Then I rewrite pow_m() to use the faster “right-to-left binary” algorithm from the Wikipedia article, and again check that the tests still pass:

fn pow_m(base: u64, exponent: usize, modulus: u64) -> u64 {
    if modulus == 1 {
        return 0;
    }
    let mut value = 1;
    let mut mod_base = base % modulus;
    let mut mod_exponent = exponent;
    while mod_exponent > 0 {
        if mod_exponent % 2 == 1 {
            value *= mod_base;
            value %= modulus;
        }
        mod_exponent >>= 1;
        mod_base *= mod_base;
        mod_base %= modulus;
    }
    value
}

Next I rewrite guess_loop_size() to use the baby-steps giant-steps algorithm, as described by Wikipedia:

fn bsgs(base: u64, modulus: u64, result: u64) -> Option<usize> {
    let m = (modulus as f64).sqrt().ceil() as u64;
    let mut table = HashMap::new();
    let mut e = 1;
    for j in 0..m {
        table.insert(e, j);
        e *= base;
        e %= modulus;
    }
    let factor = pow_m(base, (modulus - m - 1) as usize, modulus);
    let mut gamma = result;
    for i in 0..m {
        if let Some(j) = table.get(&gamma) {
            return Some((i * m + j) as usize);
        }
        gamma *= factor;
        gamma %= modulus;
    }
    None
}

fn guess_loop_size(public_key: u64) -> usize {
    bsgs(7, 20201227, public_key).unwrap()
}

The tests still pass, so that means it correctly handles the example code. I run this, it finishes almost immediately, and I get the right answer.

Day 25, Part 2

I’m not going to spoil the very last puzzle! I solve it without writing any code though.

Afterword

I found this puzzle one of the most difficult ones, as I’m not really into cryptography, and it required quickly getting acquainted with an area of mathematics that I had never encountered before. I don’t even think it would have been possible for me to do it, if I hadn’t had enough experience in other areas of mathematics that I at least knew some search terms that brought me in the right direction.

It was difficult in a different way than the image-assembly puzzle on Day 20, though; that one was more like, you could easily see what needed to be done, but it was so tedious to get right. The hard part today was to find out what needed to be done, and once I had landed on the right Wikipedia page with an explanation of the algorithm, it was simple enough to implement. In a way it was similar to yesterday’s hexagons, but the topic of modular discrete logarithms was so much more difficult to absorb quickly than the topic of hexagons was.

Reflections on the Whole Thing

How do I feel about doing 50 puzzles in the past 28 days?
First of all, if I do this again, I’m not going to keep up this pace and I’m not going to blog about it. It was fun while it lasted, but it’s really quite time-consuming, and it’s not even over yet — Rust is open source, so I have practically obligated myself in January to do the right thing and follow through with submitting all the suggestions for improvement of the Rust tools that I can glean from the earlier posts in the series.

I enjoyed the VM-like puzzles the most, after that the parser ones,³ and after that the ones that could be solved with interesting uses of iterators and Itertools. The cryptography puzzles like today’s, I didn’t enjoy that much. I appreciated that there were so many Game of Life variations as an homage to Conway who passed away this year, but as puzzles they were not so interesting; after I wrote the code to solve one, I basically just copied it for the subsequent puzzles. Conway’s Game of Life is so mesmerizing when you can watch it evolving, but since it wasn’t necessary for solving the puzzle I didn’t really feel like spending time building visualizations.

I didn’t find the Reddit board very helpful unless I was actively looking for a hint from the hint threads, or looking for other people’s visualizations of the Game of Life puzzles. Reading all the posts from people who knew exactly the right algorithm to use, or solved the puzzle in mere minutes, or made fantastic visualizations, made me feel inadequate. Even though there were a few of these puzzles that I thought I did very well at, there were always people on the Reddit board who did better.⁴

Can I program in Rust now?
I’m not sure. Probably not. I know just enough that I should now go and read the Rust book, and I will get much more out of it than I would have if I had started out by reading the book.

Would I recommend learning Rust?
Certainly I found it very rewarding. It’s refreshing to see how they designed it to avoid or mitigate common classes of mistakes. Its tools and its standard library were a pleasure to use. That said, I would mainly recommend learning it if you are already experienced in another programming language. Put another way, I found learning Rust while knowing C++ as a reference point to be comparable to (what I imagine is the experience of) learning C++ while knowing JavaScript as a reference point. The things that you already know from the other language do serve you well, and give you a better footing in the new language, but there are also many new concepts that have no equivalent in the other language and you just don’t think about them explicitly. For C++, an example would be pointers, and for Rust an example would be borrows. Speaking for myself at least, I wouldn’t want to be learning borrowing at the same time as I was learning for-loops. Not to mention I’ve finished this whole series while still only having a vague idea of what a lifetime is!

Is the hype justified?
Although based on limited experience, at this point I believe the advice of using Rust for new projects where you would otherwise be using C or C++ is sound! I’m excited to use Rust for a project in the future. On the other hand, I don’t think Rust quite lives up to the hype of being absolutely safe because I was able to easily write code in almost all of these puzzles that would abort when given any sort of unexpected input, but it is at least true that those cases might instead be buffer overflows in C.

Would I recommend learning programming by doing these puzzles?
I’ve seen people recommend this while reading about Advent of Code this month, but honestly? I wouldn’t. If you are learning programming, start with something that’s straightforward and not tricky at all. Programming is tricky enough already by itself.

To conclude this series, I will leave you with a fast-loading version of the Rust error types diagram that Federico sent me after reading my early posts, rendered from DOT and accessible without Google Drive. This diagram was really helpful for me along the way, big thanks to whoever first made it, so I’m glad I can re-share it in a hopefully improved format.

Happy New Year, let’s hope for some better times in 2021!

[1] Yes, that really is the name; is there also Fermat’s Big Theorem? ↩

[2] I hesitate to link there because they have big banners saying “Get FAANG Ready With [our site]” which promotes a software engineering industry culture that I don’t support ↩

[3] Hopefully that means I’m in the right job ↩

[4] Or claimed to, in order to make others feel inadequate⁵ ↩

[5] No, I don’t have a very high opinion of the quality of discourse on Reddit, why do you ask? ↩

Advent of Rust 24: A Hexagonal Tribute to Conway

Posted on December 27, 2020 by Philip Chimento

Today in the penultimate post from the chronicle of teaching myself the Rust programming language by doing programming puzzles from Advent of Code 2020: a hexagonal grid, and another homage to Conway, this time unexpected.

But before that, I will finally solve that sea monster puzzle from Day 20.

Day 20, Part 2 (Yet Again)

First, as promised in the previous post, I will show the refactored code that assembles the full image.

struct Solver {
    tiles: Vec<Tile>,
    connections: MultiMap<u64, u64>,
    corners: [u64; 4],
    used_tile_ids: HashSet<u64>,
}

impl Solver {
    fn new(tiles: &[Tile]) -> Self {
        let mut connections = MultiMap::new();
        for (tile1, tile2) in tiles.iter().tuple_combinations() {
            if tile1.connection_side(tile2).is_some() {
                connections.insert(tile1.id, tile2.id);
                connections.insert(tile2.id, tile1.id);
            }
        }
        let corners: Vec<_> = tiles
            .iter()
            .map(|tile| tile.id)
            .filter(|id| match connections.get_vec(id).unwrap().len() {
                2 => true,
                3 | 4 => false,
                _ => panic!("Impossible"),
            })
            .collect();
        Self {
            tiles: tiles.to_vec(),
            connections,
            corners: corners.try_into().unwrap(),
            used_tile_ids: HashSet::new(),
        }
    }

    fn find_and_orient_tile(&mut self, tile: &Tile, direction: Direction) -> Option<Tile> {
        let tile_connections = self.connections.get_vec(&tile.id).unwrap();
        let maybe_next_tile = self
            .tiles
            .iter()
            .filter(|t| tile_connections.contains(&t.id) && !self.used_tile_ids.contains(&t.id))
            .find_map(|candidate| tile.match_other(candidate, direction));
        if let Some(t) = &maybe_next_tile {
            self.used_tile_ids.insert(t.id);
        }
        maybe_next_tile
    }

    fn arrange(&mut self) -> Array2<u8> {
        // Find top left corner - pick an arbitrary corner tile and rotate it until
        // it has connections on the right and bottom
        let mut tl_corner = self
            .tiles
            .iter()
            .find(|tile| self.corners.contains(&tile.id))
            .unwrap()
            .clone();
        self.used_tile_ids.insert(tl_corner.id);
        let mut tl_corner_connections = vec![];
        for possible_edge in &self.tiles {
            match tl_corner.connection_side(&possible_edge) {
                None => continue,
                Some(dir) => tl_corner_connections.push(dir),
            }
        }
        tl_corner = tl_corner.rot90(match (tl_corner_connections[0], tl_corner_connections[1]) {
            (Direction::RIGHT, Direction::BOTTOM) | (Direction::BOTTOM, Direction::RIGHT) => 0,
            (Direction::LEFT, Direction::BOTTOM) | (Direction::BOTTOM, Direction::LEFT) => 1,
            (Direction::LEFT, Direction::TOP) | (Direction::TOP, Direction::LEFT) => 2,
            (Direction::RIGHT, Direction::TOP) | (Direction::TOP, Direction::RIGHT) => 3,
            _ => panic!("Impossible"),
        });

        // Build the top edge
        let mut t_row = vec![tl_corner];
        loop {
            match self.find_and_orient_tile(&&t_row[t_row.len() - 1], Direction::RIGHT) {
                None => break,
                Some(tile) => {
                    t_row.push(tile);
                }
            }
        }

        let ncols = t_row.len();
        let nrows = self.tiles.len() / ncols;

        println!("whole image is {}×{}", ncols, nrows);

        // For each subsequent row...
        let mut rows = vec![t_row];
        for row in 1..nrows {
            // Arrange the tiles that connect to the ones in the row above
            rows.push(
                (0..ncols)
                    .map(|col| {
                        self.find_and_orient_tile(&rows[row - 1][col], Direction::BOTTOM)
                            .unwrap()
                    })
                    .collect(),
            );
        }

        // Concatenate all the image data together
        let all_rows: Vec<_> = rows
            .iter()
            .map(|row| {
                let row_images: Vec<_> = row.iter().map(|t| t.image.view()).collect();
                concatenate(Axis(1), &row_images).unwrap()
            })
            .collect();
        concatenate(
            Axis(0),
            &all_rows.iter().map(|row| row.view()).collect::<Vec<_>>(),
        )
        .unwrap()
    }
}

There are two main things that I changed here. First is that I noticed I was passing a lot of the same arguments (corners, edges, connections) to the methods that I had, so that was a code smell that indicated that these should be gathered together into a class, which I’ve called Solver.

The second insight was that I don’t actually need to keep track of which tiles are corners, edges, and middles; each tile can only connect in one way, so I only need to find the corners (both for the answer of Part 1, and to pick the top left corner to start assembling the image from.)

Now that I have the full image, I have to actually solve the Part 2 puzzle: cross-correlate the image with the sea monster matrix.

Unlike NumPy, Rust’s ndarray does not have any built-in facilities for cross-correlation, nor is it provided by any packages that I can find. So I will have to write code to do this, but because what I need is actually a very simple form of cross-correlation, I don’t think it will be so hard.

What I need to do is take a slice of the image at every position, of the size of the sea monster, except where the sea monster would extend outside the boundaries of the image. Then I multiply the slice by the sea monster, and sum all the elements in it, and if that sum is equal to the sum of the elements in the sea monster, then there is a sea monster located there.

I will need to do this operation on each of the eight orientations of the image (rotated four ways, and flipped) until I find one where sea monsters are present. Then to get the answer to the puzzle, I’ll have to subtract the number of sea monsters times the number of pixels in a sea monster, from the sum of the pixels in the image.

I write this code:

fn all_orientations(image: &Array2<u8>) -> [ArrayView2<u8>; 8] {
    [
        image.view(),
        image.view().reversed_axes(),
        image.slice(s![.., ..;-1]),
        image.slice(s![.., ..;-1]).reversed_axes(),
        image.slice(s![..;-1, ..]),
        image.slice(s![..;-1, ..]).reversed_axes(),
        image.slice(s![..;-1, ..;-1]),
        image.slice(s![..;-1, ..;-1]).reversed_axes(),
    ]
}

static SEA_MONSTER: [&str; 3] = [
    "                  # ",
    "#    ##    ##    ###",
    " #  #  #  #  #  #   ",
];

fn count_sea_monsters(image: &ArrayView2<u8>) -> (usize, usize) {
    let mon_rows = SEA_MONSTER.len();
    let mon_cols = SEA_MONSTER[0].len();
    let mut sea_monster = Array2::zeros((mon_rows, mon_cols));
    for (y, line) in SEA_MONSTER.iter().enumerate() {
        for (x, cell) in line.bytes().enumerate() {
            sea_monster[[y, x]] = (cell != b' ') as u8;
        }
    }
    let mon_pixels: u8 = sea_monster.iter().sum();

    let mut monsters = 0;
    let rows = image.nrows();
    let cols = image.ncols();
    for y in 0..(rows - mon_rows) {
        for x in 0..(cols - mon_cols) {
            let slice = image.slice(s![y..(y + mon_rows), x..(x + mon_cols)]);
            let correlation = &slice * &sea_monster.view();
            if correlation.iter().sum::<u8>() == mon_pixels {
                monsters += 1;
            }
        }
    }
    (monsters, monsters * mon_pixels as usize)
}

First I make sure it produces the right answer for the example data, then I add this to main():

let full_image = solver.arrange();
let (_, pixels) = all_orientations(&full_image)
    .iter()
    .find_map(|image| {
        let (count, pixels) = count_sea_monsters(image);
        if count != 0 { Some((count, pixels)) } else { None }
    })
    .unwrap();
println!("{}", full_image.iter().filter(|&&c| c > 0).count() - pixels);

Sadly, it doesn’t work. When trying to connect up the top left corner I get a panic because it is possible to connect it on three sides, not two! This is obviously a bug in my program (the tile wouldn’t have been in the list of corners if it had been able to connect on three sides!) I should investigate and fix this bug, but I am so done with this puzzle. In one last burst, I decide to paper over the bug by only trying the tiles that I already know should connect, replacing the tl_corner_connections code with the following:

let tl_corner_connections: Vec<_> = self
    .tiles
    .iter()
    .filter(|t| {
        self.connections.get_vec(&tl_corner.id).unwrap().contains(&t.id)
    })
    .map(|candidate| tl_corner.connection_side(&candidate))
    .filter(Option::is_some)
    .map(Option::unwrap)
    .collect();

Finally, finally, this gives me the correct answer, and I see the sea monster light up on the Advent of Code map. There is still a bug, but let us close this book and never speak of this code again.

Day 24, Part 1

Without that sea monster puzzle weighing on me, I’m happy to start the second-to-last puzzle. It involves a floor tiled with hexagonal tiles. The tiles have a white side and a black side, and can flip from one to the other. The puzzle involves starting from a center tile, and following directions (east, northeast, west, etc.) to get to another tile, which must be flipped. The answer to the puzzle is how many tiles are flipped after following all the directions.

So! I’ve never had to work with a hexagonal grid before, but so many games have one, it must be a solved problem. I google “represent hex grid in array” and land on a Stack Overflow question, which leads me to this brilliant page, “Hexagonal Grids” by Amit Patel. This is nothing short of a miracle, telling me everything I need to know in order to do things with hexagonal grids.

After reading that page and thinking about it for a while, I decide that I will use cube coordinates (x, y, z) and that I don’t even need to store a grid. I just need to store the destination coordinates that each instruction from the input takes me to. A tile is white at the end, if its coordinates are reached an even number of times (including 0), and a tile is black if its coordinates are reached an odd number of times.

I could store the destination coordinates in a hashmap from coordinate to count, but I wonder if there is a multiset similar to the multimap I used a few days ago. There is. With that, I can write the code for Part 1:

use multiset::HashMultiSet;

type Hex = (i32, i32, i32);

#[derive(Debug, PartialEq)]
enum Direction {
    EAST,
    SOUTHEAST,
    SOUTHWEST,
    WEST,
    NORTHWEST,
    NORTHEAST,
}

impl Direction {
    fn move_rel(&self, (x, y, z): Hex) -> Hex {
        use Direction::*;
        match self {
            EAST => (x + 1, y - 1, z),
            SOUTHEAST => (x, y - 1, z + 1),
            SOUTHWEST => (x - 1, y, z + 1),
            WEST => (x - 1, y + 1, z),
            NORTHWEST => (x, y + 1, z - 1),
            NORTHEAST => (x + 1, y, z - 1),
        }
    }
}

fn parse_line(text: &str) -> Vec<Direction> {
    use Direction::*;
    let mut iter = text.bytes();
    let mut retval = Vec::with_capacity(text.len() / 2);
    while let Some(b) = iter.next() {
        retval.push(match b {
            b'e' => EAST,
            b's' => match iter.next() {
                Some(b2) if b2 == b'e' => SOUTHEAST,
                Some(b2) if b2 == b'w' => SOUTHWEST,
                Some(b2) => panic!("bad direction s{}", b2),
                None => panic!("bad direction s"),
            },
            b'w' => WEST,
            b'n' => match iter.next() {
                Some(b2) if b2 == b'w' => NORTHWEST,
                Some(b2) if b2 == b'e' => NORTHEAST,
                Some(b2) => panic!("bad direction n{}", b2),
                None => panic!("bad direction n"),
            },
            _ => panic!("bad direction {}", b),
        });
    }
    retval
}

fn main() {
    let input = include_str!("input");
    let destination_counts: HashMultiSet<_> = input
        .lines()
        .map(|line| {
            parse_line(line)
                .iter()
                .fold((0, 0, 0), |hex, dir| dir.move_rel(hex))
        })
        .collect();
    let count = destination_counts
        .distinct_elements()
        .filter(|destination| destination_counts.count_of(destination) % 2 == 1)
        .count();
    println!("{}", count);
}

Day 24, Part 2

In a surprise twist, the second part of the puzzle is yet another Conway’s Game of Life, this time on the hex grid! So no more storing the coordinates of the flipped tiles in a multiset. I will need to have some sort of array to store the grid, and calculate the number of occupied neighbour cells, as we have done on several previous puzzles.

The Hexagonal Grids page comes through once again. Isn’t this great, that I knew nothing about hexagonal grids before encountering this puzzle, and there’s just a page on the internet that explains all of it well enough that I can use it to solve this puzzle! I will use axial coordinates (meaning, just discard one of the cube coordinates) and store the grid in a rectangular ndarray. The only question is how big the array has to be.

I decide, as in Day 17, that a good upper bound is probably the size of the starting pattern, plus the number of iterations of the game, extended in each direction. So, for the example pattern in the puzzle description, the highest number in any of the three coordinates is 3 (and −3), and the number of iterations is 100, so we’d want a grid of 103×103.

The hex page recommends encapsulating access to the hex grid in a class, so that’s exactly what I do:

struct Map {
    map: Array2<i8>,
    ref_q: i32,
    ref_r: i32,
}

impl Map {
    fn from_counts(counts: &HashMultiSet<Hex>) -> Self {
        let initial_extent = counts.distinct_elements().fold(0, |acc, (x, y, z)| {
            acc.max(x.abs()).max(y.abs()).max(z.abs())
        });
        let extent = initial_extent + 100; // n_iterations = 100
        let size = extent as usize;
        let map = Array2::zeros((2 * size + 1, 2 * size + 1));
        let mut this = Self {
            map,
            ref_q: extent,
            ref_r: extent,
        };
        for &(x, y, _) in counts
            .distinct_elements()
            .filter(|dest| counts.count_of(dest) % 2 == 1)
        {
            this.set(x, y);
        }
        this
    }

    fn set(&mut self, x: i32, y: i32) {
        let q = (x + self.ref_q) as usize;
        let r = (y + self.ref_r) as usize;
        self.map[[q, r]] = 1;
    }

    fn calc_neighbours(map: &Array2<i8>) -> Array2<i8> {
        let shape = map.shape();
        let width = shape[0] as isize;
        let height = shape[1] as isize;
        let mut neighbours = Array2::zeros(map.raw_dim());
        // Add slices of the occupied cells shifted one space in each hex
        // direction
        for &(xstart, ystart) in &[(1, 0), (0, 1), (-1, 1), (-1, 0), (0, -1), (1, -1)] {
            let xdest = xstart.max(0)..(width + xstart).min(width);
            let ydest = ystart.max(0)..(height + ystart).min(height);
            let xsource = (-xstart).max(0)..(width - xstart).min(width);
            let ysource = (-ystart).max(0)..(height - ystart).min(height);
            let mut slice = neighbours.slice_mut(s![xdest, ydest]);
            slice += &map.slice(s![xsource, ysource]);
        }
        neighbours
    }

    fn iterate(&mut self) {
        let neighbours = Map::calc_neighbours(&self.map);
        let removals = &neighbours.mapv(|count| (count == 0 || count > 2) as i8) * &self.map;
        let additions =
            &neighbours.mapv(|count| (count == 2) as i8) * &self.map.mapv(|cell| (cell == 0) as i8);
        self.map = &self.map + &additions - &removals;
    }

    fn count(&self) -> usize {
        self.map
            .fold(0, |acc, &cell| if cell > 0 { acc + 1 } else { acc })
    }
}

I store the map as a 2-dimensional ndarray, with coordinates (q, r) equal to (x, y) in the cube coordinate scheme (I just drop the z coordinate.) I store the offset of the center tile in (q_ref, r_ref).

I make a constructor that takes the multiset from Part 1 as input, and an iterate() method that calculates one iteration of the map and updates the class. The calc_neighbours() and iterate() code is practically copied from Day 11 except that we only shift the map in six directions instead of eight. (Which six directions those are, I get from the hex grids page.)

I’m not a big fan of acc.max(x.abs()).max(y.abs()).max(z.abs()) and wish I knew of a better way to do that.

I can then write the following code in main() which gives me the answer:

let mut map = Map::from_counts(&destination_counts);
for _ in 0..100 {
    map.iterate();
}
println!("{}", map.count());

Afterword

The Day 20 puzzle was a bit disappointing, since I spent so much more time on it than any of the other puzzles, and the solution wasn’t even particularly good. I’m not sure what made it so much more difficult, but I suspect that I just didn’t find the right data structure to read the data into.

Day 24, on the other hand, I completed easily with the help of a very informative web site. I suppose this puzzle was a bit of a gimmick, though; if you know how to deal with hexagonal grids then it’s very quick, and if you don’t, then it’s much more difficult. In my case, doing a search for the right thing made all the difference. If I had started out writing code without the knowledge that I had learned, I probably would have used offset coordinates, and, as you can read on the Hexagonal Grids page, that would mean that the directions are different depending on whether you’re in an odd or even-numbered column. The code would have been much more complicated!

This series will return for one final installment within the next few days.

Advent of Rust, Day 22 and 23: Profiling, Algorithms, and Data Structures

Posted on December 24, 2020 by Philip Chimento

It’s that time again, time for a new post in the chronicle of me teaching myself the Rust programming language by solving programming puzzles from Advent of Code 2020.

Day 22, Part 1

Today’s puzzle is about the card game of Combat¹, a two-player game with a numbered deck of cards. Each player takes a card off the top of the deck, and whoever has the highest card takes both. When one player has no cards left, the other player wins. The input to the puzzle is the cards in each deck, and the answer is the winning player’s score: the bottom card of their deck times 1, plus the next bottom-most card times 2, plus the next bottom-most card times 3, etc.

I read about VecDeque at the same time I read about Vec, on the very first day I started learning Rust with these puzzles, but I haven’t had an opportunity to use it yet. However, this seems like one. The program is quite straightforward:

use std::collections::VecDeque;

fn main() {
    let input = include_str!("input");
    let mut deck_blocks = input.split("\n\n");
    let mut deck1 = read_deck(deck_blocks.next().unwrap());
    let mut deck2 = read_deck(deck_blocks.next().unwrap());
    play_combat(&mut deck1, &mut deck2);
    println!(
        "{}",
        score_deck(if deck1.is_empty() { &deck2 } else { &deck1 })
    );
}

fn read_deck(block: &str) -> VecDeque<u16> {
    block.lines().skip(1).map(|s| s.parse().unwrap()).collect()
}

fn play_combat(deck1: &mut VecDeque<u16>, deck2: &mut VecDeque<u16>) {
    while !deck1.is_empty() && !deck2.is_empty() {
        let card1 = deck1.pop_front().unwrap();
        let card2 = deck2.pop_front().unwrap();
        if card1 > card2 {
            deck1.push_back(card1);
            deck1.push_back(card2);
        } else {
            deck2.push_back(card2);
            deck2.push_back(card1);
        }
    }
}

fn score_deck(deck: &VecDeque<u16>) -> u16 {
    deck.iter()
        .rev()
        .enumerate()
        .map(|(ix, val)| ((ix + 1) as u16) * val)
        .sum()
}

I’m also happy that I get all the & operators right this time. Sure, it’s a small program, but the achievement feels good.

Day 22, Part 2

Part 2 is a bit more complicated: we have to play Recursive Combat. Each player draws a card. If both players have enough cards in their deck, then they start a new sub-game of Recursive Combat with the top cards in their deck, as many as the number on the card they drew, and the winner of the sub-game takes both cards. If either player doesn’t have enough cards, then whoever had the higher card takes both cards. In the case of an infinite loop, Player 1 wins outright. The scoring rules are the same.

I’m again able to write this fairly straightforwardly. What I write at first has a few bugs, but once again writing inline tests based on the examples in the puzzle description helped me debug them. Here’s the code:

fn play_recursive_combat(deck1: &mut Deck, deck2: &mut Deck) -> bool {
    let mut played_rounds = HashSet::new();
    while !deck1.is_empty() && !deck2.is_empty() {
        let this_round = (deck1.clone(), deck2.clone());
        if played_rounds.contains(&this_round) {
            return true;
        }
        played_rounds.insert(this_round);
        let card1 = deck1.pop_front().unwrap();
        let card2 = deck2.pop_front().unwrap();
        let player1_wins = if deck1.len() >= card1 && deck2.len() >= card2 {
            let mut deck1_copy = deck1.clone();
            let mut deck2_copy = deck2.clone();
            deck1_copy.truncate(card1);
            deck2_copy.truncate(card2);
            play_recursive_combat(&mut deck1_copy, &mut deck2_copy)
        } else {
            card1 > card2
        };

        if player1_wins {
            deck1.push_back(card1);
            deck1.push_back(card2);
        } else {
            deck2.push_back(card2);
            deck2.push_back(card1);
        }
    }
    deck2.is_empty()
}

Day 23, Part 1

Today’s puzzle is simulating the outcome of yet another game. There are 10 cups, arranged in a circle, each labelled with a number, and one is the “current cup”. The three cups after the current cup are picked up, and inserted after the “destination cup”, which is the cup labelled with the current cup’s number minus 1. (If that cup is one of the ones picked up, then the destination cup is the current cup minus 2, and so on.) Then the current cup shifts to the next in the circle. The answer to the puzzle is the order of the cups after 100 of these moves, starting from the cup labelled 1.

I decide that since the cups are in a circle, the starting point doesn’t really matter, and I can make it so that the current cup is always the starting point. That way I don’t have to keep track of the index of the current cup, and I can use a VecDeque again to pop it off the front and push it onto the back, and the process of making a move becomes simpler.

The code that I write is fairly straightforward and goes without much incident:

use std::collections::VecDeque;

fn dec_nonnegative_mod(num: i8, n_cups: i8) -> i8 {
    (num + n_cups - 2) % n_cups + 1
}

fn do_move(cups: &mut VecDeque<i8>) {
    let n_cups = cups.len() as i8;
    let current_cup = cups.pop_front().unwrap();
    let mut move_cups: VecDeque<_> = (0..3).map(|_| cups.pop_front().unwrap()).collect();
    let mut destination_cup = dec_nonnegative_mod(current_cup, n_cups);
    while move_cups.contains(&destination_cup) {
        destination_cup = dec_nonnegative_mod(destination_cup, n_cups);
    }
    let insert_index = cups.iter().position(|&cup| cup == destination_cup).unwrap() + 1;
    (0..3).for_each(|n| cups.insert(insert_index + n, move_cups.pop_front().unwrap()));
    cups.push_back(current_cup);
}

fn main() {
    let input = "253149867";
    let mut cups: VecDeque<i8> = input
        .chars()
        .map(|c| c.to_digit(10).unwrap() as i8)
        .collect();
    let n_moves = 100;
    for _ in 0..n_moves {
        do_move(&mut cups);
    }
    while cups[0] != 1 {
        cups.rotate_left(1);
    }
    cups.pop_front();
    println!(
        "{}",
        cups.iter()
            .map(|n| n.to_string())
            .collect::<Vec<_>>()
            .join("")
    );
}

Day 23, Part 2

For part 2, we have 1 million cups (after the initial ordering, the rest is padded out with the numbers 10 through 999999 in order) and 10 million moves. The answer is the product of the two cups that end up after cup 1.

First I change the data type of the cups from i8 to i64. Then I make the changes in main() to reflect the updated rules of the game:

fn main() {
    let input = "253149867";
    let n_cups = if is_part2() { 1_000_000 } else { 9 };
    let mut cups: VecDeque<i64> = input
        .chars()
        .map(|c| c.to_digit(10).unwrap() as i64)
        .chain(10..(n_cups + 1))
        .collect();
    let n_moves = if is_part2() { 10_000_000 } else { 100 };
    for _ in 0..n_moves {
        do_move(&mut cups);
    }
    while cups[0] != 1 {
        cups.rotate_left(1);
    }
    cups.pop_front();
    if is_part2() {
        println!("{}", cups[0] * cups[1]);
    } else {
        println!(
            "{}",
            cups.iter()
                .map(|n| n.to_string())
                .collect::<Vec<_>>()
                .join("")
        );
    }
}

I run it and it seems to be taking a long time. Now I’d like to know just how long it will take so that I can decide whether it’s worth it to try to speed it up or just let it brute-force itself to the end. I add a progress bar with this cool library that I’ve heard about, and I’m pleased that it only takes a few lines:

let progress = indicatif::ProgressBar::new(n_moves);
progress.set_style(
    indicatif::ProgressStyle::default_bar().template("[{eta_precise}] {wide_bar} {pos}/{len}"),
);
// ...
for /* ... */ {
    // ...
    progress.inc(1);
}
// ...
progress.finish_and_clear();

By looking at the ETA, I suspect it will take well over 3 days to finish all 10 million moves!

My guess is that it’s taking so long because of inserting three elements into the vector, which has to make a copy of all the elements that come after the insertion. I am fairly bad at data structures, but I suspect that this might be one of the rare cases where it’s advantageous to use a linked list, since that has quick insertion in the middle. However, before I rewrite the whole thing I change the wasteful 3× insert, to split the vector at the insertion point, and append the three moved cups before sticking the second half of the vector back on:

let insert_index = cups.iter().position(|&cup| cup == destination_cup).unwrap() + 1;
let mut back = cups.split_off(insert_index);
cups.append(&mut move_cups);
cups.append(&mut back);
cups.push_back(current_cup);

This shaves five hours off the ETA, but it’s not enough.

I also move the determination of n_cups out of the loop, and instead provide it as a parameter of the do_move() function, but it doesn’t really affect the running time either.

Rust’s standard library does include a linked list, but it’s a doubly linked list and doesn’t have APIs for detaching and inserting whole ranges, as far as I can tell.

I wonder if, before resorting to rewriting the program with a different linked list data structure from a package, I should profile the existing program to find out what is taking so long. I suspect the insert, but it could be something else, like searching the vector for the destination cup.

I google “rust profiling tools” and land first on cargo-profiler. It looks really easy to install and use, but unfortunately it crashes if you have a digit in the path of your program:

$ cargo profiler callgrind -- 2

Compiling puzzle23 in debug mode...

Profiling puzzle23 with callgrind...
error: Regex error -- please file a bug. In bug report, please include the original output file from profiler, e.g. from valgrind --tool=cachegrind --cachegrind-out-file=cachegrind.txt

This bug is in a giant-ass regular expression that doesn’t give me a lot of confidence in the robustness this tool. On to the next one.

The cpuprofiler package has pretty decent getting-started documentation. It requires gperftools, so I first install my Linux distribution’s gperftools package, then add this preamble to my program:

extern crate cpuprofiler;
use cpuprofiler::PROFILER;

I then sandwich the main loop between the following two lines:

PROFILER.lock().unwrap().start("./23.profile").unwrap();
PROFILER.lock().unwrap().stop().unwrap();

I change the number of moves to 1000 and run the program. A 23.profile file appears in my project directory! According to the documentation, I can then use the pprof tool to check which lines in do_move() are hot (output slightly truncated):

$ pprof --list=do_move ./target/debug/puzzle23 23.profile
     0   3261 Total samples (flat / cumulative)
     .      .   12: fn do_move(cups: &mut VecDeque<i64>, n_cups: i64) {
     .      .   13:     let current_cup = cups.pop_front().unwrap();
     .      .   14:     let mut move_cups: VecDeque<_> = (0..3).map(|_| cups.pop_front().unwrap()).collect();
     .      .   15:     let mut destination_cup = dec_nonnegative_mod(current_cup, n_cups);
     .      1   16:     while move_cups.contains(&destination_cup) {
     .      .   17:         destination_cup = dec_nonnegative_mod(destination_cup, n_cups);
     .      .   18:     }
     .   3257   19:     let insert_index = cups.iter().position(|&cup| cup == destination_cup).unwrap() + 1;
     .      1   20:     let mut back = cups.split_off(insert_index);
     .      .   21:     cups.append(&mut move_cups);
     .      1   22:     cups.append(&mut back);
     .      .   23:     cups.push_back(current_cup);
     .      1   24: }

So I was wrong. It’s not the insert that is causing the problem at all, and I probably don’t need to use a linked list! The problem is searching the vector for the destination cup.

Interestingly, rewriting that loop not to use position(), cuts the expected running time almost in half, to a little over two days:

let mut ix = 0;
for &cup in cups.iter() {
    if cup == destination_cup {
        break;
    }
    ix += 1;
}
let insert_index = ix + 1;
let mut back = cups.split_off(insert_index);
cups.append(&mut move_cups);
cups.append(&mut back);
cups.push_back(current_cup);

However, either of (a) using enumerate() or (b) writing the loop without an iterator at all (for ix in 0.. and indexing cups[ix]) increases the expected running time back up to the same neighbourhood as using position(). I’m a bit surprised at that, but I file it away and move on.

Although I’m pleasantly surprised that profiling a program is so easy with Rust and Cargo, I’m starting to think that profiling and optimizing this code is a red herring, because if I keep this algorithm, then no matter what I’ll still have to search through the vector to find the destination cup, and I’m not sure it’s possible to do that any faster. I can see two possibilities from here: either there a different algorithm that will run faster; or there is a mathematical trick that can be used to figure out the answer in an easier way, like on Day 13.

I try changing the number of cups to 20 and the number of moves to 200, and printing out the order of cups every move, to see if I can see any patterns, but after staring at it for a while, I still don’t see anything.

I am getting a bit sick of this puzzle since it has taken a much larger chunk of my day than I actually wanted to spend on it, and I’m frustrated that I haven’t solved Day 20 yet. I decide to check the Reddit thread for this puzzle, where I hope to get a hint without spoiling myself entirely.

The first hint that I find is to use an array where the value at a given index is the next cup after the cup labelled with that index. This is actually so helpful that in the end I think I may have spoiled myself more than I wanted to. When I write the code, I realize two things:

This is actually a sort of linked list in disguise, so actually my idea earlier was on the right track.
By storing the linked list elements contiguously in an array, you can eliminate the search for the destination cup altogether.

Since with this scheme, we do need to store the current cup as part of the state, I decide to make a struct. I also change the data type of the cups once more, to usize, since the cup numbers are going to be used as array indices.

I run the program and this time the progress bar suggests it will be done in 21 minutes. That’s still a long time, but I decide I will just wait. If I get the wrong answer at the end, then I’ll spend time to profile it and make it faster. But when it finishes, the website tells me the answer is correct, so I move on.

Here’s the code, with the rewritten main() function for both of Parts 1 and 2:

fn dec_nonnegative_mod(num: usize, n_cups: usize) -> usize {
    (num + n_cups - 2) % n_cups + 1
}

struct Links {
    n_cups: usize,
    current_cup: usize,
    links: Vec<usize>,
}

impl Links {
    fn from_list(cups: &[usize]) -> Self {
        let n_cups = cups.len();
        let mut links = vec![0; n_cups + 1];
        for (&this, &next) in cups.iter().tuple_windows() {
            links[this] = next;
        }
        links[cups[n_cups - 1]] = cups[0];
        Self {
            n_cups,
            current_cup: cups[0],
            links,
        }
    }

    fn next(&self, cup: usize) -> usize {
        self.links[cup]
    }

    fn do_move(&mut self) {
        let move1 = self.links[self.current_cup];
        let move2 = self.links[move1];
        let move3 = self.links[move2];
        let mut insert_after = dec_nonnegative_mod(self.current_cup, self.n_cups);
        while insert_after == move1 || insert_after == move2 || insert_after == move3 {
            insert_after = dec_nonnegative_mod(insert_after, self.n_cups);
        }
        let next_current_cup = self.links[move3];
        self.links[self.current_cup] = next_current_cup;
        let insert_before = self.links[insert_after];
        self.links[insert_after] = move1;
        self.links[move3] = insert_before;
        self.current_cup = next_current_cup;
    }
}

fn main() {
    let input = "253149867";
    let n_cups = if is_part2() { 1_000_000 } else { 9 };
    let cups: Vec<usize> = input
        .chars()
        .map(|c| c.to_digit(10).unwrap() as usize)
        .chain(10..(n_cups + 1))
        .collect();
    let mut links = Links::from_list(&cups);
    let n_moves = if is_part2() { 10_000_000 } else { 100 };
    let progress = indicatif::ProgressBar::new(n_moves);
    progress.set_style(
        indicatif::ProgressStyle::default_bar()
            .template("[{eta_precise} left] {wide_bar} {pos}/{len}"),
    );
    for _ in 0..n_moves {
        links.do_move();
        progress.inc(1);
    }
    progress.finish_and_clear();
    if is_part2() {
        let next = links.next(1);
        let next2 = links.next(next);
        println!("{}", next * next2);
    } else {
        let mut cup = links.next(1);
        let mut order = vec![];
        while cup != 1 {
            order.push(cup.to_string());
            cup = links.next(cup);
        }
        println!("{}", order.join(""));
    }
}

Day 20, Part 2 (Again)

Back to the difficult puzzle from Day 20! In the meantime, a plan is forming; the plan seems clunky enough that it makes me think I am probably missing a more elegant solution, but I have gotten tired of this puzzle by now and I want to be done with it! What I will try to do, is to separate the tiles into a group of four corner tiles with two connections each, a group of edge tiles with three connections each, and a group of center tiles with four connections each. By the number of tiles in each group I should be able to figure out how big the total picture is.

I can then pick one of the corner tiles, determine which two sides don’t connect to any other tiles, place it in the correct orientation in the top left corner, and then start connecting edge tiles to it. Once I have the edge all connected, then I should be able to place all the center tiles correctly.

Here’s what I have written at this point, which I think is overly long and clunky and could use some refactoring:

#[derive(Clone, Copy, Debug, PartialEq)]
enum Direction {
    TOP = 0,
    RIGHT = 1,
    BOTTOM = 2,
    LEFT = 3,
}

impl Direction {
    fn opposite(&self) -> Self {
        use Direction::*;
        match *self {
            TOP => BOTTOM,
            RIGHT => LEFT,
            BOTTOM => TOP,
            LEFT => RIGHT,
        }
    }

    // returns the number of times to call rot90() on @other, to make it point
    // the same way as @self
    fn difference(self, other: Self) -> usize {
        ((4 + (other as i8) - (self as i8)) % 4) as usize
    }

    fn all() -> [Direction; 4] {
        use Direction::*;
        [TOP, RIGHT, BOTTOM, LEFT]
    }
}

bitflags! {
    struct Directions: u8 {
        const NONE = 0;
        const TOP = 0b0001;
        const RIGHT = 0b0010;
        const BOTTOM = 0b0100;
        const LEFT = 0b1000;
    }
}

#[derive(Clone, Debug, PartialEq)]
struct Tile {
    id: u64,
    // top, right, bottom, left borders with bits counted from LSB to MSB in
    // clockwise direction
    borders: [u16; 4],
    border_bits: u8, // number of bits in border
    image: Array2<u8>,
}

impl Tile {
    fn new(id: u64, grid: &Array2<u8>) -> Self {
        let borders = [
            s![0, ..],
            s![.., grid.ncols() - 1],
            s![grid.nrows() - 1, ..;-1],
            s![..;-1, 0],
        ]
        .iter()
        .map(|&slice| to_bits(grid.slice(slice)))
        .collect::<Vec<u16>>()
        .try_into()
        .unwrap();
        let image = grid
            .slice(s![1..grid.nrows() - 1, 1..grid.ncols() - 1])
            .into_owned();
        Tile {
            id,
            borders,
            image,
            border_bits: grid.ncols() as u8,
        }
    }

    // rotate counterclockwise n times
    fn rot90(&self, n: usize) -> Self {
        let rotated_view = match n % 4 {
            0 => self.image.view(),
            1 => self.image.slice(s![.., ..;-1]).reversed_axes(),
            2 => self.image.slice(s![..;-1, ..;-1]),
            3 => self.image.slice(s![..;-1, ..]).reversed_axes(),
            _ => panic!("Impossible"),
        };
        let rotated_image = rotated_view.into_owned();
        let mut borders = self.borders.clone();
        borders.rotate_left(n);
        Tile {
            id: self.id,
            borders,
            image: rotated_image,
            border_bits: self.border_bits,
        }
    }

    fn fliplr(&self) -> Self {
        let flipped_image = self.image.slice(s![.., ..;-1]).into_owned();
        let mut borders: Vec<_> = self
            .borders
            .iter()
            .map(|&b| flip_bits(b, self.border_bits))
            .collect();
        borders.reverse();
        borders.rotate_right(1);
        Tile {
            id: self.id,
            borders: borders.try_into().unwrap(),
            image: flipped_image,
            border_bits: self.border_bits,
        }
    }

    fn flipud(&self) -> Self {
        let flipped_image = self.image.slice(s![..;-1, ..]).into_owned();
        let mut borders: Vec<_> = self
            .borders
            .iter()
            .map(|&b| flip_bits(b, self.border_bits))
            .collect();
        borders.reverse();
        borders.rotate_left(1);
        Tile {
            id: self.id,
            borders: borders.try_into().unwrap(),
            image: flipped_image,
            border_bits: self.border_bits,
        }
    }

    fn connection_side(&self, other: &Self) -> Option<Direction> {
        let mut borders2: HashSet<_> = other.borders.iter().cloned().collect();
        borders2.extend(
            other
                .borders
                .iter()
                .map(|&b| flip_bits(b, self.border_bits)),
        );
        for (&border, &direction) in self.borders.iter().zip(Direction::all().iter()) {
            if borders2.contains(&border) {
                return Some(direction);
            }
        }
        None
    }

    fn connects_in_direction(&self, direction: Direction, other: &Self) -> Option<Direction> {
        let border_to_connect = self.borders[direction as usize];
        if let Some((_, &other_side)) = other
            .borders
            .iter()
            .zip(Direction::all().iter())
            .find(|(&border, _)| border == flip_bits(border_to_connect, self.border_bits))
        {
            println!(
                "{}'s {:?} side connects to {}'s {:?} side",
                self.id, direction, other.id, other_side
            );
            Some(other_side)
        } else {
            None
        }
    }
}

fn flip_bits(border: u16, n_bits: u8) -> u16 {
    assert!(n_bits <= 16);
    border.swap_bits() >> (16 - n_bits)
}

fn to_bits(slice: ArrayView<u8, Ix1>) -> u16 {
    let mut retval = 0;
    for (ix, &cell) in slice.iter().enumerate() {
        if cell > 0 {
            retval |= 2_u16.pow(ix.try_into().unwrap());
        }
    }
    retval
}

fn network(tiles: &Vec<Tile>) -> HashMap<u64, HashSet<u64>> {
    let mut connections = HashMap::new();
    for (tile1, tile2) in tiles.iter().tuple_combinations() {
        if tile1.connection_side(tile2).is_some() {
            let links1 = connections.entry(tile1.id).or_insert(HashSet::new());
            links1.insert(tile2.id);
            let links2 = connections.entry(tile2.id).or_insert(HashSet::new());
            links2.insert(tile1.id);
        }
    }
    connections
}

fn categorize(
    tiles: Vec<Tile>,
    connections: &HashMap<u64, HashSet<u64>>,
) -> (Vec<Tile>, Vec<Tile>, Vec<Tile>) {
    let mut corners = vec![];
    let mut edges = vec![];
    let mut centers = vec![];
    for tile in tiles {
        match connections.get(&tile.id).unwrap().len() {
            2 => corners.push(tile),
            3 => edges.push(tile),
            4 => centers.push(tile),
            _ => panic!("Impossible"),
        }
    }
    (corners, edges, centers)
}

fn orient_tile_correctly(tile: &Tile, tile_to_fit: &Tile, direction: Direction) -> Option<Tile> {
    match tile.connects_in_direction(direction, tile_to_fit) {
        None => (),
        Some(dir) => {
            let rotations = direction.opposite().difference(dir);
            println!("rotating {} by {} ccw", tile_to_fit.id, rotations);
            return Some(tile_to_fit.clone().rot90(rotations));
        }
    }
    let flipped = tile_to_fit.fliplr();
    match tile.connects_in_direction(direction, &flipped) {
        None => (),
        Some(dir) => {
            let rotations = direction.opposite().difference(dir);
            println!("flipping {} and rotating by {} ccw", flipped.id, rotations);
            return Some(flipped.rot90(rotations));
        }
    }
    None
}

fn find_and_orient_tile(
    tile: &Tile,
    possible_tiles: &[Tile],
    direction: Direction,
    connections: &HashMap<u64, HashSet<u64>>,
    used_tile_ids: &mut HashSet<u64>,
) -> Option<Tile> {
    let tile_connections = connections.get(&tile.id).unwrap();
    let candidates = possible_tiles
        .iter()
        .filter(|t| tile_connections.contains(&t.id) && !used_tile_ids.contains(&t.id));
    for candidate in candidates {
        println!(
            "candidate for connecting to {} ({:?}) is {} ({:?})",
            tile.id, tile.borders, candidate.id, candidate.borders
        );
        let next_tile = orient_tile_correctly(tile, candidate, direction);
        if let Some(t) = &next_tile {
            used_tile_ids.insert(t.id);
            return next_tile;
        }
    }
    None
}

fn arrange(
    corners: &[Tile],
    edges: &[Tile],
    centers: &[Tile],
    connections: &HashMap<u64, HashSet<u64>>,
) -> Array2<u8> {
    assert_eq!(corners.len(), 4);

    let mut used_tile_ids = HashSet::new();

    // Find top left corner - pick an arbitrary corner tile and rotate it until
    // it has connections on the right and bottom
    let mut tl_corner = corners[0].clone();
    used_tile_ids.insert(tl_corner.id);
    let mut tl_corner_connections = Directions::NONE;
    for possible_edge in edges {
        match tl_corner.connection_side(&possible_edge) {
            None => continue,
            Some(dir) if dir == Direction::TOP => tl_corner_connections |= Directions::TOP,
            Some(dir) if dir == Direction::RIGHT => tl_corner_connections |= Directions::RIGHT,
            Some(dir) if dir == Direction::BOTTOM => tl_corner_connections |= Directions::BOTTOM,
            Some(dir) if dir == Direction::LEFT => tl_corner_connections |= Directions::LEFT,
            Some(_) => panic!("Impossible"),
        }
    }
    tl_corner = tl_corner.rot90(match tl_corner_connections {
        dir if dir == Directions::RIGHT | Directions::BOTTOM => 0,
        dir if dir == Directions::BOTTOM | Directions::LEFT => 1,
        dir if dir == Directions::LEFT | Directions::TOP => 2,
        dir if dir == Directions::TOP | Directions::RIGHT => 3,
        _ => panic!("Impossible"),
    });

    // Build the top edge
    let mut t_row = vec![tl_corner];
    let mut current_tile = &t_row[t_row.len() - 1];
    loop {
        match find_and_orient_tile(
            &current_tile,
            &edges,
            Direction::RIGHT,
            connections,
            &mut used_tile_ids,
        ) {
            None => break,
            Some(tile) => {
                t_row.push(tile);
                current_tile = &t_row[t_row.len() - 1];
            }
        }
    }
    let tr_corner = find_and_orient_tile(
        &current_tile,
        &corners,
        Direction::RIGHT,
        connections,
        &mut used_tile_ids,
    )
    .unwrap();

    t_row.push(tr_corner);

    let ncols = t_row.len();
    let nrows = (corners.len() + edges.len() + centers.len()) / ncols;

    println!("whole image is {}×{}", ncols, nrows);

    // For each subsequent row except the bottom one...
    let mut rows = vec![t_row];
    for row in 1..nrows - 1 {
        // Find the left edge of the row
        let left = find_and_orient_tile(
            &rows[row - 1][0],
            &edges,
            Direction::BOTTOM,
            connections,
            &mut used_tile_ids,
        )
        .unwrap();
        let mut current_row = vec![left];
        // Arrange the middle tiles
        for col in 1..ncols - 1 {
            let next_tile = find_and_orient_tile(
                &current_row[col - 1],
                &centers,
                Direction::RIGHT,
                connections,
                &mut used_tile_ids,
            )
            .unwrap();
            current_row.push(next_tile);
        }
        // Find the right edge of the row
        let right = find_and_orient_tile(
            &current_row[ncols - 2],
            &edges,
            Direction::RIGHT,
            connections,
            &mut used_tile_ids,
        )
        .unwrap();
        current_row.push(right);

        rows.push(current_row);
    }

    // Now the bottom left corner
    let bl_corner = find_and_orient_tile(
        &rows[nrows - 2][0],
        &corners,
        Direction::BOTTOM,
        connections,
        &mut used_tile_ids,
    )
    .unwrap();
    let mut b_row = vec![bl_corner];
    // Bottom edge
    for col in 1..ncols - 1 {
        b_row.push(
            find_and_orient_tile(
                &b_row[col - 1],
                &edges,
                Direction::RIGHT,
                connections,
                &mut used_tile_ids,
            )
            .unwrap(),
        );
    }
    // Last tile
    let br_corner = find_and_orient_tile(
        &b_row[ncols - 2],
        &corners,
        Direction::RIGHT,
        connections,
        &mut used_tile_ids,
    )
    .unwrap();
    b_row.push(br_corner);
    rows.push(b_row);

    // Stack all the image data together
    let all_rows: Vec<_> = rows
        .iter()
        .map(|row| {
            let row_images: Vec<_> = row.iter().map(|t| t.image.view()).collect();
            concatenate(Axis(1), &row_images).unwrap()
        })
        .collect();
    concatenate(
        Axis(0),
        &all_rows.iter().map(|row| row.view()).collect::<Vec<_>>(),
    )
    .unwrap()
}

This isn’t done yet, I haven’t gotten to the point of scanning for sea monsters either, so I’ll have to tack that on to one of the final posts. It took me quite a long time to get this far, because of two bugs that I got stuck on:

When deciding which borders connect to each other, the border on one tile actually has to equal the bit-reversed border on the other tile. I didn’t notice this in Part 1, but it didn’t affect the answer, so it was in sort of a blind spot.
I was looking at ndarray::stack() in an old version of the ndarray docs on docs.rs without realizing it. In more recent versions, the stack() function was renamed to concatenate(), and stack() now does something else, and I couldn’t figure out why my result was wrong until I saw the tiny “⚠ Go to latest version” label in the corner. It would be nice if docs.rs would redirect you to the latest version when coming from search engines!

Afterword

Three of the four puzzles in this post went very smoothly. The final one got me to the point where I had to look for a hint, and I think this is another of those places where better knowledge of computer science fundamentals such as algorithms and data structures would have helped me; but on the other hand maybe not, since it is a puzzle after all, not a textbook problem. At least, it is nice that I at least had an idea (linked list) that was in the right direction, I don’t think it would have been enough to get there without the hint of storing the elements contiguously. These puzzles are challenging, but I still have to say that I’ve rarely, possibly never, faced a situation in my professional career where I’ve been hampered by missing knowledge such as this!

At least I did learn something from trying to solve the problem and from the hint! I learned how to profile Rust programs, and I learned a new application of a data structure that I will hopefully remember in the future.

I will make two more posts in this series: one with Day 24 plus the finish of Day 20, and one with Day 25.

[1] Which, played with the usual deck of 52 cards in four suits, we knew as War when I was a kid ↩

Advent of Rust, Day 20 and 21: Stumped by Sea Monsters

Posted on December 22, 2020 by Philip Chimento

Welcome back to another two days’ worth of the ongoing chronicle of me trying to teach myself the Rust programming language by doing programming puzzles from Advent of Code 2020.

Day 20, Part 1

Unlike in the past puzzles, I have no idea how to tackle this problem, so I start just by reading in the data. I’ll create a struct Tile to hold the data. I don’t have to store the actual image data, just the borders, so I can compare them to the borders of the other tiles.

There are eight borders — one on each of the four sides, plus the tiles may also be flipped, so the same borders again but reversed. I’ll store each border as a u16 bit pattern for easy comparing, in an array of length 8.

I do start out with the vague idea that I should use tuple_combinations() for this. But anyway, I’ll read in the data first.

#[macro_use]
extern crate scan_fmt;

use ndarray::{s, Array2, ArrayView, Ix1};
use std::convert::TryInto;

struct Tile {
    id: u16,
    borders: [u16; 8],
}

impl Tile {
    fn new(id: u16, grid: &Array2<u8>) -> Self {
        let borders = [
            s![0, ..],
            s![0, ..;-1],
            s![grid.nrows() - 1, ..],
            s![grid.nrows() - 1, ..;-1],
            s![.., 0],
            s![..;-1, 0],
            s![.., grid.ncols() - 1],
            s![..;-1, grid.ncols() - 1],
        ]
        .iter()
        .map(|&slice| to_bits(grid.slice(slice)))
        .collect::<Vec<u16>>()
        .try_into()
        .unwrap();
        Tile { id, borders }
    }
}

fn to_bits(slice: ArrayView<u8, Ix1>) -> u16 {
    let mut retval = 0;
    for (ix, &cell) in slice.iter().enumerate() {
        if cell > 0 {
            retval |= 2_u16.pow(ix.try_into().unwrap());
        }
    }
    retval
}

fn read_grid(lines: Vec<&str>) -> Array2<u8> {
    let rows = lines.len();
    let cols = lines[0].len();
    let mut cells = Array2::zeros((rows, cols));
    for (y, line) in lines.iter().enumerate() {
        for (x, tile) in line.bytes().enumerate() {
            cells[[y, x]] = match tile {
                b'#' => 1,
                b'.' => 0,
                _ => panic!("Bad tile '{}'", tile),
            };
        }
    }
    cells
}

fn read_tile(input: &str) -> Tile {
    let mut lines = input.lines();
    let header = lines.next().unwrap();
    let id = scan_fmt!(header, "Tile {}:", u16).unwrap();
    let grid = read_grid(lines.collect::<Vec<&str>>());
    Tile::new(id, &grid)
}

fn read_input(input: &'static str) -> impl Iterator<Item = Tile> {
    input.split("\n\n").map(read_tile)
}

I write the slice things (s![0, ..;-1]) after reading a bit more of the documentation about slicing ndarrays. I think this notation is an improvement on the Conway’s Game of Life code that I wrote a few days ago.

Maybe if there is only ever one possibility for each edge to match another tile’s edge, we can iterate over tuple_combinations() and check that there are four tiles that have only two possible connections to other tiles?

I add a HashSet<u16> of connections to the Tile type, and add these two methods:

fn connect(&mut self, other: &mut Self) {
    let borders1: HashSet<_> = self.borders.iter().collect();
    let borders2: HashSet<_> = other.borders.iter().collect();
    let intersection: HashSet<_> = borders1.intersection(&borders2).collect();
    match intersection.len() {
        0 => (),
        1 => {
            self.connections.insert(other.id);
            other.connections.insert(self.id);
        }
        _ => panic!(
            "Tile {} and {} can connect in more than one way",
            self.id, other.id
        ),
    }
}

fn n_connections(&self) -> usize {
    self.connections.len()
}

I try this to set up the network of tiles:

fn network(tiles: &mut Vec<Tile>) {
    for (&mut tile1, &mut tile2) in tiles.iter_mut().tuple_combinations() {
        tile1.connect(&mut tile2);
    }
}

tuple_combinations() doesn’t work, for one thing:

error[E0277]: the trait bound `std::slice::IterMut<'_, Tile>: std::clone::Clone` is not satisfied
  --> puzzle20.rs:81:54
   |
81 |     for (&mut tile1, &mut tile2) in tiles.iter_mut().tuple_combinations() {
   |                                                      ^^^^^^^^^^^^^^^^^^ the trait `std::clone::Clone` is not implemented for `std::slice::IterMut<'_, Tile>`

error[E0277]: the trait bound `&mut Tile: std::clone::Clone` is not satisfied
  --> puzzle20.rs:81:54
   |
81 |     for (&mut tile1, &mut tile2) in tiles.iter_mut().tuple_combinations() {
   |                                                      ^^^^^^^^^^^^^^^^^^ the trait `std::clone::Clone` is not implemented for `&mut Tile`
   |
   = note: `std::clone::Clone` is implemented for `&Tile`, but not for `&mut Tile`

If I understand the error message correctly, this is because tuple_combinations() makes copies of the elements, and I don’t think I can copy the tiles — for one thing, I want the number of connections to be updated in the same copy of the tile, I don’t want different copies of the same tile with different connections!

I try indexing the array and iterating over tuple combinations of indices:

fn network(tiles: &mut Vec<Tile>) {
    let indices = 0..tiles.len();
    for (ix1, ix2) in indices.tuple_combinations() {
        tiles[ix1].connect(&mut tiles[ix2]);
    }
}

This also doesn’t work:

error[E0499]: cannot borrow `*tiles` as mutable more than once at a time
  --> puzzle20.rs:80:33
   |
80 |         tiles[ix1].connect(&mut tiles[ix2]);
   |         -----      -------      ^^^^^ second mutable borrow occurs here
   |         |          |
   |         |          first borrow later used by call
   |         first mutable borrow occurs here

This I don’t understand. I don’t want to borrow the whole array twice, I just want to modify elements in it.

Eventually, after trying a lot more things, I give up, and store the connections in a separate collection:

impl Tile {
    fn connects_to(&self, other: &Self) -> bool {
        let borders1: HashSet<_> = self.borders.iter().collect();
        let borders2: HashSet<_> = other.borders.iter().collect();
        let intersection: HashSet<_> = borders1.intersection(&borders2).collect();
        match intersection.len() {
            0 => false,
            1 => true,
            _ => panic!(
                "Tile {} and {} can connect in more than one way",
                self.id, other.id
            ),
        }
    }
}

fn network(tiles: Vec<Tile>) -> HashMap<u16, HashSet<u16>> {
    let mut connections = HashMap::new();
    for (tile1, tile2) in tiles.iter().tuple_combinations() {
        if tile1.connects_to(tile2) {
            let links1 = connections.entry(tile1.id).or_insert(HashSet::new());
            links1.insert(tile2.id);
            let links2 = connections.entry(tile2.id).or_insert(HashSet::new());
            links2.insert(tile1.id);
        }
    }
    connections
}

This works, but running it on the example input I get ‘Tile 2311 and 1951 can connect in more than one way’! Then I realize that yes, they could both be connected, but if you flip both of them, then they can still be connected. So we should expect that tiles connect in either 0 or 2 possible ways in connects_to().

For the array of connections in the example input, I get 4 tiles with 2 connections, 4 tiles with 3 connections, and one tile with 4 connections, which is consistent with the tiles being arranged in a 3×3 square.

Running it on the real input I get a panic, because a zero-length string is being passed to read_tile(). Sure enough, there is an extra newline at the end of the file. I wonder if instead of splitting on "\n\n" I can split on "\n{2,}", but that requires adding a dependency on the regex package which I’m not using anywhere else, so I add a .filter(|s| s.len() > 0) to read_input().

Finally I write the main() function which looks like this:

fn main() {
    let input = include_str!("input");
    let tiles = read_input(input);
    let connections = network(&tiles);

    let answer: u32 = connections
        .iter()
        .filter(|(_, links)| links.len() == 2)
        .map(|(id, _)| id)
        .product();
    println!("{}", answer);
}

I get an overflow in the call to product() so I change the type of the IDs to u64. This gives me the right answer.

Day 20, Part 2

Next we have to remove the borders of the tiles, and arrange them in the correct configuration, and then search for “sea monsters” that look like this:

                  #
#    ##    ##    ###
 #  #  #  #  #  #

From experience at a job a long time ago, I know how to search for the sea monster, by using cross-correlation, which I would be willing to bet that ndarray is capable of doing. Before I get to that part, however, I’m a bit at a loss for how to stitch the pictures together. I feel that I could do it, but without a good plan it would probably be very messy and take me a lot of time to finish.

I decide once again to start by storing the data that I will need, and then see if I get any ideas while doing that.

First of all, I will need to store the tile, as an ndarray::Array2, without the borders. This can be done by taking a slice and copying it into an owned array, in the constructor:

let image = grid
    .slice(s![1..grid.nrows() - 1, 1..grid.ncols() - 1])
    .into_owned();
Tile { id, borders, image }

Now I’m not sure what to do. I take a break and work on Day 21 instead. When I come back to it, I don’t really have any more ideas.

I decide that since the orientation of each tile is now important, I should do some refactors to Tile. I will add rot90(), fliplr(), and flipud() methods (named after the NumPy functions), and I won’t store eight borders in an arbitrary order, but instead four borders in the order top, right, bottom, and left. The rotating and flipping operations will manipulate the array of borders so that they are correct for the new orientation.

#[derive(Debug, PartialEq)]
struct Tile {
    id: u64,
    // top, right, bottom, left borders with bits counted from LSB to MSB in
    // clockwise direction
    borders: [u16; 4],
    border_bits: u8, // number of bits in border
    image: Array2<u8>,
}

impl Tile {
    fn new(id: u64, grid: &Array2<u8>) -> Self {
        let borders = [
            s![0, ..],
            s![.., grid.ncols() - 1],
            s![grid.nrows() - 1, ..;-1],
            s![..;-1, 0],
        ]
        .iter()
        .map(|&slice| to_bits(grid.slice(slice)))
        .collect::<Vec<u16>>()
        .try_into()
        .unwrap();
        let image = grid
            .slice(s![1..grid.nrows() - 1, 1..grid.ncols() - 1])
            .into_owned();
        Tile {
            id,
            borders,
            image,
            border_bits: grid.ncols() as u8,
        }
    }

    // rotate counterclockwise n times
    fn rot90(&self, n: usize) -> Self {
        let rotated_view = match n % 4 {
            0 => self.image.view(),
            1 => self.image.slice(s![.., ..;-1]).reversed_axes(),
            2 => self.image.slice(s![..;-1, ..;-1]),
            3 => self.image.slice(s![..;-1, ..]).reversed_axes(),
            _ => panic!("Impossible"),
        };
        let rotated_image = rotated_view.into_owned();
        let mut borders = self.borders.clone();
        borders.rotate_left(n);
        Tile {
            id: self.id,
            borders,
            image: rotated_image,
            border_bits: self.border_bits,
        }
    }

    fn fliplr(&self) -> Self {
        let flipped_image = self.image.slice(s![.., ..;-1]).into_owned();
        let mut borders: Vec<_> = self
            .borders
            .iter()
            .map(|&b| flip_bits(b, self.border_bits))
            .collect();
        borders.reverse();
        borders.rotate_right(1);
        Tile {
            id: self.id,
            borders: borders.try_into().unwrap(),
            image: flipped_image,
            border_bits: self.border_bits,
        }
    }

    fn flipud(&self) -> Self {
        let flipped_image = self.image.slice(s![..;-1, ..]).into_owned();
        let mut borders: Vec<_> = self
            .borders
            .iter()
            .map(|&b| flip_bits(b, self.border_bits))
            .collect();
        borders.reverse();
        borders.rotate_left(1);
        Tile {
            id: self.id,
            borders: borders.try_into().unwrap(),
            image: flipped_image,
            border_bits: self.border_bits,
        }
    }

    fn connects_to(&self, other: &Self) -> bool {
        let borders1: HashSet<_> = self.borders.iter().cloned().collect();
        let mut borders2: HashSet<_> = other.borders.iter().cloned().collect();
        borders2.extend(
            other
                .borders
                .iter()
                .map(|&b| flip_bits(b, self.border_bits)),
        );
        let intersection: HashSet<_> = borders1.intersection(&borders2).collect();
        match intersection.len() {
            0 => false,
            1 => true,
            _ => panic!(
                "Tile {} and {} can connect in more than one way",
                self.id, other.id
            ),
        }
    }
}

fn flip_bits(border: u16, n_bits: u8) -> u16 {
    assert!(n_bits <= 16);
    border.swap_bits() >> (16 - n_bits)
}

fn to_bits(slice: ArrayView<u8, Ix1>) -> u16 {
    let mut retval = 0;
    for (ix, &cell) in slice.iter().enumerate() {
        if cell > 0 {
            retval |= 2_u16.pow(ix.try_into().unwrap());
        }
    }
    retval
}

I also write some tests for this code, and debug it until the tests pass. The full code is on the repository. This, at least, still gives me the correct answer for Part 1. However, I’m really tired of doing this puzzle, so I think I’ll stop for now and leave it for another day.

Day 21, Part 1

Here, we have ingredient lists in a language we don’t understand, of foods containing ingredients such as “sqjhc”, “mxmxvkd”, and “kfcds”. These foods also have allergen lists in English. Each allergen is present in one ingredient and each ingredient contains either zero or one allergen. We have to determine which ingredients contain zero allergens, and count how many times they appear in our input.

Here’s another puzzle where I’m not sure how to get started, and so I decide once more to at least write code to read in the data.

use std::collections::HashSet;

struct Food {
    ingredients: HashSet<String>,
    allergens: HashSet<String>,
}

impl Food {
    fn from_string(s: &str) -> Self {
        let mut split1 = s[0..s.len() - 1].split(" (contains ");
        let (ingredients_list, allergens_list) = (split1.next().unwrap(), split1.next().unwrap());
        let ingredients = ingredients_list.split(' ').map(String::from).collect();
        let allergens = allergens_list.split(", ").map(String::from).collect();
        Food {
            ingredients,
            allergens,
        }
    }
}

I also write a test for this, which I’ll copy here since I found the way to assert the contents of a HashSet to be unusually verbose:

#[test]
fn test_parse_food() {
    let food = Food::from_string("mxmxvkd kfcds sqjhc nhms (contains dairy, fish)");
    assert_eq!(
        food.ingredients,
        ["mxmxvkd", "kfcds", "sqjhc", "nhms"]
            .iter()
            .cloned()
            .map(String::from)
            .collect()
    );
    assert_eq!(
        food.allergens,
        ["dairy", "fish"]
            .iter()
            .cloned()
            .map(String::from)
            .collect()
    );
}

It would be nice if we could say something like assert_contains!(food.ingredients, ["mxmxvkd", "kfcds", "sqjhc", "nhms"]).

I’m having trouble figuring out how to get the answer not because I’m not sure about how something works in Rust, but just because it’s not even clear to me how to solve the example problem.

I take a break and think about it some more. I wonder if we could take the tuple combinations of all foods, and compare the foods in each tuple; if they have a common allergen, then all the ingredients that are in one of the foods but not both, are non-allergens.

I write this code:

fn find_non_allergens(foods: &[Food]) -> HashSet<String> {
    foods
        .iter()
        .tuple_combinations()
        .flat_map(|(food1, food2)| {
            if food1.allergens.union(&food2.allergens).count() > 0 {
                food1
                    .ingredients
                    .symmetric_difference(&food2.ingredients)
                    .map(String::from)
                    .collect()
            } else {
                vec![]
            }
        })
        .collect()
}

That doesn’t work, though! The list of non-allergens that it yields on the example problem is too long, including ingredients that might have allergens. This is because of something in the puzzle description that I forgot to take into account: foods might not list all the allergens that they have.

My second attempt makes a list of ingredients possible_allergens that initially could contain any allergen, and removes them as they are found to be impossible:

fn find_non_allergens(foods: &[Food]) -> HashSet<String> {
    let all_allergens: HashSet<_> = foods
        .iter()
        .flat_map(|food| food.allergens.clone())
        .collect();
    let all_ingredients: Vec<_> = foods
        .iter()
        .flat_map(|food| food.ingredients.clone())
        .collect();
    let mut possible_allergens: HashMap<_, HashSet<_>> = all_ingredients
        .iter()
        .map(|ingredient| (ingredient.clone(), all_allergens.clone()))
        .collect();
    for (food1, food2) in foods.iter().tuple_combinations() {
        let allergens_in_both: Vec<_> = food1.allergens.intersection(&food2.allergens).collect();
        let ingredients_not_in_both: Vec<_> = food1
            .ingredients
            .symmetric_difference(&food2.ingredients)
            .map(String::from)
            .collect();
        for ingredient in &ingredients_not_in_both {
            for &allergen in &allergens_in_both {
                possible_allergens
                    .get_mut(ingredient)
                    .unwrap()
                    .remove(allergen);
            }
        }
    }
    possible_allergens
        .iter()
        .filter(|(_, allergens)| allergens.len() == 0)
        .map(|(ingredient, _)| ingredient.clone())
        .collect()
}

This doesn’t work either; “soy” is not ruled out for any of the ingredients, because there are no two foods in the example data that both contain “soy”.

I note that the compiler’s error message when trying to mutate a HashMap value gotten with get() is not very enlightening:

error[E0596]: cannot borrow data in a `&` reference as mutable
  --> puzzle21.rs:45:17
   |
45 |                 possible_allergens.get(ingredient).unwrap().remove(allergen);
   |                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ cannot borrow as mutable

Maybe it should suggest using get_mut() here!

While writing this I notice that I mistakenly used union() in the previous example instead of intersection(), so I quickly go back and try that, but it doesn’t work either.

Then I have a brainwave and realize that I don’t need to compare pairs of foods at all. If a food is listed as containing allergens, then the ingredients that are not in that food cannot contain those allergens. I write this code and get the correct answer from both the example data and the real puzzle input:

fn find_non_allergens(foods: &[Food]) -> HashSet<String> {
    let all_allergens: HashSet<_> = foods
        .iter()
        .flat_map(|food| food.allergens.clone())
        .collect();
    let all_ingredients: Vec<_> = foods
        .iter()
        .flat_map(|food| food.ingredients.clone())
        .collect();
    let mut possible_allergens: HashMap<_, HashSet<_>> = all_ingredients
        .iter()
        .map(|ingredient| (ingredient.clone(), all_allergens.clone()))
        .collect();
    for food in foods {
        for ingredient in all_ingredients
            .iter()
            .filter(|&ingredient| !food.ingredients.contains(ingredient))
        {
            for allergen in &food.allergens {
                possible_allergens
                    .get_mut(ingredient)
                    .unwrap()
                    .remove(allergen);
            }
        }
    }
    possible_allergens
        .iter()
        .filter(|(_, allergens)| allergens.is_empty())
        .map(|(ingredient, _)| ingredient.clone())
        .collect()
}

fn main() {
    let input = include_str!("input");
    let foods: Vec<Food> = input.lines().map(|s| Food::from_string(s)).collect();
    let non_allergens = find_non_allergens(&foods);
    let count: usize = non_allergens
        .iter()
        .map(|ingredient| {
            foods
                .iter()
                .filter(|food| food.ingredients.contains(ingredient))
                .count()
        })
        .sum();
    println!("{}", count);
}

Day 21, Part 2

Part 2 of the puzzle, as one might have expected, is to figure out which ingredients do contain which allergens. The answer to the puzzle is a comma-separated list of ingredients, sorted by allergen alphabetically.

For this we need to reuse possible_allergens, so I first split find_non_allergens() into a find_possible_allergens() function that returns possible_allergens, and a new find_non_allergens() function that does the filtering and mapping on that.

Then we can copy part of the solution from Day 16:

fn determine_allergens(
    possible_allergens: &HashMap<String, HashSet<String>>,
) -> HashMap<String, String> {
    let mut to_be_determined: Vec<_> = possible_allergens
        .iter()
        .map(|(s, set)| (s.clone(), set.clone()))
        .collect();

    let mut dangerous_ingredient_list = HashMap::new();
    while !to_be_determined.is_empty() {
        to_be_determined.sort_by_key(|(_, set)| set.len());
        to_be_determined.reverse();

        let (ingredient, allergens) = to_be_determined.pop().unwrap();
        if allergens.is_empty() {
            continue;
        }
        assert!(allergens.len() == 1, "unable to determine allergens");
        let allergen = allergens.iter().next().unwrap();
        dangerous_ingredient_list.insert(allergen.clone(), ingredient.clone());
        for (_, remaining_allergens) in &mut to_be_determined {
            remaining_allergens.remove(allergen);
        }
    }
    dangerous_ingredient_list
}

The difference is that we have to re-sort the list every iteration, since in one iteration you might rule out soy, but the next item might not have soy as a possibility because it was ruled out from a later item — as I find out by a panic when trying to run it.

Then to get the answer, I write:

let mut dangerous_ingredient_list = determine_allergens(&possible_allergens);
let mut dangerous_ingredients = dangerous_ingredient_list.drain().collect::<Vec<_>>();
dangerous_ingredients.sort_by(|(allergen1, _), (allergen2, _)| allergen1.cmp(allergen2));
let list = dangerous_ingredients
    .drain(..)
    .map(|(_, ingredient)| ingredient)
    .collect::<Vec<_>>()
    .join(",");
println!("{}", list);

The full code is on the repository.

Afterword

The puzzles are definitely getting more difficult! In many of the previous puzzles, I’ve felt like I had a general idea about how to proceed, but I didn’t feel that on Day 20. Instead, this is the first time that I actually gave up on a puzzle and went on to the next one.

There certainly is a lot of .iter()....collect() in the code from both days, to convert between Vec, HashSet, and HashMap. This seems unnecessarily verbose, I wonder if there’s a more idiomatic way that I’m missing?

Advent of Rust 18 and 19: Parsers and New Math

Posted on December 20, 2020 by Philip Chimento

It’s another two days of the ongoing chronicle of me trying to teach myself the Rust programming language by doing programming puzzles from Advent of Code 2020. I’m well over halfway, in fact today’s post includes the three-quarters mark.

I mentioned it last time, but I’ve found this series from an experienced Rust programmer also doing these puzzles in Rust to be very enlightening (hat tip, Federico Mena Quintero.)

Day 18, Part 1

Today we have to solve arithmetic homework problems, from a bizarro parallel universe where there is no precedence of operators. We get an input file with homework problems that look like 1 + 2 * 3 + 4 * 5 + 6 to which the answer is not 33 as you might expect, but 71 because in the absence of parentheses you just do all the operations from left to right.

I have in the past had to build a parser for (normal) arithmetic expressions, so in theory I should know how to do this. However, reading the Faster than Lime posts, I have seen that the author is often using a Rust parser generator package called PEG for the problems where I used scan_fmt!(), and it looks quite easy to use. So I decide to use PEG as well to generate this parser.

Luckily there is literally the exact thing I need, a parser for arithmetic expressions with precedence of operators, in one of the examples in the PEG documentation! I just need to tweak it so that the precedence of addition and multiplication is equal:

extern crate peg;

peg::parser! {
    grammar bizarro_arithmetic() for str {
        rule number() -> u64 = n:$(['0'..='9']) { n.parse().unwrap() }
        pub rule expr() -> u64 = precedence!{
            x:(@) " + " y:@ { x + y }
            x:(@) " * " y:@ { x * y }
            --
            n:number() { n }
            "(" e:expr() ")" { e }
        }
    }
}

It’s slightly weird, and trips me up for a little while, that in order to accommodate the spaces in the strings I have to explicitly use the token " + ", and can’t just tell it to tokenize on whitespace. But eventually I figure it out.

Another thing that Faster than Lime has been doing is writing inline tests! This is pretty cool, and I decide to do it here with the examples in the puzzle description:

#[test]
fn part1_examples() {
    assert_eq!(bizarro_arithmetic::expr("1 + 2 * 3 + 4 * 5 + 6"), Ok(71));
    assert_eq!(bizarro_arithmetic::expr("1 + (2 * 3) + (4 * (5 + 6))"), Ok(51));
    assert_eq!(bizarro_arithmetic::expr("2 * 3 + (4 * 5)"), Ok(26));
    assert_eq!(bizarro_arithmetic::expr("5 + (8 * 3 + 9 + 3 * 4 * 3)"), Ok(437));
    assert_eq!(bizarro_arithmetic::expr("5 * 9 * (7 * 3 * 3 + 9 * 3 + (8 + 6 * 4))"), Ok(12240));
    assert_eq!(bizarro_arithmetic::expr("((2 + 4 * 9) * (6 + 9 * 8 + 6) + 6) + 2 + 4 * 2"), Ok(13632));
}

Running cargo test will execute these tests, and that helps me fix the bug that I mentioned above where I had "+" instead of " + ".

Once that is working, I’m pleasantly surprised that I’ve gotten to the point of confidence that the main() function practically writes itself:

fn main() {
    let input = include_str!("input");
    let answer: u64 = input
        .lines()
        .map(bizarro_arithmetic::expr)
        .map(|n| n.unwrap())
        .sum();
    println!("{}", answer);
}

Thanks to PEG, this has been one of the easiest puzzles so far, unless you consider using a parser generator “cheating” (which I most certainly don’t.)

Day 18, Part 2

Part 2 of the puzzle is just the same as Part 1, only with addition given a higher precedence than multiplication. According to the PEG documentation this can be achieved just by swapping the order and adding a -- to separate the two. I add a rule to the parser generator:

pub rule expr2() -> u64 = precedence!{
    x:(@) " * " y:@ { x * y }
    --
    x:(@) " + " y:@ { x + y }
    --
    n:number() { n }
    "(" e:expr2() ")" { e }
}

I also write inline tests for this one using the examples in the puzzle description. This inline testing thing is brilliant!

Then I change the map() call in main() to use the correct parser depending on whether we are running Part 1 or Part 2:

.map(if is_part2() {
    bizarro_arithmetic::expr2
} else {
    bizarro_arithmetic::expr
})

On to the next puzzle!

Day 19, Part 1

Today’s puzzle is validating a bunch of messages against rules given in a grammar-like form. The grammar has a certain syntax, which makes me think I can use what I learned yesterday about PEG to parse it. But instead of using a PEG parser that gives a number as the parsed result, we have to build a parser that gives… another, dynamically-generated, parser.

This sounds really daunting, and I almost say “oh no!” out loud when I read the puzzle description.

I take a long break and by the time I come back to it, it has occurred to me that maybe instead of another parser, the PEG parser can give a regular expression that we can use to match the messages? Ultimately, a regular expression is really a sort of dynamically-generated parser. I feel good about this idea.

After thinking about it some more after that, I realize I might have to have one intermediate step, because rules can reference other rules that haven’t been defined yet. So I will have to read them all in, parse them into an intermediate form such as an enum, and then generate a regular expression against which I match all the messages.

The rules are in a form like this:

0: 1 2
1: "a"
2: 1 3 | 3 1
3: "b"

This means that rule 0 consists of rule 1 followed by rule 2, rule 1 is the 'a' character, rule 3 is the 'b' character, and rule 2 consists of either 1 followed by 3, or 3 followed by 1, in other words "ab" or "ba". The rules are guaranteed to have no loops. Ideally this would yield the regular expression ^a(ab|ba)$ which I could then match against the list of messages.

I write a Rule enum and a RuleSet type, during which I learn that enum variants may not reference other enum variants. The compiler suggests using Box, so that’s what I do:

#[derive(Debug)]
enum Rule {
    Literal(char),
    Ref(usize),
    Seq(Box<Rule>, Box<Rule>),
    Choice(Box<Rule>, Box<Rule>),
}

type RuleSet = HashMap<usize, Rule>;

Using what I learned from yesterday’s puzzle, the parser follows:

peg::parser! {
    grammar rule_set() for str {
        rule index() -> usize = n:$(['0'..='9']+) ":" { n.parse().unwrap() }
        rule number() -> Rule = n:$(['0'..='9']+) { Rule::Ref(n.parse().unwrap()) }
        rule literal() -> Rule = "\"" c:$(['a'..='z' | 'A'..='Z']) "\"" {
            Rule::Literal(c.chars().next().unwrap())
        }
        rule seq() -> Rule = a:number() " " b:number() { Rule::Seq(Box::new(a), Box::new(b)) }
        rule choice_side() -> Rule = seq() / number()
        rule choice() -> Rule = a:choice_side() " | " b:choice_side() {
            Rule::Choice(Box::new(a), Box::new(b))
        }
        pub rule parse_line() -> (usize, Rule) =
            ix:index() " " expr:(choice() / seq() / number() / literal()) { (ix, expr) }
    }
}

This doesn’t compile:

error[E0446]: private type `Rule` in public interface
  --> puzzle19.rs:15:2
   |
6  |   enum Rule {
   |   - `Rule` declared as private
...
15 |   peg::parser! {
   |  __^
16 | |     grammar rule_set() for str {
17 | |         rule index() -> usize = n:$(['0'..='9']+) ":" { n.parse().unwrap() }
18 | |         rule number() -> Rule = n:$(['0'..='9']+) { Rule::Ref(n.parse().unwrap()) }
...  |
24 | |     }
25 | | }
   | |_^ can't leak private type
   |
   = note: this error originates in a macro (in Nightly builds, run with -Z macro-backtrace for more info)

I’m not sure why this is happening, because enum Rule is at the top level and so is the PEG parser, so they ought to just have access to each other? In any case, I change it to pub enum Rule and it compiles

I then write a test for this parser:

#[test]
fn example1() {
    let mut rule_set = RuleSet::new();
    let example1 = ["0: 1 2", "1: \"a\"", "2: 1 3 | 3 1", "3: \"b\""];
    for line in example1.iter() {
        let (ix, rule) = rules_grammar::parse_line(line).unwrap();
        rule_set.insert(ix, rule);
    }

    use Rule::*;
    assert_eq!(rule_set.get(0), Some(Seq(Ref(1), Ref(2))));
    assert_eq!(rule_set.get(1), Some(Literal('a')));
    assert_eq!(
        rule_set.get(2),
        Some(Choice(Seq(Ref(1), Ref(3)), Seq(Ref(3), Ref(1))))
    );
    assert_eq!(rule_set.get(3), Some(Literal('b')));
}

It’s very wrong according to the compiler!

error[E0308]: mismatched types
  --> puzzle19.rs:46:29
   |
46 |     assert_eq!(rule_set.get(0), Some(Seq(Ref(1), Ref(2))));
   |                             ^
   |                             |
   |                             expected `&usize`, found integer
   |                             help: consider borrowing here: `&0`

error[E0308]: mismatched types
  --> puzzle19.rs:46:42
   |
46 |     assert_eq!(rule_set.get(0), Some(Seq(Ref(1), Ref(2))));
   |                                          ^^^^^^
   |                                          |
   |                                          expected struct `Box`, found enum `Rule`
   |                                          help: store this in the heap by calling `Box::new`: `Box::new(Ref(1))`
   |
   = note: expected struct `Box<Rule>`
                found enum `Rule`
   = note: for more on the distinction between the stack and the heap, read https://doc.rust-lang.org/book/ch15-01-box.html, https://doc.rust-lang.org/rust-by-example/std/box.html, and https://doc.rust-lang.org/std/boxed/index.html

error[E0308]: mismatched types
  --> puzzle19.rs:46:50
   |
46 |     assert_eq!(rule_set.get(0), Some(Seq(Ref(1), Ref(2))));
   |                                                  ^^^^^^
   |                                                  |
   |                                                  expected struct `Box`, found enum `Rule`
   |                                                  help: store this in the heap by calling `Box::new`: `Box::new(Ref(2))`
   |
   = note: expected struct `Box<Rule>`
                found enum `Rule`
   = note: for more on the distinction between the stack and the heap, read https://doc.rust-lang.org/book/ch15-01-box.html, https://doc.rust-lang.org/rust-by-example/std/box.html, and https://doc.rust-lang.org/std/boxed/index.html

error[E0369]: binary operation `==` cannot be applied to type `Option<&Rule>`
  --> puzzle19.rs:46:5
   |
46 |     assert_eq!(rule_set.get(0), Some(Seq(Ref(1), Ref(2))));
   |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   |     |
   |     Option<&Rule>
   |     Option<Rule>
   |
   = note: this error originates in a macro (in Nightly builds, run with -Z macro-backtrace for more info)

First of all, I’m not sure why the integer has to be borrowed to index the hash map! The second and third errors I do understand, because the rules need a Box::new() around them, though I’m not sure how that will work with asserting equality! Will it compare the addresses of the Boxes? And I don’t understand the fourth error, can’t assert_eq!() just check that the rules contain all the same variants?

Even if I do this to reference the right side:

assert_eq!(rule_set.get(&0), Some(&Seq(Box::new(Ref(1)), Box::new(Ref(2)))));

I still get an error such as this:

error[E0369]: binary operation `==` cannot be applied to type `Rule`
  --> puzzle19.rs:51:5
   |
51 |       assert_eq!(
   |  _____^
   | |_____|
   | |
52 | |         *rule_set.get(&0).unwrap(),
53 | |         Seq(Box::new(Ref(1)), Box::new(Ref(2)))
54 | |     );
   | |      ^
   | |______|
   | |______Rule
   |        Rule
   |
   = note: an implementation of `std::cmp::PartialEq` might be missing for `Rule`

(This hint doesn’t show up in the editor inline messages, but it does on the command line. Strange.) I guess PartialEq is exactly what’s missing, because adding it makes the tests pass, so apparently my worry about Box comparing by address was unfounded.

I also write the second example from the puzzle description as a test, but the first line 0: 4 1 5 already poses a problem, since I’ve written Seq so that it takes only a number, not another Seq. I have to rewrite the Rule::Seq member to be defined as Seq(Vec(Rule))¹ and then rewrite the grammar to correspond to that. PEG’s ** operator with an optional minimum number of repetitions is very handy here:

rule seq() -> Rule = refs:(number() **<2,> " ") { Rule::Seq(refs) }

While debugging I also notice that order is important in /, which I’d originally gotten wrong.

I then write a trio of functions that convert rule 0 of a rule set into regex code for matching the input:

fn rule_to_regex(rule_set: &RuleSet, rule: &Rule) -> String {
    use Rule::*;
    match rule {
        Literal(c) => c.to_string(),
        Ref(ref_ix) => rule_index_to_regex(rule_set, *ref_ix),
        Seq(refs) => refs
            .iter()
            .map(|r| rule_to_regex(rule_set, r))
            .collect::<Vec<String>>()
            .join(""),
        Choice(l, r) => format!(
            "({}|{})",
            rule_to_regex(rule_set, &*l),
            rule_to_regex(rule_set, &*r)
        ),
    }
}

fn rule_index_to_regex(rule_set: &RuleSet, ix: usize) -> String {
    let rule = rule_set.get(&ix).unwrap();
    rule_to_regex(rule_set, rule)
}

fn rule_set_to_regex(rule_set: &RuleSet) -> String {
    format!("^{}$", rule_index_to_regex(rule_set, 0))
}

I add code to my tests such as this, in order to debug it:

let regex = rule_set_to_regex(&rule_set);
assert_eq!(regex, "^a(ab|ba)$");

Once I’ve got it working then I also test the second example against the example messages given in the puzzle description, with code like this:

let matcher = Regex::new(®ex).unwrap();
assert!(matcher.is_match("ababbb"));
assert!(!matcher.is_match("bababa"));
assert!(matcher.is_match("abbbab"));
assert!(!matcher.is_match("aaabbb"));
assert!(!matcher.is_match("aaaabbb"));

I’ve already done most of the work for the main function at this point, I just need to copy it from the tests:

fn main() {
    let input = include_str!("input");
    let mut blocks = input.split("\n\n");

    let rules_block = blocks.next().unwrap();
    let mut rule_set = RuleSet::new();
    for line in rules_block.lines() {
        let (ix, rule) = rules_grammar::parse_line(line).unwrap();
        rule_set.insert(ix, rule);
    }
    let matcher = Regex::new(&rule_set_to_regex(&rule_set)).unwrap();

    let messages_block = blocks.next().unwrap();
    let matches = messages_block
        .lines()
        .map(|line| matcher.is_match(line))
        .count();

    println!("{}", matches);
}

I get an answer and it’s too high. Oh, no! I am probably generating incorrect regex code.

Before I start digging into that, I decide to do a quick reality check, and print out the number of rules in the RuleSet, as well as the number of total messages, to make sure I have gotten all of the ones in the file. The number of rules is correct, but the number of messages is equal to the answer I just got! And then I see it … I used map() instead of filter() to count the number of valid messages. Now I get the correct answer.

Day 19, Part 2

Part 2 of the puzzle asks you to change rule 8 from 8: 42 to 8: 42 | 42 8, and rule 11 from 11: 42 31 | 42 11 31, and then count the number of valid messages again. The puzzle description says that this means the rules do now contain loops, but they give you a very big hint to hard-code this and not make your code generally handle loops.

I look at the two rule modifications. In regex-speak, 8: 42 | 42 8 is really just saying 8: 42+, that is, one or more applications of rule 42. So we could modify rule_index_to_regex() to hard-code an exception when ix == 8, return "(...)+" where ... is the regex code for rule 42. This seems like it should be easy to do.

For the modification to rule 11, on the other hand, I cannot come up with the change that I would need to make to the regex. After some browsing I find that someone else has asked the same thing on Stack Overflow, and it is not possible to do it in a regular expression because such a match is not regular! (The Stack Overflow shows how it can be done in Perl with backreferences, but the documentation tells me that the regular expression engine used in Rust’s regex module does not support this because it can blow up exponentially.²)

I am briefly afraid that I will have to rewrite my entire program to not use regular expressions because the problem is no longer regular.

I realize that I should listen to the hint from the puzzle description, and only deal with the input that I have, instead of trying to solve the puzzle generally. I can get around this limitation by allowing up to a certain number of loops, instead of infinite loops. For example, let’s say that rule 42 is "a" and rule 31 is "b", then rule 11 ought to match ab, aabb, aaabbb, etc. But we know that none of our messages are infinitely long, so there’s a maximum number of as and bs that we have to match. We can write the regex (ab|a{2}b{2}|a{3}b{3}|a{4}b{4}), for example, if that maximum number is 4. We just have to figure out, or more likely, estimate,³ what that maximum number is. The upper bound for that maximum number is the length of our longest message divided by 2, which is 44, but in practice it should be much less, because rules 42 and 31 are not single letters, and rule 11 is not the start symbol.

I start with trying a maximum of 4 repetitions. This is now my rule_index_to_regex() function:

fn rule_index_to_regex(rule_set: &RuleSet, ix: usize, is_part2: bool) -> String {
    let rule = rule_set.get(&ix).unwrap();
    match ix {
        8 if is_part2 => format!("({})+", rule_to_regex(rule_set, rule, is_part2)),
        11 if is_part2 => {
            if let Rule::Seq(refs) = rule {
                format!(
                    "({0}{1}|{0}{{2}}{1}{{2}}|{0}{{3}}{1}{{3}}|{0}{{4}}{1}{{4}})",
                    rule_to_regex(rule_set, &refs[0], is_part2),
                    rule_to_regex(rule_set, &refs[1], is_part2)
                )
            } else {
                panic!("Wrong type for rule 11");
            }
        }
        _ => rule_to_regex(rule_set, rule, is_part2),
    }
}

(I’ve changed the signatures of rule_to_regex() and rule_set_to_regex() accordingly.)

This gets me a correct test result on the test that I’ve written for the example given in Part 2 of the puzzle description. So I try it on the real input, and that gives me the correct answer. Apparently four is enough.

Afterword

Day 18 was only not tricky because of a few circumstances that coincided:

I had seen similar problems before, and knew what sort of tools I needed to solve it, before I tried to write any code.
I had been primed with the idea of using the PEG package by the blog posts that I’d read.
There was an example in the PEG documentation consisting of almost exactly what I needed to solve the puzzle.

If any of these three had not applied, I’m sure it would have taken me much longer to do, as Day 19 did.

Part 2 of Day 19, with the a{n}b{n} regex, is the first time during this series where I’ve been held back by not having a computer science degree — it seems likely to me that this is something I’d have learned in school. Nonetheless, I did manage to compensate the gap in my knowledge enough to solve the puzzle, just by reading Stack Overflow.⁴ This is why Stack Overflow is such a great resource.

It doesn’t sit well with me that there are a few instances of the borrow operator here that I just don’t understand why they are needed.

Lastly, now that I’ve gotten a bit more awareness of what I know and don’t know about Rust, going back and reading how an experienced Rust programmer solved the first few days’ puzzles was incredibly helpful. Perhaps the most helpful thing of all that I learned there, is the workflow of writing a function, writing tests for that function right below it, and running cargo test until they pass.

[1] My hunch was that the Box wouldn’t be necessary because the Vec‘s size is known without it, and indeed the compiler accepted it ↩

[2] As we are indeed trying to achieve here? ↩

[3] Or even more likely, try increasing numbers until one works ↩

[4] I literally googled “regex match ab aabb aaabbb” and landed on that Stack Overflow post as the first result, isn’t it wild that this is possible? ↩

Advent of Rust 16 and 17: Really, That’s More Iterators Than I Wanted; and More Adventures With NumRust

Posted on December 18, 2020 by Philip Chimento

The last couple of days I took a break from this chronicle of my attempt to teach myself the Rust programming language by solving the programming puzzles on Advent of Code 2020. But now I’m back with another two days’ worth of puzzles!

One thing that I read in the meantime, thanks to a tip from Federico was the first installment of someone else’s blog who’s doing the same thing as I am. That blog is a really good read, and I think the main difference with my series is that the author is already a Rust expert! The style is also very different, as well; I am mostly trying to emphasize the things that I found surprising, struggled with, or didn’t understand. Their blog is much more didactic.¹

One really cool thing that I picked up from that blog post is the include_str!() macro, which makes it possible to read the input file into a string at compile time, and dispense with the read_lines() boilerplate and most of the error handling. I think I will be using this from now on. This way also makes it easy to substitute in the example inputs from the puzzle descriptions.

Also in the meantime, I spent a lot of time trying to configure either of the Rust Enhanced or SublimeLinter-contrib-rustc plugins for my Sublime Text editor. I had a surprisingly hard time with this, neither of them seemed to work. In the latter case it was because the plugin was out of sync with the API of the newest version of SublimeLinter. I thought the former plugin, Rust Enhanced, would be a smooth experience since it claims to be the officially sanctioned Rust plugin for Sublime Text, but it didn’t work either! Whenever I would try to run any of the Rust commands, I would get this message:

Blocking waiting for file lock on build directory

I finally figured out that this error goes away if you run cargo clean. My guess, it’s probably because I copied the puzzle15 folder to puzzle16 today, instead of creating it with cargo new! My bad!

The editor plugin makes development much smoother, since I can get the compiler’s answer directly when I save the file, without even leaving my editor. This makes the endless rounds of compile, add a &, compile, add a *, go much faster.

It is mildly annoying that the plugin shows the compiler messages inline, instead of in the editor gutter like all the SublimeLinter plugins do. One day when I have time, I’ll investigate to see if there is a setting that controls this.

On to the puzzle.

Day 16, Part 1

This puzzle is about train tickets. We get a more free-form input file than usual, and we have to use it to deduce the meanings of fields on train tickets that we’ve scanned. In the input file there is a list of valid ranges for each field, and lists of fields on train tickets. Some of the train tickets are invalid, and that’s what Part 1 of the puzzle is about. The answer is the sum of all the values on any ticket that are not valid no matter what field they belong to.

I start by thinking about what data structure I would use to store the ranges. Some of them are overlapping and it would be nice to be able to collapse those. I google “rust collapse range” and land in the intervallum package, where the IntervalSet type seems like exactly what I want for this. It can collapse two intervals into a single interval set which covers the two intervals, and it is allowed to have one or more holes in the middle. So I can collapse all the valid ranges together, and then check each ticket field value to see if it is contained within the any of the valid ranges.

Here’s the program that I write. I split the input into three blocks, and process the first one (the one with the valid ranges in it.)

extern crate gcollections;
extern crate interval;
#[macro_use]
extern crate scan_fmt;

use gcollections::ops::set::Union;
use interval::interval_set::ToIntervalSet;

fn main() {
    let input = "class: 1-3 or 5-7\n\
row: 6-11 or 33-44\n\
seat: 13-40 or 45-50\n\
\n\
your ticket:\n\
7,1,14\n\
\n\
nearby tickets:\n\
7,3,47\n\
40,4,50\n\
55,2,20\n\
38,6,12\n";
    // let input = include_str!("input");
    let mut blocks = input.split("\n\n");

    let mut constraints = vec![].to_interval_set();
    blocks
        .next()
        .unwrap()
        .lines()
        .map(|line| {
            let mut parts = line.split(": ");
            (parts.next().unwrap(), parts.next().unwrap())
        })
        .flat_map(|(_, intervals)| intervals.split(" or "))
        .map(|interval| {
            scan_fmt!(interval, "{d}-{d}", u16, u16)
                .unwrap()
                .to_interval_set()
        })
        .for_each(|interval| constraints = constraints.union(&interval));
    println!("{}", constraints);

    let my_ticket_block = blocks.next().unwrap();
    assert!(my_ticket_block.starts_with("your ticket:\n"));

    let other_tickets_block = blocks.next().unwrap();
    assert!(other_tickets_block.starts_with("nearby tickets:\n"));
}

I initially thought .collect::<Vec<(u16, u16)>>().to_interval_set() would work, but that gives me a panic: “This operation is only for pushing interval to the back of the array, possibly overlapping with the last element.” so apparently we have to use union() on each item. Having a bit of experience with Rust and its reputation for safety, I’d expect it to return a Result or something if you tried this, instead of crashing. This seems consistent with my impression that some Rust packages are high quality and others are a bit rough around the edges, similar to NPM packages. This one seems a bit rough and underdocumented, but good enough that we can still use it.

To use the union() method, I inexplicably have to include the gcollections package as well.

Once that’s done I can process the third block as well, and calculate the answer with one long iterator chain:

let other_tickets_block = blocks.next().unwrap();
assert!(other_tickets_block.starts_with("nearby tickets:\n"));
let error_rate: u16 = other_tickets_block
    .lines()
    .skip(1)
    .flat_map(|line| line.split(','))
    .map(|s| s.parse().unwrap())
    .filter(|val| !constraints.contains(val))
    .sum();

println!("Error rate: {}", error_rate);

Day 16, Part 2

In Part 2 of the puzzle we have to discard all the tickets that had an invalid value, and with the remaining tickets figure out which field belongs to which field name, by comparing the fields with the valid ranges for each field name. The answer is the product of all the values on our own ticket that correspond to fields whose name starts with “departure”.

A lot of the logic will be the same as in Part 1, so I start by rewriting Part 1 to save the intermediate steps that we’ll need for Part 2 as well:

let mut blocks = input.split("\n\n");

let constraints_block = blocks.next().unwrap();
let field_descriptions: HashMap<&str, interval::interval_set::IntervalSet<u16>> =
    constraints_block
        .lines()
        .map(|line| {
            let mut parts = line.split(": ");
            let field_name = parts.next().unwrap();
            let intervals = parts.next().unwrap();
            let interval_set = intervals
                .split(" or ")
                .map(|interval| scan_fmt!(interval, "{d}-{d}", u16, u16).unwrap())
                .collect::<Vec<(u16, u16)>>()
                .to_interval_set();
            (field_name, interval_set)
        })
        .collect();

let mut all_valid_values = vec![].to_interval_set();
for interval in field_descriptions.values() {
    all_valid_values = all_valid_values.union(interval);
}

let my_ticket_block = blocks.next().unwrap();
assert!(my_ticket_block.starts_with("your ticket:\n"));

let other_tickets_block = blocks.next().unwrap();
assert!(other_tickets_block.starts_with("nearby tickets:\n"));
let (valid_tickets, invalid_tickets): (Vec<Vec<u16>>, Vec<Vec<u16>>) = other_tickets_block
    .lines()
    .skip(1)
    .map(|line| {
        line.split(',')
            .map(|s| s.parse().unwrap())
            .collect::<Vec<u16>>()
    })
    .partition(|ticket| ticket.iter().all(|val| all_valid_values.contains(val)));

if is_part2() {
} else {
    let error_rate: u16 = invalid_tickets
        .iter()
        .flat_map(|ticket| ticket.iter().filter(|val| !all_valid_values.contains(val)))
        .sum();
    println!("Error rate: {}", error_rate);
}

I am excited to use the partition() method, that I found by browsing the documentation, to split the iterator into two vectors, one of valid and one of invalid tickets.

I now have vectors of the numbers from each ticket. However, to solve this puzzle I’ll have to transform those into vectors of all the numbers in each position on the tickets (for example, a vector of all numbers coming first on each ticket, coming second, etc.) instead of vectors of each ticket. This sounds like something I should be able to do with a zip() method which I’m familiar with from Python.

Rust’s zip() takes only two iterators. At first I think that itertools::multizip will solve my problem, but that takes a number of iterators that must be known at compile time, which is not the case here. After some more googling I find this Stack Overflow answer and find a code snippet which I simply copy as-is into my program. I don’t really understand it, but I understand enough to see approximately what it does.

I decide to store the possible fields for each position on the ticket as a vector of hashsets, possible_fields_by_position. The order of the vector is by position of the number on the ticket. The elements of the vector are hash sets consisting of zero or more field numbers, which the numbers in this position might be valid for.

To do this, I have to pre-fill the vector with empty hashsets, and I’m not sure how to do this. Googling “rust fill vector of hashset” sends me here.

If the problem is solvable, then there will be at least one element of the vector where the hash set has only one item in it, and then we will know what field corresponds with that position on the ticket. We can then consider that field “determined” and remove it from all the other hash sets in the vector, which hopefully leaves at least one more hash set with only one item, allowing is to uniquely determine another field, and so on.

One interesting thing to note is that I first had this:

for (_, remaining_fields) in possible_fields_by_position {
    remaining_fields.remove(field_ix);
}

Here, for the first time that I’ve seen, the compiler gives a plain wrong suggestion:

error[E0382]: borrow of moved value: `possible_fields_by_position`
   --> puzzle16.rs:97:15
    |
78  |         let mut possible_fields_by_position: Vec<(usize, HashSet<usize>)> = (0..valid_tickets[0]
    |             ------------------------------- move occurs because `possible_fields_by_position` has type `Vec<(usize, std::collections::HashSet<usize>)>`, which does not implement the `Copy` trait
...
97  |         while possible_fields_by_position.len() > 0 {
    |               ^^^^^^^^^^^^^^^^^^^^^^^^^^^ value borrowed here after move
...
102 |             for (_, remaining_fields) in possible_fields_by_position {
    |                                          ---------------------------
    |                                          |
    |                                          value moved here, in previous iteration of loop
    |                                          help: consider borrowing to avoid moving into the for loop: `&possible_fields_by_position`

error[E0596]: cannot borrow `remaining_fields` as mutable, as it is not declared as mutable
   --> puzzle16.rs:103:17
    |
102 |             for (_, remaining_fields) in possible_fields_by_position {
    |                     ---------------- help: consider changing this to be mutable: `mut remaining_fields`
103 |                 remaining_fields.remove(field_ix);
    |                 ^^^^^^^^^^^^^^^^ cannot borrow as mutable

Even without the above, I certainly had more trouble than usual with getting the & operators right! Maybe it’s time for a re-read of the chapter on ownership in the Rust book.

For the last step I am also left wondering how to pre-fill a vector with zeros. The syntax is [0; number_of_zeros]. I do actually stumble upon a Github issue complaining about it not being documented well enough.

Here is the very long full program:

extern crate gcollections;
extern crate interval;
#[macro_use]
extern crate scan_fmt;

use gcollections::ops::set::{Contains, Union};
use interval::interval_set::{IntervalSet, ToIntervalSet};
use std::collections::HashSet;
use std::env;

// https://stackoverflow.com/a/55292215/172999
struct Multizip<T>(Vec<T>);

impl<T> Iterator for Multizip<T>
where
    T: Iterator,
{
    type Item = Vec<T::Item>;

    fn next(&mut self) -> Option<Self::Item> {
        self.0.iter_mut().map(Iterator::next).collect()
    }
}

fn main() {
    let input = include_str!("input");
    let mut blocks = input.split("\n\n");

    let constraints_block = blocks.next().unwrap();
    let field_descriptions: Vec<(&str, IntervalSet<u16>)> = constraints_block
        .lines()
        .map(|line| {
            let mut parts = line.split(": ");
            let field_name = parts.next().unwrap();
            let interval_set = parts
                .next()
                .unwrap()
                .split(" or ")
                .map(|interval| scan_fmt!(interval, "{d}-{d}", u16, u16).unwrap())
                .collect::<Vec<(u16, u16)>>()
                .to_interval_set();
            (field_name, interval_set)
        })
        .collect();

    let mut all_valid_values = vec![].to_interval_set();
    for (_, interval) in &field_descriptions {
        all_valid_values = all_valid_values.union(interval);
    }

    let my_ticket_block = blocks.next().unwrap();
    assert!(my_ticket_block.starts_with("your ticket:\n"));

    let other_tickets_block = blocks.next().unwrap();
    assert!(other_tickets_block.starts_with("nearby tickets:\n"));
    let (valid_tickets, invalid_tickets): (Vec<Vec<u16>>, Vec<Vec<u16>>) = other_tickets_block
        .lines()
        .skip(1)
        .map(read_csv_numbers)
        .partition(|ticket| ticket.iter().all(|val| all_valid_values.contains(val)));

    if is_part2() {
        let mut possible_fields_by_position: Vec<_> = (0..valid_tickets[0].len())
            .map(|_| HashSet::new())
            .enumerate()
            .collect();
        for (position, position_values) in
            Multizip(valid_tickets.iter().map(|ticket| ticket.iter()).collect()).enumerate()
        {
            for (field_ix, (_, interval)) in field_descriptions.iter().enumerate() {
                if position_values.iter().all(|val| interval.contains(val)) {
                    possible_fields_by_position[position].1.insert(field_ix);
                }
            }
        }

        possible_fields_by_position.sort_by_key(|(_, set)| set.len());
        possible_fields_by_position.reverse();

        let mut determined_fields_by_position = vec![0; possible_fields_by_position.len()];
        while !possible_fields_by_position.is_empty() {
            let (position, possible_fields) = possible_fields_by_position.pop().unwrap();
            assert!(possible_fields.len() == 1, "unable to determine fields");
            let field_ix = possible_fields.iter().next().unwrap();
            determined_fields_by_position[position] = *field_ix;
            for (_, remaining_fields) in &mut possible_fields_by_position {
                remaining_fields.remove(field_ix);
            }
        }

        let my_ticket_values: Vec<u16> = my_ticket_block
            .lines()
            .skip(1)
            .flat_map(read_csv_numbers)
            .collect();

        let answer: u64 = determined_fields_by_position
            .iter()
            .map(|field_ix| field_descriptions[*field_ix].0)
            .zip(my_ticket_values.iter())
            .filter(|(field_name, _)| field_name.starts_with("departure"))
            .map(|(_, value)| *value as u64)
            .product();

        println!("My ticket values {:?}", answer);
    } else {
        let error_rate: u16 = invalid_tickets
            .iter()
            .flat_map(|ticket| ticket.iter().filter(|val| !all_valid_values.contains(val)))
            .sum();
        println!("Error rate: {}", error_rate);
    }
}

fn read_csv_numbers(line: &str) -> Vec<u16> {
    line.split(',').map(|s| s.parse().unwrap()).collect()
}

Day 17, Part 1

I didn’t manage to finish the Day 16 blog post before I started working on Day 17’s puzzle, so I’m tacking it on here. Day 17 brings us a three-dimensional Conway’s Game of Life!

I remember very well implementing Game of Life with a different twist on Day 11, so today I will write some more code using ndarray, copying from Day 11 where I can.

In three dimensions we have 3³ − 1 = 26 neighbours instead of the 3² − 1 = 8 that we have in two dimensions. I “hand-unrolled”² the loop over the 8 neighbours in the calc_neighbours() function from Day 11:

fn calc_neighbours(seats: &Array2<i8>) -> Array2<i8> {
    let shape = seats.shape();
    let width = shape[0];
    let height = shape[1];
    let mut neighbours = Array2::<i8>::zeros(seats.raw_dim());
    // Add slices of the occupied seats shifted one space in each direction
    let mut slice = neighbours.slice_mut(s![1.., 1..]);
    slice += &seats.slice(s![..width - 1, ..height - 1]);
    slice = neighbours.slice_mut(s![.., 1..]);
    slice += &seats.slice(s![.., ..height - 1]);
    slice = neighbours.slice_mut(s![..width - 1, 1..]);
    slice += &seats.slice(s![1.., ..height - 1]);
    slice = neighbours.slice_mut(s![1.., ..]);
    slice += &seats.slice(s![..width - 1, ..]);
    slice = neighbours.slice_mut(s![..width - 1, ..]);
    slice += &seats.slice(s![1.., ..]);
    slice = neighbours.slice_mut(s![1.., ..height - 1]);
    slice += &seats.slice(s![..width - 1, 1..]);
    slice = neighbours.slice_mut(s![.., ..height - 1]);
    slice += &seats.slice(s![.., 1..]);
    slice = neighbours.slice_mut(s![..width - 1, ..height - 1]);
    slice += &seats.slice(s![1.., 1..]);
    neighbours
}

That will simply not do when we have 26 neighbours! I decide to first rewrite it in the Day 11 code with the outer product of two vectors [-1, 0, 1] and make sure that it still gives the same answer. I find that the cartesian_product() method from Itertools is ideal for this:

fn calc_neighbours(seats: &Array2<i8>) -> Array2<i8> {
    let shape = seats.shape();
    let width = shape[0] as isize;
    let height = shape[1] as isize;
    let mut neighbours = Array2::<i8>::zeros(seats.raw_dim());
    // Add slices of the occupied seats shifted one space in each direction
    for (xstart, ystart) in (-1..=1).cartesian_product(-1..=1) {
        if xstart == 0 && ystart == 0 {
            continue;
        }
        let xdest = xstart.max(0)..(width + xstart).min(width);
        let ydest = ystart.max(0)..(height + ystart).min(height);
        let xsource = (-xstart).max(0)..(width - xstart).min(width);
        let ysource = (-ystart).max(0)..(height - ystart).min(height);
        let mut slice = neighbours.slice_mut(s![xdest, ydest]);
        slice += &seats.slice(s![xsource, ysource]);
    }
    neighbours
}

It’s still pretty verbose, and I suspect it could be done more cleverly, but this is good enough to be straightforwardly adapted to three dimensions for Day 17:

fn calc_neighbours(grid: &Array3<i8>) -> Array3<i8> {
    let shape = grid.shape();
    let width = shape[0] as isize;
    let height = shape[1] as isize;
    let depth = shape[2] as isize;
    let mut neighbours = Array3::<i8>::zeros(grid.raw_dim());
    // Add slices of the occupied grid shifted one space in each direction
    for starts in iter::repeat(-1..=1).take(3).multi_cartesian_product() {
        if starts.iter().all(|start| *start == 0) {
            continue;
        }
        let (xstart, ystart, zstart) = (starts[0], starts[1], starts[2]);
        let xdest = xstart.max(0)..(width + xstart).min(width);
        let ydest = ystart.max(0)..(height + ystart).min(height);
        let zdest = zstart.max(0)..(depth + zstart).min(depth);
        let xsource = (-xstart).max(0)..(width - xstart).min(width);
        let ysource = (-ystart).max(0)..(height - ystart).min(height);
        let zsource = (-zstart).max(0)..(depth - zstart).min(depth);
        let mut slice = neighbours.slice_mut(s![xdest, ydest, zdest]);
        slice += &grid.slice(s![xsource, ysource, zsource]);
    }
    neighbours
}

Unlike on Day 11, the game board is infinite and has no walls. However, also unlike on Day 11 where we had a termination condition for the game, today we have to simulate exactly 6 iterations of the game, so we can actually pretend the board is finite. The occupied cells can expand at by 1 every turn at most, so the board can be bounded at the size of the input board plus the number of iterations in every direction.

In the example given in the puzzle description, the board is 3×3×1 and we simulate 3 iterations, so the maximum board size would be 9×9×7 in the example. (The example’s actual output fits in 7×7×5, but we need an upper bound.)

I’m able to reuse a lot of the code from Day 11, so this should look very familiar if you’ve been following along:

use itertools::Itertools;
use ndarray::{s, Array3};
use std::iter;

fn main() {
    // let input = ".#.\n..#\n###\n";
    let input = include_str!("input");
    let n_turns = 6;
    let mut grid = read_grid(input, n_turns);

    for _ in 0..n_turns {
        let neighbours = calc_neighbours(&grid);
        let activations = &neighbours.mapv(|count| (count == 3) as i16)
            * &grid.mapv(|active| (active == 0) as i16);
        let deactivations = &neighbours.mapv(|count| (count < 2 || count > 3) as i16) * &grid;
        grid = grid + activations - deactivations;
    }

    dump_grid(&grid);
    println!("{}", grid.sum());
}

fn calc_neighbours(grid: &Array3<i16>) -> Array3<i16> {
    let shape = grid.shape();
    let width = shape[0] as isize;
    let height = shape[1] as isize;
    let depth = shape[2] as isize;
    let mut neighbours = Array3::<i16>::zeros(grid.raw_dim());
    // Add slices of the occupied grid shifted one space in each direction
    for starts in iter::repeat(-1..=1).take(3).multi_cartesian_product() {
        if starts.iter().all(|start| *start == 0) {
            continue;
        }
        let (xstart, ystart, zstart) = (starts[0], starts[1], starts[2]);
        let xdest = xstart.max(0)..(width + xstart).min(width);
        let ydest = ystart.max(0)..(height + ystart).min(height);
        let zdest = zstart.max(0)..(depth + zstart).min(depth);
        let xsource = (-xstart).max(0)..(width - xstart).min(width);
        let ysource = (-ystart).max(0)..(height - ystart).min(height);
        let zsource = (-zstart).max(0)..(depth - zstart).min(depth);
        let mut slice = neighbours.slice_mut(s![xdest, ydest, zdest]);
        slice += &grid.slice(s![xsource, ysource, zsource]);
    }
    neighbours
}

fn read_grid(input: &str, padding: usize) -> Array3<i16> {
    let lines: Vec<&str> = input.lines().collect();
    let height = lines.len();
    let width = lines[0].len();
    let mut cells = Array3::zeros((width + 2 * padding, height + 2 * padding, 2 * padding + 1));
    for (y, line) in lines.iter().enumerate() {
        for (x, tile) in line.bytes().enumerate() {
            cells[[x + padding, y + padding, padding]] = match tile {
                b'#' => 1,
                b'.' => 0,
                _ => panic!("Bad tile '{}'", tile),
            };
        }
    }
    cells
}

fn dump_grid(grid: &Array3<i16>) {
    for xy in grid.axis_iter(ndarray::Axis(2)) {
        for x in xy.axis_iter(ndarray::Axis(1)) {
            println!(
                "{}",
                x.mapv(|active| if active != 0 { '#' } else { '.' })
                    .iter()
                    .collect::<String>()
            )
        }
        println!("");
    }
}

The main difference is that I’ve written a dump_grid() function as well.

I remarked on Day 11 that in my opinion the ndarray package suffers from a few deficiencies compared to NumPy, and I ran into those same problems today:

The return value of sum() is limited to the data type of the array it is being called on.
You can’t easily use boolean arrays as a mask.

Day 17, Part 2

Part 2 of the puzzle is exactly the same as Part 1, only four-dimensional! I briefly wish that I had spent the time on generalizing calc_neighbours() to work with any number of dimensions. But on the other hand, I really don’t feel confident enough with either Rust or ndarray that I could anticipate being able to do that without spending all day on it.

So instead I decide to do the simplest thing: copy the file to another puzzle17-2.rs and add another section to Cargo.toml:

[[bin]]
name = "puzzle17-2"
path = "puzzle17-2.rs"

I notice that you now need to specify which binary to run, which is nice, because I wasn’t sure how I would choose between the two:

error: `cargo run` could not determine which binary to run. Use the `--bin` option to specify a binary, or the `default-run` manifest key.
available binaries: puzzle17, puzzle17-2

Then I just change all the Array3 to Array4 and add an extra index where needed:

use itertools::Itertools;
use ndarray::{s, Array4};
use std::iter;

fn main() {
    let input = include_str!("input");
    let n_turns = 6;
    let mut grid = read_grid(input, n_turns);

    for _ in 0..n_turns {
        let neighbours = calc_neighbours(&grid);
        let activations = &neighbours.mapv(|count| (count == 3) as i16)
            * &grid.mapv(|active| (active == 0) as i16);
        let deactivations = &neighbours.mapv(|count| (count < 2 || count > 3) as i16) * &grid;
        grid = grid + activations - deactivations;
    }

    println!("{}", grid.sum());
}

fn calc_neighbours(grid: &Array4<i16>) -> Array4<i16> {
    let shape = grid.shape();
    let width = shape[0] as isize;
    let height = shape[1] as isize;
    let depth = shape[2] as isize;
    let limit4 = shape[3] as isize;
    let mut neighbours = Array4::<i16>::zeros(grid.raw_dim());
    // Add slices of the occupied grid shifted one space in each direction
    for starts in iter::repeat(-1..=1).take(4).multi_cartesian_product() {
        if starts.iter().all(|start| *start == 0) {
            continue;
        }
        let (xstart, ystart, zstart, wstart) = (starts[0], starts[1], starts[2], starts[3]);
        let xdest = xstart.max(0)..(width + xstart).min(width);
        let ydest = ystart.max(0)..(height + ystart).min(height);
        let zdest = zstart.max(0)..(depth + zstart).min(depth);
        let wdest = wstart.max(0)..(limit4 + wstart).min(limit4);
        let xsource = (-xstart).max(0)..(width - xstart).min(width);
        let ysource = (-ystart).max(0)..(height - ystart).min(height);
        let zsource = (-zstart).max(0)..(depth - zstart).min(depth);
        let wsource = (-wstart).max(0)..(limit4 - wstart).min(limit4);
        let mut slice = neighbours.slice_mut(s![xdest, ydest, zdest, wdest]);
        slice += &grid.slice(s![xsource, ysource, zsource, wsource]);
    }
    neighbours
}

fn read_grid(input: &str, padding: usize) -> Array4<i16> {
    let lines: Vec<&str> = input.lines().collect();
    let height = lines.len();
    let width = lines[0].len();
    let mut cells = Array4::zeros((
        width + 2 * padding,
        height + 2 * padding,
        2 * padding + 1,
        2 * padding + 1,
    ));
    for (y, line) in lines.iter().enumerate() {
        for (x, tile) in line.bytes().enumerate() {
            cells[[x + padding, y + padding, padding, padding]] = match tile {
                b'#' => 1,
                b'.' => 0,
                _ => panic!("Bad tile '{}'", tile),
            };
        }
    }
    cells
}

This gives me the right answer.

Afterword

The puzzles seem to be getting harder as we go along, particularly Day 16 more than Day 17. I am almost certain that Day 16 could be solved much more elegantly than the solution that I had. Day 17, on the other hand, gave me a chance to improve on the solution that I already had for Day 11.

Having the compiler messages directly in my editor is a mixed blessing! It’s definitely streamlined things, but on the other hand I think it has actually made me more lazy and less inclined to learn about when the & operator is necessary — I’ve caught myself just adding them here and there in likely places without bothering to think, and letting the compiler sort the rest out. I’m definitely not this careless with pointers in C!

[1] And therefore probably more useful if you actually want to learn Rust yourself ↩

[2] Or more like, the loop was never rolled in the first place because it was too complicated for me to write at the time ↩

Advent of Rust 14 and 15: Bits And/Or Pieces

Posted on December 15, 2020 by Philip Chimento

This blog: chronicling my adventure of teaching myself the Rust programming language since December 1, courtesy of the programming puzzles at Advent of Code 2020. I’m not sure if anyone is reading anymore at this point 😆

Day 14, Part 1

Today’s puzzle is once again simulating some computer instructions, although the model this time is simple enough that I don’t feel I have to write a struct with methods to simulate it.

We have to write values into memory addresses, with a bitmask: XX1100XX (left to right, most significant to least significant, which I’ll refer to as bit 7 through 0) means that bits 7 and 6 of the value to be written are left unchanged, bits 5 and 4 are overwritten with 0, bits 3 and 2 are overwritten with 1, and bits 1 and 0 are also left unchanged. The answer to the puzzle is the sum of all the values in memory.

Contrary to what people might expect about programmers, I have not memorized how to set a bit to 0 or 1 in a number, so I always look it up to make sure I’m doing it correctly, before I do it. To set a bit to 0, use the bitwise AND operator (& in most languages) with an integer consisting of all bits 1 except the bit you want to set to 0; and to set a bit to 1, use the bitwise OR operator (| in most languages) with the bit that you want to set to 1. Based on this, I decide to write a function to parse the bitmask string, and it will split the bitmask into two bitmasks: an AND-mask to set bits to 0, and an OR-mask to set them to 1. So in the example I gave above, the AND-mask would be 11110011 and the OR-mask would be 00110000.

To read the other lines in the file I’ll use my old friend scan_fmt!(). Finally, to emulate the memory, I’ll use a HashMap<u64, u64>. Naively emulating it in an array that spans the entire 2³⁶-word address space would be a bit big (288 GiB). I do quickly look up “rust sparse array” but don’t find anything that seems more compelling than a hash map.

While writing the program I do need to look up “rust exponentiation” and find that numbers have a pow() method. But I do get an error:

error[E0689]: can't call method `pow` on ambiguous numeric type `{integer}`
  --> puzzle14.rs:38:35
   |
38 |             b'0' => and_mask -= 2.pow(ix),
   |                                   ^^^
   |
help: you must specify a concrete type for this numeric value, like `i32`
   |
38 |             b'0' => and_mask -= 2_i32.pow(ix),
   |                                 ^^^^^

That syntax (2_i32) is new to me, I haven’t seen it (or seen it suggested by the compiler) before.

error[E0308]: mismatched types
  --> puzzle14.rs:38:43
   |
38 |             b'0' => and_mask -= 2_u64.pow(ix),
   |                                           ^^ expected `u32`, found `usize`
   |
help: you can convert an `usize` to `u32` and panic if the converted value wouldn't fit
   |
38 |             b'0' => and_mask -= 2_u64.pow(ix.try_into().unwrap()),
   |                                           ^^^^^^^^^^^^^^^^^^^^^^

This I’m also surprised at, why does u64::pow() take u32 specifically and not u64 or usize?

error[E0599]: no method named `try_into` found for type `usize` in the current scope
  --> puzzle14.rs:41:45
   |
41 |             b'1' => or_mask += 2_u64.pow(ix.try_into()?),
   |                                             ^^^^^^^^ method not found in `usize`
   |
   = help: items from traits can only be used if the trait is in scope
help: the following trait is implemented but not in scope; perhaps add a `use` for it:
   |
1  | use std::convert::TryInto;
   |

It would be nice if the compiler would suggest that when suggesting try_into() in the first place!

Otherwise, writing the solution to Part 1 goes smoothly:¹

use std::collections::HashMap;
use std::convert::TryInto;
use std::num;

#[macro_use]
extern crate scan_fmt;

fn main() -> Result<(), Box<dyn Error>> {
    let mut memory = HashMap::new();
    let mut or_mask: u64 = 0;
    let mut and_mask: u64 = u64::MAX;

    let file = fs::File::open("input")?;
    for line in read_lines(file) {
        if line.starts_with("mask") {
            let (new_or_mask, new_and_mask) = parse_mask(&line[7..])?;
            or_mask = new_or_mask;
            and_mask = new_and_mask;
            continue;
        }
        let (addr, value) = scan_fmt!(&line, "mem[{}] = {}", u64, u64)?;
        memory.insert(addr, value & and_mask | or_mask);
    }

    println!("{}", memory.values().sum::<u64>());

    Ok(())
}

fn parse_mask(line: &str) -> Result<(u64, u64), num::TryFromIntError> {
    let mut or_mask: u64 = 0;
    let mut and_mask: u64 = u64::MAX;

    for (ix, byte) in line.bytes().rev().enumerate() {
        match byte {
            b'0' => and_mask -= 2_u64.pow(ix.try_into()?),
            b'1' => or_mask += 2_u64.pow(ix.try_into()?),
            b'X' => (),
            _ => panic!("Bad byte {}", byte),
        }
    }
    Ok((or_mask, and_mask))
}

I’m slightly surprised at a few of these type annotations:

memory has type HashMap<u64, u64>, so memory.values() has type Iterator<Item = u64>, so memory.values().sum() should unambiguously be u64?
and_mask and or_mask have type u64, so if I add or subtract 2^ix then that could be assumed to be u64 as well?

Day 14, Part 2

In Part 2, the meaning of the bitmask changes. Now it applies to the address being written to, not the value being written. A zero in the mask now has no effect on the value, so we ignore the AND-mask; a one in the mask means the same thing as in Part 1, so we continue to use the OR-mask; and now the Xs mean that that bit in the mask is “floating”, meaning that we have to write the value to both of the memory addresses in which that bit is either 0 or 1. (And if we have two Xs in the mask, then we have to write to four memory addresses, etc.)

Nonetheless, the program can almost remain the same! First I change parse_mask() to return a third mask as well, the float_mask, which has 1s in the bits that are supposed to be floating:

fn parse_mask(line: &str) -> Result<(u64, u64, u64), num::TryFromIntError> {
    let mut or_mask: u64 = 0;
    let mut and_mask: u64 = u64::MAX;
    let mut float_mask: u64 = 0;

    for (ix, byte) in line.bytes().rev().enumerate() {
        let bit = 2_u64.pow(ix.try_into()?);
        match byte {
            b'0' => and_mask -= bit,
            b'1' => or_mask += bit,
            b'X' => float_mask += bit,
            _ => panic!("Bad byte {}", byte),
        }
    }
    Ok((or_mask, and_mask, float_mask))
}

I change the memory.insert(addr, value & and_mask | or_mask); line to instead do write_floating_memory(&mut memory, addr | or_mask, value, float_mask); if we are in Part 2.

The write_floating_memory() function is a bit more involved. I do know that we have to write the value to 2ⁿ addresses, where n is the number of 1-bits in the float mask, so I know I can iterate through the values from 0 to 2ⁿ − 1 and use each bit of each of those values in place of one of the floating bits. But I struggle somewhat with getting those bits into the right place.

I try a few things and then I realize that I can use the iterated value as a vector of bits, where I can pop bits off the end after I’ve used them, with the right shift operator (>>). So that’s what I write:

fn write_floating_memory(memory: &mut HashMap<u64, u64>, addr: u64, value: u64, float_mask: u64) {
    for mut floating_bits in 0..2_u64.pow(float_mask.count_ones()) {
        let mut masked_addr = addr;
        for bit_ix in 0..36 {
            let bit = 2_u64.pow(bit_ix);
            if float_mask & bit != 0 {
                match floating_bits & 1 {
                    0 => masked_addr &= !bit,
                    1 => masked_addr |= bit,
                    _ => panic!("Not possible"),
                };
                floating_bits >>= 1;
            }
        }
        memory.insert(masked_addr, value);
    }
}

There’s probably a more efficient way to do that by using bit-manipulation operators, but doing that intuitively is not my strong suit, and I’d rather not have to think hard about it when I can just solve the puzzle this way.

I initially get the wrong answer from the example input because I forgot to ignore the AND-mask (I had addr & and_mask | or_mask.) While debugging this, I learn the {:b} and {:036b} formats for println!(), which are useful for printing out the masks.

Once that is solved, though, I get the right answer for the example input, and then for the real puzzle.

Instead of “Not possible” it seems like it would be useful to have bit pattern matching in the match expression. I’m not the only one to have suggested this and I find the bitmatch package, but I don’t bother changing anything at this point now that I’ve got the answer.

Day 15, Part 1

I happened to solve the Day 15 puzzle within about 20 minutes after it was released, so I’m tacking it on to today’s entry.

The puzzle is a counting game played by the elves at the North Pole. Each person takes a turn reading off a list of starting numbers, and once the starting numbers are done, the game begins. The next player considers whether the last player’s number had been spoken before. If not, they say 0. If so, they say the number of turns between the previous time the number was spoken, and the last turn number. The puzzle answer is the number that’s spoken on turn 2020.

A few considerations that I made when writing the program below:

This time we don’t even have to read the input from a file, it’s given in the puzzle, so I delete read_lines() from the boilerplate.
I originally think I have to store the last two occurrences of the given number, and shift them along when the number occurs again, but the last occurrence is actually always the previous turn. I only have to store the second-to-last occurrence, and insert the previous turn’s number after the calculation for this turn.
I use 0-based turn numbers internally, but 1-based when printing them out. “Turn 2020” is a human-readable 1-based number, so it’s actually turn 2019 internally.
The number spoken after the starting numbers are done is always 0, because the starting numbers each occur for the first time.

fn main() {
    let input = vec![15, 12, 0, 14, 3, 1];
    let mut last_seen: HashMap<u16, u16> = input
        .iter()
        .enumerate()
        .map(|(turn, starting_number)| (*starting_number, turn as u16))
        .collect();
    let mut last_turn_number = 0;

    for turn in (input.len() as u16 + 1)..2020 {
        let this_turn_number = match last_seen.get(&last_turn_number) {
            Some(prev_seen) => turn - 1 - prev_seen,
            None => 0,
        };
        last_seen.insert(last_turn_number, turn - 1);
        last_turn_number = this_turn_number;

        println!("Turn {}: {}", turn + 1, this_turn_number);
    }
}

Day 15, Part 2

The answer to Part 2 is the number that is spoken on turn 30 million. I figure this will take an inconvenient amount of time to calculate, but not so inconvenient that I don’t want to try a brute force solution. I change the types of the integers to usize everywhere, to accommodate the larger numbers, and add an if expression, and that’s it, really. It takes a few minutes to run, but it’s done.

fn main() {
    let input = vec![15, 12, 0, 14, 3, 1];
    let mut last_seen: HashMap<usize, usize> = input
        .iter()
        .enumerate()
        .map(|(turn, starting_number)| (*starting_number, turn))
        .collect();
    let mut last_turn_number = 0;

    for turn in (input.len() + 1)..if is_part2() { 30000000 } else { 2020 } {
        let this_turn_number = match last_seen.get(&last_turn_number) {
            Some(prev_seen) => turn - 1 - prev_seen,
            None => 0,
        };
        last_seen.insert(last_turn_number, turn - 1);
        last_turn_number = this_turn_number;

        println!("Turn {}: {}", turn + 1, this_turn_number);
    }
}

Afterword

After Day 14 I was going to say that it seems like the puzzles follow a general pattern of having Part 1 be a fairly straightforward problem, and Part 2 adding a complication that makes you have to think harder about it. But Day 15 breaks that mold!

[1] For certain definitions of “smoothly,” that is. I’ve gotten used to always having a few rounds of compilation where I fix the borrow operators, and any program that I post here should be understood to have been bashed into shape by the compiler — I still can’t write Rust ↩