Advent of Rust 8: Please Make Your Error More Error-y

It’s Day 8 of the no-longer-so-stream-of-consciousness-log of teaching myself the Rust programming language by solving the programming puzzles at Advent of Code 2020.

Today I start off by refactoring my boilerplate code some more. I got loads of friendly advice from Federico Mena Quintero including how to configure Cargo so that the main.rs doesn’t have to be in a subdirectory, some reading material on error handling which I am working my way through, and a talk which I watched last night while washing the dishes, on Rust programs as a dialogue between the programmer and the compiler. A dialogue between the programmer and the compiler is certainly what some of these puzzles have been!

Reorganizing the project structure to be flat is something I’ve wanted to do since Day 1! I create a new project with cargo new puzzle8, but then I delete the src/ directory and create puzzle8.rs in the main puzzle8 directory, and put the following in Cargo.toml:

[[bin]]
name = "puzzle8"
path = "puzzle8.rs"

(I haven’t read anything yet about TOML and am just parroting the examples in the Cargo documentation; in particular, don’t ask me what the double square brackets signify.)

I decide to start with this boilerplate in puzzle8.rs:

use std::env;
use std::fs;
use std::io::{self, BufRead};

fn main() -> Result<(), io::Error> {
    let file = fs::File::open("input")?;
    let lines: Vec<String> = read_lines(file).collect();
    println!("Hello, world! {}", lines.len());

    Ok(())
}

fn is_part2() -> bool {
    env::args().nth(1).map(|val| val == "2").unwrap_or(false)
}

fn read_lines(file: fs::File) -> impl Iterator<Item = String> {
    io::BufReader::new(file).lines().map(|res| res.unwrap())
}

Federico’s advice included moving the fs::File::open() call into the main body. I find this version of the code much shorter and easier to read.

What I’m also doing this week is enabling the Rust documentation in the DevDocs tab that I permanently have open in my browser so that I can search the API documentation more easily. This wouldn’t have been so useful to me in the first few days, since I wouldn’t have understood the documentation very well, but now it is.

Now I’m ready to get started on the puzzle!

Day 8, Part 1

Today’s puzzle involves executing some computer instructions in a made-up assembly language, and figuring out where it goes into an infinite loop. The answer to the puzzle is the value that’s in the accumulator just before the infinite loop starts.

This is exciting, we get to build a tiny virtual machine! That sounds intimidating but there are a few things that make it easy:

  • there are only 3 instruction codes (“opcodes”).
  • there is only one variable (“accumulator”).
  • there are no choices, branches, or side effects, so if you reach the same instruction for the second time, then you know you’re in an infinite loop.

I think I will start by designing a data structure and API for the virtual machine. Instructions can be a tuple of an opcode and a parameter (and looking at the input file, I see parameters greater than +128 and less than -127, but none more than three digits, so i16 seems like a good type for now). The opcodes are nop to do nothing, acc to increase or decrease the accumulator, or jmp to jump to a different instruction. I think this can be represented by an enum, if Rust has such a thing. I google “rust enum” and land on a chapter in the Rust book.

I start with this:

enum Opcode {
    NOP,
    ACC,
    JMP,
}

struct Instruction {
    opcode: Opcode,
    param: i16,
}

Reading a bit further on in the book, I find out something that is surprising to me: enums in Rust are not what is known as enums in other programming languages! They are actually variant data types. So, because the NOP instruction ignores its parameter, I can instead write the following, remembering #[derive(Debug)] from Day 2, and dispense with having a separate struct Instruction:

#[derive(Debug)]
enum Instruction {
    NOP,
    ACC(i16),
    JMP(i16),
}

Then for the virtual machine itself I will need a Vec<Instruction> for the program code, a usize for the program counter (an index pointing to which instruction in the program code is being executed), and a member for the accumulator. The puzzle description doesn’t say how many bits the accumulator is, but let’s give it a fairly large signed integer type: i32. We will also need one bit per instruction to keep track of whether the instruction has been executed before or not. We could store the program code as Vec<(Instruction, bool)>, which would waste a lot of space but be just fine for solving the puzzle. But out of interest, I also take a look at the bitvec package and it seems like it won’t be too difficult to use, so I tentatively decide to try that, and will revert to the tuple approach if I run into problems.

struct VM {
    acc: i32,
    pc: usize,
    code: Vec<Instruction>,
    visited: BitVec,
}

Now I need to decide what methods the virtual machine will have. It will need a method to convert a line of assembly language into an instruction and push the instruction into the code vector. This should probably look like assemble_line(line: &str) -> Result<(), ???>. It will also need a method to run the program until an infinite loop is detected, which I will call run() -> Result<(), ???>. I’m not sure what to put in the ??? types in both cases. In the case of assemble_line(), the possible errors could be that the string is not in the right format, or that the opcode is unknown, or the parameter is out of range. In the case of run(), the possible errors could be that an infinite loop was detected, or that the accumulator overflowed, or any number of other errors that I haven’t thought of.

I think I will need to learn how to define my own error type in Rust, so I google “rust define error type” and land on the “Defining an Error Type” page in Rust by Example and although that helpfully tells me that it’s possible to give parameters to error types, the example doesn’t actually show how to do what I want! My impression a few days ago was that Rust by Example wasn’t all that useful for me, and now that I know a bit more, I think it’s because the examples in Rust by Example tend to be too concise and basic for my purposes. Maybe they would be more useful to jog my memory if I had done these things once before.

The next search result is the Stack Overflow question “How do you define custom Error types in Rust?”. Unfortunately most of the answers to this question consist basically of “I am the author of a crate that lets you do this,” and I’ve gotten burned often enough by “Hey, you should use my npm package for this” that it seems to me I should stay far away from those! The bottom answer does usefully tell me that I don’t actually need to implement the Error trait, I can just use my own enum. It doesn’t show what that looks like, but having just read about enums I think I can figure it out from here.

enum VMError {
    InvalidString(String),
    InvalidOpcode(String),
    InvalidParameter(String),
    InfiniteLoop,
}

impl VM {
    fn assemble_line(&self, line: &str) -> Result<(), VMError> {
        Ok(())
    }

    fn run(&self) -> Result<(), VMError> {
        Ok(())
    }
}

I wonder if I have to write a constructor as well, since I don’t want to have to create a VM object by specifying values for acc, pc, code, and visited. I google “rust constructor” and don’t find a very good resource to answer my question, but I manage to piece together something from a few different resources:

fn new() -> Self {
    VM {
        acc: 0,
        pc: 0,
        code: vec![],
        visited: BitVec::new(),
    }
}

Now that I have the outline, I’d like to start by writing the code for assemble_line(). What this needs to do is take the first three characters of the string and determine the opcode from them, and parse position 4 until the end of the string into the integer parameter. Then it needs to store the resulting Instruction in the code vector, and append a zero to the visited bit vector. Here’s what I come up with:

fn assemble_line(&mut self, line: &str) -> Result<(), VMError> {
    let opcode_string = &line[..3];
    let parameter_string = &line[4..];
    let parameter = parameter_string
        .parse::<i16>()
        .map_err(|_| VMError::InvalidParameter(parameter_string.to_string()))?;
    let instruction = match opcode_string {
        "nop" => Instruction::NOP,
        "acc" => Instruction::ACC(parameter),
        "jmp" => Instruction::JMP(parameter),
        _ => return Err(VMError::InvalidOpcode(opcode_string.to_string())),
    };
    self.code.push(instruction);
    self.visited.push(false);
    Ok(())
}

A few observations:

  • self needs to be mutable, which I had forgotten.
  • It’s actually quite verbose to construct a VMError, so I wonder if I’m doing it right after all. (At first I tried simply InvalidParameter(parameter_string), but the error type needs to own the string.)
  • I don’t need VMError::InvalidString after all, it’s covered by InvalidParameter and InvalidOpcode.

I then try to write some code in main() to test the new function:

fn main() -> Result<(), io::Error> {
    let file = fs::File::open("input")?;
    let mut vm = VM::new();
    for line in read_lines(file) {
        vm.assemble_line(&line)?;
    }
    println!("{:?}", vm.code);

    Ok(())
}

I note happily that if I forget the ? on assemble_line(), the compiler will tell me that the error must be handled, even though the successful case doesn’t return anything. But other than that, I’m sad, because I get this error:

error[E0277]: `?` couldn't convert the error to `std::io::Error`
  --> puzzle8.rs:62:32
   |
58 | fn main() -> Result<(), io::Error> {
   |              --------------------- expected `std::io::Error` because of this
...
62 |         vm.assemble_line(&line)?;
   |                                ^ the trait `std::convert::From<VMError>` is not implemented for `std::io::Error`
   |
   = note: the question mark operation (`?`) implicitly performs a conversion on the error value using the `From` trait
   = help: the following implementations were found:
             <std::io::Error as std::convert::From<std::ffi::NulError>>
             <std::io::Error as std::convert::From<std::io::ErrorKind>>
             <std::io::Error as std::convert::From<std::io::IntoInnerError<W>>>
   = note: required by `std::convert::From::from`

I guess this is what that Stack Overflow answer meant when it said I didn’t have to implement the Error trait if my error types lined up … and they do not line up. I try a few things (like changing the return type of main() to Result<(), impl Error>) and go back to search results for “rust custom error”, but don’t make any progress. Then I realize that there is a section in the Error Handling in Rust blog post that Federico sent me, that is relevant to this. I haven’t reached that part yet, but I skip ahead to it and read it.

Copying some code from that page, I manage to cobble this together:

impl fmt::Display for VMError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match &*self {
            VMError::InvalidOpcode(opcode) => write!(f, "Unknown opcode {}", opcode),
            VMError::InvalidParameter(param) => {
                write!(f, "Parameter {} not a 16-bit integer", param)
            }
            VMError::InfiniteLoop => write!(f, "Infinite loop detected"),
        }
    }
}

impl Error for VMError {}

impl convert::From<VMError> for io::Error {
    fn from(err: VMError) -> io::Error {
        io::Error::new(io::ErrorKind::Other, err)
    }
}

The & on match &*self confuses me, here’s another lifetime <'_> that I don’t get, and on the whole this is definitely code that I wouldn’t have been able to write without copying it from somewhere, but it works! What I do understand about it is that for the ? operator to be able to convert my custom VMError into io::Error, I need to implement the From<VMerror> trait for io::Error1, and for that trait, VMError also needs to implement Error, and for Error it also needs to implement Display.

On to the run() method! This needs to clear the program counter, accumulator, and visited vector to 0, and execute the instructions one by one, updating the accumulator and program counter as necessary, and marking each one as visited. If the program counter reaches the end of the instructions, then the program is done, but if it reaches an instruction that was already visited, we have hit an infinite loop.

fn run(&mut self) -> Result<(), VMError> {
    self.pc = 0;
    self.acc = 0;
    self.visited.set_all(false);
    let code_length = self.code.len();
    while self.pc < code_length {
        if self.visited[self.pc] {
            return Err(VMError::InfiniteLoop);
        }
        self.visited.set(self.pc, true);
        match self.code[self.pc] {
            Instruction::NOP => self.pc += 1,
            Instruction::ACC(value) => {
                self.acc += value;
                self.pc += 1;
            }
            Instruction::JMP(distance) => self.pc += distance,
        };
    }
    Ok(())
}

The big mistake that I made here is that I can’t add the i16 parameter to self.acc: i32 or self.pc: usize. So I google “rust convert i16 to i32” and find out about the as operator. Writing self.acc += param as i32 works for the accumulator, but self.pc += distance as isize does not (and self.pc += distance as usize would be incorrect); but neither can I make self.pc into an isize instead of usize, because I need it to index into the vectors. What I would really like for this situation is a method that does checked addition of a signed type to an unsigned type, but I browse through the documentation for usize and such a method doesn’t seem to exist. I do find this Stack Overflow answer that has a good implementation that I can copy, though:

fn checked_jump(pc: usize, jump: i16) -> Option<usize> {
    if jump.is_negative() {
        pc.checked_sub(jump.wrapping_abs() as u16 as usize)
    } else {
        pc.checked_add(jump as usize)
    }
}

Then I can change the JMP case to self.pc = checked_jump(self.pc, distance).ok_or(VMError::InvalidJump)? once again gladly using the diagram of Rust Result/Option Transformations.

Now I test this by putting vm.run()?; in my main() function and I do indeed see that an infinite loop error gets printed. I just need to handle this error and print the accumulator, and I’ll have my answer:

let err = vm.run().unwrap_err();
match err {
    VMError::InfiniteLoop => println!("{}", vm.acc),
    _ => return Err(err.into()),
}

I’m sure this is not the most idiomatic way to expect a particular error enum, but I get my answer, and it is correct according to the Advent of Code website. The full code is a bit long so I won’t put it here, but you can scroll down to see it at the end of Part 2.

Day 8, Part 2

Part 2 of the puzzle is to figure out which instruction causes the infinite loop and should be repaired. We know that exactly one instruction has been corrupted, and that it’s either a NOP which was changed to JMP, or vice versa. (So that’s why NOP instructions have parameters!) We also now know that the program is supposed to terminate when the program counter points exactly one past the end of the code. (In Part 1 I assumed that it could point anywhere past the end, even if it was way past.)

First I refactor the Part 1 code. I add the i16 parameter to Instruction::NOP, and this requires writing the match expression as Instruction::NOP(_) =>. Then I check that we don’t jump more than 1 past the end, and return a new VMError::PastTheEnd if we do, by adding this to the end of run():

if self.pc != code_length {
    Err(VMError::PastTheEnd)
} else {
    Ok(())
}

I will have to iterate through all the instructions one by one, flip JMP to NOP and vice versa, and check if the program terminates; if not, flip back. I will add a repair_instruction(pc: &usize) method that will do the flipping:

fn repair_instruction(&mut self, pc: usize) -> bool {
    match self.code[pc] {
        Instruction::NOP(param) => {
            self.code[pc] = Instruction::JMP(param);
            true
        },
        Instruction::ACC(_) => false,
        Instruction::JMP(param) => {
            self.code[pc] = Instruction::NOP(param);
            true
        }
    }
}

I give it a return value so that I know I can skip running the program if the instruction was ACC.

Then I add the following code to main(), to flip each instruction, run the program, and flip back if that didn’t fix the problem:

for pc in 0..vm.code.len() {
    if !vm.repair_instruction(pc) {
        continue;
    }
    if vm.run().is_ok() {
        break;
    }
    vm.repair_instruction(pc);
}
println!("{}", vm.acc);

To my surprise, this all basically runs and gives me the correct answer the first time! I did give repair_instruction() a &usize parameter originally, but it turned out that wasn’t necessary. Fastest Part 2 so far!

Here’s the full code (the day-to-day boilerplate is now so small compared to the rest of the program, I may as well include it):

use bitvec::prelude::*;
use std::convert;
use std::env;
use std::error::Error;
use std::fmt;
use std::fs;
use std::io::{self, BufRead};

#[derive(Debug)]
enum Instruction {
    NOP(i16),
    ACC(i16),
    JMP(i16),
}

#[derive(Debug)]
enum VMError {
    InvalidOpcode(String),
    InvalidParameter(String),
    InvalidJump,
    PastTheEnd,
    InfiniteLoop,
}

impl fmt::Display for VMError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match &*self {
            VMError::InvalidOpcode(opcode) => write!(f, "Unknown opcode {}", opcode),
            VMError::InvalidParameter(param) => {
                write!(f, "Parameter {} not a 16-bit integer", param)
            }
            VMError::InvalidJump => write!(f, "Negative jump overflow"),
            VMError::PastTheEnd => write!(f, "Positive jump overflow"),
            VMError::InfiniteLoop => write!(f, "Infinite loop detected"),
        }
    }
}

impl Error for VMError {}

impl convert::From<VMError> for io::Error {
    fn from(err: VMError) -> io::Error {
        io::Error::new(io::ErrorKind::Other, err)
    }
}

struct VM {
    acc: i32,
    pc: usize,
    code: Vec<Instruction>,
    visited: BitVec,
}

impl VM {
    fn new() -> Self {
        VM {
            acc: 0,
            pc: 0,
            code: vec![],
            visited: BitVec::new(),
        }
    }

    fn assemble_line(&mut self, line: &str) -> Result<(), VMError> {
        let opcode_string = &line[..3];
        let parameter_string = &line[4..];
        let parameter = parameter_string
            .parse::<i16>()
            .map_err(|_| VMError::InvalidParameter(parameter_string.to_string()))?;
        let instruction = match opcode_string {
            "nop" => Instruction::NOP(parameter),
            "acc" => Instruction::ACC(parameter),
            "jmp" => Instruction::JMP(parameter),
            _ => return Err(VMError::InvalidOpcode(opcode_string.to_string())),
        };
        self.code.push(instruction);
        self.visited.push(false);
        Ok(())
    }

    fn run(&mut self) -> Result<(), VMError> {
        self.pc = 0;
        self.acc = 0;
        self.visited.set_all(false);
        let code_length = self.code.len();
        while self.pc < code_length {
            if self.visited[self.pc] {
                return Err(VMError::InfiniteLoop);
            }
            self.visited.set(self.pc, true);
            match self.code[self.pc] {
                Instruction::NOP(_) => self.pc += 1,
                Instruction::ACC(value) => {
                    self.acc += value as i32;
                    self.pc += 1;
                }
                Instruction::JMP(distance) => {
                    self.pc = checked_jump(self.pc, distance).ok_or(VMError::InvalidJump)?
                }
            };
        }
        if self.pc != code_length {
            Err(VMError::PastTheEnd)
        } else {
            Ok(())
        }
    }

    fn repair_instruction(&mut self, pc: usize) -> bool {
        match self.code[pc] {
            Instruction::NOP(param) => {
                self.code[pc] = Instruction::JMP(param);
                true
            }
            Instruction::ACC(_) => false,
            Instruction::JMP(param) => {
                self.code[pc] = Instruction::NOP(param);
                true
            }
        }
    }
}

// https://stackoverflow.com/a/54035801/172999
fn checked_jump(pc: usize, jump: i16) -> Option<usize> {
    if jump.is_negative() {
        pc.checked_sub(jump.wrapping_abs() as u16 as usize)
    } else {
        pc.checked_add(jump as usize)
    }
}

fn main() -> Result<(), io::Error> {
    let file = fs::File::open("input")?;
    let mut vm = VM::new();
    for line in read_lines(file) {
        vm.assemble_line(&line)?;
    }

    if is_part2() {
        for pc in 0..vm.code.len() {
            if !vm.repair_instruction(pc) {
                continue;
            }
            if vm.run().is_ok() {
                break;
            }
            vm.repair_instruction(pc);
        }
        println!("{}", vm.acc);
    } else {
        let err = vm.run().unwrap_err();
        match err {
            VMError::InfiniteLoop => println!("{}", vm.acc),
            _ => return Err(err.into()),
        }
    }

    Ok(())
}

fn is_part2() -> bool {
    env::args().nth(1).map(|val| val == "2").unwrap_or(false)
}

fn read_lines(file: fs::File) -> impl Iterator<Item = String> {
    io::BufReader::new(file).lines().map(|res| res.unwrap())
}

Afterword

Today’s puzzle went more smoothly than yesterday’s, especially Part 2. The most time-consuming part was figuring out all of the fuss required to make a custom error type that implements the Error trait. That was certainly not as easy as I had expected it would be, and I will probably avoid doing it for future puzzles.

In hindsight, it probably would have been a lot easier if I had used my custom error type only for runtime errors in the VM. Then I wouldn’t have had to implement the Error trait in order to be able to return those errors from main().

The bitvec package, on the other hand, was super easy to use.

Finally, to be clear, unlike in the first few days’ blog posts, I am no longer writing down every mistake that I make! I am still making plenty of mistakes, but I’m finding that it no longer helps me that much to log all of them, because I am starting to understand what the compiler errors mean. This doesn’t mean that I’ve learned quick enough to write all this code without mistakes!


[1] Is adding a trait to a standard library type as bad as monkeypatching a built-in object in JavaScript though? I’m not sure