Skip to main content

1st week of Advent of Code '22 in Rust

· 13 min read
Matej Focko

Let's go through the first week of Advent of Code in Rust.

note

If you wish to have a look at the solutions, you can follow them on my GitLab. More specifically in the /src/bin/.

I will try to summarize my experience with using Rust for the AoC. Trying it out ages ago, I believe it will be pain and suffering, but we will see. For each day I will also try to give a tl;dr of the problem, so that you can better imagine the relation to my woes or 👍 moments.

Day 1: Calorie Counting

tl;dr

As the name suggests, we get the calories of the food contained in the elves backpacks and we want to choose the elf that has the most food ;)

Wakey wakey!

Programming in Rust at 6am definitely hits. I've also forgotten to mention how I handle samples. With each puzzle you usually get a sample input and expected output. You can use them to verify that your solution works, or usually doesn't.

At first I've decided to put asserts into my main, something like

assert_eq!(part_1(&sample), 24000);
info!("Part 1: {}", part_1(&input));

assert_eq!(part_2(&sample), 45000);
info!("Part 2: {}", part_2(&input));

However, once you get further, the sample input may take some time to run itself. So in the end, I have decided to turn them into unit tests:

#[cfg(test)]
mod tests {
use super::*;

#[test]
fn test_part_1() {
let sample = parse_input("samples/day01.txt");
assert_eq!(part_1(&sample), 24000);
}

#[test]
fn test_part_2() {
let sample = parse_input("samples/day01.txt");
assert_eq!(part_2(&sample), 45000);
}
}

And later on I have noticed, it's hard to tell the difference between the days, so I further renamed the mod from generic tests to reflect the days.

Also after finishing the first day puzzle, I have installed an sccache to cache the builds, so that the build time is lower, cause it was kinda unbearable.

Solution

Well, it's a pretty simple problem. You just take the input, sum the calories and find the biggest one. However, if we try to generalize to more than the biggest one, the fun appears. We have few options:

  • keep all the calories, sort them, take what we need
  • keep all the calories and use max heap
  • use min heap and maintain at most N calories that we need

Day 2: Rock Paper Scissors

tl;dr

You want to know what score did you achieve while playing Rock Paper Scissors. And then you want to be strategic about it.

Apart from the technical details of the puzzle, it went relatively smooth.

Solution

I took relatively naïve approach and then tried to simplify it.

Day 3: Rucksack Reorganization

tl;dr

Let's go reorganize elves' backpacks! Each backpacks has 2 compartments and you want to find the common item among those compartments. Each of them has priority, you care only about the sum.

This is the day where I started to fight the compiler and neither of us decided to give up. Let's dive into it \o/

Fun fact

Fighting the compiler took me 30 minutes.

We need to find a common item among 2 collections, that's an easy task, right? We can construct 2 sets and find an intersection:

let top: HashSet<i32> = [1, 2, 3].iter().collect();
let bottom: HashSet<i32> = [3, 4, 5].iter().collect();

Now, the first issue that we encounter is caused by the fact that we are using a slice (the […]), iterator of that returns references to the numbers. And we get immediately yelled at by the compiler, because the numbers are discarded after running the .collect. To fix this, we can use .into_iter:

let top: HashSet<i32> = [1, 2, 3].into_iter().collect();
let bottom: HashSet<i32> = [3, 4, 5].into_iter().collect();

This way the numbers will get copied instead of referenced. OK, let's find the intersection of those 2 collections:

println!("Common elements: {:?}", top.intersection(&bottom));
Common elements: [3]
caution

Notice that we need to do &bottom. It explicitly specifies that .intersection borrows the bottom, i.e. takes an immutable reference to it.

That's what we want, right? Looks like it! \o/

Next part wants us to find the common element among all of the backpacks. OK, so that should be fairly easy, we have an intersection and we want to find intersection over all of them.

Let's have a look at the type of the .intersection

pub fn intersection<'a>(
    &'a self,
    other: &'a HashSet<T, S>
) -> Intersection<'a, T, S>

OK… Huh… But we have an example there!

let intersection: HashSet<_> = a.intersection(&b).collect();

Cool, that's all we need.

let top: HashSet<i32> = [1, 2, 3, 4].into_iter().collect();
let bottom: HashSet<i32> = [3, 4, 5, 6].into_iter().collect();
let top_2: HashSet<i32> = [2, 3, 4, 5, 6].into_iter().collect();
let bottom_2: HashSet<i32> = [4, 5, 6].into_iter().collect();

let intersection: HashSet<_> = top.intersection(&bottom).collect();
println!("Intersection: {:?}", intersection);
Intersection: {3, 4}

Cool, so let's do the intersection with the top_2:

let top: HashSet<i32> = [1, 2, 3, 4].into_iter().collect();
let bottom: HashSet<i32> = [3, 4, 5, 6].into_iter().collect();
let top_2: HashSet<i32> = [2, 3, 4, 5, 6].into_iter().collect();
let bottom_2: HashSet<i32> = [4, 5, 6].into_iter().collect();

let intersection: HashSet<_> = top.intersection(&bottom).collect();
let intersection: HashSet<_> = intersection.intersection(&top_2).collect();
println!("Intersection: {:?}", intersection);

And we get yelled at by the compiler:

error[E0308]: mismatched types
--> src/main.rs:10:58
|
10 | let intersection: HashSet<_> = intersection.intersection(&top_2).collect();
| ------------ ^^^^^^ expected `&i32`, found `i32`
| |
| arguments to this function are incorrect
|
= note: expected reference `&HashSet<&i32>`
found reference `&HashSet<i32>`

/o\ What the hell is going on here? Well, the funny thing is, that this operation doesn't return the elements themselves, but the references to them and when we pass the third set, it has just the values themselves, without any references.

tip

It may seem as a very weird decision, but in fact it makes some sense… It allows you to do intersection of items that may not be possible to copy. Overall this is a „tax“ for having a borrow checker drilling your ass having your back and making sure you're not doing something naughty that may cause an undefined behavior.

To resolve this we need to get an iterator that clones the elements:

let top: HashSet<i32> = [1, 2, 3, 4].into_iter().collect();
let bottom: HashSet<i32> = [3, 4, 5, 6].into_iter().collect();
let top_2: HashSet<i32> = [2, 3, 4, 5, 6].into_iter().collect();
let bottom_2: HashSet<i32> = [4, 5, 6].into_iter().collect();

let intersection: HashSet<_> = top.intersection(&bottom).cloned().collect();
let intersection: HashSet<_> = intersection.intersection(&top_2).cloned().collect();
let intersection: HashSet<_> = intersection.intersection(&bottom_2).cloned().collect();
println!("Intersection: {:?}", intersection);
Intersection: {4}

Solution

The approach is pretty simple, if you omit the 1on1 with the compiler. You just have some fun with the set operations :)

Day 4: Camp Cleanup

tl;dr

Elves are cleaning up the camp and they got overlapping sections to clean up. Find how many overlap and can take the day off.

RangeInclusive is your friend not an enemy :)

Solution

Relatively easy, you just need to parse the input and know what you want. Rust's RangeInclusive type helped a lot, cause it took care of all abstractions.

Day 5: Supply Stacks

tl;dr

Let's play with stacks of crates.

Very easy problem with very annoying input. You can judge yourself:

    [D]    
[N] [C]
[Z] [M] [P]
1 2 3

move 1 from 2 to 1
move 3 from 1 to 3
move 2 from 2 to 1
move 1 from 1 to 2

Good luck transforming that into something reasonable :)

Fun fact

Took me 40 minutes to parse this reasonably, including fighting the compiler.

Solution

For the initial solution I went with a manual solution (as in I have done all the work. Later on I have decided to explore the std and interface of the std::vec::Vec and found split_off which takes an index and splits (duh) the vector:

let mut vec = vec![1, 2, 3];
let vec2 = vec.split_off(1);
assert_eq!(vec, [1]);
assert_eq!(vec2, [2, 3]);

This helped me simplify my solution a lot and also get rid of some edge cases.

Day 6: Tuning Trouble

tl;dr

Finding start of the message in a very weird protocol. Start of the message is denoted by NN unique consecutive characters.

Solution

A lot of different approaches, knowing that we are dealing with input consisting solely of ASCII letters, I bit the bullet and went with sliding window and constructing sets from that window, checking if the set is as big as the window.

One possible optimization could consist of keeping a bit-vector (i.e. usize variable) of encountered characters and updating it as we go. However this has a different issue and that is removal of the characters from the left side of the window. We don't know if the same character is not included later on.

Other option is to do similar thing, but keeping the frequencies of the letters, and again knowing we have only ASCII letters we can optimize by having a vector of 26 elements that keeps count for each lowercase letter.

Day 7: No Space Left On Device

tl;dr

Let's simulate du to get some stats about our file system and then pinpoint directories that take a lot of space and should be deleted.

I was waiting for this moment, and yet it got me! imagine me swearing for hours

Solution

We need to „build“ a file system from the input that is given in a following form:

$ cd /
$ ls
dir a
14848514 b.txt
8504156 c.dat
dir d
$ cd a
$ ls
dir e
29116 f
2557 g
62596 h.lst
$ cd e
$ ls
584 i
$ cd ..
$ cd ..
$ cd d
$ ls
4060174 j
8033020 d.log
5626152 d.ext
7214296 k

There are few ways in which you can achieve this and also you can assume some preconditions, but why would we do that, right? :)

You can „slap“ this in either HashMap or BTreeMap and call it a day. And that would be boring…

tip

BTreeMap is quite fitting for this, don't you think?

I always wanted to try allocation on heap in Rust, so I chose to implement a tree. I fought with the Box<T> for some time and was losing…

Then I looked up some implementations of trees or linked lists and decided to try Rc<Cell<T>>. And I got my ass whopped by the compiler once again. /o\

tip

Box<T> represents a dynamically allocated memory on heap. It is a single pointer, you can imagine this as std::unique_ptr<T> in C++.

Rc<T> represents a dynamically allocated memory on heap. On top of that it is reference counted (that's what the Rc stands for). You can imagine this as std::shared_ptr<T> in C++.

Now the fun stuff. Neither of them lets you mutate the contents of the memory.

Cell<T> allows you to mutate the memory. Can be used reasonably with types that can be copied, because the memory safety is guaranteed by copying the contents when there is more than one mutable reference to the memory.

RefCell<T> is similar to the Cell<T>, but the borrowing rules (how many mutable references are present) are checked dynamically.

So in the end, if you want something like std::shared_ptr<T> in Rust, you want to have Rc<RefCell<T>>.

So, how are we going to represent the file system then? We will use an enumeration, hehe, which is an algebraic data type that can store some stuff in itself 😩

type FileHandle = Rc<RefCell<AocFile>>;

#[derive(Debug)]
enum AocFile {
File(usize),
Directory(BTreeMap<String, FileHandle>),
}

Let's go over it! FileHandle represents dynamically allocated AocFile, not much to discuss. What does the #[derive(Debug)] do though? It lets us to print out the value of that enumeration, it's derived, so it's not as good as if we had implemented it ourselves, but it's good enough for debugging, hence the name.

Now to the fun part! AocFile value can be represented in two ways:

  • File(usize), e.g. AocFile::File(123) and we can pattern match it, if we need to
  • Directory(BTreeMap<String, FileHandle>) will represent the directory and will contain map matching the name of the files (or directories) within to their respective file handles

I will omit the details about constructing this file system, cause there are a lot of technicalities introduced by the nature of the input. However if you are interested, you can have a look at my solution.

We need to find small enough directories and also find the smallest directory that will free enough space. Now the question is, how could we do that. And there are multiple ways I will describe.

I have chosen to implement tree catamorphism 😩. It is basically a fold over a tree data structure. We descent down into the leaves and propagate computed results all the way to the root. You can also notice that this approach is very similar to dynamic programming, we find overlapping sections of the computation and try to minimize the additional work (in this case: we need to know sizes of our descendants, but we have already been there).

Another approach that has been suggested to me few days later is running DFS on the graph. And, funnily enough, we would still need to combine what we found in the branches where we descent. So in the end, it would work very similarly to my solution.

One of the more exotic options would be precomputing the required information at the same time as parsing. That could be done by adding additional fields to the nodes which would allow storing such information and updating it as we construct the file system.

Post Mortem

Things that have been brought up in the discussion later on.

Rc<T> vs Rc<RefCell<T>>

It has been brought up that I have a contradicting statement regarding the dynamically allocated memory. Specifically:

  • You can imagine Rc<T> as an std::shared_ptr<T> (in C++)
  • When you want an equivalent of std::shared_ptr<T>, you want to use Rc<RefCell<T>>

Now, in Rust it is a bit more complicated, because the type that represents the „shared pointer“ is Rc<T>. What RefCell<T> does is making sure that there is only one „owner“ of a mutable reference at a time (and dynamically, as opposed to the Cell<T>).

Therefore to be precise and correct about the equivalents of std::shared_ptr<T> in Rust, we can say that

  • Rc<T> is an equivalent of a const std::shared_ptr<T>,
  • and Rc<RefCell<T>> is an equivalent of a std::shared_ptr<T>.

You can easily see that they only differ in the mutability. (And even that is not as simple as it seems, because there is also Cell<T>)