Today I came across a blog post titled An Easy Problem Made Hard: Rust & Binary Trees. It presented an interesting challenge to me, because it explained how in Rust this problem became harder due to its model and I wanted to understand if it can be solved in an easier way. I suggest you read it beforehand to understand the challenge and their solution.
The challenge
The goal is to invert a binary tree data structure. The blog post solves it using recursion in both Haskell & C++ and the presents a Rust solution. Their Rust approach creates a data structure like this:
pub struct TreeNode {
pub val: i32,
pub left: Option<Rc<RefCell<TreeNode>>>,
pub right: Option<Rc<RefCell<TreeNode>>>,
}
They point out that this needs three wrapper types. To me Option
made
intuitive sense, since a leaf node needs the None
value. However, Rc
and
RefCell
didn't immediately make sense to me. There is a clear ownership
structure here, so why do we need reference counting? And why would we need a
RefCell
if we're really only doing an in-place mutation?
As an exercise I wanted to see if I could figure out a solution that does away with these two wrappers and create something simpler.
A naive solution
Before implementing inversion, let's create a simpler struct
and see how far
we get:
#[derive(Debug, PartialEq, Eq)]
pub struct TreeNode {
pub val: i32,
pub left: Option<TreeNode>,
pub right: Option<TreeNode>,
}
Let's just compile this structure:
error[E0072]: recursive type `TreeNode` has infinite size
--> src/main.rs:4:1
|
4 | pub struct TreeNode {
| ^^^^^^^^^^^^^^^^^^^
5 | pub val: i32,
6 | pub left: Option<TreeNode>,
| -------- recursive without indirection
|
help: insert some indirection (e.g., a `Box`, `Rc`, or `&`) to break the cycle
|
6 | pub left: Option<Box<TreeNode>>,
| ++++ +
Interesting! Because the tree references the same type (making it recursive),
we can't just do this. Based on my understanding of Rust's ownership model,
this data would live entirely on the stack, but that can't work if Rust can't
determine at compile time how much room it needs to make on the stack for this
structure. To resolve this, the compiler already tells us what do to: Use a Box
type to explicitly allocate on the heap.
If you carefully read the error message, you'll notice that Rc
is suggested
as another option. So it turns out the original post used it to solve more than
the reference counting problem, it also used it to allocate memory on the heap!
The last alternative here, of using references (&
) won't work very well for us,
because it doesn't fit the data model here: We want the tree structure to own
its child elements and with references we'd have to keep them somewhere else.
While this probably works for our small example, it likely creates more
headaches (for example, lifetimes) than it solves problems for us. I also think
this is a neat example of how Rust makes you think about the data model of your
program.
A slightly more complicated structure
Let's update our TreeNode
:
#[derive(Debug, PartialEq, Eq)]
pub struct TreeNode {
pub val: i32,
pub left: Option<Box<TreeNode>>,
pub right: Option<Box<TreeNode>>,
}
This builds successfully now, so let's implement a solution. We start with a naive implementation and see what Rust thinks. We have two options here: We could either mutate the entire tree in-place using a mutable reference or we can take ownership (consume) the object and return a new object. I'll use a mutable reference approach here, but it could be an interesting exercise to solve this by taking ownership, too.
pub fn invert(&mut self) {
if let Some(left) = &mut self.left {
left.invert();
}
if let Some(right) = &mut self.right {
right.invert();
}
(self.left, self.right) = (self.right, self.left)
}
This is pretty straightforward: First we check if we have a child element for both left and right. If so, we invert them recursively, again using mutable references. Then we do a simple swap. What does the compiler think?
error[E0507]: cannot move out of `self.right` which is behind a mutable reference
--> src/main.rs:56:36
|
56 | (self.left, self.right) = (self.right, self.left)
| ^^^^^^^^^^ move occurs because `self.right` has type `Option<Box<TreeNode>>`, which does not implement the `Copy` trait
error[E0507]: cannot move out of `self.left` which is behind a mutable reference
--> src/main.rs:56:48
|
56 | (self.left, self.right) = (self.right, self.left)
| ^^^^^^^^^ move occurs because `self.left` has type `Option<Box<TreeNode>>`, which does not implement the `Copy` trait
For more information about this error, try `rustc --explain E0507`.
It looks like the compiler isn't able to figure out that we're trying to do an in-place swap and treats it as moving the object out (taking ownership). It's easier to see if we write it out in more detail:
let left = self.left;
let right = self.right;
self.left = right;
self.right = left;
The error:
error[E0507]: cannot move out of `self.right` which is behind a mutable reference
--> src/main.rs:58:21
|
58 | let right = self.right;
| ^^^^^^^^^^ move occurs because `self.right` has type `Option<Box<TreeNode>>`, which does not implement the `Copy` trait
|
help: consider borrowing here
|
58 | let right = &self.right;
| +
For more information about this error, try `rustc --explain E0507`.
The suggested solution here (to use a reference) wouldn't work in the next step
because we want to assign the object (self.left = right
), not a reference.
But surely doing an in-place swap should be possible, we're not doing
anything fancy here?
Turns out, yes, it's actually very easy using std::mem::swap
(docs).
This function takes two mutable references and swaps them around. So our
function becomes:
pub fn invert(&mut self) {
if let Some(left) = &mut self.left {
left.invert();
}
if let Some(right) = &mut self.right {
right.invert();
}
std::mem::swap(&mut self.left, &mut self.right);
}
And that's it, suddenly our tree is inverted!
Conclusion
It turns out, we still have two wrappers here, so there is some extra complexity created by Rust. At the same time, we were able to solve this challenge with slightly less overhead than before. This by no means makes this solution better: Trying to solve every challenge in Rust with the most optimal solution will have you spending more time optimising than building your program. For me, it was an interesting challenge to expand my knowledge of Rust and there's some joy in the elegance of finding a smaller solution.
Have a look at the full source code.