Basic binary tree manipulation in Rust

Question 1

The goal here is to implement a basic binary tree with values only on leaves, with depth, mirror, and in_order (traversal) operations.

I have not used Rust before.

Some particular questions I have:

In a few places, I'm defining methods by passing self by reference and then matching on its derefenced form. I then have to borrow the fields with ref in the destructuring. Is this the intended form?
Is into_in_order using Vec properly/optimally? (I tried to use just lv.extend(&mut rv) but got an error that that particular method was still under churn and I should wait for "the dust to settle.")
Am I using Boxes correctly? When should I use Boxes vs. * consts?
Why do (*r).mirror() and r.mirror() do the same thing? I would have expected the latter to throw because Boxes don't have a mirror method.
Is there a less verbose alternative to the Branch(Box::new(Branch(...))) syntax?

#[derive(Debug, PartialEq, Clone)]
enum BTree<T> {
 Leaf(T),
 Branch(Box<BTree<T>>, Box<BTree<T>>),
}
impl <T> BTree<T> {
 fn depth(&self) -> i32 {
 match *self {
 BTree::Leaf(_) => 1,
 BTree::Branch(ref l, ref r) =>
 std::cmp::max(l.depth(), r.depth()) + 1,
 }
 }
 fn into_in_order(self) -> Vec<T> {
 match self {
 BTree::Leaf(val) => vec!(val),
 BTree::Branch(l, r) => {
 let mut lv = l.into_in_order();
 let rv = r.into_in_order();
 lv.extend(rv.into_iter());
 lv
 }
 }
 }
}
impl <T : Clone> BTree<T> {
 fn mirror(&self) -> BTree<T> {
 match *self {
 BTree::Leaf(_) => (*self).clone(),
 BTree::Branch(ref l, ref r) =>
 BTree::Branch(Box::new((*r).mirror()), Box::new((*l).mirror())),
 //
 // why does this work?
 // BTree::Branch(Box::new(r.mirror()), Box::new(l.mirror())),
 }
 }
}
#[test]
#[allow(unused_variables)]
fn test_btree_creation() {
 use BTree::*;
 let leaf: BTree<i32> = Leaf(10);
 let branch: BTree<i32> = Branch(Box::new(Leaf(15)), Box::new(Leaf(20)));
 let tree: BTree<i32> = Branch(Box::new(branch.clone()), Box::new(Leaf(30)));
 assert_eq!(branch, branch.clone());
}
#[test]
fn test_btree_depth() {
 use BTree::*;
 assert_eq!(Leaf(10).depth(), 1);
 let branch: BTree<i32> = Branch(Box::new(Leaf(15)), Box::new(Leaf(20)));
 assert_eq!(branch.depth(), 2);
 let tree: BTree<i32> = Branch(Box::new(branch.clone()), Box::new(Leaf(30)));
 assert_eq!(tree.depth(), 3);
 let other_tree: BTree<i32> = Branch(
 Box::new(branch.clone()), Box::new(branch.clone()));
 assert_eq!(other_tree.depth(), 3);
}
#[test]
fn test_btree_mirror() {
 use BTree::*;
 assert_eq!(Leaf(10).mirror(), Leaf(10));
 assert_eq!(
 Branch(Box::new(Leaf(10)), Box::new(Leaf(20))).mirror(),
 Branch(Box::new(Leaf(20)), Box::new(Leaf(10))));
 assert_eq!(
 Branch(
 Box::new(Leaf(10)),
 Box::new(Branch(Box::new(Leaf(20)), Box::new(Leaf(30))))
 ).mirror(),
 Branch(
 Box::new(Branch(Box::new(Leaf(30)),
 Box::new(Leaf(20)))), Box::new(Leaf(10))));
}
#[test]
fn test_btree_in_order() {
 use BTree::*;
 assert_eq!(Leaf(10).into_in_order(), vec!(10));
 assert_eq!(Branch(Box::new(Leaf(10)), Box::new(Leaf(20))).into_in_order(),
 vec!(10, 20));
 assert_eq!(
 Branch(
 Box::new(Leaf(10)),
 Box::new(Branch(Box::new(Leaf(20)), Box::new(Leaf(30))))
 ).into_in_order(),
 vec!(10, 20, 30));
}

I also have the following macro definitions (not all used above):

macro_rules! assert_eq {
 ($actual:expr, $expected:expr) => (
 if $expected != $actual {
 panic!("expected {:?}, but got {:?}", $expected, $actual);
 }
 )
}
macro_rules! assert_neq {
 ($actual: expr, $expected_not: expr) => (
 if $expected_not == $actual {
 panic!("expected {:?} not to equal {:?}", $expected_not, $actual);
 }
 )
}
macro_rules! assert_approx {
 ($actual: expr, $expected: expr) => {
 if ($expected - $actual).abs() > 1e-3 {
 panic!("expected {:?} or similar, but got {:?}", $expected, $actual);
 }
 }
}

Question 2

Overall, your code seems pretty spot-on. It was easy to read and understand. The most confusing to me was the mirror method, as that's not a common method for trees in my experience. You may want to consider adding some doc-comments, but I certainly wouldn't expect them in code at this level. ^_^

I do have some minor nits, of course!

There's no space between impl and <. impl<T>, not impl <T>.
For such a common and easily-understood function like max, I would go ahead and use the function to allow using it unqualified. For less common functions, I'd use the module, allowing you to skip typing the fully-qualified path.
The vec! macro idiomatically uses []. vec![], not vec!().
There's no space between a generic type and : when defining constraints. T: Clone, not T : Clone.
I prefer to use where clauses for constraints. They read better and are more flexible to modification.
Not only can you say r.mirror() instead of (*r).mirror(), it's definitely preferred. Rust will automatically dereference for you.
Instead of using allow(unused_variables), you can use a leading underscore _ to indicate that you are aware the variable is unused. You could also just not bind it to a variable at all. allow is a big hammer, and has the possibility of hiding variables you meant to use.

Beyond the stylistic issues, there's one performance-related point. Your into_in_order method will allocate N vectors, one for each node in the tree. That will cause a lot of memory churn. Instead, I'd recommend breaking it up into two functions: one that allocates a vector and another that traverses the tree. This way, you can minimize allocations. Of course, benchmarking would be a good idea, but I didn't do that! ^_^

It's also possible you might want to create an iterative version of this, as recursive code does have the possibility of hitting stack limits. Doing this would also allow you to write an Iterator implementation that traverses in-order. Then your method basically becomes a collect(). That might be fun to try next!

Now for your questions:

In a few places, I'm defining methods by passing self by reference and then matching on its dereferenced form. I then have to borrow the fields with ref in the destructuring. Is this the intended form?

Yes, this is pretty typical.

Is into_in_order using Vec properly/optimally?

Ah, yes, good question. I think I touched on this above.

Am I using Boxes correctly?

Looks fine to me.

When should I use Boxes vs. * consts?

You almost never want to use a raw pointer. The compiler cannot help you whatsoever with raw pointers and reading from them requires unsafe code. These will generally come into play when you are writing FFI code or if you are writing the code that underlies a safe abstraction.

Why do (*r).mirror() and r.mirror() do the same thing? I would have expected the latter to throw because Boxes don't have a mirror method.

This was touched on above, but this specific case works because Box implements Deref:

impl<T> Deref for Box<T>
 where T: ?Sized
{
 type Target = T
 fn deref(&self) -> &T
}

Is there a less verbose alternative to the Branch(Box::new(Branch(...))) syntax?

I was wondering the same thing as I read through your tests. They get pretty gnarly as they grow. I don't have any great suggestion, but I wonder if some kind of builder would help. Or perhaps some clever macro implementation that could reduce the visual clutter...

Question 3

Super, thanks! The automatic dereferencing is particularly nice. So I've written BTree::len(&self) -> usize, and converted BTree::into_in_order to (self) -> Vec<T> with a helper BTree::_into_in_order(self, vec: &mut Vec<T>), and this works fine. I'm not quite so clear on the iterator behavior, though. From looking at the collections::Vec implementation, it looks like there's a separate iterator type, and into_iter converts a vector into that type—is that right?

Question 4

(Follow-up: iterators over at codereview.stackexchange.com/questions/110161/…)

Shepmaster Shepmaster 8,77827 silver badges28 bronze badges · Accepted Answer · 2015-11-08 04:18:42Z

Overall, your code seems pretty spot-on. It was easy to read and understand. The most confusing to me was the mirror method, as that's not a common method for trees in my experience. You may want to consider adding some doc-comments, but I certainly wouldn't expect them in code at this level. ^_^

I do have some minor nits, of course!

There's no space between impl and <. impl<T>, not impl <T>.
For such a common and easily-understood function like max, I would go ahead and use the function to allow using it unqualified. For less common functions, I'd use the module, allowing you to skip typing the fully-qualified path.
The vec! macro idiomatically uses []. vec![], not vec!().
There's no space between a generic type and : when defining constraints. T: Clone, not T : Clone.
I prefer to use where clauses for constraints. They read better and are more flexible to modification.
Not only can you say r.mirror() instead of (*r).mirror(), it's definitely preferred. Rust will automatically dereference for you.
Instead of using allow(unused_variables), you can use a leading underscore _ to indicate that you are aware the variable is unused. You could also just not bind it to a variable at all. allow is a big hammer, and has the possibility of hiding variables you meant to use.

Beyond the stylistic issues, there's one performance-related point. Your into_in_order method will allocate N vectors, one for each node in the tree. That will cause a lot of memory churn. Instead, I'd recommend breaking it up into two functions: one that allocates a vector and another that traverses the tree. This way, you can minimize allocations. Of course, benchmarking would be a good idea, but I didn't do that! ^_^

It's also possible you might want to create an iterative version of this, as recursive code does have the possibility of hitting stack limits. Doing this would also allow you to write an Iterator implementation that traverses in-order. Then your method basically becomes a collect(). That might be fun to try next!

Now for your questions:

In a few places, I'm defining methods by passing self by reference and then matching on its dereferenced form. I then have to borrow the fields with ref in the destructuring. Is this the intended form?

Yes, this is pretty typical.

Is into_in_order using Vec properly/optimally?

Ah, yes, good question. I think I touched on this above.

Am I using Boxes correctly?

Looks fine to me.

When should I use Boxes vs. * consts?

You almost never want to use a raw pointer. The compiler cannot help you whatsoever with raw pointers and reading from them requires unsafe code. These will generally come into play when you are writing FFI code or if you are writing the code that underlies a safe abstraction.

Why do (*r).mirror() and r.mirror() do the same thing? I would have expected the latter to throw because Boxes don't have a mirror method.

This was touched on above, but this specific case works because Box implements Deref:

impl<T> Deref for Box<T>
 where T: ?Sized
{
 type Target = T
 fn deref(&self) -> &T
}

Is there a less verbose alternative to the Branch(Box::new(Branch(...))) syntax?

I was wondering the same thing as I read through your tests. They get pretty gnarly as they grow. I don't have any great suggestion, but I wonder if some kind of builder would help. Or perhaps some clever macro implementation that could reduce the visual clutter...

Super, thanks! The automatic dereferencing is particularly nice. So I've written BTree::len(&self) -> usize, and converted BTree::into_in_order to (self) -> Vec<T> with a helper BTree::_into_in_order(self, vec: &mut Vec<T>), and this works fine. I'm not quite so clear on the iterator behavior, though. From looking at the collections::Vec implementation, it looks like there's a separate iterator type, and into_iter converts a vector into that type—is that right?
(Follow-up: iterators over at codereview.stackexchange.com/questions/110161/…)

Stack Exchange Network

Basic binary tree manipulation in Rust

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Hot Network Questions

Basic binary tree manipulation in Rust

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Related

Hot Network Questions