Return to Answer

added 81 characters in body

Source Link

edited Jan 2, 2022 at 0:09

Davislor

edited Jan 2, 2022 at 0:09

Davislor

9.1k
19
39

An alternative interface is std::format, which often has very low overhead.

added 478 characters in body

Source Link

edited Jan 1, 2022 at 23:46

Davislor

edited Jan 1, 2022 at 23:46

Davislor

9.1k
19
39

You’d also save yourself from exposing the entire definition of your node and allowing anyone who requests a value to arbitrarily tamper with your tree.

Use Const Where Appropriate

Currently, you don’t have any const functions, so if someone creates a const RBTree, they can’t do anything with it. Since your interface returns Node*, there’s nothing stopping people from seeing all your tree’s internals and even changing them. You really don’t want this.

You’d also save yourself from exposing the entire definition of your node and allowing anyone who requests a value to arbitrarily tamper with your tree.

Use Const Where Appropriate

Source Link

answered Jan 1, 2022 at 23:38

Davislor

answered Jan 1, 2022 at 23:38

Davislor

9.1k
19
39

This ia a great follow-up and deserves a closer look than I can give it right now, but a handful of things jumped out at me.

Write the Class as a Template

You’ve got a nice tree class, but it’d be a shame to re-write the whole thing whenever you want to hold a different data type. You’re better off writing this as a class template and instantiating RBTree<Transaction>.

Duck-Type to Standard Interfaces

For example, if you name your begin and end iterators .begin() and .end(), a lot of the standard library will work with them, and if you call them anything else, it won’t. And I like camelCase too, but the C++ standard library uses snake_case.

A minor instance of this is that your serialization function is named toString, in camelCase, but the C++ standard library uses to_string(s), s.to_string(), or both, and you might well find a template duck-typed to expect your type to, as well.

On a larger scale:

Follow the STL Interface as Closely as Possible

Or as closely as you feel like. I’d honestly suggest you just leave out everything relating to allocators and comparators at first, and if you’re still interested in them when you have something you like, you can go back and add them.

This is a good practice for two reasons. First, it gives you a data structure you can use as a drop-in replacement for std::multimap (if you’re storing and comparing values) or the more complicated std::map (if you have different keys and values). This will be much more useful to anyone considering actually using your code, and not only that, it lets you benchmark your implementation against the one in your compiler’s library.

Second, it’s a great way to learn your way around the STL.

So, you can start with a STL reference, pick a place to start, and copy and paste those declarations from the public interface into your class declaration. Then implement them beneath, until you’ve filled in enough of the interface to be able to write a test driver.

Use Separate Files

Maybe you’re just flattening it to make your post shorter, but it looks like you’ve got everything in a blob right now.

You really want a "RBTree.h" and a test driver like main.cpp that calls it. Most of your code looks like it would need to go in your header, but you might have some non-template definitions that belong in a RBTree.cpp as well. A "Transaction.h" and Transaction.cpp would also be good.

Write Portable Strings

Your Transaction::toString function currently has the string ": -\x9C" in two places. Since there’s no comment, I’m left to assume that you want the program to run in IBM code page 850, where \x9E is £.

This was the default on some versions of Windows, but not in all countries. Windows 11 changes this to UTF-8, and most other operating systems also use UTF-8 by default. Most cannot support code page 850 at all. Since there’s no comment, someone compiling this file would have no way of knowing that it needs the execution character set and current code page to be 850.

What you want to do instead is use Unicode escape codes. Since the pound sign is U+00A3, you can write ": -\u00A3". This should still work if you compile on Windows with cl /execution-charset:.850 (although I admit I haven’t actually tested that) but these days, I would just compile with /utf-8 in Visual Studio, or use the default settings for almost any other compiler.

If you save as UTF-8 with a byte-order mark, every compiler should automatically detect the source character set and understand ": -£". This will still work if you choose an execution character set different from your source character set.

You should be saving your source files as UTF-8 with a byte-order mark, as this is the only format that all major compilers using their default settings understand. If you use anything else, definitely warn maintainers in the comments!

While you’re at it, adding the line

using namespace std::literals;

will let you write ": -£"s to make this literal a std::string rather than a const char* const. You can also use ": -£"sv for a std::string_view, which is often more efficient, but that doesn’t help you concatenate with +.

Smart Pointers

You ask about changing your shared_ptr links to plain C-style pointers. This could work—after all, it’s how people code binary trees in C—but is a bad idea. You would need to do manual memory management again.

So starting with shared_ptr and getting that to work was a great idea. I’d recommend you next learn to write this using node_type = std::unique_ptr<Node>; for your left and right pointers. This forces you to use plain pointers for any other weak references to those nodes (such as pointers back up to the parent, or pointers stored in iterators), but that’s okay, because the memory management still is automatic, with RAII.

Not only is unique_ptr more efficient than shared_ptr and a lot less likely to corrupt your heap than a C-style *, it allows you to implement functions like extract from the STL, and see what they’re used for. Many modern C++ libraries now pass around unique_ptr as efficient, versatile object handles, so it’s well worth understanding them.

You said in a previous discussion that you got compiler errors when you tried to use std::unique_ptr. This is probably because you cannot copy a unique_ptr, at least not by writing up2 = up1 or std::unique_ptr<Node>(up).

But you can do other things with one. You can swap two of them with std::swap, which is an efficient way to rotate a subtree. You can move the pointer, usually by writing up2 = std::move<up1> or foo(std::move(up1)). That transfers the pointer to a different pointer and invalidates the original. You can dereference it to access the object it points to, or get a dumb pointer to the object, or even .release ownership of it. You can turn it into a shared_ptr, and a few other things.

It might take a little while to get used to, but it’s actually a good fit for your problem, because every node should either be the root, or the left or right child of exactly one node in the tree. So, it should have one unique node that owns it, and if the same node were ever linked two different places in the tree, it would be a serious bug. If you implement .extract(), it becomes possible to cut a node off its tree, but then it’s pruned from the tree, so there’s still only ever one pointer to it. You can also splice a unique_ptr to a node back into the tree, or a different one, but only by moving and consuming it, so there is still only one canonical unique_ptr to the node in existence at any time. In fact, the way .extract and .insert(node_type&&) are specified is a strong hint from the language standard that the STL map and set containers are supposed to use unique_ptr to store their nodes.

But this is a more advanced technique, and it’s possible to make the tree work without it.

Iterators?

I suggested above that you implement the std::multimap interface, which instead of returning pointers to a node, either returns the stored value by reference, or an iterator to the node. The implementation would probably actually be a wrapper to a RBTree::Node*.

The main things an iterator can do are dereference with *it (returning the data by reference), increment with ++it (which you already have, but spell .successor()), decrement with --it (which you already have, but spell predecessor()), and compare with == and !=.

You have the right general idea by returning Node*, but your library would be immediately be able to use all the STL algorithms, ranges, initialization and so on by turning them into iterators. And you’ve already done most of the work!

Some Implementation Tips

In the comments, @greybeard linked to the left-leaning red-black tree data structure, which goes to some great references, including this paper.

You don’ need to use that particular algorithm. If you do, one tip is that there’s a single representation of a 2-node, a 3-node and a 4-node if you happen to use those, and those could all be created as subtrees, by constructors or factories with two, three or four rvalue-reference arguments.

default