I've started learning C++ and I wanted to ask you to take a look at my linked list implementation and give comments what's wrong with. It compiles and runs fine, but I am curious about const correctness, proper use of pointers/references, etc., so feel free to go super tough on me!
I've read about these concepts online but this doesn't get me far without an experienced C++ programmer's opinion.
Node:
#include <iostream>
#pragma once
template <typename T>
class Node
{
const char * const _key;
Node *_next;
T _value;
public:
Node(const char * const key, const T &value) : _key(key), _value(value), _next(nullptr) {}
const char * const getKey()
{
return _key;
}
Node *getNext()
{
return _next;
}
void setNext(Node *next)
{
_next = next;
}
T &getValue()
{
return _value;
}
};
Linked List:
#include <iostream>
#include "Node.h"
#pragma once
template <typename T>
class LinkedList
{
int _count;
Node<T> *_first;
public:
LinkedList() : _count(0), _first(nullptr) {}
Node<T> *add(const char * const key, const T &value)
{
Node<T> *first = new Node<T>(key, value);
first->setNext(_first);
_first = first;
_count++;
return _first;
}
void remove(const char * const key)
{
Node<T> *next = _first;
Node<T> *previous = nullptr;
while (next)
{
if (next->getKey() == key)
{
if (next == _first)
_first = next->getNext();
else
previous->setNext(next->getNext());
_count--;
return;
}
previous = next;
next = next->getNext();
}
}
int getCount()
{
return _count;
}
Node<T> * const getFirst()
{
return _first;
}
};
I am not too certain how to properly define value inside Node class, because I want to support both primitive types and objects.
This actually worked for me, although this is probably not correct?
if (next->getKey() == key)
-
1\$\begingroup\$ I would also recommend using sentinels on your list. It makes the code easier to write as you don't need to special case them empty list (As there is always a sentinel). \$\endgroup\$Loki Astari– Loki Astari2014年06月30日 19:26:49 +00:00Commented Jun 30, 2014 at 19:26
3 Answers 3
There are quite a few issues with your code as it stands. I know you've said to go "super tough" on you, but the amount of knowledge required to write a relatively complete linked list implementation in C++ is more than just about any other language.
As a minor point up-front, I'm going to change anything that has a prefixed _
(like _first
or _key
) to using a postfix underscore (so first_
and key_
). The reason for this is that certain symbols are reserved for compiler usage; the rules are arcane, but the simplest thing to do is:
- Avoid anything with a leading underscore (
_<varname>
) - Avoid anything with 2 trailing underscores (
<varname>__
)
Edit: As pointed out by @JerryCoffin, you should avoid __
just about anywhere (prefix, postfix, and so on).
The first (and largest) problem is that this never frees any of the heap allocated memory (that is, memory allocated with new
). Every time you do:
Node<T> *first = new Node<T>(key, value);
This will allocate memory from the heap. Unlike garbage collected languages (most languages except C and C++), it is up to you, the programmer, to free this memory when you are done with it. Here, this means manually writing the destructor:
~LinkedList()
{
Node<T>* next = _first;
while(next != nullptr) {
first_ = first_->next;
delete next;
next = first_;
}
}
This will make sure that memory is reclaimed.
To go along with this, there are a few methods that are implicitly generated for you when you write a class in C++. These are the copy assignment operator, and the copy constructor. They have the forms:
LinkedList& operator=(const LinkedList& rhs); // Copy assignment
LinkedList(const LinkedList& rhs); // Copy constructor
When are these called? Say you define a LinkedList<int>
:
LinkedList<int> a;
// Do some operations on a, such as adding nodes
LinkedList<int> b;
// Some operations on b
b = a; // Calls the copy assignment operator, operator=
LinkedList<int> c(a); // Calls the copy constructor
The methods that C++ implements for you here will do something that you do not want them to do: the b.first_
pointer will be equal to a.first_
. What this means is that, if the destructor for a
is called, then b
will be pointing to memory locations that have been reclaimed using delete
(which is very, very bad).
To fix this, they need to be implemented to do the correct thing. So, what is the correct thing to do? Let's look at them each in turn:
The copy constructor just needs to copy each element of the passed in LinkedList
in turn:
LinkedList(const LinkedList& rhs)
{
Node<T>* tmp = rhs.first_;
while(tmp != nullptr) {
add(tmp->key_, tmp->value_);
}
}
The copy assignment operator needs to:
- Free the current memory held by
this
- Copy each value in the list
rhs
So, the implementation looks something like:
LinkedList& operator=(const LinkedList& rhs)
{
// Stop self assignment, like: a = a;
if(&rhs != this) {
// Step 1: Free current memory
Node<T>* tmp = first_;
while(tmp->next != nullptr) {
first_ = first_->next;
delete tmp;
tmp = first_;
}
count_ = 0;
// Step 2: Create a copy of rhs
temp = rhs.first_;
while(tmp != nullptr) {
add(tmp->key_, tmp->value_);
}
}
return *this;
}
This is actually a very well known thing in C++: it is called the rule of three (or in more modern C++, the rule of five - but ignore this for the moment). What it says is:
If you need to manually write any one of the destructor/copy constructor/copy assignment operator, then you need to manually define all three of them.
For a full treatment of this, you might want to read this post.
Using a const char * const
as a key in a linked list is unusual, and doesn't really fit with the basic idea of the data structure. If you want to look something up by key, you're generally looking for a data structure called a map
(or hashmap
, or dictionary
; there are multiple names for it). I'd remove the complication there, and just get rid of the key
member entirely.
void add(const T &value)
{
Node<T> *first = new Node<T>(key, value);
first->next_ = first_;
first_ = first;
count_++;
}
I've also removed the return of the Node<T> *
. You generally don't want to let the users of your class have direct access to a Node<T> *
.
Your remove
function also leaks memory, you again need to have a delete
in there.
Node<T>* tmp = next;
if (next == _first) {
_first = next->getNext();
delete tmp; // Need a delete here
}
else {
previous->setNext(next->getNext());
delete tmp; // Or here
}
_count--;
Edit: How you've written a search is much more dictionary-like. For the standard library list, you'd often use something like std::find_if
, which takes a predicate (which is just a fancy word for a function that returns a bool). This allows you to use an actual function to select what you're looking for. As an example, if you have a Student
struct:
struct Student
{
std::string name;
int age;
int year;
};
Then you had a list of Students:
std::list<Student> students;
You might want to search for a student who has the name "John": this would look like:
std::find_if(students.begin(), students.end(), [](const Student& s)
{ return s.name == "John"; });
In fact, this will actually return an iterator into the container.
This also gives you more flexibility, as you can use the same function to find a student who is, say, 17 years old:
std::find_if(students.begin(), students.end(), [](const Student& s)
{ return s.age == 17; });
Edit 2: As was pointed out, the copy assignment operator is not written in the best way. Firstly, understanding why this is the case is important. If you look at the steps we go through, we firstly free all the memory currently held, and then repopulate the current list. The problem with this is that if anything throws an exception during either of these, we'll be left with object state that is inconsistent. The way to fix this is to use what is commonly termed "copy and swap". This also has the nice property that it allows us some code reuse, cutting down on duplication.
The first port of call is to write a (member) swap
function:
template <typename T>
void LinkedList<T>::swap(LinkedList<T>& other)
{
using std::swap;
swap(first_, other.first_);
swap(count_, other.count_);
}
Then a free function that forwards to this:
template <typename T>
void swap(LinkedList<T>& a, LinkedList<T>& b)
{
a.swap(b);
}
Now we can write a copy assignment operator as follows:
LinkedList& operator=(LinkedList rhs)
{
swap(*this, rhs);
return *this;
}
Note that the rhs
parameter is now taken by value. This will perform a copy construction, hence once we enter the call, rhs
will contain a full copy. Then all that needs to happen is a few pointer swaps. Note also that the original memory that was held is destroyed: when rhs
goes out of scope at the end of the function, its destructor is called. Since it now points to (what was) the original memory for this
, that memory is properly reclaimed.
This has a number of benefits: Firstly, it is succinct and reuses code, secondly, it is exception safe: if any operation throws, our object is left in a consistent state - the update is effectively atomic; it either works completely or fails completely.
Using swap
, you can also easily write a move assignment operator.
-
\$\begingroup\$ Minor point: you could also mention that
getCount()
should beconst
. \$\endgroup\$Jamal– Jamal2014年06月30日 05:10:53 +00:00Commented Jun 30, 2014 at 5:10 -
\$\begingroup\$ thank you for your review, yeah I am aware of delete, I just somehow wasn't sure whether to include it inside the datastructure or call it outside, but I see now. However if I remove a key how would I search for node or what is the point of find function in the first place if I am just going to pass data as a parameter? \$\endgroup\$Michael Naumov– Michael Naumov2014年06月30日 05:12:54 +00:00Commented Jun 30, 2014 at 5:12
-
\$\begingroup\$ @MichaelNaumov You'd search for it via the value, so something like
value == search_value
. \$\endgroup\$Yuushi– Yuushi2014年06月30日 05:22:54 +00:00Commented Jun 30, 2014 at 5:22 -
1\$\begingroup\$ Couple of issues. 1) C++ has fine grain deterministic garbage collection (smart pointers). Any conversation on new and delete should mention this. 2) When talking about the compiler generated methods you should mention the "Rule of Three" Please make it more prominent at the top of your discussion. 3) Your implementation of assignment operator is outdated (and bad (never alter state (delete) until you have the value to swap into its place)). The modern way of doing this is via "Copy And Swap idiom" \$\endgroup\$Loki Astari– Loki Astari2014年06月30日 19:24:30 +00:00Commented Jun 30, 2014 at 19:24
-
1\$\begingroup\$ The three stages in assignment are: 1) Copy without modifying state of the destination object (as the act of copying can generate an exception and you want to make sure you put1 your object in a bad state). 2) Swap out the old value with the new value (in an exception safe manor) 3) Delete/Free the old value. Does not matter if it fails as (2) made sure the state of the object is good. \$\endgroup\$Loki Astari– Loki Astari2014年06月30日 20:24:17 +00:00Commented Jun 30, 2014 at 20:24
getters and setters
The point of declaring an element private is to restrict an access to it. Providing an unchecked getter and setter (as
getNext()
andsetNext()
) defeats this purpose. You may as well make_next
public.The real goal of restricting access is to help the client to not mess things up (a scientific term is maintaining an invariant). I don't see what invariant is maintained by
_value
. TheNode
class doesn't really care what value the node stores; it is the client code that does. A_value
getter just complicates client's life.Sorely missed are iterators. Their absence locks your class out of the STL algorithms library. A client would have to reimplement everything that STL provides for free.
Your Node
implementation is atypical of a common implementation. A node is really just a part of the linked list and just needs two data members: its data and a pointer to the next node (and a pointer to the previous node for a doubly-linked list). It does not need its own member functions, but it could also have its own constructor, instead of having the list itself take care of it.
template <typename T>
struct Node
{
T data;
Node* next;
Node(T data) : data(data), next(nullptr) {}
};
Although this can be a class
, it's okay to make it a struct
because it can just be defined inside of the LinkedList
class, safely as a private
data member.
Misc.:
You should remove
<iostream>
since it's unused, unless you're going to implement a display function inLinkedList
. If you overloadoperator<<
instead, then you'll just need<ostream>
. Regardless, this doesn't belong toNode
.getCount()
should beconst
as it's not supposed to modify any data members:int getCount() const { return _count; }
-
\$\begingroup\$ hi thanks for your comment, the key was so I can use that in future for hashtable implementation, but I should probably derive from node to do that. Can you explain why use std::unique_ptr as opposed to *? Is it just for automatic garbage collection? Also shouldn't the constructor now be Node(std::unique_ptr<T> &data)? \$\endgroup\$Michael Naumov– Michael Naumov2014年06月30日 04:42:58 +00:00Commented Jun 30, 2014 at 4:42
-
\$\begingroup\$ @MichaelNaumov: Yes, and also ownership. You could still use raw pointers for a data structure implementation (hardly anything else), but you'll have to be sure manage memory correctly. Unfortunately, you aren't doing that because you use
new
withoutdelete
. \$\endgroup\$Jamal– Jamal2014年06月30日 04:45:22 +00:00Commented Jun 30, 2014 at 4:45 -
\$\begingroup\$ @MichaelNaumov: No, the constructor should only take a value, not a pointer. When a new node is created, it should automatically point to
nullptr
or a sentinel node. \$\endgroup\$Jamal– Jamal2014年06月30日 16:43:21 +00:00Commented Jun 30, 2014 at 16:43