57

I am maintaining a fixed-length table of 10 entries. Each item is a structure of like 4 fields. There will be insert, update and delete operations, specified by numeric position. I am wondering which is the best data structure to use to maintain this table of information:

  1. array - insert/delete takes linear time due to shifting; update takes constant time; no space is used for pointers; accessing an item using [] is faster.

  2. stl vector - insert/delete takes linear time due to shifting; update takes constant time; no space is used for pointers; accessing an item is slower than an array since it is a call to operator[] and a linked list .

  3. stl list - insert and delete takes linear time since you need to iterate to a specific position before applying the insert/delete; additional space is needed for pointers; accessing an item is slower than an array since it is a linked list linear traversal.

Right now, my choice is to use an array. Is it justifiable? Or did I miss something?

Which is faster: traversing a list, then inserting a node or shifting items in an array to produce an empty position then inserting the item in that position?

What is the best way to measure this performance? Can I just display the timestamp before and after the operations?

Mohan
1,9772 gold badges21 silver badges45 bronze badges
asked Dec 15, 2009 at 5:51
5
  • 4
    Why do you need to do any array shifting if it's a fixed size? Commented Dec 15, 2009 at 5:56
  • 2
    What are you actually implementing? Commented Dec 15, 2009 at 6:00
  • 1
    @Myles: It's like a ranking information. Think of it like a list of 10 prices sorted in increasing order. If a new price comes and I have to insert it into the middle rank, then the current last price has to go. So I need to shift some positions to make way for the new rank. Commented Dec 15, 2009 at 6:30
  • 1
    "accessing an item is slower than an array since it is a call to operator[]" - Ok, stop right there. Commented Jul 16, 2013 at 15:59
  • Another option is a std::array, if the length truly is fixed. Commented Jul 16, 2013 at 16:00

8 Answers 8

65

Use STL vector. It provides an equally rich interface as list and removes the pain of managing memory that arrays require.

You will have to try very hard to expose the performance cost of operator[] - it usually gets inlined.

I do not have any number to give you, but I remember reading performance analysis that described how vector<int> was faster than list<int> even for inserts and deletes (under a certain size of course). The truth of the matter is that these processors we use are very fast - and if your vector fits in L2 cache, then it's going to go really really fast. Lists on the other hand have to manage heap objects that will kill your L2.

answered Dec 15, 2009 at 5:53
8
  • 3
    I read somewhere in some post it was looping through more than 1000 items comparing array and vector and in the vector's case, it was mentioned it took longer... and the answers said that it is because it is a template, using operator[] produces overhead...? Commented Dec 15, 2009 at 5:57
  • 7
    @jasonline I use vectors all over a game I wrote for the iPhone. I have never once run into a performance issue with the use of operator[]. These are what we call "micro optimizations" - they're micro because they have little effect on the overall performance of the app. While I spent years learning the lessons of micro optimization, I highly recommend you avoid them and focus on creating a great app. Commented Dec 15, 2009 at 6:01
  • 2
    Actually vector<int> is always faster than list<int> for inserts and deletes if you have to search for the element, especially on large datasets. While vector operations are always O(N) and list operations are on average O(N/2), the list operations are dominated by unordered memory access time which makes them individually much more expensive than the extra N/2 operations the vector version uses. Commented Dec 15, 2009 at 6:02
  • 1
    Ok, you mentioned vector<int> is faster under a certain size. In what case will it be that list<int> will become faster than the vector<int>? Commented Dec 15, 2009 at 6:04
  • 1
    Bjarne Stroustrup is pretty critical of list<T> and favors vector<T> here : channel9.msdn.com/Events/GoingNative/GoingNative-2012/… Commented Mar 4, 2012 at 16:52
32

Premature optimization is the root of all evil.

Based on your post, I'd say there's no reason to make your choice of data structure here a performance based one. Pick whatever is most convenient and return to change it if and only if performance testing demonstrates it's a problem.

answered Dec 15, 2009 at 5:54
2
  • 2
    Yes, I know the performance doesn't matter so much in this case since it is only a few data but I was just wondering... Commented Dec 15, 2009 at 5:59
  • 4
    When your writing small application your faced with Log(1) problems. When you've gotten the basics down, you can start to know what you need to scale too. When you know what you need to scale too, not some theoretical limit eg..."We need to scale in the future". Then you can choose algorithms and data structures to suit. Until then, just use std::vector. Commented Dec 15, 2009 at 6:24
25

It is really worth investing some time in understanding the fundamental differences between lists and vectors. The most significant difference between the two is the way they store elements and keep track of them.

- Lists -

List contains elements which have the address of a previous and next element stored in them. This means that you can INSERT or DELETE an element anywhere in the list with constant speed O(1) regardless of the list size. You also splice (insert another list) into the existing list anywhere with constant speed as well. The reason is that list only needs to change two pointers (the previous and next) for the element we are inserting into the list.

Lists are not good if you need random access. So if one plans to access nth element in the list - one has to traverse the list one by one - O(n) speed

- Vectors -

Vector contains elements in sequence, just like an array. This is very convenient for random access. Accessing the "nth" element in a vector is a simple pointer calculation (O(1) speed). Adding elements to a vector is, however, different. If one wants to add an element in the middle of a vector - all the elements that come after that element will have to be re allocated down to make room for the new entry. The speed will depend on the vector size and on the position of the new element. The worst case scenario is inserting an element at position 2 in a vector, the best one is appending a new element. Therefore - insert works with speed O(n), where "n" is the number of elements that need to be moved - not necessarily the size of a vector.

There are other differences that involve memory requirements etc., but understanding these basic principles of how lists and vectors actually work is really worth spending some time on.

As always ... "Premature optimization is the root of all evil" so first consider what is more convenient and make things work exactly the way you want them, then optimize. For 10 entries that you mention - it really does not matter what you use - you will never be able to see any kind of performance difference whatever method you use.

answered Jul 16, 2013 at 14:23
7

Prefer an std::vector over and array. Some advantages of vector are:

  • They allocate memory from the free space when increasing in size.
  • They are NOT a pointer in disguise.
  • They can increase/decrease in size run-time.
  • They can do range checking using at().
  • A vector knows its size, so you don't have to count elements.

The most compelling reason to use a vector is that it frees you from explicit memory management, and it does not leak memory. A vector keeps track of the memory it uses to store its elements. When a vector needs more memory for elements, it allocates more; when a vector goes out of scope, it frees that memory. Therefore, the user need not be concerned with the allocation and deallocation of memory for vector elements.

answered Dec 15, 2009 at 6:06
3
  • Regarding the size, I don't think it matters in this case because I'm maintaining a fixed size of 10. I agree though that at() provides security in accessing elements. Commented Dec 15, 2009 at 6:09
  • 1
    @jasonline Well, if all you want is to maintain a list of 10 elements and don't need the conveniences offered by the standard library (like templates), just use a statically allocated array. C++ is a flexible language, anyway! Commented Dec 15, 2009 at 6:15
  • 2
    The question explicitly said that the table is fixed size, so I'd recommend std::array, which gives the benefits of std::vector without the overhead for resizing. Why pay for something you aren't going to use? Commented Jul 16, 2013 at 16:02
3

You're making assumptions you shouldn't be making, such as "accessing an item is slower than an array since it is a call to operator[]." I can understand the logic behind it, but you nor I can know until we profile it.

If you do, you'll find there is no overhead at all, when optimizations are turned on. The compiler inlines the function calls. There is a difference in memory performance. An array is statically allocated, while a vector dynamically allocates. A list allocates per node, which can throttle cache if you're not careful.

Some solutions are to have the vector allocate from the stack, and have a pool allocator for a list, so that the nodes can fit into cache.

So rather than worry about unsupported claims, you should worry about making your design as clean as possible. So, which makes more sense? An array, vector, or list? I don't know what you're trying to do so I can't answer you.

The "default" container tends to be a vector. Sometimes an array is perfectly acceptable too.

answered Dec 15, 2009 at 6:01
1
  • Yes, I can agree with the default container being the vector. Commented Dec 15, 2009 at 6:14
1

First a couple of notes:

A good rule of thumb about selecting data structures: Generally, if you examined all the possibilities and determined that an array is your best choice, start over. You did something very wrong.

STL lists don't support operator[], and if they did the reason that it would be slower than indexing an array has nothing to do with the overhead of a function call.

Those things being said, vector is the clear winner here. The call to operator[] is essentially negligible since the contents of a vector are guaranteed to be contiguous in memory. It supports insert() and erase() operations which you would essntially have to write yourself if you used an array. Basically it boils down to the fact that a vector is essentially an upgraded array which already supports all the operations you need.

answered Dec 15, 2009 at 6:11
2
  • Yes, you're right. Lists don't have operator[], my mistake. What made my choice of array wrong? Because of the vector operator[] and the thought that I still need to implement the operations in an array compared to the vector. Right? Commented Dec 15, 2009 at 6:25
  • Vijay Mathew did a good job outlining the benefits of the vector over the array, but also keep in mind that since the vector keeps it's contents contiguous in memory the array has no benefits over the vector. You aren't entirely wrong to assume that operator[] has worse performance than indexing an array, but under the hood the complier will probably reduce them to exactly the same thing. Commented Dec 15, 2009 at 9:51
0

I am maintaining a fixed-length table of 10 entries. Each item is a structure of like 4 fields. There will be insert, update and delete operations, specified by numeric position. I am wondering which is the best data structure to use to maintain this table of information:

Based on this description it seems like list might be the better choice since its O(1) when inserting and deleting in the middle of the data structure. Unfortunately you cannot use numeric positions when using lists to do inserts and deletes like you can for arrays/vectors. This dilemma leads to a slew of questions which can be used to make an initial decision of which structure may be best to use. This structure can later be changed if testing clearly shows its the wrong choice.

The questions you need to ask are three fold. The first is how often are you planning on doing deletes/inserts in the middle relative to random reads. The second is how important is using a numeric position compared to an iterator. Finally, is order in your structure important.

If the answer to the first question is random reads will be more prevalent than a vector/array will probably work well. Note iterating through a data structure is not considered a random read even if the operator[] notation is used. For the second question, if you absolutely require numeric position than a vector/array will be required even though this may lead to a performance hit. Later testing may show this performance hit is negligible relative to the easier coding with numerical positions. Finally if order is unimportant you can insert and delete in a vector/array with an O(1) algorithm. A sample algorithm is shown below.

template <class T>
void erase(vector<T> & vect, int index) //note: vector cannot be const since you are changing vector
{
 vect[index]= vect.back();//move the item in the back to the index
 vect.pop_back(); //delete the item in the back
}
template <class T>
void insert(vector<T> & vect, int index, T value) //note: vector cannot be const since you are changing vector
{
 vect.push_back(vect[index]);//insert the item at index to the back of the vector
 vect[index] = value; //replace the item at index with value
}
answered Oct 30, 2014 at 4:14
0

I Believe it's as per your need if one needs more insert/to delete in starting or middle use list(doubly-linked internally) if one needs to access data randomly and addition to last element use array ( vector have dynamic allocation but if you require more operation as a sort, resize, etc use vector)

answered Jun 26, 2020 at 11:04

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.