I was reading about LinkedList
and ArrayList
. I found many people say that iteration through an ArrayList
is faster than LinkedList
but in the same time adding and removing elements from LinkedList
is faster than ArrayList
.
How to prove that states?
I've written an experiment in Java and ran it in debug mode but I didn't see any differences between ArrayList
and LinkedList
.
What is your approach to this issue and How to prove what people say is true?
Here is my code :
import java.util.ArrayList;
import java.util.LinkedList;
public class List {
public static void main(String[] args) {
ArrayList<Integer> arraylist = new ArrayList<Integer>();
arraylist.add(1);
arraylist.add(2);
arraylist.add(3);
arraylist.add(4);
arraylist.add(5);
arraylist.remove(1);
arraylist.remove(2);
for(int x: arraylist) {
System.out.println(x);
}
LinkedList<Integer> arraylista = new LinkedList<Integer>();
arraylista.add(1);
arraylista.add(2);
arraylista.add(3);
arraylista.add(4);
arraylista.add(5);
arraylista.remove(1);
arraylista.remove(2);
for(int x: arraylista) {
System.out.println(x);
}
}
}
3 Answers 3
Due to the way they're implemented, iterating across a linked list may be slightly more expensive in terms of the extra overhead due to handling the objects itself, but there's very little in terms of performance that would lead me to believe one is faster than the other.
The real differences come in to inserting elements, as you've alluded to yourself. We can talk about this in terms of what the actual implementation is and in terms of concepts.
Suppose then that you want to insert an element to your list. For an ArrayList
, you have some decisions to make:
- Are we inserting elements to the front, middle, or the rear of the list?
- Is the list full or near full?
Note that ArrayList
will expand the entire array around the index location you place the value in, and will then insert, unless you're adding at the end - it will just add it into the single location.
In that instance, the amortized runtime of an add operation for an ArrayList
is ~O(n) for any insertions that aren't at the end, and (generally) O(1) for insertions at the end of the list (depending on if it needs to make more room for its internal array).
/**
* Appends the specified element to the end of this list.
*
* @param e element to be appended to this list
* @return <tt>true</tt> (as specified by {@link Collection#add})
*/
public boolean add(E e) {
ensureCapacityInternal(size + 1); // Increments modCount!!
elementData[size++] = e;
return true;
}
/**
* Inserts the specified element at the specified position in this
* list. Shifts the element currently at that position (if any) and
* any subsequent elements to the right (adds one to their indices).
*
* @param index index at which the specified element is to be inserted
* @param element element to be inserted
* @throws IndexOutOfBoundsException {@inheritDoc}
*/
public void add(int index, E element) {
rangeCheckForAdd(index);
ensureCapacityInternal(size + 1); // Increments modCount!!
System.arraycopy(elementData, index, elementData, index + 1,
size - index);
elementData[index] = element;
size++;
}
For a LinkedList
, the decision is generally easier - it's O(1) to add a node to either the front or the back of the list, due to it being doubly linked. However, it costs a guaranteed O(n) to insert to any index location otherwise, since you have to iterate over some amount of elements before you find the node you're to insert between.
A benchmark may be helpful (although your benchmark is woefully broken), but looking closer at the implementation may be better in this scenario.
-
hi .. what you mean by O(1) ? and what you mean by LinkedList is doubly linked.? thanksuser5595273– user559527312/12/2015 21:42:19Commented Dec 12, 2015 at 21:42
-
1O(1) means that it's just as fast independent of the size of the list. doubly linked means the nodes have references to both the previous node and the next node.simon– simon12/12/2015 21:49:58Commented Dec 12, 2015 at 21:49
-
@kolsyrad God bless you brother :)user5595273– user559527312/12/2015 21:51:47Commented Dec 12, 2015 at 21:51
-
So O(n) is dependent of the size of the list , is it true?user5595273– user559527312/12/2015 21:54:19Commented Dec 12, 2015 at 21:54
An important note before you read this answer: These are only potential implementations and not the actual ones, the principles behind it should still hold however.
An ArrayList
uses an array as its inner structure, consider this array:
int[] array = new int[1];
This array will hold one element. Let's pretend we can add to it using the method defined in ArrayList
, it would look like this:
array.add(0);
What happens when we have to add another element to this array? It clearly only holds one item. Well it will have to be resized, a common way of resizing is to double it's capacity. So let's do that:
int[] temp = new int[2];
for(int i = 0; i < array.length; i++) {
temp[i] = array[i];
}
array = temp;
What would happen if we were to add two more elements? We would have to create yet another temporary array, copy the values over, and insert them again. This is one of the reasons adding to a LinkedList
is faster. A LinkedList
is not built with the help of an underlying array.
A LinkedList
consists of nodes that all hold references to either the previous and the next node or just the next node, so inserting into the end of it is very cheap if you save a reference to the last element of the LinkedList
Depending on how far you've come in learning Java I'd highly recommend that you implement these data structures from scratch, it's very helpful to understand how they work!
@Makoto and @CommuSoft mention very important aspects of the differences that I did not cover, so read those as well.
There are some problems with these statements. First of all they are too generic: insertion in a LinkedList<T>
can be faster than insertion in an ArrayList<T>
if it is done at the beginning or the end, but not necessary on an arbitrary place. For instance inserting an element in a LinkedList<T>
with n elements at the n-1'th place will in general be more expensive for a LinkedList<T>
than for an ArrayList<T>
. So first of all one has to be specific.
Secondly it is hard to prove this for every possible architecture, runtime environment, operating system, etc.
Time complexity
Therefore what most people mean is that this is true in terms of time complexity. Time complexity define the order of the amount of instructions necessary to perform a certain task.
Insertion
In short: insertion into a
LinkedList<T>
is time complexity-wise more efficient that insertion into anArrayList<T>
.
Now let's take the example of insertion at the end. In general a LinkedList<T>
maintains a reference to the last node of the linked list. It can thus access the last node in a constant amount of operations. Furthermore it sets the reference of the next
the last node to the new last node, again in a constant amount of operations. Finally it sets the reference to the last node to the new last node in a constant amount of operations. Therefore the entire task is carried out in a constant number of operations. Computer scientists denote this with O(1).
An ArrayList<T>
has more problems inserting an element. It maintains an array inside that can store elements as well as a counter that maintains the current amount of elements inside that array. In case you want to insert an element there are two possible cases:
- There is still space in the array, in that case you simply add the element to the array (in constant time) and update the counter (in constant time). This operation is thus carried out in constant time.
- It can happen that the internal array is full. In that case, you construct a new (larger) array. Copy all the elements into that array and add the element to that array. Copying cannot be done in constant time: in order to copy n elements, you need something in the order of n operations. Therefore it can take linear time to carry out this operation.
In general insertion in an ArrayList<T>
is thus done in O(n) time.
Computer scientists always study the behavior for a large n (hundreds, thousands, millions, etc.). They claim that O(1) < O(n) because they say that "there exists an n - the number of items already in the list - for which from that point insertion in a LinkedList<T>
will always be cheaper than insertion into an ArrayList<T>
".
Iterating
In short: iterating over a
LinkedList<T>
is time complexity-wise equivalent to iterating over anArrayList<T>
.
When it comes to iterating over both lists, LinkedList<T>
and ArrayList<T>
will have the same time complexity. The iterator for a LinkedList<T>
will - at every point - hold a reference to the current node. In case it has to advance, it will fetch a reference to the .getNext()
node, and access the content in that node. This happens all in constant time O(1).
The iterator for an ArrayList<T>
happens by using a counter that stores the amount of thus far iterated elements (minus one). When iterating is advanced, the counter is incremented and the item at that index is fetched. This all happens in constant time O(1).
Random access
In short: random access for a
LinkedList<T>
is time complexity-wise more expensive than random access for anArrayList<T>
.
If you plan to perform random access meaning you provide an index i and you want to fetch the element located at position i, you are better off using an ArrayList<T>
.
In case you want to access the i-th element of a LinkedList<T>
you will need to iterate through the first i nodes in the list. Iterating i times, requires a workload that scales with i, so the time complexity is O(i). Since i can scale up to n (the amount of elements in the list), random access for linked lists is O(n).
For an ArrayList<T>
random access is much faster: it simply fetches the element from the internal list, which can be done in constant time O(1).
etc.
For other operations the same analysis can be made. @hege_hegedus shared in his/her comment a link to a "cheatsheet" that lists the time complexity of most operations on these data structures.
"many people say that iteration through an arraylist is faster than linkedlist"
-- they're both pretty fast with iteration, but an AL is generally faster with random access, i.e., with calling theget(...)
method.