Recently, I tried to solve Problem 23 of Project Euler. For that I first create a list of all abundant numbers, called abundants
.
Next I iterate over this list and build another list of all sums of abundant numbers that are below a certain limit. Now I noticed something strange. I use a nested loop to iterate twice over the list. But if I use an array to store the sum it takes some seconds, if I add the sums to an ArrayList it takes hours. What's the reason for that? I thought the costly operation are the two nested loops, but it seems the costly operation is ArrayList#add. Any hints why this is the case?
Here the code for the array:
for (int i = 0; i < abundants.size(); i++) {
for (int j = 0; j < abundants.size(); j++) {
int tot = abundants.get(i) + abundants.get(j);
if (tot <= limit)
isSum[tot] = true;
}
}
}
Here the code for the ArrayList:
ArrayList<Integer> sums = new ArrayList<Integer>();
for (int i = 0; i < abundants.size(); i++) {
for (int j = 0; j < abundants.size(); j++) {
int s = abundants.get(i) + abundants.get(j);
if (!sums.contains(s) && s < limit) {
sums.add(s);
}
}
}
5 Answers 5
Your ArrayList implementation is O(n^3) whereas the other is O(n^2): sums.contains(...)
has to traverse the entire sums
list for every iteration of your inner loop.
-
Could you please add a possible solution (if you know any), because I'd also be interested in that.Bobby– Bobby2011年04月27日 07:30:01 +00:00Commented Apr 27, 2011 at 7:30
-
@Bobby, @Roflcoptr,
if (!sums.contains(s) && s < limit) { sums.add(s); }
should beif (s <= limit) { sums.set(s, True); }
to be equivalent.ikegami– ikegami2011年04月27日 07:30:22 +00:00Commented Apr 27, 2011 at 7:30 -
Can this really be the reason for some seconds vs. some hours?RoflcoptrException– RoflcoptrException2011年04月27日 07:40:01 +00:00Commented Apr 27, 2011 at 7:40
-
It depends really on how big all these lists are, but yes it can be! :)sjr– sjr2011年04月27日 07:41:09 +00:00Commented Apr 27, 2011 at 7:41
-
3So... if you have 20,000 elements in
abundants
, you are going over that 20,000 * 20,000 times. For each of those times, you are doing something. If you are, in the worst case, doing 20,000 more operations on EACH of those iterations, this could take hours or days.sjr– sjr2011年04月27日 07:45:55 +00:00Commented Apr 27, 2011 at 7:45
I think rather that your problem is in ArrayList#contains
, which has to traverse the whole list, thus raising your complexity to O(n^3), as opposed to O(n^2) of the program #1.
Your code isn't equivalent, the .contains()
is more expensive than what you are doing with the raw array. The .contains()
walks the entire array every time is called, you don't do this in the raw array based version.
Because int
can be much faster than Integer
.
Try using Integer[]
in the first case or TIntArrayList
in the second case for comparison.
-
1There is a reason that Java didn't make int, long, float and the others first-class Objects...Bill K– Bill K2011年04月27日 07:21:44 +00:00Commented Apr 27, 2011 at 7:21
-
While
int
is faster thanInteger
, it takesextremely
special circumstances for this to cause a slowdown by a factor of 3000 (hours vs. seconds, as reported in the question).meriton– meriton2011年04月30日 00:39:18 +00:00Commented Apr 30, 2011 at 0:39 -
@meriton, This is very true. I missed that point. The improvement is more likely to be 20% to 300% depending on the situation. The general rule is when comparing two things, compare like for like and don't use it in different way. ;)Peter Lawrey– Peter Lawrey2011年04月30日 16:41:54 +00:00Commented Apr 30, 2011 at 16:41
If you know the (maximum) number of the elements, try to initialize the Array list with a given size:
ArrayList<Integer> sums = new ArrayList<Integer>(abundants.size() * abundants.size());
With that the ArrayList won't have to be resized, this will increase the speed.
-
1contains() still have to traverse the entire ArrayList so the speed increase is marginal. It's better to try to change the runtime complexity from O(n^3) to something more sane.Emil Vikström– Emil Vikström2011年04月27日 07:26:40 +00:00Commented Apr 27, 2011 at 7:26
contains()
might get expensive if it can't perform a binary search.BitSet
would be even better.