I am reading this post on Big-O
It says that the following code is O(n^2):
bool ContainsDuplicates(String[] strings)
{
    for (int i = 0; i < strings.Length; i++)
    {
        for (int j = 0; j < strings.Length; j++)
        {
            if (i == j) // Don't compare with self
            {
                continue;
            }
            if (strings[i] == strings[j])
            {
                return true;
            }
        }
    }
    return false;
}
But I cannot understand why.
The inner loop does something constant.
So it is a summation from 1 to N of a constant, i.e. a constant number: O(1).
The outer loop is a summation over that O(1).
So I would imagine it is n * O(1).
I think I am misunderstanding something here.
I do not think that all nested loops mean O(n^2), right?
4 Answers
Your mistake is with the inner loop. It does something constant n times, so it is O(n). The outer loop runs the inner loop n times, so it is O(n * n), or O(n^2).

In general, if the number of iterations a loop makes is dependent on the size of the input, it is O(n). And if k such loops are nested, it is O(n^k).
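The multiplication can be checked by counting iterations directly. A minimal sketch in Java (the question's code is C#, but the counting argument is language-independent; the class and method names are illustrative):

```java
public class NestedLoopCount {
    // Count how many times the body of two nested loops over 0..n-1 runs.
    static long bodyRuns(int n) {
        long count = 0;
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                count++; // the O(1) body executes once per (i, j) pair
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(bodyRuns(10)); // prints 100, i.e. n * n
    }
}
```

The inner loop contributes a factor of n, the outer loop another factor of n, so the body runs n * n times.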
- Perhaps an easier-to-understand notation would be to say that the algorithm runs in O(i*j). Here both loops iterate over the length of strings, thus i = j = n, where n is strings.Length. – Newtopian (Sep 26, 2011)
- @Newtopian: You usually assume that the strings are about constant in length; that's actually a pretty good approximation in practice. – Donal Fellows (Sep 26, 2011)
- @Donal: Indeed, with proper knowledge of the data set we can make such assumptions, but as always there are exceptions. DNA analysis comes to mind, where string length can range from a few characters for a codon to many megabytes for a full chromosome. That said, in this particular example the string length is of no consequence, since we iterate over the strings themselves, not their characters (unless the equality operator has been overloaded with some logic that depends on the length of the operands). I doubt, however, that the OP intended to dig this far in his analysis. – Newtopian (Sep 27, 2011)
- @Newtopian: To be fair, the ContainsDuplicates function/method from the question isn't very likely to be used with full chromosomes. – Donal Fellows (Sep 30, 2011)
- @Donal: Probably not, no :-) – Newtopian (Sep 30, 2011)
If the length of the array is n, the test i == j will execute n^2 times. The order of an algorithm is the order of whatever part of it has the highest order. Since i is compared to j n^2 times, the order of the algorithm cannot possibly be less than O(n^2).

For a very large n, doubling n will quadruple the run time.
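The quadrupling can be made concrete by counting the i == j tests the method performs for two input sizes. A sketch in Java (the counts are exact; actual run time only approximately tracks them, as the comments below discuss):

```java
public class DoublingDemo {
    // Count the i == j tests performed by the duplicate check for input size n.
    static long tests(int n) {
        long count = 0;
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                count++; // every (i, j) pair is tested, matches included
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(tests(1000)); // prints 1000000
        System.out.println(tests(2000)); // prints 4000000: doubling n quadruples the count
    }
}
```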
- But isn't the test i == j an O(1)? – user10326 (Sep 25, 2011)
- The test itself is O(1). The inner loop executes the test n times, so it's O(n*1). The outer loop executes the inner loop n times, so it's O(n*n*1). So the code overall is O(n^2). If you execute an O(1) step n^2 times, that's O(n^2). – David Schwartz (Sep 25, 2011)
- Minor nit: actually, for any value of N, doubling it will quadruple the run time; it's just that it doesn't matter that much for small N. – Peter Rowell (Sep 25, 2011)
- @Peter: That's not true. For example, going from n=2 to n=4 may not double run time, because a 32-bit machine may use the same number of memory fetches to read 2 bytes as 4. Similarly, for small n, the overhead of entering the function may be significant, and that's constant. Larger n may have better branch prediction on average. And so on. – David Schwartz (Sep 25, 2011)
- In this context, "very large n" means as n tends toward infinity, but ignoring any inefficiencies that may occur due to large n (such as not fitting in a 32-bit or 64-bit integer). The point of big-O notation is to analyze how an algorithm's performance changes as n increases, so long as it stays within the machine's capabilities. (This may be a quibble in this context, but in general, understanding what big-O notation includes and ignores is important to understanding what it means.) – David Schwartz (Sep 26, 2011)
You are misinterpreting what a constant operation means.
A constant operation is an operation that always executes in a fixed amount of time, independent of the input it receives.
i == j is a constant operation because the statement executes in a fixed amount of time. Let's say it takes 1 unit of time.
But this constant operation is performed (number of values of i) * (number of values of j) times. Say i and j each run 10 times: then completing all the i == j tests in the nested loop takes 100 units of time, and the total varies as the ranges of i and j vary.
We can be sure that a single i == j test takes 1 unit of time, but we cannot know in advance how many times it will be performed.
If the inner loop performs it n times, that takes n units of time, and the outer loop executes the inner loop n times. So in essence the i == j operation is done n^2 times.
Nested loops mean O(n^(number of nested loops)). Here O denotes an upper bound, which means the code will execute in time less than or equal to the value of O(...). This is a guarantee that it will not take longer than that; it can take less time, but not more.
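The same counting argument extends to deeper nesting. A sketch in Java with three nested loops, so the number of nested loops is 3 and the body runs n^3 times (class and method names are illustrative):

```java
public class TripleNesting {
    // With 3 nested loops over 0..n-1, the innermost body runs n^3 times.
    static long bodyRuns(int n) {
        long count = 0;
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                for (int k = 0; k < n; k++) {
                    count++; // executes once per (i, j, k) triple
                }
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(bodyRuns(10)); // prints 1000, i.e. 10^3
    }
}
```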
Nested loops run in O(i1 * i2 * ... * in), where n is the number of nested loops and ix is the number of iterations in loop x. (Or, to put it another way, it's the product of the number of iterations of each of the loops.)

Your pair of loops iterates over the same array twice, which by coincidence is O(n * n), or O(n^2).

What happens inside the innermost loop is treated as constant because it's going to happen every time. Big-O notation isn't really about measuring the performance of linear code; it's more about making relative comparisons of how iterative algorithms behave when given n items of input to deal with. Certain operations that don't actually do any iterating (such as a push or pop on a stack) get classified as O(1) because they handle one element of the data.
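When the loop bounds differ, the counts multiply rather than square. A Java sketch with two hypothetical bounds, rows and cols:

```java
public class ProductOfBounds {
    // With different loop bounds, the body runs rows * cols times, not n^2.
    static long bodyRuns(int rows, int cols) {
        long count = 0;
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                count++; // once per (r, c) pair
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(bodyRuns(3, 7)); // prints 21, i.e. 3 * 7
    }
}
```

In the question's code both loops happen to iterate over the same array, which is why the product collapses to n * n.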
- Not true if i_k is not constant: think about sweepline algorithms. In the worst case they are O(n^2), in the best case O(n) (not counting the initial sort), with some inner loops running over all elements or over none depending on the problem. Also, quicksort implemented without recursion is a nested loop, yet it is still O(n * log n). – ratchet freak (Sep 26, 2011)
- Quicksort with random pivot selection is nondeterministic, which means the O(n log n) figure is an average. It still fits what I said: i1 = n and i2 = log n. – Blrfl (Sep 26, 2011)
- Minor nit: the inner loop should start at i+1 instead of 0. As written, in the worst case (no dupes) half the comparisons are redundant.
- Since comparing two strings is not O(1), this code would actually be O(N^2 * M), where M is the length of the strings.
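The suggested change can be sketched as follows: a Java port of the question's C# method (the class name and test data are illustrative), with the inner loop starting at i + 1 so each unordered pair is compared exactly once:

```java
public class DuplicateCheck {
    // Port of the question's ContainsDuplicates with the inner loop starting
    // at i + 1: roughly half the comparisons of the original, though the
    // asymptotic bound is still O(n^2) string comparisons.
    static boolean containsDuplicates(String[] strings) {
        for (int i = 0; i < strings.length; i++) {
            for (int j = i + 1; j < strings.length; j++) {
                if (strings[i].equals(strings[j])) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(containsDuplicates(new String[] {"a", "b", "c"})); // prints false
        System.out.println(containsDuplicates(new String[] {"a", "b", "a"})); // prints true
    }
}
```

Starting at i + 1 also makes the i == j guard unnecessary, since j is always greater than i.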