This is a problem from CodeSignal. The guaranteed constraints are `1 <= a.length <= 10^5` and `1 <= a[i] <= a.length`. You have to return the first duplicate encountered in `a`, or `-1` if no duplicate is found.
def firstDuplicate(a):
    b = [0] * 100001
    for num in a:
        if b[num] == 1:
            return num
        else:
            b[num] = 1
    return -1
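For illustration (example inputs of my own, not from the problem statement), the function returns the value whose second occurrence appears earliest:

>>> firstDuplicate([2, 1, 3, 5, 3, 2])   # the second 3 comes before the second 2
3
>>> firstDuplicate([2, 4, 3, 5, 1])
-1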
Is there a better way to do this? Perhaps faster or using less memory? The unused `0` index in the `b` array also doesn't feel very clean.
6 Answers
Memory
Your list implementation uses (depending on your architecture) 8 bytes per list element.
>>> import sys
>>> b = [False] * 100001
>>> sys.getsizeof(b)
800064
Note: This is just the memory of the list structure itself. In general, the contents of the list will use additional memory. In the "original" version, this would be pointers to two integer constants (`0` and `1`); in the "original_improved" version, it would be storing pointers to the two boolean singletons (`False` and `True`). Since we're only storing references to two objects in each case, this additional memory is insignificant, so we'll ignore it.
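A quick interactive check (illustrative) that every slot of such a list references the same cached object, so only the pointers are paid for:

>>> b = [0] * 5
>>> all(x is b[0] for x in b)
True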
800kB of memory is not a huge amount, but to be polite, we can reduce it:
import array

def aneufeld_array(a):
    b = array.array('b', (0,)) * (len(a) + 1)
    for num in a:
        if b[num]:
            return num
        b[num] = 1
    return -1
Update: `bytearray` is a better choice than `array.array('b', ...)`!
def aneufeld_bytearray(a):
    b = bytearray(len(a) + 1)
    for num in a:
        if b[num]:
            return num
        b[num] = 1
    return -1
The `bytearray(size)` call creates a tightly packed array of bytes. Unlike lists, which can store different kinds of things in each element, the bytearray, as its name implies, will only store bytes.
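For instance (an illustrative snippet, not from the original answer), each element reads back as a small integer in the range 0-255:

>>> b = bytearray(3)
>>> b
bytearray(b'\x00\x00\x00')
>>> b[1] = 1
>>> b[1]
1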
With this new structure, we now use only 1 byte per flag, so around 100kB of memory:
>>> b = bytearray(100001)
>>> sys.getsizeof(b)
100058
Performance-wise, this solution is close to the original speed, so we're not giving up any significant speed while reducing the memory load to around 12.5% of the original.
We can still do better. Using an entire byte for a single flag is wasteful; we can squeeze 8 flags into each byte. The `bitarray` class does all of the heavy lifting for us:
import bitarray

def aneufeld_bitarray(a):
    b = bitarray.bitarray(len(a) + 1)
    b.setall(False)
    for num in a:
        if b[num]:
            return num
        b[num] = True
    return -1
This gets us down to 12.5kB for the bit-array of flags. Unfortunately, this additional memory optimization comes with an additional speed hit, due to the bit packing and unpacking. The performance is still better than "Sriv_improved" performance, and we're using only 1/64th of the original memory requirement.
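As a sanity check of the bit-packing arithmetic (plain integer math, not a bitarray API call), 100,001 one-bit flags packed 8 to a byte need

>>> (100_001 + 7) // 8   # round up to whole bytes
12501

bytes, which is the roughly 12.5kB quoted above.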
Timing, on my machine:
4.94 ms 4.62 ms 4.55 ms original
3.89 ms 3.85 ms 3.84 ms original_improved
20.05 ms 20.03 ms 19.78 ms hjpotter92
9.59 ms 9.69 ms 9.75 ms Reinderien
8.60 ms 8.68 ms 8.75 ms Reinderien_improved
19.69 ms 19.69 ms 19.40 ms Sriv
13.92 ms 13.99 ms 13.98 ms Sriv_improved
6.84 ms 6.84 ms 6.86 ms aneufeld_array
4.76 ms 4.80 ms 4.77 ms aneufeld_bytearray
12.71 ms 12.65 ms 12.57 ms aneufeld_bitarray
- @superbrain My bad - clearly I needed sleep. Fixed it, ... and then completely replaced it with `bytearray`, which is far superior. – AJNeufeld, Oct 15, 2020 at 15:32
- Yeah, using `repeat` is better in that sense, although it still costs more space (because the `array` overallocates) than `array.array('b', (0,)) * (len(a) + 1)`, which is also shorter and about 2000 (!) times faster (for the worst-case length). – superb rain, Oct 15, 2020 at 16:00
- @superbrain Clearly, Python needs an `array.array(type, length)` constructor, based on all the hoops we're going through to allocate an array of the desired size. – AJNeufeld, Oct 15, 2020 at 16:08
- @superbrain Updated my times using your array multiplication suggestion. Nice. – AJNeufeld, Oct 15, 2020 at 16:16
- Ha! Didn't think it would be noticeable in the solution time. I had only tested repeat-vs-multiply in isolation and repeat took about 0.006 seconds, which I thought was irrelevant in the overall solution time, but I forgot that I print the solution times in milliseconds :-) – superb rain, Oct 15, 2020 at 16:42
I'll propose an alternate implementation: dictionaries are good at storing key-value pairs, but here you don't care about the value:
def first_duplicate(given_list):
    seen = set()
    for value in given_list:
        if value in seen:
            return value
        seen.add(value)
    return -1
A set will buy you basically the same behaviour as the "keys of a dictionary" for these purposes.
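A small illustration of that equivalence (example values of my own):

>>> seen = {2, 1, 3}
>>> 3 in seen                           # membership test on a set
True
>>> 3 in {2: True, 1: True, 3: True}    # "in" on a dict checks its keys
True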
- It's nice and I usually would've written it like this as well, but given that the OP also asked "Perhaps faster or using less memory", I find it worth mentioning that in a worst case like `list(range(1, 10**5+1))`, this not only takes twice as much time as the original but also five times as much memory. (But I guess neither the 17 upvotes nor the OP's acceptance are concerned about that :-) – superb rain, Oct 15, 2020 at 14:28
- @superbrain Certainly; it's not the fastest, but I would venture to say that it's the "most canonical", if there is such a thing. – Reinderien, Oct 15, 2020 at 14:35
- Yes, I certainly agree with that. Like I said, I usually would've written it like this as well (probably have multiple times, even with the same name `seen`). – superb rain, Oct 15, 2020 at 14:38
- This approach applies to arbitrary lists. The OP's approach applies primarily to positive integer lists whose maximum is limited. – GZ0, Oct 15, 2020 at 15:14
- @GZ0 That's how it should be advertised :-) Not quite arbitrary, though. Needs hashable elements. – superb rain, Oct 15, 2020 at 17:30
Benchmark and slightly improved versions of some solutions.
Congratulations: in the worst case (`a = list(range(1, 10**5 + 1))`), your original solution is about 2-4.5 times faster than the solutions in the previous answers:
5.45 ms 5.46 ms 5.43 ms original
4.58 ms 4.57 ms 4.57 ms original_improved
25.10 ms 25.59 ms 25.27 ms hjpotter92
11.59 ms 11.69 ms 11.68 ms Reinderien
10.33 ms 10.47 ms 10.45 ms Reinderien_improved
23.16 ms 23.07 ms 23.02 ms Sriv
17.00 ms 16.97 ms 16.94 ms Sriv_improved
Done with Python 3.9.0 64-bit on Windows 10 64-bit.
`original_improved` is yours, but faster by not doing `== 1` and by using `False` instead of `0`, as that's fastest to recognize as false. And for smaller input lists it takes less space, as it makes `b` smaller accordingly.
Code:
from timeit import timeit
from collections import defaultdict

def original(a):
    b = [0] * 100001
    for num in a:
        if b[num] == 1:
            return num
        else:
            b[num] = 1
    return -1

def original_improved(a):
    b = [False] * (len(a) + 1)
    for num in a:
        if b[num]:
            return num
        b[num] = True
    return -1

def hjpotter92(given_list):
    seen = defaultdict(bool)
    for value in given_list:
        if seen[value]:
            return value
        seen[value] = True
    return -1

def Reinderien(given_list):
    seen = set()
    for value in given_list:
        if value in seen:
            return value
        seen.add(value)
    return -1

def Reinderien_improved(given_list):
    seen = set()
    seen_add = seen.add  # Suggestion by Graipher
    for value in given_list:
        if value in seen:
            return value
        seen_add(value)
    return -1

def Sriv(a):
    for i in a:
        if a[abs(i) - 1] > 0:
            a[abs(i) - 1] *= -1
        else:
            return abs(i)
    else:
        return -1

def Sriv_improved(a):
    for i in map(abs, a):
        if a[i - 1] > 0:
            a[i - 1] *= -1
        else:
            return i
    else:
        return -1

funcs = original, original_improved, hjpotter92, Reinderien, Reinderien_improved, Sriv, Sriv_improved

a = list(range(1, 10**5+1))  # worst case: no duplicate at all
tss = [[] for _ in funcs]
for _ in range(4):
    for func, ts in zip(funcs, tss):
        # Best of 50 runs, each on a fresh copy (some solutions mutate the input).
        t = min(timeit(lambda: func(copy), number=1)
                for copy in (a.copy() for _ in range(50)))
        ts.append(t)

for func, ts in zip(funcs, tss):
    # ts[1:] drops the first round (treated as warmup).
    print(*('%5.2f ms ' % (t * 1000) for t in ts[1:]), func.__name__)
print()
Perhaps use a dictionary to keep track of already-seen values?
from collections import defaultdict

def first_duplicate(given_list):
    seen = defaultdict(bool)
    for value in given_list:
        if seen[value]:
            return value
        seen[value] = True
    return -1
- Function name should be `lower_snake_case`.
- `defaultdict(bool)` initialises missing keys with the default value `False` (see the quick check below). Note that passing `None` instead of `bool` would make it behave like a plain `dict` and raise `KeyError` on missing keys.
- Use better/descriptive names for variables.
- Since the `if` clause returns a value, the `else` clause is not needed.
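A quick interactive check of that default behaviour (illustrative):

>>> from collections import defaultdict
>>> seen = defaultdict(bool)
>>> seen[42]   # a missing key is created with bool() == False
False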
- Thank you for the advice, I didn't know about `defaultdict()`. Out of curiosity, why is `lower_snake_case` better practice? I actually used to do that and switched to camelCase because that's what they use on that website. – jeremy radcliff, Oct 14, 2020 at 1:55
- Because of PEP8. – Reinderien, Oct 14, 2020 at 3:18
- For a worst case like `list(range(1, 10**5+1))`, this not only takes about 4.7 times as much time as the original (see my answer) but also about 6.6 times as much memory as the original (and even 7.9 times as much in 32-bit Python). – superb rain, Oct 15, 2020 at 14:35
Note that all the elements are positive, and the values are not greater than the length.
There is a very clever method to find the solution in these cases.
The idea is to mark the values by turning `a[value]` negative. If a duplicate exists, it will encounter `a[duplicate]` as negative.
Here's the implementation:
for i in a:
    if a[abs(i) - 1] > 0:
        a[abs(i) - 1] *= -1
    else:
        print(abs(i))
        break
else:
    print(-1)
Make sure to convert the values to 0-based indexing, though!
This approach is O(N) time complexity and O(1) extra space complexity.
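For reference, here is the same idea wrapped as a function with the question's interface (my own sketch, not part of the original answer; note that it mutates the input list):

def first_duplicate_inplace(a):
    # Mark value v as seen by flipping a[v - 1] negative (0-based indexing).
    for i in a:
        if a[abs(i) - 1] < 0:   # value abs(i) was seen before: first duplicate
            return abs(i)
        a[abs(i) - 1] *= -1
    return -1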
- It's not O(1) extra space complexity. The negative integer objects cost space. – superb rain, Oct 14, 2020 at 8:53
- Each single int, as long as it's bounded by a constant as it is here, takes constant space. But you don't just have a single one here. If I give you, let's say, `a = list(range(1, N//2+1)) * 2`, then you'll create new int objects for the first N/2 elements, which costs Θ(N) extra space. Not O(1). – superb rain, Oct 14, 2020 at 9:12
- For example, for `a = list(range(1, 314159)) * 2`, the `set` solution builds a set of those numbers that takes 8.4 MB extra space, while you build the negations of those numbers and take 8.8 MB extra space: demo. That's with 64-bit Python. On 32-bit Python, it's 4.2 MB for the set and 5.0 MB for your numbers. – superb rain, Oct 14, 2020 at 19:11
- Well, in languages like C++ or Java, with primitive types like their int, you'd overwrite the original values with their negations indeed without using extra space. And if I weren't so mean to use `a = list(range(1, N//2+1)) * 2` but instead used `a = list(range(1, N+1))`, then you might also take only O(1) extra space. Because you'd create the new number objects but the original number objects would be deleted since there'd be no references to them anymore. Then the new objects could occupy the memory where the originals were. It's just that my second half references the same objects, ... – superb rain, Oct 14, 2020 at 19:31
- Oops, check the demo again. 314159 is larger than allowed. 19661 is allowed and leads to 524 kB for the set and 550 kB for your negatives. That's btw the largest allowed one where the set "wins" in this regard. At 19662, the set resizes to 2.1 MB. At the limit, i.e., with `a = list(range(1, 50_001)) * 2`, the set is still at 2.1 MB and your negatives reach 1.4 MB. So at that limit you do take less extra space than the set. Just not O(1) :-) – superb rain, Oct 14, 2020 at 19:41
Using a list comprehension, make a list of the numbers occurring more than once:
[i for i in a if a.count(i) > 1]
Return the first match or -1 if no match was found
next((i for i in a if a.count(i) > 1), -1)
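A quick usage check (example inputs of my own); the second argument to `next` supplies the `-1` default when the generator is exhausted:

>>> a = [1, 2, 3, 2, 4]
>>> next((i for i in a if a.count(i) > 1), -1)
2
>>> b = [1, 2, 3]
>>> next((i for i in b if b.count(i) > 1), -1)
-1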
- You totally live up to your name :-) – superb rain, Oct 14, 2020 at 12:31
- @superbrain How much slower? :) – JollyJoker, Oct 14, 2020 at 12:51
- In my answer's benchmark it would take about 17000 times as long as the original solution. Also: `> 2`? – superb rain, Oct 14, 2020 at 12:58
- Edited the 2. Of course I didn't try running it. I'm a bit disappointed the `next` doesn't prevent it from going through the whole list. – JollyJoker, Oct 14, 2020 at 13:07
- ... `num - 1`, also `if b[num]: return num` is sufficient.
- `b = [0] * (len(a) + 1)`. You know the elements are bound by `1 <= a[i] <= a.length`, so why blindly initialize to the maximum (`10^5`)? Doing `num - 1` adds one more operation per iteration and slows it down - it is better to just have one slot not used...
- ... `False` and `True` directly, and `if b[num] == 1` could be replaced with `if b[num]`.