
Is there a way to tell whether a list contains duplicates? For example:

list1 = [1,2,3,4,5]
list2 = [1,1,2,3,4,5]
list1.*method* = False # no duplicates
list2.*method* = True # contains duplicates
asked Jun 28, 2012 at 17:24
  • Is this assuming the lists are always sorted? Commented Jun 28, 2012 at 17:25
  • Possible duplicate: stackoverflow.com/questions/1920145/… Commented Jun 28, 2012 at 17:27
  • @tyjkenn: Checking for existence of duplicates is simpler than finding the actual duplicates (which is what the other question is about). Commented Jun 28, 2012 at 17:30

4 Answers


Temporarily converting the list to a set eliminates any duplicates, since a set cannot contain them. You can then compare the lengths of the list and the set.

In code, it would look like this:

list1 = [...]
tmpSet = set(list1)
haveDuplicates = len(list1) != len(tmpSet)
answered Jun 28, 2012 at 17:27
  • +1 for including some actual text to explain what you are doing as opposed to just plopping down code. Commented Jun 28, 2012 at 17:34
  • @jdi: I actually tried to just plop down some code originally, but it came under the 30-character minimum. Commented Jun 28, 2012 at 17:50
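When the input can be long and a duplicate tends to appear early, a common variation on the same idea (a sketch, not part of the answer above) short-circuits as soon as a repeat is seen instead of always building the full set:

```python
def has_duplicates(items):
    """Return True as soon as any element repeats.

    Unlike len(items) != len(set(items)), this stops at the first
    duplicate rather than converting the whole list first.
    """
    seen = set()
    for item in items:
        if item in seen:
            return True   # early exit on the first repeat
        seen.add(item)
    return False
```

Like the set-length check, this requires the elements to be hashable.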

Convert the list to a set to remove duplicates. Compare the lengths of the original list and the set to see if any duplicates existed.

>>> list1 = [1,2,3,4,5]
>>> list2 = [1,1,2,3,4,5]
>>> len(list1) == len(set(list1))  # no duplicates
True
>>> len(list2) == len(set(list2))  # duplicates
False
answered Jun 28, 2012 at 17:27

Check if the length of the original list is larger than the length of the unique "set" of its elements. If so, there must have been duplicates.

list1 = [1,2,3,4,5]
list2 = [1,1,2,3,4,5]
if len(list2) != len(set(list2)):
    print("duplicates")  # list2 contains a repeated 1, so this runs
answered Jun 28, 2012 at 17:28

The set() approach only works for hashable objects, so for completeness, here is a version using plain iteration:

import itertools

def has_duplicates(iterable):
    """
    >>> has_duplicates([1,2,3])
    False
    >>> has_duplicates([1, 2, 1])
    True
    >>> has_duplicates([[1,1], [3,2], [4,3]])
    False
    >>> has_duplicates([[1,1], [3,2], [4,3], [4,3]])
    True
    """
    # Compare every pair of elements; O(n^2), but needs no hashing.
    return any(x == y for x, y in itertools.combinations(iterable, 2))
answered Jun 28, 2012 at 17:43
  • Ouch. This one hurts for complexity. Better to write hash functions for your unhashable objects. Commented Jun 28, 2012 at 17:58
  • @JoelCornett Mind doing it for list ? Commented Jun 28, 2012 at 18:07
  • listHash = lambda x: hash(tuple(x)). Note that since this hash is just a one-time thing, you don't have to worry about objects mutating on you. Commented Jun 28, 2012 at 20:58
  • Here's a simpler one: lambda x: 1. Creating such a function doesn't make list objects any more hashable, 'cause list.__hash__ is still None. As for efficiency, you can easily tweak this to take constant extra memory. Hashing is just a CPU/memory tradeoff. Commented Jun 29, 2012 at 7:04
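Building on the listHash idea from the comments, here is a sketch for the specific case of lists of lists: convert each inner list to a tuple (which is hashable) and reuse the set-length comparison. This assumes the inner elements are themselves hashable; the function name is illustrative, not from any of the answers.

```python
def has_duplicate_rows(rows):
    # Tuples are hashable, so each inner list can stand in for
    # itself inside a set. Assumes inner elements are hashable.
    as_tuples = [tuple(row) for row in rows]
    return len(as_tuples) != len(set(as_tuples))
```

This keeps the O(n) average-time behavior of the set approach instead of the O(n^2) pairwise comparison.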
