Is there a way to find out if a list contains any duplicates? For example:
list1 = [1,2,3,4,5]
list2 = [1,1,2,3,4,5]
list1.*method* = False # no duplicates
list2.*method* = True # contains duplicates
- Is this assuming the lists are always sorted? – tyjkenn, Jun 28, 2012 at 17:25
- Possible duplicate: stackoverflow.com/questions/1920145/… – tyjkenn, Jun 28, 2012 at 17:27
- @tyjkenn: Checking for existence of duplicates is simpler than finding the actual duplicates (which is what the other question is about). – interjay, Jun 28, 2012 at 17:30
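To illustrate the distinction interjay draws, here is a small sketch (the function names are just for illustration): an existence check only compares sizes, while finding the actual duplicates needs a count per element, e.g. with collections.Counter:

from collections import Counter

def has_duplicates(items):
    # existence check: collapsing into a set drops repeated values
    return len(items) != len(set(items))

def find_duplicates(items):
    # actual duplicates: keep every value that occurs more than once
    return [value for value, count in Counter(items).items() if count > 1]

print(has_duplicates([1, 1, 2, 3, 4, 5]))   # True
print(find_duplicates([1, 1, 2, 3, 4, 5]))  # [1]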
4 Answers
If you temporarily convert the list to a set, the duplicates are eliminated in the set. You can then compare the lengths of the list and the set.
In code, it would look like this:
list1 = [...]
tmpSet = set(list1)
haveDuplicates = len(list1) != len(tmpSet)
- +1 for including some actual text to explain what you are doing, as opposed to just plopping down code. – jdi, Jun 28, 2012 at 17:34
- @jdi: I actually tried to just plop down some code originally but it came under the 30-character minimum. – 3Doubloons, Jun 28, 2012 at 17:50
Convert the list to a set to remove duplicates. Compare the lengths of the original list and the set to see if any duplicates existed.
>>> list1 = [1,2,3,4,5]
>>> list2 = [1,1,2,3,4,5]
>>> len(list1) == len(set(list1))
True # no duplicates
>>> len(list2) == len(set(list2))
False # duplicates
Check if the length of the original list is larger than the length of the unique "set" of elements in the list. If so, there must have been duplicates:
list1 = [1,2,3,4,5]
list2 = [1,1,2,3,4,5]

if len(list1) != len(set(list1)):
    print("duplicates")  # not reached here: list1 has no duplicates
The set() approach only works for hashable objects, so for completeness, you could do it with just plain iteration:
import itertools

def has_duplicates(iterable):
    """
    >>> has_duplicates([1,2,3])
    False
    >>> has_duplicates([1, 2, 1])
    True
    >>> has_duplicates([[1,1], [3,2], [4,3]])
    False
    >>> has_duplicates([[1,1], [3,2], [4,3], [4,3]])
    True
    """
    return any(x == y for x, y in itertools.combinations(iterable, 2))
- Ouch. This one hurts for complexity. Better to write hash functions for your unhashable objects. – Joel Cornett, Jun 28, 2012 at 17:58
- @JoelCornett Mind doing it for list? – lqc, Jun 28, 2012 at 18:07
- listHash = lambda x: hash(tuple(x)). Note that since this hash is just a one-time thing, you don't have to worry about objects mutating on you. – Joel Cornett, Jun 28, 2012 at 20:58
- Here's a simpler one: lambda x: 1. Creating such a function doesn't make list objects any more hashable, 'cause list.__hash__ is still None. As for efficiency, you can easily tweak this to take constant extra memory. Hashing is just a CPU/memory tradeoff. – lqc, Jun 29, 2012 at 7:04
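For what it's worth, here is a rough sketch of the workaround Joel Cornett suggests in the comments above, assuming the unhashable elements are plain lists that can be frozen into tuples (the function name is just for illustration). This keeps the check roughly linear in the number of elements rather than quadratic:

def has_duplicates_unhashable(items):
    # Hash-key idea from the comments: freeze each list into a tuple so it
    # can go into a set; later mutation of the element doesn't matter.
    seen = set()
    for item in items:
        key = tuple(item)
        if key in seen:
            return True
        seen.add(key)
    return False

print(has_duplicates_unhashable([[1, 1], [3, 2], [4, 3]]))          # False
print(has_duplicates_unhashable([[1, 1], [3, 2], [4, 3], [4, 3]]))  # True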