13

Can anyone suggest a good way to remove duplicates from a list of nested lists, where duplicates are identified by the first element of each nested list?

The main list looks like this:

L = [['14', '65', 76], ['2', '5', 6], ['7', '12', 33], ['14', '22', 46]]

If a nested list has the same first element ([k][0]) as one that has already occurred, I'd like to remove that list and get this result:

L = [['14', '65', 76], ['2', '5', 6], ['7', '12', 33]]

Can you suggest an algorithm to achieve this goal?

6 Answers

32

Do you care about preserving order / which duplicate is removed? If not, then:

dict((x[0], x) for x in L).values()

will do it. If you want to preserve order, and want to keep the first one you find then:

def unique_items(L):
    found = set()  # first elements seen so far
    for item in L:
        if item[0] not in found:
            yield item
            found.add(item[0])

print(list(unique_items(L)))
2
  • your conversion to a dict was so much more elegant than mine that I stole it :) Commented Jul 17, 2009 at 14:02
  • Doesn't the first one also preserve order because dicts preserve order since Python 3.7 and the keys are inserted in the order that the comprehension produces them? Commented Oct 1, 2020 at 13:49
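
To illustrate the point raised in the last comment, here is a minimal sketch, assuming Python 3.7 or later (where dicts keep insertion order). The keys stay in first-occurrence order, but the value stored for a repeated key comes from the last duplicate, so this is not the same as keeping the first list found:

```python
# Assumes Python 3.7+, where dict preserves insertion order.
L = [['14', '65', 76], ['2', '5', 6], ['7', '12', 33], ['14', '22', 46]]

# A repeated key keeps its original position in the dict,
# but its value is overwritten by the later item.
by_first = list(dict((x[0], x) for x in L).values())
print(by_first)
# [['14', '22', 46], ['2', '5', 6], ['7', '12', 33]]
```

If the first occurrence is the one that must survive, the generator version in the answer is the safer choice.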
4

Use a dict instead, keyed on the first element, like so:

L = {'14': ['65', 76], '2': ['5', 6], '7': ['12', 33]}
L['14'] = ['22', 46]

If you are receiving the first list from some external source, convert it like so:

L = [['14', '65', 76], ['2', '5', 6], ['7', '12', 33], ['14', '22', 46]]
L_dict = dict((x[0], x[1:]) for x in L)
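
Note that the conversion above keeps the values of the last list seen for a repeated key. A possible variation, sketched here on the assumption that the first occurrence should win (as in the question), uses dict.setdefault; the name deduped is only illustrative:

```python
L = [['14', '65', 76], ['2', '5', 6], ['7', '12', 33], ['14', '22', 46]]

L_dict = {}
for x in L:
    # setdefault stores the value only if the key is not present yet,
    # so the first list seen for each key is the one that is kept.
    L_dict.setdefault(x[0], x[1:])

# Rebuild the nested-list form (insertion-ordered on Python 3.7+).
deduped = [[k] + v for k, v in L_dict.items()]
print(deduped)
# [['14', '65', 76], ['2', '5', 6], ['7', '12', 33]]
```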
2

Use pandas:

import pandas as pd
L = [['14', '65', 76], ['2', '5', 6], ['7', '12', 33], ['14', '22', 46],['7','a','b']]
df = pd.DataFrame(L)
df = df.drop_duplicates()
L_no_duplicates = df.values.tolist()

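Note that drop_duplicates() with no arguments only removes rows that are identical in every column. To identify duplicates by specific columns only, pass them as the subset; for the question's case, that is the first column (label 0):

df = df.drop_duplicates(subset=[0])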
0

I am not sure what you meant by "another list", so I assume you mean the lists inside L.

a = []  # first elements seen so far
L = [['14', '65', 76], ['2', '5', 6], ['7', '12', 33], ['14', '22', 46], ['7', 'a', 'b']]
for item in L:
    if item[0] not in a:
        a.append(item[0])
        print(item)
2
  • 1
    This would be more efficient if you used a set for 'a' - you're O(N^2) using a list like that, and amortised O(N) using a set. Commented Jul 17, 2009 at 13:58
  • that has not come to mind, thanks for the info. nevertheless, that code works in older Python version that doesn't come with set. ;) Commented Jul 17, 2009 at 14:14
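
The set-based variant suggested in the first comment might look like the sketch below. The function name first_by_key is hypothetical, and the results are collected in a list instead of being printed:

```python
def first_by_key(items):
    """Keep the first nested list seen for each distinct first element."""
    seen = set()   # membership tests on a set are O(1) on average
    result = []
    for item in items:
        if item[0] not in seen:
            seen.add(item[0])
            result.append(item)
    return result

L = [['14', '65', 76], ['2', '5', 6], ['7', '12', 33], ['14', '22', 46], ['7', 'a', 'b']]
print(first_by_key(L))
# [['14', '65', 76], ['2', '5', 6], ['7', '12', 33]]
```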
0

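If the order does not matter, the code below keeps the first occurrence of each key (iterating over reversed(L) means the earliest list is written to the dict last):

print([[k] + v for (k, v) in dict([[a[0], a[1:]] for a in reversed(L)]).items()])

With the L from the question this gives, for example (the order of the items may differ):

[['2', '5', 6], ['14', '65', 76], ['7', '12', 33]]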

0
def Remove(duplicate):
    final_list = []
    for num in duplicate:
        if num not in final_list:
            final_list.append(num)
    return final_list

duplicate = [2, 4, 10, 20, 5, 2, 20, 4]
print(Remove(duplicate))
1
  • 1
    Please provide some comments about your code and the changes you made to the original code. Commented Mar 28, 2023 at 12:57
