Python: Removing duplicates from a list

Question 1

After reading some data from a file and sorting through it, I get this.

[['John', 1], ['Lisa', 2], ['Carly', 2], ['Zacharry', 1], ['Brian', 3], ['John', 5], ['Carly', 2]]

How can I remove the duplicates while also adding up their values, so that my output looks like this?

[['John', 6], ['Lisa', 2], ['Carly', 4], ['Zacharry', 1], ['Brian', 3]]

I've been able to isolate the duplicates on their own with the total sum of their data, but I have no idea how to get my desired output.

Note: The order of the list is important in my case, and my data must stay in a list.

When I've isolated the duplicates I get this output:

[['John', 6], ['Carly', 4]]

My Code:

def create_bills(filename, capacity):
    fob = open(filename)
    newlst = list()
    for line in fob:
        a = line.split(" $")
        b = [a[0], int(a[1])]
        newlst.append(b)
    print(newlst)
    newlst2 = list()
    for i in range(len(newlst)):
        n = i + 1
        while n < len(newlst):
            if newlst[i][0] == newlst[n][0]:
                newlst2.append([newlst[i][0], (newlst[i][1] + newlst[n][1])])
            n += 1
    newlst3 = list()
    for i in range(len(newlst)):
        pass
    print(newlst2)

Thank you!

Question 2

If you've isolated the duplicates then you've solved your problem! Show us what you've done and we'll be able to help you out.

Question 3

You can use a dict, or more specifically an OrderedDict, to keep track of the running totals:

from collections import OrderedDict

lst = [['John', 1], ['Lisa', 2], ['Carly', 2], ['Zacharry', 1], ['Brian', 3], ['John', 5], ['Carly', 2]]
d = OrderedDict()  # remembers the order in which names are first seen
for k, v in lst:
    if k not in d:
        d[k] = v
    else:
        d[k] += v
# list(...) is needed in Python 3, where map returns a lazy iterator
print(list(map(list, d.items())))
# [['John', 6], ['Lisa', 2], ['Carly', 4], ['Zacharry', 1], ['Brian', 3]]

Readability aside, note that keeping the totals in a list, as the original code does, takes O(N^2) time, because each entry has to be compared against the other entries to find a matching name. The dictionary approach takes O(N), since each lookup by name is constant time on average.
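To make the complexity difference concrete, here is a minimal sketch contrasting the two strategies. The function names `merge_with_list` and `merge_with_dict` are hypothetical, and the second one assumes Python 3.7+, where a plain dict preserves insertion order.

```python
def merge_with_list(pairs):
    """Keep running totals in a list; each lookup scans it, so O(N^2) worst case."""
    totals = []
    for name, value in pairs:
        for entry in totals:
            if entry[0] == name:
                entry[1] += value
                break
        else:
            # for/else: reached only when no existing entry matched
            totals.append([name, value])
    return totals


def merge_with_dict(pairs):
    """Keep running totals in a dict; each lookup is O(1) on average, so O(N)."""
    totals = {}
    for name, value in pairs:
        totals[name] = totals.get(name, 0) + value
    # Assumes Python 3.7+: dict iteration follows first-insertion order
    return [[name, total] for name, total in totals.items()]


lst = [['John', 1], ['Lisa', 2], ['Carly', 2], ['Zacharry', 1],
       ['Brian', 3], ['John', 5], ['Carly', 2]]
print(merge_with_list(lst))
print(merge_with_dict(lst))
```

Both functions build new inner lists, so the input is left unchanged. The difference only becomes noticeable on large inputs with many distinct names.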

Question 4

This is awesome, sir! Thank you very much. One question though: how can I make your answer return what you had printed? For instance, make something equal to "[['John', 6], ['Lisa', 2], ['Carly', 4], ['Zacharry', 1], ['Brian', 3]]"

Question 5

You can just change the last line from print to return, assuming you have put the code into a function.

Question 6

Sorry, but that doesn't exactly work. The output becomes this: OrderedDict([('John', 6), ('Lisa', 2), ('Carly', 4), ('Zacharry', 1), ('Brian', 3)])

Question 7

That is d, which is an OrderedDict. The last line transforms it into the list representation via map(list, d.items()), but leaves the original d untouched, of course (if that's what you were thinking...). You can use result = list(map(list, d.items())) to "make something equal to" it, or just return that expression from a function.
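As a minimal sketch of the "return it in a function" suggestion: the function name `sum_duplicates` is hypothetical, and the `list(...)` wrapper assumes Python 3, where `map` returns a lazy iterator rather than a list.

```python
from collections import OrderedDict


def sum_duplicates(pairs):
    """Sum the values per name, keeping names in first-seen order."""
    d = OrderedDict()
    for k, v in pairs:
        d[k] = d.get(k, 0) + v
    # Return the converted list of lists, not d itself
    return list(map(list, d.items()))


lst = [['John', 1], ['Lisa', 2], ['Carly', 2], ['Zacharry', 1],
       ['Brian', 3], ['John', 5], ['Carly', 2]]
result = sum_duplicates(lst)
print(result)
```

Returning `d` directly is what produces the `OrderedDict([...])` output; the conversion has to happen in the returned expression.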

Question 8

This should give your answer.

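def out(a):
    # Start every name at 0, then add up all of its values
    x = {name: 0 for name, value in a}
    for name, value in a:
        x[name] = x[name] + value
    final = []
    # Walk the input again so names come out in their original order
    for i in a:
        if (i[0], x[i[0]]) not in final:
            final.append((i[0], x[i[0]]))
    return final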

The output is [('John', 6), ('Lisa', 2), ('Carly', 4), ('Zacharry', 1), ('Brian', 3)]

Question 9

The issue, though, is that the original order is not preserved.

Question 10

Why do you need to preserve the original order? Is there any specific reason?

YS-L · Accepted Answer · 2014-11-12 02:46:51Z


