How can I count duplicates in a nested list based on the first two elements in Python?

Question 1

I have a list in the form:

lst = [[1, 0, 0, 0], [1, 1, 0, 0], [2, 0, 0, 0], [2, 1, 0, 0], [2, 1, 0, 0], [1, 1, 0, 0], [3, 1, 0, 0], [1, 3, 0, 0], [2, 1, 0, 0], [2, 0, 0, 0]]

However, the last two sub-elements are always zero at the start, so the list could also be written as:

lst = [[1, 0], [1, 1], [2, 0], [2, 1], [2, 1], [1, 1], [3, 1], [1, 3], [2, 1], [2, 0]]

If that is easier.

What I want is to remove the duplicates from this list, count them, and set the 3rd sub-element to the count. For the list above, I want:

lst = [[1, 0, 1, 0], [1, 1, 2, 0], [2, 0, 2, 0], [2, 1, 3, 0], [3, 1, 1, 0], [1, 3, 1, 0]]

I have found explanations of how to remove duplicates in "Removing Duplicates from Nested List Based on First 2 Elements" and "Removing duplicates from list of lists in Python", but I don't know how to count the duplicates. The order of the elements in the overall list doesn't matter, but the order of the elements in the sub-lists must be preserved, as [1, 3] and [3, 1] aren't the same thing.

If this turns out to be a dead end, I could do something like hash the first two elements for counting, but only if I could get them back after counting.

Any help is appreciated. Sorry for dyslexia!

Question 2

You need a Counter, some tuples (as lists are not hashable), and a list comprehension.
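A minimal sketch of why the tuples matter, assuming Python 3. A list cannot be used as a Counter key, but a tuple can, and the original pair can be read back from the key after counting. The names `pairs`, `counts` and `recovered` are illustrative only.

```python
from collections import Counter

pairs = [[1, 0], [1, 1], [2, 0], [2, 1], [2, 1], [1, 1]]

# Lists are mutable and therefore unhashable, so counting them directly fails.
error = None
try:
    Counter(pairs)
except TypeError as exc:
    error = exc

# Tuples are hashable, and each key keeps both elements in their original order.
counts = Counter(tuple(p) for p in pairs)

# The pairs can be recovered from the keys after counting.
recovered = [list(key) for key in counts]
print(counts[(2, 1)])
print(recovered)
```

Because the key is the pair itself rather than a hash of it, nothing has to be reversed to get the elements back.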

Question 3

For example:

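from collections import Counter

lst = [[1, 0, 0, 0], [1, 1, 0, 0], [2, 0, 0, 0], [2, 1, 0, 0], [2, 1, 0, 0], [1, 1, 0, 0], [3, 1, 0, 0], [1, 3, 0, 0], [2, 1, 0, 0], [2, 0, 0, 0]]
# Lists are not hashable, so convert each sub-list to a tuple before counting.
c = Counter(tuple(i) for i in lst)
# Keep the first two elements, then append the count and a trailing 0.
print([list(item[0][0:2] + (item[1], 0)) for item in c.items()])
# [[1, 0, 1, 0], [1, 1, 2, 0], [2, 0, 2, 0], [2, 1, 3, 0], [3, 1, 1, 0], [1, 3, 1, 0]]
# (the order may differ on Python versions before 3.7)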
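The code above counts whole sub-lists, which works here because the last two elements are always zero. A hedged variant, assuming Python 3 and assuming the trailing elements might not always match, keys the Counter on the first two elements only. `count_by_prefix` is a hypothetical helper name, not part of the original answer.

```python
from collections import Counter

def count_by_prefix(rows):
    """Return [a, b, count, 0] for each distinct (a, b) prefix in rows."""
    # Only the first two elements form the key; anything after them is ignored.
    counts = Counter((row[0], row[1]) for row in rows)
    # On Python 3.7+ a Counter iterates in first-seen order.
    return [[a, b, n, 0] for (a, b), n in counts.items()]

lst = [[1, 0, 0, 0], [1, 1, 0, 0], [2, 0, 0, 0], [2, 1, 0, 0], [2, 1, 0, 0],
       [1, 1, 0, 0], [3, 1, 0, 0], [1, 3, 0, 0], [2, 1, 0, 0], [2, 0, 0, 0]]
result = count_by_prefix(lst)
print(result)
```

This builds new sub-lists and leaves the input list unchanged.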

Question 4

To elaborate on the great hint provided by njzk2:

1. Turn your list of lists into a list of tuples.
2. Create a Counter from it.
3. Get a dict from the Counter.
4. Set the 3rd element of the sublists to the frequency from the Counter.

from collections import Counter

lst = [[1, 0, 0, 0], [1, 1, 0, 0], [2, 0, 0, 0], [2, 1, 0, 0], [2, 1, 0, 0], [1, 1, 0, 0], [3, 1, 0, 0], [1, 3, 0, 0], [2, 1, 0, 0], [2, 0, 0, 0]]
list_of_tuples = [tuple(elem) for elem in lst]
dct = dict(Counter(list_of_tuples))
# Rebuilding the list from the dict keys removes the duplicates.
lst = [list(e) for e in dct]
for elem in lst:
    # The lookup runs before the assignment, while elem still matches its key.
    elem[2] = dct[tuple(elem)]

Edit: removed duplicates with the line before the for loop. Didn't see that requirement before.
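Since `Counter` is a subclass of `dict`, the conversion step is optional. The following is a rough sketch of the same steps in a single loop, assuming Python 3 and assuming it is acceptable to build new sub-lists; the name `deduped` is illustrative.

```python
from collections import Counter

lst = [[1, 0, 0, 0], [1, 1, 0, 0], [2, 0, 0, 0], [2, 1, 0, 0], [2, 1, 0, 0],
       [1, 1, 0, 0], [3, 1, 0, 0], [1, 3, 0, 0], [2, 1, 0, 0], [2, 0, 0, 0]]

# A Counter can be iterated like a dict, so no dict() conversion is needed.
counts = Counter(tuple(elem) for elem in lst)

deduped = []
for key, freq in counts.items():
    row = list(key)   # one new sub-list per distinct tuple
    row[2] = freq     # store the frequency in the 3rd position
    deduped.append(row)

print(deduped)
```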

Question 5

You can do this to keep count of the duplicates:

lst = [[1, 0], [1, 1], [2, 0], [2, 1], [2, 1], [1, 1], [3, 1], [1, 3], [2, 1], [2, 0]]
for x in lst:
    count = 1
    # Compare x against a copy of the list with x itself removed.
    tmpLst = list(lst)
    tmpLst.remove(x)
    for y in tmpLst:
        # Only the first two elements decide whether y duplicates x.
        if x[0] == y[0] and x[1] == y[1]:
            count = count + 1
    x.append(count)
    #x.append(0) #if you want to add that 4th element
print(lst)

Result:

[[1, 0, 1], [1, 1, 2], [2, 0, 2], [2, 1, 3], [2, 1, 3], [1, 1, 2], [3, 1, 1], [1, 3, 1], [2, 1, 3], [2, 0, 2]]

Then you can take lst and remove duplicates as mentioned in the link you posted.
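The remaining de-duplication step could look roughly like this, assuming Python 3. It is a sketch that tracks the pairs already seen in a set and is not necessarily the method used in the linked questions; `counted`, `seen` and `unique` are illustrative names.

```python
# The counted list produced by the loop above.
counted = [[1, 0, 1], [1, 1, 2], [2, 0, 2], [2, 1, 3], [2, 1, 3],
           [1, 1, 2], [3, 1, 1], [1, 3, 1], [2, 1, 3], [2, 0, 2]]

seen = set()
unique = []
for row in counted:
    key = (row[0], row[1])  # a tuple, so it can be stored in a set
    if key not in seen:
        seen.add(key)
        unique.append(row)

print(unique)
# [[1, 0, 1], [1, 1, 2], [2, 0, 2], [2, 1, 3], [3, 1, 1], [1, 3, 1]]
```

Note that the counting loop compares every sub-list with every other one, so its cost grows quadratically with the length of the list.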

Question 6

A different (maybe functional) approach.

lst = [[1, 0, 0, 0], [1, 1, 0, 0], [2, 0, 0, 0], [2, 1, 0, 0],
       [2, 1, 0, 0], [1, 1, 0, 0], [3, 1, 0, 0], [1, 3, 0, 0],
       [2, 1, 0, 0], [2, 0, 0, 0]]

def rec_counter(lst):
    # Inner method that is called at the end. Receives a
    # list, the current element to be compared and an accumulator
    # that will contain the result.
    def counter(lst, elem, acc):
        new_lst = [x for x in lst if x != elem]
        # The count is taken before elem is modified (this mutates the sub-list).
        elem[2] = lst.count(elem)
        acc.append(elem)
        if len(new_lst) == 0:
            return acc
        else:
            return counter(new_lst, new_lst[0], acc)
    # This part starts the recursion of the inner method. If the list
    # is empty, nothing to do. Otherwise, count starting with the first
    # element of the list and an empty accumulator.
    if len(lst) == 0:
        return []
    else:
        return counter(lst, lst[0], [])

print(rec_counter(lst))
# [[1, 0, 1, 0], [1, 1, 2, 0], [2, 0, 2, 0],
#  [2, 1, 3, 0], [3, 1, 1, 0], [1, 3, 1, 0]]
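The recursion above makes one call per distinct sub-list, so a list with very many distinct values could hit Python's recursion limit. A non-recursive sketch in a similar functional spirit, assuming Python 3, sorts the list and groups equal prefixes with `itertools.groupby`. `grouped_counts` and `prefix` are hypothetical helper names.

```python
from itertools import groupby

def prefix(row):
    """Key function: the first two elements as a tuple."""
    return (row[0], row[1])

def grouped_counts(rows):
    """Return [a, b, count, 0] per distinct prefix, sorted by prefix."""
    # groupby only merges adjacent items, so the rows are sorted first.
    ordered = sorted(rows, key=prefix)
    return [[a, b, len(list(group)), 0]
            for (a, b), group in groupby(ordered, key=prefix)]

lst = [[1, 0, 0, 0], [1, 1, 0, 0], [2, 0, 0, 0], [2, 1, 0, 0],
       [2, 1, 0, 0], [1, 1, 0, 0], [3, 1, 0, 0], [1, 3, 0, 0],
       [2, 1, 0, 0], [2, 0, 0, 0]]
grouped = grouped_counts(lst)
print(grouped)
# [[1, 0, 1, 0], [1, 1, 2, 0], [1, 3, 1, 0], [2, 0, 2, 0], [2, 1, 3, 0], [3, 1, 1, 0]]
```

The result comes out sorted by prefix rather than in first-seen order, which is acceptable here because the order of the overall list does not matter. Unlike the recursive version, this sketch does not modify the input sub-lists.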

Accepted answer by njzk2 (2014-06-02 17:47:04Z): the Counter-based code under Question 3.

