I have a list in the form:
lst = [[1, 0, 0, 0], [1, 1, 0, 0], [2, 0, 0, 0], [2, 1, 0, 0], [2, 1, 0, 0], [1, 1, 0, 0], [3, 1, 0, 0], [1, 3, 0, 0], [2, 1, 0, 0], [2, 0, 0, 0]]
However, the last two sub-elements will always be zero at the start, so it could also be written as:
lst = [[1, 0], [1, 1], [2, 0], [2, 1], [2, 1], [1, 1], [3, 1], [1, 3], [2, 1], [2, 0]]
If that is easier.
What I want is to remove the duplicates from this list, count them, and set the 3rd sub-element to the count. Taking the list above, I want:
lst = [[1, 0, 1, 0], [1, 1, 2, 0], [2, 0, 2, 0], [2, 1, 3, 0], [3, 1, 1, 0], [1, 3, 1, 0]]
I have found explanations of how to remove duplicates in "Removing Duplicates from Nested List Based on First 2 Elements" and "Removing duplicates from list of lists in Python", but I don't know how to count the duplicates. The order of the elements in the overall list doesn't matter, but the order of the elements in the sub-lists must be preserved, as [1,3] and [3,1] aren't the same thing.
If this turns out to be a dead end, I could do something like hash the first two elements for counting, but only if I could get them back after counting.
Any help is appreciated. Sorry for dyslexia!
You need a Counter, some tuples (as lists are not hashable), and a list comprehension. – njzk2, Jun 2, 2014 at 17:42
4 Answers
For example:
lst = [[1, 0, 0, 0], [1, 1, 0, 0], [2, 0, 0, 0], [2, 1, 0, 0], [2, 1, 0, 0], [1, 1, 0, 0], [3, 1, 0, 0], [1, 3, 0, 0], [2, 1, 0, 0], [2, 0, 0, 0]]
from collections import Counter
c = Counter(tuple(i) for i in lst)
print([list(item[0][0:2] + (item[1], 0)) for item in c.items()])
# [[1, 0, 1, 0], [1, 1, 2, 0], [3, 1, 1, 0], [2, 1, 3, 0], [1, 3, 1, 0], [2, 0, 2, 0]]
# (the order of the sub-lists may differ between Python versions)
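A small variation, in case it helps: the sketch below keys the Counter on the first two values only, so it also works on the shorter two-element form of the list from the question. The function name `count_pairs` is made up for this example, and it assumes that only the first two values decide whether two sub-lists are duplicates.

```python
from collections import Counter


def count_pairs(rows):
    """Count sub-lists by their first two values (hypothetical helper)."""
    # Tuples are hashable, lists are not, so use (first, second) as the key.
    counts = Counter((row[0], row[1]) for row in rows)
    # Rebuild one sub-list per unique pair: [first, second, count, 0].
    return [[a, b, n, 0] for (a, b), n in counts.items()]


lst = [[1, 0], [1, 1], [2, 0], [2, 1], [2, 1], [1, 1], [3, 1], [1, 3], [2, 1], [2, 0]]
result = count_pairs(lst)
print(result)
```

Because the pair is kept as a tuple key, [1, 3] and [3, 1] stay distinct and the original values come straight back out after counting.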
To elaborate on the great hint provided by njzk2:
- Turn your list of lists into a list of tuples
- Create a Counter from it
- Get a dict from the Counter
- Set the 3rd element of the sublists to the frequency from the Counter
from collections import Counter

lst = [[1, 0, 0, 0], [1, 1, 0, 0], [2, 0, 0, 0], [2, 1, 0, 0], [2, 1, 0, 0], [1, 1, 0, 0], [3, 1, 0, 0], [1, 3, 0, 0], [2, 1, 0, 0], [2, 0, 0, 0]]
list_of_tuples = [tuple(elem) for elem in lst]
dct = dict(Counter(list_of_tuples))
lst = [list(e) for e in dct]
for elem in lst:
    elem[2] = dct[tuple(elem)]
Edit: removed duplicates with the line before the for loop. Didn't see that requirement before.
You can do this to keep count of the duplicates:
lst = [[1, 0], [1, 1], [2, 0], [2, 1], [2, 1], [1, 1], [3, 1], [1, 3], [2, 1], [2, 0]]
for x in lst:
    count = 1
    # Copy the list and drop x itself, so x is only compared with the other items.
    tmpLst = list(lst)
    tmpLst.remove(x)
    for y in tmpLst:
        if x[0] == y[0] and x[1] == y[1]:
            count = count + 1
    x.append(count)
    #x.append(0) #if you want to add that 4th element
print(lst)
Result:
[[1, 0, 1], [1, 1, 2], [2, 0, 2], [2, 1, 3], [2, 1, 3], [1, 1, 2], [3, 1, 1], [1, 3, 1], [2, 1, 3], [2, 0, 2]]
Then you can take lst and remove duplicates as mentioned in the link you posted.
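That last step could look roughly like the sketch below. It is only one way to do it, not necessarily what the linked answers use; the helper name `dedupe_rows` is invented here, and the input is the result list shown above.

```python
def dedupe_rows(rows):
    """Keep the first occurrence of each sub-list (hypothetical helper)."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row)  # lists are not hashable, so compare as tuples
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


counted = [[1, 0, 1], [1, 1, 2], [2, 0, 2], [2, 1, 3], [2, 1, 3],
           [1, 1, 2], [3, 1, 1], [1, 3, 1], [2, 1, 3], [2, 0, 2]]
unique = dedupe_rows(counted)
print(unique)
# [[1, 0, 1], [1, 1, 2], [2, 0, 2], [2, 1, 3], [3, 1, 1], [1, 3, 1]]
```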
A different (maybe functional) approach.
lst = [[1, 0, 0, 0], [1, 1, 0, 0], [2, 0, 0, 0], [2, 1, 0, 0],
       [2, 1, 0, 0], [1, 1, 0, 0], [3, 1, 0, 0], [1, 3, 0, 0],
       [2, 1, 0, 0], [2, 0, 0, 0]]
def rec_counter(lst):
    # Inner function that does the actual work. Receives a
    # list, the current element to be compared and an accumulator
    # that will contain the result.
    def counter(lst, elem, acc):
        new_lst = [x for x in lst if x != elem]
        elem[2] = lst.count(elem)
        acc.append(elem)
        if len(new_lst) == 0:
            return acc
        else:
            return counter(new_lst, new_lst[0], acc)
    # This part starts the recursion of the inner function. If the list
    # is empty, nothing to do. Otherwise, count starting with the first
    # element of the list and an empty accumulator.
    if len(lst) == 0:
        return []
    else:
        return counter(lst, lst[0], [])

print(rec_counter(lst))
# [[1, 0, 1, 0], [1, 1, 2, 0], [2, 0, 2, 0],
#  [2, 1, 3, 0], [3, 1, 1, 0], [1, 3, 1, 0]]
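One caveat: rec_counter recurses once per distinct sub-list, so an input with very many distinct sub-lists could hit Python's recursion limit, and it also writes the counts into the original sub-lists. If either matters, the same filter-and-count idea can be written as a loop. This is just a sketch; the name `iter_counter` is made up, and it works on copies so the input stays untouched.

```python
def iter_counter(rows):
    """Loop version of the filter-and-count idea (hypothetical helper)."""
    remaining = [list(r) for r in rows]  # copy so the caller's data is not changed
    acc = []
    while remaining:
        elem = remaining[0]
        count = remaining.count(elem)
        # Drop every copy of elem (including elem itself) before changing it.
        remaining = [x for x in remaining if x != elem]
        elem[2] = count
        acc.append(elem)
    return acc


lst = [[1, 0, 0, 0], [1, 1, 0, 0], [2, 0, 0, 0], [2, 1, 0, 0],
       [2, 1, 0, 0], [1, 1, 0, 0], [3, 1, 0, 0], [1, 3, 0, 0],
       [2, 1, 0, 0], [2, 0, 0, 0]]
result = iter_counter(lst)
print(result)
# [[1, 0, 1, 0], [1, 1, 2, 0], [2, 0, 2, 0], [2, 1, 3, 0], [3, 1, 1, 0], [1, 3, 1, 0]]
```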