1
\$\begingroup\$

I'm trying to remove the duplicates from a list of lists and count the elements that remain after deduplication:

seq = [[1,2,3], [1,2,3], [2,3,4], [4,5,6]]
new_seq = [[1,2,3], [2,3,4], [4,5,6]]
count = 3 

My code takes around 23 seconds for a list containing around 66,000 lists.

How can I make my code faster?

def unique(seq):
    new_seq = []
    count = 0
    for i in seq:
        if i not in new_seq:
            new_seq.append(i)
            count += 1
    return count
asked May 6, 2016 at 19:15
\$\endgroup\$
2
  • 2
    \$\begingroup\$ What are you really trying to accomplish? Is this function part of a larger program? Tell us about the context. \$\endgroup\$ Commented May 6, 2016 at 19:41
  • \$\begingroup\$ The lists comes from another function which calculates an algorithm \$\endgroup\$ Commented May 6, 2016 at 19:45

1 Answer

3
\$\begingroup\$

Your function is slow because it is O(n²): each element being added to new_seq has to be compared against every previously added element.
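To make the quadratic cost concrete, here is a minimal sketch that counts the comparisons the list-based approach performs. The helper name `count_comparisons` is hypothetical, and the inner loop spells out what `i not in new_seq` does implicitly (a linear scan that stops at the first match).

```python
def count_comparisons(seq):
    """Count element comparisons made when deduplicating with a list."""
    new_seq = []
    comparisons = 0
    for item in seq:
        found = False
        # Explicit version of `item in new_seq`: scan until a match is found.
        for seen in new_seq:
            comparisons += 1
            if seen == item:
                found = True
                break
        if not found:
            new_seq.append(item)
    return comparisons


# Worst case: every inner list is distinct, so each scan runs to the end.
n = 1000
distinct = [[i, i + 1, i + 2] for i in range(n)]
worst_case = count_comparisons(distinct)  # equals n * (n - 1) // 2
```

By the same formula, 66,000 all-distinct lists would need 66,000 × 65,999 / 2 ≈ 2.18 billion comparisons in the worst case, which is consistent with a runtime measured in tens of seconds.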

To deduplicate a sequence, use a set. Constructing the set is only O(n) because it uses hashing.

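Then, to obtain the size of the set, use len(). Lists are not hashable, so each inner list has to be converted to a tuple before it can go into the set.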

def unique(seq):
    return len(set(tuple(element) for element in seq))
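If the deduplicated list itself is also needed (the question shows a `new_seq` in its example), a set can still serve as the membership index while a list keeps the original order. The following is a sketch under that assumption, not part of the original answer; the name `unique_in_order` is hypothetical, and it assumes the inner lists contain only hashable values such as integers.

```python
def unique_in_order(seq):
    """Return (deduplicated list, count), keeping first occurrences in order."""
    seen = set()      # tuples already encountered; membership test is O(1) on average
    new_seq = []      # first occurrence of each inner list, in input order
    for element in seq:
        key = tuple(element)  # lists are unhashable, tuples are hashable
        if key not in seen:
            seen.add(key)
            new_seq.append(element)
    return new_seq, len(new_seq)


seq = [[1, 2, 3], [1, 2, 3], [2, 3, 4], [4, 5, 6]]
new_seq, count = unique_in_order(seq)
```

With the question's sample input this yields the `new_seq` and `count` shown there, and it still makes a single pass over the input.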
answered May 6, 2016 at 19:34
\$\endgroup\$