I want to get the unique values from the following list:
['nowplaying', 'PBS', 'PBS', 'nowplaying', 'job', 'debate', 'thenandnow']
The output which I require is:
['nowplaying', 'PBS', 'job', 'debate', 'thenandnow']
This code works:
output = []
for x in trends:
    if x not in output:
        output.append(x)
print(output)
Is there a better solution I should use?
First declare your list properly, separated by commas. You can get the unique values by converting the list to a set.
mylist = ['nowplaying', 'PBS', 'PBS', 'nowplaying', 'job', 'debate', 'thenandnow']
myset = set(mylist)
print(myset)
If you use it further as a list, you should convert it back to a list by doing:
mynewlist = list(myset)
Another possibility, probably faster, would be to use a set from the beginning instead of a list. Your code would then be:
output = set()
for x in trends:
    output.add(x)
print(output)
As it has been pointed out, sets do not maintain the original order. If you need that, you should look for an ordered set implementation (see this question for more).
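As a minimal sketch of the ordering point above, assuming Python 3.7+ (where a plain dict keeps insertion order), dict.fromkeys is one way to de-duplicate without losing order and without a third-party ordered set. The variable names are only illustrative.

```python
# Assumption: Python 3.7+, where dict preserves insertion order.
trends = ['nowplaying', 'PBS', 'PBS', 'nowplaying', 'job', 'debate', 'thenandnow']

# A set removes duplicates but makes no promise about element order.
as_set = set(trends)

# dict.fromkeys keeps the first occurrence of each key, in input order.
ordered_unique = list(dict.fromkeys(trends))
print(ordered_unique)
# ['nowplaying', 'PBS', 'job', 'debate', 'thenandnow']
```

Both results hold the same elements; only the dict-based one is guaranteed to follow the order of the original list.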
- If you need to maintain the set order there is also a library on PyPI: pypi.python.org/pypi/ordered-set – Jace Browning, Sep 26, 2013 at 1:12
- why lists have '.append' and sets have '.add' ?? – Antonello, Jan 28, 2014 at 11:05
- "append" means to add to the end, which is accurate and makes sense for lists, but sets have no notion of ordering and hence no beginning or end, so "add" makes more sense for them. – maackle, Mar 11, 2014 at 3:01
- the 'sets' module is deprecated, yes. So you don't have to 'import sets' to get the functionality. if you see import sets; output = sets.Set() that's deprecated. This answer uses the built-in 'set' class: docs.python.org/2/library/stdtypes.html#set – FlipMcF, Dec 9, 2015 at 0:25
- This does not work if the values of the list are not hashable (e.g., sets or lists) – steffen, May 2, 2018 at 5:14
To be consistent with the type I would use:
mylist = list(set(mylist))
- Please note, the result will be unordered. – Aminah Nuraini, Oct 26, 2015 at 8:45
- @Ninjakannon your code will sort the list alphabetically. That does not have to be the order of the original list. – johk95, Jul 27, 2017 at 10:37
- Note a neat way to do this in python 3 is mylist = [*{*mylist}]. This is an *arg-style set-expansion followed by an *arg-style list-expansion. – Luke Davis, Dec 11, 2017 at 10:10
- @LukeDavis best answer for me, sorted([*{*c}]) is 25% faster than sorted(list(set(c))) (measured with timeit.repeat with number=100000) – jeannej, Dec 5, 2018 at 17:58
- N.B.: This fails if the list has unhashable elements (e.g. elements which are themselves sets, lists or hashes). – Heinrich, Apr 20, 2020 at 12:40
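For readers unfamiliar with the unpacking form mentioned in the comments, here is a small hedged sketch (Python 3.5+ syntax) showing that it produces the same elements as the constructor form. The names via_unpacking and via_constructors are made up for illustration.

```python
mylist = ['nowplaying', 'PBS', 'PBS', 'nowplaying', 'job']

# Unpack the list into a set display, then unpack that set into a list display.
via_unpacking = [*{*mylist}]

# The equivalent spelling with constructors.
via_constructors = list(set(mylist))

# Same unique elements either way; neither form preserves the original order.
print(sorted(via_unpacking) == sorted(via_constructors))  # True
```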
If we need to keep the elements order, how about this:
used = set()
mylist = [u'nowplaying', u'PBS', u'PBS', u'nowplaying', u'job', u'debate', u'thenandnow']
unique = [x for x in mylist if x not in used and (used.add(x) or True)]
And one more solution using reduce and without the temporary used variable.
from functools import reduce  # built in on Python 2, must be imported on Python 3
mylist = [u'nowplaying', u'PBS', u'PBS', u'nowplaying', u'job', u'debate', u'thenandnow']
unique = reduce(lambda l, x: l.append(x) or l if x not in l else l, mylist, [])
UPDATE - Dec, 2020 - Maybe the best approach!
Starting from python 3.7, the standard dict preserves insertion order.
Changed in version 3.7: Dictionary order is guaranteed to be insertion order. This behavior was an implementation detail of CPython from 3.6.
So this gives us the ability to use dict.fromkeys() for de-duplication!
NOTE: Credit goes to @rlat for giving us this approach in the comments!
mylist = [u'nowplaying', u'PBS', u'PBS', u'nowplaying', u'job', u'debate', u'thenandnow']
unique = list(dict.fromkeys(mylist))
In terms of speed, for me it's fast enough and readable enough to become my new favorite approach!
UPDATE - March, 2019
And a 3rd solution, which is a neat one, but kind of slow since .index is O(n), which makes the whole comprehension O(n^2).
mylist = [u'nowplaying', u'PBS', u'PBS', u'nowplaying', u'job', u'debate', u'thenandnow']
unique = [x for i, x in enumerate(mylist) if i == mylist.index(x)]
UPDATE - Oct, 2016
Another solution with reduce, but this time without .append, which makes it more human-readable and easier to understand.
mylist = [u'nowplaying', u'PBS', u'PBS', u'nowplaying', u'job', u'debate', u'thenandnow']
unique = reduce(lambda l, x: l+[x] if x not in l else l, mylist, [])
# which can also be written as:
unique = reduce(lambda l, x: l if x in l else l+[x], mylist, [])
NOTE: Keep in mind that the more human-readable the code gets, the slower it runs. The only exception is the dict.fromkeys() approach, which is specific to Python 3.7+.
import timeit
setup = "mylist = [u'nowplaying', u'PBS', u'PBS', u'nowplaying', u'job', u'debate', u'thenandnow']"
#10x to Michael for pointing out that we can get faster with set()
timeit.timeit('[x for x in mylist if x not in used and (used.add(x) or True)]', setup='used = set();'+setup)
0.2029558869980974
timeit.timeit('[x for x in mylist if x not in used and (used.append(x) or True)]', setup='used = [];'+setup)
0.28999493700030143
# 10x to rlat for suggesting this approach!
timeit.timeit('list(dict.fromkeys(mylist))', setup=setup)
0.31227896199925453
timeit.timeit('reduce(lambda l, x: l.append(x) or l if x not in l else l, mylist, [])', setup='from functools import reduce;'+setup)
0.7149233570016804
timeit.timeit('reduce(lambda l, x: l+[x] if x not in l else l, mylist, [])', setup='from functools import reduce;'+setup)
0.7379565160008497
timeit.timeit('reduce(lambda l, x: l if x in l else l+[x], mylist, [])', setup='from functools import reduce;'+setup)
0.7400134069976048
timeit.timeit('[x for i, x in enumerate(mylist) if i == mylist.index(x)]', setup=setup)
0.9154880290006986
ANSWERING COMMENTS
@monica asked a good question about "how is this working?". For everyone having problems figuring it out, I will try to give a deeper explanation of how this works and what sorcery is happening here ;)
So she first asked:
I try to understand why
unique = [used.append(x) for x in mylist if x not in used]
is not working.
Well it's actually working
>>> used = []
>>> mylist = [u'nowplaying', u'PBS', u'PBS', u'nowplaying', u'job', u'debate', u'thenandnow']
>>> unique = [used.append(x) for x in mylist if x not in used]
>>> print used
[u'nowplaying', u'PBS', u'job', u'debate', u'thenandnow']
>>> print unique
[None, None, None, None, None]
The problem is that we are just not getting the desired results inside the unique variable, but only inside the used variable. This is because during the list comprehension .append modifies the used variable and returns None.
So in order to get the results into the unique variable, and still use the same logic with .append(x) if x not in used, we need to move this .append call to the right side of the list comprehension and just return x on the left side.
But if we are too naive and just go with:
>>> unique = [x for x in mylist if x not in used and used.append(x)]
>>> print unique
[]
We will get nothing in return.
Again, this is because the .append method returns None, which gives our logical expression the following form:
x not in used and None
This will always:
- evaluate to False when x is in used,
- evaluate to None when x is not in used.
And in both cases (False/None), the result is treated as a falsy value and we get an empty list.
But why does this evaluate to None when x is not in used? Someone may ask.
Well, it's because this is how Python's short-circuit operators work.
The expression x and y first evaluates x; if x is false, its value is returned; otherwise, y is evaluated and the resulting value is returned.
So when x is not in used (i.e. when the first part is True), the next part of the expression (used.append(x)) will be evaluated and its value (None) will be returned.
But that's what we want: to get the unique elements from a list with duplicates, we want to .append them to a new list only the first time we come across them.
So we really want to evaluate used.append(x) only when x is not in used. Maybe if there is a way to turn this None value into a truthy one we will be fine, right?
Well, yes, and this is where the second short-circuit operator comes into play.
The expression x or y first evaluates x; if x is true, its value is returned; otherwise, y is evaluated and the resulting value is returned.
We know that .append(x) will always be falsy, so if we just add an or next to it, we will always get the next part. That's why we write:
x not in used and (used.append(x) or True)
so we can evaluate used.append(x) and get True as a result, only when the first part of the expression (x not in used) is True.
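To make the short-circuit behaviour concrete, here is a small sketch that can be pasted into a Python 3 interpreter. It uses a set with .add instead of a list with .append; both methods return None, so the reasoning is the same.

```python
used = set()

# set.add (like list.append) returns None, which is falsy ...
print(used.add('a'))   # None

# ... so `and` alone yields None for every new element,
print(True and None)   # None
# while `or True` turns that None into True.
print(None or True)    # True

mylist = ['a', 'b', 'a', 'c', 'b']
used = set()
unique = [x for x in mylist if x not in used and (used.add(x) or True)]
print(unique)          # ['a', 'b', 'c']
```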
A similar pattern can be seen in the 2nd approach with the reduce method.
(l.append(x) or l) if x not in l else l
#similar as the above, but maybe more readable
#we return l unchanged when x is in l
#we append x to l and return l when x is not in l
l if x in l else (l.append(x) or l)
where we:
- Append x to l and return that l when x is not in l. Thanks to the or statement, .append is evaluated and l is returned after that.
- Return l untouched when x is in l.
- I try to understand why unique = [used.append(x) for x in mylist if x not in used] is not working. Why do we have to put and (used.append(x) or True) at the end of the list comprehensions? – Monica, Aug 13, 2016 at 17:45
- @Monica basically, because used.append(x) adds x into used but the return value from this function is None, so if we skip the or True part, we get: x not in used and None which will always evaluate to False and the unique list will remain empty. – Todor, Aug 13, 2016 at 19:20
- Don't worry, there are no stupid questions, only stupid answers :) I updated my answer with an attempt to better explain how it works, hope I make it clear and you can understand it now. – Todor, Aug 14, 2016 at 0:21
- Even faster is using a set: timeit.timeit('[x for x in mylist if x not in used and not used.add(x)]', setup='used = set();'+setup) – Michael, Nov 9, 2016 at 12:12
- Another option worth mentioning and working since Python 3.7 is using dict as it keeps the order of the keys but also eliminates duplicates: list(dict.fromkeys(mylist)) Timing-wise it positions as 3rd. – radeklat, Dec 10, 2020 at 15:12
A Python list:
>>> a = ['a', 'b', 'c', 'd', 'b']
To get unique items, just transform it into a set (which you can transform back again into a list if required):
>>> b = set(a)
>>> print(b)
{'b', 'c', 'd', 'a'}
- Nice, so a = list(set(a)) gets the unique items. – Brian Burns, Aug 24, 2013 at 23:08
- Brian, set(a) is sufficient to "get the unique items". You only need to construct another list if you specifically need a list for some reason. – jbg, Jun 30, 2014 at 11:02
- Note the result will be unordered. – Timothy Aaron, Jan 23, 2017 at 22:13
What type is your output variable?
Python sets are what you need. Declare output like this:
output = set() # initialize an empty set
and you're ready to go adding elements with output.add(elem) and be sure they're unique.
Warning: sets DO NOT preserve the original order of the list.
Options to remove duplicates may include the following generic data structures:
- set: unordered, unique elements
- ordered set: ordered, unique elements
Here is a summary on quickly getting either one in Python.
Given
from collections import OrderedDict
seq = [u"nowplaying", u"PBS", u"PBS", u"nowplaying", u"job", u"debate", u"thenandnow"]
Code
Option 1 - A set (unordered):
list(set(seq))
# ['thenandnow', 'PBS', 'debate', 'job', 'nowplaying']
Python doesn't have ordered sets, but here are some ways to mimic one.
Option 2 - an OrderedDict (insertion ordered):
list(OrderedDict.fromkeys(seq))
# ['nowplaying', 'PBS', 'job', 'debate', 'thenandnow']
Option 3 - a dict (insertion ordered): an implementation detail of CPython 3.6, guaranteed by the language from Python 3.7. See more details in this post:
list(dict.fromkeys(seq))
# ['nowplaying', 'PBS', 'job', 'debate', 'thenandnow']
Note: listed elements must be hashable. See details on the latter example in this blog post. Furthermore, see R. Hettinger's post on the same technique; the order preserving dict is extended from one of his early implementations. See also more on total ordering.
- @Henry Henrinson I appreciate your voicing your reason in down-voting this answer. However, your opinion and claim "The Python 3.6 solution is not order preserving" are not qualified with references. To be clear, in Python 3.6, dictionaries preserve insertion order in the CPython implementation. It is a language feature in Python 3.7+. Moreover, see an on-going blog post on that approach claimed at that time to be the fastest ordered option in Python 3.6. – pylang, May 1, 2019 at 17:49
Maintaining order:
# oneliners
# slow -> . --- 14.417 seconds ---
[x for i, x in enumerate(array) if x not in array[0:i]]
# fast -> . --- 0.0378 seconds ---
[x for i, x in enumerate(array) if array.index(x) == i]
# multiple lines
# fastest -> --- 0.012 seconds ---
uniq = []
[uniq.append(x) for x in array if x not in uniq]
uniq
Order doesn't matter:
# fastest-est -> --- 0.0035 seconds ---
list(set(array))
- This has terrible performance (O(n^2)) for large lists and is neither simpler nor easier to read than list(set(array)). The only advantage is the preservation of order, which was not asked for. – jlh, Sep 27, 2017 at 9:38
- This is great for simple scripts where you want to keep order and don't care about speed. – JeffCharter, Jan 23, 2018 at 18:04
- @JeffCharter - added one that maintains order and is mucho faster :) – daino3, Feb 7, 2018 at 17:08
- @MMT - list comprehension – daino3, Feb 21, 2018 at 15:47
- I really appreciate you taking the time to break out the timestamps too – Lotus, Dec 8, 2018 at 17:53
Getting unique elements from List
mylist = [1,2,3,4,5,6,6,7,7,8,8,9,9,10]
Using Simple Logic from Sets - a set is a collection of unique items
mylist=list(set(mylist))
In [0]: mylist
Out[0]: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
Using Simple Logic
newList=[]
for i in mylist:
    if i not in newList:
        newList.append(i)
In [0]: newList
Out[0]: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
Using the pop method: pop removes and returns the last item, or the item at the given index.
k=0
while k < len(mylist):
    if mylist[k] in mylist[k+1:]:
        mylist.pop(k)  # pop takes an index, not a value
    else:
        k=k+1
In [0]: mylist
Out[0]: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
Using Numpy
import numpy as np
np.unique(mylist)
In [0]: mylist
Out[0]: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
- this answer deserves more updoots: for unhashable types where you want to check value uniqueness rather than identity uniqueness the simple logic is correct - meaning it's more correct in general. – ocket8888, Aug 15, 2018 at 16:30
If you are using numpy in your code (which might be a good choice for larger amounts of data), check out numpy.unique:
>>> import numpy as np
>>> wordsList = [u'nowplaying', u'PBS', u'PBS', u'nowplaying', u'job', u'debate', u'thenandnow']
>>> np.unique(wordsList)
array([u'PBS', u'debate', u'job', u'nowplaying', u'thenandnow'],
dtype='<U10')
(http://docs.scipy.org/doc/numpy/reference/generated/numpy.unique.html)
As you can see, numpy supports not only numeric data, string arrays are also possible. Of course, the result is a numpy array, but it doesn't matter a lot, because it still behaves like a sequence:
>>> for word in np.unique(wordsList):
... print word
...
PBS
debate
job
nowplaying
thenandnow
If you really want to have a vanilla python list back, you can always call list().
However, the result is automatically sorted, as you can see from the above code fragments. Check out numpy unique without sort if retaining list order is required.
A set is an unordered collection of unique elements. A list can be passed to the set constructor, so passing a list with duplicate elements gives a set of unique elements, which can then be converted back to a list. I can say nothing about performance and memory overhead, but I hope it's not so important with small lists.
list(set(my_not_unique_list))
Simple and short.
- Could you add some explanation on your code for OP? – Paco, Feb 6, 2015 at 12:54
- I tried your answer, this is a good answer but with an explanation it will turn into a great answer :) – Papouche Guinslyzinho, Feb 24, 2015 at 11:35
- set - unordered collection of unique elements. List of elements can be passed to set's constructor. So, pass list with duplicate elements, we get set with unique elements and transform it back to list then get list with unique elements. I can say nothing about performance and memory overhead, but I hope, it's not so important with small lists. – MultiTeemer, Feb 28, 2015 at 1:36
Same order unique list using only a list comprehension.
> my_list = [1, 2, 1, 3, 2, 4, 3, 5, 4, 3, 2, 3, 1]
> unique_list = [
> e
> for i, e in enumerate(my_list)
> if my_list.index(e) == i
> ]
> unique_list
[1, 2, 3, 4, 5]
enumerate gives the index i and element e as a tuple.
my_list.index returns the first index of e. If the first index isn't i then the current iteration's e is not the first e in the list.
Edit
I should note that this isn't a good way to do it, performance-wise. This is just a way that achieves it using only a list comprehension.
First thing, the example you gave is not a valid list.
example_list = [u'nowplaying',u'PBS', u'PBS', u'nowplaying', u'job', u'debate',u'thenandnow']
Suppose the above is the example list. Then you can use the following recipe from the itertools documentation, which returns the unique values while preserving their order, as you seem to require. The iterable here is example_list.
from itertools import ifilterfalse  # Python 2; on Python 3 use itertools.filterfalse
def unique_everseen(iterable, key=None):
    "List unique elements, preserving order. Remember all elements ever seen."
    # unique_everseen('AAAABBBCCDAABBB') --> A B C D
    # unique_everseen('ABBCcAD', str.lower) --> A B C D
    seen = set()
    seen_add = seen.add
    if key is None:
        for element in ifilterfalse(seen.__contains__, iterable):
            seen_add(element)
            yield element
    else:
        for element in iterable:
            k = key(element)
            if k not in seen:
                seen_add(k)
                yield element
- What's the reason for seen_add = seen.add? – wjandrea, May 20, 2017 at 3:08
- It saves one attribute lookup for each element. – Michael, Jan 18, 2018 at 17:43
- What is the purpose of ifilterfalse(seen.__contains__, iterable)? Is there a benefit versus for element not in seen:...? – jpp, May 22, 2018 at 8:45
As a bonus, Counter is a simple way to get both the unique values and the count for each value:
from collections import Counter
l = [u'nowplaying', u'PBS', u'PBS', u'nowplaying', u'job', u'debate', u'thenandnow']
c = Counter(l)
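A short usage sketch of how the unique values and their counts can be read back from the Counter. It assumes Python 3.7+, where Counter (a dict subclass) keeps keys in first-seen order; unique_values is just an illustrative name.

```python
from collections import Counter

l = ['nowplaying', 'PBS', 'PBS', 'nowplaying', 'job', 'debate', 'thenandnow']
c = Counter(l)

# Iterating a Counter yields its keys, i.e. the unique values.
unique_values = list(c)
print(unique_values)
# ['nowplaying', 'PBS', 'job', 'debate', 'thenandnow']

# Indexing gives the number of occurrences of a value.
print(c['PBS'])   # 2
print(c['job'])   # 1
```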
By using a set comprehension (note that the braces here build a set, not a dictionary):
inp=[u'nowplaying', u'PBS', u'PBS', u'nowplaying', u'job', u'debate', u'thenandnow']
d={i for i in inp}
print d
Output will be:
set([u'nowplaying', u'job', u'debate', u'PBS', u'thenandnow'])
- And, from dynamic values? – e-info128, May 21, 2018 at 17:33
- @e-info128 Quite similarly, put those in a set. – tripleee, Dec 4, 2018 at 11:08
- This is a set, not a dict. – tripleee, Dec 4, 2018 at 11:09
def get_distinct(original_list):
    distinct_list = []
    for each in original_list:
        if each not in distinct_list:
            distinct_list.append(each)
    return distinct_list
- please add some explanation - this is only code. If you look at the other answers, they always go with code and explanation. – Alexander, Jan 25, 2016 at 10:18
- @Alexander not always useless, but typically is. – ivan_pozdeev, Jan 25, 2016 at 17:40
set can help you filter out the elements from the list that are duplicates. It will work well for str, int or tuple elements, but if your list contains dict or other list elements, then you will end up with TypeError exceptions.
Here is a general order-preserving solution to handle some (not all) non-hashable types:
def unique_elements(iterable):
    seen = set()
    result = []
    for element in iterable:
        hashed = element
        if isinstance(element, dict):
            # items() works on Python 2 and 3 (the original used iteritems())
            hashed = tuple(sorted(element.items()))
        elif isinstance(element, list):
            hashed = tuple(element)
        if hashed not in seen:
            result.append(element)
            seen.add(hashed)
    return result
def setlist(lst=[]):
    return list(set(lst))
- Try not to use [] as a default parameter. It is the same instance that is used every time so modifications affect the next time the function is called. Not so much of an issue here but it's still unnecessary. – Holloway, Jun 16, 2014 at 8:32
- @Trengot Exactly. It should be lst=None, and add a line lst = [] if lst is None – xis, Jul 24, 2014 at 20:29
- @xis: or simply lst or [] – mike3996, Dec 17, 2014 at 12:16
- Please note, the result will be unordered. – Aminah Nuraini, Oct 26, 2015 at 8:46
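The default-argument advice from the comments could look roughly like this. This is a sketch of the suggestion, not the original answerer's code:

```python
def setlist(lst=None):
    """Return the unique elements of lst as a list (order is not preserved)."""
    if lst is None:   # avoid sharing one mutable default between calls
        lst = []
    return list(set(lst))

print(setlist())                       # []
print(sorted(setlist([3, 1, 3, 2])))   # [1, 2, 3]
```

Since this function never mutates lst, the shared default would not actually misbehave here, but the None pattern avoids the trap if the body ever changes.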
If you want to get unique elements from a list and keep their original order, then you may employ the OrderedDict data structure from Python's standard library:
from collections import OrderedDict
def keep_unique(elements):
    return list(OrderedDict.fromkeys(elements).keys())
elements = [2, 1, 4, 2, 1, 1, 5, 3, 1, 1]
required_output = [2, 1, 4, 5, 3]
assert keep_unique(elements) == required_output
In fact, if you are using Python ≥ 3.6, you can use plain dict for that:
def keep_unique(elements):
    return list(dict.fromkeys(elements).keys())
This became possible after the introduction of the "compact" representation of dicts. Check it out here. In Python 3.6 this was "considered an implementation detail and should not be relied upon"; from Python 3.7 insertion order is guaranteed by the language.
- I'd like to really drive home that last point. Having a dict internally keep the order of insertion is an implementation detail of CPython, and there is no guarantee that it will work on another Python engine (like PyPy or IronPython), and it can change in future versions without breaking backward compatibility. So please don't depend on that behaviour in any production-ready code. – Berislav Lopac, Mar 18, 2017 at 11:08
- @BerislavLopac, I absolutely agree. It may change and it does not follow "Readability counts" rule. But it's still convenient for one-off scripts and REPL sessions. – skovorodkin, Mar 23, 2017 at 7:22
- In fact -- to correct my own point -- starting with Python 3.7 the ordered dicts are actually a language feature instead of an implementation quirk. See the answer at stackoverflow.com/a/39980744/122033 – Berislav Lopac, Dec 4, 2018 at 15:28
To get unique values from your list use code below:
trends = [u'nowplaying', u'PBS', u'PBS', u'nowplaying', u'job', u'debate', u'thenandnow']
output = set(trends)
output = list(output)
IMPORTANT: The approach above won't work if any of the items in the list is not hashable, which is the case for mutable types such as list or dict.
trends = [{'super':u'nowplaying'}, u'PBS', u'PBS', u'nowplaying', u'job', u'debate', u'thenandnow']
output = set(trends)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'dict'
That means you have to be sure that the trends list always contains only hashable items; otherwise you have to use more sophisticated code:
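from copy import deepcopy
try:
    trends = [{'super':u'nowplaying'}, [u'PBS',], [u'PBS',], u'nowplaying', u'job', u'debate', u'thenandnow', {'super':u'nowplaying'}]
    output = set(trends)
    output = list(output)
except TypeError:
    # set() raised before output was assigned, so start from an empty list
    output = []
    trends_copy = deepcopy(trends)
    while trends_copy:
        trend = trends_copy.pop()
        # keep an item only once no equal item remains in the copy;
        # the result comes out in reverse order of first occurrence
        if trends_copy.count(trend) == 0:
            output.append(trend)
print(output)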
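An alternative sketch for mixed hashable and unhashable items: try the fast set lookup first and fall back to equality comparison only for items that cannot be hashed. The function name unique_any is hypothetical, and this is one possible approach rather than the method used in the answer above.

```python
def unique_any(items):
    """Order-preserving de-duplication; unhashable items fall back to == checks."""
    seen_hashable = set()
    seen_unhashable = []
    result = []
    for item in items:
        try:
            if item in seen_hashable:      # raises TypeError for unhashable items
                continue
            seen_hashable.add(item)
        except TypeError:                  # e.g. list or dict elements
            if item in seen_unhashable:    # linear scan using ==
                continue
            seen_unhashable.append(item)
        result.append(item)
    return result

trends = [{'super': 'nowplaying'}, ['PBS'], ['PBS'], 'nowplaying', 'job',
          'nowplaying', {'super': 'nowplaying'}]
print(unique_any(trends))
# [{'super': 'nowplaying'}, ['PBS'], 'nowplaying', 'job']
```

Hashable items stay at average O(1) per lookup; only the unhashable ones pay for a linear scan.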
I am surprised that nobody so far has given a direct order-preserving answer:
def unique(sequence):
    """Generate unique items from sequence in the order of first occurrence."""
    seen = set()
    for value in sequence:
        if value in seen:
            continue
        seen.add(value)
        yield value
It will generate the values so it works with more than just lists, e.g. unique(range(10)). To get a list, just call list(unique(sequence)), like this:
>>> list(unique([u'nowplaying', u'PBS', u'PBS', u'nowplaying', u'job', u'debate', u'thenandnow']))
[u'nowplaying', u'PBS', u'job', u'debate', u'thenandnow']
It requires each item to be hashable and not just comparable, but most things in Python are. It is O(n) and not O(n^2), so it will work just fine with a long list.
In addition to the previous answers, which say you can convert your list to set, you can do that in this way too
mylist = [u'nowplaying', u'PBS', u'PBS', u'nowplaying', u'job', u'debate', u'thenadnow']
mylist = [i for i in set(mylist)]
output will be
[u'nowplaying', u'job', u'debate', u'PBS', u'thenadnow']
though order will not be preserved.
Another simpler answer could be (without using sets)
>>> t = [v for i,v in enumerate(mylist) if mylist.index(v) == i]
[u'nowplaying', u'PBS', u'job', u'debate', u'thenadnow']
- At the begin of your code just declare your output list as empty:
output=[]
- Instead of your code you may use this code
trends=list(set(trends))
- Please note, the result will be unordered. – Aminah Nuraini, Oct 26, 2015 at 8:46
You can use sets. Just to be clear, here is the difference between a list and a set: sets are unordered collections of unique elements, while lists are ordered collections of elements. So,
unicode_list=[u'nowplaying', u'PBS', u'PBS', u'nowplaying', u'job',u'debate', u'thenandnow']
list_unique=list(set(unicode_list))
print list_unique
[u'nowplaying', u'job', u'debate', u'PBS', u'thenandnow']
But: do not use list/set as variable names. It will cause an error. For example, using list instead of unicode_list in the code above:
list=[u'nowplaying', u'PBS', u'PBS', u'nowplaying', u'job',u'debate', u'thenandnow']
list_unique=list(set(list))
print list_unique
list_unique=list(set(list))
TypeError: 'list' object is not callable
use set to de-duplicate a list, return as list
def get_unique_list(lst):
    if isinstance(lst,list):
        return list(set(lst))
- This approach will change the order of the elements in the list which might be undesirable behavior – gomons, May 30, 2018 at 8:02
Set is a collection of un-ordered and unique elements. So, you can use set as below to get a unique list:
unique_list = list(set([u'nowplaying', u'PBS', u'PBS', u'nowplaying', u'job', u'debate', u'thenandnow']))
- Although this code may answer the question, providing additional context regarding why and/or how it answers the question would significantly improve its long-term value. Please edit your answer to add some explanation. – Toby Speight, May 31, 2016 at 15:42
- "Set is a collection of ordered and unique elements." Unfortunately not; sets are not ordered as noted in the answers above. – kuzzooroo, Aug 27, 2019 at 4:22
My solution to check contents for uniqueness but preserve the original order:
def getUnique(self):
    notunique = self.readLines()
    unique = []
    for line in notunique: # Loop over content
        append = True # Will be set to false if line matches existing line
        for existing in unique:
            if line == existing: # Line exists ? do not append and go to the next line
                append = False
                break # Already know line is not unique, break loop
        if append: unique.append(line) # Line not found? add to list
    return unique
Edit: Probably can be more efficient by using dictionary keys to check for existence instead of doing a whole file loop for each line, I wouldn't use my solution for large sets.
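The idea from the edit note could be sketched as follows, using a set (rather than dictionary keys) for the existence check. The standalone function unique_lines is hypothetical and takes the lines as an argument instead of calling self.readLines().

```python
def unique_lines(lines):
    """Return lines without duplicates, keeping first-occurrence order."""
    seen = set()
    unique = []
    for line in lines:
        if line not in seen:   # average O(1) lookup instead of scanning `unique`
            seen.add(line)
            unique.append(line)
    return unique

print(unique_lines(['a\n', 'b\n', 'a\n', 'c\n']))
# ['a\n', 'b\n', 'c\n']
```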
I know this is an old question, but here's my unique solution: class inheritance!:
class UniqueList(list):
    def appendunique(self,item):
        if item not in self:
            self.append(item)
            return True
        return False
Then, if you want to uniquely append items to a list you just call appendunique on a UniqueList. Because it inherits from a list, it basically acts like a list, so you can use functions like index() etc. And because it returns true or false, you can find out if appending succeeded (unique item) or failed (already in the list).
To get a unique list of items from a list, use a for loop appending items to a UniqueList (then copy over to the list).
Example usage code:
unique = UniqueList()
for each in [1,2,2,3,3,4]:
    if unique.appendunique(each):
        print 'Uniquely appended ' + str(each)
    else:
        print 'Already contains ' + str(each)
Prints:
Uniquely appended 1
Uniquely appended 2
Already contains 2
Uniquely appended 3
Already contains 3
Uniquely appended 4
Copying to list:
unique = UniqueList()
for each in [1,2,2,3,3,4]:
    unique.appendunique(each)
newlist = unique[:]
print newlist
Prints:
[1, 2, 3, 4]
For long arrays
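import numpy as np
# var is assumed to be a 1-D numeric NumPy array
s = np.empty(len(var))
s[:] = np.nan
for x in set(var):
    x_positions = np.where(var==x)
    s[x_positions[0][0]]=x  # store x at the index of its first occurrence
sorted_var=s[~np.isnan(s)]  # unique values in order of first occurrence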
Try this function, it's similar to your code but it's a dynamic range.
def unique(a):
    k=0
    while k < len(a):
        if a[k] in a[k+1:]:
            a.pop(k)
        else:
            k=k+1
    return a
Use the following function:
def uniquefy_list(input_list):
    """
    This function takes a list as input and returns a list containing only unique elements from the input list
    """
    output_list=[]
    for elm123 in input_list:
        in_both_lists=0
        for elm234 in output_list:
            if elm123 == elm234:
                in_both_lists=1
                break
        if in_both_lists == 0:
            output_list.append(elm123)
    return output_list
set, which is dependent on the types found in the list. e.g: d = dict(); l = list(); l.append(d); set(l) will lead to TypeError: unhashable type: 'dict'. frozenset instead won't save you. Learn it the real pythonic way: implement a nested n^2 loop for a simple task of removing duplicates from a list. You can then optimize it to n.log n. Or implement a real hashing for your objects. Or marshal your objects before creating a set for it.
unique_items = list(dict.fromkeys(list_with_duplicates)) (CPython 3.6+)