Get unique values from a list in Python

Question

I want to get the unique values from the following list:

['nowplaying', 'PBS', 'PBS', 'nowplaying', 'job', 'debate', 'thenandnow']

The output which I require is:

['nowplaying', 'PBS', 'job', 'debate', 'thenandnow']

This code works:

output = []
for x in trends:
    if x not in output:
        output.append(x)
print(output)

Is there a better solution I should use?

Does the order matter? I.e. do you want the order of first occurrence, or would ["PBS", "debate", "job", "thenandnow", "nowplaying"] work as well? — DSM, Oct 15 '12 at 14:16
All the top solutions work for the example in the question, but they don't answer the question. They all use `set`, which depends on the types found in the list. E.g. `d = dict();l = list();l.append (d);set(l)` will lead to `TypeError: unhashable type: 'dict'`. `frozenset` won't save you either. Learn it the real pythonic way: implement a nested n^2 loop for the simple task of removing duplicates from a list. You can then optimize it to n.log n. Or implement real hashing for your objects. Or marshal your objects before creating a set from them. – ribamar, Aug 10 '16 at 17:18
If you need to preserve the order of the list: `unique_items = list(dict.fromkeys(list_with_duplicates))` (CPython 3.6+) — Boris Verkhovskiy, Nov 03 '19 at 20:08
related: [How to use multiprocessing to drop duplicates in a very big list?](https://stackoverflow.com/q/59762414/9059420) — Darkonaut, Jan 31 '20 at 21:43

score 1468 · Answer 1 · edited Oct 09 '19 at 04:05


First declare your list properly, separated by commas. You can get the unique values by converting the list to a set.

mylist = ['nowplaying', 'PBS', 'PBS', 'nowplaying', 'job', 'debate', 'thenandnow']
myset = set(mylist)
print(myset)

If you use it further as a list, you should convert it back to a list by doing:

mynewlist = list(myset)

Another possibility, probably faster, would be to use a set from the beginning instead of a list. Then your code would be:

output = set()
for x in trends:
    output.add(x)
print(output)

As it has been pointed out, sets do not maintain the original order. If you need that, you should look for an ordered set implementation (see this question for more).
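
If the order of first occurrence matters and the items are hashable, one possible sketch (assuming Python 3.7+, where dict insertion order is guaranteed) uses dict keys as a stand-in for an ordered set. The name `unique_in_order` is only illustrative:

```python
# Sketch: dict keys act as an insertion-ordered set (guaranteed from Python 3.7).
mylist = ['nowplaying', 'PBS', 'PBS', 'nowplaying', 'job', 'debate', 'thenandnow']

# fromkeys() keeps the first occurrence of each key and ignores later repeats.
unique_in_order = list(dict.fromkeys(mylist))
print(unique_in_order)
```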

edited Oct 09 '19 at 04:05 by Boris Verkhovskiy · answered Oct 15 '12 at 14:11 by lefterav

8

If you need to maintain the set order there is also a library on PyPI: https://pypi.python.org/pypi/ordered-set – Jace Browning Sep 26 '13 at 01:12
11

why lists have '.append' and sets have '.add' ?? – Antonello Jan 28 '14 at 11:05
1

Sorry, this is rather a philosophical question. I think it is meant to have a different name, so that it is clear that when you are adding something in the set your item will be lost if an equal item is already in the list. – lefterav Jan 30 '14 at 12:01
67

"append" means to add to the end, which is accurate and makes sense for lists, but sets have no notion of ordering and hence no beginning or end, so "add" makes more sense for them. – maackle Mar 11 '14 at 03:01
I'm new to Python, but it looks like `sets()` is [deprecated](https://docs.python.org/2/library/sets.html) – Himmel Jul 11 '15 at 00:19
3

the 'sets' module is deprecated, yes. So you don't have to 'import sets' to get the functionality. if you see `import sets; output = sets.Set()` that's deprecated This answer uses the built-in 'set' class https://docs.python.org/2/library/stdtypes.html#set – FlipMcF Dec 09 '15 at 00:25
1

Maintaining the order is as simple as `mylist = list(sorted(set(mylist)))` so long as the required ordering is in line with Python's default sort. – Ninjakannon Jul 17 '17 at 20:08
11

This does not work if the values of the list are not hashable (e.g., sets or lists) – steffen May 02 '18 at 05:14
mylist = list(set(mylist)) would be a summary of the above. – D.L Jun 15 '20 at 18:43
"probably faster" - you will never be sure about that - especially as it might be dependent on your individual version (including compiler variants of python language). without measuring real world runtimes its not sure. seeing that it takes more python level codes would mean to me that it might be slower rather than faster - the lower level codes are likely to be faster when used as a single call rather than for the case of doing multiple calls in a loop. by the way, for example sort algorithms will show very varying time growth rates when amount of data increases. this is sort of a sort. – Alexander Stohr Apr 08 '22 at 09:53

score 462 · Answer 2 · edited May 27 '16 at 13:18


To be consistent with the type I would use:

mylist = list(set(mylist))

edited May 27 '16 at 13:18 by Max Alibaev · answered Dec 04 '14 at 23:02 by alemol

135

Please note, the result will be unordered. – Aminah Nuraini Oct 26 '15 at 08:45
43

@Ninjakannon your code will sort the list alphabetically. That does not have to be the order of the original list. – johk95 Jul 27 '17 at 10:37
@johk95 True, I should have clarified. However, it is also possible to provide your own sorting method to `sorted`. – Ninjakannon Jul 28 '17 at 11:26
20

Note a neat way to do this in python 3 is `mylist = [*{*mylist}]`. This is an `*arg`-style set-expansion followed by an `*arg`-style list-expansion. – Luke Davis Dec 11 '17 at 10:10
5

@LukeDavis best answer for me, `sorted([*{*c}])` is 25% faster than `sorted(list(set(c)))` (measured with `timeit.repeat` with number=100000) – jeannej Dec 05 '18 at 17:58
The question is "Get unique values from a list" not Sorting methods. – alemol Dec 18 '18 at 16:55
6

N.B.: This fails if the list has unhashable elements.(e.g. elements which are itself sets, lists or hashes). – Heinrich supports Monica Apr 20 '20 at 12:40

score 226 · Answer 3 · edited Jul 05 '23 at 07:29

If we need to keep the elements order, how about this:

used = set()
mylist = [u'nowplaying', u'PBS', u'PBS', u'nowplaying', u'job', u'debate', u'thenandnow']
unique = [x for x in mylist if x not in used and (used.add(x) or True)]

And one more solution using reduce, without the temporary used variable (in Python 3, reduce must first be imported from functools).

mylist = [u'nowplaying', u'PBS', u'PBS', u'nowplaying', u'job', u'debate', u'thenandnow']
unique = reduce(lambda l, x: l.append(x) or l if x not in l else l, mylist, [])

UPDATE - Dec, 2020 - Maybe the best approach!

Starting from python 3.7, the standard dict preserves insertion order.

Changed in version 3.7: Dictionary order is guaranteed to be insertion order. This behavior was an implementation detail of CPython from 3.6.

So this gives us the ability to use dict.fromkeys() for de-duplication!

NOTE: Credit goes to @rlat for suggesting this approach in the comments!

mylist = [u'nowplaying', u'PBS', u'PBS', u'nowplaying', u'job', u'debate', u'thenandnow']
unique = list(dict.fromkeys(mylist))

In terms of speed, for me it's fast enough and readable enough to become my new favorite approach!

UPDATE - March, 2019

And a 3rd solution, which is a neat one, but kind of slow since .index is O(n), which makes the whole comprehension O(n^2).

mylist = [u'nowplaying', u'PBS', u'PBS', u'nowplaying', u'job', u'debate', u'thenandnow']
unique = [x for i, x in enumerate(mylist) if i == mylist.index(x)]

UPDATE - Oct, 2016

Another solution with reduce, but this time without .append, which makes it more readable and easier to understand.

mylist = [u'nowplaying', u'PBS', u'PBS', u'nowplaying', u'job', u'debate', u'thenandnow']
unique = reduce(lambda l, x: l+[x] if x not in l else l, mylist, [])
# which can also be written as:
unique = reduce(lambda l, x: l if x in l else l+[x], mylist, [])

NOTE: Keep in mind that the more human-readable the code gets, the slower it tends to run. The only exception is the dict.fromkeys() approach, which relies on Python 3.7+.

import timeit

setup = "mylist = [u'nowplaying', u'PBS', u'PBS', u'nowplaying', u'job', u'debate', u'thenandnow']"

# Thanks to Michael for pointing out that we can get faster with set()
timeit.timeit('[x for x in mylist if x not in used and (used.add(x) or True)]', setup='used = set();'+setup)
0.2029558869980974

timeit.timeit('[x for x in mylist if x not in used and (used.append(x) or True)]', setup='used = [];'+setup)
0.28999493700030143

# Thanks to rlat for suggesting this approach!
timeit.timeit('list(dict.fromkeys(mylist))', setup=setup)
0.31227896199925453

timeit.timeit('reduce(lambda l, x: l.append(x) or l if x not in l else l, mylist, [])', setup='from functools import reduce;'+setup)
0.7149233570016804

timeit.timeit('reduce(lambda l, x: l+[x] if x not in l else l, mylist, [])', setup='from functools import reduce;'+setup)
0.7379565160008497

timeit.timeit('reduce(lambda l, x: l if x in l else l+[x], mylist, [])', setup='from functools import reduce;'+setup)
0.7400134069976048

timeit.timeit('[x for i, x in enumerate(mylist) if i == mylist.index(x)]', setup=setup)
0.9154880290006986
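
The absolute numbers above depend on the interpreter version and the machine, so they are best read as a rough ranking. Note also that `timeit` runs `setup` only once per call, so in the `used`-based timings the `used` container is already full after the first iteration. A hedged sketch (Python 3; `with_seen_set` and `candidates` are illustrative names, not part of the answer) that builds a fresh container on every call and first checks that the approaches agree:

```python
import timeit
from functools import reduce

mylist = ['nowplaying', 'PBS', 'PBS', 'nowplaying', 'job', 'debate', 'thenandnow']

def with_seen_set(items):
    used = set()  # fresh set on every call, so repeated runs stay comparable
    return [x for x in items if x not in used and (used.add(x) or True)]

# `candidates` is an illustrative name mapping a label to each approach.
candidates = {
    'seen set': with_seen_set,
    'dict.fromkeys': lambda items: list(dict.fromkeys(items)),
    'reduce': lambda items: reduce(lambda l, x: l if x in l else l + [x], items, []),
    'index': lambda items: [x for i, x in enumerate(items) if i == items.index(x)],
}

results = {name: func(mylist) for name, func in candidates.items()}
expected = results['dict.fromkeys']
# Every approach should produce the same order-preserving result.
all_agree = all(result == expected for result in results.values())

for name, func in candidates.items():
    seconds = timeit.timeit(lambda: func(mylist), number=10000)
    print(f'{name:>14}: {seconds:.4f} s')
```

Wrapping each approach in a function adds call overhead, so the figures are not directly comparable with the ones quoted above.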

ANSWERING COMMENTS

@monica asked a good question: "how is this working?" For everyone who has trouble figuring it out, I will try to give a deeper explanation of how this works and what sorcery is happening here ;)

So she first asked:

I try to understand why unique = [used.append(x) for x in mylist if x not in used] is not working.

Well it's actually working

>>> used = []
>>> mylist = [u'nowplaying', u'PBS', u'PBS', u'nowplaying', u'job', u'debate', u'thenandnow']
>>> unique = [used.append(x) for x in mylist if x not in used]
>>> print used
[u'nowplaying', u'PBS', u'job', u'debate', u'thenandnow']
>>> print unique
[None, None, None, None, None]

The problem is that we are just not getting the desired results inside the unique variable, but only inside the used variable. This is because during the list comprehension .append modifies the used variable and returns None.

So in order to get the results into the unique variable, and still use the same logic with .append(x) if x not in used, we need to move this .append call to the right side of the list comprehension and just return x on the left side.

But if we are too naive and just go with:

>>> unique = [x for x in mylist if x not in used and used.append(x)]
>>> print unique
[]

We will get nothing in return.

Again, this is because the .append method returns None, which gives our logical expression the following form:

x not in used and None

This will always:

evaluate to False when x is in used,
evaluate to None when x is not in used.

In both cases (False/None) the result is treated as a falsy value, so we get an empty list.

But why does this evaluate to None when x is not in used? Someone may ask.

Well, it's because this is how Python's short-circuit operators work.

The expression x and y first evaluates x; if x is false, its value is returned; otherwise, y is evaluated and the resulting value is returned.

So when x is not in used (i.e. when the first operand is True), the next part of the expression (used.append(x)) is evaluated and its value (None) is returned.

But that's what we want: to get the unique elements from a list with duplicates, we want to .append them to a new list only when we come across them for the first time.

So we really want to evaluate used.append(x) only when x is not in used; if there is a way to turn this None value into a truthy one, we will be fine, right?

Well, yes, and here is where the 2nd type of short-circuit operator comes into play.

The expression x or y first evaluates x; if x is true, its value is returned; otherwise, y is evaluated and the resulting value is returned.

We know that .append(x) will always be falsy, so if we just add an or after it, we will always get the next part. That's why we write:

x not in used and (used.append(x) or True)

so we can evaluate used.append(x) and get True as a result, only when the first part of the expression (x not in used) is True.

A similar pattern can be seen in the 2nd approach with the reduce method.

(l.append(x) or l) if x not in l else l
#similar as the above, but maybe more readable
#we return l unchanged when x is in l
#we append x to l and return l when x is not in l
l if x in l else (l.append(x) or l)

where we:

Append x to l and return that l when x is not in l. Thanks to the or operator, .append is evaluated and l is returned after that.
Return l untouched when x is in l.
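
The short-circuit behaviour described above can be observed directly. A small sketch in Python 3 syntax (all variable names are only illustrative):

```python
used = []

# list.append() mutates the list in place and returns None.
append_result = used.append('PBS')

# `and` returns its first falsy operand, otherwise its last operand.
and_result = True and None   # None, which is falsy
# `or` returns its first truthy operand, otherwise its last operand.
or_result = None or True     # True

mylist = ['nowplaying', 'PBS', 'PBS', 'nowplaying', 'job']

# Naive version: the condition is always falsy, so nothing is kept,
# although `used` is still filled as a side effect.
used = []
naive = [x for x in mylist if x not in used and used.append(x)]
used_after_naive = list(used)

# Fixed version: `or True` turns the None from append() into a truthy value.
used = []
fixed = [x for x in mylist if x not in used and (used.append(x) or True)]
print(naive, used_after_naive, fixed)
```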

I try to understand why `unique = [used.append(x) for x in mylist if x not in used]` is not working. Why do we have to put `and (used.append(x) or True) `at the end of the list comprehensions? — Monica, Aug 13 '16 at 17:45
@Monica basically, because `used.append(x)` adds `x` into `used` but the return value from this function is `None`, so if we skip the `or True` part, we get: `x not in used and None` which will always evaluate to `False` and the `unique` list will remain empty. — Todor, Aug 13 '16 at 19:20
So if I understood it correctly, we use logic in this problem. if a or b is TRUE we are getting TRUE. Though I'm still wondering how Python know that it will be an object instead of TRUE logic. Sorry if it is a stupid question, but I'd love to understand it — Monica, Aug 13 '16 at 21:28
Don't worry, there are no stupid questions, only stupid answers :) I updated my answer with an attempt to better explain how it works, hope I make it clear and you can understand it now. — Todor, Aug 14 '16 at 00:21
I really appreciate your profound answer!! It helps a lot! :) I just want to make sure as it comes to the first part why we are getting [None, None, None, None, None]. The reason is that we append the x value to the used list and assign it to the variable unique. Well, the append is a destructive operator what means that it modifies the list in place and does not return a new list. Therefore it returns None. Am I right? — Monica, Aug 14 '16 at 14:52
Even faster is using a set: `timeit.timeit('[x for x in mylist if x not in used and not used.add(x)]', setup='used = set();'+setup)` — Michael, Nov 09 '16 at 12:12
IMO the most readable option is the fastest and least readable is the slowest. — Nulano, Jul 08 '19 at 18:30
python definitely needs a method for uniquing, before the code gets too ~~pythonic~~ unreadable. — Nik O'Lai, Nov 01 '19 at 19:07
Another option worth mentioning and working since Python 3.7 is using `dict` as it keeps the order of the keys but also eliminates duplicates: `list(dict.fromkeys(mylist))` Timing-wise it positions as 3rd. — rlat, Dec 10 '20 at 15:12
for comparison, and bearing in mind this does not preserve order, consider adding a timer for the basic approach - `list(set(mylist))` ... in my testing, that was second fastest, after your top solution, and faster than `[i for i in set(mylist)]` which was unintuitively slow. IF the consumer does not care about the type of the return, then simply `set(mylist)` was fastest in my testing, alongside the somewhat specialist `[*{*mylist}]` which was as fast as the set solution. — F1Rumors, Feb 08 '23 at 14:41

score 120 · Answer 4 · edited Oct 09 '19 at 04:10


A Python list:

>>> a = ['a', 'b', 'c', 'd', 'b']

To get unique items, just transform it into a set (which you can transform back again into a list if required):

>>> b = set(a)
>>> print(b)
{'b', 'c', 'd', 'a'}

edited Oct 09 '19 at 04:10 by Boris Verkhovskiy · answered Oct 15 '12 at 14:11 by Nicolas Barbey

64

Nice, so `a = list(set(a))` gets the unique items. – Brian Burns Aug 24 '13 at 23:08
11

Brian, `set(a)` is sufficient to "get the unique items". You only need to construct another list if you specifically need a list for some reason. – jbg Jun 30 '14 at 11:02
7

Note the result will be unordered. – Timothy Aaron Jan 23 '17 at 22:13

score 110 · Answer 5 · edited Oct 09 '19 at 04:06


What type is your output variable?

Python sets are what you need. Declare output like this:

output = set()  # initialize an empty set

and you're ready to go adding elements with output.add(elem) and be sure they're unique.

Warning: sets DO NOT preserve the original order of the list.

edited Oct 09 '19 at 04:06 by Boris Verkhovskiy · answered Oct 15 '12 at 14:07 by Samuele Mattiuzzo

pylang · Answer 6 · 2022-04-08T00:23:09.167

Options to remove duplicates may include the following generic data structures:

set: unordered, unique elements
ordered set: ordered, unique elements

Here is a summary on quickly getting either one in Python.

Given

from collections import OrderedDict


seq = [u"nowplaying", u"PBS", u"PBS", u"nowplaying", u"job", u"debate", u"thenandnow"]

Code

Option 1 - A set (unordered):

list(set(seq))
# ['thenandnow', 'PBS', 'debate', 'job', 'nowplaying']

Python doesn't have ordered sets, but here are some ways to mimic one.

Option 2 - an OrderedDict (insertion ordered):

list(OrderedDict.fromkeys(seq))
# ['nowplaying', 'PBS', 'job', 'debate', 'thenandnow']

Option 3 - a dict (insertion ordered in CPython 3.6, guaranteed by the language from Python 3.7). See more details in this post:

list(dict.fromkeys(seq))
# ['nowplaying', 'PBS', 'job', 'debate', 'thenandnow']

Note: listed elements must be hashable. See details on the latter example in this blog post. Furthermore, see R. Hettinger's post on the same technique; the order preserving dict is extended from one of his early implementations. See also more on total ordering.

@Henry Henrinson I appreciate your voicing your reason in down-voting this answer. However, your opinion and claim " The Python 3.6 solution is not order preserving" are not qualified with references. To be clear, in Python 3.6, dictionaries [preserve *insertion order*](https://stackoverflow.com/a/39980744/4531270) in the CPython implementation. It is a language feature in Python 3.7+. Moreover, see an on-going [blog post](https://www.peterbe.com/plog/fastest-way-to-uniquify-a-list-in-python-3.6) on that approach claimed at that time to be the fastest ordered option in Python 3.6. — pylang, May 01 '19 at 17:49

daino3 · Answer 7 · 2018-02-07T17:08:06.473

56

Maintaining order:

# oneliners
# slow -> . --- 14.417 seconds ---
[x for i, x in enumerate(array) if x not in array[0:i]]

# fast -> . --- 0.0378 seconds ---
[x for i, x in enumerate(array) if array.index(x) == i]

# multiple lines
# fastest -> --- 0.012 seconds ---
uniq = []
[uniq.append(x) for x in array if x not in uniq]
uniq

Order doesn't matter:

# fastest-est -> --- 0.0035 seconds ---
list(set(array))

edited Feb 07 '18 at 17:08 · answered Jul 03 '17 at 20:36 by daino3

1

This has terrible performance (O(n^2)) for large lists and is neither simpler nor easier to read than `list(set(array))`. The only advantage is the preservation of order, which was not asked for. – jlh Sep 27 '17 at 09:38
2

This is great for simple scripts where you want to keep order and don't care about speed. – JeffCharter Jan 23 '18 at 18:04
@JeffCharter- added one that maintains order and is mucho faster :) – daino3 Feb 07 '18 at 17:08
This "thing"/operation `[uniq.append(x) for x in array if x not in uniq]` how is it called in python? – MMT Feb 21 '18 at 15:08
1

@MMT - [list comprehension](http://www.secnetix.de/olli/Python/list_comprehensions.hawk) – daino3 Feb 21 '18 at 15:47
3

I really appreciate you taking the time to break out the timestamps too – Lotus Dec 08 '18 at 17:53

score 22 · Answer 8 · answered Feb 02 '18 at 10:51

Getting unique elements from List

mylist = [1,2,3,4,5,6,6,7,7,8,8,9,9,10]

Using sets - a set is an unordered collection of unique items

mylist=list(set(mylist))

In [0]: mylist
Out[0]: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

Using Simple Logic

newList=[]
for i in mylist:
    if i not in newList:
        newList.append(i)

In [0]: newList
Out[0]: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

Using the pop method: pop removes the last item, or the item at a given index, and returns it.

k=0
while k < len(mylist):
    if mylist[k] in mylist[k+1:]:
        mylist.pop(k)  # pop takes an index, not a value
    else:
        k=k+1

In [0]: mylist
Out[0]: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

Using Numpy

import numpy as np
mylist = np.unique(mylist).tolist()  # np.unique returns a sorted NumPy array

In [0]: mylist
Out[0]: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]


this answer deserves more updoots: for unhashable types where you want to check _value_ uniqueness rather than _identity_ uniqueness the simple logic is correct - meaning it's more correct in general. — ocket8888, Aug 15 '18 at 16:30

score 17 · Answer 9 · edited Jan 05 '16 at 20:01


A set is an unordered collection of unique elements. A list can be passed to the set constructor, so passing a list with duplicate elements gives a set of the unique elements, which can then be converted back to a list. I can say nothing about performance and memory overhead, but I hope it's not so important with small lists.

list(set(my_not_unique_list))

Simple and short.

edited Jan 05 '16 at 20:01 by Tomasz Jakub Rup · answered Feb 06 '15 at 12:16 by MultiTeemer

1

Could you add some explanation on your code for OP? – Paco Feb 06 '15 at 12:54
I tried your answer, this is a good answer but with an explanation it will turns into a great answer :) – Papouche Guinslyzinho Feb 24 '15 at 11:35
1

set - unordered collection of unique elements. List of elements can be passed to set's constructor. So, pass list with duplicate elements, we get set with unique elements and transform it back to list then get list with unique elements. I can say nothing about performance and memory overhead, but I hope, it's not so important with small lists. – MultiTeemer Feb 28 '15 at 01:36

score 17 · Answer 10 · edited May 23 '17 at 11:47

If you are using numpy in your code (which might be a good choice for larger amounts of data), check out numpy.unique:

>>> import numpy as np
>>> wordsList = [u'nowplaying', u'PBS', u'PBS', u'nowplaying', u'job', u'debate', u'thenandnow']
>>> np.unique(wordsList)
array([u'PBS', u'debate', u'job', u'nowplaying', u'thenandnow'], 
      dtype='<U10')

(http://docs.scipy.org/doc/numpy/reference/generated/numpy.unique.html)

As you can see, numpy supports not only numeric data; string arrays are also possible. Of course, the result is a numpy array, but that doesn't matter much, because it still behaves like a sequence:

>>> for word in np.unique(wordsList):
...     print word
... 
PBS
debate
job
nowplaying
thenandnow

If you really want to have a vanilla python list back, you can always call list().

However, the result is automatically sorted, as you can see from the above code fragments. Check out numpy unique without sort if retaining list order is required.

OdraEncoded · Answer 11 · 2015-08-11T16:45:23.003

Same-order unique list using only a list comprehension.

> my_list = [1, 2, 1, 3, 2, 4, 3, 5, 4, 3, 2, 3, 1]
> unique_list = [
>    e
>    for i, e in enumerate(my_list)
>    if my_list.index(e) == i
> ]
> unique_list
[1, 2, 3, 4, 5]

enumerate gives the index i and element e as a tuple.

my_list.index returns the first index of e. If the first index isn't i then the current iteration's e is not the first e in the list.

Edit

I should note that this isn't a good way to do it, performance-wise. This is just a way that achieves it using only a list comprehension.

score 7 · Answer 12 · answered Jun 16 '16 at 11:57


As a bonus, Counter is a simple way to get both the unique values and the count for each value:

from collections import Counter
l = [u'nowplaying', u'PBS', u'PBS', u'nowplaying', u'job', u'debate', u'thenandnow']
c = Counter(l)
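
A short sketch of reading the result back, assuming Python 3.7+ (where `Counter`, being a `dict` subclass, keeps keys in first-seen order). The names `unique_values`, `pbs_count` and `singletons` are only illustrative:

```python
from collections import Counter

words = ['nowplaying', 'PBS', 'PBS', 'nowplaying', 'job', 'debate', 'thenandnow']
counts = Counter(words)

unique_values = list(counts)   # the keys, in order of first occurrence
pbs_count = counts['PBS']      # how many times 'PBS' appeared
# Values that appeared exactly once in the input.
singletons = [word for word, n in counts.items() if n == 1]
print(unique_values, pbs_count, singletons)
```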

answered Jun 16 '16 at 11:57 by Berislav Lopac

score 7 · Answer 13 · answered Mar 30 '18 at 05:08


By using basic property of Python Dictionary:

inp=[u'nowplaying', u'PBS', u'PBS', u'nowplaying', u'job', u'debate', u'thenandnow']
d={i for i in inp}
print d

Output will be:

set([u'nowplaying', u'job', u'debate', u'PBS', u'thenandnow'])

answered Mar 30 '18 at 05:08 by SKY

And, from dinamic values? – e-info128 May 21 '18 at 17:33
@e-info128 Quite similarly, put those in a `set`. – tripleee Dec 04 '18 at 11:08
4

This is a `set`, not a `dict`. – tripleee Dec 04 '18 at 11:09

score 6 · Answer 14 · answered Oct 15 '12 at 14:12

First thing, the example you gave is not a valid list.

example_list = [u'nowplaying',u'PBS', u'PBS', u'nowplaying', u'job', u'debate',u'thenandnow']

Suppose the above is the example list. Then you can use the following recipe, as given in the itertools documentation examples, which returns the unique values while preserving the order, as you seem to require. The iterable here is example_list.

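try:
    from itertools import filterfalse as ifilterfalse  # Python 3 name
except ImportError:
    from itertools import ifilterfalse  # Python 2 name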

def unique_everseen(iterable, key=None):
    "List unique elements, preserving order. Remember all elements ever seen."
    # unique_everseen('AAAABBBCCDAABBB') --> A B C D
    # unique_everseen('ABBCcAD', str.lower) --> A B C D
    seen = set()
    seen_add = seen.add
    if key is None:
        for element in ifilterfalse(seen.__contains__, iterable):
            seen_add(element)
            yield element
    else:
        for element in iterable:
            k = key(element)
            if k not in seen:
                seen_add(k)
                yield element
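
On Python 3, `itertools.ifilterfalse` is named `filterfalse`. A condensed sketch of the same idea with a usage example follows; the single-loop body is a shortening for illustration, not the exact text of the itertools recipe:

```python
def unique_everseen(iterable, key=None):
    """Yield unique elements in first-seen order; `key` maps items to a comparison value."""
    seen = set()
    for element in iterable:
        k = element if key is None else key(element)
        if k not in seen:
            seen.add(k)
            yield element

example_list = ['nowplaying', 'PBS', 'PBS', 'nowplaying', 'job', 'debate', 'thenandnow']
plain = list(unique_everseen(example_list))

# With key=str.lower, 'PBS' and 'pbs' count as the same value; the first one seen wins.
case_insensitive = list(unique_everseen(['PBS', 'pbs', 'Job', 'job'], key=str.lower))
print(plain, case_insensitive)
```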

What is the purpose of `ifilterfalse(seen.__contains__, iterable)`? Is there a benefit versus `for element not in seen:...` ? — jpp, May 22 '18 at 08:45

score 6 · Answer 15 · edited Jan 25 '16 at 10:11


def get_distinct(original_list):
    distinct_list = []
    for each in original_list:
        if each not in distinct_list:
            distinct_list.append(each)
    return distinct_list

edited Jan 25 '16 at 10:11 by Tunaki · answered Jan 25 '16 at 10:09 by oliver smith

6

please add some explanation - this is only code. If you look at the other answers, they always go with code _and_ explanation. – Alexander Jan 25 '16 at 10:18
@Alexander [not always useless, but typically is](http://meta.stackoverflow.com/questions/262695/new-answer-deletion-option-code-only-answer/311766#311766). – ivan_pozdeev Jan 25 '16 at 17:40

score 6 · Answer 16 · answered Apr 04 '18 at 19:20

set can help you filter out the elements from the list that are duplicates. It will work well for str, int or tuple elements, but if your list contains dict or other list elements, then you will end up with TypeError exceptions.

Here is a general order-preserving solution to handle some (not all) non-hashable types:

def unique_elements(iterable):
    seen = set()
    result = []
    for element in iterable:
        hashed = element
        if isinstance(element, dict):
            hashed = tuple(sorted(element.items()))  # items() works on Python 2 and 3
        elif isinstance(element, list):
            hashed = tuple(element)
        if hashed not in seen:
            result.append(element)
            seen.add(hashed)
    return result

skovorodkin · Answer 17 · 2016-10-01T22:01:18.003

5

If you want to get unique elements from a list and keep their original order, then you may employ OrderedDict data structure from Python's standard library:

from collections import OrderedDict

def keep_unique(elements):
    return list(OrderedDict.fromkeys(elements).keys())

elements = [2, 1, 4, 2, 1, 1, 5, 3, 1, 1]
required_output = [2, 1, 4, 5, 3]

assert keep_unique(elements) == required_output

In fact, if you are using Python ≥ 3.6, you can use plain dict for that:

def keep_unique(elements):
    return list(dict.fromkeys(elements).keys())

This became possible after the introduction of the "compact" representation of dicts. Check it out here. Note, though, that in Python 3.6 this is "considered an implementation detail and should not be relied upon".

edited Oct 01 '16 at 22:01 · answered Oct 01 '16 at 20:59 by skovorodkin

I'd like to really drive home that last point. Having a dict internally keep the order of insertion is an implementation detail of CPython, and there is no guarantee that it will work on another Python engine (like PyPy or IronPython), and it can change in future versions without breaking backward compatibility. So please don't depend on that behaviour in any production-ready code. – Berislav Lopac Mar 18 '17 at 11:08
@BerislavLopac, I absolutely agree. It may change and it does not follow "Readability counts" rule. But it's still convenient for one-off scripts and REPL sessions. – skovorodkin Mar 23 '17 at 07:22
1

In fact -- to correct my own point -- starting with Python 3.7 the ordered dicts are actually a language feature instead of an implementation quirk. See the answer at https://stackoverflow.com/a/39980744/122033 – Berislav Lopac Dec 04 '18 at 15:28

score 4 · Answer 18 · answered Jun 16 '14 at 08:25


def setlist(lst=[]):
   return list(set(lst))

answered Jun 16 '14 at 08:25 by Ricky Wilson

13

Try not to use [] as a default parameter. It is the same instance that is used every time so modifications affect the next time the function is called. Not so much of an issue here but it's still unnecessary. – Holloway Jun 16 '14 at 08:32
3

@Trengot Exactly. It should be lst=None, and add a line lst = [] if lst is None – xis Jul 24 '14 at 20:29
2

@xis: or simply `lst or []` – mike3996 Dec 17 '14 at 12:16
1

Please note, the result will be unordered. – Aminah Nuraini Oct 26 '15 at 08:46

Andriy Ivaneyko · Answer 19 · 2016-08-05T09:57:35.763

To get unique values from your list, use the code below:

trends = [u'nowplaying', u'PBS', u'PBS', u'nowplaying', u'job', u'debate', u'thenandnow']
output = set(trends)
output = list(output)

IMPORTANT: The approach above won't work if any of the items in the list is not hashable, which is the case for mutable types such as list or dict.

trends = [{'super':u'nowplaying'}, u'PBS', u'PBS', u'nowplaying', u'job', u'debate', u'thenandnow']
output = set(trends)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  TypeError: unhashable type: 'dict'

That means you have to be sure that the trends list always contains only hashable items; otherwise you have to use more sophisticated code:

from copy import deepcopy

trends = [{'super':u'nowplaying'}, [u'PBS',], [u'PBS',], u'nowplaying', u'job', u'debate', u'thenandnow', {'super':u'nowplaying'}]
try:
    output = list(set(trends))
except TypeError:
    output = []  # set() failed, so output was never assigned
    trends_copy = deepcopy(trends)
    while trends_copy:
        trend = trends_copy.pop()  # items are taken from the end of the list
        if trends_copy.count(trend) == 0:
            output.append(trend)  # so the result comes out in reverse order
print(output)
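
An alternative sketch that keeps the order of first occurrence and copes with unhashable items by falling back to an equality scan. The function name `unique_any` is hypothetical, and the fallback path is O(n^2) in the worst case:

```python
def unique_any(items):
    """Order-preserving de-duplication for hashable and unhashable items."""
    seen_hashable = set()   # fast path for hashable items
    seen_other = []         # slow path: unhashable items are compared with ==
    result = []
    for item in items:
        try:
            if item in seen_hashable:
                continue
            seen_hashable.add(item)
        except TypeError:   # raised when the item cannot be hashed (dict, list, ...)
            if item in seen_other:
                continue
            seen_other.append(item)
        result.append(item)
    return result

trends = [{'super': 'nowplaying'}, ['PBS'], ['PBS'], 'nowplaying', 'job',
          'debate', 'thenandnow', {'super': 'nowplaying'}]
output = unique_any(trends)
print(output)
```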

score 4 · Answer 20 · answered Mar 02 '17 at 11:28

I am surprised that nobody so far has given a direct order-preserving answer:

def unique(sequence):
    """Generate unique items from sequence in the order of first occurrence."""
    seen = set()
    for value in sequence:
        if value in seen:
            continue

        seen.add(value)

        yield value

It will generate the values so it works with more than just lists, e.g. unique(range(10)). To get a list, just call list(unique(sequence)), like this:

>>> list(unique([u'nowplaying', u'PBS', u'PBS', u'nowplaying', u'job', u'debate', u'thenandnow']))
[u'nowplaying', u'PBS', u'job', u'debate', u'thenandnow']

It requires each item to be hashable and not just comparable, but most things in Python are, and it is O(n) rather than O(n^2), so it will work just fine with a long list.
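
Because the function is a generator, it can also be consumed lazily. A small sketch follows; the function is restated in condensed form so the block runs on its own, and the use of `itertools.cycle` and `itertools.islice` is an illustration rather than part of the answer:

```python
from itertools import cycle, islice

def unique(sequence):
    """Generate unique items from sequence in the order of first occurrence."""
    seen = set()
    for value in sequence:
        if value not in seen:
            seen.add(value)
            yield value

# cycle() never ends, but islice() stops pulling once three unique items have arrived.
endless = cycle(['PBS', 'job', 'PBS', 'debate'])
first_three = list(islice(unique(endless), 3))
print(first_three)
```

Asking for more unique items than the endless stream contains would never return, so the bound passed to `islice` matters here.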

Alaf Azam · Answer 21 · 2017-06-17T07:45:50.643

In addition to the previous answers, which say you can convert your list to set, you can do that in this way too

mylist = [u'nowplaying', u'PBS', u'PBS', u'nowplaying', u'job', u'debate', u'thenadnow']
mylist = [i for i in set(mylist)]

output will be

[u'nowplaying', u'job', u'debate', u'PBS', u'thenadnow']

though order will not be preserved.

Another simpler answer could be (without using sets)

>>> t = [v for i,v in enumerate(mylist) if mylist.index(v) == i]
[u'nowplaying', u'PBS', u'job', u'debate', u'thenadnow']

Sanjar Stone · Answer 22 · 2014-02-04T00:41:47.593

2

At the beginning of your code, just declare your output list as empty: output=[]
Instead of your code, you may use this: trends=list(set(trends))

edited Feb 04 '14 at 00:41 · answered Feb 04 '14 at 00:31 by Sanjar Stone

Please note, the result will be unordered. – Aminah Nuraini Oct 26 '15 at 08:46

Tung Nguyen · Answer 23 · 2019-09-28T14:45:11.810

2

A set is an unordered collection of unique elements, so you can use a set as below to get a unique list:

unique_list = list(set([u'nowplaying', u'PBS', u'PBS', u'nowplaying', u'job', u'debate', u'thenandnow']))

edited Sep 28 '19 at 14:45

answered May 31 '16 at 15:17


Although this code may answer the question, providing additional context regarding _why_ and/or _how_ it answers the question would significantly improve its long-term value. Please [edit] your answer to add some explanation. – Toby Speight May 31 '16 at 15:42
"Set is a collection of ordered and unique elements." Unfortunately not; sets are not ordered as noted in the answers above. – kuzzooroo Aug 27 '19 at 04:22

score 2 · Answer 24 · answered Feb 06 '17 at 20:52

You can use sets. To be clear about the difference between a list and a set: sets are unordered collections of unique elements, while lists are ordered collections of elements. So,

unicode_list = [u'nowplaying', u'PBS', u'PBS', u'nowplaying', u'job', u'debate', u'thenandnow']
list_unique = list(set(unicode_list))
print list_unique
[u'nowplaying', u'job', u'debate', u'PBS', u'thenandnow']

But do not use list or set as variable names, because that shadows the built-ins and causes an error. For example, using list instead of unicode_list in the code above:

list = [u'nowplaying', u'PBS', u'PBS', u'nowplaying', u'job', u'debate', u'thenandnow']
list_unique = list(set(list))
print list_unique

    list_unique=list(set(list))
TypeError: 'list' object is not callable

score 2 · Answer 25 · answered Mar 02 '18 at 19:21


Use a set to de-duplicate a list and return the result as a list:

def get_unique_list(lst):
    # Note: implicitly returns None when lst is not a list.
    if isinstance(lst, list):
        return list(set(lst))


Goran B.


This approach will change the order of the elements in the list which might be undesirable behavior – gomons May 30 '18 at 08:02

score 1 · Answer 26 · edited May 26 '17 at 12:37

My solution to check contents for uniqueness but preserve the original order:

def getUnique(self):
    notunique = self.readLines()
    unique = []
    for line in notunique: # Loop over content
        append = True # Set to False if line matches an already collected line
        for existing in unique:
            if line == existing: # Line already collected: do not append it
                append = False
                break # Duplicate found, no need to check the rest
        if append: unique.append(line) # Line not found? Add to list
    return unique

Edit: This can probably be made more efficient by using dictionary keys to check for existence instead of looping over all collected lines for each new line. I wouldn't use my solution for large inputs.
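The dictionary-key idea from the edit could look roughly like this. It is a sketch that assumes Python 3.7+ (where dicts keep insertion order) and hashable lines such as strings; get_unique_lines is a hypothetical standalone helper rather than the method above:

```python
def get_unique_lines(lines):
    """Return the lines without duplicates, in order of first occurrence."""
    # dict keys are unique and, on Python 3.7+, keep their insertion order
    return list(dict.fromkeys(lines))


lines = ["alpha\n", "beta\n", "alpha\n", "gamma\n", "beta\n"]
print(get_unique_lines(lines))
```

Each key lookup is O(1) on average, so the whole pass is O(n) instead of the O(n^2) nested loop.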

c1646091 · Answer 27 · 2016-07-16T08:04:27.657

I know this is an old question, but here's my unique solution: class inheritance!:

class UniqueList(list):
    def appendunique(self,item):
        if item not in self:
            self.append(item)
            return True
        return False

Then, if you want to uniquely append items to a list, you just call appendunique on a UniqueList. Because it inherits from list, it basically acts like a list, so you can use methods like index() etc. And because it returns True or False, you can find out whether appending succeeded (unique item) or failed (already in the list).

To get a unique list of items from a list, use a for loop appending items to a UniqueList (then copy over to the list).

Example usage code:

unique = UniqueList()

for each in [1,2,2,3,3,4]:
    if unique.appendunique(each):
        print 'Uniquely appended ' + str(each)
    else:
        print 'Already contains ' + str(each)

Prints:

Uniquely appended 1
Uniquely appended 2
Already contains 2
Uniquely appended 3
Already contains 3
Uniquely appended 4

Copying to list:

unique = UniqueList()

for each in [1,2,2,3,3,4]:
    unique.appendunique(each)

newlist = unique[:]
print newlist

Prints:

[1, 2, 3, 4]
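Note that "item not in self" scans the whole list on every call, so filling a UniqueList is O(n^2). If the items are hashable, a variant could keep a set alongside the list for the membership test. This is only a sketch for Python 3; FastUniqueList and its _seen attribute are names I made up, and other list methods (insert, extend, remove, ...) are not kept in sync here:

```python
class FastUniqueList(list):
    """List that tracks its members in a set for O(1) duplicate checks."""

    def __init__(self, iterable=()):
        super().__init__()
        self._seen = set()  # hypothetical helper attribute mirroring the list contents
        for item in iterable:
            self.appendunique(item)

    def appendunique(self, item):
        """Append item if it is new; return True on success, False otherwise."""
        if item in self._seen:
            return False
        self._seen.add(item)
        self.append(item)
        return True


unique = FastUniqueList([1, 2, 2, 3, 3, 4])
print(unique[:])
```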

score -1 · Answer 28 · edited Jun 19 '15 at 12:30


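For long arrays (assuming var is a numeric NumPy array without NaN values):

import numpy as np

s = np.empty(len(var))
s[:] = np.nan  # NaN marks positions that hold no first occurrence
for x in set(var):
    x_positions = np.where(var == x)
    s[x_positions[0][0]] = x  # keep x only at the index where it first occurs

# Drop the NaN placeholders; despite the name, the values stay in order of first occurrence
sorted_var = s[~np.isnan(s)]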
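A shorter way to get the same order-preserving result might be np.unique with return_index=True, which also returns the index of each value's first occurrence. A sketch, assuming a 1-D input; unique_in_order is a name chosen here for illustration:

```python
import numpy as np


def unique_in_order(values):
    """Return the unique values of a 1-D array in order of first occurrence."""
    arr = np.asarray(values)
    # np.unique sorts the values; return_index gives each value's first position
    _, first_idx = np.unique(arr, return_index=True)
    # Sorting those positions restores the original order of appearance
    return arr[np.sort(first_idx)]


print(unique_in_order([3, 1, 3, 2, 1]))
```

This avoids the Python-level loop over set(var), and it does not rely on NaN as a placeholder.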

edited Jun 19 '15 at 12:30

S.L. Barth is on codidact.com


answered Jun 19 '15 at 12:20

user5028205


score -3 · Answer 29 · answered Aug 17 '14 at 00:44


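Try this function. It's similar to your code, but it removes duplicates in place, so the range it scans shrinks as it goes.

def unique(a):
    # Modifies a in place; each value survives at its last occurrence.
    k = 0
    while k < len(a):
        if a[k] in a[k+1:]:
            a.pop(k)  # a later copy exists, so drop this one
        else:
            k = k + 1
    return a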


CreamStat


score -3 · Answer 30 · answered Jan 06 '15 at 17:07

Use the following function:

def uniquefy_list(input_list):
    """
    Take a list as input and return a list containing only the unique
    elements from the input list, in order of first occurrence.
    """
    output_list = []
    for elm123 in input_list:
        in_both_lists = 0
        for elm234 in output_list:
            if elm123 == elm234:
                in_both_lists = 1
                break
        if in_both_lists == 0:
            output_list.append(elm123)

    return output_list
