How to find all occurrences of an element in a list

Question

index() will give the first occurrence of an item in a list. Is there a neat trick which returns all indices in a list for an element?

score 853 · Accepted Answer · edited Mar 14 '22 at 14:16

You can use a list comprehension with enumerate:

indices = [i for i, x in enumerate(my_list) if x == "whatever"]

The iterator enumerate(my_list) yields pairs (index, item) for each item in the list. Using i, x as the loop variable target unpacks each pair into the index i and the list item x. We keep only the x that match our criterion and collect the indices i of those elements.
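As a minimal sketch, the same comprehension can be wrapped in a small helper. The name find_all is only an illustration and does not come from the answer:

```python
def find_all(items, target):
    """Return the indices of every element equal to target."""
    return [i for i, x in enumerate(items) if x == target]

my_list = ["a", "whatever", "b", "whatever"]
print(find_all(my_list, "whatever"))  # [1, 3]
print(find_all(my_list, "missing"))   # [] (no error when nothing matches)
```

Unlike index(), this returns an empty list rather than raising ValueError when the element is absent.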

edited Mar 14 '22 at 14:16

Tomerikoo

answered Jun 09 '11 at 14:13

Sven Marnach

score 171 · Answer 2 · answered Jun 09 '11 at 14:47

While not a solution for lists directly, numpy really shines for this sort of thing:

import numpy as np
values = np.array([1,2,3,1,2,4,5,6,3,2,1])
searchval = 3
ii = np.where(values == searchval)[0]

returns:

ii ==>array([2, 8])

This can be significantly faster for lists (arrays) with a large number of elements vs some of the other solutions.

JoshAdel

Here, `values` could be a NumPy array or a Python list. – Hari Dec 22 '21 at 09:15

@Hari I get different results from `np.where([7, 8, 9, 8] == 8)[0]` and `np.where(np.array([7, 8, 9, 8]) == 8)[0]`; only the latter works as intended. – Attila the Fun Mar 28 '22 at 20:01

Indeed, @AttilatheFun. I am not able to refer to the piece of code that led me to think that numpy where works with list also. Casting as a numpy array is the correct and safe thing to do before using numpy where. – Hari Mar 29 '22 at 08:55

Paulo Almeida · Answer 3 · 2017-11-10T20:41:47.210

41

A solution using list.index:

def indices(lst, element):
    result = []
    offset = -1
    while True:
        try:
            offset = lst.index(element, offset+1)
        except ValueError:
            return result
        result.append(offset)

For large lists, it's much faster than the list comprehension with enumerate. It is also much slower than the numpy solution if you already have the array; otherwise the cost of converting the list outweighs the speed gain (tested on integer lists with 100, 1000 and 10000 elements).

NOTE: A word of caution based on Chris_Rands' comment: this solution is faster than the list comprehension only if the results are sufficiently sparse. If the list has many instances of the element being searched for (more than ~15% of the list, in a test with a list of 1000 integers), the list comprehension is faster.
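The sparsity effect described here can be checked with a rough timeit sketch like the one below. The helper names, the list size and the densities are assumptions chosen for illustration, and the numbers will differ between machines and Python versions:

```python
import random
import timeit

def indices_by_index(lst, element):
    # repeated list.index calls, each resuming just after the previous hit
    result = []
    offset = -1
    while True:
        try:
            offset = lst.index(element, offset + 1)
        except ValueError:
            return result
        result.append(offset)

def indices_by_enumerate(lst, element):
    return [i for i, x in enumerate(lst) if x == element]

def compare(density, size=1000, repeats=200):
    """Time both approaches on a list where roughly `density` of the items match."""
    rng = random.Random(0)  # fixed seed so runs are comparable
    lst = [1 if rng.random() < density else 0 for _ in range(size)]
    assert indices_by_index(lst, 1) == indices_by_enumerate(lst, 1)
    t_index = timeit.timeit(lambda: indices_by_index(lst, 1), number=repeats)
    t_enum = timeit.timeit(lambda: indices_by_enumerate(lst, 1), number=repeats)
    return t_index, t_enum

for density in (0.01, 0.15, 0.5):
    t_index, t_enum = compare(density)
    print(f"density={density:.2f}  list.index: {t_index:.4f}s  enumerate: {t_enum:.4f}s")
```

No expected timings are shown on purpose; the point is only to see how the gap changes as matches become denser.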

edited Nov 10 '17 at 20:41

answered Sep 07 '13 at 02:29

You say this is faster than a list comp, can you show your timings that demonstrate this? – Chris_Rands Nov 09 '17 at 13:09

This was a long time ago, I probably used `timeit.timeit` with randomly generated lists. That's an important point though, and I suppose that may be why you ask. At the time it didn't occur to me, but the speed gains are only true if the results are sufficiently sparse. I just tested with a list full of the element to search for, and it's much slower than the list comprehension. – Paulo Almeida Nov 09 '17 at 14:58

score 26 · Answer 4 · answered Jun 09 '11 at 14:14

How about:

In [1]: l=[1,2,3,4,3,2,5,6,7]

In [2]: [i for i,val in enumerate(l) if val==3]
Out[2]: [2, 4]

NPE

pylang · Answer 5 · 2018-05-23T14:58:36.160

18

more_itertools.locate finds indices for all items that satisfy a condition.

from more_itertools import locate


list(locate([0, 1, 1, 0, 1, 0, 0]))
# [1, 2, 4]

list(locate(['a', 'b', 'c', 'b'], lambda x: x == 'b'))
# [1, 3]

more_itertools is a third-party library; install it with `pip install more_itertools`.
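If installing a third-party package is not an option, the two calls shown above can be approximated with the standard library alone. This is only a sketch: locate_sketch is a hypothetical name, and it does not attempt to reproduce any other options the library function offers.

```python
def locate_sketch(iterable, pred=bool):
    """Yield the index of each item for which pred(item) is true."""
    return (i for i, item in enumerate(iterable) if pred(item))

print(list(locate_sketch([0, 1, 1, 0, 1, 0, 0])))                      # [1, 2, 4]
print(list(locate_sketch(['a', 'b', 'c', 'b'], lambda x: x == 'b')))   # [1, 3]
```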

edited May 23 '18 at 14:58

answered Feb 09 '18 at 01:42

score 12 · Answer 6 · answered Jun 09 '11 at 14:15

occurrences = lambda s, lst: (i for i,e in enumerate(lst) if e == s)
list(occurrences(1, [1,2,3,1])) # = [0, 3]

phihag

Trenton McKinney · Answer 7 · 2022-11-17T19:58:47.107

- There’s an answer using np.where to find the indices of a single value, which is not faster than a list comprehension if the time to convert a list to an array is included.
- The overhead of importing numpy and converting a list to a numpy.array probably makes using numpy a less efficient option for most circumstances. A careful timing analysis would be necessary.
  - In cases where multiple functions/operations will need to be performed on the list, converting the list to an array and then using numpy functions will likely be a faster option.
- This solution uses np.where and np.unique to find the indices of all unique elements in a list.
  - Using np.where on an array (including the time to convert the list to an array) is slightly slower than a list comprehension on a list, for finding all indices of all unique elements.
  - This has been tested on a 2M element list with 4 unique values; the size of the list/array and the number of unique elements will have an impact.
- Other solutions using numpy on an array can be found in "Get a list of all indices of repeated elements in a numpy array".
- Tested in [python 3.10.4, numpy 1.23.1] and [python 3.11.0, numpy 1.23.4].

import numpy as np
import random  # to create test list

# create sample list
random.seed(365)
l = [random.choice(['s1', 's2', 's3', 's4']) for _ in range(20)]

# convert the list to an array for use with these numpy methods
a = np.array(l)

# create a dict of each unique entry and the associated indices
idx = {v: np.where(a == v)[0].tolist() for v in np.unique(a)}

# print(idx)
{'s1': [7, 9, 10, 11, 17],
 's2': [1, 3, 6, 8, 14, 18, 19],
 's3': [0, 2, 13, 16],
 's4': [4, 5, 12, 15]}

`%timeit` on a 2M element list with 4 unique `str` elements

# create 2M element list
random.seed(365)
l = [random.choice(['s1', 's2', 's3', 's4']) for _ in range(2000000)]

Functions

def test1():
    # np.where: convert list to array and find indices of a single element
    a = np.array(l)
    return np.where(a == 's1')
    

def test2():
    # list-comprehension: on list l and find indices of a single element
    return [i for i, x in enumerate(l) if x == "s1"]


def test3():
    # filter: on list l and find indices of a single element
    return list(filter(lambda i: l[i]=="s1", range(len(l))))


def test4():
    # use np.where and np.unique to find indices of all unique elements: convert list to array
    a = np.array(l)
    return {v: np.where(a == v)[0].tolist() for v in np.unique(a)}


def test5():
    # list comprehension inside dict comprehension: on list l and find indices of all unique elements
    return {req_word: [idx for idx, word in enumerate(l) if word == req_word] for req_word in set(l)}

Function Call

%timeit test1()
%timeit test2()
%timeit test3()
%timeit test4()
%timeit test5()

Result `python 3.10.4`

214 ms ± 19.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
85.1 ms ± 1.48 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
146 ms ± 1.65 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
365 ms ± 11.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
360 ms ± 5.82 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Result `python 3.11.0`

209 ms ± 15.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
70.4 ms ± 1.86 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
132 ms ± 4.65 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
371 ms ± 20.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
314 ms ± 15.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

score 8 · Answer 8 · answered Oct 18 '18 at 08:33

Or use range (Python 3):

l = [i for i in range(len(lst)) if lst[i] == 'something...']

For Python 2:

l = [i for i in xrange(len(lst)) if lst[i] == 'something...']

And then (in both cases):

print(l)

The output is as expected.

U13-Forward

score 5 · Answer 9 · answered Feb 10 '19 at 13:08

Getting all the occurrences and the position of one or more (identical) items in a list

With enumerate(alist) you can collect the first element of each pair (n), which is the index into the list, whenever the element x equals the one you are looking for.

>>> alist = ['foo', 'spam', 'egg', 'foo']
>>> foo_indexes = [n for n,x in enumerate(alist) if x=='foo']
>>> foo_indexes
[0, 3]
>>>

Let's make our function indexlist

This function takes the item and the list as arguments and returns the positions of the item in the list, as we saw before.

def indexlist(item2find, list_or_string):
  "Returns all indexes of an item in a list or a string"
  return [n for n,item in enumerate(list_or_string) if item==item2find]

print(indexlist("1", "010101010"))

Output

[1, 3, 5, 7]

Simple

for n, i in enumerate([1, 2, 3, 4, 1]):
    if i == 1:
        print(n)

Output:

0
4

score 4 · Answer 10 · answered Apr 03 '19 at 15:58

Using filter() in Python 2 (in Python 3, filter returns an iterator, so wrap the call in list()):

>>> q = ['Yeehaw', 'Yeehaw', 'Googol', 'B9', 'Googol', 'NSM', 'B9', 'NSM', 'Dont Ask', 'Googol']
>>> filter(lambda i: q[i]=="Googol", range(len(q)))
[2, 4, 9]

Niranjan Nagaraju

score 4 · Answer 11 · answered Jun 09 '11 at 14:26

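One more solution (sorry if it duplicates another) for all occurrences, written for Python 2 (on Python 3, use range instead of xrange and wrap the map call in list()):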

values = [1,2,3,1,2,4,5,6,3,2,1]
map(lambda val: (val, [i for i in xrange(len(values)) if values[i] == val]), values)

Artsiom Rudzenka

score 3 · Answer 12 · answered Dec 01 '17 at 18:53

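If you need to search for an element's positions only between certain indices, you can state them in the condition (use the boolean and here, not the bitwise &, which binds more tightly than the comparisons):

[i for i, x in enumerate([1, 2, 3, 2]) if x == 2 and 2 <= i <= 3]  # -> [3]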
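An alternative sketch slices the list first, so that elements outside the range are never compared. The function name indices_between is hypothetical, and it assumes non-negative start and stop with stop treated as inclusive:

```python
def indices_between(lst, value, start, stop):
    """Indices i with start <= i <= stop where lst[i] == value."""
    # the second argument to enumerate keeps the reported indices
    # aligned with positions in the original list
    return [i for i, x in enumerate(lst[start:stop + 1], start) if x == value]

print(indices_between([1, 2, 3, 2], 2, 2, 3))  # [3]
print(indices_between([1, 2, 3, 2], 2, 0, 3))  # [1, 3]
```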

Denis Rasulev

score 3 · Answer 13 · answered Dec 07 '17 at 16:31

You can create a defaultdict that counts how often each element occurs (note that this gives counts, not indices):

from collections import defaultdict

lst1 = [1, 2, 2, 3, 4, 1, 2, 7]
d1 = defaultdict(int)  # missing keys default to 0
for each in set(lst1):
    d1[each] = lst1.count(each)
print(d1)
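Since the question asks for positions rather than counts, a closely related sketch uses defaultdict(list) and collects every index in a single pass. The function name index_map is an assumption for illustration, and the elements must be hashable:

```python
from collections import defaultdict

def index_map(items):
    """Map each distinct element to the list of indices where it occurs."""
    positions = defaultdict(list)
    for i, item in enumerate(items):
        positions[item].append(i)
    return dict(positions)

lst1 = [1, 2, 2, 3, 4, 1, 2, 7]
print(index_map(lst1))
# {1: [0, 5], 2: [1, 2, 6], 3: [3], 4: [4], 7: [7]}
```

This walks the list once, instead of once per distinct element.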

Trenton McKinney · Answer 14 · 2022-06-15T04:29:55.883

Using a `for-loop`:

Answers with enumerate and a list comprehension are more pythonic, but not necessarily faster. However, this answer is aimed at students who may not be allowed to use some of those built-in functions.
- Create an empty list, indices.
- Create the loop with for i in range(len(x)):, which essentially iterates through a list of index locations [0, 1, 2, 3, ..., len(x)-1].
- In the loop, add any i, where x[i] is a match to value, to indices.
  - x[i] accesses the list by index.

def get_indices(x: list, value: int) -> list:
    indices = list()
    for i in range(len(x)):
        if x[i] == value:
            indices.append(i)
    return indices

n = [1, 2, 3, -50, -60, 0, 6, 9, -60, -60]
print(get_indices(n, -60))

>>> [4, 8, 9]

Both versions of get_indices here are implemented with type hints. In this case the list, n, is a bunch of ints, so we search for value, also defined as an int.

Using a `while-loop` and `.index`:

With .index, use try-except for error handling, because a ValueError will occur if value is not in the list.

def get_indices(x: list, value: int) -> list:
    indices = list()
    i = 0
    while True:
        try:
            # find an occurrence of value and update i to that index
            i = x.index(value, i)
            # add i to the list
            indices.append(i)
            # advance i by 1
            i += 1
        except ValueError as e:
            break
    return indices

print(get_indices(n, -60))
>>> [4, 8, 9]

Your self-defined `get_indices` is a bit faster (~15%) than a normal list comprehension. I am trying to figure out why. — Travis, Jan 31 '20 at 11:28

score 3 · Answer 15 · answered Mar 28 '21 at 06:44

A dynamic list-comprehension-based solution, in case we do not know in advance which element to look for:

lst = ['to', 'be', 'or', 'not', 'to', 'be']
{req_word: [idx for idx, word in enumerate(lst) if word == req_word] for req_word in set(lst)}

results in:

{'be': [1, 5], 'or': [2], 'to': [0, 4], 'not': [3]}

Other approaches along the same lines are possible, but with index() you can find only one index per call, although you can choose which occurrence you get by setting where the search starts.
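One reading of the remark about index() is that a specific occurrence can be reached by restarting the search after each hit. A minimal sketch under that assumption, with the hypothetical helper nth_index and n counted from 1:

```python
def nth_index(lst, value, n):
    """Return the index of the n-th occurrence (1-based, n >= 1) of value in lst."""
    pos = -1
    for _ in range(n):
        # list.index raises ValueError if there are fewer than n matches
        pos = lst.index(value, pos + 1)
    return pos

lst = ['to', 'be', 'or', 'not', 'to', 'be']
print(nth_index(lst, 'to', 2))  # 4
print(nth_index(lst, 'be', 1))  # 1
```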

score 1 · Answer 16 · answered Aug 03 '17 at 13:51

If you are using Python 2, you can achieve the same functionality with this:

f = lambda my_list, value:filter(lambda x: my_list[x] == value, range(len(my_list)))

Where my_list is the list you want to get the indexes of, and value is the value searched. Usage:

f(some_list, some_element)
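On Python 3 the same idea needs one change, because filter returns a lazy iterator there instead of a list. A sketch, using the hypothetical name find_indices:

```python
def find_indices(my_list, value):
    # list() materializes the iterator that filter() returns in Python 3
    return list(filter(lambda i: my_list[i] == value, range(len(my_list))))

print(find_indices([1, 2, 3, 1], 1))  # [0, 3]
```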

MusicalNinja · Answer 17 · 2022-05-15T17:41:31.933

Create a generator

Generators are fast and use a tiny memory footprint. They give you flexibility in how you use the result.

def indices(iter, val):
    """Generator: Returns all indices of val in iter
    Raises a ValueError if val does not occur in iter
    Passes on the AttributeError if iter does not have an index method (e.g. is a set)
    """
    i = -1
    NotFound = False
    while not NotFound:
        try:
            i = iter.index(val, i+1)
        except ValueError:
            NotFound = True
        else:
            yield i
    if i == -1:
        raise ValueError("No occurrences of {v} in {i}".format(v = val, i = iter))

The above code can be used to create a list of the indices: list(indices(input, value)); to use them as dictionary keys: dict.fromkeys(indices(input, value)); to sum them: sum(indices(input, value)); in a for loop: for index_ in indices(input, value):; and so on, without creating an interim list, tuple or similar.

In a for loop you will get your next index back when you call for it, without waiting for all the others to be calculated first. That means: if you break out of the loop for some reason you save the time needed to find indices you never needed.
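The lazy behaviour can be seen in a small sketch. Here iter_indices is a simplified stand-in for the generator above (it does not raise when there are no matches), and itertools.islice is used to stop after the first two hits:

```python
from itertools import islice

def iter_indices(seq, val):
    """Lazily yield each index of val in seq (simplified: no error on zero matches)."""
    i = -1
    while True:
        try:
            i = seq.index(val, i + 1)
        except ValueError:
            return
        yield i

data = [5, 1, 5, 5, 2, 5]
print(list(islice(iter_indices(data, 5), 2)))  # [0, 2] (the rest is never searched)
print(sum(iter_indices(data, 5)))              # 10
# dict.fromkeys is used because dict() itself expects key/value pairs
print(dict.fromkeys(iter_indices(data, 5)))    # {0: None, 2: None, 3: None, 5: None}
```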

How it works

1. Call .index on the input iter to find the next occurrence of val.
2. Use the second parameter of .index to start at the point after the last found occurrence.
3. Yield the index.
4. Repeat until .index raises a ValueError.

Alternative versions

I tried four different versions for flow control: two EAFP (using try - except) and two LBYL (with a logical test in the while statement):

1. "WhileTrueBreak": while True: ... except ValueError: break. Surprisingly, this was usually a touch slower than option 2 and (IMV) less readable.
2. "WhileErrFalse": Using a bool variable err to identify when a ValueError is raised. This is generally the fastest, and more readable than 1.
3. "RemainingSlice": Check whether val is in the remaining part of the input using slicing: while val in iter[i:]. Unsurprisingly, this does not scale well.
4. "LastOccurrence": Check first where the last occurrence is, and keep going while i < last.

The overall performance differences between 1, 2 and 4 are negligible, so it comes down to personal style and preference. Given that .index uses ValueError to let you know it didn't find anything, rather than e.g. returning None, an EAFP approach seems fitting to me.

Here are the 4 code variants and results from timeit (in milliseconds) for different lengths of input and sparsity of matches

@version("WhileTrueBreak", versions)
def indices2(iter, val):
    i = -1
    while True:
        try:
            i = iter.index(val, i+1)
        except ValueError:
            break
        else:
            yield i

@version("WhileErrFalse", versions)
def indices5(iter, val):
    i = -1
    err = False
    while not err:
        try:
            i = iter.index(val, i+1)
        except ValueError:
            err = True
        else:
            yield i

@version("RemainingSlice", versions)
def indices1(iter, val):
    i = 0
    while val in iter[i:]:
        i = iter.index(val, i)
        yield i
        i += 1

@version("LastOccurrence", versions)
def indices4(iter,val):
    i = 0
    last = len(iter) - tuple(reversed(iter)).index(val)
    while i < last:
        i = iter.index(val, i)
        yield i
        i += 1

Length: 100, Occurrences: 4.0%
{'WhileTrueBreak': 0.0074799987487494946, 'WhileErrFalse': 0.006440002471208572, 'RemainingSlice': 0.01221001148223877, 'LastOccurrence': 0.00801000278443098}
Length: 1000, Occurrences: 1.2%
{'WhileTrueBreak': 0.03101000329479575, 'WhileErrFalse': 0.0278000021353364, 'RemainingSlice': 0.08278000168502331, 'LastOccurrence': 0.03986000083386898}
Length: 10000, Occurrences: 2.05%
{'WhileTrueBreak': 0.18062000162899494, 'WhileErrFalse': 0.1810499932616949, 'RemainingSlice': 2.9145700042136014, 'LastOccurrence': 0.2049500006251037}
Length: 100000, Occurrences: 1.977%
{'WhileTrueBreak': 1.9361200043931603, 'WhileErrFalse': 1.7280600033700466, 'RemainingSlice': 254.4725100044161, 'LastOccurrence': 1.9101499929092824}
Length: 100000, Occurrences: 9.873%
{'WhileTrueBreak': 2.832529996521771, 'WhileErrFalse': 2.9984100023284554, 'RemainingSlice': 1132.4922299943864, 'LastOccurrence': 2.6660699979402125}
Length: 100000, Occurrences: 25.058%
{'WhileTrueBreak': 5.119729996658862, 'WhileErrFalse': 5.2082200068980455, 'RemainingSlice': 2443.0577100021765, 'LastOccurrence': 4.75954000139609}
Length: 100000, Occurrences: 49.698%
{'WhileTrueBreak': 9.372120001353323, 'WhileErrFalse': 8.447749994229525, 'RemainingSlice': 5042.717969999649, 'LastOccurrence': 8.050809998530895}
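The @version decorator and the versions registry used in the variants are not shown in the answer. The sketch below is one guess at how such a harness might look; the decorator body, the benchmark function, its parameters and the way the test data is generated are all assumptions, and only one variant is registered to keep it short:

```python
import random
import timeit

versions = {}  # assumed registry: name -> generator function

def version(name, registry):
    """Assumed decorator: store the decorated function in `registry` under `name`."""
    def decorator(func):
        registry[name] = func
        return func
    return decorator

@version("WhileTrueBreak", versions)
def indices2(seq, val):
    i = -1
    while True:
        try:
            i = seq.index(val, i + 1)
        except ValueError:
            break
        else:
            yield i

def benchmark(length, values, runs=100):
    """Return the match rate (%) and the mean time per run (ms) for each registered version."""
    rng = random.Random(0)  # fixed seed for repeatable data
    data = [rng.randrange(values) for _ in range(length)]
    rate = 100 * data.count(0) / length
    timings = {
        # f=func binds the current function; list() exhausts the generator
        name: 1000 * timeit.timeit(lambda f=func: list(f(data, 0)), number=runs) / runs
        for name, func in versions.items()
    }
    return rate, timings

rate, timings = benchmark(length=1000, values=50)
print(f"Length: 1000, Occurrences: {rate}%")
print(timings)
```

Registering the other three variants with the same decorator would add them to the printed dictionary.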

score -1 · Answer 18 · answered Jul 05 '20 at 09:47

Here is a time performance comparison between using np.where vs list_comprehension. Seems like np.where is faster on average.

import time
import numpy as np

# np.where
start_times = []
end_times = []
for i in range(10000):
    start = time.time()
    start_times.append(start)
    temp_list = np.array([1,2,3,3,5])
    ixs = np.where(temp_list==3)[0].tolist()
    end = time.time()
    end_times.append(end)
print("Took on average {} seconds".format(
    np.mean(end_times)-np.mean(start_times)))

Took on average 3.81469726562e-06 seconds

# list_comprehension
start_times = []
end_times = []
for i in range(10000):
    start = time.time()
    start_times.append(start)
    temp_list = np.array([1,2,3,3,5])
    ixs = [i for i in range(len(temp_list)) if temp_list[i]==3]
    end = time.time()
    end_times.append(end)
print("Took on average {} seconds".format(
    np.mean(end_times)-np.mean(start_times)))

Took on average 4.05311584473e-06 seconds
