I have a 3D array with shape (1000, 12, 30) and a list of 2D arrays of shape (12, 30). I want to check whether these 2D arrays exist in the 3D array. Is there a simple way to do this in Python? I tried the in keyword, but it doesn't work.
 
    
- 
                    The solution here should apply to your problem https://stackoverflow.com/questions/7100242/python-numpy-first-occurrence-of-subarray#20689091. Marking duplicate – Xero Smith May 03 '18 at 03:02
- 
                    Those questions are not the same. – Teodorico Levoff May 03 '18 at 03:03
- 
                    The solutions apply to this case. Adjust the rolling window accordingly – Xero Smith May 03 '18 at 03:05
- 
                    It doesn't apply. Questions that are a duplicate should be marked as a duplicate, those are not the same questions. I don't understand why you would mark this as a duplicate. – Teodorico Levoff May 03 '18 at 03:06
- 
                    I have retracted the flag though. – Xero Smith May 03 '18 at 03:06
3 Answers
There is a way to do this in NumPy, using np.all:
import numpy as np
a = np.random.rand(3, 1, 2)
b = a[1][0]
np.all(np.all(a == b, 1), 1)
Out[612]: array([False,  True, False])
Solution from bnaecker, which reduces over both trailing axes in a single call:
np.all(a == b, axis=(1, 2))
If you only want to check whether it exists or not:
np.any(np.all(a == b, axis=(1, 2)))
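As a minimal sketch of how this scales to the shapes in the question, the same reduction can be wrapped in a small function and applied to each array in the list. The names `contains` and `contains_close` are hypothetical helpers, not NumPy functions, and the data is random stand-in data. The second helper assumes floating-point input where a tolerance-based comparison with `np.isclose` is wanted instead of exact equality.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.integers(0, 100, size=(1000, 12, 30))  # stand-in for the 3D array

# One candidate copied from data, one that cannot occur (all values are -1).
candidates = [data[5].copy(), np.full((12, 30), -1)]


def contains(data, block):
    """Hypothetical helper: True if block equals data[i] exactly for some i."""
    # data == block broadcasts block along axis 0, giving shape (1000, 12, 30);
    # reducing over axes 1 and 2 leaves one boolean per 2D slice.
    return bool(np.any(np.all(data == block, axis=(1, 2))))


def contains_close(data, block):
    """Hypothetical helper: same check, using np.isclose's default tolerances."""
    return bool(np.any(np.all(np.isclose(data, block), axis=(1, 2))))


print([contains(data, c) for c in candidates])

# Floating-point variant: a tiny perturbation still counts as a match.
data_f = data.astype(float)
print(contains_close(data_f, data_f[5] + 1e-12))
```

The result is a single boolean per candidate, so the whole (12, 30) block has to match one slice of the 3D array for the answer to be True.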
 
    
- 
                    @Wen I see! thanks for this. I'm still not sure how this would work if the depth is 30 like what I mentioned in the question? – Teodorico Levoff May 03 '18 at 03:11
- 
                    @TeodoricoLevoff Check out NumPy's [broadcasting rules](https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html). `b` in this case will be broadcast (replicated) along the first dimension to match `a`. Then the `axis` arguments to `np.all` reduce that along the last two dimensions, leaving a boolean array of shape `(1000,)` for the array in your question, with `True` at each index `i` where every element of `a[i]` equals the corresponding element of `b`. – bnaecker May 03 '18 at 03:13
- 
                    @TeodoricoLevoff Also note that you might need to use `np.allclose()` rather than `np.all()` if you're dealing with floating point numbers. – bnaecker May 03 '18 at 03:14
- 
                    @bnaecker I understand. But I want to return True only if the complete (12, 30) array exists in the (1000, 12, 30) array. I think the solution mentioned above checks each single value in the 30 lists and outputs a boolean for each? – Teodorico Levoff May 03 '18 at 03:15
Here is a fast method (previously used by @DanielF as well as @jaime and others, no doubt) that uses a trick to benefit from short-circuiting: view-cast template-sized blocks to single elements of dtype void. When comparing two such blocks numpy stops after the first difference, yielding a huge speed advantage.
>>> def in_(data, template):
...     dv = data.reshape(data.shape[0], -1).view(f'V{data.dtype.itemsize*np.prod(data.shape[1:])}').ravel()
...     tv = template.ravel().view(f'V{template.dtype.itemsize*template.size}').reshape(())
...     return (dv==tv).any()
Example:
>>> a = np.random.randint(0, 100, (1000, 12, 30))
>>> check = a[np.random.randint(0, 1000, (10,))]
>>> check += np.random.random(check.shape) < 0.001    
>>>
>>> [in_(a, c) for c in check]
[True, True, True, False, False, True, True, True, True, False]
# compare to other method
>>> (a==check[:, None]).all((-1,-2)).any(-1)
array([ True,  True,  True, False, False,  True,  True,  True,  True,
       False])
This gives the same result as the "direct" NumPy approach but is almost 20x faster on this example:
>>> from timeit import timeit
>>> kwds = dict(globals=globals(), number=100)
>>> 
>>> timeit("(a==check[:, None]).all((-1,-2)).any(-1)", **kwds)
0.4793281531892717
>>> timeit("[in_(a, c) for c in check]", **kwds)
0.026218891143798828
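To make the view-cast step easier to follow, here is a minimal sketch on a tiny array that shows what the void view looks like. The helper name `as_void_rows` is hypothetical, and the call to `np.ascontiguousarray` is an added assumption so that the view is valid for non-contiguous input. Because the comparison is done on raw bytes, both arrays need the same dtype, and for floats the result can differ from `==` (for example `0.0` and `-0.0` have different bytes).

```python
import numpy as np


def as_void_rows(arr):
    """Hypothetical helper: view each arr[i] as one opaque void element."""
    # Flatten every block to one row, then reinterpret the row's bytes
    # as a single element of dtype 'V<nbytes>'.
    flat = np.ascontiguousarray(arr).reshape(arr.shape[0], -1)
    nbytes = flat.dtype.itemsize * flat.shape[1]
    return flat.view(f'V{nbytes}').ravel()


data = np.arange(24, dtype=np.int64).reshape(4, 2, 3)
template = data[2].copy()

rows = as_void_rows(data)          # one void element per 2D block
tv = as_void_rows(template[None])  # the template as a single void element

matches = rows == tv               # one comparison per block
print(rows.dtype, rows.shape)
print(matches)
```

Each block of 6 int64 values becomes one 48-byte element, so the equality test yields one boolean per block rather than one per number.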
 
    
- 
                    I was hoping someone who was better at actual coding would eventually improve my old `vview` code. Once you have the void view, couldn't you just use `np.in1d` though? – Daniel F May 03 '18 at 06:20
- 
                    @DanielF You are right, that should be even faster. Could you give me a pointer to your post so I can properly credit you? – Paul Panzer May 03 '18 at 13:07
- 
                    @DanielF Strange, I tried with `in1d` or rather the new `isin` and it is 10x slower. Not sure what's going on here. – Paul Panzer May 03 '18 at 13:20
- 
                    I've given answers with it a few times: [here](https://stackoverflow.com/questions/49397704/find-index-given-multiple-values-of-array-with-numpy/49400557#49400557) and [here](https://stackoverflow.com/questions/48988038/find-boolean-mask-by-pattern/49002944#49002944) most recently. But the original idea came from @jaime [here](https://stackoverflow.com/a/16973510/4427777) – Daniel F May 04 '18 at 06:10
Numpy
Given
a = np.arange(12).reshape(3, 2, 2)
lst = [
    np.arange(4).reshape(2, 2),
    np.arange(4, 8).reshape(2, 2)
]
print(a, *lst, sep='\n{}\n'.format('-' * 20))
[[[ 0  1]
  [ 2  3]]
 [[ 4  5]
  [ 6  7]]
 [[ 8  9]
  [10 11]]]
--------------------
[[0 1]
 [2 3]]
--------------------
[[4 5]
 [6 7]]
Notice that lst is a list of arrays, as in OP's question. I'll turn it into a 3D array b below.
Use broadcasting. For the broadcasting rules to pair every sub-array of b with every sub-array of a, I want a to have shape (1, 3, 2, 2) and b to have shape (2, 1, 2, 2).
b = np.array(lst)
x, *y = b.shape
c = np.equal(
    a.reshape(1, *a.shape),
    b.reshape(x, 1, *y)
)
I'll use all to produce a (2, 3) array of truth values, with rows indexing b and columns indexing a, and np.where to find out which of the a and b sub-arrays are actually equal.
i, j = np.where(c.all((-2, -1)))
This is just a verification that we achieved what we were after. For each pair of i and j values, the sub-arrays b[i] and a[j] should be the same.
for bi, aj in zip(i, j):
    # i indexes b (the list entries), j indexes a
    print(a[aj], b[bi], sep='\n\n')
    print('------')
[[0 1]
 [2 3]]
[[0 1]
 [2 3]]
------
[[4 5]
 [6 7]]
[[4 5]
 [6 7]]
------
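A more compact way to write the same broadcast, sketched here under the assumption that only a yes/no answer per list entry is needed, is to insert the length-1 axes with `None` indexing and reduce the match table with `any`. The third list entry (all zeros) is added for illustration so that one entry is absent from `a`; the names `match` and `found` are illustrative.

```python
import numpy as np

a = np.arange(12).reshape(3, 2, 2)
lst = [
    np.arange(4).reshape(2, 2),
    np.arange(4, 8).reshape(2, 2),
    np.zeros((2, 2), dtype=int),  # extra entry that does not occur in a
]
b = np.stack(lst)  # shape (3, 2, 2)

# (1, 3, 2, 2) against (3, 1, 2, 2) broadcasts to (3, 3, 2, 2).
eq = a[None, :, :, :] == b[:, None, :, :]

# match[i, j] is True when lst[i] equals a[j] element for element.
match = eq.all(axis=(-2, -1))

# One boolean per list entry: does it occur anywhere in a?
found = match.any(axis=1)
print(found)
```

The intermediate array holds one boolean for every (list entry, slice of a, row, column) combination, so for long lists it may be worth looping over the list instead.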
in
However, to complete OP's thought on using in, convert the arrays to nested lists first, so that membership is tested by value:
a_ = a.tolist()
list(filter(lambda x: x.tolist() in a_, lst))
[array([[0, 1],
        [2, 3]]), array([[4, 5],
        [6, 7]])]
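For context on why in does not work on the arrays directly, the following sketch shows the two failure modes and a plain-Python alternative. On an ndarray, in is evaluated elementwise, so it can report True when only a single value coincides; on a list of arrays it raises ValueError because the truth value of an elementwise comparison is ambiguous. The helper name `block_in` is hypothetical.

```python
import numpy as np

a = np.arange(12).reshape(3, 2, 2)
target = np.arange(4, 8).reshape(2, 2)     # equal to a[1]
partial = np.array([[0, 99], [99, 99]])    # shares only the value 0 with a[0]

# On an ndarray, `in` behaves like (a == x).any(), so one matching
# element is enough to give True even though the block is absent.
misleading = partial in a

# On a list of arrays, `in` calls bool() on an elementwise comparison.
try:
    target in list(a)
    raised = False
except ValueError:
    raised = True


def block_in(arr3d, block):
    """Hypothetical helper: True if block equals some arr3d[i] as a whole."""
    return any(np.array_equal(block, sub) for sub in arr3d)


print(misleading, raised)
print(block_in(a, target), block_in(a, partial))
```

`any` stops at the first matching slice, so this loop is simple and short-circuits, although it runs at Python speed over the first axis.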
 
    