Fast numpy roll

Question

I have a 2D NumPy array and I want to roll each row by an increasing amount. I am currently calling np.roll in a for loop, but since this runs thousands of times, my code is really slow. How can I make it faster?

My input looks like

array([[4,1],
       [0,2]])

and my output looks like

array([[4,1],
       [2,0]])

Here the zeroth row [4,1] was shifted by 0, and the first row [0,2] was shifted by 1. Similarly, a second row would be shifted by 2, and so on.

EDIT

temp = np.zeros([dd,dd])
for i in range(min(t + 1, dd)):
    temp[i,:] = np.roll(y[i,:], i, axis=0)
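
For reference, the loop above can be written as a self-contained baseline. This is only a sketch: it assumes every row is rolled (the snippet above stops at min(t + 1, dd) rows, and t is not defined in the question), it keeps the input dtype rather than filling a float array, and the function name roll_rows_loop is hypothetical.

```python
import numpy as np

def roll_rows_loop(y):
    """Roll row i of a 2D array to the right by i positions (baseline loop)."""
    out = np.empty_like(y)
    for i in range(y.shape[0]):
        # np.roll with a positive shift moves elements toward higher indices
        out[i, :] = np.roll(y[i, :], i)
    return out

y = np.array([[4, 1],
              [0, 2]])
print(roll_rows_loop(y))
```

A baseline like this is mainly useful for checking that a faster version produces the same result.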

Here's a possible solution: http://stackoverflow.com/questions/20360675/roll-rows-of-a-matrix-independently — Lev Levitsky, Feb 07 '17 at 22:27

Divakar · Accepted Answer · 2017-02-07T23:11:25.257

Here's one vectorized solution -

m,n = a.shape
# (n-1)*i + j is congruent to j - i (mod n), so out[i,j] = a[i, (j-i) % n]
idx = np.mod((n-1)*np.arange(m)[:,None] + np.arange(n), n)
out = a[np.arange(m)[:,None], idx]

Sample input, output -

In [256]: a
Out[256]: 
array([[73, 55, 79, 52, 15],
       [45, 11, 19, 93, 12],
       [78, 50, 30, 88, 53],
       [98, 13, 58, 34, 35]])

In [257]: out
Out[257]: 
array([[73, 55, 79, 52, 15],
       [12, 45, 11, 19, 93],
       [88, 53, 78, 50, 30],
       [58, 34, 35, 98, 13]])

Since you mentioned that you call this rolling routine many times, create the indexing array idx once and reuse it for every array of the same shape.
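
A minimal sketch of that reuse pattern, assuming all arrays share the same shape. The helper name make_roll_idx and the random test data are illustrative and not part of the answer; the loop-based np.roll result is used only as a correctness check.

```python
import numpy as np

def make_roll_idx(m, n):
    """Column indices such that a[rows, idx] rolls row i right by i."""
    # (n-1)*i + j is congruent to j - i modulo n
    return np.mod((n - 1) * np.arange(m)[:, None] + np.arange(n), n)

m, n = 4, 5
rows = np.arange(m)[:, None]       # row selector, also built once
idx = make_roll_idx(m, n)          # built once

rng = np.random.default_rng(0)
for _ in range(3):                 # reused for every array of shape (m, n)
    a = rng.integers(11, 99, size=(m, n))
    out = a[rows, idx]
    expected = np.stack([np.roll(a[i], i) for i in range(m)])
    assert np.array_equal(out, expected)
```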

Further improvement

For repeated use, you are better off creating the full linear (flattened) indices once and then using np.take to extract the rolled elements, like so -

full_idx = idx + n*np.arange(m)[:,None]
out = np.take(a,full_idx)

Let's see what the improvement looks like -

In [330]: a = np.random.randint(11,99,(600,600))

In [331]: m,n = a.shape
     ...: idx = np.mod((n-1)*np.arange(m)[:,None] + np.arange(n), n)
     ...: 

In [332]: full_idx = idx + n*np.arange(m)[:,None]

In [333]: %timeit a[np.arange(m)[:,None], idx] # Approach #1
1000 loops, best of 3: 1.42 ms per loop

In [334]: %timeit np.take(a,full_idx)          # Improvement
1000 loops, best of 3: 486 µs per loop

Around 3x improvement there!
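
The linear-index variant can be sketched as a small helper. The name make_full_idx is hypothetical, and any speedup will depend on array size, NumPy version and machine, so treat the timings above as indicative only.

```python
import numpy as np

def make_full_idx(m, n):
    """Flat (row-major) indices that roll row i of an (m, n) array right by i."""
    idx = np.mod((n - 1) * np.arange(m)[:, None] + np.arange(n), n)
    # Row i starts at flat position n*i, so offset each row's column indices
    return idx + n * np.arange(m)[:, None]

a = np.arange(12).reshape(3, 4)
full_idx = make_full_idx(*a.shape)   # build once per shape
out = np.take(a, full_idx)           # no axis given: indexes the flattened array
print(out)
```

Because np.take without an axis argument reads from the flattened array, the result takes the shape of full_idx, which is why the indices must already include the per-row offset.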

B. M. · Answer 2 · 2017-02-07T23:09:29.853

A tricky but fast solution:

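import numpy as np
from numpy import hstack
from numpy.random import randint

p=5
a=randint(0,p,(p,p))

aa=hstack((a,a))   # each row of a repeated twice
m,n=aa.strides     # byte strides of aa: (row step, column step)

# b: row i rolled left by i; c: row i rolled right by i (the order asked for)
b=np.lib.stride_tricks.as_strided(aa,a.shape,(m+n,n))
c=np.lib.stride_tricks.as_strided(aa.ravel()[p:],a.shape,(m-n,n))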
a, b and c, printed in turn:

[[2 1 4 2 4]
 [0 4 2 0 3]
 [1 3 3 4 4]
 [1 0 3 2 4]
 [3 3 2 1 3]]

[[2 1 4 2 4]
 [4 2 0 3 0]
 [3 4 4 1 3]
 [2 4 1 0 3]
 [3 3 3 2 1]]

[[2 1 4 2 4]
 [3 0 4 2 0]
 [4 4 1 3 3]
 [3 2 4 1 0]
 [3 2 1 3 3]]
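
The construction of c can be wrapped in a function, sketched here under a few assumptions: the input is square, as in the example above; the function name roll_rows_strided is hypothetical; and the view is marked read-only because as_strided views share memory and writing to them is easy to get wrong.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

def roll_rows_strided(a):
    """Roll row i of a square array right by i, as a read-only strided view."""
    p, q = a.shape
    assert p == q, "this sketch only handles square arrays"
    aa = np.hstack((a, a))            # each row repeated twice, C-contiguous
    m, n = aa.strides                 # bytes per row step, bytes per column step
    # Start p elements in; each row step then advances one element less than
    # a full row of aa, so row i of the view starts at column p - i of aa.
    return as_strided(aa.ravel()[p:], a.shape, (m - n, n), writeable=False)

a = np.arange(9).reshape(3, 3)
c = roll_rows_strided(a)
print(c)
```

The result is a view into the doubled array, so call .copy() on it if an independent, writable array is needed.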

edited Feb 07 '17 at 23:09

answered Feb 07 '17 at 22:47


Very nice use of strides! The rolling order is flipped though. – Divakar Feb 07 '17 at 22:54
Yes. I looked into the other direction, but it's not as simple... :( – B. M. Feb 07 '17 at 22:57
Stack the column flipped version? – Divakar Feb 07 '17 at 22:57
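
One possible reading of that last suggestion, sketched for a square array: flip the columns, apply the left-rolling strided view to the doubled flipped array, then flip the columns back. This is an interpretation of the comment rather than code from the thread, and the function name is hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

def roll_rows_right_via_flip(a):
    """Roll row i of a square array right by i using a left-rolling view."""
    p, q = a.shape
    assert p == q, "this sketch only handles square arrays"
    af = a[:, ::-1]                   # column-flipped input
    aa = np.hstack((af, af))          # stack the flipped version twice
    m, n = aa.strides
    # Row i of this view starts at column i of aa: a left roll by i of af
    left = as_strided(aa, a.shape, (m + n, n), writeable=False)
    return left[:, ::-1]              # flipping back turns it into a right roll

a = np.arange(9).reshape(3, 3)
print(roll_rows_right_via_flip(a))
```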
