Shuffling numpy arrays keeping the respective values

Question

I have two NumPy arrays, each with 1 million elements, and every value in the first array corresponds to the value at the same position in the second. I want to shuffle both arrays while keeping corresponding values paired. How do I do that?

score 0 · Accepted Answer · answered May 18 '21 at 02:34

Since both arrays have the same length, you can index them with the same random permutation using NumPy array indexing.

import numpy

def unison_shuffled_copies(a, b):
    assert len(a) == len(b)                  # can be dropped if a and b are known to have the same length
    p = numpy.random.permutation(len(a))     # generate the shuffled indices
    return a[p], b[p]                        # build shuffled copies of both arrays using the same permutation p

This is adapted, with some changes, from an existing answer.
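
As a usage sketch, the same idea can be written with NumPy's newer Generator API (np.random.default_rng). That choice is an assumption here rather than part of the answer above, and the function name shuffle_in_unison and the seed are hypothetical. The check at the end confirms that each pair survives the shuffle.

```python
import numpy as np

def shuffle_in_unison(a, b, rng):
    """Return shuffled copies of a and b that share one random permutation."""
    if len(a) != len(b):
        raise ValueError("a and b must have the same length")
    p = rng.permutation(len(a))  # one permutation, applied to both arrays
    return a[p], b[p]

rng = np.random.default_rng(0)   # seed chosen only to make the run repeatable
a = np.arange(1_000_000)
b = a * 2                        # b[i] is paired with a[i]

a_shuf, b_shuf = shuffle_in_unison(a, b, rng)

# If the pairing survived, every shuffled b is still twice its shuffled a.
pairs_intact = bool(np.array_equal(b_shuf, a_shuf * 2))
print(pairs_intact)
```

Passing the generator in explicitly keeps the function free of global state, which may be convenient when the shuffle has to be reproducible.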

score 0 · Answer 2 · answered May 18 '21 at 02:37

You can save NumPy's global random state with np.random.get_state() and restore it between shuffles with np.random.set_state(). Because both shuffles then start from the same state, they apply the same reordering, as long as both arrays have the same length.

import numpy as np

arr1 = np.arange(9).reshape((3, 3))
arr2 = np.arange(10,19).reshape((3, 3))

arr1, arr2

# array([[0, 1, 2],
#        [3, 4, 5],
#        [6, 7, 8]]),
# array([[10, 11, 12],
#        [13, 14, 15],
#        [16, 17, 18]])

# get state
state = np.random.get_state()
np.random.shuffle(arr1)
arr1

# array([[6, 7, 8],
#        [3, 4, 5],
#        [0, 1, 2]])

# reset state and shuffle other array
np.random.set_state(state)
np.random.shuffle(arr2)
arr2

# array([[16, 17, 18],
#        [13, 14, 15],
#        [10, 11, 12]])
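
The save-and-restore steps can be wrapped in a small helper. This is only a sketch: the name shuffle_with_same_state is hypothetical, it shuffles in place along the first axis, and it assumes both arrays have the same shape so that each shuffle consumes the random stream identically.

```python
import numpy as np

def shuffle_with_same_state(x, y):
    """Shuffle x and y in place, replaying the same global random state for both."""
    state = np.random.get_state()   # remember where the global generator is
    np.random.shuffle(x)            # shuffles the rows of x in place
    np.random.set_state(state)      # rewind to the remembered state
    np.random.shuffle(y)            # same draws, so the same row order

arr1 = np.arange(9).reshape((3, 3))
arr2 = np.arange(10, 19).reshape((3, 3))
shuffle_with_same_state(arr1, arr2)

# Every element of arr2 started out 10 larger than its partner in arr1.
rows_still_paired = bool(np.all(arr2 - arr1 == 10))
print(rows_still_paired)
```

Note that this modifies both inputs and touches the global random state, whereas the index-based approaches return shuffled copies.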

score 0 · Answer 3 · answered Jul 19 '22 at 23:19

I am a bit late, but a good solution would be to shuffle the indices and retrieve the shuffled values accordingly. For instance,

Say I have keys = np.array([1, 2, 3, 4, 5]) and values = np.array([1, 2, 3, 4, 5])

To shuffle both arrays while preserving the correspondence between them, create an index array "idcs"

idcs = np.arange(0, keys.shape[0])

Then just shuffle

np.random.shuffle(idcs)

And index both "keys" and "values" the same way

newKeys = keys[idcs]
newValues = values[idcs]
print(newKeys)    # e.g. [3 2 5 1 4] (the order varies between runs)
print(newValues)  # e.g. [3 2 5 1 4] (always the same order as newKeys)
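
Because keys and values hold identical numbers in this example, the pairing is hard to see in the printed output. A small sketch with distinguishable data, where values is assumed to be a 2-D array with one row per key (an illustration, not part of the answer), suggests that the same index array reorders both consistently:

```python
import numpy as np

keys = np.array([1, 2, 3, 4, 5])
# Hypothetical payload: row i holds (10 * key, 10 * key + 1) for keys[i].
values = np.column_stack((keys * 10, keys * 10 + 1))

idcs = np.arange(keys.shape[0])
np.random.shuffle(idcs)          # shuffle the indices, not the data

new_keys = keys[idcs]
new_values = values[idcs]        # indexes whole rows, so each row follows its key

# The first column of each row should still be ten times its key.
rows_match = bool(np.array_equal(new_values[:, 0], new_keys * 10))
print(rows_match)
```

The arrays only need to agree in length along the first axis; their remaining dimensions can differ.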
