Assigning to array changes dtype

Question

I have a multidimensional array of grayscale integer values that need to be normalized to the range 0-1. More precisely, each element of the array is a matrix representing one image, and each pixel in those matrices is an integer in the range 0-255.

Here is the normalization function:

def normalize(x, mmin=0.0, mmax=255.0):
    x = (x - mmin )/(mmax - mmin + 10**(-5))

    return x

RIGHT: When I apply the function in the main module like this:

trainingSet_Images = myUtils.normalize(trainingSet_Images)

The result is correctly an array of matrices with floating-point values.

WRONG: But when I apply the normalize() function like this:

for i in range(len(trainingSet_Images)):
   trainingSet_Images[i] = myUtils.normalize(trainingSet_Images[i])

every element of trainingSet_Images is a matrix of integers, all with value zero.

It seems that Python remembers the original type of the matrices, but why does the first way of doing the assignment work and the second way not?

Could you provide a [minimal reproducible example](https://stackoverflow.com/help/minimal-reproducible-example)? — Keldorn, Apr 16 '20 at 08:45
Not a `machine-learning` question - kindly do not spam irrelevant tags (removed) — desertnaut, Apr 16 '20 at 10:54

yatu · Accepted Answer · 2020-04-16T09:49:33.790

That is because, when you assign back into the array as in the second method, the result of the normalization (which is a float) gets cast down to the array's dtype, which is an integer type, so the fractional part is discarded.

This is mentioned in the "Assigning values to indexed arrays" section of the docs, which states:

Note that assignments may result in changes if assigning higher types to lower types (like floats to ints) or even exceptions (assigning complex to floats or ints)
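The float-to-int case from that note can be seen in isolation with a minimal sketch like the one below. The array name `ints` and the values are made up for illustration and are not from the question.

```python
import numpy as np

# Hypothetical example array; NumPy infers an integer dtype from the values.
ints = np.array([10, 20, 30])

# Assigning floats into an integer array does not change the array's dtype.
# Each value is cast to the integer dtype, dropping the fractional part.
ints[0] = 0.99999996
ints[1] = 2.7

print(ints.dtype)  # still an integer dtype
print(ints)        # first two elements are now 0 and 2
```

The array keeps its dtype for its whole lifetime; only rebinding the name to a new array (as in the first method) gives you a float array.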

Here's an example. Applying your normalize function without assigning the result back into the array returns floats:

import numpy as np

a = np.array([[255,255,255],[0,255,255]])

normalize(a)
array([[0.99999996, 0.99999996, 0.99999996],
       [0.        , 0.99999996, 0.99999996]])

Whereas in the second method:

normalize(a[1])
# array([0.        , 0.99999996, 0.99999996])

a[1] = normalize(a[1])

a
array([[255, 255, 255],
       [  0,   0,   0]])

The same would apply if you did:

a[:] = normalize(a)

a
array([[0, 0, 0],
       [0, 0, 0]])
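If you do want to normalize in a loop, one possible workaround (a sketch, not something stated above) is to convert the array to a float dtype first, so the assigned values no longer need to be cast down. The name `images` stands in for `trainingSet_Images`, and `normalize` is copied from the question.

```python
import numpy as np

def normalize(x, mmin=0.0, mmax=255.0):
    # Same formula as in the question.
    return (x - mmin) / (mmax - mmin + 10**(-5))

# Stand-in for trainingSet_Images (assumed to be an integer ndarray).
images = np.array([[255, 255, 255], [0, 255, 255]])

# astype returns a new array with a float dtype; rebind the name to it.
images = images.astype(np.float64)

# Element-wise assignment now stores floats, so nothing is truncated.
for i in range(len(images)):
    images[i] = normalize(images[i])

print(images.dtype)
print(images)
```

Since `normalize` already works on the whole array at once, the first method from the question remains the simpler option when no loop is needed.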
