Is there a way to configure pandas to use numpy.nan as its NaN constant, or at least a single module-global constant pandas.nan = float('nan'), rather than a brand-new, unique float('nan') for each NaN it needs to represent?

(Alternatively, is there any justification for not using a single, globally unique NaN constant, the way that numpy does?)

kjo
  • `numpy` doesn't use a single globally-unique NaN constant. It has a constant, `numpy.nan`, which is predefined, but `len(set(np.array([np.nan]*10))) == 10`, and `f = np.array([np.nan]*2); print(f[0] is f[1])` gives `False`. – DSM Feb 13 '13 at 13:49
  • I'm curious as to how it would help if there were only one NaN. But you could always write a function that takes a float and returns `numpy.nan` if the input was any NaN, and just the input otherwise, and call it on everything. That's more or less what pandas/numpy would have to do internally to support your request. – Ben Feb 13 '13 at 14:00
  • I understand that it would be a bit of an advantage because `(a is b) or (a == b)` logic would say NaN is the same. As it is, this wish is a bit crazy. Even if Python used a singleton NaN, you are lost with different data types. Personally, I disagree with the fact that Python does not raise a ValueError if you try to hash a NaN, but overall, NaNs are just very special, and if you have them, you have to be very careful with many things... – seberg Feb 13 '13 at 14:34
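
The two suggestions in the comments above can be sketched together. This is only an illustration, not something pandas or numpy provides: the helper name `canonical_nan` is hypothetical, and it assumes the values are Python or numpy floats. It first reproduces DSM's observation that NaNs read out of an array are distinct objects, then applies Ben's idea of mapping every NaN to the single `numpy.nan` object.

```python
import math

import numpy as np


def canonical_nan(x):
    """Hypothetical helper: return np.nan for any float NaN, else x unchanged."""
    if isinstance(x, float) and math.isnan(x):
        return np.nan
    return x


# Iterating over an array creates a new scalar object per element, and
# NaN != NaN, so a set cannot merge them.
arr = np.array([np.nan] * 10)
distinct = len(set(arr))

# After normalising, every NaN is the same object; set membership checks
# identity before equality, so only one entry survives.
normalised = [canonical_nan(v) for v in arr]
collapsed = len(set(normalised))

print(distinct, collapsed)
```

As seberg notes, this only helps with identity-based checks such as `(a is b) or (a == b)`; storing the values back into a float array discards object identity again.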

0 Answers