The implicit index-matching of pandas for operations between different DataFrame/Series is great, and most of the time it just works.
However, I've stumbled on an example that does not work as expected:
import pandas as pd # 0.21.0
import numpy as np # 1.13.3
x = pd.Series([True, False, True, True], index = range(4))
y = pd.Series([False, True, True, False], index = [2,4,3,5])
# logical AND: this works, symmetric as it should be
pd.concat([x, y, x & y, y & x], keys = ['x', 'y', 'x&y', 'y&x'], axis = 1)
#        x      y    x&y    y&x
# 0   True    NaN  False  False
# 1  False    NaN  False  False
# 2   True  False  False  False
# 3   True   True   True   True
# 4    NaN   True  False  False
# 5    NaN  False  False  False
# but logical OR is not symmetric anymore (same for XOR: x^y vs. y^x)
pd.concat([x, y, x | y, y | x], keys = ['x', 'y', 'x|y', 'y|x'], axis = 1)
#        x      y    x|y    y|x
# 0   True    NaN   True  False  <-- INCONSISTENT!
# 1  False    NaN  False  False
# 2   True  False   True   True
# 3   True   True   True   True
# 4    NaN   True  False   True  <-- INCONSISTENT!
# 5    NaN  False  False  False
Researching a bit, I found two points that seem relevant:
- bool(np.nan) equals True, cf. https://stackoverflow.com/a/15686477/2965879
- | is resolved to np.bitwise_or, rather than np.logical_or, cf. https://stackoverflow.com/a/37132854/2965879
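Both points can be checked in isolation with plain NumPy. This is only a minimal sketch; the variable names are made up for illustration, and it says nothing about what pandas does internally.

```python
import numpy as np

# NaN is a non-zero float, so Python treats it as truthy.
nan_is_truthy = bool(np.nan)

# np.logical_or applies that truthiness element-wise.
logical = np.logical_or(np.array([np.nan, 0.0]), np.array([False, False]))

# On boolean arrays, the | operator gives the same result as np.bitwise_or.
a = np.array([True, False, True])
b = np.array([False, False, True])
same_as_bitwise = np.array_equal(a | b, np.bitwise_or(a, b))

# np.bitwise_or is not defined for floats, so NaN cannot go through it as-is.
try:
    np.bitwise_or(np.nan, False)
    bitwise_accepts_nan = True
except TypeError:
    bitwise_accepts_nan = False

print(nan_is_truthy, logical, same_as_bitwise, bitwise_accepts_nan)
```

Since np.bitwise_or rejects floats, some conversion of the NaNs has to take place somewhere for x | y to return a result at all.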
But ultimately, the kicker seems to be that pandas casts NaN to False at some point. Judging from the output above, this appears to happen after np.bitwise_or is called, whereas I think it should happen before.
In particular, using np.logical_or does not help because it misses the index alignment that pandas does, and also, I don't want np.nan or False to equal True. (In other words, the answer https://stackoverflow.com/a/37132854/2965879 does not help.)
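To illustrate the alignment problem, here is a small sketch that applies np.logical_or to the underlying arrays. It uses .values on purpose, because how Series inputs to NumPy ufuncs are handled may differ between pandas versions; on raw arrays the labels are certainly gone.

```python
import numpy as np
import pandas as pd

x = pd.Series([True, False, True, True], index=range(4))
y = pd.Series([False, True, True, False], index=[2, 4, 3, 5])

# Raw arrays carry no labels, so elements are paired by position:
# x at label 0 is combined with y at label 2, and so on.
positional = np.logical_or(x.values, y.values)
print(positional)
```

The result has four entries instead of six, and none of them corresponds to a label-wise comparison.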
I think that if this wonderful syntactic sugar is provided, it should be as consistent as possible*, and so | should be symmetric. It's really hard to debug (as happened to me) when something that's always symmetric suddenly isn't anymore.
So finally, the question: Is there any feasible workaround (e.g. overloading something) to salvage x|y == y|x, and ideally in such a way that (loosely speaking) nan | True == True == True | nan and nan | False == False == False | nan?
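For concreteness, one direction I could imagine is to do the alignment by hand and fill the gaps with False before applying the operator. The helper name aligned_or below is hypothetical (it is not part of pandas), and this is only a sketch under the assumption that a missing label should count as False.

```python
import pandas as pd

x = pd.Series([True, False, True, True], index=range(4))
y = pd.Series([False, True, True, False], index=[2, 4, 3, 5])

def aligned_or(a, b):
    """Hypothetical helper: OR two boolean Series, treating missing labels as False."""
    # Build the combined index first, then fill gaps with False
    # so that no NaN ever reaches the | operator.
    idx = a.index.union(b.index)
    a_filled = a.reindex(idx, fill_value=False).astype(bool)
    b_filled = b.reindex(idx, fill_value=False).astype(bool)
    return a_filled | b_filled

xy = aligned_or(x, y)
yx = aligned_or(y, x)
print(pd.concat([xy, yx], keys=['x|y', 'y|x'], axis=1))
```

With this, both orders give True at labels 0 and 4 and False at labels 1 and 5, which is the behaviour described above. It does lose the nice operator syntax, though, which is why I am asking whether something can be overloaded instead.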
*even if De Morgan's law falls apart regardless - ~(x&y) cannot fully match ~y|~x because the NaNs only come in at the index alignment (and so are not affected by a previous negation).