I have a DataFrame with two columns, x and y, that represent coordinates in a Cartesian system. I want to split the points into groups with an even (or almost even) number of points in each. I was thinking about using pd.qcut(), but as far as I can tell it can only be applied to one column at a time.
For example, I would like to divide the whole set of points into 4 intervals in x and 4 intervals in y (numbers might not be equal) so that each group holds a roughly even number of points. I expect to see 16 groups in total (4x4).
I tried a very direct approach, which obviously didn't produce the right result (compare the counts 51 and 99, for example). Here is the code:
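import pandas as pd

# df is the DataFrame with the columns x and y described above
df['x_bin'] = pd.qcut(df.x, 4)   # quartile bins along x
df['y_bin'] = pd.qcut(df.y, 4)   # quartile bins along y, independent of x
grouped = df.groupby([df.x_bin, df.y_bin]).count()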
print(grouped)
The output:
x_bin                      y_bin                                 
(7.976999999999999, 7.984] (-219.17600000000002, -219.17]  51  51
                           (-219.17, -219.167]             60  60
                           (-219.167, -219.16]             64  64
                           (-219.16, -219.154]             99  99
(7.984, 7.986]             (-219.17600000000002, -219.17]  76  76
                           (-219.17, -219.167]             81  81
                           (-219.167, -219.16]             63  63
                           (-219.16, -219.154]             53  53
(7.986, 7.989]             (-219.17600000000002, -219.17]  78  78
                           (-219.17, -219.167]             77  77
                           (-219.167, -219.16]             68  68
                           (-219.16, -219.154]             51  51
(7.989, 7.993]             (-219.17600000000002, -219.17]  70  70
                           (-219.17, -219.167]             55  55
                           (-219.167, -219.16]             77  77
                           (-219.16, -219.154]             71  71
Am I wrong in thinking this can be done with pandas alone, or am I missing something else?
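
For context on why the counts come out uneven: two independent pd.qcut() calls only balance each column on its own (every x bin and every y bin holds about a quarter of the points), which does not balance the 16 combined cells unless x and y happen to be independent. One pandas-only idea, offered as a rough sketch rather than a definitive answer, is to nest the cuts: bin x first, then bin y separately inside each x bin. The data below is synthetic and only stands in for the real points (an assumption); the column names follow the code above.

```python
import numpy as np
import pandas as pd

# Synthetic, correlated points standing in for the real data (assumption).
rng = np.random.default_rng(0)
x = rng.normal(size=1600)
df = pd.DataFrame({"x": x, "y": 0.8 * x + rng.normal(size=1600)})

# Step 1: quartile bins along x, labeled 0..3.
df["x_bin"] = pd.qcut(df["x"], 4, labels=False)

# Step 2: quartile bins along y, computed separately inside each x bin,
# so the y edges differ from one x bin to the next.
df["y_bin"] = (
    df.groupby("x_bin")["y"]
    .transform(lambda s: pd.qcut(s, 4, labels=False))
    .astype(int)
)

# Number of points in each of the 16 (x_bin, y_bin) cells.
counts = df.groupby(["x_bin", "y_bin"]).size()
print(counts.unstack())
```

The trade-off is that the cells no longer form a regular grid: the x edges are shared, but each x bin gets its own y edges. If a single shared set of y edges is required, the counts generally cannot all be equal.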
 
    