I have two dataframes:
df1 - is a pivot table that has totals for both columns and rows, both with default names "All" df2 - a df I created manually by specifying values and using the same index and column names as are used in the pivot table above. This table does not have totals.
I need to multiply the first dataframe by the values in the second. I expect the totals return NaNs since totals don't exist in the second table.
When I perform multiplication, I get the following error:
ValueError: cannot join with no level specified and no overlapping names
When I try the same on dummy dataframes it works as expected:
import pandas as pd
import numpy as np
table1 = np.matrix([[10, 20, 30, 60],
                  [50, 60, 70, 180],
                  [90, 10, 10, 110],
                  [150, 90, 110, 350]])
df1 = pd.DataFrame(data = table1, index = ['One','Two','Three', 'All'], columns =['A', 'B','C', 'All'] )
print(df1)
table2 = np.matrix([[1.0, 2.0, 3.0],
                  [5.0, 6.0, 7.0],
                  [2.0, 1.0, 5.0]])
df2 = pd.DataFrame(data = table2, index = ['One','Two','Three'], columns =['A', 'B','C'] )
print(df2)
df3 = df1*df2
print(df3)
This gives me the following output:
         A   B    C  All
One     10  20   30   60
Two     50  60   70  180
Three   90  10   10  110
All    150  90  110  350
         A    B    C
One   1.00 2.00 3.00
Two   5.00 6.00 7.00
Three 2.00 1.00 5.00
           A  All      B      C
All      nan  nan    nan    nan
One    10.00  nan  40.00  90.00
Three 180.00  nan  10.00  50.00
Two   250.00  nan 360.00 490.00
So, visually, the only difference between df1 and df2 is the presence/absence of the column and row "All".
And I think the only difference between my dummy dataframes and the real ones is that the real df1 was created with pd.pivot_table method:
df1_real = pd.pivot_table(PY, values = ['Annual Pay'], index = ['PAR Rating'],
          columns = ['CR Range'], aggfunc = [np.sum], margins = True)
I do need to keep the total as I'm using them in other calculations.
I'm sure there is a workaround but I just really want to understand why the same code works on some dataframes of different sizes but not others. Or maybe an issue is something completely different.
Thank you for reading. I realize it's a very long post..
 
     
    