I have two frames. df1 containing thousands of rows of four columns of string data. The string data can be almost anything but what matters is which columns match. Here I just chose character names for illustration. Within one row of df1, the four columns when compared, are a code for 1 of 14 fairly complex equations. df2 contains the code and pre-calculated results for the 14 equations. Instead of calculating each of the thousands of values on the fly, I want to compare the four columns in df1 and populate a Result column using the pre-calculated results from df2.
df1 looks like this:
     Col1    Col2    Col3    Col4
0    'Sam'   'Sam'   'Sam'   'Sam'  #<----this is 'AAAA'
1    'Sam'   'Sam'   'Eric'  'Eric' #<----this is 'AABB' 
2    'Sam'   'Eric'  'Sam'   'Sam'  #<----this is 'ABAA'
..   ....                          
n    'Sam'   'Eric'  'Piggy' 'Jack' #<----this is 'ABCD'
df2 looks like this:
      Code    Result
0    'AAAA'   1.0040
1    'ABAB'   7.7777
2    'AABA'   5.5400
3    'ABAA'   0.0400
4    'BAAA'   1.0250
5    'ABAB'   2.0000
6    'ABAC'   6.5300
7    'ABCA'   5.0090
8    'BAAC'   1.0008
9    'BACA'   1.0000
10   'AABB'   56.000
11   'AABC'   0.2500
12   'BCAA'   0.5000
13   'ABCD'   1.7777
Afterwards df1 should look like this:
     Col1    Col2    Col3    Col4    Result
0    'Sam'   'Sam'   'Sam'   'Sam'   1.0040
1    'Sam'   'Sam'   'Eric'  'Eric'  56.000
2    'Sam'   'Eric'  'Sam'   'Sam'   0.0400
..   ....                            .....
n    'Sam'   'Eric'  'Piggy' 'Jack'  1.7777
Because I am unaware of a way to do this all at once, I thought the quickest way to handle this would be to apply a mask to df1 14 times and update the masked data based on an np.array created by a mask from df2. So the first update ('AAAA') would look like this:
#create mask for 'AAAA' in df1
mask1 = (df1['Col1'] == df2['Col2']) & (df1['Col1'] == df2['Col3']) & (df1['Col1'] == df2['Col4'])
#create mask for 'AAAA' in df2
mask2 = df2['Code'] == 'AAAA'
#create a temporary array to update df1
temp_arr = np.tile(df2.loc[mask2, 'Result'].to_numpy(), np.sum(mask1))
#add the array to the column
df1.loc[mask1, 'Result'] = temp_arr
If there's a more efficient way to do this comparison, please let me know. Thanks.
As soon as I can answer, I'll post my solution using @Hwei Geok Ng 's suggestion but unless someone has something more efficient, I'll leave that suggestion as the answer.
Edit: I apologize. I had to edit the output table. I made a mistake in editing. The table originally had letters instead of the data that was supposed to be there. It is correct now.
 
    