An alternative approach is to use join, setting the index of the right-hand side DataFrame to the key columns ['username', 'column1']:
df1.join(df2.set_index(['username', 'column1']), on=['userid', 'column1'], how='left')
The output of this join combines the matched keys from the two differently named key columns, userid and username, into a single column named after the key column of df1 (userid), whereas the output of merge keeps the two as separate columns. To illustrate, consider the following example:
import numpy as np
import pandas as pd
df1 = pd.DataFrame({'ID': [1,2,3,4,5,6], 'pID' : [21,22,23,24,25,26], 'Values' : [435,33,45,np.nan,np.nan,12]})
## ID Values pID
## 0 1 435.0 21
## 1 2 33.0 22
## 2 3 45.0 23
## 3 4 NaN 24
## 4 5 NaN 25
## 5 6 12.0 26
df2 = pd.DataFrame({'ID' : [4,4,5], 'pid' : [24,25,25], 'Values' : [544, 545, 676]})
## ID Values pid
## 0 4 544 24
## 1 4 545 25
## 2 5 676 25
pd.merge(df1, df2, how='left', left_on=['ID', 'pID'], right_on=['ID', 'pid'])
## ID Values_x pID Values_y pid
## 0 1 435.0 21 NaN NaN
## 1 2 33.0 22 NaN NaN
## 2 3 45.0 23 NaN NaN
## 3 4 NaN 24 544.0 24.0
## 4 5 NaN 25 676.0 25.0
## 5 6 12.0 26 NaN NaN
df1.join(df2.set_index(['ID','pid']), how='left', on=['ID','pID'], lsuffix='_x', rsuffix='_y')
## ID Values_x pID Values_y
## 0 1 435.0 21 NaN
## 1 2 33.0 22 NaN
## 2 3 45.0 23 NaN
## 3 4 NaN 24 544.0
## 4 5 NaN 25 676.0
## 5 6 12.0 26 NaN
Here, we also need to specify lsuffix and rsuffix in join to distinguish the overlapping column Values in the output. As one can see, the output of merge contains the extra pid column from the right-hand side DataFrame, which IMHO is unnecessary given the context of the merge. Note also that the dtype of the pid column has changed to float64; this upcast results from the NaNs introduced for the unmatched rows.
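If the float64 upcast bothers you, one workaround (a sketch using pandas' nullable Int64 extension dtype, available since pandas 0.24) is to cast the column back after the merge, so unmatched rows hold <NA> rather than NaN:

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'ID': [1, 2, 3, 4, 5, 6],
                    'pID': [21, 22, 23, 24, 25, 26],
                    'Values': [435, 33, 45, np.nan, np.nan, 12]})
df2 = pd.DataFrame({'ID': [4, 4, 5], 'pid': [24, 25, 25],
                    'Values': [544, 545, 676]})

out = pd.merge(df1, df2, how='left',
               left_on=['ID', 'pID'], right_on=['ID', 'pid'])
# Cast back to a nullable integer dtype: the unmatched rows become <NA>
# and the column stays integral instead of being upcast to float64.
out['pid'] = out['pid'].astype('Int64')
```

The same astype call could of course be replaced by dropping the pid column entirely if it is not needed downstream.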
This cleaner output comes at a cost in performance, since the call to set_index on the right-hand side DataFrame incurs some overhead. However, a quick and dirty profile shows that the overhead is not too horrible, roughly 30%, which may be worth it:
sz = 1000000 # one million rows
df1 = pd.DataFrame({'ID': np.arange(sz), 'pID' : np.arange(0,2*sz,2), 'Values' : np.random.random(sz)})
df2 = pd.DataFrame({'ID': np.concatenate([np.arange(sz//2), np.arange(sz//2)]), 'pid' : np.arange(0,2*sz,2), 'Values' : np.random.random(sz)})
%timeit pd.merge(df1, df2, how='left', left_on=['ID', 'pID'], right_on=['ID', 'pid'])
## 818 ms ± 33.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit df1.join(df2.set_index(['ID','pid']), how='left', on=['ID','pID'], lsuffix='_x', rsuffix='_y')
## 1.04 s ± 18.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
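Note that if df2 is joined against more than once, the set_index cost can be paid a single time up front, after which each join skips that overhead. A sketch, reusing the timing setup above:

```python
import numpy as np
import pandas as pd

sz = 1_000_000  # one million rows
df1 = pd.DataFrame({'ID': np.arange(sz),
                    'pID': np.arange(0, 2 * sz, 2),
                    'Values': np.random.random(sz)})
df2 = pd.DataFrame({'ID': np.concatenate([np.arange(sz // 2),
                                          np.arange(sz // 2)]),
                    'pid': np.arange(0, 2 * sz, 2),
                    'Values': np.random.random(sz)})

# Build the indexed frame once; every subsequent join reuses it.
df2_indexed = df2.set_index(['ID', 'pid'])
result = df1.join(df2_indexed, how='left', on=['ID', 'pID'],
                  lsuffix='_x', rsuffix='_y')
```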