I have two dataframes, df1:
group value
g1 A
g1 B
g1 C
g1 D
g2 B
g2 C
g2 E
g3 A
g3 D
g3 E
g4 B
g4 D
...
and df2:
value1 value2
A B
A C
B C
B D
...
I would like to merge df1 and df2 by merging columns value with both value1 and value2, namely:
group value1 value2
g1 A B
g1 A C
g1 B C
g2 B C
g1 B D
g4 B D
...
As shown, group illustrates two columns value1 and value2 only if the two values simultaneously corresponds to the same group in df1.
df = df1.merge(df2, left_on='value', right_on='value1', how='inner')
df = df.merge(df2, left_on='value', right_on='value2', how='inner')
df = df.drop(['value'], axis=1)
The problem is, my actual dataframes df1 and df2 are quite large and every time I try to run the code, the session crashes (Not to mention, not quite sure if my code is correct to begin with).
Any idea how to optimize this process (and to confirm whether my approach is correct)?