I want to find the common rows between 2 dataframe. To find the common rows, I can use inner_join(), semi_join(), and merge(). I have gone through different posts including this. But, these operations are not fulfilling my purposes. Because my data in the dataframe is a little different!
Sometimes, the data in the dataframe can be vise versa. Like the 3rd and 5th rows of dataframe-1 and dataframe-2. Dataframe-1 contains A3 A1 0.75 but Dataframe-2 contains A1 A3 0.75 . I would like to take these 2 rows as the same.
My first dataframe looks like
  query target weight
1    A1     A2   0.60
2    A2     A5   0.50
3    A3     A1   0.75
4    A4     A5   0.88
5    A5     A3   0.99
6    (+)-1(10),4-Cadinadiene     Falcarinone-10     0.09
7    Leucodelphinidin-100    (+)-1(10),4-Cadinadiene     0.876
8    Lignin  (2E,7R,11R)-2-Phyten-1-ol   0.778
9    (2E,7R,11R)-2-Phyten-1-ol   Leucodelphinidin    0.55
10   Falcarinone     Lignin  1
11   A1  (+)-1(10),4-Cadinadiene     1
12   A2  Lignin-10  1
13   A3  (2E,7R,11R)-2-Phyten-1-ol   1
14   Falcarinone  A6    1
15   A4  Leucodelphinidin    1
16   A4  Leucodelphinidin    1
17   Falcarinone  A100    1
18   A4  Falcarinone     1
the second dataframe looks like
  query target
1    A1     A2   
2    A2     A5   
3    A1     A3  // Missing in the output
4    A4     A5   
5    A3     A5  // Missing in the output
6    A3  (2E,7R,11R)-2-Phyten-1-ol   
7    (+)-1(10),4-Cadinadiene     Falcarinone    
8    Leucodelphinidin    (+)-1(10),4-Cadinadiene-100    
9    Lignin-2  (2E,7R,11R)-2-Phyten-1-ol   
10   A11  (+)-1(10),4-Cadinadiene    
11   A2  Lignin  
12   A3  (2E,7R,11R)-2-Phyten-1-0l 
13   Falcarinone  A60    
14   A4  Leucodelphinidin  // Missing in the output
The code I am using
output <- semi_join(Dataframe-1, Dataframe-2) OR
output <- inner_join(df_only_dd, sample_data_dd_interaction)
The output I am getting
  query target weight
1    A1     A2   0.60
2    A2     A5   0.50
But, my expected output is like this
  query target weight
1    A1     A2   0.60
2    A2     A5   0.50
3    A3     A1   0.75
4    A4     A5   0.88
5    A5     A3   0.99
6    A4  Leucodelphinidin  1
Reproducible code is given below
df_1 <- read.table(text="query   target     weight
A1  A2  0.6
A2  A5  0.5
A3  A1  0.75
A4  A5  0.88
A5  A3  0.99
(+)-1(10),4-Cadinadiene     Falcarinone     0.09
Leucodelphinidin    (+)-1(10),4-Cadinadiene     0.876
Lignin  (2E,7R,11R)-2-Phyten-1-ol   0.778
(2E,7R,11R)-2-Phyten-1-ol   Leucodelphinidin    0.55
Falcarinone     Lignin  1
A1  (+)-1(10),4-Cadinadiene     1
A2  Lignin  1
A3  (2E,7R,11R)-2-Phyten-1-ol   1
Falcarinone  A6    1
A4  Leucodelphinidin    1
A4  Leucodelphinidin    1
Falcarinone  A100    1
A5  Falcarinone     1", header=TRUE)
df_2 <- read.table(text="query   target
A1  A2 
A2  A5
A1  A3  
A4  A5  
A3  A5  
(+)-1(10),4-Cadinadiene     Falcarinone    
Leucodelphinidin    (+)-1(10),4-Cadinadiene-100    
Lignin-2  (2E,7R,11R)-2-Phyten-1-ol   
A11  (+)-1(10),4-Cadinadiene    
A2  Lignin  
A3  (2E,7R,11R)-2-Phyten-1-0l 
Falcarinone  A6    
A4  Leucodelphinidin  ", header=TRUE)
Any kind of suggestion is appreciated.
 
     
    