Please can you help me again?
I have a data frame with 4 columns: two hold gene symbols and two hold the rank I have assigned to each gene symbol, like this:
     mb_rank  mb_gene  ts_rank  ts_gene
[1]  1        BIRCA    1        MYCN
[2]  2        MYCN     2        MOB4
[3]  3        ATXN1    3        ABHD17C
[4]  4        ABHD17C  4        AEBP2
...and so on, for up to 6000 rows in some data sets.
The ts columns are usually a lot longer than the mb columns.
I want to arrange the data so that non-duplicates are removed, leaving only genes that appear in both gene columns (`mb_gene` and `ts_gene`) of the data frame, each paired with its rank from both lists, e.g.
     mb_rank  mb_gene  ts_rank  ts_gene
[1]  2        MYCN     1        MYCN
[2]  4        ABHD17C  3        ABHD17C
In this example of the desired outcome, the non-duplicated genes have been removed leaving only genes that appeared in both lists to begin with.
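To make the desired outcome concrete, here is a minimal sketch in R that rebuilds the four example rows and produces the table above. It assumes the data frame is called `df`, that the shorter mb columns are padded with `NA`, and that each gene appears at most once per list; `mb`, `ts`, and `both` are illustrative names only.

```r
# Example data reconstructed from the four rows shown above (illustrative only).
df <- data.frame(
  mb_rank = 1:4,
  mb_gene = c("BIRCA", "MYCN", "ATXN1", "ABHD17C"),
  ts_rank = 1:4,
  ts_gene = c("MYCN", "MOB4", "ABHD17C", "AEBP2"),
  stringsAsFactors = FALSE
)

# Treat the two lists as separate tables, dropping any NA padding
# (assumption: the shorter mb columns are padded with NA).
mb <- df[!is.na(df$mb_gene), c("mb_rank", "mb_gene")]
ts <- df[!is.na(df$ts_gene), c("ts_rank", "ts_gene")]

# Inner join on the gene symbol keeps only genes present in both lists.
both <- merge(mb, ts, by.x = "mb_gene", by.y = "ts_gene")

# merge() keeps a single key column, so restore ts_gene and the original layout.
both$ts_gene <- both$mb_gene
both <- both[order(both$mb_rank), c("mb_rank", "mb_gene", "ts_rank", "ts_gene")]
rownames(both) <- NULL

print(both)
#   mb_rank mb_gene ts_rank ts_gene
# 1       2    MYCN       1    MYCN
# 2       4 ABHD17C       3 ABHD17C
```

The join matches on the gene symbol itself rather than on row position, which is why the ranks from each list stay attached to the right gene.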
I have tried many things, for example:
`df[df$mb_gene %in% df$ts_gene,]` 
but it doesn't work: it seems to hit some genes and miss others.
I also attempted to write an `if` function, but my skills are too limited.
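A small reproduction of the `%in%` attempt above, using only the four example rows (the `data.frame()` call is an illustrative reconstruction, not the real data), suggests why the result looks hit and miss. The filter keeps or drops whole rows based on `mb_gene` alone, so the `ts_rank` and `ts_gene` values in each kept row are whatever happened to sit beside it.

```r
# Example data reconstructed from the four rows in the question (illustrative only).
df <- data.frame(
  mb_rank = 1:4,
  mb_gene = c("BIRCA", "MYCN", "ATXN1", "ABHD17C"),
  ts_rank = 1:4,
  ts_gene = c("MYCN", "MOB4", "ABHD17C", "AEBP2"),
  stringsAsFactors = FALSE
)

# The attempt: keep rows whose mb_gene occurs anywhere in ts_gene.
attempt <- df[df$mb_gene %in% df$ts_gene, ]

print(attempt)
#   mb_rank mb_gene ts_rank ts_gene
# 2       2    MYCN       2    MOB4
# 4       4 ABHD17C       4   AEBP2
```

Here MYCN and ABHD17C are kept on the mb side, but they sit next to MOB4 and AEBP2 rather than next to their own ts entries, because the ts columns are never re-aligned.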
I hope I have described this well enough but if I can clarify anything please ask, I'm really stuck. Thanks in advance!
 
     
     
    