I am stuck with the following problem. I have two dataframes (df1, df2) that I'd like to join with left_join(df1, df2, by=c("a", "b", "c")), using the three variables a, b and c. I then noticed that the number of rows in the joined dataframe had increased, so I checked whether there were any duplicate entries:
duplicated(paste(df1$a, df1$b, df1$c))
duplicated(paste(df2$a, df2$b, df2$c))
I found several duplicate entries in both dataframes. Now here comes my question: how can I exclude those duplicates before I join the two dataframes? My problem is that duplicated() only marks the repeated values (i.e. the second and any later appearances). I would like to exclude the first appearance too, so that no row with a duplicated combination of a, b and c remains.
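To make the goal concrete, here is a minimal sketch of one possible approach in base R. It is only an illustration under assumptions: the toy data, the extra columns x and y, and the helper name drop_all_dups are made up, and merge(..., all.x = TRUE) stands in for left_join() so the example runs without dplyr.

```r
# Toy data: a, b, c are the join keys from the question; x and y are invented.
df1 <- data.frame(a = c(1, 1, 2, 3), b = c("u", "u", "v", "w"),
                  c = c(10, 10, 20, 30), x = c("p", "q", "r", "s"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(a = c(2, 3, 3, 4), b = c("v", "w", "w", "z"),
                  c = c(20, 30, 30, 40), y = c(100, 200, 300, 400),
                  stringsAsFactors = FALSE)

keys <- c("a", "b", "c")

# Hypothetical helper: drop every row whose key combination occurs more than once.
# duplicated() flags the second and later occurrences; running it again with
# fromLast = TRUE flags the first and earlier ones, so the union covers all copies.
drop_all_dups <- function(df, keys) {
  k <- df[, keys, drop = FALSE]
  dup <- duplicated(k) | duplicated(k, fromLast = TRUE)
  df[!dup, , drop = FALSE]
}

df1_unique <- drop_all_dups(df1, keys)
df2_unique <- drop_all_dups(df2, keys)

# Base R left join; with dplyr loaded,
# left_join(df1_unique, df2_unique, by = keys) should give the same rows.
joined <- merge(df1_unique, df2_unique, by = keys, all.x = TRUE)
```

Once the keys in df2 are unique, the left join can no longer add rows, so nrow(joined) should equal nrow(df1_unique). With dplyr, grouping by a, b and c and keeping only groups where n() == 1 would be another way to express the same filter.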
Thank you for your help!