Let's say these are my two datasets:
# This is dataset "df"
df <- tibble(country = rep(c("US", "UK", "FRA", "SPA"), times = 3),
             date = rep(c("2020-01-01", "2020-02-01", "2020-03-01"), each = 4))
   country date
 1 FRA     2020-01-01
 2 SPA     2020-01-01
 3 UK      2020-01-01
 4 US      2020-01-01
 5 FRA     2020-02-01
 6 SPA     2020-02-01
 7 UK      2020-02-01
 8 US      2020-02-01
 9 FRA     2020-03-01
10 SPA     2020-03-01
11 UK      2020-03-01
12 US      2020-03-01
# This is dataset "dd"
dd <- tibble(country = rep(c("US", "UK", "FRA"), times = 3),
             date = rep(c("2020-01-01", "2020-02-01", "2020-03-01"), each = 3),
             cases = seq(1:9),
             deaths = seq(from = 2, to = 18, by = 2))
  country date       cases deaths
1 FRA     2020-01-01     1      2
2 UK      2020-01-01     2      4
3 US      2020-01-01     3      6
4 FRA     2020-02-01     4      8
5 UK      2020-02-01     5     10
6 US      2020-02-01     6     12
7 FRA     2020-03-01     7     14
8 UK      2020-03-01     8     16
9 US      2020-03-01     9     18
I would like to create a new dataset called dj which combines df and the cases column in dd by country and date. I also want NA to be printed in the cases column (in dj) where there is missing data (i.e. for country "SPA").
In essence, I am trying to get dj to have 12 rows and 3 columns (country, date, cases), with NA for cases in "SPA". Like this:
   country date       cases
1  FRA     2020-01-01     1
2  SPA     2020-01-01    NA
3  UK      2020-01-01     2
4  US      2020-01-01     3
5  FRA     2020-02-01     4
6  SPA     2020-02-01    NA
7  UK      2020-02-01     5
8  US      2020-02-01     6
9  FRA     2020-03-01     7
10 SPA     2020-03-01    NA
11 UK      2020-03-01     8
12 US      2020-03-01     9
I know you can combine two datasets using dplyr package *_join() functions, but how can I use these when I only want to merge a dataset with one particular column from another dataset and use more than one 'by' argument (i.e. country and date)??
I'm having difficulty in figuring this out myself since I'm new to R, so any help would be very much appreciated :)
 
     
    