I merged two datasets from the same survey. One is on the individual level selectedindividual and one on the household level, selectedhousehold. I have merged the two datasets using the following code (using left_join from dplyr):
mergeddf <- left_join(selectedhousehold, selectedindividual)  %>% group_by(shserial)  %>% slice (1)
The shserial variable is the primary key present in both datasets. Each household has a specific number. Each household can contain up to two individuals, which will therefore have the same shserial. Because I want to conduct my analysis on the household level, I did not want to just use merge() on the datasets (as this duplicated the observations in households with 2 individuals, for, for example, one of my dependent variables on the household level - GrossIncome). 
I still have a problem though:
I have a variable on the individual level, WrkStat (with three levels, working, NWork, FTEduc) that I want to include in my model. The code that I used for merging the datasets only retained the first observations for two shserials with the same number (I assume this, at least, I could not figure out how slice() works exactly). This is not great for my analysis as I don't want to select one of two individuals in a household at random. To illustrate, the summary statistics of WrkStat in both the non-merged and the merged dataframes: 
> summary(selectedindividual$WrkStat)
working  FTEduc   NWork    NA's 
    324     748    2455     201 
> summary(mergeddf$WrkStat)
working  FTEduc   NWork    NA's 
    251      77    2097       5 
As a solution, I figured I would create a new variable, WrkStat2 which combines the observations for two individuals in one household. I want to create this variable before merging the datasets. 
I was hoping I could create this new variable on the basis of the mutual shserial number. 
However, I can't figure out how to do this. 
EDIT:
The structure of my dataframe:
selectedindividual <- structure(list(`shserial` = c(1010574, 1010574, 
1011104, 1011104, 1011109, 1011109, 1011134, 1011134, 1011142, 
1011143, 1011148, 1011148, 1011154, 1011154, 1011156, 1011171, 
1011171, 1011174, 1011174, 1011182), `WrkStat` = structure(c(3L, 
2L, 3L, 2L, 3L, NA, 1L, NA, 3L, 3L, 3L, 2L, 3L, 2L, 3L, 1L, 2L, 
3L, NA, 3L), .Label = c("working", "FTEduc", "NWork"), class = "factor")), row.names = c(NA, 
-20L), class = c("tbl_df", "tbl", "data.frame"))
Gives this output:
   shserial WrkStat
      <dbl> <fct>  
 1  1010574 NWork  
 2  1010574 FTEduc 
 3  1011104 NWork  
 4  1011104 FTEduc 
 5  1011109 NWork  
 6  1011109 NA     
 7  1011134 working
 8  1011134 NA     
 9  1011142 NWork  
10  1011143 NWork  
11  1011148 NWork  
12  1011148 FTEduc 
13  1011154 NWork  
14  1011154 FTEduc 
15  1011156 NWork  
16  1011171 working
17  1011171 FTEduc 
18  1011174 NWork 
I would like this output:
   shserial WrkStat2
      <dbl> <fct>  
 1  1010574 NWork/FTEduc  
 2  1011104 NWork/FTEduc 
 3  1011109 NWork     
 4  1011134 working
 5  1011142 NWork  
 6  1011143 NWork  
 7  1011148 NWork/FTEduc
 8  1011154 NWork/FTEduc 
 9  1011156 NWork  
10  1011171 working/FTEduc
11  1011174 NWork 
(This also removes the NA's that are not on it's own (so not a combined WrkStat), although I think it would also be fine if all the NA's would be removed in this process).
Another edit:
WrkStat2 should have the following labels: 
"working/working",
"working/NWork",
"working/FTEduc",
"NWork/NWork",
"NWork/FTEduc",
"FTEduc/FTEduc",
"working",
"NWork",
"FTEduc"
I'm sorry if something is not clear or if the whole things is impossible (let me know). I am not great at thinking logically and have struggled with this for a few days.
 
    