Background
I have two df's in r, one called d and the other called insurance. d contains a list of unique ID along with two attributes, gender and zip (which are irrelevant to this question). insurance contains some of those same ID numbers, but they aren't unique. They represent people's health insurance plans in a given year. Some ID in the insurance table repeat, because that person has had more than one insurance plan in that year (i.e. they changed insurers). Here's some code to make these tables:
d <- data.frame(ID = c("a","b","c","d","e","f"), 
                    gender = c("f","f","m","f","m","m"), 
                    zip = c(48601,60107,29910,54220,28173,44663),stringsAsFactors=FALSE)
Which looks like:
| ID | gender | zip | 
|---|---|---|
| a | f | 48601 | 
| b | f | 60107 | 
| c | m | 29910 | 
| d | f | 54220 | 
| e | m | 28173 | 
| f | m | 44663 | 
insurance <- data.frame(ID = c("a","a","c","d","f","f"), 
                     ins_type = c("public","private","private","private","private","public"), 
                     insurer = c("medicare","aetna","cigna","uhc","uhc","medicaid"),
                     stringsAsFactors = FALSE)
Which looks like:
| ID | ins_type | insurer | 
|---|---|---|
| a | public | medicare | 
| a | private | aetna | 
| c | private | cigna | 
| d | private | uhc | 
| f | private | uhc | 
| f | public | medicaid | 
Here's my goal:
I need d to reflect whether anyone in d$ID has had any public insurance, and if so, which insurer. Specifically, this means "looking up" to the insurance table and making 2 new variables in d: first, a 1/0 or yes/no variable for ever having public insurance (call this variable d$public); second, which insurer it is (call this d$insurer).
The tricky bit is that I need d$ID to remain unique, because it has to be a primary key for another aspect of the project I haven't outlined here. So the idea is if any ID has any entry for public insurance, then d$public should get a "1" or a "yes" or whatever.
Desired result
I'd like a table that looks like this:
| ID | gender | zip | public | insurer | 
|---|---|---|---|---|
| a | f | 48601 | 1 | medicare | 
| b | f | 60107 | 0 | NA | 
| c | m | 29910 | 0 | NA | 
| d | f | 54220 | 0 | NA | 
| e | m | 28173 | 0 | NA | 
| f | m | 44663 | 1 | medicaid | 
What I've tried
Versions of this question have been asked before (e.g. here, here) but I can't quite get this right.
I've tried using joins, like so:
d2 <- d %>%
  left_join(insurance, by=c("ID"="ID"))
This gets me the columns I want, but it makes IDs like a repeat, which I can't have.
Thanks for any help!
 
     
    