My dataset contains states, and I would like to add a new variable (column) called Region, e.g. Pacific: Oregon, California, Washington; Rocky Mountains: Nevada, Montana, Idaho; etc.
I am confused about where to go from here. Any ideas?
 
    
    The classic way to do this would be with merge(), or (since you added the tidyr tag, so you're in the "Hadleyverse") dplyr::full_join().  Assuming you have one data frame with states and other data:
d1 <- data.frame(state=c("Alaska","Massachusetts",
                 "Massachusetts","Florida"),
                 other_stuff=1:4)
and another data frame containing the matches between the states and their regions:
d2 <- data.frame(state=c("Alaska","Massachusetts","Florida"),
                 region=c("Western","Northeast","Southeast"))
Then
library("dplyr")
d1 %>% full_join(d2,by="state")
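should do what you want. (If d2 lists states that never appear in d1, full_join() adds rows for them; dplyr::left_join() keeps only the rows of d1.)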
But it's up to you to figure out where to get d2, or the equivalent information, from.
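One possible source for d2, as a sketch: base R ships the vectors state.name and state.division (in the datasets package), which map the 50 states to Census divisions such as "Pacific" and "Mountain". Treating those divisions as your regions is an assumption; they may not match the grouping you have in mind (for instance, Alaska and Hawaii count as Pacific). The example uses base merge() so it runs without extra packages.

```r
# Lookup table built from datasets shipped with base R.
# Assumption: Census divisions are an acceptable definition of "region".
d2 <- data.frame(state = state.name,
                 region = as.character(state.division),
                 stringsAsFactors = FALSE)

# Example data with the same shape as d1 in the answer.
d1 <- data.frame(state = c("Alaska", "Massachusetts",
                           "Massachusetts", "Florida"),
                 other_stuff = 1:4,
                 stringsAsFactors = FALSE)

# all.x = TRUE keeps every row of d1, even if a state has no match
# (its region is then NA). Note that merge() sorts the result by state.
result <- merge(d1, d2, by = "state", all.x = TRUE)
print(result)
```

If your regions differ from the Census divisions, you can build d2 by hand in the same two-column shape and the merge() call stays the same.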
 
    
    Since you did not provide your data, I'll assume it looks something like this:
df <- data.frame(state = c("Alabama", "Alaska", "Arizona", "Arkansas", "California", "Oregon", "Washington"))
This assumes your data.frame has a column (here df$state) holding the state name. You can then create a new variable called region like this; states you have not assigned yet stay NA:
df$region <- NA_character_
df$region[df$state %in% c("California", "Oregon", "Washington")] <- "Pacific"
df
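If there are many states to assign, a named lookup vector may be shorter than writing one assignment per region. This is a minimal sketch: the name region_of is made up for illustration, and the groupings are the ones listed in the question.

```r
df <- data.frame(state = c("Alabama", "Alaska", "Arizona", "Arkansas",
                           "California", "Oregon", "Washington"),
                 stringsAsFactors = FALSE)

# Hypothetical lookup: names are states, values are regions.
region_of <- c(California = "Pacific", Oregon = "Pacific",
               Washington = "Pacific",
               Nevada = "Rocky Mountains", Montana = "Rocky Mountains",
               Idaho = "Rocky Mountains")

# Index the lookup by state name; states not in the lookup become NA.
df$region <- unname(region_of[df$state])
df
```

Adding a state later only means adding one entry to region_of; the assignment line does not change.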
