I am an R newbie, and this is my first question. I have a dataset containing 1) all US zip codes, 2) a unique count of sales transactions, and 3) the sum of sales transactions. Is there a way to obtain the coefficient of determination (R^2) for every zip code, using the count of sales as the x variable and the sum of sales transactions as the y variable? Specifically, I am looking to create a table with one R^2 per US zip code from these two variables.
1 Answer
You can do this with the purrr package by splitting the data into groups and fitting one model per group.
Here is an example with mtcars, grouped by cyl:
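library(purrr)  # also re-exports the %>% pipe
mtcars %>%
  split(.$cyl) %>%                      # one data frame per cylinder count
  map(~ lm(mpg ~ wt, data = .x)) %>%    # fit a model within each group
  map(summary) %>%
  map_dbl("r.squared") %>%              # extract R^2 as a named numeric vector
  data.frame(cyl = names(.), r2 = ., row.names = NULL)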
  cyl        r2
1   4 0.5086326
2   6 0.4645102
3   8 0.4229655
And here is the flow for your problem. Everything in "quotes" is a placeholder: replace it with your own data frame or column names, written without the quotes. The one exception is "r.squared", which must stay exactly as it is.
df <- "your dataframe" %>%
  split(.$"zipcode") %>%
  map(~ lm("sum of sales" ~ "count of sales", data = .x)) %>%
  map(summary) %>%
  map_dbl("r.squared") %>% 
  data.frame(zipcode = names(.), r2 = ., row.names = NULL)
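For a version you can run end to end, here is a minimal sketch in base R (swapped in for purrr so it has no package dependencies). The data frame `sales_df` and the columns `zipcode`, `count_sales`, and `sum_sales` are made-up names filled with simulated values, not anything from your actual data. It also assumes there are several rows per zip code; a zip code with only one or two rows cannot give a meaningful R^2.

```r
# Hypothetical data: 10 rows for each of 3 zip codes.
# All names and values here are assumptions for illustration only.
set.seed(1)
sales_df <- data.frame(
  zipcode     = rep(c("10001", "60601", "94105"), each = 10),
  count_sales = rep(1:10, times = 3),
  stringsAsFactors = FALSE
)
sales_df$sum_sales <- 50 * sales_df$count_sales +
  rnorm(nrow(sales_df), sd = 40)

# Split into one data frame per zip code.
groups <- split(sales_df, sales_df$zipcode)

# Fit sum_sales ~ count_sales within each zip code and pull out R^2.
r2 <- vapply(
  groups,
  function(d) summary(lm(sum_sales ~ count_sales, data = d))$r.squared,
  numeric(1)
)

# Collect the results in a two-column table.
r2_table <- data.frame(
  zipcode = names(r2),
  r2 = unname(r2),
  stringsAsFactors = FALSE
)

# Cross-check: with a single predictor, R^2 equals the squared
# Pearson correlation between x and y.
r2_cor <- vapply(
  groups,
  function(d) cor(d$count_sales, d$sum_sales)^2,
  numeric(1)
)
agree <- isTRUE(all.equal(unname(r2), unname(r2_cor)))

print(r2_table)
```

Keeping `zipcode` as character rather than numeric preserves leading zeros such as 02134. The `agree` check is only there as a sanity test; if all you need is the R^2 table, the squared correlation per group gives the same numbers without fitting a model.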
 
    
    
phiver
