I want to calculate a Pearson correlation between several columns. The solution JasonAizkalns posted in this thread is very useful for me.
  df %>%
  select_if(is.numeric) %>%
  group_by(year) %>%
  group_map(~ correlate(.x))
Now I'm wondering two things:
- How can I get p-values?
- Why are some correlation coefficients marked in red? I have not found anything about it in the documentation. Are these already the significant correlations? If yes, which significance level is used?
I am looking for the simplest possible extension of this code, without having to switch to a completely different method.
Thanks for any tips!
Edit 1 (11/28/22): Because my grouping variable ("trainingsmodus") is a character variable and I get the following error message, I have adapted my code.
Error in `group_by()`:
! Must group by variables found in `.data`.
✖ Column `trainingsmodus` is not found.
Backtrace:
- ... %>% ...
- dplyr:::group_by.data.frame(., trainingsmodus)
My adapted code:
df %>%
  select_if(is.character) %>%
  group_by(year) %>%
  group_map(~ correlate(.x)) %>%
  add_column(year)
Even if I create the grouping variable as a numeric variable, the results for both groups are exactly identical, which makes no sense. Does anyone have a tip on how I can correct the code?
Edit 2 (11/28/22): Reproducible example of my df and the code:
df <- data.frame(year = c("lorem", "ipsum", "lorem", "ipsum"),
                 var1 = 4:7,
                 var2 = 5:8,
                 var3 = 6:9,
                 var4 = 7:10)
library(dplyr)    # select_if(), group_by(), group_map()
library(rstatix)  # cor_test()
df %>%
  select_if(is.character) %>%
  group_by(year) %>%
  group_map(~ cor_test(df,
                       vars = c("var1", "var2", "var3", "var4"),
                       vars2 = c("var1", "var2", "var3", "var4")) %>%
              filter(is.finite(statistic)))