Using OECD data, I can retrieve a database where variables are specified by their IDs and a list with the corresponding labels. Here is a minimal example that reproduces the data structure:
df <- tibble(LOCATION=c("DEU","ITA","USA"),UNIT=c("DEU","EUR","USD"),value=c(120,140,160))
df
## A tibble: 3 x 3
#> LOCATION UNIT value
#> <chr> <chr> <dbl>
#> 1 DEU DEU 120
#> 2 ITA EUR 140
#> 3 USA USD 160
df_labels <- list(LOCATION = data.frame(id =c("DEU","ITA","USA"),
label=c("Germany","Italy","United States")),
UNIT = data.frame(id=c("USD","EUR"),
label=c("Dollar","Euro")))
df_labels
#> $LOCATION
#> id label
#> 1 DEU Germany
#> 2 ITA Italy
#> 3 USA United States
#>
#> $UNIT
#> id label
#> 1 USD Dollar
#> 2 EUR Euro
What I want to do is to replace the IDs in variables LOCATION and UNIT in df with the corresponding labels provided in df_labels.
I defined the following function:
get_labels <- function(x,db) {
variable = deparse(substitute(x))
return(factor(x,levels=db[[variable]]$id,labels=db[[variable]]$label))
}
so that I can use it in mutate as follows:
df %>% mutate(LOCATION = get_labels(LOCATION,df_labels),
UNIT = get_labels(UNIT,df_labels))
## A tibble: 3 x 3
#> LOCATION UNIT value
#> <fct> <fct> <dbl>
#> 1 Germany Euro 120
#> 2 Italy Euro 140
#> 3 United States Dollar 160
However, I haven't been able to use the function across multiple columns. If I try it using across:
df %>% mutate(across(where(is.character), ~get_labels(.,df_labels)))
the result is an NA in the affected columns. Apparently, the problem is with deparse(substitute(.)), which does not capture the column names. Unfortunately, looking at similar questions such as this one didn't help.