My question builds on the data.table answer to this question (full disclosure: I also asked the linked question). I have benefited greatly from other SO questions and answers, and I've spent a lot of time reading about functions, but I haven't succeeded yet.
I've got a few lines of code that work well for my purposes, but I have to run the same code for 5 different variables. Therefore, I would like to write a function to make this process more efficient.
Sample data frame:
    #load data.table, which all of the steps below rely on
    library(data.table)
    id <- c(1, 1, 1, 1, 2, 3, 4, 4, 5, 5, 5)
    bmi <- c(18, 22, 23, 23, 20, 38, 30, 31, 21, 22, 24)
    other_data <- c("north_africa", "north_africa", "north_africa", "north_africa", "western_europe", "south_america", "eastern_europe", "eastern_europe", "ss_africa", "ss_africa", "ss_africa")
    other_data2 <- c(0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0)
    big_df <- data.frame(id, bmi, other_data, other_data2)
    #first make a data table with just the id and bmi columns
    bmi_dt <- as.data.table(big_df[c(1, 2)])
    #restructure data so that each ID only has one row
    bmi_dt <- bmi_dt[, c(bmi_new = paste(bmi, collapse = "; "), .SD), by = id][!duplicated(bmi_dt$id)]
    #split the strings of multiple numbers into 4 new cols
    bmi_dt[, c("bmi1", "bmi2", "bmi3", "bmi4") := tstrsplit(as.character(bmi_new), "; ", fixed=TRUE)]
    #make columns numeric
    bmi_dt <- bmi_dt[, lapply(.SD, as.numeric), by = id]
    #function to replace NA with 0 in a data table
    func_na <- function(DT) {
       for (i in names(DT))
          #the parentheses make data.table assign to the column named by i
          DT[is.na(get(i)), (i) := 0]
    }
    func_na(bmi_dt)
That last part, the function, was written by Matt Dowle in this SO answer.
I have been trying to create an overall function for this sequence by starting small, but even the most basic part won't work properly. This is one of my failed attempts:
    big_func <- function(DT, old_col, id_col) {
      DT <- DT[, c(new_col = paste(old_col, collapse = "; "), .SD), by = id_col][!duplicated(id_col)]
      DT
    }  
    test <- big_func(bmi_dt, bmi, id)
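For comparison, here is a minimal sketch of the behaviour the first step is aiming for, assuming the column names are passed as strings rather than as bare names. The helper name `collapse_col`, its arguments, and the small `example_dt` table are hypothetical and only illustrate the idea; this is not taken from the linked answers and is not necessarily the best way to do it.

```r
library(data.table)

# Hypothetical helper: collapse the values of one column into a single
# "; "-separated string per id. Column names are passed as strings.
collapse_col <- function(DT, old_col, id_col,
                         new_col = paste0(old_col, "_new")) {
  # .SDcols selects the column named in old_col, so .SD[[1]] is that column;
  # by accepts a character vector of grouping column names
  out <- DT[, .(collapsed = paste(.SD[[1]], collapse = "; ")),
            by = id_col, .SDcols = old_col]
  # rename the temporary column to the requested name
  setnames(out, "collapsed", new_col)
  out[]
}

# Small made-up table, only for illustration
example_dt <- data.table(id = c(1, 1, 2), bmi = c(18, 22, 20))
res <- collapse_col(example_dt, "bmi", "id")
```

With this sketch, `res` has one row per `id` and a character column `bmi_new`, so the same call could be repeated for each of the 5 variables by changing only the `old_col` argument.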
I'd really like to understand:
a) Why doesn't my attempt work for the first part?
b) Does it make sense to create one large function for all of this?
c) If so, how do I do that?
Edit: I see now that there is a good question about reshaping data tables here. I think my question about writing functions is a separate issue.