I'm using the R mice package to impute randomly missing questionnaire item values for a few participants. The sum score of the questionnaire is later used as a level-2 predictor of reaction times in a task (multiple trials, level 1) in a multilevel model fitted with brms.
I have already tried two different approaches to create a mids object that includes all data and can later be passed to brm_multiple, but neither has worked so far:
1.) I kept the data frames separate, imputed the item values in the questionnaire data frame, created a long-format data frame containing the original data and all imputations (using the complete function), and calculated the sum score for each participant in each imputation (using rowSums). Afterwards, I joined this long data frame with the level-1 reaction time data (using full_join) and tried to convert it into a mids object (as.mids). This failed, however, because the join produced multiple occurrences of each .id value.
2.) I joined the data frames before imputation and tried to impute only the level-2 questionnaire items by extending mice with miceadds. Here, I defined only the item scores as predictors via the predictor matrix, and specified 2lonly.function as the method, the appropriate imputation function, and ID as the cluster variable. This resulted in: Error in edit.setup(data, setup, ...) : `mice` detected constant and/or collinear variables. No predictors were left after their removal.
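For illustration, here is a minimal sketch of how approach 2 could be set up. It is not my original code: it swaps in mice's built-in `2lonly.pmm` method for miceadds' `2lonly.function`, uses simulated stand-in data (`questionnaire`, `trials`, and `joined` are hypothetical names), and assumes the cluster ID is numeric rather than a factor. Whether the factor ID is what triggers the error above is only a guess.

```r
library(mice)

set.seed(123)

# Simulated stand-in data (hypothetical): 20 participants, 10 trials each
n_id <- 20
n_trials <- 10
questionnaire <- data.frame(
  participant = seq_len(n_id),  # numeric cluster ID (assumption)
  scale1 = rnorm(n_id, mean = 20, sd = 4),
  scale2 = rnorm(n_id, mean = 17, sd = 5)
)
questionnaire$scale1[c(3, 4, 11)] <- NA
questionnaire$scale2[c(2, 12, 16)] <- NA
trials <- data.frame(
  participant = rep(seq_len(n_id), each = n_trials),
  RT = round(rnorm(n_id * n_trials, mean = 400, sd = 30))
)

# Join first, so that every trial row carries the participant's scale values
joined <- merge(trials, questionnaire, by = "participant")

# Impute only the two level-2 variables, using a level-2-only method
meth <- make.method(joined)
meth[c("scale1", "scale2")] <- "2lonly.pmm"

# Predictor matrix: -2 marks the cluster variable, 1 marks an ordinary predictor
pred <- make.predictorMatrix(joined)
pred[, ] <- 0
pred["scale1", c("participant", "RT", "scale2")] <- c(-2, 1, 1)
pred["scale2", c("participant", "RT", "scale1")] <- c(-2, 1, 1)

imp <- mice(joined, method = meth, predictorMatrix = pred,
            m = 5, printFlag = FALSE, seed = 123)
```

In this setup the imputed scale values should be constant within each participant, and the sum score would still have to be computed per imputation afterwards.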
Has anyone run into similar issues and managed to solve them?
--- Edit: here is a reproducible example for approach 1 (my preferred one):
# Fake dataset for the level-2 (questionnaire) data:
data1 <- structure(list(participant = structure(1:20, .Label = c("1",
"2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13",
"14", "15", "16", "17", "18", "19", "20"), class = "factor"),
scale1 = c(20.5176893097081, 17.1907529978866, NA, NA, 23.0900118234823,
16.825451016666, 17.9720180052918, 28.4363035263208, 26.0191098441877,
26.1444447937135, NA, 25.091133563164, 10.3353758051478,
18.0322232007671, 14.1767794585022, 20.9102922916395, 20.6239907650613,
17.661597152285, 18.3255223659322, 18.9958533053766),
scale2 = c(23.8446274459682,
NA, 13.3562256053306, 8.52823315494693, 18.3034641524201,
17.1100738924451, 20.0295218831116, 15.6986473122548, 14.9647149797442,
32.1875950434602, 25.255823725488, NA, 15.2625337013248,
17.6354282904461, 5.86783073951034, NA, 16.3987924521716,
11.3574747700045, 18.3557569542574, 18.741406021827)),
row.names = c(NA,
-20L), class = "data.frame")
# Fake dataset for the level-1 (reaction time) data:
data2 <- structure(list(participant = structure(c(1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 6L, 6L, 6L,
6L, 6L, 6L, 6L, 6L, 6L, 6L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L,
7L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 9L, 9L, 9L, 9L, 9L,
9L, 9L, 9L, 9L, 9L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L,
10L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 12L, 12L,
12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 13L, 13L, 13L, 13L, 13L,
13L, 13L, 13L, 13L, 13L, 14L, 14L, 14L, 14L, 14L, 14L, 14L, 14L,
14L, 14L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 16L,
16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 17L, 17L, 17L, 17L,
17L, 17L, 17L, 17L, 17L, 17L, 18L, 18L, 18L, 18L, 18L, 18L, 18L,
18L, 18L, 18L, 19L, 19L, 19L, 19L, 19L, 19L, 19L, 19L, 19L, 19L,
20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L),
.Label = c("1",
"2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13",
"14", "15", "16", "17", "18", "19", "20"), class = "factor"),
RT = c(416, 389, 383, 411, 354, 404, 354, 433, 411, 408,
339, 368, 474, 407, 411, 366, 401, 427, 415, 376, 398, 393,
391, 483, 466, 427, 372, 380, 360, 383, 374, 412, 412, 394,
403, 387, 427, 383, 362, 402, 397, 445, 393, 407, 450, 381,
395, 428, 423, 423, 435, 404, 405, 426, 392, 408, 383, 371,
409, 422, 386, 412, 420, 353, 429, 350, 395, 428, 428, 437,
423, 475, 444, 369, 360, 429, 365, 379, 391, 446, 405, 360,
354, 399, 428, 403, 432, 392, 394, 448, 474, 411, 398, 373,
415, 333, 401, 395, 403, 429, 344, 426, 391, 394, 456, 371,
339, 409, 373, 389, 384, 408, 436, 359, 394, 440, 415, 418,
401, 379, 330, 452, 388, 388, 315, 389, 399, 403, 344, 441,
404, 409, 357, 369, 385, 385, 452, 370, 436, 371, 403, 459,
466, 408, 451, 393, 355, 362, 418, 440, 360, 377, 400, 390,
369, 414, 390, 368, 381, 387, 386, 415, 387, 374, 442, 405,
441, 395, 420, 431, 435, 438, 420, 412, 391, 408, 409, 413,
371, 447, 392, 385, 421, 377, 419, 437, 401, 392, 431, 491,
412, 399, 446, 408, 369, 387, 372, 428, 389, 401)),
row.names = c(NA,
-200L), class = "data.frame")
library(mice)

# run imputation on the level-2 (questionnaire) data
imputed <- mice(data1)
# create a long data frame with the original data and all imputations,
# plus the sum score of the scales for each participant
data1_imputed <- complete(imputed, action = "long", include = TRUE)
data1_imputed$sumscore <- rowSums(data1_imputed[c("scale1", "scale2")])
# merge the imputed level-2 data with the level-1 (reaction time) data
data_all <- dplyr::full_join(data1_imputed, data2, by = "participant")
# try to create a mids object from the merged data - NOT WORKING
# (.id is no longer unique after the join)
merged_imputed <- as.mids(data_all)
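One possible way to sidestep the as.mids step in approach 1, sketched under assumptions: build one merged data frame per imputation and collect them in a list instead of a mids object. The sketch uses simulated stand-in data (`questionnaire`, `trials`, and `data_list` are hypothetical names) and excludes the participant ID from the imputation model, which is my own choice here.

```r
library(mice)

set.seed(123)

# Simulated stand-in data (hypothetical): 20 participants, 10 trials each
n_id <- 20
n_trials <- 10
questionnaire <- data.frame(
  participant = seq_len(n_id),
  scale1 = rnorm(n_id, mean = 20, sd = 4),
  scale2 = rnorm(n_id, mean = 17, sd = 5)
)
questionnaire$scale1[c(3, 4, 11)] <- NA
questionnaire$scale2[c(2, 12, 16)] <- NA
trials <- data.frame(
  participant = rep(seq_len(n_id), each = n_trials),
  RT = round(rnorm(n_id * n_trials, mean = 400, sd = 30))
)

# Impute the questionnaire data only; do not use the ID as a predictor
pred <- make.predictorMatrix(questionnaire)
pred[, "participant"] <- 0
imputed <- mice(questionnaire, predictorMatrix = pred,
                m = 5, printFlag = FALSE, seed = 123)

# For each imputation: complete the data, compute the sum score,
# and merge it onto the trial-level data
data_list <- lapply(seq_len(imputed$m), function(i) {
  completed <- complete(imputed, action = i)
  completed$sumscore <- rowSums(completed[c("scale1", "scale2")])
  merge(trials, completed[c("participant", "sumscore")], by = "participant")
})
```

The resulting list could then be passed as the `data` argument of `brms::brm_multiple()`, which accepts a list of data frames as well as a mids object. The model call is left out of the sketch because fitting requires compiling a Stan model.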