I would appreciate any help to do this:
- for each value of P, fit the model for each column of weights;
- do step 1 for all observations in the dataset to get p * w submodels, where p is the number of imputations P and w is the number of columns of weights;
- pool the results by combining the posterior samples of the submodels.
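The pooling step above amounts to concatenating the posterior draws of the submodels. As a minimal Python sketch (the `pool_draws` helper and the numeric draws are made-up placeholders, not output from any real fit):

```python
# Sketch of the pooling step: assuming each submodel's posterior draws for one
# parameter arrive as a plain list of floats, pooling is just concatenation of
# the p * w per-submodel samples into a single sample.

def pool_draws(submodel_draws):
    """Combine posterior draws from all submodels into one pooled sample."""
    pooled = []
    for draws in submodel_draws:
        pooled.extend(draws)
    return pooled

# e.g. 3 hypothetical submodels with 4 draws of beta_1 each (toy numbers)
submodel_draws = [
    [0.10, 0.12, 0.09, 0.11],
    [0.15, 0.13, 0.14, 0.16],
    [0.08, 0.10, 0.11, 0.09],
]
pooled = pool_draws(submodel_draws)  # 12 pooled draws
```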
A similar brms implementation is available here, also discussed here. But that only allows incorporating the uncertainty coming from the multiple imputations P; I want to account for the uncertainty coming from the estimation of the weights as well.
@paul.buerkner says there that we could also extract those data sets and then pass them to the actual model fitting function as a list of data frames. Using this approach, the solution would probably be to pass my individual p * w data frames, each with p=1 and w=1. Still, it is not clear how the list of these data frames could in practice be passed to the model fitting function.
In my case, the multiply imputed values of the predictors are in long format, with each imputation stage represented by a value of the variable P, and the set of weights estimated at each P are in wide format, represented by the variables w_1 to w_10.
There is also the Stan approach discussed here. This requires running Stan in parallel, which returns a list of stanfit objects, and then using sflist2stanfit to create one stanfit object from the list. The problem here is that I would need to separate my data frame into p * w = 50 datasets, or more than 100 (if I increase the number of imputations P to more than 10, as generally recommended), each with p=1 and w=1.
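To make the split concrete, here is a toy Python sketch of turning the long-format data (imputation index P, wide weight columns w_1..w_10) into one dataset per (imputation, weight column) pair. The `split_datasets` helper and all values are illustrative assumptions, not my real data:

```python
# Sketch of the split: long-format rows carry an imputation index "P" and wide
# weight columns w_1, w_2, ...; each (imputation, weight-column) pair becomes
# its own dataset with a single weight column renamed to "w".

def split_datasets(rows, n_weights):
    """Return one dataset per (P, weight column) combination."""
    imputations = sorted({row["P"] for row in rows})
    datasets = []
    for p in imputations:
        subset = [row for row in rows if row["P"] == p]
        for j in range(1, n_weights + 1):
            datasets.append([
                {"ncases": r["ncases"], "nn": r["nn"],
                 "A": r["A"], "B": r["B"], "w": r[f"w_{j}"]}
                for r in subset
            ])
    return datasets

# toy data: 2 imputations x 2 rows each, 3 weight columns -> 2 * 3 = 6 datasets
rows = [
    {"P": p, "ncases": 1, "nn": 5, "A": 0, "B": 1,
     "w_1": 0.9, "w_2": 1.1, "w_3": 1.0}
    for p in (1, 1, 2, 2)
]
datasets = split_datasets(rows, n_weights=3)
```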
Following @BobCarpenter's recommendation, below I present my attempt to parallelize the likelihood. It seems this approach could allow me to account for one of the sources of uncertainty separately, but not both jointly as intended. Here I am not certain how to specify the transformed parameters block and the likelihood to account for the uncertainty from P too.
Any help to improve and/or correct my current attempt would be much appreciated, as would any input on implementing any of the other approaches discussed above. I apologise for any basic mistakes in my code; while I am certain about what I want to do and why, the Stan implementation of my ideas is still challenging - still learning.
// model
data {
  int<lower=1> N;
  int<lower=1> P;
  int<lower=0> ncases[N];
  int A[N];
  int B[N];
  int<lower=0> nn[N];
  int id[N];
  matrix[N, 10] w;  // estimated weights w_1..w_10, one column per weight set
}
parameters {
  real beta_0;
  real beta_1;
  real beta_2;
  real beta_3;
}
transformed parameters {
  vector[N] odds;
  vector[N] pp_hat;
  for (i in 1:N) {
    odds[i] = exp(beta_0) * (1 + beta_1 * A[i] + beta_2 * B[i] + beta_3 * A[i] * B[i]);
    pp_hat[i] = odds[i] / (1 + odds[i]);
  }
}
model {
  beta_0 ~ normal(0, 5);
  beta_1 ~ normal(0, 5);
  beta_2 ~ normal(0, 5);
  beta_3 ~ normal(0, 5);
  for (i in 1:N) {
    for (j in 1:10) {
      target += w[i, j] * binomial_lpmf(ncases[i] | nn[i], pp_hat[i]);
    }
  }
}
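One idea I have seen for letting the imputations P enter the likelihood, rather than each weight set adding its own term, is to average the per-imputation likelihoods on the log scale: log((1/P) * sum_p exp(ll_p)) = log_sum_exp(ll_p) - log(P). A small Python sketch of that log-mean-exp pooling (the log-likelihood values are made up; this is a numerical illustration, not my model):

```python
import math

# Log-mean-exp: numerically stable log of the mean of exponentiated values,
# i.e. the log of the likelihood averaged over the P imputations.

def log_mean_exp(log_liks):
    """Stable computation of log( mean( exp(log_liks) ) )."""
    m = max(log_liks)  # subtract the max before exponentiating for stability
    return m + math.log(sum(math.exp(x - m) for x in log_liks)) - math.log(len(log_liks))

# hypothetical per-imputation log-likelihoods for a single observation
per_imputation_ll = [-3.2, -2.9, -3.5]
mixed = log_mean_exp(per_imputation_ll)
```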
Thanks in advance.