# Tagged: R-project

# Multiple imputation: special issue of Journal of Statistical Software

This issue looks good for folk wanting to do multiple imputation. Some goodies for useRs in particular:

- Multiple Imputation with Diagnostics (mi) in R: Opening Windows into the Black Box by Yu-Sung Su, Andrew Gelman, Jennifer Hill, Masanao Yajima — I gave this one a try. Has a useful suite of functions for fitting linear regression, GLMs, and GLMMs (using lme4) to the imputed data, pooling the estimates.
- mice: Multivariate Imputation by Chained Equations in R by Stef van Buuren, Karin Groothuis-Oudshoorn
- Amelia II: A Program for Missing Data by James Honaker, Gary King, Matthew Blackwell

# Computing standard errors (SEs) and confidence intervals for BLUPs in lmer/lme4

Useful email exchange over here. Key bits—something like this will work:

rfg <- ranef(my.lmer, postVar=TRUE) my.se = sqrt(as.numeric(attributes(rfg$group)$postVar))

But note Dimitris Rizopoulos’s remark:

“… maybe one thing that I think needs to be kept in mind is that the posterior variances that ranef(…, postVar = TRUE) returns condition on the MLEs and do not take their variability into account.”

Douglas Bates notes over here:

“… I prefer to call the quantities that are returned by the ranef extractor ‘the conditional modes of the random effects’. If you want to be precise, these are the conditional modes (for a linear mixed model they are also the conditional means) of the random effects

BgivenY=y, evaluated at the parameter estimates. One can also evaluate the condtional variance-covariance ofBgivenY=yand hence obtain a prediction interval. These are returned by ranef when the optional argument ‘postVar’ is TRUE…“BTW, the reason that I say ‘conditional modes’, rather than ‘conditional means’, is so the term can apply to generalized linear mixed models, nonlinear mixed models and generalized nonlinear mixed models. The conditional distribution of

BgivenYis a continuous multivariate distribution that can be evaluated explicitly for a linear mixed model but is known only up to a constant for the generalized forms of the model. We can still optimize the conditional density ofBgivenY=yfor particular values of the parameters but doing so provides the conditional modes, not the conditional means.

Goldstein (see e.g., his book *Multilevel Statistical Models*) calculates that multiplying by 1.39 rather than 1.96 allows pairwise comparison of these estimates.

Also the arm package has a function se.ranef. Code for that below:

function (object) { se.bygroup <- ranef(object, postVar = TRUE) n.groupings <- length(se.bygroup) for (m in 1:n.groupings) { vars.m <- attr(se.bygroup[[m]], "postVar") K <- dim(vars.m)[1] J <- dim(vars.m)[3] se.bygroup[[m]] <- array(NA, c(J, K)) for (j in 1:J) { se.bygroup[[m]][j, ] <- sqrt(diag(as.matrix(vars.m[, , j]))) } names.full <- dimnames(se.bygroup) dimnames(se.bygroup[[m]]) <- list(names.full[[1]], names.full[[2]]) } return(se.bygroup) }

# Useful commands in R

search()

gives the attached packages.

detach("package:whateveritscalled")

removes one.