Variable Selection Method for SAE Modeling
In small area estimation (SAE) modeling, it is crucial to have consistent covariates that are most relevant to the outcome variable(s) of interest. Ideally, variable selection is parsimonious: it identifies a minimal set of covariates with maximum predictive power.
Westat statisticians Weijia Ren, Ph.D., Jianzhu Li, Ph.D., Andreea Erciulescu, Ph.D., Tom Krenzke, M.S., and Leyla Mohadjer, Ph.D., have now published a new article in Stats that addresses this need.
Westat developed a 2-phase variable selection method for SAE modeling of proficiency measures of adult competencies collected in the first cycle of the Program for the International Assessment of Adult Competencies (PIAAC).
Phase 1 identifies a small set of variables that are consistently highly correlated with the outcomes, using methods such as correlation matrices and multivariate least absolute shrinkage and selection operator (LASSO) analysis. Phase 2 uses a k-fold cross-validation process to choose, from that set, the variables to be used in the final SAE models.
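To make the 2-phase idea concrete, the following is a minimal sketch on synthetic data, not the procedure from the article. It rests on several assumptions: Phase 1 is reduced to a simple correlation screen (the LASSO step is omitted), Phase 2 is implemented as forward selection scored by k-fold cross-validated mean squared error of an ordinary least squares fit, and all function names, the data, and the settings (top_k=5, k=5) are hypothetical choices made for illustration.

```python
import numpy as np

def correlation_screen(X, y, top_k):
    """Phase 1 (simplified): keep the top_k covariates by |Pearson r| with y."""
    r = np.array([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(X.shape[1])])
    return [int(j) for j in np.argsort(-r)[:top_k]]

def cv_mse(X, y, cols, folds):
    """k-fold cross-validated MSE of an OLS fit (with intercept) on `cols`."""
    cols = np.asarray(cols, dtype=int)
    n = len(y)
    errors = []
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(n), test_idx)
        A_train = np.column_stack([np.ones(len(train_idx)), X[train_idx][:, cols]])
        A_test = np.column_stack([np.ones(len(test_idx)), X[test_idx][:, cols]])
        beta, *_ = np.linalg.lstsq(A_train, y[train_idx], rcond=None)
        errors.append(np.mean((y[test_idx] - A_test @ beta) ** 2))
    return float(np.mean(errors))

def forward_select(X, y, candidates, folds):
    """Phase 2 (assumed form): add one covariate at a time while CV error drops."""
    selected = []
    best = cv_mse(X, y, selected, folds)  # intercept-only baseline
    remaining = list(candidates)
    while remaining:
        scores = {j: cv_mse(X, y, selected + [j], folds) for j in remaining}
        j_best = min(scores, key=scores.get)
        if scores[j_best] >= best:
            break  # no candidate improves the cross-validated error
        selected.append(j_best)
        remaining.remove(j_best)
        best = scores[j_best]
    return selected, best

# Synthetic example: only covariates 0 and 3 actually drive the outcome.
rng = np.random.default_rng(0)
n, p = 200, 10
X = rng.normal(size=(n, p))
y = 2.0 * X[:, 0] - 1.5 * X[:, 3] + rng.normal(scale=0.5, size=n)

candidates = correlation_screen(X, y, top_k=5)   # Phase 1
folds = np.array_split(rng.permutation(n), 5)    # same 5 folds for every model
selected, best_mse = forward_select(X, y, candidates, folds)  # Phase 2

print("Phase 1 candidates:", candidates)
print("Phase 2 selection:", selected)
```

Reusing the same folds for every candidate model keeps the cross-validated errors comparable across covariate sets. In a real SAE application the OLS fit would be replaced by the area-level or unit-level model actually in use, and the screening step would consider all outcomes jointly rather than a single one.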