Struggles with Survey Weighting and Regression Modeling
Key takeaways
Bibliography: Gelman, A., 2007. Struggles with Survey Weighting and Regression Modeling. Statist. Sci. 22. https://doi.org/10.1214/088342306000000691
Authors:: Andrew Gelman
Collections:: PhD
First-page:
The general principles of Bayesian data analysis imply that models for survey responses should be constructed conditional on all variables that affect the probability of inclusion and nonresponse, which are also the variables used in survey weighting and clustering. However, such models can quickly become very complicated, with potentially thousands of poststratification cells. It is then a challenge to develop general families of multilevel probability models that yield reasonable Bayesian inferences. We discuss in the context of several ongoing public health and social surveys. This work is currently open-ended, and we conclude with thoughts on how research could proceed to solve these problems.
content: "@gelmanStrugglesSurveyWeighting2007" -file:@gelmanStrugglesSurveyWeighting2007
Reading notes
"Survey weighting is a mess. It is not always clear how to use weights in estimating anything more complicated than a simple mean or ratios, and standard errors are tricky even with simple weighted means. (Software packages such as Stata and SUDAAN perform analysis of weighted survey data, but it is not always clear which, if any, of the available procedures are appropriate for complex adjustment schemes. In addition, the construction of weights is itself an uncodified process.)"
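
To make the point about standard errors concrete, here is a minimal sketch (my own illustration, not code from the paper) of a weighted mean, Kish's effective sample size, and one common linearization approximation to the standard error. It assumes the weights are fixed known constants and that units were sampled independently with replacement. It ignores clustering, stratification, and the fact that weights are themselves usually estimated, which is exactly where the quote says things get tricky. The function names are hypothetical.

```python
import math


def weighted_mean(y, w):
    """Weighted mean: sum(w_i * y_i) / sum(w_i)."""
    return sum(wi * yi for wi, yi in zip(w, y)) / sum(w)


def kish_effective_n(w):
    """Kish's approximate effective sample size: (sum w)^2 / sum(w^2)."""
    return sum(w) ** 2 / sum(wi * wi for wi in w)


def approx_se_weighted_mean(y, w):
    """Linearization (ratio-estimator) standard error of the weighted mean.

    Assumption: weights are fixed and units are independent draws.
    With equal weights this reduces to the usual s / sqrt(n).
    """
    n = len(y)
    ybar = weighted_mean(y, w)
    total_w = sum(w)
    # Linearized residuals of the ratio sum(w*y) / sum(w).
    z = [wi * (yi - ybar) / total_w for wi, yi in zip(w, y)]
    return math.sqrt(n / (n - 1) * sum(zi * zi for zi in z))


y = [1.0, 2.0, 3.0, 4.0]
w = [1.0, 1.0, 2.0, 2.0]
print(weighted_mean(y, w))
print(kish_effective_n(w))
print(approx_se_weighted_mean(y, w))
```

Unequal weights lower the effective sample size below the nominal n, which is one reason a naive s / sqrt(n) tends to understate uncertainty for weighted estimates.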