Struggles with Survey Weighting and Regression Modeling

Key takeaways


Bibliography: Gelman, A., 2007. Struggles with Survey Weighting and Regression Modeling. Statist. Sci. 22. https://doi.org/10.1214/088342306000000691

Authors:: Andrew Gelman

Collections:: PhD

First-page:

Abstract

The general principles of Bayesian data analysis imply that models for survey responses should be constructed conditional on all variables that affect the probability of inclusion and nonresponse, which are also the variables used in survey weighting and clustering. However, such models can quickly become very complicated, with potentially thousands of poststratification cells. It is then a challenge to develop general families of multilevel probability models that yield reasonable Bayesian inferences. We discuss in the context of several ongoing public health and social surveys. This work is currently open-ended, and we conclude with thoughts on how research could proceed to solve these problems.

Citations

content: "@gelmanStrugglesSurveyWeighting2007" -file:@gelmanStrugglesSurveyWeighting2007

Reading notes

"Survey weighting is a mess. It is not always clear how to use weights in estimating anything more complicated than a simple mean or ratios, and standard errors are tricky even with simple weighted means. (Software packages such as Stata and SUDAAN perform analysis of weighted survey data, but it is not always clear which, if any, of the available procedures are appropriate for complex adjustment schemes. In addition, the construction of weights is itself an uncodified process.)"
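
Note to self: a minimal sketch of why standard errors are tricky even for a simple weighted mean. This is my own illustration, not a method from the paper. The function names, the toy data, and the weights are all hypothetical. The standard error uses the usual Taylor-linearization approximation for a ratio estimator and assumes single-stage sampling with replacement, with no strata or clusters, and treats the weights as fixed and known. Those assumptions rarely hold for real survey weights, which is part of the point of the quote.

```python
import math


def weighted_mean(y, w):
    """Weighted mean: sum(w_i * y_i) / sum(w_i)."""
    return sum(wi * yi for wi, yi in zip(w, y)) / sum(w)


def kish_effective_n(w):
    """Kish's approximate effective sample size: (sum w)^2 / sum(w^2)."""
    return sum(w) ** 2 / sum(wi * wi for wi in w)


def linearized_se(y, w):
    """Linearization standard error of the weighted mean.

    Assumes single-stage with-replacement sampling and no strata or
    clusters (an assumption of this sketch, not a general recipe).
    """
    n = len(y)
    ybar = weighted_mean(y, w)
    # Sum of squared weighted residuals around the weighted mean.
    ss = sum((wi * (yi - ybar)) ** 2 for wi, yi in zip(w, y))
    return math.sqrt(n / (n - 1) * ss) / sum(w)


# Hypothetical toy data: a binary response and unequal weights.
y = [1.0, 0.0, 1.0, 1.0, 0.0, 0.0]
w = [1.0, 1.0, 2.0, 2.0, 4.0, 4.0]

print("unweighted mean:", sum(y) / len(y))
print("weighted mean:  ", weighted_mean(y, w))
print("effective n:    ", kish_effective_n(w))
print("linearized SE:  ", linearized_se(y, w))
print("equal-weight SE:", linearized_se(y, [1.0] * len(y)))
```

With equal weights the linearized formula reduces to the familiar s / sqrt(n), and the effective sample size equals n. Unequal weights shrink the effective sample size, so a standard error computed as if the data were a simple random sample is no longer appropriate.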