Using Quasi-variance to Communicate Sociological Results from Statistical Models
Key takeaways
Bibliography: Gayle, V., Lambert, P.S., 2007. Using Quasi-variance to Communicate Sociological Results from Statistical Models. Sociology 41, 1191–1208. https://doi.org/10.1177/0038038507084830
Authors:: Vernon Gayle, Paul S. Lambert
Tags: #Methods, #Statistical-methods, #Reference-Category-Problem, #Quasi-variance
Collections:: To Read, Methods
First-page: 2
This article introduces a sociological audience to 'quasi-variances' as a solution to the 'reference category problem'. The reference category problem is associated with the interpretation of the effects of categorical explanatory variables within statistical models, and is especially relevant to sociological applications, where categorical explanatory variables are very common. This article presents a selection of examples (using multiple and logistic regression) to illustrate and comment on quasi-variance calculations for sociological models. In addition, the article is augmented with online materials provided by the authors, which aim to help social researchers practise and apply this technique using the popular data analysis software packages SPSS and Stata. The authors conclude that quasi-variance methods offer an attractive and practicable solution to the reference category problem that can, and should, be routinely operationalized by sociological researchers.
content: "@gayleUsingQuasivarianceCommunicate2007" -file:@gayleUsingQuasivarianceCommunicate2007
Reading notes
Imported on 2024-05-07 21:16
⭐ Important
- & This paper describes Firth’s idea of ‘quasi-variance’ as a solution to the ‘reference category problem’ in presenting outputs from statistical models. (p. 2)
- & Statistical models offer an attractive way for sociological researchers to summarise patterns from social survey datasets (Dale and Davies 1994; Goldthorpe 2000). (p. 3)
- & the important point is that the communication of results from statistical models hinges upon which aspects of the modelling process the analyst chooses, rightly or wrongly, to emphasise (Berk 2004). (p. 3)
- & The benchmark category is usually referred to as the ‘reference’ or ‘base’ category. The benchmark effect is arbitrarily fixed to zero, and other category effects are interpreted as the additional impact of not being (p. 3)
- & in the benchmark category. Standard statistical software undertakes formal comparisons of whether or not each category effect differs from the benchmark effect. These comparisons generate the well known ‘significance values’ of parameter (coefficient) estimates. The reference category problem is easily stated. Whilst it is straightforward to compare any one category with the reference (or base) category, it is more difficult to formally compare two other categories (or levels) of the explanatory variable with each other when neither is the base category. (p. 4)
- & As we shall describe, Firth’s papers (2000, 2003) illustrate how ‘quasi-variance’ statistics can be reported along with standard outputs from statistical models in order to enable such calculations. (p. 4)
- & This calculation then allows us to derive the t-statistic: t = -.03 / 0.0083 = -3.2. Using conventional statistical criteria, if the t value is greater than ± 1.96, we can reject the null hypothesis and conclude that the estimate for the North West is significantly different to Yorkshire and Humberside (p < .05). (p. 9)
- & should be taken a step further to generate a Wald chi-square statistic (equal to t²), which is then evaluated at 1 degree of freedom: Wald χ² = (-.03 / 0.0083)² = 10.22; p = .0014. The value of χ² is significant and we can formally conclude that these two regions are different with regard to self-rated good health. (p. 9)
- & Firth’s (2003) recommendation, that analysts routinely display ‘quasi-variance’ statistics for all multiple category explanatory variables, offers a neat and practical solution to this impasse (p. 9)
- & In essence Firth’s method (2000, 2003) uses an approximation in order to allow for an easier calculation of the test statistic for the difference between two categories. A single approximation statistic, known as the quasi-variance, may be calculated for each category of a categorical explanatory variable (including the reference category). The important outcome is that this statistic may be used to generate a more simplified equation for approximating the standard error of the difference between two beta estimates as used in equation (1). The new calculation for equation (2) becomes: $\text{s.e. difference} \approx \sqrt{\text{quasi-variance}_1 + \text{quasi-variance}_2}$ (3) (p. 12)
- & As described above, quasi-variance statistics are approximations that allow us to undertake comparison tests without requiring complex data from the variance-covariance matrix of parameter estimates. The accuracy of these approximations is therefore a question of concern. Firth and Menezes (2004) explore this accuracy in some detail (see also Menezes 1999). Firstly, for the specific case of a multiple category explanatory variable with three categories, it is reported that quasi-variance (p. 25)
- & 1 are (necessarily) exactly accurate. (p. 26)
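
The calculations highlighted above (the comparison test on p. 9, the quasi-variance shortcut on p. 12, and the three-category case on pp. 25-26) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' SPSS or Stata materials: the function names and the example variance-covariance matrix are hypothetical, and the three-category solution assumes that each pair of quasi-variances is required to sum exactly to the variance of the corresponding contrast, which gives three equations in three unknowns.

```python
import math


def chi2_p_value_1df(chi2):
    """Upper-tail p-value of a chi-square statistic at 1 degree of freedom."""
    # For 1 df, P(X > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


def wald_test(difference, se_difference):
    """Return (t, Wald chi-square, p) for a difference between two estimates."""
    t = difference / se_difference
    chi2 = t ** 2
    return t, chi2, chi2_p_value_1df(chi2)


def se_from_quasi_variances(qv_a, qv_b):
    """Approximate s.e. of a difference from two quasi-variances."""
    return math.sqrt(qv_a + qv_b)


def quasi_variances_three_categories(cov):
    """Quasi-variances for a three-category variable (assumed exact case).

    cov is the 3x3 variance-covariance matrix of the category effects,
    with zeros in the row and column of the reference category.
    """
    def contrast_var(i, j):
        # Variance of the difference between the effects of categories i and j.
        return cov[i][i] + cov[j][j] - 2 * cov[i][j]

    v01, v02, v12 = contrast_var(0, 1), contrast_var(0, 2), contrast_var(1, 2)
    # Solve q_i + q_j = v_ij for the three pairs.
    q0 = (v01 + v02 - v12) / 2
    q1 = (v01 + v12 - v02) / 2
    q2 = (v02 + v12 - v01) / 2
    return [q0, q1, q2]


# Hypothetical example: category 0 is the reference category.
example_cov = [
    [0.00, 0.00, 0.00],
    [0.00, 0.04, 0.01],
    [0.00, 0.01, 0.09],
]
qv = quasi_variances_three_categories(example_cov)

# Compare categories 1 and 2 using only their quasi-variances.
se_12 = se_from_quasi_variances(qv[1], qv[2])
t, chi2, p = wald_test(-0.33, se_12)  # -0.33 is a made-up difference in estimates
print(qv, se_12, t, chi2, p)
```

In this sketch the standard error recovered from the two quasi-variances equals the one computed from the full matrix, in line with the note that the three-category case is exact. Passing the reported Wald statistic of 10.22 to `chi2_p_value_1df` gives a p-value of about .0014, as in the highlight from p. 9.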