@Clarke2010

The Choice Between Fixed and Random Effects Models: Some Considerations for Educational Research

(2010) - Paul Clarke, Claire Crawford, Fiona Steele, Anna F. Vignoles

Journal: SSRN Electronic Journal
Link:: https://www.ssrn.com/abstract=1700456
DOI:: 10.2139/ssrn.1700456
Links::
Tags:: #paper #Methods #FixedEffects #RandomEffects
Cite Key:: [@Clarke2010]

Abstract

We discuss fixed and random effects models in the context of educational research and set out the assumptions behind the two approaches. To illustrate the issues, we analyse the determinants of pupil achievement in primary school, using data from the Avon Longitudinal Study of Parents and Children. We conclude that a fixed effects approach will be preferable in scenarios where the primary interest is in policy-relevant inference of the effects of individual characteristics, but the process through which pupils are selected into schools is poorly understood or the data are too limited to adjust for the effects of selection. In this context, the robustness of the fixed effects approach to the random effects assumption is attractive, and educational researchers should consider using it, even if only to assess the robustness of estimates obtained from random effects models. When the selection mechanism is fairly well understood and the researcher has access to rich data, the random effects model should be preferred because it can produce policy-relevant estimates while allowing a wider range of research questions to be addressed. Moreover, random effects estimators of regression coefficients and shrinkage estimators of school effects are more statistically efficient than those for fixed effects.

Notes

“We conclude that a fixed effects approach will be preferable in scenarios where the primary interest is in policy-relevant inference of the effects of individual characteristics, but the process through which pupils are selected into schools is poorly understood or the data are too limited to adjust for the effects of selection” (Clarke et al., 2010, p. 1)

“When the selection mechanism is fairly well understood and the researcher has access to rich data, the random effects model should be preferred because it can produce policy-relevant estimates while allowing a wider range of research questions to be addressed.” (Clarke et al., 2010, p. 1)

“Economists, for example, are more likely to focus on the impact of personal and family characteristics on achievement (Todd and Wolpin, 2003), and hence tend to use fixed effect models” (Clarke et al., 2010, p. 2)

“important focus for education researchers is on the role of schools (Townsend, 2007), which is best studied using random effects models because fixed effect approaches do not allow school characteristics to be modelled.” (Clarke et al., 2010, p. 2)

“provided they have some understanding of the process through which pupils are selected into schools, and that sufficiently rich data are available to control for the important factors, then the natural approach should be to use random effects because: a) school (level 2) characteristics can be modelled – allowing questions concerning differential school effectiveness for different types of pupils using random coefficients to be addressed – and b) precision-weighted ‘shrinkage’ estimates of the school effects can be used” (Clarke et al., 2010, p. 3)

“achievements of pupils in the same school are likely to be clustered due to the influence of unmeasured school characteristics like school leadership” (Clarke et al., 2010, p. 4)

“allow for such school effects on achievement is to fit a hierarchical regression model with a two-level nested structure in which pupils at level 1 are grouped within schools at level 2.” (Clarke et al., 2010, p. 4)

“$y_{ij} = \beta_0 + \beta_1 x_{1ij} + \dots + \beta_p x_{pij} + u_j + e_{ij}$” (Clarke et al., 2010, p. 4)

“The school effects (also referred to as school residuals) are usually assumed to be drawn from a normal distribution: $u_j \sim \text{i.i.d. } N(0, \sigma_u^2)$” (Clarke et al., 2010, p. 5)

“$\sigma_u^2$ (rather than the assumption of normality) is required to make inferences about the regression coefficients” (Clarke et al., 2010, p. 5)

“advantage of the random effect model is thus” (Clarke et al., 2010, p. 5)

“that the total residual variance can be partitioned into two components: the between-school variance $\sigma_u^2$ and the within-school (between-pupil) variance $\sigma_e^2$” (Clarke et al., 2010, p. 6)
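My own addition, not quoted from the paper: a common way to summarise this partition in the multilevel literature is the intraclass correlation (variance partition coefficient), the share of residual variance that lies between schools. The symbol $\rho$ is my notation.

$$
\rho = \frac{\sigma_u^2}{\sigma_u^2 + \sigma_e^2}
$$

For example, with $\sigma_u^2 = 0.2$ and $\sigma_e^2 = 0.8$, $\rho = 0.2$, i.e. 20% of the residual variance is between schools.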

“two major advantages over fixed effect models: 1) the possibility of estimating shrunken residuals; 2) the possibility of accounting for differential school effectiveness through the use of random coefficients models.” (Clarke et al., 2010, p. 6)
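My own sketch, not taken from the paper: the textbook form of a shrunken school residual multiplies the raw mean residual for school j by a reliability weight that depends on the school's size and the two variance components. The function name and arguments are hypothetical, and the variance components are treated as known rather than estimated.

```python
def shrunken_residual(raw_mean_residual, n_pupils, sigma_u2, sigma_e2):
    """Precision-weighted (shrunken) estimate of a school effect u_j.

    raw_mean_residual: mean of (y_ij - fitted fixed part) over pupils in school j
    n_pupils:          number of pupils observed in school j
    sigma_u2:          between-school variance (assumed known here)
    sigma_e2:          within-school variance (assumed known here)
    """
    # The weight lies between 0 and 1 and approaches 1 as n_pupils grows.
    weight = sigma_u2 / (sigma_u2 + sigma_e2 / n_pupils)
    return weight * raw_mean_residual


# A small school is pulled halfway towards zero under these values.
small = shrunken_residual(0.5, n_pupils=4, sigma_u2=1.0, sigma_e2=4.0)    # 0.25
# A large school with the same raw residual is shrunk much less.
large = shrunken_residual(0.5, n_pupils=100, sigma_u2=1.0, sigma_e2=4.0)
```

The weight makes the "precision-weighted" wording concrete: schools with few pupils give noisy raw residuals, so their estimates are pulled more strongly towards the overall mean of zero.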

“whilst random effects models open up a range of interesting avenues for research and are more efficient (generate narrower confidence intervals) than fixed effects models, their use comes at the cost of an important additional assumption” (Clarke et al., 2010, p. 7)

“This requirement mirrors the earlier regression assumption (for $e_{ij}$) and further states that $u_j$ must be uncorrelated with the pupil, family and school characteristics represented by the covariates $x_{ij}$ ($\operatorname{cov}(u_j, x_{kij}) = 0$). This is often referred to as the random effects assumption and is of great concern to economists.” (Clarke et al., 2010, p. 7)

“this assumption implies that unobserved characteristics of the school $u_j$ that influence achievement, such as ethos or teacher quality, are not correlated with pupil, family or school characteristics that are included in the model, such as whether a pupil is eligible for free school meals or has special educational needs. We return to this issue again below.” (Clarke et al., 2010, p. 7)

“Causal inferences tell us what would happen if we intervened and changed how things are currently done” (Clarke et al., 2010, p. 11)

“If each school had drawn its pupils at random from the pupil population, then the random effects assumption would hold.7 In reality, however, a non-random selection mechanism operates through which parents choose schools and some schools select which children to accept. Thus, the probability of selecting a particular school varies systematically according to a series of factors characterising the child, his/her family, the school itself, and higher levels like local education authority.” (Clarke et al., 2010, p. 12)

“Now consider a simple hierarchical regression model with treatment as the only covariate:” (Clarke et al., 2010, p. 12)

“τ is the average effect of the treatment. In our subsequent examples, treatment corresponds to indicators for whether the pupil is eligible for free school meals or has special education needs” (Clarke et al., 2010, p. 12)
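My own reconstruction, not copied from the paper: the highlighted passage did not capture the equation itself, so the following is only a plausible form of a two-level model with treatment as the sole covariate, written in the notation used elsewhere in these notes. The treatment indicator symbol $z_{ij}$ is an assumption on my part.

$$
y_{ij} = \beta_0 + \tau z_{ij} + u_j + e_{ij}
$$

Here $z_{ij} = 1$ if pupil $i$ in school $j$ is "treated" (for example, FSM-eligible) and 0 otherwise, so $\tau$ is the average difference in achievement associated with treatment. The random effects assumption would then require $\operatorname{cov}(u_j, z_{ij}) = 0$.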

“school F is not empty.” (Clarke et al., 2010, p. 13)

“Of course, unless the selection mechanism is perfectly understood and the study contains reliable measures of every factor comprising school F then the random effects assumption will not hold” (Clarke et al., 2010, p. 13)

“perfectly. However, this is not necessarily fatal for the purpose of producing policy-relevant inferences.” (Clarke et al., 2010, p. 14)

“Burgess et al. (2009) study the reasons that parents in England and Wales choose schools for their children. They identify the dominant reasons as: a) proximity/ease of travel to school; b) whether a sibling attends the same school; c) whether wider family members or friends attend the same school; d) school reputation; and e) pre-school and childcare facilities offered by the school.” (Clarke et al., 2010, p. 14)

“The second illustrative example we use is an analysis of the impact of eligibility for free school meals (FSM) status – an indicator of very low family income12 – on educational achievement.” (Clarke et al., 2010, p. 17)

“much of the academic evidence base on the relationship between family background and education achievement relies on this indicator (Chowdry et al. 2008; Sammons et al., 1997; Strand, 1999).” (Clarke et al., 2010, p. 18)

“The “treatment” effect we focus on, therefore, is a comparison between pupils’ attainments if they had low socio-economic status, as indicated by FSM eligibility, compared to their attainments under higher socio-economic status.” (Clarke et al., 2010, p. 18)

“FSM status may be correlated with other unobserved pupil or family characteristics, such as parental aspirations, that also determine their achievement” (Clarke et al., 2010, p. 18)

“If we find differences between the random and fixed effects estimates of the effects of SEN and FSM on attainment then it would suggest that unmeasured but important school influences on progress are correlated with SEN/FSM, in which case the random effects assumption that $u_j$ is uncorrelated with SEN/FSM is invalid.” (Clarke et al., 2010, p. 21)
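My own illustration of this comparison, not the authors' analysis: a minimal simulation, assuming a single continuous covariate, simulated data, and known variance components (in practice these would be estimated). All function names are hypothetical. The parameter `gamma` controls how strongly the school effect is correlated with the covariate, so `gamma = 0` satisfies the random effects assumption and `gamma = 1` violates it.

```python
import numpy as np


def simulate(n_schools=1000, n_pupils=5, beta=1.0, gamma=1.0, seed=0):
    """Two-level data: y_ij = beta * x_ij + u_j + e_ij, with cov(u_j, x_ij) = gamma."""
    rng = np.random.default_rng(seed)
    school = np.repeat(np.arange(n_schools), n_pupils)
    u = rng.normal(0.0, 1.0, n_schools)                # school effects, sigma_u^2 = 1
    x = gamma * u[school] + rng.normal(0.0, 1.0, school.size)
    y = beta * x + u[school] + rng.normal(0.0, 1.0, school.size)  # sigma_e^2 = 1
    return school, x, y


def group_means(values, school):
    """School mean of `values`, repeated for each pupil in the school."""
    sums = np.bincount(school, weights=values)
    counts = np.bincount(school)
    return (sums / counts)[school]


def fe_slope(school, x, y):
    """Fixed effects (within) estimator: regress school-demeaned y on demeaned x."""
    xd = x - group_means(x, school)
    yd = y - group_means(y, school)
    return float(xd @ yd / (xd @ xd))


def re_slope(school, x, y, sigma_u2=1.0, sigma_e2=1.0):
    """Random effects GLS via quasi-demeaning, with variance components assumed known."""
    n_j = np.bincount(school)[school]
    theta = 1.0 - np.sqrt(sigma_e2 / (sigma_e2 + n_j * sigma_u2))
    xs = x - theta * group_means(x, school)
    ys = y - theta * group_means(y, school)
    design = np.column_stack([1.0 - theta, xs])        # transformed intercept, slope
    coef, *_ = np.linalg.lstsq(design, ys, rcond=None)
    return float(coef[1])


school, x, y = simulate(gamma=1.0)   # random effects assumption violated
print("FE:", fe_slope(school, x, y), "RE:", re_slope(school, x, y))

school, x, y = simulate(gamma=0.0)   # random effects assumption holds
print("FE:", fe_slope(school, x, y), "RE:", re_slope(school, x, y))
```

With `gamma = 1.0` the FE slope should sit close to the true value of 1 while the RE slope drifts upward, which is the kind of gap the quote treats as evidence against the random effects assumption. With `gamma = 0.0` the two estimates should roughly agree. This is a sketch of the logic only; a formal comparison would use estimated variance components and a test such as Hausman's.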

“In summary, our findings suggest that both the regression and random effects assumptions are likely to hold in this example. This suggests that we should prefer the random effects model here” (Clarke et al., 2010, p. 23)

“Even after we have added in the richest set of family background controls, however, there remains a negative association between eligibility for free school meals and academic progress, with FSM pupils scoring around between 0.09 (FE) and 0.1 (RE) standard deviations lower at Key Stage 2 than pupils who are not eligible for FSM (with similar prior attainment). This suggests that FSM has some value as an indicator of current economic hardship, over and above detailed information about family income, housing tenure and so on during early childhood” (Clarke et al., 2010, p. 24)

“In summary, we find that without the inclusion of school characteristics (particularly the school-level proportion of students eligible for FSM) in our model, the random effects assumption is likely to fail in this case, perhaps leading us to favour a fixed effects approach here.” (Clarke et al., 2010, p. 25)

“we also find that without very rich data on individual characteristics, the regression assumption is likely to fail, such that neither fixed nor random effects models will produce unbiased, policy-relevant estimates of the relationship between FSM eligibility and attainment, particularly when using sparse administrative data.” (Clarke et al., 2010, p. 25)