Summer Institute in Advanced Research Methods for Science, Technology, Engineering, and Mathematics Education Research

Multilevel modeling

Instructor: Steve Raudenbush

Many studies in STEM education are longitudinal, multilevel, or both. Examples include studies of school and classroom effects on math and science learning. In longitudinal studies, we repeatedly observe participants to assess growth in knowledge, skill, and credit accumulation. Multilevel data arise because students are clustered within social settings such as classrooms, schools, and school districts.

Part 1 of this course considers methods for describing multilevel data. Part 2 takes up the question of causal inference in multilevel studies. We will begin with two-level experimental designs: the cluster randomized trial (randomization at level 2) and the multi-site randomized trial (randomization at level 1). Next, we will consider a family of three-level experiments, including multi-site cluster randomized trials. In all cases, we ask how to analyze the data and how to plan these studies to have adequate power. We will see that some widely used methods of analysis are not well suited to multilevel experiments and that special care is required in applying hierarchical linear models, or “HLM” (Raudenbush & Bloom, 2015; Raudenbush & Schwartz, 2020).

We will then consider causal inference in non-randomized studies and the problem of non-compliance in randomized experiments by extending methods that are now standard in single-level settings but novel in multilevel settings. We will apply propensity-score matching to approximate group-randomized and multi-site trials when, in fact, treatment assignment is non-random. This approach could be applied, for example, to the stylized case in which schools and teachers choose to adopt a new science curriculum. In longitudinal settings, a key but often overlooked challenge is time-varying confounding: for example, a student’s past experience with an innovative science curriculum and his or her subsequent performance in science may influence the probability of receiving an innovative science curriculum in future years. Understanding this dynamic process is key to education research. We will show how weighting methods applied within a hierarchical data structure can remove observed time-varying confounding and will compare this approach to more conventional approaches using time and/or person fixed effects (Hong & Raudenbush, 2008).

This course will be conducted in a mixed lecture and discussion format. Lab sessions will enable fellows to gain skill in analyzing hierarchical data, with a focus on recent experiments, quasi-experiments, and surveys concerning mathematics learning in school settings.
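
As a rough illustration of the planning question raised above for the two-level cluster randomized trial, the sketch below computes the design effect and the standard error of a standardized treatment effect. It is a minimal sketch, not course material: it assumes a balanced design (half of the clusters treated), equal cluster sizes, an outcome standardized to unit total variance, and no covariates. The function and parameter names (design_effect, crt_standard_error, n_clusters, n_per_cluster, icc) are hypothetical and chosen only for this example.

```python
import math


def design_effect(n_per_cluster: int, icc: float) -> float:
    """Variance inflation from clustering: 1 + (n - 1) * rho."""
    return 1.0 + (n_per_cluster - 1) * icc


def crt_standard_error(n_clusters: int, n_per_cluster: int, icc: float) -> float:
    """Standard error of the standardized effect in a balanced two-level
    cluster randomized trial (assumed: unit total variance, no covariates).

    Var = 4 * (rho + (1 - rho) / n) / J, where J is the total number of
    clusters, n the number of students per cluster, rho the intraclass
    correlation.
    """
    variance = 4.0 * (icc + (1.0 - icc) / n_per_cluster) / n_clusters
    return math.sqrt(variance)


if __name__ == "__main__":
    # Hypothetical planning scenario: 40 schools, 20 students each, ICC = 0.10.
    print(design_effect(20, 0.10))
    print(crt_standard_error(40, 20, 0.10))
```

Under these assumptions, setting the intraclass correlation to zero recovers the familiar single-level standard error of 2 / sqrt(N) for N total students, which shows how clustering, rather than total sample size alone, drives precision in such designs.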

REFERENCES:

Hong, G., & Raudenbush, S. W. (2008). Causal inference for time-varying instructional treatments. Journal of Educational and Behavioral Statistics, 33(3), 333–362.

Raudenbush, S. W., & Bloom, H. S. (2015). Learning about and from a distribution of program impacts using multisite trials. American Journal of Evaluation, 36(4), 475–499.

Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical Linear Models: Applications and Data Analysis Methods (2nd ed.). Thousand Oaks: Sage Publications.

Raudenbush, S. W., & Schwartz, D. (2020). Randomized experiments in education, with implications for multilevel causal inference. Annual Review of Statistics and Its Application, 7(1), 177–208.