Research Associate, University of Cincinnati, Ohio, United States
Abstract Information: Designing studies that can both test whether a program works and examine the mechanisms underlying the program theory has become a prominent and critical aim of teacher development studies and, more generally, of educational research. To date, however, there is little guidance for designing studies to detect mediation effects. Our work addresses that gap by developing an initial set of empirical values for a range of mediators and outcomes that are central to the study of teacher development and to most educational interventions. With these empirical catalogues, we hope to anchor the design of multilevel mediation studies in empirically plausible values and to encourage other researchers to add to and expand this literature.
Relevance Statement: Teacher professional development is regarded as one of the principal pathways through which we can understand and cultivate effective teaching and improve student outcomes. A critical component of studies that seek to improve teaching through professional development is the detailed assessment of the intermediate teacher development processes that scaffold program content through three key types of outcomes: teacher knowledge, instruction, and student learning. Designs that support multilevel mediation analyses to probe and connect these processes and outcomes are therefore an important consideration in these studies. Despite recent shifts in research and funding priorities that emphasize the value of carefully designed teacher development studies, research in this area has lagged well behind research on student outcomes, both methodologically and empirically. Methodologically, although there is ample literature on the design of school-randomized studies for main and moderator effects, there is much less literature to guide the design of professional development studies that seek to critically examine the coordinated system of relationships underpinning teacher development as it relates to student outcomes (e.g., Dong, Kelcey, & Spybrook, 2018; Raudenbush, 1997). Empirically, although recent literature is replete with catalogues of plausible values for designing studies with student outcomes, literature on values for designing studies that target mediation, or how teacher development processes unfold in ways that influence student outcomes, is sparse. Surveys of the empirical literature have echoed these gaps; for example, one survey reported that only 9 of about 1,300 teacher development studies were identified as using appropriate designs (Yoon et al., 2007).
In this study, we aim to furnish researchers with plausible values that can help in designing studies focused on revealing the indirect effects of programs on student outcomes that operate through teacher development. Our three primary research questions are:
1) What are plausible values of the variance decompositions of both student outcomes and teacher mediators across students, teachers, and schools? The precision of mediation effect estimates and the statistical power of school-randomized designs depend fundamentally on how the variance of both the outcome and the mediator is distributed across levels. We therefore provide concurrent empirical estimates of the variance decompositions of student outcomes and teacher mediators across students, teachers, and schools.
2) To what extent do covariates (e.g., pretest, teacher background, demographics) explain variability in the teacher mediators and student outcomes? An important conclusion from previous studies of design strategies is that adjusting for differences on key covariates can substantially improve the power to detect effects if they exist (Raudenbush, 1997).
3) To what extent is the scale required by multilevel mediation studies comparable to the scale at which most professional development studies are conducted? Prior literature has shown that multilevel designs often demand large sample sizes to achieve a desired level of power. We examined the feasibility of the proposed designs given the resulting empirical values.
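The design parameters named in research questions 1 and 2 can be illustrated with a minimal sketch. It assumes an unconditional three-level model (students within teachers within schools) whose variance components have already been estimated, and it treats the proportion of variance explained by covariates as the relative reduction in a variance component after adjustment. The function names and all numeric values are hypothetical illustrations, not estimates from this study.

```python
def variance_shares(student_var, teacher_var, school_var):
    """Return the share of total variance located at each level.

    The teacher and school shares are the intraclass correlations
    commonly used as design parameters in multilevel power analyses.
    """
    components = {
        "student": student_var,
        "teacher": teacher_var,
        "school": school_var,
    }
    if any(value < 0 for value in components.values()):
        raise ValueError("variance components must be non-negative")
    total = sum(components.values())
    if total == 0:
        raise ValueError("total variance must be positive")
    return {level: value / total for level, value in components.items()}


def proportion_explained(unconditional_var, conditional_var):
    """Return the proportion of a variance component explained by covariates.

    Computed as the relative reduction from the unconditional model to the
    covariate-adjusted model at a single level.
    """
    if unconditional_var <= 0:
        raise ValueError("unconditional variance must be positive")
    return 1.0 - conditional_var / unconditional_var


# Hypothetical variance components for a student outcome (assumed values).
outcome_shares = variance_shares(student_var=0.70, teacher_var=0.10, school_var=0.20)

# Hypothetical school-level variance before and after adding a pretest covariate.
school_r2 = proportion_explained(unconditional_var=0.20, conditional_var=0.05)
```

Under these assumed inputs, the school share of variance and the school-level proportion explained are the kinds of values an empirical catalogue would report for use in power calculations; a two-level version (teachers within schools) would apply analogously to a teacher mediator.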