Institution/Organization Affiliation: University of Pittsburgh
SERA Research Partner Bio:
Sheila Conway, Ph.D. is an Associate Professor of Practice in the University of Pittsburgh School of Education. Sheila coordinates special education teacher preparation programs and teaches within those programs. She is interested in the experiences of novice special education teachers to inform teacher preparation and induction programming. All four of Sheila’s children attend/ed Pitt and stop by her office regularly.
What made you interested in partnering with SERA?
As clinical faculty, I find it difficult to lead large independent research projects, given my teaching and administrative responsibilities. As a partner on SERA collaborative projects, I can make contributions in a feasible and impactful manner.
Institution/Organization Affiliation: Arkansas State University
SERA Research Partner Bio:
Kimberley Davis, Ph.D., is an Associate Professor of Special Education and Interim Department Chair in the Department of Educational Leadership, Curriculum, and Special Education at Arkansas State University. She has a B.S. in Secondary Education-Social Studies, an M.Ed. in both Special Education and Educational Leadership, and a Ph.D. in Special Education (Mild Moderate Disabilities). Dr. Davis has served in the field of education as a special education teacher, coordinator, consultant, educational diagnostician, and Special Education director (LEA). Her research interests include multi-tiered levels of intervention and support, teacher preparation, inequities in special education, inclusive practices, and culturally responsive teaching.
Dr. Davis is the proud mother of Autumn Grace, a junior middle level education candidate at Arkansas State University, and Aiden Nicholas, a 7th grade student in the Nettleton Public Schools. She enjoys reading, traveling, and participating in service learning projects in her community.
What made you interested in partnering with SERA?
Working with the [SERA Science Education Instruction for Elementary Students with Learning Disabilities] project would provide collaborative opportunities to address the needs of students with exceptional learning and behavioral needs through evidence-based interventions and supports.
Since the start of the War on Poverty in the 1960s, social scientists have developed and refined experimental and quasi-experimental methods for evaluating and understanding the way public policies affect people’s lives. The overarching mission of many social scientists is to understand “what works” in social policy for reducing inequality, improving educational outcomes, and mitigating harms of early life disadvantage. This is a laudable goal. However, mounting evidence suggests that the results from many studies are fragile and hard to replicate. The so-called “replication crisis” has important implications for evidence-based analysis of social programs and policies. At the same time, there is intense debate about what constitutes a successful replication and why certain types of replication rates are so low. A crucial set of questions for evidence-based policy research concerns external validity and replicability: we need to understand the contexts and conditions under which interventions produce similar outcomes.
To address these concerns, Peter Steiner and I presented a new framework that provides a clear definition of replication and highlights the conditions under which results are likely to replicate (Wong & Steiner, 2018; Steiner, Wong, & Anglin, 2019). This work introduces the Causal Replication Framework (CRF), which defines replication as a research design that tests whether two or more studies produce the same causal effect within the limits of sampling error. The CRF formalizes the conditions under which replication success can be expected. The core of the replication framework is based on potential outcomes notation (Rubin, 1974), which has the advantage of identifying clear causal estimands of interest and demonstrating research design assumptions needed for the direct replication of results. Here, a causal estimand is defined as the causal effect of a well-defined treatment-control contrast for a clear target population. In this blog post, I discuss key assumptions needed for direct replication of results, describe how the CRF may be used for planning systematic replication studies, and explain how the CRF is informing our development of a “crowdsourcing platform” for SERA.
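As a rough illustration of the definition above, the estimand and the replication condition can be written in generic potential outcomes notation. This is a simplified sketch, not the exact formalization in Wong and Steiner (2018); the symbols (the study index k, the target population P_k, and the effect τ_k) are assumptions introduced here for illustration.

```latex
% Causal estimand in study k: the average effect of the treatment-control
% contrast for units i in that study's target population P_k.
\tau_k = \mathbb{E}\left[\, Y_i(1) - Y_i(0) \mid i \in P_k \,\right],
\qquad k = 1, \dots, K

% Direct replication asks whether all K studies share one estimand,
% so that the estimates \hat{\tau}_k differ only by sampling error.
\tau_1 = \tau_2 = \dots = \tau_K
```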
Under the CRF, five assumptions are required for the direct replication of results across multiple replication studies. These assumptions may be understood broadly as replication design requirements (R1-R2 in Table 1) and individual study design requirements (A1-A3 in Table 1). Replication design assumptions include treatment and outcome stability (R1) and equivalence in causal estimands (R2). Combined, these two assumptions ensure that the same causal estimand for a well-defined treatment and target population is produced across all studies. Individual study design assumptions include unbiased identification of causal estimands (A1), unbiased estimation of causal estimands (A2), and correct reporting of estimands, estimators, and estimates (A3). These assumptions ensure that a valid research design is used for identifying effects, unbiased analysis approaches are used for estimating effects, and that effects are correctly reported – standard assumptions in most individual causal studies. Replication failure occurs when one or more of the replication and/or individual study design assumptions are not met. Table 1 summarizes the assumptions needed for the direct replication of results.
Table 1. Design Assumptions for Replication of Effects (Steiner & Wong, in press; Wong & Steiner, 2018)
| Design assumptions | For Study 1 | … Through Study k |
| --- | --- | --- |
| Replication design assumptions (R1-R2) | R1. Treatment and outcome stability; R2. Equivalence in causal estimand | R1. Treatment and outcome stability; R2. Equivalence in causal estimand |
| Individual study design assumptions (A1-A3) | A1. Unbiased identification of effects; A2. Unbiased estimation of effects; A3. Correct reporting of estimators, estimands, and estimates | A1. Unbiased identification of effects; A2. Unbiased estimation of effects; A3. Correct reporting of estimators, estimands, and estimates |
Deriving Research Designs for Systematic Replication Studies
A key advantage of the CRF is that it is straightforward to derive research designs for replication to identify sources of effect heterogeneity. For example, direct replications examine whether two or more studies with the same well-defined causal estimand yield the same effect (akin to the definition of verification tests in Clemens, 2017). To implement this type of design, the researcher may introduce potential violations to any individual study design assumptions (A1-A3), such as using different research design (A1) or estimation (A2) approaches for producing effects, or asking an independent investigator to reproduce the effect using the same data and code (A3). Design-replications (Lalonde, 1986) and reanalysis or reproducibility studies (Chang & Li, 2015) are examples of direct replication approaches. Conceptual replications examine whether two or more studies with potentially different causal estimands yield the same effect (akin to the definition of robustness tests in Clemens, 2017). Here, the researcher may introduce systematic violations to replication assumptions (R1-R2), such as multi-site designs where there are systematic differences in participant and setting characteristics across sites (R2), and multi-arm treatment designs where different dosage levels of an intervention are assigned (R1). The framework demonstrates that while the assumptions for direct replication of results are stringent, researchers often have multiple research design options for evaluating the replicability of effects and identifying sources of effect variation. However, until recently, these approaches have not been recognized as systematic replication designs. Wong, Anglin, and Steiner (2021) describe research designs for conceptual replication studies and methods for assessing the extent to which replication assumptions are met in field settings.
Using Systematic Replication Designs for Understanding Effect Heterogeneity
Wong and Steiner (2018) show that replication failure is not inherently “bad” for science – as long as the source of the replication failure can be identified. Prospective research designs or research design features may be used to address replication assumptions in field settings. The multi-arm treatment design, where units are randomly assigned to different treatment dosages, is one example of this approach. A switching replication design, where two or more groups are randomly assigned to receive treatment at different time intervals in an alternating sequence, is another example (Shadish, Cook, & Campbell, 2002). Here, the timing and context under which the intervention is delivered vary, but all other study features (including composition of sample participants) remain the same across both intervention intervals. In systematic replication studies, a researcher introduces planned variation by relaxing one or more replication assumptions while trying to meet all others. If replication failure is observed, and all other assumptions are met, then the researcher may infer that the relaxed assumption was violated and resulted in treatment effect heterogeneity.
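One simple way to check whether two studies disagree by more than sampling error is a z-test on the difference between their effect estimates. The sketch below is a generic illustration, not the specific replication criterion used in the CRF papers; the function name `difference_test` and the example estimates are hypothetical, and it assumes the two studies are independent with approximately normal estimates.

```python
from math import sqrt
from statistics import NormalDist


def difference_test(est1, se1, est2, se2, alpha=0.05):
    """Two-sided z-test of H0: both studies estimate the same causal effect."""
    diff = est1 - est2
    # Standard error of the difference, assuming independent studies.
    se_diff = sqrt(se1 ** 2 + se2 ** 2)
    z = diff / se_diff
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return {"diff": diff, "z": z, "p": p_value, "differs": p_value < alpha}


# Hypothetical effect estimates (standardized) and standard errors
# from an original study and a replication study.
result = difference_test(est1=0.30, se1=0.08, est2=0.12, se2=0.09)
print(result)
```

A non-significant difference here is weak evidence on its own: with small samples the test has little power, so it is usually worth reporting the size of the difference and its uncertainty alongside the test.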
Using the CRF to Plan Crowdsourced Systematic Replication Studies
In crowdsourcing efforts such as SERA, teams of independent investigators collaborate to conduct a series of conceptual replication studies using the same study protocols. These are conceptual replication studies because although sites may have the same participant eligibility criteria, the characteristics of recruited sample members, the context under which the intervention is delivered, and the settings under which the study is conducted will likely vary across independent research teams. A key challenge in all conceptual replication studies – including those in SERA – is identifying why replication failure occurred when multiple design assumptions may have been violated simultaneously.
To address these concerns, SERA is developing infrastructure and methodological supports for evaluating the replicability of effects, and for identifying sources of effect heterogeneity when replication failure is observed. These tools are based on the Causal Replication Framework and focus on three inter-related strands of work. First, we are working with the UVA School of Data Science (Brian Wright) to create a crowdsourcing platform that will ensure transparency and openness in our methods, procedures, and data. A crowdsourcing platform that allows for efficient and multi-modal data collection is critical for helping us evaluate the extent to which replication assumptions are met in field settings. Second, we are working on data science methods for assessing the fidelity and replicability of treatment conditions across sites in ways that are valid, efficient, and scalable. This will provide information about the extent to which the “treatment stability” assumption (R1 in Table 1) is met in field settings. Finally, we are examining statistical approaches for evaluating the replicability of effects, especially when there is evidence of effect heterogeneity. Over the next year, we look forward to sharing results from these efforts and what our team has learned about the “science” of conducting replication studies in field settings.
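To make the idea of effect heterogeneity across sites concrete, the sketch below computes Cochran's Q and the I² statistic from site-level effect estimates. These are standard meta-analytic summaries, shown here only as an illustration; they are not presented as the statistical approach SERA is developing. The function name `cochran_q` and the site estimates are hypothetical.

```python
from math import fsum


def cochran_q(estimates, std_errors):
    """Return (pooled effect, Q, degrees of freedom, I^2) for site-level effects."""
    # Inverse-variance weights: more precise sites count more.
    weights = [1 / se ** 2 for se in std_errors]
    pooled = fsum(w * e for w, e in zip(weights, estimates)) / fsum(weights)
    # Q sums the weighted squared deviations of each site from the pooled effect.
    q = fsum(w * (e - pooled) ** 2 for w, e in zip(weights, estimates))
    df = len(estimates) - 1
    # I^2 is the share of variation beyond what sampling error alone would produce.
    i_squared = max(0.0, (q - df) / q) if q > 0 else 0.0
    return pooled, q, df, i_squared


# Hypothetical effect estimates and standard errors from four sites.
pooled, q, df, i_squared = cochran_q([0.10, 0.35, 0.22, 0.50], [0.10, 0.12, 0.09, 0.15])
print(f"pooled={pooled:.3f}, Q={q:.2f}, df={df}, I^2={i_squared:.2f}")
```

Under the null hypothesis of a common effect, Q is compared against a chi-square distribution with `df` degrees of freedom; that step is omitted here to keep the example within the standard library.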
The Replication Quantitative Methods Team includes Vivian C. Wong (UVA), Peter M. Steiner (Maryland), Brian Wright (UVA SDS), Anandita Krishnamachari (Research Scientist), and Christina Taylor (Replication Specialist).
Vivian C. Wong, Ph.D.
Co-Principal Investigator
Vivian C. Wong, Ph.D., is an Associate Professor in Research, Statistics, and Evaluation at UVA. Dr. Wong’s expertise is in improving the design, implementation, and analysis of experimental and quasi-experimental approaches. Her scholarship has focused recently on the design and analysis of replication research. Dr. Wong has authored numerous articles on research methodology in journals such as Journal of Educational and Behavioral Statistics, Journal of Policy Analysis and Management, and Psychological Methods. Wong is primarily responsible for developing the methodological infrastructure and supports for SERA, assisting with methodological components of the pilot study, and conducting exploratory analyses of SERA.
Institution/Organization Affiliation: Washington State University Vancouver
SERA Research Partner Bio:
Dr. Michael Dunn is an Associate Professor of Special Education and Literacy at Washington State University Vancouver (near Portland, Oregon). His areas of research interest include skills/strategies for struggling readers and writers, and response to intervention (an intervention/assessment process for classifying students with a learning disability).
What made you interested in partnering with SERA?
My research interests focus on developing interventions and offering them to students to apply in their learning. This project also offers me the opportunity to collaborate with other researchers across the US.
Institution/Organization Affiliation: Brigham Young University
SERA Research Partner Bio:
Dr. Jared Morris is an Assistant Professor in the Department of Counseling Psychology and Special Education at Brigham Young University. Jared completed his Ph.D. in special education with a minor in educational psychology at The Pennsylvania State University. He also completed a graduate certificate in applied behavior analysis at The Pennsylvania State University. He received a master’s degree in special education from The University of Utah and a bachelor’s degree in English from Brigham Young University. Jared taught students with disabilities in a variety of settings for five years.
What made you interested in partnering with SERA?
I became interested in SERA after seeing a need to help further the research base for students with disabilities. Accelerating research through systematic large-scale replications using the SERA model has the potential to increase the research base by providing more robust data because of the increased quantity and diversity of the research samples.
Educators strive to improve and maximize the learning outcomes of their students by applying effective instructional practices. Although no instructional approach is universally effective, some teaching practices are more effective than others. Therefore, it is important to reliably identify and prioritize the most effective instructional practices for populations of learners. Rigorous experimental research is generally agreed to be the most reliable approach for identifying “what works” in education. However, it is important to recognize that scientific research has important limitations and does not always generate valid findings.
In large-scale replication projects in psychology and other fields, researchers often failed to replicate the findings of previously conducted studies (e.g., Klein et al., 2018; Open Science Collaboration, 2015), casting doubt on the validity of research findings. Researchers, including those in education, have also reported using a variety of questionable research practices such as p-hacking (exploring different ways to analyze data until desired results are obtained), selective-outcomes reporting (only reporting analyses with desired results), and hypothesizing after results are known (HARKing; e.g., Fraser et al., 2018; John et al., 2012; Makel et al., 2019), all of which increase the likelihood of false positive findings (Simmons et al., 2011). Moreover, research studies in education often involve relatively small and underpowered samples that do not adequately represent the population being studied, which further threatens the validity of study findings. Finally, most published research lies behind a paywall, inaccessible to many practitioners and policymakers (Piwowar et al., 2018), which reduces the potential application and impact of research. Open science and crowdsourcing are two related developments in research that aim to address these issues and improve the validity and impact of research.
Open science is an umbrella term that includes a variety of practices aiming to open and make transparent all aspects of research with the goal of increasing its validity and impact (Cook et al., 2018). For example, preregistration involves making one’s research plans transparent before conducting a study in order to discourage questionable research practices such as p-hacking and HARKing by making them easily discoverable. Data sharing is another key open-science practice, which allows the research community to verify analyses reported in an article, as well as analyze data sets in other ways to examine the robustness of reported findings, thereby serving as a protection against p-hacking. Materials sharing involves openly sharing materials used in a study (e.g., intervention protocols, intervention checklists) to enable other researchers to replicate the study as faithfully as possible. Finally, open access publishing and preprints provide free access to research to anyone with internet access, thereby democratizing the benefits and impact of scientific research beyond those with institutional subscriptions to the publishers of academic journals.
Crowdsourcing in research involves large-scale collaboration on various elements of the research process and can take many forms (Makel et al., 2019). For example, multiple research teams might conduct independent studies examining the same issues, as the Open Science Collaboration (2015) did when they conducted 100 replication studies to examine reproducibility of findings in psychology. Crowdsourcing in research has also involved multiple analysts analyzing the same data set to answer the same research question in order to examine the effects of different analytic decisions on study outcomes (Silberzahn et al., 2018). The most frequent application of crowdsourcing in research, which we have adopted in the Special Education Research Accelerator, is to involve many research teams in the collection of data, thereby increasing the size and diversity of the study sample – which serves to improve the study’s power and external validity. For example, Jones et al. (2018) involved more than 200 researchers collecting data from more than 11,000 participants in 41 countries to evaluate facial perceptions. Regardless of how it is employed, “crowdsourcing flips research planning from ‘what is the best we can do with the resources we have to investigate our question?’ to ‘what is the best way to investigate our question, so that we can decide what resources to recruit?’” (Uhlmann et al., 2019, p. 417).
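The link between pooled sample size and power can be illustrated with a quick calculation. The sketch below uses a normal approximation for a two-group comparison; the effect size, the per-team sample size, and the number of teams are hypothetical, and the calculation ignores clustering of participants within sites, which would reduce power in practice.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-group test (normal approximation)."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Expected z-statistic for a standardized mean difference `effect_size`.
    expected_z = effect_size * sqrt(n_per_group / 2)
    # Probability of exceeding the critical value in the expected direction.
    return NormalDist().cdf(expected_z - z_crit)


# One team with 20 participants per group vs. ten teams pooling 200 per group.
single_team = approx_power(effect_size=0.25, n_per_group=20)
ten_teams = approx_power(effect_size=0.25, n_per_group=200)
print(f"single team: {single_team:.2f}, ten teams pooled: {ten_teams:.2f}")
```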
Although open science and crowdsourcing are independent constructs (i.e., research that is open is not necessarily crowdsourced, and crowdsourced studies are not necessarily open), the two approaches are closely aligned and complementary. The ultimate goal of both open science and crowdsourcing is to improve the validity and impact of research. Although they use different means to achieve this goal, crowdsourcing facilitates making research open, and open science facilitates crowdsourcing. For example, in a crowdsourced study in which data are collected by many research teams, study procedures have to be determined and disseminated to collaborating researchers before the study begins. Because study procedures are determined and documented prior to conducting the study, they can readily be posted as a preregistration. Additionally, if data are being collected across many researchers in a crowdsourced study, the project will need a clear data-management plan to enable data to be collected and entered reliably across researchers. Such well-organized data-management plans that include metadata not only facilitate the integration of data across researchers on the project, but also make data more readily usable by other researchers when shared and reduce the burden of creating supporting metadata when sharing. Moreover, because data are collected across many different researchers in many crowdsourced studies, individual researchers may be less likely to feel that they “own” the data and therefore may feel less reluctant to share them. Similarly, materials used in crowdsourced studies must have been developed and shared with the many researchers collecting data, so – like data – materials from crowdsourced research are ready to be uploaded and shared.
Most broadly, crowdsourcing and open science share an ethos of collaboration and sharing for the betterment of science, and we conjecture that most researchers involved in crowdsourced studies will want to make their research as open as possible. As such, we look forward to making our crowdsourced research through the Special Education Research Accelerator as open as possible.
In crowdsourced research, resources are combined across researchers to conduct studies that individual researchers could not accomplish on their own.
“Crowdsourcing flips research planning from ‘what is the best we can do with the resources we have to investigate our question,’ to ‘what is the best way to investigate our question, so that we can decide what resources to recruit.’” (Uhlmann et al., 2019, p. 7).
Crowdsourcing research is new to special education research, but in other fields, it has been around for decades. Many aspects of research can be crowdsourced, such as deciding what ideas to research, analyzing data, and conducting peer review (Uhlmann et al., 2019). The most common use of crowdsourcing is for data collection, in which many researchers collect data, resulting in much larger and more diverse samples of study participants. Examples of crowdsourced data collection outside of special education include:
Citizen data collectors: Fields such as astronomy and geology utilize volunteers to collect data that researchers would never be able to collect on their own. Many of these projects and opportunities to become data collectors are compiled on the federal government citizen science website.
Crowdsourced research in special education, especially crowdsourced data collection, has many potential uses, such as:
Implementing large-scale observational studies so we can better understand how students with disabilities are provided services across the country
Validating evidence-based practices using nationally representative student samples
Enabling adequately powered group studies with low-incidence populations to be conducted
Dramatically increasing the number of direct and conceptual replications so we can efficiently determine what interventions work for whom, under what conditions (Coyne et al., 2016).
Ultimately, we envision that crowdsourcing will democratize the research enterprise by enabling a larger and more diverse group of researchers to take part in large-scale studies with large and diverse participant samples, studies that ask and answer critical research questions aimed at improving services for children with disabilities. By facilitating crowdsourcing of data collection across many researchers, we hope the Special Education Research Accelerator (SERA) will play a core role in democratizing the research enterprise in our chosen field, special education.
Bill Therrien
Co-Principal Investigator
William J. Therrien, Ph.D., BCBA, is a Professor in Special Education at UVA. He is an expert in designing and evaluating academic programming for students with disabilities, particularly in science and reading. Along with co-directing CASPER, Dr. Therrien co-edits Exceptional Children, the flagship journal in special education, and is Research in Practice Director for UVA’s Supporting Transformative Autism Research (STAR) initiative. Therrien assists Cook with developing infrastructure and supports for SERA, conducting the pilot study, and assessing the usability and feasibility of using SERA to conduct future replication pilot studies.
Institution/Organization Affiliation: Stanford University
SERA Research Partner Bio:
Christopher J. Lemons, Ph.D., is an Associate Professor of Special Education at Stanford University. His research focuses on improving academic outcomes for children and adolescents with intellectual, developmental, and learning disabilities. His recent research has focused on developing and evaluating reading interventions for individuals with Down syndrome. His areas of expertise include reading interventions for children and adolescents with learning and intellectual disabilities, data-based individualization, and intervention-related assessment and professional development. Lemons has secured funding to support his research from the Institute of Education Sciences and the Office of Elementary and Secondary Education, both within the U.S. Department of Education, and from the National Institutes of Health.
What made you interested in partnering with SERA?
The idea to crowdsource special education research is innovative and well-aligned with other initiatives focused on open science and replication. This project is moving our field forward in creative, meaningful ways and I’m happy to be involved.
Institution/Organization Affiliation: Kent State University
SERA Research Partner Bio:
Dr. Stevenson is an Assistant Professor in the area of mild/moderate educational needs at Kent State University. He teaches graduate and undergraduate courses in core instruction, classroom management, inclusive practices, and instructional methods for struggling learners. He earned his doctorate in special education from Michigan State University. Dr. Stevenson began his career as an elementary classroom teacher with New York City Public Schools. His research interests include classroom behavior management, inclusive practices, and adoption of evidence-based instruction.
What made you interested in partnering with SERA?
Being a part of SERA is truly a unique opportunity to draw on the collective expertise of scholars and speed the pace of research in critical areas. I am delighted to be working with such a distinguished group of scholars.
Pictured from left to right: Amelia K. Moody, Ph.D., James Stocker, Ph.D., and Sharon Richter, Ph.D.
Name: Amelia K. Moody, Ph.D.
Institution/Organization Affiliation: University of North Carolina-Wilmington
Additional Research Team Members: James Stocker, Ph.D., Associate Professor & Sharon Richter, Ph.D., Assistant Professor
SERA Research Partner Bio:
Amelia Moody received her Ph.D. in special education from the University of Virginia and currently works as a Professor in the Watson College of Education at the University of North Carolina Wilmington. Moody is the director of the Center for Assistive Technology and serves as a member of the Science, Technology, Engineering, and Mathematics (STEM) Learning Cooperative. Her current grant work centers on research into innovative technologies that enhance educational outcomes for children with disabilities, with a focus on students with Autism Spectrum Disorders.
What made you interested in partnering with SERA?
Our team is interested in participating in SERA because we are dedicated to improving educational outcomes for students with ASD. This project allows for accelerated data collection on innovative educational interventions in efforts to determine how to best meet the needs of this population of students.