SERA2: Identifying Generalization Boundaries

The original funding for developing and piloting the Special Education Research Accelerator (SERA) was provided by a grant from the Institute of Education Sciences (IES). Though the pandemic presented many obstacles, the project resulted in, among other accomplishments, the SERA website, which serves as a hub for crowdsourced research studies conducted by SERA; a network of approximately 370 SERA partners (special education researchers interested in conducting crowdsourced research) throughout the US; and a pilot RCT replication examining the effects of elaborative interrogation on the retention of science facts among elementary students with high-incidence disabilities across eight SERA research partners (we are currently writing up the results of that study; check our blog for a forthcoming summary). We learned a lot working with our partners in this project. One thing that stood out to us was that although we had developed infrastructure and procedures for crowdsourcing research studies across many researchers and sites, a similar process for crowdsourcing the planning of research does not yet exist. Therefore, we proposed (and were fortunate to receive funding for) an IES grant, which we’re referring to as SERA2, to expand SERA by developing and piloting procedures and supports for crowdsourcing the development of lines of inquiry to systematically investigate effect heterogeneity for the purpose of estimating generalizability boundaries.

Generalization requires understanding sources of variation that may amplify or dampen intervention effects (Stuart et al., 2011; Tipton, 2012; Tipton & Olsen, 2018). Cronbach identified four classes of contextual variables that could potentially affect the size of intervention effects: variations in unit (or participant), treatment (or version of the intervention), outcome, and/or setting (UTOS) characteristics. For example, the effects of an intervention may vary for students with learning disabilities compared to students with autism, when implemented in small groups versus individually, when delivered by a reading specialist versus a paraprofessional, and/or across combinations of these characteristics.

To fully inform policy and practice about the effectiveness of programs and interventions, researchers should examine treatment effect heterogeneity across key learner populations, treatment variations, outcomes, and settings. It seems to us that this is just the type of information policymakers and practitioners want to know: Does this intervention work for students with autism? Does it work when implemented in small groups? However, researchers seldom design series of conceptual replication studies that systematically examine effect heterogeneity across key moderator variables. And if one researcher or research team were to design such a series of studies, it would take them many years if not decades to conduct studies examining all the possible combinations of key moderator variables to fully examine effect heterogeneity.

In the first stage of SERA2, we worked with a Consensus Panel to identify key moderator variables across which to examine effect heterogeneity for repeated reading, a commonly used intervention to improve reading performance for students with learning disabilities. The Consensus Panel included six experts in repeated reading and/or reading instruction for culturally and linguistically diverse students with learning disabilities. Panel members attended a two-day meeting at the University of Virginia to develop an initial list of key moderator variables for repeated reading. We will use this list of hypothesized moderators to design a series of conceptual replication studies to investigate systematic sources of effect heterogeneity for repeated reading for students with learning disabilities. Drs. Scott Ardoin (University of Georgia), Young-Suk Kim (University of California-Irvine), Endia Lindo (Texas Christian University), Michael Solis (University of California-Riverside), Elizabeth Stevens (University of Kansas), and Jade Wexler (University of Maryland) participated in a Nominal Group Technique to develop initial consensus on the most important moderator variables. The Nominal Group Technique involves four stages: idea generation, nomination, discussion, and ranking. Day 1 ended with experts ranking the importance of nominated moderator variables in each of the UTOS categories. After we shared the results of the rankings, our group of experts re-nominated, re-discussed, and re-ranked key moderator variables for repeated reading on Day 2. The highest ranked variables in each UTOS category at the end of Day 2 were:

  • Units: students with learning disabilities with low vs. high decoding skills
  • Treatments: (a) difficulty of passages and (b) modelling skilled reading of passages (tie)
  • Observations: type of oral reading fluency measure
  • Settings: individual vs. group administration

Using these variables, co-PI Dr. Vivian Wong and Project Consultant Dr. Peter Steiner (University of Maryland) are developing a series of conceptual replication studies (an integrated replication design) to systematically investigate the effects of repeated reading across the many combinations of the levels of these variables. We will then conduct focus groups with practitioners who have experience teaching repeated reading and with a broader group of researchers who have expertise in reading intervention to garner feedback on the selected moderator variables and the draft integrated replication design. In the second stage of the project, we will involve SERA research partners to crowdsource piloting of selected studies in the integrated replication design.

We will be providing updates here as the project progresses and are excited about the potential of identifying key moderator variables for other commonly used interventions in special education, with the ultimate goal of informing policy and practice by crowdsourcing studies across many research teams to systematically examine effect heterogeneity in a short time frame.

Crowdsourcing & Open Science

Educators strive to improve and maximize the learning outcomes of their students by applying effective instructional practices. Although no instructional approach is universally effective, some teaching practices are more effective than others. Therefore, it is important to reliably identify and prioritize the most effective instructional practices for populations of learners. Rigorous experimental research is generally agreed to be the most reliable approach for identifying “what works” in education. However, it is important to recognize that scientific research has important limitations and does not always generate valid findings.

In large-scale replication projects in psychology and other fields, researchers often failed to replicate the findings of previously conducted studies (e.g., Klein et al., 2018; Open Science Collaboration, 2015), casting doubt on the validity of research findings. Researchers, including those in education, have also reported using a variety of questionable research practices such as p-hacking (exploring different ways to analyze data until desired results are obtained), selective outcome reporting (reporting only analyses with desired results), and hypothesizing after results are known (HARKing; e.g., Fraser et al., 2018; John et al., 2012; Makel et al., 2019), all of which increase the likelihood of false positive findings (Simmons et al., 2011). Moreover, research studies in education often involve relatively small and underpowered samples that do not adequately represent the population being studied, which further threatens the validity of study findings. Finally, most published research lies behind a paywall, inaccessible to many practitioners and policymakers (Piwowar et al., 2018), which reduces the potential application and impact of research. Open science and crowdsourcing are two related developments in research that aim to address these issues and improve the validity and impact of research.

Open science is an umbrella term that includes a variety of practices aiming to open and make transparent all aspects of research with the goal of increasing its validity and impact (Cook et al., 2018). For example, preregistration involves making one’s research plans transparent before conducting a study in order to discourage questionable research practices such as p-hacking and HARKing by making them easily discoverable. Data sharing is another key open-science practice, which allows the research community to verify analyses reported in an article, as well as analyze data sets in other ways to examine the robustness of reported findings, thereby serving as a protection against p-hacking. Materials sharing involves openly sharing materials used in a study (e.g., intervention protocols, intervention checklists) to enable other researchers to replicate the study as faithfully as possible. Finally, open access publishing and preprints provide free access to research to anyone with internet access, thereby democratizing the benefits and impact of scientific research beyond those with institutional subscriptions to the publishers of academic journals.

Crowdsourcing in research involves large-scale collaboration on various elements of the research process and can take many forms (Makel et al., 2019). For example, multiple research teams might conduct independent studies examining the same issues, as the Open Science Collaboration (2015) did when they conducted 100 replication studies to examine reproducibility of findings in psychology. Crowdsourcing has also been used to have multiple analysts analyze the same data set to answer the same research question in order to examine the effects of different analytic decisions on study outcomes (Silberzahn et al., 2018). The most frequent application of crowdsourcing in research, which we have adopted in the Special Education Research Accelerator, is to involve many research teams in the collection of data, thereby increasing the size and diversity of the study sample, which serves to improve the study’s power and external validity. For example, Jones et al. (2018) involved more than 200 researchers collecting data from more than 11,000 participants in 41 countries to evaluate facial perceptions. Regardless of how it is employed, “crowdsourcing flips research planning from ‘what is the best we can do with the resources we have to investigate our question?’ to ‘what is the best way to investigate our question, so that we can decide what resources to recruit?’” (Uhlmann et al., 2019, p. 713).

Although open science and crowdsourcing are independent constructs (i.e., research that is open is not necessarily crowdsourced, and crowdsourced studies are not necessarily open), the two approaches are closely aligned and complementary. The ultimate goal of both open science and crowdsourcing is to improve the validity and impact of research. Although using different means to achieve this goal, crowdsourcing facilitates making research open, and open science facilitates crowdsourcing. For example, in a crowdsourced study in which data are collected by many research teams, study procedures have to be determined and disseminated to collaborating researchers before the study is begun. Because study procedures are determined and documented prior to conducting the study, they can readily be posted as a preregistration. Additionally, if data are being collected across many researchers in a crowdsourced study, the project will need a clear data-management plan to enable data to be collected and entered reliably across researchers. Such well-organized data-management plans that include metadata not only facilitate the integration of data across researchers on the project, but also make data more readily usable by other researchers when shared and reduce the burden of creating supporting metadata when sharing. Moreover, because data are collected across many different researchers in many crowdsourced studies, individual researchers may be less likely to feel that they “own” the data and therefore may feel less reluctant to share them. Similarly, materials used in crowdsourced studies must have been developed and shared with the many researchers collecting data, so – like data – materials from crowdsourced research are ready to be uploaded and shared. 

Most broadly, crowdsourcing and open science share an ethos of collaboration and sharing for the betterment of science, and we conjecture that most researchers involved in crowdsourced studies will want to make their research as open as possible. As such, we look forward to making our crowdsourced research through the Special Education Research Accelerator as open as possible.


Bryan G. Cook, Ph.D.

Principal Investigator

Bryan G. Cook, Ph.D., is a Professor in Special Education at UVA with expertise in standards for conducting high-quality intervention research in special education, replication research in special education, and open science. He co-directs, with Dr. Therrien, the Consortium for the Advancement of Special Education Research (CASPER) and is an ambassador for the Center for Open Science. He is Past President of CEC’s Division for Research, chaired the working group that developed CEC’s (2014) Standards for Evidence-Based Practices in Special Education, coedits Advances in Learning and Behavioral Disabilities, and is coauthor of textbooks on special education research and evidence-based practices in special education. Cook plays an integral role in developing infrastructure and supports for SERA, conducting the pilot study, and assessing the usability and feasibility of using SERA to conduct future replication pilot studies.

Welcome to the Special Education Research Accelerator

Bryan G. Cook, William J. Therrien, Vivian C. Wong, Christina Taylor

We are pleased to welcome you to the Special Education Research Accelerator (SERA). SERA is a platform for crowdsourcing data collection in special education research across multiple research teams. When we read about the Psychological Research Accelerator, which crowdsources data collection for massive studies in the field of psychology conducted throughout the world, we began to think about the potential benefits of crowdsourcing in special education research. In essence, instead of a single research team conducting a study, crowdsourcing of data collection involves a network of research teams collecting data. Crowdsourcing allows researchers to flip “research planning from ‘what is the best we can do with the resources we have to investigate our question,’ to ‘what is the best way to investigate our question, so that we can decide what resources to recruit’” (Uhlmann et al., 2019, p. 713).

Given that there are relatively few students with disabilities, especially low-incidence disabilities, in schools, it is often difficult for special education researchers to obtain large, representative samples for their studies. Moreover, given limited grant funding, relatively few researchers in the field have the resources to conduct studies with large, representative samples on their own. Crowdsourcing data collection across many research teams seemed to us to be well-suited to address these and other challenges faced in special education research.

However, implementing crowdsourcing in special education research presents many challenges. Can interventions be conducted with fidelity across many different research teams? How will data be managed? How will implementation fidelity be measured? How will IRB issues be handled across multiple institutions? Fortunately, the National Center for Special Education Research (NCSER) funded an unsolicited grant for us to develop and pilot a platform for crowdsourcing research in special education (i.e., SERA) to examine these and other issues. By involving many different research teams in data collection, SERA can generate large and representative study samples, involve diverse sets of researchers, and examine whether and how study findings vary across researchers and research sites, in ways that studies conducted by a single research team could not.

This website has two primary functions: (a) a public-facing site to provide information and resources related to SERA and crowdsourcing research in special education, and (b) a hub for research partners to access resources and interact with SERA staff related to ongoing SERA projects. We hope all of you explore the website to learn more about SERA, the team behind SERA, and our SERA research partners. Please check back periodically as we provide updates. For our research partners, please be on the lookout for an email over the next few weeks which will contain your login credentials, as well as provide additional information on navigating pilot study resources and materials.

Currently, we are preparing to conduct a crowdsourced randomized controlled trial that conceptually replicates Scruggs et al.’s (1994) study of acquisition of science facts. We will be examining the effect of instructor-provided elaborations and student-generated elaborations on science-fact acquisition for elementary students with high-functioning autism across more than 20 research partners and sites throughout the US.

We couldn’t be more excited about SERA, this website, and our upcoming pilot study. We plan on developing and refining SERA for future use in many different crowdsourced studies in special education. If you’re interested in potentially being involved in future studies, please send us a message using the contact form here. We hope that you’ll find the site interesting and helpful, and that (if you’re a special education researcher) you’ll consider being involved in a crowdsourced SERA study.