CUSTOMIZING AN RPP FOR A PANDEMIC

This is the third installment of Improving Improvement, our quarterly series focused on leveraging the power of research-practice partnerships (RPPs) to build schools’, districts’ and states’ capacity to improve. In our last installment, we shared lessons learned from working with existing partners during the Covid-19 crisis. Many of these lessons could be considered best practices for RPPs in any context.
In this installment, we want to build on those lessons with a discussion of a new partnership created expressly in response to the pandemic. In June, we partnered with Impact Florida (an organization that convenes networks of Florida districts to recognize, support, and scale great teaching practices) to support four districts in addressing challenges made more urgent by the pandemic. As Impact Florida’s technical assistance provider, Proving Ground provides the framework and tools necessary for districts to choose a problem of practice, identify and develop potential solutions, pilot and test those solutions, and make outcome-enhancing decisions based on the test results. The urgency of the moment meant condensing what we normally do in about 18 months into less than half that time. And we had to do it while minimizing the burden on already time- and attention-strapped district partners. While we are not ready to conclude that this accelerated model works, preliminary feedback suggests the modifications made for this moment will help us better serve partners in the future.
A More Responsive Research Infrastructure
Our collaboration with Impact Florida required us to combine our process with new flexibility to serve the districts’ urgent needs in the face of Covid-19. Our partnership model has always been focused on serving practitioners’ needs without sacrificing the rigor of the evidence we help them generate and use. Striking that balance, however, generally involves two quality safeguards that limited our ability to use our model to address urgent needs. First, we restrict the outcomes we measure and, by extension, the problems our partners can address. Using outcomes on which we have done extensive R&D and with which we have extensive experience makes it easier to manage the process and ensure the rigor of results. Second, we use a thorough onboarding process that includes ingesting districts’ data into our secure warehouse and gradually guiding district staff through the entire process over the course of roughly a year. Data ingestion is the longest part of the onboarding process, but it enables us to do repeatable rapid-cycle evaluations (our partners can run four rounds of RCTs in two years, for example).
Impact Florida’s goal was to select a cadre of Florida districts and help them address one of the many urgent challenges posed by the pandemic this year. For the work to be relevant, the cadre would need to be able to choose which challenge to address. And they had to be able to implement a strategy in the fall after starting work in earnest in August. Thus, when we started conversations with Impact Florida about supporting some of their partner districts, our decision was really about whether we could reengineer our process to maintain quality without the two safeguards outlined above. We decided to try. The lesson so far is that it appears doable, provided we engage district partners early and often so that they understand the implications before making key decisions.
Becoming (Mostly) Outcome-Agnostic
The path to selecting outcome measures began before any of the four districts in the cadre officially signed on. Impact Florida identified six districts from its network of 13 in May. We then shared an overview, including expectations for piloting and testing, with all of them. By late May the pool had narrowed to four districts, and after individual conversations with each district in early June, the Covid Recovery Cadre (CRC) was born. At that point the CRC had not yet decided on its focus area, but Impact Florida had surveyed the districts along the way to learn about their interests. This, along with the grant requirements, let us narrow the choice to a few options: Algebra, high school ELA, and/or social-emotional learning (SEL). In late June, we facilitated a session with all four districts to help them decide where they would focus. Among the factors they should consider, we included the research implications (what we would use to measure impact, how pilots might be likely to play out) alongside the usual questions of impact, effort, and so on. The four districts unanimously chose Algebra as their focus area, specifically focusing on students who were in Pre-Algebra in 8th grade and Algebra 1 in 9th grade. But they were also convinced that, in the current environment, it was difficult to separate SEL from academic outcomes. We agreed to try to measure SEL outcomes as intermediate outcomes if SEL proved a key part of their theories of action, something we would not know until they completed a root cause analysis and chose their interventions.
This was Proving Ground’s first cycle using Algebra or SEL as an outcome, so it required us to become comfortable with the unknown. We had not seen partners’ Algebra or SEL data, and there were no prior years’ end-of-course exams (EOCs). This year’s EOCs are uncertain and will not give us results we can use on our partners’ timelines (we aim to provide results before budget decisions for the following year get made). We knew only that all four districts administered Algebra benchmark tests, albeit four different ones, including two developed in-house, and that they were willing to administer additional SEL instruments if needed. Our analytic team therefore did some quick pressure testing with a dataset from one of our partners to confirm that we could make it work. We nevertheless launched with far less certainty about measurement instruments than we ever had in the past. That put a great deal of pressure on our analytic team to be adaptive as the data came in. We had to be willing to solve measurement challenges quickly. This was, therefore, an exercise in getting comfortable with uncertainty and trusting the team’s skills and talent. But it was also an exciting intellectual challenge for those who embraced it. And, once again, we were very explicit with our partners about the challenge and uncertainty. They went ahead as aware as we could make them of the risks involved.
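To give a concrete sense of the kind of measurement problem involved, one common way to make results from four different benchmark instruments comparable is to standardize scores within each district and test window. The Stata sketch below illustrates that general idea only; the file and variable names are invented, and it is not our partners’ data or our team’s actual code.

```stata
* Illustrative only: put scores from different benchmark instruments on a
* comparable footing by standardizing within district and test window.
* File and variable names (district, test_window, scale_score) are hypothetical.
use "crc_benchmarks.dta", clear
bysort district test_window: egen mu    = mean(scale_score)
bysort district test_window: egen sigma = sd(scale_score)
generate z_score = (scale_score - mu) / sigma
summarize z_score
```

Expressing outcomes in standard-deviation units like this is one familiar way to report impacts when partners use different instruments, though the right approach ultimately depends on the properties of each test.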
Cutting Onboarding and Development Time by 75%
Proving Ground’s ordinary onboarding and initial development period lasts around 12 months. It encompasses two related foundation-setting steps: data processing and learning the continuous improvement process while developing the first intervention. Making this new accelerated timeline work required substantial modifications to both.
In a traditional engagement, we ingest partners’ data into our secure warehouse. On average, this takes around six months, but in some cases it can take longer. One driver is setting up data transfers; the signing of data sharing agreements (DSAs) can take some time with two bureaucracies involved. The bulk of the remaining time is spent on the back and forth it takes to understand partners’ data (even in something standard like attendance, there is a lot of variation in coding and business rules) and to address errors. Once that is done, our team writes the scripts necessary to automate future ingestion and cleaning, along with the code to process all files into something we can analyze. For the CRC, waiting six months to be ready to process data would not work. We got around the problem both by changing how we handled the data and by modifying the rest of our process so that it was not dependent on data at as early a stage.
The biggest change to data processing was deciding to forgo ingesting the data into the warehouse. Because this is a one-year engagement with one improvement cycle per partner, the benefit of warehousing the data was substantially outweighed by the upfront cost. We instead processed the data directly, essentially replacing SQL code with Stata code to clean and stitch data sources for analysis. We also invested energy up front figuring out the bare minimum data we would need to measure impact. This had the side benefit of reducing the data requested of partners at a time when staff were generally overwhelmed. Finally, we set expectations from the beginning that we would need DSAs signed on a much shorter turnaround than normal.
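As a rough sketch of what processing the data directly can look like, the Stata fragment below cleans and stitches a hypothetical benchmark extract and enrollment file into one analysis file, skipping the warehouse step. The file and variable names are invented for illustration and do not reflect any partner’s actual data or our production code.

```stata
* Illustrative sketch of direct processing: clean and stitch two hypothetical
* extracts into a single analysis file without a warehouse step.
import delimited "benchmark_extract.csv", clear
rename studentid student_id
keep student_id test_window scale_score
duplicates drop student_id test_window, force
tempfile benchmark
save `benchmark'

import delimited "enrollment_extract.csv", clear
rename studentid student_id
merge 1:m student_id using `benchmark', keep(match master) nogenerate
save "crc_analysis_file.dta", replace
```

The tradeoff is that code like this serves a single cycle; the warehouse and automated ingestion scripts are what make repeated rapid-cycle evaluations cheap in our traditional engagements.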

Figure 1: Changes to traditional process to create accelerated process for Covid Recovery Cadre (changes for CRC highlighted in orange; cuts for CRC highlighted in red)
Modifying the continuous improvement process to be less dependent on data in the initial stages coincided with shortening the overall time it takes to develop the first pilot. In a traditional engagement, developing the first pilot takes the better part of a year because we start with a data diagnostic we produce for partners and because we are laying the groundwork for future improvement cycles. Waiting for our data diagnostic means we cannot begin developing problem statements and doing the root cause analysis until we have the DSAs signed, data cleaned and processed, and diagnostic charts and tables produced and shared with partners. Even without ingesting data into the warehouse, we would have had to delay starting the process until around September, a non-starter when the goal was to launch interventions in the fall. We got CRC partners started earlier by having them do their own data diagnostic using a template we created for them. Because we did not need their data, this started almost immediately, happening concurrently with the signing of DSAs and before the transfer of any data. It was a five-month head start.
We still had a long way to go to shave the rest of the timeline down, however. As with cutting out ingestion into the data warehouse, a key change was enabled by the fact that this was a one-year engagement. In our traditional engagements, our goal is not just to help districts identify what works and what does not for a given problem but to ensure they can sustain the work after we are gone. It is why our traditional engagements are three years and encompass multiple improvement cycles. And it means not just guiding partners through the process but building their capacity along the way. It also means diving deep at all stages of the process to generate learnings that we will use in future cycles. Because the goal here was to figure out what works and what does not for an urgent need, we could deprioritize capacity building to an extent. So we cut several steps from the process and, within each step, covered only the minimum needed for this cycle. We also did some legwork for partners that we would not have done if we were building capacity, like pre-populating portions of templates.
The biggest shortcuts came around intervention design. Traditionally, this process involves substantial stakeholder engagement and begins with partner brainstorming. For the CRC, we eliminated the former (as noted below) and used the latter only as an optional supplement to the core step of matching root causes to evidence-based intervention options. To facilitate that matching, we produced a self-guided tool that allowed districts to enter root causes and see a curated list of aligned evidence-based intervention options.[1] While we had been developing the idea for this tool for a long time, the CRC pushed us to create a lower-fidelity working example to make it available for these partners. Its use substantially cut the time needed to identify a well-developed, root-cause-aligned solution. It also led to all partners having design and implementation support from the creators of the interventions they selected (we acted as the matchmakers). As a result, design and implementation planning time was cut substantially as well.
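As a toy illustration of the matching idea (the actual tool’s format and contents are not reproduced here), the snippet below builds a small lookup pairing root causes with the intervention options mentioned below and filters it for a chosen root cause.

```stata
* Toy illustration of root-cause-to-intervention matching; the pairings
* mirror the examples discussed in this piece, not the full curated tool.
clear
input str25 root_cause str25 intervention
"growth mindset"      "PERTS Growth Mindset"
"teacher practice"    "MQI Coaching"
"prerequisite skills" "SpringMath tutoring"
end
list intervention if root_cause == "growth mindset", noobs
```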
Why We Are Hopeful this Model Works
The main reason we are hopeful that this model is viable is the progress we have seen so far. It is November, and all four partners are implementing their interventions and on track to have evidence they can use to decide whether to scale, adapt, or drop them in the spring. Two partners identified growth mindset as a key root cause of students’ challenges with Algebra and are piloting PERTS Growth Mindset modules. One identified teacher practice as a root cause and is piloting MQI Coaching. The fourth identified prerequisite skills deficits as a root cause and is piloting tutoring using SpringMath.
Beyond the fact that partners are so far on track, we have seen benefits from some of the changes that will lead us to adopt them into our traditional model. One example is that letting go of the reins on the diagnostics seems to have increased partners’ engagement with their data and led to richer conversations as they diagnosed their challenges and developed clear problem statements. While we will still produce diagnostics for other uses, in the interest of capacity building we will start all partners with a self-diagnostic template. Another is more intentionally connecting partners with implementation support from intervention creators where feasible.
Key Qualifications
While we are encouraged by the progress our CRC partners have made and will adopt several features of the CRC regardless of the result, it is important to qualify that some of the success depends on conditions that may be hard to replicate, and some of the changes involve tradeoffs that are only worth making in the face of the kind of urgency the pandemic has created. One condition that may be hard to replicate but seems key to success in this accelerated model is the unique commitment of our four partners and their existing relationship with Impact Florida. Their commitment meant we knew they would do what was necessary to make this work, like completing the self-diagnostic on a short turnaround, despite their extremely limited time. The existing relationship meant we were piggybacking on the goodwill and trust Impact Florida had built up when partners opted to continue in the face of the risks posed by so many modifications.
Similarly, not all of the modifications are desirable beyond urgent, exceptional cases, especially the decision to drop the stakeholder engagement steps. These steps inform intervention selection, design, and implementation planning; they were dropped both for the sake of time and because the pandemic limited the availability of stakeholders. While it is entirely possible for interventions to succeed without stakeholder engagement, skipping it is nevertheless a substantial missed opportunity. Stakeholder engagement is not just a means to an end; it leads to improved design and implementation, along with improved relationships with stakeholders and a better understanding of those we work with and serve. It will therefore remain a key part of the Proving Ground process going forward, regardless of the outcome of this project. In fact, we are building out plans to provide CRC partners with future opportunities to build stakeholder engagement capacity. But if this project succeeds, we will at least know it is possible to adjust our process to respond to urgent needs; it is a tradeoff we would make only under urgent circumstances.
Looking Ahead
By our next installment of Improving Improvement, we should start seeing results trickle in from our 50 or so partner districts running RCTs this fall. Stay tuned for more lessons learned from the results of pilots implemented during the pandemic.
We are also always open to additional suggestions for topics for future editions of Improving Improvement. Reach out to us with any questions you have about our networks, continuous improvement process, or ideas you would like to see us tackle.
David Hersh (david_hersh@gse.harvard.edu) is Director of Proving Ground.
Suggested citation: Hersh, D. (2020). Improving Improvement: Customizing an RPP for a Pandemic. NNERPP Extra, 2(4), 17-21. https://doi.org/10.25613/P08F-V102