IMPROVING IMPROVEMENT: HOW TO KNOW AN IMPROVEMENT EFFORT IS SUCCEEDING?

David Hersh | Proving Ground

Volume 3 Issue 4 (2021), pp. 14-16

This is the sixth installment of Improving Improvement, our quarterly series focused on leveraging the power of research-practice partnerships (RPPs) to build schools’, districts’, and states’ capacity to improve. In our previous article, we laid out the work ahead for our partnerships and the questions we hope to answer in the 2021-2022 school year. In this installment, we are taking a step back to reflect on how we evaluate the success of our improvement efforts.

Defining Success

As with any program evaluation, the starting point for Proving Ground’s assessment of whether we are succeeding is a clear definition of success. What does success look like? In the last installment of Improving Improvement, we noted that our ultimate goal is for our partners to make evidence-based continuous improvement, which includes piloting and evaluating interventions, part of the ordinary course of business in their agencies. When we break down this goal, we see two components of success: 1) our partners practice evidence-based continuous improvement, and 2) they do so routinely, in the ordinary course of business. We developed operational definitions for both constructs, “evidence-based continuous improvement” and “ordinary course of business,” outlined below.

For Proving Ground, evidence-based continuous improvement involves a set of core activities or competencies executed sequentially and repeated until the desired amount of improvement is achieved. The core activities are, in order:

1. Clearly define the problem and set an improvement goal for it
2. Identify root causes
3. Identify a set of potential interventions aligned to the root causes
4. Prioritize a potential intervention from that set to try
5. Design the intervention using user-centered design principles
6. Plan for implementation and progress monitoring
7. Pilot to generate evidence of impact
8. Use evidence from the pilot to decide whether to stop, scale or adapt
9. Reflect on the results in light of your improvement goal

Thus, our partners are practicing evidence-based continuous improvement if they are executing these activities. Success, however, implies doing so with some minimal threshold of fidelity. What if partners identify root causes that aren’t really root causes? What if they pilot, but in ways that generate only spurious evidence of impact? We therefore developed a rubric that defines in some detail the characteristics of optimal execution. For example, the quality of piloting might range from not piloting at all to doing so in a way that supports a causal inference of impact.

In this way we can check how well each partner is doing while they work with us (all of our partners execute at least one improvement cycle during our engagement). However, doing it well while we are there to support them says little about how well they would do it after our engagement ends, or whether they would do it at all. You can only make so much impact in one cycle, which is why capacity building is an important part of our role. We need to assess not just the quality with which partners execute the competencies of improvement while they work with us, but whether they continue doing so, and how well, after we are gone.

In a similar way, success also requires some minimal level of coverage. If partners are great at continuous improvement cycles but only run them on one team or for one narrow problem of practice a year, the likelihood of meaningful long-term impact is small. Our goal therefore requires making continuous improvement part of the ordinary course of business; it needs to be part of the fabric of the organization. Recognizing that it would be impractical to see thorough application in every area, we operationalized “ordinary course of business” to mean that partners apply evidence-based continuous improvement to all problems of practice aligned with strategic priorities.

Success for us, therefore, means that our partners have institutionalized the practice of continuous improvement so thoroughly that they continue practicing the core competencies of improvement with fidelity, for all strategically aligned problems of practice, long after our engagement ends.

Measuring Progress

Operational definitions of success are necessary for tracking progress, but they are not in and of themselves metrics that can be used to measure it. The next step for us is to develop measurable targets (and measurement instruments where needed) based on the operational definitions [1]. For example, did we succeed with a partner that practices all but two of the core competencies of evidence-based continuous improvement, as defined earlier? What if they do them all, and do them well, but only for some of their strategic priorities? What if they do them all, for all strategic priorities, but each with limited fidelity? There is also a time component to each question: by when do partners need to be fully practicing evidence-based continuous improvement in the ordinary course of business? It is not realistic to expect that our partners will fully institutionalize these practices during our engagements, so we need both progress markers (leading indicators that suggest they will get there) and a way of gathering data about their practice after our engagement ends.

For our leading indicators, we are looking for necessary steps along the path to fully institutionalized continuous improvement, which we have organized into three phases:

Phase 1: Do our partners execute a high-quality improvement cycle while working with us? How well do they execute each competency, and to what degree is the decision they reach the optimal one given the evidence?

Phase 2: Based on their work with us, how confident are we that they are able to do this without us? Can they generalize from the model we worked on together to other problems of practice? Additionally, how confident are we that they are willing to do this without us? Have we created the internal demand?

Phase 3: Are they doing this after our engagement ends? How well, and for how many of their strategic priorities?

The first two phases establish the leading indicators. Our first leading indicator is the quality of the work our partners do while they are engaged with us. Using the rubric we developed, we can assess how they execute each competency, and we have (or will have) targets for the share of partners scoring at defined levels on each competency. The second leading indicator is more difficult to measure. We can reasonably rely on interactions, artifacts, and a partner self-assessment built from the same rubric to draw inferences about partners’ capacity to practice evidence-based continuous improvement without us, but we will have limited direct evidence of their likelihood of actually doing so. We will likely rely on a combination of self-reported intentions and, for some partners, the continuity planning work we are doing with them.
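
As a purely illustrative aside for readers who want to see the arithmetic behind such a target, the short sketch below shows one way a “share of partners at or above a defined rubric level” figure could be computed for each competency. The partner names, competencies, four-point scale, and threshold are all hypothetical; this is not Proving Ground’s rubric, instrument, or data.

```python
# Hypothetical sketch: roll up rubric scores into the share of partners
# scoring at or above a target level on each competency.
# Scores use an assumed 1-4 scale; names and numbers are made up.

rubric_scores = {
    "Partner A": {"problem definition": 4, "root cause analysis": 3, "piloting": 2},
    "Partner B": {"problem definition": 3, "root cause analysis": 2, "piloting": 4},
    "Partner C": {"problem definition": 4, "root cause analysis": 4, "piloting": 3},
}

TARGET_LEVEL = 3  # assumed "proficient" threshold on the rubric


def share_at_or_above(scores_by_partner, competency, threshold=TARGET_LEVEL):
    """Return the share of partners scoring at or above `threshold` on one competency."""
    scores = [s[competency] for s in scores_by_partner.values() if competency in s]
    if not scores:
        return None
    return sum(score >= threshold for score in scores) / len(scores)


if __name__ == "__main__":
    competencies = sorted({c for partner in rubric_scores.values() for c in partner})
    for competency in competencies:
        share = share_at_or_above(rubric_scores, competency)
        print(f"{competency}: {share:.0%} of partners at or above level {TARGET_LEVEL}")
```

In practice, any real metric of this kind would sit on top of the full rubric and actual partner scoring data, with targets set competency by competency.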

Phase 3 is the goal itself, but it is challenging to measure reliably. First, while we have a self-diagnostic rubric that we could administer to partners after the engagement ends, it is reasonable to assume response rates will decline the further removed we are from the engagement. The network is designed to be ongoing (all of our partners remain members of the PG Network after their direct engagement with us), but the intimacy of our connection is likely to fade. More problematically, the self-diagnostic is not designed for summative use, which makes it a poor instrument for evaluating how well Proving Ground is doing; it is a formative tool we and partners use to identify the competencies on which they need the most work. It is entirely possible that as partners’ sophistication in continuous improvement grows (a good outcome), their self-assessments will become more critical, resulting in lower scores (which would look like a bad outcome). Finally, there are likely to be issues with inter-rater reliability, as the individual(s) completing the rubric may change over time.

There are, therefore, real challenges in tracking progress toward our goal. We will, however, practice what we preach and continuously reflect on and improve how we measure our own success.

The Implicit Ultimate Goal

What has so far gone unsaid is that the goal laid out here is instrumental to a larger one. Practicing evidence-based continuous improvement is a means to an end: it rests on a theory that the best way to improve outcomes for students is to ensure education agencies are learning organizations that are highly skilled and disciplined in getting better. Framed this way, the real goal of our work is to improve student outcomes. If our theory is correct, partners will practice evidence-based continuous improvement in the ordinary course of business and, as a result, all the outcomes on which they practice it will improve. Improvement in our partners’ priority outcome areas should be measurably greater than in comparable agencies that have not worked with us or otherwise practiced evidence-based continuous improvement. We are working on a way to measure this that would also address the selection effects we are likely to encounter (we do not randomly choose partners). Stay tuned for more on where we land.

Looking Ahead

In the next installment of Improving Improvement, we’ll share updates on the progress of our newest networks, the Georgia Improvement Network and the Rhode Island LEAP Support Network, including lessons learned from partnering with states to support districts on their improvement journeys.

We are also always open to suggestions for topics for future editions of Improving Improvement. Please reach out to us with any questions you have about our networks or our continuous improvement process, or with ideas you’d like to see us tackle.

David Hersh (david_hersh@gse.harvard.edu) is Director of Proving Ground.

[1] While we have developed these, including them in detail here is beyond the scope of this article.

Suggested citation: Hersh, D. (2021). Improving Improvement: How to Know an Improvement Effort is Succeeding? NNERPP Extra, 3(4), 14-16.

NNERPP | EXTRA is a quarterly magazine produced by the National Network of Education Research-Practice Partnerships | nnerpp.rice.edu