Evidence of Impact: Challenges and New Directions
May 19-20, 2006
Introduction to the Conference
The conference opened with a presentation that reflected on the Humanitarian Impact Conference of 2004. The original intent was to create a forum where policymakers, donors, academics, evaluation professionals, and field practitioners from the humanitarian sector could engage in a productive conversation about the progress, gaps and direction of measurement and evaluation. Reflecting on the progress since 2004, Anisya Thomas noted that while the concepts and ideas of the conference had been carried forward in a number of initiatives at the agency and sector levels (the Tsunami Evaluation Coalition, the Interagency Working Group, Mary Anderson’s Listening Project, Fritz Institute’s beneficiary studies, etc.), the formal attempts at collaborating around impact had not been successful. Thus, the goals for the 2006 conference were more modest: to create a neutral space where participants could ask better questions, understand the range and diversity of current initiatives on humanitarian effectiveness, accountability and impact, and critically reflect on missing components.
The emergence of evidence-based medicine to improve the quality of healthcare in America was presented as a potential model, many components of which could be applied to the humanitarian sector. These included a commitment by all constituencies to a statement of purpose, the adoption of a set of principles to guide processes, the identification of sector-wide priorities, the creation of an environment that fosters and rewards improvement, and the preparation of the workforce to better serve beneficiaries.
Introduction to Issues of Impact Measurement
Following the introduction, each participant presented an opinion about the most challenging or frustrating aspects of assessing impact. The common themes that emerged were:
Defining Impact: Different organizations perceive impact differently and utilize different levels of analysis. Similarly, donors and grant makers have different measurement criteria, resulting in a morass of paperwork and data and very little usable information. Although impact is a term that is used a great deal, it means different things to different people. It is particularly hard to define impact when organizations do not, or are unable to, clearly articulate their theory of change or the outcome they intended to achieve. Furthermore, in multilateral operations and federated organizations, the contribution of a single organization is hard to assess.
The Purpose of Impact Measurement: There is a lack of clarity about what impact assessments are supposed to accomplish: accountability to donors, learning to improve humanitarian interventions, or accountability to beneficiaries. Donors do not seem to be significantly interested in impact measurement, as it is longer term, and they rarely provide resources for it. Further, it was suggested that impact evaluation cannot be done on a reactive basis; it has to be part of the preparedness plan of organizations that respond to disasters.
Progress in Impact Measurement: Despite all the problems, there has been progress in understanding the importance of quality and creating standards. SPHERE, HAPI and ALNAP have all helped the sector in tangible ways, although their impact is hard to quantify or measure. It was noted, however, that even in this realm the ‘beltway bandits’, contractors and the military seemed to have better systems and a greater ability to quantify their contributions. This puts the humanitarian sector in a vulnerable position, as it is unable to convey its capacity and contribution. There was also the observation that very little had changed since the sector-wide evaluations of Rwanda. The Tsunami Evaluation Coalition (TEC) and Darfur evaluations are revealing some of the very same issues that were highlighted in the Rwanda evaluations over 10 years ago.
The Consequences of Impact Measurement: Despite the accountability feature of impact evaluations, no consequences exist for organizations or individuals whose programs demonstrate poor performance. Some participants observed with sadness that despite 40 years of interventions, the sector still does not know, or want to know, how well it performs. Even when the negative consequences of some strategies are known, there is inertia or resistance to behavioral change. There is a fear that the political will does not exist to truly understand the weaknesses of the sector and duly address them.
The Practicality of Impact Measurement: Field staff are not properly trained to understand and operationalize impact. In the absence of standardized tools and approaches, there is a great deal of confusion about what to measure, how to measure it, and what to do with the information once it is collected. Given the transience of field staff, training and maintaining institutional processes in the field are also challenges to effective measurement. In addition, the capacity and competence of local staff are rarely acknowledged. Many argued that, at the end of the day, it is the local organizations, the partners on the ground, that make the difference. However, they are not trusted, trained or heard.
The Science of Measuring Impact: There is an academic and research community that explores these issues and can provide assistance in advancing the measurement of impact. The humanitarian context in particular requires a multidisciplinary approach that first identifies what needs to be measured and then defines the type of measurement capacity required. Clear demarcations between levels of analysis; independent, dependent, and confounding variables; and rigorous statistical analysis must all be considered. Building bridges to this community and creating joint initiatives can be very helpful to the sector. On the other hand, it was also acknowledged that working with academics was very hard, as their frames of reference were so different and their understanding of the field so limited.
The Future of Impact Evaluations: Moving toward joint evaluations, collaborating on the creation of standardized evaluation tools and processes, and using third parties were some of the ideas put forward for the future of impact evaluations. It was also noted that donors needed to take impact evaluation more seriously in making funding decisions, penalizing organizations with poor track records of achieving stated outcomes.
Nonetheless, there needs to be caution in advocating the use of third parties, as the learning aspect may be lost if impact is always evaluated by external bodies. Similarly, rigor should be balanced with ethics and the realities of the humanitarian mandate. While randomized controlled studies may be the most rigorous, how does one decide, in designing the experiment, which beneficiaries receive the service and which do not?
Finally, there needs to be greater clarity on how impact evaluations will be used to facilitate organizational learning and the improved delivery of services. Understanding beneficiary perspectives and examining the consequences of organizational interventions is critical to the central mandate of the sector.
Lessons from the Tsunami: Case Studies in Impact Evaluation
The first session was intended to assess the lessons learned from the tsunami, where for the first time in the sector’s recent history, resource constraints could not be blamed for a failure to evaluate impact. The session began with a discussion of the Tsunami Evaluation Coalition (TEC) led by Susanne Freuh of UN OCHA. This sector-wide learning and accountability initiative, comprising 40 UN agencies, donors and NGOs, cost about $2 million and involved over 200 people. TEC’s achievements thus far include being the first major system-wide evaluation since Rwanda, gathering lessons about the execution of a multi-agency process (including securing early in-country stakeholder support), promoting the early establishment of performance indicators, and identifying good practice rather than only what did not work. There was also acknowledgement that the TEC evaluation fell short in involving beneficiaries, engaging local and regional actors and addressing the impact question.
Providing an agency perspective on impact evaluation, the second presentation, by Peter Medway, delved into the International Medical Corps’ approach to tsunami relief, which was highly participatory, iterative and inclusive, engaging several impact evaluation tools. Lessons learned included the importance of focusing on specific areas, responding to local priorities, targeting marginalized groups, and maintaining a long-term presence. The third and final presentation described the methodology and findings of the Fritz Institute study on beneficiary perceptions of aid effectiveness, which surveyed thousands of households 48 hours, 2 months and 9 months after the tsunami. Lessons learned included evidence that beneficiary satisfaction can be quantitatively assessed, and that objective assessments of beneficiary perceptions can inform the humanitarian sector of the efficacy of its interventions from the view of a core stakeholder.
Conference attendees commended the multi-agency TEC approach, but suggested that there was still work to be done to convince the donors that a sector-wide evaluation was sufficient for their purposes in lieu of individual agency evaluations. Recognizing the value of all of the approaches presented, the audience emphasized the importance of distributing the findings and data as widely as possible and in a timely manner. Advocacy for such initiatives emerged as an important next step for humanitarian agencies in order to attract adequate human and financial resources from donors.
Let’s Be Honest About Impact Evaluations: A Discussion
The second session revolved around many of the crucial questions posed by conference attendees in the introductory session, such as how to balance accountability to donors with accountability to beneficiaries, whether impact evaluations are sanitized, and whether evaluation information can be used to improve institutions. After reviewing again some of the inherent problems in the system, the discussion leaders, Peter Walker from Tufts University and Niels Dabelstein from DANIDA, focused on ways to improve the system. They asserted that the management of humanitarian agencies must be open to criticism, committed to learning and equipped with systems to follow up on evaluations. They emphasized that the professional culture of humanitarian agencies needs to evolve to foster active learning. Furthermore, for the evaluation process also to be a learning process, they suggested that lessons must benefit the team that extracts them, learning and change must start at the beginning of the action, lessons must link explicitly to future action, and leaders must hold everyone, especially themselves, accountable for learning.
Tools and Approaches to Impact Evaluation
The intent of this session was to get a sense of different initiatives that were approaching the impact question from different perspectives. The first presentation, by Nick Stockton of the Humanitarian Accountability Project (HAP), compared two alternative models of decision making in humanitarian intervention, an inductive model and a deductive model. Using a scenario planning and modeling approach, he illustrated the difference in the type of decisions made by two hypothetical NGOs, QualAid and FasterAid. His conclusion was that optimal effort on early consultation, communications and complaint handling saves time, money and lives relative to the more common approach of getting to the field quickly, setting up services, evaluating the services and modifying the approach. The primary lesson for the audience was that deductive learning methods based upon quality management techniques have great potential for improving humanitarian outcomes.
The next presentation, on the Standardized Monitoring and Assessment of Relief and Transition (SMART) initiative, by Skip Burkle of the University of Hawaii, outlined a decade-long effort by several agencies and USAID to standardize data collection around two core indicators: crude mortality and under-five nutrition. His view was that the real value of the initiative lay in aggregating data across organizations and countries, which could then be further explored through qualitative analysis. After years spent creating the initiative and coming to agreement on its parameters, the next steps are to build capacity in participating organizations through training and to institutionalize the methodology by integrating it into training programs.
The third presentation on tools and approaches to impact evaluation illustrated how the Emergency Capacity Building (ECB) initiative, a collaborative effort of the Interagency Working Group (a consortium of the heads of emergency management of the seven largest NGOs, funded by the Gates Foundation), is attempting to address the need for practical impact evaluation tools. One of its outputs, a “How-to Guide” for field practitioners, will be available this summer; it is designed to be realistic, to account for available resources, to be useful to those applying it, to be as simple as necessary, and to be based on widely accepted humanitarian values, standards and guidelines.
The final presentation in this session, by Gregg Greenough of the Harvard Humanitarian Initiative, described the vision and process of developing a standardized tool for assessing the efficacy of HIV/AIDS programs among African Red Cross organizations. Commissioned by Fritz Institute on behalf of the New Partnership for African Red Cross and Red Crescent Societies (NEPARC), the tool was developed by bringing together the extant literature on measurement and HIV/AIDS and through observation of, and interaction with, numerous field programs of the Kenya Red Cross Society. The intent of the tool is to create a means by which the efficacy of programs across African Red Cross organizations can be compared and best practices shared.
The Donor Perspective
As a donor, Per Byman of SIDA acknowledged the tension between donors needing as much information as possible for assessment purposes and the recognition that asking for too much information is counter-productive. In order to account for both the effectiveness and efficiency of implementing agencies, he acknowledged that donors must be prepared to pay for quality in humanitarian systems by providing resources for reporting, evaluation, follow-up, and the creation of benchmarks and indicators. Conference participants expressed concerns over the amount of political influence that affects donor decision-making, to which the speaker responded that quality impact evaluations and closer collaboration with the field can in fact provide the evidence to influence decision-making and challenge political influence.
Assessing the Impact of Disaster Funding at the World Bank: A Three-Year Journey through Twenty Years of Projects
In this case study of a comprehensive and ambitious evaluation of 20 years of World Bank loans (528 separate loans) for natural disaster assistance (which represented approximately 10% of the Bank’s loans during the period), Ron Parker, who led the evaluation, described the impetus for the project, the enormous time and resources it consumed, and the influence it is having on policy development at the Bank. The study illustrated numerous contradictory policies, fragmentation in the information available to various parts of the Bank, and trends that were significant to the future policies of the organization.
Among the findings were that, when considering new loans, the Bank was not taking into account the amount of disaster funding a particular country had received in the past, the disaster vulnerability of some countries, or the recurring nature of disasters in those countries, and that reconstruction was funded repeatedly while mitigation received almost no funding. Further, it was found that a natural disaster can instantaneously wipe out decades of development assistance, a link that had hitherto been overlooked. In Mozambique, for example, while Bank lending financed 487 schools over 20 years, the floods in 2000 damaged or destroyed approximately 500 primary schools and 7 secondary schools. Recommendations stemming from this study included preparing Country Assistance Strategies that take into account the differing vulnerabilities of borrowing countries, modifying current operational guidelines to address long-term needs, and ensuring that sufficient specialized expertise is available for quick response. The presentation illustrated how a retrospective analysis of an organization’s own performance can be used to guide learning and evidence-based policy.
Impact Evaluations: The Way Forward
The intent of this session was to explore some high-potential initiatives that are changing behavior in the humanitarian sector. The first presentation, on “Building Back Better”, by Robert Piper from the UN Office of the Special Envoy for Tsunami Recovery, led by President Clinton, defined the concept popularized by the former President after the tsunami as leaving survivors safer than they were pre-disaster and not perpetuating bad development patterns or practices. Developing such specific criteria provides a meaningful framework and strategic plan, offers important insight to constituencies, and helps manage expectations. However, the presenter raised questions as to whether the concept is relevant to relief or only to recovery, whether the details have already been defined through existing processes, how the concept relates to the speed of recovery, and the fact that planning and assessment tools do not match.
The tsunami Recovery Impact Assessment & Monitoring System (TRIAMS), which seeks to assist governments, donors, partners, and beneficiaries in identifying the impact of tsunami response and in measuring recovery rates, was presented by Margaret Stansberry of the IFRC. This initiative, led by the IFRC and WHO, seeks to encourage national governments of countries affected by the tsunami to monitor recovery. Some of the challenges facing the initiative include lack of agreement regarding monitoring and evaluation frameworks and terminology, data fatigue in tsunami countries, difficulties in balancing the information needs of multiple stakeholders, and balancing the need for quality information with the primacy of the governments and lead reconstruction agencies. However, TRIAMS promotes sustainability and capacity building locally and there is interest and commitment across stakeholders to improve joint monitoring and reporting.
The final presentation, by Shimelis Adugna of the Ethiopian Red Cross, described the New Partnership for African Red Cross and Red Crescent Societies (NEPARC), an African-led initiative that seeks to strengthen local capacity so that African National Societies can effectively respond to local disasters and be more equal partners with international humanitarian agencies. The core methodology of NEPARC is a series of third-party audits of governance, program effectiveness and sustainability, applied across the member organizations to baseline and benchmark performance on these distinct dimensions (the audits are being developed for NEPARC by Fritz Institute and its partners). The audits serve as a measure of capacity, a foundation for discussions of good practice, and evidence to donors of the links between capacity and effectiveness.
Conference participants were very supportive of the initiatives presented and the subsequent discussions focused on recommendations for strengthening them, fostering more collaboration and widening their application. For the “Build Back Better” criteria to be applicable, conference attendees emphasized the importance of taking into account context-specific factors. Recommendations for strengthening TRIAMS included developing dedicated capacity to collect, collate and communicate data and analysis. Conference participants also emphasized that the initiative has wide applicability in any disaster context and should not be solely associated with the tsunami. There was also enthusiastic support for the accreditation concept, as exemplified by NEPARC, and suggestions that such a process of audits could be replicated in other organizations, particularly within other federations.
The conference concluded with each participant reflecting on what he or she would take away from the conference. Most participants felt that the conference had been extremely useful in including the dominant initiatives in the sector and provoking thoughtful discussion. The consensus among attendees was that rather than new initiatives, the humanitarian community needs ways for organizations to collaborate on existing efforts and prevent duplication. Moreover, there was an expressed need and interest in continuing to develop joint monitoring systems, multi-agency evaluation mechanisms and impact evaluation standards to foster joint thinking and further agency collaboration. Several attendees supported a movement toward the enforcement of standards, followed by certification, advocated by donors but also supported by the wider humanitarian sector.
An additional theme emerging during the conference conclusion was the importance of developing impact evaluations that are relevant to the situation on the ground, because the improvement of field-level practice is the ultimate goal. Thus, attendees urged that impact evaluations be flexible and grounded in the needs of beneficiaries. Several participants acknowledged that beneficiary surveys are becoming more common and that they can be used in concert with other impact assessment methods to ensure triangulation. All of the attendees appreciated the opportunity to interact with the other conference participants in order to share their own challenges and lessons learned, learn about new and existing initiatives in impact evaluation, and strengthen their networks. The next Humanitarian Impact Conference will be hosted by Fritz Institute in June 2008.