Impact Evaluation

When designing and implementing an impact evaluation, evaluators must carefully consider an array of factors before selecting the approach most appropriate for evaluating the intervention. Experimental designs and strong quasi-experimental designs can help researchers move beyond assessing an intervention’s outcomes to providing evidence of the intervention’s impacts. Experimental designs use random assignment to form the intervention group (sometimes called a “treatment” or “program” group) and the control group, allowing researchers to assess the net impact of the intervention on an outcome of interest (e.g., employment, credential attainment). Quasi-experimental impact designs use methods other than random assignment to form the intervention and comparison groups. Researchers differ on which methodological approaches and techniques are best, and each new study adds information on which techniques yield credible evidence of an intervention’s effectiveness.
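
As a rough illustration of why the way groups are formed matters, the sketch below simulates a hypothetical intervention with a known impact and compares a simple difference in mean outcomes under two scenarios: random assignment, and self-selection driven by an unobserved trait. Everything in it is an assumption made for illustration (the function name `simulate_impact`, the "motivation" trait, the outcome equation, and the numeric values); none of it comes from the resources listed on this page. The self-selected comparison is a naive benchmark, not a quasi-experimental method; quasi-experimental designs exist precisely to reduce this kind of bias.

```python
import random
import statistics


def simulate_impact(n=10_000, true_impact=2.0, seed=0):
    """Return (randomized_estimate, self_selected_estimate) of the impact.

    Hypothetical data-generating process, assumed for illustration only:
    outcome = 10 + 3 * motivation + true_impact * treated + noise
    """
    rng = random.Random(seed)
    # Unobserved trait that raises the outcome regardless of the intervention.
    motivation = [rng.gauss(0, 1) for _ in range(n)]

    def diff_in_means(assignment):
        treated_outcomes, comparison_outcomes = [], []
        for m, treated in zip(motivation, assignment):
            y = 10 + 3 * m + true_impact * treated + rng.gauss(0, 1)
            (treated_outcomes if treated else comparison_outcomes).append(y)
        return statistics.mean(treated_outcomes) - statistics.mean(comparison_outcomes)

    # Experimental design: a coin flip decides who receives the intervention,
    # so motivation is balanced across the two groups on average.
    randomized = [rng.random() < 0.5 for _ in range(n)]

    # Non-random group formation: more motivated people opt in,
    # so the groups differ even before the intervention starts.
    self_selected = [m > 0 for m in motivation]

    return diff_in_means(randomized), diff_in_means(self_selected)


if __name__ == "__main__":
    randomized_est, self_selected_est = simulate_impact()
    print(f"True impact:                 2.00")
    print(f"Random assignment estimate:  {randomized_est:.2f}")
    print(f"Self-selected estimate:      {self_selected_est:.2f}")
```

Under these assumptions, the random-assignment estimate should land close to the true impact, while the self-selected comparison overstates it because motivated participants would have had better outcomes anyway.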

While by no means exhaustive, the following list provides links to examples of impact evaluation resources TAACCCT third-party evaluators may find helpful. Neither the U.S. Department of Labor nor Abt Associates endorses these materials or their authors. Some materials are publicly available and hyperlinks have been provided. Others must be purchased either directly from the publisher or from a bookseller, or can be downloaded via journal access databases.

Resource Table Menu

Experimental Design Resources
Quasi-Experimental Design Study Resources

Disclaimer: Neither Abt Associates, nor its TAACCCT evaluation partners, is responsible for the contents of any “off-site” web page referenced from this server or from private, third-party, pop-up, or browser-integrated software or applications. You are subject to that site’s privacy policy when you leave the TAACCCT Evaluation website. We are not responsible for Section 508 compliance (accessibility) on other websites.

Experimental Design Resources

Document Title/Link	Description	Author/Institute
Checklist For Reviewing a Randomized Controlled Trial of a Social Program or Project – To Assess Whether It Produced Valid Evidence	Updated in 2010 by the Coalition for Evidence-Based Policy, this checklist includes key items to look for in reading the results of a randomized controlled trial of a social program, project, or strategy to assess whether it produced valid evidence on the intervention’s effectiveness.	Coalition for Evidence-Based Policy
Impact Evaluation in Practice	This World Bank publication provides an introduction to the topic of impact evaluation and its practice in development.	Paul J. Gertler, Sebastian Martinez, Patrick Premand, Laura B. Rawlings, and Christel M.J. Vermeersch
Methods of Randomization in Experimental Design	The book discusses procedures of random assignment and local control in between-subjects experimental designs and the counterbalancing schemes in within-subjects or cross-over experimental designs.	Valentim Alferes
Optimal Design for Longitudinal and Multilevel Research: Documentation for the "Optimal Design" Software	This manual provides researchers with a guide to effectively designing a cluster randomized trial for a continuous outcome, including discussions of power analysis.	Jessaca Spybrook, Stephen W. Raudenbush, Xiao-feng Liu, Richard Congdon, and Andrés Martínez
Strategies for Improving Precision in Group-Randomized Experiments	This article considers strategies for achieving adequate power in experimental designs.	Stephen W. Raudenbush, Andrés Martínez, and Jessaca Spybrook
The Core Analytics of Randomized Experiments for Social Research	This MDRC working paper examines the core analytic elements of randomized experiments for social research.	Howard S. Bloom (MDRC)

Quasi-Experimental Design Study Resources

General

Document Title/Link	Description	Author/Institute
Causal Effects in Non-Experimental Studies: Re-Evaluating the Evaluation of Training Programs	This article uses propensity score methods to estimate the treatment impact of the National Supported Work Demonstration.	Rajeev Dehejia and Sadek Wahba
Experimental and Quasi-Experimental Designs for Generalized Causal Inference	This book is a 2001 successor to the original Cook and Campbell text, Quasi-Experimentation: Design and Analysis Issues for Field Settings.	William Shadish, Thomas Cook, and Donald Campbell
Quasi-Experimentation: Design and Analysis Issues for Field Settings	This statistical textbook details a range of quasi-experimental approaches.	Thomas Cook and Donald Campbell
Which Comparison-Group (“Quasi-Experimental”) Study Designs Are Most Likely to Produce Valid Estimates of a Program’s Impact?: A Brief Overview and Sample Review Form	This report seeks to answer which quasi-experimental studies are most likely to produce valid estimates of a program’s impact.	Coalition for Evidence-Based Policy

Propensity Score Matching

Document Title/Link	Description	Author/Institute
Does Matching Overcome LaLonde’s Critique of Nonexperimental Estimators?	This paper applies cross-sectional and longitudinal propensity score matching estimators to data from the National Supported Work (NSW) Demonstration that have been previously analyzed by LaLonde (1986) and Dehejia and Wahba (1999, 2002).	Jeffrey A. Smith and Petra E. Todd
Efficient Estimation of Average Treatment Effects Using the Estimated Propensity Score	This article discusses weighting and propensity score methods to estimate average treatment effects.	Keisuke Hirano, Guido W. Imbens, and Geert Ridder
Evaluating Multi-Treatment Programs: Theory and Evidence from the U.S. Job Training Partnership Act Experiment	This paper discusses various considerations in evaluating programs that offer multiple treatments to their participants.	Miana Plesca and Jeffrey Smith
Evaluating the Effects of a Mandatory Government Program using Matched Groups within a Similar Geographic Location	This paper discusses three nearest neighbor matching estimators and compares them with experimental benchmark results, following the literature initiated by LaLonde (1986).	Wang-Sheng Lee
Finite-Sample Properties of Propensity Score Matching and Weighting Estimators	This paper analyzes the finite-sample properties of matching and weighting estimators often used for estimating average treatment effects.	Markus Frölich
Matching on the Estimated Propensity Score	This article discusses deriving propensity score matching estimators.	Alberto Abadie and Guido W. Imbens
Matching Using Estimated Propensity Scores: Relating Theory to Practice	This paper seeks to bridge theoretical approximations and practice.	Donald Rubin and Neal Thomas
On Assessing the Specification of Propensity Score Models	This 2007 paper discusses a graphical method and a closely related regression test for assessing the specification of the propensity score.	Wang-Sheng Lee
Propensity Score Analysis: Statistical Methods and Applications	This book provides a systematic review of the origins, history, and statistical foundations of propensity score analysis.	Shenyang Guo and Mark Fraser
Propensity Score Matching in SPSS	This paper discusses how to implement various propensity score matching methods in SPSS.	Felix Thoemmes
Propensity Score Matching and Variations on the Balancing Test	This paper discusses balancing tests when using propensity score matching methods.	Wang-Sheng Lee
Reconciling Conflicting Evidence on the Performance of Propensity-Score Matching Methods	This paper summarizes results from a larger paper (Smith and Todd, 2001) that uses experimental data combined with nonexperimental data to evaluate the performance of alternative nonexperimental estimators.	Jeffrey A. Smith and Petra E. Todd
Sensitivity Testing of Net Impact Estimates of Workforce Development Programs Using Administrative Data	This paper addresses the question of whether administrative data sources, such as performance monitoring data, can be used for program evaluation purposes.	Kevin Hollenbeck
Some Practical Guidance for the Implementation of Propensity Score Matching	This paper discusses implementation issues and provides guidance to researchers who want to use PSM for evaluation purposes.	Marco Caliendo and Sabine Kopeinig
Using Matching to Estimate Treatment Effects: Data Requirements, Matching Metrics, and Monte Carlo Evidence	This article provides guidance on how to choose among different matching estimators and matching metrics.	Zhong Zhao

Regression Discontinuity Design

Document Title/Link	Description	Author/Institute
A Practical Guide to Regression Discontinuity	This practitioners’ guide to implementing regression discontinuity designs discusses strengths and weaknesses of various techniques.	Robin Jacob, Pei Zhu, Marie-Andrée Somers, and Howard Bloom
Does Head Start Improve Children’s Life Chances? Evidence from a Regression Discontinuity Design	This paper exploits a new source of variation in Head Start funding to identify the program’s effects on health and schooling using regression discontinuity design.	Jens Ludwig and Douglas L. Miller
Estimating the Effect of Financial Aid Offers on College Enrollment: A Regression-Discontinuity Approach	This paper uses regression-discontinuity design to estimate the effect of financial aid on enrollment.	Wilbert van der Klaauw
Identification and Estimation of Treatment Effects with a Regression Discontinuity Design	This paper discusses identifying conditions in regression discontinuity design.	Jinyong Hahn, Petra Todd, and Wilbert van der Klaauw
Manipulation of the Running Variable in the Regression Discontinuity Design: A Density Test	This National Bureau of Economic Research paper develops a test of manipulation related to continuity of the running variable density function.	Justin McCrary
Regression Discontinuity Designs in Economics	This paper discusses the basic theory behind the regression discontinuity (RD) research design, when RD is likely to be valid or invalid, and summarizes different ways of estimating RD designs and their limitations.	David S. Lee and Thomas Lemieux
Regression Discontinuity Designs: A Guide to Practice	This paper reviews some of the practical and theoretical issues involved in the implementation of regression discontinuity designs.	Guido Imbens and Thomas Lemieux
Regression Discontinuity in Prospective Evaluations: The Case of the FFVP Evaluation	This paper uses regression discontinuity design for an impact evaluation of the USDA Food and Nutrition Service’s Fresh Fruit and Vegetable Program (FFVP).	Jacob Alex Klerman, Lauren E. W. Olsho, and Susan Bartlett
Regression Discontinuity Inference with Specification Error	This paper proposes a simple econometric procedure to account for uncertainty in the choice of functional form for regression discontinuity designs with discrete support.	David Lee and David Card
Technical Methods Report: Statistical Power for Regression Discontinuity Designs in Education Evaluations	This report examines theoretical and empirical issues related to the statistical power of impact estimates under clustered regression discontinuity designs.	Peter Z. Schochet
“Waiting for Life to Arrive”: A History of the Regression-Discontinuity Design in Psychology, Statistics, and Economics	This paper reviews the history of the regression discontinuity design in three academic disciplines.	Thomas D. Cook
What’s New in Econometrics	Part of the National Bureau of Economic Research’s 2007 Summer Institute, this lecture reviews five practical issues in implementation of regression discontinuity designs.	Guido Imbens and Jeffrey Wooldridge

Other QED Methods

Document Title/Link	Description	Author/Institute
Applied Longitudinal Data Analysis	This book discusses two popular statistical methods for analyzing longitudinal data: multilevel modeling of individual change and hazard/survival modeling for event occurrence in both discrete- and continuous-time.	Judith Singer and John Willett
Estimating Program Impacts on Student Achievement Using "Short" Interrupted Time Series	This MDRC working paper examines the use of interrupted time-series analysis for estimating the impacts of school restructuring programs designed to increase student achievement.	Howard S. Bloom
Evaluating the Differential Effects of Alternative Welfare-to-Work Training Components: A Re-Analysis of the California GAIN Program	This paper explores various ways of combining experimental data and non-experimental methods to estimate the differential effects of components of training programs.	V. Joseph Hotz, Guido Imbens, and Jacob Klerman
Matching as Nonparametric Preprocessing for Reducing Model Dependence in Parametric Causal Inference	This paper discusses how to avoid misinterpretations of results when using matching methods.	Daniel Ho, Kosuke Imai, Gary King, and Elizabeth Stuart
The Validity and Precision of the Comparative Interrupted Time Series Design and the Difference-in-Difference Design in Educational Evaluation	This paper examines the validity and precision of two nonexperimental study designs that can be used in educational evaluation: the comparative interrupted time series design and the difference-in-difference design.	Marie-Andrée Somers, Pei Zhu, Robin Jacob, and Howard Bloom
Using "Short" Interrupted Time-Series Analysis to Measure the Impacts of Whole-School Reforms: With Applications to a Study of Accelerated Schools	This paper provides a conceptual rationale, statistical procedures, and strengths and limitations of using interrupted time-series analysis in education research.	Howard S. Bloom