Editors’ Notes

Chapter 1: Validity Frameworks for Outcome Evaluation

Campbellian Validity Typology

Content of the Campbellian Validity Framework

The Campbellian Validity Typology and Program Evaluation

Critiques of Campbellian Typology

Relationship Between the Campbellian Typology and Program Evaluation

Toward a New Perspective of a Comprehensive Validity Typology for Program Evaluation

Chapter 2: What Works for Whom, Where, Why, for What, and When? Using Evaluation Evidence to Take Action in Local Contexts


What Works? What Do You Mean?

The Traditional Validity Framework

Applicability of the Traditional Framework to Modern Evaluation Practice

Toward a More Systematic Process for Making Predictions


Chapter 3: New (and Old) Directions for Validity Concerning Generalizability

Generalizability and External Validity

Enhancing Knowledge About Generalizability in the Campbellian Tradition

Diverging Traditions in Evaluation

Recommendations for Future Practice and for Areas of Future Development

Chapter 4: Criticisms of and an Alternative to the Shadish, Cook, and Campbell Validity Typology

The Context for Evaluating Validity

An Alternative Typology

Comparison With the SCC Typology


Chapter 5: Reframing Validity in Research and Evaluation: A Multidimensional, Systematic Model of Valid Inference

Logic of Valid Inference in the Campbellian Framework

Enhancing Coverage of Framework for Valid Inference

Supporting Conceptual Organization

Summary of Advantages of Dimensional Organization


Chapter 6: Conflict of Interest and Campbellian Validity

Campbell and Stanley’s Conception of Validity

Revised Conceptions

Including Conflict-of-Interest Threats



Chapter 7: The Construct(ion) of Validity as Argument

Making Interpretive Sense of Outcome Evaluation

Validity as Argument

Familiar and Unfamiliar Validities


Chapter 8: Assessing Program Outcomes From the Bottom-Up Approach: An Innovative Perspective to Outcome Evaluation

The Top-Down Approach to Validity Issues

Lessons Learned From Applying the Top-Down Approach in Program Evaluation

The Integrative Validity Model as an Alternative Typology to Address Validity Issues

The Bottom-Up Approach for Evaluating Health Promotion/Social Betterment Programs

The Usefulness of the New Perspective for Program Evaluation

Chapter 9: The Truth About Validity

Chen, Donaldson, and Mark

Gargani and Donaldson

Chen and Garbe

New Directions for Evaluation

Sponsored by the American Evaluation Association


Editor-in-Chief

Sandra Mathison, University of British Columbia

Associate Editors

Saville Kushner, University of the West of England
Patrick McKnight, George Mason University
Patricia Rogers, Royal Melbourne Institute of Technology

Editorial Advisory Board

Michael Bamberger, Independent consultant
Gail Barrington, Barrington Research Group Inc.
Nicole Bowman, Bowman Consulting
Huey Chen, University of Alabama at Birmingham
Lois-ellin Datta, Datta Analysis
Stewart I. Donaldson, Claremont Graduate University
Michael Duttweiler, Cornell University
Jody Fitzpatrick, University of Colorado at Denver
Gary Henry, University of North Carolina, Chapel Hill
Stafford Hood, Arizona State University
George Julnes, Utah State University
Jean King, University of Minnesota
Nancy Kingsbury, US Government Accountability Office
Henry M. Levin, Teachers College, Columbia University
Laura Leviton, Robert Wood Johnson Foundation
Richard Light, Harvard University
Linda Mabry, Washington State University, Vancouver
Cheryl MacNeil, Sage College
Anna Madison, University of Massachusetts, Boston
Melvin M. Mark, The Pennsylvania State University
Donna Mertens, Gallaudet University
Rakesh Mohan, Idaho State Legislature
Michael Morris, University of New Haven
Rosalie T. Torres, Torres Consulting Group
Elizabeth Whitmore, Carleton University
Maria Defino Whitsett, Austin Independent School District
Bob Williams, Independent consultant
David B. Wilson, University of Maryland, College Park
Nancy C. Zajano, Learning Point Associates

Editorial Policy and Procedures

New Directions for Evaluation, a quarterly sourcebook, is an official publication of the American Evaluation Association. The journal publishes empirical, methodological, and theoretical works on all aspects of evaluation. A reflective approach to evaluation is an essential strand to be woven through every issue. The editors encourage issues that have one of three foci: (1) craft issues that present approaches, methods, or techniques that can be applied in evaluation practice, such as the use of templates, case studies, or survey research; (2) professional issues that present topics of import for the field of evaluation, such as utilization of evaluation or locus of evaluation capacity; (3) societal issues that draw out the implications of intellectual, social, or cultural developments for the field of evaluation, such as the women’s movement, communitarianism, or multiculturalism. A wide range of substantive domains is appropriate for New Directions for Evaluation; however, the domains must be of interest to a large audience within the field of evaluation. We encourage a diversity of perspectives and experiences within each issue, as well as creative bridges between evaluation and other sectors of our collective lives.

The editors do not consider or publish unsolicited single manuscripts. Each issue of the journal is devoted to a single topic, with contributions solicited, organized, reviewed, and edited by a guest editor. Issues may take any of several forms, such as a series of related chapters, a debate, or a long article followed by brief critical commentaries. In all cases, the proposals must follow a specific format, which can be obtained from the editor-in-chief. These proposals are sent to members of the editorial board and to relevant substantive experts for peer review. The process may result in acceptance, a recommendation to revise and resubmit, or rejection. However, the editors are committed to working constructively with potential guest editors to help them develop acceptable proposals.

Sandra Mathison, Editor-in-Chief

University of British Columbia

2125 Main Mall

Vancouver, BC V6T 1Z4



Editors’ Notes

Disclaimer: The findings and conclusions of this article are those of the authors and do not necessarily represent the official position of the Centers for Disease Control and Prevention (CDC).

Decades ago, Suchman (1967) encouraged evaluators to apply Campbell and Stanley’s (1963) writings on experiments, quasi-experiments, and validity to evaluation. Since that time, the Campbellian validity typology, as presented in Campbell and Stanley (1963), Cook and Campbell (1979), and Shadish, Cook, and Campbell (2002), has been prominent in much of the theory and practice of outcome evaluation. Despite its influence, the Campbellian validity typology and its associated methods have been criticized, sometimes generating heated debates about the typology’s strengths and weaknesses for evaluation. For some readers such debates may form part of this issue’s subtext; for others, including evaluators new to the field and unfamiliar with such debates, the issue should still be of interest. Validity frameworks are important: they can inform thinking about evaluation, guide evaluation practice, and facilitate future development of evaluation theory and methods.

This issue had its origins in a panel at the 2008 conference of the American Evaluation Association. Led by Huey T. Chen, the session focused on theory and practice as related to external validity in evaluation. The session was motivated in part by the sense that new directions, and perhaps increased attention to some old directions, are needed to reach meaningful conclusions about evaluation generalizability. Session presenters, however, also addressed validity forms beyond external validity. In addition, as planning shifted from the conference session to this issue, newly added contributors planned to address issues other than external validity. As a result, after considering alternative framings, the issue evolved to its current theme: validity in the context of outcome evaluation.

The primary focus of most of the chapters is not on Campbell and colleagues’ validity typology per se, but rather on its application in the context of outcome evaluation. According to the Program Evaluation Standards (Joint Committee on Standards for Educational Evaluation, 1994), four attributes are essential for evaluation practice: utility, feasibility, propriety, and accuracy. The Campbellian typology offers clear strengths in addressing accuracy. However, it is less suited to address issues of utility, propriety, and feasibility. Perhaps a worthwhile direction for developing a comprehensive validity perspective for evaluation is to build on the Campbellian typology in ways that will better address issues related to all four attributes. This issue of New Directions for Evaluation is organized and developed in this spirit.

In general, we take the stance that we can further advance validity in outcome evaluation by revising or expanding the Campbellian typology. Chapter authors present multiple views on how to build on the Campbellian typology’s contribution and suggest alternative validity frameworks or models to serve program evaluation better. We hope that these new perspectives will advance theory and practice regarding validity in evaluation as well as improve the quality and usefulness of outcome evaluations.

Chapter authors propose the following strategies for developing a new validity perspective to advance validity in program evaluation.

Enhance External Validity

John Gargani and Stewart I. Donaldson, then Melvin M. Mark, focus on external validity. Gargani and Donaldson discuss limits of the Campbellian tradition regarding external validity. They argue that the external validity of an evaluation could be enhanced by better addressing questions about what works for whom, where, why, and when. Mark reviews several alternative framings of generalizability issues and, drawing on these alternatives, identifies potentially fruitful directions for enhancing external validity.

Enhance Precision by Reclassifying the Campbellian Typology

The chapters by Charles S. Reichardt and George Julnes offer conceptual revisions of the Campbellian typology. Reichardt identifies what he sees as flaws in the four types of validity in Shadish et al. (2002). He also offers his own version of a typology, which includes four criteria: validity, precision, generalizability, and completeness. Julnes proposes a validity framework with three dimensions—representation (construct validity), causal inference (internal and external validity), and valuation. He argues for the conceptual and pragmatic merits of this framework.

Expand the Scope of the Typology

Ernest R. House discusses the Campbellian typology’s limitations in dealing with ethical challenges with which evaluation is increasingly faced. He notes an alarming phenomenon, visible in medical evaluations but increasingly worrisome in other areas of evaluation practice, whereby evaluation results become biased because of researchers’ intentional and unintentional manipulation. House discusses strategies for dealing with this ethical problem, including how these ethics-related problems might be incorporated within the Campbellian validity tradition.

Jennifer C. Greene is one of the few contributors to this issue who is not affiliated with the Campbellian tradition. She provides a naturalistic viewpoint in examining limits of the Campbellian typology. She discusses different validity concepts and offers strategies for strengthening validity that are not primarily associated with the Campbellian tradition. At the same time, her comments are congenial to advances within the framework provided by Campbell and colleagues. Huey T. Chen and Paul Garbe argue that outcome evaluation should address system-integration issues that go beyond the scope of goal attainment, which is the Campbellian typology’s strength. To address both goal-attainment and system-integration issues, these authors propose a validity model with three categories: viable, effectual, and transferable. With this expanded typology, they propose a bottom-up approach that uses quantitative and qualitative methods to strengthen validity in an evaluation.

William R. Shadish, a collaborator of Campbell’s who played a key role in expanding the Campbellian typology (Shadish et al., 2002), offers his perspective on the contributions of this issue. Other chapters in the issue discuss various aspects of the Campbellian typology, with the authors representing varying degrees of closeness or distance to the tradition. Shadish speaks as an involved and interested representative of this tradition, which he upholds with vigor, thus providing balance to the perspectives in the issue. Shadish clarifies and defends the work of Campbell and his colleagues, offers themes related to the issue topic, and comments on the rest of the chapters.

Shadish takes exception to many of the arguments in the other chapters, countering our view that the typology must be revised or expanded to serve program evaluation better. Our hope is that the interplay among the ideas in all of the chapters will provide readers with multiple viewpoints as well as stimulate future development in this important area. Don Campbell advocated a “disputatious community of scholars” to create self-correcting processes, and he appended others’ critiques of his papers to his own reprints. In this spirit, we include Shadish’s comments and hope they will contribute to evaluators’ thinking and practice regarding validity and outcome evaluation.


Campbell, D. T., & Stanley, J. C. (1963). Experimental and quasi-experimental designs for research on teaching. In N. L. Gage (Ed.), Handbook of research on teaching (pp. 171–246). Chicago, IL: Rand McNally. (Also published as Campbell, D. T., & Stanley, J. C. (1966). Experimental and quasi-experimental designs for research. Chicago, IL: Rand McNally; since reprinted by Houghton Mifflin/Wadsworth, Boston, MA.)

Cook, T. D., & Campbell, D. T. (1979). Quasi-experimentation: Design and analysis issues for field settings. Chicago, IL: Rand McNally.

Joint Committee on Standards for Educational Evaluation. (1994). The program evaluation standards (2nd ed.). Thousand Oaks, CA: Sage.

Shadish, W. R., Cook, T. D., & Campbell, D. T. (2002). Experimental and quasi-experimental designs for generalized causal inference. Boston, MA: Houghton Mifflin.

Suchman, E. A. (1967). Evaluation research. New York, NY: Russell Sage Foundation.

Huey T. Chen

Stewart I. Donaldson

Melvin M. Mark


Huey T. Chen is a senior evaluation scientist in the Air Pollution and Respiratory Health Branch at the Centers for Disease Control and Prevention (CDC).

Stewart I. Donaldson is dean and professor of psychology at the Claremont Graduate University.

Melvin M. Mark is professor and head of psychology at Penn State University.