AUTOPSY OF A FAILED EVALUATION Examination Against The 2011 Program Evaluation Standards Daniel L. Stufflebeam 9/13/11
FOCUS: • The 2011 Program Evaluation Standards
THE SESSION’S PARTS ARE • A rationale for evaluation standards • A case showing the utility of The Program Evaluation Standards • The contents of The Standards • Recommendations for applying The Standards
PART 1 A RATIONALE FOR EVALUATION STANDARDS
STANDARDS FOR EVALUATIONS ARE • Widely shared principles for guiding and judging the conduct and use of an evaluation • Developed & approved by experts in the conduct & use of evaluation
STANDARDS FOR EVALUATIONS PROVIDE • Principled Direction • Technical Advice • A Basis for Professional Credibility • A Basis for Evaluating Evaluations • A Basis for Public Accountability
EVALUATORS IGNORE OR FAIL TO MEET STANDARDS TO • Their professional peril • The detriment of their clients
As in the famous ENRON debacle, failure to meet standards may contribute to • Lack of an impartial perspective • Erroneous conclusions • Unwarranted decisions • Cover-up of findings • Misguided decisions • Breakdown of trust • Organizational repercussions • Personal losses & tragedies • Lowered credibility for evaluators, their organizations, & the evaluation profession • Increased government controls
PART 2: A CASE A UNIVERSITY’S REVIEW OF ITS GRADUATE PROGRAMS (A university group should have followed evaluation standards but didn’t.)
CONTEXT WAS PROBLEMATIC • Board had voted confidence in the president (12/06) • Faculty gave president & provost low ratings (2/07) • Enrollment was declining • U. faced a fiscal crisis • Review focused on resource allocation • Morale was low
REVIEW’S STATED PURPOSES: • Address a fiscal crisis over the university’s inability to support all of its programs & maintain excellence • Determine which programs are highest strategic priorities based on quality • Identify programs for increased funds
SCOPE OF THE REVIEW • To be completed within 1 year • All masters and doctoral programs • Launched on 7/19/06 • 114 programs were reviewed
THE REVIEW’S PLAN • Keyed to Dickeson book (chapter 5) • Data book • Program’s report • Dean’s report • Review team’s report • Appeals of review team’s report • Provost’s final report • Board’s decisions • No update of U. mission • No appeals of provost’s conclusions • No adoption of standards for reviews • Minimal participation of outside evaluators • No external metaevaluation or peer review of the review
GENERAL REVIEW CRITERIA • External demand • Quality of student & program outcomes • Quality of program administration & planning • Program size, scope, & productivity • Program impact, justification, & essentiality • Opportunity analysis • Compelling program factor (features that make it unique & excellent)
DEFINITION OF SUB-CRITERIA • Many • Evolved throughout the review • Caused confusion & controversy
CRITERIA OMITTED FROM DICKESON’S LIST • History, development, & expectations of the program • Internal demand for the program • Quality of program inputs & processes • Revenue & other resources generated • Program costs & associated costs
EVALUATION PROCEDURES • Program’s self-report • Document & data book review • Group & individual interviews • Variable protocols for ratings (1-5) • Training of review team leaders • Rating of each program by department, dean, review team, & provost • Synthesis by provost & staff
REVIEW PERSONNEL • Essentially internal • Provost was both primary decision maker & de facto lead evaluator • Provost’s staff assisted the process • A program representative wrote the program’s report & sent it to department faculty, dean, & review team • Faculty input varied across programs • The dean rated the college’s programs & sent reports to the department chairs & review team (not in original plan)
REVIEW PERSONNEL (continued) • Seven 7-person review teams rated designated programs & on the same day e-mailed all reports to the provost & to pertinent deans & department chairs • Review team members were mostly from outside the program’s college • Provost met with deans before finalizing decisions • Provost met with team leaders before releasing final report • An internal evaluation expert assisted
FINAL REPORT • Issued on May 11, 2007 • Gave priorities for funding in each college • Announced plans to maintain 56, increase 16, merge 6, maintain/merge 17 subject to review, transfer 8, close 26, & create 6 new degrees
FINAL REPORT (continued) • Gave no evidentiary basis for decisions • Referenced no technical appendix • Referenced no accessible files of supporting data, analyses, & data collection tools • Gave no rating of each program on each criterion & overall
OUTCOMES • Local paper applauded the report (5/12/07) • Review evidence & link to conclusions were inaccessible to many interested parties • Professors, alumni, & others protested • President announced an appeal process (5/18/07) • Faculty voted to call for a censure of the provost (5/18/07) • Provost resigned (5/20/07) • Appeals overturned 10 planned cuts (7/14/07)
OUTCOMES (continued) • Potential savings from cuts were reduced • Community watched a contentious process • Board fired the president (8/15/07) • President threatened to sue • Board awarded ex-president $530,000 severance pay (10/27/07) • Projected review of undergraduate programs was canceled, ceding that area priority by default • Reviews were scheduled to resume in 2010
CLEARLY, THIS EVALUATION FAILED • No standards were required to reach this conclusion. • However, adherence to approved standards might have prevented the review’s failure.
MY TAKE: ON THE PLUS SIDE • Review was keyed to an important need to restructure programs. • There was significant faculty involvement in studying programs. • General criteria were established.
HOWEVER, THERE WERE SERIOUS DEFICIENCIES. • No independent perspectives • Top evaluator & decision maker were the same • Evidence to support conclusions was not reported • Political viability was not maintained • Evidence disappeared • No independent evaluation of the review
PART 3 The Program Evaluation Standards
FOR A MORE SYSTEMATIC EXAMINATION OF THE CASE • Let’s see if use of The Program Evaluation Standards might have helped ensure the study’s success. • Let’s also use the case to develop a working knowledge of The Program Evaluation Standards.
THE JOINT COMMITTEE ON STANDARDS FOR EDUCATIONAL EVALUATION • Developed The Program Evaluation Standards • Includes evaluation users and experts • Was sponsored by 17 professional societies
THE SPONSORS REPRESENTED • Accreditation officials • Administrators • Curriculum specialists • Counselors • Evaluators • Rural education • Measurement specialists • Policymakers • Psychologists • Researchers • Teachers • Higher education
The Program Evaluation Standards • Are accredited by the American National Standards Institute as an American National Standard • Include 30 specific standards
NOW, LET’S LOOK AT • THE CONTENTS OF THE STANDARDS & • DISCUSS THEIR RELEVANCE TO THE PROGRAM REVIEW CASE
THE 30 STANDARDS ARE ORGANIZED AROUND 5 ATTRIBUTES OF A SOUND EVALUATION • UTILITY • FEASIBILITY • PROPRIETY • ACCURACY • EVALUATION ACCOUNTABILITY
EACH STANDARD INCLUDES CONSIDERABLE DETAIL • Label • Summary statement • Definitions • Rationale • Guidelines • Common errors to avoid • Illustrative case
CAVEAT • Time permits us to deal with the 30 standards only at a general level. • You can benefit most by studying the full text of the standards.
THE UTILITY STANDARDS • Require evaluations to be informative, timely, influential, and grounded in explicit values • Are intended to ensure that an evaluation is aligned with stakeholder needs and enables process and findings uses and other appropriate influence
LABELS FOR THE UTILITY STANDARDS ARE U1 Evaluator Credibility U2 Attention to Stakeholders U3 Negotiated Purposes U4 Explicit Values U5 Relevant Information U6 Meaningful Processes and Products U7 Timely and Appropriate Communicating and Reporting U8 Concern for Consequences and Influence
THE U1 EVALUATOR CREDIBILITY STANDARD STATES: • Evaluations should be conducted by qualified people who establish and maintain credibility in the evaluation context. • How well did the program review meet this standard?
THE U2 ATTENTION TO STAKEHOLDERS STANDARD STATES: • Evaluations should devote attention to the full range of individuals and groups invested in the program and affected by its evaluation. • How well did the program review meet this standard?
THE U4 EXPLICIT VALUES STANDARD STATES: • Evaluations should clarify and specify the individual and cultural values underpinning purposes, processes, and judgments. • How well did the program review address this standard?
THE U8 CONCERN FOR CONSEQUENCES AND INFLUENCE STANDARD STATES: • Evaluations should promote responsible and adaptive use while guarding against unintended negative consequences and misuse. • Did the program review case meet this standard?
OVERALL, BASED ON THIS SAMPLING OF UTILITY STANDARDS • Did the program review pass or fail the requirement for utility? • Why or why not?
DID FAILURE TO MEET ANY OF THESE UTILITY STANDARDS CONSTITUTE A FATAL FLAW? • If yes, which failed standard(s) constituted a fatal flaw? • What could the provost have done to ensure that the review passed the Utility requirements?
THE FEASIBILITY STANDARDS • Are intended to ensure that an evaluation is • Economically and Politically Viable • Realistic • Contextually sensitive • Responsive • Prudent • Diplomatic • Efficient • Cost Effective
LABELS FOR THE FEASIBILITY STANDARDS ARE F1 Project Management F2 Practical Procedures F3 Contextual Viability F4 Resource Use
THE F2 PRACTICAL PROCEDURES STANDARD STATES: • The procedures should be practical and responsive to the way the program operates. • Did the program review employ workable, responsive procedures?