EVALUATING EPP-CREATED ASSESSMENTS Donna Fiebelkorn, Lake Superior State University Beth Grzelak, Eastern Michigan University
Goals for today’s session: • Introduce the CAEP Evaluation Framework • Provide the opportunity for participants to engage with the Evaluation Framework • Discuss validity and reliability from the CAEP perspective • Q & A
Evaluation Framework focuses on:
Relevancy
1. Administration and Purpose
2. Content of Assessment
Reliability and Actionability
3. Scoring Quality
4. Data Reliability
5. Data Validity
6. Survey Content
7. Survey Data Quality
Note: For Survey Instruments, use Sections 1, 2, 6, and 7 only. All others, use Sections 1, 2, 3, 4, and 5.
Briefly, about the work we have done • Between us, we have reviewed assessments from nine programs, public and private, small and medium. • Not one assessment, from any EPP, met the ‘Sufficient’ level as defined by the CAEP Evaluation Framework. • In each case, we worked in teams of 2-3, reviewing independently at first and then reconciling discrepancies with team members. • The review team’s report was then reviewed and accepted by a CAEP staff member. • Before we discuss the issues we saw, let’s introduce you to the CAEP Evaluation Framework…
And then . . . Holistic Evaluation of EPP-Created Assessments. Criteria evaluated during stages of accreditation review and decision-making: • EPP provides evidence that data are compiled and tabulated accurately • Interpretations of assessment results are appropriate for the items and resulting data • Results from successive administrations are compared
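To make the first and third criteria more concrete, here is a minimal sketch of compiling, tabulating, and comparing results across administrations. It assumes scores sit in a flat file with hypothetical column names (administration, score); the file name and columns are illustrative assumptions, not a CAEP-prescribed format or method.

```python
import pandas as pd

# Hypothetical flat file of rubric scores; the file name and the
# column names ("administration", "score") are assumptions for
# illustration, not part of the CAEP Evaluation Framework.
scores = pd.read_csv("assessment_scores.csv")

# Tabulate: candidate count, mean, and spread per administration
by_admin = (
    scores.groupby("administration")["score"]
    .agg(["count", "mean", "std"])
    .round(2)
)
print(by_admin)

# Compare successive administrations: change in mean score
by_admin["mean_change"] = by_admin["mean"].diff().round(2)
print(by_admin[["mean", "mean_change"]])
```

A simple table like this is one way an EPP might show that data are compiled and tabulated accurately and that results from successive administrations are actually compared rather than reported in isolation.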
Data Validity Example: Sufficient? CAEP Training Example: This observation rubric has “face validity” because all of the stakeholders (supervising faculty, program faculty, mentors and principals) agreed, after modifications, that it was measuring good teaching.
Data Reliability Example: Sufficient? CAEP Training Example: We have not yet determined inter-rater reliability for this assessment but plan to conduct a series of faculty meetings in which faculty observe videos of candidates in their teaching residency, complete the observation rubric individually and then discuss differences in ratings. Inter-rater reliability will be determined and differences discussed in order to ensure consistency across raters.
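Once ratings from those calibration meetings are collected, inter-rater agreement can be quantified rather than only discussed. Below is a minimal sketch, assuming two hypothetical raters scoring the same ten candidates on a 1-4 observation rubric; the ratings are illustrative data, not from the session, and Cohen’s kappa is just one common choice of agreement statistic.

```python
from collections import Counter

# Hypothetical ratings: two faculty raters scoring the same 10
# candidates on a 1-4 observation rubric (illustrative data only).
rater_a = [3, 4, 2, 3, 3, 4, 1, 2, 3, 4]
rater_b = [3, 4, 2, 2, 3, 4, 1, 3, 3, 4]

n = len(rater_a)

# Percent agreement: proportion of candidates given identical ratings
po = sum(a == b for a, b in zip(rater_a, rater_b)) / n

# Agreement expected by chance, from each rater's marginal distribution
counts_a, counts_b = Counter(rater_a), Counter(rater_b)
categories = set(rater_a) | set(rater_b)
pe = sum((counts_a[c] / n) * (counts_b[c] / n) for c in categories)

# Cohen's kappa corrects percent agreement for chance agreement
kappa = (po - pe) / (1 - pe)

print(f"Percent agreement: {po:.2f}")   # 0.80 for this sample
print(f"Cohen's kappa:     {kappa:.2f}")  # about 0.71 for this sample
```

Reporting a chance-corrected statistic such as kappa, alongside how disagreements were reconciled, is one way to move this example from a plan toward evidence of data reliability.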
Opportunities for Continued Discussion • Open Space Topic? • Team Work? • Other?