Evaluation Rating Forms

Craig McClure, MD. May 15, 2003. Educational Outcomes Service Group.

Presentation Transcript


  1. Evaluation Rating Forms • Craig McClure, MD • May 15, 2003 • Educational Outcomes Service Group

  2. Typical Use of Rating Scales • End of rotation (global) • After a single encounter (focused) • To incorporate input from multiple evaluators • Videotaped encounters • NOT as a yes/no checklist for single encounters

  3. Alternate Forms • Multiple episodes versus focused (single) episode • Measuring global (six domains) versus task-specific behavior

  4. Global Rating of Learner • Domains of competence, not specific skills, tasks, or behaviors • Completed retrospectively concerning multiple days and activities • May be from multiple sources • Use rating scales

  5. Focused Rating Scale • Single patient encounter • Concerning specific task, skill, behavior

  6. Advantages (Global) • Easy to develop • Easy to use (minimal training) • Can be used to evaluate all domains • Reasonable reliability when the evaluation is focused and tailored to the competencies measured

  7. Systematic Rater Errors (Global) • Leniency/Severity • Range Restriction • Halo Effect • Inappropriate Weighting
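
As a rough illustration of how two of these errors might be screened for in collected ratings, the Python sketch below summarizes each rater's mean and spread on a 1 to 5 scale. The function name, sample data, and cutoffs are hypothetical assumptions, not part of the presentation. A high mean hints at leniency, a low mean at severity, and a small standard deviation at range restriction.

```python
from statistics import mean, pstdev

def screen_raters(ratings_by_rater, scale_mid=3.0, shift=1.0, min_sd=0.5):
    """Flag possible leniency/severity and range restriction per rater.

    scale_mid, shift, and min_sd are illustrative cutoffs (assumptions).
    """
    flags = {}
    for rater, scores in ratings_by_rater.items():
        avg = mean(scores)      # central tendency of this rater's scores
        sd = pstdev(scores)     # spread of this rater's scores
        issues = []
        if avg >= scale_mid + shift:
            issues.append("possible leniency")
        elif avg <= scale_mid - shift:
            issues.append("possible severity")
        if sd < min_sd:
            issues.append("possible range restriction")
        flags[rater] = issues
    return flags

# Made-up ratings on a 1-5 scale, one list per rater.
sample = {
    "rater_a": [5, 5, 4, 5, 5],
    "rater_b": [1, 2, 2, 1, 2],
    "rater_c": [2, 3, 4, 5, 1],
}
result = screen_raters(sample)
```

A flag is only a prompt to look closer: a rater who sees only strong learners would also show a high mean and a narrow spread.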

  8. Drawbacks (Global) • Content validity uncertain • Questionable validity of general assessments extrapolated to whole domain • Inefficient at directing learner improvement • Accuracy variable • Generosity factor • Poor discrimination between learners

  9. Mixed Research Results • Discriminating between competence levels: reliably rating more skilled physicians higher than less skilled • Reliability (reproducibility) of ratings: best for knowledge, harder for patient care and interpersonal skills

  10. Clarify Evaluative Objectives • Global versus focused • Define using competency-based language emphasized by ACGME

  11. Group the Competencies • Patient Care • Medical Knowledge • Practice-Based Learning and Improvement • Interpersonal and Communication Skills • Professionalism • Systems-Based Practice

  12. Composition of Form • Short is better than long • Big font is better than small • Clean is better than cluttered

  13. Each Behavior Is Evaluated Independently • Otherwise: • Evaluator is uncertain what to evaluate • Learner is uncertain what to address

  14. Decide on Options in the Scale • Best if there is a minimum of five • Best if a descriptor is present for each option • Absence of middle labels skews ratings toward the positive side

  15. Primacy Effect • “The results showed that when the positive side of the scale was on the left, the ratings were more positive and had reduced variance than when the positive label was on the right.”

  16. Lake Wobegon Effect • Where all the children are above average • Faculty tend to interpret anchors as more negative than their literal meaning • Generosity effect

  17. Consider Changing Anchors • If the desire is to keep evaluative anchors: • Poor, fair, below average, average, above average, and excellent • Very poor, poor, fair, good, very good, excellent

  18. Consider Using Frequency Anchors • Rate the frequency of observable resident behaviors from “never” to “always” • Judgmental ratings, by contrast, need considerable evaluator education to minimize inter-rater variability • Permits competency judgment by the program director (PD)

  19. Example of Stem for Frequency Anchor • Resident demonstrates respect in speaking to patient… • Never • 25% • 50% • 75% • Always

  20. Competency Judgment at Program Level • Permits competency definitions to vary by year of training • Diminishes effect of inter-rater variability • Focuses on observable behavior • Requires less training of evaluators
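
One way to picture the program-level judgment: evaluators record only how often a behavior was observed, and the program applies a threshold that may differ by year of training. The Python sketch below is a hypothetical illustration. The anchors come from the frequency stem example (Never, 25%, 50%, 75%, Always), while the thresholds, names, and sample responses are assumptions made for the example.

```python
from statistics import mean

# Frequency anchors from the example stem, mapped to fractions of encounters.
ANCHORS = {"Never": 0.0, "25%": 0.25, "50%": 0.5, "75%": 0.75, "Always": 1.0}

# Hypothetical minimum average frequency required per year of training.
THRESHOLDS = {1: 0.5, 2: 0.75, 3: 0.9}

def program_judgment(responses, training_year):
    """Average frequency ratings from several evaluators and compare the
    result with the threshold the program set for that training year."""
    avg = mean(ANCHORS[r] for r in responses)
    return avg, avg >= THRESHOLDS[training_year]

# Made-up responses from four evaluators for one resident and one stem.
responses = ["75%", "Always", "75%", "50%"]
avg, ok_year1 = program_judgment(responses, 1)
_, ok_year3 = program_judgment(responses, 3)
```

Here the same responses meet the assumed first-year threshold and miss the third-year one, which shows how the definition of competence can vary by year while evaluators only report what they observed.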

  21. References • Evaluations, S. Swing, Academic Emergency Medicine 2002;9:1278-88 • Assessment of Communication and Interpersonal Skills Competencies, Academic Emergency Medicine 2002;9:1257-69 • ACGME/ABMS Joint Initiative Toolbox of Assessment Methods, September 2000

  22. References (2) • Challenges in using rater judgments in medical education, M.A. Albanese, Journal of Evaluation in Clinical Practice 6(3):305-319
