
Evaluation: Testing, Objective-to-Test-Item Matching and Judgments of Worth


  1. Evaluation: Testing, Objective-to-Test-Item Matching and Judgments of Worth. EDTEC 540, James Marshall

  2. Session Overview • Evaluation Approaches • Testing – one possible data point in evaluation • Norm-referenced • Criterion-referenced • Objective-to-test-item matching • Measurement error, reliability and validity

  3. Evaluation, Typically
• Typically, it doesn't happen! That said, it should, and it is required for many funded projects.
• What happened? Were goals and objectives achieved? How can we find out?
• The end is NOT the only time to measure worth. When else?
• Strategies: tests, observations, surveys, conversations with managers, review of work products and results

  4. Evaluation Approaches: Objectivist
• Belief in a reality that can be known and measured. Prevalent in education and our business.
• Objectives-based and deceptively simple: establish goals → set objectives → tailor instruction to the objectives → judge effectiveness.
• Measures are analytical/quantitative in nature.
• Examples:
  • Do first-graders know the letters of the alphabet?
  • Can the new account representative describe the features of each checking account, as defined by the bank?
  • Others?
• Advantages/disadvantages?

  5. Evaluation Approaches: Constructivist
• Belief that people construct their own realities. Advocates hold that truth is a matter of consensus, not measurement against an objective reality.
• Evaluation creates detailed descriptions of what is inside the head of the learner.
• Reliance upon open-ended exercises, observation, cases and immersion in the field.
• Observation is useful for us, in that IDs build prototypes, conduct formative evaluations, revise and cycle again.
• Measures are qualitative in nature.
• Examples:
  • Role-play exercise for dealing with a hostile customer
  • Theme Park Tycoon: running a theme park for a year
  • Essay question asking you to describe your understanding of Educational Technology
• Advantages/disadvantages?

  6. Evaluation Approaches: Postmodern/Critical
• Objectivists proclaim objectivity; constructivists approve of subjectivity; postmoderns are social activists.
• Focus on questions of power: “Who are you to set objectives for others?” Use of deconstruction to see what’s inside texts and materials.
• Most interested in the hidden curriculum, such as the teaching of traditional gender roles.
• What does the curriculum teach?
• Why should IDs care about this evaluation approach?

  7. Evaluation Frameworks: Kirkpatrick’s Model
• Level 4: Does it matter? Does it advance strategy?
• Level 3: Are they doing it (objectives) consistently and appropriately?
• Level 2: Can they do it (objectives)? Do they show the skills and abilities?
• Level 1: Did they like the experience? Satisfaction? Use? Repeat use?

  8. Evaluation Frameworks: CIPP
• Context: assesses program/product needs, problems or opportunities specific to the project environment.
• Input: assesses, evaluates and allocates project resources in order to meet identified needs and objectives, solve problems, and optimize program impact.
• Process: assesses project implementation.
• Product: assesses planned and unintended (unforeseen) outcomes, both to keep a project on track and to determine effectiveness or impact.

  9. Types of Tests • Used to evaluate changes in skills and knowledge • Is testing alone sufficient?

  10. Test Types: Norm-Referenced • Compare an individual's performance to the performance of other people. • Require varying item difficulties. • Assume not everybody is going to "get it." • Discern those who "got it" from those who didn't.

  11. Normal Distribution (figure shown on the original slide)
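For reference, here is the standard notation behind the curve this slide refers to (an editorial addition, not from the deck): a norm-referenced score x is located relative to the group mean μ and standard deviation σ via a z-score, and the curve itself is the normal density.

```latex
z = \frac{x - \mu}{\sigma}, \qquad
f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^{2}}
```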

  12. Test Types: Norm-Referenced
• Norm-referenced tests compare the individual to the group.
• Accomplished statistically by “norming” the test with large numbers of people.
• Consider: you sat for the GRE and received the scores shown on the slide (51, 22, 28, 570). You need to retake the test. What is your study plan?
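As a concrete illustration of “norming” (a minimal sketch with made-up numbers, not actual GRE norms or anything from the deck), a raw score can be converted to a percentile rank against a norm group:

```python
# Minimal sketch of norming: convert a raw score to a percentile rank
# against a norm group. The norm-group scores below are hypothetical.

def percentile_rank(raw_score, norm_group):
    """Percent of the norm group scoring at or below raw_score."""
    at_or_below = sum(1 for s in norm_group if s <= raw_score)
    return 100.0 * at_or_below / len(norm_group)

norm_group = [480, 510, 540, 560, 570, 590, 610, 650, 700, 730]
print(percentile_rank(570, norm_group))  # -> 50.0: half the group at or below
```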

  13. Test Types: Norm-Referenced • Limitations • Not especially helpful for: • identifying individual skill deficiencies • identifying weaknesses in the instruction

  14. Test Types: Criterion-Referenced • Compares an individual's performance to the acceptable standard of performance for those tasks. • Requires completely specified objectives. • Asks: can this person do what the objectives specify? • Results in yes-no decisions about competence.

  15. Test Types: Criterion-Referenced • Applications • Diagnosis of individual skill deficiencies • Certification of skills • Evaluation and revision of instruction • Limitations • Tend to focus on specific skills • Results may not reflect general aptitudes • Everyone may get an “A”
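To make the yes/no character of criterion-referenced scoring concrete, here is a minimal sketch (the objectives and cutoffs are hypothetical, not from the deck): each objective carries its own criterion, and the output is a per-objective mastery decision rather than a rank against other learners.

```python
# Minimal sketch of criterion-referenced scoring: a yes/no mastery
# decision per objective against a fixed cutoff, with no comparison
# to other learners. Objectives and cutoffs are hypothetical.

CUTOFFS = {"state_abbreviations": 45, "resume_objectives": 2}

def mastery_decisions(scores):
    """Return {objective: True/False} given raw scores per objective."""
    return {obj: scores.get(obj, 0) >= cutoff for obj, cutoff in CUTOFFS.items()}

print(mastery_decisions({"state_abbreviations": 47, "resume_objectives": 1}))
# -> {'state_abbreviations': True, 'resume_objectives': False}
```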

  16. Which Test is Which? For each, decide: norm-referenced (NR) or criterion-referenced (CRT)?
• IQ test
• GRE
• SDSU Writing Competency
• Red Cross Lifesaving Certificate
• EDTEC 540 midterm and final exams

  17. Which Test is Which? For each, decide: norm-referenced (NR) or criterion-referenced (CRT)?
• Give out a CA driver's license
• Pick students for Russian language training
• Determine entrance into medical school
• PADI Scuba Certification
• Select one EDTEC scholarship recipient
• Figure out where to revise a course
• Decide which students need remediation

  18. Utility of Test Scores
• Selection & screening (before):
  • mastery of prerequisites, for remediation/placement
  • mastery of course objectives, for acceleration (“testing out”)
• Individual diagnosis and prescription (along the way)
• Practice (along the way)
• Grades & summative scores (at or after the end):
  • promotion
  • certification and licensure
• Administrative:
  • course evaluation
  • trainer accountability

  19. Criterion-Referenced Test Items: Matching Objectives to Items

Objective: Given a map of the USA with state borders marked, the lwbat (learner will be able to) write the abbreviation for 45 of 50 states in 15 minutes.
Item: Here is a map of the USA with the states outlined, but no names. Use the state abbreviations and fill them in; you've got 15 minutes to get at least 45.

Objective: Given a pair of well-worn shoes, the lwbat identify what's wrong with the shoes and the tools and materials necessary to fix them.
Item: Take a look at this pair of shoes. What problems do you see? What will you need to fix them?

Objective: Given a goal, the lwbat write at least two appropriate objectives with proper ABCD parts.
Item: The goal of the instruction is: "IDs will know how to write resumes." Write at least 2 objectives with all four parts.

  20. Matching Test Items to Objectives
• Matching ensures validity.
• Validity is the extent to which the test measures what is important to performance. Does a high score on the test equate to high performance on the job?
• The validity of a criterion-referenced test is enhanced when:
  • objectives match real-world performances (based on solid analysis);
  • test items match the stated objectives (including conditions).

  21. Match, or Not? • Given any stocked fruit or vegetable, the Ralphs Grocery Checker will be able to verbally state the code which matches the produce provided with 100% accuracy. • Here is a persimmon from the produce department and the produce code job aid. Please state the produce code for this item. You may examine the persimmon and reference the job aid.

  22. Match, or Not? • Given a tree in need of pruning, the gardener’s apprentice will be able to select the correct tree pruning device, based upon the type of tree presented. • Here is an overgrown elm tree. Please select the appropriate tool with which you will prune the tree.

  23. Match, or Not? • Given a descriptive order for a Café Mocha, including size, caf/decaf, and type of milk, the barista will be able to create the drink as specified in the Starbucks Guide to Coffee Creations. • A customer has just ordered a Grande, non-fat mocha. Please list the ingredients you will need, and describe the steps you would take to create the drink.

  24. Evaluating a Training Program. Consider: • Your evaluation uses a criterion-referenced test to see if the new account representatives can describe the different types of accounts offered by the bank. • All representatives were able to meet the specified criteria. • Case closed… or do you want to know more?

  25. Ideas in Testing • Measurement Error • Validity • Reliability

  26. Measurement Error. Many causes:
• mechanical or scoring errors
• poor wording (confusing, ambiguous)
• poor subject matter or content (validity)
• score variation from one time to another (reliability)
• score variation from "equivalent" tests
• test administration procedure
• inter-rater reliability
• mood of the student
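This list of causes maps onto the classical test theory model (standard background framing, not stated on the slide): an observed score is a true score plus random error, and reliability is the share of observed-score variance attributable to true scores.

```latex
X = T + E, \qquad
\operatorname{Var}(X) = \operatorname{Var}(T) + \operatorname{Var}(E), \qquad
\rho_{XX'} = \frac{\operatorname{Var}(T)}{\operatorname{Var}(X)}
```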

  27. Validity • Does the test assess what's important? Does it really seek out the skill and knowledge linked to the world? (content validity) • Types: • Content Validity (most important to us) • Predictive Validity (e.g. SAT, GRE)
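Predictive validity is typically quantified as the correlation between test scores and a later criterion measure. A minimal sketch with made-up data (none of it from the deck):

```python
# Minimal sketch of predictive validity as a Pearson correlation
# between test scores and later performance. Data are hypothetical.
from statistics import correlation  # Python 3.10+

test_scores = [520, 580, 610, 640, 700]   # e.g., entrance-exam scores
performance = [2.8, 3.1, 3.0, 3.6, 3.8]   # e.g., later GPA
print(round(correlation(test_scores, performance), 2))
```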

  28. Reliability • Are the scores produced by the test trustworthy and stable over time? • Assessed by: • parallel (equivalent) forms or test-retest • internal consistency
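One common internal-consistency estimate is Cronbach's alpha; the slide names the general approach, not this particular statistic, and the sketch and data below are illustrative only.

```python
# Minimal sketch of Cronbach's alpha, an internal-consistency
# estimate of reliability. Rows are examinees, columns are items;
# the 0/1 item data are made up for illustration.
from statistics import pvariance

def cronbach_alpha(item_scores):
    """item_scores: one row of per-item scores per examinee."""
    k = len(item_scores[0])                 # number of items
    columns = list(zip(*item_scores))       # per-item score columns
    item_var = sum(pvariance(col) for col in columns)
    total_var = pvariance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - item_var / total_var)

data = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
print(round(cronbach_alpha(data), 2))  # -> 0.75 for this toy data
```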

  29. Testing and Evaluation: A Look Ahead
• ED 690, Procedures of Investigation: provides an introduction to evaluation procedures and methods; introduces the research process and statistical analysis
• ED 791A, 791B, 791C: the evaluation sequence most often completed by EDTEC students, rather than writing a thesis; conduct a full-scale evaluation (design, research, report) for a living, breathing client over a two-semester timeframe
