Developing the Tests for NCLB: No Item Left Behind

Presentation Transcript


  1. Developing the Tests for NCLB: No Item Left Behind • Steve Dunbar, Iowa Testing Programs, University of Iowa

  2. Test Development: A Technical Concern • Procedures are well-established – it’s sort of a ‘rocket-art’ • Aspects of ‘quality’ that seem distinct to an observer are inseparable to a developer • Quality control requires resources – talent, time, and money – to do well • TD is the grunt work of assessment

  3. Best Practice in Test Development • Interpret content standards; translate into test specifications • Search for stimulus material; draft items • Do the 3Rs: REVIEW-REVISE-REPLACE • Prepare material for field testing • Oops – we forgot about finding the kids to participate in field testing, many comparable samples of them

  4. More Best Practice in TD • Administer, retrieve, and score tryout materials; get item analysis results to TDers • Do the 3Rs: REVIEW-REVISE-REPLACE • Prepare more material for field testing • Oops – more kids for field testing, more comparable samples
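
The "item analysis results" that go back to test developers are not specified in the slides. As a rough illustration only, the sketch below computes two classical statistics often reported from tryout data: item difficulty (proportion correct) and item discrimination (correlation between the item score and the score on the remaining items). The function name, the 0/1 scoring, and the sample responses are all assumptions made for this example, not part of the presentation.

```python
from statistics import mean, pstdev


def item_analysis(responses):
    """Classical item statistics for dichotomously scored tryout data.

    responses: list of per-examinee lists of 0/1 item scores (hypothetical format).
    Returns one dict per item with its difficulty and discrimination.
    """
    n_items = len(responses[0])
    results = []
    for j in range(n_items):
        item = [r[j] for r in responses]
        # Rest score: total score excluding the item under study.
        rest = [sum(r) - r[j] for r in responses]
        p = mean(item)  # difficulty = proportion answering correctly
        sd_item, sd_rest = pstdev(item), pstdev(rest)
        if sd_item == 0 or sd_rest == 0:
            disc = None  # correlation undefined when either score is constant
        else:
            cov = mean(x * y for x, y in zip(item, rest)) - p * mean(rest)
            disc = cov / (sd_item * sd_rest)
        results.append({"item": j, "difficulty": p, "discrimination": disc})
    return results


if __name__ == "__main__":
    # Made-up tryout data: 4 examinees, 3 items.
    tryout = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
    for row in item_analysis(tryout):
        print(row)
```

Items with very low or negative discrimination would be natural candidates for the REVIEW-REVISE-REPLACE cycle, though the actual flagging criteria are a matter of program policy.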

  5. What do we get from Best Practice? • Something elusive (important content, interesting materials, good questions, cognitive complexity, comparability) • Something intangible (fairness, alignment with standards, intended consequences) • Something concrete (coverage, rater reliability, a validity or generalizability coefficient, acceptable cost)

  6. Some TD Half Truths • Multiple Choice Items: development is hard; scoring is easy (and public); quality control built in to TD process • Open-ended Items: development is easy; scoring is hard (and private); quality control elusive due to scoring

  7. Comparability in Test Materials • Test form as the unit for judging comparability • Easy to achieve with many items on the test and many potential throwaways in the pool • Experienced test development staff • Good field testing and scoring needed

  8. Group Differences and Fairness • TD seeks a balance • Tension is that balance requires questions, lots of them • Instructional influences confounded with group effects • DIF requires good matching questions
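
The slide does not say which DIF procedure is meant. One widely used option is the Mantel-Haenszel common odds ratio, sketched below under that assumption. Examinees are matched on a criterion score (here, a hypothetical total score on the matching questions), and the odds of answering the studied item correctly are compared between a reference and a focal group within each score level. The record layout and group labels are invented for this example.

```python
from collections import defaultdict


def mantel_haenszel_odds_ratio(records):
    """Mantel-Haenszel common odds ratio for one studied item.

    records: iterable of (group, matching_score, correct), where group is
    "ref" or "focal" and correct is 0 or 1 (hypothetical layout).
    Returns None when the ratio is undefined (zero denominator).
    """
    # Per score level: [ref correct, ref incorrect, focal correct, focal incorrect]
    tables = defaultdict(lambda: [0, 0, 0, 0])
    for group, score, correct in records:
        idx = (0 if correct else 1) + (0 if group == "ref" else 2)
        tables[score][idx] += 1

    num = den = 0.0
    for a, b, c, d in tables.values():
        n = a + b + c + d
        num += a * d / n  # ref correct * focal incorrect
        den += b * c / n  # ref incorrect * focal correct
    return num / den if den else None


if __name__ == "__main__":
    # Made-up data: a single matching-score level.
    data = ([("ref", 5, 1)] * 3 + [("ref", 5, 0)]
            + [("focal", 5, 1)] * 2 + [("focal", 5, 0)] * 2)
    print(mantel_haenszel_odds_ratio(data))
```

A ratio near 1 suggests the item behaves similarly for matched examinees in both groups. With few matching questions the score levels are coarse and sparsely populated, which is one way to read the slide's point that DIF requires good matching questions.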

  9. Cost Factors in Large-Scale Testing • Development Costs: recur with each test form; are fixed by instrument design • Scoring Costs: recur with each test administration; may change because of ‘unexpected’ circumstances
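
The two cost components recur on different units: development per test form, scoring per administration. A back-of-the-envelope sketch of that structure follows. The per-student scoring rate, the parameter names, and every dollar figure are assumptions for illustration; the slides give no cost data.

```python
def total_cost(n_forms, dev_cost_per_form,
               n_administrations, students_per_administration,
               scoring_cost_per_student):
    """Split a testing budget into development and scoring components.

    Development cost is assumed to recur once per test form; scoring cost is
    assumed to recur per student at each administration. All inputs are
    hypothetical.
    """
    development = n_forms * dev_cost_per_form
    scoring = (n_administrations * students_per_administration
               * scoring_cost_per_student)
    return {"development": development,
            "scoring": scoring,
            "total": development + scoring}


if __name__ == "__main__":
    # Invented figures: 2 forms, 3 administrations of 1,000 students each.
    print(total_cost(2, 100_000, 3, 1_000, 5))
```

In this simplified view, an "unexpected" circumstance would show up as a change in the scoring rate or the student count between administrations, while the development term stays fixed once the instrument design is set.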

  10. Validity in Test Development • Best practice ensures content quality, balance, and alignment with standards – critical aspects of validity & reliability • TD is predicated on anticipated use • Other aspects of validity & reliability aren’t understood until it’s too late, i.e. when the test is operational

  11. Validity & Capacity in NCLB • NCLB is census testing • Census testing places heavy demands on TD and other aspects of an accountability system • Limit on capacity in TD means: only 1R or 2Rs; fewer rounds of field testing; dwindling pools of test materials • No item left behind