Developing the Tests for NCLB: No Item Left Behind

Developing the Tests for NCLB: No Item Left Behind. Steve Dunbar Iowa Testing Programs University of Iowa. Test Development: A Technical Concern. Procedures are well-established – it’s sort of a ‘rocket-art’

Presentation Transcript
Developing the Tests for NCLB: No Item Left Behind

Steve Dunbar

Iowa Testing Programs

University of Iowa

Test Development: A Technical Concern

  • Procedures are well-established – it’s sort of a ‘rocket-art’

  • Aspects of ‘quality’ that seem distinct to an observer are inseparable to a developer

  • Quality control requires resources – talent, time, and money – to do well

  • TD is the grunt work of assessment

Best Practice in Test Development

  • Interpret content standards; translate into test specifications

  • Search for stimulus material; draft items


  • Prepare material for field testing

  • Oops – we forgot about finding the kids to participate in field testing, many comparable samples of them

More Best Practice in TD

  • Administer, retrieve, and score tryout materials; get item analysis results to TDers


  • Prepare more material for field testing

  • Oops – more kids for field testing, more comparable samples
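The slides do not say which statistics the item analysis reports. As a minimal sketch, assuming right/wrong (0/1) scoring, two classical statistics a developer would typically want back from a tryout are item difficulty (proportion correct) and item discrimination (correlation between the item and the score on the remaining items). The function name and the response data below are hypothetical.

```python
from statistics import mean, pstdev

def item_analysis(responses):
    """Classical item statistics for 0/1-scored tryout data.

    responses: list of examinee rows, each a list of 0/1 item scores.
    Returns one dict per item with difficulty and discrimination.
    """
    n_items = len(responses[0])
    results = []
    for j in range(n_items):
        item = [row[j] for row in responses]
        # Rest score leaves out the item itself so it cannot inflate the correlation.
        rest = [sum(row) - row[j] for row in responses]
        p = mean(item)  # difficulty: proportion answering correctly
        sd_item, sd_rest = pstdev(item), pstdev(rest)
        if sd_item == 0 or sd_rest == 0:
            r = None  # correlation undefined when either score has no variance
        else:
            cov = mean([i * t for i, t in zip(item, rest)]) - p * mean(rest)
            r = cov / (sd_item * sd_rest)
        results.append({"item": j, "difficulty": p, "discrimination": r})
    return results

# Hypothetical tryout data: 4 examinees, 3 items.
tryout = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
stats = item_analysis(tryout)
```

With only a handful of examinees these numbers are unstable, which is one reason field testing needs many comparable samples.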

What do we get from Best Practice?

  • Something elusive (important content, interesting materials, good questions, cognitive complexity, comparability)

  • Something intangible (fairness, alignment with standards, intended consequences)

  • Something concrete (coverage, rater reliability, a validity or generalizability coefficient, acceptable cost)

Some TD Half Truths

  • Multiple Choice ItemsDevelopment is hard Scoring is easy (and public)Quality Control built in to TD process

  • Open-ended ItemsDevelopment is easyScoring is hard (and private)Quality Control elusive due to scoring

Comparability in Test Materials

  • Test form as the unit for judging comparability

  • Easy to achieve with many items on the test and many potential throwaways in the pool

  • Experienced test development staff

  • Good field testing and scoring needed

Group Differences and Fairness

  • TD seeks a balance

  • Tension is that balance requires questions, lots of them

  • Instructional influences confounded with group effects

  • DIF requires good matching questions
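The slides do not name a DIF procedure. One common choice is the Mantel-Haenszel common odds ratio, sketched here under that assumption: examinees are first matched on a score built from other questions, and the two groups are compared only within each matched score level. This is why the matching questions matter: with a poor matching score, real ability differences can show up as apparent item bias. The function name, group labels, and record layout are hypothetical.

```python
from collections import defaultdict

def mantel_haenszel_odds_ratio(records):
    """Mantel-Haenszel common odds ratio for one studied item.

    records: iterable of (group, matching_score, correct), where group is
    "ref" or "focal" and correct is True/False for the studied item.
    A value near 1.0 suggests no DIF after matching.
    """
    # For each matching-score stratum, count [right, wrong] per group.
    strata = defaultdict(lambda: {"ref": [0, 0], "focal": [0, 0]})
    for group, score, correct in records:
        strata[score][group][0 if correct else 1] += 1

    num = den = 0.0
    for cells in strata.values():
        a, b = cells["ref"]    # reference group: right, wrong
        c, d = cells["focal"]  # focal group: right, wrong
        n = a + b + c + d
        num += a * d / n
        den += b * c / n
    # Undefined when no stratum has a reference-wrong / focal-right pairing.
    return num / den if den else None
```

Each stratum needs enough examinees from both groups to contribute anything, so the estimate depends on the length and quality of the matching test.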

Cost Factors in Large-Scale Testing

  • Development CostsRecur with each test formAre fixed by instrument design

  • Scoring CostsRecur with each test administrationMay change because of ‘unexpected’ circumstances

Validity in Test Development

  • Best practice ensures content quality, balance, and alignment with standards – critical aspects of validity & reliability

  • TD is predicated on anticipated use

  • Other aspects of validity & reliability aren’t understood until it’s too late, i.e. when the test is operational

Validity & Capacity in NCLB

  • NCLB is census testing

  • Census testing places heavy demands on TD and other aspects of an accountability system

  • Limit on capacity in TD means
      • only 1R, or 2Rs
      • fewer rounds of field testing
      • dwindling pools of test materials

  • No item left behind