
Hands On Session



Presentation Transcript


  1. Hands On Session With (Real) Data

  2. CELPT Testing • Language proficiency written test • Singapore based • Ngee Ann Polytechnic • Students with varied cultural backgrounds • Profile reporting by language category • Calibrated bank of some 1500 questions

  3. Two is four • Each CELPT test becomes two tests • Limited attempt to tailor questions • Match difficulty of questions • And candidate ability

  4. How does it work? • Each test has 220 questions • Starter test of 20 questions • Branching on 20-question test score • Easier test • More difficult test • Only a single branch point

  5. In a little more detail • Questions 1 to 20: Starter Test • Questions 21 to 60: Part 1A Test • Questions 61 to 100: Part 1B Test • Questions 101 to 160: Part 2A Test • Questions 161 to 220: Part 2B Test
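
The layout above can be sketched in code. The question ranges come from the slide; the rest is assumed for illustration. In particular, the pairing of parts into two routes (Starter + 1A + 2A and Starter + 1B + 2B) is inferred from the later statement that each derived test has 120 questions, the slides do not say which route is the easier one, and `CUT_SCORE` is a hypothetical threshold.

```python
# Question ranges taken from the slide; everything else is an assumption.
PARTS = {
    "Starter": range(1, 21),
    "Part 1A": range(21, 61),
    "Part 1B": range(61, 101),
    "Part 2A": range(101, 161),
    "Part 2B": range(161, 221),
}

# Assumed pairing of parts into the two routes (not stated on the slides).
ROUTES = {
    "A": ["Starter", "Part 1A", "Part 2A"],
    "B": ["Starter", "Part 1B", "Part 2B"],
}

CUT_SCORE = 10  # hypothetical branch threshold on the 20-question starter


def route_for(starter_score: int) -> str:
    """Pick a route at the single branch point (direction is an assumption)."""
    return "B" if starter_score >= CUT_SCORE else "A"


def questions_on(route: str) -> list[int]:
    """List the question numbers a candidate on this route would see."""
    return [q for part in ROUTES[route] for q in PARTS[part]]


for name in ROUTES:
    print(name, len(questions_on(name)))  # each route has 20 + 40 + 60 questions
```

Under these assumptions every candidate answers 120 of the 220 questions, and the two routes share only the starter.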

  6. Hands On – The basic task • Have two tests: 21001 and 21002 • Break down into four tests: 21001 and 71001, 21002 and 71002 • Wish to build a single item bank

  7. Hands On – The basic task • Need to calibrate all questions • Into a single bank • How many questions? • 2 x 220 = 440? 4 x 120 = 480? • Some common questions! • Starter and others…
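
The counting question can be checked with sets. This is only a sketch: the question labels are invented, the assignment of test numbers to routes is assumed, and `N_SHARED` is a made-up number of questions reused across the two CELPT tests, since the slides give no figure.

```python
def make_test(prefix, parts):
    """Build a set of hypothetical question labels such as 'X:Starter:3'."""
    return {f"{prefix}:{part}:{i}" for part, size in parts for i in range(1, size + 1)}


# Part sizes follow the slides: starter 20, Part 1 40, Part 2 60.
ROUTE_A = [("Starter", 20), ("1A", 40), ("2A", 60)]
ROUTE_B = [("Starter", 20), ("1B", 40), ("2B", 60)]

# Assumed mapping of test numbers to CELPT test (X or Y) and route.
tests = {
    "21001": make_test("X", ROUTE_A),
    "71001": make_test("X", ROUTE_B),
    "21002": make_test("Y", ROUTE_A),
    "71002": make_test("Y", ROUTE_B),
}

total_slots = sum(len(t) for t in tests.values())      # 4 x 120
unique_no_overlap = len(set().union(*tests.values()))  # each starter counted once

# Hypothetical: the first N_SHARED questions of Y's Part 2A are reused from X's Part 2A.
N_SHARED = 30
relabel = {f"Y:2A:{i}": f"X:2A:{i}" for i in range(1, N_SHARED + 1)}
tests["21002"] = {relabel.get(q, q) for q in tests["21002"]}
unique_with_overlap = len(set().union(*tests.values()))

print(total_slots, unique_no_overlap, unique_with_overlap)  # 480 440 410
```

The 480 double-counts each 20-question starter, which is why it drops to 440; every further question shared between tests lowers the number to calibrate by one.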

  8. How should the task be tackled? • Four different tests – 4 x 120 questions • Four different sets of students • Key is the common questions • Use questions to link during analysis • But how? • Many ways to do this!
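
One of the many ways is simple mean-shift linking on the common questions. The sketch below is not the method used by IBS, which the slides do not describe; the function names, question labels and difficulty values are all hypothetical.

```python
from statistics import mean


def link_shift(bank, new):
    """Mean difference in difficulty over the questions common to both."""
    common = sorted(bank.keys() & new.keys())
    if not common:
        raise ValueError("no common questions: the tests cannot be linked")
    shift = mean(bank[q] - new[q] for q in common)
    return shift, common


def add_to_bank(bank, new):
    """Put the new test on the bank scale and add its unseen questions."""
    shift, _ = link_shift(bank, new)
    merged = dict(bank)  # existing bank values are kept unchanged
    for question, difficulty in new.items():
        if question not in merged:
            merged[question] = difficulty + shift
    return merged, shift


# Made-up difficulty estimates in logits.
bank = {"Q1": -0.50, "Q2": 0.20, "Q3": 1.10}
new_test = {"Q2": -0.10, "Q3": 0.80, "Q4": 0.40}

merged, shift = add_to_bank(bank, new_test)
print(round(shift, 2), round(merged["Q4"], 2))
```

Banking one test at a time in this way makes each step depend on the links already made, which is the contrast with a joint analysis that the later slides explore.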

  9. Hands on soon! • Task for the afternoon • Does it matter how to do this? • Could bank one test at a time • Adding tests one by one • Could all tests be banked together?

  10. IBS – Item Banking System • Takes in data from many tests • Finds common questions • Includes existing bank if appropriate • Analyses to find optimum statistics • Banks the end results

  11. Hands On – for ‘Test’ 21002 • IBS analysis of test 21001 • IBS analysis of test 71001 and bank • One group • Plot difficulty estimates • For common questions in tests • One set from the test, one from the bank

  12. Hands On – for ‘Test’ 21002 • Also a further IBS analysis • Tests 21001 and 71001 • Second group • Plot difficulty estimates • For common questions in banks • For each type of analysis

  13. Task again • Test 21001 → Bank 1 • Test 71001 + Bank 1 → Bank 2 • Test 21001 + Test 71001 → Bank 3 • Group 1: 71001 and 21001 (Bank 1) • Group 2: Bank 2; Bank 3

  14. Let’s look at FIT in 21002/71002 • CE0888 – Link fit = 4.22 • In 21002 - δ/σ = 1.57/0.21; fit -3.78 • In 71002 - δ/σ = -1.21/0.31; fit -0.52 • In 21002, item is 57/60 in Part 1A • In 71002, item is 39/60 in Part 1B • Omits high in 1A • Question is probably OK

  15. Question CE0888 – 57/1A/I/A Teenagers today are driven by _____________ to do a lot of things they would otherwise not do. A. peer group pressure B. a peer group pressure C. the peer group pressure D. some peer group pressure

  16. Let’s look at FIT in 21002/71002 • CE7414 – Link fit = -5.27 • In 21002 - δ/σ = 1.87/0.21; fit 5.22 • In 71002 - δ/σ = 2.01/0.14; fit 5.07 • In 21002, item is 15/60 in Starter • in 71002, item is 15/60 in Starter • Bad question; unstable difficulty

  17. Question CE7514 – 15/ST/H/C There are many reasons for our losses. They include the following :- __________________________________________; The raw materials we bought were not equal to those specified. A. Violation of set procedures by staff in the assembly plant B. The violation of set procedures in the assembly plant by staff C. Set procedures were violated by staff in the assembly plant D. Violating set procedures being common in the assembly plant

  18. Still need to bank four tests • Test 21001 → Bank 1 • Test 71001 + Bank 1 → Bank 2 • Test 21002 + Bank 2 → Bank 4 • Test 71002 + Bank 3 → Bank 5 • 21001, 71001, 21002, 71002 → Bank 6

  19. Hands On – for all two/four tests • Now have a bank built in steps: Bank 5 • Bank has questions from all four tests • Also have a bank built in one pass: Bank 6 • Also has all questions?

  20. Hands On – for all tests together • Plot difficulty estimates • Bank 5 vs bank 6 • Four groups for plotting • CE0000 - CE2999, CE3000 - CE4999 • CE5000 - CE6999, CE7000 - CE7999 • Plot any questions found common
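
Preparing the data for these plots might look like the sketch below. It takes the second range as CE3000 to CE4999, and the question identifiers and difficulty values are invented for illustration; the function names are hypothetical and the actual plotting is left to whatever tool is at hand.

```python
# The four plotting groups, as (label, lowest number, highest number).
GROUPS = [
    ("CE0000-CE2999", 0, 2999),
    ("CE3000-CE4999", 3000, 4999),
    ("CE5000-CE6999", 5000, 6999),
    ("CE7000-CE7999", 7000, 7999),
]


def group_of(item_id):
    """Return the plotting group for an identifier such as 'CE3500'."""
    number = int(item_id[2:])
    for label, low, high in GROUPS:
        if low <= number <= high:
            return label
    return None


def paired_estimates(bank_a, bank_b):
    """Collect (id, difficulty in A, difficulty in B) for common questions."""
    pairs = {label: [] for label, _, _ in GROUPS}
    for item_id in sorted(bank_a.keys() & bank_b.keys()):
        label = group_of(item_id)
        if label is not None:
            pairs[label].append((item_id, bank_a[item_id], bank_b[item_id]))
    return pairs


# Made-up estimates standing in for the stepwise and one-pass banks.
bank5 = {"CE0001": 0.9, "CE3500": -0.4, "CE5200": 0.1, "CE7100": 1.5}
bank6 = {"CE0001": 1.0, "CE3500": -0.3, "CE7100": 1.4, "CE7200": 0.2}

for label, rows in paired_estimates(bank5, bank6).items():
    print(label, rows)
```

Each group then gives one scatter plot of one bank's estimates against the other's; points far from the identity line are the questions whose difficulty depends on how the bank was built.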

  21. Hands On – for all tests together • Statistical estimation – how robust? • What of fit? • Fit within is important • Fit between is also very important • Different groups of students • Fit will be a reflection of many factors

  22. More general points • Joint approach to calibration is preferable • Balances (smooths) lumpy data • Gives a better overall idea of ‘reality’ • Helps to identify real problems • Measurement model makes this possible

  23. The analysis of many tests • Analysis has to be possible – connectivity • Design needs consideration • Subtests – questions are only in one subtest • A subtest is the largest possible group of questions that occurs in a unique grouping of tests • For example, Q1, Q4 and Q6 are in tests X and Y

  24. Linking • With four tests the maximum number of subtests is: • 4C1 + 4C2 + 4C3 + 4C4 • Or 4 + 6 + 4 + 1 = 15 • All 15 are found in this analysis • Provides test by test linking • Some subtests very small
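
The subtest idea and the count of 15 can be sketched as follows. The function names, the third test Z and the question labels are hypothetical; only the X and Y example and the four-test arithmetic come from the slides.

```python
from collections import defaultdict
from math import comb


def subtests(tests):
    """Group questions by the exact set of tests in which they appear."""
    membership = defaultdict(set)
    for test_id, questions in tests.items():
        for question in questions:
            membership[question].add(test_id)
    groups = defaultdict(set)
    for question, test_ids in membership.items():
        groups[frozenset(test_ids)].add(question)
    return dict(groups)


def is_connected(tests):
    """True if every test is reachable from the first via shared questions."""
    ids = list(tests)
    if not ids:
        return True
    seen, frontier = {ids[0]}, [ids[0]]
    while frontier:
        current = frontier.pop()
        for other in ids:
            if other not in seen and tests[current] & tests[other]:
                seen.add(other)
                frontier.append(other)
    return len(seen) == len(ids)


# Non-empty groupings of four tests: 4C1 + 4C2 + 4C3 + 4C4.
max_subtests = sum(comb(4, k) for k in range(1, 5))

tests = {
    "X": {"Q1", "Q2", "Q4", "Q6"},
    "Y": {"Q1", "Q3", "Q4", "Q6"},
    "Z": {"Q3", "Q5"},
}
found = subtests(tests)
print(max_subtests, len(found), is_connected(tests))  # 15 4 True
```

Because each question belongs to exactly one subtest, the subtests partition the bank, and the connectivity check is what makes a joint analysis possible at all.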

  25. Fit again • Misfitting people • Misfitting questions • All due to question/person interactions • Not independent • Examples

  26. Where have we been today? • Seeking to learn more about what we do • Wishing to measure not just report • Looking at inconsistent behaviour • Trying to understand our data • Aiming to build better tests

  27. An Impossibility! • How can the Rasch Model apply fully? • The formulation is so strict! • But it is a system to provide measurement • If it fails, then at least we can know about it • But hang on, the model failing? • More like the data not fitting…
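
For reference, the strict formulation in question is, in its standard dichotomous form, the statement that the probability of person n answering question i correctly depends only on the difference between the person's ability and the question's difficulty. The symbols below are the conventional ones rather than anything defined on the slides: θ_n is the ability of person n, δ_i is the difficulty of question i (matching the δ used in the fit slides), and X_ni is 1 for a correct response.

```latex
P(X_{ni} = 1 \mid \theta_n, \delta_i)
  = \frac{\exp(\theta_n - \delta_i)}{1 + \exp(\theta_n - \delta_i)},
\qquad
\ln \frac{P(X_{ni} = 1)}{P(X_{ni} = 0)} = \theta_n - \delta_i
```

The second form shows why the model is demanding: the log-odds must be the same simple difference for every person and question, so any person/question interaction appears as misfit.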

  28. So here is the rub… • The test constructor needs to be aware • Of what is required and what has happened • Where the data have come from • Which students were used, what questions • The model will help to interpret the data • In the end, it all depends on the user

  29. In Conclusion - 1 • It has been a romp • And there is much more to say and do • Plenty of books • Plenty of analysis programs • Rasch community

  30. In Conclusion - 2 • Bond, T. G. and Fox, C. M. Applying the Rasch Model: Fundamental Measurement in the Human Sciences. Lawrence Erlbaum Associates. ISBN 0-8058-4252-7
