
Advising on the Construction of Psychological and Educational Tests

Advising on the Construction of Psychological and Educational Tests. Gideon J. Mellenbergh. University of Amsterdam, The Netherlands. Paper presented at the Colloquium Advising on Research Methods Royal Netherlands Academy of Arts and Sciences (KNAW)

Presentation Transcript


  1. Advising on the Construction of Psychological and Educational Tests Gideon J. Mellenbergh University of Amsterdam, The Netherlands

  2. Paper presented at the Colloquium Advising on Research Methods, Royal Netherlands Academy of Arts and Sciences (KNAW), Amsterdam, The Netherlands, March 28-29, 2007

  3. Contents • 1. Definitions • 2. General discussion topics • 3. Item writing • 3.1 Maximum performance items • 3.2 Typical performance items • 3.3 Pilot studies on item quality • 4. First draft • 4.1 Assembly of maximum performance tests • 4.2 Assembly of typical performance tests • 5. Administration modes • 6. Try-out

  4. Definitions A psychological or educational test is an instrument for the measurement of a test taker’s (maximum or typical) performance, which is assumed to reflect a latent variable, under standardized conditions.

  5. Reflective: the test reflects a latent variable (construct). This is an assumption that must be tested. Maximum performance: abilities or aptitudes, skills (e.g., intelligence, arithmetic). Typical performance (questionnaire): attitudes, interests, personality characteristics.

  6. Remarks slide 5 The distinction between maximum and typical performance was made by Cronbach (1960). The latent variable can be a continuous or a discrete (e.g., ordinal-polytomous, dichotomous) variable. Standardized conditions are needed to make fair comparisons between and within test takers.

  7. Definitions 2. An item is the smallest possible building block of a test.

  8. 2. General discussion topics 1. The construct (latent variable) of interest. Which ability, attitude, interest or personality characteristic must be measured?

  9. 2. General discussion topics 2. The target population (of interest) and the frame population (that can be tested).

  10. 2. General discussion topics 3. Existing tests for the construct of interest (literature, documentations, psychometric qualities etc.).

  11. Remark slide 10 Tests exist for many different latent variables, but often these tests do not satisfy the client’s research needs.

  12. 2. General discussion topics 4. Objectives: research, diagnosis, decision making etc.

  13. Construction strategies Preference for theory-based strategies: - Construct Method - Facet Design Method

  14. Remark slide 13 Empirical research indicates that these two methods yield better tests than other test construction methods (Oosterveld, 1996).

  15. Construct Method (Jackson, 1971) Item writing is based on a theory of the content of the latent variable that will be measured. For example, Ettema’s (2005) questionnaire for the assessment of dementia patients’ quality of life.

  16. Construct Method (Jackson, 1971) Definition based on literature review: Dementia specific quality of life is the multidimensional evaluation of the person-environment system of the individual, in terms of adaptation to the perceived consequences of the dementia.

  17. Construct Method (Jackson, 1971) Theory: Dröes’ adaptation-coping model. The model distinguishes seven adaptation dimensions for coping, for example, ‘Developing an adequate relationship with the staff’. Item example: ‘Has conflicts with caretakers’

  18. Facet Design Method (Guttman, 1965) Item writing is based on a conceptual analysis of the construct. A number of facets are distinguished, and each of the facets has a number of elements. For example, Stouthard’s (1993) questionnaire for the measurement of patients’ dental anxiety.

  19. Facet Design Method (Guttman, 1965) Construct: dental anxiety. Facets: (1) time before treatment (elements: chair, waiting room, on the way to, at home); (2) aspects of dental treatment (elements: introductory, patient-dentist interaction, treatment); (3) patient’s reactions (elements: emotional, physical, cognitive)

  20. Facet Design Method (Guttman, 1965) Factorial combination of the facets 4 x 3 x 3 = 36 cells of the facet design Item example Waiting room/Treatment/Emotional ‘When I know the dentist is going to extract a tooth, I am already afraid in the waiting room’
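The factorial combination on slides 19 and 20 can be checked with a short sketch. The facet and element labels are taken from the slides; the variable names are illustrative only and do not come from Stouthard's questionnaire.

```python
from itertools import product

# Facets and their elements, as listed for the dental anxiety construct.
time_facet = ["chair", "waiting room", "on the way to", "at home"]
treatment_facet = ["introductory", "patient-dentist interaction", "treatment"]
reaction_facet = ["emotional", "physical", "cognitive"]

# A cell of the facet design combines one element from each facet.
cells = list(product(time_facet, treatment_facet, reaction_facet))

# The item example on slide 20 belongs to this cell.
example_cell = ("waiting room", "treatment", "emotional")

print(len(cells))             # 4 x 3 x 3 = 36
print(example_cell in cells)  # True
```

Listing the cells this way makes it easy to see which cells still lack an item when a first draft is assembled.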

  21. 3. Item writing • An item consists of • a task • a response mode

  22. 3. Item writing • Tasks • Maximum performance tests: problem • Typical performance test: statement

  23. 3. Item writing • Response modes • Free response • Choice

  24. 3. Item writing Example Free response 8 x 14 = ...

  25. 3. Item writing Example Choice 8 x 14 = (1) 32 (2) 112 (3) 132

  26. 3.1 Maximum performance items Free-response - short-answer items 8 x 14 = ...

  27. 3.1 Maximum performance items Free-response - essay item ‘Give reasons for the outbreak of the French revolution’

  28. 3.1 Maximum performance items Free-response Responses to free-response items must be graded by judges (correct, partly correct, incorrect)

  29. 3.1 Maximum performance items Choice Conventional: multiple-choice items Preferred number of options: 3

  30. Remark slide 29 More options reduce the probability of guessing the correct answer. However, item writers often have difficulty writing a fourth or fifth plausible option. It is therefore recommended to write somewhat more three-choice items rather than fewer four- or five-choice items.
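The trade-off in this remark can be made concrete with a small sketch. It assumes blind guessing, with every option equally likely to be chosen; that assumption and the test length of 30 items are illustrative and are not part of the presentation.

```python
from fractions import Fraction


def guessing_probability(n_options: int) -> Fraction:
    """Chance of a correct answer under blind guessing (assumed: equally likely options)."""
    if n_options < 2:
        raise ValueError("a choice item needs at least two options")
    return Fraction(1, n_options)


def expected_chance_score(n_items: int, n_options: int) -> Fraction:
    """Expected number of correct answers when every item is guessed blindly."""
    return n_items * guessing_probability(n_options)


# Hypothetical test length of 30 items, compared across 3, 4 and 5 options.
for n_options in (3, 4, 5):
    print(n_options, guessing_probability(n_options), expected_chance_score(30, n_options))
```

Under these assumptions the guessing probability drops from 1/3 to 1/5 when going from three to five options, which is the gain that has to be weighed against the difficulty of writing plausible extra distractors.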

  31. 3.1 Maximum performance items Choice Structure Stem 8 x 14 = Distractor 1 32 Correct option 112 Distractor 2 132

  32. 3.1 Maximum performance items Choice Three options are in alphabetical, logical, or numerical order Options are in vertical position
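The structure on slide 31 and the ordering rule on slide 32 can be sketched as a small data structure. The class name `ChoiceItem` and its fields are hypothetical and only serve this illustration; numerical ordering is shown because the example options are numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChoiceItem:
    """A multiple-choice item: a stem, one correct option and its distractors."""
    stem: str
    correct: int
    distractors: tuple

    def options(self) -> list:
        """All options in ascending numerical order."""
        return sorted((self.correct, *self.distractors))

    def render(self) -> str:
        """Stem followed by the options, one per line (vertical position)."""
        lines = [self.stem]
        for number, option in enumerate(self.options(), start=1):
            lines.append(f"({number}) {option}")
        return "\n".join(lines)


# The example item from the slides; distractors are entered in arbitrary order.
item = ChoiceItem(stem="8 x 14 =", correct=112, distractors=(132, 32))
print(item.render())
```

Sorting at render time means the item writer does not have to keep the options ordered by hand when a distractor is replaced.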

  33. Item writing rules A large number of useful rules - alphabetical, logical, or numerical order - vertical option positions - avoid tricks - avoid window dressing - three options of equal length - avoid negatives - distractors which are plausible for test takers who don’t know the correct answer Etc.

  34. Remark slide 33 An overview of item writing rules is given by Haladyna, Downing and Rodriguez (2002).

  35. Clients are advised to check their concept items against these rules.

  36. Test takers’ responses are assessed on a response scale: dichotomous (correct/incorrect); ordinal-polytomous (e.g., correct/partly correct/incorrect); bounded-continuous (e.g., number of seconds an examinee needs to give the correct answer to the multiplication ‘8 x 14’)
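The three response scales can be illustrated with a minimal scoring sketch. The numeric codes (0, 1, 2) and the use of a time limit as the upper bound are assumptions made for this example; the presentation does not prescribe them.

```python
# Assumed numeric codes for the ordinal-polytomous grades (not from the slides).
ORDINAL_CODES = {"incorrect": 0, "partly correct": 1, "correct": 2}


def score_dichotomous(response, key) -> int:
    """1 if the response matches the key, otherwise 0."""
    return 1 if response == key else 0


def score_ordinal(grade: str) -> int:
    """Map a judge's grade onto its ordered category code."""
    return ORDINAL_CODES[grade]


def score_bounded_continuous(seconds: float, time_limit: float) -> float:
    """Response time in seconds, bounded below by 0 and above by an assumed time limit."""
    if seconds < 0:
        raise ValueError("response time cannot be negative")
    return min(seconds, time_limit)


print(score_dichotomous(112, key=112))                  # correct answer to '8 x 14'
print(score_ordinal("partly correct"))
print(score_bounded_continuous(75.0, time_limit=60.0))  # capped at the limit
```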

  37. 3.2 Typical performance items Structure: Statement & Response scale

  38. 3.2 Typical performance items • Response scales • dichotomous • ordinal-polytomous • bounded-continuous

  39. 3.2 Typical performance items Classification (statement: response scale) • Frequency: number of frequency categories • All-or-None Endorsement: two categories • Continuous Endorsement Intensity: uninterrupted scale • Discrete Endorsement Intensity: more than two ordered categories

  40. Example Frequency How frequently are you happy? (a) never (b) seldom (c) sometimes (d) often (e) usually (f) always

  41. Example • All-or-None Endorsement • Thurstone and Chave’s (1929) Attitude Toward the Church Questionnaire • I feel that church attendance is a fair index of the nation’s morality • (a) agree • (b) don’t agree

  42. Example Continuous Endorsement Intensity Thurstone and Chave (1929) Write an x somewhere on the line below to indicate where you think you belong. Line labels, from left to right: ‘Strongly favorable to the church’, ‘Neutral’, ‘Strongly against the church’

  43. Example • Discrete Endorsement Intensity • Likert’s (1932) Internationalism Attitude Questionnaire • Our country should never declare war again under any circumstances • (a) Strongly approve • (b) Approve • (c) Undecided • (d) Disapprove • (e) Strongly disapprove • Preferred number of options of Likert items: 4 to 7

  44. Clients are advised to make an informed choice between these different item types.

  45. Item writing rules A large number of rules • Use positive statements and avoid direct negatives. A positive statement can be indicative (‘I am feeling great’) or contra-indicative (‘I am feeling blue’)

  46. Item writing rules A large number of rules • If a statement consists of a condition (‘at noisy parties’) and a behavior part (‘I am feeling uneasy’) put the condition at the beginning, for example: • ‘At noisy parties, I am feeling uneasy’ Etc.

  47. Remark slide 46 This rule may go against correct use of style. Test translators tend to reverse the condition and behavior parts in their item translations.

  48. Clients are advised to use these rules.

  49. 3.3 Pilot studies on item quality Clients are advised to do the following pilot studies on concept items: 1. Experts: A small group of experts (in both content and item writing) discusses the concept items

  50. Remark slide 49 The group needs to consist of (1) content, and (2) item writing experts.
