Psychometric Defined by Research

Presentation Transcript


  1. Psychometric Defined by Research

  2. Goals of This Session • Brief wrap up of brown bags • Psychometrics defined through research • Broad historical perspective • Research framework • The parallel universe concept • Some current research here at Measured Progress • Some concluding remarks

  3. Psychometric Brown Bags • All these brown bags have been introductory in nature • Eventually, these will be posted on our company website • Staff members • Clients • Teachers & parents

  4. Psychometric Brown Bags • We have covered a lot of ground • Statistics, classical test theory, item response theory • Standard setting, equating, adaptive testing, DIF, skills diagnosis

  5. Psychometric Brown Bags • If you found these talks interesting, let us know • Because our presentations have been introductory, there’s lots more we could present on

  6. We can really define psychometrics from a variety of perspectives • Historical • Assessment program • Analyzing data here at Measured Progress • Research

  7. Historical Perspective • The history of psychometrics has deep roots at the crossroads of psychology, physiology, and philosophy • Ultimately these disciplines are trying to better understand the human experience • Psychometrics does this by quantifying behavioral observations

  8. Historical Perspective • Early psychometricians focused primarily on the quantification of intelligence • Psychometricians have also worked extensively on the application of psychometric models to assess patients within a clinical setting

  9. Historical Perspective • Psychometrics is ultimately a very broad discipline • Psychometrics is an example of the blending of the social sciences with the quantitative sciences • Sociometrics • Econometrics

  10. Research in Psychometrics • Because psychometrics is a broad discipline, there are many national and international research organizations and societies • This results in many • Peer-reviewed journals • Conferences • Opportunities for research

  11. Research Societies • American Educational Research Association • National Council on Measurement in Education • Psychometric Society • American Psychological Association • International Test Commission • Society for Industrial and Organizational Psychology

  12. Psychometricians at Other Organizations • Again, because our discipline is broad, psychometricians work in a variety of places: • American Institute of Certified Public Accountants • National Board of Medical Examiners • Law School Admission Council • The RAND Corporation • Research Triangle Institute

  13. Research at Other Organizations • Research Agendas • This tends to be a laundry-list approach of ideas that are not well connected • Products and Services • This is a narrowly focused method with a specific goal • Neither approach is resource friendly, and both lead to research programs that are not well orchestrated

  14. Psychometric Research at Measured Progress • We wanted to come up with a different way of organizing and conducting research • Our approach is an attempt at: • Connecting research projects in meaningful ways • Allowing product-based research to be done in a cost-effective manner • Connecting research with products

  15. Psychometric Research at Measured Progress • This approach also allows for external opportunities • Interns • Through other research institutes • Center for Assessment • Center for Advanced Studies in Measurement and Assessment • Center for Educational Research and Evaluation • The Research & Evaluation Methods Program • Visiting Scholars • Clients

  16. Research Framework • Because all assessment programs have some common structure, any research project should fit somewhere in that structure. • Most research projects relate to more than one area. Still, a framework with separately delineated areas is helpful for organizing and discussing such research.

  17. Research Framework • Design and Modeling • Statistical Analyses • Scoring and Reporting

  18. Design and Modeling • Included in this category is research having to do with modeling the students, the assessment tasks, the interaction of the students with the tasks, or test-centered research • The focus is on the design or modeling of the test as a whole

  19. Design and Modeling • Task modeling • Student modeling • Modeling Student-Task interaction • Test-centered modeling research • Test design • Test assembly

  20. Statistical Analyses • Focus is on statistics used to evaluate the individual assessment tasks, and the overall assessment instrument with respect to the psychometric model applied to the test data. • This includes research on the calibration of psychometric models, model fit analyses, estimation of reliability, and validity analyses.

  21. Statistical Analyses • Calibration and ability estimation • Interpretation of estimated parameters • item parameters and ability distribution • Model fit • Reliability and Generalizability • Validity • internal and external
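
As an illustration of the "Reliability" bullet above, here is a minimal sketch of one classical reliability estimate, coefficient (Cronbach's) alpha. The slides do not say which reliability estimate is used at Measured Progress, so the choice of alpha, the function name `cronbach_alpha`, and the toy data are assumptions made for illustration only.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Coefficient alpha for a students-by-items score matrix.

    `scores` is a list of rows, one per student; each row is a list of
    item scores. This is an illustrative sketch, not a production routine.
    """
    n_items = len(scores[0])
    if n_items < 2:
        raise ValueError("alpha needs at least two items")

    # Variance of each item across students.
    item_vars = [pvariance([row[j] for row in scores]) for j in range(n_items)]
    # Variance of the total (summed) score across students.
    total_var = pvariance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance")

    return n_items / (n_items - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical toy data: two items that rank four students identically.
consistent = [[1, 1], [0, 0], [1, 1], [0, 0]]
# Hypothetical toy data: two items that are unrelated to each other.
unrelated = [[1, 0], [0, 1], [1, 1], [0, 0]]
```

With these toy inputs, `cronbach_alpha(consistent)` is 1.0 and `cronbach_alpha(unrelated)` is 0.0, which matches the intuition that alpha rises as items agree with one another.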

  22. Scoring and Reporting • Here the focus is on how best to score assessment tasks and the assessment instrument as a whole. • This includes how to transform the observed scores and ability estimates from the psychometric model into useful and interpretable score reports

  23. Scoring and Reporting • Observed scores, scaled scores, & IRT ability • Equating • Linking • Standard setting • Score Reports and Interpretive Guides
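
To make the step from "IRT ability" to "scaled scores" concrete, here is a minimal sketch of a linear scale transformation. The slope, intercept, and reporting range below are hypothetical values chosen for illustration; the slides do not describe the actual scaling used by any testing program, and real programs may use nonlinear transformations.

```python
def scaled_score(theta, slope=10.0, intercept=50.0, lowest=20, highest=80):
    """Map an IRT ability estimate (theta) onto a reporting scale.

    All default constants are hypothetical. The result is rounded to a
    whole number and clamped to the reporting range [lowest, highest].
    """
    raw = slope * theta + intercept
    return max(lowest, min(highest, round(raw)))
```

For example, under these assumed constants an ability estimate of 0 maps to 50, an estimate of 1.23 maps to 62, and extreme estimates are held at the ends of the reporting range.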

  24. The Parallel Universe Concept

  25. Parallel Universe • A research project will certainly fit somewhere in the Framework – it’s helpful for organizing different research projects. • But can the converse be true? Could the Framework fit into a research project? Could the Framework help organize the research project?

  26. Parallel Universe • Sometimes a research project is better characterized as a research program. Research Program: A set of research projects organized around a common theme and intended to address most or all of the components listed in the Framework. • There are also other parallel universes besides research programs! For example, any one of our testing programs!

  27. Parallel Universe Example: Skills Diagnosis Research Program • Design and Modeling • How does one design a test specifically for diagnostic purposes? What’s the psychometric model? Content specifications? IRT specs? • Statistical Analyses • Need new estimation methods for new models. Also new fit statistics. How do we estimate reliability? Internal validity stats? • Scoring and Reporting • How to report diagnostic scores?

  28. Parallel Universe Example: A State Testing Program • Design and Modeling • Which psychometric model will be used? • How many items? How many subscores? • Statistical Analyses • What calibration software is used and in what way? • What kind of supporting statistical analyses will be done? DIF? Dimensionality? Validity? Reliability? • Scoring and Reporting • Design of the score report. • Statistics to be reported. • Interpretation of the scores

  29. Current Research • Here are 8 of the 17 papers we’re presenting at the 2007 AERA/NCME Meeting in Chicago, with how each fits in the Framework and its possible relevance to real life.

  30. Conditional item exposure in multidimensional adaptive testing. • Researchers: Matt Finkelman, Michael Nering, & Louis Roussos. • Framework: Design and Modeling • Modeling the item selection algorithm so as to prevent items from being over-exposed. • Application: There is interest in using CAT with multidimensional IRT, but current exposure control techniques won’t work.

  31. Generalized Mathematical Formulation for Computing Inter-Rater Inconsistency for BOW, Bookmark, and Yes/No methods • Researchers: Abdullah Ferdous & Barbara Plake (Univ. of Nebraska) • Framework: Statistical Analyses and Scoring and Reporting • Standard setting raters are part of the scoring procedure • Provide statistical support for internal validity • Application: Can be used as part of the standard setting process to improve the quality of the ratings.

  32. Use of Subset of Test Items in Bookmark Standard Setting • Researchers: Abdullah Ferdous • Framework: Scoring and Reporting • Standard setting is part of the scoring procedure • Application: Can be used to streamline the Bookmark standard setting procedure, saving money and time, and perhaps increasing reliability by reducing fatigue.

  33. Using the DFIT framework to evaluate equating items • Researchers: Michael Nering & Wonsuk Kim. • Framework: Statistical Analyses • Support the internal validity of the equating items. • Application: A new method that may be more sensitive to ill-suited equating items than the method currently used.

  34. Using Person Fit in a Body of Work Standard Setting • Researchers: Matt Finkelman & Wonsuk Kim. • Framework: Statistical Analyses (major) and Scoring and Reporting (minor) • Statistical support for selecting students to be used in the body of work standard setting method. • Application: A new method for detecting aberrant students who should be excluded from the BOW standard setting.

  35. Development and evaluation of an effect-size measure for the DIMTEST statistic • Researchers: Minhee Seo (U. of Ill.) & Louis Roussos • Framework: Statistical Analyses • DIMTEST assesses test unidimensionality, giving statistical support for test internal validity. • Application: Testing programs want us to check dimensionality. DIMTEST is a reliable hypothesis test. An effect-size measure greatly improves the interpretation of DIMTEST results.

  36. Variations of Body of Work. • Researchers: Kevin Sweeney and Abdullah Ferdous. • Framework: Scoring and Reporting. • Body of Work is a standard setting method, which, of course, determines cut scores for a test. • Application: To improve the efficiency of BOW by reducing the time and work activities in preparing for and conducting the standard setting.

  37. Detection of compromised items in personnel selection examination • Researchers: Yongwei Yang (Gallup), Abdullah Ferdous, & Katherine Chin (U. of Neb). • Framework: Statistical Analyses—Validity. • Old method looked at change in p-value over time; new improved method does this conditional on ability. • Application: Improves the efficacy of personnel selection by getting rid of compromised items. Can also be used to improve item bank for appropriate assessment programs (like CAT).
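
The contrast between the old and new approaches on this slide can be sketched as follows. This is not the authors' method, whose details are not given in the transcript; it is only a minimal illustration of why a change in an item's p-value (proportion correct) is more informative when computed within ability strata. The function names, the stratum labels, and the toy records are all hypothetical.

```python
from collections import defaultdict


def p_value(responses):
    """Classical item p-value: proportion of correct (1) responses."""
    return sum(responses) / len(responses)


def unconditional_p_shift(early, late):
    """Change in p-value between two time windows, ignoring ability."""
    return p_value([c for _, c in late]) - p_value([c for _, c in early])


def conditional_p_shift(early, late):
    """Change in p-value computed within ability strata, then averaged.

    `early` and `late` are lists of (ability_stratum, correct) pairs for
    one item. Strata are weighted by their size in the late window.
    """
    def by_stratum(records):
        groups = defaultdict(list)
        for stratum, correct in records:
            groups[stratum].append(correct)
        return groups

    e, l = by_stratum(early), by_stratum(late)
    shared = sorted(set(e) & set(l))
    if not shared:
        raise ValueError("no ability stratum appears in both windows")

    total_weight = sum(len(l[s]) for s in shared)
    return sum(
        len(l[s]) * (p_value(l[s]) - p_value(e[s])) for s in shared
    ) / total_weight


# Hypothetical toy data: the later window simply has more high-ability
# examinees; within each stratum the item behaves exactly as before.
early_window = [("low", 0), ("low", 0), ("high", 1), ("high", 1)]
late_window = [("low", 0), ("high", 1), ("high", 1), ("high", 1)]
```

In this toy case the unconditional shift is 0.25, which might look like item exposure, while the conditional shift is 0.0, showing that the apparent change comes entirely from a shift in who took the test.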

  38. Concluding Remarks • In Educational Assessment, psychometric research and practice are interdependent. • Good communication between research and practice is essential for the efficacy of both. • Our research ideas come directly from questions and problems that arise in practice. • Our Research Framework helps give structure and completeness to both our research and practice.
