The Challenges of Reading Comprehension Assessment

1. The Challenges of Reading Comprehension Assessment, by Tracey Cullen

2. Our understanding of reading has changed markedly in recent decades: from seeing it as a transmission of information from author to reader, to seeing the reader as actively constructing meaning by interacting with the text.

3. The teaching of reading has changed as a result: Comprehension strategies such as visualising, summarising, inferring, asking questions etc. are taught using rich and authentic texts. The processes involved in reading the text are as important as the end product of the reading.

4. "This dramatic shift in thinking about reading as a process requires an equally momentous shift in our thinking about reading assessment" (Henk, 1993, p. 103). There is concern that the significant improvements in the quality of instruction based on the new understanding of the reading process may be masked while teachers continue to use outdated measures of assessment (Henk, 1993; Kamil, 2000; Valencia, 1987).

5. Assessment: Making a positive difference to student learning should be the main focus of assessments. When reading assessment information is used appropriately, research shows that teaching and learning can be enhanced.

6. Problem... However, the way some assessments are being used is having a negative impact on teaching and learning (Clarke, 2001; Farr, 1992; Gambrell, 2007; Teale, 2008). "The policy of the past decade of promoting testing as if it were the means to higher reading achievement has shown no evidence whatsoever that this was a good idea" (Teale, 2008, p. 358).

7. As a result, teachers can spend a great deal of time assessing students to provide data for school managers or school reports. This uses up valuable instructional time, and either the assessments aren't used to inform teaching or they don't provide enough useful information to inform it (Clarke, 2001; Gambrell, 2007; Ministry of Education, 1997; Timperley, 2003).

8. The Informal Reading Inventory (IRI) is one of these "outdated" assessments. Students read a text, usually 100-200 words, silently and orally while the teacher takes a running record, and then the student answers a series of questions. It is a good diagnostic tool for identifying the reading strategies used by non-fluent or struggling readers. It is one of the most commonly used methods of assessing fluent readers in NZ and internationally.

9. So what's wrong with Informal Reading Inventories?

10. Low-Level Questioning: "Despite what we know about the complexities of reading comprehension... many teachers and assessment specialists still measure comprehension by how well children recall the details of what they have read"

11. Unfortunately, most of the questions asked require only low-level inferencing or simple recall of parts of the text, and do not encourage deep thinking about the text or ownership of active comprehension strategies. "IRI not sensitive enough to distinguish between children who can remember and those who can think about it"

12. This focus on low-level thinking can cause students to believe that reading is just the storing and retrieving of information and the answering of questions: insignificant, annoying and even painful. "Lists of pre-planned questions may turn into a ritualised quiz and lose their effectiveness"

13. Results of international reading tests show that students are able to answer questions, extract meaning and recall, but they cannot critically respond to text (Allington, 2006). They do not spontaneously identify with characters or imagine themselves in stories. There is widespread concern about levels of reading comprehension in schools internationally and in New Zealand (Applegate, 2002; Davis, 2006; Stone, 2004).

14. Only One Answer: From what we know about the complex process of reading comprehension and how it involves an interaction between the reader and the text, assuming that there is one correct answer to a question is naive and unsound. "How with even a modicum of respect for fairness, can we use tests with single correct answers if we know that answers are influenced by experience and background knowledge?"

15. Prior Knowledge: Many of the questions can be answered without even reading the text. This is because background or prior knowledge plays a significant role in the comprehension of texts. Higher knowledge of a topic correlates with higher comprehension scores. Would it be fairer to ensure that each student gets the chance to read about something they are familiar with, so that it is their comprehension skills that are being assessed and not their level of prior knowledge?

16. Memory: In order to answer questions correctly, you need to be able to remember what you have read. When informal reading inventories are given and the text is removed from the student at question time, it becomes a test of short-term memory. Critics would say, however, that if you understand something, you will remember it better than if you did not understand it so well. Some students have memory difficulties, and even though they have comprehended well while reading, they are unable to retrieve the necessary information from memory. Is this a true indication of their level of reading comprehension?

17. Expressive Problems: Similar to the memory debate is the issue of students comprehending texts well but being unable to express themselves in written or oral forms in test situations, thus giving a false indication of their level of understanding. It has been suggested that expressive (output) methods should not be used for assessing an input processing task such as reading comprehension.

18. Short and Unnatural Texts: IRIs usually have only one or two passages per age level to choose from. These texts are short and unnatural, preventing students from engaging with them. As a result, the results of the IRI may not be indicative of the student's reading comprehension ability. Perhaps students view these texts differently from a normal text that they would read, and so do not use their skills optimally. There is wide recognition in the academic literature that motivation and interest in a task affect the outcome, but most reading assessment tests, including IRIs, appear to have paid little attention to this fact.

19. "Even the most highly motivated student can become bored having to answer twenty questions on a three paragraph test" (Day & Park, 2005, p. 68).

20. Variation in task results: Research also shows that results will vary depending on the task given to students to measure comprehension. "Even measures that purport to get at the same construct give different results for the same child and even students who achieve the same score on the same measure may earn that score in quite different ways" (Shuy et al., 2006, p. 223).

21. One study showed that children provided much fuller accounts of a text they had read when they told a friend about it than when they told their teacher about it, indicating that children's responses are not always a true indication of their level of understanding of a text.

22. In another study, two groups of children were given the same text to read. The first group had to answer questions about what they thought might happen in the text, while the second group had to sketch a picture of their prediction. The group that sketched their prediction was far more engaged and showed a much more thorough understanding than the group that answered the questions.

23. Time Consuming: IRIs take a large amount of time to administer for very little benefit with fluent readers. With fluent or older readers, little can be gained from this procedure because they do not make many errors, and because they read silently most of the time, oral reading is not always an indication of their actual reading ability.

24. "If running records are used with older readers, there should be a special reason for taking them" (Clay, 2000, p. 25).

25. Reliability and Validity: If a test measures what it is designed to measure, it is valid, and if a test can be replicated with the same results, it is reliable. Reliability and validity can be affected if test takers are: fatigued, stressed, nervous, unmotivated, guessing answers, or misinterpreting questions.

26. Reliability and validity can also be affected if the tests used: are not given in a standardised way; are uninteresting and unnatural; have a low number of passages to choose from. Texts need to be complete enough to have macrostructures that convey meaning: the more words, the more reliable.

27. Ongoing issues with validity and reliability, however, indicate that IRIs are not suitable for age-level placements, even though this is common practice.

28. "Is the PROBE Reading Assessment an Effective Measure of Reading Comprehension?" by Qin Chen & Ken E. Blaiklock, 2007. The study used 33 Year 4 students. The small sample size means that caution is needed in interpreting the results.

29. The test manuals give no information about the reliability or the validity of the test (does it measure what it claims to measure, and are the results consistent?). The study showed low correlations between students' performance on even-numbered questions and their performance on odd-numbered questions, which means there may be problems with the reliability of the PROBE test.

30. The low reliability could be due to the length of the texts (100 words) and the small number of questions (6-8). A high correlation was found between accuracy rates on fiction and non-fiction texts. A low correlation was found between comprehension scores on fiction and non-fiction texts.
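The odd/even comparison described in the two slides above is a split-half reliability check. A minimal sketch of the idea follows, assuming each student's marks are recorded as a list of 0/1 values in question order. The function names and the sample marks are hypothetical illustrations, not data or code from the Chen and Blaiklock study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of the same length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one list has no variation")
    return sxy / sqrt(sxx * syy)


def split_half_correlation(item_scores):
    """Correlate each student's total on odd-numbered questions with
    their total on even-numbered questions.

    item_scores: one list per student of 0/1 marks, in question order.
    """
    odd_totals = [sum(marks[0::2]) for marks in item_scores]   # questions 1, 3, 5, ...
    even_totals = [sum(marks[1::2]) for marks in item_scores]  # questions 2, 4, 6, ...
    return pearson_r(odd_totals, even_totals)


# Hypothetical marks for four students on a six-question passage.
sample_marks = [
    [1, 1, 0, 1, 1, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0, 1],
]
print(round(split_half_correlation(sample_marks), 2))
```

A value close to 1 would suggest the two halves of the test rank students similarly; a low value, as reported for PROBE, suggests the questions are not measuring consistently. With only 6-8 questions per passage, each half contains very few items, which is one reason short tests tend to produce unstable correlations.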

31. "teachers should not use performance on a fiction passage to make judgements about children's comprehension of a non-fiction passage and vice-versa"

32. "Overall, the correlational results suggest that a student's performance on the PROBE test may not be a good indicator of where he or she would score on other tests of reading comprehension".

33. "the lack of information about any trialling of the passages on groups of children means that teachers cannot assume that the assigned Reading Ages are an accurate indication of the average performance of children at particular ages".

34. What should assessments be like then? Have questions that are more like discussions, where personal responses are invited. Focus on the important aspects and be more inferential than straight recall. Questions and activities should evaluate the reader's ability to integrate information across the text rather than recall information and ideas from specific parts. The reader's prior knowledge and the metacognitive skills used while reading should be taken into account. Activities should involve students asking questions. Activities should be related to normal classroom reading tasks.

35. Texts should be long enough to have complete structures and be authentic, motivational and interesting. Texts and activities should be culturally appropriate. Tests should measure the student's reading motivation and habits. Questions should be passage dependent, that is, not answerable without reading the text. Assessments need to reflect the complexity of the reading process and be undertaken during the process of reading. More than one answer should be allowed for each question, with justification.

36. What tools meet these recommendations?

37. Cloze? Every 5th word is deleted. Scoring: 0-34% = too hard, 35-49% = instructional level, 50%+ = independent level. Concerns regarding cloze tests are that they do not assess the complex processes occurring between the text and the reader, and that they do not require the reader to integrate information from the whole passage: they assess only sentence comprehension, which is a lower-order skill.

38. Is it any better than IRIs? Probably no worse and certainly a lot less time consuming. It is quick and easy to prepare. It can be administered to groups of students. Any desired text can be used, which allows authenticity. It can use the text levels identified in journals etc.
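The cloze procedure and score bands given above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function names are hypothetical, deletion starts at the fifth word, punctuation attached to a deleted word is removed along with it, and only the exact deleted word is marked correct (ignoring case and punctuation). The slides do not specify any of these details.

```python
import re


def make_cloze(text, nth=5, blank="_____"):
    """Replace every nth word with a blank.

    Returns the cloze passage and the list of deleted words (the answer key).
    """
    words = text.split()
    answers = []
    for i in range(nth - 1, len(words), nth):
        answers.append(words[i])
        words[i] = blank
    return " ".join(words), answers


def score_cloze(answers, responses):
    """Percentage of blanks filled with the exact deleted word.

    Exact-word marking is an assumption; missing responses count as wrong.
    """
    if not answers:
        raise ValueError("the passage is too short to contain any blanks")

    def normalise(word):
        # Ignore case and surrounding punctuation when comparing words.
        return re.sub(r"[^\w']", "", word).lower()

    correct = sum(normalise(a) == normalise(r) for a, r in zip(answers, responses))
    return 100 * correct / len(answers)


def cloze_level(percent):
    """Map a cloze percentage onto the three bands listed in the slides."""
    if percent >= 50:
        return "independent level"
    if percent >= 35:
        return "instructional level"
    return "too hard"


passage = "one two three four five six seven eight nine ten"
cloze_text, key = make_cloze(passage)
print(cloze_text)
print(cloze_level(score_cloze(key, ["five", "tin"])))
```

Because any text can be passed to `make_cloze`, the sketch reflects the point that cloze passages are quick to prepare from authentic classroom material. It also makes the limitation visible: scoring depends on single words in their local sentence context, not on integrating information across the passage.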

39. Recall Tests: The student reads a text one or more times and then either says or writes down all that is remembered. Guessing, or being given clues from questions, is not an issue. Recall tests are straightforward to prepare but very time consuming and complex to score.

40. Recall tests are said to be highly valid measures of reading comprehension, as the student needs to fully comprehend the text in order to retell it in a logical and coherent way. However, the influence of children's expressive and written skills needs to be taken into account, which lessens the validity. They are also criticised for not encouraging the reflective, deep reading that is part of the complex task of reading comprehension.

41. Naturalistic Methods: Observing students undertaking everyday reading tasks to determine whether they understand what they are reading. Activities can be as simple as observing students at guided reading time to see who offers predictions, answers questions, and discusses types of recreational reading. Activities can be more deliberate and involve tasks such as story mapping, sketching, rating characters, writing to authors and other types of reading responses, e.g. those in Sheena Cameron's "Teaching Comprehension Strategies" (2010) and Alison Davis's "Teaching Reading Comprehension" (2008).

42. Yes... Naturalistic activities: are not time consuming; are not difficult to construct; are not difficult to administer; are authentic tasks; measure the process of comprehension, therefore reflecting current theory in reading and assessment.

43. One of the reasons why teachers use commercially produced comprehension tests such as IRIs is that they like the standardised nature of the tests (even though they aren't standardised at all). All children read the same text and answer the same questions, so one child can be compared with another, whether or not the test is conceptually sound, valid or reliable in its construction.

44. Validity/Reliability: Potentially, naturalistic procedures are the most valid and reliable assessments of reading comprehension.

45. Valid because: observations are daily; they occur in a natural context; they elicit personal comprehension behaviours; students' responses are to normal day-in, day-out tasks rather than contrived, artificial test situations. Reliability is strong because there are many observations on a daily basis, but weak because teachers differ in their ability to evaluate student behaviours.

46. Is it not better to have an assessment method that is theoretically sound, authentic, motivating, interesting, not time consuming, and that measures the process of comprehension, than one that is none of these but does provide the same texts and questions?

47. "Not everything that can be counted counts, and not everything that counts can be counted" (Teale, 2008, p. 188).

48. Words of Wisdom by Tracey Cullen... If educators are serious about raising literacy levels, then student engagement, interest and motivation are critical factors in the equation.

49. If we can develop and use comprehension assessment procedures that promote engagement, interest, and responsive and critical thinking in the task of reading, rather than presenting it as a simple, mundane task where you read a text and answer questions, then perhaps students will view all reading tasks as engaging and motivating and respond critically and thoughtfully.

50. Children learn to read by reading. Therefore, if we want them to think reading is a meaningful, enjoyable, informative activity that they can respond to critically and positively, we need to reflect these visions in our teaching and assessment methods.

51. "There is still much more to learn about how to measure a phenomenon that is as elusive as it is important" (Pearson & Hamm, 2005, p. 64).

52. Side effect: privilege those with highest general verbal ability. Snippets or textoids.

54. Did some work on number 1. Habits are important outcomes. Scott Paris's important work on constrained and unconstrained skills.

56. So how did we do in responding to the challenges from Valencia & Pearson?

58. A note about readability. Many factors influence the difficulty of a text: the student's prior knowledge; the interest level of the student; the number and nature of new ideas or concepts; the complexity of the text structure; the student's prior knowledge of the text structure; the length and layout of the text; illustrations; familiarity of the vocabulary used; length of sentences; density of ideas; uncommon technical terms; use of idioms; metaphoric language; print size; purpose for reading.

59. For over fifty years, experts have been attempting to measure readability through more than a hundred methods, with children consistently proving most of the formulas wrong. Most of the formulas should not be used on passages fewer than three hundred words in length or on texts graded below seven years of age.

60. Assigning a readability level to a text can be only an approximation.

61. Absolum, M. (2006). Clarity in the Classroom - Using Formative Assessment. Building Learning-Focused Relationships. Auckland: Hodder Education. Allington, R. L. (2006). What Really Matters for Struggling Readers. Designing Research-Based Programmes (2nd ed.). Boston: Pearson. Applegate, M., Quinn, K., & Applegate, A. (2002). Levels of thinking required by comprehension questions in informal reading inventories. The Reading Teacher, 56(2), 174-180. Cain, K., & Oakhill, J. (2006). Assessment matters: Issues in the measurement of reading comprehension. British Journal of Educational Psychology, 76, 697-708. Cairney, T. (1992). Beyond the question: An evaluation of alternative strategies for the assessment of reading comprehension. Paper presented at the American Educational Research Conference. Caldwell, J. (2008). Comprehension assessment: A classroom guide. New York: The Guilford Press. Carlisle, J. F. (1990). Diagnostic assessment of listening and reading comprehension. In H. L. Swanson & B. Keogh (Eds.), Learning Disabilities: Theoretical and Research Issues. New Jersey: Lawrence Erlbaum Associates. Carlisle, J. F. (1991). Planning an assessment of listening and reading. Topics in Language Disorders, 12(1), 17-31. Carroll, J. B. (1971). Learning from Verbal Discourse in Educational Media: A Review of the Literature. Final Report. Chase, N., & Hynd, C. (1987). Reader response: An alternative way to teach students to think about text. Journal of Reading, 30(6), 530-540. Chen, Q., & Blaiklock, K. E. (2007). Is the PROBE reading assessment an effective measure of reading comprehension? Teachers and Curriculum, 10, 15-19. Clarke, S. (1998). Assessment in the Primary Classroom. Strategies for Planning Assessment, Pupil Feedback and Target Setting. London: Hodder & Stoughton. Clarke, S. (2001). Unlocking Formative Assessment. Auckland: Hodder Education. Clay, M. M. (2002). An Observation Survey (2nd ed.). Auckland: Heinemann. Creswell, J. W. (2005). Educational Research - Planning, Conducting and Evaluating Quantitative and Qualitative Research. New Jersey: Pearson-Merrill Prentice Hall.

62. Cross, D. R. (1987). Assessment of reading comprehension - matching test purposes and test properties. Educational Psychologist, 22(3 & 4), 313-332. Davis, A. (2006). Characteristics of teacher expertise associated with raising the reading comprehension abilities of year 5-9 students. Auckland, Auckland. Davis, A. (2008). Teaching Reading Comprehension. Wellington: Learning Media. Duffy, G. G. (2009). Explaining reading: A resource for teaching concepts, skills, and strategies. New York: The Guilford Press. Dymock, S., & Nicholson, T. (1999). Reading Comprehension - What is it? How do you teach it? Wellington: NZCER. Elley, W. B. (1974). One hundred years of reading instruction. Paper presented at the Woman of Education Address. Farr, R. (1992). Putting it all together: Solving the reading assessment puzzle. The Reading Teacher, 46(1), 26-37. Fawson, P. C., Ludlow, B. C., Reutzel, D. R., Sudweeks, R., & Smith, J. A. (2006). Examining the reliability of running records: Attaining generalizable results. The Journal of Educational Research, 100(2), 113-126. Flippo, R. F., Holland, D., McCarthy, M., & Swinning, E. (2009). Asking the right questions: How to select an informal reading inventory. The Reading Teacher, 63(1), 79-83. Fountas, I. C., & Pinnell, G. S. (1996). Guided Reading: Good First Teaching for All Children. Portsmouth, NH: Heinemann. Gambrell, L. B., Morrow, L. M., & Pressley, M. (2007). Best Practices in Literacy Instruction (3rd ed.). New York: The Guilford Press. Giacobbe, E. (1996). Guided Reading - Good first teaching for all children. Portsmouth: Heinemann. Heinz, P. J. (2004). Towards enhanced second language reading comprehension assessment: Computerized versus manual scoring of written recall protocols. Reading in a Foreign Language, 16(2), 97-124. Henk, W. A. (1993). New directions in reading assessment. Reading and Writing Quarterly: Overcoming Learning Difficulties, 9, 103-120. Hill, J., Hawk, K., & Taylor. (2001). Professional Development: What makes it work? Paper presented at the NZARE Conference. Invernizzi, M., Landrum, T., Howell, J., & Warley, H. (2005). Toward the peaceful coexistence of test developers, policymakers & teachers in an era of accountability. The Reading Teacher, 58(7), 610-619. 26(3), 322-331.

63. Johnston, P. (1981). Implications of basic research for the assessment of reading comprehension. Illinois University. Johnston, P., & Pearson, P. D. (1982). Prior Knowledge, Connectivity & the Assessment of Reading Comprehension. Urbana: Illinois University. Johnston, P. H. (1983). Reading comprehension assessment. A cognitive basis. Newark, Delaware: International Reading Association. Julian, K. M. (1982). Measuring Reading Comprehension: An Assessment System Based Upon Psychological Theory. Loyola University of Chicago, New York. Kamil, M. L., Mosenthal, P. B., Pearson, P. D., & Barr, P. (Eds.). (2000). Handbook of Reading Research. Mahwah: Lawrence Erlbaum Associates. Leslie, L., & Caldwell, J. (2006). Qualitative Reading Inventory-4. Boston: Pearson. McDonald, L. (2004). Moving from reader response to critical reading: developing 10-11-year-olds' ability as analytical readers of literary texts. Literacy, 38(1), 17-25. Ministry of Education. (1997). Planning and Assessment in English. Wellington: Learning Media. Ministry of Education. (2000). Using Running Records - A Resource for New Zealand Classroom Teachers. Wellington: Learning Media. Ministry of Education. (2003). Effective Literacy Practice Years 1-4. Wellington: Learning Media. Ministry of Education. (2006). Effective Literacy Practice - Years 5-8. Wellington: Learning Media. Moore, D. W. (1983). A Case for Naturalistic Assessment of Reading Comprehension. Language Arts, 60(8), 957-969. Mutch, C. (2005). Doing Educational Research: A Practitioner's Guide to Getting Started. Wellington: New Zealand Council for Educational Research. Nilsson, N. L. (2008). A critical analysis of eight IRIs. The Reading Teacher, 526-536. Paris, S. G. (1991). Assessment in Remediation of Metacognitive Aspects of Children's Reading Comprehension. Topics in Language Disorders, 12(1), 32-50. Paris, S. G., & Carpenter, R. (2003). Centre for Improvement of Early Reading Achievement: FAQ about IRIs. The Reading Teacher, 56(6), 578-580. Paris, S. G., & Stahl, S. A. (2005). Children's reading comprehension and assessment. Lawrence Erlbaum.

64. Parker, M., & Hurry, J. (2007). Teachers' use of questioning and modelling comprehension skills in primary classrooms. Educational Review, 59(3), 299-314. Pearson, P. D., & Hamm, D. (2005). The Assessment of Reading Comprehension: A Review of Practices - Past, Present, & Future. In S. G. Paris & S. A. Stahl (Eds.), Children's Reading Comprehension and Assessment. Mahwah: Lawrence Erlbaum Associates. Pikulski, J., & Shanahan, T. (Eds.). (1974). Informal reading inventories - a critical analysis. Ross, J. A. (2004). Effects of running records assessment on early literacy achievement. The Journal of Educational Research, 97(4), 186-195. Rowell, E. (1976). Do elementary students read better orally or silently? The Reading Teacher, 29(4), 367-370. Shuy, T., McArdle, P., & Albro, E. (2006). Introduction to this special issue: Reading comprehension assessment. Scientific Studies of Reading, 10(3), 221-224. Sibanda, J. (2010). The nexus between direct reading instruction, reading theoretical perspectives, and pedagogical practices of University of Swaziland Bachelor of Education Students. RELC Journal, 41, 149-164. Smith, W. A., & Elley, W. B. (1994). Learning to Read in New Zealand. Auckland: Longman Paul. Snyder, L., Caccamise, D., & Wise, B. (2005). The assessment of reading comprehension - Considerations and cautions. Topics in Language Disorders, 25(1), 33-50. Spear-Swerling, L. (2004). Fourth Graders' Performance on a State-Mandated Assessment Involving Two Different Measures of Reading Comprehension. Reading Psychology, 25, 121-148. Spector, S. (2005). How reliable are informal reading inventories? Psychology in the Schools, 42(6), 593-605. Stone, C. A., Silliman, E. R., & Ehren, B. J. (Eds.). (2004). Handbook of Language and Literacy. New York: The Guilford Press. Sweet, A. (2005). Assessment of reading comprehension: The RAND Reading Study Group Vision. In S. G. Paris & S. A. Stahl (Eds.), Children's Reading Comprehension and Assessment. New Jersey: Lawrence Erlbaum Associates. Symes, I., & Timperley, H. (2003). Using achievement information to raise student achievement. SET: Research Information for Teachers, 1, 36-39. Teale, W. (2008). What counts? Literacy assessments in urban schools. The Reading Teacher, 62(4), 358-361. Timperley, H. (2003). Evidence-based leadership: the use of running records. NZ Journal of Educational Leadership, 18, 65-76. Timperley, H., & Parr, J. (2004). Using Evidence in Teaching Practice. Implications for Professional Learning. Auckland: Hodder Moa Beckett Publishers. Valencia, S., & Pearson, P. D. (1987). Reading assessment: A time for change. The Reading Teacher, 40, 726-733. Wagner, G. (1986). Interpreting Cloze Scores in the Assessment of Text Readability and Reading Comprehension: Directions. Wixson, K., & Peters, C. W. (1987). Implementing an interactive view of reading. Educational Psychologist, 22(3 & 4), 333-356. Wolf, D. F. (1993). Issues in reading comprehension assessment: Implications for the development of research instruments and classroom tests. Foreign Language Annals,
