Learning from Interim Assessments: District Implementation to Classroom Practice

James H. McMillan and Lisa M. Abrams, Virginia Commonwealth University. MARCES Annual Conference, University of Maryland, College Park, October 20, 2011. (PowerPoint available at http://www.soe.vcu.edu/merc)

Flight Plan • Why we need interim assessment • What the research says about impact • Qualitative study summary • Quantitative study summary • Recommendations for practice

Need for Interim Assessments • Increased pressure to understand student achievement: • Are students making progress toward meeting the requirements of the state test? • Are students on track to pass the state test? • Are subgroups of students on track to meet AYP targets? • Greater information needs: • Measure of student progress relative to a set of specific content standards/skills • Identify content areas of strength/areas for improvement • Shape instructional decisions • Serve as an “early warning” system • Inform strategies to support the learning of individual students • Results that can be aggregated: student → classroom → grade/team level → school → district levels

Offer a Range of Instructional Uses (see Supovitz & Klein, 2003) • Planning: • Decide on content • Pace and instructional strategies or approaches (i.e., mastery orientation) • Delivery: • Targeted instruction: whole class or small groups depending on mastery of content/skills • Provide feedback and/or re-teaching selected content and/or skills • Selection and use of supplemental or additional resources • Remediation: • Identify low-performing students • Design plans for providing additional supports/assistance • Evaluation: • Monitor/track student progress • Examine effectiveness of interventions • Determine instructional effectiveness

What We Know About Interim Testing • Widespread use across districts in Virginia and nationally (Marsh, Pane & Hamilton, 2006). • Mixed views on usefulness of interim test results • Compared to own classroom assessments → less useful, provide redundant information. • Compared to state tests → more useful than state test results to “identify and correct gaps in their teaching.” • Factors that influence teachers’ views: quick turnaround of results, alignment with curriculum, capacity and support, instructional leadership, perceived validity, reporting, added value

Impact on Teachers • Informs instructional adjustments (Brunner et al., 2005; Marsh, Pane & Hamilton, 2006; Oláh, Lawrence & Riggan, 2010; Yeh, 2006) • Increased collaboration and problem solving (Lachat & Smith, 2005; Wayman & Cho, 2009; Yeh, 2006) • Enhanced self-efficacy, increased reflection (Brunner et al., 2005; Yeh, 2006) • Increased emphasis on testing and test preparation; primary influence of colleagues and standards on practice (Loeb, Knapp & Elfers, 2008) • Variability within schools: some teachers use the information, others do not; 80% of the variability in teacher survey responses was within rather than between schools (Marsh, Pane & Hamilton, 2006).
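The "within rather than between schools" finding can be illustrated with a simple sum-of-squares decomposition. This is only a sketch: the slides do not say how Marsh, Pane & Hamilton (2006) estimated the 80% figure, the function name `within_group_share` is hypothetical, and the toy responses are invented for illustration, not study data.

```python
from statistics import mean


def within_group_share(responses_by_school):
    """Share of total response variability that lies within schools.

    responses_by_school maps a school label to a list of numeric survey
    responses. Uses a one-way sum-of-squares split: total = between + within.
    """
    values = [v for scores in responses_by_school.values() for v in scores]
    grand_mean = mean(values)
    ss_total = sum((v - grand_mean) ** 2 for v in values)
    if ss_total == 0:
        raise ValueError("responses do not vary, so the share is undefined")
    ss_within = 0.0
    for scores in responses_by_school.values():
        school_mean = mean(scores)
        ss_within += sum((v - school_mean) ** 2 for v in scores)
    return ss_within / ss_total


# Invented toy data (NOT from the study): two schools, two teachers each.
toy = {"School A": [1, 4], "School B": [2, 5]}
share = within_group_share(toy)
print(f"Within-school share of variability: {share:.0%}")
```

In the toy data the school means are close (2.5 and 3.5) while teachers inside each school differ widely, so most of the variability sits within schools. That is the pattern the bullet describes.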

Impact on Students • Achievement: although limited, research suggests impact may be mixed • Targeted instruction led to improvements in student test scores (Lachat & Smith, 2005; Nelson & Eddy, 2008; Trimble, Gay & Matthews, 2005; Yeh, 2006) and proficiency in reading and mathematics (Peterson, 2007). • Large-scale studies have failed to find significant differences in student achievement between treatment and comparison schools (Henderson, Petrosino & Guckenburg, 2008; May & Robinson, 2007; Quint, Sepanik & Smith, 2008). • Increased engagement and motivation (Yeh, 2006) • Increased access to learning opportunities: tutoring and remedial services (Marsh, Pane & Hamilton, 2006) • Targeted instruction toward the “bubble kids”.

MERC Research on Interim Assessments • Qualitative study • Explored the extent to which teachers used interim test results to support learning. • Quantitative study • Designed to examine teachers’ self-reports about using interim test results and the influence of results on instruction. • What conditions are necessary to promote use of test results? • How do teachers analyze and use interim test results to inform instruction? To inform decisions about students? • What most influences teachers’ use of test results?

Qualitative Study: Research Design and Methods • Qualitative double-layer category focus-group design (Krueger & Casey, 2009) • Layers: school type & district (N=6) • Protocol • The general nature of interim testing policies and the type of data teachers receive • Expectations for using interim test results • Instructional uses of interim test results • General views on interim testing policies, practices and procedures • Focus group sessions

Participants • Selection: two-stage convenience sampling process • District → School Principal → Teachers • Data Collection: • Spring 2009/Fall 2010; 15 focus groups w/67 core-content area teachers • Demographic Profile: • The majority were: white (82%), female (88%), taught at the elementary level (80%) • Average of 11.5 years of classroom experience (range of 1-34 yrs.) • 33% were beginning teachers with 1-3 years of teaching experience and 20% had been teaching for over 20 years. • 20% were middle school teachers in the areas of civics, science, mathematics and language/reading

Data Analysis • A transcript-based approach with a constant-comparative analytic framework was used to identify emergent patterns or trends (Krueger & Casey, 2009). • Analysis focused on the frequency and extensiveness of viewpoints or ideas • Codes created in 9 key areas and applied to the text • Examples: “alignment”, “test quality”, “individualized instruction”, “testing time” • High inter-coder agreement
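The slide reports high inter-coder agreement without naming the index used. As a sketch under that caveat, two common choices are simple percent agreement and Cohen's kappa, which corrects for agreement expected by chance. The function names and the coded segments below are hypothetical; only the code labels come from the slide.

```python
from collections import Counter


def percent_agreement(coder_a, coder_b):
    """Proportion of segments to which both coders assigned the same code."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("coders must rate the same non-empty set of segments")
    return sum(a == b for a, b in zip(coder_a, coder_b)) / len(coder_a)


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(coder_a)
    observed = percent_agreement(coder_a, coder_b)
    counts_a, counts_b = Counter(coder_a), Counter(coder_b)
    # Chance agreement from each coder's marginal code frequencies.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1:
        return 1.0  # both coders used a single identical code throughout
    return (observed - expected) / (1 - expected)


# Hypothetical codings of six transcript segments (NOT study data).
coder_a = ["alignment", "test quality", "testing time",
           "alignment", "individualized instruction", "testing time"]
coder_b = ["alignment", "test quality", "testing time",
           "test quality", "individualized instruction", "testing time"]

agreement = percent_agreement(coder_a, coder_b)
kappa = cohens_kappa(coder_a, coder_b)
print(f"agreement = {agreement:.2f}, kappa = {kappa:.2f}")
```

Kappa is lower than raw agreement here because some matches would occur by chance given how often each coder uses each code.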

Findings: District Policies and Expectations • Theme 1: Interim testing policies related to test construction and administration were similar among school divisions. Inconsistencies were evident across content areas and grade levels within districts. • Theme 2: There are clear and consistent district- and building-level expectations for teachers’ analysis and use of interim test results to make instructional adjustments in an effort to support student achievement. • “They are graded, but they are not part of their grade. So they will [interim test results] show up on their report card as a separate category just so parents know and the students know what the grade is, but it doesn’t have any effect on their class grade.” • “Our principal expects when you have a grade level meeting to be able to say, this is what I’m doing about these results, because it is an unwritten expectation but it is clearly passed on… by sitting down with them the first time they are giving the test and describing how you do data analysis and literally walking them through it and showing them patterns to look for.”

Findings: Access to Results and Analysis • Theme 3: Timely access to test results and use of a software program supported data analysis and reporting. • Theme 4: It was important for teachers to discuss results with others and have time with colleagues to discuss results. • “That if we are supposed to be using this information to guide instruction we need immediate feedback, like the day of, so we can plan to adjust instruction for the following day.” • “We have achievement team meetings where we look at every single teacher, every single class, everything, and look at the data really in depth to try to figure out what’s going on. What is the problem with this class? Why is this one doing better?”

Findings: Informing Instruction • Theme 5: Teachers analyze interim test results at the class and individual student level to inform review, re-teaching, and remediation or enrichment. • Theme 6: A variety of factors related to data quality and validity impact teachers’ use of interim test data. • “If I see a large number of my students missing in this area, I am going to try to re-teach it to the whole class using a different method. If it is only a couple of [students], I will pull them aside and instruct one-on-one.” • “We really need to focus on the tests being valid. It is hard to take it seriously when you don’t feel like it is valid. When you look at it and you see mistakes or passages you know your students aren’t going to be able to read because it is way above their reading level.” • “It makes a difference in my instruction. I mean, I think I’m able to help students more that are having difficulty based on it. I am able to hone in on exactly where the problem is. I don’t have to fish around.”

Findings: Testing Time vs. Learning Time • Theme 7: Teachers expressed significant concerns about the amount of instructional time that is devoted to testing and the implications for the quality of their instruction. • “I think it is definitely made us change the way we teach because you are looking for how can I teach this the most effectively and the fastest…that is the truth, you have got to hurry up and get through it [curriculum] so that you can get to the next thing so that they get everything [before the test]. I do feel like sometimes I don’t teach things as well as I used to because of the time constraints.” • “You are sacrificing learning time for testing time…we leave very little time to actually teaching.” • “These kids are losing four weeks out of the year of instructional time.”

Conclusions From Qualitative Study • In the main, consistent with other research. • Importance of conversations among teachers. • Relatively little emphasis on instructional correctives. • Alignment and high quality items are essential.

Quantitative Study: Research Design and Methods • Survey design • Conducted Spring 2010 • Administered online in 4 school districts • Target population: elementary (4th and 5th grades) and middle school teachers (core content areas) • 460 teachers responded; 390 w/usable responses • Response rates ranged from 25.4% to 85.1% across the districts [1] • Survey items adapted from the Urban Data Study survey, American Institutes for Research • Analyses • Frequency and measures of central tendency • Factor analysis and regression procedures • [1] Response rate reported for 3 of the 4 participating districts due to difference in recruitment procedures.
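The two counts on this slide give the share of returned surveys that were usable. The district response rates cannot be recomputed because the number of teachers invited is not given. The short sketch below does only the first calculation; the helper name `share` is hypothetical.

```python
def share(part, whole):
    """Return part / whole as a proportion, rejecting a non-positive denominator."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return part / whole


responded = 460  # teachers who responded (from the slide)
usable = 390     # responses usable for analysis (from the slide)

usable_share = share(usable, responded)
print(f"Usable share of responses: {usable_share:.1%}")  # prints 84.8%
```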

Demographic Information: Race and Gender Note: Total Sample Size N = 390. a. The data contain one missing value.

Demographic Information: Grade Level and Years of Experience Note: Total Sample Size N = 390

Demographic Information: Subjects and Grade Level Note: Total Sample Size N = 390. a. Responses to this item allowed for multiple selections.

Demographic Information: Degrees Note: Total Sample Size N = 390. a. Frequencies will not add up to N=390 due to multiple selections by participants

Interim Assessment Survey • Survey Topics: Policies and Procedures; Accessing Test Data; Analyzing Results; Instructional Uses; Attitudes; Demographics • Variables for Analysis: Six Conditions for Use; Instructional Adjustments; Authentic Strategies; Use of Scores; Traditional Strategies

Condition: Alignment Note: Agreement Scale: Strongly Disagree = 1; Disagree = 2; Agree = 3; Strongly Agree = 4. Reliability estimate for the scale: Cronbach’s α = .901 (n = 300).

Condition: Division Policy Note: Agreement Scale: Strongly Disagree = 1; Disagree = 2; Agree = 3; Strongly Agree = 4. Reliability estimate for the scale: Cronbach’s α = .864 (n = 267).

Condition: School Environment Note: Agreement Scale: Strongly Disagree = 1; Disagree = 2; Agree = 3; Strongly Agree = 4. Reliability estimate for the scale: Cronbach’s α = .856 (n = 283).

Condition: Time Spent Analyzing and Reviewing Interim Data Note: Frequency Scale: 0; <1 hour; 1-2 hours; 2-3 hours; more than 3 hours. Reliability estimate for the scale: Cronbach’s α = .718 (n = 358).

Condition: Frequency of Analysis and Review Note: Frequency Scale: Never = 1; 1-2 times a quarter = 2; 1-2 times a month = 3; 1-2 times a week = 4. Reliability estimate for the scale: Cronbach’s α = .815 (n = 200).

Condition: Teachers’ Interactions Note: Extent Scale: Not at all = 1; Slight Extent = 2; Moderate Extent = 3; Major Extent = 4. a. Reliability estimate for the scale: Cronbach’s α = .867 (n = 361).
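The condition scales report Cronbach's α as their reliability estimate. As a reminder of what that statistic summarizes, here is a minimal sketch of the standard formula, α = k/(k−1) · (1 − Σ item variances / variance of total scores). The function name and the item responses are hypothetical; the study's item-level data are not in the slides.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a scale.

    rows: one list per respondent, holding that respondent's score on each
    of the k items (for example, 1-4 agreement ratings).
    """
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    if any(len(r) != k for r in rows):
        raise ValueError("every respondent must answer every item")
    item_columns = list(zip(*rows))
    sum_item_variances = sum(variance(col) for col in item_columns)
    total_variance = variance([sum(r) for r in rows])
    if total_variance == 0:
        raise ValueError("total scores do not vary")
    return k / (k - 1) * (1 - sum_item_variances / total_variance)


# Invented ratings for five respondents on a three-item scale (NOT study data).
toy_rows = [
    [4, 3, 4],
    [3, 3, 3],
    [2, 2, 1],
    [1, 2, 2],
    [4, 4, 3],
]
alpha = cronbach_alpha(toy_rows)
print(f"Cronbach's alpha = {alpha:.3f}")
```

Values near 1 indicate that the items move together across respondents. Respondents with missing items would have to be dropped or imputed before calling the function, which may be one reason the reported n differs from scale to scale.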

Conditions: Some Additional Individual Items Note: Scale: Not at all = 1; Minor = 2; Moderate = 3; Major = 4.

Instructional Adjustments Scale Range = 1-4: 1 = no influence or change on instruction; 4 = major influence or change on instruction

Instructional Adjustments: Some Individual Items • 85% of teachers reported making some kind of change in instructional strategies • 67% of teachers reported some level of change in student expectations • 84% of teachers reported some level of influence in adjusting goals for student learning • 35% of teachers indicated that reviewing results with their principal or assistant principal was somewhat or very useful

Authentic Instructional Strategies Scale Range = 1-5: 1 = large decrease in use of strategy; 5 = large increase in use of strategy

Use of Scores Scale Range = 1-4: 1 = no use; 4 = extensive use

Traditional Instructional Strategies Scale Range = 1-5: 1 = large decrease in use of strategy; 5 = large increase in use of strategy

Bivariate Correlations Between Conditions and Use *correlations significant at .05; **correlations significant at .01.
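For readers who want to see what a bivariate correlation between a condition score and a use score involves, the sketch below computes Pearson's r and its t statistic with n − 2 degrees of freedom. It assumes the correlations on this slide are Pearson correlations, which the slide does not state. The function names and scores are hypothetical. Turning t into the .05/.01 flags requires a t distribution, which is left out to keep the sketch standard-library only.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two lists of equal length with at least 3 values")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a variable with no variation has no correlation")
    return sxy / sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    if abs(r) >= 1:
        raise ValueError("t is undefined for a perfect correlation")
    return r * sqrt((n - 2) / (1 - r * r))


# Invented scores for five teachers (NOT study data).
condition_score = [1, 2, 3, 4, 5]
use_score = [2, 1, 4, 3, 5]

r = pearson_r(condition_score, use_score)
t = t_statistic(r, len(condition_score))
print(f"r = {r:.2f}, t = {t:.2f}, df = {len(condition_score) - 2}")
```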

Regression: Conditions With Instructional Adjustments Note: Total Sample Size N = 390. a. The data contain one missing value.

Regression: Conditions With Use of Specific Scores Note: Total Sample Size N = 390. a. The data contain one missing value.
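The two regression slides relate the condition scales to an outcome scale. The slides do not give the model specification, so the following is only a sketch of ordinary least squares with several predictors, solved through the normal equations. The function name `ols_coefficients` is hypothetical and the data are invented so that the true coefficients are known.

```python
def ols_coefficients(predictors, outcome):
    """Ordinary least squares fit; returns [intercept, b1, b2, ...].

    predictors: one list of predictor values per respondent.
    outcome: the outcome value for each respondent.
    Solves the normal equations (X'X) b = X'y by Gauss-Jordan elimination.
    """
    rows = [[1.0] + [float(v) for v in row] for row in predictors]
    p = len(rows[0])
    # Augmented matrix [X'X | X'y].
    m = [
        [sum(r[i] * r[j] for r in rows) for j in range(p)]
        + [sum(r[i] * y for r, y in zip(rows, outcome))]
        for i in range(p)
    ]
    for col in range(p):
        # Partial pivoting keeps the elimination numerically stable.
        pivot = max(range(col, p), key=lambda k: abs(m[k][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        pivot_value = m[col][col]
        m[col] = [v / pivot_value for v in m[col]]
        for k in range(p):
            if k != col:
                factor = m[k][col]
                m[k] = [a - factor * b for a, b in zip(m[k], m[col])]
    return [m[i][p] for i in range(p)]


# Invented data (NOT study data), built as outcome = 1 + 0.5*x1 + 0.25*x2.
conditions = [[1, 2], [2, 1], [3, 4], [4, 3], [2, 2]]
adjustments = [2.0, 2.25, 3.5, 3.75, 2.5]

coefficients = ols_coefficients(conditions, adjustments)
print([round(b, 3) for b in coefficients])
```

Because the toy outcome is an exact linear function of the two predictors, the fit recovers the intercept and both slopes. Real survey data would also need standard errors and fit statistics, which this sketch omits.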

Conclusions From Quantitative Study • Interim testing may serve a meaningful formative purpose and affect instruction. • District policy and school leadership that foster an environment in which use of data is encouraged and supported, and time made available for teacher review and analysis of data (especially with other teachers), are positively related to teachers’ instructional adjustments and use of specific report scores. • Teachers report extensive use of interim test data across many different instructional adjustments. No single type of adjustment was used most often. • Only 37% of teachers agree or strongly agree that interim testing is of little use in instruction. • Elementary school teachers’ use of interim data is only slightly greater than middle school teachers’ use. • Greatest barriers to using interim data are lack of time for review and analysis of data and pacing guide pressures.

Recommendations for Effective Practice

Questions?