
Teachers and Student Achievement in the Chicago Public High Schools

Teachers and Student Achievement in the Chicago Public High Schools. Daniel Aaronson, Federal Reserve Bank of Chicago; Lisa Barrow, Federal Reserve Bank of Chicago; William Sander, DePaul University. Latest version, on my hard disk in Chicago.

Presentation Transcript


  1. Teachers and Student Achievement in the Chicago Public High Schools • Daniel Aaronson, Federal Reserve Bank of Chicago • Lisa Barrow, Federal Reserve Bank of Chicago • William Sander, DePaul University • Latest version, on my hard disk in Chicago

  2. What Are We Trying To Do? • Estimate the importance of teachers to educational achievement. • Why does the Fed care about this? • Productivity study of Teachers and (in the future) Students. • Test scores are an indicator of future student productivity. [Grogger and Eide (1995), Murnane et al (1995), Neal and Johnson (1996), Hanushek and Kimko (2000), Bishop (1992)]

  3. Lots of Policy Implications Along the Way • How to compensate teachers. • In most industries, productivity = compensation. • How to set up accountability standards. • How sensitive are the teacher rankings to specification issues? Does this matter for accountability systems? • Can the econometrician predict who the good teachers are? • Or more importantly, can the principal? • How to determine hiring, tenure policy. • Critical for thinking about other policy levers -- e.g. reducing class size -- with quality/quantity tradeoffs.

  4. New Literature on Teacher Effects • Original U.S. study--Coleman (1966). • 1990s – use of administrative records. Pioneered in Tennessee and Texas. • Advantages: • Micro data -- lots of cross-sectional variation. • Longitudinal – ability to minimize sorting behavior and other confounding factors by looking at fixed effect models, repeated measures and multiple cohorts. • Many students per teacher.

  5. What Do We Contribute? • Large urban, mostly minority, mostly poor school system. • Critical for policy. • Chicago particularly useful since it was doing so poorly (perhaps less so now). i.e. Secretary Bennett’s fondness for the Chicago schools.

  6. School Districts in the U.S., 2000-01

  7. What Do We Contribute? • Match students with teachers at the classroom level. • No aggregation issues. Level that plausibly corresponds to the intervention effects.

  8. What Do We Contribute? • High schools (most studies on elementary schools). • Can look at subject rather than general teachers. • Populations rather than samples. • Know everyone that is in every classroom (including non-math classes). • Decent info on teacher characteristics. • Covers major compensation factors. • Can isolate quality coming from observed and unobserved stuff. • Many of these features are available in other datasets but rarely together.

  9. Data • Administrative records from the Chicago Public High Schools for 1996-97 to 1999-2000 (3 years). • Only use 9th grade to this point but have all of HS. • Population: 27,000 to 29,000 9th grade students in each year. • Sample: 53,000 unique kids • See paper for discussion of sample selection issues

  10. Table 1 -- Student Descriptive Statistics

  11. Table 2 --More Student Test Score Statistics

  12. Sampling of other stuff available to us (Table 1)

  13. Table 4 -- Teacher Characteristics

  14. What Do We Do? • Step 1: Estimate teacher “quality”. • Step 2: Estimate the relationship between measured teacher quality and observable teacher characteristics.

  15. Estimating Teacher Quality • Simple strategy: a value-added model (include lagged dependent variable(s) on the RHS). This picks up cumulative inputs from prior years while allowing for a flexible autoregressive relationship in test scores.
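One way to write such a value-added specification is sketched below. The notation is assumed for illustration and is not taken from the slides: $Y_{it}$ is student $i$'s test score in year $t$, $T_{ijt}$ equals one when student $i$ is taught by teacher $j$ in year $t$, $\tau_j$ is the teacher effect, and $\varepsilon_{it}$ is the error term.

```latex
Y_{it} = \gamma\, Y_{i,t-1} + \sum_{j} \tau_j\, T_{ijt} + \varepsilon_{it}
```

Leaving $\gamma$ unrestricted is what makes the autoregressive relationship flexible; fixing $\gamma = 1$ would turn this into a model of test score gains.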

  16. Estimating Teacher Quality • Problem: the estimated teacher effect is biased (simple representation) by family and individual, school, time, and white-noise terms, each averaged over the teacher's students, where N_j is the number of students per teacher. • I.e. the teacher dummies may be confounded by time, school, individual and family (especially nonrandom sorting), and random fluctuations that should not be attributed to the teacher effect.
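A possible form of this simple representation, reconstructed from the labels on the slide rather than copied from the authors, is sketched below. The symbol names are assumptions: $\mu_i$ for individual and family effects, $\phi_{s(i)}$ for the effect of student $i$'s school, $\delta_{t(i)}$ for the time effect, and $e_{it}$ for white noise.

```latex
\hat{\tau}_j = \tau_j + \frac{1}{N_j} \sum_{i=1}^{N_j}
  \left( \mu_i + \phi_{s(i)} + \delta_{t(i)} + e_{it} \right)
```

Under this reading, only the white-noise average is expected to shrink as $N_j$ grows. The other three terms do not average out if students are sorted nonrandomly, which is why they have to be handled by controls or fixed effects.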

  17. Individual and Family Effects • Gains help here. • Also can control for lots of stuff (see paper). • For bias to remain, students would have to be sorting to certain teachers based on changes in unobserved characteristics. • Throw out transition schools where this is likely. • Clotfelter et al (2004). • How much within-school classroom sorting is there? Table 3 – mean variance by teacher of lagged test scores.

  18. School Effects • Sorting across schools is likely important. • I.e., school-level policies (e.g. curriculum), personnel (principal), latent family or neighborhood characteristics that might influence school choice. • Note: funding and most curriculum decisions are made centrally by the district and thus are not in play here. • School fixed effects – look at only within-school variation. • Don’t have to assign school effect to particular measures. • Variation used? Alternatives.
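As a rough illustration of what "only within-school variation" means, the sketch below subtracts each school's mean from its own observations. This is the standard within transformation behind a fixed-effects estimator. The function name, the record layout and the numbers are all hypothetical.

```python
from collections import defaultdict


def demean_within_school(records):
    """records: list of (school_id, value) pairs.

    Returns each value minus the mean of its own school, so that any
    school-wide level difference is removed.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for school, value in records:
        totals[school] += value
        counts[school] += 1
    return [value - totals[school] / counts[school] for school, value in records]


# Made-up test score gains for two schools with very different levels.
gains = [("A", 1.0), ("A", 3.0), ("B", 10.0), ("B", 14.0)]
within = demean_within_school(gains)
print(within)
```

After the transformation, school B's higher average no longer counts in favor of its teachers; only differences between classrooms inside the same school remain.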

  19. Sampling Variation • Kane and Staiger (2002) – big problem • Fixed effects in small samples can be severely problematic. Sampling variation can overwhelm signal. A few good (or bad) apples upset the cart. • Variability is strongly related to the number of observations that make up the teacher fixed effect. I.e. teachers with low numbers of students tend to be the highest and lowest performing in a literal interpretation of the fixed effect distribution. • Artificially inflates our FE dispersion.
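The point about small samples can be illustrated with a toy simulation in which every teacher has a true effect of exactly zero, so any dispersion is pure sampling noise. The class sizes, noise level and seed are arbitrary assumptions, not values from the Chicago data.

```python
import random
import statistics


def simulate_teacher_means(class_sizes, noise_sd=1.0, seed=0):
    """Return (n_students, mean_gain) per teacher when all true effects are 0."""
    rng = random.Random(seed)
    results = []
    for n in class_sizes:
        gains = [rng.gauss(0.0, noise_sd) for _ in range(n)]
        results.append((n, statistics.fmean(gains)))
    return results


# Half the teachers have 10 student observations, half have 200.
sizes = [10] * 200 + [200] * 200
estimates = simulate_teacher_means(sizes)

# Rank teachers by estimated effect and look at the top and bottom deciles.
ranked = sorted(estimates, key=lambda pair: pair[1])
decile = len(ranked) // 10
extremes = ranked[:decile] + ranked[-decile:]
share_small = sum(1 for n, _ in extremes if n == 10) / len(extremes)
print(f"share of small-N teachers among the extremes: {share_small:.2f}")
```

Although small-N teachers are only half of this simulated population, they dominate both tails of the estimated distribution, which is the pattern the slide warns about.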

  20. Figure 2 -- Teacher Effect Estimates versus the Number of Student-Semester Observations. Regression result: -0.00047 (0.00008). Disappears when the number of observations exceeds 200.

  21. Sampling Variation -- What Do We Do? • Trim outlying observations on test score gains. • Set minimum number (15, 50, 100) of student-semester observations for identification. • Adjust for the size of sampling error by assuming that the estimated teacher effect is the sum of the actual effect and noise. • Use the mean of the squared standard error estimates of the teacher effects as an estimate of sampling variance and subtract this from the observed variance. • If N_j is big enough (around 200), this problem essentially goes away. Practically, we can’t restrict to these guys though (misses interesting group).
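The variance adjustment described above can be sketched in a few lines. The function name and the numbers are made up for illustration, and using the population variance rather than the sample variance is an assumption.

```python
import statistics


def noise_adjusted_variance(estimates, standard_errors):
    """Observed variance of the estimates minus the mean squared standard error."""
    observed = statistics.pvariance(estimates)
    sampling = statistics.fmean([se ** 2 for se in standard_errors])
    return observed - sampling


# Hypothetical teacher effect estimates and their standard errors.
estimates = [0.30, -0.20, 0.10, -0.25, 0.05]
standard_errors = [0.10, 0.15, 0.05, 0.20, 0.10]

adjusted = noise_adjusted_variance(estimates, standard_errors)
print(f"observed variance: {statistics.pvariance(estimates):.3f}")
print(f"adjusted variance: {adjusted:.3f}")
```

The logic is that if each estimate equals the true effect plus independent noise, the observed variance is the sum of the true variance and the noise variance, so subtracting an estimate of the latter recovers the former.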

  22. Table 7 -- Distribution of Teacher Effects

  23. Is this Teacher Quality? • Transition matrices -- year-to-year movement in teacher quality. Measure of stability (permanent vs. transitory). • Reestimate production function but with time subscripts on the teacher effects. • First, separate into quartiles (reduce measurement error). • Should be masses on diagonals. • Pure noise would be equal shares in each cell. Easily reject random draw scenario though.
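A quartile transition matrix of this kind could be built as in the sketch below. The data are simulated (a persistent quality component plus independent yearly noise), and every name and parameter is an assumption for illustration.

```python
import random


def quartiles(values):
    """Label each value 0 (lowest quartile) to 3 (highest) by its rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    labels = [0] * n
    for rank, i in enumerate(order):
        labels[i] = rank * 4 // n
    return labels


def transition_matrix(effects_t, effects_t1):
    """Row k gives the shares of year-t quartile k landing in each year-t+1 quartile."""
    q_now, q_next = quartiles(effects_t), quartiles(effects_t1)
    counts = [[0] * 4 for _ in range(4)]
    for a, b in zip(q_now, q_next):
        counts[a][b] += 1
    return [[c / sum(row) for c in row] for row in counts]


rng = random.Random(0)
true_quality = [rng.gauss(0, 1) for _ in range(400)]     # persistent part
year_t = [q + rng.gauss(0, 0.7) for q in true_quality]   # plus yearly noise
year_t1 = [q + rng.gauss(0, 0.7) for q in true_quality]

matrix = transition_matrix(year_t, year_t1)
diagonal_share = sum(matrix[k][k] for k in range(4)) / 4
for row in matrix:
    print(" ".join(f"{p:.2f}" for p in row))
```

With pure noise every cell would be close to 0.25. A persistent component pushes mass onto the diagonal, which is the comparison the slide describes.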

  24. Table 8 -- Transition Matrices

  25. More on Year to Year Movement • Can do with continuous measure too • if pure noise, correlation in gains will be –0.5 • get about 0.5 for t-1, 0.3 for t-2. • However, intensify sampling variability by looking at year-to-year movements. Probably a no-no. • Similar to results in Kane and Staiger (2002) for NC schools.
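One reading of the -0.5 benchmark: suppose the measured effect in each year is pure noise $x_t$, independent across years with common variance $\sigma^2$, and define the year-to-year change as $g_t = x_t - x_{t-1}$. These symbols are assumptions for illustration.

```latex
\begin{aligned}
\operatorname{Cov}(g_{t+1}, g_t)
  &= \operatorname{Cov}(x_{t+1} - x_t,\; x_t - x_{t-1})
   = -\operatorname{Var}(x_t) = -\sigma^2, \\
\operatorname{Var}(g_t)
  &= \operatorname{Var}(x_t) + \operatorname{Var}(x_{t-1}) = 2\sigma^2, \\
\operatorname{Corr}(g_{t+1}, g_t)
  &= \frac{-\sigma^2}{2\sigma^2} = -0.5 .
\end{aligned}
```

The shared term $x_t$ enters the two changes with opposite signs, so consecutive changes in pure noise are negatively correlated. A positive correlation of about 0.5 is therefore far from the noise benchmark.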

  26. More on Year to Year Movement • Of those in top decile in year t: • 18% are there in year t+1 • (random = 10 %). Statistically significant. • 22% of those in 1997 are there in 1999. • Of those in the bottom decile in year t: • 12% are there in year t+1 • Turnover higher. To appear in the transition matrix, you must be in the records 2 years in a row. But those at the bottom are less likely to reappear. Random draw is no longer 10%. After adjusting, looks statistically significant. • 23% of those in 1997 are there in 1999.

  27. How consistent are the rankings across specifications? • Each regression produces a teacher quality score. • Q: Does the way the regression is specified matter to how a teacher is ranked? • If so, suggests potential concern about how accountability standards are set up.

  28. How consistent are the rankings across specifications? Correlation matrix of teacher FE from Specifications 1-5 in Table 7. Warning label: Preliminary. No adjustment for sampling variation by individual teacher. These are lower bounds. All specifications include year FE and lagged scores. 2 adds basic student demographics, 3 adds richer student covariates and peer measures, 4 adds school FE (but no peer and limited student stuff), 5 is kitchen sink.

  29. How consistent are the rankings across specifications? Predicting the bottom 10 percent of teachers. Share of teachers commonly ranked in the bottom 10 percent. Warning label: Preliminary. No adjustment for sampling variation by individual teacher. These are lower bounds. All specifications include year FE and lagged scores. 2 adds basic student demographics, 3 adds richer student covariates and peer measures, 4 adds school FE (but no peer and limited student stuff), 5 is kitchen sink.

  30. How consistent are the rankings across specifications? Predicting the top 10 percent of teachers. Share of teachers commonly ranked in the top 10 percent. Warning label: Preliminary. No adjustment for sampling variation by individual teacher. These are lower bounds. All specifications include year FE and lagged scores. 2 adds basic student demographics, 3 adds richer student covariates and peer measures, 4 adds school FE (but no peer and limited student stuff), 5 is kitchen sink.
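The "share commonly ranked" statistic on the last two slides could be computed roughly as follows. The function names and the toy teacher effects are hypothetical; the sketch only shows the mechanics of comparing decile membership across two specifications.

```python
def decile_members(effects, bottom=True):
    """Return the ids of teachers in the bottom (or top) 10 percent."""
    k = max(1, len(effects) // 10)
    ranked = sorted(effects, key=effects.get, reverse=not bottom)
    return set(ranked[:k])


def common_share(spec_a, spec_b, bottom=True):
    """Share of spec_a's decile that is also in spec_b's decile."""
    a = decile_members(spec_a, bottom)
    b = decile_members(spec_b, bottom)
    return len(a & b) / len(a)


# Toy teacher effects from two specifications: 20 teachers, and the
# second specification swaps the effects of teachers t1 and t5.
spec_a = {f"t{i}": float(i) for i in range(20)}
spec_b = dict(spec_a)
spec_b["t1"], spec_b["t5"] = spec_a["t5"], spec_a["t1"]

bottom_overlap = common_share(spec_a, spec_b, bottom=True)
top_overlap = common_share(spec_a, spec_b, bottom=False)
print(bottom_overlap, top_overlap)
```

A share near 1 means the two specifications flag nearly the same teachers. A share near 0.1 is what independent rankings would produce on average.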

  31. Robustness -- Cream Skimming • Discourage certain students from taking the test? Look at the correlation between the estimated teacher effect and the share of missing scores in teacher j’s classes: -0.044 (0.196). • No evidence. • Report scores of those who do well but are exempt? Correlation between the estimated teacher effect and the share of students excluded: 0.083 (0.015). • Hmmm. So we excluded anyone (12% of sample) who is exempt and reran the results. Did not change anything.

  32. More Robustness Checks (table 9)

  33. By Student “Initial Ability” (table 9)

  34. Table 10 -- Controlling for English Teachers

  35. Can we predict teacher quality from resume items? • Because of concerns raised by Moulton (1986) about the efficiency of OLS estimates of teacher attributes in the presence of a school-specific fixed effect and multiple teachers per student, we do not include the teacher characteristics directly in the production function. • Rather, use a GLS estimator that regresses teacher effects on teacher characteristics. • See paper for technical details.
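The slides leave the GLS details to the paper, so the sketch below swaps in a much simpler stand-in: inverse-variance weighted least squares with a single teacher characteristic. It is not the authors' estimator. The function name, the variables and the data are all invented for illustration.

```python
def weighted_least_squares(x, y, se):
    """Fit y = a + b * x, weighting each point by 1 / se**2.

    Returns (intercept, slope). Noisier teacher effect estimates
    (larger se) get less influence on the fit.
    """
    weights = [1.0 / s ** 2 for s in se]
    total = sum(weights)
    x_bar = sum(w * xi for w, xi in zip(weights, x)) / total
    y_bar = sum(w * yi for w, yi in zip(weights, y)) / total
    s_xy = sum(w * (xi - x_bar) * (yi - y_bar) for w, xi, yi in zip(weights, x, y))
    s_xx = sum(w * (xi - x_bar) ** 2 for w, xi in zip(weights, x))
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope


# Made-up data: years of experience, estimated effects, and standard errors.
experience = [1, 3, 5, 10, 20]
effects = [0.1 + 0.02 * e for e in experience]   # exactly linear by construction
std_errors = [0.20, 0.10, 0.10, 0.05, 0.05]

intercept, slope = weighted_least_squares(experience, effects, std_errors)
print(f"intercept={intercept:.3f} slope={slope:.3f}")
```

The two-step design matters because the first-step teacher effects are estimated with different precision. Weighting by precision is one simple way to account for that in the second step.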

  36. Can we predict teacher quality from resume items? • Vast majority of teacher quality is unexplained by observables (90%), even after correcting for sampling error. See Table 11. • Measures used for compensation purposes (tenure, advanced degrees, certification) probably explain 3 percent or less. • BA major matters. But most demographic (except female) and human capital traits don’t. • But at least principals can look at the autoregression!!

  37. Conclusions • Lots of issues related to using test scores to evaluate teachers. • Dispersion of teacher effects can be way off in naive regressions. • But consistency of teacher rankings is not too bad (more work to be done here), especially if school fixed effects are included. • Teachers matter, and to all groups of students. • Perhaps differentially (more to be done here). • Unobservable teacher characteristics seem to drive much of the dispersion in teacher quality. But the principal can observe productivity over time.
