Artifact Corrected Correlations between theoretical text complexity and empirical text complexity

PCRC – San Diego • February 7-10, 2013 • Jackson Stenner

A Story worth remembering! • Oasis 475 • Common Core • Sarah Kershaw (FSU) • Art Graesser (University of Memphis) and Danielle McNamara (Arizona) • Bob Calfee (UC Riverside)

Theoretical versus Empirical Text Complexity for 719 Articles*
• Mean Theoretical = 884.4L (356.2)
• Mean Empirical = 884.4L (355.0)
• Reliability = 0.997
• SEM = 12.8L
• r = 0.968
• r″ = 0.969
• R²″ = 0.938
• RMSE″ = 89.6L
* Inclusion criteria: 50 encounters and 1,000 items

A Brief Digression: Correlation corrected for measurement error • Works for effect sizes (e.g., Cohen's d) also!
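The correction this slide refers to can be sketched in a few lines of Python. This is a minimal illustration of the classical correction for attenuation, not code from the presentation: the function names and the argument names `rel_x` and `rel_y` are assumptions chosen here. The example uses the observed r = 0.968 and the reliability of 0.997 reported for the 719 articles, and treats the theoretical measure as error free (reliability 1.0), which is also an assumption of this sketch.

```python
import math


def disattenuate_r(r_xy, rel_x=1.0, rel_y=1.0):
    """Classical correction for attenuation: divide the observed
    correlation by the square root of the product of the two
    reliabilities."""
    return r_xy / math.sqrt(rel_x * rel_y)


def disattenuate_d(d, rel_y=1.0):
    """Same idea for a standardized mean difference (Cohen's d):
    divide by the square root of the reliability of the
    dependent variable."""
    return d / math.sqrt(rel_y)


# Observed r and empirical reliability as reported for the 719 articles;
# rel_x = 1.0 (error-free theoretical measure) is an assumption.
r_corrected = disattenuate_r(0.968, rel_x=1.0, rel_y=0.997)
print(round(r_corrected, 3))
```

With these inputs the corrected value rounds to the 0.969 reported next to r = 0.968 in the summary statistics for the 719 articles.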

Two ends of the stick

Hunter and Schmidt (2004) identified six more artifacts relevant to our problem

Artifacts
• Error of measurement (a1) in the dependent variable: Study validity will be systematically lower than true validity to the extent that empirical text complexity is measured with random error. Measurement error is ubiquitous (a1 = √0.997 = .9985).
• Error of measurement (a2) in the independent variable: Theory validity will systematically understate the true validity of the attribute measured because the theory is imperfectly represented (a2 = 1.0*).
• Range variation (a3) in the independent variable: Study validity will be systematically lower than true validity to the extent that the range of theoretical text complexities in the study is smaller than the population range.

Artifacts cont’d.
• Range variation (a4) in the dependent variable: Study validity will be systematically lower than true validity to the extent that the range of empirical text complexities is smaller than the population range. Correction for double range restriction, a3 and a4 combined: (a3 + a4) = .9853.
• Deviation from perfect construct validity (a6) in the dependent variable: The task type used to measure empirical text complexity may have some specificity not shared by alternative task types, resulting in slightly different text orderings depending on which task type is used (a6 = 1.0*).
• Deviation from perfect construct validity (a5) in the independent variable: Construct mis-specification due to less than perfect operationalization of the constructs in the specification equation (a5 = √.9833 = .9916).

Artifacts cont’d.
• Linear bias (a7) is quite small for large-sample studies: Bias = (2N-2)/(2N-1) = .9993.
• Transcription and reporting error (a8 = 1.0*).

A = a1 a2 (a3 + a4) a5 a6 a7 a8 = (.9985) (1) (.9853) (.9916) (1) (.9993) (1) = .9748
Corrected correlation = Observed correlation / A = .96762 / .9748 = .9926
95% confidence interval for the observed correlation (se = .00475): .96287 ≤ ρ ≤ .97237
95% confidence interval for the corrected correlation: .988 ≤ ρ ≤ .998
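The arithmetic on this slide can be reproduced with a short Python sketch. It is an illustration under stated assumptions rather than the author's own code: the dictionary keys and variable names are made up here, the artifact values are the rounded figures from the slides, and dividing the interval endpoints by A is assumed to be how the corrected interval was obtained. Because the inputs are rounded, the computed product can differ from the slide's A in the fourth decimal place.

```python
import math

N = 719  # number of articles in the study

# Attenuation factors as given (rounded) on the slides.
# The key names are illustrative assumptions.
artifacts = {
    "a1_error_dependent": 0.9985,                  # sqrt(0.997)
    "a2_error_independent": 1.0,
    "a3_a4_double_range_restriction": 0.9853,
    "a5_construct_validity_independent": 0.9916,   # sqrt(0.9833)
    "a6_construct_validity_dependent": 1.0,
    "a7_linear_bias": round((2 * N - 2) / (2 * N - 1), 4),
    "a8_transcription_error": 1.0,
}

# Compound attenuation factor: the product of the individual factors.
A = math.prod(artifacts.values())

observed_r = 0.96762
corrected_r = observed_r / A

# Assumption: the corrected interval is the observed interval
# with each endpoint divided by A.
observed_ci = (0.96287, 0.97237)
corrected_ci = tuple(bound / A for bound in observed_ci)

print(f"A = {A:.4f}")
print(f"corrected r = {corrected_r:.4f}")
print(f"corrected CI = ({corrected_ci[0]:.3f}, {corrected_ci[1]:.3f})")
```

Keeping the factors in a dictionary makes it easy to set one artifact to 1.0 and see how much of the total attenuation it accounts for.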

Conclusion
The fact that correlations between theoretical estimates and empirical estimates are influenced by a dozen or more artifactual sources of variation poses threats to inference whenever raw correlations are interpreted. These problems are particularly troublesome when the population correlation is in fact 1.0. Because of attenuation due to artifacts, the raw (uncorrected) correlation, say r = .90, invites speculation about what moderator variables may have been omitted from the study. And so begins a time-consuming, potentially expensive search for moderators that don’t exist. When researchers fail, as they must, to produce moderator variables that account for the unexplained variance, it is concluded that better theories or better operationalizations of key constructs are needed. When these repair strategies also fail, the research is dead-ended and assigned to the dustbin of science. We conjecture that social and human science literatures are littered with correlational studies that conform to the above depressing scenario: “Failure to correct for these artifacts results in massive underestimation of the mean correlation. Failure to control for variation in these artifacts results in a large overstatement of the variance of population correlations and, thus, in a potential false assertion of moderator variables where there are none” (Hunter and Schmidt, 2004, p. 132).

Contact Info: A. Jackson Stenner • Chairman & CEO, MetaMetrics • University of North Carolina, Chapel Hill • jstenner@Lexile.com
