Chapter 1 Assessment in Social and Educational Contexts (Salvia, Ysseldyke & Bolt, 2012) Dr. Julie Esparza Brown SPED 512: Diagnostic Assessment Winter 2013 Chapters 1, 11, 12, 13, and 14 are included in this presentation
AGENDA – Week 3 • Questions for the Good of the Group • Instruction and Lab Time: Continue WJ-III • Break • Group activity to process Chapters 1, 11, 12, 13, and 14 • PowerPoint overview of Chapters 1, 11, 12, 13, and 14
Individualized Support • Schools must provide support as a function of individual student need • To what extent is the current level of instruction working? • How much instruction is needed? • What kind of instruction is needed? • Are additional supports necessary?
Assessment Defined • Assessment is the process of collecting information (data) for the purpose of making decisions about students • e.g., what to teach, how to teach, and whether the student is eligible for special services
How Are Assessment Data Collected? • Assessment extends beyond testing and may include: • Record review • Observations • Tests • Professional judgments • Recollections
Why Care About Assessment? • A direct link exists between assessment and the decisions that we make. Sometimes these decisions are markedly important. • Thus, the procedures for gathering data are of interest to many people – and rightfully so. • Why might students, parents, and teachers care? • The general public? • Certification boards?
Common Themes Moving Forward • Not all tests are created equal • Differences in content, reliability, validity, and utility • Assessment practices are dynamic • Changes in the political, technological, and cultural landscape drive a continuous process of revision
Common Themes Moving Forward • The importance of assessment in education • Educators are faced with difficult decisions • Effective decision-making will require knowledge of effective assessment • Assessment can be intimidating, but significant improvements have happened and continue to happen • More confidence in the technical adequacy of instruments • Improvements in the utility and relevance of assessment practices • MTSS framework
Chapter 11 Assessment of Academic Achievement with Multiple-Skill Devices
Achievement Tests • Achievement Tests • Norm-referenced • Allow for comparisons between students • Criterion-referenced • Allow for comparisons between individual students and a skill benchmark. • Why do we use achievement tests? • Assist teachers in determining skills students do and do not have • Inform instruction • Academic screening • Progress evaluation
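The two interpretive frames can be illustrated with a small sketch (the peer scores and the mastery benchmark below are hypothetical, not taken from any real test):

```python
# Hypothetical illustration of norm- vs. criterion-referenced interpretation.
# All scores and the benchmark are made-up values, not from a real instrument.

def percentile_rank(score, peer_scores):
    """Norm-referenced: the percent of peers scoring at or below this score."""
    at_or_below = sum(1 for s in peer_scores if s <= score)
    return 100.0 * at_or_below / len(peer_scores)

def meets_benchmark(score, criterion):
    """Criterion-referenced: the student is compared to a skill benchmark."""
    return score >= criterion

peers = [12, 15, 18, 20, 22, 25, 27, 30, 33, 35]  # hypothetical peer raw scores
student = 25

print(percentile_rank(student, peers))         # 60.0 -- standing relative to peers
print(meets_benchmark(student, criterion=28))  # False -- standing relative to a benchmark
```

The same raw score can look adequate in one frame and inadequate in the other, which is why the purpose of the decision should drive the choice of test.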
Classifying Achievement Tests • Tests can be classified along two dimensions: their purpose (achievement vs. diagnostic) and the number of students who can be tested at once (high vs. low) • Achievement, many students at once: More efficient administration – comparisons between students can be made but very little power in determining strengths and weaknesses • Achievement, one student at a time: Less efficient administration – allows for more qualitative information about the student • Diagnostic, many students at once: Efficient administration – typically only quantitative data are available • Diagnostic, one student at a time: Less efficient administration – dense content and numerous items allow teachers to uncover specific strengths and weaknesses
Considerations for Selecting a Test • Four Factors • Content validity • What the test actually measures should match its intended use • Stimulus-response modes • Students should not be hindered by the manner of test administration or required response • Standards used in state • Relevant norms • Does the student population being assessed match the population from which the normative data were acquired?
Tests of Academic Achievement • Peabody Individual Achievement Test (PIAT-R/NU) • Wide Range Achievement Test 4 (WRAT4) • Wechsler Individual Achievement Test 3 (WIAT-III)
Peabody Individual Achievement Test-Revised/Normative Update (PIAT-R/NU) • In general… • Individually administered; norm-referenced for K-12 students • Norm population • Most recent update was completed in 1998 • Representative of each grade level • No changes to test structure
PIAT-R/NU • Scores • For all but one subtest (written expression), response to each item is pass/fail • Raw scores converted into: • Standard scores • Percentile ranks • Normal curve equivalents • Stanines • 3 composite scores • Total reading • Total test • Written language
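Real instruments derive these scores from empirical norm tables, but under a normality assumption they are all transformations of the same z-score (distance from the norm-group mean, in standard-deviation units). A minimal sketch using the standard scale definitions (standard score: mean 100, SD 15; NCE: mean 50, SD 21.06; stanine: mean 5, SD 2, clamped to 1–9) and a hypothetical norm mean and SD, not actual PIAT-R/NU norms:

```python
import math

def derived_scores(raw, norm_mean, norm_sd):
    """Convert a raw score to common derived scores via its z-score.
    norm_mean and norm_sd are hypothetical norm-group values."""
    z = (raw - norm_mean) / norm_sd
    standard = 100 + 15 * z                             # deviation-style standard score
    percentile = 50 * (1 + math.erf(z / math.sqrt(2)))  # normal CDF scaled to 0-100
    nce = 50 + 21.06 * z                                # normal curve equivalent
    stanine = min(9, max(1, round(2 * z + 5)))          # nine-band scale, 1..9
    return standard, percentile, nce, stanine

# A raw score one SD above a hypothetical norm mean of 50 (SD 10):
print(derived_scores(raw=60, norm_mean=50, norm_sd=10))
```

Because each scale is a rescaling of z, the scores carry the same information at different granularities; stanines, for example, deliberately coarsen performance into nine bands.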
PIAT-R/NU • Reliability and Validity • Despite new norms, reliability and validity data are only available for the original PIAT-R (1989) • Previous reliability and validity data are likely outdated • Outdated tests may not be relevant in the current educational context
Wide Range Achievement Test 4 (WRAT4) • In general… • Individually administered • 15-45 minute test length depending on age (5-94 age range) • Norm-referenced, but covers a limited sample of behaviors in 4 content areas • Norm population • Stratified across age, gender, ethnicity, geographic region, and parental education
WRAT4 • Scores • Raw scores converted to: • Standard scores, confidence intervals, percentiles, grade equivalents, and stanines • Reading composite available • Reliability • Internal consistency and alternate-form data are sufficient for screening purposes • Validity • Performance increases with age • WRAT4 is linked to other tests that have since been updated; additional evidence is necessary
Wechsler Individual Achievement Test- Third Edition (WIAT-III) • General • Diagnostic, norm-referenced achievement test • Reading, mathematics, written expression, listening, and speaking • Ages 4-19 • Norm Population • Stratified sampling was used to sample within several common demographic variables: • Pre K – 12, age, race/ethnicity, sex, parent education, geographic region
WIAT-III • Subtests and scores • 16 subtests arranged into 7 domain composite scores and one total achievement score (structure provided on next slide) • Raw scores converted to: • Standard scores, percentile ranks, normal curve equivalents, stanines, age and grade equivalents, and growth scale value scores.
WIAT-III • Reliability • Adequate reliability evidence • Split-half • Test-retest • Interrater agreement • Validity • Adequate validity evidence • Content • Construct • Criterion • Clinical Utility • Stronger reliability and validity evidence increase the relevance of information derived from the WIAT-III
Getting the Most Out of an Achievement Test • Helpful but not sufficient – most tests allow teachers to find an appropriate starting point • What is the nature of the behaviors being sampled by the test? • Need to seek out additional information concerning student strengths and weaknesses • Which items did the student excel on? Which did he or she struggle with? • Were there patterns of responding?
Chapter 12 Using Diagnostic Reading Tests
Why Do We Assess Reading? • Reading is fundamental to success in our society, and therefore reading skill development should be closely monitored • Diagnostic tests can help to plan appropriate intervention • Diagnostic tests can help determine a student’s continuing need for special services
The Ways in Which Reading is Taught • The effectiveness of different approaches is heavily debated • Whole-word vs. code-based approaches • Over time, research has supported the importance of phonemic awareness and phonics
Skills Assessed by Diagnostic Approaches • Oral Reading • Rate of Reading • Oral Reading Errors • Teacher pronunciation/aid • Hesitation • Gross mispronunciation • Partial mispronunciation • Omission of a word • Insertion • Substitution • Repetition • Inversion
Skills Assessed by Diagnostic Approaches (cont.) • Reading Comprehension • Literal comprehension • Inferential comprehension • Critical comprehension • Affective comprehension • Lexical comprehension
Skills Assessed by Diagnostic Approaches (cont.) • Word-Attack Skills (i.e., word analysis skills) – use of letter-sound correspondence and sound blending to identify words • Word Recognition Skills – “sight vocabulary”
Diagnostic Reading Tests • See Table 12.1 • Group Reading Assessment and Diagnostic Evaluation (GRADE) • DIBELS Next • Test of Phonological Awareness – Second Edition: Plus (TOPA 2+)
GRADE (Williams, 2001) • Preschool through 12th grade • 60 to 90 minutes • Assesses pre-reading, reading readiness, vocabulary, comprehension, and oral language • Norm group is missing some important demographic information; high total reliabilities (lower subscale reliabilities); adequate information to support validity of the total score
DIBELS Next (Good and Kaminski, 2010) • Kindergarten-6th grade • Very brief administration (used for screening and monitoring) • First Sound Fluency, Letter Naming Fluency, Phoneme Segmentation Fluency, Nonsense Word Fluency, Oral Reading Fluency, and DAZE (comprehension) • Use of benchmark expectations or development of local norms • Multiple administrations necessary for making important decisions
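Developing local norms amounts to computing percentile cutoffs from a local (school or district) sample. A minimal sketch; the fluency scores and the 20th/40th-percentile cut points are illustrative choices, not DIBELS Next benchmarks:

```python
# Sketch of local norm development from a hypothetical district sample of
# oral reading fluency scores (words correct per minute). The percentile
# cut points below are illustrative, not published DIBELS Next benchmarks.

def percentile_cutoff(scores, pct):
    """Score at the given percentile, via linear interpolation between ranks."""
    ordered = sorted(scores)
    rank = pct / 100 * (len(ordered) - 1)
    lo, frac = int(rank), rank - int(rank)
    if lo + 1 < len(ordered):
        return ordered[lo] + frac * (ordered[lo + 1] - ordered[lo])
    return ordered[lo]

district_wcpm = [18, 25, 31, 40, 44, 52, 58, 63, 70, 85]  # hypothetical sample
at_risk_cutoff = percentile_cutoff(district_wcpm, 20)      # e.g., intensive support
some_risk_cutoff = percentile_cutoff(district_wcpm, 40)    # e.g., strategic support

print(at_risk_cutoff, some_risk_cutoff)
```

In practice a district would compute cutoffs per grade and season from a much larger sample, which is one reason multiple administrations are needed before making important decisions.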
TOPA 2+ (Torgesen & Bryant, 2004) • Ages 5 to 8 • Phonemic awareness and letter-sound correspondence • Good norms description • Reliability better for kindergarteners than for more advanced students • Adequate overall validity
Chapter 13 Using Diagnostic Mathematics Measures
Why Do We Assess Mathematics? • Multiple-skill assessments provide broad levels of information, but lack specificity when compared to diagnostic assessments • More intensive assessment of mathematics helps educators: • Assess the extent to which current instruction is working • Plan individualized instruction • Make informed eligibility decisions
Ways to Teach Mathematics • Before 1960: Emphasis on basic facts and algorithms, deductive reasoning, and proofs • 1960s: New Math – movement away from traditional approaches to mathematics instruction • 1980s: Constructivist approach – standards-based math; students construct knowledge with little or no help from teachers • After 2000: Evidence supports explicit and systematic instruction (most similar to “traditional” approaches to instruction)
Behaviors Sampled by Diagnostic Mathematics Tests • National Council of Teachers of Mathematics (NCTM) • Content Standards • Number and operations • Algebra • Geometry • Measurement • Data analysis and probability • Process Standards • Problem solving • Reasoning and proof • Communication • Connections • Representation
Specific Diagnostic Math Tests • Group Mathematics Assessment and Diagnostic Evaluation (G●MADE) • KeyMath-3 Diagnostic Assessment (KeyMath-3 DA)
G●MADE • General • Group administered, norm-referenced, standards-based test • Used to identify specific math skill strengths and weaknesses • Students K-12 • 9 levels of difficulty teachers may select from
G●MADE • Subtests • Concepts and communication • Language, vocabulary, and representations of math • Operations and computation • Addition, subtraction, multiplication, and division • Process and applications • Applying appropriate operations and computations to solve word problems
G●MADE • Scores • Raw scores converted to: • Standard scores, grade scores, stanines, percentiles, normal curve equivalents, and growth scale values • Norm population • 2002 and 2003; nearly 28,000 students • Selected based on geographic region, community type, socioeconomic status, and students with disabilities
G●MADE • Reliability • Acceptable levels of split-half and alternate-form reliability • Validity • Based on NCTM standards (content validity) • Strong criterion-related evidence
KeyMath-3 Diagnostic Assessment (KeyMath-3 DA) • General • Comprehensive assessment of math skills and concepts • Untimed, individually administered, norm-referenced test; 30-40 minutes • 4 years 6 months through 21 years
KeyMath-3 DA Subtests • Numeration • Algebra • Geometry • Measurement • Data analysis and probability • Mental computation and estimation • Addition and subtraction • Multiplication and division • Foundations of problem solving • Applied problem solving
KeyMath-3 DA • Scores • Raw scores converted to: • Standard scores, scaled scores, percentile ranks, grade and age equivalents, and growth scale values • Composite scores • Operations, basic concepts, and application • Norm population • 3,630 individuals • Ages 4 years 6 months through 21 years – demographic distribution approximates data reported in the 2004 census
KeyMath-3 DA • Reliability • Internal consistency, alternate-form, and test-retest reliability • Adequate for screening and diagnostic purposes • Validity • Adequate content and criterion-related validity evidence for all composite scores
Chapter 14 Using Measures of Oral and Written Language