Dependability of College Student Ratings of Teaching and Learning Quality • Rajat Chadha, Ph.D., Australian Council for Educational Research • Theodore W. Frick, Ph.D., School of Education, Indiana University • AERA Annual Conference, New Orleans, April 9, 2011
Background • Problem: Beyond global ratings of quality, few items on course evaluation instruments are predictive of student learning achievement in higher education (Cohen, 1981; Kulik, 2001). • If better items or scales can be developed which are reliable, valid, and more predictive of student learning, then use of course evaluations may better address accountability concerns in higher education.
Background (cont’d) • Frick, Chadha, Watson, et al. have developed the Teaching and Learning Quality (TALQ) Scales to address this issue. • To date, 3 empirical studies have been conducted on TALQ. • Study 1: 140 students in 89 courses at multiple institutions. • Study 2: 193 students in 111 courses at multiple institutions. • Study 3: 464 students in 12 courses at one institution.
Background (cont’d) • Results from these 3 studies of TALQ revealed similar patterns of correlations among scales. • In study #3, the major finding: • If students agreed that they experienced Academic Learning Time (ALT) and they agreed that their instructors used First Principles of Instruction, • then they were 5 times more likely to be high masters of course objectives according to independent instructor ratings.
Nine a priori TALQ Scales are based on extant theory and empirical research • Academic Learning Time scale • Learning progress scale • Student satisfaction scale • Global course and instructor quality items • Authentic problems scale (Principle 1, First Principles of Instruction) • Activation scale (Principle 2, First Principles of Instruction) • Demonstration scale (Principle 3, First Principles of Instruction) • Application scale (Principle 4, First Principles of Instruction) • Integration scale (Principle 5, First Principles of Instruction)
Purpose of present study • Since the TALQ appears to have promise for predicting student academic learning time (ALT), • And ALT is predictive of student learning achievement: • We next wanted to study in depth the dependability of TALQ scales in order to possibly: • Shorten the instrument • Improve problematic items
TALQ Course Evaluation Instrument • 40 items scrambled into a random order. • Students were given no information about the scales. • 6 faculty members reviewed TALQ and suggested changing “real world problems” to “authentic problems”. An explanation of authentic problems was added: “Note: In the items below, authentic problems or authentic tasks are meaningful learning activities that are clearly relevant to you at this time, and which may be useful to you in the future (e.g., in your chosen profession or field of work, in your life, etc.).”
Participants • 8 volunteer professors teaching 12 courses from diverse subject areas: business; philosophy; history; kinesiology; social work; computer science; nursing; and health, physical education and recreation. • Administered during the 13th to 15th week of the fall semester. • 464 students: • 52 freshmen • 104 sophomores • 115 juniors • 185 seniors • Class participation rates ranged from 49% to 100%.
Dependability (Reliability) of Student Ratings [Figure: partitioning of total variance under the two theories] • Classical Test Theory: Total Variance = True Score Variance + Error Source Variance. Only for relative decisions. • Generalizability Theory: Total Variance = True Score Variance (Object of Measurement) + Error Source #1 Variance (Facet #1) + Error Source #2 Variance (Facet #2) + ... + Error Source #n Variance (Facet #3). For both relative and absolute decisions.
Generalizability Theory • A measurement is a sample from the universe of admissible observations. • Similar conditions are grouped to form a facet. • Universe of generalization: universe of conditions to which a decision maker wishes to generalize. • Universe score: mean of scores over the universe of generalization. • Dependability: accuracy of generalizing from observed score to universe score. • Index of Dependability (𝜙 coefficient) is analogous to classical test theory reliability coefficient. • Estimation of number of conditions in each facet to yield dependable scores.
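The last two bullets can be illustrated with a minimal sketch. It assumes the simplest crossed one-facet design (object of measurement p crossed with a single facet i, for example items), which is not necessarily the design used in this study. The function names and the variance components in the usage lines are hypothetical placeholders, not results from the TALQ data.

```python
def dependability(var_p, var_i, var_pi_e, n_i):
    """Return (relative G coefficient, phi coefficient) for a crossed p x i design.

    var_p    : universe-score variance (object of measurement)
    var_i    : variance due to the facet's main effect
    var_pi_e : interaction variance confounded with residual error
    n_i      : number of facet conditions averaged over
    All inputs are assumed to be already-estimated variance components.
    """
    rel_error = var_pi_e / n_i            # error for relative decisions
    abs_error = (var_i + var_pi_e) / n_i  # error for absolute decisions
    g_coef = var_p / (var_p + rel_error)
    phi = var_p / (var_p + abs_error)
    return g_coef, phi


def min_conditions(var_p, var_i, var_pi_e, target_phi, max_n=1000):
    """Smallest number of facet conditions whose phi reaches target_phi.

    Returns None if the target is not reached within max_n conditions.
    """
    for n in range(1, max_n + 1):
        _, phi = dependability(var_p, var_i, var_pi_e, n)
        if phi >= target_phi:
            return n
    return None


# Hypothetical variance components, chosen only to show the calculation.
g_coef, phi = dependability(var_p=0.5, var_i=0.1, var_pi_e=0.4, n_i=5)
needed = min_conditions(var_p=0.5, var_i=0.1, var_pi_e=0.4, target_phi=0.85)
print(round(g_coef, 3), round(phi, 3), needed)
```

Because the absolute error term also includes the facet's main-effect variance, 𝜙 can never exceed the relative coefficient for the same design. Increasing the number of conditions shrinks both error terms, which is why a calculation of this kind can inform decisions about shortening an instrument.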
Implications • First Principles of Instruction • synthesized from instructional theories and models in the literature • empirically related to Academic Learning Time • Academic Learning Time is related empirically to student learning achievement. ALT is under the control of students, and thus instructors should not be held accountable for it. • However, instructors could be held accountable for using First Principles of Instruction (Frick, et al., 2010): • If students agreed that First Principles occurred, they were 3 times more likely to agree that they experienced ALT; • If students agreed that both First Principles and ALT occurred, they were 5 times more likely to be independently rated as HIGH masters of course objectives by their instructors; • If students did NOT agree that both First Principles and ALT occurred, they were 26 times more likely to be independently rated as LOW masters of course objectives by their instructors.