The interaction plateau
Explore the concept of an interaction plateau in tutoring systems through a dialogue between a human tutor and student, comparing different types of tutoring models and the effectiveness of each. The dialogue delves into understanding a problem step-by-step and addressing misconceptions. Major differences between low-interaction tutoring and natural tutoring are highlighted, along with conditions affecting effectiveness. Experiments with different student levels and instructional methods are analyzed to determine the most impactful tutoring approach.
Presentation Transcript
The interaction plateau. CPI 494, April 9, 2009. Kurt VanLehn.
Schematic of a natural language tutoring system, AutoTutor:
Step start → T: Elicit → S: Correct → Step end
S: Incorrect → Remediation: T: Hint or prompt → elicit again
Only if out of hints: T: Tell
Schematic of other natural language tutors, e.g., Atlas, Circsim-Tutor, Kermit-SE:
Step start → T: Elicit → S: Correct → Step end
S: Incorrect → Remediation: T: What is…? S: I don't know. T: Well, what is…? S: … T: …
Only if out of hints: T: Tell
Often called a KCD: knowledge construction dialogue
Hypothesized ranking of tutoring, most effective first • Expert human tutors • Ordinary human tutors • Natural language tutoring systems • Step-based tutoring systems • Answer-based tutoring systems • No tutoring
Hypothesized effect sizes. Bloom's (1984) 2-sigma result: 4 weeks of human tutoring vs. classroom instruction. [Chart: effect size relative to the classroom baseline.]
Hypothesized effect sizes. Kulik's (1984) meta-analysis of CAI vs. classroom: 0.4 sigma. [Chart: effect size relative to the classroom baseline.]
Hypothesized effect sizes. Many intelligent tutoring systems, e.g., Andes (VanLehn et al., 2005) and Carnegie Learning's tutors. [Chart: effect size relative to the classroom baseline.]
A problem and its steps • Suppose you are running in a straight line at constant speed. You throw a pumpkin straight up. Where will it land? • Initially, you and the pumpkin have the same horizontal velocity. • Your throw exerts a net force vertically on the pumpkin. • Thus causing a vertical acceleration. • Which leaves the horizontal velocity unaffected. • So when the pumpkin falls, it has traveled the same distance horizontally as you have. • Thus, it lands in your hands.
A dialogue between a human tutor (T) and human student (S)
T: Suppose you are running in a straight line at constant speed. You throw a pumpkin straight up. Where will it land?
S: Behind me.
T: Hmm. Let's think about that. Before you toss the pumpkin and are just carrying it, do you and the pumpkin have the same speed?
S: Yes.
T: Good. When you toss it up, is the net force on it exactly vertical?
S: I'm not sure.
T: You exert a force on the pumpkin, right?
Etc.
Schematic of dialogue about a single step:
Step start → T: Elicit → S: Correct → Step end
S: Incorrect → Remediation: T: Hint, or prompt, or explain, or analogy, or … → elicit again
T: Tell → Step end
Comparisons of expert to novice human tutors:
Novice tutors tend to go straight to T: Tell; expert tutors elicit the step and remediate.
Step start → T: Elicit → S: Correct → Step end
S: Incorrect → Remediation: T: Hint, or prompt, or explain, or analogy, or … → elicit again
Experts may have a wider variety of remediation tactics.
Schematic of an ITS handling a single step:
Step start → S attempts the step → S: Correct → Step end
S: Incorrect → T: Hint → S tries again
Only if out of hints: T: Tell
Major differences • Low-interaction tutoring (e.g., CAI) • Remediation on answer only • Step-based interaction (e.g., ITS) • Remediation on each step • Hint sequence, with final “bottom out” hint • Natural tutoring (e.g., human tutoring) • Remediation on each step, substep, inference… • Natural language dialogues • Many tutorial tactics
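The step-based remediation loop described above (elicit a step, hint through an ordered sequence ending in a "bottom out" hint, and tell only when the hints run out) can be sketched as a small state machine. This is an illustrative sketch only; the `Step` class, `tutor_step` function, and hint texts are hypothetical and are not taken from AutoTutor, Atlas, or any actual ITS.

```python
# Sketch of a step-based remediation loop, assuming a hypothetical
# Step object; not code from any system discussed in the talk.

class Step:
    def __init__(self, prompt, answer, hints):
        self.prompt = prompt    # the tutor's elicitation ("T: Elicit")
        self.answer = answer    # the correct step
        self.hints = hints      # ordered, ending with the "bottom out" hint

    def is_correct(self, response):
        return response.strip().lower() == self.answer.lower()

def tutor_step(step, respond):
    """Run one step. `respond` maps a tutor utterance to a student reply."""
    reply = respond(step.prompt)              # Step start: T elicits
    hints = iter(step.hints)
    while not step.is_correct(reply):         # S: Incorrect -> remediate
        hint = next(hints, None)
        if hint is None:                      # only if out of hints:
            return ("told", step.answer)      # T: Tell
        reply = respond(hint)                 # T: Hint or prompt, elicit again
    return ("elicited", reply)                # Step end: student produced it
```

A simulated student who answers wrong once and then correctly exits through the elicit branch; one who exhausts every hint is simply told the answer, mirroring the "only if out of hints: T: Tell" arc in the schematics.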
Conditions (VanLehn, Graesser et al., 2007) • Natural tutoring • Expert human tutors • Typed • Spoken • Natural language dialogue computer tutors • Why2-AutoTutor (Graesser et al.) • Why2-Atlas (VanLehn et al.) • Step-based interaction • Canned text remediation • Low interaction • Textbook
Human tutors (a form of natural tutoring):
Step start → T: Elicit → S: Correct → Step end
S: Incorrect → T: Hint, or prompt, or explain, or analogy, or … → elicit again
T: Tell → Step end
Why2-Atlas (a form of natural tutoring):
Step start → T: Elicit → S: Correct → Step end
S: Incorrect → a knowledge construction dialogue → elicit again
T: Tell → Step end
Why2-AutoTutor (a form of natural tutoring):
Step start → T: Elicit → S: Correct → Step end
S: Incorrect → hint or prompt → elicit again
T: Tell → Step end
Canned-text remediation (a form of step-based interaction):
Step start → T: Elicit → S: Correct → Step end
S: Incorrect → canned text → elicit again
T: Tell → Step end
Experiment 1: Intermediate students & instruction. No reliable differences.
Experiment 2: AutoTutor > Textbook = Nothing. Reliably different.
Experiments 1 & 2 (VanLehn, Graesser et al., 2007). No significant differences.
Experiment 3: Intermediate students & instruction. Deeper assessments.
Experiment 3: Intermediate students & instruction. No reliable differences.
Experiment 4: Novice students & intermediate instruction. Relearning.
Experiment 4: Novice students & intermediate instruction. All differences reliable.
Experiment 5: Novice students & intermediate (but shorter) instruction. Relearning.
Experiment 5: Novice students & intermediate instruction. No reliable differences.
Experiment 5: Low-pretest students only. Aptitude-treatment interaction?
Experiment 5, low-pretest students only: Spoken human tutoring > canned text remediation.
Experiments 6 and 7: Novice students & novice instruction. Was the intermediate text over the novice students' heads?
Experiments 6 and 7: Novice students & novice instruction. No reliable differences.
Interpretation: If students can follow the reasoning only with the tutor's help (i.e., the content falls within their zone of proximal development, ZPD), predict Tutoring > Canned text remediation. If they can follow the reasoning without any help, predict Tutoring = Canned text remediation. [Figure: experiments 1 & 4, 3 & 5, and 6 & 7 arranged by content complexity and by high- vs. low-pretest intermediates and novices.]
Original research questions • Can natural language tutorial dialog add pedagogical value? • Yes, when students must study content that is too complex to be understood by reading alone • How feasible is a deep linguistic tutoring system? • We built it. It’s fast enough to use. • Can deep linguistic and dialog techniques add pedagogical value?
When content is too complex to learn by reading alone, is deep > shallow? Why2-Atlas is not clearly better than Why2-AutoTutor.
When to use deep vs. shallow? [Table: use both; use deep; use a locally smart FSA; use equivalent texts, depending on the condition.]
Results from all 7 experiments (VanLehn, Graesser et al., 2007) • Why2: Atlas = AutoTutor • Why2 > Textbook • No essays • Content differences • Human tutoring = Why2 = Canned text remediation • Except when novice students worked with instruction designed for intermediates, then Human tutoring > Canned text remediation
Other evidence for the interaction plateau (Evens & Michael, 2006): no significant differences.
Other evidence for the interaction plateau (Reif & Scott, 1999): no significant differences.
Other evidence for the interaction plateau (Chi, Roy & Hausmann, in press): no significant differences.
Still more studies where natural tutoring = step-based interaction • Human tutors • Human tutoring = human tutoring with only content-free prompting for step remediation (Chi et al., 2001) • Human tutoring = canned text during post-practice remediation (Katz et al., 2003) • Socratic human tutoring = didactic human tutoring (Rosé et al., 2001a) • Socratic human tutoring = didactic human tutoring (Johnson & Johnson, 1992) • Expert human tutoring = novice human tutoring (Chae, Kim & Glass, 2005) • Natural language tutoring systems • Andes-Atlas = Andes with canned text (Rosé et al., 2001b) • Kermit = Kermit with dialogue explanations (Weerasinghe & Mitrovic, 2006)
Hypothesis 1: Exactly how tutors remedy a step doesn't matter much.
Step start → T: Elicit → S: Correct → Step end
S: Incorrect → Remediation → elicit again; T: Tell → Step end
What's inside the remediation box doesn't matter much.
Main claim: There is an interaction plateau (Hypothesis 1).
Hypothesis 2: The step remediation loop cannot be eliminated.
Step start → T: Elicit → S: Correct / S: Incorrect → Step end
Jumping straight from Step start to T: Tell must be avoided.
Main claim: There is an interaction plateau (Hypothesis 2).
Conclusions • What does it take to make computer tutors as effective as human tutors? • Step-based interaction • Bloom's 2-sigma results may have been due to weak control conditions (classroom instruction) • Other evaluations have also used weak controls • When is natural language useful? • For steps themselves (vs. menus, algebra…) • NOT for feedback & hints (remediation) on steps
Future directions for tutoring systems research • Making step-based instruction ubiquitous • Authoring & customizing • Novel task domains • Increasing engagement
Final thought • Many people “just know” that more interaction produces more learning. • “It ain’t so much the things we don’t know that get us into trouble. It’s the things we know that just ain’t so.” • Josh Billings (a.k.a. Henry Wheeler Shaw)