The interaction plateau
Explore the concept of an interaction plateau in tutoring systems through a dialogue between a human tutor and student, comparing different types of tutoring models and the effectiveness of each. The dialogue delves into understanding a problem step-by-step and addressing misconceptions. Major differences between low-interaction tutoring and natural tutoring are highlighted, along with conditions affecting effectiveness. Experiments with different student levels and instructional methods are analyzed to determine the most impactful tutoring approach.
Presentation Transcript
The interaction plateau. CPI 494, April 9, 2009. Kurt VanLehn.
Schematic of a natural language tutoring system, AutoTutor:
Step start → T: Elicit → S: Correct → Step end
S: Incorrect → Remediation: T: Hint or prompt → elicit again
Only if out of hints: T: Tell
Schematic of other natural language tutors, e.g., Atlas, Circsim-Tutor, Kermit-SE:
Step start → T: Elicit → S: Correct → Step end
S: Incorrect → Remediation: T: What is…? S: I don't know. T: Well, what is…? S: … T: …
Only if out of hints: T: Tell
Often called a KCD: knowledge construction dialogue
Hypothesized ranking of tutoring, most effective first • Expert human tutors • Ordinary human tutors • Natural language tutoring systems • Step-based tutoring systems • Answer-based tutoring systems • No tutoring
Hypothesized effect sizes. Bloom's (1984) 2-sigma result: 4 weeks of human tutoring vs. classroom instruction. [Chart: effect size relative to the classroom baseline.]
Hypothesized effect sizes. Kulik's (1984) meta-analysis of CAI vs. classroom: 0.4 sigma. [Chart: effect size relative to the classroom baseline.]
Hypothesized effect sizes. Many intelligent tutoring systems, e.g., Andes (VanLehn et al., 2005) and Carnegie Learning's tutors. [Chart: effect size relative to the classroom baseline.]
A problem and its steps • Suppose you are running in a straight line at constant speed. You throw a pumpkin straight up. Where will it land? • Initially, you and the pumpkin have the same horizontal velocity. • Your throw exerts a net force vertically on the pumpkin. • Thus causing a vertical acceleration. • Which leaves the horizontal velocity unaffected. • So when the pumpkin falls, it has traveled the same distance horizontally as you have. • Thus, it lands in your hands.
A dialogue between a human tutor (T) and human student (S)
T: Suppose you are running in a straight line at constant speed. You throw a pumpkin straight up. Where will it land?
S: Behind me.
T: Hmm. Let's think about that. Before you toss the pumpkin and are just carrying it, do you and the pumpkin have the same speed?
S: Yes.
T: Good. When you toss it up, is the net force on it exactly vertical?
S: I'm not sure.
T: You exert a force on the pumpkin, right?
Etc.
Schematic of dialogue about a single step:
Step start → T: Elicit → S: Correct → Step end
S: Incorrect → Remediation: T: Hint, or prompt, or explain, or analogy, or … → elicit again
T: Tell → Step end
Comparisons of expert to novice human tutors:
Novice tutors tend to go straight to T: Tell; expert tutors elicit the step and remediate.
Step start → T: Elicit → S: Correct → Step end
S: Incorrect → Remediation: T: Hint, or prompt, or explain, or analogy, or … → elicit again
Experts may have a wider variety of remediation tactics.
Schematic of an ITS handling a single step:
Step start → S attempts the step → S: Correct → Step end
S: Incorrect → T: Hint → S tries again
Only if out of hints: T: Tell
Major differences • Low-interaction tutoring (e.g., CAI) • Remediation on answer only • Step-based interaction (e.g., ITS) • Remediation on each step • Hint sequence, with final “bottom out” hint • Natural tutoring (e.g., human tutoring) • Remediation on each step, substep, inference… • Natural language dialogues • Many tutorial tactics
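The step-based remediation loop described above (elicit a step, hint through an ordered sequence ending in a "bottom out" hint, and tell only when the hints run out) can be sketched as a small state machine. This is an illustrative sketch only; the `Step` class, `tutor_step` function, and hint texts are hypothetical and are not taken from AutoTutor, Atlas, or any actual ITS.

```python
# Sketch of a step-based remediation loop, assuming a hypothetical
# Step object; not code from any system discussed in the talk.

class Step:
    def __init__(self, prompt, answer, hints):
        self.prompt = prompt    # the tutor's elicitation ("T: Elicit")
        self.answer = answer    # the correct step
        self.hints = hints      # ordered, ending with the "bottom out" hint

    def is_correct(self, response):
        return response.strip().lower() == self.answer.lower()

def tutor_step(step, respond):
    """Run one step. `respond` maps a tutor utterance to a student reply."""
    reply = respond(step.prompt)              # Step start: T elicits
    hints = iter(step.hints)
    while not step.is_correct(reply):         # S: Incorrect -> remediate
        hint = next(hints, None)
        if hint is None:                      # only if out of hints:
            return ("told", step.answer)      # T: Tell
        reply = respond(hint)                 # T: Hint or prompt, elicit again
    return ("elicited", reply)                # Step end: student produced it
```

A simulated student who answers wrong once and then correctly exits through the elicit branch; one who exhausts every hint is simply told the answer, mirroring the "only if out of hints: T: Tell" arc in the schematics.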
Conditions (VanLehn, Graesser et al., 2007) • Natural tutoring • Expert human tutors • Typed • Spoken • Natural language dialogue computer tutors • Why2-AutoTutor (Graesser et al.) • Why2-Atlas (VanLehn et al.) • Step-based interaction • Canned text remediation • Low interaction • Textbook
Human tutors (a form of natural tutoring):
Step start → T: Elicit → S: Correct → Step end
S: Incorrect → T: Hint, or prompt, or explain, or analogy, or … → elicit again
T: Tell → Step end
Why2-Atlas (a form of natural tutoring):
Step start → T: Elicit → S: Correct → Step end
S: Incorrect → a knowledge construction dialogue → elicit again
T: Tell → Step end
Why2-AutoTutor (a form of natural tutoring):
Step start → T: Elicit → S: Correct → Step end
S: Incorrect → hint or prompt → elicit again
T: Tell → Step end
Canned-text remediation (a form of step-based interaction):
Step start → T: Elicit → S: Correct → Step end
S: Incorrect → canned text → elicit again
T: Tell → Step end
Experiment 1: Intermediate students & instruction. No reliable differences.
Experiment 2: AutoTutor > Textbook = Nothing. Reliably different.
Experiments 1 & 2 (VanLehn, Graesser et al., 2007). No significant differences.
Experiment 3: Intermediate students & instruction. Deeper assessments.
Experiment 3: Intermediate students & instruction. No reliable differences.
Experiment 4: Novice students & intermediate instruction. Relearning.
Experiment 4: Novice students & intermediate instruction. All differences reliable.
Experiment 5: Novice students & intermediate (but shorter) instruction. Relearning.
Experiment 5: Novice students & intermediate instruction. No reliable differences.
Experiment 5: Low-pretest students only. Aptitude-treatment interaction?
Experiment 5, low-pretest students only: Spoken human tutoring > canned text remediation.
Experiments 6 and 7: Novice students & novice instruction. Was the intermediate text over the novice students' heads?
Experiments 6 and 7: Novice students & novice instruction. No reliable differences.
Interpretation: If students can follow the reasoning only with the tutor's help (i.e., the content falls within their zone of proximal development, ZPD), predict Tutoring > Canned text remediation. If they can follow the reasoning without any help, predict Tutoring = Canned text remediation. [Figure: experiments 1 & 4, 3 & 5, and 6 & 7 arranged by content complexity and by high- vs. low-pretest intermediates and novices.]
Original research questions • Can natural language tutorial dialog add pedagogical value? • Yes, when students must study content that is too complex to be understood by reading alone • How feasible is a deep linguistic tutoring system? • We built it. It’s fast enough to use. • Can deep linguistic and dialog techniques add pedagogical value?
When content is too complex to learn by reading alone, is deep > shallow? Why2-Atlas is not clearly better than Why2-AutoTutor.
When to use deep vs. shallow? [Table: use both; use deep; use a locally smart FSA; use equivalent texts, depending on the condition.]
Results from all 7 experiments (VanLehn, Graesser et al., 2007) • Why2: Atlas = AutoTutor • Why2 > Textbook • No essays • Content differences • Human tutoring = Why2 = Canned text remediation • Except when novice students worked with instruction designed for intermediates, then Human tutoring > Canned text remediation
Other evidence for the interaction plateau (Evens & Michael, 2006): no significant differences.
Other evidence for the interaction plateau (Reif & Scott, 1999): no significant differences.
Other evidence for the interaction plateau (Chi, Roy & Hausmann, in press): no significant differences.
Still more studies where natural tutoring = step-based interaction • Human tutors • Human tutoring = human tutoring with only content-free prompting for step remediation (Chi et al., 2001) • Human tutoring = canned text during post-practice remediation (Katz et al., 2003) • Socratic human tutoring = didactic human tutoring (Rosé et al., 2001a) • Socratic human tutoring = didactic human tutoring (Johnson & Johnson, 1992) • Expert human tutoring = novice human tutoring (Chae, Kim & Glass, 2005) • Natural language tutoring systems • Andes-Atlas = Andes with canned text (Rosé et al., 2001b) • Kermit = Kermit with dialogue explanations (Weerasinghe & Mitrovic, 2006)
Hypothesis 1: Exactly how tutors remedy a step doesn't matter much.
Step start → T: Elicit → S: Correct → Step end
S: Incorrect → Remediation → elicit again; T: Tell → Step end
What's inside the remediation box doesn't matter much.
Main claim: There is an interaction plateau (Hypothesis 1).
Hypothesis 2: The step remediation loop cannot be eliminated.
Step start → T: Elicit → S: Correct / S: Incorrect → Step end
Jumping straight from Step start to T: Tell must be avoided.
Main claim: There is an interaction plateau (Hypothesis 2).
Conclusions • What does it take to make computer tutors as effective as human tutors? • Step-based interaction • Bloom's 2-sigma results may have been due to weak control conditions (classroom instruction) • Other evaluations have also used weak controls • When is natural language useful? • For steps themselves (vs. menus, algebra…) • NOT for feedback & hints (remediation) on steps
Future directions for tutoring systems research • Making step-based instruction ubiquitous • Authoring & customizing • Novel task domains • Increasing engagement
Final thought • Many people “just know” that more interaction produces more learning. • “It ain’t so much the things we don’t know that get us into trouble. It’s the things we know that just ain’t so.” • Josh Billings (a.k.a. Henry Wheeler Shaw)