
Boredom Across Activities, and Across the Year, within Reasoning Mind

William L. Miller, Ryan Baker, Mathew Labrum, Karen Petsche, Angela Z. Wagner


In recent years

  • Increasing interest in modeling more about students than just what they know

  • Can we assess a broad range of constructs

  • In a broad range of contexts


Boredom

  • A particularly important construct to measure


Boredom is

  • Common in real-world learning (D’Mello, 2013)

  • Associated with worse learning outcomes in the short term (Craig et al., 2004; Rodrigo et al., 2007)

  • Associated with worse course grades and standardized exam performance (Pekrun et al., 2010; Pardos et al., 2013)

  • Associated with lower probability of going to college, years later (San Pedro et al., 2013)


Online learning environments

  • Offer great opportunities to study boredom in context

    • Very fine-grained interaction logs that indicate everything the student did in the system


Automated boredom detection

  • Can we detect boredom in real time, while a student is learning?

  • Can we detect boredom retrospectively, from log files?

  • Would allow us to study affect at a large scale

    • Figure out which content is most boring, in order to improve it


Affect Detection: Physical Sensors?

  • Lots of work shows that affect can be detected using physical sensors

    • Tone of voice (Litman & Forbes-Riley, 2005)

    • EEG (Conati & McLaren, 2009)

    • Posture sensor and video (D’Mello et al., 2007)

  • It’s hypothesized – but not yet conclusively demonstrated – that using physical sensors may lead to better performance than interaction logs alone


Sensor-free affect detection

  • Easier to scale to the millions of students who use online learning environments

  • In settings that do not have cameras, microphones, and other physical sensors

    • Home settings

      • have parents bought equipment?

      • can they set it up and maintain it?

    • Classroom settings

      • can school maintain equipment?

      • do students intentionally destroy equipment?

      • parent concerns and political climate


Sensor-free boredom detection

  • Has been developed for multiple learning environments

    • Problem-solving tutors (Baker et al., 2012; Pardos et al., 2013)

    • Dialogue tutors (D’Mello et al., 2008)

    • Narrative virtual learning environments (Sabourin et al., 2011; Baker et al., 2014)

    • Science simulations (Paquette et al., 2014)

  • The principles of affect detection are largely the same across environments

  • But the behaviors associated with boredom differ considerably between environments


This talk

  • We discuss our work to develop sensor-free boredom detection for Reasoning Mind Genie 2 (Khachatryan et al., 2014)

  • Self-paced blended learning mathematics curriculum for elementary school students

    • Youngest population for sensor-free affect detection so far

  • Used by approximately 100,000 students a year


Reasoning Mind Genie 2

  • Combines

    • Guided Study with a pedagogical agent “Genie”

    • Speed Games that support development of fluency

  • Used in schools 3-5 days a week for 45-90 minutes per day


Reasoning Mind Genie 2

[Screenshots of the Reasoning Mind Genie 2 interface: panels (a), (b), and (c)]


Reasoning Mind Genie 2

  • Better affect and more on-task behavior than most pedagogies, online or offline (Ocumpaugh et al., 2013)

  • Still a substantial amount of boredom

  • Reducing boredom is a key goal


Role for affect detection

  • If we can detect boredom in log files, we can determine which content is more boring, and improve that content


Related Work

  • Evidence that specific design features are associated with boredom in Cognitive Tutors for high school algebra (Doddannara et al., 2013)

  • Evidence that some disengaged behaviors increase during the year (Beck, 2005)

    • Important to verify that differences in affect are due to actual content/design, not time of year


Approach to Boredom Detection

  • Collect “ground truth” data on student boredom, using field observations

  • Synchronize log data to field observations

  • Distill meaningful data features of log data, hypothesized to relate to boredom

  • Develop automated detector using classification algorithm

  • Validate detector for new students/new lessons/new populations
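
Read as code, these five steps compose into a single pipeline. The skeleton below is purely illustrative: every function is a hypothetical stub, not the authors' actual tooling (the real work used BROMP/HART for observations and RapidMiner for modeling, as the following slides describe).

```python
# Hypothetical skeleton of the five steps above; all names are illustrative
# stubs, not the authors' actual tooling (BROMP/HART + RapidMiner).

def collect_ground_truth(): ...                    # step 1: BROMP observations
def synchronize(obs, logs): ...                    # step 2: align obs with logs
def distill_features(synced): ...                  # step 3: log features
def train_classifier(X, y): ...                    # step 4: classification algorithm
def validate_by_student(clf, X, y, students): ...  # step 5: held-out students

def build_boredom_detector(logs):
    obs = collect_ground_truth()
    synced = synchronize(obs, logs)
    X, y, students = distill_features(synced)
    clf = train_classifier(X, y)
    scores = validate_by_student(clf, X, y, students)
    return clf, scores
```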


BROMP 2.0 Field Observations (Ocumpaugh et al., 2012)

  • Conducted through Android app HART (Baker et al., 2012)

  • Protocol designed to reduce disruption to student

    • Some features of the protocol: observe with peripheral vision or side glances, hover near a student other than the one being observed, 20-second “round-robin” observations of several students, and look bored yourself (bored-looking people are boring to watch)

  • Inter-rater reliability around 0.8 for behavior, 0.65 for affect

  • 64 coders now certified in USA, Philippines, India


Data collection

  • 408 elementary school students


Data collection

  • Diverse sample important for model generalizability (Ocumpaugh et al., 2014)

  • 11 different classes

  • 6 schools

    • 2 urban in Texas, predominantly African-American

    • 1 urban in Texas, predominantly Latino

    • 1 suburban in Texas, predominantly White

    • 1 suburban in Texas, mixed ethnicity/race

    • 1 rural in West Virginia, predominantly White


Affect coding

  • 3 expert coders observed each student using BROMP

  • Coded 5 categories of affect

    • Engaged Concentration

    • Boredom

    • Confusion

    • Frustration

    • ? (other / uncodable)

  • 4,891 observations collected in RM classrooms


Building detectors

  • Observations were synchronized with the logs of the students’ interactions with RM, using the HART app and an Internet time server

  • For each observation, a set of 93 meaningful features describing the student’s behavior was engineered

  • Computed on actions occurring during or preceding an observation (up to 20 seconds before)
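
As a rough sketch of that windowing step, assuming synchronized timestamps and a hypothetical log schema:

```python
import pandas as pd

# Toy log for one student plus one BROMP observation time; the schema
# is hypothetical, for illustration only.
actions = pd.DataFrame({
    "timestamp": pd.to_datetime(["2014-01-15 09:00:05",
                                 "2014-01-15 09:00:18",
                                 "2014-01-15 09:00:40"]),
    "correct": [1, 0, 1],
})
obs_time = pd.Timestamp("2014-01-15 09:00:30")

# Keep actions occurring during or up to 20 seconds before the observation,
# as described on this slide; only the 09:00:18 action survives here.
window = actions[(actions["timestamp"] >= obs_time - pd.Timedelta(seconds=20)) &
                 (actions["timestamp"] <= obs_time)]
print(window)
```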


Features: Examples

  • Individual action features

    • Whether an action was correct or not

    • How long the action took

  • Features across all past activity

    • Fraction of the student’s previous attempts on the current skill that were correct

  • Other known models applied to logs

    • Probability student knows skill (Bayesian Knowledge Tracing)

    • Carelessness

    • Moment-by-Moment Learning Graph
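
A sketch of how a few of the simpler features might be computed from such a window (column names are hypothetical; the BKT, carelessness, and moment-by-moment learning features are full models in their own right and are omitted):

```python
import pandas as pd

# Hypothetical window of actions preceding one observation.
window = pd.DataFrame({
    "correct": [1, 0, 1, 1],
    "duration_sec": [12.0, 35.5, 8.2, 10.1],
})

features = {
    # Individual-action features, from the most recent action in the window.
    "last_action_correct": int(window["correct"].iloc[-1]),
    "last_action_duration": float(window["duration_sec"].iloc[-1]),
    # Feature across past activity: fraction of attempts that were correct
    # (computed over the window here; the real feature spans all prior
    # attempts on the current skill).
    "frac_correct": float(window["correct"].mean()),
}
print(features)
```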


Automated detector of boredom

  • Detectors were built using RapidMiner 5.3

  • For each algorithm the best features were selected using forward selection/backward elimination

  • Data was re-sampled to have more equal class frequencies; models were evaluated on original class distribution

  • Detectors were validated using 10-fold student-level cross-validation (see the sketch below)
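
A comparable setup in scikit-learn, standing in for RapidMiner with synthetic data (the class rebalancing of the training folds is noted in a comment but omitted):

```python
import numpy as np
from sklearn.model_selection import GroupKFold, cross_val_score
from sklearn.tree import DecisionTreeClassifier  # rough stand-in for J48

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 10))             # synthetic stand-in for the 93 features
y = rng.integers(0, 2, size=400)           # synthetic bored / not-bored labels
students = rng.integers(0, 40, size=400)   # student ids for grouping

# Student-level 10-fold cross-validation: all of a student's observations
# fall in either the training folds or the test fold, never both. The
# slide's rebalancing step would be applied to each training fold only.
cv = GroupKFold(n_splits=10)
scores = cross_val_score(DecisionTreeClassifier(max_depth=4), X, y,
                         groups=students, cv=cv, scoring="roc_auc")
print(round(scores.mean(), 3))
```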


Automated detector of boredom

  • Detectors were built using 4 machine learning algorithms that have been successful for building affect detectors in the past:

    • J48

    • JRip

    • Step Regression

    • Naïve Bayes



Machine learning

  • Performance of the detectors was evaluated using A’

    • Given two observations, probability of correctly identifying which one is an example of a specific affective state and which one is not

    • A’ of 0.5 is chance level and 1 is perfect

    • Identical to Wilcoxon statistic

    • Very similar to AUC ROC (Area Under the Receiver-Operating Characteristic Curve)
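
Because A’ is the pairwise probability just described, and equals the Wilcoxon statistic, it can be computed directly from the definition or with a standard AUC routine; a small check on toy values:

```python
import numpy as np
from itertools import product
from sklearn.metrics import roc_auc_score

y_true = np.array([0, 0, 1, 1, 1])             # toy labels: bored = 1
y_conf = np.array([0.2, 0.6, 0.4, 0.7, 0.9])   # toy detector confidences

# A' from its definition: over all (bored, not-bored) pairs, the probability
# the bored example receives the higher confidence, counting ties as 0.5.
pos, neg = y_conf[y_true == 1], y_conf[y_true == 0]
a_prime = np.mean([1.0 if p > n else 0.5 if p == n else 0.0
                   for p, n in product(pos, neg)])

print(a_prime)                         # 0.8333...
print(roc_auc_score(y_true, y_conf))   # coincides for a single ranking
```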


Results

  • A’ = 0.64

  • Comparable to similar detectors in other systems, validated in a similarly stringent fashion


Using detectors

  • Model applied to entire year of data from these classrooms

  • 2,974,944 actions by 462 students

  • Includes 54 additional students not present during observations

  • Aggregation over pseudo-confidences rather than binary predictions

    • Retains more information
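
For instance, aggregation might average the pseudo-confidences within each student and then across students; the two-stage averaging and the column names here are illustrative assumptions:

```python
import pandas as pd

# Hypothetical detector output over the year's logs.
preds = pd.DataFrame({
    "student_id": [1, 1, 2, 2, 2],
    "objective":  ["A", "B", "A", "A", "B"],
    "p_bored":    [0.12, 0.40, 0.22, 0.18, 0.55],  # pseudo-confidences
})

# Average the continuous confidences rather than thresholded 0/1 labels:
# first within each student, then across students, so heavily logged
# students do not dominate an objective's estimate.
per_student = preds.groupby(["objective", "student_id"])["p_bored"].mean()
per_objective = per_student.groupby(level="objective").mean()
print(per_objective)
```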


Apparent downward trend

  • Is it statistically significant?

  • Yes.

  • Students are less bored later in the year

  • F-test controlling for student

  • p<0.001
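
One way to run such a test, assuming (as an illustration, not the authors' exact procedure) an OLS model with student as a categorical covariate and an F-test on the time term, over synthetic data:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

rng = np.random.default_rng(1)
n = 500
df = pd.DataFrame({
    "student": rng.integers(0, 30, size=n).astype(str),
    "day": rng.integers(0, 180, size=n),   # day of the school year
})
# Synthetic boredom confidences with a slight downward trend over the year.
df["bored"] = 0.14 - 0.0001 * df["day"] + rng.normal(0, 0.05, size=n)

# F-test for the time trend, controlling for student as a fixed effect.
model = smf.ols("bored ~ day + C(student)", data=df).fit()
print(anova_lm(model, typ=2))  # F and p for 'day', net of student effects
```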


Is it practically significant?

  • No.

  • r = -0.06

  • With large enough samples, anything is statistically significant


Kind of a positive thing

  • At minimum, students aren’t getting more bored as the year goes on

  • In other systems, students get more disengaged as the year goes on (Beck, 2005)

  • And the overall level of boredom (~14%) is not very high


Beyond this

  • Curriculum is self-paced

  • Which means that predicting boredom by date may obscure real variation

  • Instead, look at boredom by learning objective


Predicting boredom by objective

  • p<0.001

  • r=0.343


If we cluster objectives into two groups

  • “High boredom”

  • “Low boredom”

  • Ignoring the one point in between the two groups

  • Cohen’s D = 0.67
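
Cohen’s D is the difference between the two clusters’ mean boredom levels divided by their pooled standard deviation; a quick sketch with made-up per-objective values (so the result will not reproduce the 0.67 above):

```python
import numpy as np

# Made-up mean boredom confidences for the two clusters of objectives.
high = np.array([0.16, 0.22, 0.12, 0.26])
low = np.array([0.10, 0.18, 0.08, 0.20])

# Cohen's d: difference of means over the pooled standard deviation.
n1, n2 = len(high), len(low)
pooled_sd = np.sqrt(((n1 - 1) * high.var(ddof=1) +
                     (n2 - 1) * low.var(ddof=1)) / (n1 + n2 - 2))
d = (high.mean() - low.mean()) / pooled_sd
print(round(d, 2))  # ~0.83 for these toy values
```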


Future Work

  • So… what is it that differentiates the higher-boredom lessons from the lower-boredom lessons?

    • Nothing obvious, unfortunately…

    • May be necessary to develop a taxonomy of potential differences, and see which are predictive

    • May be possible to build on prior work by Doddannara et al. (2013), which did exactly this for Cognitive Tutor


Future Work

  • Can we fix the more boring lessons?

    • Either by determining why they are boring

    • Or just by adding a little more “fun content”


Eventual Goal

  • Use precise assessments of boredom to help us enhance Reasoning Mind

    • Improving engagement

    • Improving learning outcomes


Thank you

twitter.com/BakerEDMLab

Baker EDM Lab

See our free online MOOT “Big Data and Education”

All lab publications available online – Google “Ryan Baker”

“Data, Analytics, and Learning” – EdX, Fall 2014

