
Towards Identifying Unresolved Discussions in Student Online Forums

Jihie Kim, Jia Li, and Taehwan Kim. Information Sciences Institute / University of Southern California. http://ai.isi.edu/discourse, jihie@isi.edu

Presentation Transcript


  1. Towards Identifying Unresolved Discussions in Student Online Forums. Jihie Kim, Jia Li, and Taehwan Kim, Information Sciences Institute / University of Southern California. http://ai.isi.edu/discourse, jihie@isi.edu

  2. “Talk to as many other people as possible. CS is learned by talking to others, not by reading, or so it seems to me now.” -- Advice from an undergraduate computer science student http://www-scf.usc.edu/~csci402/

  3. Discussion Board and Corpora. An extensible open-source discussion board (phpBB) serves as a platform for bridging ISI research and USC teaching practice. It has been running for 15 semesters in CS and Engineering courses (undergraduate and graduate, USC and non-USC). Almost 800 students; over 8,000 messages.

  4. Student Messages in an Undergraduate Operating Systems Course. Text is incoherent and ungrammatical. Problem description: non-factoid questions are difficult to identify, depend on context, and may span multiple sentences or paragraphs. Answers require explanations.

  5. Thread Length Distribution. [Figure: two plots with axes labeled "# of threads" and "# of messages", one with data from an undergraduate CS course and one with data from a graduate CS course.] Threads are often very short, many consisting of only 1-2 messages. Students jump into programming details without understanding the larger picture or related concepts. TAs and instructors are not always available to fully guide interactions. → Need for discussion assessment and scaffolding.

  6. PedDiscourse Research. Discussion assessment: Which discussions need instructor attention? Who is asking and answering questions? What topics are discussed when? Discussion scaffolding: promote reflection; promote collaboration among students.

  7. Modeling discussion threads. Individual messages: topic, quantity. Relations among messages: responses/replies, roles that a message plays. Discussion threads: thread lengths and quantity, discussion topic, discussion focus, … Related course data: notes, web pages, readings, assignments and projects.

  8. Discussion Assessment: Which discussions need instructor attention? Identify the roles that individual messages play (question, answer, acknowledgment, etc.). Analyze patterns of message roles. Find discussion threads without an answer for the initial question.
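
The last step on this slide can be illustrated with a simple rule. This is only a minimal sketch under stated assumptions: each message already carries speech act labels, the label names `QUES` and `ANS-SUG` are borrowed from the later slides, and the `Message` class and `initial_question_unanswered` function are hypothetical names. The talk itself classifies threads with SVMs over SA features rather than with a hand-written rule like this one.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    """One post in a thread (hypothetical structure for illustration)."""
    author: str
    speech_acts: list = field(default_factory=list)  # e.g. ["QUES"] or ["ISSUE", "QUES"]


def initial_question_unanswered(thread):
    """Return True if the first message asks a question and no later
    message in the thread is labeled as an answer or suggestion."""
    if not thread or "QUES" not in thread[0].speech_acts:
        return False  # no initial question to resolve
    return not any("ANS-SUG" in msg.speech_acts for msg in thread[1:])


answered = [Message("S1", ["QUES"]), Message("S2", ["ANS-SUG"])]
unanswered = [Message("S1", ["QUES"]), Message("S2", ["ISSUE", "QUES"])]
print(initial_question_unanswered(answered))
print(initial_question_unanswered(unanswered))
```

A rule like this ignores whether the answer actually resolved the question, which is one reason a learned thread classifier may be preferable.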

  9. Roles of individual messages. Use Searle’s theory of Speech Acts (Searle, 1969) to model threaded discussions. Choose the Speech Acts (SAs) to use: Question (QUES), Answer or Suggestion (ANS-SUG), Correction or Objection (Neg-Ack), … An SA provides the relationship between a pair of messages. A pair of messages in a thread can have multiple SAs. A single message can be related (via SAs) to multiple messages.

  10. Speech Acts (SAs) in a discussion thread.
      QUES (S1): The Professor gave us 2 methods for forking threads from the main program. One was ....... The other was to ......... When you fork a thread where does it get created and take its 8 pages from? Do you have to calculate ......? If so how? Where does it store its PCReg .......? Any suggestions would be helpfule.
      ANS-SUG (S2): read the student documentation for the Fork syscall
      ISSUE, QUES (S1): I am still confused. I understand it is in the same address space as the parent process, where do we allocate the 8 pages of mem for it? And how do we keep track of .....? … I am sure it is a simple concept that I am just missing.
      ANS-SUG (S3): If you use the first implementation...., then you'll have a hard limit on the number of threads....If you use the second implementation, you need to.... Either way, you'll need to implement the AddrSpace::NewStack() function and make sure that there is memory available.

  11. Speech Act categories explored Code 1 Kappa: 0.54 Code 3 Code 2 Kappa: 0.70 Kappa: 0.58
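
The Kappa values on this slide measure agreement between annotators. The slides do not say which variant was used; the sketch below assumes Cohen's kappa for two annotators, and the function name `cohen_kappa` and the toy labels are illustrative only.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    and p_e is the agreement expected by chance from each annotator's
    label distribution.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


annotator_1 = ["QUES", "ANS-SUG", "QUES", "ANS-SUG"]
annotator_2 = ["QUES", "ANS-SUG", "ANS-SUG", "ANS-SUG"]
print(cohen_kappa(annotator_1, annotator_2))  # 0.5
```

In the toy example the annotators agree on 3 of 4 items (p_o = 0.75) and chance agreement is 0.5, giving kappa = 0.5.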

  12. Current Speech Act Categories

  13. Data cleaning and pre-processing. Discussion data: noisy and incoherent; high variation (messages may contain answers or suggestions in the form of questions); informal dialect used by students. Data pre-processing: tokenization, stemming, and other filtering steps (e.g., removing programming code existing within messages, pluralized words, etc.). Data categorization: transform/replace commonly occurring words or word sequences with categories: apostrophe words (‘re, ‘ve, ‘m, …); technical terms within messages are replaced by TECH_TERM (drawn from commonly used technical terms in the course). Pronouns are not replaced (“you can” in ANS vs. “I can”).
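
The pre-processing and categorization steps might look roughly like the sketch below. Several pieces are assumptions made for illustration: the contents of `TECH_TERMS`, the category token `APOS_WORD`, and the crude plural-stripping `naive_stem` (a stand-in for a real stemmer). Only the `TECH_TERM` token comes from the slide. Removal of embedded programming code is not covered here.

```python
import re

# Hypothetical list of course-specific technical terms.
TECH_TERMS = {"fork", "syscall", "thread", "addrspace"}
# Apostrophe suffixes named on the slide, mapped to one category token.
APOSTROPHE_SUFFIXES = {"'re", "'ve", "'m"}


def naive_stem(token):
    """Very rough plural stripping; a real stemmer would replace this."""
    if len(token) > 4 and token.endswith("s") and not token.endswith("ss"):
        return token[:-1]
    return token


def preprocess(text):
    """Tokenize, normalize, and replace selected tokens with categories."""
    text = text.lower().replace("\u2019", "'")  # normalize curly apostrophes
    tokens = re.findall(r"[a-z]+|'[a-z]+", text)
    result = []
    for token in tokens:
        if token in APOSTROPHE_SUFFIXES:
            result.append("APOS_WORD")
            continue
        stem = naive_stem(token)
        # Pronouns such as "you" and "i" are deliberately left untouched.
        result.append("TECH_TERM" if stem in TECH_TERMS else stem)
    return result


print(preprocess("You're forking threads"))
# ['you', 'APOS_WORD', 'forking', 'TECH_TERM']
```

Keeping pronouns intact preserves the contrast the slide points out between "you can" (typical of answers) and "I can".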

  14. Features for SA Classification. F1: Cue phrases and their positions (e.g. “Thank” position). F2: Message position. F3: Previous message information. F4: Poster class. F5: Poster change. F6: Message length. Example TBL rules.
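
To make F1 through F6 concrete, here is a minimal feature-extraction sketch. The slide names the features but not their encoding, so everything specific is an assumption: the `CUE_PHRASES` list, the begin/middle/end position buckets, the use of the previous message's SA label for F3, and the dictionary layout of a message.

```python
# Hypothetical cue phrases; the actual list is not given in the slides.
CUE_PHRASES = ("thank", "?", "you can")


def position_bucket(text, cue):
    """Return where the cue first occurs: 'begin', 'middle', 'end', or None."""
    offset = text.find(cue)
    if offset < 0:
        return None
    relative = offset / max(len(text), 1)
    if relative < 1 / 3:
        return "begin"
    if relative < 2 / 3:
        return "middle"
    return "end"


def message_features(thread, index):
    """Build a feature dict for thread[index]; each message is a dict with
    'author', 'role', 'text', and optionally 'sa' (its speech act label)."""
    msg = thread[index]
    prev = thread[index - 1] if index > 0 else None
    text = msg["text"].lower()
    feats = {}
    for cue in CUE_PHRASES:  # F1: cue phrases and their positions
        bucket = position_bucket(text, cue)
        if bucket is not None:
            feats[f"cue[{cue}]@{bucket}"] = 1
    feats["is_first"] = int(index == 0)  # F2: message position
    feats["is_last"] = int(index == len(thread) - 1)
    feats["prev_sa"] = prev.get("sa", "NONE") if prev else "NONE"  # F3
    feats["poster_class"] = msg["role"]  # F4: e.g. student vs. instructor
    feats["poster_changed"] = int(prev is not None and prev["author"] != msg["author"])  # F5
    feats["length"] = len(text.split())  # F6: length in words
    return feats


thread = [
    {"author": "S1", "role": "student", "text": "Where does the thread get its pages?", "sa": "QUES"},
    {"author": "S2", "role": "student", "text": "Thank you, that helps", "sa": "ACK"},
]
print(message_features(thread, 1))
```

Feature dictionaries of this shape can be passed to whichever learner is used, whether a rule learner or an SVM.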

  15. SA Classification Results

  16. Profiling discussion threads with SAs (Q1) Were all questions answered? (Y/N) (Q2) Were there any issues or confusion? (Y/N) (Q3) Were those issues or confusions resolved? (Y/N)

  17. Thread classification with SA classifiers. (Q1) Were all questions answered? (Y/N) (Q2) Were there any issues or confusion? (Y/N) (Q3) Were those issues or confusions resolved? (Y/N) • Feature Set 1: whether there was an [SA] in the thread • Feature Set 2: whether the last message in the thread included [SA] • Results: (a) SVM classification with human-annotated SAs; (b) SVM classification with system-generated SAs
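
The two feature sets can be sketched as binary indicator vectors, one entry per speech act. The label inventory in `SA_LABELS` is assembled from labels mentioned elsewhere in the slides and may be incomplete; the function name and input format are hypothetical.

```python
# SA labels mentioned in the slides; the full inventory may differ.
SA_LABELS = ("QUES", "ANS-SUG", "ISSUE", "ACK", "NEG-ACK")


def thread_features(thread_sas):
    """Encode a thread as Feature Set 1 followed by Feature Set 2.

    thread_sas is a list with one entry per message, each entry being the
    list of SA labels assigned to that message.
    """
    anywhere = set().union(*thread_sas) if thread_sas else set()
    last = set(thread_sas[-1]) if thread_sas else set()
    set1 = [int(sa in anywhere) for sa in SA_LABELS]  # SA occurs in the thread
    set2 = [int(sa in last) for sa in SA_LABELS]      # SA occurs in the last message
    return set1 + set2


example = [["QUES"], ["ANS-SUG"], ["ISSUE", "QUES"]]
print(thread_features(example))
# [1, 1, 1, 0, 0, 1, 0, 1, 0, 0]
```

Vectors like these, paired with the Y/N answers to Q1 through Q3, could then be used to train one binary SVM per question. The same encoding works whether the SA labels come from human annotators or from the SA classifiers.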

  18. Direct thread classification without SA classifiers. (Q1) Were all questions answered? (Y/N) (Q2) Were there any issues or confusion? (Y/N) (Q3) Were those issues or confusions resolved? (Y/N) • F1’: cue phrases and their positions (last message or not) in the thread • Results: (a) with SAs; (b) direct classification

  19. Summary and Discussion. Identifying unresolved discussions • Discern speech acts (SAs) in student online discussions • Classify discussion threads with SAs as features • Compare SA-based classification and direct thread classification with phrase features • SA-based features may help in some difficult cases • E.g. longer threads in which more than one question is raised

  20. Related Work • Pedagogical/tutorial dialogue Instructional discourse modeling (Yuan et al., 2008; Graesser et al., 2005; McLaren et al., 2007; Boyer et al., 2008; Fossati 2008; Litman et al., 2003) • Dialogue modeling in email messages or blog (e.g. AAAI 2008 workshop on Enhanced Messaging) • Email speech acts • Requests and commitments • Handling noisy data and high variance in text (Knoblock et al., 2007) • Course topic and task modeling using information extraction techniques (Roy et al. 2008; Jovanovic et al., 2006 ) • Trace student e-learning activities (Israel and Aiken, 2007; Dringus and Ellis, 2005)

  21. Ongoing Work: Discussion Assessment • Discussion thread pattern and phase analysis • question, understanding, solving and closing • Discussion topic analysis • Coherency of discussion topics • Student profiling • Information providers (peer mentors) vs. information seekers • Information flow and influence network among participants • Use of workflows (distributed systems) for large-scale assessment • E.g. participation changes over several semesters

  22. Supported by the National Science Foundation (NSF). More details available at http://ai.isi.edu/discourse Email: jihie@isi.edu
