Pedagogical Discourse: Connecting Students to Past Discussions and Peer Mentors within an Online Discussion Board Jihie Kim http://ai.isi.edu/discourse email@example.com
“Talk to as many other people as possible. CS is learned by talking to others, not by reading, or so it seems to me now.” -- Advice from a computer science student http://www-scf.usc.edu/~csci402/
Discussion Board and Corpora Extensible open-source discussion board (phpBB) serves as a platform for bridging ISI research and USC teaching practice 13 semesters running… Six courses Undergrad/Graduate USC/Non-USC Almost 700 students Over 7000 messages
Student Messages in an Undergraduate Operating Systems Course Text is incoherent and ungrammatical. Problem description: Non-factoid questions are difficult to identify, dependent on context, and may include multiple sentences or paragraphs. Answers require explanations.
Thread Length Distribution
[Figure: thread length distributions (axes: # of threads, # of messages); panels: data from an undergraduate CS course, data from a graduate CS course]
• Threads are often very short, many consisting of only 1-2 messages
• Students jump into programming details without understanding the larger picture or related concepts
• TAs and instructors are not always available to fully guide interactions
Need for Discussion Assessment and Scaffolding
Discussion Scaffolding PedaBot: Promote reflection • Find useful information from past discussions • Past discussions could provide suggestions on current problems, although the suggestions may not present exact answers • Promote further discussion on related technical topics • Graduate discussions are more concept-oriented than undergraduate discussions – could provide interesting references for similar problems MentorMatch: Promote collaboration among students • Identify student ‘experts’ on a topic • Connect help-seekers to mentors
Approach to Generating PedaBot Response
Pre-process messages from past discussions (both undergraduate and graduate)
• Model messages with the technical terms used
• Divide related text into coherent sub-units (tiles), e.g. TextTiling
• Model topics in the discussions [Feng et al., AAAI-06]
Generate a response for the current discussion
• Identify problems/questions in the current discussion
  • Pick the first post in a thread (80% of first posts are questions)
  • In our first study, we include only the first posts
  • Automatically identifying “Question” messages or discussion focus [Ravi & Kim, AIED 2007; Feng et al., HLT-NAACL 06]
• Match students’ problems to similar past discussions
  • Current and past messages represented as term vectors (with TF/IDF, LSA)
  • Match by similarity (use cosine similarity)
  • Filter candidate responses by topics [Kim et al., ITS-2008]
• Generate response
  • Return most similar message or set of messages in thread
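The matching step above can be sketched roughly as follows. This is a minimal illustration, not the PedaBot implementation: it covers only TF/IDF term vectors and cosine similarity, and leaves out LSA, TextTiling and topic filtering. The tokenizer, the smoothed IDF formula, the function names and the sample messages are all assumptions made for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens (assumed tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    # Smoothed IDF (an assumption; other variants are equally valid).
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    return [{t: tf * idf[t] for t, tf in Counter(toks).items()} for toks in tokenized]


def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar(current, past):
    """Return (index, score) of the past message closest to the current one.

    Assumes `past` is non-empty.
    """
    vectors = tfidf_vectors(past + [current])
    query = vectors[-1]
    scores = [cosine(query, v) for v in vectors[:-1]]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


# Hypothetical past messages, loosely modeled on the examples in the slides.
past_messages = [
    "RPC stands for remote procedure call and is used between processes",
    "How do we implement the test cases for virtual memory in assignment 3",
    "The page table is loaded into memory before mapping physical pages",
]
best_index, best_score = most_similar("What is RPC", past_messages)
print(best_index, round(best_score, 3))
```

In this toy run the first past message is selected because it shares the rare term "rpc" with the query; a shared common word such as "is" contributes much less.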
Evaluating PedaBot Responses : Design • “Current” discussion corpus • Fall 2006 – 207 msgs. (first message posted in threads) • Past discussion corpora (taken from 4 semesters prior to Fall 2006) • Student messages from Undergraduate discussions – 3788 msgs • Instructor messages from Undergraduate discussions - 531 msgs • Student messages from Graduate discussions – 957 msgs • PedaBot responses rated by 4 people – average ratings used • Evaluation of system responses – 2 criteria • Technical quality of retrieved results • Relevance of retrieved results w.r.t asked question
Preliminary Evaluation (a): Technical Quality
• Technical quality of results (messages) returned by the system
Rating scale: 5 – Very High Quality; 4 – Good Quality; 3 – Technical; 2 – Somewhat technical; 1 – Not technical
M1: Page table loading into memory…. When we have the page table in disk, we cannot map the physical pages because the page tables are larger than physical space. Using memmap will not work… (Technical Rating = 5)
…
M5: How many points will be taken off for assignment #2, if the first test case does not work? … (Technical Rating = 1)
Preliminary Evaluation (b): Relevance/Similarity
• Relevance: how “related” was the message returned by the system with respect to the question asked by the student?
Rating scale: 5 – Very Good Response; 4 – Good Response; 3 – Related; 2 – Somewhat Related; 1 – Unrelated
Current message: What is RPC?
M1: RPC stands for “Remote Procedure Call”. It is used in … (Relevance Rating = 5)
...
M5: How do we implement the test cases for virtual memory in assignment #3? (Relevance Rating = 1)
PedaBot User Interface Relevant past discussion or document
Whole discussion can be viewed Students can rate retrieved discussions
Pilot Study of PedaBot Integration into a live student discussion board (Fall 2007) Upper-level undergraduate Operating Systems course offered by the Computer Science department at the University of Southern California (Male N=104, Female N=15) Student surveys collected in Fall 2007 and Fall 2008
Difference in thread length with and without PedaBot Hypothesis: Use of PedaBot for reflection will increase student participation in discussions Initial analysis
Discussion Scaffolding PedaBot: Promote reflection • Find useful information from past discussions • Past discussions could provide suggestions on current problems, although the suggestions may not present exact answers • Promote further discussion on related technical topics • Graduate discussions are more concept-oriented than undergraduate discussions – could provide interesting references for similar problems MentorMatch: Promote collaboration among students • Identify student ‘experts’ on a particular topic • Connect help-seekers to experts (mentors)
MentorMatch
Motivation
• Get students the help they need
• Promote collaboration between help-seekers and mentors
• Peer replies better than instructor replies at furthering discussion
• Acknowledge mentors for their role in assisting classmates
Pilot Study
• Integrated into live student discussion board (Fall 2008)
• Run 5 weeks (10 weeks into 15-week semester)
Design
• Do students use tool / find it helpful?
• Does notifying mentors encourage participation?
• Do mentors receive better grades?
Student topic profiles Build profile of topic categories for each student Classify each student’s message using topic models Models based on higher-level topics covered in course and textbook index terms that map to them (Feng, Kim, Shaw, Hovy, AAAI-2006) Distinguish between help seekers & mentors (experts) Messages weighted based on type: question or response - for now, initial post or reply (most initial posts are questions) Include contributions of short messages - e.g. yes/no, acknowledgement
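One way to picture the profiling step is sketched below. This is an illustrative assumption, not the MentorMatch implementation: it takes the topic label of each message as already given (the topic classifier is out of scope), treats initial posts as help-seeking and replies as mentoring, and uses hypothetical weights, function names and student IDs.

```python
from collections import defaultdict

# Hypothetical weights: the slides state only that messages are weighted by type.
SEEK_WEIGHT = 1.0    # credit for an initial post (most initial posts are questions)
MENTOR_WEIGHT = 1.0  # credit for a reply


def build_profiles(messages):
    """Build per-student, per-topic scores from (student, topic, is_initial_post) tuples."""
    profiles = defaultdict(
        lambda: defaultdict(lambda: {"seeking": 0.0, "mentoring": 0.0})
    )
    for student, topic, is_initial_post in messages:
        if is_initial_post:
            profiles[student][topic]["seeking"] += SEEK_WEIGHT
        else:
            profiles[student][topic]["mentoring"] += MENTOR_WEIGHT
    return profiles


def top_mentors(profiles, topic, k=3):
    """Return up to k students with the highest mentoring score on a topic."""
    scored = [
        (student, topics[topic]["mentoring"])
        for student, topics in profiles.items()
        if topic in topics and topics[topic]["mentoring"] > 0
    ]
    # Highest score first; ties broken by student ID for a stable order.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return [student for student, _ in scored[:k]]


# Hypothetical messages: (student ID, topic label, is this the thread's first post?)
messages = [
    ("s1", "virtual memory", True),
    ("s2", "virtual memory", False),
    ("s2", "virtual memory", False),
    ("s3", "virtual memory", False),
    ("s1", "rpc", False),
]
profiles = build_profiles(messages)
mentors = top_mentors(profiles, "virtual memory")
print(mentors)
```

Keeping separate "seeking" and "mentoring" scores per topic lets the same student appear as a help-seeker on one topic and a mentor on another.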
Can view class and personal mentor info “Topic experts” link opens a window with mentor names and topics. “Topics requesting your expertise” section displays links to discussions on topics of the student’s expertise.
Did students find tool useful? Interest and usefulness (N=20) Participation (N=20)
Do mentors receive better grades? • Project grades (max=40), usage stats & profiling scores for 2 projects (topics) • Scores of experts (Exp) compared to averages of non-experts (NEA) • Grades • Topic Profiling Score
Groups formed from discussion: Network Analysis (Kang, Kim, Shaw 2009) • Active Group Participants: • 47: Instructor, 1461: TA • 1320, 1425, 1348, 1437, 1277, 1459 • BRIDGE: • 1289, 1294, 1435, 1371 • All bridge students received good grades <Group distribution in fall 2008 semester>
Summary Discussion Scaffolding • Promote reflection (Kim et al., ITS 2008; Feng et al., IUI 2006) • Promote collaboration among students (Kim and Shaw, IAAI 2009; Shaw, Kim and Supanakoon, AIED 2009) Discussion Assessment • Workflows (distributed computing) for large scale assessment • Community detection and information flow (Kang, Kim, Shaw 2009) • Identify discussion threads with unanswered questions (Ravi & Kim, AIED 2007) • Assess student participation (Kim & Beal AERA 2006; Ravi et al., 2007; Kim et al., AIED 2009; Kim et al., AIED 2007; Kim et al., AIED 2005) • Assess topics discussed over time (Feng, Kim, Shaw, Hovy, AAAI-2006) • Identify discussion focus (Feng, Shaw, Kim, Hovy, HLT-NAACL 2006) • Assess tutor participation effect (Shaw, AIED 2005) • Sentiment in student discussions (Wyner et al., 2009) More details/papers available at: http://ai.isi.edu/discourse Supported by NSF CISE/IIS and EHR/CCLI
Related Work • Dialogue modeling in email messages or blogs (e.g. AAAI 2008 workshop on Enhanced Messaging) • Email speech acts • Requests and commitments • Handling noisy data and high variance in text (Knoblock et al., 2007) • Pedagogical dialogue Instructional discourse modeling (Yuan et al., 2008; Graesser et al., 2005; McLaren et al., 2007; Boyer et al., 2008; Fossati 2008) • Course topic and task modeling using information extraction techniques (Roy et al., 2008; Jovanovic et al., 2006) • Trace student e-learning activities (Israel and Aiken, 2007; Dringus and Ellis, 2005) • Scaffolding strategies for e-learning tools (Tang and McCalla 2005; Bari and Benzater, 2005; …) • Social Media analysis