The Learning Chatbot


The Learning Chatbot. Bonnie Chantarotwong, IMS-256, Fall 2006. What is wrong with state-of-the-art chatbots? They are repetitive; they are predictable (simple pattern matching and set responses); they have no memory, which can lead to circular conversations; and they don’t sound like real people.


The Learning Chatbot

Bonnie Chantarotwong

IMS-256 Fall 2006

What is wrong with state of the art chatbots?
  • They are repetitive
  • They are predictable
    • simple pattern matching & set response
  • They have no memory
    • can lead to circular conversations
  • They don’t sound like real people
Proposed Solution
  • Train chatbots on a corpus of conversations in order to mimic a given personality
Filtering the training corpus
  • Need many conversations containing the query screen name
  • Eliminate undesirable data
    • phone numbers
    • addresses
    • sensitive gossip
  • Eliminate highly technical data
    • since most tech problems are very specific, unless the bot was trained on a newsgroup, learned responses are not likely to be useful for tech support
Parsing the training corpus
  • Extract messages from HTML
  • Group together consecutive messages by the same screen name
  • Simplify prompt messages
    • !!!!!!!??????? -> !?
    • Ohhhhhhhhhhhh! -> ohh!
    • WhATz uP?? -> whatz up?
    • hahahahaha -> haha
  • Break prompts into word sequences (eliminating stop words)
    • I took the cat to a vet -> [i, took, cat, to, vet]
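
The parsing steps above can be sketched roughly as follows. This is a minimal illustration, not the presenter's code: the function names, the regular expressions, and the tiny STOP_WORDS set are all assumptions chosen only to reproduce the examples on this slide.

```python
import re
from itertools import groupby

# Hypothetical stop-word list; the slide only shows "the" and "a" being dropped.
STOP_WORDS = {"a", "an", "the"}

def group_turns(messages):
    """Merge consecutive (screen_name, text) messages from the same screen name."""
    return [
        (name, " ".join(text for _, text in run))
        for name, run in groupby(messages, key=lambda m: m[0])
    ]

def simplify(message):
    """Lowercase a prompt and collapse exaggerated repetition."""
    text = message.lower()
    # Runs of 3+ identical letters become 2: "ohhhhh" -> "ohh".
    text = re.sub(r"([a-z])\1{2,}", r"\1\1", text)
    # Two-letter chunks repeated 3+ times become 2 copies: "hahahaha" -> "haha".
    text = re.sub(r"([a-z]{2})\1{2,}", r"\1\1", text)
    # Each run of ! and ? keeps one copy of each mark, in order of appearance.
    text = re.sub(r"[!?]+", lambda m: "".join(dict.fromkeys(m.group())), text)
    return text

def tokenize(message):
    """Split a simplified prompt into words, dropping stop words."""
    words = re.findall(r"[a-z']+", simplify(message))
    return [w for w in words if w not in STOP_WORDS]

print(simplify("WhATz uP??"))
print(tokenize("I took the cat to a vet"))
```

A real corpus would need a fuller stop-word list and probably gentler rules for words that legitimately contain repeated letters.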
Constructing the CFD
  • CFD conditions are prompt words
  • FD samples are string responses, with numeric count indicating strength of correlation
  • Example:
    • Cfd[‘sleep’].sorted_samples() -> [“sleep is the best thing ever”, “are you tired?”, “maybe after I eat.”, “hang on a sec.”]
Constructing the CFD

Simple Concept:

If a prompt is n words long, then each word is 1/n likely to have caused the response

Original conversation:

  • A: I want a kitten
  • B: Can we put mittens on it?
  • A: I want food
  • B: Me too, I’m hungry
  • A: They had good food at the restaurant
  • B: What kind of food did they have?

Resulting weights (the stop words “a” and “the” are dropped before counting n):

  • i, want, kitten -> “Can we put mittens on it?” (1/3 each)
  • i, want, food -> “Me too, I’m hungry” (1/3 each)
  • they, had, good, food, at, restaurant -> “What kind of food did they have?” (1/6 each)
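
As a rough sketch of the 1/n idea, the table can be built with plain dictionaries. The slides appear to use an NLTK-style conditional frequency distribution; the nested defaultdict, the build_cfd name, and the PAIRS list below are stand-ins assumed for illustration, and the response strings follow the wording used in the slides' CFD example.

```python
from collections import defaultdict
from fractions import Fraction

MITTENS = "Can we put mittens on it?"
HUNGRY = "Me too, I'm hungry"
KIND = "What kind of food did they have?"

# (prompt words after stop-word removal, response) pairs from the conversation.
PAIRS = [
    (["i", "want", "kitten"], MITTENS),
    (["i", "want", "food"], HUNGRY),
    (["they", "had", "good", "food", "at", "restaurant"], KIND),
]

def build_cfd(pairs):
    """Return {word: {response: weight}}, giving each prompt word a 1/n share."""
    cfd = defaultdict(lambda: defaultdict(Fraction))
    for words, response in pairs:
        if not words:
            continue  # a prompt made only of stop words has nothing to credit
        share = Fraction(1, len(words))
        for word in words:
            cfd[word][response] += share
    return cfd

cfd = build_cfd(PAIRS)
for response, weight in cfd["food"].items():
    print(response, weight)
```

Fractions are used here only so the weights print exactly as they appear on the slides; floats would work the same way.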

Using the CFD

Problem

  • Each word in a prompt is not equally likely to have caused the response
    • More common words (such as ‘I’) are less indicative of meaning

Solution

  • Take into account the commonality of the word over all conversations
    • Divide the weight of the word/response pair by the weight sum over all samples for that word
    • Rare words are weighted more heavily, using a dynamic scale
    • This greatly improved the quality of bot responses!
Using the CFD - Example

CFD:

Cfd[‘i’] = [(“Can we put mittens on it?”, 1/3), (“Me too, I’m hungry”, 1/3)] Sum: 2/3

Cfd[‘want’] = [(“Can we put mittens on it?”, 1/3), (“Me too, I’m hungry”, 1/3)] Sum: 2/3

Cfd[‘kitten’] = [(“Can we put mittens on it?”, 1/3)] Sum: 1/3

Cfd[‘food’] = [ (“Me too, I’m hungry”, 1/3), (“What kind of food did they have?”, 1/6)] Sum: 1/2

Cfd[‘they’] = [ (“What kind of food did they have?”, 1/6)] Sum: 1/6

Cfd[‘had’] = [ (“What kind of food did they have?”, 1/6)] Sum: 1/6

Cfd[‘good’] = [ (“What kind of food did they have?”, 1/6)] Sum: 1/6

Cfd[‘at’] = [ (“What kind of food did they have?”, 1/6)] Sum: 1/6

Cfd[‘restaurant’] = [ (“What kind of food did they have?”, 1/6)] Sum: 1/6

Responses(“how is your kitten?”) = [(“Can we put mittens on it?”, (1/3) / (1/3) = 1)]

Responses(“the food was good”) = [(“Me too, I’m hungry”, (1/3) / (1/2) = 2/3),

(“What kind of food did they have?”, (1/6) / (1/2) + (1/6) / (1/6) = 4/3)]
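
The arithmetic in this example can be checked with a short sketch. The CFD literal copies the weights from the slide; the score_responses name and the dictionary layout are assumptions for illustration, not the presenter's implementation.

```python
from fractions import Fraction as F

MITTENS = "Can we put mittens on it?"
HUNGRY = "Me too, I'm hungry"
KIND = "What kind of food did they have?"

# Weights copied from the CFD listed on the slide.
CFD = {
    "i": {MITTENS: F(1, 3), HUNGRY: F(1, 3)},
    "want": {MITTENS: F(1, 3), HUNGRY: F(1, 3)},
    "kitten": {MITTENS: F(1, 3)},
    "food": {HUNGRY: F(1, 3), KIND: F(1, 6)},
    "they": {KIND: F(1, 6)},
    "had": {KIND: F(1, 6)},
    "good": {KIND: F(1, 6)},
    "at": {KIND: F(1, 6)},
    "restaurant": {KIND: F(1, 6)},
}

def score_responses(prompt_words, cfd):
    """Sum each word/response weight divided by that word's total weight."""
    scores = {}
    for word in prompt_words:
        samples = cfd.get(word, {})  # unseen words contribute nothing
        total = sum(samples.values())
        for response, weight in samples.items():
            scores[response] = scores.get(response, 0) + weight / total
    # Highest score first.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

print(score_responses(["food", "was", "good"], CFD))
```

Dividing by the per-word total is what lets a rare word such as ‘good’ (total 1/6) outweigh a more common one such as ‘food’ (total 1/2).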

Using the CFD

Responses(“the food was good”) = [(“Me too, I’m hungry”, (1/3) / (1/2) = 2/3),

(“What kind of food did they have?”, (1/6) / (1/2) + (1/6) / (1/6) = 4/3)]

  • Given the CFD, any prompt containing ‘food’ and ‘good’ will get the response “What kind of food did they have?”
  • Problem: This can lead to redundancy
    • A: The food was good
    • B: What kind of food did they have?
    • A: Didn’t you think the food was good?
    • B: What kind of food did they have?
  • Solution: Store an FD of used responses, and don’t use them again
    • A: The food was good
    • B: What kind of food did they have?
    • A: Didn’t you think the food was good?
    • B: Me too, I’m hungry
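
A minimal sketch of the "don't use them again" rule follows. The slide stores used responses in an FD; a plain Python set stands in for it here, and the pick_unused name and the ranked-list input format are assumptions.

```python
def pick_unused(ranked, used):
    """Return the best-scoring response not yet used, and record it as used.

    ranked: list of (response, score) pairs, highest score first.
    used:   set of responses already given in this conversation.
    """
    for response, _score in ranked:
        if response not in used:
            used.add(response)
            return response
    return None  # every candidate has been used; the caller needs a fallback

ranked = [
    ("What kind of food did they have?", 4 / 3),
    ("Me too, I'm hungry", 2 / 3),
]
used = set()
print(pick_unused(ranked, used))  # first time the prompt is seen
print(pick_unused(ranked, used))  # same prompt again, next-best response
```

Returning None when everything is used up leaves room for the fallback strategies discussed on the next slide.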
What if we have no responses?

Because:

  • We’ve never encountered any of the prompt words
  • We’ve used up all relevant responses

We can:

  1. Find a random response
  2. Enhance randomness by favoring unlikely responses (near the end of association lists) to reduce redundancy
  3. Fabricate a response based on pattern matching
  4. Select a response from a default list of responses (e.g. “Let’s talk about something else”, “I don’t know anything about that”)

* All my bots implement 1 & 2, and one bot (bonnie) also implements 3 & 4
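
Two of these fallbacks, the random pick that favors unlikely responses and the default list, might look like the sketch below. The linear weighting by list position is an assumption (the slides do not say how unlikely responses are favored), as are the function names; pattern-based fabrication is left out.

```python
import random

# Default responses quoted on the slide.
DEFAULT_RESPONSES = [
    "Let's talk about something else",
    "I don't know anything about that",
]

def random_unlikely(sorted_samples, used, rng=random):
    """Pick an unused response at random, favoring the end of the list.

    sorted_samples: responses ordered from most to least strongly associated.
    """
    candidates = [s for s in sorted_samples if s not in used]
    if not candidates:
        return None
    # Assumed scheme: position i (1-based) gets weight i, so later items win more often.
    weights = list(range(1, len(candidates) + 1))
    return rng.choices(candidates, weights=weights, k=1)[0]

def fallback(sorted_samples, used, rng=random):
    """Try a random unlikely response, then fall back to the default list."""
    response = random_unlikely(sorted_samples, used, rng)
    if response is None:
        response = rng.choice(DEFAULT_RESPONSES)
    return response

print(fallback(["are you tired?", "hang on a sec."], used=set()))
```

Passing a seeded random.Random instance as rng makes the choice repeatable, which helps when testing a bot.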

Interactive Webpages are not trivial
  • Especially if you want to retain some “memory” of the past
  • First CGI problem: A new bot is created with every web prompt - all memory is lost
  • Solution:
    • Write all bot state changes to a file, including used responses.
    • Run this file with every prompt, and reset it when a new conversation starts
    • The bot loads the huge CFD & all state changes from scratch with EVERY call. Slow, but works.
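
One way to picture state that survives between CGI calls is the sketch below. It swaps in a JSON file for simplicity; the slides describe writing state changes to a file that is run on every prompt, which is a different mechanism. The file path and function names are hypothetical.

```python
import json
import os

def load_used(path):
    """Load the set of used responses, or an empty set for a new conversation."""
    if not os.path.exists(path):
        return set()
    with open(path) as f:
        return set(json.load(f))

def save_used(path, used):
    """Write the used responses back so the next request can see them."""
    with open(path, "w") as f:
        json.dump(sorted(used), f)

def reset(path):
    """Forget all state when a new conversation starts."""
    if os.path.exists(path):
        os.remove(path)

def handle_prompt(path, response):
    """Per-request flow: load state, record the response given, save state."""
    used = load_used(path)
    used.add(response)
    save_used(path, used)
    return used
```

Each request pays the cost of reading and rewriting the file, which mirrors the "slow, but works" trade-off noted above.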
Interactive Webpages are not trivial

Memory = Self-modifying code

Demo

http://ischool.berkeley.edu/~bonniejc/
