The learning chatbot



The Learning Chatbot

Bonnie Chantarotwong

IMS-256 Fall 2006


What is wrong with state of the art chatbots?

  • They are repetitive

  • They are predictable

    • simple pattern matching & set response

  • They have no memory

    • can lead to circular conversations

  • They don’t sound like real people


Proposed Solution

  • Train chatbots on a corpus of conversations in order to mimic a given personality


Filtering the training corpus

  • Need a lot of conversations containing the query screen name

  • Eliminate undesirable data

    • phone numbers

    • addresses

    • sensitive gossip

  • Eliminate highly technical data

    • since most tech problems are very specific, unless the bot was trained on a newsgroup, learned responses are not likely to be useful for tech support
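
A rough sketch of how the filtering step could be automated, assuming each message is a plain string. The regular expressions and the `TECH_TERMS` word list are illustrative assumptions, not the filters actually used for these bots, and sensitive gossip would still need manual review.

```python
import re

# Hypothetical patterns; a real filter would need tuning for the corpus.
PHONE_RE = re.compile(r"\(?\b\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}\b")
ADDRESS_RE = re.compile(
    r"\b\d{1,5}\s+(?:\w+\s+){1,2}(?:st|street|ave|avenue|rd|road|blvd)\b",
    re.IGNORECASE,
)
TECH_TERMS = {"segfault", "stacktrace", "regex", "compiler"}  # assumed list


def keep_message(text):
    """Return True if the message has no phone number, address, or tech term."""
    if PHONE_RE.search(text) or ADDRESS_RE.search(text):
        return False
    words = set(re.findall(r"[a-z']+", text.lower()))
    return not (words & TECH_TERMS)
```

Messages for which `keep_message` returns False would simply be dropped before parsing.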


Parsing the training corpus

  • Extract messages from HTML

  • Group together consecutive messages by the same screen name

  • Simplify prompt messages

    • !!!!!!!??????? -> !?

    • Ohhhhhhhhhhhh! -> ohh!

    • WhATz uP?? -> whatz up?

    • hahahahaha -> haha

  • Break prompts into word sequences (eliminating stop words)

    • I took the cat to a vet -> [i, took, cat, to, vet]
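
The parsing steps above can be sketched as follows. This is a minimal illustration, not the original parser: the function names, the collapsing rules, and the tiny `STOP_WORDS` set are assumptions chosen only to reproduce the examples on this slide (HTML extraction is left out).

```python
import re
from itertools import groupby

STOP_WORDS = {"the", "a", "an"}  # assumed; the slides do not give the list


def group_turns(messages):
    """Merge consecutive (screen_name, text) messages from the same sender."""
    return [
        (name, " ".join(text for _, text in run))
        for name, run in groupby(messages, key=lambda m: m[0])
    ]


def simplify(prompt):
    """Lowercase a prompt and collapse exaggerated repetition."""
    text = prompt.lower()
    text = re.sub(r"([!?.,])\1+", r"\1", text)         # "!!!???" -> "!?"
    text = re.sub(r"([a-z])\1{2,}", r"\1\1", text)     # "ohhhhh" -> "ohh"
    text = re.sub(r"([a-z]{2})\1{2,}", r"\1\1", text)  # "hahahaha" -> "haha"
    return text


def tokenize(prompt):
    """Break a simplified prompt into words, dropping stop words."""
    words = re.findall(r"[a-z']+", simplify(prompt))
    return [w for w in words if w not in STOP_WORDS]
```

For example, `tokenize("I took the cat to a vet")` gives `['i', 'took', 'cat', 'to', 'vet']`.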


Constructing the CFD

  • CFD conditions are prompt words

  • FD samples are string responses, with numeric count indicating strength of correlation

  • Example:

    • Cfd[‘sleep’].sorted_samples() ->

      [“sleep is the best thing ever”, “are you tired?”, “maybe after I eat.”, “hang on a sec.”]


Constructing the CFD

Simple Concept:

If a prompt is n words long, then each word is 1/n likely to have caused the response

Original Conversation:

  • A: I want a kitten

  • B: Can we put mittens on it?

  • A: I want food

  • B: Me too, I’m hungry.

  • A: They had good food at the restaurant

  • B: What kind did they have?

Word-to-response weights (from the diagram):

  • i, want, kitten -> “Can we put mittens on it?” (1/3 each)

  • i, want, food -> “Me too, I’m hungry.” (1/3 each)

  • they, had, good, food, at, restaurant -> “What kind did they have?” (1/6 each)
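
A minimal sketch of the 1/n weighting, assuming prompts have already been reduced to word lists. Nested dictionaries stand in for the CFD here, and `build_cfd` is a hypothetical helper name, not a function from the original bot code.

```python
from collections import defaultdict
from fractions import Fraction


def build_cfd(pairs):
    """Build word -> {response: weight} from (prompt_words, response) pairs."""
    cfd = defaultdict(lambda: defaultdict(Fraction))
    for words, response in pairs:
        if not words:
            continue  # nothing to credit for an empty prompt
        weight = Fraction(1, len(words))  # each word gets 1/n of the credit
        for word in words:
            cfd[word][response] += weight
    return cfd


pairs = [
    (["i", "want", "kitten"], "Can we put mittens on it?"),
    (["i", "want", "food"], "Me too, I'm hungry."),
    (["they", "had", "good", "food", "at", "restaurant"],
     "What kind did they have?"),
]
cfd = build_cfd(pairs)
```

With this conversation, `cfd["food"]` holds weight 1/3 for "Me too, I'm hungry." and 1/6 for "What kind did they have?".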


Using the CFD

Problem

  • Not every word in a prompt is equally likely to have caused the response

    • More common words (such as ‘I’) are less indicative of meaning

Solution

  • Take into account how common the word is over all conversations

    • Divide the weight of the word/response pair by the sum of weights over all samples for that word

    • Rare words are weighted more, on a dynamic scale

    • This greatly improved the quality of bot responses!


Using the CFD - Example

CFD:

Cfd[‘i’] = [(“Can we put mittens on it?”, 1/3), (“Me too, I’m hungry”, 1/3)] Sum: 2/3

Cfd[‘want’] = [(“Can we put mittens on it?”, 1/3), (“Me too, I’m hungry”, 1/3)] Sum: 2/3

Cfd[‘kitten’] = [(“Can we put mittens on it?”, 1/3)] Sum: 1/3

Cfd[‘food’] = [ (“Me too, I’m hungry”, 1/3), (“What kind of food did they have?”, 1/6)] Sum: 1/2

Cfd[‘they’] = [ (“What kind of food did they have?”, 1/6)] Sum: 1/6

Cfd[‘had’] = [ (“What kind of food did they have?”, 1/6)] Sum: 1/6

Cfd[‘good’] = [ (“What kind of food did they have?”, 1/6)] Sum: 1/6

Cfd[‘at’] = [ (“What kind of food did they have?”, 1/6)] Sum: 1/6

Cfd[‘restaurant’] = [ (“What kind of food did they have?”, 1/6)] Sum: 1/6

Responses(“how is your kitten?”) = [(“Can we put mittens on it?”, (1/3) / (1/3) = 1)]

Responses(“the food was good”) = [(“Me too, I’m hungry”, (1/3) / (1/2) = 2/3),

(“What kind of food did they have?”, (1/6) / (1/2) + (1/6) / (1/6) = 4/3)]
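
The worked example can be checked with a short sketch. The CFD is typed in by hand from the slide, and `score_responses` is a hypothetical name for the normalization step (each word/response weight divided by that word's weight sum, then added up per response).

```python
from collections import defaultdict
from fractions import Fraction as F

A = "Can we put mittens on it?"
B = "Me too, I'm hungry"
C = "What kind of food did they have?"

# CFD copied from the slide: word -> {response: weight}
cfd = {
    "i": {A: F(1, 3), B: F(1, 3)},
    "want": {A: F(1, 3), B: F(1, 3)},
    "kitten": {A: F(1, 3)},
    "food": {B: F(1, 3), C: F(1, 6)},
}
for word in ["they", "had", "good", "at", "restaurant"]:
    cfd[word] = {C: F(1, 6)}


def score_responses(cfd, prompt_words):
    """Rank responses by the sum of normalized word/response weights."""
    scores = defaultdict(F)
    for word in prompt_words:
        samples = cfd.get(word, {})  # unknown words contribute nothing
        total = sum(samples.values())
        for response, weight in samples.items():
            scores[response] += weight / total
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


kitten = score_responses(cfd, ["how", "is", "your", "kitten"])
food = score_responses(cfd, ["food", "was", "good"])
# kitten ranks A with score 1; food ranks C (4/3) ahead of B (2/3)
```

Using exact fractions keeps the arithmetic identical to the slide; floats would work equally well in practice.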


Using the CFD

Responses(“the food was good”) = [(“Me too, I’m hungry”, (1/3) / (1/2) = 2/3),

(“What kind of food did they have?”, (1/6) / (1/2) + (1/6) / (1/6) = 4/3)]

  • Given the CFD, any prompt containing ‘food’ and ‘good’ will return “What kind of food did they have?”

  • Problem: This can lead to redundancy

    • A: The food was good

    • B: What kind of food did they have?

    • A: Didn’t you think the food was good?

    • B: What kind of food did they have?

  • Solution: Store an FD of used responses, and don’t use them again

    • A: The food was good

    • B: What kind of food did they have?

    • A: Didn’t you think the food was good?

    • B: Me too, I’m hungry
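
A small sketch of the "don't reuse responses" idea, assuming the ranked list comes from the scoring step. A plain set stands in for the FD of used responses, and `pick_unused` is a hypothetical helper name.

```python
def pick_unused(ranked, used):
    """Return the best-ranked response not yet used, recording it in `used`."""
    for response, _score in ranked:
        if response not in used:
            used.add(response)
            return response
    return None  # every relevant response has already been used


ranked = [
    ("What kind of food did they have?", 4 / 3),
    ("Me too, I'm hungry", 2 / 3),
]
used = set()
first = pick_unused(ranked, used)   # "What kind of food did they have?"
second = pick_unused(ranked, used)  # "Me too, I'm hungry"
third = pick_unused(ranked, used)   # None: nothing relevant is left
```

The `None` case is where the bot has to fall back on some other strategy.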


What if we have no responses?

Because:

  • We’ve never encountered any of the prompt words

  • We’ve used up all relevant responses

We can:

  • Find a random response

  • Enhance randomness by favoring unlikely responses (near the end of association lists) to reduce redundancy

  • Fabricate a response based on pattern matching

  • Select a response from a default list of responses (e.g. “Let’s talk about something else”, “I don’t know anything about that”)

* All my bots implement the first two options, and one bot (bonnie) also implements the last two
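
One possible sketch of the random and default-list fallbacks. Weighting each candidate by its rank in the association list is an assumption about what "favoring unlikely responses" could mean, and `fallback` and `DEFAULTS` are hypothetical names; pattern-matched fabrication is not shown.

```python
import random

DEFAULTS = [
    "Let's talk about something else",
    "I don't know anything about that",
]


def fallback(cfd_lists, used, rng=random):
    """Pick a random unused response, favoring entries late in each list.

    cfd_lists maps each word to its responses, sorted most to least likely.
    """
    candidates, weights = [], []
    for responses in cfd_lists.values():
        for rank, response in enumerate(responses, start=1):
            if response not in used:
                candidates.append(response)
                weights.append(rank)  # less likely entries get more weight
    if candidates:
        return rng.choices(candidates, weights=weights, k=1)[0]
    return rng.choice(DEFAULTS)  # nothing left: use a canned reply
```

Passing a seeded `random.Random` instance as `rng` makes the choice reproducible for testing.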


Interactive Webpages are not trivial

  • Especially if you want to retain some “memory” of the past

  • First CGI problem: A new bot is created with every web prompt - all memory is lost

  • Solution:

    • Write all bot state changes to a file, including used responses.

    • Run this file with every prompt, and reset it when a new conversation starts

    • The bot loads the huge CFD & all state changes from scratch with EVERY call. Slow, but works.
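
A simplified sketch of keeping bot memory across CGI calls. Note the swap: this stores the used responses as JSON data, whereas the slides describe writing state changes to a file that is then run with every prompt. The function names and file layout are assumptions.

```python
import json
import os


def load_used(path):
    """Load the used responses saved by earlier requests, if any."""
    if not os.path.exists(path):
        return set()
    with open(path, encoding="utf-8") as f:
        return set(json.load(f))


def save_used(path, used):
    """Write the used responses so the next request can restore them."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(sorted(used), f)


def reset(path):
    """Forget all state when a new conversation starts."""
    if os.path.exists(path):
        os.remove(path)
```

Each request would call `load_used`, answer the prompt, then call `save_used`; the CFD itself is still rebuilt or reloaded on every call, which is the slow part the slide mentions.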


Interactive Webpages are not trivial

Memory = Self-modifying code



Demo

http://ischool.berkeley.edu/~bonniejc/