The Computer Speaks

Presentation Transcript


  1. SpeakEasy The Computer Speaks

  2. SpeakEasy

  3. Welcome to the Epcot Center First -- some background for you.

  4. SpeakEasy What is it? • SpeakEasy is the name of the computer program that enables the computer to speak with the inflection and timing we expect to hear in a human speaker. • Why do computers have to sound lifeless and dull? Answer: They don’t!

  5. SpeakEasy was developed by John Goldsmith, and is the joint product of The University of Chicago’s Department of Linguistics and Microsoft Research.

  6. THE UNIVERSITY OF CHICAGO and Microsoft Corporation. The University of Chicago: one of the leading research institutes in the world; a private university established by John D. Rockefeller in 1892. Microsoft Research in Redmond, WA: the research arm of the Microsoft Corporation; research and development with applications of linguistics to real-world problems.

  7. SpeakEasy was designed to mesh with two of Microsoft’s language projects... NLPWin: a robust parser of written English. Whistler: a synthetic voice which the computer can use to speak.

  8. To make the computer’s voice vivid and life-like, what we need to give it is: Prosody.

  9. Prosody: • Intonation (what many people call “inflection”), and • Timing and pausing.

  10. Speech without a prosody system?! • This is what a computer sounds like without -- and with -- prosody:

  11. Let’s hear that again ... SpeakEasy First, with prosody and then without prosody

  12. Compare a different computer voice, using only a rudimentary prosodic system: Click! Click here for SpeakEasy’s rendition: Click!

  13. What really happens to make a sentence come to life? First, we enter the sentence. Then it goes to the parser, NLPWin. NLPWin analyzes the sentence and sends the analysis back to SpeakEasy. SpeakEasy designs the prosody... And sends all of that to the Backend for synthesis.
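The flow described in slide 13 can be sketched as three stages chained together. This is only a toy illustration, not SpeakEasy's actual code: the lexicon, the function names (`parse`, `design_prosody`, `synthesize`, `speak`), and the rule "accent nouns and verbs" are all assumptions made for the example, and the "synthesizer" just upper-cases accented words instead of producing audio.

```python
from dataclasses import dataclass

# Hypothetical toy lexicon standing in for a real parser's analysis.
TOY_LEXICON = {
    "this": "determiner", "sentence": "noun", "has": "auxiliary",
    "been": "auxiliary", "pronounced": "verb", "for": "preposition",
    "you": "pronoun", "by": "preposition", "speakeasy": "noun",
}
# Assumption: only these word classes receive an accent.
CONTENT_TAGS = {"noun", "verb", "adjective", "adverb"}


@dataclass
class Token:
    word: str
    pos: str
    accented: bool = False


def parse(sentence):
    """Stage 1 (parser stand-in): tag each word with a part of speech."""
    words = sentence.rstrip(".?!").split()
    return [Token(w, TOY_LEXICON.get(w.lower(), "unknown")) for w in words]


def design_prosody(tokens):
    """Stage 2 (prosody stand-in): mark content words as accented."""
    for token in tokens:
        token.accented = token.pos in CONTENT_TAGS
    return tokens


def synthesize(tokens):
    """Stage 3 (backend stand-in): show accents by upper-casing words."""
    return " ".join(t.word.upper() if t.accented else t.word for t in tokens)


def speak(sentence):
    """Run the sentence through parser, prosody design, and backend."""
    return synthesize(design_prosody(parse(sentence)))


print(speak("This sentence has been pronounced for you by SpeakEasy."))
```

Keeping the stages as separate functions mirrors the hand-offs in the slide: each stage only needs the output of the one before it.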

  14. “Welcome to the Epcot Center” goes from SpeakEasy to NLPWin. NLPWin sends a grammatical analysis back to SpeakEasy. SpeakEasy sends a full specification of the utterance to the Whistler backend synthesizer.

  15. Let’s look at that again. Here’s what happens when we want the computer to speak a sentence out loud. Suppose it’s this: “This sentence has been pronounced for you by SpeakEasy.”

  16. “This sentence has been pronounced for you by SpeakEasy.” NLPWin Parser: “This” is a determiner. “sentence” is a noun. “has” is an auxiliary verb. … “by” is a preposition. “SpeakEasy” is a noun. SpeakEasy computes the intonation… and Whistler provides the voice.

  17. Can a computer have a funny bone? Read to you by SpeakEasy

  18. Would you like to learn more about the ideas that went into the design of SpeakEasy?

  19. Here’s some of what the computer sees: Here is the sentence. Here are the tones used. And here is the pitch! Click here

  20. Prosody is computed in two steps: • First, we establish the right tones for the sentence; • Then we translate that into pitches that the synthesizer can understand. First, the tones:

  21. We can go see the Epcot Center today. Now the pitches:
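The two steps from slides 20 and 21 (tones first, then pitches) can be sketched roughly as below. The slides do not say which tone inventory or pitch values SpeakEasy uses, so everything specific here is an assumption: the H (high) and L (low) labels, the choice of accented words, the 100 Hz and 160 Hz targets, and the gradual lowering factor (`declination`).

```python
def assign_tones(words, accented):
    """Step 1: give accented words a high tone (H), all others a low tone (L)."""
    return ["H" if w.lower() in accented else "L" for w in words]


def tones_to_pitches(tones, low_hz=100.0, high_hz=160.0, declination=0.03):
    """Step 2: turn tones into pitch targets in Hz.

    Each later word is lowered a little relative to the one before it,
    a simple stand-in for pitch drifting down over an utterance.
    """
    pitches = []
    for i, tone in enumerate(tones):
        base = high_hz if tone == "H" else low_hz
        pitches.append(round(base * (1 - declination) ** i, 1))
    return pitches


words = "We can go see the Epcot Center today".split()
# Assumption: these three words carry the accents in this reading.
tones = assign_tones(words, accented={"see", "epcot", "today"})
pitches = tones_to_pitches(tones)

for word, tone, hz in zip(words, tones, pitches):
    print(f"{word:8s} {tone}  {hz} Hz")
```

Choosing a different set of accented words in step 1 changes the tones and therefore the pitches, which is one way to picture why the same sentence can be read with more than one intonation.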

  22. Now, maybe that’s not exactly what we meant to say. SpeakEasy is not, unfortunately, a mind-reader. Maybe you meant to say this:

  23. Do you hear the difference? Here they are again.

  24. Whistler provides for the user (a human user, or another program) to control which intonation should be used in cases like that.

  25. Questions are very tough for the computer to get right. So much depends on exactly what it is that you mean to ask -- and how you mean to put it.

  26. Yes/no questions normally rise at the end: But who-what questions don’t...

  27. Questions based on the wh-word (who, what, where, when, how, why) don’t rise at the end -- did you ever notice that? Where do you want to go today?
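The rule of thumb from slides 26 and 27 can be written down as a very small classifier. It is a deliberately naive sketch: it only looks at the final question mark and the first word, and, as slide 25 warns, real questions depend on what the speaker means. The function name `final_contour` and the labels "rise" and "fall" are assumptions for illustration.

```python
# The wh-words listed on slide 27.
WH_WORDS = {"who", "what", "where", "when", "how", "why"}


def final_contour(sentence):
    """Guess whether the pitch rises or falls at the end of a sentence."""
    text = sentence.strip()
    if not text.endswith("?"):
        return "fall"  # statements: assumed to fall
    words = text.rstrip("?").split()
    first = words[0].lower() if words else ""
    # Wh-questions do not rise; other questions are treated as yes/no.
    return "fall" if first in WH_WORDS else "rise"


print(final_contour("Where do you want to go today?"))
print(final_contour("Do you hear the difference?"))
```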

  28. Here’s something you probably never thought of. When you use a noun for the second time in a sentence, you usually say it without its normal degree of stress. If we don’t teach the computer to do that too, we get a funny sentence. Listen: It was the best of times, it was the worst of times. That’s not right! Here’s how it should be said: It was the best of times, it was the worst of times.
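The observation in slide 28 can be illustrated with a short sketch that stresses a noun the first time it appears and leaves later repeats unstressed. This is a toy under stated assumptions: the caller supplies the set of nouns (a real system would get them from the parser), stress is shown by upper-casing, and other words that would normally be stressed are ignored.

```python
def destress_repeats(sentence, nouns):
    """Upper-case the first occurrence of each noun; leave repeats as they are."""
    seen = set()
    out = []
    for word in sentence.split():
        key = word.strip(".,;:!?").lower()
        if key in nouns and key not in seen:
            out.append(word.upper())  # first mention: normal stress
            seen.add(key)
        else:
            out.append(word)  # repeat or non-noun: no added stress
    return " ".join(out)


print(destress_repeats(
    "It was the best of times, it was the worst of times.",
    nouns={"times"},
))
```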

  29. What else can SpeakEasy do?

  30. SpeakEasy helped WBEZ, the National Public Radio Station in Chicago, with its fund-raising this spring. • Don’t forget to call this number: 1 888 YOUR NPR

  31. SpeakEasy can read the Berenstain Bears…. SpeakEasy could read a story to a child -- or provide the voice for an interactive computer game.

  32. SpeakEasy Well, there you have it. Thanks for stopping by, and thanks for listening. NLP and Whistler go together well. Don’t be surprised if you hear from me again before too long… Tal vez en español, ou français, oder Deutsch. Sayo:nara -- or, as Americans say, Sayonara!
