Course Overview


Course Overview

  • What is AI?

  • What are the Major Challenges?

  • What are the Main Techniques?

  • Where are we failing, and why?

  • Step back and look at the Science

  • Step back and look at the History of AI

  • What are the Major Schools of Thought?

  • What of the Future?

  • What are we trying to do? How far have we got?

  • Natural language (text & speech)

  • Robotics

  • Computer vision

  • Problem solving

  • Learning

  • Board games

  • Applied areas: Video games, healthcare, …

  • What has been achieved, and not achieved, and why is it hard?


Language Technology

[Diagram: Speech Recognition maps Speech to Text and Speech Synthesis maps Text back to Speech; Natural Language Understanding maps Text to Meaning (an interlingua, e.g. disambiguating “open” into open1 vs. open2) and Natural Language Generation maps Meaning back to Text. Translating via Meaning is hard; translating Text to Text directly is the “cheaters’ shortcut”.]


Modern Machine Translation

  • Prevalent approach uses statistics, following an idea by Warren Weaver (conceived as early as 1947)

  • View translation as a form of decoding: “Dutch is just coded English” (or the other way round)

  • i.e. look at the problem from the computer’s point of view

  • Analogy: deciphering a coded text in which each English word has been replaced by a code word

    • Suppose you have a large coded text, and an even larger corpus of ordinary English

    • You guess the English original of a code word by comparing its frequency in the coded text with the frequencies of words in the English corpus (a small sketch follows this list)

    • E.g., the most frequent English word is ‘the’, so the most frequent code word in your text may stand for ‘the’. (Just a guess!)

    • Check whether this combination of guesses reads as proper English text; change guesses where necessary

  • You can try the frequency idea with Google: compare hit counts for “wound will cure” vs. “wound will heal”, or “served his sorrow” vs. “served his sentence”
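A minimal sketch of the frequency-guessing idea, using invented toy data (not from the slides): rank the words of the coded text and of the English corpus by frequency, and guess that words of equal rank correspond.

```python
from collections import Counter

def frequency_guesses(coded_text, english_corpus):
    """Guess a code-word -> English-word mapping by matching frequency ranks.

    This is only the crude first step from the slide; a real decipherer
    would then check the guesses against proper English and revise them.
    """
    coded_ranked = [w for w, _ in Counter(coded_text.split()).most_common()]
    english_ranked = [w for w, _ in Counter(english_corpus.split()).most_common()]
    return dict(zip(coded_ranked, english_ranked))

# Invented example: 'xk' is the most frequent code word, so it is guessed
# to stand for the most frequent English word in the corpus ('the').
coded = "xk qf zu xk mn xk qf"
corpus = "the cat sat on the mat the dog ran"
print(frequency_guesses(coded, corpus))  # {'xk': 'the', 'qf': 'cat', ...}
```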


Modern Machine Translation

  • But of course, Dutch is not just coded English. (For example, the right translation for “open” may depend on the words surrounding “open”.)

  • How do we find out how sentences in the two languages are related?

  • To get a good starting point, Machine Translation uses huge bilingual corpora (usually based on human translation)

  • Example: the Canadian Hansard corpus, bilingual French/English parliamentary proceedings; Hong Kong provides similar bilingual proceedings

  • (But I’ll use Dutch as an example)


Modern Machine Translation

  • Here we will not explain the statistical techniques used

  • Just observe: the system guesses how expressions line up across the two languages

  • This is based on pure pattern matching; no knowledge of Dutch or English is required

  • NB: in a statistical translation program it is no longer easy to see a separate understanding phase followed by a generation phase


Modern Machine Translation

  • Perform various preparatory operations (e.g., match corresponding sentences with each other)

  • Hypothesise ways of matching smaller expressions with each other (a small sketch follows Example 1). Example 1:

    • E: ‘that Blair responded’

    • N: ‘dat Blair antwoordde’

    • E: ‘whether Kennedy responded’

    • N: ‘of Kennedy antwoordde’
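A minimal sketch of the pattern-matching idea behind Example 1 (this is not the actual statistical model, which the slide deliberately skips): count how often an English word and a Dutch word appear in corresponding sentences, and take the most frequent co-occurrence as the hypothesised translation.

```python
from collections import Counter

# The two English/Dutch sentence pairs from Example 1.
pairs = [
    ("that Blair responded", "dat Blair antwoordde"),
    ("whether Kennedy responded", "of Kennedy antwoordde"),
]

# Count co-occurrences of (English word, Dutch word) in corresponding sentences.
cooc = Counter()
for eng, dut in pairs:
    for e in eng.split():
        for d in dut.split():
            cooc[(e, d)] += 1

def guess_translation(english_word):
    """Pick the Dutch word that most often co-occurs with the English word."""
    candidates = {d: n for (e, d), n in cooc.items() if e == english_word}
    return max(candidates, key=candidates.get)

print(guess_translation("responded"))  # 'antwoordde' -- it co-occurs in both pairs
print(guess_translation("that"))       # 'dat', 'Blair' and 'antwoordde' all tie here,
                                       # which is why real systems need far more data
```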


Modern Machine Translation

  • Here is a more interesting example, involving differences in word order between the two languages:

  • Example 2:

    • E: that Blair responded to the question

    • N: dat Blair op de vraag antwoordde

    • E: whether Kennedy responded to the challenge

    • N: of Kennedy op de uitdaging antwoordde

  • Need offsets in the translation model to handle such reordering (illustrated below)
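A minimal sketch of what such offsets could record, using a hand-specified alignment for Example 2 (my own simplification, not the slide's formal model): each offset says how far a word has moved between the two languages.

```python
# Hand-specified word alignment for Example 2: English index -> Dutch index.
english = "that Blair responded to the question".split()
dutch = "dat Blair op de vraag antwoordde".split()
alignment = {
    0: 0,  # that      -> dat
    1: 1,  # Blair     -> Blair
    2: 5,  # responded -> antwoordde (the verb moves to the end in Dutch)
    3: 2,  # to        -> op
    4: 3,  # the       -> de
    5: 4,  # question  -> vraag
}

# The offset is the change in position; 'responded' gets +3, which is
# exactly the kind of reordering the translation model must allow for.
for e_idx, d_idx in alignment.items():
    print(f"{english[e_idx]:>10} -> {dutch[d_idx]:<11} offset {d_idx - e_idx:+d}")
```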


Models, Unigrams, Bigrams, Trigrams

  • Need a translation model and a language model

  • Translation model tells us likely translations (roughly)

  • Language model tells us how good those sentences are in the target language

  • Language model

    • Ideally we would like to know how common any sentence is

    • We will settle for word pairs (bigrams)

  • Translation model

    • Often something quite crude is used, like word-by-word translation

    • Word positions are corrected with offsets

    • A good language model can save a bad translation model (see the sketch after this list)
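A minimal sketch of how the two models could work together (the probabilities are invented for illustration; real systems estimate them from corpora): a crude word-by-word translation model scores how plausible each English word is as a translation, and a bigram language model scores how good the resulting English sentence is.

```python
import math

# Invented word-by-word translation model: P(dutch_word | english_word).
# It (wrongly) prefers "answered" for "antwoordde" -- a deliberately bad model.
trans = {
    ("dat", "that"): 0.9, ("Blair", "Blair"): 1.0,
    ("antwoordde", "responded"): 0.4, ("antwoordde", "answered"): 0.6,
}

# Invented bigram language model: P(word | previous word).
bigram = {
    ("<s>", "that"): 0.4, ("that", "Blair"): 0.1,
    ("Blair", "responded"): 0.05, ("Blair", "answered"): 0.01,
}

def score(dutch_words, english_words):
    """log P(dutch | english) + log P(english); higher means a better translation."""
    logp = 0.0
    for d, e in zip(dutch_words, english_words):
        logp += math.log(trans.get((d, e), 1e-6))       # translation model
    prev = "<s>"
    for e in english_words:
        logp += math.log(bigram.get((prev, e), 1e-6))   # bigram language model
        prev = e
    return logp

dutch = ["dat", "Blair", "antwoordde"]
for candidate in (["that", "Blair", "responded"], ["that", "Blair", "answered"]):
    print(candidate, round(score(dutch, candidate), 2))
# The bad translation model slightly prefers "answered", but the language model
# overrules it: "that Blair responded" gets the higher overall score.
```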


How far has Machine Translation advanced?

The National Institute of Standards and Technology (NIST) runs regular competitions between MT systems

(Source: K. Knight, Statistical MT Tutorial, Aberdeen 2005)


Winner 2002

insistent Wednesday may recurred her trips to Libya tomorrow for flying

Cairo 6-4 (AFP) – an official announced today in the Egyptian lines company for flying Tuesday is a company “insistent for flying” may resumed a consideration of a day wednesday tomorrow her trips to Libya of security council decision trace international the imposed ban comment

winner 2003

Egyptair Has Tomorrow to Resume its flights to Libya

Cairo 4-6 (AFP) – Said an official at the Egyptian Aviation Company today that the company egyptair may resume as of tomorrow, Wednesday its flights to Libya after the International Security Council resolution to the suspension of the embargo imposed on Libya.


Conclusion on Statistical MT

  • This approach to MT relies on massive parallel corpora; these are not yet available for all language pairs

  • The MT system does not “understand” the content of the sentences

  • Perhaps progress using purely statistical methods will flatten out in the future

  • But they are starting to be combined with “higher-level” information


Practical Machine Translation

Types of translation

  • Rough translation

    • Could perhaps be post-edited by a monolingual human (cheaper)

  • Restricted source translation

    • Subject and form restricted, e.g. weather forecast

  • Pre-edited translation

    • Human pre-edits, e.g. Caterpillar English

    • Pre-editing can improve the original too

  • Literary translation


Summing up Modern Translation

  • Deep vs. Shallow?

    • Deep - comprehensive knowledge of the world.

    • Shallow - no knowledge.

    • So far, shallow approaches more successful.

    • Deep can be better on a particular domain if a lot of expert effort is put into building models

    • Shallow approach is much easier

    • Similar story in other areas of AI

  • Each of these programs on its own is highly specialised (i.e., limited)


On the other hand…

Humans don’t always get it right either!

  • French hotel: “Please leave your values at the front desk.”

  • Athens hotel: “We expect our visitors to complain daily at the office between the hours of 9 and 11 a.m.”

  • Tokyo hotel room: “The flattening of underwear is the job of the chambermaid - get it done, turn her on.”

  • Hong Kong tailor shop: “Order your summer suit now. Because of big rush we execute customers in strict rotation.”

  • Men's room at Mexican golf course/resort: “Guests are requested not to wash their balls in the hand basins.”

  • Budapest elevator: “due to out of order we regret that you are unbearable”

  • Bangkok Dry Cleaner's: “Drop Trousers Here for Best Results”

  • Tokyo hotel room: “Please take advantage of our chambermaids.”

    These writers do understand what they mean, but they make the wrong choices in the target language


Speech Recognition

  • Signal processing to recognise features

  • Coarticulation: model how each sound (“phone”) depends on neighbours

  • Dialect: different possible pronunciations

  • To recognise isolated words, a unigram language model is used again

  • Continuous speech: use a bigram or trigram model

  • Try (a small sketch follows this list):

    • “eat I scream” vs. “eat ice cream”

    • “eat a banana” vs. “eat a bandana”
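A minimal sketch of that last point, with invented bigram counts (real recognisers use probabilities estimated from large corpora): the two hypotheses sound alike, so the bigram language model decides between them.

```python
# Invented counts of how often each word pair is seen in English text.
bigram_counts = {
    ("eat", "ice"): 120, ("ice", "cream"): 900,
    ("eat", "i"): 2, ("i", "scream"): 1,
}

def bigram_score(words):
    """Product of bigram counts -- a crude stand-in for a bigram language model."""
    score = 1
    for pair in zip(words, words[1:]):
        score *= bigram_counts.get(pair, 1)  # unseen pairs count as 1
    return score

for hypothesis in ("eat ice cream", "eat i scream"):
    print(hypothesis, bigram_score(hypothesis.split()))
# "eat ice cream" scores far higher, so the acoustically similar
# "eat I scream" is rejected.
```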


Speech Recognition

  • Humans are remarkably good because of high-level knowledge

  • Computers:

    • No background noise, single speaker, vocabulary of a few thousand words: >99%

    • In general, with good acoustics: 60-80%

    • On a noisy phone line: terrible


Natural Language Generation (NLG)

Natural Language Generation is better than having people write texts when:

  • There are many potential documents to be written, differing according to the context (user, situation, language)

  • There are some general principles behind document design


Example: Noun Phrase design

A noun phrase can convey an arbitrary amount of information:

  • Someone vs.

  • a designer vs.

  • an old designer vs.

  • an old designer with red hair …

    How much information should we “pack into” a given Noun Phrase?

    This is normally considered part of the aggregation task (a small sketch follows).
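A minimal sketch of that packing decision (the representation of the facts is my own simplification; real generators use richer grammars): given a head noun and a list of modifiers, put at most a fixed number of them into the noun phrase and report what is left over for separate sentences.

```python
def build_noun_phrase(head, modifiers, max_modifiers=2):
    """Pack up to max_modifiers pieces of information into one noun phrase.

    Returns the phrase and the leftover facts that would have to be
    expressed in separate sentences -- a crude form of aggregation.
    """
    packed, leftover = modifiers[:max_modifiers], modifiers[max_modifiers:]
    pre = [m for m in packed if not m.startswith("with ")]   # adjectives before the head
    post = [m for m in packed if m.startswith("with ")]      # prepositional phrases after
    words = pre + [head] + post
    article = "an" if words[0][0] in "aeiou" else "a"
    return f"{article} {' '.join(words)}", leftover

print(build_noun_phrase("designer", []))
print(build_noun_phrase("designer", ["old"]))
print(build_noun_phrase("designer", ["old", "with red hair"]))
print(build_noun_phrase("designer", ["old", "with red hair", "famous"]))
# The last call holds "famous" back for a later sentence: deciding how much
# to pack into the noun phrase is the aggregation decision described above.
```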


Some Issues to Consider

  • Preferred ordering within the text (e.g. most important first)

  • Readability of the Noun Phrase

  • Flow of “focus”

  • Successful use of pronouns and abbreviated references


Example Content

(NB: we assume that words, basic syntax, etc. have already been chosen)

This T-shirt was made by James Sportler.

Sportler is a famous British designer.

He drives an ancient pink Jaguar.

He works in London with Thomas Wendsop.

Wendsop won the first prize in the FWJG awards.

Can/should we add more to the Noun Phrase?


One possible addition

This T-shirt was made by James Sportler, who works in London with Thomas Wendsop.

Sportler is a famous British designer. He drives an ancient pink Jaguar.

Wendsop won the first prize in the FWJG awards.

  • Facts about Wendsop are now separated from one another (focus).

  • Wendsop now has greater prominence in the text (ordering)


Another possible addition

This T-shirt was made by James Sportler, a famous British designer who works in London with Thomas Wendsop, who won the first prize in the FWJG awards.

Sportler drives an ancient pink Jaguar.

  • The Noun Phrase is now very complex (readability)

  • “He” now doesn’t seem to work in the second sentence (pronouns)


Another possible addition

This T-shirt was made by James Sportler, a famous British designer.

He drives an ancient pink Jaguar.

He works in London with Thomas Wendsop.

Wendsop won the first prize in the FWJG awards.

  • Possibly the best solution, but is this better than the original “text”?


Why is Natural Language Generation hard?

  • Natural Language Generation involves making many choices, e.g. which content to include, what order to say it in, what words and syntactic constructions to use.

  • Linguistics does not provide us with a ready-made, precise theory about how to make such choices to produce coherent text

  • The choices to be made interact with one another in complex ways

  • Many results of choices (e.g. text length) are only visible at the end of the process

  • There doesn’t seem to be any simple and reliable way to order the choices

