
LING 180 SYMBSYS 138Intro to Computer Speech and Language Processing

Lecture 10: Machine Translation (II)

October 26, 2005

Dan Jurafsky

Thanks to Kevin Knight for much of this material; many slides also came from Bonnie Dorr and Christof Monz!



Outline for MT Week

  • Intro and a little history

  • Language Similarities and Divergences

  • Four main MT Approaches

    • Transfer

    • Interlingua

    • Direct

    • Statistical

  • Evaluation



How do we evaluate MT? Human

  • Fluency

    • Overall fluency

      • Human rating of sentences read out loud

    • Cohesion (Lexical chains, anaphora, ellipsis)

      • Hand-checking for cohesion.

    • Well-formedness

      • 5-point scale of syntactic correctness

  • Fidelity (same information as source?)

    • Hand rating of target text on 100pt scale

  • Clarity

    • Comprehensibility

      • Noise test

      • Multiple choice questionnaire

    • Readability

      • cloze



    Evaluating MT: Problems

    • Asking humans to judge sentences on a 5-point scale for 10 factors takes time and $$$ (weeks or months!)

    • We can’t build language engineering systems if we can only evaluate them once every quarter!!!!

    • We need a metric that we can run every time we change our algorithm.

    • It would be OK if it wasn’t perfect, as long as it tended to correlate with the expensive human metrics, which we could still run quarterly.



    BiLingual Evaluation Understudy (BLEU; Papineni, 2001)

    http://www.research.ibm.com/people/k/kishore/RC22176.pdf

    • Automatic Technique, but ….

    • Requires the pre-existence of Human (Reference) Translations

    • Approach:

      • Produce corpus of high-quality human translations

      • Judge “closeness” numerically (word-error rate)

      • Compare n-gram matches between candidate translation and 1 or more reference translations



    BLEU Evaluation Metric

    (Papineni et al, ACL-2002)

    Reference (human) translation: The U.S. island of Guam is maintaining a high state of alert after the Guam airport and its offices both received an e-mail from someone calling himself the Saudi Arabian Osama bin Laden and threatening a biological/chemical attack against public places such as the airport .

    • N-gram precision (score is between 0 & 1)

      • What percentage of machine n-grams can be found in the reference translation?

        • An n-gram is a sequence of n words

      • Not allowed to use same portion of reference translation twice (can’t cheat by typing out “the the the the the”)

    • Brevity penalty

      • Can’t just type out single word “the” (precision 1.0!)

    • *** Amazingly hard to “game” the system (i.e., find a way to change machine output so that BLEU goes up, but quality doesn’t)

    Machine translation: The American [?] international airport and its the office all receives one calls self the sand Arab rich business [?] and so on electronic mail , which sends out ; The threat will be able after public place and so on the airport to start the biochemistry attack , [?] highly alerts after the maintenance.



    BLEU Evaluation Metric

    (Papineni et al, ACL-2002)

    Reference (human) translation: The U.S. island of Guam is maintaining a high state of alert after the Guam airport and its offices both received an e-mail from someone calling himself the Saudi Arabian Osama bin Laden and threatening a biological/chemical attack against public places such as the airport .

    • BLEU4 formula (counts n-grams up to length 4):

      BLEU4 = exp( 1.0 * log p1
                 + 0.5 * log p2
                 + 0.25 * log p3
                 + 0.125 * log p4
                 - max(words-in-reference / words-in-machine - 1, 0) )

      where p1 = 1-gram precision, p2 = 2-gram precision, p3 = 3-gram precision, p4 = 4-gram precision

      (A small numeric sketch of this formula follows this slide.)

    Machine translation: The American [?] international airport and its the office all receives one calls self the sand Arab rich business [?] and so on electronic mail , which sends out ; The threat will be able after public place and so on the airport to start the biochemistry attack , [?] highly alerts after the maintenance.
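    The formula above can be checked mechanically. Below is a minimal Python sketch (not from the slides) that combines four precomputed n-gram precisions using the slide's weights (1.0, 0.5, 0.25, 0.125, rather than the uniform 1/4 weights of standard BLEU) and the max(r/c - 1, 0) brevity term; the precision values passed in at the end are hypothetical.

      import math

      def bleu4_from_precisions(p, ref_len, cand_len):
          # Combine n-gram precisions p = (p1, p2, p3, p4) exactly as in the slide's
          # formula: exp(weighted log-precisions minus max(r/c - 1, 0)).
          weights = (1.0, 0.5, 0.25, 0.125)
          if any(x == 0.0 for x in p):
              return 0.0            # real implementations smooth instead of returning 0
          log_sum = sum(w * math.log(x) for w, x in zip(weights, p))
          brevity = max(ref_len / cand_len - 1.0, 0.0)
          return math.exp(log_sum - brevity)

      # Hypothetical precisions for an 18-word candidate scored against an 18-word reference:
      print(bleu4_from_precisions((0.94, 0.59, 0.40, 0.27), ref_len=18, cand_len=18))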



    Multiple Reference Translations

    Reference translation 1: The U.S. island of Guam is maintaining a high state of alert after the Guam airport and its offices both received an e-mail from someone calling himself the Saudi Arabian Osama bin Laden and threatening a biological/chemical attack against public places such as the airport .

    Reference translation 2: Guam International Airport and its offices are maintaining a high state of alert after receiving an e-mail that was from a person claiming to be the wealthy Saudi Arabian businessman Bin Laden and that threatened to launch a biological and chemical attack on the airport and other public places .

    Reference translation 3: The US International Airport of Guam and its office has received an email from a self-claimed Arabian millionaire named Laden , which threatens to launch a biochemical attack on such public places as airport . Guam authority has been on alert .

    Reference translation 4: US Guam International Airport and its office received an email from Mr. Bin Laden and other rich businessman from Saudi Arabia . They said there would be biochemistry air raid to Guam Airport and other public places . Guam needs to be in high precaution about this matter .

    Machine translation: The American [?] international airport and its the office all receives one calls self the sand Arab rich business [?] and so on electronic mail , which sends out ; The threat will be able after public place and so on the airport to start the biochemistry attack , [?] highly alerts after the maintenance.



    BLEU in Action

    枪手被警方击毙。 (Foreign Original)

    the gunman was shot to death by the police . (Reference Translation)

    the gunman was police kill .  #1
    wounded police jaya of  #2
    the gunman was shot dead by the police .  #3
    the gunman arrested by police kill .  #4
    the gunmen were killed .  #5
    the gunman was shot to death by the police .  #6
    gunmen were killed by police ?SUB>0 ?SUB>0  #7
    al by the police .  #8
    the ringer is killed by the police .  #9
    police killed the gunman .  #10

    green = 4-gram match (good!)

    red = word not matched (bad!)



    Bleu Comparison

    Chinese-English Translation Example:

    Candidate 1: It is a guide to action which ensures that the military always obeys the commands of the party.

    Candidate 2: It is to insure the troops forever hearing the activity guidebook that party direct.

    Reference 1: It is a guide to action that ensures that the military will forever heed Party commands.

    Reference 2: It is the guiding principle which guarantees the military forces always being under the command of the Party.

    Reference 3: It is the practical guide for the army always to heed the directions of the party.



    How Do We Compute Bleu Scores?

    • Intuition: “What percentage of words in candidate occurred in some human translation?”

    • Proposal: count up the number of candidate translation words (unigrams) that occur in any reference translation, then divide by the total number of words in the candidate translation

    • But can’t just count total # of overlapping N-grams!

      • Candidate: the the the the the the

      • Reference 1: The cat is on the mat

    • Solution: A reference word should be considered exhausted after a matching candidate word is identified.



    “Modified n-gram precision”

    • For each word compute:

      (1) the maximum number of times it occurs in any single reference translation

      (2) number of times it occurs in the candidate translation

    • Instead of using count #2, use the minimum of #1 and #2, i.e., clip the counts at the maximum found in any single reference translation

    • Now use that modified count.

    • And divide by the number of candidate words. (A small sketch of this computation follows this slide.)
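    As a concrete check, here is a minimal Python sketch (not from the slides) of this clipped-count computation. Run on the candidate and reference sentences of the next few slides (lowercased, punctuation stripped), it reproduces the 17/18 and 8/14 unigram figures and the 10/17 bigram figure quoted there.

      from collections import Counter

      def modified_ngram_precision(candidate, references, n=1):
          # Clip each candidate n-gram count at the maximum number of times that
          # n-gram occurs in any single reference, then divide by the candidate total.
          def grams(words):
              return Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))
          cand = grams(candidate)
          clipped = 0
          for gram, count in cand.items():
              max_ref = max(grams(ref)[gram] for ref in references)
              clipped += min(count, max_ref)
          return clipped, sum(cand.values())

      refs = [
          "it is a guide to action that ensures that the military will forever heed party commands".split(),
          "it is the guiding principle which guarantees the military forces always being under the command of the party".split(),
          "it is the practical guide for the army always to heed the directions of the party".split(),
      ]
      cand1 = "it is a guide to action which ensures that the military always obeys the commands of the party".split()
      cand2 = "it is to insure the troops forever hearing the activity guidebook that party direct".split()
      print(modified_ngram_precision(cand1, refs, n=1))   # (17, 18)
      print(modified_ngram_precision(cand2, refs, n=1))   # (8, 14)
      print(modified_ngram_precision(cand1, refs, n=2))   # (10, 17)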



    Modified Unigram Precision: Candidate #1

    It(1) is(1) a(1) guide(1) to(1) action(1) which(1) ensures(1) that(2) the(4) military(1) always(1) obeys(0) the commands(1) of(1) the party(1)

    Reference 1: It is a guide to action that ensures that the military will forever heed Party commands.

    Reference 2: It is the guiding principle which guarantees the military forces always being under the command of the Party.

    Reference 3: It is the practical guide for the army always to heed the directions of the party.

    What’s the answer???

    17/18



    Modified Unigram Precision: Candidate #2

    It(1) is(1) to(1) insure(0) the(4) troops(0) forever(1) hearing(0) the activity(0) guidebook(0) that(2) party(1) direct(0)

    Reference 1: It is a guide to action that ensures that the military will forever heed Party commands.

    Reference 2: It is the guiding principle which guarantees the military forces always being under the command of the Party.

    Reference 3: It is the practical guide for the army always to heed the directions of the party.

    What’s the answer????

    8/14



    Modified Bigram Precision: Candidate #1

    It is(1) is a(1) a guide(1) guide to(1) to action(1) action which(0) which ensures(0) ensures that(1) that the(1) the military(1) military always(0) always obeys(0) obeys the(0) the commands(0) commands of(0) of the(1) the party(1)

    Reference 1: It is a guide to action that ensures that the military will forever heed Party commands.

    Reference 2: It is the guiding principle which guarantees the military forces always being under the command of the Party.

    Reference 3: It is the practical guide for the army always to heed the directions of the party.

    10/17

    What’s the answer????



    Modified Bigram Precision: Candidate #2

    It is(1) is to(0) to insure(0) insure the(0) the troops(0) troops forever(0) forever hearing(0) hearing the(0) the activity(0) activity guidebook(0) guidebook that(0) that party(0) party direct(0)

    Reference 1: It is a guide to action that ensures that the military will forever heed Party commands.

    Reference 2: It is the guiding principle which guarantees the military forces always being under the command of the Party.

    Reference 3: It is the practical guide for the army always to heed the directions of the party.

    What’s the answer????

    1/13



    Catching Cheaters

    the(2) the(0) the(0) the(0) the(0) the(0) the(0)

    Reference 1: The cat is on the mat

    Reference 2: There is a cat on the mat

    What’s the unigram answer?

    2/7

    What’s the bigram answer?

    0/7



    Bleu distinguishes human from machine translations



    Bleu problems with sentence length

    • Candidate: of the

    Reference 1: It is a guide to action that ensures that the military will forever heed Party commands.

    Reference 2: It is the guiding principle which guarantees the military forces always being under the command of the Party.

    Reference 3: It is the practical guide for the army always to heed the directions of the party.

    Problem: modified unigram precision is 2/2, bigram 1/1!

    • Solution: brevity penalty; prefers candidate translations which are the same length as one of the references



    Sentence length

    • Modified n-gram precision penalizes using extra words.

    • Fails to enforce the proper translation length.

      Sentence Brevity

    • Multiplicative brevity penalty

    • Done over the entire corpus, rather than sentence-by-sentence: r/c (a small sketch follows this list)

      • r = sum of the “best match lengths”

      • c = total length of the candidate translation
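    A minimal Python sketch (not from the slides) of this corpus-level brevity penalty, using the standard BLEU form BP = exp(1 - r/c) when the candidates are collectively too short; the two-sentence corpus is invented.

      import math

      def brevity_penalty(candidates, reference_sets):
          # r = sum over sentences of the reference length closest to the candidate's
          # length ("best match length"); c = total candidate length.
          r = c = 0
          for cand, refs in zip(candidates, reference_sets):
              c += len(cand)
              r += min((len(ref) for ref in refs), key=lambda L: abs(L - len(cand)))
          return 1.0 if c >= r else math.exp(1.0 - r / c)

      cands = [["of", "the"], ["the", "gunman", "was", "shot", "dead"]]
      refsets = [[["it", "is", "a", "guide", "to", "action"]],
                 [["the", "gunman", "was", "shot", "to", "death", "by", "the", "police"]]]
      print(brevity_penalty(cands, refsets))   # well below 1: the candidates are too short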



    Brevity Penalty

    Ranking



    BLEU Tends to Predict Human Judgments

    (variant of BLEU)

    slide from G. Doddington (NIST)



    Rule-Based vs. Statistical MT

    • Rule-based MT:

      • Hand-written transfer rules

      • Rules can be based on lexical or structural transfer

      • Pro: firm grip on complex translation phenomena

      • Con: Often very labor-intensive -> lack of robustness

    • Statistical MT

      • Mainly word or phrase-based translations

      • Translations are learned from actual data

      • Pro: Translations are learned automatically

      • Con: Difficult to model complex translation phenomena



    Parallel Corpus

    • Example from DE-News (8/1/1996)



    Word-Level Alignments

    • Given a parallel sentence pair we can link (align) words or phrases that are translations of each other:



    Parallel Resources

    • Newswire: DE-News (German-English), Hong-Kong News, Xinhua News (Chinese-English),

    • Government: Canadian-Hansards (French-English), Europarl (Danish, Dutch, English, Finnish, French, German, Greek, Italian, Portuguese, Spanish, Swedish), UN Treaties (Russian, English, Arabic, . . . )

    • Manuals: PHP, KDE, OpenOffice (all from OPUS, many languages)

    • Web pages: STRAND project (Philip Resnik)



    Sentence Alignment

    • If document De is a translation of document Df, how do we find the translation for each sentence?

    • The n-th sentence in De is not necessarily the translation of the n-th sentence in document Df

    • In addition to 1:1 alignments, there are also 1:0, 0:1, 1:n, and n:1 alignments

    • Approximately 90% of the sentence alignments are 1:1



    Sentence Alignment (cont’d)

    • There are several sentence alignment algorithms:

      • Align (Gale & Church): Aligns sentences based on their character length (shorter sentences tend to have shorter translations than longer sentences). Works astonishingly well (a simplified sketch follows this list)

      • Char-align (Church): Aligns based on shared character sequences. Works fine for similar languages or technical domains

      • K-Vec (Fung & Church): Induces a translation lexicon from the parallel texts based on the distribution of foreign-English word pairs.
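    The sketch referenced above is a much-simplified, illustrative version of length-based sentence alignment in the Gale & Church spirit: dynamic programming over 1:1, 1:0, 0:1, 2:1 and 1:2 beads, but with a toy length-difference cost instead of the real probabilistic (Gaussian) length model. Function names, the skip cost, and the example lengths are all invented.

      def align_sentences(src_lens, tgt_lens, skip_cost=6.0):
          # src_lens / tgt_lens: character lengths of the source / target sentences.
          def bead_cost(ls, lt):
              return abs(ls - lt) / (ls + lt + 1.0)   # toy cost, not Gale & Church's model

          n, m = len(src_lens), len(tgt_lens)
          INF = float("inf")
          cost = [[INF] * (m + 1) for _ in range(n + 1)]
          back = [[None] * (m + 1) for _ in range(n + 1)]
          cost[0][0] = 0.0
          moves = [(1, 1), (1, 0), (0, 1), (2, 1), (1, 2)]   # allowed bead types
          for i in range(n + 1):
              for j in range(m + 1):
                  if cost[i][j] == INF:
                      continue
                  for di, dj in moves:
                      if i + di > n or j + dj > m:
                          continue
                      ls, lt = sum(src_lens[i:i + di]), sum(tgt_lens[j:j + dj])
                      c = skip_cost if (di == 0 or dj == 0) else bead_cost(ls, lt)
                      if cost[i][j] + c < cost[i + di][j + dj]:
                          cost[i + di][j + dj] = cost[i][j] + c
                          back[i + di][j + dj] = (di, dj)
          beads, i, j = [], n, m          # trace back the best sequence of beads
          while (i, j) != (0, 0):
              di, dj = back[i][j]
              beads.append(((i - di, i), (j - dj, j)))
              i, j = i - di, j - dj
          return list(reversed(beads))

      # Hypothetical character lengths for 3 source and 4 target sentences:
      for (s0, s1), (t0, t1) in align_sentences([51, 24, 22], [22, 29, 25, 17]):
          print(f"source[{s0}:{s1}] <-> target[{t0}:{t1}]")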



    Computing Translation Probabilities

    • Given a parallel corpus we can estimate P(e | f). The maximum likelihood estimate of P(e | f) is: freq(e,f) / freq(f) (see the sketch after this slide)

    • Way too specific to get any reasonable frequencies! Vast majority of unseen data will have zero counts!

    • P(e | f ) could be re-defined as:

    • Problem: The English words maximizing

      P(e | f ) might not result in a readable sentence
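    A tiny Python sketch (not from the slides) of the maximum-likelihood estimate P(e | f) = freq(e,f) / freq(f) over whole sentence pairs, on an invented toy corpus; the second query shows the zero-count problem noted above.

      from collections import Counter

      corpus = [("la maison bleue", "the blue house"),
                ("la maison bleue", "the blue house"),
                ("la maison", "the house"),
                ("la fleur", "the flower")]

      pair_freq = Counter(corpus)
      f_freq = Counter(f for f, _ in corpus)

      def p_e_given_f(e, f):
          return pair_freq[(f, e)] / f_freq[f] if f_freq[f] else 0.0

      print(p_e_given_f("the blue house", "la maison bleue"))   # 1.0 on the toy counts
      print(p_e_given_f("the blue home", "la maison bleue"))    # 0.0: unseen sentence pair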



    Computing Translation Probabilities (cont’d)

    • We can account for adequacy: each foreign word translates into its most likely English word

    • We cannot guarantee that this will result in a fluent English sentence

    • Solution: transform P(e | f) with Bayes’ rule: P(e | f) = P(e) P(f | e) / P(f)

    • P(f | e) accounts for adequacy

    • P(e) accounts for fluency



    Decoding

    • The decoder combines the evidence from P(e) and P(f | e) to find the sequence e that is the best translation: e-hat = argmax_e P(e) * P(f | e). (A toy sketch follows this slide.)

    • The choice of word e’ as translation of f’ depends on the translation probability P(f’ | e’) and on the context, i.e. other English words preceding e’
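    A toy illustration (not a real decoder) of that argmax: rescoring a hand-written candidate list by log P(e) + log P(f | e). All candidates and probabilities are invented.

      import math

      candidates = {                       # hypothetical LM and TM scores for one input f
          "that pleases me": {"lm": 0.002, "tm": 0.30},
          "that me pleases": {"lm": 0.00001, "tm": 0.45},   # faithful order, poor English
          "I like it":       {"lm": 0.004, "tm": 0.10},
      }

      def score(e):
          s = candidates[e]
          return math.log(s["lm"]) + math.log(s["tm"])      # log P(e) + log P(f | e)

      print(max(candidates, key=score))    # the best translation under these toy numbers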



    What makes a good translation

    • Translators often talk about two factors we want to maximize:

    • Faithfulness or fidelity

      • How close is the meaning of the translation to the meaning of the original

      • (Even better: does the translation cause the reader to draw the same inferences as the original would have)

    • Fluency or naturalness

      • How natural the translation is, just considering its fluency in the target language



    Statistical MT: Faithfulness and Fluency formalized!

    • Best translation of a source sentence S: T-hat = argmax_T faithfulness(S, T) * fluency(T) = argmax_T P(S | T) * P(T)

    • Developed by researchers who were originally in speech recognition at IBM

    • Called the IBM model



    The IBM model

    • Hmm, those two factors might look familiar…

    • Yup, it’s Bayes’ rule: P(T | S) = P(S | T) * P(T) / P(S), so the T that maximizes P(T | S) also maximizes P(S | T) * P(T)



    It’s also the Noisy Channel Model

    Slide from Christof Monz



    Fluency: P(T)

    • How to measure that this sentence

      • That car was almost crash onto me

    • is less fluent than this one:

      • That car almost hit me.

    • Answer: language models (N-grams!)

      • For example P(hit|almost) > P(was|almost)

    • But can use any other more sophisticated model of grammar

    • Advantage: this is monolingual knowledge!



    Faithfulness: P(S|T)

    • French: ça me plait [that me pleases]

    • English:

      • that pleases me - most fluent

      • I like it

      • I’ll take that one

    • How to quantify this?

    • Intuition: degree to which words in one sentence are plausible translations of words in other sentence

      • Product of probabilities that each word in target sentence would generate each word in source sentence.



    Faithfulness P(S|T)

    • Need to know, for every target language word, probability of it mapping to every source language word.

    • How do we learn these probabilities?

    • Parallel texts!

      • Lots of times we have two texts that are translations of each other

      • If we knew which word in Source Text mapped to each word in Target Text, we could just count!



    Faithfulness P(S|T)

    • Sentence alignment:

      • Figuring out which source language sentence maps to which target language sentence

    • Word alignment

      • Figuring out which source language word maps to which target language word



    Big Point about Faithfulness and Fluency

    • Job of the faithfulness model P(S|T) is just to model the “bag of words”: which words map between, say, English and Spanish.

    • P(S|T) doesn’t have to worry about internal facts about Spanish word order: that’s the job of P(T)

    • P(T) can do Bag generation: put the following words in order

      • have programming a seen never I language better

    -actual the hashing is since not collision-free usually

    the is less perfectly the of somewhat capacity table



    P(T) and bag generation: the answer

    • “Usually the actual capacity of the table is somewhat less, since the hashing is not collision-free”

    • How about:

      • loves Mary John



    A motivating example

    • Japanese phrase 2000nen taio

      • 2000nen

        • 2000 - highest

        • Y2K

        • 2000 years

        • 2000 year

      • Taio

        • Correspondence -highest

        • Corresponding

        • Equivalent

        • Tackle

        • Dealing with

        • Deal with

    P(S|T) alone prefers:

    2000 Correspondence

    Adding P(T) might produce correct

    Dealing with Y2K


    Recent Progress in Statistical MT

    slide from C. Wayne, DARPA

    2002 system output:

    insistent Wednesday may recurred her trips to Libya tomorrow for flying

    Cairo 6-4 ( AFP ) - an official announced today in the Egyptian lines company for flying Tuesday is a company " insistent for flying " may resumed a consideration of a day Wednesday tomorrow her trips to Libya of Security Council decision trace international the imposed ban comment .

    And said the official " the institution sent a speech to Ministry of Foreign Affairs of lifting on Libya air , a situation her receiving replying are so a trip will pull to Libya a morning Wednesday " .

    2003 system output:

    Egyptair Has Tomorrow to Resume Its Flights to Libya

    Cairo 4-6 (AFP) - said an official at the Egyptian Aviation Company today that the company egyptair may resume as of tomorrow, Wednesday its flights to Libya after the International Security Council resolution to the suspension of the embargo imposed on Libya.

    " The official said that the company had sent a letter to the Ministry of Foreign Affairs, information on the lifting of the air embargo on Libya, where it had received a response, the first take off a trip to Libya on Wednesday morning ".



    Centauri/Arcturan [Knight, 1997]

    Your assignment, translate this to Arcturan: farok crrrok hihok yorok clok kantok ok-yurp



    1a. ok-voon ororok sprok .

    1b. at-voon bichat dat .

    7a. lalok farok ororok lalok sprok izok enemok .

    7b. wat jjat bichat wat dat vat eneat .

    2a. ok-drubel ok-voon anok plok sprok .

    2b. at-drubel at-voon pippat rrat dat .

    8a. lalok brok anok plok nok .

    8b. iat lat pippat rrat nnat .

    3a. erok sprok izok hihok ghirok .

    3b. totat dat arrat vat hilat .

    9a. wiwok nok izok kantok ok-yurp .

    9b. totat nnat quat oloat at-yurp .

    4a. ok-voon anok drok brok jok .

    4b. at-voon krat pippat sat lat .

    10a. lalok mok nok yorok ghirok clok .

    10b. wat nnat gat mat bat hilat .

    5a. wiwok farok izok stok .

    5b. totat jjat quat cat .

    11a. lalok nok crrrok hihok yorok zanzanok .

    11b. wat nnat arrat mat zanzanat .

    6a. lalok sprok izok jok stok .

    6b. wat dat krat quat cat .

    12a. lalok rarok nok izok hihok mok .

    12b. wat nnat forat arrat vat gat .

    Centauri/Arcturan [Knight, 1997]

    Your assignment, translate this to Arcturan: farok crrrok hihok yorok clok kantok ok-yurp


    Centauri/Arcturan [Knight, 1997]

    Your assignment, put these words in order: { jjat, arrat, mat, bat, oloat, at-yurp }

    zero fertility



    1a. Garcia and associates .

    1b. Garcia y asociados .

    7a. the clients and the associates are enemies .

    7b. los clientes y los asociados son enemigos .

    2a. Carlos Garcia has three associates .

    2b. Carlos Garcia tiene tres asociados .

    8a. the company has three groups .

    8b. la empresa tiene tres grupos .

    3a. his associates are not strong .

    3b. sus asociados no son fuertes .

    9a. its groups are in Europe .

    9b. sus grupos estan en Europa .

    4a. Garcia has a company also .

    4b. Garcia tambien tiene una empresa .

    10a. the modern groups sell strong pharmaceuticals .

    10b. los grupos modernos venden medicinas fuertes .

    5a. its clients are angry .

    5b. sus clientes estan enfadados .

    11a. the groups do not sell zenzanine .

    11b. los grupos no venden zanzanina .

    6a. the associates are also angry .

    6b. los asociados tambien estan enfadados .

    12a. the small groups are not modern .

    12b. los grupos pequenos no son modernos .

    It’s Really Spanish/English

    Clients do not sell pharmaceuticals in Europe => Clientes no venden medicinas en Europa



    Statistical MT Systems

    [Diagram: a Spanish/English bilingual text corpus and an English text corpus each feed a Statistical Analysis step; the Spanish input “Que hambre tengo yo” is turned into broken-English candidates (“What hunger have I”, “Hungry I am so”, “I am so hungry”, “Have I that hunger …”) and then into the English output “I am so hungry”.]



    Statistical MT Systems

    [Diagram: the Spanish/English bilingual text feeds Statistical Analysis to give the Translation Model P(s|e); the English text feeds Statistical Analysis to give the Language Model P(e); the Spanish input “Que hambre tengo yo” goes through the decoding algorithm, argmax over e of P(e) * P(s|e), to give the English output “I am so hungry”.]



    Three Problems for Statistical MT

    • Language model

      • Given an English string e, assigns P(e) by formula

      • good English string -> high P(e)

      • random word sequence -> low P(e)

    • Translation model

      • Given a pair of strings <f,e>, assigns P(f | e) by formula

      • <f,e> look like translations -> high P(f | e)

      • <f,e> don’t look like translations -> low P(f | e)

    • Decoding algorithm

      • Given a language model, a translation model, and a new sentence f … find translation e maximizing P(e) * P(f | e)



    The Classic Language Model: Word N-Grams

    Goal of the language model -- choose among:

    He is on the soccer field

    He is in the soccer field

    Is table the on cup the

    The cup is on the table

    Rice shrine

    American shrine

    Rice company

    American company



    The Classic Language Model: Word N-Grams

    Generative approach:

    w1 = START

    repeat until END is generated:

    produce word w2 according to a big table P(w2 | w1)

    w1 := w2

    P(I saw water on the table) =

    P(I | START) *

    P(saw | I) *

    P(water | saw) *

    P(on | water) *

    P(the | on) *

    P(table | the) *

    P(END | table)

    Probabilities can be learned from online English text. (A small bigram sketch follows this slide.)
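    The bigram sketch referenced above, in Python (not from the slides); the training sentences, add-k smoothing, and vocabulary size are invented.

      import math
      from collections import Counter, defaultdict

      training = ["I saw water on the table".split(),
                  "I saw the water".split(),
                  "the table was wet".split()]

      bigram_counts = defaultdict(Counter)
      for sent in training:
          words = ["START"] + sent + ["END"]
          for w1, w2 in zip(words, words[1:]):
              bigram_counts[w1][w2] += 1

      def p(w2, w1, k=0.1, vocab_size=1000):
          # add-k smoothing so unseen bigrams still get a small probability
          return (bigram_counts[w1][w2] + k) / (sum(bigram_counts[w1].values()) + k * vocab_size)

      def log_p(sentence):
          words = ["START"] + sentence.split() + ["END"]
          return sum(math.log(p(w2, w1)) for w1, w2 in zip(words, words[1:]))

      print(log_p("I saw water on the table"))
      print(log_p("table the on water saw I"))   # the scrambled version scores much lower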



    Translation Model?

    Generative approach:

    Mary did not slap the green witch

    Source-language morphological analysis

    Source parse tree

    Semantic representation

    Generate target structure

    Maria no dió una bofetada a la bruja verde



    Translation Model?

    Generative story:

    Mary did not slap the green witch

    Source-language morphological analysis

    Source parse tree

    Semantic representation

    Generate target structure

    What are all the possible moves and their associated probability tables?

    Maria no dió una bofetada a la bruja verde



    The Classic Translation Model: Word Substitution/Permutation [IBM Model 3, Brown et al., 1993]

    Generative approach:

    Mary did not slap the green witch

    n(3|slap)

    Mary not slap slap slap the green witch

    P-Null

    Mary not slap slap slap NULL the green witch

    t(la|the)

    Maria no dió una bofetada a la verde bruja

    d(j|i)

    Maria no dió una bofetada a la bruja verde

    Probabilities can be learned from raw bilingual text.



    Statistical Machine Translation

    … la maison … la maison bleue … la fleur …

    … the house … the blue house … the flower …

    All word alignments equally likely

    All P(french-word | english-word) equally likely



    Statistical Machine Translation

    … la maison … la maison bleue … la fleur …

    … the house … the blue house … the flower …

    “la” and “the” observed to co-occur frequently,

    so P(la | the) is increased.



    Statistical Machine Translation

    … la maison … la maison bleue … la fleur …

    … the house … the blue house … the flower …

    “house” co-occurs with both “la” and “maison”, but

    P(maison | house) can be raised without limit, to 1.0,

    while P(la | house) is limited because of “the”

    (pigeonhole principle)



    Statistical Machine Translation

    … la maison … la maison bleue … la fleur …

    … the house … the blue house … the flower …

    settling down after another iteration



    Statistical Machine Translation

    … la maison … la maison bleue … la fleur …

    … the house … the blue house … the flower …

    • Inherent hidden structure revealed by EM training!

    • For details, see:

    • “A Statistical MT Tutorial Workbook” (Knight, 1999).

    • “The Mathematics of Statistical Machine Translation” (Brown et al, 1993)

    • Software: GIZA++ (a toy EM sketch for the simpler IBM Model 1 follows this slide)
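    The toy EM sketch referenced above, in Python. It trains the much simpler IBM Model 1 (word-translation probabilities only, no fertility or distortion) on the tiny corpus from these slides; it illustrates the idea, and is not GIZA++ or Model 3.

      from collections import defaultdict
      from itertools import product

      corpus = [("la maison".split(),       "the house".split()),
                ("la maison bleue".split(), "the blue house".split()),
                ("la fleur".split(),        "the flower".split())]

      f_vocab = {f for fs, _ in corpus for f in fs}
      e_vocab = {e for _, es in corpus for e in es}
      t = {(f, e): 1.0 / len(f_vocab) for f, e in product(f_vocab, e_vocab)}  # uniform start

      for iteration in range(10):
          count = defaultdict(float)    # expected count of (f, e) links
          total = defaultdict(float)    # expected count of e
          for fs, es in corpus:
              for f in fs:
                  z = sum(t[(f, e)] for e in es)      # normalize over this sentence pair
                  for e in es:
                      count[(f, e)] += t[(f, e)] / z  # fractional alignment count
                      total[e] += t[(f, e)] / z
          for f, e in t:                              # M-step: re-normalize
              t[(f, e)] = count[(f, e)] / total[e] if total[e] > 0 else 0.0

      # Both end up far above the uniform 0.25 they started from:
      print(round(t[("la", "the")], 3), round(t[("maison", "house")], 3))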



    Statistical Machine Translation

    … la maison … la maison bleue … la fleur …

    … the house … the blue house … the flower …

    P(juste | fair) = 0.411

    P(juste | correct) = 0.027

    P(juste | right) = 0.020

    new French sentence -> possible English translations, to be rescored by the language model



    More formally: The IBM Model

    • Let’s flesh out these intuitions about P(S|T) and P(T) a bit.

    • Many of the next slides are drawn from Kevin Knight’s fantastic “A Statistical MT Tutorial Workbook”!

    • http://www.isi.edu/natural-language/mt/wkbk.rtf



    IBM Model 3 as probabilistic version of Direct MT

    • We translate English into Spanish as follows:

      • Replace the words in the English sentence by Spanish words

      • Scramble around the words to look like Spanish order

    • But we can’t propose that English words are replaced by Spanish words one-for-one, because translations aren’t the same length.



    IBM Model 3 (from Knight 1999)

    • For each word ei in English sentence, choose a fertilityi. The choice of i depends only on ei, not other words or ’s.

    • For each word ei, generate i Spanish words. Choice of French word depends only on English word ei, not English context or any Spanish words.

    • Permute all the Spanish words. Each Spanish word gets assign absolute target position slot (1,2,3, etc). Choice of Spanish word position dependent only on absolute position of English word generating it.



    Translation as String rewriting (from Knight 1999)

    • Mary did not slap the green witch

    • Assign fertilities: 1 = copy over word, 2= copy twice, etc. 0 = delete

    • Mary not slap slap slap the the green witch

    • Replace English words with Spanish one-for-one:

    • Mary no daba una bofetada a la verde bruja

    • Permute the words

    • Mary no daba una bofetada a la bruja verde



    Model 3: P(S|T) training parameters

    • What are the parameters for this model? Just look at dependencies:

    • Words: P(casa|house)

    • Fertilities: n(1|house): prob that “house” will produce 1 Spanish word whenever ‘house’ appears.

    • Distortions: d(5|2): prob that an English word in position 2 of the English sentence generates a Spanish word in position 5 of the Spanish translation

    • Actually, distortions are d(5,2,4,6) where 4 is length of English sentence, 6 is Spanish length

    • Remember, P(S|T) doesn’t have to model fluency



    Model 3: last twist

    • Imagine some Spanish words are “spurious”; they appear in Spanish even though they weren’t in English original

    • Like function words; we generated “a la” from “the” by giving “the” fertility 2

    • Instead, we could give “the” fertility 1, and generate “a” spuriously

    • Do this by pretending every English sentence contains invisible word NULL as word 0.

    • Then parameters like t(a|NULL) give probability of word “a” generating spuriously from NULL



    Spurious words

    • We could imagine having n(3|NULL) (the probability that there are exactly 3 spurious words in a Spanish translation)

    • Instead of n(0|NULL), n(1|NULL), …, n(25|NULL), have a single parameter p1

    • After assigning fertilities to the non-NULL English words, we want to generate (say) z Spanish words.

    • As we generate each of the z words, we optionally toss in a spurious Spanish word with probability p1

    • The probability of not tossing in a spurious word is p0 = 1 - p1



    Distortion probabilities for spurious words

    • Can’t just have d(5|0,4,6), i.e., the chance that a NULL-generated word will end up in position 5.

    • Why? These are spurious words! They could occur anywhere! Too hard to predict.

    • Instead,

      • Use normal-word distortion parameters to choose positions for normally-generated Spanish words

      • Put Null-generated words into empty slots left over

      • If three NULL-generated words, and three empty slots, then there are 3!, or six, ways for slotting them all in

      • We’ll assign a probability of 1/6 for each way



    Real Model 3

    • For each word ei in the English sentence (i = 1, 2, …, L), choose a fertility φi with probability n(φi | ei)

    • Choose the number φ0 of spurious Spanish words to be generated from e0 = NULL, using p1 and the sum of the fertilities from step 1

    • Let m be the sum of the fertilities for all words, including NULL

    • For each i = 0, 1, …, L and each k = 1, 2, …, φi:

      • choose a Spanish word τik with probability t(τik | ei)

    • For each i = 1, 2, …, L and each k = 1, 2, …, φi:

      • choose a target Spanish position πik with probability d(πik | i, L, m)

    • For each k = 1, 2, …, φ0, choose a position π0k from the φ0 - k + 1 remaining vacant positions in 1, 2, …, m, for a total probability of 1/φ0!

    • Output the Spanish sentence with word τik in position πik (0 ≤ i ≤ L, 1 ≤ k ≤ φi)

    (A toy walk-through of these steps follows this slide.)
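    The toy walk-through referenced above: a deterministic string-rewriting view of the same steps (fertility, word translation, permutation). The tables are invented for illustration; here “the” is given fertility 2 so that it produces “a la”, as in the string-rewriting slide (the “last twist” alternative would give “the” fertility 1 and generate “a” from NULL), and the probability bookkeeping is omitted.

      english = "Mary did not slap the green witch".split()

      n_table = {"Mary": 1, "did": 0, "not": 1, "slap": 3, "the": 2, "green": 1, "witch": 1}
      t_table = {"Mary": ["Maria"], "not": ["no"], "slap": ["daba", "una", "bofetada"],
                 "the": ["a", "la"], "green": ["verde"], "witch": ["bruja"]}

      # Step 1: apply fertilities (0 = delete, 2 = copy twice, ...)
      fertile = [w for w in english for _ in range(n_table[w])]
      print(" ".join(fertile))      # Mary not slap slap slap the the green witch

      # Step 2: replace each English word by its Spanish words, one per fertility slot
      spanish = [f for w in english for f in t_table.get(w, [])]
      print(" ".join(spanish))      # Maria no daba una bofetada a la verde bruja

      # Step 3: distortion -- permute into Spanish word order (swap the last two words)
      order = [0, 1, 2, 3, 4, 5, 6, 8, 7]
      print(" ".join(spanish[i] for i in order))   # Maria no daba una bofetada a la bruja verde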



    The Classic Translation Model: Word Substitution/Permutation [IBM Model 3, Brown et al., 1993]

    Generative approach:

    Mary did not slap the green witch

    n(3|slap)

    Mary not slap slap slap the green witch

    P-Null

    Mary not slap slap slap NULL the green witch

    t(la|the)

    Maria no dió una bofetada a la verde bruja

    d(j|i)

    Maria no dió una bofetada a la bruja verde

    Probabilities can be learned from raw bilingual text.



    Model 3 parameters

    • n, t, p, d (fertility, word-translation, spurious-word, and distortion probabilities)

    • If we had English strings and step-by-step rewritings into Spanish, we could:

      • Compute n(0|did) by locating every instance of “did”, see what happens to it during first rewriting step

      • If “did” appeared 15,000 times and was deleted during the first rewriting step 13,000 times, then n(0|did) = 13/15



    Alignments

    NULL And the program has been implemented

    | | | | | | /|\

    Le programme a ete mis en application

    • If we had lots of alignments like this,

      • n(0|did): how many times does “did” connect to zero French words?

      • t(maison|house): how many of all the French words generated by “house” were “maison”?

      • d(5|2,4,6): out of all the times an English word in position 2 (of a 4-word English sentence with a 6-word French translation) moved somewhere, how many times did it move to position 5?

    (A small counting sketch follows this slide.)
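    The counting sketch referenced above (not from the slides): it assumes fully specified alignments, with each training item given as (english_words, foreign_words, links), where links maps a foreign position to the English position it connects to (None = NULL-generated). That data format is an invented convention, the single example is the slide's alignment, and the resulting counts would still have to be normalized into probabilities.

      from collections import Counter, defaultdict

      data = [(
          "And the program has been implemented".split(),
          "Le programme a ete mis en application".split(),
          {0: 1, 1: 2, 2: 3, 3: 4, 4: 5, 5: 5, 6: 5},   # "implemented" -> mis, en, application
      )]

      fertility = defaultdict(Counter)     # counts for n(phi | e)
      translation = defaultdict(Counter)   # counts for t(f | e)

      for e_words, f_words, links in data:
          phi = Counter(links.values())                 # how many foreign words each e generates
          for i, e in enumerate(e_words):
              fertility[e][phi.get(i, 0)] += 1
          for j, f in enumerate(f_words):
              e = "NULL" if links[j] is None else e_words[links[j]]
              translation[e][f] += 1

      print(fertility["And"])              # Counter({0: 1}): "And" connects to no French word here
      print(translation["implemented"])    # Counter({'mis': 1, 'en': 1, 'application': 1})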



    Where to get alignments: Parallel corpora

    • Example from DE-News (8/1/1996)

    Slide from Christof Monz



    Sentence Alignment

    El viejo está feliz porque ha pescado muchos veces. Su mujer habla con él. Los tiburones esperan.

    The old man is happy. He has fished many times. His wife talks to him. The fish are jumping. The sharks await.



    Sentence Alignment

    • El viejo está feliz porque ha pescado muchos veces.

    • Su mujer habla con él.

    • Los tiburones esperan.

    • The old man is happy.

    • He has fished many times.

    • His wife talks to him.

    • The fish are jumping.

    • The sharks await.





    Sentence Alignment

    • El viejo está feliz porque ha pescado muchos veces.

    • Su mujer habla con él.

    • Los tiburones esperan.

    • The old man is happy. He has fished many times.

    • His wife talks to him.

    • The sharks await.

    Note that unaligned sentences are thrown out, and

    sentences are merged in n-to-m alignments (n, m > 0).



    Where to get alignments

    • It turns out we can bootstrap alignments from just a bilingual corpus:

      1. Assume some startup values for n, d, t, etc.

      2. Use those values for n, d, t, etc. and Model 3 to do a “forced alignment”, i.e., pick the best word alignments between sentences

      3. Use these alignments to retrain n, d, t, etc.

      4. Go to step 2

    • This is called the Expectation-Maximization or EM algorithm



    Summary

    • Intro and a little history

    • Language Similarities and Divergences

    • Four main MT Approaches

      • Transfer

      • Interlingua

      • Direct

      • Statistical

    • Evaluation



    Classes

    • LINGUIST 139M/239M. Human and Machine Translation. (Martin Kay)

    • CS 224N. Natural Language Processing (Chris Manning)

