“Poetic” Statistical Machine Translation: Rhyme and Meter

Genzel, Uszkoreit, Och; Google, 2010.

Presentation Transcript


  1. “Poetic” Statistical Machine Translation: Rhyme and Meter
     Genzel, Uszkoreit, Och; Google, 2010

  2. Challenge
     • Automatic translation of poetry is possibly the most difficult problem in computational linguistics, MT, and AI.
     • Few humans are capable of translating poetry.
     • No previous attempts have been made to apply MT to poetry.

  3. Defining Poetry Translation
     • A poem’s form and meter (韻律) must be preserved in translation, if at all possible.
     • Poetic form acts as a constraint on the possible translation outputs.
     • Naïve approach: perform MT, then run a poem detector over the results.
     • Better approach: recast “poem-likeness” as a feature function that assigns zero cost to output fitting the form, and make it a local feature so that it can guide the decoder search.

  4. Reduce Hypothesis Space
     • H = {h | h is an acceptable translation}
     • The poetic constraints reduce the size of H.
     • H of poems = {h | h is an acceptable translation that maintains syllable structure (音節), rhythm (韻律), and rhyme (押韻)}

  5. Types of Poetry
     • Line length: haiku (三行俳句詩), with lines of 5, 7, and 5 syllables.
     • Rhythmic poetry (e.g., blank verse): stress is written as 0 for an unstressed syllable and 1 for a stressed one.
     • Iambic foot (抑揚格): (01)*
     • Dactylic foot (揚抑抑格): (100)*
     • Rhythmic and rhyming: sonnet (abbaabbacdecde).
     • Lines with the same rhyme letter have the same meter.
     • Example form: {abab, a:010101, b:10101010}
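The notation on this slide can be made concrete with a small sketch. The dictionary layout and the names `meter_pattern`, `form`, and `line_targets` are illustrative assumptions, not taken from the presentation; the stress strings and the `abab` example are the slide's own.

```python
from itertools import cycle, islice

def meter_pattern(foot: str, syllables: int) -> str:
    """Repeat a foot (e.g. "01" for iambic) to cover `syllables` syllables."""
    return "".join(islice(cycle(foot), syllables))

# Hypothetical encoding of the slide's example {abab, a:010101, b:10101010}.
form = {
    "rhyme_scheme": "abab",
    "line_patterns": {"a": "010101", "b": "10101010"},
}

def line_targets(form: dict) -> list:
    """Return the (rhyme letter, stress pattern) pair required for each line."""
    return [(x, form["line_patterns"][x]) for x in form["rhyme_scheme"]]

# A form constrained only by line length, such as haiku, needs no stress strings.
haiku_lengths = [5, 7, 5]

iambic_pentameter = meter_pattern("01", 10)   # "0101010101"
dactylic_line = meter_pattern("100", 7)       # "1001001"
```

Keeping the rhyme scheme and the per-letter stress patterns separate mirrors the slide's `{abab, a:..., b:...}` notation, so one form description covers both rhythm and rhyme.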

  6. Stress Pattern Feature Function
     • Use text-to-speech to obtain the stress pattern of each word.
     • Phrase-based: the state is the current hypothesis length modulo the foot length (0 or 1 for a 2-syllable foot; 0, 1, or 2 for a 3-syllable foot). The cost is the number of mismatches.
     • Hierarchical: the state records how well a partial hypothesis fits the pattern, together with its length and cost. States can be combined.
     • “Whatever fits”: modify the translation trivially to fit the pattern. Take care to combine the correct pattern scores.
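The phrase-based variant above can be sketched as follows. This is a minimal illustration under stated assumptions: each target phrase is assumed to arrive with a precomputed stress string (for example from a text-to-speech front end), and the function name `stress_cost` is hypothetical.

```python
def stress_cost(position: int, phrase_stress: str, foot: str) -> tuple:
    """Count stress mismatches between a target phrase and the meter.

    position:      syllables produced so far, modulo the foot length (the state)
    phrase_stress: stress string of the phrase, e.g. "010" (assumed precomputed)
    foot:          the repeating foot, e.g. "01" (iambic) or "100" (dactylic)
    Returns (number of mismatches, new position modulo the foot length).
    """
    cost = 0
    for syllable in phrase_stress:
        if syllable != foot[position]:
            cost += 1
        position = (position + 1) % len(foot)
    return cost, position

# An iambic phrase appended at the start of a foot fits perfectly ...
print(stress_cost(0, "0101", "01"))   # (0, 0)
# ... but the same kind of phrase appended mid-foot is off on every syllable.
print(stress_cost(1, "01", "01"))     # (2, 1)
```

Because the state is only the position within the foot, hypotheses of different lengths that end at the same position share a state, which keeps the feature cheap for the decoder to track.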

  7. Framework for General Poetic Form Feature Function 1/3
     • Track the target length with dynamic programming over the phrase lattice.
     • If the maximum source phrase size is k, the length is n, and the maximum target length is l, then the sweep requires [expression missing in the transcript]. Precomputing the size range can reduce this to [expression missing in the transcript].
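One way to picture the length tracking is a left-to-right sweep that records which target lengths are reachable. The sketch below is a simplification and not necessarily the sweep the authors describe: it assumes monotone segmentation (no reordering), and `phrase_lengths` is a hypothetical table mapping each source span to the syllable counts of its candidate translations.

```python
def reachable_lengths(n: int, phrase_lengths: dict, max_phrase: int, max_target: int) -> list:
    """For each source prefix [0, j), collect the achievable target syllable totals.

    n:              number of source words
    phrase_lengths: {(i, j): set of target syllable counts for source span [i, j)}
    max_phrase:     maximum source phrase size (k on the slide)
    max_target:     maximum target length in syllables (l on the slide)
    """
    reach = [set() for _ in range(n + 1)]
    reach[0].add(0)
    for j in range(1, n + 1):
        for i in range(max(0, j - max_phrase), j):
            for t in phrase_lengths.get((i, j), ()):
                for base in reach[i]:
                    if base + t <= max_target:
                        reach[j].add(base + t)
    return reach

# Toy lattice over a three-word source sentence (made-up numbers).
lattice = {(0, 1): {1, 2}, (1, 2): {1}, (2, 3): {2}, (0, 2): {2}, (1, 3): {4}}
reach = reachable_lengths(3, lattice, max_phrase=2, max_target=10)
print(sorted(reach[3]))   # [4, 5, 6]
```

The minimum and maximum of each reachable set are enough for a decoder to decide early whether a desired line length can still be met.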

  8. Framework for General Poetic Form Feature Function 2/3
     • State space for the feature function:
       • Current sentence length in syllables
       • Set of uncovered ranges
       • Letters from the rhyming scheme

  9. Framework for General Poetic Form Feature Function 3/3
     • Algorithm, given a hypothesis state and a phrase pair p:
       1. Initialize the cost to 0 and the new state as a copy of the given state.
       2. Update the new state: increment the sentence length by the target phrase length and update the covered range.
       3. Compute the minimum and maximum achievable sentence length; if the desired length is not in this range, cost++.
       4. For each word in the target phrase:
          (a) If the syllable pattern does not match, cost++.
          (b) If at the end of a line:
              i. If the line ends mid-word, cost++.
              ii. Let x be the rhyme scheme letter for this line.
              iii. If x is in the state, check whether the word associated with x rhymes with the current word; if not, cost++.
              iv. Remove x and its associated word from the state.
              v. If the letter x occurs later in the rhyming scheme, add x with the current word to the state.
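The steps above can be sketched in code. Everything named here (`State`, `rhymes`, `form_cost`, and the argument layout) is an illustrative assumption rather than the authors' implementation. The coverage-range bookkeeping from step 2 is omitted, and `rhymes` is a crude spelling-based stand-in; a real system would compare pronunciations.

```python
from dataclasses import dataclass, field

@dataclass
class State:
    length: int = 0                               # syllables produced so far
    pending: dict = field(default_factory=dict)   # rhyme letter -> word awaiting a rhyme

def rhymes(a: str, b: str) -> bool:
    # Crude stand-in (assumption): two words rhyme if their last two letters match.
    return a[-2:].lower() == b[-2:].lower()

def form_cost(state, phrase, pattern, line_ends, scheme, remaining=None):
    """Score one phrase pair against the poetic form; returns (cost, new_state).

    phrase:    list of (word, stress string) for the target side
    pattern:   desired stress pattern of the whole output, lines concatenated
    line_ends: syllable offsets at which each line ends
    scheme:    one rhyme letter per line, e.g. "abab"
    remaining: optional (min, max) syllables still obtainable from uncovered source
    """
    cost = 0                                                   # step 1
    new = State(state.length, dict(state.pending))
    final = new.length + sum(len(s) for _, s in phrase)        # step 2 (length only)
    if remaining is not None:                                  # step 3
        lo, hi = final + remaining[0], final + remaining[1]
        if not lo <= len(pattern) <= hi:
            cost += 1
    pos = new.length
    for word, stress in phrase:                                # step 4
        end = pos + len(stress)
        if pattern[pos:end] != stress:                         # 4(a)
            cost += 1
        for k, line_end in enumerate(line_ends):               # 4(b)
            if pos < line_end <= end:
                if line_end < end:                             # i: line ends mid-word
                    cost += 1
                x = scheme[k]                                  # ii
                if x in new.pending and not rhymes(new.pending[x], word):  # iii
                    cost += 1
                new.pending.pop(x, None)                       # iv
                if x in scheme[k + 1:]:                        # v
                    new.pending[x] = word
        pos = end
    new.length = final
    return cost, new

# Four iambic lines of four syllables each, rhyming abab (toy example).
pattern, line_ends, scheme = "0101" * 4, [4, 8, 12, 16], "abab"
line1 = [("the", "0"), ("cat", "1"), ("is", "0"), ("here", "1")]
cost, state = form_cost(State(), line1, pattern, line_ends, scheme)
```

After the first line, `state.pending` holds the word that a later `a` line must rhyme with, which is how a rhyme constraint spanning several lines stays local to each decoding step.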

  10. Results
     • There is no objective evaluation of “poetic” quality.
     • Test: the percentage of sentences that can be translated while maintaining the stress pattern, and the impact of this constraint on the BLEU score.
     • The baseline BLEU score is 35.33; the stress-pattern-constrained system scores 18.93.
     • If no stress errors are allowed, 85% of the sentences were matched.
     • If one stress error is allowed, 92% were matched.

  11. Sample Translation
