Pen Research

Jay Pittman, Development Lead, Tablet PC Handwriting Recognition, Microsoft Corporation.

Presentation Transcript


  1. Pen Research. Jay Pittman, Development Lead, Tablet PC Handwriting Recognition, Microsoft Corporation

  2. TDNN
     • Time-Delayed Neural Network
     • Ink is cut into segments via a simple algorithm
     • For Latin script, we just cut at bottoms
     • For Arabic or Hindi, we may need a different algorithm
     • Features are computed per segment
     • Chebyshev coefficients
     • Variety of size metrics: width, height, offsets to neighbors, etc.
     • Each supported letter or character has an output node
     • Many have 2 outputs, for beginning of character and continuation of character
     • Very large training sets are collected from tens of thousands of native writers
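The segmentation and feature steps on this slide can be sketched roughly as follows. This is only an illustration under stated assumptions, not the recognizer's actual code: the function names, the screen-coordinate convention (y grows downward, so a "bottom" is a local maximum of y), the number of coefficients, and the choice of width and height as the only size metrics are all hypothetical.

```python
import math


def cut_at_bottoms(points):
    """Split a stroke into segments at local bottoms.

    Assumes screen coordinates (y grows downward), so a bottom is a
    local maximum of y. Each cut point is shared by both neighbours.
    """
    if len(points) < 3:
        return [list(points)]
    segments, current = [], [points[0]]
    for prev, cur, nxt in zip(points, points[1:], points[2:]):
        current.append(cur)
        if cur[1] > prev[1] and cur[1] >= nxt[1]:
            segments.append(current)
            current = [cur]
    current.append(points[-1])
    segments.append(current)
    return segments


def chebyshev_coeffs(values, order=4, nodes=16):
    """Approximate the first `order` Chebyshev coefficients of a sampled curve.

    The samples (at least two) are treated as evenly spaced over t in
    [-1, 1], linearly interpolated at Chebyshev nodes, then projected
    onto T_0 .. T_{order-1}.
    """
    def sample(t):
        pos = (t + 1) / 2 * (len(values) - 1)
        i = min(int(pos), len(values) - 2)
        frac = pos - i
        return values[i] * (1 - frac) + values[i + 1] * frac

    coeffs = []
    for k in range(order):
        total = 0.0
        for j in range(nodes):
            theta = math.pi * (j + 0.5) / nodes
            total += sample(math.cos(theta)) * math.cos(k * theta)
        scale = 1 if k == 0 else 2
        coeffs.append(scale * total / nodes)
    return coeffs


def segment_features(segment, order=4):
    """Chebyshev coefficients of x(t) and y(t), plus width and height."""
    xs = [p[0] for p in segment]
    ys = [p[1] for p in segment]
    width = max(xs) - min(xs)
    height = max(ys) - min(ys)
    return chebyshev_coeffs(xs, order) + chebyshev_coeffs(ys, order) + [width, height]


# A zig-zag stroke with two bottoms, hence three segments.
stroke = [(0, 0), (1, 2), (2, 0), (3, 2), (4, 0)]
segments = cut_at_bottoms(stroke)
features = [segment_features(s) for s in segments]
```

Each feature vector here has ten entries (four coefficients per axis plus two size metrics); a real system would presumably add the neighbor offsets mentioned above.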

  3. Beam Search
     • Very similar to speech recognition systems
     • System lexicon
     • Simple list of words (like a spellchecker)
     • Stored as a trie
     • Plus regular expressions for numbers, dates, times, currency, phone numbers, etc.
     • Low-scoring sequences are “culled” from the trie as we go
     • Converts character recognizer into word recognizer
     • We can recognize a word even if you mangle one of the letters
     • Supports sloppier (and therefore faster) handwriting
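A minimal sketch of a trie-constrained beam search, to make the culling idea concrete. It simplifies heavily: it assumes exactly one letter per segment, a dict of letter scores per segment, and a fixed beam width, none of which come from the slides. The names `build_trie` and `beam_search` and all scores are hypothetical.

```python
def build_trie(words):
    """Store the lexicon as nested dicts; "$" marks the end of a word."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node["$"] = True
    return root


def beam_search(columns, trie, beam_width=3):
    """Return lexicon words ranked by total score.

    columns: one dict per segment mapping letter -> score (higher is better).
    Only prefixes that exist in the trie are extended, and only the
    `beam_width` best prefixes survive each step.
    """
    beam = [(0, "", trie)]  # (total score, prefix, trie node)
    for scores in columns:
        candidates = []
        for total, prefix, node in beam:
            for ch, child in node.items():
                if ch != "$" and ch in scores:
                    candidates.append((total + scores[ch], prefix + ch, child))
        candidates.sort(key=lambda c: c[0], reverse=True)
        beam = candidates[:beam_width]  # cull low-scoring paths
    return [(prefix, total) for total, prefix, node in beam if "$" in node]


lexicon = build_trie(["dog", "dug", "cog", "dot"])
# The middle letter is mangled: the character recognizer prefers "a".
columns = [
    {"d": 90, "c": 40},
    {"a": 80, "o": 60, "u": 50},
    {"g": 85, "q": 30},
]
results = beam_search(columns, lexicon)
```

Because "dag" is not in the lexicon, the top-scoring "a" is never followed, and "dog" comes out first. That is the sense in which the lexicon turns a character recognizer into a word recognizer.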

  4. TDNN + Beam Search
     [Figure: diagram with the labels Ink Segments, TDNN, Space TDNN, Output Matrix, Lexicon, Beam Search, and Top 10 List. The output matrix shows one score per segment for rows such as a*, a, b, c, d*, d, g*, g, o*, o, and t.]
     Top 10 List: dog 68, clog 57, dug 51, doom 42, divvy 37, ooze 35, cloy 34, doxy 29, client 22, dozy 13
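One way to read the starred and unstarred rows of the output matrix is as the "beginning of character" and "continuation of character" outputs from the TDNN slide. Under that reading, a word can be scored by finding the best way to divide the segments among its letters. The sketch below uses a small dynamic program rather than the beam search itself; the function name, the additive scoring, and all numbers are invented for illustration and are not taken from the figure.

```python
NEG = float("-inf")


def word_score(word, matrix):
    """Best total score for `word` over all splits of segments among letters.

    matrix: one dict per segment. Key "d*" scores "this segment begins a d",
    key "d" scores "this segment continues a d". Missing keys count as -inf.
    """
    n, m = len(matrix), len(word)
    if m == 0 or m > n:
        return NEG
    best = [NEG] * m
    best[0] = matrix[0].get(word[0] + "*", NEG)
    for seg in matrix[1:]:
        nxt = [NEG] * m
        for j, ch in enumerate(word):
            stay = best[j] + seg.get(ch, NEG)  # same letter continues
            advance = best[j - 1] + seg.get(ch + "*", NEG) if j > 0 else NEG
            nxt[j] = max(stay, advance)
        best = nxt
    return best[-1]


# Four ink segments; the "d" spans two of them in the best reading.
matrix = [
    {"d*": 90, "c*": 60},
    {"d": 80, "l*": 70, "o*": 20},
    {"o*": 85, "u*": 40},
    {"g*": 88},
]
ranked = sorted(["clog", "dog", "dug"], key=lambda w: word_score(w, matrix), reverse=True)
```

With these made-up scores the ranking is dog, clog, dug. A production system would interleave this alignment with the trie search instead of scoring every word separately.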

  5. Personalization
     • New in Vista
     • Shape adaptation
     • Collect samples from you
     • Simple idea: continue same training we do at Microsoft, but only on your samples
     • Implicit
     • Explicit
     • Text adaptation
     • Collect your personal words from Word and outgoing emails
     [Figure: example personal words: Capital I?, dev, RTM, KKOMO, dogfooding, Qi, trie, featurize, Herry]
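Text adaptation can be pictured as harvesting words that the system lexicon does not already contain. The sketch below is a guess at the general shape only: the tokenizer, the `min_count` threshold, and the function name `personal_words` are assumptions, and nothing here reflects how Vista actually reads Word documents or email.

```python
import re
from collections import Counter


def personal_words(texts, system_lexicon, min_count=1):
    """Count words from the user's texts that the system lexicon lacks.

    Lookup is case-insensitive, but the user's own spelling is kept,
    so an acronym such as "RTM" stays uppercase.
    """
    counts = Counter()
    for text in texts:
        for token in re.findall(r"[A-Za-z]+", text):
            if token.lower() not in system_lexicon:
                counts[token] += 1
    return {word: n for word, n in counts.items() if n >= min_count}


system_lexicon = {"still", "the", "build", "goes", "on"}
outgoing = ["Still dogfooding the RTM build; dogfooding goes on."]
user_lexicon = personal_words(outgoing, system_lexicon)
```

The resulting words would then be added to the trie alongside the system lexicon, so the beam search can produce them.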

  6. Open System
     • Recognizer API is published
     • Any recognizer may support this API
     • A non-Microsoft recognizer can be installed, and it will be invoked by the inking platform
     • Non-Microsoft recognizers available now in Japan, China, and Russia
     • Compete with my group
     • Or cover languages we don’t cover
     • Or cover non-text (music, math, chemistry)
     • Sorry: recognition result API is very text oriented
     • Strings in Unicode
     • Perhaps your text might be XML
     SDK download: http://www.microsoft.com/downloads/details.aspx?FamilyID=69640b5c-0ee9-421e-8d5c-d40debee36c2&displaylang=en
     SDK documentation: Start / All Programs / Microsoft Tablet PC Platform SDK / Microsoft Tablet PC Platform SDK Documentation
     Topic in the documentation: Microsoft Tablet PC / Tablet PC Platform / Programming the Tablet PC / Creating a Recognizer

  7. Latin Orthography
     • XP has U.S. English, U.K. English, German, French, Spanish, and Italian
     • Vista adds Dutch and Brazilian Portuguese
     • Currently working on Swedish, Danish, Norwegian (Bokmål), Finnish, Polish, Czech, Portugal Portuguese, Catalan, Romanian, Croatian, and Serbian
     • No ship estimates are available
     • Serbian is written in both Latin and Cyrillic
     • Making plans for the next batch: probably Bahasa Indonesia (Indonesian), Hungarian, Turkish, Slovak, Slovene, Lithuanian, Estonian, Latvian, Vietnamese, Tagalog (Filipino), others TBD
     • World’s largest orthography
     • Largest count of languages
     • Largest combined count of literates

  8. East Asian Orthographies: Ideographic Orthographies
     • Completely different code base
     • Focus on large count of characters
     • XP has Japanese, Chinese (Simplified), Chinese (Traditional), and Korean
     • Vista adds personalization, and improves cursive recognition

  9. Cyrillic and Greek Orthographies
     • Same TDNN + beam search technology works equally well in Greek and Cyrillic
     • But we must collect new training sets
     • Cyrillic has more upper/lowercase confusion
     • Russian collection completed
     • Currently working on recognizer
     • No estimate on shipping date
     • Some work also underway in Serbian
     • Cyrillic: Russian, Ukrainian, Bulgarian, Serbian, Belarusian, Macedonian, Kazakh
     • Other former-Soviet Turkic republics are in a state of transition back to Latin scripts
     • Serbian is written in both Latin and Cyrillic

  10. Greek
      Τη γλώσσα μου έδωσαν Ελληνική
      I was given a language that is Hellenic
      Ink by Dr. John Drakopoulos

  11. Russian
      Кремль, Кремль. Ото всех я слышал про
      Kremlin, Kremlin. I’ve heard from everyone about [it]
      Ink by Vladimir Smirnov

  12. Bi-Directional Orthographies: Arabic and Hebrew
      • Text written right-to-left, but numbers written left-to-right
      • Arabic has its own digits
      • Hebrew uses “western” digits
      • No uppercase / lowercase distinction
      • Arabic is cursive-only
      • Up to 4 forms per letter (initial, medial, final, isolated)
      • Hebrew is print-only
      • 5 letters have a separate final form
      • Abjads (consonant alphabets)
      • We have collections underway
      • Some initial experimentation in both languages
      • No estimate on shipping date
      • Arabic script variations used for Farsi (Persian), Urdu, Kurdish, Azeri (in Iran), Pashto, Dari, Baluchi, Sindhi (in Pakistan), Kashmiri (in Pakistan)
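The letter-form counts above can be checked against the Unicode character database that ships with Python. This is only a way to inspect the scripts, not part of any recognizer; the helper names are hypothetical. It relies on the Arabic Presentation Forms-B block (U+FE70 to U+FEFF), whose compatibility decompositions are tagged with the positional form, and on the Unicode names of the Hebrew letters.

```python
import unicodedata


def arabic_forms():
    """Map each base Arabic letter to the set of positional forms Unicode encodes."""
    forms = {}
    for cp in range(0xFE70, 0xFF00):
        parts = unicodedata.decomposition(chr(cp)).split()
        # Keep entries like "<initial> 0628"; skip ligatures and unassigned points.
        if len(parts) == 2 and parts[0].startswith("<"):
            base = chr(int(parts[1], 16))
            forms.setdefault(base, set()).add(parts[0].strip("<>"))
    return forms


def hebrew_final_letters():
    """Hebrew letters that have a distinct word-final shape."""
    return [
        chr(cp)
        for cp in range(0x05D0, 0x05EB)
        if unicodedata.name(chr(cp)).startswith("HEBREW LETTER FINAL")
    ]


forms = arabic_forms()
beh_forms = forms["\u0628"]   # ARABIC LETTER BEH
alef_forms = forms["\u0627"]  # ARABIC LETTER ALEF
finals = hebrew_final_letters()
```

BEH shows all four forms, while ALEF has only isolated and final forms, which is why the slide says "up to" 4. The Hebrew scan returns the five final letters.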

  13. Bi-Directional Example: Arabic
      عمره: [his] age
      ١٢٣: 123
      عاماً: year[s]
      Ink by Ahmed Kamal

  14. Bi-Directional Example: Hebrew
      פלימפטון 322 מבבל.
      Plimpton 322 [is] from Babylonia.
      Ink by Ethan Zoller
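The mixed direction in the two examples above shows up directly in the Unicode bidirectional classes. The sketch below groups a string into runs by class, using only Python's standard `unicodedata` module. The function name `bidi_runs` is hypothetical, and this is far simpler than the full Unicode Bidirectional Algorithm.

```python
import unicodedata


def bidi_runs(text):
    """Group non-space characters into runs that share a bidi class.

    Classes seen here: "AL" (Arabic letter), "R" (other right-to-left
    letter, e.g. Hebrew), "AN" (Arabic-Indic digit), "EN" (western digit).
    """
    runs = []
    for ch in text:
        if ch.isspace():
            continue
        cls = unicodedata.bidirectional(ch)
        if runs and runs[-1][0] == cls:
            runs[-1][1] += ch
        else:
            runs.append([cls, ch])
    return [(cls, chars) for cls, chars in runs]


arabic_runs = bidi_runs("عمره ١٢٣")        # "[his] age 123"
hebrew_runs = bidi_runs("פלימפטון 322")   # "Plimpton 322"
age = int("١٢٣")  # Python's int() accepts Arabic-Indic digits
```

The Arabic string yields an AL run followed by an AN run, and the Hebrew string an R run followed by an EN run, matching the slide's point that Arabic has its own digits while Hebrew uses western ones.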

  15. Brahmic Orthographies
      • Left to right, no uppercase / lowercase
      • Abugidas (syllabic alphabets)
      • Default vowel (short “ə” or “uh” sound) is not written
      • All other vowels require a vowel sign added to the consonant
      • This includes a “no-vowel sign”
      • Called halant (“choked”) in Hindi, virama in other Indic languages
      • Vowel sound follows consonant sound
      • There are also independent vowels
      • Hindi Devanagari collection in progress
      • Some initial experimentation
      • No estimate on shipping date
      • Brahmic scripts: Devanagari (Hindi, Marathi), Thai, Bengali, Gujarati, Gurmukhi (Punjabi), Tamil, Telugu, Kannada, Malayalam, Oriya, Sinhala, Khmer, Lao, Tibetan, Myanmar, Sindhi (in India), Kashmiri (in India)

  16. Vowel Signs
      [Figure labels: “KAI” “KEE” “KE” “KA” “KAY” “K” “KAA” “KO” “KOW” “KU” “KUU” “KR”]
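How vowel signs attach to a consonant can be shown by composing Devanagari syllables from code points. The choice of the consonant KA and of these eleven signs is my own selection for illustration; the slide does not say which sign goes with which label.

```python
import unicodedata

KA = "\u0915"  # DEVANAGARI LETTER KA
SIGNS = [
    "\u093E", "\u093F", "\u0940", "\u0941", "\u0942", "\u0943",
    "\u0947", "\u0948", "\u094B", "\u094C",
    "\u094D",  # halant / virama: the "no-vowel sign"
]


def syllable_table(consonant=KA):
    """Pair each written syllable with the Unicode name of the sign used."""
    rows = [(consonant, "inherent vowel, nothing written")]
    for sign in SIGNS:
        rows.append((consonant + sign, unicodedata.name(sign)))
    return rows


for syllable, description in syllable_table():
    print(syllable, description)
```

Every row except the first is two code points in memory, consonant first, even when the sign is drawn to the left of the consonant.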

  17. Consonant Clusters
      A transliteration of the English word “string”
      Unicode buffer: S halant T halant R I N G
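The buffer above can be written out as actual code points. The sketch assumes the common Hindi spelling of "string", with the retroflex letter TTA for T and the anusvara sign for N; the slide does not specify either choice, so treat both as assumptions.

```python
import unicodedata

# S halant T halant R I N G, in memory (logical) order.
STRING_HI = "\u0938\u094D\u091F\u094D\u0930\u093F\u0902\u0917"


def describe(text):
    """List each code point with its Unicode name."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


for code, name in describe(STRING_HI):
    print(code, name)
```

The two halants suppress the inherent vowels of S and T, so the renderer fuses S, T, and R into one cluster. The vowel sign I is stored after R even though it is drawn before the whole cluster.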

  18. Brahmic Comparison
      • Latin: K Kh G Gh C Ch
      • Devanagari: क ख ग घ च छ
      • Bengali: ক খ গ ঘ চ ছ
      • Gurmukhi: ਕ ਖ ਗ ਘ ਚ ਛ
      • Gujarati: ક ખ ગ ઘ ચ છ
      • Tamil: க ச (K and C only)
      • Telugu: క ఖ గ ఘ చ ఛ
      • Kannada: ಕ ಖ ಗ ಘ ಚ ಛ
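The parallel between these scripts is also visible in Unicode, where the listed blocks place corresponding letters at the same offset from the block start. The sketch below exploits that to map the Devanagari row onto the other rows. The helper name `to_script` is hypothetical, and the offset trick is only a rough illustration: it does not hold for every character in these blocks.

```python
import unicodedata

BLOCK_START = {
    "Devanagari": 0x0900,
    "Bengali": 0x0980,
    "Gurmukhi": 0x0A00,
    "Gujarati": 0x0A80,
    "Tamil": 0x0B80,
    "Telugu": 0x0C00,
    "Kannada": 0x0C80,
}


def to_script(devanagari_text, script):
    """Shift each Devanagari letter into another block, dropping unassigned slots."""
    offset = BLOCK_START[script] - BLOCK_START["Devanagari"]
    out = []
    for ch in devanagari_text:
        mapped = chr(ord(ch) + offset)
        if unicodedata.name(mapped, None) is not None:
            out.append(mapped)
    return "".join(out)


row = "\u0915\u0916\u0917\u0918\u091A\u091B"  # K Kh G Gh C Ch in Devanagari
table = {script: to_script(row, script) for script in BLOCK_START}
```

For Tamil only two letters survive, because the slots for Kh, G, Gh, and Ch are unassigned in the Tamil block, which matches the shorter Tamil row on the slide.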

  19. Demo

  20. Early Feedback
      “As for Vista, I am in love with the B2 Tablet implementation! I can see where the effort is going! The TIP is awesome and the handwriting recognition is vastly improved. There is hardly a time where I have to correct my input - no matter how sloppy I am.”
      Unsolicited feedback from a Vista Beta II user whose identity I do not know

  21. © 2006 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.

  22. Backup Slides

  23. Devanagari Example
      Pr. Ltd. [Private Limited]
      [Figure labels: headline, headline, headline; period [abbreviation]; प “PA”; म “MA”; ि [vowel sign] “I”; ल “LA”; ि [vowel sign] “I”; ा [vowel sign] “AA”; [R sign]; period [abbreviation]]
