Online Character Recognition Research at KAIST/CAIR November 1-2, 1995 Jin Hyung Kim

Computer Science Department and Center for AI Research, KAIST, Korea. Korea-Australia Joint Workshop on Information Technology.



  1. Online Character Recognition Research at KAIST/CAIR November 1-2, 1995 Jin Hyung Kim Computer Science Department and Center for AI Research KAIST, Korea Korea-Australia Joint Workshop on Information Technology

  2. Pen Computer • Pen is the major input device • optional keyboard • various sizes and shapes • Most natural interface between human and computer • Some are already on the market

  3. Pen computer : Note Taker

  4. Pen computer : Draw and Spell

  5. Pen computer : Fax / Phone

  6. Pen Computer : Portable Map

  7. Pen Computer : Inventory Watch

  8. Pen Computer : Architect's SketchPad

  9. Pen Computer : Family Message Center

  10. Pen Computer : Electronic Classroom

  11. Pen Computer : Data Collector

  12. Dreams Are No Longer Dreams • Dreams • DynaBook - Alan Kay • Knowledge Navigator - Apple • Tablet - U. of Illinois Undergrads • Projects • Pattern Information Processing (Japan) • Electronic Paper (ESPRIT) • Many Products • Newton, GridPad, EO, Samsung PenMaster

  13. Pen Computer Component Technology • Hardware • Flat Panel Display • Pen/Digitizer • MicroProcessor • Battery and Power management • Storage Devices • Packaging • Software • Wireless Comm. • Operating System • Handwriting recognition • Utilities • Applications

  14. Recognition for Pen Computing • Handwriting Recognition is a bottleneck • Menu Selection, Characters, Drawings, Gestures • Roman Character Recognition Systems on the Market • Limited Capability : C+ • Printed style Only • Unconstrained Free Style • low recognition accuracy • Active research topic with Neural Network, Hidden Markov Model, Fuzzy logic, etc.

  15. Handwriting • A sequence of writing units • Temporally ordered • (mostly) left-to-right

  16. Handwriting Recognition • Source of Difficulty • Static Variability - personal style • Dynamic Variability - shape deviation • Stroke connection - coarticulation effect • Problems to solve for free writing • Variability Modeling • simple model for high flexibility • Resolve coarticulation • segmentation problem

  17. Modeling of Handwritings • Unit of Model • Substroke or stroke • letter • Subword • word • Consideration • consistency • trainability

  18. Handwritten Roman Styles

  19. Non-Roman Character Recognition • Demand is stronger for Oriental languages • Japanese PDA products • Boxed KANA • Carefully written Chinese characters • Boxed style Roman Characters • Chinese efforts • Korean effort (Hangul recognition) • 2-D Phonetic writing system • at the stage of practical use

  20. Hangul - Korean Script • Hangul is a phonetic writing system • Combines the best features of alphabet and syllabary • Invented by King Sejong more than 500 years ago • 24 phonemes (letters) • 14 consonants: ㄱ g, ㄴ n, ㄷ d, ㄹ l, ㅁ m, ㅂ b, ㅅ s, ㅇ (silent as first consonant, ng as last), ㅈ j, ㅊ ch, ㅋ k, ㅌ t, ㅍ p, ㅎ h • 10 vowels: ㅏ a, ㅑ ya, ㅓ er, ㅕ yer, ㅗ o, ㅛ yo, ㅜ u, ㅠ yu, ㅡ eu, ㅣ i

  21. Hangul - Korean Script • A "block" represents a syllable • Consonant + Vowel • Consonant + Vowel + Consonant • Logical "block" construction rule • Consonant at left of vertical vowel • Consonant at top of horizontal vowel • Third consonant at bottom if present

  22. Hangul - Korean Script [Figure: two example blocks with numbered components. 한: 1 ㅎ = H, 2 ㅏ = A, 3 ㄴ = N. 국: 1 ㄱ = K, 2 ㅜ = U, 3 ㄱ = K.]
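
The block construction rule (consonant + vowel, plus an optional last consonant) can be illustrated with a small sketch. It relies on the arithmetic layout of the Unicode Hangul Syllables block, which the slides do not mention; the function names `compose` and `layout` are hypothetical, and `layout` only covers the 10 basic vowels listed on the earlier slide.

```python
# Jamo orderings follow the Unicode Hangul Syllables block. Using Unicode
# here is an assumption of this sketch, not something the slides specify.
INITIALS = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"              # 19 first consonants
MEDIALS = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"          # 21 vowels
FINALS = ["", *"ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ"]  # 28, "" = none

VERTICAL_VOWELS = set("ㅏㅑㅓㅕㅣ")
HORIZONTAL_VOWELS = set("ㅗㅛㅜㅠㅡ")


def compose(first, vowel, last=""):
    """Combine first consonant, vowel, and optional last consonant into one block."""
    index = (INITIALS.index(first) * 21 + MEDIALS.index(vowel)) * 28
    return chr(0xAC00 + index + FINALS.index(last))


def layout(first, vowel, last=""):
    """Describe where each component sits, following the construction rule."""
    if vowel in VERTICAL_VOWELS:
        placement = "left of"
    elif vowel in HORIZONTAL_VOWELS:
        placement = "on top of"
    else:
        raise ValueError("only the 10 basic vowels are handled in this sketch")
    parts = [f"{first} {placement} {vowel}"]
    if last:
        parts.append(f"{last} at bottom")
    return ", ".join(parts)


print(compose("ㅎ", "ㅏ", "ㄴ"), "-", layout("ㅎ", "ㅏ", "ㄴ"))
print(compose("ㄱ", "ㅜ", "ㄱ"), "-", layout("ㄱ", "ㅜ", "ㄱ"))
```

The two calls reproduce the H-A-N and K-U-K examples: one block built around a vertical vowel and one around a horizontal vowel.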

  23. Handwritten Hangul Styles • Printed: all letters separated, one syllable in a box • Cursive: ligatures within a syllable, one syllable in a box • Cursive: ligatures within a syllable, syllables may overlap spatially • Cursive: ligatures across syllables

  24. NotePad Consortium • Korean initiative for pen computing technology • mainly Handwriting Recognition • Pen Applications, Pen-based HCI research • August 1991 - January 1995 • Flagship Project of Center for AI Research, KAIST • ERC of excellence funded by KOSEF • More than 60 researchers/year from more than 10 universities • Funding support from KOSEF, MCI, and computer companies such as Daewoo, Hyundai, PosData, Samsung, Trigem, Korea Computer Inc.

  25. NotePad Consortium [Diagram labels: Ministry of Commerce and Industries; Computer manufacturers; University; User Group; Korea Science and Engineering Foundation; Center for AI Research; NotePad Consortium; Fund; Technology; Man Power; EXPO]

  26. NotePad Consortium Research Results • Several Hangul recognizers (2 Korean Patents) • Unconstrained, continuous writing • above 95% recognition rate • Roman word recognizers (US Patent) • Boxed, Run-on, unconstrained cursive word • about 88% word recognition rate • Chinese Character recognizer • Gesture Recognizer • Several Applications • Arithmetic Tutoring System, Fax Machine, etc.

  27. Approaches for Recognizer Development • Knowledge-based Approach • Structural / Feature based • heuristics / Fuzzy • Encoding of expert knowledge • Data-driven Approach • Decision Theoretic • Artificial Neural Network • Hidden Markov Model • Training procedure

  28. Recognizer vs. Recognizer Generator • Give a man a fish, and he'll eat for a day. Teach him to fish, and he'll eat for a lifetime. - Laotse

  29. KAIST HMM Approach for Handwriting Recognizer • Variability is modeled with HMM • Alphabets are modeled by character models • Stroke connection is modeled by ligature models • as a separate entity • Viewing a handwritten word as an alternating sequence of character models and ligature models • Network of HMMs • knowledge of language utilized • Hierarchy of Networks • component - character - word - sentence

  30. Hidden Markov Model • Stochastic model of a process with uncertain and incomplete information • Doubly stochastic process • transition parameters model temporal variability • output distributions model spatial variability • Efficient and good modeling tool for • sequences with temporal constraints • spatial variability along the sequence • real world complex processes • Highly successful for speech recognition

  31. Why HMM for Handwritten Character and Ligature? • HMM is a doubly stochastic model • Spatial variation and temporal variation • Well-developed search algorithm • Viterbi algorithm • Well-developed training algorithm • Baum-Welch algorithm • A model is represented by either a path of the HMM or the set of all paths of the HMM

  32. HMM: Three Problems • What is the probability of generating an observation sequence? • calculating the model-input matching score, i.e., the likelihood $P[X = x_1, x_2, \ldots, x_T \mid \lambda] = ?$ • What is the most probable transition sequence? $Q^* = \arg\max_Q P[Q, X \mid \lambda]$ • How do we estimate or optimize the parameters in an HMM? • training problem
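
The first two problems can be sketched for a small discrete HMM as follows. This is a minimal illustration, not the KAIST implementation: the names `forward_likelihood` and `viterbi`, the parameter layout (`pi` initial probabilities, `A` transitions, `B` output probabilities), and the toy numbers are all assumptions, and no scaling or log-domain arithmetic is used, so it is only suitable for short sequences.

```python
def forward_likelihood(obs, pi, A, B):
    """Problem 1: P[X | model], summed over all state sequences (forward algorithm)."""
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    for x in obs[1:]:
        alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][x]
                 for j in range(n)]
    return sum(alpha)


def viterbi(obs, pi, A, B):
    """Problem 2: the most probable state sequence Q* and its joint probability."""
    n = len(pi)
    delta = [pi[i] * B[i][obs[0]] for i in range(n)]
    backptrs = []
    for x in obs[1:]:
        prev = delta
        # For each state j, remember the best predecessor state.
        ptr = [max(range(n), key=lambda i: prev[i] * A[i][j]) for j in range(n)]
        delta = [prev[ptr[j]] * A[ptr[j]][j] * B[j][x] for j in range(n)]
        backptrs.append(ptr)
    last = max(range(n), key=lambda i: delta[i])
    path = [last]
    for ptr in reversed(backptrs):
        path.append(ptr[path[-1]])
    path.reverse()
    return path, delta[last]


# Toy 2-state model over 3 output symbols (illustrative values only).
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]
obs = [0, 1, 2]

print("likelihood:", forward_likelihood(obs, pi, A, B))
print("best path and its probability:", viterbi(obs, pi, A, B))
```

The likelihood sums over every state path, while Viterbi keeps only the single best one, so the Viterbi probability can never exceed the forward likelihood.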

  33. Model of Handwritten Word • View Handwritten word as alternating sequence of characters and ligatures • Handwritten words are modeled as Network of character and ligature models • Ligature as separate entity

  34. HMM for Character and Ligature • Characters and ligatures are viewed as sequences of chain codes • Each character and ligature is modeled as a Hidden Markov Model • Each model produces a probability as its score [Figure labels: ligature model, character model]

  35. Character Model • Character • atomic units of handwriting • consistency in shape • small number of models • Character Model • HMM-based • model variability in time (length) and shape • simple left-to-right model • small number of states ( < 10)

  36. Ligature Model • Ligature • Between-letter stroke pattern • pen-up or pen-down dragging • linear or slightly curved • Ligature Model • HMM-based • represent connecting pattern and variation • simple model structure (1 - 3 states)
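
A simple left-to-right structure of the kind described for character and ligature models can be sketched as a transition matrix in which each state either repeats or advances to the next state. The function name `left_to_right_transitions`, the single shared `stay` probability, and the absorbing last state are assumptions made for illustration; the slides only give the structure and the state counts.

```python
def left_to_right_transitions(n_states, stay=0.5):
    """Build an n_states x n_states left-to-right transition matrix.

    Each state loops on itself with probability `stay` and moves to the next
    state with probability 1 - stay. The last state is absorbing here, which
    is a simplification of this sketch.
    """
    if n_states < 1:
        raise ValueError("need at least one state")
    if not 0.0 <= stay < 1.0:
        raise ValueError("stay must be in [0, 1)")
    A = [[0.0] * n_states for _ in range(n_states)]
    for i in range(n_states):
        if i + 1 < n_states:
            A[i][i] = stay            # self-loop: absorb another chain code
            A[i][i + 1] = 1.0 - stay  # advance to the next part of the shape
        else:
            A[i][i] = 1.0
    return A


character_A = left_to_right_transitions(6)  # fewer than 10 states, per the slide
ligature_A = left_to_right_transitions(2)   # 1 to 3 states, per the slide

for row in ligature_A:
    print(row)
```

The self-loops are what let one model cover samples of different lengths: a longer stroke simply stays in a state for more chain codes.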

  37. Input Data Encoding • 16-directional chain coding • 0 ~ 15 for pen-down movement • 16 ~ 31 for pen-up movement • 32 for small dot [Figure: direction wheels for pen-down movement (codes 0-15) and pen-up movement (codes 16-31)] • Example encoding: (11, 11, 13, 15, 2, 22, 22, 22, 1, 1, 1), i.e., pen-down, pen-up, pen-down
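
The encoding can be sketched as quantizing each pen movement into one of 16 directions. Several details below are assumptions, since the slide does not state them: code 0 points along the positive x axis and codes increase counter-clockwise with y pointing up, a movement shorter than `min_move` is treated as the "small dot" code 32, and each movement takes the pen state of the point it arrives at. The names `chain_code` and `encode` are hypothetical.

```python
import math

SECTOR = 2 * math.pi / 16  # angular width of one of the 16 directions


def chain_code(dx, dy, pen_down=True, min_move=1e-9):
    """Map one movement vector to a code: 0-15 pen-down, 16-31 pen-up, 32 dot."""
    if math.hypot(dx, dy) <= min_move:
        return 32  # assumed handling of a negligible movement
    angle = math.atan2(dy, dx) % (2 * math.pi)
    direction = int(round(angle / SECTOR)) % 16
    return direction if pen_down else direction + 16


def encode(points):
    """Encode a list of (x, y, pen_down) samples as a chain code sequence."""
    codes = []
    for (x0, y0, _), (x1, y1, down) in zip(points, points[1:]):
        codes.append(chain_code(x1 - x0, y1 - y0, pen_down=down))
    return codes


# Right, then up, then a pen-up move to the right.
samples = [(0, 0, True), (1, 0, True), (1, 1, True), (2, 1, False)]
print(encode(samples))
```

A real digitizer stream would normally be resampled and smoothed before this step, which corresponds to the preprocessing mentioned in the training procedure.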

  38. Hidden Markov Modelling • We search for the maximum-probability model $M^*$ such that $P(M^* \mid X) = \max_M P(M \mid X)$; since $P(M \mid X) \propto P(X \mid M)\,P(M)$, this gives $M^* = \arg\max_M P(X \mid M)\,P(M)$ • The HMM produces $P(X \mid M)$, which is interpreted as the degree of matching between model $M$ and the given data sequence $X$
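
The decision rule M* = argmax P(X|M) P(M) is usually evaluated in the log domain to avoid underflow. The sketch below assumes the per-model log-likelihoods log P(X|M) have already been computed by the HMMs; the function name `map_decision`, the model labels, and the numbers are made up for illustration.

```python
import math


def map_decision(log_likelihoods, priors):
    """Pick the model maximizing log P(X|M) + log P(M).

    log_likelihoods: model name -> log P(X|M), e.g. from an HMM forward pass
    priors:          model name -> P(M), must be positive
    """
    scores = {m: ll + math.log(priors[m]) for m, ll in log_likelihoods.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical scores for one input sequence X against two character models.
log_likelihoods = {"a": -12.0, "o": -11.5}

print(map_decision(log_likelihoods, {"a": 0.5, "o": 0.5})[0])  # uniform prior
print(map_decision(log_likelihoods, {"a": 0.7, "o": 0.3})[0])  # prior favors "a"
```

With a uniform prior the rule reduces to maximum likelihood; a non-uniform prior, such as one derived from language knowledge, can change the winner when the likelihoods are close.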

  39. Network Approach for Word Recognition • One HMM for each word is unmanageable • HMMs are interconnected • represents word construction rules from characters and ligatures (English) • represents syllable construction rules from phonetic symbols and ligatures (Hangul) • Node represents start and termination of chain code sequence of the character • Arc represents character HMM

  40. English Word Network • Writing sequence assumed • left-to-right, as the sequence of appearance • Delayed strokes are allowed • Circular Network • Initial node for the start of word • Final node for the end of word • Circular path through ligature arcs • Based on character HMM • Special treatment for delayed stroke • Ligature grouping • based on starting and terminating location

  41. Circular Network for English Words

  42. Hangul Syllable Network • Writing sequence assumed • first consonant, vowel, last consonant if present • Layered network • Initial node for the start of syllable • Final node for the end of syllable • Based on Symbol HMM • Ligature grouping • based on starting and terminating location • Null transition if no last consonant exists

  43. BongNet : Hangul Syllable Model [Figure: layered network from Initial Node to Final Node with layers Consonant, Ligature, Vowel, Ligature, Consonant]

  44. Recognition Problem • A Path corresponds to a Hangul Syllable / English word • Complete sequence of states and arcs from initial node to final node • Recognition • Finding the maximal probability path for given input chain code sequence • Yields optimal segmentation and character label, simultaneously
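
How a single best-path search yields segmentation and labels at the same time can be shown with a deliberately reduced sketch. Here every character and ligature is collapsed to a one-state HMM (a self-loop plus an output distribution over chain codes), and units must alternate character, ligature, character; the actual models described above have several states per unit. The unit names, probabilities, `FLOOR` value, and the uniform split of the exit probability among successor units are assumptions for illustration. Stay probabilities must lie strictly between 0 and 1.

```python
import math

NEG_INF = float("-inf")
FLOOR = 1e-6  # assumed probability for chain codes a unit never emits


def decode(obs, units, is_char):
    """Return (segments, log_score) of the best path through the unit network.

    units:   name -> (stay_probability, {chain_code: probability})
    is_char: name -> True for a character unit, False for a ligature unit
    """
    names = list(units)
    n_kind = {k: sum(1 for u in names if is_char[u] == k) for k in (True, False)}

    def emit(u, x):
        return math.log(units[u][1].get(x, FLOOR))

    # A path must start inside a character, never inside a ligature.
    score = {u: emit(u, obs[0]) if is_char[u] else NEG_INF for u in names}
    backptrs = []
    for x in obs[1:]:
        new_score, ptr = {}, {}
        for u in names:
            # Option 1: stay in the same unit (self-loop).
            best_prev, best = u, score[u] + math.log(units[u][0])
            # Option 2: enter u from a unit of the other kind.
            for v in names:
                if is_char[v] != is_char[u]:
                    leave = (1.0 - units[v][0]) / n_kind[is_char[u]]
                    cand = score[v] + math.log(leave)
                    if cand > best:
                        best_prev, best = v, cand
            new_score[u], ptr[u] = best + emit(u, x), best_prev
        score = new_score
        backptrs.append(ptr)

    # A path must also end inside a character.
    last = max((u for u in names if is_char[u]), key=lambda u: score[u])
    labels = [last]
    for ptr in reversed(backptrs):
        labels.append(ptr[labels[-1]])
    labels.reverse()

    # Collapse per-sample labels into (label, start, end) segments.
    segments, start = [], 0
    for t in range(1, len(labels) + 1):
        if t == len(labels) or labels[t] != labels[start]:
            segments.append((labels[start], start, t))
            start = t
    return segments, score[last]


units = {
    "c1": (0.6, {12: 0.9, 11: 0.05, 13: 0.05}),   # mostly downward pen-down
    "c2": (0.6, {0: 0.9, 1: 0.05, 15: 0.05}),     # mostly rightward pen-down
    "lig": (0.6, {18: 0.9, 17: 0.05, 19: 0.05}),  # pen-up movement
}
is_char = {"c1": True, "c2": True, "lig": False}

segments, log_score = decode([12, 12, 12, 18, 18, 0, 0], units, is_char)
print(segments)
```

The segment boundaries fall out of the backtrace: no separate segmentation pass is run before labeling.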

  45. Max Prob. Path Finding (Hangul)

  46. Max Prob. Path Finding (English)

  47. Model Training Procedure 1) Collect handwriting samples with correct labels 2) Preprocessing and encoding 3) Manual segmentation 4) Collection of the samples for each character and ligature, with ligature grouping to reduce the number of models 5) Estimating model parameters with the Baum-Welch algorithm
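
Step 5 can be sketched as one Baum-Welch re-estimation pass over the segmented samples collected for a single character or ligature. This is a textbook-style sketch under simplifying assumptions, not the procedure actually used at KAIST: probabilities are not scaled (so it suits short sequences only), the toy alphabet has 2 symbols instead of the 33 chain codes, and all function names and numbers are illustrative.

```python
def forward(obs, pi, A, B):
    """alpha[t][i] = P(x_1..x_t, state_t = i)."""
    n = len(pi)
    alpha = [[pi[i] * B[i][obs[0]] for i in range(n)]]
    for x in obs[1:]:
        prev = alpha[-1]
        alpha.append([sum(prev[i] * A[i][j] for i in range(n)) * B[j][x]
                      for j in range(n)])
    return alpha


def backward(obs, A, B):
    """beta[t][i] = P(x_{t+1}..x_T | state_t = i)."""
    n = len(A)
    beta = [[1.0] * n]
    for x in reversed(obs[1:]):
        nxt = beta[0]
        beta.insert(0, [sum(A[i][j] * B[j][x] * nxt[j] for j in range(n))
                        for i in range(n)])
    return beta


def normalize(row, fallback):
    total = sum(row)
    return [v / total for v in row] if total > 0 else list(fallback)


def likelihood(obs, pi, A, B):
    return sum(forward(obs, pi, A, B)[-1])


def baum_welch_step(sequences, pi, A, B):
    """One EM re-estimation pass over all training sequences of one model."""
    n, m = len(pi), len(B[0])
    pi_acc = [0.0] * n
    A_acc = [[0.0] * n for _ in range(n)]
    B_acc = [[0.0] * m for _ in range(n)]
    for obs in sequences:
        alpha, beta = forward(obs, pi, A, B), backward(obs, A, B)
        px = sum(alpha[-1])  # P(obs | current model), assumed > 0
        for t, x in enumerate(obs):
            for i in range(n):
                gamma = alpha[t][i] * beta[t][i] / px  # state occupancy
                B_acc[i][x] += gamma
                if t == 0:
                    pi_acc[i] += gamma
                if t + 1 < len(obs):
                    for j in range(n):  # expected i -> j transition counts
                        A_acc[i][j] += (alpha[t][i] * A[i][j] * B[j][obs[t + 1]]
                                        * beta[t + 1][j] / px)
    return (normalize(pi_acc, pi),
            [normalize(A_acc[i], A[i]) for i in range(n)],
            [normalize(B_acc[i], B[i]) for i in range(n)])


# Toy samples for one unit, with a 2-state left-to-right initial model.
samples = [[0, 0, 1, 1], [0, 1, 1], [0, 0, 0, 1]]
pi, A, B = [1.0, 0.0], [[0.5, 0.5], [0.0, 1.0]], [[0.6, 0.4], [0.4, 0.6]]
for _ in range(5):
    pi, A, B = baum_welch_step(samples, pi, A, B)
print(A)
print(B)
```

Transitions that start at zero stay at zero under re-estimation, so a left-to-right initial model remains left-to-right after training.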
