(Off-Line) Cursive Word Recognition

Tal Steinherz

Tel-Aviv University

Cursive Word Recognition
  • Preprocessing
  • Segmentation
  • Feature Extraction
  • Recognition
  • Post Processing
Preprocessing
  • Skew correction
  • Slant correction
  • Smoothing
  • Reference line finding
Segmentation Motivation
  • Given a 2-dimensional image and a model that expects a 1-dimensional input signal, one needs to derive an ordered list of features.
  • Fragmentation is an alternative to segmentation in which the resulting pieces carry no letter-level meaning.
Segmentation Dilemma
  • To segment or not to segment? That’s the question!
  • Sayre’s paradox: “To recognize a letter, one must know where it starts and where it ends; to isolate a letter, one must recognize it first.”
Recognition Model
  • What is the basic (atomic) model?
    • word (remains identical through training and recognition)
    • letter (concatenated on demand during recognition)
  • What are the training implications?
    • specific = total cover (several samples for each word)
    • dynamic = brick cover (samples of various words that together include all possible letters)
Basic Word Model
[Figure: a word model built as a chain of letter sub-models: 1st letter sub-model → . . . → i-th letter sub-model → . . . → last letter sub-model]
Segmentation-Free
  • In a segmentation-free approach, recognition is based on measuring the distance between observation sequences.
Segmentation-Free (cont.)
  • The most popular metric is Levenshtein’s edit distance, where one sequence is transformed into another by atomic operations (insertion, deletion and substitution), each associated with its own cost.
  • Implementations: dynamic programming, HMM
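The dynamic-programming implementation mentioned above can be sketched as follows. This is a minimal illustration, not the presenter's code: the function name `edit_distance` and the default unit costs are assumptions, and in a real recognizer the symbols would be quantized feature vectors rather than characters.

```python
def edit_distance(source, target, ins_cost=1.0, del_cost=1.0, sub_cost=1.0):
    """Weighted Levenshtein distance between two sequences.

    The cost parameters are hypothetical defaults; the slides only say
    that each operation carries its own cost.
    """
    rows, cols = len(source) + 1, len(target) + 1
    # dist[i][j] = cheapest way to turn source[:i] into target[:j]
    dist = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        dist[i][0] = dist[i - 1][0] + del_cost
    for j in range(1, cols):
        dist[0][j] = dist[0][j - 1] + ins_cost
    for i in range(1, rows):
        for j in range(1, cols):
            same = source[i - 1] == target[j - 1]
            dist[i][j] = min(
                dist[i - 1][j] + del_cost,                        # delete
                dist[i][j - 1] + ins_cost,                        # insert
                dist[i - 1][j - 1] + (0.0 if same else sub_cost)  # substitute
            )
    return dist[-1][-1]
```

With unit costs, `edit_distance("kitten", "sitting")` gives 3.0. Raising `sub_cost` above `ins_cost + del_cost` makes the algorithm prefer a deletion followed by an insertion over a substitution.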
Segmentation-Free (demo)
  • Each column was translated into a feature vector.
  • Two types of features:
    • number of zero-crossings
    • gradient of the word’s curve
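A rough sketch of per-column feature extraction is given below. The slides do not define the two features precisely, so this code assumes that a "zero-crossing" is a background-to-ink transition when scanning a column from top to bottom, and that the "gradient of the word's curve" can be approximated by the change in the topmost ink row between adjacent columns. The function name `column_features` is hypothetical.

```python
def column_features(image):
    """Return one (zero_crossings, gradient) pair per image column.

    image: list of rows, each a list of 0 (background) / 1 (ink).
    Both feature definitions are assumptions made for illustration.
    """
    height, width = len(image), len(image[0])
    crossings, top_profile = [], []
    for x in range(width):
        column = [image[y][x] for y in range(height)]
        # Count background-to-ink transitions from top to bottom.
        count = sum(
            1 for prev, cur in zip([0] + column, column)
            if prev == 0 and cur == 1
        )
        crossings.append(count)
        ink_rows = [y for y, value in enumerate(column) if value]
        top_profile.append(ink_rows[0] if ink_rows else None)

    features = []
    for x in range(width):
        prev_top = top_profile[x - 1] if x > 0 else None
        cur_top = top_profile[x]
        # Gradient is 0 where either column has no ink to compare.
        if prev_top is None or cur_top is None:
            gradient = 0
        else:
            gradient = cur_top - prev_top
        features.append((crossings[x], gradient))
    return features
```

The resulting list is the ordered 1-dimensional observation sequence that a segmentation-free model would consume.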
Segmentation-Based
  • In a segmentation-based approach, recognition is based on complete bipartite matching between blocks of primitive segments and the letters of a word.
Segmentation-Based (cont.)
  • The best match is found with the Viterbi dynamic-programming algorithm.
  • An HMM-based implementation is very popular and extends the model’s capabilities.
Segmentation-Based (demo)
  • First the word is heuristically segmented.
  • It is preferable to over-segment a character. Nevertheless, a character must not span more than a predefined number of segments.
  • Each segment is translated into a feature vector.
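The matching of segment blocks to letters can be illustrated with a plain dynamic-programming alignment. This is a sketch in the spirit of Viterbi decoding, not an HMM and not the presenter's implementation: `match_word`, `block_cost` and `max_span` are hypothetical names, and the cost function is assumed to be supplied by some letter classifier.

```python
import math

def match_word(segments, word, block_cost, max_span=3):
    """Assign each letter of `word` a block of 1..max_span consecutive segments.

    block_cost(letter, block) is an assumed, user-supplied dissimilarity.
    Returns (total_cost, spans); spans is [] when no assignment is possible.
    """
    n, m = len(segments), len(word)
    # best[j][i]: minimal cost of explaining the first i segments
    # with the first j letters.
    best = [[math.inf] * (n + 1) for _ in range(m + 1)]
    back = [[0] * (n + 1) for _ in range(m + 1)]
    best[0][0] = 0.0
    for j in range(1, m + 1):
        for i in range(j, n + 1):
            for span in range(1, min(max_span, i) + 1):
                prev = best[j - 1][i - span]
                if prev == math.inf:
                    continue
                cost = prev + block_cost(word[j - 1], segments[i - span:i])
                if cost < best[j][i]:
                    best[j][i] = cost
                    back[j][i] = span
    if best[m][n] == math.inf:
        return math.inf, []
    # Walk the back-pointers to recover how many segments each letter took.
    spans, i = [], n
    for j in range(m, 0, -1):
        spans.append(back[j][i])
        i -= back[j][i]
    return best[m][n], spans[::-1]
```

Running this once per lexicon word and keeping the cheapest word gives a simple lexicon-driven recognizer; the `max_span` limit encodes the rule that a character may not cover more than a predefined number of segments.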
Features in Segments (demo)
  • Global features:
    • ascenders, descenders, loops, i-dots, t-strokes
  • Local features:
    • X crossings, T crossings, end points, sharp curvatures, parametric strokes
  • Non-symbolic features:
    • pixel moments, pixel distributions, contour codings
Pattern Recognition Issues
  • Lexicon size:
    • small (up to 100 words)
    • limited (between 100 and 1000 words)
    • infinite (more than 1000 words)
Word Model Extension
[Figure: a single general word model built from letter sub-HMMs: ‘a’ sub-HMM, . . ., ‘m’ sub-HMM, . . ., ‘z’ sub-HMM]
  • A new approach to performing recognition?
    • path discriminant (a single general word model; each word corresponds to one path, i.e. one hypothesis)
Online vs. Off-Line
  • Online – captured by pen-like devices. The input format is a two-dimensional signal of pixel locations as a function of time, (x(t), y(t)).
  • Off-line – captured by scanning devices. The input format is a two-dimensional m × n image I of gray-scale values as a function of location. Strokes have significant width.
Online vs. Off-Line (cont.)
  • In general, online classifiers are superior to off-line classifiers because some valuable strokes are blurred in the static image. Sometimes temporal information (stroke order) is also needed to distinguish between similar objects.
Online Weaknesses

Sensitivity to variations in stroke order, stroke number and stroke characteristics:

  • Shapes that look alike in the image domain may be produced by different sets of strokes.
  • Many strokes are redundant (consecutive superfluous pixels), a byproduct of the continuous nature of cursive handwriting.
  • Incomplete (open) loops are more frequent.
Off-Line can improve Online
  • Sometimes the off-line representation enables one to recognize words that are not recognized given the online signal.
  • An optimal system would combine online and off-line based classifiers.
The desired integration between online and off-line classifiers
  • Use a single word recognition engine to process both the online and the off-line data.
  • This requires an off-line-to-online transformation that extracts an alternative list of strokes, preserving off-line-like features while keeping a consistent stroke order.

[Figure: processing pipeline, as labeled in the original diagrams
  • Online signal → projection to image domain → bitmap image (stroke width = 1)
  • Online signal → “painting” (thickening the strokes) → real static image (stroke width > 1)
  • The “pseudo-online” transformation → pseudo-online representation → online recognition engine
  • Classification: online classifiers and pseudo-online classifiers → online and pseudo-online classification outputs → integration by some combination scheme → recognition results]
Cursive Handwriting Terms
  • Axis – the main subset of strokes that form the backbone, i.e. the shortest path from left to right, occasionally including loops.
  • Tarsi – the other subsets of connected strokes, which form branches that hang above (ascenders) or below (descenders) the axis.
The Pseudo-Online Transformation
  • Follow the skeleton of the axis from the leftmost pixel until reaching the first intersection with a tarsus.
  • Surround the tarsus by tracking its contour until returning to the intersection point where the tracking started.
  • Continue along the axis to the next intersection with a tarsus, and so on until the rightmost pixel is reached.
  • Loops that are encountered along the axis are also surrounded completely.
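The ordering produced by these steps can be sketched as follows. The code assumes that skeletonization and contour tracing have already been done elsewhere, so it only shows how the pieces are stitched into one time-ordered point list. The function name `pseudo_online_order` and the data layout (an axis point list plus a dictionary of traced contours) are assumptions made for illustration; loops on the axis are assumed to be passed in the same way as tarsi.

```python
def pseudo_online_order(axis, tarsi):
    """Build a pseudo-online point sequence from pre-extracted pieces.

    axis:  list of skeleton points from the leftmost to the rightmost pixel.
    tarsi: dict mapping an axis point (an intersection) to the list of
           contour points traced around the tarsus or loop attached there,
           excluding the intersection point itself.
    """
    sequence = []
    for point in axis:
        sequence.append(point)
        contour = tarsi.get(point)
        if contour:
            sequence.extend(contour)  # walk around the tarsus
            sequence.append(point)    # return to the intersection
    return sequence
```

The output has the same form as an online signal, a list of pixel locations in a consistent order, so it can be fed to an online recognition engine.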
Experimental Setup
  • The online word recognition engine of Neskovic et al., which satisfies trainability and versatility.
  • A combination of 6/12 online and pseudo-online classifiers.
  • Several combination schemes: majority vote, max rule, sum rule.
  • An extension of the HP dataset that can be found in the UNIPEN collection.
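The three combination schemes listed above can be sketched in a few lines. This is a generic illustration, not the experimental code: it assumes each classifier returns a dictionary mapping candidate words to confidence scores on a comparable scale, and the function names are hypothetical.

```python
from collections import Counter

def majority_vote(scores):
    """Each classifier votes for its top-scoring word; the most voted wins."""
    votes = Counter(max(s, key=s.get) for s in scores)
    return votes.most_common(1)[0][0]

def max_rule(scores):
    """Pick the word with the highest single confidence from any classifier."""
    best = {}
    for s in scores:
        for word, conf in s.items():
            best[word] = max(best.get(word, float("-inf")), conf)
    return max(best, key=best.get)

def sum_rule(scores):
    """Pick the word with the highest total confidence over all classifiers."""
    total = Counter()
    for s in scores:
        for word, conf in s.items():
            total[word] += conf
    return max(total, key=total.get)
```

The schemes can disagree: with scores `[{"cat": 0.6, "cot": 0.4}, {"cat": 0.55, "cot": 0.45}, {"cat": 0.05, "cot": 0.95}]`, the majority vote picks "cat", while the max rule and the sum rule both pick "cot" because one classifier is very confident.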
Experimental Setup (cont.)
  • Different training sets of 46 writers.
  • Disjoint validation sets of 9 writers.
  • Disjoint test set of 11 writers.
  • The lexicon contains 862 words.
Result Analysis
  • Word level – in 110 word classes (12.8%), at least 7 word samples (10.6%) were correctly recognized only by the combination with the pseudo-online classifiers.
  • Writer level – for 12 writers (18.2%), at least 65 of the words they produced (7.5%) were correctly recognized only by the combination with the pseudo-online classifiers.
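The percentages above are consistent with denominators taken from the experimental setup. The check below is only a plausibility sketch: it assumes the word-class and per-writer word percentages are relative to the 862-word lexicon, and that the writer and sample percentages are relative to 46 + 9 + 11 = 66 writers with one sample per writer per word. The slides do not state these denominators explicitly.

```python
LEXICON_SIZE = 862     # stated in the experimental setup
WRITERS = 46 + 9 + 11  # assumption: training, validation and test writers pooled

def pct(part, whole):
    """Percentage rounded to one decimal, as reported on the slides."""
    return round(100 * part / whole, 1)

word_classes = pct(110, LEXICON_SIZE)     # reported as 12.8%
samples_per_word = pct(7, WRITERS)        # reported as 10.6%
writers = pct(12, WRITERS)                # reported as 18.2%
words_per_writer = pct(65, LEXICON_SIZE)  # reported as 7.5%
```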
Result Analysis (cont.)
  • 909 of the input words (5.9%) were correctly recognized by at least one pseudo-online classifier and by none of the 12 online classifiers.
  • 357 of the input words (2.3%) were correctly recognized by at least 4 of the 12 pseudo-online classifiers and by none of the 12 online classifiers.
  • For 828 of the input words (5.3%), the difference between the number of pseudo-online and online classifiers that correctly recognized them was 6 or more.
  • The pseudo-online representation does add information that cannot be obtained by optimizing or extending a combination of online classifiers only.