
Farsi Handwritten Word Recognition Using Continuous Hidden Markov Models and Structural Features

M. M. Haji
CSE Department
Shiraz University

January 2005


Outline

  • Introduction

  • Preprocessing

    • Text Segmentation

    • Document Image Binarization

    • Skew and Slant Correction

    • Skeletonization

  • Structural Feature Extraction

  • Multi-CHMM Recognition

  • Conclusion and Discussion


Introduction

  • One of the most challenging problems in Artificial Intelligence.

  • Words are complex patterns, with great variability across handwriting styles.

  • The performance of handwriting recognition systems is still far from human performance, in both accuracy and speed.


Introduction (cont.)

  • Previous Research:

    • Dehghan et al. (2001). "Handwritten Farsi (Arabic) Word Recognition: A Holistic Approach Using Discrete HMM", Pattern Recognition, vol. 34, pp. 1057-1065.

    • Dehghan et al. (2001). "Unconstrained Farsi Handwritten Word Recognition Using Fuzzy Vector Quantization and Hidden Markov Models", Pattern Recognition Letters, vol. 22, pp. 209-214.

    • A maximum recognition rate of 65% for a 198-word lexicon!


Methodology

  • Holistic Strategies

  • Analytical Strategies

    • Implicit Segmentation

    • Explicit Segmentation


Holistic Strategies

  • Recognition on the whole representation of a word.

  • No attempt to segment a word into its individual characters.

  • Necessary to segment the text lines into words.

    • Intra-word space is sometimes greater than inter-word space!


Holistic Strategies (cont.)

  • Using a lexicon, a list of the allowed interpretations of the input word image.

  • The error rate increases with the lexicon size.

  • Successful for postal address recognition and bank check reading, where the lexicon is small and fixed.


Analytical Strategies

Explicit Segmentation:

  • Isolating single letters, which are then recognized separately, usually by neural networks.

  • Successful for English machine-printed text.

  • Arabic/Farsi texts, whether machine-printed or handwritten, are cursive.

  • Cursiveness and character overlapping are the main challenges.


Analytical Strategies (cont.)

Implicit Segmentation:

  • Converting the text (line or word) image into a sequence of small size units.

  • Recognition is performed at this intermediate level, rather than at the word or character level, usually by a Hidden Markov Model (HMM).

  • Each unit may be a part of a letter, so a number of successive units can belong to a single letter.


Text Segmentation

  • Detecting text regions in an image (removing non-text components).

  • Applications in document image analysis and understanding, image compression and content-based image retrieval.

  • Document image binarization and skew correction algorithms usually require a predominant text area in order to obtain an accurate estimate of text characteristics.

  • Numerous methods have been proposed (an extensive literature).

  • There is no general method to detect arbitrary text strings.

  • In the most general form, detection must be:

    • insensitive to noise, background model and lighting conditions and,

    • invariant to text language, color, size, font and orientation, even within the same image!


Text Segmentation (cont.)

  • We believe that a text segmentation algorithm should have adaptation and learning capability.

  • A learner usually needs much time and training data to achieve satisfactory results, which restricts its practicality.

  • A simple procedure was developed for generating training data from manually segmented images.

  • A Naive Bayes Classifier (NBC) was utilized, which is fast in both the training and application phases.

  • Surprisingly excellent results were obtained by this simple classifier!


Text Segmentation (cont.)

  • DCT-18 features

  • 10,000 training instances

  • Naive Bayes Classification, assuming conditionally independent features f1, ..., f18 and equal priors:

Decide Text if P(Text) · ∏k P(fk | Text) > P(Non-text) · ∏k P(fk | Non-text), where P(Text) = P(Non-text) = 0.5.
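A minimal sketch of such a tile classifier, assuming square gray-scale tiles, the 18 lowest-frequency 2D DCT coefficients as the feature vector, and Gaussian class-conditional likelihoods (the slides do not specify the likelihood model); all names are illustrative.

    import numpy as np
    from scipy.fftpack import dct
    from sklearn.naive_bayes import GaussianNB

    def dct18_features(tile):
        # Separable 2D DCT of a gray-scale tile: DCT along rows, then columns.
        c = dct(dct(tile.astype(float), norm='ortho', axis=0), norm='ortho', axis=1)
        # Keep the 18 lowest-frequency coefficients (ordered by anti-diagonal).
        idx = sorted(np.ndindex(c.shape), key=lambda ij: (ij[0] + ij[1], ij))
        return np.array([c[ij] for ij in idx[:18]])

    # X: (n_samples, 18) features from manually segmented tiles; y: 1 = text, 0 = non-text.
    # clf = GaussianNB(priors=[0.5, 0.5]).fit(X, y)   # equal priors, as on the slide
    # is_text = clf.predict(dct18_features(tile).reshape(1, -1))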


Binarization

  • Converting gray-scale images into two-level images.

  • Many vision algorithms and operators only handle two-level images.

  • Applied in the early steps of a vision algorithm.

  • Selecting a proper threshold surface.

  • Challenging for images with poor contrast, strong noise and variable histogram modality.

  • Global and local (adaptive) algorithms.

  • General and special-purpose algorithms.


Binarization (cont.)

Four different algorithms for document image binarization were compared and contrasted:

  • Otsu, N. (Jan. 1979). "A Threshold Selection Method from Gray Level Histograms", IEEE Trans. on Systems, Man and Cybernetics, vol. 9, pp. 62-66. [global, general-purpose]

  • Niblack, W. (1989). An Introduction to Digital Image Processing, Prentice Hall, Englewood Cliffs, pp. 115-116. [local, general-purpose]

  • Wu, V. and Manmatha, R. (Jan. 1998). "Document Image Clean-Up and Binarization", Proceedings of SPIE Conference on Document Recognition. [local, special-purpose]

  • Liu, Y. and Srihari, S. N. (May 1997). "Document Image Binarization Based on Texture Features", IEEE Trans. on PAMI, vol. 19(5), pp. 540-544. [global, special-purpose]
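As an illustration of the first method (global, general-purpose), a small numpy sketch of Otsu threshold selection for an 8-bit gray-scale image; it picks the level that maximizes the between-class variance of the histogram:

    import numpy as np

    def otsu_threshold(gray):
        """Otsu (1979): the threshold maximizing between-class variance."""
        p = np.bincount(gray.ravel(), minlength=256).astype(float)
        p /= p.sum()
        omega = np.cumsum(p)                    # probability of the background class
        mu = np.cumsum(p * np.arange(256))      # cumulative mean
        mu_T = mu[-1]                           # global mean
        with np.errstate(divide='ignore', invalid='ignore'):
            sigma_b = (mu_T * omega - mu) ** 2 / (omega * (1.0 - omega))
        return int(np.nanargmax(sigma_b))

    # binary = gray > otsu_threshold(gray)      # two-level image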


Binarization (cont.)

[Figure: a sample input image and its histogram, with binarization results from Niblack, Otsu, Wu-Manmatha, and Liu-Srihari.]


Binarization (cont.)

  • Quality improvement by preprocessing and postprocessing.

  • Preprocessing (super-resolution):

    • Taylor, M. J. and Dance, C. R. (Sep. 1998). "Enhancement of Document Images from Cameras", Proceedings of SPIE Conference on Document Recognition, pp. 230-241.

  • Postprocessing:

    • Trier, D. and Taxt, T. (March 1995). "Evaluation of Binarization Methods for Document Images", IEEE Trans. on PAMI, vol. 17(3), pp. 312-315.


Skew Correction

  • The angle by which text lines deviate from the x-axis.

  • Page decomposition techniques require properly aligned images as input.

  • 3 types:

    • global skew

    • multiple skew

    • non-uniform skew

  • "Skew correction" is performed by a rotation after "skew detection".


Skew Correction (cont.)

  • Categories based on the underlying techniques:

    • Projection Profile

    • Correlation

    • Hough Transform

    • Mathematical Morphology

    • Fourier Transform

    • Artificial Neural Networks

    • Nearest-Neighbor Clustering


Skew Correction (cont.)

  • The projection profile at the global skew angle of the document has narrow peaks and deep valleys.


Skew Correction (cont.)

  • Projection profile technique: compute the profile at each candidate angle and pick the angle that maximizes a goodness measure of the profile, one that rewards narrow peaks and deep valleys (e.g. the sum of squared bin counts).
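A sketch of the technique on a binary image whose foreground pixels are nonzero. The goodness measure used here, the sum of squared profile bins, is one common choice; the speedups listed on the next slide (angle-range limiting, binary search, downsampling) are omitted:

    import numpy as np

    def profile_goodness(binary, angle_deg):
        """Sum pixels along parallel lines at the candidate angle and score
        the resulting profile; no image rotation is needed."""
        ys, xs = np.nonzero(binary)
        t = np.deg2rad(angle_deg)
        rows = ys * np.cos(t) - xs * np.sin(t)   # distance across text lines
        profile, _ = np.histogram(rows, bins=binary.shape[0])
        return np.square(profile).sum()          # peaks when lines are aligned

    def detect_skew(binary, lo=-15.0, hi=15.0, step=0.25):
        """Coarse grid search over a limited angle range."""
        angles = np.arange(lo, hi + step, step)
        return max(angles, key=lambda a: profile_goodness(binary, a))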


Skew Correction (cont.)

  • Limiting the range of skew angles.

  • Binary search for finding the maximizer of a function.

  • Computing the sum of pixels along parallel lines at an angle, instead of rotating the image by that angle.

  • Reducing the size of the input image, as long as the structure of the text lines is preserved.

    • MIN, MAX downsampling

  • Local skew correction, after line segmentation, by robust line fitting.


Slant Correction

[Figure: examples of words with uniform and non-uniform slant.]


Slant Correction (cont.)

  • The deviation of average near-vertical strokes from the vertical direction.

  • Occurring in handwritten and machine-printed texts.

    اراک (Arak)

  • Slant is non-informative.

  • The average slant angle is estimated first, and then a shear transformation in the horizontal direction is applied to the word (or line) image to correct its slant.


Slant Correction (cont.)

  • The most effective methods are based on the analysis of vertical projection profiles (histograms) at various angles.

  • Identical to the projection profile based methods for skew correction, except that:

    • The histograms are computed in the vertical rather than the horizontal direction.

    • A shear transformation is used instead of rotation.

  • Accurate result for handwritten words with uniform slant.

  • Robust to noise.


Slant Correction (cont.)

  • Projection profile technique: shear the word image at each candidate angle, compute its vertical projection profile, and pick the angle that maximizes the goodness measure of the profile.
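The slant analogue of the skew sketch above: shear instead of rotate, and project vertically; names are illustrative:

    import numpy as np

    def slant_goodness(binary, angle_deg):
        """Vertical profile after a horizontal shear; near-vertical strokes
        collapse into narrow columns when the shear cancels the slant."""
        ys, xs = np.nonzero(binary)
        cols = xs - ys * np.tan(np.deg2rad(angle_deg))
        profile, _ = np.histogram(cols, bins=binary.shape[1])
        return np.square(profile).sum()

    def detect_slant(binary, lo=-45.0, hi=45.0, step=1.0):
        angles = np.arange(lo, hi + step, step)
        return max(angles, key=lambda a: slant_goodness(binary, a))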


Slant Correction (cont.)

  • Postprocessing:

    • Smoothing jagged edges.

[Figure: a part of a slanted word, after slant correction, and after smoothing.]


Skeletonization

  • Skeletonization, or the medial axis transform (MAT), of a shape has been one of the most studied problems in image processing and machine vision.

  • A skeletonization (thinning) algorithm transforms a shape into arcs and curves of unit thickness, called the skeleton.

  • An ideal skeleton has the following properties:

    • retaining basic structural properties of the original shape

    • well-centered

    • well-connected

    • precisely reconstructable

    • robust


Skeletonization (cont.)

  • Simplifying classification:

    • Diminishing variability and distortion of instances of one class.

    • Reducing the amount of data to be handled.

  • Proved to be effective in pattern recognition problems:

    • Character recognition

    • Fingerprint recognition

    • Chromosome recognition

  • Providing compact representations and structural analysis of objects.


Skeletonization (cont.)

Five different skeletonization algorithms were compared and contrasted with the main focus on preserving text characteristics:

  • Naccache, N. J. and Shinghal, R. (1984). "SPTA: A Proposed Algorithm for Thinning Digital Patterns", IEEE Trans. on Systems, Man and Cybernetics, vol. SMC-14(3), pp. 409-418.

  • Zhang, T. Y. and Suen, C. Y. (1984). "A Fast Parallel Algorithm for Thinning Digital Patterns", Comm. ACM, vol. 27(3), pp. 236-239.

  • Ji, L. and Piper, J. (1992). "Fast Homotopy-Preserving Skeletons Using Mathematical Morphology", IEEE Trans. on PAMI, vol. 14(6), pp. 653-664.

  • Sajjadi, M. R. (Oct. 1996). "Skeletonization of Persian Characters", M. Sc. Thesis, Computer Science and Engineering Department, Shiraz University, Iran.

  • Huang, L., Wan, G. and Liu, C. (2003). "An Improved Parallel Thinning Algorithm", Proceedings of the Seventh International Conference on Document Analysis and Recognition (ICDAR 2003), pp. 780-783.
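Of the five, Zhang-Suen is the simplest to state. A compact, unoptimized sketch operating on a 0/1 numpy array:

    import numpy as np

    def zhang_suen(img):
        """Zhang-Suen (1984) parallel thinning of a 0/1 binary image."""
        img = np.pad(img.astype(np.uint8), 1)       # guard border
        while True:
            deleted = 0
            for step in (0, 1):                     # the two subiterations
                marked = []
                for y, x in zip(*np.nonzero(img)):
                    # 8 neighbours, clockwise from north: P2..P9
                    P = [img[y-1,x], img[y-1,x+1], img[y,x+1], img[y+1,x+1],
                         img[y+1,x], img[y+1,x-1], img[y,x-1], img[y-1,x-1]]
                    B = sum(P)                      # foreground neighbours
                    A = sum(P[i] == 0 and P[(i+1) % 8] == 1 for i in range(8))
                    P2, P4, P6, P8 = P[0], P[2], P[4], P[6]
                    cond = (P2*P4*P6 == 0 and P4*P6*P8 == 0) if step == 0 \
                           else (P2*P4*P8 == 0 and P2*P6*P8 == 0)
                    if 2 <= B <= 6 and A == 1 and cond:
                        marked.append((y, x))
                for y, x in marked:
                    img[y, x] = 0
                deleted += len(marked)
            if deleted == 0:
                return img[1:-1, 1:-1]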


Skeletonization (cont.)

[Figure: skeletons of a sample input produced by SPTA, Zhang-Suen, Homotopy-Preserving, DTSA, and Huang et al.]


Skeletonization (cont.)

[Figure: skeletons of a second sample input produced by the same five algorithms.]


Skeletonization (cont.)

[Figure: robustness to border noise. Skeletons of a noisy input produced by SPTA, DTSA, and Huang et al.]


Skeletonization (cont.)

  • Postprocessing:

    • Removing spurious branches


Skeletonization (cont.)

  • Modification:

    • Removing 4-connectivity while preserving 8-connectivity of the pattern.


Structural Feature Extraction

  • The connectivity number Cn classifies each skeleton pixel:

    • dot: Cn = 0

    • end-point: Cn = 1

    • connection point: Cn = 2

    • branch-point: Cn = 3

    • cross-point: Cn = 4
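A common way to compute Cn is the crossing number: the count of 0-to-1 transitions while walking once around the pixel's eight neighbours. A sketch (the thesis may use a slightly different definition); skel is a 0/1 array and (y, x) is assumed to lie away from the border:

    def connectivity_number(skel, y, x):
        """Crossing number of a skeleton pixel: 0 = dot, 1 = end-point,
        2 = connection point, 3 = branch-point, 4 = cross-point."""
        P = [skel[y-1,x], skel[y-1,x+1], skel[y,x+1], skel[y+1,x+1],
             skel[y+1,x], skel[y+1,x-1], skel[y,x-1], skel[y-1,x-1]]
        return sum(P[i] == 0 and P[(i+1) % 8] == 1 for i in range(8))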


Structural Feature Extraction (cont.)

  • Structural features are capable of tolerating much variation.

  • Not robust to noise.

  • Hard to extract.

  • A 1D HMM needs a 1D observation sequence.

  • Converting the 2D word image into a 1D signal:

    • Speech recognition and online handwriting recognition: 1D signals.

    • Offline handwriting recognition: a 2D signal.


Structural Feature Extraction (cont.)

  • Converting the word skeleton into a graph.

    • Tracing the edges in a canonical order:
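A sketch of the conversion, reusing connectivity_number from above: feature points (Cn != 2) become nodes, and each stroke between feature points is traced pixel by pixel into an edge. Simplified for clarity; the exact canonical order used in the thesis is not specified here:

    import numpy as np

    # 8-neighbour offsets, clockwise from north (same order as connectivity_number).
    OFFS = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]

    def skeleton_to_graph(skel):
        """Nodes: feature points of the skeleton (Cn != 2). Edges: strokes
        traced between feature points, kept as pixel paths."""
        fg = set(zip(*np.nonzero(skel)))
        nodes = {p for p in fg if connectivity_number(skel, *p) != 2}
        visited, edges = set(), []
        for start in sorted(nodes):              # canonical (scanline) order
            for dy, dx in OFFS:
                cur, path = (start[0] + dy, start[1] + dx), [start]
                while cur in fg and cur not in nodes and cur not in visited:
                    visited.add(cur)
                    path.append(cur)
                    step = [(cur[0] + oy, cur[1] + ox) for oy, ox in OFFS
                            if (cur[0] + oy, cur[1] + ox) in fg
                            and (cur[0] + oy, cur[1] + ox) != path[-2]
                            and (cur[0] + oy, cur[1] + ox) not in visited]
                    if not step:
                        break
                    cur = step[0]
                if cur in nodes:
                    path.append(cur)
                    edges.append(path)           # node-to-node edges appear twice;
        return nodes, edges                      # deduplicate downstream if needed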


Structural Feature Extraction (cont.)

  • Loop Extraction:

    • Important distinctive features.

    • Reducing the number of strokes:

      • Easier Modeling

      • Lower Computational Cost

  • Different types of loops:

    • simple-loop

    • multi-link-loop

    • double-loop

  • A DFS algorithm was written to find complex loops in the word graph.
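A sketch of such a DFS for a simple adjacency-list graph: every back edge encountered during the search closes a loop. The actual word graph is a multigraph (multi-link and double loops), so there edges would be tracked by identity rather than by parent node:

    def find_loops(adj):
        """Depth-first search reporting every cycle closed by a back edge.
        adj: dict mapping each node to the list of its neighbours."""
        loops, stack, pos, seen = [], [], {}, set()

        def dfs(u, parent):
            seen.add(u)
            pos[u] = len(stack)
            stack.append(u)
            for v in adj[u]:
                if v == parent:
                    continue                 # don't go straight back
                if v in pos:                 # back edge: stack[pos[v]:] is a loop
                    loops.append(stack[pos[v]:] + [v])
                elif v not in seen:
                    dfs(v, u)
            stack.pop()
            del pos[u]

        for s in adj:
            if s not in seen:
                dfs(s, None)
        return loops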


Structural Feature Extraction (cont.)

[Figure: Farsi letter forms containing loops, e.g. هـ، ـهـ، ـصـ، ـط، ـمـ، ـو، ـه، صـ، ص، ف، مـ، و، ...]


Structural Feature Extraction (cont.)

  • Each edge is transformed into a 10D feature vector:

    • Normalized length feature (f1)

    • Curvature feature (f2)

    • Slope feature (f3)

    • Connection type feature (f4)

    • Endpoint distance feature (f5)

    • Number of segments feature (f6)

    • Curved features (f7-f10)

  • Independent of the baseline location.

  • Invariant to scaling, translation and rotation.


Structural Feature Extraction (cont.)

Sample 10D feature vectors for three edges:

1: [0.68, 1.00, 6, 0 , 0.05, 1, 0.0, 0.0, 0.7, 0.0]

2: [0.11, 1.01, 6, 1 , 0.23, 1, 0.0, 0.0, 0.0, 0.0]

3: [2.00, 3.00, 8, 10, 0.00, 0, 0.0, 0.0, 0.0, 0.0]

...


Hidden Markov Models

  • Signal Modeling:

    • Deterministic

    • Stochastic:

      • Characterizing the signal by a parametric random process.

  • HMM is a widely used statistical (stochastic) model:

    • The most widely used technique in modern ASR systems.

  • Speech and handwritten text are similar:

    • Symbols with ambiguous boundaries.

    • Symbols with variations in appearance.

  • Instead of modeling the whole pattern as a single feature vector, HMMs exploit the relationship between consecutive segments.


Hidden Markov Models (cont.)

  • Nondeterministic finite state machines:

    • Probabilistic state transitions.

    • Each state is associated with a random function.

    • The state sequence is unknown (hidden).

    • Only a probabilistic function of the state sequence can be observed.


Hidden Markov Models (cont.)

N: the number of states of the model

S = {s1, s2, ..., sN}: the set of states

Π = {πi = P(si at t = 1)}: the initial state probabilities

A = {aij = P(sj at t+1 | si at t)}: the state transition probabilities

M: the number of observation symbols

V = {v1, v2, ..., vM}: the set of possible observation symbols

B = {bi(vk) = P(vk at t | si at t)}: the symbol emission probabilities

Ot: the observed symbol at time t

T: the length of the observation sequence

λ = (A, B, Π): the compact notation for the HMM


Left-to-Right HMMs

[Figure: a 5-state left-to-right HMM.]

[Figure: a 5-state left-to-right HMM with a maximum relative forward jump of 2.]
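For illustration, the transition structure of the second model can be written down directly: from each state the model may stay, advance one state, or jump two states ahead. A small numpy sketch with a uniform initialization (trained values would come from Baum-Welch):

    import numpy as np

    N, MAX_JUMP = 5, 2          # 5 states, maximum relative forward jump of 2
    A = np.zeros((N, N))
    for i in range(N):
        j = np.arange(i, min(i + MAX_JUMP + 1, N))   # allowed successors of state i
        A[i, j] = 1.0 / len(j)                       # uniform over allowed transitions
    pi = np.zeros(N)
    pi[0] = 1.0                 # a left-to-right HMM always starts in the first state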


Hidden Markov Models (cont.)

The Three Fundamental Problems:

1. Given a model λ = (A, B, Π), how do we compute P(O | λ), the probability of occurrence of the observation sequence O = O1, O2, ..., OT?

The Forward-Backward Algorithm

2. Given the observation sequence O and a model λ, how do we choose a state sequence S = s1, s2, ..., sT so that P(O, S | λ) is maximized, i.e. the state sequence that best explains the observations?

The Viterbi Algorithm

3. Given the observation sequence O, how do we adjust the model parameters λ = (A, B, Π) so that P(O | λ) or P(O, S | λ) is maximized, i.e. how do we find the model that best explains the observed data?

The Baum-Welch Algorithm, The Segmental K-means Algorithm
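Problem 2 is the one the recognizer solves at run time. A compact log-domain Viterbi sketch (it serves discrete HMMs and CHMMs alike, since only the log-emission matrix logB differs):

    import numpy as np

    def viterbi(pi, A, logB):
        """pi: (N,) initial probs; A: (N,N) transition probs;
        logB: (T,N) with logB[t, i] = log b_i(O_t).
        Returns the best state path and its log-probability."""
        T, N = logB.shape
        with np.errstate(divide='ignore'):       # zeros become -inf (disallowed)
            log_pi, log_A = np.log(pi), np.log(A)
        delta = log_pi + logB[0]
        psi = np.zeros((T, N), dtype=int)
        for t in range(1, T):
            scores = delta[:, None] + log_A      # scores[i, j]: best path ending i -> j
            psi[t] = scores.argmax(axis=0)
            delta = scores.max(axis=0) + logB[t]
        path = [int(delta.argmax())]
        for t in range(T - 1, 0, -1):            # backtrack
            path.append(int(psi[t, path[-1]]))
        return path[::-1], float(delta.max())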


Hidden Markov Models (cont.)

  • Discrete HMM:

    • Discrete observation sequences: V = {v1, v2, ..., vM}.

    • A codebook obtained by Vector Quantization (VQ).

      • Codebook size?

    • Distortion: information loss due to the quantization error!

  • Continuous Hidden Markov Model (CHMM):

    • Overcoming the distortion problem.

    • Requiring more parameters → more memory

    • Requiring more careful initialization techniques:

      • Training may diverge with randomly selected initial parameters!


Hidden Markov Models (cont.)

Multivariate Gaussian mixture emission density:

bi(o) = Σ(m=1..M) cim N(o; μim, ∑im)

cim: the mth mixture gain coefficient in state i

μim: the mean of the mth mixture in state i

∑im: the covariance of the mth mixture in state i

M: the number of mixtures used

K: the dimensionality of the observation space
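A numpy sketch of this emission density, computed in the log domain so it can plug into the Viterbi sketch above; names are illustrative:

    import numpy as np

    def log_gaussian(o, mu, cov):
        """Log of the K-variate Gaussian density N(o; mu, cov)."""
        K = len(mu)
        d = o - mu
        _, logdet = np.linalg.slogdet(cov)
        return -0.5 * (K * np.log(2.0 * np.pi) + logdet + d @ np.linalg.solve(cov, d))

    def log_emission(o, c, mus, covs):
        """log b_i(o) for one state i: mixture gains c[m], means mus[m],
        covariances covs[m]; combined with log-sum-exp for stability."""
        logs = np.log(c) + np.array([log_gaussian(o, mus[m], covs[m])
                                     for m in range(len(c))])
        m = logs.max()
        return m + np.log(np.exp(logs - m).sum())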


The Block Diagram of the Recognition System

[Figure: block diagram of the recognition system.]


The Class Diagram of the Experimental Recognition System

[Figure: class diagram of the experimental recognition system.]


An Overview of the Complete System

  • Two-Stage Skew Correction

  • Postponed Binarization


Training Data

  • The recognition system was trained and evaluated on a dataset of 100 city names of Iran.

  • A pattern recognition problem with 100 classes was considered.

  • Most samples in the dataset were automatically generated by a Java program that draws the input string with different fonts, sizes and orientations on an output image.

  • The dataset contains 150 samples for each word.


Training Data (cont.)

[Figures: sample images from the training dataset.]
Experimental Results

[Figures: five sample word images, each correctly 1-best recognized.]


Experimental Results (cont.)

3-best recognized:

1. زنجان (Zanjan) 2. اصفهان (Esfahan) 3. دامغان (Damghan)


Experimental Results (cont.)

4-best recognized:

1. قشم (Qeshm) 2. قم (Qom) 3. مرند (Marand) 4. مشهد (Mashhad)


Experimental Results (cont.)

[Figures: three sample word images that were not N-best recognized, for N ≤ 20.]


Conclusion

  • The first work to use CHMMs with structural features to recognize Farsi handwritten words.

  • A complete offline recognition system for Farsi handwritten words.

  • A new machine learning approach based on the NBC for text segmentation.

  • Comparing and contrasting different algorithms for:

    • Binarization

    • Skew and Slant Correction

    • Skeletonization

  • Excellent generalization performance.

  • A maximum recognition rate of 82% on our 100-word dataset.


Thanks for your attention

Please feel free to ask any questions.

[email protected]

