## Combining Shape and Physical Models for Online Cursive Handwriting Synthesis


Jue Wang (University of Washington), Chenyu Wu (Carnegie Mellon University), Ying-Qing Xu (Microsoft Research Asia), Heung-Yeung Shum (Microsoft Research Asia)

International Journal on Document Analysis and Recognition (IJDAR), 2004

**Introduction**
• Handwriting computing techniques (pen-based devices)
  • Handwriting recognition: makes it possible for computers to understand the information contained in handwriting
  • Handwriting modulation: handwriting editing, error correction, script searching
• Handwriting modeling and synthesis
  • Movement-simulation techniques
    • are based on motor models and try to model the process of handwriting production
    • focus on the representation and analysis of real handwriting signals rather than on handwriting synthesis
  • Shape-simulation methods
    • consider the static shape of the handwriting trajectory
    • are more practical than movement-simulation techniques when dynamic information is not available
    • straightforward approach: synthesize from collected handwritten glyphs
    • learning-based cursive handwriting synthesis approach
• Successful handwriting synthesis algorithm
  • shapes of letters vs. training samples
  • connection between synthesized letters
• A novel cursive handwriting synthesis technique
  • combines the advantages of the shape-simulation and the movement-simulation methods

**Outline**
• Sample collection and segmentation
• Learning strategies
• Synthesis strategies
• Experimental results
• Discussion and conclusion

**Sample Collection**
• About 200 words
• Each letter appears more than 5 times
• The handwriting samples first pass through a low-pass filter and are then re-sampled to produce equidistant points

**Sample Segmentation**
• Overview
  • Segmentation-based recognition method
  • Recognition-based segmentation (relies heavily on the performance of the recognition engine)
• Level-building
  • simultaneously outputs the recognition and segmentation results
  • segmentation and recognition are merged to give an optimal result

**A Two-level Framework**
• Framework of traditional handwriting segmentation approaches
• Temporal handwriting sequence $\{z_1, \dots, z_T\}$
  • $z_t$ is a low-level feature that denotes the coordinate and velocity of the sequence at time $t$

**Segmentation**
• The segmentation problem is to find the identity string $\{I_1, \dots, I_n\}$, with the corresponding segments of the sequence $\{S_1, \dots, S_n\}$, $S_1 = \{z_1, \dots, z_{t_1}\}, \dots, S_n = \{z_{t_{n-1}}, \dots, z_T\}$, that best explain the sequence
• For the training of the writer-independent segmentation system
  • the low-level feature-based segmentation algorithm works well for a small number of writers
  • a script code is calculated from the handwriting data as the middle-level feature

**Middle Level Feature**
• Five kinds of key points are extracted
  • points of maximum/minimum x-coordinate (X+, X-)
  • points of maximum/minimum y-coordinate (Y+, Y-)
  • crossing points
• Average direction of the interval sequence between two adjacent key points
• Samples of each character are divided into several clusters
  • those in the same cluster have a similar structural topology
• Since the length of the script code might not be the same in all cases, the similarity cannot be computed directly
• The script code is modeled as a homogeneous Markov chain
• Given two script codes $T_1$, $T_2$
  • we may compute their stationary distributions and transition matrices $A_1$, $A_2$
  • the similarity between the two script codes is measured by $d(T_1, T_2)$
• The stationary distributions and $A_1$, $A_2$ enter the measure symmetrically
  • this balances the variance of the KL divergence and the difference in code length
• If both the stationary distributions and the transition matrices of two script codes match well, and their code lengths are almost the same, $d(T_1, T_2)$ is close to 1

**Segmentation**
• After introducing the script code as a middle-level feature, the optimization problem is reformulated, which
  • improves the accuracy of segmentation
  • dramatically reduces the computational complexity of level-building

**Learning Strategies**
• Data alignment
  • Trajectory matching
  • Training set alignment
• Shape models

**Trajectory Matching**
• Segmentation and reconstruction of on-line handwritten scripts (Pattern Recognition, 1998)
  • Each piece is a simple arc; points can be sampled equidistantly from it to represent the stroke
• Landmark-point-extraction method
  • pen-down and pen-up points
  • local extrema of curvature
  • inflection points of curvature
• A handwriting sample can be divided into as many as six pieces
• Samples of the same character are mostly composed of the same number of pieces, and these pieces match each other naturally
• A handwriting sample can be represented by a point vector, where
  • $s$: number of static pieces segmented from the sample
  • $n_i$: number of points extracted from the $i$-th piece
• The next step is to align the different vectors into a common coordinate frame
  • estimate an affine transform for each sample that transforms the sample into the coordinate frame
  • affine transformations: translation, rotation, scaling

**Training Set Alignment**
• Iterative algorithm (Learning from one example through shared densities on transforms, IEEE CVPR 2000)
• A deformable energy-based criterion $E$ is defined

**Training Set Alignment - Algorithm**
• Maintain an affine transform matrix $U_i$ for each sample, initially set to the identity
• Compute the deformable energy-based criterion $E$
• Repeat until convergence:
  • For each of the six unit affine matrices [14], $A_j$, $j = 1, \dots, 6$:
    • Form a candidate transform from $U_i$ and $A_j$
    • Apply the candidate to the sample and recalculate the criterion $E$
    • If $E$ has been reduced, accept the candidate; otherwise form the alternative candidate and apply it. If $E$ has now been reduced, accept it; otherwise revert to $U_i$
• End

**Shape Models**
• By modeling the distribution of the aligned vectors, new examples can be generated that are similar to those in the training set
• As in the Active Shape Model, principal component analysis (PCA) is applied to the data (Statistical models of appearance for computer vision, draft report, 2000)
• Formally, the covariance $S$ of the data is calculated
• The eigenvectors and corresponding eigenvalues of $S$ are then computed and sorted by eigenvalue
• The training set is approximated using the $t$ eigenvectors corresponding to the largest eigenvalues
• $b$ is a $t$-dimensional vector of model parameters
• By varying the elements of $b$, new handwriting trajectories can be generated from this model
• Limits are applied to the elements $b_i$

**Synthesis Strategies**
• Generate each individual letter in the word
• The baselines of these letters are then aligned and the letters are juxtaposed in sequence
• Concatenate the letters with their neighbors to form cursive handwriting
  • this cannot be easily achieved
• To solve this problem, a conditional sampling algorithm based on the delta log-normal model is proposed

**Delta Log-normal Model**
• A powerful tool for analyzing rapid human movements
• With respect to handwriting generation, the movement of a simple stroke is controlled by velocity
• The magnitude of the velocity is described in terms of the log-normal function (on a logarithmic scale axis) (Why handwriting segmentation can be misleading?, 13th International Conference on Pattern Recognition, 1996), with parameters
  • $t_0$: activation time
  • $D_i$: amplitude of the impulse commands
  • mean time delay
  • response time of the agonist and antagonist systems
• The angular velocity is calculated as the derivative of the direction angle
• From these quantities, the curvature along a stroke piece is calculated
• The static shape of the piece is an arc, characterized by
  • the initial direction
  • $c_0$: constant (arc length)

**Delta Log-normal Model - Example**
• [Why Handwriting Segmentation Can Be Misleading, IEEE ICPR 1996]

**Conditional Sampling**
• First, the trajectories of the synthesized letters are decomposed into static pieces
• The first piece of a trajectory is called the head piece, and the last piece is called the tail piece
• In the concatenation process, the trajectories of the letters are deformed to produce natural cursive handwriting by changing the parameters of the head and tail pieces
• A deformation energy of a stroke is defined
• A concatenation energy $E_c(i, i+1)$ between the $i$-th letter and the $(i+1)$-th letter is defined
  • by minimizing its second and third terms, the two letters are forced to connect with each other smoothly and naturally
• The concatenation energy of a whole word is calculated
• We must ensure that the deformed letters are consistent with the models
• A sampling energy is calculated
• Finally, the whole energy formulation is given

**Synthesis - Iterative Approach**
1. Randomly generate a vector $b(i)$ for each letter initially
2. Generate the trajectories $S_i$ of the letters and calculate an affine transform $T_i$ for each letter (transforming it to its desired position)
3. For each pair of adjacent letters $\{S_i, S_{i+1}\}$, deform the pieces in these letters to minimize the concatenation energy $E_c(i, i+1)$
4. Project the deformed shape into the model coordinate frame
5. Update the model parameters
6. If not converged, return to step 2

**Discussion & Conclusion**
• Performance is limited by the samples used for training, since the shape models can only generate novel shapes within the variation of the training samples
• Although some experimental results are shown, it is still not known how to evaluate the synthesized scripts objectively or how to compare different synthesis approaches

**Markov chains**
• A Markov chain on a space $X$ with transitions $T$ is a random process (an infinite sequence of random variables) $(x^{(0)}, x^{(1)}, \dots, x^{(t)}, \dots)$ that satisfies the Markov property: the probability of being in a particular state at time $t$, given the state history, depends only on the state at time $t-1$
• If the transition probabilities are fixed for all $t$, the chain is considered homogeneous
• Example with three states $x_1$, $x_2$, $x_3$ and transition matrix

$$T = \begin{pmatrix} 0.7 & 0.3 & 0 \\ 0.3 & 0.4 & 0.3 \\ 0 & 0.3 & 0.7 \end{pmatrix}$$

**Stationary distribution**
• Consider the Markov chain given above
• The stationary distribution is $x = (0.33, 0.33, 0.33)$, which satisfies $xT = x$
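
The stationary distribution of the three-state example can be checked numerically, and the same machinery applies to a script code modeled as a homogeneous Markov chain. The sketch below is illustrative only and is not the authors' implementation. The function names `transition_matrix` and `stationary_distribution` are hypothetical, the script code is simplified to a plain sequence of key-point labels (the average-direction part is ignored), and the uniform fallback for states without outgoing transitions is an assumption.

```python
def transition_matrix(code, states):
    """Estimate a row-stochastic transition matrix from one symbol sequence.

    `code` is a sequence of labels and `states` lists the possible labels.
    Hypothetical helper: counts consecutive pairs and normalizes each row.
    """
    index = {s: k for k, s in enumerate(states)}
    n = len(states)
    counts = [[0.0] * n for _ in range(n)]
    for a, b in zip(code, code[1:]):
        counts[index[a]][index[b]] += 1.0
    matrix = []
    for row in counts:
        total = sum(row)
        # Assumption: a state with no observed outgoing transition
        # is given a uniform row so the matrix stays stochastic.
        matrix.append([c / total for c in row] if total else [1.0 / n] * n)
    return matrix


def stationary_distribution(T, start=None, max_iter=10_000, tol=1e-12):
    """Power iteration: repeat x <- x T until x stops changing."""
    n = len(T)
    x = list(start) if start is not None else [1.0 / n] * n
    for _ in range(max_iter):
        nxt = [sum(x[i] * T[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, x)) < tol:
            return nxt
        x = nxt
    return x


# Transition matrix of the three-state example from the slides.
T = [[0.7, 0.3, 0.0],
     [0.3, 0.4, 0.3],
     [0.0, 0.3, 0.7]]

# Start with all mass on x1 so that the iteration has real work to do.
pi = stationary_distribution(T, start=[1.0, 0.0, 0.0])
print([round(p, 2) for p in pi])  # [0.33, 0.33, 0.33]

# A toy script code made of key-point labels only (a simplification).
A = transition_matrix(["X+", "Y+", "X-", "Y-", "X+", "Y+"],
                      ["X+", "X-", "Y+", "Y-"])
print(A[0])  # transitions out of X+: always to Y+ in this toy sequence
```

Power iteration converges here because every state has a self-loop, so the example chain is aperiodic. A strictly periodic chain would need a different method, such as solving $xT = x$ directly.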