
Combining Shape and Physical Models for Online Cursive Handwriting Synthesis

Combining Shape and Physical Models for Online Cursive Handwriting Synthesis. Jue Wang (University of Washington) Chenyu Wu (Carnegie Mellon University) Ying-Qing Xu (Microsoft Research Asia) Heung-Yeung Shum (Microsoft Research Asia).



Presentation Transcript

  1. Combining Shape and Physical Models for Online Cursive Handwriting Synthesis • Jue Wang (University of Washington), Chenyu Wu (Carnegie Mellon University), Ying-Qing Xu (Microsoft Research Asia), Heung-Yeung Shum (Microsoft Research Asia) • International Journal on Document Analysis and Recognition (IJDAR), 2004

  2. Introduction • Handwriting computing techniques (pen-based devices) • Handwriting recognition • make it possible for computers to understand the information involved in handwriting • Handwriting modulation • handwriting editing, error correction, script searching

  3. Introduction • Handwriting Modeling & Synthesis • Movement-simulation techniques • based on motor models; they try to model the process of handwriting production • focus on the representation and analysis of real handwriting signals rather than on handwriting synthesis

  4. Introduction • Shape-simulation methods • consider the static shape of the handwriting trajectory • more practical than movement-simulation techniques when dynamic information is not available • straightforward approach: synthesize from collected handwritten glyphs • learning-based cursive handwriting synthesis approach

  5. Introduction • Requirements for a successful handwriting synthesis algorithm • shapes of the synthesized letters should resemble the training samples • connections between synthesized letters should look natural • A novel cursive handwriting synthesis technique • combines the advantages of the shape-simulation and the movement-simulation methods

  6. Outline • Sample collection and segmentation • Learning strategies • Synthesis Strategies • Experimental results • Discussion and Conclusion

  7. Sample Collection • About 200 words • Each letter appears more than 5 times • The handwriting samples first pass through a low-pass filter and are then re-sampled to produce equidistant points
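
The preprocessing on this slide (low-pass filtering followed by equidistant re-sampling) can be sketched as below. This is a minimal illustration, not the authors' implementation: the moving-average filter, the function names `moving_average` and `resample_equidistant`, and the `window` and `spacing` parameters are all assumptions, since the slides do not say which filter or point spacing was used.

```python
import math

def moving_average(points, window=3):
    """Smooth a pen trajectory with a centered moving average (an assumed low-pass filter)."""
    half = window // 2
    smoothed = []
    for i in range(len(points)):
        lo, hi = max(0, i - half), min(len(points), i + half + 1)
        xs = [p[0] for p in points[lo:hi]]
        ys = [p[1] for p in points[lo:hi]]
        smoothed.append((sum(xs) / len(xs), sum(ys) / len(ys)))
    return smoothed

def resample_equidistant(points, spacing):
    """Resample a polyline so consecutive output points lie `spacing` apart along the path."""
    if not points:
        return []
    resampled = [points[0]]
    carried = 0.0  # path length walked since the last emitted point
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        seg = math.hypot(x1 - x0, y1 - y0)
        if seg == 0.0:
            continue  # skip repeated points
        d = spacing - carried  # distance from the segment start to the next output point
        while d <= seg:
            t = d / seg
            resampled.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
            d += spacing
        carried = seg - (d - spacing)
    return resampled

stroke = [(0, 0), (3, 0), (3, 4)]
print(resample_equidistant(moving_average(stroke), 0.5))
```

Spacing is measured along the path of the input polyline, so at sharp corners the straight-line distance between neighboring output points can be slightly smaller than `spacing`.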

  8. Sample Segmentation • Overview • Segmentation-based recognition • Recognition-based segmentation (relies heavily on the performance of the recognition engine) • Level-building • simultaneously outputs the recognition and segmentation results • segmentation and recognition are merged to give an optimal result

  9. A Two-level Framework • Framework of traditional handwriting segmentation approaches • Temporal handwriting sequence $Z = \{z_1, \dots, z_T\}$ • $z_t$ is a low-level feature that denotes the coordinate and velocity of the sequence at time t

  10. Segmentation • The segmentation problem is to find the identity string $\{I_1, \dots, I_n\}$, with the corresponding segments of the sequence $\{S_1, \dots, S_n\}$, $S_1 = \{z_1, \dots, z_{t_1}\}, \dots, S_n = \{z_{t_{n-1}}, \dots, z_T\}$, that best explain the sequence

  11. Segmentation • Training a writer-independent segmentation system • the low-level feature-based segmentation algorithm works well for a small number of writers • a script code is calculated from the handwriting data as the middle-level feature

  12. Middle Level Feature • Five kinds of key points are extracted • points of maximum/minimum x-coordinate (X+, X-) • points of maximum/minimum y-coordinate (Y+, Y-) • crossing points • Average direction of the interval sequence between two adjacent key points
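
A rough sketch of how the four extremum-type key points could be located on a sampled stroke is given below. It is an illustration under assumptions: the function name `extract_key_points` and the three-point neighborhood test are hypothetical, extrema are taken to be local ones, and crossing points and the average direction between key points are left out because the slides do not describe how they are computed.

```python
def extract_key_points(points):
    """Return (index, label) pairs for local extrema of x and y along a stroke.

    Labels follow the slide: X+ / X- for local maxima / minima of the
    x-coordinate, Y+ / Y- for local maxima / minima of the y-coordinate.
    Crossing points are not detected in this sketch.
    """
    keys = []
    for i in range(1, len(points) - 1):
        (xp, yp), (x, y), (xn, yn) = points[i - 1], points[i], points[i + 1]
        # Strict on one side and non-strict on the other, so a flat run
        # of equal values yields a single key point at its start.
        if x > xp and x >= xn:
            keys.append((i, "X+"))
        if x < xp and x <= xn:
            keys.append((i, "X-"))
        if y > yp and y >= yn:
            keys.append((i, "Y+"))
        if y < yp and y <= yn:
            keys.append((i, "Y-"))
    return keys

# A diamond-shaped loop: top, right-most and bottom points are key points.
loop = [(0, 0), (1, 2), (2, 0), (1, -2), (0, 0)]
print(extract_key_points(loop))
```

On real pen data this test would normally be applied after smoothing, since sampling noise creates many spurious local extrema.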

  13. Script Codes Examples

  14. Middle Level Feature • Samples of each character are divided into several clusters • those in the same cluster have a similar structural topology • Since script codes may differ in length, their similarity cannot be computed directly • The script code is therefore modeled as a homogeneous Markov chain
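
Modeling a symbol sequence as a homogeneous Markov chain amounts to estimating one fixed transition matrix from the sequence, which gives codes of different lengths a representation of the same size. The sketch below shows that step only. The function name `transition_matrix`, the four-symbol alphabet, and the uniform fallback for unseen symbols are assumptions for illustration; the actual script code alphabet in the paper is richer.

```python
def transition_matrix(code, alphabet):
    """Estimate a row-stochastic transition matrix from a symbol sequence."""
    index = {symbol: k for k, symbol in enumerate(alphabet)}
    n = len(alphabet)
    counts = [[0] * n for _ in range(n)]
    for current, following in zip(code, code[1:]):
        counts[index[current]][index[following]] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # Assumption: a symbol that never appears as a predecessor gets a uniform row.
        matrix.append([c / total if total else 1.0 / n for c in row])
    return matrix

alphabet = ["X+", "X-", "Y+", "Y-"]
short_code = ["X+", "Y+", "X-", "Y-"]
long_code = ["X+", "Y+", "X-", "Y-", "X+", "Y+", "X-"]
# Both codes map to 4x4 matrices regardless of their length.
print(transition_matrix(short_code, alphabet))
print(transition_matrix(long_code, alphabet))
```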

  15. Middle Level Feature • Given two script codes $T_1$, $T_2$ • We may compute their stationary distributions $\pi_1$, $\pi_2$ and transition matrices $A_1$, $A_2$ • The similarity $d(T_1, T_2)$ between the two script codes is computed from these quantities

  16. Middle Level Feature • $\pi_1$, $\pi_2$, $A_1$, $A_2$ enter the measure symmetrically • the measure balances the variance of the KL divergence against the difference in code length • If both the stationary distributions and the transition matrices of two script codes match well, and their code lengths are almost the same → $d(T_1, T_2)$ is close to 1

  17. Segmentation • Introducing the script code as a middle-level feature changes the optimization problem • improves the accuracy of segmentation • dramatically reduces the computational complexity of level-building

  18. Graph Model

  19. Result

  20. Outline • Sample collection and segmentation • Learning strategies • Synthesis Strategies • Experimental results • Discussion and Conclusion

  21. Learning Strategies • Data alignment • Trajectory matching • Training set alignment • Shape models

  22. Trajectory Matching • Segmentation and reconstruction of on-line handwritten scripts (Pattern Recognition, 1998) • Each piece is a simple arc, so points can be equidistantly sampled from it to represent the stroke

  23. Trajectory Matching • Landmark-point-extraction method • pen-down, pen-up points • local extrema of curvature • inflection points of curvature • A handwriting sample can be divided into as many as six pieces • Samples of the same character are mostly composed of the same number of pieces, and these pieces match each other naturally

  24. Trajectory Matching • A handwriting sample can be represented by a point vector • s: number of static pieces segmented from the sample • $n_i$: number of points extracted from the i-th piece

  25. Trajectory Matching • The next step is to align the different vectors into a common coordinate frame • estimate an affine transform for each sample that maps the sample into that coordinate frame • Affine transformations: translation, rotation, scaling

  26. Training Set Alignment • Iterative algorithm (Learning from one example through shared densities on transforms, IEEE CVPR 2000) • The alignment is driven by a deformable energy-based criterion E

  27. Training Set Alignment - Algorithm • Maintain an affine transform matrix $U_i$ for each sample, set to the identity initially • Compute the deformable energy-based criterion E • Repeat until convergence: • For each of the six unit affine matrices [14], $A_j$, j = 1,…,6 • Let $U_i' = A_j U_i$ • Apply $U_i'$ to the sample and recalculate the criterion E • If E has been reduced, accept $U_i'$; otherwise: • Let $U_i' = A_j^{-1} U_i$ and apply again; if E has been reduced, accept $U_i'$, otherwise revert to $U_i$ • End

  28. Shape Models • By modeling the distribution of the aligned vectors, new examples can be generated that are similar to those in the training set • As in the Active Shape Model, principal component analysis (PCA) is applied to the data (Statistical models of appearance for computer vision, draft report, 2000)

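  29. Shape Model • Formally, the covariance of the data is calculated as $S = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})(x_i - \bar{x})^T$, where $x_1, \dots, x_N$ are the aligned training vectors and $\bar{x}$ is their mean • Then the eigenvectors $\phi_i$ and corresponding eigenvalues $\lambda_i$ of S are computed and sorted so that $\lambda_i \ge \lambda_{i+1}$ • The training set is approximated by $x \approx \bar{x} + \Phi b$ • $\Phi = (\phi_1 | \phi_2 | \dots | \phi_t)$ represents the t eigenvectors corresponding to the largest eigenvalues • b is a t-dimensional vector given by $b = \Phi^T (x - \bar{x})$ • By varying the elements of b, new handwriting trajectories can be generated from this model • apply limits of $\pm 3\sqrt{\lambda_i}$ to the elements $b_i$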
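
The PCA shape model can be illustrated with a small, dependency-free sketch that keeps only the first mode of variation. Everything here is an assumption made for illustration: the helper names, the use of power iteration to find the leading eigenvector, and the clamp of the shape parameter to three standard deviations (the usual Active Shape Model convention). A real implementation would use a linear algebra library and keep several modes.

```python
import math

def mean_vector(samples):
    """Component-wise mean of equally long sample vectors."""
    return [sum(column) / len(samples) for column in zip(*samples)]

def covariance(samples, mean):
    """Unbiased sample covariance matrix of the aligned vectors."""
    d = len(mean)
    cov = [[0.0] * d for _ in range(d)]
    for x in samples:
        diff = [xi - mi for xi, mi in zip(x, mean)]
        for i in range(d):
            for j in range(d):
                cov[i][j] += diff[i] * diff[j] / (len(samples) - 1)
    return cov

def top_eigenpair(matrix, iterations=200):
    """Largest eigenvalue and unit eigenvector of a symmetric matrix (power iteration)."""
    d = len(matrix)
    v = [1.0 + 0.1 * i for i in range(d)]  # arbitrary non-zero start vector
    norm = math.sqrt(sum(c * c for c in v))
    v = [c / norm for c in v]
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:
            break  # zero matrix: no variation in the data
        v = [c / norm for c in w]
    eigenvalue = sum(v[i] * matrix[i][j] * v[j] for i in range(d) for j in range(d))
    return eigenvalue, v

def generate(mean, mode, eigenvalue, b):
    """New shape = mean + b * mode, with b clamped to +/- 3 * sqrt(eigenvalue)."""
    limit = 3.0 * math.sqrt(max(eigenvalue, 0.0))
    b = max(-limit, min(limit, b))
    return [m + b * p for m, p in zip(mean, mode)]

samples = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
mean = mean_vector(samples)
eigenvalue, mode = top_eigenpair(covariance(samples, mean))
print(generate(mean, mode, eigenvalue, 1.0))
```

Clamping keeps generated shapes within the range of variation seen in training, which is also why such a model cannot produce letter shapes unlike its training samples.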

  30. Outline • Sample collection and segmentation • Learning strategies • Synthesis Strategies • Experimental results • Discussion and Conclusion

  31. Synthesis Strategies • Generate each individual letter in the word • Align the baselines of these letters and juxtapose them in a sequence • Concatenate each letter with its neighbors to form cursive handwriting → this cannot be easily achieved • To solve this problem, a conditional sampling algorithm based on the delta log-normal model is proposed

  32. Individual Letter Synthesis

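  33. Delta Log-normal Model • A powerful tool for analyzing rapid human movements • In handwriting generation, the movement of a simple stroke is controlled by its velocity • The magnitude of the velocity is described as $v(t) = D_1 \Lambda(t; t_0, \mu_1, \sigma_1^2) - D_2 \Lambda(t; t_0, \mu_2, \sigma_2^2)$ (Why Handwriting Segmentation Can Be Misleading?, 13th International Conference on Pattern Recognition, 1996), where $\Lambda(t; t_0, \mu_i, \sigma_i^2) = \frac{1}{\sigma_i \sqrt{2\pi}\,(t - t_0)} \exp\left(-\frac{[\ln(t - t_0) - \mu_i]^2}{2\sigma_i^2}\right)$ is a log-normal function • $t_0$: activation time • $D_i$: amplitude of the impulse commands • $\mu_i$: mean time delay • $\sigma_i$: response time of the agonist and antagonist systems (both on a logarithmic time scale)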
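
To make the velocity profile concrete, the sketch below evaluates a delta log-normal speed curve as the difference of two log-normal impulse responses, one for the agonist and one for the antagonist system. The function names and all parameter values are illustrative assumptions, not values from the paper.

```python
import math

def lognormal(t, t0, mu, sigma):
    """Log-normal impulse response; zero before the activation time t0."""
    if t <= t0:
        return 0.0
    dt = t - t0
    exponent = -((math.log(dt) - mu) ** 2) / (2.0 * sigma ** 2)
    return math.exp(exponent) / (sigma * math.sqrt(2.0 * math.pi) * dt)

def delta_lognormal_speed(t, t0, d1, mu1, sigma1, d2, mu2, sigma2):
    """Speed magnitude: agonist response minus antagonist response."""
    return d1 * lognormal(t, t0, mu1, sigma1) - d2 * lognormal(t, t0, mu2, sigma2)

# Illustrative parameters only: a strong agonist command and a weaker,
# slightly delayed antagonist command.
params = dict(t0=0.0, d1=5.0, mu1=-1.6, sigma1=0.3, d2=1.0, mu2=-1.4, sigma2=0.3)
for k in range(0, 11):
    t = 0.05 * k
    print(f"t={t:.2f}  v={delta_lognormal_speed(t, **params):.3f}")
```

Because each log-normal term integrates to 1 over time, the distance covered by the stroke is approximately `d1 - d2`.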

  34. Delta Log-normal Model • The angular velocity can be expressed as $v_\theta(t) = c_0 v(t)$ • The angular velocity is calculated as the derivative of the direction $\theta(t)$ • Given $v_\theta(t)$, the curvature along a stroke piece is calculated as $c(t) = v_\theta(t) / v(t) = c_0$ • The static shape of the piece is therefore an arc, characterized by $\theta_0$ (initial direction), $c_0$ (constant curvature), and its arc length
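
Since the static shape of a piece is an arc, it can be traced from a few numbers. The sketch below assumes the arc is described by an initial direction `theta0`, a constant curvature `c0`, and an arc length `length`; this parameterization and the function name `trace_arc` are assumptions made for illustration.

```python
import math

def trace_arc(theta0, c0, length, n=20, start=(0.0, 0.0)):
    """Sample n + 1 points along an arc of constant curvature.

    The direction at arc length s is theta0 + c0 * s; integrating
    (cos, sin) of that direction gives the closed forms used below.
    """
    points = []
    for k in range(n + 1):
        s = length * k / n
        if abs(c0) < 1e-12:
            # Zero curvature degenerates to a straight segment.
            x = s * math.cos(theta0)
            y = s * math.sin(theta0)
        else:
            theta = theta0 + c0 * s
            x = (math.sin(theta) - math.sin(theta0)) / c0
            y = (math.cos(theta0) - math.cos(theta)) / c0
        points.append((start[0] + x, start[1] + y))
    return points

# Quarter of a unit circle, starting along the x-axis and turning left.
for point in trace_arc(theta0=0.0, c0=1.0, length=math.pi / 2, n=4):
    print(point)
```

Deforming a head or tail piece then means changing these few parameters rather than moving individual trajectory points.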

  35. Delta Log-normal Model-Example [Why Handwriting Segmentation Can Be Misleading, 1996 IEEE ICPR]

  36. Conditional Sampling • First, the trajectories of the synthesized letters are decomposed into static pieces • The first piece of a trajectory is called the head piece, and the last piece is called the tail piece • In the concatenation process, the trajectories of the letters are deformed to produce natural cursive handwriting by changing the parameters of the head and tail pieces

  37. Conditional Sampling • A deformation energy is defined for each stroke • A concatenation energy $E_c(i, i+1)$ is defined between the i-th letter and the (i+1)-th letter • By minimizing the second and third terms of this energy, the two letters are forced to connect with each other smoothly and naturally

  38. Conditional Sampling • A concatenation energy is calculated for the whole word • The deformed letters must remain consistent with the shape models, so a sampling energy is also calculated • The overall energy formulation combines these terms

  39. Synthesis - Iterative Approach • (1) Randomly generate an initial vector b(i) for each letter • (2) Generate the trajectories $S_i$ of the letters and calculate an affine transform $T_i$ for each letter (transforming it to its desired position) • (3) For each pair of adjacent letters $\{S_i, S_{i+1}\}$, deform the pieces in these letters to minimize the concatenation energy $E_c(i, i+1)$ • (4) Project the deformed shapes into the model coordinate frame • (5) Update the model parameters • (6) If not converged, return to step 2

  40. Experimental Results

  41. Discussion & Conclusion • Performance is limited by the samples used for training, since the shape models can only generate novel shapes within the variation of the training samples • Although some experimental results are shown, it is still not known how to evaluate synthesized scripts objectively or how to compare different synthesis approaches

  42. Markov chains • A Markov chain on a space X with transitions T is a random process (an infinite sequence of random variables) $(x^{(0)}, x^{(1)}, \dots, x^{(t)}, \dots)$ that satisfies $p(x^{(t)} \mid x^{(t-1)}, \dots, x^{(0)}) = p(x^{(t)} \mid x^{(t-1)})$ • That is, the probability of being in a particular state at time t, given the state history, depends only on the state at time t-1 • If the transition probabilities are fixed for all t, the chain is called homogeneous • Example with three states $x_1$, $x_2$, $x_3$: $T = \begin{pmatrix} 0.7 & 0.3 & 0 \\ 0.3 & 0.4 & 0.3 \\ 0 & 0.3 & 0.7 \end{pmatrix}$

  43. Stationary distribution • Consider the Markov chain given above, with $T = \begin{pmatrix} 0.7 & 0.3 & 0 \\ 0.3 & 0.4 & 0.3 \\ 0 & 0.3 & 0.7 \end{pmatrix}$ • The stationary distribution is $\pi = (0.33, 0.33, 0.33)$, which satisfies $\pi T = \pi$
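
The stationary distribution of the three-state example can be checked numerically. The sketch below uses power iteration, that is, repeatedly multiplying a row vector by the transition matrix; the function name `stationary_distribution` and the iteration count are assumptions, and the starting state is arbitrary.

```python
def stationary_distribution(T, iterations=1000):
    """Approximate pi with pi = pi T by repeated multiplication (power iteration)."""
    n = len(T)
    pi = [1.0] + [0.0] * (n - 1)  # start with all probability mass in the first state
    for _ in range(iterations):
        pi = [sum(pi[i] * T[i][j] for i in range(n)) for j in range(n)]
    return pi

# Transition matrix of the three-state chain from the slides (rows sum to 1).
T = [
    [0.7, 0.3, 0.0],
    [0.3, 0.4, 0.3],
    [0.0, 0.3, 0.7],
]
print(stationary_distribution(T))
```

The matrix is symmetric, so its columns also sum to 1, which is why the stationary distribution is uniform over the three states.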
