This presentation gives an overview of the preprocessing tasks needed for handwritten word recognition: binarization, slant correction, skeletonization, reference line detection, and segmentation. It also covers data collection, which involves writing with a special pen and scanning at 150 dpi, compares Otsu's and adaptive thresholding for binarization, and outlines a segmentation strategy for locating candidate character boundaries.
Handwritten Word Recognition (preprocessing) CmpE 537 (Computer Vision) Aleksei Ustimov 2006800811
Preprocessing Tasks • Binarization • Slant Correction • Skeletonization • Reference Lines Detection • Segmentation
Data Collection • Writing arbitrary text with a special pen • Scanning the written text at 150 dpi • Separating isolated words with an image-processing tool
Binarization • Otsu's thresholding method: slow, not sensitive to details • Adaptive thresholding method: noisy, requires tuning
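To make the contrast between the two methods concrete, here is a minimal sketch in plain Python. It assumes 8-bit grayscale input with dark ink on a light background. The function names, the window `radius`, and the `offset` are illustrative assumptions, not values from the slides. Otsu's method picks one global threshold by maximizing the between-class variance of the histogram. The adaptive variant shown here compares each pixel with the mean of its neighbourhood, which is one common form of adaptive thresholding.

```python
def otsu_threshold(pixels, levels=256):
    """Return the global threshold t that maximizes between-class variance.

    `pixels` is a flat list of integer gray levels; values <= t form the dark class.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels in the dark class so far
    sum0 = 0    # sum of gray levels in the dark class so far
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between_var = w0 * w1 * (mean0 - mean1) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


def adaptive_threshold(img, radius=1, offset=5):
    """Mark a pixel as ink (1) when it is darker than its local mean minus `offset`.

    `img` is a list of rows of gray levels; `radius` and `offset` are the
    tuning parameters (hypothetical defaults).
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [img[j][i]
                      for j in range(max(0, y - radius), min(h, y + radius + 1))
                      for i in range(max(0, x - radius), min(w, x + radius + 1))]
            local_mean = sum(window) / len(window)
            out[y][x] = 1 if img[y][x] < local_mean - offset else 0
    return out
```

The global threshold needs no parameters but applies a single cut to the whole image, while the local version depends on the choice of window size and offset, which is where the tuning effort goes.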
Slant Correction • Detects pen stroke width • Removes all lines with slant >60° • Removes small pieces • Calculates slant angle • Corrects slant by shifting image rows
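The last step, correcting slant by shifting rows, can be sketched as a horizontal shear. This is only an illustration under stated assumptions: the image is a binary list of rows (1 = ink), the slant angle is measured from the vertical and has already been estimated by the earlier steps, and the function name `deslant` is hypothetical.

```python
import math


def deslant(img, slant_deg):
    """Shear a binary image so that strokes slanted by `slant_deg` become vertical.

    Positive angles mean strokes lean to the right (top further right than bottom).
    Each row is shifted by an amount proportional to its height above the bottom row.
    """
    h, w = len(img), len(img[0])
    tan_a = math.tan(math.radians(slant_deg))
    # Shift for each row; the bottom row (y = h - 1) stays in place.
    shifts = [round((h - 1 - y) * tan_a) for y in range(h)]
    max_s, min_s = max(shifts), min(shifts)
    new_w = w + (max_s - min_s)  # widen the canvas so no ink is cut off

    out = [[0] * new_w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if img[y][x]:
                out[y][x - shifts[y] + max_s] = 1
    return out
```

Shifting whole rows keeps every row's ink pattern intact and only changes its horizontal position, so stroke thickness within a row is preserved.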
Skeletonization • Performs Holtz thinning until all lines are 1px wide • Removes small triangles using LYT removal algorithm
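As an illustration of iterative thinning to one-pixel-wide lines, the sketch below uses the Zhang-Suen algorithm. This is a different, widely known thinning method swapped in for illustration, not the thinning or triangle-removal algorithm named on the slide. It assumes a binary image given as a list of rows with 1 = ink.

```python
def zhang_suen_thin(img):
    """Thin a binary image (1 = ink) with the Zhang-Suen two-pass algorithm."""
    h, w = len(img), len(img[0])
    img = [row[:] for row in img]  # work on a copy

    def px(j, i):
        # Pixels outside the image count as background.
        return img[j][i] if 0 <= j < h and 0 <= i < w else 0

    changed = True
    while changed:
        changed = False
        for step in (0, 1):
            to_clear = []
            for y in range(h):
                for x in range(w):
                    if img[y][x] != 1:
                        continue
                    # Neighbours clockwise from north: P2..P9.
                    n = [px(y - 1, x), px(y - 1, x + 1), px(y, x + 1), px(y + 1, x + 1),
                         px(y + 1, x), px(y + 1, x - 1), px(y, x - 1), px(y - 1, x - 1)]
                    p2, _, p4, _, p6, _, p8, _ = n
                    ink_neighbours = sum(n)
                    # Number of 0 -> 1 transitions around the pixel.
                    transitions = sum(1 for k in range(8)
                                      if n[k] == 0 and n[(k + 1) % 8] == 1)
                    if not (2 <= ink_neighbours <= 6 and transitions == 1):
                        continue
                    if step == 0:
                        removable = p2 * p4 * p6 == 0 and p4 * p6 * p8 == 0
                    else:
                        removable = p2 * p4 * p8 == 0 and p2 * p6 * p8 == 0
                    if removable:
                        to_clear.append((y, x))
            # Delete all marked pixels at once so the pass is order-independent.
            for y, x in to_clear:
                img[y][x] = 0
            if to_clear:
                changed = True
    return img
```

A post-processing pass to remove small spurious structures, as the slide describes, would follow this step and is not shown here.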
Reference Lines Detection • Locates main body • Locates ascenders • Locates descenders • Calculates approximate position of reference lines
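One common way to approximate these lines is a horizontal projection profile: rows with a lot of ink belong to the main body, and the remaining inked rows above and below hold ascenders and descenders. The sketch below follows that heuristic. It is an assumption about how such a step could look, not necessarily the method used in the presentation, and `body_fraction` is a hypothetical tuning parameter.

```python
def reference_lines(img, body_fraction=0.5):
    """Estimate reference lines of a binary word image (1 = ink) from row ink counts.

    Returns row indices for the top line, upper baseline, lower baseline and
    bottom line, or None if the image contains no ink.
    """
    profile = [sum(row) for row in img]  # ink pixels per row
    ink_rows = [y for y, count in enumerate(profile) if count > 0]
    if not ink_rows:
        return None
    # Rows whose ink count reaches a fraction of the peak are treated as main body.
    cutoff = body_fraction * max(profile)
    body_rows = [y for y, count in enumerate(profile) if count >= cutoff]
    return {
        "top": ink_rows[0],                # highest ascender row
        "upper_baseline": body_rows[0],    # top of the main body
        "lower_baseline": body_rows[-1],   # bottom of the main body
        "bottom": ink_rows[-1],            # lowest descender row
    }
```

Because the result depends on a simple cutoff, it gives only approximate positions, which matches the slide's wording.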
Segmentation • Locates discontinuities in lines • Locates ligatures (character connection arcs) • Prefers oversegmentation
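A rough sketch of how candidate cut points might be collected from a vertical projection is given below. Columns with no ink are treated as discontinuities, and columns with very little ink as possible ligatures. Every candidate is kept, in line with the stated preference for oversegmentation. The function name and the `ligature_max_ink` threshold are illustrative assumptions.

```python
def candidate_cuts(img, ligature_max_ink=1):
    """Return (column, kind) cut candidates for a binary word image (1 = ink).

    `kind` is "gap" for runs containing an empty column and "ligature" for
    runs of thin columns. Runs touching the image border are ignored.
    """
    w = len(img[0])
    col_ink = [sum(row[x] for row in img) for x in range(w)]
    cuts = []
    x = 0
    while x < w:
        if col_ink[x] <= ligature_max_ink:
            start = x
            while x < w and col_ink[x] <= ligature_max_ink:
                x += 1
            # x is now one past the end of the thin run.
            if start > 0 and x < w:
                kind = "gap" if min(col_ink[start:x]) == 0 else "ligature"
                cuts.append(((start + x - 1) // 2, kind))
        else:
            x += 1
    return cuts
```

Keeping every thin run as a candidate produces more cuts than there are characters, leaving it to a later recognition stage to merge pieces.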