
Multiple Approaches at Hand Written Digit Recognition




  1. Multiple Approaches at Hand Written Digit Recognition
     Luis Bathen, Mike Munson, Jeremy Smith

  2. Problem and Motivation
     • Problem:
       • Given picture data representing a handwritten digit, determine which digit (0-9) it is.
       • Error rate must be extremely low for practical use.
       • If the digit is too poorly drawn or cannot be classified, it should be rejected rather than risk improper classification.
     • Motivation:
       • Primary application: Postal Mail (automatic sorting of mail by destination ZIP code)
       • Other applications: Digitizing handwritten spreadsheets, tax forms, etc.

  3. Alvarez, Rodriguez, & Hermida: Kirsch Gradients Network Structure

  4. Scaling the Image
     • The algorithm adds the values of 4 pixels to get the new pixel's value.
     • Resolution is halved, depth is doubled.
     • This leaves fewer nodes to train, and gives invariance to subtle differences.
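
The scaling step above can be sketched as follows. This is a minimal illustration, assuming the 4 pixels are a non-overlapping 2x2 block and that the image is held in a NumPy array; the function name `downsample_sum` is hypothetical and not from the slides.

```python
import numpy as np

def downsample_sum(image: np.ndarray) -> np.ndarray:
    """Halve the resolution by summing each non-overlapping 2x2 block.

    Assumption: the "4 pixels" on the slide form a 2x2 block.
    """
    h, w = image.shape
    if h % 2 or w % 2:
        raise ValueError("height and width must both be even")
    # Split rows and columns into pairs, then add the 4 pixels of each block.
    return image.reshape(h // 2, 2, w // 2, 2).sum(axis=(1, 3))

# A 32x32 binary image becomes 16x16 with values in 0..4.
binary = np.ones((32, 32), dtype=np.int64)
half = downsample_sum(binary)

# Small worked example on a 4x4 ramp.
ramp = np.arange(16).reshape(4, 4)
ramp_half = downsample_sum(ramp)
```

Applying the function repeatedly (32x32, 16x16, 8x8, 4x4) would give the image sizes listed on the later slides, under the same 2x2 assumption.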

  5. Edge Detection
     • Output only certain edges from an image (vertical, horizontal, left-diagonal, right-diagonal) to new images, using the Kirsch algorithm.
     • Goal is to create separate networks, trained over specific features of the image (one network for each edge map).

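  6. The Image
     • [Figure: the eight neighbours of pixel (i,j), labelled as extracted: A0 A1 A2 A3 (i,j) A4 A7 A6 A5]
     • To extract locality from the image we used the Kirsch detector algorithm.
     • $G(i, j) = \max\{1, \max_{k=0}^{7} |5 S_k - 3 T_k|\}$
     • where $G(i, j)$ is the gradient for pixel $(i, j)$ and $k = 0, \dots, 7$
     • $S_k = A_k + A_{k+1} + A_{k+2}$
     • $T_k = A_{k+3} + A_{k+4} + A_{k+5} + A_{k+6} + A_{k+7}$ (subscripts taken modulo 8)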
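
A hedged sketch of the Kirsch gradient described on this slide is given below. It makes several assumptions that the slides do not confirm: the neighbours A0..A7 are taken clockwise starting at the top-left corner, T_k is the sum of the five neighbours not in S_k (the usual Kirsch definition), subscripts wrap modulo 8, and pixels outside the image count as 0. The name `kirsch_gradient` is hypothetical.

```python
import numpy as np

# (row, col) offsets of A0..A7, ASSUMED clockwise from the top-left corner.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]

def kirsch_gradient(image: np.ndarray) -> np.ndarray:
    """G(i, j) = max(1, max_k |5*S_k - 3*T_k|) for every pixel."""
    h, w = image.shape
    padded = np.pad(image.astype(np.int64), 1)  # zero border (assumption)
    # neighbours[k] holds A_k for every pixel of the original image.
    neighbours = np.stack(
        [padded[1 + dr : 1 + dr + h, 1 + dc : 1 + dc + w] for dr, dc in OFFSETS]
    )
    total = neighbours.sum(axis=0)
    best = np.zeros((h, w), dtype=np.int64)
    for k in range(8):
        s_k = neighbours[k] + neighbours[(k + 1) % 8] + neighbours[(k + 2) % 8]
        t_k = total - s_k  # the five neighbours not in S_k
        best = np.maximum(best, np.abs(5 * s_k - 3 * t_k))
    return np.maximum(best, 1)

# Vertical step edge: columns 3 and 4 are set, the rest are 0.
step = np.zeros((5, 5), dtype=np.int64)
step[:, 3:] = 1
edges = kirsch_gradient(step)

# A flat region has no gradient, so interior pixels fall back to 1.
flat = kirsch_gradient(np.ones((5, 5), dtype=np.int64))
```

This computes the combined gradient G only; producing the separate vertical, horizontal, and diagonal edge maps would mean keeping selected values of k apart instead of taking the maximum over all of them.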

  7. Approaches
     • Five subnet-multi-layered network (10 class output)
     • Raw 32x32/16x16/8x8 binary images (10 class output)
     • Raw 3-bit value images 4x4 (10 class output)
     • 4x4 binary/3-bit (binary output)
     • Simple image correlation (for comparison)
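
The slides do not say how the image-correlation baseline was computed. One common way to build such a baseline, shown here purely as an assumed sketch, is to keep one template image per digit and pick the template with the highest normalized correlation. The names `correlation` and `classify_by_correlation` and the toy templates are hypothetical.

```python
import numpy as np

def correlation(a: np.ndarray, b: np.ndarray) -> float:
    """Normalized (Pearson-style) correlation between two images."""
    a = a.astype(float).ravel()
    b = b.astype(float).ravel()
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def classify_by_correlation(image: np.ndarray, templates: dict) -> int:
    """Return the label whose template correlates best with the image."""
    return max(templates, key=lambda label: correlation(image, templates[label]))

# Toy 2x2 templates standing in for per-digit average images (assumption).
templates = {
    0: np.array([[1, 0], [0, 1]]),
    1: np.array([[0, 1], [1, 0]]),
}
sample = np.array([[0.9, 0.1], [0.2, 0.8]])
label = classify_by_correlation(sample, templates)
```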

  8. Five subnet-multi-layered network (10 Class output)

  9. 32x32/16x16/8x8 and 10 class output
     [Figure: network with N inputs, M hidden units, and 10 outputs. Example output values shown: 0.45, 0.32, 0.02, 0.12E-23, 0.99, 0.12E-125. Winning number is: 8]

  10. Net Activation
      • Steps: Activate Inputs, Activate Hidden Layer, Activate Output Layer
      • $\mathrm{net}_i = \sum_j w_{ij} \, a_j(t) - b_i$
      • $a_i(t + 1) = f(\mathrm{net}_i)$
      • $f(\mathrm{net}_i) = 1 / (1 + e^{-\mathrm{net}_i})$
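
The activation steps on this slide can be sketched as a forward pass through one hidden layer. This is a minimal illustration, not the authors' code: the layer sizes, the random initial weights, and the function names are all assumptions. Note that the bias is subtracted, following the slide's formula.

```python
import numpy as np

def sigmoid(net: np.ndarray) -> np.ndarray:
    """f(net) = 1 / (1 + e^(-net))."""
    return 1.0 / (1.0 + np.exp(-net))

def activate_layer(weights, biases, activations):
    """a_i(t+1) = f(sum_j w_ij * a_j(t) - b_i) for every unit i of a layer."""
    return sigmoid(weights @ activations - biases)

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Activate inputs -> hidden layer -> output layer."""
    hidden = activate_layer(w_hidden, b_hidden, x)
    output = activate_layer(w_out, b_out, hidden)
    return hidden, output

# Hypothetical sizes: a 4x4 image (16 inputs), 8 hidden units, 10 outputs.
rng = np.random.default_rng(0)
w_hidden = rng.normal(0.0, 0.1, size=(8, 16))
b_hidden = np.zeros(8)
w_out = rng.normal(0.0, 0.1, size=(10, 8))
b_out = np.zeros(10)

x = rng.integers(0, 2, size=16).astype(float)
hidden, output = forward(x, w_hidden, b_hidden, w_out, b_out)
winner = int(np.argmax(output))  # index of the most active output unit
```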

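  11. Back Propagation
      • Set Expected Output (figure shows: 0 0 1 0 0 0)
      • Calculate Out Delta: $\delta_j = (1 - a_j)\, a_j\, (t_j - a_j)$ if $j$ is an output unit
      • Calculate Hidden Delta: $\delta_j = (1 - a_j)\, a_j \sum_k \delta_k w_{kj}$ if $j$ is a hidden unit
      • Update the Weights and Biases: $\Delta w_{ij} = \eta\, \delta_i\, a_j$ and $\Delta b_i = -\eta\, \delta_i$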
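
The delta and update rules on this slide can be sketched as a single training step for a network with one hidden layer. This is an illustrative sketch under assumptions: the layer sizes, learning rate `eta`, random initial weights, and the name `train_step` are hypothetical, and the bias is subtracted in the net input as in the activation formula.

```python
import numpy as np

def sigmoid(net):
    return 1.0 / (1.0 + np.exp(-net))

def train_step(x, target, w_h, b_h, w_o, b_o, eta=0.5):
    """One back-propagation step; updates the arrays in place.

    Returns the squared error measured before the update.
    """
    # Forward pass: net_i = sum_j w_ij * a_j - b_i
    a_h = sigmoid(w_h @ x - b_h)
    a_o = sigmoid(w_o @ a_h - b_o)
    # Output delta: (1 - a_j) * a_j * (t_j - a_j)
    d_o = (1.0 - a_o) * a_o * (target - a_o)
    # Hidden delta: (1 - a_j) * a_j * sum_k d_k * w_kj (uses the old w_o)
    d_h = (1.0 - a_h) * a_h * (w_o.T @ d_o)
    # Updates: dw_ij = eta * d_i * a_j, db_i = -eta * d_i
    w_o += eta * np.outer(d_o, a_h)
    b_o -= eta * d_o
    w_h += eta * np.outer(d_h, x)
    b_h -= eta * d_h
    return 0.5 * float(np.sum((target - a_o) ** 2))

# Hypothetical sizes: 16 inputs, 8 hidden units, 10 outputs.
rng = np.random.default_rng(0)
w_h, b_h = rng.normal(0.0, 0.1, (8, 16)), np.zeros(8)
w_o, b_o = rng.normal(0.0, 0.1, (10, 8)), np.zeros(10)

x = rng.integers(0, 2, size=16).astype(float)
target = np.zeros(10)
target[2] = 1.0  # expected output: only the unit for digit 2 is on

errors = [train_step(x, target, w_h, b_h, w_o, b_o) for _ in range(200)]
```

Repeating the step on one example, as done here, only demonstrates that the error shrinks; real training would loop over the whole labelled data set.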

  12. Rejection
      • It is better to reject a digit than to classify it incorrectly.
      • To avoid rejection, the results must meet two criteria:
        • Highest confidence value must be greater than 87.5%
        • This value must be at least 20% more than the second-highest value
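
The two rejection criteria can be sketched as below. The slide does not say whether "20% more" is an absolute gap between the two output values or a relative one; this sketch assumes an absolute gap of 0.20. The function name `classify_or_reject` and the constant names are hypothetical.

```python
import numpy as np

MIN_CONFIDENCE = 0.875  # highest output must exceed this
MIN_MARGIN = 0.20       # ASSUMED to be an absolute gap to the runner-up

def classify_or_reject(outputs):
    """Return the winning digit, or None if the input should be rejected."""
    outputs = np.asarray(outputs, dtype=float)
    order = np.argsort(outputs)  # ascending order of confidence
    best = outputs[order[-1]]
    runner_up = outputs[order[-2]]
    if best > MIN_CONFIDENCE and best - runner_up >= MIN_MARGIN:
        return int(order[-1])
    return None

# Clear winner at index 8.
accepted = classify_or_reject([0.45, 0.32, 0.02, 0.0, 0.0, 0.0, 0.0, 0.0, 0.99, 0.0])
# Two strong candidates: too close to call.
too_close = classify_or_reject([0.95, 0.90, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])
# No output is confident enough.
too_weak = classify_or_reject([0.60, 0.10, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])
```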

  13. Trials & Results
      • Edge Detection: total failure
      • Edge maps looked nice, but weren't useful
      • Bad news: The networks trained over the edge maps were horribly inaccurate
      • Good news: The network trained over the simple scaled image proved nearly as accurate

  14. More Trials & Results
      • Different scaled image sizes:
        • 16x16
        • 8x8
        • 4x4
      • Goal: find the best performance, possibly by taking a vote from more than one network.
      • Results: the 16x16 and 8x8 networks are too large and slow to train and use effectively. The 4x4 network achieves roughly >90% accuracy with <15% rejection.
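
The slides mention voting across networks only as a possibility and give no details. One way such a vote could work, shown here as an assumed sketch, is to accept a digit only when a strict majority of all networks agree on it, counting a network that rejects the input as casting no vote. The name `majority_vote` is hypothetical.

```python
from collections import Counter
from typing import Optional, Sequence

def majority_vote(predictions: Sequence[Optional[int]]) -> Optional[int]:
    """Combine per-network results; None means that network rejected."""
    votes = Counter(p for p in predictions if p is not None)
    if not votes:
        return None  # every network rejected the input
    digit, count = votes.most_common(1)[0]
    # Require agreement from more than half of ALL networks (assumption).
    if count * 2 <= len(predictions):
        return None
    return digit

agreed = majority_vote([8, 8, 3])        # two of three agree
split = majority_vote([8, 3, None])      # no majority
all_rejected = majority_vote([None, None])
```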

  15. Results (32x32 down-sampled to 4x4-3-bit)

  16. References
      • [1] T. Bruel. “Segmentation of Handprinted Letter Strings using a Dynamic Programming Algorithm.” Presented at the Sixth International Conference on Document Analysis and Recognition (ICDAR ’01), September 1991.
      • [2] L. Fontaine, L. Shastri. “Handprinted Digit Recognition Using Spatiotemporal Connectionist Models.” Technical Report MS-CIS-92-24, University of Pennsylvania, March 1992.
      • [3] D. C. Alvarez, F. M. Rodriguez, X. F. Hermida. “Printed and Handwritten Digits Recognition Using Neural Networks.” Original publication source unknown; paper available at http://wgpi.tsc.uvigo.es/pub/papers/icsp98_1.pdf
      • [4] Y. Le Cun. “A Theoretical Framework for Back-Propagation.” From proceedings of 1998 Connectionist Models Summer School, 21-28.
      • [5] Y. Le Cun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard, L. D. Jackel. “Handwritten Digit Recognition with a Back-Propagation Neural Network.” Advances in Neural Information Processing Systems, Vol. 2, 598-605. Morgan Kaufmann, 1990.
      • [6] O. Matan, C. J. C. Burges, Y. Le Cun, J. S. Denker. “Multi-Digit Recognition Using A Space Displacement Neural Network.” Advances in Neural Information Processing Systems, Vol. 4, 488-495. Morgan Kaufmann, 1992.
      • [7] G. Velasquez. “A Distributed Approach to a Neural Network Simulation Program.” Master's thesis, The University of Texas at El Paso, El Paso, TX, 1998.
