Optical Data Capture: Optical Character Recognition (OCR)

Intelligent Character Recognition (ICR)

Intelligent Recognition

UNSD Regional Workshop on Census Data Processing for the English speaking African Countries: Contemporary technologies for data capture, methodology and practice of data editing

Dar es Salaam, Tanzania, 9-13 June 2008

Summary

Concept/Definition

Forms Design

Scanners & Software

Storage

Accuracy

OCR/ICR Advantages and Disadvantages

Intelligent Recognition (IR)

Commercial Suppliers

Definition/Concept of OCR
  • Gives scanning and imaging systems the ability to turn images of machine-printed characters into machine-readable characters.
    • Images of the machine-printed characters are extracted from a bitmap of the scanned image
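
One classical way to turn a character bitmap into a machine-readable character is template matching. The sketch below is only an illustration under stated assumptions: the 3x5 templates, the three-digit alphabet, and the function names are hypothetical, and commercial recognition engines use far more elaborate methods.

```python
# Toy template matching: compare a small glyph bitmap against stored
# templates and return the character whose template differs in the
# fewest pixels. Templates and names are hypothetical.
TEMPLATES = {
    "0": ["111", "101", "101", "101", "111"],
    "1": ["010", "110", "010", "010", "111"],
    "7": ["111", "001", "010", "010", "010"],
}


def hamming(a, b):
    """Count the pixels that differ between two equally sized bitmaps."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def recognize(glyph):
    """Return the template character closest to the glyph."""
    return min(TEMPLATES, key=lambda ch: hamming(glyph, TEMPLATES[ch]))


# A "7" with one noisy pixel in the fourth row.
noisy_seven = ["111", "001", "010", "011", "010"]
print(recognize(noisy_seven))
```

The glyph is assumed to be already cut out of the page bitmap and scaled to the template size.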

Definition/Concept of ICR

Gives scanning and imaging systems the ability to turn images of handwritten characters into machine-readable characters

Images of the handwritten characters are extracted from a bitmap of the scanned image

OCR and ICR Differences
  • OCR is less accurate than OMR but more accurate than ICR
  • ICR will require editing to achieve high data coverage
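
The editing step can be pictured as routing low-confidence characters to a human operator. This is a minimal sketch, assuming the recognition engine returns a confidence score between 0 and 1 for each character; the 0.90 threshold and the function name are hypothetical.

```python
# Sketch: accept characters the engine is confident about and flag the
# rest for manual editing. Threshold and names are hypothetical.
def split_for_editing(field, threshold=0.90):
    """Return (text with '?' for doubtful characters, positions to review)."""
    text = []
    review = []
    for position, (char, confidence) in enumerate(field):
        if confidence >= threshold:
            text.append(char)
        else:
            text.append("?")
            review.append(position)
    return "".join(text), review


# Hypothetical engine output for a four-digit year-of-birth field.
year_field = [("1", 0.99), ("9", 0.62), ("8", 0.95), ("4", 0.91)]
print(split_for_editing(year_field))
```

Raising the threshold sends more characters to the operator; lowering it accepts more of the engine's output unchecked.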

Forms
  • OCR/ICR has less strict form design compared to OMR
    • No timing tracks
    • Has Registration Marks
  • ICR requires boxes filled with hand-printed characters, one alphanumeric character per box

OCR
  • Forms
    • OCR/ICR is more flexible since:
      • No timing tracks are required
      • The image can float on a page
    • The use of drop-out color reduces the size of the scanner’s output and enhances accuracy
    • ICR/OCR technology often uses registration marks on the four corners of a document when recognizing an image
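
One way registration marks can help when the image floats on the page is to estimate how far the scan has shifted and move each field box by the same amount. The sketch below handles translation only (no skew or scaling); all coordinates, names, and the averaging approach are assumptions for illustration, not a description of any vendor's method.

```python
# Sketch: estimate the page shift from four registration marks and
# relocate a field box. Coordinates are in pixels and hypothetical.
def estimate_offset(expected, detected):
    """Average displacement (dx, dy) between expected and detected marks."""
    n = len(expected)
    dx = sum(d[0] - e[0] for e, d in zip(expected, detected)) / n
    dy = sum(d[1] - e[1] for e, d in zip(expected, detected)) / n
    return dx, dy


def locate_field(field_box, offset):
    """Shift a field box (x, y, width, height) by the estimated offset."""
    x, y, width, height = field_box
    dx, dy = offset
    return (x + dx, y + dy, width, height)


# Mark positions on the form template and as found in one scanned image.
expected_marks = [(50, 50), (2430, 50), (50, 3450), (2430, 3450)]
detected_marks = [(62, 42), (2442, 42), (62, 3442), (2442, 3442)]

offset = estimate_offset(expected_marks, detected_marks)
print(locate_field((400, 600, 300, 60), offset))
```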

OCR/ICR Scanners and Software

Forms are fed through a scanner, and the recognition engine of the OCR/ICR system then interprets the images, turning handwritten or printed characters into ASCII data (machine-readable characters).

Users can scan forms without running the OCR step

Speeds range from 85 to 160 sheets/min (dependent on the recognition engine)
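
The quoted speed range translates directly into scanner time for a given workload. A rough sketch, assuming a hypothetical workload of 1,000,000 sheets and ignoring paper jams, batch changes, and other stoppages:

```python
# Sketch: nominal scanner-hours for a workload at a given speed.
# The workload figure is hypothetical.
def scanning_hours(sheets: int, sheets_per_min: float) -> float:
    """Hours of continuous scanning needed for the given number of sheets."""
    return sheets / sheets_per_min / 60


SHEETS = 1_000_000
for speed in (85, 160):
    hours = scanning_hours(SHEETS, speed)
    print(f"{speed} sheets/min: {hours:.1f} scanner-hours")
```

For this workload the two ends of the range give about 196.1 and 104.2 scanner-hours, so the choice of recognition engine can nearly halve the scanning time.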

OCR/ICR Storage Characteristics

Storage/Retrieval

Images are scanned, stored, and maintained electronically

There is no need to store the paper forms as long as you safeguard the electronic files

With OCR/ICR technologies, images can be scanned, indexed, and written to optical media

Ideal OCR/ICR Accuracy Thresholds
  • Accuracy:
    • Accuracy achieved by data entry clerks (~99.5%) is approximately equal to that of OCR/ICR in perfect tuning (~99.5%)
    • Up to 99.9% accuracy with editing (like OMR)
  • The recognition engine must be tuned, tested and validated very carefully
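
To see what the gap between 99.5% and 99.9% means in practice, the accuracy figures can be converted into expected character errors. The sketch assumes a hypothetical volume of 1,000,000 captured characters and treats accuracy as a per-character rate.

```python
# Sketch: expected number of wrongly captured characters.
# The character volume is hypothetical.
def expected_errors(characters: int, accuracy: float) -> float:
    """Expected character errors at a given per-character accuracy."""
    return characters * (1.0 - accuracy)


CHARACTERS = 1_000_000
for accuracy in (0.995, 0.999):
    errors = expected_errors(CHARACTERS, accuracy)
    print(f"{accuracy:.1%} accuracy: about {errors:,.0f} errors")
```

At this volume, moving from 99.5% to 99.9% reduces the expected errors from about 5,000 to about 1,000.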

OCR/ICR Advantages
  • Recognition engines used with imaging can capture highly specialized data sets
  • OCR/ICR recognizes machine-printed or hand-printed characters
  • Scanning and recognition allow efficient management and planning for the rest of the processing workload
  • Quick retrieval for editing and reprocessing

OCR/ICR Disadvantages

Technology is costly

May require significant manual intervention

Additional workload for data collectors

ICR has severe limitations when it comes to human handwriting

Characters must be hand-printed or machine-printed as separate characters in boxes

Ineffective when dealing with cursive characters

OMR-OCR/ICR Compared

OCR/ICR Challenges/Issues
  • Shares the corresponding issues of OMR
  • Algorithm development (Preparation of memory dictionary)
  • Processing time considerations due to recognition engine
  • Development costs

Definition/Concept of IR

State of the art recognition technology

Gives scanning and imaging systems the ability to turn images of handwritten and cursive characters into machine-readable characters

Images of the handwritten and cursive characters are extracted from a bitmap of the scanned image

The ability to capture cursive makes this method unique

Definition/Concept of IR

Eight elements make up the trajectories of all cursive letters (figure 1)

Photo: Parascript LLC

Definition/Concept of IR

Intelligent Recognition dynamically uses context

Context is used during the recognition process, improving the accuracy of results

Context helps to identify letters where the symbol segmentation of an image is ambiguous

Photo: Parascript LLC
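
The role of context can be illustrated with a lexicon: instead of taking the most likely letter at each position, the engine picks the valid answer that best fits all positions together. This is a simplified sketch with a fixed segmentation; the candidate scores, the lexicon, and the function name are hypothetical and do not describe Parascript's actual method.

```python
# Sketch: choose the lexicon word that best matches per-position letter
# candidates. Scores and lexicon entries are hypothetical.
def best_word(candidates, lexicon):
    """Return the lexicon word with the highest combined score, or None."""
    best, best_score = None, 0.0
    for word in lexicon:
        if len(word) != len(candidates):
            continue  # this simplified sketch only compares equal lengths
        score = 1.0
        for letter, options in zip(word, candidates):
            score *= options.get(letter, 0.0)
        if score > best_score:
            best, best_score = word, score
    return best


# Ambiguous reading of a five-letter place name written on a form.
candidates = [
    {"T": 0.55, "I": 0.45},
    {"A": 0.90},
    {"N": 0.60, "H": 0.40},
    {"G": 0.50, "6": 0.50},
    {"A": 0.80, "4": 0.20},
]
lexicon = ["TANGA", "IRINGA", "MBEYA"]
print(best_word(candidates, lexicon))
```

Without the lexicon, the fourth position is a tie between "G" and "6"; with it, only one valid answer is consistent with every position.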


Technology Evolution

[Figure: technology evolution across OCR, ICR, and Intelligent Recognition, plotted by form type and text style]
  • Form types
    • Specially designed for automatic recognition: constraining boxes or combs; drop-out ink for preprinted text and boxes
    • Legacy forms: no special form design; no constraining boxes or combs; condensed strings
    • Dirty and noisy forms: bad quality paper
  • Text styles
    • Machine print
    • Constrained handprint
    • Unconstrained handprint
    • Bad quality machine print
    • Cursive

Illustration: Conference on Technology Options for 2011 Census

Major Commercial Suppliers
  • Top Image Systems (TIS) (http://www.topimagesystems.com)
  • ReadSoft (http://www.readsoft.com)
  • Teleform (http://www.intelliscan.com/TeleForm1.htm)
  • Scanner Suppliers
    • Fujitsu, Canon, Bell & Howell, Kodak


THANK YOU!
