OCRdroid: A Framework to Digitize Text Using Mobile Phones

  • Authors

    • Mi Zhang, Anand Joshi, Ritesh Kadmawala, Karthik Dantu, Sameera Poduri, and Gaurav Sukhatme

    • University of Southern California

  • Presenter

    • Mi Zhang


Outline

  • What is OCRdroid?

  • Related Work

  • Design Considerations

  • System Architecture

  • Experimental Results

  • Summary


What is OCRdroid?

  • Why?

    • Huge demand for recognizing text in camera-captured pictures

    • Mobile phones are ubiquitous and powerful

  • What?

    • OCRdroid = OCR + Mobile Phone

  • Two Applications

    • PocketPal: Personal Receipt Management Tool

    • PocketReader: Personal Mobile Screen Reader


Related Work

Design and implementation of a Card Reader based on build-in camera. X.P. Luo, J. Li, and L.X. Zhen

Automatic detection and recognition of signs from natural scenes. X. Chen, J. Yang, and A. Waibel

A morphological image preprocessing suite for OCR on natural scene images. M. Elmore and M. Martonosi


Design Considerations

  • Real-Time Processing

  • Lighting Conditions

  • Text Skew

  • Perception Distortion (Tilt)

  • Text Misalignment

  • Blur (Out of Focus)


Real-Time Processing

  • Issues :

    • Limited memory

    • Relatively low processing power

    • Require quick response

  • Our Solutions :

    • Multi-Thread System Architecture

    • Image Compression

    • Computationally Efficient Algorithms


Lighting Conditions

  • Issues :

    • Uneven Lighting (Shadows, Reflection, Flooding, etc.)



  • Our Solution :

    • Local Binarization : Fast Sauvola’s Algorithm
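
A minimal sketch of Sauvola-style local binarization in plain Python. The threshold formula T = m * (1 + k * (s / R - 1)), with m and s the mean and standard deviation of a local window, is Sauvola's. Using integral images (summed-area tables) to make it fast, the window size, and the defaults k = 0.5 and R = 128 are assumptions for illustration, not details taken from the slides. All function names are hypothetical.

```python
import math


def integral_image(img, power=1):
    """Summed-area table of img**power, padded with a leading zero row and column."""
    h, w = len(img), len(img[0])
    table = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x] ** power
            table[y + 1][x + 1] = table[y][x + 1] + row_sum
    return table


def window_sum(table, y0, x0, y1, x1):
    """Sum over rows y0..y1-1 and columns x0..x1-1 using four table lookups."""
    return table[y1][x1] - table[y0][x1] - table[y1][x0] + table[y0][x0]


def sauvola_binarize(img, window=15, k=0.5, r=128.0):
    """Return a grid where 1 marks a text (dark) pixel and 0 marks background.

    img is a list of rows of grey levels in 0..255.
    """
    h, w = len(img), len(img[0])
    sums = integral_image(img, 1)
    squares = integral_image(img, 2)
    half = window // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        y0, y1 = max(0, y - half), min(h, y + half + 1)
        for x in range(w):
            x0, x1 = max(0, x - half), min(w, x + half + 1)
            n = (y1 - y0) * (x1 - x0)
            mean = window_sum(sums, y0, x0, y1, x1) / n
            variance = window_sum(squares, y0, x0, y1, x1) / n - mean * mean
            std = math.sqrt(max(variance, 0.0))
            # Sauvola's local threshold: adapts to local brightness and contrast.
            threshold = mean * (1.0 + k * (std / r - 1.0))
            out[y][x] = 1 if img[y][x] <= threshold else 0
    return out
```

Because each window statistic costs a constant number of lookups, the running time does not grow with the window size, which matters on a phone-class processor.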


Text Skew

  • Issues :

    • When perspective is not fixed, text lines may get skewed from their original orientation



  • Our Solution :

    • Branch-and-Bound text line finding algorithm + Auto-rotation
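
The slides name a branch-and-bound text line finder but give no details, so the sketch below swaps in a different and much simpler technique, projection-profile skew estimation, purely to illustrate what "detect the skew angle, then rotate back" involves. It is not the authors' algorithm. The function names, the input format (a list of (x, y) coordinates of text pixels), and the 1-degree search step are assumptions.

```python
import math
from collections import Counter


def profile_score(points, angle_deg):
    """Sharpness of the row histogram after counter-rotating points by angle_deg.

    When angle_deg matches the slope angle of the text lines, each line
    collapses into a single row bin, so the sum of squared bin counts peaks.
    """
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    rows = Counter(round(y * cos_t - x * sin_t) for x, y in points)
    return sum(count * count for count in rows.values())


def estimate_skew(points, max_angle=30, step=1):
    """Return the candidate angle (in degrees) with the sharpest row histogram."""
    candidates = range(-max_angle, max_angle + 1, step)
    return max(candidates, key=lambda angle: profile_score(points, angle))


# Three synthetic text lines whose slope corresponds to a 5-degree skew.
slope = math.tan(math.radians(5))
points = [(x, c + x * slope) for c in (0, 20, 40) for x in range(100)]
skew = estimate_skew(points)
```

Auto-rotation would then rotate the image by the negative of the estimated angle before handing it to the OCR engine.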


Perception Distortion (Tilt)

  • Issues :

    • When the text plane is not parallel to the imaging plane

    • Mobile phones are susceptible to tilts

    • Even a small perception distortion can cause OCR to fail



  • Our Solution :

    • Use Embedded Orientation Sensor (Pitch and Roll)

    • Calibration
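
A minimal sketch of how pitch and roll readings could gate image capture. The function name, the calibration offsets, and the way the check is applied are hypothetical; the default limit of 10 degrees simply mirrors the tolerance reported in the experimental results.

```python
def tilt_ok(pitch_deg, roll_deg, pitch_offset=0.0, roll_offset=0.0, max_tilt_deg=10.0):
    """Return True when the phone is close enough to parallel with the text plane.

    pitch_offset and roll_offset stand in for values obtained during a
    calibration step (hypothetical here); they are subtracted from the raw
    sensor readings before comparing against the tolerance.
    """
    pitch = abs(pitch_deg - pitch_offset)
    roll = abs(roll_deg - roll_offset)
    return pitch <= max_tilt_deg and roll <= max_tilt_deg
```

A capture button could stay disabled while `tilt_ok` returns False, so the user corrects the tilt before a photo is taken rather than after OCR has failed.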


Text Misalignment

  • Issues :

    • Camera screen covers only part of the text region

    • Irregular shapes of text characters


  • Our Solution :

    • Step#1 : Modified version of Sauvola’s algorithm

    • Step#1 (Cont.) : Routes to perform Sauvola’s algorithm

    • Step#2 : Noise Reduction

[Figure: Top Border, Left Border, Right Border, and Bottom Border; dimension label W]
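
One way to picture a border-based misalignment check is sketched below. It assumes the image has already been binarized (1 = text pixel), inspects a strip along each of the four borders, and reports the sides where text appears to be cut off. The strip width, the pixel-count threshold used as a crude stand-in for noise reduction, and the function name are all assumptions; the slides do not spell out the actual procedure.

```python
def misaligned_borders(binary, border_width=2, min_text_pixels=3):
    """Return the border names ("top", "bottom", "left", "right") crossed by text.

    binary is a list of rows with 1 for text pixels and 0 for background.
    A border counts as crossed when its strip holds at least min_text_pixels
    text pixels, so isolated specks are ignored.
    """
    h, w = len(binary), len(binary[0])
    bw = border_width
    strips = {
        "top": [binary[y][x] for y in range(min(bw, h)) for x in range(w)],
        "bottom": [binary[y][x] for y in range(max(h - bw, 0), h) for x in range(w)],
        "left": [binary[y][x] for y in range(h) for x in range(min(bw, w))],
        "right": [binary[y][x] for y in range(h) for x in range(max(w - bw, 0), w)],
    }
    return [name for name, pixels in strips.items() if sum(pixels) >= min_text_pixels]
```

Only the border strips are examined, which keeps the check cheap enough to run repeatedly on preview frames.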


Blur (Out of Focus)

  • Issues :

    • OCR needs sharp edge response



  • Our Solution :

    • Android autofocus mechanism


System Architecture

  • Components: Android Phone, Internet, Web Server, OCR Engine – Tesseract

  • Steps:

    • 1. Photo of a receipt

    • 2. Front-end processing

    • 3. Upload image

    • 4. Perform Backend Processing & OCR

    • 5. Return OCR Results

    • 6. Results returned

    • 7. Information Extraction


Front-End Architecture

  • Components: Orientation Handler, Camera Preview, Alignment Checker, Capture Image, Upload, OCR Data Receiver, Information Extraction, Mobile Database

  • Diagram labels: Proper Alignment Detected, Improper Alignment Detected, Internet


Back-End Architecture

  • Components: Store Image, Skew Detection & Auto-rotation, Binarization, Tesseract OCR Engine, OCR Text Output

  • Diagram labels: Internet, Sends Results back to Mobile Device


Experimental Results

  • Test Corpus

    • Ten distinct black & white images

    • Three distinct lighting conditions

      • Normal: Adequate light

      • Poor: Dim

      • Flooding: Light source focused on a particular portion of the image

  • Performance Metrics

    • Character Accuracy

    • Word Accuracy

    • Timing
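
The slides do not define how character and word accuracy are computed. A common convention, assumed here, is one minus the edit distance between the OCR output and the ground truth, divided by the ground-truth length, computed over characters or over whitespace-separated words. The function names are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (strings or lists of words)."""
    prev = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        curr = [i]
        for j, item_b in enumerate(b, start=1):
            cost = 0 if item_a == item_b else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def accuracy(truth, ocr):
    """1 - (edit distance / ground-truth length), clamped to be non-negative."""
    if not truth:
        return 1.0 if not ocr else 0.0
    return max(0.0, 1.0 - edit_distance(truth, ocr) / len(truth))


def character_accuracy(truth_text, ocr_text):
    return accuracy(truth_text, ocr_text)


def word_accuracy(truth_text, ocr_text):
    return accuracy(truth_text.split(), ocr_text.split())
```

For example, reading "TOTAL 12.50" as "T0TAL 12.50" is one wrong character out of eleven but one wrong word out of two, which is why the two metrics can diverge sharply on short receipts.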



  • Binarization: (Measured by Character Accuracy)

    • Normal: Around 97%

    • Poor: Around 60%

    • Flooding: Around 60%

  • Skew tolerance: Up to 30 degrees

  • Perception Distortion: Up to 10 degrees


  • Misalignment Detection:

  • Timing Performance:

    • Misalignment Detection: Less than 6 seconds

    • Overall Process: Less than 11 seconds


More Information

  • Project Website: http://www-scf.usc.edu/~ananddjo/ocrdroid/index.php

    • Test Cases & Results

    • Demo Video

    • Paper

    • Presentation Slides

    • Tools Information (Mobile Phone + Software)


Summary

  • OCRdroid – A Generic Framework for Developing OCR-based Applications on Mobile Phones

  • Six Design Considerations & Our Solutions

    • In particular, we propose a new real-time, computationally efficient algorithm for text misalignment detection

  • Experimental Results



