OCRdroid: A Framework to Digitize Text Using Mobile Phones

Authors: Mi Zhang, Anand Joshi, Ritesh Kadmawala, Karthik Dantu, Sameera Poduri, and Gaurav Sukhatme (University of Southern California). Presenter: Mi Zhang.

Presentation Transcript


OCRdroid: A Framework to Digitize Text Using Mobile Phones

  • Authors

    • Mi Zhang, Anand Joshi, Ritesh Kadmawala, Karthik Dantu, Sameera Poduri, and Gaurav Sukhatme

    • University of Southern California

  • Presenter

    • Mi Zhang


Outline

  • What is OCRdroid?

  • Related Work

  • Design Considerations

  • System Architecture

  • Experimental Results

  • Summary


What is OCRdroid?

  • Why?

    • Huge demand for recognizing text in camera-captured pictures

    • Mobile phones are ubiquitous and powerful

  • What?

    • OCRdroid = OCR + Mobile Phone

    • Two Applications

      • PocketPal: Personal Receipt Management Tool

      • PocketReader: Personal Mobile Screen Reader


Related Work

  • X.P. Luo, J. Li, and L.X. Zhen. Design and implementation of a Card Reader based on build-in camera.

  • X. Chen, J. Yang, and A. Waibel. Automatic detection and recognition of signs from natural scenes.

  • M. Elmore and M. Martonosi. A morphological image preprocessing suite for OCR on natural scene images.


Design Considerations

  • Real-Time Processing

  • Lighting Conditions

  • Text Skew

  • Perception Distortion (Tilt)

  • Text Misalignment

  • Blur (Out of Focus)


Real-Time Processing

  • Issues :

    • Limited memory

    • Relatively low processing power

    • Quick response required

  • Our Solutions :

    • Multi-Thread System Architecture

    • Image Compression

    • Computationally Efficient Algorithms
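
The multi-thread idea can be illustrated with a minimal sketch, assuming a single background worker that compresses captured frames so the capturing thread never blocks. The function names are hypothetical, and `zlib` is only a stand-in for image compression because it ships with Python; the slides do not say which compression scheme OCRdroid uses.

```python
import queue
import threading
import zlib


def compress(image_bytes):
    # Stand-in for image compression (assumption): zlib is lossless and
    # was chosen only because it is in the standard library.
    return zlib.compress(image_bytes, 6)


def worker(jobs, results):
    # Runs off the capture thread so the camera preview stays responsive.
    while True:
        image = jobs.get()
        if image is None:  # sentinel: no more work
            break
        results.append(compress(image))


def process_in_background(images):
    """Hypothetical driver: hand frames to a worker thread and collect results."""
    jobs = queue.Queue()
    results = []
    thread = threading.Thread(target=worker, args=(jobs, results))
    thread.start()
    for image in images:
        jobs.put(image)  # returns immediately to the caller
    jobs.put(None)
    thread.join()
    return results
```

A real client would presumably hand the compressed image to an upload step instead of collecting it in a list.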


Lighting Conditions

  • Issues :

    • Uneven Lighting (Shadows, Reflection, Flooding, etc.)


Lighting Conditions

  • Our Solution :

    • Local Binarization : Fast Sauvola’s Algorithm
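
As a rough illustration of local binarization, the sketch below applies the standard Sauvola threshold T = m · (1 + k · (s / R − 1)), where m and s are the mean and standard deviation of the window around each pixel. It uses summed-area tables so each window costs constant time, which is one common way to make Sauvola's method fast; whether OCRdroid's "fast" variant works this way is an assumption, and the window size, `k`, and `R` defaults are illustrative only.

```python
def integral(img, power=1):
    """Summed-area table of img**power with a zero first row and column."""
    h, w = len(img), len(img[0])
    table = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x] ** power
            table[y + 1][x + 1] = table[y][x + 1] + row_sum
    return table


def box_sum(table, x0, y0, x1, y1):
    """Sum over the half-open window [x0, x1) x [y0, y1)."""
    return table[y1][x1] - table[y0][x1] - table[y1][x0] + table[y0][x0]


def sauvola(img, window=15, k=0.5, R=128.0):
    """Binarize a grayscale image (rows of 0..255): 1 = dark text, 0 = background."""
    h, w = len(img), len(img[0])
    s1, s2 = integral(img, 1), integral(img, 2)
    half = window // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        y0, y1 = max(0, y - half), min(h, y + half + 1)
        for x in range(w):
            x0, x1 = max(0, x - half), min(w, x + half + 1)
            n = (x1 - x0) * (y1 - y0)
            mean = box_sum(s1, x0, y0, x1, y1) / n
            var = max(box_sum(s2, x0, y0, x1, y1) / n - mean * mean, 0.0)
            threshold = mean * (1 + k * (var ** 0.5 / R - 1))
            out[y][x] = 1 if img[y][x] < threshold else 0
    return out
```

Because the threshold follows the local mean, a dark stroke is separated from its background whether the surrounding region is brightly or dimly lit, which a single global threshold cannot do.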


Text Skew

  • Issues :

    • When perspective is not fixed, text lines may get skewed from their original orientation


Text Skew

  • Our Solution :

    • Branch-and-Bound text line finding algorithm + Auto-rotation
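
The branch-and-bound line finder itself is not described in the slides, so the sketch below swaps in a simpler and different technique, a brute-force projection-profile search, purely to show what skew detection plus auto-rotation involves. It takes foreground pixel coordinates, tries each candidate angle, and keeps the one that packs the pixels into the fewest rows. The function names and the ±30 degree search range are assumptions.

```python
import math
from collections import Counter


def rotate(points, degrees):
    """Rotate (x, y) points about the origin by the given angle."""
    a = math.radians(degrees)
    c, s = math.cos(a), math.sin(a)
    return [(x * c - y * s, x * s + y * c) for x, y in points]


def profile_score(points):
    """Sum of squared row counts; it peaks when text lines are horizontal."""
    rows = Counter(round(y) for _, y in points)
    return sum(n * n for n in rows.values())


def estimate_skew(points, max_angle=30, step=1):
    """Return the angle (degrees) of the text lines relative to the x-axis."""
    candidates = range(-max_angle, max_angle + 1, step)
    # Undo each candidate skew and keep the one with the sharpest profile.
    return max(candidates, key=lambda a: profile_score(rotate(points, -a)))


def deskew(points):
    """Auto-rotation: rotate the points so the text lines become horizontal."""
    return rotate(points, -estimate_skew(points))
```

This search costs one pass over the pixels per candidate angle, so it is only practical for coarse angle steps or downsampled images.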


Perception Distortion (Tilt)

  • Issues :

    • Occurs when the text plane is not parallel to the imaging plane

    • Mobile phones are susceptible to tilts

    • Even a small perception distortion causes OCR to fail


Perception Distortion (Tilt)

  • Our Solution :

    • Use Embedded Orientation Sensor (Pitch and Roll)

    • Calibration
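
One way the sensor check could look is sketched below, assuming pitch and roll arrive in degrees and that calibration means recording the readings while the phone is held parallel to the text plane. The function name and the calibration model are hypothetical; the 10 degree default is borrowed from the tolerance reported on the results slide.

```python
def is_level(pitch, roll, calibration=(0.0, 0.0), tolerance=10.0):
    """Return True when the phone is close enough to parallel with the text.

    pitch, roll: orientation sensor readings in degrees.
    calibration: (pitch, roll) recorded with the phone held parallel to
                 the text plane (hypothetical calibration step).
    tolerance:   largest accepted deviation in degrees on either axis.
    """
    pitch_offset, roll_offset = calibration
    return (abs(pitch - pitch_offset) <= tolerance
            and abs(roll - roll_offset) <= tolerance)
```

A capture screen could call this on every sensor update and enable the shutter only while it returns True.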


Text Misalignment

  • Issues :

    • Camera screen covers only part of the text region

    • Irregular shapes of text characters


Text Misalignment

  • Our Solution :

    • Step#1 : Modified version of Sauvola’s algorithm

[Figure: image with its Top, Left, Right, and Bottom borders labeled]


Text Misalignment

  • Our Solution :

    • Step#1(Cont) : Routes to perform Sauvola’s algorithm


Text Misalignment

  • Our Solution :

    • Step#2 : Noise Reduction

[Figure: image with its Top, Left, Right, and Bottom borders labeled; a dimension is marked W]
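
The two steps can be illustrated with a simplified sketch, assuming the image has already been binarized (1 = foreground) and that text touching a border strip means the text region is cut off. This is not the authors' algorithm: the strip width and the noise threshold are hypothetical parameters, and the noise reduction here is just a minimum pixel count.

```python
def misaligned_borders(binary, width=2, min_pixels=3):
    """Return the borders whose strip contains text-like foreground.

    binary:     rows of 0/1 values, 1 = foreground.
    width:      strip thickness in pixels (hypothetical parameter).
    min_pixels: counts below this are treated as noise (hypothetical).
    """
    h, w = len(binary), len(binary[0])
    strips = {
        "top": binary[:width],
        "bottom": binary[h - width:],
        "left": [row[:width] for row in binary],
        "right": [row[w - width:] for row in binary],
    }
    # A strip is flagged only when its foreground count clears the noise floor.
    return [name for name, strip in strips.items()
            if sum(map(sum, strip)) >= min_pixels]
```

Checking only the four strips instead of the whole frame keeps the work proportional to the image perimeter, which fits the real-time constraint described earlier.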


Blur (Out Of Focus)

  • Issues :

    • OCR needs sharp edge response


Blur (Out Of Focus)

  • Our Solution :

    • Android autofocus mechanism


System Architecture

  • Android Phone

    • 1. Photo of a receipt

    • 2. Front end processing

    • 3. Upload image (over the Internet)

  • Web Server (OCR Engine – Tesseract)

    • 4. Perform Backend Processing & OCR

    • 5. Return OCR Results (over the Internet)

  • Android Phone

    • 6. Results returned

    • 7. Information Extraction
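
The slides do not say how information extraction is done. As one hypothetical example for a receipt, the sketch below pulls the total amount out of OCR text with a regular expression. The pattern, the function name, and the receipt layout are all assumptions.

```python
import re

# Matches a line that starts with "total" and ends with an amount such as
# 4.84 or 4,84. A "Subtotal" line is skipped because the line must begin
# with "total".
TOTAL = re.compile(r"^\s*total\b[^\d\n]*(\d+[.,]\d{2})\s*$",
                   re.IGNORECASE | re.MULTILINE)


def extract_total(ocr_text):
    """Return the receipt total as a float, or None when no total is found."""
    match = TOTAL.search(ocr_text)
    if match is None:
        return None
    return float(match.group(1).replace(",", "."))
```

OCR errors in the keyword itself (for example "T0TAL") would defeat this pattern, so a real extractor would likely need fuzzier matching.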


Front-End Architecture

  • Components: Orientation Handler, Camera Preview, Capture Image, Upload, Alignment Checker, OCR Data Receiver, Information Extraction, Mobile Database

  • Alignment Checker outcomes: Proper Alignment Detected, Improper Alignment Detected

  • External connection: Internet


Back-End Architecture

  • Store Image

  • Skew Detection & Auto-rotation

  • Binarization

  • Tesseract OCR Engine

  • OCR Text Output

  • Sends Results back to Mobile Device (over the Internet)


Experimental Results

  • Test Corpus

    • Ten distinct black & white images

    • Three distinct lighting conditions

      • Normal: Adequate light

      • Poor: Dim

      • Flooding: Light source focused on a particular portion of the image

  • Performance Metrics

    • Character Accuracy

    • Word Accuracy

    • Timing
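
The slides do not define the two accuracy metrics. A common definition, assumed here, is (n − errors) / n, where n is the length of the ground truth and errors is the edit distance between the ground truth and the OCR output, counted over characters or over words. The function names are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,               # deletion
                           cur[j - 1] + 1,            # insertion
                           prev[j - 1] + (x != y)))   # substitution
        prev = cur
    return prev[-1]


def accuracy(truth, ocr):
    """(n - errors) / n, where n is the ground-truth length; floored at 0."""
    n = len(truth)
    if n == 0:
        return 1.0
    return max(0.0, (n - edit_distance(truth, ocr)) / n)


def character_accuracy(truth, ocr):
    return accuracy(truth, ocr)


def word_accuracy(truth, ocr):
    return accuracy(truth.split(), ocr.split())
```

For the ground truth "total 4.84" and the OCR output "tota1 4.84", one character out of ten is wrong, so character accuracy is 0.9, while one of two words is wrong, so word accuracy is 0.5.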


Experimental Results

  • Binarization: (Measured by Character Accuracy)

    • Normal: Around 97%

    • Poor: Around 60%

    • Flooding: Around 60%

  • Skew tolerance: Up to 30 degrees

  • Perception Distortion: Up to 10 degrees


Experimental Results

  • Misalignment Detection: [results not preserved in this transcript]

  • Timing Performance:

    • Misalignment Detection: Less Than 6 seconds

    • Overall Process: Less Than 11 seconds


More Information

  • Project Website: http://www-scf.usc.edu/~ananddjo/ocrdroid/index.php

    • Test Cases & Results

    • Demo Video

    • Paper

    • Presentation Slides

    • Tools Information (Mobile Phone + Software)


Summary

  • OCRdroid – A Generic Framework for Developing OCR-based Applications on Mobile Phones

  • Six Design Considerations & Our Solutions

    • In particular, we propose a new real-time, computationally efficient algorithm for text misalignment detection

  • Experimental Results


Questions?


Thank You

