OCRdroid: A Framework to Digitize Text Using Mobile Phones
  • Authors
    • Mi Zhang, Anand Joshi, Ritesh Kadmawala, Karthik Dantu, Sameera Poduri, and Gaurav Sukhatme
    • University of Southern California
  • Presenter
    • Mi Zhang
Outline
  • What is OCRdroid?
  • Related Work
  • Design Considerations
  • System Architecture
  • Experimental Results
  • Summary
What is OCRdroid?
  • Why?
    • Huge demand for recognizing text in camera-captured pictures
    • Mobile phones are ubiquitous and powerful
  • What?
    • OCRdroid = OCR + Mobile Phone
    • Two Applications
      • PocketPal: Personal Receipt Management Tool
      • PocketReader: Personal Mobile Screen Reader

Related Work
  • Design and implementation of a Card Reader based on build-in camera. X.P. Luo, J. Li, and L.X. Zhen
  • Automatic detection and recognition of signs from natural scenes. X. Chen, J. Yang, and A. Waibel
  • A morphological image preprocessing suite for OCR on natural scene images. M. Elmore and M. Martonosi

Design Considerations
  • Real-Time Processing
  • Lighting Conditions
  • Text Skew
  • Perception Distortion (Tilt)
  • Text Misalignment
  • Blur (Out-of-Focus)
Real-Time Processing
  • Issues :
    • Limited memory
    • Relatively low processing power
    • Quick response required
  • Our Solutions :
    • Multi-Thread System Architecture
    • Image Compression
    • Computationally Efficient Algorithms
Lighting Conditions
  • Issues :
    • Uneven Lighting (Shadows, Reflection, Flooding, etc.)
  • Our Solution :
    • Local Binarization : Fast Sauvola’s Algorithm
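Sauvola's method computes a separate threshold for each pixel from the mean m and standard deviation s of its neighborhood: T = m * (1 + k * (s / R - 1)). A minimal sketch follows, assuming the "fast" variant gets its speed from integral images (summed-area tables), which is one common approach and not confirmed by the slides. The window size and the values k = 0.5 and R = 128 are typical defaults chosen for illustration, and the function names are hypothetical.

```python
from math import sqrt


def integral_images(img):
    """Build summed-area tables of pixel values and of squared values.

    Each table has an extra leading row and column of zeros so that
    window sums need no special handling at the image border.
    """
    h, w = len(img), len(img[0])
    s1 = [[0] * (w + 1) for _ in range(h + 1)]
    s2 = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row1 = row2 = 0
        for x in range(w):
            v = img[y][x]
            row1 += v
            row2 += v * v
            s1[y + 1][x + 1] = s1[y][x + 1] + row1
            s2[y + 1][x + 1] = s2[y][x + 1] + row2
    return s1, s2


def window_sum(table, y0, x0, y1, x1):
    """Sum over rows y0..y1-1 and columns x0..x1-1 in constant time."""
    return table[y1][x1] - table[y0][x1] - table[y1][x0] + table[y0][x0]


def sauvola_binarize(img, window=15, k=0.5, r=128.0):
    """Binarize a grayscale image given as rows of values in 0..255.

    Returns rows of 1 (text) and 0 (background). A pixel counts as text
    when it is darker than the local threshold m * (1 + k * (s / r - 1)).
    """
    h, w = len(img), len(img[0])
    s1, s2 = integral_images(img)
    half = window // 2
    out = []
    for y in range(h):
        y0, y1 = max(0, y - half), min(h, y + half + 1)
        out_row = []
        for x in range(w):
            x0, x1 = max(0, x - half), min(w, x + half + 1)
            n = (y1 - y0) * (x1 - x0)
            mean = window_sum(s1, y0, x0, y1, x1) / n
            # Clamp tiny negative values caused by floating-point rounding.
            var = max(0.0, window_sum(s2, y0, x0, y1, x1) / n - mean * mean)
            threshold = mean * (1 + k * (sqrt(var) / r - 1))
            out_row.append(1 if img[y][x] < threshold else 0)
        out.append(out_row)
    return out
```

Because the threshold follows the local mean, a dark stroke is still separated from its background in a dim region, where a single global threshold would tend to fail.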
Text Skew
  • Issues :
    • When perspective is not fixed, text lines may get skewed from their original orientation
  • Our Solution :
    • Branch-and-Bound text line finding algorithm + Auto-rotation
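The slides do not give details of the branch-and-bound text line finder, so the sketch below swaps in a simpler and different technique, a projection-profile search, only to illustrate the detect-then-rotate idea. It tries candidate angles and keeps the one that makes the row histogram of text pixels sharpest. The function names, the 0.5 degree step, and the 30 degree search range are illustrative assumptions.

```python
from math import cos, floor, radians, sin


def projection_score(points, angle_deg):
    """Score how well text pixels line up in rows after undoing angle_deg.

    points is a list of (x, y) text-pixel coordinates. The score is the
    sum of squared row counts, which peaks when each text line falls
    into as few rows as possible.
    """
    a = radians(angle_deg)
    c, s = cos(a), sin(a)
    rows = {}
    for x, y in points:
        row = floor(-x * s + y * c)  # y-coordinate after rotating by -angle
        rows[row] = rows.get(row, 0) + 1
    return sum(n * n for n in rows.values())


def estimate_skew(points, max_angle=30.0, step=0.5):
    """Return the candidate angle (degrees) with the sharpest row profile."""
    best_angle, best_score = 0.0, -1
    steps = int(round(2 * max_angle / step))
    for i in range(steps + 1):
        angle = -max_angle + i * step
        score = projection_score(points, angle)
        if score > best_score:
            best_angle, best_score = angle, score
    return best_angle


def deskew(points, angle_deg):
    """Auto-rotation step: rotate all points by -angle_deg about the origin."""
    a = radians(angle_deg)
    c, s = cos(a), sin(a)
    return [(x * c + y * s, -x * s + y * c) for x, y in points]
```

The angle is measured in the coordinate system of the input points, so with image coordinates (y pointing down) a positive result means the text lines descend toward the right.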
Perception Distortion (Tilt)
  • Issues :
    • Occurs when the text plane is not parallel to the imaging plane
    • Mobile phones are susceptible to tilts
    • Even a small Perception Distortion can cause OCR to fail
  • Our Solution :
    • Use Embedded Orientation Sensor (Pitch and Roll)
    • Calibration
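A minimal sketch of how pitch and roll readings might gate image capture, assuming the sensor reports both angles in degrees. The slides do not describe the calibration procedure, so averaging readings taken in a reference pose is an assumption, as are the function names and the default tolerance.

```python
def calibrate(samples):
    """Average (pitch, roll) readings taken in the reference pose.

    samples is a non-empty list of (pitch_deg, roll_deg) tuples recorded
    while the phone is held parallel to the text plane.
    """
    n = len(samples)
    pitch = sum(p for p, _ in samples) / n
    roll = sum(r for _, r in samples) / n
    return pitch, roll


def is_level(pitch_deg, roll_deg, offsets=(0.0, 0.0), tolerance_deg=10.0):
    """Return True when both angles are within tolerance of the calibrated pose."""
    pitch_offset, roll_offset = offsets
    return (abs(pitch_deg - pitch_offset) <= tolerance_deg
            and abs(roll_deg - roll_offset) <= tolerance_deg)
```

A capture button could stay disabled until `is_level` returns True for the latest reading, so that tilted photos are never sent on for OCR.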
Text Misalignment
  • Issues :
    • Camera screen covers only part of the text region
    • Irregular shapes of text characters
  • Our Solution :
    • Step#1 : Modified version of Sauvola’s algorithm
    • Figure: camera frame with its Top Border, Left Border, Right Border, and Bottom Border labeled
    • Step#1 (Cont.) : Routes to perform Sauvola’s algorithm
    • Step#2 : Noise Reduction

    • Figure: camera frame with Top Border, Left Border, Right Border, and Bottom Border labeled, and a width W marked
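The border-based check can be sketched as follows, assuming the image has already been binarized (1 for text, 0 for background) and that text touching a band of width W along a border means a line is being cut off. The slides do not specify their noise-reduction step, so a simple pixel-count threshold stands in for it here. The names `w`, `min_pixels`, and both functions are hypothetical.

```python
def border_bands(height, width, w):
    """Return row and column ranges for the four border bands of width w."""
    return {
        "top": (range(0, w), range(0, width)),
        "bottom": (range(height - w, height), range(0, width)),
        "left": (range(0, height), range(0, w)),
        "right": (range(0, height), range(width - w, width)),
    }


def misaligned_borders(binary, w=10, min_pixels=3):
    """List the borders whose band contains at least min_pixels text pixels.

    binary is a list of rows of 0/1 values. Bands holding fewer than
    min_pixels text pixels are treated as noise and ignored.
    """
    height, width = len(binary), len(binary[0])
    cut = []
    for name, (rows, cols) in border_bands(height, width, w).items():
        count = sum(binary[y][x] for y in rows for x in cols)
        if count >= min_pixels:
            cut.append(name)
    return cut
```

An empty result means no border appears to cut through text, so the frame can be captured. Only the four bands are inspected rather than the whole frame, which keeps the check cheap enough to run on preview frames.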

Blur (Out Of Focus)
  • Issues :
    • OCR needs sharp edge response
  • Our Solution :
    • Android autofocus mechanism
System Architecture
  • Components: Android Phone, Internet, Web Server (OCR Engine: Tesseract)
  • Workflow :
    • 1. Photo of a receipt
    • 2. Front end processing
    • 3. Upload image
    • 4. Perform Backend Processing & OCR
    • 5. Return OCR Results
    • 6. Results returned
    • 7. Information Extraction

Front-End Architecture
  • Figure (block diagram) components: Orientation Handler, Camera Preview, Alignment Checker, Capture Image, Upload, Internet, OCR Data Receiver, Information Extraction, Mobile Database
  • Alignment Checker branches: Proper Alignment Detected, Improper Alignment Detected

Back-End Architecture
  • Figure (block diagram) components: Store Image, Skew Detection & Auto-rotation, Binarization, Tesseract OCR Engine, OCR Text Output, Internet
  • Sends Results back to Mobile Device

Experimental Results
  • Test Corpus
    • Ten distinct black & white images
    • Three distinct lighting conditions
      • Normal: Adequate light
      • Poor: Dim
      • Flooding: Light source focused on a particular portion of the image
  • Performance Metrics
    • Character Accuracy
    • Word Accuracy
    • Timing
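The slides do not state how the two accuracy metrics are computed. A common definition, assumed here, is one minus the edit distance between the OCR output and the ground truth, divided by the ground-truth length, counted in characters or in whitespace-separated words. The function names are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (strings or lists)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (x != y)))  # substitution or match
        prev = cur
    return prev[-1]


def accuracy(truth, ocr):
    """Return 1 - errors / len(truth), clamped to the range 0..1."""
    if not truth:
        return 1.0 if not ocr else 0.0
    return max(0.0, 1.0 - edit_distance(truth, ocr) / len(truth))


def character_accuracy(truth_text, ocr_text):
    """Accuracy measured over individual characters."""
    return accuracy(truth_text, ocr_text)


def word_accuracy(truth_text, ocr_text):
    """Accuracy measured over whitespace-separated words."""
    return accuracy(truth_text.split(), ocr_text.split())
```

For a ground truth of "TOTAL 12.50" read as "T0TAL 12.50", one of eleven characters is wrong, giving a character accuracy of 10/11, while one of the two words is wrong, giving a word accuracy of 0.5.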

Experimental Results
  • Binarization: (Measured by Character Accuracy)
    • Normal: Around 97%
    • Poor: Around 60%
    • Flooding: Around 60%
  • Skew tolerance: Up to 30 degrees
  • Perception Distortion tolerance: Up to 10 degrees
Experimental Results
  • Misalignment Detection:
  • Timing Performance:
    • Misalignment Detection: Less Than 6 seconds
    • Overall Process: Less Than 11 seconds

More Information
  • Project Website: http://www-scf.usc.edu/~ananddjo/ocrdroid/index.php
    • Test Cases & Results
    • Demo Video
    • Paper
    • Presentation Slides
    • Tools Information (Mobile Phone + Software)
Summary
  • OCRdroid: A Generic Framework for Developing OCR-based Applications on Mobile Phones
  • Six Design Considerations & Our Solutions
    • In particular, we propose a new real-time, computationally efficient algorithm for text misalignment detection
  • Experimental Results