Recognition of Video Text Through Temporal Integration

Recognition of Video Text Through Temporal Integration • Trung Quy Phan, Palaiahnakote Shivakumara, Tong Lu and Chew Lim Tan

Introduction • Text extraction from video frames → video search and retrieval

Introduction • Low resolution • Complex background • Unconstrained appearance • Temporal information

Problem • Input: word bounding box in a reference frame, frame ID • Output: binarized image • Scope: static texts, linearly moving texts

Approach • Tracking • Alignment • Integration • Refinement

1. Tracking • Find: • [tstart, tend] → text frame span • Bounding box in each frame → text instance • (Timeline figure: tstart, tref, tend)

1. Tracking • Text descriptors • Stroke Width Transform-SIFT

1. Tracking • t = tref + 1, tref + 2, … • Initialize search area • If matchRatio ≥ 0.1 → estimate new bounding box (BB) • Otherwise, tend has been found
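
The forward tracking loop can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the slides do not define matchRatio, so here it is assumed to be the fraction of reference descriptors that have a close match in the current frame. The names `match_ratio`, `track_forward`, and `max_dist` are hypothetical, and descriptors are plain tuples of floats standing in for real SWT-SIFT descriptors.

```python
import math

def match_ratio(ref_desc, cand_desc, max_dist=0.5):
    """Assumed definition: fraction of reference descriptors that have
    at least one candidate descriptor within max_dist (Euclidean)."""
    if not ref_desc:
        return 0.0
    matched = 0
    for r in ref_desc:
        if any(math.dist(r, c) <= max_dist for c in cand_desc):
            matched += 1
    return matched / len(ref_desc)

def track_forward(ref_desc, frames, t_ref, threshold=0.1):
    """Walk t = t_ref + 1, t_ref + 2, ... and return t_end, the last
    frame whose match ratio is still >= threshold."""
    t_end = t_ref
    for t in range(t_ref + 1, len(frames)):
        if match_ratio(ref_desc, frames[t]) >= threshold:
            # Text still present; a real tracker would also estimate
            # the new bounding box here and update the search area.
            t_end = t
        else:
            # Match ratio fell below the threshold: t_end is found.
            break
    return t_end
```

Each entry of `frames` is the list of descriptors found in the search area of that frame. The 0.1 threshold is the value given on the slide; everything else is an assumption made for illustration.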

2. Alignment • Align at pixel-level → better integration • Slide reference text mask over individual masks → optimal alignment
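
Sliding the reference mask over an individual mask can be sketched as an exhaustive search over small offsets. This is only an illustrative sketch: the slides do not say how alignment quality is scored, so the overlap count of text pixels is an assumption, and `overlap`, `best_offset`, and `max_shift` are hypothetical names. Masks are lists of rows containing 0 (background) or 1 (text).

```python
def overlap(ref, mask, dx, dy):
    """Count pixels that are text in both masks when ref[y][x] is
    compared against mask[y + dy][x + dx]."""
    score = 0
    for y in range(len(ref)):
        for x in range(len(ref[0])):
            yy, xx = y + dy, x + dx
            if 0 <= yy < len(mask) and 0 <= xx < len(mask[0]):
                score += ref[y][x] & mask[yy][xx]
    return score

def best_offset(ref, mask, max_shift=2):
    """Try every (dx, dy) within +/- max_shift and keep the offset
    with the largest overlap (assumed alignment criterion)."""
    best, best_score = (0, 0), overlap(ref, mask, 0, 0)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            score = overlap(ref, mask, dx, dy)
            if score > best_score:
                best, best_score = (dx, dy), score
    return best, best_score
```

The returned offset says where the text in the individual mask sits relative to the reference, so the mask can be shifted back by that amount before integration.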

3. Integration • Text probability map

3. Integration • Initial binarization
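
One plausible way to read the two integration steps is sketched below. The slides do not state how the text probability map is computed or how it is binarized, so both choices are assumptions: the map is taken as the per-pixel fraction of aligned masks that mark the pixel as text, and the initial binarization is a fixed threshold of 0.5. The function names are hypothetical.

```python
def text_probability_map(masks):
    """Per-pixel fraction of aligned binary masks that mark the pixel
    as text (assumed definition of the probability map)."""
    n = len(masks)
    height, width = len(masks[0]), len(masks[0][0])
    return [
        [sum(m[y][x] for m in masks) / n for x in range(width)]
        for y in range(height)
    ]

def binarize(prob_map, threshold=0.5):
    """Initial binarization: keep pixels whose text probability reaches
    the threshold (the 0.5 value is an assumption)."""
    return [[1 if p >= threshold else 0 for p in row] for row in prob_map]
```

With three aligned one-row masks `[[1, 1, 0]]`, `[[1, 0, 0]]`, and `[[1, 1, 1]]`, the first pixel is text in every frame, the second in two of three, and the third in one of three, so only the first two survive the threshold.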

4. Refinement • SWT: rounded strokes • Intensity values: preserve sharp edges & holes, suppress background pixels

Experiments • Moving text dataset (English + German): 250 words, 1,545 characters; movement bottom to top, right to left, and left to right • Static text dataset (English): 212 words, 1,389 characters

Experiments • Methods for comparison • Niblack (Single) • Min/max (Multiple) • Average-Min/max (Multiple) • Ours (Single) • Ours (Multiple)
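
For reference, the single-frame Niblack baseline thresholds each pixel at T = m + k·s, where m and s are the mean and standard deviation of the intensities in a local window. The sketch below is a plain-Python illustration under stated assumptions: the window size, k = -0.2, and dark text on a light background are common defaults, not settings reported in the slides, and `niblack_binarize` is a hypothetical name.

```python
import math

def niblack_binarize(img, window=3, k=-0.2):
    """Niblack local thresholding: T = mean + k * std over a window
    centred on each pixel. Returns 1 for text, 0 for background,
    assuming dark text on a light background."""
    height, width = len(img), len(img[0])
    r = window // 2
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Collect the window, clipped at the image borders.
            vals = [
                img[yy][xx]
                for yy in range(max(0, y - r), min(height, y + r + 1))
                for xx in range(max(0, x - r), min(width, x + r + 1))
            ]
            mean = sum(vals) / len(vals)
            std = math.sqrt(sum((v - mean) ** 2 for v in vals) / len(vals))
            threshold = mean + k * std
            out[y][x] = 1 if img[y][x] < threshold else 0
    return out
```

Because the threshold depends only on one frame's local statistics, this baseline cannot use temporal information, which is what the multiple-frame methods in the comparison add.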

Sample Results

Results on Moving Texts • Character recognition rate (CRR) • Word recognition rate (WRR)
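
The two metrics can be computed as in the sketch below. The slides name CRR and WRR without defining them, so the definitions here are assumptions: CRR is taken as one minus the total character edit distance divided by the number of ground-truth characters, and WRR as the fraction of words recognized exactly. The function names are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # deletion
                cur[j - 1] + 1,            # insertion
                prev[j - 1] + (ca != cb),  # substitution or match
            ))
        prev = cur
    return prev[-1]

def recognition_rates(truth, predicted):
    """Return (CRR, WRR) for parallel lists of ground-truth and
    recognized words, under the assumed definitions above."""
    total_chars = sum(len(t) for t in truth)
    char_errors = sum(edit_distance(t, p) for t, p in zip(truth, predicted))
    crr = max(0.0, 1 - char_errors / total_chars)
    wrr = sum(t == p for t, p in zip(truth, predicted)) / len(truth)
    return crr, wrr
```

For example, with ground truth `["video", "text"]` and OCR output `["video", "test"]`, one of nine characters is wrong and one of two words is exact.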

Results on Static Texts • Multiple-frame: ~20% improvement over single-frame

Summary • A variation of SIFT for robust tracking • Integration based on word masks • Future work: handle complex text movements
