
Text Detection in Video

Min Cai, 2002.3.13

Presentation Transcript


  1. Text Detection in Video. Min Cai, 2002.3.13

  2. Background
     • Video OCR: text detection, extraction, and recognition
     • Detection target: artificial text
     • Text detection:
         • Detect the text region in a single frame
         • Refine the region by combining consecutive frames

  3. Existing Work

  4. Connected-component-based methods
     • Basic idea
         • Treat text as having a uniform color (color level) and classify each pixel as text or non-text according to its color value.
         • Combine connected text pixels into connected components.
         • Group collinear connected components into a text string.
     • Advantage
         • Can detect text of arbitrary orientation, provided the text has a similar color throughout and sits on a simple background.
     • Disadvantage
         • Sensitive to color variance
         • Lossy video compression introduces color bleeding
         • Complex backgrounds
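The three steps of the basic idea can be sketched in a few lines of plain Python. This is a minimal illustration, not the presenter's implementation: the function names, the grayscale `text_level` and `tolerance` parameters, and the row-overlap test used for "collinear" are all assumptions, and the grouping only covers horizontal strings rather than arbitrary orientations.

```python
from collections import deque


def classify_pixels(image, text_level, tolerance):
    """Mark a pixel as text when its value is within `tolerance` of `text_level`."""
    return [[abs(p - text_level) <= tolerance for p in row] for row in image]


def connected_components(mask):
    """Return bounding boxes (top, left, bottom, right), inclusive, of the
    4-connected groups of text pixels, in scan order of their first pixel."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            top, left, bottom, right = r, c, r, c
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:  # breadth-first flood fill of one component
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def group_collinear(boxes):
    """Group boxes whose row ranges overlap (a crude test for horizontal strings)."""
    lines = []
    for box in sorted(boxes, key=lambda b: b[1]):  # left to right
        for line in lines:
            last = line[-1]
            if box[0] <= last[2] and last[0] <= box[2]:
                line.append(box)
                break
        else:
            lines.append([box])
    return lines


image = [
    [0, 200, 0, 0, 205],
    [0, 198, 0, 0, 202],
    [0, 0, 0, 0, 0],
]
mask = classify_pixels(image, text_level=200, tolerance=10)
boxes = connected_components(mask)
strings = group_collinear(boxes)
```

Because the classification depends on a single color level, any color bleeding that pushes pixels outside the tolerance splits or drops components, which is the sensitivity the slide lists as a disadvantage.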

  5. Texture Segmentation method
     • Basic idea
         • Treat text as a type of texture
         • Use texture segmentation algorithms to detect text
             • Gabor filters
             • Gaussian derivatives
     • Advantage
         • Can efficiently segment text areas and graphic areas on a simple background. It is usually used in document analysis.
     • Disadvantage
         • Time-consuming
         • Does not handle text embedded in varied backgrounds well.

  6. Bottom-Up method
     • Basic idea
         • A seed region is defined as a small region with high edge density.
         • Grow seed regions into successively larger components until all seed regions in the image have been reached.
     • Advantage
         • It is a generic method for detecting homogeneous objects of various shapes: it can detect not only rectangular objects but also other shapes.
     • Disadvantage
         • Sensitive to noise.
         • Cannot handle a large range of font sizes.
         • Sensitive to stroke density (which differs between languages).

  7. Top-Down method
     • Basic idea
         • Based on the run-length smoothing algorithm
         • Analyze horizontal and vertical projection profiles
     • Advantage
         • Can detect the boundaries of horizontally aligned text strings quickly and correctly
         • Insensitive to noise
     • Disadvantage
         • Cannot handle diagonally aligned text.
         • One pass of horizontal and vertical projection cannot handle complex layouts.
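As a rough illustration of the two ingredients named above, the sketch below applies run-length smoothing to one binary row and computes projection profiles of a binary image. The helper names and the `max_gap` and `min_count` parameters are assumptions made for the example; the slides do not give the actual parameters.

```python
def rlsa_row(row, max_gap):
    """Run-length smoothing of one binary row: fill every run of 0s that lies
    between two 1s and is at most `max_gap` pixels long."""
    out = list(row)
    last_one = None
    for i, value in enumerate(row):
        if value:
            if last_one is not None and 0 < i - last_one - 1 <= max_gap:
                for j in range(last_one + 1, i):
                    out[j] = 1
            last_one = i
    return out


def horizontal_profile(binary):
    """Number of set pixels in each row."""
    return [sum(row) for row in binary]


def vertical_profile(binary):
    """Number of set pixels in each column."""
    return [sum(col) for col in zip(*binary)]


def runs_above(profile, min_count=1):
    """(start, end) index ranges, end exclusive, where profile >= min_count."""
    runs, start = [], None
    for i, value in enumerate(profile):
        if value >= min_count and start is None:
            start = i
        elif value < min_count and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs


smoothed = rlsa_row([1, 0, 0, 1, 0, 0, 0, 0, 1], max_gap=2)

binary = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [0, 0, 0, 0],
]
row_bands = runs_above(horizontal_profile(binary))
col_bands = runs_above(vertical_profile(binary))
```

The profiles only see axis-aligned sums, so a diagonal string smears across many rows and columns instead of producing clean bands, which matches the first disadvantage listed on the slide.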

  8. Analysis (1)
     • A certain contrast against the background
         • Artificial text strings are designed to be read easily
     • A certain stroke density
     • Text strings always appear horizontally
     • Spatial cohesion
         • Characters of the same text string have similar height, orientation, and spacing
     • Size constraint
         • Text strings are subject to certain size restrictions
     • A text string appears in multiple consecutive frames at similar positions.

  9. Analysis (2)

  10. Single Threshold

  11. Local threshold (1)
      • Use a small kernel (red) to scan the whole image.
      • In a larger window (gray) surrounding the kernel, calculate the local threshold from the window's local histogram.
      [Figure: a. Window move. b. Local threshold selection: histogram of count against edge strength, with 0, MIN, T-local, and MAX marked on the axis and the regions labeled "Low half" and "High half".]
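The scanning scheme can be sketched as follows. The slide does not state how T-local is chosen from the local histogram, so the rule used here, the midpoint between the window's minimum and maximum edge strength, is only a placeholder assumption, as are the function name and the `kernel` and `margin` parameters.

```python
def local_threshold_map(edge, kernel=2, margin=2):
    """Binarize an edge-strength map block by block.

    Each kernel x kernel block is thresholded with a value computed from the
    larger window that extends `margin` pixels beyond the block on every side.
    """
    rows, cols = len(edge), len(edge[0])
    out = [[0] * cols for _ in range(rows)]
    for r0 in range(0, rows, kernel):
        for c0 in range(0, cols, kernel):
            # Larger window surrounding the kernel, clipped at the image border.
            wr0, wr1 = max(0, r0 - margin), min(rows, r0 + kernel + margin)
            wc0, wc1 = max(0, c0 - margin), min(cols, c0 + kernel + margin)
            window = [edge[r][c] for r in range(wr0, wr1) for c in range(wc0, wc1)]
            # Placeholder rule (assumption): midpoint of local MIN and MAX.
            t_local = (min(window) + max(window)) / 2
            for r in range(r0, min(rows, r0 + kernel)):
                for c in range(c0, min(cols, c0 + kernel)):
                    out[r][c] = 1 if edge[r][c] > t_local else 0
    return out


edge = [
    [10, 10, 90, 90],
    [10, 10, 90, 90],
]
binary = local_threshold_map(edge, kernel=2, margin=2)
flat = local_threshold_map([[5, 5], [5, 5]])
```

With this placeholder rule a completely flat window produces no foreground, since no pixel exceeds the midpoint of identical values.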

  12. Local threshold (2)

  13. Text-like area recovery (1)
      [Figure: before recovery; after recovery]

  14. Text-like area recovery (2)
      [Figure: before recovery; after recovery]

  15. High pass filter

  16. Coarse-to-Fine detection
      • Use a top-down scheme to detect text-like areas
      [Flowchart: Initial: add the whole image to the processing array; take the first region from the array; horizontal projection; vertical projection; "Can divide?" with Yes and No branches leading to "Add to processing array" and "Add to result array".]
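The flowchart reads as a work-queue loop, sketched below in plain Python. The branch mapping is an assumption, since the transcript does not preserve which arrow goes where: here a region that can be divided has its parts put back into the processing array, and a region that cannot be divided goes to the result array. All function names are hypothetical.

```python
from collections import deque


def _runs(profile):
    """(start, end) ranges, end exclusive, where the profile is non-zero."""
    runs, start = [], None
    for i, value in enumerate(profile):
        if value and start is None:
            start = i
        elif not value and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs


def split_region(binary, region):
    """Split a region (top, left, bottom, right; bottom and right exclusive)
    with a horizontal projection, then a vertical projection per row band."""
    top, left, bottom, right = region
    parts = []
    row_profile = [sum(binary[r][left:right]) for r in range(top, bottom)]
    for r0, r1 in _runs(row_profile):
        col_profile = [
            sum(binary[r][c] for r in range(top + r0, top + r1))
            for c in range(left, right)
        ]
        for c0, c1 in _runs(col_profile):
            parts.append((top + r0, left + c0, top + r1, left + c1))
    return parts


def coarse_to_fine(binary):
    """Repeat projection splitting until no region can be divided further."""
    processing = deque([(0, 0, len(binary), len(binary[0]))])
    results = []
    while processing:
        region = processing.popleft()
        parts = split_region(binary, region)
        if parts == [region]:
            results.append(region)    # cannot divide: final text-like area
        else:
            processing.extend(parts)  # divided or trimmed: process again
    return results


binary = [
    [1, 1, 0, 1, 1],
    [1, 1, 0, 0, 0],
    [1, 1, 0, 1, 1],
]
areas = coarse_to_fine(binary)
```

In this example a single pass of horizontal and vertical projection yields only two columns of content; the second pass over the right-hand column is what separates its two rows, which is the kind of layout a one-pass top-down method cannot resolve.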

  17. Detect text-like areas
      [Figure: panels 1) to 4); b. Coarse vertical projection]

  18. Refinement
      • Combine neighboring text areas of similar height
      • Use size constraints to remove areas that do not satisfy them
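A minimal sketch of both refinement steps, operating on bounding boxes, might look like this. Every threshold here (`max_gap`, `height_ratio`, `min_height`, `max_height`, `min_aspect`) is an illustrative assumption; the slides do not give the values used.

```python
def merge_neighbors(boxes, max_gap=4, height_ratio=0.8):
    """Merge horizontally neighboring boxes of similar height.

    Boxes are (top, left, bottom, right) with bottom and right exclusive and
    must have positive height.
    """
    merged = []
    for box in sorted(boxes, key=lambda b: (b[1], b[0])):  # left to right
        for i, m in enumerate(merged):
            h_box, h_m = box[2] - box[0], m[2] - m[0]
            similar = min(h_box, h_m) / max(h_box, h_m) >= height_ratio
            rows_overlap = box[0] < m[2] and m[0] < box[2]
            close = box[1] - m[3] <= max_gap
            if similar and rows_overlap and close:
                merged[i] = (min(m[0], box[0]), min(m[1], box[1]),
                             max(m[2], box[2]), max(m[3], box[3]))
                break
        else:
            merged.append(box)
    return merged


def apply_size_constraints(boxes, min_height=8, max_height=60, min_aspect=1.5):
    """Keep boxes whose height and width-to-height ratio look like a text line."""
    kept = []
    for top, left, bottom, right in boxes:
        height, width = bottom - top, right - left
        if min_height <= height <= max_height and width / height >= min_aspect:
            kept.append((top, left, bottom, right))
    return kept


candidates = [(10, 0, 20, 30), (11, 32, 21, 60), (50, 0, 53, 4)]
merged = merge_neighbors(candidates)
refined = apply_size_constraints(merged)
```

The first two candidates sit on the same line with nearly equal heights and a two-pixel gap, so they merge; the third is too short to pass the size constraint.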

  19. Multi-frame analysis
      • Text region matching
          • Find all the regions corresponding to the same text
      • Text region enhancement
          • Enhance the text image quality by multi-frame integration
      • Repetitive text elimination
          • Record each text only at its first appearance.
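The three steps can be illustrated with a small sketch. The matching criterion (intersection over union of boxes in consecutive frames), the `min_iou` value, and pixel-wise averaging as the form of "integration" are all assumptions chosen for the example; the slides do not say which measures are used.

```python
def iou(a, b):
    """Intersection over union of two boxes (top, left, bottom, right),
    with bottom and right exclusive."""
    ih = min(a[2], b[2]) - max(a[0], b[0])
    iw = min(a[3], b[3]) - max(a[1], b[1])
    if ih <= 0 or iw <= 0:
        return 0.0
    inter = ih * iw
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def track_text(frames_boxes, min_iou=0.5):
    """Link boxes across consecutive frames into one track per text.

    Each track records the frame where the text first appeared, so repeated
    detections of the same text are not recorded again.
    """
    tracks = []
    for idx, boxes in enumerate(frames_boxes):
        for box in boxes:
            for track in tracks:
                # Only extend tracks that were seen in the previous frame.
                if track["last"] == idx - 1 and iou(track["boxes"][-1], box) >= min_iou:
                    track["boxes"].append(box)
                    track["last"] = idx
                    break
            else:
                tracks.append({"first": idx, "last": idx, "boxes": [box]})
    return tracks


def integrate(patches):
    """Pixel-wise mean of equally sized grayscale patches of the same text."""
    n = len(patches)
    height, width = len(patches[0]), len(patches[0][0])
    return [[sum(p[r][c] for p in patches) / n for c in range(width)]
            for r in range(height)]


frames = [
    [(0, 0, 10, 40)],
    [(0, 1, 10, 41)],
    [(0, 1, 10, 41), (50, 0, 60, 20)],
    [],
]
tracks = track_text(frames)
enhanced = integrate([[[0, 10]], [[10, 30]]])
```

Averaging is only one possible integration; it suppresses a changing background while a static caption stays put, but other operators could be substituted in `integrate` without changing the matching step.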

  20. Thank you! End
