
LYU0203 Smart Traveller with Visual Translator for OCR and Face Recognition

Department of Computer Science & Engineering The Chinese University of Hong Kong. LYU0203 Smart Traveller with Visual Translator for OCR and Face Recognition. Supervised by Prof. LYU, Rung Tsong Michael. Prepared by: Wong Chi Hang Tsang Siu Fung. Outline. Introduction Overall Design


Presentation Transcript


  1. Department of Computer Science & Engineering, The Chinese University of Hong Kong. LYU0203 Smart Traveller with Visual Translator for OCR and Face Recognition. Supervised by Prof. LYU, Rung Tsong Michael. Prepared by: Wong Chi Hang, Tsang Siu Fung

  2. Outline • Introduction • Overall Design • Korean OCR • Face Detection • Future Work

  3. Introduction – What is VTT? • Smart Traveller with Visual Translator (VTT) • Runs on a mobile device that is convenient for a traveller to carry • Mobile Phone, Pocket PC, Palm, etc. • Recognizes foreign text and translates it into the user's native language • Detects a face and recognizes it, giving the person's name

  4. Introduction – Motivation • More and more people own mobile devices, including Pocket PCs, Palms and mobile phones. • Mobile devices are becoming more powerful. • Many people travel abroad.

  5. Introduction – Motivation (Cont.) • Types of programs for Mobile Device • Communication and Network • Multimedia • Games • Personal management • System tool • Utility

  6. Introduction – Motivation (Cont.) • Applications for travellers? • Almost none. • Travellers very often run into problems with unfamiliar foreign languages. • Therefore, the demand for a travellers' application is very large.

  7. Introduction – Objective • Help travellers overcome language and memory problems • Two main features: • Recognize Korean and translate it to English (we cannot read Korean) • Detect and recognize a face (sometimes we forget the name of a friend)

  8. Introduction – Objective (Cont.) • Target of Korean OCR • Signs and Guideposts • Printed Characters • Text Color Contrasting with Background Color • Target of Face Recognizer • One face per photo • Frontal face • Limited set of faces

  9. Introduction – Objective (Cont.) • Real Life Examples • Sometimes we lose our way and need to know where we are. • Sometimes we forget somebody we have met before.

  10. Overall Design of VTT System [Diagram: User, GUI, Korean OCR, Face Recognizer, Camera API, Stroke Database & Dictionary, Face Database and Camera, connected by Request/Response, Query/Result, Update, Data and Output arrows]

  11. KOCR – Design

  12. KOCR – Text Area Detection • Edge Detection using Sobel Filter
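The Sobel step named on this slide can be sketched as follows. This is a minimal pure-Python illustration, not the project's code: the function name `sobel_magnitude`, the list-of-lists image layout and the choice to leave border pixels at zero are all assumptions.

```python
import math

def sobel_magnitude(gray):
    """Return the Sobel gradient magnitude of a 2-D list of gray levels.

    Border pixels are left at 0.0 (an assumption; the slides do not say
    how borders are handled).
    """
    h, w = len(gray), len(gray[0])
    kx = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # responds to horizontal change
    ky = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # responds to vertical change
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gy = 0
            for j in range(3):
                for i in range(3):
                    p = gray[y + j - 1][x + i - 1]
                    gx += kx[j][i] * p
                    gy += ky[j][i] * p
            out[y][x] = math.sqrt(gx * gx + gy * gy)
    return out
```

Text strokes produce many strong responses close together, which is what the projection step on the next slide relies on.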

  13. KOCR – Text Area Detection (Cont.) • Horizontal and Vertical Edge Projection [Figure labels: Horizontal Projection, Vertical Projection, Threshold]
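One way to read this slide: sum the edge map along rows and along columns, then keep the span where each profile exceeds a threshold. The sketch below assumes exactly that; the helper names and the single shared `threshold` parameter are hypothetical, and the slides do not say how the threshold was chosen.

```python
def project(edges):
    """Return (row_sums, column_sums) of a 2-D edge map."""
    rows = [sum(r) for r in edges]
    cols = [sum(c) for c in zip(*edges)]
    return rows, cols

def span_above(profile, threshold):
    """First and last index whose value exceeds the threshold, or None."""
    idx = [i for i, v in enumerate(profile) if v > threshold]
    return (idx[0], idx[-1]) if idx else None

def text_box(edges, threshold):
    """Bounding box (top, bottom, left, right) of the edge-dense area."""
    rows, cols = project(edges)
    r = span_above(rows, threshold)
    c = span_above(cols, threshold)
    if r is None or c is None:
        return None
    return (r[0], r[1], c[0], c[1])
```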

  14. KOCR – Binarization • Color Segmentation • Based on a Color Histogram Threshold

  15. KOCR – Stroke Extraction • Labeling of Connected Component with 8-connectivity
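Connected-component labeling with 8-connectivity can be sketched with a breadth-first flood fill. The function name and the convention that label 0 means background are assumptions made for illustration.

```python
from collections import deque

def label_components(binary):
    """Label foreground pixels of a 2-D 0/1 list using 8-connectivity.

    Returns (labels, count); labels uses 0 for background and 1..count
    for the components, numbered in raster-scan order of discovery.
    """
    h, w = len(binary), len(binary[0])
    labels = [[0] * w for _ in range(h)]
    count = 0
    for y in range(h):
        for x in range(w):
            if binary[y][x] and labels[y][x] == 0:
                count += 1
                labels[y][x] = count
                queue = deque([(y, x)])
                while queue:
                    cy, cx = queue.popleft()
                    # Visit all 8 neighbours, diagonals included.
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = cy + dy, cx + dx
                            if (0 <= ny < h and 0 <= nx < w
                                    and binary[ny][nx]
                                    and labels[ny][nx] == 0):
                                labels[ny][nx] = count
                                queue.append((ny, nx))
    return labels, count
```

With 8-connectivity, two pixels that touch only at a corner belong to the same component, so thin diagonal strokes are not split apart.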

  16. KOCR – Stroke Extraction (Cont.) • Why do we choose strokes rather than whole characters? • A Korean character is composed of several stroke types • Korean has a limited set of stroke types

  17. KOCR – Stroke Feature • Our Proposed Feature • Five rays each side • Difference of adjacent rays (-1 or 0 or 1) • Has holes (0 or 1) • Dimension ratio of Stroke (width/height) (-1 or 0 or 1)

  18. KOCR – Stroke Feature (Cont.) • Problems Faced • Training the stroke database takes a long time • Two or more strokes may stick together

  19. KOCR – Stroke Recognition • Exact Matching by Pre-learned Stroke Features • Trained Decision Tree

  20. KOCR – Pattern Identification • Six Patterns of Korean Characters • Identified by simple if-then-else statements [Figure: the six patterns, labeled 0 to 5]

  21. Face Detection • Outline • 1. Find Face Region • 2. Find the potential eye region • 3. Locate the iris • 4. Improvement

  22. 1. Find Face Region • There are three methods available: 1. Projection of the image 2. Based on the gray-scale image 3. Color-based model

  23. 1. Find Face Region -Projection of the image • Consider only a single color channel: blue, green or red. • Usually the blue channel is used because it avoids interference from facial features. • Project the blue pixel values vertically to find the left and right edges of the face.

  24. 1. Find Face Region (Cont.) -Projection of the image [Figure: sum of pixel values plotted against x, with the edges of the face marked]

  25. 1. Find Face Region (Cont.) -Projection of the image • The high-frequency components of the projection curve should be filtered out using the FFT (Fast Fourier Transform) • We assume the face occupies a large area of the image
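The projection and smoothing steps can be sketched as below. For brevity the sketch uses a naive O(n²) discrete Fourier transform as a stand-in for the FFT named on the slide; the result is the same, only slower. The function names and the `keep` cut-off parameter are assumptions, and the slides do not say which cut-off was used.

```python
import cmath

def column_projection(channel):
    """Sum one color channel (2-D list) down each column."""
    return [sum(col) for col in zip(*channel)]

def low_pass(signal, keep):
    """Smooth a 1-D signal by zeroing frequency bins above `keep`.

    Uses a naive DFT (stand-in for an FFT) and its inverse.
    """
    n = len(signal)
    spectrum = [
        sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n)
            for t in range(n))
        for k in range(n)
    ]
    for k in range(n):
        # Bins k and n-k carry the same frequency, so test both ends.
        if min(k, n - k) > keep:
            spectrum[k] = 0
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
            for k in range(n)).real / n
        for t in range(n)
    ]
```

After smoothing, only the broad shape of the projection curve remains, which makes the left and right edges of a large face easier to locate.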

  26. 1. Find Face Region -Based on the gray-scale image • No color information • Pattern recognition

  27. 1. Find Face Region -Color-based model • We use this method because of its simplicity and robustness. • A color model is used to represent color. • Since the human retina has three types of color photoreceptor (cone) cells, a color model needs three numerical components.

  28. Color-based model (Cont.) • There are many color models, such as RGB, YUV (luminance-chrominance) and HSB (hue, saturation and brightness) • Usually the RGB color model is transformed to another color model such as YUV or HSB.

  29. Color-based model (Cont.) -YUV • We use the YUV (YCbCr) color model. • The Y component represents the intensity of the image • Cb and Cr represent the blue-difference and red-difference chroma components respectively.
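The slides do not state which RGB-to-YCbCr variant was used. As an illustration only, the sketch below assumes the common full-range BT.601 (JPEG-style) conversion for 8-bit values; the function name is hypothetical.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert 8-bit RGB to (Y, Cb, Cr) with full-range BT.601 weights.

    Assumption: this is the JPEG-style variant, with chroma centred on 128.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr
```

Any gray pixel (r = g = b) maps to Cb = Cr = 128, so brightness changes move a pixel along Y only. This is why skin color can be modelled in the Cb-Cr plane alone.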

  30. Color-based model (Cont.) -YCbCr [Figure: original image and its Y, Cb and Cr component images]

  31. Representation of Face color • How can YUV color model represent face color? • What happens when we transform the pixel into Cr-Cb histogram?

  32. Representation of Face color • We just use a simple ellipse equation to model skin color. [Figure: ellipse in the Cr-Cb plane]

  33. Representation of Face color • The equation of the ellipse (the equation image is not preserved in this transcript), where L is the length of the long axis and S is the length of the short axis. • We choose L = 35.42, S = 20.615, θ = -0.726 (radians)
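Since the ellipse equation itself is missing from the transcript, the following is only a sketch of a generic rotated-ellipse membership test using the three values quoted on the slide. It makes two loud assumptions: L and S are treated as semi-axis lengths, and the ellipse centre, which the transcript does not give, must be supplied by the caller through the hypothetical parameters `center_cb` and `center_cr`.

```python
import math

# Values quoted on the slide.
L_AXIS = 35.42     # long axis
S_AXIS = 20.615    # short axis
THETA = -0.726     # rotation in radians

def in_skin_ellipse(cb, cr, center_cb, center_cr,
                    long_axis=L_AXIS, short_axis=S_AXIS, theta=THETA):
    """Return True if (cb, cr) lies inside the rotated ellipse.

    Assumptions: long_axis/short_axis are semi-axis lengths, and the
    centre is a caller-supplied value (not given in the transcript).
    """
    dx, dy = cb - center_cb, cr - center_cr
    # Rotate the offset into the ellipse's own axes.
    u = dx * math.cos(theta) + dy * math.sin(theta)
    v = -dx * math.sin(theta) + dy * math.cos(theta)
    return (u / long_axis) ** 2 + (v / short_axis) ** 2 <= 1.0
```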

  34. Representation of Face color -Color segmentation • The white regions represent the skin color pixels

  35. Representation of Face color -Color segmentation (modified version 1) • We distribute some agents uniformly over the image. • Each agent then checks whether its pixel is a skin-like pixel that has not been visited by another agent. • If yes, it produces 4 more agents at its four neighboring points. • If no, it moves to one of its four neighboring points at random.
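The agent rules above can be sketched as follows. The slides do not say how agents terminate, so this sketch adds a `lifetime` (number of random moves an agent may make before it is dropped); that, the seeding `step`, the rule that an agent retires after spawning, and all names are assumptions.

```python
import random

def agent_segmentation(skin, step=4, lifetime=8, seed=0):
    """Grow skin regions with wandering agents.

    skin: 2-D list of booleans (True = skin-like pixel).
    Returns (labels, examined): labels holds a region id per claimed
    pixel (0 = unclaimed); examined is the number of distinct pixels
    that any agent looked at.
    """
    rng = random.Random(seed)
    h, w = len(skin), len(skin[0])
    labels = [[0] * w for _ in range(h)]
    moves = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    agents, region = [], 0
    for y in range(0, h, step):            # uniform grid of seed agents
        for x in range(0, w, step):
            region += 1
            agents.append((y, x, region, lifetime))
    examined = set()
    while agents:
        y, x, reg, life = agents.pop()
        examined.add((y, x))
        if skin[y][x] and labels[y][x] == 0:
            # Unvisited skin pixel: claim it and spawn 4 agents.
            labels[y][x] = reg
            for dy, dx in moves:
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w:
                    agents.append((ny, nx, reg, lifetime))
        elif life > 0:
            # Otherwise move to a random neighbour (stay put at borders).
            dy, dx = rng.choice(moves)
            ny, nx = y + dy, x + dx
            if not (0 <= ny < h and 0 <= nx < w):
                ny, nx = y, x
            agents.append((ny, nx, reg, life - 1))
    return labels, len(examined)
```

Comparing `examined` with the total pixel count gives the kind of "pixels searched" figure reported a few slides later.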

  36. Representation of Face color (Cont.) -Color segmentation (modified version 1) • If the pixel is a skin-like pixel that has not been visited by another agent, the agent produces 4 more agents at its four neighboring points [Figure label: this agent produces 4 more agents]

  37. Representation of Face color (Cont.) -Color segmentation (modified version 1) • Otherwise, the agent moves to one of its four neighboring points at random [Figure label: this agent moves to one of its four neighboring points]

  38. Representation of Face color (Cont.) -Color segmentation (modified version 1) • Each agent searches its own region • Each region is shown in the next slide in a different color.

  39. Representation of Face color (Cont.) -Color segmentation (modified version 1) • The advantage of this algorithm is that we need not search the whole image. • Therefore, it is fast.

  40. Representation of Face color (Cont.) -Color segmentation (modified version 1) • 19270 of 102900 pixels are searched (about 18.7%) • There are 37 regions

  41. 2. Eye detection • After the segmentation of face region, we have some parts which are not regarded as skin color. • They are probably the region of eye and mouth • We only consider the red component of these regions because it usually includes the most information about faces.

  42. 2. Eye detection (Cont.) • We extract such regions using a pseudo-convex hull.

  43. 2. Eye detection (Cont.) • We apply the following to the potential eye regions: • Histogram equalization • Thresholding

  44. 2. Eye detection (Cont.) • After histogram equalization and thresholding (pixel value < 49), the search space for the eyes is greatly reduced. [Figure: the region after histogram equalization and after thresholding]
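These two operations can be sketched as below, assuming an 8-bit single-channel image stored as a list of lists and reading "< 49" as "keep pixels whose equalized value is below 49". The function names and the standard CDF-based equalization formula are assumptions; the slides do not give the exact variant used.

```python
def equalize(gray):
    """Histogram-equalize an 8-bit image given as a 2-D list of ints."""
    hist = [0] * 256
    for row in gray:
        for p in row:
            hist[p] += 1
    total = sum(hist)
    cdf, running = [], 0
    for c in hist:
        running += c
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:                  # flat image: nothing to stretch
        return [row[:] for row in gray]
    # Map each gray level through the normalized cumulative histogram.
    lut = [round(255 * max(c - cdf_min, 0) / (total - cdf_min)) for c in cdf]
    return [[lut[p] for p in row] for row in gray]

def dark_mask(gray, threshold=49):
    """True where the equalized pixel is darker than the threshold."""
    return [[p < threshold for p in row] for row in equalize(gray)]
```

Equalizing first makes the fixed threshold less sensitive to overall lighting, since the darkest pixels of every region end up near 0.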

  45. 3. Locate the iris • After the operations above, we have almost found the eyes. • However, we still need to locate the iris. We use two different methods: • Template matching • Hough Transform

  46. 3. Locate the iris (Cont.) -Template matching • It is based on normalized cross-correlation (NCC). • NCC measures the similarity between two images

  47. 3. Locate the iris (Cont.) -Template matching • Let I1, I2 be images of the same size, with I1(pi) = ai and I2(pi) = bi. • NCC(I1, I2) lies in the range [-1, 1]
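The NCC formula itself did not survive in the transcript. A common definition that matches the stated range [-1, 1] is the zero-mean form, the sum of (ai - mean(a))(bi - mean(b)) divided by the square root of the product of the two sums of squared deviations. The sketch below assumes that form; the function name and the choice to return 0.0 for a constant image are also assumptions.

```python
import math

def ncc(a, b):
    """Zero-mean normalized cross-correlation of two equal-length lists.

    a and b hold the pixel values ai and bi of two images of the same
    size, flattened in the same order. Returns 0.0 if either is constant.
    """
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a)
                    * sum((y - mean_b) ** 2 for y in b))
    return num / den if den else 0.0
```

To match a template, this score would be evaluated for every candidate window in the eye region, keeping the window with the highest value.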

  48. 3. Locate the iris (Cont.) -Template matching • We use this template and calculate the NCC. • The template can be obtained by averaging all the eye images.

  49. 3. Locate the iris (Cont.) -Template matching • The red regions show the result

  50. 3. Locate the iris (Cont.) -Hough transform • The Hough Transform can find the complete shape of an edge from a small portion of the edge information. • It works with a parametric representation of the object we are looking for. • We use the Hough transform with a 2D circle parametric representation to find the iris.
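A circle is parameterized by its centre (a, b) and radius r. The sketch below is a simplified illustration that assumes the radius is already known, so only the centre is searched; a full circle Hough transform would also vote over a range of radii. The function name and the `n_angles` sampling parameter are hypothetical.

```python
import math
from collections import Counter

def hough_circle_center(edge_points, radius, n_angles=72):
    """Vote for the centre of a circle of known radius.

    edge_points: iterable of (x, y) edge pixel coordinates.
    Returns ((a, b), votes) for the best centre, or None if no points.
    """
    votes = Counter()
    for x, y in edge_points:
        cells = set()
        # Every centre at distance `radius` from this edge point is a
        # candidate; sample them around the full circle.
        for k in range(n_angles):
            t = 2 * math.pi * k / n_angles
            a = round(x - radius * math.cos(t))
            b = round(y - radius * math.sin(t))
            cells.add((a, b))
        for cell in cells:        # one vote per cell per edge point
            votes[cell] += 1
    return votes.most_common(1)[0] if votes else None
```

Because each edge point votes independently, a partially occluded iris (for example, one covered by the eyelid) can still produce a clear peak at the true centre.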
