
A Non-obtrusive Head Mounted Face Capture System

Chandan K. ReddyMaster’s Thesis Defense

Thesis Committee:

Dr. George C. Stockman (Main Advisor)

Dr. Frank Biocca (Co-Advisor)

Dr. Charles Owen

Dr. Jannick Rolland (External Faculty)

Modes of Communication
  • Text only - e.g. Mail, Electronic Mail
  • Voice only – e.g. Telephone
  • PC camera based conferencing – e.g. Web cam
  • Multi-user Teleconferencing
  • Teleconferencing through Virtual Environments
  • Augmented Reality Based Teleconferencing
Problem Definition
  • Face Capture System ( FCS )
  • Virtual View Synthesis
  • Depth Extraction and 3D Face Modeling
  • Head Mounted Projection Displays
  • 3D Tele-immersive Environments
  • High Bandwidth Network Connections
Thesis Contributions
  • Complete hardware setup for the FCS.
  • Camera–mirror parameter estimation for the optimal configuration of the FCS.
  • Generation of quality frontal videos from two side videos.
  • Reconstruction of a texture-mapped 3D face model from two side views.
  • Evaluation mechanisms for the generated frontal views.
Existing Face Capture Systems

FaceCap3D – a product from Standard Deviation

Optical Face Tracker – a product from Adaptive Optics

Advantages: freedom of head movement

Drawbacks: obstruction of the user's field of view

Main applications: character animation and mobile environments

Existing Face Capture Systems


"Sea of Cameras" (UNC Chapel Hill)

National Tele-immersion Initiative

Advantages: no burden on the user

Drawbacks: requires a heavily instrumented environment and restricts head motion

Main applications: teleconferencing and collaborative work

Proposed Face Capture System

(F. Biocca and J. P. Rolland, “Teleportal face-to-face system”, Patent Filed, 2000.)

A novel face capture system currently under development.

Two cameras capture the corresponding side views of the face through the mirrors.

  • User's field of view is unobstructed
  • Portable and easy to use
  • Produces accurate, high-quality face images
  • Can process in real time
  • Simple and user-friendly system
  • Static with respect to the human head
  • Flipping the mirrors lets the cameras capture the user's own viewpoint
  • Mobile Environments
  • Collaborative Work
  • Multi-user Teleconferencing
  • Medical Areas
  • Distance Learning
  • Gaming and Entertainment industry
  • Others
Optical Layout
  • Three Components to be considered
    • Camera
    • Mirror
    • Human Face
Specification Parameters
  • Camera
    • Sensing area: 3.2 mm × 2.4 mm (¼″).
    • Pixel dimensions: the sensed image is 768 × 494 pixels; the digitized image is 320 × 240 due to RAM-size restrictions.
    • Focal length (Fc): 12 mm (VCL-12UVM).
    • Field of view (FOV): 15.2° × 11.4°.
    • Diameter (Dc): 12 mm.
    • F-number (Nc): 1, to maximize light gathering.
    • Minimum working distance (MWD): 200 mm.
    • Depth of field (DOF): to be estimated.
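The DOF left "to be estimated" can be approximated with the standard thin-lens depth-of-field formulas. A minimal sketch, assuming a circle of confusion of about 0.005 mm for the ¼″ sensor and a subject distance of roughly Dcm + Dmf ≈ 350 mm (both values are assumptions, not taken from the slides):

```python
# Thin-lens depth-of-field estimate for the FCS camera.
def depth_of_field(focal_mm, f_number, subject_mm, coc_mm=0.005):
    """Return (near limit, far limit, DOF), all in mm."""
    h = focal_mm**2 / (f_number * coc_mm) + focal_mm   # hyperfocal distance
    near = subject_mm * (h - focal_mm) / (h + subject_mm - 2 * focal_mm)
    far = subject_mm * (h - focal_mm) / (h - subject_mm)
    return near, far, far - near

# Fc = 12 mm, Nc = 1, subject at Dcm + Dmf ~ 350 mm
near, far, dof = depth_of_field(12.0, 1.0, 350.0)
```

By this estimate the depth of field at f/1 is only on the order of 8 mm, which underlines the trade-off with field of view mentioned below.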
Specification Parameters (Contd.)
  • Mirror
    • Diameter (Dm) / F-number (Nm)
    • Focal length (fm)
    • Magnification factor (Mm)
    • Radius of curvature (Rm)
  • Human Face
    • Height of the face to be captured (H ~ 250 mm)
    • Width of the face to be captured (W ~ 175 mm)
  • Distances
    • Distance between the camera and the mirror (Dcm ~ 150 mm)
    • Distance between the mirror and the face (Dmf ~ 200 mm)
Customization of Cameras and Mirrors
  • Off-the-shelf cameras
    • Customizing camera lenses is a tedious task
    • A trade-off has to be made between the field of view and the depth of field
    • The Sony DXC-LS1 with a 12 mm lens is suitable for our application
  • Custom designed mirrors
    • A plano-convex lens of 40 mm diameter, coated black on the planar side.
    • The radius of curvature of the convex surface is 155.04 mm.
    • The thickness at the center of the lens is 5 mm.
    • The thickness at the edge is 3.7 mm.
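The unspecified mirror parameters fm and Mm can be sanity-checked from the radius of curvature, since for a spherical mirror f = R/2 (negative for a convex, diverging surface). A sketch, assuming the face sits at Dmf ≈ 200 mm:

```python
R_m = 155.04          # radius of curvature of the convex surface (mm)
f_m = -R_m / 2.0      # mirror focal length; negative for a convex mirror
d_o = 200.0           # mirror-to-face distance Dmf (mm)

# Mirror equation: 1/d_o + 1/d_i = 1/f_m
d_i = 1.0 / (1.0 / f_m - 1.0 / d_o)   # negative: virtual image behind the mirror
m = -d_i / d_o                        # lateral magnification M_m
```

The virtual image of the face comes out minified to roughly 0.28×, which is what allows a camera with a narrow 15.2° field of view to capture the whole face.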
Problem Statement

Generating virtual frontal view from two side views

Data processing
  • Two synchronized videos are captured simultaneously in real time (30 frames/sec).
  • For effective capturing and processing, the data is stored in uncompressed format.
  • Machine specifications (Lorelei @ metlab.cse.msu.edu):
    • Pentium III processor
    • Processor speed: 746 MHz
    • RAM size: 384 MB
    • Hard disk write speed (practical): 9 MB/s
  • MIL-LITE is configured to use 150 MB of RAM
Data processing (Contd.)
  • Size of 1 second of video = 30 × 320 × 240 × 3 bytes ≈ 6.59 MB
  • Using 150 MB of RAM, only about 10 seconds of video from the two cameras can be captured
  • Why does the processing have to be offline?
    • The calibration procedure is not automatic
    • The disk write speed would have to be at least 14 MB/s, but the disk sustains only 9 MB/s
    • To capture two videos at 640 × 480 resolution, the disk write speed would have to be at least ~54 MB/s
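The arithmetic behind these bandwidth limits is easy to reproduce (sizes in binary megabytes, which matches the 6.59 MB figure above):

```python
def raw_video_rate_mb(width, height, fps=30, bytes_per_pixel=3):
    """Uncompressed video data rate in MB (2**20 bytes) per second."""
    return width * height * bytes_per_pixel * fps / 2**20

one_camera = raw_video_rate_mb(320, 240)    # ~6.59 MB/s per camera
two_cameras = 2 * one_camera                # ~13.2 MB/s for both streams
two_vga = 2 * raw_video_rate_mb(640, 480)   # two 640 x 480 streams
```

Two 320 × 240 streams need about 13.2 MB/s, well above the disk's practical 9 MB/s; two 640 × 480 streams come to roughly 53 MB/s by this count.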
Structured Light technique

A grid is projected onto the frontal view of the face.

A square of the grid in the frontal view appears as a quadrilateral (with curved edges) in the real side view.

Color Balancing
  • Hardware based approach
    • White balancing of the cameras
  • Why is this more robust? Why not software-based?
    • There is no change in the input camera
    • Better handling of varying lighting conditions
    • No prior knowledge of the skin color is required
    • No additional overhead
    • It is enough if the two cameras are color-balanced relative to each other
Off-line Calibration Stage

(Diagram: the left and right calibration face images are processed off-line into transformation tables.)

Operational Stage


(Diagram: the left and right face images are warped via the transformation tables into left and right warped face images, then combined into a mosaiced face image.)

Comparison of the Frontal Views

First row – Virtual frontal views

Second row – Original frontal views

Video Synchronization (Eye blinking)

First row – Virtual frontal views

Second row – Original frontal views

Coordinate Systems

There are five coordinate systems in our application

  • World Coordinate System (WCS)
  • Face Coordinate System (FCS)
  • Left Camera Coordinate system (LCCS)
  • Right Camera Coordinate system (RCCS)
  • Projector Coordinate System (PCS)
Camera Calibration
  • Conversion from 3D world coordinates to 2D camera coordinates – the perspective transformation model

Eliminating the scale factor s gives two linear equations per calibration point j:

uj = (c11 – c31·uj)·xj + (c12 – c32·uj)·yj + (c13 – c33·uj)·zj + c14

vj = (c21 – c31·vj)·xj + (c22 – c32·vj)·yj + (c23 – c33·vj)·zj + c24
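Stacking these two equations for every calibration point j gives an overdetermined linear system in the eleven unknowns c11…c33 (with c34 fixed to 1), solvable by least squares. A minimal NumPy sketch, not the thesis implementation:

```python
import numpy as np

def calibrate(world_pts, img_pts):
    """Estimate the 3x4 perspective transformation matrix from
    3D world points (n,3) and their 2D image points (n,2), n >= 6."""
    A, b = [], []
    for (x, y, z), (u, v) in zip(world_pts, img_pts):
        # unknown vector: [c11 c12 c13 c14 c21 c22 c23 c24 c31 c32 c33]
        A.append([x, y, z, 1, 0, 0, 0, 0, -u * x, -u * y, -u * z]); b.append(u)
        A.append([0, 0, 0, 0, x, y, z, 1, -v * x, -v * y, -v * z]); b.append(v)
    c, *_ = np.linalg.lstsq(np.array(A, float), np.array(b, float), rcond=None)
    return np.append(c, 1.0).reshape(3, 4)   # c34 fixed to 1
```

With exact correspondences the least-squares solution recovers the matrix; with noisy calibration points it minimizes the algebraic residual over all points.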

Calibration sphere
  • A sphere can be used for calibration
  • Calibration points on the sphere are chosen such that

the azimuthal angle is varied in steps of 45° and

the polar angle is varied in steps of 30°

  • The locations of these calibration points are known in the 3D coordinate system with respect to the center of the sphere
  • The center of the sphere defines the origin of the World Coordinate System
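The known 3D locations of such calibration points follow from a spherical-to-Cartesian conversion; a sketch, assuming a unit radius and the polar angle measured from the +z axis:

```python
import numpy as np

def sphere_points(radius=1.0, az_step_deg=45, pol_step_deg=30):
    """Calibration points on a sphere: azimuth in 45-degree steps,
    polar angle (from +z) in 30-degree steps, poles excluded."""
    pts = []
    for pol in range(pol_step_deg, 180, pol_step_deg):
        for az in range(0, 360, az_step_deg):
            t, p = np.radians(pol), np.radians(az)
            pts.append((radius * np.sin(t) * np.cos(p),
                        radius * np.sin(t) * np.sin(p),
                        radius * np.cos(t)))
    return np.array(pts)
```

With 45° azimuthal and 30° polar steps this yields 40 points (the poles excluded), all at the chosen radius from the sphere's center.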
Projector Calibration
  • Similar to camera calibration
  • Unlike a camera, the projector's 2D coordinates cannot be read directly from an image
  • A "blank image" is projected onto the sphere
  • The 2D coordinates of the calibration points in the projected image are noted
  • More points can be seen from the projector's point of view – some points are common to both camera views
  • Results show slightly larger errors than the camera calibration
3D Face Model Construction
  • Why?
    • To obtain different views of the face
    • To generate the stereo pair to view it in the HMPD
  • Steps required
    • Computation of 3D Locations
    • Customization of 3D Model
    • Texture Mapping
Computation of 3D points
  • 3D point estimation using stereo
  • Stereo between the two cameras is not possible because of occlusion by the facial features
  • Hence two stereo-pair computations:
    • Left camera and projector
    • Right camera and projector
  • Using stereo, compute the 3D locations of prominent facial feature points in the FCS
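Each camera–projector pair then reduces to standard linear triangulation: the two image observations of a point contribute four homogeneous equations in its unknown 3D position. A sketch under the usual projective model (not the thesis code):

```python
import numpy as np

def triangulate(P1, P2, uv1, uv2):
    """Linear triangulation of one 3D point from two views.
    P1, P2: 3x4 projection matrices; uv1, uv2: observed 2D points."""
    A = np.array([
        uv1[0] * P1[2] - P1[0],
        uv1[1] * P1[2] - P1[1],
        uv2[0] * P2[2] - P2[0],
        uv2[1] * P2[2] - P2[1],
    ])
    X = np.linalg.svd(A)[2][-1]   # null vector of A (homogeneous solution)
    return X[:3] / X[3]
```

The SVD finds the homogeneous 3D point that best satisfies all four equations, which also degrades gracefully when the observations are noisy.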
3D Generic Face Model

A generic face model with 395 vertices and 818 triangles

Left: front view and Right: side view

Evaluation Schemes
  • Evaluation of facial expressions is not studied extensively in the literature
  • Evaluation can be done for facial alignment and face recognition on static images
  • Lip and eye movements in a dynamic event
  • Perceptual quality – how well are moods conveyed?
  • Two types of evaluation
    • Objective evaluation
    • Subjective evaluation
Objective Evaluation
  • Theoretical Evaluation
  • No human feedback required
  • This evaluation can give us a measure of
    • Face recognition
    • Face alignment
    • Facial movements
  • Methods applied
    • Normalized cross correlation
    • Euclidean distance measures
Evaluation Images

Five frames were considered for the objective evaluation

First row – virtual frontal views

Second row – original frontal views

Normalized Cross-Correlation
  • Regions considered for normalized cross-correlation

( Left: Real image Right: Virtual image)

Normalized Cross-Correlation
  • Let V be the virtual image and R be the real image
  • Let w be the width and h be the height of the images
  • The normalized cross-correlation between V and R (in its standard zero-mean form) is

NCC(V, R) = Σ (V(x,y) − V̄)(R(x,y) − R̄) / √[ Σ (V(x,y) − V̄)² · Σ (R(x,y) − R̄)² ]

where the sums run over all w × h pixels and V̄, R̄ denote the image means.
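A direct implementation of this measure; a sketch assuming the standard zero-mean form (the slide's original formula was an image and may differ in normalization):

```python
import numpy as np

def ncc(V, R):
    """Normalized cross-correlation between two equal-sized grayscale images.
    Returns a value in [-1, 1]; 1 means a perfect (affine) intensity match."""
    v = V.astype(float) - V.mean()
    r = R.astype(float) - R.mean()
    return float((v * r).sum() / np.sqrt((v * v).sum() * (r * r).sum()))
```

Because the means are subtracted and the result is normalized, the score is invariant to uniform brightness and contrast changes between the virtual and real images.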
Euclidean Distance measures

The Euclidean distance between two points i and j is given by dij = √[(xi − xj)² + (yi − yj)²]

Let Rij be the Euclidean distance between points i and j in the real image

Let Vij be the Euclidean distance between points i and j in the virtual image

The per-pair error is then Dij = | Rij − Vij |
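These distance measures can be sketched directly (assuming 2D image coordinates for the feature points):

```python
import math

def dist(p, q):
    """Euclidean distance between two 2D points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def pairwise_error(real_i, real_j, virtual_i, virtual_j):
    """D_ij = | R_ij - V_ij |: discrepancy between a feature-pair
    distance in the real image and the same pair in the virtual image."""
    return abs(dist(real_i, real_j) - dist(virtual_i, virtual_j))
```

Small Dij values over many feature pairs indicate that the virtual frontal view preserves the facial geometry of the real frontal view.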

Subjective Evaluation
  • Evaluates human perception
  • Measures the quality of a talking face
  • Factors that might affect perceived quality:
    • Quality of the video
    • Facial movements and expressions
    • Synchronization of the two halves of the face
    • Color and texture of the face
    • Quality of the audio
    • Synchronization of the audio
  • A preliminary study has been made to assess the quality of the generated videos
Conclusion and Future Work

(Diagram of future work: in the time domain, from frontal image to frontal video; from texture-mapped 3D face model toward 3D facial animation.)
  • Design and implementation of a novel Face Capture System
  • Generation of virtual frontal view from two side views in a video sequence
  • Extraction of depth information using stereo method
  • Texture mapped 3D face model generation
  • Evaluation of virtual frontal videos
Future Work
  • Online processing in real-time
  • Automatic calibration
  • 3D facial animation
  • Subjective Evaluation of the virtual frontal videos
  • Data compression while processing and transmission
  • Customization of camera lenses
  • Integration with a Head Mounted Projection Display
Thank You