CSC 2528: Handshapes and Movements: Multiple-channel ASL recognition

Christian Vogler and Dimitris Metaxas

(presented by Christopher Collins)

University of Toronto Computer Science


Overview: Part II

  • Introduction to ASL recognition

  • Challenges of ASL recognition

  • Related work

  • Modelling

    • Phoneme-based modelling

    • Independent Channels

    • Handshape

  • Parallel Hidden Markov Models

  • Experiments

  • Conclusions and Future Work

ASL Recognition: Introduction

  • Computer interaction is still mainly keyboard/mouse

    • requires literacy in a written language or in an agreed-upon standard written form of ASL (e.g. SignWriting)

    • difficult for many people who are deaf

ASL Recognition: Challenges

  • More difficult than speech recognition due to:

    • simultaneous events

    • inflections

    • phonology poorly understood, no agreed standard

Challenges of Simultaneity

Related Work

  • C. Vogler and D. Metaxas. Parallel Hidden Markov Models for ASL Recognition (1999).

  • G. Fang et al. Signer-independent continuous sign language recognition based on SRN/HMM (2001).

  • R.-H. Liang and M. Ouhyoung. A real-time continuous gesture recognition system for sign language (1998).

Overview

  • HMM-based approach to ASL recognition

    • parallel HMMs for different channels

    • channels are the handshape and movement of the left and right hands

    • uses the Movement-Hold model of ASL phonology

Movement-Hold Example

Handshape Modelling

  • Most previous work uses joint and abduction angles as features (low-level)

  • The authors also experiment with a measure of the openness of a finger (high-level)

    • height and width of quadrilateral

    • MPJ (metacarpophalangeal joint) angle

    • abduction angles

Extensions to HMM

  • A regular HMM models one process evolving over time

  • To model parallel, possibly interacting processes with a regular HMM, events must evolve in lockstep

  • Earlier work by Vogler and Metaxas explains the development of the parallel HMM (PaHMM) model
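
One way to see the cost of the lockstep requirement is to count states. If a single regular HMM has to track every channel at once, each of its states is a combination of one state per channel, so the state count multiplies, whereas separate per-channel models only add up. The sketch below illustrates this under stated assumptions: the channel names and state labels are made up for the example and do not come from the presentation.

```python
from itertools import product


def joint_state_space(channel_states):
    """Enumerate the combined states a single regular HMM would need
    in order to track all channels at once (one state per channel)."""
    return list(product(*channel_states))


# Hypothetical channels with made-up state labels (illustration only).
channels = {
    "right_movement": ["M1", "H1", "M2"],
    "right_handshape": ["A", "B"],
    "left_movement": ["M1", "H1", "M2"],
}

joint = joint_state_space(list(channels.values()))
separate = sum(len(states) for states in channels.values())

print("states in one joint HMM:", len(joint))
print("states across separate per-channel HMMs:", separate)
```

With these toy numbers the joint model needs 3 × 2 × 3 states while the separate models need 3 + 2 + 3, and the gap widens quickly as channels are added.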

Factorial HMM

Coupled HMM

Parallel HMM

Combination of Processes

  • Under the independence assumption, combine the path probabilities from each channel (for paths whose states represent the same sign sequence) by multiplying them, then choose the most probable state sequence.

  • Running time is polynomial in the number of states and linear in the number of parallel processes

More info: C. Vogler and D. Metaxas, Parallel Hidden Markov Models for ASL Recognition; Proc. Int. Conf. on Comp. Vis., Greece, 1999.
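
The combination step can be sketched as follows. This is a minimal illustration of multiplying per-channel path probabilities under the independence assumption, not the authors' decoding algorithm: it assumes each channel has already produced a best-path log-probability for every candidate sign sequence, and the function name, channel names, sign labels, and probabilities are all hypothetical. Products are computed as sums of logarithms to avoid numerical underflow.

```python
import math


def combine_channels(channel_scores):
    """Combine per-channel scores for candidate sign sequences.

    channel_scores maps a channel name to a dict that maps each candidate
    sign sequence to the log-probability of its best path in that channel.
    Multiplying independent probabilities equals summing their logs.
    """
    # Only sequences scored by every channel can be combined.
    shared = set.intersection(*(set(s) for s in channel_scores.values()))
    combined = {
        seq: sum(scores[seq] for scores in channel_scores.values())
        for seq in shared
    }
    best = max(combined, key=combined.get)
    return best, combined


# Made-up scores for two hypothetical channels and two candidate sequences.
scores = {
    "right_movement": {
        ("JOHN", "READ"): math.log(0.020),
        ("JOHN", "WRITE"): math.log(0.030),
    },
    "right_handshape": {
        ("JOHN", "READ"): math.log(0.50),
        ("JOHN", "WRITE"): math.log(0.10),
    },
}

best, combined = combine_channels(scores)
print(best, math.exp(combined[best]))
```

In this toy case the movement channel alone prefers the second sequence, but the handshape channel outweighs it once the two probabilities are multiplied, so the first sequence wins.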

Experiments

  • Compare handshape models (joint angles vs. quadrilateral) for handshape recognition task

  • Compare PaHMMs with various channel combinations against the movement channel of a single hand (naïve baseline?)

  • Vocabulary of 22 signs, 400 training sentences of length 2-7 signs, and 99 test sentences

  • Omitted left-hand handshape?

Choice of Handshape Model

  • Measure how often the handshape is correctly recognized (recognizing signs from handshape alone is not possible)

  • Quadrilateral feature vector results in better (and more consistent) recognition accuracy

Experimental Results

H = correct, D = deletion, S = substitution, I = insertion, N = number
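
The legend does not say how these counts are turned into a score. A common convention in speech-style evaluation is to report percent correct as H / N and accuracy as (H − I) / N, with H = N − D − S. The sketch below assumes that convention and assumes N is the total number of signs in the reference transcriptions; the function name and the counts in the example are made up and are not results from the presentation.

```python
def recognition_scores(n, d, s, i):
    """Compute correct-rate and accuracy from error counts.

    Assumed convention (not stated in the slides):
      H        = N - D - S
      correct  = H / N
      accuracy = (H - I) / N
    """
    h = n - d - s
    return {"H": h, "correct": h / n, "accuracy": (h - i) / n}


# Made-up counts for illustration only.
example = recognition_scores(n=100, d=3, s=5, i=2)
print(example)
```

Accuracy is the stricter of the two figures because insertions are penalized as well as deletions and substitutions.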

Conclusions

  • Handshape information is important in ASL recognition

  • Parallel HMMs are a promising model for multi-channel data

Future Work

  • Training/Test data from native signers

  • Include facial expressions

  • Use of relative spatial information (classifiers)

  • Larger vocabulary

  • Incorporation of language modelling to improve recognition, such as n-gram or parsing
