
Spoken Dialog Systems and Voice XML: Intro to Pattern Recognition

Esther Levin

Dept of Computer Science

CCNY

Some materials used in this course were taken from the textbook “Pattern Classification” by Duda et al., John Wiley & Sons, 2001, with the permission of the authors and the publisher.


Credits and Acknowledgments

  • Materials used in this course were taken from the textbook “Pattern Classification” by Duda et al., John Wiley & Sons, 2001 with the permission of the authors and the publisher; and also from

  • Other material on the web:

    • Dr. A. Aydin Atalan, Middle East Technical University, Turkey

    • Dr. Djamel Bouchaffra, Oakland University

    • Dr. Adam Krzyzak, Concordia University

    • Dr. Joseph Picone, Mississippi State University

    • Dr. Robi Polikar, Rowan University

    • Dr. Stefan A. Robila, University of New Orleans

    • Dr. Sargur N. Srihari, State University of New York at Buffalo

    • David G. Stork, Stanford University

    • Dr. Godfried Toussaint, McGill University

    • Dr. Chris Wyatt, Virginia Tech

    • Dr. Alan L. Yuille, University of California, Los Angeles

    • Dr. Song-Chun Zhu, University of California, Los Angeles


Outline

  • Introduction

    • What is pattern recognition?

  • Background Material

    • Probability theory


PATTERN RECOGNITION AREAS

  • Optical Character Recognition ( OCR)

    • Sorting letters by postal code.

    • Reconstructing text from printed materials (such as reading machines for blind people).

  • Analysis and identification of human patterns

    • Speech and voice recognition.

    • Finger prints and DNA mapping.

  • Banking and insurance applications

    • Credit card applicants classified by income, credit worthiness, mortgage amount, # of dependents, etc.

    • Car insurance (pattern including make of car, # of accidents, age, sex, driving habits, location, etc.).

  • Diagnosis systems

    • Medical diagnosis (disease vs. symptoms classification, X-Ray, EKG and tests analysis, etc).

    • Diagnosis of automotive malfunctioning

  • Prediction systems

    • Weather forecasting (based on satellite data).

    • Analysis of seismic patterns

  • Dating services (where pattern includes age, sex, race, hobbies, income, etc).


More Pattern Recognition Applications

  • Sensory
    • Vision: face / handwriting / hand recognition
    • Speech: speaker / speech recognition
    • Olfaction: is the apple ripe?
  • Data
    • Text categorization
    • Information retrieval
    • Data mining
    • Genome sequence matching


What is a pattern?

“A pattern is the opposite of a chaos; it is an entity vaguely defined, that could be given a name.”


PR Definitions

  • Theory, Algorithms, Systems to Put Patterns into Categories

  • Classification of Noisy or Complex Data

  • Relate Perceived Pattern to Previously Perceived Patterns


Characters

A v t u I h D U w K

Ç ş ğ İ ü Ü Ö Ğ

ع٤٧چك

КЦД

ζωΨΩξθ

נדתשםא



Terminology

  • Features, feature vector

  • Decision boundary

  • Error

  • Cost of error

  • Generalization


A Fishy Example I

  • “Sorting incoming Fish on a conveyor according to species using optical sensing”

  • Salmon or Sea Bass?



  • Problem Analysis

    • Set up a camera and take some sample images to extract features

      • Length

      • Lightness

      • Width

      • Number and shape of fins

      • Position of the mouth, etc…

        This is the set of all suggested features to explore for use in our classifier!


Solution by Stages

  • Preprocess raw data from camera

  • Segment isolated fish

  • Extract features from each fish (length, width, brightness, etc.)

  • Classify each fish



  • Preprocessing

    • Use a segmentation operation to isolate fishes from one another and from the background

  • Information from a single fish is sent to a feature extractor whose purpose is to reduce the data by measuring certain features

  • The features are passed to a classifier


  • Classification

    Select the length of the fish as a possible feature for discrimination


The length is a poor feature alone!

Select the lightness as a possible feature.



“Customers do not want sea bass in their cans of salmon”

  • Threshold decision boundary and cost relationship

  • Move our decision boundary toward smaller values of lightness in order to minimize the cost (reduce the number of sea bass that are classified as salmon!)

    Task of decision theory
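The cost-sensitive threshold choice described above can be sketched in a few lines. Everything here is invented for illustration (the sample lightness values, the cost ratio, and the `best_threshold` helper are not from the slides); it simply shows how an asymmetric cost keeps sea bass out of the salmon cans.

```python
# Hypothetical sketch: pick the lightness threshold that minimizes an
# asymmetric misclassification cost. All numbers below are invented.

def best_threshold(salmon, sea_bass, cost_bass_as_salmon=10.0, cost_salmon_as_bass=1.0):
    """Classify lightness > t as sea bass; try every observed value as t."""
    best_t, best_cost = None, float("inf")
    for t in sorted(salmon + sea_bass):
        cost = (sum(1 for x in salmon if x > t) * cost_salmon_as_bass        # salmon called sea bass
                + sum(1 for x in sea_bass if x <= t) * cost_bass_as_salmon)  # sea bass called salmon
        if cost < best_cost:
            best_t, best_cost = t, cost
    return best_t, best_cost

salmon = [2.0, 3.1, 3.5, 4.0, 4.8]    # salmon tend to be darker
sea_bass = [4.2, 5.0, 5.5, 6.1, 7.0]  # sea bass tend to be lighter
t, c = best_threshold(salmon, sea_bass)
print(t, c)  # 4.0 1.0
```

With a large `cost_bass_as_salmon`, any threshold that lets a sea bass through becomes expensive, which is exactly the pressure toward smaller lightness values described on the slide.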



(Figure: fish samples plotted by lightness and width)





  • We might add other features that are not correlated with the ones we already have. A precaution should be taken not to reduce the performance by adding such “noisy features”

  • Ideally, the best decision boundary should be the one which provides an optimal performance such as in the following figure:







The aim of designing a classifier is to correctly classify novel input.


Decision Boundaries

Observe: Can do much better with two features

Caveat: overfitting!


Occam’s Razor

Entities are not to be multiplied without necessity

William of Occam

(1284-1347)


A Complete PR System


Problem Formulation

Input object → Measurements & Preprocessing → Features → Classification → Class label

  • Basic ingredients:

  • Measurement space (e.g., image intensity, pressure)

  • Features (e.g., corners, spectral energy)

  • Classifier - soft and hard

  • Decision boundary

  • Training sample

  • Probability of error


Pattern Recognition Systems

  • Sensing

    • Use of a transducer (camera or microphone)

    • The PR system depends on the bandwidth, resolution, sensitivity, and distortion of the transducer

  • Segmentation and grouping

    • Patterns should be well separated and should not overlap





  • Feature extraction

    • Discriminative features

    • Invariant features with respect to translation, rotation and scale.

  • Classification

    • Use a feature vector provided by a feature extractor to assign the object to a category

  • Post Processing

    • Exploit context dependent information other than from the target pattern itself to improve performance


The Design Cycle

  • Data collection

  • Feature Choice

  • Model Choice

  • Training

  • Evaluation

  • Computational Complexity





  • Data Collection

    How do we know when we have collected an adequately large and representative set of examples for training and testing the system?



  • Feature Choice

    Depends on the characteristics of the problem domain. Features should be simple to extract, invariant to irrelevant transformations, and insensitive to noise.



  • Model Choice

    We may be unsatisfied with the performance of our linear fish classifier and want to jump to another class of model.



  • Training

    Use data to determine the classifier. There are many different procedures for training classifiers and choosing models.



  • Evaluation

    Measure the error rate (or performance) and switch from one set of features and models to another.



  • Computational Complexity

    What is the trade-off between computational ease and performance?

    (How does an algorithm scale with the number of features, training examples, and categories?)



Learning and Adaptation

  • Learning: Any method that combines empirical information from the environment with prior knowledge into the design of a classifier, attempting to improve performance with time.

  • Empirical information: Usually in the form of training examples.

  • Prior knowledge: Invariances, correlations

  • Supervised learning

    • A teacher provides a category label or cost for each pattern in the training set

  • Unsupervised learning

    • The system forms clusters or “natural groupings” of the input patterns



Syntactic Versus Statistical PR

  • Basic assumption: There is an underlying regularity behind the observed phenomena.

  • Question: Based on noisy observations, what is the underlying regularity?

  • Syntactic: Structure through a common generative mechanism. For example, all the different manifestations of English share a common underlying set of grammatical rules.

  • Statistical: Objects characterized through statistical similarity. For example, all possible digits ‘2’ share some common underlying statistical relationship.


Difficulties

  • Segmentation

  • Context

  • Temporal structure

  • Missing features

  • Aberrant data

  • Noise

Do all these images represent an ‘A’?


Design Cycle

How do we know what features to select, and how do we select them…?

What type of classifier shall we use? Is there a best classifier…?

How do we train…?

How do we combine prior knowledge with empirical data?

How do we evaluate performance…?

How do we validate the results? How much confidence do we have in a decision?


Conclusion

  • I expect you are overwhelmed by the number, complexity and magnitude of the sub-problems of Pattern Recognition

  • Many of these sub-problems can indeed be solved

  • Many fascinating unsolved problems still remain



Toolkit for PR

  • Statistics

  • Decision Theory

  • Optimization

  • Signal Processing

  • Neural Networks

  • Fuzzy Logic

  • Decision Trees

  • Clustering

  • Genetic Algorithms

  • AI Search

  • Formal Grammars

  • ….


Linear Algebra

  • Matrix A: a rectangular array of numbers; the element in row i, column j is denoted aᵢⱼ

  • Matrix transpose Aᵀ: rows and columns interchanged, (Aᵀ)ᵢⱼ = aⱼᵢ

  • Vector a: a single column of numbers, a = (a₁, …, aₙ)ᵀ


Matrix and Vector Multiplication

  • Matrix multiplication

  • Outer vector product

  • Vector-matrix product
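A small numpy illustration of the three products listed above (the matrices and vectors are arbitrary examples, not from the slides):

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[0, 1],
              [1, 0]])
a = np.array([1, 2])
b = np.array([3, 4])

AB = A @ B              # matrix multiplication
outer = np.outer(a, b)  # outer product: entry (i, j) is a[i] * b[j]
vA = a @ A              # vector-matrix product (row vector times matrix)

print(AB)     # [[2 1]
              #  [4 3]]
print(outer)  # [[3 4]
              #  [6 8]]
print(vA)     # [ 7 10]
```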


Inner Product

  • Inner (dot) product: aᵀb = Σᵢ aᵢbᵢ

  • Length (Euclidean norm) of a vector: ||a|| = √(aᵀa)

  • a is normalized iff ||a|| = 1

  • The angle between two n-dimensional vectors: cos θ = aᵀb / (||a|| ||b||)

  • An inner product is a measure of collinearity:

    • a and b are orthogonal iff aᵀb = 0

    • a and b are collinear iff |aᵀb| = ||a|| ||b||

  • A set of vectors is linearly independent if no vector is a linear combination of the others.
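In code, with two sample vectors chosen so that they are orthogonal:

```python
import numpy as np

a = np.array([3.0, 4.0])
b = np.array([4.0, -3.0])

dot = a @ b                                     # inner (dot) product
norm_a = np.linalg.norm(a)                      # Euclidean length
cos_theta = dot / (norm_a * np.linalg.norm(b))  # cosine of the angle

print(dot)        # 0.0 -> a and b are orthogonal
print(norm_a)     # 5.0
print(cos_theta)  # 0.0
```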


Determinant and Trace

  • Determinant det(A): a scalar associated with a square matrix; A is singular iff det(A) = 0

  • det(AB) = det(A)det(B)

  • Trace tr(A) = Σᵢ aᵢᵢ, the sum of the diagonal elements
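A numerical check of det(AB) = det(A)det(B) and of the trace, with arbitrary example matrices:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 3.0]])   # det(A) = 6
B = np.array([[1.0, 4.0],
              [2.0, 5.0]])   # det(B) = -3

lhs = np.linalg.det(A @ B)
rhs = np.linalg.det(A) * np.linalg.det(B)
print(round(lhs, 6), round(rhs, 6))  # -18.0 -18.0

print(np.trace(A))  # 5.0, the sum of the diagonal elements
```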


Matrix Inversion

  • A (n × n) is nonsingular if there exists B such that AB = BA = I; then B = A⁻¹

  • Example: A = [2 3; 2 2], B = [-1 3/2; 1 -1]

  • A is nonsingular iff det(A) ≠ 0

  • Pseudo-inverse for a non-square matrix: A⁺ = (AᵀA)⁻¹Aᵀ, provided AᵀA is not singular; then A⁺A = I
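The slide's example pair can be checked directly in numpy; the tall matrix `C` below is an invented example for the pseudo-inverse:

```python
import numpy as np

# The slide's example: B is the inverse of A.
A = np.array([[2.0, 3.0],
              [2.0, 2.0]])
B = np.array([[-1.0, 1.5],
              [ 1.0, -1.0]])
print(np.allclose(A @ B, np.eye(2)))     # True: AB = I
print(np.allclose(np.linalg.inv(A), B))  # True

# Pseudo-inverse of a non-square matrix, defined when C^T C is
# nonsingular; np.linalg.pinv computes it via the SVD.
C = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
C_pinv = np.linalg.pinv(C)
print(np.allclose(C_pinv @ C, np.eye(2)))  # True: C+ C = I
```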


Eigenvectors and Eigenvalues

Characteristic equation: Av = λv, i.e. det(A − λI) = 0,

an n-th order polynomial with n roots.
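A quick check with a symmetric 2×2 example matrix:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
vals, vecs = np.linalg.eigh(A)  # eigh: for symmetric/Hermitian matrices

print(vals)  # [1. 3.] -- the two roots of det(A - lambda*I) = 0
v = vecs[:, 0]
print(np.allclose(A @ v, vals[0] * v))  # True: A v = lambda v
```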


Probability Theory

  • Primary references:

    • Any probability and statistics textbook (e.g., Papoulis)

    • Appendix A.4 in “Pattern Classification” by Duda et al

      The principles of probability theory, describing the behavior of systems with random characteristics, are of fundamental importance to pattern recognition.


Example 1 (Wikipedia)

  • There are two bowls full of cookies.

    • Bowl #1 has 10 chocolate chip cookies and 30 plain cookies,

    • Bowl #2 has 20 of each.

  • Fred picks a bowl at random, and then picks a cookie at random.

    • The cookie turns out to be a plain one.

  • How probable is it that Fred picked it out of bowl #1?

  • That is: what’s the probability that Fred picked bowl #1, given that he has a plain cookie?

    • event A is that Fred picked bowl #1,

    • event B is that Fred picked a plain cookie.

    • Pr(A|B) ?


Example 1 (continued)

Tables of occurrences and relative frequencies

It is often helpful when calculating conditional probabilities to create a simple table containing the number of occurrences of each outcome, or the relative frequencies of each outcome, for each of the independent variables. The tables below illustrate the use of this method for the cookies.

Number of cookies:

              Chocolate chip   Plain   Total
  Bowl #1           10           30      40
  Bowl #2           20           20      40
  Total             30           50      80

Relative frequencies:

              Chocolate chip   Plain   Total
  Bowl #1          0.125       0.375    0.5
  Bowl #2          0.25        0.25     0.5
  Total            0.375       0.625    1.0

The second table is derived from the first by dividing each entry by the total number of cookies under consideration, 80.
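The same answer follows from Bayes' rule in a few lines:

```python
# Bayes' rule for the cookie example: Pr(bowl #1 | plain cookie).
p_bowl1 = p_bowl2 = 0.5          # Fred picks a bowl at random
p_plain_given_b1 = 30 / 40       # bowl #1: 10 chocolate chip, 30 plain
p_plain_given_b2 = 20 / 40       # bowl #2: 20 of each

p_plain = p_plain_given_b1 * p_bowl1 + p_plain_given_b2 * p_bowl2
p_b1_given_plain = p_plain_given_b1 * p_bowl1 / p_plain
print(p_b1_given_plain)  # 0.6
```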


Example 2

  • 1. Power Plant Operation.

    • The variables X, Y, Z describe the state of 3 power plants (X = 0 means plant X is idle).

    • Denote by A the event that plant X is idle, and by B the event that at least 2 of the 3 plants are working.

    • What are P(A) and P(A|B), the probability that X is idle given that at least 2 of the 3 plants are working?



2. Cars are assembled in four possible locations. Plant I supplies 20% of the cars; plant II, 24%; plant III, 25%; and plant IV, 31%. There is a 1-year warranty on every car.

The company collected data that shows

P(claim|plant I) = 0.05; P(claim|plant II) = 0.11;

P(claim|plant III) = 0.03; P(claim|plant IV) = 0.08;

Cars are sold at random.

An owner just submitted a claim for her car. What are the posterior probabilities that this car was made in plant I, II, III and IV?



  • P(claim) = P(claim|plant I)P(plant I) +

    P(claim|plant II)P(plant II) +

    P(claim|plant III)P(plant III) +

    P(claim|plant IV)P(plant IV)

    = 0.05·0.20 + 0.11·0.24 + 0.03·0.25 + 0.08·0.31 = 0.0687

  • P(plant1|claim) =

    = P(claim|plant I) * P(plant I)/P(claim) = 0.146

  • P(plantII|claim) =

    = P(claim|plant II) * P(plant II)/P(claim) = 0.384

  • P(plantIII|claim) =

    = P(claim|plant III) * P(plant III)/P(claim) = 0.109

  • P(plantIV|claim) =

    = P(claim|plant IV) * P(plant IV)/P(claim) = 0.361
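These posteriors can be reproduced in code. One assumption to flag: the transcript's "P(claim|plant IV) = 0.18" appears to be garbled, since the total P(claim) = 0.0687 and the posterior 0.361 both require P(claim|plant IV) = 0.08, which is what this sketch uses.

```python
# Posterior probability of each plant given a warranty claim (Bayes' rule).
priors = {"I": 0.20, "II": 0.24, "III": 0.25, "IV": 0.31}
claim_rate = {"I": 0.05, "II": 0.11, "III": 0.03, "IV": 0.08}  # IV: see note above

p_claim = sum(claim_rate[p] * priors[p] for p in priors)
posteriors = {p: claim_rate[p] * priors[p] / p_claim for p in priors}

print(round(p_claim, 4))  # 0.0687
for plant, post in posteriors.items():
    print(plant, round(post, 3))  # I 0.146, II 0.384, III 0.109, IV 0.361
```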


Example 3

3. It is known that 1% of the population suffers from a particular disease. A blood test has a 97% chance of identifying the disease for a diseased individual, but also has a 6% chance of falsely indicating that a healthy person has the disease.

a. What is the probability that a random person has a positive blood test?

b. If a blood test is positive, what’s the probability that the person has the disease?

c. If a blood test is negative, what’s the probability that the person does not have the disease?



  • A is the event that a person has the disease. P(A) = 0.01; P(A’) = 0.99.

  • B is the event that the test result is positive.

    • P(B|A) = 0.97; P(B’|A) = 0.03;

    • P(B|A’) = 0.06; P(B’|A’) = 0.94;

  • (a) P(B) = P(A) P(B|A) + P(A’)P(B|A’) = 0.01*0.97 +0.99 * 0.06 = 0.0691

  • (b) P(A|B) = P(B|A)*P(A)/P(B) = 0.97 * 0.01/0.0691 = 0.1404

  • (c) P(A’|B’) = P(B’|A’)P(A’)/P(B’)= P(B’|A’)P(A’)/(1-P(B))= 0.94*0.99/(1-.0691)=0.9997
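The three answers, computed directly:

```python
# Example 3 in code: prevalence 1%, sensitivity 97%, false-positive rate 6%.
p_A = 0.01             # P(disease)
p_B_given_A = 0.97     # P(positive | disease)
p_B_given_notA = 0.06  # P(positive | healthy)

p_B = p_B_given_A * p_A + p_B_given_notA * (1 - p_A)              # (a)
p_A_given_B = p_B_given_A * p_A / p_B                             # (b)
p_notA_given_notB = (1 - p_B_given_notA) * (1 - p_A) / (1 - p_B)  # (c)

print(round(p_B, 4))                # 0.0691
print(round(p_A_given_B, 4))        # 0.1404
print(round(p_notA_given_notB, 4))  # 0.9997
```

Despite the test's 97% sensitivity, a positive result implies only about a 14% chance of disease, because the disease is rare: the classic base-rate effect.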


Sums of Random Variables

  • z = x + y; E(z) = E(x) + E(y)

  • Var(z) = Var(x) + Var(y) + 2Cov(x,y)

  • If x, y independent: Var(z) = Var(x) + Var(y)

  • Distribution of z (for independent x, y): the convolution p_z(z) = ∫ p_x(x) p_y(z − x) dx


Examples:

  • x and y are uniform on [0,1]

    • Find p(z=x+y), E(z), Var(z);

  • x is uniform on [-1, 1], and P(y) = 0.5 for y = 0, y = 10; and 0 elsewhere.

    • Find p(z=x+y), E(z), Var(z);
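A Monte Carlo check of the first example (x, y uniform on [0, 1]): theory gives E(z) = 1 and Var(z) = 1/12 + 1/12 = 1/6, and the density of z is triangular on [0, 2].

```python
import random

# Simulate z = x + y for x, y independent uniform on [0, 1].
random.seed(0)
N = 200_000
zs = [random.random() + random.random() for _ in range(N)]

mean = sum(zs) / N
var = sum((z - mean) ** 2 for z in zs) / N
print(round(mean, 2))  # ~1.0
print(round(var, 2))   # ~0.17 (theory: 1/6 = 0.1667)
```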


Normal Distributions

  • Gaussian distribution: p(x) = (1/√(2πσ²)) exp(−(x − μ)²/(2σ²))

  • Mean: μ = E(x)

  • Variance: σ² = E[(x − μ)²]

  • The Central Limit Theorem says that sums of random variables tend toward a normal distribution.

  • Mahalanobis distance: r = |x − μ|/σ
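In one dimension the density and the Mahalanobis distance reduce to a few lines (the example values are invented):

```python
import math

def gaussian_pdf(x, mu, sigma):
    """1-D Gaussian density."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

mu, sigma = 5.0, 2.0
x = 9.0
r = abs(x - mu) / sigma  # scalar Mahalanobis distance
print(r)                 # 2.0: x lies two standard deviations from the mean
print(round(gaussian_pdf(mu, mu, sigma), 4))  # 0.1995, the peak value 1/(sigma*sqrt(2*pi))
```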


Multivariate Normal Density

  • x is a vector of d jointly Gaussian variables with mean μ and covariance matrix Σ:

    p(x) = (2π)^(−d/2) |Σ|^(−1/2) exp(−½ (x − μ)ᵀ Σ⁻¹ (x − μ))

  • Mahalanobis distance: r² = (x − μ)ᵀ Σ⁻¹ (x − μ)

  • All conditionals and marginals are also Gaussian


Bivariate Normal Densities

  • Level curves are ellipses.

    • The x and y widths are determined by the variances, and the eccentricity by the correlation coefficient.

    • The principal axes are the eigenvectors of the covariance matrix, and the width in each direction is the square root of the corresponding eigenvalue.
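The eigen-structure described above, for an example covariance matrix:

```python
import numpy as np

cov = np.array([[3.0, 1.0],
                [1.0, 3.0]])      # example covariance matrix
vals, vecs = np.linalg.eigh(cov)  # eigenvalues in ascending order

print(vals)           # [2. 4.]
print(np.sqrt(vals))  # widths of the level-curve ellipse along its principal axes
print(vecs[:, 1])     # direction of the major axis (largest eigenvalue)
```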


Information Theory

  • Key principles:

    • What is the information contained in a random event?

      • A less probable event contains more information: I(x) = −log p(x)

      • For two independent events, the information adds

  • What is the average information, or entropy, of a distribution? H = −Σ p(x) log p(x)


Examples: uniform distribution (maximum entropy), Dirac distribution (zero entropy).

Mutual information: the reduction in uncertainty about one variable due to knowledge of the other: I(x; y) = H(x) − H(x|y).
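A small entropy sketch (in bits) confirming the two examples: the uniform distribution maximizes entropy, while a Dirac (deterministic) distribution has zero entropy.

```python
import math

def entropy(p):
    """Shannon entropy in bits; terms with p = 0 contribute nothing."""
    return sum(-pi * math.log2(pi) for pi in p if pi > 0)

uniform = [0.25, 0.25, 0.25, 0.25]
dirac = [1.0, 0.0, 0.0, 0.0]

print(entropy(uniform))  # 2.0 bits (log2 of 4 outcomes, the maximum)
print(entropy(dirac))    # 0.0 bits (no uncertainty)
```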

