
Senior Project – Computer Science – 2013 Multimodal Emotion Recognition Colin Grubb Advisor – Prof. Nick Webb

Presentation Transcript


Introduction

Multimodal fusion is a technique in which two or more inputs are combined in order to improve classification accuracy on a particular problem. In this study, we aimed to improve the classification accuracy of existing systems via fusion. We took two existing pieces of software, one audio and one visual, and combined them using decision-level fusion. We conducted experiments to see how the two individual systems could be made to complement each other in order to achieve the highest possible accuracy.

Emotion Software
• Audio software: EmoVoice (EMV)
  • Open source, real time
  • Naïve Bayes classifier
  • Accuracy: 38.43%
• Visual software: Principal Component Analysis (PCA)
  • Created by Professor Shane Cotter
  • Works on still images of faces
  • Accuracy: 77.4%

Gathering Data
• Four emotional states: Angry, Happy, Neutral, Sad
• List of sentences read to EmoVoice
• Normal visual data and long-range visual data (6 ft.)
• Datasets constructed using the outputs from the unimodal systems

Experimentation
• EmoVoice data modified to complement PCA weaknesses and to combat EmoVoice's bias toward negative and active voice
• J48 decision tree (C4.5) used as the classifier; a sketch of the fusion and classification step appears below

Manual Rules
• Created rules to modify EmoVoice output based on:
  • EmoVoice's bias toward negative and active voice
  • PCA weaknesses
• Rules keyed to the training instance's class attribute (a sketch of these rules appears below):
  • Happy: if the EMV confidence levels of content and happy voice outweighed all other confidence levels, change the instance to Happy
  • Neutral: if all confidence levels were within 0.05 of each other, or if neutral confidence was tied for first, change the instance to Neutral
  • Sad: if second to angry within 0.05, change the instance to Sad

System Layout
• (System architecture diagram shown on the original poster; not reproduced in this transcript.)

Results
• Four experiments were run:
  • Regular distance
  • Long distance
  • Regular distance – no conf.
  • Long distance – no conf.
• Results were statistically significant at p = 0.05

Conclusion and Future Work

We were able to achieve higher classification accuracy by combining audio and visual data and then applying manual bias rules to handle emotions for which the individual systems' classification accuracy was weak. Future work includes automating the individual system components, an online classifier whose output is returned in real time, and refining the manual rules used to counteract bias. There is also potential for the system to be mounted on a robot currently residing in the department.

Acknowledgements
• Prof. Nick Webb
• Prof. Shane Cotter
• Prof. Aaron Cass
• Thomas Yanuklis
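For illustration, here is a minimal sketch of the decision-level fusion and classification step described under Experimentation, assuming each fused instance consists of EmoVoice per-class confidences plus the visual system's predicted label. scikit-learn's DecisionTreeClassifier stands in for Weka's J48 (C4.5); the feature layout, emotion names, and toy data are assumptions, not the project's actual datasets.

```python
# Illustrative sketch only: fuse EmoVoice confidences with the visual
# (PCA-based) label into one feature vector per instance, then train a
# decision tree. DecisionTreeClassifier with the entropy criterion stands
# in for Weka's J48 / C4.5; the feature layout and toy data are assumed.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

EMOTIONS = ["angry", "happy", "neutral", "sad"]

def fuse(emv_confidences, visual_labels):
    """Concatenate EmoVoice per-class confidences with a one-hot encoding
    of the visual system's predicted label."""
    visual_onehot = np.array(
        [[float(lbl == e) for e in EMOTIONS] for lbl in visual_labels]
    )
    return np.hstack([emv_confidences, visual_onehot])

# Toy placeholder data: 10 instances per emotion (the real datasets were
# built from the two unimodal systems' outputs, as described above).
rng = np.random.default_rng(0)
y = np.repeat(EMOTIONS, 10)
emv_conf = rng.random((len(y), len(EMOTIONS)))
visual_pred = rng.choice(EMOTIONS, size=len(y))

X = fuse(emv_conf, visual_pred)
tree = DecisionTreeClassifier(criterion="entropy", random_state=0)
print("5-fold CV accuracy:", cross_val_score(tree, X, y, cv=5).mean())
```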
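Similarly, a hedged sketch of the manual rules listed above, assuming EmoVoice returns a dictionary of per-class confidences. The class names, the 0.05 threshold, and the exact reading of each rule (for example, summing the content and happy confidences, and treating "second to angry" as sad trailing angry by at most 0.05) are inferred from the poster text rather than taken from the actual implementation.

```python
# Illustrative sketch only: post-process an EmoVoice prediction using the
# manual rules from the poster. Class names, the 0.05 threshold, and the
# exact reading of each rule are assumptions based on the poster text.
CLOSE = 0.05  # "within 0.05" threshold quoted on the poster

def apply_manual_rules(emv_label, conf):
    """conf maps EmoVoice class names (e.g. 'angry', 'content', 'happy',
    'neutral', 'sad') to confidences; returns a possibly revised label."""
    top_conf = max(conf.values())

    # Happy rule: content + happy confidence outweighs every other class
    # (one plausible reading of "outweighed all other confidence levels").
    positive = conf.get("content", 0.0) + conf.get("happy", 0.0)
    if all(positive > v for k, v in conf.items() if k not in ("content", "happy")):
        return "happy"

    # Neutral rule: all confidences within 0.05 of each other,
    # or neutral tied for the highest confidence.
    if top_conf - min(conf.values()) <= CLOSE:
        return "neutral"
    if conf.get("neutral", 0.0) >= top_conf:
        return "neutral"

    # Sad rule: angry is on top but sad is within 0.05 of it.
    if emv_label == "angry" and top_conf - conf.get("sad", 0.0) <= CLOSE:
        return "sad"

    return emv_label

# Example: angry wins narrowly over sad, so the rule flips the label to sad.
print(apply_manual_rules(
    "angry", {"angry": 0.40, "sad": 0.37, "neutral": 0.13, "happy": 0.10}
))
```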
