
Speaker Discrimination: The Challenge of Conversational Data

Uchechukwu O. Ofoegbu. Dissertation Committee: Advisor: Robert Yantorno, Ph.D. Members: Dennis Silage, Ph.D.; Brian Butz, Ph.D.; Iyad Obeid, Ph.D.; Eugene Kwatny, Ph.D.


Presentation Transcript


  1. Speaker Discrimination: The Challenge of Conversational Data • Uchechukwu O. Ofoegbu • Dissertation Committee • Advisor: Robert Yantorno, Ph.D. • Members: Dennis Silage, Ph.D.; Brian Butz, Ph.D.; Iyad Obeid, Ph.D.; Eugene Kwatny, Ph.D.

  2. Presentation Outline • Problem Statement and Research Goal • Scope of Research • Distance Analysis • Feature Analysis • Data Analysis • Application Systems • Fusion of Distances • Proposal Summary

  3. Problem Statement and Research Goal

  4. Conventional Speaker Recognition • Speaker Identification • Who is this speaker? • Speaker Verification • Is he who he claims to be? [Block diagram: Reference Speech → Feature Extraction → Model Building; Test Speech → Feature Extraction → Comparison → Recognition Decision → System Output]

  5. Conversation Segmentation • Broadcast News/Conference Data • Conversational Data

  6. Problems with Conversational Data • No a priori information available from participating speakers • Training is impossible • No a priori knowledge of change points • Speakers alternate very rapidly • Limited amounts of data for single speaker representations • Distortion • Channel noise, co-channel data

  7. Proposed Solutions • Selective creation of data models • Development of an “optimal” distance measure • Decision level fusion of distance measures • Development of application-specific system

  8. Scope of Research

  9. Criminal Activity Detection • Monitoring inmate conversations • Prevention of 3-way calls • Notification of suspicious contacts • Enhancement of keyword detection • Uncooperative data collection • Forensics • Voiceprints

  10. Commercial Services • Automated Customer Services • Personalized contact with customers • Search/Retrieval of Audio Data

  11. Homeland Security • Military Activities • Pilot-control tower communications • Detection of unidentified speakers on pilot radio channels • Terrorist Identification

  12. Distance Analysis

  13. Distance Measures • Univariate vs. Multivariate Analysis [Figure labels: Difference between means; Standard Deviation]

  14. Distance Measures • Notations • Random variables being compared: X = [X1, X2, …, Xp], an nx by p matrix; Y = [Y1, Y2, …, Yp], an ny by p matrix • Properties • Q(X, Y) ≥ 0 • Q(X, Y) = 0 iff X = Y • Q(X, Y) = Q(Y, X) • Q(X, Y) ≤ Q(X, Z) + Q(Z, Y)

  15. Distance Measures • Mahalanobis Distance: Q_MAHALANOBIS(X, Y) = (μx - μy)^T Σ^-1 (μx - μy), where Σ = combined covariance matrix of X and Y • Hotelling’s T-Square Statistic • Cik = element in the ith row and kth column of the inverse of C
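The Mahalanobis form on this slide can be sketched in a few lines of NumPy. This is a minimal illustration, not the presenter's code: it assumes the "combined covariance matrix" is the pooled sample covariance, and it uses the standard two-sample form of Hotelling's T-square (the Mahalanobis term scaled by nx·ny/(nx + ny)), since the slide's own T-square formula did not survive in the transcript. The function names are hypothetical.

```python
import numpy as np

def pooled_covariance(x, y):
    """Pooled sample covariance of two (frames, p) feature matrices.
    Assumption: this is what the slide calls the combined covariance."""
    nx, ny = len(x), len(y)
    sx = np.atleast_2d(np.cov(x, rowvar=False))
    sy = np.atleast_2d(np.cov(y, rowvar=False))
    return ((nx - 1) * sx + (ny - 1) * sy) / (nx + ny - 2)

def mahalanobis(x, y):
    """(mu_x - mu_y)^T Sigma^-1 (mu_x - mu_y) between two feature sets."""
    diff = x.mean(axis=0) - y.mean(axis=0)
    sigma = pooled_covariance(x, y)
    # Solve Sigma z = diff instead of forming the explicit inverse.
    return float(diff @ np.linalg.solve(sigma, diff))

def hotelling_t2(x, y):
    """Standard two-sample Hotelling T-square statistic."""
    nx, ny = len(x), len(y)
    return nx * ny / (nx + ny) * mahalanobis(x, y)
```

Both functions take one row per analysis frame and one column per feature, matching the nx by p and ny by p notation of the previous slide.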

  16. Distance Measures • Kullback-Leibler (KL) Distance • Bhattacharyya Distance
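The formulas for these two measures are not preserved in the transcript. As a hedged sketch, the block below uses the well-known closed forms for multivariate Gaussians, assuming each feature set is summarized by its sample mean and covariance. Because plain KL divergence is not symmetric, the sketch sums both directions to obtain a symmetric measure; whether the presentation did the same is an assumption. All function names are hypothetical.

```python
import numpy as np

def gaussian_stats(x):
    """Sample mean and covariance of a (frames, p) feature matrix."""
    return x.mean(axis=0), np.atleast_2d(np.cov(x, rowvar=False))

def kl_gaussian(mu0, s0, mu1, s1):
    """KL(N0 || N1) for two multivariate Gaussians (closed form)."""
    p = len(mu0)
    diff = mu1 - mu0
    s1_inv = np.linalg.inv(s1)
    _, logdet0 = np.linalg.slogdet(s0)
    _, logdet1 = np.linalg.slogdet(s1)
    return 0.5 * (np.trace(s1_inv @ s0) + diff @ s1_inv @ diff
                  - p + logdet1 - logdet0)

def symmetric_kl(x, y):
    """Symmetrized KL: KL(X || Y) + KL(Y || X) (an assumed choice)."""
    mx, sx = gaussian_stats(x)
    my, sy = gaussian_stats(y)
    return float(kl_gaussian(mx, sx, my, sy) + kl_gaussian(my, sy, mx, sx))

def bhattacharyya(x, y):
    """Bhattacharyya distance between two Gaussian summaries."""
    mx, sx = gaussian_stats(x)
    my, sy = gaussian_stats(y)
    s = 0.5 * (sx + sy)
    diff = mx - my
    _, logdet = np.linalg.slogdet(s)
    _, logdet_x = np.linalg.slogdet(sx)
    _, logdet_y = np.linalg.slogdet(sy)
    mean_term = 0.125 * diff @ np.linalg.solve(s, diff)
    cov_term = 0.5 * (logdet - 0.5 * (logdet_x + logdet_y))
    return float(mean_term + cov_term)
```

Unlike the Mahalanobis distance, both measures respond to differences in covariance as well as differences in mean.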

  17. Distance Measures • Levene’s Test • Derived from T-Square statistics as follows: • Each set of points is transformed along each vector into absolute divergence from the mean vector • The T-Square Statistic is then applied on the transformed features
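The two steps listed on this slide can be sketched as follows. This is an illustration under assumptions rather than the presenter's implementation: the T-square statistic is the standard two-sample form with a pooled covariance, and the function names are hypothetical.

```python
import numpy as np

def hotelling_t2(x, y):
    """Standard two-sample Hotelling T-square with pooled covariance."""
    nx, ny = len(x), len(y)
    diff = x.mean(axis=0) - y.mean(axis=0)
    sx = np.atleast_2d(np.cov(x, rowvar=False))
    sy = np.atleast_2d(np.cov(y, rowvar=False))
    pooled = ((nx - 1) * sx + (ny - 1) * sy) / (nx + ny - 2)
    return float(nx * ny / (nx + ny) * diff @ np.linalg.solve(pooled, diff))

def levene_distance(x, y):
    """Step 1: replace each point by its absolute deviation from its own
    set's mean vector. Step 2: apply T-square to the transformed sets."""
    zx = np.abs(x - x.mean(axis=0))
    zy = np.abs(y - y.mean(axis=0))
    return hotelling_t2(zx, zy)
```

Because the transform discards each set's mean, the resulting statistic responds to differences in spread between the two sets rather than differences in location.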

  18. Procedural Set-up • HTIMIT database used • Average Utterance Length = 5 seconds • Intra-speaker distance computations [Diagram: 10 utterances from Speaker A → Randomly Select 2 Utterances → for Utterance 1 and Utterance 2: Window Data → Compute 14th Order LPCC → Compute Distance]

  19. Procedural Set-up • Inter-speaker, different utterances distance computations [Diagram: Randomly Select Utterance from Speaker A and from Speaker B → for each: Window Data → Compute 14th Order LPCC → Compute Distance]

  20. Analysis of Distance Measures • Mahalanobis Distance: Gaussian Estimate

  21. Analysis of Distance Measures • Levene’s Test: Gamma Estimate

  22. Feature Analysis

  23. Cepstral Analysis • Frequency Analysis of Speech • STFT of Speech = Excitation Component (fast varying harmonics) × Vocal Tract Component (slowly varying formants) • Log of STFT = Log of Excitation Component + Log of Vocal Tract Component • IDFT of Log of STFT = Excitation + Vocal tract
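The chain on this slide (spectrum, then log, then inverse DFT) is the real cepstrum. A minimal NumPy sketch for a single windowed frame follows; the function name and the small `eps` guard against log(0) are assumptions added for illustration.

```python
import numpy as np

def real_cepstrum(frame, eps=1e-12):
    """Real cepstrum of one frame: IDFT of the log magnitude spectrum."""
    spectrum = np.fft.fft(frame)
    log_magnitude = np.log(np.abs(spectrum) + eps)  # product -> sum
    return np.fft.ifft(log_magnitude).real
```

Taking the log turns the product of excitation and vocal tract spectra into a sum, so the two components add in the cepstral domain; the slowly varying vocal tract part concentrates in the low-order coefficients.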

  24. Cepstral Features • Linear Predictive Cepstral Coefficients • Obtained Recursively from LPC Coefficients • Mel-Scale Frequency Cepstral Coefficients • Nonlinear warping of frequency axis to model the human auditory system
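The recursion from LPC coefficients to cepstral coefficients can be sketched as below. The sketch assumes the all-pole convention H(z) = 1 / (1 - Σ a_k z^-k); other texts flip the sign of a_k, which changes the signs in the recursion. The gain term c_0 is omitted, and the function name is hypothetical.

```python
def lpc_to_cepstrum(a, n_ceps):
    """Convert LPC predictor coefficients to cepstral coefficients.

    a[k-1] holds a_k for H(z) = 1 / (1 - sum_k a_k z^-k) (assumed
    convention). Returns [c_1, ..., c_n_ceps].
    """
    p = len(a)
    c = [0.0] * (n_ceps + 1)  # c[0] is unused (gain term omitted)
    for n in range(1, n_ceps + 1):
        acc = a[n - 1] if n <= p else 0.0
        # Only terms with 1 <= n - k <= p contribute.
        for k in range(max(1, n - p), n):
            acc += (k / n) * c[k] * a[n - k - 1]
        c[n] = acc
    return c[1:]
```

For a single pole at 0.5, `lpc_to_cepstrum([0.5], 3)` gives 0.5, 0.125 and about 0.0417, matching the analytic values 0.5^n / n. The 14th-order LPCC used in the experiments would correspond to `n_ceps=14`.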

  25. Cepstral Features • Delta Cepstral Coefficients • First and second derivatives of cepstral coefficients • Reflect dynamic information • Used as a supplement to the original cepstral features
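Delta coefficients are commonly estimated with a short regression over neighbouring frames. The sketch below uses that common formula with a window of ±2 frames and edge padding; the window width, the padding choice, and the function name are assumptions, since the slide does not say how the derivatives were computed.

```python
import numpy as np

def delta(features, width=2):
    """Regression-based time derivative of a (frames, coeffs) matrix."""
    features = np.asarray(features, dtype=float)
    n_frames = len(features)
    # Repeat the first and last frames so every frame has neighbours.
    padded = np.pad(features, ((width, width), (0, 0)), mode="edge")
    denom = 2 * sum(n * n for n in range(1, width + 1))
    out = np.zeros_like(features)
    for n in range(1, width + 1):
        ahead = padded[width + n : width + n + n_frames]
        behind = padded[width - n : width - n + n_frames]
        out += n * (ahead - behind)
    return out / denom
```

Applying `delta` twice gives the second derivative (delta-delta), and `np.hstack([c, delta(c), delta(delta(c))])` appends both to the original cepstral matrix `c` as supplementary features.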

  26. Analysis of Cepstral Features • Mahalanobis Distance

  27. Analysis of Cepstral Features • Levene’s Test

  28. Feature Combination • Proposed Investigation • What’s the best feature combination? • Will the delta and delta-delta coefficients contribute to the speaker-differentiating ability of the features?

  29. Feature Combination Analysis • T-test Based Evaluation • Why? • Robust to the Gaussian distribution assumption, especially for large data sizes and when the two samples to be compared have approximately equal sizes • Unaffected by differences in the variances of the compared variables

  30. Data Analysis

  31. Traditional Speaker Modeling • Examples • Gaussian Mixture Models • Hidden Markov Models • Neural Networks • Prosody-Based Models • Disadvantages • Require large amounts of data • Sometimes require a training procedure • Relatively complex

  32. Conversational Data Modeling • Current Method • Equal Segmentation of Data • Indiscriminate use of data • Poor performance • Problems • Change points unknown • Not all speech is useful

  33. Proposed Speaker Modeling [Diagram: speech labeled S V U V U V … U V U V S V; the V portions are grouped into SEGMENT 1 … SEGMENT M; each segment passes through FEATURE COMPUTATION, then MEAN AND COVARIANCE MATRIX COMPUTATION, yielding MODEL 1 … MODEL M]

  34. Proposed Speaker Modeling • Why voiced only? • Same speech class compared • Contains the most information • What’s the appropriate number of phonemes? • Large enough to sufficiently represent speakers • Small enough to avoid speaker overlap
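A rough sketch of the proposed modeling step: consecutive voiced phonemes are grouped N at a time, and each group is summarized by a mean vector and covariance matrix. It assumes voiced/unvoiced detection and feature extraction have already been done, and that a trailing group with fewer than N phonemes is dropped; the function and parameter names are hypothetical.

```python
import numpy as np

def build_segment_models(voiced_phonemes, n_per_model):
    """Group voiced phonemes into segments and summarize each one.

    voiced_phonemes: list of (frames_i, p) feature arrays, one per
    voiced phoneme, in time order. Returns a list of (mean, covariance)
    pairs, one per segment of n_per_model phonemes.
    """
    models = []
    last_start = len(voiced_phonemes) - n_per_model
    for start in range(0, last_start + 1, n_per_model):
        group = voiced_phonemes[start : start + n_per_model]
        frames = np.vstack(group)
        models.append((frames.mean(axis=0), np.cov(frames, rowvar=False)))
    return models
```

A larger `n_per_model` gives better-conditioned covariance estimates but raises the chance that one segment spans a speaker change, which is the trade-off this slide describes.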

  35. Modeling Analysis • N = 20 (4 seconds of voiced speech)

  36. Modeling Analysis

  37. Modeling Analysis • N = 5 (1 second of voiced speech)

  38. Application Systems

  39. Unsupervised Speaker Indexing • The Restrained-Relative Minimum Distance (RRMD) Approach [Figure: pairwise model distance matrix with rows (0, D1,2, D1,3, …), (D2,1, 0, D2,3, …), (D3,1, D3,2, 0, …); label: REFERENCE MODELS]

  40. Unsupervised Speaker Indexing • The Restrained-Relative Minimum Distance (RRMD) Approach [Flowchart: Observe distance to Reference 1 and Reference 2 → Min. Distance → Restraining Condition: Passed → Same Speaker; Failed → Relative Distance Condition: Passed → Same Speaker; Failed → Unusable Data]

  41. RRMD Approach • Restraining Condition • Distance Likelihood Ratio (DLR) • DLR > 1 → Same Speaker • DLR < 1 → Check Relative Distance Condition

  42. RRMD Approach • Relative Distance Condition • Relative Distance: Drel = dmax - dmin • Drel > threshold → Same Speaker [Figure labels: Reference 1, Reference 2, dmin, dmax]
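The two RRMD conditions can be read as a small decision rule, sketched below. This is one interpretation of the slides, not the author's code: the DLR formula is not preserved in the transcript, so the ratio is taken as a precomputed input, and "Same Speaker" is interpreted as "same speaker as the nearer reference model". The function and parameter names are hypothetical.

```python
def rrmd_decide(d_ref1, d_ref2, dlr, rel_threshold):
    """Assign a test model to reference 1 or 2, or mark it unusable.

    d_ref1, d_ref2: distances from the test model to each reference.
    dlr: distance likelihood ratio for the smaller distance (input;
         its computation is not shown in the transcript).
    Returns 1 or 2 for the matched reference, or None for unusable data.
    """
    nearest = 1 if d_ref1 <= d_ref2 else 2
    # Restraining condition: DLR > 1 accepts the nearest reference.
    if dlr > 1:
        return nearest
    # Relative distance condition: Drel = dmax - dmin must be large.
    d_rel = max(d_ref1, d_ref2) - min(d_ref1, d_ref2)
    if d_rel > rel_threshold:
        return nearest
    return None
```

For example, `rrmd_decide(12, 10, 0.5, 20)` returns `None`: the test model fails the restraining condition and sits almost equally far from both references, so it is treated as unusable data.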

  43. Preliminary Results • Experiments • 245 telephone conversations from the SWITCHBOARD database, with an average length of 400 seconds • T-Square statistics used • Ground truth obtained from Mississippi State Transcriptions

  44. Preliminary Results • Best N Estimation • N = 5

  45. Preliminary Results • RRMD Experiments • Drel varied from 0 to 200 • Two Errors Defined • Indexing Error: Ierr = 100 - Accuracy • Undecided Error, defined in terms of Nu = number of detected undecided/unusable samples and Nc = number labeled as co-channel data

  46. Preliminary Results

  47. Speaker Count System • The Residual Ratio Algorithm (RRA) • Process is repeated K-1 times for counting up to K speakers [Diagram: Reference Model Selected Randomly → DLR-based Model Comparison, repeated; if too little data is removed, select another model]

  48. RRA Examples: 2 Speakers

  49. RRA Examples: 3 Speakers

  50. Comparison [Figure panels: TWO-SPEAKER RESIDUAL and THREE-SPEAKER RESIDUAL, each showing the Residual Ratio after the 2nd round of RRA; label: Speaker 2]
