USING CLASS WEIGHTING IN INTER-CLASS MLLR Sam-Joo Doh and Richard M. Stern Department of Electrical and Computer Engineering and School of Computer Science Carnegie Mellon University October 20, 2000
Outline • Introduction • Review • Transformation-based adaptation • Inter-class MLLR • Application of weights • For different neighboring classes • Summary Robust Speech Group
Introduction • We would like to achieve • Better adaptation using a small amount of adaptation data → enhanced recognition accuracy • Current method • Reduce the number of parameters • Assume a transformation function → transformation-based adaptation • Example: maximum likelihood linear regression (MLLR)
Introduction (cont’d) • Transformation-based adaptation • Transformation classes are assumed to be independent • It does not achieve reliable estimates for multiple classes using a small amount of adaptation data • Better idea? • Utilize inter-class relationships to achieve more reliable estimates for multiple classes
Transformation-Based Adaptation • Estimate each target parameter (mean vector)
Transformation-Based Adaptation (cont’d) • Estimate each transformation function
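The equations on these two slides did not survive extraction. For orientation only, the standard MLLR mean update (the example the introduction names) can be written as below. The symbols are conventional assumptions rather than text recovered from the slides: \mu_k is the baseline mean of Gaussian k, and (A_c, b_c) is the transformation shared by every Gaussian in transformation class c.

```latex
% Standard MLLR mean transformation (conventional notation, assumed here):
% one affine transformation per transformation class c.
\hat{\mu}_k = A_c\,\mu_k + b_c , \qquad k \in \text{class } c
```

Estimating one (A_c, b_c) per class instead of every \hat{\mu}_k directly is what reduces the number of free parameters.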
Transformation-Based Adaptation (cont’d) • Trade-off • [Figure: quality versus number of transformation classes, annotated “Better estimation of transformation function” and “More details of target parameters”]
Previous Works • Consider correlations among model parameters • Mostly in a Bayesian framework • Considering a few neighboring models: not effective • Considering all neighboring models: too much computation • It is difficult to apply correlations to multi-Gaussian mixtures: no explicit correspondence
Previous Works (cont’d) • Using correlations among model parameters
Inter-Class Relationship • Inter-class relationship among transformation functions?
Inter-Class Relationship (cont’d) • Two classes are independent • [Diagram: Class 1 and Class 2]
Inter-Class Relationship (cont’d) • If we know an inter-class transformation g12(.) • Transform class 2 parameters with g12(.), giving m2k(12) • Now class 2 data contribute to the estimation of f1(.) • More reliable estimation of f1(.), while it keeps the characteristics of Class 1 • f2(.) can be estimated by transforming class 1 parameters • [Diagram: Class 1 and Class 2 with f1(.), f2(.), and g12(.)]
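The relationship this slide describes can be written compactly. This is a sketch in assumed notation: \mu_{2k} for the k-th mean of class 2 and \mu_{2k}^{(12)} for its image under g_{12} (the slide's m2k(12)).

```latex
% Class 2 means are first mapped by the known inter-class transformation,
% after which the class 1 transformation applies to them.
\mu_{2k}^{(12)} = g_{12}(\mu_{2k}), \qquad
\hat{\mu}_{2k} = f_2(\mu_{2k}) = f_1\!\left(\mu_{2k}^{(12)}\right)
```

Because f_1 now explains the class 2 data as well, adaptation data from both classes can be pooled when estimating f_1.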
Inter-class MLLR • Use linear regression for the inter-class transformation • Estimate (A1, b1) to minimize Q [the definition of Q that followed “where” on the slide was not recovered]
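The definition of Q is not recoverable from this transcript, so the following is only a minimal sketch of the idea under simplifying assumptions: identity covariances, one noiseless observed mean per Gaussian, and a linear inter-class transformation written here as (T12, d12). The names `fit_affine`, `T12`, and `d12` are hypothetical, and this is ordinary least squares, not the authors' exact objective.

```python
import numpy as np

def fit_affine(means, targets):
    """Least-squares fit of targets ~ means @ A.T + b (identity covariances assumed)."""
    n, d = means.shape
    X = np.hstack([means, np.ones((n, 1))])          # extended means [mu, 1]
    W, *_ = np.linalg.lstsq(X, targets, rcond=None)  # W has shape (d + 1, d)
    return W[:d].T, W[d]

rng = np.random.default_rng(0)
d = 2
# Synthetic "true" class 1 transformation and a known inter-class transformation.
A1 = np.array([[1.1, 0.2], [-0.1, 0.9]])
b1 = np.array([0.5, -0.3])
T12 = np.array([[0.8, 0.0], [0.1, 1.2]])  # hypothetical inter-class matrix
d12 = np.array([0.2, 0.1])                # hypothetical inter-class bias

mu1 = rng.normal(size=(2, d))  # only two class 1 means: too few to fit (A1, b1) alone
mu2 = rng.normal(size=(5, d))  # neighboring class 2 means

obs1 = mu1 @ A1.T + b1         # adapted class 1 data
mu2_12 = mu2 @ T12.T + d12     # class 2 means mapped by g12
obs2 = mu2_12 @ A1.T + b1      # class 2 data generated by f2 = f1 after g12

# Pool class 1 with the transformed class 2 parameters to estimate (A1, b1).
A_est, b_est = fit_affine(np.vstack([mu1, mu2_12]), np.vstack([obs1, obs2]))
```

With two class 1 means in two dimensions the affine fit is underdetermined; pooling the transformed class 2 means is what makes it solvable in this toy setup.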
Application of Weights • Neighboring classes have different contributions to the target class
Application of Weights (cont’d) • Application of weights to the neighboring classes • We assume [equation not recovered] in neighboring class n • The error using (A1, b1) in neighboring class n [equation not recovered] • Weighted least squares estimation: • Use the variance of the error for weight • Large error → small weight • Small error → large weight
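The weighting rule above can be illustrated with inverse-variance weighted least squares. This is a sketch under assumptions not stated on the slide: identity covariances, neighboring-class means already mapped by their inter-class transformations, an initial (A, b) fitted on the target class alone, and the same weighting rule applied to the target class as well. All function and variable names are hypothetical.

```python
import numpy as np

def fit_affine(means, targets, weights=None):
    """Fit targets ~ means @ A.T + b by (optionally weighted) least squares."""
    n, d = means.shape
    X = np.hstack([means, np.ones((n, 1))])
    if weights is None:
        weights = np.ones(n)
    s = np.sqrt(weights)[:, None]                 # scaling rows by sqrt(w) gives WLS
    W, *_ = np.linalg.lstsq(X * s, targets * s, rcond=None)
    return W[:d].T, W[d]

def inverse_variance_weight(means, targets, A, b, floor=1e-8):
    """Weight of one class: reciprocal of its mean squared error under (A, b)."""
    err = targets - (means @ A.T + b)
    var = np.mean(np.sum(err ** 2, axis=1))
    return 1.0 / max(var, floor)

rng = np.random.default_rng(0)
A_true = np.array([[1.1, 0.2], [-0.1, 0.9]])
b_true = np.array([0.5, -0.3])

def make_class(n, noise):
    """Synthetic class: baseline means and noisy adapted observations."""
    mu = rng.normal(size=(n, 2))
    return mu, mu @ A_true.T + b_true + noise * rng.normal(size=(n, 2))

# "near" fits the target transformation well; "far" fits it poorly.
classes = {"target": make_class(6, 0.01),
           "near": make_class(6, 0.01),
           "far": make_class(6, 1.0)}

A0, b0 = fit_affine(*classes["target"])           # initial estimate (assumption)
w = {name: inverse_variance_weight(mu, o, A0, b0)
     for name, (mu, o) in classes.items()}

means = np.vstack([mu for mu, _ in classes.values()])
obs = np.vstack([o for _, o in classes.values()])
per_sample = np.concatenate([np.full(len(mu), w[name])
                             for name, (mu, _) in classes.items()])

A_w, b_w = fit_affine(means, obs, per_sample)     # weighted estimate
A_u, b_u = fit_affine(means, obs)                 # unweighted estimate, for comparison
```

In this toy setup the poorly matching class receives a much smaller weight, so it perturbs the weighted estimate far less than the unweighted one.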
Number of Neighboring Classes • Limit the number of neighboring classes • Sort neighboring classes • Set threshold for the number of samples • Use “closer” neighboring class first • Count the number of samples used • Use next neighboring classes until the number of samples exceeds the threshold
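The selection procedure above can be sketched as follows. How “closeness” is measured is not stated on the slide, so the `distance` field is a hypothetical stand-in, and whether the target class's own samples count toward the threshold is likewise an assumption (they do not here).

```python
def select_neighbors(neighbors, threshold):
    """Pick neighboring classes, closest first, until the sample count exceeds threshold.

    neighbors: list of (name, distance, n_samples) tuples; distance is a
    hypothetical closeness measure (smaller means closer).
    Returns the chosen class names and the total number of samples used.
    """
    chosen, total = [], 0
    for name, _distance, n_samples in sorted(neighbors, key=lambda item: item[1]):
        if total > threshold:      # enough samples already collected
            break
        chosen.append(name)
        total += n_samples
    return chosen, total

# Example with made-up classes and counts.
neighbors = [("c3", 0.9, 40), ("c1", 0.2, 30), ("c2", 0.5, 50)]
chosen, total = select_neighbors(neighbors, threshold=60)
```

With more adaptation data, each class carries more samples and the threshold is reached after fewer neighbors, which matches the recommendation in the summary.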
Experiments • Test data • 1994 DARPA Wall Street Journal (WSJ) task • 10 non-native speakers × 20 test sentences (Spoke 3: s3-94) • Baseline system: CMU SPHINX-3 • Continuous HMM, 6000 senones • 39-dimensional features • MFCC cepstra + delta + delta-delta + power • Supervised/unsupervised adaptation • Focus on small amounts of adaptation data • 13 phonetic-based classes for inter-class MLLR
Experiments (cont’d) • Supervised adaptation • Word error rates [results table not recovered]
Experiments (cont’d) • Unsupervised adaptation • Word error rates [results table not recovered]
Experiments (cont’d) • Limit the number of neighboring classes • Supervised adaptation: 10 adaptation sentences
Summary • Application of weights • Use weighted least squares estimation • Was helpful for the supervised case • Was not helpful for the unsupervised case (with a small amount of adaptation data) • Number of neighboring classes • Use a smaller number of neighboring classes as more adaptation data become available
Summary (cont’d) • Inter-class transformation • It can carry speaker-dependent information • We may prepare several sets of inter-class transformations → select the appropriate set for a new speaker • Combination with Principal Component MLLR • Did not provide additional improvement
Thank you!