
Status report 1) ANN training at ICSI 2) gmtkTie



1. Status report
1) ANN training at ICSI
2) gmtkTie

2. ANN training
• Joe Frankel, with help from Matthew Magimai-Doss, is training up nets on Fisher (minus SWB1)
• We already have train and validation scores for some nets
• Training times appear to be of the order of 100 hours per net

3. ANN performance summary
• Finished
  • glottal - 351x1400x4 - Train accuracy: 87.27%, CV accuracy: 87.1%
  • degree1 - 351x1600x6 - Train accuracy: 78.01%, CV accuracy: 77.79%
  • nasal - 351x1200x3 - Train accuracy: 90.74%, CV accuracy: 90.55%
• In progress
  • place1 - 351x1900x10 - Train accuracy: 76.30%, CV accuracy: 76.06%
  • rounding - 351x1200x3 - Train accuracy: 87.69%, CV accuracy: 87.52%
  • vowel - 351x2400x23 - Train accuracy: 72.95%, CV accuracy: 72.96%
  • front - 351x1700x7 - Train accuracy: 74.36%, CV accuracy: 75.14%
• Not yet started
  • height - 351x1800x8

4. Glottal – 87% overall
Confusion matrix (cross-validation set):
• Entries are percentages of frames in the cross-validation set
• Columns are correct labels (each column adds up to 100%)
• Rows are classified labels
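As a concrete illustration of the normalisation convention above (columns are correct labels, each column sums to 100%), here is a minimal Python sketch; the label names and frame counts are invented for the example and are not taken from the actual glottal net:

import numpy as np  # not needed here, but matches the later sketches
from collections import Counter

def confusion_percentages(correct, classified, labels):
    # Entries are percentages of frames: rows are classified labels,
    # columns are correct labels, and each column sums to 100%.
    counts = Counter(zip(classified, correct))      # (classified, correct) frame counts
    col_totals = Counter(correct)                   # frames per correct label
    return {
        (row, col): 100.0 * counts[(row, col)] / col_totals[col]
        for row in labels for col in labels if col_totals[col]
    }

# Invented example data: 87 of 100 truly "voiced" frames classified correctly
correct    = ["voiced"] * 100 + ["voiceless"] * 50
classified = ["voiced"] * 87 + ["voiceless"] * 13 + ["voiceless"] * 45 + ["voiced"] * 5
cm = confusion_percentages(correct, classified, ["voiced", "voiceless"])
print(cm[("voiced", "voiced")], cm[("voiceless", "voiced")])   # 87.0 13.0 (column sums to 100)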

  5. Degree1 – 78% overall

  6. Nasal – 91% overall

7. gmtkTie
• A general-purpose parameter tying tool for GMTK
• Eventually it will be able to:
  • Tie any type of parameter
  • Do bottom-up or top-down clustering of parameters of all types
  • Perform other manipulations, e.g. removal of unused parameters, maybe even changes to model structure (e.g. changing variable cardinalities)
  • Do a full emulation of HTK’s HHEd tying commands
• The user provides a list of commands to execute, just as with HTK

8. gmtkTie – current capabilities 1
• Bottom-up (purely data-driven) clustering (a minimal sketch follows this slide)
• Uses a simple agglomerative clustering algorithm to find sets of similar parameters
• Can be followed by tying all parameters within each cluster
• Many different dissimilarity measures are available, such as: Euclidean; cross-likelihood of means; variance-scaled Euclidean (Mahalanobis); etc.
• Many different measures of the size (“purity”) of clusters, such as: max pairwise dissimilarity; average dissimilarity to the centroid
• Many different criteria for finding the cluster centroid, such as: the item with the least total dissimilarity to the other cluster items; the average value of all cluster items; an arbitrary choice; etc.
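The sketch below is a toy version of the kind of bottom-up scheme described above, assuming parameters are summarised by their mean vectors and picking two of the listed options (Euclidean dissimilarity, maximum pairwise dissimilarity as the cluster “purity” measure). The function and variable names are illustrative only; this is not gmtkTie’s actual interface.

import numpy as np

def agglomerative_tie(means, max_cluster_size):
    # Start with one cluster per parameter; greedily merge the pair of clusters
    # whose merged "purity" is smallest, as long as it stays under the threshold.
    clusters = [[i] for i in range(len(means))]

    def dissim(a, b):                     # Euclidean distance between parameter means
        return float(np.linalg.norm(means[a] - means[b]))

    def purity(cluster):                  # max pairwise dissimilarity within a cluster
        return max((dissim(a, b) for a in cluster for b in cluster), default=0.0)

    while True:
        best = None                       # (purity of merged cluster, i, j)
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                p = purity(clusters[i] + clusters[j])
                if p <= max_cluster_size and (best is None or p < best[0]):
                    best = (p, i, j)
        if best is None:                  # no merge stays under the threshold: stop
            break
        _, i, j = best
        clusters[i] += clusters[j]
        del clusters[j]
    return clusters                       # the parameters within each cluster would then be tied

# Example (made-up means): agglomerative_tie([np.array(m) for m in mean_list], max_cluster_size=0.5)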

9. gmtkTie – current capabilities 2
• Top-down, decision tree-based clustering
• As in HTK, uses a decision tree to determine the clustering
• Simple, greedy (non-backtracking) clustering scheme which repeatedly splits clusters (sketched after this slide)
• Uses questions about parameter features in order to split clusters; uses a measure of cluster “purity” based on parameter values in order to select the best question
• Key property of this method: can “synthesise” parameters for which there is little or no training data
• Currently only for mixtures of Gaussians
• Stopping criterion is a threshold on the minimum log likelihood improvement
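The following toy sketch shows only the greedy splitting step, under assumptions: each candidate question partitions a cluster in two, the split is scored by the gain in an approximate log likelihood under one diagonal-covariance Gaussian per cluster, and splitting stops once no question clears the threshold. Unlike HTK-style tying (and presumably gmtkTie), it treats each parameter’s mean vector as a single data point and ignores occupancy statistics; all names are illustrative.

import numpy as np

def cluster_loglik(vectors):
    # Approximate log likelihood of the cluster members under one
    # diagonal-covariance Gaussian fitted to them (variance floored to avoid log(0)).
    X = np.atleast_2d(np.array(vectors))
    n, d = X.shape
    var = X.var(axis=0) + 1e-6
    return -0.5 * n * (d * np.log(2 * np.pi) + np.sum(np.log(var)) + d)

def best_split(items, questions, min_gain):
    # items: list of (feature_dict, mean_vector); questions: name -> predicate over features.
    parent = cluster_loglik([m for _, m in items])
    best = None
    for name, q in questions.items():
        yes = [m for f, m in items if q(f)]
        no  = [m for f, m in items if not q(f)]
        if not yes or not no:             # the question must actually split the cluster
            continue
        gain = cluster_loglik(yes) + cluster_loglik(no) - parent
        if gain >= min_gain and (best is None or gain > best[0]):
            best = (gain, name)
    return best                           # None => stop: no question improves likelihood enough

# Tree building would apply best_split recursively to each resulting "yes"/"no" cluster.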

10. Decision tree-based clustering 1
• Currently, the user must provide the feature values for each parameter, e.g.
  • What state is this parameter used in?
  • What are the left/right phonetic contexts?
  • Or anything else you like…
• One problem: there is no sanity checking – the user must be sure that, e.g., the Gaussian mixture called “gmMx34” is used only when the left context is “ah”
• Would like to make construction of features more automated, but this is hard (i.e. I think I need Jeff’s help, because it involves running some of the inference routines in GMTK and I don’t know enough about that part of the code yet)
• The user also supplies candidate questions about these features (illustrated after this slide), e.g.
  • Is the left phonetic context in the set {ax, ah, axr}?
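To make the feature/question idea concrete, here is a purely hypothetical representation of the slide’s examples as Python data; the actual formats gmtkTie uses for feature definitions and question sets are not shown in the slides, and the right-context and state values below are invented.

# Hypothetical per-parameter features and candidate questions (not gmtkTie's real input format).
features = {
    # The user asserts (unchecked by gmtkTie) that gmMx34 is used only when
    # the left phonetic context is "ah"; the other values are invented.
    "gmMx34": {"leftContext": "ah", "rightContext": "n", "state": 1},
}

questions = {
    "L_Central": lambda f: f["leftContext"] in {"ax", "ah", "axr"},
    "State_1":   lambda f: f["state"] == 1,
}

print(questions["L_Central"](features["gmMx34"]))   # True: gmMx34 goes to the "yes" branch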

11. Decision tree-based clustering 2
• Can save/load the decision trees
• Can synthesise parameters using these trees (the user simply provides the features and gmtkTie ties that parameter to an existing parameter)
• Multiple feature sets (definitions + values for each parameter) and trees can be loaded in memory at once, then referred to by name

12. gmtkTie – alpha testing
• Currently building a tied-state TIMIT triphone system
• Parameters are tied at the Gaussian component/mixing weights level
• Would be better (smaller saved parameter files, for one thing) to tie at the mixture distribution level (easy to do: can have repeated mixture names in a collection)
• Requires minor changes to gmtkTie (it currently only loads the parameter file, not the collection definition)

13. gmtkTie – what will be available for the workshop?
• Tested and working decision tree-based clustering/tying for Gaussian mixture distributions, using the same method as HTK
• Bottom-up clustering/tying
• Documentation (on the GMTK Wiki)
  • Including a working example (probably the TIMIT triphone system)
