Linguist Module in Sphinx-4 By Sonthi Dusitpirom - PowerPoint PPT Presentation




Objective

  • How to change the dictionary in Sphinx-4

Sphinx-4


  • Sphinx-4 is an open source framework for speech recognition, written in the Java programming language to support research on speech recognition systems. Sphinx-4 has three main components:

    • The FrontEnd

    • The Decoder

    • The Linguist


  • In this project we focus on the Linguist component, which has three subcomponents:

    • The Acoustic Model

      • The acoustic model describes how the individual sound units of speech, known as phonemes, are pronounced.

    • The Dictionary

      • The dictionary gives the pronunciation of every word that the system can recognize.

    • The Language Model

      • The language model describes what the grammar of the recognized language looks like.

Acoustic Model

  • The acoustic model in Sphinx-4 consists of a set of left-to-right Hidden Markov Models for basic sound units. The units represent phones in a triphone context.

  • The acoustic model in Sphinx-4 is packaged in a JAR file. The advantage of packaging it this way is that the JAR can be included in the classpath and referenced in the configuration file, so that a Sphinx-4 application can use it.

Acoustic Model

  • Sphinx-4 provides two important acoustic models that serve different purposes:

    • TIDIGITS_8gau_13dCep_16k_40mel_130Hz_6800Hz.jar is designed for digits. Use this model if you need to recognize numbers.

    • WSJ_8gau_13dCep_16k_40mel_130Hz_6800Hz.jar is designed for general text. Use this model if you want to recognize text.



Dictionary

  • The dictionary provides pronunciations for the words found in the language model. Each pronunciation splits a word into a sequence of phonemes that are found in the acoustic model.
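As a rough illustration of what a dictionary entry holds, the sketch below parses lines written in the style of cmudict.0.6d: a word followed by its space-separated phonemes. The sample entries and the function name `parse_dictionary` are made up for this example, and folding alternate pronunciations such as `HELLO(2)` under the base word is an assumption about how one might group them, not a description of how Sphinx-4 loads the file.

```python
import re


def parse_dictionary(lines):
    """Map each word to a list of pronunciations (each a list of phonemes)."""
    entries = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        word, *phonemes = line.split()
        # Drop an alternate-pronunciation marker such as "(2)" (assumed convention).
        base = re.sub(r"\(\d+\)$", "", word)
        entries.setdefault(base, []).append(phonemes)
    return entries


# Illustrative entries only; a real dictionary file is far larger.
sample = [
    "HELLO HH AH L OW",
    "HELLO(2) HH EH L OW",
    "WORLD W ER L D",
]
dictionary = parse_dictionary(sample)
print(dictionary["WORLD"])
```

Every phoneme that appears in an entry has to be one the acoustic model knows, which is why the dictionary and the acoustic model are chosen together.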

Language Model

  • There are two types of models that describe a language:

    • Grammar language model

      • Grammars describe very simple types of languages for command and control. They are written by hand or generated automatically with plain code.

    • Statistical language model

      • A statistical language model estimates the probability distribution of natural language. The most widely used statistical language model is the N-gram.
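To make the N-gram idea concrete, here is a minimal sketch for N = 2 (a bigram model) that estimates the probability of a word given the previous word from raw counts. The tiny corpus, the `<s>` and `</s>` sentence markers, and the function name `bigram_probabilities` are assumptions for illustration; real language models are trained on much larger text and apply smoothing, which this sketch leaves out.

```python
from collections import Counter


def bigram_probabilities(sentences):
    """Estimate P(word | previous word) by maximum likelihood from raw counts."""
    context_counts = Counter()
    bigram_counts = Counter()
    for sentence in sentences:
        # Mark sentence boundaries so the first and last words get a context too.
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for prev, word in zip(tokens, tokens[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, word)] += 1
    return {
        pair: count / context_counts[pair[0]]
        for pair, count in bigram_counts.items()
    }


# Hypothetical command-and-control style corpus.
corpus = ["open the door", "close the door", "open the window"]
probs = bigram_probabilities(corpus)
print(probs[("the", "door")])
```

In this corpus "the" is followed by "door" in two of its three occurrences, so the estimate for that pair is two thirds.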

Create a new dictionary

  • Sphinx-4 already comes with a dictionary. To change it:

    • Extract WSJ_8gau_13dCep_16k_40mel_130Hz_6800Hz.jar, found in the lib directory.

    • Go to the dict folder and open the cmudict.0.6d file.

    • Insert the new words and their phonemes into cmudict.0.6d and save the file.

    • Zip the extracted folder into a zip file.

    • Remove WSJ_8gau_13dCep_16k_40mel_130Hz_6800Hz.jar from the libraries in the build path and add the zip file in its place.
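The extract, edit, and re-zip steps above can also be scripted. The sketch below copies an archive entry by entry and appends new lines to the dictionary entry on the way. It runs against a small stand-in archive created in a temporary directory; the file names, the entry path `dict/cmudict.0.6d`, and the function name `add_dictionary_entries` are assumptions for this example, and the path of the dictionary inside the real JAR should be checked before using anything like this.

```python
import os
import tempfile
import zipfile


def add_dictionary_entries(src_jar, dst_jar, dict_entry, new_lines):
    """Copy src_jar to dst_jar, appending new_lines to the dictionary entry."""
    with zipfile.ZipFile(src_jar) as src, \
            zipfile.ZipFile(dst_jar, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            data = src.read(item.filename)
            if item.filename == dict_entry:
                text = data.decode("utf-8")
                if text and not text.endswith("\n"):
                    text += "\n"  # make sure new entries start on their own line
                text += "".join(line + "\n" for line in new_lines)
                data = text.encode("utf-8")
            dst.writestr(item.filename, data)


# Build a tiny stand-in archive so the sketch runs without the real model JAR.
workdir = tempfile.mkdtemp()
src_path = os.path.join(workdir, "model.jar")
dst_path = os.path.join(workdir, "model-custom.jar")
with zipfile.ZipFile(src_path, "w") as jar:
    jar.writestr("dict/cmudict.0.6d", "HELLO HH AH L OW\n")

add_dictionary_entries(src_path, dst_path, "dict/cmudict.0.6d",
                       ["SPHINX S F IH NG K S"])

with zipfile.ZipFile(dst_path) as jar:
    updated = jar.read("dict/cmudict.0.6d").decode("utf-8")
print(updated)
```

Writing to a new archive instead of modifying the original keeps the unmodified model available in case the edited dictionary turns out to be malformed.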

XML Configuration File

  • The configuration of a particular Sphinx-4 system is determined by a configuration file. This configuration file defines the following:

    • The names and types of all of the components of the system.

    • The connectivity of these components – that is, which components talk to each other.

    • The detailed configuration for each of these elements.

XML Configuration File

  • The configuration file determines which components are to be used in the system.

  • It also determines the detailed configuration of each of these components.
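To show how names, types, and per-component settings appear in such a file, the sketch below parses a small configuration and lists each component with its type and properties. The embedded XML is a shortened, hypothetical configuration written for this example (placeholder values follow the style of the slides), and `summarize` is an illustrative helper rather than part of Sphinx-4.

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened configuration used only for this illustration.
CONFIG = """<config>
  <component name="dictionary" type="edu.cmu.sphinx.linguist.dictionary.FastDictionary">
    <property name="dictionaryPath" value="the name of the dictionary file"/>
    <property name="unitManager" value="unitManager"/>
  </component>
  <component name="unitManager" type="edu.cmu.sphinx.linguist.acoustic.UnitManager"/>
</config>"""


def summarize(config_text):
    """Return {component name: (type, {property name: value})}."""
    root = ET.fromstring(config_text)
    summary = {}
    for component in root.iter("component"):
        props = {p.get("name"): p.get("value")
                 for p in component.findall("property")}
        summary[component.get("name")] = (component.get("type"), props)
    return summary


summary = summarize(CONFIG)
for name, (component_type, props) in summary.items():
    print(name, component_type, props)
```

Here the `unitManager` property of the dictionary holds the name of another component, which is how the file expresses that two components are connected.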

Use Model in Sphinx-4

  • There are three steps to use a new model in Sphinx-4:

    • Defining a language model.

    • Defining a dictionary.

    • Defining an acoustic model.

Define a Language Model

<component name="jsgfGrammar" type="edu.cmu.sphinx.jsapi.JSGFGrammar">
    <property name="grammarLocation" value="the path to the grammar folder"/>
    <property name="dictionary" value="dictionary"/>
    <property name="grammarName" value="the name of grammar"/>
    <property name="logMath" value="logMath"/>
</component>



Define a Language Model

<component name="trigramModel" type="edu.cmu.sphinx.linguist.language.ngram.large.LargeTrigramModel">
    <property name="unigramWeight" value="the unigram weight"/>
    <property name="maxDepth" value="the maximum n-gram depth"/>
    <property name="logMath" value="logMath"/>
    <property name="dictionary" value="dictionary"/>
    <property name="location" value="the name of the language model file"/>
</component>

Define a Dictionary

<component name="dictionary" type="edu.cmu.sphinx.linguist.dictionary.FastDictionary">
    <property name="dictionaryPath" value="the name of the dictionary file"/>
    <property name="fillerPath" value="the name of the filler file"/>
    <property name="addSilEndingPronunciation" value="false"/>
    <property name="allowMissingWords" value="true or false"/>
    <property name="unitManager" value="unitManager"/>
</component>



Define an Acoustic Model

<component name="sphinx3Loader" type="edu.cmu.sphinx.linguist.acoustic.tiedstate.Sphinx3Loader">
    <property name="logMath" value="logMath"/>
    <property name="unitManager" value="unitManager"/>
    <property name="location" value="the path to the model folder"/>
</component>

<component name="acousticModel" type="edu.cmu.sphinx.linguist.acoustic.tiedstate.TiedStateAcousticModel">
    <property name="loader" value="sphinx3Loader"/>
    <property name="unitManager" value="unitManager"/>
</component>
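Several properties in these definitions (`logMath`, `unitManager`, `dictionary`, `loader`) name other components rather than holding plain values, so each of those components must be defined somewhere in the same configuration file. The sketch below is one way to check that before starting the application. The set `REFERENCE_PROPERTIES` is an assumption drawn from the property names on these slides, the embedded XML repeats the acoustic model definition with its placeholder path, and `unresolved_references` is a hypothetical helper, not a Sphinx-4 API.

```python
import xml.etree.ElementTree as ET

# Assumed: properties with these names point at other components.
REFERENCE_PROPERTIES = {"logMath", "unitManager", "dictionary", "loader"}

CONFIG = """<config>
  <component name="sphinx3Loader" type="edu.cmu.sphinx.linguist.acoustic.tiedstate.Sphinx3Loader">
    <property name="logMath" value="logMath"/>
    <property name="unitManager" value="unitManager"/>
    <property name="location" value="the path to the model folder"/>
  </component>
  <component name="acousticModel" type="edu.cmu.sphinx.linguist.acoustic.tiedstate.TiedStateAcousticModel">
    <property name="loader" value="sphinx3Loader"/>
    <property name="unitManager" value="unitManager"/>
  </component>
</config>"""


def unresolved_references(config_text):
    """List (component, property, target) triples whose target is not defined."""
    root = ET.fromstring(config_text)
    defined = {c.get("name") for c in root.iter("component")}
    missing = []
    for component in root.iter("component"):
        for prop in component.findall("property"):
            if (prop.get("name") in REFERENCE_PROPERTIES
                    and prop.get("value") not in defined):
                missing.append(
                    (component.get("name"), prop.get("name"), prop.get("value")))
    return missing


missing = unresolved_references(CONFIG)
for component_name, prop_name, target in missing:
    print(component_name, "->", target, "(via", prop_name + ")")
```

Run on the acoustic model definition alone, the check reports `logMath` and `unitManager` as unresolved, a reminder that a complete configuration file also has to define those two components.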


Any Questions?
