Linguist Module in Sphinx-4 by Sonthi Dusitpirom


Objective

  • How to change the dictionary in Sphinx-4

Sphinx-4

  • Sphinx-4 is an open-source speech recognition framework, written in the Java programming language, that supports research on speech recognition systems. Sphinx-4 has three main components:

    • The FrontEnd

    • The Decoder

    • The Linguist

Sphinx-4

  • This project focuses on the Linguist component, which has three subcomponents:

    • The Acoustic Model

      • The acoustic model represents the sounds of individual speech units, known as phonemes.

    • The Dictionary

      • The dictionary lists the pronunciation of every word that the system can recognize.

    • The Language Model

      • The language model describes the grammar, that is, which word sequences the system expects.

Acoustic Model

  • The acoustic model in Sphinx-4 consists of a set of left-to-right Hidden Markov Models for basic sound units. The units represent phones in a triphone context.

  • The acoustic model in Sphinx-4 is packaged in a JAR file. The advantage of this packaging is that the file can be added to the classpath and referenced from the configuration file, which makes the model available to a Sphinx-4 application.

Acoustic Model

  • Sphinx-4 provides two important acoustic models that serve different purposes:

    • TIDIGITS_8gau_13dCep_16k_40mel_130Hz_6800Hz.jar is designed for digits. Use this model if you need to recognize numbers.

    • WSJ_8gau_13dCep_16k_40mel_130Hz_6800Hz.jar is designed for text. Use this model if you want to recognize words.


Dictionary

  • The dictionary provides pronunciations for the words found in the language model. Each pronunciation splits a word into a sequence of phonemes that are found in the acoustic model.
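To make the word-to-phoneme mapping concrete, here is a minimal Java sketch that parses dictionary lines of the form "WORD PH1 PH2 ...", the layout used by CMU-style pronunciation dictionaries. The class name DictionarySketch and the sample entries are illustrative assumptions; this is not the parser that Sphinx-4 itself uses.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Minimal sketch (hypothetical class): parse CMU-style dictionary lines. */
final class DictionarySketch {

    /**
     * Maps each word to its list of pronunciations.
     * Alternate entries such as "HELLO(2)" are stored under the base word "HELLO".
     */
    static Map<String, List<List<String>>> parse(List<String> lines) {
        Map<String, List<List<String>>> entries = new LinkedHashMap<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            String[] tokens = trimmed.split("\\s+");
            // Strip an alternate-pronunciation marker such as "(2)" from the word.
            String word = tokens[0].replaceAll("\\(\\d+\\)$", "");
            // Everything after the word is the phoneme sequence.
            List<String> phonemes =
                    new ArrayList<>(Arrays.asList(tokens).subList(1, tokens.length));
            entries.computeIfAbsent(word, k -> new ArrayList<>()).add(phonemes);
        }
        return entries;
    }

    public static void main(String[] args) {
        // Sample entries chosen for illustration only.
        List<String> sample = Arrays.asList(
                "HELLO HH AH L OW",
                "HELLO(2) HH EH L OW",
                "WORLD W ER L D");
        parse(sample).forEach((word, prons) -> System.out.println(word + " -> " + prons));
    }
}
```

Every phoneme symbol in an entry has to exist in the acoustic model, which is why the dictionary and the acoustic model are usually distributed together.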

Language Model

  • There are two types of model that describe a language:

    • Grammar-based language model

      • Grammars describe very simple types of languages for command and control. They are usually written by hand or generated automatically with plain code.

    • Statistical language model

      • A statistical language model estimates the probability distribution of word sequences in natural language. The most widely used statistical language model is the N-gram.
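As an illustration of the N-gram idea, the following Java sketch estimates bigram (N = 2) probabilities by simple counting: P(next | previous) = count(previous next) / count(previous). The class name BigramSketch and the three-sentence corpus are assumptions made for the example. Real language models add smoothing and back-off, and this is not how Sphinx-4 loads or scores its trigram models.

```java
import java.util.HashMap;
import java.util.Map;

/** Minimal sketch (hypothetical class): maximum-likelihood bigram estimates. */
final class BigramSketch {
    // How often each word occurs as the first word of an adjacent pair.
    private final Map<String, Integer> historyCounts = new HashMap<>();
    // How often each adjacent word pair occurs, keyed as "previous next".
    private final Map<String, Integer> bigramCounts = new HashMap<>();

    /** Counts every adjacent word pair in one whitespace-separated sentence. */
    void addSentence(String sentence) {
        String[] words = sentence.trim().split("\\s+");
        for (int i = 0; i < words.length - 1; i++) {
            historyCounts.merge(words[i], 1, Integer::sum);
            bigramCounts.merge(words[i] + " " + words[i + 1], 1, Integer::sum);
        }
    }

    /** Returns P(next | previous), or 0.0 if the history word was never seen. */
    double probability(String previous, String next) {
        int history = historyCounts.getOrDefault(previous, 0);
        if (history == 0) {
            return 0.0;
        }
        return bigramCounts.getOrDefault(previous + " " + next, 0) / (double) history;
    }

    public static void main(String[] args) {
        BigramSketch model = new BigramSketch();
        model.addSentence("open the door");
        model.addSentence("close the door");
        model.addSentence("open the window");
        System.out.println("P(door | the) = " + model.probability("the", "door"));
        System.out.println("P(window | the) = " + model.probability("the", "window"));
    }
}
```

In this toy corpus "the" is followed by "door" twice and by "window" once, so the model prefers "door" after "the". A recognizer uses such preferences to choose between acoustically similar word sequences.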

Create a new dictionary

  • Sphinx-4 already ships with a dictionary. To change it, follow these steps:

    • Extract WSJ_8gau_13dCep_16k_40mel_130Hz_6800Hz.jar, which is in the lib directory.

    • Go to the dict folder and open the cmudict.0.6d file in that folder.

    • Insert the new words and their phonemes into the cmudict.0.6d file and save it.

    • Zip the extracted folder into a ZIP file.

    • Remove WSJ_8gau_13dCep_16k_40mel_130Hz_6800Hz.jar from the libraries on the build path and add the new ZIP file to the libraries on the build path.
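The edit and re-zip steps above can also be scripted. The Java sketch below uses only the standard library. It assumes the JAR has already been extracted to a directory, and the class name DictionaryRepackSketch, the temporary paths, and the sample entry for "SPHINX" are all hypothetical. Treat it as one possible way to automate the steps, not as a tool that Sphinx-4 provides.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

/** Minimal sketch (hypothetical class): add a dictionary entry, then re-zip the model folder. */
final class DictionaryRepackSketch {

    /** Appends one "WORD PH1 PH2 ..." line, starting on a fresh line if needed. */
    static void appendEntry(Path dictionaryFile, String word, List<String> phonemes)
            throws IOException {
        byte[] existing = Files.exists(dictionaryFile)
                ? Files.readAllBytes(dictionaryFile) : new byte[0];
        boolean needsNewline = existing.length > 0 && existing[existing.length - 1] != '\n';
        String line = (needsNewline ? "\n" : "") + word + " " + String.join(" ", phonemes) + "\n";
        Files.write(dictionaryFile, line.getBytes(StandardCharsets.UTF_8),
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    /**
     * Zips every regular file under sourceDir, with entry names relative to sourceDir.
     * The archive must live outside sourceDir so it is not added to itself.
     */
    static void zipDirectory(Path sourceDir, Path archive) throws IOException {
        List<Path> files;
        try (Stream<Path> walk = Files.walk(sourceDir)) {
            files = walk.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
        }
        try (OutputStream out = Files.newOutputStream(archive);
             ZipOutputStream zip = new ZipOutputStream(out)) {
            for (Path file : files) {
                // ZIP entry names always use forward slashes.
                String name = sourceDir.relativize(file).toString().replace('\\', '/');
                zip.putNextEntry(new ZipEntry(name));
                Files.copy(file, zip);
                zip.closeEntry();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the extracted model folder; a real run would point at that folder.
        Path modelDir = Files.createTempDirectory("model");
        Path dict = Files.createDirectories(modelDir.resolve("dict")).resolve("cmudict.0.6d");
        appendEntry(dict, "SPHINX", Arrays.asList("S", "F", "IH", "NG", "K", "S"));
        Path archive = Files.createTempFile("model", ".zip");
        zipDirectory(modelDir, archive);
        System.out.println("Wrote " + archive);
    }
}
```

Keeping the entry names relative to the extracted folder preserves the internal layout of the original JAR, so paths that point into the archive still resolve.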

XML Configuration File

  • The configuration of a particular Sphinx-4 system is determined by a configuration file. This configuration file defines the following:

    • The names and types of all of the components of the system.

    • The connectivity of these components – that is, which components talk to each other.

    • The detailed configuration for each of these elements.

XML Configuration File

  • Determining which components are to be used in the system.

  • Determining the detailed configuration of each of these components.
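To show how names, types, and connectivity appear in such a file, the following Java sketch reads a small Sphinx-4 style configuration with the standard DOM parser and lists each component together with the properties that point at other components. The class name ConfigInspectorSketch and the two-component sample are assumptions for illustration. Treating any property value that equals a component name as a reference is a heuristic of this sketch, and Sphinx-4's own configuration loader is not used here.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Minimal sketch (hypothetical class): inspect components and their connections. */
final class ConfigInspectorSketch {

    /** Parses the configuration text into a DOM document. */
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    /** Component name -> implementing class, read from each component element. */
    static Map<String, String> componentTypes(Document config) {
        Map<String, String> types = new LinkedHashMap<>();
        NodeList components = config.getElementsByTagName("component");
        for (int i = 0; i < components.getLength(); i++) {
            Element component = (Element) components.item(i);
            types.put(component.getAttribute("name"), component.getAttribute("type"));
        }
        return types;
    }

    /** "component.property" -> target, for each property whose value names a component. */
    static Map<String, String> connections(Document config) {
        Map<String, String> types = componentTypes(config);
        Map<String, String> links = new LinkedHashMap<>();
        NodeList components = config.getElementsByTagName("component");
        for (int i = 0; i < components.getLength(); i++) {
            Element component = (Element) components.item(i);
            NodeList properties = component.getElementsByTagName("property");
            for (int j = 0; j < properties.getLength(); j++) {
                Element property = (Element) properties.item(j);
                String value = property.getAttribute("value");
                if (types.containsKey(value)) {
                    links.put(component.getAttribute("name") + "."
                            + property.getAttribute("name"), value);
                }
            }
        }
        return links;
    }

    public static void main(String[] args) throws Exception {
        // Single-quoted attributes keep the sample readable inside a Java string.
        String xml = "<config>"
                + "<component name='dictionary'"
                + " type='edu.cmu.sphinx.linguist.dictionary.FastDictionary'>"
                + "<property name='unitManager' value='unitManager'/>"
                + "<property name='addSilEndingPronunciation' value='false'/>"
                + "</component>"
                + "<component name='unitManager'"
                + " type='edu.cmu.sphinx.linguist.acoustic.UnitManager'/>"
                + "</config>";
        Document config = parse(xml);
        componentTypes(config).forEach((name, type) -> System.out.println(name + " : " + type));
        connections(config).forEach((from, to) -> System.out.println(from + " -> " + to));
    }
}
```

In the sample, the unitManager property of the dictionary component refers to the unitManager component by name, while addSilEndingPronunciation carries a plain setting. Those are the two roles a property plays in the configuration file.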

Use Model in Sphinx-4

  • Using a new model in Sphinx-4 takes three steps:

    • Defining a language model.

    • Defining a dictionary.

    • Defining an acoustic model.

Define a Language Model

<component name="jsgfGrammar" type="edu.cmu.sphinx.jsapi.JSGFGrammar">
    <property name="grammarLocation" value="the path to the grammar folder"/>
    <property name="dictionary" value="dictionary"/>
    <property name="grammarName" value="the name of grammar"/>
    <property name="logMath" value="logMath"/>
</component>



Define a Language Model

<component name="trigramModel" type="edu.cmu.sphinx.linguist.language.ngram.large.LargeTrigramModel">
    <property name="unigramWeight" value="the unigram weight"/>
    <property name="maxDepth" value="the maximum n-gram depth"/>
    <property name="logMath" value="logMath"/>
    <property name="dictionary" value="dictionary"/>
    <property name="location" value="the name of the language model file"/>
</component>


Define a Dictionary

<component name="dictionary" type="edu.cmu.sphinx.linguist.dictionary.FastDictionary">
    <property name="dictionaryPath" value="the name of the dictionary file"/>
    <property name="fillerPath" value="the name of the filler file"/>
    <property name="addSilEndingPronunciation" value="false"/>
    <property name="allowMissingWords" value="true or false"/>
    <property name="unitManager" value="unitManager"/>
</component>



Define an Acoustic Model

<component name="sphinx3Loader" type="edu.cmu.sphinx.linguist.acoustic.tiedstate.Sphinx3Loader">
    <property name="logMath" value="logMath"/>
    <property name="unitManager" value="unitManager"/>
    <property name="location" value="the path to the model folder"/>
</component>

<component name="acousticModel" type="edu.cmu.sphinx.linguist.acoustic.tiedstate.TiedStateAcousticModel">
    <property name="loader" value="sphinx3Loader"/>
    <property name="unitManager" value="unitManager"/>
</component>