


Speech Recognition lexicon DAY 19 – Oct 9, 2013

Brain & Language

LING 4110-4890-5110-7960

NSCI 4110-4891-6110

Harry Howard

Tulane University


Course organization

  • The syllabus, these slides and my recordings are available at http://www.tulane.edu/~howard/LING4110/.

  • If you want to learn more about EEG and neurolinguistics, you are welcome to participate in my lab. This is also a good way to get started on an honors thesis.

  • The grades are posted to Blackboard.


Review


The Speech Recognition lexicon

Ingram §7


Linguistic model, Fig. 2.1 p. 37

[Figure: box diagram of the linguistic model. Levels, top to bottom: Discourse model (Semantics); Sentence level (Syntax, Sentence prosody); Word level (Morphology, Word prosody); Segmental phonology, split into perception and production; Acoustic phonetics / Feature extraction on the input side and Articulatory phonetics / Speech motor control on the output side; INPUT at the bottom.]


Storing on a hard disk

  • How does a computer store files on its hard drive?

    • By writing them in sequence, or wherever there is free space.


Retrieving from a hard disk

  • How does a computer find files on its hard drive (say, when you search for one by its name)?

    • It searches for it in sequence or randomly.

  • How long does it take?
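The timing question can be made concrete with a toy sequential search, a minimal sketch in Python (the file names are invented for illustration):

```python
# Toy sequential ("linear") search over a list of file names.
# In the worst case every entry is examined, so retrieval time
# grows with the number of files stored.

def find_file(files, name):
    """Scan the list in order; return the index of the match, or -1."""
    for i, f in enumerate(files):
        if f == name:
            return i  # found after examining i + 1 entries
    return -1

files = ["notes.txt", "grades.xls", "slides.ppt", "cat.wav"]
print(find_file(files, "cat.wav"))   # → 3 (last entry: the whole list is scanned)
print(find_file(files, "dog.wav"))   # → -1 (absent: also a full scan)
```

The point for what follows: a serial search gets slower as the store grows, and an unsuccessful search is the slowest case of all.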


How would this work for lexical retrieval?

  • Ingram’s example

    • The phoneme detector department detects /k/.

    • A comparator starts looking for all the files that begin with /k/, perhaps ordered in terms of frequency.

    • The phoneme detector department detects /æ/.

    • The comparator rejects the files that don’t begin with /kæ/ and starts searching the remaining files, perhaps ordered in terms of frequency.

    • “It is an open bet whether the word cat would be retrieved before or after the detection of /t/.” (p. 143)

  • Problems

    • Other factors influence speed of retrieval, such as whether the target word has been seen recently.

    • Adding such factors to a serial search model tends to make it slow down!
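Ingram's serial-search walkthrough can be sketched as a toy cohort filter, using letters in place of phonemes; the mini-lexicon and its frequency counts are invented for illustration:

```python
# Toy serial search over a frequency-ordered lexicon, winnowing the
# cohort as each segment arrives (Ingram's /k/ ... /ae/ ... /t/ example,
# here approximated with ordinary spelling).

# Hypothetical lexicon: word -> frequency (higher = more common).
LEXICON = {"cat": 90, "can": 80, "cap": 40, "cast": 30, "dog": 95}

def winnow(segments):
    """Return the cohort remaining after each segment, most frequent first."""
    cohorts = []
    prefix = ""
    for s in segments:
        prefix += s
        cohort = sorted(
            (w for w in LEXICON if w.startswith(prefix)),
            key=LEXICON.get, reverse=True)
        cohorts.append(cohort)
    return cohorts

for step in winnow("cat"):
    print(step)
# → ['cat', 'can', 'cap', 'cast']   after "c":  every c-word survives
# → ['cat', 'can', 'cap', 'cast']   after "ca": same cohort here
# → ['cat']                         after "cat": only one file remains
```

Notice that each extra factor (frequency ordering, recency, and so on) means more sorting and re-checking on every step, which is exactly why a serial model tends to slow down as such factors are added.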


An alternative: the TRACE II model


Observations

  • TRACE implements parallel computation, rather than serial or sequential computation.

  • It is both bottom-up (driven by data) and top-down (driven by expectations).

    • Bottom up

      • The successive winnowing of a set of cohorts is modeled by decaying activation of competitors as more information is gathered.

    • Top down

      • Word frequency is modeled by lowering the threshold of activation of more frequent word units, so they need less activation.

      • The phoneme restoration effect is modeled by the word units supplying the missing activation of a phoneme unit.

        • [kæØ] can be heard as ‘cat’.

      • The Ganong effect is modeled in the same way.

        • [kæ(s/ʃ)], with a final sound ambiguous between /s/ and /ʃ/, can be heard as ‘Cass’ or ‘cash’ in the proper context.
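The bottom-up and top-down behaviors above can be sketched as a minimal interactive-activation update in the spirit of TRACE II; the units, weights, and thresholds below are invented for illustration and are not TRACE's actual parameters:

```python
# Minimal interactive-activation sketch: phoneme units excite word units
# bottom-up; a word unit over its threshold feeds activation back down,
# which can restore a missing phoneme ([kæØ] still heard as "cat").
# All numbers are illustrative only.

PHONEME_TO_WORDS = {"k": ["cat", "can"], "ae": ["cat", "can"], "t": ["cat"]}
# More frequent words get a LOWER threshold, so they need less activation.
THRESHOLD = {"cat": 0.5, "can": 0.7}

def recognize(phoneme_evidence):
    """phoneme_evidence: dict phoneme -> bottom-up strength in [0, 1]."""
    word_act = {w: 0.0 for w in THRESHOLD}
    # Bottom-up: each detected phoneme excites the words containing it.
    for p, strength in phoneme_evidence.items():
        for w in PHONEME_TO_WORDS.get(p, []):
            word_act[w] += strength * 0.4
    # Top-down: a word over threshold feeds activation back to its
    # phonemes, filling in weak or masked input (phoneme restoration).
    restored = dict(phoneme_evidence)
    for w, act in word_act.items():
        if act >= THRESHOLD[w]:
            for p, words in PHONEME_TO_WORDS.items():
                if w in words:
                    restored[p] = max(restored.get(p, 0.0), act)
    winner = max(word_act, key=word_act.get)
    return winner, word_act, restored

# /t/ masked by noise (bottom-up strength only 0.1), yet "cat" still wins,
# and the word unit restores the /t/ phoneme top-down:
winner, acts, restored = recognize({"k": 1.0, "ae": 1.0, "t": 0.1})
print(winner)         # → cat
print(restored["t"])  # restored well above its weak bottom-up 0.1
```

All word units update from the same input in parallel here, rather than being checked one file at a time, which is the contrast with the serial-search model above.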


Simple recurrent networks

  • Read what Ingram says to get the general idea of what it is supposed to do.


Modeling variability

  • We will go over it on Monday.


NEXT TIME

Q5. Finish Ingram §7 & start §8.

☞ Go over questions at end of chapter.

