Speech recognition and its clinical applications

Thankam Thyvalikakath, MDS

Center for Biomedical Informatics

University of Pittsburgh


Outline

  • In-class assignment

  • Background

  • SpeechActs paper

  • Clinical application of speech recognition

  • Speech recognition in dentistry


What is speech recognition?

Speech recognition technologies are of particular interest because they support direct communication between humans and computers through a communication mode that humans commonly use among themselves and at which they are highly skilled.

Rudnicky, Hauptmann, and Lee

http://starbase.cs.trincoll.edu/~ram/cpsc352/



What was the first success story of speech recognition?

“Radio Rex,” a toy dog from the 1920s, was the first success story in the field of speech recognition

www.stanford.edu/class/linguist236/lec1.pdf


Timeline of Speech recognition

  • 1936 – AT&T’s Bell Labs began studying speech recognition

  • 1974 – Optical character recognition

  • 1975 – Text-to-speech synthesis (Kurzweil reading machine)

  • 1978 – Speak & Spell toy released by Texas Instruments

  • 1980 – Xerox began producing the TextBridge reading machine

  • 1997 – Dragon Systems released the first continuous speech recognition product

http://starbase.cs.trincoll.edu



How did speech recognition evolve?

  • Acoustic approach (pre-1960s)

  • Pattern-recognition approach (1960s)

  • Linguistic approach (1970s)

  • Pragmatic approach (1980s)


Types of speech recognition

  • Isolated words

  • Connected words

  • Continuous speech

  • Spontaneous speech (automatic speech recognition)

  • Voice verification and identification

Fundamentals of Speech Recognition. L. Rabiner & B. Juang, 1993


Speech recognition – uses and applications

  • Dictation

  • Command and control

  • Telephony

  • Medical/disabilities

Fundamentals of Speech Recognition. L. Rabiner & B. Juang, 1993


Challenges of speech recognition

  • Ease of use

  • Robust performance

  • Automatic learning of new words and sounds

  • Grammar for spoken language

  • Control of synthesized voice quality

  • Integrated learning for speech recognition and synthesis

B. S. Atal. Speech recognition in 2001: New research directions. Proc. Natl. Acad. Sci. USA, Vol. 92, pp. 10046–10051, Oct. 1995


SpeechActs

SpeechActs is a prototype testbed for developing spoken natural language applications

Paul Martin, Fredrick Crabbe, Stuart Adams, Eric Baatz, Nicole Yankelovich. SpeechActs: A Spoken Language Framework, IEEE Computer, Vol. 29, Number 7, July 1996.


Why develop SpeechActs?

  • Integrated conversational applications

  • No specialized language expertise

  • Technology independence

Paul Martin, Fredrick Crabbe, Stuart Adams, Eric Baatz, Nicole Yankelovich. SpeechActs: A Spoken Language Framework, IEEE Computer, Vol. 29, Number 7, July 1996.


Information flow in SpeechActs

Paul Martin, Fredrick Crabbe, Stuart Adams, Eric Baatz, Nicole Yankelovich. SpeechActs: A Spoken Language Framework, IEEE Computer, Vol. 29, Number 7, July 1996.


SpeechActs - Framework

  • The audio server presents raw digitized audio to the speech recognizer

  • Swiftus parses the resulting word list to produce a set of feature-value pairs

  • The discourse manager maintains a stack of information about the current conversation

  • The discourse manager and application respond to the user by sending a text string to the text-to-speech manager

Paul Martin, Fredrick Crabbe, Stuart Adams, Eric Baatz, Nicole Yankelovich. SpeechActs: A Spoken Language Framework, IEEE Computer, Vol. 29, Number 7, July 1996.
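
The flow above can be pictured as a small pipeline. The Python sketch below is purely illustrative: every name in it is invented here, and the recognizer, parser, and synthesizer are replaced with trivial stand-ins rather than the real SpeechActs components.

    # Illustrative stand-ins for the SpeechActs-style pipeline described above.

    def recognize(raw_audio: bytes) -> list[str]:
        """Stand-in for the speech recognizer: raw digitized audio -> word list."""
        # A real recognizer would decode the audio; here we pretend it heard a calendar query.
        return ["what", "is", "on", "my", "calendar", "today"]

    def parse(words: list[str]) -> dict[str, str]:
        """Stand-in for Swiftus: word list -> feature-value pairs."""
        features: dict[str, str] = {}
        if "calendar" in words:
            features.update(action="lookup", topic="calendar")
        if "today" in words:
            features["date"] = "today"
        return features

    def respond(features: dict[str, str]) -> str:
        """Stand-in for the discourse manager and application: features -> reply text."""
        if features.get("topic") == "calendar":
            return "You have two meetings today."
        return "Sorry, I did not understand."

    def speak(text: str) -> None:
        """Stand-in for the text-to-speech manager."""
        print(f"TTS> {text}")

    if __name__ == "__main__":
        words = recognize(b"\x00\x01")   # audio server hands digitized audio to the recognizer
        features = parse(words)          # Swiftus produces feature-value pairs
        speak(respond(features))         # the reply text goes to text-to-speech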


SpeechActs: A Spoken Language Framework

  • Continuous-speech recognizers require grammars that specify every possible utterance a user could say to the application

  • The recognizer grammar must stay closely synchronized with the Swiftus semantic grammar

  • This was solved by inventing the Unified Grammar

Paul Martin, Fredrick Crabbe, Stuart Adams, Eric Baatz, Nicole Yankelovich. SpeechActs: A Spoken Language Framework, IEEE Computer, Vol. 29, Number 7, July 1996.


Unified Grammar

  • A collection of rules

  • Each rule consists of a Backus-Naur Form-style pattern followed by augmentations, which are statements written in a Pascal-like form

  • A compiler produces a grammar specific to the speech recognizer and a corresponding Swiftus grammar

Paul Martin, Fredrick Crabbe, Stuart Adams, Eric Baatz, Nicole Yankelovich. SpeechActs: A Spoken Language Framework, IEEE Computer, Vol. 29, Number 7, July 1996.
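
As a rough illustration of that idea (not the actual Unified Grammar syntax), the sketch below uses a single made-up rule whose BNF-style pattern and augmentation are "compiled" into both a recognizer grammar line and a feature-extracting matcher.

    # Toy "unified" rule: one source carries both the pattern and the augmentation.
    import re

    RULE = {
        "name": "read_mail",
        "pattern": "read ( message | messages ) from <person>",   # BNF-ish pattern
        "augmentation": lambda m: {"action": "read_mail", "sender": m.group("person")},
    }

    def to_recognizer_grammar(rule: dict) -> str:
        """Emit a grammar line a recognizer could be constrained to (illustrative format)."""
        return f"<{rule['name']}> ::= {rule['pattern']};"

    def to_matcher(rule: dict):
        """Emit a matcher that applies the augmentation to produce feature-value pairs."""
        regex = (rule["pattern"]
                 .replace("( message | messages )", "(?:message|messages)")
                 .replace("<person>", r"(?P<person>\w+)"))
        compiled = re.compile(regex)

        def match(utterance: str):
            m = compiled.fullmatch(utterance)
            return rule["augmentation"](m) if m else None

        return match

    if __name__ == "__main__":
        print(to_recognizer_grammar(RULE))
        # e.g. "read messages from eric" -> {'action': 'read_mail', 'sender': 'eric'}
        print(to_matcher(RULE)("read messages from eric"))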


Swiftus – the natural language processor

  • Semantic representation generated in real time to facilitate conversation

  • Accurate understanding

  • Tolerance of misrecognized words

  • Wide variation among applications

  • Ease of use

Paul Martin, Fredrick Crabbe, Stuart Adams, Eric Baatz, Nicole Yankelovich. SpeechActs: A Spoken Language Framework, IEEE Computer, Vol. 29, Number 7, July 1996.


Swiftus performance - Solved

Swiftus was designed to combine coarse keyword matching with full, in-depth semantic analysis

Paul Martin, Fredrick Crabbe, Stuart Adams, Eric Baatz, Nicole Yankelovich. SpeechActs: A Spoken Language Framework, IEEE Computer, Vol. 29, Number 7, July 1996.
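
The sketch below illustrates that combination with invented patterns and keywords (not Swiftus's actual grammar): a full pattern parse is attempted first, and coarse keyword spotting recovers feature-value pairs when a misrecognized word breaks the full parse.

    # Full parse first, coarse keyword fallback second (all patterns are made up).
    import re

    FULL_PATTERNS = [
        (re.compile(r"switch to (?P<app>calendar|mail)"),
         lambda m: {"action": "switch", "app": m.group("app")}),
    ]

    KEYWORDS = {
        "switch": {"action": "switch"},
        "calendar": {"app": "calendar"},
        "mail": {"app": "mail"},
    }

    def understand(utterance: str) -> dict[str, str]:
        """Return feature-value pairs: full parse if possible, keyword spotting otherwise."""
        for pattern, augment in FULL_PATTERNS:
            m = pattern.fullmatch(utterance)
            if m:
                return augment(m)
        # Coarse fallback: collect features from whatever keywords survived recognition.
        features: dict[str, str] = {}
        for word in utterance.split():
            features.update(KEYWORDS.get(word, {}))
        return features

    if __name__ == "__main__":
        print(understand("switch to calendar"))      # full parse succeeds
        print(understand("uh switch the calendar"))  # misrecognition: keywords still recovered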


Discourse management

  • To support more natural speech, we need at least rudimentary discourse management

  • Should support discourse-segment pushing and popping

  • Prompt design

  • Error-correcting mechanism

Paul Martin, Fredrick Crabbe, Stuart Adams, Eric Baatz, Nicole Yankelovich. SpeechActs: A Spoken Language Framework, IEEE Computer, Vol. 29, Number 7, July 1996.


Discourse manager

  • A discourse is represented as a data structure consisting of functions for handling user output

  • The discourse manager maintains a stack of these structures; the top one handles the default discourse for the current application or dialogue

  • The current application or dialogue is popped off the stack when the user cancels the activity or the problem is resolved

  • The manager also keeps a simple stack of referenced items to avoid entering into a subdialogue

Paul Martin, Fredrick Crabbe, Stuart Adams, Eric Baatz, Nicole Yankelovich. SpeechActs: A Spoken Language Framework, IEEE Computer, Vol. 29, Number 7, July 1996.
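
A minimal sketch of that stack behavior follows; the class, method, and field names are illustrative, not the actual SpeechActs data structures.

    # Illustrative discourse stack: the top discourse handles the current input,
    # and subdialogues are pushed and later popped when resolved or cancelled.
    from dataclasses import dataclass
    from typing import Callable

    @dataclass
    class Discourse:
        """One discourse segment: a name plus a handler for the user's next input."""
        name: str
        handle: Callable[[str], str]

    class DiscourseManager:
        def __init__(self) -> None:
            self.stack: list[Discourse] = []
            self.referenced: list[str] = []   # simple stack of recently referenced items

        def push(self, discourse: Discourse) -> None:
            """Enter a subdialogue (e.g. an error-correction exchange)."""
            self.stack.append(discourse)

        def pop(self) -> None:
            """The user cancelled the activity or the problem was resolved."""
            self.stack.pop()

        def handle(self, user_input: str) -> str:
            """The discourse on top of the stack handles the current input."""
            return self.stack[-1].handle(user_input)

    if __name__ == "__main__":
        dm = DiscourseManager()
        dm.push(Discourse("calendar", lambda text: f"calendar app handles: {text}"))
        dm.push(Discourse("confirm-delete", lambda text: f"confirmation dialogue handles: {text}"))
        print(dm.handle("yes"))       # the subdialogue on top answers
        dm.pop()                      # subdialogue resolved
        print(dm.handle("what is next"))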


To simulate human conversation…

  • Conversational pacing

  • Explicit error corrections

  • Defining the functional boundaries of an application

Paul Martin, Fredrick Crabbe, Stuart Adams, Eric Baatz, Nicole Yankelovich. SpeechActs: A Spoken Language Framework, IEEE Computer, Vol. 29, Number 7, July 1996.


Clinical applications

  • Medical transcription, mainly in radiology and pathology

  • Speech recognition was first used in radiology in 1981

  • Mean accuracy for dictated pathology reports using IBM ViaVoice Pro software: 93.6%, compared with 99.6% for human transcription

M. Al-Aynati, K. Chorneyko. Comparison of Voice-Automated Transcription and Human Transcription in General Pathology Reports. Arch Pathol Lab Med. 2003;127:721–725


Speech recognition in clinical dentistry?

  • 13% used voice recognition

  • 16% discontinued using voice recognition

  • 21% believed chairside computer use could be improved with better voice recognition

  • Using automatic speech recognition will be the way to go!

T. Schleyer et al. (unpublished data). Chairside Computer Use in Clinical Dentistry


Thank you! Questions or comments?

