Speech Recognition and its clinical applications

Thankam Thyvalikakath, MDS

Center for Biomedical Informatics

University of Pittsburgh

Outline
  • In-class assignment
  • Background
  • SpeechActs paper
  • Clinical application of speech recognition
  • Speech recognition in dentistry
Speech recognition?

Speech recognition technologies are of particular interest for their support of direct communication between humans and computers through a communication mode that humans commonly use among themselves and at which they are highly skilled.

Rudnicky, Hauptmann, and Lee

http://starbase.cs.trincoll.edu/~ram/cpsc352/

What was the first success story of speech recognition?

“Radio Rex”, in the 1920s, was the first success story in the field of speech recognition.

www.stanford.edu/class/linguist236/lec1.pdf

Timeline of Speech recognition
  • 1936 – AT&T’s Bell Labs started the study of speech recognition (later work in the field received DARPA funding)
  • 1974 – optical character recognition
  • 1975 – text-to-speech synthesis (Kurzweil Reading Machine)
  • 1978 – Speak & Spell toy released by Texas Instruments
  • 1980 – Xerox started producing the TextBridge reading machine
  • 1997 – Dragon Systems produced the first continuous speech recognition product

http://starbase.cs.trincoll.edu

How did speech recognition evolve?

  • Acoustic approach (pre-1960s)
  • Pattern recognition approach (1960s)
  • Linguistic approach (1970s)
  • Pragmatic approach (1980s)

Types of speech recognition
  • Isolated words
  • Connected words
  • Continuous speech
  • Spontaneous speech (automatic speech recognition)
  • Voice verification and identification

L. Rabiner & B. Juang. "Fundamentals of Speech Recognition". 1993

Speech recognition – uses and applications
  • Dictation
  • Command and control
  • Telephony
  • Medical/disabilities

L. Rabiner & B. Juang. "Fundamentals of Speech Recognition". 1993

Challenges of speech recognition
  • Ease of use
  • Robust performance
  • Automatic learning of new words and sounds
  • Grammar for spoken language
  • Control of synthesized voice quality
  • Integrated learning for speech recognition and synthesis

B. S. Atal. Speech recognition in 2001: New research directions. Proc. Natl. Acad. Sci. USA, Vol. 92, pp. 10046-10051, Oct 1995

SpeechActs

SpeechActs is a prototype testbed for developing spoken natural language applications

Paul Martin, Fredrick Crabbe, Stuart Adams, Eric Baatz, Nicole Yankelovich. SpeechActs: A Spoken Language Framework, IEEE Computer, Vol. 29, Number 7, July 1996.

Why develop SpeechActs?
  • Integrated conversational applications
  • No specialized language expertise
  • Technology independence

Paul Martin, Fredrick Crabbe, Stuart Adams, Eric Baatz, Nicole Yankelovich. SpeechActs: A Spoken Language Framework, IEEE Computer, Vol. 29, Number 7, July 1996.

Information flow in SpeechActs

Paul Martin, Fredrick Crabbe, Stuart Adams, Eric Baatz, Nicole Yankelovich. SpeechActs: A Spoken Language Framework, IEEE Computer, Vol. 29, Number 7, July 1996.

SpeechActs - Framework
  • Audio server presents raw digitized audio to speech recognizer
  • Swiftus parses the word list to produce a set of feature-value pairs
  • Discourse manager maintains a stack of information about the current conversation
  • Discourse manager and application respond to the user by sending a text string to the text-to-speech manager

Paul Martin, Fredrick Crabbe, Stuart Adams, Eric Baatz, Nicole Yankelovich. SpeechActs: A Spoken Language Framework, IEEE Computer, Vol. 29, Number 7, July 1996.
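
As a rough illustration of the flow on this slide, the following Python sketch chains the four stages together. Everything in it is hypothetical: the function names, the toy vocabulary, and the feature names are invented for illustration and are not taken from SpeechActs, and the recognizer and text-to-speech stages are stubs.

```python
from dataclasses import dataclass, field


def recognize(audio: bytes) -> list[str]:
    """Stub recognizer: pretends the audio bytes are already a transcript."""
    return audio.decode("utf-8").lower().split()


def parse(words: list[str]) -> dict[str, str]:
    """Toy stand-in for the parser: turn a word list into feature-value pairs."""
    features: dict[str, str] = {}
    if "read" in words:
        features["action"] = "read"
    if "mail" in words or "message" in words:
        features["object"] = "mail"
    return features


@dataclass
class DiscourseManager:
    """Keeps a stack of information about the current conversation."""
    stack: list[dict[str, str]] = field(default_factory=list)

    def respond(self, features: dict[str, str]) -> str:
        self.stack.append(features)
        if features.get("action") == "read" and features.get("object") == "mail":
            return "Reading your first message."
        return "Sorry, I did not understand."


def speak(text: str) -> str:
    """Stub text-to-speech stage: returns the string it would synthesize."""
    return f"[TTS] {text}"


def handle_utterance(audio: bytes, manager: DiscourseManager) -> str:
    """Audio -> word list -> feature-value pairs -> reply text -> speech."""
    words = recognize(audio)
    features = parse(words)
    reply = manager.respond(features)
    return speak(reply)
```

Calling `handle_utterance(b"Read my mail", DiscourseManager())` walks one utterance through all four stages; a real system would replace each stub with an actual component.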

SpeechActs: A Spoken Language Framework
  • Continuous-speech recognizers require grammars that specify every possible utterance a user could say to the application
  • The recognizer grammar should closely synchronize with the Swiftus semantic grammar
  • Solved by inventing the Unified Grammar

Paul Martin, Fredrick Crabbe, Stuart Adams, Eric Baatz, Nicole Yankelovich. SpeechActs: A Spoken Language Framework, IEEE Computer, Vol. 29, Number 7, July 1996.

Unified grammar
  • A collection of rules
  • Each rule is made of a pattern in a notation such as Backus-Naur Form, followed by augmentations, which are statements written in a Pascal-like form
  • A compiler produces a grammar specific to the speech recognizer and a corresponding Swiftus grammar

Paul Martin, Fredrick Crabbe, Stuart Adams, Eric Baatz, Nicole Yankelovich. SpeechActs: A Spoken Language Framework, IEEE Computer, Vol. 29, Number 7, July 1996.
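
The idea of compiling one rule set into two synchronized grammars can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual Unified Grammar syntax or compiler: the `Rule` class, the output formats, and the use of a plain dictionary in place of Pascal-like augmentation statements are all invented here.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    name: str                      # nonterminal defined by this rule
    pattern: str                   # BNF-like right-hand side
    augmentations: dict[str, str]  # stand-in for the Pascal-like statements


def compile_rules(rules: list[Rule]) -> tuple[str, dict[str, dict[str, str]]]:
    """Produce both outputs from one rule set so they cannot drift apart."""
    # Output 1: a grammar in a (hypothetical) recognizer-specific text format.
    recognizer_lines = [f"<{r.name}> ::= {r.pattern} ;" for r in rules]
    # Output 2: a table the semantic parser can use to map patterns to features.
    parser_table = {r.pattern: dict(r.augmentations) for r in rules}
    return "\n".join(recognizer_lines), parser_table


rules = [
    Rule("read_mail", "read my mail", {"action": "read", "object": "mail"}),
    Rule("next_msg", "next message", {"action": "next", "object": "message"}),
]
recognizer_grammar, parser_table = compile_rules(rules)
```

Because both outputs come from the same `rules` list, adding or changing a rule updates the recognizer grammar and the parser table together, which is the synchronization problem the slide describes.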

Swiftus – the natural language processor
  • Semantic representation generated in real time to facilitate conversation
  • Accurate understanding
  • Tolerance of misrecognized words
  • Wide variation among applications
  • Ease of use

Paul Martin, Fredrick Crabbe, Stuart Adams, Eric Baatz, Nicole Yankelovich. SpeechActs: A Spoken Language Framework, IEEE Computer, Vol. 29, Number 7, July 1996.

Swiftus performance - Solved

Swiftus was designed by using coarse keyword matching and full, in-depth semantic analysis

Paul Martin, Fredrick Crabbe, Stuart Adams, Eric Baatz, Nicole Yankelovich. SpeechActs: A Spoken Language Framework, IEEE Computer, Vol. 29, Number 7, July 1996.
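
One way to picture a parser that tolerates misrecognized words is a matcher that looks for known keywords in order and skips anything else between them. The sketch below shows only that idea; it is not Swiftus's algorithm, and the patterns, feature names, and function names are hypothetical.

```python
def match_in_order(pattern: list[str], words: list[str]) -> bool:
    """True if every pattern word occurs in `words` in order, gaps allowed."""
    remaining = iter(words)
    # `p in remaining` consumes the iterator up to and including the match,
    # so later pattern words must appear after earlier ones.
    return all(p in remaining for p in pattern)


# Hypothetical patterns mapped to feature-value pairs.
PATTERNS: list[tuple[list[str], dict[str, str]]] = [
    (["read", "mail"], {"action": "read", "object": "mail"}),
    (["delete", "message"], {"action": "delete", "object": "message"}),
]


def interpret(words: list[str]) -> dict[str, str]:
    """Return the features of the first pattern that matches, else an empty dict."""
    for pattern, features in PATTERNS:
        if match_in_order(pattern, words):
            return features
    return {}
```

With this matcher, an utterance recognized as "please read the male mail" still yields the read-mail features, because the stray words are skipped rather than causing a parse failure.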

Discourse management
  • To support more natural speech, we need at least rudimentary discourse management
  • Should support discourse-segment pushing and popping
  • Prompt design
  • Error-correcting mechanism

Paul Martin, Fredrick Crabbe, Stuart Adams, Eric Baatz, Nicole Yankelovich. SpeechActs: A Spoken Language Framework, IEEE Computer, Vol. 29, Number 7, July 1996.

Discourse manager
  • A discourse is represented as a data structure consisting of functions for handling user output
  • The manager maintains a stack of these structures, and the top one handles the default discourse for the current application or current dialogue
  • The current application or dialogue is popped off the stack when the user cancels the activity or the problem is resolved
  • It keeps a simple stack of referenced items to avoid entering into a subdialogue

Paul Martin, Fredrick Crabbe, Stuart Adams, Eric Baatz, Nicole Yankelovich. SpeechActs: A Spoken Language Framework, IEEE Computer, Vol. 29, Number 7, July 1996.
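
The push-and-pop behaviour described above can be sketched with a small stack of handler functions. This is an illustrative sketch only: the class, the handler signature, the "cancel" keyword, and the convention that a handler returns `None` when its dialogue is resolved are assumptions made here, not details from the SpeechActs paper.

```python
from typing import Callable, Optional

# A handler takes the user's utterance and returns a reply,
# or None (hypothetical convention) when its dialogue is resolved.
Handler = Callable[[str], Optional[str]]


class DiscourseStack:
    """Stack of discourse handlers; the top one sees the utterance first."""

    def __init__(self, default: Handler) -> None:
        self._stack: list[Handler] = [default]

    def push(self, handler: Handler) -> None:
        self._stack.append(handler)

    def pop(self) -> None:
        # Never pop the default discourse at the bottom of the stack.
        if len(self._stack) > 1:
            self._stack.pop()

    def handle(self, utterance: str) -> str:
        if utterance == "cancel":  # user cancels the current activity
            self.pop()
            return "Cancelled."
        reply = self._stack[-1](utterance)
        if reply is None:  # the dialogue on top has been resolved
            self.pop()
            return "Done."
        return reply


def mail_app(utterance: str) -> Optional[str]:
    """Default discourse for a hypothetical mail application."""
    return f"Mail heard: {utterance}"


def confirm_delete(utterance: str) -> Optional[str]:
    """Subdialogue that is resolved once the user answers yes or no."""
    return None if utterance in ("yes", "no") else "Please say yes or no."
```

Pushing `confirm_delete` on top of `mail_app` routes the next utterances to the confirmation dialogue; once it is answered or cancelled, it is popped and the mail application handles input again.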

To simulate human conversation…
  • Conversational pacing
  • Explicit error corrections
  • Define the functional boundaries of an application

Paul Martin, Fredrick Crabbe, Stuart Adams, Eric Baatz, Nicole Yankelovich. SpeechActs: A Spoken Language Framework, IEEE Computer, Vol. 29, Number 7, July 1996.

Clinical applications
  • Medical transcription mainly in radiology and pathology
  • First use of speech recognition in the field of radiology in 1981
  • Mean accuracy rate for pathology reports using IBM ViaVoice Pro software: 93.6%, compared with 99.6% for human transcription

M. Al-Aynati, K. Chorneyko. Comparison of Voice-automated Transcription and Human Transcription in Generating Pathology Reports. Arch Pathol Lab Med. 2003;127:721–725
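
Accuracy figures of this kind are usually obtained by comparing a transcript with a reference text word by word. The sketch below computes word accuracy as one minus the word error rate, using edit distance over words. That metric is a common convention and an assumption here, since the slide does not say how the study measured accuracy; the function names and sample sentences are illustrative.

```python
def word_edit_distance(reference: list[str], hypothesis: list[str]) -> int:
    """Minimum number of word substitutions, insertions, and deletions."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        current = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            current.append(min(
                previous[j] + 1,         # reference word missing from hypothesis
                current[j - 1] + 1,      # extra word in hypothesis
                previous[j - 1] + cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


def word_accuracy(reference: str, hypothesis: str) -> float:
    """Word accuracy = 1 - (edit distance / number of reference words)."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    return 1.0 - word_edit_distance(ref, hyp) / len(ref)
```

For a five-word reference with one substituted word, `word_accuracy` gives 0.8; applied to full reports and averaged, the same calculation produces a mean accuracy rate.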

Speech recognition in clinical dentistry?
  • 13% used voice recognition
  • 16% discontinued using voice recognition
  • 21% believed chairside computer use could be improved with better voice recognition
  • Using automatic speech recognition will be the way to go!

T. Schleyer et al. (unpublished data). Chairside Computer Use in Clinical Dentistry