
To structure baseball live games as well as to improve the speech recognition accuracy

Situation Based Speech Recognition for Structuring Baseball Live Games. Atsushi SAKO, Tetsuya TAKIGUCHI and Yasuo ARIKI.




Presentation Transcript


  1. Situation Based Speech Recognition for Structuring Baseball Live Games
Atsushi SAKO, Tetsuya TAKIGUCHI and Yasuo ARIKI
Department of Computer and Systems Engineering, Kobe University

Abstract
Baseball live speech is difficult to recognize because it is rather fast, noisy, emotional and disfluent, with rephrasing, repetition, mistakes and grammatical deviation caused by the spontaneous speaking style. To solve these problems, we have been studying a speech recognition method that incorporates baseball game task-dependent knowledge as well as the announcer's emotion in commentary speech. In addition, in this paper, we propose a situation prediction model based on word co-occurrence. Owing to these proposed models, speech recognition errors are effectively prevented. The method is formalized in the framework of probability theory and implemented in the conventional speech decoding (Viterbi) algorithm. The experimental results showed that the proposed approach improved the structuring and segmentation accuracy as well as the keyword accuracy.

Research Purpose
• To structure baseball live games as well as to improve the speech recognition accuracy
• Using baseball dependent knowledge

Problems of Conventional Method
• An example of recognition error
[Figure: recognition error example. Labels: "Strikeout!", P=High, P=Low, counts 1B 2S and 1B 1S]

Proposed Method
• Estimate word and situation concurrently
[Figure: sequences O, W and S. Word labels: "… Foul ball … Pitch … And Strikeout! …", "Next Batter". Count labels: 3B 1S, 3B 2S, …, 0B 0S]

Formalization
• O : Sequence of observed feature vectors
• W : Sequence of words
• S : Sequence of situations
• Following simplification
  • A situation depends only on the previous situation and a word co-occurrence.
  • A word depends only on the present situation and the previous word.

Models
[Figure: model hierarchy with levels signal, phoneme, sentence and situation. Labels: Acoustic Model, Language Model (Bi-gram), Situation Dependent Acoustic Model, Situation Dependent Language Model (Bi-gram), Situation Prediction Model]

Learning Stochastic Models
• Situation Prediction Model
  • Using word co-occurrence (not BOW)
• Situation Dependent Language Model (Bi-gram)
  • Learned from training data
• Situation Dependent Acoustic Model
  • 2 models: normal emotion and excited emotion
  • Adaptation by MLLR+MAP

Experiments

Experimental Conditions
• Test set: A commentary speech on radio (7th Sep. 2003)
• Learning corpus
  • HMM: 200 hours (baseline) + 3 hours (adaptation)
  • Language model: 570K morphemes

Experimental Results
• An example of recognition result
[Figure: log likelihood over time for the Conventional Method and the Proposed Method. Word labels: "Pitch", "Strike", "Straight Ball", "Foul ball", "Four ball", "Pitch and", "Strikeout!", "Next Batter". Count labels: 1B 1S, 2B 1S, 3B 1S, 1B 2S, 2B 2S, 3B 2S, 0B 0S]

Conclusion
• We proposed Situation Based Speech Recognition.
• Counts (balls and strikes) were used as the situation.
• It worked well under obvious situations.
• 2.3% improvement of keyword accuracy.
• 6.1% improvement of structuring correct rate.
• 75.0% correct rate of exciting scene detection.

Future Work / Prospect
• Make the method work well under ambiguous situations.
• More detailed description of a situation, including events.
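
The two simplifications under Formalization suggest a Viterbi search over joint (word, situation) states. The block below is only a minimal sketch of that idea, not the authors' implementation. Everything in it is an illustrative assumption: the three-word vocabulary, all probability values, the deterministic count update in `next_situation`, the use of only the single previous word in place of a word co-occurrence, a toy language model that ignores the previous word, and per-segment acoustic log scores in place of frame-level HMM likelihoods.

```python
import math

# Hypothetical toy vocabulary and situation space: (balls, strikes) counts.
WORDS = ["ball", "strike", "strikeout"]
SITUATIONS = [(b, s) for b in range(4) for s in range(3)]

def next_situation(sit, word):
    """Assumed count update: the count the previous word most plausibly leads to."""
    balls, strikes = sit
    if word == "ball" and balls < 3:
        return (balls + 1, strikes)
    if word == "strike" and strikes < 2:
        return (balls, strikes + 1)
    return (0, 0)  # walk or strikeout: next batter starts at 0B 0S

def log_p_situation(sit, prev_sit, prev_word):
    """Toy situation prediction model: log P(s_t | s_{t-1}, w_{t-1})."""
    expected = next_situation(prev_sit, prev_word)
    p = 0.9 if sit == expected else 0.1 / (len(SITUATIONS) - 1)
    return math.log(p)

def log_p_word(word, sit, prev_word):
    """Toy situation dependent language model: log P(w_t | s_t, w_{t-1}).
    For brevity this toy table ignores prev_word."""
    if sit[1] == 2:  # two strikes: a strikeout is plausible
        table = {"ball": 0.30, "strike": 0.05, "strikeout": 0.65}
    else:            # fewer than two strikes: a strikeout is very unlikely
        table = {"ball": 0.50, "strike": 0.49, "strikeout": 0.01}
    return math.log(table[word])

def acoustic_only(scores):
    """Baseline: pick the acoustically best word per segment, ignoring situations."""
    return [max(frame, key=frame.get) for frame in scores]

def viterbi(scores, start=(0, 0)):
    """Jointly decode words and situations; returns a list of (word, situation)."""
    # delta maps a joint state (word, situation) to its best path log score.
    delta = {(w, start): scores[0][w] + log_p_word(w, start, None) for w in WORDS}
    back = []
    for frame in scores[1:]:
        new_delta, ptr = {}, {}
        for (w_prev, s_prev), score in delta.items():
            for s in SITUATIONS:
                trans = score + log_p_situation(s, s_prev, w_prev)
                for w in WORDS:
                    cand = trans + log_p_word(w, s, w_prev) + frame[w]
                    if cand > new_delta.get((w, s), -math.inf):
                        new_delta[(w, s)] = cand
                        ptr[(w, s)] = (w_prev, s_prev)
        back.append(ptr)
        delta = new_delta
    # Backtrace from the best final joint state.
    state = max(delta, key=delta.get)
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path

# Made-up acoustic log scores in which "strike" and "strikeout" are confusable.
SCORES = [
    {"ball": -6.0, "strike": -1.0, "strikeout": -0.8},
    {"ball": -6.0, "strike": -1.0, "strikeout": -0.8},
    {"ball": -6.0, "strike": -0.8, "strikeout": -1.0},
]

if __name__ == "__main__":
    print("acoustic only:", acoustic_only(SCORES))
    print("joint decode :", viterbi(SCORES))
```

On this made-up input the acoustic-only baseline outputs "strikeout" at a 0B 0S count, while the joint search prefers "strike", "strike", "strikeout" because the situation terms penalize a strikeout before two strikes. This mirrors, in miniature, the kind of recognition error the poster attributes to the conventional method.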
