REPORT FROM MALAYSIA O’COCOSDA 2010

Presentation Transcript


  1. REPORT FROM MALAYSIA O’COCOSDA 2010. Zuraidah Mohd Don

  2. Speech Corpus: the Universiti Sains Malaysia (USM) Research Groups
     • MASS: a Malay-language LVCSR corpus resource (a collaboration between USM, MMU and NTU, Singapore).
     • Developing speech, text and pronunciation dictionary resources for building a large-vocabulary speech recognizer for Malay.
     • Speech corpus: 70 hours of read-aloud speech (speaker independent/dependent and accent independent/dependent) from 90 L1 speakers, plus 10 hours of broadcast news from local TV stations.
     • Written corpus: 700 Mbytes of data extracted from Malaysia's local news web pages, 1998-2008.
     • The aim: to develop a rule-based G2P tool to generate a pronunciation dictionary.
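A rule-based G2P tool of the kind named on this slide can be illustrated with a toy sketch. This is not the MASS project's tool: the rule tables and the function name `g2p` are assumptions made for illustration, covering only a few well-known Malay spelling conventions (for example "ng", "ny", "sy", "c", "j"). A real system would also need to resolve the two pronunciations of "e" and handle loanwords.

```python
# Toy rule-based grapheme-to-phoneme converter for Malay.
# Illustrative only: the rule tables below are a small, hypothetical subset.

# Two-letter spellings that map to a single phoneme (IPA).
DIGRAPHS = {"ng": "ŋ", "ny": "ɲ", "sy": "ʃ", "kh": "x"}

# Single letters whose phoneme symbol differs from the letter itself.
SINGLES = {"c": "tʃ", "j": "dʒ", "y": "j"}


def g2p(word):
    """Convert a word to a list of phoneme symbols, trying digraphs first."""
    word = word.lower()
    phones = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        if pair in DIGRAPHS:
            # Longest match wins: consume both letters of the digraph.
            phones.append(DIGRAPHS[pair])
            i += 2
        else:
            ch = word[i]
            # Letters without a special rule map to themselves.
            phones.append(SINGLES.get(ch, ch))
            i += 1
    return phones


if __name__ == "__main__":
    for w in ["bunga", "nyanyi", "cuci"]:
        print(w, " ".join(g2p(w)))
```

Running each word of a word list through such a function is one simple way to produce draft pronunciation dictionary entries, which would then be checked by hand.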

  3. Speech synthesis and recognition: The University of Malaya Research Groups
     • Continuous speech recognition for Arabic based on the HTK toolkit
     • Continuous speech recognition for Malay based on the HTK toolkit
     • Arabic LVCSR using a large speech corpus for Automatic Speech Recognition
     • Malay neutral speech database for developing HMM-based Malay TTS
     • Malay Emotional Speech Corpus for Emotional Speech Synthesis
     • Grapheme-to-phoneme converter for Standard Malay
     • Standard Malay phonetic dictionary
     • Acoustic Model for Malay Automatic Speech Recognition (ASR)
     • Language Model for Malay ASR
     • HMM-based emotional speech synthesis for the Malay language

  4. Other applications: Speaker verification
     • Speaker Verification using Vector Quantization and Hidden Markov Model
     • The aim is to improve the performance of HMMs in a speaker verification system.
     • The work investigates text-dependent speaker verification using an approach that combines VQ and HMM.
     • The proposed technique is evaluated on a Malay spoken-digit database of 100 speakers, recorded in a noise-free environment.
     • The results are compared with a standalone HMM.
     • Universiti Kebangsaan Malaysia (UKM) research group, Department of Electrical, Electronic & System Engineering
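The VQ half of such a system can be sketched as follows. This is a minimal illustration of vector quantization for speaker verification only, not the UKM group's method: it omits the HMM stage entirely, and the function names, codebook size and threshold are all assumptions. A codebook is trained per speaker with k-means, and a claimed identity is accepted when the test frames lie close to that speaker's codebook.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def train_codebook(frames, k=4, iters=10, seed=0):
    """Train a k-entry VQ codebook with plain k-means (hypothetical settings)."""
    rng = random.Random(seed)
    codebook = [list(f) for f in rng.sample(frames, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for f in frames:
            # Assign each frame to its nearest codeword.
            j = min(range(k), key=lambda c: dist(f, codebook[c]))
            clusters[j].append(f)
        for j, members in enumerate(clusters):
            if members:
                # Move the codeword to the mean of its assigned frames.
                codebook[j] = [sum(col) / len(members) for col in zip(*members)]
    return codebook


def avg_distortion(frames, codebook):
    """Mean distance from each frame to its nearest codeword."""
    return sum(min(dist(f, c) for c in codebook) for f in frames) / len(frames)


def accept(frames, codebook, threshold):
    """Accept the claimed speaker if the distortion is below the threshold."""
    return avg_distortion(frames, codebook) <= threshold


if __name__ == "__main__":
    rng = random.Random(1)
    # Synthetic stand-ins for acoustic feature frames of two speakers.
    speaker_a = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(100)]
    speaker_b = [[rng.gauss(5, 1) for _ in range(4)] for _ in range(100)]
    cb = train_codebook(speaker_a)
    print(accept(speaker_a, cb, 4.0), accept(speaker_b, cb, 4.0))
```

In a combined design, the codebook could also be used to turn each frame into a discrete symbol for an HMM; how the cited work combines the two is not stated on the slide.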

  5. Other applications: ESL context
     • Heuristics and Rule-Based Approach for Automated Marking Tool for ESL Writing
     • Developing an automated marking tool for ESL writing, introducing heuristics and a rule-based approach to detect grammatical errors in tenses in ESL essays.
     • The results show that heuristics and a rule-based approach are useful and can improve the effectiveness of automated essay-marking tools for ESL writing.
     • UKM research group: Nur Asma Mohd Razali, Nazlia Omar, Saadiyah Darus

  6. Other applications: database design
     • Automation of database design through semantic analysis
     • Using syntactic and semantic heuristics to create a database design, expressed in the Entity-Relationship (ER) model, through natural language processing.
     • The research shows that semantic heuristics may further improve the automatic detection of ER elements.
