
Roadmap for Language Resources and Evaluation in a Multilingual Environment

Minority Languages in the African Context. Justus Roux, Centre for Language and Speech Technology (SU-CLaST), Stellenbosch University, South Africa. jcr@sun.ac.za



  1. Roadmap for Language Resources and Evaluation in a Multilingual Environment Minority Languages in the African Context Justus Roux Centre for Language and Speech Technology (SU-CLaST) Stellenbosch University, South Africa jcr@sun.ac.za

  2. Aim • Overview of • proceedings of the LREC2006 workshop on Networking the development of African Languages • resolutions taken at the meeting • Remarks on future development and co-operation

  3. Background to the LREC workshop • African Language Association of Southern Africa Special Interest Group for Language and Speech Technology (ALASA-SIG) • Special Track on HLT at ALASA International Conference in Johannesburg in 2005 • National and international participants • Proceedings to appear in SA Journal of African Languages • Decision to interact with the international community via LREC2006

  4. Why? • UNESCO Year of African Languages (2006) • Challenges in bridging the digital divide concerning African languages (connecting Africa) • R&D activities in relative isolation • Perceived need to develop resources and capacity for HLT R&D in African languages • Similar activities in NEMLAR project – Language Technology for Arabic

  5. AIMS of Workshop • Develop an academic network for sharing ideas • Promote co-operation in the development of resources and tools (BLARKs for African languages) • Facilitate capacity building related to African languages in the context of HLT

  6. Programme • Area surveys • West Africa • East Africa • Central Africa • Southern Africa • Projects per area • Larger projects and infrastructures • Discussion on networking possibilities

  7. West Africa • Language Documentation paradigm: specific role of the University of Bielefeld • Doctoral students at various European universities • ALT-I: African Language Technology Institute in Ibadan • Local Language Speech Technology Initiative (speech synthesis for Ibibio) • Initiatives in development of morphological parsers (Cologne) • West African Linguistics Society

  8. East Africa • Text corpora on Swahili across Europe • University of Helsinki • Tools: Open Swahili Localisation Project (OSLP) – spelling checker for Swahili • Tagging tools • Localisation of Microsoft Windows XP into Swahili • Morphological analysers • SALAMA: Machine Translation • Centre for Science and New Technologies & CNRS (Avignon) • Speech mining in Somali • University of Nairobi & University of Antwerp • Annotated corpora in Gikuyu and applied machine learning

  9. Southern Africa • Extremely wide range of activities in South Africa, primarily by locals (see proceedings) • University of South Africa • Morphological analysers for five African languages • Development of machine-readable lexicons • University of Pretoria • Text corpora and spelling checkers • Machine-aided Translation / Localisation • Stellenbosch University Centre for Language and Speech Technology • ASR, TTS and Natural Language Understanding in five languages

  10. Southern Africa (Continued) • University of North West - Centre for Text Technology • Localisation, spelling checkers • University of Limpopo & Cape Town • Speech Synthesis • Meraka Institute (Pretoria) • Open source software for language and speech technology applications • University of the Free State & Province of Flanders • Interpreting services, data warehousing

  11. Southern Africa (Continued) • Standardisation: • ISO/TC 37 mirror committee (StanSA TC37) • Terminology training workshops with TermNet • Workshop on text annotation (Sept 2006) • ISO meetings: Oslo (2004), Warsaw (2005), Beijing (2006) • AFRILEX: • International conferences and workshops • National Language Service: • National Lexicography Units • National HLT Resource Centre

  12. Larger Projects • The African Anaphora Project (Rutgers, USA) • Building an Infrastructure for Collaborative Development (Taiwan)

  13. Decisions taken • To consolidate an inventory on tools, resources etc. available in Africa by using the on-line ELRA BLARK website • To set up a dedicated website (Wiki) to facilitate networking • The current Organising Committee will be responsible for the activities above as well as for fundraising for training workshops in Africa • To organise a similar workshop at LREC2008

  14. Concluding impressions • European countries are playing an active role in the field in West and East Africa – to be welcomed • International organisations are becoming increasingly involved in Africa: • ISCA International Affairs Committee for Africa • ISO • ELRA?? • International co-operation in EU projects (FP7)?
