
Multilingual Interfaces for Biodiversity Information

Guy Baillargeon, Agriculture and Agri-Food Canada, Ottawa. Key Innovations in Biodiversity Informatics, Indaiatuba, SP, Brazil, 21-22 Oct. 2002.


Presentation Transcript


  1. Multilingual Interfaces for Biodiversity Information • Guy Baillargeon, Agriculture and Agri-Food Canada, Ottawa • Key Innovations in Biodiversity Informatics, Indaiatuba, SP, Brazil, 21-22 Oct. 2002

  2. Abstract • The Global Biodiversity Information Facility (GBIF) intends to promote standards and software tools designed to facilitate their adaptation into multiple languages. Countries, economies and organizations participating in GBIF are invited to develop novel user interface designs that incorporate features to support their functionality in a multilingual global context, and to develop standards and protocols for indexing, validation, documentation and quality control in multiple human languages, character sets and computer encodings. Ensuring that GBIF applications perform well in any language, and that biodiversity data can be put to good use independently of the language of the primary records, is a formidable but achievable challenge. It is very much a requirement if we want GBIF to fulfill its potential. How some of this can be done will be demonstrated by outlining the steps required to add a new language to the Integrated Taxonomic Information System (ITIS) and to the Biological Observations, Specimens and Collections (BiOSC) Gateway. A new Portuguese version developed in cooperation with Brazil will be shown for the first time.

  3. Think globally • 92% of the world population speaks little or no English • 20 main Asian languages • 15 main European languages Source: http://www.alis.com/pdf/GlobalisationEN.pdf

  4. An enormous diversity Source: http://www.ethnologue.com/ethno_docs/distribution.asp

  5. Human languages hit parade Source: http://www.krysstal.com/spoken.html combined with http://www.multilingualplanet.com/most_spoken_languages.htm

  6. World online population • As of Sept. 2002 • Total: 619 million Source: http://www.glreach.com/globstats/

  7. Non-English growing faster Source: http://global-reach.biz/globstats/evol.htm

  8. Multiple languages • English-speaking Internet users • 2000: 58% • 2005: < 35% • Non-English online traffic • 2000: 40% • 2005: 70% Source: http://www.alis.com/pdf/GlobalisationEN.pdf

  9. GBIF MOU Goals • 2. “It is the intention of the Participants that GBIF: • […] • (d) promote standards and software tools designed to facilitate their adaptation into multiple languages, character sets and computer encodings”. • […] Source: http://www.gbif.net/moufinal.doc

  10. GBIF MOU Scope of activities • 4 (a) (iii) “Developing suitable tools and standards for accessing, linking and analysing new and existing databases, including standards and protocols for indexing, validation, documentation and quality control in multiple human languages, character sets and computer encodings;” Source: http://www.gbif.net/moufinal.doc

  11. ITIS North America • A joint project by • United States • Canada • Mexico • sis.agr.gc.ca/itis

  12. Canadian context • Requirement for a bilingual version of ITIS in Canada • Not changing the underlying data model • Wanted reusable components • Capability to handle other languages as well • [Diagram labels: Client Browser, ITIS, Translation module, Other multilingual applic.]

  13. Introducing ITIS*Brazil • A new version of ITIS in Portuguese • SIIT*Brasil - Versão em português • Developed in cooperation with CRIA • August - Sept. 2002 • www.itis.cria.org.br

  14. ITIS/BiOSC data flow diagram • 1: Query ITIS • 2: Click Map it! button • 3: Get record index data from BiOSC • 4: Get full record from data owner • [Diagram labels: Client Browser, ITIS, Translation module, Other multilingual applic., BiOSC Gateway, DIGIR, REMIB, TSA, ENHSIN, AVH, DB (repeated), WMS map server, WMS layer (twice), DEMIS map]

  15. How is it done • Selective translation • Semantic partitioning • Automated rendering • Localisation • Cultural conventions (date format, decimal separators, number format) • Alternate spellings
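
The "cultural conventions" bullet can be illustrated with a minimal Python sketch (the original application used Oracle stored procedures, so this is not the authors' code). The CONVENTIONS table, its separator and date-order values, and the function names are all assumptions chosen for illustration; none of them come from the slides.

```python
from datetime import date

# Hypothetical per-language conventions. The values are common defaults
# used for illustration only; they are not taken from the presentation.
CONVENTIONS = {
    "en": {"date": "{m:02d}/{d:02d}/{y}", "decimal": ".", "thousands": ","},
    "fr": {"date": "{d:02d}/{m:02d}/{y}", "decimal": ",", "thousands": " "},
    "pt": {"date": "{d:02d}/{m:02d}/{y}", "decimal": ",", "thousands": "."},
}


def format_date(value, lang):
    """Render a date using the day/month order assumed for the language."""
    pattern = CONVENTIONS[lang]["date"]
    return pattern.format(d=value.day, m=value.month, y=value.year)


def format_number(value, lang, places=2):
    """Render a number with the separators assumed for the language."""
    conv = CONVENTIONS[lang]
    text = f"{value:,.{places}f}"  # always produces the 1,234.50 style first
    # Swap separators through a placeholder so they do not overwrite each other.
    text = text.replace(",", "\0").replace(".", conv["decimal"])
    return text.replace("\0", conv["thousands"])


if __name__ == "__main__":
    for lang in ("en", "fr", "pt"):
        print(lang, format_date(date(2002, 10, 21), lang), format_number(1234.5, lang))
```

Keeping the conventions in a table keyed by language, rather than in the rendering code, is one way to add a language without touching the formatting functions.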

  16. Architecture • Single multilingual application server • Each stored procedure handles all languages • Locale sensitive • Single character set on a single encoding • Linguistic sorting

  17. General issues • Treat look and feel independently from language issues • Determine user preference • Handle non-ASCII form input and query strings • Enable procedure for content translation • Tag HTML output with encoding information
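
For the last bullet, tagging HTML output with encoding information, a small Python sketch follows. It assumes UTF-8 as the single encoding and uses a hypothetical render_page helper; the slides do not say which encoding or which mechanism (HTTP header, meta element, or both) the application actually used.

```python
def render_page(body_html, lang="en", charset="utf-8"):
    """Return (headers, payload) with the encoding declared in both places.

    Hypothetical helper: the header dictionary and the page skeleton are
    assumptions for illustration, not the application's real output.
    """
    headers = {
        "Content-Type": f"text/html; charset={charset}",
        "Content-Language": lang,
    }
    html = (
        "<!DOCTYPE html>\n"
        f'<html lang="{lang}">\n'
        "<head>\n"
        f'<meta http-equiv="Content-Type" content="text/html; charset={charset}">\n'
        "</head>\n"
        f"<body>{body_html}</body>\n"
        "</html>\n"
    )
    # Encode with the same charset that is declared, so the tag and bytes agree.
    return headers, html.encode(charset)


if __name__ == "__main__":
    headers, payload = render_page("<p>Posição atual: aceito</p>", lang="pt")
    print(headers["Content-Type"])
    print(len(payload), "bytes")
```

The point of the sketch is only that the declared charset and the bytes actually sent must match; a browser that guesses the wrong encoding will garble accented characters.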

  18. Look and feel • Stored as blocks of static HTML components • Header • Footer • Background • Images, buttons, logos

  19. User preference • Language and locale defined via passable parameter • User selectable
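
A minimal Python sketch of reading the language from a passable parameter is given below. The parameter name p_lang is borrowed from the variable in the deck's own code snippet, but its use as a URL parameter, the list of supported codes, and the fallback to the browser's Accept-Language header are assumptions.

```python
from urllib.parse import parse_qs

SUPPORTED = ("en", "fr", "es", "pt")  # assumed set of interface languages
DEFAULT = "en"                        # assumed default


def pick_language(query_string, accept_language=""):
    """Choose the interface language for one request.

    An explicit p_lang parameter wins (the user-selectable case). Otherwise
    the first supported entry of Accept-Language is used; quality values
    (q=...) are ignored in this simplified sketch.
    """
    params = parse_qs(query_string)
    requested = params.get("p_lang", [""])[0].lower()
    if requested in SUPPORTED:
        return requested
    for item in accept_language.split(","):
        code = item.split(";")[0].strip().lower()[:2]
        if code in SUPPORTED:
            return code
    return DEFAULT


if __name__ == "__main__":
    print(pick_language("p_lang=pt&tsn=1"))
    print(pick_language("tsn=1", "fr-CA,fr;q=0.9,en;q=0.8"))
```

Carrying the language in the URL keeps every page linkable in a given language, which fits a stateless application server.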

  20. Query string handling • URLs can only be encoded in 7-bit ASCII • Non-ASCII bytes are transformed into their hexadecimal representation prefixed by a percent sign • Requires decoding by the application • German word “Schloß” converted to • Schlo%c3%9f (Unicode, UTF-8) • Schlo%DF (Latin-1, ISO-8859-1)
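
The two encodings of "Schloß" can be reproduced with Python's standard urllib.parse module. The decode_query_value function below is a hypothetical helper, and its strategy (try UTF-8 first, then fall back to ISO-8859-1) is an assumption about how an application might cope with browsers sending either form; the slides only state that decoding is required.

```python
from urllib.parse import quote, unquote_to_bytes


def decode_query_value(raw):
    """Percent-decode a query value that may be UTF-8 or ISO-8859-1.

    Heuristic sketch: strict UTF-8 is tried first because Latin-1 bytes above
    0x7F are rarely valid UTF-8; anything that fails is read as ISO-8859-1.
    The '+' used for spaces in form submissions is not handled here.
    """
    data = unquote_to_bytes(raw)
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        return data.decode("iso-8859-1")


if __name__ == "__main__":
    word = "Schloß"
    print(quote(word, encoding="utf-8"))       # UTF-8 form of the word
    print(quote(word, encoding="iso-8859-1"))  # Latin-1 form of the word
    print(decode_query_value("Schlo%c3%9f"))
    print(decode_query_value("Schlo%DF"))
```

Note that quote emits uppercase hexadecimal digits, while the slide shows the UTF-8 form in lowercase; percent-encoded digits are case-insensitive, so both decode identically.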

  21. Base letter conversion • Convert accented characters to unaccented ones for easier querying • éèêë to e • òóôõö to o • àáâãäå to a • ùúûü to u • ç to c • ñ to n • Output in correct (accented) form
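
One way to get the same effect in Python is Unicode decomposition: split each accented character into a base letter plus combining marks, then drop the marks. This is a sketch of the idea, not the technique the original stored procedures used (the slides do not say), and the search helper is hypothetical.

```python
import unicodedata


def base_letters(text):
    """Strip accents by decomposing (NFD) and removing combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def search(term, names):
    """Match on base letters, but return names in their original accented form."""
    wanted = base_letters(term).casefold()
    return [name for name in names if wanted in base_letters(name).casefold()]


if __name__ == "__main__":
    print(base_letters("éèêë òóôõö àáâãäå ùúûü ç ñ"))
    print(search("posicao", ["Posição atual", "Estado actual"]))
```

Letters that have no decomposition, such as ø or ß, pass through unchanged, so they would need an explicit mapping.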

  22. Alternate spellings • German, Danish and Swedish • ä to ae • ö to oe • ü to ue • å to aa • ø to oe • æ to ae • Output in standard format
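
The mapping on this slide translates directly into a lookup table. The Python sketch below uses exactly the six pairs listed; the function names and the case handling are assumptions, and input is assumed to use precomposed characters (NFC).

```python
# The six substitutions listed on the slide.
ALTERNATES = {"ä": "ae", "ö": "oe", "ü": "ue", "å": "aa", "ø": "oe", "æ": "ae"}


def expand_spelling(text):
    """Replace each special letter with its two-letter alternate spelling."""
    out = []
    for ch in text:
        replacement = ALTERNATES.get(ch.lower())
        if replacement is None:
            out.append(ch)
        elif ch.isupper():
            out.append(replacement.capitalize())  # Å -> Aa, Ø -> Oe
        else:
            out.append(replacement)
    return "".join(out)


def same_word(a, b):
    """Compare two spellings after expanding both to the alternate form."""
    return expand_spelling(a).casefold() == expand_spelling(b).casefold()


if __name__ == "__main__":
    print(expand_spelling("Müller"), expand_spelling("Århus"))
    print(same_word("Mueller", "Müller"))
```

Expanding both the query and the stored value to the same form lets either spelling find the record, while the stored value is still displayed in its standard format.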

  23. Translation table • All translation strings are externalized to a database table • String_id, Language_id, Translation • Primary key on String_id and Language_id • Translations are retrieved via SQL
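
The table layout described on this slide can be sketched with Python's built-in sqlite3 module standing in for the original Oracle database. The column names follow the slide; the table name, the use of two-letter codes for Language_id, the column types, and the fallback to English are assumptions. The sample rows reuse the strings shown for string 177 on the code snippet slide.

```python
import sqlite3


def make_translation_db():
    """Create an in-memory table with a composite primary key, as on the slide."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE translation (
            string_id   INTEGER NOT NULL,
            language_id TEXT    NOT NULL,
            translation TEXT    NOT NULL,
            PRIMARY KEY (string_id, language_id)
        )
        """
    )
    conn.executemany(
        "INSERT INTO translation VALUES (?, ?, ?)",
        [
            (177, "en", "Current Standing"),
            (177, "fr", "Statut"),
            (177, "es", "Estado actual"),
            (177, "pt", "Posição atual"),
        ],
    )
    return conn


def lookup(conn, string_id, language_id, fallback="en"):
    """Fetch one string via SQL; fall back to an assumed default language."""
    row = conn.execute(
        "SELECT translation FROM translation "
        "WHERE string_id = ? AND language_id = ?",
        (string_id, language_id),
    ).fetchone()
    if row is None and language_id != fallback:
        return lookup(conn, string_id, fallback)
    return row[0] if row else None


if __name__ == "__main__":
    db = make_translation_db()
    print(lookup(db, 177, "pt"))
```

With this layout, adding a language amounts to inserting one new row per String_id, and the composite key prevents two translations of the same string in the same language.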

  24. Code snippet • htp.prn(ctislib.multitext(177, p_lang) || ': ' || ctislib.multidata(p_lang, 90, v_info_cursor.currency_rating)); • en: Current Standing: accepted • fr: Statut: accepté • es: Estado actual: aceptado • pt: Posição atual: aceito • The multitext function accepts (string id number, language parameter) to translate application text • The multidata function accepts (language parameter, table id number, text to be translated) to translate data
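
To make the division of labour between the two functions concrete, here is a Python analogue of the PL/SQL line. Only the function names, the argument order, the ids 177 and 90, and the four output strings come from the slide; the dictionaries, the fallback behaviour, and the standing_line wrapper are assumptions, since the internals of ctislib are not shown.

```python
# Application text, keyed by (string id, language). Values are from the slide.
APP_TEXT = {
    (177, "en"): "Current Standing",
    (177, "fr"): "Statut",
    (177, "es"): "Estado actual",
    (177, "pt"): "Posição atual",
}

# Data values, keyed by (table id, original value, language). Values are from the slide.
DATA_TEXT = {
    (90, "accepted", "fr"): "accepté",
    (90, "accepted", "es"): "aceptado",
    (90, "accepted", "pt"): "aceito",
}


def multitext(string_id, lang):
    """Translate a piece of interface text; assumed to fall back to English."""
    return APP_TEXT.get((string_id, lang), APP_TEXT[(string_id, "en")])


def multidata(lang, table_id, value):
    """Translate a data value; assumed to return it unchanged if untranslated."""
    return DATA_TEXT.get((table_id, value, lang), value)


def standing_line(lang, rating):
    """Hypothetical wrapper mirroring the concatenation in the PL/SQL snippet."""
    return f"{multitext(177, lang)}: {multidata(lang, 90, rating)}"


if __name__ == "__main__":
    for lang in ("en", "fr", "es", "pt"):
        print(f"{lang}: {standing_line(lang, 'accepted')}")
```

Separating interface text from data values means a fixed label and a value drawn from the database can each be translated from its own table.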

  25. Conclusion • Translation module works well for Western European languages • Could probably easily handle other languages using Latin script • Could probably expand to other alphabets such as Greek and Cyrillic • The big challenge: pictorial scripts • Japanese, Chinese, Korean …

  26. Credits • Canada • Guy Baillargeon • Derek Munro • Brazil • Vanderlei Perez Canhos • Dora Ann Lange Canhos • Sidnei de Souza
