
University of Southern Denmark, Kolding

The ThT System: A Multilingual Nordic Search Interface, by Ruth Feil and Lotte Weilgaard, University of Southern Denmark, Kolding. Aim: developing and testing a prototype of a multilingual Nordic search interface; monolingual access to information in several Nordic languages.


Presentation Transcript


  1. The ThT System: A Multilingual Nordic Search Interface, by Ruth Feil and Lotte Weilgaard, University of Southern Denmark, Kolding

  2. Aim • Developing and testing a prototype of a multilingual Nordic search interface • Monolingual access to information in several Nordic languages

  3. Financial support • NorFA (The Nordic Academy for Advanced Study of the Nordic Council of Ministers) • The Nordic Language Technology Research Programme

  4. Participants • Norwegian School of Economics and Business Administration, Bergen • University of Bergen • University of Vaasa • University of Southern Denmark • TERMplus ApS, Copenhagen • Observers: The Swedish Centre for Terminology, Stockholm; Ventspils College, Latvia

  5. Approach • Multilingual approach • Onomasiological perspective • Limited size of documentation • Thesaurus (I & D) • ThTTerms (Terminology) • Existing software

  6. Languages • Nordic languages: Swedish, Norwegian (Bokmål), Danish, Finnish • English

  7. Working procedure • Establishment of a parallel specialist corpus • Establishment of a classification • Establishment of an index with search concepts • Development of a search interface

  8. Establishing corpora • Test corpus: TV manual • Corpus 1: Document on the Arctic environment • Corpus 2: Nordea’s Year-end report

  9. Corpora • Arctic: Danish 65,000 tokens, English 72,000 tokens, Swedish 60,000 tokens, Norwegian 63,000 tokens, Finnish 45,000 tokens • Nordea: approx. 9,000-10,000 tokens (each language)
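
To make the token figures on this slide concrete, here is a minimal sketch of how such counts could be produced and summed. The slides do not say how tokens were counted; the regex tokenizer and the names `TOKEN_RE`, `count_tokens` and `ARCTIC_TOKENS` are assumptions for illustration only.

```python
import re

# Assumption: a token is any run of word characters. The project may
# have used a different tokenizer; the slides do not specify one.
TOKEN_RE = re.compile(r"\w+")


def count_tokens(text: str) -> int:
    """Count tokens in a text with the simple regex tokenizer above."""
    return len(TOKEN_RE.findall(text))


# Token counts for the Arctic corpus as reported on the slide.
ARCTIC_TOKENS = {
    "Danish": 65_000,
    "English": 72_000,
    "Swedish": 60_000,
    "Norwegian": 63_000,
    "Finnish": 45_000,
}

# Combined size of the parallel Arctic corpus across all five languages.
ARCTIC_TOTAL = sum(ARCTIC_TOKENS.values())
```

Under these figures the five language versions of the Arctic document add up to 305,000 tokens.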

  10. Classification • Arctic environment in the Nordic countries • Arctic regions in the Nordic countries • Environment • Nature • Animals and plants • Man • Impacts • Pollution • Protection • Hunting • Agriculture • Tourism • Fisheries • Forestry • Industry • Air • Oxygen • Sulphur • Heavy metals • Organisms • Ozone
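
A classification like this one is naturally stored as a tree. The sketch below is illustrative only: the transcript has lost the slide's layout, so the nesting shown here is an assumption, only a subset of the classes is included, and `CLASSIFICATION` and `path_to` are hypothetical names.

```python
# Assumption: this nesting is one plausible reading of the flattened
# slide text, not the project's actual hierarchy.
CLASSIFICATION = {
    "Arctic environment in the Nordic countries": {
        "Arctic regions in the Nordic countries": {},
        "Environment": {
            "Nature": {"Animals and plants": {}},
            "Man": {},
            "Impacts": {
                "Pollution": {},
                "Protection": {},
                "Hunting": {},
            },
        },
    },
}


def path_to(tree, label, trail=()):
    """Return the list of classes from the root down to `label`, or None."""
    for node, children in tree.items():
        here = trail + (node,)
        if node == label:
            return list(here)
        found = path_to(children, label, here)
        if found:
            return found
    return None
```

For example, `path_to(CLASSIFICATION, "Pollution")` walks from the top class through "Environment" and "Impacts" down to "Pollution", which is the kind of broader-to-narrower path a tree view can display.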

  11. Index with search concepts • Interactive extraction of search concepts (WORD index) • Search for equivalent terms (alignment) • Interactive systematisation of concepts (TREE VIEW)
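
The three steps on this slide can be sketched roughly as follows. This is not the ThT software (the project relied on existing tools); it is a minimal illustration under stated assumptions. The function names `word_index` and `equivalents`, the `CONCEPTS` table, and the example terms are hypothetical and were not taken from the project's corpora.

```python
import re
from collections import Counter


def word_index(text, min_len=4):
    """Build a frequency index of candidate search words from one text."""
    words = re.findall(r"\w+", text.lower())
    return Counter(w for w in words if len(w) >= min_len)


# Assumption: each concept is stored as one record that links the
# equivalent terms found by aligning the parallel texts.
CONCEPTS = [
    {"en": "pollution", "da": "forurening", "sv": "förorening",
     "nb": "forurensning", "fi": "saastuminen"},
    {"en": "heavy metals", "da": "tungmetaller", "sv": "tungmetaller",
     "nb": "tungmetaller", "fi": "raskasmetallit"},
]


def equivalents(term, lang):
    """Map a term in one language to its equivalents in the others."""
    for concept in CONCEPTS:
        if concept.get(lang, "").lower() == term.lower():
            return {l: t for l, t in concept.items() if l != lang}
    return {}
```

With such records, a user who searches in Danish for "forurening" could be shown documents indexed under the Swedish, Norwegian, Finnish and English equivalents without having to know those terms.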

  12. Alignment

  13. Tree View

  14. Horizontal search

  15. Vertical search

  16. Search

  17. Validation

  18. I & D and Terminology

  19. Thank you for your attention! www.norna.dk
