
Widening the Scope: Using Language Technologies in Publishing Systems

Widening the Scope: Using Language Technologies in Publishing Systems. ICT PSP call identifier: CIP-ICT-PSP-2009-3. Theme 5: Multilingual Web. 5.3 Multilingual Web content management - methods, tools and processes.

rosagomez

Presentation Transcript


  1. Widening the Scope: Using Language Technologies in Publishing Systems • ICT PSP call identifier: CIP-ICT-PSP-2009-3 • Theme 5: Multilingual Web • 5.3 Multilingual Web content management - methods, tools and processes

  2. The information today • Flood of multilingual and heterogeneous information • The challenge: The information has to be processed and analyzed in order to be used more efficiently

  3. The information today • Increasing amount of multilingual and heterogeneous information

  4. The information today Widening the scope!

  5. The Language Technologies (LT) • Computers process information; humans understand it. • Computers have limited resources for understanding information; humans have limited resources for processing it. • NLP technologies improve how well computers understand information and thus increase human productivity.

  6. Overview • The NLP technologies by examples • NLP in practice – the ATLAS project • Conclusions • Questions

  7. NLP by examples (1) • Divide and Conquer • Grouping the information: • By importance

  8. NLP by examples (1) • Divide and Conquer • Grouping the information • By importance • Automatic text categorization • Politics (24) • Sports (5) • Entertainment (5) • Technologies (12) • Science (20) • Rumors (6) • Other (10)
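
The per-category counts on this slide can be illustrated with a minimal sketch. It assumes a naive keyword-overlap classifier; the keyword lists and function names are hypothetical and are not taken from the presentation or from ATLAS, and a production system would normally use a classifier trained on labelled documents.

```python
from collections import Counter

# Hypothetical keyword lists, chosen only for illustration.
CATEGORY_KEYWORDS = {
    "Politics": {"election", "parliament", "minister", "treaty"},
    "Sports": {"match", "goal", "league", "tournament"},
    "Science": {"physics", "experiment", "research", "theorem"},
}


def categorize(text: str) -> str:
    """Return the category sharing the most keywords with the text, else 'Other'."""
    words = set(text.lower().split())
    scores = {cat: len(words & kws) for cat, kws in CATEGORY_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "Other"


def count_by_category(texts):
    """Count how many texts fall into each category."""
    return Counter(categorize(t) for t in texts)
```

Calling `count_by_category` on a list of article texts yields a mapping of the same shape as the slide's list, for example a count for "Politics" and a count for "Other".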

  9. NLP by examples (1) • Divide and Conquer • Grouping the information: • By importance • Automatic categorization • Text clustering • Politics (24) → International affairs (12), Conflicts (3), Terrorism (5), Nature and Environment (8), ... • Science (20) → Math (2), Physics (5), Nature and Environment (3), NLP technologies (4), ... • Other (10) → Money and Banks (3), Richard Branson (4), Learning materials (3), ...

  10. NLP by examples (1) • Temporal dynamics • Before, Now, Tomorrow? • Politics (24 +3) → International affairs (10 -2), Conflicts (3), Terrorism (6 +1), Nature and Environment (10 +2), ...
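
The signed numbers on this slide are differences between two snapshots of the cluster counts. A minimal sketch of that arithmetic follows, using the Politics sub-cluster counts from slide 9 as "before" and those from slide 10 as "now"; the function name `count_changes` is a hypothetical helper.

```python
def count_changes(before: dict, now: dict) -> dict:
    """Return now - before for every topic seen in either snapshot."""
    topics = sorted(set(before) | set(now))
    return {t: now.get(t, 0) - before.get(t, 0) for t in topics}


# Counts as shown on slides 9 (before) and 10 (now).
before = {"International affairs": 12, "Conflicts": 3,
          "Terrorism": 5, "Nature and Environment": 8}
now = {"International affairs": 10, "Conflicts": 3,
       "Terrorism": 6, "Nature and Environment": 10}

changes = count_changes(before, now)
for topic, delta in changes.items():
    # The "+d" format spec always prints the sign, as on the slide.
    print(f"{topic}: {now.get(topic, 0)} ({delta:+d})")
```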

  11. NLP by examples (2) • We do value your opinion! • Positive, negative or objective?
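
One simple way to approximate the positive / negative / objective distinction is a word-list lookup. The sketch below assumes two tiny hypothetical lexicons and treats any text without a clear majority of polar words as "objective", which is a deliberate simplification; the presentation does not say which method is actually used.

```python
import re

# Tiny illustrative lexicons; real opinion lexicons are far larger.
POSITIVE = {"good", "great", "excellent", "love"}
NEGATIVE = {"bad", "poor", "terrible", "hate"}


def polarity(text: str) -> str:
    """Classify text as 'positive', 'negative' or 'objective' by word counts."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "objective"
```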

  12. NLP by examples (3) • Salient excerpts • Persons • politicians, actors, scientists, fictional characters • Organizations and Institutions • NATO, EU, BAS, Bank of England, Google, Apple, … • Geographical locations • Bulgaria, Sofia, EU, Western Europe, Tibet • Dates • Example: Steven Paul Jobs [person] was born in San Francisco [city] on February 24, 1955 [date]
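
The tagging of the example sentence on this slide can be sketched with a name-list lookup plus a regular expression for dates. The gazetteer contents and the function `tag_entities` are assumptions made for illustration; real named-entity recognizers typically rely on statistical models rather than fixed lists.

```python
import re

# Tiny hand-made gazetteers; the entries are illustrative assumptions.
GAZETTEER = {
    "person": ["Steven Paul Jobs"],
    "city": ["San Francisco", "Sofia"],
}
MONTHS = ("January|February|March|April|May|June|July|"
          "August|September|October|November|December")
# Matches dates written like "February 24, 1955".
DATE_RE = re.compile(rf"\b(?:{MONTHS}) \d{{1,2}}, \d{{4}}\b")


def tag_entities(text: str):
    """Return (start, surface form, label) tuples sorted by position."""
    found = []
    for label, names in GAZETTEER.items():
        for name in names:
            for m in re.finditer(re.escape(name), text):
                found.append((m.start(), m.group(), label))
    for m in DATE_RE.finditer(text):
        found.append((m.start(), m.group(), "date"))
    return sorted(found)


sentence = "Steven Paul Jobs was born in San Francisco on February 24, 1955"
entities = tag_entities(sentence)
```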

  13. NLP by examples (3) • Salient excerpts • Jobs was a demanding perfectionist who always aspired to position his businesses and their products at the forefront of the information technology industry by foreseeing and setting trends, at least in innovation and style ... • As of October 9, 2011, Jobs is listed as primary inventor related to a range of technologies from actual computer and portable devices to user interfaces ...

  14. NLP by examples (3) • Salient excerpts • Jobs was a demanding perfectionist who always aspired to position his businesses and their products at the forefront of the information technology industry by foreseeing and setting trends, at least in innovation and style ... • As of October 9, 2011, Jobs is listed as primary inventor related to a range of technologies from actual computer and portable devices to user interfaces ...

  15. NLP by examples (4) • You might also be interested in this and that … • Suggestions for similar content • According to the textual information • According to the persons, locations and dates • According to the key concepts and ideas • According to the genre and fictional characters • Cross-lingual Information Retrieval
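
Suggestions "according to the textual information" are often computed by comparing word-count vectors. The following is a minimal monolingual sketch using cosine similarity over bags of words; it does not cover the cross-lingual case, and the function names are hypothetical rather than part of any system named in the slides.

```python
import math
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Lower-case the text and count its word tokens."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors (0.0 if either is empty)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def most_similar(query: str, documents, top_n: int = 1):
    """Return the top_n documents ranked by similarity to the query."""
    q = bag_of_words(query)
    ranked = sorted(documents, key=lambda d: cosine(q, bag_of_words(d)), reverse=True)
    return ranked[:top_n]
```

The same ranking function could be fed vectors of extracted persons, locations and dates instead of raw words to mirror the other criteria on the slide.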

  16. NLP by examples (5) • Machine translation • Text summarization • Of a single document • Of a collection of documents
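
For single-document summarization, a common baseline is extractive: score each sentence by how frequent its words are in the whole text and keep the top-scoring ones. The sketch below assumes that baseline (the slides do not state which method is used), does no stop-word filtering, and uses a hypothetical function name.

```python
import re
from collections import Counter


def summarize(text: str, max_sentences: int = 1):
    """Return the highest-scoring sentences, kept in their original order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(re.findall(r"\w+", text.lower()))

    def score(sentence: str) -> float:
        # Average document-wide frequency of the sentence's words.
        words = re.findall(r"\w+", sentence.lower())
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    top = sorted(sentences, key=score, reverse=True)[:max_sentences]
    return [s for s in sentences if s in top]
```

Summarizing a collection of documents could reuse the same scoring after concatenating the texts, although removing near-duplicate sentences then becomes important.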

  17. NLP in practice – ATLAS project • ATLAS: a multilingual content management system that harnesses NLP technologies • Supported languages: Bulgarian, English, German, Polish, Romanian and Greek. www.atlasproject.eu • Using ATLAS • Software-as-a-service: http://i-publisher.atlasproject.eu • API for integration with 3rd party systems • ATLAS extracts and provides • Key phrases and named entities • A list of similar documents • Automatic categorization and a text summary • Machine translation

  18. The ATLAS project • ICT PSP project • ATLAS consortium: • Coordinator: Tetracom Interactive Solutions – Bulgaria • DFKI - Deutsches Forschungszentrum Fuer Kuenstliche Intelligenz GmbH – Germany • Atlantis Consulting SA – Greece • Institute for Bulgarian Language “Professor Lyubomir Andreychin” at the Bulgarian Academy of Sciences – Bulgaria • Instytut Podstaw Informatyki Polskiej Akademii Nauk – Poland • Universität Hamburg – Germany • Universitatea Alexandru Ioan Cuza – Romania • Sveučilište u Zadru – Croatia • ITD - Institute of Technologies and Development – Bulgaria • Project duration • 3 years, starting 1 March 2010

  19. Conclusion? • What are the NLP technologies? • They provide a way to harness the computational resources of computers for better understanding of information • What can they be used for? • Handling the increasing amount of multilingual information more effectively • Who can use these technologies? • Libraries • Publishing houses • Media • Online bookstores • Lawyers • Banks, companies and organizations

  20. Questions... ?
