Max Planck Institute for Psycholinguistics - PowerPoint PPT Presentation

Presentation Transcript

  1. Tool development report H. Brugman MPI Nijmegen Max Planck Institute for Psycholinguistics

  2. Overview
     • IMDI Tools – update and future
     • ELAN – update and future
     • Conversion tools
     • Custom solutions
     • Tool distribution over the internet
     • (Archive software – already discussed)
     • Lexicon – activities and future
     • Typology – activities and future
     • Demo of selected features

  3. IMDI Tools
     • IMDI Metadata set (latest version 3.02)
     • IMDI-BCEditor
       • New single-window user interface
       • Substructures repository
       • Drag-and-drop interface
       • Support for Closed Vocabularies (CVs)
       • Display of metadata descriptions in HTML
       • Printing
       • Different CV update policies
       • Coming soon
         • Support for different fonts
         • Support for subdomain/project profiles

  4. IMDI Tools - 2
     • IMDI-BCBrowser
       • Support for CVs
       • Display of metadata descriptions as HTML
       • Printing
       • Password-protected access to resources using the HTTP authorization protocol
       • Different CV update policies

  5. IMDI Tools - 3
     • IMDI-BCSearchTool
       • XML-DB support (under construction)
         • for faster search
         • for ‘Google’-type search
     • IMDI-CVEditor
       • prototype ready
     • IMDI-BCTreeBuilder (under construction)
     • IMDI-BCTreeModifier / spreadsheet view (planned)

  6. ELAN
     • Release 1.2 (Oct 2002)
       • Automatic creation of a backup copy when first saving a document.
       • Export as tab-delimited text.
       • Import/export between ELAN and Shoebox, allowing time alignment in ELAN and interlinearization in Shoebox.
       • The Interlinear Viewer.
       • Color coding to visualize tier hierarchies.
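The tab-delimited export listed above can be pictured with a minimal sketch. This is not ELAN's actual export code or column layout; the record fields (`tier`, `begin_ms`, `end_ms`, `value`) and the function name `export_tab_delimited` are assumptions made purely for illustration.

```python
import csv
import io


def export_tab_delimited(annotations):
    """Return annotations as tab-delimited text, one row per annotation.

    `annotations` is a list of dicts with the hypothetical keys
    tier, begin_ms, end_ms and value (not ELAN's real data model).
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    # Header row so the file can be opened directly in a spreadsheet.
    writer.writerow(["tier", "begin_ms", "end_ms", "value"])
    # Sort by begin time, then tier name, for a stable, readable order.
    for ann in sorted(annotations, key=lambda a: (a["begin_ms"], a["tier"])):
        writer.writerow([ann["tier"], ann["begin_ms"], ann["end_ms"], ann["value"]])
    return buffer.getvalue()


if __name__ == "__main__":
    sample = [
        {"tier": "words", "begin_ms": 500, "end_ms": 900, "value": "hello"},
        {"tier": "gloss", "begin_ms": 0, "end_ms": 400, "value": "hi"},
    ]
    print(export_tab_delimited(sample), end="")
```

Using the `csv` module rather than joining strings by hand means that values which themselves contain tabs or line breaks are quoted instead of silently breaking the column layout.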

  7. ELAN - 2
     • Slider for playing at rates between 5 and 200%.
     • More specific dependency relations between tiers using constraints.
     • Support for two modes of annotation: Overwrite and Bulldozer.
     • Entering annotations via the Inline Edit box.
     • Subdividing annotations (entering annotations before/after existing annotations).
     • Modifying annotations in the Grid Viewer.
     • Improved search options
       • regular expression search
       • combination of multiple subqueries.
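As a rough illustration of the search options above (regular expression search and the combination of multiple subqueries), the sketch below filters annotation values with several patterns. It does not reproduce ELAN's search implementation; the function name `search_annotations`, the `require_all` switch and the annotation dicts are hypothetical.

```python
import re


def search_annotations(annotations, patterns, require_all=True):
    """Return the annotations whose value matches the given regex subqueries.

    require_all=True combines the subqueries with AND,
    require_all=False combines them with OR.
    """
    compiled = [re.compile(p) for p in patterns]
    combine = all if require_all else any
    return [
        ann for ann in annotations
        if combine(rx.search(ann["value"]) for rx in compiled)
    ]


if __name__ == "__main__":
    sample = [
        {"tier": "utt", "value": "the dog runs"},
        {"tier": "utt", "value": "a cat sleeps"},
        {"tier": "utt", "value": "the cat runs"},
    ]
    # Both subqueries must match: starts with "the" AND ends with "runs".
    for hit in search_annotations(sample, [r"^the\b", r"runs$"]):
        print(hit["value"])
```

Compiling each pattern once up front keeps the per-annotation work small when a document holds many annotations.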

  8. ELAN - 3
     • Release 1.3 (April 2003)
       • Workarounds for JMF (Java Media Framework) bugs.
       • Optimization and improved playback of viewers. ELAN is now substantially less demanding on the CPU.
       • Save Shoebox-imported files to EAF.
       • Import WAC (Word Annotation Converter) files.
       • Save WAC-imported files to EAF.
       • XML improvements (different parser, validation, formatted XML).

  9. ELAN - 4
     • An autosave function has been added.
     • Shoebox import: SIL IPA characters are now converted to Unicode during import.
     • An 'active tier' item has been added to the right mouse menu in the timeline viewer.
     • Panels no longer resize unexpectedly after edit operations.
     • The JMF media controller has been replaced by a single 'play' button.
     • A number of small improvements and bug fixes.

  10. ELAN - 5
     • Upcoming releases
       • Version 1.4 (April 2003)
         • “select while playing”
         • QuickTime as alternative for JMF
         • also for the Mac
       • Version 2.0 (Oct 2003)
         • complete revision, focusing on
           • user interface
           • modularity
           • time accuracy
           • speed and efficiency

  11. Conversion tools
     [Diagram: conversion paths among Word interlinear text, WAC (XML), EAF (XML), Transcriber .trs (XML), Shoebox (t separate) and Shoebox (t in ref); the converters are labeled WAC, ELAN and econv.]

  12. Custom solutions
     • Custom conversion scripts (e.g. Word lexicon to XML, then to Shoebox)
     • Custom XSLT transformations
     • Scripts for merging annotation docs with time docs
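The last item, merging annotation documents with time documents, might look roughly like the sketch below. The slides do not describe the actual scripts or file formats, so everything here is assumed for illustration: annotations keyed by a reference id, times stored as (begin, end) pairs in milliseconds, and the function name `merge_annotations_with_times`.

```python
def merge_annotations_with_times(annotations, times):
    """Join annotation text with time information via a shared reference id.

    annotations: dict mapping ref id -> annotation text (hypothetical layout)
    times:       dict mapping ref id -> (begin_ms, end_ms) (hypothetical layout)
    Returns (merged, unmatched): merged records sorted by begin time, and
    the ref ids that have text but no time information.
    """
    merged = []
    unmatched = []
    for ref, text in annotations.items():
        if ref in times:
            begin_ms, end_ms = times[ref]
            merged.append(
                {"ref": ref, "begin_ms": begin_ms, "end_ms": end_ms, "value": text}
            )
        else:
            # Keep track of annotations that cannot be aligned yet.
            unmatched.append(ref)
    merged.sort(key=lambda record: record["begin_ms"])
    return merged, unmatched


if __name__ == "__main__":
    texts = {"u1": "hello", "u2": "world", "u3": "orphan"}
    spans = {"u2": (0, 400), "u1": (500, 900)}
    records, missing = merge_annotations_with_times(texts, spans)
    for record in records:
        print(record)
    print("no time information for:", missing)
```

Returning the unmatched ids separately, rather than dropping them, makes it easy to see which annotations still need time alignment.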

  13. Tool distribution over the internet
     • Network launching using Webstart
     • Advantages:
       • full control over all versions of all components needed to run a tool
       • components can be updated automatically
     • Disadvantages:
       • the initial download takes a long time over slow connections
       • users want to decide when to update
     • What are your experiences?

  14. Lexicon – activities and future
     • Extensive analysis done
     • First design done. We propose to build:
       • a flexible lexicon tool where users can define their own lexicon structure and content
       • a matching generic XML lexicon format
       • “Shoebox+” lexicon tool functionality
       • can be used in interaction with ELAN
     • Active participation in international discussions, e.g. the ISO workshop on lexicon structures.

  15. Typology
     • Project initiatives