Discourse Level Software - PowerPoint PPT Presentation

Presentation Transcript

  1. Discourse Level Software: Current Status and Future Directions Nov. 16, 2004 Lars Huttar (lars_huttar@sil.org) Knowledge Management Services

  2. Abstract (I) • Discourse analysis (DA, a.k.a. textlinguistics) is a task frequently cited as needing computer-assisted tools. • Some tools are currently available for certain tasks, but as yet there are no user-ready applications specifically for the discourse charting commonly used on the field.

  3. Abstract (II) • This presentation will review a few of the existing tools most pertinent to DA on the field, and software that is planned or under development. • I will also mention the conceptual model for constituent charting described in my thesis, which uses XML encoding of text and analysis, from which a chart is rendered via XSL.

  4. Overview • The need for discourse analysis software • What’s already out there? • What’s coming down the pike?

  5. Need for Discourse Software The task: • Help the user produce charts, diagrams, and summaries of texts in such a way as to facilitate discovery of discourse patterns and to expedite testing of hypotheses.

  6. Major features desired (Geoffrey Hunt, Kent Spielmann) • Import (interlinear) text • Segment and move pieces into chart columns • Mark genre(s) • Configurable auto-highlighting, e.g. color by POS • Toggle highlighting of certain features • Manual annotation of features incl. coherence and prominence • Search text, IT, and annotations • Chart/summary of results, hyperlinked to data • Accessible to MTTs/OTTs

  7. Example constituent chart

  8. Current Practice • Pencil & paper • MS Word • MS Excel • A few brave souls use other tools

  9. The Right Tools? Specialized tools could make it quicker and easier!

  10. How to Address the Need? • Use existing software • SIL FieldWorks DA tool(s) • Extend existing tools?

  11. What’s already here? • MDA • BART • RSTTool • MATE • CiCaDA

  12. Multilinear Discourse Analysis • Generate statistics and diagrams relating to span analysis, topic continuity statistics, and other issues • Input is an SFM marked up text (e.g. from Shoebox) • In Beta 2 • More info: phil.quick@sil.org
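As a rough illustration of what "SFM marked up text" looks like to a program, the following is a minimal Python sketch of reading standard-format records. It is not MDA's code. The marker names (\ref, \tx, \ft) and the sample text are assumptions chosen for illustration, and the hypothetical helper parse_sfm ignores continuation lines that a real importer would need to handle.

```python
# Minimal sketch of reading SFM (standard format marker) text.
# Assumptions: each field sits on one line, "\marker value", and a new
# record starts at \ref. Marker names and sample data are illustrative.

SAMPLE = r"""
\ref T1.001
\tx Long ago a man built a house
\ft Long ago a man built a house.

\ref T1.002
\tx He sold it
\ft He sold it.
"""


def parse_sfm(text, record_marker="ref"):
    """Return a list of records; each record maps marker -> list of values."""
    records = []
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("\\"):
            continue  # this sketch skips blank and continuation lines
        marker, _, value = line[1:].partition(" ")
        if marker == record_marker:
            current = {}
            records.append(current)
        if current is not None:
            current.setdefault(marker, []).append(value.strip())
    return records


if __name__ == "__main__":
    for record in parse_sfm(SAMPLE):
        print(record["ref"][0], "->", record["tx"][0])
```

Once the text is in records like these, statistics over spans or participants become ordinary list processing.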

  13. Biblical Analysis Research Tool • BART – has features supporting discourse analysis of biblical texts • Comes with extensive built-in morphosyntax markup; supports customizable tagging and complex queries. • Only for biblical texts; can’t enter vernacular texts. • Part of TW, or available from WordSearch Corp. • www.sil.org/translation/bart.htm

  14. RSTTool • Lets user diagram relations between text “chunks.” • Free download from http://www.wagsoft.com/RSTTOOL • User can define own set of relations, schemas, etc. such as SSA or Longacre’s propositional relations. • Can generate statistics based on the tree structures built by the user. • File format is XML-based. • Text can be edited even after structuring has begun.
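To make "statistics based on the tree structures" concrete, here is a small Python sketch that counts how often each relation occurs in a user-built tree. The Node class, the relation names, and the sample tree are all hypothetical; this does not read RSTTool's file format or reproduce its statistics.

```python
# Sketch: count relation labels in a tree of text chunks.
# Assumptions: Node is a hypothetical structure, and the relation names
# ("elaboration", "cause") are user-defined labels invented for the example.
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Node:
    relation: str = ""   # relation linking this node to its parent ("" at the root)
    text: str = ""       # text of a leaf chunk ("" for internal nodes)
    children: list = field(default_factory=list)


def relation_counts(root):
    """Walk the tree and tally every non-empty relation label."""
    counts = Counter()
    stack = [root]
    while stack:
        node = stack.pop()
        if node.relation:
            counts[node.relation] += 1
        stack.extend(node.children)
    return counts


TREE = Node(children=[
    Node(text="A man built a house."),
    Node(relation="elaboration", children=[
        Node(text="It was large."),
        Node(relation="cause", text="He had many children."),
    ]),
    Node(relation="elaboration", text="It stood by the river."),
])

if __name__ == "__main__":
    for relation, n in relation_counts(TREE).most_common():
        print(relation, n)
```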

  15. MATE Workbench • Tool “to aid in the display, editing and querying of annotated speech corpora” • Encodes data in XML and displays via XSL-like stylesheets; could be programmed to produce various displays. • In “early demo” version (2001). Looks like it has potential, but I can’t get it to run on my machine. • http://mate.nis.sdu.dk/

  16. CiCaDA • Produce fairly feature-complete constituent charts from XML data using XSLT stylesheets. • Encode text, column assignments, and chart configuration in XML; chart is produced automatically. • Open standards promote modification/reuse of data. • There is no “application;” no user-friendly way to enter the XML data.
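The idea of encoding column assignments in XML and producing the chart automatically can be sketched as follows. This is only an illustration under stated assumptions: CiCaDA itself uses XSLT stylesheets, whereas this sketch substitutes a short Python script, and the element and attribute names (chart, columns, column, clause, seg, col) are invented for the example and are not CiCaDA's actual schema.

```python
# Sketch: render a plain-text constituent chart from XML-encoded clauses.
# Assumptions: the XML vocabulary below is hypothetical, and Python stands
# in for the XSLT stylesheet that CiCaDA actually uses.
import xml.etree.ElementTree as ET

SAMPLE = """
<chart>
  <columns>
    <column id="pre">Pre-nuclear</column>
    <column id="S">Subject</column>
    <column id="V">Verb</column>
    <column id="O">Object</column>
  </columns>
  <clause n="1">
    <seg col="pre">Long ago</seg>
    <seg col="S">a man</seg>
    <seg col="V">built</seg>
    <seg col="O">a house</seg>
  </clause>
  <clause n="2">
    <seg col="V">sold</seg>
    <seg col="O">it</seg>
  </clause>
</chart>
"""


def render_chart(xml_text):
    """Return the chart as text: one header row, then one row per clause."""
    root = ET.fromstring(xml_text)
    columns = [(c.get("id"), c.text) for c in root.find("columns")]
    rows = [["#"] + [label for _, label in columns]]
    for clause in root.findall("clause"):
        # Map each segment to its assigned column; unfilled columns stay empty.
        cells = {seg.get("col"): seg.text for seg in clause.findall("seg")}
        rows.append([clause.get("n")] + [cells.get(cid, "") for cid, _ in columns])
    widths = [max(len(row[i]) for row in rows) for i in range(len(rows[0]))]
    return "\n".join(
        " | ".join(cell.ljust(w) for cell, w in zip(row, widths)).rstrip()
        for row in rows
    )


if __name__ == "__main__":
    print(render_chart(SAMPLE))
```

Because the column configuration lives in the data rather than in the script, changing the chart layout means editing the columns element only, which is the kind of reuse that open formats are meant to allow.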

  17. Helps available • LinguaLinks Library has several items, including: • Analyzing Discourse: a Manual of Basic Concepts – Dooley & Levinsohn (avail. on the web as well as in LLL). Very practical.

  18. Do you know of others? • Please let me know if you are aware of other useful discourse-level software tools!

  19. What’s coming? • TCC • AGTK • FieldWorks DA tools

  20. TCC • “A tool for drawing syntax trees” – could also be used for discourse “chunking” and highlighting • Looks very easy to use. Collapsible tree makes it easy to browse large text structures. • Supports Latin-1 charset. • Author taking feedback to make TCC more useful for SIL’s work. • Still in beta. No release schedule. • Info: http://ulrikp.org/

  21. Annotation Graph ToolKit • AGTK is a toolkit for annotating texts • TreeTrans – edit syntactic trees; charting & chunking possible • InterTrans – interlinearize text (very beta) • Saves in an abstract XML format; potential good basis for “Lego” solution • Not ready for end users.

  22. SIL FieldWorks DA Tool(s) • FW DA software is still on the drawing board but is a high priority. • Would leverage the huge benefits of all the work that has gone into FieldWorks! • FW tools already support interlinear text, text annotations/tagging and highlighting. • Preliminary work has begun on design of constituent charting features. • Wish list for DA features exists but requirements not yet prioritized. Guidance team has not yet been formed.

  23. Conclusion • There are some good tools already out there for certain tasks related to DA. Unfortunately they don’t interoperate much, and there are no domain-aware applications for constituent charting. • SIL FieldWorks tools, as they become available, should cover certain DA tasks well, such as constituent charting.

  24. Questions? Comments?