
MAGIC Seen from the Perspective of RAGS

MAGIC Seen from the Perspective of RAGS. Kathleen R. McKeown, Department of Computer Science, Columbia University. MAGIC: Multimedia Abstract Generation of Intensive Care data. Collaborators: Steven Feiner, Desmond Jordan, Shimei Pan, James Shaw, Michelle Zhou


Presentation Transcript


  1. MAGIC Seen from the Perspective of RAGS Kathleen R. McKeown Department of Computer Science Columbia University

  2. MAGIC: Multimedia Abstract Generation of Intensive Care data • Collaborators: Steven Feiner, Desmond Jordan, Shimei Pan, James Shaw, Michelle Zhou, Kris Concepcion, Liz Chen, Jeanne Fromer

  3. Scenario Goal: provide post-operative information on bypass patients (CABG) • Prior to completion of surgery and before transport to Cardiac Intensive Care Unit (ICU) • Status needed for ICU nurse, cardiologist • Time critical

  4. Issues for Language Generation • Conciseness: Coordinated speech and text that is brief but unambiguous • Coordination with other media: Modify wording and speech to coordinate references with graphical highlighting • Media specific tailoring: • Produce wording appropriate for spoken language • Use information from language generation to improve quality of synthesized speech

  5. Status • Implemented prototype showing coordination between media for limited input • Text output for large numbers of input cases • Undergoing evaluation *now* in ICU • Runs on live data on a daily basis • 5-10% error rate • Continuing research on effects of LG information on prosody, partial results

  6. Principles • Early processes produce media independent representations • Representations use partial orderings in order to make early commitments where possible and retain flexibility • Both the speech and graphics content planner may add content and ordering constraints • Constraints on later decisions may be added early on (e.g., lexical choice)

  7. Data Server and Filter (conceptual) • Input • 18:25 <drug> Drips Norepinephrine • 18:27 <drug> Drips Norepinephrine • 18:29 <drug> Misc. Magnesium Sulfate • 18:29 <surgery> Cardiac Defibrillated by surgeon • 18:33:11 100 (BP) 51 (HR) • 18:34:01 96 52 • Output • C-inanimate entity -> C-drug -> C-operating-room-medication -> C-Drip -> C-Norepinephrine • Top-level categories • C-state, C-event, C-entity (abstract, physical, organization, math) • Inferences • Hypotension: time, duration, drugs given
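The classification path on slide 7 can be read as a chain of is-a links. The following is a minimal sketch of that reading, not the MAGIC implementation: the `PARENT` table contains only the single path shown on the slide, and the names `PARENT`, `classification_path`, and `is_a` are hypothetical.

```python
# Hypothetical parent links, transcribed from the one path shown on slide 7.
# The real ontology is not given in the slides; this table is an assumption.
PARENT = {
    "C-Norepinephrine": "C-Drip",
    "C-Drip": "C-operating-room-medication",
    "C-operating-room-medication": "C-drug",
    "C-drug": "C-inanimate entity",
}


def classification_path(concept):
    """Return the path from the most general known ancestor down to concept."""
    path = [concept]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return list(reversed(path))


def is_a(concept, ancestor):
    """True if ancestor appears on the classification path of concept."""
    return ancestor in classification_path(concept)


if __name__ == "__main__":
    print(" -> ".join(classification_path("C-Norepinephrine")))
```

A filter built this way could tag the raw entry "Drips Norepinephrine" with its full path, so that later stages can ask general questions such as whether any drug was given during an episode.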

  8. General Content Planner - SOAP (Rhetorical, semantic, conceptual) • Overview • Demographics • Name, Age, MRN, Gender, Doctor, Operation • Medical history • Lines • Therapy • Devices • Detail View • Drips (on leaving) • Induction info • Devices • Lab report • Timeline • Inferences • End values • Conclusions

  9. Speech Content Planner - Satisfying Conciseness • Speech content planner groups information into sentences • Ms. Jones is an 80 year old, hypertensive diabetic female patient of Dr. Smith undergoing CABG. • Ms. Jones is an 80 year old, female patient of Dr. Smith undergoing CABG. She has a history of diabetes and hypertension. • To satisfy the communicative goal of conciseness, the planner selects adjectives and prepositional phrases when possible.

  10. Input to speech content planner - semantic propositions • X is-a patient • X has-property last name = Jones • X has-property age = 80 years • X has-property history = hypertension • X has-property history = diabetes • X has-property gender = female • X has-property surgery = CABG • X has-property doctor = Y • Y has-property last name = Smith

  11. Forming Sentence Structure (Rhetorical, semantic, lexical, syntactic) • ((relation is-a) (arg1 ((item ((class name) (last-name “Jones”))))) (arg2 ((item ((class patient)))))) • ((relation is-a) (arg1 ((item ((class name) (last-name “Jones”))))) (arg2 ((item ((class patient)) (premod ((history hypertension))))))))

  12. 3 Types of Aggregation • Hypotactic aggregation: Given a set of propositions, can one be realized as a modifier of another? • Semantic aggregation: If a patient is on multiple drips and all devices, infer that the patient has received massive cardiotonic therapy • Paratactic aggregation: Combine related propositions using conjunction and apposition
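Hypotactic aggregation can be illustrated with the propositions from slide 10. The sketch below is a toy under stated assumptions, not the speech content planner itself: the adjective lexicon `ADJECTIVE`, the fixed title "Ms.", the fixed article "an", and the function name `realize_concise` are all hypothetical. It folds the two history propositions into premodifiers of "patient" and reproduces the concise sentence from slide 9.

```python
# Propositions from slide 10 as (subject, attribute, value) triples.
PROPOSITIONS = [
    ("X", "is-a", "patient"),
    ("X", "last name", "Jones"),
    ("X", "age", "80 years"),
    ("X", "history", "hypertension"),
    ("X", "history", "diabetes"),
    ("X", "gender", "female"),
    ("X", "surgery", "CABG"),
    ("X", "doctor", "Y"),
    ("Y", "last name", "Smith"),
]

# Toy lexicon (an assumption): conditions that have an adjectival form
# and can therefore be realized as modifiers instead of a separate sentence.
ADJECTIVE = {"hypertension": "hypertensive", "diabetes": "diabetic"}


def realize_concise(props):
    """Realize history propositions as premodifiers of the head noun."""
    by_subject = {}
    for subject, attribute, value in props:
        by_subject.setdefault(subject, {}).setdefault(attribute, []).append(value)

    patient = by_subject["X"]
    doctor = by_subject[patient["doctor"][0]]
    premod = " ".join(ADJECTIVE[h] for h in patient["history"])
    age = patient["age"][0].split()[0]

    # "Ms." and "an" are hard-coded for this single example.
    return (
        f"Ms. {patient['last name'][0]} is an {age} year old, {premod} "
        f"{patient['gender'][0]} {patient['is-a'][0]} of "
        f"Dr. {doctor['last name'][0]} undergoing {patient['surgery'][0]}."
    )


if __name__ == "__main__":
    print(realize_concise(PROPOSITIONS))
```

A condition with no entry in the lexicon would have to fall back to a separate clause such as "She has a history of ...", which is the longer alternative shown on slide 9.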

  13. Coordination across media • Temporal media • Coordinate spoken references with highlighting of graphical references • Requires negotiation of ordering and duration of media actions

  14. Negotiating Ordering • Spoken language generator has grammatical constraints on linear ordering • Graphics generator has spatial constraints on layout • Satisfying each set of constraints independently may result in an incoherent presentation

  15. Ms. Jones is an 80 year old, diabetic, hypertensive female patient of Dr. Smith undergoing CABG.

  16. Problems for Language Generation: Ordering • When to provide an ordering over references? • produce a partial ordering after word choice • How to select an ordering compatible with graphics? • produce several possibilities ordered by preference • How to communicate orderings with graphics? • maintain a mapping between strings and semantic objects

  17. Media Negotiation (Conceptual, Semantic, Document) • Speech components produce candidate partial orders 1. (< name age (* diabetes hypertension) gender surgeon operation) 10 2. (< name age gender surgeon operation (* diabetes hypertension)) 5 3. (< name age gender (* diabetes hypertension) surgeon operation) 4

  18. Media Negotiation • Graphics components produce candidate partial orders 1. (di (highlight demographics) ((<m) (subhighlight (mrn age gender)) (subhighlight (medhistory)) (subhighlight (surgeon operation)))) 10 2. (di (highlight demographics) (* (subhighlight (mrn age gender)) (subhighlight (medhistory)) (subhighlight (surgeon operation)))) 7
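One way to read slides 17 and 18 together is as a search for the best mutually compatible pair of candidates. The sketch below rests on several assumptions that the slides do not state: the number beside each candidate is a preference score, `(<m)` means the subhighlights occur in the listed order, `*` means any order, each subhighlight fires once, and the combined preference is the sum of the two scores. The names `GROUP_OF`, `compatible`, and `negotiate` are hypothetical.

```python
from itertools import groupby

# Speech candidates from slide 17: (reference order, score).
# (* diabetes hypertension) is flattened to one order, which is harmless
# here because both references map to the same graphical group.
SPEECH = [
    (["name", "age", "diabetes", "hypertension", "gender", "surgeon", "operation"], 10),
    (["name", "age", "gender", "surgeon", "operation", "diabetes", "hypertension"], 5),
    (["name", "age", "gender", "diabetes", "hypertension", "surgeon", "operation"], 4),
]

# Assumed mapping from spoken references to the subhighlight groups of slide 18.
# "name" has no group of its own and is ignored by the check.
GROUP_OF = {
    "age": "mrn-age-gender",
    "gender": "mrn-age-gender",
    "diabetes": "medhistory",
    "hypertension": "medhistory",
    "surgeon": "surgeon-operation",
    "operation": "surgeon-operation",
}

GROUPS = ["mrn-age-gender", "medhistory", "surgeon-operation"]

# Graphics candidates from slide 18: (groups, order is fixed?, score).
GRAPHICS = [
    (GROUPS, True, 10),   # candidate 1: (<m), read as sequential
    (GROUPS, False, 7),   # candidate 2: *, read as any order
]


def group_sequence(refs):
    """Order in which the spoken references visit the graphical groups."""
    groups = [GROUP_OF[r] for r in refs if r in GROUP_OF]
    return [g for g, _ in groupby(groups)]


def compatible(refs, graphics):
    groups, ordered, _score = graphics
    seq = group_sequence(refs)
    if len(seq) != len(set(seq)):
        return False  # a subhighlight would have to fire twice
    return seq == groups if ordered else set(seq) == set(groups)


def negotiate():
    """Return (total score, speech candidate no., graphics candidate no.)."""
    best = None
    for si, (refs, s_score) in enumerate(SPEECH, start=1):
        for gi, graphics in enumerate(GRAPHICS, start=1):
            if compatible(refs, graphics):
                total = s_score + graphics[2]
                if best is None or total > best[0]:
                    best = (total, si, gi)
    return best


if __name__ == "__main__":
    print(negotiate())
```

Under these assumptions the top-scoring speech candidate is rejected because it splits the age and gender references around the medical history, so a lower-ranked speech order wins once graphics is taken into account.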

  19. CTS Architecture [Diagram; labels recovered from the slide: Input, NLG System, Text + Annotated Structure, Prosody Realizer, Prosody model, Machine Learning, Speech Corpus, Prosodic Rules, Other Source, TTS, Text, Sound]

  20. Focus of Research (Rhetorical, Semantic, Syntactic, Prosodic) • Build a prosody model for CTS using prosodic features (based on ToBI): • pitch accent, phrase accent, boundary tone, break index. • Features produced by LG • Syntactic structure, POS tags, Semantic boundaries, Concept • Informativeness, predictability (statistical models) • Abnormality, unexpectedness, sequential rhetorical relation

  21. Mapping to RAGS • Data filter - conceptual • General Content Planner - rhetorical, semantic, conceptual • Speech Content Planner - rhetorical, semantic plus constraints on lexicalization, syntax • Lexical Chooser - semantic, lexical, syntactic • Media Coordination - semantic, conceptual, document • Syntactic Realization - semantic, syntactic • Prosody Realization - rhetorical, semantic, syntactic, prosodic
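The mapping on slide 21 can be kept as a small lookup table, which makes it easy to ask which MAGIC modules touch a given RAGS level. This is only a transcription of the slide into a data structure; the names `RAGS_LEVELS` and `modules_using` are hypothetical.

```python
# Module -> RAGS representation levels, transcribed from slide 21.
RAGS_LEVELS = {
    "Data filter": {"conceptual"},
    "General Content Planner": {"rhetorical", "semantic", "conceptual"},
    # The slide adds "plus constraints on lexicalization, syntax" for this module.
    "Speech Content Planner": {"rhetorical", "semantic"},
    "Lexical Chooser": {"semantic", "lexical", "syntactic"},
    "Media Coordination": {"semantic", "conceptual", "document"},
    "Syntactic Realization": {"semantic", "syntactic"},
    "Prosody Realization": {"rhetorical", "semantic", "syntactic", "prosodic"},
}


def modules_using(level):
    """List modules, in pipeline order, whose representations include level."""
    return [module for module, levels in RAGS_LEVELS.items() if level in levels]


if __name__ == "__main__":
    for level in ("conceptual", "semantic", "prosodic"):
        print(level, "->", modules_using(level))
```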

  22. Acknowledgments This work was funded in part by • DARPA • NSF • ONR • New York State Center for Advanced Technology • NLM
