
The Problem with Probabilistic Parsing



  1. The Problem with Probabilistic Parsing Kari Baker Arizona State University

  2. What Will We Be Learning Today? • The Task • i2b2 Bake-Off • Concepts • Parsing • Motivation for New Models • Creating a Model • Text Normalization • POS Constraints • Phrase Constraints • Bake-Off Results • SNoW • Reranker • Other Parser Models --- Baker

  3. i2b2/VA Challenges in Natural Language Processing for Clinical Data • Three-Part Shared Task • Concepts • Assertion • Relation • Concept Extraction • Problem • Test • Treatment

  4. Concept Examples (Concept vs. Not a Concept) • Problem: "The man was obese." vs. "The obese man was admitted." • Test: "Blood Pressure 130/80" vs. "The patient has high blood pressure." • Treatment: "The patient underwent surgery." vs. "The patient arrived in the surgery suite."

  5. What does a parse look like? [Tree diagram: S1 over S; S branches to NP (DET The, NN man), VP (VBD was, ADJP (JJ obese)), and the final period.]

  6. What does a parse look like? (S1 (S (NP (DET The) (NN man)) (VP (VBD was) (ADJP (JJ obese))) (. .)))
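The bracketed notation above is just a nested list: each open parenthesis starts a constituent whose first token is the label and whose remaining tokens are child constituents or words. A minimal reader for it can be sketched in a few lines of Python (this helper is illustrative, not part of the talk's toolchain):

```python
# Minimal reader for Penn Treebank-style bracketed parses.
# Returns nested (label, children) pairs; leaf children are plain word strings.
def read_tree(s):
    tokens = s.replace("(", " ( ").replace(")", " ) ").split()

    def parse(i):
        label = tokens[i + 1]          # token right after the "("
        children, i = [], i + 2
        while tokens[i] != ")":
            if tokens[i] == "(":
                child, i = parse(i)
            else:
                child, i = tokens[i], i + 1
            children.append(child)
        return (label, children), i + 1

    return parse(0)[0]

tree = read_tree("(S1 (S (NP (DET The) (NN man)) (VP (VBD was) (ADJP (JJ obese))) (. .)))")
# tree[0] is "S1"; tree[1][0] is the (S ...) subtree
```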

  7. Concept Examples [Tree diagram of "The man was obese ." as on slide 5.]

  8. Concept Examples [Tree diagram of "The man was obese ." as on slide 5.]

  9. Concept Examples [Tree diagram: S1 over S; S branches to NP (DET The, JJ obese, NN man), VP (AUX was, VBD admitted), and the final period.]

  10. Concept Examples Problem: (S1 (S (NP (DET The) (NN man)) (VP (VBD was) (ADJP (JJ obese))) (. .))) (S1 (S (NP (DET The) (JJ obese) (NN man)) (VP (AUX was) (VBD admitted)) (. .)))

  11. Concept Examples Test: (S1 (FRAG (NP (NN Blood) (NN Pressure)) (QP (CD 130/80)))) (S1 (S (NP (DET The) (NN patient)) (VP (VB has) (NP (JJ high) (NN blood) (NN pressure))) (. .))) Treatment: (S1 (S (NP (DET The) (NN patient)) (VP (VBD underwent) (NP (NN surgery))) (. .))) (S1 (S (NP (DET The) (NN patient)) (VP (VBD arrived) (PP (IN in) (NP (DET the) (NN surgery) (NN suite)))) (. .)))

  12. Sodium 139 , potassium 3.8 , chloride 101 , bicarb 26 , BUN 9 , creatinine 0.7 , glucose 141 , albumin 4.1 , calcium 8.9 , LDH 665 , AST 44 , ALT of 57 , amylase 41 , CK 32 . • 1. Post endoscopic retrograde cholangiopancreatography pancreatitis . • FLANK PAIN URI ? • A/P : 48yo man with h/o HCV , bipolar DO , h/o suicide attempts , a/w overdose of Inderal , Klonopin , Geodon , s/T Jackson stay with intubation for airway protection , with question of L retrocardiac infiltrate , now doing well . • Please note the patient is only Caucasian speaking and information is second hand . • 16) Robituss in AC five to ten milliliters p.o. q.h.s. p.r.n. cough . • Pt has h/o colon can to liver , s/p resxn with serosal implants in 9/03 . • She received ASA , nitro SL then gtt , morphine , metoprolol , and heparin gtt . • 5. Dulcolax 10 to 20 mg PR b.i.d. p.r.n. constipation . • The pt is a 55yo F s / p Roux en Y GBP in 12/20 presenting to the ED this AM c / o mod severe midepigastric pain . • Her electrocardiogram revealed normal sinus rhythm , left atrial enlargement, left axis deviation , poor R-wave progression in V1 through V4 , consistent with marked clockwise rotation , cannot rule out an old anteroseptal wall myocardial infarction .

  13. The Problem CT scan normal (S1 (S (NP (NNP CT)) (VP (VB scan) (S (ADJP (JJ normal))))))

  14. By-Hand Parses • 57 sentences parsed by hand • Necessary to understand the structure of the sentences

  15. The Problem • No VP • CT scan normal • Lists • 1. Bactrim double strength • Fragment construction • (S1 (FRAG (NP (NN Blood) (NN Pressure)) (QP (CD 130/80)))) …among others

  16. How does the Charniak Parser work? • Uses a trained model • Models can be trained on different corpora • WSJ PennTreebank corpus • Defines probabilistic productions • Example: S 99%, fragment 1%
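Concretely, a model of this kind is at heart a table of rule probabilities. A toy sketch follows; the S1 numbers echo the slide's hypothetical 99%/1% example, and the S rule weights are invented for illustration, not real WSJ-trained values:

```python
# Toy probabilistic productions: each label maps to candidate expansions with
# probabilities. A real Charniak model conditions on much richer context.
productions = {
    "S1": [(("S",), 0.99), (("FRAG",), 0.01)],
    "S": [(("NP", "VP", "."), 0.90), (("VP", "."), 0.10)],  # invented weights
}

def best_expansion(label):
    # A full parser multiplies probabilities over an entire tree and keeps the
    # best-scoring analysis; here we just take the single most probable rule.
    return max(productions[label], key=lambda rule: rule[1])[0]

best_expansion("S1")  # under these weights, S beats FRAG 99-to-1
```

This is why "CT scan normal" goes wrong: with S1 → S at 99%, the model strongly prefers a clausal analysis over the FRAG analysis the clinical text actually needs.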

  17. The Problem *WSJ corpus has 39,832 by-hand parses

  18. The Problem CT scan normal Desired Parse: (S1 (FRAG (NP (NN CT) (NN scan)) (ADJP (JJ normal)))) Parser Output: (S1 (S (NP (NNP CT)) (VP (VB scan) (S (ADJP (JJ normal))))))

  19. The Problem CT scan normal Desired Parse: (S1 (FRAG (NP (NN CT) (NN scan)) (ADJP (JJ normal)))) Parser Output: (S1 (S (NP (NNP CT)) (VP (VB scan) (S (ADJP (JJ normal))))))

  20. The Problem [Two tree diagrams for "CT scan normal": the desired parse, S1 over FRAG with NP (NN CT, NN scan) and ADJP (JJ normal), versus the parser output, S1 over S with NP (NNP CT) and VP (VB scan) dominating S (ADJP (JJ normal)).]

  21. How are Desirable Parses Obtained? • Text Normalization • Part of Speech Constraints • Phrase Constraints

  22. Text Normalization • Pt 's labs were checked • Only minimal exertion such as " walking across the room " • The patient is a **AGE[in 50s]- year - old female well until **DATE[Jan 2007] • The MRI was performed here at **INSTITUTION • she does have a Foley catheter in for I&amp ; O measurement

  23. Text Normalization &gt; → > Before: If you experience fever &gt; 100.4 , return to the hospital . After: If you experience fever > 100.4 , return to the hospital .
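A sketch of this normalization step in Python; the entity list and whitespace rule are assumptions about the kind of cleanup described on these slides, not the talk's actual code:

```python
import re

def normalize(sentence):
    # Undo HTML-entity escapes left in the raw clinical text, including
    # spaced variants like "&gt ;" that appear in the i2b2 records.
    sentence = re.sub(r"&\s*gt\s*;", ">", sentence)
    sentence = re.sub(r"&\s*lt\s*;", "<", sentence)
    sentence = re.sub(r"&\s*amp\s*;", "&", sentence)
    # Collapse any leftover runs of whitespace to single spaces.
    return re.sub(r"\s+", " ", sentence).strip()

normalize("If you experience fever &gt; 100.4 , return to the hospital .")
# → "If you experience fever > 100.4 , return to the hospital ."
```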

  24. Text Normalization Note: F-Score is taken from the parser output compared against the by-hand parses of the i2b2 data

  25. Medical Acronyms/Abbreviations

  26. Constraining with Parts of Speech qn → nightly = adverb (S1 (XX He) (XX was) (XX placed) (XX on) (XX Unasyn) (XX 3) (XX grams) (RB qn) (XX .)) He was placed on Unasyn 3 grams qn.
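One way to picture the constraint format above: tokens tagged XX are left for the parser to tag freely, while tokens covered by a small clinical lexicon get a fixed tag. The lexicon below is a made-up stand-in for the talk's actual abbreviation list:

```python
# Hypothetical clinical-abbreviation lexicon; "qn" (nightly) is pinned to RB.
CLINICAL_POS = {"qn": "RB", "prn": "RB", "qhs": "RB"}

def pos_constraints(tokens):
    # XX means "unconstrained": the parser chooses the tag itself.
    return [(tok, CLINICAL_POS.get(tok.lower(), "XX")) for tok in tokens]

pos_constraints("He was placed on Unasyn 3 grams qn .".split())
# "qn" comes back paired with "RB"; every other token pairs with "XX".
```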

  27. Constraining with Parts of Speech *Note: There were 5 failed parses for the POS Constraints whereas the Normalized Text had zero.

  28. Constraining with Phrases Patient has swollen painful L side face . Concept = swollen painful L side face (S1 (XX Patient) (XX has) (NP-problem (XX swollen) (XX painful) (XX L) (XX side) (XX face)) (XX .))
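Mechanically, a phrase constraint can be injected by pre-bracketing the known concept span before the sentence reaches the parser. A minimal sketch, in which the function name and output format are illustrative rather than the talk's actual interface:

```python
def bracket_concept(tokens, start, end, label):
    # Wrap tokens[start:end] in a labeled constraint bracket; every other
    # token stays unconstrained (XX), mirroring the slide's example.
    inner = " ".join(f"(XX {t})" for t in tokens[start:end])
    out = [f"(XX {t})" for t in tokens[:start]]
    out.append(f"({label} {inner})")
    out += [f"(XX {t})" for t in tokens[end:]]
    return "(S1 " + " ".join(out) + ")"

bracket_concept("Patient has swollen painful L side face .".split(), 2, 7, "NP-problem")
```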

  29. Constraining with Phrases

  30. What Next? Train Model! • No True Concepts on Test Day • Treat phrase-constrained parser as truth • Train model on that data

  31. Phrase-Constrained Model

  32. Phrase-Constrained Model

  33. Concept Extraction: SNoW (S1 (S (NP (DET The) (NN patient)) (VP (VBD underwent) (NP (NN surgery))) (. .))) → SNoW → The patient: .99 None, .01 Problem, .00 Test, .00 Treatment; surgery: .01 None, .09 Problem, .51 Test, .49 Treatment → surgery = Test
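Reading the label off SNoW-style per-phrase scores is a plain argmax; the scores below are copied from the slide, and the classifier itself is not reproduced here:

```python
snow_scores = {
    "The patient": {"None": 0.99, "Problem": 0.01, "Test": 0.00, "Treatment": 0.00},
    "surgery": {"None": 0.01, "Problem": 0.09, "Test": 0.51, "Treatment": 0.49},
}

def snow_label(phrase):
    # Take the highest-scoring label for the phrase.
    return max(snow_scores[phrase], key=snow_scores[phrase].get)

snow_label("surgery")  # → "Test" (SNoW's pick here; the reranker later corrects it)
```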

  34. Concept Extraction: SNoW Note: These F-Scores are from our predicted concepts compared to the “gold” concepts.

  35. Concept Extraction: Reranker Input: the parse (S1 (S (NP (DET The) (NN patient)) (VP (VBD underwent) (NP (NN surgery))) (. .))) plus the SNoW scores (The patient: .99 None, .01 Problem, .00 Test, .00 Treatment; surgery: .01 None, .09 Problem, .51 Test, .49 Treatment) → Reranker → The patient: 1. None; surgery: 1. Treatment 2. Test 3. Problem → surgery = Treatment
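Why the reranker can overturn SNoW: it sees the whole parse, so context such as the governing verb can promote a label. The cue-word table and promotion rule below are invented purely to illustrate the idea and are not the talk's actual features:

```python
def rerank(scores, governing_verb):
    # Hypothetical feature: certain verbs strongly signal a concept type.
    cues = {"underwent": "Treatment", "revealed": "Test"}
    ranked = sorted(scores, key=scores.get, reverse=True)
    favored = cues.get(governing_verb)
    if favored in ranked:
        ranked.remove(favored)
        ranked.insert(0, favored)
    return ranked

rerank({"None": 0.01, "Problem": 0.09, "Test": 0.51, "Treatment": 0.49}, "underwent")
# → ["Treatment", "Test", "Problem", "None"], matching the slide's ranking
```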

  36. Concept Extraction: Reranker

  37. Other Results from i2b2 • Concept • Dependency Parse + External Medical Dictionary • F-Score = 53.8 • Relation • Used Dependency Parses

  38. Recap • Domain mismatch is bad • Constraining the parser decreases domain mismatch • Training new models decreases domain mismatch

  39. Acknowledgments • Kristy Hollingshead • Brian Roark • Richard Sproat • Margit Bowler • Aaron Cohen • Jianji Yang • Kyle Ambert

  40. Thank You… • Kristy Hollingshead • Christian Monson • Kevin Burger • Isaac Wallis • The Interns • All OGI Faculty, Staff, and Students

  41. Questions?

  42. Hierarchical Phrases There is akinesis / dyskinesis and thinning of the mid to distal inferior septum and the apex. (S (NP (EX There)) (VP (VB is) (NP (NP-problem (NN akinesis)) (CC /) (NP-problem (NN dyskinesis))) (CC and) (NP-problem (NN thinning) (PP (IN of) (NP (DT the) (ADJP (JJ mid) (IN to) (JJ distal)) (JJ inferior) (NN septum))) (CC and) (NP (DT the) (NN apex)))))

  43. Statistical Evaluations • Recall = (# correct) / (total) • Precision = (# correct) / (# predicted) • F-Score = (2 × Recall × Precision) / (Precision + Recall)
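The three formulas, computed directly; the counts are invented for the demonstration:

```python
def f_score(correct, total_gold, predicted):
    recall = correct / total_gold      # (# correct) / (total)
    precision = correct / predicted    # (# correct) / (# predicted)
    return 2 * recall * precision / (precision + recall)

f_score(correct=8, total_gold=10, predicted=12)
# recall 0.8, precision 0.667, so F is roughly 0.727
```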
