Semantic Interpretation of Medical Text


Semantic Interpretation of Medical Text

Barbara Rosario, SIMS

Steve Tu, UC Berkeley

Advisor: Marti Hearst, SIMS

Semantic Interpretation of Medical Text
  • More accurate representation of the content of the input text
  • Enhance the text with information (concepts, relationships) drawn from a medical knowledge source
  • Determine semantic meaning of the words (and bigger constructs) and the relationships between them.
Combine Statistical and Symbolic Methods
  • Use of knowledge bases, semantic hierarchies, medical knowledge, rules
  • Use of statistical methods and machine learning techniques
Statistical methods
  • Disambiguation
  • Detection of semantic patterns
  • Classification of semantically related constructs
  • Degrees (weights, probabilities)
First Experiment: Noun Compounds and MeSH
  • Interpretation of noun compounds is crucially semantic
  • Noun compounds extracted from a collection of titles and abstracts of medical journals found in Medline
  • MeSH (Medical Subject Headings) concepts for the labels
Input: Medline Text File
  • Preprocessing
  • Tagger
  • Noun Compound Extraction
  • Semantic Labeling (labels drawn from MeSH)
Output: Semantically Labelled Noun Compounds
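The extraction step of this pipeline can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the tagger emits (word, tag) pairs with Penn Treebank style noun tags, and `extract_noun_compounds` is a hypothetical helper that keeps every maximal run of at least two consecutive nouns. The tagged title is invented for the example.

```python
from typing import List, Tuple

# Assumption: the tagger uses Penn Treebank style tags for common nouns.
NOUN_TAGS = {"NN", "NNS"}


def extract_noun_compounds(
    tagged: List[Tuple[str, str]], min_len: int = 2
) -> List[List[str]]:
    """Return maximal runs of consecutive nouns of length >= min_len."""
    compounds: List[List[str]] = []
    run: List[str] = []
    for word, tag in tagged:
        if tag in NOUN_TAGS:
            run.append(word.lower())
        else:
            # A non-noun ends the current run; keep it if long enough.
            if len(run) >= min_len:
                compounds.append(run)
            run = []
    # Flush a run that reaches the end of the input.
    if len(run) >= min_len:
        compounds.append(run)
    return compounds


# Hypothetical tagged title, made up for illustration.
tagged_title = [
    ("Prevention", "NN"), ("of", "IN"), ("migraine", "NN"),
    ("headache", "NN"), ("recurrence", "NN"), ("in", "IN"),
    ("adults", "NNS"),
]
print(extract_noun_compounds(tagged_title))
```

Single nouns such as "Prevention" are dropped because only runs of two or more nouns count as compounds in this sketch.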

MeSH Tree Structures (main)

1. Anatomy [A]

2. Organisms [B]

3. Diseases [C]

4. Chemicals and Drugs [D]

5. Analytical, Diagnostic and Therapeutic Techniques and Equipment [E]

6. Psychiatry and Psychology [F]

7. Biological Sciences [G]

8. Physical Sciences [H]

9. Anthropology, Education, Sociology and Social Phenomena [I]

10. Technology and Food and Beverages [J]

11. Humanities [K]

12. Information Science [L]

13. Persons [M]

14. Health Care [N]

15. Geographic Locations [Z]

MeSH Tree Structures (node A expanded)

1. Anatomy [A]
  • Body Regions [A01] +
  • Musculoskeletal System [A02] +
  • Digestive System [A03] +
  • Respiratory System [A04] +
  • Urogenital System [A05] +
  • Endocrine System [A06] +
  • Cardiovascular System [A07] +
  • Nervous System [A08] +
  • Sense Organs [A09] +
  • Tissues [A10] +
  • Cells [A11] +
  • Fluids and Secretions [A12] +
  • Animal Structures [A13] +
  • Stomatognathic System [A14] +
  • Hemic and Immune Systems [A15] +
  • Embryonic Structures [A16] +

Body Regions [A01]
  • Abdomen [A01.047]
    • Groin [A01.047.365]
    • Inguinal Canal [A01.047.412]
    • Peritoneum [A01.047.596] +
    • Retroperitoneal Space [A01.047.681]
    • Umbilicus [A01.047.849]
  • Axilla [A01.133]
  • Back [A01.176] +
  • Breast [A01.236] +
  • Buttocks [A01.258]
  • Extremities [A01.378] +
  • Head [A01.456] +
  • Neck [A01.598]
  • Pelvis [A01.673] +
  • Perineum [A01.719]
  • Skin [A01.835] +
  • Thorax [A01.911] +
  • Viscera [A01.960]
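Because each dot-separated component of a MeSH tree number adds one level of depth, the hierarchy shown for node A can be recovered from the codes alone. The sketch below illustrates this with string operations; the helper names (`top_level_category`, `ancestors`, `is_descendant`) are hypothetical and introduced only for this example.

```python
from typing import List


def top_level_category(tree_number: str) -> str:
    """Return the tree letter, e.g. 'A' for Anatomy."""
    return tree_number[0]


def ancestors(tree_number: str) -> List[str]:
    """Return every proper prefix of the tree number, shallowest first."""
    parts = tree_number.split(".")
    return [".".join(parts[:i]) for i in range(1, len(parts))]


def is_descendant(child: str, parent: str) -> bool:
    """True if `child` lies strictly below `parent` in the tree."""
    return child.startswith(parent + ".")


groin = "A01.047.365"
print(top_level_category(groin))         # A
print(ancestors(groin))                  # ['A01', 'A01.047']
print(is_descendant(groin, "A01.047"))   # True: Groin is under Abdomen
print(is_descendant("A01.133", "A01.047"))  # False: Axilla is not under Abdomen
```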
Mapping Nouns to MeSH Concepts
  • Ex: migraine headache recurrence
More Noun Compounds
  • migraine headache recurrence
    • C10.228.140.546.800.525 C23.888.592.612.441 C23.550.291.937
  • blood plasma perfusion
    • A12.207.152 A15.145.693 E05.680
  • migraine headache pain
    • C10.228.140.546.800.525 C23.888.592.612.441 G11.561.796.444
  • brain stem neurons
    • A08.186.211 E05.595.402.541.250 A08.663
  • rat liver mitochondria
    • B02.649.865.635.560 A03.620 A11.368.702.564
  • plasma arginine vasopressin
    • A15.145.693 D12.125.095.104 D06.472.734.692.781
  • rat thyroid cells
    • B02.649.865.635.560 A06.407.900 A11
  • growth hormone secretion
    • G07.553.481 D27.505.440.472 A12.200
  • blood urea nitrogen
    • A12.207.152 D02.948 D01.362.625
  • breast cancer cells
    • A01.236 C04 A11
  • cancer cell lines
    • C04 A11 G05.331.599.110.708.330.800.400
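The labeling of these compounds amounts to a per-noun lookup. The sketch below assumes that the i-th code listed for a compound belongs to its i-th noun, which is consistent with the repeated codes above (for example, "blood" and "plasma" receive the same code in every compound where they occur). The table `NOUN_TO_MESH` and the function `label_compound` are hypothetical names, and the table holds only a handful of entries copied from the slide.

```python
from typing import Dict, List, Optional, Tuple

# Entries copied from the slide, assuming positional alignment of
# nouns and codes. A full system would load these from MeSH itself.
NOUN_TO_MESH: Dict[str, str] = {
    "migraine": "C10.228.140.546.800.525",
    "headache": "C23.888.592.612.441",
    "recurrence": "C23.550.291.937",
    "pain": "G11.561.796.444",
    "blood": "A12.207.152",
    "plasma": "A15.145.693",
    "perfusion": "E05.680",
}


def label_compound(compound: str) -> List[Tuple[str, Optional[str]]]:
    """Pair each noun with its tree number, or None if it is unknown."""
    return [(noun, NOUN_TO_MESH.get(noun)) for noun in compound.split()]


print(label_compound("migraine headache pain"))
print(label_compound("blood test"))  # 'test' is not in the toy table
```

A noun missing from the table maps to `None`, which is one simple way to flag non-medical words. A MeSH descriptor can also appear at more than one place in the tree; this sketch keeps a single tree number per noun for brevity.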
Attachment and Semantic Interpretation
  • Attachment classification
    • “acute migraine treatment” [[N N] N] (LA)
    • “intra-nasal migraine treatment” [N [N N]] (RA)
  • To bootstrap semantic interpretation
  • Decision tree (Quinlan)
Levels of Descriptions
  • migraine headache recurrence (LA)
    • C10.228.140.546.800.525 C23.888.592.612.441 C23.550.291.937
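One way to obtain coarser or finer descriptions of the same compound is to truncate each tree number to a fixed depth. The sketch below is an assumption about what "levels" could mean in practice, not a method stated on the slide; `truncate` and `describe` are hypothetical helpers, and the convention that level 0 means the tree letter alone is chosen here for convenience.

```python
from typing import List


def truncate(tree_number: str, level: int) -> str:
    """Keep the tree letter (level 0) or the first `level` components."""
    if level == 0:
        return tree_number[0]
    return ".".join(tree_number.split(".")[:level])


def describe(codes: List[str], level: int) -> List[str]:
    """Describe a compound at the given level of the hierarchy."""
    return [truncate(code, level) for code in codes]


# migraine headache recurrence, codes as given above
codes = [
    "C10.228.140.546.800.525",
    "C23.888.592.612.441",
    "C23.550.291.937",
]
print(describe(codes, 0))  # ['C', 'C', 'C']
print(describe(codes, 1))  # ['C10', 'C23', 'C23']
print(describe(codes, 2))  # ['C10.228', 'C23.888', 'C23.550']
```

Coarser levels group more compounds together at the cost of detail, which matters when the descriptions are used as features for a classifier.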
Expressiveness of Decision Trees
  • first noun tree = B: ra (33.0/3.7)
  • first noun tree = E: ra (2.0/1.6)
  • first noun tree = F: la (0.0)
  • first noun tree = G: la (4.0/0.3)
  • first noun tree = A:
  • | second noun tree = B: la (0.0)
  • | second noun tree = D: la (4.0/0.3)
  • | second noun tree = E: la (10.0/0.4)
  • | second noun tree = F: la (0.0)
  • | second noun tree = G: la (6.0/1.6)
  • | second noun tree = A:
  • | | first tree position <= 4 : ra (7.0/1.6)
  • | | first tree position > 4 : la (36.0/5.8)
  • | second noun tree = C:
  • | | third noun tree = A: ra (9.0/0.3)
  • | | third noun tree = B: la (0.0)
  • | | third noun tree = D: la (1.0/0.3)
  • | | third noun tree = E: la (5.0/0.3)
  • | | third noun tree = F: la (0.0)
  • | | third noun tree = G: ra (2.0/1.6)
  • | | third noun tree = C:
  • | | | third tree position <= 21 : ra (5.0/2.6)
  • | | | third tree position > 21 : la (5.0/0.3)
  • first noun tree = C:
  • …..
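The branches printed above can be read as nested rules. The sketch below transcribes only the branches that are shown and returns `None` wherever the slide is silent or elided (for example, everything under "first noun tree = C"). It assumes that "tree position" means the number that follows the tree letter in the first component of a code (so A12.207.152 has position 12); the slide does not define the term, so this reading is a guess. The function names are hypothetical.

```python
from typing import Optional, Sequence


def tree(code: str) -> str:
    """Tree letter of a MeSH tree number, e.g. 'A'."""
    return code[0]


def position(code: str) -> int:
    """Assumed meaning of 'tree position': 12 for 'A12.207.152'."""
    return int(code.split(".")[0][1:])


def predict_attachment(codes: Sequence[str]) -> Optional[str]:
    """Return 'la', 'ra', or None for branches not shown on the slide."""
    first, second, third = codes
    t1 = tree(first)
    if t1 in ("B", "E"):
        return "ra"
    if t1 in ("F", "G"):
        return "la"
    if t1 == "A":
        t2 = tree(second)
        if t2 in ("B", "D", "E", "F", "G"):
            return "la"
        if t2 == "A":
            return "ra" if position(first) <= 4 else "la"
        if t2 == "C":
            t3 = tree(third)
            if t3 in ("A", "G"):
                return "ra"
            if t3 in ("B", "D", "E", "F"):
                return "la"
            if t3 == "C":
                return "ra" if position(third) <= 21 else "la"
    return None  # branch elided or absent in the printed tree


# rat liver mitochondria: first noun is in tree B
print(predict_attachment(["B02.649.865.635.560", "A03.620", "A11.368.702.564"]))
# blood plasma perfusion: A, A, first position 12 > 4
print(predict_attachment(["A12.207.152", "A15.145.693", "E05.680"]))
# migraine headache recurrence: the C branch is elided on the slide
print(predict_attachment(
    ["C10.228.140.546.800.525", "C23.888.592.612.441", "C23.550.291.937"]
))
```

The three calls print `ra`, `la`, and `None`. These are the outputs of the transcribed rules, not gold-standard attachment labels.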
Semantic Interpretation
  • Use decision tree paths for the detection of clusters of noun compounds with the same semantic interpretation
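As a rough illustration of the clustering idea, the sketch below groups compounds by the sequence of top-level MeSH trees of their nouns. This signature is a simplified stand-in for a full decision tree path (which may also test tree positions), so it should be read as an assumption rather than the method used in the experiment. The compounds and codes are taken from the examples earlier in the talk; `signature` and `cluster_by_signature` are hypothetical names.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

COMPOUNDS: Dict[str, List[str]] = {
    "migraine headache recurrence": [
        "C10.228.140.546.800.525", "C23.888.592.612.441", "C23.550.291.937"],
    "migraine headache pain": [
        "C10.228.140.546.800.525", "C23.888.592.612.441", "G11.561.796.444"],
    "blood plasma perfusion": ["A12.207.152", "A15.145.693", "E05.680"],
    "rat liver mitochondria": [
        "B02.649.865.635.560", "A03.620", "A11.368.702.564"],
    "rat thyroid cells": ["B02.649.865.635.560", "A06.407.900", "A11"],
}


def signature(codes: List[str]) -> Tuple[str, ...]:
    """Top-level tree letter of each noun, in order."""
    return tuple(code[0] for code in codes)


def cluster_by_signature(
    compounds: Dict[str, List[str]]
) -> Dict[Tuple[str, ...], List[str]]:
    """Group compound names that share the same signature."""
    clusters: Dict[Tuple[str, ...], List[str]] = defaultdict(list)
    for name, codes in compounds.items():
        clusters[signature(codes)].append(name)
    return dict(clusters)


for sig, names in cluster_by_signature(COMPOUNDS).items():
    print(sig, names)
```

Here "rat liver mitochondria" and "rat thyroid cells" fall into the same (B, A, A) cluster, which matches the intuition that both name a cellular or subcellular structure of an organ of an organism.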
From MeSH to UMLS
  • Unified Medical Language System, a project of the U.S. National Library of Medicine
  • 3 UMLS Knowledge Sources
    • Metathesaurus
    • Semantic Network
    • SPECIALIST lexicon and programs
Metathesaurus
  • Most extensive of UMLS sources
  • 730,000 concepts representing more than 1,500,000 strings in over 60 vocabularies and classifications
  • Organized by concept or meaning.
    • In essence, its purpose is to link alternative names and views of the same concept together and to identify useful relationships between different concepts.
  • Relationships in the Metathesaurus come from the sources themselves or are created by the Metathesaurus editors.
Semantic Network
  • Consistent categorization of all concepts represented in the UMLS Metathesaurus and the important relationships between them.
  • Every concept has been assigned a semantic type.
  • The semantic types (134) are the nodes in the Network, and the relationships between them are the links (54)
  • High level semantic structure
Noun Compounds, again
  • Very preliminary studies…
  • Can we use the information in the Semantic Network for the semantic interpretation of noun compounds?
  • Are semantic types and relationships good descriptors? Are they useful for disambiguation and classification?
Mapping Words - Semantic Types, Semantic Relationships
  • Semantic types correctly assigned (on 246 noun compounds, 738 nouns): 59%
  • Semantic types disambiguated by the relationships
    • Doesn’t disambiguate: 42.7%
    • Disambiguates wrong: 17.3%
    • Disambiguates correctly: 40%
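The idea of letting relationships disambiguate semantic types can be sketched as a filter over candidate type pairs: when two adjacent nouns each have several candidate types, keep only the pairs that the network links. Everything in the sketch below is an illustrative assumption. The type names resemble UMLS semantic types, but the `ALLOWED_LINKS` table is a toy set that was not checked against the Semantic Network files, and `disambiguate` is a hypothetical function.

```python
from itertools import product
from typing import List, Optional, Set, Tuple

# Toy set of (type_a, type_b) pairs assumed to be linked by some
# relationship. Illustrative only; not extracted from the UMLS.
ALLOWED_LINKS: Set[Tuple[str, str]] = {
    ("Pharmacologic Substance", "Disease or Syndrome"),
    ("Body Part, Organ, or Organ Component", "Cell"),
}


def disambiguate(
    candidates_a: List[str],
    candidates_b: List[str],
    links: Set[Tuple[str, str]],
) -> Optional[Tuple[str, str]]:
    """Return the unique linked type pair, or None if there is no
    linked pair or more than one (i.e. no disambiguation)."""
    compatible = [
        (a, b) for a, b in product(candidates_a, candidates_b)
        if (a, b) in links
    ]
    return compatible[0] if len(compatible) == 1 else None


# The first noun is ambiguous between two types; only one of them
# is linked to the type of the second noun in the toy table.
print(disambiguate(
    ["Pharmacologic Substance", "Mammal"],
    ["Disease or Syndrome"],
    ALLOWED_LINKS,
))
```

A `None` result corresponds to the "doesn't disambiguate" case above. Telling a correct disambiguation from a wrong one requires hand-labelled data, which this sketch does not model.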
(Some of) Future Work
  • Explore in more depth UMLS sources
  • What forms the best basis for automatic semantic interpretation of noun phrases?
    • Semantic types?
    • Metathesaurus concepts? (And which parts of them?)
    • Just MeSH concepts?
      • Machine Learning algorithms to help choose a good representation of medical terms
Future Work
  • Machine learning algorithms for classification
  • Can we generalize patterns found for noun compounds to other syntactic structures, and if so, how?
  • How can we best formally represent semantics?
  • How can we combine symbolic rules with statistical methods?
  • How can we deal with non-medical words?
    • Can the system help us disambiguate them?
    • Should we use other ontologies (e.g., WordNet)?