
Cornerstone I: Representing Knowledge



Presentation Transcript


  1. Cornerstone I: Representing Knowledge • From Data to Knowledge Through Concept-Oriented Terminologies • James J. Cimino

  2. The first step on the path to knowledge is getting things by their right names. -Chinese saying

  3. Overview • What is “data to knowledge”? • Knowledge representation choices • Knowledge-based terminology efforts • Medical Entities Dictionary • Proof of concepts

  4. What is “data to knowledge”? • Start with patient data in the medical record • Enhance knowledge by: • gaining a better understanding of the patient • learning relevant knowledge • bringing smart systems to bear to apply knowledge • discovering new knowledge from health data

  5. Knowledge Representation • Terminology for representing symbols • Format for arranging the symbols

  6. Knowledge Representation Choices • Guideline implementation

  7. Guideline Implementation • Starren and Xie, SCAMC, 1994 • National Cholesterol Education Panel Guideline

  8. National Cholesterol Education Panel Guideline [flowchart] • Start: Measure Cholesterol & Assess Risk Factors • Cholesterol branches: Cholesterol <200; Cholesterol 200 to 239; Cholesterol >239 • Risk branches: HDL <35 or 2 Risks; HDL >35, <2 Risks • Actions: Provide dietary information; Reevaluate in 2 years

  9. Guideline Implementation • Starren and Xie, SCAMC, 1994 • National Cholesterol Education Panel Guideline • Three representations: • PROLOG (first-order logic)

  10. NCEP Guideline in PROLOG rule_j(PID) :- check_lab(PID, hdl, HDL, _), !, HDL >= 35, total_risk(PID, Risk), !, Risk < 2, check_lab(PID, cholesterol, C, _), C >= 200, C =< 239, print_rule_j.

  11. Guideline Implementation • Starren and Xie, SCAMC, 1994 • National Cholesterol Education Panel Guideline • Three representations: • PROLOG (first-order logic) • CLASSIC (frames)

  12. NCEP Guideline in CLASSIC (CL-DEFINE-CONCEPT 'C-PATIENT '(AND (ALL CHOL (AND INTEGER (MIN 200) (MAX 239))))) (CL-DEFINE-CONCEPT 'G-PATIENT '(AND C-PATIENT LOW-RISK-PATIENT (ALL HDL (AND INTEGER (MIN 35)))))

  13. Guideline Implementation • Starren and Xie, SCAMC, 1994 • National Cholesterol Education Panel Guideline • Three representations: • PROLOG (first-order logic) • CLASSIC (frames) • CLIPS (production rules)

  14. NCEP Guideline in CLIPS (defrule C2G2J "Rules to reach box J" ?f1 <- (calculated-patient (state c) (done no) (hdl ?hdl) (name ?name)) (test (>= ?hdl 35)) => (printout t "Patient " ?name " needs treatment"))

  15. Guideline Implementation • Starren and Xie, SCAMC, 1994 • National Cholesterol Education Panel Guideline • Three representations: • PROLOG (first-order logic) • CLASSIC (frames) • CLIPS (production rules) • “All three representations proved adequate for encoding the guideline”
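For comparison with the three encodings above, the same condition can be written as an ordinary predicate. This is a minimal sketch in Python, not one of the representations Starren and Xie evaluated. The `Patient` class and its field names are hypothetical, and the thresholds are copied from the PROLOG `rule_j` slide (HDL at least 35, fewer than 2 risk factors, cholesterol from 200 to 239).

```python
from dataclasses import dataclass


@dataclass
class Patient:
    """Hypothetical patient record; field names are assumptions for this sketch."""
    name: str
    cholesterol: int   # total cholesterol
    hdl: int           # HDL cholesterol
    risk_factors: int  # number of risk factors present


def rule_j(p: Patient) -> bool:
    """Return True when the conditions of the PROLOG rule_j all hold."""
    return (
        p.hdl >= 35
        and p.risk_factors < 2
        and 200 <= p.cholesterol <= 239
    )


borderline = Patient(name="example", cholesterol=220, hdl=40, risk_factors=1)
high = Patient(name="example2", cholesterol=250, hdl=40, risk_factors=1)
print(rule_j(borderline), rule_j(high))
```

The predicate captures only the test, not the action; what to do when it fires (for example, print the advice for box J) is left to the caller, as `print_rule_j` does in the PROLOG version.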

  16. Knowledge Representation Choices • Guideline implementation • Terminologic knowledge

  17. Terminology Representation Choices • Frame-based

  18. Frame-Based Representation Serum Glucose Test is-a: Lab Test Measures: Glucose Specimen: Serum Units: “mg/dl”
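One way to picture the frame above is as a dictionary of slots, with `is-a` pointing at a parent frame from which missing slots are inherited. The sketch below is an illustration under that assumption, not the format of any particular frame system. The `Reported-In` slot on `Lab Test` is hypothetical and exists only to show inheritance.

```python
# Each frame is a dict of slot -> value; "is-a" names the parent frame.
FRAMES = {
    "Lab Test": {
        "is-a": None,
        "Reported-In": "Laboratory Report",  # hypothetical slot, for illustration
    },
    "Serum Glucose Test": {
        "is-a": "Lab Test",
        "Measures": "Glucose",
        "Specimen": "Serum",
        "Units": "mg/dl",
    },
}


def get_slot(frames, name, slot):
    """Look up a slot on a frame, falling back to ancestors via is-a."""
    while name is not None:
        frame = frames[name]
        if slot in frame:
            return frame[slot]
        name = frame.get("is-a")
    return None


def is_a(frames, name, ancestor):
    """Return True if `ancestor` is `name` or reachable from it via is-a."""
    while name is not None:
        if name == ancestor:
            return True
        name = frames[name].get("is-a")
    return False


print(get_slot(FRAMES, "Serum Glucose Test", "Specimen"))
print(get_slot(FRAMES, "Serum Glucose Test", "Reported-In"))
print(is_a(FRAMES, "Serum Glucose Test", "Lab Test"))
```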

  19. Terminology Representation Choices • Frame-based • Semantic network

  20. Semantic Network Representation [diagram] • Nodes: Serum Glucose Test, Lab Test, Glucose, Chemical, Serum, Body Substance • Link types: is-a (three links), measures, specimen

  21. Terminology Representation Choices • Frame-based • Semantic network • Conceptual graphs

  22. Conceptual Graph Representation [Serum Glucose Test] - (is-a) -> [Lab Test] (measures) -> [Glucose] (specimen) -> [Serum]
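The conceptual graph above can be held as a list of (concept, relation, concept) triples and printed back in the same linear notation. This is a rough sketch assuming a flat triple store; the names `TRIPLES`, `linear_form`, and `related` are hypothetical, and the sketch handles only relations that start at a single head concept.

```python
# Each triple is (source concept, relation, target concept).
TRIPLES = [
    ("Serum Glucose Test", "is-a", "Lab Test"),
    ("Serum Glucose Test", "measures", "Glucose"),
    ("Serum Glucose Test", "specimen", "Serum"),
]


def linear_form(concept, triples):
    """Render all relations leaving `concept` in linear notation."""
    relations = [
        f"({rel}) -> [{target}]"
        for source, rel, target in triples
        if source == concept
    ]
    return f"[{concept}] - " + " ".join(relations)


def related(concept, relation, triples):
    """Return the targets reached from `concept` through `relation`."""
    return [t for s, r, t in triples if s == concept and r == relation]


print(linear_form("Serum Glucose Test", TRIPLES))
print(related("Serum Glucose Test", "measures", TRIPLES))
```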

  23. Terminology Representation Choices • Frame-based • Semantic network • Conceptual graphs

  24. Knowledge Representation Choices • Guideline implementation • Terminologic knowledge

  25. Knowledge Representation • Terminology for representing symbols • Format for arranging the symbols • Terminology and format for representing terminologic knowledge

  26. Knowledge-Based Terminology Efforts • Jochen Bernauer, SCAMC, 1991

  27. Jochen Bernauer, SCAMC, 1991 • Conceptual graphs to model findings • [Diagram] Concepts: increased_uptake, femur, right, bone_phase • Relations: site, site_attr, during

  28. Knowledge-Based Terminology Efforts • Jochen Bernauer, SCAMC, 1991 • Rector, Nolan and Glowinski, SCAMC, 1993

  29. Rector, Nolan and Glowinski, SCAMC, 1993 • GALEN project • conditions grammatically haveLocation bodyparts • fractures sensibly haveLocation bones • femurs sensiblyAndNecessarily haveDivision neck

  30. Knowledge-Based Terminology Efforts • Jochen Bernauer, SCAMC, 1991 • Rector, Nolan and Glowinski, SCAMC, 1993 • Campbell and Musen, SCAMC, 1993

  31. Campbell and Musen, SCAMC, 1993 • Conceptual graphs and SNOMED • Pain + Chest + Radiation to + Left + Arm [Pain] - (located in) -> [Chest] (radiating to) -> [Arm] -> (with laterality) -> [Left]

  32. Knowledge-Based Terminology Efforts • Jochen Bernauer, SCAMC, 1991 • Rector, Nolan and Glowinski, SCAMC, 1993 • Campbell and Musen, SCAMC, 1993 • Lindberg, Humphreys, McCray, Methods 1993

  33. Lindberg, Humphreys, McCray, Methods 1993 • Unified Medical Language System • [Diagram] Concept, with two Lexical groups, each with two Strings

  34. Knowledge-Based Terminology Efforts • Jochen Bernauer, SCAMC, 1991 • Rector, Nolan and Glowinski, SCAMC, 1993 • Campbell and Musen, SCAMC, 1993 • Lindberg, Humphreys, McCray, Methods 1993 • Rocha, Huff, et al., CBM, 1994

  35. Rocha, Huff, et al., CBM, 1994 • VOSER • A server architecture for managing terminologic knowledge

  36. Knowledge-Based Terminology Efforts • Jochen Bernauer, SCAMC, 1991 • Rector, Nolan and Glowinski, SCAMC, 1993 • Campbell and Musen, SCAMC, 1993 • Lindberg, Humphreys, McCray, Methods 1993 • Rocha, Huff, et al., CBM, 1994 • Campbell, Cohn, Chute, et al., SCAMC 1996

  37. Campbell, Cohn, Chute, et al., SCAMC 1996 • Convergent Medical Terminology • SNOMED/Kaiser/Mayo • Galapagos

  38. Knowledge-Based Terminology Efforts • Jochen Bernauer, SCAMC, 1991 • Rector, Nolan and Glowinski, SCAMC, 1993 • Campbell and Musen, SCAMC, 1993 • Lindberg, Humphreys, McCray, Methods 1993 • Rocha, Huff, et al., CBM, 1994 • Campbell, Cohn, Chute, et al., SCAMC 1996 • Brown, O’Neil and Price, Methods, 1997

  39. Brown, O’Neil and Price, Methods, 1997 • Read Codes • Representation with GALEN model

  40. Knowledge-Based Terminology Efforts • Jochen Bernauer, SCAMC, 1991 • Rector, Nolan and Glowinski, SCAMC, 1993 • Campbell and Musen, SCAMC, 1993 • Lindberg, Humphreys, McCray, Methods 1993 • Rocha, Huff, et al., CBM, 1994 • Campbell, Cohn, Chute, et al., SCAMC 1996 • Brown, O’Neil and Price, Methods, 1997 • Spackman, Campbell, and Côte, SCAMC 1997

  41. Spackman, Campbell, and Côte, SCAMC 1997 • SNOMED RT (Reference Terminology) • Convergent Medical Terminology • Description Logic Format

  42. Knowledge-Based Terminology Efforts • Jochen Bernauer, SCAMC, 1991 • Rector, Nolan and Glowinski, SCAMC, 1993 • Campbell and Musen, SCAMC, 1993 • Lindberg, Humphreys, McCray, Methods 1993 • Rocha, Huff, et al., CBM, 1994 • Campbell, Cohn, Chute, et al., SCAMC 1996 • Brown, O’Neil and Price, Methods, 1997 • Spackman, Campbell, and Côte, SCAMC 1997 • Huff, Rocha, McDonald, et al., JAMIA 1998

  43. Huff, Rocha, McDonald, et al., JAMIA 1998 • Logical Observation Identifiers Names and Codes (LOINC) • 4764-5 | GLUCOSE^3H POST 100 G GLUCOSE PO | SCNC | PT | SER/PLAS | QN |
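The LOINC entry on this slide is a code followed by pipe-delimited name parts. A small parser makes the structure explicit. This is a sketch only: the class and field names are assumptions chosen to match the usual LOINC axes (component, property, timing, system, scale), and the parser handles just the six-field layout shown on the slide.

```python
from typing import NamedTuple


class LoincName(NamedTuple):
    """Field names are assumptions for this sketch, not an official schema."""
    code: str
    component: str
    prop: str
    timing: str
    system: str
    scale: str


def parse_loinc(line: str) -> LoincName:
    """Split a pipe-delimited LOINC line like the one on the slide."""
    parts = [p.strip() for p in line.split("|")]
    # A trailing "|" leaves an empty final element; drop it.
    if parts and parts[-1] == "":
        parts.pop()
    if len(parts) != 6:
        raise ValueError(f"expected 6 fields, got {len(parts)}")
    return LoincName(*parts)


entry = parse_loinc(
    "4764-5 | GLUCOSE^3H POST 100 G GLUCOSE PO | SCNC | PT | SER/PLAS | QN|"
)
print(entry.code, entry.system, entry.scale)
```

Treating each part as a separate field is what lets a terminology server answer questions such as "all tests whose system is SER/PLAS" without parsing the display name.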

  44. Knowledge-Based Terminology Efforts • Jochen Bernauer, SCAMC, 1991 • Rector, Nolan and Glowinski, SCAMC, 1993 • Campbell and Musen, SCAMC, 1993 • Lindberg, Humphreys, McCray, Methods 1993 • Rocha, Huff, et al., CBM, 1994 • Campbell, Cohn, Chute, et al., SCAMC 1996 • Brown, O’Neil and Price, Methods, 1997 • Spackman, Campbell, and Côte, SCAMC 1997 • Huff, Rocha, McDonald, et al., JAMIA 1998 • Pharmacy system knowledge base vendors

  45. Pharmacy System Knowledge Base Vendors [diagram] • Nodes: International Package Identifiers, Country-Specific Packaged Product, Manufactured Components, Trademark Drug, Composite Trademark Drug, Clinical Drug, Composite Clinical Drug, Not-Fully-Specified Drug, Ingredient, Ingredient Class, Drug Class • Link type: is-a

  46. Knowledge-Based Terminology Efforts • Jochen Bernauer, SCAMC, 1991 • Rector, Nolan and Glowinski, SCAMC, 1993 • Campbell and Musen, SCAMC, 1993 • Lindberg, Humphreys, McCray, Methods 1993 • Rocha, Huff, et al., CBM, 1994 • Campbell, Cohn, Chute, et al., SCAMC 1996 • Brown, O’Neil and Price, Methods, 1997 • Spackman, Campbell, and Côte, SCAMC 1997 • Huff, Rocha, McDonald, et al., JAMIA 1998 • Pharmacy system knowledge base vendors

  47. Medical Entities Dictionary (MED) • New York Presbyterian Hospital • 60,000 concepts (procedures, results, drugs, problems) • 208,242 synonyms • 84,677 hierarchical links • 113,906 semantic links • 238,040 other attributes • 66,404 translations (ICD-9-CM, LOINC, MeSH, UMLS)

  48. Central Controlled Terminology

  49. MED Data Structures • Semantic network

  50. MED Semantic Network [diagram] • Nodes: Medical Entity, Substance, Chemical, Carbohydrate, Glucose, Bioactive Substance, Anatomic Substance, Plasma, Laboratory Specimen, Plasma Specimen, Event, Diagnostic Procedure, Laboratory Procedure, CHEM-7, Laboratory Test, Plasma Glucose • Link types: Substance Sampled, Has Specimen, Part of, Substance Measured