
Scale and Context: Issues in Ontologies to link Health- and Bio-Informatics

Alan Rector, Jeremy Rogers, Angus Roberts, Chris Wroe. Bio and Health Informatics Forum / Medical Informatics Group, Department of Computer Science, University of Manchester


Presentation Transcript


  1. Scale and Context: Issues in Ontologies to link Health- and Bio-Informatics • Alan Rector, Jeremy Rogers, Angus Roberts, Chris Wroe • Bio and Health Informatics Forum / Medical Informatics Group, Department of Computer Science, University of Manchester • rector@cs.man.ac.uk, www.cs.man.ac.uk/mig, img.man.ac.uk, www.clinical-escience.org, www.opengalen.org

  2. Organisation of Talk • Informal presentation, motivation & examples • Intro to logic based ontologies • How to use logic based ontologies to represent scales and context • Making context modular – normalisation • Recurrent distinctions • and tests for those distinctions • Making logic based ontologies usable • Views and Intermediate Representations • Summary

  3. Example Problems of Context • Classification by multiple axes • e.g. Molecular action, physiologic, and pathological effects • Chloride transport & Cystic fibrosis • Biological Scope • e.g. Normal/Abnormal, Human/Mouse • Conceptual view • e.g. the Digital Anatomist Foundational Model of organs vs Clinical convention – Is the pericardium a part of the heart?

  4. Basic Approach • Separate information into independent modules • Normalise the ontology • “The truth, the whole truth, and nothing but the truth” • Add explicit contextual information • Don’t distort the structure • Add context to it explicitly

  5. Why use Logic-based Ontologies? • because Knowledge is Fractal! & Requirements are Diverse • Coherence without Uniformity!

  6. Logic-based Ontologies: Conceptual Lego • [Figure: building blocks labelled hand, extremity, body, Lung, inflammation, infection, abnormal, normal, gene, protein, cell, expression, chronic, acute, bacterial, deletion, polymorphism, ischaemic]

  7. Logic-based Ontologies: Conceptual Lego • “SNPolymorphism of CFTRGene causing Defect in MembraneTransport of ChlorideIon causing Increase in Viscosity of Mucus in CysticFibrosis…” • “Hand which is anatomically normal”

  8. Logic based ontologies • A formalisation of semantic nets, frame systems, and object hierarchies via KL-ONE and KRL • “is-kind-of” = “implies” (“logical subsumption”) • “Dog is a kind of wolf” means “All dogs are wolves” • Modern examples: DAML+OIL (“OWL”?) • Older variants: LOOM, CLASSIC, BACK, GRAIL, K-REP, …
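
The reading of “is-kind-of” as implication on slide 8 can be sketched in a few lines of Python. This is a minimal illustration, not how a description logic reasoner works: the `KIND_OF` table and the `subsumes` function are hypothetical names, only the Dog/Wolf pair comes from the slide, and the hierarchy is assumed to be acyclic.

```python
# Toy "is-kind-of" assertions: each class lists its asserted parents.
# "Dog" -> "Wolf" is the slide's example; the other entries are assumptions.
KIND_OF = {
    "Dog": {"Wolf"},
    "Wolf": {"Mammal"},
    "Mammal": {"Animal"},
    "Animal": set(),
}


def subsumes(general, specific, kind_of=KIND_OF):
    """Return True if being a `specific` implies being a `general`.

    Subsumption is reflexive and transitive, so we follow the asserted
    parent links upwards from `specific` (assumes no cycles).
    """
    if general == specific:
        return True
    return any(
        subsumes(general, parent, kind_of)
        for parent in kind_of.get(specific, ())
    )


if __name__ == "__main__":
    print(subsumes("Mammal", "Dog"))  # follows Dog -> Wolf -> Mammal
    print(subsumes("Dog", "Mammal"))  # implication does not run backwards
```

The point of the sketch is only that “All dogs are wolves” licenses inference in one direction, and that anything true of every wolf is then true of every dog.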

  9. Logic Based Ontologies: The basics • Panels: Primitives, Descriptions, Definitions, Reasoning, Validating (constraining cross products) • [Figure: hierarchy labels Thing, Feature, Structure, pathological, red, Heart, MitralValve, Encrustation; expressions shown: “MitralValve * ALWAYS partOf: Heart”, “Encrustation * ALWAYS feature: pathological”, “Thing + feature: pathological”, “Structure + feature: pathological + involves: Heart”, “Encrustation + involves: MitralValve”, “red + partOf: Heart”, “red + partOf: Heart + (feature: pathological)”]

  10. Bridging Bio and Health Informatics • Define concepts with ‘pieces’ from different scales and disciplines and then combine them • “Polymorphism which causes defect which causes disease” • Use concepts which make context explicit • “‘Hand which is anatomically normal’ → has five fingers” • “‘Normal human prostate’ → has three lobes” • Use different subproperties for different contexts • “Abnormalities of clinical parts of the heart”

  11. Bridging Scales with Ontologies • [Figure: scales labelled Species, Genes, Function, Disease, linked by the expressions below] • Protein • CFTRGene in humans • Protein coded by (CFTRgene & in humans) • Membrane transport mediated by (Protein coded by (CFTRgene in humans)) • Disease caused by (abnormality in (Membrane transport mediated by (Protein coded by (CFTRgene & in humans))))

  12. Use composition to express context • Normal and abnormal • Hand → isSubdivisionOf some UpperExtremity • Hand & AnatomicallyNormal → hasSubdivision exactly-5 fingers • Homologies and Orthologies • Thumb of Hand of Human → hasFeature Opposable • Thumb of Hand of NonHumanPrimate → ¬hasFeature Opposable

  13. More detailed example • [Figure: three diagrams. mammal: male Body, some Prostate. human: male Body, Prostate, Lobes L1, L2, L3, with cardinalities =1 and =3. mouse: male Body, Prostate, P1 to P5, with cardinality =5.]

  14. Represent context and views by variant properties • [Figure: Organ (Heart) and OrganPart (CardiacValve, Pericardium) linked by is_part_of and its variants is_structurally_part_of and is_clinically_part_of; classes shown: “Disease of part_of Heart”, “Disease of Pericardium”]
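
One way to picture the variant properties on slide 14 is as sub-properties of a general is_part_of, so that a query can choose its view. The sketch below is an assumption-laden toy: the slide only names the three properties and the classes, so the specific facts (CardiacValve structurally part of Heart, Pericardium only clinically part of Heart) and the helper names `property_implies` and `parts_of` are illustrative guesses rather than the authors' model.

```python
# Assumed property hierarchy: both variants specialise is_part_of.
SUBPROPERTY_OF = {
    "is_structurally_part_of": "is_part_of",
    "is_clinically_part_of": "is_part_of",
}

# Assumed facts, chosen to mirror the "is the pericardium part of the
# heart?" question raised earlier in the talk.
FACTS = [
    ("CardiacValve", "is_structurally_part_of", "Heart"),
    ("Pericardium", "is_clinically_part_of", "Heart"),
]


def property_implies(prop, target):
    """True if `prop` is `target` or a (transitive) sub-property of it."""
    while prop is not None:
        if prop == target:
            return True
        prop = SUBPROPERTY_OF.get(prop)
    return False


def parts_of(whole, view="is_part_of"):
    """List everything that is part of `whole` under the chosen view."""
    return sorted(
        subject
        for subject, prop, obj in FACTS
        if obj == whole and property_implies(prop, view)
    )


if __name__ == "__main__":
    print(parts_of("Heart"))                              # general view
    print(parts_of("Heart", "is_structurally_part_of"))   # structural view
    print(parts_of("Heart", "is_clinically_part_of"))     # clinical view
```

Under these assumptions a class such as “Disease of part_of Heart” picks up pericardial disease through the general property, while a strictly structural query leaves it out; the structure itself is not distorted to suit either view.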

  15. What we want to avoid: combinatorial explosions • The “Exploding Bicycle”: From “phrase book” to “dictionary + grammar” • 1980 - ICD-9 (E826) 8 • 1990 - READ-2 (T30..) 81 • 1995 - READ-3 87 • 1996 - ICD-10 (V10-19 Australian) 587 • V31.22 Occupant of three-wheeled motor vehicle injured in collision with pedal cycle, person on outside of vehicle, nontraffic accident, while working for income • and meanwhile elsewhere in ICD-10 • W65.40 Drowning and submersion while in bath-tub, street and highway, while engaged in sports activity • X35.44 Victim of volcanic eruption, street and highway, while resting, sleeping, eating or engaging in other vital activities
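
The arithmetic behind the move from “phrase book” to “dictionary + grammar” on slide 15 can be sketched as follows. The axes and their values are invented for illustration and are not the actual axes of ICD-10; the point is only that a pre-enumerated phrase book grows with the product of the axis sizes, while a dictionary of composable terms grows with their sum.

```python
from itertools import product
from math import prod

# Hypothetical axes for describing a cycling accident (illustrative only).
AXES = {
    "vehicle": ["pedal cycle", "motorcycle", "three-wheeled motor vehicle"],
    "collided_with": ["pedestrian", "pedal cycle", "car", "fixed object"],
    "role": ["driver", "passenger", "person on outside of vehicle"],
    "setting": ["traffic", "nontraffic"],
    "activity": ["sports", "leisure", "working for income", "resting"],
}


def phrase_book_size(axes):
    """Number of codes needed if every combination is listed in advance."""
    return prod(len(values) for values in axes.values())


def dictionary_size(axes):
    """Number of terms needed if combinations are composed on demand."""
    return sum(len(values) for values in axes.values())


def enumerate_phrases(axes):
    """Yield every pre-coordinated combination as a dict (the phrase book)."""
    names = list(axes)
    for combo in product(*axes.values()):
        yield dict(zip(names, combo))


if __name__ == "__main__":
    print(phrase_book_size(AXES), "phrases vs", dictionary_size(AXES), "terms")
```

Adding one more value to any axis multiplies the phrase book but adds a single entry to the dictionary, which is the growth pattern the code counts on the slide suggest.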

  16. The Cost 1: Normalising (untangling) Ontologies • [Figure: hierarchies labelled Structure, Function and Part-whole, shown tangled and then separated]

  17. The Cost 1: Normalising (untangling) Ontologies: Making each meaning explicit and separate • [Figure: a tangled PhysSubstance hierarchy (labels include Protein, ProteinHormone, Insulin, Enzyme, Steroid, SteroidHormone, Hormone, Catalyst) set against two simple trees: Substance (labels include BodySubstance, Protein, Insulin, Steroid, …) and ActionRole (labels include PhysiologicRole, HormoneRole, CatalystRole, …)] • …build it all by combining simple trees • Hormone = Substance & playsRole-HormoneRole • ProteinHormone = Protein & playsRole-HormoneRole • SteroidHormone = Steroid & playsRole-HormoneRole • Catalyst = Substance & playsRole CatalystRole • Insulin playsRole HormoneRole • Enzyme ?=? Protein & playsRole-CatalystRole

  18. Normalisation: Building ontologies from orthogonal trees • Each tree is homogeneous and based on subsumption • One principle – one of function, structure, cause,… • Every primitive has exactly 1 primitive parent • All multiple classification done by the logic • All self-standing primitives disjoint
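
The normalisation rules on slides 17 and 18 can be sketched as a toy classifier: primitives sit in a single-parent tree, defined classes are conjunctions of a primitive and a role, and all multiple classification is computed rather than asserted. The class and role names are taken from slide 17; the data layout and the function names `primitive_ancestors`, `inherited_roles` and `classify` are assumptions, and a real description logic classifier does far more than this.

```python
# Primitive tree: every primitive has exactly one primitive parent.
PRIMITIVE_PARENT = {
    "Substance": None,
    "Protein": "Substance",
    "Steroid": "Substance",
    "Insulin": "Protein",
}

# Asserted descriptions (from the slide: "Insulin playsRole HormoneRole").
ROLES_PLAYED = {"Insulin": {"HormoneRole"}}

# Defined classes: (primitive base, role that must be played).
DEFINED = {
    "Hormone": ("Substance", "HormoneRole"),
    "ProteinHormone": ("Protein", "HormoneRole"),
    "SteroidHormone": ("Steroid", "HormoneRole"),
    "Catalyst": ("Substance", "CatalystRole"),
}


def primitive_ancestors(name):
    """Return `name` followed by its primitive ancestors, nearest first."""
    chain = []
    while name is not None:
        chain.append(name)
        name = PRIMITIVE_PARENT[name]
    return chain


def inherited_roles(name):
    """Roles asserted on `name` or on any of its primitive ancestors."""
    roles = set()
    for ancestor in primitive_ancestors(name):
        roles |= ROLES_PLAYED.get(ancestor, set())
    return roles


def classify(name):
    """Infer every defined class whose definition `name` satisfies."""
    ancestors = primitive_ancestors(name)
    roles = inherited_roles(name)
    return sorted(
        defined
        for defined, (base, role) in DEFINED.items()
        if base in ancestors and role in roles
    )


if __name__ == "__main__":
    print(classify("Insulin"))  # multiple classification, none of it asserted
    print(classify("Steroid"))  # no role asserted, so nothing is inferred
```

Nobody has to state that Insulin is a ProteinHormone or a Hormone; both follow from one parent link and one role assertion, which is what keeps the hand-built trees simple.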

  19. The Cost: 2 – Clean Distinctions & Tests • Repeating patterns within levels • Structures vs Substances • Flavours of part-whole • Part-whole vs containment, connection, branching • Process/Event vs Thing (“Endurant” vs “Perdurant”) • … • Repeating patterns across levels • Multiples at one level act as substances at the next • Substances span levels; structures are specific to a level

  20. Repeating Patterns within each level • Structures vs Substances (Discrete vs Mass) • Structures are made of substances • Organs are made of tissue • Parts & portions • Structures have parts & subdivisions,… • Substances have portions • Portions can have proportions & concentrations

  21. Tests • Structures (Discrete) • Can you count it? Is one part different from another? Is it made of something(s)? • Books, organs, ideas, individual cells, organisations, … • Substance (Mass) • Are all bits the same? Can something be made of it? Can you talk about “A piece of it”? “A lump of it”? “A stream of it”? … • Water, sodium, tissue, blood, …

  22. Repeating Patterns within each level • Part-whole vs containment • Parthood is organisational • The wall is part of the cell; • The cornea is part of the eye • Containment is physical • The inclusion is contained in the cell • The marrow is contained in the bone • Often occur together • Nucleus is a part of and contained in the cell • The retina is part of and contained in the eye

  23. Tests • Parts • If I take the part away, is the whole incomplete? • If the part is damaged is the whole damaged? • If I do something to the part do I do something to the whole? • Containment • Is the contained thing inside the container? • Is the relationship spatial/physical? (or temporal?)

  24. Repeating Patterns bridging levels • Multiples of structures at one level behave as substances at the next • “Blood is made of in part a multiple of red cells” • “Tissue is made of in part a multiple of cells” • “A rash is a multiple of spots” • “Polyposis is a multiple of polyps” • “A flock is a multiple of birds” • Multiples are not Sets • Not defined by members • Membership can change (intensional rather than extensional) • Action on the singleton is not action on the multiple; Action on the whole is (usually) action on the singletons • If I treat a spot, I do not treat the rash • If I treat the rash, I treat the spots

  25. Tests • Multiples • Name for the singleton – “grain”, “cell”, “bird”? • Singletons are countable? • Multiple is measurable rather than countable? • Odd to say part-of “This cell is part of the Arm”?
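
The behaviour of multiples described on slides 24 and 25 can be made concrete with a small sketch. The `Singleton` and `Multiple` classes and the `treat` method are hypothetical, chosen only to show the asymmetry between acting on a spot and acting on the rash, and that a multiple keeps its identity while its membership changes.

```python
from dataclasses import dataclass, field


@dataclass
class Singleton:
    """One countable member, e.g. a single spot."""
    label: str
    treated: bool = False


@dataclass
class Multiple:
    """A multiple such as a rash: identified by name, not by its members."""
    name: str
    members: list = field(default_factory=list)
    treated: bool = False

    def treat(self):
        # Action on the whole is (usually) action on the singletons.
        self.treated = True
        for member in self.members:
            member.treated = True


if __name__ == "__main__":
    rash = Multiple("rash")
    spot_a, spot_b = Singleton("spot A"), Singleton("spot B")
    rash.members += [spot_a, spot_b]

    spot_a.treated = True   # treating one spot...
    print(rash.treated)     # ...does not treat the rash

    rash.treat()            # treating the rash...
    print(spot_b.treated)   # ...treats the spots
```

A plain Python set would not do here, because a set is defined by its members; the rash remains the same rash when a spot fades or a new one appears.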

  26. But make it simple • Intermediate representations and views • OWL + Detailed Schema is the Assembler Language • FaCT/SHIQ/… is the machine code • Almost no one writes in assembler • let alone machine code • Separate “terms” and “concepts” • Language/labels from concepts

  27. Layered Architecture • [Figure: layers labelled Tools, Versioning, Language, Metadata, Intermed Rep, Links to Resources, Indexed KB (Frame Like), Provenance, DL; tools named: Protégé + “OilEd-II” + …?]

  28. Example: An Intermediate Representation for Surgery • "Open fixation of a fracture of the neck of the left femur" • MAIN fixing • ACTS_ON fracture • HAS_LOCATION neck of long bone • IS_PART_OF femur • HAS_LATERALITY left • HAS_APPROACH open

  29. The formal “assembler” version (‘SurgicalProcess’ which isMainlyCharacterisedBy (performance which isEnactmentOf (‘SurgicalFixing’ which hasSpecificSubprocess (‘SurgicalAccessing’ hasSurgicalOpenClosedness (SurgicalOpenClosedness which hasAbsoluteState surgicallyOpen)) actsSpecificallyOn (PathologicalBodyStructure which < involves Bone hasUniqueAssociatedProcess FracturingProcess hasSpecificLocation (Collum which isSpecificSolidDivisionOf (Femur which hasLeftRightSelector leftSelection))>))))
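
The relationship between the intermediate representation on slide 28 and the formal expression on slide 29 can be sketched as a tree that is mechanically rendered into nested “which” clauses. This is only a sketch under assumptions: the nesting of the links is inferred from the phrase (the transcript flattens it), the dictionary layout and the `render` function are hypothetical, and the output is a simplified bracketing, not the GALEN expansion shown on slide 29, which also rewrites the vocabulary.

```python
# Assumed nesting of the slide-28 example; "MAIN" names the head concept.
IR = {
    "MAIN": "fixing",
    "ACTS_ON": {
        "MAIN": "fracture",
        "HAS_LOCATION": {
            "MAIN": "neck of long bone",
            "IS_PART_OF": {"MAIN": "femur", "HAS_LATERALITY": "left"},
        },
    },
    "HAS_APPROACH": "open",
}


def render(node):
    """Render an IR node as a nested 'head which LINK filler ...' string."""
    if isinstance(node, str):
        return node
    clauses = []
    for link, value in node.items():
        if link == "MAIN":
            continue
        filler = render(value)
        if isinstance(value, dict):
            # Bracket nested descriptions so the scope of each link is clear.
            filler = f"({filler})"
        clauses.append(f"{link} {filler}")
    head = node["MAIN"]
    return f"{head} which {' '.join(clauses)}" if clauses else head


if __name__ == "__main__":
    print(render(IR))
```

Authors work with the short, flat form; the transformation into the verbose formal form is left to software, which is the division of labour the talk argues for.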

  30. Result • Training time: 3 mo → 3 days + 3 days • Productivity: 25/day → 100/day • Central reconciliation: 50%+ → 10% • Local cycle time: 3 months → <1 week • “Dependencies”: High → Low • Author satisfaction: Low → High • Disputes: Frequent → Rare • Repeatability: Low → High • Even Pre Web!

  31. Navigation vs Retrieval/Reference: “Access terminology” & “Reference terminology” • Access follows model of use • e.g. MeSH, MEDCin • Hierarchy is what is needed next “to hand” • People find easy; Software hard • Retrieval follows model of meaning • Logic based ontologies • Hierarchy means “is-kind-of” / subsumption • People may find odd; Software is easy • Need Both - & visualisations of both • The logic based structure isn’t enough • Views and intermediate representations

  32. What’s in a View / Intermediate Representation? • [Figure labels: Language; linguistic generation & search; User Oriented Structures; semantic transformations & Filters; Explicit Context in Ontology “Assembler”]

  33. Summary: Let the logic engine do the work • Logic based ontologies can bridge granularities & represent context explicitly • And manage the potential combinatorial explosions • To do so • Views and Interface – usable, flexible & easy to learn • Entry, Navigation, & Use are different • Structure – explicit & modular – “Normalised” • Conception – clean testable distinctions • Tools & Architecture - layered & comprehensive • The logic is the assembly language

  34. Some Healthcare Terminologies

  35. Some Healthcare Terminologies • ICD 9/10 • Traditional paper thesauri • -CM versions essential for billing (and -AM) • CPT – Current Procedural Terminology • “Simple” list • Clinical Terms (Read Codes) V2 • Simple hierarchy • Still dominant in UK general practice • SNOMED-CT • At least “logic assisted” • Political questions… • NCI Cancer Ontology • “Logic based in parts” – work in progress

  36. Others • Standards Related • LOINC – laboratory data • Increasingly structured – “logic assisted” aspirations • HL7 Vocabulary TC • Specialised vocabularies – Inspiration for OHT • Links to RxNorm • SNOMED DICOM Microglossary (SDM) • Image related information – not related to SNOMED • Open Source • OpenGALEN Common Reference Model • Logic based – multilingual – a resource rather than a terminology • Basis of UK Drug Ontology • Open Health Terminology • Watch this space • Focusing on UMLS • Likely to be at least “logic assisted”

  37. Special Purpose • Anatomy • Digital Anatomist Foundational Model of Anatomy (FMA) • Principled frame based representation • Superb reference point for structural anatomy • Needs functional and clinical supplements • http://sig.biostr.washington.edu/projects/da/ • Drugs • RxNorm and VA projects • See Steve Brown & Stuart Nelson • UK Primary Care Drug Dictionary • UKCPRS (Secondary Care) • Drug Ontology (OpenGALEN based) • MEDDRA, FDA, Proprietary, …

  38. Unified Medical Language System (UMLS) • Common reference point and link to MeSH Terms and literature • De facto standard for universal identifiers • Concept Unique Identifiers (CUIs) • Lexical Unique Identifiers (LUIs) • String Unique Identifiers (SUIs) • Valuable in itself: Huge resource for mining and restructuring • Udo Hahn and Stefan Schulz, “CoMMeT – Conceptual Model of Medical Terminology” • http://www.coling.uni-freiburg.de/pub/schulz/commet/ • Alexa McCray is speaking next
