Topic: Identifying the Data Schema behind SNOMED CT

Jon Patrick, Centre for Health Informatics Research & Development, University of Sydney. Ming Zhang, Donna Truran, National Centre for Classification in Health.

Presentation Transcript


  1. Topic: Identifying the Data Schema behind SNOMED CT. Jon Patrick, Centre for Health Informatics Research & Development, University of Sydney. Ming Zhang, Donna Truran, National Centre for Classification in Health

  2. Outline • Project description • Research methodology • Experiments and Results • Conclusion • Limitation • Recommendation for future work

  3. Project Description • Project background: the core content of SNOMED CT is stored in simple tables • Project objective: to discover the conceptual model of SNOMED CT by reverse engineering

  4. Research methodology • Data preparation: load the SNOMED CT core content tables into an RDBMS, that is, import the text files into MySQL • Ontology structure investigation: database querying for explicit characteristics; programming for implicit characteristics • Data modelling: analyse the different characteristics and features to generate the conceptual data model
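
The data preparation step on slide 4 can be sketched roughly as follows. This is a minimal illustration, not the authors' loader: it uses Python's built-in SQLite in place of MySQL so that it runs without a server, and the column names, sample rows and the `load_concepts` helper are all assumptions made for the example.

```python
import csv
import io
import sqlite3

# Hypothetical tab-delimited sample; the real SNOMED CT file layout may differ.
SAMPLE = (
    "CONCEPTID\tSTATUS\tNAME\tISPRIMITIVE\n"
    "1\t0\tToy root concept\t1\n"
    "2\t0\tToy clinical finding\t1\n"
    "3\t0\tToy fully defined concept\t0\n"
)

def load_concepts(conn, text_file):
    """Create a concepts table and insert the rows of a tab-delimited file."""
    conn.execute(
        "CREATE TABLE concepts ("
        "concept_id INTEGER PRIMARY KEY, status INTEGER, "
        "name TEXT, is_primitive INTEGER)"
    )
    reader = csv.reader(text_file, delimiter="\t")
    next(reader)  # skip the header row
    rows = ((int(r[0]), int(r[1]), r[2], int(r[3])) for r in reader)
    conn.executemany("INSERT INTO concepts VALUES (?, ?, ?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
load_concepts(conn, io.StringIO(SAMPLE))

# Once the data is in the database, explicit characteristics are plain queries.
concept_count = conn.execute("SELECT COUNT(*) FROM concepts").fetchone()[0]
fully_defined = conn.execute(
    "SELECT COUNT(*) FROM concepts WHERE is_primitive = 0"
).fetchone()[0]
print(concept_count, fully_defined)
```

With the tables loaded, counts such as the number of fully defined concepts reduce to a single `SELECT`, which is the sense in which the explicit characteristics are found by database querying.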

  5. Experiments and Results • Explicit characteristics of the ontology: original data overview; fully defined and primitive concepts; relationship types; hierarchy structure; multiple inheritance; full structure • Implicit characteristics of the ontology: classification principles; relationship patterns

  6. Original Data Model • 3 data tables • Concepts: each clinical idea is recorded as one concept • Descriptions: one clinical idea can have more than one description in this table • Relationships: each row represents a relationship between two concepts

  7. Fully defined and primitive concepts • Primitive concepts: a concept is primitive if its defining characteristics are insufficient to define it, that is, it has more content than is indicated by its attributes and relationships, e.g. clinical finding • Fully defined concepts: a concept is fully defined if its defining characteristics are sufficient to define it • “Sufficient” and “insufficient” are determined by SNOMED experts • Currently 41244 (11%) concepts are fully defined

  8. Relationship types • A relationship links two concepts • “Laterality” is an example of a relationship type • According to the statistics, there are 1.4 million relationship records • 62 relationship types are currently used to represent the relationships between two concepts

  9. Relationship types

  10. Hierarchy structure • Among the relationship types, “IS_A” represents the hierarchical relationship • 485,335 records in the Relationships table store the hierarchical information of SNOMED CT • The main hierarchical features: root level (no parents): one root, “SNOMED CT CONCEPT”; middle node level (have parents and children): 80895 (22%) concepts, of which 25687 nodes have only 1 child; leaf node level (no children): 285283 (78%) concepts
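
The three levels on slide 10 follow directly from the IS_A rows: a root has children but no parent, a leaf has a parent but no children, and a middle node has both. A minimal sketch on invented toy data (the concept names and both helper functions are assumptions for illustration, not taken from the presentation):

```python
from collections import Counter

# Toy (child, parent) IS_A rows; concept names are invented for illustration.
IS_A = [
    ("B", "ROOT"), ("C", "ROOT"),
    ("D", "B"), ("E", "B"), ("E", "C"),
    ("F", "D"),
]

def classify_nodes(edges):
    """Split concepts into root, middle and leaf levels from IS_A edges."""
    has_parent = {child for child, _ in edges}
    has_child = {parent for _, parent in edges}
    roots = has_child - has_parent    # no parents
    middle = has_child & has_parent   # both parents and children
    leaves = has_parent - has_child   # no children
    return roots, middle, leaves

def single_child_nodes(edges):
    """Return the concepts that have exactly one distinct child."""
    child_counts = Counter(parent for _, parent in set(edges))
    return {node for node, n in child_counts.items() if n == 1}

roots, middle, leaves = classify_nodes(IS_A)
only_one_child = single_child_nodes(IS_A)
print(roots, middle, leaves, only_one_child)
```

On the real Relationships table the same set operations would be applied to the 485,335 IS_A records rather than to six toy rows.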

  11. Multiple inheritance • A concept in SNOMED CT may have many children and many parents
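
Multiple inheritance shows up as a concept that appears as the source of more than one IS_A row. The query below is a sketch under assumptions: the table and column names (`relationships`, `source_id`, `type`, `target_id`) and the toy rows are hypothetical, and SQLite is used in place of MySQL so the example is self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one row per relationship between two concepts.
conn.execute(
    "CREATE TABLE relationships (source_id TEXT, type TEXT, target_id TEXT)"
)
conn.executemany(
    "INSERT INTO relationships VALUES (?, ?, ?)",
    [
        ("B", "IS_A", "ROOT"),
        ("C", "IS_A", "ROOT"),
        ("E", "IS_A", "B"),
        ("E", "IS_A", "C"),      # E has two parents
        ("E", "OTHER", "B"),     # non-hierarchical rows are ignored
    ],
)

# Concepts with more than one IS_A parent.
multi_parent = conn.execute(
    "SELECT source_id, COUNT(*) AS n_parents "
    "FROM relationships WHERE type = 'IS_A' "
    "GROUP BY source_id HAVING COUNT(*) > 1 "
    "ORDER BY source_id"
).fetchall()
print(multi_parent)
```

Grouping by `target_id` instead of `source_id` gives the complementary view, the number of children per concept.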

  12. Multiple inheritance

  13. Hierarchy structure - example (figure labels: root, middle nodes, leaf, multiple parents)

  14. Full structure

  15. Experiments and Results • Explicit characteristics of the ontology: original data overview; fully defined and primitive concepts; relationship types; hierarchy structure; multiple inheritance; full structure • Implicit characteristics of the ontology: classification principles; relationship patterns

  16. Classification principle • Top-level categories: the 18 direct children of the root • Each concept belongs to only one top-level category • So all concepts in SNOMED CT can be divided into 18 groups
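
One way to check the classification principle programmatically is to walk each concept's IS_A ancestors upwards and collect the direct children of the root that are reached; if the principle holds, every concept ends up with exactly one. The sketch below uses invented concept names and a hypothetical `top_categories` helper; it is not the authors' program.

```python
from collections import defaultdict

# Toy (child, parent) IS_A rows; names are invented for illustration.
IS_A = [
    ("Finding", "ROOT"), ("Procedure", "ROOT"),
    ("Disease", "Finding"),
    ("Infection", "Disease"), ("Infection", "Finding"),
    ("Surgery", "Procedure"),
]

def top_categories(edges, root):
    """Map each concept to the set of top-level categories above it."""
    parents = defaultdict(set)
    for child, parent in edges:
        parents[child].add(parent)
    top_level = {child for child, parent in edges if parent == root}

    result = {}
    for concept in parents:
        found, seen, stack = set(), set(), [concept]
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            if node in top_level:
                found.add(node)  # reached a direct child of the root
            else:
                stack.extend(parents.get(node, ()))
        result[concept] = found
    return result

categories = top_categories(IS_A, "ROOT")
violations = {c for c, tops in categories.items() if len(tops) != 1}
print(categories, violations)
```

An empty `violations` set means every concept in the toy data falls into exactly one group, even though "Infection" has two parents.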

  17. Implicit

  18. Relationship patterns • The specific relationship types that occur between any two top-level categories

  19. Relationship patterns • Pattern: {C1, type, C2}, where C1 is one of the 18 top-level categories, type is one of the 62 relationship types, and C2 is one of the 18 top-level categories • There are 18 x 62 x 18 = 20088 possible patterns • Each record in the 1.4 million relationship records matches one pattern • To avoid ambiguity, the scope of this study covers only “active” concepts • The results show that only 78 patterns have instances in the Relationships table
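
The pattern count on slide 19 amounts to replacing each end of a relationship with its top-level category and counting the distinct triples. A minimal sketch, assuming a precomputed concept-to-category map; the concept names, the relationship type labels and the `observed_patterns` helper are illustrative only.

```python
from collections import Counter

# Upper bound from the slide: 18 categories x 62 types x 18 categories.
possible_patterns = 18 * 62 * 18

# Hypothetical concept -> top-level category map.
CATEGORY = {
    "Pneumonia": "Clinical finding",
    "Fracture": "Clinical finding",
    "Disease": "Clinical finding",
    "Lung": "Body structure",
    "Femur": "Body structure",
}

# Hypothetical (source, type, target) relationship rows.
RELATIONSHIPS = [
    ("Pneumonia", "FINDING_SITE", "Lung"),
    ("Fracture", "FINDING_SITE", "Femur"),
    ("Pneumonia", "IS_A", "Disease"),
]

def observed_patterns(relationships, category):
    """Count how many relationship rows fall under each {C1, type, C2} pattern."""
    return Counter(
        (category[source], rel_type, category[target])
        for source, rel_type, target in relationships
    )

patterns = observed_patterns(RELATIONSHIPS, CATEGORY)
print(possible_patterns, len(patterns))
```

The gap between `possible_patterns` and `len(patterns)` is the point of the slide: of the 20088 combinations that could occur, the study found only 78 in use.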

  20. Data modelling based on patterns For example: to find the relationship between “clinical finding” and other top categories.

  21. Conceptual Data Model

  22. Future Work • Design a method of defining real-world constraints over the relationships, e.g. suicide can have slow onset • Develop storage and maintenance procedures for managing the data, e.g. there is no constraint over the data model as it exists at the moment • Design a terminology server to deliver SCT to vendors • Work with vendors to define a transport mechanism that lets vendors install SCT • Create Internet access to SCT content for ad hoc users • Start working on systems that demonstrate the value of SCT for clinical and administrative work
