Information extraction from medical records

by Alexander Barsky


Current Methodology:

A broad assessment of the patient appears at the beginning of the chart, with references to more specific areas. Specific divisions follow the broad assessment. Records are listed in chronological order of activity.



Problem:

A patient's medical chart is highly detailed and complex. Any attempt to quickly locate specific information will be met with frustration.



Solution:

Create a system that extracts the desired information based on a predefined set of parameters.

Example: "Hormonal imbalance during puberty". Retrieve all references to hormonal imbalances, but only those that fall between two specific points in time in the medical chart.
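As a rough illustration of the example above, the sketch below filters chart entries by keyword and by an inclusive date range. The `Entry` class, the `find` method, and the sample notes are all hypothetical and exist only for illustration; the presentation itself relies on GATE and JAPE rather than plain Java.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch: filter chart entries by keyword and date range.
final class ChartFilter {

    // One dated note in a chart; this shape is assumed for illustration.
    static final class Entry {
        final LocalDate date;
        final String text;

        Entry(LocalDate date, String text) {
            this.date = date;
            this.text = text;
        }
    }

    // Returns entries whose text mentions the keyword (case-insensitive)
    // and whose date lies in [from, to], both ends inclusive.
    static List<Entry> find(List<Entry> chart, String keyword, LocalDate from, LocalDate to) {
        String needle = keyword.toLowerCase(Locale.ROOT);
        List<Entry> hits = new ArrayList<>();
        for (Entry e : chart) {
            boolean inRange = !e.date.isBefore(from) && !e.date.isAfter(to);
            if (inRange && e.text.toLowerCase(Locale.ROOT).contains(needle)) {
                hits.add(e);
            }
        }
        return hits;
    }

    // Made-up sample data for the usage example.
    static List<Entry> sampleChart() {
        return Arrays.asList(
            new Entry(LocalDate.of(2001, 3, 14), "Routine check-up, no concerns."),
            new Entry(LocalDate.of(2004, 6, 2), "Hormonal imbalance suspected; labs ordered."),
            new Entry(LocalDate.of(2005, 1, 20), "Follow-up: hormonal imbalance confirmed."),
            new Entry(LocalDate.of(2012, 9, 8), "Hormonal imbalance noted in unrelated visit."));
    }

    public static void main(String[] args) {
        List<Entry> hits = find(sampleChart(), "hormonal imbalance",
                LocalDate.of(2003, 1, 1), LocalDate.of(2007, 12, 31));
        for (Entry e : hits) {
            System.out.println(e.date + "  " + e.text);
        }
    }
}
```

A plain substring match like this misses rephrasings such as "endocrine disorder", which is one reason an annotation-based approach may be preferable.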


Tools at Our Disposal:

JAPE: Java Annotation Patterns Engine.

    Use: pattern matching and semantic extraction.

GATE: General Architecture for Text Engineering.

    Use: information extraction, document annotation, and XML output.

C#: Visual C# WinForms.

    Use: medium for conversion between XML and .csv file formats.


Solution Methodology:

1. Create corpus of documents in GATE.

2. Introduce rules for information extraction.

3. Annotate documents in corpus.

4. Output annotated documents in XML.

5. Strip file of unnecessary elements and convert to .csv.


ANNIE

A Nearly-New Information Extraction System

- Tokeniser: splits text into simple tokens.

- Gazetteer: identifies entity names contained in lists.

- Sentence Splitter: splits text into sentences, using lists (for example, of abbreviations) to decide where a sentence ends.

- Part-of-Speech Tagger: tags each token with its part of speech.

- Coreference Matcher: identifies relationships between previously defined entities.
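To make the first three stages more concrete, here is a toy sketch in plain Java. It is not GATE's implementation and does not use GATE's API: the class `ToyAnnie`, its methods, and the two-word drug list are assumptions made up for illustration, and the splitting rules are far cruder than ANNIE's.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Toy stand-ins for three ANNIE stages; NOT GATE's implementation or API.
final class ToyAnnie {

    // Tokeniser stand-in: keep runs of letters, digits and apostrophes.
    static List<String> tokenise(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.split("[^\\p{L}\\p{N}']+")) {
            if (!t.isEmpty()) tokens.add(t);
        }
        return tokens;
    }

    // Gazetteer stand-in: return the tokens found in a fixed, lower-cased list.
    static List<String> lookup(List<String> tokens, Set<String> gazetteer) {
        List<String> found = new ArrayList<>();
        for (String t : tokens) {
            if (gazetteer.contains(t.toLowerCase(Locale.ROOT))) found.add(t);
        }
        return found;
    }

    // Sentence splitter stand-in: break after '.', '!' or '?' followed by whitespace.
    static List<String> splitSentences(String text) {
        List<String> sentences = new ArrayList<>();
        for (String s : text.trim().split("(?<=[.!?])\\s+")) {
            if (!s.isEmpty()) sentences.add(s);
        }
        return sentences;
    }

    public static void main(String[] args) {
        // Hypothetical gazetteer list of drug names.
        Set<String> drugs = new HashSet<>(Arrays.asList("insulin", "metformin"));
        String note = "Patient started insulin. Metformin was stopped!";

        List<String> tokens = tokenise(note);
        System.out.println("tokens:    " + tokens);
        System.out.println("lookups:   " + lookup(tokens, drugs));
        System.out.println("sentences: " + splitSentences(note));
    }
}
```

The naive sentence rule here would break on abbreviations such as "Dr. Smith", which hints at why a real splitter consults lists.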



JAPE: Key to Extraction

- Used by most if not all ANNIE components.


JAPE Example


XML Output:


Problem: Too much unorganized information.

Solution: XSLT to the rescue!

XSLT: Extensible Stylesheet Language Transformations

- Add specific rules to separate needed information from unnecessary information.


XSLT Example

- Find all the <Lookup> nodes. Add the string between the tags to the output.
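A minimal sketch of such a rule, run through Java's built-in `javax.xml.transform` API, might look like the following. The stylesheet is an assumption about what the slide's example did, not a copy of it: it selects every `<Lookup>` element and writes the text between its tags on its own line. The class name `LookupXslt` and the sample input are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical sketch: keep only the text inside <Lookup> elements.
final class LookupXslt {

    // Assumed stylesheet: one output line per <Lookup> node, everything else dropped.
    static final String STYLESHEET =
          "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"text\"/>"
        + "<xsl:template match=\"/\">"
        + "<xsl:for-each select=\"//Lookup\">"
        + "<xsl:value-of select=\".\"/>"
        + "<xsl:text>&#10;</xsl:text>"
        + "</xsl:for-each>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Applies the stylesheet to an XML string and returns the text output.
    static String extract(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (Exception ex) {
            throw new IllegalStateException("XSLT transformation failed", ex);
        }
    }

    public static void main(String[] args) {
        // Made-up annotated document with inline <Lookup> tags.
        String xml = "<doc>Patient reports <Lookup>fatigue</Lookup> after starting "
                + "<Lookup>levothyroxine</Lookup>.</doc>";
        System.out.print(extract(xml));
    }
}
```

Because the only template matches the document root and iterates over `//Lookup`, text outside the `<Lookup>` tags never reaches the output.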


CSV File Type

Comma-Separated Values: used to present information in tabular form. Useful for analyzing large amounts of data in an easy-to-understand format. The most common program used to open it is Excel.
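The conversion from annotated XML to CSV could be sketched as follows. The presentation uses a C# WinForms tool for this step; the Java version below is only a stand-in, and it assumes the XML carries inline `<Lookup>` elements with a `majorType` attribute. The class name `LookupCsv`, the column names, and the sample document are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical sketch: turn inline <Lookup> annotations into CSV rows.
final class LookupCsv {

    // Quote a field when it contains a comma, a double quote, or a line break;
    // embedded double quotes are doubled.
    static String field(String s) {
        if (s.contains(",") || s.contains("\"") || s.contains("\n") || s.contains("\r")) {
            return "\"" + s.replace("\"", "\"\"") + "\"";
        }
        return s;
    }

    // Emits a header plus one row per <Lookup> element: its majorType and its text.
    static String toCsv(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList lookups = doc.getElementsByTagName("Lookup");
            StringBuilder csv = new StringBuilder("majorType,text\n");
            for (int i = 0; i < lookups.getLength(); i++) {
                Element e = (Element) lookups.item(i);
                csv.append(field(e.getAttribute("majorType")))
                   .append(',')
                   .append(field(e.getTextContent()))
                   .append('\n');
            }
            return csv.toString();
        } catch (Exception ex) {
            throw new IllegalStateException("could not convert XML to CSV", ex);
        }
    }

    public static void main(String[] args) {
        // Made-up annotated document for illustration.
        String xml = "<doc>Patient reports <Lookup majorType=\"symptom\">fatigue, mild</Lookup>"
                + " after starting <Lookup majorType=\"drug\">levothyroxine</Lookup>.</doc>";
        System.out.print(toCsv(xml));
    }
}
```

Quoting matters here because clinical phrases often contain commas, and an unquoted comma would split one value across two spreadsheet columns.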



Potential Problem:

Regardless of how well the ANNIE tools are utilized and how well the JAPE rules are defined, recall will never be perfect: some relevant references will always be missed.
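For reference, recall and its companion measure, precision, can be computed by comparing the extracted items against a hand-labelled set. The sketch below uses the standard definitions; the class name `ExtractionScore`, the choice to return 0.0 for an empty denominator, and the sample sets are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Sketch: score an extraction run against a hand-labelled ("gold") set.
final class ExtractionScore {

    // Number of extracted items that also appear in the gold set.
    static int truePositives(Set<String> extracted, Set<String> gold) {
        Set<String> common = new HashSet<>(extracted);
        common.retainAll(gold);
        return common.size();
    }

    // Precision: share of extracted items that are correct.
    // Returns 0.0 when nothing was extracted (an assumed convention).
    static double precision(Set<String> extracted, Set<String> gold) {
        if (extracted.isEmpty()) return 0.0;
        return (double) truePositives(extracted, gold) / extracted.size();
    }

    // Recall: share of gold items that were found.
    // Returns 0.0 when the gold set is empty (an assumed convention).
    static double recall(Set<String> extracted, Set<String> gold) {
        if (gold.isEmpty()) return 0.0;
        return (double) truePositives(extracted, gold) / gold.size();
    }

    public static void main(String[] args) {
        // Made-up example: four relevant mentions, three extracted, two of them correct.
        Set<String> gold = new HashSet<>(Arrays.asList(
                "hormonal imbalance", "fatigue", "insulin", "goiter"));
        Set<String> extracted = new HashSet<>(Arrays.asList(
                "hormonal imbalance", "insulin", "aspirin"));
        System.out.println("precision = " + precision(extracted, gold));
        System.out.println("recall    = " + recall(extracted, gold));
    }
}
```

In this made-up run, two of three extracted items are correct (precision 2/3) and two of four relevant items are found (recall 1/2).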


Solution: Machine Learning

Machine learning is our best chance to increase the precision of the output. Training a computer to recognize commonly used reporting phraseology will organize extraction better, with more precise and concise outputs. Luckily for us, GATE includes plugins for machine learning.

