Datamining MEDLINE for Topics and Trends in Dental and Craniofacial Research


William C. Bartling, D.D.S.

NIDCR/NLM Fellow in Dental Informatics

Center for Biomedical Informatics

University of Pittsburgh

Titus K. L. Schleyer, D.M.D., Ph.D.

Director, Center for Dental Informatics

University of Pittsburgh School of Dental Medicine

Overview
  • Goals of project
  • Retrieving the entire corpus of dental and craniofacial research literature from MEDLINE
  • Determining the characteristics of a dental research article
  • Machine learning to extract articles from any body of literature
  • Methods to categorize dental research literature to study temporal trends
  • Summary
Goals of project
  • To use computerized methods to determine topics and trends in dental and craniofacial research since 1966.
  • Determining the structure of such research can help identify which research areas are emerging and which are waning.
  • Identify research funding opportunities?
Retrieving the dental literature
  • MEDLINE chosen as the database
  • MeSH tree searched manually for dental and craniofacial terms
  • Many MeSH terms were found in unusual locations in the hierarchy.
  • Decision to keep or discard term
  • Search limited to:
    • English language
    • Journal article
    • Abstract present
Results of search
  • ~450,000 English language articles in:
    • DENTISTRY
    • STOMATOGNATHIC SYSTEM (not PHARYNX)
    • STOMATOGNATHIC DISEASES (not PHARYNGEAL DISEASES)
  • ~61,000 articles indexed with dental MeSH terms not in above set
  • ~134,000 articles remaining after limiting to journal articles containing abstracts
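The search described on the last two slides can be sketched as a query string. This is a minimal sketch that assumes PubMed-style field tags (`[MeSH Terms]`, `[Language]`, `[Publication Type]`, `hasabstract`); the slides do not say which MEDLINE interface or exact search strategy was used, and the additional dental MeSH terms behind the ~61,000 extra articles are not listed, so they are left out. The function name `build_query` is hypothetical.

```python
def build_query(include, exclude, limits):
    """Combine MeSH headings and limits into one PubMed-style query string."""
    included = " OR ".join(f'"{term}"[MeSH Terms]' for term in include)
    excluded = " OR ".join(f'"{term}"[MeSH Terms]' for term in exclude)
    # Parentheses make the intended grouping explicit.
    query = f"(({included}) NOT ({excluded}))"
    for limit in limits:
        query += f" AND {limit}"
    return query


# Headings and limits taken from the slides; the tag syntax is an assumption.
query = build_query(
    include=["Dentistry", "Stomatognathic System", "Stomatognathic Diseases"],
    exclude=["Pharynx", "Pharyngeal Diseases"],
    limits=["english[Language]", "journal article[Publication Type]", "hasabstract"],
)
print(query)
```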
What is a dental research article?
  • The project is currently at this phase
  • 1000 abstracts randomly chosen, 5 groups of 200 each
  • 15 expert judges
  • 3 judges assigned to each group
  • Judges categorize each article as:
    • Dental or craniofacial research
    • Dental or craniofacial, non-research
    • Non-dental
    • Not sure
  • Web interface for judging: PHP with MySQL
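With three judges per abstract, the three labels have to be combined into one before a training set can be built. The slides do not say how this was done; the sketch below assumes a strict majority vote, and the function name `consensus` and the abstract identifiers are made up for illustration.

```python
from collections import Counter


def consensus(labels):
    """Return the label chosen by a strict majority of judges, else None."""
    label, votes = Counter(labels).most_common(1)[0]
    return label if votes > len(labels) / 2 else None


# Hypothetical ratings: three judges per abstract, four possible categories.
ratings = {
    "abstract-001": ["dental research", "dental research", "not sure"],
    "abstract-002": ["non-dental", "dental non-research", "dental research"],
}
for abstract_id, labels in ratings.items():
    print(abstract_id, consensus(labels))
```

Abstracts without a majority (here `abstract-002`) would need a tie-breaking rule or could be left out of the training set; that choice is also not covered by the slides.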
Differentiation of article categories
  • Acceptable reliability in each group (> 0.70)
  • Use results of each category to develop training set
  • Identify Patient Sets (IPS) software
    • Developed by Dr. Greg Cooper at University of Pittsburgh CBMI
    • Natural language processing used to find patient records of a certain type in free-text documents, e.g., hospital admission records
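The reliability threshold at the top of this slide does not name a statistic. As one possibility, the sketch below assumes Fleiss' kappa, which handles more than two raters per item; the counts are invented and `fleiss_kappa` is a hypothetical helper, not the authors' code.

```python
def fleiss_kappa(counts):
    """counts[i][j] = number of judges who put item i in category j.

    Every row must sum to the same number of judges (3 in this project).
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    n_cats = len(counts[0])
    # Observed agreement: mean over items of pairwise agreement among judges.
    p_obs = sum(
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ) / n_items
    # Chance agreement from the overall share of each category.
    totals = [sum(row[j] for row in counts) for j in range(n_cats)]
    p_exp = sum((t / (n_items * n_raters)) ** 2 for t in totals)
    return (p_obs - p_exp) / (1 - p_exp)


# Invented example: 4 abstracts, 3 judges, 4 categories
# (research, non-research, non-dental, not sure).
example = [[2, 1, 0, 0], [0, 3, 0, 0], [3, 0, 0, 0], [0, 0, 3, 0]]
print(round(fleiss_kappa(example), 2))
```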
IPS creates a document vector for each document or set of documents

  • Document i: (Word 1, p1), (Word 2, p2), (Word 3, p3), ..., (Word n, pn)
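The slide does not define the values p1 ... pn. As an illustration only, the sketch below assumes they are relative word frequencies within the document; IPS may well use a different weighting. The name `document_vector` is hypothetical.

```python
import re
from collections import Counter


def document_vector(text):
    """Map each word in `text` to its relative frequency in the document."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(words)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


# Made-up sentence standing in for an abstract.
vector = document_vector(
    "Fluoride varnish reduces caries; fluoride rinse reduces caries too."
)
for word, p in sorted(vector.items()):
    print(f"{word}: {p:.3f}")
```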

Identify Patient Sets (IPS)
  • Uses machine learning technique of “text classification”
  • All articles fed into the program
    • Select fields (title, abstract, MeSH terms)
  • Training set:
    • 2/3 of validated “dental research” articles
  • Test set: the remaining 1/3, added back to the original set (minus the training set)
  • Calculate success of retrieval using model created from training set
  • Adjust IPS and iterate, or train with more or fewer documents until retrieval is successful
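The train-and-evaluate loop above can be sketched with a small text classifier. The slides do not describe how IPS works internally, so a multinomial naive Bayes model is used here purely as a stand-in. The sketch also simplifies the setup by assuming that the judges' negative labels are available as training examples. All function names and sample texts are hypothetical.

```python
import math
import random
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())


def train(docs):
    """docs: list of (text, label). Returns (word counts per label, doc counts)."""
    word_counts, doc_counts = {}, Counter()
    for text, label in docs:
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokenize(text))
    return word_counts, doc_counts


def classify(model, text):
    """Return the label with the highest naive Bayes log-probability."""
    word_counts, doc_counts = model
    vocab = {w for counts in word_counts.values() for w in counts}
    best_label, best_score = None, -math.inf
    for label, counts in word_counts.items():
        total = sum(counts.values())
        score = math.log(doc_counts[label] / sum(doc_counts.values()))
        for word in tokenize(text):
            # Laplace smoothing keeps unseen words from zeroing the score.
            score += math.log((counts[word] + 1) / (total + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def split_two_thirds(docs, seed=0):
    """Shuffle and return (training 2/3, held-out 1/3)."""
    shuffled = docs[:]
    random.Random(seed).shuffle(shuffled)
    cut = (2 * len(shuffled)) // 3
    return shuffled[:cut], shuffled[cut:]


def precision_recall(model, held_out, positive):
    tp = fp = fn = 0
    for text, label in held_out:
        predicted = classify(model, text)
        if predicted == positive and label == positive:
            tp += 1
        elif predicted == positive:
            fp += 1
        elif label == positive:
            fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Made-up labeled examples standing in for judged abstracts.
labeled = [
    ("caries enamel fluoride trial", "dental research"),
    ("periodontal implant randomized trial", "dental research"),
    ("orthodontic bracket bond strength", "dental research"),
    ("cardiac surgery outcomes", "other"),
    ("renal dialysis cohort", "other"),
    ("asthma inhaler adherence", "other"),
]
train_set, held_out = split_two_thirds(labeled)
model = train(train_set)
print(precision_recall(model, held_out, positive="dental research"))
```

Precision and recall on the held-out third are one way to "calculate success of retrieval"; the slides do not say which measure was used.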
Determining trends and topics in dental and craniofacial research
  • Entire set of dental research articles used
  • Knowledge visualization and bibliometric methods
  • Based on the assumption that articles in a given field are similar to one another (Hearst & Pedersen, 1996)
  • Similar articles and topics tend to cluster together
Bibliometric examples from other fields
  • Co-word analysis
    • Software engineering (Coulter, Monarch, and Konda, 1998)
  • Co-descriptor analysis
    • Information science (McCain, 1995)
  • Co-author analysis
    • Information retrieval literature (Ding et al., 1999)
  • Co-citation analysis
    • Medical informatics literature (Morris & McCain, 1998)
Visual methods to categorize literature
  • Co-occurrence vectors or weights
    • Weights based on co-occurrence of terms
  • Multidimensional scaling
    • Display of points in two or three dimensions
    • Points lie closer together in the display when articles are more similar
  • Clustering
    • Groups of points in close proximity to each other are bounded to provide an intellectual grouping
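Two of the steps above, co-occurrence weights and clustering, can be sketched in a few lines. This is an illustration under stated assumptions: terms are clustered by linking any pair that co-occurs at least `min_count` times (connected components), which is only one of many possible clustering rules, and multidimensional scaling is left out because it needs a linear algebra library. The term lists are invented.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(articles):
    """Count how often each unordered pair of terms appears on the same article."""
    pairs = Counter()
    for terms in articles:
        for a, b in combinations(sorted(set(terms)), 2):
            pairs[(a, b)] += 1
    return pairs


def clusters(pairs, min_count):
    """Group terms linked by at least `min_count` co-occurrences."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for (a, b), n in pairs.items():
        find(a)
        find(b)
        if n >= min_count:
            parent[find(a)] = find(b)
    groups = {}
    for term in parent:
        groups.setdefault(find(term), set()).add(term)
    return sorted(sorted(g) for g in groups.values())


# Invented MeSH term lists for four articles.
articles = [
    ["Dental Caries", "Fluorides", "Child"],
    ["Dental Caries", "Fluorides"],
    ["Periodontitis", "Dental Plaque"],
    ["Periodontitis", "Dental Plaque", "Child"],
]
pairs = cooccurrence(articles)
print(clusters(pairs, min_count=2))
```

With these four articles, "Child" co-occurs with everything only once, so it stays in its own group, while the caries and periodontal terms form two separate clusters.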
How do we cluster dental research?
  • Entire text of abstracts
  • MeSH terms only
    • Major headings
    • Subheadings
    • All MeSH headings
  • Journal titles
  • Combinations of the above
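The options listed above amount to choosing which fields of a record become clustering features. The sketch below assumes MeSH entries in the MEDLINE display form `Heading/subheading`, where an asterisk marks a major topic; treating a heading as major whenever any of its parts is starred is a simplification. The record layout (keys `abstract`, `mesh`, `journal`) and the function names are hypothetical.

```python
def parse_mesh(entry):
    """Split an entry such as 'Dental Caries/*prevention & control'."""
    heading, *subheadings = entry.split("/")
    is_major = heading.startswith("*") or any(s.startswith("*") for s in subheadings)
    return {
        "heading": heading.lstrip("*"),
        "subheadings": [s.lstrip("*") for s in subheadings],
        "major": is_major,
    }


def features(record, abstract=False, mesh="all", journal=False):
    """Build a feature set; mesh is 'none', 'major', 'all', or 'subheadings'."""
    feats = set()
    if abstract:
        feats.update("word:" + w for w in record["abstract"].lower().split())
    for entry in record["mesh"]:
        parsed = parse_mesh(entry)
        if mesh == "all" or (mesh == "major" and parsed["major"]):
            feats.add("mesh:" + parsed["heading"])
        if mesh == "subheadings":
            feats.update("sub:" + s for s in parsed["subheadings"])
    if journal:
        feats.add("journal:" + record["journal"])
    return feats


# Made-up record for illustration.
record = {
    "abstract": "Fluoride toothpaste and caries prevention in children",
    "mesh": ["Dental Caries/*prevention & control", "Fluorides/therapeutic use", "*Toothpastes"],
    "journal": "Example Journal of Dentistry",
}
print(sorted(features(record, mesh="major")))
print(sorted(features(record, mesh="all", journal=True)))
```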
Once clustering is done:
  • Cluster dental research within certain time periods (5 years)
  • Determine quantities of articles published for each cluster within each time period
  • Cluster including only journals with a given impact factor threshold
  • Study changes over time of different categories of research
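Counting articles per cluster within each five-year period can be sketched as follows. The start year 1966 comes from the project goals; the cluster labels and publication years are invented, and the function names are hypothetical.

```python
from collections import Counter


def period_start(year, first_year=1966, width=5):
    """Return the first year of the fixed-width period containing `year`."""
    return first_year + ((year - first_year) // width) * width


def counts_by_period(articles):
    """articles: iterable of (publication_year, cluster_label) pairs."""
    table = Counter()
    for year, cluster in articles:
        table[(period_start(year), cluster)] += 1
    return table


# Invented (year, cluster) pairs.
sample = [
    (1968, "caries"),
    (1970, "caries"),
    (1971, "implants"),
    (1998, "implants"),
    (1999, "implants"),
]
for (start, cluster), n in sorted(counts_by_period(sample).items()):
    print(f"{start}-{start + 4}  {cluster}: {n}")
```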
Summary
  • A comprehensive content analysis of the dental and craniofacial research literature has not been done.
  • Computerized methods can help to retrieve and categorize this literature.
  • Studying trends in dental research can help researchers gauge the relevance of current studies and may reveal future research opportunities.
Many thanks to the following:
  • Amy Gregg, MLIS, Dental Reference Librarian
    • Falk Library for the Health Sciences
    • University of Pittsburgh
  • Shyam Visweswaran, MD, NLM Fellow in Intelligent Systems
    • Center for Biomedical Informatics
    • University of Pittsburgh
  • All of my expert raters!
  • This research is supported by a training grant from the National Institute of Dental and Craniofacial Research and the National Library of Medicine.