Please have a seat. Our program will commence shortly.


Biomarker Automated Retrieval Tool (B.A.R.T.)

Ronny Chan, Kim Ngo

Earth Science Data Systems Dept.


Bioinformatics Relationship

  • Science produces massive amounts of data

  • Data needs to be analyzed, stored, & retrieved

     This is data-mining

  • We want to apply computer science to improve this process


Motivation

  • Problems with conventional data mining

    • Time consuming

    • Accuracy not defined (subjective)

  • No objective scientific info retrieval tool

Where are the Biomarkers?


Cancer Biomarkers

An indicator of cancerous growth.


Proposed Solution

Create a program that allows people to quickly scan literature for the most relevant keywords/biomarkers

[Figure labels: B.A.R.T.; BAG-1, ERBB2, HER-2, EP-CAM, HPEBP4]


Significance

  • What need does the project address?

    • More efficient research

    • Save time

[Figure labels: B.A.R.T.; conventional; enhanced]


Goals

  • Make biomarker/keyword searches more efficient

  • Learn Java

  • Learn SQL


Approach

  • Write a program

    • Read in articles

    • Use part of Vector Space Model algorithm to rank terms

    • Output relevant terms in statistical rankings
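The three steps above can be sketched in Java, the language the project set out to learn. This is a minimal illustration, not the tool's actual code: the slides say only that "part of" the Vector Space Model was used, so the tf * log10(N / df) weighting, the stop-word list, and every class and method name here are assumptions.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: ranks the terms of a small article set by summed tf * idf. */
final class TermRankerSketch {

    // Illustrative stop-word list (an assumption, not the tool's actual list).
    static final Set<String> STOP_WORDS =
            Set.of("the", "of", "in", "a", "and", "they", "is", "to");

    /** Lower-cases the text, splits on non-alphanumeric/non-hyphen runs, drops stop words. */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9-]+")) {
            if (!t.isEmpty() && !STOP_WORDS.contains(t)) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    /** Returns terms by descending score; score = sum over articles of tf * log10(N / df). */
    static List<Map.Entry<String, Double>> rank(List<String> articles) {
        int n = articles.size();
        List<Map<String, Integer>> tfPerArticle = new ArrayList<>();
        Map<String, Integer> df = new HashMap<>();
        for (String article : articles) {
            Map<String, Integer> tf = new HashMap<>();
            for (String term : tokenize(article)) {
                tf.merge(term, 1, Integer::sum);      // term frequency in this article
            }
            for (String term : tf.keySet()) {
                df.merge(term, 1, Integer::sum);      // number of articles containing the term
            }
            tfPerArticle.add(tf);
        }
        Map<String, Double> score = new HashMap<>();
        for (Map<String, Integer> tf : tfPerArticle) {
            for (Map.Entry<String, Integer> e : tf.entrySet()) {
                double idf = Math.log10((double) n / df.get(e.getKey()));
                score.merge(e.getKey(), e.getValue() * idf, Double::sum);
            }
        }
        List<Map.Entry<String, Double>> ranked = new ArrayList<>(score.entrySet());
        // Highest score first; ties broken alphabetically so the output is deterministic.
        ranked.sort(Map.Entry.<String, Double>comparingByValue().reversed()
                .thenComparing(Map.Entry.<String, Double>comparingByKey()));
        return ranked;
    }
}
```

Under this weighting a term that appears in every article scores zero, while a term concentrated in a few articles rises to the top of the ranking.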

[Figure labels: BRCA1 vs. they]


Information Retrieval System

Introduced by Gerard Salton in the 1960s.

Used widely in different search engines

Vector Space Model


Algorithm for B.A.R.T.

Keywords Input

PubMed Query Agent

Keyword Parser

Content Analyzer

Content Ranker

Data Store

Data Retrieval and Output
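Read as a linear pipeline, the stages listed above could be wired together roughly as follows. The slides do not say how the work is divided between stages, so this reading is an assumption, and every name is hypothetical. The PubMed Query Agent is a pluggable function with no network access, the Data Store is an in-memory map instead of SQL, and the ranker uses raw counts as a placeholder for the real term weighting.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical wiring of the listed stages; all names are illustrative. */
final class BartPipelineSketch {

    // Stand-in for the Data Store stage (in-memory map, not a SQL database).
    private final Map<String, List<String>> store = new HashMap<>();

    // Stand-in for the PubMed Query Agent: maps a keyword query to article texts.
    private final Function<String, List<String>> queryAgent;

    BartPipelineSketch(Function<String, List<String>> queryAgent) {
        this.queryAgent = queryAgent;
    }

    /** Keyword Parser (assumed role): splits article text into candidate terms. */
    static List<String> parse(String article) {
        List<String> terms = new ArrayList<>();
        for (String w : article.toLowerCase(Locale.ROOT).split("[^a-z0-9-]+")) {
            if (!w.isEmpty()) {
                terms.add(w);
            }
        }
        return terms;
    }

    /** Content Analyzer (assumed role): counts each term across all fetched articles. */
    static Map<String, Integer> analyze(List<String> articles) {
        Map<String, Integer> counts = new HashMap<>();
        for (String article : articles) {
            for (String term : parse(article)) {
                counts.merge(term, 1, Integer::sum);
            }
        }
        return counts;
    }

    /** Content Ranker: orders terms by descending count, ties alphabetical. */
    static List<String> rank(Map<String, Integer> counts) {
        List<String> terms = new ArrayList<>(counts.keySet());
        terms.sort((a, b) -> {
            int byCount = Integer.compare(counts.get(b), counts.get(a));
            return byCount != 0 ? byCount : a.compareTo(b);
        });
        return terms;
    }

    /** Runs every stage for one keyword query and returns the stored ranking. */
    List<String> run(String keywordInput) {
        String query = keywordInput.trim();                  // Keywords Input
        List<String> articles = queryAgent.apply(query);     // PubMed Query Agent
        List<String> ranked = rank(analyze(articles));       // Parser, Analyzer, Ranker
        store.put(query, ranked);                            // Data Store
        return retrieve(query);                              // Data Retrieval and Output
    }

    /** Looks up a previously stored ranking; empty if the query was never run. */
    List<String> retrieve(String query) {
        return store.getOrDefault(query, List.of());
    }
}
```

Passing the query agent in as a function keeps the sketch testable offline: a test can supply canned article text, and a real fetcher could be swapped in later without touching the other stages.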


Results

  • DCIS

  • CU-TP3982

  • ERBB2

  • HER-2

  • HPEBP4

  • BAG-1

  • EP-CAM

  • 99M


Lessons & Difficulties

  • Deciding on algorithm choice

    • Ease of implementation and effectiveness

  • Limited knowledge & experience

    • Java, SQL

    • Initial implementation is slow

5 ARTICLES=160 sec

20 ARTICLES=1904 sec

100 ARTICLES=8^38 years

UPDATE: AUGUST 18, 2004

 100 ARTICLES=8^19 years


Future work

  • Apply different term weight functions to make results more robust

  • Optimize the program for speed
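As one way to picture the first item, the term-frequency part of the weight can be made pluggable so that different scalings are compared on the same data. The three variants below (raw, logarithmic, binary) are common textbook choices and are not taken from the slides; all names are hypothetical.

```java
import java.util.function.IntToDoubleFunction;

/** Hypothetical sketch of interchangeable term-frequency scalings. */
final class TermWeightSketch {

    // Raw count: weight grows linearly with the number of occurrences.
    static final IntToDoubleFunction RAW = tf -> tf;

    // Logarithmic: dampens the effect of terms repeated many times.
    static final IntToDoubleFunction LOG = tf -> tf > 0 ? 1.0 + Math.log(tf) : 0.0;

    // Binary: only records whether the term occurs at all.
    static final IntToDoubleFunction BINARY = tf -> tf > 0 ? 1.0 : 0.0;

    /** Scaled tf multiplied by idf = log10(docCount / docFreq). */
    static double weight(IntToDoubleFunction tfScale, int tf, int docCount, int docFreq) {
        return tfScale.applyAsDouble(tf) * Math.log10((double) docCount / docFreq);
    }
}
```

For example, `TermWeightSketch.weight(TermWeightSketch.LOG, 8, 100, 5)` weights a term seen 8 times in one article and present in 5 of 100 articles.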


Citations

  • http://ir.iit.edu/~dagr/cs529/files/handouts/03VectorSpaceImplementation-6per.PDF

  • http://classes.engr.oregonstate.edu/eecs/spring2004/cs419/10

  • http://www.cs.ust.hk/~dlee/Papers/ir/ieee-sw-rank.pdf

  • http://hartford.lti.cs.cmu.edu/classes/95-778/Lectures/04-BooleanVectorSpaceB.pdf

  • Biomarkers Definitions Working Group.

    Biomarkers and surrogate endpoints: preferred definitions and conceptual framework. Clin. Pharmacol. Ther. 69(3), 89-95 (2001).


Acknowledgements

National Science Foundation (NSF)

National Institutes of Health (NIH)

Earth Science Data Systems, JPL

Tina Xiao

Paul Ramirez

Chris Mattmann

Roshanak Roshandel

Sean Hardman

Southern California Bioinformatics Summer Institute (So Cal BSI)

SoCalBSI Professors

Jacqueline Heras

ALL SoCalBSI Colleagues


VSM Example

Q: malignant breast cancer

D1: detection of malignant level in the cell

D2: sighting of breast stage in the breast cancer

D3: detection of malignant stage in the cancer


Example Continued…

Keyword tf * idf
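The table for this slide did not survive in the transcript, so the sketch below recomputes the example. It makes no claim to reproduce the original figures. It assumes raw term counts for tf, idf = log10(N / df) over the three documents, and cosine similarity between the query and each document. All names are hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical tf * idf walk-through of the Q, D1, D2, D3 example. */
final class VsmExampleSketch {

    static final String QUERY = "malignant breast cancer";
    static final List<String> DOCS = List.of(
            "detection of malignant level in the cell",        // D1
            "sighting of breast stage in the breast cancer",   // D2
            "detection of malignant stage in the cancer");     // D3

    /** Raw term frequencies of a space-separated text. */
    static Map<String, Integer> termFreq(String text) {
        Map<String, Integer> tf = new HashMap<>();
        for (String t : text.split(" ")) {
            tf.merge(t, 1, Integer::sum);
        }
        return tf;
    }

    /** idf(t) = log10(N / df(t)), where df counts the documents containing t. */
    static Map<String, Double> inverseDocFreq(List<String> docs) {
        Map<String, Integer> df = new HashMap<>();
        for (String d : docs) {
            for (String t : termFreq(d).keySet()) {
                df.merge(t, 1, Integer::sum);
            }
        }
        Map<String, Double> idf = new HashMap<>();
        df.forEach((t, n) -> idf.put(t, Math.log10((double) docs.size() / n)));
        return idf;
    }

    /** tf * idf weight vector; terms absent from the collection get weight 0. */
    static Map<String, Double> weights(String text, Map<String, Double> idf) {
        Map<String, Double> w = new HashMap<>();
        termFreq(text).forEach((t, f) -> w.put(t, f * idf.getOrDefault(t, 0.0)));
        return w;
    }

    /** Cosine of the angle between two sparse weight vectors. */
    static double cosine(Map<String, Double> a, Map<String, Double> b) {
        double dot = 0, na = 0, nb = 0;
        for (Map.Entry<String, Double> e : a.entrySet()) {
            dot += e.getValue() * b.getOrDefault(e.getKey(), 0.0);
            na += e.getValue() * e.getValue();
        }
        for (double v : b.values()) {
            nb += v * v;
        }
        return (na == 0 || nb == 0) ? 0 : dot / Math.sqrt(na * nb);
    }

    /** Similarity of QUERY to D1, D2, D3, in that order. */
    static double[] scores() {
        Map<String, Double> idf = inverseDocFreq(DOCS);
        Map<String, Double> q = weights(QUERY, idf);
        double[] s = new double[DOCS.size()];
        for (int i = 0; i < s.length; i++) {
            s[i] = cosine(q, weights(DOCS.get(i), idf));
        }
        return s;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(scores()));
    }
}
```

Under these assumptions D2 ranks first, then D3, then D1. Words such as "of", "in", and "the" appear in all three documents, so their idf is log10(3/3) = 0 and they drop out of the comparison without an explicit stop-word list.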

