
Limsoon Wong KRDL




Presentation Transcript


  1. Datamining: Turning Biological Data into Gold Limsoon Wong KRDL

  2. What is Datamining? Jonathan’s blocks and Jessica’s blocks: whose block is this? • Jonathan’s rules: Blue or Circle • Jessica’s rules: All the rest
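The two rules on this slide can be written down as a tiny classifier. This is only a sketch: the `Block` type, its `colour` and `shape` fields, and the example blocks are assumptions made for illustration, since the slide gives only the rules themselves.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    # Hypothetical representation of a toy block; not from the slides.
    colour: str
    shape: str


def owner(block: Block) -> str:
    """Apply the slide's rules: blue or circle -> Jonathan, all the rest -> Jessica."""
    if block.colour == "blue" or block.shape == "circle":
        return "Jonathan"
    return "Jessica"


# Made-up example blocks.
examples = [Block("blue", "square"), Block("red", "circle"), Block("red", "square")]
owners = [owner(b) for b in examples]
print(owners)
```

Datamining, in this picture, is the reverse step: recovering rules like these from labelled blocks when nobody has written them down.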

  3. What is Datamining? Question: Can you explain how?

  4. What are the Benefits? • To the patient: better drug, better treatment • To the pharma: save time, save cost, make more $ • To the scientist: better science

  5. The Datamining Process

  6. Epitope Prediction TRAP-559AA MNHLGNVKYLVIVFLIFFDLFLVNGRDVQNNIVDEIKYSE EVCNDQVDLYLLMDCSGSIRRHNWVNHAVPLAMKLIQQLN LNDNAIHLYVNVFSNNAKEIIRLHSDASKNKEKALIIIRS LLSTNLPYGRTNLTDALLQVRKHLNDRINRENANQLVVIL TDGIPDSIQDSLKESRKLSDRGVKIAVFGIGQGINVAFNR FLVGCHPSDGKCNLYADSAWENVKNVIGPFMKAVCVEVEK TASCGVWDEWSPCSVTCGKGTRSRKREILHEGCTSEIQEQ CEEERCPPKWEPLDVPDEPEDDQPRPRGDNSSVQKPEENI IDNNPQEPSPNPEEGKDENPNGFDLDENPENPPNPDIPEQ KPNIPEDSEKEVPSDVPKNPEDDREENFDIPKKPENKHDN QNNLPNDKSDRNIPYSPLPPKVLDNERKQSDPQSQDNNGN RHVPNSEDRETRPHGRNNENRSYNRKYNDTPKHPEREEHE KPDNNKKKGESDNKYKIAGGIAGGLALLACAGLAYKFVVP GAATPYAGEPAPFDETLGEEDKDLDEPEQFRLPEENEWN
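A common first step in epitope prediction is to cut the protein into overlapping fixed-length peptides, which a model then scores one by one. The sketch below shows only that windowing step. The window length of 9 residues is an assumption (the slide does not state the peptide length), the function name `peptide_windows` is hypothetical, and only the first 40 residues of the sequence above are used to keep the example short.

```python
def peptide_windows(sequence: str, length: int = 9):
    """Return (1-based start position, peptide) for every overlapping window."""
    seq = "".join(sequence.split())  # drop the spaces between 40-residue chunks
    return [(i + 1, seq[i:i + length]) for i in range(len(seq) - length + 1)]


# First 40 residues of the TRAP sequence shown on the slide.
prefix = "MNHLGNVKYLVIVFLIFFDLFLVNGRDVQNNIVDEIKYSE"
windows = peptide_windows(prefix)

print(len(windows))   # a sequence of n residues yields n - length + 1 windows
print(windows[0])
```

Each window would then be passed to a scoring model such as the ANN or the BIMAS matrix mentioned on the next slide; that scoring is not shown here.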

  7. Epitope Prediction Results • Prediction by our ANN model for HLA-A11: 29 predictions, 22 epitopes, 76% specificity • Prediction by BIMAS matrix for HLA-A*1101 [Figure: number of experimental binders by BIMAS rank, axis marked 1, 66, 100: 19 (52.8%), 5 (13.9%), 12 (33.3%)]
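The percentages on this slide can be checked with simple arithmetic. Two readings are assumed here rather than stated by the slide: that the 76% figure is the ratio of 22 epitopes to 29 predictions, and that the three binder counts share one total.

```python
# Figures taken from the slide.
predictions = 29
confirmed_epitopes = 22

# Assumption: the slide's "76%" is confirmed epitopes / predictions.
ratio = confirmed_epitopes / predictions
print(f"{ratio:.1%}")

# Experimental binders in the three BIMAS rank groups.
binders = [19, 5, 12]
total = sum(binders)
shares = [round(100 * n / total, 1) for n in binders]
print(total, shares)
```

Under these assumptions the counts reproduce the quoted figures: 22 of 29 rounds to 76%, and the three groups add up to 36 binders.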

  8. Gene Expression Analysis • Clustering gene expression profiles • Classifying gene expression profiles • Finding stable differentially expressed genes

  9. Gene Expression Analysis Results • The Discovery System • Correlation test • Voter selection • Class prediction
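The three steps listed here (correlation test, voter selection, class prediction) resemble a signal-to-noise weighted-voting classifier, a well-known approach for expression data. The sketch below illustrates that general scheme. It is not the Discovery System itself, whose statistic and selection rule the slide does not give. All function names and the toy data are made up.

```python
from statistics import mean, stdev


def signal_to_noise(a, b):
    """Correlation of one gene with the class split: (mean_a - mean_b) / (sd_a + sd_b)."""
    return (mean(a) - mean(b)) / (stdev(a) + stdev(b))  # assumes non-zero spread


def select_voters(class_a, class_b, k):
    """Score every gene, then keep the k genes with the largest absolute score."""
    scored = []
    for g in range(len(class_a[0])):
        a = [sample[g] for sample in class_a]
        b = [sample[g] for sample in class_b]
        scored.append((g, signal_to_noise(a, b), (mean(a) + mean(b)) / 2))
    scored.sort(key=lambda t: abs(t[1]), reverse=True)
    return scored[:k]


def predict(sample, voters):
    """Each voter adds weight * (expression - midpoint); the sign of the sum decides."""
    total = sum(w * (sample[g] - mid) for g, w, mid in voters)
    return "A" if total > 0 else "B"


# Made-up expression values: rows are samples, columns are genes.
class_a = [[9.0, 5.0, 1.0], [8.0, 4.8, 2.0], [10.0, 5.2, 1.5]]
class_b = [[2.0, 5.1, 8.0], [3.0, 4.9, 9.0], [1.0, 5.0, 7.5]]

voters = select_voters(class_a, class_b, k=2)
voter_genes = [g for g, _, _ in voters]
print(voter_genes)
print(predict([8.5, 5.0, 2.0], voters), predict([2.5, 5.0, 8.5], voters))
```

In the toy data, gene 1 barely differs between the classes, so it is not selected as a voter; genes 2 and 0 carry the prediction.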

  10. WEB Protein Interaction Extraction “What are the protein-protein interaction pathways from the latest reported discoveries?”

  11. Protein Interaction Extraction Results • Rule-based system for processing free texts in scientific abstracts • Specialized in • extracting protein names • extracting protein-protein interactions Jak1
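A rule-based extractor of this kind can be pictured as a set of patterns applied to sentences. The sketch below uses a single hypothetical rule; the name heuristic (a capitalised token containing a digit, such as Jak1), the verb list, and the example sentence are all invented for illustration and are not the rules of the system described on the slide.

```python
import re

# Hypothetical heuristic: capitalised token containing at least one digit, e.g. "Jak1".
PROTEIN = r"[A-Z][A-Za-z]*\d+[A-Za-z]*"
# Hypothetical list of interaction verbs.
VERBS = r"interacts with|binds(?: to)?|phosphorylates|activates|inhibits"
RULE = re.compile(rf"\b({PROTEIN})\s+({VERBS})\s+({PROTEIN})\b")


def extract_interactions(text):
    """Return (protein, verb, protein) triples matched by the single rule above."""
    return [(m.group(1), m.group(2), m.group(3)) for m in RULE.finditer(text)]


# Invented sentence; Abc2 and Xyz3 are placeholder names, not real findings.
sentence = "In this made-up sentence, Jak1 interacts with Abc2, and Xyz3 inhibits Jak1."
triples = extract_interactions(sentence)
print(triples)
```

A real system would need many more rules, for example to handle passive voice, multi-word protein names, and negation.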

  12. Transcription Start Prediction

  13. Transcription Start Prediction Results

  14. Medical Record Analysis • Looking for patterns that are • valid • novel • useful • understandable

  15. Medical Record Analysis Results • DeEPs, a novel “emerging pattern” method • Beats C4.5, CBA, LB, NB, TAN in 21 out of 32 UCI benchmarks • Works for gene expressions
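An emerging pattern is a set of items whose support rises sharply from one class of records to another; the ratio of the two supports is its growth rate. The sketch below computes only that quantity on made-up records. It is not the DeEPs algorithm, and the function names and item labels are hypothetical.

```python
def support(pattern, dataset):
    """Fraction of records that contain every item of the pattern."""
    pattern = frozenset(pattern)
    return sum(pattern <= record for record in dataset) / len(dataset)


def growth_rate(pattern, background, target):
    """Support in the target class divided by support in the background class."""
    s_bg = support(pattern, background)
    s_tg = support(pattern, target)
    if s_bg == 0:
        # Pattern never seen in the background: infinite growth if it occurs in the target.
        return float("inf") if s_tg > 0 else 0.0
    return s_tg / s_bg


# Made-up records; each record is a set of observed items.
background = [frozenset(r) for r in ({"a", "b"}, {"b", "c"}, {"a", "c"}, {"b"})]
target = [frozenset(r) for r in ({"a", "b", "d"}, {"a", "d"}, {"a", "b"}, {"c", "d"})]

for pattern in ({"a"}, {"a", "b"}, {"d"}):
    print(sorted(pattern), growth_rate(pattern, background, target))
```

Patterns with a large or infinite growth rate, such as `{"d"}` in the toy data, are the ones that discriminate between the two classes.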

  16. Under the Hood • Artificial neural network • Neighbourhood analysis • Non-linear analysis • Template matching • Emerging pattern • Hidden Markov models • Bayesian inference • Decision tree induction • ...

  17. Behind the Scene • Epitope Prediction: Vladimir Brusic, Judice Koh, Seah Seng Hong, Zhang Guanglan, Yu Kun • Transcription Start Prediction: Vladimir Bajic, Seah Seng Hong • Gene Expression Analysis: Zhang Louxin, Zhang Zhuo, Zhu Song • Medical Records: Li Jinyan • Protein Interaction Extraction: Ng See Kiong, Zhang Zhuo
