
Bio- and Medical-Informatics

Presenter: Russell Greiner. Bio- and Medical-Informatics. Vision Statement. *. data. Helping the world understand … and make informed decisions. bio- and medical- informatics. * Potential beneficiaries: biological and medical researchers, practicing clinicians, and


Presentation Transcript


  1. Presenter: Russell Greiner Bio- and Medical-Informatics

  2. Vision Statement • Helping the world understand bio- and medical-informatics data … and make informed decisions.* • * Potential beneficiaries: • biological and medical researchers, • practicing clinicians, and • the people they serve.

  3. Motivation • High impact on bio-science and society • Local bioinformatics expertise • ML has a key role: • actual patterns (predictors, …) not known • lots of data • Challenging ML problems • data is high dimensional, noisy, … • often structured data • need to obtain training data, labels, … • …

  4. Personnel • PI synergy: • R. Greiner, R. Goebel, C. Szepesvari • 18 Software developers • 4 Postdocs (3 AICML) • 14 UGrad / IIP students • 17 Grad students (11 MSc, 6 PhD)

  5. Partners/Collaborators • 6 UofA CS profs • 5 UofA Bioscientists • Non-UofA collaborators: • Cross Cancer Institute (Alberta Cancer Board) • University of Alberta Hospital • Boston University, Miami University, Dept of Homeland Security

  6. Additional Resources • Grants • $440K PENCE (Proteome Analyst) • $600K ACB (Brain Tumour) • Part of • $3.6M GenomeCanada (Human Metabolome Project) • $5.5M GenomeCanada (Alberta Transplant Institute) • $1.7M ACB (misc PolyomX grants) • In Kind: Data from CCI, ATI • 1970+ MRI scans (260 patients); 270 labeled • 300 (30K – 50K) Microarray chips • 80 (250K) SNP Chips

  7. Highlights • The Human Metabolome is ~completed and annotated • described in Science, Nature, … • Human Metabolome DataBase used by 78,673 Visitors (438,481 pageviews) • Proteome Analyst is world’s best predictor of subcell location • analyzed >1,000,000 proteins, for >1,000 users • Patent filed for Brain Tumor Software • Effective new approach for learning to classify Microarrays • Virus classifier obtained 98.5% accuracy!

  8. [Figure labels: SNP Analysis; Microarray; Metabolomics; Proteomics; 30,000]

  9. Projects and Status • Brain Tumour Analysis (ongoing) (poster #5) • Human Metabolome (new) • PolyomX (ongoing) (poster #8) • Proteome Analysis (ongoing) (posters #6, 7) • Whole Genome Analysis (ongoing) [Figure labels: Subcellular Locations; Metabolomics: 1500 Chemicals; Proteomics: 3000 Enzymes; Genomics: 30,000 Genes]

  10. Technical Details Brain Tumour Project

  11. Standard Practice! How to Treat Brain Tumours? • Irradiate ONLY visible tumour • No! Must also kill “(radiographically) occult” cancer cells surrounding tumour! • Irradiate everything within 2 cm margin around tumour • But that … • also includes normal cells • still misses other occult cells
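The fixed-margin rule on this slide can be illustrated with a minimal sketch. It assumes a 2D grid of cells, a hypothetical 5 mm cell spacing, and a single hypothetical tumour cell; none of these values come from the presentation. It shows how the 2 cm margin sweeps in normal cells regardless of where occult cells actually are.

```python
import math

def margin_region(tumour_cells, grid_shape, spacing_mm, margin_mm=20.0):
    """Return the set of (row, col) cells within margin_mm of any tumour cell."""
    rows, cols = grid_shape
    region = set()
    for r in range(rows):
        for c in range(cols):
            for (tr, tc) in tumour_cells:
                # Physical distance between cell centres, in millimetres.
                dist = math.hypot((r - tr) * spacing_mm, (c - tc) * spacing_mm)
                if dist <= margin_mm:
                    region.add((r, c))
                    break
    return region

# Hypothetical toy example: one visible tumour cell in an 11 x 11 grid.
tumour = {(5, 5)}
target = margin_region(tumour, grid_shape=(11, 11), spacing_mm=5.0)
normal_in_target = target - tumour  # cells irradiated that are not visible tumour
print(len(target), len(normal_in_target))
```

Every cell in `normal_in_target` is irradiated purely because of its distance from the visible tumour, which is the cost the following slide proposes to reduce.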

  12. How to Treat Brain Tumours? BETTER: • Predict (from earlier data) location of occult cells • Just irradiate that region! • Minimize number of normal cells zapped, to minimize loss of brain function • Meaningful, as conformal radiotherapy can zap arbitrary shapes!

  13. How to Predict? • Occult cells region ≈ where tumour cells will grow next (Assumption) ⇒ use prior data (260 patients) • Observe each patient over time: how tumours have grown • Predict patterns, based on properties of tumour, patient, region, …

  14. Technology… • Using Discriminative Random Field • Segmentation • Growth Prediction • Extensions: • Increase Accuracy: Support Vector Random Field • Increase Computational Efficiency: Decoupled SVRF • Exploit Unlabeled Region: Semi-Supervised (D)SVRF

  15. Brain Tumour: Future Work • Incorporate other modalities • Diffusion Tensor Imaging • PET • … • Compute other features: • Textures (BGLAM) • Using alignment • Improve learning algorithms • Use Active Learning techniques to determine • which regions/slices/studies/patients to label • using which human labeler
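One common active-learning heuristic is uncertainty sampling: ask the human labeler about the items the current model is least sure of. The sketch below is only an illustration of that general idea; the slide does not say which active-learning technique is planned, and the slice names and probabilities are hypothetical.

```python
def most_uncertain(probabilities, budget):
    """Pick the `budget` items whose predicted P(tumour) is closest to 0.5.

    probabilities: dict mapping an item id (e.g. a slice) to the current
    model's predicted probability that the item contains tumour.
    """
    ranked = sorted(probabilities, key=lambda item: abs(probabilities[item] - 0.5))
    return ranked[:budget]

# Hypothetical model outputs for four unlabeled MRI slices.
scores = {"slice_a": 0.95, "slice_b": 0.52, "slice_c": 0.10, "slice_d": 0.40}
to_label = most_uncertain(scores, budget=2)
print(to_label)
```

The same ranking could be applied at the level of regions, studies, or patients; choosing which human labeler to ask would need an additional cost or reliability model that is not sketched here.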

  16. Projects and Status • Brain Tumour Analysis (ongoing) (poster #5) • Human Metabolome (new) • PolyomX (ongoing) (poster #8) • Proteome Analysis (ongoing) (posters #6, 7) • Whole Genome Analysis (ongoing) [Figure labels: Subcellular Locations; Metabolomics: 1500 Chemicals; Proteomics: 3000 Enzymes; Genomics: 30,000 Genes]

  17. Technical Details Human Metabolome Project

  18. HMP Overview • Goal: identify & quantify the entire human “metabolome” • all small endogenous and exogenous chemicals that appear in a non-trivial quantity in people… • ``HMDB: The Human Metabolome Database'', Nucleic Acids Research, January 2007. [Figure labels: Metabolomics: 2300 Chemicals; Proteomics: 3200 Enzymes; Genomics: 30,000 Genes]

  19. HMP #1: Fast Profiling • Given an NMR spectrum (blood, urine, CSF), • autonomously find & quantify >100 compounds, • in < 2 minutes • If we know the “NMR signature” of each metabolite … then linear least squares • Except … the “signature” is not stable: it shifts with unobservable ions • Think EM… • ML challenge • Acquire “conditional NMR signature” • Active Learning
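The "known signature, then linear least squares" step can be sketched as follows, assuming the observed spectrum is a linear mixture of fixed per-compound reference spectra. The signatures and concentrations below are hypothetical toy values, and the sketch deliberately ignores the signature shifts that the slide identifies as the real difficulty.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_concentrations(signatures, spectrum):
    """Least-squares concentrations c minimising ||S c - y||^2.

    signatures: one reference spectrum (list of intensities) per compound.
    spectrum:   the observed mixture spectrum y, same length as each signature.
    """
    k = len(signatures)
    # Normal equations: (S^T S) c = S^T y
    gram = [[sum(p * q for p, q in zip(signatures[i], signatures[j]))
             for j in range(k)] for i in range(k)]
    rhs = [sum(p * y for p, y in zip(signatures[i], spectrum)) for i in range(k)]
    return solve_linear_system(gram, rhs)

# Hypothetical toy data: two compounds, four spectral bins.
signatures = [
    [1.0, 0.5, 0.0, 0.0],
    [0.0, 0.5, 1.0, 0.2],
]
true_conc = [2.0, 3.0]
spectrum = [sum(c * s[i] for c, s in zip(true_conc, signatures)) for i in range(4)]
estimated = fit_concentrations(signatures, spectrum)
print(estimated)
```

With stable signatures the fit recovers the mixing concentrations directly; once peak positions shift with unobserved conditions, the signatures themselves become unknowns, which is where the slide's "Think EM" remark comes in.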

  20. HMP #2: Classify Patients • Given: • Metabolic profile of patient • NMR/Mass spec of patient’s urine, blood, CSF • Predict: • Patient’s disease state • Reaction to Rx; Cachexia; Cancer • The role of ML … • Learn Profile → Dx classifier [Figure: Cachexia? Collect patient urine → Obtain NMR spectrum → Compute Metabolic Profile → Classify Profile (Classifier) → Cachexia = Yes!]
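As a minimal sketch of "learn a Profile → Dx classifier", the example below uses a nearest-centroid rule. This is a stand-in chosen for brevity: the slide does not name the learning algorithm, and the two-metabolite profiles and labels are hypothetical.

```python
import math

def train_centroids(profiles, labels):
    """Average the metabolic profiles of each diagnosis label."""
    sums, counts = {}, {}
    for profile, label in zip(profiles, labels):
        acc = sums.setdefault(label, [0.0] * len(profile))
        for i, value in enumerate(profile):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}

def classify(centroids, profile):
    """Return the label whose centroid is closest (Euclidean) to the profile."""
    return min(centroids, key=lambda label: math.dist(centroids[label], profile))

# Hypothetical training data: concentrations of two metabolites per patient.
train_profiles = [[1.0, 0.2], [1.2, 0.1], [0.2, 1.1], [0.1, 0.9]]
train_labels = ["cachexia", "cachexia", "control", "control"]
centroids = train_centroids(train_profiles, train_labels)

prediction = classify(centroids, [0.9, 0.3])
print(prediction)
```

The same train-then-classify shape applies to the other "Given / Predict" tasks in this deck; only the input description and the label set change.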

  21. HMP #3: Chemical Property • Given: • Specific metabolite (chemical) • Predict: • Chemical properties of metabolite • Solubility, Melting point, … • Biological properties of metabolite • which reactions consume it, … • The role of ML … • Learn Metabolite → Property classifier

  22. Technical Details PolyomX Project

  23. PolyomX • Given: • Description of a patient • (SNP, Microarray, Metabolomic Profile, …) • Predict: • Dx: Breast Cancer, Ovarian Cancer, … • Rx: Prostate Cancer Toxicity, Cachexia, … • The role of ML … • Learn Patient → Dx classifier, … • ``Predictive Models for Breast Cancer Susceptibility from Multiple, Single Nucleotide Polymorphisms'', Clinical Cancer Research, April 2004. • ``Association of DNA Repair and Steroid Metabolism Gene Polymorphisms with Clinical Late Toxicity in Patients Treated with Conformal Radiotherapy for Prostate Cancer'', Clinical Cancer Research, April 2006.

  24. PolyomX: Future Work • Better tools for analyzing microarrays • Rank-One Bicluster Classifier (RoBiC) • Scaling up to 250K SNP chips • Incorporating >1 modality • Many other tasks: • Ovarian Cancer (microarray) • Use pathways to understand microarray • Microtubules docking • …

  25. Technical Details Proteome Analyst

  26. Proteome Analysis • Given: • Protein (FASTA format) • Predict: Properties of Protein • General function • Subcellular localization • The role of ML … • Learn Protein → Location classifier

  27. Results so far • Proteome Analyst classifiers • General Function: 80 – 90% • SubCellular Location: ~90% • Best known, by any system! (BioInformatics, 2004) • “Explain” facility has already helped users to identify problems in dataset… • ``Proteome Analyst: Custom Predictions with Explanations in a Web-based Tool for High-Throughput Proteome Annotations'', Nucleic Acids Research, July 2004. • ``Visual Explanation and Auditing of Evidence with Additive Classifiers'', IAAI06, July 2006. • ``PA-GOSUB: A Searchable Database of Model Organism Protein Sequences With Their Predicted GO Molecular Function and Subcellular Localization'', Nucleic Acids Research, Dec 2005. • ``The Path-A metabolic pathway prediction web server'', Nucleic Acids Research, July 2006.

  28. Current Proteome Analyst Tasks • Analyze metabolic pathways • Incorporate hierarchy (GO) • Use other information • Motifs in protein, … • Other applications • Relate to Microarray data • Use GLOBAL properties of complete-proteome … phylogenetic hierarchy • …

  29. Technical Details Whole Genome Analysis

  30. Whole Genome Analysis • heuristic selection of whole genome substrings, to increase efficiency and accuracy of subtype identification in HIV genome • construct Complete Composition Vector (CCV) nucleotide representation, as approximate signature of viral genome • 100% recognition of subtypes in 867 whole genome examples
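A composition-vector signature can be sketched as the relative frequencies of all substrings (k-mers) up to some length. The version below is a simplified illustration: the slide does not give the exact CCV construction, so the plain frequency normalisation, the `max_k` parameter, and the toy sequence are all assumptions.

```python
from itertools import product

def composition_vector(sequence, max_k=2, alphabet="ACGT"):
    """Concatenate relative k-mer frequencies for k = 1 .. max_k.

    Windows containing symbols outside `alphabet` still count toward the
    total number of windows but get no entry of their own in the vector.
    """
    seq = sequence.upper()
    vector = []
    for k in range(1, max_k + 1):
        total = max(len(seq) - k + 1, 0)  # number of length-k windows
        counts = {}
        for i in range(total):
            kmer = seq[i:i + k]
            counts[kmer] = counts.get(kmer, 0) + 1
        # Fixed lexicographic order so vectors from different genomes align.
        for letters in product(alphabet, repeat=k):
            kmer = "".join(letters)
            vector.append(counts.get(kmer, 0) / total if total else 0.0)
    return vector

# Hypothetical toy sequence standing in for a viral genome.
vec = composition_vector("ACGTACGT", max_k=2)
print(len(vec), vec[:4])
```

Two genomes can then be compared by any vector distance, and a subtype assigned from the nearest labeled examples; selecting only informative substrings, as the first bullet describes, would shrink the vector further.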

  31. Other Bioinformatics Tasks • Predict Bull’s Expected Breeding Value • from SNPs • Bovine Haplotype • Predict Tumour Rejection • from Microarray • Other challenges from colleagues at Univ Hospital, Cross Cancer Inst. • …
