Scientific Applications of Machine Learning

Eric Mjolsness, Scientific Inference Systems Laboratory, Donald Bren School of Information and Computer Sciences, and Institute for Genomics and Bioinformatics, University of California, Irvine.


Presentation Transcript


  1. Scientific Applications of Machine Learning
     Eric Mjolsness
     Scientific Inference Systems Laboratory
     Donald Bren School of Information and Computer Sciences, and Institute for Genomics and Bioinformatics
     University of California, Irvine

  2. Scientific Imagery Applications
     • NGC 7331 - http://photojournal.jpl.nasa.gov/catalog/PIA06322
     • Arabidopsis SAM - Meyerowitz Lab

  3. Some Basic Machine Learning Distinctions
     • Supervised vs. unsupervised learning
       • Supervised, e.g. classification and regression
         • Feature selection
         • Regression for phenomenological model fitting, e.g. GRNs
       • Unsupervised, e.g. clustering; may be a preprocessor
     • Generative vs. kernel methods
       • Generative (statistical inference) models
       • Kernel methods, e.g. Support Vector Machines
     • Vector vs. relationship data
       • Vector data: preprocessed image features Δlog I, Δx, …
       • Images, time series, shifted spectra - semigroup actions
       • Sparse graph/relationship data - permutation actions
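The generative vs. kernel distinction on this slide can be illustrated with a minimal sketch. It is not taken from the talk: the toy data, the function names, and the choice of a kernel perceptron (a much simpler stand-in for a Support Vector Machine) are all assumptions made for illustration. The generative classifier fits one Gaussian per class and compares joint log-probabilities; the kernel classifier never models the data distribution and only compares a new point with stored training points through a kernel function.

```python
import math

def fit_gaussian_classes(xs, ys):
    """Generative model: one 1-D Gaussian per class plus a class prior."""
    params = {}
    for label in set(ys):
        pts = [x for x, y in zip(xs, ys) if y == label]
        mean = sum(pts) / len(pts)
        # Small constant keeps the variance strictly positive.
        var = sum((p - mean) ** 2 for p in pts) / len(pts) + 1e-6
        params[label] = (mean, var, len(pts) / len(xs))
    return params

def generative_predict(params, x):
    """Pick the class with the largest log p(x | class) + log p(class)."""
    def log_joint(label):
        mean, var, prior = params[label]
        return (math.log(prior) - 0.5 * math.log(2 * math.pi * var)
                - (x - mean) ** 2 / (2 * var))
    return max(params, key=log_joint)

def rbf(a, b, width=1.0):
    """Radial basis function kernel (the width is an illustrative choice)."""
    return math.exp(-((a - b) ** 2) / (2 * width ** 2))

def kernel_score(alphas, xs, ys, x):
    """Weighted sum of kernel similarities to the training points."""
    return sum(a * yj * rbf(xj, x) for a, xj, yj in zip(alphas, xs, ys))

def fit_kernel_perceptron(xs, ys, epochs=20):
    """Kernel method: labels must be +1/-1; returns per-example mistake counts."""
    alphas = [0] * len(xs)
    for _ in range(epochs):
        for i, (x, y) in enumerate(zip(xs, ys)):
            if y * kernel_score(alphas, xs, ys, x) <= 0:
                alphas[i] += 1  # misclassified: give this example more weight
    return alphas

def kernel_predict(alphas, xs, ys, x):
    return 1 if kernel_score(alphas, xs, ys, x) > 0 else -1

# Hypothetical toy data: two well-separated 1-D classes.
train_x = [-2.0, -1.5, -1.0, 1.0, 1.5, 2.0]
train_y = [-1, -1, -1, 1, 1, 1]
gen_model = fit_gaussian_classes(train_x, train_y)
alphas = fit_kernel_perceptron(train_x, train_y)
```

On data this cleanly separated both classifiers agree; they differ in what they store (distribution parameters vs. weighted training examples) and in what else they can provide, for example the generative model can also score how probable a new point is.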

  4. Correspondence Problems
     • Extended sources - map morphologies
       • Similar to biological imaging problems
       • Fewer sources but many pixels
     • Moving or changing point sources
       • E.g. Ida and Dactyl / JPL MLS
     • Dense point sources with instrument noise, e.g. globular clusters (radial density function)
     • Techniques:
       • Soft permutations, geometric transformations via optimization & continuation
       • Embedding inside a graph clustering (optimization) algorithm
       • Multiscale acceleration of optimization
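The "soft permutations, geometric transformations via optimization & continuation" technique can be sketched roughly as follows. This is an illustrative simplification, not the method used in the talk: it assumes equal-size 2-D point sets, a translation-only transformation, and alternating row/column normalization to keep the match matrix approximately doubly stochastic. All function names, the `betas` schedule, and the test points are hypothetical.

```python
import math

def sq_dist(p, q):
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def normalize_rows_and_columns(matrix, iterations=50):
    """Alternately normalize rows and columns of a positive square matrix."""
    m = [row[:] for row in matrix]
    n = len(m)
    for _ in range(iterations):
        for row in m:
            s = sum(row)
            for j in range(n):
                row[j] /= s
        for j in range(n):
            s = sum(m[i][j] for i in range(n))
            for i in range(n):
                m[i][j] /= s
    return m

def soft_correspondence(points_a, points_b, betas=(0.5, 2.0, 8.0, 32.0)):
    """Estimate a soft permutation and a translation mapping A onto B.

    Continuation: beta (an inverse temperature) is raised step by step,
    so the match matrix sharpens from nearly uniform toward a permutation.
    """
    n = len(points_a)
    shift = (0.0, 0.0)
    match = [[1.0 / n] * n for _ in range(n)]
    for beta in betas:
        moved = [(x + shift[0], y + shift[1]) for x, y in points_a]
        weights = [[math.exp(-beta * sq_dist(moved[i], points_b[j]))
                    for j in range(n)] for i in range(n)]
        match = normalize_rows_and_columns(weights)
        # Re-estimate the translation as the match-weighted mean offset.
        total = sum(sum(row) for row in match)
        shift = (
            sum(match[i][j] * (points_b[j][0] - points_a[i][0])
                for i in range(n) for j in range(n)) / total,
            sum(match[i][j] * (points_b[j][1] - points_a[i][1])
                for i in range(n) for j in range(n)) / total,
        )
    return match, shift

# Hypothetical example: B is A shifted by (0.3, -0.2) and reordered.
points_a = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
points_b = [(0.3, 1.8), (0.3, -0.2), (1.3, -0.2)]
match, shift = soft_correspondence(points_a, points_b)
hard = [max(range(len(row)), key=lambda j: row[j]) for row in match]
```

For a pure translation the weighted offset reduces to the difference of the two centroids once the match matrix is doubly stochastic; richer transformations (rotation, scale) would need their own update step inside the same loop.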

  5. Mixture Models
     • Mixture of Gaussians, t-distributions, …
       • Can do outlier detection
     • Mixture of factor analyzers
     • Mixture of time series models
     • * Problem-specific generative models
       • Can formulate with a Stochastic Parameterized Grammar
     • Clustering graphs [Utsugi and Kumagai 2000; Frey et al. 1998]
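As a rough illustration of the first bullet, the sketch below fits a two-component 1-D mixture of Gaussians by expectation-maximization and then uses the fitted density to flag an outlier. The data, the initial means, the variance floor, and the idea of thresholding the mixture density are assumptions for illustration; the slide does not specify how outlier detection was done.

```python
import math

def gaussian_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def fit_mixture(data, means, iterations=50):
    """EM for a 1-D Gaussian mixture, starting from the given initial means."""
    k = len(means)
    means = list(means)
    variances = [1.0] * k
    weights = [1.0 / k] * k
    for _ in range(iterations):
        # E-step: responsibility of each component for each data point.
        resp = []
        for x in data:
            joint = [w * gaussian_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)]
            total = sum(joint)
            resp.append([j / total for j in joint])
        # M-step: re-estimate parameters from the weighted data.
        for c in range(k):
            mass = sum(r[c] for r in resp)
            means[c] = sum(r[c] * x for r, x in zip(resp, data)) / mass
            variances[c] = (sum(r[c] * (x - means[c]) ** 2
                                for r, x in zip(resp, data)) / mass
                            + 1e-3)  # variance floor (assumption)
            weights[c] = mass / len(data)
    return weights, means, variances

def density(x, model):
    """Mixture density at x; very low values suggest an outlier."""
    weights, means, variances = model
    return sum(w * gaussian_pdf(x, m, v)
               for w, m, v in zip(weights, means, variances))

# Hypothetical data: two tight groups near 0 and 5.
data = [-0.2, 0.0, 0.1, 0.3, -0.1, 4.8, 5.0, 5.1, 5.3, 4.9]
model = fit_mixture(data, means=[-1.0, 6.0])
is_outlier = density(12.0, model) < 1e-6  # threshold is an arbitrary choice
```

A mixture of t-distributions, also named on the slide, has heavier tails and would be less sensitive to such points during fitting; that variant is not shown here.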

  6. Stochastic Grammars for Data Modeling

  7. Text & Biology Models

  8. More Detailed Clustering Grammars
     • Clusters generate data
     • Priors on cluster centers & variances
     • Iterative through levels in a hierarchy
     • Recursive through hierarchy
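One way to read these bullets is as a recursive generative process: a cluster draws child clusters from priors on their centers and spreads, and the leaves of the hierarchy emit data points. The sketch below shows that reading in plain Python rather than in the grammar notation of the talk, which is not reproduced in this transcript. The specific priors, the branching factor, and all names are hypothetical.

```python
import random

def generate_cluster(center, spread, depth, rng, children=2, points_per_leaf=3):
    """Recursively expand a cluster into sub-clusters, then into data points."""
    if depth == 0:
        # Leaf rule: the cluster generates its data.
        return [rng.gauss(center, spread) for _ in range(points_per_leaf)]
    data = []
    for _ in range(children):
        # Prior on the child center: Gaussian around the parent (assumption).
        child_center = rng.gauss(center, spread)
        # Prior on the child spread: a fraction of the parent's (assumption).
        child_spread = spread * rng.uniform(0.2, 0.5)
        data.extend(generate_cluster(child_center, child_spread,
                                     depth - 1, rng, children, points_per_leaf))
    return data

rng = random.Random(0)  # fixed seed so the sample is reproducible
samples = generate_cluster(0.0, 10.0, depth=2, rng=rng)
```

The same hierarchy could be expanded level by level with an explicit queue, which corresponds to the "iterative through levels" bullet; the recursive form is shown because it is shorter.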

  9. Rock Field Grammar grammar rockfield() {

  10. Transcriptional Gene Regulation Networks
      • Gene Regulation Network (GRN) model (figure symbols: v, T)
      • Extracellular communication
      • Figure: Drosophila eve stripe expression in model (right) and data (left). Green: eve expression; red: kni expression. From [Reinitz and Sharp, Mech. of Devel. 49:133-158, 1995].
      • [Mjolsness et al., J. Theor. Biol. 152:429-453, 1991]
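The slide's equations did not survive in this transcript; only the symbols v and T remain. As a hedged sketch, the code below assumes the recurrent-network form usually associated with the cited 1991 paper, in which each gene product concentration v_i changes at rate g(sum_j T_ij v_j + h_i) - decay_i * v_i, with g a sigmoid. The bias h, the decay rates, the two-gene example, and the forward Euler integrator are illustrative assumptions, and cell-to-cell (extracellular) communication is omitted.

```python
import math

def sigmoid(u):
    return 1.0 / (1.0 + math.exp(-u))

def simulate_grn(T, h, decay, v0, dt=0.01, steps=2000):
    """Forward Euler integration of dv_i/dt = g(sum_j T[i][j] v_j + h_i) - decay_i v_i."""
    v = list(v0)
    n = len(v)
    for _ in range(steps):
        inputs = [sum(T[i][j] * v[j] for j in range(n)) + h[i] for i in range(n)]
        v = [v[i] + dt * (sigmoid(inputs[i]) - decay[i] * v[i]) for i in range(n)]
    return v

# Hypothetical two-gene network with mutual repression (negative T entries).
T = [[0.0, -5.0],
     [-5.0, 0.0]]
h = [2.0, 2.0]
decay = [1.0, 1.0]
final = simulate_grn(T, h, decay, v0=[0.6, 0.4])
```

With these illustrative parameters the gene that starts slightly higher ends up strongly expressed while the other is repressed. Fitting T to expression data, rather than choosing it by hand, is the regression problem mentioned on the "Basic Machine Learning Distinctions" slide.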

  11. Gene Regulation + Signal Transduction Network
      [Figure labels: L, cell, ligands, nucleus, receptors, transcriptional regulation, targets, T]

  12. Software architectures for systems biology: Sigmoid & Cellerator

  13. 3-tier architecture
      [Diagram labels: Sigmoid Pathway Representation/Storage Database; Property; SOAP: Web Service; OJB API; Menu; Interactive Graphic Model (SVG/Applet); Database Access; XML (Object), Image, via HTTP; Model Translation; JLink API; Graphic Output; Cellerator Simulation/Inference Engine]

  14. Possible software support
      • Machine learning (open source/academic)
        • CompClust (CIT/JPL):
          • Scripting/GUI dichotomy data point;
          • dataset views
        • WEKA data mining
        • Intel: PNL Probabilistic Networks Library
        • Future: stochastic grammar modeler
          • + autogeneration (as in Cellerator)
      • Image processing, data environments
        • Matlab, IDL, Mathematica, Khoros/VisiQuest, …
        • NIHImage/ImageJ, …

  15. Metadata in Systems Biology
      • SBML
      • Sigmoid UML

  16. WUS
      • Fletcher et al., Science 283, 1999
      • Brand et al., Science 289:617-619, 2000

  17. SAM gene network: Results
      [Figure labels: protein concentrations; wus(init) and L1; X; Y]

  18. SAM: Gene Network Model
      [Diagram labels: clv1, wus, X (diffusive), clv3, Z, L1, Y (diffusive)]

  19. SAM growth imagery
      • PIN1
      • cell walls

  20. Venu Gonehal

  21. Basic Machine Learning Distinctions
      • Supervised vs. unsupervised learning
        • Supervised, e.g. classification and regression
          • Feature selection
          • Regression for phenomenological model fitting, e.g. GRNs
        • Unsupervised, e.g. clustering; may be a preprocessor
      • Generative vs. kernel methods
        • Generative (statistical inference) models
        • Kernel methods, e.g. Support Vector Machines
      • Vector vs. relationship data
        • Vector data: preprocessed image features Δlog I, Δx, …
        • Images, time series, shifted spectra - semigroup actions
        • Sparse graph/relationship data - permutation actions

  22. Contacts
      • Wayne Hayes, UCI ICS faculty
        • scientific computing
      • UCI ICS Machine Learning
        • Padhraic Smyth
        • Pierre Baldi
      • Chris Hart, Caltech Biology grad student
