Dr Paul Lewis PowerPoint Presentation
Presentation Transcript

  1. Dr Paul Lewis • Lecturer in Bioinformatics • Cardiff University • Biostatistics & Bioinformatics Unit

  2. Biostatistics & Bioinformatics Unit (BBU) • Bioinformatics resource for institutions across Wales • Backing of the Higher Education Funding Council for Wales: £1.5 million grant through the Research Capacity Development Fund • 13 new posts in statistics & bioinformatics • UWCM, Cardiff University, Aberystwyth • MSc/Postgraduate Diploma/Postgraduate Certificate in: Bioinformatics; Genetic Epidemiology and Bioinformatics

  3. Brief Overview of Microarray Bioinformatics • Introduce My Microarray Research Interests • My Microarray Analysis Software

  4. Bioinformatics in Microarray Experiment (stages diagram) • Experimental Design • Hybridisation • Data Normalisation • Differential Gene Expression • Pattern Discovery • Class Prediction • Annotation

  5. Normalisation • Remove non-biological influences on data (systematic variation) • 3 categories of normalisation: • Normalisation: transform data to make it more like a normal distribution (log, lowess, linlog) • Standardisation: expand or contract the distribution so data from different experiments can be compared (calculate Z-scores) • Centralisation: move the distribution so it is centred around the expected mean (mean / median / trimmed-mean centring)
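
The three categories on this slide can be illustrated with a minimal sketch. It uses only the Python standard library; the function names and the example ratios are hypothetical and are not taken from the presentation or from EASI. Lowess and linlog are not shown.

```python
import math
import statistics

def log2_transform(values):
    """Normalisation: log2-transform positive intensities or ratios."""
    return [math.log2(v) for v in values]

def z_scores(values):
    """Standardisation: rescale to mean 0 and sample standard deviation 1."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]

def median_centre(values):
    """Centralisation: shift the distribution so its median is 0."""
    med = statistics.median(values)
    return [v - med for v in values]

ratios = [0.5, 1.0, 2.0, 4.0, 8.0]   # hypothetical expression ratios
logged = log2_transform(ratios)       # [-1.0, 0.0, 1.0, 2.0, 3.0]
centred = median_centre(logged)       # [-2.0, -1.0, 0.0, 1.0, 2.0]
scaled = z_scores(logged)             # mean 0, standard deviation 1
```

The log transform makes two-fold up and two-fold down changes symmetric around zero, which is why it is usually applied before centring or scaling.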

  6. Bioinformatics in Microarray Experiment (stages diagram) • Experimental Design • Hybridisation • Data Normalisation • Differential Gene Expression • Pattern Discovery • Class Prediction • Annotation

  7. Find Differentially Expressed Genes • Is fold change significant? • With replicates: • Parametric tests: t-test (ANOVA) [J. Comput. Biol. 2000 7: 817-838]; Bayesian t-test [Bioinformatics 2001 17: 509-519]; Mixture modelling & bootstrapping (SAM) [P.N.A.S. 2001 98: 5116-5121]; Regression modelling [Genome Res. 2001 11: 1227-1236] • All give similar results, but SAM reduces false positives • Non-parametric tests: Wilcoxon rank sum test [Bioinformatics 2002 18: 1454-1461]; Non-parametric t-test [Bioinformatics 2002 18: 1454-1461]; Ideal discriminator method [Bioinformatics 2002 18: 1454-1461] • Low false positive rate but less power
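
As a rough illustration of one parametric and one non-parametric statistic for a single gene with replicates, the sketch below computes Welch's t statistic (an unequal-variance variant chosen here for illustration, not necessarily the form used in the cited papers) and the Wilcoxon rank-sum statistic. The data values and function names are hypothetical, and p-values are deliberately left out.

```python
import math
import statistics

def welch_t(a, b):
    """Welch's two-sample t statistic and approximate degrees of freedom."""
    va = statistics.variance(a) / len(a)
    vb = statistics.variance(b) / len(b)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

def rank_sum(a, b):
    """Wilcoxon rank-sum statistic for sample a (ties get average ranks)."""
    pooled = sorted(a + b)
    rank = {}
    for v in set(pooled):
        first = pooled.index(v) + 1          # 1-based rank of first occurrence
        count = pooled.count(v)
        rank[v] = first + (count - 1) / 2    # average rank across ties
    return sum(rank[v] for v in a)

control = [7.1, 6.9, 7.3, 7.0]   # hypothetical log2 intensities, 4 replicates
treated = [8.4, 8.1, 8.6, 8.3]
t, df = welch_t(treated, control)
w = rank_sum(treated, control)   # every treated value outranks every control
```

With only four replicates per group the rank-sum statistic can take few distinct values, which is one way to see the "less power" trade-off noted on the slide.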

  8. Bioinformatics in Microarray Experiment (stages diagram) • Experimental Design • Hybridisation • Data Normalisation • Differential Gene Expression • Pattern Discovery • Class Prediction • Annotation

  9. Pattern Discovery & Class Prediction • Explore how genes or samples group (Clustering): • HIERARCHY: Hierarchical Cluster Analysis • PARTITION: K-Means, Self Organising Maps (SOM), Fuzzy ART • REDUCTION: Principal Components Analysis (PCA), Multidimensional Scaling (MDS), Correspondence Analysis (CoA) • Assign genes to known groupings (Classification): logistic regression, neural networks, linear discriminant analysis

  10. Hierarchical Cluster Analysis

  11. Partitioning Clustering Methods: K-Means & SOM • The methods must be told the number of clusters • Genes are partitioned into clusters • What are the relationships between clusters?
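
A minimal K-Means sketch makes the first point concrete: the number of clusters `k` is an input that the caller must choose. This is an illustrative implementation, not the one in EASI; seeding the centroids with the first `k` points is a simplification (random or smarter seeding is more usual), and the expression profiles are hypothetical.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain k-means; the first k points seed the centroids (a simplification)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Hypothetical two-condition expression profiles for six genes.
profiles = [(0.1, 0.2), (5.0, 5.1), (0.0, 0.1), (5.2, 4.9), (0.2, 0.0), (4.9, 5.0)]
labels, centroids = kmeans(profiles, k=2)
```

The output is a hard partition: each gene gets exactly one label, and nothing in the result says how the clusters relate to one another.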

  12. 2D & 3D Mapping Methods: PCA, MDS, CoA • Data projected onto 2 or 3 dimensions • But... what are the cluster boundaries?

  13. Bioinformatics in Microarray Experiment (stages diagram) • Experimental Design • Hybridisation • Data Normalisation • Differential Gene Expression • Pattern Discovery • Class Prediction • Annotation

  14. Annotation • Online Tools: • ARROGANT http://lethargy.swmed.edu/ • DAVID http://apps1.niaid.nih.gov/david/ • DRAGON http://207.123.190.10/dragon.htm • EASE http://apps1.niaid.nih.gov/david/ • FANTOM http://www.gsc.riken.go.jp/e/FANTOM/ • GoMiner http://discover.nci.nih.gov/gominer/ • MatchMiner http://discover.nci.nih.gov/matchminer/ • Onto-Express http://vortex.cs.wayne.edu/Projects.html • RESOURCERER http://pga.tigr.org/tigr-scripts/magic/r1.pl • Affymetrix GO http://www.affymetrix.com • Databases: • Gene Ontology http://www.geneontology.org/ • OMIM http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=OMIM • LocusLink http://www.ncbi.nlm.nih.gov/LocusLink/ • UniGene http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=unigene

  15. My Research Interests • Pattern Discovery: take 2D & 3D mapping methods and define cluster boundaries; make methods FUZZY • Algorithm Development • 2D & 3D Visualisation Tools • EASI: Biologist-Friendly Software Tools

  16. Cluster Boundaries • MDS • CoA • PCA

  17. Fuzzy Clustering • Differs from standard clustering by assigning each gene a membership in every cluster • Lets you see how strongly each gene is associated with a cluster • Can calculate the number of clusters in partitioning methods (Fuzzy ART) • Helps combine clusters • Helps to clear ambiguity
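
The idea of membership in every cluster can be sketched with the standard fuzzy c-means membership formula, u_i = 1 / sum_k (d_i / d_k)^(2/(m-1)), where d_i is the distance from a gene to centroid i and m > 1 is the fuzzifier. The sketch below evaluates memberships for fixed centroids only; it is not the full iterative algorithm and is not taken from EASI. The centroids, points, and function name are hypothetical.

```python
import math

def fuzzy_memberships(point, centroids, m=2.0):
    """Fuzzy c-means membership of one point in every cluster (sums to 1)."""
    dists = [math.dist(point, c) for c in centroids]
    if 0.0 in dists:
        # The point sits exactly on a centroid: full membership there.
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** power for d_k in dists) for d_i in dists]

centroids = [(0.0, 0.0), (4.0, 0.0)]                    # hypothetical cluster centres
clear = fuzzy_memberships((1.0, 0.0), centroids)        # approximately [0.9, 0.1]
ambiguous = fuzzy_memberships((2.0, 0.0), centroids)    # [0.5, 0.5]
```

A gene like `ambiguous`, which a hard partition would force into one cluster, is flagged here by its evenly split memberships.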

  18. Fuzzy Mapping Add Membership values of each gene to clusters

  19. Fuzzy Partitioning: K-Means & SOM

  20. EASI • Data Reduction • Visualisation

  21. EASI BBUnit Microarray Pattern Discovery • Need for Comprehensive Pattern Discovery Software Suite • Fuzzy Data Analysis Suite • Visualisation Tools to explore data • Easy to use • Free • Web based version • Service by BBU • Increase traffic to BBU web site • Establish BBU for microarray • Cross platform

  22. EASI Interface (menus) • Normalisation: Log, Normalise, Mean Centre, Median Centre • Differential Gene Expression: T test, ANOVA, Regression • Pattern Discovery: Hierarchical Cluster Analysis, SOM, K-Means, Fuzzy ART, PCA, MDS, CoA, Fuzzy C-Means • Utilities

  23. EASI Interface

  24. Hierarchical Cluster Analysis

  25. Multi Dimensional Scaling (MDS)

  26. Self Organising Map (SOM)

  27. Correspondence Analysis (CoA)

  28. Contact lewispd@cf.ac.uk http://bbu.uwcm.ac.uk

  29. Acknowledgements • Pete Kille • Alan Clarke • Gareth Hughes (EASI team) • Karen Reed (Data) • Lesley Jones (Data, & EASI Collaborator) • BBU