
Yeast Dataset Analysis



Presentation Transcript


  1. Yeast Dataset Analysis Hongli Li 91.580 Final Project Computer Science Department UMASS Lowell

  2. Outline • Gene Ontology Annotation • Data Preprocessing • Cluster • Results • Conclusion

  3. GO Annotations • Total number of genes: 799 • Genes with a GO annotation at level 3 of Biological Process: 327 • Genes with a GO annotation but not at level 3: 272 • Genes without a GO annotation: 200

  4. GO Annotation

  5. GO Annotation • Of the 327 genes with a GO annotation at level 3 (a gene can carry more than one term, so the counts below sum to more than 327): • 170 genes belong to GO:0008152, metabolism • 90 genes belong to GO:0007049, cell cycle • 81 genes belong to GO:0016043, cell organization and biogenesis • 51 genes belong to GO:0006810, transport

  6. Data Preprocessing • Dataset: 799 cell cycle regulated genes • Filter: keep genes with more than 85% existing (non-missing) values • Impute missing values using KNN • Standardize patterns (mean = 0, standard deviation = 1)
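The three preprocessing steps above can be sketched in plain Python. This is a minimal illustration, not the tool the slides used: the function names, the use of `None` for missing values, the neighbor count `k`, the distance computed over shared columns, and the population standard deviation are all assumptions made for the example.

```python
import math

def filter_genes(rows, min_present=0.85):
    """Keep rows whose fraction of non-missing values exceeds min_present."""
    return [r for r in rows
            if sum(v is not None for v in r) / len(r) > min_present]

def _distance(a, b):
    """Root-mean-square difference over columns observed in both rows."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return math.inf
    return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

def knn_impute(rows, k=2):
    """Fill each missing value with the column mean over the k nearest rows.

    Assumption: a value stays None if no other row has that column observed.
    """
    out = []
    for i, row in enumerate(rows):
        filled = list(row)
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = sorted(
                (_distance(row, other), other[j])
                for n, other in enumerate(rows)
                if n != i and other[j] is not None
            )
            nearest = [v for _, v in candidates[:k]]
            if nearest:
                filled[j] = sum(nearest) / len(nearest)
        out.append(filled)
    return out

def standardize(row):
    """Scale one expression pattern to mean 0 and (population) std 1."""
    mean = sum(row) / len(row)
    sd = math.sqrt(sum((v - mean) ** 2 for v in row) / len(row))
    return [(v - mean) / sd for v in row] if sd > 0 else [0.0] * len(row)

# Toy data (invented for illustration): 4 genes x 8 time points.
data = [
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, None],   # 7/8 present: kept
    [1.1, 2.1, 2.9, 4.2, 5.1, 5.9, 7.2, 8.1],
    [8.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0],
    [None, None, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],  # 6/8 present: dropped
]
kept = filter_genes(data)
imputed = knn_impute(kept, k=1)
patterns = [standardize(r) for r in imputed]
```

Standardizing each gene's pattern means the later Euclidean comparison reflects the shape of the expression curve rather than its absolute level.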

  7. Cluster • SOTA – Self-Organizing Tree Algorithm • Euclidean Distance • Variability Threshold: 80%
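As a rough illustration of the distance and variability settings above, the sketch below measures a cluster's variability as the largest pairwise Euclidean distance among its profiles and compares it against a threshold. That definition of variability is an assumption taken from common descriptions of SOTA, and how the 80% setting maps onto this number is tool specific and not stated in the slides, so `should_split` and its plain numeric `threshold` are hypothetical.

```python
import math
from itertools import combinations

def variability(profiles):
    """Largest pairwise Euclidean distance among the profiles in one cluster."""
    return max(
        (math.dist(a, b) for a, b in combinations(profiles, 2)),
        default=0.0,  # a cluster with fewer than two profiles has no spread
    )

def should_split(profiles, threshold):
    """Hypothetical stopping rule: keep splitting while variability is too high."""
    return variability(profiles) > threshold

# Toy profiles (invented for illustration).
cluster = [[0.0, 0.0], [3.0, 4.0], [1.0, 1.0]]
spread = variability(cluster)
```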

  8. Cluster 61 Result

  9. Cluster 61 • 67 of the 799 genes fall in Cluster 61 • 24 of the 67 genes have GO annotations • 10 of the 24 genes belong to metabolism • 14 belong to cell cycle • 8 belong to S phase of mitotic cell cycle • 8 belong to DNA replication • 4 belong to G1/S transition of mitotic cell cycle • Only one of the metabolism genes is not also in cell cycle

  10. Cluster 60 • 33 genes in this cluster • 11 of the 33 have GO annotations • 4 of the 11 genes are in M-phase specific microtubule process, which belongs to cell cycle • 7 are in organelle organization and biogenesis, which belongs to cell growth and/or maintenance • 8 in total are in cell cycle

  11. Cluster 59 • 38 genes in this cluster • 15 genes have annotations • 7 in metabolism • 5 in cell cycle • M phase of mitotic cell cycle has 3 • Nuclear division has 3 • No gene is shared between these two classes

  12. Conclusion & Future Work • Cluster #61 has the strongest relation to cell cycle, followed by clusters #60 and #59 • Sub-cluster clusters #59, #60, and #61 • Analyze the expression data of the genes known to carry GO cell cycle annotations • Analyze the other clusters • Apply the same analysis to the 6000-gene dataset
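One common way to put a number on how strongly a cluster relates to cell cycle is a hypergeometric tail probability: the chance of seeing at least as many cell cycle genes in the cluster if its annotated genes were drawn at random from the background. The sketch below is not presented as the author's method. It assumes the background is the 327 genes annotated at level 3 (90 of them in cell cycle) and that each cluster's annotated genes come from that background, which the slides do not state.

```python
from math import comb

def hypergeom_tail(N, K, n, k):
    """P(X >= k) when n genes are drawn without replacement from N genes,
    of which K belong to the category of interest."""
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i)
               for i in range(k, min(K, n) + 1))
    return hits / total

# Background taken from the GO Annotation slides (assumed to apply here).
BACKGROUND_GENES = 327
BACKGROUND_CELL_CYCLE = 90

# cluster id -> (annotated genes in cluster, of which in cell cycle)
clusters = {61: (24, 14), 60: (11, 8), 59: (15, 5)}

p_values = {
    cid: hypergeom_tail(BACKGROUND_GENES, BACKGROUND_CELL_CYCLE, n, k)
    for cid, (n, k) in clusters.items()
}
for cid, p in p_values.items():
    print(f"cluster {cid}: P(X >= observed) = {p:.3g}")
```

A smaller probability means the cell cycle count is harder to explain by chance alone; with three clusters tested, a multiple-testing correction would also be worth considering.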

  13. References • http://gepas.bioinfo.cnio.es/index.html • P. T. Spellman et al. Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization. Mol. Biol. Cell, vol. 9, pp. 3273-3297, 1998. • R. J. Cho et al. A genome-wide transcriptional analysis of the mitotic cell cycle. Mol. Cell, vol. 2, pp. 65-73, 1998. • J. Herrero, A. Valencia et al. A hierarchical unsupervised growing neural network for clustering gene expression patterns. Bioinformatics, vol. 17(2), pp. 126-136, 2001. • O. Alter et al. Singular value decomposition for genome-wide expression data processing and modeling. PNAS, vol. 97, pp. 10101-10106, 2000. • http://www.cellsalive.com/cell_cycle.htm • http://www.geneontology.org/ • http://fatigo.bioinfo.cnio.es/htdocs/helpFatiGO.html
