Melampsora Genome Annotation and Genome Structure Analysis

First Annotation Workshop of the Melampsora Genome Consortium. Yao-Cheng Lin, Bioinformatics & Evolutionary Genomics, VIB Department of Plant Systems Biology, UGent.

Presentation Transcript


  1. Melampsora Genome Annotation and Genome Structure Analysis. First Annotation Workshop of the Melampsora Genome Consortium. Yao-Cheng Lin, Bioinformatics & Evolutionary Genomics, VIB Department of Plant Systems Biology, UGent

  2. Overview
      • Gene prediction (structure annotation)
      • Gene family analysis
      • Phylogenetic position of Melampsora

  3. EuGène: gene prediction platform
      [Diagram: EuGène takes the genomic sequence and combines three kinds of evidence to produce predicted genes and alternative models]
      • Intrinsic information: coding IMM, intronic IMM; content potential for coding, intronic and intergenic regions
      • Other prediction programs: translation start site, FunSiP (GT/AG splice sites)
      • Extrinsic information: BlastN, GenomeThreader, TblastX, RepeatMasker, BlastX; Puccinia genomic sequence, TE & repeat database, protein databases, EST databases

  4. Resources for Melampsora gene prediction
      • Gene models for training
        • Previously identified core genes in basidiomycetes
        • Genes with manual curation from INRA-Nancy
      • Splice site training/prediction
        • FunSiP: developed by Michiel Van Bel, who also helped with training
      • BlastX database
        • 8 basidiomycete proteomes, Fungi RefSeq, SwissProt
      • TBLASTX database
        • Puccinia graminis genomic sequence
      • EST libraries
        • JGI Sanger sequencing
        • 454 pyrosequencing (the 1st MIRA assembly)
      • Repeat libraries
        • Hadi/Marie-Pierre
        • In-house script, collected from the first run of gene prediction
        • Masked area from JGI
      • EuGène 3.4

  5. Gene prediction – comparison of two prediction results

  6. Gene prediction – protein length distribution

  7. Example: metallothionein-like protein
      • Metallothionein-like protein in Magnaporthe
        • Protein length: 22 amino acids (MMT1)
        • Six cysteine residues
        • Mmt1 mutants lose the ability to cause plant disease
      • Difficulties in in silico identification
        • Sequence divergence
        • Short sequence, easily rejected by the E-value cutoff
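Why a short protein is easily lost can be seen from the standard BLAST relation E = m · n · 2^(−S'), where m is the query length, n the database length and S' the bit score. The sketch below is only an illustration: the database size, the bit scores and the cutoff are assumptions, not values from the workshop, and the formula ignores BLAST's effective-length corrections.

```python
def blast_evalue(bit_score, query_len, db_len):
    """Expected number of chance hits: E = m * n * 2**(-S')."""
    return query_len * db_len * 2.0 ** (-bit_score)


QUERY_LEN = 22            # MMT1 length, from the slide
DB_LEN = 1_000_000_000    # assumed database size in residues (hypothetical)
CUTOFF = 1e-5             # assumed cutoff (hypothetical for this search)

# Assumed bit scores: a strong full-length match and a diverged match.
for bit_score in (45.0, 30.0):
    e = blast_evalue(bit_score, QUERY_LEN, DB_LEN)
    verdict = "kept" if e <= CUTOFF else "rejected"
    print(f"bit score {bit_score:.0f}: E = {e:.2e} ({verdict})")
```

Because a 22-residue query can only accumulate a limited bit score, even a good match may stay above a strict cutoff under these assumptions.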

  8. Overview
      • Gene prediction and annotation platform
      • Gene family analysis
      • Phylogenetic position of Melampsora

  9. Gene family expansion and contraction
      • Gene family clustering
        • Similarity search with 12 fungal genomes (10 basidiomycetes, 2 ascomycetes): all-against-all BLASTP, E-value cutoff 1e-5.
        • Gene families constructed by TribeMCL with inflation factor 4.0.
      • Species/lineage-specific gene family expansions
        • The mean gene family size and standard deviation were calculated for all gene families, excluding species-specific families (SSFs) and orphans.
        • To center and normalize the data, the matrix of the previous profile was transformed into a matrix of z-scores.
      • Functional assignment
        • Domain based: RPS-BLAST
        • HMM profile for each family, searched against the SwissProt and NR databases.
        • GO terms.
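The E-value filtering step before clustering might look like the sketch below. It assumes the all-against-all BLASTP results are in the 12-column tabular format, where the E-value is the 11th column; the sequence identifiers and hit values are made up for illustration, and the clustering itself (TribeMCL) is not shown.

```python
import io


def filter_blast_hits(lines, max_evalue=1e-5):
    """Yield (query, subject, evalue) for tabular BLAST hits at or below max_evalue."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        evalue = float(fields[10])  # 11th column of the 12-column tabular format
        if evalue <= max_evalue:
            yield fields[0], fields[1], evalue


# Hypothetical hits; identifiers and numbers are illustrative only.
example = "\n".join([
    "\t".join(["Mlp_001", "Pgr_042", "78.5", "200", "43", "0", "1", "200", "5", "204", "3e-90", "320"]),
    "\t".join(["Mlp_001", "Lbi_777", "31.0", "60", "40", "1", "10", "69", "3", "62", "0.002", "38.1"]),
    "\t".join(["Mlp_002", "Uma_010", "55.2", "150", "65", "2", "1", "150", "1", "148", "1e-20", "95.5"]),
])

kept = list(filter_blast_hits(io.StringIO(example)))
for query, subject, evalue in kept:
    print(query, subject, evalue, sep="\t")
```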

  10. Protein phylogeny profile / z-score
      [Figure: protein phylogeny profile and z-score profile with genome and family axes, highlighting core-gene families and species-specific gene families]
      Z = (gene number − mean gene number) / standard deviation
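The z-score transformation of one family's profile can be sketched as follows. The gene counts are invented, and the use of the population standard deviation is an assumption; the slides do not say whether the population or the sample estimate was used.

```python
import statistics


def z_scores(counts):
    """Transform one family's gene counts (one value per genome) into z-scores."""
    mean = statistics.mean(counts)
    sd = statistics.pstdev(counts)  # assumption: population standard deviation
    if sd == 0:
        return [0.0 for _ in counts]  # flat profile: no expansion or contraction
    return [(count - mean) / sd for count in counts]


# Hypothetical profile: gene counts of one family in eight genomes.
profile = [2, 4, 4, 4, 5, 5, 7, 9]
print(z_scores(profile))
```

A strongly positive value marks a genome in which the family is larger than average, a strongly negative value one in which it is smaller.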

  11. Fungal genome characteristics

  12. Orphans / species-specific gene families

  13. Difference in average gene family size
      * 8,035 families in total, excluding the species-specific families

  14. Hierarchical clustering of gene families
      [Figure: clustered profiles for N. crassa, M. grisea, S. roseus, P. graminis, M. larici-populina, U. maydis, M. globosa, P. placenta, P. chrysosporium, C. cinereus, L. bicolor, C. neoformans]
      • Top 100 most variable profiles, selected by standard deviation.
      • Red: protein kinase, esterase/lipase, Cre recombinase, DNA/RNA helicase, leucine-rich repeat
      • Blue: major facilitator superfamily
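Selecting the most variable profiles before clustering could be done roughly as below. The family names and counts are hypothetical, and ranking on raw gene counts with the population standard deviation is an assumption; the hierarchical clustering step itself is not shown.

```python
import statistics


def most_variable(profiles, top_n=100):
    """Return the names of the top_n families with the largest standard deviation."""
    ranked = sorted(
        profiles,
        key=lambda family: statistics.pstdev(profiles[family]),
        reverse=True,
    )
    return ranked[:top_n]


# Hypothetical profiles: gene counts per genome for three families.
profiles = {
    "fam_kinase": [1, 2, 30, 1],
    "fam_flat": [3, 3, 3, 3],
    "fam_mild": [2, 4, 3, 5],
}
top = most_variable(profiles, top_n=2)
print(top)
```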

  15. Overview
      • Gene prediction and annotation platform
      • Gene family analysis
      • Phylogenetic position of Melampsora

  16. Phylogenies of Melampsora
      • Construct the Melampsora phylogenetic tree based on FUNYBASE with selected fungal genomes.
      • FUNYBASE: single-copy gene families (246 genes) within 21 fungal species (mostly ascomycetes).
      • 22 selected species:
        • Ascomycetes: Aspergillus nidulans, Coccidioides immitis, Fusarium graminearum, Mycosphaerella graminicola, Magnaporthe grisea, Neurospora crassa, Nectria haematococca, Pyrenophora tritici-repentis, Stagonospora nodorum, Schizosaccharomyces pombe, Sclerotinia sclerotiorum.
        • Basidiomycetes: Coprinus cinereus, Cryptococcus neoformans, Laccaria bicolor, Malassezia globosa, Melampsora larici-populina, Phanerochaete chrysosporium, Puccinia graminis, Postia placenta, Sporobolomyces roseus, Ustilago maydis.
        • Zygomycete: Rhizopus oryzae.
      * New genome; rejected in FUNYBASE

  17. Phylogenies of Melampsora: Method
      • 246 HMM models for the conserved protein sequence blocks in FUNYBASE.
      • For each genome, run a HMMER search against the whole proteome and retain the protein sequence of the best hit for each model.
      • 148 models have a single-copy gene in each of the 22 selected species.
      • Concatenate the 148 single-copy orthologs for tree building.
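The selection and concatenation steps might be sketched as follows. This assumes the search results have already been parsed into a nested dictionary (model, then species, then a list of aligned hit sequences) and that a model counts as single-copy when exactly one protein is found in every species; the data structure, the names and the toy sequences are all hypothetical.

```python
def single_copy_models(hits, species):
    """Return the models that have exactly one hit in every species."""
    return sorted(
        model
        for model, by_species in hits.items()
        if all(len(by_species.get(sp, [])) == 1 for sp in species)
    )


def concatenate(hits, species):
    """Join the single-copy blocks of each species into one sequence."""
    models = single_copy_models(hits, species)
    return {sp: "".join(hits[model][sp][0] for model in models) for sp in species}


# Hypothetical, already aligned blocks for two species and three models.
species = ["sp1", "sp2"]
hits = {
    "model_A": {"sp1": ["MKT"], "sp2": ["MRT"]},
    "model_B": {"sp1": ["LLV", "LIV"], "sp2": ["LLV"]},  # two copies in sp1: dropped
    "model_C": {"sp1": ["GG-A"], "sp2": ["GGSA"]},
}
supermatrix = concatenate(hits, species)
print(supermatrix)
```

Sorting the model names keeps the block order identical across species, so the concatenated sequences stay column-aligned.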

  18. Melampsora in the phylogenetic tree of fungi, built with phylo_win using the neighbor-joining method with Poisson correction and 500 bootstrap replicates.
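The Poisson correction converts the observed proportion p of differing amino acid sites into a distance d = −ln(1 − p), which accounts for multiple substitutions at the same site. A minimal sketch, independent of phylo_win and assuming that sites with a gap in either sequence are skipped; the toy sequences are made up.

```python
import math


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences, ignoring gaps."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def poisson_distance(seq_a, seq_b):
    """Poisson-corrected distance: d = -ln(1 - p)."""
    p = p_distance(seq_a, seq_b)
    if p >= 1.0:
        raise ValueError("distance undefined when every site differs")
    return -math.log(1.0 - p)


print(poisson_distance("MKTLLV", "MRTLIV"))
```

A matrix of such pairwise distances is the input to neighbor joining; bootstrap replicates resample the alignment columns and repeat the procedure.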

  19. Acknowledgements
      • Gent: Stephane Rombauts, Michiel Van Bel, Klaas Vandepoele, Kenny Billiau, Thomas Abeel, Pierre Rouzé, Lieven Sterck, Yves Van de Peer
      • Nancy: Stéphane Hacquard, Emilie Tisserant, Marie-Pierre Oudot-Le Secq, Sébastien Duplessis, Francis Martin
