Genomics of Microbial Eukaryotes Igor Grigoriev Fungal Genomics Program Head US DOE Joint Genome Institute, Walnut Creek, CA <ivgrigoriev@lbl.gov>
Outline • Eukaryotic Genome Annotation • Fungal Genomics Program • MycoCosm
Are you in the right room? IMG MycoCosm 100+ annotated eukaryotic genomes genome.jgi.doe.gov
Eukaryotic Gene Prediction • [Figure: gene model with promoter, 5’UTR, ATG start codon, exons and introns bounded by GT and AG splice sites, TGA stop codon, 3’UTR, and polyA site] • Ab initio methods (Fgenesh, GeneMark) are trained on known genes and use knowledge of their structures to predict start, stop, and splice sites in the CDS only • Transcript-based methods (EST_map, CombEST) map or assemble transcripts on the genome, including UTRs: EST contig → predicted model • Protein-based methods (Fgenesh+, GeneWise) build CDS exons around known protein alignments: GenBank protein → predicted model
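The signals named on this slide (ATG start, stop codon, GT and AG intron boundaries) can be checked mechanically on any candidate gene model. The following is a minimal sketch, not part of any JGI tool: the function name `check_gene_model`, the 0-based half-open exon coordinates, and the toy sequence are all assumptions made for illustration, and only the forward strand and canonical GT-AG introns are considered.

```python
# Hypothetical helper for illustration; not taken from Fgenesh, GeneMark, or GeneWise.
STOP_CODONS = {"TAA", "TAG", "TGA"}

def check_gene_model(genome, exons):
    """Return a list of problems found in a forward-strand CDS gene model.

    genome: DNA string.
    exons:  sorted list of (start, end) CDS exon coordinates, 0-based, end exclusive.
    """
    problems = []
    cds = "".join(genome[start:end] for start, end in exons)
    if not cds.startswith("ATG"):
        problems.append("CDS does not start with ATG")
    if cds[-3:] not in STOP_CODONS:
        problems.append("CDS does not end with a stop codon")
    if len(cds) % 3 != 0:
        problems.append("CDS length is not a multiple of 3")
    # Each intron is the sequence between two consecutive exons.
    for number, ((_, donor), (acceptor, _)) in enumerate(zip(exons, exons[1:]), start=1):
        intron = genome[donor:acceptor]
        if not intron.startswith("GT"):
            problems.append(f"intron {number} does not start with GT")
        if not intron.endswith("AG"):
            problems.append(f"intron {number} does not end with AG")
    return problems

# Toy locus: flank + exon 1 + intron + exon 2 + flank.
toy_genome = "CC" + "ATGGCC" + "GTAAGTTTTCAG" + "GAATGA" + "TT"
good_model = [(2, 8), (20, 26)]
bad_model = [(2, 8), (21, 26)]   # second exon starts one base too late

good_problems = check_gene_model(toy_genome, good_model)
bad_problems = check_gene_model(toy_genome, bad_model)
print(good_problems)
print(bad_problems)
```

Real predictors score these signals probabilistically rather than applying hard rules, so a check like this is useful for sanity-testing models, not for producing them.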
Protein Annotation • Predicted protein features: signal peptide (SignalP), domains (InterPro, TMHMM) • Possible orthologs (in nr, SwissProt, KEGG, KOG) • Possible paralogs (Blastp + MCL) • Higher-order assignments: Gene Ontology terms; EC numbers → KEGG pathways; gene families, with and without other species
EST Support is Critical for Eukaryotes • [Figure: EST profile and CombEST gene models from Sanger, 454, and Illumina data; figure labels: 5531, 34]
Best Models • [Figure: FGENESH, GENEWISE, and external models at a locus, reduced to a representative set] • Multiple gene predictors offer several different gene models at each gene locus • A single best model from each locus is automatically selected based on homology and EST support • These compose a non-redundant (or Filtered) gene set for further analysis • This set is further improved during community-driven manual curation
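The selection step on this slide can be sketched as picking the highest-scoring model per locus. The slide does not say how homology and EST support are combined, so the equal-weight sum below, the 0 to 1 score ranges, and the names `GeneModel` and `select_best_models` are assumptions for illustration only.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class GeneModel:
    locus: str
    predictor: str
    homology: float      # assumed scale: 0 (no homology) to 1 (strong homology)
    est_support: float   # assumed scale: fraction of the model covered by ESTs

def select_best_models(models, w_homology=0.5, w_est=0.5):
    """Keep one model per locus: the one with the highest weighted score.

    The weighting scheme is hypothetical; it only stands in for
    "selected based on homology and EST support".
    """
    best = {}
    for model in models:
        score = w_homology * model.homology + w_est * model.est_support
        current = best.get(model.locus)
        if current is None or score > current[0]:
            best[model.locus] = (score, model)
    return {locus: model for locus, (_, model) in best.items()}

candidates = [
    GeneModel("locus1", "FGENESH", homology=0.6, est_support=0.9),
    GeneModel("locus1", "GENEWISE", homology=0.9, est_support=0.4),
    GeneModel("locus2", "FGENESH", homology=0.5, est_support=0.7),
    GeneModel("locus2", "EXTERNAL", homology=0.8, est_support=0.8),
]

filtered_set = select_best_models(candidates)
for locus, model in sorted(filtered_set.items()):
    print(locus, model.predictor)
```

The result has exactly one model per locus, which is what makes the set non-redundant; manual curation can then replace any individual choice.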
Bring it all together • Input: genomic assembly and EST contigs • Annotation pipeline: repeat mask, transcript + protein maps, gene predictions, protein annotations • Annotation: manual curation • Analysis: gene families, gene expression, phylogenomics, proteomics, protein targeting, etc.
Many Genes of Eco-responsive Daphnia pulex • First crustacean, aquatic animal sequenced, new model organism • 30,940 predicted D. pulex genes in ~200 Mb genome • 85% supported by 1+ lines of evidence • Colbourne et al., Science, 2011
Half of Daphnia Genes have no Homologs * Of 716 highly conserved single copy orthologs, Daphnia is missing only two With Evgeny Zdobnov’s group (Univ. Genève)
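The Daphnia figures on the two slides above imply a few derived numbers worth spelling out. This is a quick arithmetic sketch using only the values stated on the slides; the genome size is approximate ("~200 Mb"), so the gene density is approximate too, and the variable names are illustrative.

```python
# Values as stated on the slides.
predicted_genes = 30_940
genome_size_mb = 200          # approximate, given as "~200 Mb"
supported_fraction = 0.85     # supported by 1+ lines of evidence
conserved_orthologs = 716     # highly conserved single copy orthologs
missing_orthologs = 2

# Derived quantities.
genes_per_mb = predicted_genes / genome_size_mb
supported_genes = round(predicted_genes * supported_fraction)
ortholog_recovery = (conserved_orthologs - missing_orthologs) / conserved_orthologs

print(f"Gene density: about {genes_per_mb:.1f} genes per Mb")
print(f"Genes with supporting evidence: about {supported_genes}")
print(f"Conserved orthologs recovered: {ortholog_recovery:.1%}")
```

Recovering nearly all conserved single copy orthologs suggests that the gene set is close to complete, so the large number of genes without homologs is unlikely to be explained by a fragmentary annotation alone.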
Outline • Eukaryotic Genome Annotation • Fungal Genomics Program • MycoCosm
Fungal Genomics for Energy and Environment • Bio-refinery • Grow: plant symbionts and pathogens • Degrade: lignocellulose degradation • Ferment: sugar fermentation • GOAL: Scale up sequencing and analysis of fungal diversity for DOE science and applications
Genomic Encyclopedia of Fungi • Launched www.jgi.doe.gov/fungi: 100+ fungal genomes, 600+ registered users, 5000+ visitors/month • Plant feedstock health: symbiosis, plant pathogenicity, biocontrol • Biorefinery fungi: lignocellulose degradation, sugar fermentation, industrial organisms • Fungal diversity: phylogenetic, ecological
Distinct Mechanisms of Cellulose Degradation • White rot: P. chrysosporium • Brown rot: Postia placenta • No cellulose binding domain CBM1 in brown rot! • Cellulose-degrading enzymes shown: cellobiohydrolase II GH6 (CBH50), cellobiohydrolase I GH7 (CBH58, 62), endoglucanases GH5-CBM1 and GH12, GH3 β-glucosidase (cellulose → glucose) • Oxidative components shown: glucose oxidases, copper radical oxidases, iron reductase (Fe3+) • Fe2+ + H2O2 → Fe3+ + HO− + HO• • Martinez et al., PNAS 2009
Diverse Basidiomycota • FGP09 pilots • Basidio jam (Mar 2010) • 3 CSP11 proposals • Basidio jam (Mar 2011)
Future Grand Challenges • 1000 fungal genomes sampling fungal diversity • Model fungi sampling 100s of conditions • Fungal ecosystems: • Bioenergy crops symbionts & pathogens • Biorefinery • Fungal metagenomes • [Figure labels: SEQUENCE, FUNCTION, MODELING; fungal isolates & groups, systems of interacting organisms, systems in wild]
Annotation and Analysis Tools • Automated Annotation • Pipeline • Genomics Analysis Platform • Genome Centric • Comparative Genomics • Community Resource • Integrated data • User tools • Training
Comparative View Genome-Centric View www.jgi.doe.gov/fungi
Genome-Centric View • Focus: functional genomics, user data deposition and curation
Community Building Tools • Jamborees: • Genome analysis for publications • MycoCosm Tutorials: • On-line video, MGM, workshops w/ large meetings (Asilomar, JGI Users, MSA) • Preparation for CSP: • Large meetings and focused groups
Summary Eukaryotic Annotation Recipe: • Combined gene predictors, experimental data, and community annotation Fungal Genomics Program: • Scaled-up sequencing & comparative analysis of fungi relevant for energy & environment (jgi.doe.gov/fungi)
Outline • Eukaryotic Genome Annotation • Fungal Genomics Program • MycoCosm