We have already discussed

Presentation Transcript


  1. Lec 09 We have already discussed
     • What is MSA (Multiple Sequence Alignment)?
     • What is it good for?
     • How do I use it?
     • Software and algorithms
       • The programs
       • How do they work?
       • Which to use?
     • In practice
       • Get the sequences
       • Reformat them
       • Evaluate the alignment
       • Realign or modify the alignment
       • Add or subtract sequences
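The "In practice" steps above (get the sequences, reformat them, evaluate the alignment) can be sketched in a few lines of Python. This is only an illustration and is not taken from the lecture: the function names `parse_fasta` and `column_conservation`, the toy sequences, and the use of a simple per-column conservation fraction as the evaluation measure are all assumptions made for the example.

```python
from collections import Counter


def parse_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def column_conservation(aligned):
    """For each alignment column, return the fraction of all sequences
    that carry the most common non-gap residue (gaps count against it)."""
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must all have the same length")
    scores = []
    for i in range(length):
        column = [seq[i].upper() for seq in aligned]
        residues = [c for c in column if c != "-"]
        if not residues:
            scores.append(0.0)
            continue
        top_count = Counter(residues).most_common(1)[0][1]
        scores.append(top_count / len(column))
    return scores


# Toy, already-aligned input (hypothetical sequences, "-" marks a gap).
example = """>seq1
MKT-AY
>seq2
MKTGAY
>seq3
MRT-AF
"""

records = parse_fasta(example)
scores = column_conservation([seq for _, seq in records])
print([round(s, 2) for s in scores])
```

Columns with low scores (here the gapped fourth column) are the ones worth inspecting before deciding whether to realign, edit, or drop a sequence.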

  2. Platform, software and algorithm selection
     • Platforms: central server, web-based, local computer
     • Software
       • The best
       • What's available
       • The easiest to use
       • The best output
     • Algorithm
       • The most accurate
       • The best for your problem
       • What's available
       • What you are familiar with

  3. Main applications of MSA

  4. Get the sequences: databases
     • GenBank: An annotated collection of all publicly available nucleotide and protein sequences.
     • RefSeq: NCBI non-redundant set of reference sequences, including genomic DNA, transcript (RNA), and protein products.
     • UniProt Consortium Database: Universal protein knowledgebase, a central resource of protein sequence and function from Swiss-Prot, TrEMBL and PIR.
     • Entrez Gene: Gene-centered information at NCBI.
     • UniGene: Unified clusters of ESTs and full-length mRNA sequences.
     • OMIM: Online Mendelian Inheritance in Man, a catalog of human genetic and genomic disorders.
     • Model Organism Genome Databases: MGD, RGD, SGD, FlyBase…
     • GeneCards: Integrated database of human genes, maps, proteins and diseases.
     • SNP Consortium Database.

  5. Get the sequences: Entrez Text Searches (http://www.ncbi.nlm.nih.gov/sites/gquery)
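The slide points to the Entrez web search page. For scripted retrieval, NCBI also provides the E-utilities HTTP interface, which the slides do not mention, so treat what follows as a swapped-in route rather than the lecture's method. The sketch only builds request URLs for the `esearch` and `efetch` endpoints and makes no network call. The helper names, the example query term, and the placeholder IDs are assumptions for illustration.

```python
from urllib.parse import urlencode

# Base address of NCBI E-utilities (not given on the slide).
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(db, term, retmax=20):
    """Build an esearch URL that runs a text query against one Entrez database."""
    params = {"db": db, "term": term, "retmax": retmax}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


def efetch_fasta_url(db, ids):
    """Build an efetch URL that requests the given records as FASTA text."""
    params = {"db": db, "id": ",".join(ids), "rettype": "fasta", "retmode": "text"}
    return f"{EUTILS_BASE}/efetch.fcgi?{urlencode(params)}"


# Hypothetical query: human hemoglobin entries in the protein database.
search = esearch_url("protein", "hemoglobin AND human[Organism]")
# Placeholder IDs; in practice they would come from the esearch response.
fetch = efetch_fasta_url("protein", ["1", "2"])
print(search)
print(fetch)
```

The FASTA text returned by such a fetch request is the usual input format for multiple alignment programs.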

  6. Entrez Gene (http://www.ncbi.nlm.nih.gov/gene)

  7. UniProt Consortium Databases (http://www.uniprot.org)
     • Number of explicitly cross-referenced databases: 126

  8. UniProt Text Search (http://www.uniprot.org/)

  9. UniProt Sequence Report

  10. PIR Text Search (http://pir.georgetown.edu/pirwww/search/textsearch.shtml)

  11. OMIM: Online Mendelian Inheritance in Man (http://www.ncbi.nlm.nih.gov/sites/entrez?db=omim&TabCmd=Limits)

  12. Protein Family Databases
     • Whole Proteins
       • PIRSF: A Network Classification System of Protein Families
       • COG (Clusters of Orthologous Groups) of Complete Genomes
       • ProtoNet: Automated Hierarchical Classification of Proteins
     • Protein Domains
       • Pfam: Alignments and HMM Models of Protein Domains
       • SMART: Protein Domain Families
       • CDD: Conserved Domain Database
     • Protein Motifs
       • PROSITE: Protein Patterns and Profiles
       • BLOCKS: Protein Sequence Motifs and Alignments
       • PRINTS: Protein Sequence Motifs and Signatures
     • Integrated Family Databases
       • iProClass: Superfamilies/Families, Domains, Motifs, Rich Links
       • InterPro: Integrates Pfam, PRINTS, PROSITE, ProDom, SMART, PIRSF, SuperFamily

  13. PIRSF: A Network Classification System of Protein Families (http://pir.georgetown.edu/pirwww/dbinfo/pirsf.shtml)

  14. COG: Clusters of Orthologous Groups of proteins (http://www.ncbi.nlm.nih.gov/COG/)

  15. Domain Classification (http://pir.georgetown.edu/pirwww/dbinfo/iproclass.shtml)

  16. Domain Classification: InterPro, Gene3D

  17. Protein Motifs
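To make the idea of a sequence motif concrete, the sketch below converts a PROSITE-style pattern into a Python regular expression and scans a sequence with it. It is an illustration under assumptions and not part of the lecture: the function name `prosite_to_regex` and the test sequences are made up, and only a subset of the pattern syntax is handled (single residues, `x`, `[...]`, `{...}`, repeat counts, and `<`/`>` anchors outside brackets). The example pattern `N-{P}-[ST]-{P}` is the PROSITE N-glycosylation site pattern.

```python
import re

# One pattern element: optional "<" anchor, a core, optional (n) or (n,m), optional ">".
_ELEMENT = re.compile(
    r"(?P<start><)?"
    r"(?P<core>x|[A-Z]|\[[A-Z]+\]|\{[A-Z]+\})"
    r"(?:\((?P<lo>\d+)(?:,(?P<hi>\d+))?\))?"
    r"(?P<end>>)?"
)


def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern (subset of the syntax) into a regex string."""
    pieces = []
    for element in pattern.rstrip(".").split("-"):
        m = _ELEMENT.fullmatch(element)
        if m is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        core = m.group("core")
        if core == "x":
            core = "."                      # x: any residue
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"  # {ABC}: any residue except A, B, C
        if m.group("lo"):
            hi = m.group("hi")
            core += "{" + m.group("lo") + ("," + hi if hi else "") + "}"
        prefix = "^" if m.group("start") else ""
        suffix = "$" if m.group("end") else ""
        pieces.append(prefix + core + suffix)
    return "".join(pieces)


regex = prosite_to_regex("N-{P}-[ST]-{P}")
hit = re.search(regex, "MNGTA")    # hypothetical sequence containing the motif
miss = re.search(regex, "MNPTA")   # proline after N breaks the motif
print(regex, hit.group() if hit else None, miss)
```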

  18. Databases of Protein Functions
     • Metabolic Pathways, Enzymes, and Compounds
       • Enzyme Classification: Classification and Nomenclature of Enzyme-Catalysed Reactions (EC-IUBMB)
       • KEGG (Kyoto Encyclopedia of Genes and Genomes): Metabolic Pathways
       • LIGAND (at KEGG): Chemical Compounds, Reactions and Enzymes
       • EcoCyc: Encyclopedia of E. coli Genes and Metabolism
       • MetaCyc: Metabolic Encyclopedia (Metabolic Pathways)
       • BRENDA: Enzyme Database
       • UM-BBD: Microbial Biocatalytic Reactions and Biodegradation Pathways
     • Cellular Regulation and Gene Networks
       • EpoDB: Genes Expressed during Human Erythropoiesis
       • BIND: Descriptions of interactions, molecular complexes and pathways
       • DIP: Catalogs experimentally determined interactions between proteins
       • BioCarta: Biological pathways of human and mouse
       • GO: Gene Ontology Consortium Database

  19. KEGG Metabolic & Regulatory Pathways
     • KEGG is a suite of databases and associated software that integrates current knowledge on molecular interaction networks, on genes and proteins, and on chemical compounds and reactions.
     • http://www.genome.jp/kegg/pathway.html
     • http://www.genome.jp/kegg/pathway.html#metabolism

  20. Multiple Genome Alignment
     • MGA: Höhl M, Kurtz S, Ohlebusch E. Efficient Multiple Genome Alignment. Bioinformatics, Vol. 18 (S1): S312-S320, 2002. http://bibiserv.techfak.uni-bielefeld.de/mga/ref.html
     • PipMaker and MultiPipMaker: Schwartz S, Elnitski L, Li M, et al. MultiPipMaker and supporting tools: alignments and analysis of multiple genomic DNA sequences. Nucleic Acids Research 31 (13): 3518-3524, Jul 1 2003. http://bio.cse.psu.edu/pipmaker/
     • MAVID: Bray N, Pachter L. MAVID multiple alignment server. Nucleic Acids Research 2003, 31: 3525-3526. http://baboon.math.berkeley.edu/mavid/, http://www-gsd.lbl.gov/vista/
     • [Figure: MultiPipMaker output]

  21. Multiple Genome Alignment: Genomic Targets for Comparative Sequencing (http://genome.ucsc.edu/)
