bioinformatics and sequence analysis l.
Download
Skip this Video
Loading SlideShow in 5 Seconds..
Bioinformatics and sequence analysis PowerPoint Presentation
Download Presentation
Bioinformatics and sequence analysis

Loading in 2 Seconds...

play fullscreen
1 / 114

Bioinformatics and sequence analysis - PowerPoint PPT Presentation


  • 233 Views
  • Uploaded on

Bioinformatics and sequence analysis. Michael Nilges Unité de Bio-Informatique Structurale Institut Pasteur, Paris Mars 2002. Overview. Bioinformatics: a brief overview Organising knowledge: databanks and databases Protein sequence analysis Sequence alignment

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

Bioinformatics and sequence analysis


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
    Presentation Transcript
    1. Bioinformaticsand sequence analysis Michael Nilges Unité de Bio-Informatique Structurale Institut Pasteur, Paris Mars 2002

    2. Overview • Bioinformatics: a brief overview • Organising knowledge: databanks and databases • Protein sequence analysis • Sequence alignment • Multiple alignment and sequence pofiles • Phylogenetic trees Bioinformatics lecture March 5, 2002

    3. I. Bioinformatics - a brief overview

    4. What is it? Bioinformatics: Deduction of knowledge by computer analysis of biological data. or: see 20000 pages on this issue on the WWW Bioinformatics lecture March 5, 2002

    5. The data • information stored in the genetic code (DNA) • protein sequences • 3D structures • experimental results from various sources • patient statistics • scientific literature Bioinformatics lecture March 5, 2002

    6. Algorithmic developments • Important part of research in bioinformatics: methods for data storage data retrieval data analysis Bioinformatics lecture March 5, 2002

    7. Interdisciplinary research • rapidly developing branch of biology • highly interdisciplinary: using techniques and concepts from informatics, statistics, mathematics, chemistry, biochemistry, physics, and linguistics. • many practical applications in biology and medicine. Bioinformatics lecture March 5, 2002

    8. Computation in biology... • similar to other sciences: computational physics, computational chemistry derivation of physics laws from astronomical data • already in the '20s biologists wanted to derive knowledge by induction • reasons for recent development: development of computers and networks availability of data (sequences, 3D structures) amount of data Bioinformatics lecture March 5, 2002

    9. Why? • An avalanche of data: • Sequences • Function related • Structures • requires computational approaches Bioinformatics lecture March 5, 2002

    10. Genomics • New way to perform experiments: • accumulation of data: • sequences • structures, • function-related • not hypothesis-driven • Hypothesis formed later and tested in silico Bioinformatics lecture March 5, 2002

    11. Bioinformatics key areas • e.g. homology searches organisation of knowledge (sequences, structures, functional data) Bioinformatics lecture March 5, 2002

    12. Structural Bioinformatics • Prediction of structure from sequence • secondary structure • homology modelling, threading • ab initio 3D prediction • Analysis of 3D structure • structure comparison/ alignment • prediction of function from structure • molecular mechanics/ molecular dynamics • prediction of molecular interactions, docking • Structure databases (RCSB) Bioinformatics lecture March 5, 2002

    13. Structural Bioinformatics Bioinformatics lecture March 5, 2002

    14. II. Databases

    15. Organizing knowledgein databanks and databases • Introduction • Sequence databanks and databases • EMBL, SwissProt, TREMBL • SRS: Sequence Retrieval system • 3D structure database: the RCSB - PDB • Domain databases Bioinformatics lecture March 5, 2002

    16. Biological databanks and databases • Very fast growth of biological data • Diversity of biological data: • primary sequences • 3D structures • functional data • Database entry usually required for publication • Sequences • Structures • Database entry may replace primary publication • genomic approaches Bioinformatics lecture March 5, 2002

    17. DNA sequence data bases • Three databanks exchange data on a daily basis • Data can be submitted and accessed at either location • Genebank • www.ncbi.nlm.nih.gov/Genbank/GenbankOverview.html • EMBL • www.ebi.ac.uk/embl/index.html • DNA DataBank of Japan (DDBJ) • www.nig.ac.jp/home.html Bioinformatics lecture March 5, 2002

    18. EMBL database growth Bioinformatics lecture March 5, 2002

    19. Distribution of entries Bioinformatics lecture March 5, 2002

    20. EMBL database documentation • Information on • user manual • release notes • feature table definition... see • http://www.ebi.ac.uk/embl/Documentation Bioinformatics lecture March 5, 2002

    21. EMBL entry for insulin receptor Bioinformatics lecture March 5, 2002

    22. EMBL entry 2: features Bioinformatics lecture March 5, 2002

    23. EMBL entry 3: sequence Bioinformatics lecture March 5, 2002

    24. SwissProt protein sequence data baseTREMBL translated EMBL • hosted jointly by EBI (European Bioinformatics Institute, an EMBL outstation in Hinxton, UK) and SIB (Swiss Institute for Bioinformatics in Lausanne and Geneva) • SwissProt is curated (Amos Bairoch) • quality checks • annotations • links to other databases • TREMBL: automatic translation of EMBL • automatic annotations Bioinformatics lecture March 5, 2002

    25. ExPASy - www.expasy.orgExpert Protein Analysis System Bioinformatics lecture March 5, 2002

    26. FASTA format one line header, starting with > some programs require several characters without space after > sequence, in free format (no numbers) Bioinformatics lecture March 5, 2002

    27. SWISS-PROT entry for insulin receptor(NiceProt view)

    28. Features of insulin receptor Bioinformatics lecture March 5, 2002

    29. Niceprot Feature Aligner Bioinformatics lecture March 5, 2002

    30. Clustalw-alignment of two domains Bioinformatics lecture March 5, 2002

    31. Links to other sites (Blast, ...) Bioinformatics lecture March 5, 2002

    32. The RCSB-PDBwww.rcsb.org/pdb • Data bases for 3D structures of biological macromolecules (proteins, nucleic acides) • RCSB (Research Collaboratory for Structural Bioinformatics) maintains and develops the PDB (Protein Data Bank) • others: • MMDB (EBI): msd.ebi.ac.uk • NCBI: www.ncbi.nlm.nih.gov/Structure/ Bioinformatics lecture March 5, 2002

    33. www.rcsb.org/pdb

    34. Results of a simple query Bioinformatics lecture March 5, 2002

    35. View structures Bioinformatics lecture March 5, 2002

    36. Domain databases • Pfam (A/B) www.sanger.ac.uk/Pfam • Smart smart.embl-heidelberg.de • Prodom prodes.toulouse.inra.fr/prodom/doc/prodom.html • Dart www.ncbi.nlm.nih.gov/Structure/lexington/lexington.cgi?cmd=rps • Interpro www.ebi.ac.uk/interpro/ Bioinformatics lecture March 5, 2002

    37. InterPro • InterPro release 4.0 (Nov 2001) was built from • Pfam 6.6, PRINTS 31.0, PROSITE 16.37, ProDom 2001.2, SMART 3.1, TIGRFAMs 1.2, SWISS-PROT + TrEMBL data. • 4691 entries: 1068 domains, 3532 families, 74 repeats and 15 post-translational modification sites. Bioinformatics lecture March 5, 2002

    38. Results of InterPro search for spectrin Bioinformatics lecture March 5, 2002

    39. Spectrin repeat Bioinformatics lecture March 5, 2002

    40. SMART database Bioinformatics lecture March 5, 2002

    41. Domain architecture of spectrin beta chain Bioinformatics lecture March 5, 2002

    42. Pfam home page Bioinformatics lecture March 5, 2002

    43. Compilations of links to databases at Institut Pasteur www.pasteur.fr/recherche/banques at Infobiogen (Evry) www.infobiogen.fr/services/deambulum/fr European bioinformatics institute (ebi) www.ebi.ac.uk/Databases/index.html at the swiss institute for bioinformatics (SIB) www.expasy.org www.expasy.org/alinks.html#Proteins Bioinformatics lecture March 5, 2002

    44. SRS sequence retrieval system • unified way to access and link information in different databases • powerful queries • launch applications (e.g. blast, clustalw...) • temporary and permanent projects • can be reached from the pasteur databank page • srs.pasteur.fr/cgi-bin/srs6/wgetz Bioinformatics lecture March 5, 2002

    45. SRS 6 start page Bioinformatics lecture March 5, 2002