
Bioinformatics

Bioinformatics. Richard Tseng and Ishawar Hosamani. Outline. Homology modeling (Ishwar) Structural analysis Structure prediction Structure comparisons Cluster analysis Partitioning method Density-based method Phylogenetic analysis. Structural Analysis. Overview Structure prediction


Presentation Transcript


  1. Bioinformatics Richard Tseng and Ishawar Hosamani

  2. Outline • Homology modeling (Ishwar) • Structural analysis • Structure prediction • Structure comparisons • Cluster analysis • Partitioning method • Density-based method • Phylogenetic analysis

  3. Structural Analysis • Overview • Structure prediction • Structural alignment • Similarity

  4. Tools for protein structure prediction • Protein • Secondary structure prediction: SSEA http://protein.cribi.unipd.it/ssea/ • Tertiary structure prediction: • Wurst: http://www.zbh.uni-hamburg.de/wurst/ • LOOPP: http://cbsuapps.tc.cornell.edu/loopp.aspx

  5. WURST (Torda et al. (2004) Wurst: A protein threading server with a structural scoring function, sequence profiles and optimized substitution matrices. Nucleic Acids Res., 32, W532-W535) • Rationale • Alignment: Sequence-to-structure alignments are done with a Smith-Waterman style alignment and the Gotoh algorithm • Score function: a fragment-based sequence-to-structure compatibility score plus a pure sequence-sequence substitution score component • Library: Dali PDB90 (24599 structures)
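The alignment step named on this slide can be illustrated with a minimal sketch of Smith-Waterman local alignment using Gotoh's affine-gap recurrences. This is not Wurst's code: Wurst scores sequence against structure, whereas this sketch aligns two plain sequences. The function name `smith_waterman_gotoh` and the match, mismatch and gap values are illustrative assumptions.

```python
def smith_waterman_gotoh(a, b, match=2, mismatch=-1, gap_open=-3, gap_extend=-1):
    """Best local alignment score of strings a and b with affine gaps.

    Assumed convention: a gap of length k costs gap_open + (k - 1) * gap_extend.
    The scoring values are illustrative, not those used by Wurst.
    """
    neg_inf = float("-inf")
    n, m = len(a), len(b)
    # H: best local alignment ending at (i, j)
    # E: best alignment ending with a gap that consumes b[j-1]
    # F: best alignment ending with a gap that consumes a[i-1]
    H = [[0] * (m + 1) for _ in range(n + 1)]
    E = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    F = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    best = 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Gotoh: either open a new gap from H or extend an existing one
            E[i][j] = max(H[i][j - 1] + gap_open, E[i][j - 1] + gap_extend)
            F[i][j] = max(H[i - 1][j] + gap_open, F[i - 1][j] + gap_extend)
            s = match if a[i - 1] == b[j - 1] else mismatch
            # Smith-Waterman: scores are floored at 0 so alignments stay local
            H[i][j] = max(0, H[i - 1][j - 1] + s, E[i][j], F[i][j])
            best = max(best, H[i][j])
    return best


if __name__ == "__main__":
    print(smith_waterman_gotoh("ACGTACGT", "ACGTTACGT"))
```

A threading server would replace the match/mismatch term `s` with a score for placing residue `a[i-1]` at structural position `j`; the recurrences themselves stay the same.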

  6. Tools for structure comparison • Pairwise structure comparison: • TopMatch • Matras: (http://biunit.naist.jp/matras/) • Multiple structure comparison: • 3D-surfer • Matras: (http://biunit.naist.jp/matras/)

  7. TopMatch (Sippl & Wiederstein (2008) A note on difficult structure alignment problems. Bioinformatics 24, 426-427) • Rationale: • Structure alignment: http://www.cgl.ucsf.edu/home/meng/grpmt/structalign.html • Similarity measurement • Input format • PDB, SCOP and CATH code • PDB structure directly • Exercise: http://topmatch.services.came.sbg.ac.at/

  8. 3D-surfer (David La et al. 3D-SURFER: software for high throughput protein surface comparison and analysis. Bioinformatics, in press (2009)) • Rationale • Define a surface function • Transform the surface function into a 3D Zernike descriptor • Input format • PDB and CATH code • PDB structure directly • Exercise: http://dragon.bio.purdue.edu/3d-surfer/

  9. Cluster analysis • Goal: • Grouping the data into classes or clusters, so that objects within a cluster have high similarity in comparison to one another but are very dissimilar to objects in other clusters. • Methods • Partitioning method: k-means • Density-based method: Ordering Points to Identify the Clustering Structure (OPTICS)

  10. k-means • Rationale: Partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean • Exercise http://cgm.cs.ntust.edu.tw/etrex/kMeansClustering/kMeansClustering2.html
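The rationale above can be sketched with Lloyd's iteration, the usual heuristic for k-means: assign every observation to its nearest mean, recompute the means, and repeat until nothing changes. The function name `kmeans`, the seeded random initialisation and the toy points are assumptions made for illustration.

```python
import random


def kmeans(points, k, iterations=100, seed=0):
    """Lloyd's algorithm on a list of equal-length numeric tuples.

    Returns (centers, labels); labels[i] is the cluster index of points[i].
    Initial centers are k distinct input points picked with a seeded RNG.
    """
    rng = random.Random(seed)
    centers = rng.sample(points, k)

    def nearest(p):
        # Index of the center with the smallest squared Euclidean distance
        return min(range(k),
                   key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centers[c])))

    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p)].append(p)
        # New center = coordinate-wise mean; an empty cluster keeps its old center
        new_centers = [
            tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else centers[i]
            for i, cl in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    labels = [nearest(p) for p in points]
    return centers, labels


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    centers, labels = kmeans(data, k=2)
    print(centers, labels)
```

The result depends on the initial centers, so in practice the algorithm is usually run from several random starts and the partition with the lowest within-cluster sum of squares is kept.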

  11. OPTICS • Rationale: Partition observations based on the density of similar objects • Exercise http://www.dbs.informatik.uni-muenchen.de/Forschung/KDD/Clustering/OPTICS/Demo/
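As a rough illustration of the density idea, the sketch below computes an OPTICS-style ordering with reachability distances using a brute-force neighbour search. The function name `optics`, the parameter names `min_pts` and `eps`, and the convention that a point counts as its own neighbour are assumptions; production implementations use spatial indexes and a priority queue instead of the quadratic scans shown here.

```python
import math


def optics(points, min_pts, eps=math.inf):
    """Return (order, reachability) for a list of numeric tuples.

    order is the sequence in which points are visited; reachability[i]
    belongs to points[order[i]]. The first point of each density-connected
    group keeps reachability = inf.
    """
    n = len(points)

    def dist(i, j):
        return math.dist(points[i], points[j])

    def core_distance(i):
        # Distance to the min_pts-th nearest point within eps (the point itself counts)
        near = sorted(d for d in (dist(i, j) for j in range(n)) if d <= eps)
        return near[min_pts - 1] if len(near) >= min_pts else None

    reach = [math.inf] * n
    processed = [False] * n
    order = []
    for start in range(n):
        if processed[start]:
            continue
        seeds = {start}
        while seeds:
            # Always expand the candidate that is currently easiest to reach
            p = min(seeds, key=lambda i: (reach[i], i))
            seeds.remove(p)
            processed[p] = True
            order.append(p)
            cd = core_distance(p)
            if cd is None:
                continue  # p is not a core point, so it does not expand the cluster
            for q in range(n):
                if processed[q] or dist(p, q) > eps:
                    continue
                reach[q] = min(reach[q], max(cd, dist(p, q)))
                seeds.add(q)
    return order, [reach[i] for i in order]


if __name__ == "__main__":
    pts = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0), (12, 0)]
    order, reachability = optics(pts, min_pts=2)
    print(order, reachability)
```

Plotting the reachability values in visiting order gives the reachability plot: valleys correspond to dense clusters and high bars mark the jumps between them.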

  12. Example: Folding of Trp-cage peptide

  13. Phylogenetic analysis • Overview • Comparisons of more than two sequences • Analysis of gene families, including functional predictions • Estimation of evolutionary relationships among organisms

  14. Theoretical tree • Parsimony method • Distance matrix method • Maximum likelihood and Bayesian method • Invariants method

  15. Software • Collections of tools http://evolution.genetics.washington.edu/phylip/software.html • Web server versions for tree construction and display • PHYLIP, http://bioweb2.pasteur.fr/phylogeny/intro-en.html • Interactive Tree Of Life, http://itol.embl.de/ • Most commonly used standalone software • PHYLIP, a package for evaluating the similarity of nucleotide and amino acid sequences. http://evolution.gs.washington.edu/phylip.html • TreeView, a tool for visualizing and manipulating family trees. http://taxonomy.zoology.gla.ac.uk/rod/treeview.html • MATLAB Bioinformatics Toolbox

  16. Example: Alignment-based phylogenetic tree of the tubulin family • Searching for homologous sequences of tubulin (PDB code: 1JFF) in the RCSB Protein Data Bank • BLAST for pairwise sequence alignment • ClustalW for comparative sequence alignment • Evaluating the protein distance matrix • using “Protdist” of PHYLIP (specifically, the Point Accepted Mutation (PAM) matrix is used) • Clustering proteins using “Neighbor” of PHYLIP (the Neighbor-Joining method is used)
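The clustering step in this workflow turns a distance matrix into a tree. The sketch below shows the neighbor-joining recurrences on a small matrix and returns a Newick string. It is a teaching sketch, not PHYLIP's "Neighbor" program: the function name `neighbor_joining`, the taxon labels and the toy distances are assumptions, and no tubulin data from the slides is used.

```python
def neighbor_joining(names, matrix):
    """Unrooted neighbor-joining tree as a Newick string.

    names: unique taxon labels; matrix: symmetric distances, matrix[i][j]
    being the distance between names[i] and names[j].
    """
    if len(names) < 3:
        raise ValueError("neighbor joining needs at least three taxa")
    nodes = list(names)
    d = {a: {b: float(matrix[i][j]) for j, b in enumerate(names)}
         for i, a in enumerate(names)}
    while len(nodes) > 3:
        n = len(nodes)
        # r[a]: total distance from a to every other active node
        r = {a: sum(d[a][b] for b in nodes if b != a) for a in nodes}
        best = None
        for i in range(n):
            for j in range(i + 1, n):
                a, b = nodes[i], nodes[j]
                q = (n - 2) * d[a][b] - r[a] - r[b]  # Q criterion
                if best is None or q < best[0]:
                    best = (q, a, b)
        _, a, b = best
        # Branch lengths from the joined pair to their new parent node
        len_a = 0.5 * d[a][b] + (r[a] - r[b]) / (2 * (n - 2))
        len_b = d[a][b] - len_a
        new = f"({a}:{len_a:.3f},{b}:{len_b:.3f})"
        d[new] = {new: 0.0}
        for c in nodes:
            if c != a and c != b:
                d[new][c] = d[c][new] = 0.5 * (d[a][c] + d[b][c] - d[a][b])
        nodes = [c for c in nodes if c != a and c != b] + [new]
    # Three nodes remain: connect them at one central node
    x, y, z = nodes
    lx = 0.5 * (d[x][y] + d[x][z] - d[y][z])
    ly = 0.5 * (d[x][y] + d[y][z] - d[x][z])
    lz = 0.5 * (d[x][z] + d[y][z] - d[x][y])
    return f"({x}:{lx:.3f},{y}:{ly:.3f},{z}:{lz:.3f});"


if __name__ == "__main__":
    taxa = ["a", "b", "c", "d", "e"]  # hypothetical labels
    toy = [[0, 5, 9, 9, 8],
           [5, 0, 10, 10, 9],
           [9, 10, 0, 8, 7],
           [9, 10, 8, 0, 3],
           [8, 9, 7, 3, 0]]
    print(neighbor_joining(taxa, toy))
```

The printed Newick string can be opened in a tree viewer such as TreeView or iTOL. With real data, the matrix would come from a program such as Protdist rather than being typed in by hand.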

  17. Example: n-distance phylogenetic tree • Evaluating the n-distance matrix • n-distance method • Clustering proteins using “Neighbor” of PHYLIP (the Neighbor-Joining method is used) • 16S and 18S ribosomal RNA sequences of 35 organisms

  18. Summary • Homology modeling • Tools for structure prediction and comparisons • Tools for phylogenetic tree construction Thanks for your attention!!

  19. Protein distance matrix

  20. Tubulin family tree

  21. n-distance method • Frequency count of “n-letter words” • n-distance matrix • Advantage: • Identifies fully conserved words located at nearly the same sites • Efficient MREIVHIQAGQCGNQIGAKFWEVISDEHGIDPTGSYHGDSDLQLERINVYYNE
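The word-counting idea can be sketched as follows. The slides do not give the distance formula, so this sketch makes two assumptions: it ignores word positions, and it uses the Euclidean distance between normalised n-letter word frequencies. The function names `word_frequencies`, `n_distance` and `n_distance_matrix` are hypothetical, and the second sequence in the demo is the slide's fragment with two made-up substitutions.

```python
import math
from collections import Counter


def word_frequencies(seq, n):
    """Relative frequency of every overlapping n-letter word in seq."""
    if len(seq) < n:
        raise ValueError("sequence is shorter than the word length")
    words = [seq[i:i + n] for i in range(len(seq) - n + 1)]
    return {w: c / len(words) for w, c in Counter(words).items()}


def n_distance(seq_a, seq_b, n=2):
    """Assumed metric: Euclidean distance between word-frequency vectors."""
    fa, fb = word_frequencies(seq_a, n), word_frequencies(seq_b, n)
    return math.sqrt(sum((fa.get(w, 0.0) - fb.get(w, 0.0)) ** 2
                         for w in set(fa) | set(fb)))


def n_distance_matrix(seqs, n=2):
    """Symmetric matrix of pairwise n-distances for a list of sequences."""
    return [[n_distance(a, b, n) for b in seqs] for a in seqs]


if __name__ == "__main__":
    fragment = "MREIVHIQAGQCGNQIGAKFWEVISDEHGIDPTGSYHGDSDLQLERINVYYNE"  # from the slide
    variant = "MREIVHLQAGQCGNQIGAKFWEVISDEHGIDPTGTYHGDSDLQLERINVYYNE"  # made-up variant
    print(n_distance_matrix([fragment, variant], n=2))
```

No alignment is needed, which is where the efficiency comes from: each sequence is scanned once, and the resulting matrix can be passed directly to a clustering step such as neighbor joining.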
