Clustal W and Clustal X version 2.0

김영호, 박준호, 최현희. The 9th Protein Folding Winter School.



  1. Clustal W and Clustal X version 2.0
     김영호, 박준호, 최현희
     The 9th Protein Folding Winter School

  2. The Paper

  3. Abstract
     • The Clustal W and Clustal X multiple sequence alignment programs have been completely rewritten in C++.
     • This will facilitate further development of the alignment algorithms.
     • It has also allowed proper porting of the programs to the latest versions of the Linux, Macintosh and Windows operating systems.

  4. Contents
     1. Introduction
     2. Clustal W 2.0 and Clustal X 2.0
     3. New Features
     4. Related Sources

  5. Introduction
     • Clustal is one of the oldest and most widely used multiple sequence alignment programs
     • First distributed by post on floppy disks (late 1980s; written in Microsoft Fortran for MS-DOS)
     • Clustal 1 to Clustal 4 (1988, 1989; IBM-compatible PCs)
     • Clustal V (1992; VAX/VMS, Unix, Apple Macintosh, IBM-compatible PCs)

  6. Introduction
     • Clustal W and Clustal X (late 1990s)
     • Other powerful tools
       • BAliBASE (a benchmark alignment database used to evaluate aligners, not an aligner itself)
       • T-Coffee
       • MAFFT
       • MUSCLE
     • Yet Clustal W and Clustal X continue to be very widely used (the EBI Clustal site gets millions of multiple alignment jobs per year)

  7. Introduction
     • Clustal W and Clustal X
       • W: command-line interface
       • X: graphical interface
     • Procedure
       • Sequence input (choose a chain or domain from each FASTA sequence)
       • Concatenate all the query sequences into one file
       • Run
       • Output (score, alignment)
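The "concatenate all the query sequences" step of the procedure can be sketched as below. This is only an illustration: the function name `concatenate_fasta` and the file names in the usage note are hypothetical, and the check for a leading `>` is a minimal sanity test rather than a full FASTA validator.

```python
from pathlib import Path


def concatenate_fasta(paths, out_path):
    """Write the records of several FASTA files into one file, in order.

    Returns the number of records written. Hypothetical helper, not part
    of Clustal itself.
    """
    chunks = []
    for p in paths:
        text = Path(p).read_text().strip()
        # A FASTA record starts with a '>' header line.
        if not text.startswith(">"):
            raise ValueError(f"{p} does not look like FASTA (no '>' header)")
        chunks.append(text)
    # Join with single newlines so no blank lines separate records.
    Path(out_path).write_text("\n".join(chunks) + "\n")
    # Every record contributes exactly one header line.
    return sum(chunk.count("\n>") + 1 for chunk in chunks)
```

For example, `concatenate_fasta(["a.fasta", "b.fasta"], "all.fasta")` would produce a single input file that can then be passed to the alignment program.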

  8. Clustal W 2.0 and Clustal X 2.0
     • What’s new?
       • Rewritten in C++
         • Easier to maintain the code
         • Easier to modify or replace some of the alignment algorithms
       • UPGMA guide trees
         • An alternative to the NJ (neighbor-joining) guide trees
         • Speed up the alignment of large data sets
       • Iterative alignment facility
         • Increases alignment accuracy

  9. Clustal W 2.0 and Clustal X 2.0
     • Clustal X
       • Developed using NCBI’s Vibrant toolkit
       • The Vibrant toolkit is no longer supported
     • Clustal X 2.0
       • Rewritten using the Qt GUI toolkit
       • Qt provides a native look and feel on Windows, Linux and Mac platforms

  10. New Features
      • UPGMA
        • Faster than NJ (clustering 10,000 sequences takes less than a minute, while NJ takes over an hour)
        • Slightly less accurate than NJ on the BAliBASE benchmark, but on large alignments this is offset by the saving in processing time (2 h vs. 12 h)
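To make the UPGMA idea concrete, here is a minimal sketch of the clustering step: repeatedly merge the two closest clusters and set the distance from the merged cluster to every other cluster to the size-weighted average of the two old distances. It is a naive textbook version written for readability, not Clustal's implementation, and it says nothing about the timings quoted on the slide. The function name `upgma` and the nested-tuple tree format are assumptions made for this example.

```python
def upgma(names, dist):
    """Cluster labelled items with UPGMA and return a nested-tuple tree.

    names: list of leaf labels
    dist:  symmetric matrix (list of lists) of pairwise distances
    Illustrative sketch only; branch lengths are not computed.
    """
    n = len(names)
    # Active clusters: id -> (subtree, number of leaves).
    clusters = {i: (name, 1) for i, name in enumerate(names)}
    # Distances between active clusters, keyed by (smaller id, larger id).
    d = {(i, j): dist[i][j] for i in range(n) for j in range(i + 1, n)}
    next_id = n
    while len(clusters) > 1:
        # Pick the closest pair of clusters.
        (a, b), _ = min(d.items(), key=lambda kv: kv[1])
        tree_a, size_a = clusters.pop(a)
        tree_b, size_b = clusters.pop(b)
        # Distance to the merged cluster: size-weighted average.
        for c in clusters:
            dac = d.pop((min(a, c), max(a, c)))
            dbc = d.pop((min(b, c), max(b, c)))
            d[(c, next_id)] = (size_a * dac + size_b * dbc) / (size_a + size_b)
        del d[(a, b)]
        clusters[next_id] = ((tree_a, tree_b), size_a + size_b)
        next_id += 1
    (tree, _size), = clusters.values()
    return tree
```

With four sequences where A and B are closest, C is next and D is farthest from all, the sketch joins A and B first, then adds C, then D.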

  11. New Features
      • Iteration
        • A quick and effective method of refining alignments
        • ‘Remove first’ iteration scheme
        • Scored by WSP (Weighted Sum of Pairs)
        • During each iteration step, each sequence is removed from the alignment in turn and realigned. If the WSP score is reduced, the resulting alignment is retained.

  12. New Features
      • Command line options
        • ‘-clustering=UPGMA’
          • Uses the UPGMA algorithm for the guide tree
        • ‘-iteration=alignment’
          • Refines the final alignment
          • Less accurate but faster
        • ‘-iteration=tree’
          • Refines at each step in the progressive alignment
          • More accurate but slower
        • ‘-numiters’
          • Sets the number of iteration cycles (default: 3)
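As a small illustration of how these options combine, the sketch below assembles an argument list without running anything. The option spellings are copied from the slide as written. The executable name `clustalw2`, the `-infile=` option, and the `=value` form for `-numiters` are assumptions, so the result should be checked against the installed program's own help output before use.

```python
def clustal_args(infile, clustering=None, iteration=None, numiters=None,
                 exe="clustalw2"):
    """Build an argument list from the options named on the slide.

    Hypothetical helper: `exe`, `-infile=` and the `-numiters=N` form
    are assumptions, not confirmed by the slides.
    """
    args = [exe, f"-infile={infile}"]
    if clustering is not None:
        args.append(f"-clustering={clustering}")
    if iteration is not None:
        # The slide lists only these two iteration modes.
        if iteration not in ("alignment", "tree"):
            raise ValueError("iteration must be 'alignment' or 'tree'")
        args.append(f"-iteration={iteration}")
    if numiters is not None:
        args.append(f"-numiters={numiters}")
    return args
```

A list like this could be handed to `subprocess.run` once the option names have been verified for the installed version.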

  13. Related Sources
      • EBI website
        • European Bioinformatics Institute website
        • Supports several alignment programs
        • Various programs can be tried there (e.g. ClustalW, MAFFT, T-Coffee, MUSCLE)

  14. Related Sources • Clustal (web)

  15. Related Sources • Clustal (dos)

  16. Related Sources • Clustal (dos)

  17. Related Sources • MUSCLE

  18. Related Sources • T-Coffee

  19. Related Sources • MAFFT

  20. Related Sources • Kalign

  21. Thank You !
