Demo: Phylip

http://evolution.genetics.washington.edu/phylip.html Ziheng Yang, Department of Biology, UCL


  1. Demo: Phylip
     http://evolution.genetics.washington.edu/phylip.html
     Ziheng Yang, Department of Biology, UCL

  2. Phylip: strengths
     • C program
     • Freely available and runs on all major platforms
     • Lots of people around who know how to use it
     • Runs can be automated by using redirection and command lines
     • Support for Phylip-format files by other programs such as Clustal, TreeView, etc.
     • Easy and transparent interface: each program does one simple job
     • Popular everywhere, including China and Russia, where cash is in short supply

  3. Phylip: “weaknesses”
     • Easy and simple interface (no mice and menus); renaming files can be tedious.
     • Parsimony is not as good as in PAUP*.
     • Does not automatically estimate substitution parameters (a universal ts/tv rate ratio is used).
     • Some models or options are not available.
     • Does not read NEXUS standard files.
     • Sequence names are limited to 10 characters.

  4. Common features
     Input files: infile, intree, weights, categories, fontfile
     → Phylip programs →
     Output files: outfile, outtree, plotfile
     These are default file names. If the input files do not exist, you will be asked for the file name. If the output files exist, you will be asked to confirm overwriting them.

  5. Major programs
     • dnadist: DNA alignment → distance matrix
     • protdist: protein alignment → distance matrix
     • neighbor: distance matrix → NJ tree
     • dnaml: DNA alignment → ML tree
     • dnamlk: DNA alignment → ML tree under clock
     • proml: protein alignment → ML tree
     • dnapars: DNA alignment → parsimony tree
     • protpars: protein alignment → parsimony tree
     • seqboot: DNA alignment → bootstrap datasets
     • consense: summarizes bootstrap results

  6. Sequence file format (interleaved)
     9 1141
     chimpanzee  ATGACCCCGA CACGCAAAAT TAACCCACTA ATAAAATTAA TTAATCACTC
     bonobo      ATGACCCCAA CACGCAAAAT CAACCCACTA ATAAAATTAA TTAATCACTC
     human       ATGACCCCAA TACGCAAAAT TAACCCCCTA ATAAAATTAA TTAACCGCTC
     gorilla     ATGACCCCTA TACGCAAAAC TAACCCACTA GCAAAACTAA TTAACCACTC
     bornean     ATGACCCCAA TACGCAAAAC CAACCCACTA ATAAAATTAA TTAACCACTC
     sumatran    ATGACCTCAA CACGTAAAAC CAACCCACTA ATAAAATTAA TCAACCACTC
     gibbon      ATGACCCCCC TGCGCAAAAC TAACCCACTA ATAAAACTAA TCAACCACTC
     horse       ATGACAAACA TCCGGAAATC TCACCCACTA ATTAAAATCA TCAATCACTC
     donkey      ATGACAAACA TCCGAAAATC CCACCCGCTA ATTAAAATCA TCAATCACTC

                 ATTTATCGAC CTCCCCACCC CATCCAACAT TTCCGCATGA TGGAACTTCG
                 ATTTATCGAC CTCCCCACCC CATCCAATAT TTCCACATGA TGAAACTTCG
                 ATTCATCGAC CTCCCCACCC CATCCAACAT CTCCGCATGA TGAAACTTCG
                 ATTCATTGAC CTCCCTACCC CGTCCAACAT CTCCACATGA TGAAACTTCG
                 ACTCATCGAC CTCCCCACCC CATCAAACAT CTCTGCATGA TGGAACTTCG
                 ACTTATCGAC CTCCCCACCC CATCAAACAT CTCCGCATGA TGGAACTTCG
                 ACTTATCGAC CTTCCAGCCC CATCCAACAT TTCTATATGA TGAAACTTTG
                 TTTTATTGAC CTACCAGCCC CCTCAAACAT TTCATCATGA TGAAACTTCG
                 TTTTATCGAC CTGCCAACCC CCTCAAACAT TTCATCATGA TGAAACTTTG

  7. Sequence file format (sequential)
     5 285
     human     VLSPADKTNVKAAWGKVGAHAGEYGAEALERMFLSFPTTKTYFPHFDLSHGSAQVKGHGKKVADALTNAVAHVDDMPNALSALSDLHAHKLRVDPVNFKLLSHCLLVTLAAHLPAEFTPAVHASLDKFLASVSTVLTSKYRLTPEEKSAVTALWGKVNVDEVGGEALGRLLVVYPWTQRFFESFGDLSTPDAVMGNPKVKAHGKKVLGAFSDGLAHLDNLKGTFATLSELHCDKLHVDPENFRLLGNVLVCVLAHHFGKEFTPPVQAAYQKVVAGVANALAHKYH
     goat_cow  VLSAADKSNVKAAWGKVGGNAGAYGAEALERMFLSFPTTKTYFPHFDLSHGSAQVKGHGEKVAAALTKAVGHLDDLPGTLSDLSDLHAHKLRVDPVNFKLLSHSLLVTLACHLPNDFTPAVHASLDKFLANVSTVLTSKYRLTAEEKAAVTAFWGKVKVDEVGGEALGRLLVVYPWTQRFFESFGDLSTADAVMNNPKVKAHGKKVLDSFSNGMKHLDDLKGTFAALSELHCDKLHVDPENFKLLGNVLVVVLARNFGKEFTPVLQADFQKVVAGVANALAHRYH
     rabbit    VLSPADKTNIKTAWEKIGSHGGEYGAEAVERMFLGFPTTKTYFPHFDFTHGSEQIKAHGKKVSEALTKAVGHLDDLPGALSTLSDLHAHKLRVDPVNFKLLSHCLLVTLANHHPSEFTPAVHASLDKFLANVSTVLTSKYRLSSEEKSAVTALWGKVNVEEVGGEALGRLLVVYPWTQRFFESFGDLSSANAVMNNPKVKAHGKKVLAAFSEGLSHLDNLKGTFAKLSELHCDKLHVDPENFRLLGNVLVIVLSHHFGKEFTPQVQAAYQKVVAGVANALAHKYH
     ...
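The sequential layout is simple enough to write from a script. The following is a minimal sketch, not part of Phylip: the function name `format_phylip_sequential` is made up for illustration, and it assumes each sequence is written on a single line after its name, with the name padded by spaces to exactly 10 columns.

```python
def format_phylip_sequential(records):
    """Return sequential Phylip text for a list of (name, sequence) pairs.

    Hypothetical helper for illustration; it is not part of the Phylip package.
    """
    if not records:
        raise ValueError("no sequences given")
    length = len(records[0][1])
    # Header line: number of sequences, then number of sites.
    lines = [f"{len(records)} {length}"]
    for name, seq in records:
        if len(name) > 10:
            raise ValueError(f"name longer than 10 characters: {name!r}")
        if len(seq) != length:
            raise ValueError(
                f"sequence {name!r} has length {len(seq)}, expected {length}"
            )
        # Pad the name with spaces (never Tabs) to exactly 10 columns.
        lines.append(name.ljust(10) + seq)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Toy data: the first 10 sites of two of the cytb sequences shown earlier.
    toy = [("human", "ATGACCCCAA"), ("chimpanzee", "ATGACCCCGA")]
    print(format_phylip_sequential(toy), end="")
```

Raising an error on a name longer than 10 characters, rather than truncating it silently, is a design choice of this sketch: truncation could make two names identical.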

  8. Common data-file problems
     • Input data files are plain text files. Use type (Windows), cat (Unix) or more to check them.
     • The sequence name must occupy 10 characters. Add spaces to separate the name from the sequence. Note that a Tab is different from one or many spaces. Note also the difference between “invisible” spaces and nothing, and beware of your editor. If you have the name human on a line by itself, make sure it has at least 5 trailing spaces.
     • Line endings (line feeds) are known to cause problems, especially when files are transferred among platforms or over the network. Try re-saving the file from a program. Sequence data files are liable to be corrupted when sent by email, so send zip or gz files instead.
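Most of the problems listed above can be detected mechanically before running any program. The following is a rough sketch of such a check. The function name `check_phylip_text` is hypothetical, and the sketch assumes that the first line holds the two counts and that each of the next lines (one per sequence) begins with a name, as in the two layouts shown earlier. Sequential files that wrap one sequence over several lines would need a more careful parser.

```python
def check_phylip_text(raw):
    """Return a list of warnings about common Phylip data-file problems.

    `raw` is the file content as bytes, so that line endings stay visible.
    Hypothetical helper for illustration; it is not part of the Phylip package.
    """
    problems = []
    if b"\r" in raw:
        problems.append("carriage returns found (CR or CRLF line endings)")
    if b"\t" in raw:
        problems.append("Tab characters found; pad names with spaces")

    text = raw.decode("ascii", errors="replace")
    lines = text.replace("\r\n", "\n").replace("\r", "\n").split("\n")

    header = lines[0].split()
    if len(header) < 2 or not (header[0].isdigit() and header[1].isdigit()):
        problems.append("first line should start with two integers: sequences and sites")
        return problems

    nseq = int(header[0])
    named = lines[1:1 + nseq]  # assumed: one name-bearing line per sequence
    if len(named) < nseq:
        problems.append(f"expected {nseq} named lines, found {len(named)}")
    for number, line in enumerate(named, start=2):
        field = line[:10]  # the 10-character name field
        if len(field) < 10:
            problems.append(f"line {number}: name field shorter than 10 characters")
        elif any(ch.isspace() for ch in field.rstrip()):
            # For example "human ATGA": the name was not padded to 10 columns.
            problems.append(f"line {number}: whitespace inside the 10-character name field")
    return problems
```

Reading the file in binary mode, for example `check_phylip_text(open("cytb.phy", "rb").read())`, keeps carriage returns visible; text mode may translate them away on some platforms. A name that legitimately contains a space will also trigger the last warning, so treat the output as hints rather than errors.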

  9. Windows annoyances
     • Turn on file extensions. In Windows Explorer, go to "Tools - Folder options - View" and untick "Hide extensions for known file types".
     • Run jobs from the command line rather than by double-clicking in Windows Explorer.
     • Use Task Manager to run your large jobs at lower priority (nice and renice on Unix). If you set the cmd process to low priority, all jobs started from that window will run at low priority. Resist the temptation to run a big job on your friend’s machine, or you will lose her.

  10. Clustal

  11. A parsimony analysis (dnapars)
     Windows / Unix command equivalents: del / rm, copy / cp, move / mv
     set p
     set path=d:\soft\phylip\;%PATH%
     set p
     copy cytb.phy infile
     dnapars
     move outfile cytb.mp.o
     del infile out*
     dnapars
     move outfile cytb.mp.o
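The copy, run, move and delete steps above can be wrapped in a short script, which is one way to automate runs through redirection. The following is a sketch under stated assumptions: `run_phylip_program` is a hypothetical helper, the program is assumed to read its menu answers from standard input, and a single "Y" is assumed to be enough to accept the default settings. If the program asks further questions, the `answers` string must be extended to match.

```python
import os
import shutil
import subprocess


def run_phylip_program(command, alignment, result, workdir=".", answers="Y\n"):
    """Run one Phylip-style program on `alignment` and save its outfile as `result`.

    Hypothetical helper for illustration. `command` is an argument list such as
    ["dnapars"]; `answers` is the text fed to the program's menu on standard input.
    """
    infile = os.path.join(workdir, "infile")
    outfile = os.path.join(workdir, "outfile")
    outtree = os.path.join(workdir, "outtree")

    # Remove leftovers so the program does not stop to ask about overwriting them.
    for path in (outfile, outtree):
        if os.path.exists(path):
            os.remove(path)

    shutil.copyfile(alignment, infile)      # copy cytb.phy infile
    subprocess.run(                         # dnapars, with answers redirected to stdin
        command,
        input=answers,
        text=True,
        cwd=workdir,
        stdout=subprocess.DEVNULL,          # hide the interactive menu text
        check=True,
    )
    shutil.move(outfile, result)            # move outfile cytb.mp.o

    # del infile out*
    os.remove(infile)
    if os.path.exists(outtree):
        os.remove(outtree)
```

With the Phylip programs on the path, a call such as `run_phylip_program(["dnapars"], "cytb.phy", "cytb.mp.o")` would mirror the commands on the slide. Note that the tree file (outtree) is deleted here, as on the slide; move it first if it is needed.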

  12. Example files
     http://abacus.gene.ucl.ac.uk/ziheng/teach/cytb.txt
     http://abacus.gene.ucl.ac.uk/ziheng/teach/abglobin.aa
     http://abacus.gene.ucl.ac.uk/ziheng/teach/testMB.nex
     http://abacus.gene.ucl.ac.uk/ziheng/teach/adh.nex
