
Orthology & Paralogy Alignment & Assembly

Orthology & Paralogy Alignment & Assembly. Alastair Kerr Ph.D. [many slides borrowed from various sources]. Overview. Orthology & Paralogy Definitions and examples Ways to determine an ortholog Pre-calculations: resources Alignment & Assembly Differences Key programs for each


Presentation Transcript


  1. Orthology & Paralogy, Alignment & Assembly. Alastair Kerr Ph.D. [many slides borrowed from various sources]

  2. Overview • Orthology & Paralogy • Definitions and examples • Ways to determine an ortholog • Pre-calculations: resources • Alignment & Assembly • Differences • Key programs for each • Jalview example

  3. Homologs Have common origins but may or may not have common activity. Homologous or not? Often decided by an arbitrary threshold on the level of similarity measured from an alignment
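The threshold idea on this slide can be sketched in a few lines of Python. This is only an illustration: the 30% cut-off and the function names are assumptions made for the example, not values given in the talk, and the input is assumed to be two already-aligned sequences with "-" for gaps.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue  # skip gapped columns
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Hypothetical cut-off chosen for this sketch; any such value is arbitrary.
THRESHOLD = 30.0


def call_homologous(a: str, b: str, threshold: float = THRESHOLD) -> bool:
    """Call two aligned sequences homologous if identity reaches the threshold."""
    return percent_identity(a, b) >= threshold
```

Moving the threshold changes the call for borderline pairs, which is the sense in which the slide describes the decision as arbitrary.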

  4. Homologs …have common ancestry, but the way they are related can vary (i.e. the reasons they have diverged into different sequences can vary) • orthologs - Homologs produced by speciation. They tend to have similar function. • paralogs - Homologs produced by gene duplication. They tend to have differing functions.

  5. Orthologous or paralogous homologs [Figure: globin gene tree. An early globin gene undergoes gene duplication into an α-chain gene and a β-chain gene; each then diverges by speciation into mouse α, human α, cattle α and cattle β, human β, mouse β. Labels: Orthologs (α); Orthologs (β); Paralogs (cattle); Homologs (all)] Orthologs – diverged after speciation – tend to have similar function Paralogs – diverged after gene duplication – some functional divergence occurs Therefore, for linking similar genes between species, or performing “annotation transfer”, identify orthologs

  6. True or False? A1x is the ortholog in species x of A1y? A1x is a paralog of A2x? A1x is a paralog of A2y?

  7. Identifying Gene/Protein Relationships from Phylogenetic trees • orthologs - Homologs produced by speciation. Gene phylogeny matches organismal phylogeny. • paralogs - Homologs produced by gene duplication. Multiple copies of homologs in a given species, or evidence of gene duplication from phylogenetic analysis where the gene phylogeny does not match the organismal phylogeny
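One way to read these definitions off a tree is to look at the event at the last common ancestor of two genes: speciation gives orthologs, duplication gives paralogs. The sketch below assumes a gene tree whose internal nodes have already been labelled with their event; the `Node` class, the gene names, and the toy globin-style topology are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    event: str = "leaf"  # "speciation", "duplication" or "leaf"
    name: str = ""       # gene name, used for leaves only
    children: list = field(default_factory=list)


def leaves(node: Node) -> set:
    """Names of all genes below this node."""
    if not node.children:
        return {node.name}
    found = set()
    for child in node.children:
        found |= leaves(child)
    return found


def lca(node: Node, a: str, b: str) -> Node:
    """Deepest node whose subtree contains both genes."""
    for child in node.children:
        names = leaves(child)
        if a in names and b in names:
            return lca(child, a, b)
    return node


def relationship(root: Node, a: str, b: str) -> str:
    """Classify two distinct genes by the event at their last common ancestor."""
    names = leaves(root)
    if a == b or a not in names or b not in names:
        raise ValueError("need two distinct genes present in the tree")
    return "orthologs" if lca(root, a, b).event == "speciation" else "paralogs"


def leaf(name: str) -> Node:
    return Node(event="leaf", name=name)


# Toy tree (illustrative topology): one duplication, then speciations.
alpha = Node("speciation", children=[
    leaf("cattle_alpha"),
    Node("speciation", children=[leaf("mouse_alpha"), leaf("human_alpha")]),
])
beta = Node("speciation", children=[
    leaf("cattle_beta"),
    Node("speciation", children=[leaf("mouse_beta"), leaf("human_beta")]),
])
globin = Node("duplication", children=[alpha, beta])
```

With this tree, `relationship(globin, "human_alpha", "mouse_alpha")` gives "orthologs", while `relationship(globin, "cattle_alpha", "cattle_beta")` gives "paralogs". In practice the event labels come from reconciling the gene tree with the species tree, which this sketch does not attempt.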

  8. Gene Orthology: How to detect? • Most : Identify reciprocal best BLAST hits (EGO, COGs,…) Example Problem: • If making comparisons between human and bovine, for example, the bovine gene dataset is still quite incomplete • Therefore, current best hit may be a paralog now and the true ortholog not yet sequenced [Figure: tree with leaves mouse, human, cattle, cattle]
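A minimal sketch of the reciprocal-best-hit idea follows. It assumes the BLAST searches have already been run and reduced to (query, subject) → bit score tables; the gene names and scores are invented for illustration, and the example mimics the incomplete-dataset problem by leaving the cattle α gene out.

```python
def best_hits(scores: dict) -> dict:
    """Map each query to its highest-scoring subject.

    scores maps (query, subject) -> bit score.
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b: dict, b_vs_a: dict) -> list:
    """Pairs (a, b) where b is a's best hit and a is b's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Invented scores. The cattle alpha gene is "not yet sequenced", so it is
# absent from the cattle dataset.
human_vs_cattle = {
    ("human_HBA", "cattle_HBB"): 120.0,  # best available hit is a paralog
    ("human_HBB", "cattle_HBB"): 280.0,
}
cattle_vs_human = {
    ("cattle_HBB", "human_HBB"): 280.0,
    ("cattle_HBB", "human_HBA"): 120.0,
}

pairs = reciprocal_best_hits(human_vs_cattle, cattle_vs_human)
```

Here the one-way best hit for `human_HBA` is the paralog `cattle_HBB`, but the reciprocal check rejects that pair because `cattle_HBB` prefers `human_HBB`. The check does not help when the true ortholog is missing from both directions, so incomplete datasets remain a caveat.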

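  9. 2 Forms in 1 Species [Figure: tree with "+" marking lineages that carry the gene and "++" marking a lineage with two forms] Slides from Jonathan Eisen

  10. 2 Forms in 1 Species - Gene Loss [Figure: same tree; gene duplicated in common ancestor ("++"), followed by two losses marked "Loss"]

  11. Unusual Distribution Pattern [Figure: tree with two lineages marked "+"]

  12. Unusual Distribution - Gene Loss [Figure: gene present in ancestor; one lineage marked "Gene lost here"; "+" marks lineages that retain the gene]

  13. Unusual Distribution - Evolutionary Rate Variation [Figure: "+" marks lineages where the gene is found; "-?" marks a lineage where the gene is too diverged to be found]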

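  14. Ortholog guess via synteny [Figure: gene order A B C in one genome compared with A ? C in another; the unknown gene flanked by A and C is the candidate ortholog of B]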
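The synteny guess can be sketched as a lookup between flanking genes. The function below is a hypothetical helper written for this example: it assumes gene orders are given as simple lists along one chromosome and that the orthologs of the two neighbours are already known.

```python
from typing import Optional


def synteny_guess(ref_order: list, target_order: list,
                  known: dict, gene: str) -> Optional[str]:
    """Guess the ortholog of `gene` as the single target gene lying between
    the known orthologs of its two neighbours. Returns None when unsure."""
    i = ref_order.index(gene)
    if i == 0 or i == len(ref_order) - 1:
        return None  # no flanking gene on one side
    left = known.get(ref_order[i - 1])
    right = known.get(ref_order[i + 1])
    if left not in target_order or right not in target_order:
        return None  # a flanking ortholog is unknown or absent
    j, k = target_order.index(left), target_order.index(right)
    lo, hi = min(j, k), max(j, k)  # tolerate an inverted block
    between = target_order[lo + 1:hi]
    return between[0] if len(between) == 1 else None


# Reference genome has A B C; the target has orthologs of A and C
# flanking an unannotated gene.
guess = synteny_guess(["A", "B", "C"], ["A2", "X2", "C2"],
                      {"A": "A2", "C": "C2"}, "B")
```

The result is only a guess: it should be confirmed by sequence similarity, since insertions, losses, and rearrangements can all place an unrelated gene between conserved neighbours.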

  15. Syntenic blocks

  16. Ensembl calculations: http://www.ensembl.org (demo)

  17. OMA Browser: http://omabrowser.org (demo)

  18. Alignments and Assemblies • Alignment • ALL sequences from SAME region • Therefore can be useless for • non-overlapping contigs • PCR probes/oligos • Good for • paralogs/orthologs • Basis for phylogeny • Assembly: • Good for near-identical sequences • Types: • De novo • Guided [reference sequence]
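To make the assembly side concrete, here is a toy sketch of the overlap step that de novo assembly relies on: two near-identical reads are joined where the end of one exactly matches the start of the other. The function names and the minimum overlap of 3 are assumptions for the example; real assemblers tolerate sequencing errors and handle many reads at once.

```python
from typing import Optional


def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that equals a prefix of b.

    Returns 0 if no exact overlap of at least min_len exists.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def merge(a: str, b: str, min_len: int = 3) -> Optional[str]:
    """Join two reads into one contig, or return None if they do not overlap."""
    n = overlap(a, b, min_len)
    return a + b[n:] if n else None


contig = merge("ACGTTGCA", "TGCAGGT")  # reads share the 4-base overlap "TGCA"
unrelated = merge("AAAA", "CCCC")      # non-overlapping: nothing to join
```

The second call returns `None`, which mirrors the point on the slide: sequences that do not cover the same region cannot be usefully aligned to each other.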

  19. Alignment • Implicit statement • Each residue in an aligned column is assumed to derive from the same residue in the last common ancestor [LCA] • Therefore ok to only look at conserved regions or mask non-conserved regions • Especially for phylogeny
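Masking non-conserved regions can be sketched as a per-column filter. The conservation measure used here (fraction of sequences sharing the most common residue, with gaps counting against the column) and the 0.75 cut-off are assumptions chosen for illustration; alignment editors offer more refined scores.

```python
from collections import Counter


def column_conservation(column) -> float:
    """Fraction of sequences that share the column's most common residue.

    Gaps ("-") are never counted as the shared residue, so they lower the score.
    """
    residues = [c for c in column if c != "-"]
    if not residues:
        return 0.0
    most_common_count = Counter(residues).most_common(1)[0][1]
    return most_common_count / len(column)


def mask_alignment(seqs: list, min_conservation: float = 0.75) -> list:
    """Keep only the alignment columns that meet the conservation cut-off."""
    if len({len(s) for s in seqs}) > 1:
        raise ValueError("aligned sequences must have equal length")
    keep = [i for i, col in enumerate(zip(*seqs))
            if column_conservation(col) >= min_conservation]
    return ["".join(s[i] for i in keep) for s in seqs]


# Toy alignment: the third column (T/S/-/T) is poorly conserved.
alignment = ["MKTAY", "MKSAY", "MK-AY", "MRTAF"]
masked = mask_alignment(alignment)
```

With the default cut-off the third column is dropped and the other four are kept, so every sequence in `masked` is four residues long.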

  20. Alignment Tools • Faster but less accurate (some better with gaps) • MUSCLE • ClustalW/X • MAFFT • Slower but more accurate • *-Coffee • T: original • 3D: uses PDB as guide (structural) • M: uses multiple methods • ProbCons

  21. Alignment Edit Tools • NEVER use a word processor or Excel to edit alignments • Jalview (Java Alignment Viewer) • Good for editing • DAS capable

  22. [Figure: diagram of Jalview capabilities] • Analysis: Multiple Sequence Alignment, Consensus, Conservation & Clustering, Secondary Structure Prediction • Structures: PDB • Sequences and Alignments in ‘Standard’ Formats: FASTA, MSF, CLUSTAL, PILEUP, BLC, PFAM • Features: Distributed Annotation System, GFF, Jalview Features • Annotation: Jalview Annotation • Figure Generation: Clickable HTML, Images, Line Art • Trees: Newick • Visualization

  23. Jalview DAS Client Functionality • Select specific sources • Filtered list • Add user defined sources • Group features by source • Type==colour • Highlight start-end • DAS annotation servers • Query matches ID to Authority • Map to local reference frame • Mouse over for feature name, links and scores

  24. Assemblers • Many free options • STADEN - staden.sf.net • Original assembler, all platforms • No longer in development • Useless for next-generation sequencing • MAQ and MAQView • Installed on the computers in COIL
