
Manually Adjusting Multiple Alignments

Chris Wilton


Presentation Transcript


  1. Manually Adjusting Multiple Alignments Chris Wilton

  2. Multiple Alignments • Reviewing multiple alignments • what is a multiple alignment? • Analyzing a multiple alignment • what makes a ‘good’ multiple alignment? • what can it tell us, why is it useful? • Adjusting a multiple alignment • Alignment editors and HowTo • Demonstration and practice

  3. What is a Multiple Alignment? • A comparison of sequences • “multiple sequence alignment” • A comparison of equivalents: • Structurally equivalent positions • Functionally equivalent residues • Secondary structure elements • Hydrophobic regions, polar residues

  4. A Good Multiple Alignment? • Difficult to define… • Good ones look pretty! • Aligned secondary structures • Strongly conserved residues / regions • Comparison with known structure helps • Bad ones look chaotic and random.

  5. A Good Multiple Alignment? [Figure: example alignment with conservation, quality and consensus annotation rows]

  6. Multiple Alignment Features • Barton (1993) • “The position of insertions and deletions suggests regions where surface loops exist…

  7. Multiple Alignment Features

  8. Multiple Alignment Features • Barton (1993) • “The position of insertions and deletions suggests regions where surface loops exist… • Conserved glycine or proline suggests a β-turn…

  9. Multiple Alignment Features

  10. Multiple Alignment Features • Barton (1993) • “The position of insertions and deletions suggests regions where surface loops exist… • Conserved glycine or proline suggests a β-turn… • Residues with hydrophobic properties conserved at i, i+2, i+4 (etc) separated by unconserved or hydrophilic residues suggests a surface β-strand…

  11. Multiple Alignment Features

  12. Multiple Alignment Features • Barton (1993) • “The position of insertions and deletions suggests regions where surface loops exist… • Conserved glycine or proline suggests a β-turn… • Residues with hydrophobic properties conserved at i, i+2, i+4 (etc) separated by unconserved or hydrophilic residues suggests a surface β-strand… • A short run of hydrophobic amino acids (4 or 5 residues) suggests a buried β-strand…

  13. Multiple Alignment Features

  14. Multiple Alignment Features • Barton (1993) • Pairs of conserved hydrophobic amino acids separated by pairs of unconserved or hydrophilic residues suggests an α-helix with one face packed in the protein core. Similarly, an i, i+3, i+4, i+7 pattern of conserved residues.”

  15. Multiple Alignment Features

  16. Multiple Alignment Features • Barton (1993) • Pairs of conserved hydrophobic amino acids separated by pairs of unconserved or hydrophilic residues suggests an α-helix with one face packed in the protein core. Similarly, an i, i+3, i+4, i+7 pattern of conserved residues.” • Cysteine is a rare amino acid and often forms disulphide bonds (look for pairs of conserved cysteines) • Charged residues (histidine, aspartate, glutamate, lysine, arginine) and other polar residues embedded in a conserved region indicate functional importance
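The i, i+2, i+4 rule quoted above can be sketched as a simple scan over a sequence (for example, a consensus). This is only an illustration: the hydrophobic residue set and the function name `alternating_hydrophobic_starts` are assumptions, not something defined in the slides, and a real analysis would look at conservation across the whole column rather than a single sequence.

```python
# Assumed hydrophobic set for illustration; other classifications exist.
HYDROPHOBIC = set("AVLIMFWY")

def alternating_hydrophobic_starts(seq, repeats=3):
    """Return start indices where hydrophobic residues sit at i, i+2, i+4, ...
    and the residues in between are not hydrophobic.

    repeats is the number of hydrophobic positions required (3 means i, i+2, i+4).
    """
    span = 2 * repeats - 1  # window length covering i .. i+2*(repeats-1)
    hits = []
    for i in range(len(seq) - span + 1):
        window = seq[i:i + span]
        # Even offsets must be hydrophobic, odd offsets must not be.
        if all((res in HYDROPHOBIC) == (k % 2 == 0) for k, res in enumerate(window)):
            hits.append(i)
    return hits

print(alternating_hydrophobic_starts("KVTLEIKK"))
```

A hit is only a hint of a possible surface β-strand; it should be checked against the other features and, where available, a known structure.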

  17. Multiple Alignment Features

  18. Quality Assessment • Bad residues • Large distance from column consensus • Bad columns • Average distance from consensus is high – “entropy” • Bad regions • Profile scores • Bad quality doesn’t always mean badly aligned! [Figure: example alignment columns]
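One way to put a number on a "bad column" is Shannon entropy over the residues in that column. The sketch below is a stand-in for the "entropy" idea on this slide, not the quality score any particular editor computes; the function names and the toy rows are hypothetical, and gaps are simply counted as another symbol.

```python
import math
from collections import Counter

def column_entropy(column):
    """Shannon entropy in bits of the symbols in one alignment column."""
    total = len(column)
    counts = Counter(column)
    # 0.0 for a fully conserved column; grows as the column gets more mixed.
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())

def alignment_entropies(rows):
    """Entropy of every column of an alignment given as equal-length strings."""
    if len({len(row) for row in rows}) != 1:
        raise ValueError("all aligned rows must have the same length")
    return [column_entropy(col) for col in zip(*rows)]

# Toy alignment (hypothetical data): column 0 is conserved, column 1 is split.
rows = ["AL", "AL", "AV", "AV"]
print(alignment_entropies(rows))
```

As the slide warns, a high-entropy column is not necessarily misaligned; surface loops are often genuinely variable.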

  19. Quality Assessment • Profiles • A profile holds scores for each residue type (plus gaps) over every column of a multiple alignment • Concepts: • Consensus sequence • Amino acid similarity • Some multiple alignment programs use profiles to build or add to an alignment • Any alignment, or even one sequence, can be a profile (one sequence isn’t a very good one…)
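A minimal sketch of the profile idea described above, assuming the simplest possible scoring: each column stores the observed frequency of every symbol, gaps included. Real profiles usually add amino acid similarity and weighting, which this does not attempt. The names `build_profile` and `consensus` and the toy rows are illustrative only.

```python
from collections import Counter

def build_profile(rows):
    """Per-column symbol frequencies (gaps included) for equal-length rows."""
    if len({len(row) for row in rows}) != 1:
        raise ValueError("all aligned rows must have the same length")
    return [
        {symbol: n / len(col) for symbol, n in Counter(col).items()}
        for col in zip(*rows)
    ]

def consensus(profile):
    """Most frequent symbol per column; ties go to the symbol seen first."""
    return "".join(max(col, key=col.get) for col in profile)

# Toy alignment (hypothetical data).
rows = ["ACD-", "ACE-", "GCDK"]
profile = build_profile(rows)
print(consensus(profile))
```

With a single input sequence every column has one symbol at frequency 1.0, which is why one sequence makes a poor profile.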

  20. What can we do with a MA? • Identify subgroups (phylogeny) • Intra-group sequence conservation • Evolutionary relatedness (view tree) • Identify motifs (functionality) • Evolutionary signals • Highly conserved residues indicate functional or structural significance! • Widen search for related proteins • MA better than single sequence • Consensus sequence / profile useful • [Example aligned sequences: RPDDWHLHLR, GGIDTHVHFI, GFTLTHEHIC, PFVEPHIHLD, PKVELHVHLD]
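To illustrate spotting highly conserved residues, the sketch below scans the five aligned sequences shown on this slide for columns where one residue dominates. The function name `conserved_columns` and the `min_fraction` parameter are assumptions made for the example; column indices are 0-based.

```python
from collections import Counter

def conserved_columns(rows, min_fraction=1.0):
    """Return (index, residue) for columns where one non-gap residue
    makes up at least min_fraction of the column."""
    hits = []
    for index, col in enumerate(zip(*rows)):
        residue, count = Counter(col).most_common(1)[0]
        if residue != "-" and count / len(col) >= min_fraction:
            hits.append((index, residue))
    return hits

# The five sequences from the slide.
rows = ["RPDDWHLHLR", "GGIDTHVHFI", "GFTLTHEHIC", "PFVEPHIHLD", "PKVELHVHLD"]
print(conserved_columns(rows))
```

In this block the only fully conserved columns are the two histidines, which is the kind of signal that suggests functional or structural importance.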

  21. What do we want to do? • Build a homology model? • Accuracy • Perform phylogenetic analysis? • Completeness • Functional analysis of a protein family? • Diversity

  22. Building the initial alignment • Fetch related sequences and run alignment • Clustal, Dialign, TCoffee, Muscle … • Fetch a multiple alignment from a database and add sequences of interest • Pfam, ProDom, ADDA … • Start from a motif-finding procedure • MEME, Pratt, Gibbs Sampler …

  23. Adjusting the alignment • Filter alignment: • Remove any redundancy • Remove unrelated sequences • Remove unwanted domains • Recalculate alignment if necessary • Look for conserved motifs, adjust any misalignments. Try different colour schemes and thresholds. • One step at a time…
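The "remove any redundancy" step can be sketched as a greedy filter on pairwise percent identity. This is an assumption about one reasonable approach, not a description of how any specific alignment editor does it; the 90% threshold, the function names, and the toy sequences are all hypothetical.

```python
def percent_identity(a, b):
    """Percent identity over columns where neither aligned sequence has a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)

def remove_redundant(alignment, max_identity=90.0):
    """Keep a sequence only if it is at most max_identity percent identical
    to every sequence already kept (order of the input decides who stays)."""
    kept = {}
    for name, seq in alignment.items():
        if all(percent_identity(seq, other) <= max_identity for other in kept.values()):
            kept[name] = seq
    return kept

# Toy alignment (hypothetical data): "b" duplicates "a" and is dropped.
alignment = {
    "a": "ACDEFGHIKL",
    "b": "ACDEFGHIKL",
    "c": "ACDEFGHIKV",
    "d": "WWWWWGHIKL",
}
print(list(remove_redundant(alignment)))
```

Because the filter is greedy, put the sequences you most want to keep (for example, those with known structures) first.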

  24. Jalview Alignment Editor Clamp, M., Cuff, J., Searle, S. M. and Barton, G. J. (2004), "The Jalview Java Alignment Editor", Bioinformatics, 20, 426-7.

  25. Colouring your alignment • Hydrophobic / polar: hydrophobic to polar • Buried index: buried to surface • β-strand likelihood: probable to unlikely • Helix likelihood: probable to unlikely

  26. Colouring your alignment • By conservation thresholds:

  27. Colouring your alignment • Conservation index • Amino acid property classification scheme, e.g. Livingstone & Barton (1993)

  28. Sequence Features

  29. Check PDB Structures • Load MA with sequence(s) for known PDB structure • View >> Feature Settings >> Fetch DAS Features (wait...) OR • Right-click >> Associate Structure with Sequence >> Discover PDB ids (quicker) • Right-click sequence name >> View PDB Entry • Structure opens in new window – residues acquire MA colours • Highlight residues by hovering mouse over alignment or structure • Label residues by clicking on structure

  30. Compare Alignment to Structure

  31. Compare Alignment to Structure • Crucial way of checking alignment! • Where are gaps / insertions / deletions? • In secondary structures: bad • In surface loops: okay • Where are our key / functional residues? • Are they in probable active site? • Check they are clustered • Check they are accessible, not buried
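The gap check above can be sketched in code, assuming you already have a per-residue secondary structure string for the reference sequence (here "H" for helix, "E" for strand, "-" for anything else). The encoding, the function name, and the toy data are assumptions for illustration; in practice the annotation would come from the known structure.

```python
def gaps_in_secondary_structure(aligned_ref, ref_ss, aligned_query):
    """Report columns where the query has an insertion or deletion inside
    a helix (H) or strand (E) of the reference.

    aligned_ref and aligned_query are rows of the same alignment;
    ref_ss has one character per residue of the ungapped reference.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned rows must have the same length")
    if len(ref_ss) != len(aligned_ref.replace("-", "")):
        raise ValueError("ref_ss must match the ungapped reference length")
    hits = []
    pos = 0  # index of the next reference residue in ref_ss
    for col, (r, q) in enumerate(zip(aligned_ref, aligned_query)):
        if r == "-":
            # Query insertion: flag it if the same element continues on both sides.
            if q != "-" and 0 < pos < len(ref_ss):
                if ref_ss[pos - 1] == ref_ss[pos] and ref_ss[pos] in "HE":
                    hits.append((col, "insertion", ref_ss[pos]))
            continue
        if q == "-" and ref_ss[pos] in "HE":
            hits.append((col, "deletion", ref_ss[pos]))
        pos += 1
    return hits

# Toy data (hypothetical): residues 2-6 of the reference form a helix.
print(gaps_in_secondary_structure("MKV-LAEEG", "-HHHHH--", "MK-ALAE-G"))
```

Flagged columns are candidates for manual adjustment, typically by sliding the gap out to the nearest loop.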

  32. Demonstration and Practice • Start Jalview (click here) • Tools >> Preferences >> Visual: select Maximise Window, unselect Quality, set Font Size to 8 or 9, set Colour >> Clustal, uncheck Open File. Editing: check Pad Gaps When Editing • File >> Input Alignment >> from URL (use this one) • Get used to the controls – selecting and deselecting sequences/groups (drag mouse), dragging sequences/groups (use shift/ctrl), selecting sequence regions, hiding sequences/groups, removing columns and regions… Then explore menus and tools. • Now load this alignment – I’ve messed up a good alignment, and now I’d like you to correct it! There are two groups of sequences and one single sequence to adjust.

  33. Demonstration and Practice • View >> Feature Settings >> DAS Settings • select Uniprot, dssp, cath, Pfam, PDBsum_ligands, PDBsum_DNAbinding, then click ‘Save as default’ • click Fetch DAS Features (then click yes at prompt) ... • Move mouse over alignment and read information about features • Move mouse over sequence names to check for PDB ids • Open a PDB structure (choose any) • View >> uncheck Show All Chains, then use up-arrow key to increase structure size. • Hover mouse over structure (see how residues are highlighted in the sequence), then do same for sequence. Select residues in the structure by clicking them – a label will appear. Click again to remove label. • Check position of insertions & deletions using this method.
