This lesson provides a comprehensive analysis of various prediction methods used to determine functions from protein sequences. It emphasizes evaluating the accuracy of predictions by comparing results to experimentally verified sites. Key topics include method calibration (like E-value cutoff adjustments), trade-offs between sensitivity and specificity, and the role of evolutionary conservation in function prediction. The lesson also highlights how adaptive evolution and purifying selection contribute to the retention of functional traits in proteins, leveraging tools like ConSurf and ConSeq for conservation analysis.
Evaluation of prediction methods • Comparing our results to experimentally verified sites [Table: what our prediction gives vs. whether the prediction is correct]
Method evaluation • A good method has a high level of true positives and true negatives, and a low level of false positives and false negatives [Table: what our prediction gives vs. whether the prediction is correct]
Calibrating the method • All methods have a parameter (cutoff) that can be calibrated to improve the accuracy of the method. • For example: the E-value cutoff in BLAST
Calibrating E-value cutoff [Table: what our prediction gives (is this a homolog?) vs. whether the prediction is correct]
Calibrating E-value cutoff • Reminder: the lower the E-value, the more ‘significant’ the alignment between the query and the hit.
Calibrating the E-value • What will happen if we raise the E-value cutoff (for instance, work with all hits with an E-value < 10)? [Table: what our prediction gives vs. whether the prediction is correct]
Calibrating the E-value • On the other hand, what if we lower the E-value cutoff (look only at hits with an E-value < 10^-8)? [Table: what our prediction gives vs. whether the prediction is correct]
Improving prediction • Trade-off between specificity and sensitivity
Sensitivity vs. specificity • Sensitivity = True positive / (True positive + False negative): how well we hit real homologs; the denominator represents all the proteins which are really homologous • Specificity = True negative / (True negative + False positive): how well we avoid real non-homologs; the denominator represents all the proteins which are really NOT homologous
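The two ratios can be sketched in a few lines of Python. This is a minimal illustration, not part of the original lesson: the function names and the toy `predicted`/`actual` lists are hypothetical, with `True` meaning "homolog".

```python
def confusion_counts(predicted, actual):
    """Count TP, FP, TN, FN from two parallel lists of booleans."""
    pairs = list(zip(predicted, actual))
    tp = sum(p and a for p, a in pairs)          # called homolog, really homolog
    fp = sum(p and not a for p, a in pairs)      # called homolog, really not
    tn = sum(not p and not a for p, a in pairs)  # called non-homolog, really not
    fn = sum(not p and a for p, a in pairs)      # called non-homolog, really homolog
    return tp, fp, tn, fn


def sensitivity(tp, fn):
    """Fraction of real homologs that the method finds."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of real non-homologs that the method rejects."""
    return tn / (tn + fp)


# Hypothetical toy data: five proteins, predicted vs. experimentally verified.
predicted = [True, True, False, False, True]
actual = [True, False, False, True, True]

tp, fp, tn, fn = confusion_counts(predicted, actual)
print("TP, FP, TN, FN =", tp, fp, tn, fn)
print("sensitivity =", sensitivity(tp, fn))
print("specificity =", specificity(tn, fp))
```

Both functions divide by a class total, so they assume the data contain at least one real homolog and at least one real non-homolog.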
Raising the E-value cutoff to 10: sensitivity ↑, specificity ↓ • Lowering the E-value cutoff to 10^-8: sensitivity ↓, specificity ↑
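The trade-off can be made concrete with a small sweep over the two cutoffs. The hit list below is invented for illustration (it is not real BLAST output), and `rates_at_cutoff` is a hypothetical helper name.

```python
def rates_at_cutoff(hits, cutoff):
    """Return (sensitivity, specificity) when every hit with
    E-value < cutoff is called a homolog."""
    tp = fp = tn = fn = 0
    for evalue, is_homolog in hits:
        called_homolog = evalue < cutoff
        if called_homolog and is_homolog:
            tp += 1
        elif called_homolog and not is_homolog:
            fp += 1
        elif not called_homolog and is_homolog:
            fn += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical hits: (E-value, is it really a homolog?)
HITS = [
    (1e-50, True), (1e-20, True), (1e-9, True), (1e-5, True),
    (0.01, False), (0.5, True), (3.0, False), (8.0, False), (50.0, False),
]

for cutoff in (10, 1e-8):
    sens, spec = rates_at_cutoff(HITS, cutoff)
    print(f"cutoff {cutoff:g}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

With this toy data the permissive cutoff recovers every real homolog but also accepts most non-homologs, while the strict cutoff rejects every non-homolog but misses the distant homologs.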
Functional prediction in proteins (purifying and positive selection)
Darwin – the theory of natural selection • Adaptive evolution: Favorable traits will become more frequent in the population
Adaptive evolution • When natural selection favors a single allele and therefore the allele frequency continuously shifts in one direction
Kimura – the theory of neutral evolution • Neutral evolution: Most molecular changes do not change the phenotype • Selection operates to preserve a trait (no change)
Purifying Selection • Stabilizes a trait in a population: small babies → more illness; large babies → more difficult birth… • Baby weight is stabilized around 3-4 kg
Purifying selection (conservation) at the molecular level • Histone 3
Synonymous vs. non-synonymous substitutions Purifying selection: excess of synonymous substitutions
Synonymous vs. non-synonymous substitutions • Non-synonymous substitution: GUU → GCU (Val → Ala) • Synonymous substitution: GUU → GUC (Val → Val) • Purifying selection: excess of synonymous substitutions
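The distinction can be checked mechanically by translating both codons with the standard genetic code. The sketch below builds the codon table from the conventional UCAG ordering; the function name `classify_substitution` is hypothetical.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons enumerated in UCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(before, after):
    """Label a codon change by whether the encoded amino acid changes."""
    if CODON_TABLE[before] == CODON_TABLE[after]:
        return "synonymous"
    return "non-synonymous"


print("GUU -> GCU:", classify_substitution("GUU", "GCU"))
print("GUU -> GUC:", classify_substitution("GUU", "GUC"))
```

The sketch expects RNA codons in upper case (U rather than T) and does not treat stop codons specially.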
Conservation as a means of predicting function • Infer the rate of evolution at each site • Low rate of evolution → constraints on the site to prevent disruption of function: active sites, protein-protein interactions, etc.
Prediction of conserved residues by estimating evolutionary rates at each site ConSurf/ConSeq web servers:
Working process (in order) • Input a protein with a known 3D structure (PDB id or file provided by the user) • Find homologous protein sequences (PSI-BLAST) • Perform multiple sequence alignment (removing duplicates) • Construct an evolutionary tree • Calculate the conservation score for each site • Project the results on the 3D structure
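To give a feel for the "conservation score for each site" step, here is a deliberately simplified sketch. It does not reproduce the ConSurf pipeline, which estimates evolutionary rates using the tree; as a stand-in it scores each alignment column by Shannon entropy, where 0 means fully conserved and larger values mean more variable. The toy alignment and the function names are hypothetical.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (bits) of the residues in one alignment column.
    Gap characters ('-') are ignored; an all-gap column scores 0.0."""
    residues = [r for r in column if r != "-"]
    if not residues:
        return 0.0
    counts = Counter(residues)
    total = len(residues)
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def conservation_profile(alignment):
    """Entropy per column for a list of equal-length aligned sequences."""
    return [column_entropy(col) for col in zip(*alignment)]


# Hypothetical toy alignment of four homologous fragments.
alignment = ["MKVLA", "MKILA", "MRVLG", "MKALA"]

for position, score in enumerate(conservation_profile(alignment), start=1):
    print(f"site {position}: entropy = {score:.3f}")
```

Entropy treats all sequences as independent and all substitutions as equal, which is exactly what a tree-based rate estimate avoids, so treat this only as an intuition aid.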
The KcsA potassium channel • An outstanding mystery: how does the KcsA potassium channel conduct only K+ ions and not Na+?
The KcsA potassium channel structure • The structure of the KcsA channel was resolved in 1998 • KcsA is a homotetramer with a four-fold symmetry axis about its pore.
The KcsA potassium selectivity filter • The selectivity filter identifies water molecules bound to K+ • When water is bound to Na+: no passage
Conservation analysis of KcsA • Use ConSurf to study KcsA conservation
ConSeq • ConSeq performs the same analysis as ConSurf but displays the results on the sequence. • Predicts whether each residue is buried or exposed • Exposed & conserved → functionally important site • Buried & conserved → structurally important site
ConSeq analysis • Exposed & conserved → functionally important site • Buried & conserved → structurally important site
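The decision rule on this slide is small enough to write out directly. The sketch below assumes the conserved/variable and buried/exposed calls have already been made elsewhere; `classify_site` and its labels are hypothetical names, not ConSeq output.

```python
def classify_site(conserved: bool, buried: bool) -> str:
    """Apply the slide's rule: conserved sites are important, and burial
    decides whether the importance is structural or functional."""
    if not conserved:
        return "variable (no prediction)"
    if buried:
        return "structurally important"
    return "functionally important"


# Hypothetical per-site calls: (site number, conserved?, buried?)
sites = [(12, True, False), (47, True, True), (63, False, False)]
for number, conserved, buried in sites:
    print(f"site {number}: {classify_site(conserved, buried)}")
```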
Darwin – the theory of natural selection • Adaptive evolution: Favorable traits will become more frequent in the population
Adaptive evolution on the molecular level Look for changes which confer an advantage
Naïve detection • Observe the multiple sequence alignment: variable regions = adaptive evolution??
Naïve detection • The problem: how do we know which sites are simply sites with no selection pressure (“non-important” sites) and which are under adaptive evolution?
Solution – look at the DNA • Compare synonymous and non-synonymous substitutions
Solution – look at the DNA • Adaptive evolution = positive selection: Non-syn > Syn • Purifying selection: Syn > Non-syn • Neutral selection: Syn = Non-syn
Also known as… Ka/Ks (or dn/ds, or ω) • Ka = non-synonymous substitution rate • Ks = synonymous substitution rate • Purifying selection: Ka < Ks (Ka/Ks < 1) • Neutral selection: Ka = Ks (Ka/Ks = 1) • Positive selection: Ka > Ks (Ka/Ks > 1)
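The three regimes can be expressed as a small classifier on the ratio. This is a sketch under stated assumptions: the `tolerance` band around 1 is a hypothetical convenience so that nearly equal rates count as neutral, and a real analysis would instead test whether the ratio differs significantly from 1. The example rates are invented.

```python
def classify_selection(ka, ks, tolerance=0.05):
    """Classify the selection regime from Ka (non-synonymous rate)
    and Ks (synonymous rate) using the ratio Ka/Ks."""
    if ks <= 0:
        raise ValueError("Ks must be positive to form the Ka/Ks ratio")
    omega = ka / ks
    if omega < 1 - tolerance:
        return "purifying selection"
    if omega > 1 + tolerance:
        return "positive selection"
    return "neutral"


# Hypothetical rate pairs (Ka, Ks).
for ka, ks in [(0.02, 0.20), (0.10, 0.10), (0.30, 0.10)]:
    print(f"Ka={ka}, Ks={ks}, Ka/Ks={ka / ks:.2f}: {classify_selection(ka, ks)}")
```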
Examples of positive selection • Proteins involved in the immune system • Proteins involved in host-pathogen interaction (‘arms race’) • Proteins following gene duplication • Proteins involved in reproduction systems
Selecton – a server for the detection of purifying and positive selection http://selecton.bioinfo.tau.ac.il
HIV: molecular evolution paradigm • Rapidly evolving virus: • High mutation rate (low fidelity of reverse transcriptase) • High replication rate
HIV Protease Protease is an essential enzyme for viral replication Drugs against Protease are always part of the “cocktail”
Ritonavir Inhibitor • Ritonavir (RTV) is a specific protease inhibitor (drug) • Chemical formula: C37H48N6O5S2
Drug resistance [Figure: no drug vs. drug] • Adaptive evolution (positive selection)
Used Selecton to analyse HIV-1 protease gene sequences from patients that were treated with RTV only