DSC: Data Analysis

Michael Blaber
Professor of Biomedical Sciences
College of Medicine
Florida State University

Outline
• Thermodynamic parameters
• Model properties and assumptions
• Error
• Some examples
• Consider protein as the biomolecule of interest

Thermodynamic parameters
A simple DSC “endotherm”:
• Buffer/buffer run subtracted
• Normalized for concentration

Native state is a “solid-like” phase under “low” temperature conditions
• CpN(T) is the native state heat capacity function

Denatured state is a “liquid-like” phase under “high” temperature conditions
• CpD(T) is the denatured state heat capacity function

The system transitions from the CpN(T) to the CpD(T) heat capacity function as the protein undergoes a temperature-dependent “phase” transition (from “solid-like” to “liquid-like”)
Figure label: CpN(T) to CpD(T) function transition (“DSC baseline”)

The “excess enthalpy” represents the heat energy associated with the “phase” transition (heat of “fusion”)
• The integrated area represents ΔHcal
• ΔHcal is also known as the “calorimetric enthalpy” of unfolding
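The integration step can be sketched in a few lines. This is a minimal illustration, assuming the endotherm has already been buffer-subtracted, normalized for concentration and baseline-subtracted. The function name `calorimetric_enthalpy` and the Gaussian test peak are hypothetical stand-ins and are not part of any instrument software.

```python
import math

def calorimetric_enthalpy(temps_k, cp_excess):
    """Trapezoidal area under the excess heat capacity curve.

    temps_k   : temperatures in K, ascending
    cp_excess : baseline-subtracted molar heat capacity, kJ/(mol K)
    returns   : integrated area (dHcal) in kJ/mol
    """
    area = 0.0
    for i in range(1, len(temps_k)):
        mean_height = 0.5 * (cp_excess[i] + cp_excess[i - 1])
        area += mean_height * (temps_k[i] - temps_k[i - 1])
    return area

# Synthetic endotherm (hypothetical data): a Gaussian peak of known area.
true_area = 400.0  # kJ/mol
centre = 330.0     # K
width = 3.0        # K, standard deviation of the peak
temps_k = [300.0 + 0.1 * i for i in range(601)]  # 300 K to 360 K
cp_excess = [
    true_area / (width * math.sqrt(2.0 * math.pi))
    * math.exp(-0.5 * ((t - centre) / width) ** 2)
    for t in temps_k
]

dh_cal = calorimetric_enthalpy(temps_k, cp_excess)
print(f"Integrated dHcal = {dh_cal:.2f} kJ/mol")
```

A real endotherm is not Gaussian; the peak shape here only provides a known area to check the integration against.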

The midpoint of the transition from the N to the D state is the “melting temperature”, or Tm
(the Tm is not necessarily the apparent maximum of the excess enthalpy)

CpD(T) − CpN(T) = ΔCp(T) (“delta” values are always state 2 − state 1)
• ΔCp(T) is a characteristically positive value for protein denaturation (i.e. CpD(T) > CpN(T) at Tm)

ΔCp(Tm) is a characteristically positive value for protein denaturation
• Protein denaturation exposes formerly buried hydrophobic groups to solvent
• Solvent forms an organized clathrate structure around these hydrophobic groups
• This organized solvent is a low-entropy state that can substantially increase in disorder on heating, resulting in a high heat capacity
Some proteins can unfold to an intermediate that lacks substantial secondary structure (i.e. secondary structure has been “melted”) but retains a conformation that shields many hydrophobic groups from solvent.
• The apparent ΔCp(Tm) in this case will have a very low value

Summary of basic thermodynamic parameters
• CpN(T): native state heat capacity function
• CpD(T): denatured state heat capacity function
• ΔCp(T) = CpD(T) − CpN(T)
• ΔCp(Tm): “delta Cp”
• Tm: “melting temperature”
  • The temperature at which the protein is 50% folded
  • Keq = 1.0, and thus ΔG(Tm) = 0
• ΔHcal: “calorimetric enthalpy”, “enthalpy of unfolding”, ΔH(Tm)
  • ΔH(T) can be derived
• ΔS(Tm) = ΔHcal/Tm: “entropy of unfolding”
  • ΔS(T) can be derived
• ΔG(T) can be derived
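One common way to derive ΔH(T), ΔS(T) and ΔG(T) from the parameters above is sketched below. It assumes ΔCp is constant with temperature, which is a simplification: the slides treat ΔCp(T) as a function. The function name and the numerical inputs are hypothetical.

```python
import math

def unfolding_thermodynamics(temp_k, tm_k, dh_cal, dcp):
    """Return (dH, dS, dG) of unfolding at temp_k.

    Assumes a temperature-independent dCp (simplifying assumption).
    Units: kJ/mol for dH and dG, kJ/(mol K) for dS and dCp, K for temperatures.
    """
    dh = dh_cal + dcp * (temp_k - tm_k)                  # dH(T) = dH(Tm) + dCp (T - Tm)
    ds = dh_cal / tm_k + dcp * math.log(temp_k / tm_k)   # dS(T) = dH(Tm)/Tm + dCp ln(T/Tm)
    dg = dh - temp_k * ds                                # dG(T) = dH(T) - T dS(T)
    return dh, ds, dg

# Hypothetical protein: Tm = 330 K, dHcal = 400 kJ/mol, dCp = 8 kJ/(mol K)
dh_tm, ds_tm, dg_tm = unfolding_thermodynamics(330.0, 330.0, 400.0, 8.0)
dh_25, ds_25, dg_25 = unfolding_thermodynamics(298.15, 330.0, 400.0, 8.0)
print(f"dG(Tm) = {dg_tm:.3f} kJ/mol, dG(298.15 K) = {dg_25:.1f} kJ/mol")
```

At T = Tm the expressions reduce to ΔH = ΔHcal, ΔS = ΔHcal/Tm and ΔG = 0, matching the summary.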

Model properties and assumptions
Typically, the common models used to fit DSC data rest on three important assumptions:
• The system is reversible
• The system is in equilibrium
• The system is two-state (N and D states)
If the assumptions of the model are not met (and verified), the derived parameters are potentially in error

Assumption 1: The system is reversible
• Aggregation is an irreversible pathway from the denatured (or partially denatured) state
  N ⇌ D → Aggregation
• Lack of visible aggregation does not mean the system is reversible
• Reversibility is confirmed by recovery of the enthalpy of unfolding upon cooling and subsequent reheating

Initial scan versus subsequent scan:
• Pronounced (asymmetric) exotherm following the initial endotherm
• CpD < CpN (i.e. negative ΔCp)
• “Dead” second scan
• Post-transition noise

Initial scan versus subsequent scan:
• No sharp exotherm
• Recovery of substantial enthalpy on the second scan
• CpD > CpN (i.e. positive ΔCp)
• No post-transition noise

Assumption 2: The system is in equilibrium
• The scan rate must not be faster than the kinetics of the phase transition (folding/unfolding rates)
• Most DSC instruments have a default scan rate of 60°/hr. This may be too fast for some proteins (β-sheet), and scan rates of <15°/hr may be necessary in such cases.
• Equilibrium is confirmed by an absence of hysteresis when comparing up-scans and down-scans

• Hysteresis with up/down-scans at 60°/hr
• No hysteresis with up/down-scans at 15°/hr
(traces inverted for ease of identification)

Assumption 3: The system is two-state (N and D states)
• Common models to analyze DSC data assume monomeric two-state denaturation.
• The following situations will violate the assumptions of the model and result in erroneous analyses:
  N ⇌ I ⇌ D
  N₂ ⇌ 2D
• Two-state unfolding is confirmed by
  • The absence of systematic residual error to a two-state fit
  • A characteristic positive value for ΔCp
  • Agreement between ΔHcal and the ΔH determined from a two-state model (i.e. ΔHvH, “the van’t Hoff enthalpy”)

N ⇌ I ⇌ D
• System denaturation is too broad for a two-state model
• Fit overshoots at maximum
• Fit undershoots at shoulders
• ΔCp(Tm) abnormally low
• Residual endothermic signal

N ⇌ I ⇌ D
Plot of residual error to a 2-state fit (i.e. raw data − fit, so that a fit that overshoots gives a negative residual):
• Obvious systematic error, centrosymmetric at the Tm
• with a NEG peak at the Tm

N ⇌ I ⇌ D
van’t Hoff enthalpy (ΔHvH):
• Calculation of ΔH (and ΔS) based upon Keq, assuming a two-state system: ΔH − TΔS = −RT ln Keq
• ΔHvH < ΔHcal for non-2-state systems with a folding intermediate
• ΔHvH/ΔHcal < 1.0 for non-2-state systems with a folding intermediate
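The ΔHvH/ΔHcal comparison can be illustrated with simulated endotherms. This sketch uses a standard peak-height estimate, ΔHvH = 4·R·Tm²·Cp,max/ΔHcal, which may differ from the model fit used to produce the slides. It assumes ΔCp = 0 for simplicity. The broadened trace is built from two independent two-state transitions as a stand-in for a non-2-state system; it is not a sequential N ⇌ I ⇌ D model. All names and numbers are hypothetical.

```python
import math

R = 8.314462618e-3  # gas constant, kJ/(mol K)

def two_state_excess_cp(temp_k, tm_k, dh):
    """Excess heat capacity, kJ/(mol K), of a two-state transition with dCp = 0."""
    dg = dh * (1.0 - temp_k / tm_k)        # dG(T) = dH - T dH/Tm
    k_eq = math.exp(-dg / (R * temp_k))    # Keq = [D]/[N]
    frac_d = k_eq / (1.0 + k_eq)           # fraction denatured
    return dh ** 2 / (R * temp_k ** 2) * frac_d * (1.0 - frac_d)

def enthalpy_ratio(temps_k, cp_excess):
    """Return dHvH/dHcal using the peak-height estimate of dHvH."""
    dh_cal = sum(
        0.5 * (cp_excess[i] + cp_excess[i - 1]) * (temps_k[i] - temps_k[i - 1])
        for i in range(1, len(temps_k))
    )
    i_max = max(range(len(cp_excess)), key=cp_excess.__getitem__)
    dh_vh = 4.0 * R * temps_k[i_max] ** 2 * cp_excess[i_max] / dh_cal
    return dh_vh / dh_cal

temps_k = [290.0 + 0.05 * i for i in range(1601)]  # 290 K to 370 K

# Single cooperative transition: Tm = 330 K, dH = 400 kJ/mol
single = [two_state_excess_cp(t, 330.0, 400.0) for t in temps_k]

# Broadened endotherm: two independent 200 kJ/mol transitions at 326 K and 334 K
broad = [
    two_state_excess_cp(t, 326.0, 200.0) + two_state_excess_cp(t, 334.0, 200.0)
    for t in temps_k
]

ratio_single = enthalpy_ratio(temps_k, single)
ratio_broad = enthalpy_ratio(temps_k, broad)
print(f"two-state: {ratio_single:.2f}, broadened: {ratio_broad:.2f}")
```

The single transition returns a ratio close to 1.0, while the broadened trace, with the same total area but a lower and wider peak, returns a ratio well below 1.0.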

Oligomers: N₂ ⇌ 2D
• Similar situation (i.e. systematic error in the residual, centrosymmetric at the Tm), but the model fit would undershoot at the Tm and overshoot on the shoulders (i.e. a characteristic POS peak for the residual at the Tm)
• ΔHvH > ΔHcal
• ΔHvH/ΔHcal > 1.0
The basic take-home message: if the systematic error (i.e. the residual plot) is significantly greater than the expected instrument error (i.e. point-to-point error), then the 2-state assumption is not supported

Error
Baselines… (CpN(T) and CpD(T) functions)

Low temperature regime is experimentally accessible
Figure labels: extrapolated CpN(T); error in CpN(T)

High temperature regime is experimentally accessible
Figure labels: extrapolated CpD(T); error in CpD(T)

Additional concerns regarding CpN(T) and CpD(T) baselines:
ΔG = ΔH − TΔS = −RT ln Keq
• The equilibrium describes a continuum of N and D partitioning with temperature
• There may be a temperature regime where the N state is substantially populated, but there is no temperature where the N state is 100% populated (similarly for the D state)
• You can assign a temperature of maximum ΔG, but you cannot define the temperature where the protein “begins to unfold”
• How to assign CpN(T) and CpD(T) with confidence?
Some fitting routines require the operator to assign the CpN(T) and CpD(T) functions
• Operator-dependent bias
• CpN(T) and CpD(T) functions typically are not refined

• Are the CpN(T) and CpD(T) functions accurately modeled by linear equations, constants or polynomials?
  • CpN(T) appears to be reasonably well-modeled by a linear function for many (but not necessarily all) proteins
  • CpD(T) appears to have negative curvature for many proteins
  • It may be possible to obtain CpD(T) over a broad temperature regime by DSC in the presence of denaturant
• What are the characteristics of the instrumentation (vis-à-vis initial and final data points)?
  • We observe the greatest run-to-run variation for data within the first few degrees (CpN(T)) and last few degrees (CpD(T))
• How much baseline data is necessary for accurate analysis?
  • As much as possible (30° from Tm)
  • Thermostable and unstable proteins are a problem

Error of ΔCp(T) (CpD(T) − CpN(T)):
• ΔCp(T) determines:
  • ΔH(T)
  • ΔS(T)
  • ΔG(T)
The greatest accuracy in the determination of all thermodynamic parameters is at the Tm
Confidence wanes the further we move from the Tm
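The loss of confidence away from the Tm can be made concrete under one simplifying assumption: a temperature-independent ΔCp, with ΔG(T) = ΔH(Tm)(1 − T/Tm) + ΔCp[(T − Tm) − T ln(T/Tm)]. The sensitivity of ΔG(T) to ΔCp is then the bracketed term, which is zero at the Tm and grows in magnitude on either side. The function name and the 1 kJ/(mol K) error are hypothetical.

```python
import math

def dg_error_from_dcp_error(temp_k, tm_k, dcp_error):
    """Error in dG(T), kJ/mol, caused by an error in a constant dCp, kJ/(mol K).

    Uses d(dG)/d(dCp) = (T - Tm) - T ln(T/Tm), valid only for the
    constant-dCp form of dG(T) (simplifying assumption).
    """
    return dcp_error * ((temp_k - tm_k) - temp_k * math.log(temp_k / tm_k))

tm_k = 330.0
dcp_error = 1.0  # hypothetical uncertainty in dCp, kJ/(mol K)
for temp_k in (330.0, 320.0, 310.0, 300.0):
    err = dg_error_from_dcp_error(temp_k, tm_k, dcp_error)
    print(f"T = {temp_k:.0f} K, Tm - T = {tm_k - temp_k:.0f} K, dG error = {err:+.3f} kJ/mol")
```

Errors in Tm and ΔHcal also propagate into ΔG(T); only the ΔCp contribution is shown here.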

Error and sample concentration…
• Data is normalized to a molar heat capacity; therefore, the concentration must be known
• What are the consequences of error in the sample concentration?

Fit and residual look like a non-2-state condition (i.e. the presence of a folding intermediate)

Fit and residual look 2-state

Fit and residual look like a non-2-state condition (i.e. the presence of a native state oligomer)

Summary of concentration errors:
• If the concentration is greater than you think:
  • The fit will exhibit apparent non-2-state behavior (N ⇌ I ⇌ D)
  • ΔHvH/ΔHcal < 1.0
• If the concentration is less than you think:
  • The fit will exhibit apparent non-2-state behavior (N₂ ⇌ 2D)
  • ΔHvH/ΔHcal > 1.0
The basic take-home message: concentrations must be accurately known for DSC analysis!
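The direction of these effects follows from a scaling argument, sketched here. Normalizing with the wrong concentration multiplies every point of the molar heat capacity trace by (true concentration)/(assumed concentration). ΔHcal is an area, so it scales by the same factor. The sketch assumes that ΔHvH depends only on the shape of the transition and is therefore unchanged. The function name and the 10% errors are hypothetical.

```python
def apparent_enthalpy_ratio(true_conc, assumed_conc, true_ratio=1.0):
    """Apparent dHvH/dHcal when data are normalized with assumed_conc.

    Assumes dHvH is set by the peak shape alone and so is unaffected by
    the normalization, while dHcal scales with true_conc / assumed_conc.
    """
    dh_cal_scale = true_conc / assumed_conc
    return true_ratio / dh_cal_scale

# Hypothetical 10% concentration errors for a genuinely two-state protein
more_than_assumed = apparent_enthalpy_ratio(true_conc=1.1, assumed_conc=1.0)
less_than_assumed = apparent_enthalpy_ratio(true_conc=0.9, assumed_conc=1.0)
print(f"true conc. 10% high: ratio = {more_than_assumed:.3f}")
print(f"true conc. 10% low:  ratio = {less_than_assumed:.3f}")
```

Under these assumptions the apparent ratio is simply (assumed concentration)/(true concentration), so a 10% concentration error alone moves ΔHvH/ΔHcal about 10% away from 1.0.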

Cross-validation of DSC data
• DSC data collected in the presence of varying concentrations of added denaturant can provide a 2-dimensional ΔGu profile (ΔGu versus T and [D])
• In principle, if we evaluate such a set of DSC data at a fixed temperature (isotherm), we can predict ΔGu for isothermal equilibrium denaturation
• ΔGu as a function of denaturant can be derived from isothermal protein folding studies, and compared to the predicted results from DSC
• These values should agree if the assumptions of the model (and concentrations) are correct

ΔGu landscape as a function of temperature and denaturant from DSC data
Figure label: ΔGu vs [D] isotherm

ΔGu = −RT ln(Keq) = −RT ln(ku/kf)
Human Fibroblast Growth Factor-1

Mutational effects upon stability
Figure labels: ΔΔG at Tm of mutant; ΔΔG at Tm of wild type
ΔΔG = (ΔΔG at Tm of wild type + ΔΔG at Tm of mutant)/2
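The averaging above can be sketched as follows. Two choices here are assumptions rather than statements from the slides: ΔG(T) is evaluated with a constant ΔCp, and ΔΔG is taken as ΔG(mutant) − ΔG(wild type), so a destabilizing mutation gives a negative value. The opposite sign convention is also in use. All names and parameter values are hypothetical.

```python
import math

def delta_g(temp_k, tm_k, dh_cal, dcp):
    """dG of unfolding at temp_k, kJ/mol, assuming a constant dCp."""
    dh = dh_cal + dcp * (temp_k - tm_k)
    ds = dh_cal / tm_k + dcp * math.log(temp_k / tm_k)
    return dh - temp_k * ds

def averaged_ddg(wild_type, mutant):
    """Mean of ddG evaluated at the wild-type Tm and at the mutant Tm.

    wild_type, mutant : (tm_k, dh_cal, dcp) tuples
    Sign convention (assumption): ddG = dG(mutant) - dG(wild type).
    """
    def ddg_at(temp_k):
        return delta_g(temp_k, *mutant) - delta_g(temp_k, *wild_type)

    return 0.5 * (ddg_at(wild_type[0]) + ddg_at(mutant[0]))

# Hypothetical parameters: (Tm in K, dHcal in kJ/mol, dCp in kJ/(mol K))
wild_type = (330.0, 400.0, 8.0)
mutant = (325.0, 380.0, 8.0)

ddg = averaged_ddg(wild_type, mutant)
print(f"Averaged ddG = {ddg:.2f} kJ/mol")
```

Evaluating at the two Tm values keeps both ΔG curves close to the region where, as noted earlier, the thermodynamic parameters are determined most accurately.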
