The lac repressor-operator system: Swimming in Data
Collaborators: Mitch Lewis, Bob Daber, Leslie Milk, Matt Sochor, Chuck Bell, Steve Stayrook
Thermodynamics of Allostery
Kinetics of Allostery: Induced Fit or Landscape Shift?
Large Scale Analysis of base sequence specificity/affinity
R: Active form, binds DNA tightly, Inducer weakly
R*: Induced form, binds DNA weakly, Inducer tightly
Repressor binds O1 operator site >1000-fold more tightly than non-specific DNA
Residue:   R   Q   Y   (YQR)
aa number: 22  18  17
Mechanism of allostery
The origin of base sequence specific recognition of DNA by proteins
Prototype for gene therapy
Design of Tools for DNA manipulation
Cronin et al., The lac operator-repressor system is functional in the mouse. Genes & Dev. 2001. 15: 1506-1517
In vivo system for evolution and functional characterization of lac repressor
Two plasmid system: one contains a Lac repressor gene, the other contains the GFPmut3.1 gene controlled by the Lac promoter and a given operator.
FACS used to screen and separate phenotypes by GFP fluorescence.
Randomize plasmid sequence corresponding to given aa positions in repressor
Screen for given phenotype
Permits asymmetric DNA recognition domains to target non-symmetric operator sequences
Knockout one inducer site: Probe allosteric mechanism
Fluorescence quantified by plate reader
Fractional GFP expression relative to that with no repressor plasmid
Induced by IPTG
KRR*: Repressor conformational equilibrium (Induced/active)
KIR*, KIR: Inducer binding affinities for induced, active repressor
KR*O, KRO: Operator DNA binding affinities for induced, active repressor
O/(O+RO)-> Transcription (mRNA) -> Translation (GFP level)
Fractional GFP expression with no inducer
Fractional GFP expression at saturation (n=1 inducer site)
(n=2 inducer sites)
Inducer Binding Affinity Ratio KIR*/KIR = 15
In Vivo Repressor Concentration [R]KRO = 150
Inducer-Repressor Binding Affinity KD,IR* = 4uM
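The equilibrium scheme above can be sketched numerically as a two-state (MWC-type) model. This is a minimal illustration under stated assumptions, not the fitted model from the talk: it assumes repressor is in excess over operator, that [R]KRO = 150 refers to total repressor, and that each of the n inducer sites binds independently. The values k_rr_star and kro_ratio are hypothetical placeholders; only the affinity ratio of 15, [R]KRO = 150 and KD,IR* = 4 uM come from the slides.

```python
def fractional_expression(inducer_uM, n_sites=2, r_kro=150.0,
                          kd_ir_star_uM=4.0, affinity_ratio=15.0,
                          k_rr_star=0.1, kro_ratio=0.0):
    """Fraction of free operator, O/(O+RO), in a two-state sketch.

    k_rr_star (R*/R with no inducer) and kro_ratio (KR*O/KRO) are
    hypothetical placeholders, not values from the source.
    """
    k_ir_star = 1.0 / kd_ir_star_uM        # inducer affinity for R*, per uM
    k_ir = k_ir_star / affinity_ratio      # inducer affinity for R, per uM
    # Statistical weights of the two conformations, summed over inducer states
    w_r = (1.0 + inducer_uM * k_ir) ** n_sites
    w_r_star = k_rr_star * (1.0 + inducer_uM * k_ir_star) ** n_sites
    f_r = w_r / (w_r + w_r_star)           # fraction of repressor in R form
    f_r_star = 1.0 - f_r
    # Effective [R]K for operator binding, weighted by conformation
    effective_rk = r_kro * (f_r + kro_ratio * f_r_star)
    return 1.0 / (1.0 + effective_rk)


if __name__ == "__main__":
    for inducer in (0.0, 1.0, 10.0, 100.0, 1000.0):
        print(inducer, fractional_expression(inducer))
```

With these placeholder values, the n=2 curve rises higher at saturation than the n=1 curve, which is the qualitative difference between the one-site and two-site expressions.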
All constants are obtained in vivo, without doing a single binding measurement!
This explains why crystal structures of lac with and without IPTG bound are so similar
But why is Repressor conformational equilibrium so weak?
ΔG to drive conformational change available from inducer binding is about 1.6 kcal/mole per site, or about 3.2 kcal/mole total for two sites, a fairly modest amount
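The figure of about 1.6 kcal/mole is consistent with RT ln(KIR*/KIR) for an affinity ratio of 15. A short check, assuming a temperature of 298.15 K (the temperature is an assumption, not stated on the slides):

```python
import math

R_KCAL = 1.987e-3        # gas constant, kcal/(mol*K)
TEMPERATURE_K = 298.15   # assumed temperature


def coupling_free_energy(affinity_ratio, n_sites=1, temperature_k=TEMPERATURE_K):
    """Free energy (kcal/mol) available from preferential inducer binding."""
    return n_sites * R_KCAL * temperature_k * math.log(affinity_ratio)


per_site = coupling_free_energy(15.0)
total = coupling_free_energy(15.0, n_sites=2)
print(f"{per_site:.2f} kcal/mol per site, {total:.2f} kcal/mol for two sites")
```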
Cell achieves effective repression in spite of weak equilibrium by setting [R] at 150-fold excess
Lac switch has evolved to switch effectively on only a modest driving force from inducer binding, balancing the conflicting requirements of repression and induction
# of ligands       2       4
Binding Ratio      15-20   30
Conf. Equilibrium  2       1/1000
Hill #             1.2     >3
Comparison of equilibrium constants with previous in vitro studies
Ligand L binds, induces conformational change A->B (induced fit)
B is of higher free energy than A
L binds to B tighter than to A, so now LB has lower free energy than A or LA
Protein exists in an ensemble of conformations A, B, C, … Higher energy forms are less populated. L binds to and ‘selects’ one of the higher energy conformers, lowering its free energy so it becomes the dominant form
This is the population selection model, aka the protein landscape model, the protein ensemble model
Low inducer, R binds O tightly
High inducer, R dissociates from O
Induced fit route?
…applied to the Lac-Operon system
This can only be determined by kinetics, not equilibria.
Lac is one of the few systems where there is enough kinetic data to discriminate definitively between the two routes
Association rates depend on concentration
In cell, [R] = 1nM
Time constants for various steps at I = 1uM
Time constants for various steps at I = 10uM
Time constants for various steps at I = 100uM
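Because association is bimolecular, its time constant shortens as inducer concentration rises. With inducer at micromolar levels and [R] = 1 nM, the inducer is in large excess, so a pseudo-first-order estimate tau = 1/(k_on [I]) is a reasonable sketch. The rate constant below is a hypothetical placeholder chosen only to show the scaling; it is not a measured value from the talk.

```python
def association_time_constant(k_on_per_M_per_s, ligand_M):
    """Pseudo-first-order time constant (s) for ligand in excess."""
    if k_on_per_M_per_s <= 0 or ligand_M <= 0:
        raise ValueError("rate constant and concentration must be positive")
    return 1.0 / (k_on_per_M_per_s * ligand_M)


HYPOTHETICAL_K_ON = 1.0e5  # M^-1 s^-1, placeholder for illustration only

for inducer_uM in (1.0, 10.0, 100.0):
    tau = association_time_constant(HYPOTHETICAL_K_ON, inducer_uM * 1e-6)
    print(f"I = {inducer_uM:g} uM: tau = {tau:g} s")
```

Each tenfold increase in inducer shortens the association time constant tenfold, while dissociation and conformational steps, being unimolecular, keep the same time constants.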
Flux at 1uM IPTG
(below induction midpoint)
Flux at 10uM IPTG
Repressor is leaky. This is functionally important, since the in vivo inducer is a metabolic product of enzymes repressed by lac
Changes in leakiness, as measured by GFP levels, due to mutation/base changes → R-O affinity changes
Functional Rules for Lac Repressor-Operator Associations and Implications for Protein-DNA Interactions
Milk, Daber and Lewis, Protein Science (2010) Vol 19.
A library of Lac mutants, fully randomized at positions 17, 18 and 22 screened against 64 symmetric Lac operator variants.
Functional repressors sequenced, purified and assayed with the corresponding operators.
Lower GFP expression = Tighter binding. Increase in GFP by IPTG = Inducibility.
GFP levels in absence of inducer (leakiness) used to calculate change in Repressor-Operator affinity relative to wild type (YQR-GTG).
Changes in affinity occur due to localized sequence changes in 3 aa’s or 3 bp’s within the framework of the rest of the lac-operator
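One way to turn leakiness into a relative affinity is sketched below. It assumes the simplest occupancy relation, fractional GFP = O/(O+RO) = 1/(1 + [R]K), with the same repressor concentration for mutant and wild type and no inducer present. The function names and the 298.15 K temperature are illustrative assumptions, not the published procedure.

```python
import math

RT_KCAL = 1.987e-3 * 298.15  # kcal/mol at an assumed 298.15 K


def r_times_k(fractional_gfp):
    """Invert f = 1/(1 + [R]K) to get the dimensionless product [R]K."""
    if not 0.0 < fractional_gfp < 1.0:
        raise ValueError("fractional GFP must lie strictly between 0 and 1")
    return 1.0 / fractional_gfp - 1.0


def relative_affinity(f_variant, f_wild_type):
    """Ratio K_variant / K_wild_type, assuming equal repressor levels."""
    return r_times_k(f_variant) / r_times_k(f_wild_type)


def delta_delta_g(f_variant, f_wild_type):
    """Binding free energy change (kcal/mol); positive means weaker binding."""
    return -RT_KCAL * math.log(relative_affinity(f_variant, f_wild_type))


# Wild-type leakiness implied by [R]KRO = 150 under this simple relation
f_wt = 1.0 / 151.0
print(r_times_k(f_wt))
print(delta_delta_g(0.05, f_wt))
```

A variant that is leakier than wild type gives a relative affinity below 1 and a positive free energy change.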
AA   Bases               | AA   Bases
AAN  TGA TTA             | HNR  GTG
AAR  GAG GGA GTA         | HQN  TTT
ACR  GAA GCA             | HQR  GTG
AGN  TGA TTA             | HSN  TGG TTT
AGR  GAA GGA GGG GTA GTG | HSR  GAG GAT GGG GTG
AIR  GGT                 | HTA  CTT
AKN  TAC                 | HTK  CTT
AKR  GAC                 | HTN  TTG TTT
AMR  GAT GGT GTG         | HTR  GTA GTG
ANR  GTG                 | HVR  GTA
APR  GAA                 | HYR  GTG
AQR  GAT GGG GTG         | IAA  CTA
ASA  CGA CGT             | IAF  CTA
ASL  TAG                 | IAG  CTA
ASN  TGA TGG TGT         | IAN  TGA TTA
ASR  GCA GGG GGT         | IAR  GAA GTA
ASS  CGA                 | IAY  CTA TTA
CAN  TTA                 | IGR  GAA GGA GTG TAA
CMR  GGT GTG             | IKR  GAC
CQR  GTG                 | IMR  GAG
CSR  GGG GGT             | INR  GTG
CTR  GAA GGA GGT         | IQR  GTG
DAR  GTA                 | ISL  CGA
EAR  GTA                 | ISR  GAA GCA
EMR  GTG                 | ITR  GAA GCA GTG
ESR  GGG                 | IWK  CTA
FAR  GAA                 | KAN  TGG
FKR  GAC                 | KAR  GAG GGG
FMR  GTG                 | KGR  GTG
GAN  TTA                 | KMR  GGG GTG
GAR  GAA GCA GGA GTA     | KNR  GGG
GCR  GAA                 | ...  ...
GGR  GTG                 | YQR  GTG (Wild Type)
GKR  GAC                 | YTR  GTG
196 different AA sequences
26 different base sequences
AGG CGA CGG CGT CTA CTT GAA GAC GAG GAT GCA GGA GGC GGG GGT GTA GTG TAA TAC TAG TGA TGG TGT TTA TTG TTT
KSL ASA KSA ASA IAA HTA ACR AKR AAR AMR ACR AAR GSR AGR AIR AAR AGR IGR AKN ASL AAN ASN ASN AAN HTN HMN
ASS KSC KSA IAF HTK AGR FKR HAR AQR ASR AGR AQR AMR AGR AMR PAN PKN HGN AGN HGN KSN AGN HQN
ATA KSL PSA IAG APR GKR HCR HSR GAR ATR ASR ASR ATR ANR PSN KSL ASN HSN PSN CAN HSN
ISL KSM TSA IAY AVR IKR HGR PAR GSR AVR CSR ATR DAR AQR PAN ATN KAN TSN GAN HTN
PSA KSY ICK CTR MKR HSR PMR GTR CTR ESR AVR EAR CMR PSN IAN KSA HAN
PTA KTA ICN FAR NKR IMR SMR ISR GAR GSR CMR GAR CQR RSL LGN KSC IAN
SSA KTD ICY GAR PKR KAR TMR ITR GSR HGR CSR GTR EMR PAN KSF IAY
STA KTM IWK GCR SKR PAR PCR GTR HSR CTR HTR FMR PSN KSG IGN
KTN TAA GTR TKR PQR PGR IGR KAR GSR HVR GGR PTN KSH SAH
TAY HAR RAR PSR NTR KMR GTR IAR GMR TGH KSL SAN
IAR RSR SAR PVR KNR KQR LAR GNR TGN KSM SAY
IGR SSR SCR SGR KSR KSR MAR GQR KSS SGN
ISR SSR STR KTR KTR PAR GTR KSY STN
ITR STR TTR NSR PIR PTR HAR KTN TAH
LAR TAR NTR PVR QAR HCR PSN TAN
MAR TTR PSR SMR SAR HGR RSL TAY
MTR PTR SSR SCR HNR RSN TGN
PCR QSR STR SGR HQR SSN VAN
PVR RAR TMR STR HSR YAN
QAR RGR TSR TAR HTR
SAR RQR TTR TCR HYR
SCR RSR VMR TGR IGR
SGR RTR VTR TTR INR
TAR SGR VAR IQR
TSR SSR VYR ITR
TTR STR KGR
VAR TGR KMR
>300 aa-base pair combinations now screened. Now that we have a thermodynamic model for induction, all 300+ affinities can be extracted from the leakiness…
AGG CGA CGG CGT CTA CTT GAA GAC GAG GAT GCA GGA GGC GGG GGT GTA GTG TAA TAC TAG TGA TGG TGT TTA TTG TTT
196 variants of Lac differing in aa sequence in the recognition helix, each of which binds specifically to a different subset of 26 DNA base pair sequences, for a total of 331 aa-bp complexes with known affinity.
Extract as much sequence level information about specificity as possible to infer sequence recognition ‘rules’.
Can take a ‘bioinformatics’ approach
Bipartite Graph partitioning
Given: 331 (and counting) amino-acid, base sequence variants and their relative affinities
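The aa-base data form a bipartite graph, with amino-acid triplets on one side and base triplets on the other. The sketch below builds that graph from a small subset of the pairs listed earlier and splits it into connected components, which is only a simple stand-in for a real partitioning method (the talk does not say which algorithm was used). Nodes are tagged by type because a triplet such as GCA could be read as either bases or amino acids.

```python
from collections import defaultdict, deque

# Small subset of (amino-acid triplet, base triplet) pairs from the table
PAIRS = [
    ("YQR", "GTG"), ("HQR", "GTG"),
    ("AKR", "GAC"), ("FKR", "GAC"), ("IKR", "GAC"), ("GKR", "GAC"),
    ("IAA", "CTA"), ("IAF", "CTA"), ("IAG", "CTA"),
    ("HQN", "TTT"),
]


def build_graph(pairs):
    """Undirected bipartite graph; nodes are ('aa', seq) or ('dna', seq)."""
    graph = defaultdict(set)
    for aa, bases in pairs:
        graph[("aa", aa)].add(("dna", bases))
        graph[("dna", bases)].add(("aa", aa))
    return graph


def connected_components(graph):
    """Breadth-first search over every unvisited node."""
    seen, components = set(), []
    for start in list(graph):
        if start in seen:
            continue
        seen.add(start)
        queue, component = deque([start]), set()
        while queue:
            node = queue.popleft()
            component.add(node)
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        components.append(component)
    return components


components = connected_components(build_graph(PAIRS))
for component in components:
    print(sorted(component))
```

On this subset the components group repressors by the operator they share, for example all of the xKR variants with GAC. On the full data set many components would merge, which is where a weighted partitioning using the measured affinities would be needed.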
Identify the structural basis for sequence specific protein-DNA recognition using a conformational analysis approach, i.e. by searching through protein and base sequence/conformation space to generate Lac-DNA structural models that explain, and ultimately predict, which amino-acid sequences recognize which base sequences.
What structural features determine high affinity, and/or sequence specificity? Can we predict, and so design, repressor sequences that will bind given lac-operator sequences, and more generally, bind any base sequence of the same length?
EVOLVE: Searches in both protein and DNA sequence space, with full amino-acid, base rotamer exploration, torsional minimization.
Simultaneously generates conformers for bound and unbound states and evaluates the energy difference.
EVOLVE energy difference vs. Measured Affinity difference.
Correlation coefficient = 0.66
Not bad, considering there is no full rotamer exploration (depth first), no solvent, and no binding entropy yet