Molecular Biology Course
Part I: Chemistry and Genetics
Part II: Maintenance of the Genome
Part III: Expression of the Genome
Part IV: Regulation
Part V: Methods
Ch 20: Techniques of Molecular Biology
Ch 21: Model Organisms
Preparation, analysis and manipulation of nucleic acids and proteins
The methods depend upon, and were developed from, an understanding of the properties of biological macromolecules themselves.
Topic 1: Nucleic acids
1. DNA and RNA molecules are negatively charged, so they move through the gel matrix toward the positive pole.
2. Linear DNA molecules are separated according to size: large DNA molecules move more slowly than small ones.
3. The mobility of circular DNA molecules is affected by their topology. For DNA molecules of the same molecular weight but different shapes, the order of mobility is:
supercoiled > linear > nicked or relaxed
DNA can be visualized by staining the gel with fluorescent dyes, such as ethidium bromide (EB).
Fig 20-1: DNA is separated by gel electrophoresis
The gel matrix is an inert, jello-like porous material that supports macromolecules and allows them to move through it.
Polyacrylamide: has high resolving power and can resolve DNA/RNA molecules that differ in length by as little as a single base pair/nucleotide,
but can separate DNA only over a narrow size range (up to a few hundred bp/nt).
Agarose: has much less resolving power than polyacrylamide,
but can separate DNA molecules of up to tens of kb.
The electric field is applied in pulses that are oriented orthogonally to each other.
Separates DNA molecules according to their molecular weight, as well as their shape and topological properties.
Can effectively separate DNA molecules over 30-50 kb and up to several Mb in length.
Switching between two orientations: the larger the DNA is, the longer it takes to reorient
RNA has a uniform negative charge, as DNA does.
RNA is single-stranded and has extensive secondary and tertiary structure, which significantly influences its electrophoretic mobility.
RNA can be treated with a reagent such as glyoxal to prevent base pairing, so that its mobility correlates with its molecular weight.
the 1st such
The random occurrence of a hexameric sequence:
What are the frequencies if the recognition sequences are four (tetrameric) and eight (octameric) nucleotides? [homework]
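The expected frequency of a recognition site can be sketched as follows, assuming the four bases are equally frequent and independent of each other (a simplification; real genomes deviate from it). The function names are illustrative only.

```python
def site_probability(site_length: int) -> float:
    """Chance that a specific site of this length starts at a given position,
    assuming each base occurs with probability 1/4 (an assumption)."""
    return 0.25 ** site_length


def expected_spacing_bp(site_length: int) -> int:
    """Average distance in bp between occurrences of such a site."""
    return 4 ** site_length


for n in (4, 6, 8):
    print(f"{n}-bp site: about once every {expected_spacing_bp(n)} bp")
```

Under this assumption a hexameric site is expected about once every 4096 bp; shorter sites cut more often and longer sites less often.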
(The smallest fragment)
it will be cut into 7 fragments, which can be separated by gel electrophoresis.
Fig 20-3: Digestion of a DNA fragment with endonuclease EcoRI
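A digest of a linear molecule can be sketched in a few lines. The toy sequence is made up for illustration; the only biological fact used is that EcoRI recognizes GAATTC and cuts after the G. On a linear molecule, n sites give n + 1 fragments, so seven fragments would correspond to six sites.

```python
def digest_linear(seq: str, site: str = "GAATTC", cut_offset: int = 1) -> list[str]:
    """Cut a linear DNA (top strand only) at every occurrence of `site`.

    cut_offset is the number of bases of the site left on the upstream
    fragment (EcoRI cuts G^AATTC, so the offset is 1).
    """
    fragments = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(seq[start:cut])
        start = cut
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])  # the piece after the last site
    return fragments


# Hypothetical test molecule containing six EcoRI sites
toy = ("TTTT" + "GAATTC") * 6 + "TTTT"
pieces = digest_linear(toy)
print(len(pieces), sorted(len(p) for p in pieces))
```

Sorting the fragment lengths mimics what the gel shows: fragments are distinguished by size, not by their order in the original molecule.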
Use of multiple REs allows different regions of a DNA molecule to be isolated
(1) Restriction enzymes differ in recognition specificity: their target sites are different.
(2) Restriction enzymes differ in the length of the sequence they recognize, and thus in how frequently they cut.
(3) Restriction enzymes differ in the nature of the DNA ends they generate: blunt/flush ends or sticky/staggered ends.
(4) Restriction enzymes differ in cleavage activity.
Fig 20-4 Recognition sequences and cut sites of various endonucleases
Fig 20-5 Cleavage of an EcoRI site. The 5’ protruding ends are said to be “sticky” because they readily anneal through base-pairing to DNA molecules cut with the same enzyme
Hybridization: the process of base-pairing between complementary ssDNA or RNA from two different sources.
A labeled, defined sequence used to search mixtures of nucleic acids for molecules containing a complementary sequence.
Radioactive labeling: display and/or magnify the signals by radioactivity.
Non-radioactive labeling: display and/or magnify the signals by antigen labeling – antibody binding – enzyme binding - substrate application (signal release)
End labeling: put the labels at the ends
Uniform labeling: put the labels internally
5’-end labeling using polynucleotide kinase (PNK)
3’-end labeling using terminal transferase
How to label one end of a DNA: label both ends with kinase, then remove one end by restriction digestion
Nick translation labeling of DNA:
DNase I introduces random nicks; DNA pol I then removes dNMPs ahead of each nick with its 5’ to 3’ exonuclease activity while adding new dNMPs, including labeled nucleotides, to the 3’ end at the nick.
Hexanucleotide-primed labeling of DNA: denature the DNA, add random hexanucleotide primers, and let DNA polymerase synthesize new strands that incorporate labeled nucleotides.
labeled by in vitro transcription of the desired RNA sequence.
DNA on blot
RNA on blot
Northern analysis COB RNAs in S. cerevisiae
Comparison of Southern, Northern and Western blot hybridization
Three different steps proceed in each PCR cycle.
Steps of PCR
Many cycles (commonly 25-35) are performed in one PCR reaction, which results in exponential amplification of the target DNA when both the forward and reverse primers anneal.
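The exponential growth can be illustrated with a small calculation. This is an idealized sketch: the `efficiency` parameter is an assumption introduced here (1.0 means perfect doubling each cycle), and real reactions plateau as primers and nucleotides run out.

```python
def pcr_copies(initial_templates: int, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized copy number after a number of PCR cycles.

    Each cycle multiplies the target by (1 + efficiency); efficiency = 1.0
    is perfect doubling. No plateau phase is modeled.
    """
    return initial_templates * (1.0 + efficiency) ** cycles


for n in (25, 30, 35):
    print(f"{n} cycles from 1 template: {pcr_copies(1, n):.3g} copies")
```

With perfect doubling, 30 cycles turn a single template into about 10^9 copies, which is why a few target molecules are enough in principle.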
Any source of DNA that provides one or more target molecules can in principle be used as a template for PCR.
Whatever the source of template DNA, PCR can only be applied if some sequence information is known so that primers can be designed.
5’-ATTCCGATCGCTAATCGATGGC------- TCCTGTGCA TTTCGCCACTAGAG-3’
Unless stated otherwise, a DNA sequence is written from the 5’ end to the 3’ end, and usually only the sense strand is given rather than both strands.
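Because only the sense strand is written, the reverse primer has to be derived as the reverse complement of the 3’ end of the target. The sketch below assumes, for illustration, that the two flanking blocks of the sequence shown above serve as the primer-binding regions; actual primer choice also depends on length and melting temperature, which are not modeled here.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


# Flanking regions of the sense strand shown above (assumed primer sites)
sense_5_end = "ATTCCGATCGCTAATCGATGGC"
sense_3_end = "TTTCGCCACTAGAG"

forward_primer = sense_5_end                      # identical to the sense strand
reverse_primer = reverse_complement(sense_3_end)  # anneals to the sense strand

print("forward:", forward_primer)
print("reverse:", reverse_primer)  # CTCTAGTGGCGAAA
```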
an oligo pool derived from a protein sequence.
E.g. His-Phe-Pro-Phe-Met-Lys can generate the primer
5’-CAY TTY CCN TTY ATG AAR-3’
N = any base; Y = C or T; R = A or G
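The size of such an oligo pool follows from the degenerate positions. A minimal sketch, using only the IUPAC codes that appear in this primer (the helper names are illustrative):

```python
from itertools import product

# Subset of IUPAC nucleotide codes needed for this primer
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "AG", "Y": "CT", "N": "ACGT"}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for base in primer:
        count *= len(IUPAC[base])
    return count


def expand(primer: str) -> list[str]:
    """List every concrete sequence in the degenerate pool."""
    return ["".join(combo) for combo in product(*(IUPAC[b] for b in primer))]


primer = "CAYTTYCCNTTYATGAAR"  # His-Phe-Pro-Phe-Met-Lys
print(degeneracy(primer))       # 2 * 2 * 4 * 2 * 1 * 2 = 64
print(expand(primer)[:3])
```

Met (ATG) contributes no degeneracy, which is one reason peptide stretches rich in Met or Trp are attractive for designing such primers.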
Gene of interest
To introduce deletion or point mutations
Reverse mutagenic primer
Denature and anneal
Extend to full length by DNA polymerase
Two ways for sequencing:
Maxam and Gilbert
The absence of the 3’-hydroxyl prevents nucleophilic attack on the next incoming nucleotide, so chain elongation stops.
If one ddGTP is added per 100 dGTP, DNA synthesis aborts with a frequency of about 1/100 each time the polymerase has to add a G, because that is how often it picks up a ddGTP instead of a dGTP.
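The 1/100 figure implies a geometric distribution of chain lengths, which is why one reaction yields fragments ending at every G. The sketch below assumes the polymerase picks ddGTP with a fixed probability of 0.01 at each G and shows no preference between the two substrates; both are simplifications.

```python
def termination_probability(kth_g: int, p_dd: float = 0.01) -> float:
    """Chance that a chain passes the first kth_g - 1 G positions
    and is terminated by ddGTP exactly at G position number kth_g."""
    return (1.0 - p_dd) ** (kth_g - 1) * p_dd


def fraction_terminated_within(n_g: int, p_dd: float = 0.01) -> float:
    """Fraction of chains terminated at or before the n_g-th G position."""
    return 1.0 - (1.0 - p_dd) ** n_g


for k in (1, 10, 100):
    print(f"G #{k}: {termination_probability(k):.4f} of chains stop here")
print(f"stopped within 100 Gs: {fraction_terminated_within(100):.2f}")
```

A low ddNTP ratio keeps enough chains growing to reach distant positions; raising the ratio shifts the population toward short fragments.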
Four separate reactions
Each ddNTP carries a fluorescent group, allowing us to “read” the sequence directly from the gel.
If the H. influenzae genome is 1.8 Mb and each read produces 600 bp of sequence, then 600 bp x 33,000 different colonies = about 20 Mb, roughly ten times the genome size.
That is to say, 33,000 colonies are picked to prepare plasmids for sequencing.
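The arithmetic behind these numbers is a coverage calculation. A short sketch using the figures from the slide (variable names are illustrative):

```python
genome_size_bp = 1_800_000   # H. influenzae, about 1.8 Mb
read_length_bp = 600         # sequence obtained per read
colonies = 33_000            # one read per picked colony

total_sequence_bp = read_length_bp * colonies
coverage = total_sequence_bp / genome_size_bp

print(f"total sequence: {total_sequence_bp / 1e6:.1f} Mb")
print(f"average coverage: {coverage:.0f}x")
```

Each base is therefore read about ten times on average, which is what lets random fragments overlap enough to be assembled.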
2. Shotgun sequencing
3. Sequence Assembly
(A single contig is about 50,000 to 200,000 bp. )
Sophisticated computer programs have been developed that assemble the short sequences from random shotgun DNAs into larger contiguous sequences called contigs.
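The idea of merging overlapping reads into contigs can be shown with a toy greedy procedure. This is only a sketch of the principle, not how production assemblers work: it assumes error-free reads from one strand and ignores repeats, and the reads themselves are made up.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that equals a prefix of b
    (at least min_len bases), or 0 if there is none."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of sequences with the largest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for k, c in enumerate(contigs)
                   if k not in (best_i, best_j)] + [merged]


# Hypothetical reads sampled from the sequence ATGGCGTACCTGA
reads = ["GCGTACC", "ATGGCGT", "TACCTGA"]
print(greedy_assemble(reads))  # ['ATGGCGTACCTGA']
```

Reads that share no sufficient overlap stay as separate contigs, which corresponds to the gaps that remain after a real shotgun assembly.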
The purpose of this analysis is to predict the protein-coding genes and other functional sequences in the genome.
Finding protein coding genes = Identification of ORF (open-reading frames).
but not all ORFs are real protein-coding genes;
the key challenge is in identifying the functions of these genes
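A minimal ORF scan illustrates why ORF identification is only a first step. The sketch below looks at the three frames of one strand only, treats ATG as the sole start codon, and uses an arbitrary minimum length; all of these are simplifying assumptions, and the test sequence is invented.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str, min_codons: int = 3) -> list[tuple[int, int]]:
    """Return (start, end) positions of ORFs on the given strand.

    An ORF runs from an ATG to the first in-frame stop codon; `end` is
    exclusive and includes the stop. min_codons counts codons before the stop.
    """
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return orfs


toy = "CCATGAAACCCGGGTAACC"
for begin, end in find_orfs(toy):
    print(begin, end, toy[begin:end])  # 2 17 ATGAAACCCGGGTAA
```

Short ORFs arise by chance in any sequence, so a length threshold trades missed small genes against false positives; this is one reason not every ORF is a real gene.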
For animal genomes with complex exon-intron structures, the challenge is far greater:
A variety of bioinformatics tools are required to identify genes and genetic composition of complex genomes.
The computer programs identifying potential protein coding genes are based on many sequence criteria including the occurrence of extended ORFs that are flanked by appropriate 5’ and 3’ splice sites.
About one-fourth of genes cannot be identified in this way.
Promoters often fail to be identified because the core promoter elements are highly degenerate. Although the transcription machinery is smart enough to identify these elements in the cell, we are not yet smart enough to write programs that identify them in silico.
The most important method for validating predicted protein-coding genes and identifying those missed by current gene finder programs is the use of cDNA sequence data.
Fig 20-18 Gene finder method: analysis of protein-coding regions in Ciona intestinalis (sea squirt)
A 20-kb genome sequence (scaffold)
Predicted by a gene finder program
The comparison of different animal genomes:
One of the striking findings of comparative genome analysis is the high degree of synteny (conservation of genetic linkage) between distantly related animals.
The logic is that “functional sequence cannot be changed randomly”.
Regulatory sequences-transcription factor binding sites and larger elements of gene regulation, such as enhancers.
The computer program VISTA aligns the sequence contained in different genomes over short windows (10-20 bp), and can be used to predict the conserved regulatory sequence.
It is predicted that the human and mouse genomes each contain on the order of 50,000-100,000 enhancers.
An example of using different programs to predict the conserved regulatory sequences
Finding regions of similarity between different protein coding genes.
Input a query sequence: a stretch of amino acids, or the DNA sequence encoding the protein function you are interested in.
Ask the computer to search the database for homologous sequences, and you will get all the available genes that may have a similar protein function.
Host organisms/cells: where the plasmids are multiplied and propagated faithfully, which is crucial for DNA cloning.
Cloning vectors: allow exogenous DNA to be inserted, stored, and manipulated at the DNA level.
E. coli cloning vector (circular):
bacteriophages (λ and M13)
plasmid-bacteriophage λ hybrids (cosmids).
Yeast cloning vector: yeast artificial chromosomes (YACs) (linear)
Plasmids: small, extrachromosomal circular molecules, from 2 to ~200 kb in size, which exist in multiple copies within the host cells.
(Genomic library and cDNA library)
A DNA library is a population of identical vectors, each of which contains a different DNA insert. (Fig. 20-8)
Genomic library: the DNA inserts in the library are derived from restriction digestion or physical shearing of the genomic DNA.
cDNA library: the DNA inserts in the library are converted from the mRNAs of a tissue, a cell type or an organism. cDNA stands for the DNA copied from mRNA. (Fig. 20-19)
Antibiotic selection: only cells that carry a plasmid grow on the antibiotic-containing plate.
if the vector is phosphorylated
Dephosphorylate the vector using alkaline phosphatase to prevent religation of vector molecules
MCS (multiple cloning site)
Insertion of a DNA fragment interrupts the ORF of the lacZ’ gene, resulting in a non-functional gene product that cannot digest its substrate X-gal.
(substrate of the enzyme)
The expression of active β-galactosidase has to be vector dependent for the selection purpose
lacZ’: a shortened derivative of lacZ,
encoding the N-terminal α-peptide of β-galactosidase.
Host strain for vectors containing lacZ’:
contains a mutant gene encoding only the C-terminal portion of β-galactosidase, which can then complement the α-peptide to produce the active enzyme
Recombinant plasmid: containing inserted DNA: white transformants
Recreated vector (no insert)
Recombinant plasmid (contain insert)
Transfer to nitrocellulose
or nylon membrane
from master plate
Bake onto membrane
Probe with 32P-labeled DNA
gene of interest
Expose to film
Screening by plaque hybridization
Analysis of a clone
1 Kb+ ladder
Expression of a gene from a transformed/transfected plasmid
allowing the exogenous DNA to be stored and expressed in an organism.
--E. coli expression vector
--Yeast expression vector
--Mammalian expression vector
In addition to the origin of replication, selectable marker, and multiple cloning site, an expression vector has to contain a promoter and a terminator for transcription. The inserted gene has to have a start codon and a stop codon for translation.
T7 expression vector
A YEp vector
Lac fusions: fuse your target gene with the LacZ coding sequence
His-tag fusions: a sequence encoding a His-tag is inserted at the N- or C-terminus of the target ORF, which allows the fusion protein to be purified by binding to a Ni2+ column.
GFP fusions: fuse your target gene to the N- or C-terminus of GFP, and the fusion protein will give a green fluorescence signal.
Topic 2: Proteins
In this approach, protein fractions are passed through glass columns filled with appropriately modified small acrylamide or agarose beads.
If the target protein is known to establish a specific and high-affinity interaction with a specific protein/nucleic acids/small molecule, we can couple this specific partner of the target protein to the column and thus the target protein will be selectively bound to the column.
This method is called affinity chromatography.
Western analysis using two specific antibodies
--- P+ P++ P+++ P++++
Tandem mass spectrometry (MS/MS).
A chemical reaction in which amino acid residues are sequentially released from the N-terminus of a polypeptide chain.
The whole process can be carried out in an automatic protein sequencer (very expensive).
1. 2-D gel electrophoresis for protein separation.
2. Mass spectrometry for the precise determination of the molecular weight and identity of a protein.
3. Bioinformatics for assigning proteins and peptides to the predicted products of protein-coding sequences in the genome.
Topic 3: Study the interaction between protein and nucleic acid
DNA bound to
Identify the actual region of sequence with which the protein interacts.
Sequence ladder is required to determine the precise position
DNase footprinting
(1) The protein protects DNA from attack by DNase.
(2) Treat the DNA-protein complex with DNase I under mild conditions, so that on average only one cut occurs per DNA molecule.
protein and denature DNA
The three lanes represent DNA that was bound to 0, 1, and 5 units of protein. The lane with no protein shows a regular ladder of fragments. The lane with one unit shows some protection, and the lane with 5 units shows complete protection in the middle. By including sequencing ladders, we can tell exactly where the protein bound.
Topic 4: Determining the structure of protein and nucleic acids
Determining the tertiary structure
Nucleic acids techniques:
Electrophoresis; restriction digestion; hybridization (Southern & Northern); PCR amplification; sequencing and genome sequencing; DNA cloning and gene expression.
Protein purification; affinity chromatography; protein separation and identification by Western blot; protein sequencing; proteomics.
Study the interaction between protein and nucleic acid
Gel retardation & Nuclease protection assays
Determining the Structure of protein and nucleic acids: X-ray crystallography, NMR