cDNA Normalisation and Affymetrix GeneChips

    1. cDNA Normalisation and Affymetrix GeneChips Neil Lawrence

    2. Schedule

    3. Review: DNA Microarrays. A new technology for measuring gene expression. Two main types: cDNA microarrays (Synteni/Stanford) and GeneChip® (Affymetrix).

    4. Affymetrix vs cDNA. The technologies differ in: how the DNA sequences are laid down (spotting vs photolithography); the length of the DNA sequences that are laid down (full sequences vs partial sequences).

    How does the sequence of a strand of DNA correspond to the amino acid sequence of a protein? This concept is explained by the central dogma of molecular biology, which states that genetic information flows from DNA to RNA to protein.

    Why would the cell want to have an intermediate between DNA and the proteins it encodes? The DNA can then stay pristine and protected, away from the caustic chemistry of the cytoplasm. Gene information can be amplified by having many copies of an RNA made from one copy of DNA. Regulation of gene expression can be effected by having specific controls at each element of the pathway between DNA and proteins. The more elements there are in the pathway, the more opportunities there are to control it in different circumstances.

    What is RNA? RNA has the same primary structure as DNA. It consists of a sugar-phosphate backbone, with bases attached to the 1' carbon of the sugar. The differences between DNA and RNA are that RNA has a hydroxyl group on the 2' carbon of the sugar (hence the difference between deoxyribonucleic acid and ribonucleic acid) and that, instead of the base thymine, RNA uses another base called uracil. Because of the extra hydroxyl group on the sugar, RNA is too bulky to form a stable double helix, so RNA exists as a single-stranded molecule. However, regions of double helix can form where there is some base pair complementation (U and A, G and C), resulting in hairpin loops. The RNA molecule with its hairpin loops is said to have a secondary structure. In addition, because the RNA molecule is not restricted to a rigid double helix, it can form many different tertiary structures. Each RNA molecule, depending on the sequence of its bases, can fold into a stable three-dimensional structure. From http://motif.stanford.edu/thesis/tRNA.html.

    There are several different kinds of RNA made by the cell. mRNA (messenger RNA) is a copy of a gene. It acts as a photocopy of a gene by having a sequence complementary to one strand of the DNA and identical to the other strand. The mRNA acts as a busboy to carry the information stored in the DNA in the nucleus to the cytoplasm, where the ribosomes can make it into protein. From http://esg-www.mit.edu:8001/esgbio/dogma/dogma.html

    5. Review: cDNA Chips. Advantages: don't need the sequence of the gene; provide DNA library; can make your own chips. Disadvantages: larger arrays; requires image processing.

    Gene expression analysis: In each and every organism, different genes are expressed in different cell and tissue types (spatial differences) and at different developmental stages (temporal differences). Analysis of these variations in gene expression will lead to a better understanding of disease states, targeting of drugs to specific cells, tissues or individuals, development of agricultural products, etc. Some computational tools available to perform such analyses will be discussed.

    6. Array Spotting

    7. How is the DNA obtained? http://www.dcs.shef.ac.uk/~neil/libraryFaq.htm

    8. Biological Sample. cDNA from two biological samples is labelled with different colours. These samples are hybridised to a slide.

    9. Biological Noise 1. Biological noise: expression levels depend on intrinsic intracellular factors (the stage of the cell cycle) and extrinsic factors (signals from other cells). The expression level depends on a complex interaction of activators and inhibitors.

    10. Biological Noise 2. Even in tissue culture dishes (with uniform cells), variation of RNA levels within each cell can occur. The overall expression level is taken from RNA collected from a pool of cells, so it is a combination of the transcripts of every cell. Our understanding of these processes is incomplete, so the fluctuations often appear to be random and are therefore called biological noise.

    11. Experimental Noise. Problems with initial spotting on the arrays, e.g. not all spots are present or the amount spotted is not the same across all spots. Efficiency of dye incorporation into cDNA. Efficiency of hybridisation. Differences in the dye properties (how the dye affects hybridisation with the cDNA strand). How effectively the cDNA samples bind to the elements (spots). How effectively the background level is reduced by washing. Whether there are any tidemarks left after washing and drying.

    12. Other Processing. Dirt/dust on the slide. Differences arising from the scanning process. Image processing noise: if the image is incorrectly processed, i.e. spot locations are not properly specified, the intensities that are extracted will not truly reflect the expression levels. This variation can be random or systematic.

    13. Normalisation. During array preparation, technical variations can occur: dye properties; differences in dye incorporation; differences in scanning. Normalisation removes these variations, balances the fluorescent intensities of the dyes, and allows comparison of expression levels across experiments (arrays).

    14. Global Normalisation. Global normalisation methods assume the two dyes are related by a constant factor, $R = kG$. Taking logs, $\log_2(R/G) \rightarrow \log_2(R/G) - c$, where $c = \log_2 k$.
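
The constant-factor idea can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's own procedure: the function name `global_normalise` and the choice of the median log-ratio as the estimate of the constant are assumptions made for the example.

```python
import math
from statistics import median


def global_normalise(red, green):
    """Shift the log2(R/G) ratios of one array so that their median is zero."""
    # Log-ratio of the two dye intensities for each spot.
    m = [math.log2(r / g) for r, g in zip(red, green)]
    # Assumption: estimate the constant c = log2(k) by the median log-ratio.
    c = median(m)
    return [value - c for value in m], c


# Toy data in which the red channel is uniformly twice as bright as the green.
red = [200.0, 400.0, 800.0, 1600.0, 100.0]
green = [100.0, 200.0, 400.0, 800.0, 50.0]
normalised, c = global_normalise(red, green)
print(c, normalised)
```

In the toy data every spot has the same ratio, so the estimated constant absorbs the whole dye bias and the normalised log-ratios are all zero.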

    15. Local Normalisations. The dye factor is dependent on: spot intensity (defined as $A = \log_2\sqrt{RG}$); location on the array. Local normalisation methods: intensity dependent; print-tip group.

    16. Intensity Dependent. Visualise the effect: M-A plot. Correction of the intensity-dependent variations:
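
A common form of this correction subtracts a smooth function of intensity, c(A), from each log-ratio M = log2(R/G), with A = log2(sqrt(RG)). Lowess smoothing is the usual way to fit c(A); the sketch below swaps in a much simpler running mean over neighbouring spots (ordered by A) purely to show the mechanics. The function names and the `window` parameter are hypothetical.

```python
import math


def ma_values(red, green):
    """Return (M, A) lists with M = log2(R/G) and A = log2(sqrt(R*G))."""
    m = [math.log2(r / g) for r, g in zip(red, green)]
    a = [0.5 * math.log2(r * g) for r, g in zip(red, green)]
    return m, a


def intensity_normalise(m, a, window=3):
    """Subtract a running-mean estimate of c(A) from each M value.

    Assumption: a running mean over the `window` spots nearest in A-rank
    stands in for the lowess fit normally used for this step.
    """
    order = sorted(range(len(a)), key=lambda i: a[i])  # spot indices sorted by A
    half = window // 2
    corrected = [0.0] * len(m)
    for rank, i in enumerate(order):
        lo = max(0, rank - half)
        hi = min(len(order), rank + half + 1)
        neighbours = [m[order[k]] for k in range(lo, hi)]
        c_a = sum(neighbours) / len(neighbours)  # local estimate of c(A)
        corrected[i] = m[i] - c_a
    return corrected


# Toy example: the dye bias grows with intensity.
m = [0.0, 1.0, 2.0]
a = [1.0, 2.0, 3.0]
corrected = intensity_normalise(m, a, window=3)
print(corrected)
```

With real data the window would span many spots, so that genuine differential expression at a single spot is not smoothed away along with the dye bias.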

    17. Print-tip Group. Different experiments may use different printing set-ups: layout of the tips in the print-head of the arrayer; differences in the length or opening of the tips; deformation. Print-tip normalisation is simply (print-tip + A)-dependent normalisation.
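
Print-tip normalisation can be read as "apply the intensity-dependent correction separately within each print-tip group". The sketch below shows only the grouping step. The names `print_tip_normalise` and `median_centre` are hypothetical, and the default per-group correction (median centring, which ignores A) is a placeholder assumption for whatever intensity-dependent fit is actually used.

```python
from collections import defaultdict
from statistics import median


def median_centre(m, a):
    """Placeholder correction: subtract the group's median M (A is ignored)."""
    c = median(m)
    return [value - c for value in m]


def print_tip_normalise(m, a, tip, correct=median_centre):
    """Apply `correct` independently to the spots printed by each tip."""
    groups = defaultdict(list)
    for index, tip_id in enumerate(tip):
        groups[tip_id].append(index)
    result = [0.0] * len(m)
    for indices in groups.values():
        fixed = correct([m[i] for i in indices], [a[i] for i in indices])
        for i, value in zip(indices, fixed):
            result[i] = value  # write back in the original spot order
    return result


# Toy data: tip 1 prints spots biased upwards, tip 2 prints spots biased downwards.
m = [1.0, 1.2, 0.8, -0.5, -0.3, -0.7]
a = [8.0, 9.0, 10.0, 8.0, 9.0, 10.0]
tip = [1, 1, 1, 2, 2, 2]
normalised = print_tip_normalise(m, a, tip)
print(normalised)
```

Passing a different `correct` callable (for example an intensity-dependent smoother taking `m` and `a`) gives the (print-tip + A)-dependent variant without changing the grouping code.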

    18. Non-Spotting Technologies. We review two approaches: piezoelectric printing and photolithography (Affymetrix). Piezoelectric printing: analogous to ink-jet printers. Oligonucleotides of up to 50 bases are possible.

    19. Photolithography (Affymetrix). Based on the same technique used to make microprocessors. Oligonucleotides are generated in situ on a silicon surface. Oligonucleotides up to 30 bp in length. Array density of $10^6$ probes cm$^{-2}$.

    20. Affymetrix Stock Price

    21. Affymetrix. Only one biological sample per chip. Oligonucleotides represent a portion of a gene's sequence. Twenty sub-sequences present for each gene.

    22. Perfect vs Mismatch. For each oligonucleotide there is a perfect match and a mismatch. The perfect match is a sub-sequence of the true sequence. The mismatch is a sub-sequence with a 'central' base-pair replaced.

    23. Affymetrix Analysis. The mismatch is designed to measure 'background'. The signal from each sub-sequence is $I_{\text{perfect match}} - I_{\text{mismatch}}$. Twenty of these sub-sequences are present. The average of all these signals is taken.
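
The averaging described on this slide is short enough to write out directly. This is a minimal sketch with made-up intensities; the function name `average_difference` is chosen for the example and only four probe pairs are used instead of twenty.

```python
def average_difference(perfect_match, mismatch):
    """Mean of (perfect match - mismatch) over the probe pairs of one gene."""
    if len(perfect_match) != len(mismatch):
        raise ValueError("each perfect match needs exactly one mismatch")
    diffs = [pm - mm for pm, mm in zip(perfect_match, mismatch)]
    return sum(diffs) / len(diffs)


# Four illustrative probe pairs for one gene.
pm = [120.0, 150.0, 90.0, 200.0]
mm = [100.0, 110.0, 95.0, 150.0]
signal = average_difference(pm, mm)
print(signal)
```

The third pair has a mismatch intensity above the perfect match, so its difference is negative and pulls the average down; this is the situation raised on the next slide.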

    24. Problems. Sometimes $I_{\text{mismatch}} > I_{\text{perfect match}}$. Solution: set it to 20??!!! Other issues: Present/Absent call, based on the number of signals > 0. Proprietary technology: you don't know what the sub-sequences are. Apparently this is changing!

    25. Current Position. I missed their recent talk! Apparently the analysis is much improved. They are now giving more information away.

    26. Scaling Factors – Maximum Likelihood Estimation. The data produced are still affected by undesirable variations that we need to remove. We can assume that the variations are primarily multiplicative (no intensity-dependent or print-tip effect): observed expression level = true expression level × error (chip variations) × random noise (biological noise).

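    27. Model Assumption. Organise the twelve values from three exogenous control species in a matrix $X$ of size $N_{\text{controls}} \times N_{\text{chips}}$. Error model: $x_{ij} = m_i r_j \epsilon_{ij}$. Here $m_i$ is associated with each control and $r_j$ is associated with each chip or experiment. Taking logs we have: $\log x_{ij} = \log m_i + \log r_j + \log \epsilon_{ij}$.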

    28. Scaling Factors. Calculating scaling factors using maximum likelihood estimation of the model parameters. Likelihood: Estimates are calculated by solving: Scaling factors are thus:
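
One way to make this estimation concrete is the following sketch. It is not necessarily the lecture's derivation and rests on three stated assumptions. First, each control intensity is the product of a control effect m_i, a chip effect r_j and noise. Second, the noise is Gaussian in log space with the same variance everywhere, so maximum likelihood reduces to least squares. Third, the chip effects are pinned down by requiring their logs to sum to zero. The scaling factor for chip j is then taken to be 1 / r_j. All function and variable names are hypothetical.

```python
import math


def estimate_scaling_factors(x):
    """x[i][j] is the intensity of control i on chip j; returns (m, r, s)."""
    n_controls, n_chips = len(x), len(x[0])
    logs = [[math.log(v) for v in row] for row in x]
    grand = sum(sum(row) for row in logs) / (n_controls * n_chips)
    row_mean = [sum(row) / n_chips for row in logs]
    col_mean = [sum(logs[i][j] for i in range(n_controls)) / n_controls
                for j in range(n_chips)]
    # Least-squares solution under the constraint sum_j log r_j = 0:
    # log m_i is the row mean, log r_j is the column mean minus the grand mean.
    m = [math.exp(v) for v in row_mean]
    r = [math.exp(c - grand) for c in col_mean]
    s = [1.0 / v for v in r]  # assumed definition: scaling factor undoes r_j
    return m, r, s


# Noise-free toy data built from known control and chip effects.
true_m = [100.0, 1000.0, 10000.0]
true_r = [0.5, 1.0, 2.0]  # logs sum to zero, matching the constraint
x = [[mi * rj for rj in true_r] for mi in true_m]
m, r, s = estimate_scaling_factors(x)
print(r, s)
```

Multiplying every intensity on chip j by s[j] puts the chips on a common scale under these assumptions; with noisy data the estimates are averages over the controls rather than exact recoveries.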

    29. You Should Know. The central dogma (gene expression). cDNA chip overview. Noise in cDNA chips. Affymetrix GeneChip overview.

    30. Conclusions. Next week: guest lecture, Dr Pen Rashbass. Week after: analysis of the microarray data.
