Gene Expression BMI 731 Winter 2005



Gene Expression BMI 731 Winter 2005

Catalin Barbacioru

Department of Biomedical Informatics

Ohio State University


Thesis: the analysis of gene expression data is going to be big in 21st century statistics

Many different technologies, including

Spotted DNA arrays (Brown/Botstein)

Short oligonucleotide arrays (Affymetrix)

Serial analysis of gene expression (SAGE)

Long oligo arrays (Agilent)

Fibre optic arrays (Illumina)

[Figure: total microarray articles indexed in Medline; number of papers per year, 1995 to 2001]


Common themes

  • Parallel approach to collection of very large amounts of data (by biological standards)

  • Sophisticated instrumentation, requires some understanding

  • Systematic features of the data are at least as important as the random ones

  • Often more like industrial process than single investigator lab research

  • Integration of many data types: clinical, genetic, molecular … databases


Central dogma

  • The expression of the genetic information stored in the DNA molecule occurs in two stages:

  • (i) transcription, during which DNA is transcribed into mRNA;

  • (ii) translation, during which mRNA is translated to produce a protein.

  • DNA → mRNA → protein

  • Other important aspects of gene regulation: methylation, alternative splicing.
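The two stages above can be illustrated with a toy sketch. It assumes the input is the DNA coding strand (so transcription reduces to replacing T with U) and uses a deliberately partial codon table covering only the codons in the example; the sequence itself is invented for illustration.

```python
# Partial codon table: only the codons used below (an illustrative subset,
# not the full genetic code). None marks a stop codon.
CODON_TABLE = {"AUG": "M", "UUU": "F", "GGC": "G", "AAA": "K", "UAA": None}


def transcribe(coding_strand):
    """DNA coding strand -> mRNA: same sequence with T replaced by U."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna):
    """Read codons from position 0 until a stop codon or the end of the mRNA."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid is None:  # stop codon reached
            break
        protein.append(amino_acid)
    return "".join(protein)


# Invented example sequence: DNA -> mRNA -> protein
mrna = transcribe("ATGTTTGGCAAATAA")
protein = translate(mrna)
print(mrna, protein)
```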


Idea: measure the amount of mRNA to see which genes are being expressed in (used by) the cell.

Measuring protein might be better, but is currently harder.


  • DNA microarrays represent an important new method for determining the complete expression profile of a cell.

  • Monitoring gene expression lies at the heart of a wide variety of medical and biological research projects, including classifying diseases, understanding basic biological processes, and identifying new drug targets.


Affymetrix® Instrument System

Platform for GeneChip® Probe Arrays

  • Integrated

  • Exportable

  • Easy to use

  • Versatile

Photolithography

Synthesis of Ordered Oligonucleotide Arrays

[Figure: stepwise synthesis of ordered oligonucleotide arrays; labels show T and C nucleotide additions]


Affymetrix GeneChip arrays

GeneChip® Probe Arrays

[Figure: GeneChip probe array. Labels: single-stranded, labeled RNA target; oligonucleotide probe; hybridized probe cell holding millions of copies of a specific oligonucleotide probe; >200,000 different complementary probes; image of hybridized probe array]


Analysis of expression level from probe sets

Each pixel is quantitated and integrated for each oligo feature (range 0-25,000)

Perfect Match (PM)

Mismatch (MM) control

log(PM / MM) = difference score

All significant difference scores are averaged to create “average difference” = expression level of the gene.
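The probe-set calculation described above can be sketched as follows. The slide does not say which logarithm is used or what makes a difference score "significant", so the natural log and the `min_score` threshold below are assumptions, and the intensities are invented for illustration.

```python
import math


def average_difference(pm, mm, min_score=0.0):
    """Average the per-probe log(PM / MM) scores that exceed min_score.

    min_score is a hypothetical stand-in for the slide's unspecified
    "significant" criterion; natural log is likewise an assumption.
    """
    scores = [math.log(p / m) for p, m in zip(pm, mm) if p > 0 and m > 0]
    significant = [s for s in scores if s > min_score]
    if not significant:
        return 0.0  # no probe pair passed the threshold
    return sum(significant) / len(significant)


# Invented intensities for one probe set (scanner range 0-25,000).
pm = [1200.0, 950.0, 400.0, 2100.0]
mm = [300.0, 475.0, 400.0, 700.0]
expression = average_difference(pm, mm)
print(round(expression, 3))
```

Probe pairs where PM does not exceed MM contribute nothing here; a different significance rule would change which scores enter the average.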


Analysis of expression level from probe sets

  • each oligo sequence (20-25 mer) is synthesized as a 20 µm square (feature)

  • each feature contains > 1 million copies of the oligo

  • scanner resolution is about 2 µm (pixel)

  • each gene is quantitated by 16-20 oligos and compared to an equal number of mismatched controls

  • 22,000 genes are evaluated with 20 matching oligos and 10 mismatched oligos = 480,000 features/chip

  • 480,000 features are photolithographically synthesized onto a 2 x 2 cm glass substrate
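The geometry quoted above can be checked with a little arithmetic. This sketch assumes features tile the whole substrate edge to edge with no gaps or border, which is an idealization and not something the slide states.

```python
FEATURE_UM = 20     # feature edge length in micrometres, from the slide
PIXEL_UM = 2        # scanner pixel size in micrometres, from the slide
CHIP_UM = 20_000    # 2 cm substrate edge expressed in micrometres

# Each feature is scanned as a square block of pixels.
pixels_per_feature = (FEATURE_UM // PIXEL_UM) ** 2

# Upper bound on features if they tiled the full 2 x 2 cm area (assumption).
features_per_side = CHIP_UM // FEATURE_UM
max_features = features_per_side ** 2

print(pixels_per_feature)  # 100
print(max_features)        # 1000000
```

Under that idealization, roughly 100 pixels are integrated per feature, and the 480,000 features quoted fit within the one million positions available.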


Affymetrix arrays

  • Global views of gene expression are often essential for obtaining comprehensive pictures of cell function.

  • For example, it is estimated that between 0.2% and 10% of the 10,000 to 20,000 mRNA species in a typical mammalian cell are differentially expressed between cancer and normal tissues.

  • Whole-genome analyses also benefit studies where the end goal is to focus on small numbers of genes, by providing an efficient tool to sort through the activities of thousands of genes, and to recognize the key players.

  • In addition, monitoring multiple genes in parallel allows the identification of robust classifiers, called "signatures", of disease.

  • Global analyses frequently provide insights into multiple facets of a project. A study designed to identify new disease classes, for example, may also reveal clues about the basic biology of disorders, and may suggest novel drug targets.


Spotted DNA microarrays

  • In "spotted" microarrays, slides carrying spots of target DNA are hybridized to fluorescently labeled cDNA from experimental and control cells, and the arrays are imaged at two or more wavelengths.

  • Expression profiling involves the hybridization of fluorescently labeled cDNA, prepared from cellular mRNA, to microarrays carrying thousands of unique sequences.

  • Typically, a set of target DNA samples representing different genes is prepared by PCR and transferred to a coated slide to form a 2-D array of spots with a center-to-center distance (pitch) of about 200 μm, providing a pan-genomic profile in an area of 3 cm² or less.

  • cDNA samples from experimental and control cells are labeled with different color fluors (the cyanine dyes Cy5 and Cy3) and hybridized simultaneously to microarrays, and the relative levels of mRNA for each gene are then determined by comparing red and green signal intensities.
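Comparing red and green intensities is commonly expressed as a log ratio per spot. The sketch below assumes base-2 logs and already background-corrected, positive intensities; the gene names and values are invented for illustration.

```python
import math


def log_ratio(red, green):
    """Return log2(red / green) for one spot.

    Positive values mean the red channel is stronger, negative values
    mean the green channel is stronger, and 0 means equal signal.
    """
    if red <= 0 or green <= 0:
        raise ValueError("intensities must be positive")
    return math.log2(red / green)


# Invented (red, green) intensities for three spots.
spots = {
    "geneA": (8000.0, 2000.0),
    "geneB": (1500.0, 1500.0),
    "geneC": (500.0, 4000.0),
}
ratios = {gene: log_ratio(r, g) for gene, (r, g) in spots.items()}
print(ratios)
```

Taking logs makes the scale symmetric: a four-fold increase and a four-fold decrease sit at +2 and -2 instead of at 4 and 0.25.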


Spotted DNA microarrays

Scanning Technology

  • Microarray slides are imaged with a modified fluorescence microscope designed for scanning large areas at high resolution (arrayWoRx, Applied Precision, Issaquah, WA, Affymetrix).

  • Fluorescence illumination is obtained from a metal halide arc lamp focused onto a fiber optic bundle, the output of which is directed at the microarray slide, and emission is recorded through a microscope objective (Nikon) onto a cooled CCD (charge-coupled device) camera.

  • Interference filters are used to select the excitation and emission wavelengths corresponding to the Cy3 and Cy5 fluorescent probes (Amersham Pharmacia).

  • Each image covers a 2.4 x 2.4 mm area of the slide at 5-μm resolution. To scan the entire microarray, a series of images ("panels") is acquired by moving the slide under the microscope objective in 2.4-mm increments.
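The panel figures above imply a fixed image size and a simple tiling count. This sketch assumes non-overlapping panels; the 12 x 24 mm array region in the example is hypothetical, chosen only to stay under the 3 cm² mentioned earlier.

```python
PANEL_UM = 2400      # panel edge (2.4 mm) in micrometres, from the slide
RESOLUTION_UM = 5    # pixel size in micrometres, from the slide

# Pixels along one edge of a single panel image.
pixels_per_side = PANEL_UM // RESOLUTION_UM


def panels_needed(width_um, height_um, step_um=PANEL_UM):
    """Number of non-overlapping panels needed to tile a rectangular region."""
    cols = -(-width_um // step_um)   # ceiling division on integers
    rows = -(-height_um // step_um)
    return cols * rows


# Hypothetical 12 mm x 24 mm spotted region (2.88 cm^2).
print(pixels_per_side)                 # 480
print(panels_needed(12_000, 24_000))   # 50
```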

http://www.bio.davidson.edu/courses/genomics/chip/chip.swf


The red/green ratios can be spatially biased

Top 2.5% of ratios red, bottom 2.5% of ratios green
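The coloring rule in the caption above can be sketched as a rank-based labeling. The function name and the `fraction` parameter are hypothetical, and the input values are invented; plotting the labels at each spot's position on the slide would then show whether extreme ratios cluster spatially.

```python
def flag_extremes(log_ratios, fraction=0.025):
    """Label each spot 'red', 'green', or 'none' by the rank of its ratio.

    The top `fraction` of ratios are labeled 'red' and the bottom
    `fraction` are labeled 'green'.
    """
    n = len(log_ratios)
    k = int(n * fraction + 1e-9)  # spots per tail; epsilon guards float rounding
    order = sorted(range(n), key=lambda i: log_ratios[i])
    labels = ["none"] * n
    for i in order[:k]:
        labels[i] = "green"       # lowest ratios
    for i in order[n - k:]:
        labels[i] = "red"         # highest ratios
    return labels


# Invented log ratios for 40 spots: one spot falls in each 2.5% tail.
example = [float(i - 20) for i in range(40)]
labels = flag_extremes(example)
print(labels.count("red"), labels.count("green"))
```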


Spotted vs. Affymetrix Arrays

Affymetrix strengths:

- highly reliable: synthesized in situ

- highly reproducible from run to run

- no clone maintenance or ‘drift’

- sealed fluidics and controlled temperature

- standardized chips increase database power

- excellent scanner

- complex, but very reliable labelling

- excellent cost/benefit ratio

- amenable to mutation and SNP detection


Affymetrix weaknesses/limitations

  • not easily customized: $300K/chip

  • high labeling cost $170/chip

  • high per chip cost $350 to $1850

  • limited choice of species

  • requires knowledge of sequence

  • not designed for competitive protocols


Limitations to all microarrays

  • dynamic range of gene expression: very difficult to simultaneously detect low- and high-abundance genes accurately

  • each gene has multiple splice variants: 2 splice variants may have opposite effects (e.g. trk); arrays can be designed for splicing, but complexity increases 5X

  • translational efficiency is a regulated process: mRNA level does not necessarily correlate with protein level

  • proteins are modified post-translationally: glycosylation, phosphorylation, etc.

  • pathogens might have little 'genomic' effect

[Figure: analysis workflow]

Biological question (differentially expressed genes, sample class prediction, etc.) → Experimental design → Microarray experiment → 16-bit TIFF files → Image analysis → (Rfg, Rbg), (Gfg, Gbg) → R, G → … → Biological verification and interpretation
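The image-analysis step in this workflow yields foreground and background intensities per channel, (Rfg, Rbg) and (Gfg, Gbg), which are reduced to a single R and G per spot. The slide does not say how; the sketch below assumes plain background subtraction with a small positive floor, which is one common choice rather than the method used in the course. Function names and values are hypothetical.

```python
def background_correct(fg, bg, floor=1.0):
    """Subtract local background from foreground, clamped to a positive floor.

    The floor (an assumption) keeps later log ratios defined when the
    background exceeds the foreground.
    """
    return max(fg - bg, floor)


def spot_intensities(rfg, rbg, gfg, gbg):
    """Reduce (Rfg, Rbg), (Gfg, Gbg) to background-corrected (R, G)."""
    r = background_correct(rfg, rbg)
    g = background_correct(gfg, gbg)
    return r, g


# Invented 16-bit pixel summaries for two spots.
print(spot_intensities(5000, 1000, 1200, 200))  # (4000, 1000)
print(spot_intensities(100, 300, 500, 100))     # (1.0, 400)
```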
