

A Review/Update of the ERCC & MAQC Microarray Consortia and Some Applications of Their Findings

“Expressionist” Seminar Group

Johns Hopkins School of Public Health

Ernest S. Kawasaki

NCI Advanced Technology Center

Microarray Facility

August 9, 2006


ERCC Summary/Update

External RNA Controls Consortium

MAQC Summary/Update

MicroArray Quality Control Consortium

Possible Use of ERCC/MAQC Standards & Large Data Set


Organizations/Consortia Developing Standards & Controls for Gene Expression Profiling Technologies

  • MGED -- Microarray Gene Expression Database
      • Standard for data reporting (MIAME)
  • MAQC -- Microarray Quality Control Group
      • FDA-sponsored RNA standards, reference datasets, etc. (Leming Shi et al.)
  • ERCC -- External RNA Controls Consortium
      • (M. Salit, J. Warrington et al.)
  • NIST -- Metrology for Gene Expression Program
      • Provides a better understanding of the fundamentals of microarray technologies (M. Salit, M. Satterfield et al.)

Rapid Increase in Microarray Publications

2005 -- The 10 Year Anniversary of the First Expression Microarray


No common standards are used across platforms so data are difficult or impossible to compare.

[Figure: number of microarray publications per year (y-axis ticks from 2000 to 5000). Yearly summary from PubMed.]


Proliferation of Whole Genome Arrays

Company            Probe length   Probe sets   Notes
ABI                60mer          31,000
Affymetrix         25mer          54,000
Agilent            60mer          44,000
GE Amersham        30mer          55,000
Illumina           50mer          46,000
Microarrays Inc.   70mer          49,000
NimbleGen          60mer          38,000
Phalanx Biotech    60mer          ~30,000
Home Brew          70mer          ~40,000      cDNA

Etc. Many other companies (e.g., Combimatrix) are making smaller custom arrays.

A DNA-DNA hybrid occupies ~4 nm² on the slide surface.

ERCC: External RNA Controls Consortium

Conception in March 2003, Stanford University

The Private, Public, and Academic sectors working together to produce control materials for gene expression analysis.

Mark Salit NIST/ Janet Warrington Affymetrix

Mission of the ERCC

The ERCC is developing external RNA controls useful for gene expression assays in microarrays & QRT-PCR on a wide variety of platforms.

J. Warrington -- Affymetrix


Members of the ERCC

More than 70 and counting…. A good mix of academic, government and commercial organizations, with ~115 scientists from 10 countries.






  • Applied Biosystems
  • Cambridge University
  • Capital Bio
  • Celera Diagnostics
  • Centers for Disease Control
  • Centers for Medicare & Medicaid Services
  • Clinical & Laboratory Standards
  • Clinical Hospital Center Zagreb
  • Eli Lilly
  • Eppendorf Microarray Division
  • Expression Analysis
  • GE Healthcare
  • Genetics Society of Vietnam
  • Harvard University
  • Informax, Inc.
  • International Federation of Clinical Chemistry & Laboratory Medicine
  • Johns Hopkins University
  • Lawrence Livermore Lab
  • Marine Molecular Quality Controls
  • Mayo Clinic
  • National Institute of Standards & Technology
  • NIH, National Cancer Institute
  • Queens University Hospital
  • Roche Molecular Systems
  • Stanford University
  • Tokyo University
  • University Health Network
  • US Department of Agriculture
  • Veridex, Johnson & Johnson
  • Etc, etc, etc

J. Warrington -- Affymetrix

The ERCC is producing standardized expression controls, analysis tools and protocols
  • Well-characterized, widely accepted RNA standard controls for multiple platforms
    • Certified Reference Material (CRM)
  • Protocols for multiple applications, research and the clinical laboratory (CLSI – Clinical & Laboratory Standards Institute). Approved July 2006!
  • Software tools to support development work
  • Software tools to support multiple applications

J. Warrington -- Affymetrix

Control Sequences June 2006

L. Reid -- Expression Analysis, J. Warrington -- Affymetrix


Testing Strategy for RNA Controls

  • Design and development -- generate reagents -- ~100 in place w/70 sequenced
  • Prototype testing -- validate reagents
  • Proof of concept -- validate the assays
  • Functional testing -- validate the product
  • Performance review -- analyze all data
  • Testing begins in the 4th quarter. L. Reid et al

Uses of RNA Controls/Standards

  • Negative Controls
      • Determine “true” background
      • QC for slide quality, hybridization, etc.
  • Positive Controls
      • QC as above
      • Labeling efficiency
      • Dilution series: determine sensitivity of assay and the lowest concentration with a reliable signal
      • Ratiometric series: normalization tool
  • Will allow better comparison of intra- or inter-lab data, with the same or different array platforms.

Tests for Validation of ERCC Controls

  • Negative control test – background studies
  • Cross-hybridization – determine if any of the controls hybridize to each other or to mRNAs
  • Labeling test – determine efficiency in the presence of a complex RNA sample
  • Latin square – test controls over a range of concentrations (1:5,000,000 to 1:1000)
  • Linear range test and ratiometric studies
  • The above studies will require ~102 arrays per site!

Latin Squares Design for Testing Controls

A1 – A4 = the 4 arrays used

G1 – G4 = the 4 transcripts being studied

L1 – L4 = the 4 concentrations of each transcript

L. Reid, BMC Genomics 6:150
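As a rough illustration of the design described above, the sketch below builds a 4 x 4 Latin square in which each transcript (G1 to G4) appears at each concentration level (L1 to L4) exactly once across the four arrays (A1 to A4). The cyclic assignment is an assumption made for illustration; the actual layout used in the cited paper may differ.

```python
def latin_square(n):
    """Return an n x n cyclic Latin square with entries 1..n."""
    return [[(row + col) % n + 1 for col in range(n)] for row in range(n)]


arrays = ["A1", "A2", "A3", "A4"]
transcripts = ["G1", "G2", "G3", "G4"]
square = latin_square(4)

# design[array][transcript] -> concentration level assigned on that array
# (cyclic assignment is hypothetical, chosen only to satisfy the Latin property)
design = {
    array: {gene: f"L{square[i][j]}" for j, gene in enumerate(transcripts)}
    for i, array in enumerate(arrays)
}

for array in arrays:
    print(array, design[array])
```

Every row and every column of `square` contains each level once, so concentration effects are balanced across both arrays and transcripts.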


ERCC Test Sites

  • >100 Arrays/Site for Validating Controls
  • Affymetrix
  • GE Healthcare
  • Illumina
  • Novartis
  • Qiagen
  • Agilent, ABI, Roche maybe

The MAQC Project

  • MicroArray Quality Control
  • An FDA-sponsored consortium (Leming Shi)
  • Founded to address concerns of the microarray community about the reproducibility of expression profiling experiments.
  • Group consists of over 140 members from academia, government, pharma & biotech.
  • A large study was designed to compare expression data from 10 different platforms and 40 different test sites with >650 arrays.
  • Study has been completed and results will be published in Nature Biotechnology. Data will be released next month.

MAQC Study Goals/Exptl. Design

  • Establish a set of reference standards for use in the MAQC, but more importantly for the array community
  • Generate a large collection of reference data sets using multiple microarray platforms and many different labs….
  • .
  • .
  • Promote the use of reference RNA samples…..
  • Make recommendations on the appropriate uses of microarray technology.
  • The MAQC group first tested multiple RNAs with 160 arrays and then chose two for titration studies with 200 arrays. Two RNAs at two concentrations were chosen for repeated (5 arrays per sample) assays for four pools. The samples were Universal Human Reference RNA (UHRR) from Stratagene and Human Brain Reference RNA (HBRR) from Ambion. The four pools were:
      • A. 100% UHRR
      • B. 100% HBRR
      • C. 75% UHRR : 25% HBRR
      • D. 25% UHRR : 75% HBRR
  • At the completion of this study there are data from over 1026 arrays!
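One way to see why the titration pools are useful: if signal were strictly proportional to transcript abundance, the signal for a gene in pools C and D would be a weighted average of its signals in pools A and B. The sketch below works through that idealized case. The linear-response assumption, the function name, and the example signal values are all hypothetical and not taken from the MAQC data.

```python
def expected_pool_signal(signal_a, signal_b, fraction_a):
    """Expected signal in a mixture, assuming signal is linear in abundance.

    fraction_a is the fraction of sample A (UHRR) in the pool; the rest is
    sample B (HBRR).
    """
    return fraction_a * signal_a + (1.0 - fraction_a) * signal_b


# Hypothetical signals for one gene in the pure samples (arbitrary units)
signal_a = 1000.0  # pool A: 100% UHRR
signal_b = 200.0   # pool B: 100% HBRR

pool_c = expected_pool_signal(signal_a, signal_b, 0.75)  # 75% UHRR : 25% HBRR
pool_d = expected_pool_signal(signal_a, signal_b, 0.25)  # 25% UHRR : 75% HBRR

print(pool_c, pool_d)
```

Under these assumptions a gene higher in A than in B should follow the order A > C > D > B, which gives a built-in check on whether a platform preserves the expected trend.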

Platforms Used In MAQC Study

ABI (Applied Biosystems) One-Color Array 32,878 Probes

AFX (Affymetrix) One-Color Array 54,675 Probes

AGL (Agilent) Two-Color Array 43,931 Probes

AGI (Agilent) One-Color Array 43,931 Probes

CBC (CapitalBioCorp) One & Two Color 23,231 Probes

EPP (Eppendorf) One-Color Array 294 Probes

GEH (GE Healthcare) One-Color Array 54,359 Probes

ILM (Illumina) One-Color Array 47,293 Probes

NCI (NCI-Operon) Two-Color Array 37,632 Probes

TCI (TeleChem Int) One & Two Color 27,648 Probes

TAQ (Applied Biosystems) TaqMan® Assays 1,004 PCRs

QGN (Panomics) QuantiGene Assay 245 Probes

GEX (GeneExpress) StaRT-PCR™ Assay 205 Probes



12,091 genes used for comparison across all platforms.

(Damir Herman, Jean Thierry-Mieg)


Take Home Messages/General Findings From MAQC Study

  • Large data sets are available for objectively assessing platform performance and various data analysis algorithms.
  • Microarray technology is reproducible and reliable when one has an understanding of its limitations.
  • Cross-platform analyses require very careful annotation & mapping of probe sequences.
  • All the platforms had good intra-lab repeatability, and inter-lab reproducibility after removal of outliers.
  • Methods of microarray analysis are an important variable, and this large data set will help resolve issues in this area (statisticians and bioinformaticists take delight……..)

Manuscripts in MAQC Study -- Entire Issue of Nature Biotechnology, Sept. 2006

  • Editorial
  • FDA Foreword
  • Stanford - Data quality in genomics and microarrays
  • Impact of microarray data quality in genomic data submissions to the FDA
  • US EPA efforts to develop a framework for using genomics data in risk assessment and regulatory decision making.
  • MAQC main manuscript – overall description
  • The reproducibility of differentially expressed gene lists in microarray studies*
  • An analysis and comparison of alternative platforms
  • Use of RNA titrations to assess platform performance
  • Performance of one-color vs two-color arrays

MAQC Manuscripts (cont.)

  • External RNA controls for assessment of microarray analytical performance
  • Normalization and technical variation in gene expression measurements*
  • Toxicogenomics and microarrays: biological response measurements are preserved across platforms
  • Reproducibility probability score: A metric incorporating measurement variability across labs for gene comparison*
  • Late news: 9 manuscripts submitted and 6 were accepted. With 3 commentaries there are 9 articles in the Sept. Nature Biotechnology Suppl. from the MAQC.

With proper use of negative and positive controls, microarrays may be used to identify, quantitate expression and count the absolute number of genes being expressed in any given cell or tissue sample.


aka ESK

Nature May 25, 2006


Present (P) & Absent (A) Calls in Spotted Long Oligo Arrays

  • “Average” cell expresses <10,000 genes.
  • “Whole” genome array contains >25,000 genes.
  • Therefore, Present calls should be 40% or less (that is, 60% or more Absent).
  • However, P calls are usually 90% or more using the usual image analysis systems like GenePix.
  • Why is this? Why do we care?
  • Good negative controls may resolve this issue.
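The arithmetic behind the 40% figure, plus one possible way negative controls could drive Present/Absent calls, can be sketched as follows. The calling rule (signal above the negative-control mean plus k standard deviations), the value k = 2, and all signal values are assumptions for illustration only, not the method used in the talk.

```python
import statistics

# Upper bound on the expected Present fraction, from the slide's round numbers
expressed_genes = 10_000
genes_on_array = 25_000
expected_present = expressed_genes / genes_on_array  # 0.4, i.e. 40%


def present_calls(signals, negative_controls, k=2.0):
    """Call a probe Present if its signal exceeds mean + k*SD of the
    negative controls (hypothetical rule, for illustration)."""
    threshold = (statistics.mean(negative_controls)
                 + k * statistics.stdev(negative_controls))
    return [signal > threshold for signal in signals]


# Hypothetical intensities (arbitrary units)
negative_controls = [480, 510, 495, 520, 505]
signals = [150, 450, 700, 5600]

calls = present_calls(signals, negative_controls)
print(expected_present, calls)
```

With these made-up numbers, the two probes below the negative-control level are called Absent even though both would sit above a local background of 100 to 200 units.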

What is Background?

Articles are still being written about how to determine “true” background. Controls can be used to settle this issue.

Internal Background

External Background

Li et al (2005) Bioinformatics 21:2875


Common Methods for Background Subtraction

W Yin et al (2005) Bioinformatics 21:2410


Use of Negative Controls for Background Subtraction

Internal Background ~ 500-1000 units

External Background ~ 100-200 units

%Present using external = 96% (21,565/22,464)

%Present using internal = 77% (17,010/22,464)

Internal background subtraction eliminated 4,555 genes from further analysis. Good or bad??

Use of negative controls can dramatically change values for % genes expressed and gene expression ratios!
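The counts quoted above can be checked with a few lines of arithmetic. The sketch below uses only the numbers from the slide; the variable names are illustrative. Note that 17,010/22,464 works out to roughly 75.7%, slightly below the 77% shown.

```python
total_probes = 22_464
present_external = 21_565  # Present calls with external background
present_internal = 17_010  # Present calls with internal background

pct_external = 100 * present_external / total_probes
pct_internal = 100 * present_internal / total_probes

# Genes that drop out when the higher internal background is subtracted
eliminated = present_external - present_internal

print(f"external: {pct_external:.1f}%  internal: {pct_internal:.1f}%  "
      f"eliminated: {eliminated}")
```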




[Figure labels: Negative Control; External Background]

Negative Controls & Background


Signal distribution of noise background (B), negative control background (median)(neg) and mean intensities of all probes (F) on the slide separated by Cy5 and Cy3 channels


Influence of Type of Background Subtraction on Expression Ratios

  • Assume the control sample gene has a signal of 600 units.
  • The experimental sample has a signal of 5600 for the same gene.
  • The external background is 100 units.
  • Therefore, the calculated ratio would be 11: (5600 - 100)/(600 - 100) = 5500/500 = 11
  • But if the negative control background is 500, the ratio is now 51: (5600 - 500)/(600 - 500) = 5100/100 = 51
  • Use of negative controls as background may relieve some of the “compression” in ratios for these types of arrays and give a more accurate expression value.
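The worked example above can be reproduced with a small helper. This is a minimal sketch: the function name and the guard against a non-positive corrected control signal are additions for illustration, while the numbers come from the slide.

```python
def background_corrected_ratio(experimental, control, background):
    """Return (experimental - background) / (control - background)."""
    corrected_control = control - background
    if corrected_control <= 0:
        # Control signal is at or below background: ratio is undefined
        raise ValueError("control signal does not exceed background")
    return (experimental - background) / corrected_control


control_signal = 600
experimental_signal = 5600

ratio_external = background_corrected_ratio(
    experimental_signal, control_signal, background=100)
ratio_negative = background_corrected_ratio(
    experimental_signal, control_signal, background=500)

print(ratio_external, ratio_negative)
```

The same pair of raw signals gives a ratio of 11 or 51 depending only on the background estimate, which is why the choice of background matters most for low-intensity genes.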

Box plots of CV (data are loess normalized; one set with negative-control background subtraction, another set without). This figure shows that background subtraction could improve data quality.

Box plot key (x-axis positions 1 to 12): 1. jurkat; 2. jurkat_neg; 3. L428l; 4. L428_neg; 5. lncap; 6. lncap_neg; 7. mcf; 8. mcf_neg; 9. oci; 10. oci_neg; 11. sud; 12. sud_neg


Perfect Match (PM) and Mismatch (MM): The Affy Image Quantitation Methods

GCOS (Gene Chip Operating System): default Affy analysis software.

RMA (Robust Multiarray Average): Irizarry method using only PM signals.

GCRMA: Similar to RMA but takes into account GC content

dChip: Similar to GCOS but offers options to analyze with or without MM.

YW Chip: The Yonghong Wang method. PM only with only sequence validated oligos used in analysis.


Influence of Different Methods of Background Subtraction

Correlation Between 2 Technical Replicates – Affy Chips


No background subtraction

MAS background subtraction

RMA background subtraction


PM Only vs PM-MM Analysis of Technical Replicates

[Figure: PM Only. Panels: Log2 Intensities; Mean Values Intensities; S.D. Dist. Probesets, 4 Reps]








Correlation Study of Gene With Absent Calls

Genes here were called absent by GCOS in 8 hybs from 2 technical replicates. The data indicate that absent calls may not be truly absent in many cases.


The MM Probes: C or T at 13th Position May Result in Artefactual High Signal: 92% of All MM with Higher Signal Than PM have C or T


Analysis of Probe Sequences Within Probe Sets in Affy Gene Chip

# of “Correct” or Mapped Oligos/Probe Set    # Probe Sets in Each Category
1                                            692
2                                            514
3                                            433
4                                            450
5                                            425
6                                            499
7                                            626
8                                            862
9                                            1608
10                                           3771
11                                           36562
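To put the table in proportion, the sketch below totals the probe sets listed and computes what share have all 11 oligos mapped and what share have 5 or fewer. It uses only the counts in the table; whether probe sets with zero mapped oligos exist but are not listed is unknown, so the total here is an assumption limited to the rows shown.

```python
# mapped oligos per probe set -> number of probe sets (from the table above)
probe_set_counts = {
    1: 692, 2: 514, 3: 433, 4: 450, 5: 425, 6: 499,
    7: 626, 8: 862, 9: 1608, 10: 3771, 11: 36562,
}

total_probe_sets = sum(probe_set_counts.values())

# Share of probe sets in which all 11 oligos are mapped
fraction_fully_mapped = probe_set_counts[11] / total_probe_sets

# Share of probe sets with 5 or fewer mapped oligos
fraction_five_or_fewer = sum(
    count for mapped, count in probe_set_counts.items() if mapped <= 5
) / total_probe_sets

print(total_probe_sets,
      f"{fraction_fully_mapped:.1%}",
      f"{fraction_five_or_fewer:.1%}")
```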


How The ERCC & MAQC Can Increase The Reliability/Acceptance of Microarray Data

  • A set of controls used by all expression platforms will go a long way to end confusion about comparability of data from related experiments.
  • Probe mapping and sequences from all platforms will be extremely useful for cross-platform comparisons.
  • Very large data set from all major platforms will point out problem areas in present protocols/technologies, which, hopefully, will result in their improvement.
  • Large data sets from ERCC and MAQC combined will provide a great resource for critically evaluating algorithms used in analyzing arrays. Which analysis method provides “true” answers?
  • Hopefully, a (workable) consensus about utilization of microarray technologies will arise from these two large exercises in (sometimes a bit contentious) human scientific cooperation.

In Closing…….

My attempt at being funny…..


Is your back to the wall? Are you under a lot of pressure?


Do you feel you’re on the Treadmill of Life?

Moebius Strip II by M.C. Escher

Nature vol. 246, p776, 2003


Keep on smilin’, ’cause when you’re smilin’, the whole world smiles with you……

100 nm

Nano Smiley DNAs --- Many Happy Genomes

Courtesy P. Rothemund, Nature 440:297 (2006)


Thank you all ------

ERCC & MAQC Consortia

ATC Microarray Lab Crew

YW for Analysis & AP for Chip Data