slide1

A Review/Update of the ERCC & MAQC Microarray Consortia and Some Applications of Their Findings

“Expressionist” Seminar Group

Johns Hopkins School of Public Health

Ernest S. Kawasaki

NCI Advanced Technology Center

Microarray Facility

August 9, 2006

slide2

ERCC Summary/Update

External RNA Controls Consortium

MAQC Summary/Update

MicroArray Quality Control Consortium

Possible Use of ERCC/MAQC Standards & Large Data Set

slide3

Organizations/Consortia Developing Standards & Controls for Gene Expression Profiling Technologies

  • MGED -- Microarray Gene Expression Database
      • Standard for data reporting (MIAME)
  • MAQC -- Microarray Quality Control Group
      • FDA-sponsored RNA standards, reference datasets, etc. (Leming Shi et al)
  • ERCC -- External RNA Controls Consortium
      • (M. Salit, J. Warrington et al)
  • NIST -- Metrology for Gene Expression Program
      • Provides a better understanding of the fundamentals of microarray technologies
      • (M. Salit, M. Satterfield, et al)
slide4

Rapid Increase in Microarray Publications

2005 -- The 10 Year Anniversary of the First Expression Microarray

No common standards are used across platforms so data are difficult or impossible to compare.

[Bar chart: number of publications per year, yearly summary from PubMed]

1995-8    55
1999      140
2000      425
2001      1125
2002      2000
2003      3110
2004      4350
2005      5400!

slide5

Proliferation of Whole Genome Arrays

ABI                60mer   31,000 Probe Sets
Affymetrix         25mer   54,000 Probe Sets
Agilent            60mer   44,000 Probe Sets
GE Amersham        30mer   55,000 Probe Sets
Illumina           50mer   46,000 Probe Sets
Microarrays Inc.   70mer   49,000 Probe Sets
NimbleGen          60mer   38,000 Probe Sets
Phalanx Biotech    60mer   ~30,000 Probe Sets
Home Brew          70mer   ~40,000 Probe Sets   cDNA

Etc. Many other companies (e.g. Combimatrix) are making smaller custom arrays.

A DNA-DNA hybrid occupies ~4 nm² on the slide surface.

ERCC

External RNA Controls Consortium

Conception in March 2003, Stanford University

The Private, Public, and Academic sectors working together to produce control materials for gene expression analysis.

Mark Salit (NIST) / Janet Warrington (Affymetrix)

Mission of the ERCC

The ERCC is developing external RNA controls useful for gene expression assays in microarrays & QRT-PCR on a wide variety of platforms.

J. Warrington -- Affymetrix

slide9

Members of the ERCC

More than 70 and counting…. A good mix of academic, government and commercial organizations with ~115 scientists from 10 countries

FDA, CBER

FDA, CDER

FDA, CDRH

FDA, NCTR

FDA, OIVD

GE Healthcare

Genetics Society of Vietnam

Harvard University

Illumina

Informax, Inc.

International Federation of Clinical Chemistry & Laboratory Medicine

Invitrogen

Johns Hopkins University

Lawrence Livermore Lab

LGC

Marine Molecular Quality Controls

Mayo Clinic

National Institute of Standards & Technology

NIH, National Cancer Institute

Northwestern

Affymetrix

Agilent

Ambion

Applied Biosystems

ATCC

Biomerieux

BMS

Cambridge University

Capital Bio

Celera Diagnostics

Cenetron

Centers for Disease Control

Centers for Medicare & Medicaid Services

Clinical & Laboratory Standards Institute

Clinical Hospital Center Zagreb

Combimatrix

Eli Lilly

Eppendorf Microarray Division

Expression Analysis

Nugenic

Qiagen

Queens University Hospital

Roche Molecular Systems

Stanford University

Stratagene

Tokyo University

UCLA

University Health Network

US Department of Agriculture

Veridex, Johnson & Johnson

Vialogy

Vigentech

Etc, etc, etc

J. Warrington -- Affymetrix

The ERCC is producing standardized expression controls, analysis tools and protocols
  • Well-characterized, widely accepted RNA standard controls for multiple platforms
    • Certified Reference Material (CRM)
  • Protocols for multiple applications, research and the clinical laboratory (CLSI – Clinical & Laboratory Standards Inst) Approved July 2006!
  • Software tools to support development work
  • Software tools to support multiple applications

J. Warrington -- Affymetrix

Control Sequences June 2006

L. Reid -- Expression Analysis, J. Warrington -- Affymetrix

slide14

Testing Strategy for RNA Controls

  • Design and development -- generate reagents -- ~100 in place w/70 sequenced
  • Prototype testing -- validate reagents
  • Proof of concept -- validate the assays
  • Functional testing -- validate the product
  • Performance review -- analyze all data
  • Testing begins in the 4th quarter. L. Reid et al
slide15

Uses of RNA Controls/Standards

  • Negative Controls
      • Determine “true” background
      • QC for slide quality, hybridization, etc.
  • Positive Controls
      • QC as above
      • Labeling efficiency
      • Dilution series: determine sensitivity of assay and the lowest concentration with a reliable signal
      • Ratiometric series: normalization tool
  • Will allow better comparison of intra- or inter-lab data, with the same or different array platforms.
slide16

Tests for Validation of ERCC Controls

  • Negative control test – background studies
  • Cross-hybridization – determine if any of the controls hybridize to each other or to mRNAs
  • Labeling test – determine efficiency in the presence of complex RNA sample
  • Latin square – test controls over a range of concentrations (1:5,000,000 to 1:1000)
  • Linear range test and ratiometric studies
  • Above studies will require ~102 arrays per site!
slide17

Latin Squares Design for Testing Controls

A1 – A4 = the 4 arrays used

G1 – G4 = the 4 transcripts being studied

L1 – L4 = the 4 concentrations of each transcript

L. Reid, BMC Genomics 6:150
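
The design described on this slide can be illustrated with a minimal Python sketch. It builds a cyclic 4 x 4 Latin square so that each transcript G1-G4 is assayed at each concentration level L1-L4 exactly once across arrays A1-A4, and each array carries every level once. The cyclic construction and the function names are assumptions made for illustration; the layout in the cited paper may differ.

```python
def latin_square(n):
    """Return an n x n cyclic Latin square of level indices 0..n-1."""
    return [[(row + col) % n for col in range(n)] for row in range(n)]


def labelled_design(n=4):
    """Map each array (A1..An) to a {transcript: concentration level} assignment."""
    design = {}
    for a, row in enumerate(latin_square(n)):
        design[f"A{a + 1}"] = {f"G{g + 1}": f"L{lvl + 1}" for g, lvl in enumerate(row)}
    return design


design = labelled_design(4)
for array, assignment in design.items():
    print(array, assignment)
```

Reading down any transcript column or across any array row shows all four levels, which is the property that lets array effects and concentration effects be separated.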

slide18

ERCC Test Sites

  • >100 Arrays/Site for Validating Controls
  • Affymetrix
  • GE Healthcare
  • Illumina
  • NIAID
  • Novartis
  • Qiagen
  • Agilent, ABI, Roche maybe
slide19

The MAQC Project

  • MicroArray Quality Control
  • An FDA-sponsored consortium (Leming Shi)
  • Founded to address concerns of the microarray community about the reproducibility of expression profiling experiments.
  • Group consists of over 140 members from academia, government, pharma & biotech.
  • A large study was designed to compare expression data from 10 different platforms and 40 different test sites with >650 arrays.
  • Study has been completed and results will be published in Nature Biotechnology. Data will be released next month.
slide20

MAQC Study Goals/Exptl. Design

  • Establish a set of reference standards for use in the MAQC, but more importantly for the array community
  • Generate a large collection of reference data sets using multiple microarray platforms and many different labs….
  • .
  • .
  • Promote the use of reference RNA samples…..
  • Make recommendations on the appropriate uses of microarray technology.
  • The MAQC group first tested multiple RNAs with 160 arrays and then chose two for titration studies with 200 arrays. Two RNAs at two concentrations were chosen for repeated (5 arrays per sample) assays for four pools. The samples were UHRR from Stratagene and Human Brain Ref from Ambion. The four pools were:
      • A. 100% UHRR
      • B. 100% HBRR
      • C. 75% UHRR : 25% HBRR
      • D. 25% UHRR : 75% HBRR
  • At the completion of this study there are data from over 1026 arrays!
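
The four titration pools lend themselves to a simple consistency check, sketched below in Python. It assumes that signal scales linearly with the mixing fraction, which is an idealization; the function name and the example intensities are made up for illustration and are not taken from the MAQC data.

```python
def expected_mixture_signal(signal_a, signal_b, fraction_a):
    """Linear-mixing estimate: fraction_a of sample A plus the remainder of sample B."""
    return fraction_a * signal_a + (1.0 - fraction_a) * signal_b


# Hypothetical linear-scale intensities for one gene in pool A (UHRR) and pool B (HBRR).
signal_a, signal_b = 1000.0, 200.0

expected_c = expected_mixture_signal(signal_a, signal_b, 0.75)  # pool C: 75% A, 25% B
expected_d = expected_mixture_signal(signal_a, signal_b, 0.25)  # pool D: 25% A, 75% B
print(expected_c, expected_d)
```

Under this assumption a gene higher in A than in B should be ordered A > C > D > B, so measured values that break this ordering point to noise or nonlinearity on that platform.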
slide21

Platforms Used In MAQC Study

ABI (Applied Biosystems)   One-Color Array     32,878 Probes
AFX (Affymetrix)           One-Color Array     54,675 Probes
AGL (Agilent)              Two-Color Array     43,931 Probes
AGI (Agilent)              One-Color Array     43,931 Probes
CBC (CapitalBioCorp)       One & Two Color     23,231 Probes
EPP (Eppendorf)            One-Color Array     294 Probes
GEH (GE Healthcare)        One-Color Array     54,359 Probes
ILM (Illumina)             One-Color Array     47,293 Probes
NCI (NCI-Operon)           Two-Color Array     37,632 Probes
TCI (TeleChem Int)         One & Two Color     27,648 Probes
TAQ (Applied Biosystems)   TaqMan® Assays      1,004 PCRs
QGN (Panomics)             QuantiGene Assay    245 Probes
GEX (GeneExpress)          StaRT-PCR™ Assay    205 Probes

slide22

MAQC STUDY DESIGN

12,091 Genes Used for Comparison Across All Platforms.

(Damir Herman, Jean Thierry-Mieg)

slide23

Take Home Messages/General Findings From MAQC Study

  • Large data sets are available for objectively assessing platform performance and various data analysis algorithms.
  • Microarray technology is reproducible and reliable when one has an understanding of its limitations.
  • Cross-platform analysis requires very careful annotation & mapping of probe sequences.
  • All the platforms had good intra-lab repeatability, and inter-lab reproducibility after removal of outliers.
  • Methods of microarray analysis are an important variable, and this large data set will help resolve issues in this area (statisticians and bioinformaticists take delight……..)
slide24

Manuscripts in MAQC Study -- Entire Issue of Nature Biotechnology Sept. 2006

  • Editorial
  • FDA Foreword
  • Stanford - Data quality in genomics and microarrays
  • Impact of microarray data quality in genomic data submissions to the FDA
  • US EPA efforts to develop a framework for using genomics data in risk assessment and regulatory decision making.
  • MAQC main manuscript – overall description
  • The reproducibility of differentially expressed gene lists in microarray studies*
  • An analysis and comparison of alternative platforms
  • Use of RNA titrations to assess platform performance
  • Performance of one-color vs two-color arrays
slide25

MAQC Manuscripts (cont.)

  • External RNA controls for assessment of microarray analytical performance
  • Normalization and technical variation in gene expression measurements*
  • Toxicogenomics and microarrays: biological response measurements are preserved across platforms
  • Reproducibility probability score: A metric incorporating measurement variability across labs for gene comparison*
  • Late news: 9 manuscripts submitted and 6 were accepted. With 3 commentaries there are 9 articles in the Sept. Nature Biotechnology Suppl. from the MAQC.
slide26

With proper use of negative and positive controls, microarrays may be used to identify, quantitate expression and count the absolute number of genes being expressed in any given cell or tissue sample.

………Anonymous………….

aka ESK

Nature May 25, 2006

slide27

Present (P) & Absent (A) Calls in Spotted Long Oligo Arrays

  • “Average” cell expresses <10,000 genes.
  • “Whole” genome array contains >25,000 genes.
  • Therefore, Present calls should be 40% or less, and Absent calls 60% or more.
  • However, P calls are usually 90% or more with common image analysis systems like GenePix.
  • Why is this? Why do we care?
  • Good negative controls may resolve this issue.
slide28

What is Background?

Articles are still being written about how to determine “true” background. Controls can be used to settle this issue.

Internal Background

External Background

Li et al (2005) Bioinformatics 21:2875

slide29

Common Methods for Background Subtraction

W Yin et al (2005) Bioinformatics 21:2410

slide30

Use of Negative Controls for Background Subtraction

Internal Background ~ 500-1000 units

External Background ~ 100-200 units

%Present using external = 96% (21,565/22,464)

%Present using internal = 76% (17,010/22,464)

Background subtraction eliminated 4,555 genes from further analysis. Good or bad??

Use of negative controls can dramatically change values for % genes expressed and gene expression ratios!
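
The effect of the background choice on Present calls can be sketched in a few lines of Python. The spot intensities and the function name below are hypothetical; only the counts in the last step come from this slide. Recomputing from those counts gives about 96% and 76%.

```python
def percent_present(signals, background):
    """Percentage of spots whose raw signal exceeds the chosen background level."""
    present = sum(1 for s in signals if s > background)
    return 100.0 * present / len(signals)


# Hypothetical raw spot intensities in arbitrary units.
signals = [120, 150, 300, 450, 700, 900, 1500, 4000, 8000, 20000]

print(percent_present(signals, 100))  # external background level
print(percent_present(signals, 500))  # internal (negative control) background level

# Percentages recomputed from the counts given on the slide.
total = 22464
pct_external = round(100 * 21565 / total, 1)
pct_internal = round(100 * 17010 / total, 1)
print(pct_external, pct_internal)
```

Raising the threshold from the external to the internal level is what moves low-intensity spots from Present to Absent.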

slide31

Negative Controls & Background

[Figure labels: Low Signal; Negative Control Background (N); External Background]

slide32

Signal distribution of noise background (B), negative control background (median)(neg) and mean intensities of all probes (F) on the slide separated by Cy5 and Cy3 channels

slide34

Influence of Type of Background Subtraction on Expression Ratios

  • Assume the control sample gene has a signal of 600 units.
  • The experimental sample has a signal of 5600 for the same gene.
  • The external background is 100 units.
  • Therefore, the calculated ratio would be 11: (5600 - 100)/(600 - 100) = 5500/500 = 11
  • But if the negative control background is 500, the ratio is now 51: (5600 - 500)/(600 - 500) = 5100/100 = 51
  • Use of negative controls as background may relieve some of the “compression” in ratios for these types of arrays and give a more accurate expression value.
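
The arithmetic on this slide can be reproduced with a short Python sketch. The function name is hypothetical, and the guard against a control signal at or below background is an added assumption rather than something stated on the slide.

```python
def background_corrected_ratio(experimental, control, background):
    """Ratio of experimental to control signal after subtracting the same background."""
    corrected_control = control - background
    if corrected_control <= 0:
        raise ValueError("control signal does not exceed background")
    return (experimental - background) / corrected_control


ratio_external = background_corrected_ratio(5600, 600, 100)  # external background
ratio_negative = background_corrected_ratio(5600, 600, 500)  # negative control background
print(ratio_external, ratio_negative)
```

The ratio is most sensitive to the background choice when the control signal is close to background, which is where the "compression" described above is strongest.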
slide35

Box plots of CV (data are loess normalized, one set with negative bg sub, another set without) – this figure shows background subtraction could improve the data quality

Box plots 1-12: 1. jurkat; 2. jurkat_neg; 3. L428l; 4. L428_neg; 5. lncap; 6. lncap_neg; 7. mcf; 8. mcf_neg; 9. oci; 10. oci_neg; 11. sud; 12. sud_neg

slide39

Perfect Match (PM) and Mismatch (MM): The Affy Image Quantitation Methods

GCOS (GeneChip Operating Software): default Affy analysis software.

RMA (Robust Multiarray Average): Irizarry method using only PM signals.

GCRMA: Similar to RMA but takes into account GC content.

dChip: Similar to GCOS but offers options with or without MM.

YW Chip: The Yonghong Wang method. PM only, with only sequence-validated oligos used in analysis.
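
The difference between a PM-only and a PM-MM summary can be shown with a deliberately simplified Python sketch. It is not the GCOS, RMA, GCRMA or dChip algorithm, each of which adds its own background correction, normalization and summarization steps. The function names and intensities are made up for illustration.

```python
import math
from statistics import mean


def pm_only_signal(pm):
    """Mean log2 intensity of the perfect-match probes."""
    return mean(math.log2(v) for v in pm)


def pm_minus_mm_signal(pm, mm):
    """Mean of PM - MM differences; pairs with MM >= PM contribute zero or negative terms."""
    return mean(p - m for p, m in zip(pm, mm))


# Hypothetical intensities for the probe pairs of one probe set.
pm = [1024, 2048, 512, 4096]
mm = [256, 2560, 128, 1024]

pm_summary = pm_only_signal(pm)
diff_summary = pm_minus_mm_signal(pm, mm)
mm_exceeds_pm = sum(1 for p, m in zip(pm, mm) if m > p)
print(pm_summary, diff_summary, mm_exceeds_pm)
```

A probe pair where MM exceeds PM pulls the PM-MM summary down, which is one reason PM-only methods handle such pairs differently.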


Influence of Different Methods of Background Subtraction

Correlation Between 2 Technical Replicates – Affy Chips

GCOS

No background subtraction

MAS background subtraction

RMA background subtraction

slide41

PM Only vs PM-MM Analysis of Technical Replicates

[Figure: PM Only vs PM-MM panels. Labels: Log2 Intensities; Mean Values Intensities; S.D. Dist. Probesets 4 Reps]

slide42

Correlation Study of Genes With Absent Calls

Genes here were called absent by GCOS in 8 hybs from 2 technical replicates. The data indicate that absent calls may not be truly absent in many cases.

slide43

The MM Probes: C or T at 13th Position May Result in Artefactual High Signal: 92% of All MM with Higher Signal Than PM have C or T
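
For context, an Affymetrix MM probe is the 25mer PM sequence with its central (13th) base replaced by the complementary base. The Python sketch below shows that construction; the function name and the example sequence are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def mismatch_probe(pm_sequence, position=13):
    """Return the MM probe: the PM 25mer with the base at `position` (1-based) complemented."""
    if len(pm_sequence) != 25:
        raise ValueError("expected a 25mer probe")
    index = position - 1
    swapped = COMPLEMENT[pm_sequence[index]]
    return pm_sequence[:index] + swapped + pm_sequence[index + 1:]


pm = "ACGTACGTACGTGCGTACGTACGTA"  # made-up 25mer with G at position 13
mm = mismatch_probe(pm)
print(pm[12], mm[12])
```

An MM probe with C or T at position 13 therefore pairs with a PM probe that has G or A there.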

slide45

Analysis of Probe Sequences Within Probe Sets in Affy Gene Chip

# of “Correct” or Mapped Oligos/Probe Set     # Probe Sets in Each Category
1                                              692
2                                              514
3                                              433
4                                              450
5                                              425
6                                              499
7                                              626
8                                              862
9                                              1608
10                                             3771
11                                             36562
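
The table can be summarized with a short Python sketch. It assumes that 11 is the full complement of oligos per probe set, as the last row suggests, and it covers only the categories listed here. The variable names are made up for illustration.

```python
# Number of probe sets by count of correctly mapped oligos, copied from the table.
probe_sets_by_mapped_oligos = {
    1: 692, 2: 514, 3: 433, 4: 450, 5: 425, 6: 499,
    7: 626, 8: 862, 9: 1608, 10: 3771, 11: 36562,
}

total_probe_sets = sum(probe_sets_by_mapped_oligos.values())
fully_mapped = probe_sets_by_mapped_oligos[11]
percent_fully_mapped = round(100 * fully_mapped / total_probe_sets, 1)

print(total_probe_sets, percent_fully_mapped)
```

On these numbers roughly four in five of the listed probe sets have every oligo mapped, and the remainder lose at least one probe to mapping problems.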

slide46

How The ERCC & MAQC Can Increase The Reliability/Acceptance of Microarray Data

  • A set of controls used by all expression platforms will go a long way to end confusion about comparability of data from related experiments.
  • Probe mapping and sequences from all platforms will be extremely useful for cross platform comparisons.
  • Very large data set from all major platforms will point out problem areas in present protocols/technologies, which, hopefully, will result in their improvement.
  • Large data sets from ERCC and MAQC combined will provide a great resource for critically evaluating algorithms used in analyzing arrays. Which analysis method provides “true” answers?
  • Hopefully, a (workable) consensus about utilization of microarray technologies will arise from these two large exercises in (sometimes a bit contentious) human scientific cooperation.
slide47

In Closing…….

My attempt at being funny…..

USF

Is your back to the wall? Are you under a lot of pressure?

slide48

Do you feel you’re on the Treadmill of Life?

Moebius Strip II by M.C. Escher

Nature vol. 246, p776, 2003

slide50

Keep on smilin’, ’cause when you’re smilin’, the whole world smiles with you……

100 nm

Nano Smiley DNAs --- Many Happy Genomes

Courtesy P. Rothemund (2006) Nature 440:297

slide51

Thank you all ------

ERCC & MAQC Consortia

ATC Microarray Lab Crew

YW for Analysis & AP for Chip Data