Array-based Comparative Genomic Hybridization. Bastien JOB 2010-10-19. Structural Genomics: Sequence variations (CGHa, SNPa, DNAseq, mutations…). Functional Genomics: Gene expression / splicing… (GEa, Q-PCR, RNAseq…). Proteomics (Antibody arrays, 2D EP + MS/MS, HPLC + MS/MS, …). Genome.
(CGHa, SNPa, DNAseq, mutations…)
Gene expression / splicing…
(GEa, Q-PCR, RNAseq…)
(Antibody arrays, 2D EP +MS/MS, HPLC+MS / MS, …)
Promoter, regulatory sequences
CGH array is a method that aims to identify variations in the copy number of the genomic content of a test sample, by comparison to a reference sample, using an array of (at least) thousands of measurement points on the genome.
A bit of history of cytogenomics
In cancer :
Other uses :
It’s an established method in the cancer research field, and is becoming established in the diagnostic field.
1993 : SKY
199x : CGH on chr
200x : cDNA/BAC-based CGH array
2005 : Oligo-based CGH array
MYC – IgH translocation in Burkitt lymphoma
IMAGE CREDIT: Gregory Schuler, NCBI, NIH, Bethesda, MD, USA
Also a common fusion in prostate cancer (Tomlins et al., Science 2005)
EGFR amplification in lung cancer as HSR (homogeneously stained region)
EGFR amplification in lung cancer as several double minutes
Varella-Garcia et al, J Clin Pathol 2009
Test sample DNA (tumor) Cy5 -vs- Reference DNA (normal) Cy3
CGH array simplified process on the platform :
From sample to analysis
Scan, signals acquisition & normalization
G2 : 244 K Agilent oligoarray
Spots : 60µm (@ 5µm/px)
Spots : 30µm @ 2µm/px
G3 : 4 x 180 K Agilent oligoarray
945,826 CN probes *
* ~200,000 CNVs
Genomic profile Segmentation
Feature Extraction v10.x
Description of the population
Identification of genomic regions of interest
Describing genomic contents
+ Clinical Annotations
Credits : Pierre NEUVIAL (ENSAE)
Current oligo generation : perfect disc-shaped spots.
Credits : Pierre NEUVIAL
General information and some parameters
Grid positioning check
Control of channels (signal, background, …)
Control of outliers (number and position)
Control of intensity distributions
Control of the randomness of signals
Spatial representation of signals, background, log2(ratio), p-value, errors (…)
Distribution of signals and log2(ratio)
Some biases can be removed by specific algorithms
Most of these biases are linked to spotted arrays
Credits : Pierre NEUVIAL
Data generated by this method are relative values (ratio of a test versus a reference) : we lack information about the « real » normal level.
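The relative nature of the measurement can be illustrated with a minimal sketch that turns per-probe intensities into log2(ratio) values. The function name and the intensity values are hypothetical; the channel assignment (Cy5 = test, Cy3 = reference) follows the design described earlier, and the signals are assumed to be already background-corrected.

```python
import math

def log2_ratio(test_signal, ref_signal):
    """Return log2(test / reference) for one probe; both signals must be > 0."""
    if test_signal <= 0 or ref_signal <= 0:
        raise ValueError("signals must be strictly positive")
    return math.log2(test_signal / ref_signal)

# Hypothetical background-corrected intensities (Cy5 = test, Cy3 = reference).
cy5 = [1000.0, 2000.0, 500.0]
cy3 = [1000.0, 1000.0, 1000.0]

ratios = [log2_ratio(t, r) for t, r in zip(cy5, cy3)]
print(ratios)  # [0.0, 1.0, -1.0]
```

A value of 0 only means "same signal as the reference": nothing in the ratio itself says which level corresponds to the normal copy number of the test sample.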
Identifying the most probable normal genomic level is easy here, as we have a main central peak.
It’s much more difficult here, due to the higher complexity of the distribution / profile…
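For the easy case (a single main central peak), one possible way to recenter a profile is to locate the most populated bin of the log2(ratio) histogram and subtract its position. This is only a sketch under that assumption: the function names, the bin width and the data are hypothetical, and this naive approach is expected to fail on the complex, multi-peak distributions mentioned above.

```python
from collections import Counter

def estimate_normal_level(log2_ratios, bin_width=0.05):
    """Return the center of the most populated histogram bin (the main peak)."""
    if not log2_ratios:
        raise ValueError("empty profile")
    # Each value is assigned to the bin whose center is the nearest multiple of bin_width.
    counts = Counter(round(v / bin_width) for v in log2_ratios)
    best_bin, _ = max(counts.items(), key=lambda item: item[1])
    return best_bin * bin_width

def center_profile(log2_ratios, bin_width=0.05):
    """Shift the profile so that the main peak sits at log2(ratio) = 0."""
    shift = estimate_normal_level(log2_ratios, bin_width)
    return [v - shift for v in log2_ratios]

# Hypothetical profile: most probes around 0.2, a few gained, one lost.
profile = [0.21, 0.19, 0.20, 0.22, 0.80, 0.79, -0.30]
level = estimate_normal_level(profile)
centered = center_profile(profile)
```

Here `level` is close to 0.2, so the probes that formed the main peak end up near 0 after centering.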
Why segment ?
Data reduction : The data obtained are a list of hundreds of thousands of values. However, a genomic profile can be simplified to a limited list of segments considered as abnormal.
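The data-reduction idea can be sketched with a deliberately simple recursive binary segmentation: split the profile where the difference between the left and right means is most informative, and stop when no split produces a large enough shift. This is a toy stand-in, not the segmentation algorithm used on the platform (the source does not name one); `segment`, `min_shift` and `min_size` are hypothetical names and thresholds.

```python
def segment(values, start=0, min_shift=0.3, min_size=3):
    """Split a log2(ratio) profile into (first_index, last_index, mean) segments."""
    n = len(values)
    if n == 0:
        raise ValueError("empty profile")
    total = sum(values)
    best_gain, best_i = 0.0, None
    left_sum = 0.0
    for i in range(1, n):
        left_sum += values[i - 1]
        if i < min_size or n - i < min_size:
            continue  # both sides must keep at least min_size probes
        diff = left_sum / i - (total - left_sum) / (n - i)
        # Reduction of the sum of squared errors obtained by splitting at i.
        gain = i * (n - i) / n * diff * diff
        if abs(diff) >= min_shift and gain > best_gain:
            best_gain, best_i = gain, i
    if best_i is None:
        return [(start, start + n - 1, total / n)]
    left = segment(values[:best_i], start, min_shift, min_size)
    right = segment(values[best_i:], start + best_i, min_shift, min_size)
    return left + right

# Hypothetical noise-free profile: 25 probes, with a gain on probes 10 to 14.
profile = [0.0] * 10 + [1.0] * 5 + [0.0] * 10
segments = segment(profile)
print(segments)  # [(0, 9, 0.0), (10, 14, 1.0), (15, 24, 0.0)]
```

The 25 input values are reduced to 3 segments, of which only one is abnormal; real data would need a noise-aware criterion rather than a fixed `min_shift`.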
Example taken from a breast cancer profile
=> Interesting idea but too stringent !
Rueda et al., BMC Bioinformatics 2009
Example on breast cancer data for K=2 and K=3
Example for a population of 103 breast cancers optimally clustered into 3 groups
In the multiclonal model of tumoral evolution, genes of interest (oncogenes, tumor suppressors, …) are more likely than other genes to lie in the overlap of aberrations shared by a sufficient number of genomic profiles.
Regions statistically found as potential MCR
The tool used for this purpose is STAC v1.2 (Diskin et al, 2006)
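The overlap idea itself can be sketched as a plain frequency count: for one chromosome, report the regions covered by an aberration in at least a given number of samples. This is not a reimplementation of STAC, which assesses statistical significance; the function name, the half-open coordinate convention and the coordinates are hypothetical.

```python
from collections import defaultdict

def recurrent_regions(profiles, min_samples):
    """Return (start, end) regions aberrant in at least min_samples profiles.

    profiles: one list per sample of half-open (start, end) aberrant intervals,
    assumed non-overlapping within a sample and on the same chromosome.
    """
    changes = defaultdict(int)
    for intervals in profiles:
        for start, end in intervals:
            changes[start] += 1  # one more sample becomes aberrant here
            changes[end] -= 1    # one sample returns to normal here
    regions = []
    depth = 0
    region_start = None
    for pos in sorted(changes):
        depth += changes[pos]
        if depth >= min_samples and region_start is None:
            region_start = pos
        elif depth < min_samples and region_start is not None:
            regions.append((region_start, pos))
            region_start = None
    return regions

# Hypothetical gains observed in three samples on one chromosome.
profiles = [
    [(100, 500)],
    [(300, 700)],
    [(400, 450), (600, 900)],
]
shared_by_two = recurrent_regions(profiles, min_samples=2)
shared_by_all = recurrent_regions(profiles, min_samples=3)
print(shared_by_two)  # [(300, 500), (600, 700)]
print(shared_by_all)  # [(400, 450)]
```

Raising `min_samples` narrows the candidate window, which is the intuition behind the small windows listed in the examples that follow.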
Common problem : CNVs as a contamination…
=> MYCN found in a 663 Kb window.
=> HOX genes cluster found in a 143 Kb window.
=> HRAS found in a 351 Kb window.
=> Loss of CDKN2A and CDKN2B in a 1.2 Mb window.
Partial example of a neuroblastoma cell-line
Trivial example of the difference found on the ERBB2 locus when comparing ERBB2-amplified and non-amplified breast cancer populations.
Another example showing a characteristic gain of the BRAF locus in a BRAF-mutated population of melanoma.
Detecting genes that simultaneously undergo genomic copy number variation and RNA expression variation can yield stronger candidates for the characterization of a pathology.
Because of molecular cascades in human pathways, gene expression analysis may preferentially reveal genes located downstream in a pathway. By correlating CGH and gene expression results from the same population, it may be easier to focus on “upstream” genes.
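One simple way to combine the two data types is a per-gene Pearson correlation between copy number and expression across the samples of a population, keeping the genes above a threshold. This is a sketch only: the gene names, the values and the 0.8 cutoff are hypothetical, and the source does not state which correlation measure was actually used.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length lists (nan if one is constant)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: one value per sample, same sample order in both tables.
copy_number = {  # segmented log2(ratio) at each gene locus
    "GENE_A": [0.0, 0.1, 1.2, 1.5],
    "GENE_B": [0.0, 0.1, 1.2, 1.5],
}
expression = {  # log2 expression level of each gene
    "GENE_A": [5.0, 5.2, 7.4, 8.0],
    "GENE_B": [6.0, 5.9, 6.1, 6.0],
}

threshold = 0.8  # arbitrary cutoff, chosen for illustration only
candidates = [
    gene for gene in copy_number
    if pearson(copy_number[gene], expression[gene]) >= threshold
]
print(candidates)  # ['GENE_A']
```

Both hypothetical genes are gained in the same samples, but only GENE_A shows expression that follows its copy number, so only it is retained as a candidate.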
"Cheese plots" for the probe-specific simultaneous visualization of cross-technology correlation and differential expression.