Glue Grant H1 Analysis Tutorial

Weihong Xu

11/12/2008

Boston, MA

Outline
  • Introduction to array design and library files
  • Image quantification (DAT->CEL)
  • CEL reduction (CEL->exprCEL, remove SNP)
  • Low level analysis (CEL->Expression Index)
  • Practice session #1
  • Expression Console
  • High level analysis (Expression Index -> Gene List)
  • Practice session #2

If time permits,

  • Visualization
  • Glue Grant Exon Array Tools (beta-testing)
  • Practice session #3

Introduction to array design
  • Significant changes over the Affymetrix Exon 1.0 ST array
    • More focused on known transcripts
    • Higher coverage
    • More comprehensive probe selection method
    • More content:
      • exon probes 3.2M (0.32M targets)
      • junction probes 1M (0.25M targets)
      • coding SNP 1M (85K targets)
      • Untranslated Regions (UTR) 0.5M (50K targets)
      • tiling un-annotated units 0.5M (50K targets)
  • http://gluegrant1.stanford.edu/wiki/

Some definitions (TC, EC, PSR, Juc, …)

Potential Analysis Questions
  • Gene expression
  • Alternative splicing
  • Transcript isoform deconvolution
  • Allele-specific expression
  • Antisense expression

Introduction to Library files
  • Supports multiple tools:
    • Quality control
    • Low level analysis and expression analysis using APT and Expression Console
    • High level analysis using dChip
    • Glue Grant Exon Analysis Tools
    • Visualization using cisGenomeBrowser or UCSC Genome Browser
    • Library and annotation database
      • http://gluegrant1.stanford.edu/phpMyAdmin/
        • username: ??? password: ???
        • hglue – all tables are read-only
        • GlueArraySandBox – for users to generate personalized library files and annotations

Major types of library files
  • CLF - mapping of probe IDs to x/y in the CEL file
  • PGF - groups probes (by probe ID) into probe sets.
  • PS – a list of probe IDs
  • MPS – a list of meta probe set IDs with a corresponding list of probe set IDs
  • BGP – a list of Probe IDs to be used in background correction
  • QCC – a table of probe IDs for quality control and their corresponding type
  • KIL – a list of probe IDs to be ignored in DABG (probe with GC < 3)
  • http://www.affymetrix.com/support/developer/powertools/changelog/FILE-FORMATS.html
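
The MPS description above (a meta probe set ID with its list of probe set IDs) can be illustrated with a small parser. This is a minimal sketch, assuming a tab-delimited layout with `#%` header lines, a `probeset_id` column and a space-separated `probeset_list` column; the sample rows and IDs are invented for illustration, and the actual hGlue1_0.r3 files may differ.

```python
import csv
import io

# Hypothetical MPS content; the IDs and counts are invented for illustration.
SAMPLE_MPS = (
    "#%chip_type=example\n"
    "probeset_id\ttranscript_cluster_id\tprobeset_list\tprobe_count\n"
    "TC0001\tTC0001\tPSR01 PSR02 PSR03\t12\n"
    "TC0002\tTC0002\tPSR04\t4\n"
)


def read_mps(handle):
    """Map each meta probe set ID to its list of probe set IDs."""
    # Skip '#' header lines, which carry metadata rather than data rows.
    rows = (line for line in handle if not line.startswith("#"))
    reader = csv.DictReader(rows, delimiter="\t")
    return {row["probeset_id"]: row["probeset_list"].split() for row in reader}


meta_to_probesets = read_mps(io.StringIO(SAMPLE_MPS))
for meta_id, probesets in meta_to_probesets.items():
    print(meta_id, probesets)
```

The same pattern (skip header lines, then read a tab-delimited table) should carry over to the other text-based library files, with the column names adjusted to each format.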

Image quantification (DAT->CEL)
  • Function: convert pixel image to probe intensity file
    • Gridding
    • Quantification
  • Software:
    • GeneChip Operating Software (GCOS)
    • Affymetrix GeneChip Command Console (AGCC)
  • http://www.affymetrix.com/products_services/software/specific/command_console_software.affx

CEL file reduction (CEL->exprCEL)
  • Function: remove SNP probe data from the CEL file to address IRB concerns
  • Script:
        • Mac/Unix: modCEL.unix.pl --xymap=mapping_file --CEL=path/*.CEL --OUTDIR=path --Prefix=expr
        • PC: modCEL.pc.pl --xymap=mapping_file --CEL=filename.CEL --OUTDIR=path --Prefix=expr

  • Parameters:
    • xymap – the mapping file (hGlue1_0.r3.CEL2exprCEL.xymay)
    • Prefix – a string that will be added to the CEL file name
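
Because the PC form of the script takes a single CEL file name, reducing a batch means one call per file. The following is a minimal sketch of building those calls, using only the options shown on this slide; invoking the script through `perl` and the file names used here are assumptions, and nothing is executed.

```python
def build_modcel_commands(cel_files, xymap, outdir, prefix="expr",
                          script="modCEL.pc.pl"):
    """Return one modCEL argument list per CEL file (nothing is executed)."""
    commands = []
    for cel in cel_files:
        commands.append([
            "perl", script,          # assumption: the script is run via perl
            f"--xymap={xymap}",
            f"--CEL={cel}",
            f"--OUTDIR={outdir}",
            f"--Prefix={prefix}",
        ])
    return commands


# Hypothetical input names, for illustration only.
commands = build_modcel_commands(["a.CEL", "b.CEL"], "mapping_file", "out")
for command in commands:
    print(" ".join(command))
```

Each argument list could then be passed to `subprocess.run` once the paths have been checked.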

Low Level Analysis (CEL->Expression Index)
  • APT/Expression Console and QC
    • Quality control
    • Extracting specific features
    • Background correction/Normalization/Summarization
  • Practice session (~30 minutes to 1 hour)

APT/Expression Console
  • APT – Affymetrix Power Tools
    • Supports both 3’ expression arrays and exon arrays
    • Supports both expression and genotype analysis
      • apt-probeset-summarize -- S(N(B)): summarization of normalized, background-corrected signal
      • apt-cel-extract -- extract features
      • apt-dump-pgf -- extract probe/probeset information
      • apt-summary-vis -- generate visualization track files
      • apt-midas -- alternative splicing
    • Memory management
  • http://www.affymetrix.com/partners_programs/programs/developer/tools/powertools.affx#1_1

Overview of Quality Control
  • Function: ensure the quality and reproducibility of array results
  • What to assess?
    • Probe level
      • Per array: signal distribution of different probe types
      • Across array: overall signal distribution, PM-mean, BG-mean
    • Probe Set level (PSR, TC)
      • Per array: Pos_vs_Neg_AUC, Presence call
      • Across array: correlation plot (median correlation to other arrays in the same batch)
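
The across-array metric above (median correlation to the other arrays in the same batch) can be sketched as follows. This is an illustration only, assuming Pearson correlation on probe set signals; it is not the GlueQC.R implementation, and the toy values are invented.

```python
from math import sqrt
from statistics import mean, median


def pearson(x, y):
    """Pearson correlation of two equal-length signal vectors."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def median_correlation(arrays):
    """For each array, the median correlation to all other arrays."""
    scores = {}
    for name, values in arrays.items():
        others = [pearson(values, v) for n, v in arrays.items() if n != name]
        scores[name] = median(others)
    return scores


# Toy batch: arrays A and B agree, array C behaves like an outlier.
batch = {
    "A": [1.0, 2.0, 3.0, 4.0],
    "B": [1.1, 2.1, 2.9, 4.2],
    "C": [4.0, 1.0, 3.5, 0.5],
}
scores = median_correlation(batch)
for name, score in scores.items():
    print(name, round(score, 3))
```

An array whose score is clearly lower than the rest of its batch is a candidate outlier, subject to the caution about small sample sizes.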

Quality Control Tool – GlueQC.R
  • requires R and APT
  • Syntax: Rscript GlueQC.R celpath outpath libpath
  • Libraries:
    • hGlue1_0.r3.clf
    • hGlue1_0.r3.pgf
    • hGlue1_0.r3.PSR.ps
    • hGlue1_0.r3.TC.mps
    • hGlue1_0.r3.KIL
    • hGlue1_0.r3.qc.clfpgf

Density distribution plot
  • Overall intensity range
  • separation between different probe types

All array density plot
  • Check the similarity of intensity distribution across arrays

QC summary plot
  • Check outliers in each plot
  • Flags should be treated only as caution signs, especially when the sample size is small

QC summary table

Extract features
  • Function: extract a subset of probe signals from CEL files
  • Tool: apt-cel-extract
    • Syntax: apt-cel-extract -o out.txt [-c chip.clf -p chip.pgf] [-d chip.cdf] [--probeset-ids=norm-exon.txt] *.cel
    • Parameters:
      • If using --probeset-ids, the CLF and PGF files have to be supplied

Examples: lowlevelanalysis/extractfeatures.bat
  • extract all raw probe signal

>apt-cel-extract -o raw_probe_signal.txt --cel-files CELlist.txt

  • extract quantile normalized and GC-background corrected probe signal

>apt-cel-extract -c hGlue1_0.r3.clf -p hGlue1_0.r3.pgf -b hGlue1_0.r3.antigenomic.bgp -a quant-norm,pm-gcbg -o bgc_probe_signal.txt --cel-files CELlist.txt

  • extract probe signal of a specific content: “main->junction”

>apt-dump-pgf -c hGlue1_0.r3.clf -p hGlue1_0.r3.pgf --probeset-type main --probeset-type junction -o juc.pgf

>apt-cel-extract -c hGlue1_0.r3.clf -p hGlue1_0.r3.pgf --probe-ids juc.pgf -o juc_raw_probe_signal.txt --cel-files CELlist.txt
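
The commands above read their input arrays from CELlist.txt via --cel-files. A minimal sketch of generating such a list follows, assuming the APT convention of a text file whose first line is `cel_files` followed by one CEL path per line; the directory and file names are hypothetical and created in a temporary folder only for the demonstration.

```python
import tempfile
from pathlib import Path


def write_cel_list(cel_dir, list_path):
    """Write a CEL list file: a 'cel_files' header, then one path per line."""
    cel_paths = sorted(
        p for p in Path(cel_dir).iterdir() if p.suffix.lower() == ".cel"
    )
    lines = ["cel_files"] + [str(p) for p in cel_paths]
    Path(list_path).write_text("\n".join(lines) + "\n")
    return cel_paths


# Demonstration with empty placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("s1.CEL", "s2.cel", "notes.txt"):
        (Path(tmp) / name).touch()
    listed = write_cel_list(tmp, Path(tmp) / "CELlist.txt")
    content = (Path(tmp) / "CELlist.txt").read_text().splitlines()

print(content[0], len(listed))
```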

Background correction, normalization and summarization
  • Goal: transform probe signal into a biologically meaningful expression measure
    • Background correction -- remove non-target signal
    • Normalization -- remove non-biological variance
    • Summarization -- summarize probe signal into probe set signal
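
As an illustration of the normalization step, here is a minimal sketch of plain quantile normalization, which forces every array to share the same intensity distribution. It is a textbook version on toy numbers, not APT's quant-norm implementation (which can also work from a sketch of the data), and ties are broken arbitrarily.

```python
def quantile_normalize(arrays):
    """Quantile-normalize equal-length intensity lists (one list per array)."""
    n = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    # Reference distribution: the mean across arrays at each rank.
    reference = [
        sum(s[rank] for s in sorted_arrays) / len(arrays) for rank in range(n)
    ]
    normalized = []
    for a in arrays:
        # Indices of this array's probes, ordered from lowest to highest signal.
        order = sorted(range(n), key=lambda i: a[i])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized


# Toy probe intensities for three arrays.
raw = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0], [3.0, 4.0, 8.0]]
norm = quantile_normalize(raw)
for row in norm:
    print(row)
```

After normalization each array contains the same set of values, and only the assignment of values to probes differs between arrays.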

apt-probeset-summarize
  • Syntax
    • apt-probeset-summarize -a rma-sketch [-a dabg] -c chip.clf -p chip.pgf -b chip.bgp -o outpath -m chip.mps [--kill-list chip.kil] *.CEL
  • Parameters
      • -a, analysis method
        • Chipstream format: a comma-separated list of transformations with specific parameters passed as key-value pairs, e.g.
          • rma-bg,quant-norm.sketch=-1.usepm=true.bioc=true,pm-only,med-polish
        • Predefined methods: rma-sketch, dabg, rma, plier, etc.
      • --kill-list: needed when the analysis involves gc-bg
      • Windows: use ‘--cel-files filename’ instead of *.CEL
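
The syntax above can be assembled programmatically, which helps when the same analysis is repeated with different library files. This is a minimal sketch that uses only the options listed on this slide; the helper name and the file names passed to it are illustrative, and the command is built but not executed.

```python
def summarize_command(analyses, clf, pgf, bgp, out_dir, cel_list,
                      mps=None, kill_list=None):
    """Build an apt-probeset-summarize argument list (not executed here)."""
    command = ["apt-probeset-summarize"]
    for analysis in analyses:
        command += ["-a", analysis]  # one -a per analysis method
    command += ["-c", clf, "-p", pgf, "-b", bgp, "-o", out_dir]
    if mps:
        command += ["-m", mps]
    if kill_list:
        command += ["--kill-list", kill_list]
    command += ["--cel-files", cel_list]
    return command


# A chipstream is a comma-separated list of transformations.
chipstream = ",".join(["quant-norm.sketch=50000", "pm-gcbg", "iter-plier"])

command = summarize_command(
    ["rma-sketch", chipstream],
    clf="chip.clf", pgf="chip.pgf", bgp="chip.bgp",
    out_dir="outpath", cel_list="CELlist.txt",
    mps="chip.mps", kill_list="chip.kil",
)
print(" ".join(command))
```

Using --cel-files rather than a *.CEL wildcard keeps the same call usable on Windows.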

apt-probeset-summarize (2)
  • Background correction
    • gc-bg
    • rma-bg
    • mas5-bg
    • pm-gcbg
    • pm-mm
  • Normalization
    • quant-norm
    • med-norm
  • Summarization
    • plier/iter-plier
    • Median polish (RMA)
    • DABG
    • Median
    • Li-Wong is not yet available
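
To make the median polish option concrete, here is a minimal sketch of Tukey's median polish on a probes x arrays matrix of log-scale intensities, returning one expression value per array. It is a textbook version on invented numbers and is not APT's med-polish code.

```python
from statistics import median


def median_polish(matrix, n_iter=10, tol=1e-6):
    """Return per-array estimates (overall + column effect) for one probe set."""
    rows, cols = len(matrix), len(matrix[0])
    resid = [row[:] for row in matrix]
    overall = 0.0
    row_eff = [0.0] * rows   # probe effects
    col_eff = [0.0] * cols   # array effects
    for _ in range(n_iter):
        # Sweep out row (probe) medians.
        for i in range(rows):
            m = median(resid[i])
            resid[i] = [v - m for v in resid[i]]
            row_eff[i] += m
        m = median(col_eff)
        col_eff = [c - m for c in col_eff]
        overall += m
        # Sweep out column (array) medians.
        change = 0.0
        for j in range(cols):
            m = median(resid[i][j] for i in range(rows))
            for i in range(rows):
                resid[i][j] -= m
            col_eff[j] += m
            change += abs(m)
        m = median(row_eff)
        row_eff = [r - m for r in row_eff]
        overall += m
        if change < tol:
            break
    return [overall + c for c in col_eff]


# Toy data: array levels 8, 9, 10 plus probe effects 0, +1, -1.
probe_by_array = [
    [8.0, 9.0, 10.0],
    [9.0, 10.0, 11.0],
    [7.0, 8.0, 9.0],
]
estimates = median_polish(probe_by_array)
print(estimates)
```

Because medians are used at every step, a single aberrant probe has little influence on the per-array estimate.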

Examples: LowLevelAnalysis/bns.bat
  • PSR rma-sketch and dabg analysis

apt-probeset-summarize -a rma-sketch -a dabg -c hGlue1_0.r3.clf -p hGlue1_0.r3.pgf -b hGlue1_0.r3.antigenomic.bgp --qc-probesets hGlue1_0.r3.qcc -s hGlue1_0.r3.PSR.ps -o BNS/PSR --cel-files CELlist.txt --kill-list hGlue1_0.r3.kil

  • TC (transcript cluster) meta probe set, using rma-sketch or a chipstream

apt-probeset-summarize -a rma-sketch -a quant-norm.sketch=50000,pm-gcbg,iter-plier -c hGlue1_0.r3.clf -p hGlue1_0.r3.pgf -b hGlue1_0.r3.antigenomic.bgp --qc-probesets hGlue1_0.r3.qcc -m hGlue1_0.r3.TC.mps -o BNS/TC --cel-files CELlist.txt --kill-list hGlue1_0.r3.kil

  • Compute U133Plus2 probe sets

apt-probeset-summarize -a rma-sketch -c hGlue1_0.r3.clf -p hGlue1_0.r3.pgf -b hGlue1_0.r3.antigenomic.bgp --qc-probesets hGlue1_0.r3.qcc -m hGlue1_0.r3.U133plus2.mps -o BNS/u133plus2 --cel-files CELlist.txt

  • Compute Human Exon 1.0 ST transcript clusters

apt-probeset-summarize -a rma-sketch -c hGlue1_0.r3.clf -p hGlue1_0.r3.pgf -b hGlue1_0.r3.antigenomic.bgp --qc-probesets hGlue1_0.r3.qcc -m hGlue1_0.r3.HuEX_TC.mps -o BNS/huex --cel-files CELlist.txt

apt-probeset-summarize output
  • [method].summary.txt – expression index matrix
  • [method].report.txt – quality control measures
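
The expression index matrix can be loaded for downstream work with a few lines of code. This is a minimal sketch, assuming the summary file is tab-delimited with `#%` header lines, a `probeset_id` column and one column per CEL file; the sample content is invented, so check it against a real [method].summary.txt before relying on it.

```python
import csv
import io

# Hypothetical summary content; IDs and values are invented for illustration.
SAMPLE_SUMMARY = (
    "#%analysis=rma-sketch\n"
    "probeset_id\ts1.CEL\ts2.CEL\n"
    "PSR01\t7.25\t7.50\n"
    "PSR02\t3.10\t2.90\n"
)


def read_summary(handle):
    """Return (sample names, {probeset_id: [values per sample]})."""
    # Drop the '#' header lines so only the table itself is parsed.
    rows = (line for line in handle if not line.startswith("#"))
    reader = csv.reader(rows, delimiter="\t")
    header = next(reader)
    samples = header[1:]
    matrix = {row[0]: [float(v) for v in row[1:]] for row in reader if row}
    return samples, matrix


samples, matrix = read_summary(io.StringIO(SAMPLE_SUMMARY))
print(samples)
print(matrix["PSR01"])
```

Stripping the header lines in this way is also one route to a plain table for tools that do not expect them.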

Expression Console
  • Improvements since the last tutorial
    • More summary options: EC, TC, JUC, EX, TX
    • Group probes into core and extended (multi probes)
    • Convert to U133plus2, HuEx format
  • Walk through an example
    • Summary
    • QC metrics
    • Link with annotation
  • Refer to doc/EC_Tutorial.doc (recycled from last tutorial)

Practice session #1
  • CEL reduction (SNPremover)
  • GlueQC
    • GlueQC on data/07-20-08/CELlist_test.txt (15 arrays)
  • Low level Analysis
    • Feature extraction
      • Extract raw probe intensity of 15 arrays
      • Extract quantile normalized and GC-background corrected probe intensity of “main->junction” from 15 arrays
    • B.N.S (background correction, normalization, summarization)
      • rma-sketch summary of PSR for 15 arrays
      • rma-sketch summary of TC for 15 arrays (use mps file from lib/GenBase)

High level analysis (Expression Index -> Gene List)
  • Array annotation and annotation files
  • Import APT results to dChip for high level analysis
  • A practice session

Array annotation (r3)
  • Updates since the r2 version
    • Corrected a bug caused by a MySQL end-of-line problem
    • Added annotation for Transcript, Junction and other contents
    • Added annotation files for dChip and GenBase
    • Added BED files and REFFLAT files for Genome Browser
  • Refer to lib/readme.doc for details
      • Customization: http://gluegrant1.stanford.edu/phpMyAdmin/

hGlue1_0.r3.TC_annot.csv

hGlue1_0.r3.PSR_annot.csv

hGlue1_0.r3.Junction_annot.csv

dChip
  • Improvements since the last tutorial
    • Added Gene Ontology, KEGG pathway and chromosome band analysis
  • Walk through an example
    • Remove extra header and extra tail
    • Import external data into dChip
      • Differential Expression Analysis
      • Clustering/Enrichment
      • Chromosome/Genome enrichment

Practice session #2
  • dChip

Visualization - cisGenomeBrowser
  • Light version of UCSC Genome Browser (Hui Jiang)
    • CEL image
    • Genome Region
  • http://biogibbs.stanford.edu/~jiangh/browser/index.html

cisGenomeBrowser-CEL Image

cisGenomeBrowser-Genomic Region

cisGenomeBrowser
  • Annotation track
    • hGlue1_0.r3.TC.refflat
    • hGlue1_0.r3.TX.refflat
    • Hg18.genefile (refseq track only)
  • Signal track (visualization/genCisGenomeBrowserTrack.bat)
    • probe raw signal barfile

>genbar.pl --coord=hGlue1_0.r3.Probe.BED --signal=raw_probe_signal.txt --outdir=Probe_barfile

    • PSR barfile

>genbar.pl --coord=hGlue1_0.r3.PSR.BED --signal=PSR/rma-sketch.summary.txt --outdir=PSR_barfile

    • Gene barfile

>genbar.pl --coord=hGlue1_0.r3.TC.BED --signal=TC/rma-sketch.summary.txt --outdir=TC_barfile

  • Demo

Other Browsers
  • UCSC Genome Browser (visualization/genUCSCBrowsreTrack.bat)
    • apt-summary-vis -g hGlue1_0.r3.PSR.BED PSR/rma-sketch.summary.txt --wiggle-col-index 1 -o CEL1.PSR.wig
    • The BED file needs to be tweaked so that PSRs do not overlap before the track will work in the UCSC browser
  • Affymetrix Genome Browser
    • apt-summary-vis -g hGlue1_0.r3.PSR.BED PSR/rma-sketch.summary.txt -o PSR.egr
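
The slide does not say how the BED file is adjusted to remove overlaps between PSRs. One possible approach, shown here purely as an assumption, is to clip each region so that it starts where the previous region on the same chromosome ended, and to drop regions that are fully covered. The coordinates below are invented and follow the BED half-open convention.

```python
def make_non_overlapping(intervals):
    """Clip (chrom, start, end, name) records so no two regions overlap."""
    result = []
    last_end = {}  # furthest end coordinate kept so far, per chromosome
    for chrom, start, end, name in sorted(
        intervals, key=lambda r: (r[0], r[1], r[2])
    ):
        start = max(start, last_end.get(chrom, 0))
        if start >= end:
            continue  # fully covered by an earlier region, so drop it
        result.append((chrom, start, end, name))
        last_end[chrom] = end
    return result


# Invented PSR coordinates for illustration.
psr_regions = [
    ("chr1", 100, 200, "a"),
    ("chr1", 150, 250, "b"),
    ("chr1", 160, 240, "c"),
    ("chr2", 10, 20, "d"),
]
clipped = make_non_overlapping(psr_regions)
for record in clipped:
    print(*record, sep="\t")
```

Clipping changes region boundaries, so the adjusted file is suitable for display only and should not replace the original BED file in analysis.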

Glue Grant Exon Array tool
  • Highlights
    • Specially tailored for exon arrays
    • Command line with R interface
    • Probe-sequence-specific background model (MAT)
    • Summarization: probe-selection (GenBase), Li-Wong model (dChip) and median-polish (RMA)
    • Integrated alternative splicing analysis (MADS)
  • Run analysis (GlueGrantExonArrayTool/runEAT.bat)
    • ../../bin/GlueGrantExonArrayTool/eat.win32.exe EXPR_param.conf -l ../../data/07-20-08/CELlist.txt
    • ../../bin/GlueGrantExonArrayTool/eat.win32.exe MADS_param.conf -l ../../data/07-20-08/CELlist.txt

Param.conf
  • Specify analysis parameters
    • Analysis type
    • Library files
    • Background correction method
    • Normalization method
    • Summarization method
    • MADS parameters
  • Example: /GlueGrantExonArrayTool/Expr_param.conf

Practice session #3
  • cisGenomeBrowser
    • Generate bar files for PSR and TC of the 15 arrays in CELlist_test.txt from practice session #1
    • Search for genes of interest
  • Glue Grant Analysis Tool
    • Repeat steps in runEAT.bat

Thank you
