Presented by: Deepti Malhotra Biological Sequence Analysis PowerPoint Presentation

Selection of optimal oligonucleotide probes for microarrays using multiple criteria, global alignment and parameter estimation. Xingyuan Li, Zhili He and Jizhong Zhou. Nucleic Acids Research, 2005, Vol. 33, No. 19, 6114–6123. Presented by: Deepti Malhotra, Biological Sequence Analysis.


Selection of optimal oligonucleotide probes for microarrays using multiple criteria, global alignment and parameter estimation. Xingyuan Li, Zhili He and Jizhong Zhou. Nucleic Acids Research, 2005, Vol. 33, No. 19, 6114–6123.

Presented by:

Deepti Malhotra

Biological Sequence Analysis


MICROARRAY - What is it?

Simultaneous analysis of the relative expression levels of hundreds or thousands of genes in a single experiment, by measuring the amount of messenger RNA (mRNA) present.

Labeled target (gene of interest)


cDNA Microarray: NIEHS Tox Chip

Nuwaysir E, et al., Molecular Carcinogenesis 24:153-159 (1999)

GeneChip® Probe Arrays
  • GeneChip® probe array, with a hybridized probe cell
  • Single-stranded, fluorescently labeled DNA target
  • Oligonucleotide probe
  • Each probe cell or feature contains millions of copies of a specific oligonucleotide probe
  • Over 200,000 different probes complementary to genetic information of interest

Figure: Image of hybridized probe array (courtesy: Affymetrix)

GeneChip® Probe Arrays
  • GeneChip® probe array
  • Probe pair
  • Probe set
  • Hybridized probe cell
  • Probe cell (feature)

Figure: Image of hybridized probe array

Multiple Specific Probe Pairs per Gene



Nature Genetics Supplement, volume 21, January 1999

What’s the complexity?
  • More genes
  • More information per experiment

Feature size: 100 µm, 50 µm, 20 µm, 10 µm

* Using 20 probe pairs per gene

Why So Many Probe Pairs?

  • Point mutations, deletions, or insertions will not affect the detection of the gene of interest.
  • A bioinformatics algorithm accounts for expression across 11 different probe pairs to calculate the expression of the gene.

Figure labels: probe pairs; gene of interest
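
A minimal sketch of why redundancy helps, assuming a gene-level value is summarized as the median of its per-probe-pair signals. The median is chosen here only for illustration; the slides do not say which summary the actual algorithm uses, and `gene_expression` and the signal values are hypothetical.

```python
from statistics import median


def gene_expression(pair_signals: list[float]) -> float:
    """Summarize per-probe-pair signals into one gene-level value.

    Assumption: a median summary, used here only because it tolerates
    a few aberrant probe pairs; not the vendor's algorithm.
    """
    if not pair_signals:
        raise ValueError("at least one probe-pair signal is required")
    return median(pair_signals)


# Eleven hypothetical probe-pair signals; the tenth is depressed, as it
# might be if a point mutation disrupted hybridization to that probe.
signals = [100, 102, 98, 101, 99, 100, 103, 97, 100, 5, 101]
summary = gene_expression(signals)
```

With these values the single low reading does not move the summary, which is the point the first bullet makes.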

Redundancy of probe synthesis
  • Multiple indicators for the same gene ensure:
    • Quantitative accuracy
    • High sensitivity
  • Indicators of oligonucleotide specificity:
    • Sequence identity to non-targets
    • Continuous stretch of bases matching non-targets
    • Free energy of binding to non-targets

All three criteria are important for the selection of optimal probes.
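
The first two indicators can be sketched for the simplest case, an ungapped comparison of a probe against an equal-length window of a non-target sequence. This is an illustration only: `identity_and_stretch` is a hypothetical helper, and free energy is left out because it needs thermodynamic parameters that the slides do not give.

```python
def identity_and_stretch(probe: str, window: str) -> tuple[float, int]:
    """Return (percent identity, longest run of consecutive matches)
    for two equal-length sequences compared position by position."""
    if len(probe) != len(window):
        raise ValueError("sequences must have equal length")
    if not probe:
        raise ValueError("sequences must not be empty")

    matches = 0   # total matching positions
    run = 0       # length of the current run of matches
    longest = 0   # longest run seen so far
    for a, b in zip(probe.upper(), window.upper()):
        if a == b:
            matches += 1
            run += 1
            longest = max(longest, run)
        else:
            run = 0
    return 100.0 * matches / len(probe), longest


# One mismatch at position 5 of 8: 87.5% identity, longest stretch 4.
identity, stretch = identity_and_stretch("ACGTACGT", "ACGTTCGT")
```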

Problems with probe synthesis – addressed by CommOligo
  • Representation of each sequence in a genome-wide search
  • Liberal cut-offs and fewer non-specifics
  • Generally rely on BLAST for local alignment or suffix arrays for exact string search
  • Homologous sequence studies versus whole-genome arrays → applicability to experiments
  • Experimental threshold determination
  • Inherent variability

Series of filters checking oligos

Cut-offs based on CommOligo_PE

Parameters and thresholds are user-adjustable

Iterative probe optimization

All three criteria included
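
A series of filters with user-adjustable thresholds might look like the sketch below. The class names, field names, and default threshold values are placeholders chosen for illustration, not CommOligo's actual interface or the cut-offs estimated by CommOligo_PE.

```python
from dataclasses import dataclass


@dataclass
class Thresholds:
    # Placeholder defaults (assumptions), meant to be adjusted by the user.
    max_identity: float = 90.0      # percent identity to any non-target
    max_stretch: int = 20           # longest continuous matching stretch
    min_free_energy: float = -35.0  # kcal/mol; less negative = weaker binding


@dataclass
class Candidate:
    name: str
    identity: float     # worst-case identity to a non-target
    stretch: int        # worst-case continuous stretch to a non-target
    free_energy: float  # most negative binding energy to a non-target


def passes_all(c: Candidate, t: Thresholds) -> bool:
    """Apply the three specificity filters in sequence."""
    filters = [
        lambda x: x.identity <= t.max_identity,
        lambda x: x.stretch <= t.max_stretch,
        lambda x: x.free_energy >= t.min_free_energy,
    ]
    return all(f(c) for f in filters)


def select(cands: list[Candidate], t: Thresholds) -> list[str]:
    """Return the names of candidates that pass every filter."""
    return [c.name for c in cands if passes_all(c, t)]


candidates = [
    Candidate("A", 85.0, 15, -30.0),  # passes all three filters
    Candidate("B", 95.0, 15, -30.0),  # fails the identity filter
    Candidate("C", 85.0, 25, -30.0),  # fails the stretch filter
    Candidate("D", 85.0, 15, -40.0),  # fails the free-energy filter
]
strict = select(candidates, Thresholds())
relaxed = select(candidates, Thresholds(max_identity=96.0))
```

Relaxing one threshold and re-running the selection is the simplest form of the iterative optimization mentioned above.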


Sequence alignment strategy

Dynamic Programming Matrix

  • Uses bit scores from Myers' algorithm during identity calculation
  • An alignment corresponds to a path from a high-identity (high-score) cell in the bottom row back to the top row.
  • Traverse path / last path
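
The idea of filling a dynamic programming matrix and tracing a path back through it can be sketched with a plain Needleman-Wunsch global alignment. This is a substitute technique used for illustration, not the bit-vector algorithm the slide refers to, and the scoring values (match 1, mismatch -1, gap -1) are assumptions.

```python
def global_identity(a: str, b: str, match: int = 1,
                    mismatch: int = -1, gap: int = -1) -> float:
    """Percent identity over the columns of one optimal global alignment."""
    n, m = len(a), len(b)
    # Fill the (n+1) x (m+1) score matrix.
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace the path back from the bottom-right cell to the top-left,
    # counting alignment columns and identical columns along the way.
    i, j = n, m
    matches = columns = 0
    while i > 0 or j > 0:
        columns += 1
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            matches += a[i - 1] == b[j - 1]
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            i -= 1  # gap in b
        else:
            j -= 1  # gap in a
    return 100.0 * matches / columns if columns else 0.0


# "ACGT" vs "AGT" aligns as ACGT / A-GT: 3 identical columns out of 4.
pct = global_identity("ACGT", "AGT")
```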
Final optimization and scoring
  • Quality score is calculated as:
  • CommOligo_PE is used to determine the thresholds, and the probes are optimized for maximum coverage and correctness by calculating:
  • The goal is to maximize NPV and C
  • Cross-validation: the data are randomly divided into 10 subsets, one subset is held out as the test set, and calibration is run 10 times.
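
A minimal sketch of the cross-validation scheme, assuming NPV denotes the standard negative predictive value TN / (TN + FN). The slides define neither NPV nor C, so C is not computed here, and `npv` and `ten_fold_splits` are hypothetical helper names.

```python
import random
from typing import Iterable, Iterator


def npv(true_neg: int, false_neg: int) -> float:
    """Negative predictive value: TN / (TN + FN) (assumed definition)."""
    total = true_neg + false_neg
    if total == 0:
        raise ValueError("no negative predictions to evaluate")
    return true_neg / total


def ten_fold_splits(items: Iterable, k: int = 10,
                    seed: int = 0) -> Iterator[tuple[list, list]]:
    """Yield (training, test) pairs: shuffle once, split into k subsets,
    and hold each subset out as the test set exactly once."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, test


# Fifty hypothetical data points give ten runs of 45 training / 5 test.
splits = list(ten_fold_splits(range(50)))
```

Thresholds would be calibrated on each training portion and scored on the held-out subset, so every data point is used for testing once.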

Training sets:
  • Genome-wide analysis
  • Homologous sequence searches

Take home message
  • CommOligo works well with homologous sequences → three stringent criteria → cDNA
  • Still works well at the same thresholds for genome-wide searches → Oligochip
  • Actual hybridization data is used
  • Better identity and minimum energy filters
  • The optimal Tm for the hybridization reaction is based on the oligos that passed all the filters, not on all possible oligos
  • Iterative threshold optimization