slide1

CS491JH: Data Mining in Bioinformatics

  • Introduction to Microarray Technology
  • Technology Background
  • Data Processing Procedure
  • Characteristics of Data
  • Data integration and Data mining
slide2

Nylon Membrane

Glass Slides

GeneChip

Substrates for High Throughput Arrays

Single label P33

Single label biotin

streptavidin

Dual label

Cy3, Cy5



GeneChip® Probe Arrays

Hybridized Probe Cell

GeneChip Probe Array

Single-stranded, labeled RNA target

Oligonucleotide probe

24 µm

Millions of copies of a specific oligonucleotide probe

1.28 cm

>200,000 different complementary probes

Image of Hybridized Probe Array


Multiple oligo probes

GeneChip® Expression Array Design

Gene

Sequence

Probes designed to be Perfect Match

Probes designed to be Mismatch
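
One common way to turn Perfect Match (PM) and Mismatch (MM) intensities into a single expression value is to average the PM - MM differences over a gene's probe pairs. The sketch below assumes that summary; the slides do not say which summary the course used, and the function name and intensity values are made up for illustration.

```python
from statistics import mean


def average_difference(pm, mm):
    """Mean of (PM - MM) over the probe pairs of one gene.

    pm, mm: equal-length lists of intensities, one entry per probe pair.
    """
    if len(pm) != len(mm) or not pm:
        raise ValueError("need equal-length, non-empty PM and MM lists")
    return mean(p - m for p, m in zip(pm, mm))


# Hypothetical intensities for one gene's probe set (made-up numbers).
pm_intensities = [1200.0, 980.0, 1500.0, 1100.0]
mm_intensities = [300.0, 280.0, 400.0, 320.0]

signal = average_difference(pm_intensities, mm_intensities)
print(signal)
```

Subtracting the mismatch signal is meant to remove the non-specific hybridization that both probes of a pair pick up.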

Procedures for Target Preparation

Figure: target preparation workflow

  • Cells
  • Poly(A)+ / total RNA
  • cDNA
  • IVT (Biotin-UTP, Biotin-CTP): labeled transcript
  • Fragment (heat, Mg2+): labeled fragments
  • Hybridize (16 hours)
  • Wash & Stain
  • Scan

slide7

NSF Soybean Functional Genomics

Steve Clough / Vodkin Lab

Printing Arrays on 50 slides

slide8

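Figure: two-label comparison of two samples

  • Cells from condition A and cells from condition B
  • Total or mRNA from each condition
  • cDNA, labeled with Label Dye 1 and Label Dye 2 (one dye per condition)
  • Mix

NSF / U of Illinois Microarray Workshop - Steve Clough / Vodkin Lab

Ratio of expression of genes from two sources: equal, over, under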
slide9


GSI Lumonics

slide10

Cattle and Soy Controls

Beta Actin

PKG

HPRT

Beta 2 microglobulin

Rubisco

AB binding protein

Major latex protein

homologue (MSG)

Array of cattle and soy spiking controls. 50 µg of cattle brain total RNA was labeled with Cy3 (green). 1 µl each of in vitro transcribed soy Rubisco (5 ng), AB binding protein (0.5 ng) and MSG (0.05 ng) were labeled with Cy5. The two labeled samples were cohybridized on superamine slides (Telechem, Inc.). To the right of each set of spots are five negative controls (water).

slide11

Fetal Spleen-Cy3

Adult Spleen-Cy5

Labeled spots: IgM, MYLK, IgM heavy chain, COL1A2

slide12

GenePix Image Analysis Software

Placenta vs. Brain – 3800 Cattle Placenta Array

Cy3, Cy5

slide14

Microarray Data Process

  • Experimental Design
  • Image Analysis – raw data
  • Normalization – “clean” data
  • Data Filtering – informative data
  • Model building
  • Data Mining (clustering, pattern recognition, etc.)
  • Validation
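
The data filtering step can be sketched as follows. This is one possible filter, assuming normalized log2 ratios as input: a gene is kept only if its absolute log ratio reaches a cutoff on enough arrays. The function name, cutoff values, and data are hypothetical.

```python
def filter_informative(log_ratios, min_abs=1.0, min_arrays=2):
    """Keep genes whose |log2 ratio| >= min_abs on at least min_arrays arrays.

    log_ratios: dict mapping gene name -> list of log2 ratios, one per array.
    Missing measurements are represented as None and never count as hits.
    """
    kept = {}
    for gene, values in log_ratios.items():
        hits = sum(1 for v in values if v is not None and abs(v) >= min_abs)
        if hits >= min_arrays:
            kept[gene] = values
    return kept


# Made-up normalized log2 ratios for four genes on four arrays.
normalized = {
    "gene1": [0.1, -0.2, 0.0, 0.3],
    "gene2": [1.5, 2.1, 0.2, 1.8],
    "gene3": [-1.2, 0.1, None, -1.6],
    "gene4": [1.1, 0.0, 0.2, -0.4],
}

informative = filter_informative(normalized)
print(sorted(informative))
```

Filtering before model building and data mining keeps flat, uninformative profiles from dominating the later steps.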
slide16

<-0.3

>0.3

slide17

Characteristics of Data

Data can be viewed as an N×M matrix (N >> M):

N is the number of genes

M is the number of data points for each gene

Or N×(M+K)

K is the number of features describing each gene (genome location, functional description, metabolic pathway, etc.)
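
A small sketch of this layout, assuming K = 3 with the three features named on the slide. The class name, gene identifiers, and all values are placeholders rather than real annotations.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GeneRecord:
    gene_id: str
    expression: List[float]       # the M data points for this gene
    genome_location: str          # feature 1 of K
    functional_description: str   # feature 2 of K
    metabolic_pathway: str        # feature 3 of K


# Placeholder rows; real data sets have thousands of genes (N >> M).
records = [
    GeneRecord("g1", [0.1, 0.4, 1.2, 0.9], "loc-1", "description-1", "pathway-a"),
    GeneRecord("g2", [-0.3, -0.8, -1.1, -0.2], "loc-2", "description-2", "pathway-a"),
    GeneRecord("g3", [0.0, 0.1, -0.1, 0.0], "loc-3", "description-3", "pathway-b"),
]

n = len(records)                # N: number of genes
m = len(records[0].expression)  # M: data points per gene
k = 3                           # K: annotation features per gene

# The plain N x M expression matrix, without the annotation columns.
matrix = [r.expression for r in records]
print(n, m, k)
```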

slide18

Model for Data Analysis

  • Gene Expression is a Dynamic Process
  • Each Microarray Experiment is a snapshot of the process
  • Need basic biological knowledge to build model
  • For Example:
  • Assumption – In most experiments, only a small set of genes (100s to 1000s) is significantly affected.
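
This assumption is what makes a simple global normalization reasonable: if most genes are unchanged, the median log ratio on an array should be zero, and any offset can be treated as technical bias. The sketch below assumes median centering as the normalization; the slides do not name a specific method, and the values are made up.

```python
from statistics import median


def median_center(log_ratios):
    """Subtract the array-wide median so that the typical gene sits at 0."""
    offset = median(log_ratios)
    return [v - offset for v in log_ratios]


# Made-up log2 ratios from one array: most genes share a bias of about 0.5,
# while two genes (3.5 and -2.0) are genuinely changed.
raw = [0.5, 0.4, 0.6, 0.5, 3.5, 0.5, -2.0]

centered = median_center(raw)
print(centered)
```

The two changed genes keep their large deviations after centering, because the median is insensitive to a small number of extreme values.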

Data Mining

Need for Data Mining
  • Data volumes are too large for traditional analysis methods
  • Large number of records and high dimensional data
  • Only a small portion of the data is analyzed
  • Decision support process becomes more complex

Functions of Data Mining

Use the data to build predictors – prediction, classification, deviation detection, segmentation

Generate more sophisticated summaries and reports to aid understanding of the data – find clusters, partitions in data

Data Mining Methods

Classification, Regression (Predictive Modeling)

Clustering (Segmentation)

Association Discovery (Summarization)

Change and deviation detection

Dependency Modeling

Information Visualization

slide21

Clustered display of data from time course of serum stimulation of primary human fibroblasts.

Cholesterol Biosynthesis

Cell Cycle

Immediate Early Response

Signaling and Angiogenesis

Wound Healing and Tissue Remodeling

Eisen et al., Proc. Natl. Acad. Sci. USA 95 (1998), p. 14865
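
Clustered displays like this one come from hierarchical clustering of expression profiles with a correlation-based similarity. The following is a simplified sketch of average-linkage clustering with Pearson correlation distance, not the cited paper's implementation; the gene names and profiles are made up.

```python
from itertools import combinations
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant profiles."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def average_linkage(profiles, n_clusters):
    """Merge the closest clusters until n_clusters remain.

    profiles: dict mapping gene name -> list of expression values.
    Distance between genes is 1 - Pearson correlation; distance between
    clusters is the mean of all pairwise gene distances.
    """
    dist = {
        frozenset((a, b)): 1.0 - pearson(profiles[a], profiles[b])
        for a, b in combinations(profiles, 2)
    }

    def cluster_dist(c1, c2):
        return mean(dist[frozenset((a, b))] for a in c1 for b in c2)

    clusters = [[name] for name in profiles]
    while len(clusters) > n_clusters:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: cluster_dist(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters


# Made-up time courses: two rising genes and two falling genes.
profiles = {
    "geneA": [0.0, 1.0, 2.0, 3.0],
    "geneB": [0.1, 1.1, 2.0, 3.2],
    "geneC": [3.0, 2.0, 1.0, 0.0],
    "geneD": [3.1, 2.2, 0.9, 0.1],
}

clusters = average_linkage(profiles, n_clusters=2)
print(sorted(sorted(c) for c in clusters))
```

Correlation distance groups genes by the shape of their profile rather than by absolute level, which is why co-regulated genes end up adjacent in the display.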

slide27

Gene Expression Profile of Aging and Its Retardation by Caloric Restriction

Cheol-Koo Lee, Roger G. Klopp, Richard Weindruch, Tomas A. Prolla