
Analysis of time-course gene expression data



Presentation Transcript


  1. Analysis of time-course gene expression data. Shyamal D. Peddada, Biostatistics Branch, National Inst. Environmental Health Sciences (NIH), Research Triangle Park, NC

  2. Outline of the talk • Some objectives for performing “long series” time-course experiments • Single cell-cycle experiment • A nonlinear regression model • Phase angle of a cell cycle gene • Inference • Open research problems • Multiple cell-cycle experiments • “Coherence” between multiple cell-cycle experiments • Illustration • Open research problems

  3. Objectives Some genes play an important role during the cell division cycle process. They are known as “cell-cycle genes”. Objectives: Investigate various characteristics of cell-cycle and/or circadian genes such as: • Amplitude of initial expression • Period • Phase angle of expression (angle of maximum expression for a cell cycle gene)

  4. Phases in cell division cycle

  5. A brief description • G1 phase: “Gap 1”. For many cells, this phase is the major period of cell growth during their lifespan. • S (“Synthesis”) phase: DNA replication occurs.

  6. A brief description • G2 phase: “Gap 2”. Cells prepare for M phase. The G2 checkpoint prevents cells from entering mitosis if DNA has been damaged since the last division, providing an opportunity for DNA repair and stopping the proliferation of damaged cells. • M (“Mitosis”) phase: Nuclear division (chromosomes separate) and cytoplasmic division (cytokinesis) occur. Mitosis is further divided into 4 phases.

  7. Single, long series experiment …

  8. Whitfield et al. (Molecular Biology of the Cell, 2002) Basic design is as follows: • Experimental units: Human cancer cells (HeLa) • Microarray platform: cDNA chips with approx. 43000 probes (i.e. roughly 29000 genes) • 3 different patterns of time points (i.e. 3 different experiments) One of the goals of these experiments was to identify periodically expressed genes.

  9. Whitfield et al. (Molecular Biology of the Cell, 2002) Experiment 1: (26 time points) HeLa cancer cells arrested in S phase using a double thymidine block. • Sampling times after arrest (hrs): 0 1 2 3 4 5 6 7 8 9 10 11 12 14 15 16 18 20 22 24 26 28 32 36 40 44.

  10. Whitfield et al. (2002) Experiment 2: (47 time points) HeLa cancer cells arrested in S phase using a double thymidine block. • Sampling times after arrest (hrs): every hour between 0 and 46.

  11. Whitfield et al. (2002) Experiment 3: (19 time points) HeLa cancer cells arrested in M phase using thymidine followed by nocodazole. • Sampling times after arrest (hrs): 0 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 32 34 36.

  12. Whitfield et al. (2002) Phase marker genes:

      Cell Cycle Phase   Genes
      ----------------   --------------------------
      G1/S               CCNE1, CDC6, PCNA, E2F1
      S                  RFC4, RRM2
      G2                 CDC2, TOP2A, CCNA2, CCNF
      G2/M               STK15, CCNB1, PLK, BUB1
      M/G1               VEGFC, PTTG1, CDKN3, RAD21

  13. Questions • Can we describe the gene expression of a cell-cycle gene as a function of time? • Can we determine the phase angle for a given cell-cycle gene? i.e. can we quantify the previous table in terms of angles on a circle? • What is the period of expression for a given gene? • Can we test the hypothesis that all cell-cycle genes share the same time period? • Etc.

  14. Profile of PCNA based on experiment 2 data

  15. Some important observations • Gene expression has a sinusoidal shape • Gene expression for a given gene is an average value of mRNA levels across a large number of cells • Duration of cell cycle varies stochastically across cells • Initially cells are synchronized but over time they fall out of synchrony • Gene expression of a cell-cycle gene is expected to “decrease/decay” over time. This is because of items 2 and 4 listed above!

  16. Random Periods Model (PNAS, 2004) • a and b: background drift parameters • K: the initial amplitude • T: the average period • σ: the attenuation parameter • Φ: the phase angle
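The model equation itself did not survive in this transcript. As a rough sketch only, the code below assumes the form commonly attributed to the random periods model: a linear drift a + b·t plus K times the average of cos(2πt / (T·exp(sigma·z)) + phi) over a standard normal z, so that each cell has its own lognormal period. The names `sigma` (attenuation) and `phi` (phase angle), the function name `rpm_curve`, and the use of Gauss-Hermite quadrature for the average are assumptions made here for illustration, not details taken from the slide.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss


def rpm_curve(t, a, b, K, T, sigma, phi, n_nodes=64):
    """Expected expression at times t under the assumed random-periods form."""
    t = np.atleast_1d(np.asarray(t, dtype=float))
    # Gauss-Hermite nodes and weights for the weight function exp(-z^2 / 2).
    z, w = hermegauss(n_nodes)
    # Each node z stands for cells whose period is T * exp(sigma * z).
    periods = T * np.exp(sigma * z)
    cosines = np.cos(2.0 * np.pi * t[:, None] / periods[None, :] + phi)
    # Dividing by sqrt(2*pi) turns the weighted sum into a standard-normal average.
    mean_cos = cosines @ w / np.sqrt(2.0 * np.pi)
    return a + b * t + K * mean_cos


# Hypothetical parameter values, chosen only to show the shape of the curve.
hours = np.arange(0.0, 47.0)
curve = rpm_curve(hours, a=0.0, b=0.0, K=1.0, T=15.0, sigma=0.1, phi=0.0)
```

With `sigma = 0` every cell has the same period and the curve reduces to a plain cosine plus drift; with `sigma > 0` the peaks shrink over time, which matches the loss of synchrony described on the previous slide.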

  17. Fitted curves for some phase marker genes

  18. Whitfield et al. (2002) Phase marker genes:

      Phase   Genes                        Phase angles (radians)
      -----   --------------------------   ----------------------
      G1/S    CCNE1, CDC6, PCNA, E2F1      0.56, 5.96, 5.87, 5.83
      S       RFC4, RRM2                   5.47, 5.36
      G2      CDC2, TOP2A, CCNA2, CCNF     4.24, 3.74, 3.55, 3.25
      G2/M    STK15, CCNB1, PLK, BUB1      3.06, 2.67, 2.61, 2.51
      M/G1    VEGFC, PTTG1, CDKN3, RAD21   2.66, 2.40, 2.25, 1.81
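Because these are angles, summaries per phase need circular rather than ordinary arithmetic. The sketch below computes a circular mean for each phase group from slide 18; the helper name `circular_mean` is hypothetical, and the mean-direction formula (arctangent of the averaged sines and cosines) is the standard one, not something stated in the talk.

```python
import numpy as np

# Phase angles (radians) copied from slide 18.
phase_angles = {
    "G1/S": [0.56, 5.96, 5.87, 5.83],
    "S": [5.47, 5.36],
    "G2": [4.24, 3.74, 3.55, 3.25],
    "G2/M": [3.06, 2.67, 2.61, 2.51],
    "M/G1": [2.66, 2.40, 2.25, 1.81],
}


def circular_mean(angles):
    """Mean direction of angles in radians, returned in [0, 2*pi)."""
    angles = np.asarray(angles, dtype=float)
    mean = np.arctan2(np.sin(angles).mean(), np.cos(angles).mean())
    return float(mean % (2.0 * np.pi))


group_means = {phase: circular_mean(a) for phase, a in phase_angles.items()}
```

The G1/S group shows why this matters: CCNE1 at 0.56 sits just past the reference line while the other three genes sit just before it, so the arithmetic mean (about 4.56) falls far from all four genes, whereas the circular mean stays near the reference line.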

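  19. A hypothesis of biological interest Do all cell-cycle genes share the same average period T and the same attenuation parameter σ, while the other 4 parameters are gene-specific? i.e. $H_0: T_g = T$ and $\sigma_g = \sigma$ for every gene g.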

  20. An Important Feature • Correlated data • Temporal correlation within gene • Gene-to-gene correlations

  21. Test Statistic • Wald statistic for heteroscedastic linear and non-linear models • Zhang, Peddada and Rogol (2000) • Shao (1992) • Wu (1986)

  22. The Null Distribution • Due to the underlying correlation structure, the asymptotic approximation is not appropriate. • Use the moving-blocks bootstrap technique on the residuals of the nonlinear model. • Künsch (1989)

  23. Moving-blocks Bootstrap • Step 1: Fit the null model to the data and compute the residuals. • Step 2: Draw a simple random sample (with replacement) from all possible blocks, of a specific size, of consecutive residuals.

  24. Moving-blocks Bootstrap • Step 3: Add these residuals to the fitted curve under the null hypothesis to obtain the bootstrap data set • Step 4: Using the bootstrap data fit the model under the alternate hypothesis and compute the Wald statistic.

  25. Moving-blocks Bootstrap • Step 5: Repeat the above steps a large number of times. • Step 6: The bootstrap p-value is the proportion of the above Wald statistics that exceed the Wald statistic determined from the actual data.
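Steps 1 to 6 can be sketched for a single expression series as follows. This is a minimal illustration, not the authors' code: `fit_null` and `wald_stat` are hypothetical user-supplied callables standing in for the nonlinear model fits, and the block size is left to the caller. With many genes one would presumably resample the same blocks of time points for all genes to keep gene-to-gene correlation, which this single-series sketch does not attempt.

```python
import numpy as np


def moving_blocks_resample(residuals, block_size, rng):
    """Resample a residual series by joining random blocks of consecutive values."""
    residuals = np.asarray(residuals, dtype=float)
    n = residuals.size
    n_starts = n - block_size + 1        # number of overlapping blocks
    n_blocks = -(-n // block_size)       # ceil(n / block_size)
    starts = rng.integers(0, n_starts, size=n_blocks)  # drawn with replacement
    blocks = [residuals[s:s + block_size] for s in starts]
    return np.concatenate(blocks)[:n]    # trim to the original length


def bootstrap_p_value(y, fit_null, wald_stat, block_size, n_boot=1000, seed=0):
    """Moving-blocks bootstrap p-value for one series.

    fit_null(y)  -> fitted values under the null model (hypothetical callable)
    wald_stat(y) -> Wald statistic from the alternative fit (hypothetical callable)
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    fitted = fit_null(y)                       # Step 1: null fit
    residuals = y - fitted                     # Step 1: residuals
    observed = wald_stat(y)
    boot_stats = np.empty(n_boot)
    for i in range(n_boot):                    # Step 5: repeat
        e_star = moving_blocks_resample(residuals, block_size, rng)  # Step 2
        y_star = fitted + e_star               # Step 3: bootstrap data set
        boot_stats[i] = wald_stat(y_star)      # Step 4: statistic on bootstrap data
    return float(np.mean(boot_stats > observed))  # Step 6: proportion exceeding
```

Resampling whole blocks rather than individual residuals is what preserves the short-range temporal correlation within each block.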

  26. Analysis of experiment 2 • The bootstrap p-value for testing the hypothesis of a common period and attenuation parameter, using the Experiment 2 data of Whitfield et al. (2002), is 0.12, so the hypothesis is not rejected. Thus our model is biologically plausible.

  27. Statistical inferences on the phase angle: multiple experiments

  28. Some questions of interest • How to evaluate or combine results from multiple cell division cycle experiments? • Are the results “consistent” across experiments? • How to evaluate this? • What could be a possible criterion?

  29. Data: RPM estimate of the phase angle of a cell-cycle gene ‘g’ from the i-th experiment.

  30. Representation using a circle Consider 4 cell cycle genes A, B, C, D. The vertical line in the circle denotes the reference line. The angles are measured counter-clockwise. Thus the sequential order of expression in this example is A, B, D, C. (Figure: genes A, B, C, D plotted on a circle.)

  31. “Coherence” in multiple cell-cycle experiments • A group of cell cycle genes is said to be coherent across experiments if the sequential order of their phase angles is preserved across experiments. (Figure: genes A, B, C, D on three circles labeled Exp 1, Exp 2, Exp 3.)

  32. Geometric Representation • We shall represent phase angles from multiple cell cycle experiments using concentric circles. • Each circle represents an experiment. • Same gene from a pair of experiments is connected by a line segment. • A figure with non-intersecting lines indicates perfect coherence. • If there is no coherence at all then there will be many intersecting lines.
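The definition of coherence can be checked directly by comparing the cyclic order of genes in each experiment. The sketch below is one simple way to do that; the function names and all angle values are hypothetical, chosen so that the first experiment reproduces the order A, B, D, C from slide 30. It ignores estimation error, so it applies to true angles rather than noisy estimates.

```python
import math


def cyclic_order(phase_angles):
    """Gene names sorted by angle, rotated so the alphabetically first gene leads."""
    ordered = sorted(phase_angles, key=lambda g: phase_angles[g] % (2 * math.pi))
    start = ordered.index(min(ordered))
    return tuple(ordered[start:] + ordered[:start])


def is_coherent(experiments):
    """True if every experiment shows the same cyclic order of genes."""
    return len({cyclic_order(e) for e in experiments}) == 1


# Hypothetical phase angles (radians) for genes A, B, C, D.
exp1 = {"A": 0.3, "B": 1.2, "D": 2.9, "C": 4.8}
exp2 = {"A": 5.9, "B": 0.4, "D": 1.0, "C": 3.0}   # rotated and stretched, same order
exp3 = {"A": 0.3, "B": 2.9, "D": 1.2, "C": 4.8}   # B and D swapped

coherent_pair = is_coherent([exp1, exp2])
coherent_all = is_coherent([exp1, exp2, exp3])
```

Rotating each ordering to start at the same gene makes the comparison independent of where the reference line falls, which corresponds to the non-intersecting line segments in the concentric-circle picture.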

  33. Example: Perfectly Coherent

  34. Example: Perfectly Coherent

  35. Example: No coherence

  36. Estimated Phase Angles • Due to statistical errors in estimation, the estimated phase angles from multiple cell cycle experiments need not preserve the sequential order even though the true phase angles are in a sequential order.

  37. How to evaluate coherence?

  38. Some background on regression for circular data

  39. Question: Can we determine a rotation matrix A such that we can rotate the circle representing Experiment A to obtain the circle representing Experiment B? (Figure: circles for Experiment A and Experiment B.)

  40. Angle of rotation for a rigid body • Yes! By solving the following minimization problem:
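The minimization problem on this slide is not preserved in the transcript. One standard least-squares formulation, which may differ from the criterion actually shown, places each gene at the unit vector u(θ) = (cos θ, sin θ) and picks the rotation angle α that minimizes the sum of squared distances between u(θ_B) and R(α)u(θ_A). Since each squared distance equals 2 − 2cos(θ_B − θ_A − α), the minimizer has a closed form. The function names below are hypothetical.

```python
import numpy as np


def best_rotation_angle(theta_a, theta_b):
    """Angle alpha minimising sum ||u(theta_b) - R(alpha) u(theta_a)||^2."""
    d = np.asarray(theta_b, dtype=float) - np.asarray(theta_a, dtype=float)
    # Minimising the squared distances is the same as maximising sum cos(d - alpha).
    alpha = np.arctan2(np.sin(d).sum(), np.cos(d).sum())
    return float(alpha % (2.0 * np.pi))


def rotation_matrix(alpha):
    """2x2 counter-clockwise rotation matrix."""
    c, s = np.cos(alpha), np.sin(alpha)
    return np.array([[c, -s], [s, c]])


# Hypothetical phase angles: Experiment B is Experiment A rotated by 0.7 radians.
theta_a = np.array([0.2, 1.5, 3.0, 5.0])
theta_b = (theta_a + 0.7) % (2.0 * np.pi)
alpha_hat = best_rotation_angle(theta_a, theta_b)
```

In this noise-free example the recovered angle equals the true rotation; with estimated phase angles it would be the least-squares compromise across genes.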

  41. Determination of Coherence Across “k” Experiments

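  42. The Basic Idea • Consider a rigid body rotating in a plane. Suppose the body is perfectly rigid with no deformations. • Let $A_i$ denote the 2x2 rotation matrix from experiment i to experiment i+1 (with k+1 = 1). Then composing the k rotations around the full cycle must return the body to its starting position, i.e. $A_k A_{k-1} \cdots A_1 = I$. Alternatively, the k rotation angles must sum to a multiple of $2\pi$.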

  43. The Basic Idea • Equivalently, if Then under perfect rigid body motion we should have
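The rigid-body idea can be illustrated numerically: if the same rigid configuration is merely rotated from one experiment to the next, then composing the rotations for experiments 1→2, 2→3, ..., k→1 must give the identity matrix. The sketch below uses hypothetical orientations for k = 3 experiments; the names `cycle_product` and `offsets` are assumptions made for illustration.

```python
import numpy as np


def rotation_matrix(alpha):
    """2x2 counter-clockwise rotation matrix."""
    c, s = np.cos(alpha), np.sin(alpha)
    return np.array([[c, -s], [s, c]])


def cycle_product(alphas):
    """Compose the rotations for experiments 1->2, 2->3, ..., k->1."""
    product = np.eye(2)
    for alpha in alphas:
        product = rotation_matrix(alpha) @ product
    return product


# Hypothetical orientations (radians) of the same rigid body in k = 3 experiments.
offsets = np.array([0.4, 1.9, 5.1])
# Rotation angle from experiment i to i+1, wrapping from experiment k back to 1.
alphas = np.roll(offsets, -1) - offsets
closure = cycle_product(alphas)
```

How far `closure` departs from the identity when the angles are estimated from data could serve as a rough diagnostic of departure from rigid motion, though the talk goes on to explain why rigidity is too strong a requirement here.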

  44. Problem! • In the present context we do NOT necessarily have a rigid body! • Not all experiments are performed with same precision. • The time axis may not be constant across experiments. • Number of time points may not be same across experiments. • Etc.

  45. Example: Not a rigid motion but perfectly coherent

  46. Consequence • Rotation matrix A alone may not be enough to bring two circles to congruence! • An additional “association/scaling” parameter may be needed, as seen in the previous figure!

  47. Circular-Circular regression model for a pair of experiments (Downs and Mardia, 2002) • For , let denote a pair of angular variables. • Suppose is von-Mises distributed with mean direction and concentration parameter
