
RNA interference and DNA methylation (R-COOH → R-COOCH3)


Presentation Transcript


  1. Why CpG islands? CSE, Marmara University, mimoza.marmara.edu.tr/~m.sakalli/cse546. Including some slides of Papoulis. These notes will be further modified. Dec/15/09. Notes on probability are from A. Papoulis and S. U. Pillai, Probability, Random Variables and Stochastic Processes.

  2. RNA interference and DNA methylation (R-COOH → R-COOCH3) • Methylation is involved in the regulation of gene expression, protein function, and RNA metabolism. • A cell is a combination of numerous proteins, each determining how the cell functions. Disproportionately expressed proteins can have devastating effects. • Two possible vulnerabilities: • One is at the transcriptional level: while DNA is converted to mRNA, a fraction of antisense oligonucleotide binds to the unprocessed gene in the DNA, creating a three-strand complex and thereby blocking the transcription process. • The second vulnerability is at the level of translation. Translation is a ribosome-guided process for manufacturing a protein from mRNA. There, once antisense RNA hybridizes to mRNA, protein generation is inhibited, since the editing enzymes that splice introns from RNAs are blocked. RNase H recognizes the double-helix complex of antisense bound to mRNA, frees the antisense strand, and cleaves the mRNA. • Antisense therapy: HIV, influenza, and cancer treatment, where replication and transcription are targeted.

  3. RNA interference and DNA methylation (R-COOH → R-COOCH3) • RNA interference (RNAi) is a system controlling (either increasing or decreasing) the activity of RNAs. MicroRNA (miRNA) and small interfering RNA (siRNA) are direct products of genes and can bind to other specific RNAs. They play roles in defending cells against parasitic genes (viruses and transposons), but also in gene expression in general. It is universal. • The methylation process differs between prokaryotic and eukaryotic cells: in the former it occurs at the C5 of the cytosine pyrimidine ring and at the N6 nitrogen of the adenine purine ring, while in the latter it occurs at the C5 carbon of cytosine pyrimidine sites. • In mammals, methylation occurs at the 5C of the CpG dinucleotide. CpG makes up about 1% of the human genome, and most CpGs are methylated. Unmethylated CpG islands are present in the regulatory regions of genes, including promoter regions; methylation there impedes transcription and protein modelling (chromatin and histones). • One abnormality caused by incomplete methylation, for example, is Rett syndrome, an epigenetic abnormality. Methylated histones hold DNA tightly and block transcription.

  4. The occurrence of CpG dinucleotides is the least frequent in many genomes, rarer than would be expected from the independent probabilities of C and G. This is said to be because the C in CpG has a tendency to methylate and become methyl-C, and the methylation process is suppressed in areas around genes; hence these areas have a relatively higher concentration of CpG, in islands. • Epigenetic importance: methyl-C has a high chance of mutating to T, and is therefore important in epigenetic inheritance, as well as in controlling gene expression and regulation. • Questions: how close is a short sequence to being a CpG island, what is the likelihood of a long sequence containing one or more CpG islands, and, more importantly, what relation does it bear, coincidental or for some functional reason? • Therefore Markov chains.

  5. A Markov chain is a stochastic process, a discrete-time process {Xn}, n ∈ {0, 1, 2, . . .}, with the Markov property: the conditional probability distribution of future states depends only upon the current state and a fixed number of past states (with m memories). A continuous-time MC has a continuous time index. Transition probabilities: Pr{Xm+1 = j | X0 = k0, . . . , Xm-1 = km-1, Xm = i} = Pr{Xm+1 = j | Xm = i} for every i, j, k0, . . . , km-1 and for every m. Related notions: finite state machine, i.i.d. sequence. Stationary (time-homogeneous): for all n, the transition matrix does not change over time, and the future state depends only on the current state i and not on the previous states: Pr{Xn+1 = j | Xn = i} = Pr{X1 = j | X0 = i}.
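To make the definition concrete, here is a minimal Python sketch (my own addition, not part of the original slides) that simulates a time-homogeneous Markov chain over the nucleotide alphabet; the transition probabilities are invented for illustration only, not estimates from real genomic data.

```python
import numpy as np

# Minimal sketch: simulate a time-homogeneous Markov chain over {A, C, G, T}.
# The transition probabilities below are made-up illustrative values.
states = ["A", "C", "G", "T"]
P = np.array([
    [0.30, 0.20, 0.30, 0.20],   # from A
    [0.25, 0.30, 0.25, 0.20],   # from C
    [0.20, 0.30, 0.30, 0.20],   # from G
    [0.25, 0.20, 0.30, 0.25],   # from T
])

def simulate(n_steps, start=0, rng=np.random.default_rng(0)):
    """Generate a state sequence using only the current state (Markov property)."""
    seq = [start]
    for _ in range(n_steps - 1):
        seq.append(rng.choice(4, p=P[seq[-1]]))
    return "".join(states[s] for s in seq)

print(simulate(60))
```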

  6. The one-step transition matrix for a Markov chain with states S = {0, 1, 2} is P = [pij], i, j ∈ S, where pij = Pr{X1 = j | X0 = i} and each row sums to 1. Accessibility: a Markov process is ergodic if it is possible to communicate between any two states i and j; it is irreducible if all states communicate. A state is periodic if the chain can return to it only at multiples of some k > 1 (the periodicity), and aperiodic if there is no such repetitive k. A state that locks the system in, once entered, is an absorbing state. If there is no absorbing state and all states communicate, the Markov chain is irreducible.
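The following Python sketch (an illustration with invented numbers, not from the original notes) fills in the matrix placeholder: it builds a concrete 3-state transition matrix, computes its stationary distribution, and applies a standard irreducibility test.

```python
import numpy as np

# A 3-state transition matrix over S = {0, 1, 2} with made-up probabilities.
P = np.array([
    [0.5, 0.3, 0.2],
    [0.1, 0.6, 0.3],
    [0.2, 0.2, 0.6],
])

# Stationary distribution: left eigenvector of P for eigenvalue 1, normalised.
eigvals, eigvecs = np.linalg.eig(P.T)
pi = np.real(eigvecs[:, np.argmax(np.real(eigvals))])
pi = pi / pi.sum()
print("stationary distribution:", pi)

# Irreducible (all states communicate) iff some power of (I + P) is strictly positive.
n = P.shape[0]
reach = np.linalg.matrix_power(np.eye(n) + P, n - 1)
print("irreducible:", bool((reach > 0).all()))
```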

  7. Learn these: • Conditional probability, joint probability. • Independence of occurrences of events. • Bayesian process. • Expressing sequences statistically with their distributions; discriminating states. • MLE, EM. • MCMC, for producing a desired posterior distribution: 1. Metropolis-Hastings, random-walk MCMC; 2. Gibbs sampling. • Markov chains and the properties maintained: stationary, ergodic, irreducible, aperiodic. • Hidden Markov Models (the goal is to detect the sequence of underlying states that is most likely to give rise to an observed sequence). • This is the Viterbi algorithm.
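Since the slides point toward HMMs and the Viterbi algorithm for CpG-island detection, here is a hedged Python sketch of Viterbi on a toy two-state (island / background) model; the transition and emission probabilities are invented for illustration and are not the parameters used in the course notes.

```python
import numpy as np

# Toy 2-state HMM ("island" vs "background") with invented log-probabilities.
states = ["island", "background"]
obs_symbols = "ACGT"

log_start = np.log([0.5, 0.5])
log_trans = np.log([[0.9, 0.1],      # island     -> island, background
                    [0.1, 0.9]])     # background -> island, background
log_emit = np.log([[0.15, 0.35, 0.35, 0.15],   # island: C/G rich
                   [0.30, 0.20, 0.20, 0.30]])  # background

def viterbi(seq):
    """Most probable state path for an observed DNA sequence."""
    idx = [obs_symbols.index(c) for c in seq]
    n, k = len(idx), len(states)
    v = np.full((n, k), -np.inf)          # best log-probability ending in state j at time t
    back = np.zeros((n, k), dtype=int)    # backpointers
    v[0] = log_start + log_emit[:, idx[0]]
    for t in range(1, n):
        for j in range(k):
            scores = v[t - 1] + log_trans[:, j]
            back[t, j] = np.argmax(scores)
            v[t, j] = scores[back[t, j]] + log_emit[j, idx[t]]
    # Trace back the best path.
    path = [int(np.argmax(v[-1]))]
    for t in range(n - 1, 0, -1):
        path.append(back[t, path[-1]])
    return [states[s] for s in reversed(path)]

print(viterbi("ATATCGCGCGCGATAT"))
```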

  8. Independence: A and B are said to be independent events if P(AB) = P(A)P(B). (1-45) Notice that the above definition is a probabilistic statement, not a set-theoretic notion such as mutual exclusiveness. Suppose A and B are independent; then P(A|B) = P(AB)/P(B) = P(A)P(B)/P(B) = P(A). (1-46) Thus if A and B are independent, the event that B has occurred does not give any clue about the occurrence of the event A. It makes no difference to A whether B has occurred or not. PILLAI
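A quick worked check of this definition (my own example, not from the Pillai notes): roll a fair die twice and let A = "the first roll is even" and B = "the two rolls sum to 7". Then P(A) = 18/36 = 1/2, P(B) = 6/36 = 1/6, and P(AB) = 3/36 = 1/12 = P(A)P(B), so A and B are independent even though they are clearly not mutually exclusive.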

  9. Example 1.2: A box contains 6 white and 4 black balls. Remove two balls at random without replacement. What is the probability that the first one is white and the second one is black? Let W1 = "first ball removed is white" and B2 = "second ball removed is black". PILLAI

  10. Ex 1.2: A box contains 6 white and 4 black balls. Remove two at random without replacement. What is the probability that the 1st one is white and the 2nd one is black? Let W1 = "first ball removed is white" and B2 = "second ball removed is black". We need P(W1 ∩ B2) = P(W1B2). Using the conditional probability rule, P(W1B2) = P(B2|W1)P(W1). (1-47) But P(W1) = 6/10 = 3/5 and P(B2|W1) = 4/9, and hence P(W1B2) = (3/5)(4/9) = 4/15 ≈ 0.27. PILLAI

  11. Are the events W1 and B2 independent? Our common sense says no. To verify this we need to compute P(B2). Of course the fate of the second ball very much depends on that of the first ball. The first ball has two options: W1 = "first ball is white" or B1 = "first ball is black". Note that W1 ∪ B1 = Ω and W1 ∩ B1 = ∅, so W1 together with B1 forms a partition. Thus (see (1-42)-(1-44)) P(B2) = P(B2|W1)P(W1) + P(B2|B1)P(B1) = (4/9)(6/10) + (3/9)(4/10) = 2/5, and P(B2)P(W1) = (2/5)(3/5) = 6/25 ≠ 4/15 = P(W1B2). As expected, the events W1 and B2 are dependent. PILLAI
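As a sanity check on these numbers, here is a small Monte Carlo simulation in Python (my own addition, not part of the original notes) that estimates P(W1), P(B2) and P(W1 ∩ B2) and shows the two events are dependent.

```python
import random

# Monte Carlo check of Example 1.2: 6 white and 4 black balls, drawn without replacement.
random.seed(0)
trials = 200_000
w1 = b2 = w1_and_b2 = 0
for _ in range(trials):
    box = ["w"] * 6 + ["b"] * 4
    random.shuffle(box)
    first, second = box[0], box[1]
    w1 += first == "w"
    b2 += second == "b"
    w1_and_b2 += first == "w" and second == "b"

print("P(W1)      ~", w1 / trials)            # about 3/5
print("P(B2)      ~", b2 / trials)            # about 2/5
print("P(W1 & B2) ~", w1_and_b2 / trials)     # about 4/15 = 0.267, not (3/5)(2/5)
```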

  12. From (1-35), P(AB) = P(A|B)P(B). (1-48) Similarly, from (1-35), P(AB) = P(B|A)P(A), (1-49) or P(A|B)P(B) = P(B|A)P(A), or P(A|B) = P(B|A)P(A) / P(B), Bayes' theorem. (1-50) PILLAI

  13. Although simple enough, Bayes' theorem has an interesting interpretation: P(A|B): the a-posteriori probability of A given B. P(B): (new information) the evidence that "B has occurred". P(B|A): the likelihood of B given A. P(A): the a-priori probability of the event A. We can also view the event B as new knowledge obtained from a fresh experiment. We know something about A as P(A). The new information is available in terms of B. The new information should be used to improve our knowledge/understanding of A. Bayes' theorem gives the exact mechanism for incorporating such new information. PILLAI

  14. A more general version of Bayes' theorem involves a partition of Ω. From (1-50), P(Ai|B) = P(B|Ai)P(Ai) / P(B) = P(B|Ai)P(Ai) / Σj P(B|Aj)P(Aj), (1-51) where we have made use of (1-44). In (1-51), A1, A2, . . . , An represent a set of mutually exclusive events with associated a-priori probabilities P(Ai), i = 1, . . . , n. With the new information "B has occurred", the information about Ai can be updated by the n conditional probabilities P(B|Ai), i = 1, . . . , n. PILLAI

  15. Example 1.3: Two boxes B1 and B2 contain 100 and 200 light bulbs respectively. The first box (B1) has 15 defective bulbs and the second 5. Suppose a box is selected at random and one bulb is picked out. (a) What is the probability that it is defective? Solution: Note that box B1 has 85 good and 15 defective bulbs. Similarly box B2 has 195 good and 5 defective bulbs. Let D = "defective bulb is picked out". Then P(D|B1) = 15/100 = 0.15 and P(D|B2) = 5/200 = 0.025. PILLAI

  16. Since a box is selected at random, they are equally likely: P(B1) = P(B2) = 1/2. Thus B1 and B2 form a partition as in (1-43), and using (1-44) we obtain P(D) = P(D|B1)P(B1) + P(D|B2)P(B2) = 0.15 × 0.5 + 0.025 × 0.5 = 0.0875. Thus, there is about a 9% probability that a bulb picked at random is defective. PILLAI

  17. (b) Suppose we test the bulb and it is found to be defective. What is the probability that it came from box 1? We need P(B1|D). Notice that initially P(B1) = 1/2; then we picked out a box at random and tested a bulb that turned out to be defective. Can this information shed some light on the fact that we might have picked up box 1? Using Bayes' theorem, P(B1|D) = P(D|B1)P(B1) / P(D) = (0.15 × 0.5) / 0.0875 ≈ 0.857 > 0.5, (1-52) and indeed it is more likely at this point that we must have chosen box 1 in favour of box 2. (Recall box 1 has six times more defective bulbs compared to box 2.) PILLAI
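The arithmetic of both parts can be checked with a few lines of Python (an added illustration, not part of the Pillai notes):

```python
# Numeric check of Example 1.3: two boxes, bulb picked from a randomly chosen box.
p_box = {"B1": 0.5, "B2": 0.5}                   # box chosen at random
p_def_given_box = {"B1": 15 / 100, "B2": 5 / 200}

# (a) Total probability of picking a defective bulb.
p_def = sum(p_box[b] * p_def_given_box[b] for b in p_box)
print("P(D) =", p_def)                            # 0.0875, about 9%

# (b) Posterior probability the defective bulb came from box 1 (Bayes' theorem).
p_b1_given_def = p_def_given_box["B1"] * p_box["B1"] / p_def
print("P(B1|D) =", round(p_b1_given_def, 3))      # about 0.857
```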

  18. 14. Stochastic Processes: Introduction (see Fig. 14.1). Let ξ denote the random outcome of an experiment. To every such outcome suppose a waveform X(t, ξ) is assigned. The collection of such waveforms forms a stochastic process. The set of {ξk} and the time index t can be continuous or discrete (countably infinite or finite) as well. For fixed ξi ∈ S (the set of all experimental outcomes), X(t, ξi) is a specific time function. For fixed t, X(t, ξ) is a random variable. The ensemble of all such realizations over time represents the stochastic PILLAI/Cha

  19. process X(t) (see Fig. 14.1). For example, X(t) = a cos(ω0t + φ), where φ is a uniformly distributed random variable in (0, 2π), represents a stochastic process. Stochastic processes are everywhere: Brownian motion, stock market fluctuations, and various queuing systems all represent stochastic phenomena. If X(t) is a stochastic process, then for fixed t, X(t) represents a random variable. Its distribution function is given by FX(x, t) = P{X(t) ≤ x}. (14-1) Notice that FX(x, t) depends on t, since for a different t we obtain a different random variable. Further, fX(x, t) = ∂FX(x, t)/∂x (14-2) represents the first-order probability density function of the process X(t). PILLAI/Cha

  20. For t = t1 and t = t2, X(t) represents two different random variables X1 = X(t1) and X2 = X(t2) respectively. Their joint distribution is given by FX(x1, x2; t1, t2) = P{X(t1) ≤ x1, X(t2) ≤ x2}, (14-3) and fX(x1, x2; t1, t2) = ∂²FX(x1, x2; t1, t2)/∂x1∂x2 (14-4) represents the second-order density function of the process X(t). Similarly fX(x1, x2, . . . , xn; t1, t2, . . . , tn) represents the nth-order density function of the process X(t). Complete specification of the stochastic process X(t) requires the knowledge of fX(x1, . . . , xn; t1, . . . , tn) for all ti, i = 1, . . . , n, and for all n (an almost impossible task in reality). PILLAI/Cha

  21. Mean of a Stochastic Process: • μX(t) = E[X(t)] = ∫ x fX(x, t) dx (14-5) represents the mean value of a process X(t). In general, the mean of a process can depend on the time index t. • The autocorrelation function of a process X(t) is defined as RXX(t1, t2) = E[X(t1)X*(t2)] = ∫∫ x1x2* fX(x1, x2; t1, t2) dx1 dx2, (14-6) and it represents the interrelationship between the random variables X1 = X(t1) and X2 = X(t2) generated from the process X(t). • Properties: 1. RXX(t1, t2) = R*XX(t2, t1). 2. RXX(t, t) = E[|X(t)|²] ≥ 0 (average instantaneous power). (14-7) PILLAI/Cha

  22. 3. RXX(t1, t2) represents a nonnegative definite function, i.e., for any set of constants {ai}, Σi Σj ai aj* RXX(ti, tj) ≥ 0. (14-8) Eq. (14-8) follows by noticing that E{|Σi ai X(ti)|²} ≥ 0. The function CXX(t1, t2) = RXX(t1, t2) − μX(t1)μX*(t2) (14-9) represents the autocovariance function of the process X(t). Example 14.1: Let … Then … (14-10) PILLAI/Cha

  23. Example 14.2: … (14-11) This gives … (14-12) Similarly … (14-13) PILLAI/Cha

  24. Stationary Stochastic Processes. Stationary processes exhibit statistical properties that are invariant to a shift in the time index. Thus, for example, second-order stationarity implies that the statistical properties of the pairs {X(t1), X(t2)} and {X(t1+c), X(t2+c)} are the same for any c. Similarly, first-order stationarity implies that the statistical properties of X(ti) and X(ti+c) are the same for any c. In strict terms, the statistical properties are governed by the joint probability density function. Hence a process is nth-order Strict-Sense Stationary (S.S.S) if fX(x1, . . . , xn; t1, . . . , tn) = fX(x1, . . . , xn; t1+c, . . . , tn+c) (14-14) for any c, where the left side represents the joint density function of the random variables X1 = X(t1), . . . , Xn = X(tn) and the right side corresponds to the joint density function of the random variables X1' = X(t1+c), . . . , Xn' = X(tn+c). A process X(t) is said to be strict-sense stationary if (14-14) is true for all ti, i = 1, . . . , n, for all n, and for any c. PILLAI/Cha

  25. For a first-order strict-sense stationary process, from (14-14) we have fX(x, t) = fX(x, t+c) (14-15) for any c. In particular, c = −t gives fX(x, t) = fX(x), (14-16) i.e., the first-order density of X(t) is independent of t. In that case E[X(t)] = ∫ x fX(x) dx = μ, a constant. (14-17) Similarly, for a second-order strict-sense stationary process we have from (14-14) fX(x1, x2; t1, t2) = fX(x1, x2; t1+c, t2+c) (14-18) for any c. For c = −t2 we get fX(x1, x2; t1, t2) = fX(x1, x2; t1−t2), PILLAI/Cha

  26. i.e., the second-order density function of a strict-sense stationary process depends only on the difference of the time indices, τ = t1 − t2. In that case the autocorrelation function is given by RXX(t1, t2) = E[X(t1)X*(t2)] = RXX(t1 − t2) = RXX(τ), (14-19) i.e., the autocorrelation function of a second-order strict-sense stationary process depends only on the difference of the time indices τ = t1 − t2. Notice that (14-17) and (14-19) are consequences of the stochastic process being first- and second-order strict-sense stationary. On the other hand, the basic conditions for the first- and second-order stationarity, Eqs. (14-16) and (14-18), are usually difficult to verify. In that case, we often resort to a looser definition of stationarity, known as Wide-Sense Stationarity (W.S.S), by making use of PILLAI/Cha

  27. (14-17) and (14-19) as the necessary conditions. Thus, a process X(t) is said to be Wide-Sense Stationary if • (i) E[X(t)] = μ, a constant, (14-20) • and (ii) E[X(t1)X*(t2)] = RXX(t1 − t2), (14-21) • i.e., for wide-sense stationary processes, the mean is a constant and the autocorrelation function depends only on the difference between the time indices. Notice that (14-20)-(14-21) do not say anything about the nature of the probability density functions, and instead deal with the average behaviour of the process. Since (14-20)-(14-21) follow from (14-16) and (14-18), strict-sense stationarity always implies wide-sense stationarity. However, the converse is not true in general, the only exception being the Gaussian process. • This follows, since if X(t) is a Gaussian process, then by definition X1 = X(t1), . . . , Xn = X(tn) are jointly Gaussian random variables for any t1, . . . , tn, whose joint characteristic function is given by PILLAI/Cha
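As a concrete check of (14-20)-(14-21), the following Python sketch (an added illustration, not from the notes) estimates the ensemble mean and autocorrelation of a randomly phased cosine process and compares the latter with its theoretical value, which depends only on the lag t1 − t2.

```python
import numpy as np

# For X(t) = a*cos(w0*t + phi) with phi uniform on (0, 2*pi), the ensemble mean
# is constant (0) and the autocorrelation depends only on the lag t1 - t2,
# so the process is wide-sense stationary.
rng = np.random.default_rng(1)
a, w0 = 1.0, 2.0
n_realisations = 100_000
phi = rng.uniform(0, 2 * np.pi, n_realisations)

def X(t):
    """One sample of the process at time t for every realisation of phi."""
    return a * np.cos(w0 * t + phi)

t1, t2 = 0.7, 0.3
print("E[X(t1)]  ~", X(t1).mean())                      # close to 0 for any t1
print("R(t1,t2)  ~", (X(t1) * X(t2)).mean())            # depends only on t1 - t2
print("theory     =", (a**2 / 2) * np.cos(w0 * (t1 - t2)))
```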

  28. φX(ω1, . . . , ωn) = exp( j Σk μX(tk)ωk − (1/2) Σi Σk CXX(ti, tk)ωiωk ), (14-22) where CXX(ti, tk) is as defined in (14-9). If X(t) is wide-sense stationary, then using (14-20)-(14-21) in (14-22) we get φX(ω1, . . . , ωn) = exp( j μ Σk ωk − (1/2) Σi Σk CXX(ti − tk)ωiωk ), (14-23) and hence if the set of time indices is shifted by a constant c to generate a new set of jointly Gaussian random variables X1' = X(t1+c), . . . , Xn' = X(tn+c), then their joint characteristic function is identical to (14-23). Thus the sets of random variables {Xi} and {Xi'} have the same joint probability distribution for all n and all c, establishing the strict-sense stationarity of Gaussian processes from their wide-sense stationarity. To summarize, if X(t) is a Gaussian process, then wide-sense stationarity (w.s.s) ⇔ strict-sense stationarity (s.s.s). Notice that since the joint p.d.f of Gaussian random variables depends only on their second-order statistics, which is also the basis PILLAI/Cha

  29. The ergodic hypothesis: an isolated system in thermal equilibrium, evolving in time, will pass through all the accessible microstates at the same recurrence rate, i.e. all accessible microstates are equally probable. The average over long times will equal the average over the ensemble of all equi-energetic microstates: if we take a snapshot of a system with N microstates, we will find the system in any of these microstates with the same probability.
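To connect this hypothesis back to the Markov chains above, here is a short Python sketch (an added illustration with invented numbers) showing that for an ergodic chain the long-run time-average occupation of each state matches the ensemble (stationary) distribution.

```python
import numpy as np

# Time average vs ensemble average for an ergodic Markov chain:
# the long-run fraction of time spent in each state converges to the
# stationary (ensemble) distribution.
rng = np.random.default_rng(2)
P = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.6, 0.3],
              [0.2, 0.2, 0.6]])

# Ensemble view: stationary distribution (left eigenvector for eigenvalue 1).
eigvals, eigvecs = np.linalg.eig(P.T)
pi = np.real(eigvecs[:, np.argmax(np.real(eigvals))])
pi /= pi.sum()

# Time view: one long trajectory, fraction of visits to each state.
n_steps, state = 200_000, 0
counts = np.zeros(3)
for _ in range(n_steps):
    counts[state] += 1
    state = rng.choice(3, p=P[state])

print("ensemble (stationary) distribution: ", np.round(pi, 3))
print("time-average occupation frequencies:", np.round(counts / n_steps, 3))
```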
