V10: Bayesian Parameter Estimation
Although the MLE approach seems plausible, it can be overly simplistic in many cases. Assume again that we perform the thumbtack experiment and get 3 heads out of 10 → assuming $\theta = 0.3$ is then quite reasonable. But what if we do the same experiment with a standard coin, and also get 3 heads? Intuitively, we would probably not conclude that the parameter of the coin is 0.3. Why not? Because we have a lot more experience with tossing coins, we have a lot more prior knowledge about their behavior.
Mathematics of Biological Networks
Joint probabilistic model
In the Bayesian approach, we encode our prior knowledge about $\theta$ with a probability distribution. This distribution represents how likely we are a priori to believe the different choices of parameters. Then we can create a joint distribution over the parameter $\theta$ and the data cases X[1], …, X[M] that we are about to observe. This joint distribution captures our assumptions about the experiment. As long as we don't know $\theta$, the tosses are not marginally independent because each toss tells us something about $\theta$. Once $\theta$ is known, we assume that the tosses are conditionally independent given $\theta$.
Joint probabilistic model
We can describe these assumptions using the probabilistic model below.
[Figure: network in which the parameter $\theta$ is the parent of each toss X[1], …, X[M]]
Joint probabilistic model
Having determined the model structure, it remains to specify the local probability models in this network. We begin by considering the probability $P(X[m] \mid \theta)$:
$$P(x[m] \mid \theta) = \begin{cases} \theta & \text{if } x[m] = \text{heads} \\ 1 - \theta & \text{if } x[m] = \text{tails} \end{cases}$$
We also need to describe the prior distribution over $\theta$, $P(\theta)$. This is a continuous density over the interval [0,1]. There are several possible choices for this. Let us first consider how to use it.
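Joint probabilistic model
The network structure implies that the joint distribution of a particular data set and $\theta$ factorizes as
$$P(x[1], \dots, x[M], \theta) = P(x[1], \dots, x[M] \mid \theta)\, P(\theta) = P(\theta) \prod_{m=1}^{M} P(x[m] \mid \theta) = P(\theta)\, \theta^{M[1]} (1 - \theta)^{M[0]}$$
where M[1] is the number of heads in the data, M[0] is the number of tails, and $P(x[1], \dots, x[M] \mid \theta)$ is simply the likelihood function $L(\theta : D)$. This network specifies a joint probability model over parameters and data.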
Posterior distribution
There are several ways in which we can use this network. For example, we can take an observed data set D of M outcomes, and use it to instantiate the values of x[1], …, x[M]. We can then compute the posterior distribution over $\theta$:
$$P(\theta \mid x[1], \dots, x[M]) = \frac{P(x[1], \dots, x[M] \mid \theta)\, P(\theta)}{P(x[1], \dots, x[M])}$$
The first term in the numerator is the likelihood, the second term is the prior over the parameters. The denominator is a normalizing factor so that the product is a proper density function over [0,1].
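Prediction
Let us consider the value of the next coin toss x[M+1] given the observations of the first M tosses. Since $\theta$ is unknown, we will consider all its possible values and integrate over them:
$$P(x[M+1] \mid x[1], \dots, x[M])$$
$$= \int_0^1 P(x[M+1] \mid \theta, x[1], \dots, x[M])\, P(\theta \mid x[1], \dots, x[M])\, d\theta$$
$$= \int_0^1 P(x[M+1] \mid \theta)\, P(\theta \mid x[1], \dots, x[M])\, d\theta$$
When going from the second to the third line, we used the conditional independencies implied by the meta-network.
→ we are integrating the posterior over $\theta$ to predict the probability of heads for the next toss.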
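Prediction: revisit thumbtack example
Assume that our prior is uniform (constant) over $\theta$ in the interval [0,1]. Then $P(\theta \mid x[1], \dots, x[M])$ is proportional to the likelihood $P(x[1], \dots, x[M] \mid \theta) = \theta^{M[1]} (1 - \theta)^{M[0]}$. Plugging this into the integral, we need to compute
$$P(X[M+1] = \text{heads} \mid x[1], \dots, x[M]) = \frac{1}{P(x[1], \dots, x[M])} \int_0^1 \theta \cdot \theta^{M[1]} (1 - \theta)^{M[0]}\, d\theta = \frac{M[1] + 1}{M[1] + M[0] + 2}$$
This so-called Bayesian estimator is quite similar to the MLE prediction except that it adds one “imaginary” sample to each count. For 3 heads in 10 tosses it gives 4/12 ≈ 0.33 instead of the MLE value 0.3.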
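The uniform-prior prediction can be checked numerically. The following is a minimal Python sketch, not part of the lecture: the function names are hypothetical, and the integral over $\theta$ is approximated with a simple midpoint rule rather than solved analytically.

```python
from fractions import Fraction


def mle_estimate(heads: int, tails: int) -> Fraction:
    """Maximum likelihood estimate M[1] / (M[1] + M[0])."""
    return Fraction(heads, heads + tails)


def uniform_prior_prediction(heads: int, tails: int) -> Fraction:
    """Closed form under a uniform prior: (M[1] + 1) / (M[1] + M[0] + 2)."""
    return Fraction(heads + 1, heads + tails + 2)


def numeric_prediction(heads: int, tails: int, steps: int = 10_000) -> float:
    """Midpoint-rule approximation over [0, 1] of the ratio
    (integral of theta * likelihood) / (integral of likelihood),
    with likelihood = theta^M[1] * (1 - theta)^M[0]."""
    h = 1.0 / steps
    numerator = 0.0
    denominator = 0.0
    for i in range(steps):
        theta = (i + 0.5) * h
        likelihood = theta ** heads * (1.0 - theta) ** tails
        numerator += theta * likelihood
        denominator += likelihood
    return numerator / denominator  # the step width h cancels in the ratio


print(mle_estimate(3, 7))              # 3/10
print(uniform_prior_prediction(3, 7))  # 1/3
print(numeric_prediction(3, 7))        # numerically close to 1/3
```

The numerical ratio agrees with the closed form, which illustrates that the “one imaginary sample per count” rule is just the result of integrating over the posterior.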
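Priors: Beta distribution
When using nonuniform priors, the challenge is to pick a continuous distribution that can be written in a compact form (e.g. using an analytical formula), and that can be updated efficiently as we get new data. An appropriate prior is the Beta distribution.
Definition: a Beta distribution is parametrized by two real and positive hyperparameters $\alpha_1$, $\alpha_0$ and defined as:
$$\theta \sim \mathrm{Beta}(\alpha_1, \alpha_0) \quad \text{if} \quad p(\theta) = \gamma\, \theta^{\alpha_1 - 1} (1 - \theta)^{\alpha_0 - 1}$$
The normalization constant $\gamma$ is defined as:
$$\gamma = \frac{\Gamma(\alpha_1 + \alpha_0)}{\Gamma(\alpha_1)\, \Gamma(\alpha_0)}$$
where $\Gamma(x) = \int_0^\infty t^{x-1} e^{-t}\, dt$ is the Gamma function.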
Beta distribution
The parameters $\alpha_1$ and $\alpha_0$ correspond intuitively to the number of imaginary heads and tails that we have “seen” before starting the experiment.
[Figure: examples of Beta distributions]
Gamma function
The Gamma function is simply a continuous generalization of factorials. It satisfies $\Gamma(1) = 1$ and $\Gamma(x + 1) = x\, \Gamma(x)$. Hence $\Gamma(n + 1) = n!$
Beta distributions have properties that make them particularly useful for parameter estimation. Assume our distribution $P(\theta)$ is $\mathrm{Beta}(\alpha_1, \alpha_0)$ and consider a single coin toss X. Let us compute the marginal probability over X, based on $P(\theta)$. We need to integrate out $\theta$.
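Properties of Beta distributions
$$P(X = \text{heads}) = \int_0^1 P(X = \text{heads} \mid \theta)\, P(\theta)\, d\theta = \int_0^1 \theta\, P(\theta)\, d\theta = \frac{\alpha_1}{\alpha_1 + \alpha_0}$$
This finding matches our intuition that the Beta prior indicates that we have seen $\alpha_1$ (imaginary) heads and $\alpha_0$ (imaginary) tails.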
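The Gamma-function identities and the marginal probability of heads under a Beta prior can be verified numerically. This is a hedged Python sketch using only the standard library; `beta_pdf`, `integrate` and `marginal_heads` are hypothetical helper names, and the midpoint rule assumes hyperparameters of at least 1 so that the density stays bounded on [0, 1].

```python
import math


def beta_pdf(theta: float, alpha1: float, alpha0: float) -> float:
    """Beta density: gamma * theta^(alpha1 - 1) * (1 - theta)^(alpha0 - 1)."""
    norm = math.gamma(alpha1 + alpha0) / (math.gamma(alpha1) * math.gamma(alpha0))
    return norm * theta ** (alpha1 - 1) * (1.0 - theta) ** (alpha0 - 1)


def integrate(f, steps: int = 20_000) -> float:
    """Midpoint-rule integral of f over [0, 1]."""
    h = 1.0 / steps
    return sum(f((i + 0.5) * h) for i in range(steps)) * h


def marginal_heads(alpha1: float, alpha0: float) -> float:
    """P(X = heads): integral of theta * p(theta) over [0, 1]."""
    return integrate(lambda t: t * beta_pdf(t, alpha1, alpha0))


print(math.gamma(5))                               # 24.0, i.e. 4!
print(integrate(lambda t: beta_pdf(t, 3, 2)))      # close to 1 (proper density)
print(marginal_heads(3, 2), 3 / (3 + 2))           # both close to 0.6
```

The integral of the density is numerically 1, and the marginal probability of heads matches $\alpha_1 / (\alpha_1 + \alpha_0)$.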
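Properties of Beta distributions
As we get more observations, i.e. M[1] heads and M[0] tails, it follows that
$$P(\theta \mid x[1], \dots, x[M]) \propto P(x[1], \dots, x[M] \mid \theta)\, P(\theta) \propto \theta^{M[1]} (1 - \theta)^{M[0]} \cdot \theta^{\alpha_1 - 1} (1 - \theta)^{\alpha_0 - 1} = \theta^{\alpha_1 + M[1] - 1} (1 - \theta)^{\alpha_0 + M[0] - 1}$$
which is precisely $\mathrm{Beta}(\alpha_1 + M[1], \alpha_0 + M[0])$.
This result illustrates a key property of the Beta distribution: if the prior is a Beta distribution, then the posterior distribution, that is, the prior conditioned on the evidence, is also a Beta distribution.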
Priors
An immediate consequence is that we can compute the probabilities over the next toss:
$$P(X[M+1] = \text{heads} \mid x[1], \dots, x[M]) = \frac{\alpha_1 + M[1]}{\alpha + M}$$
where $\alpha = \alpha_1 + \alpha_0$ and $M = M[1] + M[0]$.
In this case, our posterior Beta distribution tells us that we have seen $\alpha_1 + M[1]$ (imaginary) heads and $\alpha_0 + M[0]$ tails.
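Because the posterior is again a Beta distribution, the update can be applied one toss at a time. The sketch below is an illustration under stated assumptions: the function names are hypothetical, the toss sequence is made up, and integer hyperparameters are assumed so that `Fraction` gives exact results.

```python
from fractions import Fraction


def update(alpha1: int, alpha0: int, toss: int) -> tuple[int, int]:
    """Beta hyperparameters after observing one toss (1 = heads, 0 = tails)."""
    return (alpha1 + 1, alpha0) if toss == 1 else (alpha1, alpha0 + 1)


def predict_heads(alpha1: int, alpha0: int) -> Fraction:
    """Probability of heads on the next toss: alpha1 / (alpha1 + alpha0)."""
    return Fraction(alpha1, alpha1 + alpha0)


# Made-up sequence with 3 heads and 7 tails, starting from a Beta(2, 2) prior.
tosses = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
a1, a0 = 2, 2
for toss in tosses:
    a1, a0 = update(a1, a0, toss)

print((a1, a0))               # (5, 9), i.e. Beta(2 + 3, 2 + 7)
print(predict_heads(a1, a0))  # 5/14
```

Updating toss by toss ends at the same hyperparameters as adding the total counts M[1] and M[0] in one step, so only the counts need to be stored.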
Effect of Priors
Let us compare the effect of Beta(2,2) vs. Beta(10,10) on the probability over the next coin toss. Both priors predict that the probability of heads in the first toss is 0.5. How do different priors (Beta(10,10) is more narrow) affect further convergence?
Suppose we observe 3 heads in 10 tosses. Using the first prior, our estimate is
$$\frac{3 + 2}{10 + 4} = \frac{5}{14} \approx 0.36$$
Using the second prior gives
$$\frac{3 + 10}{10 + 20} = \frac{13}{30} \approx 0.43$$
But when we obtain much more data, the effect of the prior almost disappears. If we obtain 1000 tosses of which 300 are heads, both $\frac{300 + 2}{1000 + 4}$ and $\frac{300 + 10}{1000 + 20}$ give values close to 0.3.
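The comparison of the two priors can be reproduced with a few lines of Python. This is only an illustrative sketch; `bayes_estimate` is a hypothetical helper name that evaluates $(\alpha_1 + M[1]) / (\alpha_1 + \alpha_0 + M)$.

```python
def bayes_estimate(alpha1: float, alpha0: float, heads: int, tosses: int) -> float:
    """Predicted probability of heads under a Beta(alpha1, alpha0) prior."""
    return (alpha1 + heads) / (alpha1 + alpha0 + tosses)


for heads, tosses in [(0, 0), (3, 10), (300, 1000)]:
    weak = bayes_estimate(2, 2, heads, tosses)      # Beta(2, 2) prior
    strong = bayes_estimate(10, 10, heads, tosses)  # Beta(10, 10) prior
    print(f"{heads}/{tosses} heads: Beta(2,2) -> {weak:.3f}, Beta(10,10) -> {strong:.3f}")
```

With no data both priors give 0.500; after 3 heads in 10 tosses they give 0.357 and 0.433; after 300 heads in 1000 tosses they give 0.301 and 0.304, so the stronger prior is pulled away from 0.5 more slowly.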
Priors and Posteriors
Let us assume a general learning problem where we observe a training set D that contains M IID samples of a random variable X from an unknown distribution P*(X). We also assume that we have a parametric model $P(\xi \mid \theta)$, where $\xi$ denotes an instance of X and where we can choose parameters from a parameter space $\Theta$.
The MLE approach attempted to find the parameters $\theta$ in $\Theta$ that are “best” given the data.
The Bayesian approach, on the other hand, does not attempt to find a single best estimate. Instead, one quantifies the subjective probability for different values of $\theta$ after seeing the evidence.
Priors and Posteriors
We need to describe a joint distribution $P(D, \theta)$ over the data and the parameters. We can easily write
$$P(D, \theta) = P(D \mid \theta)\, P(\theta)$$
The first term on the right is the likelihood function (see V8, example on predicting PP complexes). The second term is the prior distribution over the possible values in $\Theta$. It captures our initial uncertainty about the parameters. It can also capture our previous experience before we start the experiment.
Priors and Posteriors
Once we have specified the likelihood function and the prior, we can use the data to derive the posterior distribution over the parameters using Bayes' rule:
$$P(\theta \mid D) = \frac{P(D \mid \theta)\, P(\theta)}{P(D)}$$
The term P(D) is the marginal likelihood of the data, which is the integral of the likelihood, weighted by the prior, over all possible parameter assignments:
$$P(D) = \int_{\Theta} P(D \mid \theta)\, P(\theta)\, d\theta$$
Priors and Posteriors
Let us reconsider the example of a multinomial distribution (MD). We need to describe our uncertainty about the parameters of MD. The parameter space $\Theta$ contains all nonnegative vectors $\theta = \langle \theta_1, \dots, \theta_K \rangle$ such that $\sum_k \theta_k = 1$. As we saw previously, the likelihood function is
$$L(\theta : D) = \prod_k \theta_k^{M[k]}$$
Since the posterior is a product of the prior and the likelihood, it is natural to require that the prior also have a form similar to the likelihood. One such prior is the Dirichlet distribution, which generalizes the Beta distribution.
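Dirichlet distribution
A Dirichlet distribution is specified by a set of hyperparameters $\alpha_1, \dots, \alpha_K$ so that
$$\theta \sim \mathrm{Dirichlet}(\alpha_1, \dots, \alpha_K) \quad \text{if} \quad P(\theta) \propto \prod_k \theta_k^{\alpha_k - 1}$$
We use $\alpha$ to denote $\sum_k \alpha_k$.
If we use a Dirichlet prior, then the posterior is also Dirichlet:
Proposition: If $P(\theta)$ is $\mathrm{Dirichlet}(\alpha_1, \dots, \alpha_K)$ then $P(\theta \mid D)$ is $\mathrm{Dirichlet}(\alpha_1 + M[1], \dots, \alpha_K + M[K])$, where M[k] is the number of occurrences of $x^k$.
Priors such as the Dirichlet are useful since they ensure that the posterior has a nice compact description and uses the same representation as the prior.
We will see in 2 examples the effects of priors on posterior estimates.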
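The Dirichlet update in the proposition amounts to adding observed counts to the hyperparameters. The following Python sketch is an illustration only: the function names and the four-letter example alphabet are assumptions, integer hyperparameters are assumed for exact arithmetic, and the predictive probability $(\alpha_k + M[k]) / (\alpha + M)$ used in `predictive` is the standard Dirichlet mean, which the slide itself does not state.

```python
from collections import Counter
from fractions import Fraction


def dirichlet_posterior(alphas: dict, data) -> dict:
    """Add the count M[k] of each outcome k to its hyperparameter alpha_k."""
    counts = Counter(data)
    return {k: a + counts.get(k, 0) for k, a in alphas.items()}


def predictive(alphas: dict) -> dict:
    """Probability of each outcome for the next sample: alpha_k / sum of alphas."""
    total = sum(alphas.values())
    return {k: Fraction(a, total) for k, a in alphas.items()}


# Hypothetical example: uniform Dirichlet(1, 1, 1, 1) prior over four symbols.
prior = {"A": 1, "C": 1, "G": 1, "T": 1}
posterior = dirichlet_posterior(prior, "AACGT")

print(posterior)                   # {'A': 3, 'C': 2, 'G': 2, 'T': 2}
print(predictive(posterior)["A"])  # 1/3
```

With K = 2 outcomes the same code reproduces the Beta case, e.g. a prior `{1: 2, 0: 2}` with 3 heads and 7 tails yields `{1: 5, 0: 9}`.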
Effect of Beta prior on convergence of posterior estimates
For a given data set size M, we assume that D contains 0.2 M heads and 0.8 M tails. As the amount of real data grows, our estimate converges to the true underlying distribution, regardless of the starting point.
(Left): effect of varying prior means $\theta'_1$, $\theta'_0$ for a fixed prior strength $\alpha$.
(Right): effect of varying prior strength for a fixed prior mean $\theta'_1 = \theta'_0 = 0.5$
Convergence of parameter estimate
Effect of different priors on smoothing the parameter estimates. Below the graph, the particular sequence of tosses is shown.
Solid line: MLE estimate
Dashed lines: Bayesian estimates with different strengths and uniform prior means.
Dotted line: Beta(10,10)
Small-dash line: Beta(5,5)
Large-dash line: Beta(1,1)
→ Beta(10,10) has longer “memory” about initial conditions
Imprinting effects during hematopoietic differentiation?
• One of the most well studied developmental systems
• Mature cell line models
Rathinam and Flavell 2008
Mohamed Hamed (unpublished)
Blood lineages
Mohamed Hamed (unpublished)
Motivation I
• Identify cellular events that drive cell differentiation and reprogramming
• Construct gene-regulatory network (GRN) that governs
- transitions between the different states along the developmental cell lines and
- pausing at specific states.
• Do imprinted genes play a role in regulating differentiation?
Mohamed Hamed (unpublished)
Motivation II
Berg, Lin et al. (2011): Real-time PCR analysis of imprinted gene expression in hematopoietic cells
Imprinted genes are drastically down-regulated in differentiated cells.
During the earliest phases of hematopoietic development, imprinted genes may have distinct roles.
Mohamed Hamed (unpublished)
Imprinted genes
• violate the usual rule of inheritance
• bi-allelic genes: one gene copy (allele) encoding hemoglobin from dad, one gene copy (allele) encoding hemoglobin from mom. Child: expresses equal amounts of the 2 types of hemoglobin
• mono-allelic (imprinted) genes: one allele silenced by DNA methylation
Imprinted genes cluster in the genome
Parental conflict hypothesis = “battle of the sexes”
[Diagram labels: paternally expressed genes; maternally expressed genes; embryonic growth in placenta]
Mouse Pluripotency network (Plurinet)
Pluripotency network in mouse, G. Fuellen et al. (2010)
based on 177 publications
274 genes
574 stimulations, inhibitions and interactions
Gene regulatory network around Oct4 controls pluripotency
Tightly interwoven network of 9 transcription factors keeps ES cells in pluripotent state. 6632 human genes have a binding site in their promoter region for at least one of these 9 TFs. Many genes have multiple motifs. 800 genes bind ≥ 4 TFs.
Gene expression profiles
[Figure: expression profiles of the imprinted, pluripotency and hematopoiesis gene sets, three panels labelled a to c]
Stages:
(1) long and short-term hematopoietic stem cells
(2) intermediate progenitor populations such as lymphoid primed multipotent progenitor (LMPP), common lymphoid progenitor (CLP), and granulocyte-monocyte progenitor (GMP), and
(3) terminally differentiated blood progeny such as NK cells and granulocyte-monocyte (GM).
• All 3 gene sets contain genes that are upregulated either in (1), (2) or (3) stages
Mohamed Hamed (unpublished)
Lineage-specific marker genes from all 3 gene sets cluster together
red: maternally expressed imprinted genes
blue: paternally expressed imprinted genes
cyan: pluripotency genes
orange: hematopoietic genes
Mohamed Hamed (unpublished)
Imprinted gene network (IGN)
Aim: explain surprisingly similar expression profiles of 3 gene sets
• only 5 imprinted genes (Gab1, Ins1, Phf17, Tsix, and Xist) are present in the pluripotency list and
• only 3 imprinted genes (Axl, Calcr, and Gnas) belong to the hematopoietic list.
Who regulates the imprinted genes?
• Identify regulators (TFs) of imprinted genes and target genes regulated by imprinted genes
Mohamed Hamed (unpublished)
Mohamed Hamed (unpublished)
Johannes Trumm, MSc thesis, CBI, 2011.
Mebitoo GRN Plugin
Johannes Trumm, MSc thesis, CBI, 2011.
Gene sets are (largely) co-expressed and enriched with developmental GO terms
Mohamed Hamed (unpublished)
Summary
Parameter learning from data is an important research field. We covered some basics of MLE and Bayesian parameter estimation. Powerful and efficient priors need to be chosen, see the Beta distribution.
V11 will cover structure learning.
Application example: construct a GRN to derive genes that drive hematopoiesis. Intersection with pluripotency and imprinted genes reveals an interesting module of co-expressed genes with homogeneous involvement in development.