
Parametric Distributions


lena


  1. Parametric Distributions

  2. Definitions • A random variable, X, is a map from the result of an experiment or observation to the real numbers. • The cumulative distribution function of a random variable is defined through the probability measure as $F_X(z) = P(X \le z)$. • This is often written F(z).

  3. Properties of F • F() is non-decreasing. • F(z) tends to 0 as z decreases to minus infinity and to 1 as z increases to plus infinity. • Note that F() is right-continuous. • For any such F(), a random variable with that cdf can be constructed (Skorokhod representation).

  4. pdf • Where F(x) can be written as the integral from minus infinity to x of some function f(z), then f(z) is termed a density (or pdf). • Where F(x) is instead expressible as a discrete sum, the discrete function f(j) is also termed a pdf (more commonly, a probability mass function). • A pdf tells us which values of the R.V. are most likely.
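
The link between a cdf and its pdf can be checked numerically. The sketch below is only an illustration: the standard Normal, the Binomial(9, 0.5) and the evaluation point x0 are arbitrary choices, not taken from the slides.

```r
# Continuous case: F(x0) should equal the integral of f from -Inf to x0.
x0 <- 1.3                                   # arbitrary evaluation point (assumption)
area <- integrate(dnorm, lower = -Inf, upper = x0)$value
cdf  <- pnorm(x0)
print(c(integral = area, cdf = cdf))        # the two values should agree closely

# Discrete case: the cdf is the running sum of f(j), here for a Binomial(9, 0.5).
j <- 0:9
running_sum <- cumsum(dbinom(j, size = 9, prob = 0.5))
print(all.equal(running_sum, pbinom(j, size = 9, prob = 0.5)))
```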

  5. Important Note • Thus, this idea is very general. • Lots of F()s are possible. • A closed functional form for F() and f() is not required. • Exercise: • Draw some ‘possible’ cdfs. • Check that they fulfil the conditions. • A note about empirical cdfs.
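
As a rough illustration of the note on empirical cdfs, the sketch below builds one with R's ecdf() and checks the conditions from slide 3. The sample is simulated from a Gamma(2, 2) purely as a stand-in for real observations; the seed and sample size are arbitrary.

```r
set.seed(1)                               # arbitrary seed, for reproducibility only
obs  <- rgamma(50, shape = 2, rate = 2)   # hypothetical sample standing in for data
Fhat <- ecdf(obs)                         # empirical cdf: a right-continuous step function

# Evaluate on a grid that extends beyond the observed range.
grid <- seq(-1, max(obs) + 1, length.out = 500)
vals <- Fhat(grid)

non_decreasing <- all(diff(vals) >= 0)    # never goes down
starts_at_zero <- Fhat(min(obs) - 1) == 0 # 0 to the left of all observations
ends_at_one    <- Fhat(max(obs)) == 1     # 1 from the largest observation onwards
print(c(non_decreasing, starts_at_zero, ends_at_one))
```

Calling plot(Fhat) draws the step function, which can be compared with the hand-drawn cdfs from the exercise.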

  6. Example • A lecturer is thinking of doing building work in his house, but is waiting to hear about the profits from a venture he was involved in before deciding whether to proceed. • He knows he will get at least €10,000 net from the project. • Things are going well, and it is likely that the actual returns will be around €20,000. • There is an outside chance that €40,000 could be returned, but this is unlikely.

  7. Example II • A lecturer takes about 22 minutes to cycle to work. • On a good day, and pedaling hard, he can make it in 15 minutes. The fastest he has done it is 12 minutes. • It would take 90 minutes to walk, so this is a realistic upper bound for cycling.

  8. Example III • A (male) Senior Sophister management science student is interested in ‘meeting’ with incoming (female) JF students in ESS throughout the year. • Previous experience tells him his ‘success’ rate is about 1 in 10. • Around 50 opportunities present themselves a year. • Summarise the annual promiscuity.
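
One possible summary of Example III treats the annual count of successes as Binomial. This is a sketch that assumes the 50 opportunities are independent and share the same 1-in-10 success rate, which the slide does not state.

```r
n <- 50          # opportunities per year (from the slide)
q <- 0.1         # success rate of about 1 in 10 (from the slide)

mean_count <- n * q                     # expected number of successes
var_count  <- n * q * (1 - q)           # variance under the Binomial assumption
p_none     <- dbinom(0, size = n, prob = q)       # chance of no successes at all
p_ten_plus <- 1 - pbinom(9, size = n, prob = q)   # chance of ten or more

print(c(mean = mean_count, var = var_count,
        p_none = p_none, p_ten_plus = p_ten_plus))
```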

  9. Parametric Forms • Over the years, mathematicians have examined functions that have the properties described. • Many of these have arisen through considering combinations of other simple functions. • These functions have parameters, which can be modified to change the shape of the curve. • However, the overall functional form stays the same.

  10. Advantages of Parametric Distributions • Properties and behaviours well understood. • Moments can readily be calculated. • Black box software available. • Can readily communicate models to colleagues. • Sufficiently flexible for most purposes. • As realistic as empirical functions and may be more physically justifiable.

  11. Disadvantages • May not exactly match application (ease of use vs tool availability compromise.) • Results may be sensitive to distributional assumptions. • Sometimes easy to program without a full understanding of what is going on – downside of black box.

  12. Some Models • Bernoulli - Br(x|q) - dbinom(x,1,q) • Binomial - Bi(x|q,n) - dbinom() • Poisson - Pn(x|l) - dpois() • Beta - Be(x|a,b) - dbeta() • Uniform - Un(x|a,b) - dunif() • Gamma - Ga(x|a,b) - dgamma() • Exponential - Ex(x|q) - dexp() • Normal - N(x|m,s) - dnorm()

  13. Binomial • Bi(x|q,n) • Pdf: $f(x) = \binom{n}{x} q^{x} (1-q)^{n-x}$ • $E(x) = nq$ • $\mathrm{Var}(x) = nq(1-q)$ • Graph for n=9 and q=0.5.

  14. Binomial • Cdf • This is a step function, since x can only take integer values.
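
The Binomial formulas can be checked against R's built-in functions. This is a minimal sketch for the plotted case n=9, q=0.5; the variable names are illustrative only.

```r
n <- 9
q <- 0.5
x <- 0:n                                   # all possible values of the count

# Pdf written out from the formula, compared with dbinom().
f_manual <- choose(n, x) * q^x * (1 - q)^(n - x)
f_r      <- dbinom(x, size = n, prob = q)
print(all.equal(f_manual, f_r))

# Moments computed directly from the pdf.
mean_x <- sum(x * f_r)                     # should match n * q
var_x  <- sum((x - mean_x)^2 * f_r)        # should match n * q * (1 - q)
print(c(mean = mean_x, var = var_x))

# The cdf is flat between integers, which is what makes it a step function.
flat_between <- pbinom(3.7, size = n, prob = q) == pbinom(3, size = n, prob = q)
print(flat_between)
```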

  15. Normal (Gaussian) • N(x|m,s) • Pdf: $f(x) = c \exp\{-0.5\, s^{-2} (x-m)^{2}\}$ • $E(x) = m$ • $\mathrm{Var}(x) = s^{2}$ • Graph for m=0 and s=1.0.

  16. Normal • Cdf • This is smooth since the underlying rv is continuous. • Note that neither 0 nor 1 is reached in the plotted region.
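
A short check of the Normal pdf and cdf for m=0, s=1. The slide leaves the constant c unspecified; the sketch assumes the standard normalising constant 1/(s*sqrt(2*pi)) and a plotting range of -4 to 4, which is a guess at the region shown.

```r
m <- 0
s <- 1                                     # s is the standard deviation, as in dnorm()
x <- seq(-4, 4, by = 0.5)                  # assumed plotting range

c_const  <- 1 / (s * sqrt(2 * pi))         # standard normalising constant (assumption)
f_manual <- c_const * exp(-0.5 * s^(-2) * (x - m)^2)
print(all.equal(f_manual, dnorm(x, mean = m, sd = s)))

# The cdf gets close to 0 and 1 at the ends of the range but reaches neither.
ends <- pnorm(c(-4, 4), mean = m, sd = s)
print(ends)
```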

  17. Choosing Models • Thus, for example, if one is interested in a smoothly varying quantity, such as response rate, then one might consider ‘modeling’ it using a Normal distribution. • If an ‘expert’ tells you that response rate is likely to be around 7%, but could go from 5% to 9%, neither of which is very likely, what values of parameters for a Normal model might represent this ‘belief’?

  18. Aside – Using ‘R’ • ‘R’ is a handy piece of statistical software that is known to be well programmed and is good for plots etc. • It is an open source implementation of the S language, of which S-Plus is the commercial implementation. • Some short input on the use of ‘R’ is worthwhile.

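  19. R • Access via web page - also on lab machines. • Command line interface.
        x <- (1:1000)/200             # 1000 equally spaced values from 0.005 to 5
        y <- dgamma(x, 2, 2)          # Gamma pdf with shape 2 and rate 2, evaluated at x
        plot(x, y, type="l", col=1)   # line plot of y against x, in black
      • Sets up a vector, x, taking sequential values from 0.005 to 5. • Sets up y to be the Gamma(2,2) pdf evaluated at x. • Plots y as a function of x, as a line plot, in black.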

  20. Norm (7,0.5) vs (7,0.8)
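
The two candidate models can be compared against the expert's statement from slide 17 (around 7%, ranging from 5% to 9%). The sketch below works in percentage points; how much probability "should" fall inside the range is a judgment call, not something the slides specify.

```r
centre <- 7                    # elicited central value, in percent
sds    <- c(0.5, 0.8)          # the two standard deviations compared on the slide

# Probability that the response rate lies between 5% and 9% under each model.
inside <- pnorm(9, mean = centre, sd = sds) - pnorm(5, mean = centre, sd = sds)

# Density at the end points relative to the density at the centre.
rel_density_at_5 <- dnorm(5, mean = centre, sd = sds) /
                    dnorm(centre, mean = centre, sd = sds)

print(data.frame(sd = sds, prob_5_to_9 = inside, rel_density_at_5 = rel_density_at_5))
```

The narrower model puts almost all of its mass well inside the stated range, while the wider one leaves a little more room near 5% and 9%.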

  21. Issues • What if the ‘belief’ says that high response rates are more likely than low ones (skew)? • Can you draw a density that might match? • What if there is likely to be a response rate of around 6%, but if by chance a marketing stunt that is being run next week gets air time on radio, then the rate will be around 10%?

  22. Exercise • Write down a pdf for • Skewed distribution • Truncated distribution • Mixture of distributions • Show (in outline) that there exists a random variable, which has as its pdf the quantity that you have written down.
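
As a starting point for the exercise, a mixture and a truncated density might be sketched as follows. The mixing weight w, the component standard deviations and the truncation limits are all hypothetical choices; the checks use a simple Riemann sum rather than exact integration.

```r
# Mixture: rate around 6% normally, around 10% if the stunt gets air time.
w <- 0.8                                        # hypothetical chance of no air time
mix_pdf <- function(x) {
  w * dnorm(x, mean = 6, sd = 0.5) + (1 - w) * dnorm(x, mean = 10, sd = 0.5)
}

# Truncated Normal(7, 0.8) restricted to [lo, hi] and rescaled to integrate to 1.
trunc_pdf <- function(x, lo = 5, hi = 9) {
  scale <- pnorm(hi, 7, 0.8) - pnorm(lo, 7, 0.8)
  ifelse(x >= lo & x <= hi, dnorm(x, 7, 0.8) / scale, 0)
}

# Both are non-negative and have total area close to 1, so each defines a valid
# cdf and, by the result on slide 3, a random variable.
dx   <- 0.001
grid <- seq(0, 16, by = dx)
mix_area   <- sum(mix_pdf(grid)) * dx
trunc_area <- sum(trunc_pdf(grid)) * dx
print(c(mixture = mix_area, truncated = trunc_area))
```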

  23. Gamma Distribution • Ga(x|a,b), with shape a and rate b • Pdf: $f(x) = c\, x^{a-1} \exp(-bx)$ • $E(x) = a/b$ • $\mathrm{Var}(x) = a/b^{2}$ • Graph for a=2, b=2.
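
The Gamma formulas can be checked in the same way for a=2, b=2. The slide does not give the constant c; the sketch assumes the standard value b^a / Gamma(a), and the evaluation points are arbitrary.

```r
a <- 2                                     # shape
b <- 2                                     # rate
x <- c(0.1, 0.5, 1, 2, 5)                  # arbitrary evaluation points

c_const  <- b^a / gamma(a)                 # standard normalising constant (assumption)
f_manual <- c_const * x^(a - 1) * exp(-b * x)
print(all.equal(f_manual, dgamma(x, shape = a, rate = b)))

# Mean and variance by numerical integration, compared with a/b and a/b^2.
mean_num <- integrate(function(t) t * dgamma(t, shape = a, rate = b), 0, Inf)$value
var_num  <- integrate(function(t) (t - a / b)^2 * dgamma(t, shape = a, rate = b),
                      0, Inf)$value
print(c(mean = mean_num, var = var_num))
```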

  24. Use in modeling • Thus, instead of fixing deterministic aspects of the model, we can allow inputs to be defined by parametric distributions. • We still need to fix the parameters of the distributions, but this may be much more realistic than fixing values. • Elicitation is the term given to the assignment of parameters based on ‘expert opinion.’

  25. Method • Thus, we have the following method at the modeling step: • Determine a ‘realistic’ model for the situation (conditional on particular values of inputs). • Examine which inputs have the biggest impact on the output variable of main interest. • Model the uncertainty of the inputs through a probability distribution. • Examine the impact on outputs.

  26. Practicalities • This can be done by: • Examining the moments of the combinations of random variables. • Analytically (gives exact answer, but messy). • Simulation from the distributions of interest.

  27. Simulation • In order to ‘simulate’ values from the distribution of interest we need a system for generating random numbers. • It suffices to be able to generate numbers from a uniform[0,1). • Prove that if this can be done, then any random variable can be simulated. • Example (Excel): NORMSINV(RAND())
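
The idea behind the spreadsheet example is to pass uniform draws through an inverse cdf. In R the same thing can be sketched with runif() and the quantile functions; the seed, sample size and the Exponential rate of 2 are arbitrary choices.

```r
set.seed(42)                    # arbitrary seed, for reproducibility only
u <- runif(10000)               # uniform random numbers

# Normal draws: qnorm() is the inverse Normal cdf, so this mirrors NORMSINV(RAND()).
z <- qnorm(u)

# Exponential draws with rate 2, using the closed-form inverse cdf -log(1 - u) / rate.
e <- -log(1 - u) / 2

print(c(mean_z = mean(z), sd_z = sd(z), mean_e = mean(e)))
```

The sample moments should sit close to 0, 1 and 0.5 respectively, and the second transformation agrees with qexp(u, rate = 2).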

  28. Exercise • Examine each of the distributions listed earlier in lectures. • For each one, you should produce a pdf and cdf for various parameters of interest. • These graphs can readily be constructed in R.

  29. Exercise II • For the Norseman problem, examine the impact of a response rate which is unknown, but a priori believed to be Normal, with mean 6% and standard deviation 0.6%. • Additionally, you might consider the impact of Gamma distributed orders, with shape 10 and rate 12.
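
A possible way to start Exercise II is to simulate the two uncertain inputs and push them through the model. The Norseman model itself is not given in these slides, so the output formula and the mailing size below are hypothetical placeholders to be replaced by the real model.

```r
set.seed(2024)                  # arbitrary seed, for reproducibility only
n_sims <- 10000

# Inputs as specified in the exercise.
resp_rate <- rnorm(n_sims, mean = 0.06, sd = 0.006)   # response rate, Normal(6%, 0.6%)
orders    <- rgamma(n_sims, shape = 10, rate = 12)    # Gamma distributed orders

# Hypothetical stand-in for the Norseman model: NOT the model from the lectures.
mailings <- 1000                                      # hypothetical mailing size
output   <- mailings * resp_rate * orders

# Summarise the simulated output distribution.
summary_stats <- c(mean = mean(output), sd = sd(output),
                   quantile(output, c(0.05, 0.95)))
print(summary_stats)
```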
