Computing for Research I Spring 2011

1 / 19

# Computing for Research I Spring 2011 - PowerPoint PPT Presentation

Computing for Research I Spring 2011. R: Random number generation &amp; Simulations. March 21. Presented by: Yanqiu Weng. Outline. How to sample from common distribution:. Uniform distribution. Binomial distribution. Normal distribution . Pre-specified vector. Examples:.

I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.

## PowerPoint Slideshow about 'Computing for Research I Spring 2011' - tien

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript

Computing for Research ISpring 2011

R: Random number generation & Simulations

March 21

Presented by:

Yanqiu Weng

Outline

How to sample from common distribution:

Uniform distribution

Binomial distribution

Normal distribution

Pre-specified vector

Examples:

• Randomization code generation
• Simulation 1 (explore the relationship between power and effect size)
• Simulation 2 (explore the relationship between power and sample size)
• Simulation 3 (generate binary outcome using logistic regression model)

Uniform distribution

PDF

Mean:

Variance:

[R] Uniform distribution

runif(n, min=0, max=1)

See R code …

[R] Uniform distribution

Use UNIFORM distribution to generate BERNOULLI distribution

Basic idea:

0

Bernoulli

distribution

Uniform

distribution

1

See R code …

[R] Binomial distribution

rbinom(n, size, prob)

e.g. generate 10 Binomial random number with Binom(100, 0.6)

n = 10

size = 100

prob = 0.6

rbinom(10, 100, 0.6)

e.g. generate 100 Bernoulli random number with p=0.6

n = 100

size = 1

prob = 0.6

rbinom(100, 1, 0.6)

See R code …

[R] Normal distribution

dnorm(x, mean, sd) #density

pnorm(q, mean, sd) #probability P(X<=q)

qnorm(p, mean, sd) #quantile

rnorm(n, mean, sd) #random number

See R code …

[R] Normal distribution

dnorm(x, mean, sd) #density

e.g. plot a standard normal curve

pnorm(q, mean, sd) #probability P(X<=x)

e.g. calculate the p-value for a one sides test with standardized test statistic

H0: X<=0

H1: X>0

Reject H0 if “Z” is very large

If from the test, we got the Z value = 3.0, what’s the p-value?

P-value = P(Z>=z) = 1 - P(Z<=z)

1 - pnorm(3, 0, 1)

[R] Normal distribution

qnorm(p, mean, sd) #quantile

See R code …

rnorm(n, mean, sd) #random number

See R code …

e.g. randomly choose two number from {2,4,6,8,10} with/without replacement

sample(x, size, replace = FALSE, prob = NULL)

sample(c(2,4,6,8,10), 2, replace = F)

4

8

2

6

10

e.g. A question from our THEORY I CLASS:

“Draw a histogram of all possible average of 6 numbers selected from {1,2,7,8,14,20} with replacement”

A quick way to solve this question is to do a simulation:

That is: we assume we repeat selection of 6 balls with replacement from left urn for many many times, and plot their averages The R code is looked like:

14

20

8

a <- NULL

for (i in 1:10000){

a[i] <- mean(sample(c(1,2,7,8,14,20),6, replace = T))

}

hist(a)

1

2

7

e.g. Generate 1000 Bernoulli random number with P = 0.6

sample(x, size, replace = T, prob =)

Let x = (0, 1),

Let size = 1,

Let replace = T/F,

Let prob = (0.4, 0.6).

Repeat 1000 times

0

1

Example 1

Generate randomization sequence

Goal: randomize 100 patient to 2 treatment using following method:

1. Simple randomization (like flipping a coin) – Bernoulli distribution

0 0 1 0 0 1 0 1 0 0 …. 1 0 1 0

runif(), rbinom(), sample().

2. Permuted block randomization

AABB BABA BBAA BABA BAAB … BBAA

Block size = 4

sample()

Example 2

Investigate the relationship between effect size and power – drug lowing BP

Linear model: Y = b0 + b1X + e

Y: Blood Pressure (response)

X: treatment (1 = drug vs. 0 = placebo)

e: random error = var(Y)

b1 represent the effect size

For instance, if previous study shows us that the BP in placebo population are distributed as N(100, 49), what is the power when real effect of new drug on BP reduction is 0, 1, 2, 3, 4 and 5 for study with sample size = 100 (50 in drug, 50 in placebo)

b0 = 100

e ~ N(0, 49)

Important information: Y (placebo) ~ N(100, 49)

Example 2

Investigate the relationship between effect size and power - drug lowing BP

Y: Blood Pressure (response)

X: treatment (1 = drug vs. 0 = placebo)

e: random error = var(Y)

Linear model: Y = b0 + b1X + e

b0 = 100

e ~ N(0, 49)

Important information: Y (placebo) ~ N(100, 49)

What’s the power given b1 (the real effect size of the treatment) is 0, 1, 2, 3, 4 or 5

Note: if we set alpha = 0.05, power means the probability of b1 (treatment effect) showing significant (P<0.05) in the linear model among the repeated simulations

Example 2

Investigate the relationship between effect size and power - drug lowing BP

Y: Blood Pressure (response)

X: treatment (1 = drug vs. 0 = placebo)

e: random error = var(Y)

Linear model: Y = b0 + b1X + e

• Simulation steps (E.g. sample size = 50/ per group, 1000 simulations):
• Generate X according to study design (50 “1”s and 50 “0”s);
• Generate 100 “e” from N(0, 49);
• Given b0 and b1, generate Y using Y = b0 + b1X + e;
• Use 100 pairs of (Y, X) to refit a new linear model, and get the b0 and b1 and their p-value;
• Repeat these steps for 1000 times.
• If alpha = 0.05, for a two-sided test

Example 3

Investigate the relationship between sample size and power

Linear model: Y = b0 + b1X + e

What’s the power give b1 = 2 and sample size = 25, 50, 75, 100, 125 per group

Example 4

Generate binary response in logistic regression

Simple Logistic model: logit(p) = b0 + b1X

where logit(p) = log(p/1-p)

For a study, we know b0, b1, and x, we want to predict the Y, Y = 0 or 1.

To do that, we can consider each Yi as a Bernoulli trial with

e.g. if p = 0.7, that mean Y have 70% percent chance to be 1 and 30 percent chance to be 0

Example 4

Generate binary response in logistic regression

For example:

For simple Logistic model: logit(p) = b0 + b1X, we know

b0 = 1

b1 = 0.5

X ~ N(0, 1)

In simulation, we randomly choose x from N(0, 1),

Say if x = 2, how to predict the binary response Y?

First we need find its “P”, the probability of “1” in a Bernoulli distribution

Using previous knowledge, we can generate a Y from a Bernoulli distribution with p = 0.88