Classical Hypothesis Testing Theory
Alexander Senf


Review

  • 5 steps of classical hypothesis testing (Ch. 3)

    • Declare null hypothesis H0 and alternate hypothesis H1

    • Fix a threshold α for Type I error (1% or 5%)

      • Type I error (α): reject H0 when it is true

      • Type II error (β): accept H0 when it is false

    • Determine a test statistic

      • a quantity calculated from the data


Review

  • Determine what observed values of the test statistic should lead to rejection of H0

    • Significance point K (determined by α)

  • Test to see if the observed data are more extreme than the significance point K

    • If it is, reject H0

    • Otherwise, accept H0


Overview of Ch. 9

  • Simple Fixed-Sample-Size Tests

  • Composite Fixed-Sample-Size Tests

  • The -2 log λ Approximation

  • The Analysis of Variance (ANOVA)

  • Multivariate Methods

  • ANOVA: the Repeated Measures Case

  • Bootstrap Methods: the Two-sample t-test

  • Sequential Analysis



The Issue

  • In the simplest case, everything is specified

    • Probability distributions under H0 and H1

      • Including all parameters

    • α (and K)

    • But: β is left unspecified

  • It is desirable to have a procedure that minimizes β given a fixed α

    • This would maximize the power of the test

      • 1-β, the probability of rejecting H0 when H1 is true


Most Powerful Procedure

  • Neyman-Pearson Lemma

    • States that the likelihood-ratio (LR) test is the most powerful test for a given α

    • The LR is defined as:

      LR = Λ(x) = [f1(x1)·f1(x2)···f1(xn)] / [f0(x1)·f0(x2)···f0(xn)]

    • where

      • f0, f1 are completely specified density functions for H0,H1

      • X1, X2, … Xn are iid random variables


Neyman-Pearson Lemma

  • H0 is rejected when LR ≥ K

  • With a constant K chosen such that:

    P(LR ≥ K when H0 is true) = α

  • Let’s look at an example using the Neyman-Pearson Lemma!

  • Then we will prove it.


Example

  • Basketball players seem to be taller than average

    • Use this observation to formulate our hypothesis H1:

      • “Tallness is a factor in the recruitment of KU basketball players”

    • The null hypothesis, H0, could be:

      • “No, the players on KU’s team are just of average height compared to the population in the U.S.”

      • “Average height of the team and the population in general is the same”


Example

  • Setup:

    • Average height of males in the US: 5′9½″
    • Average height of KU players in 2008: 6′4½″

      • Assumption: both populations are normally distributed, centered on their respective averages (μ0 = 69.5 in, μ1 = 76.5 in), with σ = 2 in

      • Sample size: 3

    • Choose α: 5%


Example

  • The two populations:

    [plot: normal densities f0 (centered at μ0 = 69.5) and f1 (centered at μ1 = 76.5); vertical axis p, horizontal axis height (inches)]


Example

  • Our test statistic is the Likelihood Ratio, LR

  • Now we need to determine a significance point K at which we can reject H0, given α = 5%

    • P(Λ(x) ≥ K | H0 is true) = 0.05, determine K


Example

  • So we just need to solve for K’ and calculate K:

    • How to solve this? Well, we only need one set of values to calculate K, so let’s pick two and solve for the third:

    • We get one result: K3’=71.0803


Example

  • Then we can just plug it into Λ and calculate K:


Example

  • With the significance point K = 1.663×10⁻⁷ we can now test our hypothesis based on observations:

    • E.g.: Sasha = 83 in, Darrell = 81 in, Sherron = 71 in

    • Λ(x) = 1.446×10¹² ≥ 1.663×10⁻⁷

    • We therefore reject H0 in favor of our hypothesis that tallness is a factor in the recruitment of KU basketball players.
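As a numeric check on this example (not part of the original slides), here is a short Python sketch, with numpy and scipy assumed available. It recomputes Λ for the observed heights, and, since Λ is increasing in Σxi here (μ1 > μ0), it also shows the standard way to get a size-α cutoff directly on the sample mean:

    import numpy as np
    from scipy.stats import norm

    mu0, mu1, sigma, n, alpha = 69.5, 76.5, 2.0, 3, 0.05

    def likelihood_ratio(x):
        # Lambda(x) = prod f1(x_i) / prod f0(x_i) for iid normal observations
        x = np.asarray(x, dtype=float)
        return np.prod(norm.pdf(x, mu1, sigma) / norm.pdf(x, mu0, sigma))

    x = [83, 81, 71]            # Sasha, Darrell, Sherron
    K = 1.663e-7                # significance point from the slides
    print(likelihood_ratio(x))  # ~1.446e12, far above K -> reject H0

    # Lambda is increasing in sum(x), so a size-alpha test can equivalently
    # reject H0 when the sample mean exceeds mu0 + z_{1-alpha}*sigma/sqrt(n)
    c = mu0 + norm.ppf(1 - alpha) * sigma / np.sqrt(n)
    print(c, np.mean(x) >= c)   # c ~ 71.4; 78.33 >= c -> reject H0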


Neyman-Pearson Proof

  • Let A define the region in the joint range of X1, X2, …, Xn where LR ≥ K. A is the critical region.

    • If A is the only critical region of size α we are done

    • Let’s assume another critical region of size α, defined by B


Proof

  • H0 is rejected if the observed vector (x1, x2, …, xn) is in A or in B.

  • Let A and B overlap in region C

  • Power of the test: rejecting H0 when H1 is true

    • The power of the test using A is ∫A L(H1), the probability that the observation vector falls in A when H1 is true


Proof

  • Define: Δ = ∫A L(H1) − ∫B L(H1)

    • The power of the test using A minus the power using B

    • Since the contribution of the common region C cancels, Δ = ∫A\C L(H1) − ∫B\C L(H1)

    • Where A\C is the set of points in A but not in C

    • And B\C contains the points in B but not in C


Proof

  • So, in A\C we have: L(H1) ≥ K·L(H0)

    • (A\C is part of A, where the likelihood ratio is at least K)

  • While in B\C we have: L(H1) < K·L(H0)

    • Why? Because B\C lies entirely outside A, and outside A the likelihood ratio is below K


Proof

  • Thus Δ ≥ K·∫A\C L(H0) − K·∫B\C L(H0) = K·(α − ∫C L(H0)) − K·(α − ∫C L(H0)) = 0

  • Which implies that the power of the test using A is greater than or equal to the power using B.



Not Identically Distributed

  • In most cases, random variables are not identically distributed, at least not under H1

    • This affects the likelihood function, L

    • For example, H1 in the two-sample t-test is: X1i ~ N(μ1, σ²), X2j ~ N(μ2, σ²)

    • Where μ1 and μ2 are different


Composite

  • Further, the hypotheses being tested do not specify all parameters

  • They are composite

  • This chapter only outlines aspects of composite test theory relevant to the material in this book.


Parameter Spaces

  • The set of values the parameters of interest can take

  • Null hypothesis: parameters in some region ω

  • Alternate hypothesis: parameters in Ω

  • ω is usually a subspace of Ω

    • Nested hypothesis case

      • Null hypothesis nested within alternate hypothesis

      • This book focuses on this case

    • “if the alternate hypothesis can explain the data significantly better we can reject the null hypothesis”


λ Ratio

  • Optimality theory for composite tests suggests this as the desirable test statistic:

    λ = Lmax(ω) / Lmax(Ω)

    • Lmax(ω): the maximum likelihood when the parameters are confined to the region ω

    • Lmax(Ω): the maximum likelihood when the parameters are confined to the region Ω, defined by H1

    • H0 is rejected when λ is sufficiently small, with the threshold set by the chosen Type I error α


Example: t-tests

  • The next slides calculate the λ ratio for the two-sample t-test (starting from the likelihood)

    • t-tests later generalize to ANOVA and T2 tests


Equal Variance Two-Sided t-test

  • Setup

    • Random variables X11,…,X1m in group 1 are Normally and Independently Distributed (μ1,σ2)

    • Random variables X21,…,X2n in group 2 are NID (μ2,σ2)

    • X1i and X2j are independent for all i and j

    • Null hypothesis H0: μ1= μ2 (= μ, unspecified)

    • Alternate hypothesis H1: μ1 and μ2 both unspecified


Equal Variance Two-Sided t-test

  • Setup (continued)

    • σ2 is unknown and unspecified in H0 and H1

      • Is assumed to be the same in both distributions

    • Region ω is: ω = {μ1 = μ2, 0 < σ² < +∞}

    • Region Ω is: Ω = {−∞ < μ1, μ2 < +∞, 0 < σ² < +∞}


Equal Variance Two-Sided t-test

  • Derivation

    • H0: writing μ for the common mean when μ1 = μ2, the maximum of the likelihood over ω is at

      μ̂ = (Σi x1i + Σj x2j) / (m + n)

    • And the (common) variance σ² is estimated by

      σ̂² = [Σi (x1i − μ̂)² + Σj (x2j − μ̂)²] / (m + n)


Equal Variance Two-Sided t-test

  • Inserting both into the likelihood function L gives the maximum over ω:

    Lmax(ω) = (2π σ̂²)^(−(m+n)/2) · e^(−(m+n)/2)


Equal Variance Two-Sided t-test

  • Do the same thing for region Ω: the maximum is at μ̂1 = x̄1, μ̂2 = x̄2, with variance estimate σ̂Ω² = [Σi (x1i − x̄1)² + Σj (x2j − x̄2)²] / (m + n)

  • Which produces this likelihood function L:

    Lmax(Ω) = (2π σ̂Ω²)^(−(m+n)/2) · e^(−(m+n)/2)


Equal Variance Two-Sided t-test

  • The test statistic λ is then

    λ = Lmax(ω) / Lmax(Ω) = (σ̂Ω² / σ̂²)^((m+n)/2)

  • It’s the same function in both cases, just with different variance estimates


Equal Variance Two-Sided t-test

  • We can then use the algebraic identity

    Σi (x1i − x̄)² + Σj (x2j − x̄)² = Σi (x1i − x̄1)² + Σj (x2j − x̄2)² + [mn/(m+n)]·(x̄1 − x̄2)²

  • To show that

    λ = [1 + t²/(m + n − 2)]^(−(m+n)/2)

  • Where t is (from Ch. 3)

    t = (x̄1 − x̄2) / (s·√(1/m + 1/n))


Equal Variance Two-Sided t-test

  • t is the observed value of T

  • S (the pooled standard deviation s) is defined in Ch. 3 as

    s² = [Σi (x1i − x̄1)² + Σj (x2j − x̄2)²] / (m + n − 2)

  • [plot: λ as a function of t (e.g. m + n = 10); λ is maximal at t = 0 and decreases symmetrically as |t| grows]


Equal Variance Two-Sided t-test

  • So, by the monotonicity argument, we can use t2 or |t| instead of λ as test statistic

  • Small values of λ correspond to large values of |t|

  • Sufficiently large values of |t| lead to rejection of H0

  • The H0 distribution of t is known

    • t-distribution with m+n-2 degrees of freedom

  • Significance points are widely available

    • Once α has been chosen, values of |t| sufficiently large to reject H0 can be determined


Equal Variance Two-Sided t-test

http://www.socr.ucla.edu/Applets.dir/T-table.html


Equal Variance One-Sided t-test

  • Similar to Two-Sided t-test case

    • Different region Ω for H1:

      • The means μ1 and μ2 are not simply different; one is larger than the other: μ1 ≥ μ2

      • If x̄1 ≥ x̄2, the maximum likelihood estimates are the same as for the two-sided case


Equal Variance One-Sided t-test

  • If x̄1 < x̄2, the unconstrained maximum of the likelihood lies outside of Ω

  • The unique maximum is at (μ̂1, μ̂2) = (x̄1, x̄2), implying that the maximum over Ω occurs at a boundary point, on the line μ1 = μ2

  • At this point the estimates of μ1 and μ2 are equal

  • At this point the likelihood ratio is 1 and H0 is not rejected

  • Result: H0 is rejected in favor of H1 (μ1 ≥ μ2) only for sufficiently large positive values of t


Example - Revised

  • This scenario fits with our original example:

    • H1 is that the average height of KU basketball players is greater than for the general population

    • One-sided test

    • We could assume that we don’t know the averages for H0 and H1

    • We actually don’t know σ (I just guessed 2 in the original example)


Example - Revised

  • Updated example:

    • Observation in group 1 (KU): X1 = {83, 81, 71}

    • Observation in group 2: X2 = {65, 72, 70}

    • Pick significance point for t from a table: tα = 2.132

      • t-distribution, m+n-2 = 4 degrees of freedom, α = 0.05

    • Calculate t with our observations: t = (x̄1 − x̄2) / (s·√(1/m + 1/n)) = (78.33 − 69.00) / (5.21·√(2/3)) ≈ 2.19

    • t > tα, so we can reject H0!
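As a check, the same one-sided pooled-variance test in Python; a sketch assuming numpy and scipy ≥ 1.6 (for the alternative argument):

    import numpy as np
    from scipy import stats

    x1 = np.array([83, 81, 71])   # group 1 (KU)
    x2 = np.array([65, 72, 70])   # group 2

    # Equal-variance (pooled) two-sample t-test, one-sided H1: mean1 > mean2
    t, p = stats.ttest_ind(x1, x2, equal_var=True, alternative='greater')
    print(t)   # ~2.19, which exceeds t_alpha = 2.132
    print(p)   # one-sided p-value just under 0.05 -> reject H0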


Comments

  • Problems that might arise in other cases

    • The λ-ratio might not reduce to a function of a well-known test statistic, such as t

    • There might not be a unique H0 distribution of λ

    • Fortunately, the t statistic is a pivotal quantity

      • Independent of the parameters not prescribed by H0

        • e.g. μ, σ

    • For many testing procedures this property does not hold


Unequal Variance Two-Sided t-test

  • Identical to Equal Variance Two-Sided t-test

    • Except: variances in group 1 and group 2 are no longer assumed to be identical

      • Group 1: NID(μ1, σ1²)

      • Group 2: NID(μ2, σ2²)

      • With σ1² and σ2² unknown and not assumed identical

      • Region ω = {μ1 = μ2, 0 < σ1², σ2² < +∞}

      • Ω places no constraints on the values of μ1, μ2, σ1², and σ2²


Unequal Variance Two-Sided t-test

  • The likelihood function of (X11, …, X1m, X21, …, X2n) then becomes

    L(μ1, μ2, σ1², σ2²) = (2πσ1²)^(−m/2) · (2πσ2²)^(−n/2) · exp[−Σi (x1i − μ1)²/(2σ1²) − Σj (x2j − μ2)²/(2σ2²)]

  • Under H0 (μ1 = μ2 = μ), this becomes:

    L(μ, σ1², σ2²) = (2πσ1²)^(−m/2) · (2πσ2²)^(−n/2) · exp[−Σi (x1i − μ)²/(2σ1²) − Σj (x2j − μ)²/(2σ2²)]


Unequal Variance Two-Sided t-test

  • The maximum likelihood estimates μ̂, σ̂1², and σ̂2² satisfy the simultaneous equations:

    σ̂1² = Σi (x1i − μ̂)²/m,  σ̂2² = Σj (x2j − μ̂)²/n,  μ̂ = (m·x̄1/σ̂1² + n·x̄2/σ̂2²) / (m/σ̂1² + n/σ̂2²)


Unequal Variance Two-Sided t-test

  • Substituting the variance estimates into the equation for μ̂ leads to a cubic equation in μ̂

  • Neither the λ ratio nor any monotonic function of it has a known probability distribution when H0 is true!

  • This does not lead to any useful testing statistic

    • The t statistic may be used as a reasonable approximation

    • However, its H0 distribution is still unknown, as it depends on the unknown ratio σ1²/σ2²

    • In practice, a heuristic is often used (see Ch. 3.5)


The -2 log λ Approximation

  • Used when the λ-ratio procedure does not lead to a test statistic whose H0 distribution is known

    • Example: Unequal Variance Two-Sided t-test

  • Various approximations can be used

    • But only if certain regularity assumptions and restrictions hold true


The -2 log λ Approximation

  • Best known approximation:

    • If H0 is true, −2 log λ has an asymptotic chi-square distribution,

      • with degrees of freedom equal to the difference between the numbers of parameters left unspecified by H1 and by H0

      • λ is the likelihood ratio

      • “asymptotic” = “as the sample size → ∞”

    • Provides an asymptotically valid testing procedure
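An illustrative simulation, not from the slides (numpy/scipy assumed): for the equal-variance two-sample test, λ = [1 + t²/(m+n−2)]^(−(m+n)/2) from the earlier slides, and under H0 the upper tail of −2 log λ should approach that of a chi-square with one degree of freedom (H1 frees exactly one extra parameter):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    m = n = 50

    # Simulate the two-sample t statistic under H0 (both groups identical)
    t = np.array([stats.ttest_ind(rng.normal(size=m), rng.normal(size=n),
                                  equal_var=True)[0] for _ in range(5000)])
    lam = (1 + t**2 / (m + n - 2)) ** (-(m + n) / 2)
    stat = -2 * np.log(lam)

    # Fraction exceeding the chi-square(1) 95% point should be close to 0.05
    print(np.mean(stat > stats.chi2.ppf(0.95, df=1)))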


The -2 log λ Approximation

  • Restrictions:

    • Parameters must be real numbers that can take on values in some interval

    • The maximum likelihood estimator is found at a turning point of the function

      • i.e. a “real” maximum, not at a boundary point

    • H0 is nested in H1 (as in all previous slides)

  • These restrictions are important in the proof

    • I skip the proof…


The -2 log λ Approximation

  • Instead:

    • Our original basketball example, revised again:

      • Let’s drop our last assumption, that the variance in the population at large is the same as in the group of KU basketball players.

      • All we have left now are our observations and the hypothesis that μ1 > μ2

        • Where μ1 is the average height of basketball players

      • Observation in group 1 (KU): X1 = {83, 81, 71}

      • Observation in group 2: X2 = {65, 72, 70}


Example – Revised Again

  • Using the Unequal Variance One-Sided t-Test

  • We get:
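The transcript omits the slide's computed values. As a sketch of one standard computational route (not necessarily the slide's), scipy's Welch test applies the Ch. 3.5-style heuristic, an approximate t distribution with estimated degrees of freedom, rather than an exact H0 distribution:

    import numpy as np
    from scipy import stats

    x1 = np.array([83, 81, 71])   # KU players
    x2 = np.array([65, 72, 70])   # general population sample

    # Welch's t-test: equal_var=False drops the common-variance assumption
    t, p = stats.ttest_ind(x1, x2, equal_var=False, alternative='greater')
    print(t, p)   # t ~ 2.19 with approximate (Welch-Satterthwaite) df ~ 3.1;
                  # with these numbers the one-sided p comes out slightly
                  # above 0.05, so this variant would not quite reject at 5%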



The Analysis of Variance (ANOVA)

  • Probably the most frequently used hypothesis testing procedure in statistics

  • This section

    • Derives the sums of squares

    • Gives an outline of the ANOVA procedure

    • Introduces one-way ANOVA as a generalization of the two-sample t-test

    • Two-way and multi-way ANOVA

    • Further generalizations of ANOVA


Sum of Squares

  • New variables (from Ch. 3)

    • The two-sample t-test tests for equality of the means of two groups.

    • We could express the observations as: Xij = μj + Eij (observation i in group j, j = 1, 2)

    • Where the Eij are assumed to be NID(0, σ²)

    • H0 is μ1 = μ2


Sum of Squares

  • This can also be written as: Xij = μ + αj + Eij

    • μ could be seen as the overall mean

    • αj as the deviation from μ in group j

  • This model is overparameterized

    • Uses more parameters than necessary

    • Necessitates the requirement mα1 + nα2 = 0

    • (always assumed imposed)


Sum of Squares

  • We are deriving a test procedure similar to the two-sample two-sided t-test

  • Using |t| as test statistic

    • Absolute value of the T statistic

  • This is equivalent to using t2

    • Because it’s a monotonic function of |t|

  • The square of the t statistic (from Ch. 3):

    t² = (x̄1 − x̄2)² / [s²·(1/m + 1/n)]


Sum of Squares

  • …can, after algebraic manipulations, be written as F:

    F = B / [W/(m + n − 2)]

  • where

    B = [mn/(m + n)]·(x̄1 − x̄2)²  and  W = Σi (x1i − x̄1)² + Σj (x2j − x̄2)²


Sum of Squares

  • B: between (among) groups sum of squares

  • W: within groups sum of squares

  • B + W: total sum of squares

    • Can be shown to be: B + W = Σij (xij − x̄)², where x̄ is the grand mean

  • Total number of degrees of freedom: m + n − 1

    • Between groups: 1

    • Within groups: m + n - 2


Sum of Squares

  • This gives us the F statistic: F = (B/1) / [W/(m + n − 2)]

  • Our goal is to test the significance of the difference between the means of two groups

    • B measures the difference

  • The difference must be measured relative to the variance within the groups

    • W measures that

  • The larger F is, the more significant the difference
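A short sketch (numpy/scipy assumed, not from the slides) that computes B, W, and F for two groups directly from these definitions and confirms the F = t² relationship:

    import numpy as np
    from scipy import stats

    def two_group_anova(x1, x2):
        """Between (B) and within (W) sums of squares and the F ratio."""
        x1, x2 = np.asarray(x1, float), np.asarray(x2, float)
        m, n = len(x1), len(x2)
        grand = np.concatenate([x1, x2]).mean()
        B = m * (x1.mean() - grand) ** 2 + n * (x2.mean() - grand) ** 2
        W = ((x1 - x1.mean()) ** 2).sum() + ((x2 - x2.mean()) ** 2).sum()
        return B, W, B / (W / (m + n - 2))

    B, W, F = two_group_anova([83, 81, 71], [65, 72, 70])
    t, _ = stats.ttest_ind([83, 81, 71], [65, 72, 70], equal_var=True)
    print(B, W, F)              # ~130.67, ~108.67, ~4.81
    print(np.isclose(F, t**2))  # True: F is the square of the t statistic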


The ANOVA Procedure

  • Subdivide observed total sum of squares into several components

    • In our case, B and W

  • Pick appropriate significance point for a chosen Type I error α from an F table

  • Compare the observed components to test our hypothesis


F-Statistic

  • Significance points depend on degrees of freedom in B and W

    • In our case, 1 and (m + n – 2)

http://www.ento.vt.edu/~sharov/PopEcol/tables/f005.html


Comments

  • The two-group case readily generalizes to any number of groups.

  • ANOVAs can be classified in various ways, e.g.

    • fixed effects models

    • mixed effects models

    • random effects models

    • The differences are discussed later

    • For now we consider fixed effects models

      • Parameter αi is fixed, but unknown, in group i


Comments

  • Terminology

    • Although ANOVA contains the word ‘variance’

    • What we actually test for is equality of means between the groups

      • The different mean assumptions affect the variance, though

  • ANOVAs are special cases of regression models from Ch. 8


One-Way ANOVA

  • One-Way fixed-effect ANOVA

  • Setup and derivation

    • Like the two-sample t-test, but for g groups

    • Observations Xij (j = 1, 2, …, ni observations in group i; i = 1, 2, …, g)

    • Using the overparameterized model for X: Xij = μ + αi + Eij

    • Eij assumed NID(0, σ²), Σi niαi = 0, αi fixed in group i


One-Way ANOVA

  • Null hypothesis H0 is: α1 = α2 = … = αg = 0

  • The total sum of squares is Σij (xij − x̄)²

  • This is subdivided into B and W

  • with

    B = Σi ni·(x̄i − x̄)²  and  W = Σij (xij − x̄i)²


One-Way ANOVA

  • Total degrees of freedom: N – 1

    • Subdivided into dfB = g – 1 and dfW = N - g

  • This gives us our test statistic F:

    F = [B/(g − 1)] / [W/(N − g)]

  • We can now look in the F table for these degrees of freedom to pick the significance point

  • And calculate B and W from the observed data

  • And accept or reject H0


Example

  • Revisiting the Basketball example

    • Looking at it as a One-Way ANOVA analysis

      • Observation in group 1 (KU): X1 = {83, 81, 71}

      • Observation in group 2: X2 = {65, 72, 70}

    • Total sum of squares: Σij (xij − x̄)² = 239.33 (grand mean x̄ = 73.67)

    • B (between groups sum of squares): B = 3·(78.33 − 73.67)² + 3·(69.00 − 73.67)² = 130.67


Example

  • W (within groups sum of squares): W = 82.67 + 26.00 = 108.67

  • Degrees of freedom

    • Total: N-1 = 5

    • dfB = g – 1 = 2 - 1 = 1

    • dfW = N – g = 6 – 2 = 4


Example

  • Table lookup for df 1 and 4 and α = 0.05:

  • Critical value: F = 7.71

  • Calculate F from our data: F = (130.67/1) / (108.67/4) ≈ 4.81

  • So… 4.81 < 7.71

  • With ANOVA we actually accept H0!

    • Likely due to the large variance in group 1
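The same numbers from scipy's built-in one-way ANOVA, as a check on the hand calculation (a sketch, scipy assumed):

    from scipy import stats

    F, p = stats.f_oneway([83, 81, 71], [65, 72, 70])
    print(F, p)   # F ~ 4.81 (< 7.71), p ~ 0.09 > 0.05 -> do not reject H0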



Excel

  • Offers most of these tests, built-in


Two-Way ANOVA

  • Two-Way Fixed Effects ANOVA

  • Overview only (in the scope of this book)

  • More complicated setup; example:

    • Expression levels of one gene in lung cancer patients

    • a different risk classes

      • E.g.: ultrahigh, very high, intermediate, low (here a = 4)

    • b different age groups

    • n individuals for each risk/age combination


Two-Way ANOVA

  • Expression levels (our observations): Xijk

    • i is the risk class (i = 1, 2, …, a)

    • j indicates the age group

    • k corresponds to the individual in each group (k = 1, …, n)

      • Each group is a possible risk/age combination

    • The number of individuals in each group is the same, n

    • This is a “balanced” design

    • Theory for unbalanced designs is more complicated and not covered in this book


Two-Way ANOVA

  • The Xijk can be arranged in a two-way table:

    [table: rows indexed by risk category i, columns by age group j; each risk/age cell holds its n individuals]

Two-Way ANOVA

  • The model adopted for each Xijk is

    Xijk = μ + αi + βj + δij + Eijk

    • Where the Eijk are NID(0, σ²)

    • The mean of Xijk is μ + αi + βj + δij

    • αi is a fixed parameter, additive for risk class i

    • βj is a fixed parameter, additive for age group j

    • δij is a fixed risk/age interaction parameter

      • Included because a possible risk/age interaction may exist


Two-Way ANOVA

  • These constraints are imposed

    • Σi αi = Σj βj = 0

    • Σi δij = 0 for all j

    • Σj δij = 0 for all i

  • The total sum of squares is then subdivided into four groups:

    • Risk class sum of squares

    • Age group sum of squares

    • Interaction sum of squares

    • Within cells (“residual” or “error”) sum of squares


Two-Way ANOVA

  • Associated with each sum of squares

    • Corresponding degrees of freedom

    • Hence also a corresponding mean square

      • Sum of squares divided by degrees of freedom

  • The mean squares are then compared using F ratios to test for significance of various effects

    • First – test for a significant risk/age interaction

    • The F ratio used is the ratio of the interaction mean square to the within-cells mean square
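A hedged sketch of how such a balanced two-way fixed-effects analysis could be run with statsmodels (assumed available); the expression values, factor labels, and cell count n below are invented for illustration, and the default type-I sums of squares coincide with the subdivision above because the design is balanced:

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf
    from statsmodels.stats.anova import anova_lm

    rng = np.random.default_rng(0)
    risk = ['ultrahigh', 'very high', 'intermediate', 'low']  # a = 4 classes
    age = ['young', 'middle', 'old']                          # b = 3 groups
    n = 5                                                     # per cell
    rows = [(r, g, rng.normal(10.0, 1.0))   # synthetic expression level
            for r in risk for g in age for _ in range(n)]
    df = pd.DataFrame(rows, columns=['risk', 'age', 'expr'])

    # expr ~ risk + age + risk:age subdivides the total sum of squares into
    # risk, age, interaction, and within-cells (residual) components;
    # each F ratio uses the within-cells mean square as its denominator
    model = smf.ols('expr ~ C(risk) * C(age)', data=df).fit()
    print(anova_lm(model))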


Two-Way ANOVA

  • If such an interaction is present, it may not be reasonable to test for significant risk or age differences

  • Example, μ in two risk classes, two age groups:

    • No evidence of interaction

    • Example of interaction

    [plots: cell means for two risk classes plotted against age group – roughly parallel lines show no evidence of interaction; crossing lines show interaction]

Multi-Way ANOVA

  • One-way and two-way fixed effects ANOVAs can be extended to multi-way ANOVAs

  • Gets complicated

  • Example: a three-way ANOVA model:

    Xijkl = μ + αi + βj + γk + (αβ)ij + (αγ)ik + (βγ)jk + (αβγ)ijk + Eijkl


Further generalizations of ANOVA

  • The 2^m factorial design

    • A particular form of the one-way ANOVA

      • Interactions between main effects

    • m “factors” taken at two “levels”

      • E.g. (1) gender, (2) tissue (lung, kidney), and (3) status (affected, not affected)

    • 2^m possible combinations of levels/groups

    • Can test for main effects and interactions

    • Need replicated experiments

      • n replications for each of the 2^m experiments


Further generalizations of ANOVA

  • Example: m = 3 factors, denoted A, B, C

    • 8 groups, {abc, ab, ac, bc, a, b, c, 1}

    • Write the totals of the n observations in each group as Tabc, Tab, …, T1

    • The total between sum of squares can be subdivided into seven individual sums of squares

      • Three main effects (A, B, C)

      • Three pairwise interactions (AB, AC, BC)

      • One triple-wise interaction (ABC)

      • Example: the sums of squares for A and for BC, respectively:

        SS_A = (Tabc + Tab + Tac + Ta − Tbc − Tb − Tc − T1)² / (8n)
        SS_BC = (Tabc + Tbc + Ta + T1 − Tab − Tac − Tb − Tc)² / (8n)
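A sketch of the contrast arithmetic in pure Python: for each effect, a treatment total enters the contrast with sign equal to the product, over the effect's letters, of +1 if the letter appears in the treatment combination and −1 otherwise. The totals and n below are hypothetical:

    from itertools import combinations
    from math import prod

    def factorial_ss(totals, n, factors='abc'):
        """Sums of squares in a 2^m factorial design with n replicates.
        For an effect such as 'bc', each treatment total is signed by
        prod(+1 if letter in treatment else -1) over the effect's letters,
        and SS = contrast**2 / (n * 2**m)."""
        m = len(factors)
        ss = {}
        for r in range(1, m + 1):
            for effect in combinations(factors, r):
                contrast = sum(
                    total * prod(1 if f in cell else -1 for f in effect)
                    for cell, total in totals.items())
                ss[''.join(effect)] = contrast ** 2 / (n * 2 ** m)
        return ss

    # Hypothetical totals of n = 2 observations per treatment combination
    totals = {'abc': 21, 'ab': 18, 'ac': 17, 'bc': 14,
              'a': 15, 'b': 12, 'c': 11, '1': 9}
    print(factorial_ss(totals, n=2))  # seven SS: a, b, c, ab, ac, bc, abc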


Further generalizations of ANOVA

  • If m ≥ 5, the number of groups becomes large

  • Then the total number of observations, n·2^m, is also large

  • It is possible to reduce the number of observations by a process called confounding

  • Confounding

    • The interaction ABC is probably very small and not interesting

    • So, prefer a model without ABC, and reduce the data accordingly

    • There are ANOVA designs for that


Further generalizations of ANOVA

  • Fractional Replication

    • Related to confounding

    • Sometimes two groups cannot be distinguished from each other; they are then called aliases

      • E.g. A and BC

    • This reduces the number of experiments and the amount of data needed

    • Ch. 13 talks more about this in the context of microarrays


Random/Mixed Effect Models

  • So far: fixed effects models

    • E.g. risk class and age group were fixed in the previous example

      • Multiple experiments would use the same categories

      • But: what if we took experimental data on several random days?

      • The days in themselves have no meaning, but a “between days” sum of squares must be extracted

        • What if the days turn out to be important?

        • If we fail to test for it, the significance of our procedure is diminished.

        • Days are a random category, unlike risk and age!


Random/Mixed Effect Models

  • Mixed Effect Models

    • If some categories are fixed and some are random

    • Symbols used:

      • Greek letters for fixed effects

      • Uppercase Roman letters for random effects

    • Example: for a two-way mixed effects model with a risk classes, d days, and n values collected each day, the appropriate model is of the form

      Xijk = μ + αi + Dj + (αD)ij + Eijk

      • with the αi fixed and the Dj (and the interactions (αD)ij) random


Random/Mixed Effect Models

  • Random effects models have no fixed categories

  • The details of the ANOVA analysis depend on which effects are random and which are fixed

  • In a microarray context (more in Ch. 13)

    • There tend to be several fixed and several random effects, which complicates the analysis

    • Many interactions are simply assumed to be zero


Multivariate Methods

ANOVA: the Repeated Measures Case

Bootstrap Methods: the Two-sample t-test

All skipped …



Sequential Analysis

  • Sequential Probability Ratio

    • Sample size not known in advance

    • Depends on outcomes of successive observations

    • Some of this theory is in BLAST

      • Basic Local Alignment Search Tool

    • The book focuses on discrete random variables


Sequential Analysis

  • Consider:

    • A random variable Y with distribution P(y; ξ)

    • Tests usually relate to the value of the parameter ξ

    • H0: ξ = ξ0

    • H1: ξ = ξ1

    • We can choose a value for the Type I error α

    • And a value for the Type II error β

    • Sampling then continues while

      A < [P(y1; ξ1)···P(yn; ξ1)] / [P(y1; ξ0)···P(yn; ξ0)] < B


Sequential Analysis

  • A and B are chosen to correspond to the desired α and β

  • Sampling continues until the ratio is less than A (accept H0) or greater than B (reject H0)

  • Because these are discrete variables, boundary overshoot usually occurs

    • We don’t expect to get the values α and β exactly

  • Desired values for α and β are approximately achieved by using

    A = β/(1 − α)  and  B = (1 − β)/α


Sequential Analysis

  • It is also convenient to take logarithms, which gives us:

    log A < Σi log[P(yi; ξ1)/P(yi; ξ0)] < log B

  • Using

    S(y) = log[P(y; ξ1)/P(y; ξ0)]

  • We can write

    log A < Σi S(yi) < log B


Sequential Analysis

  • Example: sequence matching

    • H0: p0 = 0.25 (probability of a match is 0.25)

    • H1: p1 = 0.35 (probability of a match is 0.35)

    • Type I error α and Type II error β both chosen as 0.01

    • Yi: 1 if there is a match at position i, otherwise 0

    • Sampling continues while

      log(0.01/0.99) < Σi S(yi) < log(0.99/0.01), i.e. −4.595 < Σi S(yi) < +4.595

    • with

      S(1) = log(0.35/0.25) ≈ +0.3365 and S(0) = log(0.65/0.75) ≈ −0.1431


Sequential Analysis

  • S(Yi) can be seen as the support offered by Yi for H1

  • Dividing the inequality through by 0.4796 (= 0.3365 + 0.1431), it can be re-written as a walk between barriers at ±4.595/0.4796 ≈ ±9.58

  • This is actually a random walk with step sizes +0.7016 for a match and −0.2984 for a mismatch
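A simulation sketch of this sequential test (numpy assumed, not from the slides): each run takes steps S(1) or S(0) until the sum leaves (log A, log B), then reports whether H0 was rejected and how many observations were used:

    import numpy as np

    p0, p1 = 0.25, 0.35
    alpha = beta = 0.01
    logA = np.log(beta / (1 - alpha))      # ~ -4.595: accept H0 at or below
    logB = np.log((1 - beta) / alpha)      # ~ +4.595: reject H0 at or above
    s_match = np.log(p1 / p0)              # ~ +0.3365 per match
    s_miss = np.log((1 - p1) / (1 - p0))   # ~ -0.1431 per mismatch

    def sprt(p_true, rng):
        """One sequential test; returns (H0 rejected?, sample size)."""
        total, n = 0.0, 0
        while logA < total < logB:
            total += s_match if rng.random() < p_true else s_miss
            n += 1
        return total >= logB, n

    rng = np.random.default_rng(1)
    runs = [sprt(p0, rng) for _ in range(10_000)]
    print(np.mean([rej for rej, _ in runs]))  # realized Type I error, ~0.01
    print(np.mean([n for _, n in runs]))      # mean sample size under H0,
                                              # near the ~194 derived later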


Sequential Analysis

  • Power Function for a Sequential Test

    • Suppose the true value of the parameter of interest is ξ

    • We wish to know the probability that H1 is accepted, given ξ

    • This probability is the power P(ξ) of the test


Sequential Analysis

  • The power is approximately

    P(ξ) ≈ (1 − A^θ*) / (B^θ* − A^θ*)

  • Where θ* is the unique non-zero solution to θ in

    Σ_{y in R} [P(y; ξ1)/P(y; ξ0)]^θ · P(y; ξ) = 1

  • R is the range of values of Y

  • Equivalently, θ* is the unique non-zero solution to θ in

    E_ξ[e^(θ·S(Y))] = 1

  • Where S is defined as before
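A sketch (numpy/scipy assumed) that finds θ* numerically from E_ξ[e^(θ·S(Y))] = 1 and evaluates the power approximation above; at ξ = p0 it should recover α, and at ξ = p1 it should recover 1 − β:

    import numpy as np
    from scipy.optimize import brentq

    p0, p1 = 0.25, 0.35
    alpha = beta = 0.01
    A, B = beta / (1 - alpha), (1 - beta) / alpha
    s1 = np.log(p1 / p0)               # S(1), support of a match for H1
    s0 = np.log((1 - p1) / (1 - p0))   # S(0), support of a mismatch

    def theta_star(xi):
        """Unique non-zero root of E_xi[exp(theta*S(Y))] = 1 (theta = 0 is
        always a root, so we bracket away from it using the drift sign)."""
        f = lambda th: xi * np.exp(th * s1) + (1 - xi) * np.exp(th * s0) - 1
        drift = xi * s1 + (1 - xi) * s0
        return brentq(f, 1e-9, 60) if drift < 0 else brentq(f, -60, -1e-9)

    def power(xi):
        th = theta_star(xi)
        return (1 - A ** th) / (B ** th - A ** th)

    print(power(p0))   # ~0.01 = alpha     (theta* = +1 at xi = p0)
    print(power(p1))   # ~0.99 = 1 - beta  (theta* = -1 at xi = p1)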


Sequential Analysis

  • This is very similar to Ch. 7 – Random Walks

  • The parameter θ* is the same as in Ch. 7

  • And it will be the same in Ch. 10 – BLAST

  • < skipping the random walk part >


Sequential Analysis

  • Mean Sample Size

    • The (random) number of observations until one or the other hypothesis is accepted

    • An approximation is found by ignoring boundary overshoot

    • An essentially identical method is used to find the mean number of steps until the random walk stops


Sequential Analysis

  • Two expressions are calculated for E[Σi S(Yi)]

    • One involves the mean sample size (Wald’s identity: E[Σi S(Yi)] = E[N]·E[S(Y)])

    • By equating the two expressions, solve for the mean sample size


Sequential Analysis

  • So, the mean sample size is approximately:

    E[N] ≈ [P(ξ)·log B + (1 − P(ξ))·log A] / E_ξ[S(Y)]

  • Numerator and denominator both depend on ξ; the numerator through P(ξ), and so also on θ*

  • A generalization applies if the true distribution Q(y) of Y differs from both the H0 and the H1 distributions – relevant to BLAST


Sequential Analysis

  • Example

    • Same sequence matching example as before

      • H0: p0 = 0.25 (probability of a match is 0.25)

      • H1: p1 = 0.35 (probability of a match is 0.35)

      • Type I error α and Type II error β both chosen as 0.01

    • The mean sample size formula is evaluated with P(p0) = α and P(p1) = 1 − β

    • Mean sample size when H0 is true: 194

    • Mean sample size when H1 is true: 182
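A quick check of these numbers under the no-overshoot (Wald) approximation, numpy assumed:

    import numpy as np

    p0, p1 = 0.25, 0.35
    alpha = beta = 0.01
    logA = np.log(beta / (1 - alpha))
    logB = np.log((1 - beta) / alpha)

    def mean_sample_size(xi, P):
        """E[N] ~ (P*logB + (1-P)*logA) / E_xi[S(Y)], ignoring overshoot;
        P is the probability of accepting H1 when the true parameter is xi."""
        mean_step = xi * np.log(p1 / p0) + (1 - xi) * np.log((1 - p1) / (1 - p0))
        return (P * logB + (1 - P) * logA) / mean_step

    print(mean_sample_size(p0, alpha))     # ~194 when H0 is true
    print(mean_sample_size(p1, 1 - beta))  # ~182 when H1 is true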


Sequential Analysis

  • Boundary Overshoot

    • So far we have assumed no boundary overshoot

    • In practice, it will almost always occur, though

      • The exact Type I and Type II errors then differ from α and β

    • Random walk theory can be used to assess how significant the effects of boundary overshoot are

    • It can be shown that the sum of the Type I and Type II errors is always less than α + β (also individually)

    • BLAST deals with this in a novel way → see Ch. 10