STAT131/171 W4L2 Modelling Variation: Introduction to modelling and GOF

by Anne Porter

[email protected]


Activity: Let’s play beat the butcher

  • Morning radio, 6am-7am, weekdays

  • A contestant telephones in to play

  • The contestant has to say "stop" before the gong rings to win the meat

  • The radio personality reads the meat items (2 slices of scotch fillet, …, 3kg mince) until the gong is reached


The list

Let’s play: all stand, I’ll read, and you sit when you have enough meat. The last ones standing before the gong win.

1) Three kilos scotch fillet

2) 1 chicken

3) 3 kilos of sausages

4) 12 chicken kebabs

5) 12 lamb kebabs

6) 3 livers

7) 1 kg bacon

8) lamb chops

9) 2kg salmon rissoles


How might you increase your chances of winning? What information would be useful before you play again?

  • What is the maximum and minimum number of items ever read out?

  • What is the voice pattern over the gonged items?

  • What is the average number of items read out before the gong?

  • What is the frequency of gongs over time for each item?


Frequency distribution of the number of items before the gong

What is a more informative way of presenting the data so that we can optimise our choice of where to stop?


Relative frequency table

What is a better way of presenting this information so it is easier to use?
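As a rough illustration of how a relative frequency table can be built from raw counts, here is a minimal Python sketch. The counts in `hypothetical_freq` are invented for the example and are not the lecture's data; the function name is also just illustrative.

```python
def relative_frequencies(freq):
    """Return {value: count / total} for a frequency table, in value order."""
    total = sum(freq.values())
    return {x: count / total for x, count in sorted(freq.items())}


# Hypothetical counts (NOT the lecture's data):
# number of items read before the gong -> number of games
hypothetical_freq = {5: 10, 6: 20, 7: 30, 8: 25, 9: 15}

rel = relative_frequencies(hypothetical_freq)
for x, p in rel.items():
    print(f"{x} items: relative frequency {p:.2f}")
```

Each relative frequency is the proportion of games in which the gong sounded after that many items, so the values sum to 1.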


Cumulative frequency

What will be the median number of items before the gong?

It is the (n+1)/2th value = the 50.5th value = 8
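The (n+1)/2 rule can be sketched in Python as follows. This is only an illustration: the counts are hypothetical (not the lecture's table), and the helper names are made up for the example. When (n+1)/2 is not a whole number, such as 50.5 for n = 100, the sketch averages the two neighbouring ordered values (the 50th and 51st).

```python
from itertools import accumulate


def cumulative_frequencies(freq):
    """Return {value: running total of counts up to and including value}."""
    values = sorted(freq)
    running = accumulate(freq[x] for x in values)
    return dict(zip(values, running))


def median_from_table(freq):
    """Median of the n observations summarised by a frequency table."""
    n = sum(freq.values())
    cum = cumulative_frequencies(freq)

    def value_at(rank):
        # Smallest value whose cumulative frequency reaches the 1-based rank.
        return next(x for x, c in cum.items() if c >= rank)

    lower = value_at((n + 1) // 2)   # e.g. the 50th value when n = 100
    upper = value_at((n + 2) // 2)   # e.g. the 51st value when n = 100
    return (lower + upper) / 2


# Hypothetical counts (NOT the lecture's data)
hypothetical_freq = {5: 10, 6: 20, 7: 30, 8: 25, 9: 15}
cum = cumulative_frequencies(hypothetical_freq)
median = median_from_table(hypothetical_freq)
print(cum)
print(median)
```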


Frequency distribution of the number of items before the gong

What is the average number of items read before the gong?


What do we do to calculate the mean number of items before the gong?

Multiply each number of items by its frequency, add the products to get the total number of items read before the gong, and divide by the number of games played.
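The recipe above (multiply, add, divide) might look like this in Python. The counts are hypothetical and only illustrate the arithmetic; they are not the lecture's table.

```python
def mean_from_table(freq):
    """Mean of the observations summarised by a {value: count} table."""
    total_items = sum(x * count for x, count in freq.items())  # sum of x * f
    games = sum(freq.values())                                 # sum of f
    return total_items / games


# Hypothetical counts (NOT the lecture's data)
hypothetical_freq = {5: 10, 6: 20, 7: 30, 8: 25, 9: 15}
mean = mean_from_table(hypothetical_freq)
print(mean)
```

For these invented counts the total is 715 items over 100 games.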


Calculate the mean

Mean = 784/100 = 7.84 items before the gong


Will your stopping strategy be the same for this set of data?

Why not?

For these values of x we have a much smaller spread.
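To make "smaller spread" concrete, the sketch below compares the sample standard deviation of two frequency tables that share the same mean. Both tables are invented for illustration, and the standard deviation is just one possible measure of spread; the slide does not say which measure was intended.

```python
import math


def sd_from_table(freq):
    """Sample standard deviation (n - 1 divisor) from a {value: count} table."""
    n = sum(freq.values())
    mean = sum(x * c for x, c in freq.items()) / n
    sum_sq = sum(c * (x - mean) ** 2 for x, c in freq.items())
    return math.sqrt(sum_sq / (n - 1))


# Two hypothetical data sets (NOT the lecture's data), both with mean 7
wide = {3: 10, 5: 20, 7: 40, 9: 20, 11: 10}
narrow = {6: 25, 7: 50, 8: 25}

sd_wide = sd_from_table(wide)
sd_narrow = sd_from_table(narrow)
print(f"wide:   {sd_wide:.3f}")
print(f"narrow: {sd_narrow:.3f}")
```

With the narrow table, almost every game ends within one item of the mean, so a contestant could sensibly stop closer to it.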


In the long run what should be the probability of stopping at each number if stopping at random?


P(X=x) and number expected for each item for the random stopping model

Does it appear that the data fit the random stopping model?

Why so?

The number expected differs from the number observed.
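Under a random stopping model every cell is equally likely, so with g cells P(X=x) = 1/g and the expected count is the number of games divided by g. A minimal sketch, assuming the 10 cells and 100 games mentioned elsewhere in the lecture (the cell labels 1 to 10 are placeholders, not the lecture's item numbers):

```python
def random_stopping_model(cells, n_games):
    """Return (probability per cell, {cell: expected count}) for equal chances."""
    p = 1 / len(cells)
    expected = {cell: n_games / len(cells) for cell in cells}
    return p, expected


cells = list(range(1, 11))   # placeholder labels for 10 cells
p, expected = random_stopping_model(cells, n_games=100)
print(p)
print(expected)
```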


Bar Chart: Compare observed & expected frequencies


Measuring the difference between O and E

How do we measure (compare, calculate) the difference between observed and expected?


P(X=x) and number expected for each item for the random stopping model

How might we calculate the difference between observed and expected?

If the data fit, will this be big or small?

Small.


Calculating $\chi^2$
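The worked tables from these slides are not reproduced in the transcript. As a sketch of the calculation, assuming the usual Pearson goodness-of-fit statistic (the sum over cells of (O - E)^2 / E), the code below uses invented observed counts for 10 cells and an expected count of 10 per cell:

```python
def chi_square(observed, expected):
    """Pearson goodness-of-fit statistic: sum of (O - E)^2 / E over the cells."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical observed counts for 10 cells over 100 games (NOT the lecture's data)
observed = [4, 6, 8, 12, 14, 16, 14, 12, 8, 6]
expected = [10] * 10   # equal chances: 100 games / 10 cells

stat = chi_square(observed, expected)
print(round(stat, 2))
```

A statistic near zero means observed and expected counts agree closely; larger values indicate a poorer fit.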


Model Fit Using $\chi^2$

  • Calculate $\chi^2$

  • And see if it is too large for the data to be considered to fit the model


Model Fit Informal: Is $\chi^2$ too big?

  • If $\chi^2 > d + 2\sqrt{2d}$

    • Where d = g - p - 1

    • g is the number of cells

    • p is the number of parameters estimated from the data

      Then there is evidence the data do not fit the model

For our example g = 10 cells, therefore d = 10 - 0 - 1 = 9 and

$d + 2\sqrt{2d} = 9 + 2\sqrt{18} = 17.49$

Decision: As $\chi^2$ = 65.6 > 17.49, there is evidence that the data do not fit the random stopping model
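The informal threshold can be checked numerically. The formula itself is not legible in the transcript; the sketch below assumes the rule is "mean plus two standard deviations" of a chi-square distribution with d degrees of freedom, that is d + 2·sqrt(2d), because this reproduces the 17.49 quoted for d = 9. The function name is illustrative.

```python
import math


def informal_threshold(cells, estimated_params=0):
    """Assumed informal cut-off d + 2*sqrt(2d), where d = g - p - 1."""
    d = cells - estimated_params - 1
    return d + 2 * math.sqrt(2 * d)


threshold = informal_threshold(cells=10, estimated_params=0)
chi_sq = 65.6   # value quoted in the lecture
print(round(threshold, 2))   # 17.49
print(chi_sq > threshold)    # True: evidence of lack of fit
```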


Model Fit Formal

  • Decision: If the calculated $\chi^2$ > the critical value of $\chi^2$ (from tables), then there is evidence of lack of fit

    $\alpha$ = 0.05 (typical, and the value we will use)

    df = number of cells - number of estimated parameters - 1

    df = 10 - 0 - 1 = 9

Percentage points of the $\chi^2$ distribution

$\alpha$:   0.995    0.99     0.05     0.025    0.01     0.005
df = 1:     -        -        3.841    5.024    6.635    7.879
df = 9:     1.735    2.088    16.919   19.023   21.666   23.589


  • Decision: As the calculated $\chi^2$ = 65.6 > the critical value of 16.919 found in the tables, there is evidence of lack of fit between the data and the random stopping model.
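The formal decision rule can be sketched as a table lookup. The critical values below are only those quoted on the slide (df = 1 and df = 9); the dictionary layout and function name are illustrative choices, not part of the lecture.

```python
# Upper-tail percentage points of the chi-square distribution, as quoted on the slide.
CRITICAL_VALUES = {
    1: {0.05: 3.841, 0.025: 5.024, 0.01: 6.635, 0.005: 7.879},
    9: {0.995: 1.735, 0.99: 2.088, 0.05: 16.919,
        0.025: 19.023, 0.01: 21.666, 0.005: 23.589},
}


def lack_of_fit(chi_sq, cells, estimated_params=0, alpha=0.05):
    """Return (evidence of lack of fit?, df, critical value)."""
    df = cells - estimated_params - 1
    critical = CRITICAL_VALUES[df][alpha]   # KeyError if df/alpha not in the table
    return chi_sq > critical, df, critical


decision = lack_of_fit(65.6, cells=10)
print(decision)   # (True, 9, 16.919)
```

Only the two rows shown on the slide are available here; other degrees of freedom would need a fuller table.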


Lack of fit

Looking at the table, we can see that most of the lack of fit occurs for items 2, 3, 8 and 9.

Lots of meat before the gong.
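One way to see where lack of fit occurs is to look at each cell's contribution (O - E)^2 / E to the total statistic. The sketch below uses invented counts and placeholder cell labels, so the cells it flags are not the ones named on the slide.

```python
def cell_contributions(observed, expected):
    """Per-cell contribution (O - E)^2 / E to the chi-square statistic."""
    return {cell: (observed[cell] - expected[cell]) ** 2 / expected[cell]
            for cell in observed}


# Hypothetical observed counts for 10 cells over 100 games (NOT the lecture's data)
observed = {1: 3, 2: 5, 3: 9, 4: 11, 5: 13, 6: 18, 7: 15, 8: 12, 9: 9, 10: 5}
expected = {cell: 10 for cell in observed}

contrib = cell_contributions(observed, expected)
# Cells ordered from largest to smallest contribution
worst = sorted(contrib, key=contrib.get, reverse=True)
for cell in worst[:3]:
    print(f"cell {cell}: O={observed[cell]}, E={expected[cell]}, "
          f"contribution={contrib[cell]:.1f}")
```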


Sampling Distributions

  • We will explore how these types of sampling distributions are generated in our lecture on sampling distributions.

  • We will also explore how we choose a value of $\alpha$

  • We will look at using the data to estimate parameters later


Model fit approaches

  • Use a bar chart to compare observed and expected frequencies

  • Compare observed and expected frequencies

  • Calculate and use $\chi^2$

    • Informally

    • Formally

      $\chi^2$ assumes that the expected count in each cell is at least 5.

      If not, combine cells. Other literature uses other rules; there is debate over this.

      (Check the Utts & Heckard (2004) definition)


Mean (expected value, E(X)) for the random stopping model

Expected value for the random stopping model is?

E(X) = 6.5
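The expected value of a discrete model is the sum of x·P(X=x). The transcript does not list the stopping points, so the sketch below makes a guess: ten equally likely values, 2 through 11, chosen only because they average 6.5. Treat the support as an assumption rather than the lecture's table.

```python
from fractions import Fraction


def expected_value(pmf):
    """E(X) = sum of x * P(X = x) over a {value: probability} table."""
    return sum(x * p for x, p in pmf.items())


# ASSUMED support: ten equally likely stopping points 2..11 (not given in the transcript)
stops = range(2, 12)
pmf = {x: Fraction(1, len(stops)) for x in stops}

mean = expected_value(pmf)
print(mean, float(mean))   # 13/2 6.5
```

Using `Fraction` keeps the arithmetic exact, which avoids small floating-point errors when adding ten terms of 1/10.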


Spread of the Population Model

We will leave the calculation of these until a little later, using a simpler example.


What have we been doing?

  • We have been looking at the centre, spread, outliers and shape of samples of data.

    • With a view to improving decision making.

  • Why are we concerned with looking at models?


Describing characteristics of Data

We collect data on samples

  • Time in seconds until two species of flies released together mate

  • The number of lost articles found in a large municipal office

  • The average carbohydrate content per 100 gm serve in a sample of different species

  • The number of items of meat read before the gong


Improving our decisions

Looking at

  • The shape of the distribution

  • Centre

  • Spread

  • Whether or not the data fit some model

    • May even look at outliers, points not fitting the model


Describing Batches of Data

  • Comparing midterm marks from the different versions of the test.

  • Are the papers completed in a similar manner?


What we are really looking at is NOT

  • The mating behaviour of these particular flies

  • Past lost articles

  • Or last year's exam papers

  • Or the last 100 games of beat the butcher

We are interested in them because they may suggest a model for the characteristics of the data in general. This involves Probability Models.

We shall continue to explore probability models in future lectures.

