STAT131/171 W4L2 Modelling Variation: Introduction to modelling and GOF

by Anne Porter

Activity: Let’s play beat the butcher


• Morning radio, 6am-7am, weekdays

• Contestant telephones in to play

• Contestant has to say "stop" before the gong rings to win the meat

• Radio personality reads the meat items: 2 slices of scotch fillet,…,3kg mince, until the gong is reached

Let’s play: all stand, I’ll read, you sit when you have enough meat. Last ones standing before the gong win.

1) Three kilos scotch fillet

2) 1 chicken

3) 3 kilos of sausages

4) 12 chicken kebabs

5) 12 lamb kebabs

6) 3 livers

7) 1 kg bacon

8) lamb chops

9) 2 kg salmon rissoles

How might you increase your chances of winning? What information would be useful before you play again?

• What is the maximum and minimum number of items ever read out?

• What is the voice pattern over the gonged items?

• What is the average number of items read out before the gong?

• What is the frequency of gongs over time for each item?

What is a more informative way of presenting the data so we optimise our chance of where to stop?

What is a better way of presenting this information so it is easier to use?

What will be the median number of items before the gong?

It is the (n+1)/2th value = the 50.5th value = 8
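The (n+1)/2 position rule can be sketched in Python as follows. The frequency table is hypothetical (the observed counts are not in this transcript), so the median it gives is not the lecture's value of 8; only the position 50.5 for n = 100 games matches the slide. The function name `median_from_frequencies` is an illustrative choice.

```python
def median_from_frequencies(freq):
    """Median of a frequency table {value: count} via the (n+1)/2 position rule."""
    n = sum(freq.values())
    position = (n + 1) / 2              # 50.5 when n = 100
    lower_rank = int(position)          # 1-based rank of the lower middle value
    upper_rank = lower_rank if position == lower_rank else lower_rank + 1

    def value_at(rank):
        # Walk the cumulative counts until the requested rank is reached.
        cumulative = 0
        for value in sorted(freq):
            cumulative += freq[value]
            if cumulative >= rank:
                return value
        raise ValueError("rank exceeds the number of observations")

    # For a half-integer position, average the two neighbouring ordered values.
    return (value_at(lower_rank) + value_at(upper_rank)) / 2


# HYPOTHETICAL counts of games gonged at each item number (not the lecture data).
hypothetical_counts = {1: 2, 2: 3, 3: 4, 4: 6, 5: 9, 6: 12, 7: 16, 8: 20, 9: 18, 10: 10}

n_games = sum(hypothetical_counts.values())
median_position = (n_games + 1) / 2
median_items = median_from_frequencies(hypothetical_counts)
print(n_games, median_position, median_items)
```

With an even number of games the position falls between two ordered values, which is why the sketch averages the 50th and 51st values.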

What is the average number of items before the gong?

Multiply each number of items by its frequency, add these up to get the total number of items before the gong, and divide by the number of games played.

Calculate the mean number of items before the gong:

mean = 784/100 = 7.84 items before the gong
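A minimal sketch of the mean calculation from a frequency table. The counts are hypothetical because the observed table is not in this transcript; only the totals 784 and 100 come from the slide. The helper name `mean_from_frequencies` is illustrative.

```python
def mean_from_frequencies(freq):
    """Mean of a frequency table {number_of_items: number_of_games}."""
    total_items = sum(items * games for items, games in freq.items())
    n_games = sum(freq.values())
    return total_items / n_games


# HYPOTHETICAL counts of games gonged at each item number (not the lecture data).
hypothetical_counts = {1: 2, 2: 3, 3: 4, 4: 6, 5: 9, 6: 12, 7: 16, 8: 20, 9: 18, 10: 10}
hypothetical_mean = mean_from_frequencies(hypothetical_counts)

# The slide's own totals: 784 items read out over 100 games.
slide_mean = 784 / 100

print(hypothetical_mean, slide_mean)
```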

Why not?

For these values of x we have a much smaller

In the long run what should be the probability of stopping at each number if stopping at random?

P(X=x) and number expected for each item for the random stopping model
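A sketch of the probabilities and expected counts, assuming that "stopping at random" means each of the 10 cells is equally likely (consistent with the 10 cells and zero estimated parameters used later in these slides) and that 100 games were played (the divisor in 784/100). The function name `uniform_model` is illustrative.

```python
from fractions import Fraction


def uniform_model(n_cells, n_games):
    """P(X = x) and expected counts when every cell is equally likely (assumed model)."""
    p = Fraction(1, n_cells)
    probabilities = {x: p for x in range(1, n_cells + 1)}
    expected = {x: n_games * p for x in probabilities}
    return probabilities, expected


probabilities, expected = uniform_model(n_cells=10, n_games=100)

for x in sorted(probabilities):
    print(x, probabilities[x], expected[x])
```

Exact fractions are used so the probabilities sum to exactly 1 and each expected count is exactly 10.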

Does it appear that the data fit the random stopping model? Why so?

Number expected differs from number observed.

Bar Chart: Compare observed & expected frequencies

Measuring the difference between O and E

How do we measure (compare, calculate) the difference between observed and expected?

P(X=x) and number expected for each item for the random stopping model

How might we calculate the difference between observed and expected?

If the data fit, will this be big or small?

Small

Calculating χ²

χ² = Σ (O - E)² / E, summed over all cells
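A sketch of the usual goodness-of-fit calculation, the sum over cells of (O - E)²/E. The observed counts are hypothetical (the lecture's table is not in this transcript), so the result is not the 65.6 quoted later; the expected count of 10 per cell assumes 100 games spread evenly over 10 cells.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all cells."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same number of cells")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# HYPOTHETICAL observed counts for items 1..10 (not the lecture data).
hypothetical_observed = [2, 3, 4, 6, 9, 12, 16, 20, 18, 10]
expected = [10.0] * 10  # assumed: 100 games, 10 equally likely cells

# Per-cell contributions show where the lack of fit comes from.
contributions = [(o - e) ** 2 / e for o, e in zip(hypothetical_observed, expected)]
chi_sq = chi_square_statistic(hypothetical_observed, expected)

print([round(c, 1) for c in contributions], round(chi_sq, 1))
```

If the observed counts equal the expected counts, every contribution is zero, which is why a small value indicates a good fit.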

Model Fit Using χ²

• Calculate χ²

• And see if it is too large for the data to be considered to fit the model

Model Fit Informal: Is χ² too big?

• If χ² > d + 2√(2d)

• Where d = g - p - 1

• g is the number of cells

• p is the number of parameters estimated from the data

Then there is evidence the data do not fit the model

For our example g = 10 cells, therefore d = 10 - 0 - 1 = 9

d + 2√(2d) = 9 + 2√18 = 17.49

Decision: As χ² = 65.6 > 17.49, there is evidence that the data do not fit the random stopping model
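A sketch of the informal check. The threshold formula d + 2√(2d) is an inference: it reproduces the quoted 17.49 for d = 9, and it corresponds to the mean of a χ² distribution (d) plus two standard deviations (√(2d) each). The values 65.6, g = 10 and p = 0 are taken from the slide; the function name is illustrative.

```python
import math


def informal_threshold(n_cells, n_estimated_params=0):
    """Return (d, d + 2*sqrt(2d)) with d = g - p - 1 (assumed informal rule)."""
    d = n_cells - n_estimated_params - 1
    return d, d + 2 * math.sqrt(2 * d)


d, threshold = informal_threshold(n_cells=10, n_estimated_params=0)

chi_sq_from_slides = 65.6  # value quoted in the lecture
lack_of_fit = chi_sq_from_slides > threshold

print(d, round(threshold, 2), lack_of_fit)
```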

Percentage Points of the χ² distribution

df \ α    0.995    0.99     0.05     0.025    0.01     0.005

1                           3.841    5.024    6.635    7.879

9         1.735    2.088    16.919   19.023   21.666   23.589

Model Fit Formal

• Decision: If calculated χ² > critical value of χ² (tables) then there is evidence of lack of fit

α = 0.05 (typical and we will use)

df = number of cells - number of estimated parameters - 1

df = 10 - 0 - 1 = 9

• Decision: As calculated χ² = 65.6 > critical value of 16.919 found in the tables, there is evidence of lack of fit between the data and the random stopping model.
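The formal decision can be sketched as a table lookup followed by a comparison. The critical values are copied from the percentage-points table in these slides (only the df = 1 and df = 9 rows are available); the dictionary and function names are illustrative.

```python
# Upper-tail percentage points copied from the slides (df = 1 and df = 9 only).
CHI_SQUARE_TABLE = {
    1: {0.05: 3.841, 0.025: 5.024, 0.01: 6.635, 0.005: 7.879},
    9: {0.995: 1.735, 0.99: 2.088, 0.05: 16.919,
        0.025: 19.023, 0.01: 21.666, 0.005: 23.589},
}


def formal_gof_decision(chi_sq, n_cells, n_estimated_params=0, alpha=0.05):
    """Return (df, critical value, True if there is evidence of lack of fit)."""
    df = n_cells - n_estimated_params - 1
    critical = CHI_SQUARE_TABLE[df][alpha]  # KeyError if df or alpha is not tabulated
    return df, critical, chi_sq > critical


df, critical, lack_of_fit = formal_gof_decision(65.6, n_cells=10)
print(df, critical, lack_of_fit)
```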

Lack of fit

Looking at the table, we can see most lack of fit occurs for items 2, 3, 8 and 9. Lots of meat before the gong.

Sampling Distributions

• We will explore how these types of sampling distributions are generated in our lecture on sampling distributions.

• We will also explore how we choose a value of α

• We will look at using the data to estimate parameters later

Model fit approaches

• Use a bar chart to compare observed and expected frequencies

• Compare observed and expected frequencies

• Calculate χ² and use it

• Informally

• Formally

χ² assumes that the expected count in each cell is at least 5

If not, combine cells. Other literature uses other rules; there is a debate over this.

(Check the Utts & Heckard (2004) definition)
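One possible way to combine cells is sketched below. It pools adjacent cells from left to right until each pooled cell has an expected count of at least 5. This is only one of several pooling strategies, and both the counts and the function name `combine_small_cells` are hypothetical. After pooling, the number of cells g used for the degrees of freedom would be the number of pooled cells.

```python
def combine_small_cells(observed, expected, minimum=5):
    """Pool adjacent cells until every pooled expected count is >= minimum."""
    pooled_obs, pooled_exp = [], []
    run_obs, run_exp, run_cells = 0, 0, 0
    for o, e in zip(observed, expected):
        run_obs += o
        run_exp += e
        run_cells += 1
        if run_exp >= minimum:
            # The running group is now large enough to stand as one cell.
            pooled_obs.append(run_obs)
            pooled_exp.append(run_exp)
            run_obs, run_exp, run_cells = 0, 0, 0
    if run_cells:
        if pooled_exp:
            # Leftover small cells at the end are folded into the last pooled cell.
            pooled_obs[-1] += run_obs
            pooled_exp[-1] += run_exp
        else:
            # Everything together is still below the minimum: keep a single cell.
            pooled_obs.append(run_obs)
            pooled_exp.append(run_exp)
    return pooled_obs, pooled_exp


# HYPOTHETICAL counts with small expected values in the tails.
observed = [1, 4, 12, 20, 9, 3, 1]
expected = [2, 6, 14, 16, 8, 3, 1]

pooled_observed, pooled_expected = combine_small_cells(observed, expected)
print(pooled_observed, pooled_expected)
```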

Spread of the Population Model

We will leave calculation of these till a little later on a simpler example

What have we been doing?

• We have been looking at the centre, spread, outliers and shape of samples of data.

• With a view to improving decision making.

• Why are we concerned with looking at models?

Describing characteristics of Data

We collect data on samples

• Time in seconds until two species of flies released together mate

• The number of lost articles found in a large municipal office

• The average carbohydrate content per 100 gm serve in a sample of different species

• The number of items of meat read before the gong

Improving our decisions

Looking at

• The shape of the distribution

• Centre

• Whether or not the data fit some model

• May even look at outliers, points not fitting the model

Describing Batches of Data

• Comparing midterm marks from the different versions of the test.

• Are the papers completed in a similar manner?

What we are really looking at is NOT

• The mating behaviour of these particular flies

• Past lost articles

• Or last year’s exam papers

• Or the last 100 games of beat the butcher

We are interested in them because they may suggest a model for the characteristics of the data in general. This involves Probability Models.

We shall continue to explore probability models in future lectures.