
STAT131/171 W4L2 Modelling Variation: Introduction to modelling and GOF


by

Anne Porter

- Morning radio 6am -7am, weekdays
- Contestant telephones in to play
- Contestant has to say stop before the gong rings to win the meat
- Radio personality reads the meat items: 2 slices of scotch fillet, …, 3kg mince, until the gong is reached

Let's play: all stand, I'll read, and you sit when you have enough meat. The last ones standing before the gong win.

1) Three kilos scotch fillet
2) 1 chicken
3) 3 kilos of sausages
4) 12 chicken kebabs
5) 12 lamb kebabs
6) 3 livers
7) 1 kg bacon
8) lamb chops
9) 2kg salmon rissoles

- What is the maximum and minimum number of items ever read out?
- What is the voice pattern over the gonged items?
- What is the average number of items read out before the gong?
- What is the frequency of gongs over time for each item?

What is a more informative way of presenting the data so we optimise our choice of where to stop?

What is a better way of presenting this information so it is easier to use?

What will be the median number of items before the gong?

With n = 100 games, the median is the (n+1)/2 th value = the 50.5th value (midway between the 50th and 51st ordered values) = 8

What is the average number of items read before the gong?

Multiply each number of items by its frequency, add these products to get the total number of items read before the gong, and divide by the number of games played:

784/100 = 7.84
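The mean and median calculations described above can be sketched in Python. This is a minimal illustration only: the function names and the small frequency table are hypothetical, since the lecture's actual table of 100 games is not reproduced in this transcript.

```python
def mean_from_frequencies(freq):
    """Mean of a frequency table.

    freq maps each value (items read before the gong) to the number
    of games in which that value occurred.
    """
    n = sum(freq.values())                                   # games played
    total = sum(value * count for value, count in freq.items())
    return total / n


def median_from_frequencies(freq):
    """Median of a frequency table using the (n+1)/2 th ordered value."""
    # Expand the table into a sorted list; fine for small tables.
    values = sorted(v for v, count in freq.items() for _ in range(count))
    n = len(values)
    mid = n // 2
    if n % 2 == 1:
        return values[mid]
    # Even n: (n+1)/2 falls midway between two ordered values.
    return (values[mid - 1] + values[mid]) / 2


# Hypothetical table for illustration (NOT the lecture data).
example = {6: 2, 7: 3, 8: 4, 9: 1}
example_mean = mean_from_frequencies(example)
example_median = median_from_frequencies(example)

# The lecture's own totals: 784 items over 100 games.
lecture_mean = 784 / 100
```

Working from the frequency table rather than the raw list mirrors the slide: each value is weighted by how often it occurred.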

Items before the gong

Why not?


For these values of x we have a much smaller spread

Does it appear that the data fit the random stopping model? Why so?


Number expected differs from number observed.

How do we measure (compare, calculate) the difference between observed and expected?

If the data fit, will this be big or small?

Small

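- Calculate $\chi^2 = \sum (O - E)^2 / E$, summing over all cells, where O is the observed count and E the expected count in each cell
- And see if it is too large for the data to be considered to fit the model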

- If the calculated $\chi^2$ exceeds the critical value $\chi^2_{\alpha, d}$ from the tables
- Where $d = g - p - 1$
- g is the number of cells
- p is the number of parameters estimated from the data
Then there is evidence the data do not fit the model

For our example g = 10 cells, therefore d = 10 - 0 - 1 = 9

Critical value: $\chi^2_{0.05, 9}$ = 16.919

Decision: As $\chi^2$ = 65.6 > 16.919 there is evidence that the data do not fit the random stopping model

Percentage Points of the $\chi^2$ distribution

df    α = 0.995   0.99     0.05     0.025    0.01     0.005
1                          3.841    5.024    6.635    7.879
9     1.735       2.088    16.919   19.023   21.666   23.589

- Decision: If calculated $\chi^2$ > critical value of $\chi^2$ (tables) then there is evidence of lack of fit
α = 0.05 (typical and we will use)

df = Number of cells - number of estimated parameters - 1

df =10-0-1=9


- Decision: As calculated $\chi^2$ = 65.6 > critical value of 16.919 found in the tables (df = 9, α = 0.05), there is evidence of lack of fit between the data and the random stopping model.
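The goodness-of-fit decision above can be sketched in Python. This is an illustration under stated assumptions: the observed counts are hypothetical (the lecture's table is not in this transcript), and equal expected counts of 10 per cell are assumed for 100 games over 10 cells, which may not match the lecture's random stopping model exactly. The critical value 16.919 is taken from the table for df = 9 and α = 0.05.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all cells."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def degrees_of_freedom(cells, estimated_params=0):
    """d = g - p - 1: cells, minus estimated parameters, minus one."""
    return cells - estimated_params - 1


# Critical value from the percentage-points table (df = 9, alpha = 0.05).
CRITICAL_DF9_ALPHA_005 = 16.919

# Hypothetical observed counts for 100 games (NOT the lecture data).
observed = [4, 18, 16, 9, 8, 7, 6, 17, 13, 2]
# Assumed model for illustration: every cell equally likely.
expected = [10] * 10

statistic = chi_square_statistic(observed, expected)
df = degrees_of_freedom(cells=len(observed), estimated_params=0)

# Evidence of lack of fit when the statistic exceeds the critical value.
lack_of_fit = statistic > CRITICAL_DF9_ALPHA_005
```

The per-cell terms (O - E)^2 / E are also worth inspecting individually, since the largest ones show where the model fits worst.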

Looking at the table, we can see that most of the lack of fit occurs at 2, 3, 8 and 9 lots of meat read before the gong

- We will explore how these types of sampling distributions are generated in our lecture on sampling distributions.
- We will also explore how we choose a value of α
- We will look at using the data to estimate parameters later

- Use a Bar chart to compare observed and expected frequencies
- Compare observed and expected frequencies
- Calculate and use $\chi^2$
- Informally
- Formally
$\chi^2$ assumes that the expected count in each cell is at least 5

If not, combine cells. Other literature uses other rules; there is a debate over this.

(Check the Utts & Heckard (2004) definition)
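One possible way to combine cells is sketched below in Python. The function name, the greedy left-to-right merging of adjacent cells, and the example counts are all assumptions made for illustration. The lecture does not prescribe a particular merging procedure, and the minimum of 5 is left as a parameter because other rules exist.

```python
def combine_small_cells(observed, expected, minimum=5):
    """Merge adjacent cells until each merged cell has expected >= minimum.

    Cells are accumulated left to right. Any leftover at the end that is
    still below the minimum is folded into the last merged cell.
    """
    merged_obs, merged_exp = [], []
    acc_obs, acc_exp = 0, 0
    for o, e in zip(observed, expected):
        acc_obs += o
        acc_exp += e
        if acc_exp >= minimum:
            merged_obs.append(acc_obs)
            merged_exp.append(acc_exp)
            acc_obs, acc_exp = 0, 0
    if acc_exp > 0:
        if merged_exp:
            # Fold the small remainder into the previous merged cell.
            merged_obs[-1] += acc_obs
            merged_exp[-1] += acc_exp
        else:
            # Every cell was small: keep them as one combined cell.
            merged_obs.append(acc_obs)
            merged_exp.append(acc_exp)
    return merged_obs, merged_exp


# Hypothetical counts with small expected values in the tails.
obs = [1, 3, 12, 20, 9, 3, 2]
exp = [2, 4, 14, 18, 8, 3, 1]
new_obs, new_exp = combine_small_cells(obs, exp)
```

After combining, the number of cells g is the merged count, so the degrees of freedom g - p - 1 should be recomputed from it.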

E(X)=6.5

We will leave calculation of these till a little later on a simpler example

- We have been looking at the centre, spread, outliers and shape of samples of data
- With a view to improving decision making.

- Why are we concerned with looking at models?

We collect data on samples

- Time in seconds until two species of flies released together mate
- The number of lost articles found in a large municipal office
- The average carbohydrate content per 100 gm serve in a sample of different species
- The number of items of meat read before the gong

Looking at

- The shape of the distribution
- Centre
- Spread
- Whether or not the data fit some model
- May even look at outliers, points not fitting the model

- Comparing midterm marks from the different versions of the test.
- Are the papers completed in a similar manner?

- The mating behaviour of these particular flies
- Past lost articles
- Or last year's exam papers
- Or the last 100 games of beat the butcher

We are interested in them because they may suggest a model for the characteristics of the data in general. This involves Probability Models.

We shall continue to explore probability models in future lectures.