Organizing and describing data

Instructor: W.H. Laverty

Office: 235 McLean Hall

Phone: 966-6096

Lectures: M W F 11:30am - 12:20pm, Arts 143

Lab: M 3:30 - 4:20, Thorv 105

Evaluation:

  • Assignments, Labs, Term tests - 40% (Term Test every 2nd week, approx.)

  • Final Examination - 60%


Techniques for continuous variables

Continuous variables are measurements that vary over a continuum (e.g. weight, blood pressure), as opposed to categorical variables (e.g. gender, religion, marital status).


The Grouped Frequency Table: The Histogram


To Construct

  • A Grouped frequency table

  • A Histogram


To Construct - A Grouped frequency table

  • Find the maximum and minimum of the observations.

  • Choose non-overlapping intervals of equal width (The Class Intervals) that cover the range between the maximum and the minimum.

  • The endpoints of the intervals are called the class boundaries.

  • Count the number of observations in each interval (the cell frequency, f).

  • Calculate the relative frequency, where N is the total number of observations:

    relative frequency = f/N
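
The steps above can be sketched in Python. This is only an illustration, not the lecture's own computation: it uses the Verbal IQ scores from Data Set #3, and the starting boundary (70), the interval width (10) and the function name grouped_frequency_table are assumptions made for the example. Each interval is taken to include its upper endpoint.

```python
import math

# Verbal IQ scores of the 23 students in Data Set #3.
verbal_iq = [86, 104, 86, 105, 118, 96, 90, 95, 105, 84, 94, 119,
             82, 80, 109, 111, 89, 99, 94, 99, 95, 102, 102]


def grouped_frequency_table(data, start, width):
    """Return rows (lower, upper, f, f/N) for the intervals (lower, upper].

    `start` is the lowest class boundary and must lie strictly below the
    minimum, because each interval excludes its lower endpoint.
    """
    if start >= min(data):
        raise ValueError("start must be strictly below the minimum")
    n = len(data)
    # Enough equal-width intervals to cover the range up to the maximum.
    n_classes = math.ceil((max(data) - start) / width)
    counts = [0] * n_classes
    for x in data:
        # A value equal to an upper boundary is counted in the lower interval.
        counts[math.ceil((x - start) / width) - 1] += 1
    return [(start + k * width, start + (k + 1) * width, f, f / n)
            for k, f in enumerate(counts)]


table = grouped_frequency_table(verbal_iq, start=70, width=10)
for lower, upper, f, rel in table:
    print(f"({lower}, {upper}]  f = {f:2d}  f/N = {rel:.3f}")
```

The cell frequencies add up to N = 23, so the relative frequencies add up to 1.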


Data Set #3

The following table gives data on Verbal IQ, Math IQ, Initial Reading Achievement Score, and Final Reading Achievement Score for 23 students who have recently completed a reading improvement program.

                            Initial       Final
          Verbal   Math     Reading       Reading
Student   IQ       IQ       Achievement   Achievement
1         86       94       1.1           1.7
2         104      103      1.5           1.7
3         86       92       1.5           1.9
4         105      100      2.0           2.0
5         118      115      1.9           3.5
6         96       102      1.4           2.4
7         90       87       1.5           1.8
8         95       100      1.4           2.0
9         105      96       1.7           1.7
10        84       80       1.6           1.7
11        94       87       1.6           1.7
12        119      116      1.7           3.1
13        82       91       1.2           1.8
14        80       93       1.0           1.7
15        109      124      1.8           2.5
16        111      119      1.4           3.0
17        89       94       1.6           1.8
18        99       117      1.6           2.6
19        94       93       1.4           1.4
20        99       110      1.4           2.0
21        95       97       1.5           1.3
22        102      104      1.7           3.1
23        102      93       1.6           1.9


In this example the upper endpoint of each class interval is included in the interval; the lower endpoint is not.
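
The endpoint convention only matters for observations that fall exactly on a class boundary. The short sketch below contrasts the two conventions; the boundaries (70, 80, 90, ...) and both function names are assumptions made for illustration.

```python
import math


def class_index_upper_closed(x, start, width):
    """Index of the interval (lower, upper] that contains x."""
    return math.ceil((x - start) / width) - 1


def class_index_lower_closed(x, start, width):
    """Index of the interval [lower, upper) that contains x."""
    return math.floor((x - start) / width)


start, width = 70, 10
for x in (86, 90):
    k_upper = class_index_upper_closed(x, start, width)
    k_lower = class_index_lower_closed(x, start, width)
    print(f"x = {x}: "
          f"({start + k_upper * width}, {start + (k_upper + 1) * width}] "
          f"versus [{start + k_lower * width}, {start + (k_lower + 1) * width})")
```

A score of 86 lands in the same interval either way, while a score of 90 moves from (80, 90] to [90, 100) when the convention changes.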


Histogram – Verbal IQ


Histogram – Math IQ


Example

  • In this example we are comparing (for two drugs A and B) the time to metabolize the drug.

  • 120 cases were given drug A.

  • 120 cases were given drug B.

  • Data on time to metabolize each drug is given on the next two slides


Drug A


Drug B


Grouped frequency tables


Histogram – drug A (time to metabolize)


Histogram – drug B (time to metabolize)



To Draw - A Histogram

  • Above each class interval, draw a vertical bar whose height is proportional to either the cell frequency (f) or the relative frequency (f/N).

[Figure: bars drawn above the class intervals; vertical axis: frequency (f) or relative frequency (f/N); horizontal axis: class interval]
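
A rough text version of such a drawing can be produced as follows. The bars are printed sideways for simplicity, and the intervals and frequencies are the Verbal IQ scores of Data Set #3 grouped into intervals of width 10 starting at 70, which is an illustrative choice and not necessarily the grouping used in the lecture. The name text_histogram is hypothetical.

```python
# Class intervals (lower, upper] and their cell frequencies f
# (Verbal IQ, Data Set #3, assumed width 10 starting at 70).
intervals = [(70, 80), (80, 90), (90, 100), (100, 110), (110, 120)]
frequencies = [1, 6, 7, 6, 3]
n = sum(frequencies)  # N = 23


def text_histogram(intervals, heights, scale=1.0):
    """One line per class interval; bar length is proportional to its height."""
    lines = []
    for (lower, upper), height in zip(intervals, heights):
        bar = "#" * round(height * scale)
        lines.append(f"({lower:3d}, {upper:3d}] {bar}")
    return lines


by_frequency = text_histogram(intervals, frequencies)
by_relative = text_histogram(intervals, [f / n for f in frequencies], scale=n)
print("\n".join(by_frequency))
```

Using f or f/N changes only the scale of the vertical axis, so both versions give the same picture.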


Some comments about histograms

  • The width of the class intervals should be chosen so that the number of intervals with a frequency less than 5 is small.

  • This means that the width of the class intervals can decrease as the sample size increases.


  • If the width of the class intervals is too small, the frequency in each interval will be either 0 or 1.

  • [Figure: histogram with class intervals that are too narrow]


  • If the width of the class intervals is too large, one class interval will contain all of the observations.

  • [Figure: histogram with a single class interval]


  • Ideally one wants the histogram to appear as in the figure. [Figure: histogram with well-chosen class intervals]

  • This is achieved by making the width of the class intervals as small as possible while allowing only a few intervals to have a frequency less than 5.
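
The effect of the interval width can be checked numerically. The sketch below counts, for a few candidate widths, how many class intervals have a frequency below 5. It uses the Verbal IQ scores of Data Set #3; the candidate widths, the choice of one unit below the minimum as the first boundary, and the function name cell_frequencies are assumptions made for the example.

```python
import math

verbal_iq = [86, 104, 86, 105, 118, 96, 90, 95, 105, 84, 94, 119,
             82, 80, 109, 111, 89, 99, 94, 99, 95, 102, 102]


def cell_frequencies(data, width):
    """Cell frequencies for intervals (lower, upper] of the given width.

    The first lower boundary is placed one unit below the minimum
    (an arbitrary choice) so that the minimum falls in the first interval.
    """
    start = min(data) - 1
    n_classes = math.ceil((max(data) - start) / width)
    counts = [0] * n_classes
    for x in data:
        counts[math.ceil((x - start) / width) - 1] += 1
    return counts


for width in (1, 10, 40):
    counts = cell_frequencies(verbal_iq, width)
    small = sum(1 for f in counts if f < 5)
    print(f"width {width:2d}: {len(counts):2d} intervals, {small:2d} with f < 5")
```

With width 1 every interval has a frequency below 5, with width 40 a single interval holds all 23 observations, and width 10 sits between the two extremes.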


  • As the sample size increases the histogram will approach a smooth curve.

  • This smooth curve is the histogram of the population.

[Figures: histograms for N = 25, N = 100, N = 500, N = 2000 and N = ∞]
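
This behaviour can be imitated by simulation. The sketch below draws samples of increasing size from an assumed population (normal with mean 100 and standard deviation 15, chosen only for illustration) and prints a sideways text histogram of the relative frequencies for each sample. The class intervals are kept fixed here for simplicity, and the function name relative_frequencies is hypothetical.

```python
import random


def relative_frequencies(sample, lower, upper, n_classes):
    """Relative frequency f/N for n_classes equal-width intervals on [lower, upper).

    Observations outside [lower, upper) are not counted in any interval.
    """
    width = (upper - lower) / n_classes
    counts = [0] * n_classes
    for x in sample:
        if lower <= x < upper:
            # min() guards against floating-point rounding at the top edge.
            k = min(int((x - lower) / width), n_classes - 1)
            counts[k] += 1
    return [f / len(sample) for f in counts]


random.seed(1)  # fixed seed so that repeated runs give the same samples
for n in (25, 100, 500, 2000):
    sample = [random.gauss(100, 15) for _ in range(n)]
    rel = relative_frequencies(sample, 40, 160, 12)
    print(f"N = {n}")
    for k, r in enumerate(rel):
        print(f"  {40 + 10 * k:3d} to {50 + 10 * k:3d}  " + "#" * round(60 * r))
```

As N grows the outline of the bars becomes more regular from one sample to the next, which is the sense in which the histogram settles towards the population curve.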


Comment: the proportion of the area under a histogram between two points estimates the proportion of cases in the sample (and in the population) that lie between those two values.


Example: The following histogram displays the birth weight (in kg) of n = 100 births. [Figure: histogram of birth weight]


Find the proportion of births that have a birth weight less than 0.34 kg.


Proportion = (1+1+3+10+11+19+17)/100 = 0.62
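
The same arithmetic in Python, as a small sketch: the seven frequencies are those appearing in the calculation above (the class intervals lying below the cut-off), and the function name proportion_below is hypothetical.

```python
def proportion_below(frequencies_below, n):
    """Proportion of cases in the class intervals lying below the cut-off."""
    return sum(frequencies_below) / n


# Frequencies of the class intervals below the cut-off, taken from the calculation above.
frequencies_below = [1, 1, 3, 10, 11, 19, 17]
proportion = proportion_below(frequencies_below, n=100)
print(proportion)  # 0.62
```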


The Characteristics of a Histogram

  • Central Location (average)

  • Spread (Variability, Dispersion)

  • Shape


Central Location


Spread, Dispersion, Variability


Shape – Bell Shaped (Normal)


Shape – Positively skewed


Shape – Negatively skewed


Shape – Platykurtic


Shape – Leptokurtic


Shape – Bimodal


The Stem-Leaf Plot

An alternative to the histogram


Each number in a data set can be broken into two parts:

  • A stem

  • A leaf


Example

Verbal IQ = 84

  • Stem = tens digit = 8

  • Leaf = units digit = 4


Example

Verbal IQ = 104

  • Stem = number of tens = 10

  • Leaf = units digit = 4
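
For whole-number data such as Verbal IQ, the split into stem and leaf is an integer division by 10 with remainder. A minimal sketch (the function name split_stem_leaf is hypothetical):

```python
def split_stem_leaf(value):
    """Split a whole number into (stem, leaf): number of tens and units digit."""
    return divmod(value, 10)


for iq in (84, 104):
    stem, leaf = split_stem_leaf(iq)
    print(f"Verbal IQ = {iq}: stem = {stem}, leaf = {leaf}")
```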


To Construct a Stem-Leaf diagram

  • Make a vertical list of “all” stems

  • Then behind each stem make a horizontal list of each leaf


Example

The data on N = 23 students

Variables

  • Verbal IQ

  • Math IQ

  • Initial Reading Achievement Score

  • Final Reading Achievement Score



We now construct a stem-leaf diagram of Verbal IQ.


A vertical list of the stems:

8
9
10
11
12

We now list the leaves behind each stem.


Splitting each Verbal IQ into stem | leaf, in student order:

8|6, 10|4, 8|6, 10|5, 11|8, 9|6, 9|0, 9|5, 10|5, 8|4, 9|4, 11|9, 8|2, 8|0, 10|9, 11|1, 8|9, 9|9, 9|4, 9|9, 9|5, 10|2, 10|2


8  | 6 6 4 2 0 9
9  | 6 0 5 4 9 4 9 5
10 | 4 5 5 9 2 2
11 | 8 9 1
12 |


The leaves may be arranged in order

8  | 0 2 4 6 6 9
9  | 0 4 4 5 5 6 9 9
10 | 2 2 4 5 5 9
11 | 1 8 9
12 |
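
The construction can be sketched in Python using the Verbal IQ scores of Data Set #3. The function name stem_leaf and its ordered flag are assumptions made for illustration. Unlike the diagram above, the sketch lists stems only from the smallest to the largest one that occurs, so the empty stem 12 is not printed.

```python
from collections import defaultdict

verbal_iq = [86, 104, 86, 105, 118, 96, 90, 95, 105, 84, 94, 119,
             82, 80, 109, 111, 89, 99, 94, 99, 95, 102, 102]


def stem_leaf(data, ordered=True):
    """Return the rows of a stem-leaf diagram for whole-number data."""
    leaves = defaultdict(list)
    for x in data:
        stem, leaf = divmod(x, 10)  # stem = number of tens, leaf = units digit
        leaves[stem].append(leaf)
    rows = []
    # Vertical list of all stems between the smallest and the largest.
    for stem in range(min(leaves), max(leaves) + 1):
        row = sorted(leaves[stem]) if ordered else leaves[stem]
        rows.append(f"{stem:2d} | " + " ".join(str(leaf) for leaf in row))
    return rows


print("\n".join(stem_leaf(verbal_iq, ordered=False)))
print()
print("\n".join(stem_leaf(verbal_iq)))
```

The first printout keeps the leaves in the order in which the students appear; the second arranges them in increasing order.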


The stem-leaf diagram is equivalent to a histogram

8  | 0 2 4 6 6 9
9  | 0 4 4 5 5 6 9 9
10 | 2 2 4 5 5 9
11 | 1 8 9
12 |


Rotating the stem-leaf diagram we have

[Figure: the stem-leaf diagram rotated so that each row of leaves stands upright as a bar; horizontal axis marked 80, 90, 100, 110, 120]
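
The rotation can be imitated in text. In the sketch below the number of leaves behind each stem becomes the height of an upright bar, which is the cell frequency of the intervals 80 to 89, 90 to 99 and so on. The data are the Verbal IQ scores of Data Set #3; the function name rotated and the column width are assumptions made for the example.

```python
verbal_iq = [86, 104, 86, 105, 118, 96, 90, 95, 105, 84, 94, 119,
             82, 80, 109, 111, 89, 99, 94, 99, 95, 102, 102]

stems = [8, 9, 10, 11, 12]
# Number of leaves behind each stem = frequency of the matching interval.
counts = [sum(1 for x in verbal_iq if x // 10 == s) for s in stems]


def rotated(counts, labels, cell=5):
    """Upright text bars, one column per stem, with axis labels underneath."""
    lines = []
    for level in range(max(counts), 0, -1):
        row = "".join(("#" if c >= level else " ").center(cell) for c in counts)
        lines.append(row.rstrip())
    lines.append("".join(str(label).center(cell) for label in labels).rstrip())
    return lines


picture = rotated(counts, [10 * s for s in stems])
print("\n".join(picture))
```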


The two-part stem-leaf diagram

Sometimes you want to break the stems into two parts:

    *  for leaves 0,1,2,3,4

    (no symbol)  for leaves 5,6,7,8,9


Stem-leaf diagram for Initial Reading Achievement

1. | 0 1 2 4 4 4 4 4 5 5 5 5 6 6 6 6 6 7 7 7 8 9
2. | 0

This diagram as it stands does not give an accurate picture of the distribution.


We try breaking the stems into two parts

1.* | 0 1 2 4 4 4 4 4
1.  | 5 5 5 5 6 6 6 6 6 7 7 7 8 9
2.* | 0
2.  |


The five-part stem-leaf diagram

If the two-part stem-leaf diagram is not adequate you can break the stems into five parts:

    *  for leaves 0,1

    t  for leaves 2,3

    f  for leaves 4,5

    s  for leaves 6,7

    (no symbol)  for leaves 8,9


We try breaking the stems into five parts

1.* | 0 1
1.t | 2
1.f | 4 4 4 4 4 5 5 5 5
1.s | 6 6 6 6 6 7 7 7
1.  | 8 9
2.* | 0
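
Both splits can be sketched with one function. The data are the Initial Reading Achievement scores from the table of Data Set #3. The part symbols follow the diagrams above (* for the lowest leaves, no symbol for the highest), the function name split_stem_leaf and the SYMBOLS table are assumptions made for illustration, and only non-empty rows are printed.

```python
# Initial Reading Achievement scores of the 23 students in Data Set #3.
initial_ra = [1.1, 1.5, 1.5, 2.0, 1.9, 1.4, 1.5, 1.4, 1.7, 1.6, 1.6, 1.7,
              1.2, 1.0, 1.8, 1.4, 1.6, 1.6, 1.4, 1.4, 1.5, 1.7, 1.6]

# Symbol attached to each part of a stem, lowest leaves first.
SYMBOLS = {2: ["*", " "], 5: ["*", "t", "f", "s", " "]}


def split_stem_leaf(data, parts):
    """Rows of a stem-leaf diagram whose stems are broken into 2 or 5 parts."""
    symbols = SYMBOLS[parts]
    leaves_per_part = 10 // parts
    rows = {}
    for x in data:
        tenths = round(x * 10)           # work in tenths to avoid float issues
        stem, leaf = divmod(tenths, 10)  # stem = units digit, leaf = first decimal
        rows.setdefault((stem, leaf // leaves_per_part), []).append(leaf)
    lines = []
    for stem, part in sorted(rows):
        leaves = " ".join(str(leaf) for leaf in sorted(rows[(stem, part)]))
        lines.append(f"{stem}.{symbols[part]} | {leaves}")
    return lines


print("\n".join(split_stem_leaf(initial_ra, parts=2)))
print()
print("\n".join(split_stem_leaf(initial_ra, parts=5)))
```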


Stem-leaf Diagrams

Verbal IQ, Math IQ, Initial RA, Final RA


Some Conclusions

  • Math IQ and Verbal IQ seem to have approximately the same distribution

      • “bell shaped”, centered about 100

  • Final RA seems to be larger than Initial RA and more spread out

      • Improvement in RA

      • Amount of improvement quite variable
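
The remarks about reading achievement can be checked directly against the table of Data Set #3. The sketch below is only a rough check, not part of the lecture: it works in tenths of a point so that the arithmetic is exact, and the variable names are assumptions made for the example.

```python
# Initial and Final Reading Achievement of the 23 students, in tenths of a point.
initial = [11, 15, 15, 20, 19, 14, 15, 14, 17, 16, 16, 17,
           12, 10, 18, 14, 16, 16, 14, 14, 15, 17, 16]
final = [17, 17, 19, 20, 35, 24, 18, 20, 17, 17, 17, 31,
         18, 17, 25, 30, 18, 26, 14, 20, 13, 31, 19]

# Improvement of each student (Final RA minus Initial RA).
gains = [f - i for i, f in zip(initial, final)]
improved = sum(1 for g in gains if g > 0)

print(f"Initial RA runs from {min(initial) / 10} to {max(initial) / 10}")
print(f"Final RA runs from {min(final) / 10} to {max(final) / 10}")
print(f"{improved} of {len(gains)} students improved")
print(f"Improvement runs from {min(gains) / 10} to {max(gains) / 10}")
```

Final RA covers a wider range than Initial RA, most students improved, and the size of the improvement differs a good deal from student to student.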


Next Topic

  • Numerical Measures - Location

