Organizing and describing data
Instructor: W.H. Laverty
Office: 235 McLean Hall
Phone: 966-6096
Lectures: M W F 11:30am - 12:20pm, Arts 143
Lab: M 3:30 - 4:20, Thorv105
Evaluation:
  • Assignments, Labs, Term tests - 40% (a term test approximately every 2nd week)
  • Final Examination - 60%




Techniques for continuous variables


Continuous variables are measurements that vary over a continuum (weight, blood pressure, etc.), as opposed to categorical variables (gender, religion, marital status, etc.).



To Construct

  • A Grouped frequency table

  • A Histogram


To construct a grouped frequency table


  • Find the maximum and minimum of the observations.

  • Choose non-overlapping intervals of equal width (The Class Intervals) that cover the range between the maximum and the minimum.

  • The endpoints of the intervals are called the class boundaries.

  • Count the number of observations in each interval (the cell frequency, f).

  • Calculate the relative frequency of each interval:

    relative frequency = f/N, where N is the total number of observations


Data Set #3

The following table gives data on Verbal IQ, Math IQ, Initial Reading Achievement Score, and Final Reading Achievement Score for 23 students who have recently completed a reading improvement program.

         Verbal  Math  Initial Reading  Final Reading
Student  IQ      IQ    Achievement      Achievement
1        86      94    1.1              1.7
2        104     103   1.5              1.7
3        86      92    1.5              1.9
4        105     100   2.0              2.0
5        118     115   1.9              3.5
6        96      102   1.4              2.4
7        90      87    1.5              1.8
8        95      100   1.4              2.0
9        105     96    1.7              1.7
10       84      80    1.6              1.7
11       94      87    1.6              1.7
12       119     116   1.7              3.1
13       82      91    1.2              1.8
14       80      93    1.0              1.7
15       109     124   1.8              2.5
16       111     119   1.4              3.0
17       89      94    1.6              1.8
18       99      117   1.6              2.6
19       94      93    1.4              1.4
20       99      110   1.4              2.0
21       95      97    1.5              1.3
22       102     104   1.7              3.1
23       102     93    1.6              1.9


In this example the upper endpoint is included in the interval. The lower endpoint is not.
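
The construction of a grouped frequency table can be sketched in Python for the Verbal IQ column of Data Set #3. The class boundaries 70, 80, ..., 120 are an assumption chosen for illustration, and `grouped_frequency_table` is a hypothetical helper name. Each interval includes its upper endpoint but not its lower one, as stated above.

```python
# Verbal IQ scores of the 23 students in Data Set #3.
verbal_iq = [86, 104, 86, 105, 118, 96, 90, 95, 105, 84, 94, 119,
             82, 80, 109, 111, 89, 99, 94, 99, 95, 102, 102]


def grouped_frequency_table(data, boundaries):
    """Return (lower, upper, f, f/N) for each interval (lower, upper]."""
    n = len(data)
    rows = []
    for lower, upper in zip(boundaries, boundaries[1:]):
        # Upper endpoint included, lower endpoint excluded.
        f = sum(1 for x in data if lower < x <= upper)
        rows.append((lower, upper, f, f / n))
    return rows


# Assumed class boundaries of equal width 10, covering min = 80 to max = 119.
boundaries = [70, 80, 90, 100, 110, 120]
table = grouped_frequency_table(verbal_iq, boundaries)

for lower, upper, f, rel in table:
    print(f"{lower:>3} to {upper:<3}  f = {f:<2}  f/N = {rel:.3f}")
```

With these assumed boundaries the cell frequencies are 1, 6, 7, 6 and 3, which add up to N = 23.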


Histogram: Verbal IQ


Histogram: Math IQ


Example

  • In this example we are comparing (for two drugs A and B) the time to metabolize the drug.

  • 120 cases were given drug A.

  • 120 cases were given drug B.

  • Data on time to metabolize each drug is given on the next two slides


Drug A


Drug B


Grouped frequency tables


Histogram: drug A (time to metabolize)


Histogram: drug B (time to metabolize)


The Grouped frequency table: The Histogram




To draw a histogram

Draw above each class interval:

  • A vertical bar whose height is proportional to either the cell frequency (f) or the relative frequency (f/N).

[Figure: histogram sketch; vertical axis: frequency (f) or relative frequency (f/N); horizontal axis: Class Interval]
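
A minimal text-based sketch of this drawing rule, assuming the cell frequencies have already been counted. The intervals and frequencies below are those of Verbal IQ in Data Set #3 under assumed class boundaries 70, 80, ..., 120. `draw_histogram` is a hypothetical helper, and the bars are drawn sideways to keep the code short.

```python
def draw_histogram(intervals, frequencies, relative=False):
    """Return one text bar per class interval.

    The bar length is the cell frequency f. With relative=True the label
    shows f/N instead of f, which rescales the axis but not the shape.
    """
    n = sum(frequencies)
    lines = []
    for (lower, upper), f in zip(intervals, frequencies):
        label = f"{f / n:.2f}" if relative else str(f)
        lines.append(f"({lower:>3}, {upper:>3}] {'#' * f} {label}")
    return lines


intervals = [(70, 80), (80, 90), (90, 100), (100, 110), (110, 120)]
frequencies = [1, 6, 7, 6, 3]  # assumed grouping of the Verbal IQ scores

bars = draw_histogram(intervals, frequencies)
print("\n".join(bars))
```

Because every bar is divided by the same N, plotting f or f/N gives the same picture up to the scale of the frequency axis.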


Some comments about histograms

  • The width of the class intervals should be chosen so that the number of intervals with a frequency less than 5 is small.

  • This means that the width of the class intervals can decrease as the sample size increases
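
As a rough illustration of this guideline, the sketch below counts how many class intervals hold fewer than 5 observations for two candidate widths, using the Verbal IQ scores of Data Set #3. The starting points, widths and the helper name `sparse_intervals` are assumptions made for the example.

```python
verbal_iq = [86, 104, 86, 105, 118, 96, 90, 95, 105, 84, 94, 119,
             82, 80, 109, 111, 89, 99, 94, 99, 95, 102, 102]


def sparse_intervals(data, start, width, count):
    """Count intervals (lower, lower + width] with a frequency below 5."""
    sparse = 0
    for i in range(count):
        lower = start + i * width
        f = sum(1 for x in data if lower < x <= lower + width)
        if f < 5:
            sparse += 1
    return sparse


# Width 5: nine intervals from 75 to 120. Width 10: five intervals from 70 to 120.
narrow = sparse_intervals(verbal_iq, start=75, width=5, count=9)
wide = sparse_intervals(verbal_iq, start=70, width=10, count=5)
print(narrow, wide)
```

With only N = 23 observations, the narrower width leaves 8 of its 9 intervals with fewer than 5 cases, while the wider one leaves 2 of 5, so the wider intervals are the better choice at this sample size.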






[Figures: histograms for samples of size N = 25, N = 100, N = 500, N = 2000 and N = ∞]


Comment: The proportion of area under a histogram between two points estimates the proportion of cases in the sample (and the population) between those two values.
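
A small sketch of this comment, assuming the two points coincide with class boundaries (a point inside an interval would need an extra interpolation step that is not shown). The frequencies are those of Verbal IQ in Data Set #3 under assumed boundaries 70, 80, ..., 120, and `area_proportion` is a hypothetical helper name.

```python
def area_proportion(boundaries, frequencies, a, b):
    """Proportion of histogram area between class boundaries a and b."""
    total = 0.0
    between = 0.0
    for lower, upper, f in zip(boundaries, boundaries[1:], frequencies):
        area = (upper - lower) * f  # bar area = width x height
        total += area
        if a <= lower and upper <= b:
            between += area
    return between / total


boundaries = [70, 80, 90, 100, 110, 120]
frequencies = [1, 6, 7, 6, 3]

p = area_proportion(boundaries, frequencies, 90, 110)
print(round(p, 3))
```

Here 13 of the 23 students fall between 90 and 110, and the two corresponding bars hold the same share (13/23) of the total area.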


Example: The following histogram displays the birth weight (in kg) of n = 100 births.




The Characteristics of a Histogram

  • Central Location (average)

  • Spread (Variability, Dispersion)

  • Shape


Central Location









The Stem-Leaf Plot

An alternative to the histogram



Example

Verbal IQ = 84

  • Stem = tens digit = 8

  • Leaf = units digit = 4


Example

Verbal IQ = 104

  • Stem = tens digits = 10

  • Leaf = units digit = 4
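
In code, the split into stem and leaf is an integer division by 10 with remainder. A minimal sketch, assuming whole-number observations; `stem_and_leaf` is a hypothetical helper name.

```python
def stem_and_leaf(value):
    """Split a whole number into (stem, leaf): tens part and units digit."""
    stem, leaf = divmod(value, 10)
    return stem, leaf


print(stem_and_leaf(84))   # stem 8, leaf 4
print(stem_and_leaf(104))  # stem 10, leaf 4
```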


To Construct a Stem-Leaf diagram

  • Make a vertical list of “all” stems

  • Then behind each stem make a horizontal list of each leaf


Example

The data on N = 23 students

Variables

  • Verbal IQ

  • Math IQ

  • Initial Reading Achievement Score

  • Final Reading Achievement Score




We now construct a stem-leaf diagram of Verbal IQ


A vertical list of the stems

8
9
10
11
12

We now list the leaves behind each stem






 8 | 6 6 4 2 0 9
 9 | 6 0 5 4 9 4 9 5
10 | 4 5 5 9 2 2
11 | 8 9 1
12 |


The leaves may be arranged in order

 8 | 0 2 4 6 6 9
 9 | 0 4 4 5 5 6 9 9
10 | 2 2 4 5 5 9
11 | 1 8 9
12 |
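
The two construction steps (a vertical list of stems, then the leaves behind each stem) can be sketched as follows for the Verbal IQ scores. The function name `stem_leaf` and its `ordered` flag are hypothetical, and the stems 8 to 12 are passed in explicitly to match the vertical list used in this example.

```python
verbal_iq = [86, 104, 86, 105, 118, 96, 90, 95, 105, 84, 94, 119,
             82, 80, 109, 111, 89, 99, 94, 99, 95, 102, 102]


def stem_leaf(data, stems, ordered=False):
    """Map each stem (tens part) to the list of its leaves (units digits)."""
    rows = {stem: [] for stem in stems}  # vertical list of "all" stems
    for value in data:
        stem, leaf = divmod(value, 10)
        rows[stem].append(leaf)          # leaves appear in data order
    if ordered:
        for leaves in rows.values():
            leaves.sort()
    return rows


rows = stem_leaf(verbal_iq, stems=range(8, 13), ordered=True)
for stem, leaves in rows.items():
    print(f"{stem:>2} | {' '.join(map(str, leaves))}")
```

Calling `stem_leaf` with `ordered=False` keeps the leaves in the order in which the students appear in the table.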


The stem-leaf diagram is equivalent to a histogram

 8 | 0 2 4 6 6 9
 9 | 0 4 4 5 5 6 9 9
10 | 2 2 4 5 5 9
11 | 1 8 9
12 |




Rotating the stem-leaf diagram we have

[Figure: the rotated stem-leaf diagram; horizontal axis marked 80, 90, 100, 110, 120]
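
The equivalence can be checked numerically: the number of leaves behind each stem is the cell frequency of the class interval 80 to 89, 90 to 99, and so on. A short sketch using the Verbal IQ scores (the variable names are illustrative):

```python
from collections import Counter

verbal_iq = [86, 104, 86, 105, 118, 96, 90, 95, 105, 84, 94, 119,
             82, 80, 109, 111, 89, 99, 94, 99, 95, 102, 102]

# Number of leaves per stem = frequency of the interval stem*10 .. stem*10 + 9.
leaf_counts = Counter(value // 10 for value in verbal_iq)

for stem in range(8, 13):
    low = stem * 10
    print(f"{low:>3}-{low + 9:<3} {'#' * leaf_counts[stem]}")
```

Each row of `#` symbols has the same length as the corresponding row of leaves, so the stem-leaf diagram is a histogram that also keeps the individual values.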


The two-part stem-leaf diagram

Sometimes you want to break the stems into two parts

* for leaves 0,1,2,3,4

(no symbol) for leaves 5,6,7,8,9


Stem-leaf diagram for Initial Reading Achievement

1. | 0 1 2 4 4 4 4 4 5 5 5 5 6 6 6 6 6 7 7 7 8 9
2. | 0

This diagram as it stands does not give an accurate picture of the distribution.


We try breaking the stems into two parts

1.* | 0 1 2 4 4 4 4 4
1.  | 5 5 5 5 6 6 6 6 6 7 7 7 8 9
2.* | 0
2.  |
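
A sketch of the two-part split, computed from the Initial Reading Achievement column of the Data Set #3 table. The helper name `two_part_stem_leaf` is hypothetical. Marking the low line (leaves 0 to 4) with `*` and leaving the high line unmarked is an assumption that follows the diagram shown above.

```python
initial_ra = [1.1, 1.5, 1.5, 2.0, 1.9, 1.4, 1.5, 1.4, 1.7, 1.6, 1.6, 1.7,
              1.2, 1.0, 1.8, 1.4, 1.6, 1.6, 1.4, 1.4, 1.5, 1.7, 1.6]


def two_part_stem_leaf(data):
    """Split each stem into a low line (leaves 0-4) and a high line (5-9)."""
    rows = {}
    for value in data:
        tenths = round(value * 10)        # work in whole tenths: 1.4 -> 14
        stem, leaf = divmod(tenths, 10)
        part = "*" if leaf <= 4 else " "  # assumed marker convention
        rows.setdefault((stem, part), []).append(leaf)
    return {key: sorted(leaves) for key, leaves in rows.items()}


rows = two_part_stem_leaf(initial_ra)
# Print the low line of each stem before its high line.
for (stem, part), leaves in sorted(rows.items(),
                                   key=lambda kv: (kv[0][0], kv[0][1] != "*")):
    print(f"{stem}.{part} | {''.join(map(str, leaves))}")
```

Lines with no leaves (such as the high line of stem 2) are simply absent from the returned dictionary.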


The five-part stem-leaf diagram

If the two-part stem-leaf diagram is not adequate you can break the stems into five parts

* for leaves 0,1

t for leaves 2,3

f for leaves 4,5

s for leaves 6,7

(no symbol) for leaves 8,9


We try breaking the stems into five parts

1.* | 0 1
1.t | 2
1.f | 4 4 4 4 4 5 5 5 5
1.s | 6 6 6 6 6 7 7 7
1.  | 8 9
2.* | 0
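
The five-part split can be sketched the same way, again from the Initial Reading Achievement column of the Data Set #3 table. The names `MARKERS` and `five_part_stem_leaf` are hypothetical. The marker order (`*`, `t`, `f`, `s`, then an unmarked line) is an assumption that follows the diagram shown above.

```python
initial_ra = [1.1, 1.5, 1.5, 2.0, 1.9, 1.4, 1.5, 1.4, 1.7, 1.6, 1.6, 1.7,
              1.2, 1.0, 1.8, 1.4, 1.6, 1.6, 1.4, 1.4, 1.5, 1.7, 1.6]

# Assumed line markers for leaves 0-1, 2-3, 4-5, 6-7 and 8-9.
MARKERS = ["*", "t", "f", "s", " "]


def five_part_stem_leaf(data):
    """Group leaves by (stem, part), where part = leaf // 2 runs from 0 to 4."""
    rows = {}
    for value in data:
        tenths = round(value * 10)  # work in whole tenths: 1.4 -> 14
        stem, leaf = divmod(tenths, 10)
        rows.setdefault((stem, leaf // 2), []).append(leaf)
    return {key: sorted(leaves) for key, leaves in sorted(rows.items())}


rows = five_part_stem_leaf(initial_ra)
for (stem, part), leaves in rows.items():
    print(f"{stem}.{MARKERS[part]} | {''.join(map(str, leaves))}")
```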


Stem-leaf Diagrams

Verbal IQ, Math IQ, Initial RA, Final RA


Some Conclusions

  • Math IQ, Verbal IQ seem to have approximately the same distribution

  • “bell shaped” centered about 100

  • Final RA seems to be larger than initial RA and more spread out

  • Improvement in RA

  • Amount of improvement quite variable
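
The conclusions about reading achievement can be checked roughly against the Data Set #3 table. The sketch below uses the mean as a simple measure of location and the range as a simple measure of spread; these choices and the helper names are assumptions for illustration, not necessarily the measures used in the lecture.

```python
initial_ra = [1.1, 1.5, 1.5, 2.0, 1.9, 1.4, 1.5, 1.4, 1.7, 1.6, 1.6, 1.7,
              1.2, 1.0, 1.8, 1.4, 1.6, 1.6, 1.4, 1.4, 1.5, 1.7, 1.6]
final_ra = [1.7, 1.7, 1.9, 2.0, 3.5, 2.4, 1.8, 2.0, 1.7, 1.7, 1.7, 3.1,
            1.8, 1.7, 2.5, 3.0, 1.8, 2.6, 1.4, 2.0, 1.3, 3.1, 1.9]


def mean(values):
    return sum(values) / len(values)


def value_range(values):
    """Largest minus smallest value, rounded to one decimal place."""
    return round(max(values) - min(values), 1)


# Improvement of each student: final score minus initial score.
improvement = [round(f - i, 1) for i, f in zip(initial_ra, final_ra)]

print(f"mean initial = {mean(initial_ra):.2f}, mean final = {mean(final_ra):.2f}")
print(f"range initial = {value_range(initial_ra)}, range final = {value_range(final_ra)}")
print(f"improvement from {min(improvement)} to {max(improvement)}")
```

The final scores have the larger mean (about 2.10 against about 1.53) and the larger range (2.2 against 1.0), and the individual improvements run from -0.2 to 1.6, which agrees with the conclusions listed above.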


Next Topic

  • Numerical Measures - Location

