
# Descriptive Statistics - PowerPoint PPT Presentation

Descriptive Statistics. Prepared by Masood Amjad Khan, GCU, Lahore.


Index

1. Index

2. Index

3. Statistics (Definitions)

4. Descriptive Statistics

5. Inferential Statistics

6. Examples of 4 and 5

7. Data, Level of Measurements

8. Variable

9. Discrete Variable

10. Continuous Variable

11. Frequency Distribution

12. Constructing Frequency Distribution

13. Example of 12

14. Displaying the Data

15. Bar Chart, Pie Chart

16. Stem and Leaf Plot

17. Graph

18. Histogram

19. Frequency Polygon

20. Cumulative Frequency Polygon

21. Summary Measures

22. Goals

23. Arithmetic Mean

24. Characteristics of Mean

25. Examples of 23

26. Weighted Mean

27. Example: Weighted Mean

28. Geometric Mean

29. Example: Geometric Mean

30. Median

31. Example of Median

32. Properties of Median

33. Mode

34. Examples of Mode

35. Positions of Mean, Median and Mode

36. Dispersion

37. Range and Mean Deviation

39. Example of Mean Deviation

40. Variance

41. Examples of Variance

42. Moments

43. Examples of Moments

44. Skewness

45. Types of Skewness

46. Coefficient of Skewness

47. Example of Skewness

48. Empirical Rule

49. Exercise

STATISTICS

Definition (Common Usage)

1. No. of children born in a hospital in some specified time.

2. No. of students enrolled in GCU in 2007.

3. No. of road accidents on a motorway.

4. Amount spent on Research and Development in GCU during 2006-2007.

5. No. of shutdowns of the computer network on a particular day.

Definition (Field or Discipline of Study)

The science of collection, presentation, analysis, and interpretation of data to make decisions and forecasts.

• Descriptive Statistics

• Inferential Statistics

• Examples of Descriptive and Inferential Statistics

Probability provides the transition between Descriptive and Inferential Statistics.

Descriptive Statistics

Consists of methods for organizing, displaying, and describing data by using tables, graphs, and summary measures.

Data

A data set is a collection of observations on one or more variables.

Types of Data

Organizing the Data

Frequency Distribution

A grouping of quantitative data into mutually exclusive classes showing the number of observations in each class.

Selling price of 80 vehicles

| Vehicle Selling Price | Number of Vehicles |
|---|---|
| 15000 to 24000 | 48 |
| 24000 to 33000 | 30 |
| 33000 to 42000 | 2 |

Frequency Table

A grouping of qualitative data into mutually exclusive classes showing the number of observations in each class.

Preference of four types of beverage by 100 customers

| Beverage | Number |
|---|---|
| Cola-Plus | 40 |
| Coca-Cola | 25 |
| Pepsi | 20 |
| 7-UP | 15 |

Displaying the Data

Diagrams/Charts

• Bar Chart

• Pie Chart

Graph

• Histogram

• Frequency Polygon

Variable

A characteristic under study that assumes different values for different elements (e.g. height of persons, no. of students in GCU).

Quantitative Variable

A variable that can be measured numerically is called a quantitative variable.

• Discrete variable

• Continuous variable

Categorical Variable

A variable that cannot assume a numerical value but can be classified into two or more non-numeric categories is called a qualitative or categorical variable.

• Educational achievements

• Marital status

• Brand of PC

Continuous Variable

A variable whose observations can assume any value within a specific range.

• Amount of income tax paid.

• Weight of a student.

• Yearly rainfall in Murree.

• Time elapsed between successive network breakdowns.

Discrete Variable

A variable that can assume only certain values, with gaps between the values.

• Children in a family

• Strokes on a golf hole

• TV sets owned

• Cars arriving at GCU in an hour

• Students in each section of a statistics course

Inferential Statistics

Consists of methods that use sample results to help make decisions or predictions about a population.

• Hypothesis

• Estimation: Point Estimation, Interval Estimation

Sample

1. A portion of the population selected for study.

2. A subset of data selected from a population.

Population

1. Consists of all individual items or objects whose characteristics are being studied.

2. Collection of data that describe some phenomenon of interest.

Finite Population

• Length of fish in a particular lake.

• No. of students of the Statistics course in BCS.

• No. of traffic violations on some specific holiday.

Infinite Population

• Depth of a lake from any conceived position.

• Length of life of a certain brand of light bulb.

• Stars in the sky.

Examples of Descriptive and Inferential Statistics

Descriptive

• At least 5% of all fires reported last year in Lahore were deliberately set.

• Next to colonial homes, more residents in a specified locality prefer a contemporary design.

Inferential

• As a result of a recent poll, most Pakistanis are in favor of an independent and powerful parliament.

• As a result of recent cutbacks by the oil-producing nations, we can expect the price of gasoline to double in the next year.

Types of Data

• Data can be classified according to level of measurement.

• The level of measurement dictates the calculations that can be done to summarize and present the data.

• It also determines the statistical tests that should be performed.

| Level | Description | Examples |
|---|---|---|
| Nominal | Data may only be classified. | Jersey numbers of football players; make of car |
| Ordinal | Data are ranked; no meaningful difference between values. | … class; team standings |
| Interval | Meaningful difference between values. | Temperature; dress size |
| Ratio | Meaningful 0 point and ratio between values. | No. of patients seen; no. of sales calls made; distance students travel to class |

Diagrams/Charts

Bar Chart

A graph in which the classes are reported on the horizontal axis and the class frequencies on the vertical axis. The class frequencies are proportional to the heights of the bars.

Pie Chart

A chart that shows the proportion or percent that each class represents of the total number of frequencies.

Angle = (f/n) × 360

| Class | f | Angle |
|---|---|---|
| White | 130 | 36 |
| Black | 104 | 29 |
| Lime | 325 | 90 |
| Orange | 455 | 126 |
| Red | 286 | 79 |
| Total | n = 1300 | 360 |
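The pie-chart angle rule, Angle = (f/n) × 360, can be checked with a few lines of Python. This is a minimal sketch: the function name `pie_angles` is illustrative and not part of the slides, and rounding to whole degrees is an assumption that matches the angles shown in the table.

```python
def pie_angles(frequencies):
    """Return the sector angle in whole degrees for each class: (f / n) * 360."""
    n = sum(frequencies.values())  # total number of observations
    return {label: round(f / n * 360) for label, f in frequencies.items()}


# Class frequencies from the pie chart table
colours = {"White": 130, "Black": 104, "Lime": 325, "Orange": 455, "Red": 286}
angles = pie_angles(colours)
print(angles)
```

Rounding each angle separately happens to sum to 360 for this table; in general the rounded angles may be off by a degree in total.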

Graphs

• Frequency Polygon

• Cumulative Frequency Polygon

Describing the Data

Summary Measures

Goals

Measures of Location

• Arithmetic Mean

• Weighted Arithmetic Mean

• Geometric Mean

• Median

• Mode

Measures of Dispersion

• Range, Mean Deviation

• Variance, Standard Deviation

Moments

Skewness

Goals

• Calculate the arithmetic mean, weighted mean, median, mode, and geometric mean.

• Explain the characteristics and uses of each measure of location.

• Identify the position of the mean, median, and mode for both symmetric and skewed distributions.

• Compute and interpret the range, mean deviation, variance, and standard deviation.

• Understand the characteristics and uses of each measure of dispersion.

• Understand Chebyshev’s theorem and the Empirical Rule as they relate to a set of observations.

The arithmetic mean is the most widely used measure of location. It requires the interval scale.

Its major characteristics are:

• All values are used.

• It is unique.

• The sum of the deviations from the mean is 0.

• It is calculated by summing the values and dividing by the number of values.

Characteristics of the Mean

• Every set of interval-level and ratio-level data has a mean.

• All the values are included in computing the mean.

• A set of data has a unique mean.

• The mean is affected by unusually large or small data values.

• The arithmetic mean is the only measure of central tendency where the sum of the deviations of each value from the mean is zero.

Use of Tables of Random Numbers

• Random numbers are randomly produced digits from 0 to 9.

• A table of random numbers contains rows and columns of these randomly produced digits.

• In using the table:

• choose the starting point at random;

• read off the digits in groups of one, two, three, or more digits in any predetermined direction (rows or columns).

Example

• Choose a sample of size 7 from a group of 80 objects.

• Label the objects 01, 02, 03, …, 80 in any order.

• Arbitrarily enter the table on any line and read out the pairs of digits in any two consecutive columns.

• Ignore numbers which recur and those greater than 80.
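The table-reading procedure can be imitated in code. The sketch below is an illustration under stated assumptions, not the slides' own method: a seeded pseudo-random generator stands in for the printed table, the function name `sample_from_digit_table` is hypothetical, and the label 00 is skipped as well because no object carries it.

```python
import random


def sample_from_digit_table(population_size, sample_size, seed=0):
    """Pick distinct labels 1..population_size by reading random digit groups."""
    rng = random.Random(seed)          # stand-in for entering the table at random
    width = len(str(population_size))  # two digits per label when there are 80 objects
    chosen = []
    while len(chosen) < sample_size:
        # Read one group of `width` consecutive random digits
        label = int("".join(str(rng.randrange(10)) for _ in range(width)))
        # Ignore 00, labels above the population size, and recurring labels
        if 1 <= label <= population_size and label not in chosen:
            chosen.append(label)
    return chosen


sample = sample_from_digit_table(80, 7)
print(sample)
```

A different seed plays the role of a different starting point in the table.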

Construction of Frequency Distribution

Step 1

• How many groups (classes)?

• Just enough classes to reveal the shape of the distribution.

• Let k be the desired no. of classes.

• k should be such that $2^k > n$.

• If n = 80 and we choose k = 6, then $2^6 = 64$, which is < 80, so k = 6 is not desirable. If we take k = 7, then $2^7 = 128$, which is > 80, so the no. of classes should be 7.

Step 2

• Determine the class interval (width).

• The class interval should be the same for all classes.

• The formula to determine class width:

$$i \ge \frac{H - L}{k}$$

where i is the class width, H is the highest observed value, L is the lowest observed value, and k is the number of classes.
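Steps 1 and 2 can be sketched in Python. The function names are illustrative, and the rounding unit of 1000 is an assumption chosen to reproduce the width i = 3000 used in the vehicle-price example; the slides only say the width is rounded up to a convenient multiple.

```python
import math


def number_of_classes(n):
    """Smallest k with 2**k > n (the rule from Step 1)."""
    k = 1
    while 2 ** k <= n:
        k += 1
    return k


def class_width(high, low, k, unit=1000):
    """Raw width (H - L) / k and the same width rounded up to a multiple of `unit`."""
    raw = (high - low) / k
    return raw, math.ceil(raw / unit) * unit


k = number_of_classes(80)                    # n = 80 vehicles
raw, width = class_width(35925, 15546, k)    # H and L from the example data
lower_limit = (15546 // width) * width       # first lower limit: a multiple of the width
print(k, round(raw), width, lower_limit)
```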

Construction of Frequency Distribution (continued)

Step 3

• Set the individual class limits.

• Class limits should be very clear.

• Class limits should not be overlapping.

• Sometimes the class width is rounded up, which makes the covered range k × i larger than the actual range H - L.

• Make the lower limit of the first class a multiple of the class width.

Step 4

• Make a tally of the observations falling in each class.

Step 5

• Count the number of items in each class (class frequency).

Construction of Frequency Distribution (Example)

Raw Data (Ungrouped Data)

23372 20454 23591 24220 30655 22442 17891 18021 28683 30872
19587 21639 24296 15935 21558 20047 24285 24324 24609 26651
29076 20642 19889 19873 25251 28034 23169 28337 17399 20895
25277 20004 17357 20155 19688 28670 20818 19766 21981 20203
23765 25783 26661 24533 27453 32492 17968 25799 18263 23657
35851 20633 24052 15794 20642 20356 21442 21722 19331 32277
15546 29237 18890 20962 22845 26285 27896 35925 27443 17266
23613 21740 22374 24571 25449 22817 26613 19251 20445

Construction of Frequency Distribution (Example Continued)

• Following Step 1, with n = 80, k should be 7.

• Following Step 2, the class width should be 2911.

• The width is usually rounded up to a multiple of 10 or 100.

• The width is taken as i = 3000.

• Following Step 3, with i = 3000 and k = 7, the range is 7 × 3000 = 21000,

• whereas the actual range is H - L = 35925 - 15546 = 20379.

• The lower limit of the first class should be a multiple of the class width.

• Thus the lower limit of the starting class is taken as 15000.

• Following Step 4 and Step 5:

| Selling Price | Frequency |
|---|---|
| 15000 up to 18000 | 8 |
| 18000 up to 21000 | 23 |
| 21000 up to 24000 | 17 |
| 24000 up to 27000 | 18 |
| 27000 up to 30000 | 8 |
| 30000 up to 33000 | 4 |
| 33000 up to 36000 | 2 |
| Total | 80 |

Histogram

A graph in which the classes are marked on the horizontal axis and the class frequencies on the vertical axis. The class frequencies are represented by the heights of the bars and the bars are drawn adjacent to each other.

Histogram (Example 1)

| Group | H | cf | f |
|---|---|---|---|
| 1.6 - 2.2 | 2.1 | 2 | 2 |
| 2.2 - 2.8 | 2.7 | 6 | 4 |
| 2.8 - 3.4 | 3.3 | 19 | 13 |
| 3.4 - 4.0 | 3.9 | 32 | 13 |
| 4.0 - 4.6 | 4.5 | 38 | 6 |
| 4.6 - 5.2 | 5.1 | 40 | 2 |

[Figure: histogram of the groups above; horizontal axis "Groups", marked from 1.60 to 5.20.]

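Histogram (Example 1)

| Group | H | cf | f |
|---|---|---|---|
| 1.5 - 2.0 | 2 | 2 | 2 |
| 2.0 - 2.5 | 2.5 | 4 | 2 |
| 2.5 - 3.0 | 3 | 9 | 5 |
| 3.0 - 3.5 | 3.5 | 24 | 15 |
| 3.5 - 4.0 | 4 | 32 | 8 |
| 4.0 - 4.5 | 4.5 | 38 | 6 |
| 4.5 - 5.0 | 5 | 40 | 2 |

[Figure: histogram of the groups above; vertical axis "Percent", horizontal axis "Groups", marked from 1.5 to 5.0.]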

Frequency Polygon

A graph in which the points formed by the intersections of the class midpoints and the class frequencies are connected by line segments.

Mid point = ($L_i$ + $H_i$)/2

Frequency Polygon (Example 1)

| Group | Mid pt | cf | f |
|---|---|---|---|
| 1.6 - 2.2 | 1.9 | 2 | 2 |
| 2.2 - 2.8 | 2.5 | 6 | 4 |
| 2.8 - 3.4 | 3.1 | 19 | 13 |
| 3.4 - 4.0 | 3.7 | 32 | 13 |
| 4.0 - 4.6 | 4.3 | 38 | 6 |
| 4.6 - 5.2 | 4.9 | 40 | 2 |

[Figure: frequency polygon through the class midpoints 1.90, 2.50, 3.10, 3.70, 4.30, and 4.90; vertical axis "Percent".]

Frequency Polygon (Continued)

Frequency Polygon (Example 1), k = 7

| Group | Mid pt | cf | f |
|---|---|---|---|
| 1.5 - 2.0 | 1.75 | 2 | 2 |
| 2.0 - 2.5 | 2.25 | 3 | 1 |
| 2.5 - 3.0 | 2.75 | 7 | 4 |
| 3.0 - 3.5 | 3.25 | 22 | 15 |
| 3.5 - 4.0 | 3.75 | 32 | 10 |
| 4.0 - 4.5 | 4.25 | 37 | 5 |
| 4.5 - 5.0 | 4.75 | 40 | 3 |

[Figure: frequency polygon through the class midpoints 1.75, 2.25, 2.75, 3.25, 3.75, 4.25, and 4.75; vertical axis "Percent".]

Cumulative Frequency Polygon

A graph in which the points formed by the intersections of the class midpoints and the class cumulative frequencies are connected by line segments.

A cumulative frequency polygon portrays the number or percent of observations below a given value.

| Group | Mid pt | cf | f |
|---|---|---|---|
| 1.6 - 2.2 | 1.9 | 2 | 2 |
| 2.2 - 2.8 | 2.5 | 6 | 4 |
| 2.8 - 3.4 | 3.1 | 19 | 13 |
| 3.4 - 4.0 | 3.7 | 32 | 13 |
| 4.0 - 4.6 | 4.3 | 38 | 6 |
| 4.6 - 5.2 | 4.9 | 40 | 2 |

Cumulative Frequency Polygon (Continued)

| Group | Mid pt | cf | f |
|---|---|---|---|
| 1.5 - 2.0 | 1.75 | 2 | 2 |
| 2.0 - 2.5 | 2.25 | 3 | 1 |
| 2.5 - 3.0 | 2.75 | 7 | 4 |
| 3.0 - 3.5 | 3.25 | 22 | 15 |
| 3.5 - 4.0 | 3.75 | 32 | 10 |
| 4.0 - 4.5 | 4.25 | 37 | 5 |
| 4.5 - 5.0 | 4.75 | 40 | 3 |

Stem and Leaf Plot

What is a Stem and Leaf Plot Diagram?

• A Stem and Leaf Plot is a type of graph that is similar to a histogram but shows more information.

• It summarizes the shape of a set of data.

• It provides extra detail regarding individual values.

• The data is arranged by place value.

• The digits in the largest place are referred to as the stem.

• The digits in the smallest place are referred to as the leaf.

• The leaves are always displayed to the right of the stem.

What Are They Used For?

• Stem and Leaf Plots are great organizers for large amounts of information.

• Series of scores on sports teams, series of temperatures or rainfall over a period of time, and series of classroom test scores are examples of when Stem and Leaf Plots could be used.

Constructing Stem and Leaf Plot

Make a Stem and Leaf Plot with the following temperatures for June.

77 80 82 68 65 59 61 57 50 62 61 70 69 64 67 70 62 65 65 73 76 87 80 82 83 79 79 71 80 77

• Begin with the lowest temperature.

• The lowest temperature of the month was 50.

• Enter the 5 in the tens column and a 0 in the ones.

• The next lowest is 57.

• Enter a 7 in the ones.

• Next is 59; enter a 9 in the ones.

• Find all of the temperatures that were in the 60's, 70's and 80's.

• Enter the rest of the temperatures sequentially until your Stem and Leaf Plot contains all of the data.

Temperature

| Stem (Tens) | Leaf (Ones) |
|---|---|
| 5 | 0 7 9 |
| 6 | 1 1 2 2 4 5 5 5 7 8 9 |
| 7 | 0 0 1 3 6 7 7 9 9 |
| 8 | 0 0 0 2 2 3 7 |
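The construction steps above can be sketched in Python for two-digit whole numbers. The function name `stem_and_leaf` is illustrative; the tens digit is taken as the stem and the ones digit as the leaf, as in the temperature table.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Group sorted two-digit values by tens digit (stem) with ones digits as leaves."""
    plot = defaultdict(list)
    for value in sorted(values):        # sorting first keeps each row of leaves in order
        plot[value // 10].append(value % 10)
    return dict(plot)


june = [77, 80, 82, 68, 65, 59, 61, 57, 50, 62, 61, 70, 69, 64, 67,
        70, 62, 65, 65, 73, 76, 87, 80, 82, 83, 79, 79, 71, 80, 77]
plot = stem_and_leaf(june)
for stem, leaves in plot.items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
```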

Stem and Leaf Example

Make a Stem and Leaf Plot for the following data.

2.4 0.7 3.9 2.8 1.3 1.6 2.9 2.6 3.7 2.1
3.2 3.5 1.8 3.1 0.3 4.6 0.9 3.4 2.3 2.5
0.4 2.1 2.3 1.5 4.3 1.8 2.4 1.3 2.6 1.8
2.7 0.4 2.8 3.5 1.4 1.7 3.9 1.1 5.9 2.0
5.3 6.3 0.2 2.0 1.9 1.2 2.5 2.1 1.2 1.7

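Measures of Location

Arithmetic Mean

Ungrouped Data

Population: let $X_1, X_2, \ldots, X_N$ be the N observations in the population. The mean is defined as

$$\mu = \frac{\sum_{i=1}^{N} X_i}{N}$$

Sample: let $X_1, X_2, \ldots, X_n$ be the n observations in the sample. The mean is defined as

$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$$

Grouped Data

Population: let $X_i$ and $f_i$ be the midpoint and frequency respectively of the ith group in the population. The mean is defined as

$$\mu = \frac{\sum f_i X_i}{\sum f_i}$$

Sample: let $X_i$ and $f_i$ be the midpoint and frequency respectively of the ith group in the sample. The mean is defined as

$$\bar{X} = \frac{\sum f_i X_i}{\sum f_i}$$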

Numerical Examples of Arithmetic Mean (Ungrouped Data)

Following is a random sample of 12 clients showing the number of minutes used by clients in a particular cell phone last month. What is the mean number of minutes used?

Example of Population Mean

There are automobile manufacturing companies in the U.S.A. Listed below is the no. of patents granted by the US Government to each company. Is this information a sample or population?

Numerical Examples of Arithmetic Mean (Grouped Data)

Following is the frequency distribution of selling prices of vehicles at Whitner Autoplex last month. Find the arithmetic mean.

| Selling Price (\$ thousands) | Frequency f | Midpoint X | fX |
|---|---|---|---|
| 15 - 18 | 8 | 16.5 | 132.0 |
| 18 - 21 | 23 | 19.5 | 448.5 |
| 21 - 24 | 17 | 22.5 | 382.5 |
| 24 - 27 | 18 | 25.5 | 459.0 |
| 27 - 30 | 8 | 28.5 | 228.0 |
| 30 - 33 | 4 | 31.5 | 126.0 |
| 33 - 36 | 2 | 34.5 | 69.0 |
| Total | 80 | | 1845.0 |

$$\bar{X} = \frac{\sum fX}{\sum f} = \frac{1845.0}{80} \approx 23.1$$

So the mean vehicle selling price is about \$23100.

[Figure: Point of Equilibrium]
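The grouped-data mean for the vehicle prices can be recomputed with a short Python sketch. The function name `grouped_mean` is illustrative; midpoints and frequencies are copied from the table.

```python
def grouped_mean(midpoints, frequencies):
    """Mean of grouped data: sum(f * X) / sum(f)."""
    total = sum(f * x for x, f in zip(midpoints, frequencies))
    return total / sum(frequencies)


# Midpoints (in $ thousands) and class frequencies of the 80 vehicle prices
midpoints = [16.5, 19.5, 22.5, 25.5, 28.5, 31.5, 34.5]
frequencies = [8, 23, 17, 18, 8, 4, 2]
mean = grouped_mean(midpoints, frequencies)
print(round(mean, 4))
```

The unrounded value is 1845/80 = 23.0625, which the slide reports as about \$23100.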

Summary Measures

Weighted Mean

Used when the values of a variable are associated with a certain quality, e.g. the prices of medium, large, and big soft drinks.

The weighted mean of a set of numbers $X_1, X_2, \ldots, X_n$ with corresponding weights $w_1, w_2, \ldots, w_n$ is computed from the following formula:

$$\bar{X}_w = \frac{w_1 X_1 + w_2 X_2 + \cdots + w_n X_n}{w_1 + w_2 + \cdots + w_n}$$

| Soft Drink | Price | Weights |
|---|---|---|
| Medium | \$0.90 | 3 |
| Large | \$1.25 | 4 |
| Big | \$1.50 | 3 |

EXAMPLE: Weighted Mean

The Carter Construction Company pays its hourly employees \$16.50, \$19.00, or \$25.00 per hour. There are 26 hourly employees, 14 of whom are paid at the \$16.50 rate, 10 at the \$19.00 rate, and 2 at the \$25.00 rate. What is the mean hourly rate paid to the 26 employees?
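Both weighted-mean examples can be worked with one small function. This is a sketch with an illustrative function name; the weights are the counts given in the slides (soft-drink weights and the number of employees at each rate).

```python
def weighted_mean(values, weights):
    """Weighted mean: sum(w * X) / sum(w)."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


# Soft drink prices weighted 3, 4, 3
drinks = weighted_mean([0.90, 1.25, 1.50], [3, 4, 3])
# Carter Construction: hourly rates weighted by number of employees
wages = weighted_mean([16.50, 19.00, 25.00], [14, 10, 2])
print(round(drinks, 2), round(wages, 2))
```

For the Carter Construction data the weighted total is 471 over 26 employees, so the mean hourly rate is roughly \$18.12.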

Summary Measures

Geometric Mean

The geometric mean of a set of n positive numbers is defined as the nth root of the product of the n values. The formula for the geometric mean is written:

$$GM = \sqrt[n]{X_1 X_2 \cdots X_n}$$

The geometric mean used as the average percent increase over n periods of time is calculated as:

$$GM = \sqrt[n]{\frac{\text{Value at end of period}}{\text{Value at start of period}}} - 1$$

• Useful in finding the average change of percentages, ratios, indexes, or growth rates over time.

• It has a wide application in business and economics because we are often interested in finding the percentage changes in sales, salaries, or economic figures, such as the GDP, which compound or build on each other.

• The geometric mean will always be less than or equal to the arithmetic mean.

Example of Geometric Mean

The return on investment for a company for four successive years was 30%, 20%, -40%, and 200%. Find the geometric mean rate of return on investment.

Solution:

The 1.3 represents the 30 percent return on investment, i.e. the original investment of 1.0 plus the return of 0.3. So

$$GM = \sqrt[4]{(1.3)(1.2)(0.6)(3.0)} = \sqrt[4]{2.808} = 1.294$$

which shows that the average return is 29.4 percent.

If you earned \$30000 in 1997 and \$50000 in 2007, what is your annual rate of increase over the period?

$$GM = \sqrt[10]{\frac{50000}{30000}} - 1 = 0.0524$$

The annual rate of increase is 5.24 percent.
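The two geometric-mean calculations can be reproduced in Python. This is a minimal sketch with illustrative function names; `math.prod` requires Python 3.8 or later.

```python
import math


def geometric_mean(values):
    """nth root of the product of n positive values."""
    return math.prod(values) ** (1 / len(values))


def average_rate_of_increase(start, end, periods):
    """Average growth rate per period: (end / start) ** (1 / periods) - 1."""
    return (end / start) ** (1 / periods) - 1


# Returns of 30%, 20%, -40%, 200% expressed as growth factors
growth = geometric_mean([1.3, 1.2, 0.6, 3.0])
# Earnings of $30000 in 1997 and $50000 in 2007, i.e. 10 yearly periods
rate = average_rate_of_increase(30000, 50000, 10)
print(round(growth, 3), round(rate, 4))
```

Returns must be converted to growth factors (1 + return) first; a return of -40% becomes 0.6, which keeps every factor positive.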

Median

The median is the midpoint of the values after they have been ordered from the smallest to the largest, or from the largest to the smallest.

If the number of observations n is odd, the median is the (n+1)/2th observation.

If n is even, the median is the average of the n/2th and (n/2+1)th observations.

Example:

Determine the median for each set of data.

1) 41 15 39 54 31 15 33

2) 15 16 27 28 41 42

Arrange each set of data in order:

1) 15 15 31 33 39 41 54: n = 7, so the median is the 4th observation, that is 33.

2) 15 16 27 28 41 42: n = 6, so the median is the average of the 3rd and 4th observations, that is (27+28)/2 = 27.5.

Median for Grouped Data

The median is obtained by using the formula:

$$\text{Median} = L_m + \frac{\frac{n}{2} - cf_{m-1}}{f_m} \times I_m$$

where m is the group containing the n/2th observation; $L_m$, $I_m$, and $f_m$ are the lowest value, class width, and frequency respectively of the mth group, and $cf_{m-1}$ is the cumulative frequency of the group before it.

Example (Median)

n/2 = 20, so the median group is 3.40 - 4.00.

$L_m$ = 3.40, $I_m$ = 0.6, $f_m$ = 13, $cf_{m-1}$ = 19
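The grouped-median example can be finished numerically. The sketch below assumes the usual interpolation form, median = L_m + ((n/2 - cf_{m-1}) / f_m) × I_m, and takes the groups 1.6 - 2.2 through 4.6 - 5.2 from the histogram example; the function name is illustrative.

```python
def grouped_median(lower_limits, width, frequencies):
    """Interpolated median of equal-width grouped data."""
    n = sum(frequencies)
    cumulative = 0  # cumulative frequency of the groups before the current one
    for lower, f in zip(lower_limits, frequencies):
        if cumulative + f >= n / 2:  # this group contains the n/2th observation
            return lower + (n / 2 - cumulative) / f * width
        cumulative += f


lower_limits = [1.6, 2.2, 2.8, 3.4, 4.0, 4.6]
frequencies = [2, 4, 13, 13, 6, 2]
median = grouped_median(lower_limits, 0.6, frequencies)
print(round(median, 3))
```

With L_m = 3.40, cf_{m-1} = 19, f_m = 13 and I_m = 0.6, this gives 3.40 + (1/13)(0.6), about 3.446.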

Properties of the Median

• There is a unique median for each data set.

• It is not affected by extremely large or small values and is therefore a valuable measure of central tendency when such values occur.

• It can be computed for ratio-level, interval-level, and ordinal-level data.

• It can be computed for an open-ended frequency distribution if the median does not lie in an open-ended class.

Mode

The mode is the value of the observation that appears most frequently.

Mode (Grouped Data)

Calculate the mode of the following distribution.

Solution:

Modal group is 2.8 - 3.4.

$f_m$ = 14, $f_{m-1}$ = 4, $f_{m+1}$ = 12 and $I_m$ = 0.6
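The formula for the grouped mode is not preserved in the transcript. One common interpolation formula, assumed here and not confirmed as the one on the slide, is Mode = L_m + (f_m - f_{m-1}) / ((f_m - f_{m-1}) + (f_m - f_{m+1})) × I_m. A sketch with the values given above:

```python
def grouped_mode(lower, width, f_modal, f_before, f_after):
    """Interpolated mode inside the modal group (assumed textbook formula)."""
    d1 = f_modal - f_before  # excess over the preceding group's frequency
    d2 = f_modal - f_after   # excess over the following group's frequency
    return lower + d1 / (d1 + d2) * width


# Modal group 2.8 - 3.4 with f_m = 14, f_(m-1) = 4, f_(m+1) = 12, I_m = 0.6
mode = grouped_mode(2.8, 0.6, 14, 4, 12)
print(round(mode, 2))
```

Under this assumed formula the mode works out to 2.8 + (10/12)(0.6) = 3.3.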

The Relative Positions of the Mean, Median and the Mode

Dispersion

A measure of location, such as the mean or the median, only describes the center of the data. It is valuable from that standpoint, but it does not tell us anything about the spread of the data.

For example, if your nature guide told you that the river ahead averaged 3 feet in depth, would you want to wade across on foot without additional information? Probably not. You would want to know something about the variation in the depth.

A second reason for studying the dispersion in a set of data is to compare the spread in two or more distributions.

Studying dispersion through display.

Range and Mean Deviation

Range = Largest value - Smallest value

Mean Deviation

$$MD = \frac{\sum |X - \bar{X}|}{n}$$

Example

The number of cappuccinos sold at the Starbucks location in the Orange County Airport between 4 and 7 p.m. for a sample of 5 days last year were 20, 40, 50, 60, and 80. Determine the range and the mean deviation for the number of cappuccinos sold.

Range = Largest value - Smallest value = 80 - 20 = 60

Mean Deviation (Example)

The number of cappuccinos sold at the Starbucks location in the Orange County Airport between 4 and 7 p.m. for a sample of 5 days last year were 20, 40, 50, 60, and 80.

Determine the mean deviation for the number of cappuccinos sold.
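The worked solution is not preserved in the transcript; it can be sketched as the average absolute deviation from the mean. The function name `mean_deviation` is illustrative.

```python
def mean_deviation(values):
    """Average absolute deviation of the values from their arithmetic mean."""
    mean = sum(values) / len(values)
    return sum(abs(x - mean) for x in values) / len(values)


cappuccinos = [20, 40, 50, 60, 80]
md = mean_deviation(cappuccinos)
print(md)
```

The mean is 50, the absolute deviations are 30, 10, 0, 10, and 30, and their average is 80/5 = 16 cappuccinos per day.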

Mean Deviation (Grouped Data)

Deviations are taken from the mean $\bar{X}$ = 23.1 (\$ thousands).

| Selling Price (\$ thousands) | f | X | $X - \bar{X}$ | $f\,\lvert X - \bar{X}\rvert$ |
|---|---|---|---|---|
| 15 - 18 | 8 | 16.5 | -6.6 | 52.8 |
| 18 - 21 | 23 | 19.5 | -3.6 | 82.8 |
| 21 - 24 | 17 | 22.5 | -0.6 | 10.2 |
| 24 - 27 | 18 | 25.5 | 2.4 | 43.2 |
| 27 - 30 | 8 | 28.5 | 5.4 | 43.2 |
| 30 - 33 | 4 | 31.5 | 8.4 | 33.6 |
| 33 - 36 | 2 | 34.5 | 11.4 | 22.8 |
| Total | 80 | | | 288.6 |
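The table total can be turned into a mean deviation by dividing by the number of observations. This sketch assumes the grouped form MD = Σ f|X - X̄| / Σ f and uses the rounded mean 23.1 implied by the deviations in the table; the function name is illustrative.

```python
def grouped_mean_deviation(midpoints, frequencies, mean):
    """Return (sum of f * |X - mean|, that sum divided by sum of f)."""
    total = sum(f * abs(x - mean) for x, f in zip(midpoints, frequencies))
    return total, total / sum(frequencies)


midpoints = [16.5, 19.5, 22.5, 25.5, 28.5, 31.5, 34.5]
frequencies = [8, 23, 17, 18, 8, 4, 2]
total, md = grouped_mean_deviation(midpoints, frequencies, mean=23.1)
print(round(total, 1), round(md, 2))
```

The total reproduces the 288.6 in the table, giving a mean deviation of about 3.61 (\$ thousands).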

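Variance and Standard Deviation

The population variance and standard deviation.

Let $X_1, X_2, \ldots, X_N$ be N observations in the population.

The variance is defined as:

$$\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}$$

The standard deviation is defined as:

$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}}$$

The sample variance and standard deviation.

Let $X_1, X_2, \ldots, X_n$ be n observations in the sample.

The variance is defined as:

$$s^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$$

The standard deviation is defined as:

$$s = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$$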

Example: Variance and Standard Deviation

… during the last five months in Beaufort County, South Carolina, is 38, 26, 13, 41, and 22. What is the population variance?

The hourly wages for a sample of part-time employees at Home Depot are: \$12, \$20, \$16, \$18, and \$19. What is the sample variance?
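Both questions can be answered with Python's standard `statistics` module, where `pvariance` divides by N (population) and `variance` divides by n - 1 (sample). The variable names are illustrative.

```python
import statistics

# Five monthly counts treated as a complete population
monthly_counts = [38, 26, 13, 41, 22]
population_variance = statistics.pvariance(monthly_counts)

# Hourly wages treated as a sample
hourly_wages = [12, 20, 16, 18, 19]
sample_variance = statistics.variance(hourly_wages)

print(population_variance, sample_variance)
```

The population mean is 28 with squared deviations summing to 534, so the population variance is 534/5 = 106.8; the sample mean is 17 with squared deviations summing to 40, so the sample variance is 40/4 = 10.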

Standard Deviation (Grouped Data)

The sample standard deviation is defined as:

$$s = \sqrt{\frac{\sum f_i (X_i - \bar{X})^2}{\sum f_i - 1}}$$

Example:

For the following frequency distribution of prices of vehicles, compute the standard deviation of the prices.
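The frequency table for this example is not reproduced on this slide. The sketch below assumes it is the vehicle selling-price distribution used earlier (midpoints in \$ thousands) and that the grouped sample standard deviation divides by n - 1; the function name is illustrative.

```python
import math


def grouped_sample_sd(midpoints, frequencies):
    """Sample standard deviation of grouped data using class midpoints."""
    n = sum(frequencies)
    mean = sum(f * x for x, f in zip(midpoints, frequencies)) / n
    squares = sum(f * (x - mean) ** 2 for x, f in zip(midpoints, frequencies))
    return math.sqrt(squares / (n - 1))


midpoints = [16.5, 19.5, 22.5, 25.5, 28.5, 31.5, 34.5]
frequencies = [8, 23, 17, 18, 8, 4, 2]
sd = grouped_sample_sd(midpoints, frequencies)
print(round(sd, 2))
```

Under these assumptions the standard deviation comes to roughly 4.40, i.e. about \$4400.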

Alternate method of computing variance is:

Example
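The alternate formula itself is missing from the transcript. One widely used computational form, assumed here and not confirmed as the slide's, is s² = (ΣX² - (ΣX)²/n) / (n - 1). The sketch checks that it agrees with the definitional formula on the hourly-wage sample; both function names are illustrative.

```python
def variance_definitional(values):
    """Sample variance from squared deviations about the mean."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


def variance_shortcut(values):
    """Sample variance from the sum of squares and the square of the sum."""
    n = len(values)
    return (sum(x * x for x in values) - sum(values) ** 2 / n) / (n - 1)


wages = [12, 20, 16, 18, 19]
print(variance_definitional(wages), variance_shortcut(wages))
```

The shortcut avoids computing each deviation, which is convenient by hand; in floating-point code it can lose precision when the values are large relative to their spread.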

Moments

The rth moment about origin ‘a’ is defined as:

$$m'_r = \frac{\sum (X_i - a)^r}{n}$$

The rth moment about mean is defined as:

$$m_r = \frac{\sum (X_i - \bar{X})^r}{n}$$

First moment about mean is zero.

Moments of Grouped Data

The rth moment about origin ‘a’ is defined as:

$$m'_r = \frac{\sum f_i (X_i - a)^r}{\sum f_i}$$

The rth moment about mean is defined as:

$$m_r = \frac{\sum f_i (X_i - \bar{X})^r}{\sum f_i}$$

First moment about mean is zero.

Skewness

Mean, median and mode are measures of central location for a set of observations, and the range and the standard deviation are measures of data dispersion.

Another characteristic of a set of data is the shape.

There are four shapes commonly observed:

• symmetric,

• positively skewed,

• negatively skewed,

• bimodal.

The coefficient of skewness can range from -3 up to 3.

A value near -3, such as -2.57, indicates considerable negative skewness.

A value such as 1.63 indicates moderate positive skewness.

A value of 0, which will occur when the mean and median are equal, indicates the distribution is symmetrical and that there is no skewness present.

Types of Skewness

Coefficient of Skewness

The Pearson coefficient of skewness is defined as:

$$sk = \frac{3(\bar{X} - \text{Median})}{s}$$

Example

Following are the earnings per share for a sample of 15 software companies for the year 2005. The earnings per share are arranged from smallest to largest.

Compute the mean, median, and standard deviation. Find the coefficient of skewness using Pearson’s estimate. What is your conclusion regarding the shape of the distribution?

Solution

The shape is moderately positively skewed.
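Pearson's coefficient, sk = 3(mean - median)/s, is easy to compute once the three summary values are known. The earnings-per-share data are not preserved in the transcript, so the sketch below uses a small hypothetical data set purely for illustration; the function name and the numbers are assumptions.

```python
import statistics


def pearson_skewness(values):
    """Pearson's coefficient: 3 * (mean - median) / sample standard deviation."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1 divisor)
    return 3 * (mean - median) / sd


# Hypothetical values with one large observation pulling the mean upwards
example = [1, 2, 3, 4, 10]
sk = pearson_skewness(example)
print(round(sk, 2))
```

A positive result means the mean lies above the median, the pattern expected for a positively skewed distribution.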

Example of Skewness (Continued)

The skewness can also be measured with moments as:

$m_2$ = 1.75, $m_3$ = 62

b = 0.492

• The shape is slightly positively skewed.

Example: Skewness

[Figure: distribution curve marking the positions of the mode, median, and mean.]

Empirical Rule

For a symmetrical, bell-shaped frequency distribution:

• Approximately 68% of the observations will lie within plus and minus one standard deviation of the mean (mean ± s.d.).

• About 95% of the observations will lie within plus and minus two standard deviations of the mean (mean ± 2 s.d.).

• Practically all (99.7%) will lie within plus and minus three standard deviations of the mean (mean ± 3 s.d.).

• Let the mean of a symmetric distribution be 100 and the standard deviation be 10; then the empirical rule is as follows:

• 68% of observations lie between 90 and 110.

• 95% of observations lie between 80 and 120.

• 99.7% of observations lie between 70 and 130.
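The intervals in the rule, and the share of observations that actually fall inside them, can be computed with a short sketch. Both function names are illustrative, and the four-value list in the usage lines is a hypothetical data set used only to show the call.

```python
def empirical_intervals(mean, sd):
    """Intervals mean ± k * sd for k = 1, 2, 3."""
    return {k: (mean - k * sd, mean + k * sd) for k in (1, 2, 3)}


def share_within(values, mean, sd, k):
    """Fraction of the observations lying inside mean ± k * sd."""
    inside = [x for x in values if mean - k * sd <= x <= mean + k * sd]
    return len(inside) / len(values)


intervals = empirical_intervals(100, 10)
print(intervals)
# Hypothetical observations: three of the four lie within one s.d. of 100
print(share_within([90, 100, 110, 130], 100, 10, 1))
```

Comparing `share_within` for k = 1, 2, 3 against 68%, 95%, and 99.7% is one way to check how closely a data set follows the rule.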

Example: Empirical Rule

Consider the following distribution and check the empirical rule.

Mean = 3.2, s.d. = 0.75

Mean ± s.d. = (2.45 - 3.95) (67.5%)

Mean ± 2 s.d. = (1.7 - 4.7) (97.5%)

Mean ± 3 s.d. = (0.95 - 5.45) (100%)

Mean = 3.25, s.d. = 0.77

Mean ± s.d. = (2.48 - 4.02) (67.5%)

Mean ± 2 s.d. = (1.71 - 4.79) (97.5%)

Mean ± 3 s.d. = (0.94 - 5.56) (100%)

Exercise

1. For the following data of examination marks, find the Mean, Median, Mode, Mean Deviation, and Variance. Also find the Skewness.

| Marks | No. of students |
|---|---|
| 30 - 39 | 8 |
| 40 - 49 | 87 |
| 50 - 59 | 190 |
| 60 - 69 | 304 |
| 70 - 79 | 211 |
| 80 - 89 | 85 |
| 90 - 99 | 20 |

2. The following is the distribution of wages per thousand employees in a certain factory. Calculate the modal and median wages. Why is there a difference between the two?

| Daily Wages | No. of Employees |
|---|---|
| 22 | 3 |
| 24 | 13 |
| 26 | 43 |
| 28 | 102 |
| 30 | 175 |
| 32 | 220 |
| 34 | 204 |
| 36 | 139 |
| 38 | 69 |
| 40 | 25 |
| 42 | 6 |
| 44 | 1 |