
Chapter 1 Basic Statistics


FARAH ADIBAH ADNAN

ENGINEERING MATHEMATICS INSTITUTE (IMK)

- Basic Statistics
- Statistics in Engineering
- Collecting Engineering Data
- Data Summary and Presentation
- Probability Distributions
- Discrete Probability Distribution

- Continuous Probability Distribution

- Sampling Distributions of the Mean and Proportion

- Statistics - the area of science that deals with the collection, organization, analysis, and interpretation of data.
- Statistics - deals with methods and techniques that can be used to draw conclusions about the characteristics of a large number of data points, commonly called a population, by using a smaller subset of the entire data called a sample.
- Because many aspects of engineering practice involve working with data, obviously some knowledge of statistics is important to an engineer.

- Specifically, statistical techniques can be a powerful aid in designing new products and systems, improving existing designs, and improving production processes.
- The methods of statistics allow scientists and engineers to design valid experiments and to draw reliable conclusions from the data they produce.

- Direct observation
The simplest method of obtaining data.

Advantage: relatively inexpensive.

Disadvantage: difficult to produce useful information since it does not consider all aspects regarding the issues.

- Experiments
A more expensive method, but a better way to produce data.

Data produced are called experimental data.

- Surveys
One of the most familiar methods of data collection.

Depends on the response rate.

- Personal Interview
Has the advantage of a higher expected response rate.

Fewer incorrect responses.

- Data can be categorized into two types:
- Qualitative data - qualitative attributes

- Quantitative data - quantitative attributes

- Two sources of data:
- Primary (e.g. questionnaire, telephone interview)

- Secondary (e.g. Internet, annual report)

Data should be summarized in a more informative form, such as tables, charts or graphs.

Data can be summarized or presented in two ways:

1) Tabular

2) Charts/graphs.

Data Presentation of Qualitative Data

1) Frequency Distribution Table - represents the number of times the observation occurs in the data.

Example: Ethnic Group

Observation    Frequency
Malay          33
Chinese        9
Indian         6
Others         2
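
A table like this can be tallied automatically. The sketch below is only an illustration: the raw list is rebuilt from the frequencies shown in the table, and the variable names are assumptions rather than part of the original slides.

```python
from collections import Counter

# Raw observations rebuilt from the table's frequencies (illustration only).
observations = ["Malay"] * 33 + ["Chinese"] * 9 + ["Indian"] * 6 + ["Others"] * 2

# Counter tallies how many times each category occurs in the data.
frequency = Counter(observations)

print(f"{'Observation':<14}{'Frequency'}")
for group, count in frequency.most_common():
    print(f"{group:<14}{count}")
```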

2) Charts for qualitative data are:

Pie Chart : Gender

Bar Chart : Ethnic Group

Line Chart : Number of Sandpipers from Jan 1989 – Dec 1989

Data Presentation of Quantitative Data

1) Frequency Distribution Table – list all classes and the number of values that belong to each class.

- The following terms are used to form a frequency distribution table from raw data.
Class - an interval that includes all the values that fall between two numbers, the lower and upper class limits.

Class Boundary - the midpoint of the upper limit of one class and the lower limit of the next class.

Class Width/Size/Interval, c - the difference between the two boundaries of a class. Formula:

c = Upper boundary – Lower boundary

Class Midpoint/Mark, x - formula:

x = (Lower limit + Upper limit)/2
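
To make these definitions concrete, here is a minimal sketch that computes the boundaries, width and midpoint of a single class. The function name, the `gap` parameter and the class 10 – 14 are hypothetical and chosen only for illustration.

```python
def class_summary(lower_limit, upper_limit, gap=1):
    """Return (lower boundary, upper boundary, width, midpoint) of one class.

    gap is the distance between the upper limit of one class and the lower
    limit of the next (1 for whole-number data); this is an assumption.
    """
    # A boundary lies halfway between adjacent class limits.
    lower_boundary = lower_limit - gap / 2
    upper_boundary = upper_limit + gap / 2
    width = upper_boundary - lower_boundary
    midpoint = (lower_limit + upper_limit) / 2
    return lower_boundary, upper_boundary, width, midpoint


print(class_summary(10, 14))  # (9.5, 14.5, 5.0, 12.0)
```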

How to Form a Frequency Distribution Table

- Decide the number of classes to be used.
- Determine the class width:
- When the number of classes is given,
Class width = Range / Number of classes = (Largest value – Smallest value) / Number of classes

- When the number of classes is not given,
Class width = Range / Number of classes

where the number of classes = 1 + 3.3 log n, and n is the number of observations.

- Always round the class width up to the nearest whole number.

- Any convenient number that is equal to or less than the smallest value in the data set can be used as the lower limit of the first class.
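
The class-width step can be sketched in Python as follows. This assumes the class width is the range divided by the number of classes and, when no number of classes is given, uses the common rule 1 + 3.3 log n. Both the rule and the choice to round the number of classes up are assumptions here, and the function names and sample values are hypothetical.

```python
import math


def number_of_classes(n):
    """Assumed rule: 1 + 3.3 * log10(n), rounded up to a whole number."""
    return math.ceil(1 + 3.3 * math.log10(n))


def class_width(data, k=None):
    """Range divided by the number of classes, rounded up to a whole number."""
    if k is None:
        k = number_of_classes(len(data))
    return math.ceil((max(data) - min(data)) / k)


sample = [5, 9, 12, 17, 21, 29]   # made-up values
print(class_width(sample, k=5))   # (29 - 5) / 5 = 4.8, rounded up to 5
```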

Example:

The following data give the total number of iPods sold by a mail order company on each of 30 days. Construct a frequency distribution table. (Hint: use 5 classes.)

25 11 15 29 22 10 5 17 21
13 26 16 18 12 9 26 20 16
23 14 19 23 20 16 27 9 21 14

Solution:

Number of classes = 5

Class width = (29 – 5) / 5 = 4.8 ≈ 5

Frequency Distribution Table
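
The tally for this example can be sketched as below. It uses only the 28 values that are legible in the data listing (the problem statement mentions 30 days), so the counts may differ from the author's table. Starting the first class at the smallest value, 5, is also an assumption.

```python
# The 28 values legible in the listing (the text mentions 30 days).
sales = [25, 11, 15, 29, 22, 10, 5, 17, 21,
         13, 26, 16, 18, 12, 9, 26, 20, 16,
         23, 14, 19, 23, 20, 16, 27, 9, 21, 14]

number_of_classes = 5
width = 5        # (29 - 5) / 5 = 4.8, rounded up
first_lower = 5  # assumed lower limit of the first class

table = []
for i in range(number_of_classes):
    lower = first_lower + i * width
    upper = lower + width - 1
    # Count the values that fall within this class's limits.
    count = sum(lower <= x <= upper for x in sales)
    table.append((lower, upper, count))

for lower, upper, count in table:
    print(f"{lower:>2} - {upper:<2}  {count}")
```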

2) Graphs for quantitative data are:

Histogram: Student’s CGPA

Polygon: Student’s CGPA

Ogive: Student’s CGPA

Summary statistics are used to summarize a set of observations.

Two basic summary statistics are

1) Measures of central tendency

- Mean

- Median

- Mode

2) Measures of dispersion

- Range

- Variance

- Standard deviation

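1) Mean, ( $\bar{x}$ )

- The mean of a sample ( $\bar{x}$ ) or population ( $\mu$ ) is the sum of the data values divided by the total number of observations.
- Mean for ungrouped data is given by:
Sample: $\bar{x} = \frac{\sum x}{n}$

Population: $\mu = \frac{\sum x}{N}$

where n = sample size; N = population size

- Mean for grouped data is given by:
Sample: $\bar{x} = \frac{\sum fx}{\sum f}$ Population: $\mu = \frac{\sum fx}{\sum f}$

where f = class frequency; x = class mark (midpoint)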
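
As a minimal sketch of both cases, the functions below compute the mean of raw data (sum divided by count) and of a grouped table (sum of f times x divided by sum of f). The function names and the grouped table are made up for illustration. The ungrouped data set is the one used in the example that follows.

```python
def mean_ungrouped(data):
    """Sum of the observations divided by the number of observations."""
    return sum(data) / len(data)


def mean_grouped(midpoints, frequencies):
    """Sum of f * x divided by the sum of f, with x the class mark."""
    total = sum(f * x for f, x in zip(frequencies, midpoints))
    return total / sum(frequencies)


print(mean_ungrouped([4, 6, 3, 1, 2, 5, 7]))  # 4.0

# Hypothetical grouped table: class marks and their frequencies.
midpoints = [7, 12, 17, 22, 27]
frequencies = [3, 6, 7, 7, 5]
print(mean_grouped(midpoints, frequencies))
```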

Example:

1) Find the mean for the set of data 4, 6, 3, 1, 2, 5, 7.

Solution:

$\bar{x} = \frac{4 + 6 + 3 + 1 + 2 + 5 + 7}{7} = \frac{28}{7} = 4$

2) Find the mean of the frequency distribution table below.

(f)

(x)

Solution:

Therefore, the mean of frequency distribution above is:

2) Median, ( $\tilde{x}$ )

- The median is the middle value of a set of observations arranged in ascending order and is normally denoted by ( $\tilde{x}$ ).
- Median for ungrouped data:
- The median depends on the number of observations in the data, $n$.

- If $n$ is odd, then the median is the $\left(\frac{n+1}{2}\right)$th observation of the ordered observations (the middle value).

- If $n$ is even, then the median is the average of the 2 middle values (the $\left(\frac{n}{2}\right)$th observation and the $\left(\frac{n}{2}+1\right)$th observation).

- Median for grouped data / frequency distribution.
The median of a frequency distribution is defined by:

$\tilde{x} = L + \left( \frac{\frac{\sum f}{2} - F}{f_m} \right) c$

where,

$L$ = the lower class boundary of the median class;

$c$ = the size of the median class interval;

$F$ = the sum of frequencies of all classes lower than the median class;

$f_m$ = the frequency of the median class.
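
The two rules can be sketched in Python as follows. For grouped data the sketch assumes the median class is the first class whose cumulative frequency reaches half of the total frequency, then interpolates within that class. The function names and the grouped table are hypothetical.

```python
def median_ungrouped(data):
    """Middle value (odd n) or average of the two middle values (even n)."""
    ordered = sorted(data)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def median_grouped(lower_boundaries, width, frequencies):
    """Lower boundary + ((total/2 - F) / f) * width for the median class."""
    half = sum(frequencies) / 2
    cumulative = 0  # F: frequencies of all classes below the current one
    for lower, f in zip(lower_boundaries, frequencies):
        if cumulative + f >= half:
            return lower + (half - cumulative) / f * width
        cumulative += f


print(median_ungrouped([4, 6, 3, 1, 2, 5, 7, 3]))  # 3.5

# Hypothetical grouped table with class width 5.
print(median_grouped([4.5, 9.5, 14.5, 19.5, 24.5], 5, [3, 6, 7, 7, 5]))
```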

Example:

1) Find the median for the set of data 4, 6, 3, 1, 2, 5, 7, 3.

Solution:

Arranged in ascending order: 1, 2, 3, 3, 4, 5, 6, 7.

As n = 8 (even), the median is the mean of the 4th and 5th values.

Therefore, the median is (3 + 4)/2 = 3.5

2) Find the median of the frequency distribution table below.

Cumulative

Frequency

Solution:

To determine median class:

So, the median class is 3.00 – 3.25.

3) Mode, ( $\hat{x}$ )

- The mode of a set of observations is the observation with the highest frequency and is usually denoted by ( $\hat{x}$ ). The mode can also be used to describe qualitative data.
*Note:

- If a data set has only 1 value that occurs with the highest frequency, it has 1 mode and is called unimodal.
- If a data set has 2 values that occur with the highest frequency, it has 2 modes and is called bimodal.
- If a data set has more than 2 values that occur with the highest frequency, it has more than 2 modes and is called multimodal.

- For ungrouped data:
- Defined as the value which occurs most frequently.

Example:

The mode for data 4,6,3,1,2,5,7,3 is 3.

- For grouped data
When data have been grouped into classes and a frequency curve is drawn to fit the data, the mode is the value of $x$ corresponding to the maximum point on the curve.

- Determining the mode using the formula:

$\hat{x} = L + \left( \frac{\Delta_1}{\Delta_1 + \Delta_2} \right) c$

where

$L$ = the lower class boundary of the modal class;

$c$ = the size of the modal class interval;

$\Delta_1$ = the difference between the modal class frequency and the frequency of the class before it; and

$\Delta_2$ = the difference between the modal class frequency and the frequency of the class after it.

Note: The class which has the highest frequency is called the modal class.
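
A minimal sketch of both cases follows. The ungrouped function returns every value that shares the highest frequency, so bimodal and multimodal data give more than one result. The grouped function applies the interpolation described above to the first class with the highest frequency and treats a missing neighbouring class as having frequency 0. The function names and the grouped table are hypothetical.

```python
from collections import Counter


def modes_ungrouped(data):
    """Return all values that occur with the highest frequency."""
    counts = Counter(data)
    top = max(counts.values())
    return sorted(value for value, c in counts.items() if c == top)


def mode_grouped(lower_boundaries, width, frequencies):
    """Lower boundary + d1 / (d1 + d2) * width for the modal class."""
    j = frequencies.index(max(frequencies))  # first class with highest frequency
    before = frequencies[j - 1] if j > 0 else 0
    after = frequencies[j + 1] if j < len(frequencies) - 1 else 0
    d1 = frequencies[j] - before  # difference with the class before
    d2 = frequencies[j] - after   # difference with the class after
    return lower_boundaries[j] + d1 / (d1 + d2) * width


print(modes_ungrouped([4, 6, 3, 1, 2, 5, 7, 3]))  # [3]

# Hypothetical grouped table with class width 5.
print(mode_grouped([4.5, 9.5, 14.5, 19.5, 24.5], 5, [3, 6, 9, 7, 5]))  # 17.5
```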

Example:

Find mode of the frequency distribution table below.

Solution:

- The measure of dispersion/spread is the degree to which a set of data tends to spread around the average value.
- It shows whether the data set is concentrated around the mean or scattered.
- The common measures of dispersion are:
1) Range

2) Variance

3) Standard deviation

- The standard deviation is the square root of the variance.
- The sample variance is denoted by $s^2$ and the sample standard deviation is denoted by $s$.

1) Range

- Simplest measure of dispersion.
- Applies to both grouped and ungrouped data.
Ungrouped data:

Formula: Range = Largest value – Smallest value

Grouped data:

Formula: Range = Largest value (class limit) – Smallest value (class limit)

Example:

Solution:

Range = Largest value – Smallest value

= 267,277 – 49,651 = 217,626 square miles.

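2) Variance, ( $s^2$ )

- Measures the variability in a set of data.
- The variance for ungrouped data:
Sample: $s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$ Population: $\sigma^2 = \frac{\sum (x - \mu)^2}{N}$

- The variance for grouped data:
Sample: $s^2 = \frac{\sum f(x - \bar{x})^2}{\sum f - 1}$

Population: $\sigma^2 = \frac{\sum f(x - \mu)^2}{\sum f}$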
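
The sketch below computes the variance from its definition, assuming the usual divisors: n - 1 for a sample and N for a population, with n taken as the total frequency for grouped data. The function names and the small data sets are made up. Python's `statistics.variance` is used only as a cross-check for the ungrouped sample case.

```python
import statistics


def variance_ungrouped(data, sample=True):
    """Sum of squared deviations from the mean, divided by n - 1 or n."""
    n = len(data)
    mean = sum(data) / n
    ss = sum((x - mean) ** 2 for x in data)
    return ss / (n - 1) if sample else ss / n


def variance_grouped(midpoints, frequencies, sample=True):
    """Same idea, with each class mark x weighted by its frequency f."""
    n = sum(frequencies)
    mean = sum(f * x for f, x in zip(frequencies, midpoints)) / n
    ss = sum(f * (x - mean) ** 2 for f, x in zip(frequencies, midpoints))
    return ss / (n - 1) if sample else ss / n


data = [4, 6, 3, 1, 2, 5, 7]
print(variance_ungrouped(data))                # sample variance
print(variance_ungrouped(data, sample=False))  # population variance
print(statistics.variance(data))               # library cross-check
```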

Example:

The variance for grouped data :

Solution:

3) Standard Deviation, ( $s$ )

- The positive square root of the variance is the standard deviation.
- A larger standard deviation means the values of the data set are spread relatively far from the mean.
- A smaller standard deviation means the values of the data set lie relatively close to the mean.
- The standard deviation for ungrouped data:
Sample: $s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$ Population: $\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$

- The standard deviation for grouped data:
Sample: $s = \sqrt{\frac{\sum f(x - \bar{x})^2}{\sum f - 1}}$

Population: $\sigma = \sqrt{\frac{\sum f(x - \mu)^2}{\sum f}}$
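
As a short illustration that the standard deviation is the square root of the variance, the sketch below uses Python's standard `statistics` module on a small made-up data set. The variable names are assumptions.

```python
import math
import statistics

data = [4, 6, 3, 1, 2, 5, 7]  # made-up observations

# Square root of the sample variance (divisor n - 1).
s = math.sqrt(statistics.variance(data))

# Square root of the population variance (divisor N).
sigma = math.sqrt(statistics.pvariance(data))

print(s, statistics.stdev(data))       # both give the sample standard deviation
print(sigma, statistics.pstdev(data))  # both give the population standard deviation
```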

Example:

From previous example.

The final results in business statistics of 40 students are recorded as below

- Present the data in a frequency distribution table.
- Construct a histogram.
- Calculate the mean, median, mode, variance and standard deviation.