Chapter 1: Basic Statistics
FARAH ADIBAH ADNAN
ENGINEERING MATHEMATICS INSTITUTE (IMK)
- Statistics in Engineering
- Collecting Engineering Data
- Data Summary and Presentation
- Probability Distributions
  - Discrete Probability Distribution
  - Continuous Probability Distribution
The simplest method of obtaining data.
Advantage: relatively inexpensive.
Disadvantage: difficult to produce useful information since it does not consider all aspects regarding the issues.
A more expensive method, but a better way to produce data.
Data produced are called experimental.
The most familiar method of data collection.
Depends on the response rate.
Has the advantage of a higher expected response rate.
Fewer incorrect respondents.
- Qualitative data - qualitative attributes
- Quantitative data - quantitative attributes
- Primary (e.g. questionnaire, telephone interview)
- Secondary (e.g. Internet, annual report)
Data should be summarized in a more informative form, such as graphs, tables or charts.
Data can be summarized or presented in two ways:
Data Presentation of Qualitative Data
1) Frequency Distribution Table - represents the number of times the observation occurs in the data.
Example: Ethnic Group
Pie Chart : Gender
Bar Chart : Ethnic Group
Line Chart : Number of Sandpipers from Jan 1989 – Dec 1989
1) Frequency Distribution Table – lists all classes and the number of values that belong to each class.
Class - an interval that includes all the values that fall between two numbers, the lower and upper class limits.
Class Boundary - the midpoint of the upper limit of one class and the lower limit of the next class.
Class Width/Size/Interval, c - the difference between the two boundaries of a class. Formula:
c = Upper boundary – Lower boundary
Class Midpoint/Mark, x - formula:
x = (Lower limit + Upper limit)/2
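The boundary, width and midpoint definitions above can be checked with a short Python sketch. The class limits are hypothetical (not taken from the slides), the helper names are illustrative, and the sketch assumes the same gap between every pair of adjacent classes.

```python
# Hypothetical class limits (lower limit, upper limit); not from the slides.
classes = [(10, 19), (20, 29), (30, 39)]

def class_boundaries(classes, i):
    """Lower and upper boundary of class i, assuming equal gaps between classes."""
    gap = classes[1][0] - classes[0][1]      # gap between adjacent class limits
    lower, upper = classes[i]
    return lower - gap / 2, upper + gap / 2

def class_width(classes, i):
    lower_b, upper_b = class_boundaries(classes, i)
    return upper_b - lower_b                 # c = upper boundary - lower boundary

def class_midpoint(classes, i):
    lower, upper = classes[i]
    return (lower + upper) / 2               # x = (lower limit + upper limit) / 2

print(class_boundaries(classes, 1))  # (19.5, 29.5)
print(class_width(classes, 1))       # 10.0
print(class_midpoint(classes, 1))    # 24.5
```

Note that the width comes from the boundaries (29.5 - 19.5 = 10), not from the limits (29 - 20 = 9).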
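Class width = Range / Number of classes
Class width = (Largest value – Smallest value) / Number of classes
where the number of classes is either given or estimated; one common choice is Sturges' rule, number of classes ≈ 1 + 3.3 log n, with n the number of observations.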
dealing with class width/interval.
The following data give the total number of iPods sold by a mail order company on each of 30 days. Construct a frequency distribution table. (Hint: use 5 classes.)
Number of classes = 5
Class width =
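The construction steps (choose the number of classes, compute the class width from the range, then count the values in each class) can be sketched in Python. The data below are hypothetical daily sales figures, not the data from the slide, and `frequency_table` is an illustrative helper that assumes integer data and rounds the width up.

```python
import math

def frequency_table(data, num_classes):
    """Group integer data into equal-width classes.

    Returns a list of ((lower limit, upper limit), frequency) pairs.
    """
    smallest, largest = min(data), max(data)
    # Class width = (largest - smallest) / number of classes, rounded up.
    width = math.ceil((largest - smallest) / num_classes)
    # Widen by one if the last class would not reach the largest value.
    if smallest + num_classes * width <= largest:
        width += 1
    table = []
    for k in range(num_classes):
        lower = smallest + k * width
        upper = lower + width - 1            # class limits for integer data
        count = sum(lower <= v <= upper for v in data)
        table.append(((lower, upper), count))
    return table

# Hypothetical daily sales figures (NOT the data from the slide).
sales = [5, 8, 9, 12, 13, 14, 16, 16, 17, 20, 21, 22, 23, 26, 29]
for (lower, upper), count in frequency_table(sales, 5):
    print(f"{lower:>2} - {upper:<2} {count}")
```

Here the range is 29 - 5 = 24, so the width is 24/5 = 4.8, rounded up to 5.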
Polygon : Student’s CGPA
2) Graphs for quantitative data are:
Ogive: Student’s CGPA
Summary statistics are used to summarize a set of observations.
Two basic summary statistics are
1) Measures of central tendency
2) Measures of dispersion
- Standard deviation
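1) Mean ($\bar{x}$)
For ungrouped data: $\bar{x} = \dfrac{\sum x}{n}$
For grouped data: $\bar{x} = \dfrac{\sum fx}{\sum f}$
where f = class frequency; x = class mark (midpoint)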
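As a quick check of the mean for raw and grouped data, a minimal Python sketch follows. The raw values are those of exercise 1) below. The class marks and frequencies are hypothetical, since the slide's table is not reproduced here, and `grouped_mean` is an illustrative name.

```python
def grouped_mean(midpoints, frequencies):
    """Mean of grouped data: sum of f*x divided by sum of f."""
    total_fx = sum(f * x for f, x in zip(frequencies, midpoints))
    return total_fx / sum(frequencies)

# Ungrouped data: add the values and divide by how many there are.
data = [4, 6, 3, 1, 2, 5, 7]
ungrouped_mean = sum(data) / len(data)
print(ungrouped_mean)  # 4.0

# Hypothetical class marks x and class frequencies f (not from the slides).
midpoints = [7, 12, 17, 22, 27]
frequencies = [3, 3, 3, 4, 2]
print(grouped_mean(midpoints, frequencies))
```

For the hypothetical table, the sum of fx is 250 and the sum of f is 15, so the grouped mean is 250/15.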
1) Find the mean for the set of data 4, 6, 3, 1, 2, 5, 7.
2) Find the mean of the frequency distribution table below.
Therefore, the mean of frequency distribution above is:
- The median depends on the number of observations in the data, $n$.
- If $n$ is odd, then the median is the $\left(\frac{n+1}{2}\right)$th observation of the ordered observations (the middle value).
- If $n$ is even, then the median is the average of the 2 middle values (the $\left(\frac{n}{2}\right)$th observation and the $\left(\frac{n}{2}+1\right)$th observation).
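The median of a frequency distribution is defined by:
$\tilde{x} = L + c\left(\dfrac{\frac{n}{2} - F}{f_m}\right)$, where $n = \sum f$ is the total frequency and
$L$ = the lower class boundary of the median class;
$c$ = the size of the median class interval;
$F$ = the sum of frequencies of all classes lower than the median class;
$f_m$ = the frequency of the median class.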
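The grouped-data median can be sketched in Python as: lower boundary of the median class, plus the class size times (half the total frequency minus the cumulative frequency before the median class) divided by the median class frequency. The table values are hypothetical and `grouped_median` is an illustrative name; the sketch assumes equal class sizes.

```python
def grouped_median(lower_boundaries, width, frequencies):
    """Median of grouped data with equal class sizes."""
    n = sum(frequencies)
    cumulative = 0                       # frequencies of all classes before the current one
    for lower, f in zip(lower_boundaries, frequencies):
        if cumulative + f >= n / 2:      # first class whose cumulative frequency reaches n/2
            return lower + width * (n / 2 - cumulative) / f
        cumulative += f

# Hypothetical table: lower class boundaries, class size 5, class frequencies.
lower_boundaries = [4.5, 9.5, 14.5, 19.5, 24.5]
frequencies = [3, 3, 3, 4, 2]
print(grouped_median(lower_boundaries, 5, frequencies))  # 17.0
```

With n = 15, n/2 = 7.5 falls in the third class (cumulative frequency 9), so the median is 14.5 + 5(7.5 - 6)/3 = 17.0.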
1) Find the median for the set of data 4, 6, 3, 1, 2, 5, 7, 3.
Arrange in order of magnitude : 1,2,3,3,4,5,6,7.
As n = 8 (even), the median is the mean of the 4th and 5th values: (3 + 4)/2.
Therefore, the median is 3.5
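The odd/even rule can be verified on this data set with a few lines of Python. This is only a sketch of the rule; the result is cross-checked against the standard library's `statistics.median`.

```python
import statistics

data = [4, 6, 3, 1, 2, 5, 7, 3]
ordered = sorted(data)                   # [1, 2, 3, 3, 4, 5, 6, 7]
n = len(ordered)

if n % 2 == 1:
    # Odd n: the single middle value (Python lists are 0-indexed).
    median = ordered[n // 2]
else:
    # Even n: average of the (n/2)th and (n/2 + 1)th ordered values.
    median = (ordered[n // 2 - 1] + ordered[n // 2]) / 2

print(median)  # 3.5
assert median == statistics.median(data)
```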
2) Find the median of the frequency distribution table below.
To determine median class:
So, the median class is 3.00 – 3.25.
- Defined as the value which occurs most frequently.
The mode for data 4,6,3,1,2,5,7,3 is 3.
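A short Python check of this result, counting how often each value occurs. Returning a list is a choice made for this sketch, so that data with several equally frequent values would show all of them.

```python
from collections import Counter

data = [4, 6, 3, 1, 2, 5, 7, 3]
counts = Counter(data)                   # value -> number of occurrences
highest = max(counts.values())
modes = [value for value, count in counts.items() if count == highest]
print(modes)  # [3]
```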
When data has been grouped into classes and a frequency curve is drawn to fit the data, the mode is the value of $x$ corresponding to the maximum point on the curve. The mode of a frequency distribution is given by:
$\text{Mode} = L + c\left(\dfrac{d_1}{d_1 + d_2}\right)$, where
$L$ = the lower class boundary of the modal class;
$c$ = the size of the modal class interval;
$d_1$ = the difference between the modal class frequency and the frequency of the class before it; and
$d_2$ = the difference between the modal class frequency and the frequency of the class after it.
Note: The class which has the highest frequency is called the modal class.
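The grouped-data mode can be sketched in Python as: lower boundary of the modal class, plus the class size times the ratio of the frequency difference with the previous class to the sum of the two frequency differences. The table is hypothetical and `grouped_mode` is an illustrative name. The sketch assumes equal class sizes, takes the first class if several share the highest frequency, and treats a missing neighbouring class as frequency 0.

```python
def grouped_mode(lower_boundaries, width, frequencies):
    """Mode of grouped data with equal class sizes."""
    i = frequencies.index(max(frequencies))              # modal class (first one if tied)
    before = frequencies[i - 1] if i > 0 else 0          # frequency of the class before
    after = frequencies[i + 1] if i + 1 < len(frequencies) else 0
    d1 = frequencies[i] - before
    d2 = frequencies[i] - after
    return lower_boundaries[i] + width * d1 / (d1 + d2)

# Hypothetical table: lower class boundaries, class size 5, class frequencies.
lower_boundaries = [4.5, 9.5, 14.5, 19.5, 24.5]
frequencies = [2, 5, 8, 6, 3]
print(grouped_mode(lower_boundaries, 5, frequencies))  # 17.5
```

Here the modal class starts at 14.5, the differences are 8 - 5 = 3 and 8 - 6 = 2, and the mode is 14.5 + 5(3/5) = 17.5.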
Find the mode of the frequency distribution table below.
3) Standard deviation
Formula: Range = Largest value – Smallest value
Formula: Range = Largest value (class limit) – Smallest value
Range = Largest Value – Smallest Value
= 267,277 – 49,651 = 217,626 square miles.
The variance for grouped data :
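One common form, assumed here because the slide's own formula is not reproduced, is the sample variance computed from class marks x and frequencies f: (sum of f·x² minus (sum of f·x)² / n) divided by (n - 1), with n the sum of the frequencies. A minimal Python sketch with a hypothetical table and illustrative function names:

```python
import math

def grouped_variance(midpoints, frequencies):
    """Sample variance of grouped data (assumes the n - 1 divisor)."""
    n = sum(frequencies)
    sum_fx = sum(f * x for f, x in zip(frequencies, midpoints))
    sum_fx2 = sum(f * x * x for f, x in zip(frequencies, midpoints))
    return (sum_fx2 - sum_fx ** 2 / n) / (n - 1)

def grouped_std(midpoints, frequencies):
    """Standard deviation: the square root of the variance."""
    return math.sqrt(grouped_variance(midpoints, frequencies))

# Hypothetical class marks and frequencies (not from the slides).
midpoints = [7, 12, 17, 22, 27]
frequencies = [2, 5, 8, 6, 3]
print(grouped_variance(midpoints, frequencies))
print(grouped_std(midpoints, frequencies))
```

If the course uses the population form instead, the divisor n - 1 would become n.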
From previous example.
The final results in business statistics of 40 students are recorded as below