
QBM117 Business Statistics



Presentation Transcript


  1. QBM117 Business Statistics: Descriptive Statistics

  2. Objectives • To distinguish between a variable and data • To distinguish between quantitative and qualitative data • To discuss the different levels of measurement • To summarise quantitative data using frequency distributions and histograms • To learn how to produce a histogram in Excel

  3. Introduction • Managers, economists and business analysts frequently have access to large masses of potentially useful data. • Before the data can be used to support a decision (inferential statistics), they must be organised and summarised (descriptive statistics).

  4. Descriptive Statistics • Descriptive statistics involves collecting, organising, summarising and presenting numerical data. • Once the data are collected and organised, they need to be summarised and presented in such a way that the important features of the data are highlighted. • Descriptive statistics methods can be applied both to data from an entire population and to data from a sample.

  5. Variables and Data • A variable is any characteristic of a population or sample that is of interest to us. • The term data refers to the actual values of variables.

  6. Example 1 Information concerning a magazine’s readership is of interest to both the publisher and to the magazine’s advertisers. A survey of 100 subscribers included the following questions: What is your age? What is your sex? What is your marital status? What is your annual income?

  7. What are the variables? The variables are age, sex, marital status and annual income. What are the data? The data are the actual values of the variables. For the age variable, the data are the actual ages of the 100 subscribers sampled, e.g. 34 years. For the sex variable, the data are the sexes of the 100 subscribers sampled, e.g. Male or Female.

  8. Types of Data • Data may be either quantitative (numerical) or qualitative (categorical). • Quantitative data are numerical observations. • Qualitative data are categorical observations.

  9. Example 1 revisited Information concerning a magazine’s readership is of interest to both the publisher and to the magazine’s advertisers. A survey of 100 subscribers included the following questions: What is your age? What is your sex? What is your marital status? What is your annual income?

  10. For each of the questions determine the data type of the possible responses. What is your age? quantitative What is your sex? qualitative What is your marital status? qualitative What is your annual income? quantitative

  11. Levels of Measurement • Data can also be described in terms of the level of measurement attained. • All data are generated by one of four scales of measurement: - nominal - ordinal - interval - ratio

  12. Levels of Measurement of Qualitative Data • Qualitative data are considered to be measured on a nominal scale or an ordinal scale. • A nominal scale classifies data into distinct categories in which no ordering is implied. • An ordinal scale classifies data into distinct categories in which ordering is implied.

  13. Example 2 For each of the following examples of qualitative data, determine the level of measurement. 1. Type of stocks owned (Growth, Income, Technology, Other, None) Nominal 2. Product satisfaction (Very unsatisfied, Unsatisfied, Neutral, Satisfied, Very satisfied) Ordinal

  14. 3. Student Grades (HD, DI, CR, PS, FL) Ordinal 4. Personal Notebook (Compaq, Toshiba, IBM, Apple, ACER, Other) Nominal 5. Commodities (Gold, Oil, Aluminium, Copper, Zinc, Wheat, Wool, Cotton, Sugar) Nominal 6. Faculty rank (Professor, Associate Professor, Senior Lecturer, Lecturer, Associate Lecturer) Ordinal

  15. Levels of Measurement of Quantitative Data • Quantitative data are considered to be measured on an interval scale or a ratio scale. • An interval scale is an ordered scale in which the difference between measurements is a meaningful quantity but there is no true zero point. • A ratio scale is an ordered scale in which the difference between measurements is a meaningful quantity and there is a true zero point, so ratios of measurements are also meaningful.

  16. Example 3 For each of the following examples of quantitative data, determine the level of measurement. 1. Temperature (degrees Celsius or Fahrenheit) Interval 2. Height (centimeters or inches) Ratio 3. Calendar Years Interval 4. Annual income Ratio
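
The distinction between interval and ratio scales in Example 3 can be checked numerically. The following is a minimal sketch in Python; the function names and the specific temperatures and heights are illustrative assumptions, not values from the slides. It shows that a ratio of two temperatures changes when the unit changes (no true zero), while a ratio of two heights does not.

```python
def celsius_to_fahrenheit(celsius):
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32


def cm_to_inches(cm):
    """Convert a length from centimetres to inches."""
    return cm / 2.54


# Interval scale: the ratio of two temperatures depends on the unit used,
# so "20 degrees is twice as hot as 10 degrees" is not a meaningful claim.
ratio_celsius = 20 / 10
ratio_fahrenheit = celsius_to_fahrenheit(20) / celsius_to_fahrenheit(10)

# Ratio scale: the ratio of two heights is the same in any unit,
# because both units share the same true zero.
ratio_cm = 180 / 90
ratio_inches = cm_to_inches(180) / cm_to_inches(90)

print(ratio_celsius, ratio_fahrenheit)
print(ratio_cm, ratio_inches)
```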

  17. Example 4 For each of the following examples of data, determine the data type and the level of measurement. 1. Name of Internet provider qualitative, nominal 2. Monthly charge for Internet service quantitative, ratio

  18. 3. Amount of time spent on the Internet per week quantitative, ratio 4. Primary purpose for using the Internet qualitative, nominal 5. Number of emails received per week quantitative, ratio 6. Number of on-line purchases made in a month quantitative, ratio

  19. 7. Total amount spent on on-line purchases in a month quantitative, ratio 8. Whether the personal computer has a rewritable CD drive qualitative, nominal

  20. Graphical and Tabular Methods for Quantitative Data • The best way to examine large amounts of data is to present it in summary form by constructing appropriate tables and graphs. • We can then extract the important features from the data from these tables and graphs. • Often, the first step taken towards summarising a mass of numbers is to form what is known as a frequency distribution.

  21. Frequency Distribution • A frequency distribution is a tabular summary of a set of data showing the number (frequency) of observations in each of several non-overlapping classes. • When constructing a frequency distribution you need to - select an appropriate number of classes - select an appropriate width for each class - make sure that classes are non-overlapping and contain all observations

  22. The following table is a guide to the appropriate number of classes for different numbers of observations.

  23. An alternative rough guide to selecting the appropriate number of classes K required to accommodate n observations is given by Sturges' formula: K = 1 + 3.3 log10(n) • Once the number of classes to be used has been chosen, the approximate class width is calculated using the following formula: Class width = (largest value – smallest value) / number of classes
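
The two formulas on this slide can be sketched in Python as follows. The function names and the sample inputs (100 observations, values from 10 to 80, 8 classes) are hypothetical and chosen only for illustration.

```python
import math


def sturges_classes(n):
    """Rough guide to the number of classes for n observations."""
    return 1 + 3.3 * math.log10(n)


def approximate_class_width(largest, smallest, num_classes):
    """Approximate class width: the range divided by the number of classes."""
    return (largest - smallest) / num_classes


# Hypothetical inputs, not taken from the slides.
k = sturges_classes(100)
width = approximate_class_width(largest=80, smallest=10, num_classes=8)

print(f"suggested number of classes: {k:.2f}")
print(f"approximate class width: {width:.2f}")
```

Both results are only starting points: in practice the number of classes is rounded to a whole number and the width is rounded to a value that is convenient to read.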

  24. The class width chosen should allow for convenient and easy reading. • You need to ensure that the classes do not overlap and that each observation is contained in a class. • The classes should then be listed in a column. • You then need to count the number of observations that fall into each class interval. • The counts (frequencies) are then listed next to their respective classes.
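
The counting step described above could be sketched in Python as below. This is one possible approach under stated assumptions: the function name is hypothetical, the observations are made up, and each class is taken to include its lower bound and exclude its upper bound so that classes cannot overlap.

```python
def frequency_distribution(data, start, width, num_classes):
    """Count observations in classes [start + i*width, start + (i+1)*width)."""
    counts = [0] * num_classes
    for x in data:
        index = int((x - start) // width)
        # Every observation must be contained in some class.
        if not 0 <= index < num_classes:
            raise ValueError(f"observation {x} is not contained in any class")
        counts[index] += 1
    classes = [(start + i * width, start + (i + 1) * width)
               for i in range(num_classes)]
    return list(zip(classes, counts))


# Made-up observations, used only to illustrate the procedure.
observations = [12, 7, 3, 18, 9, 14, 1, 11, 6, 16]
table = frequency_distribution(observations, start=0, width=5, num_classes=4)

for (lower, upper), count in table:
    print(f"{lower} to under {upper}: {count}")
```

The frequencies must sum to the number of observations, which is a quick check that no observation was missed or counted twice.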

  25. Example 5 (Exercise 2.41, page 50 of text) The number of items returned to a leading Brisbane retailer by its customers recorded for the last 25 days are as follows:

  26. Construct a frequency distribution for these data. There are n = 25 observations. The table suggests that 5-7 classes would be appropriate. A rough guide to an appropriate number of classes is K = 1 + 3.3 log10(25) = 5.61 (2 d.p.)

  27. Using 6 classes, approximate class width = (29 – 6) / 6 = 3.83 (2 d.p.). Round this up to 5, as a class width of 5 is easy and convenient. Now we need to choose non-overlapping intervals of width 5 so that each observation falls into exactly one interval.
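
The arithmetic in this example can be reproduced with a short Python sketch. The values n = 25, 29 and 6 come from the slides; the starting point of 5 for the first class is an assumption, since the transcript does not state which intervals were chosen.

```python
import math

n, smallest, largest = 25, 6, 29

# Rough guide to the number of classes, then the approximate width for 6 classes.
k = 1 + 3.3 * math.log10(n)
approx_width = (largest - smallest) / 6

# Convenient width from the slide; the first lower bound is an assumption.
width = 5
start = 5

# Build intervals [lower, upper) until the largest observation is covered.
classes = []
lower = start
while lower <= largest:
    classes.append((lower, lower + width))
    lower += width

print(f"K = {k:.2f}, approximate width = {approx_width:.2f}")
print(classes)
```

Under this assumed starting point, five intervals of width 5 are enough to cover every value from 6 to 29.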

  28. Histograms • The information in a frequency distribution is often grasped more easily if the distribution is graphed. • The most common graphical technique used for representing a frequency distribution for quantitative data is the frequency histogram.

  29. Frequency Histograms • A frequency histogram is constructed by placing the variable of interest on the horizontal axis, and the frequency on the vertical axis. • The frequency of each class is shown by drawing a rectangle whose base is the class interval on the horizontal axis and whose height is the corresponding frequency.

  30. Example 5 revisited The number of items returned to a leading Brisbane retailer by its customers recorded for the last 25 days are as follows:

  31. Construct a frequency histogram for these data.

  32. Relative Frequency Histograms • Instead of showing the absolute frequency of observations in each class, it is often preferable to show the proportion of observations falling into each class. • To do this we replace the class frequency by the relative class frequency, which is calculated as follows: class relative frequency = class frequency / total number of observations

  33. We start by forming a relative frequency distribution. • The frequencies in the frequency distribution are replaced by the relative frequencies. • We then construct a relative frequency histogram. • The relative frequency histogram is constructed by placing the relative frequency on the vertical axis (in place of the frequency).
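
Converting a frequency distribution into a relative frequency distribution is a single division per class. The sketch below uses hypothetical class frequencies for 25 observations; they are not the frequencies from the textbook exercise.

```python
# Hypothetical class frequencies (25 observations in total).
frequencies = [4, 4, 8, 6, 3]

total = sum(frequencies)

# class relative frequency = class frequency / total number of observations
relative_frequencies = [f / total for f in frequencies]

for f, rf in zip(frequencies, relative_frequencies):
    print(f"frequency {f:2d}  relative frequency {rf:.2f}")
```

The relative frequencies always sum to 1, which makes distributions with different numbers of observations directly comparable.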

  34. Example 5 revisited The number of items returned to a leading Brisbane retailer by its customers recorded for the last 25 days are as follows:

  35. Construct a relative frequency distribution for these data.

  36. Construct a relative frequency histogram for these data.

  37. Shapes of Histograms • The purpose of drawing histograms is to acquire information. • We describe the shape of a histogram on the basis of the following four characteristics. - symmetry - skewness - number of modes - bell-shaped

  38. Symmetry • A histogram is said to be symmetric if, when we draw a vertical line down the centre of the histogram, the two sides are identical in shape and size.

  39. Skewness • A histogram with a long tail extending to the right is positively skewed. • A histogram with a long tail extending to the left is negatively skewed.

  40. Number of Modes • A unimodal histogram is one with a single peak. • A bimodal histogram is one with two peaks. • A multimodal histogram is one with several peaks.

  41. Bell-shaped • A special type of symmetric unimodal histogram is one that is bell-shaped. • You will discover the importance of this in the next topic.

  42. Cumulative Frequency Distribution • A variation of the frequency distribution that provides another tabular summary of quantitative data is the cumulative frequency distribution. • The cumulative frequency distribution contains the same number of classes as the frequency distribution. • However, the cumulative frequency distribution shows the number of observations less than or equal to the upper class limit of each class.

  43. Cumulative Relative Frequency Distribution • The cumulative relative frequency distribution shows the proportion of observations with values less than or equal to the upper limit of each class. • The cumulative relative frequency distribution can be computed either by summing the relative frequencies in the relative frequency distribution, or by dividing the cumulative frequencies by the total number of observations.
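
Both ways of computing the cumulative relative frequency distribution can be sketched in Python and compared. The class frequencies below are hypothetical, and the variable names are chosen only for this illustration.

```python
from itertools import accumulate

# Hypothetical class frequencies (25 observations in total).
frequencies = [4, 4, 8, 6, 3]
total = sum(frequencies)

# Cumulative frequency: running total of the class frequencies.
cumulative = list(accumulate(frequencies))

# Method 1: divide the cumulative frequencies by the number of observations.
cum_rel_by_division = [c / total for c in cumulative]

# Method 2: sum the relative frequencies.
relative = [f / total for f in frequencies]
cum_rel_by_summing = list(accumulate(relative))

print(cumulative)
print([round(v, 2) for v in cum_rel_by_division])
print([round(v, 2) for v in cum_rel_by_summing])
```

The two methods agree up to floating-point rounding, and the last class always has a cumulative relative frequency of 1.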

  44. Ogives • A graph of the cumulative relative frequency is called an ogive. • The cumulative relative frequency of each class is plotted above the upper limit of the corresponding class, and the points representing the cumulative relative frequencies are then joined by straight lines. • The ogive is closed at the lower end by extending a straight line to the lower limit of the first class.
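
The points of an ogive, and the straight lines joining them, can be sketched in Python without any plotting library. The class boundaries and cumulative relative frequencies below are hypothetical, and the function name `ogive_value` is an assumption made for this illustration. The upper boundary of each class is used as its upper limit.

```python
# Hypothetical class boundaries and cumulative relative frequencies.
lower_limit_first_class = 5
upper_limits = [10, 15, 20, 25, 30]
cumulative_relative = [0.16, 0.32, 0.64, 0.88, 1.0]

# Close the ogive at the lower end with a point at height 0,
# then plot each cumulative relative frequency above its upper limit.
points = [(lower_limit_first_class, 0.0)] + list(zip(upper_limits, cumulative_relative))


def ogive_value(points, x):
    """Read the ogive at x by interpolating along the joining straight lines."""
    if x <= points[0][0]:
        return 0.0
    if x >= points[-1][0]:
        return points[-1][1]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


print(points)
print(ogive_value(points, 12.5))
```

Reading the ogive at a value such as 12.5 gives the approximate proportion of observations at or below that value, assuming observations are spread evenly within each class.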

  45. Example 5 revisited The number of items returned to a leading Brisbane retailer by its customers recorded for the last 25 days are as follows:

  46. Construct a cumulative relative frequency distribution for these data.

  47. Construct an ogive for these data.

  48. Histograms for Large Data Sets • We have constructed a frequency distribution and histogram for a small data set by hand. • We are now going to construct a frequency distribution and histogram for a large data set. • To do this by hand would be very time consuming.

  49. Excel • There are many computer software packages available which make dealing with large data sets quite manageable. • We will use Excel rather than a statistical package as most students are familiar with Excel. • However, some of the things Excel does are not “statistically” correct.
