- By
**lotus**

## PowerPoint Slideshow about 'BCOR 1020 Business Statistics' - lotus

Presentation Transcript

### BCOR 1020 Business Statistics

Lecture 3 – January 24, 2008

Overview

- Chapter 3 – Describing Data Visually…
- Visual Description
- Dot Plots
- Frequency Distribution and Histograms
- Simple Line Charts & Bar Charts
- Scatter Plots
- Tables
- Pie Charts
- Maps and Pictograms
- Deceptive Graphs

Chapter 3 – Visual Description

Methods of organizing, exploring and summarizing data include:

- Visual (charts and graphs) – provides insight into characteristics of a data set without using mathematics.
- Numerical (statistics or tables) – provides insight into characteristics of a data set using mathematics.

Chapter 3 – Visual Description

Beginning with univariate data (a set of n observations on one variable), consider the following:

- Measurement – What are the units of measurement? Are the data integer or continuous? Any missing observations? Any concerns with accuracy or sampling methods?
- Central Tendency – Where are the data values concentrated? What seem to be typical or middle data values?
- Dispersion – How much variation is there in the data? How spread out are the data values? Are there unusual values?
- Shape – Are the data values distributed symmetrically? Skewed? Sharply peaked? Flat? Bimodal?

Chapter 3 – Visual Description

- Example: Price/Earnings Ratios:

- P/E ratios are current stock price divided by earnings per share in the last 12 months. For example:

Chapter 3 – Visual Description

Measurement – Look at the data and visualize how it was collected and measured.

Sorting – Sort the data and then summarize it in a graphical display.

- Here are the sorted P/E ratios:
- Sorting allows you to observe central tendency, dispersion and shape as well as minimum, maximum and range.
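
As a minimal sketch of the sorting step, the snippet below sorts a small list and reads off the minimum, maximum and range. The values in `pe_ratios` are invented for illustration and are not the P/E ratios shown in the lecture.

```python
# Hypothetical P/E ratios, invented for illustration (not the lecture's data).
pe_ratios = [23, 8, 41, 15, 19, 12, 57, 26, 14, 20]

sorted_pe = sorted(pe_ratios)        # ascending order
minimum = sorted_pe[0]               # smallest value
maximum = sorted_pe[-1]              # largest value
data_range = maximum - minimum       # range = maximum - minimum

print("Sorted:", sorted_pe)
print("Min:", minimum, "Max:", maximum, "Range:", data_range)
```

Once the data are sorted, clusters of similar values and unusually large or small values are easier to spot by eye.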

Chapter 3 – Dot Plots

A dot plot is the simplest graphical display of n individual values of numerical data.

- Easy to understand.
- Not good for large samples (e.g., > 5,000).

Steps in Making a Dot Plot:

- Make a scale that covers the data range
- Mark the axes and label them
- Plot each data value as a dot above the scale at its approximate location. If more than one data value lies at about the same axis location, the dots are piled up vertically.

* Figure 3.4 in your text details the MegaStat menus for creating a dotplot.
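
The steps above can be sketched in plain text, without MegaStat. This is only an illustration: the function name `dot_plot`, the sample data, and the choice to place each value at its nearest integer position are assumptions, not part of the lecture.

```python
from collections import Counter

def dot_plot(values, lo, hi):
    """Return a text dot plot with one column per integer position from lo to hi."""
    # Place each value at its approximate (nearest integer) location.
    counts = Counter(round(v) for v in values)
    height = max(counts.values())
    rows = []
    # Build rows from the top of the tallest pile down to the axis,
    # so that repeated values pile up vertically.
    for level in range(height, 0, -1):
        rows.append("".join("*" if counts.get(x, 0) >= level else " "
                            for x in range(lo, hi + 1)))
    rows.append("-" * (hi - lo + 1))   # the scale covering the data range
    return "\n".join(rows)

data = [3, 5, 5, 6, 6, 6, 8, 10]       # hypothetical sample
print(dot_plot(data, lo=0, hi=10))
```

The sketch omits axis labels for brevity; a real dot plot should label the scale.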

Chapter 3 – Dot Plots

- Range of data shows dispersion.

- Clustering shows central tendency.

- Dot plots do not reveal much about the shape of the distribution.

- Can add annotations (text boxes) to call attention to specific features.

Chapter 3 – Frequency Distributions and Histograms

Bins and Bin Limits:

- A frequency distribution is a table formed by classifying n data values into k classes (bins).
- Bin limits define the values to be included in each bin. Widths must all be the same.
- Frequencies are the number of observations within each bin.
- Frequencies can also be expressed as relative frequencies (frequency divided by the total) or percentages (relative frequency times 100).

Chapter 3 – Frequency Distributions and Histograms

Constructing a Frequency Distribution:

- Sort data in ascending order (e.g., P/E ratios)
- Choose the number of bins (k).
- k should be much smaller than n.
- Too many bins results in sparsely populated bins, too few and dissimilar data values are lumped together.
- Herbert Sturges proposes the following rule: k = 1 + log2(n)
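
Sturges' rule is easy to evaluate for a few sample sizes. In this sketch, rounding the result to the nearest integer is an assumption; the rule is only a guideline, and a nearby value of k that gives convenient bin limits is equally acceptable.

```python
import math

def sturges_bins(n):
    """Suggested number of bins, k = 1 + log2(n), rounded to the nearest integer."""
    return round(1 + math.log2(n))

for n in (16, 50, 100, 1000):
    print(n, sturges_bins(n))
```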

Bin width

Chapter 3 – Frequency Distributions and Histograms

Constructing a Frequency Distribution:

- Set the bin limits:

In our example, we will use k = 7 bins to get convenient bin limits. The approximate bin width is:

To obtain “nice” limits, we round the width to 10 and start the first bin at 0 to get bin limits:

0, 10, 20, 30, 40, 50, 60, 70
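
A rough sketch of this step follows. It assumes the common rule that the approximate bin width is the data range divided by k. The minimum and maximum used here are hypothetical, since the lecture's data table is not reproduced in this transcript, and `bin_limits` is an illustrative helper name.

```python
def bin_limits(start, width, k):
    """Return the k + 1 limits of k equal-width bins beginning at start."""
    return [start + i * width for i in range(k + 1)]

# Hypothetical extremes, invented for illustration (not the lecture's data).
x_min, x_max, k = 7, 59, 7

# Assumed rule: approximate width = (largest value - smallest value) / k.
approx_width = (x_max - x_min) / k
print(round(approx_width, 2))

# Round the width up to a convenient value and start at a convenient origin.
limits = bin_limits(start=0, width=10, k=k)
print(limits)
```

Rounding the width up (rather than down) ensures that the k bins still cover the whole data range.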

Chapter 3 – Frequency Distributions and Histograms

Constructing a Frequency Distribution:

- Put the data values in the appropriate bin.
- In general, the lower limit is included in the bin while the upper limit is excluded.
- Create the table. You can include:
- Frequencies – counts for each bin
- Relative frequencies – absolute frequency divided by total number of data values.
- Cumulative frequencies – accumulated relative frequency values as bin limits increase.
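
The table described above can be built in a few lines. This is a sketch under stated assumptions: the data values are hypothetical, `frequency_table` is an illustrative name, and each bin includes its lower limit and excludes its upper limit, as in the rule above.

```python
def frequency_table(values, limits):
    """Count values per bin [lower, upper) and return rows of
    (lower, upper, frequency, relative frequency, cumulative relative frequency)."""
    n = len(values)
    rows, cumulative = [], 0.0
    for lower, upper in zip(limits, limits[1:]):
        # Lower limit included, upper limit excluded.
        freq = sum(1 for v in values if lower <= v < upper)
        rel = freq / n
        cumulative += rel
        rows.append((lower, upper, freq, rel, cumulative))
    return rows

values = [8, 12, 14, 15, 19, 20, 23, 26, 41, 57]   # hypothetical data
table = frequency_table(values, [0, 10, 20, 30, 40, 50, 60])
for lower, upper, freq, rel, cum in table:
    print(f"{lower:>2} to <{upper:<2}  {freq:>2}  {rel:6.2f}  {cum:6.2f}")
```

Note that the value 20 falls in the bin from 20 to under 30, not in the bin from 10 to under 20.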

Example: Back to the P/E ratio data…

Chapter 3 – Frequency Distributions and Histograms

What are the bin limits for the P/E ratio data?

Chapter 3 – Frequency Distributions and Histograms

Histograms:

- A histogram is a graphical representation of a frequency distribution.
- A histogram is a bar chart.
- X-axis ticks show the end points of each bin.
- Y-axis shows frequency (or relative/cumulative frequency) within each bin.

Consider 3 histograms for the P/E ratio data with different bin widths.

Do they give you different impressions of the data?

k = 4

k = 7

k = 13

* Figures 3.8 & 3.9 in your text detail the MegaStat menus for creating a histogram.

Chapter 3 – Frequency Distributions and Histograms

Modal Class – a histogram bar that is higher than those on either side:

- Monomodal – a single modal class.
- Bimodal – two modal classes.
- Multimodal – more than two modal classes.

Caution: Modal classes may be artifacts of the way bin limits are chosen.
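
The caution can be illustrated with a small sketch: the same hypothetical data set looks monomodal with wide bins and bimodal with narrower ones. The helper names and data are assumptions. For simplicity, a bar counts as a modal class only if it is strictly higher than both neighbors, so flat-topped peaks are ignored.

```python
def bin_counts(values, start, width, k):
    """Frequencies for k equal-width bins [lower, upper) beginning at start."""
    counts = [0] * k
    for v in values:
        counts[int((v - start) // width)] += 1
    return counts

def modal_classes(freqs):
    """Indices of bars strictly higher than both neighbors
    (a missing neighbor at either end counts as height 0)."""
    padded = [0] + list(freqs) + [0]
    return [i - 1 for i in range(1, len(padded) - 1)
            if padded[i] > padded[i - 1] and padded[i] > padded[i + 1]]

values = [2, 4, 6, 11, 13, 14, 16, 18, 24, 26]   # hypothetical data

wide = bin_counts(values, start=0, width=10, k=3)
narrow = bin_counts(values, start=0, width=5, k=6)
print(wide, modal_classes(wide))       # one modal class
print(narrow, modal_classes(narrow))   # two modal classes
```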

Chapter 3 – Frequency Distributions and Histograms

Shape:

- A histogram suggests the shape of the population.
- Skewness – indicated by the direction of the longer tail of the histogram.
- Left-skewed – (negatively skewed) a longer left tail.
- Right-skewed – (positively skewed) a longer right tail.
- Symmetric – both tail areas approximately the same.
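
The lecture judges skewness visually from the histogram. As a numerical cross-check, one option is the moment coefficient of skewness, which is positive for a longer right tail and negative for a longer left tail. This technique is brought in here for illustration and is not taken from the slides; the cutoff `tol` is an arbitrary assumption.

```python
def skewness(values):
    """Moment coefficient of skewness, m3 / m2**1.5 (population form)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n   # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n   # third central moment
    return m3 / m2 ** 1.5

def describe(values, tol=0.1):
    """Label the shape using an arbitrary tolerance around zero."""
    g = skewness(values)
    if g > tol:
        return "right-skewed"
    if g < -tol:
        return "left-skewed"
    return "approximately symmetric"

print(describe([1, 2, 2, 3, 3, 3, 4, 4, 5]))   # hypothetical symmetric data
print(describe([1, 1, 2, 2, 3, 10]))           # hypothetical data with a long right tail
```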

Some examples…

Clickers

Consider the histogram of the P/E ratio data that was displayed earlier in this lecture.

How would you describe the skewness of this histogram?

A = symmetric

B = left-skewed

C = right-skewed

Chapter 3 – Simple Line Charts

Simple Line Charts – Used to display a time series or spot trends, or to compare time periods.

- Can display several variables at once.

Chapter 3 – Simple Line Charts

Two-scale line chart – used to compare variables that differ in magnitude or are measured in different units.

Grid Lines – A line graph usually has no vertical grid lines. Horizontal lines can be added to make it easier to establish the y value.

Which is easier to read?

Chapter 3 – Simple Line Charts

Log Scales:

- Arithmetic scale – distances on the Y-axis are proportional to the magnitude of the variable being displayed.
- Logarithmic scale – (ratio scale) equal distances represent equal ratios.
- Use a log scale for the vertical axis when data vary over a wide range, say, by more than an order of magnitude. This will reveal more detail for small data values.
- Log scale is only suited for positive data values.
- Reveals whether the quantity is growing at an increasing percent (concave upward), constant percent (straight line), or declining percent (concave downward).
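
To see why constant percent growth plots as a straight line on a log scale, consider a hypothetical series that grows 10% per period (the numbers are invented for illustration). On an arithmetic scale the steps keep getting larger, while the steps in the logarithms are all equal.

```python
import math

# Hypothetical series growing 10% per period (invented for illustration).
series = [100 * 1.10 ** t for t in range(6)]

# Period-to-period change on an arithmetic scale.
arithmetic_steps = [b - a for a, b in zip(series, series[1:])]

# Period-to-period change on a log (ratio) scale.
log_steps = [math.log10(b) - math.log10(a) for a, b in zip(series, series[1:])]

print([round(s, 2) for s in arithmetic_steps])   # steps grow each period
print([round(s, 4) for s in log_steps])          # steps are all equal
```

Equal steps on the vertical axis for equal steps in time is exactly what a straight line means.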

Example…

Consider the following graphs illustrating U.S. Trade from 1959 to 2002. What does the log scale graph tell you about growth rate for both series?

Log scale

Arithmetic scale

Chapter 3 – Simple Line Charts

When to Use Log Scales:

- Useful for…
- time series data that might be expected to grow at a compound annual percentage rate (e.g., GDP, national debt, future income)
- financial charts that cover long periods of time
- data that grow rapidly (e.g., revenues)

Chapter 3 – Simple Line Charts

Tips for Effective Line Charts:

- Line charts are used for time series data (never for cross-sectional data).
- Y-axis shows numerical variable while X-axis shows time units with time increasing left to right.
- Use a zero origin on the Y-axis unless more detail is needed.
- Omit numerical labels on a line chart to avoid clutter. Use gridlines if needed.
- Use data markers (squares, triangles, circles) if they don’t clutter the graph.
- Don’t make lines too thick.

Chapter 3 – Bar Charts

Plain Bar Charts – Most common way to display attribute data.

- Bars represent categories or attributes.
- Lengths of bars represent frequencies.

Chapter 3 – Bar Charts

Pareto Charts – Special type of bar chart used in quality management to display the frequency of defects or errors of different types.

- Categories are displayed in descending order of frequency.
- Focus on significant few (i.e., few categories that account for most defects or errors).
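
The ordering behind a Pareto chart can be sketched as follows. The defect categories and counts are hypothetical, and the cumulative percent column is an addition that is often shown alongside Pareto bars to highlight the significant few.

```python
def pareto_table(counts):
    """Sort categories by descending frequency and add a cumulative percent."""
    total = sum(counts.values())
    ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    rows, running = [], 0
    for category, freq in ordered:
        running += freq
        rows.append((category, freq, 100 * running / total))
    return rows

# Hypothetical defect counts, invented for illustration.
defects = {"Scratch": 12, "Dent": 45, "Misalignment": 8, "Paint": 30, "Other": 5}

rows = pareto_table(defects)
for category, freq, cum_pct in rows:
    print(f"{category:<14}{freq:>4}{cum_pct:>8.1f}%")
```

In this made-up example the first two categories account for three quarters of all defects.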

Chapter 3 – Bar Charts

Stacked Bar Chart – Bar height is the sum of several subtotals. Areas may be compared by color to show patterns in the subgroups and total.

Chapter 3 – Bar Charts

Bar Charts for Time Series Data – Bar charts can be (and often are) used for time series data, although it may be harder to compare trends.

Chapter 3 – Bar Charts

Tips for Effective Bar Charts:

- Show the numerical variable of interest with vertical bars on the Y-axis, category labels on the X-axis.
- For time series quantities, display the category labels on the horizontal X-axis with time increasing from left to right.
- The height or length of each bar should be proportional to the quantity displayed.
- Put numerical values at the top of each bar, except if too cluttered.

Chapter 3 – Scatter Plots

Example: Aircraft Fuel Consumption:

- Consider five observations on flight time and fuel consumption for a twin-engine Piper Cheyenne aircraft.

- A causal relationship is assumed since a longer flight would consume more fuel.

Chapter 3 – Scatter Plots

- Example: Aircraft Fuel Consumption:
- Here is the scatter plot with flight time on the X-axis and fuel use on the Y-axis.

- Is there an association between variables?

* Figure 3.31 in your text details the Excel menus for creating a scatter plot.
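
One common way to put a number on the association seen in a scatter plot is the Pearson correlation coefficient, which is near +1 or -1 for a strong linear association and near 0 for little or none. The five observations below are hypothetical stand-ins, since the lecture's data table is not reproduced in this transcript, and the fuel units are left unspecified.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical observations, invented for illustration (not the lecture's data).
flight_hours = [2.0, 2.5, 3.0, 3.5, 4.0]
fuel_used = [300, 380, 440, 530, 590]

r = pearson_r(flight_hours, fuel_used)
print(round(r, 3))
```

A high correlation describes association only; the causal reading comes from knowing that longer flights burn more fuel, not from the number itself.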

Chapter 3 – Scatter Plots

Degree of Association/Correlation:

Strong association

Moderate association

Little or no association

Clickers

Consider the scatter plot (below) comparing birthrates and life expectancies in several countries.

True or False: This graph shows a strong association between these two variables.

A = True

B = False

Chapter 3 – Tables

Tables are the simplest form of data display. A compound table is a table that contains time series data down the columns and variables across the rows.

Example: School Expenditures

- Arrangement of data is in rows and columns to enhance meaning.
- The data can be viewed by focusing on the time pattern (down the columns) or by comparing the variables (across the rows).
- Units of measure are stated in the footnote.
- Note merged headings to group columns.

Chapter 3 – Tables

Tips for Effective Tables:

- Keep the table simple, consistent with its purpose.
- Summary tables go in the main body.
- Detailed tables go in an appendix.
- In a slide show, the main point of a table should be clear within 10 seconds; otherwise, break up the table.
- Display the data to be compared in columns.
- Round off data to 3 or 4 significant figures.
- Table layout should guide the eye towards the desired comparison.
- Use spaces or shading to separate rows or columns.
- Use lines sparingly.
- Keep row and column headings simple yet descriptive.
- Use a consistent number of decimal digits within a column.
- Right-justify or decimal align the data.
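
The last two tips (a consistent number of decimal digits and right-justified, decimal-aligned data) can be sketched with ordinary string formatting. The function name, column width and values are assumptions chosen for illustration.

```python
def format_column(values, decimals=1, width=10):
    """Right-justify numbers with a fixed number of decimals so decimal points align."""
    return [f"{v:>{width}.{decimals}f}" for v in values]

# Hypothetical values, invented for illustration.
cells = format_column([1234.567, 98.4, 7.0])
for cell in cells:
    print(cell)
```

Because every cell has the same width and the same number of decimals, the decimal points line up in a single column.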

Chapter 3 – Pie Charts

An Oft-Abused Chart:

- A pie chart can only convey a general idea of the data.
- Pie charts should be used to portray data which sum to a total (e.g., percent market shares).
- If frequency counts are important, use a bar chart or histogram.
- A pie chart should only have a few (i.e., 2 or 3) slices.
- Each slice should be labeled with data values or percents.

Chapter 3 – Maps and Pictograms

Spatial Variation and GIS:

- Maps can be used for displaying many kinds of data.
- Appropriate when patterns of variation across space are of interest.
- Self-explanatory and revealing.
- Assess patterns based on geography.
- GIS (geographic information systems) combines statistics, geography and graphics.

Chapter 3 – Maps and Pictograms

Example: U.S. population change by county, 1990/2000

Chapter 3 – Maps and Pictograms

Example: U.S. presidential election results, 2004

- On election night 2004 and in the months and years since then, we have seen many maps that look like this.
- The amount of red on the map is skewed because there are a lot of large states (geographically) in which a majority voted Republican.

One possible way to allow for this, suggested by Robert Vanderbei at Princeton University, is to use not just two colors on the map, red and blue, but instead to use red, blue, and shades of purple to indicate percentages of voters. Here is what the normal map looks like if you do this.

Source: http://www-personal.umich.edu/~mejn/election/

Chapter 3 – Maps and Pictograms

Example: U.S. presidential election results, 2004

We can also correct for this by making use of a cartogram, a map in which the sizes of states have been rescaled according to their population. That is, states are drawn with a size proportional not to their sheer topographic acreage -- which has little to do with politics -- but to the number of their inhabitants, states with more people appearing larger than states with fewer, regardless of their actual area on the ground.

Source: http://www-personal.umich.edu/~mejn/election/

Chapter 3 – Maps and Pictograms

Pictograms – A visual display in which data values are replaced by pictures.

- Although entertaining, they can create visual distortion. What do you think?

Chapter 3 – Deceptive Graphs

Error 2: Elastic Graph Proportions

- Keep the aspect ratio (width/height) below 2.00 so as not to exaggerate the graph. By default, Excel uses an aspect ratio of 1.8.

Chapter 3 – Deceptive Graphs

Error 3: Dramatic Title

- Keep the title short and grab the reader's attention.

Error 4: Distracting Pictures

- Avoid so as not to distract readers or impart an emotional slant.

Error 5: Authority Figures

- Can use pictures of authority figures to impart credibility to self-serving commercial claims.

Chapter 3 – Deceptive Graphs

Error 6: 3-D and Rotated Graphs

- Can make trends appear to dwindle into the distance or loom towards you.

Correct

Deceptive

Chapter 3 – Deceptive Graphs

Error 7: Missing Axis Demarcations

- If tick marks are missing, you cannot identify individual data values.

Error 8: Missing Measurement Units or Definitions

- Missing or unclear units of measurement can render a chart useless.

Error 9: Vague Source

- May indicate lost citation, unknown source, or mixed data sources. Use complete source citations.

Chapter 3 – Deceptive Graphs

Error 10: Complex Graphs

- Avoid if possible. Keep your main objective in mind. If necessary, break graph into smaller parts.

Chapter 3 – Deceptive Graphs

Error 11: Gratuitous Effects

- Avoid too many annoying special effects when using slide shows.

Error 12: Estimated Data

- Estimated points should be noted when used or avoided if possible.

Chapter 3 – Deceptive Graphs

Error 13: Area Trick

- As figure height increases, so does width, distorting the area.
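
A quick calculation shows the size of the distortion. This sketch assumes a picture whose height and width are both scaled in proportion to the data value; the base dimensions are arbitrary.

```python
def scaled_area(base_width, base_height, value_ratio):
    """Area of a picture whose width and height are both scaled by value_ratio."""
    return (base_width * value_ratio) * (base_height * value_ratio)

base = scaled_area(1.0, 2.0, value_ratio=1.0)      # picture for the original value
doubled = scaled_area(1.0, 2.0, value_ratio=2.0)   # picture for twice the value

print(doubled / base)   # the area grows by the square of the value ratio
```

A value that doubles is drawn with four times the area, so the eye perceives a much larger increase than the data show.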

Clickers

Consider the graph given below. What error is present that makes this a deceptive graph?

A = Non-Zero Origin

B = Dramatic Title

C = 3-D or Rotated

D = Complex Graph

- Chart Wizard

- Click on the Chart Wizard icon on the toolbar to open a sequence of pop-up menus to guide you through the steps of creating a chart.

- Step 1: Select the Chart type and then click Next.

- Chart Wizard

- Step 2: Add labels for years on the X-axis by selecting a data range (B4:B13). Click Next.

- Chart Wizard

- Step 3: Embellish the chart by adding a title, axis labels, adjusting the gridlines or appending a data table to the graph by clicking on the appropriate tab.

- Embellished Charts

- Charts created in Excel can be edited to:

- Improve the titles (main, X-axis, Y-axis).

- Change the axis scales (minimum, maximum, demarcations).

- Display the data values (on the top of each bar).

- Add a data table underneath the graph.

- Change color or patterns in the plot or chart areas.

- Format the decimals (on the axes or data labels).

- Edit the gridlines (color, dotted or solid, patterns).

- Alter the appearance of the bars (color, pattern, gap width).

- Embellished Charts

- To alter a chart’s appearance, click on any chart object and then right-click to see a menu of properties that you can change.

- For example, right-click on the Y-axis scale and choose Format Axis.

Embellished bar chart

Effective Excel Charts

- Embellished Charts

- Be careful about over-embellishing your charts.

Multiple bar chart

Effective Excel Charts

- Embellished Charts

- Excel offers many other types of specialized charts.

- Embellished Charts

- Other specialized Excel charts:

- Bubble chart displays three variables on a 2-dimensional scatter plot.

- Note: bubble size is proportional to third variable.

Data from http://peltiertech.com/Excel/ChartsHowTo/HowToBubble.html

- Embellished Charts

- Other specialized Excel charts:

- Stock chart for high/low/close stock prices.

Data from http://finance.yahoo.com

- Embellished Charts

- Other specialized Excel charts:

- Radar (or Spider) chart compares individual performance against a benchmark.

- Caution: data may be distorted by emphasized areas.

- Embellished Charts

- Other specialized Excel charts:

- Use floating bar charts to show a range of data.
