### Envisioning Information

Lecture 2

Simple Graphs and Charts

Ken Brodlie

School of Computing

University of Leeds

ENV 2006

Lecture Outline
• Preliminaries
• Definitions
• Datatypes
• Simple Data Presentation
• Graphs and charts

Fundamentals

Basic datatypes correspond to different levels of measurement. Data can be:

• Categorical (labels)
  • Nominal: no sense of order, e.g. apples, oranges, …
  • Ordinal: ordered in sequence, e.g. January, February, …
• Numerical (numbers)
  • Continuous: real numbers, e.g. height of students in a class
  • Discrete: typically whole numbers, e.g. marks in an exam
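As a rough illustration of the four datatypes, the sketch below models each one in Python. All names and values (`fruits`, `Month`, `heights_m`, `exam_marks`) are hypothetical and chosen only to mirror the examples on the slide; using `IntEnum` for ordinal data is one possible modelling choice, not the only one.

```python
from enum import IntEnum

# Nominal: labels with no inherent order, so only equality tests are meaningful.
fruits = {"apple", "orange", "pear"}  # hypothetical labels


# Ordinal: labels with a defined sequence, modelled here as an IntEnum.
class Month(IntEnum):
    JANUARY = 1
    FEBRUARY = 2
    MARCH = 3


# Continuous: real-valued measurements (hypothetical heights in metres).
heights_m = [1.62, 1.75, 1.81]

# Discrete: whole-number measurements (hypothetical exam marks).
exam_marks = [54, 67, 72]

# Ordinal values can be sorted by their position in the sequence.
ordered = sorted([Month.MARCH, Month.JANUARY, Month.FEBRUARY])
print([m.name for m in ordered])
```

Sorting makes sense for `Month` because its order is defined; sorting `fruits` alphabetically would impose an order the data does not have.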

Question

Categorical - nominal

Categorical - ordinal

Numerical - continuous

Numerical - discrete

Exploratory Data Analysis

The pioneering figure is John Tukey

A new approach to data analysis, heavily based on visualization, as an alternative to classical data analysis

See Wikipedia

Two-stage process:

Exploratory: search for evidence using all tools available

Confirmatory: evaluate the strength of the evidence using classical data analysis

### Simple Data Presentation

Simple data tables are often presented as line graphs, bar graphs, pie charts, dot graphs, histograms…

Which should we use and when?

Line Graph

Fundamental technique of data presentation

Used to compare two variables

X-axis is often the control variable

Y-axis is the response variable

Good at:

• Showing specific values
• Trends
• Trends in groups (using multiple line graphs)

Mobile phone use

Students participating in sporting activities

Any critical comments here?

Note: graph labelling is fundamental

Simple Representations – Bar Graph

Presents categorical variables

Height of bar indicates value

Double bar graph allows comparison

Note spacing between bars

Can be horizontal (when would you use this?)

Number of police officers

Internet use at a school

Note more space for labels
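To make the point about label space concrete, here is a minimal text-only sketch of a horizontal bar graph. The function name `horizontal_bar_chart`, the `width` parameter and the category counts are all assumptions made for illustration; they are not data from the lecture.

```python
def horizontal_bar_chart(counts, width=30):
    """Return text lines: one labelled horizontal bar per category.

    Assumes at least one category and a positive largest value.
    """
    label_width = max(len(label) for label in counts)
    largest = max(counts.values())
    lines = []
    for label, value in counts.items():
        bar_length = round(width * value / largest)
        # Left-align labels so every bar starts in the same column.
        lines.append(f"{label:<{label_width}} | {'#' * bar_length} {value}")
    return lines


# Hypothetical data, invented for illustration only.
internet_use = {
    "Email": 40,
    "Research for assignments": 30,
    "Games": 10,
}

for line in horizontal_bar_chart(internet_use):
    print(line)
```

Long category names such as "Research for assignments" fit on one line here, whereas under a vertical bar they would need to be rotated or abbreviated.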

Dot Graph

Very simple but effective…

Horizontal to give more space for labelling

Pie Chart

A pie chart summarises a set of categorical/nominal data

But use with care: with too many segments, values are harder to compare than in a bar chart

Should we have a long lecture?

Favourite movie genres
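A pie chart encodes each category's share of the total as an angle. The sketch below shows that conversion, assuming simple counts per category; `pie_segments` and the genre counts are hypothetical and not taken from the slide's figure.

```python
def pie_segments(counts):
    """Map each category to (percentage of total, angle in degrees)."""
    total = sum(counts.values())
    return {
        label: (100 * value / total, 360 * value / total)
        for label, value in counts.items()
    }


# Hypothetical survey counts, invented for illustration only.
genres = {"Comedy": 8, "Action": 6, "Drama": 4, "Horror": 2}

for label, (percent, angle) in pie_segments(genres).items():
    print(f"{label:<7} {percent:5.1f}%  {angle:6.1f} degrees")
```

With many categories the angles become small and similar, which is one reason segments are harder to compare than bar heights.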

Histograms

Histograms summarise discrete or continuous data that are measured on an interval scale

No gaps between bars if the variable is continuous

Distribution of salaries in a company
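The counting step behind a histogram can be sketched as follows, assuming equal-width bins that are closed on the left and open on the right. The function `histogram_counts`, its parameters and the salary figures are hypothetical, invented only to illustrate the idea.

```python
def histogram_counts(values, low, bin_width, n_bins):
    """Count values in n_bins adjacent intervals [low + k*w, low + (k+1)*w)."""
    counts = [0] * n_bins
    for v in values:
        k = int((v - low) // bin_width)
        if 0 <= k < n_bins:  # values outside the covered range are ignored
            counts[k] += 1
    return counts


# Hypothetical salaries in thousands, invented for illustration only.
salaries = [18, 22, 25, 27, 31, 33, 34, 38, 41, 47, 52, 58]
counts = histogram_counts(salaries, low=10, bin_width=10, n_bins=5)

for k, count in enumerate(counts):
    start = 10 + 10 * k
    print(f"{start}-{start + 10} | {'#' * count}")
```

Because the intervals are adjacent and cover the whole range, the bars touch, which matches the "no gaps" rule for a continuous variable.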

Scatter Plot

Used to present measurements of two variables

Effective if a relationship exists between the two variables

Example taken from the NIST Handbook: evidence of strong positive correlation

Car ownership by household income
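The strength of a linear relationship seen in a scatter plot is commonly summarised by the Pearson correlation coefficient. The sketch below computes it directly from its definition; the income and car-ownership figures are hypothetical and are not the data behind the slide's figure.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical data, invented for illustration only.
income = [10, 20, 30, 40, 50]      # household income (thousands)
cars = [0.4, 0.9, 1.1, 1.6, 2.0]   # average cars per household

r = pearson_r(income, cars)
print(f"r = {r:.3f}")
```

A value of r close to +1 corresponds to the "strong positive correlation" pattern: points lying near a rising straight line.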

Scatter Plots in Excel

The scatter plot is a fundamental tool in Excel

Chart type XY (Scatter) and subtype Unconnected Points

http://www2.ncsu.edu:8010/ncsu/chemistry/resource/excel/excel.html

Regression Line

Remember: correlation does not imply causality, i.e. a relationship exists but one variable is not necessarily causing the other; there may be a third factor.
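A regression line can be fitted by ordinary least squares. The sketch below assumes that method (the slide does not say which fitting method was used) and also computes the residuals, i.e. observed minus fitted values. The names `fit_line` and `residuals` and the data are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def residuals(xs, ys, slope, intercept):
    """Observed value minus fitted value at each x."""
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]


# Hypothetical data, invented for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.0]

slope, intercept = fit_line(xs, ys)
res = residuals(xs, ys, slope, intercept)
print(f"y = {slope:.2f} x + {intercept:.2f}")
print([round(e, 2) for e in res])
```

Plotting the residuals against x is a quick check on whether a straight line is a reasonable description of the data.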

Tukey Sum-Difference Plot

Better understanding of residuals …
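The slide gives no definition of the sum-difference plot. The sketch below assumes the usual reading, in which each pair (x, y) is replotted as (x + y, y - x). This amounts to rotating the scatter plot by 45 degrees, so departures from the line y = x show up as distances from zero. The function name and the paired values are hypothetical.

```python
def sum_difference(xs, ys):
    """Return (sums, differences) for paired values: x + y and y - x."""
    sums = [x + y for x, y in zip(xs, ys)]
    diffs = [y - x for x, y in zip(xs, ys)]
    return sums, diffs


# Hypothetical paired measurements, invented for illustration only.
before = [10, 12, 15, 18]
after = [11, 12, 17, 21]

sums, diffs = sum_difference(before, after)
for s, d in zip(sums, diffs):
    print(f"sum = {s:>2}, difference = {d:+d}")
```

Plotting `diffs` against `sums` uses the full vertical range for the differences, which would otherwise be squeezed into a narrow band around the diagonal.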

Box Plots

In some situations we have not a single data value at a point, but a number of data values, or even a probability distribution

When might this occur?

Tukey proposed the boxplot to visualize the distribution of values

For explanation and some history, see:

http://mathworld.wolfram.com/Box-and-WhiskerPlot.html

http://en.wikipedia.org/wiki/Box_plot

Darwin’s plant study

http://www.upscale.utoronto.ca/GeneralInterest/Harrison/Visualisation/Visualisation.html

M: median

Q1, Q3: quartiles

Whiskers: extend to the most extreme values within 1.5 × interquartile range of the quartiles

Dots: outliers
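The quantities in a box plot can be computed as in the sketch below. It assumes the common convention that the whiskers reach the most extreme data values within 1.5 × IQR of the quartiles, and it uses the "inclusive" quartile method from Python's `statistics` module; other quartile conventions give slightly different numbers. The function name `boxplot_stats` and the data are hypothetical.

```python
from statistics import median, quantiles


def boxplot_stats(values):
    """Median, quartiles, whisker ends and outliers for a box plot."""
    data = sorted(values)
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    # Whiskers stop at the most extreme data values inside the fences.
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "median": median(data),
        "q1": q1,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }


# Hypothetical measurements, invented for illustration only.
data = [2, 4, 5, 6, 7, 8, 9, 10, 30]
stats = boxplot_stats(data)
for name, value in stats.items():
    print(f"{name}: {value}")
```

In this dataset the value 30 lies beyond the upper fence, so it would be drawn as a separate dot rather than stretching the whisker.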

Acknowledgement
• Thanks to Statistics Canada – an excellent web site for simple data presentation
• http://www.statcan.ca/english/edu/power/toc/contents.htm

Exercise for next week
• Understand a bit more about the merits of pie charts and bar graphs
• Create a dataset with roughly equal numbers in each class
• Which is best, pie chart or bar graph, if the task is to discriminate between the classes?

Exercise for next week
• Over the next week look for examples of basic graphs
• In newspapers, magazines or other print media
• On news web sites or other electronic media
• Analyse two examples
• One should be an example where you think the use of graphics is good
• One should be an example where you think it is bad
• Be ready next week to present these results to the class…

