# Envisioning Information

### Envisioning Information

Lecture 2

Simple Graphs and Charts

Ken Brodlie

School of Computing

University of Leeds

ENV 2006

Lecture Outline
• Preliminaries
• Definitions
• Datatypes
• Simple Data Presentation
• Graphs and charts


Fundamentals

Basic datatypes correspond to different levels of measurement. Data can be:
• Categorical: labels
• Numerical: numbers

Categorical
• Nominal: no sense of order (apples, oranges, …)
• Ordinal: ordered in sequence (January, February, …)

Numerical
• Continuous: real numbers (height of students in class)
• Discrete: typically whole numbers (marks in an exam)
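The distinction between nominal and ordinal data can be sketched in a few lines of Python. This is only an illustration: the names `Level`, `has_order`, `sort_ordinal` and `MONTHS` are hypothetical and do not come from the lecture. It shows that ordinal labels must be sorted by their position on the scale, because alphabetical sorting loses the sequence.

```python
from enum import Enum


class Level(Enum):
    """The four datatypes named on the slide."""
    NOMINAL = "categorical, nominal"
    ORDINAL = "categorical, ordinal"
    DISCRETE = "numerical, discrete"
    CONTINUOUS = "numerical, continuous"


def has_order(level):
    """Nominal labels are the only datatype with no sense of order."""
    return level is not Level.NOMINAL


def sort_ordinal(values, scale):
    """Sort ordinal labels by their position in `scale`, not alphabetically."""
    rank = {label: position for position, label in enumerate(scale)}
    return sorted(values, key=lambda label: rank[label])


MONTHS = ["January", "February", "March", "April"]  # ordinal scale
observed = ["March", "January", "April", "February"]

print(sort_ordinal(observed, MONTHS))  # follows the calendar sequence
print(sorted(observed))                # alphabetical order loses the sequence
print(has_order(Level.NOMINAL), has_order(Level.ORDINAL))
```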

Question
• Categorical - nominal
• Categorical - ordinal
• Numerical - continuous
• Numerical - discrete

Exploratory Data Analysis
• Pioneering figure is John Tukey
• New approach to data analysis, heavily based on visualization, as an alternative to classical data analysis
• See Wikipedia
• Two-stage process:
• Exploratory: search for evidence using all tools available
• Confirmatory: evaluate strength of evidence using classical data analysis

### Simple Data Presentation


Simple Data Presentation
• Simple data tables are often presented as line graphs, bar graphs, pie charts, dot graphs, histograms…
• Which should we use and when?

Line Graph
• Fundamental technique of data presentation
• Used to compare two variables
• X-axis is often the control variable
• Y-axis is the response variable
• Good at showing specific values, trends, and trends in groups (using multiple line graphs)

Example graphs: mobile phone use; students participating in sporting activities

Any critical comments here?

Note: graph labelling is fundamental

Simple Representations – Bar Graph
• Bar graph presents categorical variables
• Height of bar indicates value
• Double bar graph allows comparison
• Note spacing between bars
• Can be horizontal (when would you use this?)

Example graphs: number of police officers; internet use at a school

Note more space for labels
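The point about horizontal bars leaving more space for labels can be tried out with a text-only sketch. The function name `horizontal_bars` and the counts below are hypothetical, invented purely for illustration. Each category gets one line, the label sits to the left at full length, and the bar length is proportional to the value.

```python
def horizontal_bars(data, width=40, symbol="#"):
    """Return one text line per category; bar length is proportional to value."""
    if not data:
        return []
    label_width = max(len(label) for label in data)
    largest = max(data.values())
    lines = []
    for label, value in data.items():
        length = round(width * value / largest) if largest > 0 else 0
        lines.append(f"{label:<{label_width}} | {symbol * length} {value}")
    return lines


# Hypothetical counts, invented for illustration only.
usage = {"Email": 30, "Research for homework": 45, "Games": 15}
for line in horizontal_bars(usage, width=30):
    print(line)
```

Long labels such as "Research for homework" stay readable here, whereas under a vertical bar they would need to be rotated or abbreviated.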

Dot Graph
• Very simple but effective…
• Horizontal to give more space for labelling

Pie Chart
• Pie chart summarises a set of categorical/nominal data
• But use with care…
• … too many segments are harder to compare than in a bar chart

Example charts: Should we have a long lecture?; favourite movie genres
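A pie chart encodes each category as an angle proportional to its share of the total. The sketch below computes those angles. The function name `pie_angles` and the genre counts are hypothetical, made up for illustration. When counts are close together the angles are close too, which is one reason segments can be harder to compare than bar heights.

```python
def pie_angles(counts):
    """Map each category to its segment angle in degrees (all angles sum to 360)."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    return {label: 360.0 * value / total for label, value in counts.items()}


# Hypothetical survey counts, invented for illustration only.
genres = {"Comedy": 8, "Action": 7, "Drama": 5}
for label, angle in pie_angles(genres).items():
    print(f"{label}: {angle:.1f} degrees")
```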

Histograms
• Histograms summarise discrete or continuous data that are measured on an interval scale
• No gaps if variable is continuous

Example: distribution of salaries in a company
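Behind a histogram is a simple counting step: the interval scale is cut into equal-width bins and each value is assigned to one of them. A minimal sketch follows. The function name `histogram_counts` and the salary figures are hypothetical, and equal-width bins are an assumption, not something the slide specifies. Because the bins share edges, the bars of a continuous variable touch with no gaps.

```python
def histogram_counts(values, low, high, bins):
    """Count values in `bins` equal-width intervals covering [low, high]."""
    if bins <= 0 or high <= low:
        raise ValueError("need bins > 0 and high > low")
    width = (high - low) / bins
    counts = [0] * bins
    for value in values:
        if low <= value <= high:
            # The top edge belongs to the last bin so that no value is dropped.
            index = min(int((value - low) / width), bins - 1)
            counts[index] += 1
    return counts


# Hypothetical salaries in thousands, invented for illustration only.
salaries = [18, 22, 25, 27, 31, 33, 34, 38, 41, 45, 52, 60]
counts = histogram_counts(salaries, low=10, high=60, bins=5)
for i, count in enumerate(counts):
    start = 10 + i * 10
    print(f"{start}-{start + 10}: {'#' * count}")
```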

Scatter Plot
• Used to present measurements of two variables
• Effective if a relationship exists between the two variables
• Example taken from NIST Handbook – evidence of strong positive correlation

Example: car ownership by household income
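The strength of the relationship seen in a scatter plot is often summarised by a correlation coefficient. The sketch below uses the Pearson coefficient, which is an assumption here: the slide only says "strong positive correlation" and does not name a measure. The function name `pearson_r` and the income and car figures are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length, at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / sqrt(sxx * syy)


# Hypothetical data: household income (thousands) and cars per household.
income = [10, 20, 30, 40, 50]
cars = [0.5, 0.9, 1.2, 1.8, 2.1]
print(f"r = {pearson_r(income, cars):.3f}")
```

A value near +1 indicates a strong positive linear relationship, a value near -1 a strong negative one, and a value near 0 little linear relationship.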

Scatter Plots in Excel
• The scatter plot is a fundamental tool in Excel
• Chart type XY (Scatter) and subtype Unconnected Points
• http://www2.ncsu.edu:8010/ncsu/chemistry/resource/excel/excel.html

Regression Line

Remember: correlation does not imply causality, i.e. a relationship exists but one is not necessarily causing the other; there may be a third factor.
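A regression line can be fitted by ordinary least squares. The slide does not say which fitting method is used, so least squares is an assumption here, and the names `fit_line` and `residuals` as well as the data are hypothetical. The residuals are the vertical distances from each point to the fitted line.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length, at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def residuals(xs, ys, slope, intercept):
    """Observed minus fitted value for each point."""
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]


# Hypothetical data, invented for illustration only.
xs = [10, 20, 30, 40, 50]
ys = [0.5, 0.9, 1.2, 1.8, 2.1]
slope, intercept = fit_line(xs, ys)
print(f"y = {slope:.3f} * x + {intercept:.3f}")
print([round(r, 3) for r in residuals(xs, ys, slope, intercept)])
```

A good fit says nothing about causality: the line describes the relationship, not its origin.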

Tukey Sum-Difference Plot

Better understanding of residuals …
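The slide does not define the plot, so the following is a sketch under one common reading: for two paired series, the sum goes on the horizontal axis and the difference on the vertical axis. Points that would lie on the diagonal of a scatter plot then lie on a horizontal line at zero, which makes small departures easier to see. The function name `sum_difference` and the values are hypothetical.

```python
def sum_difference(first, second):
    """Return (sum, difference) pairs for two paired series of equal length."""
    if len(first) != len(second):
        raise ValueError("series must have the same length")
    return [(a + b, a - b) for a, b in zip(first, second)]


# Hypothetical observed and fitted values, invented for illustration only.
observed = [3.0, 5.0, 7.5, 9.0]
fitted = [3.5, 5.0, 7.0, 9.5]
for total, diff in sum_difference(observed, fitted):
    print(f"sum = {total:5.1f}   difference = {diff:+.1f}")
```

With observed and fitted values as the two series, the difference is exactly the residual.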


Box Plots
• In some situations we have, not a single data value at a point, but a number of data values, or even a probability distribution
• When might this occur?
• Tukey proposed the idea of a boxplot to visualize the distribution of values
• For explanation and some history, see:
• http://mathworld.wolfram.com/Box-and-WhiskerPlot.html
• http://en.wikipedia.org/wiki/Box_plot

Darwin’s plant study

http://www.upscale.utoronto.ca/GeneralInterest/Harrison/Visualisation/Visualisation.html

Key to the plot:
• M: median
• Q1, Q3: quartiles
• Whiskers: 1.5 * interquartile range
• Dots: outliers
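The quantities drawn in a box plot can be computed directly. The sketch below rests on two assumptions, since conventions vary between packages: quartiles are taken as the medians of the lower and upper halves of the sorted data, and each whisker stops at the most extreme data value within 1.5 * interquartile range of its quartile. The function name `boxplot_stats` and the sample are hypothetical.

```python
from statistics import median


def boxplot_stats(values):
    """Median, quartiles, whisker ends and outliers for a box plot."""
    data = sorted(values)
    n = len(data)
    if n < 2:
        raise ValueError("need at least two values")
    half = n // 2
    # Lower and upper halves exclude the middle value when n is odd.
    q1 = median(data[:half])
    q3 = median(data[n - half:])
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "median": median(data),
        "q1": q1,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }


# Hypothetical sample, invented for illustration only.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 30]
stats = boxplot_stats(sample)
for name, value in stats.items():
    print(f"{name}: {value}")
```

Here the value 30 lies beyond the upper fence, so it would be drawn as a dot rather than included in the whisker.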

Acknowledgement
• Thanks to Statistics Canada – an excellent web site for simple data presentation
• http://www.statcan.ca/english/edu/power/toc/contents.htm


Exercise for next week
• Understand a bit more about the merits of pie charts and bar graphs
• Create a dataset with roughly equal numbers in each class
• Which is best if the task is to discriminate?


Exercise for next week
• Over the next week look for examples of basic graphs
• In newspapers, magazines or other print media
• On news web sites or other electronic media
• Analyse two examples
• One should be an example where you think the use of graphics is good
• One should be bad
• Be ready next week to present these results to the class…


Gnuplot

R

Excel
