visualization and data mining l.
Download
Skip this Video
Download Presentation
Visualization and Data Mining

Loading in 2 Seconds...

play fullscreen
1 / 33

Visualization and Data Mining - PowerPoint PPT Presentation


  • 101 Views
  • Uploaded on

Visualization and Data Mining. Napoleon Invasion of Russia, 1812. Napoleon. Marley, 1885. Snow’s Cholera Map, 1855. Asia at night. South and North Korea at night. North Korea Notice how dark it is. Seoul, South Korea. Visualization Role. Support interactive exploration

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about 'Visualization and Data Mining' - lei


Download Now An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
south and north korea at night
South and North Korea at night

North Korea

Notice how dark

it is

Seoul,

South Korea

visualization role
Visualization Role
  • Support interactive exploration
  • Help in result presentation
  • Disadvantage: requires human eyes
  • Can be misleading
bad visualization spreadsheet with misleading y axis
Bad Visualization: Spreadsheet with misleading Y -axis

Y-Axis scale gives WRONG

impression of big change

better visualization
Better Visualization

Axis from 0 to 2000 scale gives

correct impression of small change

slide11

Lie Factor=14.8

(E.R. Tufte, “The Visual Display of Quantitative Information”, 2nd edition)

lie factor
Lie Factor

Tufte requirement: 0.95<Lie Factor<1.05

(E.R. Tufte, “The Visual Display of Quantitative Information”, 2nd edition)

tufte s principles of graphical excellence
Tufte’s Principles of Graphical Excellence
  • Give the viewer
    • the greatest number of ideas
    • in the shortest time
    • with the least ink in the smallest space.
  • Tell the truth about the data!

(E.R. Tufte, “The Visual Display of Quantitative Information”, 2nd edition)

visualization methods
Visualization Methods
  • Visualizing in 1-D, 2-D and 3-D
    • well-known visualization methods
  • Visualizing more dimensions
    • Parallel Coordinates
    • Other ideas
1 d univariate data
1-D (Univariate) Data
  • Representations

7

5

3

1

Tukey box plot

Middle 50%

low

high

Mean

0

20

Histogram

2 d bivariate data
2-D (Bivariate) Data
  • Scatter plot, …

price

mileage

3 d image requires 3 d blue and red glasses
3-D image (requires 3-D blue and red glasses)

Taken by Mars Rover Spirit, Jan 2004

visualizing in 4 dimensions
Visualizing in 4+ Dimensions
  • Scatterplots
  • Parallel Coordinates
  • Chernoff faces
multiple views
Multiple Views

Give each variable its own display

1

A B C D E

1 4 1 8 3 5

2 6 3 4 2 1

3 5 7 2 4 3

4 2 6 3 1 5

2

3

4

A B C D E

Problem: does not show correlations

scatterplot matrix
Scatterplot Matrix

Represent each possible

pair of variables in their

own 2-D scatterplot

(car data)

Q: Useful for what?

A: linear correlations

(e.g. horsepower & weight)

Q: Misses what?

A: multivariate effects

parallel coordinates
Parallel Coordinates
  • Encode variables along a horizontal row
  • Vertical line specifies values

Same dataset in parallel coordinates

Dataset in a Cartesian coordinates

Invented by

Alfred Inselberg

while at IBM, 1985

example visualizing iris data
Example: Visualizing Iris Data

Iris versicolor

Iris setosa

Iris virginica

flower parts
Flower Parts

Petal, a non-reproductive part of the flower

Sepal, a non-reproductive part of the flower

parallel coordinates25
Parallel Coordinates

Sepal

Length

5.1

parallel coordinates 2 d
Parallel Coordinates: 2 D

Sepal

Length

Sepal

Width

3.5

5.1

parallel coordinates 4 d
Parallel Coordinates: 4 D

Sepal

Length

Petal

length

Petal

Width

Sepal

Width

3.5

5.1

0.2

1.4

parallel visualization summary
Parallel Visualization Summary
  • Each data point is a line
  • Similar points correspond to similar lines
  • Lines crossing over correspond to negatively correlated attributes
  • Interactive exploration and clustering
  • Problems: order of axes, limit to ~20 dimensions
chernoff faces
Chernoff Faces

Encode different variables’ values in characteristics

of human face

Cute applets:

http://www.cs.uchicago.edu/~wiseman/chernoff/

http://hesketh.com/schampeo/projects/Faces/chernoff.html

visualization summary
Visualization Summary
  • Many methods
  • Visualization is possible in more than 3-D
  • Aim for graphical excellence
ad