Worcester Polytechnic Institute

XmdvTool

Interactive Visual Data Exploration System for High-dimensional Data Sets

http://davis.wpi.edu/~xmdv

Matthew O. Ward, Elke A. Rundensteiner,

Jing Yang, Punit Doshi, Geraldine Rosario,

Allen R. Martin, Ying-Huey Fua, Daniel Stroe

This work was partially funded by NSF Grants IIS-9732897, IRIS-9729878 and IIS-0119276

XmdvTool Features
  • Hierarchical visualization and interaction tools for exploring very large high-dimensional data sets to discover patterns, trends and outliers
  • Applications:
    • Bioterrorism Detection
    • Bioinformatics and Drug Discovery
    • Space Science
    • Geology and Geochemistry
    • Systems Monitoring and Performance Evaluation
    • Economics and Business
    • Simulation Design and Analysis
  • Multi-platform support (Unix, Linux, Windows)
  • Public domain software: http://davis.wpi.edu/~xmdv

Xmdv: Main Features

  • Scale-up to High Dimensions: Visual Hierarchical Dimension Reduction
  • Scale-up to Large Data Sets: Interactive Hierarchical Displays, Database Backend with MinMax Encoding, Semantic Caching and Adaptive Prefetching
  • Interlinked Multi-Displays: Parallel Coordinates, Glyphs, Scatterplot Matrices, Dimensional Stacking
  • Visual Interaction Tools: N-Dimensional Brushes, Structure-Based Brushing, InterRing
Scale-Up for Large Number of Dimensions

Solution to High Dimensional Datasets:

  • Group Similar Dimensions into Dimension Hierarchy
  • Navigate Dimension Hierarchy by InterRing
  • Form Lower Dimensional Spaces by Dimension Clusters
  • Convey Dimension Cluster Information by Dissimilarity Display
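The first step above, grouping similar dimensions into a hierarchy, can be sketched as a small bottom-up clustering. This is only an illustration: the slides do not say which dissimilarity measure or linkage XmdvTool uses, so the `1 - |correlation|` measure, the average linkage, and all function names here are assumptions.

```python
from itertools import combinations
from statistics import mean


def dissimilarity(a, b):
    """1 - |Pearson correlation| (an assumed measure, not taken from the slides)."""
    mean_a, mean_b = mean(a), mean(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 1.0  # a constant column is treated as unrelated to everything
    return 1.0 - abs(cov) / (var_a * var_b) ** 0.5


def build_dimension_hierarchy(columns):
    """columns maps a dimension name to its list of values.

    Returns a nested tuple: leaves are dimension names, inner nodes are pairs.
    """
    # Each working cluster is (subtree, names of the dimensions it contains).
    clusters = [(name, [name]) for name in columns]

    def linkage(c1, c2):
        # Average pairwise dissimilarity between the members of two clusters.
        pairs = [(p, q) for p in c1[1] for q in c2[1]]
        return sum(dissimilarity(columns[p], columns[q]) for p, q in pairs) / len(pairs)

    while len(clusters) > 1:
        # Merge the two most similar clusters into one dimension cluster.
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        merged = ((clusters[i][0], clusters[j][0]), clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters[0][0]
```

Each inner node of the returned tree is a candidate dimension cluster; picking one representative per cluster at a chosen depth gives a lower-dimensional space of the kind the slide describes.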

Visual Hierarchical Dimension Reduction Process

[Figure: a 42-dimensional data set, a 4-dimensional subspace, and the dimension hierarchy interaction tool, InterRing]


InterRing - Dimension Hierarchy Navigation and Manipulation

  • Roll-up/Drill-down
  • Rotate
  • Zoom in/out
  • Modify
  • Distort

Dissimilarity Display

  • Three Axes Method
  • Diagonal Plot Method
  • Axis Width Method
  • Mean-Band Method

Scale-up for Large Number of Records

Solution to Large Scale Datasets:

  • Group Similar Records into Data Hierarchy
  • Navigate Data Hierarchy by Structure-Based Brushing
  • Represent Data Clusters by Mean-Band Method
  • Provide Database Backend Support using MinMax Tree, Caching, Prefetching
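As a rough illustration of the Mean-Band Method named above, a data cluster can be summarized by its per-dimension mean plus a band spanning the cluster's extent. The sketch below assumes the band is the per-dimension minimum and maximum; the function name and the return layout are hypothetical.

```python
def mean_band(records):
    """Summarize a cluster of equal-length numeric records.

    Returns the per-dimension mean (the line drawn for the cluster) and the
    per-dimension low/high extents (the band drawn around that line).
    """
    columns = list(zip(*records))  # transpose: one tuple of values per dimension
    return {
        "mean": [sum(col) / len(col) for col in columns],
        "low": [min(col) for col in columns],
        "high": [max(col) for col in columns],
    }
```

In a parallel coordinates display, the `mean` values would be drawn as one polyline and the `low` to `high` range as a shaded band around it, so a cluster of many records costs one glyph instead of many lines.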

Interactive Hierarchical Display

[Figure: 2D example of hierarchical clustering and structure-based brushing]


Interactive Hierarchical Display

[Figure: flat display compared with hierarchical display, using the Mean-Band Method in parallel coordinates]


Scalability of Data Access
  • Approach
      • Attach database system to visualization front-end
  • MinMax hierarchy encoding
      • Key idea: avoid recursive processing
      • Pre-computed
  • Caching
      • Key idea: reduce response time and network traffic
  • Prefetching
      • Key idea: use application hints and predict user patterns
      • Performed during idle time
Scalability of Data Access: MinMax Hierarchy Encoding

  • Pre-compute object positions
      • level-of-detail (L)
      • extent values (x, y)
      • preserve tree structure
  • New query semantics
      • objects are now rectangles
      • select objects that touch L
      • select objects that touch (x, y)
      • structure-based brush = intersection of two selections

[Figure: objects drawn as rectangles, with extent values (x, y) on one axis and level of detail (L) on the other; query = (x, y, L)]
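The encoding can be pictured with a small sketch: an off-line pass walks the tree once and stores, for every node, an extent interval and a level-of-detail interval, after which a brush is a flat scan with two interval tests and no tree traversal. The details are assumptions made for illustration: extents are taken to be ranges of leaf positions, a node is taken to "touch" L when L falls between its own level value and its parent's, and the `Node` class and function names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    level: float  # level-of-detail value; assumed to shrink from root to leaves
    children: list = field(default_factory=list)


def label(root):
    """Off-line pass: flatten the tree into rows (name, lo, hi, level_lo, level_hi)."""
    rows = []
    next_leaf = 0

    def visit(node, parent_level):
        nonlocal next_leaf
        if not node.children:
            lo = hi = next_leaf  # a leaf covers exactly one position
            next_leaf += 1
        else:
            extents = [visit(child, node.level) for child in node.children]
            lo, hi = extents[0][0], extents[-1][1]  # span of all descendants
        rows.append((node.name, lo, hi, node.level, parent_level))
        return lo, hi

    visit(root, float("inf"))
    return rows


def structure_based_brush(rows, x, y, L):
    """On-line query (x, y, L): a flat scan over the pre-computed rectangles."""
    return sorted(
        name
        for name, lo, hi, level_lo, level_hi in rows
        if lo <= y and hi >= x          # rectangle touches the extent range (x, y)
        and level_lo <= L < level_hi    # rectangle touches the level of detail L
    )
```

Because both conditions are plain range predicates on stored columns, the same selection could be expressed as a single relational query over a table of rows, which is presumably what makes a database backend practical here.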

Scalability of Data Access: Caching

  • Purpose
      • reduce response time and network traffic
  • Issues
      • visual query cannot directly translate into object IDs
      • high-level cache specification to avoid complete scans
  • Semantic caching
      • queries are cached rather than objects
      • minimize cost of cache lookup
      • dynamically adapt cached queries to patterns of queries
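A minimal sketch of the semantic caching idea follows: the cache is keyed by query regions rather than object IDs, so a new query can be answered locally whenever a cached region contains it. It assumes one-dimensional range queries and a hypothetical `fetch(x, y)` backend call that returns `(position, object)` pairs; it uses simple least-recently-used eviction and does not attempt the dynamic adaptation of cached queries mentioned above.

```python
class SemanticCache:
    """Caches query regions and their results instead of individual objects."""

    def __init__(self, fetch, capacity=8):
        self.fetch = fetch        # hypothetical backend: (x, y) -> [(position, obj), ...]
        self.capacity = capacity
        self.entries = []         # [((x, y), rows)], most recently used last
        self.hits = 0
        self.misses = 0

    def query(self, x, y):
        for index, ((cx, cy), rows) in enumerate(self.entries):
            if cx <= x and y <= cy:
                # The cached region contains the new query: answer it locally.
                self.hits += 1
                self.entries.append(self.entries.pop(index))  # mark as recently used
                return [(pos, obj) for pos, obj in rows if x <= pos <= y]
        # No containing region: go to the backend and remember the query itself.
        self.misses += 1
        rows = self.fetch(x, y)
        self.entries.append(((x, y), rows))
        if len(self.entries) > self.capacity:
            self.entries.pop(0)  # evict the least recently used region
        return rows
```

The lookup compares a handful of region bounds instead of scanning cached objects, which is one way to read "minimize cost of cache lookup".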

Scalability of Data Access: Prefetching

  • Strategy
    • Speculative (no specific hints)
        • navigation remains local
        • both user and data set influence exploration
    • Adaptive (strategy changes over time)
        • Evolves as more knowledge becomes available
    • Non-pure (interruptible prefetching)
        • leave buffer in consistent state
  • Requirements
    • non-pure prefetching + large transactions & small object size + semantic caching → small granularity (object level)
    • speculative, non-pure prefetcher → cache replacement policy + guessing method
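The non-pure (interruptible) requirement can be illustrated with a loop that loads one object at a time and checks for an interrupt between objects, so the buffer never holds a half-loaded item. This is a sketch under assumptions: the `fetch_object` callback, the dictionary buffer, and the use of `threading.Event` as the interrupt signal are all hypothetical choices, not the XmdvTool design.

```python
import threading


def prefetch(predicted_ids, fetch_object, buffer, stop_requested):
    """Load predicted objects into the buffer until done or interrupted.

    stop_requested is a threading.Event set when the user becomes active
    again. Working at object granularity means an interrupt can only land
    between two complete loads, which keeps the buffer consistent.
    """
    loaded = 0
    for object_id in predicted_ids:
        if stop_requested.is_set():
            break  # user activity resumed: give up the remaining guesses
        if object_id not in buffer:
            buffer[object_id] = fetch_object(object_id)
            loaded += 1
    return loaded
```

A caller would typically run this during idle time and set the event from the GUI thread as soon as a new user query arrives.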

Scalability of Data Access: Experimental Evaluation

  • Conclusions:
      • Caching reduces response time by 80%
      • Prefetching further reduces response time by 30%
      • Designing better prefetching strategies might help further reduce response time
Scalability of Data Access: Prefetching

[Figure: prefetching strategies shown relative to the Current Navigation Window and Hot Regions. Panel titles: Localized Speculative Strategies, Random Strategy, Direction Strategy, Focus Strategy, Vector Strategies, Mean Strategy, Exponential Weight Average Strategy, Data Set Driven Strategy. Position labels: (m-1), m, (m+1); m(n-2), m(n-1), m(n), m(n+1)]
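To make the strategy names more concrete, here is one possible reading of three of them as predictors of the next window position m(n+1) from the history m(n-2), m(n-1), m(n). The formulas are assumptions inferred from the names alone, positions are simplified to single numbers, and the smoothing factor `alpha` is a hypothetical parameter.

```python
def steps(history):
    """Movements between consecutive navigation window positions."""
    return [b - a for a, b in zip(history, history[1:])]


def direction_strategy(history):
    """Assume the user keeps moving exactly as in the last step."""
    return history[-1] + steps(history)[-1]


def mean_strategy(history):
    """Assume the next step equals the average of all past steps."""
    moves = steps(history)
    return history[-1] + sum(moves) / len(moves)


def ewa_strategy(history, alpha=0.5):
    """Exponentially weighted average of past steps; recent steps count more."""
    moves = steps(history)
    average = moves[0]
    for move in moves[1:]:
        average = alpha * move + (1 - alpha) * average
    return history[-1] + average
```

Each function needs at least two positions in `history`. The predicted position would then be turned into a query region to prefetch during idle time.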

Xmdv System Implementation

[Figure: system architecture diagram, divided into an OFF-LINE PROCESS and an ON-LINE PROCESS. Component labels: Flat Data, Loader, MinMax Labeling, Hierarchical Data, Schema Info, DB, User, GUI, Translator, Rewriter, Queries, MEMORY, Exploration Variables, Buffer, Estimator, Prefetcher. Library: Random, Direction, Focus, Mean, EWA]

  • Tools
    • C/C++
    • TCL/TK
    • OpenGL
    • Oracle 8i
    • Pro*C
Publications (available at http://davis.wpi.edu/~xmdv)
  • Jing Yang, Matthew O. Ward and Elke A. Rundensteiner, "InterRing: An Interactive Tool for Visually Navigating and Manipulating Hierarchical Structures", InfoVis 2002, to appear
  • Punit R. Doshi, Elke A. Rundensteiner, Matthew O. Ward and Daniel Stroe, "Prefetching For Visual Data Exploration", Technical Report WPI-CS-TR-02-07, 2002
  • Jing Yang, Matthew O. Ward and Elke A. Rundensteiner, "Interactive Hierarchical Displays: A General Framework for Visualization and Exploration of Large Multivariate Data Sets", Computers and Graphics Journal, 2002, to appear
  • Daniel Stroe, Elke A. Rundensteiner and Matthew O. Ward, "Scalable Visual Hierarchy Exploration", Database and Expert Systems Applications, pages 784-793, Sept. 2000
  • Ying-Huey Fua, Matthew O. Ward and Elke A. Rundensteiner, "Hierarchical Parallel Coordinates for Exploration of Large Datasets", IEEE Proceedings of Visualization, pages 43-50, Oct. 1999
  • Ying-Huey Fua, Matthew O. Ward and Elke A. Rundensteiner, "Navigating Hierarchies with Structure-Based Brushes", IEEE Proceedings of Visualization, pages 43-50, Oct. 1999