iTree: Exploring Time-Varying Data using Indexable Tree

Yi Gu and Chaoli Wang

Michigan Technological University

Presented at IEEE Pacific Visualization Symposium

28 February 2013

Sydney, Australia


TAC-based time-varying data visualization

Time-activity curve (TAC)

Time-varying medical imaging data [Fang et al. 2007]

Importance analysis

Multiscale data clustering

Temporal sequencing

Trend identification

What can iTree do for us?

Handle ever-growing data size and complexity (efficient data compacting)

Index and query TACs adaptively (effective data indexing)

Interact with space-time data (intuitive visual exploration)


Symbolic Aggregate ApproXimation (SAX)

[Figure, from Keogh’s SIGKDD 2007 tutorial slide: a time series C over time steps 0 to 120 and its PAA approximation; breakpoints divide the value axis into regions labeled a, b and c, yielding the SAX word baabccbc (word length: 8; bit cardinality: 2)]

First convert the time series to a piecewise aggregate approximation (PAA) representation, then convert the PAA to symbols

The conversion takes linear time [Lin et al. 2003]

A SAX word can be represented by symbols (e.g., a, b, c) or bits (e.g., 00, 01, 10 or 0^2, 1^2, 2^2, where the superscript is the bit cardinality)
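The two conversion steps can be sketched in a few lines of Python. This is a minimal illustration of classic SAX, assuming z-normalization and equiprobable Gaussian breakpoints as in Lin et al.; the function names are hypothetical, and the sketch assumes the word length divides the series length and that the series is not constant.

```python
from bisect import bisect_right
from statistics import NormalDist, mean, pstdev

def paa(series, w):
    """Piecewise aggregate approximation: mean of w equal-length segments."""
    n = len(series)
    assert n % w == 0, "sketch assumes w divides n"
    seg = n // w
    return [mean(series[i * seg:(i + 1) * seg]) for i in range(w)]

def gaussian_breakpoints(alphabet_size):
    """Breakpoints that cut N(0, 1) into equiprobable regions."""
    nd = NormalDist()
    return [nd.inv_cdf(k / alphabet_size) for k in range(1, alphabet_size)]

def sax_word(series, w, alphabet="abc"):
    """z-normalize, reduce to PAA, then map each PAA value to a symbol."""
    mu, sigma = mean(series), pstdev(series)   # sigma must be non-zero
    z = [(x - mu) / sigma for x in series]
    bps = gaussian_breakpoints(len(alphabet))
    return "".join(alphabet[bisect_right(bps, c)] for c in paa(z, w))

ramp = [float(i) for i in range(16)]
word = sax_word(ramp, w=8)
print(word)  # aaabbccc
```

Both steps make a single pass over the data, which is where the linear running time comes from.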


SAX for time-varying volume data (1)

Handle time-varying data

Form each time series from a group of voxels over a time interval by going through the group voxel by voxel for the 1st time step, then the 2nd, etc.

Modify the original SAX/iSAX algorithms to

Better differentiate SAX words (effectiveness)

Improve computational performance (efficiency)

Make iSAX amenable to visual mapping (visualization)

PAA conversion

Convert a TAC T of length n to a PAA C of length w
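A possible reading of the voxel-group ordering, as a Python sketch. The nested-list layout volume[t][z][y][x], the cubic block shape and the name block_tac are assumptions made for illustration only; the slides do not specify the data layout.

```python
def block_tac(volume, t0, t1, z0, y0, x0, size):
    """Flatten one size^3 block over time steps [t0, t1) into a single series."""
    series = []
    for t in range(t0, t1):                  # 1st time step, then the 2nd, ...
        for z in range(z0, z0 + size):       # voxel by voxel within the block
            for y in range(y0, y0 + size):
                for x in range(x0, x0 + size):
                    series.append(volume[t][z][y][x])
    return series

# Tiny synthetic volume: 2 time steps of 2 x 2 x 2 voxels.
volume = [[[[t * 100 + z * 4 + y * 2 + x for x in range(2)]
            for y in range(2)]
           for z in range(2)]
          for t in range(2)]
series = block_tac(volume, 0, 2, 0, 0, 0, 2)
```

The resulting series of length n (block size times interval length) is what the PAA conversion would then reduce to w values.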


SAX for time-varying volume data (2)

Transfer function based breakpoint identification

H': histogram after logarithm and normalization of the original histogram

H: new histogram obtained by multiplying H' by the opacity value

[Figure: before and after]
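A rough Python sketch of how H could drive breakpoint placement. The slides define H' and H but not the placement rule, so the equal-mass rule used here is an assumption, as are log(1 + h) scaling, normalization by the peak, and the function names.

```python
import math

def weighted_histogram(hist, opacity):
    """H = H' * opacity, where H' is the log-scaled, normalized histogram."""
    logged = [math.log(1 + h) for h in hist]   # log(1 + h) avoids log(0); an assumption
    peak = max(logged) or 1.0                  # normalize by the peak; an assumption
    return [(v / peak) * a for v, a in zip(logged, opacity)]

def breakpoints_from_histogram(H, bin_edges, levels):
    """Place levels - 1 breakpoints so each region holds about equal mass of H."""
    total = sum(H)
    targets = [total * k / levels for k in range(1, levels)]
    bps, running, k = [], 0.0, 0
    for i, h in enumerate(H):
        running += h
        while k < len(targets) and running >= targets[k]:
            bps.append(bin_edges[i + 1])       # right edge of the bin reaching the target
            k += 1
    return bps

# Uniform data histogram; the transfer function makes only the upper half visible.
H = weighted_histogram([10, 10, 10, 10], [0, 0, 1, 1])
bps = breakpoints_from_histogram(H, [0, 1, 2, 3, 4], levels=2)
```

In this toy case the single breakpoint falls in the middle of the opaque value range, so quantization levels are spent where the transfer function makes the data visible.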


SAX for time-varying volume data (3)

SAX word generation

Construct an alphabet Φ and transform C into an array of symbols Ĉ to form a SAX word

Distance between two symbols

Distance between two SAX words

The distance between two SAX words is a lower bound of the Euclidean distance defined on the PAA representation
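For reference, the standard SAX definitions of the two distances (Lin et al.) can be sketched in Python as follows. The formulas on the original slide are not preserved in this transcript and may differ; symbols are assumed to be 0-based indices and breakpoints a sorted list.

```python
import math

def symbol_dist(r, c, breakpoints):
    """Standard SAX symbol distance: zero for equal or adjacent symbols,
    otherwise the gap between the two symbol regions."""
    if abs(r - c) <= 1:
        return 0.0
    return breakpoints[max(r, c) - 1] - breakpoints[min(r, c)]

def mindist(word_q, word_s, breakpoints, n):
    """Standard SAX word distance for words of equal length w built from
    series of length n."""
    w = len(word_q)
    total = sum(symbol_dist(a, b, breakpoints) ** 2
                for a, b in zip(word_q, word_s))
    return math.sqrt(n / w) * math.sqrt(total)

bps = [-0.67, 0.0, 0.67]                 # four symbols: 0, 1, 2, 3
d = mindist([0, 0], [3, 0], bps, n=8)    # sqrt(8 / 2) * 1.34
```

Because adjacent symbols contribute zero, the word distance never overestimates the distance between the underlying PAA values.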


SAX lower bounding

Exact (Euclidean) distance D(Q,S), computed on the raw data Q and S

Lower bounding distance DLB(Q',S'), computed on the approximate representations Q' and S'

Lower bounding means that for all Q and S, we have DLB(Q',S') ≤ D(Q,S)

Keogh’s SIGKDD 2007 tutorial slide
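The lower-bounding property can be checked empirically. The Python sketch below tests only the PAA step of the chain (the scaled PAA distance never exceeds the Euclidean distance on the raw series); the SAX word distance is in turn bounded by the PAA distance. The random test data and function names are illustrative assumptions.

```python
import math
import random

def paa(series, w):
    seg = len(series) // w                 # sketch assumes w divides len(series)
    return [sum(series[i * seg:(i + 1) * seg]) / seg for i in range(w)]

def euclidean(q, s):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(q, s)))

def paa_lower_bound(q, s, w):
    """sqrt(n / w) times the Euclidean distance between the two PAA vectors."""
    n = len(q)
    return math.sqrt(n / w) * euclidean(paa(q, w), paa(s, w))

random.seed(0)
n, w = 64, 8
violations = 0
for _ in range(1000):
    q = [random.gauss(0, 1) for _ in range(n)]
    s = [random.gauss(0, 1) for _ in range(n)]
    if paa_lower_bound(q, s, w) > euclidean(q, s) + 1e-9:
        violations += 1
```

A lower bound is what makes index-based pruning safe: a candidate whose bound already exceeds the query threshold cannot be a true match.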



SAX construction (in sec)

Word lengths of 8 to 12 and 16 to 32 quantization levels give an appropriate tradeoff between quality and speed

Less than 10 minutes to construct SAX, excluding I/O time


iSAX for time-varying volume data (1)

iSAX organizes SAX words hierarchically

A node represents a set of TACs with the same or similar SAX words

Split a node when the number of SAX words exceeds a certain threshold

How to split?

The original iSAX chooses the left-most symbol with the smallest bit cardinality to split

We choose the symbol covering the largest value range to split

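[Figure: example node splits, written as symbol^(bit cardinality)]

2^2 0^1 1^1 3^2 splits into 2^2 1^2 1^1 3^2 and 2^2 2^2 1^1 3^2

2^2 0^1 4^3 3^2 splits into 2^2 0^1 8^4 3^2 and 2^2 0^1 9^4 3^2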
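The two splitting rules can be contrasted with a short Python sketch. Each symbol is modeled as a (value, bits) pair, and edges is a hypothetical list of the value boundaries at the finest quantization level; both representations are assumptions, since the slides do not give the data structures.

```python
def split_original(word):
    """Original iSAX rule: left-most symbol with the smallest bit cardinality."""
    bits = [b for _, b in word]
    return bits.index(min(bits))

def value_range(symbol, edges, max_bits):
    """Width of the value interval covered by a symbol at its current bits."""
    value, bits = symbol
    shift = max_bits - bits
    lo, hi = value << shift, (value + 1) << shift
    return edges[hi] - edges[lo]

def split_largest_range(word, edges, max_bits):
    """Modified rule: the refinable symbol covering the largest value range."""
    candidates = [i for i, (_, b) in enumerate(word) if b < max_bits]
    return max(candidates, key=lambda i: value_range(word[i], edges, max_bits))

def refine(symbol):
    """Splitting a symbol adds one bit and yields two children."""
    value, bits = symbol
    return (2 * value, bits + 1), (2 * value + 1, bits + 1)

edges = [0, 10, 20, 60, 100]          # non-uniform boundaries, 2 bits max
word = [(0, 1), (1, 1), (3, 2)]
first = split_original(word)
widest = split_largest_range(word, edges, max_bits=2)
```

With non-uniform breakpoints the two rules can disagree: here the original rule picks the first symbol (range 20), while the value-range rule picks the second (range 80). The sketch raises ValueError when no symbol can be refined further.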



Comparison

Original breakpoint identification and symbol splitting

Our new breakpoint identification and symbol splitting


iSAX for time-varying volume data (2)

iSAX construction

Voxel IDs for each terminal node are saved into a file

Use the SAX word itself as the file name to facilitate search

Out-of-core acceleration strategy

Partition all voxels or groups into at most 2^w buckets and save each non-empty bucket into a file

Choose the file with the largest voxel/group count and split it if the count is larger than a threshold δn

Continue until no file is larger than δn
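The bucket strategy can be sketched in Python with an in-memory dictionary standing in for the files. This is an illustration under assumptions: each word is stored at max_bits per symbol, the initial 2^w buckets use one bit per symbol, and the symbol to refine is picked with a placeholder rule (left-most coarsest) rather than the value-range rule.

```python
from collections import defaultdict

def bucket_key(full_word, bits, max_bits):
    """Truncate each full-resolution symbol to its current number of bits."""
    return tuple((v >> (max_bits - b), b) for v, b in zip(full_word, bits))

def build_buckets(words, max_bits, threshold):
    """words maps a voxel/group ID to its SAX word at max_bits per symbol."""
    w = len(next(iter(words.values())))
    pending = defaultdict(list)
    for vid, full in words.items():            # at most 2**w initial buckets
        pending[bucket_key(full, [1] * w, max_bits)].append(vid)
    finished = {}
    while pending:
        key = max(pending, key=lambda k: len(pending[k]))   # largest bucket first
        members = pending.pop(key)
        bits = [b for _, b in key]
        if len(members) <= threshold or min(bits) == max_bits:
            finished[key] = members            # small enough, or cannot be refined
            continue
        bits[bits.index(min(bits))] += 1       # placeholder rule: left-most coarsest symbol
        for vid in members:
            pending[bucket_key(words[vid], bits, max_bits)].append(vid)
    return finished

words = {0: (0, 0), 1: (0, 1), 2: (0, 3), 3: (3, 3)}
buckets = build_buckets(words, max_bits=2, threshold=1)
```

Each key of the result plays the role of a file name. In an out-of-core setting only the bucket being split would need to be resident in memory.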


iSAX for time-varying volume data (3)

Approximate and exact search

Both take the PAA representation and a threshold δ as input

Approximate search converts the input PAA to a SAX word, compares it only with each of the file names, and returns the voxels of every file whose distance is less than δ

Exact search needs an additional step: compute the PAA-based distance of each returned voxel to the input PAA and keep only those voxels whose distance is less than δ
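A simplified Python sketch of the two search modes. Instead of parsing SAX words from file names, each file is represented here by the value interval that each of its symbols covers, and the query is compared against those intervals; this representation, the omission of the sqrt(n/w) scale factor, and all names are assumptions for illustration.

```python
import math

def gap(x, lo, hi):
    """Distance from x to the interval [lo, hi]; zero if x lies inside."""
    return lo - x if x < lo else x - hi if x > hi else 0.0

def word_distance(query_paa, intervals):
    """Lower bound of the PAA distance to any series stored in the file."""
    return math.sqrt(sum(gap(q, lo, hi) ** 2
                         for q, (lo, hi) in zip(query_paa, intervals)))

def approximate_search(query_paa, index, delta):
    """Look only at the file keys; return all voxels of the matching files."""
    hits = []
    for intervals, voxel_ids in index.items():
        if word_distance(query_paa, intervals) < delta:
            hits.extend(voxel_ids)
    return hits

def exact_search(query_paa, index, paa_of, delta):
    """Additional step: verify each candidate against its stored PAA vector."""
    return [v for v in approximate_search(query_paa, index, delta)
            if math.dist(query_paa, paa_of[v]) < delta]

index = {((0, 1), (0, 1)): [0, 1], ((2, 3), (2, 3)): [2]}
paa_of = {0: (0.1, 0.1), 1: (0.9, 0.9), 2: (2.5, 2.5)}
approx = approximate_search((0.0, 0.0), index, delta=0.5)
exact = exact_search((0.0, 0.0), index, paa_of, delta=0.5)
```

Because the key distance is a lower bound, the exact result is always a subset of the approximate one: approximate search may return false positives but misses nothing within δ.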


iTree (1)

From iSAX (internal) hierarchy to iTree (external)

The number of non-empty children of the root is fairly large

Solution: level promoting

iSAX has a large number of hierarchy levels with a small fanout (2)

Solution: sibling grouping

Sibling nodes are not arranged according to their similarity

Solution: sibling reordering

Resulting properties

The height of the iTree is determined by the maximal bit cardinality for representing any symbol in the SAX words

The iTree is balanced: no node has an excessively large fanout

Neighboring sibling nodes have a higher degree of similarity in terms of spatial closeness and temporal trend


iTree (2)

iTree drawing and focus+context visualization

Hyperbolic layout [Lamping and Rao 1996]

Accommodate a large number of nodes

Allow focus+context interaction

Add the time ring to indicate the time dimension

Query in multiple coordinated views (volume view, iTree view and SAX view)



iSAX/iTree construction (in sec)

The number of nodes is reduced by an order of magnitude from iSAX to iTree



Brute-force/approx./exact search (in sec)

Brute-force search does not use any indexing scheme but simply goes over the PAA representation of the data to identify similar voxels

The time cost of approx. search does not increase much from the current interval to all time steps (it only uses the names of the index files for distance computation)


Summary

iTree

A data organization, visual representation and user interaction framework for time-varying data analysis and visualization

Applicable to big time-varying data sets

Limitations

Breakpoint identification depends on the input transfer function

Blockwise TACs lead to block discontinuity in data classification

Future work

Motif finding (locating previously unknown, frequently occurring patterns)

Time-varying multivariate data

Acknowledgements

U.S. National Science Foundation

