A*-tree: A Structure for Storage and Modeling of Uncertain Multidimensional Arrays

Presented by: ZHANG Xiaofei

March 2, 2011


Outline

  • Motivation

  • Modeling correlated uncertainty

  • Construction of A*-tree

  • Analysis of A*-tree

  • Query processing

  • Experiments


Motivation

  • Multidimensional arrays

    • Well suited to scientific and engineering applications

    • Logically equivalent to relational tables

[Figure: a two-dimensional array with dimensions D1 and D2; each cell holds the attributes <A1, A2, …, An>]

A cell of the multidimensional array: (A1, A2, …, Ak, D1, D2, …, Dd)
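The equivalence with relational tables can be illustrated with a minimal sketch, assuming a two-dimensional array whose cells hold attribute tuples. The function names and the sample values are hypothetical and are not taken from the slides.

```python
def array_to_rows(array):
    """Flatten a 2-D array of attribute tuples into rows (A1..Ak, D1, D2)."""
    rows = []
    for d1, line in enumerate(array):
        for d2, attrs in enumerate(line):
            rows.append((*attrs, d1, d2))
    return rows


def rows_to_array(rows, shape, n_attrs):
    """Rebuild the array from rows; the last two fields are the coordinates."""
    n1, n2 = shape
    array = [[None] * n2 for _ in range(n1)]
    for row in rows:
        d1, d2 = row[n_attrs:]
        array[d1][d2] = tuple(row[:n_attrs])
    return array


# Made-up 2 x 2 array with two attributes per cell.
grid = [[(20.5, 0.3), (21.0, 0.4)],
        [(19.8, 0.2), (20.1, 0.5)]]
rows = array_to_rows(grid)
```

Each row carries the cell's attributes followed by its coordinates, so the round trip through `rows_to_array` recovers the original array.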


Motivation (Cont’d)

  • Uncertain data

    • Inevitable

    • Two categories


Motivation (Cont’d)

  • Correlated uncertain data

    • Examples: Geographically distributed sensors

Further application examples include network traffic analysis at routers and quantization of images or sound.


Modeling Correlated Uncertainty

  • PGM: Probabilistic Graphical Model

    • Bayesian network

Limitations:

Requires prior knowledge and initial probabilities

Significant computational cost (NP-hard)


Modeling Correlated Uncertainty (Cont’d)

  • PGM: Probabilistic Graphical Model

    • Markov Random Fields

A graphical model in which a set of random variables has a Markov property described by an undirected graph

Pros: can represent cyclic dependencies

Cons: cannot represent induced dependencies

NP-hard to compute


Modeling Correlated Uncertainty (Cont’d)

  • Considering the locality of correlation

    • E.g., a 2-dimensional array


Construction of A*-tree

  • Basic A*-structure

k-ary tree: k=2^d, where d is the number of correlated dimensions

Each leaf contains the joint distribution of four neighboring cells it maps to

The joint distribution at each internal node is recursively defined
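The shape of such a tree can be worked out with a short sketch. It assumes every correlated dimension has the same power-of-two length and that a leaf covers two cells per dimension; the helper names are hypothetical.

```python
def fanout(d):
    """Children per internal node: k = 2^d for d correlated dimensions."""
    return 2 ** d


def height(side):
    """Number of tree levels when each dimension has `side` cells.

    Assumes `side` is a power of two and that a leaf covers two cells
    per dimension (four cells when d = 2).
    """
    if side < 2 or side & (side - 1):
        raise ValueError("side must be a power of two, at least 2")
    return side.bit_length() - 1


def nodes_per_level(side, d):
    """Node counts from the root (level 0) down to the leaf level."""
    return [fanout(d) ** level for level in range(height(side))]
```

Under these assumptions a 1024 × 1024 array gives `height(1024) == 10`, with 512 × 512 leaves at the bottom level.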


Construction of A*-tree (Cont’d)

  • Joint distribution at a node

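[Figure: four neighboring values X1, X2, X3, X4 and their average Y]

Y = (X1 + X2 + X3 + X4) / 4

Xi = Y(1 + Fi), for i = 1, …, 4

Fi has range k; the distribution table has r entries; l bits are used to represent each probability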
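The two relations Y = (X1 + X2 + X3 + X4) / 4 and Xi = Y(1 + Fi) can be checked numerically with a minimal sketch. The function names are hypothetical, the values are made up, and the sketch assumes Y is non-zero. Note that the two relations together imply F1 + F2 + F3 + F4 = 0.

```python
def encode_block(xs):
    """Return the average Y and the factors Fi such that Xi = Y(1 + Fi)."""
    y = sum(xs) / len(xs)
    if y == 0:
        raise ValueError("factors are undefined when the average is zero")
    return y, [x / y - 1 for x in xs]


def decode_block(y, factors):
    """Recover the Xi from Y and the factors Fi."""
    return [y * (1 + f) for f in factors]


# Made-up values for four neighboring cells.
y, factors = encode_block([8.0, 12.0, 10.0, 10.0])
restored = decode_block(y, factors)
```

In this example Y is 10 and the factors are approximately -0.2, 0.2, 0 and 0, so a node only needs to describe how its children deviate from the average.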


Construction of A*-tree (Cont’d)

  • Extension of A*-tree

    • Uneven dimensional size

      • A dimension of length 2k+1 is partitioned into k and k+1

      • The shorter dimension stops being partitioned first, while partitioning of the longer dimension continues
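The two partitioning rules can be sketched as follows. This is an illustration under stated assumptions rather than the presented construction: a block is a tuple of hypothetical (start, length) pairs, one per dimension, and partitioning continues down to single cells for simplicity.

```python
from itertools import product


def split(start, length):
    """Split one dimension; a length of 2k+1 becomes k and k+1."""
    if length == 1:
        return [(start, 1)]  # this dimension has stopped partitioning
    left = length // 2
    return [(start, left), (start + left, length - left)]


def children(block):
    """Child blocks of `block`, a tuple of (start, length) per dimension."""
    return list(product(*(split(s, n) for s, n in block)))


def leaves(block):
    """All single-cell blocks reached by partitioning recursively."""
    if all(n == 1 for _, n in block):
        return [block]
    return [leaf for child in children(block) for leaf in leaves(child)]
```

For a block of shape 1 × 8 the first dimension no longer splits, so `children` returns two child blocks instead of four.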


Construction of A*-tree (Cont’d)

  • Extension of A*-tree

    • Basic uncertainty blocks of arbitrary shapes

      • Intuitively, each cell is a basic uncertainty block; however, this granularity may be too fine

      • The initial identification of uncertainty blocks is user- and application-specified


Analysis of A*-tree

  • Natural mapping from A*-tree to Bayesian Network


Analysis of A*-tree (Cont’d)

  • How the A*-tree model expresses neighboring correlation

    • For a random query, the average level at which cell correlation is encoded is low, which supports both efficient inference and accurate modeling


Analysis of A*-tree (Cont’d)

  • Neighboring cells and clustering distance

    • Definition


Analysis of A*-tree (Cont’d)

  • Neighboring cells and clustering distance


Analysis of A*-tree (Cont’d)

  • CD (Clustering Distance)

    • For any query that may return q pairs of neighboring cells

Expected average CD

E.g., for a 1024 × 1024 array, h = 10, and E(avgCD) ≈ 1.01
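The transcript does not preserve the slide's definition of CD or its formula, so the following is only a sketch under an assumed definition: the CD of two neighboring cells is taken to be the number of levels above the leaf level at which they first share an ancestor (0 when both fall in the same leaf). Because axis-aligned neighbors differ in one coordinate only, it is enough to look at a single row of 2^h cells. The function names are hypothetical.

```python
def lca_height(i, j):
    """Height of the lowest common ancestor of positions i and j on one axis.

    Height 1 is the leaf level, where a leaf covers two cells per dimension.
    """
    return (i ^ j).bit_length()


def average_cd(h):
    """Average assumed CD over all adjacent pairs in a row of 2^h cells."""
    side = 2 ** h
    pairs = side - 1
    total = sum(lca_height(i, i + 1) - 1 for i in range(pairs))
    return total / pairs
```

Under this assumed definition `average_cd(10)` comes out close to 1, the same order as the figure quoted above; the exact value depends on the definition used on the slide.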


Analysis of A*-tree (Cont’d)

  • Accuracy vs. Efficiency

    • Double “flip”

    • Polynomial time scan O(d*n)

    • Consider basic uncertainty block


Query Processing

  • Monte Carlo based query processing

    • Sampling

      Q: SELECT AVG(brightness)
         FROM space_image
         WHERE Dis(x, y, z, 322, 108, 251) < 50
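Monte Carlo processing of an AVG query of this kind can be sketched as below. This is a hedged illustration, not the presented implementation: the `Node` layout, the two-dimensional array, the distance predicate `near` (standing in for Dis(...) < 50) and all numbers are assumptions. Each sample draws factors top-down with Xi = Y(1 + Fi), and the aggregate is averaged over the sampled arrays.

```python
import random


class Node:
    """Hypothetical node: a table of (probability, (F1, F2, F3, F4)) entries."""

    def __init__(self, table, children=None):
        self.table = table
        self.children = children  # list of four Nodes, or None for a leaf


def sample_world(node, y, size, rng, out, row=0, col=0):
    """Fill the size x size block of `out` at (row, col) with one joint sample."""
    probs = [p for p, _ in node.table]
    entries = [f for _, f in node.table]
    factors = rng.choices(entries, weights=probs, k=1)[0]
    half = size // 2
    quadrants = [(0, 0), (0, half), (half, 0), (half, half)]
    for i, (dr, dc) in enumerate(quadrants):
        x = y * (1 + factors[i])  # Xi = Y(1 + Fi)
        if node.children is None:
            out[row + dr][col + dc] = x  # leaf: Xi is a cell value
        else:
            sample_world(node.children[i], x, half, rng, out, row + dr, col + dc)


def mc_average(root, root_mean, size, selected, n_samples, seed=0):
    """Estimate AVG over the cells where selected(row, col) is true."""
    rng = random.Random(seed)
    cells = [(r, c) for r in range(size) for c in range(size) if selected(r, c)]
    if not cells:
        raise ValueError("the predicate selects no cells")
    total = 0.0
    for _ in range(n_samples):
        world = [[0.0] * size for _ in range(size)]
        sample_world(root, root_mean, size, rng, world)
        total += sum(world[r][c] for r, c in cells) / len(cells)
    return total / n_samples


# Made-up 4 x 4 example: one root with four identical leaves.
leaf = Node([(0.5, (0.1, -0.1, 0.0, 0.0)), (0.5, (-0.1, 0.1, 0.0, 0.0))])
root = Node([(1.0, (0.2, -0.2, 0.0, 0.0))], children=[leaf, leaf, leaf, leaf])


def near(r, c):
    """Stand-in for the distance predicate: cells within distance 2 of (1, 1)."""
    return (r - 1) ** 2 + (c - 1) ** 2 < 2 ** 2


estimate = mc_average(root, 100.0, 4, near, n_samples=1000)
```

Each sampled array is generated in a single top-down pass, and sibling cells stay correlated because they share the same sampled ancestors.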


Query Processing (Cont’d)

  • Compared with MRF

    • MRFs require sequential rounds of sampling

    • Each sampled node is computed from all the nodes


Query Processing (Cont’d)

  • Other queries

    • COUNT, AVG and SUM

  • Minimum Set Cover

  • Built-in cell-count function

  • Effective query answering
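One plausible reading of how these points fit together is sketched below; it is an assumption, not a description taken from the slides. A rectangular query is tiled by the largest tree blocks that lie fully inside it, and COUNT, SUM and AVG are then derived from each block's mean and cell count. The function names are hypothetical, and `block_mean` recomputes a mean that a real node would store.

```python
def block_mean(a, r, c, size):
    """Stand-in for the mean stored at the tree node covering this block."""
    cells = [a[i][j] for i in range(r, r + size) for j in range(c, c + size)]
    return sum(cells) / len(cells)


def cover(r, c, size, query):
    """Yield (row, col, size) of the largest blocks that tile the query.

    `query` is a half-open rectangle (r0, c0, r1, c1); `size` is a power of two.
    """
    r0, c0, r1, c1 = query
    if r >= r1 or c >= c1 or r + size <= r0 or c + size <= c0:
        return  # block and query are disjoint
    if r0 <= r and r + size <= r1 and c0 <= c and c + size <= c1:
        yield (r, c, size)  # block lies fully inside the query
        return
    half = size // 2
    for dr in (0, half):
        for dc in (0, half):
            yield from cover(r + dr, c + dc, half, query)


def aggregates(a, query):
    """Return (COUNT, SUM, AVG) over the query using block means and counts."""
    count, total = 0, 0.0
    for r, c, s in cover(0, 0, len(a), query):
        n = s * s  # cell count of the block
        count += n
        total += block_mean(a, r, c, s) * n
    return count, total, (total / count if count else None)


# Made-up 4 x 4 array holding the values 0..15.
data = [[4 * i + j for j in range(4)] for i in range(4)]
```

For the query `(0, 0, 2, 4)` the cover consists of just two 2 × 2 blocks, so the aggregate touches two nodes rather than eight cells.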


Experiments

  • Data set description

  • Evaluations

    • Accuracy of modeling the underlying joint distribution

    • Execution time

    • Aggregate query

    • Space cost


Experiments (Cont’d)

  • Accuracy


Experiments (Cont’d)

  • Accuracy


Experiments (Cont’d)

  • Execution time


Experiments (Cont’d)

  • Aggregate query and space cost


Thank you!

Q&A