A*-tree: A Structure for Storage and Modeling of Uncertain Multidimensional Arrays





Presented by: ZHANG Xiaofei

March 2, 2011

Outline
  • Motivation
  • Modeling correlated uncertainty
  • Construction of A*-tree
  • Analysis of A*-tree
  • Query processing
  • Experiments
Motivation
  • Multidimensional arrays
    • Well suited to scientific and engineering applications
    • Logically equivalent to relational tables

[Figure: a two-dimensional array with dimensions D1 and D2; each cell holds the attributes <A1, A2, …, An>]

A cell of the multidimensional array: (A1, A2, …, Ak, D1, D2, …, Dd)

Motivation (Cont’d)
  • Uncertain data
    • Inevitable
    • Two categories
Motivation (Cont’d)
  • Correlated uncertain data
    • Examples: Geographically distributed sensors

Further application examples include network traffic analysis at routers and quantization of images or sound.

Modeling Correlated Uncertainty
  • PGM: Probabilistic Graphical Model
    • Bayesian network

Limitations:

Requires prior knowledge and initial probabilities

Significant computational cost (NP-hard)

Modeling Correlated Uncertainty (Cont’d)
  • PGM: Probabilistic Graphical Model
    • Markov Random Fields

A graphical model in which a set of random variables has a Markov property described by an undirected graph

Pros: can represent cyclic dependencies

Cons: cannot represent induced dependencies

NP-hard to compute

Modeling Correlated Uncertainty (Cont’d)
  • Considering the locality of correlation
    • E.g., a 2-dimensional array
Construction of A*-tree
  • Basic A*-structure

k-ary tree: k = 2^d, where d is the number of correlated dimensions

Each leaf contains the joint distribution of the neighboring cells it maps to (four cells when d = 2)

The joint distribution at each internal node is defined recursively

Construction of A*-tree (Cont’d)
  • Joint distribution at a node

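[Figure: a node whose four children hold the values X1, X2, X3, X4]

Y = (X1 + X2 + X3 + X4) / 4

Xi = Y(1 + Fi), for i = 1, …, 4

Parameters: the range of Fi is k, the distribution table has r entries, and l bits are used to represent each probability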
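
The relation between a node's average Y and its children Xi = Y(1 + Fi) can be illustrated with a minimal Python sketch. It handles a single realization of the four values only; the actual node stores a joint distribution over the factors Fi, which is not modeled here. The function names `encode_block` and `decode_block` are hypothetical and do not come from the slides.

```python
def encode_block(cells):
    """Return (y, factors) with y the mean of the child values and
    factors[i] = cells[i] / y - 1, so that cells[i] == y * (1 + factors[i])."""
    y = sum(cells) / len(cells)
    if y == 0:
        # Relative factors are undefined when the mean is zero (assumption:
        # the sketch simply rejects this case).
        raise ValueError("mean is zero; relative factors are undefined")
    factors = [x / y - 1 for x in cells]
    return y, factors


def decode_block(y, factors):
    """Recover the child values from the parent mean and the factors."""
    return [y * (1 + f) for f in factors]


cells = [8.0, 12.0, 10.0, 10.0]
y, factors = encode_block(cells)
restored = decode_block(y, factors)
```

Because Y is the mean of the Xi, the factors of one block always sum to zero (up to rounding), so only three of the four factors are free.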

Construction of A*-tree (Cont’d)
  • Extension of A*-tree
    • Uneven dimensional size
      • A dimension of size 2k+1 is partitioned into sizes k and k+1
      • The shorter dimension stops partitioning first, while partitioning of the longer dimension continues
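
The partitioning rule for uneven dimension sizes can be sketched as follows. This is an illustration under stated assumptions, not the presenter's implementation: extents are half-open intervals, a dimension stops splitting once it covers a single cell, and the names `split_extent` and `partition` are hypothetical.

```python
import itertools


def split_extent(lo, hi):
    """Split the half-open interval [lo, hi) in two.
    An odd length 2k+1 yields parts of length k and k+1."""
    mid = lo + (hi - lo) // 2
    return (lo, mid), (mid, hi)


def partition(extents):
    """Return the child regions of a region given as one extent per dimension.
    Dimensions that already cover a single cell are not split further."""
    choices = []
    for lo, hi in extents:
        if hi - lo <= 1:
            choices.append([(lo, hi)])                  # this dimension has stopped
        else:
            choices.append(list(split_extent(lo, hi)))  # still being halved
    return [list(combo) for combo in itertools.product(*choices)]


children = partition([(0, 5), (0, 2)])        # a 5 x 2 region: four children
grandchildren = partition([(0, 2), (0, 1)])   # second dimension has stopped
```

In the second call only the first dimension is split, so the node has two children instead of four.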
Construction of A*-tree (Cont’d)
  • Extension of A*-tree
    • Basic uncertainty blocks of arbitrary shapes
      • Intuitively, each cell is a basic uncertainty block, but this granularity may be too fine
      • The initial identification of uncertainty blocks is specified by the user and the application
Analysis of A*-tree
  • Natural mapping from A*-tree to Bayesian Network
Analysis of A*-tree (Cont’d)
  • How the A*-tree model expresses neighboring correlation
    • From the perspective of a random query, the average level at which cell correlation is encoded is low (efficient inference and accurate modeling)
Analysis of A*-tree (Cont’d)
  • Neighboring cells and clustering distance
    • Definition
Analysis of A*-tree (Cont’d)
  • CD (Clustering Distance)
    • For any query that may return q pairs of neighboring cells, consider the expected average CD

E.g., for a 1024 × 1024 array, h = 10, and E(avgCD) ≈ 1.01

Analysis of A*-tree (Cont’d)
  • Accuracy vs. Efficiency
    • Double “flip”
    • Polynomial time scan O(d*n)
    • Consider basic uncertainty block
Query Processing
  • Monte Carlo based query processing
    • Sampling

Q: SELECT AVG(brightness) FROM space_image WHERE Dis(x, y, z, 322, 108, 251) < 50
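
A minimal Python sketch of Monte Carlo query answering over a tree of this kind is given below. It rests on several assumptions that are not in the slides: the root average is a fixed number rather than a distribution, each node's distribution table is a list of (probability, factors) rows, and the spatial predicate of the query is replaced by an explicit list of selected cell indices. `Node`, `sample_cells`, and `monte_carlo_avg` are hypothetical names.

```python
import random


class Node:
    def __init__(self, factor_table, children=None):
        self.factor_table = factor_table  # list of (probability, (F1, ..., Fk))
        self.children = children          # None means the children are array cells


def sample_cells(node, y, rng):
    """Draw one joint sample of all cells under `node`, given its average y."""
    probs = [p for p, _ in node.factor_table]
    rows = [f for _, f in node.factor_table]
    factors = rng.choices(rows, weights=probs, k=1)[0]
    values = [y * (1 + f) for f in factors]     # Xi = Y(1 + Fi)
    if node.children is None:
        return values
    cells = []
    for child, value in zip(node.children, values):
        cells.extend(sample_cells(child, value, rng))
    return cells


def monte_carlo_avg(root, root_mean, selected, n_samples=1000, seed=0):
    """Estimate AVG over the selected cells by averaging over top-down samples."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        cells = sample_cells(root, root_mean, rng)
        picked = [cells[i] for i in selected]
        total += sum(picked) / len(picked)
    return total / n_samples


leaf = Node([(0.5, (-0.2, 0.2, 0.0, 0.0)), (0.5, (0.2, -0.2, 0.0, 0.0))])
estimate_certain = monte_carlo_avg(leaf, 10.0, selected=[2, 3])
estimate_uncertain = monte_carlo_avg(leaf, 10.0, selected=[0])
```

Each sample is drawn in a single top-down pass, and a child depends only on its parent's sampled value.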

Query Processing (Cont’d)
  • Compared with MRF
    • MRFs require sequential rounds of sampling
    • Each sampled node is computed from all the other nodes
Query Processing (Cont’d)
  • Other queries
    • COUNT, AVG and SUM
  • Minimum Set Cover
  • Built-in cell-count function
  • Effective query answering
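
One plausible reading of the bullets above is that an aggregate query is answered from a small set of tree nodes that exactly cover the query region, using each node's average and cell count. The sketch below illustrates that reading for a square 2-dimensional array whose side is a power of two. It is an assumption, not the method described in the presentation: `cover`, `aggregates`, and the `node_mean` lookup are hypothetical, and the node averages are computed from a plain grid of expected values instead of being read from a stored tree.

```python
def cover(x0, y0, size, q):
    """Return aligned square blocks (x, y, size) that tile the part of the
    half-open query rectangle q = (qx0, qy0, qx1, qy1) inside this block."""
    qx0, qy0, qx1, qy1 = q
    if x0 >= qx1 or y0 >= qy1 or x0 + size <= qx0 or y0 + size <= qy0:
        return []                     # block and query are disjoint
    if qx0 <= x0 and qy0 <= y0 and x0 + size <= qx1 and y0 + size <= qy1:
        return [(x0, y0, size)]       # block lies fully inside: use it whole
    half = size // 2
    blocks = []
    for dx in (0, half):
        for dy in (0, half):
            blocks += cover(x0 + dx, y0 + dy, half, q)
    return blocks


def aggregates(node_mean, n, q):
    """Return (COUNT, SUM, AVG) over q from per-block means and cell counts."""
    blocks = cover(0, 0, n, q)
    count = sum(s * s for _, _, s in blocks)
    total = sum(node_mean(x, y, s) * s * s for x, y, s in blocks)
    return count, total, total / count


# Toy 4 x 4 grid of expected cell values; grid[y][x] = x + 4 * y.
grid = [[x + 4 * y for x in range(4)] for y in range(4)]


def node_mean(x0, y0, size):
    """Stand-in for the average stored at the node covering this block."""
    values = [grid[y][x] for y in range(y0, y0 + size) for x in range(x0, x0 + size)]
    return sum(values) / len(values)


count, total, avg = aggregates(node_mean, 4, (0, 0, 2, 4))
```

Here the query covers the left half of the grid and is answered from two 2 × 2 blocks instead of eight individual cells.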
Experiments
  • Data set description
  • Evaluations
    • Accuracy of modeling the underlying joint distribution
    • Execution time
    • Aggregate query
    • Space cost
Experiments (Cont’d)
  • Aggregate query and space cost