
An Interactive Framework for Raster Data Spatial Joins

An Interactive Framework for Raster Data Spatial Joins. Wan Bae (Computer Science, University of Denver) Petr Vojtěchovský (Mathematics, University of Denver) Shayma Alkobaisi (Computer Science, University of Denver) Scott T. Leutenegger (Computer Science, University of Denver)



Presentation Transcript


  1. An Interactive Framework for Raster Data Spatial Joins Wan Bae (Computer Science, University of Denver) Petr Vojtěchovský (Mathematics, University of Denver) Shayma Alkobaisi (Computer Science, University of Denver) Scott T. Leutenegger (Computer Science, University of Denver) Seon Ho Kim (Computer Science, University of Denver)

  2. Outline • Introduction • Issues and Problems • Probabilistic Joins • Sampling Joins • Interactive Framework • Experiments • Conclusion

  3. Geographic Information Systems • Integration of georeferenced data • Spatial queries • Complex spatial data analysis and modeling for decision support • Collect • Store • Retrieve [Figure: data sources feeding a GIS, which serves users through web applications]

  4. Raster Data Model • A great portion of georeferenced data • Simple data structure but greater storage space • Continuously changing data (a) Satellite Image (b) Raster Model

  5. Continuously Changing Data

  6. Raster Data Spatial Joins • “Find the regions where rainfall rate is greater than 1.0 and wind speed is greater than 50” [Figure: panels (a) and (b)]
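The query on this slide can be read as a cell-by-cell join of two aligned rasters. The sketch below is a minimal illustration of that exact join, not the authors' implementation; the function name `raster_join` and the 4 x 4 sample values are made up for the example, and both rasters are assumed to cover the same area at the same resolution.

```python
# Hypothetical 4 x 4 rasters over the same area; all values are invented.
rainfall = [
    [0.2, 1.4, 1.8, 0.9],
    [0.5, 1.2, 2.1, 1.1],
    [0.0, 0.3, 1.5, 0.7],
    [0.1, 0.0, 0.4, 0.2],
]
wind_speed = [
    [10, 55, 40, 60],
    [20, 65, 70, 45],
    [30, 35, 52, 80],
    [15, 25, 58, 90],
]


def raster_join(r, s, r_pred, s_pred):
    """Return the (row, col) of every cell where both predicates hold."""
    if len(r) != len(s) or any(len(a) != len(b) for a, b in zip(r, s)):
        raise ValueError("rasters must have identical dimensions")
    return [
        (i, j)
        for i, (row_r, row_s) in enumerate(zip(r, s))
        for j, (value_r, value_s) in enumerate(zip(row_r, row_s))
        if r_pred(value_r) and s_pred(value_s)
    ]


# Cells where rainfall rate > 1.0 and wind speed > 50.
matches = raster_join(rainfall, wind_speed, lambda v: v > 1.0, lambda v: v > 50)
print(matches)
```

Every cell of both rasters is visited, which is the cost the approximate joins in the rest of the talk try to avoid.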

  7. Issues for User-driven Data Exploration • Fast query response time • Exact answers are time-consuming because the data sets are large • GIS decision support queries are time-intensive • Lack of optimization and approximation techniques for raster data joins • Interactive query processing • Lack of interactivity in traditional GIS • No user control over query processing • Visualization increases the utility of the GIS

  8. Our Approach For faster and more effective decision support queries: • Fast approximation of query results 1. probabilistic join 2. sampling join • Visualize intermediate results 1. “big picture” of query result 2. partial result: non-blocking joins • Allow users to control query processing

  9. Our Approximations 1. What is the probability that R joins S? R (8/16) and S (9/16): they must join! 2. Can we use the result of a subset of data cell joins for the final answer? 1 join / 2 cells, so ? / 16 cells
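The first question has a counting argument behind it: if R occupies 8 of 16 cells and S occupies 9, then 8 + 9 > 16, so at least one cell must hold both. The sketch below makes that explicit and also gives a join probability for the case where overlap is not forced. The probability model (S's cells placed uniformly at random, independently of R) is an assumption made for this illustration and is not stated on the slide; the function names are hypothetical.

```python
from math import comb


def min_overlap(n, r, s):
    """Fewest cells R and S must share when they occupy r and s of n cells."""
    return max(0, r + s - n)


def join_probability(n, r, s):
    """Probability that R and S share at least one cell.

    Assumes S's s cells are a uniformly random subset of the n cells,
    independent of R's r cells (an assumption for this sketch).
    """
    if not (0 <= r <= n and 0 <= s <= n):
        raise ValueError("r and s must lie between 0 and n")
    # comb(n - r, s) counts placements of S that avoid R entirely;
    # it is 0 when s > n - r, which makes the probability exactly 1.
    return 1 - comb(n - r, s) / comb(n, s)


print(min_overlap(16, 8, 9))       # overlap forced by counting
print(join_probability(16, 8, 9))  # therefore certain
print(join_probability(16, 2, 3))  # sparse case, overlap not forced
```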

  10. Augmented Quad-trees • Both data sets are indexed using quad-trees [Figure: quad-tree decomposition into NW, NE, SW and SE quadrants]
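The transcript does not say what the quad-tree nodes are augmented with. One plausible reading, assumed here purely for illustration, is that each node stores how many cells beneath it satisfy the query predicate, so that fractions such as 8/16 are available at every level without visiting the leaves. `QuadNode` and `build_quadtree` are hypothetical names, and the grid is assumed to be square with a power-of-two side.

```python
from dataclasses import dataclass, field


@dataclass
class QuadNode:
    size: int    # side length of the square region this node covers
    count: int   # number of qualifying cells inside the region
    children: dict = field(default_factory=dict)  # keys: NW, NE, SW, SE


def build_quadtree(grid, row=0, col=0, size=None):
    """Build a count-augmented quad-tree over a 2^k x 2^k grid of 0/1 flags."""
    if size is None:
        size = len(grid)
    if size == 1:
        return QuadNode(1, int(bool(grid[row][col])))
    half = size // 2
    children = {
        "NW": build_quadtree(grid, row, col, half),
        "NE": build_quadtree(grid, row, col + half, half),
        "SW": build_quadtree(grid, row + half, col, half),
        "SE": build_quadtree(grid, row + half, col + half, half),
    }
    return QuadNode(size, sum(c.count for c in children.values()), children)


# Hypothetical predicate mask: 1 marks a cell that satisfies the query.
mask = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 1, 1, 1],
]
root = build_quadtree(mask)
print(root.count, {name: child.count for name, child in root.children.items()})
```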

  11. Join Probability • Let X = [0, 1], and let m and n be randomly chosen intervals in X of lengths a and b. What is the probability p that m ∩ n ≠ ∅? Join probability p(m ∩ n ≠ ∅) = ?

  12. 1-d Join Probability [Figure: intervals m (length a, endpoints a1 and a2) and n (length b, endpoints b1 and b2) on [0, 1], with the overlapping part marked]
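The transcript does not preserve the formula from this slide, so the following is a reconstruction under a stated assumption rather than the authors' result. Assume the left endpoint of m is uniform on [0, 1-a] and the left endpoint of n is uniform on [0, 1-b], independently. The intervals miss each other only when one lies entirely to the left of the other; for a + b < 1 each of those two cases has area (1-a-b)^2 / 2 in the (x, y) plane of left endpoints, giving p = 1 - (1-a-b)^2 / ((1-a)(1-b)), and p = 1 when a + b >= 1. The sketch checks that closed form against a simulation; both function names are hypothetical.

```python
import random


def join_probability_1d(a, b):
    """Closed-form overlap probability for intervals of lengths a and b in [0, 1].

    Assumes independent, uniformly placed left endpoints (see lead-in).
    """
    if not (0 < a < 1 and 0 < b < 1):
        raise ValueError("lengths must lie strictly between 0 and 1")
    gap = 1 - a - b
    if gap <= 0:
        return 1.0  # the intervals are too long to avoid each other
    return 1 - gap * gap / ((1 - a) * (1 - b))


def simulate_join_probability_1d(a, b, trials=100_000, seed=0):
    """Monte Carlo estimate of the same probability."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        x = rng.uniform(0, 1 - a)  # m = [x, x + a]
        y = rng.uniform(0, 1 - b)  # n = [y, y + b]
        if x <= y + b and y <= x + a:
            hits += 1
    return hits / trials


print(join_probability_1d(0.25, 0.25))
print(simulate_join_probability_1d(0.25, 0.25))
```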

  13. 2-d Join Probability [Figure: regions m and n in the unit square, labeled a, a1, a2 and b, b1, b2]

  14. Look-up table for 2-d Join Probability

  15. Probabilistic Join (PJ) [Figure: two join probabilities p( , ), each taking a pair of image regions as arguments]

  16. Probabilistic Join Result (a) data set Q (65536 x 65536) (b) data set S (65536 x 65536) (c) 2nd-level joins (d) 3rd-level joins (e) 4th-level joins

  17. Incremental Stratified Sampling Join (ISSJ) • Utilize a stratified random sampling technique on the quad-trees of the two data sets R and S • Data randomization: acceptance/rejection method 1. Sampling step: sample data from the outer data set R 2. Spatial joining step: join each sample with the corresponding data cell of the inner data set S 3. Refining step: update running estimates and confidence intervals 4. Visualization: display partial results (actual join results)
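The four steps can be sketched as a generator that yields a refined estimate after every round. This is a simplified illustration, not the authors' algorithm: strata are taken to be the four equal quadrants of the raster rather than quad-tree nodes, a per-stratum shuffle stands in for the acceptance/rejection method named on the slide, and confidence intervals are left out. All names are hypothetical.

```python
import random


def strata_cells(n):
    """Split an n x n grid (n even) into four quadrant strata of (row, col) cells."""
    half = n // 2
    strata = {}
    for i in range(n):
        for j in range(n):
            strata.setdefault((i // half, j // half), []).append((i, j))
    return list(strata.values())


def incremental_sampling_join(r, s, r_pred, s_pred, rounds, seed=0):
    """Yield (cells_sampled, estimated_join_fraction, partial_result) per round."""
    rng = random.Random(seed)
    strata = strata_cells(len(r))
    for cells in strata:
        rng.shuffle(cells)  # random order, sampling without replacement
    sampled = hits = 0
    partial = []
    for k in range(rounds):
        for cells in strata:
            if k >= len(cells):
                continue  # this stratum is exhausted
            i, j = cells[k]                            # 1. sampling step on R
            sampled += 1
            if r_pred(r[i][j]) and s_pred(s[i][j]):    # 2. join with S
                hits += 1
                partial.append((i, j))
        # 3. refining step: running estimate; 4. partial result for display
        yield sampled, hits / sampled, list(partial)


# Hypothetical 0/1 rasters; a cell joins when both hold a 1.
R = [[1, 1, 0, 0], [1, 0, 0, 0], [0, 0, 1, 1], [0, 1, 1, 1]]
S = [[1, 0, 0, 0], [1, 1, 0, 0], [0, 0, 0, 1], [0, 1, 1, 1]]
for sampled, estimate, partial in incremental_sampling_join(
    R, S, lambda v: v == 1, lambda v: v == 1, rounds=4
):
    print(sampled, round(estimate, 3), partial)
```

Because the strata have equal size and each round draws one cell from every stratum, the plain ratio of hits to samples serves as the stratified estimate here; unequal strata would need per-stratum weights.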

  18. Stratified Random Sampling [Figure: strata ST1, ST2, ST3 and ST4, shown with the values 1, 2, 0 and 2]

  19. Estimates and Confidence Interval • Population proportion: the fraction of the population that has a particular attribute of interest • Estimated value: the statistic computed from sample information using the population proportion • Confidence interval: an interval that estimates a population parameter within a range of possible values at a specified probability • Confidence level: the specified probability
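The definitions above can be made concrete with the textbook normal-approximation interval for a proportion, p̂ ± z·sqrt(p̂(1 - p̂)/n). The slides do not say which interval the authors use, so this is an assumed, simplified choice: it ignores stratification and the finite-population correction, and the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def proportion_confidence_interval(hits, n, level=0.95):
    """Return (estimate, lower, upper) for a sample proportion.

    Uses the normal approximation, which is an assumption of this sketch.
    """
    if n <= 0 or not 0 <= hits <= n:
        raise ValueError("need n > 0 and 0 <= hits <= n")
    if not 0 < level < 1:
        raise ValueError("level must lie strictly between 0 and 1")
    p_hat = hits / n
    # Two-sided critical value, about 1.96 for a 95% confidence level.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)


estimate, lower, upper = proportion_confidence_interval(hits=50, n=100)
print(estimate, lower, upper)
```

The half-width shrinks in proportion to 1/sqrt(n), which is why an incremental join can stop early once the interval is narrow enough for the user.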

  20. Incremental Sampling Join Result (a) Estimated result (b) Partial result • 10% done
      state  airports  confidence  interval
      IA     22        95          0.05
      NE     19        95          0.05
      WI     15        95          0.05
      CO     13        95          0.05
      KS     11        95          0.05
      MI     8         95          0.05

  21. Interactive Join Framework

  22. Experiments • PJ and ISSJ compared to the full quad-tree join • Confidence level set to 95% in ISSJ • Varied buffer size and data set size • Data sets: • Synthetic: UE, EU, UU (65536 x 65536 and 262144 x 262144) • Real: 6 mineral resources data sets for each of the states AZ, CO, OR and WY from the U.S. Geological Survey (65536 x 65536)

  23. Actual joins vs. 2-d PJ

  24. Accuracy of Estimates of ISSJ [Figure: estimates vs. exact value for real data sets; axis: number of processed cells]

  25. Time for Confidence Interval of ISSJ [Figure: confidence interval and I/Os for real data sets; series: sampling join, full quad-tree join]

  26. ISSJ vs. PJ vs. Actual joins (a) ISSJ w/10% CI (b) ISSJ w/5% CI (c) PJ (d) Actual join

  27. Time for Confidence Intervals [Figure: I/Os of PJ, ISSJ and the full quad-tree join for Colorado]

  28. Conclusion • A novel spatial join, Probabilistic Join, that gives a “big picture” visualization of the query answer for raster data joins • An interactive raster spatial join algorithm, Incremental Refining Spatial Join, that gives estimated query answers for raster data joins, bounded by confidence intervals

  29. Thank you!
