
Introduction to Spatial Database Research


Presentation Transcript


  1. Introduction to Spatial Database Research. Donghui Zhang, CCIS, Northeastern University.

  2. What is a spatial database?
     • A database system that is optimized to store and query spatial objects:
       • Point: a hotel, a car
       • Line: a road segment
       • Polygon: landmarks, layout of VLSI
     [Figures: Road Network; Satellite Image; VLSI Layout]

  3. Are spatial databases useful?
     • Geographical Information Systems
       • e.g. data: road network and places of interest.
       • e.g. usage: driving directions, emergency calls, standalone applications.
     • Environmental Systems
       • e.g. data: land cover, climate, rainfall, and forest fire.
       • e.g. usage: find the total rainfall.
     • Corporate Decision-Support Systems
       • e.g. data: store locations and customer locations.
       • e.g. usage: determine the optimal location for a new store.
     • Battlefield Soldier Monitoring Systems
       • e.g. data: locations of soldiers (with or without medical equipment).
       • e.g. usage: for each soldier with medical equipment, monitor the soldiers who may need help from that soldier.

  4. Shortest-Path Query and Fastest-Path Query
     [Figure: MapQuest.com]

  5. NN (nearest-neighbor) query
     • Driving directions as you go.
     • Find the nearest Wal-Mart or hospital.

  6. Range query
     [Figure: ArcGIS 9.2, ESRI]
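A range query returns every object that lies inside a query region. The following is a minimal brute-force sketch, assuming points stored as named tuples and an axis-aligned query rectangle. The place names and coordinates are made up for illustration, and a real spatial database would normally answer this through a spatial index rather than a full scan.

```python
from typing import NamedTuple


class Point(NamedTuple):
    name: str
    x: float
    y: float


def range_query(points, x_min, y_min, x_max, y_max):
    """Return the points inside the closed rectangle [x_min, x_max] x [y_min, y_max]."""
    return [p for p in points
            if x_min <= p.x <= x_max and y_min <= p.y <= y_max]


# Hypothetical places of interest (illustrative coordinates only).
places = [
    Point("hotel", 2.0, 3.0),
    Point("hospital", 5.0, 1.0),
    Point("store", 8.0, 7.0),
]

inside = range_query(places, 0.0, 0.0, 6.0, 4.0)
print([p.name for p in inside])
```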


  8. Aggregation query
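An aggregation query returns a summary value (such as a sum or a count) over the objects in a region, for instance the total rainfall mentioned in the environmental example. A minimal sketch, assuming each reading is a hypothetical `(x, y, rainfall_mm)` tuple and the region is an axis-aligned rectangle; the readings are invented for illustration.

```python
def total_in_region(readings, x_min, y_min, x_max, y_max):
    """Sum the values of all readings located inside the closed rectangle."""
    return sum(value for (x, y, value) in readings
               if x_min <= x <= x_max and y_min <= y <= y_max)


# Hypothetical rain-gauge readings as (x, y, rainfall_mm).
readings = [(1.0, 1.0, 12.5), (2.0, 4.0, 7.5), (9.0, 9.0, 30.0)]

total = total_in_region(readings, 0.0, 0.0, 5.0, 5.0)
print(total)
```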


  10. Optimal Location query


  12. NN(Bob) = George
      [Figure: points labeled George, John, Bob, Bill, Mike]

  13. RNN query
      • Who will seek help from me? RNN(Bob) = {John, Mike}
      [Figure: points labeled George, John, Bob, Bill, Mike]
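The difference between the two queries can be sketched in a few lines: NN(q) is the object closest to q, while RNN(q) is the set of objects that have q as their own nearest neighbor. The coordinates below are hypothetical, chosen only so that the results agree with the slides; the slides themselves give no coordinates. The scan is brute force, not an indexed algorithm.

```python
import math

# Hypothetical coordinates, chosen only so the results match the slides.
people = {
    "Bob": (0.0, 0.0),
    "George": (1.0, 0.0),
    "Bill": (1.5, 0.0),
    "John": (0.0, 2.0),
    "Mike": (-2.0, -1.0),
}


def nn(name, objects):
    """Nearest neighbor of `name`: the other object at minimum Euclidean distance."""
    return min((o for o in objects if o != name),
               key=lambda o: math.dist(objects[name], objects[o]))


def rnn(name, objects):
    """Reverse nearest neighbors of `name`: the objects whose NN is `name`."""
    return {o for o in objects if o != name and nn(o, objects) == name}


print(nn("Bob", people))
print(sorted(rnn("Bob", people)))
```

Note that the relation is not symmetric: with these coordinates George is Bob's nearest neighbor, but George's own nearest neighbor is Bill, so George is not in RNN(Bob).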

  14. And beyond the “space” …
      • 2004 NBA dataset*: each player has 17 attributes.
      • “Spatial data”: an object is a point in a 17-dimensional space.
      • Who are the best players?
        • i.e. those not “dominated” by any other player. This is a skyline query.
      * www.databaseBasketball.com
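The notion of "dominated" can be made concrete with a short sketch. It assumes the usual definition: one object dominates another if it is at least as good in every attribute and strictly better in at least one. Larger values are taken to be better here, as for per-game statistics. The four players and their two statistics are invented for illustration and are not taken from the NBA dataset.

```python
def dominates(a, b):
    """True if a is at least as good as b everywhere and strictly better somewhere.

    Assumes larger values are better (e.g. points or rebounds per game).
    """
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def skyline(objects):
    """Names of the objects that no other object dominates (quadratic scan)."""
    return [n for n, v in objects.items()
            if not any(dominates(w, v) for m, w in objects.items() if m != n)]


# Hypothetical players with two made-up statistics: (points, rebounds).
players = {"A": (25, 5), "B": (20, 10), "C": (18, 4), "D": (10, 12)}

best = skyline(players)
print(best)
```

Player C is excluded because A is better in both statistics; A, B and D each win on some trade-off, so none of them dominates another.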


  20. Subspace Skyline Queries
      • In an online skyline processing system, users may ask skyline queries on any subspace, i.e. any subset of the attributes.
      • Different subspace skylines can be very different!
      [Figures: skyline in subspace (u1, u3); skyline in subspace (u2, u3)]
      Objects in 4 dimensions:
            u1  u2  u3  u4
        t1   3   4   2   5
        t2   4   6   7   2
        t3   9   7   5   6
        t4   4   3   6   1
        t5   2   2   3   1
        t6   6   1   1   3
        t7   1   3   4   1
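As a rough illustration of how much two subspace skylines can differ, the sketch below computes them by brute force on the seven objects of this slide. It assumes that smaller values are preferred, which is the convention implied by the skycube table on slide 23, and it uses 0-based dimension indices (0 for u1, 2 for u3, and so on). The function names are illustrative only.

```python
# The seven objects from the slide; columns are u1..u4.
DATA = {
    "t1": (3, 4, 2, 5),
    "t2": (4, 6, 7, 2),
    "t3": (9, 7, 5, 6),
    "t4": (4, 3, 6, 1),
    "t5": (2, 2, 3, 1),
    "t6": (6, 1, 1, 3),
    "t7": (1, 3, 4, 1),
}


def dominates(a, b, dims):
    """a dominates b on dims: no worse everywhere, strictly better somewhere.

    Assumes smaller values are better.
    """
    return (all(a[d] <= b[d] for d in dims)
            and any(a[d] < b[d] for d in dims))


def subspace_skyline(data, dims):
    """Sorted names of the objects not dominated on the given dimensions."""
    return sorted(n for n, v in data.items()
                  if not any(dominates(w, v, dims)
                             for m, w in data.items() if m != n))


print(subspace_skyline(DATA, (0, 2)))  # subspace (u1, u3)
print(subspace_skyline(DATA, (1, 2)))  # subspace (u2, u3)
```

Under these assumptions the skyline in (u1, u3) contains four objects (t1, t5, t6, t7), whereas in (u2, u3) object t6 alone dominates everything else.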

  21. Straightforward Solutions
      • On-the-fly computation
        • Slow query processing
      • Pre-compute and store all subspace skylines: high update costs
        • No update support
        • Waste of storage

  22. The Compressed Skycube [XZ06]
      • Compact storage
        • Represent all skylines in a very concise way, by preserving only the essential information of subspace skylines.
      • Efficient query support
        • Efficiently answer arbitrary subspace skyline queries without accessing the original data.
      • Efficient update scheme
        • Avoid unnecessary data access and subspace skyline computation upon updates.

  23. The complete pre-computation (Skycube)
      Objects:
            u1  u2  u3  u4
        t1   3   4   2   5
        t2   4   6   7   2
        t3   9   7   5   6
        t4   4   3   6   1
        t5   2   2   3   1
        t6   6   1   1   3
        t7   1   3   4   1
        t8   6   5   3   8
        t9   2   2   3   7
      Skycube:
        Subspace          Skyline
        u1                t7
        u2                t6
        u3                t6
        u4                t4, t5, t7
        u1, u2            t5, t6, t7, t9
        u1, u3            t1, t5, t6, t7, t9
        u1, u4            t7
        u2, u3            t6
        u2, u4            t5, t6
        u3, u4            t5, t6
        u1, u2, u3        t1, t5, t6, t7, t9
        u1, u2, u4        t5, t6, t7
        u1, u3, u4        t1, t5, t6, t7
        u2, u3, u4        t5, t6
        u1, u2, u3, u4    t1, t5, t6, t7
      • Contains many duplicates, e.g. t6 appears 12 times.
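The complete pre-computation can be reproduced with a brute-force sketch: enumerate all 15 non-empty subspaces of the four dimensions and compute the skyline of each. The sketch assumes smaller values are better and that objects with equal values do not dominate each other, which is what the table on this slide implies. Dimension indices are 0-based and the function names are illustrative.

```python
from itertools import combinations

# The nine objects from the slide; columns are u1..u4.
DATA = {
    "t1": (3, 4, 2, 5), "t2": (4, 6, 7, 2), "t3": (9, 7, 5, 6),
    "t4": (4, 3, 6, 1), "t5": (2, 2, 3, 1), "t6": (6, 1, 1, 3),
    "t7": (1, 3, 4, 1), "t8": (6, 5, 3, 8), "t9": (2, 2, 3, 7),
}


def dominates(a, b, dims):
    """a dominates b on dims (smaller is better, strict in at least one dim)."""
    return (all(a[d] <= b[d] for d in dims)
            and any(a[d] < b[d] for d in dims))


def skyline(data, dims):
    """Set of object names not dominated on the given dimensions."""
    return {n for n, v in data.items()
            if not any(dominates(w, v, dims)
                       for m, w in data.items() if m != n)}


def skycube(data, n_dims=4):
    """Map every non-empty subspace (tuple of dimension indices) to its skyline."""
    return {dims: skyline(data, dims)
            for k in range(1, n_dims + 1)
            for dims in combinations(range(n_dims), k)}


cube = skycube(DATA)
print(len(cube))                                  # number of subspaces
print(sum("t6" in s for s in cube.values()))      # how often t6 is stored
```

The two printed numbers are 15 and 12, matching the slide's remark that t6 is duplicated across 12 subspace skylines.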

  24. Minimum Subspace (mss)
      • Object t6 appears in the skylines of 12 subspaces.
      • The number of minimum subspaces of t6 is only 2.
      Minimum subspaces:
        Object   Minimum subspaces
        t1       (u1, u3)
        t4       (u4)
        t5       (u4), (u1, u2), (u1, u3)
        t6       (u2), (u3)
        t7       (u1), (u4)
        t9       (u1, u2), (u1, u3)
      [Table: the Subspace / Skyline table of slide 23, repeated]
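The minimum subspaces can be derived mechanically from the subspace skylines. The sketch below assumes the following reading of the slide: U is a minimum subspace of t if t is in the skyline of U but in the skyline of no proper non-empty subset of U. With smaller values preferred and 0-based dimension indices, this reproduces the table above. The helper names are illustrative.

```python
from itertools import combinations

# The nine objects from slide 23; columns are u1..u4.
DATA = {
    "t1": (3, 4, 2, 5), "t2": (4, 6, 7, 2), "t3": (9, 7, 5, 6),
    "t4": (4, 3, 6, 1), "t5": (2, 2, 3, 1), "t6": (6, 1, 1, 3),
    "t7": (1, 3, 4, 1), "t8": (6, 5, 3, 8), "t9": (2, 2, 3, 7),
}


def dominates(a, b, dims):
    """a dominates b on dims (smaller is better, strict in at least one dim)."""
    return (all(a[d] <= b[d] for d in dims)
            and any(a[d] < b[d] for d in dims))


def skyline(data, dims):
    """Set of object names not dominated on the given dimensions."""
    return {n for n, v in data.items()
            if not any(dominates(w, v, dims)
                       for m, w in data.items() if m != n)}


def minimum_subspaces(data, n_dims=4):
    """Map each skyline object to the list of its minimum subspaces."""
    mss = {}
    # Subspaces are visited in order of increasing size, so every smaller
    # minimum subspace of an object is recorded before its supersets are seen.
    for k in range(1, n_dims + 1):
        for dims in combinations(range(n_dims), k):
            for name in skyline(data, dims):
                found = mss.setdefault(name, [])
                if not any(set(v) < set(dims) for v in found):
                    found.append(dims)
    return mss


mss = minimum_subspaces(DATA)
for name in sorted(mss):
    print(name, mss[name])
```

Objects t2, t3 and t8 never appear in the result because they belong to no subspace skyline.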

  25. The Compressed Skycube (CSC)
      • Definition: The Compressed Skycube (CSC) consists of the non-empty subspaces U, such that an object t is stored in a subspace U if and only if U is a minimum subspace of t, i.e. U ∈ mss(t).
      CSC:
        Subspace   Objects stored
        u1         t7
        u2         t6
        u3         t6
        u4         t4, t5, t7
        u1, u2     t5, t9
        u1, u3     t1, t5, t9
      Minimum subspaces:
        Object   Minimum subspaces
        t1       (u1, u3)
        t4       (u4)
        t5       (u4), (u1, u2), (u1, u3)
        t6       (u2), (u3)
        t7       (u1), (u4)
        t9       (u1, u2), (u1, u3)

  26. Querying CSC
      • Example: find the skyline in subspace (u2, u3, u4). The answer, {t5, t6}, is found by visiting only the CSC, not the whole dataset.
      • Theorem 1: Given a query space Uq and an object t, if Ui ⊈ Uq for every subspace Ui in mss(t), then t is not in the skyline of Uq.
        • Search only the subspaces that are subsets of the query space.
      • Theorem 2 (Local Comparison): To check a candidate t in a subspace V ⊆ Uq, we only need to compare t with the objects within the same subspace.
        • Compare candidates within their own subspaces. Output is non-blocking!
      CSC:
        Subspace   Objects stored
        u1         t7
        u2         t6
        u3         t6
        u4         t4, t5, t7
        u1, u2     t5, t9
        u1, u3     t1, t5, t9
      Objects stored in the CSC:
            u1  u2  u3  u4
        t1   3   4   2   5
        t4   4   3   6   1
        t5   2   2   3   1
        t6   6   1   1   3
        t7   1   3   4   1
        t9   2   2   3   7
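The two theorems suggest a simple query procedure, sketched below under stated assumptions: the CSC and the object values are copied from this slide, smaller values are better, and dimension indices are 0-based. The function `query_csc` is an illustrative name, not the authors' implementation. It visits only the stored subspaces contained in the query space (Theorem 1) and compares each candidate only against the objects stored in the same subspace, using the query dimensions (Theorem 2).

```python
# CSC copied from the slide: subspace (0-based dimension indices) -> stored objects.
CSC = {
    (0,): ["t7"],
    (1,): ["t6"],
    (2,): ["t6"],
    (3,): ["t4", "t5", "t7"],
    (0, 1): ["t5", "t9"],
    (0, 2): ["t1", "t5", "t9"],
}

# Values of the objects stored in the CSC; columns are u1..u4.
VALUES = {
    "t1": (3, 4, 2, 5), "t4": (4, 3, 6, 1), "t5": (2, 2, 3, 1),
    "t6": (6, 1, 1, 3), "t7": (1, 3, 4, 1), "t9": (2, 2, 3, 7),
}


def dominates(a, b, dims):
    """a dominates b on dims (smaller is better, strict in at least one dim)."""
    return (all(a[d] <= b[d] for d in dims)
            and any(a[d] < b[d] for d in dims))


def query_csc(csc, values, query_dims):
    """Skyline of the query subspace, computed from the CSC alone."""
    query = set(query_dims)
    result = set()
    for dims, members in csc.items():
        # Theorem 1: only subspaces contained in the query space can contribute.
        if not set(dims) <= query:
            continue
        for t in members:
            # Theorem 2: compare t only with objects stored in the same subspace,
            # but measure dominance on the query dimensions.
            if not any(dominates(values[s], values[t], query_dims)
                       for s in members if s != t):
                result.add(t)
    return result


print(sorted(query_csc(CSC, VALUES, (1, 2, 3))))  # query space (u2, u3, u4)
```

For the query space (u2, u3, u4) only the subspaces u2, u3 and u4 are visited; within u4, object t5 eliminates t4 and t7, which leaves t5 and t6. Because each object is decided inside its own subspace, results can be emitted as soon as that subspace has been processed.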

  27. Updating CSC
      • sky(full): the skyline over all dimensions.
      • t: the object to be updated.
      • Theorem: upon an update, there is no need to access the original data if t ∉ sky(full).
      • Efficient algorithms exist for both cases.

  28. Performance
      • (Full-space) dimensionality: 6
      • Object cardinality: [100K, 500K]
      • Distribution: uniform
      [Figures: Update efficiency; Storage efficiency; Query efficiency]

  29. Summary
      • Spatial databases have many practical applications.
      • Spatial database research aims to design efficient algorithms for various queries.
      • The talk mentioned a few (range query, aggregation query, NN query, RNN query, optimal-location query, fastest-path query, and skyline query).
      • There are many more: this is an ongoing research field.
