1 / 12

Indexing in Spatial Databases and Query Processing

Indexing in Spatial Databases and Query Processing. Query Processing. Efficient algorithms to answer spatial queries Common Strategy - filter and refine Filter Step: Query Region overlaps with MBRs (minimum bounding rectangles) of B,C and D

Download Presentation

Indexing in Spatial Databases and Query Processing

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Indexing in Spatial Databases and Query Processing

  2. Query Processing • Efficient algorithms to answer spatial queries • Common Strategy - filter and refine • Filter Step: Query Region overlaps with MBRs (minimum • bounding rectangles) of B,C and D • Refine Step: Query Region overlaps with B and C • - For reducing computation time: • - It is easier (computationally cheaper) to compute the intersection between • a query region and a rectangle rather than between the query region • and an arbitrary, irregular shaped, spatial object. Fig 1.8

  3. File Organization and Indices • DBMS have been designed to handle very large amounts of data • The fundamental difference in how algorithms are designed in a • GIS data analysis vs a database environment • A difference between GIS and SDBMS assumptions • GIS algorithms: • Main focus is minimizing the computation time • Assuming that entire dataset is residing in main memory • SDBMS: dataset is on secondary storage e.g disk • Main focus is on I/O time • I/O time is the time required to transfer data from a disk to the main • memory • finite main memory infinite disk • SDBMS uses spatial indices to efficiently search large spatial datasets in DB

  4. File Organization and Indices • programmer’s view-point: computation time • DBMS designer’s view-point: I/O time • CPU-bound • I/O bound • Computation time • I/O time

  5. Indexing • Consider secondary storage as a book • The smallest unit of transfer between the disk and main memory is a page. And records of a tble are like structured lines of text on the page • At anytime some pages reside in main meory, some at the disk • To accelerate the search DB uses index • DBMS can fetch all the pages spanned by a table and scan them line by line until the record is found • Or search in the index and for a desired key word and go directly to the page specified in the index. • Index entries in a book are sorted in alphabetical order • Similarly if the index is built on numbers, like the social security number, then they can be numerically numbered.

  6. Spatial Indexing: Search Data-Structures • Choice for spatial indexing: for query optimization • B-tree is a hierarchical collection of ranges of linear keys, e.g. numbers • B-tree index is used for efficient search of traditional data • Crucially depends on the existence of an order in the indexing field. • See the difference between binary tree and B-tree • Each node represents page B-tree R- tree

  7. Spatial Indexing: Search Data-Structures • Which nodes are index pages and which nodes are data pages • If a page holds m keys, then the height ? • O(logm n) • What is n? • WHY B-TREE IS NOT APPROPRIATE FOR INDEXING SPARIAL DATA

  8. Minimum Bounding Rectangle Study Area Minimum Bounding Rectangles

  9. R-tree for Spatial Data Indexing • Because natural order does not exist in multidimensional space, the B-tree can not be used directly to create an index of spatial objects. • R-tree data structure was one of the first index structure specifically designed to handle multidimensional extended objects • R-tree provides better search performance yet! • R-tree is a hierarchical collection of rectangles

  10. R-tree: Example Examples of R – Tree Index of polygons

  11. Query Optimization • Query Optimization • A spatial operation can be processed using different strategies • Computation cost of each strategy depends on many parameters • Example Query: • Find the names of all female senators who own a business • SELECT S.name FROM Senator S, Business B • WHERE S.soc-sec = B.soc-sec AND S.gender = ‘Female’ • Here, there is a selection query and a join query • Optimization • Process (S.gender = ‘Female’) before (S.soc-sec = B.soc-sec ) • Do not use index for processing (S.gender = ‘Female’)

  12. Multi-scan Query Example • Find all senators who serve a district of area greater than 300 • square miles and who own business within the district. • Spatial join example • SELECT S.name FROM Senator S, Business B • WHERE S.district.Area() > 300 AND Within(B.location, S.district) • Here, composition of two sub-queries, a range query and a spatial join • Which one is first?

More Related