# Temple University – CIS Dept. CIS616– Principles of Data Management - PowerPoint PPT Presentation Download Presentation Temple University – CIS Dept. CIS616– Principles of Data Management

Temple University – CIS Dept. CIS616– Principles of Data Management Download Presentation ## Temple University – CIS Dept. CIS616– Principles of Data Management

- - - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - - -
##### Presentation Transcript

1. Temple University – CIS Dept.CIS616– Principles of Data Management V. Megalooikonomou Spatial Access Methods (SAMs) (based on notes by Silberchatz,Korth, and Sudarshan and notes by C. Faloutsos at CMU)

2. General Overview • Multimedia Indexing • Spatial Access Methods (SAMs) • k-d trees • Point Quadtrees • MX-Quadtree • z-ordering • R-trees

3. SAMs - Detailed outline • spatial access methods • problem dfn • k-d trees • point quadtrees • MX-quadtrees • z-ordering • R-trees

4. Spatial Access Methods - problem • Given a collection of geometric objects (points, lines, polygons, ...) • organize them on disk, to answer spatial queries (like??)

5. Spatial Access Methods - problem • Given a collection of geometric objects (points, lines, polygons, ...) • organize them on disk, to answer • point queries • range queries • k-nn queries • spatial joins (‘all pairs’ queries)

6. Spatial Access Methods - problem • Given a collection of geometric objects (points, lines, polygons, ...) • organize them on disk, to answer • point queries • range queries • k-nn queries • spatial joins (‘all pairs’ queries)

7. Spatial Access Methods - problem • Given a collection of geometric objects (points, lines, polygons, ...) • organize them on disk, to answer • point queries • range queries • k-nn queries • spatial joins (‘all pairs’ queries)

8. Spatial Access Methods - problem • Given a collection of geometric objects (points, lines, polygons, ...) • organize them on disk, to answer • point queries • range queries • k-nn queries • spatial joins (‘all pairs’ queries)

9. Spatial Access Methods - problem • Given a collection of geometric objects (points, lines, polygons, ...) • organize them on disk, to answer • point queries • range queries • k-nn queries • spatial joins (‘all pairs’ within ε)

10. SAMs - motivation • Q: applications?

11. SAMs - motivation traditional DB GIS age salary

12. SAMs - motivation traditional DB GIS age salary

13. SAMs - motivation CAD/CAM find elements too close to each other

15. SAMs - motivation eg,. std S1 F(S1) 1 365 day F(Sn) Sn eg, avg 1 365 day

16. SAMs: solutions • K-d trees • point quadtrees • MX-quadtrees • z-ordering • R-trees • (grid files) Q: how would you organize, e.g., n-dim points, on disk? (C points per disk page)

17. SAMs - Detailed outline • spatial access methods • problem dfn • k-d trees • point quadtrees • MX-quadtrees • z-ordering • R-trees

18. k-d trees • Used to store k dimensional point data • It is not used to store region data • A 2-d tree (i.e., for k=2) stores 2-dimensional point data while a 3-d tree stores 3-dimensional point data, etc.

19. 2-d trees – node structure • Binary trees • Info: information field • Xval,Yval: coordinates of a point associated with the node • Llink, Rlink: pointers to children • Properties (N: node): • If level N even -> • for all nodes M in the subtree rooted at N.Llink: M.Xval < N.Xval • for all nodes P in the subtree rooted at N.Rlink: P.Xval >= N.Xval • If level N odd -> • Similarly use Yvals

20. 2-d trees – Example

21. 2-d trees: Insertion/Search • To insert a node N into the tree pointed by T • If N and T agree on Xval, Yval then overwrite T • Else, branch left if N.Xval < T.xval, right otherwise (even levels) • Similarly for odd levels (branching on Yvals)

22. 2-d trees – Example of Insertion Splitting of region by Banja Luka Splitting of region by Derventa Splitting of region by Toslic Splitting of region by Sinj

23. 2-d trees: Deletion • Deletion of point (x,y) from T • If N is a leaf node easy • Otherwise either Tl (left subtree) or Tr (right subtree) is non-empty • Find a “candidate replacement” node R in Tl or Tr • Replace all of N’s non-link fields by those of R • Recursively delete R from Ti • Recursion guaranteed to terminate - Why?

24. 2-d trees: Deletion • Finding candidate replacement nodes for deletion • Replacement node R must bear same spatial relation to all nodes in Tl and Tr as node N

25. 2-d trees: Range Queries • Q: Given a point (xc, yc) and a distance r find all points in the 2-d tree that lie within the circle • A: Each node N in a 2-d tree implicitly represents a region RN – If the circle (specified by the query) has no intersection with RN then there is no point in searching the subtree rooted at node N

26. SAMs - Detailed outline • spatial access methods • problem dfn • k-d trees • point quadtrees • z-ordering • R-trees

27. Point Quadtrees • Represent point data • Always split regions into 4 parts • 2-d tree: a node N splits a region into two by drawing one line through the point (N.xval, N.yval) • Point quadtree: a node N splits a region by drawing a horizontal and a vertical line through the point (N.xval, N.yval) • Four parts: NW, SW, NE, and SE quadrants • Q: Quadtree nodes have 4 children?

29. Point quadtrees - Insertion Splitting of region by Banja Luka Splitting of region by Derventa Splitting of region by Toslic Splitting of region by Tuzla Splitting of region by Sinj

31. Point quadtrees: Deletion • Deletion of point (x,y) from T • If N is a leaf node easy • Otherwise a subtree (N.NW, N.SW, N.NE. N.SE) is non-empty • Find a “candidate replacement” node R in one of the subtrees such that: • Every other node R1 in N.NW is to the NW of R • Every other node R2 in N.SW is to the SW of R • etc… • Replace all of N’s non-link fields by those of R • Recursively delete R from Ti • In general, it may not always be possible to find such as replacement node • Q: What happens in the worst case?

32. Point quadtrees: Deletion • Deletion of point (x,y) from T • If N is a leaf node easy • Otherwise a subtree (N.NW, N.SW, N.NE. N.SE) is non-empty • Find a “candidate replacement” node R in one of the subtrees such that: • Every other node R1 in N.NW is to the NW of R • Every other node R2 in N.SW is to the SW of R • etc… • Replace all of N’s non-link fields by those of R • Recursively delete R from Ti • In general, it may not always be possible to find such as replacement node • Q: What happens in the worst case? May require all nodes to be reinserted

33. Point quadtrees: Range Searches • Each node in a point quadtree represents a region • Do not search regions that do not intersect the circle defined by the query

34. SAMs - Detailed outline • spatial access methods • problem dfn • k-d trees • point quadtrees • MX-quadtrees • z-ordering • R-trees

35. MX-Quadtrees • Drawbacks of 2-d trees, point quadtrees: • shape of tree depends upon the order in which objects are inserted into the tree • splits may be uneven depending upon where the point (N.xval, N.yval) is located inside the region (represented by N) • MX-quadtrees: shape (and height) of tree independent of number of nodes and order of insertion

36. MX-Quadtrees • Assumption: the map is represented as a grid of size (2k x 2k) for some k • When a region gets “split” it splits down the middle

37. MX-Quadtrees - Insertion After insertion of A, B, C, and D respectively

38. MX-Quadtrees - Insertion After insertion of A, B, C, and D respectively

39. MX-Quadtrees - Deletion • Fairly easy – why? • All point are represented at the leaf level • Total time for deletion: O(k)

40. MX-Quadtrees –Range Queries • Same as in point quadtrees • One difference: • Checking to see if a point is in the circle defined by the range query needs to be performed at the leaf level (points are stored at the leaf level)

41. SAMs - Detailed outline • spatial access methods • problem dfn • k-d trees • point quadtrees • MX-quadtrees • z-ordering • R-trees

42. z-ordering Q: how would you organize, e.g., n-dim points, on disk? (C points per disk page) Hint: reduce the problem to 1-d points(!!) Q1: why? A: Q2: how?

43. z-ordering Q: how would you organize, e.g., n-dim points, on disk? (C points per disk page) Hint: reduce the problem to 1-d points (!!) Q1: why? A: B-trees! Q2: how?

44. z-ordering Q2: how? A: assume finite granularity; z-ordering = bit-shuffling = N-trees = Morton keys = geo-coding = ...

45. z-ordering Q2: how? A: assume finite granularity (e.g., 232x232 ; 4x4 here) Q2.1: how to map n-d cells to 1-d cells?

46. z-ordering Q2.1: how to map n-d cells to 1-d cells?

47. z-ordering Q2.1: how to map n-d cells to 1-d cells? A: row-wise Q: is it good?

48. z-ordering Q: is it good? A: great for ‘x’ axis; bad for ‘y’ axis

49. z-ordering Q: How about the ‘snake’ curve?

50. z-ordering Q: How about the ‘snake’ curve? A: still problems: 2^32 2^32