Spatial (or N-Dimensional) Search in a Relational World

1 / 10

# Spatial (or N-Dimensional) Search in a Relational World - PowerPoint PPT Presentation

Spatial (or N-Dimensional) Search in a Relational World. Jim Gray. y. x=c/a. x. ax + by = c. y=c/b. y. x. Equations Define Subspaces. For (x,y) above the line ax+by &gt; c Reverse the space by -ax + -by &gt; -c

I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.

## PowerPoint Slideshow about 'Spatial (or N-Dimensional) Search in a Relational World' - tavi

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript

### Spatial (or N-Dimensional) Searchin a Relational World

Jim Gray

y

x=c/a

x

ax + by = c

y=c/b

y

x

Equations Define Subspaces
• For (x,y) above the lineax+by > c
• Reverse the space by-ax + -by > -c
• Intersect a 3 volumes: a1x + b1y > c1a2x + b2y > c2a3x + b3y > c3
Domain is Union of Convex Hulls

Not a convex hull

• Simple volumes are unions of convex hulls.
• Higher order curves also work
• Complex volumes have holes and their holes have holes. (that is harder).

+

Now in Relational Terms

create table HalfSpace (

domainID int not null -- domain name

foreign key references Domain(domainID),

convexID int not null, -- grouping a set of ½ spaces

halfSpaceID int identity(), -- a particular ½ space

x float not null, -- the (a,b,..) parameters

y float not null, -- defining the ½ space

z float not null,

c float not null, -- the constant (“c” above)

primary key (domainID, convexID, halfSpaceID)

(x,y,z) inside a convex if it is inside all lines of the convex

(x,y,z) inside a convex if it is NOT OUTSIDE ANY line of the convex

select convexID -- return the convex hulls

from HalfSpace -- from the constraints

where @x * x + @y * y + @x * z < l -- point outside the line?

group by all convexID -- consider all the lines of a convexID

having count(*) = 0 -- count outside == 0

The Algebra is Simple (Boolean)

@domainID = spDomainNew (@type varchar(16), @comment varchar(8000))

@convexID = spDomainNewConvex (@domainID int)

@halfSpaceID = spDomainNewConvexConstraint (@domainID int, @convexID int,

@x float, @y float, @z float, @l float)

@returnCode = spDomainDrop(@domainID)

select * from fDomainsContainPoint(@x float, @y float, @z float)

Once constructed they can be manipulated with the Boolean operations.

@domainID = spDomainOr (@domainID1 int, @domainID2 int,

@type varchar(16), @comment varchar(8000))

@domainID = spDomainAnd (@domainID1 int, @domainID2 int,

@type varchar(16), @comment varchar(8000))

@domainID = spDomainNot (@domainID1 int,

@type varchar(16), @comment varchar(8000))

What! No Bounding Box?
• Bounding box limits search.A subset of the convex hulls.
• If query runs at 3M halfspace/sec then no need for bounding box, unless you have more than 10,000 lines.
• But, if you have a lot of half-spaces then bounding box is good.
A Different Problem
• Table-valued function find points near a point
• Select * from fGetNearbyEq(ra,dec,r)
• Use Hierarchical Triangular Mesh www.sdss.jhu.edu/htm/
• Space filling curve, bounding triangles…
• Standard approach
• 13 ms/call… So 70 objects/second.
• Too slow, so precompute neighbors: Materialized view.
• At 70 objects/sec it takes 6 months to compute a billion objects.
Zone Based Spatial Join
• Divide space into zones
• Key points by Zone, offset(on the sphere this need wrap-around margin.)
• Point search look in a few zones at a limited offset: ra ± r a bounding box that has 1-π/4 false positives
• All inside the relational engine
• Avoids “impedance mismatch”
• Can “batch” all-all comparisons
• 33x faster and parallel6 days, not 6 months!

r

ra-zoneMax

x

√(r2+(ra-zoneMax)2)

zoneMax

Ra ± x

In SQL

select o1.objID -- find objects

from zone o1 -- in the zoned table

where o1.zoneID between -- where zone # floor((@dec-@r)/@zoneHeight) and -- overlaps the circle

floor((@dec+@r)/@zoneHeight)

and o1.ra between @ra - @r and @ra + @r -- quick filter on ra

and o1.dec between @dec-@r and @dec+@r -- quick filter on dec

and ( (sqrt( power(o1.cx-@cx,2)+power(o1.cy-@cy,2)+power(o1.cz-@cz,2))))

< @r -- careful filter on distance

Bounding box

Eliminates the ~ 21% = 1-π/4

False positives

Summary
• SQL is a set oriented language
• You can express constraints as rows
• Then You
• Can evaluate LOTS of predicates per second
• Can do set algebra on the predicates.
• Benefits from SQL parallelism
• SQL == Prolog? 