SASH Spatial Approximation Sample Hierarchy Authors: Michael E. Houle, Jun Sakuma
SASH features • Index data in high-dimensional space • Fast construction of the index • N log N • Fast lookups of k approximate nearest neighbors • k log N
Drawbacks of other methods • Slow construction • Require a k-NN index to construct a k-NN index • Slow lookups • Reduce to grid searches or sequential search • But they may allow for true nearest neighbor queries
SASH construction • Two-phase process • Phase 1: divide the set into a hierarchy of subsets • Phase 2: link elements of the hierarchy together
SASH construction: phase 1 • Start with a set of points in a metric space • Divide the set in half randomly • Repeatedly divide the “second half” of the set until there is one element remaining • This hierarchy of sets reminds me of a skip list
SASH subsets • Partitioning process roughly yields log N sets of size 2^k, 0 ≤ k ≤ log N • Label the sets S_0 (for the set containing one element, namely the root) through S_h (for the largest set containing approximately N/2 elements)
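The halving described on the two slides above can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name `partition_levels` and the `seed` parameter are assumptions made for this example, and a single up-front shuffle stands in for the repeated random splits.

```python
import random

def partition_levels(points, seed=0):
    """Split points into levels S_0 .. S_h by repeated random halving.

    Hypothetical helper for illustration; not taken from the SASH paper.
    """
    rng = random.Random(seed)
    remaining = list(points)
    rng.shuffle(remaining)          # one shuffle makes every later split random
    levels = []
    while len(remaining) > 1:
        half = len(remaining) // 2
        levels.append(remaining[half:])   # set aside one half as the next-largest level
        remaining = remaining[:half]      # keep dividing the other half
    levels.append(remaining)        # the last remaining element is the root, S_0
    levels.reverse()                # order the levels from S_0 (root) to S_h (largest)
    return levels

if __name__ == "__main__":
    levels = partition_levels(range(15))
    print([len(s) for s in levels])
```

With 15 points the level sizes come out as 1, 2, 4, 8. When N is not of the form 2^m - 1 the sizes only roughly follow powers of two, which matches the "roughly" on the slide.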
SASH appearance • A SASH is a hierarchy of sets of size 2^k, 0 ≤ k ≤ h, with directed edges from the set of size 2^(k-1) to the set of size 2^k • A SASH is generally not a tree, but it has some of the flavor of a binary tree, with edges from sets of a certain size to sets that are double that size • A SASH usually has many more edges than a tree.
SASH construction: phase 2 • The SASH is constructed inductively, starting with SASH_0 = S_0 • For 1 ≤ i ≤ h, SASH_{i-1} is a partial SASH on the set S_0 ∪ S_1 ∪ … ∪ S_{i-1} • SASH_i is constructed by starting with SASH_{i-1} and adding new directed edges from elements in S_{i-1} to elements in S_i.
SASH construction: phase 2 • Let SASH_0 be the root, S_0 • For 1 ≤ i ≤ h, assume SASH_{i-1} exists, then • For each c in S_i, use SASH_{i-1} to find P possible parents of c in S_{i-1} • Once all c in S_i have linked to possible parents, each p in S_{i-1} links to the C closest children that chose it as a possible parent • If some orphan objects in S_i have no parents linking to them, repeat the above, allowing them to try to link to more parents.
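The linking step for a single level can be sketched as below. It is a simplified illustration under stated assumptions, not the authors' implementation: the slide finds candidate parents by querying SASH_{i-1}, whereas this sketch swaps in a brute-force nearest-parent search; the name `link_level`, the Euclidean metric, and the rule of doubling the number of candidates on each retry are all assumptions made for the example.

```python
import math

def link_level(parents, children, P=2, C=8):
    """Link one level: children propose to parents, parents accept the closest.

    Hypothetical sketch. Candidate parents are found by brute force here,
    NOT by querying the partial SASH as the slides describe.
    Returns (child_links, orphans), both using list indices.
    """
    child_links = {j: [] for j in range(len(parents))}
    unlinked = list(range(len(children)))
    tries = P                       # how many candidate parents each child asks
    while unlinked:
        # Each unlinked child proposes to its `tries` nearest parents.
        requests = {j: [] for j in range(len(parents))}
        for ci in unlinked:
            near = sorted(range(len(parents)),
                          key=lambda j: math.dist(parents[j], children[ci]))[:tries]
            for j in near:
                requests[j].append(ci)
        # Each parent accepts its closest requesters, up to C children in total.
        adopted = set()
        for j, reqs in requests.items():
            room = C - len(child_links[j])
            reqs.sort(key=lambda ci: math.dist(parents[j], children[ci]))
            for ci in reqs[:room]:
                child_links[j].append(ci)
                adopted.add(ci)
        unlinked = [ci for ci in unlinked if ci not in adopted]
        if tries >= len(parents):   # every parent has been asked; stop retrying
            break
        tries = min(2 * tries, len(parents))   # orphans ask more parents next round
    return child_links, unlinked

if __name__ == "__main__":
    parents = [(0, 0), (10, 0)]
    children = [(1, 0), (2, 0), (3, 0)]
    print(link_level(parents, children, P=1, C=2))
```

In the small example, all three children first propose to the parent at (0, 0), which has room for only two of them; the third becomes an orphan, widens its search on the next round, and is adopted by the parent at (10, 0). A full construction would call a routine like this once per level, for S_1 through S_h.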
SASH parameters: P and C • In practice, P is small, and C is at least twice P (their experiments use C = 4P) • It is likely that objects will have at least one parent that links to them, and if C > 2P, all orphans can eventually find parents • Children link to “nearby” parents, and parents then link to “nearby” children • The symmetric use of “nearby” gives good results, even though the relation isn’t really symmetric.
SASH Construction Example • Red nodes are in a completed SASH. Light blue nodes are in the process of being added to a SASH. Black nodes have not been processed. • Links from children to parents are green, and links from parents to children are red.