
Forms of Retrieval


saburo

Presentation Transcript


  1. Forms of Retrieval
     • Sequential Retrieval
     • Two-Step Retrieval
     • Retrieval with Indexed Cases

  2. Retrieval with Indexed Cases
     Sources:
     • Textbook, Chapter 7
     • Davenport & Prusack’s book on Advanced Data Structures
     • Samet’s book on Data Structures

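  3. Range Search
     [Figure: a query given by observed symptoms (Red light on? Yes. Beeping? Yes. …) is placed in the space of known problems; the range around it contains the known problem "Transistor burned!".]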

  4. k-d Trees
     • Idea: partition the case base into smaller fragments
     • Representation of a k-dimensional space as a binary tree
     • Similar to a decision tree: the query is compared with the values stored in the nodes
     • During retrieval:
       • Search for a leaf, but
       • Unlike in decision trees, backtracking may occur

  5. Definition: k-d Trees
     • Given:
       • k types T1, …, Tk for the attributes A1, …, Ak
       • A case base CB containing cases in T1 × … × Tk
       • A parameter b (the bucket size)
     • A k-d tree T(CB) for a case base CB is a binary tree defined as follows:
       • If |CB| ≤ b, then T(CB) is a leaf node (a bucket)
       • Else T(CB) is a tree such that:
         • The root is marked with an attribute Ai and a value v of Ai, and
         • The two k-d trees T({c ∈ CB : c.Ai < v}) and T({c ∈ CB : c.Ai ≥ v}) are the left and right subtrees of the root
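
The recursive definition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the definition does not say how Ai and v are chosen, so the sketch cycles through the attributes and splits at the median value (a common heuristic, not something the slides prescribe). The names `Leaf`, `Node`, and `build` are hypothetical, and a bucket is taken to hold at most b cases, matching the "at most b cases" wording of the BWB slide.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union

Case = Tuple[float, ...]  # a case is a point in k-dimensional space


@dataclass
class Leaf:
    cases: List[Case]  # the bucket


@dataclass
class Node:
    attr: int      # index i of the attribute Ai tested at this node
    value: float   # split value v
    left: "Tree"   # cases with c[attr] <  value
    right: "Tree"  # cases with c[attr] >= value


Tree = Union[Leaf, Node]


def build(cases: List[Case], b: int, depth: int = 0) -> Tree:
    """Build a k-d tree; buckets hold at most b cases (assumed reading)."""
    if len(cases) <= b:
        return Leaf(list(cases))
    k = len(cases[0])
    # Assumption: try attributes cyclically and split at the median value.
    for offset in range(k):
        attr = (depth + offset) % k
        values = sorted(c[attr] for c in cases)
        v = values[len(values) // 2]
        left = [c for c in cases if c[attr] < v]
        right = [c for c in cases if c[attr] >= v]
        if left:  # right always contains the median, so both sides are non-empty
            return Node(attr, v, build(left, b, depth + 1), build(right, b, depth + 1))
    # All cases are identical on every attribute: no split is possible,
    # so keep them together in one (oversized) bucket.
    return Leaf(list(cases))
```

Other choices of split attribute and value (for example, the attribute with the largest spread) fit the same definition; only the selection inside the loop would change.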

  6. BWB-Check
     • Ball-Within-Bounds check:
       • Suppose the algorithm reaches a leaf node M (with at most b cases) while searching for the case most similar to P
       • Let c be a case in M such that dist(c,P) is the smallest
       • Then c is a candidate nearest neighbor (NN) for P
       • If dist(P,B) > dist(c,P) for every boundary B of M, then c is the NN
       • But if dist(P,B) < dist(c,P) for any boundary B of M, the algorithm needs to backtrack and check whether the region on the other side of B contains a better candidate
     • For computing distances, let f⁻¹ be the inverse of the distance-similarity compatible function f:
       • distance(P,C) = f⁻¹(sim(P,C))
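
A minimal sketch of the BWB check, assuming that the region of a leaf is an axis-aligned box given by per-attribute lower and upper bounds and that distances are Euclidean. The function names and the particular choice f(d) = 1/(1+d) for the distance-similarity compatible function are illustrative assumptions, not taken from the slides.

```python
import math
from typing import Sequence


def distance_from_similarity(sim: float) -> float:
    """distance = f^-1(sim), assuming (hypothetically) sim = f(d) = 1 / (1 + d)."""
    return 1.0 / sim - 1.0


def ball_within_bounds(query: Sequence[float],
                       candidate_dist: float,
                       lower: Sequence[float],
                       upper: Sequence[float]) -> bool:
    """True if the ball around the query with radius candidate_dist lies
    strictly inside the box [lower, upper], i.e. every boundary of the leaf
    region is farther from the query than the current candidate."""
    return all(q - lo > candidate_dist and hi - q > candidate_dist
               for q, lo, hi in zip(query, lower, upper))


# Query P(32,45) from the example slide, in a bucket covering
# 0 <= A1 < 35 and 0 <= A2 <= 100 that holds Denver (5,45) and Omaha (25,35).
P = (32, 45)
d_omaha = math.dist(P, (25, 35))  # about 12.2, closest case in the bucket
print(ball_within_bounds(P, d_omaha, (0, 0), (35, 100)))  # False: backtracking needed
```

The check fails here because the boundary A1 = 35 is only 3 units from P, closer than the candidate Omaha.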

  7. BOB-Check
     • Ball-Out-of-Bounds check:
       • Used during backtracking
       • Checks whether, for the boundary B defined in the node, dist(P,B) < dist(c,P)
       • Here c is the current candidate for the best case (e.g., the case closest to P in the initial bucket)
       • If the condition is true, the algorithm needs to check whether the region on the other side of that boundary contains a better candidate
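
As a sketch, when the boundary defined in a node is the hyperplane "attribute Ai equals v", the distance from P to that boundary is simply |P.Ai − v|, and the BOB check reduces to one comparison. The function name and signature are hypothetical, and Euclidean distance is assumed.

```python
from typing import Sequence


def ball_overlaps_boundary(query: Sequence[float],
                           split_attr: int,
                           split_value: float,
                           candidate_dist: float) -> bool:
    """True if the ball around the query with radius candidate_dist crosses
    the splitting hyperplane of the node, so the other subtree may hold a
    better candidate and must be searched."""
    return abs(query[split_attr] - split_value) < candidate_dist


# Query P(32,45) with a candidate at distance 12.2, node testing A1 at 35:
print(ball_overlaps_boundary((32, 45), 0, 35, 12.2))  # True: |32 - 35| = 3 < 12.2
```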

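  8. Example
     [Figure: eight cities plotted in the square from (0,0) to (100,100), with A1 on the horizontal axis, the query P(32,45), and the corresponding k-d tree.]
     • Cases: Toronto (60,75), Buffalo (80,65), Denver (5,45), Chicago (35,40), Atlanta (85,15), Omaha (25,35), Mobile (50,10), Miami (90,5)
     • Tree: the root tests A1 at 35 (<35: Denver, Omaha); for A1 ≥ 35 a node tests A2 at 40; for A2 < 40 a node tests A1 at 85 (<85: Mobile; ≥85: Atlanta, Miami); for A2 ≥ 40 a node tests A1 at 60 (<60: Chicago; ≥60: Toronto, Buffalo)
     • Notes:
       • Priority lists are used for computing kNN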
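
The retrieval for P(32,45) can be traced with a short sketch. The tree below is one reading of the flattened example figure (root on A1 at 35, then A2 at 40, then A1 at 85 and at 60), so treat its exact layout as an assumption; the tuple encoding and the function `nearest` are hypothetical, and Euclidean distance is assumed. The search descends to a bucket first and then backtracks only where the splitting boundary is closer than the current candidate.

```python
import math

CITIES = {
    "Toronto": (60, 75), "Buffalo": (80, 65), "Denver": (5, 45),
    "Chicago": (35, 40), "Atlanta": (85, 15), "Omaha": (25, 35),
    "Mobile": (50, 10), "Miami": (90, 5),
}

# ("leaf", names) or ("node", attribute index, split value, left, right);
# left holds cases with attribute < value, right holds the rest.
TREE = ("node", 0, 35,
        ("leaf", ["Denver", "Omaha"]),
        ("node", 1, 40,
         ("node", 0, 85, ("leaf", ["Mobile"]), ("leaf", ["Atlanta", "Miami"])),
         ("node", 0, 60, ("leaf", ["Chicago"]), ("leaf", ["Toronto", "Buffalo"]))))


def nearest(tree, query, best=(math.inf, None), visited=None):
    """Return (distance, name) of the nearest city; record visited buckets."""
    if visited is None:
        visited = []
    if tree[0] == "leaf":
        visited.append(tuple(tree[1]))
        for name in tree[1]:
            d = math.dist(query, CITIES[name])
            if d < best[0]:
                best = (d, name)
        return best
    _, attr, value, left, right = tree
    # Descend first into the side of the split that contains the query.
    near, far = (left, right) if query[attr] < value else (right, left)
    best = nearest(near, query, best, visited)
    # Backtrack into the other side only if the boundary is closer
    # than the current candidate.
    if abs(query[attr] - value) < best[0]:
        best = nearest(far, query, best, visited)
    return best


buckets = []
dist_to_best, name = nearest(TREE, (32, 45), visited=buckets)
print(name)     # Chicago
print(buckets)  # [('Denver', 'Omaha'), ('Chicago',), ('Mobile',)]
```

The first candidate is Omaha (distance about 12.2); the boundary A1 = 35 is only 3 away, so the search backtracks and finds Chicago (distance about 5.8). The buckets holding Toronto/Buffalo and Atlanta/Miami are never opened.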

  9. Using Decision Trees as Index
     [Figure: a standard decision tree node tests an attribute Ai and has one branch per value v1, v2, …, vn. The variant, the InReCA tree, adds an "unknown" branch to each node and can be combined with numeric attributes (branches labeled with ranges such as >v1, …, >vn).]
     • Notes:
       • Supports Hamming distance
       • May require backtracking (using BOB-check)
       • Operates in a similar fashion as k-d trees
       • Priority lists are used for computing kNN

  10. Properties of Retrieval with Indexed Cases
     • Advantages:
       • Efficient retrieval
       • Incremental: no need to rebuild the index every time a new case is entered
       • α-error does not occur
     • Disadvantages:
       • Cost of construction is high
       • Only works for monotonic similarity relations
