
Proximity Searching in High Dimensional Spaces with a Proximity Preserving Order

Proximity Searching in High Dimensional Spaces with a Proximity Preserving Order. Edgar Chávez, Karina Figueroa, Gonzalo Navarro. Universidad de Chile, Chile. Universidad Michoacana, Mexico. Content: About the problem; Basic concepts; Previous work; Our technique; Experiments.


Presentation Transcript


  1. Proximity Searching in High Dimensional Spaces with a Proximity Preserving Order • Edgar Chávez, Karina Figueroa, Gonzalo Navarro • Universidad de Chile, Chile • Universidad Michoacana, Mexico

  2. Content • About the problem • Basic concepts • Previous work • Our technique • Experiments • Conclusion and future work

  3. Proximity Searching • Huge database • Expensive distance • Exact searching is not possible

  4. Applications • Information retrieval • Classification • People finder through the web • Clustering • Currently used for • Classification of spider webs • Face recognition on the Chilean Web

  5. Problems (metric spaces) • Complex objects • Extraction of characteristics • Index • Huge databases • High dimension • Limited memory

  6. Terminology • Queries • Range query • K nearest neighbors • Properties • Symmetry • Strict positiveness • Triangle inequality
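The two query types named on this slide can be illustrated with a brute-force baseline. This is only a sketch: the Euclidean distance, the sample points, and the function names `range_query` and `knn_query` are assumptions for illustration and do not come from the presentation. Any distance that satisfies symmetry, strict positiveness, and the triangle inequality could be substituted.

```python
import heapq

def euclidean(a, b):
    """Euclidean distance, one example of a metric (assumed for illustration)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def range_query(db, q, r, dist=euclidean):
    """Return every element of db within distance r of the query q."""
    return [u for u in db if dist(q, u) <= r]

def knn_query(db, q, k, dist=euclidean):
    """Return the k elements of db closest to q, nearest first."""
    return heapq.nsmallest(k, db, key=lambda u: dist(q, u))

# Hypothetical sample data.
db = [(0.0, 0.0), (1.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
q = (0.0, 0.0)

in_range = range_query(db, q, 1.0)
nearest = knn_query(db, q, 3)
```

Both functions evaluate the distance once per database element, which is the cost an index tries to avoid when the database is huge and the distance is expensive.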

  7. Previous work • Pivot based • Partition based [Figure: range query around q, using pivot distances]

  8. Previous work • Pivot based • Partition based [Figure: query q and a partition center]

  9. Our technique: idea • Permutants • Permutation [Figure: an element u surrounded by permutants p1, p2, p3, p4, p5, p6]

  10. Our technique • Elements that match exactly have the same permutation • Similar elements should have similar permutations (our conjecture) • Spearman Footrule metric • Measures the similarity of the permutations • Promising elements first
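A minimal sketch of how an element's permutation might be computed: the permutants are sorted by increasing distance to the element. The function name `permutation`, the tie-breaking rule (lower index first), and the one-dimensional sample data are assumptions for illustration; the slides do not specify them.

```python
def permutation(u, permutants, dist):
    """Indices of the permutants, ordered from closest to farthest from u.

    Ties are broken by permutant index (an assumption, not from the slides).
    """
    return sorted(range(len(permutants)),
                  key=lambda i: (dist(u, permutants[i]), i))

def abs_dist(a, b):
    """Distance on the real line, used only to keep the example small."""
    return abs(a - b)

# Hypothetical permutants on the real line.
permutants = [0.0, 10.0, 4.0]

perm_a = permutation(3.0, permutants, abs_dist)   # close to permutant 2
perm_b = permutation(3.5, permutants, abs_dist)   # a nearby element
perm_c = permutation(9.0, permutants, abs_dist)   # a distant element
```

In this toy setting the two nearby elements 3.0 and 3.5 obtain the same permutation, while the distant element 9.0 obtains a different one, which is the behaviour the slide conjectures for similar elements.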

  11. Spearman Footrule metric: example • Difference of positions: 3-1, 6-2, 3-2, 4-1, 5-5, 6-4
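The Spearman Footrule distance between two permutations is the sum, over all permutants, of the absolute difference between the position of a permutant in one permutation and its position in the other. The sketch below uses a hypothetical function name `spearman_footrule` and its own small input rather than the numbers on the slide.

```python
def spearman_footrule(perm_a, perm_b):
    """Sum of absolute differences of positions between two permutations."""
    if sorted(perm_a) != sorted(perm_b):
        raise ValueError("both permutations must contain the same items")
    # Position of every item inside perm_b.
    pos_b = {item: i for i, item in enumerate(perm_b)}
    return sum(abs(i - pos_b[item]) for i, item in enumerate(perm_a))

# Hypothetical permutations over three permutants.
same = spearman_footrule([2, 0, 1], [2, 0, 1])
different = spearman_footrule([2, 0, 1], [1, 2, 0])
```

Identical permutations are at distance 0, and the value grows as permutants move further from their positions in the other permutation.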

  12. Searching process (part 1): preprocessing time [Figure: permutants p1, p2, p3 and database elements labeled with their permutations: p3,p1,p2; p2,p1,p3; p2,p3,p1; p3,p2,p1]

  13. Searching process (part 2): query time • Sorting elements by Spearman Footrule metric [Figure: query q among permutants p1, p2, p3; database elements ordered by their permutations: p2,p1,p3; p2,p3,p1; ...; p3,p1,p2]
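The two parts of the searching process can be sketched end to end as follows. At preprocessing time every database element stores its permutation; at query time the elements are sorted by Spearman Footrule distance to the query's permutation and only the most promising ones are compared with the real distance. The names `build_index` and `knn_search`, the `fraction` parameter that fixes how much of the database is compared, and the one-dimensional data are all assumptions for illustration, not details taken from the presentation.

```python
import heapq

def permutation(u, permutants, dist):
    """Permutant indices ordered from closest to farthest from u."""
    return sorted(range(len(permutants)),
                  key=lambda i: (dist(u, permutants[i]), i))

def spearman_footrule(perm_a, perm_b):
    """Sum of absolute differences of positions between two permutations."""
    pos_b = {item: i for i, item in enumerate(perm_b)}
    return sum(abs(i - pos_b[item]) for i, item in enumerate(perm_a))

def build_index(db, permutants, dist):
    """Preprocessing time: one permutation per database element."""
    return [permutation(u, permutants, dist) for u in db]

def knn_search(q, db, index, permutants, dist, k, fraction):
    """Query time: compare only the most promising fraction of the database."""
    q_perm = permutation(q, permutants, dist)
    # Sort element positions by similarity of their permutation to the query's.
    order = sorted(range(len(db)),
                   key=lambda i: spearman_footrule(q_perm, index[i]))
    budget = max(k, int(fraction * len(db)))
    candidates = (db[i] for i in order[:budget])
    # Only the candidates are compared with the (expensive) real distance.
    return heapq.nsmallest(k, candidates, key=lambda u: dist(q, u))

def abs_dist(a, b):
    return abs(a - b)

# Hypothetical data: 100 points on the real line and three permutants.
db = [float(x) for x in range(100)]
permutants = [10.0, 50.0, 90.0]
index = build_index(db, permutants, abs_dist)

approx = knn_search(12.3, db, index, permutants, abs_dist, k=3, fraction=0.5)
exact = heapq.nsmallest(3, db, key=lambda u: abs_dist(12.3, u))
```

The search is probabilistic: with a small `fraction` a true neighbor may fall outside the compared prefix, so the parameter trades the amount of work against the share of the correct answer that is retrieved.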

  14. Experiments • 93% retrieved, comparing 10% of database • 90% retrieved, comparing 60% of database • Pivot based algorithm: retrieved 48% [Plot: % retrieved]

  15. Experiments • 100% retrieved, comparing 15% of database • 100% retrieved, comparing 90% of database • Up to 84% less work [Plot: % retrieved]

  16. How good is our prediction? • Dimension 256, using 256 pivots • Metric algorithms are using one of them [Plot: retrieved vs. percentage of the database compared]

  17. Similarities between permutations • Almost the same value

  18. Conclusion • A new probabilistic algorithm for proximity searching in metric spaces. • Our technique is based on permutations. • Close elements will have similar permutations. • This technique is the fastest known algorithm for high dimensions. • Permutations are good predictors.

  19. Future Work • Can non-metric spaces be tackled with this technique? • An approximate all-K-nearest-neighbors algorithm. • Improving other metric indexes.

  20. Thank you • Universidad Michoacana, Mexico • Universidad de Chile, Chile • Kfiguero@dcc.uchile.cl
