Advanced Indexing Techniques with Apache Lucene - Payloads

Presentation Transcript


    1. Advanced Indexing Techniques with Apache Lucene - Payloads. Michael Busch (buschmi@apache.org)

    2. Agenda. Part 1: Inverted Index 101 (Posting Lists; Stored Fields vs. Payloads). Part 2: Use cases for Payloads (BoostingTermQuery; Simple facet counting).

    9. So far: String comparison is slow. An inverted index is used to accelerate search. Positions are stored in posting lists to allow phrase searches. Payloads are stored in posting lists to hold arbitrary data with each position.
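
The points on this slide can be illustrated with a small in-memory model. The sketch below is a toy, not Lucene's API or on-disk format: the class `PayloadIndexSketch`, its methods, and the choice to give every token of a document the same payload are all assumptions made for illustration. It shows a term lookup that avoids scanning document text, a two-word phrase search that relies on stored positions, and a payload read back from a posting.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model only: hypothetical names, not Lucene classes.
class PayloadIndexSketch {

    // One occurrence of a term: its position in the document plus a payload.
    static final class Occurrence {
        final int position;
        final byte[] payload; // arbitrary bytes attached to this position

        Occurrence(int position, byte[] payload) {
            this.position = position;
            this.payload = payload;
        }
    }

    // term -> (docId -> occurrences); the TreeMap keeps doc ids sorted.
    private final Map<String, TreeMap<Integer, List<Occurrence>>> postings =
            new LinkedHashMap<>();

    // Index whitespace-separated text; every token gets the same payload here.
    void addDocument(int docId, String text, byte[] payload) {
        String[] tokens = text.toLowerCase().split("\\s+");
        for (int pos = 0; pos < tokens.length; pos++) {
            postings.computeIfAbsent(tokens[pos], t -> new TreeMap<>())
                    .computeIfAbsent(docId, d -> new ArrayList<>())
                    .add(new Occurrence(pos, payload));
        }
    }

    // Term lookup: one map access instead of comparing strings in every document.
    List<Integer> search(String term) {
        TreeMap<Integer, List<Occurrence>> docs = postings.get(term.toLowerCase());
        return docs == null ? new ArrayList<>() : new ArrayList<>(docs.keySet());
    }

    // Two-word phrase search: "second" must occur directly after "first".
    List<Integer> searchPhrase(String first, String second) {
        List<Integer> hits = new ArrayList<>();
        TreeMap<Integer, List<Occurrence>> a = postings.get(first.toLowerCase());
        TreeMap<Integer, List<Occurrence>> b = postings.get(second.toLowerCase());
        if (a == null || b == null) {
            return hits;
        }
        for (Map.Entry<Integer, List<Occurrence>> e : a.entrySet()) {
            List<Occurrence> others = b.get(e.getKey());
            if (others != null && adjacent(e.getValue(), others)) {
                hits.add(e.getKey());
            }
        }
        return hits;
    }

    private static boolean adjacent(List<Occurrence> first, List<Occurrence> second) {
        for (Occurrence x : first) {
            for (Occurrence y : second) {
                if (y.position == x.position + 1) {
                    return true;
                }
            }
        }
        return false;
    }

    // Payload stored with the first occurrence of a term in a document, or null.
    byte[] firstPayload(String term, int docId) {
        TreeMap<Integer, List<Occurrence>> docs = postings.get(term.toLowerCase());
        if (docs == null || !docs.containsKey(docId)) {
            return null;
        }
        return docs.get(docId).get(0).payload;
    }
}
```

A real index would compress postings and payloads on disk; the nested maps here only mirror the logical layout of term, document, position, and payload.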

    14. Agenda. Part 1: Inverted Index 101 (Posting Lists; Stored Fields vs. Payloads). Part 2: Use cases for Payloads (BoostingTermQuery; Simple facet counting).

    15. org.apache.lucene.analysis.Token

    16. Analyzer:

    17. Similarity:

    19. Analyzer:

    20. HitCollector: Use different PriorityQueues for different sites. Instead of returning the top-n results of the whole data set, return the top-n results per site.
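
The per-site collection idea can be sketched without Lucene. The following is a minimal illustration, not the code from the slide: the class `PerSiteTopN` and its methods are hypothetical, and the site is passed in as a plain string, whereas in the talk's setting it would presumably be read from the payload of each hit. One bounded min-heap is kept per site, so the weakest hit of a site is evicted once that site holds more than n hits.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

// Toy collector: hypothetical names, not Lucene's HitCollector API.
class PerSiteTopN {

    static final class Hit {
        final int docId;
        final float score;

        Hit(int docId, float score) {
            this.docId = docId;
            this.score = score;
        }
    }

    // Orders hits by ascending score, so the queue head is the weakest hit.
    private static final Comparator<Hit> BY_SCORE =
            Comparator.comparingDouble(h -> h.score);

    private final int n;
    private final Map<String, PriorityQueue<Hit>> queues = new HashMap<>();

    PerSiteTopN(int n) {
        this.n = n;
    }

    // Called once per matching document.
    void collect(String site, int docId, float score) {
        PriorityQueue<Hit> pq =
                queues.computeIfAbsent(site, s -> new PriorityQueue<>(BY_SCORE));
        pq.add(new Hit(docId, score));
        if (pq.size() > n) {
            pq.poll(); // evict the lowest-scoring hit for this site
        }
    }

    // Doc ids of the retained hits for one site, best score first.
    List<Integer> topDocs(String site) {
        PriorityQueue<Hit> pq = queues.get(site);
        List<Hit> hits = pq == null ? new ArrayList<>() : new ArrayList<>(pq);
        hits.sort(BY_SCORE.reversed());
        List<Integer> ids = new ArrayList<>();
        for (Hit h : hits) {
            ids.add(h.docId);
        }
        return ids;
    }
}
```

Keeping a separate queue per site means a site with many strong hits cannot push every other site out of the result list, which is the behaviour the slide describes.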

    21. Summary: In this example the facet (site) is used for scoring, but the approach can be extended to facet counting. Performance is good due to the locality of facet values.
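
The extension to facet counting mentioned here could look roughly like the sketch below. It is an assumption-laden toy rather than anything shown in the talk: `SiteFacetCounter` is a hypothetical class, and the payload is assumed to be a single byte holding a numeric site id. Each collected hit increments the counter of its site.

```java
import java.util.Map;
import java.util.TreeMap;

// Toy facet counter: hypothetical names and payload layout, not a Lucene API.
class SiteFacetCounter {

    // site id -> number of hits seen for that site
    private final Map<Integer, Integer> counts = new TreeMap<>();

    // Called once per hit with the payload stored alongside the matching posting.
    void collect(byte[] payload) {
        int siteId = payload[0] & 0xFF; // assumed layout: one unsigned byte
        counts.merge(siteId, 1, Integer::sum);
    }

    Map<Integer, Integer> counts() {
        return counts;
    }
}
```

Because the facet value travels with the posting, counting needs no extra lookup per hit, which is one way to read the slide's remark about locality.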

    22. Payloads offer great flexibility. Payloads are stored very space-efficiently. Sophisticated data structures enable efficient skipping over payloads. Payloads should be used whenever special data is required for finding hits and scoring.

    23. Finalize the API (currently Beta). Add more out-of-the-box query types. Per-document payloads.

    24. Questions?