
Document Store - Pilot 001


Presentation Transcript


  1. Document Store - Pilot 001 Presented to

  2. Objectives
     • Index 5M+ MARC XML records
     • Demonstrate the following features
       • Full-text search
       • Advanced search (fielded search)
       • Search results pagination
       • Sub-second query time on commercial hardware
     • Set up a Jackrabbit repository (MySQL persistent store)
       • Load up to 5,000 documents
       • Analyze and optimize loading and storage
       • Generate UUIDs
       • Check-in, check-out and versioning
       • Establish links between documents

  3. Environment
     • Hardware
       • CPU: Quad Core @ 2.93 GHz
       • Memory: 16 GB
       • Storage: 500 GB
     • Software
       • 64-bit Windows 7 OS

  4. Content Set

  5. Sample Document

  6. Sample Document

  7. Performance Metrics
     • Indexing time for 5,362,832 (~5.4M) records is 1 hour and 42 minutes
     • Index size for 5,362,832 records is 14 GB
     • Extrapolated indexing time for 10M records is ~3 hours
     • Loading time for 3,569 records is 112 seconds
     • Extrapolated loading time for 6M records is 55 hours (~2.31 days)
     • Average response time for full-text search is 69 milliseconds
     • Average response time for advanced search on 3+ fields is 200 milliseconds
     • Note: Basic setup with minimal or no tuning
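The slide does not say how the extrapolated times were derived. The sketch below assumes a simple linear scaling of the measured figures (throughput stays constant as the record count grows), which is only an approximation. The function name `extrapolate_seconds` and the constants are illustrative and do not come from the presentation.

```python
# Figures quoted on the Performance Metrics slide.
INDEXED_RECORDS = 5_362_832
INDEXING_SECONDS = 1 * 3600 + 42 * 60  # 1 hour 42 minutes
LOADED_RECORDS = 3_569
LOADING_SECONDS = 112


def extrapolate_seconds(measured_records, measured_seconds, target_records):
    """Scale a measured duration linearly to a different record count.

    Assumption: throughput is constant, which the slide does not confirm.
    """
    return measured_seconds * target_records / measured_records


indexing_hours = extrapolate_seconds(INDEXED_RECORDS, INDEXING_SECONDS, 10_000_000) / 3600
loading_hours = extrapolate_seconds(LOADED_RECORDS, LOADING_SECONDS, 6_000_000) / 3600

print(f"Indexing 10M records: ~{indexing_hours:.1f} hours")
print(f"Loading 6M records: ~{loading_hours:.1f} hours (~{loading_hours / 24:.2f} days)")
```

Under this assumption, indexing 10M records comes out a little over 3 hours, in line with the slide. The loading estimate comes out near 52 hours, somewhat below the 55 hours quoted, so the presenters may have used a different rate or included overhead not listed on the slide.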

  8. Work in Progress
     • Faceted navigation and search suggestions
     • Simultaneously index and search multiple document types
     • Index and search new document types by configuration
     • Batch and online index management (add, update, delete)
     • Repository document load of 5M documents
     • Discovery and Repository integration
     • Bulk and online operations (load, update)

  9. Thank You
     World Headquarters
     3270 West Big Beaver Road, Troy, MI 48084, U.S.A.
     Phone: 248.786.2500
     Fax: 248.786.2515
     Web: www.htcinc.com
