Apache Solr/Lucene: Looking Ahead

hazel

Presentation Transcript


  1. Apache Solr/Lucene: Looking Ahead

  2. Topics • Me. You? • Quick Overview of Lucene and Solr • Solr demo • Where are we now? • What’s in a version number? • Looking Ahead • Apache Lucene 3.1 and beyond • Apache Solr 3.1 and beyond

  3. Me • You? • Lucene? • Solr? • New to Search? • Other Search Engines? • Crawling? • Database? • Scale?

  4. Lucene is a mature, high-performance Java API that provides search capabilities to applications • Supports indexing, searching, and a number of other commonly used search features (highlighting, spell checking, etc.) • Not a crawler, and knows nothing about file formats such as Adobe PDF or MS Word • Created in 1997 and now part of the Apache Software Foundation • Important to note that Lucene itself does not have distributed index (shard) support

  5. Solr • Solr is the Lucene-based search server that provides the infrastructure most users need to work with Lucene • Without knowing Java! • Also provides: • Easy setup and configuration • Faceting • Highlighting • Replication/Sharding • Lucene Best Practices • http://search.lucidimagination.com

  6. Quick Solr Demo • Pre-reqs: • Apache Ant 1.7.x • SVN • svn co https://svn.apache.org/repos/asf/lucene/dev/trunk solr-trunk • cd solr-trunk/solr/ • ant example • cd example • java -jar start.jar • cd exampledocs; java -jar post.jar *.xml • http://localhost:8983/solr/browse
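Once the example server from the demo is running, it can also be queried over HTTP rather than through the /browse page. The sketch below only builds the request URL; it assumes the stock example exposes a `/select` handler under the base URL shown on the slide and accepts the `q`, `rows`, and `wt` parameters. The helper name `build_select_url` is hypothetical.

```python
from urllib.parse import urlencode


def build_select_url(query, rows=10, base="http://localhost:8983/solr"):
    """Build a search URL for the example server (assumed /select handler)."""
    # q = query text, rows = number of hits to return, wt = response format.
    params = {"q": query, "rows": rows, "wt": "json"}
    return f"{base}/select?{urlencode(params)}"


if __name__ == "__main__":
    # Paste the printed URL into a browser, or fetch it with urllib.request.
    print(build_select_url("ipod"))
```

Keeping URL construction separate from the network call makes the parameters easy to inspect before anything is sent to the server.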

  7. Where are we now? • Current releases • Apache Lucene 3.0.2 and 2.9.2 • Apache Solr 1.4.1 • Last March, the Lucene and Solr development communities merged to reduce duplication, ease development, etc. • Mail: dev@lucene.apache.org • User communities are still separate • java-user@lucene.apache.org, solr-user@lucene.apache.org

  8. Where are we now? • Is the next release Solr 1.5 or 3.1? • Solr 3.1 (99% certain!) • Two main branches of development for both Lucene and Solr • Trunk (i.e., 4.0) • https://svn.apache.org/repos/asf/lucene/dev/trunk/ • No guarantee of back compatibility (but best efforts are made) • 3.x Branch • https://svn.apache.org/repos/asf/lucene/dev/branches/branch_3x/ • Tries to stay backward compatible with the 1.4.x releases • Most changes are applied to both branches, but not all

  9. Words to the Wise “Some or all of the following statements may contain projections or other forward-looking statements regarding future events or implementations in Lucene/Solr” “The statements are not meant to be inclusive of all changes”

  10. Apache Lucene 3.1 and Beyond • https://svn.apache.org/repos/asf/lucene/dev/trunk/lucene/CHANGES.txt • Performance Improvements in many areas • Bytes instead of Strings – Better Memory savings • Phrase scoring • Packed Ints • Analysis Contributions • Many new languages/dialects supported: Hindi, Indic, Arabic, Armenian, Persian, Indonesian, etc. on top of support for English, most European languages, Chinese, Japanese, Korean
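The "Packed Ints" item refers to storing small integers in only as many bits as they need. The sketch below illustrates that general idea with a single Python integer as the bit buffer; it is not Lucene's implementation, and the function names are hypothetical. It assumes a non-empty list of non-negative values.

```python
def bits_required(max_value):
    """Number of bits needed to represent max_value (at least 1)."""
    return max(1, max_value.bit_length())


def pack(values):
    """Pack non-negative ints into one integer using a fixed width per value."""
    bits = bits_required(max(values))
    packed = 0
    for i, value in enumerate(values):
        # Slot i occupies bits [i * bits, (i + 1) * bits).
        packed |= value << (i * bits)
    return packed, bits


def unpack(packed, bits, count):
    """Recover the original values from the packed integer."""
    mask = (1 << bits) - 1
    return [(packed >> (i * bits)) & mask for i in range(count)]


if __name__ == "__main__":
    values = [3, 0, 7, 5]
    packed, bits = pack(values)
    print(bits, unpack(packed, bits, len(values)))
```

With values no larger than 7, each entry needs 3 bits instead of a full 32- or 64-bit slot, which is where the memory saving comes from.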

  11. Lucene 3.1 and Beyond • Expert Level • Flex APIs • Different codecs for the index • Total control over what is in the index • Pluggable scoring models • (Near) Real Time Search • Make newly indexed documents instantly available for search • See https://svn.apache.org/repos/asf/lucene/dev/branches/realtime_search/ • Much, much more

  12. Apache Solr 3.1 and Beyond • http://svn.apache.org/repos/asf/lucene/dev/trunk/solr/CHANGES.txt • Solr Cloud • Make it easy to deploy and manage truly large scale search applications • 10B+ (100B? 1T?) docs with subsecond search/faceting • See http://wiki.apache.org/solr/SolrCloud • (Near) Real Time Search

  13. Apache Solr 3.1 and Beyond • Spatial Search • “Find me all the Lulu authors that live within 50 miles of HQ” • Boost, sort, filter documents by distance and other spatial information • http://wiki.apache.org/solr/SpatialSearch • http://www.openstreetmap.org/?lat=44.9744&lon=-93.2484&zoom=14&layers=B000FTFT
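The "within 50 miles" query on this slide comes down to a distance filter followed by a sort on distance. The sketch below shows that logic in plain Python with the haversine great-circle formula; it is an illustration of the idea, not how Solr computes or indexes distances. The center point is taken from the map link on the slide purely as a sample coordinate, and the author records are made up.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an approximation


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def within_radius(docs, center, radius_miles):
    """Filter docs to those inside the radius, sorted nearest first."""
    scored = [(haversine_miles(center[0], center[1], d["lat"], d["lon"]), d["name"]) for d in docs]
    return sorted((dist, name) for dist, name in scored if dist <= radius_miles)


if __name__ == "__main__":
    center = (44.9744, -93.2484)  # sample point from the slide's map link
    authors = [  # hypothetical records
        {"name": "near", "lat": 44.9537, "lon": -93.0900},
        {"name": "far", "lat": 41.8781, "lon": -87.6298},
    ]
    for dist, name in within_radius(authors, center, 50):
        print(f"{name}: {dist:.1f} miles")
```

The same distance value can drive filtering, sorting, or boosting, which matches the three uses listed on the slide.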

  14. Solr 3.1 and Beyond • Group By/Field Collapsing • http://wiki.apache.org/solr/FieldCollapsing • Roll up results that have a common “token” • Examples: • All documents from the same URL • All documents by the same author that match • All documents in the same price range • Auto-suggest • Pivoted Faceting
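Field collapsing rolls up hits that share a value in some field so that one source cannot dominate the result page. The sketch below shows the idea on an in-memory list: sort by score, then keep at most `limit` documents per group. It assumes each document is a dict with a `score` key; the function name and sample data are hypothetical, and this is not Solr's implementation.

```python
from collections import OrderedDict


def collapse(results, field, limit=1):
    """Group results by `field`, keeping the top `limit` docs per group by score."""
    groups = OrderedDict()
    # Visit best-scoring docs first so groups appear in order of their top hit.
    for doc in sorted(results, key=lambda d: d["score"], reverse=True):
        bucket = groups.setdefault(doc[field], [])
        if len(bucket) < limit:
            bucket.append(doc)
    return groups


if __name__ == "__main__":
    hits = [  # hypothetical search results
        {"id": 1, "author": "smith", "score": 0.9},
        {"id": 2, "author": "smith", "score": 0.8},
        {"id": 3, "author": "jones", "score": 0.7},
    ]
    for author, docs in collapse(hits, "author").items():
        print(author, [d["id"] for d in docs])
```

Grouping on a URL, author, or price-range field corresponds to the three examples listed on the slide.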

  15. Resources • http://lucene.apache.org • /solr • /java • http://www.lucidimagination.com • solr-user@lucene.apache.org • java-user@lucene.apache.org
