Introduction to Apache Lucene/Solr





CSCI 572: Information Retrieval and Search Engines

Summer 2010

Outline
  • What is Lucene/Solr?
  • Where did it come from?
  • What are the current versions of Lucene/Solr?
  • What can it do?
Apache Lucene
  • The brainchild of Doug Cutting
  • Free-text indexing library that implements most of the functionality I’ve talked to you about
    • Query Models, Ranking, Indexing
  • Core API is implemented in Java
    • C/C++, Ruby, and Python ports exist as well, but they have small communities or are automatically generated
  • Initially hosted on SourceForge; moved to Apache in 2001
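
To make "query models, ranking, indexing" concrete, here is a minimal sketch of an inverted index in plain Java. It is a toy written for this illustration, not Lucene's API or its scoring model: the class name `ToyIndex` and its methods are hypothetical, and ranking is simply the summed term frequency of the query terms.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Toy in-memory inverted index; illustrative only, not Lucene's API. */
public class ToyIndex {
    // term -> (document id -> number of occurrences of the term in that document)
    private final Map<String, Map<Integer, Integer>> postings = new HashMap<>();
    private final List<String> docs = new ArrayList<>();

    /** Lowercases the text and splits it on runs of non-alphanumeric characters. */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) tokens.add(t);
        }
        return tokens;
    }

    /** Indexing: adds a document and returns its id. */
    public int add(String text) {
        int id = docs.size();
        docs.add(text);
        for (String term : tokenize(text)) {
            postings.computeIfAbsent(term, k -> new HashMap<>()).merge(id, 1, Integer::sum);
        }
        return id;
    }

    /** Querying and ranking: ids of documents matching any query term, best score first. */
    public List<Integer> search(String query) {
        Map<Integer, Integer> scores = new HashMap<>();
        for (String term : tokenize(query)) {
            Map<Integer, Integer> hits = postings.get(term);
            if (hits == null) continue;
            for (Map.Entry<Integer, Integer> e : hits.entrySet()) {
                scores.merge(e.getKey(), e.getValue(), Integer::sum);
            }
        }
        List<Integer> ranked = new ArrayList<>(scores.keySet());
        // Sort by descending score, breaking ties by ascending document id.
        ranked.sort((a, b) -> {
            int byScore = Integer.compare(scores.get(b), scores.get(a));
            return byScore != 0 ? byScore : Integer.compare(a, b);
        });
        return ranked;
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        index.add("Lucene is a search library");
        index.add("Solr is a search server built on Lucene, and Lucene is written in Java");
        index.add("Tomcat is an application server");
        System.out.println(index.search("lucene")); // prints [1, 0]
    }
}
```

Document 1 ranks first because it mentions the term twice. A real library such as Lucene adds analyzers, on-disk index segments, and a much richer scoring function on top of this basic structure.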
Apache Solr
  • Originally developed at CNET
  • Web service layer built on top of the Lucene library
  • Provides a schema and an understanding of field types, including conversion to and from their representation
  • Provides large-scale scalability; deployed on top of an application server such as Tomcat or Jetty
  • Programming-language-independent APIs
  • Sharding, replication, faceting, highlighting, explain, MoreLikeThis, and other functionality provided easily
How to get started
  • Lucene (2.9.2 and 3.0.1 stable)
    • Put your Java hat on
    • Have Eclipse ready or your favorite IDE
    • Download lucene-core-<version>.jar from
      • http://repo1.maven.org/maven2/org/apache/lucene/
    • Download src and build from
      • http://www.apache.org/dyn/closer.cgi/lucene/java/
    • Check out some example Java code that demonstrates indexing and querying from Otis Gospodnetic
      • http://onjava.com/pub/a/onjava/2003/01/15/lucene.html
How to get started
  • Solr
    • Grab a release of Solr (1.4.0 stable)
      • http://www.apache.org/dyn/closer.cgi/lucene/solr/
    • Unpack into e.g., /usr/local/solr
    • Deploy onto Tomcat
      • Install Tomcat into /usr/local/tomcat
      • Create a solr.xml context file and drop it into /usr/local/tomcat/conf/Catalina/localhost/
        • In it, define the solr/home JNDI property and point it to /usr/local/solr/solr
      • Start Tomcat
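    • Head over to $solr/example/exampledocs and post a document (8983 is the port of Solr's bundled Jetty example; a default Tomcat install listens on 8080)
      • curl http://localhost:8983/solr/update -H 'Content-type:text/xml; charset=utf-8' --data-binary @artists.xml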
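
The curl call above can also be made from Java. The following is a minimal sketch using only the JDK's `HttpURLConnection`, assuming a Solr instance is listening at the same URL as in the curl example. The class name `SolrPost`, the field names `id` and `name`, and their values are hypothetical; the `<add><doc><field …>` message is Solr's XML update format.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

public class SolrPost {
    /** Escapes the characters that are special in XML text and attribute values. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    /** Builds a Solr <add> message containing a single document. */
    static String addDocXml(Map<String, String> fields) {
        StringBuilder xml = new StringBuilder("<add><doc>");
        for (Map.Entry<String, String> f : fields.entrySet()) {
            xml.append("<field name=\"").append(escape(f.getKey())).append("\">")
               .append(escape(f.getValue())).append("</field>");
        }
        return xml.append("</doc></add>").toString();
    }

    /** POSTs an XML message to the update handler and returns the HTTP status code. */
    static int post(String updateUrl, String xml) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(updateUrl).openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-type", "text/xml; charset=utf-8");
        try (OutputStream out = conn.getOutputStream()) {
            out.write(xml.getBytes(StandardCharsets.UTF_8));
        }
        return conn.getResponseCode();
    }

    public static void main(String[] args) throws IOException {
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("id", "artist-1");            // hypothetical field names and values
        doc.put("name", "Simon & Garfunkel");
        String url = "http://localhost:8983/solr/update"; // same URL as the curl example
        System.out.println(post(url, addDocXml(doc)));
        System.out.println(post(url, "<commit/>")); // a commit makes the document searchable
    }
}
```

Note that added documents only become visible to searches after a commit, which is why the sketch sends `<commit/>` as a second request.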
Modifying your schema.xml
  • Field Types
  • Analyzers
  • Tokenizers

http://wiki.apache.org/solr/SchemaXml
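
In schema.xml, a field type names an analyzer, and an analyzer is a tokenizer followed by a chain of token filters. The sketch below mimics that pipeline in plain Java so the roles are easier to see. It is a toy for illustration, not Solr's or Lucene's analysis API; the class `ToyAnalyzer`, its methods, and the stop-word list are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Toy analysis chain: whitespace tokenizer, then lowercase and stop-word filters. */
public class ToyAnalyzer {
    // Hypothetical stop-word list, chosen only for this example.
    private static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("a", "an", "the", "of"));

    /** Tokenizer step: split the raw field value on whitespace. */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.trim().split("\\s+")) {
            if (!t.isEmpty()) tokens.add(t);
        }
        return tokens;
    }

    /** Filter step 1: lowercase every token. */
    static List<String> lowerCase(List<String> tokens) {
        List<String> out = new ArrayList<>();
        for (String t : tokens) out.add(t.toLowerCase(Locale.ROOT));
        return out;
    }

    /** Filter step 2: drop stop words. */
    static List<String> removeStopWords(List<String> tokens) {
        List<String> out = new ArrayList<>();
        for (String t : tokens) {
            if (!STOP_WORDS.contains(t)) out.add(t);
        }
        return out;
    }

    /** The analyzer: the tokenizer followed by the filters, in order. */
    static List<String> analyze(String text) {
        return removeStopWords(lowerCase(tokenize(text)));
    }

    public static void main(String[] args) {
        System.out.println(analyze("The Art of Search")); // prints [art, search]
    }
}
```

The order of the filters matters here: the stop-word filter only matches "The" because the lowercase filter ran first.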

Solr Faceting
  • facet=on&facet.field=&facet.field=…
  • http://wiki.apache.org/solr/SimpleFacetParameters
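
A facet request is an ordinary search URL with `facet=on` and one `facet.field` parameter per field to count. Below is a minimal sketch of building such a URL in Java. The class `FacetQuery`, the base URL, and the field names `genre` and `country` are assumptions made for the example; `/select` is the standard search handler path.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class FacetQuery {
    /** URL-encodes a parameter value as UTF-8. */
    static String enc(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported by the JVM
        }
    }

    /** Builds a search URL that asks for facet counts on each of the given fields. */
    static String build(String baseUrl, String query, String... facetFields) {
        StringBuilder url = new StringBuilder(baseUrl)
                .append("/select?q=").append(enc(query))
                .append("&facet=on");
        for (String field : facetFields) {
            url.append("&facet.field=").append(enc(field)); // repeated once per field
        }
        return url.toString();
    }

    public static void main(String[] args) {
        // Hypothetical field names; use fields defined in your own schema.xml.
        System.out.println(build("http://localhost:8983/solr", "*:*", "genre", "country"));
    }
}
```

The response then carries, alongside the normal results, a count of matching documents for each value of each faceted field.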
Advanced Topics
  • Standing up cores
  • Sharding
  • Replication
  • ZooKeeper and SolrCloud
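
As a rough illustration of sharding: documents are split across several Solr instances at index time, and a distributed query lists those instances in the `shards` request parameter. The sketch below assumes the client does the splitting itself by hashing the document id, which is one simple scheme among several. The class `ShardRouter` and the shard addresses are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

public class ShardRouter {
    private final List<String> shards;

    public ShardRouter(List<String> shards) {
        if (shards.isEmpty()) throw new IllegalArgumentException("at least one shard required");
        this.shards = shards;
    }

    /** Index time: picks the shard for a document by hashing its id. */
    public String shardFor(String docId) {
        // floorMod keeps the index non-negative even when hashCode() is negative.
        return shards.get(Math.floorMod(docId.hashCode(), shards.size()));
    }

    /** Query time: value of the shards parameter, a comma-separated list of all shards. */
    public String shardsParam() {
        return String.join(",", shards);
    }

    public static void main(String[] args) {
        // Hypothetical shard addresses in host:port/path form.
        ShardRouter router = new ShardRouter(
                Arrays.asList("localhost:8983/solr", "localhost:7574/solr"));
        System.out.println("index artist-1 on " + router.shardFor("artist-1"));
        System.out.println("query with shards=" + router.shardsParam());
    }
}
```

Hashing the id keeps each document on exactly one shard, but changing the number of shards moves most documents, so the shard count is best fixed up front under this scheme.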
Development currently in flux
  • Stick with release versions
  • Depending on trunk won’t really help
  • Lucene and Solr have merged
Wrap-up
  • Lots more information at
    • http://lucene.apache.org
    • http://lucene.apache.org/solr/
    • http://lucene.apache.org/java/
  • Possible projects
    • Geospatial search
      • Improving existing code and contributing back to Apache SIS and to Apache Solr
    • Improving date faceting
    • Rewriting the ResponseWriter framework
Acknowledgements
  • Material inspired by discussions and talks on the Apache mailing lists for Solr and Lucene, and by discussions with the rest of the Lucene community