Tools for Text Indexing and SearchING

PeWe 2011. Tools for Text Indexing and SearchING. Dušan Zeleník, FIIT STU, zelenik@fiit.stuba.sk

Searching using SQL LIKE
  • CREATE INDEX names_index ON heroes(name)
  • SELECT name FROM heroes WHERE name LIKE 'zelen%'
    • will use names_index, ok
  • SELECT name FROM heroes WHERE name LIKE '%ik'
    • won’t use names_index (seriously don’t do that)
  • CREATE FULLTEXT INDEX names_fullindex ON heroes(name)
  • SELECT name FROM heroes WHERE MATCH(name) AGAINST('%ik')
    • will use names_fullindex
  • SELECT name FROM heroes WHERE MATCH(name) AGAINST('ze%ik')
    • won’t use names_fullindex (seriously don’t do that)
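Why a prefix pattern can use an ordinary index while a suffix pattern cannot may be easier to see in a small sketch. Here a sorted Ruby array stands in for the index on heroes(name); the sample names and the helper methods prefix_search and suffix_search are made up for illustration and are not how any particular database implements LIKE.

```ruby
# A sorted array stands in for an ordered index on heroes(name).
# The names are hypothetical sample data.
names_index = %w[batman simko superman zelenik zelenka zorro].sort

# LIKE 'zelen%': binary-search to the first candidate, then read
# neighbouring entries until the prefix stops matching.
def prefix_search(index, prefix)
  start = index.bsearch_index { |name| name >= prefix }
  return [] if start.nil?

  index.drop(start).take_while { |name| name.start_with?(prefix) }
end

# LIKE '%ik': the sort order says nothing about suffixes,
# so every entry has to be inspected (a full scan).
def suffix_search(index, suffix)
  index.select { |name| name.end_with?(suffix) }
end

p prefix_search(names_index, "zelen") # => ["zelenik", "zelenka"]
p suffix_search(names_index, "ik")    # => ["zelenik"]
```

The prefix search touches only a handful of entries however large the index grows, while the suffix search always touches all of them.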
Search Engines for TEXT
  • Lucene
    • Lucene Core - Java (library)
      • Ferret …
    • Solr - Java (standalone server)
      • Sunspot …
    • ElasticSearch - Lucene Core
      • Tire …
  • Sphinx – C++
    • Thinking Sphinx
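All of the engines above are built around an inverted index: a map from each term to the documents that contain it. The following is only a toy sketch of that idea in plain Ruby, with made-up documents and hypothetical helper names (build_inverted_index, search_all); real engines add scoring, compression and much more.

```ruby
# Build a map of term => list of document ids containing that term.
def build_inverted_index(docs)
  index = Hash.new { |hash, term| hash[term] = [] }
  docs.each do |id, text|
    text.downcase.scan(/[a-z0-9]+/).uniq.each { |term| index[term] << id }
  end
  index
end

# Answer a query by intersecting the id lists of all query terms.
def search_all(index, query)
  terms = query.downcase.scan(/[a-z0-9]+/)
  return [] if terms.empty?

  terms.map { |term| index.fetch(term, []) }.reduce(:&)
end

# Hypothetical sample documents keyed by id.
docs = {
  1 => "Hero with abnormally gigant muscles",
  2 => "Sidekick with a gigant ego",
  3 => "Hero without muscles"
}
index = build_inverted_index(docs)

p search_all(index, "gigant muscles") # => [1]
p search_all(index, "hero")           # => [1, 3]
```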
Sphinx
  • Standalone server (http://sphinxsearch.com/)
  • Thinking Sphinx (Rails Gem – MVC)
    • http://freelancing-god.github.com/
    • works directly with DB and Sphinx server
Thinking Sphinx

    class Hero < ActiveRecord::Base
      define_index do
        # full-text fields
        indexes description, :sortable => true
        indexes sidekick(:name), :as => :sidekick, :sortable => true
        # attributes for filtering and sorting
        has sidekick, summoned_at, died_at
      end
    end

    Hero.search "zelenik"

    Hero.search :conditions => {:sidekick => "simko"},
                :match_mode => :any,  # (:all, :any, :phrase, :boolean)
                :order      => :died_at
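The match modes listed in the comment differ in how the query words must appear in a document. A rough plain-Ruby sketch of the :all, :any and :phrase semantics follows; the matches? helper is hypothetical, ignores :boolean, and only mimics the behaviour rather than calling Sphinx.

```ruby
# Split text into lowercase word tokens.
def tokens(text)
  text.downcase.scan(/[a-z0-9]+/)
end

# Approximate the matching rule of each mode:
#   :all    every query word occurs somewhere in the document
#   :any    at least one query word occurs
#   :phrase the query words occur next to each other, in order
def matches?(document, query, mode)
  doc = tokens(document)
  words = tokens(query)
  return false if words.empty?

  case mode
  when :all    then words.all? { |word| doc.include?(word) }
  when :any    then words.any? { |word| doc.include?(word) }
  when :phrase then doc.each_cons(words.size).include?(words)
  else raise ArgumentError, "unsupported mode: #{mode}"
  end
end

description = "The hero has abnormally gigant muscles"
p matches?(description, "gigant hero", :all)       # => true
p matches?(description, "gigant hero", :phrase)    # => false
p matches?(description, "gigant muscles", :phrase) # => true
p matches?(description, "gigant villain", :any)    # => true
```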

Thinking Sphinx

Excerpts

  • heroes = Hero.search "gigant"
  • heroes.excerpts.description
    • … has abnormally gigant muscles ….

Facets

  • indexes sidekick.name, :as => :sidekick, :facet => true

Geolocation

  • has "RADIANS(latitude)", :as => :latitude, :type => :float
  • has "RADIANS(longitude)", :as => :longitude, :type => :float
  • Place.search "zelenik",
      :geo => [@lat, @lng],
      :with => {"@geodist" => 0.0..10_000.0}
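The RADIANS(...) conversion is there because Sphinx computes geo distance from coordinates in radians and reports @geodist in metres, so the range above means "within 10 km". As a rough illustration of what such a distance filter does, here is a haversine sketch in plain Ruby; the helper names, the spherical-Earth radius and the coordinates are assumptions for the example, not Sphinx internals.

```ruby
# Convert degrees to radians, mirroring RADIANS(latitude) in the index.
def radians(degrees)
  degrees * Math::PI / 180.0
end

# Great-circle distance in metres between two points given in radians
# (haversine formula on a sphere of assumed radius 6,371 km).
def geodist(lat1, lng1, lat2, lng2)
  earth_radius_m = 6_371_000.0
  dlat = lat2 - lat1
  dlng = lng2 - lng1
  a = Math.sin(dlat / 2)**2 +
      Math.cos(lat1) * Math.cos(lat2) * Math.sin(dlng / 2)**2
  2 * earth_radius_m * Math.asin(Math.sqrt(a))
end

# Hypothetical coordinates: a search origin and a place 0.05 degrees north.
origin = [radians(48.15), radians(17.11)]
place  = [radians(48.20), radians(17.11)]

distance = geodist(*origin, *place)
puts distance.round                        # metres, roughly 5.6 km
puts (0.0..10_000.0).cover?(distance)      # true: inside the 10 km filter
```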

Solr
  • Standalone server (http://lucene.apache.org/solr/)
  • Sunspot (Rails Gem)
    • http://outoftime.github.com/sunspot/
    • communicates with DB and Solr server
Sunspot

    Hero.search do
      fulltext 'muscles'
      with(:died_at).less_than Time.now
      order_by :summoned_at, :desc
      paginate :page => 2, :per_page => 15
      facet :sidekick
    end

    class Hero < ActiveRecord::Base
      searchable do
        text :description
        string :sidekick do
          sidekick.name
        end
        time :summoned_at
        time :died_at
      end
    end

Sunspot

  • DSL
  • Solr highlighting
  • Class hierarchy
  • Facets
  • Geographical searches
  • WillPaginate support
  • Lucene analyzers (tokenizers, filters …)
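An analyzer is essentially a tokenizer followed by a chain of token filters, applied both at indexing time and to the query. The sketch below imitates that pipeline in plain Ruby; the method names and the tiny stop-word list are invented for illustration and do not correspond to actual Lucene classes.

```ruby
# Tokenizer: split raw text into word tokens.
def tokenize(text)
  text.scan(/[[:alnum:]]+/)
end

# Filter 1: normalise case so "Hero" and "hero" become the same term.
def lowercase(terms)
  terms.map(&:downcase)
end

# Filter 2: drop very common words (hypothetical, tiny stop-word list).
def remove_stop_words(terms, stop_words = %w[a an and has of the with])
  terms.reject { |term| stop_words.include?(term) }
end

# The analyzer chains the tokenizer and the filters in order.
def analyze(text)
  remove_stop_words(lowercase(tokenize(text)))
end

p analyze("The Hero has abnormally gigant muscles")
# => ["hero", "abnormally", "gigant", "muscles"]
```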

ElasticSearch
  • Standalone server based on Lucene
    • (http://www.elasticsearch.org/)
  • Tire (Rails Gem), better than nothing
    • https://github.com/karmi/tire
    • communicates with DB and ElasticSearch server
Tire

    class Hero < ActiveRecord::Base
      include Tire::Model::Search
      include Tire::Model::Callbacks

      mapping do
        indexes :description, :type => 'string', :analyzer => 'snowball'
        indexes :name, :type => 'string'
        # timestamps are mapped with the 'date' type
        indexes :died_at, :type => 'date'
        indexes :summoned_at, :type => 'date'
      end
    end

    Hero.search 'muscles'

ElasticSearch

  • ADVANTAGES OF SOLR
  • REST
  • DISTRIBUTED!!!
  • http://www.youtube.com/watch?v=l4ReamjCxHo
  • For instance, Hadoop …
  • http://www.elasticsearch.org/guide/reference/modules/gateway/hadoop.html
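Because the interface is REST, a search is just an HTTP request carrying a JSON body, so no special client library is strictly required. The sketch below only builds such a request with Ruby's standard library and does not send it; the local address with port 9200, the heroes index name and the build_search_request helper are assumptions for the example.

```ruby
require "json"
require "net/http"
require "uri"

# Build (but do not send) a search request for the given index.
# The body uses a query_string query with the user's text.
def build_search_request(base_url, index, text)
  uri = URI.join(base_url, "/#{index}/_search")
  request = Net::HTTP::Post.new(uri)
  request["Content-Type"] = "application/json"
  request.body = JSON.generate(query: { query_string: { query: text } })
  [uri, request]
end

# Hypothetical server address and index name.
uri, request = build_search_request("http://localhost:9200", "heroes", "muscles")

puts "#{request.method} #{uri}"
puts request.body

# With a server running, the request could be sent like this:
# response = Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
```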
