
Amazon CloudSearch Meetup August 15, 2012

Amazon CloudSearch Meetup, August 15, 2012. Agenda: Introduction to CloudSearch (Jon Handler, CloudSearch Solutions Architect); Relevance and Ranking (Jack Conradson, Software Engineer); Case Study: Reddit (Keith Mitchell, Programmer).



Presentation Transcript


  1. Amazon CloudSearch Meetup, August 15, 2012

  2. Welcome • Housekeeping • Slides will be posted • Drawing

  3. Agenda • Introduction to CloudSearch • Jon Handler, CloudSearch Solutions Architect • Relevance and Ranking • Jack Conradson, Software Engineer • Case Study: Reddit • Keith Mitchell, Programmer • Q&A

  4. Introduction to CloudSearch

  5. Introduction to Search

  6. Inverted Index US President
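To make the inverted-index idea concrete, here is a minimal Python sketch. It is an illustration only, not how CloudSearch is implemented: the three sample documents and the function names `build_inverted_index` and `search_all` are made up for this example, and tokenization is a plain whitespace split.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercased term to the sorted ids of the documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


def search_all(index, query):
    """Return ids of documents that contain every query term (boolean AND)."""
    postings = [set(index.get(term, ())) for term in query.lower().split()]
    if not postings:
        return []
    return sorted(set.intersection(*postings))


# Hypothetical sample documents, keyed by document id.
docs = {
    1: "US President signs bill",
    2: "President of the company resigns",
    3: "US economy grows",
}
index = build_inverted_index(docs)
matches = search_all(index, "US President")
```

Looking up a term is a dictionary access rather than a scan of every document, which is the point of inverting the document-to-terms relationship.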

  7. Search On The Web • Relevance/Ranking • Faceting • Range Searching • Fielded Searching • Boolean Queries • Complex Relevance

  14. Amazon CloudSearch

  15. Amazon CloudSearch • Fully-managed, full-featured search service • Automatically scales for data & traffic • Handles both structured and unstructured data • Near real-time indexing • Up and running in less than 1 hour

  16. Amazon CloudSearch Architecture • Search client (www.example.com): sends search requests, receives search results • Search developer: sends documents, creates and manages domains, uses the search tester • Search endpoint (Search API, Console), backed by the Search Service: search documents • Document service endpoint (Document Service API, Command Line Tools, Console), backed by the Document Service: add, update, and delete documents • Configuration service endpoint (Configuration API, Command Line Tools, Console), backed by the Configuration Service: create, configure, and delete domains • Each service sits behind its own access control

  17. Automatic Scaling: Data & Traffic • Data (document quantity and size): the index is split into partitions 1 to n, each on its own search instance • Traffic (search request volume and complexity): every partition is replicated as copies 1 to n, each copy on its own search instance
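The partition-and-copy layout on this slide can be read as simple arithmetic: data size drives the number of partitions, traffic drives the number of copies, and the instance count is their product. The sketch below assumes exactly that. The function name and the per-instance capacity figures are hypothetical; the slides do not state CloudSearch's actual scaling thresholds.

```python
import math


def plan_search_instances(index_size_gb, queries_per_second,
                          gb_per_instance, qps_per_copy):
    """Estimate partitions, copies, and total instances.

    gb_per_instance and qps_per_copy are assumed capacities, not
    published CloudSearch limits.
    """
    # More data than one instance can hold means more index partitions.
    partitions = max(1, math.ceil(index_size_gb / gb_per_instance))
    # More traffic than one copy can serve means more copies of every partition.
    copies = max(1, math.ceil(queries_per_second / qps_per_copy))
    return {
        "partitions": partitions,
        "copies": copies,
        "instances": partitions * copies,
    }


plan = plan_search_instances(index_size_gb=25, queries_per_second=300,
                             gb_per_instance=10, qps_per_copy=100)
```

With these made-up capacities, 25 GB of index needs 3 partitions and 300 queries per second need 3 copies, so the grid holds 9 search instances.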

  18. Example: Build Your Playlist

  19. Use Case • Million song dataset http://labrosa.ee.columbia.edu/millionsong/ • Search documents are songs • Attributes: title, artist names, years, genre, artist familiarity • We’ll use this to create a “Build Your Playlist” web application.

  20. Demo

  21. SDF Documents [ {"type":"add", "id": "sombzze12a8c134960", "version":5, "lang":"en", "fields": {"title":"Cajun Twisters", "artist_name":"Adam Ant", "year":"1993", "song_id":"sombzze12a8c134960", "artist_familiarity":449425, "genre":["alternative", "electronic", "instrumental", "rock"] } }, … ]
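An SDF batch like the one above is a JSON array of operations, so it can be assembled with ordinary data structures. The following Python sketch builds one add operation from the slide's sample song plus one delete operation. The helper names are invented for this example, and the delete shape (type, id, version) reflects the 2011-02-01 document format as I understand it; check the SDF documentation before relying on it.

```python
import json


def make_sdf_add(doc_id, version, fields, lang="en"):
    """Build one SDF 'add' operation as a plain dict."""
    return {"type": "add", "id": doc_id, "version": version,
            "lang": lang, "fields": fields}


def make_sdf_delete(doc_id, version):
    """Build one SDF 'delete' operation (assumed shape: type, id, version)."""
    return {"type": "delete", "id": doc_id, "version": version}


def make_batch(operations):
    """Serialize a list of operations into a JSON batch string."""
    return json.dumps(operations)


song = {
    "title": "Cajun Twisters",
    "artist_name": "Adam Ant",
    "year": "1993",
    "song_id": "sombzze12a8c134960",
    "artist_familiarity": 449425,
    "genre": ["alternative", "electronic", "instrumental", "rock"],
}
batch = make_batch([
    make_sdf_add("sombzze12a8c134960", 5, song),
    make_sdf_delete("sombzze12a8c134960", 6),  # version 6 is illustrative
])
```

The delete uses a higher version than the add so that it supersedes the earlier operation on the same document id.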

  22. Configuration • cs-configure-from-sdf • Analyzes source files for fields and types (heuristic) • Individually

  23. Upload Documents

  24. PHP Integration $results = file_get_contents( "http://search-mn-songs-5bbplyghbb5tk257rsb7iamlsy." . "us-east-1.cloudsearch.amazonaws.com" . "/2011-02-01/search?q=" . $keyword . $bqParam . "&return-fields=title,artist_name,year,genre_result,artist_familiarity&" . "facet=year_facet,genre&" . "facet-year_facet-sort=alpha&" . "facet-genre-sort=alpha&" . "facet-genre-top-n=100000&" . "facet-year_facet-top-n=100000&" . "t-year=1985..&" . "t-title=a..&" . "rank=-" . $rank); $resultsObj = json_decode($results);
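For readers not working in PHP, the same request URL can be assembled in Python. This is a sketch under a few assumptions: `build_search_url` is a made-up helper, the endpoint and parameters are copied from the slide, and the `$bqParam` fragment is left out because the slide does not show what it contains. Unlike the PHP string concatenation, `urlencode` escapes the keyword, so spaces and punctuation in user input do not break the query string.

```python
from urllib.parse import urlencode

# Search endpoint as shown on the slide.
SEARCH_ENDPOINT = ("http://search-mn-songs-5bbplyghbb5tk257rsb7iamlsy"
                   ".us-east-1.cloudsearch.amazonaws.com")


def build_search_url(keyword, rank, endpoint=SEARCH_ENDPOINT):
    """Build the 2011-02-01 search URL with the slide's facet and threshold settings."""
    params = [
        ("q", keyword),
        ("return-fields", "title,artist_name,year,genre_result,artist_familiarity"),
        ("facet", "year_facet,genre"),
        ("facet-year_facet-sort", "alpha"),
        ("facet-genre-sort", "alpha"),
        ("facet-genre-top-n", "100000"),
        ("facet-year_facet-top-n", "100000"),
        ("t-year", "1985.."),
        ("t-title", "a.."),
        ("rank", "-" + rank),  # leading "-" sorts descending, as on the slide
    ]
    return endpoint + "/2011-02-01/search?" + urlencode(params)


url = build_search_url("guns roses", "artist_familiarity")
```

The resulting string can be passed to any HTTP client, and the JSON response decoded much as `json_decode` does in the PHP version.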

  25. Common Feature Requests • Field Weighted Relevance • Additional Regions and Languages • High Availability • Tighter integration with other AWS services (Dynamo/S3) • Support For Very Large Use Cases • Geo Sorting

  26. Field-Weighted Values

  27. Field Weights Use Case • Music Search • Dataset composed of the following fields: • Title • Album • Artist • Lyrics • Popularity • Results without field weights • May end up with results based heavily on lyrics when searching for an artist’s name (Guns & Roses vs. roses, guns) • Results with field weights • Possibly apply a greater weight to artist than lyrics

  28. FWV in Rank Expressions • Rank expressions customize relevance computations in CloudSearch so that searches return better results. • song_relevance = text_relevance + popularity • It is natural to extend rank expressions to allow field-weighted values (FWV) using JSON objects. • song_relevance = cs.text_relevance({weights: {artist=3.0, song=4.0}, default_weight=0.5}) + 0.5*popularity
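One way to read the proposed expression is as a weighted sum of per-field text scores plus a popularity term. The Python sketch below assumes that reading. The slides do not say how CloudSearch would combine per-field scores, so the weighted sum, the function names, and the sample scores are all assumptions made for illustration.

```python
def weighted_text_relevance(field_scores, weights, default_weight):
    """Sum per-field text scores, scaling each by its field weight.

    Fields missing from `weights` fall back to `default_weight`.
    """
    return sum(weights.get(field, default_weight) * score
               for field, score in field_scores.items())


def song_relevance(field_scores, popularity):
    """Mirror the slide's expression: weighted text relevance + 0.5 * popularity."""
    text = weighted_text_relevance(
        field_scores,
        weights={"artist": 3.0, "song": 4.0},
        default_weight=0.5,
    )
    return text + 0.5 * popularity


# Hypothetical per-field match scores for one document.
scores = {"artist": 2.0, "song": 1.0, "lyrics": 4.0}
result = song_relevance(scores, popularity=10)
```

Here the lyrics field has the strongest raw match (4.0) but contributes only 2.0 after the default weight, while the weaker artist match contributes 6.0, which is the behavior the music search use case asks for.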

  29. Query-Time Rank Expressions • Each set of defined rank expressions may take a while to be deployed to your search domain. • Query-time rank expressions would allow rank expressions to be defined in the query itself, without waiting for a deployment. • q='guns roses'&rank-qtre=cs.text_relevance({weights: {artist=3.0, song=4.0}, default_weight=0.5})&return-fields=qtre&rank=-qtre

  30. Resources • Amazon CloudSearch Overview Page http://aws.amazon.com/cloudsearch/ • FAQs • Community Forum • Documentation & Getting Started Tutorial (IMDb) • Demos and Tutorials • What Is Amazon CloudSearch • Introducing Amazon CloudSearch (Features) • Building a Search Application Using Amazon CloudSearch • Getting Started Tutorial

  31. Upcoming Events • Las Vegas, November 27-29 • Enterprise Search Summit/KMWorld, DC, Oct. 17-19 • Bay Area Amazon CloudSearch Group: Oct. 24

  32. Q&A

  33. Thank You
