CrowdSearch: Exploiting Crowds for Accurate Real-Time Image Search on Mobile Phones

Presentation Transcript

  1. CrowdSearch: Exploiting Crowds for Accurate Real-Time Image Search on Mobile Phones Original work by Tingxin Yan, Vikas Kumar, Deepak Ganesan Presented by Ashok Kumar Jonnalagadda

  2. Roadmap • Problem Description • What is “crowdsourcing”? • System Architecture • The CrowdSearch Algorithms • Delay Prediction • Validation Prediction • Experimental Evaluation • Discussion/Criticism • Questions

  3. The Perceived Problem • Text-based search is easy…

  4. The Perceived Problem • Mobile-based search will become more important in the future. • More than 70% of smartphone users perform searches. • Mobile searches are expected to outnumber non-mobile searches soon. • Text-based mobile searches are easy as well… • Issues: • Small form factor and resource limitations. • Typing on a phone is cumbersome. • Scrolling through multiple search results. • Multimedia searches require significant memory, storage, and computing resources. • Mobile: GPS and voice input for search are becoming more commonplace.

  5. Image Search from Mobile? • Images vary in • lighting • texture • type of features • image quality and many other factors. • Even Google Goggles doesn’t work with all categories. • Automated image search has limitations. • Humans are naturally good at distinguishing images.

  6. The Perceived Problem • But how does a mobile phone user search for this? • No visible words/letters; too far away to know the address.

  7. The Perceived Problem • Ways to find out what that building is: • Ask random people on the street • Travel to the building to see the address/sign • Take a picture of the building with your mobile device and send to a search engine… • How easy is image searching on a mobile phone though?

  8. The Perceived Problem • Image search is a non-trivial problem – have to deal with variations in lighting, texture, image quality, etc. • Even when results are returned, scrolling through multiple pages on a mobile device is cumbersome. • Search should be precise and return very few erroneous results. • Multimedia searches require significant • Memory • Storage • Computing resources

  9. The Proposed Solution • CrowdSearch – Attempts to provide an accurate image search system for mobile devices by combining… • Automated image search and • Real-time human validation of search results • Leverages crowdsourcing through Amazon Mechanical Turk (AMT)

  10. The Proposed Solution • Humans are good at comparing images • Could an automated search determine these two images are of the same building? • Crowdsourcing increases search result accuracy.

  11. System Architecture • Three main components: • Mobile Device • Initiates queries • Displays responses • Performs local image processing (maybe) • Remote Server • Performs automated image search • Triggers image validation tasks • Crowdsourcing System (AMT) • Validates image search results

  12. Apple iPhone Mobile Client

  13. System Operation Overview

  14. System Operation Overview

  15. System Operation Overview • How do we minimize delay and cost while maximizing accuracy?

  16. System Architecture

  17. Balancing Tradeoffs • Result delay • Should minimize delay or at least keep it within a user-provided bound • Result accuracy • Strive for high (i.e., ≥ 95%) accuracy • Monetary cost • Low cost is better than high cost • Energy • Should consume minimal battery power

  18. Accuracy Considerations • How many validations are required for 95% accuracy? • Requiring at least three validations out of five achieves ≥ 95% accuracy.
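
The three-out-of-five rule above can be sketched as a small majority check. This is an illustrative sketch rather than the authors' code: the function names are hypothetical, and `majority_accuracy` rests on an assumption not made in the slides, namely that each validator is correct independently with the same probability `p`.

```python
from math import comb

def is_validated(responses, required_yes=3, total=5):
    """Return True if at least `required_yes` of the first `total` responses are 'yes'."""
    return sum(1 for r in responses[:total] if r) >= required_yes

def majority_accuracy(p, required_yes=3, total=5):
    """Probability that at least `required_yes` of `total` validators are correct.

    ASSUMPTION (not from the slides): validators are independent and each is
    correct with the same probability p, so the count is binomial.
    """
    return sum(comb(total, k) * p**k * (1 - p)**(total - k)
               for k in range(required_yes, total + 1))
```

Under that assumption, `majority_accuracy` shows how a majority of five can be more reliable than any single validator whenever `p` is above one half.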

  19. Optimizing Delay • Utilize parallel posting • Post all candidate images to the crowdsourcing system at the same time. • But this approach increases cost! 5 cents + 5 cents + 5 cents + 5 cents = 20 cents

  20. Optimizing Cost • Utilize serial posting • Post top-ranked candidate first, wait for responses, then post next candidate if necessary. • This approach increases delay!
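
The cost side of the parallel/serial tradeoff can be illustrated with a short sketch. The 5-cent price per posted candidate comes from the example on the parallel-posting slide; the function names and the idea of counting "rounds" as a stand-in for delay are assumptions made for illustration only.

```python
def parallel_posting(num_candidates, price_cents=5):
    """All candidates are posted at once: one round of waiting, full cost."""
    return {"cost_cents": num_candidates * price_cents, "rounds": 1}

def serial_posting(validation_outcomes, price_cents=5):
    """Candidates are posted one at a time, in rank order, until one is validated.

    `validation_outcomes` is a list of booleans, one per ranked candidate
    (hypothetical input: True means the crowd validated that candidate).
    """
    posted = 0
    for validated in validation_outcomes:
        posted += 1
        if validated:
            break
    return {"cost_cents": posted * price_cents, "rounds": posted}
```

For four candidates where only the second is a true match, `parallel_posting(4)` costs 20 cents in one round, while `serial_posting([False, True, False, False])` costs 10 cents but takes two rounds.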

  21. CrowdSearch Delay/Cost Optimization • Combine elements of parallel and serial posting • Prediction requires delay and validation models • Goal: want at least one verified result by the deadline.
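
One way to picture the decision described above is the following minimal sketch. It assumes, purely for illustration, that the delay and validation models have already produced a per-candidate probability of being verified by the deadline, and that candidates are independent; neither the function names nor the independence assumption comes from the slides.

```python
def prob_at_least_one(success_probs):
    """Probability that at least one posted candidate is verified by the deadline.

    ASSUMPTION: candidates succeed independently, so the chance that all of
    them miss is the product of the individual miss probabilities.
    """
    miss_all = 1.0
    for p in success_probs:
        miss_all *= (1.0 - p)
    return 1.0 - miss_all

def should_post_next(success_probs, threshold):
    """Post another candidate only if the current chance of success is too low."""
    return prob_at_least_one(success_probs) < threshold
```

With one posted candidate at probability 0.5 and a threshold of 0.6, the sketch would post the next candidate; once two such candidates are posted the combined probability is 0.75 and posting stops, which is how cost stays below that of full parallel posting.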

  22. CrowdSearch Delay/Cost Optimization

  23. Delay Prediction Model • The delay of a single response is the sum of the acceptance delay and the submission delay. • Each of these follows an exponential distribution with an offset. • Thus, the distribution of the overall delay is the convolution of the two delay distributions.
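
As a sketch of what this model implies, the convolution of two offset exponentials has a closed-form cumulative distribution (the standard hypoexponential form, shifted by the two offsets). The parameter names and any values passed in are hypothetical; the slides do not give the fitted rates or offsets.

```python
import math

def total_delay_cdf(t, rate_accept, offset_accept, rate_submit, offset_submit):
    """P(acceptance delay + submission delay <= t).

    Each delay is modeled as offset + Exponential(rate); all parameter values
    are placeholders to be fitted from measured response times.
    """
    s = t - offset_accept - offset_submit  # time left after both fixed offsets
    if s <= 0:
        return 0.0
    if math.isclose(rate_accept, rate_submit):
        # Equal rates: the sum of the two exponentials is Erlang-2.
        lam = rate_accept
        return 1.0 - math.exp(-lam * s) * (1.0 + lam * s)
    a, b = rate_accept, rate_submit
    # Distinct rates: hypoexponential CDF.
    return 1.0 - (b * math.exp(-a * s) - a * math.exp(-b * s)) / (b - a)
```

Evaluating this function at the user's deadline gives the predicted chance that a single response arrives in time, which is the kind of quantity the delay/cost optimization needs.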

  24. Delay Prediction Model Performance

  25. Validation Model • Given a response set S, want to compute probability of positive validation result. • Use training data to set these probabilities • If the probability of a positive result is less than some threshold, send the next candidate to validation. • In this example, if the threshold were set to < 76%, the server would post the next candidate image to AMT.
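
A minimal sketch of such a validation model follows. It assumes the training data consists of complete response sequences (1 for "yes", 0 for "no") and estimates, for every partial sequence, the fraction of training sequences starting that way that ended in a positive majority. The data layout, function names, and the fallback for unseen prefixes are all assumptions for illustration, not details from the slides.

```python
from collections import defaultdict

def build_prefix_model(training_sequences, required_yes=3):
    """Map each response prefix to the empirical probability of a positive result."""
    counts = defaultdict(lambda: [0, 0])  # prefix -> [positive outcomes, total seen]
    for seq in training_sequences:
        positive = sum(seq) >= required_yes
        for k in range(len(seq) + 1):
            entry = counts[tuple(seq[:k])]
            entry[1] += 1
            if positive:
                entry[0] += 1
    return {prefix: pos / total for prefix, (pos, total) in counts.items()}

def needs_next_candidate(model, responses_so_far, threshold):
    """True if the estimated chance of a positive result is below the threshold."""
    prefix = tuple(responses_so_far)
    # Unseen prefixes fall back to the overall positive rate (a design assumption).
    p_positive = model.get(prefix, model[()])
    return p_positive < threshold

# Toy training data, made up for illustration only.
toy_training = [(1, 1, 1, 0, 0), (1, 0, 0, 0, 1), (1, 1, 0, 1, 0), (0, 0, 0, 1, 1)]
toy_model = build_prefix_model(toy_training)
```

With this toy data, a single "yes" gives an estimated probability of 2/3, so a threshold of 0.7 would trigger posting the next candidate, while two consecutive "yes" responses would not.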

  26. Power Considerations • Should some image processing occur on the local device or should it be outsourced to the server? • It depends! • Use remote processing when WiFi is available. • Use local processing when only 3G is available • Extracting features from the query image (Scale-Invariant Feature Transform, SIFT)

  27. Experimental Results • Any of the crowdsourcing schemes leads to better results! • Some types of images are easier for automated searches to handle than others

  28. Experimental Results • CrowdSearch leads to (given a long enough deadline)… • Behavior close to parallel posting for recall • Behavior close to serial posting for search cost

  29. Thoughts/Criticism • The limited nature of the solution • Limitation to the four categories • Buildings • Books • Flowers • Faces • Only 1000 images in the backend database. • Would increasing the number of automated search images increase total task time in a significant way?

  30. Thoughts/Criticism • How useful is this anyway? • Are people willing to go through the trouble to set up a payment account and pay 5-20 cents for a search? • How much effort would it usually take for someone to find out what the object is through traditional means? • Especially for books! • Privacy concerns • People utilizing CrowdSearch must accept the fact that random strangers know what they are looking at and searching for. • Additionally, their GPS information might be provided to the CrowdSearch servers. • What about the privacy of the object of the search? • Undercover police officers

  31. Questions?