CrowdSearch: Exploiting Crowds for Accurate Real-Time Image Search on Mobile Phones Original work by Tingxin Yan, Vikas Kumar, Deepak Ganesan Presented by Ashok Kumar Jonnalagadda
Roadmap • Problem Description • What is “crowdsourcing”? • System Architecture • The CrowdSearch Algorithms • Delay Prediction • Validation Prediction • Experimental Evaluation • Discussion/Criticism • Questions
The Perceived Problem • Text-based search is easy…
The Perceived Problem • Mobile-based search will become more important in the future. • More than 70% of smartphone users perform searches. • Mobile searches are expected to outnumber non-mobile searches soon. • Text-based mobile searches are easy as well… • Issues: • Small form factor and resource limitations. • Typing on a phone is cumbersome. • Scrolling through multiple search results is cumbersome. • Multimedia searches require significant memory, storage, and computing resources. • Mobile: GPS and voice for search are becoming more commonplace.
Image Search from Mobile? • Image variations in • lighting • texture • type of features • image quality and many other factors. • Even Google Goggles doesn’t work with all categories. • Automated image search has limitations in terms of • Humans are naturally good at distinguishing images.
The Perceived Problem • But how does a mobile phone user search for this? • No visible words/letters; too far away to know the address.
The Perceived Problem • Ways to find out what that building is: • Ask random people on the street • Travel to the building to see the address/sign • Take a picture of the building with your mobile device and send to a search engine… • How easy is image searching on a mobile phone though?
The Perceived Problem • Image search is a non-trivial problem – have to deal with variations in lighting, texture, image quality, etc. • Even when results are returned, scrolling through multiple pages on a mobile device is cumbersome. • Search should be precise and return very few erroneous results. • Multimedia searches require significant • Memory • Storage • Computing resources
The Proposed Solution • CrowdSearch – Attempts to provide an accurate image search system for mobile devices by combining… • Automated image search and • Real-time human validation of search results • Leverages crowdsourcing through Amazon Mechanical Turk (AMT)
The Proposed Solution • Humans are good at comparing images • Could an automated search determine these two images are of the same building? • Crowdsourcing increases search result accuracy.
System Architecture • Three main components: • Mobile Device • Initiates queries • Displays responses • Performs local image processing (maybe) • Remote Server • Performs automated image search • Triggers image validation tasks • Crowdsourcing System (AMT) • Validates image search results
System Operation Overview • How do we minimize delay and cost while maximizing accuracy?
Balancing Tradeoffs • Result delay • Should minimize delay or at least keep it within a user-provided bound • Result accuracy • Strive for high (i.e., ≥ 95%) accuracy • Monetary cost • Low cost is better than high cost • Energy • Should consume minimal battery power
Accuracy Considerations • How many validations are required for 95% accuracy? • Requiring at least three validations out of five achieves ≥ 95% accuracy.
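The three-out-of-five rule can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the function names are hypothetical, and `majority_accuracy` assumes that validators answer independently and are each correct with the same probability `p`, which the slides do not state.

```python
from math import comb


def majority_validates(responses, required=3):
    """Return True when at least `required` of the responses are positive."""
    return sum(1 for r in responses if r) >= required


def majority_accuracy(p, n=5, required=3):
    """Probability that at least `required` of `n` validators are correct.

    Assumption (not from the slides): validators are independent and each
    one is correct with the same probability `p`.
    """
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(required, n + 1))


if __name__ == "__main__":
    # Three "yes" answers out of five pass the majority rule.
    print(majority_validates([True, True, False, True, False]))
    # Hypothetical per-validator accuracy of 0.85, chosen only for illustration.
    print(majority_accuracy(0.85))
```

Under the independence assumption, the majority vote is more reliable than any single validator whenever `p` is above 0.5, which is the intuition behind asking for several validations per candidate.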
Optimizing Delay • Utilize parallel posting • Post all candidate images to the crowdsourcing system at the same time. • But this approach increases cost! • 5 cents + 5 cents + 5 cents + 5 cents = 20 cents
Optimizing Cost • Utilize serial posting • Post top-ranked candidate first, wait for responses, then post next candidate if necessary. • This approach increases delay!
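The cost/delay tradeoff between the two posting schemes can be made concrete with a toy comparison. This is a sketch under stated assumptions: the function names are hypothetical, each candidate costs a flat 5 cents (the figure from the parallel-posting slide), delay is counted in posting rounds rather than seconds, and `outcomes` marks whether each candidate would be validated by the crowd.

```python
def parallel_posting(outcomes, cost_per_candidate=5):
    """Post every candidate at once: a single round, but every candidate is paid for."""
    return {
        "cost": len(outcomes) * cost_per_candidate,
        "rounds": 1 if outcomes else 0,
    }


def serial_posting(outcomes, cost_per_candidate=5):
    """Post candidates one at a time, stopping at the first validated one."""
    rounds = 0
    for validated in outcomes:
        rounds += 1
        if validated:
            break
    return {"cost": rounds * cost_per_candidate, "rounds": rounds}


if __name__ == "__main__":
    # Hypothetical ranking in which the second candidate is the correct match.
    outcomes = [False, True, False, False]
    print(parallel_posting(outcomes))
    print(serial_posting(outcomes))
```

With the correct match in second place, serial posting pays for two candidates but waits two rounds, while parallel posting pays for all four and waits one.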
CrowdSearch Delay/Cost Optimization • Combine elements of parallel and serial posting • Prediction requires delay and validation models • Goal: want at least one verified result by the deadline.
Delay Prediction Model • The delay of a single response is the combination of acceptance delay and submission delay. • Both of these follow an exponential distribution with an offset. • Thus, overall delay is the convolution of these delays.
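As a rough sketch of the model above: if acceptance delay and submission delay are independent shifted exponentials with different rates, the convolution has a closed-form CDF. The function and parameter names below are hypothetical, the independence of the two delays is an assumption, and any rates or offsets would have to be fitted from measured AMT data.

```python
import math


def overall_delay_cdf(t, rate_accept, offset_accept, rate_submit, offset_submit):
    """P(acceptance delay + submission delay <= t).

    Each delay is modelled as offset + Exponential(rate). The closed form
    below is the CDF of the sum of two independent exponentials with
    distinct rates, shifted by the combined offset.
    """
    if rate_accept == rate_submit:
        raise ValueError("this closed form needs two different rates")
    s = t - (offset_accept + offset_submit)  # time left after both offsets
    if s <= 0:
        return 0.0
    numerator = rate_submit * math.exp(-rate_accept * s) - rate_accept * math.exp(-rate_submit * s)
    return 1.0 - numerator / (rate_submit - rate_accept)


if __name__ == "__main__":
    # Hypothetical parameters, chosen only for illustration.
    print(overall_delay_cdf(60.0, rate_accept=0.05, offset_accept=5.0,
                            rate_submit=0.1, offset_submit=10.0))
```

Evaluating this CDF at the user's deadline gives the probability that a single response arrives in time, which is the kind of quantity a delay predictor needs.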
Validation Model • Given a response set S, want to compute probability of positive validation result. • Use training data to set these probabilities • If the probability of a positive result is less than some threshold, send the next candidate to validation. • In this example, if the threshold were set to < 76%, the server would post the next candidate image to AMT.
Power Considerations • Should some image processing occur on the local device or should it be outsourced to the server? • It depends! • Use remote processing when WiFi is available. • Use local processing when only 3G is available. • Extracting features from the query image • (Scale-Invariant Feature Transform)
Experimental Results • Any of the crowdsourcing schemes leads to better results! • Some types of images are easier for automated searches to handle than others.
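One simple way to picture the validation model is to estimate the probability from training sequences that share the responses seen so far. This is a minimal sketch, not the authors' implementation: the function names, the string encoding ("Y"/"N"), and the toy training data are all hypothetical, and treating an unseen response set as "post the next candidate" is an assumption.

```python
def positive_probability(partial, training_sequences, required=3):
    """Estimate P(final result is positive | responses received so far).

    `partial` is the response set S received so far, e.g. "YN".
    `training_sequences` are complete five-response sequences, e.g. "YNYYN".
    Returns None when no training sequence starts with `partial`.
    """
    matches = [seq for seq in training_sequences if seq.startswith(partial)]
    if not matches:
        return None
    positives = sum(1 for seq in matches if seq.count("Y") >= required)
    return positives / len(matches)


def should_post_next(partial, training_sequences, threshold):
    """Post the next candidate when a positive result looks unlikely."""
    p = positive_probability(partial, training_sequences)
    # Assumption: with no matching training data, post the next candidate.
    return p is None or p < threshold


if __name__ == "__main__":
    # Made-up toy training data, for illustration only.
    training = ["YYYNN", "YNNNY", "YNYYN", "YNNNN"]
    print(positive_probability("YN", training))
    print(should_post_next("YN", training, threshold=0.5))
```

A lower threshold posts fewer extra candidates and saves money, while a higher threshold posts more of them and improves the chance of a verified result by the deadline.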
Experimental Results • CrowdSearch leads to (given a long enough deadline)… • Behavior close to parallel posting for recall • Behavior close to serial posting for search cost
Thoughts/Criticism • The limited nature of the solution • Limitation to the four categories • Buildings • Books • Flowers • Faces • Only 1000 images in the backend database. • Would increasing the number of automated search images increase total task time in a significant way?
Thoughts/Criticism • How useful is this anyway? • Are people willing to go through the trouble to set up a payment account and pay 5-20 cents for a search? • How much effort would it usually take for someone to find out what the object is through traditional means? • Especially for books! • Privacy concerns • People utilizing CrowdSearch must accept the fact that random strangers know what they are looking at and searching for. • Additionally, their GPS information might be provided to the CrowdSearch servers. • What about the privacy of the object of the search? • Undercover police officers