Crowdsourcing Case Study: AdSafe
Panos Ipeirotis
A few of the tasks in the past
• Detect pages that discuss swine flu
  • A pharmaceutical firm had a drug “treating” (off-label) swine flu
  • The FDA prohibited the pharmaceutical company from displaying the drug ad on pages about swine flu
  • Two days to build and go live
• A big fast-food chain does not want its ad to appear:
  • On pages that discuss the brand (99% negative sentiment)
  • On pages discussing obesity
  • Three days to build and go live
Need to build models fast
• Traditionally, modeling teams have invested substantial internal resources in data formulation, information extraction, cleaning, and other preprocessing
• No time for such things…
• However, we can now outsource preprocessing tasks, such as labeling, feature extraction, and verifying information extraction, using Mechanical Turk, oDesk, etc.
• Quality may be lower than expert labeling (much?), but low costs can allow massive scale
AdSafe workflow
• Find URLs for a given topic (hate speech, gambling, alcohol abuse, guns, bombs, celebrity gossip, etc.): http://url-collector.appspot.com/allTopics.jsp
• Classify URLs into appropriate categories: http://url-annotator.appspot.com/AdminFiles/Categories.jsp
• Measure quality of the labelers and remove spammers: http://qmturk.appspot.com/
• Get humans to “beat” the classifier by providing cases where the classifier fails: http://adsafe-beatthemachine.appspot.com/
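
The "measure quality of the labelers and remove spammers" step can be illustrated with a minimal sketch. It assumes a simple majority-vote heuristic: each worker is scored by how often their label matches the most common label for that URL. This is not necessarily what the tool linked above does; the function names, the `min_agreement` threshold, and the sample data are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical sample data: (worker, url, label) triples.
# Worker "w3" answers "gambling" for every URL.
LABELS = [
    ("w1", "u1", "gambling"), ("w2", "u1", "gambling"), ("w3", "u1", "gambling"),
    ("w1", "u2", "safe"),     ("w2", "u2", "safe"),     ("w3", "u2", "gambling"),
    ("w1", "u3", "safe"),     ("w2", "u3", "safe"),     ("w3", "u3", "gambling"),
    ("w1", "u4", "safe"),     ("w2", "u4", "safe"),     ("w3", "u4", "gambling"),
]

def majority_labels(labels):
    """Return {url: most common label} over all workers' votes."""
    votes = defaultdict(Counter)
    for _worker, url, label in labels:
        votes[url][label] += 1
    return {url: counts.most_common(1)[0][0] for url, counts in votes.items()}

def worker_agreement(labels):
    """Return {worker: fraction of their labels that match the majority label}."""
    consensus = majority_labels(labels)
    agree, total = Counter(), Counter()
    for worker, url, label in labels:
        total[worker] += 1
        if label == consensus[url]:
            agree[worker] += 1
    return {worker: agree[worker] / total[worker] for worker in total}

def flag_spammers(labels, min_agreement=0.5):
    """Return workers whose agreement rate falls below the (assumed) threshold."""
    rates = worker_agreement(labels)
    return sorted(w for w, rate in rates.items() if rate < min_agreement)

if __name__ == "__main__":
    print(worker_agreement(LABELS))
    print(flag_spammers(LABELS))
```

Agreement with the majority is a crude proxy: when one category dominates, a worker who always gives that category can still score high, so a real system would likely need a more careful quality estimate.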