DNC-Big Data and Data Mining in 2012 US Election


Presentation Transcript


  1. DNC-Big Data and Data Mining in 2012 US Election • Azamat Kamzin • Mandar Bhide

  2. Overview • Highlights of Narwhal • System Organization • Classification • Associative patterns • Predictive models • References

  3. Highlights • Codename: Narwhal • Budget: $100 million • Lead Developer: Scott VanDenPlas • Chief Analytics Officer: Dan Wagner • Team: approx. 200 members • General Objective: • Bring together information on voters, supporters, and donors in one place (unlike in 2008, when information was split across 6 different servers/vendors) • It was among the 20 largest consumer/customer databases ever built • Size, per a VanDenPlas tweet: • “4Gb/s, 10k requests per second, 2,000 nodes, 3 datacenters, 180TB and 8.5 billion requests...” • (Service provider: Amazon cloud)

  4. System Organization (diagram) • Data sources: 2008 voter databases; private/public databases • Data collection/enrichment: automated survey calls, 1.2 million per day; tracking visitors' online behavior using cookies • Systems: Narwhal, DreamCatcher • Scores: level of support for Obama; likelihood to vote; estimated donation amount • Uses: call/email to motivate the voter; best channel and time slot to advertise; directing volunteers to the right door; the right email ad to the right person

  5. Dreamcatcher: Voter Classification • Voters were classified into 4 categories

  6. Dreamcatcher: Association Pattern • Output: detailed profile of each voter • Inputs: attributes of each individual stored in Narwhal • Voting history • Social media likes and comments • Volunteering • Magazine subscriptions • Registered car • Insurance data • Individual private information from firms like Aristotle

  7. Predictive Models • A/B Testing: • To determine which image or text draws a higher user response • Ex.: “Learn More” garnered 18.6 percent more signups per visitor than the default “Sign Up.” • Time Series Analysis: • To understand approval and disapproval trends
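
A minimal sketch of how an A/B comparison like the one on slide 7 could be evaluated. The visitor and signup counts are hypothetical, chosen only so that the relative lift comes out at 18.6 percent; the slides do not give the raw numbers or say which test the campaign used. The two-proportion z-test here is one common choice, not necessarily theirs.

```python
from math import erf, sqrt


def relative_lift(conv_a, n_a, conv_b, n_b):
    """Relative increase of variant B's signup rate over variant A's."""
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    return (rate_b - rate_a) / rate_a


def two_proportion_z(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference between two signup rates."""
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Standard normal CDF via the error function.
    cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    p_value = 2 * (1 - cdf)
    return z, p_value


# Hypothetical counts (assumption, not campaign data):
# A = "Sign Up" button, B = "Learn More" button.
signups_a, visitors_a = 1000, 10000
signups_b, visitors_b = 1186, 10000

lift = relative_lift(signups_a, visitors_a, signups_b, visitors_b)
z, p_value = two_proportion_z(signups_a, visitors_a, signups_b, visitors_b)
print(f"lift={lift:.1%}, z={z:.2f}, p={p_value:.5f}")
```

The lift alone says how much better the variant did; the z-test indicates whether a difference of that size is plausibly noise at the given sample sizes.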

  8. Predictive Models • Regression • Used to estimate electoral votes (the dependent variable) from top issues such as the economy and healthcare • Packages used were SAS, R and MATLAB • Decision Trees • We don’t believe they used decision trees, because of the large number of attributes, which differ for each individual
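
The regression idea on slide 8 can be illustrated with a minimal sketch of ordinary least squares on a single predictor. Everything in it is an assumption made for illustration: the name `issue_score`, the toy numbers, and the reduction to one variable. The slides describe several issue variables and name SAS, R and MATLAB as the packages used, not Python, and they give no model details.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic toy data (assumption, unrelated to any real election result):
# a hypothetical score on one issue versus the dependent variable.
issue_score = [1.0, 2.0, 3.0, 4.0]
electoral_votes = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(issue_score, electoral_votes)
predicted = slope * 5.0 + intercept  # prediction for an unseen score of 5.0
print(slope, intercept, predicted)
```

With several issue variables the same idea extends to multiple regression, where each issue gets its own coefficient.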

  9. References
  • Michael Scherer (November 8, 2012). “How Obama's data crunchers helped him win”. Retrieved from http://www.cnn.com/2012/11/07/tech/web/obama-campaign-tech-team
  • Sasha Issenberg (December 19, 2012). “How President Obama’s campaign used big data to rally individual voters”. Retrieved from http://www.technologyreview.com/featuredstory/509026/how-obamas-team-used-big-data-to-rally-voters/
