Join Anthony Goldbloom, CEO of Kaggle, as he shares insights on predictive modeling competitions and the evolving landscape of data science. Discover the motivation behind participating in these competitions, how they function, and the benefits of using R on Kaggle. Learn about the challenges of crowdsourcing data analysis and the diverse techniques applied by users, from neural networks to ensemble methods. Goldbloom discusses real-world applications, including predicting HIV viral load and tourism forecasting, emphasizing the impact on professional development and reputation.
Predictive modeling competitions: making data science a sport • Anthony Goldbloom, CEO, Kaggle • e-mail anthony.goldbloom@kaggle.com • twitter @antgoldbloom • Photo by mikebaird, www.flickr.com/photos/mikebaird
Motivation • Why compete? • How it works • R on Kaggle • The Heritage Health Prize
Global competitions: Predicting HIV viral load • State of the art: 70% • 1½ weeks: 70.8% • Competition closes: 77%
Crowdsourcing: Mismatch between those with data and those with the skills to analyse it
Additional slides Not MIT, not SAS … UoL?
Tourism Forecasting Competition [Chart: forecast error (MASE) at each stage: existing model, Aug 9, 2 weeks later, 1 month later, competition end]
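The MASE on the tourism chart is the mean absolute scaled error: the forecast's mean absolute error divided by the in-sample mean absolute error of a naive "repeat the last value" forecast. The slides do not say how the competition computed it, so the following is only a minimal sketch of the standard non-seasonal form; the function name `mase` and the sample numbers are assumptions made for illustration.

```python
def mase(train, actual, forecast):
    """Mean absolute scaled error (non-seasonal form).

    train    -- historical values used to fit the model
    actual   -- true values for the forecast horizon
    forecast -- predicted values for the same horizon
    """
    if len(actual) != len(forecast) or not actual:
        raise ValueError("actual and forecast must be non-empty and the same length")
    if len(train) < 2:
        raise ValueError("train needs at least two observations")

    # In-sample error of the naive forecast: predict each value with the previous one.
    naive_mae = sum(abs(b - a) for a, b in zip(train, train[1:])) / (len(train) - 1)
    if naive_mae == 0:
        raise ValueError("naive forecast error is zero; MASE is undefined")

    # Out-of-sample error of the forecast being scored.
    forecast_mae = sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)
    return forecast_mae / naive_mae


# Illustrative numbers only, not competition data.
history = [10, 12, 14, 16]
print(mase(history, actual=[18, 20], forecast=[17, 22]))
```

A value below 1 means the forecast beats the naive benchmark on average; a value above 1 means it does worse.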
Chess Ratings Competition [Chart: error rate (RMSE) at each stage: existing model (Elo), Aug 4, 1 month later, 2 months later, today]
Users apply different techniques • neural networks • logistic regression • support vector machine • decision trees • ensemble methods • AdaBoost • Bayesian networks • genetic algorithms • random forest • Monte Carlo methods • principal component analysis • Kalman filter • evolutionary fuzzy modeling
Motivation • Why compete? • How it works • R on Kaggle • The Heritage Health Prize
Why Participants Compete • More fun than Sudoku • Clean, real-world data • Professional reputation & experience • Interactions with experts in related fields • Prizes
Motivation • Why compete? • How it works • R on Kaggle • The Heritage Health Prize
Competition Mechanics Competitions are judged on objective criteria
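Judging on objective criteria can be pictured as scoring every submission against a held-back answer file with one fixed metric and then ranking by that score. The sketch below uses RMSE, the metric named on the chess ratings slide. It is an assumption-laden illustration, not Kaggle's actual scoring code: the function names `rmse` and `leaderboard`, the team names, and the data are all hypothetical.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("actual and predicted must be non-empty and the same length")
    squared_errors = ((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(sum(squared_errors) / len(actual))


def leaderboard(solution, submissions):
    """Score each team's predictions and rank them, lowest error first."""
    scores = {team: rmse(solution, preds) for team, preds in submissions.items()}
    return sorted(scores.items(), key=lambda item: item[1])


# Hypothetical held-back answers and team submissions.
solution = [1.0, 0.0, 1.0, 0.0]
submissions = {
    "team_b": [0.5, 0.5, 0.5, 0.5],
    "team_a": [1.0, 0.0, 1.0, 0.0],
    "team_c": [0.0, 1.0, 0.0, 1.0],
}
for rank, (team, score) in enumerate(leaderboard(solution, submissions), start=1):
    print(rank, team, round(score, 3))
```

Because the metric is fixed in advance, the ranking depends only on the predictions, not on who submitted them or which technique they used.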
Motivation • Why compete? • How it works • R on Kaggle • The Heritage Health Prize
What could the world’s best analysts find in your data? • e-mail anthony.goldbloom@kaggle.com • phone +61438400053 • Photo by gidzy, www.flickr.com/photos/gidzy