Data Mining with Big Data • By: Pouya Otarod • Spring 2014
What is …… ? • Data Mining • the computational process of discovering patterns in large data sets • Big Data • a term for a collection of data sets so large and complex that it becomes difficult to process • data is growing exponentially, in both structured and unstructured forms
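To make "discovering patterns in large data sets" concrete, here is a minimal sketch of one classic pattern-discovery task: finding item pairs that frequently occur together. The toy `transactions` list and the `MIN_SUPPORT` threshold are assumptions made up for illustration; the slides do not name a specific algorithm or data set.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each set is one shopping basket.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
    {"bread", "milk"},
]
MIN_SUPPORT = 3  # assumed threshold: a pair must appear in at least 3 baskets

# Count how many baskets contain each unordered pair of items.
pair_counts = Counter()
for basket in transactions:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

# Keep only the pairs that meet the support threshold.
frequent_pairs = {pair: n for pair, n in pair_counts.items() if n >= MIN_SUPPORT}
print(frequent_pairs)
```

On real Big Data the same counting would have to be distributed across machines, which is where the challenges discussed later come in.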
How much data exists? • 2.5 quintillion bytes of data are created EVERY DAY • IBM: 90 percent of the data in the world today was produced within the past two years • Forms of data?
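To put the daily figure in more familiar units, the short calculation below converts 2.5 quintillion bytes per day into exabytes per day and terabytes per second. It assumes the short-scale meaning of quintillion (10^18) and decimal units (1 EB = 10^18 bytes, 1 TB = 10^12 bytes).

```python
BYTES_PER_DAY = 2.5e18            # 2.5 quintillion bytes, the figure quoted above
SECONDS_PER_DAY = 24 * 60 * 60    # 86,400 seconds

# Decimal units: 1 EB = 10**18 bytes, 1 TB = 10**12 bytes.
exabytes_per_day = BYTES_PER_DAY / 1e18
terabytes_per_second = BYTES_PER_DAY / SECONDS_PER_DAY / 1e12

print(f"{exabytes_per_day:.1f} EB per day")
print(f"{terabytes_per_second:.1f} TB per second")
```

Under those assumptions the quoted rate works out to 2.5 EB per day, or roughly 29 TB every second.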
Big Data Examples • October 4th, 2012, the first presidential debate • Flickr and its photos
Problem…! • Data has grown tremendously • This large amount of data is beyond the ability of software tools to manage • Exploring the large volume of data and extracting useful information and knowledge is a challenge, and sometimes it is almost infeasible
HACE Theorem • Heterogeneous, Autonomous, Complex, Evolving • Big Data starts with large-volume, heterogeneous, autonomous sources with distributed and decentralized control, and seeks to explore complex and evolving relationships among data • These are the characteristics of Big Data • The theorem is a way to model these Big Data characteristics
Huge data with heterogeneous and diverse dimensionality • represents the huge volume of data • Autonomous sources with distributed and decentralized control • a main characteristic of Big Data • Complex and evolving relationships
Data Mining Challenges with Big Data • Big Data Mining Platform • Big Data Semantics and Application Knowledge • Information Sharing and Data Privacy • Domain and Application Knowledge • Big Data Mining Algorithms • Local Learning and Model Fusion for Multiple Information Sources • Mining from Sparse, Uncertain, and Incomplete Data • Mining Complex and Dynamic Data
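The idea behind "Local Learning and Model Fusion for Multiple Information Sources" can be sketched in a few lines: each autonomous source summarizes its own data locally, and only those small summaries are combined centrally. The example below is a deliberately simple illustration, not a method taken from the slides. The "model" is just a mean and variance, and `LocalSummary`, `learn_locally`, `fuse`, and the `sites` data are all hypothetical names and values.

```python
from dataclasses import dataclass

@dataclass
class LocalSummary:
    """Sufficient statistics computed at one source; raw data never leaves it."""
    n: int
    total: float
    total_sq: float

def learn_locally(values):
    # Runs at each source on its own data.
    return LocalSummary(len(values), sum(values), sum(v * v for v in values))

def fuse(summaries):
    # Runs centrally: combines the local summaries into global statistics.
    n = sum(s.n for s in summaries)
    total = sum(s.total for s in summaries)
    total_sq = sum(s.total_sq for s in summaries)
    mean = total / n
    variance = total_sq / n - mean ** 2  # population variance
    return mean, variance

# Hypothetical data held at three separate sites.
sites = [[1.0, 2.0, 3.0], [4.0, 5.0], [6.0]]
mean, variance = fuse([learn_locally(values) for values in sites])
print(mean, variance)
```

The fused result equals what a single pass over the pooled data would give, yet each site shares only three numbers, which also touches on the information sharing and privacy concern listed above.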