This session aims to clarify foundational terminologies in data analytics and distinguish between related concepts such as data analysis and data mining. We will systematically examine the specific data analytics needs in scientific domains and explore the potential applications of big data techniques. A key outcome will be the development of a recommendation document that classifies feasible combinations of analysis algorithms, tools, and data characteristics. This document will serve as a best practice guide for scientific communities investing in big data technologies.
Big Data Analytics Debrief
Rahul Ramachandran, Morris Riedel
rahul.ramachandran@uah.edu
Objectives
• Clarify foundational terminology in the context of data analytics, including differences and overlaps with terms such as data analysis and data mining.
• Systematically analyze the data analytics needs of specific scientific domains and their potential use of various big data analytics techniques.
• Develop a recommendation document with a systematic classification of feasible combinations of analysis algorithms, analytical tools, data and resource characteristics, and scientific queries.
• The recommendation document can serve as a best practice guide for scientific groups and communities interested in investing in Big Data technologies.
Breakout Session Focus
• Introduction
  • Terminology refocus
  • What does it enable that is new, different?
• Use Case Presentations
  • Atmospheric Science – (Volume, Interactivity) Event Analytics
  • Astronomy – (Volume, Real time) Anomaly Detection
  • Bioinformatics – (Heterogeneity, Interactivity)
  • Linguistics – (Unstructured, Complex datasets)
• Technology Presentation
  • Array based Databases
• Update on NIST Big Data activities
Next Steps
• Flesh out “use case” template (started)
• Capture all the use cases (started)
• RDA to host a variety of “benchmark big data sets”
  • Similar to the UCI Machine Learning datasets
• Flesh out “analysis methodology/technology” template (started)
• Survey and capture all analysis methodologies/technologies (started)
• Mapping between use cases and analysis methodology/technology (to be done by the 3rd Plenary)
• Added a new co-chair – Peter Baumann
Process
• Technologies: Survey → Template → Technology Classification
• Use Cases: Survey → Template → Use Case Classification
• Technology Classification + Use Case Classification → Mapping → Recommendation Document
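
The classification and mapping steps of the process above can be illustrated with a small Python sketch. It is only an illustration, not the group's method: the use cases and their characteristics are taken from the Breakout Session Focus slide, while the function names, the matching rule (a technology matches a use case if they share at least one characteristic), and the placeholder technology entry are assumptions made for this example.

```python
from collections import defaultdict

# Use cases and their data characteristics, as listed on the
# "Breakout Session Focus" slide.
USE_CASES = {
    "Atmospheric Science": {"Volume", "Interactivity"},
    "Astronomy": {"Volume", "Real time"},
    "Bioinformatics": {"Heterogeneity", "Interactivity"},
    "Linguistics": {"Unstructured", "Complex datasets"},
}

# Hypothetical technology classification, for illustration only: the slides
# do not state which characteristics any particular technology addresses.
EXAMPLE_TECHNOLOGIES = {"example-technology-A": {"Volume"}}


def classify_by_characteristic(use_cases):
    """Invert the table: characteristic -> sorted list of use cases."""
    index = defaultdict(list)
    for name, traits in use_cases.items():
        for trait in traits:
            index[trait].append(name)
    return {trait: sorted(names) for trait, names in index.items()}


def map_use_cases(use_cases, technology_traits):
    """Pair each use case with technologies sharing >= 1 characteristic.

    The shared-characteristic rule is an assumed, deliberately simple
    matching criterion.
    """
    return {
        name: sorted(
            tech for tech, tech_traits in technology_traits.items()
            if traits & tech_traits
        )
        for name, traits in use_cases.items()
    }


if __name__ == "__main__":
    print(classify_by_characteristic(USE_CASES)["Volume"])
    print(map_use_cases(USE_CASES, EXAMPLE_TECHNOLOGIES))
```

Keeping the use-case table and the technology table separate mirrors the two parallel survey tracks; the mapping step only needs a shared vocabulary of characteristics between them.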