1 / 8

Data analytics challenge

11 December 2013 CERN openlab IT Challenges workshop. Data analytics challenge. Thanks to all for the participation, constructive ideas , thanks to Manuel Martin Marquez for taking the notes. General.

ted
Download Presentation

Data analytics challenge

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. 11 December 2013 CERN openlab IT Challenges workshop Data analytics challenge Thanks to all for the participation, constructive ideas, thanks to Manuel Martin Marquez for taking the notes

  2. General • Convince the user and establish a sponsor model, easy interface helps to explain and gain traction • Include a dissemination aspect (training) • Prepare 2-3 approaches and advertise them • Start preparation in 2014

  3. Techniques • Show similar objects, exploratory solution • Feedback loop from batch analysis to “real-time” • “Real-time” is real-time at the O(second) • First offline analysis before real-time analysis for part (LHC logging) to first understand well the problem, example Unidentified falling objects, to convince the users • Challenge: easy way to describe what you want, ability to translate to a lower level language

  4. Analytics as a service • “Analytics platform” or (Big data) “Analytics-as-a-service” (A3S ?): • Data fed from multiple sources (live) • Stored reliably • Data processing with multiple systems • Easy access, domain expert natural language (DSL)

  5. Constraints • How to not impact production / guarantee that data can be taken(decouple production from analysis) • Different population and needs: engineers, technicians, operators • For part of the data, sequence is not enough, vectors is required • ESA: offline • Agile methodologies for data analytics, work on the process • Yandex: plan to have DSL in DataFactory • Different types and different format, connectors • Communication between nodes should be transparent

  6. TODO: 2014 • Have 4-5 use cases well described • Data, signification of data, goal (measurable) • Pre-evaluate some of the technologies • Prepare architecture and interface design for the “Analytics as a service” platform

More Related