Accelerate AI with a Data Pipeline Strategy

Brian Schwarz, VP Product Management and Development

Pure Storage is a Consumer and Producer
Share our Learning and Contribute to the Community

Our Data Pipeline
We are passionate about Deep Learning
• IoT use case
• 1000s of devices
• Proactive Support

FLASHBLADE: Scale-Out Software + Flash + Networking

Data is the New Oil
"We don't have better algorithms, we just have more data."

Why Data? What Changes?
• Deep Learning = Innovation
• New SW model → new compute model
• How to blend this with traditional analytics and AI?

[Chart: accuracy vs. amount of data, comparing deep learning with older learning algorithms. Deep learning chart courtesy of Andrew Ng]

Many Choices Between You and Success

• Deep learning is the best choice when dozens, hundreds, or thousands of variables are at play
• Traditional app vendor extensions are limited in scope
• Data + Infrastructure is the platform
• Pre-processed and structured data play a role
• Data has gravity → operate where the data resides
• Data Scientists, Data Explorers, and Data Curators

Let Your Data Guide You
Software Infrastructure Training Deployment Options: PUBLIC CLOUD, YOUR CLOUD

Data Pipeline for Deep Learning Training

Collect → Extract → Transform → Tag → Debug → Training

Optimize Data Management
• Collect/Extract data (images, video, audio, sensors, etc.) → files and objects → keep a copy of the raw data
• Tagging + Testing / Debug of the model are bigger challenges than many anticipate
• Training: more data wins → optimize steps 1-5

Optimize Infrastructure
• Collect, Extract, Transform, and Tag run on traditional x86 infrastructure
• Model Debug and Training run on large GPU clusters
• Avoid slow shared storage; otherwise, copying data between local storage at each stage will slow progress
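The first four stages of the pipeline on this slide (Collect, Extract, Transform, Tag) can be sketched as a chain of plain functions. This is a minimal illustration, not the presenter's implementation: the `Sample` structure, the toy byte data, the scaling rule, and the labeling threshold are all hypothetical. The one point taken directly from the slide is that the raw copy of each item is kept untouched while later stages add derived fields.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Sample:
    """One item flowing through the pipeline (hypothetical structure)."""
    raw: bytes                                   # untouched copy of the collected data
    features: List[float] = field(default_factory=list)
    label: str = ""


def collect() -> List[Sample]:
    # Stand-in for reading files or objects from cameras, sensors, etc.
    return [Sample(raw=bytes([i, i + 1, i + 2])) for i in range(4)]


def extract(samples: List[Sample]) -> List[Sample]:
    # Decode raw bytes into numeric values; `raw` itself is never modified.
    for s in samples:
        s.features = [float(b) for b in s.raw]
    return samples


def transform(samples: List[Sample]) -> List[Sample]:
    # Scale byte values into the range [0, 1] (an assumed preprocessing step).
    for s in samples:
        s.features = [v / 255.0 for v in s.features]
    return samples


def tag(samples: List[Sample]) -> List[Sample]:
    # Toy labeling rule; in practice tagging is often manual and costly.
    for s in samples:
        mean = sum(s.features) / len(s.features)
        s.label = "high" if mean > 0.01 else "low"
    return samples


# Stages applied in order after collection.
STAGES: List[Callable[[List[Sample]], List[Sample]]] = [extract, transform, tag]


def run_pipeline() -> List[Sample]:
    samples = collect()
    for stage in STAGES:
        samples = stage(samples)
    return samples


if __name__ == "__main__":
    for sample in run_pipeline():
        print(sample.raw, sample.features, sample.label)
```

Debug and Training are left out because the slide places them on GPU clusters; in a sketch like this they would consume the tagged samples returned by `run_pipeline()`.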

Takeaways
1. Data is the new oil: design a data factory
2. Hire the best team + augment with trusted advisors from consultants and vendors
3. The only constant is change (especially in the SW tool-chain); build a solid team and infrastructure to accommodate it

THANK YOU
