GTC 2016 Opening Keynote

As artificial intelligence sweeps across the technology landscape, NVIDIA unveiled today at its annual GPU Technology Conference a series of new products and technologies focused on deep learning, virtual reality and self-driving cars.

Presentation Transcript


  1. A NEW COMPUTING MODEL | JEN-HSUN HUANG, CO-FOUNDER & CEO | GTC 2016

  2. LEAPS IN ADOPTION: 2X GTC Attendees | 2X Accelerated Systems, 96% of New Systems on NVIDIA | 4X CUDA Developers, 10X in Hyperscale + Auto [Charts: industry breakdowns for 2012 and 2016; number of accelerated systems, Nov 2013 to Nov 2015. Data labels on the charts: 2,350, 5,500, 300K, 4x.]

  3. NVIDIA SDK: The Essential Resource for GPU Computing | developer.nvidia.com | Available Now

  4. NVIDIA GAMEWORKS: Volumetric Lighting | Voxel Accelerated Ambient Occlusion | Hybrid Frustum Traced Shadows | Available Now. Technologies: PhysX, HairWorks, WaveWorks, FlameWorks, and others such as Clothing, VXGI, Flex, Destruction. SDK family: COMPUTEWORKS, GAMEWORKS, VRWORKS, DESIGNWORKS, DRIVEWORKS, JETPACK.

  5. NVIDIA DESIGNWORKS: Adobe support of MDL | Siemens NX adopts Iray. Technologies: Iray, MDL, OptiX, Path Rendering, and others such as GL Extensions, GRID, GPU Direct for Video, Mosaic, VXGI, Warp and Blend.

  6. NVIDIA VRWORKS: Oculus Rift and HTC Vive integration | Epic, Max Play and Unity game engines | Available Now. Technologies: Multi-Res Shading, VR SLI, Context Priority, Warp and Blend, and others such as Direct Mode, GPUDirect for Video.

  7. NVIDIA COMPUTEWORKS: CUDA 8, available June | cuDNN 5, available April | nvGRAPH, available June | IndeX plug-in for ParaView, available May. Technologies: CUDA, cuDNN, nvGRAPH, IndeX, and others such as AMGx, cuSOLVER, cuSPARSE, OpenACC, NSIGHT, THRUST.

  8. NVIDIA DRIVEWORKS: JPL, available now | EAP, available Q2’16 | General Release, available Q1’17. Technologies: SensorFusion, Detection, Localization, HD Maps, and others such as Driving, Planning.

  9. NVIDIA JETPACK: Jetson TX1: 24 images/s/W | GIE (GPU Inference Engine), available May. Technologies: Deep Learning SDK, DIGITS Workflow, VisionWorks, Jetson Media SDK, and others such as Linux4Tegra, NSIGHT EE, OpenCV4Tegra, OpenGL, System Trace, Visual Profiler, Vulkan.

  10. VR: A START OF A NEW PLATFORM: Google announces Jump VR camera platform | Microsoft demonstrates Holoportation | VR Startups Raise $1.5B in funding | Samsung, Oculus, HTC release headsets | New York Times ships Cardboard to subscribers

  11. EVEREST VR

  12. MARS 2030

  13. IRAY VR: Breakthrough Photoreal VR | Available starting in June. Pre-render light probes surrounding region of interest | Rasterize depth buffer at headset eye positions | Reconstruct image for new viewpoint from depth and multiple probes

  14. IRAY VR

  15. IRAY VR LITE: Available in June. (1) Design in 3ds Max (2) Download Iray for 3ds Max Plug-in (3) Download Android Viewer (4) Get VR HMD

  16. AN AMAZING YEAR IN AI: AlphaGo: Rivals a World Champion | Microsoft: “Super Deep Network” | Deep Speech 2: One network, 2 languages | Microsoft & Google: “Superhuman” Image Recognition | A New Computing Model Hits Pop Culture | Berkeley’s Brett: One network, everything robotics

  17. A NEW COMPUTING MODEL: Deep Learning Achieves “Superhuman” Results. Traditional Computer Vision: Experts + Time | Deep Learning Object Detection: DNN + Data + HPC [Chart: ImageNet, Traditional CV vs. Deep Learning, 0% to 100%, 2009 to 2016.]

  18. [No text on slide]

  19. $500B OPPORTUNITY OVER 10 YRS: IBM: “Cognitive business represents a $2T opportunity” [Charts: Deep Learning Total Revenue by Segment; Deep Learning Software Revenue by Industry (Other, Ad Service, Technology, Retail, Mfg, Oil & Gas, Investment, Media).] SOURCE: “Deep Learning for Enterprise Applications,” 4Q 2015, Tractica

  20. NVIDIA GPU FOR HYPERSCALE: TESLA M40 + TESLA M4 | 10X Speed up | 20 images/s/W | Cloud Services Powered by AI

  21. Soumith Chintala, AI Research Engineer, Facebook. “Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks,” Alec Radford & Luke Metz (indico Research) and Soumith Chintala (Facebook AI Research)

  22. UNSUPERVISED LEARNING

  23. TESLA P100: THE MOST ADVANCED HYPERSCALE DATACENTER GPU EVER BUILT | 150B XTORS | 5.3TF FP64 | 10.6TF FP32 | 21.2TF FP16 | 14MB SM RF | 4MB L2 Cache
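The three throughput figures on this slide step up by a factor of two each time precision is halved. A minimal Python sketch, using only the numbers quoted on the slide (the variable and function names are illustrative, not from the keynote), makes the ratios explicit:

```python
# Peak throughput quoted on the Tesla P100 slide, in teraflops.
p100_tflops = {"FP64": 5.3, "FP32": 10.6, "FP16": 21.2}


def ratio_to_fp64(tflops):
    """Return each precision's throughput relative to the FP64 figure."""
    base = tflops["FP64"]
    return {name: round(value / base, 2) for name, value in tflops.items()}


ratios = ratio_to_fp64(p100_tflops)
for name, ratio in ratios.items():
    print(f"{name}: {ratio}x the FP64 rate")
```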

  24. “FIVE MIRACLES”: Pascal Architecture | 16nm FinFET | CoWoS with HBM2 | NVLink | New AI Algorithms

  25. GIANT LEAPS IN EVERYTHING: 3x Compute | 3x GPU Mem BW | 5x GPU-GPU BW [Charts: Teraflops (FP32/FP16) and Bandwidth (GB/sec) for K40, M40 and P100.]

  26. “NVIDIA GPU is accelerating progress in AI. As neural nets become larger and larger, we not only need faster GPUs with larger and faster memory, but also much faster GPU-to-GPU communication, as well as hardware that can take advantage of reduced-precision arithmetic. This is precisely what Pascal delivers.” (Yann LeCun, Director of AI Research, Facebook) | “AI computers are like space rockets: The bigger the better. Pascal’s throughput and interconnect will make the biggest rocket we’ve seen yet.” (Andrew Ng, Chief Scientist, Baidu) | “This is a new era of computing. New approaches to the underlying technologies will be required for AI and cognitive. The combination of NVIDIA Pascal GPUs and IBM POWER accelerates Watson’s learning of new skills. Together, IBM and NVIDIA will advance the artificial intelligence industry.” (Dr. John Kelly III, SVP, Cognitive Solutions & IBM Research) | “Microsoft is developing super deep neural networks that are more than 1000 layers. NVIDIA Tesla P100’s impressive horsepower will enable Microsoft’s CNTK to accelerate AI breakthroughs.” (Xuedong Huang, Chief Speech Scientist, Microsoft Research)

  27. TESLA P100 SERVERS: Coming in Q1’17

  28. GPU-ACCELERATED DL FOR EVERY MARKET: Deep Learning in the Cloud | Deep Learning for Enterprise | IBM: “Cognitive business represents a $2T opportunity” [Chart segments: Other, Ad Service, Technology, Retail, Mfg, Oil & Gas, Investment, Media.] SOURCE: “Deep Learning for Enterprise Applications,” 4Q 2015, Tractica

  29. NVIDIA DGX-1: WORLD’S FIRST DEEP LEARNING SUPERCOMPUTER | Engineered for deep learning | 170TF FP16 | 8x Tesla P100 | NVLink hybrid cube mesh | Accelerates major AI frameworks
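The 170TF figure is consistent with eight P100s at the per-GPU FP16 rate quoted on slide 23. A quick check in Python, assuming the aggregate is a simple sum of per-GPU peaks (the constant names are illustrative):

```python
P100_FP16_TFLOPS = 21.2  # per-GPU FP16 figure from the Tesla P100 slide
GPUS_PER_DGX1 = 8        # "8x Tesla P100" on this slide

# Assumption: aggregate peak is the per-GPU peak times the GPU count.
aggregate_tflops = P100_FP16_TFLOPS * GPUS_PER_DGX1
print(f"{aggregate_tflops:.1f} TF FP16, quoted on the slide as 170TF")
```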

  30. [No text on slide]

  31. “250 SERVERS IN-A-BOX”: DUAL XEON vs. DGX-1
      FLOPS (CPU + GPU): 3 TF vs. 170 TF
      AGGREGATE NODE BW: 76 GB/s vs. 768 GB/s
      ALEXNET TRAIN TIME: 150 HOURS vs. 2 HOURS
      TRAIN IN 2 HOURS: >250 NODES* vs. 1 NODE
      *Caffe Training on Multi-node Distributed-memory Systems Based on Intel® Xeon® Processor E5 Family (extrapolated). Submitted by Gennady Fedorov (Intel), Vadim P. (Intel) on October 29, 2015. https://software.intel.com/en-us/articles/caffe-training-on-multi-node-distributed-memory-systems-based-on-intel-xeon-processor-e5
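The comparison on this slide reduces to three ratios. The sketch below computes them from the slide's own numbers; the dictionary keys are illustrative names, not NVIDIA's.

```python
# Figures quoted on the slide for a dual-Xeon node and a DGX-1.
dual_xeon = {"tflops": 3, "node_bw_gb_s": 76, "alexnet_hours": 150}
dgx1 = {"tflops": 170, "node_bw_gb_s": 768, "alexnet_hours": 2}

flops_ratio = dgx1["tflops"] / dual_xeon["tflops"]
bandwidth_ratio = dgx1["node_bw_gb_s"] / dual_xeon["node_bw_gb_s"]
# Training time improves as hours go down, so divide the other way round.
train_time_ratio = dual_xeon["alexnet_hours"] / dgx1["alexnet_hours"]

print(f"FLOPS:         {flops_ratio:.1f}x")
print(f"Node BW:       {bandwidth_ratio:.1f}x")
print(f"AlexNet train: {train_time_ratio:.0f}x")
```

The single-node training ratio is 75x, while the slide's node count is ">250". One reading is that multi-node scaling is less than linear in the extrapolation the footnote cites, but the slide does not say so.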

  32. 12X SPEED-UP IN ONE YEAR: GTC 2015, 4 Maxwell GPUs: 25 Hours | GTC 2016, 8 Pascal GPUs: 2 Hours | 1.33 billion images/day
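The headline follows from the two training times, and the daily image rate converts to a per-second rate. A short sketch using only the slide's figures (the names are illustrative; the slide does not say how the images/day figure was measured):

```python
SECONDS_PER_DAY = 24 * 60 * 60

hours_gtc_2015 = 25  # 4 Maxwell GPUs
hours_gtc_2016 = 2   # 8 Pascal GPUs

speedup = hours_gtc_2015 / hours_gtc_2016
print(f"Speed-up: {speedup}x (the slide rounds this to 12X)")

images_per_day = 1.33e9
images_per_second = images_per_day / SECONDS_PER_DAY
print(f"Throughput: about {images_per_second:,.0f} images/s")
```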

  33. Recurrent Neural Nets: Model + Data Parallelism | Bryan Catanzaro, Senior Researcher, Baidu [Diagram labels: Time series input, Time series output, GPU0, GPU1, Model Parallel, Data Parallel.]

  34. Persistent RNNs: Peak FLOPs at batch of 8 | Strong scale to 32X more processors | Add Model Parallelism over NVLINK | Compose with Data Parallelism [Diagram labels: keep in registers, weights, GPU0, GPU1, GPU2, GPU3, Data Parallel, repeat ~300 times.]
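The data-parallel half of the scheme named on these two slides can be illustrated with a toy example. This is a plain-Python sketch under stated assumptions, not Baidu's implementation: the "GPUs" are simulated by list shards, the model is a single weight, and all function names are hypothetical. Model parallelism and the persistent-RNN kernel are not shown.

```python
def shard_gradient(w, shard):
    """Mean gradient of 0.5 * (w*x - y)**2 over one worker's shard."""
    return sum((w * x - y) * x for x, y in shard) / len(shard)


def data_parallel_step(w, batch, num_workers, lr):
    """One SGD step with the batch split evenly across simulated workers.

    Assumes len(batch) is divisible by num_workers.
    """
    size = len(batch) // num_workers
    shards = [batch[i * size:(i + 1) * size] for i in range(num_workers)]
    # Every worker uses the same weights but sees only its own shard.
    grads = [shard_gradient(w, shard) for shard in shards]
    # Averaging the per-worker gradients stands in for an all-reduce.
    return w - lr * sum(grads) / num_workers


# Toy data drawn from y = 2x, so the weight should approach 2.
batch = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
w = 0.0
for _ in range(200):
    w = data_parallel_step(w, batch, num_workers=2, lr=0.05)
print(f"learned weight: {w:.6f}")
```

With equal-sized shards the averaged gradient equals the full-batch gradient, which is why adding workers changes throughput rather than the result of each step.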

  35. Rajat Monga, TensorFlow Technical Lead & Manager, Google

  36. NVIDIA DGX-1: WORLD’S FIRST DEEP LEARNING SUPERCOMPUTER | 170TF | “250 servers in-a-box” | $129,000 | nvidia.com/dgx1

  37. PIONEERS IN AI RESEARCH: Frameworks for Multi-GPU Pascal | Large-scale Deep Learning | Reinforcement Learning | Unsupervised and Transfer Learning | Natural Language Understanding | Autonomous Driving | Medical Applications

  38. DEEP LEARNING FOR MEDICINE: NVIDIA Founding Technology Partner of MGH Center of Clinical Data Science | 10B Medical images on DGX-1 to advance radiology, pathology, genomics

  39. TESLA FAMILY: M40 + M4 | K80 | Hyperscale HPC | Multi-App HPC | Strong-Scale HPC | Researchers / Early Adopters

  40. AN AMAZING YEAR FOR SELF-DRIVING CARS: Volvo Drive Me on Public Roads in 2017 | Uber Enters the Race | Tesla Model 3: 300K pre-orders | Toyota Invests $1B in AI Lab | NHTSA: Computer Counts as Driver | Audi, BMW, Daimler Buy HERE | Baidu Enters the Race | GM Buys Cruise | Tesla Model S Auto-pilot | Honda, Nissan, Toyota Team Up

  41. SELF-DRIVING LOOPS: MAP | LOCALIZE | SEE | DRIVE

  42. NVIDIA DRIVE PX: AI CAR COMPUTER | World’s first DL-powered car computing platform | One scalable architecture, from DNN training to cluster, infotainment, ADAS, autonomous driving, and mapping | Open platform | Training on DGX-1 | Driving with DriveWorks [Diagram labels: MAPPING, KALDI, LOCALIZATION, DRIVENET, DAVENET, NVIDIA DGX-1, NVIDIA DRIVE PX.]

  43. NVIDIA DRIVE PX PERCEPTION: NVIDIA DRIVENET | #1 accuracy score for KITTI car detection

  44. NVIDIA DRIVE PX PERCEPTION

  45. NEW END-TO-END HD MAPPING

  46. BAIDU SELF-DRIVING CAR COMPUTER

  47. NEW END-TO-END HD MAPPING

  48. PLATFORM FOR MAPPING THE WORLD

  49. NEW AI DRIVING

  50. WORLD’S FIRST AUTONOMOUS RACE CAR: Designed by Daniel Simon | 2,200 lbs | Blazing fast
