Computer Vision since Deep Learning

Presentation Transcript

  1. Computer Vision since Deep Learning Larry Strickland Chief Product Officer lstrickland@dkl.com

  2. What is deep learning? • Well… that is really not well defined. • Possible sources • Large (Artificial) Neural Networks • Multiple Layer Neural Networks • Larger data sets (deep pool of data) • Of note – there is no reference to “Shallow Learning”

  3. Deep Learning applied to Pattern Recognition • Deep Learning = Learning Hierarchical Representations (slide by Yann LeCun)

  4. Deep Learning • Deep learning is used to train • Computational models that represent data • Multiple levels of abstraction • Computational models mimic the brain • Many methods fit within deep learning, including • Neural Networks • Hierarchical probabilistic models • A variety of supervised and unsupervised feature learning algorithms

  5. What’s changed? • Computers have gotten bigger and faster • Move from CPUs to GPUs for parallel compute tasks • Larger memory • Abundance of open source libraries supporting the various methods and algorithms • Almost all libraries leverage the power of the GPU (and apparently the Nvidia stock price) • Availability of data sets • The growth in both labelled and unlabelled data sets provides a rich source of training and testing data • Computer Gaming • Simulation of real-world scenarios provides a rich set of training data.

  6. What led up to deep learning • Attempts to understand the brain's neural structure started in the 1940s

  7. Computer Vision and Deep Learning • Three most successful (currently) • Convolutional Neural Networks: multiple neural network layers of different types, each layer performing a different role in the Computer Vision task • Boltzmann family (Deep Belief Networks and Deep Boltzmann Machines): leveraging the Restricted Boltzmann Machine, which is a generative stochastic neural network • Stacked Autoencoders: using the denoising Autoencoder as the basic building block • Pros and Cons:

  8. Example CNN for object detection
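
The layer roles mentioned for convolutional neural networks on slide 7 can be illustrated with a toy example. The following is a minimal sketch in plain Python, not the network shown in the presentation: the function names, the tiny image and the hand-picked edge kernel are all assumptions made for illustration (in a trained CNN the kernel weights are learned from data).

```python
def conv2d(image, kernel):
    """Valid-mode 2D cross-correlation: slide the kernel over the image."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(len(image) - kh + 1):
        row = []
        for j in range(len(image[0]) - kw + 1):
            acc = 0.0
            for di in range(kh):
                for dj in range(kw):
                    acc += image[i + di][j + dj] * kernel[di][dj]
            row.append(acc)
        out.append(row)
    return out


def relu(feature_map):
    """Non-linearity: keep positive responses, zero out the rest."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def max_pool2x2(feature_map):
    """Downsample by keeping the strongest response in each 2x2 block."""
    pooled = []
    for i in range(0, len(feature_map) - 1, 2):
        row = []
        for j in range(0, len(feature_map[0]) - 1, 2):
            row.append(max(feature_map[i][j], feature_map[i][j + 1],
                           feature_map[i + 1][j], feature_map[i + 1][j + 1]))
        pooled.append(row)
    return pooled


# Hypothetical 5x5 image: dark on the left, bright on the right.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Hand-picked kernel that responds to dark-to-bright vertical edges.
kernel = [[-1, 1],
          [-1, 1]]

# Three layers, three roles: feature extraction, non-linearity, downsampling.
features = max_pool2x2(relu(conv2d(image, kernel)))
print(features)
```

The pooled feature map is non-zero only in the column that contains the edge, which is the kind of localized feature response that later layers build on.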

  9. DBN / DBM • Deep Belief Network • Deep Boltzmann Machine

  10. The uses of Deep Learning • Object detection • Facial Recognition • Motion Tracking • Action recognition • Human pose estimation • Semantic Segmentation • Image Processing • Autonomous Driving

  11. Object detection • Example object detection CNN • Training for purpose

  12. Example Video – Object detection

  13. Human Pose Estimation • Detect a different type of object • Body joints • Additional layer to model the output to detect pose
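
One common way to turn joint detections into a pose is to have the network output one score map ("heatmap") per joint and then read off the strongest location in each. The slide does not say which output representation is used, so the heatmap format, the function names and the `min_score` threshold below are assumptions for illustration only.

```python
def heatmap_to_joint(heatmap):
    """Return (row, col, score) of the strongest response in one heatmap."""
    best = (0, 0, heatmap[0][0])
    for r, row in enumerate(heatmap):
        for c, score in enumerate(row):
            if score > best[2]:
                best = (r, c, score)
    return best


def estimate_pose(heatmaps, min_score=0.5):
    """Map each joint name to its (row, col), or None if the peak is weak."""
    pose = {}
    for joint, heatmap in heatmaps.items():
        r, c, score = heatmap_to_joint(heatmap)
        pose[joint] = (r, c) if score >= min_score else None
    return pose


# Hypothetical 3x3 heatmaps, one per body joint.
heatmaps = {
    "head":       [[0.1, 0.9, 0.1], [0.0, 0.2, 0.0], [0.0, 0.0, 0.0]],
    "left_hand":  [[0.0, 0.0, 0.0], [0.8, 0.1, 0.0], [0.0, 0.0, 0.0]],
    "right_hand": [[0.0, 0.0, 0.0], [0.0, 0.1, 0.3], [0.0, 0.0, 0.0]],
}

pose = estimate_pose(heatmaps)
print(pose)
```

Joints whose best score falls below the threshold are reported as `None`, which is one simple way to handle occluded or out-of-frame joints.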

  14. Example – body joint video

  15. Image Processing • Super Resolution • Refocusing Images • Photo Style Transfer • Deep Fakes

  16. Super Resolution • Train using a collection of low-resolution (LR) images paired with their super-resolution (SR) counterparts • Key to the training is an error function that approximates the error that humans would perceive • Use the trained model to generate SR images from their LR counterparts • Alternative use – refocusing out-of-focus pictures
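
The training setup above can be sketched as follows, assuming the LR images are produced by downsampling the high-resolution originals. This is a minimal illustration in plain Python with hypothetical function names and a made-up 4x4 image. It scores a naive nearest-neighbour upscaling with pixel-wise mean squared error, which is only a simple stand-in: it does not approximate perceived error, which is the property the slide identifies as key.

```python
def downsample2x(hr):
    """Create an LR image by averaging each 2x2 block of the HR image."""
    return [[(hr[i][j] + hr[i][j + 1] + hr[i + 1][j] + hr[i + 1][j + 1]) / 4.0
             for j in range(0, len(hr[0]), 2)]
            for i in range(0, len(hr), 2)]


def upsample2x_nearest(lr):
    """Naive baseline: repeat every LR pixel in a 2x2 block."""
    out = []
    for row in lr:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def mse(a, b):
    """Pixel-wise mean squared error between two equally sized images."""
    n = len(a) * len(a[0])
    return sum((x - y) ** 2
               for row_a, row_b in zip(a, b)
               for x, y in zip(row_a, row_b)) / n


# Hypothetical 4x4 high-resolution image (grey levels).
hr = [[0, 4, 8, 8],
      [4, 0, 8, 8],
      [2, 2, 6, 6],
      [2, 2, 6, 6]]

lr = downsample2x(hr)              # one (LR, HR) training pair
baseline = upsample2x_nearest(lr)  # what a model has to improve on
error = mse(baseline, hr)
print(lr, error)
```

The fine checkerboard detail in the top-left block is lost in the LR image, so the baseline cannot recover it. A trained model would be expected to reduce the error in regions like this.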

  17. Generator and Discriminator Networks
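
The slide only names the two networks. As a rough sketch of how a generator and a discriminator are typically trained against each other, the snippet below computes the standard binary cross-entropy objectives from the discriminator's output probabilities. The function names and the example probabilities are assumptions, and the presentation does not state which loss its example uses.

```python
import math


def bce(p, target):
    """Binary cross-entropy for one predicted probability p and a 0/1 target."""
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


def discriminator_loss(d_real, d_fake):
    """Discriminator wants real images scored 1 and generated images scored 0."""
    return bce(d_real, 1.0) + bce(d_fake, 0.0)


def generator_loss(d_fake):
    """Generator wants its images to be scored as real (non-saturating form)."""
    return bce(d_fake, 1.0)


# d_real / d_fake: discriminator's probability that an image is real.
confident = discriminator_loss(d_real=0.9, d_fake=0.1)
unsure = discriminator_loss(d_real=0.5, d_fake=0.5)
print(confident, unsure)
print(generator_loss(0.9), generator_loss(0.1))
```

A discriminator that separates real from generated images has a lower loss than one that cannot tell them apart, while the generator's loss drops as its output fools the discriminator more often.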

  18. Refocused Examples

  19. Photo Style Transfer • Style transfer • Similar approach – train on detecting style
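
One widely used way to represent "style" is through correlations between feature channels, summarized in a Gram matrix, which ignores where in the image a feature occurs. The slides do not say which style representation is used, so the sketch below is an assumption for illustration, with hypothetical names and toy 2x2 feature maps.

```python
def gram_matrix(feature_maps):
    """Channel-by-channel correlations, averaged over spatial positions."""
    flat = [[v for row in fm for v in row] for fm in feature_maps]
    n = len(flat[0])
    return [[sum(a * b for a, b in zip(fi, fj)) / n for fj in flat]
            for fi in flat]


def style_distance(g1, g2):
    """Sum of squared differences between two Gram matrices."""
    return sum((x - y) ** 2
               for row1, row2 in zip(g1, g2)
               for x, y in zip(row1, row2))


# Two channels that fire together in the top-left corner ...
maps_a = [[[1, 0], [0, 0]], [[1, 0], [0, 0]]]
# ... the same co-occurrence, moved to the bottom-right corner ...
maps_b = [[[0, 0], [0, 1]], [[0, 0], [0, 1]]]
# ... and two channels that fire at different positions.
maps_c = [[[1, 0], [0, 0]], [[0, 1], [0, 0]]]

g_a, g_b, g_c = gram_matrix(maps_a), gram_matrix(maps_b), gram_matrix(maps_c)
print(style_distance(g_a, g_b), style_distance(g_a, g_c))
```

Moving the same pattern to another location leaves the style distance at zero, while changing which channels co-occur does not, which is why this kind of statistic can capture texture independently of content layout.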

  20. Style Transfer Examples (figure panels: Style Reference Image, Image, Result 1, Result 2)

  21. More extreme examples

  22. Deep Fakes • Facial Recognition • Train a model to transfer style from one face to another • Essentially multiple styles, due to multiple expressions.

  23. Deep Fake Video

  24. Autonomous Driving • Many problems in the domain • Sensor calibration • Coordination between sensors (cameras, lasers, ultrasonic, radar, …) • Object Recognition • Pedestrians, street signs, road works, lights, … • Reconstruction – 2D/3D • Motion estimation and tracking • Semantic labelling • Multiple Frames • Future Prediction • Scene understanding • End-to-end learning

  25. Pedestrian / Cyclist Detection

  26. Scene Understanding

  27. Predicting Future

  28. Larry Strickland Chief Product Officer 613 523 5500 x256 lstrickland@dkl.com