Computer Vision since Deep Learning Larry Strickland Chief Product Officer email@example.com
What is deep learning? • Well… that is really not well defined. • Possible sources of the term • Large (Artificial) Neural Networks • Multiple Layer Neural Networks • Larger data sets (deep pool of data) • Of note – there is no reference to “Shallow Learning”
Deep Learning applied to Pattern Recognition Deep Learning = Learning Hierarchical Representations (slide by Yann LeCun)
Deep Learning • Deep learning is used to train • Computational models that represent data • At multiple levels of abstraction • Computational models mimic the brain • Many methods fit within deep learning, including • Neural Networks • Hierarchical probabilistic models • A variety of supervised and unsupervised feature learning algorithms
What’s changed? • Computers have gotten bigger and faster • Move from CPUs to GPUs for parallel compute tasks • Larger memory • Abundance of open source libraries supporting the various methods and algorithms • Almost all libraries leverage the power of the GPU (and apparently the Nvidia stock price) • Availability of data sets • The growth in both labelled and unlabelled data sets provides a rich source of training and testing data • Computer Gaming • Simulation of real world scenarios provides a rich set of training data.
What led up to deep learning: attempts to understand the brain's neural structure started in the 1940s
Computer Vision and Deep Learning • Three most successful (currently) • Convolutional Neural Networks: multiple layers of different types, each layer performing a different role in the Computer Vision task • Boltzmann family (Deep Belief Networks and Deep Boltzmann Machines): built on the Restricted Boltzmann Machine, which is a generative stochastic neural network • Stacked Autoencoders: using the denoising autoencoder as the basic building block • Pros and Cons:
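To make the "each layer performing a different role" point concrete, here is a minimal sketch of three layer types commonly found in a CNN: a convolution that responds to a local pattern, a ReLU non-linearity, and a max-pooling step that shrinks the feature map. The layer choice, the function names, the 5x5 test image, and the hand-picked edge kernel are all illustrative assumptions; a real network learns its kernels from data.

```python
def conv2d(image, kernel):
    """'Valid' 2D cross-correlation, the operation CNN libraries call convolution."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(fmap):
    """Keep positive responses, zero out the rest."""
    return [[max(0.0, v) for v in row] for row in fmap]


def max_pool2x2(fmap):
    """Take the maximum of each non-overlapping 2x2 block."""
    return [
        [max(fmap[i][j], fmap[i][j + 1], fmap[i + 1][j], fmap[i + 1][j + 1])
         for j in range(0, len(fmap[0]) - 1, 2)]
        for i in range(0, len(fmap) - 1, 2)
    ]


# Hypothetical 5x5 image: dark on the left, bright on the right.
image = [[0.0, 0.0, 1.0, 1.0, 1.0] for _ in range(5)]
# Hand-picked kernel that responds to a dark-to-bright vertical edge.
edge_kernel = [[-1.0, 1.0], [-1.0, 1.0]]

features = relu(conv2d(image, edge_kernel))   # 4x4 feature map
pooled = max_pool2x2(features)                # 2x2 summary
print(pooled)
```

The pooled map is non-zero only in the column that covers the edge, which is the kind of intermediate representation later layers would build on.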
DBN / DBM Deep Belief Network Deep Boltzmann Machine
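Both models stack Restricted Boltzmann Machines, so a sketch of the RBM's stochastic behaviour may help. The code below assumes a binary RBM, where each hidden unit turns on with probability sigmoid(bias + weighted sum of visible units) and vice versa, and runs one Gibbs sampling step. The weights, biases, and function names are made up for illustration; training (for example by contrastive divergence) is not shown.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sample_hidden(v, W, b_h, rng):
    """Return p(h_j = 1 | v) for each hidden unit and one binary sample."""
    probs = [
        sigmoid(b_h[j] + sum(v[i] * W[i][j] for i in range(len(v))))
        for j in range(len(b_h))
    ]
    return probs, [1 if rng.random() < p else 0 for p in probs]


def sample_visible(h, W, b_v, rng):
    """Return p(v_i = 1 | h) for each visible unit and one binary sample."""
    probs = [
        sigmoid(b_v[i] + sum(h[j] * W[i][j] for j in range(len(h))))
        for i in range(len(b_v))
    ]
    return probs, [1 if rng.random() < p else 0 for p in probs]


rng = random.Random(0)                        # fixed seed for repeatability
W = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.8]]    # hypothetical 3 visible x 2 hidden weights
b_v = [0.0, 0.0, 0.0]
b_h = [0.0, 0.0]

v0 = [1, 0, 1]                                # an observed binary pattern
p_h, h0 = sample_hidden(v0, W, b_h, rng)      # visible -> hidden
p_v, v1 = sample_visible(h0, W, b_v, rng)     # hidden -> visible (a "reconstruction")
print(p_h, h0, v1)
```

Because the units are sampled rather than computed deterministically, repeated runs with different seeds generate different reconstructions, which is what "generative stochastic" refers to.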
The uses of Deep Learning • Object detection • Facial Recognition • Motion Tracking • Action recognition • Human pose estimation • Semantic Segmentation • Image Processing • Autonomous Driving
Object detection • Example object detection CNN • Training for purpose
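When training a detector for a purpose, predicted boxes have to be scored against labelled boxes. The slide does not say how; one widely used measure is intersection over union (IoU), sketched below under the assumption that boxes are given as (x_min, y_min, x_max, y_max). The function name and box format are illustrative choices.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Width and height of the overlap are clamped at zero for disjoint boxes.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


predicted = (1.0, 1.0, 3.0, 3.0)   # hypothetical detector output
labelled = (0.0, 0.0, 2.0, 2.0)    # hypothetical ground-truth box
print(iou(predicted, labelled))
```

A score of 1.0 means the boxes coincide and 0.0 means they do not overlap; detection benchmarks typically count a prediction as correct above some chosen threshold.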
Human Pose Estimation • Detect a different type of object • Body joints • Additional layer to model the output to detect pose
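One common way to model the output for body joints, assumed here rather than taken from the slide, is to have the network emit one score map (heatmap) per joint and read each joint's position from the peak of its map. The joint names and the tiny 3x3 heatmaps below are invented for illustration.

```python
def joint_locations(heatmaps):
    """Map each joint name to the (row, col) of the peak of its heatmap."""
    locations = {}
    for name, hm in heatmaps.items():
        cells = ((r, c) for r in range(len(hm)) for c in range(len(hm[0])))
        locations[name] = max(cells, key=lambda rc: hm[rc[0]][rc[1]])
    return locations


# Hypothetical network output: one small score map per body joint.
heatmaps = {
    "left_elbow": [[0.1, 0.2, 0.1],
                   [0.0, 0.9, 0.3],
                   [0.1, 0.2, 0.1]],
    "left_wrist": [[0.0, 0.1, 0.0],
                   [0.1, 0.2, 0.1],
                   [0.0, 0.3, 0.8]],
}

pose = joint_locations(heatmaps)
print(pose)
```

The resulting set of joint coordinates is what a later stage would connect into a skeleton to describe the pose.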
Image Processing • Super Resolution • Refocusing Images • Photo Style Transfer • Deep Fakes
Super Resolution • Train using a collection of low-resolution (LR) images paired with their high-resolution counterparts • Key to the training is an error function that approximates the error that humans would perceive. • Use the trained model to generate super-resolution (SR) images from their LR counterparts • Alternative use: refocusing out-of-focus pictures
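The training setup can be sketched as follows: build a low-resolution input from a higher-resolution target, produce an upscaled estimate, and score it with an error function. The sketch assumes 2x average-pool downsampling, a nearest-neighbour upscaler standing in for the trained model, and mean squared error as the score. Pixel-wise MSE is only a simple stand-in; it does not approximate perceived error in the way the slide calls for.

```python
def downsample2x(hr):
    """Average each non-overlapping 2x2 block to make a low-resolution image."""
    return [
        [(hr[i][j] + hr[i][j + 1] + hr[i + 1][j] + hr[i + 1][j + 1]) / 4.0
         for j in range(0, len(hr[0]) - 1, 2)]
        for i in range(0, len(hr) - 1, 2)
    ]


def upsample2x_nearest(lr):
    """Baseline 'model': repeat every pixel in a 2x2 block."""
    out = []
    for row in lr:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def mse(a, b):
    """Pixel-wise mean squared error between two same-sized images."""
    n = len(a) * len(a[0])
    return sum((x - y) ** 2
               for row_a, row_b in zip(a, b)
               for x, y in zip(row_a, row_b)) / n


# Hypothetical 4x4 grayscale target image.
hr = [[0.0, 2.0, 4.0, 4.0],
      [2.0, 0.0, 4.0, 4.0],
      [8.0, 8.0, 1.0, 3.0],
      [8.0, 8.0, 3.0, 1.0]]

lr = downsample2x(hr)               # training input
estimate = upsample2x_nearest(lr)   # what an untrained baseline would output
error = mse(estimate, hr)           # the quantity training would try to reduce
print(lr, error)
```

A trained model would replace `upsample2x_nearest`, and a perceptually motivated loss would replace `mse`.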
Photo Style Transfer • Style transfer • Similar approach – train on detecting style
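"Detecting style" needs some numerical description of style. A descriptor often used in neural style transfer, assumed here rather than stated on the slide, is the Gram matrix of a layer's feature maps: it records which feature channels fire together while discarding where they fire. The two-channel feature maps below are invented for illustration.

```python
def gram_matrix(features):
    """Channel-by-channel correlations of feature maps.

    features: list of C channels, each a flat list of N activations.
    Returns a C x C matrix normalised by N.
    """
    n = len(features[0])
    return [
        [sum(a * b for a, b in zip(fi, fj)) / n for fj in features]
        for fi in features
    ]


# Hypothetical activations: 2 channels over 4 spatial positions.
features = [[1.0, 2.0, 3.0, 4.0],
            [1.0, 0.0, -1.0, 0.0]]
style = gram_matrix(features)

# Same activations with the spatial positions shuffled identically in both channels.
shuffled = [[4.0, 2.0, 1.0, 3.0],
            [0.0, 0.0, 1.0, -1.0]]
style_shuffled = gram_matrix(shuffled)
print(style, style_shuffled)
```

Both calls give the same matrix, which illustrates why this descriptor captures texture-like statistics rather than the layout of the image.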
Style Transfer Examples (figure panels: Style Image, Reference Image, Result 1, Result 2)
Deep Fakes • Facial Recognition • Train a model to transfer style from one face to another • Essentially multiple styles, due to multiple expressions.
Autonomous Driving • Many problems in the domain • Sensor calibration • Coordination between sensors (cameras, lasers, ultrasonic, RADAR, …) • Object Recognition • Pedestrians, street signs, road works, lights, … • Reconstruction – 2D/3D • Motion estimation and tracking • Semantic labelling • Multiple Frames • Future Prediction • Scene understanding • End-to-end learning
Larry Strickland Chief Product Officer 613 523 5500 x256 firstname.lastname@example.org