This presentation provides a comprehensive overview of Artificial Neural Networks (ANNs), detailing their historical development, biological inspirations, and mathematical underpinnings. Key topics include the architecture of ANNs, weight adjustment, activation functions, and learning paradigms such as supervised and unsupervised learning. The presentation also discusses applications such as classification, clustering, and optimization in robotic control. Through a structured approach, this resource aims to enhance understanding of how ANNs operate and of their practical significance in modern technology.
Artificial Neural Networks R. Michael Winters Music Information Retrieval Dr. Ichiro Fujinaga February 28, 2012
Outline • A brief history of ANN • Quick facts about biological neural networks • Simple explanation of mathematical implementation • Detailed mathematical implementation • Weights • Activation Functions • Layers • Learning in ANN • What they can be used for • Evaluation
A brief history • Rashevsky, 1938. Mathematical Biophysics. • McCulloch and Pitts, 1943. A logical calculus of the ideas immanent in nervous activity. • Hebb, 1949. The Organization of Behavior. • Rosenblatt, 1958. The perceptron: A probabilistic model for information storage and organization in the brain. • Minsky and Papert, 1969. Perceptrons: An Introduction to Computational Geometry. • Hopfield, 1982. Neural networks and physical systems with emergent collective computational abilities. • Rumelhart and McClelland, 1986. Parallel Distributed Processing: Explorations in the Microstructure of Cognition.
Quick Facts about Neurons • Biological Neural Networks • Neurons fire on the order of 10^-3 seconds • Number of neurons in the nervous system: 10^10 • Number of connections: 10^13 • Energetic efficiency: 10^-16 joules per operation per second • Machines • Silicon gates switch on the order of 10^-9 seconds • Energetic efficiency: 10^-6 joules per operation per second (Haykin, 1999)
A simple description • Learning AND • Training set: (x1=1, x2=1) → d=1; (x1=1, x2=0) → d=0; (x1=0, x2=1) → d=0; (x1=0, x2=0) → d=0 • Begin with w1 = 0.7 and w2 = 0.2 • Learning rule: nudge a weight by +0.1 or –0.1 when the output is wrong • Weights gradually change until the desired result is achieved (see the sketch below) (Mehrotra, 1997)
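The weight-nudging idea can be made concrete in a few lines of Python. This is a minimal sketch, not the exact procedure from Mehrotra (1997): the firing threshold of 0.5, the choice to adjust only the weights of active inputs, and the stopping condition are assumptions made here for illustration.

```python
# Minimal sketch of learning AND by nudging weights by +/- 0.1.
# Assumptions (not from the slide): the neuron fires when the weighted
# sum exceeds a threshold of 0.5; training stops once every example
# is classified correctly.

training = [((1, 1), 1), ((1, 0), 0), ((0, 1), 0), ((0, 0), 0)]
w = [0.7, 0.2]          # initial weights from the slide
THRESHOLD = 0.5

def output(x, w):
    return 1 if w[0] * x[0] + w[1] * x[1] > THRESHOLD else 0

for epoch in range(100):
    errors = 0
    for x, d in training:
        y = output(x, w)
        if y != d:
            errors += 1
            # Nudge each active input's weight toward the desired output.
            for i in range(2):
                if x[i] == 1:
                    w[i] += 0.1 if d > y else -0.1
    if errors == 0:
        break

print("learned weights:", w)  # only the (1, 1) input now clears the threshold
```

Starting from (0.7, 0.2), the (1, 0) example repeatedly lowers w1 until only the combined input x1 + x2 exceeds the threshold, at which point training stops.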
More Complicated Explanation • Many connections, each with its own weight • Weighted inputs are summed at the junction, together with a bias • An activation function determines the output of the neuron (written out below) (Haykin, 1999)
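In symbols, this model of a single neuron k, following the notation in Haykin (1999), is:

```latex
% Weighted sum (local field) plus bias, passed through an activation function:
% x_j inputs, w_{kj} weights, b_k bias, \varphi activation, y_k output.
v_k = \sum_{j=1}^{m} w_{kj}\, x_j + b_k, \qquad y_k = \varphi(v_k)
```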
Types of Activation Function • The activation function gives the neuron's output as a function of its local field, the weighted sum determined by the input connections • Simple examples are the step, ramp, and sigmoid functions • The sigmoid is the most commonly used • A strictly increasing function • Its slope parameter a can be varied to alter its steepness (see the sketch below) (Haykin, 1999)
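These three functions are easy to write down directly. The sketch below uses the logistic form of the sigmoid from Haykin (1999), φ(v) = 1/(1 + e^(-av)); the ramp's clipping range of [0, 1] is an assumption made here for illustration.

```python
import math

def step(v):
    """Threshold (Heaviside) function: 0 below zero, 1 at or above."""
    return 1.0 if v >= 0 else 0.0

def ramp(v):
    """Piecewise-linear function, clipped to [0, 1] (range assumed here)."""
    return max(0.0, min(1.0, v))

def sigmoid(v, a=1.0):
    """Logistic sigmoid; the slope parameter a steepens the curve."""
    return 1.0 / (1.0 + math.exp(-a * v))

# Larger a pushes the sigmoid toward the step function:
for a in (0.5, 1.0, 5.0):
    print(a, round(sigmoid(0.5, a), 3))
```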
Architectures • Three types of network architecture • Single-layer • Very simple, but too simple for complex computations • Multilayer • Very commonly used; balances computational power against simplicity • Feedback (or recurrent) • Usually very complex and nonlinear • (A forward-pass sketch of a multilayer network follows the figures below)
Most general multilayer network (Haykin, 1999)
Acyclic Network (Haykin, 1999)
Simple Feedforward Network (Haykin, 1999)
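A forward pass through a small fully connected feedforward network like the one pictured can be sketched in a few lines of Python. The layer sizes (3-4-2), the random weights, and the sigmoid activations here are assumptions for illustration, not taken from the slides.

```python
import math, random

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def forward(x, layers):
    """Propagate input x through a list of (weights, biases) layers."""
    for weights, biases in layers:
        # Each output unit: sigmoid of its weighted sum plus bias.
        x = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(weights, biases)]
    return x

def make_layer(n_in, n_out):
    """Random fully connected layer: n_out rows of n_in weights, plus biases."""
    return ([[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)],
            [random.uniform(-1, 1) for _ in range(n_out)])

random.seed(0)
net = [make_layer(3, 4), make_layer(4, 2)]  # one hidden layer, two outputs
print(forward([0.5, -0.2, 0.1], net))       # two values in (0, 1)
```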
Learning • High Level • Supervised, Unsupervised, Reinforcement • Low Level • Correlation, Competitive, Cooperative, Feedback • Hybrid Systems • Use prior knowledge and invariances to maximize the efficiency of the network
Uses and Evaluation • Uses • Classification, Clustering, Vector Quantization, Pattern Association, Function Approximation, Forecasting, Robotic Control, Optimization, Searching • Evaluation • Quality • Generalizability • Computation Time
References • Haykin, S. 1999. Neural Networks: A Comprehensive Foundation. Upper Saddle River, New Jersey: Simon and Schuster. • Hebb, D. O. 1949. The Organization of Behavior: A Neuropsychological Theory. New York: Wiley. • Hopfield, J. J. 1982. Neural networks and physical systems with emergent collective computational abilities. Proceedings of the National Academy of Sciences of the USA 79 (8): 2554–2558. • McCulloch, W. S., and W. Pitts. 1943. A logical calculus of the ideas immanent in nervous activity. Bulletin of Mathematical Biology 5 (4): 115–133. • Mehrotra, K., C. K. Mohan, and S. Ranka. 1997. Elements of Artificial Neural Networks. Cambridge, MA: The MIT Press. • Minsky, M., and S. Papert. 1969. Perceptrons: An Introduction to Computational Geometry. Cambridge, MA: The MIT Press. • Rashevsky, N. 1938. Mathematical Biophysics. Chicago: The University of Chicago Press. • Rosenblatt, F. 1958. The perceptron: A probabilistic model for information storage and organization in the brain. Psychological Review 65 (6): 386–408. • Rumelhart, D. E., and J. McClelland. 1986. Parallel Distributed Processing: Explorations in the Microstructure of Cognition. Cambridge, MA: The MIT Press.