WK1 - Introduction

CS 476: Networks of Neural Computation

WK1 – Introduction

Dr. Stathis Kasderidis

Dept. of Computer Science

University of Crete

Spring Semester, 2009


Contents

  • Course structure and details

  • Basic ideas of Neural Networks

  • Historical development of Neural Networks

  • Types of learning

  • Optimisation techniques and the LMS method

  • Conclusions



Course Details

  • Duration: 13 weeks (2 Feb – 15 May 2009)

  • Lecturer: Stathis Kasderidis

    • E-mail: [email protected]

    • Meetings: by arrangement through e-mail.

    • Assistants: Farmaki, Fasoulakis

  • Hours:

    • Every Tue 11 am - 1 pm and Wed 11 am - 1 pm.

    • Laboratory on Fri, 11 am - 1 pm.



Course Timetable

  • WK1 (3/5 Feb): Introduction

  • WK2 (10/12 Feb): Perceptron

  • WK3 (17/19 Feb): Multi-layer Perceptron

  • WK4 (24/26 Feb): Radial Basis Networks

  • WK5 (3/5 Mar): Recurrent Networks

  • WK6 (10/12 Mar): Self-Organising Networks

  • WK7 (17/19 Mar): Hebbian Learning

  • WK8 (24/26 Mar): Hopfield Networks

  • WK9 (31 Mar / 2 Apr): Principal Component Analysis



Course Timetable (Cont)

  • WK10 (7/9 Apr): Support Vector Machines

  • WK11 (28/30 Apr): Stochastic Networks

  • WK12 (5/7 May): Student Projects’ Presentation

  • WK13 (12/14 May): Exams Preparation

  • Every week:

    • 3hrs Theory

    • 1hr Demonstration

  • 19 Mar 2009: Written mid-term exams (optional)



Course Timetable (Cont)

  • Lab sessions will take place every Friday, 11 am - 1 pm. In the lab sessions you will be examined on written assignments, and you can get help between assignments.

  • There will be four assignments during the term on the following dates:

    • Fri 6 Mar (Ass1 – Perceptron / MLP / RBF)

    • Fri 20 Mar (Ass2 – Recurrent / Self-organising)

    • Fri 3 Apr (Ass3 – Hebbian / Hopfield)

    • Fri 8 May (Ass4 – PCA/SVM/Stochastic)


Course Structure

  • The final grade is divided as follows:

    • Laboratory attendance (20%)

      • Obligatory!

    • Course project (40%)

      • Starts at WK2. Presentation at WK12.

      • Teams of 2-4 people depending on class size. Selection from a set of offered projects.

    • Theory. Best of:

      • Final Theory Exams (40%) or

      • Final Theory Exams (25%) + Mid-term exams (15%)



Project Problems

  • Problem categories:

    • Time Series Prediction (Financial Series?)

    • Color Segmentation with Self-Organising Networks.

    • Robotic Arm control with Self-Organising Networks

    • Pattern Classification (Geometric Shapes)

    • Cognitive Modeling (ALCOVE model)



Suggested Tools

  • Tools:

    • MATLAB (+ Neural Networks Toolbox). Can be slow on large problems!

    • TLearn: http://crl.ucsd.edu/innate/tlearn.html

    • Any C/C++ compiler

    • Avoid Java and other interpreted languages! Too slow!



What are Neural Networks?

  • Models inspired by real nervous systems

  • They have a mathematical and computational formulation

  • Very general modelling tools

  • A different approach from Symbolic AI (Connectionism)

  • Many paradigms exist, but they are based on common ideas

  • A type of graphical model

  • Used in many scientific and technological areas, e.g.:




What are Neural Networks? (Cont. 2)

  • NNs & Physics: e.g. Spin Glasses

  • NNs & Mathematics: e.g. Random Fields

  • NNs & Philosophy: e.g. Theory of Mind, Consciousness

  • NNs & Cognitive Science: e.g. Connectionist Models of High-Level Functions (Memory, Language, etc)

  • NNs & Engineering: e.g. Control, Hybrid Systems, A-Life

  • NNs & Neuroscience: e.g. Channel dynamics, Compartmental models



What are Neural Networks? (Cont. 3)

  • NNs & Finance: e.g. Agent-based models of markets

  • NNs & Social Science: e.g. Artificial Societies



General Characteristics I

  • What do they look like?



General Characteristics II

  • Node details:

  • Y = f(Act)

  • f is called the transfer function

  • Act = Σi Xi·Wi − B

  • B is called the bias

  • The Wi are called weights
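As an illustrative sketch only (the course suggests MATLAB or C/C++ for the actual projects), the node computation above can be written in a few lines of Python. The logistic sigmoid used for f here is an assumption; the slide does not fix a particular transfer function:

```python
import math

def neuron_output(x, w, b):
    """Single node: Act = sum_i X_i * W_i - B, then Y = f(Act).

    The logistic sigmoid below is just one possible choice for f.
    """
    act = sum(xi * wi for xi, wi in zip(x, w)) - b
    return 1.0 / (1.0 + math.exp(-act))

# Illustrative values for a node with two inputs
print(neuron_output(x=[1.0, 0.5], w=[0.8, -0.3], b=0.2))
```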



General Characteristics III

  • Form of the transfer function: shown as a figure in the original slides; common choices are sketched below.
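The specific functions below are an assumption about what the figure contained; they are three forms commonly used for this purpose (a minimal Python sketch):

```python
import math

def threshold(a):
    """Hard limiter: 1 if the activation is non-negative, else 0."""
    return 1.0 if a >= 0 else 0.0

def logistic(a):
    """Smooth sigmoid, bounded in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-a))

def tanh_tf(a):
    """Hyperbolic tangent, bounded in (-1, 1)."""
    return math.tanh(a)

for a in (-2.0, 0.0, 2.0):
    print(a, threshold(a), logistic(a), tanh_tf(a))
```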



General Characteristics IV

  • Network Specification:

    • Number of neurons

    • Topology of connections (Recurrent, Feedforward, etc)

    • Transfer function(s)

    • Input types (representation: symbols, etc)

    • Output types (representation: as above)

    • Weight parameters, W

    • Other (weight initialisation, cost function, training criteria, etc.)



General Characteristics V

  • Processing Modes:

    • Recall

    • “Learning”



General Characteristics VI

  • Common properties of all Neural Networks:

    • Distributed representations

    • Graceful degradation due to damage

    • Noise robustness

    • Non-linear mappings

    • Generalisation and prototype extraction

    • Allow access to memory by content

    • Can work with incomplete input



Historical Development of Neural Networks

  • History in brief:

    • McCulloch-Pitts, 1943: Digital Neurons

    • Hebb, 1949: Synaptic plasticity

    • Rosenblatt, 1958: Perceptron

    • Minsky & Papert, 1969: Perceptron Critique

    • Kohonen, 1978: Self-Organising Maps

    • Hopfield, 1982: Associative Memory

    • Rumelhart & McClelland, 1986: Back-Prop algorithm

    • Many people, 1985-today: EXPLOSION!



What is Learning in NN?

Def:

“Learning is a process by which the free parameters of a neural network are adapted through a process of stimulation by the environment in which the network is embedded. The type of learning is determined by the manner in which the parameter changes take place”

[Mendel & McClaren (1970)]



Learning Sequence

  • The network is stimulated by the environment;

  • The network undergoes changes in its free parameters as a result of this stimulation;

  • The network responds in a new way to the environment because of the changes that have occurred in its internal structure.



Learning Criteria

  • Sum squared error

  • Mean square error

  • χ² (chi-squared) statistic

  • Mutual information

  • Entropy

  • Other (e.g. Dot product – ‘similarity’)
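As a rough illustration of the first two criteria in the list above (a minimal Python sketch; the function names are mine):

```python
def sum_squared_error(targets, outputs):
    """Sum of squared differences between desired and actual outputs."""
    return sum((d - y) ** 2 for d, y in zip(targets, outputs))

def mean_squared_error(targets, outputs):
    """Sum squared error averaged over the number of examples."""
    return sum_squared_error(targets, outputs) / len(targets)

print(sum_squared_error([1.0, 0.0], [0.8, 0.3]))   # ~0.13
print(mean_squared_error([1.0, 0.0], [0.8, 0.3]))  # ~0.065
```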



Learning Paradigms

  • Learning with a teacher (supervised learning)

  • Learning without a teacher

    • Reinforcement learning

    • Unsupervised learning (self-organisation)



Families of Learning Algorithms

  • Error-based learning

    • Δwkj(n) = η·ek(n)·xj(n)  (Delta rule)

  • Memory-based learning

    • 1-Nearest Neighbour

    • K-Nearest Neighbours

  • Hebbian learning

    • Δwkj(n) = η·yk(n)·xj(n)

    • Δwkj(n) = F(yk(n), xj(n))  (more general case)

  • Competitive learning

    • Δwij(n) = η·(xj(n) − wij(n))
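A minimal Python sketch of the update rules listed above, written for a single scalar weight (the learning-rate value and the scalar formulation are mine; the rules themselves are the ones on this slide):

```python
ETA = 0.1  # learning rate (eta); illustrative value only

def delta_rule(w, x, error):
    """Error-based learning: w <- w + eta * e * x."""
    return w + ETA * error * x

def hebbian_rule(w, x, y):
    """Hebbian learning: w <- w + eta * y * x."""
    return w + ETA * y * x

def competitive_rule(w, x):
    """Competitive learning (applied to the winning unit): w <- w + eta * (x - w)."""
    return w + ETA * (x - w)

w = 0.5
print(delta_rule(w, x=1.0, error=0.2))   # ~0.52
print(hebbian_rule(w, x=1.0, y=0.8))     # ~0.58
print(competitive_rule(w, x=1.0))        # ~0.55
```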



Families of Learning Algorithms II

  • Stochastic Networks

    • Boltzmann learning

      • Δwkj(n) = η·(ρ⁺kj(n) − ρ⁻kj(n))

      • (ρ±kj = average correlation of the states of neurons k and j, in the clamped (+) and free-running (−) phases)



Learning Tasks

  • Function approximation

  • Association

    • Auto-association

    • Hetero-association

  • Pattern recognition

  • Control

  • Filtering



Credit Assignment Problem

  • Def: It is the problem of providing credit or blame to states that lead to useful / harmful outcomes

  • Temporal Credit Assignment Problem: Find which actions in a period q = [t−T, t] lead to a useful outcome at time t and credit these actions, i.e.

    • Outcome(t) = f(Actions(q))

  • Structural Credit Assignment Problem: Find which states at time t lead to useful actions at time t, i.e.

    • Actions(t) = g(State(t))



Statistical Nature of the Learning Process

  • Assume that a set of training examples, T, is given;

  • Assume that a statistical model of the generating process is given (a regression equation);

  • where X is a vector random variable (the independent variable), D is a scalar random variable (the dependent variable) and ε is a random variable with the following properties:
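The equations on this slide appear as images in the original. A reconstruction consistent with the surrounding text, in standard regression-model notation (the exact symbols are my choice), is:

```latex
% Training sample of N input-output examples
T = \{(\mathbf{x}_i, d_i)\}, \quad i = 1, \dots, N

% Regression model relating the dependent and independent variables
D = f(X) + \varepsilon

% Properties of the error term (described in words on the next slide)
E[\varepsilon \mid \mathbf{x}] = 0, \qquad E[\varepsilon \, f(X)] = 0
```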



Statistical Nature of the Learning Process II

  • The first property says that ε has zero mean given any realisation of X

  • The second property says that ε is uncorrelated with the regression function f(X) (principle of orthogonality)

  • ε is called the intrinsic error

  • Assume that the neural network describes an “approximation” to the regression function, which is: Y = F(X, w)



Statistical Nature of the Learning Process III

  • The weight vector w is obtained by minimising the cost function:

  • We can re-write this, using expectation operators, as:


  • … (after some algebra we get) ….
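The cost function and its expectation form appear as images in the original slides. A reconstruction following the standard treatment of this material (e.g. Haykin's textbook; the notation is my choice) is:

```latex
% Cost function over the training sample T = {(x_i, d_i)}, i = 1..N
\mathcal{E}(\mathbf{w}) = \frac{1}{2} \sum_{i=1}^{N} \bigl( d_i - F(\mathbf{x}_i, \mathbf{w}) \bigr)^2

% The same quantity written with the expectation (average) operator over T
\mathcal{E}(\mathbf{w}) = \tfrac{1}{2} \, E_T \bigl[ \bigl( D - F(X, \mathbf{w}) \bigr)^2 \bigr]
```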


Statistical Nature of the Learning Process IV

  • Thus to obtain w we need to optimise the function:

  • … (after some more algebra!) ….
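The equations hidden behind “after some more algebra” are also images in the original. The function to optimise and its standard bias-variance decomposition (again following the usual textbook treatment, e.g. Haykin) are:

```latex
% Quantity to minimise with respect to w: the squared estimation error at a
% point x, averaged over all possible training sets T
E_T \bigl[ \bigl( F(\mathbf{x}, T) - f(\mathbf{x}) \bigr)^2 \bigr] = B^2(\mathbf{w}) + V(\mathbf{w})

% Bias: how far the average model lies from the true regression function
B(\mathbf{w}) = E_T \bigl[ F(\mathbf{x}, T) \bigr] - f(\mathbf{x})

% Variance: how much the model fluctuates from one training set to another
V(\mathbf{w}) = E_T \Bigl[ \bigl( F(\mathbf{x}, T) - E_T[F(\mathbf{x}, T)] \bigr)^2 \Bigr]
```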



Statistical Nature of the Learning Process V

  • B(w) is called bias (or approximation error)

  • V(w) is called variance (or estimation error)

  • The last relation shows the bias-variance dilemma:

    • “We cannot minimise both bias and variance at the same time for a finite training set, T. Only when N → ∞ do both become zero.”

  • Bias measures the “goodness” of our functional form in approximating the true regression function f(x)

  • Variance measures how much the estimate F(x,w) varies with the particular data set T used to estimate it



Comments I

  • We should distinguish artificial NNs from bio-physical neural models (e.g. the Blue Brain Project);

  • Some NNs are Universal Approximators, e.g. feed-forward models are based on the Kolmogorov Theorem

  • Can be combined with other methods, e.g. Neuro-Fuzzy Systems

  • Flexible modeling tools for:

    • Function approximation

    • Pattern Classification

    • Association

    • Other



Comments II

  • Advantages:

    • Distributed representation allows co-activation of categories

    • Graceful degradation

    • Robustness to noise

    • Automatic generalisation (of categories, etc)



Comments III

  • Disadvantages:

    • They cannot explain their function due to distributed representations

    • We cannot add existing knowledge to neural networks as rules

    • We cannot extract rules

    • Network parameters are found by trial and error (in the general case)
