WK1 - Introduction

CS 476: Networks of Neural Computation

WK1 – Introduction

Dr. Stathis Kasderidis

Dept. of Computer Science

University of Crete

Spring Semester, 2009

Contents

- Course structure and details
- Basic ideas of Neural Networks
- Historical development of Neural Networks
- Types of learning
- Optimisation techniques and the LMS method
- Conclusions

Course Details

- Duration: 13 weeks (2 Feb – 15 May 2009)
- Lecturer: Stathis Kasderidis
- E-mail: stathis@ics.forth.gr
- Meetings: After arrangement through e-mail.
- Assistants: Farmaki, Fasoulakis

- Hours:
- Every Tue and Wed, 11 am to 1 pm.
- Laboratory on Fri, 11 am to 1 pm.

Course

Course Timetable

- WK1 (3/5 Feb): Introduction
- WK2 (10/12 Feb): Perceptron
- WK3 (17/19 Feb): Multi-layer Perceptron
- WK4 (24/26 Feb): Radial Basis Networks
- WK5 (3/5 Mar): Recurrent Networks
- WK6 (10/12 Mar): Self-Organising Networks
- WK7 (17/19 Mar): Hebbian Learning
- WK8 (24/26 Mar): Hopfield Networks
- WK9 (31 Mar/2 Apr): Principal Component Analysis

Course Timetable (Cont)

- WK10 (7/9 Apr): Support Vector Machines
- WK11 (28/30 Apr): Stochastic Networks
- WK12 (5/7 May): Student Projects’ Presentation
- WK13 (12/14 May): Exams Preparation
- Every week:
- 3hrs Theory
- 1hr Demonstration

- 19 Mar 2009: Written mid-term exams (optional)

Course Timetable (Cont)

- Lab sessions take place every Friday, 11 am to 1 pm. In lab sessions you will be examined on written assignments, and you can get help between assignments.
- There will be four assignments during the term on the following dates:
- Fri 6 Mar (Ass1 – Perceptron / MLP / RBF)
- Fri 20 Mar (Ass2 – Recurrent / Self-organising)
- Fri 3 Apr (Ass3 – Hebbian / Hopfield)
- Fri 8 May (Ass4 – PCA/SVM/Stochastic)

Course Structure

- The final grade is divided as follows:
- Laboratory attendance (20%)
- Obligatory!

- Course project (40%)
- Starts in WK2. Presentation in WK12.
- Teams of 2-4 people, depending on class size. Projects are selected from a set of offered projects.

- Theory. Best of:
- Final Theory Exams (40%) or
- Final Theory Exams (25%) + Mid-term exams (15%)

Project Problems

- Problem categories:
- Time Series Prediction (Financial Series?)
- Color Segmentation with Self-Organising Networks.
- Robotic Arm control with Self-Organising Networks
- Pattern Classification (Geometric Shapes)
- Cognitive Modeling (ALCOVE model)

Suggested Tools

- Tools:
- MATLAB (+ Neural Networks Toolbox). Can be slow in large problems!
- TLearn: http://crl.ucsd.edu/innate/tlearn.html
- Any C/C++ compiler
- Avoid Java and other interpreted languages! Too slow!

What are Neural Networks?

- Models inspired by real nervous systems
- They have a mathematical and computational formulation
- Very general modelling tools
- A different approach from Symbolic AI (Connectionism)
- Many paradigms exist, but they are based on common ideas
- A type of graphical model
- Used in many scientific and technological areas, e.g.

Basic Ideas

What are Neural Networks? (Cont.)

What are Neural Networks? (Cont. 2)

- NNs & Physics: e.g. Spin Glasses
- NNs & Mathematics: e.g. Random Fields
- NNs & Philosophy: e.g. Theory of Mind, Consciousness
- NNs & Cognitive Science: e.g. Connectionist Models of High-Level Functions (Memory, Language, etc)
- NNs & Engineering: e.g. Control, Hybrid Systems, A-Life
- NNs & Neuroscience: e.g. Channel dynamics, Compartmental models

What are Neural Networks? (Cont. 3)

- NNs & Finance: e.g. Agent-based models of markets
- NNs & Social Science: e.g. Artificial Societies

General Characteristics I

- What do they look like?

General Characteristics II

- Node details:

- $Y = f(Act)$
- f is called the transfer function
- $Act = \sum_i X_i W_i - B$
- B is called the bias
- $W_i$ are called the weights
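
The node equations above can be sketched in a few lines of Python. This is a minimal illustration, not the course's own code: the function names and the choice of a step and a logistic transfer function are assumptions, since the slide's figure of transfer functions did not survive.

```python
import math

def step(act):
    """Hard-threshold transfer function: 1 if the activation is non-negative, else 0."""
    return 1.0 if act >= 0.0 else 0.0

def sigmoid(act):
    """Logistic transfer function, squashing the activation into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-act))

def neuron_output(x, w, bias, f=sigmoid):
    """Compute Y = f(Act) with Act = sum_i x_i * w_i - bias."""
    if len(x) != len(w):
        raise ValueError("inputs and weights must have the same length")
    act = sum(xi * wi for xi, wi in zip(x, w)) - bias
    return f(act)

# A two-input neuron that behaves like a logical AND under the step function
# (weights and bias chosen by hand for illustration).
and_weights, and_bias = [1.0, 1.0], 1.5
outputs = [neuron_output([a, b], and_weights, and_bias, f=step)
           for a in (0, 1) for b in (0, 1)]
```

Swapping `f` changes only the output shaping; the weighted sum minus the bias stays the same for every transfer function.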

General Characteristics III

- Form of transfer function:

General Characteristics IV

- Network Specification:
- Number of neurons
- Topology of connections (Recurrent, Feedforward, etc)
- Transfer function(s)
- Input types (representation: symbols, etc)
- Output types (representation: as above)
- Weight parameters, W
- Other (weights initialisation, Cost function, training criteria, etc)
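
One way to keep such a specification together is a small record type. The sketch below is hypothetical: the class name `NetworkSpec`, its field names and its default values are illustrative assumptions, not part of any toolbox mentioned in the course.

```python
from dataclasses import dataclass
import random

@dataclass
class NetworkSpec:
    """Hypothetical container for the items a network specification lists."""
    n_neurons: int
    topology: str = "feedforward"          # or "recurrent", etc.
    transfer_function: str = "sigmoid"
    input_type: str = "real-valued"
    output_type: str = "real-valued"
    cost_function: str = "sum_squared_error"
    init_range: float = 0.1                # weights start in [-init_range, init_range]
    seed: int = 0

    def initial_weights(self, n_inputs):
        """Draw one small random weight vector per neuron (a common initialisation)."""
        rng = random.Random(self.seed)
        return [[rng.uniform(-self.init_range, self.init_range)
                 for _ in range(n_inputs)]
                for _ in range(self.n_neurons)]

spec = NetworkSpec(n_neurons=3)
weights = spec.initial_weights(n_inputs=2)
```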

General Characteristics V

- Processing Modes:
- Recall
- “Learning”

General Characteristics VI

- Common properties of all Neural Networks:
- Distributed representations
- Graceful degradation under damage
- Noise robustness
- Non-linear mappings
- Generalisation and prototype extraction
- Allow memory access by content (content-addressable memory)
- Can work with incomplete input

Historical Development of Neural Networks

- History in brief:
- McCulloch-Pitts, 1943: Digital Neurons
- Hebb, 1949: Synaptic plasticity
- Rosenblatt, 1958: Perceptron
- Minsky & Papert, 1969: Perceptron Critique
- Kohonen, 1978: Self-Organising Maps
- Hopfield, 1982: Associative Memory
- Rumelhart & McClelland, 1986: Back-Prop algorithm
- Many people, 1985-today: EXPLOSION!

History

What is Learning in NN?

Def:

“Learning is a process by which the free parameters of a neural network are adapted through a process of stimulation by the environment in which the network is embedded. The type of learning is determined by the manner in which the parameter changes take place.”

[Mendel & McClaren (1970)]

Learning

Learning Sequence

- The network is stimulated by the environment;
- The network undergoes changes in its free parameters as a result of this stimulation;
- The network responds in a new way to the environment because of the changes that have occurred in its internal structure.

Learning Criteria

- Sum squared error
- Mean square error
- χ² statistic
- Mutual information
- Entropy
- Other (e.g. Dot product – ‘similarity’)
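
Several of these criteria are short enough to write out directly. The following is a sketch using the standard textbook definitions; the function names are illustrative, and mutual information is left out because it needs a joint distribution rather than two plain vectors.

```python
import math

def sum_squared_error(targets, outputs):
    """Sum over patterns of (target - output)^2."""
    return sum((d - y) ** 2 for d, y in zip(targets, outputs))

def mean_squared_error(targets, outputs):
    """Sum squared error divided by the number of patterns."""
    return sum_squared_error(targets, outputs) / len(targets)

def chi_squared(observed, expected):
    """Pearson chi-squared statistic: sum of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def entropy(probs):
    """Shannon entropy in bits of a discrete distribution (zero terms are skipped)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def dot_similarity(a, b):
    """Dot product of two vectors, used as a similarity score."""
    return sum(ai * bi for ai, bi in zip(a, b))
```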

Learning Paradigms

- Learning with a teacher (supervised learning)
- Learning without a teacher
- Reinforcement learning
- Unsupervised learning (self-organisation)

Families of Learning Algorithms

- Error-based learning
- $\Delta w_{kj}(n) = \eta\, e_k(n)\, x_j(n)$ (Delta rule)

- Memory-based learning (??)
- 1-Nearest Neighbour
- K-Nearest Neighbours

- Hebbian learning
- $\Delta w_{kj}(n) = \eta\, y_k(n)\, x_j(n)$
- $\Delta w_{kj}(n) = F(y_k(n), x_j(n))$ (more general case)

- Competitive learning
- $w_{ij}(n+1) = w_{ij}(n) + \eta\,(x_j(n) - w_{ij}(n))$ (for the winning neuron i only)
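
The three weight-update families above can be sketched for a single step as follows. This is a minimal illustration under stated assumptions: a linear neuron for the delta and Hebbian rules, squared Euclidean distance to pick the winner in the competitive rule, and a learning rate named `eta`. None of the function names come from the course material.

```python
def linear_output(w, x):
    """Output of a linear neuron: the weighted sum of its inputs."""
    return sum(wj * xj for wj, xj in zip(w, x))

def delta_rule(w, x, target, eta=0.1):
    """Error-based learning: w_j += eta * e * x_j, with e = target - output."""
    e = target - linear_output(w, x)
    return [wj + eta * e * xj for wj, xj in zip(w, x)]

def hebbian(w, x, eta=0.1):
    """Hebbian learning: w_j += eta * y * x_j, with y the neuron's own output."""
    y = linear_output(w, x)
    return [wj + eta * y * xj for wj, xj in zip(w, x)]

def competitive(weights, x, eta=0.1):
    """Competitive learning: only the unit closest to x moves toward x."""
    def sq_dist(w):
        return sum((xj - wj) ** 2 for xj, wj in zip(x, w))
    winner = min(range(len(weights)), key=lambda i: sq_dist(weights[i]))
    updated = [list(w) for w in weights]
    updated[winner] = [wj + eta * (xj - wj) for wj, xj in zip(weights[winner], x)]
    return winner, updated

# Repeating the delta rule on one pattern drives the output toward the target.
w = [0.0, 0.0]
for _ in range(200):
    w = delta_rule(w, [1.0, 2.0], target=1.0, eta=0.1)
```

Note that the plain Hebbian rule has no error term, so repeated application lets the weights grow without bound; practical variants add normalisation or decay.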

Families of Learning Algorithms II

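- Stochastic Networks
- Boltzmann learning
- $\Delta w_{kj}(n) = \eta\,(\rho_{kj}^{+}(n) - \rho_{kj}^{-}(n))$
- ($\rho_{kj}^{\pm}$ = average correlation of the states of neurons k and j, in the clamped (+) and free-running (−) conditions)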
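
As a rough sketch of the Boltzmann update, assume we already have two lists of sampled network states with values ±1: one collected with the visible units clamped to the data, and one collected while the network runs freely. The sampling itself is not shown, and the function names and the learning rate `eta` are illustrative assumptions.

```python
def average_correlation(states, k, j):
    """Average of s_k * s_j over a list of sampled state vectors."""
    return sum(s[k] * s[j] for s in states) / len(states)

def boltzmann_update(w_kj, clamped_states, free_states, k, j, eta=0.1):
    """Move w_kj by eta times (clamped correlation - free-running correlation)."""
    rho_plus = average_correlation(clamped_states, k, j)
    rho_minus = average_correlation(free_states, k, j)
    return w_kj + eta * (rho_plus - rho_minus)

# Hand-made samples for illustration only.
clamped = [[1, 1], [1, 1], [-1, -1], [1, -1]]   # correlation of units 0,1 is 0.5
free = [[1, -1], [1, 1], [-1, 1], [-1, -1]]     # correlation of units 0,1 is 0.0
new_w = boltzmann_update(0.0, clamped, free, k=0, j=1, eta=0.1)
```

The weight stops changing when the free-running correlations match the clamped ones, which is the sense in which the network has learned the data statistics.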

Learning Tasks

- Function approximation
- Association
- Auto-association
- Hetero-association

- Pattern recognition
- Control
- Filtering

Credit Assignment Problem

- Def: It is the problem of assigning credit or blame to the states that lead to useful or harmful outcomes
- Temporal Credit Assignment Problem: Find which actions in a period q = [t-T, t] lead to a useful outcome at time t and credit these actions, i.e.
- Outcome(t) = f(Actions(q))

- Structural Credit Assignment Problem: Find which states at time t lead to useful actions at time t, i.e.
- Actions(t) = g(State(t))

Statistical Nature of the Learning Process

- Assume that a set of examples is given:

- Assume that a statistical model of the generating process is given (regression equation):

- Where X is a vector random variable (the independent variable), D is a scalar random variable (the dependent variable), and ε is a random variable (the error term) with the following properties:

Bias / Var

Statistical Nature of the Learning Process II

- The first property says that ε has zero mean given any realisation of X
- The second property says that ε is uncorrelated with the regression function f(X) (principle of orthogonality)
- ε is called the intrinsic error
- Assume that the neural network describes an “approximation” to the regression function, which is:

Statistical Nature of the Learning Process III

- The weight vector w is obtained by minimising the cost function:

- We can re-write this, using expectation operators, as:

- … (after some algebra we get) ….

Statistical Nature of the Learning Process IV

- Thus to obtain w we need to optimise the function:

- … (after some more algebra!) ….

Statistical Nature of the Learning Process V

- B(w) is called bias (or approximation error)
- V(w) is called variance (or estimation error)
- The last relation shows the bias-variance dilemma:
- “We cannot minimise both bias and variance at the same time for a finite set T. Only as N → ∞ do both become zero.”

- Bias measures the “goodness” of our functional form in approximating the true regression function f(x)
- Variance measures the amount of information present in the data set T which is used for estimating F(x,w)
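
The equations for this derivation did not survive in the slides. The block below is a reconstruction under the assumption that the lecture follows the standard textbook treatment of the bias-variance decomposition. It writes ε for the intrinsic error, $E_T$ for the average over training sets T of size N, and treats w as the weight vector fitted on T. It should be read as a sketch, not as the lecturer's exact notation.

```latex
\begin{align}
T &= \{(\mathbf{x}_i, d_i)\}_{i=1}^{N}
  && \text{(set of examples)} \\
D &= f(\mathbf{X}) + \epsilon,
  \qquad E[\epsilon \mid \mathbf{x}] = 0,
  \qquad E[\epsilon\, f(\mathbf{X})] = 0
  && \text{(regression model)} \\
Y &= F(\mathbf{X}, \mathbf{w})
  && \text{(network approximation)} \\
\mathcal{E}(\mathbf{w}) &= \tfrac{1}{2} \sum_{i=1}^{N}
  \bigl(d_i - F(\mathbf{x}_i, \mathbf{w})\bigr)^2
  && \text{(cost function)} \\
E_T\bigl[(f(\mathbf{x}) - F(\mathbf{x}, \mathbf{w}))^2\bigr]
  &= B^2(\mathbf{w}) + V(\mathbf{w})
  && \text{(decomposition)} \\
B(\mathbf{w}) &= E_T[F(\mathbf{x}, \mathbf{w})] - f(\mathbf{x}) \\
V(\mathbf{w}) &= E_T\bigl[\bigl(F(\mathbf{x}, \mathbf{w})
  - E_T[F(\mathbf{x}, \mathbf{w})]\bigr)^2\bigr]
\end{align}
```

The cross term vanishes in the decomposition because $E_T[F(\mathbf{x}, \mathbf{w})] - f(\mathbf{x})$ does not depend on the particular training set, while the deviation of F from its own average has zero mean over T.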

Comments I

- We should distinguish Artificial NNs from bio-physical neural models (e.g. Blue Brain Project);
- Some NNs are Universal Approximators, e.g. feed-forward models are based on the Kolmogorov Theorem
- Can be combined with other methods, e.g. Neuro-Fuzzy Systems
- Flexible modeling tools for:
- Function approximation
- Pattern Classification
- Association
- Other

Conclusions

Comments II

- Advantages:
- Distributed representation allows co-activation of categories
- Graceful degradation
- Robustness to noise
- Automatic generalisation (of categories, etc)

Comments III

- Disadvantages:
- They cannot explain their function due to distributed representations
- We cannot add existing knowledge to neural networks as rules
- We cannot extract rules
- Network parameters found by trial and error (in general case)
