slide1

WK1 - Introduction

CS 476: Networks of Neural Computation

WK1 – Introduction

Dr. Stathis Kasderidis

Dept. of Computer Science

University of Crete

Spring Semester, 2009

slide2

Contents

  • Course structure and details
  • Basic ideas of Neural Networks
  • Historical development of Neural Networks
  • Types of learning
  • Optimisation techniques and the LMS method
  • Conclusions

Contents

slide3

Course Details

  • Duration: 13 weeks (2 Feb – 15 May 2009)
  • Lecturer: Stathis Kasderidis
    • E-mail: [email protected]
    • Meetings: After arrangement through e-mail.
    • Assistants: Farmaki, Fasoulakis
  • Hours:
    • Every Tue 11 am - 1 pm and Wed 11 am - 1 pm.
    • Laboratory on Fri 11 am - 1 pm.

Course

slide4

Course Timetable

  • WK1 (3/5 Feb): Introduction
  • WK2 (10/12 Feb): Perceptron
  • WK3 (17/19 Feb): Multi-layer Perceptron
  • WK4 (24/26 Feb): Radial Basis Networks
  • WK5 (3/5 Mar): Recurrent Networks
  • WK6 (10/12 Mar): Self-Organising Networks
  • WK7 (17/19 Mar): Hebbian Learning
  • WK8 (24/26 Mar): Hopfield Networks
  • WK9 (31 Mar/2 Apr): Principal Component Analysis

Course

slide5

Course Timetable (Cont)

  • WK10 (7/9 Apr): Support Vector Machines
  • WK11 (28/30 Apr): Stochastic Networks
  • WK12 (5/7 May): Student Projects’ Presentation
  • WK13 (12/14 May): Exams Preparation
  • Every week:
    • 3hrs Theory
    • 1hr Demonstration
  • 19 Mar 2009: Written mid-term exams (optional)

Course

slide6

Course Timetable (Cont)

  • Lab sessions take place every Friday, 11 am - 1 pm. In the lab sessions you will be examined on the written assignments, and you can get help in between assignments.
  • There will be four assignments during the term on the following dates:
    • Fri 6 Mar (Ass1 – Perceptron / MLP / RBF)
    • Fri 20 Mar (Ass2 – Recurrent / Self-organising)
    • Fri 3 Apr (Ass3 – Hebbian / Hopfield)
    • Fri 8 May (Ass4 – PCA/SVM/Stochastic)
slide7

Course Structure

  • Final grade is divided:
    • Laboratory attendance (20%)
      • Obligatory!
    • Course project (40%)
      • Starts at WK2. Presentation at WK12.
      • Teams of 2-4 people depending on class size. Selection from a set of offered projects.
    • Theory. Best of:
      • Final Theory Exams (40%) or
      • Final Theory Exams (25%) + Mid-term exams (15%)

Course

slide8

Project Problems

  • Problem categories:
    • Time Series Prediction (Financial Series?)
    • Color Segmentation with Self-Organising Networks.
    • Robotic Arm control with Self-Organising Networks
    • Pattern Classification (Geometric Shapes)
    • Cognitive Modeling (ALCOVE model)

Course

slide9

Suggested Tools

  • Tools:
    • MATLAB (+ Neural Networks Toolbox). Can be slow in large problems!
    • TLearn: http://crl.ucsd.edu/innate/tlearn.html
    • Any C/C++ compiler
    • Avoid Java and other interpreted languages! Too slow!

Course

slide10

What are Neural Networks?

  • Models inspired by real nervous systems
  • They have a mathematical and computational formulation
  • Very general modelling tools
  • A different approach from Symbolic AI (Connectionism)
  • Many paradigms exist, but all are based on common ideas
  • A type of graphical model
  • Used in many scientific and technological areas, e.g.:

Basic Ideas

slide12

What are Neural Networks? (Cont. 2)

  • NNs & Physics: e.g. Spin Glasses
  • NNs & Mathematics: e.g. Random Fields
  • NNs & Philosophy: e.g. Theory of Mind, Consciousness
  • NNs & Cognitive Science: e.g. Connectionist Models of High-Level Functions (Memory, Language, etc)
  • NNs & Engineering: e.g. Control, Hybrid Systems, A-Life
  • NNs & Neuroscience: e.g. Channel dynamics, Compartmental models

Basic Ideas

slide13

What are Neural Networks? (Cont. 3)

  • NNs & Finance: e.g. Agent-based models of markets
  • NNs & Social Science: e.g. Artificial Societies

Basic Ideas

slide14

General Characteristics I

  • What do they look like?

Basic Ideas

slide15

General Characteristics II

  • Node details (see the sketch below):
  • Y = f(Act)
  • f is called the transfer function
  • Act = Σ_i X_i·W_i − B
  • B is called the bias
  • W_i are called the weights
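
A minimal sketch of the node computation above, in Python with NumPy; the function name node_output and the default tanh transfer function are illustrative assumptions, not part of the slides.

```python
import numpy as np

def node_output(x, w, b, transfer=np.tanh):
    """Single node: Act = sum_i X_i * W_i - B, output Y = f(Act)."""
    act = np.dot(x, w) - b      # weighted sum of the inputs minus the bias
    return transfer(act)        # apply the transfer function f

# Example: one node with three inputs
x = np.array([0.5, -1.0, 2.0])   # inputs X_i
w = np.array([0.1, 0.4, -0.3])   # weights W_i
print(node_output(x, w, b=0.2))
```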

Basic Ideas

slide16

General Characteristics III

  • Form of transfer function:
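
The original slide presumably showed the transfer-function curves as a figure. Below is a small, hedged sketch of the forms most commonly used (threshold, logistic sigmoid, hyperbolic tangent); the exact set shown on the slide is an assumption.

```python
import numpy as np

def step(a):
    """Threshold unit: 1 if the activation is non-negative, else 0."""
    return np.where(a >= 0.0, 1.0, 0.0)

def logistic(a):
    """Logistic sigmoid: squashes the activation into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-a))

a = np.linspace(-3.0, 3.0, 7)
for f in (step, logistic, np.tanh):   # np.tanh squashes into (-1, 1)
    print(f.__name__, np.round(f(a), 3))
```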

Basic Ideas

slide17

General Characteristics IV

  • Network Specification:
    • Number of neurons
    • Topology of connections (Recurrent, Feedforward, etc)
    • Transfer function(s)
    • Input types (representation: symbols, etc)
    • Output types (representation: as above)
    • Weight parameters, W
    • Other (weight initialisation, cost function, training criteria, etc.)
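
To make the specification checklist concrete, here is a purely hypothetical example written as a Python dictionary; all field names and values are illustrative and not taken from the course.

```python
# Hypothetical network specification covering the items listed above.
network_spec = {
    "n_neurons": [4, 8, 2],                  # input / hidden / output layer sizes
    "topology": "feedforward",               # or "recurrent", etc.
    "transfer_functions": ["tanh", "tanh", "logistic"],
    "input_type": "real-valued vector",      # representation of the inputs
    "output_type": "class labels",           # representation of the outputs
    "weight_init": "uniform(-0.1, 0.1)",
    "cost_function": "sum squared error",
    "training_criterion": "stop when validation error rises",
}
```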

Basic Ideas

slide18

General Characteristics V

  • Processing Modes:
    • Recall
    • “Learning”

Basic Ideas

slide19

General Characteristics VI

  • Common properties of all Neural Networks:
    • Distributed representations
    • Graceful degradation due to damage
    • Noise robustness
    • Non-linear mappings
    • Generalisation and prototype extraction
    • Memory access by content (content-addressable memory)
    • Can work with incomplete input

Basic Ideas

slide20

Historical Development of Neural Networks

  • History in brief:
    • McCulloch-Pitts, 1943: Digital Neurons
    • Hebb, 1949: Synaptic plasticity
    • Rosenblatt, 1958: Perceptron
    • Minsky & Papert, 1969: Perceptron Critique
    • Kohonen, 1978: Self-Organising Maps
    • Hopfield, 1982: Associative Memory
    • Rumelhart & McClelland, 1986: Back-Prop algorithm
    • Many people, 1985-today: EXPLOSION!

History

slide21

What is Learning in NN?

Def:

“Learning is a process by which the free parameters of a neural network are adapted through a process of stimulation by the environment in which the network is embedded. The type of learning is determined by the manner in which the parameter changes take place.”

[Mendel & McClaren (1970)]

Learning

slide22

Learning Sequence

  • The network is stimulated by the environment;
  • The network undergoes changes in its free parameters as a result of this stimulation;
  • The network responds in a new way to the environment because of the changes that have occurred in its internal structure.

Learning

slide23

Learning Criteria

  • Sum squared error
  • Mean square error
  • χ² statistic
  • Mutual information
  • Entropy
  • Other (e.g. Dot product – ‘similarity’)
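
A hedged Python sketch of the first two criteria (sum squared error and mean squared error); the remaining criteria are omitted for brevity and the function names are illustrative.

```python
import numpy as np

def sum_squared_error(d, y):
    """Sum squared error over all examples: SSE = sum_n (d_n - y_n)^2."""
    e = np.asarray(d) - np.asarray(y)
    return float(np.sum(e ** 2))

def mean_squared_error(d, y):
    """Mean squared error: MSE = SSE / N."""
    return sum_squared_error(d, y) / len(d)

d = [1.0, 0.0, 1.0, 1.0]   # desired outputs
y = [0.9, 0.2, 0.7, 1.0]   # network outputs
print(sum_squared_error(d, y), mean_squared_error(d, y))
```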

Learning

slide24

Learning Paradigms

  • Learning with a teacher (supervised learning)
  • Learning without a teacher
    • Reinforcement learning
    • Unsupervised learning (self-organisation)

Learning

slide25

Families of Learning Algorithms

  • Error-based learning
    • Δw_kj(n) = η·e_k(n)·x_j(n) (Delta rule)
  • Memory-based learning
    • 1-Nearest Neighbour
    • K-Nearest Neighbours
  • Hebbian learning
    • Δw_kj(n) = η·y_k(n)·x_j(n)
    • Δw_kj(n) = F(y_k(n), x_j(n)) (more general case)
  • Competitive learning
    • Δw_ij(n) = η·(x_j(n) − w_ij(n)) for the winning neuron i (zero otherwise)
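
A hedged Python sketch of the three parameterised rules above (Delta, Hebbian, competitive); η is the learning rate and the updates are applied per presentation n, as on the slide. The function names and the toy dimensions are illustrative.

```python
import numpy as np

def delta_rule(w, x, e, eta=0.1):
    """Error-based (Delta) rule: w_kj <- w_kj + eta * e_k * x_j."""
    return w + eta * np.outer(e, x)

def hebbian_rule(w, x, y, eta=0.1):
    """Hebbian rule: w_kj <- w_kj + eta * y_k * x_j."""
    return w + eta * np.outer(y, x)

def competitive_rule(w, x, winner, eta=0.1):
    """Competitive rule: only the winning unit moves its weight
    vector towards the input, w_ij <- w_ij + eta * (x_j - w_ij)."""
    w = w.copy()
    w[winner] += eta * (x - w[winner])
    return w

# Toy usage: 2 output units, 3 inputs
w = np.zeros((2, 3))
x = np.array([1.0, 0.5, -0.5])
w = delta_rule(w, x, e=np.array([0.2, -0.1]))
w = hebbian_rule(w, x, y=np.array([1.0, 0.0]))
w = competitive_rule(w, x, winner=0)
print(w)
```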

Learning

slide26

Families of Learning Algorithms II

  • Stochastic Networks
    • Boltzmann learning
      • wkj(n) = h*(kj+(n)-kj-(n))
      • (kj* = avg corr of states of neurons i, j )

Learning

slide27

Learning Tasks

  • Function approximation
  • Association
    • Auto-association
    • Hetero-association
  • Pattern recognition
  • Control
  • Filtering

Learning

slide28

Credit Assignment Problem

  • Def: It is the problem of assigning credit or blame to the states / actions that lead to useful or harmful outcomes
  • Temporal Credit Assignment Problem: Find which actions in a period q = [t−T, t] lead to a useful outcome at time t and credit these actions, i.e.
    • Outcome(t) = f(Actions(q))
  • Structural Credit Assignment Problem: Find which states at time t lead to useful actions at time t, i.e.
    • Actions(t) = g(State(t))

Learning

slide29

Statistical Nature of the Learning Process

  • Assume that a set of examples is given:
  • Assume that a statistical model of the generating process is given (regression equation):
  • where X is a vector random variable (the independent variable), D is a scalar random variable (the dependent variable) and ε is a random variable with the following properties:
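
The formulas on this slide appear to have been images and did not survive into the transcript. Below is a reconstruction in standard notation, to be read as an assumption about the exact symbols used on the original slide.

```latex
% Training set of N labelled examples
T = \{(\mathbf{x}_i, d_i)\}_{i=1}^{N}

% Regression model assumed to generate the data
D = f(\mathbf{X}) + \varepsilon

% Properties of the intrinsic error term \varepsilon
\mathbb{E}[\varepsilon \mid \mathbf{X} = \mathbf{x}] = 0, \qquad
\mathbb{E}[\varepsilon \, f(\mathbf{X})] = 0
```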

Bias / Var

slide30

Statistical Nature of the Learning Process II

  • The first property says that ε has zero mean given any realisation of X
  • The second property says that ε is uncorrelated with the regression function f(X) (principle of orthogonality)
  • ε is called the intrinsic error
  • Assume that the neural network realises an “approximation” F(x, w) to the regression function f(x), where w is the weight vector

Bias / Var

slide31

Statistical Nature of the Learning Process III

  • The weight vector w is obtained by minimising the cost function:
  • We can re-write this, using expectation operators, as:
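
The cost function itself is missing from the transcript; a reconstruction in the standard least-squares form is given below. The 1/2 factor is conventional and the exact notation is an assumption.

```latex
% Cost function minimised with respect to the weight vector w
J(\mathbf{w}) = \tfrac{1}{2}\,\mathbb{E}\big[(D - F(\mathbf{X}, \mathbf{w}))^2\big]

% Using D = f(X) + \varepsilon and the two properties of \varepsilon:
J(\mathbf{w}) = \tfrac{1}{2}\,\mathbb{E}\big[(f(\mathbf{X}) - F(\mathbf{X}, \mathbf{w}))^2\big]
              + \tfrac{1}{2}\,\mathbb{E}[\varepsilon^2]
```

The second term is the intrinsic error and does not depend on w, so only the first term matters for the optimisation on the next slide.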

Bias / Var

  • … (after some algebra we get) ….
slide32

Statistical Nature of the Learning Process IV

  • Thus to obtain w we need to optimise the function:
  • … (after some more algebra!) ….
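
The result of the “more algebra”, which the next slide names B(w) and V(w), is reconstructed below in its standard form; the exact notation on the original slide is an assumption. The expectation E_T is taken over training sets T, and E[D | X = x] = f(x).

```latex
% Average estimation error at a point x, over training sets T
\mathbb{E}_T\big[(F(\mathbf{x}, T) - \mathbb{E}[D \mid \mathbf{X} = \mathbf{x}])^2\big]
  = B^2(\mathbf{w}) + V(\mathbf{w})

% Bias: how far the average model is from the true regression function
B(\mathbf{w}) = \mathbb{E}_T[F(\mathbf{x}, T)] - \mathbb{E}[D \mid \mathbf{X} = \mathbf{x}]

% Variance: how much the model varies across training sets
V(\mathbf{w}) = \mathbb{E}_T\big[(F(\mathbf{x}, T) - \mathbb{E}_T[F(\mathbf{x}, T)])^2\big]
```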

Bias / Var

slide33

Statistical Nature of the Learning Process V

  • B(w) is called bias (or approximation error)
  • V(w) is called variance (or estimation error)
  • The last relation shows the bias-variance dilemma:
    • “We cannot minimise both bias and variance at the same time for a finite training set T. Only when N → ∞ do both become zero.”
  • Bias measures the “goodness” of our functional form in approximating the true regression function f(x)
  • Variance measures the amount of information present in the data set T which is used for estimating F(x,w)

Bias / Var

slide34

Comments I

  • We should distinguish artificial NNs from bio-physical neural models (e.g. the Blue Brain Project);
  • Some NNs are Universal Approximators, e.g. feed-forward models are based on the Kolmogorov Theorem
  • Can be combined with other methods, e.g. Neuro-Fuzzy Systems
  • Flexible modeling tools for:
    • Function approximation
    • Pattern Classification
    • Association
    • Other

Conclusions

slide35

Comments II

  • Advantages:
    • Distributed representation allows co-activation of categories
    • Graceful degradation
    • Robustness to noise
    • Automatic generalisation (of categories, etc)

Conclusions

slide36

Comments III

  • Disadvantages:
    • They cannot explain their function due to distributed representations
    • We cannot add existing knowledge to neural networks as rules
    • We cannot extract rules
    • Network parameters are found by trial and error (in the general case)

Conclusions
