
Computational Intelligence Winter Term 2010/11


Presentation Transcript


  1. Computational Intelligence Winter Term 2010/11 Prof. Dr. Günter Rudolph Lehrstuhl für Algorithm Engineering (LS 11) Fakultät für Informatik TU Dortmund

  2. Plan for Today
  • Organization (Lectures / Tutorials)
  • Overview CI
  • Introduction to ANN
  • McCulloch-Pitts Neuron (MCP)
  • Minsky / Papert Perceptron (MPP)

  3. Organizational Issues
  Who are you? You are either
  • studying "Automation and Robotics" (Master of Science), module "Optimization",
  • or studying "Informatik", in the BA module "Einführung in die Computational Intelligence" or the Hauptdiplom elective lecture (SPG 6 & 7).

  4. Organizational Issues
  Who am I? Günter Rudolph, Fakultät für Informatik, LS 11
  Guenter.Rudolph@tu-dortmund.de ← best way to contact me
  OH-14, R. 232 ← if you want to see me
  Office hours: Tuesday, 10:30–11:30 am, and by appointment

  5. Organizational Issues
  Lectures: Wednesday, 10:15–11:45, OH-14, R. 304
  Tutorials: Wednesday, 12:00–12:45 or 16:00–16:45, OH-14, R. 304
  Tutor: Dr. Mohammad Abam, LS 2
  Information: http://ls11-www.cs.uni-dortmund.de/people/rudolph/teaching/lectures/CI/WS2010-11/lecture.jsp
  Slides: see web
  Literature: see web

  6. Prerequisites
  Knowledge about mathematics, programming, and logic is helpful.
  But what if something is unknown to me?
  • covered in the lecture
  • pointers to literature
  ... and don't hesitate to ask!

  7. Overview "Computational Intelligence"
  What is CI? ⇒ umbrella term for computational methods inspired by nature
  • backbone: artificial neural networks, evolutionary algorithms, fuzzy systems
  • new developments: swarm intelligence, artificial immune systems, growth processes in trees, ...

  8. Overview "Computational Intelligence"
  • term "computational intelligence" coined by John Bezdek (FL, USA)
  • originally intended as a demarcation line ⇒ establish a border between artificial and computational intelligence
  • nowadays: blurring border
  Our goals:
  • know what CI methods are good for!
  • know when to refrain from CI methods!
  • know why they work at all!
  • know how to apply and adjust CI methods to your problem!

  9. Introduction to Artificial Neural Networks
  Biological Prototype: the neuron
  • information gathering (D)
  • information processing (C)
  • information propagation (A / S)
  human being: 10^12 neurons; electricity in mV range; speed: 120 m/s
  [Figure: neuron with dendrites (D), cell body (C) with nucleus, axon (A), synapse (S)]

  10. Introduction to Artificial Neural Networks
  Abstraction: dendrites → signal input, nucleus / cell body → signal processing, axon → signal output, synapse → …

  11. Introduction to Artificial Neural Networks
  Model: inputs x1, x2, …, xn feed a function f, which emits f(x1, x2, …, xn)
  McCulloch-Pitts neuron 1943: xi ∈ {0, 1} =: B, f: B^n → B

  12. Introduction to Artificial Neural Networks
  1943: Warren McCulloch / Walter Pitts
  • description of neurological networks → model: McCulloch-Pitts neuron (MCP)
  • basic idea: a neuron is either active or inactive; skills result from connecting neurons
  • considered static networks (i.e., connections are constructed, not learnt)

  13. Introduction to Artificial Neural Networks
  McCulloch-Pitts neuron: n binary input signals x1, …, xn; threshold θ > 0
  ⇒ can be realized:
  • boolean OR: all n inputs feed one neuron with threshold θ = 1
  • boolean AND: all n inputs feed one neuron with threshold θ = n

  14. Introduction to Artificial Neural Networks
  McCulloch-Pitts neuron, in addition: m binary inhibitory signals y1, …, ym
  • if at least one yj = 1, then output = 0
  • otherwise: if the sum of the inputs ≥ threshold, then output = 1, else output = 0
  NOT gate: x1 wired as inhibitory signal y1 into a neuron with threshold θ = 0
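The MCP rule of slides 13/14 fits in a few lines of Python. This is a minimal sketch, not from the lecture: the name `mcp_neuron` and its signature are my own choices. Output is 0 if any inhibitory signal fires; otherwise the neuron fires iff the input sum reaches the threshold.

```python
def mcp_neuron(inputs, threshold, inhibitory=()):
    """McCulloch-Pitts neuron with absolute inhibition (slides 13/14).

    Binary excitatory inputs; any firing inhibitory signal forces the
    output to 0, otherwise the neuron fires iff the sum of the inputs
    reaches the threshold.
    """
    if any(y == 1 for y in inhibitory):
        return 0
    return 1 if sum(inputs) >= threshold else 0

# boolean OR: threshold 1; boolean AND: threshold n; NOT via inhibition
assert mcp_neuron([0, 1, 0], threshold=1) == 1           # OR(0,1,0)
assert mcp_neuron([1, 1, 1], threshold=3) == 1           # AND(1,1,1)
assert mcp_neuron([], threshold=0, inhibitory=[1]) == 0  # NOT(1)
assert mcp_neuron([], threshold=0, inhibitory=[0]) == 1  # NOT(0)
```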

  15. Introduction to Artificial Neural Networks
  Analogies between biological and artificial neurons [table not preserved in transcript]

  16. Introduction to Artificial Neural Networks
  Assumption: inputs are also available in inverted form, i.e., ∃ inverted inputs.
  Example: a neuron with inputs x1, x2 and threshold θ fires ⇔ x1 + x2 ≥ θ.
  Theorem: Every logical function F: B^n → B can be simulated with a two-layered McCulloch-Pitts net.
  [Figure: example of a two-layered net with first-layer neurons of thresholds ≥ 3, ≥ 3, ≥ 2 over inputs x1, …, x4, feeding an output neuron with threshold ≥ 1]

  17. Introduction to Artificial Neural Networks
  Proof (by construction):
  • Every boolean function F can be transformed into disjunctive normal form ⇒ 2 layers (AND - OR)
  • Every clause gets a decoding neuron with θ = n ⇒ output = 1 only if the clause is satisfied (AND gate)
  • All outputs of the decoding neurons are inputs of a neuron with θ = 1 (OR gate)
  q.e.d.
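A sketch of this construction, reusing the `mcp_neuron` function from above (the helper name `dnf_net` and the clause encoding as variable-to-value maps are my own; inverted inputs are assumed available, as on slide 16):

```python
from itertools import product

def dnf_net(clauses, x):
    """Two-layered MCP net for a boolean function given in DNF.

    Layer 1: one decoding AND neuron per clause, with threshold equal
    to the number of its literals. Layer 2: an OR neuron with threshold 1.
    """
    decoded = []
    for clause in clauses:
        # literal = x_i if value 1 is required, inverted input otherwise
        literals = [x[i] if v == 1 else 1 - x[i] for i, v in clause.items()]
        decoded.append(mcp_neuron(literals, threshold=len(literals)))
    return mcp_neuron(decoded, threshold=1)

# XOR in DNF: (x1 AND NOT x2) OR (NOT x1 AND x2)
xor_dnf = [{0: 1, 1: 0}, {0: 0, 1: 1}]
for x in product([0, 1], repeat=2):
    assert dnf_net(xor_dnf, x) == x[0] ^ x[1]
```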

  18. Introduction to Artificial Neural Networks
  Generalization: inputs with weights
  A neuron with weights 0.2, 0.4, 0.3 and threshold 0.7 fires iff 0.2 x1 + 0.4 x2 + 0.3 x3 ≥ 0.7.
  Multiplication by 10 yields the equivalent condition 2 x1 + 4 x2 + 3 x3 ≥ 7
  ⇒ duplicate inputs! ⇒ equivalent!

  19. Introduction to Artificial Neural Networks
  Theorem: Weighted and unweighted MCP nets are equivalent for weights ∈ Q+.
  Proof:
  "⇒": Let the weighted neuron fire iff a1/b1 · x1 + … + an/bn · xn ≥ a0 with ai, bi ∈ N. Multiplication with b1 b2 ⋯ bn yields an inequality with coefficients in N. Duplicate input xi such that we get ai b1 b2 ⋯ bi−1 bi+1 ⋯ bn inputs. Threshold θ = a0 b1 ⋯ bn.
  "⇐": Set all weights to 1. q.e.d.
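The "⇒" direction translates directly into code. A sketch under my own naming (`to_unweighted`); one deviation from the proof: the lcm of the denominators is used instead of their full product, which also clears the fractions and keeps the duplication counts small.

```python
from fractions import Fraction
from math import lcm

def to_unweighted(weights, theta):
    """Positive rational weights -> equivalent unweighted MCP neuron.

    Scaling by a common denominator makes all coefficients natural
    numbers; input x_i is then simply duplicated that many times.
    """
    fracs = [Fraction(v) for v in weights + [theta]]
    m = lcm(*(f.denominator for f in fracs))
    counts = [int(f * m) for f in fracs[:-1]]  # copies of each input
    return counts, int(fracs[-1] * m)          # integer threshold

# the example from slide 18: 0.2 x1 + 0.4 x2 + 0.3 x3 >= 0.7
print(to_unweighted(["0.2", "0.4", "0.3"], "0.7"))  # ([2, 4, 3], 7)
```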

  20. Introduction to Artificial Neural Networks
  Conclusion for MCP nets:
  + feed-forward: able to compute any Boolean function
  + recursive: able to simulate a DFA
  − very similar to conventional logical circuits
  − difficult to construct
  − no good learning algorithm available

  21. Introduction to Artificial Neural Networks
  Perceptron (Rosenblatt 1958)
  → complex model → reduced by Minsky & Papert to what is "necessary"
  → Minsky-Papert perceptron (MPP), 1969
  → essential difference: x ∈ [0,1] ⊂ R
  What can a single MPP do? Isolating x2 in w1 x1 + w2 x2 ≥ θ yields a separating line that separates R^2 into 2 classes.
  [Figure: unit square with corners labeled J (fires) / N (does not fire), split by the separating line]

  22. Introduction to Artificial Neural Networks
  A single MPP fires iff w1 x1 + w2 x2 ≥ θ; NAND, AND, OR, NOR are realizable this way. But XOR?
  • x = (0,0), output 0 ⇒ 0 < θ
  • x = (0,1), output 1 ⇒ w2 ≥ θ
  • x = (1,0), output 1 ⇒ w1 ≥ θ
  ⇒ w1 + w2 ≥ 2θ
  • x = (1,1), output 0 ⇒ w1 + w2 < θ
  contradiction (since θ > 0)!
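The contradiction can also be seen by exhaustive search. The illustrative sketch below (my own helper `find_mpp`, restricted to a small integer grid) finds weights for AND but reports that no single MPP realizes XOR:

```python
from itertools import product

def find_mpp(truth_table):
    """Search a small integer grid for (w1, w2, theta) realizing the gate."""
    for w1, w2, theta in product(range(-3, 4), repeat=3):
        if all((w1 * x1 + w2 * x2 >= theta) == bool(y)
               for (x1, x2), y in truth_table.items()):
            return w1, w2, theta
    return None

AND = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}
XOR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}
print(find_mpp(AND))  # a solution, e.g. (1, 1, 2)
print(find_mpp(XOR))  # None -- no separating line exists
```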

  23. Introduction to Artificial Neural Networks
  1969: Marvin Minsky / Seymour Papert
  • book "Perceptrons" → analysis of mathematical properties of perceptrons
  • disillusioning result: perceptrons fail to solve a number of trivial problems!
  • XOR problem, parity problem, connectivity problem
  • "conclusion": all artificial neurons have this kind of weakness! research in this field is a scientific dead end!
  • consequence: research funding for ANN cut down extremely (~ 15 years)

  24. Introduction to Artificial Neural Networks
  How to leave the "dead end":
  • Multilayer perceptrons: a two-layered net of MPPs realizes XOR
  • Nonlinear separating functions: g(x1, x2) = 2 x1 + 2 x2 − 4 x1 x2 − 1 with θ = 0;
  g(0,0) = −1, g(0,1) = +1, g(1,0) = +1, g(1,1) = −1 ⇒ realizes XOR
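The nonlinear variant is easy to verify; a plain check of the four corner values from the slide:

```python
def g(x1, x2):
    return 2 * x1 + 2 * x2 - 4 * x1 * x2 - 1

# with threshold 0: fire iff g(x1, x2) >= 0 -- reproduces XOR
for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print((x1, x2), g(x1, x2), int(g(x1, x2) >= 0))
# (0,0) -1 0 | (0,1) 1 1 | (1,0) 1 1 | (1,1) -1 0
```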

  25. Introduction to Artificial Neural Networks
  How to obtain weights wi and threshold θ?
  As yet: by construction. Example: the NAND gate requires the solution of a system of linear inequalities (∈ P):
  • x = (0,0), output 1 ⇒ 0 ≥ θ
  • x = (0,1), output 1 ⇒ w2 ≥ θ
  • x = (1,0), output 1 ⇒ w1 ≥ θ
  • x = (1,1), output 0 ⇒ w1 + w2 < θ
  (e.g.: w1 = w2 = −2, θ = −3)
  Now: by "learning" / training.
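The proposed solution can be checked against all four cases (no assumptions beyond the slide's numbers):

```python
w1, w2, theta = -2, -2, -3
NAND = {(0, 0): 1, (0, 1): 1, (1, 0): 1, (1, 1): 0}
for (x1, x2), y in NAND.items():
    assert int(w1 * x1 + w2 * x2 >= theta) == y
print("w1 = w2 = -2, theta = -3 realizes NAND")
```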

  26. Introduction to Artificial Neural Networks
  Perceptron Learning
  Assumption: test examples with correct I/O behavior are available.
  Principle:
  1. choose initial weights in an arbitrary manner
  2. feed in a test pattern
  3. if the output of the perceptron is wrong, then change the weights
  4. goto (2) until output is correct for all test patterns
  graphically: → translation and rotation of separating lines

  27. Introduction to Artificial Neural Networks
  Perceptron Learning. P: set of positive examples, N: set of negative examples.
  1. choose w0 at random, t = 0
  2. choose an arbitrary x ∈ P ∪ N
  3. if x ∈ P and wt'x > 0, then goto 2; if x ∈ N and wt'x ≤ 0, then goto 2 (I/O correct!)
  4. if x ∈ P and wt'x ≤ 0 (should be > 0), then wt+1 = wt + x; t++; goto 2 (note: (w+x)'x = w'x + x'x > w'x)
  5. if x ∈ N and wt'x > 0 (should be ≤ 0), then wt+1 = wt − x; t++; goto 2 (note: (w−x)'x = w'x − x'x < w'x)
  6. stop? if I/O is correct for all examples!
  remark: the algorithm converges and is finite; worst case: exponential runtime

  28. Introduction to Artificial Neural Networks
  Example, threshold as a weight: augment the input with a constant −1, so x = (−1, x1, x2)' and w = (θ, w1, w2)'; then w'x ≥ 0 ⇔ w1 x1 + w2 x2 ≥ θ, i.e. the neuron's threshold becomes 0.
  Suppose the initial vector of weights is w(0) = (1, −1, 1)'.
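Slides 27 and 28 combine into a short training loop. A sketch with my own helper names (`dot`, `train_perceptron`); every example is augmented with a leading −1, so the threshold is learned as the first weight, exactly as in the example above:

```python
import random

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def train_perceptron(P, N, w0, max_steps=10_000):
    """Perceptron learning (slide 27) with threshold as weight (slide 28).

    P / N: positive / negative examples, each augmented as (-1, x1, x2)
    so that w = (theta, w1, w2)' and the neuron fires iff w'x > 0.
    """
    w = list(w0)
    for _ in range(max_steps):
        wrong_p = [x for x in P if dot(w, x) <= 0]  # should be > 0
        wrong_n = [x for x in N if dot(w, x) > 0]   # should be <= 0
        if not wrong_p and not wrong_n:
            return w                                # I/O correct for all
        x = random.choice(wrong_p + wrong_n)
        if x in wrong_p:
            w = [wi + xi for wi, xi in zip(w, x)]   # (w+x)'x > w'x
        else:
            w = [wi - xi for wi, xi in zip(w, x)]   # (w-x)'x < w'x
    raise RuntimeError("step budget exhausted")

# AND gate, starting from w(0) = (1, -1, 1)' as on slide 28
P = [(-1, 1, 1)]
N = [(-1, 0, 0), (-1, 0, 1), (-1, 1, 0)]
print(train_perceptron(P, N, w0=(1, -1, 1)))
```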

  29. Introduction to Artificial Neural Networks
  We know what a single MPP can do. What can be achieved with many MPPs?
  • a single MPP ⇒ separates the plane in two half planes
  • many MPPs in 2 layers ⇒ can identify convex sets
  How? ⇒ 2 layers!
  Convex? ⇔ ∀ a, b ∈ X: λa + (1−λ)b ∈ X for λ ∈ (0,1)

  30. Introduction to Artificial Neural Networks
  • a single MPP ⇒ separates the plane in two half planes
  • many MPPs in 2 layers ⇒ can identify convex sets
  • many MPPs in 3 layers ⇒ can identify arbitrary sets
  • many MPPs in > 3 layers ⇒ not really necessary!
  Arbitrary sets:
  1. partition the nonconvex set into several convex sets
  2. build a two-layered subnet for each convex set
  3. feed the outputs of the two-layered subnets into an OR gate (third layer)
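A sketch of this layered construction (all helper names are mine): each first-layer MPP is a half plane, each second-layer AND neuron a convex cell, and the third-layer OR neuron their union.

```python
def halfplane(w1, w2, theta):
    """Layer 1 MPP: fires iff w1*x + w2*y >= theta (a half plane)."""
    return lambda x, y: int(w1 * x + w2 * y >= theta)

def convex_cell(planes):
    """Layer 2: AND over half planes (threshold = number of planes)."""
    return lambda x, y: int(sum(h(x, y) for h in planes) >= len(planes))

def union(cells):
    """Layer 3: OR over convex cells -> arbitrary (nonconvex) set."""
    return lambda x, y: int(sum(c(x, y) for c in cells) >= 1)

# L-shaped nonconvex region = union of two axis-aligned rectangles
left   = convex_cell([halfplane(1, 0, 0), halfplane(-1, 0, -1),
                      halfplane(0, 1, 0), halfplane(0, -1, -2)])  # [0,1]x[0,2]
bottom = convex_cell([halfplane(1, 0, 0), halfplane(-1, 0, -2),
                      halfplane(0, 1, 0), halfplane(0, -1, -1)])  # [0,2]x[0,1]
L_shape = union([left, bottom])
print(L_shape(0.5, 1.5), L_shape(1.5, 0.5), L_shape(1.5, 1.5))  # 1 1 0
```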
