Lectures 3&4: Linear Machine Learning Algorithms
Dr Martin Brown
Room: E1k
Email: [email protected]
Telephone: 0161 306 4672
http://www.csc.umist.ac.uk/msc/intranet/EEM016
Lectures 3&4: Outline
Linear classification using the Perceptron
Classification problem
[Figure: classification problem. Training data D = {X, y} and prior knowledge feed a design/learn step that produces the classifier m(θ, x). Given a new pattern x, the classifier predicts the class label ŷ.]
How to encode qualitative target and input features?
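One common answer, sketched here under assumptions not stated on the slide: encode a two-class target as +1 / -1, and encode a qualitative input feature with one 0/1 indicator column per category. The helper names and the example categories are hypothetical.

```python
def encode_target(labels, positive):
    """Map a two-class qualitative target to +1 / -1."""
    return [1 if label == positive else -1 for label in labels]


def one_hot(values):
    """Encode a qualitative feature as one 0/1 indicator column per category."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows


# Hypothetical example data, not taken from the lecture.
y = encode_target(["spam", "ham", "spam"], positive="spam")
categories, X = one_hot(["red", "green", "red"])
print(y)           # [1, -1, 1]
print(categories)  # ['green', 'red']
print(X)           # [[0, 1], [1, 0], [0, 1]]
```

Indicator columns avoid imposing an artificial ordering on the categories, which a single integer code would do.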
[Figure: scatter plot of data points (+) in the (x1, x2) plane, labelled with x^T θ.]
Calculate θ in lab 3&4 …
Instantaneous Parameter Update
Error-driven update:
[Figure: inputs x1, x2 and parameters θ1, θ2; output y and prediction ŷ; "error-driven" parameter estimation.]
Repeatedly cycle through the data set D, drawing out each sample {x_k, y_k}.
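The update equation itself did not survive on this slide. As a minimal sketch, assuming the standard perceptron rule (labels in {-1, +1}, the bias folded in as x0 = 1, and a learning rate `eta` that is my own parameter), cycling through D and updating only on errors might look like this. The toy data are hypothetical.

```python
def predict(theta, x):
    """Return the class label sign(x^T theta), with ties mapped to -1."""
    s = sum(t * xi for t, xi in zip(theta, x))
    return 1 if s > 0 else -1


def train_perceptron(X, y, eta=1.0, max_epochs=100):
    """Cycle through the data, updating theta only on misclassified samples.

    Assumes the standard perceptron rule theta <- theta + eta * y_k * x_k,
    with each x_k already carrying the bias input x0 = 1.
    Returns (theta, converged).
    """
    theta = [0.0] * len(X[0])
    for _ in range(max_epochs):
        errors = 0
        for x_k, y_k in zip(X, y):
            if predict(theta, x_k) != y_k:  # error-driven: update on mistakes only
                theta = [t + eta * y_k * xi for t, xi in zip(theta, x_k)]
                errors += 1
        if errors == 0:  # a full pass with no mistakes
            return theta, True
    return theta, False


# Hypothetical, linearly separable toy data: columns are [x0 = 1, x1, x2].
X = [[1, 2.0, 2.0], [1, 1.5, 3.0], [1, -1.0, -1.5], [1, -2.0, -0.5]]
y = [1, 1, -1, -1]
theta, converged = train_perceptron(X, y)
print(theta, converged)  # [1.0, 2.0, 2.0] True
```

If the data are not linearly separable, the loop never completes an error-free pass and the function returns `converged = False` once `max_epochs` is reached.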
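[Figure: block diagram with input x_k, prediction ŷ_k and target y_k.]
[Figure: parameter space with axes θ1 and θ2, showing θ̂_k, θ̂_{k+1} and θ.]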
To finish proof, select
Is the data linearly separable?
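[Figure: three panels in the (x1, x2) plane at successive data presentations:
k=0, θ̂ = [0.01, 0.1, 0.006]
k=5, θ̂ = [0.98, 1.11, 1.01]
k=18, θ̂ = [2.98, 2.11, 1.01]]
[Figure: parameter trajectories θ̂_{1,k}, θ̂_{2,k} and bias θ̂_{0,k} (in general θ̂_{i,k}) plotted against k, the data presentation index.]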
Example: Parameter Trajectory (ii)
Lab exercise:
Calculate by hand the first 4 iterations of the learning scheme
[Figure: plots over the (x1, x2) plane.]
Regression Problem Visualisation
[Figure: scatter plot of data points (+) with output y and prediction ŷ against input x; at the marked x, μ(y|x) = 12 and σ(e) = 1.5.]
Probabilistic Prediction Output
μ(y|x) = 12
2σ(e) = 3
95% of the data lies in the range μ ± 2σ
= [12 ± 2*1.5]
= [9, 15]
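The arithmetic above can be checked with a short sketch. The function name is hypothetical, and the coverage check assumes Gaussian residuals, which the slide does not state explicitly; under that assumption the two-sigma band covers roughly 95% of the data.

```python
from statistics import NormalDist


def two_sigma_interval(mean, sigma):
    """Return the (lower, upper) bounds mean - 2*sigma and mean + 2*sigma."""
    return mean - 2 * sigma, mean + 2 * sigma


low, high = two_sigma_interval(12, 1.5)
print(low, high)  # 9.0 15.0

# Assumption: residuals are Gaussian with the stated mean and spread.
dist = NormalDist(mu=12, sigma=1.5)
coverage = dist.cdf(high) - dist.cdf(low)
print(round(coverage, 3))  # 0.954
```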
Given a set of features x, a linear predictor has the form:
The output is a real-valued, quantitative variable.
The bias term can be included as an extra feature x0 = 1. This renames the bias parameter as θ0.
Most linear control system models do not explicitly include a bias term. Why is this?
Similar to the Toluca example in week 1.
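The formula for the linear predictor is missing from this extract. A minimal sketch, assuming the usual linear form ŷ = θ0 + θ1 x1 + … + θn xn, shows how prepending the constant feature x0 = 1 turns the bias into one more parameter θ0, so the prediction is a single inner product x^T θ. The function names and the numbers are illustrative only.

```python
def add_bias_feature(x):
    """Prepend the constant feature x0 = 1 to a feature vector."""
    return [1.0] + list(x)


def linear_predict(theta, x):
    """Real-valued prediction y_hat = x^T theta, with theta[0] as the bias."""
    x_aug = add_bias_feature(x)
    if len(theta) != len(x_aug):
        raise ValueError("theta needs one entry per feature plus the bias")
    return sum(t * xi for t, xi in zip(theta, x_aug))


# Hypothetical parameters: bias 0.5, weights 2.0 and -1.0.
theta = [0.5, 2.0, -1.0]
print(linear_predict(theta, [3.0, 4.0]))  # 0.5 + 6.0 - 4.0 = 2.5
```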
[Figure: output y and prediction ŷ plotted against x.]
Structure of a Linear Regression Model
[Figure: parameter space with axes θ1 and θ2, showing θ̂_k, θ̂_{k+1} and θ.]
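Proof of Convergence (ii)
when
[Figure: y and ŷ plotted against x at k=0, k=5 and k=100.]
[Figure: parameter space with axes θ0 and θ1; trajectories of θ̂0 and θ̂1 plotted against k.]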