Prelude


# Prelude - PowerPoint PPT Presentation


## PowerPoint Slideshow about 'Prelude' - amina

Presentation Transcript
Prelude
• A pattern of activation in a NN is a vector
• A set of connection weights between units is a matrix
• Vectors and matrices have well-understood mathematical and geometric properties
• Very useful for understanding the properties of NNs

### Operations on Vectors and Matrices

Outline
• The Players: Scalars, Vectors and Matrices
• Vectors, matrices and neural nets
• Geometric Analysis of Vectors
• Multiplying Vectors by Scalars
• Multiplying Vectors by Vectors
  • The inner product (produces a scalar)
  • The outer product (produces a matrix)
• Multiplying Vectors by Matrices
• Multiplying Matrices by Matrices
Scalars, Vectors and Matrices
• Scalar: A single number (integer or real)
• Vector: An ordered list of scalars

Row vectors:

[ 1 2 3 4 5 ]   [ 0.4 1.2 0.07 8.4 12.3 ]   [ 12 10 ]   [ 2 ]

Order matters: [ 12 10 ] ≠ [ 10 12 ]

Column vectors:

[ 1 ]    [  1.5 ]
[ 2 ]    [  0.3 ]
[ 3 ]    [  6.2 ]
[ 4 ]    [ 12.0 ]
[ 5 ]    [ 17.1 ]

• Matrix: An ordered list of vectors, readable either as a stack of row vectors or as a set of side-by-side column vectors:

    [ 1 2 6 1 7 8 ]
M = [ 2 5 9 0 0 3 ]
    [ 3 1 5 7 6 3 ]
    [ 2 7 9 3 3 1 ]

Matrices are indexed (row, column):

M(1,3) = 6 (row 1, column 3)
M(3,1) = 3 (row 3, column 1)
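The slides (like Matlab) index matrices as (row, column) starting from 1. As a minimal sketch of the same bookkeeping, here is the example matrix in Python with NumPy (not part of the original slides; note that NumPy counts from 0, so the slides' M(1,3) becomes M[0, 2]):

```python
import numpy as np

# The example matrix from the slides: 4 rows, 6 columns.
M = np.array([[1, 2, 6, 1, 7, 8],
              [2, 5, 9, 0, 0, 3],
              [3, 1, 5, 7, 6, 3],
              [2, 7, 9, 3, 3, 1]])

# Matrices are indexed (row, column); NumPy is 0-based.
m13 = M[0, 2]   # the slides' M(1,3): row 1, column 3 -> 6
m31 = M[2, 0]   # the slides' M(3,1): row 3, column 1 -> 3
```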

Variable Naming Conventions
• Scalars: Lowercase, italics (x, y, z, …)
• Vectors: Lowercase, bold (u, v, w, …)
• Matrices: Uppercase, bold (M, N, O, …)
• Constants: Greek (α, β, γ, δ, …)

Transposing Vectors

If u is a row vector…

u = [ 1 2 3 4 5 ]

…then u′ (“u-transpose”) is the column vector

     [ 1 ]
     [ 2 ]
u′ = [ 3 ]
     [ 4 ]
     [ 5 ]

… and vice-versa.

Why would you care? It’ll matter when we come to vector multiplication.
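The row/column distinction can be made concrete in code. A small sketch in Python with NumPy (a stand-in for the Matlab the slides assume), keeping u as a 1 × 5 matrix so the transpose is visible:

```python
import numpy as np

# A row vector, stored as a 1 x 5 matrix.
u = np.array([[1, 2, 3, 4, 5]])

# u' ("u-transpose") is the corresponding 5 x 1 column vector...
u_t = u.T

# ...and vice-versa: transposing twice gives back the original.
u_back = u_t.T
```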

Vectors, Matrices & Neural Nets

Consider a small layered net: input units j1, j2, j3 feed output units i1, i2 through connection weights wij (the weight from input unit j to output unit i).

The activations of the input nodes can be represented as a 3-dimensional vector:

j = [ 0.2 0.9 0.5 ]

The activations of the output nodes can be represented as a 2-dimensional vector:

i = [ 1.0 0.0 ]

The weights leading into any one output node can be represented as a 3-dimensional vector; for output unit 1:

w1j = [ 0.1 1.0 0.2 ]

The complete set of weights can be represented as a 2 (row) X 3 (column) matrix:

W = [ 0.1 1.0  0.2 ]
    [ 1.0 0.1 -0.9 ]

Why in the world would I care?

Because the mathematics of vectors and matrices is well understood.

Because vectors have a very useful geometric interpretation.

Because Matlab “thinks” in vectors and matrices.

Because you are going to have to learn to think in Matlab.

OK.
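As a sketch of that representation (in Python with NumPy rather than the slides' Matlab), the three input activations and the 2 × 3 weight matrix from the example net can be written down directly, and the net input to each output unit falls out of a single matrix-by-vector product, anticipating the multiplication rules later in the section:

```python
import numpy as np

# Input activations, one per input unit j1..j3.
j = np.array([0.2, 0.9, 0.5])

# Complete weight matrix: one row per output unit,
# one column per input unit (2 rows x 3 columns).
W = np.array([[0.1, 1.0,  0.2],
              [1.0, 0.1, -0.9]])

# Net input to each output unit: W times the input vector.
net = W @ j   # net[0] = 0.1*0.2 + 1.0*0.9 + 0.2*0.5 = 1.02
```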

Geometric Analysis of Vectors

Dimensionality: The number of numbers in a vector. An n-dimensional vector picks out a point (or an arrow from the origin) in an n-dimensional space.
• Implications for neural networks
  • Auto-associative nets
    • State of activation at time t is a vector (a point in a space)
    • As activations change, the vector moves through that space
    • Will prove invaluable in understanding Hopfield nets
  • Layered nets (“perceptrons”)
    • Input vectors activate output vectors
    • Points in input space map to points in output space
    • Will prove invaluable in understanding perceptrons and back-propagation learning
Multiplying a Vector by a Scalar

[ 5 4 ] * 2 = [ 10 8 ]

Lengthens the vector but does not change its orientation.
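A quick check of that claim in Python with NumPy (standing in for Matlab): every element is multiplied by the scalar, so the length doubles while the direction is unchanged:

```python
import numpy as np

u = np.array([5.0, 4.0])
v = u * 2                      # [10. 8.]: each element is doubled

# The length doubles; the orientation does not change, since v is
# a positive scalar multiple of u.
len_u = np.linalg.norm(u)
len_v = np.linalg.norm(v)
```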

Adding a Vector to a Scalar

[ 5 4 ] + 2 is illegal: addition is only defined between vectors of the same dimensionality. (Matlab will quietly add the scalar to every element, but as a vector operation it is undefined.)

Adding a Vector to a Vector

[ 5 4 ] + [ 3 6 ] = [ 8 10 ]

Add corresponding elements. Geometrically, the two vectors form a parallelogram, and their sum is its diagonal.
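The element-by-element rule is exactly what array addition does in NumPy (sketched here in Python in place of the slides' Matlab):

```python
import numpy as np

u = np.array([5, 4])
v = np.array([3, 6])

# Corresponding elements add: [5+3, 4+6] = [8, 10].
s = u + v
```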

Multiplying a Vector by a Vector 1: The Inner Product

If u and v are both row vectors of the same dimensionality…

u = [ 1 2 3 ]

v = [ 4 5 6 ]

… then the product u · v is undefined: you cannot multiply a row vector by a row vector.

(I told you you’d eventually care about transposing vectors…)

• The Mantra: “Rows by Columns”
• Multiply rows (or row vectors) by columns (or column vectors)

u = [ 1 2 3 ]

     [ 4 ]
v′ = [ 5 ]
     [ 6 ]

Imagine rotating the row vector u alongside the column vector v′, multiply corresponding elements, and add up the products:

u · v′ = (1 × 4) + (2 × 5) + (3 × 6) = 4 + 10 + 18 = 32

• The inner product is commutative, as long as you transpose correctly:

u = [ 1 2 3 ]

v = [ 4 5 6 ]

u · v′ = (1 × 4) + (2 × 5) + (3 × 6) = 32

v · u′ = (4 × 1) + (5 × 2) + (6 × 3) = 32
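The same computation in Python with NumPy (a stand-in for Matlab; note that for 1-D arrays the `@` operator handles the row-versus-column bookkeeping for you):

```python
import numpy as np

u = np.array([1, 2, 3])
v = np.array([4, 5, 6])

# Rows by columns: multiply corresponding elements, add the products.
uv = u @ v   # (1*4) + (2*5) + (3*6) = 32
vu = v @ u   # same value: the inner product is commutative
```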

The Inner (“Dot”) Product

• In scalar notation: u · v′ = Σk uk vk
• Remind you of anything? It is exactly the net input to a unit: ni = Σj wij aj
• In vector notation: ni = wi · a′
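The equivalence of the scalar and vector notations can be checked directly. A sketch in Python with NumPy, using illustrative (made-up) weight and activation values:

```python
import numpy as np

# Weights into one unit, and the incoming activations (example values).
w_i = np.array([0.1, 1.0, 0.2])
a   = np.array([0.2, 0.9, 0.5])

# Scalar notation: n_i = sum over j of w_ij * a_j ...
n_scalar = sum(w * act for w, act in zip(w_i, a))

# ...which is exactly the dot product of the two vectors.
n_dot = w_i @ a
```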
What Does the Dot Product “Mean”?

• Consider u · u′, where u = [ 3 4 ]:

u · u′ = (3 × 3) + (4 × 4) = 9 + 16 = 25

The vector, its horizontal component (3), and its vertical component (4) form a right triangle, so sqrt(u · u′) = sqrt(25) = 5 is the length of u.

True for vectors of any dimensionality: length(u) = sqrt(u · u′)
What Does the Dot Product “Mean”?

• What about u · v′ where u ≠ v? Well…

u · v′ = length(u) × length(v) × cos(θuv)

… where θuv is the angle between u and v, and cos(θuv) is a length-invariant measure of the similarity of u and v.

Example: u = [ 1 0 ], v = [ 1 1 ]

u · v′ = (1 × 1) + (0 × 1) = 1
length(u) = sqrt(1) = 1
length(v) = sqrt(2) = 1.414
θuv = 45º; cos(θuv) = 1 / (1 × 1.414) = .707

Holding u = [ 1 0 ] and varying v:

v = [ 0 1 ]:    θuv = 90º;  cos(θuv) = 0
v = [ 0 -1 ]:   θuv = 270º; cos(θuv) = 0
v = [ -1 0 ]:   θuv = 180º; cos(θuv) = -1
v = [ 2.2 0 ]:  θuv = 0º;   cos(θuv) = 1

In general, cos(θuv) ∈ [-1, 1], regardless of dimensionality.

What Does the Dot Product “Mean”?

To see why, consider the cosine expressed in scalar notation…

cos(θuv) = Σk uk vk / ( sqrt(Σk uk²) × sqrt(Σk vk²) )

… and compare it to the equation for the correlation coefficient…

r(u,v) = Σk (uk − ū)(vk − v̄) / ( sqrt(Σk (uk − ū)²) × sqrt(Σk (vk − v̄)²) )

If u and v have means of zero, then cos(θuv) = r(u,v): the cosine is a special case of the correlation coefficient!

Now compare the cosine to the dot product. Since u · v′ = length(u) × length(v) × cos(θuv), if u and v have lengths of 1, then the dot product is equal to the cosine.

The dot product is a special case of the cosine, which is a special case of the correlation coefficient, which is a measure of vector similarity!
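The length-invariance claim is easy to verify numerically. A sketch in Python with NumPy using the slides' example vectors: rescaling either vector leaves the cosine untouched:

```python
import numpy as np

u = np.array([1.0, 0.0])
v = np.array([1.0, 1.0])

# cos(angle) = (u . v) / (||u|| ||v||): a length-invariant similarity.
cos_uv = (u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))

# Scaling a vector changes the dot product but not the cosine.
cos_scaled = ((3 * u) @ v) / (np.linalg.norm(3 * u) * np.linalg.norm(v))
```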

What Does the Dot Product “Mean”?
• The most common input rule is a dot product between unit i’s vector of weights and the activation vector on the other end
• Such a unit is computing the (biased) similarity between what it expects (wi) and what it’s getting (a)
• Its activation ai is a positive function of this similarity: for example an asymptotic function, a step function (BTU), or a logistic function of the net input ni
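As a minimal sketch of such a unit in Python with NumPy (the weight and activation values are illustrative, not from the slides), using the logistic as the positive function of net input:

```python
import numpy as np

def logistic(n):
    """Monotone increasing squashing function into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-n))

# What the unit "expects" (its weights) and what it's "getting".
w_i = np.array([0.1, 1.0, 0.2])
a   = np.array([0.2, 0.9, 0.5])

# Input rule: net input is the dot product; activation is a
# positive function of that (biased) similarity.
n_i = w_i @ a
a_i = logistic(n_i)
```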

Multiplying a Vector by a Vector 2: The Outer Product

The two vectors need not have the same dimensionality.

Same Mantra: Rows by Columns. This time, multiply a column vector by a row vector:

M = u′ * v

     [ 1 ]
u′ = [ 2 ]        v = [ 4 5 6 ]

Row i (of u′) times column k (of v) goes into row i, column k of M:

M = [ 4  5  6 ]
    [ 8 10 12 ]

The outer product is not exactly commutative…

u = [ 1 2 ]        v = [ 4 5 6 ]

u′ * v = [ 4  5  6 ]        v′ * u = [ 4  8 ]
         [ 8 10 12 ]                 [ 5 10 ]
                                     [ 6 12 ]

Reversing the order transposes the result.
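Both orderings of the slides' example, sketched in Python with NumPy (`np.outer` treats its first argument as the column and its second as the row):

```python
import numpy as np

u = np.array([1, 2])
v = np.array([4, 5, 6])

# Outer product: column u' times row v gives a 2 x 3 matrix
# whose (i, k) entry is u[i] * v[k].
M = np.outer(u, v)

# Reversing the order gives the 3 x 2 transpose, not the same matrix.
M2 = np.outer(v, u)
```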

Multiplying a Vector by a Matrix
• Same Mantra: Rows by Columns

A row vector:

v = [ .2 .6 .3 .7 .9 .4 .3 ]

A matrix:

    [ .3 .4 .8 .1 .2 .3 ]
    [ .5 .2  0 .1 .5 .2 ]
    [ .1 .1 .9 .2 .5 .3 ]
M = [ .2 .4 .1 .7 .8 .5 ]
    [ .9 .9 .2 .5 .3 .5 ]
    [ .4 .1 .2 .7 .8 .2 ]
    [ .1 .2 .2 .5 .7 .2 ]

Make a proxy column vector from v, then compute the dot product of the (proxy) row vector with each column of the matrix in turn. Each such multiplication is a simple dot product, and each result fills in the next element of the product (values rounded to one decimal):

v * M = [ 1.5 1.4 0.8 1.5 1.9 1.2 ]

The result is a row vector with as many columns (dimensions) as the matrix, not the vector: a 7-dimensional vector times a 7 (rows) X 6 (columns) matrix yields a 6-dimensional vector.

NOT commutative! M * v is undefined here: each 6-element row of M cannot be multiplied by the 7-element vector.
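The same vector-by-matrix product in Python with NumPy (standing in for Matlab); each output element is the dot product of v with one column of M, and the slides report the results rounded to one decimal:

```python
import numpy as np

v = np.array([.2, .6, .3, .7, .9, .4, .3])        # 7-dimensional
M = np.array([[.3, .4, .8, .1, .2, .3],
              [.5, .2, 0., .1, .5, .2],
              [.1, .1, .9, .2, .5, .3],
              [.2, .4, .1, .7, .8, .5],
              [.9, .9, .2, .5, .3, .5],
              [.4, .1, .2, .7, .8, .2],
              [.1, .2, .2, .5, .7, .2]])          # 7 x 6

# Rows by columns: the result has as many dimensions as M has columns.
result = v @ M                                    # 6-dimensional
```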

Multiplying a Matrix by a Matrix
• The Same Mantra: Rows by Columns

A 2 X 3 matrix:            A 3 X 2 matrix:

A = [ 1 2 3 ]              B = [ 1 2 ]
    [ 1 2 3 ]                  [ 1 2 ]
                               [ 1 2 ]

Each row of the first matrix times each column of the second is a simple dot product, and the result goes into the corresponding (row, column) cell of a new matrix:

Row 1 X Column 1 = (1 × 1) + (2 × 1) + (3 × 1) = 6  → row 1, column 1
Row 1 X Column 2 = (1 × 2) + (2 × 2) + (3 × 2) = 12 → row 1, column 2
Row 2 X Column 1 = 6                                → row 2, column 1
Row 2 X Column 2 = 12                               → row 2, column 2

So…

A * B = [ 6 12 ]        (a 2 X 2 matrix)
        [ 6 12 ]

The result has the same number of rows as the first matrix and the same number of columns as the second. And the number of columns in the first matrix must be equal to the number of rows in the second.
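The worked 2 × 3 by 3 × 2 example, sketched in Python with NumPy (in place of the slides' Matlab):

```python
import numpy as np

A = np.array([[1, 2, 3],
              [1, 2, 3]])     # 2 x 3
B = np.array([[1, 2],
              [1, 2],
              [1, 2]])        # 3 x 2

# Rows by columns: entry (i, k) is the dot product of row i of A
# with column k of B.  A's column count (3) matches B's row count,
# and the result has A's rows and B's columns: 2 x 2.
C = A @ B
```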

This is basic (default) matrix multiplication. There is other, more complicated stuff too, but you (probably) won’t need it for this class.