
Dimensionality Reduction

- High-dimensional == many features
- Goal: find concepts/topics/genres
- Documents — features: thousands of words, millions of word pairs
- Surveys (Netflix): 480k users × 17,770 movies

Slides by Jure Leskovec

Dimensionality Reduction

- Compress / reduce dimensionality:
- 10^6 rows; 10^3 columns; no updates
- Random access to any cell(s); small error: OK


Dimensionality Reduction

- Assumption: Data lies on or near a low d-dimensional subspace
- The axes of this subspace are an effective representation of the data


Why Reduce Dimensions?

- Discover hidden correlations/topics
- Words that occur commonly together

- Remove redundant and noisy features
- Not all words are useful

- Interpretation and visualization
- Easier storage and processing of the data


SVD - Definition

A[m×n] = U[m×r] Σ[r×r] (V[n×r])^T

- A: Input data matrix
- m × n matrix (e.g., m documents, n terms)

- U: Left singular vectors
- m × r matrix (m documents, r concepts)

- Σ: Singular values
- r × r diagonal matrix (strength of each 'concept') (r: rank of the matrix A)

- V: Right singular vectors
- n × r matrix (n terms, r concepts)
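The shapes above are easy to check numerically; a minimal numpy sketch (the toy rating matrix is assumed for illustration, not taken from the slides):

```python
import numpy as np

# Toy users-to-movies rating matrix (7 users x 5 movies); values assumed
A = np.array([
    [1, 1, 1, 0, 0],
    [3, 3, 3, 0, 0],
    [4, 4, 4, 0, 0],
    [5, 5, 5, 0, 0],
    [0, 0, 0, 4, 4],
    [0, 0, 0, 5, 5],
    [0, 0, 0, 2, 2],
], dtype=float)

# Reduced ("economy") SVD: A = U @ diag(s) @ Vt
# numpy returns min(m, n) singular values; when rank(A) < min(m, n),
# the trailing ones are (numerically) zero.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

print(U.shape, s.shape, Vt.shape)            # (7, 5) (5,) (5, 5)
print(np.allclose(A, U @ np.diag(s) @ Vt))   # exact reconstruction
```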


SVD - Properties

It is always possible to decompose a real matrix A into A = U Σ V^T, where:

- U, Σ, V: unique
- U, V: column orthonormal
- U^T U = I; V^T V = I (I: identity matrix)
- (Columns are orthogonal unit vectors)

- Σ: diagonal
- Entries (singular values) are non-negative, sorted in decreasing order (σ1 ≥ σ2 ≥ … ≥ 0)
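These properties can be verified numerically; a minimal numpy sketch (a random matrix is assumed for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 4))   # any real matrix works

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Column orthonormality: U^T U = I and V^T V = I
print(np.allclose(U.T @ U, np.eye(4)))    # True
print(np.allclose(Vt @ Vt.T, np.eye(4)))  # True

# Singular values are non-negative and sorted in decreasing order
print(np.all(s >= 0), np.all(s[:-1] >= s[1:]))  # True True
```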


SVD – Example: Users-to-Movies

- A = U Σ V^T - example:

[Figure: users-to-movies rating matrix A (movies: Matrix, Alien, Serenity, Casablanca, Amelie; user groups: SciFi, Romance) factored as U × Σ × V^T]

SVD – Example: Users-to-Movies

- A = U Σ V^T - example:

[Figure: the same factorization, with the first concept labeled "SciFi-concept" and the second labeled "Romance-concept"]

SVD - Example

- A = U Σ V^T - example:
- U is the "user-to-concept" similarity matrix

[Figure: same factorization, highlighting the columns of U as the SciFi and Romance concepts]

SVD - Example

- A = U Σ V^T - example:
- The diagonal entries of Σ give the 'strength' of each concept (e.g., of the SciFi-concept)

[Figure: same factorization, highlighting the diagonal of Σ]

SVD - Example

- A = U Σ V^T - example:
- V is the "movie-to-concept" similarity matrix

[Figure: same factorization, highlighting the rows of V^T]


SVD - Interpretation #1

'movies', 'users' and 'concepts':

- U: user-to-concept similarity matrix
- V: movie-to-concept similarity matrix
- Σ: its diagonal elements give the 'strength' of each concept

SVD - Interpretation #2

SVD gives the best axis to project on:

- 'best' = minimum sum of squares of projection errors
- i.e., minimum reconstruction error

[Figure: points in the (Movie 1 rating, Movie 2 rating) plane with the first right singular vector v1 as the best-fit axis]

SVD - Interpretation #2

- A = U Σ V^T - example:

[Figure: the same scatter plot; U Σ gives the coordinates of the points (rows of A) along the v1 axis]

SVD - Interpretation #2

- A = U Σ V^T - example:
- The singular values measure the variance ('spread') on the v1 axis

[Figure: same factorization, with σ1 annotated as the spread along v1]

SVD - Interpretation #2: More details

- Q: How exactly is dimensionality reduction done?

SVD - Interpretation #2: More details

- Q: How exactly is dimensionality reduction done?
- A: Set the smallest singular values to zero

[Figure: A = U × Σ × V^T with the full set of singular values]

SVD - Interpretation #2: More details

- Q: How exactly is dimensionality reduction done?
- A: Set the smallest singular values to zero:

[Figure: a sequence of slides showing A ≈ U × Σ × V^T after the smallest singular value is zeroed, so the corresponding column of U and row of V^T can be dropped]

SVD - Interpretation #2: More details

- Q: How exactly is dimensionality reduction done?
- A: Set the smallest singular values to zero, giving an approximation B of A

Frobenius norm: ‖M‖F = √(Σij Mij²)

The approximation error ‖A − B‖F = √(Σij (Aij − Bij)²) is "small".
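The effect of zeroing singular values on the Frobenius error can be checked directly; a minimal numpy sketch (random data assumed): the error of the rank-k truncation equals the square root of the dropped energy.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((7, 5))   # arbitrary data matrix; values assumed

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Zero out all but the k largest singular values -> rank-k approximation B
k = 2
B = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Frobenius error equals sqrt(sum of the squared dropped singular values)
err = np.linalg.norm(A - B, 'fro')
print(err, np.sqrt(np.sum(s[k:] ** 2)))
```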

SVD – Best Low Rank Approx.

- Theorem: Let A = U Σ V^T (σ1 ≥ σ2 ≥ …, rank(A) = r) and let B = U S V^T, where
- S = diagonal matrix of the same shape as Σ, with si = σi (i = 1…k), else si = 0.
- Then B is a best rank-k approximation to A:
- B is a solution to min_B ‖A − B‖F over rank(B) = k

- We will need 2 facts:
- ‖M‖F² = Σi qii² (the sum of the squared singular values), where M = P Q R is the SVD of M
- U Σ V^T − U S V^T = U (Σ − S) V^T

SVD – Best Low Rank Approx.

- We will need 2 facts:
- ‖M‖F² = Σi qii² (the sum of the squared singular values), where M = P Q R is the SVD of M
- U Σ V^T − U S V^T = U (Σ − S) V^T

We apply:

- P is column orthonormal
- R is row orthonormal
- Q is diagonal

SVD – Best Low Rank Approx.

- A = U Σ V^T, B = U S V^T (σ1 ≥ σ2 ≥ … ≥ 0, rank(A) = r)
- S = diagonal matrix with si = σi (i = 1…k), else si = 0
- Then B is a solution to min_B ‖A − B‖F over rank(B) = k
- Why?
- Since U Σ V^T − U S V^T = U (Σ − S) V^T, we get ‖A − B‖F = √(Σi (σi − si)²); under the rank-k constraint this is minimized by setting si = σi (i = 1…k), else si = 0

SVD - Interpretation #2

Equivalent: 'spectral decomposition' of the matrix:

A = U Σ V^T = σ1 u1 v1^T + σ2 u2 v2^T + …

[Figure: A = (u1 u2) × diag(σ1, σ2) × (v1 v2)^T, with σ1, σ2 the singular values and u1, u2, v1, v2 the singular vectors]

SVD - Interpretation #2

Equivalent: 'spectral decomposition' of the matrix as a sum of rank-1 terms (for an n × m matrix A, each term is an n × 1 column ui times a 1 × m row vi^T):

A = σ1 u1 v1^T + σ2 u2 v2^T + … (keep the first k terms)

Assume: σ1 ≥ σ2 ≥ σ3 ≥ … ≥ 0

Why is setting the small σs to zero the thing to do? The vectors ui and vi are unit length, so σi scales them. Zeroing the small σs therefore introduces the least error.

SVD - Interpretation #2

Q: How many σs to keep?

A: Rule of thumb: keep 80-90% of the 'energy' (= Σi σi²)

A = σ1 u1 v1^T + σ2 u2 v2^T + …  (assume σ1 ≥ σ2 ≥ σ3 ≥ …)
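The rule of thumb above can be sketched in numpy (data assumed for illustration): pick the smallest k whose leading singular values retain 90% of the energy.

```python
import numpy as np

rng = np.random.default_rng(2)
# Low-rank-ish synthetic data: product of two random factors
A = rng.standard_normal((100, 30)) @ rng.standard_normal((30, 50))

s = np.linalg.svd(A, compute_uv=False)   # singular values only

energy = s ** 2
ratio = np.cumsum(energy) / energy.sum()
# Smallest k such that the first k values keep >= 90% of the energy
k = int(np.searchsorted(ratio, 0.90) + 1)
print(k, ratio[k - 1])
```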

SVD - Complexity

- To compute SVD:
- O(nm2) or O(n2m) (whichever is less)

- But:
- Less work, if we just want singular values
- or if we want first k singular vectors
- or if the matrix is sparse

- Implemented in linear algebra packages like
- LINPACK, Matlab, SPlus, Mathematica ...
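When only the first k singular vectors of a sparse matrix are needed, an iterative routine is much cheaper than the full decomposition; a minimal sketch assuming SciPy is available (matrix sizes and density are illustrative):

```python
import numpy as np
from scipy.sparse import random as sparse_random
from scipy.sparse.linalg import svds

# Sparse 1000 x 500 matrix with ~1% nonzeros (illustrative, random data)
A = sparse_random(1000, 500, density=0.01, format='csr', random_state=0)

# Top-10 singular triplets only; avoids forming the full SVD
U, s, Vt = svds(A, k=10)

# Note: svds returns the singular values in ascending order
print(U.shape, Vt.shape, np.sort(s)[::-1][:3])
```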


SVD - Conclusions so far

- SVD: A = U Σ V^T: unique
- U: user-to-concept similarities
- V: movie-to-concept similarities
- Σ: strength of each concept

- Dimensionality reduction:
- keep the few largest singular values (80-90% of the 'energy')
- SVD picks up linear correlations

Case study: How to query?

Q: Find users that like 'Matrix' and 'Alien'

[Figure: the users-to-movies matrix A = U Σ V^T from before (movies: Matrix, Alien, Serenity, Casablanca, Amelie; user groups: SciFi, Romance)]

Case study: How to query?

Q: Find users that like 'Matrix'

A: Map the query into a 'concept space' – how?

[Figure: the same factorization]

Case study: How to query?

Q: Find users that like 'Matrix'

A: Map query vectors into 'concept space' – how?

Project into concept space: take the inner product with each 'concept' vector vi.

[Figure: query vector q (a rating for 'Matrix' only) plotted against the Matrix and Alien axes, with concept axes v1 and v2]

Case study: How to query?

Q: Find users that like 'Matrix'

A: Map the vector into 'concept space' – how?

Project into concept space: take the inner product with each 'concept' vector vi (e.g., q·v1).

[Figure: same plot, showing the projection q·v1 of q onto v1]

Case study: How to query?

Compactly, we have:

q_concept = q V

E.g., for a query vector q over (Matrix, Alien, Serenity, Casablanca, Amelie), multiplying by the movie-to-concept similarity matrix V gives q's strength in the SciFi-concept.
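The projection q_concept = q V can be sketched in numpy (the toy rating matrix and query vectors are assumed for illustration):

```python
import numpy as np

# Toy users-to-movies matrix (columns: Matrix, Alien, Serenity,
# Casablanca, Amelie); values assumed, not from the slides
A = np.array([
    [1, 1, 1, 0, 0],
    [3, 3, 3, 0, 0],
    [4, 4, 4, 0, 0],
    [5, 5, 5, 0, 0],
    [0, 0, 0, 4, 4],
    [0, 0, 0, 5, 5],
    [0, 0, 0, 2, 2],
], dtype=float)

U, s, Vt = np.linalg.svd(A, full_matrices=False)
V = Vt.T[:, :2]          # keep the two strongest concepts

q = np.array([5, 0, 0, 0, 0], dtype=float)   # rated only 'Matrix'
d = np.array([0, 4, 5, 0, 0], dtype=float)   # rated 'Alien', 'Serenity'

q_concept = q @ V
d_concept = d @ V

# Both load only on the SciFi concept, so their concept-space
# similarity is nonzero although they share no rated movie
print(q_concept, d_concept)
```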


Case study: How to query?

Q: How would the user d that rated ('Alien', 'Serenity') be handled?

A: The same way:

d_concept = d V

E.g., multiplying d's rating vector over (Matrix, Alien, Serenity, Casablanca, Amelie) by the movie-to-concept similarity matrix V gives d's strength in the SciFi-concept.

Case study: How to query?

Observation: User d, who rated ('Alien', 'Serenity'), will be similar to the query "user" q, who rated ('Matrix'), although d did not rate 'Matrix'!

- In the original movie space: similarity = 0 (no movies in common)
- In concept space: similarity ≠ 0 (both load on the SciFi-concept)

SVD: Drawbacks

- Optimal low-rank approximation:
- in Frobenius norm

- Interpretability problem:
- A singular vector specifies a linear combination of all input columns or rows

- Lack of sparsity:
- Singular vectors are dense!
