Dimensionality Reduction

  • High-dimensional == many features

  • Find concepts/topics/genres:

    • Documents:

      • Features: Thousands of words, millions of word pairs

    • Surveys – Netflix: 480k users x 177k movies

Slides by Jure Leskovec


Dimensionality Reduction

  • Compress / reduce dimensionality:

    • 10^6 rows; 10^3 columns; no updates

    • random access to any cell(s); small error: OK

Dimensionality Reduction

  • Assumption: Data lies on or near a low d-dimensional subspace

  • Axes of this subspace are effective representation of the data

Why Reduce Dimensions?


  • Discover hidden correlations/topics

    • Words that occur commonly together

  • Remove redundant and noisy features

    • Not all words are useful

  • Interpretation and visualization

  • Easier storage and processing of the data

SVD - Definition

A[m x n] = U[m x r] Σ[r x r] (V[n x r])T

  • A: Input data matrix

    • m x n matrix (e.g., m documents, n terms)

  • U: Left singular vectors

    • m x r matrix (m documents, r concepts)

  • Σ: Singular values

    • r x r diagonal matrix (strength of each ‘concept’) (r: rank of the matrix A)

  • V: Right singular vectors

    • n x r matrix (n terms, r concepts)
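To make the shapes concrete, here is a minimal numpy sketch (not from the original slides; the rating values are made up for the users-to-movies setting used later in the deck):

```python
import numpy as np

# Toy users-to-movies rating matrix (7 users x 5 movies); illustrative values.
A = np.array([
    [1, 1, 1, 0, 0],
    [3, 3, 3, 0, 0],
    [4, 4, 4, 0, 0],
    [5, 5, 5, 0, 0],
    [0, 2, 0, 4, 4],
    [0, 0, 0, 5, 5],
    [0, 1, 0, 2, 2],
], dtype=float)

# Reduced ("economy") SVD: U is m x min(m, n), sigma holds the singular
# values in decreasing order, Vt is min(m, n) x n.
U, sigma, Vt = np.linalg.svd(A, full_matrices=False)
print(U.shape, sigma.shape, Vt.shape)            # (7, 5) (5,) (5, 5)

# A is recovered (up to floating-point error) as U * diag(sigma) * V^T.
print(np.allclose(A, U @ np.diag(sigma) @ Vt))   # True
```

With full_matrices=False numpy returns the reduced factorization, so the inner dimension is min(m, n) rather than the rank r; for a low-rank matrix the trailing singular values simply come out (near) zero.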

SVD

[Diagram: the m x n matrix A factors as U (m x r) times Σ (r x r) times VT (r x n)]



SVD

[Diagram: A written as a sum of rank-1 matrices: A = σ1 u1 v1T + σ2 u2 v2T + …]

σi … scalar

ui … vector

vi … vector

SVD - Properties

It is always possible to decompose a real matrix A into A = U Σ VT, where

  • U, Σ, V: unique

  • U, V: column orthonormal:

    • UT U = I; VT V = I (I: identity matrix)

    • (Columns are orthogonal unit vectors)

  • Σ: diagonal

    • Entries (singular values) are positive, and sorted in decreasing order (σ1 ≥ σ2 ≥ ... ≥ 0)
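A quick numerical check of these properties, as a sketch with an arbitrary matrix (any real matrix works):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((7, 5))          # any real matrix

U, sigma, Vt = np.linalg.svd(A, full_matrices=False)

# Columns of U and of V are orthonormal: U^T U = I and V^T V = I.
print(np.allclose(U.T @ U, np.eye(U.shape[1])))      # True
print(np.allclose(Vt @ Vt.T, np.eye(Vt.shape[0])))   # True

# Singular values are non-negative and sorted in decreasing order.
print(bool(np.all(sigma >= 0) and np.all(sigma[:-1] >= sigma[1:])))  # True
```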

SVD – Example: Users-to-Movies

  • A = U Σ VT - example:

[Example matrix: columns = movies (Matrix, Alien, Serenity, Casablanca, Amelie); rows = users, grouped into SciFi fans and Romance fans; shown factored as U x Σ x VT]

SVD – Example: Users-to-Movies

  • A = U Σ VT - example:

[Same users-to-movies example (Matrix, Alien, Serenity, Casablanca, Amelie), now with the two concepts labeled: a SciFi-concept and a Romance-concept]

SVD - Example

U is the “user-to-concept” similarity matrix

  • A = U Σ VT - example:

[Diagram: same example with U highlighted; its rows give each user’s strength in the SciFi-concept and the Romance-concept]

SVD - Example

  • A = U VT - example:

Matrix

Alien

Serenity

Casablanca

Amelie

‘strength’ of SciFi-concept

SciFi

x

x

=

Romnce

SVD - Example

  • A = U VT - example:

V is “movie-to-concept”

similarity matrix

Matrix

Alien

Serenity

Casablanca

Amelie

SciFi-concept

SciFi

x

x

=

Romnce

SVD - Interpretation #1

‘movies’, ‘users’ and ‘concepts’:

  • U: user-to-concept similarity matrix

  • V: movie-to-concept sim. matrix

  • Σ: its diagonal elements: ‘strength’ of each concept


SVD - Interpretation #2

SVD gives the best axis to project on:

  • ‘best’ = min sum of squares of projection errors = minimum reconstruction error

[Plot: user ratings as points in the (Movie 1 rating, Movie 2 rating) plane, with v1, the first right singular vector, drawn as the best axis through them]
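To make the projection concrete, here is a small numpy sketch (illustrative 2-D rating data, not from the slides) that finds v1 and measures the projection error it minimizes:

```python
import numpy as np

# Each row is a user; columns are (Movie 1 rating, Movie 2 rating).
ratings = np.array([
    [1.0, 1.1],
    [2.0, 1.9],
    [3.0, 3.2],
    [4.0, 3.8],
    [5.0, 5.1],
])

U, sigma, Vt = np.linalg.svd(ratings, full_matrices=False)
v1 = Vt[0]                       # first right singular vector (unit length)

coords = ratings @ v1            # 1-D coordinate of each user along the v1 axis
approx = np.outer(coords, v1)    # rank-1 reconstruction from those coordinates

# Sum of squared projection errors; v1 minimizes this over all unit vectors.
print(np.sum((ratings - approx) ** 2))
```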


SVD - Interpretation #2

  • A = U Σ VT - example:

[Diagram: the users-to-movies decomposition shown next to the (Movie 1 rating, Movie 2 rating) scatter plot, with v1, the first right singular vector, drawn through the data]

SVD - Interpretation #2

  • A = U VT - example:

variance (‘spread’) on the v1 axis

x

x

=


SVD - Interpretation #2

More details

  • Q: How exactly is dim. reduction done?

[Diagram: A = U x Σ x VT]

SVD - Interpretation #2

More details

  • Q: How exactly is dim. reduction done?

  • A: Set the smallest singular values to zero

[Diagram: A = U x Σ x VT with the smallest singular values in Σ set to zero]

SVD - Interpretation #2

More details

  • Q: How exactly is dim. reduction done?

  • A: Set the smallest singular values to zero

[Diagram: A = U Σ VT and its rank-reduced approximation B = U S VT, where S keeps only the largest singular values]

Frobenius norm:

ǁMǁF = √(Σij Mij²)

ǁA - BǁF = √(Σij (Aij - Bij)²) is “small”
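A numpy sketch of this truncation step (the matrix and the cut-off k below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((7, 5))

U, sigma, Vt = np.linalg.svd(A, full_matrices=False)

k = 2
s = sigma.copy()
s[k:] = 0.0                              # set the smallest singular values to zero

B = U @ np.diag(s) @ Vt                  # B = U S VT, a rank-k matrix

# The Frobenius error equals the square root of the sum of the
# squared singular values that were dropped.
print(np.linalg.norm(A - B, "fro"))
print(np.sqrt(np.sum(sigma[k:] ** 2)))   # same number
```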

[Diagram: A = U x Sigma x VT (full SVD), and B = U x Sigma x VT with the truncated Sigma; B is approx. A]

SVD – Best Low Rank Approx.

  • Theorem: Let A = U Σ VT (σ1 ≥ σ2 ≥ …, rank(A) = r),

    then B = U S VT

    • S = diagonal n x n matrix where si = σi (i = 1…k), else si = 0

      is a best rank-k approximation to A:

    • B is a solution to minB ǁA - BǁF where rank(B) = k

  • We will need 2 facts:

    • ǁMǁF² = Σi qii², where M = P Q R is the SVD of M

    • U Σ VT - U S VT = U (Σ - S) VT

SVD – Best Low Rank Approx.

  • We will need 2 facts:

    • ǁMǁF² = Σi qii², where M = P Q R is the SVD of M

    • U Σ VT - U S VT = U (Σ - S) VT

We apply:

-- P column orthonormal

-- R row orthonormal

-- Q is diagonal

SVD – Best Low Rank Approx.

  • A = U Σ VT, B = U S VT (σ1 ≥ σ2 ≥ … ≥ 0, rank(A) = r)

    • S = diagonal n x n matrix where si = σi (i = 1…k), else si = 0

      then B is a solution to minB ǁA - BǁF, rank(B) = k

  • Why?

  • U Σ VT - U S VT = U (Σ - S) VT, so by the two facts ǁA - BǁF = √(Σi (σi - si)²)

  • We want to choose si to minimize Σi (σi - si)², so we set si = σi (i = 1…k), else si = 0

SVD - Interpretation #2

Equivalent:

‘spectral decomposition’ of the matrix:

[Diagram: A = (columns u1, u2, …) x diag(σ1, σ2, …) x (rows v1T, v2T, …)]


SVD - Interpretation #2

Equivalent:

‘spectral decomposition’ of the matrix

A = σ1 u1 v1T + σ2 u2 v2T + … (k terms kept; each term is a rank-1 outer product of a column vector ui and a row vector viT)

Assume: σ1 ≥ σ2 ≥ σ3 ≥ ... ≥ 0

Why is setting small σs the thing to do?

Vectors ui and vi are unit length, so σi scales them.

So, zeroing small σs introduces less error.
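The sketch below (with an arbitrary matrix) checks that keeping k terms of this sum is the same as zeroing the remaining σs, and that the error comes only from the dropped terms:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 4))

U, sigma, Vt = np.linalg.svd(A, full_matrices=False)

k = 2
# Sum of the first k rank-1 terms sigma_i * u_i * v_i^T.
B = sum(sigma[i] * np.outer(U[:, i], Vt[i]) for i in range(k))

# Identical to truncating Sigma and multiplying the factors back out.
print(np.allclose(B, U[:, :k] @ np.diag(sigma[:k]) @ Vt[:k]))    # True

# The dropped terms account for the whole Frobenius error.
print(np.allclose(np.linalg.norm(A - B, "fro"),
                  np.sqrt(np.sum(sigma[k:] ** 2))))              # True
```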


SVD - Interpretation #2

Q: How many σs to keep?

A: Rule of thumb: keep 80-90% of the ‘energy’ (= Σi σi²)

[Diagram: A ≈ σ1 u1 v1T + σ2 u2 v2T + …]

Assume: σ1 ≥ σ2 ≥ σ3 ≥ ...

SVD - Complexity

  • To compute SVD:

    • O(nm²) or O(n²m) (whichever is less)

  • But:

    • Less work, if we just want singular values

    • or if we want first k singular vectors

    • or if the matrix is sparse

  • Implemented in linear algebra packages like

    • LINPACK, Matlab, SPlus, Mathematica ...
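For the sparse / “first k singular vectors” case, one common route is a truncated sparse SVD; the sketch below uses SciPy for illustration (the slide itself names LINPACK, Matlab, SPlus, and Mathematica, not SciPy):

```python
import numpy as np
from scipy.sparse import random as sparse_random
from scipy.sparse.linalg import svds

# A large, very sparse matrix (e.g., users x movies with few ratings per user).
A = sparse_random(10_000, 1_000, density=0.001, format="csr", random_state=0)

# Compute only the k largest singular values/vectors instead of the full SVD.
k = 10
U, sigma, Vt = svds(A, k=k)

# Sort into decreasing order (svds does not promise that ordering).
order = np.argsort(sigma)[::-1]
U, sigma, Vt = U[:, order], sigma[order], Vt[order]
print(U.shape, sigma.shape, Vt.shape)     # (10000, 10) (10,) (10, 1000)
```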

SVD - Conclusions so far

  • SVD: A = U Σ VT: unique

    • U: user-to-concept similarities

    • V: movie-to-concept similarities

    • Σ: strength of each concept

  • Dimensionality reduction:

    • keep the few largest singular values (80-90% of ‘energy’)

    • SVD: picks up linear correlations

Case study: How to query?

Q: Find users that like ‘Matrix’ and ‘Alien’

[Diagram: the users-to-movies matrix A (columns: Matrix, Alien, Serenity, Casablanca, Amelie; rows: SciFi and Romance users) factored as U x Σ x VT]

Case study: How to query?

Q: Find users that like ‘Matrix’

A: Map query into a ‘concept space’ – how?

[Diagram: the same users-to-movies example, A = U x Σ x VT]

Case study: How to query?

Q: Find users that like ‘Matrix’

A: map query vectors into ‘concept space’ – how?

q = the query user’s rating row over (Matrix, Alien, Serenity, Casablanca, Amelie)

[Plot: q shown as a point in the (Matrix, Alien) plane together with the concept axes v1 and v2]

Project into concept space: inner product with each ‘concept’ vector vi

Case study: How to query?

Q: Find users that like ‘Matrix’

A: map the vector into ‘concept space’ – how?

q = the query user’s rating row over (Matrix, Alien, Serenity, Casablanca, Amelie)

[Plot: q in the (Matrix, Alien) plane; its projection onto the v1 axis is the inner product q·v1]

Project into concept space: inner product with each ‘concept’ vector vi

Case study: How to query?

Compactly, we have:

qconcept = q V

E.g.:

[Example: q = the query user’s rating row over (Matrix, Alien, Serenity, Casablanca, Amelie); multiplying by V, the movie-to-concept similarity matrix, gives q’s weight on the SciFi-concept]
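A numpy sketch of this mapping (the ratings and the query below are made up; the original slide’s numbers are not preserved in this transcript):

```python
import numpy as np

# Toy users-to-movies ratings: columns are
# (Matrix, Alien, Serenity, Casablanca, Amelie).
A = np.array([
    [1, 1, 1, 0, 0],
    [3, 3, 3, 0, 0],
    [4, 4, 4, 0, 0],
    [5, 5, 5, 0, 0],
    [0, 2, 0, 4, 4],
    [0, 0, 0, 5, 5],
    [0, 1, 0, 2, 2],
], dtype=float)

U, sigma, Vt = np.linalg.svd(A, full_matrices=False)
V = Vt.T                        # movie-to-concept matrix (n movies x r concepts)

# Query "user" q rated only 'Matrix'.
q = np.array([5.0, 0.0, 0.0, 0.0, 0.0])

q_concept = q @ V               # q mapped into concept space
print(q_concept[:2])            # large magnitude on the SciFi-concept,
                                # much smaller on the Romance-concept
```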

Case study: How to query?

How would the user d that rated (‘Alien’, ‘Serenity’) be handled?

dconcept = d V

E.g.:

[Example: d = the rating row of a user who rated only ‘Alien’ and ‘Serenity’; multiplying by V, the movie-to-concept similarity matrix, again gives a large weight on the SciFi-concept]


Case study: How to query?

Observation: User d that rated (‘Alien’, ‘Serenity’) will be similar to query “user” q that rated (‘Matrix’), although d did not rate ‘Matrix’!

  • In the original movie space, d and q share no rated movies: similarity = 0

  • In concept space, both load on the SciFi-concept: similarity ≠ 0
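A sketch of this observation in numpy (toy ratings as before; cosine similarity is used here as the similarity measure, which the slide leaves unspecified):

```python
import numpy as np

# Columns: (Matrix, Alien, Serenity, Casablanca, Amelie).
A = np.array([
    [1, 1, 1, 0, 0],
    [3, 3, 3, 0, 0],
    [4, 4, 4, 0, 0],
    [5, 5, 5, 0, 0],
    [0, 2, 0, 4, 4],
    [0, 0, 0, 5, 5],
    [0, 1, 0, 2, 2],
], dtype=float)

U, sigma, Vt = np.linalg.svd(A, full_matrices=False)
Vk = Vt[:2].T                    # keep the top-2 concepts (5 movies x 2 concepts)

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

q = np.array([5.0, 0.0, 0.0, 0.0, 0.0])   # rated only 'Matrix'
d = np.array([0.0, 4.0, 5.0, 0.0, 0.0])   # rated only 'Alien' and 'Serenity'

print(cosine(q, d))              # 0.0 in the original movie space
print(cosine(q @ Vk, d @ Vk))    # clearly nonzero in the reduced concept space
```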

SVD: Drawbacks

  • Optimal low-rank approximation:

    • in Frobenius norm

  • Interpretability problem:

    • A singular vector specifies a linear combination of all input columns or rows

  • Lack of sparsity:

    • Singular vectors are dense!

