Nonnegative Matrix Factorization with Sparseness Constraints


S. Race, MA591R

Introduction to NMF
  • Factor A ≈ WH
  • A – m x n matrix of data
    • m non-negative scalar variables
    • n measurements form the columns of A
  • W – m x r matrix of “basis vectors”
  • H – r x n coefficient matrix
    • Describes how strongly each building block is present in measurement vectors
Introduction to NMF (cont’d)
  • Purpose:
    • “parts-based” representation of the data
    • Data compression
    • Noise reduction
  • Examples:
    • Term – Document Matrix
    • Image processing
    • Any data composed of hidden parts
Introduction to NMF (cont’d)
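  • Optimize accuracy of solution:
    • $\min_{W,H} \|A - WH\|_F$ subject to $W, H \ge 0$
    • The nonnegativity constraints can be dropped by squaring the factors elementwise:
      • $\min_{W,H} \|A - (W \circ W)(H \circ H)\|_F$, where $\circ$ is the elementwise (Hadamard) product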
  • Many options for objective function
  • Many options for algorithm
    • W,H will depend on initial choices
    • Convergence is not always guaranteed
Common Algorithms
  • Alternating Least Squares
    • Paatero 1994
  • Multiplicative Update Rules
    • Lee-Seung (Nature 1999; NIPS 2000)
    • Used by Hoyer
  • Gradient Descent
    • Hoyer 2004
    • Berry-Plemmons 2004
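
As an illustration of the multiplicative update rules listed above, here is a minimal sketch in plain Python. It assumes the Frobenius-norm objective min ‖A − WH‖ with W, H ≥ 0; the helper names (`matmul`, `multiplicative_step`, `nmf`), the small `eps` guard, and the toy matrix are choices made for this example, not part of the presentation. Lists of rows are used to keep the example dependency-free; an array library would be the usual choice in practice.

```python
import random

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def frobenius_error(A, W, H):
    """||A - WH||_F, the accuracy objective."""
    WH = matmul(W, H)
    return sum((a - b) ** 2 for ra, rb in zip(A, WH) for a, b in zip(ra, rb)) ** 0.5

def multiplicative_step(A, W, H, eps=1e-12):
    """One update of H, then of W; eps only guards against division by zero."""
    Wt = transpose(W)
    num, den = matmul(Wt, A), matmul(matmul(Wt, W), H)
    H = [[h * n / (d + eps) for h, n, d in zip(rh, rn, rd)]
         for rh, rn, rd in zip(H, num, den)]
    Ht = transpose(H)
    num, den = matmul(A, Ht), matmul(W, matmul(H, Ht))
    W = [[w * n / (d + eps) for w, n, d in zip(rw, rn, rd)]
         for rw, rn, rd in zip(W, num, den)]
    return W, H

def nmf(A, r, iters=200, seed=0):
    """Random positive start, then repeated multiplicative steps."""
    rng = random.Random(seed)
    m, n = len(A), len(A[0])
    W = [[rng.random() + 0.1 for _ in range(r)] for _ in range(m)]
    H = [[rng.random() + 0.1 for _ in range(n)] for _ in range(r)]
    errors = [frobenius_error(A, W, H)]
    for _ in range(iters):
        W, H = multiplicative_step(A, W, H)
        errors.append(frobenius_error(A, W, H))
    return W, H, errors

# Toy data: 4 variables (rows), 3 measurements (columns), r = 2 basis vectors.
A = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [0.0, 1.0, 1.0], [1.0, 0.0, 2.0]]
W, H, errors = nmf(A, r=2)
```

Changing `seed` changes the factors that come out, which is the dependence on initial choices mentioned earlier; the error sequence should be non-increasing from step to step, but nothing here guarantees a global minimum.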
Why is sparsity important?
  • Nature of some data
    • Text-mining
    • Disease patterns
  • Better Interpretation of Results
  • Storage concerns
Non-negative Sparse Coding I
  • Proposed by Patrik Hoyer in 2002
  • Add a penalty function to the objective to encourage sparseness
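  • OBJ: $\min_{W,H \ge 0} \; \tfrac{1}{2}\|A - WH\|_F^2 + \lambda \sum_{i,j} f(H_{ij})$
  • Parameter λ controls the trade-off between accuracy and sparseness
  • f is strictly increasing: $f(H_{ij}) = H_{ij}$ works, so the penalty is $\lambda \sum_{i,j} H_{ij}$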
Sparse Objective Function
  • The objective can always be decreased by scaling W up and H down
    • Replace W with cW and H with (1/c)H for any c > 1: WH is unchanged, but the penalty on H shrinks
  • Thus, on its own, the penalized objective will simply yield the NMF solution
  • Constraint on the scale of H or W is needed
    • Fix norm of columns of W or rows of H
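
The scaling argument can be written out explicitly. This is a sketch under the assumption that the objective is a reconstruction term plus a penalty λ Σ f(H_ij) with f(h) = h; the symbol E for the objective is introduced here for illustration only.

```latex
\begin{aligned}
E(W,H) &= \tfrac{1}{2}\,\|A - WH\|_F^2 + \lambda \sum_{i,j} H_{ij}, \qquad W, H \ge 0, \\[4pt]
E\!\left(cW, \tfrac{1}{c}H\right)
  &= \tfrac{1}{2}\,\bigl\|A - (cW)\bigl(\tfrac{1}{c}H\bigr)\bigr\|_F^2 + \frac{\lambda}{c} \sum_{i,j} H_{ij} \\
  &= \tfrac{1}{2}\,\|A - WH\|_F^2 + \frac{\lambda}{c} \sum_{i,j} H_{ij}
  \;<\; E(W,H) \qquad (c > 1,\ \lambda > 0,\ H \ne 0).
\end{aligned}
```

Letting c grow without bound drives the penalty term to zero while leaving the reconstruction error untouched, so the penalty has no effect unless the scale of W or H is pinned down, for example by fixing the norm of each column of W.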
Non-negative Sparse Coding I
  • Pros
    • Simple, efficient
    • For fixed W, the multiplicative update rule for H is guaranteed to reach the global minimum over H
  • Cons
    • Sparseness controlled implicitly: Optimal λ found by trial and error
    • Sparseness only constrained for H
NMF with sparseness constraints II
  • First need some way to define the sparseness of a vector
  • A vector with one nonzero entry is maximally sparse
  • A multiple of the vector of all ones, e, is minimally sparse
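  • CBS (Cauchy-Bunyakovsky-Schwarz) inequality: $\|x\|_1 \le \sqrt{n}\,\|x\|_2$, with equality only when all $|x_i|$ are equal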
  • How can we combine these ideas?
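
One way to connect the two extreme cases is to bound the ratio of the L1 and L2 norms. The short derivation below is a standard argument, written here for a vector x of dimension n; it is offered as a sketch of the reasoning rather than as the slide's own formula.

```latex
\|x\|_1 = \sum_{i=1}^{n} |x_i| \cdot 1
  \;\le\; \Bigl(\sum_{i=1}^{n} x_i^2\Bigr)^{1/2} \Bigl(\sum_{i=1}^{n} 1^2\Bigr)^{1/2}
  = \sqrt{n}\,\|x\|_2,
\qquad
\|x\|_2^2 = \sum_{i=1}^{n} x_i^2 \;\le\; \Bigl(\sum_{i=1}^{n} |x_i|\Bigr)^{2} = \|x\|_1^2 .
```

Together these give 1 ≤ ‖x‖₁/‖x‖₂ ≤ √n for any nonzero x. The lower bound is attained by a vector with a single nonzero entry (maximally sparse) and the upper bound by a multiple of e (minimally sparse), so the ratio of the two norms can serve as a scale for sparseness.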
Hoyer’s Sparseness Parameter
  • $\text{sparseness}(x) = \dfrac{\sqrt{n} - \bigl(\sum_i |x_i|\bigr) / \sqrt{\sum_i x_i^2}}{\sqrt{n} - 1}$
    • where n is the dimensionality of x
  • This measure indicates that we can control a vector’s sparseness by manipulating its L1 and L2 norms
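
A small sketch of the measure in Python may make the L1/L2 relationship concrete. It assumes the usual form of Hoyer's measure, (√n − L1/L2)/(√n − 1); the function names `sparseness` and `l1_for_sparseness` are made up for this example.

```python
import math

def sparseness(x):
    """Hoyer's measure: (sqrt(n) - L1/L2) / (sqrt(n) - 1), between 0 and 1."""
    n = len(x)
    if n < 2:
        raise ValueError("need at least two components")
    l1 = sum(abs(v) for v in x)
    l2 = math.sqrt(sum(v * v for v in x))
    if l2 == 0.0:
        raise ValueError("sparseness is undefined for the zero vector")
    return (math.sqrt(n) - l1 / l2) / (math.sqrt(n) - 1)

def l1_for_sparseness(target, n, l2=1.0):
    """L1 norm that a length-n vector with the given L2 norm needs to reach `target`."""
    return l2 * (math.sqrt(n) - target * (math.sqrt(n) - 1))

print(sparseness([0.0, 0.0, 5.0, 0.0]))  # one nonzero entry  -> 1.0
print(sparseness([3.0, 3.0, 3.0, 3.0]))  # multiple of e      -> 0.0
print(l1_for_sparseness(0.5, 4))         # -> 1.5
```

The second helper inverts the formula: once the L2 norm is fixed, choosing a target sparseness is the same as choosing an L1 norm, which is what the projection step relies on.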
Implementing Sparseness Constraints
  • Now that we have an explicit measure of sparseness, how can we incorporate it into the algorithm?
  • Hoyer: at each step, project each column of a matrix onto the nearest vector of desired sparseness.
Hoyer’s Projection Algorithm
  • Problem: Given any vector, x, find the closest (in the Euclidean sense) non-negative vector s with a given L1 norm and a given L2 norm
  • We can easily solve this problem in the 3 dimensional case and extend the result.
Hoyer’s Projection Algorithm
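  • Set $s_i = x_i + (L_1 - \sum_j x_j)/n$ for all i
  • Set Z = {}
  • Iterate
    • Set $m_i = L_1/(n - \text{size}(Z))$ if i is not in Z, 0 otherwise
    • Set $s = m + \beta(s - m)$, where $\beta \ge 0$ solves the quadratic $\|m + \beta(s - m)\|_2^2 = L_2^2$
    • If s is non-negative, we are finished
    • Set $Z = Z \cup \{i : s_i < 0\}$
    • Set $s_i = 0$ for all i in Z
    • Calculate $c = (\sum_i s_i - L_1)/(n - \text{size}(Z))$
    • Set $s_i = s_i - c$ for all i not in Z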
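
The steps above can be sketched in Python as follows. This is an illustrative reading of the projection, not Hoyer's reference code: the function name `project`, the degenerate-input check, and the example vector are assumptions made here, and the targets are assumed feasible (L2 ≤ L1 ≤ √n · L2).

```python
import math

def project(x, l1, l2):
    """Non-negative vector s close to x with sum(s) == l1 and ||s||_2 == l2."""
    n = len(x)
    zeroed = set()                      # the index set Z
    shift = (l1 - sum(x)) / n
    s = [v + shift for v in x]          # project onto the hyperplane sum(s) = l1
    while True:
        free = n - len(zeroed)
        m = [0.0 if i in zeroed else l1 / free for i in range(n)]
        d = [si - mi for si, mi in zip(s, m)]
        # Non-negative root of ||m + beta * d||^2 = l2^2.
        a = sum(di * di for di in d)
        if a == 0.0:
            raise ValueError("degenerate input: s coincides with the centre m")
        b = 2.0 * sum(mi * di for mi, di in zip(m, d))
        c = sum(mi * mi for mi in m) - l2 * l2
        beta = (-b + math.sqrt(max(b * b - 4.0 * a * c, 0.0))) / (2.0 * a)
        s = [mi + beta * di for mi, di in zip(m, d)]
        if min(s) >= 0.0:
            return s
        # Fix negative components at zero, then restore the sum on the rest.
        zeroed |= {i for i, si in enumerate(s) if si < 0.0}
        free = n - len(zeroed)
        s = [0.0 if i in zeroed else si for i, si in enumerate(s)]
        corr = (sum(s) - l1) / free
        s = [si if i in zeroed else si - corr for i, si in enumerate(s)]

# Example: unit L2 norm and L1 = 1.2, i.e. sparseness 0.8 for n = 4.
s = project([0.5, 0.3, 0.1, 0.05], l1=1.2, l2=1.0)
```

Each pass either returns or adds at least one index to Z, so the loop runs at most n times.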
The Algorithm in words
  • Project x onto hyperplane Σsi=L1
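  • Within this hyperplane, move radially from the center of the joint constraint hypersphere (the point whose components are all equal) through the current point until the L2 constraint is met
  • If the result is non-negative, the destination is reached
  • Otherwise, set the negative components to zero, keep them fixed there, and project the remaining components in the same way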
NMF with sparseness constraints
  • Step 1: Initialize W, H to random positive matrices
  • Step 2: If sparseness constraints apply to W, H, or both, project each column of W or each row of H, respectively, so that it is non-negative, keeps its L2 norm, and has the L1 norm that gives the desired sparseness
NMF w/ Sparseness Algorithm
  • Step 3: Iterate
    • If sparseness constraints on W apply:
      • Set $W \leftarrow W - \mu_W (WH - A) H^T$
      • Project columns of W as in step 2
    • Else, take a standard multiplicative step on W
    • If sparseness constraints on H apply:
      • Set $H \leftarrow H - \mu_H W^T (WH - A)$
      • Project rows of H as in step 2
    • Else, take a standard multiplicative step on H
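
The W half of this iteration might look like the sketch below: a gradient step followed by a column-wise projection. Everything named here (`sparse_w_step`, `project`, the fixed step size `mu`, the random test matrices) is an assumption for illustration. A practical implementation would also adapt the step size and treat H analogously, which this sketch leaves out.

```python
import math
import random

def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def project(x, l1, l2):
    """Non-negative vector near x with sum l1 and L2 norm l2 (feasible targets assumed)."""
    n, zeroed = len(x), set()
    shift = (l1 - sum(x)) / n
    s = [v + shift for v in x]
    while True:
        free = n - len(zeroed)
        m = [0.0 if i in zeroed else l1 / free for i in range(n)]
        d = [si - mi for si, mi in zip(s, m)]
        a = sum(di * di for di in d)
        if a == 0.0:
            raise ValueError("degenerate input: s coincides with the centre m")
        b = 2.0 * sum(mi * di for mi, di in zip(m, d))
        c = sum(mi * mi for mi in m) - l2 * l2
        beta = (-b + math.sqrt(max(b * b - 4.0 * a * c, 0.0))) / (2.0 * a)
        s = [mi + beta * di for mi, di in zip(m, d)]
        if min(s) >= 0.0:
            return s
        zeroed |= {i for i, si in enumerate(s) if si < 0.0}
        free = n - len(zeroed)
        s = [0.0 if i in zeroed else si for i, si in enumerate(s)]
        corr = (sum(s) - l1) / free
        s = [si if i in zeroed else si - corr for i, si in enumerate(s)]

def sparse_w_step(A, W, H, sw, mu):
    """W <- W - mu (WH - A) H^T, then project each column to sparseness sw."""
    rows = len(W)
    R = [[wh - a for wh, a in zip(rw, ra)] for rw, ra in zip(matmul(W, H), A)]
    G = matmul(R, transpose(H))
    stepped = [[w - mu * g for w, g in zip(rw, rg)] for rw, rg in zip(W, G)]
    cols = []
    for col in transpose(stepped):
        l2 = math.sqrt(sum(v * v for v in col))                  # L2 norm is kept
        l1 = l2 * (math.sqrt(rows) - sw * (math.sqrt(rows) - 1))  # L1 for sparseness sw
        cols.append(project(col, l1, l2))
    return transpose(cols)

rng = random.Random(0)
A = [[rng.random() for _ in range(5)] for _ in range(6)]
W = [[rng.random() for _ in range(2)] for _ in range(6)]
H = [[rng.random() for _ in range(5)] for _ in range(2)]
W = sparse_w_step(A, W, H, sw=0.6, mu=0.01)
```

After the call, every column of `W` is non-negative and has sparseness 0.6 by construction; whether the reconstruction error went down depends on the step size, which is why an adaptive `mu` is the more realistic choice.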
Advantages of New Method
  • Sparseness controlled explicitly with a parameter that is easily interpreted
  • Sparseness of W, H or both can be constrained
  • Number of iterations required grows very slowly with the dimensionality of the problem

[Figure: number of iterations required versus problem dimensionality. Dotted lines show the minimum and maximum number of iterations; the solid line shows the average.]

Text Mining Results
  • Text to Matrix Generator
    • Dimitrios Zeimpekis and E. Gallopoulos
    • University of Patras
  • NMF with sparseness constraints from Hoyer’s web page