Nonnegative Matrix Factorization with Sparseness Constraints


### Nonnegative Matrix Factorization with Sparseness Constraints

S. Race MA591R

Introduction to NMF
• Factor A = WH
• A – m × n matrix of data
• m non-negative scalar variables
• n measurements form the columns of A
• W – m × r matrix of “basis vectors”
• H – r × n coefficient matrix
• H describes how strongly each basis vector (“building block”) is present in the measurement vectors (a minimal usage sketch follows)
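To make the shapes concrete, here is a minimal sketch using scikit-learn's NMF class; the library choice, the random data, and r = 5 are illustrative assumptions, not part of the original slides.

```python
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(0)
A = rng.random((100, 40))            # m = 100 variables, n = 40 measurements

model = NMF(n_components=5, init="random", random_state=0)
W = model.fit_transform(A)           # m x r matrix of basis vectors
H = model.components_                # r x n coefficient matrix
print(W.shape, H.shape)              # (100, 5) (5, 40)
```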
Introduction to NMF (cont’d)
• Purpose:
• “Parts-based” representation of the data
• Data compression
• Noise reduction
• Examples:
• Term-document matrices
• Image processing
• Any data composed of hidden parts
Introduction to NMF (cont’d)
• Optimize the accuracy of the solution:
• min ||A − WH||_F subject to W, H ≥ 0
• The non-negativity constraints can be dropped by squaring the factors elementwise (sketch below):
• min ||A − (W∘W)(H∘H)||_F, where ∘ denotes the elementwise (Hadamard) product
• Many options for the objective function
• Many options for the algorithm
• W, H will depend on the initial choices
• Convergence is not always guaranteed
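A small numpy sketch of this reparameterization trick (the sizes and data are arbitrary): squaring the factors elementwise keeps the effective W and H non-negative even though the underlying optimization variables are unconstrained.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((50, 30))              # non-negative data
U = rng.standard_normal((50, 4))      # unconstrained variables
V = rng.standard_normal((4, 30))

# The effective factors W = U∘U and H = V∘V are non-negative by
# construction, so the explicit constraints can be dropped.
W, H = U * U, V * V
print(np.linalg.norm(A - W @ H, "fro"))   # Frobenius-norm objective
```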
Common Algorithms
• Alternating least squares
• Paatero & Tapper 1994
• Multiplicative update rules (sketched below)
• Lee & Seung 1999 (Nature), 2000 (NIPS)
• Used by Hoyer
• Hoyer 2004
• Berry & Plemmons 2004
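For reference, a sketch of the Lee–Seung multiplicative update rules for the Frobenius objective; the small eps guard against division by zero is an implementation assumption.

```python
import numpy as np

def multiplicative_step(A, W, H, eps=1e-9):
    """One round of the Lee-Seung updates for min ||A - WH||_F, W, H >= 0.
    Each entry is multiplied by a non-negative ratio, so the factors
    stay non-negative and the objective does not increase."""
    H *= (W.T @ A) / (W.T @ W @ H + eps)
    W *= (A @ H.T) / (W @ H @ H.T + eps)
    return W, H
```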
Why is sparsity important?
• Nature of some data
• Text-mining
• Disease patterns
• Better Interpretation of Results
• Storage concerns
Non-negative Sparse Coding I
• Proposed by Patrik Hoyer in 2002
• Add a penalty function to the objective to encourage sparseness
• Objective: min ½||A − WH||²_F + λ Σ f(Hij) over W, H ≥ 0
• The parameter λ controls the trade-off between accuracy and sparseness
• f must be strictly increasing: f(h) = h, i.e. the penalty λ Σ Hij, works
Sparse Objective Function
• The objective can always be decreased by scaling W up and H down
• Set W ← cW and H ← (1/c)H with c > 1: the product WH is unchanged, but the penalty term shrinks (numerical demo below)
• Thus, without a scale constraint, the penalty is driven toward zero and the objective simply yields the plain NMF solution
• A constraint on the scale of H or W is needed
• Fix the norms of the columns of W or of the rows of H
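A quick numerical illustration of this degeneracy (λ, c, and the matrix sizes are arbitrary): rescaling leaves the fit term untouched while dividing the penalty by c.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((20, 30))
W = rng.random((20, 4))
H = rng.random((4, 30))
lam = 0.5

def objective(W, H):
    # Penalized NNSC-style objective with f(h) = h.
    return np.linalg.norm(A - W @ H, "fro")**2 + lam * H.sum()

c = 10.0
print(objective(W, H))           # original value
print(objective(c * W, H / c))   # same product WH, penalty divided by c
```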
Non-negative Sparse Coding I (cont’d)
• Pros
• Simple, efficient
• With W fixed, the H-subproblem is convex, and the multiplicative update rule is guaranteed to reach its global minimum (update sketched below)
• Cons
• Sparseness is controlled only implicitly: the optimal λ must be found by trial and error
• Sparseness is constrained only for H
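As a sketch of how the multiplicative rule accommodates the penalty, here is the standard gradient-derived update for H under the penalized objective; restricting to the f(h) = h case with W held fixed is an assumption made for illustration.

```python
import numpy as np

def nnsc_update_H(A, W, H, lam, eps=1e-9):
    """Multiplicative H update for 0.5*||A - WH||_F^2 + lam * sum(H).
    The penalty lam enters the denominator, shrinking every entry of H
    toward zero and thereby encouraging sparseness."""
    return H * (W.T @ A) / (W.T @ W @ H + lam + eps)
```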
NMF with sparseness constraints II
• First we need some way to define the sparseness of a vector
• A vector with one non-zero entry is maximally sparse
• A multiple of the vector of all ones, e, is minimally sparse
• The Cauchy–Bunyakovsky–Schwarz (CBS) inequality gives ||x||₁ ≤ √n ||x||₂, with equality exactly for multiples of e; at the other extreme, ||x||₂ ≤ ||x||₁, with equality exactly when x has a single non-zero entry
• How can we combine these ideas?
Hoyer’s Sparseness Parameter
• sparseness(x) = (√n − ||x||₁ / ||x||₂) / (√n − 1)
• where n is the dimensionality of x
• The measure is 1 for a maximally sparse vector and 0 for a minimally sparse one, so a vector’s sparseness can be controlled by manipulating its L1 and L2 norms (implementation below)
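The measure translates directly into code; this sketch simply evaluates the formula above on the two extreme cases.

```python
import numpy as np

def sparseness(x):
    """Hoyer's sparseness measure: 1 for a vector with a single non-zero
    entry (maximally sparse), 0 for a multiple of the all-ones vector."""
    n = x.size
    return (np.sqrt(n) - np.abs(x).sum() / np.linalg.norm(x)) / (np.sqrt(n) - 1)

print(sparseness(np.array([0.0, 0.0, 5.0])))   # 1.0
print(sparseness(np.ones(3)))                  # 0.0
```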
Implementing Sparseness Constraints
• Now that we have an explicit measure of sparseness, how can we incorporate it into the algorithm?
• Hoyer: at each step, project each constrained column of W (or row of H) onto the nearest vector of the desired sparseness.
Hoyer’s Projection Algorithm
• Problem: given any vector x, find the closest (in the Euclidean sense) non-negative vector s with a given L1 norm and a given L2 norm
• The problem is easy to solve in the 3-dimensional case, and the construction extends to n dimensions
Hoyer’s Projection Algorithm
• Set si = xi + (L1 − Σxi)/n for all i
• Set Z = {}
• Iterate:
• Set mi = L1/(n − size(Z)) if i ∉ Z, 0 otherwise
• Set s = m + β(s − m), where β ≥ 0 solves the quadratic ||m + β(s − m)||₂ = L2
• If s is non-negative, we’re finished
• Set Z = Z ∪ {i : si < 0}
• Set si = 0 for all i ∈ Z
• Calculate c = (Σsi − L1)/(n − size(Z))
• Set si = si − c for all i ∉ Z
The Algorithm in words
• Project x onto the hyperplane Σsi = L1
• Within this hyperplane, move radially outward from the center of the sphere defined by the joint constraints toward the projected point, until ||s||₂ = L2
• If the result is non-negative, the destination has been reached
• Else, set the negative values to zero and project to a new point in the same fashion (a runnable sketch follows)
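Here is a runnable sketch of the projection, transcribing the steps above into numpy; the small numerical tolerance is an added assumption.

```python
import numpy as np

def hoyer_project(x, L1, L2):
    """Closest (Euclidean) non-negative vector to x with a given L1 norm
    and L2 norm; feasibility requires L2 <= L1 <= sqrt(len(x)) * L2."""
    n = x.size
    s = x + (L1 - x.sum()) / n           # project onto the hyperplane sum = L1
    Z = np.zeros(n, dtype=bool)          # coordinates pinned to zero
    while True:
        # Center of the constraint sphere within the hyperplane
        # (zero at pinned coordinates).
        m = np.where(Z, 0.0, L1 / (n - Z.sum()))
        d = s - m
        # Move radially outward: solve ||m + beta*d||_2 = L2 for beta >= 0.
        a, b, c = d @ d, 2 * (m @ d), m @ m - L2**2
        beta = (-b + np.sqrt(b * b - 4 * a * c)) / (2 * a)
        s = m + beta * d
        if (s >= -1e-12).all():          # non-negative: destination reached
            return np.maximum(s, 0.0)
        Z |= s < 0                       # pin negative coordinates to zero
        s[Z] = 0.0
        s[~Z] -= (s.sum() - L1) / (n - Z.sum())   # restore the L1 sum
```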
NMF with sparseness constraints
• Step 1: Initialize W and H to random positive matrices
• Step 2: If constraints apply to W, H, or both, project each column of W and/or each row of H to have unchanged L2 norm and the L1 norm that yields the desired sparseness
NMF w/ Sparseness Algorithm
• Step 3: Iterate (a sketch combining these steps follows this slide)
• If sparseness constraints on W apply:
• Set W = W − μ_W (WH − A)Hᵀ
• Project the columns of W as in Step 2
• Else, take a standard multiplicative step
• If sparseness constraints on H apply:
• Set H = H − μ_H Wᵀ(WH − A)
• Project the rows of H as in Step 2
• Else, take a standard multiplicative step
Advantages
• Sparseness is controlled explicitly, with a parameter that is easily interpreted
• The sparseness of W, H, or both can be constrained
• The number of iterations required grows very slowly with the dimensionality of the problem
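Putting the pieces together, a minimal sketch of the case where only H is constrained; the step size, iteration count, random initialization, and the helper target_l1 are illustrative assumptions, and hoyer_project from the earlier sketch is assumed to be in scope.

```python
import numpy as np

def target_l1(l2, dim, s):
    # Invert Hoyer's measure: at a fixed L2 norm, a desired
    # sparseness s in [0, 1] determines the required L1 norm.
    return l2 * (np.sqrt(dim) - s * (np.sqrt(dim) - 1))

def nmf_sparse_H(A, r, sH, n_iter=500, mu=1e-3, seed=0):
    """Sketch: sparseness constraint sH on the rows of H only; W takes
    the standard multiplicative step (hoyer_project defined above)."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    W, H = rng.random((m, r)), rng.random((r, n))
    for i in range(r):                   # Step 2: initial projection of H
        l2 = np.linalg.norm(H[i])
        H[i] = hoyer_project(H[i], target_l1(l2, n, sH), l2)
    for _ in range(n_iter):              # Step 3
        H = H - mu * W.T @ (W @ H - A)   # gradient step on constrained factor
        for i in range(r):               # keep each row's L2 norm, set its L1
            l2 = np.linalg.norm(H[i])
            H[i] = hoyer_project(H[i], target_l1(l2, n, sH), l2)
        W *= (A @ H.T) / (W @ H @ H.T + 1e-9)   # multiplicative step for W
    return W, H
```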

[Figure: iterations required vs. problem dimensionality. Dotted lines represent the minimum and maximum number of iterations; the solid line shows the average number required.]

Text Mining Results
• Text to Matrix Generator
• Dimitrios Zeimpekis and E. Gallopoulos
• University of Patras
• http://scgroup.hpclab.ceid.upatras.gr/scgroup/Projects/TMG/
• NMF with sparseness constraints code from Hoyer’s web page