
Message Passing Algorithms for Optimization




  1. Message Passing Algorithms for Optimization Nicholas Ruozzi Advisor: Sekhar Tatikonda Yale University

  2. The Problem • Minimize a real-valued objective function that factorizes as a sum of potentials: f(x_1, \ldots, x_n) = \sum_i \phi_i(x_i) + \sum_{\alpha \in A} \psi_\alpha(x_\alpha) • A is a multiset whose elements are subsets of the indices 1, …, n
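As a concrete illustration of this setup (not from the talk), the short Python sketch below stores such an objective as a dictionary of potentials keyed by the subset of variables each one depends on, and minimizes it by brute force; the example potentials and all names are invented.

```python
import itertools

# A toy factorized objective f(x1, x2, x3) = phi_1(x1) + psi_12(x1, x2) + psi_23(x2, x3).
# Each potential is keyed by the tuple of variable indices it depends on.
# (The example potentials are made up purely for illustration.)
potentials = {
    (0,):   lambda a: (a - 1) ** 2,
    (0, 1): lambda a, b: abs(a - b),
    (1, 2): lambda a, b: 2.0 if a == b else 0.0,
}

def objective(x):
    """Evaluate f at a full assignment x (a tuple with one value per variable)."""
    return sum(pot(*(x[i] for i in scope)) for scope, pot in potentials.items())

# Brute-force minimization over k values per variable: k**n evaluations.
k, n = 3, 3
best = min(itertools.product(range(k), repeat=n), key=objective)
print(best, objective(best))
```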

  3. Corresponding Graph • [Figure: the corresponding factor graph on variable nodes 1, 2, 3, with a factor node for each potential]

  4. Local Message Passing Algorithms • Pass messages on this graph to minimize f • Distributed message passing algorithm • Ideal for large scientific problems, sensor networks, etc.

  5. The Min-Sum Algorithm • Messages at time t: m^t_{i \to \alpha}(x_i) = \phi_i(x_i) + \sum_{\beta \ni i,\, \beta \neq \alpha} m^{t-1}_{\beta \to i}(x_i), \quad m^t_{\alpha \to i}(x_i) = \min_{x_\alpha : (x_\alpha)_i = x_i} \big[ \psi_\alpha(x_\alpha) + \sum_{j \in \alpha,\, j \neq i} m^{t-1}_{j \to \alpha}(x_j) \big] • [Figure: messages on a factor graph with variable nodes 1, 2, 3, 4]

  6. Computing Beliefs • The min-marginal corresponding to the ith variable is given by f_i(x_i) = \min_{x' : x'_i = x_i} f(x') • Beliefs approximate the min-marginals: b^t_i(x_i) = \phi_i(x_i) + \sum_{\alpha \ni i} m^t_{\alpha \to i}(x_i) • Estimate the optimal assignment as x^*_i \in \arg\min_{x_i} b^t_i(x_i)
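The following is a minimal sketch of these min-sum updates and beliefs on a tiny chain-structured model; the update and belief formulas follow the definitions above, while the data structures, schedule, and example numbers are assumptions made for illustration.

```python
import numpy as np

# Hypothetical chain-structured model: 3 binary variables x0 - x1 - x2.
phi = {0: np.array([0.0, 1.0]), 1: np.array([0.5, 0.0]), 2: np.array([0.0, 0.3])}
psi = {(0, 1): np.array([[0.0, 1.0], [1.0, 0.0]]),
       (1, 2): np.array([[0.0, 2.0], [2.0, 0.0]])}
k = 2  # states per variable

# Messages m_{i->alpha} and m_{alpha->i}, initialized to zero.
v2f = {(i, a): np.zeros(k) for a in psi for i in a}
f2v = {(a, i): np.zeros(k) for a in psi for i in a}

for _ in range(50):  # synchronous min-sum iterations
    new_v2f = {(i, a): phi[i] + sum(f2v[(b, i)] for b in psi if i in b and b != a)
               for (i, a) in v2f}
    new_f2v = {}
    for (a, i) in f2v:
        j = a[0] if a[1] == i else a[1]             # the other variable in the pair
        table = psi[a] if a[0] == i else psi[a].T   # orient rows to x_i, columns to x_j
        # m_{alpha->i}(x_i) = min_{x_j} [ psi_alpha(x_i, x_j) + m_{j->alpha}(x_j) ]
        new_f2v[(a, i)] = np.min(table + v2f[(j, a)][None, :], axis=1)
    v2f, f2v = new_v2f, new_f2v

# Beliefs approximate the min-marginals; estimate x* by per-variable argmin.
beliefs = {i: phi[i] + sum(f2v[(a, i)] for a in psi if i in a) for i in phi}
x_star = {i: int(np.argmin(beliefs[i])) for i in phi}
print(x_star)
```

Since this toy factor graph is a tree, the iterations converge and the estimate is exact, consistent with the convergence properties on the next slide.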

  7. Min-Sum: Convergence Properties • Iterations do not necessarily converge • Always converges when the factor graph is a tree • Converged estimates need not correspond to the optimal solution • Performs well empirically

  8. Previous Work • Prior work focused on two aspects of message passing algorithms • Convergence • Coordinate ascent schemes • Not necessarily local message passing algorithms • Correctness • No combinatorial characterization of failure modes • Concerned only with global optimality

  9. Contributions • A new local message passing algorithm • Parameterized family of message passing algorithms • Conditions under which the estimate produced by the splitting algorithm is guaranteed to be a global optimum • Conditions under which the estimate produced by the splitting algorithm is guaranteed to be a local optimum

  10. Contributions • What makes a graphical model “good”? • Combinatorial understanding of the failure modes of the splitting algorithm via graph covers • Can be extended to other iterative algorithms • Techniques for handling objective functions for which the known convergent algorithms fail • Reparameterization-centric approach

  11. Publications • Convergent and correct message passing schemes for optimization problems over graphical models, Proceedings of the 26th Conference on Uncertainty in Artificial Intelligence (UAI), July 2010 • Fixing Max-Product: A Unified Look at Message Passing Algorithms (invited talk), Proceedings of the Forty-Eighth Annual Allerton Conference on Communication, Control, and Computing, September 2010 • Unconstrained minimization of quadratic functions via min-sum, Proceedings of the Conference on Information Sciences and Systems (CISS), Princeton, NJ, USA, March 2010 • Graph covers and quadratic minimization, Proceedings of the Forty-Seventh Annual Allerton Conference on Communication, Control, and Computing, September 2009 • s-t paths using the min-sum algorithm, Proceedings of the Forty-Sixth Annual Allerton Conference on Communication, Control, and Computing, September 2008

  12. Outline • Reparameterizations • Lower Bounds • Convergent Message Passing • Finding a Minimizing Assignment • Graph covers • Quadratic Minimization

  13. The Problem • Minimize a real-valued objective function that factorizes as a sum of potentials: f(x_1, \ldots, x_n) = \sum_i \phi_i(x_i) + \sum_{\alpha \in A} \psi_\alpha(x_\alpha) • A is a multiset whose elements are subsets of the indices 1, …, n

  14. Factorizations • Some factorizations are better than others • If x_i takes one of k values, minimizing this factorization requires at most 2k^2 + k operations

  15. Factorizations • Some factorizations are better than others • Suppose the objective function factorizes even further • Then only k operations are needed to compute the minimum value!
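To make the operation counts concrete, the hypothetical comparison below minimizes f(x_1, x_2, x_3) = \psi_{12}(x_1, x_2) + \psi_{13}(x_1, x_3) two ways: brute force over all k^3 assignments, versus pushing the minimizations inside the factorization at a cost of roughly 2k^2 + k operations. The random potentials and this particular two-potential example are my own choices for illustration, not necessarily the example on the slide.

```python
import numpy as np

k = 50
rng = np.random.default_rng(0)
psi_12 = rng.random((k, k))   # psi_12(x1, x2)
psi_13 = rng.random((k, k))   # psi_13(x1, x3)

# Brute force: evaluate f(x1, x2, x3) for all k**3 triples.
brute = np.min(psi_12[:, :, None] + psi_13[:, None, :])

# Exploit the factorization:
#   min_{x1} [ min_{x2} psi_12(x1, x2) + min_{x3} psi_13(x1, x3) ]
# which costs roughly 2*k**2 + k comparisons instead of k**3 evaluations.
factored = np.min(psi_12.min(axis=1) + psi_13.min(axis=1))

assert np.isclose(brute, factored)
print(brute, factored)
```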

  16. Reparameterizations • We can rewrite the objective function as f(x) = \sum_i \big[ \phi_i(x_i) + \sum_{\alpha \ni i} m_{\alpha \to i}(x_i) \big] + \sum_\alpha \big[ \psi_\alpha(x_\alpha) - \sum_{i \in \alpha} m_{\alpha \to i}(x_i) \big] • This does not change the objective function as long as the messages are real-valued at each x_i • The objective function is reparameterized in terms of the messages

  17. Reparameterizations • We can rewrite the objective function in terms of the messages, as above • The reparameterization has the same factor graph as the original factorization • Many message passing algorithms produce a reparameterization upon convergence
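A quick numerical sanity check of this invariance, with invented numbers: adding each message to its variable's potential and subtracting it from the corresponding factor potential leaves f(x) unchanged at every assignment.

```python
import itertools
import numpy as np

k = 2
rng = np.random.default_rng(1)
phi = {0: rng.random(k), 1: rng.random(k), 2: rng.random(k)}
psi = {(0, 1): rng.random((k, k)), (1, 2): rng.random((k, k))}
msg = {(a, i): rng.random(k) for a in psi for i in a}   # arbitrary real-valued messages

def f(x, phi, psi):
    return (sum(p[x[i]] for i, p in phi.items())
            + sum(p[x[a[0]], x[a[1]]] for a, p in psi.items()))

# Reparameterize: phi_i + sum_{alpha containing i} m_{alpha->i},
#                 psi_alpha - sum_{i in alpha} m_{alpha->i}.
phi2 = {i: phi[i] + sum(msg[(a, i)] for a in psi if i in a) for i in phi}
psi2 = {a: psi[a] - msg[(a, a[0])][:, None] - msg[(a, a[1])][None, :] for a in psi}

for x in itertools.product(range(k), repeat=3):
    assert np.isclose(f(x, phi, psi), f(x, phi2, psi2))
print("reparameterization leaves f unchanged")
```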

  18. The Splitting Reparameterization • Let c be a vector of non-zero reals and write each potential as c_\alpha \cdot \big( \psi_\alpha(x_\alpha) / c_\alpha \big) (and similarly c_i \cdot ( \phi_i(x_i) / c_i )) • If c is a vector of positive integers, then we could view this as a factorization in two ways: • Over the same factor graph as the original potentials • Over a factor graph where each potential has been “split” into several pieces
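The "split into several pieces" view rests on the identity that c copies of \psi_\alpha / c sum back to \psi_\alpha; a trivial check with made-up numbers:

```python
# Splitting a potential into c copies of psi / c changes nothing:
# summing the c pieces recovers psi exactly, for any non-zero c.
c = 3

def psi(x1, x2):            # an arbitrary made-up pairwise potential
    return (x1 - x2) ** 2

def split_sum(x1, x2):      # sum of the c "split" copies
    return sum(psi(x1, x2) / c for _ in range(c))

assert all(abs(psi(a, b) - split_sum(a, b)) < 1e-9 for a in range(4) for b in range(4))
print("split factorization equals the original objective")
```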

  19. The Splitting Reparameterization • [Figures: the original factor graph on variable nodes 1, 2, 3, and the factor graph resulting from “splitting” each of the pairwise potentials 3 times]

  20. The Splitting Reparameterization • Beliefs: defined from the messages and the splitting weights c, analogously to the min-sum beliefs • Reparameterization: the beliefs again reparameterize the objective function

  21. Outline • Reparameterizations • Lower Bounds • Convergent Message Passing • Finding a Minimizing Assignment • Graph covers • Quadratic Minimization

  22. Lower Bounds • These reparameterizations lower bound the objective function: for positive c, \min_x f(x) \geq \sum_i c_i \min_{x_i} b_i(x_i) + \sum_\alpha c_\alpha \min_{x_\alpha} b_\alpha(x_\alpha) • Find the collection of messages that maximize this lower bound • The lower bound is a concave function of the messages • Use coordinate ascent or subgradient methods
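The bound itself is just "the minimum of a sum is at least the sum of the minimums", applied to the weighted, reparameterized pieces; a small check with random stand-ins for the beliefs and positive weights:

```python
import itertools
import numpy as np

k = 2
rng = np.random.default_rng(2)
b_i = {i: rng.random(k) for i in range(3)}                      # stand-ins for beliefs b_i
b_a = {(0, 1): rng.random((k, k)), (1, 2): rng.random((k, k))}  # stand-ins for beliefs b_alpha
c_i = {i: 1.0 for i in b_i}
c_a = {a: 2.0 for a in b_a}   # any positive weights work

def f(x):   # the (reparameterized) objective: a weighted sum of the belief pieces
    return (sum(c_i[i] * b_i[i][x[i]] for i in b_i)
            + sum(c_a[a] * b_a[a][x[a[0]], x[a[1]]] for a in b_a))

bound = (sum(c_i[i] * b_i[i].min() for i in b_i)
         + sum(c_a[a] * b_a[a].min() for a in b_a))
true_min = min(f(x) for x in itertools.product(range(k), repeat=3))
assert bound <= true_min + 1e-12
print(bound, true_min)
```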

  23. Lower Bounds and the MAP LP • The MAP linear program is equivalent to minimizing f • Its dual provides a lower bound on f • Messages are a side-effect of certain dual formulations
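For reference, one standard way to write the MAP LP alluded to here is the relaxation over locally consistent pseudo-marginals \tau (over the exact marginal polytope the LP is equivalent to minimizing f; the relaxation and its dual give lower bounds). This is the textbook form, not necessarily the exact formulation used in the talk:

```latex
% The MAP LP relaxation over locally consistent pseudo-marginals \tau
% (textbook form; the exact formulation in the talk may differ):
\begin{align*}
\min_{\tau \ge 0}\;\; & \sum_i \sum_{x_i} \tau_i(x_i)\,\phi_i(x_i)
    + \sum_{\alpha \in A} \sum_{x_\alpha} \tau_\alpha(x_\alpha)\,\psi_\alpha(x_\alpha) \\
\text{s.t.}\;\; & \sum_{x_\alpha : (x_\alpha)_i = x_i} \tau_\alpha(x_\alpha) = \tau_i(x_i)
    \quad \text{for all } \alpha \in A,\ i \in \alpha,\ x_i, \\
& \sum_{x_i} \tau_i(x_i) = 1 \quad \text{for all } i.
\end{align*}
```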

  24. Outline • Reparameterizations • Lower Bounds • Convergent Message Passing • Finding a Minimizing Assignment • Graph covers • Quadratic Minimization

  25. The Splitting Algorithm • A local message passing algorithm for the splitting reparameterization • Contains the min-sum algorithm as a special case • For the integer case, can be derived from the min-sum update equations

  26. The Splitting Algorithm • For certain choices of c, an asynchronous version of the splitting algorithm can be shown to be a block coordinate ascent scheme for the lower bound

  27-29. Asynchronous Splitting Algorithm • [Figures: three successive asynchronous message updates on the factor graph with variable nodes 1, 2, 3]

  30. Coordinate Ascent • Guaranteed to converge • Does not necessarily maximize the lower bound • Can get stuck in a suboptimal configuration • Can be shown to converge to the maximum in restricted cases • Pairwise-binary objective functions

  31. Other Ascent Schemes • Many other ascent algorithms are possible over different lower bounds: • TRW-S [Kolmogorov 2007] • MPLP [Globerson and Jaakkola 2007] • Max-Sum Diffusion [Werner 2007] • Norm-product [Hazan 2010] • Not all coordinate ascent schemes are local

  32. Outline • Reparameterizations • Lower Bounds • Convergent Message Passing • Finding a Minimizing Assignment • Graph covers • Quadratic Minimization

  33. Constructing the Solution • Construct an estimate, x*, of the optimal assignment from the beliefs by choosing x^*_i \in \arg\min_{x_i} b_i(x_i) • For certain choices of the vector c, if each argmin is unique, then x* minimizes f • A simple choice of c guarantees both convergence and correctness (if the argmins are unique)
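Decoding the estimate is a per-variable argmin with a uniqueness check; a minimal sketch, assuming `beliefs` is a dictionary of arrays like the one produced in the earlier min-sum sketch:

```python
import numpy as np

def decode(beliefs, tol=1e-9):
    """Return the estimate x* if every belief has a unique argmin, else None."""
    x_star = {}
    for i, b in beliefs.items():
        srt = np.sort(b)
        if len(b) > 1 and srt[1] - srt[0] <= tol:   # tied argmin: cannot decode this variable
            return None
        x_star[i] = int(np.argmin(b))
    return x_star

# Usage: decode(beliefs) on the beliefs computed by the min-sum sketch above.
```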

  34. Correctness • If the argmins are not unique, then we may not be able to construct a solution • When does the algorithm converge to the correct minimizing assignment?

  35. Outline • Reparameterizations • Lower Bounds • Convergent Message Passing • Finding a Minimizing Assignment • Graph covers • Quadratic Minimization

  36. Graph Covers • A graph H covers a graph G if there is a homomorphism from H to G that is a bijection on neighborhoods • [Figure: a graph G on nodes 1, 2, 3 and a 2-cover of G on nodes 1, 2, 3, 1', 2', 3']
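A 2-cover can be constructed mechanically: duplicate every node, then lift each original edge either "parallel" or "crossed" between the copies; every combination of choices preserves neighborhoods. A small illustrative constructor (my own helper, not from the talk):

```python
import itertools

def two_covers(edges):
    """Yield the edge set of every 2-cover of the simple graph with the given edges.

    Each node v is copied as (v, 0) and (v, 1); each original edge (u, v) is lifted
    either "parallel" or "crossed", and any combination of choices is a valid 2-cover
    (the projection (v, s) -> v is a bijection on every neighborhood)."""
    for flips in itertools.product([0, 1], repeat=len(edges)):
        yield [((u, s), (v, s ^ flip))
               for (u, v), flip in zip(edges, flips)
               for s in (0, 1)]

# Example: the triangle on nodes 1, 2, 3.
triangle = [(1, 2), (2, 3), (1, 3)]
covers = list(two_covers(triangle))
print(covers[0])    # all-parallel lift: two disjoint copies of the triangle
print(covers[-1])   # all-crossed lift: a single 6-cycle
```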

  37. Graph Covers • The potential at each node of the cover is a copy (“lift”) of the potential at the node it covers • [Figure: the lifted potentials on G and on a 2-cover of G]

  38. Graph Covers • The lifted potentials define a new objective function • Objective function on G: the original f • 2-cover objective function: the sum of the lifted potentials over the two copies of each variable

  39. Graph Covers • Indistinguishability: for any cover and any choice of initial messages on the original graph, there exists a choice of initial messages on the cover such that the messages passed by the splitting algorithm are identical on both graphs • For choices of c that guarantee correctness, any assignment that uniquely minimizes each belief must also minimize the objective function corresponding to any finite cover

  40. Maximum Weight Independent Set • [Figure: an independent set instance on a graph G with nodes 1, 2, 3 and on a 2-cover of G with nodes 1, 2, 3, 1', 2', 3']

  41-44. Maximum Weight Independent Set • [Figures: vertex weights on G and on its 2-cover, with the chosen independent sets on each graph]
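A classic unit-weight instance illustrates the point of these figures (the weights below are my own substitution for the ones on the slides): a triangle has a maximum independent set of size 1, while its connected 2-cover, the 6-cycle, has one of size 3 > 2 * 1, so an optimum on the cover does not project down to an optimum on the original graph. A brute-force check:

```python
import itertools

def mwis_value(nodes, edges, weight):
    """Brute-force maximum weight independent set value."""
    best = 0.0
    for r in range(len(nodes) + 1):
        for subset in itertools.combinations(nodes, r):
            chosen = set(subset)
            if all(not (u in chosen and v in chosen) for u, v in edges):
                best = max(best, sum(weight[v] for v in subset))
    return best

triangle = ([1, 2, 3], [(1, 2), (2, 3), (1, 3)])
six_cycle = ([1, 2, 3, 4, 5, 6],
             [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 1)])  # a 2-cover of the triangle

print(mwis_value(*triangle, {v: 1.0 for v in triangle[0]}))     # 1.0 on the original graph
print(mwis_value(*six_cycle, {v: 1.0 for v in six_cycle[0]}))   # 3.0 > 2 * 1.0 on the 2-cover
```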

  45. More Graph Covers • If covers of the factor graph have different solutions • The splitting algorithm cannot converge to the correct answer for choices of c that guarantee correctness • The min-sum algorithm may converge to an assignment that is optimal on a cover • There are applications for which the splitting algorithm always works • Minimum cuts, shortest paths, and more…

  46. Graph Covers • Suppose f factorizes over a multiset A with corresponding factor graph G, and the choice of c guarantees correctness • Theorem: the splitting algorithm can only converge to beliefs that have unique argmins if • f is uniquely minimized at the assignment x* • The objective function corresponding to every finite cover H of G has a unique minimum that is a lift of x*

  47. Graph Covers • This result suggests that • There is a close link between “good” factorizations and the difficulty of a problem • Convergent and correct algorithms are not ideal for all applications • Convex functions can be covered by functions that are not convex

  48. Outline • Reparameterizations • Lower Bounds • Convergent Message Passing • Finding a Minimizing Assignment • Graph covers • Quadratic Minimization

  49. Quadratic Minimization • Minimize f(x) = \frac{1}{2} x^T \Gamma x - h^T x • \Gamma symmetric positive definite implies a unique minimum • Minimized at x^* = \Gamma^{-1} h

  50. Quadratic Minimization • For a symmetric positive definite matrix, if min-sum converges then the resulting estimate is the correct minimizer • Min-sum is not guaranteed to converge for all symmetric positive definite matrices
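Below is a sketch of min-sum specialized to quadratics (equivalent to Gaussian belief propagation), with each message parameterized by a quadratic and a linear coefficient; the update derivation is the standard one, and the example matrix is an invented diagonally dominant case where convergence is expected. For general symmetric positive definite \Gamma the iterations may fail to converge, as the slide notes.

```python
import numpy as np

def min_sum_quadratic(Gamma, h, iters=200):
    """Min-sum for f(x) = 0.5 x^T Gamma x - h^T x, with self potentials
    0.5*Gamma_ii*x_i**2 - h_i*x_i and pairwise potentials Gamma_ij*x_i*x_j.
    Each message m_{i->j}(x_j) is parameterized as 0.5*a*x_j**2 + c*x_j."""
    n = len(h)
    a = np.zeros((n, n))   # a[i, j]: quadratic coefficient of m_{i->j}
    c = np.zeros((n, n))   # c[i, j]: linear coefficient of m_{i->j}
    nbrs = [[j for j in range(n) if j != i and Gamma[i, j] != 0.0] for i in range(n)]
    for _ in range(iters):
        a_new, c_new = np.zeros_like(a), np.zeros_like(c)
        for i in range(n):
            for j in nbrs[i]:
                alpha = Gamma[i, i] + sum(a[k, i] for k in nbrs[i] if k != j)
                beta = -h[i] + sum(c[k, i] for k in nbrs[i] if k != j)
                # minimize 0.5*alpha*x_i**2 + (beta + Gamma[i,j]*x_j)*x_i over x_i
                a_new[i, j] = -Gamma[i, j] ** 2 / alpha
                c_new[i, j] = -Gamma[i, j] * beta / alpha
        a, c = a_new, c_new
    # beliefs: b_i(x_i) = 0.5*(Gamma_ii + sum_k a[k,i])*x_i**2 + (-h_i + sum_k c[k,i])*x_i
    prec = np.array([Gamma[i, i] + sum(a[k, i] for k in nbrs[i]) for i in range(n)])
    lin = np.array([-h[i] + sum(c[k, i] for k in nbrs[i]) for i in range(n)])
    return -lin / prec   # per-variable minimizer of each belief

# Invented diagonally dominant example, where convergence is expected.
Gamma = np.array([[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 5.0]])
h = np.array([1.0, 2.0, 3.0])
print(min_sum_quadratic(Gamma, h))
print(np.linalg.solve(Gamma, h))   # the two should agree when the iterations converge
```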
