
Improving Performance of The Interior Point Method by Preconditioning


Presentation Transcript


  1. Improving Performance of the Interior Point Method by Preconditioning. Project Proposal by: Ken Ryals. For: AMSC 663-664, Fall 2007 - Spring 2008.

  2. Background The Interior Point Method (IPM) solves a sequence of constrained optimization problems such that the sequence of solutions approaches the true solution from within the "valid" region. As the constraints are "relaxed" and the problem is re-solved, the numerical properties of the problem often become more "interesting". [Figure: the sequence of solutions as the barrier parameter μ decreases from μ0.]

  3. Application Why is the IPM of interest?
  • It applies to a wide range of problem types:
  • Linear Constrained Optimization
  • Semidefinite Problems
  • Second Order Cone Problems
  • Once in the "good region" of a solution to the set of problems in the solution path (of μ's):
  • Convergence properties are great ("quadratic").
  • It keeps the iterates in the "valid" region.
  Specific Research Problem:
  • Optimization of Distributed Command and Control

  4. Optimization Problem The linear optimization problem can be formulated as follows: inf{ c^T x | Ax = b, x ≥ 0 }. The search direction is implicitly defined by the system:
  Δx + π Δz = r
  A Δx = 0
  A^T Δy + Δz = 0.
  For this, the Reduced Equation is: A π A^T Δy = −A r (= b)
  • From Δy we can get Δx = r − π(−A^T Δy).
  Note: The Nesterov-Todd direction corresponds to π = D ⊗ D, where π z = x; i.e., D is the metric geometric mean of X and Z^(-1): D = X^(1/2) (X^(1/2) Z X^(1/2))^(-1/2) X^(1/2).
  Here x is the unknown, y is the "dual" of x, and z is the "slack".
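  A minimal Matlab sketch of one search-direction solve under these definitions; this is an illustration, not the author's implementation. The inputs A, r and the strictly positive iterates x, z are assumed, and all step-length and centering logic is omitted:

    % One IPM search-direction solve with Nesterov-Todd scaling (LP case).
    % Assumed inputs: dense A (m x n), residual r, iterates x > 0, z > 0.
    X  = diag(x);   Z = diag(z);
    Xh = sqrtm(X);                      % X^(1/2)
    D  = Xh / sqrtm(Xh*Z*Xh) * Xh;      % D = X^(1/2)(X^(1/2) Z X^(1/2))^(-1/2) X^(1/2)
    Pi = D*D;                           % pi = D^2, so that Pi*z = x
    dy = (A*Pi*A') \ (-A*r);            % Reduced Equation: A pi A' dy = -A r
    dz = -A'*dy;                        % from A' dy + dz = 0
    dx = r - Pi*dz;                     % from dx + pi dz = r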

  5. The Problem From these three equations, the Reduced Equation for Δy is: A π A^T Δy = −A r (= b). The optimization problem is "reduced" to solving a system of linear equations to generate the next solution estimate. If π cannot be evaluated with sufficient accuracy, solving these equations becomes pointless due to error propagation.
  Aside: Numerically, evaluating r − π Δz can also be a challenge. Namely, if the iterates converge, then Δx (= r − π Δz) approaches zero, but r does not; hence the accuracy of Δx can suffer, too. (A small illustration follows.)
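  A tiny Matlab illustration of that cancellation, assuming standard double precision: r stays O(1) while the true Δx shrinks, so the computed difference loses relative accuracy and eventually returns zero.

    % Cancellation in dx = r - Pi*dz as the iterates converge.
    r = 1.0;                        % r stays O(1) at convergence
    for k = 1:4
        dx_true = 10^(-4*k);        % the true dx shrinks toward zero
        pidz    = r - dx_true;      % so Pi*dz approaches r
        dx      = r - pidz;         % difference of two nearly equal numbers
        fprintf('dx ~ %8.1e   relative error %8.1e\n', ...
                dx_true, abs(dx - dx_true)/dx_true);
    end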

  6. Example: A Poorly-Conditioned Problem Consider a simple problem, then change A(1,1) to make the constraint matrix ill-conditioned. [Figure: the example constraint matrix, before and after the change.]
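  The slide's constraint matrix did not survive transcription, so here is a hypothetical stand-in in Matlab: a small, well-conditioned A whose (1,1) entry is inflated until A, and especially A A^T, becomes ill-conditioned.

    % Hypothetical stand-in for the slide's example constraint matrix.
    A = [1 2 0 1;
         0 1 1 2;
         1 0 1 0];                     % small 3x4 constraint matrix
    fprintf('cond(A) before: %g\n', cond(A));
    A(1,1) = 1e8;                      % one dominant entry wrecks the scaling
    fprintf('cond(A) after:  %g\n', cond(A));
    fprintf('cond(A*A''):     %g\n', cond(A*A'));   % conditioning roughly squares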

  7. Observations
  • The condition number of A D^2 A^T exhibits an interesting dip around iterations 4-5, when the solution enters the region of the answer.
  • How can we exploit whatever caused the dip?
  • The standard approach is to use factorization to improve numerical performance.
  • The Cholesky factorization is: U^T U = A π A^T
  • Factoring a matrix into two components often trades one matrix with a condition of M for two matrices with conditions of ≈ √M (checked in the sketch below).
  • My conjecture is that A A^T and D^2 interacted to lessen the impact of the ill-conditioning in A ⇒ can we precondition with A A^T somehow?
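  The √M claim is easy to check in Matlab; a minimal sketch, assuming K = A π A^T is symmetric positive definite and reusing A, Pi, r from the earlier sketch:

    % Factoring trades condition M for two factors of condition ~sqrt(M).
    K = A*Pi*A';                 % SPD Reduced-Equation matrix
    U = chol(K);                 % upper triangular, with U'*U = K
    fprintf('cond(K) = %g, cond(U) = %g, sqrt(cond(K)) = %g\n', ...
            cond(K), cond(U), sqrt(cond(K)));
    dy = U \ (U' \ (-A*r));      % solve via two triangular solves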

  8. Conjecture - Math We are solving: A π A^T Δy = −A r
  A is not square, so it isn't invertible; but A A^T is... What if we pre-multiplied by it?
  (A A^T)^(-1) A π A^T Δy = −(A A^T)^(-1) A r (= −(A A^T)^(-1) b)
  Note: Conceptually, if A were invertible this would collapse to
  (A A^T)^(-1) A π A^T Δy = (A^T)^(-1) π A^T Δy.
  Since this looks like a similarity transform, it might have "nice" properties...
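  A direct Matlab check of the conjecture; forming (A A^T)^(-1) explicitly is fine for this small-scale experiment but never for production use:

    % Compare condition numbers before and after the conjectured preconditioning.
    K  = A*Pi*A';               % Reduced-Equation matrix
    M  = A*A';                  % conjectured preconditioner
    Kp = M \ K;                 % (A*A')^(-1) * A*Pi*A', formed explicitly
    fprintf('cond(K)  = %g\n', cond(K));
    fprintf('cond(M)  = %g\n', cond(M));
    fprintf('cond(Kp) = %g\n', cond(Kp));
    dy = Kp \ (M \ (-A*r));     % preconditioned Reduced Equation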

  9. Conjecture - Numbers Revisit the ill-conditioned simple problem (here π is D^2):
  Condition of A π A^T = 4.2e+14
  Condition of A A^T = 4.0e+14
  Condition of (A A^T)^(-1) A D^2 A^T = 63.053 (which is a little less than 10^14)
  How much would it cost?
  • A A^T is m × m (m = number of constraints).
  • Neither A A^T nor (A A^T)^(-1) is likely to be sparse...
  Let's try it anyway... (If it behaves nicely, it might be worth figuring out how to do it efficiently.)

  10. Experiment - Results It does work ☺
  • The condition number stays low (< 1000) instead of hovering in the 10^14 range.
  It costs more ☹
  • The inverse of A A^T is needed once.
  • (A A^T)^(-1) gets used every iteration.
  The inverse is needed later in the process, rather than early; thus, it could be developed iteratively as the iterations proceed... [Figure: condition number per iteration, annotated where the solution enters the region of "Newton" convergence.]

  11. Project Proposal
  • Develop a system for preconditioned IPM.
  • Create a Matlab version to define the structure of the system.
  • Permits validation against SeDuMi and SDPT3.
  • Transition from Matlab to pure C++.
  • Create MEX modules from the C++ code for use in Matlab.
  • Apply the technique to the Distributed C2 problem.
  • Modify the system to develop (A A^T)^(-1) iteratively or to solve the system of equations iteratively.
  • Can we use something like the Sherman-Morrison-Woodbury formula? (A − Z V^T)^(-1) = A^(-1) + A^(-1) Z (I − V^T A^(-1) Z)^(-1) V^T A^(-1)
  • Can the system be solved using the Preconditioned Conjugate Gradient Method? (A sketch follows this list.)
  • Time permitting, "parallelize" the system:
  • An inverse-generation branch, and
  • An iteration branch using the current version of the inverse.
  Testing: Many test optimization problems can be found online; "AFIRO" is a good start (used in AMSC 607 - Advanced Numerical Optimization).
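  For the last two technical bullets, one possible Matlab prototype, a sketch only, uses the built-in pcg with A A^T as the conjectured preconditioner; the function handle keeps A π A^T from ever being formed, and the variable names follow the earlier sketches:

    % Solve A*Pi*A' * dy = -A*r by preconditioned conjugate gradients,
    % with A*A' as the preconditioner, applied via its Cholesky factor.
    applyK = @(v) A*(Pi*(A'*v));        % K*v without forming K = A*Pi*A'
    M   = A*A';                         % preconditioner, factored once
    R   = chol(M);                      % M = R'*R
    rhs = -A*r;
    [dy, flag, relres, iters] = pcg(applyK, rhs, 1e-10, 200, R', R);
    fprintf('pcg flag %d, relres %g after %d iterations\n', flag, relres, iters);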
