
Parallel Software for SemiDefinite Programming with Sparse Schur Complement Matrix


Presentation Transcript


  1. Parallel Software for SemiDefinite Programming with Sparse Schur Complement Matrix Makoto Yamashita @ Tokyo-Tech Katsuki Fujisawa @ Chuo University Mituhiro Fukuda @ Tokyo-Tech Yoshiaki Futakata @ University of Virginia Kazuhiro Kobayashi @ National Maritime Research Institute Masakazu Kojima @ Tokyo-Tech Kazuhide Nakata @ Tokyo-Tech Maho Nakata @ RIKEN ISMP 2009 @ Chicago [2009/08/26]

  2. Extremely Large SDPs • Arising from various fields • Quantum Chemistry • Sensor Network Problems • Polynomial Optimization Problems • Most of the computation time is spent on the Schur complement matrix (SCM) • [SDPARA] Parallel computation for the SCM • In particular, sparse SCM

  3. Outline • Semidefinite Programming and the Schur complement matrix • Parallel implementation • Parallelization for the sparse Schur complement matrix • Numerical results • Future work

  4. Standard form of SDP
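For reference, the primal-dual pair of a standard-form SDP can be written as follows (a sketch in commonly used notation; the constraint matrices $A_i$, cost matrix $C$, and right-hand side $b$ introduced here are reused in the sketches below):

```latex
% Primal-dual pair of a standard-form SDP with m constraint matrices
% A_1,...,A_m, cost matrix C, and right-hand side b (notation assumed here).
\begin{align*}
\text{(P)}\quad & \min_{X}\; C \bullet X
   && \text{s.t. } A_i \bullet X = b_i \ (i = 1,\dots,m),\quad X \succeq 0,\\
\text{(D)}\quad & \max_{y,\,Z}\; \sum_{i=1}^{m} b_i y_i
   && \text{s.t. } \sum_{i=1}^{m} y_i A_i + Z = C,\quad Z \succeq 0,
\end{align*}
% where C \bullet X = Tr(CX) and X \succeq 0 means X is symmetric
% positive semidefinite.
```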

  5. Primal-Dual Interior-Point Methods
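In a primal-dual interior-point method, each iteration computes a search direction $(\Delta y, \Delta X, \Delta Z)$ from a Newton-type system; eliminating $\Delta X$ and $\Delta Z$ leaves a positive definite linear system whose coefficient matrix is the Schur complement matrix. A sketch, assuming the notation above:

```latex
% Schur complement system solved at every interior-point iteration
% (m is the number of equality constraints, r collects residual terms).
B \,\Delta y = r, \qquad B \in \mathbb{R}^{m \times m},\quad B = B^{\top} \succ 0 .
```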

  6. Exploitation of Sparsity in Computation for the Search Direction • 1. ELEMENTS: evaluation of the Schur complement matrix • 2. CHOLESKY: Cholesky factorization of the Schur complement matrix (see the formula sketched below)
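For the HRVW/KSH/M search direction used in the SDPA family, the entries of the Schur complement matrix take the following well-known form; ELEMENTS refers to evaluating these entries, and CHOLESKY to factorizing the resulting matrix:

```latex
% ELEMENTS: entries of the Schur complement matrix (HRVW/KSH/M direction).
B_{ij} = A_i \bullet \bigl( X A_j Z^{-1} \bigr)
       = \mathrm{Tr}\bigl( A_i X A_j Z^{-1} \bigr),
       \qquad i,j = 1,\dots,m,
% CHOLESKY: factorization of the resulting positive definite matrix.
\qquad B = L L^{\top}.
```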

  7. Bottlenecks on a Single Processor (time in seconds; Opteron 246, 2.0 GHz) • Apply parallel computation to the bottlenecks

  8. SDPARA http://sdpa.indsys.chuo-u.ac.jp/sdpa/ • Parallel version of SDPA (a generic SDP solver) • MPI & ScaLAPACK • Row-wise distribution for ELEMENTS • Parallel Cholesky factorization for CHOLESKY

  9. Row-wise distribution for the evaluation of the Schur complement matrix • Example with 4 CPUs • Each CPU computes only its assigned rows • No communication between CPUs • Efficient memory management (a distribution sketch follows below)
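As an illustration only (not SDPARA's actual code), the sketch below distributes the rows of the m×m Schur complement matrix cyclically over MPI ranks, so each rank evaluates just its own rows and no inter-process communication is needed during ELEMENTS; `evaluate_element` is a hypothetical stand-in for the B_ij formula above.

```cpp
// Sketch of a cyclic row-wise distribution of the m x m Schur complement
// matrix over MPI ranks (illustrative only; not SDPARA's actual code).
#include <mpi.h>
#include <vector>
#include <cstdio>

// Hypothetical stand-in for B_ij = Tr(A_i X A_j Z^{-1}).
static double evaluate_element(int i, int j) { return 1.0 / (1.0 + i + j); }

int main(int argc, char** argv) {
  MPI_Init(&argc, &argv);
  int rank = 0, nprocs = 1;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

  const int m = 1000;                       // size of the Schur complement matrix
  std::vector<std::vector<double>> my_rows; // only the rows owned by this rank

  // Cyclic assignment: this rank owns rows rank, rank+nprocs, rank+2*nprocs, ...
  for (int i = rank; i < m; i += nprocs) {
    std::vector<double> row(m);
    for (int j = 0; j < m; ++j) row[j] = evaluate_element(i, j);
    my_rows.push_back(std::move(row));
  }

  std::printf("rank %d evaluated %zu rows\n", rank, my_rows.size());
  MPI_Finalize();
  return 0;
}
```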

  10. Parallel Cholesky factorization • We adopt ScaLAPACK for the Cholesky factorization of the Schur complement matrix • We redistribute the matrix from the row-wise distribution to a two-dimensional block-cyclic distribution (sketched below)
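The redistribution target is ScaLAPACK's two-dimensional block-cyclic layout. The small sketch below (the process grid and block size are assumed values, not taken from the slides) shows which process owns each block, which is exactly the mapping the redistribution step has to realize.

```cpp
// Sketch of the 2D block-cyclic mapping used by ScaLAPACK: block (bi, bj)
// of a matrix partitioned into nb x nb blocks is owned by process
// (bi mod Pr, bj mod Pc) on a Pr x Pc grid.  Grid shape, block size, and
// matrix dimension here are illustrative assumptions.
#include <cstdio>

struct ProcCoord { int prow, pcol; };

ProcCoord block_cyclic_owner(int bi, int bj, int Pr, int Pc) {
  return { bi % Pr, bj % Pc };
}

int main() {
  const int Pr = 2, Pc = 2;   // assumed 2 x 2 process grid
  const int nb = 64;          // assumed block size
  const int n  = 256;         // assumed matrix dimension
  const int nblocks = (n + nb - 1) / nb;

  for (int bi = 0; bi < nblocks; ++bi)
    for (int bj = 0; bj < nblocks; ++bj) {
      ProcCoord p = block_cyclic_owner(bi, bj, Pr, Pc);
      std::printf("block (%d,%d) -> process (%d,%d)\n", bi, bj, p.prow, p.pcol);
    }
  return 0;
}
```

Spreading consecutive blocks over the whole grid is what gives the dense Cholesky its load balance, at the price of the redistribution from the row-wise layout.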

  11. Computation time on an SDP from Quantum Chemistry [LiOH] • AIST Super Cluster, Opteron 246 (2.0 GHz), 6 GB memory/node

  12. Scalability on an SDP from Quantum Chemistry [NF] • ELEMENTS: 63 times • CHOLESKY: 39 times • Total: 29 times • Parallelizing ELEMENTS is very effective

  13. Sparse Schur complement matrix • The Schur complement matrix becomes very sparse for some applications: from Control Theory (100% dense) vs. from Sensor Network (2.12% nonzeros) ⇒ the simple row-wise distribution loses its efficiency

  14. Sparseness of the Schur complement matrix • Many applications have a diagonal block structure

  15. Exploitation of Sparsity in SDPA • We switch among the formulas F1, F2, F3 row by row (sketched below)
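The slide does not spell the formulas out, so the following is only a sketch of the idea behind F1, F2, F3, following the SDPA literature; the exact bookkeeping in the code may differ. For row i, put U_i = X A_i Z^{-1}, so that B_{ij} = A_j • U_i by the symmetry of B:

```latex
% Sketch of the three formulas (notation from the earlier slides; the exact
% SDPA bookkeeping may differ).  For row i, let U_i = X A_i Z^{-1}.
\begin{align*}
\text{F1:}\quad & B_{ij} = A_j \bullet U_i,
  \quad U_i \text{ formed by two dense } n \times n \text{ multiplications},\\
\text{F2:}\quad & U_i = \sum_{(\gamma,\delta)\,:\,(A_i)_{\gamma\delta}\neq 0}
  (A_i)_{\gamma\delta}\, X_{:,\gamma}\,(Z^{-1})_{\delta,:}
  \quad\text{(exploits the sparsity of } A_i\text{)},\\
\text{F3:}\quad & B_{ij} = \sum_{(\alpha,\beta)\,:\,(A_j)_{\alpha\beta}\neq 0}
  (A_j)_{\alpha\beta}
  \sum_{(\gamma,\delta)\,:\,(A_i)_{\gamma\delta}\neq 0}
  (A_i)_{\gamma\delta}\, X_{\alpha\gamma}\,(Z^{-1})_{\delta\beta}
  \quad\text{(exploits both } A_i \text{ and } A_j\text{)}.
\end{align*}
```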

  16. ELEMENTS for the Sparse Schur complement

  17. CHOLESKY for the Sparse Schur complement • Parallel sparse Cholesky factorization implemented in MUMPS • MUMPS adopts the multifrontal method • Memory storage on each processor should be consecutive; the distribution used for ELEMENTS matches this requirement (a minimal MUMPS sketch follows below)
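As a rough illustration of how a symmetric positive definite matrix is handed to MUMPS through its C interface: a minimal sketch with a tiny made-up matrix and the older-style `nz`/`irn`/`jcn` coordinate input, not SDPARA's actual integration code.

```cpp
// Minimal sketch of factorizing a small SPD matrix with MUMPS' C interface
// (illustration only; not SDPARA's actual integration code).
#include <mpi.h>
#include <dmumps_c.h>
#include <cstdio>

int main(int argc, char** argv) {
  MPI_Init(&argc, &argv);
  int rank = 0;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);

  // Tiny 2x2 SPD example in coordinate format, lower triangle only,
  // 1-based (Fortran-style) indices, as MUMPS expects.
  MUMPS_INT n     = 2;
  MUMPS_INT nz    = 3;
  MUMPS_INT irn[] = {1, 2, 2};
  MUMPS_INT jcn[] = {1, 1, 2};
  double    a[]   = {4.0, 1.0, 3.0};

  DMUMPS_STRUC_C id;
  id.comm_fortran = -987654;   // MUMPS convention for MPI_COMM_WORLD
  id.par = 1;                  // host participates in the computation
  id.sym = 1;                  // symmetric positive definite
  id.job = -1;                 // initialize
  dmumps_c(&id);

  if (rank == 0) {             // centralized matrix input on the host
    id.n = n; id.nz = nz; id.irn = irn; id.jcn = jcn; id.a = a;
  }
  id.job = 4;                  // analysis + numerical factorization
  dmumps_c(&id);

  if (rank == 0) std::printf("factorization done, INFOG(1) = %d\n", (int)id.infog[0]);

  id.job = -2;                 // release MUMPS internal data
  dmumps_c(&id);
  MPI_Finalize();
  return 0;
}
```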

  18. Computation time for SDPs from Polynomial Optimization Problems • Parallel sparse Cholesky achieves mild scalability • ELEMENTS attains a 24x speed-up on 32 CPUs • tsubasa, Xeon E5440 (2.83 GHz), 8 GB memory/node

  19. ELEMENTS load balance on 32 CPUs • Only the first processor has a slightly heavier computation load

  20. Automatic selection of sparse / dense SCM • Dense parallel Cholesky achieves higher scalability than sparse parallel Cholesky • Dense becomes better when many processors are used • We estimate both computation times from the computational cost and the scalability (a cost-model sketch follows below)
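The slide does not give the actual selection rule, so the sketch below is only a hypothetical cost model: take estimated serial factorization costs for the dense and the sparse Cholesky, scale each by an assumed speedup curve on p processors, and pick the cheaper one.

```cpp
// Hypothetical cost model for choosing between dense and sparse parallel
// Cholesky of the Schur complement matrix.  The flop counts and the
// speedup curves below are illustrative assumptions, not SDPARA's rule.
#include <cstdio>

enum class Backend { Dense, Sparse };

// dense_flops / sparse_flops: estimated serial factorization costs
// (e.g. m^3/3 for the dense case, a symbolic-factorization estimate for
// the sparse case); p: number of processors.
Backend choose_cholesky(double dense_flops, double sparse_flops, int p) {
  double dense_speedup  = 0.9 * p;                   // assumed near-linear scaling
  double sparse_speedup = p / (1.0 + 0.2 * (p - 1)); // assumed saturating scaling
  double dense_time  = dense_flops  / dense_speedup;
  double sparse_time = sparse_flops / sparse_speedup;
  return (sparse_time < dense_time) ? Backend::Sparse : Backend::Dense;
}

int main() {
  const double dense_flops  = 1.0e12;  // hypothetical serial dense cost
  const double sparse_flops = 2.0e11;  // hypothetical serial sparse cost
  const int procs[] = {1, 4, 16, 64};
  for (int p : procs) {
    Backend b = choose_cholesky(dense_flops, sparse_flops, p);
    std::printf("p = %2d -> %s Cholesky\n", p,
                b == Backend::Sparse ? "sparse" : "dense");
  }
  return 0;
}
```

With these made-up numbers the choice switches from sparse to dense once enough processors are used, which is the behaviour described on the slide.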

  21. Sparse/Dense CHOLESKY for a small SDP from POP tsubasaXeon E5440 (2.83GHz) 8GB memory/node Only on 4 CPUs, the auto selection failed. (since scalability on sparse cholesky is unstable on 4 CPUs.)

  22. Numerical Results • Comparison with PCSDP • Sensor Network Problems generated by SFSDP • Multi-threading • Quantum Chemistry

  23. SDPs from Sensor Network problems (time unit: seconds)

  24. MPI + Multi-threading for Quantum Chemistry • N.4P.DZ.pqgt11t2p (m = 7230), time in seconds • 64x speed-up on 16 nodes × 8 threads

  25. Concluding Remarks & Future Work • New parallel schemes for the sparse Schur complement matrix • Reasonable scalability • Extremely large-scale SDPs with sparse Schur complement matrices • Improvement of multi-threading for the sparse Schur complement matrix
