
What is the most important kernel of sparse linear solvers for heterogeneous supercomputers?

Shengxin Zhu, The University of Oxford; Prof. Xingping Liu and Prof. Tongxiang Gu, National Key Laboratory of Computational Physics, Institute of Applied Physics and Computational Mathematics.


Presentation Transcript


  1. What is the most important kernel of sparse linear solvers for heterogeneous supercomputers?
  Shengxin Zhu, The University of Oxford
  Prof. Xingping Liu and Prof. Tongxiang Gu, National Key Laboratory of Computational Physics, Institute of Applied Physics and Computational Mathematics
  SNSCC'12, shengxin.zhu@maths.ox.ac.uk

  2. Outline
  • Brief introduction to heterogeneous supercomputers
  • Computational kernels of Krylov methods
  • Influence of communication
  • Case study: GPBiCG(m,l)
  • Challenging problems
  • Conclusion

  3. Introduction to heterogeneous supercomputers
  • Dawning5000A
  • Nodes:
  • Bandwidth:
  • Memory:

  4. Computational kernels of Krylov methods
  • Vector update: parallel in nature.
  • Mat-vec: computation-intensive; handled with multi-core technology (CUDA/OpenMP).
  • Inner product: communication-intensive (CPU/MPI). A sketch follows.
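
  To make the contrast concrete, here is a minimal C/MPI sketch (an illustration, not code from the slides) of the two extreme kernels on a block-row-distributed vector: the vector update touches only local data, while the inner product ends in a global reduction that synchronizes all processes. The mat-vec (not shown) sits in between, needing only neighbour-to-neighbour exchanges.

    #include <mpi.h>

    /* Vector update (axpy): purely local, no communication at all. */
    void axpy(int n_loc, double alpha, const double *x, double *y) {
        for (int i = 0; i < n_loc; ++i)
            y[i] += alpha * x[i];
    }

    /* Inner product: a local partial sum followed by a global
     * reduction -- every process must stop and synchronize here. */
    double dot(int n_loc, const double *x, const double *y, MPI_Comm comm) {
        double local = 0.0, global = 0.0;
        for (int i = 0; i < n_loc; ++i)
            local += x[i] * y[i];
        MPI_Allreduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, comm);
        return global;
    }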

  5. Influence of communication: a first glance
  Computation is cheap; communication is expensive.
  Based on Aztec, by Prof. Tuminaro et al. at Sandia. S. Zhu, MSc thesis, CAEP, 2010.

  6. The real reason communication is time-consuming
  • A small workshop: focused, so little preparation time.
  • A conference: diverse, so much more preparation time.
  (Local, point-to-point exchanges are the workshops; global reductions are the conferences.)

  7. Strategies for minimizing communication
  • Replace the dot product with something else (semi-Chebyshev): workshops only, no conferences if possible. Inner-product-free methods: Gu, Liu and Mo (2002).
  • Reorganize the algorithm (reduce the number of conferences and let each conference accept more talks): residual replacement strategies due to van der Vorst (2000s); CA-KSMs, Demmel et al. (2008).
  • Overlap communication with computation (see the sketch after this list).
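
  For the third strategy, a minimal sketch of one way to realize the overlap, using the MPI-3 nonblocking all-reduce (the slides do not fix an API; MPI_Iallreduce is one natural choice): start the reduction, do any computation that does not depend on its result, then wait.

    #include <mpi.h>

    /* local_work() stands for any computation independent of the
     * reduction result, e.g. part of the next mat-vec. */
    double overlapped_dot(double local, MPI_Comm comm,
                          void (*local_work)(void)) {
        double global;
        MPI_Request req;
        MPI_Iallreduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, comm, &req);
        local_work();                      /* hides the reduction latency */
        MPI_Wait(&req, MPI_STATUS_IGNORE); /* result is needed from here  */
        return global;
    }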

  8. A case study: parallelizing GPBiCG(m,l) (S. Fujino, 2002)
  • GPBiCG(1,0) reduces to BiCGSTAB
  • GPBiCG(0,1) reduces to GPBiCG
  • GPBiCG(1,1) reduces to BiCGSTAB2
  The family could be used to design breakdown-free BiCGSTAB methods.

  9. GPBiCG(m,l) (S. Fujino, 2002)

  10. GPBiCG(m,l) (S. Fujino, 2002)

  11. Algorithm Design of the PGPBiCG(m,l) Method

  12. The PGPBiCG(m,l) method (reducing the number of global communications)
  Algorithm reconstruction: the three global communications (global synchronization points) per iteration are merged into a single one.
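
  The reconstruction idea can be sketched as follows (variable names are illustrative, not taken from the algorithm): compute all the partial sums the iteration needs, pack them into one buffer, and pay for a single all-reduce instead of three.

    #include <mpi.h>

    /* Three inner products fused into one global synchronization. */
    void fused_dots(int n_loc, const double *r, const double *s,
                    const double *t, double out[3], MPI_Comm comm) {
        double local[3] = {0.0, 0.0, 0.0};
        for (int i = 0; i < n_loc; ++i) {
            local[0] += r[i] * r[i];   /* (r, r) */
            local[1] += r[i] * s[i];   /* (r, s) */
            local[2] += s[i] * t[i];   /* (s, t) */
        }
        /* One all-reduce carries all three results. */
        MPI_Allreduce(local, out, 3, MPI_DOUBLE, MPI_SUM, comm);
    }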

  13. Performance
  Based on Aztec, by Prof. R. S. Tuminaro et al. at Sandia.

  14. Convergence analysis
  • Residual replacement strategies
  • Backward-stability analysis

  15. Challenging problem: computing dot products accurately
  • Why: "Mindless" by Kahan.
  • Accurate inner-product computation: Ogita, Rump and Oishi, "Accurate Sum and Dot Product", SIAM J. Sci. Comput., 2005; cited 188 times. (but) ....
  • The PLASMA team.
  • Backward-stability analysis of residual replacement methods: Carson and Demmel, "A residual replacement strategy for improving the maximum attainable accuracy of communication-avoiding Krylov subspace methods", April 20, 2012.
  • Reliable dot-product computation algorithms (a sketch follows).
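
  In the spirit of the Ogita-Rump-Oishi Dot2 algorithm cited above, here is a sketch of a compensated sequential dot product (an illustration of the technique, not the paper's exact code): each product and each addition is split into its rounded value plus an exact error term, and the accumulated errors are added back at the end, roughly doubling the working precision.

    #include <math.h>

    double dot2(int n, const double *x, const double *y) {
        double p = 0.0, e = 0.0;              /* running sum, error term */
        for (int i = 0; i < n; ++i) {
            double h  = x[i] * y[i];
            double r1 = fma(x[i], y[i], -h);  /* exact product error     */
            double s  = p + h;                /* TwoSum of p and h:      */
            double z  = s - p;
            double r2 = (p - (s - z)) + (h - z); /* exact addition error */
            p  = s;
            e += r1 + r2;                     /* accumulate corrections  */
        }
        return p + e;
    }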

  16. Conclusion
  • Avoiding communication.
  • Reliable computation.
  • Inner-product computation is very likely to be the most challenging kernel for heterogeneous HPC, while mat-vec is important for both...
  • Software abstraction and thread programming are helpful; together with redesigned algorithms they will do better.
  Math/algorithm, CS/performance, and applications interfaces: Aztec; pOSKI (Parallel Optimized Sparse Kernel Interface library, v1.0, May 2, 2012); Hypre, PETSc, Trilinos.

  17. Thanks!

  18. Initial study of communication complexity
  When more than ten thousand processors are connected by a network, global communication becomes a more and more serious bottleneck.

  19. Methods in the literature (based on the former two strategies)
  • de Sturler and van der Vorst: parallel GMRES(m) and CG methods (1995)
  • Bücker and Sauren: a parallel QMR method (1997)
  • Yang and Brent: improved CGS, BiCG and BiCGSTAB methods (2002-03)
  • Gu, Liu et al.: ICR, IBiCR, IBiCGSTAB(2) and PQMRCGSTAB methods (2004-2010)
  • Demmel et al.: CA-KSMs (2008-)
  • Gu, Liu and Mo: MSD-CG, the multiple-search-direction conjugate gradient method (2004), which replaces the inner-product computations by solving small linear systems and thereby eliminates global inner products completely. The idea was generalized to MPCG by Greif and Bridson (2006).

  20. Comparison of the computational counts of the two algorithms

  21. Comparison of the computational counts of the two algorithms

  22. Mathematical model of the time consumption
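
  The transcript does not reproduce this slide's formulas. As an assumption, a standard latency-bandwidth model of the per-iteration time on p processors has the shape

    T(p) = \frac{T_{\mathrm{flop}}}{p} + k\,(t_s + t_w m)\,\log_2 p

  where T_flop is the local arithmetic work, t_s the message startup (latency) cost, t_w the per-word transfer cost, m the message length, and k the number of global reductions per iteration. The computation term shrinks with p while the communication term grows, which is exactly why reducing k, as PGPBiCG(m,l) does, pays off at scale.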

  23. Scalability analysis

  24. P_opt: the optimal number of processors
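
  Under the model sketched after slide 22 (again an assumption about the slide's content, not a reconstruction of it), the optimal processor count follows by minimizing T(p) for fixed m:

    \frac{dT}{dp} = -\frac{T_{\mathrm{flop}}}{p^2} + \frac{k\,(t_s + t_w m)}{p \ln 2} = 0
    \quad\Longrightarrow\quad
    P_{\mathrm{opt}} = \frac{T_{\mathrm{flop}} \ln 2}{k\,(t_s + t_w m)}

  so a smaller communication coefficient k (fewer global synchronizations per iteration) pushes the optimum toward more processors.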

  25. Convergence Analysis

  26. Numerical Experiments: timing and improvements

  27. Numerical Experiments: Speedup

  28. Conclusions
  • The PGPBiCG(m,l) method is more scalable and more parallel for solving large sparse unsymmetric linear systems on distributed parallel architectures.
  • Performance and isoefficiency analyses and numerical experiments have been carried out for the PGPBiCG(m,l) and GPBiCG(m,l) methods.
  • The parallel communication performance can be improved by a factor larger than 3.
  • The PGPBiCG(m,l) method has better parallel speedup than the GPBiCG(m,l) method.
  • For further performance improvements: overlapping computation with communication; numerical stability.
