Software Challenges in Computational Science

Software Challenges in Computational Science Taking Scientific Supercomputing out of the Stone Age A D Kennedy University of Edinburgh

Overview We contemplate strongly typed, categorical, efficient, portable, reusable, modular, robust, architecture-neutral, buzzword-compliant software and languages for high performance parallel tensor-based scientific computing. A D Kennedy

Problem • Programs need to run efficiently on multiple architectures • It must be easy to make high-level algorithmic changes • Programs must adapt themselves around low-level assembly routines and data layouts • Programs should use state-of-the-art numerical methods A D Kennedy

Tensors? • Physical problems are highly constrained, or maybe even completely determined, by their symmetries • Dynamical behaviour realized by multilinear representations of the symmetry, i.e., tensors • We often need to consider continuous symmetries (Lie groups) and both finite dimensional (spin) and infinite dimensional (space-time) representations A D Kennedy

Commonality • Many (most?) large-scale physics programs are built out of a small set of common numerical “building blocks” such as • Linear algebra • FFTs • Symplectic integrators • Discrete differential operators A D Kennedy

Numerical libraries? • Nobody writes their own sqrt or exp routines • Unambiguous standard definitions • IEEE 754 (but who uses unnormalized underflow?) • sin(3.14159e+30) is a random number in [-1,1] • Efficient implementation provided by vendors for every architecture • So why don’t we always use the nice libraries that kind numerical analysts write for us? • Why do we keep using our own implementations of conjugate gradients to solve linear systems? A D Kennedy

BLAS and all that • Our problems involve huge sparse matrices • Numerical libraries allow us to use our own “black box” functional linear operators • But even our vectors are large: they may be distributed over 105 processors • We need implementations of state-of-the-art linear solvers that will use our implementation of underlying vector operations • AXPY, ..., inner products, norms, ... BLAS! • But with our own strange vector representations A D Kennedy

Why not standard classes? • We must still write our time-critical kernels in assembler • We even build our own hardware for these kernels • QCDSP, QCDOC, Blue Gene, … • We mainly optimize data motion (prefetching, communication, etc.) • Flops are cheap, moving data is expensive A D Kennedy

Abstraction • Languages and interfaces that allows abstraction of these building blocks • Higher-level algorithms expressed in terms of abstract lower-level algebraic structures • Need languages than allow us to describe the necessary structure • The structure of a Hilbert space is a class of classes = a category ≈ an interface • Need agreement on standard structures A D Kennedy

Categories • A category is a class of classes sharing the same structure • But not necessarily having anything else in common • SU(2) and SU(3) are both groups, but you can’t multiply their elements together (except in C++ or Java) • Structure is defined by • An explicit set of signatures • An implicit set of axioms A D Kennedy

Algorithmic categories • Follow mathematical structure • Why not write linear system solvers the way they are expressed in textbooks? • Select appropriate algorithms using type information • At compile time • E.g., use CG for positive symmetric matrices • Mendacity is sometimes useful • E.g., use CG for non-positive symmetric matrices A D Kennedy

Compilers & Languages • Syntactic sugar is good • It is easier to avoid bugs if we can writez ←A*x + y rather than SAXPY(A,x,y,z) • ... or was it SAXPY(A,z,x,y)? • Low-level automatic optimization • Compilers allocate registers better than humans do • Automatic parallelization or vectorization • Requires programmers to write “recognizable” patterns • Work-around for lack of standard structures • Automatic data pre-fetching perhaps a difficult but possible goal A D Kennedy

Memory management • Memory management for vector temporaries • Not enough memory to allocate them statically • Heap fragmentation • Stacks awkward for common sub-expression elimination and so forth • New memory management model is needed, neither universal garbage collection nor complete user responsibility is good enough A D Kennedy

Serialization • Portable interchange format for objects • Data grids: exchange everything as serialised data • Many ↔ one instead of many ↔ many • Usually XML-based • XML is verbose, why not use binary format? • XML compresses quite well • Data layout transformations • Redistribute grid points for different number of processors • FFTs? A D Kennedy

Solutions? • Standard categories for common structures • Relatively easy in some areas, e.g., linear algebra • Harder to get “correct” abstractions for others • Language bindings • We can’t persuade the world to use a new language, even if it is better • Type-checking and optimization better for some languages than others • Implementation of state-of-the-art algorithms in terms of these categories A D Kennedy

Obstacles • Risk: application scientist are not willing to take on the risk of using experimental software in addition to the risks of any cutting-edge scientific supercomputing project • Careers: developing scientific software does not get you a permanent academic job • Expertise: very few applications scientist are familiar with modern software techniques; most supercomputer codes are written by graduate students who believe that documentation is an unnecessary evil A D Kennedy

Software strategy • Must be constructed in layers with well-defined categorical interfaces • Low-level machine dependent layers • Intermediate building blocks (e.g., linear algebra) • Top-level application specific • Categorical interface means that functionality is specified, not implementation or data layout • Layers can be changed independently • Must be well documented A D Kennedy

Sociological strategy • Must be a cooperative venture of application scientists, numerical analysts, computer scientists, vendors, and funders • Needs international cooperation • Need to develop real software to get it accepted A D Kennedy

Funding requirements • Funding to encourage people to work on a long-term project that will not get them publications in their own field • Input from experts in several application areas, numerical analysis, and computer science • People to write and test prototypes • People to write documentation A D Kennedy

Conclusions I’ll still write it in Fortran Is it really buzzword-compliant? A D Kennedy

Centre for Numerical Algorithms and Intelligent Software • At the end of 2008, the UK’s Engineering and Physical Sciences Research Council (EPSRC) together with the Scottish Funding Council (SFC) have provided funds to establish this centre as a joint venture of the University of Edinburgh, Heriot-Watt University and the University of Strathclyde. The total budget of NAIS is £7.5M and the duration of the grant is 5 years, with first spending from August of 2009. • What is the purpose of NAIS? • NAIS will bridge the gap between numerical analysts, computer scientists and HPC software developers by developing new systems of code annotation, new compilers and efficient implementations for application-oriented computational methods such as adaptive finite elements, multiscale modelling, molecular simulation and optimization. • Who is involved in NAIS? • NAIS is a partnership of the Schools of Mathematics and Informatics and the Edinburgh Parallel Computing Centre (EPCC) at the University of Edinburgh, and the Departments of Mathematics at Heriot-Watt and Strathclyde Universities. In addition, NAIS will include collaborations with researchers in the sciences and engineering at the three universities, and a network of collaborations with other institutions including, so far, the University of Cambridge, the University of Warwick, and Wales Institute for Mathematics and Computer Science. A programme with CERFACS in Toulouse will provide for joint workshop and training activities. Industrial collaborations are to be established with IBM, Schlumberger, D.E. Shaw, SGI, Cray, Orage/France Telecom, SAS. An international advisory committee will be established including • Fran Berman, University of California at San Diego • Andreas Frommer, Wuppertal • David Keyes, Columbia and Lawrence Livermore National Laboratory • J. Tinsley Oden, University of Texas at Austin • Philippe Toint, University of Namur, Belgium A D Kennedy

Centre for Numerical Algorithms and Intelligent Software • The first director of NAIS is Benedict Leimkuhler (Mathematics, University of Edinburgh) and the Steering Committee consists of • Mark Ainsworth (Mathematics, Strathclyde University) • Murray Cole (Informatics, University of Edinburgh) • Dugald Duncan (Mathematics, Heriot-Watt University) • Arthur Trew (Edinburgh Parallel Computing Centre). • What research activities are envisaged in NAIS? • NAIS will operate substantial training, visitor and workshop programmes in all relevant areas of numerical analysis, computer science and HPC software development. There will be additional activities at the NAIS partners (currently Cambridge, Warwick and Wales Institute for Mathematical and Computational Science) • What posts are anticipated in NAIS? • 6 lectureships (permanent positions) in Mathematics (two each) at Edinburgh, Heriot-Watt and Strathclyde Universities. • A lectureship in Informatics at the University of Edinburgh • 10 Postdoctoral Research Assistantships (3 years each) • 24 PhD studentships (typically 4 years duration) • The first posts will be advertised in 2009. • Where can I find out more about NAIS? • See the website at http://www.nais.org.ukor send an email to info@nais.org.uk. • PhD student applications are welcome at any time and should be sent to the relevant department with a cover letter that mentions the “Numerical Algorithms and Intelligent Software Science and Innovation Project.” A D Kennedy

Software Challenges in Computational Science