Loading in 5 sec....

Computer Science Colloquium PowerPoint Presentation

Computer Science Colloquium

- 92 Views
- Uploaded on
- Presentation posted in: General

Computer Science Colloquium

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Computer Science Colloquium

Large Scale Computational Challenges in Materials Science*

*Work supported by NSF under grants NSF/ITR-0082094, NSF/ACI-0305120

and by the Minnesota Supercomputing Institute

C. Bekas: CS Colloquium

Introduction and Motivation

- Computational Materials Science Target Problem
- …predict the properties of materials…
- How?
- AB INITIO calculations:Simulate the behavior of materials at the atomic level, by applying the basic laws of physics (Quantum mechanics)

- What do we (hope to) achieve?
- Explain the experimentally found properties of materials
- Engineer new materials with desired properties
Applications:…numerous (some include)

- Semiconductors, synthetic light weight materials
- Drug discovery, protein structure prediction
- Energy: alternative fuels (nanotubes, etc)

- How?

Large Scale Computational Challenges in Materials Science

C. Bekas: CS Colloquium

In this talk

- Introduction to the mathematical formulation of ab initio calculations…
- in particular…the Density Functional Theory (DFT)…formulation
- Identify the computationally intensive “spots”…

Large scale eigenvalue problems are central…

- symmetric/hermitian problems
- very large number of eigenvalues/vectors required…so
- reorthogonalization and synchronization costs dominate…
- limiting the feasible size of molecules under study

Alternative…Automated Multilevel Substructuring (AMLS)

- Significantly limits reorthogonalization costs…
- can attack very large problems…when O(1000) eigs. are required…

Techniques that avoid the eigenvalue problem altogether…

- Monte-Carlo simulations for the diagonal of a matrix (charge densities)

Large Scale Computational Challenges in Materials Science

C. Bekas: CS Colloquium

Mathematical Modelling: Wave Function

We seek to find the steady state of the electron distribution

- Each electron eiis described by a corresponding wave function ψi …
- ψiis a function of space (r)…in particular it is determined by
- The position rk of all particles (including nuclei and electrons)
- It is normalized in such a way that
- Max Bohr’s probabilistic interpretation: Considering a region D, then
…describes the probability of electron ei being in region D. Thus: the distribution of electrons ei in space is defined by the wave function ψi

Large Scale Computational Challenges in Materials Science

C. Bekas: CS Colloquium

Mathematical Modelling: Hamiltonian

- Steady state of the electron distribution:
- is it such that it minimizes the total energy of the molecular system…(energy due to dynamic interaction of all the particles involved because of the forces that act upon them)

HamiltonianH of the molecular system:

- Operator that governs the interaction of the involved particles…
- Considering all forces between nuclei and electrons we have…

HnuclKinetic energy of the nuclei

HeKinetic energy of electrons

UnuclInteraction energy of nuclei (Coulombic repulsion)

VextNuclei electrostatic potential with which electrons interact

UeeElectrostatic repulsion between electrons

Large Scale Computational Challenges in Materials Science

C. Bekas: CS Colloquium

Mathematical Modelling: Schrödinger's Equation

Let the columns of Ψ:

hold the wave functions corresponding the electrons…Then it holds that

- This is an eigenvalue problem…that becomes a usual…
- “algebraic” eigenvalue problem when we discretize i w.r.t. space (r)
- Extremely complex and nonlinear problem…since
- Hamiltonian and wave functions depend upon all particles…
- We can very rarely (only for trivial cases) solve it exactly…

Variational Principle (in simple terms!)

Minimal energy and the corresponding electron distribution amounts to calculating the smallest eigenvalue/eigenvector of the Schrödinger equation

Large Scale Computational Challenges in Materials Science

C. Bekas: CS Colloquium

Schrödinger's Equation: Basic Approximations

Multiple interactions of all particles…result to extremely complex Hamiltonian…which typically becomes huge when we discretize

Thus…a number of reasonable approximations/simplifications have been considered…with negligible effects on the accuracy of the modeling:

- Born-Oppenheimer: Separate the movement of nuclei and electrons…the latter depends on the positions of the nuclei in a parametric way…(essentially neglect the kinetic energy of the nuclei)
- Pseudopotential approximation: Nucleus and surrounding core electrons are treated as one entity
- Local Density Approximation: If electron density does not change rapidly w.r.t. sparse (r)…then electrostatic repulsion Uee is approximated by assuming that density is locally uniform

Large Scale Computational Challenges in Materials Science

C. Bekas: CS Colloquium

Density Functional Theory

High complexity is mainly due to the many-electron formulation of ab initio calculations…is there a way to come up with an one-electron formulation?

Key Theory

DFT: Density Functional Theory (Hohenberg-Kohn, ‘64)

- The total ground energy of a system of electrons is a functional of the electronic density…(number of electrons in a cubic unit)
- The energy of a system of electrons is at a minimum if it is an exact density of the ground state!
- This is an existence theorem…the density functional always exists
- …but the theorem does not prescribe a way to compute it…
- This energy functional is highly complicated…
- Thus approximations are considered…concerning:
- Kinetic energy and
- Exchange-Correlation energies of the system of electrons

Large Scale Computational Challenges in Materials Science

Kinetic energy of electronei

Total potential that acts on ei at position r

Energy of the i-th state of the system

One electron wave function

Charge density at position r

C. Bekas: CS Colloquium

Density Functional Theory: Formulation (1/2)

Equivalent eigenproblem:

Large Scale Computational Challenges in Materials Science

C. Bekas: CS Colloquium

Density Functional Theory: Formulation (2/2)

Furthermore:

Potential due to nuclei and core electrons

Coulomb potential form valence electrons

Exchange-Correlation potential…also a function of the charge density

Non-linearity:The new Hamiltonian depends upon the charge density

while itself depends upon the wave functions (eigenvectors) i

Thus: some short of iteration is required until convergence is achieved!

Large Scale Computational Challenges in Materials Science

C. Bekas: CS Colloquium

Self Consistent Iteration

Large Scale Computational Challenges in Materials Science

C. Bekas: CS Colloquium

S.C.I: Computational Considerations

Large Scale Computational Challenges in Materials Science

C. Bekas: CS Colloquium

S.C.I: Computational Considerations

Conventional approach:

- Solve the eigenvalue problem (1)…and compute the charge densities…
- This is a tough problem…many of the smallest eigenvalues…deep into the spectrum are required! Thus…
- efficient eigensolvers have a significant impact on electronic structure calculations!

Alternative approach:

- The eigenvectors i are required only to compute k(r)
- Can we instead approximate charge densities without eigenvectors…?
- Yes…!

Large Scale Computational Challenges in Materials Science

C. Bekas: CS Colloquium

Computational Considerations in Applying Eigensolvers for Electronic Structure Calculations

Large Scale Computational Challenges in Materials Science

C. Bekas: CS Colloquium

The Eigenproblem

Hamiltonian Characteristics

- Discretization: High-order finite difference scheme…leads to
- Large Hamiltonians!…typically N>100K…with significant…
- number of nonzero elements (NNZ)>5M…
- Hamiltonian is Symmetric/Hermitian…thus the eigenvalues are real numbers…some smallest and some larger than zero.

Eigenproblem Characteristics(why it is a tough case?)

- We need the algebraically smallest (leftmost) eigenvalues (and vectors)
- How many? Typically a large number of them. Depending upon the molecular system under study:
- for standard spin-less calculations SixHy: (4x + y)/2
- i.e. for the small molecule Si34H36 we need the 86 smallest eigenvalues…

- For large molecules, x,y>500 (or for exotic entities…quantum dots) thousands of the smallest eigenvalues are required….
- Using current state of the-art-methods we need thousands of CPU hours on DOE supercomputers…and we have to do that many times!

Large Scale Computational Challenges in Materials Science

C. Bekas: CS Colloquium

Methods for Eigenvalues (basics only!)

Eigenvalue Approximation from a Subspace

Consider the standard eigenvalue problem:

and let V be a thin N x k (N>>k) matrix…then

approximate the original problem with:

- Observe that:
- Selecting V to have orthogonal columns …VTV = I … but it is expensive to come up with an orthogonal V
- Set H = VTAV… then H is k x k …much smaller than N x N

Large Scale Computational Challenges in Materials Science

C. Bekas: CS Colloquium

Subspace Methods

H is much smaller than A…use LAPACK for H

- How to compute orthogonal V ?
- For a non-symmetric matrix A…use Gramm-Schmidt (Arnoldi)…
- For a symmetric matrix A (our case) use Lanczos…
- Other approaches available (i.e. Jacobi-Davidson) but still some sort of Gramm-Schmidt is required…
REMINDER

We need many eigenvalues/vectors O(1000)…thus V may not be that thin!!!

Large Scale Computational Challenges in Materials Science

NO SYNC.

NO SYNC.

C. Bekas: CS Colloquium

Symmetric Problems: Lanczos

Basic property

Theoretically…(assuming no round-off errors)…Lanczos can build a very large orthogonal basis V requiring in memory only 3 columns of V at each step!

- Lanczos
- 1. Input: Matrix A, unit norm starting vector v0, 0 = 0, # k
- 2. For j = 1,2,…,k Do
- 3. wj = AvjMATRIX VECTOR
- 4. wj = wj - j vj-1DAXPY
- 5. j = (wj, vj)DOT PRODUCT
- 6. wj = wj - j vjDAXPY
- 7. j+1 = ||wj||2.DOT PRODUCT
- 8. If j+1 = 0 then STOP
- 9. vj+1 = wj / j+1DSCAL

- 10. EndDO

SYNC. -BCAST

NO SYNC.

SYNC. - BCAST

NO SYNC.

Large Scale Computational Challenges in Materials Science

C. Bekas: CS Colloquium

Lanczos in Finite Arithmetic…

Round-off errors

- Lanczos vectors vi quickly loose orthogonality…so that
- VTV is no longer orthogonal…thus
- We need to check if vj is to previous vectors 0,1,…,j-1
- If NOT … reorthogonalize it against previous vectors (Gramm-Schmidt)

- Lanczos
- 1. Input: Matrix A, unit norm starting vector v0, 0 = 0, # k
- 2. For j = 1,2,…,k Do
- 3. wj = Avj
- 4. wj = wj - j vj-1
- 5. j = (wj, vj)
- 6. wj = wj - j vj
- 7. j+1 = ||wj||2.
- 8. If j+1 = 0 then STOP
- 9. vj+1 = wj / j+1

- 10. EndDO

ORTHOGONALITY IS LOST HERE…SO THESE STEPS ARE REPEATED AGAINST ALL

PREVIOUS VECTORS…

SELECTIVE REORTH. IS ALSO POSIBLE (SIMON, LARSEN)

Large Scale Computational Challenges in Materials Science

C. Bekas: CS Colloquium

Practical Eigensolvers and Limitations

ARPACK (Sorensen-Lehoucq-Yang): Restarted Lanczos

- Remember that O(1000) eigenvalues/vectors are required…thus
- we need a very long basis V…k = twice the number of eigenvalues which
- will result in a large number of reorthogonalizations…
- Synchronization costs – Reorthogonalization costs – and memory costs become intractable for large problems of interest…

Shift-Invert Lanczos (Grimes et all) – Rational Krylov (Ruhe)

- work with matrix (A-i I)-1 instead…
- compute some of the eigenvalues close to i each time…thus a smaller basis is required each time…BUT
- many shifts i are required…
- cost of working with the different “inverses” (A-i I)-1 becomes prohibitive for (practically) large Hamiltonians…

We need alternative methods that can build large projection bases without the reorthogonalization-synchronization costs

Large Scale Computational Challenges in Materials Science

C. Bekas: CS Colloquium

Automated Multilevel Substructuring

Component Mode Synthesis (CMS) (Hurty ’60, Graig-Bampton ’68)

Well known alternative to Lanczos type methods. Used for many years in Structural Engineering. But it too suffers from limitations due to problem size…

AMLS, (Bennighof, Lehoucq, Kaplan and collaborators)

Multilevel CMS method (solves the dimensionality problem)

Automatic computation of substructures (easy application)

Approximation: Truncated Congruence Transformation

Builds very large projection basis withoutreorthogonalization

Successful in computing thousands of eigenvalues in vibro-acoustic analysis (N>107) in a few hours on workstations (Kropp–Heiserer, 02)

Spectral Schur Complements (Bekas, Saad)

- Significantly improves AMLS accuracy…suitable for electronic structure calculations
- (unlike AMLS) …framework for the iterative refinement of the approximations

Large Scale Computational Challenges in Materials Science

2

1

Subdivide into 2 subdomains: 1 and 2

C. Bekas: CS Colloquium

Component Mode Synthesis: a model problem

Consider the model problem:

Y

on the unit square . We wish to compute smallest eigenvalues.

- Component Mode Synthesis
- Solve problem on each i
- “Combine” partial solutions

X

Large Scale Computational Challenges in Materials Science

CMS: Approximation procedure. Ignore coupling!

C. Bekas: CS Colloquium

Component Mode Synthesis: Approximation

- CMS: Approximation procedure (2)
- Solve B v = v
- Remember that B is block diagonal
- Thus: we have to solve smaller decoupled eigenproblems

- Then, CMS approximates the coupling among the subdomains by the application of a carefully selected operator on the interface unknowns

- Solve B v = v

Large Scale Computational Challenges in Materials Science

Block Gaussian eliminator matrix:

such that:

Where S = C – E> B-1 E is the Schur Complement

Equivalent Generalized Eigenproblem : UTAUu= UTU u

C. Bekas: CS Colloquium

Recent CMS method: AMLS

Large Scale Computational Challenges in Materials Science

Solve decoupled problems

Form basis

FINAL APPROXIMATION:

C. Bekas: CS Colloquium

AMLS Approximation

AMLS: Approximation procedure. Ignore coupling!

Large Scale Computational Challenges in Materials Science

B1

E1

S3

B2

S2

E1*

S1

C. Bekas: CS Colloquium

AMLS: Multilevel application

Scheme applied recursively.

Resulting to thousands of subdomains. Successful in computing thousands smallest eigenvalues in vibro-acoustic analysis with problem size N>107

(Kropp – Heiserer BMW)

Large Scale Computational Challenges in Materials Science

C. Bekas: CS Colloquium

AMLS: Example

Example: Container ship, 35K degrees of freedom

(Research group of prof. H. Voss, T. U. Hamburg, Germany)

Large Scale Computational Challenges in Materials Science

C. Bekas: CS Colloquium

AMLS: Example

Example: Container ship, 35K degrees of freedom

(Research group of prof. H. Voss, T. U. Hamburg, Germany)

Large Scale Computational Challenges in Materials Science

C. Bekas: CS Colloquium

AMLS: Example

- AMLS: Substructure tree (Kropp-Heiserer, BMW)
- Multilevel parallelism…
- Both Top-Down and Bottom-Up implementations are possible…
- At each node we need to solve a linear system…
- Multilevel solution of linear systems…level k depends-benefits from level k+1

Large Scale Computational Challenges in Materials Science

C. Bekas: CS Colloquium

Problem Set…AMLS v.s. Standard Methods

Application Domains,

Kropp-Heiserer, 02

NUMBER OF EIGENVALUES

DEGREES OF FREEDOM

Large Scale Computational Challenges in Materials Science

C. Bekas: CS Colloquium

AVOIDING EIGENVALUE COMPUTATIONS!

Large Scale Computational Challenges in Materials Science

C. Bekas: CS Colloquium

Motivation

- Target Problem
- Estimate the Diagonal of Matrix:
- The matrix is known ONLY in some “functional form”:
- We have a routine MV(v,x) such that x = Av

- Estimate the Diagonal of Matrix:

- Application in Materials Science
- Estimate the diagonal of the Density matrix, which holds the Charge Densities (for which we solve the expensive eigenproblem!)

- Avoid diagonalization: approximate the Density matrix by expanding the Fermi operator in a basis of Tchebychev polynomials, Jay et al (99)
- Basic property to exploit: Matrix-Vector products with the approximated Density matrix are cheap to compute!

Large Scale Computational Challenges in Materials Science

C. Bekas: CS Colloquium

Estimating the Diagonal: The direct approach

- Multiply the target matrix A by the vector ei to get the i-th column of A
- Inner product of the i-th column with vector ei to get the i-th element

i

i

0

0

.

.

.

1

.

.

0

0 0 . . . 1 . . 0

i

i

Straightforward! Even embarrassingly Parallel!

But: HIGH COST…WE NEED N-Matrix vector products so cost is still O(N3)

Large Scale Computational Challenges in Materials Science

C. Bekas: CS Colloquium

A Stochastic Estimator for the Trace

- Observe: the same technique can be used to estimate the trace of A, but still the cost will be O(N3)
- Can we do better? Meaning:
- Define special vectors vi such that only a small number m<<N will be required to obtain a good approximation of the diagonal (or the trace)

- Hutchinson (90): Unbiased Stochastic Estimator of the trace for Laplacian Smoothing
- Extended ideas of Girard (87)
- Monte Carlo simulations with the discrete variable that assumes the values -1, 1 with probability ½ …meaning:
- Define random vectors vi, i=1,…,s with entries ±1 and

Large Scale Computational Challenges in Materials Science

C. Bekas: CS Colloquium

The Stochastic Diagonal Estimator

Let the vectors vk be as before (with ±1 entries)…however they can also have other entries…

Estimate the diagonal :

: Component-wise multiplication

: Component-wise division

Large Scale Computational Challenges in Materials Science

C. Bekas: CS Colloquium

Estimating The Fermi Operator

P: Density matrix, H: Hamiltonian

Then: P = f(H), f is the Fermi-Dirac distribution:

kB: Boltzmann constant, T: temperature, : Chemical potential

When T0, f is treated as the Heaviside function

occ

1

max

Large Scale Computational Challenges in Materials Science

C. Bekas: CS Colloquium

Tchebychev Expansion of the Heaviside Func.

The Hamiltonian is shifted and scaled so that it has eigenvalues in the interval [-1, 1]

Then, we have the following approximation in the basis of Tchebychev polynomials

with Jackson dampening in order to avoid Gibbs oscillations

But…Tchebychev polynomials are defined recursively

Large Scale Computational Challenges in Materials Science

C. Bekas: CS Colloquium

Tchebychev Expansion of the Heaviside Func.

Due to the recursive nature of Tchebychev polynomials Matrix-Vector products with the approximation to the Density matrix is inexpensive!

Thus, the Tchebyshev expansion is ideal for our stochastic and Hadamard estimators

Cost of each approximate Matvec with the Density matrix: M Matrix-Vector products with the Hamiltonian and M xAXPY vector operations (O(N))

Large Scale Computational Challenges in Materials Science

C. Bekas: CS Colloquium

Some Experiments(2/2)

Matrix: AF23650. Using only 10 random vectors in the stochastic estimator

Abs. Relative error for each entry in the diagonal

Large Scale Computational Challenges in Materials Science

C. Bekas: CS Colloquium

Some Experiments(2/2)

Matrix: AF23650. Using 500 random vectors in the stochastic estimator

Abs. Relative error for each entry in the diagonal

Large Scale Computational Challenges in Materials Science

C. Bekas: CS Colloquium

Some Experiments: Density Matrix

Large Scale Computational Challenges in Materials Science

C. Bekas: CS Colloquium

Some Experiments: SiH4

Large Scale Computational Challenges in Materials Science

C. Bekas: CS Colloquium

Implementation Issues – Trilinos

- ab initio calculations:…many ingredients required for successful techniques
- Mesh generation…discretization
- Visualization of input data…results…geometry
- Efficient data structures-communicators for parallel computations
- Efficient (parallel) Matrix-Vector and inner products
- Linear system solvers
- State-of-the-art eigensolvers…
A unifying software development environment will prove to be very useful

- ease of use…
- reusability…(object oriented)
- portable…

- TRILINOS http://software.sandia.gov/trilinos
- software multi-package…developed at SANDIA (M. Heroux)
- modular…no need to install everything in order to work!
- Capabilities of LAPACK, AZTEC, Chaco, SuperLU, etc…combined
- very active user community…ever evolving!
- ease of use…without sacrificing performance

Large Scale Computational Challenges in Materials Science

C. Bekas: CS Colloquium

Conclusions

- Large Scale Challenges in Computational Materials Science
- In DFT eigenvalue calculations dominate…
- …many O(1000) eigenvalues/vectors required…
- …easily reaching and exceeding the limits of state-of-the-art traditional solvers
- AMLS appears as an extremely atractive alternative…however accuracy requirements and efficient parallel implementation is still under development
Alternative approach: Avoid eigenvalue calculations altogether!

- Based on Tchebychev expansions of the Fermi/Dirac distribution…
- we design Monte-Carlo simulations for the diagonal of the density matrix…
- Required: only Matrix-Vector products with the Hamiltonian…
- highly parallelizable!

Many open problems in ab initio calculations…one of the most active fields of research today!

Large Scale Computational Challenges in Materials Science