symmetric eigensolvers in sca lapack n.
Skip this Video
Loading SlideShow in 5 Seconds..
Symmetric Eigensolvers in Sca/LAPACK PowerPoint Presentation
Download Presentation
Symmetric Eigensolvers in Sca/LAPACK

Loading in 2 Seconds...

play fullscreen
1 / 15

Symmetric Eigensolvers in Sca/LAPACK - PowerPoint PPT Presentation

  • Uploaded on

Symmetric Eigensolvers in Sca/LAPACK. Osni Marques ( ). LAPACK Symmetric Tridiagonal Eigensolvers . 1989 2005. Dhillon, Parlett, Voemel, Marques. QR (STEQR): all eigenvectors, O ( n 3 ) Bisection plus inverse iteration (STEVX): subset of eigenvectors, O ( n 2 )

I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
Download Presentation

PowerPoint Slideshow about 'Symmetric Eigensolvers in Sca/LAPACK' - bess

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
symmetric eigensolvers in sca lapack

Symmetric Eigensolvers in Sca/LAPACK

Osni Marques


lapack symmetric tridiagonal eigensolvers
LAPACK Symmetric Tridiagonal Eigensolvers



Dhillon, Parlett,

Voemel, Marques

  • QR (STEQR): all eigenvectors, O(n3)
  • Bisection plus inverse iteration (STEVX): subset of eigenvectors, O(n2)
  • Divide-and-conquer (STEDC): all eigenvectors, faster than the the previous two but needs more workspace.
  • Multiple relative robust representations (STEGR): faster than all above for most matrices from industrial and scientific applications, least workspace.

Typical performance (timing) of different eigensolvers on matrices coming from industrial applications. In the picture, “old” refers to the version currently available in LAPACK, which will be soon replaced by a “new” and more robust implementation; n ranges from 1824 to 8012.

the essence of the mrrr algorithm
The Essence of the MRRR Algorithm

Factor TI=LDLT, (L,D) is a relative robust representation RRR for the eigenvalue subset 

  • determines all eigenvalues in  to high relative accuracy
  • small relative changes in entries of L and D cause small relative changes in each eigenvalue in 

Given an RRR for a set of eigenvalues:

  • For eacheigenvalue with a large relative gap
    • Compute eigenvalue to high relative accuracy
    • Compute the FP (Fernando Parlett) vector (eigenvector)
  • For each of the remaining groups of eigenvalues
    • Choose shift outside the group
    • Compute new RRR, L+D+L+T=LDLTnew I
    • Refine the eigenvalues.
testing lapack functionalities
Testing LAPACK functionalities
  • At installation time: optional and limited number of test cases to verify the integrity of the installation (LAPACK/TESTING)
  • During the development phase: intensive and stressful tests on a variety of computer architectures
intensive testing requirements and goals
Generation of difficult test cases

Bookkeeping of test cases (so that new or competing algorithm can stressed in a similar way)

Various platforms

AMD Athlon

AMD Opteron

Itanium 2

Pentium III

Pentium 4



SUN sparcv9


Various (Fortran) compilers: Intel, SUN, SGI, IBM…


Performance (time)

Tuning of parameters (automatic or manual)

Algorithmic choices (different IEEE variants)

Intensive Testing: Requirements and Goals

Reveal different numerical behaviors

(in particular IEEE arithmetic features),

as well as performance issues

matrix types
Built-in matrices

1-2-1 tridiagonal matrix

(1D Poisson equation)

Wilkinson tridiagonal matrix

(eigenvalues clustered in pairs)

Built-in eigenvalue distributions

repeated eigenvalues

1=1 and i=1/k, i=2,3…n

1=1 and i=1, i=1,2…n-1, n=1/k

geometric distribution

i= k(1- i)/(n-1), i=1,2…n

different condition numbers (k)

different random number distributions

 can be multiplied by random signs

Glued matrices

combinations of the above cases

very tight eigenvalue clusters

Eigenvalue distributions (D) read from files: QTDQT with random orthogonal Q

Tridiagonal matrices from real world applications


(analysis of molecules)

Harwell-Boeing Collection (structural engineering, etc)

University of Florida Collection

(FEM analysis, NASA)

Matrices from LAPACK users

Lanczos algorithm without reorthogonalization to provoke very close eigenvalues

Matrix Types
what have we found
What have we found?
  • LAPACK 3.0 STEGR (and STEDC!) fails on some of the new test matrices
  • Different matrix classes with different challenges
    • STEGR about 10 times slower than STEDC for glued Wilkinson matrices
  • Architecture differences
    • Pentium slows when infinity occurs
    • Vectorization issues on CRAY
  • Reference tester for future development
parallel eigensolvers
Parallel Eigensolvers
  • PDSYEVX: bisection + inverse iteration
  • PDSYEVD: parallel divide and conquer (F. Tisseur)
  • PDSYEVR: MRRR (C. Vömel)
pitfalls of parallelization
Pitfalls of Parallelization
  • Straightforward approach: n eigenpairs, p processors  cyclic assignment of  n/p eigenpairs to each processor
  • Each processor computes orthogonal eigenvectors
  • Orthogonality between processors is not guaranteed
  • ScaLAPACK: PDSYEVX can break!
mrrr versus dc
MRRR versus DC

(Tridiagonal part of PDSYEVR and PDSYEVD)

Lapw (n=22908, A. Tate). Runtime and efficiency of the tridiagonal MRRR/D&C part on the IBM SP5.

Hubbard (n=63504, Ward and Bai). Runtime and efficiency of the tridiagonal MRRR/D&C part on the IBM SP5.

  • Performance and Accuracy of LAPACK's Symmetric Tridiagonal Eigensolvers, J. Demmel, O. Marques, B. Parlett, and C. Vömel. SIAM J. Sci. Comp., 30:1508–1526, 2008.
  • A Testing Infrastructure for Symmetric Tridiagonal Eigensolvers, J. Demmel, O. Marques, B. Parlett, and C. Vömel. ACM TOMS, 35, 2008.
  • Computations of Eigenpair Subsets with the MRRR Algorithm, B. Parlett, O. Marques and C. Voemel. Numerical Linear Algebra with Applications, 13:643-653, 2006.
  • The Design and Implementation of the MRRR Algorithm, I. Dhillon, B. Parlett, and C. Vömel. Technical Report UT-CS-04-541, December, 2004.
  • ScaLAPACK’S MRRR Algorithm, C. Vömel, LAPACK Working Note 195, November 2007.
  • code available upon request)