Data Flow Pattern Analysis of Scientific Applications

Data Flow Pattern Analysis of Scientific Applications. Michael Frumkin Parallel Systems & Applications Intel Corporation May 6, 2005. Outline. Why Data Flow Pattern Analysis? CFD Applications The NAS Parallel Benchmarks The NAS Grid Benchmarks Trace File Analysis Conclusions.

Presentation Transcript
Data Flow Pattern Analysis of Scientific Applications

Michael Frumkin

Parallel Systems & Applications

Intel Corporation

May 6, 2005

Outline
  • Why Data Flow Pattern Analysis?
  • CFD Applications
  • The NAS Parallel Benchmarks
  • The NAS Grid Benchmarks
  • Trace File Analysis
  • Conclusions
Why Data Flow Pattern Analysis?
  • Scientific applications
    • model a few natural processes
    • new effects are added infrequently
    • the influence of new effects on the existing data flows is insignificant
  • Knowledge of data flow in a program helps with
    • program understanding
    • program optimization, parallelization, multithreading
    • building an application performance model
Design of Scientific Applications
  • Time is represented as an outer loop
    • Iterations over time steps
  • Space is represented by structured/unstructured grids
    • Important for understanding data locality
    • Data access patterns
    • Spatial parallelism
  • Physics is represented by an operator at each grid point
    • Data flow
    • Operator level of parallelism/dependence
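
The three design elements above can be sketched in a few lines. This is only an illustration, not code from the presentation: the 1-D grid, the averaging operator `smooth`, and the driver `run` are hypothetical stand-ins for a real grid and a real physics operator.

```python
# Hypothetical sketch: time as the outer loop, space as a structured grid,
# physics as an operator applied at each grid point.

def smooth(left, center, right):
    """Illustrative point operator: weighted average of a point and its neighbors."""
    return 0.25 * left + 0.5 * center + 0.25 * right


def run(grid, num_steps):
    """Advance a 1-D structured grid through num_steps time steps."""
    u = list(grid)
    for _ in range(num_steps):                # time: the outer loop
        new_u = list(u)                       # boundary values stay fixed
        for i in range(1, len(u) - 1):        # space: interior grid points
            # physics: the operator reads only values from the previous step
            new_u[i] = smooth(u[i - 1], u[i], u[i + 1])
        u = new_u
    return u


result = run([0.0, 0.0, 4.0, 0.0, 0.0], num_steps=1)
```

Within one time step every interior point reads only old values, so the inner loop iterations are independent of each other, while consecutive time steps are not.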
CFD Data Flow Patterns
  • Solve the Navier-Stokes equations

$K(u_{i+1}) = L u_i$

    • $u$ is a five-dimensional vector
    • $K$ is a non-linear operator
  • Solver
  • RHS computation
ADI Pattern
  • ADI method: $K \approx K_x K_y K_z$
  • Multilevel parallelism

[Figure: ADI method with x-solve, y-solve, and z-solve sweeps; multipartition with x-solve, y-solve, and z-solve]
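
The x-solve, y-solve, z-solve sequence can be illustrated with a small sketch. Everything here is an assumption made for illustration: the nested-list storage, the function names, and above all `line_solve`, which uses a running sum as a stand-in for a real line solver. Like a real line solve, it carries a sequential dependence along the line, while distinct lines in the same sweep are independent.

```python
# Hypothetical sketch of the ADI sweep pattern on a 3-D array u[i][j][k].

def line_solve(values):
    """Stand-in 1-D solver: a sequential recurrence along one grid line."""
    out, total = [], 0
    for v in values:
        total += v
        out.append(total)
    return out


def sweep(u, axis):
    """Apply line_solve along the given axis; distinct lines are independent."""
    n = (len(u), len(u[0]), len(u[0][0]))
    others = [a for a in range(3) if a != axis]
    for p in range(n[others[0]]):
        for q in range(n[others[1]]):
            def index(t):
                # Map position t on the current line to (i, j, k) indices.
                idx = [0, 0, 0]
                idx[axis], idx[others[0]], idx[others[1]] = t, p, q
                return idx

            line = [u[i][j][k] for i, j, k in (index(t) for t in range(n[axis]))]
            for t, v in enumerate(line_solve(line)):
                i, j, k = index(t)
                u[i][j][k] = v


def adi_step(u):
    """One step: x-solve, then y-solve, then z-solve."""
    for axis in (0, 1, 2):
        sweep(u, axis)
    return u


u = adi_step([[[1 for _ in range(2)] for _ in range(2)] for _ in range(2)])
```

Each sweep exposes parallelism across the lines orthogonal to the sweep direction, and the direction of the dependence changes from sweep to sweep, which is what makes the data distribution question interesting.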

Explicit Operators
  • Stencil operators (explicit methods)
  • At each point of a 3-dimensional mesh, apply a stencil:
    • seven-point
    • 27-point
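
A minimal sketch of the two stencil shapes follows. The nested-list mesh, the function name `apply_stencil`, and the unweighted averaging coefficients are illustrative assumptions; real explicit operators use problem-specific weights.

```python
# Hypothetical sketch: stencil operators on a 3-D mesh stored as nested lists.

# Center point plus its six face neighbors.
SEVEN_POINT = [(0, 0, 0), (-1, 0, 0), (1, 0, 0),
               (0, -1, 0), (0, 1, 0), (0, 0, -1), (0, 0, 1)]

# Center point plus all 26 neighbors in the surrounding 3x3x3 cube.
TWENTY_SEVEN_POINT = [(di, dj, dk)
                      for di in (-1, 0, 1)
                      for dj in (-1, 0, 1)
                      for dk in (-1, 0, 1)]


def apply_stencil(u, offsets):
    """Replace each interior point by the average over the stencil offsets."""
    nx, ny, nz = len(u), len(u[0]), len(u[0][0])
    out = [[[u[i][j][k] for k in range(nz)] for j in range(ny)] for i in range(nx)]
    for i in range(1, nx - 1):
        for j in range(1, ny - 1):
            for k in range(1, nz - 1):
                total = sum(u[i + di][j + dj][k + dk] for di, dj, dk in offsets)
                out[i][j][k] = total / len(offsets)
    return out


mesh = [[[0.0] * 3 for _ in range(3)] for _ in range(3)]
mesh[1][1][1] = 7.0
out = apply_stencil(mesh, SEVEN_POINT)
```

Because every output point reads only from the input mesh, all interior points can be updated independently.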

Lower-Upper Triangular
  • Two-dimensional pipeline
  • Hyperplane algorithm

Dependence Matrices

$$\begin{pmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{pmatrix} \qquad \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$
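
The hyperplane idea can be sketched for the dependence vectors (-1, 0, 0), (0, -1, 0), and (0, 0, -1). This is an illustration under the assumption that point (i, j, k) depends on its three lower neighbors; the function names are hypothetical. All points with the same value of i + j + k then lie on one hyperplane and can be processed together.

```python
# Hypothetical sketch of hyperplane (wavefront) ordering for a sweep in which
# point (i, j, k) depends on (i-1, j, k), (i, j-1, k), and (i, j, k-1).

DEPENDENCES = [(-1, 0, 0), (0, -1, 0), (0, 0, -1)]


def hyperplanes(nx, ny, nz):
    """Group grid points by i + j + k; points in one group are independent."""
    planes = [[] for _ in range(nx + ny + nz - 2)]
    for i in range(nx):
        for j in range(ny):
            for k in range(nz):
                planes[i + j + k].append((i, j, k))
    return planes


def is_valid_schedule(planes):
    """Check that every in-grid dependence lies on a strictly earlier plane."""
    step = {point: s for s, plane in enumerate(planes) for point in plane}
    for (i, j, k), s in step.items():
        for di, dj, dk in DEPENDENCES:
            src = (i + di, j + dj, k + dk)
            if src in step and step[src] >= s:
                return False
    return True


planes = hyperplanes(3, 3, 3)
sizes = [len(plane) for plane in planes]
```

The plane sizes grow and then shrink, so the available parallelism varies over the sweep. A sweep with the opposite-sign dependence vectors would visit the same planes in reverse order.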

Multigrid V-Cycle

[Figure: V-cycle with four Projection steps, Smoothing on the coarsest grid, and four Interpolation & Smoothing steps]
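
The order of operations in a V-cycle can be written as a short recursion. This sketch only records the schedule and does no numerical work; the level numbering (0 for the finest grid) and the function name `v_cycle` are assumptions made for illustration.

```python
# Hypothetical sketch: the operation schedule of one multigrid V-cycle.

def v_cycle(level, coarsest, log):
    """Record the operations of one V-cycle starting at the given level."""
    if level == coarsest:
        log.append(("smoothing", level))
        return
    log.append(("projection", level))                  # go to the next coarser grid
    v_cycle(level + 1, coarsest, log)
    log.append(("interpolation & smoothing", level))   # come back up and smooth


schedule = []
v_cycle(0, 4, schedule)
operations = [name for name, _ in schedule]
```

With five grid levels the schedule holds four projections, one smoothing step on the coarsest grid, and four interpolation and smoothing steps.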

BT x_solve (serial) Call Graph

Data Flow Analysis

do k=1,ksize
  do j=1,jsize
    do i=1,isize

Nest Data Flow Graph

[Figure: graph with nodes do_45, do_134, and do_330]

Each arc represents an affinity relation.

NAS Parallel Benchmarks

www.nas.nasa.gov/Software/NPB
  • Application Benchmarks
    • CFD
      • BT, SP, LU
    • Data Intensive
      • DC, DT, BTIO
    • Computational Chemistry
      • UA
  • Kernel Benchmarks
    • FT, CG, MG, IS
  • Verification
  • Performance Model
  • FORTRAN, C, HPF, Java*
  • Serial, MPI, OpenMP, Java* Threads

* Other names and brands may be claimed as the property of others.

NPB Performance on Altix*

[Figure: NPB performance results**]

** Performance tests and ratings are measured using specific computer systems and/or components and reflect the approximate performance of Intel products as measured by those tests.  Any difference in system hardware or software design or configuration may affect actual performance.  Buyers should consult other sources of information to evaluate the performance of systems or components they are considering purchasing.

Basic Data Flow Patterns
  • Shuffles
    • Sorting
    • FFT
    • Routing
  • Gather/Scatter
    • Conjugate Gradient
    • MD and FE codes
    • Sparse matrices
  • Transpose
    • FFT
    • Sorting
  • Tree
    • Parallel prefix, Reduction
    • Sorting
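
Each of these patterns can be viewed as a mapping between data positions. The sketch below shows one small instance per pattern; the function names and the choice of the perfect shuffle as the representative shuffle are illustrative assumptions.

```python
# Hypothetical sketch: basic data flow patterns as index mappings.

def perfect_shuffle(data):
    """Shuffle: interleave the first and second halves (even length assumed)."""
    half = len(data) // 2
    out = []
    for a, b in zip(data[:half], data[half:]):
        out.extend([a, b])
    return out


def gather(source, indices):
    """Gather: read from arbitrary positions, as in sparse matrix codes."""
    return [source[i] for i in indices]


def transpose(matrix):
    """Transpose: element (r, c) moves to position (c, r)."""
    return [list(row) for row in zip(*matrix)]


def tree_reduce(values, op):
    """Tree: combine neighbors pairwise, level by level (non-empty input)."""
    level = list(values)
    depth = 0
    while len(level) > 1:
        nxt = [op(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])      # odd element passes through unchanged
        level = nxt
        depth += 1
    return level[0], depth


shuffled = perfect_shuffle([0, 1, 2, 3, 4, 5, 6, 7])
total, depth = tree_reduce(range(8), lambda a, b: a + b)
```

The tree reduction of eight values finishes in three levels, which is the property that makes the tree pattern attractive for reductions and parallel prefix.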
HPC Challenge Benchmarks

icl.cs.utk.edu/hpcc
  • HPL*
  • DGEMM*
  • STREAM*
  • PTRANS*
  • FFTE*
  • RandomAccess*
  • Effective Bandwidth b_eff*


Programming With Directed Graphs

Implemented in DT of NPB and in NGB
  • Arc
    • Arc* newArc(Node *tail, Node *head)
    • AttachArc(DGraph *dg)
    • deleArc(Arc *ar)
  • Node
    • newNode(char *name)
    • Node* AttachNode(DGraph *dg)
    • deleteNode(Node *nd)
  • DGraph
    • DGraph* newDGraph(char *name)
    • writeGraph(DGraph *dg, char* fname)
    • DGraph * readGraph(char* fname)
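
The operations listed above can be mirrored in a small Python analog. This is not the NPB or NGB implementation: the class, its method names, and the text format are assumptions made for illustration, and the arcs between do_45, do_134, and do_330 are a made-up example. Node names are assumed to contain no whitespace.

```python
# Hypothetical Python analog of the node, arc, and graph operations.

class DGraph:
    def __init__(self, name):
        self.name = name
        self.nodes = []          # node names, in attachment order
        self.arcs = []           # (tail, head) pairs of node names

    def attach_node(self, name):
        """Add a node unless a node of that name is already attached."""
        if name not in self.nodes:
            self.nodes.append(name)
        return name

    def attach_arc(self, tail, head):
        """Add an arc from tail to head, attaching both end nodes if needed."""
        self.attach_node(tail)
        self.attach_node(head)
        self.arcs.append((tail, head))

    def successors(self, name):
        return [head for tail, head in self.arcs if tail == name]

    def to_text(self):
        """Serialize to a simple line-oriented format (illustrative only)."""
        lines = ["graph " + self.name]
        lines += ["node " + node for node in self.nodes]
        lines += ["arc %s %s" % arc for arc in self.arcs]
        return "\n".join(lines)

    @classmethod
    def from_text(cls, text):
        """Rebuild a graph from the format produced by to_text."""
        lines = text.splitlines()
        dg = cls(lines[0].split(" ", 1)[1])
        for line in lines[1:]:
            parts = line.split()
            if parts[0] == "node":
                dg.attach_node(parts[1])
            elif parts[0] == "arc":
                dg.attach_arc(parts[1], parts[2])
        return dg


dg = DGraph("nests")
dg.attach_arc("do_45", "do_134")
dg.attach_arc("do_134", "do_330")
copy = DGraph.from_text(dg.to_text())
```

Writing the graph out and reading it back yields the same nodes and arcs, which is the role a write/read pair plays when graphs are exchanged between tools.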

Directed Graphs Around
  • Parse trees
  • File Systems
  • Application task graphs
  • Device Schematics

Visualization and Layout Tools

  • VCG tool
  • Edge tool
  • Tom Sawyer Software
  • Commercial tools
Cart3D*

Task graphs are rapidly growing
  • Performs CFD analysis on complex geometries
  • Uses six executables
    • Intersect* – intersects geometry
    • Cubes* – produces Cartesian meshes
    • Reorder* – reorders meshes
    • Mgprep* – coarsens mesh
    • flowCart* – convergence acceleration
    • Clic* – analyzes the flow
  • Executables communicate via files
  • Returns relevant forces
    • Lift, Drag, Side Force


The NAS Grid Benchmarks

[Figure: task graphs of the four patterns, Embarrassingly Distributed (ED), Helical Chain (HC), Visualization Pipeline (VP), and Mixed Bag (MB). Each graph has a Launch node and a Report node; task nodes are labeled BT, SP, LU, MG, and FT, with LU2, LU4, LU8, MG2, MG4, MG8, FT2, and FT8 in Mixed Bag; one label reads #steps.]
  • Reflect the task-level programming paradigm
  • Contain four patterns
    • Embarrassingly Distributed (ED)
    • Helical Chain (HC)
    • Visualization Pipeline (VP)
    • Mixed Bag (MB)
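
Two of these patterns can be sketched as task graphs to show how differently they expose parallelism. The task counts, the task names, and the exact arc layout are assumptions made for illustration, not the benchmark specification.

```python
# Hypothetical sketch: two task-graph shapes and the length of their longest path.

def embarrassingly_distributed(n):
    """n independent tasks, each between a Launch node and a Report node."""
    arcs = []
    for t in range(n):
        arcs.append(("Launch", "SP.%d" % t))
        arcs.append(("SP.%d" % t, "Report"))
    return arcs


def helical_chain(n):
    """n tasks in a single chain that cycles through BT, SP, and LU."""
    kinds = ["BT", "SP", "LU"]
    tasks = ["%s.%d" % (kinds[t % 3], t) for t in range(n)]
    chain = ["Launch"] + tasks + ["Report"]
    return list(zip(chain, chain[1:]))


def longest_path(arcs, start="Launch"):
    """Number of nodes on the longest path from start (graph must be acyclic)."""
    succ = {}
    for tail, head in arcs:
        succ.setdefault(tail, []).append(head)
    memo = {}

    def visit(node):
        if node not in memo:
            memo[node] = 1 + max((visit(n) for n in succ.get(node, [])), default=0)
        return memo[node]

    return visit(start)


ed_depth = longest_path(embarrassingly_distributed(9))
hc_depth = longest_path(helical_chain(9))
```

With nine tasks each, the independent layout has a longest path of three nodes while the chain has eleven, so the first can run all tasks at once and the second cannot overlap any of them.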
Data Dependent Patterns

Automatic Trace Analysis Using OLAP
  • Intermittent patterns
    • Useful for application performance tuning
  • Visualization is important
    • Exploits the human eye's ability to detect patterns
  • Automatic Pattern Mining
    • OLAP approach
  • MPI communication patterns
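
An OLAP-style roll-up over a communication trace can be sketched as follows. The record layout (time step, source rank, destination rank, bytes), the sample data, and the function name `roll_up` are all hypothetical; a real trace would come from an MPI tracing tool.

```python
from collections import defaultdict

# Hypothetical trace records: (time_step, source_rank, dest_rank, bytes).
TRACE = [
    (0, 0, 1, 100), (0, 1, 2, 100), (0, 2, 0, 100),
    (1, 0, 1, 100), (1, 1, 2, 100), (1, 2, 0, 100),
    (1, 0, 2, 4000),
]

# Position of each dimension inside a trace record.
FIELDS = {"step": 0, "src": 1, "dst": 2}


def roll_up(trace, dims):
    """Sum message bytes over all records that share the chosen dimensions."""
    cube = defaultdict(int)
    for record in trace:
        key = tuple(record[FIELDS[d]] for d in dims)
        cube[key] += record[3]
    return dict(cube)


by_pair = roll_up(TRACE, ["src", "dst"])
by_step = roll_up(TRACE, ["step"])
```

In this made-up trace the roll-up by rank pair shows a regular ring plus one large transfer from rank 0 to rank 2, and the roll-up by step shows that it happens only in step 1, which is the kind of intermittent pattern the slide refers to.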
Conclusions

Data Flow in Applications

  • Application Parallelization
  • Application Understanding
  • Application Mapping
  • Application Performance