scalable system for large unstructured mesh simulation

Scalable System for Large Unstructured Mesh Simulation

Miguel A. Pasenau, Pooyan Dadvand, Jordi Cotela, Abel Coll and Eugenio Oñate

overview
Overview
  • Introduction
  • Preparation and Simulation
    • More Efficient Partitioning
    • Parallel Element Splitting
  • Post Processing
    • Results Cache
    • Merging Many Partitions
    • Memory usage
    • Off-screen mode
  • Conclusions, Future lines and Acknowledgements
introduction
Introduction
  • Education: Masters in Numerical Methods, training courses, seminars, etc.
  • Publishers: magazines, books, etc.
  • Research: PhD’s, congresses, projects, etc.
  • One of the International Centers of Excellence on Simulation-Based Engineering and Sciences [Glotzer et al., WTEC Panel Report on International Assessment of Research and Development in Simulation Based Engineering and Science. World Technology Evaluation Center (wtec.org), 2009].
introduction1
Introduction
  • Simulation: structures
introduction2
Introduction
  • CFD: Computational Fluid Dynamics
introduction3
Introduction
  • Geomechanics
  • Industrial forming processes
  • Electromagnetism
  • Acoustics
  • Bio-medical engineering
  • Coupled problems
  • Earth sciences
introduction4

Introduction
  • Simulation

[Workflow diagram: geometry description (provided by CAD or using GiD) → preparation of analysis data → computer analysis → visualization of results, with GiD covering pre- and post-processing]

introduction5
Introduction
  • Analysis data generation:
    • Read in and correct CAD data
    • Assignment of boundary conditions
    • Definition of analysis parameters
    • Generation of analysis data
    • Assignment of material properties, etc.

introduction6
Introduction
  • Visualization of Numerical Results
    • Deformed shapes, temperature distributions, pressures, etc.
    • Vector and contour plots, graphs
    • Line diagrams, result surfaces
    • Animated sequences
    • Particle line flow diagrams
introduction7
Introduction
  • Goal: do a CFD simulation with 100 Million elements using in-house tools
  • Hardware: cluster with
    • Master node: 2 x Intel Quad Core E5410, 32 GB RAM
    • 3 TB disc with dedicated Gigabit to Master node
    • 10 nodes: 2 x Intel Quad Core E5410 and 16 GB RAM
    • 2 nodes: 2 x AMD Opteron Quad Core 2356 and 32 GB RAM
    • Total of 96 cores, 224 GB RAM available
    • Infiniband 4x DDR, 20 Gbps
introduction8
Introduction
  • Airflow around an F1 car model
introduction9
Introduction
  • Kratos:
    • Multi-physics, open source framework
    • Parallelized for shared and distributed memory machines
  • GiD:
    • Geometry handling and data management
    • First coarse mesh
    • Merging and post-processing results
introduction10
Introduction

[Workflow diagram: geometry, conditions and materials → coarse mesh generation → partition → distribution (communication plan) → refinement → calculation producing partitions 1…n with results 1…n → merge → visualize]

overview2
Overview
  • Introduction
  • Preparation and Simulation
    • More Efficient Partitioning
    • Parallel Element Splitting
  • Post Processing
    • Results Cache
    • Merging Many Partitions
    • Memory usage
    • Off-screen mode
  • Conclusions, Future lines and Acknowledgements
meshing
Meshing
  • Single workstation: limited memory and time
  • Three steps:
    • Single node: GiD generates a coarse mesh with 13 Million tetrahedrons
    • Single node: Kratos + Metis divide and distribute the mesh
    • In parallel: Kratos refines the mesh locally
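
The three-step flow above can be sketched roughly as follows. This is an illustrative toy only: the element lists stand in for real tetrahedral meshes, and `partition`/`refine` are stand-ins for what GiD, Kratos and Metis actually do.

```python
# Toy sketch of the three-step meshing flow (illustrative stand-ins;
# GiD generates the coarse mesh, Kratos + Metis partition and refine).

def partition(elements, n_parts):
    """Step 2: split the coarse element list into n_parts chunks."""
    size, rem = divmod(len(elements), n_parts)
    parts, start = [], 0
    for p in range(n_parts):
        end = start + size + (1 if p < rem else 0)
        parts.append(elements[start:end])
        start = end
    return parts

def refine(part):
    """Step 3: local refinement, each tetrahedron yields 8 children."""
    return [(elem, child) for elem in part for child in range(8)]

coarse = list(range(13))              # stand-in for the 13M coarse mesh
parts = partition(coarse, 4)          # divide and distribute
refined = [refine(p) for p in parts]  # done independently on each node
```

The point of the split is that only steps 1 and 2 touch the whole mesh; the expensive refinement in step 3 happens on the already-distributed pieces.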
preparation and simulation
Preparation and simulation

[Workflow diagram: geometry, conditions and materials → coarse mesh generation → partition → distribution (communication plan) → refinement → calculation producing partitions 1…n with results 1…n → merge → visualize]

efficient partitioning before
Efficient partitioning: before
  • Rank 0 reads the model, partitions it and sends the partitions to the other ranks

[Diagram: Rank 0 holds the whole model and distributes partitions to Ranks 1, 2 and 3]

efficient partitioning before2
Efficient partitioning: before
  • Requires large memory on node 0
  • Uses cluster time for partitioning, which could be done outside the cluster
  • Each rerun needs repartitioning
  • Same working procedure for OpenMP and MPI runs
efficient partitioning now
Efficient partitioning: now
  • The partitions are divided and written on another machine
  • Each rank reads its own data separately
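
A sketch of the rank-local input idea, assuming a one-file-per-rank naming scheme (the `model_<rank>.mdpa` pattern here is an illustration, not necessarily the actual Kratos layout):

```python
import os
import tempfile

def partition_filename(base, rank):
    # assumed naming convention: one partition file per MPI rank
    return f"{base}_{rank}.mdpa"

def read_my_partition(base, rank):
    # each rank opens only its own file: no rank-0 memory bottleneck,
    # and reruns can reuse the partitions written offline
    with open(partition_filename(base, rank)) as f:
        return f.read()

# stand-in for the partitioner writing files on another machine
base = os.path.join(tempfile.mkdtemp(), "model")
for r in range(4):
    with open(partition_filename(base, r), "w") as f:
        f.write(f"partition {r}")

print(read_my_partition(base, 2))  # prints: partition 2
```

The design choice is that partitioning becomes an offline preprocessing step, so the cluster itself only ever does parallel reads.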
preparation and simulation1
Preparation and simulation

res. 1

part 1

res. 2

part 2

·

·

·

·

·

·

Geometry

Partition

Merge

Conditions

Distribution

Visualize

Materials

Communication plan

part n

res. n

Coarse mesh generation

Refinement

Calculation

local refinement triangle
Local refinement: triangle

[Diagram: triangle refinement cases for one, two or three marked edges, with corner nodes i, j, k, mid-edge nodes l, m, n and the resulting child elements 1–4]

local refinement triangle1
Local refinement: triangle
  • The refinement case is selected according to node Ids
  • The decision is not made for best element quality!
  • It is very well suited for parallelization
    • OpenMP
    • MPI

[Diagram: the two-marked-edge refinement cases, with the diagonal chosen by node Ids]

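The Id-based selection can be illustrated with a small sketch. The lowest-Id rule below is an assumption for illustration; the slides state only that node Ids, not element quality, drive the choice.

```python
def split_diagonal(tri):
    """Pick the refinement diagonal for a triangle with two marked
    edges using only global node Ids. Because the rule ignores local
    node ordering, every OpenMP thread or MPI rank computes the same
    split without any communication. (Lowest-Id rule is an assumed
    example of such a deterministic criterion.)
    """
    i, j, k = sorted(tri)   # order corners by global Id
    return (i, (j, k))      # diagonal: lowest-Id corner to the
                            # midpoint of the opposite edge

# The same triangle handed over in different local orderings
# always gets the same diagonal:
assert split_diagonal((7, 3, 5)) == split_diagonal((5, 7, 3))
```

Any rule with this property works; determinism is what removes the need for threads or ranks to agree explicitly.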
local refinement tetrahedron
Local refinement: tetrahedron

[Diagram: a father tetrahedron and its child elements]

local refinement uniform
Local refinement: uniform
  • Uniform refinement can be used to obtain a mesh with 8 times more elements
    • It does not improve the geometry representation
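
Since each sweep multiplies the element count by 8, the growth is simple arithmetic; the figures below just combine the coarse-mesh size and target mentioned earlier.

```python
def refined_count(n_elements, sweeps):
    # each tetrahedron splits into 8 children per uniform sweep
    return n_elements * 8 ** sweeps

# one uniform sweep takes the 13 Million coarse mesh past 100 Million:
print(refined_count(13_000_000, 1))  # prints: 104000000
```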
introduction11
Introduction

[Workflow diagram: geometry, conditions and materials → coarse mesh generation → partition → distribution (communication plan) → refinement → calculation producing partitions 1…n with results 1…n → merge → visualize]

parallel calculation
Parallel calculation
  • Calculated using 12 x 8 = 96 MPI processes
  • Less than 1 day for 400 time steps
  • About 180 GB memory usage
  • Single volume mesh of 103 Million tetrahedrons split into 96 files (each holding a mesh portion and its results)
overview3
Overview
  • Introduction
  • Preparation and Simulation
    • More Efficient Partitioning
    • Parallel Element Splitting
  • Post Processing
    • Results Cache
    • Merging Many Partitions
    • Memory usage
    • Off-screen mode
  • Conclusions, Future lines and Acknowledgements
post processing
Post processing

[Workflow diagram: geometry, conditions and materials → coarse mesh generation → partition → distribution (communication plan) → refinement → calculation producing partitions 1…n with results 1…n → merge → visualize]

post process
Post-process
  • Challenges to face:
    • Single node
    • Big files: tens or hundreds of GB
    • Merging lots of files
    • Batch post-processing
    • Maintaining generality
big files results cache
Big Files: results cache
  • Uses a user-definable memory pool to store results.
  • Used to cache results stored in files.

[Diagram: a user-definable memory pool holding mesh information, temporal results, results read from files (single, multiple, merged) and created results (cuts, extrusions, Tcl)]

big files results cache1
Big Files: results cache

[Diagram: the results cache table holds RC entries (timestamp, result, RC info); each RC info records file, offset, type and memory footprint; an open files table maps files to handles and types; the granularity is one result]

big files results cache2
Big Files: results cache
  • Verifies the result's file(s) and gets each result's position in the file and its memory footprint.
  • Results of the latest analysis step are kept in memory.
  • Other results are loaded on demand.
  • The oldest results are unloaded if needed.
  • Entries are touched on use.
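
This policy amounts to an LRU cache with a byte budget. A minimal sketch follows; the class and its interface are illustrative, not GiD's actual code, which also tracks files, offsets and result types.

```python
from collections import OrderedDict

class ResultsCache:
    """LRU results cache with a fixed memory pool (illustrative sketch)."""

    def __init__(self, pool_bytes):
        self.pool = pool_bytes
        self.used = 0
        self.entries = OrderedDict()       # key -> (result, footprint)

    def get(self, key, load, footprint):
        if key in self.entries:
            self.entries.move_to_end(key)  # "touch on use"
            return self.entries[key][0]
        # unload oldest results until the new one fits in the pool
        while self.entries and self.used + footprint > self.pool:
            _, (_, size) = self.entries.popitem(last=False)
            self.used -= size
        result = load()                    # loaded on demand from file
        self.entries[key] = (result, footprint)
        self.used += footprint
        return result

cache = ResultsCache(pool_bytes=100)
cache.get("step_1/pressure", lambda: "P1", footprint=60)
cache.get("step_2/pressure", lambda: "P2", footprint=60)  # evicts step 1
```

The fixed pool is what lets a 104 GB results file be browsed within a few GB of RAM: only the working set of results is ever resident.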
big files results cache3
Big Files: results cache
  • Chinese harbour:
    • 104 GB results file
    • 7.6 Million tetrahedrons
    • 2,292 time steps
    • 3.16 GB memory usage (with a 2 GB results cache)

merging many partitions
Merging many partitions
  • Before: 2, 4, ... 10 partitions
  • Now: 32, 64, 128, ... partitions of a single volume mesh
  • Postpone any calculation:
    • Skin extraction
    • Finding boundary edges
    • Smoothed normals
    • Neighbour information
    • Graphical objects creation
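
"Postpone any calculation" is plain lazy evaluation: derived data is built once, on first request, over the merged mesh, instead of once per partition during the merge. A minimal sketch, with an assumed class and method names:

```python
class MergedMesh:
    """Sketch of deferring derived data until first use (names assumed)."""

    def __init__(self, elements):
        self.elements = elements
        self._skin = None                # nothing computed at merge time

    @property
    def skin(self):
        if self._skin is None:           # first access triggers the work,
            self._skin = self._extract_skin()  # over the merged mesh
        return self._skin

    def _extract_skin(self):
        # placeholder for the real surface-extraction pass
        return [e for e in self.elements if e.get("boundary")]

mesh = MergedMesh([{"boundary": True}, {"boundary": False}])
# merging finished without extracting anything; the skin is only
# built when something actually asks for it:
surface = mesh.skin
```

The same deferral applies to boundary edges, smoothed normals, neighbour information and graphical objects, which is where the merge speedups in the next slides come from.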
merging many partitions1
Merging many partitions

Telescope example: 23,870,544 tetrahedrons

  • Before, 32 partitions: 24' 10"
  • After, 32 partitions: 4' 34"
  • After, 128 partitions: 10' 43"
  • After, single file: 2' 16"

merging many partitions3
Merging many partitions

Racing car example: 103,671,344 tetrahedrons

  • Before, 96 partitions: > 5 hours
  • After, 96 partitions: 51' 21"
  • After, single file: 13' 25"

memory usage
Memory usage
  • Around 12 GB of memory used, with a spike of 15 GB (MS Windows) or 17.5 GB (Linux), including:
    • Volume mesh (103 Million tetrahedrons)
    • Skin mesh (6 Million triangles)
    • Several surface and cut meshes
    • Stream line search tree
    • 2 GB of results cache
    • Animations
batch post processing off screen
Batch post-processing: off-screen
  • GiD with no interaction and no window
  • Command line:

gid -offscreen [WxH] -b+g batch_file_to_run

  • Useful to:
    • launch costly animations in the background or in a queue
    • use GiD as a template generator
    • use GiD behind a web server: Flash Video animation
  • Animation window: a button was added to generate the batch file for off-screen GiD, ready to be sent to a batch queue.
overview4
Overview
  • Introduction
  • Preparation and Simulation
    • More Efficient Partitioning
    • Parallel Element Splitting
  • Post Processing
    • Results Cache
    • Merging Many Partitions
    • Memory usage
    • Off-screen mode
  • Conclusions, Future lines and Acknowledgements
conclusions
Conclusions
  • The implemented improvements helped us to achieve the milestone:

Prepare, mesh, calculate and visualize a CFD simulation with 103 Million tetrahedrons

  • GiD: modest machines also benefit from these improvements
future lines
Future lines
  • Faster tree creation for stream lines.
    • Now: ~90 s creation time, 2-3 s per stream line
  • Mesh simplification, LOD
    • geometry and results criteria
    • Surface meshes, iso-surfaces, cuts: faster drawing
    • Volume meshes: faster cuts, stream lines
    • Near real-time
  • Parallelize other algorithms in GiD:
    • Skin and boundary edges extraction
    • Parallel cuts and stream lines creation
challenges
Challenges
  • 10^9 – 10^10 tetrahedrons, 6·10^8 – 6·10^9 triangles
  • Large workstation with Infiniband to cluster and 80 GB or 800 GB RAM? Hard disk?
  • Post process as backend of a web server in cluster? Security issues?
  • Post process embedded in solver?
  • Output of both: the original mesh and a simplified one?
acknowledgements
Acknowledgements
  • Ministerio de Ciencia e Innovación, E-DAMS project
  • European Commission, Real-time project
thanks for your attention

Thanks for your attention

Scalable System for Large Unstructured Mesh Simulation
