PARSEC FACESIM

PARSEC FACESIM: study and parallelization
Dmitri Makarov, Dmitri Shtilman

facesim facts
• iterative numerical methods: data parallel (floating-point computation)
• size of the problem: ~370K tetrahedra
• included in PARSEC; C++ application
• parallelized: taskQ, a thread pool with a custom barrier implementation
• not scalable beyond 16 threads (with 128 threads the speedup is only 16x on T2+)
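To put the last figure in perspective, the sketch below turns "16x on 128 threads" into a parallel efficiency and into the serial fraction that Amdahl's law would imply. Amdahl's law is not part of the slides; it is used here only as a rough model, and it assumes the serial fraction is the sole limit on scaling, which the issues listed later (barrier cost, resource contention) suggest is not the whole story. The function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Parallel efficiency: achieved speedup divided by the number of threads.
double parallel_efficiency(double speedup, int threads) {
    assert(threads > 0);
    return speedup / threads;
}

// Serial fraction s implied by Amdahl's law, S = 1 / (s + (1 - s) / N),
// solved for s: s = (N / S - 1) / (N - 1).
// Assumption: the serial fraction is the ONLY thing limiting the speedup.
double amdahl_serial_fraction(double speedup, int threads) {
    assert(threads > 1 && speedup > 0.0);
    const double n = static_cast<double>(threads);
    return (n / speedup - 1.0) / (n - 1.0);
}
```

For the numbers on the slide, `parallel_efficiency(16.0, 128)` gives 0.125 (12.5%), and `amdahl_serial_fraction(16.0, 128)` gives 7/127, roughly 5.5%. Under this simplified model, a serial share of only a few percent would already be enough to cap the speedup at 16x.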

facesim scaling on T2+

facesim cputrack profiling

facesim issues
• in-house barrier:
  • pthread_cond_wait()
  • pthread_cond_signal()
• not parallelized stages of the simulation
• overhead of tasks, extra computations
• resource contention: 128 threads
  • 32 FPUs
  • 32 LSs

improvements
• reworked the thread pool implementation:
  • N-1 threads created before the simulation starts
  • each thread works on its partition of data
  • master sets a flag when work is ready
  • all wait at barrier for everyone else to finish task
  • no need to add tasks to queue (same entry point)
• use spin barrier instead of pthread barrier
• parallelized sequential stages
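The reworked scheme can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: `SpinBarrier`, `simulate`, the per-element `+= 1.0` update, and the use of C++ `std::thread` and `std::atomic` are all hypothetical stand-ins for the facesim internals, which the slides do not show. The barrier also yields while spinning, so the sketch stays well behaved on machines with fewer cores than threads.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical spin barrier: threads poll a generation counter instead of
// sleeping on a condition variable.
class SpinBarrier {
public:
    explicit SpinBarrier(int n) : n_(n) {}
    void wait() {
        const int gen = generation_.load();
        if (waiting_.fetch_add(1) == n_ - 1) {
            waiting_.store(0);          // last arriver resets the count...
            generation_.fetch_add(1);   // ...then releases everyone
        } else {
            while (generation_.load() == gen) std::this_thread::yield();
        }
    }
private:
    const int n_;
    std::atomic<int> waiting_{0};
    std::atomic<int> generation_{0};
};

// Runs `steps` rounds over `data` with n threads: the master plus n - 1
// workers that are created once, before the first round.
void simulate(std::vector<double>& data, int n, int steps) {
    assert(n >= 1);
    SpinBarrier barrier(n);
    std::atomic<int> ready{0};      // flag the master bumps when work is ready
    std::atomic<bool> stop{false};

    // Same entry point for every thread and round: update own partition.
    auto work = [&](int tid) {
        const std::size_t begin = data.size() * static_cast<std::size_t>(tid) / n;
        const std::size_t end = data.size() * static_cast<std::size_t>(tid + 1) / n;
        for (std::size_t i = begin; i < end; ++i) data[i] += 1.0;  // placeholder
    };

    std::vector<std::thread> workers;
    for (int tid = 1; tid < n; ++tid) {
        workers.emplace_back([&, tid] {
            int seen = 0;
            for (;;) {
                int r;
                while ((r = ready.load()) == seen) {   // spin until work is ready
                    if (stop.load()) return;
                    std::this_thread::yield();
                }
                seen = r;
                work(tid);
                barrier.wait();     // wait for everyone else to finish
            }
        });
    }

    for (int s = 0; s < steps; ++s) {
        ready.fetch_add(1);         // master: work is ready, no task queue
        work(0);                    // master processes its own partition
        barrier.wait();
    }
    stop.store(true);
    for (auto& t : workers) t.join();
}
```

Calling `simulate(v, 4, 5)` on a zero-filled vector leaves every element at 5.0: each of the five rounds touches each element exactly once, because the partitions are disjoint and no thread starts the next round before all have reached the barrier.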

results

other platforms

observations
• generic API may be portable but inefficient lib implementation could kill performance
• C++ can introduce a lot of redundancy if not used carefully:
  • new T[N] calls T() sequentially N times
  • library should implement default constructors as efficiently as possible
• flexibility has a cost:
  • load balancing overhead can outweigh its benefits
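The `new T[N]` point is easy to reproduce in isolation. The sketch below uses a hypothetical element type, `Counted`, whose default constructor increments a counter; it is not a PhysBAM class, only an illustration of the language rule.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical element type that counts default-constructor invocations.
struct Counted {
    static std::size_t constructed;
    double value;
    Counted() : value(0.0) { ++constructed; }
};
std::size_t Counted::constructed = 0;

// Allocates n elements with new T[n] and reports how many default
// constructor calls that single expression triggered.
std::size_t constructor_calls_for_array(std::size_t n) {
    const std::size_t before = Counted::constructed;
    std::unique_ptr<Counted[]> array(new Counted[n]);  // runs Counted() n times
    return Counted::constructed - before;
}
```

`constructor_calls_for_array(1000)` returns 1000: all constructor calls run one after another in the allocating thread, so a large allocation with a non-trivial `T()` becomes a serial stage. By contrast, `new T[N]` performs no per-element initialization at all when `T` is trivially default constructible, which is one reason cheap default constructors matter in a library.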

future work
• apply our observations to other PhysBAM-based applications
• study and optimize cloth simulation
• port facesim to CUDA, OpenCL (real-time rendering)
• implement important parts of PhysBAM in Scala
