
Parallel Computing Research

Parallel Computing Research. L.V. (Sanjay) Kale Professor Dept. of Computer Science http://www.ks.uiuc.edu/Research/namd. Overview. Research at PPL Develop technology that improves: performance of parallel applications programmer productivity Load balancing issues


Presentation Transcript


  1. Parallel Computing Research
     L.V. (Sanjay) Kale, Professor, Dept. of Computer Science
     http://www.ks.uiuc.edu/Research/namd

  2. Overview
     • Research at PPL
     • Develop technology that improves:
       • performance of parallel applications
       • programmer productivity
     • Load balancing issues
     • Communication optimizations
     • Parallel algorithms
     • Collaboration: CSE applications

  3. [Diagram: Charm++ at the center, surrounded by collaborative applications]
     • Core: Charm++ (Parallel Objects, Adaptive Runtime System, Libraries and Tools)
     • Applications: Protein Folding, Quantum Chemistry (QM/MM), Molecular Dynamics, Computational Cosmology, Crack Propagation, Dendritic Growth, Space-time meshes, Rocket Simulation
     • Enabling CS technology of parallel objects and intelligent runtime systems has led to several collaborative applications in CSE

  4. Charm++ in wider use
     • Applications are using Charm++
     • Adding to its stability, robustness
     • Rocket simulation (ASCI center)
     • Computational Cosmology (Astrophysics)
     • QM (Car-Parrinello method)
     • Crack propagation
     • Space-time meshes in process simulation
     • Large data visualization

  5. Blue Gene
     • Blue Gene/L
     • 64K dual-processor nodes
     • Targeted peak performance 180/360 TF/s
     • Simulation and performance prediction
     • Demonstrated efficient parallelization of skeletal MD program

  6. Collective Communication
     • Performance impediment
     • Issues
       • Communication latencies not scaling with bandwidth and processor speeds
       • High software overhead (α)
       • Synchronous operations (MPI_Alltoall) do not utilize the coprocessor effectively
     • All-to-all personalized communication (AAPC)
       • Each processor has P messages to send
       • Dominated by software overhead
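The claim that AAPC is dominated by software overhead can be made concrete with the usual α-β cost model. This is a rough sketch, not a formula from the slides: α is the per-message software overhead named above, while the message size m and the per-byte transfer cost β are symbols assumed here for illustration. With direct sends, each processor issues one message to each of the other P-1 processors:

```latex
T_{\text{direct}} \approx (P - 1)\,(\alpha + m\,\beta),
\qquad
T_{\text{direct}} \approx (P - 1)\,\alpha \quad \text{when } m\,\beta \ll \alpha .
```

For small messages the second form applies, so the cost grows with the number of messages rather than with the volume of data, which is why reducing the message count is the natural target.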

  7. Optimizing AAPC
     • Message combining for small messages
       • Reduce the total number of messages
     • Messages sent along a virtual topology
       • Multistage algorithm to send messages
       • Groups of messages are combined and sent to intermediate processors, which then forward them to the final destinations
     • Using virtual topologies reduces the software overhead of sending messages

  8. Virtual Topology: Mesh
     • Organize processors in a 2D (virtual) mesh
     • Message from (x1,y1) to (x2,y2) goes via (x1,y2)
     • 2*√P messages instead of P-1
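The mesh routing rule can be checked with a small simulation. The following is a minimal sketch, not the Charm++ implementation: the function name `mesh_all_to_all` and the data structures are hypothetical, and P is assumed to be a perfect square. It routes every payload from (x1,y1) to (x2,y2) via (x1,y2), combining payloads that share a hop, and counts the combined messages each processor sends.

```python
import math
from collections import defaultdict


def mesh_all_to_all(P):
    """Simulate AAPC on a sqrt(P) x sqrt(P) virtual mesh (hypothetical sketch).

    Returns (sent, delivered): combined messages sent per processor, and
    the (source, destination) payloads that arrive at each processor.
    """
    side = math.isqrt(P)
    if side * side != P:
        raise ValueError("this sketch assumes P is a perfect square")

    procs = [(x, y) for x in range(side) for y in range(side)]
    sent = defaultdict(int)

    # Phase 1: (x1, y1) groups its payloads by the destination's y2 and
    # sends one combined message to the intermediate processor (x1, y2).
    staged = defaultdict(list)
    for (x1, y1) in procs:
        by_y = defaultdict(list)
        for dst in procs:
            by_y[dst[1]].append(((x1, y1), dst))
        for y2, bundle in by_y.items():
            if y2 != y1:  # a processor keeps its own bundle locally
                sent[(x1, y1)] += 1
            staged[(x1, y2)].extend(bundle)

    # Phase 2: intermediate (x1, y2) groups what it holds by the
    # destination's x2 and sends one combined message to (x2, y2).
    delivered = defaultdict(list)
    for (x1, y2), payloads in staged.items():
        by_x = defaultdict(list)
        for src, dst in payloads:
            by_x[dst[0]].append((src, dst))
        for x2, bundle in by_x.items():
            if x2 != x1:
                sent[(x1, y2)] += 1
            delivered[(x2, y2)].extend(bundle)

    return sent, delivered
```

For P = 16, every processor sends 2(√P - 1) = 6 combined messages instead of 15 direct ones, and each processor still receives one payload from every source. The trade-off is that each payload now travels two hops, so the data volume sent roughly doubles while the per-message overhead α is paid far fewer times.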

  9. NAMD Performance on Lemieux
