
Claude Tadonki Laboratoire de l’Accélérateur Linéaire/IN2P3/CNRS University of Orsay

Claude Tadonki, Laboratoire de l’Accélérateur Linéaire/IN2P3/CNRS, University of Orsay, Orsay, France. claude.tadonki@u-psud.fr. 1st Workshop on Applications for Multi and Many Core Architectures.


Presentation Transcript


  1. Claude Tadonki, Laboratoire de l’Accélérateur Linéaire/IN2P3/CNRS, University of Orsay, Orsay, France. claude.tadonki@u-psud.fr. 1st Workshop on Applications for Multi and Many Core Architectures, 22nd International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD 2010), October 27–30, 2010, Petrópolis, Rio de Janeiro, Brazil.

  2. Ring pipelined algorithm for the algebraic path problem on the CELL Broadband Engine. C. TADONKI. The Algebraic Path Problem

  3. The Warshall-Floyd Algorithm

  4. Shift-toroidal Reindexation (Kung-Lo-Lewis, 1987)

  5. The CELL Broadband Engine

  6. Ring Pipelined Algorithm for the APP (algorithm)

  7. Ring Pipelined Algorithm for the APP (algorithm). Interesting properties of our algorithm:
     - Can run with any number of processors p <= N (natural LPGS)
     - Generic tiling applies (LSGP by blocking)
     - Each processor only requires a buffer of size bN (block of size b)
     - Fully pipelined process with local synchronization only
     - Perfect computation-communication overlap

  8. Ring Pipelined Algorithm for the APP (implementation on the CELL BE):
     - PPE-DMA is issued only by the first and the last processor
     - Inner SPEs communicate and synchronize locally
     - Computation-communication overlap occurs for all communications
     - Can run on more SPEs or CELL Blades by natural extension

  9. Performance

  10. Conclusion and Perspectives:
     - Our ring SPMD algorithm suits the CELL BE and scales well
     - Communication and synchronization yield less than 5% overhead
     - Absolute performance can be improved by optimizing the APP kernel
     - Close to 80% of the peak performance expected
     - Our scheduling can be applied to similar problems

  11. END & QUESTIONS
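
The transcript keeps only the titles of slides 2 and 3 (the Algebraic Path Problem and the Warshall-Floyd algorithm), so what follows is a textbook sketch, not the author's formulation. It shows the usual triple loop written over an abstract pair of operations, so that shortest paths and transitive closure come out as two instances of the same sweep. The names `warshall_floyd`, `plus`, `times` and the sample matrices are illustrative assumptions, and the closure operator of the general APP is left out.

```python
import operator

INF = float("inf")


def warshall_floyd(a, plus, times):
    """Textbook Warshall-Floyd sweep over a semiring (plus, times).

    Assumption: the closure of each diagonal pivot is the multiplicative
    identity. This holds for (min, +) with a zero diagonal and no negative
    cycles, and for (or, and). The general APP applies a closure operator
    to the pivot, which this sketch omits.
    """
    n = len(a)
    d = [list(row) for row in a]  # work on a copy of the input
    for k in range(n):            # pivot index
        for i in range(n):
            for j in range(n):
                d[i][j] = plus(d[i][j], times(d[i][k], d[k][j]))
    return d


# Instance 1: shortest paths, plus = min and times = +.
weights = [
    [0, 3, INF, 7],
    [8, 0, 2, INF],
    [5, INF, 0, 1],
    [2, INF, INF, 0],
]
shortest = warshall_floyd(weights, min, operator.add)

# Instance 2: reachability by paths of length >= 1, plus = or and times = and.
edges = [
    [False, True, False],
    [False, False, True],
    [False, False, False],
]
closure = warshall_floyd(edges, operator.or_, operator.and_)
```

Passing the two operations as arguments keeps the loop nest identical across problems, which is the point of the algebraic formulation: only the semiring changes.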

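
Slides 4 to 8 describe a ring pipeline built on the shift-toroidal reindexation, but the transcript does not preserve the algorithm itself. The block below is a minimal sequential simulation in that spirit, under several simplifying assumptions: one pivot row per stage (block size b = 1), N logical stages chained in a line instead of p physical SPEs on a ring, the (min, +) semiring with a zero diagonal and no negative cycles, and a rotation of rows only (the column shift is not reproduced). The names `stage` and `ring_pipeline_shortest_paths` are hypothetical.

```python
INF = float("inf")


def stage(k, rows):
    """Pipeline stage k, modelled as a generator over (index, row) pairs.

    The first row received is pivot row k. Every later row is updated
    against it and forwarded at once. The pivot is forwarded last, so the
    next stage receives its own pivot first (toroidal row shift).
    """
    rows = iter(rows)
    pivot_idx, pivot = next(rows)
    assert pivot_idx == k, "rows must arrive rotated by k positions"
    for i, row in rows:
        d_ik = row[k]
        yield i, [min(x, d_ik + p_kj) for x, p_kj in zip(row, pivot)]
    yield pivot_idx, pivot


def ring_pipeline_shortest_paths(dist):
    """Stream the rows of dist through N chained stages (small N only,
    since the generators are nested N deep)."""
    n = len(dist)
    stream = iter(enumerate([list(r) for r in dist]))
    for k in range(n):
        stream = stage(k, stream)  # stage k consumes the output of stage k-1
    result = [None] * n
    for i, row in stream:          # after N shifts the rows are back in order
        result[i] = row
    return result


weights = [
    [0, 3, INF, 7],
    [8, 0, 2, INF],
    [5, INF, 0, 1],
    [2, INF, INF, 0],
]
pipelined = ring_pipeline_shortest_paths(weights)
```

Each stage stores a single row of length N, which mirrors the bN buffer bound on slide 7 for b = 1, and it only ever talks to its two neighbours. Mapping the N logical stages onto p <= N processors, the blocking, and the DMA traffic with the PPE are outside this sketch.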