Unified Parallel C

Unified Parallel C. Farhad Kajabadi, Dharmanna Pidagannavar, Sandeep Menon. The PGAS Model (for UPC): concurrent threads with a partitioned shared space. A datum may reference data in other partitions. Global arrays have fragments in multiple partitions.

Presentation Transcript


  1. Unified Parallel C. Farhad Kajabadi, Dharmanna Pidagannavar, Sandeep Menon

  2. The PGAS Model (for UPC)
     Concurrent threads with a partitioned shared space:
     • A datum may reference data in other partitions.
     • Global arrays have fragments in multiple partitions.
     Ref: http://upc.gwu.edu/tutorials.html

  3. UPC Memory Model
     • Each thread can access its own private space and the shared space. The shared space is partitioned into portions, each of which has affinity (a logical association) to one thread.
     • A UPC implementation typically maps a thread, its private space, and the shared-memory portion that has affinity to that thread onto one node.
     • Placing shared data close to the threads that need it most reduces remote memory accesses and can increase performance.

  4. Example Code

        //matrix_vect_mult1.c
        #include <upc_relaxed.h>

        #define N (1000*THREADS)

        shared double A[N][N], X[N], Y[N];

        int main(void)
        {
            int i, j;

            // Y has already been zeroed
            for (i = 0; i < N; i++)
                if (i % THREADS == MYTHREAD)   // this thread handles row i
                    for (j = 0; j < N; j++)
                        Y[i] += X[j] * A[i][j];
            return 0;
        }

     • #include <upc_relaxed.h>: tells the compiler that it may reorder shared accesses whenever doing so improves performance.
     • if (i%THREADS==MYTHREAD): distributes the iterations across the threads in round-robin fashion, so thread t executes iterations t, t+THREADS, t+2*THREADS, and so on.
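     The round-robin test on this slide can be checked without a UPC compiler. The following is a minimal plain-C sketch, not UPC code: the parameters `threads` and `mythread` are hypothetical stand-ins for UPC's THREADS and MYTHREAD, and the function names are invented for illustration. It shows that when every simulated thread runs the same loop with its own `mythread`, each row of the product is computed exactly once.

```c
#include <assert.h>
#include <stddef.h>

/* Plain-C stand-in for one UPC thread's share of the work.
 * `threads` and `mythread` play the roles of THREADS and MYTHREAD
 * (hypothetical parameters; real UPC provides these as built-ins).
 * `a` is an n x n matrix stored row-major; `y` must be zeroed by the caller. */
void matvec_thread_share(size_t n, const double *a, const double *x,
                         double *y, int threads, int mythread)
{
    assert(threads > 0 && mythread >= 0 && mythread < threads);
    for (size_t i = 0; i < n; i++) {
        if (i % (size_t)threads != (size_t)mythread)
            continue;                      /* row i belongs to another thread */
        for (size_t j = 0; j < n; j++)
            y[i] += x[j] * a[i * n + j];
    }
}

/* Runs every simulated thread one after another. Because each row index
 * matches exactly one value of i % threads, each y[i] is written once. */
void matvec_all_threads(size_t n, const double *a, const double *x,
                        double *y, int threads)
{
    for (int t = 0; t < threads; t++)
        matvec_thread_share(n, a, x, y, threads, t);
}
```

     Calling `matvec_thread_share` with `threads = 2` and `mythread = 1` fills only the odd-numbered entries of `y`; calling `matvec_all_threads` fills all of them. In real UPC the threads run concurrently rather than in a sequential loop.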

  5. Data Distribution for Example
     Ref:
     • UPC: Distributed Shared Memory Programming, El-Ghazawi, Carlson, Sterling & Yelick (Wiley, May 2005)
     • http://cnx.org/content/m20649/latest/
     • http://upc.lbl.gov/docs/
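     Since the figure for this slide did not survive, the distribution can be worked out from the declaration in the example. The sketch below is plain C, not UPC, and assumes the default UPC layout implied by a `shared` declaration without a layout qualifier: block size 1, with elements dealt to threads round-robin in row-major order. The function names are hypothetical helpers, not part of any UPC library.

```c
#include <assert.h>
#include <stddef.h>

/* Thread with affinity to element k of a one-dimensional shared array,
 * assuming the default block size of 1 (cyclic distribution). */
int owner_1d(size_t k, int threads)
{
    assert(threads > 0);
    return (int)(k % (size_t)threads);
}

/* Thread with affinity to A[i][j] of an n x n shared array, assuming the
 * elements are dealt out in row-major order with block size 1. */
int owner_2d(size_t i, size_t j, size_t n, int threads)
{
    return owner_1d(i * n + j, threads);
}

/* Number of elements in row i of A that are local to the thread that
 * owns Y[i], i.e. the thread that executes iteration i in the example. */
size_t local_elements_in_row(size_t i, size_t n, int threads)
{
    int me = owner_1d(i, threads);
    size_t count = 0;
    for (size_t j = 0; j < n; j++)
        if (owner_2d(i, j, n, threads) == me)
            count++;
    return count;
}
```

     Under these assumptions, when N is a multiple of THREADS (as in the example), `owner_2d(i, j, ...)` reduces to `j % threads`, so A is effectively spread across threads by column. The thread that computes Y[i] then finds only N/THREADS elements of row i in its own partition, and the remaining accesses to A are remote.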