PaGrid: A Mesh Partitioner for Computational Grids

Virendra C. Bhavsar

Professor and Dean

Faculty of Computer Science

UNB, Fredericton

[email protected]

This work was done in collaboration with Sili Huang and Dr. Eric Aubanel.


Outline

  • Introduction

  • Background

  • PaGrid Mesh Partitioner

  • Experimental Results

  • Conclusion


Advanced Computational Research Laboratory

Virendra C. Bhavsar


ACRL Facilities


ACEnet Project

  • ACEnet (Atlantic Computational Excellence Network) is Atlantic Canada's entry into this national fabric of HPC facilities.

  • A partnership of seven institutions: UNB, MUN, MTA, Dalhousie, StFX, SMU, and UPEI.

  • ACEnet was awarded $9.9M by the CFI in March 2004. The project will be worth nearly $28M.


Mesh Partitioning Problem

[Figure: (a) a heat distribution problem on a metal plate, where the temperature at grid point (i, j) is coupled to its neighbours h(i-1, j), h(i+1, j), h(i, j-1), and h(i, j+1), shown enlarged; (b) the corresponding application graph]


Mesh Partitioning Problem

[Figure: (a) a homogeneous system graph of four processors p0-p3 with unit edge weights; (b) a partition produced by a homogeneous partitioner, with 8 cut edges incident on each of p0-p3 and 16 total cut edges]

  • Map the mesh onto the processors while minimizing the inter-processor communication cost

  • Balance the computational load among processors


Computational Grids

The slide is from the Centre for Unified Computing, University College Cork, Ireland.


Computational Fluid Dynamics

Computational Mechanics

Bioinformatics

Condensed Matter Physics Simulation

The slides are from Fluent.com, the University of California San Diego, George Washington University, and Ohio State University.

Computational Grid Applications


A Computational Grid Model

[Figure: a Grid model with two clusters, Cluster 1 and Cluster 2, containing processors p0-p9; an edge of weight 4 joins the clusters]

  • Computational Grids are heterogeneous in both their processors and their networks


Mesh Partitioning Problem

[Figure: (a) a processor graph over p0-p3 with link weights 1, 1, and 2; (b) the optimal partition from a homogeneous partitioner: 16 total cut edges, total communication cost 40; (c) the optimal partition from a heterogeneous partitioner: 24 total cut edges, total communication cost 32]

Total Communication Cost = Σ over cut edges (u, v) of w(u, v) × W(p(u), p(v)), where p(u) is the processor to which vertex u is mapped.
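The comparison on this slide can be sketched in code. This is an illustrative sketch, not PaGrid's implementation; `partition_metrics` and its inputs are hypothetical names for the application edge list, the vertex-to-processor mapping, and the processor-distance matrix W.

```python
# Illustrative sketch (not PaGrid's code): evaluate a partition by both
# metrics from the slide -- total cut edges and total communication cost.

def partition_metrics(edges, mapping, W):
    """edges: (u, v, weight) triples of the application graph;
    mapping: vertex -> processor; W: (p, q) -> path length between
    processors in the processor graph."""
    cut_edges = 0
    comm_cost = 0
    for u, v, w in edges:
        p, q = mapping[u], mapping[v]
        if p != q:                       # edge is cut
            cut_edges += 1
            comm_cost += w * W[(p, q)]   # weight scaled by network distance
    return cut_edges, comm_cost
```

This makes the slide's point concrete: a heterogeneous partitioner may accept more cut edges (24 instead of 16) when those edges cross cheap links, so the total communication cost still drops (32 instead of 40).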


Background

  • Generic Multilevel Partitioning Algorithm

The slide is from the CEPBA-IBM Research Institute, Spain.


Background

[Figure: a coarsening example in which matched vertices are contracted into multinodes of weight [2] and parallel edges are merged with summed weights]

  • Coarsening phase

    • Matching and contraction.

  • Heavy Edge Matching Heuristic.

[Figure: in heavy edge matching, vertex u is matched with whichever of its neighbours v1, v2 lies on the heaviest incident edge]
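The matching step can be sketched as follows. This is a generic version of the usual heavy-edge matching formulation (visit vertices in random order, pair each with its heaviest unmatched neighbour), not PaGrid's actual code; the function name and data layout are assumptions.

```python
import random

def heavy_edge_matching(adj):
    """adj: vertex -> {neighbour: edge weight}. Returns vertex -> partner;
    a vertex left unmatched is paired with itself and survives contraction
    as a singleton."""
    match = {}
    order = list(adj)
    random.shuffle(order)                # random visiting order
    for u in order:
        if u in match:
            continue
        # among unmatched neighbours, take the one on the heaviest edge
        candidates = [(w, v) for v, w in adj[u].items() if v not in match]
        if candidates:
            _, v = max(candidates)
            match[u], match[v] = v, u
        else:
            match[u] = u
    return match
```

Contraction then merges each matched pair into a multinode whose weight is the sum of the pair's weights, as in the coarsening figure above.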


Background

  • Refinement (Uncoarsening Phase)

    • Kernighan-Lin/Fiduccia-Mattheyses (KL-FM) refinement

      • Refine partitions under load balance constraint.

      • Compute a gain for each candidate vertex.

      • Each step, move a single vertex to a different subdomain.

      • Vertices with negative gains are allowed to migrate.

    • Greedy refinement

      • Similar to KL-FM refinement

      • Vertices with negative gains are not allowed to move
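The greedy variant above can be sketched as a short loop. This is an illustrative sketch under assumptions: a simple per-subdomain size cap stands in for the real load-balance constraint, and `move_gain`/`greedy_refine` are hypothetical names, not PaGrid's API.

```python
def move_gain(v, dest, adj, part):
    """Cut-weight reduction if vertex v moves to subdomain dest."""
    src = part[v]
    external = sum(w for u, w in adj[v].items() if part[u] == dest)
    internal = sum(w for u, w in adj[v].items() if part[u] == src)
    return external - internal

def greedy_refine(adj, part, domains, max_load, passes=3):
    """Move one vertex at a time; negative-gain moves are rejected,
    so the cut weight never increases (unlike KL-FM)."""
    load = {d: sum(1 for v in part if part[v] == d) for d in domains}
    for _ in range(passes):
        moved = False
        for v in adj:
            src = part[v]
            for d in domains:
                if d == src or load[d] >= max_load[d]:
                    continue                  # crude balance constraint
                if move_gain(v, d, adj, part) > 0:
                    part[v] = d
                    load[src] -= 1
                    load[d] += 1
                    moved = True
                    break
        if not moved:
            break
    return part
```

KL-FM differs exactly where the comment says: it tentatively accepts negative-gain moves too, keeping the best configuration seen, which lets it climb out of local minima at the cost of more bookkeeping.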


Background

  • (Computational) Load balancing

    • To balance the load among the processors

    • Allowing a small imbalance can lead to a better partition.

  • Diffusion-based Flow Solutions

    • Determines how much load should be transferred among processors
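The diffusion idea can be sketched as a first-order iteration: each processor repeatedly exchanges a fraction of its load difference with each neighbour. This is a generic textbook scheme, not PaGrid's flow solver; `alpha` (the diffusion parameter) and the function name are assumptions.

```python
def diffuse(load, neighbors, alpha=0.25, iters=50):
    """Each iteration, processor p sends alpha * (load[p] - load[q]) to
    every neighbour q; loads converge toward the network-wide average."""
    load = dict(load)                    # don't mutate the caller's dict
    for _ in range(iters):
        delta = {p: 0.0 for p in load}
        for p in load:
            for q in neighbors[p]:
                delta[p] -= alpha * (load[p] - load[q])
        for p in load:
            load[p] += delta[p]
    return load
```

The accumulated per-edge transfers are the "flow solution": they say how much load each processor should hand to each neighbour to reach balance.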


Mesh Partitioning Tools

    • METIS (Karypis and Kumar, 1995)

    • JOSTLE (Walshaw, 1997)

    • CHACO (Hendrickson and Leland, 1994)

    • PART (Chen and Taylor, 1996)

    • SCOTCH (Pellegrini, 1994)

    • PARTY (Preis and Diekmann, 1996)

    • MiniMax (Kumar, Das, and Biswas, 2002)


Metis

METIS

  • A widely used partitioning tool.

  • In development since 1995.

  • Uses a multilevel partitioning algorithm.

    • Heavy Edge Matching for Coarsening Phase

    • Greedy Refinement algorithm

  • Does not consider the network heterogeneity.


Jostle

JOSTLE

  • In development since 1997.

  • A heterogeneous partitioner

  • Uses a multilevel partitioning algorithm

    • Heavy Edge Matching

    • KL-type refinement algorithm

  • Does not factor in the ratio of communication time to computation time.


PaGrid Mesh Partitioner

  • Grid System Modeling

  • Refinement Cost Function

  • KL-type Refinement

  • Estimated Execution Time Load Balancing


Grid System Modeling

[Figure: a processor graph p0-p1-p2 with link weights 1 and 2, giving path lengths |(p0, p1)| = 1, |(p1, p2)| = 2, and |(p0, p2)| = 3, which are collected in the weighted matrix W]

  • Grid system that contains a set of processors (P) connected by a set of edges (C) –> weighted processor graph S.

  • Vertex weight = relative computational power

    • if p0 is twice as powerful as p1 and |p1| = 0.5, then |p0| = 1

  • Path length = sum of the link weights along the shortest path.

  • A weighted matrix W of size |P| × |P| is constructed, where W(p, q) is the shortest path length between processors p and q.
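The matrix W described above can be filled in with an all-pairs shortest-path computation. A sketch using Floyd-Warshall follows; the slide does not name the algorithm PaGrid actually uses, and the function name is an assumption.

```python
import math

def build_weight_matrix(n, links):
    """links: (p, q) -> link weight of the processor graph.
    Returns W with W[p][q] = shortest path length from p to q."""
    W = [[0 if i == j else math.inf for j in range(n)] for i in range(n)]
    for (p, q), w in links.items():
        W[p][q] = W[q][p] = w            # links are undirected
    for k in range(n):                   # Floyd-Warshall relaxation
        for i in range(n):
            for j in range(n):
                W[i][j] = min(W[i][j], W[i][k] + W[k][j])
    return W
```

On the slide's example (links p0-p1 of weight 1 and p1-p2 of weight 2), this reproduces |(p0, p2)| = 1 + 2 = 3.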


Refinement Cost Function

[Figure: vertices u and v of the application graph mapped to processors p3 and p1 of the processor graph p0-p3]

  • Given a processor mapping cost matrix W, the total mapping cost for a partition is Σ over all edges (u, v) of w(u, v) × W(map(u), map(v)).



Multilevel Partitioning Algorithm

  • Coarsening Phase.

    • Heavy Edge Matching

    • Iterate until the number of vertices in the coarsest graph equals the given number of processors.

  • Initial Partitioning Phase.

    • Assign each vertex to a processor while minimizing the cost function.

  • Uncoarsening Phase.

    • Load balancing based on vertex weights

    • KL-type refinement algorithm.

  • Load balancing based on estimated execution time.


Estimated Execution Time Load Balancing

  • Input is the final partition from the refinement stage.

  • Tries to improve the quality of the final partition in terms of estimated execution time.

  • Execution time for a processor is the sum of the time required for computation and the time required for communication.

  • Execution time is a more accurate metric for the quality of a partition.

  • Uses a KL-type algorithm.


Estimated Execution Time Load Balancing

  • For a processor p with one of its edges (p, q) in the processor graph, let C(p, q) be the total weight of the application edges cut between p and q.

  • Estimated execution time for processor p: T(p) = (computational load of p) / |p| + Σ over edges (p, q) of C(p, q) × W(p, q).

  • Estimated execution time of the application: T = max over all processors p of T(p).
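These quantities can be sketched in code. The exact expressions PaGrid uses did not survive in the transcript, so this simply follows the slide's description (computation time plus communication time, with the application time taken as the maximum over processors); every name here is illustrative.

```python
def estimated_execution_times(load, power, cut, W):
    """load[p]: vertex weight assigned to p; power[p]: relative speed |p|;
    cut[(p, q)]: total weight of application edges cut between p and q;
    W[(p, q)]: path length between p and q in the processor graph.
    Returns per-processor times and the application time (their max)."""
    T = {}
    for p in load:
        comp = load[p] / power[p]                           # computation
        comm = sum(c * W[(r, q)] for (r, q), c in cut.items() if r == p)
        T[p] = comp + comm
    return T, max(T.values())
```

The load-balancing pass then moves vertices off the processor attaining the maximum, since only reducing that processor's time reduces the application's estimated execution time.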


Experimental Results

  • Test application graphs

  • Grid system graphs

  • Comparison with METIS and JOSTLE


Test Application Graphs

[Table: test application graphs; |V| is the total number of vertices and |E| is the total number of edges in each graph]


Grid Systems

[Figures: the 32-processor and 64-processor Grid system graphs]


Metrics

  • Total Communication Cost

  • Maximum Estimated Execution Time


Total Communication Cost

[Chart: Total Communication Cost on the 32-processor Grid system]


Total Communication Cost

  • Average values of Total Communication Cost of PaGrid are similar to those of METIS.

  • Average values of Total Communication Cost of PaGrid are slightly worse than for Jostle.


Maximum Estimated Execution Time

[Chart: Maximum Estimated Execution Time on the 32-processor Grid system]


Maximum Estimated Execution Time

  • The minimum and average values of Execution Time for PaGrid are always lower than for Jostle and METIS, except for graph mrng2, where PaGrid is slightly worse than METIS.

  • Even though PaGrid's results are worse than Jostle's in terms of average Total Communication Cost, PaGrid's Estimated Execution Time Load Balancing generates a lower average Execution Time than Jostle in all cases.


Total Communication Cost

[Chart: Total Communication Cost on the 64-processor Grid system]


Total Communication Cost

  • Average values of Total Communication Cost of PaGrid are better than METIS in most cases, except for graph mrng2 (because of the low ratio of |E|/|V|).

  • Average values of Total Communication Cost of PaGrid are much worse than Jostle in three of five test application graphs.


Maximum Estimated Execution Time

[Chart: Maximum Estimated Execution Time on the 64-processor Grid system]


Maximum Estimated Execution Time

  • The differences between PaGrid and Jostle are amplified:

    • even though PaGrid's results are much worse than Jostle's in terms of average Total Communication Cost, the minimum and average values of Execution Time for PaGrid are much lower than for Jostle.

  • The minimum Estimated Execution Times for PaGrid are always much lower than for METIS, and the average Execution Times for PaGrid are almost always lower than those of METIS, except for application graph mrng2.


Conclusion

  • There is a pressing need for a mesh partitioner that considers the heterogeneity of the processors and networks in a computational Grid environment.

  • Current partitioning tools provide only limited solutions.

  • PaGrid: a heterogeneous mesh partitioner

    • Considers both processor and network heterogeneity.

    • Uses a multilevel graph partitioning algorithm.

    • Incorporates load balancing based on estimated execution time.

  • Experimental results indicate that load balancing based on estimated execution time improves the quality of partitions.


Future Work

  • The cost function can be modified to be based on estimated execution time.

  • Algorithms can be developed to address the repartitioning problem.

  • Parallelization of PaGrid.


Publications

  • S. Huang, E. Aubanel, and V.C. Bhavsar, "PaGrid: A Mesh Partitioner for Computational Grids", Journal of Grid Computing, 18 pages, in press, 2006.

  • S. Huang, E. Aubanel and V. Bhavsar, ‘Mesh Partitioners for Computational Grids: a Comparison’, in V. Kumar, M. Gavrilova, C. Tan, and P. L'Ecuyer (eds.), Computational Science and Its Applications, Vol. 2269 of Lecture Notes in Computer Science, Springer Inc., Berlin Heidelberg New York, pp. 60–68, 2003.


Questions?

