Grid parallelization and tests

CERN, GRACE Final Review, Amsterdam, 15-16 February 2005

Contents
• Two GRACE Grid integration models: M1, M2
• Pre-conditions for the tests
• Work performed
• General test results
• Model 1 test results
• Simulation of Model 2
• Model 2 test results
• Comparison
• Conclusions
GRACE Review February 2005 - Amsterdam

Application workflow
[Figure: single search Grid workflow. Approach used: M1 - M2. Panels: M1, M2]

Pre-conditions
• Release 4.45 of the Content and Categorization Engines was adopted. The partners later improved and optimized these components
• A suitable testing corpus of documents was selected (English documents, correct PDF-to-text conversion, small and large sizes)
• Configuration problems of the GILDA replica manager were solved (with the intervention of the site administrators)
• The search result set size is assumed to be between 0.1 and 4 MB of text on average
• The use of DAG for the job model in GILDA was discarded

Work performed
• Preparation of a test plan and report template
• Creation of the testing corpus of documents
• Verification of testing pre-conditions
• Creation of the test scripts for semi-automatic testing
• Testing on the GILDA testbed
• Creation of scripts for validation of output and parsing of logging
• Collection and analysis of the results

Testing: job submission
[Figure labels: General tests, Model 1, Model 2]
• General (RM, RB, functional, etc.) tests started in October 2004
• Main testing period: November 2004
• More than 1000 jobs submitted

Variable Parameters

Graphs

Results
• Results were collected and published in a study and test report

General tests

Functional tests
• The functional tests were successful. Problems related to the Grid node configuration were experienced and fixed:
  • RB configuration problems
  • RM/SE configuration problems

Performance tests (I)
[Figure labels: depends on input data size; on empty queues; depends on GRACE performance; variable; depends on output data size. I = input size in MB]

Grid overhead
[Figure labels: retrieving, queuing, brokering, submission]
• Grid overhead is 3 minutes on average

Performance tests (II)
• The Grid performed well: job success rate > 80%

Model 1

M1 performance: execution time/input size
[Figure series: Normalization, Categorization]
• Tests performed on machines with different specifications
• The normalization job is the most demanding

Model 2

M2 description
• Search results are split outside the Grid
• Parallel Grid jobs execute text normalization
• Jobs are monitored for status
• Results are stored on the Grid (Replica Manager)
• A Grid categorization job executes:
  • merging of the normalized documents from the SEs
  • categorization processing
• The job is monitored and the results are retrieved
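The M2 steps above can be sketched locally as follows. This is a minimal illustration, not the GRACE implementation: a thread pool stands in for the parallel Grid jobs, and every name (`split_results`, `normalize_chunk`, `categorize`, `run_m2`), the toy normalization, and the keyword-based categorization are assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def split_results(documents, k):
    """Split the result set into at most k contiguous chunks (done outside the Grid)."""
    k = max(1, min(k, len(documents))) if documents else 1
    size, extra = divmod(len(documents), k)
    chunks, start = [], 0
    for i in range(k):
        end = start + size + (1 if i < extra else 0)
        chunks.append(documents[start:end])
        start = end
    return chunks


def normalize_chunk(chunk):
    """Toy stand-in for text normalization: lowercase and collapse whitespace."""
    return [" ".join(doc.lower().split()) for doc in chunk]


def categorize(documents, categories):
    """Toy stand-in for categorization: count documents matching any keyword."""
    return {
        name: sum(any(word in doc for word in keywords) for doc in documents)
        for name, keywords in categories.items()
    }


def run_m2(documents, k, categories):
    """Split, normalize in parallel, merge, then run a single categorization step."""
    chunks = split_results(documents, k)
    # Hypothetical substitute for submitting one Grid job per chunk.
    with ThreadPoolExecutor(max_workers=len(chunks)) as pool:
        normalized_chunks = list(pool.map(normalize_chunk, chunks))
    # Merge step: map() preserves chunk order, so document order is kept.
    merged = [doc for chunk in normalized_chunks for doc in chunk]
    return categorize(merged, categories)
```

In this sketch `k` plays the role of the splitting parameter: normalization runs once per chunk, while merging and categorization run once over the whole set.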

M2 Simulation
[Figure labels: waiting time at UI; computing time; increase due to job submission overhead (α); marked values Kopt1, Kopt2, Kopt]
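The trade-off in the simulation can be illustrated with a simple cost model. The formula below is an assumption made for this sketch, not the model used in the study: the waiting time at the UI is taken as a fixed Grid overhead, plus a submission overhead that grows linearly with the number of parallel jobs K (here α is read as the per-job cost in seconds), plus the computing time divided evenly over K jobs. All parameter names and values are hypothetical.

```python
import math


def waiting_time(k, compute_s, alpha_s, fixed_s=0.0):
    """Assumed model: T(K) = fixed + alpha * K + compute / K, in seconds."""
    return fixed_s + alpha_s * k + compute_s / k


def optimal_k(compute_s, alpha_s):
    """Integer K minimizing the assumed T(K); the real-valued optimum is sqrt(compute / alpha)."""
    k_real = math.sqrt(compute_s / alpha_s)
    # T(K) is convex in K, so the best integer is the floor or the ceiling (at least 1).
    candidates = {max(1, math.floor(k_real)), max(1, math.ceil(k_real))}
    return min(candidates, key=lambda k: (waiting_time(k, compute_s, alpha_s), k))
```

Under this assumption, adding jobs beyond the optimum makes the waiting time grow again, because the submission overhead outweighs the computing time saved.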

M2 performance
[Figure panels: execution time/number of parallel jobs (input size = 2 MB); execution time/input size (splitting parameter = 9). Series: UI waiting time, Normalization, Grid overhead, Categorization]

Comparison M1 and M2
[Figure panels: execution time/input size; computing time/input size. Series: Model 1, Model 2]

Conclusions
• Parallelization improved application performance and lowered the query failure rate
• The Grid performed well: low failure rate, prompt replies from the Grid administrators to problems, good coordination with the GILDA team
