
Presentation Transcript


  1. GRACE Grid parallelization and tests
Roberta Faggian Marque (CERN), with contribution from Jan Fiete Grosse-Oetringhaus
GRACE General Project Meeting, Stuttgart, 14-15 December 2004

  2. Content
• Two GRACE Grid integration models: M1, M2
• Pre-conditions for the tests
• Planning of the Grid tests and first results
• M1 test results/performances
• Simulation of M2
• Parallelization study
• M2 test results/performances
• Comparison
• Conclusions
• EGEE collaboration - update

  3. First approach used: M1 [diagram: application workflow for a single search mapped onto the Grid workflow]

  4. Second approach used: M2 [diagram: application workflow for a single search mapped onto the Grid workflow]

  5. Pre-conditions for running the tests
• GRACE components have been tested and validated: we adopted the components made available at the beginning of November on the CVS repository, SANDRA release 4.45. These components were later improved and optimized by the partners.
• A convenient testing corpus of documents is used: provided by CERN and approved by the partners.
• Problems with the replica manager in GILDA have been solved: solved after CERN investigation and feedback to the GILDA team and site administrators.
• Only English documents are considered.
• The search result set size is on average between 0.1 and 4 MB of text.
• The usage of DAG as the job model in GILDA has been investigated: DAG will not be used in the implementation of M2, since the current installation on the GILDA testbed is not complete.

  6. Work performed
• Preparation of a test plan and report template
• Creation of the testing corpus of documents
• Verification of testing pre-conditions
• Creation of the test scripts for semi-automatic testing
• Testing on the GILDA testbed
• Creation of scripts for validation of output and parsing of logging
• Collection and analysis of the results

  7. Testing: job submission
• General tests started on 20.10.2004
• Main testing period from 05.11.2004 to 25.11.2004
• Submitted more than 1000 jobs, about 1 million Java API calls

  8. Functional tests
• The functional tests were successful. However, during the testing period some Grid-related problems were experienced and fixed:
• RB configuration problems
• RM/SE configuration problems
• Java API call failures (very low percentage, < 0.1%; see the retry sketch below)
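The slides do not say how the rare Java API call failures were handled, so the following is purely illustrative: a small retry wrapper (the Retry class and all its names are assumptions, not GRACE code) is one common way to absorb transient call failures of this kind.

    import java.util.concurrent.Callable;

    /** Illustrative sketch only: retry a transient call with a growing back-off. */
    public final class Retry {
        private Retry() {}

        public static <T> T withRetries(Callable<T> call, int maxAttempts) throws Exception {
            Exception last = null;
            for (int attempt = 1; attempt <= maxAttempts; attempt++) {
                try {
                    return call.call();            // e.g. a Grid middleware API call
                } catch (Exception e) {
                    last = e;                       // remember the failure
                    Thread.sleep(1000L * attempt);  // back off before retrying
                }
            }
            throw last;                             // give up after maxAttempts
        }
    }

If failures are independent with a per-call probability below 0.1%, a single retry already pushes the effective failure rate below 0.0001%.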

  9. Performance tests (I) [timing breakdown diagram: components depend on input data size, queue state (measured on empty queues), GRACE performance, and output data size; I = input size in MB]

  10. Performance tests (II)

  11. Performance tests (III) (I = input size in MB)

  12. Variable parameters

  13. Splitting functions for M2

  14. Graphs

  15. M1 performance – G1/V1/V2

  16. M1 performance – G2/V1

  17. M1 output size – G3/V1

  18. M1 Grid overhead – V1

  19. Parallelized model M2
• Split the search results outside the Grid
• Launch N parallel jobs for text normalization
• Monitor job status
• Store results on the Grid (using the Replica Manager)
• Launch the categorization job: pick up the documents from the SEs and merge them, then perform the categorization
• Monitor and get results from the categorization job
A minimal code sketch of these steps follows.
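The sketch below makes the M2 flow concrete in Java. GridJob and Middleware are hypothetical stand-ins for the actual middleware calls (job submission, status monitoring, Replica Manager registration); none of these names come from the GRACE code, and this is a simplified sketch rather than the real implementation.

    import java.util.ArrayList;
    import java.util.List;

    /** Hypothetical stand-ins for the middleware calls used by GRACE. */
    interface GridJob {
        void waitForDone() throws InterruptedException; // poll job status until finished
        String outputFile();                            // logical file name of the output
    }

    interface Middleware {
        GridJob submit(String task, List<String> input); // job submission
        String registerReplica(String file);             // Replica Manager registration
    }

    /** Minimal sketch of the M2 workflow steps listed above. */
    public class M2Workflow {
        private final Middleware grid;

        public M2Workflow(Middleware grid) { this.grid = grid; }

        public String run(List<String> searchResults, int n) throws InterruptedException {
            // 1. Split the search result set outside the Grid into n chunks.
            List<List<String>> chunks = split(searchResults, n);

            // 2. Launch n parallel text-normalization jobs.
            List<GridJob> jobs = new ArrayList<>();
            for (List<String> chunk : chunks) {
                jobs.add(grid.submit("normalize", chunk));
            }

            // 3. Monitor job status; 4. store the results on the Grid so the
            //    categorization job can pick them up from the SEs.
            List<String> replicas = new ArrayList<>();
            for (GridJob job : jobs) {
                job.waitForDone();
                replicas.add(grid.registerReplica(job.outputFile()));
            }

            // 5. Launch a single categorization job that merges the
            //    normalized documents and performs the categorization.
            GridJob categorization = grid.submit("categorize", replicas);

            // 6. Monitor and retrieve the categorization result.
            categorization.waitForDone();
            return categorization.outputFile();
        }

        private static List<List<String>> split(List<String> docs, int n) {
            List<List<String>> chunks = new ArrayList<>();
            for (int i = 0; i < n; i++) chunks.add(new ArrayList<>());
            for (int i = 0; i < docs.size(); i++) chunks.get(i % n).add(docs.get(i));
            return chunks;
        }
    }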

  20. Simulation of M2 performances [graph annotation: increase due to job submission overhead]
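The shape of these simulation curves is consistent with a simple cost model; the formula below is an assumed reconstruction, not taken from the slides (t_s, T_norm and T_cat are illustrative symbols). Each of the K parallel jobs pays a submission overhead t_s, while the normalization work is divided across the K jobs:

    T(K) \approx K \, t_s + \frac{T_{norm}(I)}{K} + T_{cat}(I)

The overhead term grows linearly in K while the parallel term shrinks as 1/K, so the waiting time first decreases and then rises again; minimizing over K gives K_{opt} \approx \sqrt{T_{norm}(I) / t_s}, the optimum studied on the following slides.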

  21. Splitting study based on simulation
• Rules for the splitting parameters:
• Minimize the user-interface waiting time → Kopt
• Save "unnecessary" resources by splitting less than the optimal value → Keff (see graph)
• Calculated formulas for the splitting parameters
• Apply constraints: number of available worker nodes, number of simultaneous users, size of the biggest input file
• Implemented in a Java class for the GRACE application (a minimal sketch follows below)
• α: percentage increase of the waiting time at the user interface
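A minimal sketch of how such a Java class might compute Kopt and Keff, assuming the cost model reconstructed after slide 20; the class and member names are illustrative, not the actual GRACE implementation.

    /**
     * Sketch of a splitting-parameter calculator, assuming the cost model
     * T(K) = K*tSubmit + tNorm/K. Names are illustrative, not GRACE code.
     */
    public class SplitCalculator {
        private final double tSubmit; // per-job submission overhead (seconds)
        private final double tNorm;   // total normalization work (seconds)

        public SplitCalculator(double tSubmit, double tNorm) {
            this.tSubmit = tSubmit;
            this.tNorm = tNorm;
        }

        /** UI waiting time when the input is split into k parallel jobs. */
        public double waitingTime(int k) {
            return k * tSubmit + tNorm / k;
        }

        /** Kopt: the k that minimizes the UI waiting time. */
        public int kOpt() {
            return Math.max(1, (int) Math.round(Math.sqrt(tNorm / tSubmit)));
        }

        /**
         * Keff: the smallest k whose waiting time stays within (1 + alpha)
         * of the optimum, i.e. save "unnecessary" worker nodes at a
         * bounded cost (alpha) in UI waiting time.
         */
        public int kEff(double alpha) {
            double budget = (1.0 + alpha) * waitingTime(kOpt());
            for (int k = 1; k < kOpt(); k++) {
                if (waitingTime(k) <= budget) {
                    return k;
                }
            }
            return kOpt();
        }

        /** Apply the constraints listed above, e.g. available worker nodes. */
        public int applyConstraints(int k, int workerNodes, int simultaneousUsers) {
            int perUser = Math.max(1, workerNodes / Math.max(1, simultaneousUsers));
            return Math.max(1, Math.min(k, perUser));
        }
    }

Splitting at Keff instead of Kopt trades a bounded slowdown (at most a factor 1 + α) for fewer occupied worker nodes, which matters when several simultaneous users share the testbed.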

  22. Simulation of M2 performances [graph annotated with α, Kopt and Keff]

  23. Simulation of M2 performances [graph annotated with α, Kopt and Keff]

  24. M2 optimal number of jobs – G6/V1

  25. M2 UI waiting time – G4/V3

  26. M2 execution time – G2/V1

  27. M2 UI waiting time – G7/V3

  28. M2 spent computing time – G5/V3

  29. M2 spent computing time – G8/V1

  30. Comparison M1 G7/V1 – M2 G7/V1

  31. Comparison M1 G8/V1 – M2 G8/V1

  32. Conclusions
The Grid parallelization work is the extension of the WP6 activity on Grid integration and testing.
• The Grid parallelization has been completed successfully
• Parallelization proved to improve application performance and lower the query failure rate (GL to measure this?)
• The Grid performed well: low failure rate, prompt replies from the Grid administrators to problems, good coordination with the GILDA team

  33. EGEE collaboration - update
• Feedback submitted to the EGEE user requirements DB
• GRACE presented to the EGEE EGAAP commission at the second EGEE conference
• GRACE will be supported in GILDA until the end of the project (installation at SHU?)
• GRACE requirements entered in the EGEE requirements DB (input from GL missing)
• Contacts with other projects, collaboration with EGEE networking activities: provided input requirements for project support
• Member of the EGEE user group (supporting action for EGEE user communities)
