
Presentation Transcript


  1. GRACE Grid parallelization and tests
Roberta Faggian Marque (CERN), with contribution from Jan Fiete Grosse-Oetringhaus
GRACE General Project Meeting, Stuttgart, 14-15 December 2004

  2. Content
• Two GRACE Grid integration models: M1, M2
• Pre-conditions for the tests
• Planning of the Grid tests and first results
• M1 test results/performances
• Simulation of M2
• Parallelization study
• M2 test results/performances
• Comparison
• Conclusions
• EGEE collaboration - update

  3. First approach used: M1 [diagram: application workflow for a single search mapped onto the Grid workflow]

  4. Second approach used: M2 [diagram: application workflow for a single search mapped onto the Grid workflow]

  5. Pre-conditions for running the tests
• GRACE components have been tested and validated: we adopted the components made available at the beginning of November on the CVS repository, SANDRA release 4.45. These components were later improved and optimized by the partners.
• A convenient testing corpus of documents is used: provided by CERN and approved by the partners.
• Problems with the replica manager in GILDA have been solved: solved after CERN investigation and feedback to the GILDA team and site administrators.
• Only English documents are considered.
• The search result set size is on average between 0.1 and 4 MB of text.
• The usage of DAG as the job model in GILDA has been investigated: DAG will not be used in the implementation of M2, since the current installation on the GILDA testbed is not complete.

  6. Work performed
• Preparation of a test plan and report template
• Creation of the testing corpus of documents
• Verification of testing pre-conditions
• Creation of the test scripts for semi-automatic testing
• Testing on the GILDA testbed
• Creation of scripts for validation of output and parsing of logging
• Collection and analysis of the results

  7. Testing: job submission
• General tests started on 20.10.2004
• Main testing period from 05.11.2004 to 25.11.2004
• Submitted more than 1000 jobs, about 1 million Java API calls

  8. Functional tests
• The functional tests were successful. However, during the testing period some Grid-related problems were experienced and fixed:
• RB configuration problems
• RM/SE configuration problems
• Java API call failures (very low percentage, < 0.1%; see the retry sketch below)
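The slides do not say how the rare Java API call failures were handled, so the following is purely illustrative: a small retry wrapper (the Retry class and all its names are assumptions, not GRACE code) is one common way to absorb transient call failures of this kind.

    import java.util.concurrent.Callable;

    /** Illustrative sketch only: retry a transient call with a growing back-off. */
    public final class Retry {
        private Retry() {}

        public static <T> T withRetries(Callable<T> call, int maxAttempts) throws Exception {
            Exception last = null;
            for (int attempt = 1; attempt <= maxAttempts; attempt++) {
                try {
                    return call.call();            // e.g. a Grid middleware API call
                } catch (Exception e) {
                    last = e;                       // remember the failure
                    Thread.sleep(1000L * attempt);  // back off before retrying
                }
            }
            throw last;                             // give up after maxAttempts
        }
    }

If failures are independent with a per-call probability below 0.1%, a single retry already pushes the effective failure rate below 0.0001%.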

  9. Performance tests (I) [timing breakdown diagram: components depend on input data size, queue state (measured on empty queues), GRACE performance, and output data size; I = input size in MB]

  10. Performance tests (II)

  11. Performance tests (III) (I = input size in MB)

  12. Variable parameters

  13. Splitting functions for M2

  14. Graphs

  15. M1 performance – G1/V1/V2

  16. M1 performance – G2/V1

  17. M1 output size – G3/V1

  18. M1 Grid overhead – V1

  19. Parallelized model M2
• Split the search results outside the Grid
• Launch N parallel jobs for text normalization
• Monitor job status
• Store results on the Grid (using the Replica Manager)
• Launch the categorization job: pick up the documents from the SEs and merge them, then perform the categorization
• Monitor and get results from the categorization job
A minimal code sketch of these steps follows.
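The sketch below makes the M2 flow concrete in Java. GridJob and Middleware are hypothetical stand-ins for the actual middleware calls (job submission, status monitoring, Replica Manager registration); none of these names come from the GRACE code, and this is a simplified sketch rather than the real implementation.

    import java.util.ArrayList;
    import java.util.List;

    /** Hypothetical stand-ins for the middleware calls used by GRACE. */
    interface GridJob {
        void waitForDone() throws InterruptedException; // poll job status until finished
        String outputFile();                            // logical file name of the output
    }

    interface Middleware {
        GridJob submit(String task, List<String> input); // job submission
        String registerReplica(String file);             // Replica Manager registration
    }

    /** Minimal sketch of the M2 workflow steps listed above. */
    public class M2Workflow {
        private final Middleware grid;

        public M2Workflow(Middleware grid) { this.grid = grid; }

        public String run(List<String> searchResults, int n) throws InterruptedException {
            // 1. Split the search result set outside the Grid into n chunks.
            List<List<String>> chunks = split(searchResults, n);

            // 2. Launch n parallel text-normalization jobs.
            List<GridJob> jobs = new ArrayList<>();
            for (List<String> chunk : chunks) {
                jobs.add(grid.submit("normalize", chunk));
            }

            // 3. Monitor job status; 4. store the results on the Grid so the
            //    categorization job can pick them up from the SEs.
            List<String> replicas = new ArrayList<>();
            for (GridJob job : jobs) {
                job.waitForDone();
                replicas.add(grid.registerReplica(job.outputFile()));
            }

            // 5. Launch a single categorization job that merges the
            //    normalized documents and performs the categorization.
            GridJob categorization = grid.submit("categorize", replicas);

            // 6. Monitor and retrieve the categorization result.
            categorization.waitForDone();
            return categorization.outputFile();
        }

        private static List<List<String>> split(List<String> docs, int n) {
            List<List<String>> chunks = new ArrayList<>();
            for (int i = 0; i < n; i++) chunks.add(new ArrayList<>());
            for (int i = 0; i < docs.size(); i++) chunks.get(i % n).add(docs.get(i));
            return chunks;
        }
    }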

  20. Simulation of M2 performances [graph annotation: increase due to job submission overhead]
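The shape of these simulation curves is consistent with a simple cost model; the formula below is an assumed reconstruction, not taken from the slides (t_s, T_norm and T_cat are illustrative symbols). Each of the K parallel jobs pays a submission overhead t_s, while the normalization work is divided across the K jobs:

    T(K) \approx K \, t_s + \frac{T_{norm}(I)}{K} + T_{cat}(I)

The overhead term grows linearly in K while the parallel term shrinks as 1/K, so the waiting time first decreases and then rises again; minimizing over K gives K_{opt} \approx \sqrt{T_{norm}(I) / t_s}, the optimum studied on the following slides.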

  21. Splitting study based on simulation
• Rules for the splitting parameters:
• Minimize the user-interface waiting time → Kopt
• Save "unnecessary" resources by splitting less than the optimal value → Keff (see graph)
• Calculated formulas for the splitting parameters
• Apply constraints: number of available worker nodes, number of simultaneous users, size of the biggest input file
• Implemented in a Java class for the GRACE application (a minimal sketch follows below)
• α: percentage increase of the waiting time at the user interface
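A minimal sketch of how such a Java class might compute Kopt and Keff, assuming the cost model reconstructed after slide 20; the class and member names are illustrative, not the actual GRACE implementation.

    /**
     * Sketch of a splitting-parameter calculator, assuming the cost model
     * T(K) = K*tSubmit + tNorm/K. Names are illustrative, not GRACE code.
     */
    public class SplitCalculator {
        private final double tSubmit; // per-job submission overhead (seconds)
        private final double tNorm;   // total normalization work (seconds)

        public SplitCalculator(double tSubmit, double tNorm) {
            this.tSubmit = tSubmit;
            this.tNorm = tNorm;
        }

        /** UI waiting time when the input is split into k parallel jobs. */
        public double waitingTime(int k) {
            return k * tSubmit + tNorm / k;
        }

        /** Kopt: the k that minimizes the UI waiting time. */
        public int kOpt() {
            return Math.max(1, (int) Math.round(Math.sqrt(tNorm / tSubmit)));
        }

        /**
         * Keff: the smallest k whose waiting time stays within (1 + alpha)
         * of the optimum, i.e. save "unnecessary" worker nodes at a
         * bounded cost (alpha) in UI waiting time.
         */
        public int kEff(double alpha) {
            double budget = (1.0 + alpha) * waitingTime(kOpt());
            for (int k = 1; k < kOpt(); k++) {
                if (waitingTime(k) <= budget) {
                    return k;
                }
            }
            return kOpt();
        }

        /** Apply the constraints listed above, e.g. available worker nodes. */
        public int applyConstraints(int k, int workerNodes, int simultaneousUsers) {
            int perUser = Math.max(1, workerNodes / Math.max(1, simultaneousUsers));
            return Math.max(1, Math.min(k, perUser));
        }
    }

Splitting at Keff instead of Kopt trades a bounded slowdown (at most a factor 1 + α) for fewer occupied worker nodes, which matters when several simultaneous users share the testbed.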

  22. Simulation of M2 performances [graph annotated with α, Kopt and Keff]

  23. Simulation of M2 performances [graph annotated with α, Kopt and Keff]

  24. M2 optimal number of jobs – G6/V1

  25. M2 UI waiting time – G4/V3

  26. M2 execution time – G2/V1

  27. M2 UI waiting time – G7/V3

  28. M2 spent computing time – G5/V3

  29. M2 spent computing time – G8/V1

  30. Comparison M1 G7/V1 – M2 G7/V1

  31. Comparison M1 G8/V1 – M2 G8/V1

  32. Conclusions
The Grid parallelization work is the extension of the WP6 activity on Grid integration and testing.
• The Grid parallelization has been completed successfully
• Parallelization proved to improve application performance and lower the query failure rate (GL to measure this?)
• The Grid performed well: low failure rate, prompt replies from the Grid administrators to problems, good coordination with the GILDA team

  33. EGEE collaboration - update
• Feedback submitted to the EGEE user requirements DB
• GRACE presented to the EGEE EGAAP commission at the second EGEE conference
• GRACE will be supported in GILDA until the end of the project (installation at SHU?)
• GRACE requirements entered in the EGEE requirements DB (input from GL missing)
• Contacts with other projects, collaboration with EGEE networking activities: provided input requirements for project support
• Member of the EGEE user group (supporting action for EGEE user communities)
