EGRID Project: Experience Report

daryl

Presentation Transcript


  1. EGRID Project: Experience Report Implementation of a GRID Infrastructure for the Analysis of Economic and Financial data

  2. Econophysics GRID
     • Italian Ministry of Education (MIUR) funded project.
     • Purpose: pilot project for future Italian National GRID facility for Economics and Finance.
     • Serves the computing needs of two select research projects (both applying models from Physics to Economics/Finance):
       • INFM’s High frequency dynamics in financial markets.
       • AREA Trieste’s Softcomputing techniques applied to modern finance.

  3. Summary:
     • User requirements
     • The EGRID facility
     • Deficiencies of EDG middleware
     • EGRID solutions/workarounds
     • Next steps of EGRID

  4. I. User requirements

  5. User requirements
     AREA:
     • Big DB of corporate budget analysis: to be exported to GRID + WEB.
     • Access must be: secure + authenticated + authorised.
     • No real need for computing power.

  6. User requirements
     INFM:
     • Management services for 2TB Stock Exchange data (NYSE, Milan, etc.).
     • Data privacy and security: legally binding contracts.
     • Processing facility for raw data.

  7. II. The EGRID facility

  8. The EGRID facility
     Physical Infrastructure:
     • Non-partner centre INFN Padova supplies all bulk computing power + storage.
     • Resources: 2.6TB storage + 4 exclusive CPUs + 100 CPUs best effort.
     • INFN Padova is already part of the national High Energy Physics GRID (INFN-GRID).
     • Our users provide limited local GRID-enabled buffer storage to offset bandwidth problems.

  9. The EGRID facility
     [Diagram: Padova site with CE, SE (2.6 TB), WNs (100 CPUs) and RB; peripheral sites Firenze, Trieste, Palermo, ... each with CE+SE+WN.]

  10. The EGRID facility
     Software Infrastructure:
     • Peripheral sites run the same middleware as INFN-GRID: GLOBUS 2.2/2.4 based EDG/LCG2.
     • EGRID software layer on top of EDG/LCG2 to simplify data management:
         egrid-upload /nyse-2002-01.tar.gz lfn:/fonti/cd/nyse-2002-01.tar.gz
         edg-replica-manager --vo=egrid copyAndRegisterFile \
           file:///home/usr/nyse-2002-01.tar.gz \
           -d sfn://egrid-10.egrid.it/flatfiles/SE00/egrid/fonti/cd/nyse-2002-01.tar.gz \
           -l lfn:/fonti/cd/nyse-2002-01.tar.gz
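The slide appears to pair the short egrid-upload call with the longer edg-replica-manager invocation it stands in for. The sketch below shows one way such a wrapper could assemble that command line. It is not the EGRID implementation: the function name `build_upload_command` and the default values (VO, SE host, SE root path) are assumptions copied from the example on the slide, and the rule that the storage path mirrors the logical file name is inferred from that single example.

```python
import shlex

# Assumed defaults, taken from the single example on the slide.
DEFAULT_VO = "egrid"
DEFAULT_SE_HOST = "egrid-10.egrid.it"
DEFAULT_SE_ROOT = "/flatfiles/SE00/egrid"


def build_upload_command(local_path, lfn, vo=DEFAULT_VO,
                         se_host=DEFAULT_SE_HOST, se_root=DEFAULT_SE_ROOT):
    """Return the argv list for the edg-replica-manager call (hypothetical wrapper logic)."""
    if not local_path.startswith("/"):
        raise ValueError("local path must be absolute")
    if not lfn.startswith("lfn:/"):
        raise ValueError("logical file name must start with 'lfn:/'")

    # Assumption: the physical location on the SE mirrors the logical path.
    logical_path = lfn[len("lfn:"):]
    sfn = f"sfn://{se_host}{se_root}{logical_path}"

    return [
        "edg-replica-manager", f"--vo={vo}", "copyAndRegisterFile",
        f"file://{local_path}",   # source file on the user's machine
        "-d", sfn,                # destination on the Storage Element
        "-l", lfn,                # logical file name to register
    ]


if __name__ == "__main__":
    argv = build_upload_command("/home/usr/nyse-2002-01.tar.gz",
                                "lfn:/fonti/cd/nyse-2002-01.tar.gz")
    # Print the command instead of running it; the middleware may not be installed.
    print(shlex.join(argv))
```

Building an argument list rather than a shell string keeps quoting out of the picture if the result is later passed to `subprocess.run`.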

  11. The EGRID facility
     Software Infrastructure:
     • Raw data processing EGRID SW: Stock Exchange format -> more usable research format.
     • Ad-hoc solution for AREA DB access: web-enabling techniques (CGI, JSP, etc.) + GSI security (Apache MOD_GRIDSITE) + GRID Information System integration.

  12. III. Deficiencies of EDG Middleware

  13. Deficiencies of EDG Middleware
     Data privacy and security:
     • The GSIFTP protocol moves data around the GRID, but the GridFTP daemon only enforces access restrictions by way of the standard UNIX permission triple.
     • The pool account mechanism on the SE does not allow access rights to be partitioned within the same VO.
     • Neither authentication nor authorization is enforced on the RLS: the replica catalogue is easily corrupted!

  14. Deficiencies of EDG Middleware
     Middleware deployment:
     • EDG based on Red Hat Linux 7.3.
     • No complete installation instructions.
     • LCFGng installation tool poorly documented + needs dedicated machine + does not allow useful software combinations (i.e. no CE+SE+WN on same machine).
     • UI needs dedicated machine: cannot be installed on user’s own workstation.

  15. IV. EGRID solutions/workarounds

  16. EGRID solutions/workarounds
     Data privacy and security: data resides in SE - that’s where security must be guaranteed; no ACLs available - RedHat 7.3 limit.
     • Pool account mechanism disabled in SE.
     • Each GRID user mapped to his/her own corresponding local account.
     • UNIX groups formed by gathering users based on contract rights to data access.
     • Files on SE protected by group ownership rights.
     • A nested directory structure allows: read access to group + write access to subset of group.
     • Central LDAP server publishes user/group account maps + propagates them to SE.
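The slides do not say which modes or group names were used for the nested directories, so the following is only a sketch of one common way to get "read for a group, write for a subset" out of the plain UNIX permission triple. An outer directory is searchable only by the reader group, and an inner directory is writable only by the writer group while being readable by "others" who got past the outer one. The group names, owners and modes are hypothetical, and the code models the permission check in pure Python rather than touching a real filesystem.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dir:
    owner: str
    group: str
    mode: int  # classic UNIX mode bits, e.g. 0o750


def _bits(d, user, groups):
    """Return the rwx bits (0-7) that apply to this user for directory d."""
    if user == d.owner:
        return (d.mode >> 6) & 7
    if d.group in groups:
        return (d.mode >> 3) & 7
    return d.mode & 7


def _can_traverse(chain, user, groups):
    # Search (x) permission is needed on every directory along the path.
    return all(_bits(d, user, groups) & 1 for d in chain)


def can_read(chain, user, groups):
    """True if the user can reach and list the last directory in the chain."""
    return _can_traverse(chain, user, groups) and bool(_bits(chain[-1], user, groups) & 4)


def can_write(chain, user, groups):
    """True if the user can reach the last directory and create files in it."""
    return _can_traverse(chain, user, groups) and bool(_bits(chain[-1], user, groups) & 2)


# Hypothetical layout: the outer directory gates readers, the inner one gates writers.
OUTER = Dir(owner="root", group="nyse_readers", mode=0o750)
INNER = Dir(owner="root", group="nyse_writers", mode=0o775)
CHAIN = [OUTER, INNER]

if __name__ == "__main__":
    for user, groups in [("alice", {"nyse_readers"}),
                         ("bob", {"nyse_readers", "nyse_writers"}),
                         ("carol", set())]:
        print(user, "read:", can_read(CHAIN, user, groups),
              "write:", can_write(CHAIN, user, groups))
```

In this layout the writers must also belong to the reader group, which matches the slide's "subset of group" wording. One such directory pair is needed per access contract.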

  17. EGRID solutions/workarounds
     Middleware deployment:
     • Painstaking job of: tracking down documentation + deriving explicit procedures for single GRID elements from the LCFGng installation + interpreting obscure error messages + trial and error.
     • Knoppix-based LiveCD technology for UI and SuperNode: can be run on the fly from the CD, or can be installed on a machine.
     • Script installs UI on any workstation: no need to re-install the machine + no need for RedHat 7.3.

  18. V. EGRID next steps

  19. EGRID next steps
     • Present security mechanism is only a temporary solution (scalability issues)! EGRID is working with INFN to develop the StoRM SRM server, which features ACL-enforced security for GRID files.
     • Portal for User Applications to replace CLI.
     • Porting of Parallel Applications.
