Databases in the Grid


  1. Databases in the Grid: A New Data Source Oriented CE for GRID
     Taffoni Giuliano, INAF - OATS

  2. Overview
     • What is a G-DSE?
     • An overview of the G-DSE
     • Some practice
     • People:
       • Edgardo Amborsi
       • Giuliano Taffoni
       • Andrea Barisani
       • Claudio Vuerli
       • Antonia Ghiselli

  3. The Database crisis
     • I have a DB and I want to USE it from my GRID.
     • I have a number of DBs and I want to USE all of them.
     • Move the execution to the data, not the data to the code.
     • Fully compliant with gLite.

  4. Grid resource definition
     • The Grid limit: it can only execute binary code or shell scripts and store files.
     • DB in the Grid? Extend the existing Globus Resource Manager to provide transparent access to heterogeneous Data Sources (DS) and Data Source Engines (DSE).

  5. Blueprint for a Query Element
     • The Grid Resource Framework Layer, Information System and Data Model are extended so that a software virtual machine acting as a Data Source Engine becomes a valid instance for a Grid computing model.
     • A new Grid component (G-DSE) is defined. It enables access to a Data Source Engine and Data Source, and is fully integrated with the Grid Monitoring and Discovery System and the Resource Broker.
     • A new Grid Element, the Query Element, can be built on top of the G-DSE component.

  6. Blueprint for a Query Element
     • Modify the Job Management component to access a new kind of resource.
     • Integrate the Information System with the "description" of the new resource.
     • Use the Grid Security Infrastructure.
     • No modification on the client and server side: if I can submit a job, I can also submit a query!
     • No modification to the Brokering/Workflow systems: if I can direct the CE, I can also direct a QE.

  7. Extending the Grid capabilities
     • Provide a proper extension of the Grid to handle a new resource.
     • Security (GSI): no need to extend it, just use it!
     • First theory (Grid ASM), then… application.
     "A Formal Framework for Defining Grid Systems", Zsolt N. Nemeth & Vaidy Sunderam, 2nd IEEE/ACM (CCGRID'02)

  8. Globus G-DSE integration
     [Diagram. Labels: GRAM gatekeeper; JobManager, scheduler plug-in, PBS/LSF, JobProcess; QueryManager, query plug-in, DB-specific driver, QueryProcess; RDBMS (twice); GIS, MDS, GRIS, LDIF, LDAP, Grid providers (SNMP).]

  9. Globus4 Integration
     [Diagram. Labels: GRAM services; Local DB control; GRAM Adapter; query "plug-in"; Delegation; QueryProcess; GridFTP; RFT; Remote SE; GridFTP.]

  10. G-DSE Grid formalization
     • New Grid component:
       • Integrated within the Grid Information System
       • May be integrated in the WMS
     • New Grid Element on top of the G-DSE component: the Query Element

  11. The Query Element
     [Diagram. Labels: code, CE; query, QE.]

  12. QE implementation
     • Runs on any Linux/Unix flavor: GT >= 2.4.3
     • Backends: any DB vendor (MySQL, Oracle, PostgreSQL, etc.) + flat files
     • Two protocols: GRAM or WS
     • API: C, C++, Python, Java, Perl
     • If it works with Globus, it works with G-DSE
     [Diagram. Labels: GRAM, SOAP; GDSE; ora, psql, file.]

  13. QE Authorization
     • Access control using GSI and VOMS
     • The certificate + roles identify the user permissions on the DB:
       • Super user: create, modify, admin, grant and revoke users… ANYTHING!!!
       • Standard user: select + insert
       • Simple user: select

  14. QE Authorization
     VOMS roles and groups are mapped to a DB user:
     Attribute:/vo/dbuser/ROLE=astrouser/CAPABILITY=select
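The mapping from a VOMS attribute to database permissions on slides 13 and 14 can be sketched as follows. This is a minimal illustration, not the G-DSE implementation: the function names are hypothetical, and the assumption that several capabilities are joined with "+" (as in "select + insert") is ours.

```python
def parse_voms_attribute(attr):
    """Split '/vo/group/ROLE=x/CAPABILITY=y' into groups, role and capability."""
    groups, fields = [], {}
    for part in attr.strip().split("/"):
        if not part:
            continue  # skip the empty piece before the leading slash
        if "=" in part:
            key, value = part.split("=", 1)
            fields[key.upper()] = value
        else:
            groups.append(part)
    return {
        "groups": groups,
        "role": fields.get("ROLE"),
        "capability": fields.get("CAPABILITY"),
    }


def is_allowed(attr, sql):
    """Return True if the first keyword of the statement is a granted capability.

    Assumption: multiple capabilities are separated by '+'.
    """
    capability = parse_voms_attribute(attr)["capability"]
    words = sql.strip().split(None, 1)
    if not capability or not words:
        return False
    granted = {c.strip().lower() for c in capability.split("+")}
    return words[0].lower() in granted


attribute = "/vo/dbuser/ROLE=astrouser/CAPABILITY=select"
print(parse_voms_attribute(attribute))
print(is_allowed(attribute, "select a,b from table;"))
print(is_allowed(attribute, "insert into table values (1, 2);"))
```

Checking only the first keyword is deliberately naive; a real gatekeeper would rely on the database's own grants for the mapped DB user.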

  15. QE language
     • More than one statement
     • UI/QE interactions through a STANDARD LANGUAGE
     • RSL(SQL)
     > globus-job-run g.dse.host/dbmanager-ODBC -queue PSQL1 "select a,b from table;"
     --------------
     | a   | b   |
     --------------
     | Uno | 001 |
     | Due | 002 |
     | Tre | 003 |
     --------------
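A client that wants to use the query result programmatically has to parse the text table shown on slide 15. The sketch below assumes the output always has that shape (pipe-delimited rows, dashed rules, first row as header); `parse_table` is a hypothetical helper, not part of G-DSE or Globus.

```python
def parse_table(text):
    """Turn a pipe-delimited text table into a list of dicts keyed by the header row."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip dashed rules and blank lines
        rows.append([cell.strip() for cell in line.strip("|").split("|")])
    if not rows:
        return []
    header, data = rows[0], rows[1:]
    return [dict(zip(header, row)) for row in data]


sample = """\
--------------
| a   | b   |
--------------
| Uno | 001 |
| Due | 002 |
| Tre | 003 |
--------------
"""
for record in parse_table(sample):
    print(record["a"], record["b"])
```

Values stay strings, so leading zeros such as "001" are preserved. A value that itself contains a "|" would break this simple split.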

  16. QE language: off-line access
     > globus-job-submit g.dse.host/dbmanager-ODBC -queue PSQL1 "select a,b from table;"
     https://g.dse.host/20001/23297/113699980234
     > globus-job-status https://g.dse.host/20001/23297/113699980234
     DONE
     > globus-job-get-output https://g.dse.host/20001/23297/113699...
     --------------
     | a   | b   |
     --------------
     | Uno | 001 |
     | Due | 002 |
     | Tre | 003 |
     --------------
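The submit, poll, fetch sequence on slide 16 can be wrapped in a small helper. This is a sketch under stated assumptions: the three commands and their arguments are copied from the slide, while the function names, the polling interval, and the treatment of a `FAILED` status are ours. The command runner is injectable so the logic can be exercised without a Globus installation.

```python
import subprocess
import time


def run_cli(argv):
    """Run a command and return its standard output, stripped."""
    done = subprocess.run(argv, capture_output=True, text=True, check=True)
    return done.stdout.strip()


def run_query_offline(contact, queue, sql, runner=run_cli,
                      poll_seconds=5.0, max_polls=60):
    """Submit a query, poll until it is DONE, then return its output."""
    job_url = runner(["globus-job-submit", contact, "-queue", queue, sql])
    for _ in range(max_polls):
        status = runner(["globus-job-status", job_url])
        if status == "DONE":
            return runner(["globus-job-get-output", job_url])
        if status == "FAILED":  # assumption: failures are reported this way
            raise RuntimeError(f"query job failed: {job_url}")
        time.sleep(poll_seconds)
    raise TimeoutError(f"query job did not finish: {job_url}")


# Typical call (requires the Globus client tools and a valid proxy):
# print(run_query_offline("g.dse.host/dbmanager-ODBC", "PSQL1",
#                         "select a,b from table;"))
```

Passing the arguments as a list avoids shell quoting problems with the SQL string.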

  17. The Information System
     • The QE publishes its presence to the GRID
     • Software computing machine: load, memory space, etc.
     • We use the RDBMS MIB information:
       • More than 250 parameters… we are not using all of them!!!
       • rdbmsSrvInfoFinishedTransactions 1.3.6.1.2.1.39.1.6.1.2
       • rdbmsSrvInfoDiskReads 1.3.6.1.2.1.39.1.6.1.3
       • rdbmsSrvInfoLogicalReads 1.3.6.1.2.1.39.1.6.1.4
       • rdbmsSrvInfoDiskWrites 1.3.6.1.2.1.39.1.6.1.5
     • Based on SNMP or direct access.
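The four OIDs on slide 17 share the prefix 1.3.6.1.2.1.39.1.6.1 and differ only in the final column number. A provider that receives raw OIDs from an SNMP walk (which append an instance index) needs to map them back to names. The helper names below are hypothetical, and no SNMP library is involved; this only shows the naming arithmetic.

```python
# Common prefix of the four OIDs listed on the slide.
RDBMS_SRV_INFO_ENTRY = "1.3.6.1.2.1.39.1.6.1"

# Column numbers taken from the slide.
COLUMNS = {
    "rdbmsSrvInfoFinishedTransactions": 2,
    "rdbmsSrvInfoDiskReads": 3,
    "rdbmsSrvInfoLogicalReads": 4,
    "rdbmsSrvInfoDiskWrites": 5,
}


def column_oid(name):
    """Return the OID of a named column."""
    return f"{RDBMS_SRV_INFO_ENTRY}.{COLUMNS[name]}"


def name_for_oid(oid):
    """Map a column OID, with or without an instance suffix, to its name."""
    for name in COLUMNS:
        base = column_oid(name)
        if oid == base or oid.startswith(base + "."):
            return name
    return None


print(column_oid("rdbmsSrvInfoDiskReads"))
print(name_for_oid("1.3.6.1.2.1.39.1.6.1.5.1"))
```

Matching on `base + "."` rather than a bare prefix keeps column 2 from also matching a hypothetical column 20.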

  18. gLite implementation
     • GRAM + site BDII + top BDII
     • Based on information providers:
       • Static information
       • Dynamic information
     [Diagram. Labels: Dynamic providers, Static providers; snmpquery, ODBCquery; ldif; snmp (three times), odbc (three times); ORACLE, MYSQL, POSTGRESQL.]

  19. QE BDII
     > ldapsearch -LLL -x -H g.dse.host -b "mds-vo-name=site,o=grid"
     dn:GlueDSEUniqueID=g.dse.host:2119/dbmanager-ODBC, mds-vo-name=local,o=grid
     objectClass: GlueCETop
     objectClass: GlueCE
     objectClass: GlueDSE
     objectClass: GlueDSETop
     objectClass: GlueKey
     GlueDSEName: TESTDB
     GlueDSEStateStatus: Production
     GlueDSEInfoLRMSType: Postgresql
     GlueDSEInfoLRMSVersion: 7.3
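The entry published on slide 19 is plain LDIF, so a client can read the QE's attributes with a few lines of code. The parser below is a minimal sketch: `parse_ldif_entry` is a hypothetical helper that handles one entry with plain `attr: value` lines and folded continuation lines, and ignores base64 (`::`) values.

```python
def parse_ldif_entry(text):
    """Parse a single LDIF entry into a dict of attribute name -> list of values."""
    entry = {}
    last_key = None
    for raw in text.splitlines():
        if not raw.strip():
            continue
        if raw.startswith(" ") and last_key is not None:
            # A line starting with one space continues the previous value.
            entry[last_key][-1] += raw[1:]
            continue
        key, sep, value = raw.partition(":")
        if not sep:
            continue  # not an attribute line
        key = key.strip()
        entry.setdefault(key, []).append(value.strip())
        last_key = key
    return entry


sample = "\n".join([
    "dn:GlueDSEUniqueID=g.dse.host:2119/dbmanager-ODBC, mds-vo-name=local,o=grid",
    "objectClass: GlueCETop",
    "objectClass: GlueDSE",
    "GlueDSEName: TESTDB",
    "GlueDSEStateStatus: Production",
    "GlueDSEInfoLRMSType: Postgresql",
    "GlueDSEInfoLRMSVersion: 7.3",
])
entry = parse_ldif_entry(sample)
print(entry["GlueDSEName"][0], entry["GlueDSEInfoLRMSType"][0])
```

Splitting on the first colon only keeps the port in `g.dse.host:2119` inside the `dn` value.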

  20. QE and the WMS
     • New job wrapper for dbmanager
     [Diagram. Labels: RB; gatekeeper; QueryManager; query plug-in; DB-specific driver; QueryWrapper; QueryProcess; RDBMS.]

  21. An Example
     Type = "Job";
     JobType = "Normal";
     Executable = "select A from table;";
     StdOutput = "hostname.out";
     StdError = "hostname.err";
     OutputSandbox = {"hostname.err","hostname.out"};
     Arguments = "-xml";
     RetryCount = 1;
     $ glite-job-submit -r gdse.oats.inaf.it:2119/dbmanager-odbc-test1 sqltest.jdl
     Selected Virtual Organisation name (from proxy certificate extension): inaf
     Connecting to host arquimedes.rediris.es, port 7772
     Logging to host arquimedes.rediris.es, port 9002
     ================================ edg-job-submit Success ===========
     The job has been successfully submitted to the Network Server.
     Use edg-job-status command to check job current status.
     Your job - https://arquimedes.rediris.es:9000/75hD3nNHxbYRDAL3GmiIug
     The edg_jobId has been saved in the following file: /home/madrid01/jlvpjobid
     ========================================================================
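Since the query travels in the `Executable` attribute of an ordinary JDL file, generating that file from a SQL string is straightforward. The sketch below reproduces the attributes from slide 21; the function names, the default file names, and the backslash escaping of embedded quotes are assumptions, not documented G-DSE behaviour.

```python
def jdl_quote(value):
    """Quote a string as a JDL literal (backslash escaping is an assumption)."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def build_query_jdl(sql, stdout="query.out", stderr="query.err",
                    arguments="-xml", retry_count=1):
    """Build a JDL description whose Executable is the SQL statement."""
    sandbox = "{" + ",".join(jdl_quote(name) for name in (stderr, stdout)) + "}"
    attributes = [
        ("Type", jdl_quote("Job")),
        ("JobType", jdl_quote("Normal")),
        ("Executable", jdl_quote(sql)),
        ("StdOutput", jdl_quote(stdout)),
        ("StdError", jdl_quote(stderr)),
        ("OutputSandbox", sandbox),
        ("Arguments", jdl_quote(arguments)),
        ("RetryCount", str(retry_count)),
    ]
    return "\n".join(f"{key} = {value};" for key, value in attributes)


print(build_query_jdl("select A from table;"))
```

The printed text can be saved as `sqltest.jdl` and submitted with the `glite-job-submit` command shown on the slide.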

  22. Summing up
     • G-DSE supports Data Source (DS) and DSE indexing, monitoring, management and recovery through a rich set of metadata bound to the standard GIS.
     • DS have their core engine in G-DSE, which provides a framework for activity and task management.
     • An RSL/JDL Transaction/Query permits a number of tasks to be specified, together with their parameters, inputs, outputs and control flow.
     • The response to a request is generated by the G-DSE within a JobQueryManager session. The G-DSE analyses each incoming task and conducts authentication and authorisation.
     • The standard Grid WorkLoad Manager constructs an optimised execution graph.
     • The GIS monitors the status digest that each DS and DSE produces with its internal monitor.
     • The G-DSE has been designed to support dynamic configuration, sessions, transactions, recovery and concurrency.

  23. End of Presentation
     Thank you for your attention