


Data Management GridPP and EDG

Gavin McCance

University of Glasgow

May 2, 2002

http://www.gridpp.ac.uk/datamanagement

http://cern.ch/grid-data-management



Who are we?

  • GridPP

    • Effort based at Glasgow

  • Collaboration with European DataGrid

    • WP2: Data Management work package

    • CERN, Finland, Italy

    • Replication collaboration with Globus + PPDG project


What do we* do?

  • Replica management

    • Replica catalogues

    • File access and transfer

  • Grid query optimisation (replica optimisation)*

  • Secure meta-data catalogues*

  • Service Index


Replica Catalogues

  • Must maintain replicas of the same files

  • Each file has a globally unique Logical File Name (LFN) that maps to multiple physical instances of the file (Physical File Names, PFNs).

  • A catalogue keeps track of all these mappings!

[Diagram: the LFN "File-1" maps to physical copies of File-1 held at Paris, Chicago and Glasgow.]
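
The LFN-to-PFN mapping described on this slide can be illustrated with a minimal in-memory sketch. This is not the EDG or Globus catalogue API: the class name, method names, the `lfn:` prefix and the `example.org` storage URLs are all assumptions made for the illustration.

```python
from collections import defaultdict


class ReplicaCatalogue:
    """Toy catalogue: one logical file name (LFN) maps to many physical file names (PFNs)."""

    def __init__(self):
        self._replicas = defaultdict(set)  # LFN -> set of PFNs

    def register(self, lfn, pfn):
        """Record that a physical copy of `lfn` exists at `pfn`."""
        self._replicas[lfn].add(pfn)

    def unregister(self, lfn, pfn):
        """Forget one physical copy; drop the LFN once no copies remain."""
        self._replicas[lfn].discard(pfn)
        if not self._replicas[lfn]:
            del self._replicas[lfn]

    def lookup(self, lfn):
        """Return all known PFNs for `lfn` in sorted order (empty list if unknown)."""
        return sorted(self._replicas.get(lfn, ()))


# Hypothetical sites and URLs, mirroring the three sites in the diagram.
catalogue = ReplicaCatalogue()
for site in ("paris", "chicago", "glasgow"):
    catalogue.register("lfn:File-1", f"gsiftp://se.{site}.example.org/data/File-1")
```

A client that only knows the logical name calls `catalogue.lookup("lfn:File-1")` and receives every registered physical location.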


…catalogues

  • Current services use LDAP

  • Collaboration with Globus + PPDG on new replica catalogue framework (GIGGLE)

  • Prototype Replica Location Service (RLS) under development

    • Will use meta-data service (Spitfire)…

    • API implemented as a wrapper around the current LDAP-based replica catalogue


…RLS

  • Implemented as web service

[Diagram: three RLI boxes, five LRC boxes and five Storage Elements.]
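
The diagram's two kinds of node can be sketched as a two-tier lookup, reading RLI as Replica Location Index and LRC as Local Replica Catalogue (the terms used in the Giggle framework). The sketch assumes each LRC holds the LFN-to-PFN mappings for one site and an RLI only records which LRCs know a given LFN. All class names, method names and URLs below are hypothetical, not the prototype's interface.

```python
class LocalReplicaCatalogue:
    """Per-site catalogue: LFN -> PFNs stored at that site (assumed role of an LRC)."""

    def __init__(self, site):
        self.site = site
        self._pfns = {}  # LFN -> list of PFNs

    def add(self, lfn, pfn):
        self._pfns.setdefault(lfn, []).append(pfn)

    def pfns(self, lfn):
        return list(self._pfns.get(lfn, []))

    def lfns(self):
        return set(self._pfns)


class ReplicaLocationIndex:
    """Index of which LRCs hold a given LFN (assumed role of an RLI)."""

    def __init__(self):
        self._where = {}  # LFN -> list of LRCs

    def publish(self, lrc):
        """An LRC tells the index which LFNs it currently holds."""
        for lfn in lrc.lfns():
            holders = self._where.setdefault(lfn, [])
            if lrc not in holders:
                holders.append(lrc)

    def locate(self, lfn):
        """Ask only the LRCs that the index lists for this LFN."""
        found = []
        for lrc in self._where.get(lfn, []):
            found.extend(lrc.pfns(lfn))
        return found


glasgow = LocalReplicaCatalogue("glasgow")
glasgow.add("lfn:File-1", "gsiftp://se.glasgow.example.org/File-1")
paris = LocalReplicaCatalogue("paris")
paris.add("lfn:File-1", "gsiftp://se.paris.example.org/File-1")
paris.add("lfn:File-2", "gsiftp://se.paris.example.org/File-2")

index = ReplicaLocationIndex()
index.publish(glasgow)
index.publish(paris)
```

The point of the split is that a query touches the index first and then only the local catalogues that can answer it, rather than every site.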


Transferring files

  • What replicates the files?

    • Grid Data Mirroring Package (GDMP)

    • GDMP 3.0 software just released

    • GSI authentication and authorisation

    • GridFTP file transfer

    • Subscription based file replication

    • Automatic update of replica catalogue

    • http://cmsdoc.cern.ch/cms/grid/
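
Subscription-based replication with automatic catalogue update can be pictured with the toy model below. It is only an illustration of the idea, not GDMP's protocol or API: the `Site` class, its methods and the shared dictionary standing in for the replica catalogue are assumptions, and a plain in-memory copy stands in for the GridFTP transfer.

```python
class Site:
    """Toy site that pushes newly published files to its subscribers."""

    def __init__(self, name, catalogue):
        self.name = name
        self.catalogue = catalogue  # shared dict: LFN -> set of site names
        self.files = {}             # LFN -> file contents held locally
        self.subscribers = []

    def subscribe(self, consumer):
        """Register `consumer` to receive every file this site publishes."""
        self.subscribers.append(consumer)

    def publish(self, lfn, data):
        """Store a new file locally, then replicate it to all subscribers."""
        self._store(lfn, data)
        for consumer in self.subscribers:
            consumer._store(lfn, data)  # stands in for a GridFTP transfer

    def _store(self, lfn, data):
        self.files[lfn] = data
        # Automatic update of the replica catalogue.
        self.catalogue.setdefault(lfn, set()).add(self.name)


replica_catalogue = {}
cern = Site("cern", replica_catalogue)
glasgow_site = Site("glasgow", replica_catalogue)
cern.subscribe(glasgow_site)
cern.publish("lfn:run42.dat", b"event data")
```

After the `publish` call both sites hold the file and the catalogue lists both as replica locations, without the consumer having asked for that particular file.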


Replica Manager

  • New web service under development

    • GDMP functionality will be absorbed

    • Will use replica location service

  • Core API has been defined

    • replicateFile, copyAndRegisterFile, deleteFile, registerEntry, unregisterEntry

  • Iteration with WP5 on accessing data from Storage Elements
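
The slide names five core operations but gives no signatures. The sketch below shows one plausible way they could fit together, using in-memory dictionaries in place of Storage Elements and the replica location service. Only the five method names come from the slide; the parameters, the semantics assigned to each call and the class name are assumptions.

```python
class ReplicaManagerSketch:
    """Hypothetical in-memory stand-in for the replica manager's core API."""

    def __init__(self):
        self.catalogue = {}  # LFN -> set of PFNs (stands in for the RLS)
        self.storage = {}    # PFN -> bytes (stands in for Storage Elements)

    def registerEntry(self, lfn, pfn):
        """Record an existing physical file under a logical name."""
        self.catalogue.setdefault(lfn, set()).add(pfn)

    def unregisterEntry(self, lfn, pfn):
        """Remove a catalogue entry without touching the file itself."""
        self.catalogue.get(lfn, set()).discard(pfn)

    def copyAndRegisterFile(self, source_pfn, dest_pfn, lfn):
        """Copy a file to a new location and register the copy."""
        self.storage[dest_pfn] = self.storage[source_pfn]
        self.registerEntry(lfn, dest_pfn)

    def replicateFile(self, lfn, dest_pfn):
        """Create another replica of an already registered logical file."""
        source_pfn = sorted(self.catalogue[lfn])[0]
        self.copyAndRegisterFile(source_pfn, dest_pfn, lfn)

    def deleteFile(self, lfn, pfn):
        """Delete one physical replica and its catalogue entry."""
        self.storage.pop(pfn, None)
        self.unregisterEntry(lfn, pfn)


manager = ReplicaManagerSketch()
manager.storage["file:///tmp/File-1"] = b"payload"
manager.copyAndRegisterFile(
    "file:///tmp/File-1", "gsiftp://se.cern.example.org/File-1", "lfn:File-1"
)
manager.replicateFile("lfn:File-1", "gsiftp://se.glasgow.example.org/File-1")
```

In this reading, `replicateFile` is built on `copyAndRegisterFile`, so the catalogue never lists a replica that was not actually copied.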


Optimisation

  • Negotiation with scheduling for data intensive jobs

    • minimise job time / maximise grid throughput

    • Given the distribution of data a job will use, what is the most appropriate place to run it?

    • Once it's running: is it better to remote-open, cache or make a new replica nearby?
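
The placement question can be made concrete with a deliberately simple cost model: run the job at the site that would have to fetch the fewest bytes. This is an assumed stand-in for the real optimiser, which the slides say also weighs network statistics and access patterns. The function names, site names and file sizes are made up for the illustration.

```python
def bytes_to_fetch(site, needed, replicas, sizes):
    """Total size of the needed files that have no replica at `site`."""
    return sum(sizes[lfn] for lfn in needed if site not in replicas[lfn])


def best_site(sites, needed, replicas, sizes):
    """Pick the site with the least data to transfer before the job can run."""
    return min(sites, key=lambda s: bytes_to_fetch(s, needed, replicas, sizes))


# Hypothetical data distribution: file sizes in GB and the sites holding each file.
sizes = {"a": 10, "b": 4, "c": 1}
replicas = {"a": {"glasgow"}, "b": {"cern", "glasgow"}, "c": {"cern"}}
needed = ["a", "b", "c"]

choice = best_site(["cern", "glasgow"], needed, replicas, sizes)
```

Here Glasgow only lacks the 1 GB file while CERN lacks the 10 GB one, so the job is sent to Glasgow. A fuller model would divide each transfer by the measured bandwidth between sites.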


…Optimisation

  • Dynamic replication decisions based on network stats and file access patterns

  • Economic model being tested

    • “Greedy” local optimisation leads to a reasonable global optimum…

  • Data-centric grid simulation to test these replication algorithms


Meta-data

  • Need for transparent, secure access to meta-data

  • Both for grid-specific (e.g. replica catalogue) and application-specific meta-data.

  • Spitfire service available

    • Current version 1.1.0

    • http://hep-proj-spitfire.web.cern.ch/hep-proj-spitfire


Current Spitfire

  • Secure access over HTTPS to retrieve from or publish to any RDBMS

    • Can use web-browser as client


Security

  • Authentication is provided over SSL via a Globus certificate

  • Remote users are mapped onto a database role, so they can only perform authenticated operations on the database


Security Mechanism

[Flow diagram, reconstructed from its labels]

  • Input: HTTP + SSL request + client certificate

  • Components: Servlet Container, SSLServletSocketFactory, TrustManager, Security Servlet, Authorization Module, Translator Servlet, Connection Pool, RDBMS

  • Repositories: Trusted CAs, Revoked Certs repository, Role repository, Role / Connection mappings

  • Decisions: Is certificate signed by a trusted CA? Has certificate been revoked? Does user specify role? (Find default.) Role ok?

  • Result: Map role to connection ID; request and connection ID
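
The decision chain in the diagram can be sketched as a single function. The order of checks follows the labels on the slide, with the default-role fallback assumed to apply when the user names no role. Everything else is an assumption: the dictionaries stand in for the trusted-CA list, the revoked-certificate repository, the role repository and the role-to-connection mappings, and none of the names are Spitfire's own.

```python
class AccessDenied(Exception):
    """Raised when any check in the chain fails."""


def authorise(cert, requested_role, trusted_cas, revoked, roles, defaults, connections):
    """Return the database connection ID a request may use, or raise AccessDenied."""
    # Is the certificate signed by a trusted CA?
    if cert["issuer"] not in trusted_cas:
        raise AccessDenied("certificate not signed by a trusted CA")
    # Has the certificate been revoked?
    if cert["serial"] in revoked:
        raise AccessDenied("certificate has been revoked")
    # Does the user specify a role? If not, find the default.
    subject = cert["subject"]
    role = requested_role if requested_role is not None else defaults.get(subject)
    # Role ok?
    if role is None or role not in roles.get(subject, set()):
        raise AccessDenied("role not permitted for this user")
    # Map role to connection ID.
    return connections[role]


# Hypothetical repositories and certificate.
trusted_cas = {"Example Grid CA"}
revoked = {99}
roles = {"/O=Grid/CN=Alice": {"reader", "writer"}}
defaults = {"/O=Grid/CN=Alice": "reader"}
connections = {"reader": "conn-ro", "writer": "conn-rw"}
alice = {"issuer": "Example Grid CA", "serial": 7, "subject": "/O=Grid/CN=Alice"}

default_conn = authorise(alice, None, trusted_cas, revoked, roles, defaults, connections)
writer_conn = authorise(alice, "writer", trusted_cas, revoked, roles, defaults, connections)
```

Because the caller only ever receives a connection ID bound to a database role, the database's own permissions do the final enforcement.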


Developments to Spitfire

  • Web Services API is defined

    • Implementation to start immediately

    • Access via SOAP, initially over HTTPS

  • Higher level services

    • Meta-data distribution and replication

    • Clean-up services


Service Index

  • How do I find a specific grid service?

    • E.g. replica location server, image database, information service

  • XML Service description

    • What, where, attributes, how to contact.

  • Scalable architectures for querying this have been developed

  • Service index web service

    • W. Hoschek’s thesis and paper ([email protected])

    • API developed
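
An XML service description covering "what, where, attributes, how to contact" might look like the document embedded below, with a small query over a set of such descriptions. The element names, attribute names and URL are invented for the illustration; the slides do not give the actual schema or the service index API.

```python
import xml.etree.ElementTree as ET

# Hypothetical description: what the service is, where it lives, and its attributes.
DESCRIPTION = """
<service>
  <type>replica-location-service</type>
  <link>https://rls.example.org:8443/rls</link>
  <attribute name="protocol">SOAP</attribute>
  <attribute name="owner">example-vo</attribute>
</service>
"""


def parse_service(xml_text):
    """Turn one XML service description into a plain dictionary."""
    root = ET.fromstring(xml_text.strip())
    return {
        "type": root.findtext("type"),
        "link": root.findtext("link"),
        "attributes": {a.get("name"): a.text for a in root.findall("attribute")},
    }


def find_services(descriptions, service_type):
    """Return the parsed descriptions whose type matches `service_type`."""
    parsed = (parse_service(text) for text in descriptions)
    return [service for service in parsed if service["type"] == service_type]


matches = find_services([DESCRIPTION], "replica-location-service")
```

A client would take the `link` of a match as the contact point and use the attributes to choose between several services of the same type.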


More Info

  • More information available at…

http://www.gridpp.ac.uk/datamanagement

http://cern.ch/grid-data-management
