Data Management GridPP and EDG


Gavin McCance

University of Glasgow

May 2, 2002

http://www.gridpp.ac.uk/datamanagement

http://cern.ch/grid-data-management

Who are we?
  • GridPP
    • Effort based at Glasgow
  • Collaboration with European DataGrid
    • WP2: Data Management work package
    • CERN, Finland, Italy
    • Replication collaboration with Globus + PPDG project

What do we* do?
  • Replica management
    • Replica catalogues
    • File access and transfer
  • Grid query optimisation (replica optimisation)*
  • Secure meta-data catalogues*
  • Service Index

Replica Catalogues
  • Must maintain replicas of the same files
  • Have a globally unique Logical File Name (LFN) mapping to multiple physical instances of the file (PFNs).
  • Catalogue to keep track of all these mappings!

[Figure: one LFN for File-1 mapping to physical copies of File-1 at Paris, Chicago and Glasgow]
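
The LFN-to-PFN mapping described on this slide can be sketched as a small in-memory catalogue. This is only an illustration of the data structure, not the LDAP-based catalogue itself; the class name, the `lfn://` and `gsiftp://` strings and the `example.org` hosts are all assumptions made for the example.

```python
from collections import defaultdict


class ReplicaCatalogue:
    """Toy in-memory catalogue: one LFN maps to many PFNs (hypothetical API)."""

    def __init__(self):
        self._replicas = defaultdict(set)  # LFN -> set of PFNs

    def register(self, lfn, pfn):
        """Record that a physical copy of `lfn` exists at `pfn`."""
        self._replicas[lfn].add(pfn)

    def unregister(self, lfn, pfn):
        """Forget one physical copy; drop the LFN when no copies remain."""
        self._replicas[lfn].discard(pfn)
        if not self._replicas[lfn]:
            del self._replicas[lfn]

    def lookup(self, lfn):
        """Return all known PFNs for `lfn`, sorted for stable output."""
        return sorted(self._replicas.get(lfn, ()))


# Hypothetical hosts standing in for the three sites in the figure.
catalogue = ReplicaCatalogue()
for site in ("paris", "chicago", "glasgow"):
    catalogue.register("lfn://File-1", f"gsiftp://{site}.example.org/data/File-1")
```

A client that only knows the logical name calls `catalogue.lookup("lfn://File-1")` and receives every physical location it could read from.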

…catalogues
  • Current services use LDAP
  • Collaboration with Globus + PPDG on new replica catalogue framework (GIGGLE)
  • Prototype Replica Location Service (RLS) under development
    • Will use meta-data service (Spitfire)…
    • API implemented as wrapper for current LDAP based replica catalogue

…RLS
  • Implemented as web service

[Figure: RLS layout showing three RLIs, five LRCs and five Storage Elements]
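
A two-level lookup of the kind the RLS figure suggests can be sketched as follows, assuming that each LRC (local replica catalogue) holds the LFN-to-PFN mappings for its own storage and that an RLI (replica location index) only records which LRCs know about a given LFN. The class names, method names and URLs are hypothetical, and the sketch ignores the web service layer entirely.

```python
class LocalReplicaCatalogue:
    """Holds LFN -> PFN mappings for one site (hypothetical sketch)."""

    def __init__(self, name):
        self.name = name
        self.mappings = {}  # LFN -> list of PFNs

    def add(self, lfn, pfn):
        self.mappings.setdefault(lfn, []).append(pfn)


class ReplicaLocationIndex:
    """Records which LRCs hold each LFN, then delegates the PFN lookup."""

    def __init__(self):
        self.index = {}  # LFN -> list of LRCs

    def publish(self, lrc):
        """Index every LFN the given LRC currently knows about."""
        for lfn in lrc.mappings:
            holders = self.index.setdefault(lfn, [])
            if lrc not in holders:
                holders.append(lrc)

    def locate(self, lfn):
        """Ask each LRC that holds `lfn` for its physical file names."""
        pfns = []
        for lrc in self.index.get(lfn, []):
            pfns.extend(lrc.mappings.get(lfn, []))
        return pfns


glasgow = LocalReplicaCatalogue("glasgow")
glasgow.add("lfn://File-1", "gsiftp://se.glasgow.example.org/File-1")
cern = LocalReplicaCatalogue("cern")
cern.add("lfn://File-1", "gsiftp://se.cern.example.org/File-1")

rli = ReplicaLocationIndex()
rli.publish(glasgow)
rli.publish(cern)
```

Keeping only "who holds what" in the index means the index stays small, while the detailed mappings remain next to the storage they describe.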

Transferring files
  • What replicates the files?
    • Grid Data Mirroring Package (GDMP)
    • GDMP 3.0 software just released
    • GSI authentication and authorisation
    • GridFTP file transfer
    • Subscription based file replication
    • Automatic update of replica catalogue
    • http://cmsdoc.cern.ch/cms/grid/

Replica Manager
  • New web service under development
    • GDMP functionality will be absorbed
    • Will use replica location service
  • Core API has been defined
    • replicateFile, copyAndRegisterFile, deleteFile, registerEntry, unregisterEntry
  • Iteration with WP5 on accessing data from Storage Elements
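
To make the five core operations concrete, here is a minimal in-memory sketch. Only the method names come from the slide; the argument lists, the behaviour of each method and the `pfn://` strings are assumptions, and two dictionaries stand in for the replica catalogue and the Storage Elements.

```python
class ReplicaManagerSketch:
    """Illustrative stand-in for the Replica Manager; signatures are assumed."""

    def __init__(self):
        self.catalogue = {}  # LFN -> set of PFNs
        self.storage = {}    # PFN -> file contents (stands in for Storage Elements)

    def registerEntry(self, lfn, pfn):
        """Catalogue-only operation: record an existing physical copy."""
        self.catalogue.setdefault(lfn, set()).add(pfn)

    def unregisterEntry(self, lfn, pfn):
        """Catalogue-only operation: forget a physical copy."""
        self.catalogue.get(lfn, set()).discard(pfn)

    def copyAndRegisterFile(self, data, lfn, dest_pfn):
        """Place new data on storage and register it under `lfn`."""
        self.storage[dest_pfn] = data
        self.registerEntry(lfn, dest_pfn)

    def replicateFile(self, lfn, dest_pfn):
        """Copy an already registered file to a new location and register it."""
        source_pfn = min(self.catalogue[lfn])  # any existing replica will do
        self.storage[dest_pfn] = self.storage[source_pfn]
        self.registerEntry(lfn, dest_pfn)

    def deleteFile(self, lfn, pfn):
        """Remove one physical copy and its catalogue entry."""
        self.storage.pop(pfn, None)
        self.unregisterEntry(lfn, pfn)


rm = ReplicaManagerSketch()
rm.copyAndRegisterFile(b"events", "lfn://run1", "pfn://cern/run1")
rm.replicateFile("lfn://run1", "pfn://glasgow/run1")
rm.deleteFile("lfn://run1", "pfn://cern/run1")
```

After these three calls the only remaining replica of `lfn://run1` is the Glasgow copy, and the catalogue agrees with what is on storage.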

Optimisation
  • Negotiation with scheduling for data intensive jobs
    • minimise job time / maximise grid throughput
    • Given the distribution of data a job will use, what is the most appropriate place to run it?
    • Once it's running: is it better to remote-open, cache or make a new replica nearby?
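
One simple way to answer the "where should the job run?" question is to estimate, for each candidate site, how long it would take to bring in the files that are not already there. The sketch below does exactly that; the cost model (size divided by bandwidth, summed over missing files), the function name and all the numbers are assumptions for illustration, not the optimiser described in the talk.

```python
def best_site(file_sizes, replicas, bandwidth, candidates):
    """Return the candidate site with the lowest estimated data-transfer time.

    file_sizes: LFN -> size in MB
    replicas:   LFN -> set of sites holding a copy
    bandwidth:  (source site, destination site) -> MB/s
    """
    def transfer_time(site):
        total = 0.0
        for lfn, size in file_sizes.items():
            if site in replicas[lfn]:
                continue  # already local, nothing to move
            # Fetch from whichever replica holder is fastest to reach.
            total += min(size / bandwidth[(src, site)] for src in replicas[lfn])
        return total

    return min(candidates, key=transfer_time)


# Made-up example: a large file at Glasgow, a small one at CERN.
file_sizes = {"lfn://A": 1000.0, "lfn://B": 100.0}
replicas = {"lfn://A": {"glasgow"}, "lfn://B": {"cern"}}
bandwidth = {("glasgow", "cern"): 10.0, ("cern", "glasgow"): 10.0}

chosen = best_site(file_sizes, replicas, bandwidth, ["glasgow", "cern"])
```

Here running at Glasgow means moving only the 100 MB file, so `chosen` is `"glasgow"`. A fuller model would also weigh queue length and CPU availability at each site.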

…Optimisation
  • Dynamic replication decisions based on network stats and file access patterns
  • Economic model being tested
    • “Greedy” local optimisation leads to a reasonable global optimum…
  • Data-centric grid simulation to test these replication algorithms

Meta-data
  • Need for transparent, secure access to meta-data
  • Both for grid-specific (e.g. replica catalogue) and application-specific meta-data
  • Spitfire service available
    • Current version 1.1.0
    • http://hep-proj-spitfire.web.cern.ch/hep-proj-spitfire

Current Spitfire
  • Secure access over HTTPS to retrieve from or publish to any RDBMS
    • Can use web-browser as client

Security
  • Authentication is provided over SSL via a Globus certificate
  • Remote users are mapped onto a database role, so they can only perform the operations that role is authorised for

Security Mechanism
  • Incoming request: HTTP + SSL request + client certificate
  • Components shown in the diagram
    • Servlet Container
    • SSLServletSocketFactory
    • TrustManager
    • Trusted CAs
    • Revoked Certs repository
    • Security Servlet
    • Authorization Module
    • Role repository
    • Translator Servlet
    • Connection Pool
    • Role/Connection mappings
    • RDBMS
  • Decision points
    • Is certificate signed by a trusted CA?
    • Has certificate been revoked?
    • Does user specify role? (otherwise find default)
    • Role ok?
    • Map role to connection id
  • Passed on to the database layer: request and connection ID
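
The sequence of checks in the diagram can be sketched as a single function. This is a plain-Python illustration of the decision order only: the real service runs inside a Java servlet container and validates real X.509 certificates, whereas here a certificate is just a dictionary, no cryptography is performed, and every name and data structure is hypothetical.

```python
def authorise(cert, trusted_cas, revoked_serials, allowed_roles, default_role,
              role_to_connection, requested_role=None):
    """Return a connection id for the request, or raise PermissionError."""
    # 1. Is the certificate signed by a trusted CA?
    if cert["issuer"] not in trusted_cas:
        raise PermissionError("certificate not signed by a trusted CA")
    # 2. Has the certificate been revoked?
    if cert["serial"] in revoked_serials:
        raise PermissionError("certificate has been revoked")
    # 3. Does the user specify a role? If not, find the default.
    subject = cert["subject"]
    role = requested_role if requested_role is not None else default_role.get(subject)
    # 4. Role ok?
    if role is None or role not in allowed_roles.get(subject, set()):
        raise PermissionError("role not permitted for this user")
    # 5. Map role to connection id.
    return role_to_connection[role]


# Made-up configuration for one user.
cert = {"issuer": "Example Grid CA", "serial": 42, "subject": "/O=Grid/CN=Alice"}
config = dict(
    trusted_cas={"Example Grid CA"},
    revoked_serials={7},
    allowed_roles={"/O=Grid/CN=Alice": {"reader", "writer"}},
    default_role={"/O=Grid/CN=Alice": "reader"},
    role_to_connection={"reader": "conn-ro", "writer": "conn-rw"},
)

connection_id = authorise(cert, **config)
```

With no role requested, the user falls back to the default `reader` role and is handed the read-only connection; passing `requested_role="writer"` would select the read-write one instead.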

Developments to Spitfire
  • Web Services API is defined
    • Implementation to start immediately
    • Access via SOAP, initially over HTTPS
  • Higher level services
    • Meta-data distribution and replication
    • Clean-up services

Service Index
  • How do I find a specific grid service?
    • E.g. replica location server, image database, information service
  • XML Service description
    • What, where, attributes, how to contact.
  • Scalable architectures for querying this have been developed
  • Service index web service
    • W. Hoschek’s thesis and paper (WP2@CERN)
    • API developed
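
As an illustration of an XML service description carrying "what, where, attributes, how to contact", the sketch below parses one made-up description and filters a list of descriptions by service type. The element names, attribute names and URL are invented for the example and are not the schema used by the Service Index.

```python
import xml.etree.ElementTree as ET

# Hypothetical description; element and attribute names are assumptions.
SERVICE_XML = """
<service>
  <type>replica-location-server</type>
  <endpoint>https://rls.example.org:8443/rls</endpoint>
  <attribute name="vo">example-vo</attribute>
  <contact protocol="SOAP"/>
</service>
"""


def parse_service(xml_text):
    """Turn one XML description into a plain dictionary."""
    root = ET.fromstring(xml_text.strip())
    return {
        "what": root.findtext("type"),
        "where": root.findtext("endpoint"),
        "attributes": {a.get("name"): a.text for a in root.findall("attribute")},
        "how": root.find("contact").get("protocol"),
    }


def find_services(descriptions, service_type):
    """Return every parsed description whose type matches."""
    return [d for d in descriptions if d["what"] == service_type]


description = parse_service(SERVICE_XML)
matches = find_services([description], "replica-location-server")
```

A client looking for a replica location server would query the index by type, then use the `where` and `how` fields of a match to contact the service.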

More Info
  • More information available at…

http://www.gridpp.ac.uk/datamanagement

http://cern.ch/grid-data-management
