Core SRB Technology for 2005 NCOIC Workshop

By Michael Wan and Wayne Schroeder

SDSC

SDSC/UCSD/NPACI


Outline

Basic Concepts behind SRB

SRB architecture

SRB features

SRB Usage Model

Wayne:

SRB productization - Installation, Administration, etc

Security and Authentication

Examples and demo


Initial Design of SRB

Transparency and Uniformity

Data are increasingly distributed

Design Goal –

use a single interface and authorization mechanism to access data across:

Multiple hosts

Multiple OS platforms

Multiple resource types (UNIX FS, HPSS, UniTree, DBMS, ...)


Initial Design of SRB (cont)

Global view

Global Logical Name space –

Data organization

UNIX-like directories (collections) and files (data)

Mapping of logical name to physical attributes - host address, physical path.

UNIX-like API and utilities

Single Global User Name Space

Single sign-on

No need for a UNIX account on every system

Robust access control
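
The global logical name space described on this slide can be pictured as a catalog that maps each logical path to its physical attributes. The following is a minimal sketch, not SRB code: the class names, host names and paths are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhysicalLocation:
    """Where one data object actually lives (illustrative fields only)."""
    host: str
    path: str


class LogicalNamespace:
    """Toy catalog: logical path -> physical location (hypothetical class)."""

    def __init__(self):
        self._entries = {}

    def register(self, logical_path, host, physical_path):
        self._entries[logical_path] = PhysicalLocation(host, physical_path)

    def resolve(self, logical_path):
        """Translate a logical name into its physical attributes."""
        return self._entries[logical_path]

    def list_collection(self, collection):
        """Return the logical names directly under a collection, like `ls`."""
        prefix = collection.rstrip("/") + "/"
        return sorted(
            name for name in self._entries
            if name.startswith(prefix) and "/" not in name[len(prefix):]
        )


ns = LogicalNamespace()
ns.register("/home/demo/run1.dat", "hostA.example.org", "/data/vault/0001")
ns.register("/home/demo/run2.dat", "hostB.example.org", "/archive/x/0002")
print(ns.list_collection("/home/demo"))
print(ns.resolve("/home/demo/run2.dat").host)
```

The user only ever sees the logical paths; the two files sit on different hosts without that showing up in the listing.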


SRB Architecture

Federated middleware system

Client/server model –

Federation of resource servers with uniform interfaces

client-server

server-server - Each request handler has 2 versions

Local

Remote – pass off to server that can handle the request

All Servers use same software

Simplicity – easy to implement, easy to debug

Robust access control

user level, grant access to multiple users

group level

tickets

MCAT –

Metadata catalog
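
The "local" and "remote" versions of a request handler mentioned above can be illustrated with a toy federation in which every server runs the same code. This is a sketch under assumptions: the `Server` class, its methods and the in-memory "resources" are hypothetical and stand in for real network calls and storage.

```python
class Server:
    """Toy federation member; every server runs this same code."""

    def __init__(self, name):
        self.name = name
        self.local_resources = {}   # resource name -> {path: bytes}
        self.peers = {}             # resource name -> Server that owns it

    def add_resource(self, resource):
        self.local_resources[resource] = {}

    def add_peer(self, peer):
        for resource in peer.local_resources:
            self.peers[resource] = peer

    def write(self, resource, path, data):
        """Entry point: pick the local or the remote version of the handler."""
        if resource in self.local_resources:
            return self._write_local(resource, path, data)
        return self._write_remote(resource, path, data)

    def _write_local(self, resource, path, data):
        self.local_resources[resource][path] = data
        return self.name            # report which server did the work

    def _write_remote(self, resource, path, data):
        # Pass the request off to the server that can handle it.
        return self.peers[resource].write(resource, path, data)


server1, server2 = Server("server1"), Server("server2")
server2.add_resource("tape")
server1.add_peer(server2)
handled_by = server1.write("tape", "/vault/f1", b"abc")
print(handled_by)
```

The client talks only to server1; the request ends up on server2 because that is where the resource lives.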


Federation of Servers

[Diagram: MCAT, MCAT server, Server1, Server2]

SRB as a Data Grid

[Diagram: DB, MCAT, and six SRB servers]

  • Data Grid has arbitrary number of servers

  • Complexity is hidden from users


SRB server design

Three-layer design

Top layer

Interacts with clients and other servers through TCP/IP sockets

User authentication

Handle function requests – parses requests and invokes handlers in middle and bottom layers.


SRB server design (cont1)

Middle layer (logical layer)

Most requests pass through here

Input parameters are in their logical representations (logical path name, logical resource name)

Generally, two types of requests –

Data access –

Queries MCAT, translates from logical to physical representations

Calls functions in the bottom (physical) layer to access data

Metadata access –

Interacts with MCAT


SRB server design (cont2)

Bottom layer (physical layer)

Where all data I/O to/from resources is done

Handles three types of resources

File system

Drivers to interface with different FS

FS supported: UNIX, HPSS, ADS, UniTree, gridFTP (to be released)

DB large objects

DB tables

Access DB tables (query, insert, …)
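
The three layers described on the last three slides can be sketched as three small functions. This is only an illustration of the layering, assuming a dictionary in place of the MCAT and a single UNIX file system driver; none of the names below come from the SRB source.

```python
import os
import tempfile


class UnixFsDriver:
    """Bottom layer: does the real I/O for one resource type."""

    def read(self, physical_path):
        with open(physical_path, "rb") as f:
            return f.read()


# Driver table keyed by resource type. Only a UNIX file system driver is
# sketched here; other resource types would plug in the same way.
DRIVERS = {"unix": UnixFsDriver()}


def logical_read(catalog, logical_path):
    """Middle layer: logical -> physical translation, then call the driver."""
    resource_type, physical_path = catalog[logical_path]
    return DRIVERS[resource_type].read(physical_path)


def handle_request(catalog, request):
    """Top layer: parse the request and invoke the matching handler."""
    op, logical_path = request.split(" ", 1)
    if op != "read":
        raise ValueError(f"unsupported operation: {op}")
    return logical_read(catalog, logical_path)


# Demo with a temporary file standing in for a storage resource.
with tempfile.TemporaryDirectory() as vault:
    physical = os.path.join(vault, "obj0001")
    with open(physical, "wb") as f:
        f.write(b"hello")
    catalog = {"/home/demo/hello.txt": ("unix", physical)}
    result = handle_request(catalog, "read /home/demo/hello.txt")
print(result)
```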


SRB Features - Authentication

Supports two authentication schemes

Encrypt1 (SDSC) – no plain-text password sent over the network

GSI (Globus)

Wayne will give details


Performance Enhancement

Parallel I/O

For transferring large files

Uses multiple threads for data transfer and disk I/O

Interface with HPSS’s mover protocol for parallel I/O

Parallel third party transfer for copy and replicate

One hop data transfer between client and data resource
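
The idea behind multi-threaded transfer of a large file can be shown with a local copy in which each thread moves its own byte range. This is a generic sketch, not the SRB implementation: it copies between two local files, and the function names and thread count are arbitrary choices.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def copy_range(src, dst, offset, length):
    """Copy one byte range; each thread uses its own file handles."""
    with open(src, "rb") as fin, open(dst, "r+b") as fout:
        fin.seek(offset)
        fout.seek(offset)
        fout.write(fin.read(length))


def parallel_copy(src, dst, num_threads=4):
    size = os.path.getsize(src)
    # Pre-size the destination so every thread can seek to its own offset.
    with open(dst, "wb") as f:
        f.truncate(size)
    chunk = max(1, -(-size // num_threads))   # ceiling division
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        futures = [
            pool.submit(copy_range, src, dst, off, min(chunk, size - off))
            for off in range(0, size, chunk)
        ]
        for fut in futures:
            fut.result()                      # re-raise any worker error


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "big.dat")
    dst = os.path.join(tmp, "copy.dat")
    payload = os.urandom(1_000_003)
    with open(src, "wb") as f:
        f.write(payload)
    parallel_copy(src, dst)
    with open(dst, "rb") as f:
        copied = f.read()
print(copied == payload)
```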

Bulk Operation

Uploading and downloading large number of small files

Multiple threads

Bulk registration – 500 files in one call

3-10 times speedup
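
Bulk registration cuts the number of catalog round trips by registering many files per call. A minimal sketch of the batching, assuming a batch size of 500 as on the slide; `CountingCatalog` and `bulk_register` are hypothetical stand-ins, not SRB API names.

```python
def batches(items, size=500):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


class CountingCatalog:
    """Stand-in for the metadata catalog; counts how often it is called."""

    def __init__(self):
        self.calls = 0
        self.registered = []

    def bulk_register(self, paths):
        self.calls += 1
        self.registered.extend(paths)


catalog = CountingCatalog()
files = [f"/home/demo/small_{i:05d}.dat" for i in range(1234)]
for batch in batches(files):
    catalog.bulk_register(batch)
print(catalog.calls, len(catalog.registered))
```

Here 1234 files need 3 catalog calls instead of 1234.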


Sput – serial mode

[Figure: numbered flow (steps 1-6) between the Sput client, the SRB agents on SRB server1 and SRB server2, the MCAT, and R. Labels: Peer-to-peer Request; srbObjCreate; srbObjWrite; Server(s) Spawning; Data Transfer.]

MCAT

1. Logical-to-Physical mapping

2. Identification of Replicas

3. Access & Audit Control


Parallel mode Data Transfer – Client Initiated

[Figure: numbered flow (steps 1-8) between the Sput -M client, the SRB agents on SRB server1 and SRB server2, the MCAT, and R. Labels: srbObjPut; Connect to server; Return socket addr., port and cookie; Data transfer.]

MCAT

1. Logical-to-Physical mapping

2. Identification of Replicas

3. Access & Audit Control


Performance Enhancement (cont1)

Container –

physical grouping of small files

for tape I/O or archival resources

Easy to use, transparent to users
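
A container physically groups many small files into one large object so that an archival resource handles a single file. The sketch below shows the general technique with an in-memory byte stream and an (offset, length) index. It is an illustration only; the actual SRB container format is not described in these slides.

```python
import io


class Container:
    """Toy container: one byte stream plus an (offset, length) index."""

    def __init__(self):
        self._blob = io.BytesIO()
        self._index = {}

    def add(self, name, data):
        offset = self._blob.seek(0, io.SEEK_END)   # append at the end
        self._blob.write(data)
        self._index[name] = (offset, len(data))

    def read(self, name):
        offset, length = self._index[name]
        self._blob.seek(offset)
        return self._blob.read(length)

    def size(self):
        return self._blob.seek(0, io.SEEK_END)


box = Container()
box.add("a.txt", b"alpha")
box.add("b.txt", b"bravo!")
print(box.read("b.txt"), box.size())
```

Callers still ask for files by name, which is what makes the grouping transparent to them.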


Data Replication

An SRB file can have multiple replicas

Replicas can be stored in different resources

Sls -l mfile

fedsrbbrick8 0 demoResc 3029449 2005-07-29-15.37 % mfile

fedsrbbrick8 1 demoResc1 3029449 2005-07-29-21.28 % mfile

Commands that use replicas

Sreplicate – replicate a file to the specified resource

Sbackupsrb – backup a file to the specified resource

SsyncD – Synchronize the replicas of a file


PhyMove – move SRB files to another resource

Move files to another resource without making another replica

Normally used by admin to move files around

Bulk phyMove – large number of small files

Parallel I/O – large files

Container – move files into container

Heavily used by the BBSRC project for distributed archive.

Files uploaded to local server

Files eventually moved to a central archival resource by admin


Performance Enhancement (cont2)

Use of checksum

Stored as MCAT metadata associated with a file

Checksum routines are part of the server and client code

For verification and synchronization of data

Built into most data handling utilities

Sput, Sget, Srsync, Schksum
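
How a stored checksum supports both verification and synchronization can be sketched as follows. The use of MD5 here is an assumption made for the example, and the function names are hypothetical.

```python
import hashlib


def checksum(data):
    """MD5 hex digest; the algorithm choice here is an assumption."""
    return hashlib.md5(data).hexdigest()


def needs_transfer(local_data, stored_checksum):
    """Sync decision: transfer only when checksums differ or none is stored."""
    return stored_checksum is None or checksum(local_data) != stored_checksum


def verify(received_data, expected_checksum):
    """Post-transfer check: did the bytes arrive intact?"""
    return checksum(received_data) == expected_checksum


original = b"experiment run 42"
stored = checksum(original)            # value kept in the catalog
print(needs_transfer(original, stored))
print(needs_transfer(b"experiment run 43", stored))
print(verify(original, stored))
```

An unchanged file is skipped during synchronization, and a changed or never-registered file is transferred.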


Metadata in SRB

SRB System Metadata

Free-form Metadata (User-defined)

Attribute-Value-Unit Triplets…

Extensible Schema Metadata

User Defined

Tables integrated into MCAT Core Schema

External Database

Metadata operations

Metadata Insertion through User Interfaces

Bulk Metadata Insertion

Template based Metadata Extraction

Query Metadata through well defined Interfaces
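
Free-form metadata as attribute-value-unit triplets, together with bulk insertion and a simple query, can be sketched like this. The store is an in-memory stand-in for the catalog; `MetadataStore` and the sample attributes are invented for the example.

```python
from collections import namedtuple

AVU = namedtuple("AVU", ["attribute", "value", "unit"])


class MetadataStore:
    """Toy store of attribute-value-unit triplets per logical path."""

    def __init__(self):
        self._by_object = {}

    def add(self, logical_path, attribute, value, unit=""):
        self._by_object.setdefault(logical_path, []).append(
            AVU(attribute, value, unit)
        )

    def bulk_add(self, rows):
        """Insert many (path, attribute, value[, unit]) rows in one call."""
        for row in rows:
            self.add(*row)

    def query(self, attribute, value):
        """Return the logical paths carrying the given attribute and value."""
        return sorted(
            path for path, avus in self._by_object.items()
            if any(a.attribute == attribute and a.value == value for a in avus)
        )


store = MetadataStore()
store.bulk_add([
    ("/home/demo/f1", "site", "SLAC"),
    ("/home/demo/f2", "site", "RAL"),
    ("/home/demo/f3", "site", "SLAC"),
    ("/home/demo/f1", "energy", "10.5", "GeV"),
])
print(store.query("site", "SLAC"))
```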


SRB Proxy operation

Perform operations on server on behalf of user

Operate where the data is located

File format conversion, md5 checksum, subsetting and filtering, etc

Two types of proxy operations

Proxy commands

Server forks and execs an executable/script on the server

Pipe output back to client

Proxy functions

Functions built into server

Well defined framework for writing proxy functions
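
A proxy command, where the server runs an executable next to the data and pipes the output back, can be sketched with a child process. The sketch assumes a fixed table of permitted commands; the command names and the table itself are hypothetical, and a Python one-liner stands in for a real server-side script.

```python
import subprocess
import sys

# Hypothetical table of commands the server is willing to run.
ALLOWED_COMMANDS = {
    "wordcount": [
        sys.executable, "-c",
        "import sys; print(len(sys.stdin.read().split()))",
    ],
}


def run_proxy_command(name, stdin_data=b""):
    """Run a permitted command in a child process and return its stdout."""
    argv = ALLOWED_COMMANDS[name]           # KeyError for unknown commands
    completed = subprocess.run(
        argv, input=stdin_data, stdout=subprocess.PIPE, check=True
    )
    return completed.stdout                 # this is what goes to the client


output = run_proxy_command("wordcount", b"data stays near the server")
print(output.decode().strip())
```

Only the small result crosses the network; the input data never leaves the server side.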


HDF5-SRB Model – Data flow

Client API: srbObjRequest(void *obj, int objID)

Server API: srbObjProcess(void *obj, int objID)

[Figure: numbered data flow involving the client, the SRB Server, the HDF5 Library and the HDF5 file: 1. packMsg(); 2. unpackMsg(); 3. H5Obj::op(); 4. Access file; 5. packMsg(); 6. unpackMsg()]


Zone Federation

Federation of multiple MCATs

MCAT ZONE

defines a federation of SRB resources controlled by a single MCAT

Each Zone has full control of its own administrative domain

Each Zone can operate entirely independently from other zones.

Data and Resource sharing across ZONES

Use storage resources in foreign zones

Share data across zones

Copy data across zones
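
Federation of independent zones can be pictured as several catalogs, each owning its own part of the name space, with requests routed by zone. In this sketch the zone name is taken to be the first element of the logical path; that convention, and all class and host names, are assumptions made for the example.

```python
class Zone:
    """One administrative domain with its own catalog (one MCAT per zone)."""

    def __init__(self, name):
        self.name = name
        self.catalog = {}           # path within the zone -> physical location

    def register(self, path, location):
        self.catalog[path] = location


class Federation:
    """Routes a zone-qualified logical path to the catalog that owns it."""

    def __init__(self, zones):
        self.zones = {zone.name: zone for zone in zones}

    @staticmethod
    def split(logical_path):
        # Assumed convention for this sketch: "/<zone>/<path inside zone>".
        _, zone_name, rest = logical_path.split("/", 2)
        return zone_name, "/" + rest

    def resolve(self, logical_path):
        zone_name, path = self.split(logical_path)
        return self.zones[zone_name].catalog[path]

    def copy(self, src_path, dst_path, dst_location):
        """Copy across zones: the destination zone records its own entry."""
        self.resolve(src_path)      # raises KeyError if the source is unknown
        zone_name, path = self.split(dst_path)
        self.zones[zone_name].register(path, dst_location)


zone_a, zone_b = Zone("zoneA"), Zone("zoneB")
zone_a.register("/home/demo/f1", "hostA:/vault/f1")
fed = Federation([zone_a, zone_b])
fed.copy("/zoneA/home/demo/f1", "/zoneB/home/demo/f1", "hostB:/vault/f1")
print(fed.resolve("/zoneB/home/demo/f1"))
```

Each zone's catalog is only ever modified through that zone, which mirrors the point that every zone keeps control of its own administrative domain.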


Peer to peer Federated MCAT Zone

[Diagram: three zones: MCAT1 with Server1.1 and Server1.2; MCAT2 with Server2.1 and Server2.2; MCAT3 with Server3.1]


SRB Client Implementations

A set of Basic APIs

Over 160 APIs

Used by all clients to make requests to servers

Scommands

UNIX-like command line utilities for UNIX and Windows platforms

Over 60 - Sls, Scp, Sput, Sget …


SRB Client Implementations (cont)

inQ – Windows GUI browser

Jargon – Java SRB client classes

Pure Java implementation

mySRB – Web based GUI

Runs in a web browser

Java Admin Tool

GUI for User and Resource management

Matrix – Web service for SRB work flow


inQ Windows GUI


MySRB – Web Based SRB Interface

SRB Browser

Advanced Metadata manipulation


SRB Usage Model

Various Usage models

Specific Usages

SLAC’s BaBar experiment

UK eScience BBSRC

BIRN


SRB Configuration – Peer-to-peer Data Grid

Data sharing, no central resource

Projects – NARA, BIRN

[Diagram: four peers, each a resource with its server]


SRB Configuration - Exploding Star

Data source – physics experiment

Projects – BaBar, KEK

[Diagram: one source server and five satellite servers]


SRB Configuration - Imploding Star

Archival Storage Model

Projects – UK eScience – BBSRC

[Diagram: five satellite source servers, a central cache server and a central archival server]


Peer to peer Federation of MCAT Zone

[Diagram: three zones: MCAT1 with Server1.1 and Server1.2; MCAT2 with Server2.1 and Server2.2; MCAT3 with Server3.1]


Summary of the BaBar Project

Preproduction evaluation – 2003

Highlights of Wilko Kroeger’s (SLAC) talk at IEEE 2003

Title - “Distributing Babar Data using SRB”

BaBar computing resources are geographically distributed across 5 Tier-A centers: GridKA (D), IN2P3 (F), INFN-Padova (I), RAL (UK), SLAC (USA)

Data have to be replicated to the Tier-A sites.

Number of files: 1M. Total size: 100s of TB


BaBar Preproduction – SRB Usage

Allows transparent access to files.

No need to know the host or storage medium (disk, tape).

Accessing files/collections by attributes.

Find files that were produced at a certain time or site.

Find collections from a particular run period.

Preproduction test – 2 weeks of MCAT and file transfer tests


BaBar Production Update

Transferred ~70 TB and 140K files

Peak rate ~2 TB/day. Average rate: 1 TB/day

Downtime encountered

hardware problem

DB updates

Plan to federate SLAC and IN2P3 Zones –

IN2P3 picks up some of the load

Thanks to Wilko Kroeger (SLAC) and Jean-Yves Nief (IN2P3) for the info
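
The transfer figures on this slide can be put into more familiar terms with a little arithmetic. The sketch below reads the quoted volumes as decimal terabytes, which is an assumption, and uses only the approximate numbers from the slide.

```python
TB = 10**12                         # assumption: decimal terabytes
total_bytes = 70 * TB               # "~70" from the slide
num_files = 140_000                 # "140K files"
avg_rate_bytes_per_day = 1 * TB     # average rate from the slide

avg_file_size_mb = total_bytes / num_files / 10**6
days_at_avg_rate = total_bytes / avg_rate_bytes_per_day
avg_rate_mb_per_s = avg_rate_bytes_per_day / 86_400 / 10**6

print(f"average file size: {avg_file_size_mb:.0f} MB")
print(f"days at average rate: {days_at_avg_rate:.0f}")
print(f"sustained rate: {avg_rate_mb_per_s:.1f} MB/s")
```

Under these assumptions the average file is about 500 MB, and the whole volume corresponds to roughly 70 days of transfer at the average rate.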


UK eScience BBSRC

Archival of Biological Data from 16 sites to a central resource

Data ingested into local resources

Admin uses bulk Sphymove to move data from local resources to a central cache

Moves data into containers

Replicates containers to cache resource at RAL

Replicates containers to ADS archival at RAL

Removes cache copies


UK eScience BBSRC (cont)

Developed some software of their own

User interface using Jargon

GUI

Users not exposed to all SRB functionalities

Request tracker – track data movement after ingestion

Status

Project started at beginning of this year

Just finished the pilot program using SRB 3.2

Upgrading to 3.3 for production


Biomedical Informatics Research Network (BIRN)

Major collaboration with SDSC; several of the projects’ Co-Investigators and Co-PIs are at SDSC.

SRB provides the ability to transparently share data across remote sites.


The BIRN SRB Data Grid


The BIRN Data Grid


SRB in BIRN

[Diagram labels, in slide order: BIRN Toolkit; Collaboration; Applications; Queries/Results; Data Management; Viewing/Visualization; Mediator; GridPort; Grid Management; Data Model; Database; Scheduler; Database; Data Grid; Computational Grid; NMI; SRB; Globus; MCAT; Data Access; HPSS; File System; Distributed Resources]

