1 / 8

SAM Grid, Architecture and Status

SAM Grid, Architecture and Status. Igor Terekhov, Fermilab, Computing Division, PPDG GGF5 BOF July, 2002. SAM as a Data Grid. Provides high-level collective services of reliable data storage and replication

yonah
Download Presentation

SAM Grid, Architecture and Status

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. SAM Grid, Architecture and Status Igor Terekhov, Fermilab, Computing Division, PPDG GGF5 BOF July, 2002

  2. SAM as a Data Grid • Provides high-level collective services of reliable data storage and replication • Embraces multiple MSS’s (Enstore , HPSS, etc) local resource management systems (LSF, FBS, PBS, Condor), several different file transfer protocols (bbftp, kerberos rcp, grid ftp, …) • Optionally uses Grid technologies and tools • Condor as a Batch system (in use) • Globus FTP for data transfers (ready for deployment) • From de facto to de jure…

  3. Routing+Caching=Replication Data Site WAN Data Flow User Station Master Station Master Station Master Station Master Station Master Station Master Mass Storage System Mass Storage System User User

  4. Dzero SAM Deployment Map Processing Center Analysis site

  5. SAM usage statistics for DZero • 497 registered SAM users in production • 360 of them have at some time run at least one SAM project • 132 of them have run more than 100 SAM projects • 323 of them have run a SAM project at some time in the past year • 195 of them have run a SAM project in the past 2 months • 48 registered stations, 340 registered nodes • 115TB of data on tape • 63,235 cached files currently (over 1 million entries total) • 702,089 physical and virtual data files known to SAM • 535,048 physical files (90K raw, 300K MC related) • 71,246 “analysis” projects ever ran • http://d0db.fnal.gov/sam_data_browsing/ for more info

  6. Client Applications Web Command line Python codes, Java codes D0 Framework C++ codes Request Formulator and Planner Request Manager Storage Manager Cache Manager Job Manager Collective Services “Dataset Editor” “Project Master” “Station Master” “Station Master” “File Storage Server” Batch Systems - LSF, FBS, PBS, Condor Job Services SAM Resource Management Data Mover “Optimiser” “Stager” Significant Event Logger Naming Service Catalog Manager Database Manager Mass Storage systems protocols e.g. encp, hpss CORBA UDP Catalog protocols File transfer protocols - ftp, bbftp, rcp GridFTP Connectivity and Resource GSI SAM-specific user, group, node, station registration Bbftp ‘cookie’ Authentication and Security Fabric Resource and Services Catalog Meta-data Catalog Tape Storage Elements Disk Storage Elements Replica Catalog LANs and WANs Compute Elements Code Repository Indicates component that will be replaced enhanced or added using PPDG and Grid tools Name in “quotes” is SAM-given software component name

  7. SAM-Grid Architecture Monitoring and Information Job Handling (All) Job Status Updates JH Client Logging and Bookkeeping Info Processor And Converter Request Broker AAA Condor MMS Resource Info GSI Job Scheduler MDS-2 Condor Class Ads Replica Catalog Condor-G Grid RC Site Gatekeeper GRAM DH Resource Management Compute Element Resource Data Delivery and Caching SAM Batch System Grid sensors Data Handling Information Principal Component Implementation Or Library Service

  8. SAMGrid Principal Components • Job Definition and Management: The preliminary job management architecture is aggressively based on the Condor technology provided by through our collaboration with University of Wisconsin CS Group. • Monitoring and Information Services: We assign a critical role to this part of the system and widen the boundaries of this component to include all services that provide, or receive, information relevant for job and data management. • Data Handling: The existing SAM Data Handling system, when properly abstracted, plays a principal role in the overall architecture and has direct effects on the Job Management services.

More Related