
Large-scale distributed computing in the Netherlands

Explore the world of large-scale distributed computing in the Netherlands with NIKHEF, the Dutch National Institute for Nuclear and High-Energy Physics. Learn about the fascinating experiments and the need for powerful computing to analyze massive amounts of data. Discover how the Grid enables resource sharing and collaboration across institutions.


Presentation Transcript


  1. Large-scale distributed computing in the Netherlands
     NIKHEF, the Dutch National Institute for Nuclear and High-Energy Physics

  2. The basic building blocks of which we and everything in the world about us are made are extremely tiny. Even if you enlarged one of these tiny particles a million million (10¹²) times, it would still be smaller than a full stop.

  3. Large devices are built to study the nature of matter. The ANTARES experiment uses the Mediterranean Sea as a detector.

  4. Many detectors like this will generate between 5 and 10 petabytes each year… That’s almost 9 million CD-ROMs!

  5. The Large Hadron Collider
     Physics @ CERN:
     • LHC particle accelerator, operational in 2007
     • 5-10 petabytes of data per year
     • 150 countries, > 10,000 users
     • lifetime ~ 20 years
     Trigger and data-acquisition chain:
     • 40 MHz (40 TB/sec): level 1, special hardware
     • 75 kHz (75 GB/sec): level 2, embedded processors
     • 5 kHz (5 GB/sec): level 3, PCs
     • 100 Hz (100 MB/sec): data recording & offline analysis
     http://www.cern.ch/ http://www.nikhef.nl/
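A quick sanity check of those numbers: at the final recording rate of 100 MB/sec, one experiment writes on the order of a petabyte per year. The sketch below assumes roughly 10^7 seconds of effective running time per year, a common accelerator rule of thumb that is not stated on the slide.

```python
# Back-of-the-envelope check of the recording rate quoted above.
# Assumption (not from the slide): ~1e7 seconds of effective beam time per year.
record_rate_bytes_per_s = 100e6        # level 3 output: 100 MB/sec to storage
seconds_per_year = 1e7                 # assumed effective running time

bytes_per_year = record_rate_bytes_per_s * seconds_per_year
print(f"{bytes_per_year / 1e15:.1f} PB per experiment per year")   # ~1.0 PB
```

With several experiments recording at comparable rates, the 5-10 petabyte per year figure on the slide follows directly.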

  6. The power of a computer per euro doubles every 18 months…
     [Chart: estimated CPU capacity required at CERN, 1998-2010, in K SI95, split into LHC experiments and other experiments; Jan 2000: 3.5K SI95]
     Moore’s law: some measure of the capacity that technology advances provide for a constant number of processors or investment.
     … but that is not fast enough to keep up with the LHC data rates!

  7. ENVISAT – even more data
     • 3500 MEuro programme cost
     • 10 instruments on board
     • 200 Mbps data rate to ground
     • 400 TB of data archived per year
     • ~100 `standard’ products
     • 10+ dedicated facilities in Europe
     • ~700 approved science user projects
     http://www.esa.int/

  8. The Grid
     Resource sharing in dynamic, multi-institutional “Virtual Organisations”:
     • Dependable
     • Consistent
     • Pervasive
     Bring computers, mass storage, data and information together from different organizations, and make them work like one.

  9. Why is the Grid successful?
     • Wide-area network capacity doubles every 9 months!
     • This rate of growth is faster than that of compute power or storage
     • It makes it worthwhile to collaborate over large distances
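The gap between those doubling times compounds quickly, which is the whole argument for shipping work to remote resources. A small illustrative calculation (the 9- and 18-month figures are taken from these slides; the 5-year horizon is arbitrary):

```python
# Compound growth from the doubling times quoted on these slides.
years = 5
network_growth = 2 ** (years * 12 / 9)    # network capacity doubles every 9 months
compute_growth = 2 ** (years * 12 / 18)   # compute per euro doubles every 18 months

print(f"after {years} years: network x{network_growth:.0f}, compute x{compute_growth:.0f}")
# -> network grows ~100x while compute grows ~10x, so using remote
#    resources becomes relatively cheaper every year.
```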

  10. The Origins of the GRID

  11. A timeline:
      • 1965: DARPA starts network research
      • 1969: ARPAnet with 4 hosts; Creeper & Reaper
      • 1972: TELNET
      • 1973: Ethernet
      • 1974: links to the UK; TCP
      • 1978: RPC conceived (by Per Hansen)
      • ~1980: “commodity computing”
      • 1985: Condor
      • 1986: 1st IETF
      • 1990: TBL creates the Web
      • 1994: W3C
      • 1997: Globus; CORBA
      • 1999: 1st GF
      • 2001: 1st GGF; WSDL
      • 2002: EDG 1.2

  12. The beginnings of the Grid
      • Grown out of distributed computing
      • Gigabit network test beds
      • Supercomputer sharing started in 1995
      • Just a few sites and users
      • Focus shifts to inter-domain operations
      [Image: the GUSTO meta-computing test bed in 1999]

  13. Standards Requirements
      • Standards are key to inter-domain operations
      • Global Grid Forum (GGF) established in 2001
      • Approx. 40 working & research groups
      http://www.gridforum.org/

  14. Network layers and architecture
      • Grid protocol architecture: Application, Collective, Resource, Connectivity, Fabric
      • Internet protocol architecture: Application, Transport, Internet, Link
      • OSI layers and their standards bodies: Application, Presentation, Session (GGF, W3C); Transport, Network (IETF); Data Link, Physical (IEEE)

  15. Grid architecture (alongside the Internet protocol architecture)
      • Application
      • Collective: “coordinating multiple resources” - ubiquitous infrastructure services, application-specific distributed services
      • Resource: “sharing single resources” - negotiating access, controlling use
      • Connectivity: “talking to things” - communication (Internet protocols) & security
      • Fabric: “controlling things locally” - access to, and control of, resources

  16. Grid Software and Middleware
      • Globus Project started in 1997
      • Current de-facto standard
      • Reference implementation of Global Grid Forum standards
      • Toolkit `bag-of-services' approach
      • Several middleware projects:
        • EU DataGrid
        • CrossGrid, DataTAG, PPDG, GriPhyN
        • In NL: ICES/KIS Virtual Lab, VL-E
      http://www.globus.org/

  17. Condor: scavenging cycles off idle workstations
      Leading themes:
      • Make a job feel `at home’
      • Don’t ever bother the resource owner!
      Tools:
      • Bypass: redirect data to a process
      • ClassAds: matchmaking concept
      • DAGman: dependent jobs
      • Kangaroo: file staging & hopping
      • NeST: allocated `storage lots’
      • PFS: Pluggable File System
      • Condor-G: reliable job control for the Grid
      http://www.cs.wisc.edu/condor/
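The ClassAds idea is the heart of this: jobs and machines both publish attribute lists, and a matchmaker pairs a job with an idle machine that satisfies the job's requirements. A minimal sketch of the concept follows; the attribute names and the Python form are illustrative, not Condor's actual ClassAd language.

```python
# Minimal sketch of the ClassAds matchmaking concept (illustrative, not
# Condor's real ClassAd syntax): machines advertise attributes, a job
# advertises requirements, and the matchmaker pairs them.
machines = [
    {"name": "node01", "arch": "x86", "memory_mb": 512,  "idle": True},
    {"name": "node02", "arch": "x86", "memory_mb": 2048, "idle": True},
]

job = {
    "owner": "alice",
    "requirements": lambda m: m["idle"] and m["memory_mb"] >= 1024,
}

match = next((m for m in machines if job["requirements"](m)), None)
print("matched to", match["name"] if match else "no suitable machine")  # node02
```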

  18. What makes a set of systems a Grid?
      • Coordinates resources that are not subject to centralized control
      • Using standard, open, general-purpose protocols and interfaces
      • To deliver nontrivial qualities of service

  19. Grid Standards Today
      • Based on the popular protocols of the ’Net
      • Use the common Grid Security Infrastructure:
        • Extensions to TLS for delegation (single sign-on)
        • Uses the GSS-API standard where possible
      • GRAM (resource allocation): attribute/value pairs over HTTP
      • GridFTP (bulk file transfer): FTP with GSI and high-throughput extras (striping)
      • MDS (monitoring and discovery service): LDAP + schemas
      • …
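To make the GRAM bullet concrete: a job submission is essentially a small set of attribute/value pairs describing what to run and with which resources. The sketch below builds such a description and renders it in the flavour of the Globus RSL notation; the attribute names are illustrative and may not match the exact RSL attribute set.

```python
# Illustrative only: a GRAM-style job description as attribute/value pairs,
# rendered in the flavour of Globus RSL notation. The attribute names here
# are examples, not a definitive list of RSL attributes.
job_request = {
    "executable": "/bin/hostname",
    "count": 1,
    "maxWallTime": 30,        # minutes
    "stdout": "job.out",
}

rsl = "&" + "".join(f"({key}={value})" for key, value in job_request.items())
print(rsl)   # -> &(executable=/bin/hostname)(count=1)(maxWallTime=30)(stdout=job.out)
```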

  20. Getting People Together: Virtual Organisations
      • The user community `out there’ is huge & highly dynamic
      • Applying at each individual resource does not scale
      • Users get together to form Virtual Organisations:
        • Temporary alliance of stakeholders (users and/or resources)
        • Various groups and roles
        • Managed out-of-band by (legal) contracts
      • Authentication, Authorization, Accounting (AAA)

  21. Grid Security Infrastructure
      Requirements:
      • Strong authentication and accountability
      • Traceability
      • “Secure”!
      • Single sign-on
      • Dynamic VOs: “proxying”, “delegation”
      • Work everywhere (“easyEverything”, airport kiosk, handheld)
      • Multiple roles for each user
      • Easy!

  22. Authentication & Public Keys
      Certificate request flow:
      • Alice generates a key pair and sends the public key (e,n) to the CA in a certificate request (CommonName=‘Alice’, Organization=‘KNMI’); the private key (d,n) never leaves Alice
      • The CA checks the identifier in the request against the identity of the requestor
      • The CA operator signs the request with the CA private key (the CA itself holds a self-signed certificate)
      • The CA ships the new certificate back to Alice
      The EU DataGrid PKI:
      • 1 PMA, 13 Certification Authorities
      • Automatic policy evaluation tools
      • Largest Grid PKI in the world (and growing)
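The request/sign flow above can be sketched with a present-day library; the snippet below uses the Python `cryptography` package purely to illustrate the steps, and is not the tooling the EU DataGrid CAs actually used.

```python
# Illustration of the certificate-request step described above, using the
# modern Python 'cryptography' package (an assumption for the sketch; not
# the EU DataGrid tooling). Alice creates a key pair, keeps the private key,
# and sends a request with her public key and identity to the CA.
from cryptography import x509
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa
from cryptography.x509.oid import NameOID

private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)

csr = (
    x509.CertificateSigningRequestBuilder()
    .subject_name(x509.Name([
        x509.NameAttribute(NameOID.COMMON_NAME, "Alice"),
        x509.NameAttribute(NameOID.ORGANIZATION_NAME, "KNMI"),
    ]))
    .sign(private_key, hashes.SHA256())
)
# The CA verifies Alice's identity out-of-band, signs the request with its
# own private key, and ships the resulting certificate back to Alice.
```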

  23. GSI in Action: “Create processes at A and B that communicate & access files at C”
      • The user signs on once via a “grid-id”, generating a proxy credential (or retrieving one from an online repository)
      • The user proxy sends remote process creation requests, with mutual authentication, to GSI-enabled GRAM servers at Site A (Kerberos) and Site B (Unix)
      • Each site authorizes the request, maps it to a local id, creates the process, and generates credentials (restricted proxies, Kerberos tickets)
      • The two processes communicate, again with mutual authentication
      • A remote file access request goes to a GSI-enabled FTP server and storage system at Site C (Kerberos), which likewise authorizes and maps to a local id

  24. Authorization
      • Authorization poses the main scaling problem
      • There is a conflict between accountability and ease-of-use / ease-of-management
      • Getting rid of the “local user” concept eases support for large, dynamic VOs:
        • Temporary account leasing: pool accounts à la DHCP
        • Grid ID-based file operations: slashgrid
        • Sandboxing applications
      This is the direction taken by EU DataGrid and PPDG.
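The pool-account idea mirrors DHCP: rather than creating a local account per grid user in advance, the site leases an account from a pool to a grid identity on first use. A minimal sketch of that bookkeeping follows; real implementations differ in detail, and the names are illustrative.

```python
# Minimal sketch of "pool account" leasing (the DHCP analogy from the slide).
# Pool names and the mapping logic are illustrative, not a real gatekeeper.
pool = ["pool001", "pool002", "pool003"]   # pre-created local accounts
leases = {}                                # grid distinguished name -> account

def lease_account(grid_dn: str) -> str:
    if grid_dn in leases:                  # existing lease: reuse it
        return leases[grid_dn]
    if not pool:
        raise RuntimeError("no free pool accounts")
    account = pool.pop(0)                  # hand out the next free account
    leases[grid_dn] = account
    return account

print(lease_account("/O=dutchgrid/O=users/CN=Alice"))   # -> pool001
```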

  25. Locating a Data Set Replica
      • Grid Data Mirror Package
      • Designed for terabyte-scale data
      • Moves data across sites
      • Replicates both files and individual objects
      • The Resource Broker uses catalogue information to schedule your job
      • Read-only copies are owned by the Replica Manager
      http://cmsdoc.cern.ch/cms/grid
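The scheduling point is the key one: the broker looks a logical file name up in a replica catalogue and prefers a site that already holds a copy. A small sketch of that lookup, with invented catalogue contents and site names:

```python
# Sketch of replica-catalogue-aware scheduling: a logical file name maps to
# the sites holding physical copies, and the broker prefers one of those.
# Catalogue contents and site names below are invented for illustration.
replica_catalogue = {
    "lfn:cms-events-2002.root": ["nikhef.nl", "cern.ch", "in2p3.fr"],
}

def schedule(lfn: str, candidate_sites: list[str]) -> str:
    replicas = set(replica_catalogue.get(lfn, []))
    for site in candidate_sites:
        if site in replicas:              # run where the data already is
            return site
    return candidate_sites[0]             # otherwise fall back and copy the data

print(schedule("lfn:cms-events-2002.root", ["sara.nl", "nikhef.nl"]))  # nikhef.nl
```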

  26. Mass Data Transport
      • Need for an efficient, high-speed protocol: GridFTP
      • All storage elements (disk caches, tape robots, …) share a common interface
      • Also supports GSI & single sign-on
      • Optimized for high-speed networks (>1 Gbit/s)
      • Data-source striping through parallel streams
      • Ongoing work on “better TCP”
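The striping idea is simply to split a file into byte ranges and move the ranges over several connections at once, so a single slow TCP stream does not limit throughput. The toy sketch below copies ranges concurrently on a local disk; it illustrates the splitting, not the GridFTP protocol itself.

```python
# Toy illustration of the parallel-stream idea behind GridFTP striping:
# split a file into byte ranges and copy the ranges concurrently.
# This is a local copy for illustration, not the GridFTP protocol.
from concurrent.futures import ThreadPoolExecutor
import os

def copy_range(src: str, dst: str, offset: int, length: int) -> None:
    with open(src, "rb") as fin, open(dst, "r+b") as fout:
        fin.seek(offset)
        fout.seek(offset)
        fout.write(fin.read(length))

def parallel_copy(src: str, dst: str, streams: int = 4) -> None:
    size = os.path.getsize(src)
    with open(dst, "wb") as f:
        f.truncate(size)                       # pre-allocate the target file
    chunk = -(-size // streams)                # ceiling division
    with ThreadPoolExecutor(max_workers=streams) as pool:
        for i in range(streams):
            pool.submit(copy_range, src, dst, i * chunk, chunk)
```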

  27. Grid Access to Databases
      • SpitFire (standard data source services): uniform access to persistent storage on the Grid
      • Multiple-role support
      • Compatible with GSI (single sign-on) through CoG
      • Uses standard technologies: JDBC, SOAP, XML
      • Supports various back-end databases
      http://hep-proj-spitfire.web.cern.ch/hep-proj-spitfire/
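The general shape of such access is "send a query over HTTP, get the result set back as XML". The sketch below shows only that shape; the endpoint, envelope, and query are invented for illustration and are not Spitfire's actual interface.

```python
# Illustrative only: the general "query over HTTP/SOAP, results as XML" shape
# of a Grid database service. The endpoint URL, envelope and query below are
# invented for the sketch and are not Spitfire's actual interface.
import urllib.request

soap_body = """<?xml version="1.0"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <query>SELECT run, events FROM runs WHERE year = 2002</query>
  </soap:Body>
</soap:Envelope>"""

request = urllib.request.Request(
    "https://example.org/spitfire/query",       # placeholder endpoint
    data=soap_body.encode("utf-8"),
    headers={"Content-Type": "text/xml; charset=utf-8"},
)
# response = urllib.request.urlopen(request)    # would return an XML result set
```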

  28. EU DataGrid
      • Middleware research project (2001-2003)
      • Driving applications:
        • High-Energy Physics
        • Earth Observation
        • Biomedicine
      • Operational testbed:
        • 21 sites
        • 6 VOs
        • ~200 users, growing by ~100 per month!
      http://www.eu-datagrid.org/

  29. EU DataGrid Test Bed 1
      • DataGrid TB1:
        • 14 countries
        • 21 major sites
        • CrossGrid: 40 more sites
        • Growing rapidly…
      • Submitting jobs:
        • Log in only once, run everywhere
        • Cross administrative boundaries in a secure and trusted way
        • Mutual authorization
      http://marianne.in2p3.fr/

  30. DutchGrid Platform (www.dutchgrid.nl)
      • DutchGrid:
        • Test bed coordination
        • PKI security
        • Support
      • Participation by:
        • NIKHEF, KNMI, SARA
        • DAS-2 (ASCI): TUDelft, Leiden, VU, UvA, Utrecht
        • Telematics Institute
        • FOM, NWO/NCF
        • Min. EZ, ICES/KIS
        • IBM, KPN, …
      [Map of participating sites: ASTRON, Amsterdam, Leiden, Enschede, KNMI, Utrecht, Delft, Nijmegen]

  31. A Bright Future for Grid! You could plug your computer into the wall and have direct access to huge computing resources almost immediately (with a little help from toolkits and portals) … It may still be science – although not fiction – but we are about to make this into reality!
