Information Technologies Department Tour of CERN Computer Center and the Grid at CERN Welcome!
Computing at CERN • General Purpose Computing Environment • Administrative Computing Services • Physics and engineering computing • Consolidation, coordination and standardization of computing activities • Physics applications (e.g., for data acquisition/offline analysis) • Accelerator design and operations http://cern.ch/it
LHC Data every year • 1 Megabyte (1 MB): a digital photo • 1 Gigabyte (1 GB) = 1000 MB; 5 GB = a DVD movie • 1 Terabyte (1 TB) = 1000 GB: world annual book production • 1 Petabyte (1 PB) = 1000 TB: annual production of one LHC experiment • 1 Exabyte (1 EB) = 1000 PB; 3 EB = world annual information production • 600 million collisions per second in each detector: recording rate > 1 Petabyte / sec! • After filtering, 100 collisions of interest per second: recording rate > 1 Gigabyte / sec • From all experiments: > 25 Petabytes of stored data / year • CMS, LHCb, ATLAS, ALICE http://cern.ch/lhc
LHC Data every year LHC data correspond to about 1000 years of DVD-quality video produced each year. Where will the experiments store all of these data?
LHC Data Processing LHC data analysis requires computing power equivalent to ~100,000 of today's fastest PC cores. Where will the experiments find such computing power?
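The "about 1000 years of DVD-quality video" figure can be checked with a rough calculation. The sketch below uses the numbers quoted on the previous slide (more than 25 Petabytes stored per year, 5 GB per DVD movie) and assumes, as a value not given in the slides, that one DVD movie runs for about two hours.

```python
# Back-of-the-envelope check of the "about 1000 years of DVD video" figure.
# Decimal units, as on the slide: 1 PB = 1000 TB = 1,000,000 GB.
GB_PER_PB = 1000 * 1000

stored_pb_per_year = 25   # "> 25 Petabytes of stored data / year" (slide figure)
dvd_size_gb = 5           # "5GB = A DVD movie" (slide figure)
hours_per_dvd = 2.0       # ASSUMPTION: a DVD movie runs about two hours


def years_of_video(stored_pb, dvd_gb, hours_each):
    """Convert a stored data volume into years of continuous DVD playback."""
    dvds = stored_pb * GB_PER_PB / dvd_gb
    hours = dvds * hours_each
    return hours / (24 * 365)


years = years_of_video(stored_pb_per_year, dvd_size_gb, hours_per_dvd)
print(f"{years:.0f} years of continuous video per year of LHC data")
```

Under these assumptions the result is a little over 1100 years, consistent with the rounded figure on the slide. A different assumed running time per DVD scales the answer proportionally.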
Computing power available at CERN (2012) • High-throughput computing based on reliable “commodity” technology • More than 65’000 cores (20’000 CPUs) in about 10’000 boxes (Linux) • 60 Petabytes on 65’000 drives (NAS Disk storage) • 34 Petabytes on 45’000 tape slots with 160 high speed drives Nowhere near enough!
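To see why these resources fall short, the 2012 figures can be set against the requirements quoted on the earlier slides. The sketch below is only a rough comparison: it treats the 65’000 installed cores as directly comparable to the "100,000 of today's fastest PC cores", and it assumes the whole tape capacity is available for new LHC data, neither of which the slides state.

```python
# Figures quoted on this slide for CERN in 2012.
cores = 65_000
boxes = 10_000
disk_pb, disk_drives = 60, 65_000
tape_pb, tape_slots = 34, 45_000

# Figures quoted on earlier slides.
cores_needed = 100_000      # "~ 100,000 of today's fastest PC cores"
stored_pb_per_year = 25     # "> 25 Petabytes of stored data / year"

TB_PER_PB = 1000            # decimal units, as used in the slides

# Average size of each building block.
cores_per_box = cores / boxes
tb_per_disk_drive = disk_pb * TB_PER_PB / disk_drives
tb_per_tape_slot = tape_pb * TB_PER_PB / tape_slots

# ASSUMPTION: installed cores are comparable to the "fastest PC cores".
core_fraction = cores / cores_needed

# ASSUMPTION: all tape capacity is free for new LHC data.
years_of_tape = tape_pb / stored_pb_per_year

print(f"{cores_per_box:.1f} cores per box")
print(f"{tb_per_disk_drive:.2f} TB per disk drive")
print(f"{tb_per_tape_slot:.2f} TB per tape slot")
print(f"{core_fraction:.0%} of the required cores")
print(f"tape fills in about {years_of_tape:.1f} years of data taking")
```

Even with these generous assumptions, the installed cores cover only part of the stated need, and the tape capacity corresponds to less than two years of stored data.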
Computing for LHC • Problem: even with the Computer Centre upgrade, CERN can provide only a fraction of the necessary resources • Solution: Computing centers, which were isolated in the past, will be connected, uniting the computing resources of particle physicists worldwide Users of CERN (2012) • Member States: 600 institutes, 7000 users • Non-Member States: 600 institutes, 4400 users
What is the Grid? • The World Wide Web provides seamless access to information that is stored in many millions of different geographical locations • In contrast, the Grid is an emerging infrastructure that provides seamless access to computing power and data storage capacity distributed over the globe
One Web but many Grids Grid development has been initiated by the academic, scientific and research community, but industry is also interested. • UK e-Science Grid • Netherlands – VLAM, PolderGrid • Germany – UNICORE, Grid proposal • France – Grid funding approved • Italy – INFN Grid • Eire – Grid proposals • Switzerland - Network/Grid proposal • Hungary – DemoGrid, Grid proposal • Norway, Sweden - NorduGrid • NASA Information Power Grid • DOE Science Grid • NSF National Virtual Observatory • NSF GriPhyN • DOE Particle Physics Data Grid • NSF TeraGrid • DOE ASCI Grid • DOE Earth Systems Grid • DARPA CoABS Grid • NEESGrid • DOH BIRN • NSF iVDGL • DataGrid (CERN, ...) • EuroGrid (Unicore) • DataTag (CERN,…) • Astrophysical Virtual Observatory • GRIP (Globus/Unicore) • GRIA (Industrial applications) • GridLab (Cactus Toolkit) • CrossGrid (Infrastructure Components) • EGSO (Solar Physics)
5 big ideas 1. Sharing resources on a global scale: main issues are trust, different management policies, virtual organisations, 24-hour access and support. 2. Security: main issues are well-defined yet flexible rules, authentication, authorisation, compatibility and standards. 3. Balancing the load: this is more than just cycle scavenging (SETI@home); middleware is needed to monitor and broker resources. 4. The death of distance: networks delivered 56 Kb/s 10 years ago, now we have 155 Mb/s, and for the LHC we anticipate 10 Gb/s. 5. Open standards: Grid standards are converging, and include Web services, the Globus Toolkit and various protocols.
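The "death of distance" point becomes concrete when the three quoted network speeds are turned into transfer times. The sketch below estimates how long it would take to move 1 Terabyte over each link. It assumes an ideal link used at its full nominal rate with no protocol overhead, reads "Kb/s", "Mb/s" and "Gb/s" as decimal kilobits, megabits and gigabits per second, and picks 1 TB purely as an illustrative size.

```python
def transfer_seconds(size_bytes, bits_per_second):
    """Ideal transfer time: total bits divided by the link rate."""
    return size_bytes * 8 / bits_per_second


ONE_TB = 10**12  # bytes, decimal units as in the slides (illustrative size)

# Link speeds quoted on the slide, in bits per second.
links = {
    "56 Kb/s": 56e3,
    "155 Mb/s": 155e6,
    "10 Gb/s": 10e9,
}

for name, rate in links.items():
    seconds = transfer_seconds(ONE_TB, rate)
    print(f"{name:>9}: {seconds / 3600:12.2f} hours ({seconds / 86_400:.2f} days)")
```

Under these idealised assumptions the same Terabyte takes years on the oldest link, a large part of a day at 155 Mb/s, and minutes at 10 Gb/s, which is what makes moving experiment data between distant computing centres practical.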
Grid applications for Science • Medical/Healthcare (imaging, diagnosis and treatment) • Bioinformatics (study of the human genome and proteome to understand genetic diseases) • Nanotechnology (design of new materials from the molecular scale) • Engineering (design optimization, simulation, failure analysis and remote instrument access and control) • Natural Resources and the Environment (weather forecasting, earth observation, modeling and prediction of complex systems)
Grid@CERN • CERN projects: • Worldwide LHC Computing Grid (WLCG) • External projects with CERN participation: • Enabling Grids for E-SciencE (EGEE) – ended April 2010 • European Grid Initiative (EGI) • European Middleware Initiative (EMI) • Industry funded projects: • CERN Openlab
WLCG: Worldwide LHC Computing Grid • Biggest scientific Grid project in the world • Running since Oct 2008 • More than 150 sites in 34 countries • More than 250’000 processors • The Tier 1 centres must store all the data produced by the LHC • Runs around 100 million programs a year • Used by 8000 scientists in 500 institutes http://cern.ch/lcg
The EGEE vision Access to a production-quality Grid will change the way science and much else is done in Europe. An international network of scientists will be able to model a new flood of the Danube in real time, using meteorological and geological data from several centres across Europe. A team of engineering students will be able to run the latest 3D rendering programs from their laptops using the Grid. A geneticist at a conference, inspired by a talk she hears, will be able to launch a complex biomolecular simulation from her mobile phone. This project officially ended in April 2010. http://www.eu-egee.org
CERN Openlab III • Framework for evaluating and integrating cutting-edge IT technologies or services in partnership with industry, focusing on future versions of the WLCG. • Phase III focuses on • Security • Automation tools • Data replication and monitoring • Behaviour of a network of a thousand machines • Platform: thermal optimisation, applications... http://cern.ch/openlab