
Presentation Transcript


  1. Grid computing: an introduction. Lionel Brunie, Institut National des Sciences Appliquées, Laboratoire LIRIS – UMR CNRS 5205 – Equipe DRIM, Lyon, France

  2. A Brain is a Lot of Data! (Mark Ellisman, UCSD) • And comparisons must be made among many • We need to get to one micron to know the location of every cell; we're just now starting to get to 10 microns

  3. Data-intensive applications • Nuclear and high-energy physics • Simulations • Earth observation, climate modelling • Geophysics, earthquake modelling • Aerodynamics and fluid dynamics • Pollutant dispersion • Astronomy: future telescopes will produce more than 10 petabytes per year! • Genomics • Chemistry and biochemistry • Financial applications • Medical imaging

  4. Performance evolution of computing components • Network/processor performance • Processor speed doubles every 18 months • Network speed doubles every 9 months • Disk storage capacity doubles every 12 months • 1986 to 2000 • Processors: x 500 • Networks: x 340,000 • 2001 to 2010 • Processors: x 60 • Networks: x 4,000. Moore's Law vs. storage improvements vs. optical improvements. Graph from Scientific American (Jan 2001) by Cleo Vilett, source Vinod Khosla, Kleiner Perkins Caufield & Byers.
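
As a rough cross-check (not from the slides), compounding those doubling times over the 1986-2000 period gives multipliers of the same order of magnitude as the ones quoted above. A minimal sketch in Python:

```python
# Growth factor implied by a doubling time, over a given number of months.
def growth(months: float, doubling_months: float) -> float:
    return 2 ** (months / doubling_months)

period = 14 * 12  # 1986 to 2000, in months
print(f"Processors (18-month doubling): x {growth(period, 18):,.0f}")  # ~ x 650
print(f"Networks   ( 9-month doubling): x {growth(period, 9):,.0f}")   # ~ x 420,000
print(f"Disks      (12-month doubling): x {growth(period, 12):,.0f}")  # ~ x 16,000
```

The computed factors differ somewhat from the slide's figures (x 500, x 340,000) because the doubling times are rules of thumb, but the conclusion is the same: networks improve far faster than processors.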

  5. Conclusion: invest in networks!

  6. Hansel and Gretel are lost in the forest of definitions • Distributed system • Parallel system • Cluster computing • Meta-computing • Grid computing • Peer to peer computing • Global computing • Internet computing • Network computing • Cloud computing

  7. Distributed system • n autonomous computers (sites): n administrators, n data/control flows • an interconnection network • User view: one single (virtual) system • "A distributed system is a collection of independent computers that appear to the users of the system as a single computer" (Distributed Operating Systems, A. Tanenbaum, Prentice Hall, 1994) • "Traditional" programmer view: client-server (see the sketch below)
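
To make the client-server programmer view concrete, here is a minimal sketch assuming plain TCP sockets from the Python standard library; the address, port and echo protocol are invented for the example.

```python
# Minimal client-server sketch (hypothetical address/port, toy echo protocol).
import socket
import threading

HOST, PORT = "127.0.0.1", 5000

srv = socket.create_server((HOST, PORT))   # the server is listening from here on

def server() -> None:
    conn, _ = srv.accept()                 # wait for one client
    with conn:
        request = conn.recv(1024)          # read the request
        conn.sendall(b"echo: " + request)  # send the reply

t = threading.Thread(target=server)
t.start()

# Client side: open a connection, send a request, read the reply.
with socket.create_connection((HOST, PORT)) as sock:
    sock.sendall(b"hello")
    print(sock.recv(1024).decode())        # prints "echo: hello"

t.join()
srv.close()
```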

  8. Parallel system • 1 computer, n nodes: one administrator, one scheduler, one power source • memory: it depends (shared or distributed) • Programmer view: one single machine executing parallel codes • Various programming models (message passing, distributed shared memory, data parallelism…); a message-passing sketch follows below
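
As an illustration of the message-passing model mentioned above, a minimal sketch assuming mpi4py is installed (the partial-sum computation is invented for the example):

```python
# Message-passing sketch; run with e.g. `mpiexec -n 4 python demo.py`.
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()        # identity of this process
size = comm.Get_size()        # number of processes

# Each process computes a partial sum, then rank 0 gathers the global result.
partial = sum(range(rank * 1000, (rank + 1) * 1000))
total = comm.reduce(partial, op=MPI.SUM, root=0)

if rank == 0:
    print(f"{size} processes, total = {total}")
```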

  9. Examples of parallel systems (figures): a CC-NUMA architecture and a shared-nothing architecture (nodes with their own CPU, memory and peripherals, linked by an interconnection network).

  10. Cluster computing • Use of PCs interconnected by a (high-performance) network as a (cheap) parallel machine • Two main approaches: • dedicated network (based on a high-performance interconnect: Myrinet, SCI, Infiniband, Fibre Channel...) • non-dedicated network (based on a (good) LAN)

  11. Where are we today? • A source of efficient and up-to-date information: www.top500.org • The 500 best architectures! • No. 1: 1457 (1105) Tflops! No. 500: 22 (13) Tflops (Rpeak, with Rmax in parentheses) • Sum (1-500) = 16,953 Tflops • 31% in Europe, 59% in North America • 1 Flops = 1 floating-point operation per second • 1 TeraFlops = 1000 GigaFlops

  12. How does it grow? • In 1993 (prehistoric times!) • No. 1: 59.7 GFlops • No. 500: 0.4 GFlops • Sum = 1.17 TFlops • In 2004 (yesterday) • No. 1: 70 TFlops (x 1118) • No. 500: 850 GFlops (x 2125) • Sum = 11,274 TFlops and 408,629 processors

  13. 2007/11 best (http://www.top500.org/): Peak: 596 Tflops!

  14. 2008/11 best (http://www.top500.org/): Peak: 1457 Tflops!

  15. 2009/11 best (http://www.top500.org/): Peak: 2331 Tflops!

  16. Performance evolution

  17. Projected performance

  18. Architecture distribution

  19. Interconnection network distribution

  20. NEC Earth Simulator (1st in 2004; 30th in 2007) • Single-stage crossbar: 2,700 km of cables • A MIMD machine with distributed memory • 700 TB of disk space • 1.6 PB mass storage area • Footprint: 4 tennis courts, 3 floors

  21. NEC earth simulator

  22. BlueGene • 212992 processors – 3D torus • Rmax = 478 Tflops ; Rpeak = 596 Tflops

  23. RoadRunner • 3456 nodes (18 clusters) - 2-stage fat-tree Infiniband (optical) • 1 node = 2 AMD Opteron dual-core + 4 IBM PowerXCell 8i • Rmax = 1.1 Pflops; Rpeak = 1.5 Pflops • 3.9 MW (0.35 Gflops/W)

  24. Jaguar • 224,162 cores – Memory: 300 TB – Disk: 10 PB • AMD x86_64 Opteron six-core 2600 MHz (10.4 GFlops) • Rmax = 1759 Tflops – Rpeak = 2331 Tflops • Power: 6.95 MW • http://www.nccs.gov/jaguar/

  25. Network computing • From LAN (cluster) computing to WAN computing • A set of machines distributed over a MAN/WAN that are used to execute loosely coupled parallel codes • Depending on the infrastructure (software and hardware), network computing takes the form of Internet computing, P2P, grid computing, etc.

  26. Meta computing (early 90's) • Definitions become fuzzy... • A meta computer = a set of (widely) distributed (high-performance) processing resources that can be associated for processing a parallel, not-so-loosely-coupled code • A meta computer = a parallel virtual machine over a distributed system • (Figure: clusters of PCs and a supercomputer, each on its SAN/LAN, interconnected through a WAN, with a visualization front end.)

  27. Internet computing • Use of (idle) computers interconnected by the Internet for processing high-throughput applications • Ex: SETI@HOME • 5M+ users since launch • 2009/11: 930k users, 2.4M computers; 190k active users, 278k active computers, 2M years of CPU time • 234 "countries" • 10^21 floating-point operations since 1999 • 769 Tflop/s! • BOINC infrastructure (Décrypthon, RSA-155…) • Programmer view: a single master, n servants (see the sketch below)
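
The "single master, n servants" view can be sketched as follows; this is only an illustration using a local process pool as a stand-in for volunteer machines, not the actual BOINC or SETI@home infrastructure.

```python
# Master/worker sketch: the master splits the problem into independent work
# units, dispatches them to servants and collects the results.
from multiprocessing import Pool

def work_unit(chunk: range) -> int:
    """A servant processes one independent work unit (here: a toy sum)."""
    return sum(chunk)

if __name__ == "__main__":
    units = [range(i * 10_000, (i + 1) * 10_000) for i in range(8)]
    with Pool(processes=4) as servants:
        results = servants.map(work_unit, units)   # master gathers partial results
    print("total =", sum(results))
```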

  28. Global computing • Internet computing on a pool of sites • Meta computing with loosely coupled codes • Grid computing with poor communication facilities • Ex : Condor (invented in the 80’s)

  29. Peer-to-peer computing • A site is both client and server: a "servent" • Dynamic servent discovery by "contamination" (queries flood from neighbour to neighbour; see the sketch below) • 2 approaches: • centralized management: Napster, Kazaa, eDonkey… • distributed management: Gnutella, KAD, Freenet, BitTorrent… • Typical application: file sharing
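
A small sketch of discovery by "contamination" (flooding), assuming a hypothetical static overlay; real systems such as Gnutella add caching, dynamic neighbours and richer messages.

```python
# A query floods from neighbour to neighbour until its TTL expires.
from collections import deque

overlay = {            # hypothetical overlay: servent -> neighbours
    "A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"], "D": ["B", "C", "E"], "E": ["D"],
}

def flood(source: str, ttl: int) -> set[str]:
    """Return the set of servents reached by a query issued at `source`."""
    reached = {source}
    frontier = deque([(source, ttl)])
    while frontier:
        node, hops = frontier.popleft()
        if hops == 0:
            continue
        for neighbour in overlay[node]:
            if neighbour not in reached:       # each servent forwards a query once
                reached.add(neighbour)
                frontier.append((neighbour, hops - 1))
    return reached

print(flood("A", ttl=2))   # {'A', 'B', 'C', 'D'}: E is more than 2 hops away
```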

  30. Grid computing (1) “coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organisations” (I. Foster)

  31. Grid computing (2) • Information grid • large access to distributed data (the Web) • Data grid • management and processing of very large distributed data sets • Computing grid • meta computer • Ex : Globus, Legion, UNICORE…

  32. Parallelism vs grids: a few reminders • Grids date back "only" to 1996 • Parallelism is older! (first classification in 1972) • Motivations: • the need for more computing power (weather forecasting, atomic simulation, genomics…) • the need for more storage capacity (petabytes and more) • in a word: improve performance! Three ways... Work harder --> use faster hardware. Work smarter --> optimize algorithms. Get help --> use more computers!

  33. The performance? Ideally it grows linearly • Speed-up: • if TS is the best time to process a problem sequentially, • then the time with P processors should be TP = TS/P • Speedup = TS/TP • it is limited (Amdahl's law): any program has a sequential part F and a parallel part T//, so TS = F + T//, • hence the speedup is bounded: S = (F + T//) / (F + T// / P) < 1/f, where f = F/(F + T//) is the sequential fraction (a numerical sketch follows below) • Scale-up: • if TPS is the time to process a problem of size S with P processors, • then TPS should also be the time to process a problem of size n*S with n*P processors
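
A minimal numerical illustration of Amdahl's bound, assuming the non-sequential part parallelizes perfectly:

```python
# Amdahl's law: speedup as a function of the sequential fraction f and of P.
def amdahl_speedup(f: float, p: int) -> float:
    return 1.0 / (f + (1.0 - f) / p)

for p in (10, 100, 1000):
    print(f"f = 5%, P = {p:4d}: speedup = {amdahl_speedup(0.05, p):6.2f} (bound = 20)")
# Even with 1000 processors, a 5% sequential part caps the speedup below 1/f = 20.
```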

  34. Grid computing

  35. Starting point • Real need for very high performance infrastructures • Basic idea : share computing resources • “The sharing that the GRID is concerned with is not primarily file exchange but rather direct access to computers, software, data, and other resources, as is required by a range of collaborative problem-solving and resource-brokering strategies emerging in industry, science, and engineering” (I. Foster)

  36. Applications • Distributed supercomputing • High throughput computing • On demand (real time) computing • Data intensive computing • Collaborative computing

  37. An Example Virtual Organization: CERN's Large Hadron Collider and the Worldwide LHC Computing Grid (WLCG) • 1800 physicists, 140 institutes, 33 countries • 10 PB of data per year; 50,000 CPUs?

  38. LCG System Architecture • A 4-tier computing model • Tier-0: CERN: the accelerator • data acquisition and reconstruction • data distribution to the Tier-1s (~online) • Tier-1 • 24x7 access and availability • quasi-online data acquisition • data service on the grid: archiving using a mass-storage service • heavy data analysis • ~10 countries • Tier-2 • simulation • end-user data analysis, in batch and interactively • ~40 countries • Tier-3 • end-user scientific analysis • Tier counts: Tier-0 (1), Tier-1 (11), Tier-2 (~150), Tier-3 (~50) • LHC • 40 million collisions per second • ~100 collisions of interest per second after filtering • 1-10 MB of data per collision • acquisition rate: 0.1 to 1 GB/sec • 10^10 collisions recorded each year • ~10 petabytes/year (a quick check of these rates follows below)
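
A quick order-of-magnitude check, using only the numbers quoted on the slide:

```python
# Order-of-magnitude check of the LHC data rates quoted above.
interesting_per_s = 100          # collisions of interest per second after filtering
mb_per_collision = (1, 10)       # 1-10 MB per collision

rate_gb_s = tuple(n * interesting_per_s / 1000 for n in mb_per_collision)
print(f"acquisition rate: {rate_gb_s[0]} to {rate_gb_s[1]} GB/s")      # 0.1 to 1 GB/s

collisions_per_year = 1e10
volume_pb = collisions_per_year * 1 / 1e9                              # at 1 MB each
print(f"yearly volume: ~{volume_pb:.0f} PB")                           # ~10 PB/year
```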

  39. LCG System Architecture (continued) • (Figure: the Tier-0 trigger and data-acquisition system is linked to the Tier-1s by 10 Gbps links over an Optical Private Network (to almost all sites); the Tier-1s are linked to the Tier-2s over general-purpose academic/research networks.) From F. Malek – LCG France

  40. Back to the roots (routes) • Railways, telephone, electricity, roads, the banking system • Complexity, standards, distribution, integration (large/small) • Impact on society: how the US grew • Big differences: • clients (the citizens) are NOT providers (the State or companies) • small number of actors/providers • small number of applications • strong supervision/control

  41. Computational grid • "Hardware and software infrastructure that provides dependable, consistent, pervasive and inexpensive access to high-end computational capabilities" • Performance criteria: • security • reliability • computing power • latency • throughput • scalability • services

  42. A few reminders about parallelism

  43. Sources of parallelism (figure: a query plan over relations R1 and R2) ---> pipeline parallelism ---> intra-operator parallelism ---> inter-operator parallelism ---> inter-query parallelism (an intra-operator sketch follows below)
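
A minimal sketch of intra-operator parallelism, assuming a toy in-memory relation split into horizontal partitions (a real system would push this into the parallel DBMS executor):

```python
# Intra-operator parallelism: one operator (a selection) runs on several
# partitions of the same relation in parallel.
from concurrent.futures import ProcessPoolExecutor

def select_partition(partition: list[tuple[int, str]]) -> list[tuple[int, str]]:
    """Apply the same selection predicate to one partition of the relation."""
    return [row for row in partition if row[0] > 100]

if __name__ == "__main__":
    relation = [(i, f"value-{i}") for i in range(400)]
    partitions = [relation[i::4] for i in range(4)]     # horizontal partitioning
    with ProcessPoolExecutor(max_workers=4) as pool:
        selected = [row for part in pool.map(select_partition, partitions) for row in part]
    print(len(selected), "tuples selected")             # 299 tuples with id > 100
```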

  44. Parallel Execution Plan (PEP) • The "scenario" of the query processing • Defines the role played by every processor and its interaction with the other ones • (Figure: plan model, cost model, search space and search heuristics feed the construction of the PEP.)

  45. Intrinsic limitations • Startup time • Contentions: • concurrent accesses to shared resources • sources of contention: architecture, data partitioning, communication management, execution plan • Load imbalance • response time = slowest process • hence the NEED to balance data, I/O, computations and communications (see the sketch below)
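
A tiny illustration (with invented per-process workloads) of why the response time is dictated by the slowest process:

```python
# Load imbalance: the response time of a parallel step is the time of the
# slowest process (work given in arbitrary time units).
balanced   = [25, 25, 25, 25]          # same data/IO/computation on each process
imbalanced = [10, 15, 20, 55]          # same total work, badly distributed

for name, work in (("balanced", balanced), ("imbalanced", imbalanced)):
    print(f"{name:10s}: total = {sum(work)}, response time = {max(work)}")
# Both distributions do 100 units of work, but the imbalanced one answers in 55
# time units instead of 25: the slowest process dictates the response time.
```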

  46. Parallel Execution Scenario ---> Operator processing ordering ---> degree of inter-operation parallelism ---> access method (e.g. indexed access) ---> characteristics of intermediate relations (e.g. cardinality) ---> degree of intra-operation parallelism ---> algorithm (e.g. hybrid hash join) ---> redistribution procedures ---> synchronizations ---> scheduling ---> mapping ---> control strategies

  47. Execution control • Precisely planning the processing of a query is impossible! • Only global and partial information is available, and many parameters are dynamic • It is therefore mandatory to control the execution and to dynamically re-optimize the plan: load balancing, re-parallelization

  48. End of the reminders…

  49. Levels of cooperation in a computing grid • End system (computer, disk, sensor…) • multithreading, local I/O • Cluster (heterogeneous) • synchronous communications, DSM, parallel I/O • parallel processing • Intranet • heterogeneity, distributed admin, distributed FS and databases • low supervision, resource discovery • high throughput • Internet • no control, collaborative systems, (international) WAN • brokers, negotiation
