Large-Scale Distributed Computing: Near-Zero Cost and Near-Infinite Capability

Presentation Transcript


  1. Large-Scale Distributed Computing: Near-Zero Cost and Near-Infinite Capability Dr. Alexandru Iosup

  2. Education: Do, Not Just Listen! • Problems with traditional education • I/O model does not teach well (listening is not enough) • I/O model reduces motivation (listening is boring) • I/O model dims initiative (are students encouraged to think?) • Partial solution: do, not just listen! • Test acquired knowledge while still fresh • It’s your work, and you are responsible for it • Allow students to exercise critical thinking abilities • Warning: an example follows during this presentation

  3. Research: Large-Scale Distributed Computing • Computing = useful information processing, according to some algorithm/program • Distributed Computing (DC) = multiple concerted programs (jobs) operating on multiple computers that communicate over a network • Large-Scale Distributed Computing (LSDC) = DC + computers spread around the world • Examples: the Internet, World of Warcraft, BitTorrent

  4. Research: LSDC: Grid Computing. Just plug into the computing grid and get your results • The Grid = integration of computers as a day-to-day computing utility, similar to phone, water, and information • Economy of scale: better service at lower cost • Large-scale reality: operational overhead, functionality (robustness + manageability), real heterogeneity • Primary users • E-Science: high-energy physics, earth sciences, bioinformatics • Industry: financial services, search engines (Google)

  5. Our Grid Research [1/2]: Understanding Grids. Much of grid resource management research is not connected to the real world • The Grid Workloads Archive • Data from major grids made public • 1,000s of users, 1,000,000s of jobs • >500,000 jobs/year/trace • Grids are unlike traditional computing systems such as clusters and supercomputers • Mostly bags of single-node jobs, few parallel jobs • Resource dynamics vs. evolution • Models for grid workloads and resources [Iosup et al. FGCS’08] gwa.ewi.tudelft.nl [Iosup et al. Grid’06-’07, CCGrid’07, EuroPar’07-’08, SC’07, HPDC’08-’09, …]

  6. Our Grid Research [2/2]: Testing and Optimizing Grids. Why do we need grid inter-operation, and how can we achieve it? • The GrenchMark testing framework • LSDCs: grids, P2P, resource pools • Workload generation & results analysis • Distributed testing • For grids, functionality is more important than performance: many failures (>25% common), incompatible middleware stacks, … • Grid inter-operation is beneficial • Delegated MatchMaking is a good inter-operation solution Experimental research [Iosup et al. CCGrid’06-’07, Grid’08] ACM SRC Award at SuperComputing’07 [Iosup et al. SC’07, SciProg’08, …] Best Paper Nomination at ACM SuperComputing’07

  7. Research: LSDC: Peer-to-Peer Computing • Peer-to-Peer (P2P) = network in which each entity has equal standing with the others [M-W, adapted] • P2P Computing = LSDC in which the system users are also resource providers (though not necessarily the only resource providers) • Near-zero cost: the service provider has few of the resources • Difficult to control system properties: users may disconnect, resource providers may be malicious, … • Primary users • P2P file-sharing communities: BitTorrent, eMule, …

  8. Our P2P Research: Understanding & Optimizing BitTorrent. Research based on real-world requirements & experimental research • The largest BitTorrent measurements to date • MultiProbe framework, large public data set • >400,000 unique peers, useful bandwidth doubled since 2004 • 2Fast: Collaborative Downloads • Main part in Tribler • multiprobe.ewi.tudelft.nl • Best Paper Award at IEEE P2P’06

  9. Past Research Summary: Unique Approach and Achievements • Grid and P2P research, Ph.D. in grids • Combined theoretical principles with experimental and engineer-like approaches • Produced real systems and data • Other achievements • Leading EU and national projects • Bibliometrics: 20+ articles, h-index 10, g-index 17, 300+ citations • 2 research awards, 2 other distinctions • Future: (1) Maintain and extend top publication level (ACM, IEEE journals); (2) New LSDC research topic

  10. Massively Multiplayer Online Games (MMOGs) are a Popular, Growing Market • 25,000,000 people now, 60,000,000 by 2012 • Over 150 commercial MMOGs in operation • Market size $7,500,000,000/year

  11. What is an MMOG? • Content: graphics, maps, puzzles, quests + • Virtual world simulation: explore, do, learn, socialize, compete Myth vs. Reality - Average player is 30 years old - 50% explore/socialize Romeo and Juliet

  12. Research Objective: Near-Zero Cost and Near-Infinite Capability for MMOGs The Content Problem: Generating content on time for millions of players - Player-customized: balanced, diverse - Reduce upfront costs The Platform Problem: Support millions of players inside a seamless world - Arbitrary workload variability - Latency sensitivity - Reduce upfront costs

  13. Today’s Technology The Content Problem - Human content designers - Upfront payment - Updates are rare - Not player-customized The Platform Problem - Large, dedicated multi-server infrastructures (1,000s of servers, 100s of locations: World of Warcraft, Runescape) - Upfront costs high, maintenance consumes ~40% of revenue - Spare capacity

  14. Our Vision: Content and Simulation, Together The Content Problem - Auto content generation - Economy of scale - Frequent updates - Player customization The Platform Problem - Users (peers) provide service, super-peers provide guaranteed service - Cloud (on-demand, paid, guaranteed) resources for excess load - Low upfront costs, efficient and scalable capacity

  15. Research Plan [NWO VENI 2009 proposal; final decision pending] First Steps… “Possible to generate player-customized content” Research Questions • How to generate interesting MMOG content in a scalable and efficient way? • What are the effective platforms for MMOGs, for the different classes of amateur and professional developers? • How to create a realistic model of the MMOG workloads? + MMOG Workloads Archive Scientific prototype: real open-source games such as BZFlag, FreeCiv, and NetHack (1,000,000+ users) [Iosup EuroPar’09] Research Award “On-Demand resources >> resource Ownership” [Nae et al. SC’08, TPDS’09]

  16. Practice: Meeting Content Demand Source: Daniel James, Metrics for a Brave New Whirled • I will act as replacement for the player community: at fixed intervals I will request content by clapping; one clap requests one content item. • You can create content items by writing your first name on the sheet of paper in front of you. You deliver one content item by crossing it out. • If I get multiple content items for a single request, I can select any of them, but they are all consumed. • If I do not get any item, you have failed to meet demand.

  17. Results: Meeting Content Demand Source: Daniel James, Metrics for a Brave New Whirled • Can generate content through distributed computing. • Redundant content generation may guarantee that enough content is available. • How efficient is this process? • What happens if the demand pattern changes? • What happens if the demand is not homogeneous?
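The exercise on slides 16 and 17 can be approximated with a small simulation. The following is a minimal sketch under stated assumptions, not part of the original talk: each of `producers` participants is assumed to deliver an item independently with probability `p_deliver` per request, and the names `simulate_demand`, `p_deliver`, and the efficiency measure (requests met per item consumed) are hypothetical choices made for illustration.

```python
import random


def simulate_demand(rounds, producers, p_deliver, seed=0):
    """Simulate the clapping exercise.

    Assumption (not from the slides): every producer answers each
    request independently with probability p_deliver.
    Returns (met_rate, efficiency), where
      met_rate   = fraction of requests that received at least one item
      efficiency = requests met / items consumed (1.0 means no waste)
    """
    rng = random.Random(seed)
    met = 0        # requests that received at least one item
    consumed = 0   # all delivered items are consumed, even duplicates
    for _ in range(rounds):
        delivered = sum(1 for _ in range(producers) if rng.random() < p_deliver)
        consumed += delivered
        if delivered > 0:
            met += 1
    met_rate = met / rounds
    efficiency = met / consumed if consumed else 0.0
    return met_rate, efficiency


if __name__ == "__main__":
    # More producers raise the chance of meeting demand but waste more items.
    for n in (1, 5, 20):
        rate, eff = simulate_demand(rounds=1000, producers=n, p_deliver=0.3)
        print(f"producers={n:2d} met_rate={rate:.2f} efficiency={eff:.2f}")
```

Varying `producers` illustrates the trade-off raised on slide 17: redundancy makes it more likely that every request is served, at the price of consuming items that nobody needed.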

  18. Additional MMOG Research Or “Towards meeting the 6 PhDs per staff FTE goal of the PDS group” • Creative Technologies (extends content generation) • Generating MMOG news • Broadcasting game/MMOG events • Building creative communities • MMOG Analytics (extends news generation) • Data Acquisition • Data Analysis • “Massivizing” Games • Automatic conversion from MOG to MMOG [Upcoming EU FP7 proposal] [Quantime EU FP7 proposal] [Upcoming EU FP7 proposal] [Iosup ROIA’09]

  19. Attainability and Creating Momentum: Why do I qualify? With whom will I collaborate? • External funding (* pending): VENI finalist, 240,000 EUR*; Quantime EU FP7, 10,000,000 EUR*; ICT Talent Grant + other personal grants, over 35,000 EUR; contributor to other EU and national projects, total over 20,000,000 EUR • Past commercial game R&D: 3 award-winning commercial games; Co-PI Microsoft grant (teaching); 4 articles on games since Sep 2008; research award on games (2009) • Industry: UbiSoft (#4 game dev world), Khaeon (#1 Dutch MMOG dev), TNO (NL), Microsoft, NVIDIA • Academia: VU Amsterdam (NL), Leiden University (NL), U. Wisconsin-Madison (USA), U. Innsbruck (AT), Politehnica U. of Bucharest (RO), Portland State U. (USA) • NWO RISCC and EU FP7.

  20. Education, Last But Not Least: My Approach • Problems with traditional education • I/O model does not teach well (listening is not enough) • I/O model reduces motivation (listening is boring) • I/O model dims initiative (are students encouraged to think?) • Do, not just listen! • A good balance between challenges and rewards in teaching can be achieved through the medium of games. • Familiar environment for students • Playing and knowing [Huizinga’55] • The student can be encouraged to reach maximum potential • Own experience while teaching in two countries

  21. Thank You! Dr. Alexandru Iosup http://www.pds.ewi.tudelft.nl/~iosup/

  22. Why the Industry CANNOT do This? (Why is This Research-Worthy?) • We will design a general MMOG workload model vs. a company studies only the game it operates • We will create an open-access MMOG workloads archive, for a diverse community • We will compare new scalable computing architectures for MMOGs vs. “whatever works” • We will open a new research area: content generation at large scale and on time

  23. Mitigating Risks for Content Generation • Main risk: How to generate interesting MMOG content? (Map, Puzzle, Quest) • Mitigation plan: Quantify “interesting”; Play-test (prototype) • Example: the Lunar Lockout puzzle

  24. MMOG Workloads Archive Motivation • “Unfortunately, I am not able to release any of the data. Sony still considers it proprietary, and we can't control that.” – Sony “releases” EverQuest data • “Unfortunately, CCP won't allow me to make their data publicly available for EVE Online.” – CCP “releases” EVE Online data Main goal • Easy to share MMOG workload traces and research associated with them vs. The Grid Workloads Archive • New application, new unified format • Adapt tools

  25. MMOG Workloads Archive Applications: Inter-disciplinary projects • Social Sciences • The emergence and performance of ad hoc groups in contemporary society • Emergent behavior in complex systems • Economy • Contemporary economic behavior • Psychology • Games as coping mechanism (minorities) • Games as cure (agoraphobia) • Biology • Disease spread models

  26. Collaboration • UbiSoft: Horia Pintilie, Programming Studio Technical Director • Khaeon: Erik ‘t Sas, CEO • Microsoft: Christos Gkantsidis, Researcher

  27. Alternatives to/for Grid Inter-Operation. Experimental research [Iosup et al. CCGrid’06, Grid’08] [Figure: architecture classes (Independent, Centralized, Hierarchical, Decentralized) with example systems OAR, Condor, Koala, Globus GRAM, Alien, Condor Flocking, OAR2, OurGrid, Moab/Torque, NWIRE, CCS; open questions marked on the figure: Load imbalance? Resource selection? Scale? Root ownership? Node failures? Accounting? Trust?]

  28. Grid Inter-Operation through Delegated MatchMaking (DMM) • Hybrid architecture: hierarchical + decentralized • DMM mechanism for using remote resources [Figure labels: Resource request; Local load too high; Bind remote resource; Delegate; Resource usage rights]
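The delegation idea on slide 28 (a site whose local load is too high obtains usage rights on a remote resource through a hierarchy of sites) might be sketched as follows. This is an illustrative toy under loud assumptions, not the DMM algorithm published in [Iosup et al., SC’07]: the `Site` class, its fields, and the search order (walk up the hierarchy, then look for any site with a free slot below each ancestor) are all hypothetical.

```python
class Site:
    """A grid site in a tree hierarchy (hypothetical model, for illustration)."""

    def __init__(self, name, capacity, parent=None):
        self.name = name
        self.free = capacity      # free resource slots at this site
        self.parent = parent
        self.children = []
        self.borrowed = []        # names of sites that delegated a slot to us
        if parent is not None:
            parent.children.append(self)

    def _find_lender(self, exclude):
        """Depth-first search of this subtree for a site with a free slot."""
        if self is not exclude and self.free > 0:
            return self
        for child in self.children:
            lender = child._find_lender(exclude)
            if lender is not None:
                return lender
        return None

    def submit(self):
        """Place one job; return the name of the site whose slot is used."""
        if self.free > 0:                      # enough local capacity
            self.free -= 1
            return self.name
        node = self.parent                     # local load too high: go up
        while node is not None:
            lender = node._find_lender(exclude=self)
            if lender is not None:
                lender.free -= 1               # lender delegates usage rights
                self.borrowed.append(lender.name)  # slot is bound to requester
                return lender.name
            node = node.parent
        return None                            # no capacity anywhere


if __name__ == "__main__":
    root = Site("root", capacity=0)
    a = Site("A", capacity=1, parent=root)
    b = Site("B", capacity=2, parent=root)
    print([a.submit() for _ in range(4)])
```

In this toy model the requester keeps a record of borrowed slots, which loosely mirrors binding a remote resource to the local site; a real system would also need to return the rights when the job finishes.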

  29. My Grid Research [2/2]: Delegated MatchMaking vs. Alternatives • Input: real and realistic (model) grid workloads • DMM • High goodput • Low wait time • Finishes all jobs • Even better for load imbalance between grids • Reasonable overhead • (1) Inter-operated grids deliver much better performance than independent grids, and (2) DMM = good grid inter-operation [Iosup et al., SC’07] [Figure: comparison of DMM, Decentralized, Centralized, and Independent; higher is better]

  30. Research: LSDC: Other Flavors • Volunteer/Desktop Computing • My collaboration with Technion: scheduling • On-Demand Resources for Simulation • My collaboration with U. Stuttgart: disaster prevention • My collaboration with TU Delft/TPM: serious gaming [Figure label: Superlink@Technion]
