Cloud Computing

Cloud Computing: Reshaping the Information Age • Roy H. Campbell, Sohaib and Sara Abbasi Professor, University of Illinois at Urbana-Champaign • http://srg.cs.uiuc.edu

Information is like water.

Information helps you quench your thirst for knowledge

Knowledge is Power

Knowledge is Power • Wikipedia: over 10 million articles • Google: over 20 billion pages • Facebook: over 63 million active users • Human DNA ("me"): over 3.3 billion base pairs

Google Web Search Trends & Hot New Items Legend: Cluster computing, Grid computing, Cloud computing • Nov 15 2007 – IBM Introduces 'Blue Cloud' Computing, CIO Today • Apr 14 2008 – Google and Salesforce.com in cloud computing deal, Siliconrepublic.com • Jun 27 2008 – Yahoo realigns to support cloud computing, 'core strategies', San Antonio Business Journal • Jul 8 2008 – Merrill Lynch Estimates "Cloud Computing" To Be $100 Billion Market, SYS-CON Media • Jul 23 2008 – Cloud Computing Firm Closes $1.5m Series A, SYS-CON Media

Quote “To paraphrase Sun Microsystems’ famous adage, in cloud computing the network becomes the supercomputer” Information Week, 2007

Harnessing Knowledge • The modern information economy is driven by transforming raw data into knowledge • Computers and the Internet catalyze this process • Current computational structures cannot keep pace with the growing body of information • New techniques are needed to stay at the forefront

The Era of Cloud Computing • Frees users from physical server resources • Moves computing resources into the network rather than keeping them at the edge • Allows flexible on-demand appropriation of resources and dynamic scaling with low latencies • New model treats the Internet as a medium

Cloud Computing Testbed (CCT) • Fertile soil to develop groundbreaking apps • Framework to explore the Cloud architecture • Highly reconfigurable virtual services that adapt to app needs • 1000s of cores running on 100s of nodes

Computing Substrate • Decentralized network of computing elements • Fast links interconnecting Cloud Clusters • Data stored in highly scalable and fault-tolerant data structures • Information is processed using Map/Reduce • Scalable data flow networks enable computational elements to work together seamlessly

University of Illinois Environment • National Center for Supercomputing Applications • Blue Waters (www.ncsa.uiuc.edu/BlueWaters) • NCSA cluster, Data provenance, Forensics • Universal Parallel Computing Research Center • Prof. Marc Snir (www.upcrc.illinois.edu)

University of Illinois Environment • Database and Information Systems Laboratory • Prof. Jiawei Han (http://dais.cs.uiuc.edu/) • Natural Language Processing • Prof. Dan Roth (http://l2r.cs.uiuc.edu/~cogcomp/) • Prof. Bruce Schatz (http://www.canis.uiuc.edu/) • Distributed Systems • Prof. Indranil Gupta (http://dprg.cs.uiuc.edu/) • Algorithms • Prof. Sariel Har-Peled (http://valis.cs.uiuc.edu/~sariel/)

Cloud Computing Center of Excellence • Collaborative effort to push the computation into the Cloud • Research fundamental structures, algorithms and applications to leverage the Cloud • Build a scalable test bed to explore design possibilities • Partners [logos not reproduced]

Collaboration Open Cloud Computing Testbed

Research Objectives • Foster a collaborative community within computer science that also spans several domains by: • sharing tools, lessons and best practices, • benchmarking the cloud with current applications, and • comparing alternative approaches to service management at datacenter scale. • Experiment with the design and provisioning of large-scale deployments of services • Research into dynamic resource management • Research into multi-datacenter / global services

Engineering Challenges • Clouds: • Dynamically allocate the amount of resources as their loads and needs change • Need to be almost completely automated • Support multiple users simultaneously • Geographically-dispersed data centers: • Disconnected operation • Network latency • Entire site failure

CloudOS

Quote “Now software companies are making entire Web-based operating systems.” Erica Naone

CloudOS • Research Questions • Structure • Services • Issues • Software Paradigm • MapReduce Overview • Use Cases

Research Questions • Boundary between User and CloudOS space • Increase scalability • Node count • Data throughput • Increase performance • turnaround time for computation • response time for users • responsiveness to changes in load • resource availability • Maximize resource utilization

Research Questions • Increase dependability of infrastructure, data and computations by: • Increasing service uptime • Guaranteeing security and integrity of execution • Guaranteeing privacy to the users and applications while enabling limited data sharing between services

Research Questions • Design scalable algorithms and protocols for data-intensive processing in clusters • Handling multi-site clouds • data distribution • access control • Migrating live computations and queries • Limits of Cloud Computing • New application development possibilities • Among many other research questions…

Structure Our vision of CloudOS consists of: • Front-End User and Application management • Storage capability • Reliability management subsystem, and • Several core services running over a hardware cluster and a network such as: • Quotas, • Permanent storage, and • Account management.

Services • Virtualization of Services • Computational paradigms (Hadoop) • Storage (MySQL, HBase), etc. • Highly-Scalable File System for Data-Intensive Computing • Dynamic resource scheduling, monitoring and management • Security for users and data access • Event notification and pub-sub mechanisms • Support for debugging and profiling

Issues • Short-term Research Topics: • Staging in input data • Automatic backup and checkpointing • Addressing speed, response time, trustworthiness and dependability, as well as privacy of data • Storage quotas • Different schemes for pricing and accounting • Increasing availability of the cluster • Scheduling computation with large amounts of data • Long-term Questions: • Workload characteristics • Cloud management challenges

Software Paradigm • Large amounts of data • Structured: databases • Unstructured: need an efficient way to store and process in parallel • MapReduce • Useful paradigm • Abstracts the whole cloud to the programmer

MapReduce Overview [Diagram not reproduced; surviving labels: Reduce, Results]
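The MapReduce paradigm named on the slides above can be illustrated with a minimal single-process sketch. This is not Hadoop code: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the word-count task are assumptions chosen for illustration, and a real framework would run the map and reduce calls on many nodes in parallel.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Emit a (word, 1) pair for every word in one input document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group emitted values by key, as a framework does between the phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values emitted for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence over a list of documents."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


counts = word_count(["the cloud stores data", "the cloud processes data"])
```

The programmer writes only `map_phase` and `reduce_phase`; distribution, grouping, and fault handling are what the paradigm abstracts away.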

Use Cases • Querying distributed data • Log Management Engines • Crawlers within the cloud • Clouds and their Uses • New Benchmarks

Querying Distributed Data • Data in the cloud is distributed by definition • The cloud should allow expressive, fast, and accurate queries • The cloud should be able to query various data formats including: • Unstructured data (e.g. application logs) • Structured data (e.g. MIBs) • Read-write data (e.g. key-value stores)
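One common way to query data that is spread across nodes is scatter-gather: send the same query to every shard, filter locally, and merge the partial results. The sketch below assumes this pattern; it is not taken from the slides, and the `SHARDS` data, `query_shard`, and `scatter_gather` are hypothetical names, with threads standing in for remote nodes.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical shards: each list stands in for the log data held by one node.
SHARDS = [
    ["INFO start", "ERROR disk full", "INFO done"],
    ["WARN slow link", "ERROR timeout"],
    ["INFO heartbeat"],
]


def query_shard(shard, predicate):
    """Run the query locally on one shard and return only matching records."""
    return [record for record in shard if predicate(record)]


def scatter_gather(shards, predicate):
    """Send the query to every shard in parallel, then merge partial results."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = pool.map(lambda shard: query_shard(shard, predicate), shards)
        return sorted(record for partial in partials for record in partial)


errors = scatter_gather(SHARDS, lambda record: record.startswith("ERROR"))
```

Filtering on each shard before merging keeps network traffic proportional to the result size rather than to the data size.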

Log Management Engines • Need to create, store, manage, and query log data for: • Applications • System resource utilization • Network tracing, etc. • Done in a distributed manner and in real time • Applications such as: • Online monitoring and fault diagnosis of applications • Intrusion detection

Crawlers within the Cloud • They generate, store, and process data offline • Research questions include: • How to concurrently run crawlers and their data processing engines? • What does the interface look like? • How to pipeline data between these phases? • Novel applications such as: • Real-time visualization of social networks • Real-time community tracking of app groups

Clouds and their Uses • Research questions include: • How to make clusters and their parallel application schemes (e.g. MapReduce) more cloud-aware? • What are the best network and hardware configurations for Hadoop-style workloads?

Clouds and their Uses • Research questions include: • How to make data-intensive programming models (e.g. Hadoop) more aware of, and better suited to, a particular cloud configuration? • How to address network locality, app data locality, and security (including authorization, authentication and auditing)?

New Benchmarks • Need to define benchmarks for: • Data-intensive computing paradigms • CloudOS research • Application research • File system • Key-value store • Hadoop sort does not tax the map or the reduce phases at all.
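The remark about Hadoop sort can be made concrete: in that benchmark the map and reduce functions simply pass records through, so the ordering work falls to the framework's shuffle and sort step. The toy single-process sketch below imitates that structure; the names `identity_map`, `identity_reduce`, and `run_sort_job` are illustrative assumptions, not Hadoop APIs.

```python
def identity_map(key, value):
    """Pass each input record through unchanged."""
    yield key, value


def identity_reduce(key, values):
    """Emit every value for a key unchanged."""
    for value in values:
        yield key, value


def run_sort_job(records):
    """Toy job: the only real work is the sort between map and reduce."""
    mapped = [pair for key, value in records for pair in identity_map(key, value)]
    mapped.sort(key=lambda pair: pair[0])  # stand-in for the shuffle/sort step
    output = []
    for key, value in mapped:
        output.extend(identity_reduce(key, [value]))
    return output


result = run_sort_job([(3, "c"), (1, "a"), (2, "b")])
```

A benchmark shaped like this measures I/O and shuffle throughput, which is why additional benchmarks are needed to exercise user-written map and reduce code.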

The U of I approach

Approach • Exclusive allocation of a set of resources • Isolated inside a VLAN - a Physical Resource Set (PRS) • NO virtualization • NO programming platforms • NO required software components • NO software (you supply your own!).

Approach • PRS allows low-level service researchers to experiment: • How many of these can co-exist? • Can run multiple open-source "EC2-lookalike" services (e.g. UCSB's Eucalyptus and Intel's Tashi) • Clients can choose which they prefer • Service providers can offer new service types such as: secure, isolated, virtual resource sets (VRSs)
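The Physical Resource Set idea described on the two Approach slides (exclusive allocation of nodes, isolated inside a VLAN) can be sketched as a small bookkeeping class. Everything here is hypothetical: the class name `PRSAllocator`, the node naming, and the VLAN numbering are assumptions for illustration, and no real switch or node is configured.

```python
class PRSAllocator:
    """Toy allocator: hands out nodes exclusively and tags each set with a VLAN."""

    def __init__(self, node_names, first_vlan_id=100):
        self.free_nodes = list(node_names)
        self.next_vlan_id = first_vlan_id  # hypothetical VLAN numbering scheme
        self.allocations = {}              # vlan_id -> list of node names

    def allocate(self, count):
        """Reserve `count` free nodes for one PRS; no node is ever shared."""
        if count > len(self.free_nodes):
            raise RuntimeError("not enough free nodes")
        nodes = [self.free_nodes.pop(0) for _ in range(count)]
        vlan_id = self.next_vlan_id
        self.next_vlan_id += 1
        self.allocations[vlan_id] = nodes
        return vlan_id, nodes

    def release(self, vlan_id):
        """Return a PRS's nodes to the free pool."""
        self.free_nodes.extend(self.allocations.pop(vlan_id))


allocator = PRSAllocator([f"node{i:03d}" for i in range(128)])
vlan_a, nodes_a = allocator.allocate(32)
vlan_b, nodes_b = allocator.allocate(32)
```

Because the allocator supplies only bare nodes and a VLAN tag, each researcher remains free to install any software stack, including an EC2-lookalike service, inside their own set.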

Existing Cluster Management / Provisioning Systems • Emulab • PlanetLab • Perceus/Warewulf • xCAT • Rocks • NICL • Orca • oVirt and libvirt • OpenNebula

Open source software supporting Amazon EC2 • Eucalyptus from UC Santa Barbara • Tashi proposed as an Apache software project

Related Work • Emulab: PRS-like system that manages nodes, networking, and VLANs • PlanetLab: Important papers are: • Experiences Building PlanetLab. Larry Peterson, Andy Bavier, Marc Fiuczynski, and Steve Muir. OSDI'06, November 2006. • Everlab - A production platform for research in network experimentation and computation. Elliot Jaffe, Danny Bickson and Scott Kirkpatrick. 21st Large Installation System Administration Conference (LISA'07), November 2007. • Lessons from Resource Allocators for Large-Scale Multiuser Testbeds. Robert Ricci, David Oppenheimer, Jay Lepreau, and Amin Vahdat. (OSR'06) • Perceus/Warewulf: Expected to be the next generation of enterprise and cluster provisioning toolkits.

University of Illinois Configuration • Configuration: • 1024 cores (256 processors) • 128 nodes (dual-processor nodes) • RAM: 16 GB/node • Disk: 2 TB/node • 2 network ports/node for the data network • 1 network port/node for the management network • High-efficiency power supplies • Backing store starting at 144 TB • 10 Gb interconnects with security firmware • Constraints: • Power: < 352 amps, 240 V, 3 phase • Cooling: < 18 tons • Objective: • To obtain the best-performing system for our research within the constraints of budget, power, and cooling
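The configuration figures above can be cross-checked with a few lines of arithmetic. The sketch assumes that the 2 TB/node disks are local storage separate from the 144 TB backing store, and it uses the standard conversion of one ton of refrigeration to roughly 3.517 kW; the variable names are illustrative.

```python
NODES = 128
PROCESSORS_PER_NODE = 2
TOTAL_CORES = 1024
RAM_GB_PER_NODE = 16
DISK_TB_PER_NODE = 2
COOLING_TONS = 18
KW_PER_TON = 3.517  # 1 ton of refrigeration = 12,000 BTU/h, about 3.517 kW

processors = NODES * PROCESSORS_PER_NODE          # 256, matching the slide
cores_per_processor = TOTAL_CORES // processors   # 4, i.e. quad-core processors
total_ram_gb = NODES * RAM_GB_PER_NODE            # 2048 GB across the cluster
local_disk_tb = NODES * DISK_TB_PER_NODE          # 256 TB of node-local disk
cooling_kw = COOLING_TONS * KW_PER_TON            # about 63.3 kW of heat removal
```

The totals agree with the slide's 256 processors and imply quad-core processors, about 2 TB of aggregate RAM, and 256 TB of node-local disk.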

Configuration Diagram [not reproduced; surviving labels: four groups of 32 DL160 nodes, each with an internal switch; link-up switches; a 10G switch; storage nodes totaling > 144 TB]

Applications

Quote “Cloud computing will follow you everywhere” Linda Tucci, SearchCIO.com

University of Illinois Application Research • Machine Learning • Natural Language Processing • Text Processing: social networking for healthcare • Searching a Population of Wireless Devices • Processing Geographic Information

Semantic processing of informal text - Prof. Bruce Schatz • Finding similar documents to answer queries • Data parallelization does not solve the text processing problem • New systems approach needed to enable loosely coupled clusters to behave like a tightly coupled cluster, for certain application patterns.

Data Mining - Prof. Jiawei Han • Use data mining techniques to analyze massive amounts of information and social networks available on the Internet • Requires expensive large matrix computation as well as costly clustering, ranking, correlation, and classification processes • Parallel data mining algorithms to facilitate fast computation and data mining • Test on several large real datasets, including DBLP, PubMed, and some social networks

Natural Language Processing and Information Extraction - Prof. Dan Roth • Majority of text is unstructured • Semantics must move beyond word level • Analyze huge amounts of data to obtain knowledge to support semantic inference • Develop novel techniques for parallelizing natural language processing • Optimization based on machine learning
