Learn about cloud computing models, providers like Windows Azure, Google App Engine, Rackspace, and Amazon AWS, programming models, and practical considerations for cloud development.
Cloud Computing Development • Shallow Introduction
What is cloud computing? • Is it computing while in flight? • No. Image courtesy of SevensHeaven.nl
What is cloud computing? • What is it about then? • Cloud computing is the consumption of computing resources without worrying about the specifics. • It also includes the ability to add or remove resources according to demand. • Similar to the power grid and the telephone network.
How does it work? • Consumer signs up for the service. (Same as if you get a mobile phone plan) • Consumer uses services according to their needs • Provider sends the bill at the end of the cycle • Consumer pays
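The billing cycle above can be sketched as simple metered arithmetic. This is only an illustration: the resource names and per-unit prices below are hypothetical and are not taken from any provider's price list.

```python
from decimal import Decimal

# Hypothetical per-unit prices, invented for illustration only.
RATES = {
    "instance_hours": Decimal("0.10"),     # price per instance-hour
    "storage_gb_months": Decimal("0.15"),  # price per GB stored per month
}

def monthly_bill(usage, rates=RATES):
    """Return the total charge: sum of (units used * unit price)."""
    return sum((rates[name] * units for name, units in usage.items()), Decimal("0"))

# One instance running all month (720 hours) plus 50 GB of storage.
bill = monthly_bill({"instance_hours": 720, "storage_gb_months": 50})
```

The consumer pays only for what was metered during the cycle; using half the hours would halve that part of the bill.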
Provider: Windows Azure • Platform as a service • Windows based • Storage provided through blob storage, drives, SQL Azure • State is stored and propagated with Queues and Tables • Integrated with Visual Studio • Eclipse plug-in for PHP
Provider: Google App Engine • Platform as a service • Python or Java based • Storage provided through BigTable • Automatically scales web nodes
Provider: Rackspace • Infrastructure as a service • Very basic: just a few Linux or Windows images • Provides storage with CloudFiles • Very cheap • Open source API • Relatively new
Provider: Amazon AWS • Oldest on the market • Many services / Images / Third party providers • Provides computation through EC2 / EMR • Provides state / storage through S3, SQS, RDS, SimpleDB • Multiple APIs
Practical Considerations • Cloud development is slightly different from the traditional in-house model. • Everything is virtualized (most of the time) • Everything is distributed • Per-instance reliability is much lower • Overall reliability is much higher
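One way to see how lower per-instance reliability can still give higher overall reliability is a back-of-the-envelope calculation. The sketch below assumes instance failures are independent, which is an idealization, and the 0.9 figure is a made-up example rather than a measured value.

```python
def group_availability(p_instance_up: float, n_instances: int) -> float:
    """Probability that at least one of n independent instances is up."""
    p_all_down = (1.0 - p_instance_up) ** n_instances
    return 1.0 - p_all_down

# Hypothetical numbers: each instance is up only 90% of the time,
# but the service stays available as long as any one of three is up.
single = group_availability(0.9, 1)
triple = group_availability(0.9, 3)
```

Under these assumptions, three unreliable instances together are available far more often than any one of them alone.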
Cloud Programming Model • Compute and Interface nodes are not reliable, they can crash and disappear at any time. • Storage and State are reliable and heavily distributed. • At any time we can start more compute or interface nodes and shut them down when demand subsides.
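The last point, starting and stopping nodes as demand changes, might look roughly like the rule below. It is a sketch only: the function name, the per-node capacity, and the node limits are assumptions chosen for illustration, not part of any provider's API.

```python
import math

def desired_nodes(pending_jobs: int, jobs_per_node: int,
                  min_nodes: int = 1, max_nodes: int = 20) -> int:
    """Pick a compute node count that covers the backlog, within fixed limits."""
    needed = math.ceil(pending_jobs / jobs_per_node)
    # Never go below the floor (keeps the service alive) or above the budget cap.
    return max(min_nodes, min(max_nodes, needed))

# A backlog of 95 jobs with nodes that handle 10 jobs each needs 10 nodes.
nodes_now = desired_nodes(95, 10)
```

Because state lives in reliable storage and not on the nodes themselves, shutting the extra nodes down again when the backlog drains loses nothing.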
Cloud Programming Model on Azure • Compute: Worker Nodes • State: Tables / Queues / SQL • Storage: SQL / Tables / Blobs / Drives • Client Interface: Web Nodes
Cloud Programming Model on AWS • Compute: EC2 Instances • State: S3 / Queues / SimpleDB / RDS • Storage: S3 / SimpleDB / RDS • Client Interface: S3 / EC2 / CloudFront
AWS Details: S3 • S3 = Simple Storage Service • Guaranteed to be reliable • Simple {Key, Value} storage • Keys are stored within buckets • Values can be as large as 5 GB • Default storage mechanism for AWS
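The bucket and key model can be pictured with a toy in-memory store. This is not the S3 API: `ToyObjectStore` and its methods are hypothetical names used only to show the shape of the data model described on the slide.

```python
MAX_VALUE_BYTES = 5 * 1024 ** 3  # the 5 GB per-value limit quoted on the slide

class ToyObjectStore:
    """In-memory stand-in for bucket -> key -> value storage (illustrative only)."""

    def __init__(self):
        self._buckets = {}

    def create_bucket(self, name):
        self._buckets.setdefault(name, {})

    def put(self, bucket, key, value: bytes):
        if len(value) > MAX_VALUE_BYTES:
            raise ValueError("value exceeds the per-object size limit")
        self._buckets[bucket][key] = value

    def get(self, bucket, key) -> bytes:
        return self._buckets[bucket][key]

    def list_keys(self, bucket, prefix=""):
        # Keys are flat strings; a shared prefix merely looks like a folder.
        return sorted(k for k in self._buckets[bucket] if k.startswith(prefix))

store = ToyObjectStore()
store.create_bucket("logs")
store.put("logs", "2010/01/a.txt", b"hello")
store.put("logs", "2010/02/b.txt", b"world")
```

There are no directories or schemas here, only opaque values addressed by bucket name and key.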
AWS Details: SimpleDB • Schemaless database • Main storage unit is the domain (similar to a table) • Each record can have many attributes; new attributes can be added at any time • Similar to LISP / Scheme attributes • Can query a domain for records containing a particular attribute • No joins / unions with other domains • Eventual consistency
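A schemaless domain can be imagined as a dictionary of records where each record carries its own set of attributes. The class below is a toy model, not the SimpleDB API; `ToyDomain`, `put_attributes`, and `items_with` are hypothetical names.

```python
class ToyDomain:
    """In-memory sketch of a schemaless domain (illustrative only)."""

    def __init__(self):
        self._items = {}  # item name -> {attribute name: set of values}

    def put_attributes(self, item, **attrs):
        record = self._items.setdefault(item, {})
        for name, value in attrs.items():
            # No schema: any record may gain any attribute at any time.
            record.setdefault(name, set()).add(value)

    def items_with(self, attribute):
        """Names of all records that contain the given attribute."""
        return sorted(name for name, rec in self._items.items() if attribute in rec)

books = ToyDomain()
books.put_attributes("item1", title="Dune", year="1965")
books.put_attributes("item2", title="Neuromancer")
books.put_attributes("item2", award="Hugo")  # new attribute added later
```

Querying stays inside one domain, which matches the slide's point that there are no joins or unions across domains.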
AWS Details: RDS • RDS = Relational Database Service • MySQL in a cluster mode • Preferred to simply running a DB server within an instance (ask me why for details)
AWS Details: SQS • SQS = Simple Queue Service • Massively scalable • Lets you put a message in the queue and retrieve it later • Retrieving a message hides it from other consumers • When a message has been processed, it is deleted from the queue • If a message is not deleted before the timeout, it is returned to the queue
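The hide, delete, and timeout behavior described above can be simulated in a few lines. This is a toy model with an explicit clock, not the SQS API; `ToyQueue`, its method names, and the 30-second timeout are assumptions made for the example.

```python
import itertools

class ToyQueue:
    """In-memory sketch of a queue with a visibility timeout (illustrative only)."""

    def __init__(self, visibility_timeout=30):
        self.visibility_timeout = visibility_timeout
        self._messages = {}           # message id -> [body, time it becomes visible]
        self._ids = itertools.count(1)

    def send(self, body):
        msg_id = next(self._ids)
        self._messages[msg_id] = [body, 0]
        return msg_id

    def receive(self, now):
        """Return (id, body) of one visible message and hide it, or None."""
        for msg_id, entry in self._messages.items():
            body, visible_at = entry
            if visible_at <= now:
                entry[1] = now + self.visibility_timeout  # hide from other consumers
                return msg_id, body
        return None

    def delete(self, msg_id):
        """Called by the consumer once processing has succeeded."""
        self._messages.pop(msg_id, None)

queue = ToyQueue(visibility_timeout=30)
job = queue.send("resize image 42")
first = queue.receive(now=0)    # worker A takes the job
hidden = queue.receive(now=10)  # worker B sees nothing while it is hidden
retry = queue.receive(now=30)   # worker A crashed; the job reappears
queue.delete(job)               # worker B finishes and deletes it
```

This is what makes unreliable compute nodes tolerable: a worker that dies mid-job never deletes the message, so another worker picks it up after the timeout.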
AWS Details: EC2 • EC2 = Elastic Compute Cloud • Lets you run arbitrary virtual machines, provided they are compatible with Amazon's modified Xen • Kernels and startup disks are stored in S3 • Instances also have large local storage • Machines are not exactly like physical machines • Local storage is not persistent: when a machine is shut down, all local data disappears • Hardware TCP [No packet layer / No broadcast] • Can launch many copies of a machine at the same time • Lots of preconfigured machines
AWS Details: Other Services • EMR = Elastic MapReduce: lets you run Hadoop jobs on EC2 • CloudFront = Content Delivery Network • ELB = Elastic Load Balancer • EBS = Elastic Block Store: S3-backed persistent storage • Public Data Sets: lots of publicly available data, such as Census (1980, 1990, 2000), Wikipedia logs, Freebase dumps, genetic and chemistry data
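For readers who have not seen the kind of job Hadoop runs, the map, shuffle, and reduce steps can be sketched in plain Python. This runs in a single process and uses no Hadoop or EMR API; the function names are illustrative.

```python
from collections import defaultdict

def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1

def shuffle(pairs):
    """Group all emitted values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(word, counts):
    """Combine all values for one key into a single result."""
    return word, sum(counts)

def word_count(lines):
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(word, counts) for word, counts in shuffle(pairs).items())

counts = word_count(["the cloud", "The grid"])
```

In a real cluster the map and reduce calls run on many machines in parallel, which is why the model fits instances that can be started and stopped on demand.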
Starting Up • Amazon account • Credentials: KeyID / SecretKey • X.509 certificate
Helpful Tools • S3Fox - Firefox extension for browsing S3 • Elasticfox - Firefox extension for operating EC2 • Transmit - Mac utility for S3 ($) • RightScale - Web-based platform for managing everything (Free / $)
Libraries • Official Amazon Libraries (Java) • Unofficial Libraries - .Net / Ruby / Perl • AWS4C - C/C++/Objective C • Boto - Very popular Python library (official Hadoop/EC2 library)
Demo Running Hadoop on EC2
Questions ????