NETE4631 Managing the Cloud and Capacity Planning
NETE4631 Managing the Cloud and Capacity Planning: Lecture Notes #8
Lecture Outline • Managing the cloud • Administrating the cloud • Managing responsibilities • Lifecycle management • Emerging cloud management standards • Capacity Planning • Steps for capacity planner • Scenario • Load testing • Resource ceiling • Scaling
Administrating the Cloud • Network management systems are often described in terms of FCAPS (ISO) • Fault / Configuration / Accounting / Performance / Security • Fundamental features • Administering, configuring, and provisioning resources; enforcing security policy; monitoring operations; optimizing performance; policy management; performance maintenance; etc.
Administrating the Cloud (2) • Network management framework tools • BMC ProactiveNet Performance Management • HP OpenView/ HP manager products • IBM Tivoli Service Automation Manager • CA (Computer Associates) Unicenter • Microsoft System Center
Management Responsibilities • What is different from traditional network management? • Cloud characteristics • Billing is on a pay-as-you-go basis. • The management service is extremely scalable. • The management service is ubiquitous. • Communication between the cloud and other systems uses cloud networking standards. • The type of cloud determines which monitoring tools apply • Degree of control over operations: IaaS > PaaS > SaaS
What Should Be Monitored in the Cloud? • End-user services such as HTTP, TCP, POP3/SMTP, etc. • Browser performance on the client • Application monitoring in the cloud, such as Apache, MySQL, and so on • Cloud infrastructure monitoring of services such as Amazon Web Services • Machine instance monitoring, where the service measures processor utilization, memory usage, disk consumption, queue lengths, etc.
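Machine instance monitoring can be sketched as a comparison of sampled metrics against alert thresholds. The example below is a minimal illustration only: the `InstanceSample` type, the threshold values, and the sample readings are all assumptions made for this sketch, not values from the lecture or from any monitoring product.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class InstanceSample:
    """One reading of the machine-instance metrics listed on the slide."""
    cpu_pct: float    # processor utilization (%)
    mem_pct: float    # memory usage (%)
    disk_pct: float   # disk consumption (%)
    queue_len: int    # run-queue or request-queue length


# Hypothetical alert thresholds; real limits depend on the workload.
THRESHOLDS = {"cpu_pct": 85.0, "mem_pct": 90.0, "disk_pct": 80.0, "queue_len": 10}


def breached(sample: InstanceSample) -> List[str]:
    """Return the names of all metrics at or above their threshold."""
    return [name for name, limit in THRESHOLDS.items()
            if getattr(sample, name) >= limit]


# Hypothetical reading: CPU and disk are at or over their limits.
sample = InstanceSample(cpu_pct=91.5, mem_pct=62.0, disk_pct=80.0, queue_len=3)
alerts = breached(sample)
print(alerts)
```

A monitoring service would take such samples periodically and raise an alert whenever the returned list is non-empty.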
Lifecycle Management • Six stages in the lifecycle • Definition of the service as a template for creating instances • Client interactions with the service, usually through an SLA (Service Level Agreement) • Deployment of an instance to the cloud and runtime management of instances • Definition of the attributes of the service while in operation, and modification of its properties • Management of the operation of instances and routine maintenance • Retirement of the service
Cloud Management Products • Very young industry • List of products: see Chapter 11 of the course book • Core management features • Support for different cloud types • Creation and provisioning of different types of cloud resources such as machine instances, storage, or staged applications • Performance reporting, including availability and uptime, response time, and resource quota usage • Creation of dashboards that can be customized for a particular client’s needs
Example: Cloudkick • www.cloudkick.com
Emerging Cloud Management Standards • Distributed Management Task Force (DMTF) • An industry organization that develops system management standards for platform interoperability • Created a working group to help develop interoperability standards for managing transactions between and within public, private, and hybrid cloud systems • The standards describe resource management and security protocols, packaging methods, and network management technologies.
Emerging Cloud Management Standards (2) • Cloud Commons • Initiated by CA and donated to the Software Engineering Institute (SEI) at CMU, USA • Establishes cloud-based metrics for • File creation and deletion / email availability / console response time / storage and database benchmarks • Uses a dashboard called CloudSensor to monitor cloud-based services in real time
Capacity Planning • Capacity planning • Matches demand to available resources • Identifies critical resources that have a resource ceiling and adds more resources to remove the bottleneck under higher demand • Does not focus on performance tuning or optimization
Steps for Capacity Planner • Iterative process with the following steps • Examine what systems are in place (characteristics) • Measure their workload for the different resources in the system: CPU, RAM, disk, network, and so forth • Load the system until it is overloaded, determine when it breaks, and specify what is required to maintain acceptable performance and what factors are responsible for the failure (resource ceiling) • Determine the usage pattern and predict future demand • Add or tear down resources to meet demand
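The last two steps (predicting demand, then adding or tearing down resources) can be sketched numerically. The example below is only an illustration under stated assumptions: demand is assumed to grow along a straight line, the per-server ceiling is assumed to come from an earlier load test, and the weekly peaks, the 200 requests/s ceiling, and the 25% headroom are hypothetical numbers, not figures from the lecture.

```python
import math


def forecast_linear(history, periods_ahead):
    """Least-squares straight-line fit through equally spaced observations,
    extrapolated `periods_ahead` periods past the last one."""
    n = len(history)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(history) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, history))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return intercept + slope * (n - 1 + periods_ahead)


def servers_needed(peak_demand, per_server_ceiling, headroom=0.25):
    """Servers required so each one stays below its resource ceiling,
    keeping a fraction `headroom` of that ceiling in reserve."""
    usable = per_server_ceiling * (1.0 - headroom)
    return math.ceil(peak_demand / usable)


# Hypothetical measured weekly peaks (requests/s) and load-test ceiling.
weekly_peak = [400, 450, 500, 550]
predicted = forecast_linear(weekly_peak, periods_ahead=4)
plan = servers_needed(predicted, per_server_ceiling=200)
print(predicted, plan)
```

Because the process is iterative, the forecast and the ceiling would be re-measured after each change in resources rather than computed once.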
Scenario • Example (LAMP) • The capacity planner works with a system that serves a website on Apache • The site also processes database transactions (MySQL) • Application-level metrics • Page views (hits/s) • Transactions (trans/s)
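As a rough illustration of the page-view metric, hits per second can be counted from web server access-log timestamps. The sketch below assumes log lines whose timestamp appears in square brackets with one-second resolution, as in Apache's Common Log Format; the sample lines and the helper name `hits_per_second` are made up for the example.

```python
import re
from collections import Counter

# Timestamp field in square brackets, e.g. [10/Oct/2011:13:55:36 +0700]
TIMESTAMP = re.compile(r"\[([^\]]+)\]")


def hits_per_second(log_lines):
    """Count requests per one-second timestamp found in the log lines."""
    counts = Counter()
    for line in log_lines:
        match = TIMESTAMP.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical access-log excerpt.
sample_log = [
    '10.0.0.1 - - [10/Oct/2011:13:55:36 +0700] "GET /index.php HTTP/1.1" 200 2326',
    '10.0.0.2 - - [10/Oct/2011:13:55:36 +0700] "GET /cart.php HTTP/1.1" 200 512',
    '10.0.0.1 - - [10/Oct/2011:13:55:37 +0700] "GET /index.php HTTP/1.1" 200 2326',
]
per_second = hits_per_second(sample_log)
peak = max(per_second.values())
print(peak)
```

The peak value, rather than the average, is usually the number of interest when matching demand to capacity.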
Scenario (2) • System-level metrics • What each system is capable of • How the resources of such a system affect system-level performance • Example • A machine instance (physical or virtual) • CPU • Memory (RAM) • Disk • Network connectivity • Measured by tools such as the sar command and RRDtool on Linux, or Microsoft Task Manager on Windows
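To illustrate how a system-level metric might be derived from tool output, the sketch below turns CPU report rows in the style of `sar -u` into utilization figures. It assumes the last column of each `all` row is `%idle`; the exact column layout varies between sysstat versions and locales, and the sample text is hypothetical.

```python
def cpu_utilization(sar_text):
    """Return CPU busy percentages (100 - %idle) for each per-interval
    'all' row, skipping the trailing 'Average' summary row."""
    busy = []
    for line in sar_text.splitlines():
        fields = line.split()
        if "all" in fields and not line.startswith("Average"):
            # Assumption: the last column of the row is %idle.
            busy.append(round(100.0 - float(fields[-1]), 2))
    return busy


# Hypothetical output in the style of `sar -u`.
SAMPLE = """\
12:00:01     CPU     %user     %nice   %system   %iowait    %steal     %idle
12:10:01     all      5.12      0.00      1.20      0.30      0.00     93.38
12:20:01     all     48.70      0.00     10.10      1.20      0.00     40.00
Average:     all     26.91      0.00      5.65      0.75      0.00     66.69
"""
busy = cpu_utilization(SAMPLE)
print(busy)
```

The same approach applies to memory, disk, and network reports, each with its own columns.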
Load Testing • Load testing seeks to answer the following questions: • What is the maximum load that my current system can support? • Which resources represent the bottleneck that limits the current system’s performance? (resource ceiling) • Can I alter the configuration of my server in order to increase capacity? • How does this server’s performance relate to that of my other servers, which might have different characteristics? • Tools • httperf, Siege, Autobench, IBM Rational Performance Tester, HP LoadRunner, JMeter, OpenSTA
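The first question (maximum supportable load) can be answered from load-test results once an acceptance criterion is fixed. The sketch below assumes results have already been collected by some load-testing tool as (offered load, achieved throughput, latency) tuples; the 200 ms latency limit, the 95% throughput criterion, and the numbers themselves are hypothetical.

```python
def max_supported_load(results, max_latency_ms, min_served_ratio=0.95):
    """Return the highest offered load (requests/s) that the system still
    handles acceptably, or None if even the first step fails.

    `results` holds (offered_rps, achieved_rps, latency_ms) tuples sorted
    by increasing offered load.
    """
    supported = None
    for offered, achieved, latency in results:
        too_slow = latency > max_latency_ms
        saturated = achieved < min_served_ratio * offered
        if too_slow or saturated:
            break  # resource ceiling reached at this step
        supported = offered
    return supported


# Hypothetical load-test run: throughput flattens around 350 requests/s.
run = [
    (100, 100, 40),
    (200, 199, 55),
    (300, 297, 90),
    (400, 352, 310),
    (500, 355, 900),
]
ceiling = max_supported_load(run, max_latency_ms=200)
print(ceiling)
```

Identifying which resource causes the ceiling still requires the system-level metrics (CPU, RAM, disk, network) recorded during the same run.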
Network Capacity • Three aspects to assessing network capacity • Network traffic to and from the network interface at the server (physical or virtual) • Measured with system utilities (I/O) and a network monitor (traffic) • Network traffic from the cloud to the network interface • Tools such as those from Apparent Networks • Network traffic from the cloud through your ISP to your local network interface • The connection from the backbone to your computer (through the ISP)
Scaling • Scale vertically (scale up) • Add resources to a system to make it more powerful • A larger virtual system can run more virtual machines (operating system instances), with more RAM and faster compute times • Example: rendering or memory-limited apps • Scale horizontally (scale out) • Add more nodes to remove an I/O bottleneck • Easy to pool resources and partition • Example: web server apps
Scaling Comparison • Cost • Scaling up costs more than scaling out. • Maintenance • Scaling out increases the number of systems you must manage. • Communication • Scaling out increases the amount of communication between systems. • Scaling out introduces additional latency to your system.
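The communication point can be made concrete with a simple count. The sketch below assumes a worst case in which every node talks directly to every other node (a full mesh); this topology is an assumption for illustration only, and deployments that sit behind a load balancer or shared database need far fewer paths.

```python
def mesh_links(nodes):
    """Number of distinct node-to-node paths in a full mesh of `nodes`."""
    return nodes * (nodes - 1) // 2


# Doubling the node count roughly quadruples the paths to manage.
links = {n: mesh_links(n) for n in (2, 4, 8)}
print(links)
```

Under this assumption the management and latency overhead of scaling out grows faster than the node count itself, which is the trade-off against the higher unit cost of scaling up.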
References • Chapters 6 and 11 of the course book: Cloud Computing Bible, Wiley Publishing Inc., 2011.