
Scalability




Presentation Transcript


By Alex Huang

Current Status

  • 10k resources managed per management server node

  • Scales out horizontally (must disable stats collector)

  • Real production deployment of tens of thousands of resources

  • Internal testing with software simulators up to 30k physical resources with 30k VMs managed by 4 management server nodes

  • We believe we can at least double that scale per management server node

Balancing Incoming Requests

  • Each management server has two worker thread pools for incoming requests: effectively two servers in one.

    • Executor threads provided by Tomcat

    • Job threads waiting on job queue

  • Incoming requests that require mostly DB operations are short in duration and are executed directly by the executor threads, since incoming requests are already spread across management servers by the load balancer.

  • Incoming requests that need access to resources, which often run for a long time, are checked against the ACL by the executor threads and then queued and picked up by job threads.

  • The number of job threads is scaled to the number of DB connections available to the management server.

  • Requests may take a long time, depending on resource constraints, but they don’t fail.
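The two-pool split above can be sketched as follows. This is a minimal Python sketch of the dispatch idea (CloudStack itself is Java); the names `dispatch`, `JOB_QUEUE`, and `acl_allows` are illustrative, not CloudStack’s API.

```python
from queue import Queue

# Short DB-bound requests run inline on the executor (Tomcat) thread;
# long-running resource requests are ACL-checked up front, then queued
# for the job threads. The job pool is sized to available DB connections.
JOB_QUEUE = Queue()

def dispatch(request):
    """Called on an executor thread for every incoming request."""
    if request["needs_resource"]:
        if not acl_allows(request):       # ACL check before queuing
            return "denied"
        JOB_QUEUE.put(request)            # a job thread picks it up later
        return "queued"
    return run_db_operation(request)      # short-lived, handled inline

def acl_allows(request):
    # Illustrative stand-in for the real ACL check.
    return request.get("user") is not None

def run_db_operation(request):
    # Illustrative stand-in for a quick DB-only operation.
    return "done"
```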

The Much Harder Problem

  • CloudStack performs a number of tasks on behalf of the users, and those tasks increase with the number of virtual and physical resources available

    • VM Sync

    • SG Sync

    • Hardware capacity monitoring

    • Virtual resource usage statistics collection

    • More to come

  • At the scale of a few hundred resources, this is no big deal.

  • As the numbers grow, the problem magnifies.

  • How to scale this horizontally across management servers?

Comparison of Two Approaches

  • Stats Collector – collects capacity statistics

    • Fires every five minutes to collect stats about host CPU and memory capacity

    • Smart server and dumb client model: the resource only collects info and the management server processes it

    • Runs the same way on every management server

  • VM Sync

    • Fires every minute

    • Peer-to-peer model: Resource does a full sync on connection and delta syncs thereafter. The management server trusts the resource to report correct information.

    • Only runs against resources connected to the management server node


  • Assume 10k hosts and 500k VMs (50 VMs per host)

  • Stats Collector

    • Fires off 10k requests every 5 minutes or 33 requests a second.

    • Bad but not too bad: Occupies 33 threads every second.

    • But just wait:

      • 2 management servers: 66 requests

      • 3 management servers: 99 requests

    • It gets worse as the number of management servers increases, because the work does not auto-balance across them

    • Oh, but it gets worse still: the 10k hosts are now spread across 3 management servers, so while 99 requests are generated, roughly three times as many threads are involved because each request must be routed to the management server that owns the host.

    • It keeps the management servers about 20% busy even with no load from incoming requests

  • VM Sync

    • Fires off 1 request at resource connection to sync about 50 VMs

    • Then, the resource pushes updates: it knows what it has already reported and pushes only the changes that happen out-of-band.

    • So essentially no threads occupied for a much larger data set.
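The arithmetic behind the comparison, spelled out as a Python sketch using the slide’s assumptions (10k hosts, a 5-minute poll interval):

```python
# Back-of-the-envelope numbers from the slide: 10k hosts polled every 5 minutes.
hosts = 10_000
poll_interval_s = 5 * 60

requests_per_second = hosts / poll_interval_s   # ~33.3 req/s per collector

# Every management server runs the same collector, so the cluster-wide
# polling rate grows linearly with the number of management servers:
rate_with_3_servers = 3 * requests_per_second   # ~100 req/s (the "99" above
                                                # uses the rounded 33)

# The push model instead fires one full sync per host at connect time and
# only deltas thereafter, so its steady-state polling traffic is near zero.
```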

What’s the Down Side?

  • Each resource must reconcile the VM states produced by management server commands with the VM states it collects from the physical hardware, which requires more CPU

  • Resources must use more memory to keep track of what amounts to a journal of changes since the last sync point.

  • But data centers have plenty of both of these resources.
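The journal kept by the resource can be sketched like this. A minimal Python sketch, assuming a simple `vm_id -> state` map; CloudStack’s actual Java implementation differs.

```python
# Hypothetical sketch of the resource-side journal: the resource remembers
# the last state it reported per VM and, after the initial full sync, sends
# only the VMs whose state changed out-of-band since the last sync point.
class ResourceJournal:
    def __init__(self):
        self.reported = {}        # vm_id -> last state sent to the server

    def full_sync(self, current):
        """On (re)connect: report everything and remember what was sent."""
        self.reported = dict(current)
        return dict(current)

    def delta_sync(self, current):
        """Thereafter: report only changes since the last sync point."""
        delta = {vm: state for vm, state in current.items()
                 if self.reported.get(vm) != state}
        self.reported.update(delta)
        return delta
```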

Resource Load Balancing

  • As management servers are added to the cluster, resources are rebalanced seamlessly.

    • MS2 signals to MS1 to hand over a resource

    • MS1 waits for in-flight commands on the resource to finish

    • MS1 holds further commands in a queue

    • MS1 signals to MS2 to take over

    • MS2 connects

    • MS2 signals to MS1 to complete transfer

    • MS1 discards its connection to the resource and forwards the held commands to MS2

  • Listeners let business logic track connection status and adjust its work based on which resources are connected.

  • By working only on resources connected to the local management server, work is auto-balanced across management servers.

  • Also reduces the message routing between the management servers.
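The handover steps above can be sketched as follows. This is a Python sketch with illustrative class and method names, not CloudStack’s actual interfaces; in real code the held commands arrive concurrently during the transfer.

```python
# Hypothetical sketch of the resource handover between two management servers.
class ManagementServer:
    def __init__(self, name):
        self.name = name
        self.connected = set()    # resources this server owns
        self.held = []            # commands queued during a transfer
        self.executed = []        # commands actually run
        self.holding = False

    def wait_for_inflight(self, resource):
        pass                      # real code blocks on outstanding commands

    def hold_commands(self, resource):
        self.holding = True       # further commands go into the queue

    def submit(self, cmd):
        (self.held if self.holding else self.executed).append(cmd)

    def connect(self, resource):
        self.connected.add(resource)

    def release(self, resource):
        self.connected.discard(resource)      # discard the resource
        held, self.held = self.held, []
        self.holding = False
        return held

    def submit_all(self, cmds):
        self.executed.extend(cmds)

def rebalance(ms1, ms2, resource):
    """MS2 has asked MS1 to hand over `resource` (the slide's steps)."""
    ms1.wait_for_inflight(resource)   # MS1 waits for in-flight commands
    ms1.hold_commands(resource)       # further commands are queued
    ms2.connect(resource)             # MS2 takes over the connection
    held = ms1.release(resource)      # MS1 discards the resource...
    ms2.submit_all(held)              # ...and forwards the held commands
```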

Designing for Scalability

  • Take advantage of the most abundant resources in a data center (CPU, RAM)

  • Auto-scale to the least abundant resource (DB)

  • Do not hold DB connections/Transactions across resource calls.

    • Use a lock-table implementation (Merovingian2 or the GenericDao.acquireLockInTable() call) rather than database row locks in this situation.

    • Database row locks are still fine for quick, short lockouts.

  • Balance resource-intensive tasks as the number of management server nodes increases and decreases

    • Use job queues to balance long running processes across management servers

    • Make use of resource rebalancing in CloudStack to auto-balance your workload.
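The lock-table idea, as opposed to holding row locks across resource calls, can be sketched with a plain table and a uniqueness constraint. This is a minimal Python/SQLite sketch of the pattern only, not the Merovingian2 or `GenericDao.acquireLockInTable()` implementation.

```python
import sqlite3

# A named lock is an INSERT into a table with a PRIMARY KEY on the lock
# name: exactly one owner can hold it, and no DB row lock or transaction
# stays open across the (possibly long) resource call.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE op_lock (name TEXT PRIMARY KEY, owner TEXT)")

def acquire(name, owner):
    try:
        conn.execute("INSERT INTO op_lock VALUES (?, ?)", (name, owner))
        conn.commit()
        return True
    except sqlite3.IntegrityError:
        return False              # someone else already holds the lock

def release(name, owner):
    conn.execute("DELETE FROM op_lock WHERE name = ? AND owner = ?",
                 (name, owner))
    conn.commit()
```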


By Alex Huang

The Five W’s of Unreliability

  • What is unreliable? Everything

  • Who is unreliable? Developers & administrators

  • When does unreliability happen? 3:04 a.m. no matter which time zone… Any time.

  • Where does unreliability happen? In carefully planned, everything-has-been-considered data centers.

  • How does unreliability happen? Rather nonchalantly

Dealing with Unreliability

  • Don’t assume!

  • Don’t bang your head against the wall!

  • Know when you don’t know any better.

  • Ask for help!

Designs against Unreliability

  • Management servers keep a heartbeat with the DB: one write a minute.

  • A management server self-fences if it cannot write its heartbeat

  • Other management servers wait to confirm that the down management server is no longer writing its heartbeat, then signal interested software to recover

  • Checkpoints at every call to a resource, plus code to recover from those checkpoints

  • Database records are not actually deleted to help with manual recovery when needed

  • Write code that is idempotent

  • Respect modularity when writing your code
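The heartbeat-and-fencing scheme above can be sketched as follows. A minimal Python sketch: a dict stands in for the heartbeat table, and the fencing window of two missed writes is an assumption, not CloudStack’s actual constant.

```python
# Each server writes a timestamp once a minute; peers treat a server as
# dead only after its heartbeat has been stale for a full fencing window,
# so a single slow write does not trigger recovery.
HEARTBEAT_INTERVAL_S = 60
FENCE_AFTER_S = 2 * HEARTBEAT_INTERVAL_S   # assumed: wait out a missed write

heartbeats = {}                            # stands in for the heartbeat table

def write_heartbeat(server_id, now):
    # In real code this is a DB UPDATE; if the write fails, the server
    # self-fences rather than keep acting on resources it may have lost.
    heartbeats[server_id] = now

def is_dead(server_id, now):
    last = heartbeats.get(server_id)
    return last is None or now - last > FENCE_AFTER_S
```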
