
Differentiated Services == Differentiated Scheduling



  1. The role of the Nova scheduler in managing Quality of Service
  Differentiated Services == Differentiated Scheduling
  Gary Kotton - VMware
  Gilad Zlotkin - Radware

  2. Enterprise-Ready OpenStack
  Migrating existing mission-critical and performance-critical enterprise applications requires:
  → High service levels
    • Availability
    • Performance
    • Security
  → Compliance with existing architectures
    • Multi-tier
    • Fault-tolerance models

  3–6. Service Level for Applications (built up over slides 3–6)
  • Availability
  • Performance
    • Transaction Latency (sec)
    • Transaction Load/Bandwidth (TPS)
  • Security
    • Data Privacy
    • Data Integrity
    • Denial of Service
  What does all this have to do with the Nova scheduler?

  7. High Availability Models
  • Availability-zone redundancy → the “cloud” way
  • Server redundancy → the “classic” way
  • Both server and zone redundancy → the “enterprise” disaster-recovery way

  8. Availability Zone Redundancy
  [Diagram: global load balancing across availability zones AZ1 and AZ2, with load balancers LB1/LB2, web servers WS1–WS4, and databases DB1/DB2 split between the zones]

  9. Server Redundancy
  [Diagram: redundant load balancers LB1/LB2 in front of web servers WS1–WS3 and databases DB1/DB2]

  10. Server and Zone Redundancies
  [Diagram: global load balancing across AZ1 and AZ2, with redundant load balancers LB1–LB4, web servers WS1–WS6, and databases DB1–DB4 in both zones]

  11. Network Availability
  VMware’s NSX, for example
  [Diagram: a logical network (LB1/LB2, WS1–WS3, DB1/DB2) running over a transport network managed by a controller cluster]

  12. Load Balancer Availability
  Radware’s Alteon load balancer, for example
  [Diagram: active/standby pair LB1/LB2 with auto failover, configuration synchronization, and persistency-state synchronization, in front of WS1–WS3]

  13. Group Scheduling
  • Group together the VMs that provide a certain service
  • Enables scheduling policies per group/sub-group
  • Provides a multi-VM application designed for fault tolerance and high performance

  14–16. Example (built up over slides 14–16)
  Bad placement: if a host goes down, the entire service is down!
  Placement strategy - anti-affinity: achieving fault tolerance

  17–21. Placement Strategies (built up over slides 17–21)
  • Availability - anti-affinity
    • VMs should be placed in different 'failure domains' (e.g., on different hosts) to ensure application fault tolerance
  • Performance
    • Network proximity: group members should be placed as closely as possible to one another on the network (same 'connectivity domain') to ensure low latency and high performance
    • Host capability: IO-intensive, network-intensive, CPU-intensive, ...
    • Storage proximity
  • Security - resource isolation/exclusivity
    • Host, network, ...
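
The anti-affinity strategy above can be sketched as a simple scheduler filter: reject any candidate host that already runs a member of the group. This is a minimal, hypothetical sketch loosely modeled on the filter-style interface Nova uses; the class and method names here are illustrative, not Nova's actual code.

```python
# Hypothetical anti-affinity filter sketch (illustrative, not Nova's code).

class GroupAntiAffinityFilter:
    """Reject hosts that already run a member of the instance group."""

    def host_passes(self, host, filter_properties):
        # Hosts currently occupied by members of the group, assumed to be
        # passed down with the scheduling request (e.g., via a hint).
        group_hosts = set(filter_properties.get("group_hosts", []))
        return host not in group_hosts


f = GroupAntiAffinityFilter()
props = {"group_hosts": ["host1", "host2"]}
candidates = ["host1", "host2", "host3", "host4"]
placement = [h for h in candidates if f.host_passes(h, props)]
print(placement)  # → ['host3', 'host4']
```

Only hosts with no group member survive the filter, so each new group member lands in a fresh failure domain.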

  22. Anti-Affinity
  • Havana: anti-affinity per group
    • nova boot --hint group=WS[:anti-affinity] --image ws.img --flavor 2 --num 3 WSi
  • “Instance Groups”
    • Properties:
      • Policies - for example, anti-affinity
      • Members - the instances that are assigned to the group
      • Metadata - key/value pairs
  • Sadly, this did not make the Havana release
  • Work continues in Icehouse with extended functionality

  23. Network Proximity (Same Rack)
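
Network proximity is the mirror image of anti-affinity: instead of rejecting hosts, the scheduler can score hosts higher when they sit in the same rack as existing group members. The sketch below is hypothetical (the function name and the `host_racks` mapping are illustrative assumptions, not a Nova API):

```python
# Hypothetical rack-proximity weigher sketch: prefer hosts in racks that
# already contain members of the group (illustrative, not Nova's code).

def rack_proximity_weight(host, host_racks, group_hosts):
    """Return 1.0 if `host` shares a rack with any group member, else 0.0."""
    group_racks = {host_racks[h] for h in group_hosts if h in host_racks}
    return 1.0 if host_racks.get(host) in group_racks else 0.0


host_racks = {"h1": "rackA", "h2": "rackA", "h3": "rackB"}
weights = {h: rack_proximity_weight(h, host_racks, ["h1"]) for h in host_racks}
print(weights)  # h1 and h2 share rackA with the group member; h3 does not
```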

  24. Host Capabilities
  • IO-intensive
  • CPU-intensive
  • Network-intensive
  → “Smart resource placement” - Yathi Udupi and Debo Dutta (Cisco)
  → “Host Capabilities” - Don Dugger (Intel)
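
Capability-based placement can be sketched as a simple match between a requested capability tag and tags advertised by each host. The tag names and data shapes below are assumptions for illustration, not what the referenced blueprints actually define:

```python
# Hypothetical capability filter sketch: keep hosts that advertise the
# capability the workload requests (illustrative, not a real Nova filter).

def capability_passes(host_caps, required):
    return required in host_caps


hosts = {
    "h1": {"io-intensive"},
    "h2": {"cpu-intensive"},
    "h3": {"io-intensive", "network-intensive"},
}
eligible = [h for h, caps in sorted(hosts.items())
            if capability_passes(caps, "io-intensive")]
print(eligible)  # → ['h1', 'h3']
```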

  25. Storage Proximity
  • Schedule instances to have affinity to Cinder volumes
  → “Scheduling Across Services” - Boris Pavlovic (Mirantis) and Alex Glikson (IBM)
  → “Smart resource placement” - Yathi Udupi and Debo Dutta (Cisco)

  26. Resource Exclusivity
  • Network isolation: Neutron, for example with VMware’s NSX
  • Host allocation: enable a user to have a pool of hosts for exclusive use
  → “Private Clouds - Whole Host Allocation” - Phil Day (HP), Andrew Laski (Rackspace)

  27. Additional Scheduling Topics
  → “Scheduler Performance” - Boris Pavlovic (Mirantis)
  → “Methods to Improve DB Host Statistics” - Shane Wang and Lianhao Lu (Intel)
  → “Scheduler Metrics - Relationship with Ceilometer” - Paul Murray (HP)
  → “Multiple Scheduler Policies” - Alex Glikson (IBM)

  28. Icehouse
  • Expand on “Instance Group” support
  • Topology of resources and relationships between them
  • Debo Dutta and Yathi Udupi (Cisco)
  • Mike Spreitzer (IBM)
  • Gary Kotton (VMware)

  29. API - Aiming for I1
  Proposed API (Nova extension):
  • id - a unique UUID
  • name - a human-readable name
  • tenant_id - the ID of the tenant that owns the group
  • policies - a list of policies for the group (anti-affinity, network proximity, and host capabilities)
  • metadata - a way to store arbitrary key/value pairs on a group
  • members - the UUIDs of all of the instances that are members of the group
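
As a rough sketch, the proposed instance-group resource can be modeled as a plain data structure with the fields listed above. This is only an illustration of the shape of the resource; the actual Nova extension defines it differently:

```python
# Sketch of the proposed instance-group resource (field names from the
# slide; illustrative only, not the real Nova extension).
import uuid
from dataclasses import dataclass, field


@dataclass
class InstanceGroup:
    name: str
    tenant_id: str
    id: str = field(default_factory=lambda: str(uuid.uuid4()))
    policies: list = field(default_factory=list)   # e.g. ["anti-affinity"]
    metadata: dict = field(default_factory=dict)   # arbitrary key/value pairs
    members: list = field(default_factory=list)    # instance UUIDs


ws = InstanceGroup(name="WS", tenant_id="tenant-1", policies=["anti-affinity"])
print(ws.name, ws.policies, len(ws.members))  # new groups start with no members
```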

  30. Flow
  • A group will be created with no members
  • A group will have a policy
  • The group ID will be used for scheduling
    • Passed as a scheduler hint
  • The scheduler will update the members
  • Pending: support for groups of groups
  • Group membership will be removed when an instance is deleted
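
The flow above can be walked through with a toy in-memory model: create an empty group with a policy, pass the group ID when booting, let the scheduler record membership, and drop membership on delete. The helper names here are hypothetical, not Nova API calls:

```python
# Toy walkthrough of the group-scheduling flow (hypothetical helpers,
# not Nova code).

groups = {}


def create_group(group_id, policy):
    # A group is created with a policy and no members.
    groups[group_id] = {"policy": policy, "members": []}


def boot_instance(instance_id, hint_group_id):
    # The group ID is passed as a scheduler hint;
    # the scheduler records the new member.
    groups[hint_group_id]["members"].append(instance_id)


def delete_instance(instance_id, group_id):
    # Group membership is removed when the instance is deleted.
    groups[group_id]["members"].remove(instance_id)


create_group("WS", policy="anti-affinity")
boot_instance("vm-1", "WS")
boot_instance("vm-2", "WS")
delete_instance("vm-1", "WS")
print(groups["WS"]["members"])  # → ['vm-2']
```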

  31. Summary
  Migrating existing mission-critical and performance-critical enterprise applications requires high service levels → group scheduling policies:
  • Availability → anti-affinity
  • Performance → proximity / host capability
  • Security → resource exclusivity

  32. Q&A
  Thank You
  Gary Kotton: gkotton@vmware.com
  Gilad Zlotkin: gzlotkin@radware.com
