Live Database Migration for Elasticity in a Multitenant Database for Cloud Platforms

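Live Database Migration for Elasticity in a Multitenant Database for Cloud Platforms. Authors: Sudipto Das, Shoji Nishimura, Divyakant Agrawal, Amr El Abbadi. Abstract. As cloud platforms grow in popularity, scalable database management systems that support a large number of applications and run on a cluster of commodity machines are critical for cloud service providers. Because load in the cloud changes frequently, elastic load balancing is also essential for both performance and cost. Migrating tenant databases to new machines to offload work is a key technique for achieving elastic load balancing.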

Presentation Transcript


  1. Live Database Migration for Elasticity in a Multitenant Database for Cloud Platforms • Authors: Sudipto Das, Shoji Nishimura, Divyakant Agrawal, Amr El Abbadi

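  2. Abstract • As cloud platforms grow in popularity, scalable database management systems that support a large number of applications and run on a cluster of commodity machines are critical for cloud service providers. Because load in the cloud changes frequently, elastic load balancing is also essential for both performance and cost. Migrating tenant databases to new machines to offload work is a key technique for achieving elastic load balancing. • For live data migration, the first concern is minimizing system downtime and the negative impact on service. At present, however, most DBMSs do not support flexible and efficient live migration: migrating data either takes the system offline or disrupts service. • The authors study live data migration in a multitenant cloud database. They evaluate different multitenancy models and propose an efficient live migration technique that minimizes system downtime and service disruption during migration. The technique is implemented in a database designed specifically for the cloud and evaluated with two standard benchmarks, YCSB and TPC-C.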

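  3. 1 Introduction • In recent years cloud platforms have grown rapidly, and the number of applications deployed in the cloud is huge (Facebook: 55K), which severely tests database systems. Beyond scalability, fault tolerance, and high availability, a database serving this many tenants faces a large number of applications with irregular load patterns. Minimizing operating cost and sharing resources among tenants are therefore important. • The authors focus on efficient live data migration in a multitenant database to support elastic load balancing (tenant data is migrated from one node to another). The technique is called Iterative Copy. Besides minimizing service disruption, it ensures that the system as a whole never goes offline.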

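  4. Contributions: • Formally states the problem of tenant data migration in a multitenant DBMS and proposes a technique for efficient live migration. • Proves the correctness and safety of Iterative Copy and characterizes its behavior under different failure scenarios. • Experimentally evaluates the effectiveness of the technique using standard OLTP benchmarks.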

  5. 2 Preliminaries • 2.1 Multitenancy Models and Migration • Three multitenancy models: • Shared machine • Shared process • Shared table • DEFINITION 1. A Tenant Cell (or Cell for brevity) is a self-contained granule of application data, metadata, and state representing a tenant in the database.

  6. 2.2 Design Rationale and Reference System Model

  7. Design principles: • 1. Limit tenant transactions to single nodes. • 2. Decouple ownership from data storage. • 3. Tenants are oblivious of the physical location of their data. • 4. Balance resource sharing and tenant isolation. • 5. Balance functionality with scale.

  8. 2.3 Cost of Migration • DEFINITION 2. Live Migration in a database management system is the process of migrating a cell (or a logically contained part of the database) with minimal service interruption, no system downtime, and minimal overhead resulting from migration. • 2.4 Known Migration Techniques • (1) Stop and copy • (2) On-demand migration
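The two known techniques named on this slide can be contrasted with a minimal sketch. It assumes the usual reading of the terms: stop and copy takes the cell offline while all of its state is copied, whereas on-demand migration makes the cell available at the destination quickly and pulls pages only when they are first accessed. The names `stop_and_copy`, `OnDemandDestination`, and `fetch_page` are hypothetical and do not come from the presentation.

```python
def stop_and_copy(src_cache):
    """Copy every cached page while the cell is offline.

    Returns the destination cache and the number of pages that had to be
    copied during the unavailability window (here: all of them).
    """
    dst_cache = dict(src_cache)
    return dst_cache, len(src_cache)


class OnDemandDestination:
    """Destination that starts with an empty cache and pulls pages lazily.

    `fetch_page` is a hypothetical callable that returns a page from
    wherever it currently lives; each first access pays one remote fetch.
    """

    def __init__(self, fetch_page):
        self.fetch_page = fetch_page
        self.cache = {}
        self.remote_fetches = 0

    def read(self, page):
        if page not in self.cache:
            # Cache miss after migration: the cost moves to the post-migration phase.
            self.cache[page] = self.fetch_page(page)
            self.remote_fetches += 1
        return self.cache[page]
```

In this sketch the trade-off is visible in the counters: stop and copy pays for every page up front while the cell is unavailable, and on-demand migration pays per page later, as remote fetches on first access.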

  9. 3 Live Database Migration • 3.1 Iterative copy migration

  10. Focus on transferring the main memory state • The cached database state (DB state): the cached database pages or buffer pool, or some variant of this • The transaction execution state (Transaction state): the state of the active transactions and, in some cases, a subset of committed transactions • Notation: Nsrc (source node), Ndst (destination node), Cmigr (the cell being migrated) • Phase 0: Pre-Migration Phase, normal mode • Phase 1: Migration Phase • 1a: Begin migration: a snapshot of Cmigr is taken at Nsrc and copied to Ndst, where Cmigr is initialized • 1b: Iterative copy: Nsrc continues to serve Cmigr and tracks the changes; Ndst synchronizes with the updates of T11, T12, . . . , T1m, the transactions completed at Nsrc since the snapshot • 1c: Atomic handover: ownership of Cmigr is transferred from Nsrc to Ndst (1) Nsrc stops serving Cmigr and copies the final unsynchronized state to Ndst (2) T1m+1, . . . , T1n are the transactions not yet committed; they can be aborted entirely, aborted at Nsrc and restarted at Ndst, or migrated so that they start execution at Nsrc and complete at Ndst (3) The query router is notified of the new location of Cmigr • Phase 2: Post-Migration Phase: Ndst serves Cmigr; transactions T1m+1, . . . , T1n are completed
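The phases on this slide can be sketched as a small simulation. This is a minimal illustration under stated assumptions, not the authors' implementation: `Node`, `copy_delta`, `iterative_copy`, and the `threshold` stopping rule are all hypothetical names and choices, and the cached state is modeled as a plain dictionary of pages.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical node holding the cached pages of one cell."""
    name: str
    cache: dict = field(default_factory=dict)  # page id -> contents
    dirty: set = field(default_factory=set)    # pages changed since the last copy

    def write(self, page, value):
        """Apply an update and remember that the page changed."""
        self.cache[page] = value
        self.dirty.add(page)


def copy_delta(src, dst):
    """Ship the pages changed at src to dst; return how many were sent."""
    sent = len(src.dirty)
    for page in src.dirty:
        dst.cache[page] = src.cache[page]
    src.dirty.clear()
    return sent


def iterative_copy(src, dst, write_rounds, threshold=1):
    """Migrate src's cache to dst while src keeps serving writes.

    `write_rounds` is a list of rounds; each round is a list of
    (page, value) updates that src applies during one copy iteration.
    `threshold` is an assumed convergence rule: hand over once the
    pending delta is this small. Returns the number of pages copied
    in the final handover step, while the cell is unavailable.
    """
    # Phase 1a: snapshot the cell at the source and initialize the destination.
    dst.cache = dict(src.cache)
    src.dirty.clear()

    # Phase 1b: the source keeps serving and tracking changes;
    # the destination repeatedly catches up with the delta.
    for writes in write_rounds:
        for page, value in writes:
            src.write(page, value)
        if len(src.dirty) <= threshold:
            break
        copy_delta(src, dst)

    # Phase 1c: the source stops serving and only the last, small delta is copied.
    return copy_delta(src, dst)
```

With ten cached pages, a first round that touches three pages, and a second round that touches one, only a single page is left to copy at handover. Keeping that final step small is the point of iterating.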

  11. 3.2 Failure Handling during Migration • Failures in Phase 0 and Phase 2 are handled as normal DBMS node failures. • If either Nsrc or Ndst fails before Phase 1c, migration of Cmigr is aborted. • Phase 1c: • (i) ensure that changes from all completed transactions (T01, . . . , T0k, T11, . . . , T1m) at Nsrc have been flushed to stable storage; • (ii) synchronize the remaining state of Cmigr; • (iii) transfer ownership of Cmigr from Nsrc to Ndst; • (iv) notify the query router that all future transactions (T21, . . . , T2p) must be routed to Ndst. • Similar to 2PC among Nsrc, Ndst, and the query router.
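Steps (i) to (iv) of the handover can be sketched as follows. This is a simplified, single-process illustration rather than the protocol from the presentation: `Cluster`, `atomic_handover`, and the two callbacks are hypothetical, and treating a failure during steps (i) or (ii) as an abort is an assumption of the sketch. A real system would also need logging and recovery to make the ownership change atomic across node failures.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Hypothetical system metadata for the sketch."""
    owner: dict = field(default_factory=dict)   # cell -> node that owns it
    routes: dict = field(default_factory=dict)  # cell -> node the query router targets


def atomic_handover(cluster, cell, src, dst, flush_committed, sync_remaining):
    """Run handover steps (i)-(iv); return True on success, False if aborted.

    `flush_committed` and `sync_remaining` are caller-supplied callables
    standing in for steps (i) and (ii); they signal failure by raising
    RuntimeError (an assumption of this sketch).
    """
    if cluster.owner.get(cell) != src:
        raise ValueError("only the current owner can hand over a cell")
    try:
        flush_committed()  # (i) changes of completed transactions reach stable storage
        sync_remaining()   # (ii) the last unsynchronized state reaches the destination
    except RuntimeError:
        # Failure before ownership moved: abort, the source stays the owner.
        return False
    cluster.owner[cell] = dst   # (iii) one assignment, so the cell always has exactly one owner
    cluster.routes[cell] = dst  # (iv) future transactions are routed to the destination
    return True
```

Because ownership changes in a single assignment after both preparatory steps succeed, the cell is never without an owner and never has two, which mirrors the unique-ownership condition the slides require of a safe migration.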

  12. 3.3 Correctness Guarantees • DEFINITION 3. Safe Migration. • (i) Data Safety and Unique ownership: The disk resident image of a cell is consistent at any instant of time. This is in turn guaranteed if at any instant of time, only a single DBMS node owns the cell being migrated; • (ii) Durability: Updates from committed transactions are either in the disk resident database image or are recoverable. • DEFINITION 4. Liveness: • (i) If Nsrc and Ndst are not faulty and can communicate with each other for a sufficiently long duration in Phase 1, migration of Cmigr is successfully completed; • (ii) Furthermore, Cmigr is not orphaned (i.e., left without an owner) even in the presence of repeated failures.

  13. THEOREM 1. Atomicity of handover. • THEOREM 2. Independent Recovery. • THEOREM 3. A single failure does not incur additional unavailability. • COROLLARY 4. Provided Cmigr had a unique owner • LEMMA 5. Changes made by aborted transactions are neither persistently stored nor copied over during migration. • LEMMA 6. Changes made by committed transactions are persistently stored, and the log entries of completed transactions on Cmigr at Nsrc can be discarded after successful migration. • LEMMA 7. Migration of active transactions during migration does not violate the durability condition even if the write ahead log at Nsrc is discarded after successful migration. • LEMMA 8. Progress guarantee. Migration succeeds if Nsrc and Ndst can communicate during the entire duration of Phase 1 of migration.

  14. 3.4 Minimizing service disruption • The purpose of Phase 1b is to minimize service disruption by minimizing the amount of state that needs copying in the final step • The remaining question is how transactions T1m+1, . . . , T1n are handled: aborted, restarted, or carried over to complete execution at Ndst • Two cases: • transactions whose logic is executed at the client • transactions executed as stored procedures

  15. 4. Implementation Details • 4.1 System Architecture

  16. 4.2 Implementation Design • Due to the NAS abstraction, migration does not need to transfer the persistent image of Cmigr from Nsrc to Ndst. • The TM Master initiates the migration of a cell • Copying the database cache: • Iterative copy: read cache • Handover: write cache • Copying the transaction state: • Iterative copy: committed transactions • Handover: transactions in flight • Handover phase: • flush any changes from committed transactions • copy over the state of the active transactions • copy over changes to the read cache • Router (metadata, client): update the metadata • directly notify the client cache

  17. 5. Experimental Evaluation • Environment • Ten nodes • Each node: • 15 GB of memory, 8 EC2 Compute Units • 1690 GB of local instance storage • In ElasTraS, stop and copy migration does not involve any movement of the persistent image of the database.

  21. 6. RELATED WORK • 7. CONCLUSION • In the future, we plan to extend the design by adding an intelligent system control that can model the cost of migration to predict its cost as well as the behavior of the entire system. This model can be used to autonomously determine which cells to migrate, when to migrate, and to allow effective resource utilization in the system while maintaining the SLAs for the tenants of the system.
