Lecture IX: Multi-tenant Architecture CS 4593 Cloud-Oriented Big Data and Software Engineering
Outline • Multi-Tenant Architecture • Concepts • Practice • Multi-tenant at data layer • Multi-tenant database • Multi-tenant at logic layer • Multi-tenant at UI layer
SaaS & Multi-Tenancy • SaaS (Software-as-a-Service) • Software managed by SaaS vendor, run on SaaS server • Tenant = client organization • ~10 users, small/medium business • Multi-instance • separate instance for each tenant • Multi-tenancy • single instance of software serving multiple tenants
Why Multi-Tenancy • Economy of scale • large number of tenants • sharing the cost of single software instance • cost saving for each individual user • Examples?
Multi-Tenancy in Practice [Figure: # of tenants per database (1 to 10,000) plotted against complexity of application (small to large: Email, Proj Mgmt, CRM, ERP, Banking) and size of machine (blade to big iron)] • Economy of scale decreases with application complexity • At the sweet spot, compare TCO of 1 versus 100 databases
A Typical Blade Server [Image: IBM HS20 (Wikipedia)] • IBM BladeCenter HS12: • CPU: single-, dual-, or quad-core Xeon, 2~3 GHz • Memory (max): 24 GB • Internal storage (max): 293 GB • 24 GB / 10,000 tenants ≈ 2.4 MB per tenant
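The per-tenant memory figure on this slide is a plain division. A minimal sketch of the arithmetic, assuming decimal units (1 GB = 1000 MB), which is what the slide's rounding suggests; the function name is illustrative only:

```python
def memory_per_tenant_mb(total_gb: float, tenants: int) -> float:
    """Return the memory budget per tenant in MB (decimal units assumed)."""
    return total_gb * 1000 / tenants


# 24 GB of RAM shared by 10,000 tenants on one blade.
print(memory_per_tenant_mb(24, 10_000))  # 2.4
```

A budget this small is why per-tenant overhead (private tables, private metadata) matters so much in the layouts that follow.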
Outline • Multi-Tenant Architecture • Concepts • Practice • Multi-tenant at data layer • Multi-tenant database • Multi-tenant at logic layer • Multi-tenant at UI layer
Multi-Tenant Databases (MTD) • Consolidating multiple businesses onto same operational system • Consolidation factor dependent on size of the application and the host machine • Support for schema extensibility • Essential for ERP applications
Multi-Tenant Databases (MTD) • Support on top of the database layer • Non-intrusive implementation • Query transformation engine maps logical tenant-specific tables to physical tables • Various problems, for example: • Uneven table utilization (“hot spots”) • Metadata management when handling lots of tables
Classic Web Application (Basic Layout) • Pack multiple tenants into the same tables by adding a tenant id column • Great consolidation but no extensibility

Account
TenId  AcctId  Name  ...
17     1       Acme
17     2       Gump
35     1       Ball
42     1       Big
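The basic layout can be tried out directly. The following is a minimal sketch using Python's built-in `sqlite3` module, with the rows taken from the Account table on this slide; the helper name `accounts_for` and the composite primary key are assumptions made for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Account ("
    "TenId INTEGER, AcctId INTEGER, Name TEXT, "
    "PRIMARY KEY (TenId, AcctId))"
)
conn.executemany(
    "INSERT INTO Account VALUES (?, ?, ?)",
    [(17, 1, "Acme"), (17, 2, "Gump"), (35, 1, "Ball"), (42, 1, "Big")],
)


def accounts_for(tenant_id: int) -> list:
    """Return one tenant's accounts; every query is scoped by the TenId column."""
    cursor = conn.execute(
        "SELECT AcctId, Name FROM Account WHERE TenId = ? ORDER BY AcctId",
        (tenant_id,),
    )
    return cursor.fetchall()


print(accounts_for(17))  # [(1, 'Acme'), (2, 'Gump')]
```

All tenants share one physical table, which is where the consolidation comes from, but no tenant can add a column of its own without changing the table for everyone.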
Private Table • Each tenant gets his/her own private schema • No sharing • SQL transformation: Renaming only • High meta-data/data ratio • Lots of tables: linear in # of tenants, each with own meta-data • Low buffer utilization (~8k for index/data for each table)
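"Renaming only" means the query transformation just swaps each logical table name for the tenant's private table. A naive sketch, assuming private tables are named by appending the tenant id (for example `Account17`); the function name is hypothetical, and a real transformation engine would work on a parsed query rather than on raw text:

```python
import re


def to_private_table(logical_sql: str, tenant_id: int, tables: set) -> str:
    """Rename each logical table to its tenant-private table (Account -> Account17)."""

    def rename(match):
        name = match.group(0)
        # Only identifiers that are known logical tables get the tenant suffix.
        return f"{name}{tenant_id}" if name in tables else name

    # Visit every identifier-like token; keywords and columns pass through unchanged.
    return re.sub(r"\b[A-Za-z_]\w*\b", rename, logical_sql)


print(to_private_table("SELECT Name FROM Account WHERE AcctId = 1", 17, {"Account"}))
# SELECT Name FROM Account17 WHERE AcctId = 1
```

The transformation is trivial, which is the appeal of this layout; the cost shows up in the number of tables the DBMS has to manage.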
Private Table

Account
TenId  AcctId  Name  ...
17     1       Acme
17     2       Gump
35     1       Ball
42     1       Big
Handling Lots of Tables • Simplifying assumption: No extensibility • Experiment setup: • CRM schema with 10 tables • 10,000 tenants are packed onto one DBMS (DB2, 1 GB memory) • Data set size remains constant
Handling Lots of Tables • Parameter: Schema Variability • Number of tenants per schema instance • 0 (least variable): all tenants share one instance (= 10 tables) • 1 (most variable): each tenant has separate instance (= 100k tables)
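The two endpoints on this slide follow from the experiment setup (10,000 tenants, 10 tables per schema instance). A small sketch of the arithmetic; the linear growth of schema instances between the two endpoints is an assumption of this sketch, since the slide only states the extremes:

```python
def table_count(variability: float, tenants: int = 10_000, tables_per_schema: int = 10) -> int:
    """Total number of tables for a schema variability between 0 and 1."""
    # Assumption: instances grow linearly with variability, with at least one instance.
    instances = max(1, round(variability * tenants))
    return instances * tables_per_schema


print(table_count(0.0))  # 10
print(table_count(1.0))  # 100000
```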
Handling Lots of Tables – Results [Figure: results ranging from 10 fully shared tables to 100,000 private tables]
Extension Table • Split off the extensions into separate tables • Additional join at runtime • Row column for reconstructing the row (discussion: consider Acct17)
Extension Table (Cont’d) • Good: Better consolidation than Private Table layout • common attributes go to same table • Bad: Number of tables still grows in proportion to number of tenants • tenants in same domain may often have varied schema • so large # of *extension* tables (such as Healthcare_account)
Extension Table

Account17(A, N, H, B) =
  select Ae.A, Ae.N, Ha.H, Ha.B
  from Account_ext as Ae, Healthcare_account as Ha
  where Ae.Tenant = Ha.Tenant and Ae.Row = Ha.Row
    and Ae.Tenant = 17
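The join above can be run end to end. This is a sketch with `sqlite3`, assuming the column names `AcctId`, `Name`, `Hospital`, and `Beds` for the slide's A, N, H, and B; the hospital names and bed counts are made-up sample data, and `Row` is quoted because it is an SQL keyword:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript('''
CREATE TABLE Account_ext (Tenant INTEGER, "Row" INTEGER, AcctId INTEGER, Name TEXT);
CREATE TABLE Healthcare_account (Tenant INTEGER, "Row" INTEGER, Hospital TEXT, Beds INTEGER);
''')
# Common attributes of all tenants live in one shared base table.
conn.executemany(
    "INSERT INTO Account_ext VALUES (?, ?, ?, ?)",
    [(17, 0, 1, "Acme"), (17, 1, 2, "Gump"), (35, 0, 1, "Ball"), (42, 0, 1, "Big")],
)
# Only tenant 17 uses the healthcare extension (sample values are invented).
conn.executemany(
    "INSERT INTO Healthcare_account VALUES (?, ?, ?, ?)",
    [(17, 0, "St. Mary", 135), (17, 1, "State", 1042)],
)

account17 = conn.execute('''
SELECT Ae.AcctId, Ae.Name, Ha.Hospital, Ha.Beds
FROM Account_ext AS Ae, Healthcare_account AS Ha
WHERE Ae.Tenant = Ha.Tenant AND Ae."Row" = Ha."Row"
  AND Ae.Tenant = 17
ORDER BY Ae."Row"
''').fetchall()

print(account17)  # [(1, 'Acme', 'St. Mary', 135), (2, 'Gump', 'State', 1042)]
```

The `Row` column is what ties a base row to its extension row, so the logical row can be put back together with one extra join.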
Universal Table • Generic structure with VARCHAR value columns • n-th column of a logical table is mapped to ColN in a universal table • Extensibility: # of columns may expand as needed • Disadvantages • Very wide rows: many NULL values • Not type-safe: casting necessary • No index support (note: a column index is not very meaningful)
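A small sketch of the universal layout with `sqlite3` shows both the NULL padding and the casting. The table name `Universal`, the width of six value columns, and the sample hospital values are assumptions for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE Universal (Tenant INTEGER, "Table" INTEGER, '
    "Col1 TEXT, Col2 TEXT, Col3 TEXT, Col4 TEXT, Col5 TEXT, Col6 TEXT)"
)
# Every value is stored as text; unused columns stay NULL.
conn.executemany(
    "INSERT INTO Universal VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    [
        (17, 0, "1", "Acme", "St. Mary", "135", None, None),
        (17, 0, "2", "Gump", "State", "1042", None, None),
        (35, 1, "1", "Ball", None, None, None, None),
        (42, 2, "1", "Big", None, None, None, None),
    ],
)

# Reading tenant 17's logical table requires casting the text columns back.
account17 = conn.execute('''
SELECT CAST(Col1 AS INTEGER), Col2, Col3, CAST(Col4 AS INTEGER)
FROM Universal
WHERE Tenant = 17 AND "Table" = 0
ORDER BY CAST(Col1 AS INTEGER)
''').fetchall()

print(account17)  # [(1, 'Acme', 'St. Mary', 135), (2, 'Gump', 'State', 1042)]
```

No join is needed here, but an index on `Col4` would mix bed counts from one tenant with unrelated values from every other tenant, which is why column indexes are of little use in this layout.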
Universal Table [Figure: logical table (private after renaming)]
Pivot Table • Each field of a row in the logical table is given its own row • One pivot table for each type (e.g., int, string)
Pivot Table [Figure: String and Int pivot tables, Row 0]
Reconstruct Tenant Table from Pivot Table (only Hospital & Beds)

Account17(H, B) =
  select Ps.Str, Pi.Int
  from Pivot_str as Ps, Pivot_int as Pi
  where Ps.Tenant = Pi.Tenant and Ps.Table = Pi.Table and Ps.Row = Pi.Row   -- align String and Int
    and Ps.Col = 2 and Pi.Col = 3
    and Ps.Tenant = 17 and Ps.Table = 0
Reconstruct Tenant Table (cont’d) • align: same tenant, table, row • 4-way join

Account17(A, N, H, B) =
  select Pi’.Int, Ps’.Str, Ps.Str, Pi.Int
  from Pivot_int as Pi’, Pivot_str as Ps’, Pivot_str as Ps, Pivot_int as Pi
  where Pi’.Tenant = Ps’.Tenant and Ps’.Tenant = Ps.Tenant and Ps.Tenant = Pi.Tenant
    and Pi’.Table = Ps’.Table and Ps’.Table = Ps.Table and Ps.Table = Pi.Table
    and Pi’.Row = Ps’.Row and Ps’.Row = Ps.Row and Ps.Row = Pi.Row
    and Pi’.Col = 0 and Ps’.Col = 1 and Ps.Col = 2 and Pi.Col = 3
    and Ps.Tenant = 17 and Ps.Table = 0
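The 4-way join can be checked against a tiny data set. A sketch with `sqlite3`, assuming columns 0 to 3 hold account id, name, hospital, and beds; the aliases `Pi0`, `Ps1`, `Ps2`, `Pi3` stand in for the primed names on the slide, the hospital values are invented, and `Table`, `Row`, and `Int` are quoted to avoid keyword clashes:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript('''
CREATE TABLE Pivot_int (Tenant INTEGER, "Table" INTEGER, Col INTEGER, "Row" INTEGER, "Int" INTEGER);
CREATE TABLE Pivot_str (Tenant INTEGER, "Table" INTEGER, Col INTEGER, "Row" INTEGER, Str TEXT);
''')
# One pivot row per field: (tenant, table, column, row, value).
conn.executemany(
    "INSERT INTO Pivot_int VALUES (?, ?, ?, ?, ?)",
    [(17, 0, 0, 0, 1), (17, 0, 3, 0, 135), (17, 0, 0, 1, 2), (17, 0, 3, 1, 1042), (35, 1, 0, 0, 1)],
)
conn.executemany(
    "INSERT INTO Pivot_str VALUES (?, ?, ?, ?, ?)",
    [(17, 0, 1, 0, "Acme"), (17, 0, 2, 0, "St. Mary"), (17, 0, 1, 1, "Gump"),
     (17, 0, 2, 1, "State"), (35, 1, 1, 0, "Ball")],
)

account17 = conn.execute('''
SELECT Pi0."Int", Ps1.Str, Ps2.Str, Pi3."Int"
FROM Pivot_int AS Pi0, Pivot_str AS Ps1, Pivot_str AS Ps2, Pivot_int AS Pi3
WHERE Pi0.Tenant = Ps1.Tenant AND Ps1.Tenant = Ps2.Tenant AND Ps2.Tenant = Pi3.Tenant
  AND Pi0."Table" = Ps1."Table" AND Ps1."Table" = Ps2."Table" AND Ps2."Table" = Pi3."Table"
  AND Pi0."Row" = Ps1."Row" AND Ps1."Row" = Ps2."Row" AND Ps2."Row" = Pi3."Row"
  AND Pi0.Col = 0 AND Ps1.Col = 1 AND Ps2.Col = 2 AND Pi3.Col = 3
  AND Ps2.Tenant = 17 AND Ps2."Table" = 0
ORDER BY Pi0."Row"
''').fetchall()

print(account17)  # [(1, 'Acme', 'St. Mary', 135), (2, 'Gump', 'State', 1042)]
```

Each selected column adds one more pivot-table reference to the join, which is the cost the next slide refers to.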
Pivot Table: Performance • Generic type-safe structure • Eliminates handling many NULL values • Performance depends on the column selectivity of the query (number of reconstructing joins) • E.g., query on A17(H,B) is more selective than A17(A,N,H,B) • See previous example
Pivot Table vs. Chunk Table • Pivot table: one row for each field (Field 0 to Field 3) • Chunk table: one row for each chunk (Chunk 0, Chunk 1) [Figure: Row 0 stored in each layout]
How to Chunk or Fragment Original Rows • Many possible fragmentations • One idea: group fields by their “popularity”
Chunk Table • Steps: 1. align rows 2. merge chunks 3. reconstruct original row • 2-way join

Account17(A, N, H, B) =
  select Cis.Int1, Cis.Str1, Cis’.Str1, Cis’.Int1
  from Chunk Cis, Chunk Cis’
  where Cis.Chunk = 0 and Cis’.Chunk = 1
    and Cis.Row = Cis’.Row and Cis.Tenant = Cis’.Tenant and Cis.Table = Cis’.Table
    and Cis.Tenant = 17 and Cis.Table = 0
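The same logical rows need only a 2-way join in the chunk layout. A sketch with `sqlite3`, assuming a single chunk table named `ChunkIntStr` with one int and one string slot per chunk, where chunk 0 holds (account id, name) and chunk 1 holds (beds, hospital); the aliases `C0` and `C1` and the hospital values are illustrative:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE ChunkIntStr (Tenant INTEGER, "Table" INTEGER, Chunk INTEGER, '
    '"Row" INTEGER, Int1 INTEGER, Str1 TEXT)'
)
# One physical row per chunk: (tenant, table, chunk, row, int value, string value).
conn.executemany(
    "INSERT INTO ChunkIntStr VALUES (?, ?, ?, ?, ?, ?)",
    [
        (17, 0, 0, 0, 1, "Acme"), (17, 0, 1, 0, 135, "St. Mary"),
        (17, 0, 0, 1, 2, "Gump"), (17, 0, 1, 1, 1042, "State"),
        (35, 1, 0, 0, 1, "Ball"),
    ],
)

account17 = conn.execute('''
SELECT C0.Int1, C0.Str1, C1.Str1, C1.Int1
FROM ChunkIntStr AS C0, ChunkIntStr AS C1
WHERE C0.Chunk = 0 AND C1.Chunk = 1
  AND C0."Row" = C1."Row" AND C0.Tenant = C1.Tenant AND C0."Table" = C1."Table"
  AND C0.Tenant = 17 AND C0."Table" = 0
ORDER BY C0."Row"
''').fetchall()

print(account17)  # [(1, 'Acme', 'St. Mary', 135), (2, 'Gump', 'State', 1042)]
```

Grouping two fields per chunk halves the number of joins compared with the pivot layout for the same four columns.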
Chunk Table vs. Universal Table • Chunk table: # of rows (for each original row) = # of chunks • Universal table as extreme chunking: only one chunk per row, so only one row for each original row
Chunk Table Performance [Figure: Row 0 split into Chunk 0 and Chunk 1]
Querying Chunk Tables • Query Transformation • Row reconstruction needs many self- and equi-joins • Can be automatically translated
Querying Chunk Tables • Compilation Scheme:
1. Collect all table names and their corresponding columns from the logical source query
2. Obtain table definitions: for each table
   • obtain the Chunk Tables and the meta-data identifiers representing the used columns
   • generate a query that filters the correct columns (based on the meta-data identifiers) and aligns the different chunk relations on their ROW column
3. Extend each table reference in the logical source query by the corresponding table definition (obtained in step 2)
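Steps 2 and 3 of the compilation scheme can be sketched as a small query generator. Everything here is an assumption made for illustration: the `META` dictionary format, the `ChunkIntStr` table, the function names, and the logical query itself. Step 1 (collecting tables and columns) is taken as given input, and the textual replacement in step 3 is deliberately naive.

```python
import sqlite3

# Hypothetical metadata: logical column -> (chunk id, physical column).
META = {
    "Account17": {
        "tenant": 17,
        "table": 0,
        "columns": {
            "AcctId": (0, "Int1"), "Name": (0, "Str1"),
            "Hospital": (1, "Str1"), "Beds": (1, "Int1"),
        },
    }
}


def table_definition(logical_table, used_columns):
    """Step 2: build a subquery that filters the needed chunks and aligns them on Row."""
    meta = META[logical_table]
    tenant, table = meta["tenant"], meta["table"]
    chunks = sorted({meta["columns"][col][0] for col in used_columns})
    first = f"c{chunks[0]}"
    select_items = []
    for col in used_columns:
        chunk_id, physical = meta["columns"][col]
        select_items.append(f"c{chunk_id}.{physical} AS {col}")
    sources = [f"ChunkIntStr AS c{chunk_id}" for chunk_id in chunks]
    conditions = [f"{first}.Tenant = {tenant}", f'{first}."Table" = {table}']
    for chunk_id in chunks:
        alias = f"c{chunk_id}"
        conditions.append(f"{alias}.Chunk = {chunk_id}")
        if alias != first:  # align every further chunk with the first one
            conditions.append(f"{alias}.Tenant = {first}.Tenant")
            conditions.append(f'{alias}."Table" = {first}."Table"')
            conditions.append(f'{alias}."Row" = {first}."Row"')
    select_sql = ", ".join(select_items)
    from_sql = ", ".join(sources)
    where_sql = " AND ".join(conditions)
    return f"(SELECT {select_sql} FROM {from_sql} WHERE {where_sql})"


def compile_query(logical_sql, logical_table, used_columns):
    """Step 3: replace the logical table reference by its definition (naive text swap)."""
    definition = table_definition(logical_table, used_columns)
    return logical_sql.replace(logical_table, f"{definition} AS {logical_table}")


conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE ChunkIntStr (Tenant INTEGER, "Table" INTEGER, Chunk INTEGER, '
    '"Row" INTEGER, Int1 INTEGER, Str1 TEXT)'
)
conn.executemany(
    "INSERT INTO ChunkIntStr VALUES (?, ?, ?, ?, ?, ?)",
    [(17, 0, 0, 0, 1, "Acme"), (17, 0, 1, 0, 135, "St. Mary"),
     (17, 0, 0, 1, 2, "Gump"), (17, 0, 1, 1, 1042, "State")],
)

# Step 1 result is assumed: table Account17, columns Beds and Hospital.
sql = compile_query(
    "SELECT Beds FROM Account17 WHERE Hospital = 'State'", "Account17", ["Beds", "Hospital"]
)
print(conn.execute(sql).fetchall())  # [(1042,)]
```

Because both columns used by this query sit in chunk 1, the generated subquery touches a single chunk and needs no join; a query over all four columns would produce the 2-way self-join.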
Query Example: Step 1 • Result: tables = {Account17}, columns = {Account17.Beds, Account17.Hospital}
Query Example: Step 2 [Figure: String and Int tables]
Query Example: Step 3 (Chunk Table) [Figure: Chunk 0 and Chunk 1]
Summary • Multi-tenant databases are critical to scaling SaaS solutions • Varied schema layout schemes • different degrees of consolidation & extensibility • optimal layout depends on the particular data set, workload, etc. • Novel Chunk Table layout
Outline • Multi-Tenant Architecture • Concepts • Practice • Multi-tenant at data layer • Multi-tenant database • Multi-tenant at logic layer • Multi-tenant at UI layer
Web-Worker Model • Web role is the frontend, worker role is the backend • A web role is a worker role with a web server • Managed code start • Inbound on: any TCP port, HTTP/HTTPS
Design Issues • Application • Web Roles and Worker Roles • Stateless design • Easy to scale • Fault tolerance and recovery • Under the covers: multiple instances • Each runs in a virtual machine • Handled automatically by the hypervisor
Stateless is important • Why? • Random assignment of tasks to workers • No overhead of reading user profiles • Easy replacement and recovery
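The point about random assignment can be illustrated with a toy dispatcher. This is a sketch under stated assumptions: `SHARED_STORE` stands in for some external storage, and the worker names, task format, and function names are all hypothetical.

```python
import random

# Hypothetical external store; in a real deployment this would be a database or cache.
SHARED_STORE = {}
WORKERS = ["worker-0", "worker-1", "worker-2"]


def handle(task: dict) -> int:
    """Stateless handler: all state lives in the shared store, none in the worker."""
    total = SHARED_STORE.get(task["tenant"], 0) + task["amount"]
    SHARED_STORE[task["tenant"]] = total
    return total


def dispatch(task: dict, rng: random.Random) -> tuple:
    """Pick any worker at random; the result does not depend on the choice."""
    worker = rng.choice(WORKERS)
    return worker, handle(task)


rng = random.Random()
for amount in (5, 7):
    worker, total = dispatch({"tenant": "t17", "amount": amount}, rng)
    print(worker, total)
```

Since no worker keeps anything between calls, a failed worker can be replaced by a fresh instance without any recovery step beyond restarting it.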
Agent and Fabric • Agent • Exposes the API • Monitors the failure conditions of the application • Fabric • Allocates resources according to the configuration file • Detects and restarts failed web roles and workers
Code Samples: Queue

    // Build a batch of ten low-priority messages.
    var lowMessages = new List<BrokeredMessage>();
    for (int i = 0; i < 10; i++)
    {
        var message = new BrokeredMessage() { MessageId = Guid.NewGuid().ToString() };
        // Tag the message with its priority.
        message.Properties["Priority"] = Priority.Low;
        lowMessages.Add(message);
    }
    // Send the whole batch and block until the send completes.
    this.queueManager.SendBatchAsync(lowMessages).Wait();
Code Samples: Worker

    // Override in the worker that handles high-priority messages.
    protected override async Task ProcessMessage(BrokeredMessage message)
    {
        // Simulate message processing for High priority messages.
        await base.ProcessMessage(message);
        Trace.TraceInformation("High priority message processed by " + RoleEnvironment.CurrentRoleInstance.Id + " MessageId: " + message.MessageId);
    }

    // Base implementation that the override above calls.
    protected virtual async Task ProcessMessage(BrokeredMessage message)
    {
        // Simulating processing.
        await Task.Delay(TimeSpan.FromSeconds(2));
    }