1 / 37

E190 Heavily Distributed Database Architecture for OLTP Systems

E190 Heavily Distributed Database Architecture for OLTP Systems. Michael Knoll Manager IT - Federal Express Corporation Revenue Systems Engineering mknoll@fedex.com. Agenda.

berget
Download Presentation

E190 Heavily Distributed Database Architecture for OLTP Systems

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. E190 Heavily Distributed Database Architecture for OLTP Systems • Michael Knoll • Manager IT - Federal Express Corporation • Revenue Systems Engineering • mknoll@fedex.com

  2. Agenda How to combine data distribution concepts and strategies with Adaptive Server features to implement very large, scalable OLTP systems. • Why Distribute? • The Nature of Distribution • Products and Features Involved • Core Concepts and Strategies

  3. Why Distribute? Large Database • Need to reduce size for manageability and performance Performance • Eliminate bottlenecks by utilizing more resources Scalability • Need to be able to scale horizontally to meet new requirements and business growth

  4. Why Distribute? Fault Tolerance • Protect from single points of failure (Geography) • Desire to operate in multiple locations or datacenters (Cost) • Take advantage of more, smaller, cheaper hardware (Administration) • More, smaller databases

  5. What kind of distribution are we talking about? Multi-Server • Many SMP machines operating independently Multi-Database • Many databases managed independently Variable Data Distribution • Horizontal, vertical, and replicated

  6. What kind of distribution are we talking about? Batch Applications • “parallel”, partitioned by a key On-line Services • Small distributed transactions routed by a key Locator Services • Low performance data location by brute force

  7. Database Architecture Diagram

  8. Products Involved ASE 12.5 • Important features exist to make it all possible Replication Server 11.5 • Still as reliable as ever Open Client 11.X - 12.X • Custom C/C++ program shells incorporating distribution services

  9. Products Involved BEA Tuxedo • Highly configurable, routed application services MQ Series • Persistent transaction messaging Tibco • Database Adapters for event driven publishing

  10. Key ASE Features • Remote Procedure Calls • Execute Immediate • Distributed Transaction Management • Component Integration Services (CIS, a.k.a. Omni) • Union in View • Data Only Locking • Many Others...

  11. Core Concepts & Strategies • Data Distribution • Transaction Routing • Dynamic Built RPC’s • Multi-database, Multi-server Transaction Coordination • Connection Management • Distributed Access to Central Tables • Physical Table Partitioning • Data Location & “Look Through Databases”

  12. Data Distribution Horizontal Distribution The primary method for spreading the data that makes up the vast majority of the system. • Same schema • Different data • Data is partitioned by a consistent key • Transaction routing is essential

  13. Data Distribution Vertical Distribution The primary method for segregating data that serves different purposes within the system. • Different schema • Different data • Data is not partitioned by a key • Data is placed in separate databases serving different purposes • Connection management is the routing method

  14. Data Distribution Propagation Special purpose data that is required in totality by distributed applications. • Same schema • Same data • Data is replicated from a single master to N replicates • Latency must be tolerable

  15. Database Architecture Diagram

  16. Transaction/Application Routing On-line Transactions • Data dependant routed to Tuxedo application servers • Routing is a lookup on an xref table • Routing assignment is deterministic Batch Applications • Inputs are pre-split by routing key • Control of flow is managed by 3rd party scheduling product

  17. Transaction/Application Routing Database Targets • Every database in the system is assigned a unique target • We chose to identify them at 3 levels • Site - more specifically, datacenter • Server - multiple servers per datacenter • Database - multiple databases per dataserver Example: 010203 = Site 01, Server 02, Database 03

  18. Transaction/Application Routing Database Target Lookup Routing Id resolves database target

  19. Transaction/Application Routing Hostname lookup Database target resolves SQL Server/Database combination

  20. Database Architecture Diagram

  21. Dynamic Built RPC’s • Secondary routing • Used when unit-of-work spans servers/databases • Make no assumptions of data location • Use of execute immediate • Use of Distributed Transaction Management (DTM) feature • Typically only your “on-line transactions”

  22. Dynamic Built RPC’s Multi-server transaction without rpc

  23. Dynamic Built RPC’s Multi-server transaction with rpc

  24. Dynamic Built RPC’s Coding example (utility function): CREATE PROCEDURE proc_sel_rpc_info @db_target varchar(6), @rpc_string varchar(60) OUTPUT AS SELECT @rpc_string = rtrim(sql_svr_nm)+ "." + rtrim(phys_db_nm) FROM db_nm_info WHERE db_nm = @db_target

  25. Dynamic Built RPC’s Coding example (exec immediate): CREATE PROC proc_InsShpmtB @db_target varchar(6), @shpmt_id char(17), @ab_nbr char(12), @parm4 char(9), @parm5 char(2), AS DECLARE @rpc_string varchar(60) , @rpc_proc varchar(100), @return_code int exec proc_sel_rpc_info @db_target, @rpc_string OUTPUT select @rpc_proc=@rpc_string + "..proc_InsShpmtA" exec @return_code=@rpc_proc @shpmt_id, @ab_nbr, @parm4, @parm5 return @return_code

  26. Database Architecture Diagram

  27. Connection Management Methods for database access • All applications, batch and on-line have a single database context for insert/update/delete • Secondary read-only connections (no pooling) • Routed Tuxedo services (provides pooling) • Secondary update connections (no UOW consistency) • Remote procedure calls (DTM w/UOW consistency) • Proxy tables and/or rpc’s (CIS w/UOW consistency)

  28. Distributed Access to Central Tables N:1 Configuration • Proxy tables for the single, centralized base table created at each desired location • Useful for aggregating distributed data • Provides proper isolation as if the table was local • Avoids multiple job steps - makes for easy coding • Can be very contentious

  29. Distributed Access to Central Tables

  30. Physical Table Partitioning • See Russ Wheaton’s E121 afternoon session • Method for managing very large tables often containing transient data • Promotes truncating vs. deleting to purge data • Made to look like a single table to the application • Inserts/Updates are directed to appropriate “partition” • Selects can be directed when the partition key is known. Alternatively a union view can be employed

  31. Data Location • Combines union views and proxy tables or proxy rpc’s • Used in the rare case when the routing key for a transaction is not known • Union views are serial in execution and read only • Where high performance not required, retrieve the data via the union view by way of proxy tables • For performance, only return the location followed by a dynamic generated rpc

  32. Data Location Proxy Tables (1 for each of 18 target databases) create existing table shipment_010101 ( shpmt_id char(17), col1 int, col2 int, … col20 varchar(35) ) external table at “SYB_D1.syb_d1_a1.dbo.shipment” create existing table shipment_010203 ( shpmt_id char(17), col1 int, col2 int, … col20 varchar(35) ) external table at “SYB_D2.syb_d2_a3.dbo.shipment”

  33. Data Location Union view over proxy tables create view shipment_locator as select “010101” “target”, shpmt_id, col1, col2, … , col20 from shipment_010101 union all select “010102 ” “target”, shpmt_id, col1, col2, … , col20 from shipment_010102 union all select “010103 ” “target”, shpmt_id, col1, col2, … , col20 from shipment_010103 union all . . select “010603 ” “target”, shpmt_id, col1, col2, … , col20 from shipment_010603 go

  34. Data Location Proxy rpc’s (1 for each of 18 target databases) create existing table shipment_proc_010101 ( does_exist smallint, _shpmt_id char(17) null ) external procedure at “SYB_D1.syb_d1_a1.dbo.shipment_proc” Procedure at each of 18 target databases create procedure shipment_proc @shpmt_id char(17) as select 1 from shipment where shpmt_id = @shpmt_id

  35. Data Location Union view over proxy rpc’s create view shipment_locator as select “010101” “target”, does_exist, _shpmt_id from shipment_proc_010101 union all select “010102” “target”, does_exist, _shpmt_id from shipment_proc_010102 union all select “010103”, “target”, does_exist, _shpmt_id from shipment_proc_010103 union all . . select “010603”, “target”, does_exist, _shpmt_id from shipment_proc_010603 go

  36. Look Through Database • Contains no data • Contains proxy tables, rpc’s and union views • NOT a high performer • Used when ... • searching for data without routing key • performing queries combining results from multiple locations • single point of access is desired

  37. Database Architecture Diagram

More Related