Lessons Learned in Building a Highly Scalable MySQL Database Mariella Di Giacomo Daniel Chote Geoff Harrison The Hive
Outline • Introduction/Background • Motivations, Project Requirements • Architectural Design • Proposed Solutions • Performance Analysis Results • MySQL and Other Optimizations
The Hive • Started in 1998 (under a differentname) • Operations in 4 countries • Developer-centric organization
The Hive • The Hive’s collective vision is coding and collaborating from anywhere. • The Hive has a number of large scale projects that include video conferencing, VOIP, IM, and virtual whiteboard all connected through smart self-managing project tools.
MySQL Installations at The Hive • 70+ MySQL Server Installations • Linux Operating System • Mostly Commodity Hardware • Used for production, batch processing, replication and development
The Next Generation • “What we need now is a brilliant idea …”
Scalable Architecture • High Volume of users and access traffic • Exponential growth of data access and storage • Use of relatively commodity hardware
Architectural Requirements • Store data related to millions of users • Store some of the data in an XML format • Store and Retrieve the data through an Open Source Relational Database Management System (RDBMS)
Architectural Requirements • Build a scalable architecture capable of handling content updates and data addition at fastest rate reasonably possible • A system that has as little service disruption as possible • Robustness, Flexibility, Scalability and Speed
A Scalable Architecture • Scalable, robust, fast and flexible. • Capable of handling data related to millions of users. • Based on a data layout that maps all the data for each user to its individual table, organizing the users across multiple systems, databases and filesystems. • Consists of homogeneous systems. • Each system consists of two physical computers that synchronize the data through Master Master MySQL Replication.
Hardware Architecture • The hardware architecture consists of: • Pairs of AMD nodes (Opteron). Each node has 2 CPUs 2GHz with 2cores; it has 24 GB of RAM + 8 GB of swap • Each computer system has two types of data storage devices: Serial Advanced Technology Attachment (SATA) and Solid State Drive (SSD). • Each computer System is composed of Debian Linux Operating System (OS)
Software Architecture • We benchmarked 4 filesystems (Ext3, JFS, ReiserFS and XFS). • We chose MySQL as the RDBMS for the following reasons: • Open Source • Speed • Data Storage Capabilities • Database Design • Fault Tolerance • Security
MySQL Database Design • MySQL supports several storage engines. • We chose MyISAM, for the following reasons: • Each table data is stored in a MYD file • Each table indices are stored in a MYI file • Each table file requires less disk space compared to other storage engines. • Memory key buffer for the indexed data • Downsides of MyISAM • Good for high volume of writes and/or read, but not both in high concurrency.
MySQL Database Design • Our design takes advantage of the fact that there will not be high level of concurrency on the same table, while high level of concurrency will be present at database and server level. • Table Structure Design: • Reduce and or combine indices on some fields, to take fewer resources when adding and modifying data • Use of NOT NULL fields where possible • Use the most appropriate, efficient field type for a given field
Why that architecture ? • We had an idea ….. • We validated the idea through benchmark results and analysis • We analyzed several MySQL Storage engines (MyISAM, InnoDB, Maria, PBXT, Falcon) on 4 types of filesystems (Ext3, JFS, XFS, ReiserFS) using two system architectures (AMD and Intel), two types of data device technologies and integrated the benchmarks with some specific low-level optimizations.
Benchmarks • Benchmark Configuration • Benchmark Hardware Architecture • Benchmark Topics
Benchmark Configuration • The benchmark program is written in Perl. The same program is used for table creation, data insertion, update and selection. • The benchmark program spawns 100 parallel instances (clients) that execute Data Definition Language (DDL) and Data Manipulation Language (DML) statements against the MySQL server. • The statement response time has been taken on the server side and on the client side.
Benchmark Configuration • All benchmarks have been run several times for the same configuration. • All the computer systems benchmarked have been set on an isolated network to lower noise as much as possible. • All computer systems are composed entirely of Debian Linux (Lenny) running 2.6.23 and 2.6.24 • All systems have been compiled with latest MySQL 5.1.x
Benchmark Hardware Architecture • The hardware architecture used for the benchmarks consists of: • Pairs of AMD nodes (Opteron). Each node has 2 CPUs 2GHz with 2cores; 24 GB of RAM + 8 GB of swap • 2 Intel Xeon nodes. Each node has 2 CPUs 3GHz with 2cores and hyper-threading enabled (4 virtual cores each CPU); 24GB of RAM + 8 GB of swap • Each system has identical SATA storage devices. The AMD nodes contain also two identical SSD drives that are raided via an add-on raid controller.
Benchmark Hardware Architecture • Each system always uses two dedicated storage devices for database benchmarks.
Benchmark Topics • Data Layout • Organize each user’s data in a separate table • Partition a table to contain the data associated to a set of users, using MySQL horizontal partitioning. • Database Organization • 20 and 100 databases with 2 million tables • 50 databases with 1 million tables • MySQL Storage Engines: MyISAM, InnoDB, Maria, Falcon, PBXT
Benchmark Topics • Filesystem Organization • One data storage device for filesystem. Each filesystem contains 1 million tables • Ext3, JFS, ReiserFS and XFS • Data Storage Device Technology • SATA • SSD
Data Layout Scenarios • Two data layout scenarios are evaluated: • Organize each user’s data in a separate table • Partition a table to contain the data associated to a set of users, using MySQL horizontal partitioning. In this case the total number of tables required is the total number of users divided by the maximum number of partitions allowed on a partitioned table.
Data Layout Scenarios • Each Data Layout scenario has been benchmarked for: • Dynamic Creation of 1 million to 2 million tables on one system. In the case of 1 million tables they would be physically located on only one filesystem (one hard drive). In the case of 2 million tables they would be spread across 100 or 20 databases. Each dedicated filesystem would be mapped to a physical storage device. • Data Insertion and Update on the created tables • Data Selection
Table Creation Configuration • Table creation has been executed using a stored procedure. • Each benchmark has been executed on an AMD and Intel system • Each benchmark has been evaluated with 4 distinct filesystem types. • The table creation benchmarks have been evaluated with write cache enabled, write back (WB) and write cache disabled, write through (WT) disk policies.
Inserts and Updates Configuration • The benchmark configuration for the data insert and update is equivalent to the one used for table creation. • Each benchmark has been executed only on AMD systems • Each benchmark has been evaluated with 4 distinct filesystem types. • The benchmarks have been executed using write cache enabled and write cache disabled.
Inserts and Updates Configuration • While table creation would perform better using write cache disabled, initial benchmarks with DML operations using write cache disabled would perform worst compared to the equivalent with write cache enabled.
Data Layout Scenarios • Two data layout scenarios are evaluated: • Organize each user’s data in a separate table • SATA Device Technology • MyISAM • InnoDB • Falcon • Maria • PBXT • SSD Device Technology • MyISAM
Data Layout Scenarios • Data layout scenarios: • Organize each user’s data in a separate table • SATA Device Technology
MyISAM and SATA Benchmarks • Benchmarks shown in the coming graphs have been executed using the MyISAM Storage Engine • The MyISAM key_buffer is set to 4GB • In each benchmark the system containing MySQL server has two storage devices for the databases
Table Creation • These graphs show the benchmark results derived from processing 2 million tables. • Table creation of 2 million tables was spread across 20 databases. The databases were split across two filesystems. Each filesystem was on a unique SATA drive. (The drives were set up as single volumes through an add-on raid controller)
Table Creation with MyISAM • The benchmark results showed: • When creating 2 million tables, approximately 36GB of space is used • The highest processing rate was obtained using the JFS filesystem on a volume (SATA via raid controller) with write cache disabled • The MyISAM key buffer did not play a major role • The number of volumes would influence the processing rate.
Table Creation with MyISAM • Between 1 million tables on one filesystem versus 2 million tables on two filesystems, we did not see any major impact in number of tables created per second (provided that we would not reduce the number of storage devices). • 1 million tables would generate a total of 3 million files, 2 million tables would double the number.
MyISAM and SATA Benchmarks • Inserts and Updates on 1 million tables (50 databases) • Inserts and Updates on 2 million tables (20 and 100 databases)
Inserts and Updates • The next graphs show the results of DML operations on 1 million tables created using MyISAM storage engine with 4 types of filesystems and using write cache enabled. • While table creation would perform better using write cache disabled, initial benchmarks with DML operations using write cache disabled result in generating response times 10% higher compared to the equivalent with write cache enabled.
MyISAM and SATA Benchmarks • Inserts and Updates on 1 million tables • Inserts and Updates on 2 million tables • Benchmarks on DML operations on 2 million tables were stopped after a few tests due to their high response time, caused by heavy I/O operations on the volumes.