1 / 62

INTRO

INTRO. Managing Your Databases and Sharding with Jetpants. Bob Patterson Jr Database Engineer @ Tumblr MySQL Connect 2013. BACKGROUND. Background. BACKGROUND. What is Jetpants?. BACKGROUND. What is Jetpants?. MySQL management suite Command Line Tool Ruby automation library

dmitri
Download Presentation

INTRO

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. INTRO Managing Your Databases and Sharding with Jetpants Bob Patterson Jr Database Engineer @ Tumblr MySQL Connect 2013

  2. BACKGROUND Background

  3. BACKGROUND What is Jetpants?

  4. BACKGROUND What is Jetpants? • MySQL management suite • Command Line Tool • Ruby automation library • Developed by Tumblr (created by Evan Elias) • Open sourced in June 2012 - https://github.com/tumblr/jetpants

  5. BACKGROUND MySQL at Tumblr

  6. BACKGROUND MySQL at Tumblr • 250+ machines • 54 Terabytes • 229,208,178,189 rows of data • 3 engineers

  7. DIVIDING THE DATA Dividing the Data

  8. DIVIDING THE DATA What is Sharding?

  9. DIVIDING THE DATA Sharding is... • Logically Dividing your data • Horizontally Scaling

  10. DIVIDING THE DATA Why Shard?

  11. DIVIDING THE DATA Reasons to Shard • Scale writes • Eliminate single points of failure • To much data for one machine

  12. DIVIDING THE DATA Sharding Options • Functional Partitions • Dynamic Range-Based • Static Range-Based • Hash / Modulo • Lookup Table

  13. DIVIDING THE DATA Functional Partitions • Each pool contains • 1 Master • 0-x Active Slaves • 2 Standby slaves • Global Tables / Unshardable Tables

  14. DIVIDING THE DATA Range Based Sharding

  15. DIVIDING THE DATA Ranged Based Sharding • min and max id range • Ranges can be any size • Easy to generate config file • Each Shard Contains • Same tables • Different sliver of the overall dataset

  16. RANGED SHARDING AT TUMBLR Ranged Sharding at Tumblr

  17. RANGED SHARDING AT TUMBLR Data moves around

  18. RANGED SHARDING AT TUMBLR Data moves around • Shards can be split into N new shards • Splits are non-disruptive • Shards can me merged to join under-utilized shards (new feature)

  19. RANGED SHARDING AT TUMBLR Benefits of Ranged Sharding

  20. RANGED SHARDING AT TUMBLR Benefits of Ranged Sharding • Low shard count • we do not pre-allocate thousands of small shards • Shards created on demand • Shard splits (and now shard merges) defrag the data

  21. RANGED SHARDING AT TUMBLR Pitfalls of Ranged Sharding

  22. RANGED SHARDING AT TUMBLR Pitfalls of Ranged Sharding • Uneven load on your shards • Uneven usage of disk capacity

  23. JETPANTS COMMAND LINE Jetpants Installation

  24. JETPANTS COMMAND LINE Jetpants Installation • Requires ruby 1.9.2+ • gem install jetpants • May need to install system packages first (mysql-devel)

  25. JETPANTS COMMAND LINE MySQL Setup • MySQL / Percona Server 5.1 or 5.5 (5.6 soon) • InnoDB / Percona XtraDB (need clustered index) • First column of PK on sharded tables is sharding key • No auto_increment tables on shards • One mysqld and one database per server

  26. JETPANTS COMMAND LINE Jetpants Command Line Tools

  27. JETPANTS COMMAND LINE Jetpants Command Line Tools • jetpants pools • jetpants summary • jetpants clone_slave • jetpants shard_split • jetpants promotion • jetpants shard_merge (coming soon)

  28. JETPANTS COMMAND LINE jetpants clone slave

  29. JETPANTS COMMAND LINE jetpants clone slave • Utilizes a standby slave • Stops mysqld and copies over files • Uses tar, nc and compression with qpress (you can set your own compression) • Can create multiple slaves by chaining them with tee and fifo on intermediate nodes (uses full up and down link)

  30. JETPANTS COMMAND LINE jetpants exporting data

  31. JETPANTS COMMAND LINE jetpants exporting data • Divide data by primary key into n chunks • Spawn multiple threads to export data • Each thread does a SELECT INTO OUTFILE

  32. JETPANTS COMMAND LINE jetpants importing data

  33. JETPANTS COMMAND LINE jetpants importing data • Spawn multiple threads again • Threads do LOAD DATA INFILE • Disables binary logging before the import • Also sets innodb-autoinc-lock-mode = 2 • Disable unique_checks

  34. JETPANTS COMMAND LINE jetpants removing masters

  35. JETPANTS COMMAND LINE jetpants removing masters • Lockless • Point reads to new master • Point writes to new master • No auto_increment tables

  36. JETPANTS COMMAND LINE jetpants removing masters Master Reads/Writes Standby Slave Standby Slave Standby Slave

  37. JETPANTS COMMAND LINE jetpants removing masters Old Master R/W New Master Standby Slave Standby Slave

  38. JETPANTS COMMAND LINE jetpants removing masters Old Master Writes New Master Reads Standby Slave Standby Slave

  39. JETPANTS COMMAND LINE jetpants removing masters Old Master New Master Reads/Writes Standby Slave Standby Slave

  40. JETPANTS COMMAND LINE jetpants promotions New Master Reads/Writes Old Master Standby Slave Standby Slave

  41. JETPANTS COMMAND LINE jetpants shard splits

  42. JETPANTS COMMAND LINE jetpants shard splits • Clone one new slave for each new shard • Export dataset that will live on new shards • Drop the tables and recreate tables • Import data • Replay replication stream from parent shard • Remove data from other shards

  43. jetpants shard splits JETPANTS COMMAND LINE Master Reads/Writes blogs 1-100 Standby Slave Clone Standby Slave Clone Standby Slave Standby Slave

  44. jetpants shard splits JETPANTS COMMAND LINE Master Reads/Writes blogs 1-100 Standby Slave Clone blogs 1-49 Standby Slave Clone blogs 50-100 Standby Slave blogs 1-100 Standby Slave blogs 1-100

  45. jetpants shard splits JETPANTS COMMAND LINE Parent Master blogs 1-100 Reads/Writes Child Master blogs 1-49 Standby Slave blogs 1-100 Standby Slave blogs 1-100 Child Master blogs 50-100 Standby Slave blogs 1-49 Standby Slave blogs 1-49 Standby Slave blogs 50-100 Standby Slave blogs 50-100

  46. jetpants shard splits JETPANTS COMMAND LINE Parent Master blogs 1-100 Writes Child Master blogs 1-49 Reads Child Master blogs 50-100 Reads Standby Slave blogs 1-100 Standby Slave blogs 1-100 Standby Slave blogs 1-49 Standby Slave blogs 1-49 Standby Slave blogs 50-100 Standby Slave blogs 50-100

  47. jetpants shard splits JETPANTS COMMAND LINE Parent Master blogs 1-100 Child Master blogs 1-49 Reads/Writes Child Master blogs 50-100 Reads/Writes Standby Slave blogs 1-100 Standby Slave blogs 1-100 Standby Slave blogs 1-49 Standby Slave blogs 1-49 Standby Slave blogs 50-100 Standby Slave blogs 50-100

  48. jetpants shard splits JETPANTS COMMAND LINE Child Master blogs 1-49 Reads/Writes Child Master blogs 50-100 Reads/Writes Standby Slave blogs 1-49 Standby Slave blogs 1-49 Standby Slave blogs 50-100 Standby Slave blogs 50-100

  49. JETPANTS COMMAND LINE jetpants merge shards

  50. JETPANTS COMMAND LINE jetpants merge shards • Mariadb as an aggregator node (for its multi-source replication) • Export data from each shard • Import data into aggregator node and new master at same time • Clone new slaves from new master • Pt-query-checksum during testing and a validator for each live run

More Related