1 / 31

Carnegie Mellon Univ. Dept. of Computer Science 15-415/615 - DB Applications

Carnegie Mellon Univ. Dept. of Computer Science 15-415/615 - DB Applications. C. Faloutsos – A. Pavlo How to Scale a Database System. hag·i · og·ra·phy. (noun). ChristosTheGreekGodofDatabases.com. Pinterest meets Causal Encounters meets Kickstarter meets Twitter With Christos!.

sashcraft
Download Presentation

Carnegie Mellon Univ. Dept. of Computer Science 15-415/615 - DB Applications

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Carnegie Mellon Univ.Dept. of Computer Science15-415/615 - DB Applications C. Faloutsos – A. Pavlo How to Scale a Database System

  2. hag·i·og·ra·phy (noun)

  3. ChristosTheGreekGodofDatabases.com • Pinterest meets Causal Encounters meets Kickstarter meets Twitter • With Christos!

  4. ChristosTheGreekGodofDatabases.com • More reads than writes. • All media stored outside of DBMS. • How do we choose the right database architecture? CMU SCS 15-415/615

  5. Outline • Single-Node Databases • NoSQL Systems • NewSQL Systems CMU SCS 15-415/615

  6. Late-1990s / Early-2000s • All the big players were heavyweight and expensive. • Oracle, DB2, Sybase, SQL Server, Informix. • Open-source databases were missing important features. • Postgres, mSQL, MySQL. CMU SCS 15-415/615

  7. Mid-2000s • MySQL + InnoDB is widely adopted by new web companies: • Supported transactions, replication, recovery. • Memcache for caching queries. CMU SCS 15-415/615

  8. ChristosTheGreekGodofDatabases.com • Let’s go with MySQL. • We’re getting a lot of traffic. • Our database server is saturated! How do we increase the capacity of our database server? CMU SCS 15-415/615

  9. Idea #1: Buy a faster machine.

  10. Scaling Up • More disks. • More RAM. • Faster CPUs. • Use SSDs. Application Server Database Server (+) Requires no change to application. (+) Improvements are immediate. (-) Expensive! Diminishing Returns. (-) Single Point of Failure. CMU SCS 15-415/615

  11. Idea #2: Replicate database on multiple servers.

  12. Replication Read Request Application Server Database Server Replicas (+) Requires no change to application. (+) Parallelize read operations. (+) Improved fault tolerance. (-) Expensive! Diminishing Returns. (-) Writes limited to slowest node. CMU SCS 15-415/615

  13. Idea #3: Cache query results.

  14. Query Cache Check Cache Update Cache memcache Query Request Application Server Database Server Replicas (+) Reduce load on DBMS. (+) Fast API. (-) Extra roundtrip per query. (-) Requires application changes. (-) Doesn’t help write-heavy apps. CMU SCS 15-415/615

  15. Idea #4: Push SQL into stored procedures.

  16. Stored Procedures Stored Procedure defgetPage(request): # Process request EXEC SQL EXEC SQL # Process results if x == True: EXEC SQL else: EXEC SQL # Render HTML page return (html) BEGIN: EXEC SQL EXEC SQL if x == True: EXEC SQL else: EXEC SQL return (results) END; defgetPage(request): # Process request EXEC PROCEDURE # Render HTML page return (html) Database Server Replicas Application Code (+) Reduces network roundtrips. (+) Less lock contention. (+) Modularization. (-) Application logic in two places. (-) PL/SQL is not standardized. CMU SCS 15-415/615

  17. Idea #5: Shard database across multiple servers.

  18. Sharding / Partitioning LogicalPartitions Application Server Database Cluster (+) Parallelize all operations. (+) Much easier to add more hardware. (-) Most DBMSs don’t support this. (-) Joins are expensive. (-) Non-trivial to split database. CMU SCS 15-415/615

  19. ChristosTheGreekGodofDatabases.com • We want to scale out but writing a sharding layer is hard. • Some parts of ourapplication don’t needa full-featured DBMS. CMU SCS 15-415/615

  20. Idea #6: Give up ACID guarantees for scalability.

  21. Eventual Consistency Update Profile ? Application Servers DBMS Servers ? Master Replicas Get Profile CMU SCS 15-415/615

  22. Late-2000s (NoSQL) • NoSQL systems are able to scale horizontally right out of the box by giving traditional database features. CMU SCS 15-415/615

  23. ChristosTheGreekGodofDatabases.com • We need to process payments. • We don’t want to lose orders. • We need joins and ACID transactions.

  24. Strong Consistency Use Two-Phase Commit -$100 Nice Christos Pictures! Send Money +$100 Thanks! CMU SCS 15-415/615

  25. Idea #7: Keep guarantees, optimize for workload type.

  26. Early-2010s (NewSQL) • New DBMSs that can scale across multiple machines natively and provide ACID guarantees.

  27. Conclusion • RDBMS (Single-Node): • MySQL, Postgres • NoSQL (Multi-Node): • Key-Value, Documents, Graphs • NewSQL (Multi-Node): • Transaction Processing, MySQL Sharding CMU SCS 15-415/615

  28. What DBMS should my start-up use?

  29. Beyond the 15-415/615 • Christos is teaching 15-826 this fall: • Multimedia Databases and Data Mining • Send me an email if you’re interested in working on a database research project.

More Related