Some key-value stores using log-structure


  1. LevelDB Riak Some key-value stores using log-structure Zhichao Liang frankey0207@gmail.com

  2. Outline • Why log structure? • Riak: log-structure hash table • Rethinkdb: log-structure b-tree • Leveldb: log-structure merge tree • Conclusion

  4. Log Structure • A log-structured file system is a file system design first proposed in 1988 by John K. Ousterhout and Fred Douglis. • It is designed for high write throughput: all updates to data and metadata are written sequentially to a continuous stream, called a log. • Conventional file systems, by contrast, lay out files with great care for spatial locality and make in-place changes to their data structures.
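
The idea of writing every update sequentially to a log can be illustrated with a minimal sketch. It is not taken from the slides: the JSON record format and the function names `append_update` and `replay` are assumptions chosen for illustration, and an in-memory buffer stands in for a file on disk.

```python
import io
import json


def append_update(log, key, value):
    """Append one update record to the end of the log; nothing is overwritten."""
    log.seek(0, io.SEEK_END)
    record = json.dumps({"key": key, "value": value}) + "\n"
    log.write(record.encode("utf-8"))


def replay(log):
    """Rebuild the current state by reading the log front to back; later records win."""
    state = {}
    log.seek(0)
    for line in log:
        record = json.loads(line)
        state[record["key"]] = record["value"]
    return state


log = io.BytesIO()  # stands in for a file on disk (assumption)
append_update(log, "a", 1)
append_update(log, "b", 2)
append_update(log, "a", 3)  # an update is just another appended record
current = replay(log)
print(current)
```

An in-place design would instead seek to the old record for "a" and overwrite it, which is the random-write pattern the log avoids.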

  5. Log Structure for SSD • Random writes degrade system performance and shorten the lifetime of an SSD. • Log structure is natively SSD-friendly! [Figure: block states (free, data, new data, erased) as data 1 through data 4 are updated on Magnetic Disk, SSD, and RAM]

  6. Outline • Why log structure? • Riak: log-structure hash table • Rethinkdb: log-structure b-tree • Leveldb: log-structure merge tree • Conclusion

  7. Riak ? • Riak is an open source, highly scalable, fault-tolerant distributed database. • Supported core features: - operates in highly distributed environments - no single point of failure - highly fault-tolerant - scales simply and intelligently - high data availability - low cost of operations

  8. Bitcask • A Bitcask instance is a directory, and only one operating system process will open that Bitcask for writing at a given time. • The active file is only written by appending, which means that sequential writes do not require disk seeking.

  9. Hash Index: keydir • A keydir is simply a hash table that maps every key in a Bitcask to a fixed-size structure giving the file, offset and size of the most recently written entry for that key.
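
The append-only active file and the keydir described above can be sketched roughly as follows. This is a toy under stated assumptions, not Bitcask's actual code or on-disk format: the class name `TinyCask`, the file name `0.data`, and the choice to store only raw values are all hypothetical simplifications (a real format would also need to record the key alongside each value so the index can be rebuilt).

```python
import os
import tempfile


class TinyCask:
    """Toy append-only store with an in-memory hash index (the keydir)."""

    def __init__(self, directory):
        self.file_id = 0  # hypothetical id of the single active file
        self.path = os.path.join(directory, "0.data")
        self.active = open(self.path, "ab")  # append-only: no seeking on write
        self.keydir = {}  # key -> (file_id, offset, size) of the newest entry

    def put(self, key, value):
        offset = self.active.tell()
        self.active.write(value)
        self.active.flush()
        # Point the key at the entry just written; older entries stay on disk.
        self.keydir[key] = (self.file_id, offset, len(value))

    def get(self, key):
        _file_id, offset, size = self.keydir[key]
        with open(self.path, "rb") as f:
            f.seek(offset)  # one seek, then one read
            return f.read(size)

    def close(self):
        self.active.close()


with tempfile.TemporaryDirectory() as directory:
    store = TinyCask(directory)
    store.put(b"k", b"old")
    store.put(b"k", b"newer")  # appended after "old", which becomes dead data
    latest = store.get(b"k")
    entry = store.keydir[b"k"]
    store.close()

print(latest, entry)
```

Because the keydir always points at the most recent entry, a read costs a single hash lookup plus a single disk seek, while stale entries accumulate in the file until they are cleaned up.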

  10. Merge • The merge process iterates over all non-active files and produces as output a set of data files containing only the “live” or latest versions of each present key.
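
A rough sketch of the merge step, assuming each non-active file can be modeled as a list of `(key, value)` records in write order. The function name `merge` and the use of `None` as a deletion marker are assumptions for illustration, not details given in the slides.

```python
def merge(immutable_files):
    """Keep only the latest live value per key.

    immutable_files: list of files, oldest first. Each file is a list of
    (key, value) records in write order. A value of None marks a deleted
    key (an assumed convention for this sketch).
    """
    latest = {}
    for records in immutable_files:
        for key, value in records:
            latest[key] = value  # later records overwrite earlier ones
    # Drop deleted keys so the output holds live data only.
    return [(key, value) for key, value in latest.items() if value is not None]


old_files = [
    [("a", "1"), ("b", "1")],
    [("a", "2"), ("b", None), ("c", "1")],
]
merged = merge(old_files)
print(merged)
```

Five input records shrink to two here, which is how merging reclaims the space taken by obsolete versions.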

  11. Outline • Why log structure? • Riak: log-structure hash table • Rethinkdb: log-structure b-tree • Leveldb: log-structure merge tree • Conclusion

  12. RethinkDB ? • RethinkDB is a persistent, industrial-strength key-value store with full support for the Memcached protocol. • Powerful technology: - Linear scaling across cores - Fine-grained durability control - Instantaneous recovery on power failure • Supported core features: - Atomic increment/decrement - Values up to 10MB in size - Multi-GET support - Up to one million transactions per second on commodity hardware

  13. Installation & usage • RethinkDB works on modern 64-bit distributions of Linux: Ubuntu 10.04.1 x86_64, Ubuntu 10.10 x86_64, Red Hat Enterprise Linux 5 x86_64, CentOS 5 x86_64, SUSE Linux 10 • Default installation path: /usr/bin/rethinkdb-1.0 • Running the rethinkdb server: • ./rethinkdb-1.0 -f /u01/rethinkdb_data • ./rethinkdb-1.0 -f /u01/rethinkdb_data -c 4 -p 11500 • ./rethinkdb-1.0 -f /u01/rethinkdb_data -f /u03/rethinkdb_data -c 4 -p 11500

  14. The methodology • Firstly, the lack of mechanical parts makes random reads on SSDs highly efficient! • Secondly, random writes trigger more erases, making these operations expensive and decreasing the drive lifetime! • RethinkDB takes an append-only approach to storing data, pioneered by log-structured file systems! What are the consequences of append-only?

  15. Append-only consequences • Benefits: Data Consistency, Hot Backups, Instantaneous Recovery, Easy Replication, Lock-Free Concurrency, Live Schema Changes, Database Snapshots • Costs: 1) eliminating data locality requires a larger number of disk accesses 2) a large amount of data quickly becomes obsolete in an environment with a heavy insert or update workload

  16. Append-only B-tree [Figure: successive versions of Page 1, Page 2, and Page 3 (keys 5, 9, 15, 19) appended to the Data File]
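
One way to picture an append-only B-tree is a copy-on-write update: the changed leaf is appended as a new page, and so is a new root pointing at it. The sketch below is a simplified illustration, not RethinkDB's implementation. It assumes a fixed two-level tree, models the data file as a Python list, and all names (`data_file`, `append_page`, `build`, `lookup`, `update`) are hypothetical.

```python
import bisect

data_file = []  # append-only list of pages; the index acts as the page offset


def append_page(page):
    """Write a page at the end of the data file and return its offset."""
    data_file.append(page)
    return len(data_file) - 1


def build():
    """Create a two-level tree holding the keys 5, 9, 15 and 19."""
    left = append_page({"leaf": True, "items": {5: "five", 9: "nine"}})
    right = append_page({"leaf": True, "items": {15: "fifteen", 19: "nineteen"}})
    return append_page({"leaf": False, "seps": [15], "children": [left, right]})


def lookup(root_no, key):
    page = data_file[root_no]
    while not page["leaf"]:
        i = bisect.bisect_right(page["seps"], key)
        page = data_file[page["children"][i]]
    return page["items"].get(key)


def update(root_no, key, value):
    """Append a new leaf and a new root; existing pages are never modified."""
    root = data_file[root_no]
    i = bisect.bisect_right(root["seps"], key)
    old_leaf = data_file[root["children"][i]]
    new_leaf_no = append_page({"leaf": True, "items": {**old_leaf["items"], key: value}})
    children = list(root["children"])
    children[i] = new_leaf_no
    return append_page({"leaf": False, "seps": list(root["seps"]), "children": children})


root_v1 = build()
root_v2 = update(root_v1, 9, "NINE")
print(lookup(root_v2, 9), lookup(root_v1, 9), len(data_file))
```

The old root still resolves to the old value, which suggests why snapshots come cheaply with this layout. The superseded pages are the obsolete data that has to be reclaimed later.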

  17. Outline • Why log structure? • Riak: log-structure hash table • Rethinkdb: log-structure b-tree • Leveldb: log-structure merge tree • Conclusion

  18. LevelDB ? • LevelDB is a fast key-value storage library written at Google that provides an ordered mapping from string keys to string values. • Supported core features: - Data is stored sorted by key - Multiple changes can be made in one atomic batch - Users can create a transient snapshot to get a consistent view of data - Data is automatically compressed using the Snappy compression library

  19. Installation & usage • LevelDB works with snappy, which is a compression/decompression library. • It is a library, not a database server! • Download snappy from http://code.google.com/p/snappy/ • cd snappy-1.0.4 • ./configure && make && make install • svn checkout http://leveldb.googlecode.com/svn/trunk/ leveldb-read-only • cd leveldb-read-only • make && cp libleveldb.a /usr/local/lib && cp -r include/leveldb /usr/local/include

  20. Log-structure merge tree • LevelDB
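
As a rough illustration of the general log-structured merge idea, the sketch below buffers writes in memory, flushes them as sorted runs, and merges runs during compaction. It is a toy under stated assumptions and does not reproduce LevelDB's actual levels, file formats, or API; the class name `TinyLSM` and the parameter `memtable_limit` are hypothetical.

```python
import bisect


class TinyLSM:
    """Toy LSM tree: an in-memory table plus immutable sorted runs."""

    def __init__(self, memtable_limit=2):
        self.memtable = {}
        self.memtable_limit = memtable_limit
        self.runs = []  # newest first; each run is a sorted list of (key, value)

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        """Write the memtable out as one sorted, immutable run."""
        if self.memtable:
            self.runs.insert(0, sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for run in self.runs:  # newest run first, so the latest value wins
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None

    def compact(self):
        """Merge all runs into one, keeping only the newest value per key."""
        merged = {}
        for run in reversed(self.runs):  # oldest first so newer values overwrite
            merged.update(run)
        self.runs = [sorted(merged.items())] if merged else []


db = TinyLSM(memtable_limit=2)
db.put("b", "1")
db.put("a", "1")  # memtable is full: flushed as the first sorted run
db.put("a", "2")
db.put("c", "1")  # flushed as a second, newer run
value_a = db.get("a")
db.compact()
print(value_a, db.runs)
```

Every flush and compaction writes a whole sorted run sequentially, so disk writes stay log-like while the data remains ordered by key.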

  21. Outline • Why log structure? • Riak: log-structure hash table • Rethinkdb: log-structure b-tree • Leveldb: log-structure merge tree • Conclusion

  22. Conclusion • Log-structure
