Memory-efficient Data Management Policy for Flash-based Key-Value Store

Wang Jiangtao

2013-4-12


Outline

  • Introduction

  • Related work

  • Two works

    • BloomStore [MSST2012]

    • TBF [ICDE2013]

  • Summary


Key-Value Store

  • A KV store efficiently supports simple operations: key lookup & KV pair insertion

  • Example applications:

    • Online multi-player gaming

    • Data deduplication

    • Internet services


Overview of Key-Value Store

  • A KV store system should provide high access throughput (> 10,000 key lookups/sec)

  • Replaces traditional relational DBs thanks to its superior scalability & performance

    • Applications prefer a KV store for its simplicity and better scalability

  • Popular management (index + storage) solution for large volumes of records

    • Often implemented through an index structure mapping key → value


Challenge

  • To meet the high throughput demand, the performance of index access and KV pair (data) access is critical

    • Index access: locate the KV pair associated with a given key

    • KV pair access: get/put the actual KV pair

  • Available memory space limits the maximum number of KV pairs that can be stored

  • An in-RAM index structure addresses only the index-access performance demand


DRAM Must Be Used Efficiently

  • Assume 1 TB of data and 4 bytes of DRAM per key-value pair; the required index size then depends on the KV pair size:

    • 32 B per pair (data deduplication) => 125 GB of DRAM!

    • 168 B per pair (tweet) => 24 GB

    • 1 KB per pair (small image) => 4 GB

(Chart: index size in GB vs. per key-value pair size in bytes)
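A quick back-of-the-envelope check of these figures (a minimal sketch; it assumes decimal units, 1 TB = 10^12 bytes, which is what the slide's numbers imply):

```python
# DRAM needed for an in-RAM index at 4 index bytes per KV pair.
DATA_BYTES = 10**12          # 1 TB of stored key-value data
INDEX_BYTES_PER_PAIR = 4     # DRAM spent per pair's index entry

for name, pair_size in [("dedup chunk", 32), ("tweet", 168), ("small image", 1000)]:
    num_pairs = DATA_BYTES // pair_size
    index_gb = num_pairs * INDEX_BYTES_PER_PAIR / 10**9
    print(f"{name:12s} ({pair_size:4d} B/pair): {index_gb:6.1f} GB of DRAM")
```

The smaller the pairs, the more of them fit in 1 TB and the larger the index: dedup-sized 32 B pairs already demand 125 GB of DRAM.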


Existing Approach to Speed up Index & KV pair Accesses

  • Maintain the index structure in RAM to map each key to its KV pair on SSD

    • RAM capacity cannot scale linearly with flash capacity

  • Keep a minimal index structure in RAM while storing the rest of the index on SSD

    • The on-flash index structure must be designed carefully

      • Space is precious

      • Random writes are slow and bad for flash lifetime (wear-out)


Outline

  • Introduction

  • Related work

  • Two works

    • BloomStore [MSST2012]

    • TBF [ICDE2013]

  • Summary


Bloom Filter

  • A Bloom filter uses a bit array to represent a set and to test whether an element belongs to that set. Initially, every bit of the m-bit array is set to 0. The Bloom filter uses k mutually independent hash functions, each of which maps every element of the set into the range {1, …, m}. For any element x, the position hi(x) produced by the i-th hash function is set to 1 (1 ≤ i ≤ k). Note that if a position is set to 1 multiple times, only the first setting has an effect; subsequent ones change nothing.

  • False positive rate

  • Bloom filter parameter selection

    • Number of hash functions k, bit-array size m, number of elements n

    • Reducing the false positive rate
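A minimal Bloom filter sketch matching these definitions (illustrative only; deriving the k positions by salting a single SHA-256 hash is an assumption, not something from the slides). With n elements inserted, the false positive rate is approximately (1 − e^(−kn/m))^k, minimized by choosing k = (m/n)·ln 2.

```python
import hashlib

class BloomFilter:
    """An m-bit array with k hash functions; supports insert and lookup only."""
    def __init__(self, m: int, k: int):
        self.m, self.k = m, k
        self.bits = bytearray(m)          # one byte per bit, for clarity

    def _positions(self, key: str):
        # Derive k (approximately) independent positions by salting one hash.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def insert(self, key: str) -> None:
        for p in self._positions(key):
            self.bits[p] = 1              # setting an already-set bit changes nothing

    def lookup(self, key: str) -> bool:
        # True may be a false positive; False is always correct.
        return all(self.bits[p] for p in self._positions(key))
```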


FlashStore[VLDB2010]

  • Flash as a cache

  • Components

    • Write buffer

    • Read cache

    • Recency bit vector

    • Disk-presence Bloom filter

    • Hash table index

  • Cons

    • 6 bytes of RAM per key-value pair


SkimpyStash[SIGMOD2011]

  • Components

    • Write buffer

    • Hash table

      • A Bloom filter per bucket

      • Collision resolution via linked lists (chaining)

      • Each bucket holds a pointer to the head of its linked list on flash

  • Storing the linked lists on flash (sketched below)

    • Each pair holds a pointer to the earlier key of its bucket in the log

  • Cons

    • Multiple flash page reads for a key lookup

    • High garbage collection cost
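A simplified sketch of the chain-on-flash idea (a hypothetical layout for illustration; the real SkimpyStash differs in details such as compaction and per-bucket Bloom filters):

```python
from typing import Optional

NUM_BUCKETS = 16

# Append-only "flash log": each record is (key, value, prev_offset), where
# prev_offset points to the previous record of the same hash bucket.
log: list[tuple[str, str, Optional[int]]] = []

# RAM directory: only one head pointer per bucket (the "skimpy" part).
bucket_head: list[Optional[int]] = [None] * NUM_BUCKETS

def put(key: str, value: str) -> None:
    b = hash(key) % NUM_BUCKETS
    log.append((key, value, bucket_head[b]))  # chain the new record to the old head
    bucket_head[b] = len(log) - 1             # RAM keeps just the new head offset

def get(key: str) -> Optional[str]:
    offset = bucket_head[hash(key) % NUM_BUCKETS]
    while offset is not None:                 # each hop costs one flash page read
        k, v, prev = log[offset]
        if k == key:
            return v
        offset = prev
    return None
```

Walking the chain is exactly why a lookup may need multiple flash page reads, the main drawback listed above.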


Outline

  • Introduction

  • Related work

  • Two works

    • BloomStore [MSST2012]

    • TBF [ICDE2013]

  • Summary


BloomStore [MSST2012]


Introduction

  • Key lookup throughput is the bottleneck for data-intensive applications

  • One approach: keep a large hash table in RAM

  • Another: move the index structure to secondary storage (SSD)

    • Expensive random writes

    • High garbage collection cost

    • Larger storage footprint


BloomStore

  • BloomStore design goals

    • An extremely low amortized RAM overhead

    • High key lookup/insertion throughput

  • Components

    • KV pair write buffer

    • Active Bloom filter

      • Covers the KV pairs in the write buffer (one flash page's worth)

    • Bloom filter chain

      • Covers the KV pairs already on flash (many flash pages)

    • Key-range partition

      • Each partition occupies a flash "block"

(Figure: BloomStore architecture)


KV Store Operations

  • Key Lookup

    • Check the active Bloom filter first (pairs still in the write buffer)

    • Then probe the Bloom filter chain (pairs on flash)

    • Lookup cost: one flash page read per (possibly false) positive

Parallel lookup

  • Key Lookup

    • Read the same bit positions across the entire BF chain

    • Bit-wise AND the resultant rows (sketched below)

    • High read throughput

(Figure: probing all Bloom filters in the chain in parallel — the bit rows at positions h1(ei), …, hk(ei) are read across the chain and combined with a bit-wise AND; a surviving 1 in position j means filter j may contain ei)
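A sketch of this parallel probe (illustrative; it reuses the hypothetical BloomFilter class from earlier and assumes all filters in a chain share the same m and k, so a key's positions are computed once):

```python
def chain_lookup(chain: list["BloomFilter"], key: str) -> list[int]:
    """Return the indices of filters in the chain that may contain `key`.

    For each of the k hash positions we read a 'row' of bits across all
    filters, then bit-wise AND the rows; surviving 1s mark candidates.
    """
    candidates = [1] * len(chain)                 # start with every filter
    for pos in chain[0]._positions(key):          # same positions for all filters
        row = [bf.bits[pos] for bf in chain]      # one bit per filter in the chain
        candidates = [c & r for c, r in zip(candidates, row)]
    return [j for j, c in enumerate(candidates) if c]
```

Only the candidate filters' flash pages then need to be read; a false positive costs one wasted page read.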


KV Store Operations

  • KV pair insertion

  • KV pair update

    • Out-of-place: append a new key-value pair (sketched below)

  • KV pair deletion

    • Insert a null value for the key
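The append-only update and null-value deletion can be sketched as follows (a simplification on a flat log, ignoring partitions, Bloom filters, and garbage collection):

```python
TOMBSTONE = None   # a null value marks a deleted key

log = []           # append-only flash log of (key, value) records

def upsert(key, value):
    log.append((key, value))       # insert and update are the same append

def delete(key):
    log.append((key, TOMBSTONE))   # deletion = inserting a null value

def lookup(key):
    # Scan newest to oldest: the most recently appended record wins.
    for k, v in reversed(log):
        if k == key:
            return v               # may be TOMBSTONE if the key was deleted
    return None
```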


Experimental Evaluation

  • Experiment setup

    • 1 TB PCIe SSD / 32 GB SATA SSD

  • Workload


Experimental Evaluation

  • Effectiveness of the prefilter

    • RAM overhead per KV pair is 1.2 bytes

  • Linux Workload

  • Vx Workload


Experimental Evaluation

  • Lookup Throughput

    • Linux Workload

      • H = 96 (BF chain length)

      • m = 128 (size of each BF, in bits)

    • Vx Workload

      • H = 96 (BF chain length)

      • m = 64 (size of each BF, in bits)

      • A prefilter is used


TBF [ICDE2013]


Motivation

  • Using flash as an extension of the RAM cache is cost-effective

  • But the RAM-resident metadata needed to manage such a large cache becomes too large

    • The caching policy must therefore be memory-efficient

  • The replacement algorithm should achieve performance comparable to existing policies

  • The caching policy should be agnostic to the organization of data on the SSD


Defects of the existing policy

  • Recency-based caching algorithms

    • Clock or LRU

    • Require per-object access data structures and an index (high RAM overhead)



System view

  • DRAM buffer

    • An in-memory data structure (a Bloom filter) maintains access information

    • No special index is kept to locate a key-value pair

  • Key-value store

    • Provides an iterator operation to traverse the store

    • Write-through

(Figure: key-value cache prototype architecture, with the BF in DRAM)


Bloom Filter with Deletion (BFD)

  • BFD

    • Supports removing a key from the SSD cache

    • A Bloom filter extended with a delete operation

    • Deletion: reset the bits at the hash positions of a subset of the hash functions (sketched below)

(Figure: deleting X1 resets a subset of its bits)
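A sketch of this deletion rule (illustrative; it extends the hypothetical BloomFilter from earlier — clearing only d of the k bits limits collateral damage but, as the next slide notes, still introduces false negatives):

```python
import random

class BFD(BloomFilter):
    """Bloom filter with deletion: clear a subset of a key's bits."""
    def delete(self, key: str, d: int = 1) -> None:
        # Reset the bits for d of the k hash functions. The deleted key now
        # fails lookup (intended), but any other key sharing those bits will
        # also fail: deletions trade false positives for false negatives.
        positions = list(self._positions(key))
        for p in random.sample(positions, d):
            self.bits[p] = 0
```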


Bloom Filter with Deletion (BFD)

  • Flow chart

  • Tracks recency information

  • Cons

    • False positives

      • Pollute the cache

    • False negatives

      • Hurt the hit ratio


Two Bloom sub-Filters (TBF)

  • Flow chart

  • Drops many elements in bulk

    • Periodically "flip" the filters: clear one sub-filter and swap roles (sketched below)

  • Cons

    • Rarely-accessed objects are kept for up to an epoch

      • Polluting the cache

    • Longer traversal length per eviction
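A sketch of the two-sub-filter scheme (illustrative; the real TBF algorithm's eviction logic is more involved):

```python
class TBF:
    """Two Bloom sub-filters; recency decays by flipping them periodically."""
    def __init__(self, m: int, k: int):
        self.current = BloomFilter(m, k)    # records recent accesses
        self.previous = BloomFilter(m, k)   # accesses from the previous epoch

    def access(self, key: str) -> None:
        self.current.insert(key)            # mark the key as recently used

    def recently_used(self, key: str) -> bool:
        # Recent if seen in either epoch; unlike BFD, nothing is ever
        # deleted from a live filter, so there are no false negatives.
        return self.current.lookup(key) or self.previous.lookup(key)

    def flip(self) -> None:
        # Drop an epoch's worth of stale entries in bulk: discard the old
        # 'previous' filter and age 'current' into its place.
        self.previous = self.current
        self.current = BloomFilter(self.previous.m, self.previous.k)
```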


Traversal cost

  • Key-Value Store Traversal (scanning for eviction victims)

    • New objects unmarked on insertion

    • New objects marked on insertion

      • Produces longer stretches of marked objects to skip

      • False positives lengthen the scan further


Evaluation

  • Experiment setup

    • Two 1 TB 7200 RPM SATA disks in RAID-0

    • 80 GB Fusion-io ioDrive (PCIe x4)

    • A mixture of 95% read operations and 5% updates

    • Key-value pairs: 200 million (256 B each)

    • Bloom filter configuration

      • 4 bits per marked object

      • One byte per object in TBF

      • 3 hash functions


Outline

  • Introduction

  • Related work

  • Two works

    • BloomStore [MSST2012]

    • TBF [ICDE2013]

  • Summary


Summary

  • KV stores are particularly well suited to certain classes of applications

  • Flash improves KV store performance thanks to its faster access

  • Index structures need to be redesigned to minimize RAM footprint

  • Don't treat flash as just a disk replacement


Thank You!

