Bigtable: A Distributed Storage System for Structured Data

0256803 高睿鴻

Introduction
  • Petabytes of data across thousands of commodity servers.
  • Goals: wide applicability, scalability, high performance, and high availability.
  • Products using it: Google Analytics, Google Earth, Personalized Search, …
Data Model
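  • A sparse, distributed, persistent, multidimensional sorted map.
  • Each value is indexed by a row key, a column key, and a timestamp, and is stored as an uninterpreted array of bytes.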
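
The indexing scheme above can be illustrated with a minimal in-memory sketch. This is not Bigtable's API: `ToyTable`, `put`, and `get` are hypothetical names, and the row and column keys are illustrative only. It only shows how a (row key, column key, timestamp) triple selects one version of a value.

```python
from typing import Dict, Optional, Tuple

# Hypothetical toy model: (row key, column key, timestamp) -> uninterpreted bytes.
Cell = Tuple[str, str, int]


class ToyTable:
    def __init__(self) -> None:
        self._cells: Dict[Cell, bytes] = {}

    def put(self, row: str, column: str, timestamp: int, value: bytes) -> None:
        """Store one version of a cell."""
        self._cells[(row, column, timestamp)] = value

    def get(self, row: str, column: str,
            timestamp: Optional[int] = None) -> Optional[bytes]:
        """Return the newest version at or before `timestamp` (newest overall if None)."""
        versions = [
            (ts, val)
            for (r, c, ts), val in self._cells.items()
            if r == row and c == column and (timestamp is None or ts <= timestamp)
        ]
        if not versions:
            return None
        return max(versions, key=lambda v: v[0])[1]


table = ToyTable()
table.put("com.cnn.www", "contents:", 3, b"<html>v3")
table.put("com.cnn.www", "contents:", 5, b"<html>v5")
latest = table.get("com.cnn.www", "contents:")
older = table.get("com.cnn.www", "contents:", timestamp=4)
```

Here `latest` holds the version written at timestamp 5, while `older` falls back to the version written at timestamp 3.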
Chubby
  • Highly available and persistent distributed lock service.
  • Ensure that there is at most one active master at any time.
  • Discover tablet servers and finalize tablet server deaths.
  • Store Bigtable schema information.
  • Store access control lists.
Tablet
  • Table consists of a set of tablets.
  • Tablet contains all data associated with a row range.
  • Each tablet is approximately 100 to 200 MB in size by default.
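
Because each tablet owns a contiguous row range, finding the tablet for a row key reduces to a search over sorted range boundaries. The sketch below assumes each tablet is identified by its inclusive end row key; the names `tablet_end_keys`, `tablet_names`, and `find_tablet` are hypothetical, and the real system locates tablets through its own metadata rather than a local list.

```python
import bisect
from typing import List

# Assumption: tablets are listed in row order, each identified by the last
# (inclusive) row key it holds. "\uffff" stands in for "end of table".
tablet_end_keys: List[str] = ["g", "p", "\uffff"]
tablet_names: List[str] = ["tablet-0", "tablet-1", "tablet-2"]


def find_tablet(row_key: str) -> str:
    """Return the name of the tablet whose row range contains `row_key`."""
    index = bisect.bisect_left(tablet_end_keys, row_key)
    # Clamp so that any key past the last boundary maps to the last tablet.
    index = min(index, len(tablet_names) - 1)
    return tablet_names[index]


first = find_tablet("apple")
boundary = find_tablet("g")
middle = find_tablet("google")
last = find_tablet("zebra")
```

With these boundaries, `"apple"` and `"g"` fall in the first tablet, `"google"` in the second, and `"zebra"` in the third.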
SSTable
  • SSTable file format is used internally to store Bigtable data.
  • Provides a persistent, ordered immutable map from keys to values.
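  • Disk vs. memory: with the block index loaded into memory, a lookup needs a single disk seek; optionally, a whole SSTable can be mapped into memory so that lookups need no disk access.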
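
The idea of an immutable, ordered map with an in-memory block index can be sketched as follows. This is a toy under stated assumptions, not the real SSTable file format: `ToySSTable` is a hypothetical class, blocks hold a fixed number of entries instead of a fixed number of bytes, and tuples stand in for on-disk blocks.

```python
import bisect
from typing import Dict, List, Optional, Tuple

Entry = Tuple[str, bytes]


class ToySSTable:
    """Immutable, sorted key -> value map split into small blocks."""

    def __init__(self, items: Dict[str, bytes], block_size: int = 2) -> None:
        ordered = sorted(items.items())
        # "On-disk" blocks: tuples, so they cannot be modified after creation.
        self._blocks: Tuple[Tuple[Entry, ...], ...] = tuple(
            tuple(ordered[i:i + block_size])
            for i in range(0, len(ordered), block_size)
        )
        # In-memory index: the last key stored in each block.
        self._index: List[str] = [block[-1][0] for block in self._blocks]

    def get(self, key: str) -> Optional[bytes]:
        # Binary search the in-memory index, then read exactly one block.
        block_no = bisect.bisect_left(self._index, key)
        if block_no == len(self._blocks):
            return None  # key sorts after every stored key
        for k, v in self._blocks[block_no]:
            if k == key:
                return v
        return None


sst = ToySSTable({"b": b"2", "a": b"1", "c": b"3"})
found = sst.get("c")
missing = sst.get("bb")
```

Only the index lives in memory here; each `get` touches a single block, which mirrors the single-seek lookup described above.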
Compactions
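  • The memtable grows as write operations execute.
  • Minor compaction: when the memtable reaches a threshold size, it is frozen, converted to an SSTable, and written to GFS (old memtable → SSTable → GFS).
  • Merging compaction: reads the contents of a few SSTables and the memtable and writes out a new SSTable (old SSTables + memtable → new SSTable).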
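
The two compaction steps can be sketched with plain dictionaries. This is a simplified illustration, not Bigtable's implementation: `ToyTablet` and its methods are hypothetical, the threshold counts entries rather than bytes, SSTables are kept in memory instead of GFS, and the merge here folds all SSTables into one, whereas a merging compaction in general only needs to cover a few of them.

```python
from typing import Dict, List, Optional


class ToyTablet:
    def __init__(self, memtable_limit: int = 2) -> None:
        self.memtable: Dict[str, bytes] = {}
        self.sstables: List[Dict[str, bytes]] = []  # oldest first
        self.memtable_limit = memtable_limit

    def write(self, key: str, value: bytes) -> None:
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.minor_compaction()

    def minor_compaction(self) -> None:
        """Freeze the memtable into a new sorted SSTable and start a fresh one."""
        if self.memtable:
            self.sstables.append(dict(sorted(self.memtable.items())))
            self.memtable = {}

    def merging_compaction(self) -> None:
        """Merge the SSTables and the memtable into one new SSTable."""
        merged: Dict[str, bytes] = {}
        for sstable in self.sstables:  # oldest first, so newer values win
            merged.update(sstable)
        merged.update(self.memtable)   # the memtable holds the newest values
        self.sstables = [dict(sorted(merged.items()))]
        self.memtable = {}

    def read(self, key: str) -> Optional[bytes]:
        """Check the memtable first, then SSTables from newest to oldest."""
        if key in self.memtable:
            return self.memtable[key]
        for sstable in reversed(self.sstables):
            if key in sstable:
                return sstable[key]
        return None


tablet = ToyTablet(memtable_limit=2)
tablet.write("a", b"1")
tablet.write("b", b"2")   # reaches the limit and triggers a minor compaction
tablet.write("a", b"9")   # newer value stays in the memtable
sstables_before_merge = len(tablet.sstables)
tablet.merging_compaction()
```

After the merge, reads are served from a single SSTable, and the newer value for `"a"` has replaced the older one.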
Performance
  • In the random read benchmark, a tablet server executes approximately 1200 reads per second.
  • Per-server throughput drops significantly as the number of tablet servers grows from 1 to 50.
Performance
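  • The drop is caused by load imbalance in multi-server configurations.
  • Other processes contend for CPU and network.
  • Random-read aggregate throughput increases only about 100-fold for a 500-fold increase in the number of servers.
  • Cause: a 64 KB block is transferred over the network for every 1000-byte read.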
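
A short calculation shows why random reads are network-bound. It uses only the figures quoted on the performance slides (1200 reads per second, a 64 KB block per read, 1000 bytes used per read); the variable names are illustrative.

```python
READS_PER_SECOND = 1200      # random reads per tablet server
BLOCK_BYTES = 64 * 1024      # 64 KB block fetched over the network per read
VALUE_BYTES = 1000           # bytes actually used from each block

bytes_fetched_per_second = READS_PER_SECOND * BLOCK_BYTES
bytes_used_per_second = READS_PER_SECOND * VALUE_BYTES

# Network traffic per tablet server, in MB/s (1 MB = 1024 * 1024 bytes).
fetched_mb_per_second = bytes_fetched_per_second / (1024 * 1024)

# Fraction of the transferred data that the reader actually needs.
useful_fraction = bytes_used_per_second / bytes_fetched_per_second

print(f"{fetched_mb_per_second:.0f} MB/s fetched, "
      f"{useful_fraction:.1%} of it useful")
```

Each server therefore pulls about 75 MB/s across the network while using under 2% of it, so shared network links saturate as servers are added.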
Real Application
  • Google Analytics
    • JavaScript, raw click table, summary table
  • Google Earth
    • Satellite imagery, imagery table
  • Personalized search
    • Web search, images, news
Conclusion
  • Google gained a substantial amount of flexibility from designing its own data model for Bigtable.
  • Control over the implementation means bottlenecks and inefficiencies can be removed as they arise.