Cassandra in Depth

A presentation by 郭鹏. Topics: what Cassandra is (Bigtable, Dynamo), the Cassandra data model, the Cassandra write path, Cassandra's data storage files, and the Cassandra read path.


Presentation Transcript


  1. Cassandra in Depth • 郭鹏

  2. Topics • What is Cassandra • The Cassandra data model • The Cassandra write path • Cassandra's data storage files • The Cassandra read path

  3. What is Cassandra • Bigtable • Dynamo

  4. Proven • Cassandra is in use at Digg, Facebook, Twitter, Reddit, Rackspace, Cloudkick, Cisco, SimpleGeo, Ooyala, OpenX, and more companies that have large, active data sets. The largest production cluster has over 100 TB of data in over 150 machines.

  5. Fault Tolerant • Data is automatically replicated to multiple nodes for fault-tolerance. Replication across multiple data centers is supported. Failed nodes can be replaced with no downtime.

  6. Decentralized • Every node in the cluster is identical. There are no network bottlenecks. There are no single points of failure.

  7. You're in Control • Choose between synchronous or asynchronous replication for each update. Highly available asynchronous operations are optimized with features like Hinted Handoff and Read Repair.

  8. Rich Data Model • Allows efficient use for many applications beyond simple key/value.

  9. Elastic • Read and write throughput both increase linearly as new machines are added, with no downtime or interruption to applications.

  10. Durable • Cassandra is suitable for applications that can't afford to lose data, even when an entire data center goes down.

  11. The Cassandra write path

  12. Structure of an SSTable • Filter file • Index file • Data file

  13. Filter • The Filter file is used to quickly check whether a given key might exist in this SSTable, without reading the other files • Bloom filter
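To make the Bloom filter idea concrete, here is a minimal sketch in Python. It is not Cassandra's actual Filter file format: the class name, the bit-array size, the number of hashes, and the use of SHA-256 are all assumptions chosen for illustration. The point it shows is that a negative answer is definite, while a positive answer only means the key may be present.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter (illustrative only, not Cassandra's on-disk format)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        # num_bits and num_hashes are arbitrary values picked for this sketch.
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, key):
        # Derive num_hashes bit positions by salting the key with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(key)
        )


bf = BloomFilter()
for k in ["user:1", "user:2"]:
    bf.add(k)
```

A reader would call `bf.might_contain(key)` first and skip the SSTable entirely when it returns `False`.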

  14. Index • The Index file is used to find the exact position in the Data file of the column values for a given key
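The relationship between the Index file and the Data file can be sketched as a sorted list of (key, offset) pairs pointing into a byte buffer. This is a simplified model under stated assumptions: the function names `write_sstable` and `lookup`, and the 4-byte length prefix, are hypothetical and do not reflect Cassandra's real file layout.

```python
import bisect
import io
import struct


def write_sstable(rows):
    """Write rows (key -> bytes) in key order.

    Returns (data_bytes, index), where index is a sorted list of
    (key, offset) pairs. The layout is an assumption for illustration.
    """
    data = io.BytesIO()
    index = []
    for key in sorted(rows):
        index.append((key, data.tell()))          # remember where this row starts
        value = rows[key]
        data.write(struct.pack(">I", len(value)))  # 4-byte big-endian length prefix
        data.write(value)
    return data.getvalue(), index


def lookup(data, index, key):
    """Binary-search the index, then read the row at the recorded offset."""
    keys = [k for k, _ in index]
    i = bisect.bisect_left(keys, key)
    if i == len(keys) or keys[i] != key:
        return None                                # key is not in this SSTable
    offset = index[i][1]
    (length,) = struct.unpack_from(">I", data, offset)
    start = offset + 4
    return data[start:start + length]


data, index = write_sstable({"b": b"two", "a": b"one"})
```

Because the index holds only keys and offsets, it is much smaller than the data it points to, so a lookup touches the large Data file once, at a known position.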

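  15. Data • The Data file is where the actual data is stored. It holds more than the data being queried, however: it also stores index information for some of the columns belonging to each key.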
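One way to picture a row that carries its own column index is the sketch below. It is a toy model, not Cassandra's row format: `encode_row`, `read_column`, and the (offset, length) index entries are hypothetical names and structures introduced only to show why a per-key column index lets a reader pull out one column without scanning the whole row.

```python
def encode_row(columns):
    """Pack column values (name -> bytes) into one body.

    Returns (column_index, body), where column_index maps each column
    name to its (offset, length) inside body. Layout is an assumption.
    """
    body = bytearray()
    column_index = {}
    for name in sorted(columns):
        value = columns[name]
        column_index[name] = (len(body), len(value))
        body += value
    return column_index, bytes(body)


def read_column(column_index, body, name):
    """Read a single column by slicing the body at the indexed position."""
    entry = column_index.get(name)
    if entry is None:
        return None            # the row has no such column
    offset, length = entry
    return body[offset:offset + length]


column_index, body = encode_row({"name": b"guo", "city": b"beijing"})
```

Calling `read_column(column_index, body, "name")` slices three bytes out of the body directly instead of decoding every column in the row.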

  16. The Cassandra read path

  17. Q&A • Thank you, everyone
