
introduction to cassandra




Presentation Transcript


  1. introduction to cassandra eben hewitt september 29, 2010 web 2.0 expo new york city

  2. @ebenhewitt • director, application architecture • at a global corp • focus on SOA, SaaS, Events • i wrote this

  3. agenda • context • features • data model • api

  4. “nosql”  “big data” • mongodb • couchdb • tokyo cabinet • redis • riak • what about? • Poet, Lotus, Xindice • they’ve been around forever… • rdbms was once the new kid…

  5. innovation at scale • google bigtable (2006) • consistency model: strong • data model: sparse map • clones: hbase, hypertable • amazon dynamo (2007) • O(1) dht • consistency model: client tune-able • clones: riak, voldemort • cassandra ~= bigtable + dynamo

  6. proven • Facebook stores 150TB of data on 150 nodes • used at Twitter, Rackspace, Mahalo, Reddit, Cloudkick, Cisco, Digg, SimpleGeo, Ooyala, OpenX, others

  7. cap theorem • consistency • all clients have same view of data • availability • writeable in the face of node failure • partition tolerance • processing can continue in the face of network failure (crashed router, broken network)

  8. daniel abadi: PACELC

  9. write consistency • read consistency

  10. agenda • context • features • data model • api

  11. cassandra properties • tuneably consistent • very fast writes • highly available • fault tolerant • linear, elastic scalability • decentralized/symmetric • ~12 client languages • Thrift RPC API • ~automatic provisioning of new nodes • O(1) dht • big data

  12. write op

  13. Staged Event-Driven Architecture • A general-purpose framework for high concurrency & load conditioning • Decomposes applications into stages separated by queues • Adopts a structured approach to event-driven concurrency
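  A minimal sketch of the SEDA idea in Java (illustrative only, not Cassandra's actual stage classes, and written against modern java.util.concurrent for brevity): each stage owns an incoming queue drained by a fixed pool of workers, so overload shows up as queue depth and back-pressure rather than unbounded thread growth.

    import java.util.concurrent.*;
    import java.util.function.Consumer;

    // one SEDA stage: a bounded queue drained by a fixed pool of workers
    class Stage<E> {
        private final BlockingQueue<E> queue = new LinkedBlockingQueue<E>(10_000);
        private final ExecutorService workers;

        Stage(int threads, final Consumer<E> handler) {
            workers = Executors.newFixedThreadPool(threads);
            for (int i = 0; i < threads; i++) {
                workers.submit(() -> {
                    while (!Thread.currentThread().isInterrupted()) {
                        try {
                            handler.accept(queue.take());   // process one event
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    }
                });
            }
        }

        // the previous stage (or a client thread) enqueues events here;
        // blocking on a full queue is what does the "load conditioning"
        void enqueue(E event) throws InterruptedException { queue.put(event); }
    }

    // e.g. a hypothetical "mutation" stage handing work to later stages:
    // Stage<String> mutation = new Stage<>(8, row -> { /* append to commit log, update memtable */ });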

  14. instrumentation

  15. data replication • configurable replication factor • replica placement strategy: rack unaware = SimpleStrategy, rack aware = OldNetworkTopologyStrategy, data center shard = NetworkTopologyStrategy
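  For reference, a sketch of how these settings looked per keyspace in the 0.6-era storage-conf.xml (keyspace name and factor are made up; 0.6 still used the older class names RackUnawareStrategy / RackAwareStrategy / DatacenterShardStrategy, which 0.7 renamed to the Simple / OldNetworkTopology / NetworkTopology strategies above):

    <Keyspace Name="Keyspace1">
      <!-- how many replicas of each row to keep -->
      <ReplicationFactor>3</ReplicationFactor>
      <!-- "rack unaware" placement; swap in RackAwareStrategy or DatacenterShardStrategy -->
      <ReplicaPlacementStrategy>org.apache.cassandra.locator.RackUnawareStrategy</ReplicaPlacementStrategy>
      <EndPointSnitch>org.apache.cassandra.locator.EndPointSnitch</EndPointSnitch>
    </Keyspace>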

  16. partitioner smack-down
  Random: key distribution determined by token • system will use MD5(key) to distribute data across nodes • even distribution of keys from one CF across ranges/nodes • 'scrabble' distribution
  Order Preserving: lexicographical ordering • required for range queries • scan over rows like cursor in index • can specify the token for this node to use
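  To make the Random partitioner concrete, a small stand-alone illustration (not Cassandra's actual token code) of how MD5 turns a row key into a position on the ring:

    import java.math.BigInteger;
    import java.security.MessageDigest;

    public class TokenDemo {
        public static void main(String[] args) throws Exception {
            // RandomPartitioner idea: hash the row key with MD5 and use the result
            // as the row's position on the token ring; similar keys land far apart
            for (String key : new String[] { "eben", "ebem", "alison" }) {
                byte[] digest = MessageDigest.getInstance("MD5").digest(key.getBytes("UTF-8"));
                System.out.println(key + " -> " + new BigInteger(1, digest));
            }
        }
    }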

  17. agenda • context • features • data model • api

  18. structure

  19. keyspace • ~= database • typically one per application • some settings are configurable only per keyspace

  20. column family • group records of similar kind • not same kind, because CFs are sparse tables • ex: • User • Address • Tweet • PointOfInterest • HotelRoom

  21. think of cassandra as row-oriented • each row is uniquely identifiable by key • rows group columns and super columns

  22. column family
  key123: user=eben, nickname=The Situation
  key456: user=alison, icon=, n=42

  23. json-like notation User { 123 : { email: alison@foo.com, icon: }, 456 : { email: eben@bar.com, location: The Danger Zone} }

  24. 0.6 example
  $ cassandra -f
  $ bin/cassandra-cli
  cassandra> connect localhost/9160
  cassandra> set Keyspace1.Standard1['eben']['age']='29'
  cassandra> set Keyspace1.Standard1['eben']['email']='e@e.com'
  cassandra> get Keyspace1.Standard1['eben']['age']
  => (column=616765, value=3239, timestamp=1282170655390000)

  25. a column has 3 parts • name • byte[] • determines sort order • used in queries • indexed • value • byte[] • you don’t query on column values • timestamp • long (clock) • last write wins conflict resolution
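  As a sketch in the deck's Thrift Java style (the 0.6 struct with a plain long timestamp; the 0.7-beta code later in the deck wraps the timestamp in a Clock):

    byte[] name  = "email".getBytes("UTF-8");            // sorted by the comparator, used in queries, indexed
    byte[] value = "e@e.com".getBytes("UTF-8");          // opaque bytes; never queried on
    long timestamp = System.currentTimeMillis() * 1000;  // client-supplied; last write wins
    Column col = new Column(name, value, timestamp);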

  26. column comparators • byte • utf8 • long • timeuuid • lexicaluuid • <pluggable> • ex: lat/long

  27. super column super columns group columns under a common name

  28. super column family
  <<SCF>> PointOfInterest
  10017: <<SC>> Central Park { desc=Fun to walk in. } • <<SC>> Empire State Bldg { phone=212.555.11212, desc=Great view from 102nd floor! }
  85255: <<SC>> Phoenix Zoo

  29. super column family
  PointOfInterest {
    key: 85255 {
      Phoenix Zoo { phone: 480-555-5555, desc: They have animals here. },
      Spring Training { phone: 623-333-3333, desc: Fun for baseball fans. },
    }, //end phx
    key: 10019 {
      Central Park { desc: Walk around. It's pretty. },
      Empire State Building { phone: 212-777-7777, desc: Great view from 102nd floor. }
    } //end nyc
  }
  (callouts on the slide label the super column family, row keys, super columns, columns, and note the flexible schema)

  30. about super column families • sub-column names in a SCF are not indexed • top level columns (SCF Name) are always indexed • often used for denormalizing data from standard CFs

  31. agenda • context • features • data model • api

  32. slice predicate • data structure describing columns to return • SliceRange • start column name • finish column name (can be empty to stop on count) • reverse • count (like LIMIT)
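  A sketch of building one against the Thrift API used elsewhere in the deck (0.6-era byte[] signatures; 0.7 switches to ByteBuffer):

    // return up to 100 columns of a row, newest-sorted first, over all column names
    SliceRange range = new SliceRange(
        new byte[0],    // start column name; empty = from the first column
        new byte[0],    // finish column name; empty = stop on count
        true,           // reversed
        100);           // count, like LIMIT
    SlicePredicate predicate = new SlicePredicate();
    predicate.setSlice_range(range);

    // or name the wanted columns explicitly instead of giving a range:
    // predicate.setColumn_names(Arrays.asList("name".getBytes(UTF8), "email".getBytes(UTF8)));

    // used with the read calls on the next slide, e.g.:
    // List<ColumnOrSuperColumn> results = client.get_slice(key, parent, predicate, CL);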

  33. read api
  • get() : Column. get the Col or SC at the given ColPath:
    COSC cosc = client.get(key, path, CL);
  • get_slice() : List<ColumnOrSuperColumn>. get Cols in one row, specified by SlicePredicate:
    List<ColumnOrSuperColumn> results = client.get_slice(key, parent, predicate, CL);
  • multiget_slice() : Map<key, List<CoSC>>. get slices for a list of keys, based on SlicePredicate:
    Map<byte[], List<ColumnOrSuperColumn>> results = client.multiget_slice(rowKeys, parent, predicate, CL);
  • get_range_slices() : List<KeySlice>. returns multiple Cols according to a range; range is startkey, endkey, starttoken, endtoken:
    List<KeySlice> slices = client.get_range_slices(parent, predicate, keyRange, CL);

  34. write api
  • insert:
    client.insert(userKeyBytes, parent, new Column("band".getBytes(UTF8), "Funkadelic".getBytes(), clock), CL);
  • batch_mutate:
    void batch_mutate(map<byte[], map<String, List<Mutation>>>, CL)
  • remove:
    void remove(byte[], ColumnPath column_path, Clock, CL)

  35. batch_mutate
  //create param
  Map<byte[], Map<String, List<Mutation>>> mutationMap = new HashMap<byte[], Map<String, List<Mutation>>>();
  //create Cols for Muts
  Column nameCol = new Column("name".getBytes(UTF8), "Funkadelic".getBytes("UTF-8"), new Clock(System.nanoTime()));
  ColumnOrSuperColumn nameCosc = new ColumnOrSuperColumn();
  nameCosc.setColumn(nameCol);
  Mutation nameMut = new Mutation();
  nameMut.column_or_supercolumn = nameCosc;
  //also phone, etc: phoneMut is built the same way from a phone Column
  Map<String, List<Mutation>> muts = new HashMap<String, List<Mutation>>();
  List<Mutation> cols = new ArrayList<Mutation>();
  cols.add(nameMut);
  cols.add(phoneMut);
  muts.put(CF, cols);
  //outer map key is a row key; inner map key is the CF name
  mutationMap.put(rowKey.getBytes(), muts);
  //send to server
  client.batch_mutate(mutationMap, CL);

  36. raw thrift: for masochists only • pycassa (python) • fauna (ruby) • hector (java) • pelops (java) • kundera (JPA) • hectorSharp (C#)

  37. ? what about… SELECT WHERE ORDER BY JOIN ON GROUP

  38. rdbms: domain-based model: what answers do I have? • cassandra: query-based model: what questions do I have?

  39. SELECT WHERE • cassandra is an index factory • <<cf>> User: key = UserID, cols = username, email, birth date, city, state • How to support this query? SELECT * FROM User WHERE city = 'Scottsdale' • Create a new CF called UserCity: <<cf>> UserCity: key = city, cols = IDs of the users in that city • Also uses the Valueless Column pattern (sketch below)
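  A sketch of what that write path looks like in the deck's Thrift style (userId, UTF8, CL and the clock are assumed from earlier slides; the CF names match the slide):

    Clock clock = new Clock(System.nanoTime());

    // 1. the user row itself: User[userId]['city'] = 'Scottsdale'
    client.insert(userId.getBytes(UTF8),
        new ColumnParent("User"),
        new Column("city".getBytes(UTF8), "Scottsdale".getBytes(UTF8), clock),
        CL);

    // 2. the index row: UserCity['Scottsdale'][userId] = (empty value)
    //    the column *name* carries the data; the value stays empty (valueless column)
    client.insert("Scottsdale".getBytes(UTF8),
        new ColumnParent("UserCity"),
        new Column(userId.getBytes(UTF8), new byte[0], clock),
        CL);

    // SELECT * FROM User WHERE city = 'Scottsdale' then becomes:
    // get_slice on UserCity['Scottsdale'] for the user IDs, then multiget_slice on User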

  40. SELECT WHERE pt 2 • Use an aggregate key state:city: { user1, user2} • Get rows between AZ: & AZ; for all Arizona users • Get rows between AZ:Scottsdale & AZ:Scottsdale1 for all Scottsdale users
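  A sketch of that range read, assuming the Order-Preserving partitioner and 0.6-style string row keys (later versions use byte keys): ';' is the character right after ':', so 'AZ:' .. 'AZ;' brackets every key with the 'AZ:' prefix; 'UserStateCity' is a hypothetical CF keyed by the aggregate state:city key.

    KeyRange arizona = new KeyRange(1000);      // max rows to return
    arizona.setStart_key("AZ:");
    arizona.setEnd_key("AZ;");                  // ';' sorts just after ':'

    List<KeySlice> azUsers = client.get_range_slices(
        new ColumnParent("UserStateCity"),      // hypothetical aggregate-key CF
        predicate, arizona, CL);

    // one city only: start_key = "AZ:Scottsdale", end_key = "AZ:Scottsdale1"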

  41. ORDER BY • rows are placed according to their partitioner: Random (MD5 of key) or Order-Preserving (actual key) • rows are sorted by key, regardless of partitioner • columns are sorted according to CompareWith or CompareSubcolumnsWith

  42. rdbms

  43. cassandra

  44. is cassandra a good fit? • you need really fast writes • you need durability • you have lots of data: > GBs, >= three servers • your app is evolving: startup mode, fluid data structure, loose domain data ("points of interest") • your programmers can deal with: documentation, complexity, consistency model, change, visibility tools • your operations can deal with: hardware considerations, can move data, JMX monitoring

  45. thank you! @ebenhewitt
