
CS 440 Database Management Systems


  1. CS 440 Database Management Systems: Key/Value Stores

  2. Key/Value Store • Stores and retrieves data in the form of key/value pairs. • Person: (Key, Value) • Keys are (generally) unique. • Does not define any schema (schema-less) • We build the schema on top of it. • Our program must manage the types and semantics of keys and values. • Person: Key = SSN, Value = (Name, Address)
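A minimal sketch of the idea on this slide, assuming an in-memory map stands in for the store. The class `TinyStore`, the helpers `encodePerson` / `decodePerson`, and the tab-separated value layout are hypothetical names and choices made for illustration; the point is only that the store sees bytes while the program decides what they mean.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for a key/value store: it only sees keys and bytes.
class TinyStore {
    private final Map<String, byte[]> pairs = new HashMap<>();

    void put(String key, byte[] value) {
        pairs.put(key, value.clone());
    }

    // Returns null when the key is absent.
    byte[] get(String key) {
        byte[] value = pairs.get(key);
        return value == null ? null : value.clone();
    }

    // The "schema" lives in the program: value = name TAB address (an assumed layout).
    static byte[] encodePerson(String name, String address) {
        return (name + "\t" + address).getBytes(StandardCharsets.UTF_8);
    }

    static String[] decodePerson(byte[] value) {
        return new String(value, StandardCharsets.UTF_8).split("\t", 2);
    }

    public static void main(String[] args) {
        TinyStore store = new TinyStore();
        // Key = SSN, Value = (Name, Address), as on the slide; the data is made up.
        store.put("123-45-6789", encodePerson("Ada", "1 Main St"));
        String[] person = decodePerson(store.get("123-45-6789"));
        System.out.println(person[0] + " lives at " + person[1]);
    }
}
```

Nothing in `TinyStore` prevents a caller from storing a value in some other layout under another key, which is exactly the responsibility the slide assigns to the program.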

  3. Key/Value Stores • No (or little) SQL support • SQL belongs to higher levels • Storage engine • Can replace the storage engine of an RDBMS • Example: MySQL and its storage engines, such as InnoDB

  4. Why key/value stores? • No need for a complex schema or complex queries • An advantage when the information needs are simple • Example: look up a user account by user-id. • None of the RDBMS overheads, so they are faster • We will talk more about these overheads later. • Real-life applications • Google Account was using Berkeley DB until recently. • Amazon customer preferences and shopping carts.

  5. Why key/value stores? • They give a good idea of what is going on inside a database management system. • They have their own limitations • We may need more structure • We may need high-quality data • We may need complex query models • …

  6. Implementations • Local key/value store • Berkeley DB (BDB) • Distributed key/value store • Amazon Dynamo • We’ll talk more about them later.

  7. Berkeley DB • A key/value store library • Used through an API • Linked into your program • Supports various access paths • Open source

  8. A Bit of History • Started as a hashing library in 1991 • Released with 4.4BSD in 1992 • Hash table and B-tree • Seltzer and Bostic started Sleepycat Software in 1996 • Open source with a dual license • Oracle acquired Sleepycat in 2006 • Kept the dual license

  9. Berkeley DB Product Family • The original library, written in C • APIs in various languages: Java, C, C++ • Berkeley DB Java Edition • Pure Java implementation of the library • Java API • Berkeley DB XML • Based on the original library • Persistent API • More complex query model

  10. Berkeley DB Product Family

  11. Key and value • Uninterpreted byte strings • Whatever you like them to be • Can store multiple types of objects in the same table • Employee (name, birth-date, position) • Student (name, birth-date, GPA) • Different tables in an RDBMS, same table in BDB
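One way to picture "multiple types of objects in the same table" is a type tag written in front of each value. The sketch below is an illustration under assumptions, not Berkeley DB code: `MixedTable`, the tag values, and the field layout are hypothetical, and a `HashMap` of byte arrays stands in for the table.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: two record types share one "table" of uninterpreted bytes.
class MixedTable {
    static final byte EMPLOYEE = 1; // assumed tag values, chosen by the program
    static final byte STUDENT = 2;

    private final Map<String, byte[]> table = new HashMap<>();

    void putEmployee(String key, String name, String birthDate, String position) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.writeByte(EMPLOYEE);
            out.writeUTF(name);
            out.writeUTF(birthDate);
            out.writeUTF(position);
            table.put(key, buf.toByteArray());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    void putStudent(String key, String name, String birthDate, double gpa) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.writeByte(STUDENT);
            out.writeUTF(name);
            out.writeUTF(birthDate);
            out.writeDouble(gpa);
            table.put(key, buf.toByteArray());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // The table cannot tell the types apart; the program reads the tag and decodes.
    String describe(String key) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(table.get(key)));
            byte tag = in.readByte();
            String name = in.readUTF();
            String birthDate = in.readUTF();
            if (tag == EMPLOYEE) return "Employee " + name + ", born " + birthDate + ", " + in.readUTF();
            if (tag == STUDENT) return "Student " + name + ", born " + birthDate + ", GPA " + in.readDouble();
            throw new IllegalStateException("unknown record tag " + tag);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        MixedTable t = new MixedTable();
        t.putEmployee("e1", "Ann", "1980-01-01", "Engineer");
        t.putStudent("s1", "Bob", "2001-05-05", 3.5);
        System.out.println(t.describe("e1"));
        System.out.println(t.describe("s1"));
    }
}
```

In an RDBMS the two record shapes would be two tables with declared columns; here the only thing separating them is a convention the program has to follow on every read and write.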

  12. Keys • You can define their sort order • Enables sequential access • Duplicate keys! • Possible but discouraged • You will lose a lot of functionality
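Why a definable sort order matters can be sketched with the standard library alone. The example assumes a `TreeMap` as a stand-in for an ordered database and needs Java 9 or later for `Arrays.compareUnsigned`; `KeyOrderSketch`, `BYTEWISE`, `NUMERIC`, and `scan` are hypothetical names. It does not model duplicate keys: a `TreeMap` simply overwrites the old value.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.TreeMap;

// Hypothetical sketch: the same byte-string keys scanned under two sort orders.
class KeyOrderSketch {
    // Byte-by-byte comparison, treating each byte as unsigned.
    static final Comparator<byte[]> BYTEWISE = Arrays::compareUnsigned;

    // Application-defined order: interpret each key as a decimal integer.
    static final Comparator<byte[]> NUMERIC =
            Comparator.comparingInt(k -> Integer.parseInt(new String(k, StandardCharsets.UTF_8)));

    // Inserts the keys, then returns them in the order a sequential scan would visit them.
    static List<String> scan(Comparator<byte[]> order, String... keys) {
        TreeMap<byte[], String> db = new TreeMap<>(order);
        for (String k : keys) {
            db.put(k.getBytes(StandardCharsets.UTF_8), "value-" + k);
        }
        List<String> visited = new ArrayList<>();
        for (byte[] k : db.keySet()) {
            visited.add(new String(k, StandardCharsets.UTF_8));
        }
        return visited;
    }

    public static void main(String[] args) {
        System.out.println(scan(BYTEWISE, "9", "10", "2")); // prints [10, 2, 9]
        System.out.println(scan(NUMERIC, "9", "10", "2"));  // prints [2, 9, 10]
    }
}
```

Because keys are uninterpreted bytes, only the program knows that "10" should come after "9"; supplying the comparison is how that knowledge reaches the store.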

  13. Environment: the counterpart of a database in an RDBMS • A directory that contains related databases (tables)

      import java.io.File;
      import com.sleepycat.db.*;

      EnvironmentConfig envConfig_ = new EnvironmentConfig();
      envConfig_.setAllowCreate(true);          // create the environment if it does not exist
      envConfig_.setInitializeCache(true);      // set up the shared cache
      envConfig_.setTransactional(false);       // no transaction subsystem
      envConfig_.setInitializeLocking(false);   // no locking subsystem
      envConfig_.setPrivate(true);              // used by a single process only
      envConfig_.setCacheSize(1024 * 1024);     // 1 MB cache
      File envHome_ = new File("/home/schoolDB/");
      Environment env_ = new Environment(envHome_, envConfig_);

  14. Database: the counterpart of a table in an RDBMS • A collection of tuples (key/value pairs) • A database is referred to through a database handle • All method calls use this handle • A file may contain one or more databases

  15. Opening/Creating a Database

      // database settings
      DatabaseConfig dbConfig_ = new DatabaseConfig();
      // needed when the database does not exist yet
      dbConfig_.setAllowCreate(true);
      // primary access path
      dbConfig_.setType(DatabaseType.HASH);
      dbConfig_.setCacheSize(4 * 1024 * 1024);
      // file name: student.db, database name: student
      Database db = env_.openDatabase(null, "student.db", "student", dbConfig_);

  16. Storing Tuples

      public class Student implements Cloneable {
          private String name;
          …
          public String getName() { return name; }
          public void setName(String name) { this.name = name; }
          ….
      }

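  17. Values to Byte Strings • Read from / write to a byte stream

      import com.sleepycat.bind.tuple.TupleBinding;
      import com.sleepycat.bind.tuple.TupleInput;
      import com.sleepycat.bind.tuple.TupleOutput;

      public class StudentTupleBinding extends TupleBinding {
          // Object -> byte string
          public void objectToEntry(Object o, TupleOutput out) {
              Student std = (Student) o;
              out.writeString(std.getName());
              …
          }
          // Byte string -> object; fields must be read in the order they were written
          public Object entryToObject(TupleInput in) {
              Student std = new Student();
              std.setName(in.readString());
              …
              return std;
          }
      }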

  18. Inserting Tuples

      DatabaseEntry key = new DatabaseEntry();
      DatabaseEntry data = new DatabaseEntry();
      int keyvalue = 1;
      // Convert the key to a byte string
      IntegerBinding.intToEntry(keyvalue, key);
      // Convert the value to a byte string; entry is the Student object being stored
      StudentTupleBinding binding = new StudentTupleBinding();
      binding.objectToEntry(entry, data);
      db.put(null, key, data);

  19. Retrieving Tuples

      int start = 1;
      DatabaseEntry key = new DatabaseEntry();
      IntegerBinding.intToEntry(start, key);
      DatabaseEntry data = new DatabaseEntry();
      int next = start;
      // duplicate keys!
      while (db.get(null, key, data, null) == OperationStatus.SUCCESS) {
          // Convert from byte string to object
          Student std = (Student) binding.entryToObject(data);
          ….
      }

  20. Access Paths • B-tree • Fast access • Hash table • Fast access for read-only data • Heap • Efficient use of disk space • …
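The choice of access path shows up most clearly on range queries. The sketch below is a rough illustration using only the standard library: `TreeMap` (a red-black tree, not a B-tree) stands in for any ordered access path and `HashMap` for a hash access path. `AccessPathSketch`, `rangeOrdered`, and `rangeHashed` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: the same range query against an ordered and a hashed structure.
class AccessPathSketch {
    // Ordered access path: locate the first key in range, then walk keys in order.
    static List<Integer> rangeOrdered(TreeMap<Integer, String> index, int lo, int hi) {
        return new ArrayList<>(index.subMap(lo, true, hi, true).keySet());
    }

    // Hash access path: keys have no order, so every entry has to be examined.
    static List<Integer> rangeHashed(Map<Integer, String> index, int lo, int hi) {
        List<Integer> hits = new ArrayList<>();
        for (Integer k : index.keySet()) {
            if (k >= lo && k <= hi) {
                hits.add(k);
            }
        }
        Collections.sort(hits);
        return hits;
    }

    public static void main(String[] args) {
        TreeMap<Integer, String> ordered = new TreeMap<>();
        Map<Integer, String> hashed = new HashMap<>();
        for (int id = 1; id <= 10; id++) {
            ordered.put(id, "student-" + id);
            hashed.put(id, "student-" + id);
        }
        System.out.println(rangeOrdered(ordered, 3, 6)); // prints [3, 4, 5, 6]
        System.out.println(rangeHashed(hashed, 3, 6));   // prints [3, 4, 5, 6]
    }
}
```

Both calls return the same answer; the difference is the work done to get it. Single-key lookups are cheap in either structure.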

  21. Cursors • Represent positions in a database • Iterative (forward and backward) scan

      // Configuration info
      Cursor cursor = db.openCursor(null, null);
      DatabaseEntry key = new DatabaseEntry();
      DatabaseEntry data = new DatabaseEntry();
      while (cursor.getNext(key, data, null) == OperationStatus.SUCCESS) {
          // do something
      }
      cursor.close();
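The notion of a cursor as a position that moves forward and backward can be sketched without the library. This is an assumption-laden illustration, not the Berkeley DB implementation: `CursorSketch` is a hypothetical class over a `TreeMap`, and its `getNext` / `getPrev` return the pair itself (or `null` at the end) instead of a status code.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: a cursor is a remembered position in an ordered set of pairs.
class CursorSketch {
    private final TreeMap<Integer, String> db;
    private Integer position; // key the cursor points at; null means not yet positioned

    CursorSketch(TreeMap<Integer, String> db) {
        this.db = db;
    }

    // Moves one pair forward; returns null (and stays put) when there is no next pair.
    Map.Entry<Integer, String> getNext() {
        Map.Entry<Integer, String> e = (position == null) ? db.firstEntry() : db.higherEntry(position);
        if (e != null) {
            position = e.getKey();
        }
        return e;
    }

    // Moves one pair backward; returns null (and stays put) when there is no previous pair.
    Map.Entry<Integer, String> getPrev() {
        Map.Entry<Integer, String> e = (position == null) ? db.lastEntry() : db.lowerEntry(position);
        if (e != null) {
            position = e.getKey();
        }
        return e;
    }

    public static void main(String[] args) {
        TreeMap<Integer, String> db = new TreeMap<>();
        db.put(1, "Ann");
        db.put(2, "Bob");
        db.put(3, "Cid");
        CursorSketch cursor = new CursorSketch(db);
        Map.Entry<Integer, String> pair;
        while ((pair = cursor.getNext()) != null) {
            System.out.println(pair.getKey() + " -> " + pair.getValue());
        }
    }
}
```

The loop in `main` has the same shape as the `getNext` loop on the slide: keep advancing until the call reports that nothing is left.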

  22. Secondary Index • Stored in another BDB database • No duplicate keys in the primary database!

      class sKeyCreator implements SecondaryKeyCreator {
          public boolean createSecondaryKey(SecondaryDatabase secDb,
                                            DatabaseEntry keyEntry,
                                            DatabaseEntry dataEntry,
                                            DatabaseEntry resultEntry) {
              // set resultEntry to the secondary key value
              return true; // a secondary key was created for this record
          }
      }

  23. Secondary Indexes

      // new database
      SecondaryConfig sIndexConfig = new SecondaryConfig();
      sIndexConfig.setType(DatabaseType.HASH);
      sIndexConfig.setTransactional(false);
      // Duplicates are frequently required for secondary databases.
      sIndexConfig.setSortedDuplicates(true);
      sKeyCreator keyCreator = new sKeyCreator();
      sIndexConfig.setKeyCreator(keyCreator);
      // Perform the actual open
      SecondaryDatabase sIndex = env_.openSecondaryDatabase(null, "senindex.db", null, db, sIndexConfig);
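What a secondary index maintains can be sketched with two maps. This is a simplified, in-memory illustration under assumptions, not the library's mechanism: `SecondaryIndexSketch`, `put`, and `idsFor` are hypothetical, the primary value is reduced to a student name, and a `Function` plays the role of the key creator.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.function.Function;

// Hypothetical sketch: a primary map plus a second map from derived key to primary keys.
class SecondaryIndexSketch {
    private final Map<Integer, String> primary = new HashMap<>();          // id -> student name
    private final Map<String, TreeSet<Integer>> index = new TreeMap<>();   // secondary key -> ids
    private final Function<String, String> keyCreator;                     // derives the secondary key

    SecondaryIndexSketch(Function<String, String> keyCreator) {
        this.keyCreator = keyCreator;
    }

    // Every write to the primary also updates the index, so the two never disagree.
    void put(int id, String name) {
        String old = primary.put(id, name);
        if (old != null) {
            String oldKey = keyCreator.apply(old);
            TreeSet<Integer> ids = index.get(oldKey);
            ids.remove(id);
            if (ids.isEmpty()) {
                index.remove(oldKey);
            }
        }
        index.computeIfAbsent(keyCreator.apply(name), k -> new TreeSet<>()).add(id);
    }

    // Several primary keys may share one secondary key, hence the sorted set of ids.
    List<Integer> idsFor(String secondaryKey) {
        return new ArrayList<>(index.getOrDefault(secondaryKey, new TreeSet<>()));
    }

    public static void main(String[] args) {
        SecondaryIndexSketch students = new SecondaryIndexSketch(name -> name);
        students.put(1, "Ann");
        students.put(2, "Bob");
        students.put(3, "Ann");
        System.out.println(students.idsFor("Ann")); // prints [1, 3]
    }
}
```

The sketch also suggests why the primary must have unique keys while the secondary usually allows duplicates: each index entry has to point back to exactly one primary record, but many records can share a name.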

  24. Closing Database & Environment • Releasing resources

      // close the secondary index before its primary database, and the environment last
      sIndex.close();
      db.close();
      env_.close();
