Integrity Coded Databases (ICDB): Ensuring Correctness and Freshness of Outsourced Databases

Ujwal Karki, Graduate Student. Advisor: Dr. Jyh-haw Yeh. Department of Computer Science, Boise State University.



  1. Integrity Coded Databases (ICDB): Ensuring Correctness and Freshness of Outsourced Databases
     Ujwal Karki, Graduate Student
     Advisor: Dr. Jyh-haw Yeh
     Department of Computer Science, Boise State University

  2. Cloud Computing
     Internet-based computing that provides shared processing, resources, and data on demand
     • “The worldwide public cloud services market is projected to grow 18 percent in 2017 to total $246.8 billion, up from $209.2 billion in 2016, according to Gartner, Inc.”
     • “74% of Tech Chief Financial Officers (CFOs) say cloud computing will have the most measurable impact on their business in 2017.”

  3. Cloud Computing
     Reasons for its growing popularity:
     • Avoids upfront infrastructure costs
     • Meets fluctuating and unpredictable business demand
     • Requires minimal management effort
     • Pay-as-you-go (PAYG) model (charges based on usage)

  4. Cloud Computing
     • When we think of security, we usually think of bad guys or outsiders
     • What if the “bad guy” is authorized to use the system?
     Risks from insiders:
     • The data owner has no control over the data once it is outsourced
     • The system administrator has total access to your data
       - Could steal, modify, or even destroy sensitive information
     • State-of-the-art technologies address external attacks
     • A high risk of insider attacks still exists

  5. Thesis Statement
     • Unauthorized modification of an outsourced database cannot be prevented
     • We present Integrity Coded Databases (ICDB), which allow the data owner to detect insiders’ modifications
     Integrity: a data item returned from the cloud should be the original, without unauthorized modification
     Integrity Code (IC): a code generated by applying a cryptographic function to the data, a unique serial number, and a secret key held by the data owner
     • Because a secret key is used, only the data owner can generate ICs and verify the integrity of the data

  6. Integrity Coded Databases (ICDB)
     [Figure: a standard database table next to the corresponding ICDB table]

  7. Integrity Coded Databases (ICDB)
     Key idea of ICDB:
     • Store Integrity Codes (ICs) in the database, alongside the data
     • Each query fetches the data along with the corresponding ICs and serial numbers
     • Verification: recompute each IC and compare it to the IC returned
     [Figure: ICDB table]
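The recompute-and-compare step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: HMAC-SHA-256 stands in for the IC generating function, and the key, the `|` separator, and the function names `generate_ic` and `verify` are hypothetical rather than taken from the thesis.

```python
import hashlib
import hmac


def generate_ic(data: str, serial: int, key: bytes) -> str:
    """Compute an integrity code over the data and its serial number."""
    message = f"{data}|{serial}".encode("utf-8")  # '|' separator is an assumption
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify(data: str, serial: int, returned_ic: str, key: bytes) -> bool:
    """Recompute the IC locally and compare it with the IC the cloud returned."""
    expected = generate_ic(data, serial, key)
    return hmac.compare_digest(expected, returned_ic)


key = b"owner-secret-key"            # hypothetical secret key of the data owner
ic = generate_ic("60117", 42, key)   # stored next to the data when outsourcing

print(verify("60117", 42, ic, key))  # True: data matches its IC
print(verify("99999", 42, ic, key))  # False: data was modified
```

Because the key never leaves the data owner, the cloud side can store and return the IC but cannot produce a matching IC for altered data.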

  8. ICDB Concerns
     • Correctness: returned data should be original, not forged
     • Freshness: returned data should be current and should not include previously removed data
     • Completeness: all data items satisfying the query conditions should be returned

  9. Related Works
     • File systems: hash values are stored in a secure local hash repository for verification
     • RDBMS: a signature per tuple for correctness and a separate table signature for completeness
     • Timestamps and a probabilistic method: freshness is checked by looking for deleted old fake tuples in the result
     • Signature chains for authenticating outsourced databases: focus on the completeness problem, for range queries only
     • Integrity protection using Authenticated Data Structure based techniques: focuses on Merkle Hash Tree based data integrity techniques
     - ICDB has an efficient technique for freshness, offers multiple schemes, and is transparent

  10. Focus of the Thesis
     • How effectively can ICDB ensure data correctness and freshness?
     • How much additional memory is required to store integrity codes?
       - each protected data item has a corresponding IC and serial
     • How much additional time and data is required for verification compared to a standard DB query?
       - additional computation is needed to regenerate each IC and compare it with the fetched IC

  11. ICDB Models
     • Basic ICDB Model
     • Dual Mode Verification (DMV) Model
       - built on top of the basic model
       - adds the ability to verify query results in aggregate

  12. Basic ICDB Model
     Entities:
     • Cloud Database Server (CDS)
     • ICDB Client
     Steps 1 & 2: Create the ICDB instance and outsource it to the CDS
     Steps 3 & 4: The Query Conversion component converts the SQL query to an ICDB query and forwards it to the CDS

  13. Basic ICDB Model
     Step 5: The CDS returns the query result plus ICs and serials
     Step 6: Integrity verification is performed and the result is presented to the user

  14. Dual Mode Verification (DMV) Model
     Entities:
     • Cloud Database Server (CDS)
     • ICDB Cloud Application (CA), in addition to the CDS
       - generates the Aggregate Integrity Code (AIC) required for AV
     • ICDB Client
     Aggregate Verification (AV) mode: verify the fetched data as a whole
       - reduces the network load
     Detailed Verification (DV) mode: verify each fetched tuple or attribute value

  15. Dual Mode Verification (DMV) Model
     Aggregate Verification
     Steps 1 & 2: Conversion of the DB to an ICDB instance is the same as in the basic model
     Steps 3 & 4: DMV converts the SQL query to ICDB queries Q1 and Q2
       - Q1 is sent to the CDS to fetch the SQL result plus serials
       - Q2 is sent to the CA, which acts as a delegate to fetch the ICs and then generate the AIC (steps 5 & 6)

  16. Dual Mode Verification (DMV) Model
     Aggregate Verification
     Step 7: The CA generates the AIC and sends it to the ICDB client
     Step 8: The ICDB client generates its own AIC from the result of Q1 (data plus serials)
       - aggregate verification succeeds if it matches the AIC from the CA

  17. Dual Mode Verification (DMV) Model
     Detailed Verification
     Steps 9 & 10: UI display
       - If AV fails, DV is optional
       - If DV is not chosen by the user, the entire dataset is discarded
     If DV is chosen by the user:
     Steps 11 & 12: The ICDB client uses Q2 to fetch all ICs for the data items fetched earlier by Q1

  18. Dual Mode Verification (DMV) Model
     Detailed Verification
     Steps 13 & 14: The results of Q1 and Q2 are forwarded for detailed verification
       - individual corrupted data items are presented to the user
     Note:
       - DV can detect which particular tuple or attribute has been altered
       - AV can detect whether any data in the whole dataset has been altered, but not which data in particular

  19. ICDB Construction
     Integrity Code (IC) for correctness, serial number for freshness
     IC(d) = G(m, k)
     • ‘d’ is the data item and ‘k’ is the secret key
     • ‘IC(d)’ is the integrity code of data ‘d’
     • ‘m’ is the collection of information related to ‘d’
     • ‘G’ is the IC generating function
     The pair <IC(d), s> is defined as an IC unit, where ‘s’ is the unique serial number.
       - The data owner keeps a list of serials that are revoked (invalid)
       - If a query returns data with a valid IC but a revoked serial, the data is not fresh
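The interplay between a valid IC and a revoked serial can be sketched as follows. This is a minimal illustration, assuming HMAC-SHA-256 for `G` and a plain in-memory set for the revoked serials; the `check` function and the message layout are hypothetical.

```python
import hashlib
import hmac


def G(m: str, k: bytes) -> str:
    """IC generating function (HMAC-SHA-256 is an assumption)."""
    return hmac.new(k, m.encode("utf-8"), hashlib.sha256).hexdigest()


def check(data, ic, serial, key, revoked_serials):
    """Classify a returned IC unit <ic, serial> for the given data."""
    if not hmac.compare_digest(G(f"{data}|{serial}", key), ic):
        return "incorrect"   # IC does not match: data was forged or altered
    if serial in revoked_serials:
        return "not fresh"   # IC is valid but the serial has been revoked
    return "valid"


key = b"owner-secret-key"    # hypothetical secret key
revoked = set()              # kept by the data owner

old_unit = (G("60117|1", key), 1)   # IC unit of the original value
new_unit = (G("62102|2", key), 2)   # value updated: new IC, new serial
revoked.add(1)                      # the old serial is revoked on update

print(check("62102", *new_unit, key, revoked))  # valid
print(check("60117", *old_unit, key, revoked))  # not fresh
print(check("70000", *new_unit, key, revoked))  # incorrect
```

The second case is the one correctness alone cannot catch: the old value still carries an IC that verifies, so only the revoked serial reveals that it is stale.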

  20. IC Generating Algorithms
     RSA
     • Uses the public key for encryption (or signature verification) and the private key for decryption (or signature generation)
     • ‘m’ is the message, (N, x) is the public key, and ‘y’ is the private key
       - In practice, RSA keys are typically 1024 to 4096 bits

  21. IC Generating Algorithms
     RSA
     • Supports homomorphic encryption (multiplication)
     • The homomorphic property allows operations on ciphertexts without the need for decryption

  22. IC Generating Algorithms
     Keyed-Hash Message Authentication Code (HMAC)
     • HMAC uses a cryptographic hash and a secret key
     • HMAC in this work uses SHA-128 as its digest
     • HMAC does not use the naive construction Hash(key || message)
     • HMAC is not subject to the length-extension attacks that affect a plain hash

  23. IC Generating Algorithms
     Cipher-based Message Authentication Code (CMAC)
     • CMAC is a technique for constructing a MAC from a block cipher
     • CMAC in this project uses AES-128 as its backing cipher
     • Fixes security vulnerabilities of CBC-MAC such as the variable-length attack
       - given the CBC-MAC pairs (m, t) and (m’, t’) of two messages, one can generate a third message m’’ whose CBC-MAC is also t’

  24. ICDB Granularity Schemes
     Based on the granularity level of integrity protection
     • One Code per Field (OCF)
       - Each entity attribute has an IC
       - If the data doesn’t match its IC, the field entry is invalid
     • One Code per Tuple (OCT)
       - Each tuple has an IC
       - If the data doesn’t match its IC, the tuple entry is invalid
     The scheme defines how data is grouped to construct an IC

  25. ICDB Granularity Schemes
     One Code per Field (OCF)
     For every field in a table, there must be fields to store the corresponding IC and serial
     [Table: ICDB-OCF table example]

  26. ICDB Granularity Schemes
     IC generating function for a data item ‘d’ in OCF:
     IC_OCF(d) = G(m, k) = G(T.A(e) + D + T.K(e) + A + T + s, k)
     ‘m’ is the collection of information related to data ‘d’, which includes:
       - the data ‘d’ itself, represented by T.A(e) (the value of field ‘A’ for entity ‘e’ in table ‘T’)
       - a delimiter ‘D’
       - the primary key value T.K(e) of the same entity ‘e’
       - the name of field ‘A’, the name of table ‘T’, and the unique serial ‘s’ assigned to the IC
     G(m, k) is the IC generating function using the data owner’s secret key ‘k’
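A small Python sketch of the OCF message layout is given below. It reads `+` as string concatenation and uses HMAC-SHA-256 for `G`; both are assumptions, as are the `|` delimiter and the helper names `ocf_message` and `ic_ocf`.

```python
import hashlib
import hmac


def ocf_message(value, key_value, attr, table, serial, delim="|"):
    """Build m = T.A(e) + D + T.K(e) + A + T + s as one string."""
    return f"{value}{delim}{key_value}{attr}{table}{serial}"


def ic_ocf(value, key_value, attr, table, serial, key):
    """IC_OCF(d) = G(m, k), with HMAC-SHA-256 assumed for G."""
    m = ocf_message(value, key_value, attr, table, serial).encode("utf-8")
    return hmac.new(key, m, hashlib.sha256).hexdigest()


key = b"owner-secret-key"  # hypothetical secret key

# Two employees with different salaries, each field with its own serial.
ic_1001 = ic_ocf(60117, 1001, "salary", "salaries", 1, key)
ic_1002 = ic_ocf(88958, 1002, "salary", "salaries", 2, key)

# Substitution: employee 1002's salary, IC, and serial are copied into
# employee 1001's row. The client recomputes with primary key 1001.
recomputed = ic_ocf(88958, 1001, "salary", "salaries", 2, key)
print(recomputed == ic_1002)  # False: the copied IC does not verify
```

Binding the primary key, field name, and table name into `m` is what makes a copied field fail verification in its new location.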

  27. ICDB Granularity Schemes
     One Code per Tuple (OCT)
     Every table has additional fields to store the corresponding IC and serial
     • In OCF, each IC is large relative to the data field it protects
     • In OCT, storage efficiency is improved by using a single IC per tuple

  28. ICDB Granularity Schemes
     One Code per Tuple (OCT)
     IC_OCT(d) = G(m, k) = G(T.A1(e) + D + T.A2(e) + … + T.An(e) + T + s, k) = G(d + T + s, k)
     • d = (T.A1(e) + D + T.A2(e) + … + T.An(e)) is the sequence of field values in a tuple
     • ‘D’ is the delimiter between field values
     • ‘m’ is the collection of information related to ‘d’
     • ‘T’ is the table name and ‘s’ is a unique serial number
     • G(m, k) is the IC generating function on ‘m’ using secret key ‘k’
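The tuple-level code can be sketched the same way. As before, this assumes `+` means string concatenation and that `G` is HMAC-SHA-256; the sample row and the names `oct_message` and `ic_oct` are illustrative only.

```python
import hashlib
import hmac


def oct_message(values, table, serial, delim="|"):
    """Build m = d + T + s, where d joins the field values with delimiter D."""
    d = delim.join(str(v) for v in values)
    return f"{d}{table}{serial}"


def ic_oct(values, table, serial, key):
    """IC_OCT(d) = G(m, k), with HMAC-SHA-256 assumed for G."""
    m = oct_message(values, table, serial).encode("utf-8")
    return hmac.new(key, m, hashlib.sha256).hexdigest()


key = b"owner-secret-key"  # hypothetical secret key
row = (1001, 60117, "1986-06-26", "1987-06-26")
stored_ic = ic_oct(row, "salaries", 7, key)

tampered = (1001, 99999, "1986-06-26", "1987-06-26")
print(ic_oct(row, "salaries", 7, key) == stored_ic)       # True
print(ic_oct(tampered, "salaries", 7, key) == stored_ic)  # False
```

A single altered field invalidates the whole tuple, which is the trade-off OCT makes for storing one IC per row instead of one per field.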

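  29. ICDB Conversion
     Schema Conversion
     The schema is converted for OCF or for OCT. For OCF, each protected column gets an IC column and a serial column:
     ALTER TABLE table_name ADD COLUMN column_name_ic TEXT NOT NULL AFTER column_name, ADD COLUMN column_name_serial TEXT NOT NULL AFTER column_name_ic;
     Here column_name_ic and column_name_serial stand for the protected column's name with the suffix '_ic' or '_serial' appended, i.e. CONCAT(column_name, '_ic') and CONCAT(column_name, '_serial').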
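One way to produce such statements for every protected column is sketched below. The helper `ocf_schema_statements` is hypothetical; it only builds SQL strings following the `_ic` / `_serial` naming shown on this slide and does not talk to a database.

```python
def ocf_schema_statements(table, columns):
    """Return one ALTER TABLE statement per protected column (OCF)."""
    statements = []
    for col in columns:
        statements.append(
            f"ALTER TABLE {table} "
            f"ADD COLUMN {col}_ic TEXT NOT NULL AFTER {col}, "
            f"ADD COLUMN {col}_serial TEXT NOT NULL AFTER {col}_ic;"
        )
    return statements


for stmt in ocf_schema_statements("salaries", ["emp_no", "salary"]):
    print(stmt)
```

Table and column names are interpolated directly, so this sketch is only suitable for identifiers taken from the trusted schema, not from user input.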

  30. ICDB Conversion
     Data Conversion
     • The data of each table is copied to a text file
     • An integrity code for each data item is created, based on the level of protection granularity, and saved in a new file
     • The converted data is then uploaded to the ICDB in the cloud

  31. ICDB Conversion
     Query Conversion
     • Schema conversion and data conversion are the same for both the basic ICDB model and the DMV model
     • Query conversion for the basic ICDB model differs from that for the DMV model
     • For both models, the ICDB query is derived from the standard SQL query based on the level of protection granularity

  32. ICDB Conversion
     Query Conversion for Basic-OCF (Algorithm A)
     Input: an SQL query
     Output: an OCF-Basic query
     [Diagram: the output query selects the requested attribute names, the key attribute names, and the attribute names in the condition, together with their ICs and serials]

  33. ICDB Conversion
     Query Conversion for Basic-OCF (example SELECT query conversion)
     Original SQL query:
     SELECT salary FROM salaries WHERE emp_no = 1001;
     Applying Algorithm A, the converted OCF-Basic query is:
     SELECT salary, emp_no, from_date, salary_IC, salary_serial, emp_no_IC, emp_no_serial, from_date_IC, from_date_serial FROM salaries WHERE emp_no = 1001;

  34. ICDB Conversion
     Query Conversion for Basic-OCF (example aggregate function query conversion)
     Original SQL query:
     SELECT SUM(salary) FROM salaries;
     Applying Algorithm A, the converted OCF-Basic query is:
     SELECT salary, emp_no, from_date, salary_IC, salary_serial, emp_no_IC, emp_no_serial, from_date_IC, from_date_serial FROM salaries;
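The two Basic-OCF conversions can be reproduced with a short Python sketch. It is not Algorithm A itself: it assumes the query has already been parsed into requested attributes, key attributes, and condition attributes, and the function `convert_ocf_basic` is a hypothetical name.

```python
def convert_ocf_basic(table, selected, key_attrs, cond_attrs, where=None):
    """Build an OCF-Basic SELECT from the parts of a parsed SQL query."""
    # Requested, key, and condition attributes, without duplicates, in order.
    attrs = list(dict.fromkeys([*selected, *key_attrs, *cond_attrs]))
    columns = list(attrs)
    for a in attrs:
        columns += [f"{a}_IC", f"{a}_serial"]  # IC and serial of each attribute
    query = f"SELECT {', '.join(columns)} FROM {table}"
    if where:
        query += f" WHERE {where}"
    return query + ";"


# SELECT salary FROM salaries WHERE emp_no = 1001;
print(convert_ocf_basic("salaries", ["salary"], ["emp_no", "from_date"],
                        ["emp_no"], where="emp_no = 1001"))

# SELECT SUM(salary) FROM salaries;  (the rows are fetched, not the sum)
print(convert_ocf_basic("salaries", ["salary"], ["emp_no", "from_date"], []))
```

For the aggregate query the converted statement fetches the underlying rows, so in this sketch the sum would be computed on the client after each fetched value has been verified.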

  35. ICDB Conversion
     Query Conversion for DMV-OCF
     Makes use of two different cloud services: the Cloud Database Server (CDS) and the ICDB Cloud Application (CA)
     • The CA requires only the ICs to generate an Aggregate Integrity Code (AIC)
     • The ICDB client fetches data plus serials from the CDS
     • Two different modes of verification: Aggregate Verification (AV) and Detailed Verification (DV)

  36. ICDB Conversion
     Query Conversion for DMV-OCF (Algorithm B)
     AV mode: issue two different queries, Q1 to the CDS and Q2 to the CA

  37. ICDB Conversion
     Query Conversion for DMV-OCF (Algorithm B)
     For detailed verification, the same Q2 is issued to the CDS to fetch the ICs

  38. ICDB Conversion
     Query Conversion for Basic-OCT (Algorithm C)
     Input: an SQL query
     Output: an OCT-Basic query

  39. ICDB Conversion
     Query Conversion for Basic-OCT (example SELECT query conversion)
     Original SQL query:
     SELECT salary FROM salaries WHERE emp_no = 1001;
     Applying Algorithm C, the converted OCT-Basic query is:
     SELECT emp_no, salary, from_date, to_date, salaries_IC, salaries_serial FROM salaries WHERE emp_no = 1001;
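For OCT the converted query has to fetch the whole tuple, because the IC covers every field. The sketch below builds such a query and then projects the originally requested attribute on the client. It assumes the table's column list is known to the client; `convert_oct_basic`, `project`, and the sample row are hypothetical.

```python
def convert_oct_basic(table, table_columns, where=None):
    """Build an OCT-Basic SELECT: all columns plus the tuple's IC and serial."""
    columns = [*table_columns, f"{table}_IC", f"{table}_serial"]
    query = f"SELECT {', '.join(columns)} FROM {table}"
    if where:
        query += f" WHERE {where}"
    return query + ";"


def project(rows, table_columns, requested):
    """Keep only the attributes the original SQL query asked for."""
    idx = [table_columns.index(a) for a in requested]
    return [tuple(row[i] for i in idx) for row in rows]


cols = ["emp_no", "salary", "from_date", "to_date"]
print(convert_oct_basic("salaries", cols, where="emp_no = 1001"))

# A made-up fetched row: four field values, then the IC and the serial.
fetched = [(1001, 60117, "1986-06-26", "1987-06-26", "ab12...", 7)]
print(project(fetched, cols, ["salary"]))  # [(60117,)]
```

In a full client the projection would run only after the tuple's IC has been recomputed and matched.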

  40. ICDB Conversion
     Query Conversion for DMV-OCT (Algorithm D)
     Input: an SQL query
     - AV mode: as in DMV-OCF, issue two different queries
     - Query Q1 goes to the CDS and Q2 to the CA

  41. ICDB Conversion
     Query Conversion for DMV-OCT (Algorithm D)
     - For detailed verification, the same Q2 is issued to the CDS to fetch the ICs

  42. AIC Generation and Verification
     • For ICs generated with RSA, the cloud application generates the AIC by homomorphic multiplication of the fetched ICs
     • The ICDB client can regenerate the AIC by applying the RSA algorithm to the aggregate data
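The multiplicative property behind the RSA-based AIC can be shown with textbook RSA and toy numbers. This is an illustration only: the key (N = 3233, x = 17, y = 2753) is far too small for real use, there is no padding, and the messages are small integers standing in for encoded data. It is not the thesis implementation.

```python
from math import prod

# Toy textbook RSA key: N = 61 * 53, public exponent x, private exponent y.
N, x, y = 3233, 17, 2753


def rsa_ic(m: int) -> int:
    """IC of one message: m^y mod N (a textbook RSA signature)."""
    return pow(m, y, N)


messages = [65, 123, 1001]              # encoded data items (made-up values)
ics = [rsa_ic(m) for m in messages]     # stored in the cloud next to the data

# Cloud application: multiply the fetched ICs, no key needed.
aic_cloud = prod(ics) % N

# ICDB client: multiply the returned data, then apply RSA once.
aic_client = pow(prod(messages) % N, y, N)
print(aic_cloud == aic_client)          # True

# If one data item is altered, the client's AIC no longer matches.
tampered = [65, 124, 1001]
print(pow(prod(tampered) % N, y, N) == aic_cloud)  # False
```

The client performs a single modular exponentiation for the whole result set, which is where the aggregate mode saves work and network traffic compared with checking every IC.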

  43. AIC Generation and Verification
     • For MACs, the cloud application aggregates all the ICs and hashes them (SHA-256) to generate the AIC
     • The ICDB client has to regenerate the ICs for all the returned data and then generate the AIC from the regenerated ICs
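A MAC-based AIC can be sketched with Python's standard library. The assumptions here are that the ICs are HMAC-SHA-256 values, that "aggregated" means concatenated in result order before hashing, and that both sides see the rows in the same order; the names `mac_ic` and `aggregate` are hypothetical.

```python
import hashlib
import hmac


def mac_ic(data: str, serial: int, key: bytes) -> bytes:
    """IC of one data item (HMAC-SHA-256 assumed)."""
    return hmac.new(key, f"{data}|{serial}".encode("utf-8"), hashlib.sha256).digest()


def aggregate(ics) -> str:
    """AIC: SHA-256 over all ICs, fed in result order."""
    h = hashlib.sha256()
    for ic in ics:
        h.update(ic)
    return h.hexdigest()


key = b"owner-secret-key"                      # hypothetical secret key
rows = [("60117", 1), ("62102", 2), ("66074", 3)]

stored_ics = [mac_ic(d, s, key) for d, s in rows]   # held in the cloud
aic_from_ca = aggregate(stored_ics)                 # CA sees only the ICs

# Client: regenerate every IC from the data and serials, then aggregate.
aic_client = aggregate(mac_ic(d, s, key) for d, s in rows)
print(aic_from_ca == aic_client)                    # True
```

Unlike the RSA case, the client still computes one MAC per returned item; the saving is that only a single AIC, rather than every IC, travels over the network.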

  44. Experimental Results and Analysis
     Hardware and software used:
     • Boise State University’s Onyx server
     • MySQL (MariaDB) with InnoDB as its database engine
     • Java SE 1.8
     • Bouncy Castle (an open source crypto library)
     • MySQL’s publicly available Employees sample database (v1.0.6)

  45. Experimental Results and Analysis
     Integrity Protection
     • Forgery attack: an attack that mutates or alters fields in a database
       -> an IC cannot be generated without the data owner's secret key
     • Substitution attack: an attack that modifies fields by substituting them with existing fields within the database
       -> all protected data is tied to its primary key and to other properties such as the attribute name and table name

  46. Experimental Results and Analysis
     • Old data attack: an attack that returns data (along with its correct IC) that was previously stored but is no longer in the database
       -> use of the ICRL (the data owner's list of revoked serials) prevents this
     • Tuple insertion/deletion attack: an attack that adds new rows or deletes existing rows
       -> since forgery is detected, insertion is easily detected
       -> deletion can be detected only with a completeness guarantee

  47. Experimental Results and Analysis Memory Penalty

  48. Experimental Results and Analysis
     Performance Penalty (Basic Model)
     SELECT query process:
     • The ICDB client converts the SELECT query to an ICDB SELECT query
     • The CDS executes the ICDB SELECT query and returns the result
     • The ICDB client verifies the returned result
     [Table: results of the SELECT * query on the Employees.salaries table]

  49. Experimental Results and Analysis
     The performance penalty rate can also be interpreted as a process rate
     • Process rate: how many MB of user data can be processed in one second?
     • The total fetched user data size excludes ICs and serials
     • Total process time = query conversion + query execution + query verification
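The process rate definition amounts to a single division. The sketch below spells it out; the function name and the sample numbers are made up for illustration and are not measurements from the thesis.

```python
def process_rate(user_data_mb, conversion_s, execution_s, verification_s):
    """MB of user data (without ICs and serials) processed per second."""
    total_time = conversion_s + execution_s + verification_s
    return user_data_mb / total_time


# Hypothetical run: 100 MB of user data, 0.5 s + 4.0 s + 5.5 s in total.
print(process_rate(100.0, 0.5, 4.0, 5.5))  # 10.0
```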

  50. Experimental Results and Analysis
     Performance Penalty (Basic Model)
     INSERT query process:
     • The ICDB client converts the SQL INSERT query to an ICDB INSERT query
       - this requires generating an IC and a unique serial for each data item to be protected
     • The CDS then executes the ICDB INSERT query
     [Table: results of the INSERT query on the Employees.salaries table]
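The INSERT conversion for the OCF scheme can be sketched as follows. Everything here is an assumption made for illustration: HMAC-SHA-256 as the IC function, a simple counter as the serial source, `%s` placeholders for a parameterized statement, and the helper names `ic_ocf` and `convert_insert`.

```python
import hashlib
import hmac
from itertools import count


def ic_ocf(value, key_value, attr, table, serial, key, delim="|"):
    """IC over value, primary key value, field name, table name, and serial."""
    m = f"{value}{delim}{key_value}{attr}{table}{serial}".encode("utf-8")
    return hmac.new(key, m, hashlib.sha256).hexdigest()


def convert_insert(table, row, key_attrs, key, serials):
    """Turn one row (a dict) into an ICDB INSERT statement and its values."""
    key_value = "".join(str(row[k]) for k in key_attrs)  # composite key, joined
    columns, values = [], []
    for attr, value in row.items():
        s = next(serials)                                # fresh serial per field
        columns += [attr, f"{attr}_IC", f"{attr}_serial"]
        values += [value, ic_ocf(value, key_value, attr, table, s, key), s]
    placeholders = ", ".join(["%s"] * len(values))
    sql = f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders});"
    return sql, values


key = b"owner-secret-key"   # hypothetical secret key
serials = count(1)          # hypothetical serial generator
row = {"emp_no": 1001, "salary": 60117, "from_date": "1986-06-26"}
sql, values = convert_insert("salaries", row, ["emp_no", "from_date"], key, serials)
print(sql)
print(len(values))  # 9: three entries per field
```

Each inserted field costs one MAC computation and one serial on the client, which is the extra work the INSERT measurements on this slide account for.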
