Secure and Privacy-Preserving Database Services in the Cloud

Presentation Transcript

  1. Secure and Privacy-Preserving Database Services in the Cloud • Divy Agrawal, Amr El Abbadi, Shiyuan Wang • University of California, Santa Barbara • {agrawal, amr, sywang}@cs.ucsb.edu • ICDE’2013 Tutorial

  2. Cloud Computing • Successful paradigm for computing and storage • Features • Pay per use • No up-front cost for deployment • Scalability • Elasticity • Software as a Service (SaaS) • Platform as a Service (PaaS) • Infrastructure as a Service (IaaS)

  3. Adopting the Cloud • Emails • Collaboration • Administrative apps • Conferencing software • Education • Early adopters are mainly low-risk apps with less sensitive data [Figure label: Sensitive Data]

  4. Cloud – A Tempting Attack Target • Why the cloud? • Ubiquitous access to consolidated data • Shared infrastructure, economies of scale • A lot of small and medium businesses • Why attack? • Target one service provider, attack multiple companies • Financial gain from trading sensitive information

  5. Cloud Provides Novel Attack Opportunities • Co-residence attack [Ristenpart et al. CCS’09] • Adversary: non-provider-affiliated malicious parties • Map and identify location of target VM • Place attacker VM co-resident with target VM • Cross-VM side-channel attacks (due to sharing of physical resources), e.g., estimating the number of visitors to a page, or keystroke attacks for password retrieval • Signature wrapping attack [Somorovsky et al. CCSW’11] • Compromise the control interface by capturing a SOAP message • Manipulate the SOAP message with arbitrary XML fragments • Use an XML signature vulnerability to pass authentication • Take control of a victim’s account

  6. Amazon’s Best Practices for Cloud Security and Privacy • Defenses [AWS security] • Dedicated instances, virtual private cloud, isolated network and traffic • Firewall and access control • Identity and access management, multi-factor authentication • Accesses checked and audited • Rely on clients for access control • Recommend using data encryption and encrypted file system • Concerns • Co-residence attacks • Side channel attacks • Network based attacks • Unauthorized accesses • Insider attacks • Privacy violation • Future vulnerabilities? • Best effort defense is not sufficient

  7. A Barrier to Conquer • Security and privacy – a barrier to cloud adoption • Data (sensitive data) – a key concern • We need to solve data security and privacy problems in the cloud

  8. Outline • Database Security and Privacy: General Practice in the DB Community • Data Security and Privacy in the Cloud • Data Confidentiality • Access Privacy • Open Research Challenges

  9. Access Control [Bertino et al. TDSC’05] • Problem statement: authorizing data access scopes (relations, attributes, tuples) to users of a DBMS • Discretionary access control • Authorization administration policies, i.e., granting and revoking authorization (centralized, ownership, etc.) • Content-based, using views and query rewriting for fine-grained access control • Role-based access control: a role is a function with a set of actions, and has users as members • Mandatory access control • Object and subject classification (e.g., top secret, secret, unclassified)

  10. Data Anonymization • Problem: protecting Personally Identifiable Information (PII) and the associated sensitive attributes • Quasi-identifiers are sets of attributes that can be linked with external data to uniquely identify an individual • Quasi-identifiers need to be generalized or suppressed

  11. Solution: k-Anonymity [Samarati et al. TR’98] • Quasi-identifiers (QI) indistinguishable among k individuals • Tuples in an equivalence class share the same QI values • Implemented by building a generalization hierarchy or partitioning the multi-dimensional data space
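
The k-anonymity condition on this slide can be checked mechanically. The following is a minimal sketch, not the authors' algorithm: it only verifies the property on a table that has already been generalized, and the table contents, column names, and the function name `is_k_anonymous` are invented for illustration.

```python
from collections import Counter

def is_k_anonymous(rows, quasi_identifiers, k):
    """True if every quasi-identifier combination occurs in at least k rows."""
    groups = Counter(tuple(row[a] for a in quasi_identifiers) for row in rows)
    return all(count >= k for count in groups.values())

# Hypothetical, already-generalized patient table (values invented).
patients = [
    {"zip": "931**", "age": "20-29", "disease": "flu"},
    {"zip": "931**", "age": "20-29", "disease": "cancer"},
    {"zip": "930**", "age": "30-39", "disease": "flu"},
    {"zip": "930**", "age": "30-39", "disease": "asthma"},
]

print(is_k_anonymous(patients, ["zip", "age"], k=2))
print(is_k_anonymous(patients, ["zip", "age"], k=3))
```

Each equivalence class here has two rows, so the table satisfies the condition for k = 2 but not for k = 3.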

  12. Enhanced Solution: l-Diversity [Machanavajjhala et al. ICDE’06] • At least l values for sensitive attributes in each equivalence class [Table: a 3-diverse patient table]
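
As a rough illustration of the bullet above, the sketch below checks the simplest reading of l-diversity: each equivalence class contains at least l distinct sensitive values. The stricter variants in the cited paper are not covered, and the table and the name `is_l_diverse` are hypothetical.

```python
from collections import defaultdict

def is_l_diverse(rows, quasi_identifiers, sensitive, l):
    """True if each equivalence class has at least l distinct sensitive values."""
    classes = defaultdict(set)
    for row in rows:
        key = tuple(row[a] for a in quasi_identifiers)
        classes[key].add(row[sensitive])
    return all(len(values) >= l for values in classes.values())

# Hypothetical generalized table (values invented for illustration).
table = [
    {"zip": "931**", "age": "20-29", "disease": "flu"},
    {"zip": "931**", "age": "20-29", "disease": "cancer"},
    {"zip": "931**", "age": "20-29", "disease": "asthma"},
    {"zip": "930**", "age": "30-39", "disease": "flu"},
    {"zip": "930**", "age": "30-39", "disease": "flu"},
    {"zip": "930**", "age": "30-39", "disease": "cancer"},
]

print(is_l_diverse(table, ["zip", "age"], "disease", l=2))
print(is_l_diverse(table, ["zip", "age"], "disease", l=3))
```

The second class is 3-anonymous but holds only two distinct diseases, which is the kind of case l-diversity is meant to catch.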

  13. Enhanced Solution: t-Closeness [Li et al. ICDE’07] • Distance between the overall distribution of sensitive attribute values and the distribution of sensitive attribute values in an equivalence class is bounded by t

  14. Differential Privacy for Statistical Data [Dwork ICALP’06] • Strong privacy guarantees while querying a database [Figure: the same query over A and A’, each answered through perturbation, yields indistinguishable outputs P(A) and P(A’)] • A randomized function K gives ε-differential privacy iff for all datasets D1 and D2 differing on at most one element, and all S ⊆ Range(K), $\Pr[K(D_1) \in S] \le e^{\varepsilon} \cdot \Pr[K(D_2) \in S]$ • Thanks to Ben Zhao for this slide
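
The slide gives the definition but no mechanism. One standard way to meet the definition for a counting query is the Laplace mechanism, which the slide itself does not name, so treat the sketch below as an illustration under that assumption. The function names and the sample data are hypothetical.

```python
import random

def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is Laplace-distributed with location 0 and that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def private_count(records, predicate, epsilon, rng=None):
    """Noisy count: a counting query has sensitivity 1, so scale = 1/epsilon."""
    if rng is None:
        rng = random.Random()
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon, rng)

# Hypothetical data: ages of individuals in a statistical database.
ages = [23, 35, 41, 29, 62, 57]
print(private_count(ages, lambda a: a >= 40, epsilon=1.0))
```

Smaller values of ε add more noise, which strengthens the guarantee at the cost of accuracy.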

  15. Access Control & Privacy [Chaudhuri et al. CIDR’11] • Hybrid system combining authorization predicates and “noisy” views

  16. Secure Devices for Privacy [Anciaux et al. SIGMOD’07] • Problem: protecting private data during queries involving both private (hidden) and public (visible) data • Solution: carry private data in a secure USB key, ensure private data never leaves the USB key, and only public data flows to the key • Query optimization for a USB key with small RAM

  17. Outline • Database Security and Privacy • Data Security and Privacy in the Cloud • Data Confidentiality • Access Privacy • Open Research Challenges

  18. A lot of problems need to be taken care of • Some problems are old • Some problems are amplified by the cloud

  19. Problems Amplified by the Cloud • Access privacy • Attacks • Inferences on access patterns or query results • Solutions • Private information retrieval • Query obfuscation • Data confidentiality • Attacks • Unauthorized accesses, side channel attacks • Solutions • Encryption, querying encrypted data • Trusted computing [Figure: a user sends a query to cloud servers holding the data and receives an answer]

  20. Data Services in the Cloud • Functionality • Performance • Adversaries: curious but not malicious cloud / insiders, 3rd party attackers • Actions: obtain / infer data and queries [Figure labels: DB, Queries]

  21. Challenges: Conflicting Goals [Figure: functionality / performance (low to high) plotted against confidentiality / privacy (low to high), locating existing services, many crypto systems/protocols, and the ideal state]

  22. Outline • Database Security and Privacy • Data Security and Privacy in the Cloud • Data Confidentiality • Access Privacy • Open Research Challenges

  23. Data Confidentiality • 1. Encryption • Homomorphic encryption • Partition Index • Order-preserving encryption • Encrypted Index • 2. Leveraging Trust • Distribution • Trusted computing

  24. Database as a Service [Hacigümüs et al. ICDE’02] • Protects data from stealing, but plaintext data can still be seen on the server • Write – encrypt before storing • insert into lineitem (discount) values (encrypt(10,key)) • Read – decrypt before access • select decrypt(discount,key) from lineitem where custid = 300 • Encryption alternatives • Software-level vs. hardware-level (cryptographic coprocessor) encryption • Granularity: field, row, page

  25. Keyword Search on Encrypted Texts [Song et al. S&P’00] • Directly search on encrypted data without decryption on server side • Encrypt word by word. For word Wi • Block ciphertext Xi = Ek(Wi), word key ki = fk(Xi), pseudorandom sequence Ti = <Si, Fki(Si)> • Searchable ciphertext Ci = Xi ⊕ Ti • Search for a word W • Block ciphertext X = Ek(W), word key ki = fk(X) • Check ciphertexts one by one to see if Ci ⊕ X = (Xi ⊕ Ti) ⊕ X is of the form <s, Fki(s)> for some random value s
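
The construction on this slide can be sketched with standard-library primitives. This is a simplified illustration, not the scheme from the paper. HMAC-SHA256 stands in for the cipher E and for the functions f and F, so words cannot be decrypted in this sketch. Si is drawn from `os.urandom` rather than from a keyed stream generator. All function names and the key names `k_enc` and `k_word` are hypothetical.

```python
import hashlib
import hmac
import os

BLOCK = 32         # bytes per word block X_i
HALF = BLOCK // 2  # length of S_i and of F_{k_i}(S_i)

def prf(key, msg, length=BLOCK):
    """HMAC-SHA256 used as a generic keyed pseudorandom function."""
    return hmac.new(key, msg, hashlib.sha256).digest()[:length]

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def encrypt_word(word, k_enc, k_word):
    x = prf(k_enc, word.encode())   # X_i, stand-in for E_k(W_i)
    k_i = prf(k_word, x)            # word key k_i = f_k(X_i)
    s = os.urandom(HALF)            # S_i (assumption: fresh random bytes)
    t = s + prf(k_i, s, HALF)       # T_i = <S_i, F_{k_i}(S_i)>
    return xor(x, t)                # C_i = X_i xor T_i

def trapdoor(word, k_enc, k_word):
    """What the client reveals to the server to search for `word`."""
    x = prf(k_enc, word.encode())
    return x, prf(k_word, x)

def server_search(ciphertexts, x, k_i):
    """Return positions whose C_i xor X has the form <s, F_{k_i}(s)>."""
    hits = []
    for pos, c in enumerate(ciphertexts):
        t = xor(c, x)
        s, tag = t[:HALF], t[HALF:]
        if hmac.compare_digest(tag, prf(k_i, s, HALF)):
            hits.append(pos)
    return hits

k_enc, k_word = os.urandom(32), os.urandom(32)
document = "the cloud stores the data".split()
stored = [encrypt_word(w, k_enc, k_word) for w in document]
print(server_search(stored, *trapdoor("the", k_enc, k_word)))
```

The server learns which positions match the searched word, but it never sees the word itself or the keys `k_enc` and `k_word`.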

  26. Homomorphic Encryption • Paillier’s cryptosystem • Fully Homomorphic Encryption [Gentry CACM’10] • Enable arbitrary functions over encrypted data • Addition, multiplication, binary operations
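
To make the additive property of Paillier's cryptosystem concrete, here is a toy sketch using the common simplification g = n + 1. The primes are tiny and fixed, so this offers no security. The function names are hypothetical, and the code assumes Python 3.8 or later for the modular inverse via `pow`.

```python
import math
import random

def keygen(p=1009, q=1013):
    """Toy key generation; real deployments need large random primes."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)                               # valid because g = n + 1
    return n, (lam, mu)

def encrypt(n, m, rng=random):
    n2 = n * n
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    # c = g^m * r^n mod n^2 with g = n + 1
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2

def decrypt(n, private_key, c):
    lam, mu = private_key
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n

def add_encrypted(n, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts (mod n)."""
    return (c1 * c2) % (n * n)

n, private_key = keygen()
c_sum = add_encrypted(n, encrypt(n, 17), encrypt(n, 25))
print(decrypt(n, private_key, c_sum))
```

A server holding only ciphertexts could compute `c_sum` for a SUM aggregate without ever learning the individual values.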

  27. Homomorphic Encryption • Too expensive to be practical • On 1 million data items: aggregation takes 16 minutes, a range query takes 11 hours • From Kristin Lauter’s slides @ MSR Faculty Summit 2011

  28. We need practical solutions to querying an encrypted database

  29. Partition and Identification Index [Hacigümüs et al. SIGMOD’02] • E(tuple): encrypted-tuple, {attribute-index} • Attribute-index: attribute value partition ids [Figure: the domain 0 to 1000 split at 200, 400, 600, and 800 into five partitions with ids 2, 7, 5, 1, 4]

  30. Partition and Identification Index • Client knows a map function, Map(val) = id of the partition containing val • Random mapping • Order-preserving mapping [Figure: the five partitions of the domain 0 to 1000, shown once with randomly assigned ids and once with order-preserving ids]

  31. Mapping Predicate Conditions • Map(< val) : ids of the partitions that could contain values < val • E.g. Map(eid < 280) = {2, 7} for random mapping • Map(> val) : ids of the partitions that could contain values > val • Map(Ai = Aj): pairs of ids of the partitions that could have equal Ai and Aj values • Decryption and processing on the client
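
The mapping functions above can be sketched for a single integer attribute. The sketch assumes the random-mapping partition from the earlier slides, with boundaries at 200, 400, 600, and 800 and ids 2, 7, 5, 1, 4 in that order. It also assumes half-open intervals [low, high), which the slides do not specify. Function names are hypothetical.

```python
# (low, high, partition id); each partition covers integers in [low, high).
PARTITIONS = [
    (0, 200, 2),
    (200, 400, 7),
    (400, 600, 5),
    (600, 800, 1),
    (800, 1000, 4),
]

def map_value(val):
    """Map(val): id of the partition containing val."""
    for low, high, pid in PARTITIONS:
        if low <= val < high:
            return pid
    raise ValueError(f"{val} is outside the partitioned domain")

def map_less_than(val):
    """Map(< val): ids of partitions that could contain a value < val."""
    return {pid for low, high, pid in PARTITIONS if low < val}

def map_greater_than(val):
    """Map(> val): ids of partitions that could contain a value > val."""
    return {pid for low, high, pid in PARTITIONS if high - 1 > val}

def client_filter(decrypted_rows, val):
    """The server returns a superset; the client drops false positives."""
    return [row for row in decrypted_rows if row["eid"] < val]

print(map_less_than(280))
print(client_filter([{"eid": 150}, {"eid": 260}, {"eid": 390}], 280))
```

The server evaluates only the partition-id condition, so the row with eid 390 comes back as a false positive and is removed after decryption on the client.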

  32. Mapping Predicate Conditions • emp.did = mrg.did

  33. Optimal Partition for Range Queries [Hore et al. VLDB’04] • Optimal for privacy-performance tradeoff • Performance: minimize number of false positives over all range queries in a given query distribution • False positives caused by server returning a superset of answers • Privacy: maximize variance, entropy of value distribution in a partition • High variance – increase adversaries’ error in inferring sensitive attribute values • High entropy – reduce adversaries’ ability to identify encrypted tuples satisfying a plaintext query

  34. Partition / Bucketization Review • Pros • Efficient computation on the server • Cons • Data update is hard (may need re-distribution) • Filtering the superset of answers could be time consuming, depending on the partition sizes • Might reveal the value distribution through relative changes in partitions during dynamic data updates

  35. Can Ciphertext Be Queried Directly? • Encryption with special properties that allow predicate evaluation on ciphertexts • Order-preserving partition mapping → order-preserving encryption

  36. Order Preserving Encryption [Agrawal et al. SIGMOD’04]

  37. Achieving Order Preserving Encryption

  38. Order-Preserving Review • Pros • Return exact answers instead of supersets • Can leverage existing DB indexes • Cons • Hard to perform analysis and aggregation • Some tuples could be easily identified if the approach is applied to multiple attributes

  39. CryptDB [Popa et al. SOSP’11] • Supports a wide range of SQL queries over encrypted data • Server fully evaluates queries on encrypted data, and client does not perform query processing • SQL-aware encryption • Leverage provable practical techniques for different SQL operators over encrypted data • Adjustable query-based encryption • Dynamically adjust the encryption level of data items according to user’s queries • Onion of encryptions • From weaker forms of encryption that allow certain computation to stronger forms of encryption that reveal no information

  40. SQL-Aware Onion Encryption • RND: no functionality • DET: equality selection • JOIN: equality join • OPE: comparison • OPE-JOIN: inequality join • SEARCH: word selection (only for text fields) • HOM: sum (int value) [Figure: onions whose layers wrap a value, with RND outermost and “any value” or “int value” innermost]

  41. CryptDB System [Figure annotations: for sending certain onion layer key; for performing cryptographic operations]

  42. CryptDB Review • Pros • Support a wide range of SQL queries • Cons • Confidentiality level degrades to the weakest encryption in the long term

  43. Why can we NOT leverage well-proven encryption mechanisms and DB indexing techniques?

  44. Encrypted Index for Outsourced Data • Build a normal B+-tree index on key values • Encrypt B+-tree nodes • Store (and disperse) the encrypted index in the cloud [Damiani et al. CCS’03, Wang et al. SDM’11] • A query with predicates on keys is processed by locating the desired key values on the encrypted index • Index traversal relies on the client to retrieve and decrypt index nodes

  45. [Figure: data tuples D (t1 … tN, attributes A1 … Ad) and the B+-tree index I (nodes n1, n2); encrypted tuples E(tc1), E(tc2) and encrypted index nodes E(n1), E(n2); labels TD, TE, ID, IE; salted IDA; cloud servers S1 … Si … Sn]

  46. Practical Secure Query Processing • Cache partial index nodes on client to improve efficiency [Figure: a client holding the root of index I, a proxy, and cloud servers S1 … Si … Sn storing encrypted index nodes E(n1), E(n2) and encrypted tuples E(tc1), E(tc2); labels IE:1, TE:2, IEcol1, TEcol2; steps 1 and 2]

  47. Encrypted Index Review • Pros • Can be directly deployed on existing cloud settings • Provides stronger confidentiality than partition and order-preserving encryption without losing query efficiency • Cons • The cloud’s computational ability is underutilized • Queries directly supported are limited to queries on indexed key attributes

  48. Data Confidentiality • 1. Encryption • Homomorphic encryption • Partition Index • Order-preserving encryption • Encrypted Index • 2. Leveraging Trust

  49. Distribution instead of Encryption • Under non-communicating servers assumption [Aggarwal et al. CIDR’05] • Sensitive attributes: E(telephone), E(email) • Sensitive association: name, salary (alternatively name, E(salary)) [Figure: name and salary are kept apart on Server 1 and Server 2; a query is split into Q1 and Q2 and answered as Result(Q1) join Result(Q2)]
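
A rough sketch of the idea on this slide: the sensitive association between name and salary is broken by storing the two attributes on different servers, and the client joins the partial results. The tuple-id join key, the dictionary-based "servers", and all function names are assumptions made for illustration and are not details taken from the cited paper.

```python
def decompose(rows):
    """Split each tuple so that no single server sees name and salary together."""
    server1, server2 = {}, {}
    for tid, row in enumerate(rows):
        server1[tid] = {"name": row["name"]}      # stored on Server 1
        server2[tid] = {"salary": row["salary"]}  # stored on Server 2
    return server1, server2

def query_names_with_salary_at_least(server1, server2, min_salary):
    # Q2 runs on Server 2 and returns only matching tuple ids.
    q2_result = {tid for tid, r in server2.items() if r["salary"] >= min_salary}
    # Q1 runs on Server 1; the client joins Result(Q1) with Result(Q2).
    return sorted(server1[tid]["name"] for tid in q2_result)

# Hypothetical employee table (values invented for illustration).
employees = [
    {"name": "alice", "salary": 120},
    {"name": "bob", "salary": 80},
    {"name": "carol", "salary": 150},
]
s1, s2 = decompose(employees)
print(query_names_with_salary_at_least(s1, s2, 100))
```

Neither server can link a name to a salary on its own, which is why the scheme depends on the servers not communicating.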

  50. Distribution Review • Pros • Reduce encryption and decryption overhead • Cons • Non-communicating servers assumption is strong* • Data distribution policy is usually not up to a client, but decided by cloud server providers • * [Emekci et al. ICDE’06, Agrawal et al. SRDS’88, Ciriani et al. ESORICS’09]