Silverline: Toward Data Confidentiality in Storage-Intensive Cloud Applications Krishna P. N. Puttaswamy, Christopher Kruegel and Ben Y. Zhao UC Santa Barbara SOCC’11 Presented by: Zhang Chunwang
Cloud Infrastructure • Advantages • Pay-as-you-use • High availability • Elastic access, etc. • Disadvantages • Serious risks to data security • "It (Cloud Computing) is a security nightmare and it can't be handled in traditional ways." -- John Chambers, CEO of Cisco
Security and Privacy • [Survey charts not reproduced] • Sources: InformationWeek Analytics Cloud Computing Survey, 2009; IDC Enterprise Panel, Aug 2008, n=244; AMD 2011 Global Cloud Computing Adoption, Attitudes and Approaches Study; World Economic Forum 2009 Cloud Computing Survey
Existing Solutions • Keyword search on encrypted data • Limited functionality • Fully homomorphic encryption • Too expensive for large data • Tagged-MapReduce • Only for MapReduce computations
Key Observation • In apps that can benefit the most from the cloud, the majority of their computations handle data in an opaque way, i.e., without interpretation. • SELECT * FROM t WHERE UserID = 'Bob' • Aggregate sum instances of each event type
A Closer-Looking Example • AstroSpaces: A social networking service • Create user profiles • Add user to friend list • Send messages to friends • Create blog posts • Write comments to friends' profiles • Create content on their own profiles • 7 database tables and 51 SELECT queries
A Closer-Looking Example • Out of 24 user data fields, only 7 were used in computations • Username, read/unread status of messages, accepted/unaccepted status of friendship requests, theme and style, activation status of the account, email • Most personal data is not used in any computation • First and last name, address, phone number, blog posts, messages exchanged between friends, wall posts, etc. • Hence, these fields can be encrypted without limiting the functionality of the application • Such fields are called functionally encryptable data
Key Idea • Split the entire app data into two subsets: • Functionally encryptable data • Data that must be in plaintext to support the desired functionality • Work of this paper • 1) Identify functionally encryptable data • 2) Assign encryption keys • 3) Provide secure and transparent data access on user devices
1. Identify All Encryptable Data • Key idea • Assign each database field a unique field number, used as its tag • Track the use of database fields in the computation • On assignment, propagate the union of the tags of all RHS operands to the LHS operand • For operations that interpret their operands, such as string operators, numerical operators and comparators, write the tags of the RHS operands to the log file • Collect the log and produce a list of the unique DB fields used in computations • All other fields are functionally encryptable • Dynamic program analysis • Driven by a set of representative queries
1. Identify All Encryptable Data • [Figure: illustration of the data tracking]
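The tag tracking described on the previous slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the instrumentation used in the paper: the class `Tagged`, the helpers `assign`, `apply_op` and `functionally_encryptable`, and the three example field names are all hypothetical, and the log file is modelled as an in-memory set.

```python
import operator


class Tagged:
    """A value plus the set of DB field numbers (tags) it was derived from."""

    def __init__(self, value, tags=()):
        self.value = value
        self.tags = frozenset(tags)


def _lift(x):
    # Constants in the program carry no tags.
    return x if isinstance(x, Tagged) else Tagged(x)


def assign(rhs):
    """Plain assignment: the LHS inherits the RHS tags; nothing is logged."""
    rhs = _lift(rhs)
    return Tagged(rhs.value, rhs.tags)


def apply_op(op, *operands, log):
    """An operator that interprets its operands (string, numeric, comparison).

    The tags of all operands are written to the log, and the result carries
    the union of those tags.
    """
    operands = [_lift(o) for o in operands]
    tags = frozenset().union(*(o.tags for o in operands))
    log.update(tags)
    return Tagged(op(*(o.value for o in operands)), tags)


def functionally_encryptable(all_fields, log):
    """Fields that never appeared in the log can be encrypted."""
    return sorted(set(all_fields) - log)


# Hypothetical schema: field number -> field name.
fields = {1: "username", 2: "address", 3: "msg_read"}
row = {1: Tagged("bob", {1}), 2: Tagged("12 Main St", {2}), 3: Tagged(0, {3})}

log = set()
copied = assign(row[2])                         # address is only moved around
apply_op(operator.eq, row[1], "bob", log=log)   # username is compared
apply_op(operator.add, row[3], 1, log=log)      # read status is computed on
encryptable = functionally_encryptable(fields, log)
```

Here only field 2 (the address) is never interpreted, so it is the only field reported as functionally encryptable. As on the slide, the result is only as good as the set of representative queries that drives the analysis.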
2. DB Labeling and Key Assignment • The problem • Encrypting all data with a single key: all data is disclosed if a single user is compromised • Encrypting each individual cell with a different key: key management becomes complicated • The goal • Automatically infer the right granularity of data encryption, trading off robustness against key management complexity
2. DB Labeling and Key Assignment • Label: the set of all UserIDs (users) who have access to a cell; every cell has its own label • Labeling algorithm • Input: a sufficiently detailed set of requests • Output: for each cell, the set of users who can access it • Approach:
    For each request
        For each cell returned as the result of that request's query
            Update the cell label by adding the requesting user's ID
2. DB Labeling and Key Assignment • User Bob: SELECT * FROM Users WHERE UserID = 'Bob' • User Admin: SELECT * FROM Users
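The labeling loop can be tried out on the Bob/Admin example with an in-memory SQLite database. This is a minimal sketch, assuming that a cell is identified by a hypothetical `(table, rowid, column)` triple and that every training query also selects `rowid`; the helper `label_cells` and the sample rows are illustrative and not taken from the paper.

```python
import sqlite3
from collections import defaultdict


def label_cells(conn, table, requests):
    """Build cell labels from (user_id, sql, params) training requests.

    Assumption: each query selects rowid as its first column, so a cell can be
    identified as (table, rowid, column_name).
    """
    labels = defaultdict(set)
    for user_id, sql, params in requests:
        cur = conn.execute(sql, params)
        columns = [d[0] for d in cur.description]
        for row in cur:
            rowid = row[0]
            for col in columns[1:]:
                # The requesting user saw this cell: add them to its label.
                labels[(table, rowid, col)].add(user_id)
    return {cell: frozenset(users) for cell, users in labels.items()}


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Users (UserID TEXT, Address TEXT)")
conn.executemany(
    "INSERT INTO Users VALUES (?, ?)",
    [("Bob", "12 Main St"), ("Alice", "9 Elm Rd")],
)

requests = [
    ("Bob", "SELECT rowid, * FROM Users WHERE UserID = ?", ("Bob",)),
    ("Admin", "SELECT rowid, * FROM Users", ()),
]
labels = label_cells(conn, "Users", requests)
```

After these two requests, the cells in Bob's row are labeled {Bob, Admin}, while the cells in Alice's row are labeled {Admin} only, because no training request by Alice was included.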
2. DB Labeling and Key Assignment • Key assignment • Partition the database tables into sub-groups where each group shares the same label • Assign each group a single, unique key • Give each user all the keys necessary
2. DB Labeling and Key Assignment • Key assignment • [Figure: illustration of the key assignment]
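The three key-assignment steps (group cells by label, one key per group, hand each user the keys of the groups they belong to) can be sketched as follows. The function `assign_keys`, the cell identifiers and the use of random 32-byte keys are assumptions made for illustration; the slides do not specify how keys are generated or distributed.

```python
import secrets
from collections import defaultdict


def assign_keys(labels):
    """labels: dict mapping a cell id to a frozenset of user ids (its label)."""
    # 1) Cells with the same label form one group; 2) one fresh key per group.
    group_keys = {label: secrets.token_bytes(32) for label in set(labels.values())}
    cell_key = {cell: group_keys[label] for cell, label in labels.items()}

    # 3) Each user receives the key of every group whose label contains them.
    user_keys = defaultdict(set)
    for label, key in group_keys.items():
        for user in label:
            user_keys[user].add(key)
    return group_keys, cell_key, dict(user_keys)


# Hypothetical labels, e.g. as produced by a labeling pass over training requests.
labels = {
    ("Users", 1, "Address"): frozenset({"Bob", "Admin"}),
    ("Users", 1, "Phone"): frozenset({"Bob", "Admin"}),
    ("Users", 2, "Address"): frozenset({"Alice", "Admin"}),
    ("Audit", 1, "Note"): frozenset({"Admin"}),
}
group_keys, cell_key, user_keys = assign_keys(labels)
```

Four cells collapse into three groups, so three keys are generated: Admin holds all three, while Bob and Alice hold one each and cannot decrypt each other's cells.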
2. DB Labeling and Key Assignment • Incompleteness and Database Dynamics • The training set may be incomplete • The database changes dynamically • Solutions • An online monitoring component on the cloud • Detects any computation performed on encrypted data • Detects any change to a cell label that affects key assignment
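One possible shape for the label-change half of such a monitor is sketched below. The slides do not describe the monitoring component in detail, so the function `monitor_request`, its return values and the cell identifiers are all assumptions; detecting computations on encrypted data would need separate runtime tracking and is not shown.

```python
def monitor_request(labels, user_id, returned_cells):
    """Check one live request against the labels learned during training.

    Returns (new_labels, changed_cells). A non-empty changed_cells list means
    a label grew, so the key assignment for those cells has to be revisited.
    """
    new_labels = dict(labels)
    changed = []
    for cell in returned_cells:
        old = labels.get(cell, frozenset())
        if user_id not in old:
            new_labels[cell] = old | {user_id}
            changed.append(cell)
    return new_labels, changed


cell = ("Users", 1, "Address")  # hypothetical cell id
labels = {cell: frozenset({"Bob", "Admin"})}

# A request already covered by training changes nothing.
_, unchanged = monitor_request(labels, "Bob", [cell])

# A user unseen during training reads the cell: its label changes.
updated, flagged = monitor_request(labels, "Carol", [cell])
```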
3. Key Management on User Devices • Problem • How to keep keys and decrypted data safe on user devices (e.g., from injected client-side code) • Solution • Prevent untrusted code from accessing sensitive data • Same Origin Policy (SOP) and HTML5 • Using two iframes: one for the cloud, one for the organization
Evaluation • Purposes • How much of the data in today's apps can be encrypted without breaking the functionality? • Do the labeling algorithm and key assignment work as expected? • Application descriptions
Evaluation • Amount of functionally encryptable data • Most of the non-encryptable fields store information not directly related to the user • Post content in UseBB and event content in Comender are non-encryptable because they must support keyword search • Keyword search on encrypted data could be applied here
Evaluation • Effectiveness of the key inference techniques • They successfully identify the sharing relationships between users and classify the data into groups • Please refer to the paper for more details
Limitations • Only applicable to selected apps • Not all data on the cloud is encrypted • Leakage of some metadata (deterministic encryption) • Inequality comparisons no longer work • Attacks on commonly shared data
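Two of these limitations follow directly from deterministic encryption and can be seen in a small experiment. The sketch below uses a keyed HMAC-SHA256 token purely as a stand-in for a deterministic cipher (it cannot be decrypted and is not the scheme used in the paper); the key, the function `det_token` and the sample column are hypothetical.

```python
import hashlib
import hmac
from collections import Counter

KEY = b"demo-key-not-for-production"  # hypothetical key, illustration only


def det_token(value: str) -> str:
    """Deterministic stand-in: the same plaintext always maps to the same token."""
    return hmac.new(KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()


column = ["red", "blue", "red"]
stored = [det_token(v) for v in column]  # what the cloud would hold

# Equality queries still work: the cloud matches tokens without seeing plaintext.
matches = [i for i, t in enumerate(stored) if t == det_token("red")]

# Metadata leakage: the cloud can count how often each (unknown) value repeats.
frequencies = Counter(stored)
```

The cloud can answer `WHERE color = 'red'` by comparing tokens, but it also learns that one value occurs twice. Comparing tokens with `<` or `>` says nothing about the order of the plaintexts, which is why inequality comparisons stop working.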
Future Work • Automatically identify the importance of different database fields and encrypt the important ones using appropriate methods • Automatic partitioning of applications • Move code that touches sensitive data to the client