Hippocratic Databases

Paper Authors: Rakesh Agrawal, Jerry Kiernan, Ramakrishnan Srikant, Yirong Xu

Presented By: Camille Gaspard

Originally taken from: pages.cpsc.ucalgary.ca/~hammad/Fall04-00_files/Reg_Hippocratic%20Databases2.ppt

Introduction
  • Privacy with respect to databases
  • Related work
  • Founding principles of Hippocratic databases.
  • Strawman design
  • Future research
  • Limiting Disclosure in Hippocratic Databases

Hippocratic Oath: “And about whatever I may see or hear in treatment, or even without treatment, in the life of human beings – things that should not ever be blurted out outside – I will remain silent, holding such things to be unutterable.”

Overview
  • Dramatic and escalating increase in digital data
  • Lack of privacy awareness, lax security
  • We need to re-architect database systems to include responsibility for the privacy of data.

Diagram labels: Automatic collection, Digitization, Techniques, Networks, Processors, Storage

Vocabulary
  • Privacy is the right of individuals to determine for themselves when, how and to what extent information about them is communicated to others.
  • Computer security is the prevention of, or protection against,
    • access to information by unauthorized recipients, and
    • intentional but unauthorized destruction or alteration of that information
Related Work
  • Traditional databases
  • Statistical databases
  • Secure databases
Traditional Databases
  • Fundamental to a database system is
    • Ability to manage persistent data.
    • Ability to access a large amount of data efficiently.
  • Universal capabilities of a database system
    • Support for at least one data model.
    • Support for certain high-level languages that allow the user to define the structure of data, access data, and manipulate data.
    • Transaction management, the capability to provide correct, concurrent access to the database by many users at once.
    • Access control, the ability to deny access to data by unauthorized users and the ability to check the validity of the data.
    • Resiliency, the ability to recover from system failures without losing data.
Traditional Databases
  • Hippocratic databases require all the capabilities provided by current database systems
  • Different focus (diagram labels: Efficiency, Maximizing Concurrency, Resiliency, Privacy, Consented sharing)
  • Need to rethink data definition and query languages, query processing, indexing and storage structures, and access control mechanisms
Statistical Databases
  • Goal: Provide statistical information (sum, count, average, maximum, minimum, pth percentile) without compromising sensitive information about individuals
  • Query restriction
    • Restricting the size of query results
    • Controlling the overlap among successive queries
    • Keeping audit trails of all answered queries and checking for possible compromises
  • Data perturbation
    • Swapping values between records
    • Replacing the original database by a sample from the same distribution
    • Adding noise to the values in the database
    • Adding noise to the results of a query
  • Hippocratic databases share the goal of preventing disclosure of private information but the class of queries for Hippocratic databases is much broader.
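A minimal sketch of two of the techniques listed above: restricting the size of query results and adding noise to the result of a query. The threshold, the noise range, and the function names are illustrative assumptions, not taken from the paper.

```python
import random

MIN_QUERY_SET_SIZE = 3  # hypothetical threshold, chosen only for illustration


def restricted_average(values, min_size=MIN_QUERY_SET_SIZE):
    """Query restriction: refuse to answer when too few records match."""
    if len(values) < min_size:
        raise PermissionError("query set too small to answer safely")
    return sum(values) / len(values)


def noisy_average(values, scale, rng):
    """Output perturbation: add bounded zero-mean noise to the true answer."""
    return sum(values) / len(values) + rng.uniform(-scale, scale)


salaries = [52_000, 61_000, 58_000, 75_000]
print(restricted_average(salaries))  # 61500.0
print(noisy_average(salaries, 500, random.Random(0)))  # within 500 of 61500.0
```

A query over only two records would raise `PermissionError` here, since a small query set makes it easier to infer individual values.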
Secure Databases
  • Sensitive information is transmitted over a secure channel and stored securely
  • Access controls
  • Encryption
  • Multilevel secure databases
    • Multiple levels of security (top secret, secret, confidential, unclassified)
    • E.g., Bell-LaPadula model (no read up, no write down)
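The two Bell-LaPadula rules can be sketched as a pair of comparisons over the security levels named on this slide. The numeric ordering and the function names are assumptions made for the example.

```python
# Security levels from the slide, ordered from lowest to highest.
LEVELS = {"unclassified": 0, "confidential": 1, "secret": 2, "top secret": 3}


def can_read(subject_level, object_level):
    """No read up: a subject may read only objects at or below its level."""
    return LEVELS[subject_level] >= LEVELS[object_level]


def can_write(subject_level, object_level):
    """No write down: a subject may write only objects at or above its level."""
    return LEVELS[subject_level] <= LEVELS[object_level]


print(can_read("secret", "confidential"))   # True
print(can_read("secret", "top secret"))     # False (read up)
print(can_write("secret", "confidential"))  # False (write down)
```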
Privacy Regulations
  • United States Privacy Act of 1974 requires federal agencies to
    • permit an individual to determine what records pertaining to him are collected, maintained, used, or disseminated;
    • permit an individual to prevent records pertaining to him obtained for a particular purpose from being used or made available for another purpose without his consent;
    • permit an individual to gain access to information pertaining to him in records, and to correct or amend such records;
    • collect, maintain, use or disseminate any record of personally identifiable information in a manner that assures that such action is for a necessary and lawful purpose, that the information is current and accurate for its intended use, and that adequate safeguards are provided to prevent misuse of such information;
    • permit exemptions from the requirements with respect to the records provided in this Act only in those cases where there is an important public policy need for such exemption as has been determined by specific statutory authority; and
    • be subject to civil suit for any damages which occur as a result of willful or intentional action which violates any individual’s right under this Act.
Privacy Regulations
  • Recent privacy documents
    • 1996 Health Insurance Portability and Accountability Act (HIPAA)
    • 1999 Gramm-Leach-Bliley Financial Services Modernization Act
    • 2000 Canada Personal Information Protection and Electronic Documents Act (PIPEDA)
Guidelines
  • Collection
  • Retention
  • Use
  • Disclosure
  • Example: Grad student information at the university
Ten Founding Principles
  • Purpose Specification. For personal information stored in the database, the purposes for which the information has been collected shall be associated with that information.
  • Consent. The purposes associated with personal information shall have consent of the donor of the personal information.
  • Limited Collection. The personal information collected shall be limited to the minimum necessary for accomplishing the specified purposes.
  • Limited Use. The database shall run only those queries that are consistent with the purposes for which the information has been collected.
  • Limited Disclosure. The personal information stored in the database shall not be communicated outside the database for purposes other than those for which there is consent from the donor of the information.
Ten Founding Principles
  • Limited Retention. Personal information shall be retained only as long as necessary for the fulfillment of the purposes for which it has been collected.
  • Accuracy. Personal information stored in the database shall be accurate and up-to-date.
  • Safety. Personal information shall be protected by security safeguards against theft and other misappropriations.
  • Openness. A donor shall be able to access all information about the donor stored in the database.
  • Compliance. A donor shall be able to verify compliance with the above principles. Similarly, the database shall be able to address a challenge concerning compliance.
Strawman Design
  • Use scenario: Mississippi Bookstore
  • Mississippi is an on-line bookseller who needs to obtain certain minimum personal information to complete a purchase transaction. This information includes name, shipping address, and credit card number. Mississippi also needs an email address to notify the customer of the status of the order.
  • Mississippi uses the purchase history of customers to offer book recommendations on its site. It also publishes information about books popular in the various regions of the country (purchase circles).
The Characters
  • Name: Super Mario
  • Privacy pragmatist
  • Likes the convenience of providing his email and shipping address only once by registering at Mississippi. He also likes recommendations, but he does not want his transactions used for purchase circles.
The Characters
  • Name: Princess Peach
  • Privacy fundamentalist
  • Does not want Mississippi to retain any information once her purchase transaction is complete.
The Characters
  • Name: Luigi
  • Mississippi employee with questionable ethics
  • The database and privacy officer must ensure that he is not able to obtain more information than he is supposed to.
Privacy Metadata
  • Figures: Privacy Metadata Schema, Database Schema, Privacy-Policies Table
Privacy Metadata
  • Figure: Privacy-Authorizations Table (copied from the original paper)

Alternate Organizations
  • This design may be too restrictive or may not fit in some situations.
  • May create a new table with the columns
    • User
    • Table
    • Attribute
    • Purpose
    • External recipient
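As a rough sketch, the alternate table could look like the following in SQLite. The table name, column names, and sample rows are hypothetical; only the five columns come from the list above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One row per (user, table, attribute, purpose, external recipient) grant.
conn.execute(
    """
    CREATE TABLE privacy_authorizations (
        user_name          TEXT NOT NULL,
        table_name         TEXT NOT NULL,
        attribute          TEXT NOT NULL,
        purpose            TEXT NOT NULL,
        external_recipient TEXT
    )
    """
)
conn.executemany(
    "INSERT INTO privacy_authorizations VALUES (?, ?, ?, ?, ?)",
    [
        ("shipping_clerk", "customer", "shipping_address", "purchase", "delivery_company"),
        ("recommender", "orders", "book_info", "recommendations", None),
    ],
)


def is_authorized(user, table, attribute, purpose):
    """Return True if a grant exists for this user, attribute, and purpose."""
    row = conn.execute(
        "SELECT 1 FROM privacy_authorizations "
        "WHERE user_name = ? AND table_name = ? AND attribute = ? AND purpose = ?",
        (user, table, attribute, purpose),
    ).fetchone()
    return row is not None


print(is_authorized("recommender", "orders", "book_info", "recommendations"))  # True
print(is_authorized("recommender", "customer", "shipping_address", "recommendations"))  # False
```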
Data Collection
  • Privacy Constraint Validator checks whether the business’s privacy policy is acceptable to the user
  • Example: If Princess Peach required a 2-week retention period but Mississippi’s policy retained data for longer, the database would reject the transaction
  • Data is inserted with the purpose for which it may be used
  • Data Accuracy Analyzer addresses the Principle of Accuracy. For example, verify that the postal code corresponds to the street address.
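The retention example can be sketched as a check that runs before insertion. This assumes the validator compares only retention periods, which is a simplification; the function names and the in-memory "table" are hypothetical.

```python
def policy_acceptable(policy_retention_days, user_max_retention_days):
    """True when the business keeps data no longer than the user allows."""
    return policy_retention_days <= user_max_retention_days


def insert_with_purposes(table, record, purposes,
                         policy_retention_days, user_max_retention_days):
    """Reject the transaction if the policy is unacceptable; otherwise
    store the record together with the purposes it may be used for."""
    if not policy_acceptable(policy_retention_days, user_max_retention_days):
        raise ValueError("transaction rejected: privacy policy not acceptable to the user")
    table.append({**record, "purposes": frozenset(purposes)})


customers = []
insert_with_purposes(customers, {"name": "Super Mario"},
                     {"purchase", "recommendations"},
                     policy_retention_days=30, user_max_retention_days=365)
print(len(customers))  # 1
```

With `policy_retention_days=30` and `user_max_retention_days=14`, the same call would raise `ValueError` and nothing would be stored.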
Queries
  • Submitted to the database along with their purpose. Example: recommendations
  • Before query execution: Attribute Access Control checks privacy-authorizations table for a match on purpose, attribute and user.
  • During query execution: Record Access Control ensures that only records whose purpose attribute includes the query’s purpose will be visible to the query.
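Record Access Control can be pictured as a filter on each record's purpose attribute. In this sketch, records are plain dictionaries and the purposes assigned to each character are assumptions based on the character descriptions.

```python
customers = [
    {"name": "Super Mario", "purposes": {"purchase", "registration", "recommendations"}},
    {"name": "Princess Peach", "purposes": {"purchase"}},
]


def visible_records(records, query_purpose):
    """Keep only records whose purpose attribute includes the query's purpose."""
    return [r for r in records if query_purpose in r["purposes"]]


names = [r["name"] for r in visible_records(customers, "recommendations")]
print(names)  # ['Super Mario']
```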
Queries
  • After query execution: Query Intrusion Detector is run on the query results to spot queries whose access pattern is different from the usual access pattern for queries with that purpose and by that user.
  • Detector uses the Query Intrusion Model built by analyzing past queries for each purpose and each authorized user.
  • An audit trail of all queries is maintained for external privacy audits, as well as addressing challenges regarding compliance.
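The slides do not say what the Query Intrusion Model contains. One simple possibility, shown purely as an assumption, is to track the number of rows that past queries returned for a given purpose and user, and flag results that deviate strongly from that history.

```python
from statistics import mean, pstdev


def is_suspicious(history_row_counts, rows_returned, threshold=3.0):
    """Flag a result whose size is more than `threshold` standard deviations
    away from the historical mean (a hypothetical access-pattern statistic)."""
    mu = mean(history_row_counts)
    sigma = pstdev(history_row_counts)
    if sigma == 0:
        return rows_returned != mu
    return abs(rows_returned - mu) / sigma > threshold


history = [10, 12, 9, 11, 10]  # rows returned by past 'recommendations' queries
print(is_suspicious(history, 11))    # False
print(is_suspicious(history, 5000))  # True
```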
Other Features
  • Data Retention Manager deletes data items that have outlived their purpose.
  • Data Collection Analyzer (DCA) examines the set of queries for each purpose to determine if any information is being collected but not used. This supports the Principle of Limited Collection.
  • The DCA also determines if data is being kept for longer than necessary. (Limited Retention)
  • The DCA also determines if people have unused (unnecessary) authorizations to issue queries with a given purpose. (Limited Use)
  • Encryption Support allows some data items to be stored in encrypted form to guard against snooping.
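A minimal sketch of a Data Retention Manager pass. The retention periods are made up, and the rule that a record collected for several purposes is kept until its longest retention period ends is an assumption of this example.

```python
from datetime import datetime, timedelta

# Hypothetical retention period per purpose.
RETENTION = {
    "purchase": timedelta(days=30),
    "recommendations": timedelta(days=365),
}


def purge_expired(records, now):
    """Keep a record while at least one of its purposes is still within its
    retention period; drop records that have outlived all their purposes."""
    return [
        r for r in records
        if any(now - r["collected"] <= RETENTION[p] for p in r["purposes"])
    ]


records = [
    {"name": "Princess Peach", "collected": datetime(2024, 1, 1), "purposes": {"purchase"}},
    {"name": "Super Mario", "collected": datetime(2024, 1, 1),
     "purposes": {"purchase", "recommendations"}},
]
kept = purge_expired(records, now=datetime(2024, 3, 1))
print([r["name"] for r in kept])  # ['Super Mario']
```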
P3P and Hippocratic Databases
  • Platform for Privacy Preferences (emerging standard developed by the World Wide Web Consortium, W3C)
  • P3P provides a way for a web site to encode its data-collection practices in an XML P3P policy
  • The site’s policy is programmatically compared to a user’s privacy preferences
  • How to enforce?
  • Integrate with Hippocratic databases
  • The Electronic Privacy Information Center (EPIC) has been critical of P3P and believes P3P makes it too difficult for users to protect their privacy
New Challenges – Language
  • P3P language insufficient
    • Developed for web shopping, so the language is restricted
    • P3P is a good starting point for a language which can be used in a wider variety of environments such as finance, insurance, and health care
    • Difficult to find balance between expressibility and usability
  • Work is being done to arrange purposes in a hierarchy rather than the flat space that P3P uses
  • Other research treats the problem as a coalition game [Kleinberg et al. 2001] where information is valuable and users are compensated for revealing some private information.
New Challenges – Efficiency
  • Current databases have undergone years of optimization without concern for privacy. What type of performance hit will integrated privacy checking entail?
  • Some techniques from multilevel secure databases will apply
  • Technique that can reduce the cost of each check:
    • Assume number of purposes is 32 or less
    • Encode the set of purposes by setting a bit in a word
    • Then the Record Access Control check only requires a bit-wise AND and XOR of two words with a check that the result is 0
  • Storage of purpose – space versus efficiency
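The bit-encoding technique can be sketched as follows. The reading taken here is that the AND keeps the query's purposes that the record allows, and the XOR with the query word leaves 0 only when none are missing. The purpose names and bit positions are illustrative assumptions.

```python
# Up to 32 purposes fit in one 32-bit word; four hypothetical ones are used here.
PURPOSES = ["purchase", "registration", "recommendations", "purchase_circles"]
BIT = {name: 1 << i for i, name in enumerate(PURPOSES)}


def encode(purposes):
    """Encode a set of purposes by setting one bit per purpose in a word."""
    mask = 0
    for p in purposes:
        mask |= BIT[p]
    return mask


def record_allows(record_mask, query_mask):
    """Bit-wise AND, then XOR with the query word; 0 means every purpose
    of the query is among the record's purposes."""
    return ((record_mask & query_mask) ^ query_mask) == 0


mario = encode({"purchase", "recommendations"})
print(mario)                                                # 5
print(record_allows(mario, encode({"recommendations"})))    # True
print(record_allows(mario, encode({"purchase_circles"})))   # False
```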
Challenges – Limited Collection
  • Access Analysis: Analyze the queries for each purpose and identify attributes that are collected for a given purpose but not used.
    • Problem: Necessity of one attribute may depend on others
  • Granularity Analysis: Analyze the queries for each purpose and numeric attribute and determine the granularity at which information is needed.
  • Minimal Query Generation: Generate the minimal query that is required to solve a given problem.
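Access Analysis, in its simplest form, is a set difference between the attributes collected for a purpose and the attributes its queries actually touch. This sketch assumes each query is already reduced to the set of attributes it reads, and it does not address the dependency problem noted above.

```python
def unused_attributes(collected, queries):
    """Return attributes collected for a purpose that no query for that
    purpose ever reads. `queries` is a list of attribute sets."""
    used = set().union(*queries)
    return set(collected) - used


collected_for_purchase = {"name", "shipping_address", "credit_card", "email", "birth_date"}
purchase_queries = [
    {"name", "shipping_address"},
    {"credit_card"},
    {"email"},
]
print(unused_attributes(collected_for_purchase, purchase_queries))  # {'birth_date'}
```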
Challenges – Limited Disclosure
  • Strawman design as it exists now does not protect against a dynamically determined set of recipients
  • Example: EquiRate is a credit rating agency that maintains a database of credit ratings
    • Princess Peach has asked EquiRate to only disclose her credit rating to companies that she has contacted
    • Luigi has stolen Princess Peach’s identity and has contacted EasyCredit pretending to be her.
    • EasyCredit contacts EquiRate with Princess Peach’s credentials and obtains her credit rating
    • Both companies have adhered to the protocol but it has failed
Challenges – Limited Disclosure
  • Potential solution
    • Princess Peach digitally signs EasyCredit’s company ID with her cryptographic private key
    • She provides the result to EasyCredit
    • EasyCredit contacts EquiRate and gives them Princess Peach’s signed permission
    • EquiRate verifies the signature and provides the information to EasyCredit
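The sign-and-verify exchange can be illustrated with a toy textbook RSA signature built only from the standard library. This is deliberately insecure (tiny key, no padding) and is not the scheme the paper prescribes; the key construction and function names are assumptions for illustration. It needs Python 3.8 or later for the modular inverse.

```python
import hashlib

# Toy RSA key from two known Mersenne primes. Insecure; illustration only.
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q
E = 65537                          # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent, known only to the signer


def digest(message):
    """Hash the message to an integer smaller than N."""
    return int.from_bytes(hashlib.sha256(message.encode()).digest(), "big") % N


def sign(message, private_exponent=D):
    """Princess Peach signs a company ID with her private key."""
    return pow(digest(message), private_exponent, N)


def verify(message, signature):
    """EquiRate checks the signature using only the public key (E, N)."""
    return pow(signature, E, N) == digest(message)


permission = sign("EasyCredit")
print(verify("EasyCredit", permission))  # True
print(verify("ShadyLoans", permission))  # False
```

Because the permission is bound to one company ID, a different company cannot reuse it, and Luigi cannot produce one without the private key.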
Challenges – Limited Retention
  • Deleting a record from a database is simple but how does one completely remove the information?
  • Deleting information from all logs, checkpoints, and backups is difficult
New Challenges – Safety
  • How do we protect against someone who is denied access to tables through the DBMS but can reach the underlying files by other means?
  • Example: Suppose Luigi has no access to the DBMS but has root privileges on the system
  • Encryption incurs performance penalties
  • How can one index or query encrypted columns?
New Challenges – Openness
  • A person challenging information about themselves that was not provided by them (e.g., a credit report)
  • Super Mario wishes to find out what databases have information about him
  • If the database does not have information about him, it should not know who issued the query
  • Related work: symmetrically private information retrieval
New Challenges – Compliance
  • Provide each user whose data is accessed with a log of that access along with the query reading the data, without paying a large performance penalty
  • Fingerprinting – sign up with a company that inserts fingerprint records into the database with an email address, phone number and credit card number along with various approved purposes. If one is used for an unauthorized purpose it would be immediately detected and the company notified that a breach has occurred
  • Challenge: use the least number of fingerprint records to maximum effect.
Conclusion
  • Presented a vision, inspired by the Hippocratic Oath, of databases that preserve privacy
  • Enunciated key privacy principles
  • Discussed a strawman design for a Hippocratic database
  • Identified technical challenges
  • Discussed related work
Limiting Disclosure in Hippocratic Databases

Paper Authors: Kristen LeFevre, Rakesh Agrawal, Vuk Ercegovac, Raghu Ramakrishnan, Yirong Xu, and David DeWitt

Presented By: Camille Gaspard

Motivation
  • It’s essential to manage disclosure in a more fine-grained manner.
  • Cell-level disclosure schemes have not been well studied before this work.
Contribution
  • Investigate the alternative semantics for limited disclosure in relational databases and present two models of cell-level limited disclosure: table semantics and query semantics, weighing the relative semantic and performance tradeoffs implicit in the two models.
  • Rather than viewing the disclosure control problem as one of checking against a list of privileges, they transform it into a query modification problem.
  • Experimental evaluation.
Table semantics
  • The table semantics model conceptually defines a view of each data table for each purpose-recipient pair, based on the disclosure constraints specified in the privacy policy. These views combine to produce a coherent relational database for each purpose-recipient pair, independent of any queries, and queries are evaluated against this database.
Query semantics
  • The query semantics model takes the query into account when enforcing disclosure.
  • Both models mask prohibited values using SQL's null value.
System Overview (1)
  • Policy definition: Privacy policies are expressed electronically and stored in the database where they can be used to enforce limited disclosure.
  • Privacy meta-data: The rules and conditions prescribed by the privacy policy are stored as privacy meta-data tables in the database.
  • Application context: Each query must be associated with a purpose and recipient. In their system, this information is inferred based on the context of the application issuing the query. The query interceptor infers the purpose and recipient of the query based on context information stored in an additional meta-data table.
System Overview (2)
  • Query modifier: SQL queries issued to the database are intercepted and augmented to reflect the privacy policy rules regarding the purpose and recipient issuing the query. The results of this new query are returned to the issuer.
  • Disclosure model: The result of executing a privacy modified query will reflect one of two cell-level limited disclosure models.
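A rough sketch of what a privacy-modified query might look like, using SQLite. The schema, the `phone_choices` table, and the rewritten SQL are hypothetical; the sketch shows only the general idea of replacing a prohibited cell with SQL null for a given purpose and recipient.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, phone TEXT);
    -- Hypothetical meta-data: who agreed to disclose their phone number,
    -- for which purpose and to which recipient.
    CREATE TABLE phone_choices (id INTEGER, purpose TEXT, recipient TEXT);
    INSERT INTO customers VALUES (1, 'Super Mario', '555-0100');
    INSERT INTO customers VALUES (2, 'Princess Peach', '555-0199');
    INSERT INTO phone_choices VALUES (1, 'purchase', 'shipping');
    """
)


def modified_query(purpose, recipient):
    """The issuer asked for name and phone; the rewritten query masks the
    phone cell with NULL unless the policy permits its disclosure."""
    sql = """
        SELECT c.name,
               CASE WHEN EXISTS (
                        SELECT 1 FROM phone_choices p
                        WHERE p.id = c.id AND p.purpose = ? AND p.recipient = ?)
                    THEN c.phone ELSE NULL END AS phone
        FROM customers c
        ORDER BY c.id
    """
    return conn.execute(sql, (purpose, recipient)).fetchall()


print(modified_query("purchase", "shipping"))
# [('Super Mario', '555-0100'), ('Princess Peach', None)]
```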
Discussion
  • Limited Disclosure. Can we really limit it?
  • Limited Retention. Can we really delete the data?
  • Isn’t it better to keep such a complex privacy mechanism as part of the application?
  • Why not use new techniques in writing stored procedures?
  • Databases are only part of the whole system. Securing the database helps but does not prevent attacks. A holistic solution might be what is needed.