
Privacy Issues in Disclosing Averages

Privacy Issues in Disclosing Averages. Susmit Sarkar (CMU). Non-Interference. Non-Interference : Observable actions of programs are not influenced by sensitive data Too restrictive in practice! Think of password security. Safe Relaxation of Non-Interference. Passwords are sensitive data


Presentation Transcript


  1. Privacy Issues in Disclosing Averages Susmit Sarkar (CMU)

  2. Non-Interference • Non-Interference : Observable actions of programs are not influenced by sensitive data • Too restrictive in practice! • Think of password security

  3. Safe Relaxation of Non-Interference • Passwords are sensitive data • Checking passwords violates non-interference • This is still okay [Volpano] if passwords are chosen randomly • The interaction is carefully controlled

  4. Generalizing to Averages • Idea: restrict access to allow us to answer interesting queries • Also, we can measure information loss • We want to calculate averages on private data • Generalize the notion of averages

  5. Content Host’s Problem • A content host serves multiple content providers • The number of hits each provider receives is sensitive information • Clients often ask for the average hits of a specified set of clients

  6. Example: Sports Site • You want to know how the redesign of your sports portal worked • Complication: it happens to be Super Bowl Sunday • We want averages over all sports sites • What if there are only 2 sports sites?

  7. Formal Model • Data • Query := d1 + d3 + d5 = ? • Problem: what about the query vectors 1 0 1 1 0 and 1 0 1 1 1? Their difference, 0 0 0 0 1, isolates d5
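
The problem on slide 7 can be made concrete with a few lines of Python. This is a minimal sketch, not part of the original talk: the data values and the helper names `answer` and `subtract` are hypothetical, chosen only to show how two individually harmless sum queries combine to reveal one private entry.

```python
# Hypothetical private data d1..d5 (illustrative values, not from the talk).
data = [12, 7, 30, 5, 9]

# The two query vectors from slide 7.
q1 = [1, 0, 1, 1, 0]
q2 = [1, 0, 1, 1, 1]


def answer(query, values):
    """Return the weighted sum that the host would disclose for a query."""
    return sum(w * v for w, v in zip(query, values))


def subtract(q_a, q_b):
    """Return the component-wise difference q_a - q_b of two query vectors."""
    return [a - b for a, b in zip(q_a, q_b)]


# Subtracting the query vectors yields a vector that selects only d5.
derived = subtract(q2, q1)

# Subtracting the disclosed answers yields exactly the private value d5.
leaked = answer(q2, data) - answer(q1, data)

print(derived, leaked)
```

Neither query asks about d5 alone, yet the pair of answers determines it, which is why the host has to reason about combinations of queries rather than single queries.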

  8. Query Model • Solution : Maintain history • Idea : add current query to set, decide if “bad” vectors are derivable • We restrict attention to weighted sums
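
The history-based check on slide 8 might look roughly like the sketch below. It makes a simplifying assumption that is not in the talk: a vector counts as "bad" only if it is a unit vector (it isolates a single data item), whereas the later slides use an entropy measure. The class name `QueryAuditor` and the helpers `rank` and `derives_unit_vector` are hypothetical.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix via exact Gauss-Jordan elimination over the rationals."""
    m = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    ncols = len(m[0]) if m else 0
    for col in range(ncols):
        # Find a row at or below position rk with a nonzero entry in this column.
        pivot = next((r for r in range(rk, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        # Eliminate this column from every other row.
        for r in range(len(m)):
            if r != rk and m[r][col] != 0:
                f = m[r][col] / m[rk][col]
                m[r] = [a - f * b for a, b in zip(m[r], m[rk])]
        rk += 1
    return rk


def derives_unit_vector(queries):
    """True if some unit vector is a linear combination of the query vectors."""
    n = len(queries[0])
    base = rank(queries)
    for k in range(n):
        e_k = [1 if i == k else 0 for i in range(n)]
        # e_k lies in the span exactly when adding it leaves the rank unchanged.
        if rank(queries + [e_k]) == base:
            return True
    return False


class QueryAuditor:
    """Keeps the history of answered queries and refuses unsafe new ones."""

    def __init__(self):
        self.history = []

    def allow(self, query):
        candidate = self.history + [list(query)]
        if derives_unit_vector(candidate):
            return False  # answering would let a client isolate one data item
        self.history = candidate
        return True


auditor = QueryAuditor()
first = auditor.allow([1, 0, 1, 1, 0])   # nothing derivable yet
second = auditor.allow([1, 0, 1, 1, 1])  # with the first query, isolates d5
print(first, second)
```

A refused query is not added to the history, so later queries are judged only against what was actually answered.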

  9. Issues Ignored in Model • Answers of queries (Right Hand Sides) • Data values • Extraneous information: correlation between data • Some of these are left to future work

  10. Characterizing Bad Vectors • (0 1 0 0 0 0 0 0 0 0 0) • (1 $10^6$ 1 1 1 1 1 1 1 1 1) • We want a measure that indicates when all entries are of similar magnitude

  11. Idea: Entropy • We use the entropy function: $-\sum_i p_i \lg p_i$ • Normalize entries so that magnitudes sum to one • Then treat the magnitudes as probabilities in the entropy definition • Entropy is low when data is skewed
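
The measure on slide 11 can be sketched in a few lines. The function name `magnitude_entropy` and the sample vectors are hypothetical; the computation follows the slide: take magnitudes, normalize them to sum to one, and apply the base-2 entropy formula, treating terms with zero probability as contributing nothing.

```python
import math


def magnitude_entropy(vector):
    """Entropy (in bits) of a vector's normalized entry magnitudes."""
    mags = [abs(x) for x in vector]
    total = sum(mags)
    if total == 0:
        raise ValueError("entropy is undefined for the all-zero vector")
    probs = [m / total for m in mags]
    # Zero-probability entries are skipped, following the convention 0 lg 0 = 0.
    return -sum(p * math.log2(p) for p in probs if p > 0)


isolating = magnitude_entropy([0, 1, 0, 0])       # all weight on one entry
skewed = magnitude_entropy([1, 1000, 1, 1])       # one entry dominates
balanced = magnitude_entropy([1, 1, 1, 1])        # all entries equal
print(isolating, skewed, balanced)
```

A vector that isolates one entry scores zero, a heavily skewed vector scores close to zero, and a vector with equal magnitudes reaches the maximum of lg n bits.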

  12. Formal Problem Statement • $m$ query vectors $Q_i = (q_{i1}, q_{i2}, \ldots, q_{in})$ • Unknown linear combination $U = c_1 Q_1 + c_2 Q_2 + \cdots$ • Variables $u_i = \sum_j c_j q_{ji}$ • Variables $u'_i \ge u_i$ and $u'_i \ge -u_i$, so that $u'_i \ge |u_i|$

  13. Calculating Entropy • Entropy: $-\sum_i \frac{u'_i}{\sum_j u'_j} \lg \frac{u'_i}{\sum_j u'_j} \ge T$ • Minimize: $\sum_i u'_i$ • Notice that this is a convex program
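
To make the quantities on slides 12 and 13 concrete, the sketch below evaluates them for one candidate choice of coefficients. It does not solve the program and is not the algorithm cited in the talk. The names `combine` and `evaluate` and the threshold value are hypothetical, and the auxiliary variables are set to their tightest feasible values, the absolute values of the combined entries.

```python
import math


def combine(queries, coeffs):
    """Entries of the linear combination of the query vectors with given coefficients."""
    n = len(queries[0])
    return [sum(c * q[i] for c, q in zip(coeffs, queries)) for i in range(n)]


def evaluate(queries, coeffs, threshold):
    """Return (objective, entropy, meets_threshold) for one candidate combination."""
    u = combine(queries, coeffs)
    u_prime = [abs(x) for x in u]      # tightest values satisfying u' >= u and u' >= -u
    objective = sum(u_prime)           # the quantity the program minimizes
    if objective == 0:
        raise ValueError("the combination is the zero vector")
    probs = [x / objective for x in u_prime]
    entropy = -sum(p * math.log2(p) for p in probs if p > 0)
    return objective, entropy, entropy >= threshold


queries = [[1, 0, 1, 1, 0], [1, 0, 1, 1, 1]]
T = 1.0  # hypothetical threshold, chosen only for illustration

# Taking the second query minus the first isolates the last entry.
isolating = evaluate(queries, [-1, 1], T)

# Using the first query alone spreads weight over three entries.
spread = evaluate(queries, [1, 0], T)
print(isolating, spread)
```

The isolating combination has entropy zero and fails the threshold, while the single query has entropy lg 3 and passes it.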

  14. Convex Programming • [Vempala] allows us to do convex programming efficiently • Their algorithm allows us to solve our problem in polynomial time

  15. Future Work • Extend our measure to take into account the Right Hand Sides • Change the model to maximize queries we can answer

  16. Bibliography • [Volpano] “Verifying Secrets and Relative Secrecy”, Volpano and Smith, POPL ’00 • [Vempala] “Solving Convex Programs by Random Walks”, Vempala and Bertsimas, STOC ’02
