Mining Multiple Private Databases

Topk Queries Across Multiple Private Databases (2005)

Li Xiong (Emory University)
Subramanyam Chitti (GA Tech)
Ling Liu (GA Tech)

Presented by: Cesar Gutierrez

About Me
  • ISYE Senior and CS minor
  • Graduating December, 2008
  • Humanitarian Logistics and/or Supply Chain
  • Originally from Lima, Peru
  • Travel, paintball and politics
Outline
  • Intro. & Motivation
  • Problem Definition
  • Important Concepts & Examples
  • Private Algorithm
  • Conclusion
Introduction
  • Technology has lowered the barriers to information sharing
  • Growing need for distributed data-mining tools that preserve privacy
  • Trade-off among three goals:
    • Accuracy
    • Efficiency
    • Privacy

Motivating Scenarios
  • CDC needs to study insurance data to detect disease outbreaks
    • Disease incidents
    • Disease seriousness
    • Patient Background
  • Legal and commercial constraints prevent the release of policyholders' information
Motivating Scenarios (cont'd)
  • Industrial trade group collaboration
    • Useful pattern: "manufacturing using chemical supplies from supplier X has a high failure rate"
    • Trade secret: "manufacturing process Y gives low failure rate"
Problem & Assumptions
  • Model: n nodes, horizontal partitioning
  • Assume Semi-honesty:
    • Nodes follow specified protocol
    • Nodes attempt to learn additional information about other nodes
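
To make the data model concrete: under horizontal partitioning, every node holds rows with the same schema but for different records, and the query answer should match what a single database holding all rows would return. The sketch below is a non-private baseline only; the function name and the sample values are hypothetical and not taken from the slides.

```python
import heapq


def topk_over_union(partitions, k):
    """Non-private baseline: the k largest values across every node's rows.

    `partitions` is a list with one entry per node; each entry is that
    node's local list of values for the attribute being queried.
    """
    all_values = (value for part in partitions for value in part)
    return heapq.nlargest(k, all_values)


# Four nodes, each holding different rows of the same attribute (made-up data).
partitions = [[30, 12], [10, 7], [40, 33], [20]]
print(topk_over_union(partitions, 2))  # → [40, 33]
```

A private protocol aims to return this same answer without any node seeing another node's rows.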

...

Challenges
  • Why not use a Trusted Third Party (TTP)?
    • Difficult to find one that is trusted
    • Increased danger from single point of compromise
  • Why not use secure multi-party computation techniques?
    • High communication overhead
    • Feasible for small inputs only
Recall Our 3-D Goal
  • Accuracy
  • Efficiency
  • Privacy

Private Max
  • Actual data is sent on the first pass
  • The starting point is static and known

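[Figure: a ring of four nodes holding 30, 10, 40, and 20. Beginning at a fixed start node, the running maximum is passed from node to node: 30, 30, 40, 40.]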
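
The single-pass protocol in the figure can be sketched as follows. This is a minimal illustration, assuming each node simply forwards the larger of its own value and the value it received; the function name and the message log are hypothetical additions used to make the leak visible.

```python
def naive_ring_max(local_values, start=0):
    """Pass a running maximum once around a ring of nodes.

    Returns the final maximum and the list of (node, message) pairs,
    in sending order, so the information leak can be inspected.
    """
    n = len(local_values)
    running_max = None
    messages = []
    for step in range(n):
        node = (start + step) % n
        value = local_values[node]
        # The start node has nothing to compare against, so it sends
        # its raw local value: this is the leak noted on the slide.
        running_max = value if running_max is None else max(running_max, value)
        messages.append((node, running_max))
    return running_max, messages


values = [30, 10, 40, 20]  # the four values shown in the figure
result, messages = naive_ring_max(values)
print(result)    # → 40
print(messages)  # → [(0, 30), (1, 30), (2, 40), (3, 40)]
```

The first message equals the start node's actual value, and any node that sees the message change learns that its predecessor holds the new value.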

Multi-Round Max
  • Randomly perturbed data is passed to the successor over multiple passes
  • No successor can determine its predecessor's actual data
  • Randomized starting point

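[Figure: a ring of four database nodes holding 30, 10, 40, and 20. Starting from 0, the value passed along each link over three rounds is 18, 32, 35 on the first two links and 32, 35, 40 on the last two, so the output reaches the maximum, 40.]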
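
A rough sketch of the multi-round idea follows. The slides do not spell out the randomization rule, so this assumes one plausible scheme: in round r, a node whose value exceeds the incoming value reports a random smaller value with probability p0 * d**(r - 1), and its true value otherwise. The names `p0`, `d`, and `rounds`, the integer-valued data, and the initial value of 0 are assumptions made for illustration.

```python
import random


def multi_round_max(local_values, rounds=10, p0=1.0, d=0.5, seed=None):
    """Probabilistic multi-round max over a ring (illustrative sketch).

    Assumes non-negative integer values. The value in circulation only
    ever increases and never exceeds the true maximum.
    """
    rng = random.Random(seed)
    n = len(local_values)
    start = rng.randrange(n)  # randomized starting node
    g = 0                     # initial value injected into the ring
    for r in range(1, rounds + 1):
        p_r = p0 * d ** (r - 1)  # assumed randomization probability for round r
        for step in range(n):
            v = local_values[(start + step) % n]
            if g >= v:
                continue  # nothing to contribute: forward g unchanged
            if rng.random() < p_r:
                g = rng.randrange(g, v)  # random value in [g, v), hides v
            else:
                g = v  # reveal the true local value
    return g


values = [30, 10, 40, 20]
print(multi_round_max(values, seed=1))
```

Because a node sometimes forwards a random value below its own, a successor cannot tell whether a received number is real data, at the cost of needing several rounds before the output settles.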

Evaluation Parameters
  • Large k = "avoid information leaks"
  • Large d = more randomization = more privacy
  • Small d = more accurate (deterministic)
  • Large r = "as accurate as ordinary classifier"
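
The interplay between d and r can be illustrated numerically. The sketch below assumes a randomization probability of p0 * d**(r - 1) in round r (p0 is an assumed initial probability that does not appear on the slide) and treats the output as wrong only if the node holding the maximum randomizes in every round. Under those assumptions, the error probability after r rounds is the product of the per-round probabilities.

```python
def randomization_probability(p0, d, r):
    """Assumed probability that a node hides its value in round r (r >= 1)."""
    return p0 * d ** (r - 1)


def rounds_needed(p0, d, epsilon):
    """Smallest r whose cumulative error probability is at most epsilon.

    The error probability is the product of the per-round randomization
    probabilities for rounds 1..r.
    """
    if not (0 <= p0 <= 1 and 0 <= d < 1 and 0 < epsilon < 1):
        raise ValueError("need 0 <= p0 <= 1, 0 <= d < 1, 0 < epsilon < 1")
    product = 1.0
    r = 0
    while product > epsilon:
        r += 1
        product *= randomization_probability(p0, d, r)
    return r


print(rounds_needed(1.0, 0.5, 0.001))   # → 5
print(rounds_needed(1.0, 0.25, 0.001))  # → 4
```

A smaller d makes the randomization die out sooner, so fewer rounds are needed for the same accuracy, while a larger d keeps values hidden for longer.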
Conclusion
  • Problems Tackled
    • Preserving efficiency and accuracy while introducing provable privacy to the system
    • Improving a naive protocol
    • Reducing privacy risk in an efficient manner
Critique
  • Full understanding depends on reading other research papers
  • Few or no illustrations
  • A real-life example would have made the charts easier to understand