graph analysis with node differential privacy n.
Download
Skip this Video
Loading SlideShow in 5 Seconds..
Graph Analysis with Node Differential Privacy PowerPoint Presentation
Download Presentation
Graph Analysis with Node Differential Privacy

Loading in 2 Seconds...

play fullscreen
1 / 24

Graph Analysis with Node Differential Privacy - PowerPoint PPT Presentation


  • 160 Views
  • Uploaded on

Graph Analysis with Node Differential Privacy. Sofya Raskhodnikova Penn State University. Joint work with Shiva Kasiviswanathan ( GE Research ), Kobbi Nissim ( Ben-Gurion U. and Harvard U. ), Adam Smith ( Penn State ). Publishing information about graphs.

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about 'Graph Analysis with Node Differential Privacy' - eugenia-weber


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
graph analysis with node differential privacy

Graph Analysis with Node Differential Privacy

SofyaRaskhodnikova

Penn State University

Joint work withShiva Kasiviswanathan(GE Research),

KobbiNissim(Ben-Gurion U. and Harvard U.),

Adam Smith (Penn State)

publishing information about graphs
Publishing information about graphs

Many datasets can be represented as graphs

  • “Friendships” in online social network
  • Financial transactions
  • Email communication
  • Romantic relationships

image source http://community.expressor-software.com/blogs/mtarallo/36-extracting-data-facebook-social-graph-expressor-tutorial.html

Privacy is a big issue!

American J. Sociology,

Bearman, Moody, Stovel

private analysis of graph data
Private analysis of graph data

Graph G

Users

Trusted

curator

Government,

researchers,

businesses

(or)

malicious

adversary

)

(

queries

answers

  • Two conflicting goals:utility and privacy
    • utility: accurate answers
    • privacy: ?

image source http://www.queticointernetmarketing.com/new-amazing-facebook-photo-mapper/

differential privacy for graph data
Differential privacy for graph data

Graph G

Users

Differential privacy [DworkMcSherryNissim Smith 06]

An algorithm A is-differentially private if

for all pairs of neighborsand all sets of answers S:

Trusted

curator

Government,

researchers,

businesses

(or)

malicious

adversary

)

(

queries

A

answers

  • Intuition:neighbors are datasets that differ only in some information we’d like to hide (e.g., one person’s data)

image source http://www.queticointernetmarketing.com/new-amazing-facebook-photo-mapper/

two variants of differential privacy for graphs
Two variants of differential privacy for graphs
  • Edge differential privacy

Two graphs are neighbors if they differ in one edge.

  • Node differential privacy

Two graphs are neighbors if one can be obtained from the other by deleting a node and its adjacent edges.

G:

G:

G:

G:

node differentially private analysis of graphs
Node differentially private analysis of graphs

Graph G

Users

Trusted

curator

Government,

researchers,

businesses

(or)

malicious

adversary

)

(

queries

A

answers

  • Two conflicting goals:utility and privacy
    • Impossible to get both in the worst case
  • Previously: no node differentially private algorithms that are accurate on realistic graphs

image source http://www.queticointernetmarketing.com/new-amazing-facebook-photo-mapper/

our contributions
Our contributions
  • First node differentially private algorithms that are accurate for sparse graphs
    • node differentially privatefor all graphs
    • accuratefor a subclass of graphs, which includes
      • graphs with sublinear (not necessarily constant) degree bound
      • graphs where the tail of the degree distribution is not too heavy
      • dense graphs
  • Techniques for node differentially private algorithms
  • Methodology for analyzing the accuracy of such algorithms on realistic networks

Concurrent work on node privacy [Blocki Blum DattaSheffet 13]

our contributions algorithms
Our contributions: algorithms
  • Node differentially private algorithms for releasing
    • number of edges
    • counts of small subgraphs

(e.g., triangles, -triangles, -stars)

    • degree distribution
  • Accuracy analysis of our algorithms for graphs with not-too-heavy-tailed degree distribution: with -decayfor constant

Notation:

fraction of nodes in G of degree

    • Every graph satisfies -decay
    • Natural graphs (e.g., “scale-free” graphs, Erdos-Renyi) satisfy

Frequency

A graph G satisfies -decayif for all

Degrees

our contributions accuracy analysis
Our contributions: accuracy analysis
  • Node differentially private algorithms for releasing
    • number of edges
    • counts of small subgraphs

(e.g., triangles, -triangles, -stars)

    • degree distribution
  • Accuracy analysis of our algorithms for graphs with not-too-heavy-tailed degree distribution: with -decay for constant
    • number of edges
    • counts of small subgraphs

(e.g., triangles, -triangles, -stars)

    • degree distribution

A graph G satisfies -decay if for all

(1+o(1))-approximation

}

previous work on
Previous work on

differentially private computations on graphs

Edge differentially private algorithms

  • number of triangles, MST cost [NissimRaskhodnikova Smith 07]
  • degree distribution[Hay RastogiMiklauSuciu 09, Hay Li Miklau Jensen 09]
  • small subgraph counts[KarwaRaskhodnikova Smith Yaroslavtsev 11]
  • cuts[Blocki Blum DattaSheffet12]

Edge private against Bayesian adversary (weaker privacy)

  • small subgraphcounts [Rastogi Hay MiklauSuciu09]

Node zero-knowledge private (strongerprivacy)

  • average degree, distances to nearest connected, Eulerian, cycle-free graphs for dense graphs [GehrkeLui Pass 12]
differential privacy basics
Differential privacy basics

Graph G

Users

How accurately

can an -differentially private algorithm release f(G)?

Trusted

curator

Government,

researchers,

businesses

(or)

malicious

adversary

)

(

statistic f

A

approximation to f(G)

global sensitivity framework dmns 06
Global sensitivity framework [DMNS’06]
  • Global sensitivity of a function is
  • For every function there is an -differentially private algorithm that w.h.p. approximates with additive error .
  • Examples:
  • (G) is the number of edges in G.
  • (G) is the number of triangles in G.
  • =.

=.

projections on graphs of small degree
“Projections” on graphs of small degree

Let = family of all graphs,

= family of graphs of degree .

Notation. = global sensitivity of over.

=global sensitivity of over .

Observation. is low for many useful .

Examples:

  • =(compare to= )
  • = (compare to = )

Idea: ``Project’’ on graphs in for a carefully chosen d << n.

Goal: privacy for all graphs

method 1 lipschitz extensions
Method 1: Lipschitz extensions
  • Release via GS framework[DMNS’06]
  • Requires designing Lipschitz extension for each function
    • we base ours on maximum flow and linear and convex programs

A function is a Lipschitz extension

of from to if

  • agrees with on and
  • =

high

=

low

lipschitz extension of flow graph
Lipschitz extension of : flow graph

For a graph G=(V, E), define flow graph of G:

Add edge iff.

(G) is the value of the maximum flow in this graph.

Lemma.(G)/2is a Lipschitz extension of .

1

1'

1

2

2'

s

3

3'

t

4

4'

5

5'

lipschitz extension of flow graph1
Lipschitz extension of : flow graph

For a graph G=(V, E), define flow graph of G:

Add edge iff.

(G) is the value of the maximum flow in this graph.

Lemma.(G)/2is a Lipschitz extension of .

Proof: (1) (G) = for all G

(2) = 2⋅

1

1'

1/

1

2

2'

s

3

3'

t

4

4'

5

5'

lipschitz extension of flow graph2
Lipschitz extension of : flow graph

For a graph G=(V, E), define flow graph of G:

(G) is the value of the maximum flow in this graph.

Lemma.(G)/2is a Lipschitz extension of .

Proof: (1) (G) = for all G

(2) = 2⋅= 2

1

1'

1

2

2'

s

3

3'

t

4

4'

5

5'

6

6'

lipschitz extensions via linear convex programs
Lipschitz extensions via linear/convex programs

For a graph G=([n], E), define LP with variables for all triangles :

(G) is the value of LP.

Lemma.(G)is a Lipschitz extension of .

  • Can be generalized to other counting queries
  • Other queries use convex programs

Maximize

for all triangles

for all nodes

method 2 generic reduction to privacy over
Method 2: Generic reduction to privacy over

Input: Algorithm B that is node-DP over

Output: Algorithm A that is node-DP over , has accuracy similar to B on “nice” graphs

  • Time(A) = Time(B) + O(m+n)
  • Reduction works for all functions

How it works: Truncation T(G) outputs G with nodes of degree removed.

  • Answer queries on T(G) instead of G

high

low

  • via Smooth Sensitivity framework [NRS’07]
  • via finding a DP upper bound on [Dwork Lei 09, KRSY’11]
  • and running any algorithm that is -node-DP over

G

A

T(G)

query f

T

S

+ noise

(G)

generic reduction via truncation
Generic Reduction via Truncation
  • Truncation T(G) removes nodes of degree .
  • On query , answer

How much noise?

  • Local sensitivity of as a map
  • Lemma.,

where = nodes of degree .

  • Global sensitivityis too large.

Frequency

Nodes that determine

d

Degrees

smooth sensitivity of truncation
Smooth Sensitivity of Truncation

Smooth Sensitivity Framework [NRS ‘07]

is a smooth bound on local sensitivity ofif

  • for all neighbors

Lemma. is a smooth bound for , computable in time

“Chain rule”: is a smooth bound for

G

A

T(G)

query f

T

S

+ noise

(G)

utility of the truncation mechanism
Utility of the Truncation Mechanism

Lemma. If we truncate to a random degree in ,

  • Application to releasing the degree distribution:

an -node differentially private algorithmsuch that

with probability at least if satisfies -decay for

Utility:If G is -bounded, expected noise magnitude is .

G

A

T(G)

query f

T

S

+ noise

(G)

techniques used to obtain our results
Techniques used to obtain our results
  • Node differentially private algorithms for releasing
    • number of edges
    • counts of small subgraphs

(e.g., triangles, -triangles, -stars)

    • degree distribution

via Lipschitz extensions

} via generic reduction

conclusions
Conclusions
  • It is possible to design node differentially private algorithms with good utility on sparse graphs
    • One can first test whether the graph is sparse privately
  • Directions for future work
    • Node-private algorithm for releasing cuts
    • Node-private synthetic graphs
    • What are the right notions of privacy for graph data?