Presentation Transcript
A Privacy-Preserving Framework for Personalized Social Recommendations

Zach Jorgensen¹ and Ting Yu¹,²

¹NC State University, Raleigh, NC, USA
²Qatar Computing Research Institute, Doha, Qatar

EDBT, March 24-28, 2014, Athens, Greece

Motivation

  • Social recommendation task – to predict items a user might like based on the items his/her friends like

[Figure: users' item preferences (i1-i5) and social relations feed into a social recommendation system, which outputs personalized recommendations.]

Motivation

Model: Top-n Social Recommender

μ(i, u) = the utility of recommending item i to user u

  • Input
    • Items
    • Users
    • Social Graph
    • Preference Graph
    • # of recs, n
  • Algorithm (see the sketch below)
    • For every item i, for every user u: compute μ(i, u)
    • For every user u: sort items by utility and recommend the top n items

Output: a personalized list of the top n items (by utility), for each user
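A minimal sketch of this model in Python, assuming set-based preference data and a pluggable similarity function; the names (`prefers`, `sim`) are illustrative, not from the paper:

```python
# Illustrative sketch of the (non-private) top-n social recommender.
# `prefers[v]` is the set of items user v likes; `sim(u, v)` is a social
# similarity score computed from the public social graph.
def top_n_recommendations(users, items, prefers, sim, n):
    recs = {}
    for u in users:
        # mu(i, u): similarity-weighted count of other users who like i
        utility = {i: sum(sim(u, v) for v in users
                          if v != u and i in prefers[v])
                   for i in items}
        # personalized list of the top n items by utility
        recs[u] = sorted(items, key=utility.get, reverse=True)[:n]
    return recs
```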

Motivation

μ(i, u) = utility of recommending item i to user u:

    μ(i, u) = Σ_v s(u, v) · x(v, i)

where x(v, i) = 1 if a preference edge between v and i exists and 0 otherwise, and s(u, v) is a social similarity measure (e.g., Common Neighbors) computed on the social graph.
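As a concrete instance, a hedged sketch of μ with Common Neighbors as the similarity measure (the data layout is an assumption, not the authors' code):

```python
# Sketch: mu(i, u) = sum_v s(u, v) * x(v, i), with Common Neighbors as s.
# `friends[u]` is the set of u's neighbors in the public social graph;
# `prefers[v]` is the set of items v likes (x(v, i) = 1 iff i is in it).
def common_neighbors(friends, u, v):
    return len(friends[u] & friends[v])

def utility(i, u, users, friends, prefers):
    return sum(common_neighbors(friends, u, v)
               for v in users if v != u and i in prefers[v])
```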

Motivation
  • Many existing structural similarity measures could be used [Survey: Lü & Zhou, 2011]
  • We considered
    • Common Neighbors
    • Adamic-Adar
    • Graph Distance
    • Katz
Motivation

Two main privacy problems:

  • Protect privacy of user data from malicious service provider (i.e., the recommender)
  • Protect privacy of user data from malicious/curious users
  • Our focus: preventing disclosure of individual item preferences through the output
Motivation

Simple attack on Common Neighbors:

[Figure: example with users Alice and Bob; by observing the recommender's output, an adversary deduces that "Bob listens to Bieber!"]

Motivation

Adversary

  • Knowledge of all preferences except target edge
  • Observes all recommendations
  • Knowledge of the algorithm

Goal: to deduce the presence/absence of a single preference edge (the target edge)

Motivation

Differential Privacy [Dwork, 2006]

  • Provides strong, formal privacy guarantees
  • Informally: guarantees that recommendations will be (almost) the same with/without any one preference edge in the input
Motivation

Related work: Machanavajjhala et al. (VLDB 2011)

  • Task: for each node, recommend the node with the highest social similarity (Common Neighbors, Katz).
  • No distinction between users/items or between preference/social edges.
  • Negative theoretical results.
Motivation
  • We assume that the social graph is public
  • Often true in practice…

Motivation
  • Main contribution: a framework that enables differential privacy guarantees for preference edges
  • Demonstrate on real data sets that making accurate and private social recommendations is feasible
Outline
  • Motivation
  • Differential Privacy
  • Our Approach
  • Experimental Results
  • Conclusions
Differential Privacy

A randomized algorithm A gives ε-differential privacy if for any neighboring data sets D, D′ and any S ⊆ Range(A):

    Pr[A(D) ∈ S] ≤ e^ε · Pr[A(D′) ∈ S]

Neighboring data sets differ in a single record (x1, …, xi, …, xn vs. x1, …, xi′, …, xn).

[Dwork, 2006]

Achieving Differential Privacy

Global sensitivity of A:

    ΔA = max over neighboring D, D′ of |A(D) − A(D′)|

(e.g., ΔA = 1 for a single count over records x1, …, xn)

Theorem: Ã(D) = A(D) + Lap(ΔA/ε) satisfies ε-differential privacy.

Smaller ε = more noise / more privacy
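A minimal sketch of this Laplace mechanism, assuming a numeric query with known global sensitivity:

```python
import numpy as np

def laplace_mechanism(true_answer, sensitivity, epsilon, rng=None):
    """Return true_answer + Lap(sensitivity / epsilon), which satisfies
    epsilon-differential privacy for a query with that global sensitivity
    (e.g., sensitivity 1 for a single count)."""
    rng = rng or np.random.default_rng()
    # smaller epsilon -> larger scale -> more noise, more privacy
    return true_answer + rng.laplace(0.0, sensitivity / epsilon)
```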

Properties of Differential Privacy
  • Sequential Composition: running algorithms A1, …, Ak on the same data set D, where each Ai satisfies εi-differential privacy, satisfies (Σi εi)-differential privacy overall
  • Parallel Composition: running an ε-differentially private algorithm on each of several disjoint subsets of D is ε-differentially private overall (see the budget sketch below)
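A small budget-accounting illustration of these two properties (illustrative only):

```python
# Sequential composition: queries against the SAME data consume the sum
# of their budgets. Parallel composition: one epsilon-DP query per
# DISJOINT partition of the data consumes only epsilon overall.
def sequential_budget(epsilons):
    return sum(epsilons)   # e.g., [0.05, 0.05] -> 0.1 total

def parallel_budget(epsilons):
    return max(epsilons)   # disjoint inputs: budgets do not add up
```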

Outline
  • Motivation
  • Differential Privacy
  • Our Approach
    • Simplifying observations
    • Naïve Approaches
    • Our Approach
  • Experimental Results
  • Conclusions
Simplifying Observations

  • For every item i, for every user u: compute μ(i, u). These iterations use disjoint inputs (each preference edge affects only one item's computation), so parallel composition applies.
  • For every user u: sort items by utility and recommend the top n items. This is post-processing of private output and consumes no additional privacy budget.

Our focus: an ε-differentially private procedure for computing μ(i, u), for all users u and a given item i.

Naïve Approaches

Approach 1: Noise-on-Utilities

  • For each item i, for every user u: compute a noisy utility μ̃(i, u) = μ(i, u) + Laplace noise
  • For each user u: sort items by noisy utility and recommend the top n items

Satisfies ε-differential privacy, but… destroys accuracy! (See the sketch below.)
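A hedged sketch of this baseline; the aggregate `sensitivity` argument stands in for the (large) amount by which one preference edge can shift the utilities:

```python
import numpy as np

def noise_on_utilities(utilities, sensitivity, epsilon, rng=None):
    """utilities: dict (u, i) -> mu(i, u). A single preference edge (v, i)
    shifts mu(i, u) for every user u at once, so the calibrated
    sensitivity, and hence the noise scale, is large."""
    rng = rng or np.random.default_rng()
    scale = sensitivity / epsilon
    return {k: v + rng.laplace(0.0, scale) for k, v in utilities.items()}
```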

Naïve Approaches

Approach 2: Noise-on-Edges

  • Add Laplace noise independently to each preference edge weight
  • Run the non-private algorithm on the resulting sanitized preference graph

Example: let ε = 0.1; then each 0/1 edge weight receives Laplace noise of scale 1/ε = 10.

Noise will destroy accuracy! (See the sketch below.)
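A sketch of this edge-sanitizing baseline (the data layout is an assumption):

```python
import numpy as np

def noise_on_edges(pref_weight, epsilon, rng=None):
    """pref_weight: dict (u, i) -> 0 or 1. Each edge weight gets
    independent Lap(1/epsilon) noise; with epsilon = 0.1 the scale is 10,
    swamping the 0/1 signal before the non-private algorithm even runs."""
    rng = rng or np.random.default_rng()
    return {e: w + rng.laplace(0.0, 1.0 / epsilon)
            for e, w in pref_weight.items()}
```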

Our Approach

Strategy S: ClusterEdges

[Figure: for a given item i, the preference edges from users u1-u8 (weights 1 or 0) are partitioned by strategy S into clusters c1, c2, c3.]

For now, assume S randomly assigns edges to clusters.

Our Approach

For each cluster, compute the noisy average weight:

[Figure: within each of the clusters c1, c2, c3, the 0/1 edge weights are averaged and Laplace noise is added to each cluster average.]

Our Approach

Replace each edge weight with the noisy average of its cluster:

[Figure: every preference edge in a cluster takes that cluster's noisy average as its sanitized weight.]

Our Approach

Then run the recommender on the sanitized weights:

  • For every item i, for each user u: compute μ(i, u)
  • For each user u: sort items by utility and recommend the top n items

[Figure: the recommender operates on the sanitized edge weights for users u1-u8.]

Our Approach: Rationale
  • Adding/removing a single preference edge affects one cluster average by at most 1/|ci|
  • Noise added to the average for cluster ci is Lap(1/(ε·|ci|))
  • The bigger the cluster, the smaller the noise

Example: let ε = 0.1 and |c| = 50 edges; then the noise scale is 1/(ε·|c|) = 0.2.

Intuition: the bigger the cluster, the less sensitive its average weight is to any one preference edge. (A sketch of the full mechanism follows.)
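Putting the pieces together, a minimal sketch of the cluster-and-average mechanism for a single item (the data layout is assumed):

```python
import numpy as np

def sanitize_item_edges(edge_weight, clusters, epsilon, rng=None):
    """edge_weight: dict user -> 0/1 preference weight for a fixed item;
    clusters: lists of users, derived from the public social graph."""
    rng = rng or np.random.default_rng()
    sanitized = {}
    for c in clusters:
        avg = sum(edge_weight[u] for u in c) / len(c)
        # one edge moves the average by at most 1/|c|, so Lap noise of
        # scale 1/(epsilon * |c|) suffices; bigger cluster -> less noise
        noisy_avg = avg + rng.laplace(0.0, 1.0 / (epsilon * len(c)))
        for u in c:
            sanitized[u] = noisy_avg   # every edge takes its cluster's avg
    return sanitized
```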

Our Approach: Rationale
  • The catch: averaging introduces approximation error!
  • Need a better clustering strategy that keeps the approximation error relatively low
  • The strategy must not leak privacy
Our Approach: Clustering Strategy

[Figure: community detection applied to the social graph partitions users u1-u8 into communities c0 and c1.]

Cluster the users based on the natural community structure of the public social graph.

Our Approach: Clustering Strategy

[Figure: the same user communities c0 and c1 induce, for each item, a clustering of that item's preference edges.]

For each item, derive clusters for the preference edges based on the user clusters.

Our Approach: Clustering Strategy

[Figure: community detection on the social graph, as above.]

Note: we only need to cluster the social graph once; the resulting clusters are used for all items.

Our Approach: Clustering Strategy

[Figure: community detection on the social graph, as above.]

Key point: clustering based on the public social graph does not leak privacy!

Our Approach: Clustering Strategy
  • Louvain Method [Blondel et al. 2008]
    • Greedy modularity maximization
    • Well-studied and known to produce good communities
    • Fast enough for graphs with millions of nodes
    • No parameters to tune
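For illustration, recent versions of networkx ship a Louvain implementation; a sketch under that assumption:

```python
import networkx as nx

def user_clusters(social_graph):
    """Partition the public social graph into communities (a list of node
    sets). Since only public data is touched, this step consumes no
    privacy budget, and the result is reused for every item."""
    return nx.community.louvain_communities(social_graph, seed=42)
```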
Outline
  • Motivation
  • Preliminaries
  • Our Approach
  • Experimental Results
  • Conclusions
Data Sets

Both data sets are publicly available.

Last.fm <http://ir.ii.uam.es/hetrec2011/datasets>
  • 1,892 users
  • 17,632 items
  • Avg. user deg. = 13.4 (std. 17.3)
  • Avg. prefs per user = 48.7 (std. 6.9)

Flixster <http://www.sfu.ca/~sja25/datasets>
  • 137,372 users
  • 48,756 items
  • Avg. user deg. = 18.5 (std. 31.1)
  • Avg. prefs per user = 54.8 (std. 218.2)

Measuring Accuracy
  • Normalized Discounted Cumulative Gain (NDCG) [Järvelin and Kekäläinen, 2002]
  • NDCG at n measures the quality of the private recommendations relative to the non-private recommendations, taking rank and utility into account
  • Ranges from 0.0 to 1.0, with 1.0 meaning the private recommender achieves the ideal ranking
  • Averaged over all users in the data set (see the sketch below)
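A sketch of NDCG at n under common assumptions (log2 rank discounting, true utilities as relevance); the paper's exact variant may differ:

```python
import math

def ndcg_at_n(private_ranking, true_utility, n):
    """private_ranking: items ordered by noisy utility for one user;
    true_utility: dict item -> non-private utility mu(i, u)."""
    def dcg(ranked):
        return sum(true_utility[i] / math.log2(rank + 2)
                   for rank, i in enumerate(ranked[:n]))
    ideal = sorted(true_utility, key=true_utility.get, reverse=True)
    ideal_dcg = dcg(ideal)
    return dcg(private_ranking) / ideal_dcg if ideal_dcg > 0 else 1.0
```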
Experiments: Last.fm

[Plot: average accuracy (NDCG at n = 50) vs. privacy level ε, from high privacy (small ε) to low privacy (large ε).]

Experiments: Flixster

[Plot: average NDCG at 50 vs. privacy level ε for 10,000 random users, from high to low privacy. Note: different y-axis scale than the Last.fm plot.]

Experiments: Naïve Approaches

[Plots: the naïve approaches on the Last.fm data set, for Katz, Common Neighbors, Graph Distance, and Adamic-Adar.]

Conclusions
  • Differential privacy guarantees for item preferences
  • Use clustering and averaging to trade Laplace noise for some approx. error
  • Clustering via the community structure of the social graph is a useful heuristic for clustering the edges without violating privacy
  • Personalized social recommendations can be both private and accurate
Accuracy Metric: NDCG
  • Normalized Discounted Cumulative Gain
    • Private list: items recommended to user u by the private recommender, sorted by noisy utility
    • Ideal list: items recommended to user u by the non-private recommender, sorted by true utility
    • NDCG ranges from 0 to 1
    • Averaged over all users in a data set
Social Similarity Measures
  • Adamic-Adar: s(u, v) = Σ_{w ∈ Γ(u) ∩ Γ(v)} 1 / log |Γ(w)|
  • Graph Distance: s(u, v) = (negated) length of the shortest path between u and v
  • Katz: s(u, v) = Σ_{l=1}^{∞} β^l · |paths(l, u, v)|, where β is a small damping factor and |paths(l, u, v)| is the number of paths of length l between u and v
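Illustrative implementations of these measures on a networkx graph (standard textbook definitions, not the paper's code):

```python
import math
import networkx as nx
import numpy as np

def adamic_adar(G, u, v):
    # sum over common neighbors w of 1 / log(deg(w))
    return sum(1.0 / math.log(G.degree(w))
               for w in nx.common_neighbors(G, u, v))

def graph_distance(G, u, v):
    # negated shortest-path length, so that closer pairs score higher
    return -nx.shortest_path_length(G, source=u, target=v)

def katz_scores(G, beta=0.005, max_len=5):
    # truncated Katz: sum_{l=1..max_len} beta^l * A^l, where (A^l)[u][v]
    # counts the paths of length l between u and v, damped by small beta
    A = nx.to_numpy_array(G)
    S, Al = np.zeros_like(A), np.eye(len(A))
    for l in range(1, max_len + 1):
        Al = Al @ A
        S += (beta ** l) * Al
    return S
```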

Experiments: Last.fm

[Plots: NDCG at 10 and NDCG at 100 on Last.fm.]

Experiments: Flixster

[Plots: NDCG at 10 and NDCG at 100 on Flixster.]

Comparison of approaches on the Last.fm data set:
  • Low Rank Mechanism (LRM) – Yuan et al., PVLDB 2012
  • Group and Smooth (GS) – Kellaris & Papadopoulos, PVLDB 2013
