Competitive Privacy: Secure Analysis on Integrated Sequence Data

Raymond Chi-Wing Wong (1), Eric Lo (2)

(1) The Hong Kong University of Science and Technology

(2) Hong Kong Polytechnic University

Prepared by Raymond Chi-Wing Wong

Presented by Raymond Chi-Wing Wong


Outline

  • Introduction

  • Problem

  • Algorithm

  • Conclusion


1. Introduction

  • In this talk,

    • “competitive privacy”

      • occurs when two datasets from two different sources are integrated

  • Illustrate this concept with a transportation application

  • Give the motivation why two datasets should be integrated

  • Explain that there is a privacy issue in this application


1. Introduction

  • Transportation Application

A bus company B and a metro company M have both implemented RFID-based electronic transportation payment systems (e.g., Washington DC’s SmarTrip system and Hong Kong’s Octopus system).

  • Bus Company B: passenger travel history in the bus company

  • Metro Company M: passenger travel history in the metro company


These two sequences are stored separately.

  • Bus Company B: RFID No. = 222: “Airport Bus Stop” (9:00am), “Downtown Bus Stop” (10:00am)

  • Metro Company M: RFID No. = 222: “Downtown Station” (10:15am), “Uptown Station” (11:00am)

Suppose that the bus company and the metro company want to collaborate and offer discounts to passengers who traveled from the airport to uptown using a combination of bus and metro.

We need to integrate these two datasets to know the total number of such passengers.


Integrated sequence for RFID No. = 222: “Airport Bus Stop” (9:00am), “Downtown Bus Stop” (10:00am), “Downtown Station” (10:15am), “Uptown Station” (11:00am)

  • Bus Company B: RFID No. = 222: “Airport Bus Stop” (9:00am), “Downtown Bus Stop” (10:00am)

  • Metro Company M: RFID No. = 222: “Downtown Station” (10:15am), “Uptown Station” (11:00am)
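
The integration of the two travel histories can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout `(rfid, location, time)`, the helper names `minutes_since_midnight` and `integrate`, and the rule of ordering each card's events by time of day are hypothetical and do not come from the slides.

```python
from collections import defaultdict

def minutes_since_midnight(text):
    """Convert a time such as '9:00am' or '10:15am' to minutes since midnight."""
    clock, suffix = text[:-2], text[-2:].lower()
    hour, minute = (int(part) for part in clock.split(":"))
    hour = hour % 12 + (12 if suffix == "pm" else 0)
    return hour * 60 + minute

# Example records from the slides, in an assumed (rfid, location, time) layout.
bus_records = [
    (222, "Airport Bus Stop", "9:00am"),
    (222, "Downtown Bus Stop", "10:00am"),
]
metro_records = [
    (222, "Downtown Station", "10:15am"),
    (222, "Uptown Station", "11:00am"),
]

def integrate(*datasets):
    """Merge all datasets into one time-ordered location sequence per RFID number."""
    events_by_card = defaultdict(list)
    for dataset in datasets:
        for rfid, location, time_text in dataset:
            events_by_card[rfid].append((minutes_since_midnight(time_text), location))
    return {
        rfid: [location for _, location in sorted(events)]
        for rfid, events in events_by_card.items()
    }

integrated = integrate(bus_records, metro_records)
print(integrated[222])
```

For card 222 the merged sequence lists the two bus stops followed by the two metro stations, matching the integrated record shown on the slide.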


Integrated sequence for RFID No. = 222: “Airport Bus Stop”, “Downtown Bus Stop”, “Downtown Station”, “Uptown Station”

  • Bus Company B: RFID No. = 222: “Airport Bus Stop”, “Downtown Bus Stop”

  • Metro Company M: RFID No. = 222: “Downtown Station”, “Uptown Station”


1. Introduction

  • In this talk,

    • “competitive privacy”

      • occurs when two datasets from two different sources are merged

  • Illustrate this concept with a transportation application

  • Give the motivation why two datasets should be integrated

  • Explain that there is a privacy issue in this application


Integrated sequence for RFID No. = 222: “Airport Bus Stop”, “Downtown Bus Stop”, “Downtown Station”, “Uptown Station”

  • Bus Company B: service sB (“Downtown Bus Stop”, “Bay Bus Stop”), no. of passengers = 80,000

  • Metro Company M: service sM (“Downtown Station”, “Bay Station”), no. of passengers = 10,000

These two services are competitive.

If the metro company knows that the no. of passengers using sB is 80,000, then it may offer discounts to passengers using its own service sM to attract more passengers.

Thus, the original service sB operated by the bus company will be affected.

This statistical information about the competitive services corresponds to the “competitive privacy” of the bus company.

Data integration may cause privacy issues.


2. Problem

  • Given

    • two companies

      • the bus company

      • the metro company

  • Objective

    • After the datasets from these two companies are integrated, neither company can infer any statistical information about the competitive services of the other company


2. Problem

  • Contribution

    • We are the first to propose the concept of “competitive privacy”

    • Privacy model when sequence datasets are integrated

    • Previous works

      • Privacy model when relational datasets are integrated



Trusted Third Party

Determine whether this query allows the metro company to infer any statistical information about the competitive services of the bus company.

If yes, we reject the query.

If no, we give the answer to this query.

[Diagram: Bus Company B, Metro Company M, integrated database; query 1, answer 1]


3. Algorithm

  • Idea:

    • We reject any queries related to the statistical information about all competitive services

    • We skip the details
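
Since the talk skips the details, the following Python sketch shows only one possible reading of the idea, not the authors' algorithm. It assumes that a query is a sequence of stops, that each competitive service is an (origin, destination) pair of stops, and that a query is "related" to a service when the pair occurs consecutively in the query pattern. The names `COMPETITIVE_SERVICES`, `is_restricted` and `audit` are hypothetical.

```python
# Competitive services from the talk's example, as (origin, destination) pairs.
# Storing them this way is an assumption made for illustration.
COMPETITIVE_SERVICES = {
    "sB": ("Downtown Bus Stop", "Bay Bus Stop"),
    "sM": ("Downtown Station", "Bay Station"),
}

def is_restricted(pattern, services=COMPETITIVE_SERVICES):
    """Return True if two consecutive stops of the pattern form a competitive service."""
    pairs = set(services.values())
    return any(pair in pairs for pair in zip(pattern, pattern[1:]))

def audit(pattern, evaluate):
    """Reject (return None) restricted queries; otherwise return evaluate(pattern)."""
    if is_restricted(pattern):
        return None
    return evaluate(pattern)

# Usage with a stand-in evaluator that would normally query the integrated database.
print(audit(["Downtown Bus Stop", "Bay Bus Stop"], lambda p: 80_000))
print(audit(["Airport Bus Stop", "Downtown Bus Stop"], lambda p: 5))
```

This simple check looks at one query at a time. It does not cover inference from a combination of query answers, which a complete auditor would also have to consider.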


4. Conclusion

  • Privacy Model for Data Integration

    • Competitive Privacy

  • Algorithm


Q&A


4. Empirical Studies

  • Real dataset

    • Hong Kong Local Transportation Metro Data

    • 63 stations

    • 6 transfer stations

    • 4 railway lines


4. Empirical Studies

  • Variation

    • No. of tuples in the integrated dataset

    • The pattern size in a query

  • Measurements

    • Audit time (the time to determine whether this query should be answered or rejected)

    • Ratio of rejected queries (or restricted queries)
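
The two measurements can be computed as in the sketch below. It is an assumption-laden illustration: `measure` is a hypothetical helper, the auditor is any function that returns True when a query may be answered, and the stand-in auditor and queries are made up for the example rather than taken from the experiments.

```python
import time

def measure(auditor, queries):
    """Return (average audit time in seconds, ratio of rejected queries)."""
    rejected = 0
    start = time.perf_counter()
    for query in queries:
        if not auditor(query):  # the auditor returns True when the query is answered
            rejected += 1
    elapsed = time.perf_counter() - start
    return elapsed / len(queries), rejected / len(queries)

# Stand-in auditor for illustration only: reject any pattern mentioning "Bay Bus Stop".
def toy_auditor(pattern):
    return "Bay Bus Stop" not in pattern

queries = [
    ["Airport Bus Stop", "Downtown Bus Stop"],
    ["Downtown Bus Stop", "Bay Bus Stop"],
]
average_time, rejected_ratio = measure(toy_auditor, queries)
print(rejected_ratio)
```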


4. Empirical Studies

The audit time is small.

The ratio of restricted queries is small.


Trusted Third Party

Determine whether this query allows the bus company to infer any statistical information about the competitive services of the metro company.

If yes, we reject the query.

If no, we give the answer to this query.

  • query 1 (pattern size = 4): e.g., the total number of passengers who have a travel pattern “Airport Bus Stop”, “Downtown Bus Stop”, “Downtown Station”, “Uptown Station”

  • answer 1: 20,000

[Diagram: Bus Company B, Metro Company M, integrated database]
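
A count query of this kind can be sketched as follows. The code assumes that a passenger "has" a travel pattern when the pattern appears as a run of consecutive stops in the passenger's integrated sequence; the slides do not define the matching rule, so this is an assumption. The toy data and the names `contains_pattern` and `count_passengers` are hypothetical, and the 20,000 figure from the slide is not reproduced.

```python
def contains_pattern(sequence, pattern):
    """True if `pattern` occurs as consecutive elements of `sequence` (assumed rule)."""
    size = len(pattern)
    return any(
        sequence[i:i + size] == pattern
        for i in range(len(sequence) - size + 1)
    )

def count_passengers(integrated, pattern):
    """Number of RFID cards whose integrated sequence contains the pattern."""
    return sum(1 for seq in integrated.values() if contains_pattern(seq, pattern))

# Toy integrated database: RFID number -> time-ordered list of stops.
integrated = {
    111: ["Airport Bus Stop", "Downtown Bus Stop", "Downtown Station", "Uptown Station"],
    222: ["Airport Bus Stop", "Downtown Bus Stop", "Downtown Station", "Uptown Station"],
    333: ["Downtown Station", "Uptown Station"],
}

# Query with pattern size 4, as in the slide's example.
pattern = ["Airport Bus Stop", "Downtown Bus Stop", "Downtown Station", "Uptown Station"]
print(count_passengers(integrated, pattern))
```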


Trusted Third Party

Determine whether this query allows the bus company to infer any statistical information about the competitive services of the metro company.

If yes, we reject the query.

If no, we give the answer to this query.

[Diagram: Bus Company B, Metro Company M, integrated database; query 2, answer 2]


Trusted Third Party

Determine whether this query allows the bus company to infer any statistical information about the competitive services of the metro company.

If yes, we reject the query.

If no, we give the answer to this query.

[Diagram: Bus Company B, Metro Company M, integrated database; query 3, answer 3]


  • Each query alone may not provide any statistical information about the competitive services

  • However, the combination of all query answers may allow the metro company to infer statistical information about the competitive services


Trusted Third Party

  • Knowledge 1 (query answer): the total number of passengers who have a travel pattern “Downtown District”, “Bay District” = 90,000

  • Knowledge 2: there are two services from “Downtown District” to “Bay District”

      1. The service provided by the bus company (“Downtown Bus Stop” to “Bay Bus Stop”)

      2. The service provided by the metro company (“Downtown Station” to “Bay Station”)

  • Knowledge 3: the total number of passengers who have a travel pattern “Downtown Station” to “Bay Station” = 10,000

  • Conclusion: the total number of passengers who have a travel pattern “Downtown Bus Stop” to “Bay Bus Stop” = 90,000 – 10,000 = 80,000. This is statistical information about the competitive services of the bus company.

[Diagram: Bus Company B, Metro Company M, integrated database]
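
The inference above is a single subtraction, spelled out here as a small Python sketch. It assumes, following Knowledge 2, that the two services are the only ones between the two districts and that each counted passenger used exactly one of them. The function name `infer_other_service_count` is hypothetical.

```python
def infer_other_service_count(district_total, own_service_count):
    """With exactly two services on a district pair, the remainder is the competitor's."""
    return district_total - own_service_count

district_total = 90_000  # Knowledge 1: answer to the district-level query
metro_count = 10_000     # Knowledge 3: the metro company's own service
bus_count = infer_other_service_count(district_total, metro_count)
print(bus_count)  # 80000
```

The metro company never queries the bus service directly, yet it recovers the bus company's count from an aggregate answer plus its own data.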


Integrated sequence for RFID No. = 222: “Airport Bus Stop”, “Downtown Bus Stop”, “Downtown Station”, “Uptown Station”

  • Both companies want to know the total number of passengers traveling from “Airport Bus Stop” to “Uptown Station”

  • Roll-up: both companies want to know the total number of passengers traveling from “Airport District” to “Uptown District”
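
A roll-up from stops to districts can be sketched as below. The stop-to-district mapping is an assumption inferred from the names used in the talk, and collapsing consecutive stops that fall in the same district is a design choice of this sketch, not something the slides specify. The names `STOP_TO_DISTRICT` and `roll_up` are hypothetical.

```python
# Assumed mapping from stops/stations to districts, inferred from their names.
STOP_TO_DISTRICT = {
    "Airport Bus Stop": "Airport District",
    "Downtown Bus Stop": "Downtown District",
    "Downtown Station": "Downtown District",
    "Uptown Station": "Uptown District",
}

def roll_up(sequence, mapping=STOP_TO_DISTRICT):
    """Replace each stop by its district, merging consecutive repeats of a district."""
    rolled = []
    for stop in sequence:
        district = mapping[stop]
        if not rolled or rolled[-1] != district:
            rolled.append(district)
    return rolled

sequence = ["Airport Bus Stop", "Downtown Bus Stop", "Downtown Station", "Uptown Station"]
rolled = roll_up(sequence)
print(rolled)
```

After the roll-up, the passenger's trip starts in “Airport District” and ends in “Uptown District”, so it is counted by the district-level query.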


Trusted Third Party

Determine whether this query allows the metro company to infer any statistical information about the competitive services of the bus company.

If yes, we reject the query.

If no, we give the answer to this query.

[Diagram: Bus Company B, Metro Company M, integrated database; query 1, answer 1]


Trusted Third Party

Determine whether this query allows the metro company to infer any statistical information about the competitive services of the bus company.

If yes, we reject the query.

If no, we give the answer to this query.

[Diagram: Bus Company B, Metro Company M, integrated database; query 2, answer 2]


Trusted Third Party

[Diagram: Bus Company B, Metro Company M, integrated database; query 3, answer 3]

