1 Competitive Privacy: Secure Analysis on Integrated Sequence Data Raymond Chi-Wing Wong 1, Eric Lo 2 The Hong Kong University of Science and Technology.

Presentation transcript:

1 Competitive Privacy: Secure Analysis on Integrated Sequence Data. Raymond Chi-Wing Wong (1), Eric Lo (2). (1) The Hong Kong University of Science and Technology; (2) Hong Kong Polytechnic University. Prepared and presented by Raymond Chi-Wing Wong.

2 Outline: 1. Introduction; 2. Problem; 3. Algorithm; 4. Conclusion.

3 1. Introduction. Competitive privacy arises when two datasets from two different sources are integrated. In this talk, we illustrate the concept with a transportation application, give the motivation for integrating the two datasets, and explain the privacy issue that arises in this application.

4 1. Introduction. Transportation application. Bus Company B holds the passenger travel history of the bus company; Metro Company M holds the passenger travel history of the metro company. Both companies have implemented RFID-based electronic transportation payment systems (e.g., Washington DC's SmarTrip system and Hong Kong's Octopus system).

5 Bus Company B: RFID No. = 222, sequence (Airport Bus Stop, Downtown Bus Stop). Metro Company M: RFID No. = 222, sequence (Downtown Station, Uptown Station). Timestamps shown on the slide, in order: 9:00am, 10:00am, 10:15am, 11:00am. These two sequences are stored separately. Suppose that the bus company and the metro company want to collaborate and offer discounts to passengers who traveled from the airport to uptown using a combination of bus and metro. We need to integrate the two datasets to find the total number of such passengers.

6 Bus Company B: RFID No. = 222, sequence (Airport Bus Stop, Downtown Bus Stop). Metro Company M: RFID No. = 222, sequence (Downtown Station, Uptown Station). Integrated: RFID No. = 222, sequence (Airport Bus Stop, Downtown Bus Stop, Downtown Station, Uptown Station). Timestamps shown on the slide, in order: 9:00am, 10:00am, 10:15am, 11:00am.
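The integration step on this slide can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the record layout `(rfid, minutes_since_midnight, stop)`, the function names `integrate`, `contains`, and `count_pattern`, and the pairing of each timestamp with a stop in slide order are all assumptions made for the example.

```python
from collections import defaultdict


def integrate(bus_records, metro_records):
    """Merge both companies' records into one time-ordered stop sequence per RFID number."""
    merged = defaultdict(list)
    for rfid, minutes, stop in list(bus_records) + list(metro_records):
        merged[rfid].append((minutes, stop))
    # Sorting the (minutes, stop) pairs orders each passenger's stops by time.
    return {rfid: [stop for _, stop in sorted(events)]
            for rfid, events in merged.items()}


def contains(sequence, pattern):
    """Return True if `pattern` occurs as a contiguous run inside `sequence`."""
    n = len(pattern)
    return any(sequence[i:i + n] == pattern
               for i in range(len(sequence) - n + 1))


def count_pattern(sequences, pattern):
    """Count the passengers whose integrated sequence contains the travel pattern."""
    return sum(contains(seq, pattern) for seq in sequences.values())


# Assumed timestamps: 9:00am = 540, 10:00am = 600, 10:15am = 615, 11:00am = 660.
bus_records = [(222, 540, "Airport Bus Stop"), (222, 600, "Downtown Bus Stop")]
metro_records = [(222, 615, "Downtown Station"), (222, 660, "Uptown Station")]

sequences = integrate(bus_records, metro_records)
full_trip = ["Airport Bus Stop", "Downtown Bus Stop",
             "Downtown Station", "Uptown Station"]
print(sequences[222])
print(count_pattern(sequences, full_trip))
```

Neither company can evaluate `full_trip` on its own data, because each holds only half of the sequence; that is the motivation for integration given in the talk.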

7 Bus Company B: RFID No. = 222, sequence (Airport Bus Stop, Downtown Bus Stop). Metro Company M: RFID No. = 222, sequence (Downtown Station, Uptown Station). Integrated: RFID No. = 222, sequence (Airport Bus Stop, Downtown Bus Stop, Downtown Station, Uptown Station).

8 1. Introduction. Competitive privacy arises when two datasets from two different sources are merged. In this talk, we illustrate the concept with a transportation application, give the motivation for integrating the two datasets, and explain the privacy issue that arises in this application.

9 1. Introduction. Competitive privacy arises when two datasets from two different sources are merged. In this talk, we illustrate the concept with a transportation application, give the motivation for integrating the two datasets, and explain the privacy issue that arises in this application.

10 Integrated: RFID No. = 222, sequence (Airport Bus Stop, Downtown Bus Stop, Downtown Station, Uptown Station). Data integration may cause privacy issues. Bus Company B operates service s_B (Downtown Bus Stop, Bay Bus Stop), no. of passengers = 80,000. Metro Company M operates service s_M (Downtown Station, Bay Station), no. of passengers = 10,000. These two services are competitive. If the metro company learns that the number of passengers using s_B is 80,000, it may offer discounts to passengers using its own service s_M to attract more passengers. The original service s_B operated by the bus company will then definitely be affected. This statistical information about the competitive services constitutes the competitive privacy of the bus company.

11 2. Problem. Given: two companies, the bus company and the metro company. Objective: after the datasets from these two companies are integrated, neither company can infer any statistical information about the competitive services of the other company.

12 2. Problem. Contribution: we are the first to propose the concept of competitive privacy, a privacy model for the case where sequence datasets are integrated. Previous works: privacy models for the case where relational datasets are integrated.

13 3. Algorithm

14 A trusted third party holds the integrated database of Bus Company B and Metro Company M. On receiving query 1, it determines whether the query would allow the metro company to infer any statistical information about the competitive services of the bus company. If yes, the query is rejected. If no, the answer to the query (answer 1) is returned.

15 3. Algorithm. Idea: we reject any query related to the statistical information about any competitive service. We skip the details here.
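Since the talk skips the details, the following is only a rough sketch of the stated idea, not the authors' algorithm. It assumes a query is a travel pattern (a list of stops) and that a query is "related to" a competitive service when two consecutive stops in the pattern form that service. The names `COMPETITIVE_SERVICES` and `audit` are hypothetical.

```python
# Competitive services from the talk, written as (origin, destination) legs.
COMPETITIVE_SERVICES = {
    ("Downtown Bus Stop", "Bay Bus Stop"),  # s_B, operated by the bus company
    ("Downtown Station", "Bay Station"),    # s_M, operated by the metro company
}


def audit(pattern):
    """Return "reject" if the pattern contains a competitive leg, else "answer"."""
    # Pair each stop with its successor to obtain the legs of the pattern.
    legs = set(zip(pattern, pattern[1:]))
    return "reject" if legs & COMPETITIVE_SERVICES else "answer"


print(audit(["Airport Bus Stop", "Downtown Bus Stop",
             "Downtown Station", "Uptown Station"]))
print(audit(["Downtown Bus Stop", "Bay Bus Stop"]))
```

A check of this kind looks at one query at a time at the stop level. It does not, on its own, cover inference from combined answers or from rolled-up (district-level) queries, both of which the backup slides raise.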

16 4. Conclusion. Privacy model for data integration: competitive privacy. Algorithm.

17 Q&A

18 4. Empirical Studies. Real dataset: Hong Kong local transportation metro data, with 63 stations, 6 transfer stations, and 4 railway lines.

19 4. Empirical Studies. Varied parameters: the number of tuples in the integrated dataset; the pattern size in a query. Measurements: audit time (the time to determine whether a query should be answered or rejected); ratio of rejected (restricted) queries.

20 4. Empirical Studies The audit time is small. The ratio of restricted queries is small.

21

22 A trusted third party holds the integrated database of Bus Company B and Metro Company M. Query 1 is, e.g., the total number of passengers who have the travel pattern (Airport Bus Stop, Downtown Bus Stop, Downtown Station, Uptown Station); pattern size = 4. The third party determines whether this query would allow the bus company to infer any statistical information about the competitive services of the metro company. If yes, the query is rejected. If no, the answer is returned (answer 1 = 20,000).

23 A trusted third party holds the integrated database of Bus Company B and Metro Company M. On receiving query 2, it determines whether the query would allow the bus company to infer any statistical information about the competitive services of the metro company. If yes, the query is rejected. If no, the answer to the query (answer 2) is returned.

24 A trusted third party holds the integrated database of Bus Company B and Metro Company M. On receiving query 3, it determines whether the query would allow the bus company to infer any statistical information about the competitive services of the metro company. If yes, the query is rejected. If no, the answer to the query (answer 3) is returned.

25 Each query alone may not reveal any statistical information about the competitive services. However, the combination of all query answers may allow the metro company to infer statistical information about the competitive services.

26 A trusted third party holds the integrated database of Bus Company B and Metro Company M. Query: the total number of passengers who have the travel pattern (Downtown District, Bay District). Knowledge 1: the answer to this query is 90,000. Knowledge 2: there are two services from Downtown District to Bay District: (1) the service provided by the bus company (Downtown Bus Stop to Bay Bus Stop); (2) the service provided by the metro company (Downtown Station to Bay Station). Knowledge 3: the total number of passengers who have the travel pattern Downtown Station to Bay Station = 10,000. Conclusion: the total number of passengers who have the travel pattern Downtown Bus Stop to Bay Bus Stop = 90,000 – 10,000 = 80,000. This is the statistical information of the competitive services of the bus company.
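The inference on this slide can be written as a short derivation. The symbols are notation introduced here, not the authors': N_district is the answer to the district-level query, and N_{s_B} and N_{s_M} are the passenger counts of the two services. The derivation assumes, as the slide does, that every passenger counted in the district-level answer used exactly one of the two services.

```latex
\begin{align*}
N_{\text{district}} &= N_{s_B} + N_{s_M} \\
N_{s_B} &= N_{\text{district}} - N_{s_M} = 90{,}000 - 10{,}000 = 80{,}000
\end{align*}
```

The metro company already knows N_{s_M} from its own data, so a single district-level answer is enough for it to recover N_{s_B}.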

27 Integrated data of Bus Company B and Metro Company M: RFID No. = 222, sequence (Airport Bus Stop, Downtown Bus Stop, Downtown Station, Uptown Station). Stop level: both companies want to know the total number of passengers traveling from Airport Bus Stop to Uptown Station. After roll-up to the district level: both companies want to know the total number of passengers traveling from Airport District to Uptown District.
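A roll-up of this kind can be sketched as a lookup from stops to districts. This is a minimal illustration under assumptions: the table `STOP_TO_DISTRICT` and the function `roll_up` are hypothetical, the stop-to-district assignment is inferred from the stop names, and collapsing consecutive repeats of the same district is a design choice of this sketch rather than something the slides specify.

```python
from itertools import groupby

# Assumed stop-to-district assignment, inferred from the stop names in the talk.
STOP_TO_DISTRICT = {
    "Airport Bus Stop": "Airport District",
    "Downtown Bus Stop": "Downtown District",
    "Downtown Station": "Downtown District",
    "Bay Bus Stop": "Bay District",
    "Bay Station": "Bay District",
    "Uptown Station": "Uptown District",
}


def roll_up(sequence):
    """Map each stop to its district and collapse consecutive repeats."""
    districts = [STOP_TO_DISTRICT[stop] for stop in sequence]
    # groupby yields one group per run of equal consecutive districts.
    return [district for district, _ in groupby(districts)]


trip = ["Airport Bus Stop", "Downtown Bus Stop",
        "Downtown Station", "Uptown Station"]
print(roll_up(trip))
```

After roll-up, the bus leg and the metro leg between the same pair of districts become indistinguishable, which is why a district-level count mixes the two companies' services.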

28 A trusted third party holds the integrated database of Bus Company B and Metro Company M. On receiving query 1, it determines whether the query would allow the metro company to infer any statistical information about the competitive services of the bus company. If yes, the query is rejected. If no, the answer to the query (answer 1) is returned.

29 A trusted third party holds the integrated database of Bus Company B and Metro Company M. On receiving query 2, it determines whether the query would allow the metro company to infer any statistical information about the competitive services of the bus company. If yes, the query is rejected. If no, the answer to the query (answer 2) is returned.

30 A trusted third party holds the integrated database of Bus Company B and Metro Company M. Query 3 is submitted and answer 3 is returned.