Privacy Protection for RFID Data. Benjamin C.M. Fung, Concordia Institute for Information Systems Engineering, Concordia University, Montreal, QC, Canada


Privacy Protection for RFID Data
Benjamin C.M. Fung, Concordia Institute for Information Systems Engineering, Concordia University, Montreal, QC, Canada
Ming Cao, Concordia Institute for Information Systems Engineering, Concordia University, Montreal, QC, Canada
Heng Xu, College of Information Science and Technology, Penn State University, University Park, PA
Bipin C. Desai, Department of Computer Science & Software Engineering, Concordia University, Montreal, QC, Canada

Agenda
– What is RFID?
– Privacy Threats
– Privacy Protection Model: LKC Model
– Efficient Algorithm
– Empirical Study
– Conclusion and Future Work

What is RFID?
Radio Frequency Identification (RFID) is a technology that allows a sensor (reader) to read, from a distance and without line of sight, a unique electronic product code (EPC) associated with a tag.
[Figure: Tag, Reader, Server; labels: Interrogate, EPC, (EPC, time)]

Applications of RFID
– Supply chain management: real-time inventory tracking
– Retail: active shelves monitor product availability
– Access control: toll collection, credit cards, building access
– Airline luggage management: reduce lost or misplaced luggage
– Medical: implant patients with a tag that contains their medical history
– Pet identification: implant an RFID tag with pet owner information

What is RFID – RFID Tag and Receiver
[Image source: spacingmontreal.ca]

RFID Ticketing System
According to the STM website, the metro system has transported over 6 billion passengers as of 2006, roughly equivalent to the world's population.

What is RFID – Tag and Database?
Source: KDD 08 Tutorial

RFID Data
– Raw Events: [EPC, Location, Time]
– App Events: [EPC, Location, Time_in, Time_out]
– Trajectories: [EPC: (L1,T1)(L2,T2)…(Ln,Tn)]
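The three record formats above can be related by a small conversion sketch. This is only an illustration, not the tutorial's method: the function names are hypothetical, and it assumes that consecutive readings of a tag at the same location belong to one stay and that a trajectory uses Time_in as its timestamp.

```python
from collections import defaultdict

def to_app_events(raw_events):
    """Collapse raw (EPC, Location, Time) readings into
    (EPC, Location, Time_in, Time_out) stay records.
    Assumption: consecutive readings at one location form one stay."""
    by_epc = defaultdict(list)
    for epc, loc, t in sorted(raw_events, key=lambda e: (e[0], e[2])):
        by_epc[epc].append((loc, t))
    app_events = []
    for epc, readings in by_epc.items():
        cur_loc, t_in = readings[0]
        t_out = t_in
        for loc, t in readings[1:]:
            if loc == cur_loc:
                t_out = t  # still at the same location: extend the stay
            else:
                app_events.append((epc, cur_loc, t_in, t_out))
                cur_loc, t_in, t_out = loc, t, t
        app_events.append((epc, cur_loc, t_in, t_out))
    return app_events

def to_trajectories(app_events):
    """Group stay records per tag into EPC: (L1,T1)(L2,T2)...(Ln,Tn).
    Assumption: Ti is the Time_in of the i-th stay."""
    traj = defaultdict(list)
    for epc, loc, t_in, _ in sorted(app_events, key=lambda e: (e[0], e[2])):
        traj[epc].append((loc, t_in))
    return dict(traj)

raw = [
    (1, "McGill", 7), (1, "Concordia", 8), (1, "Concordia", 9), (1, "McGill", 17),
    (2, "Atwater", 7), (2, "Concordia", 8),
]
stays = to_app_events(raw)
trajectories = to_trajectories(stays)
print(trajectories)
```

The two readings of tag 1 at Concordia (times 8 and 9) collapse into a single stay, so the trajectory keeps one (Concordia, 8) pair.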

RFID Data
Three models in typical RFID applications:
– Bulky movements: supply-chain management
– Scattered movements: E-pass tollway system
– No movements: fixed-location sensor networks
Different applications may require different data warehouse systems. Our discussion focuses on scattered movements.
Source: KDD 08 Tutorial

Object-Specific Path Table
$\{(loc_1 t_1) \rightarrow \dots \rightarrow (loc_n t_n)\} : s_1, \dots, s_p : d_1, \dots, d_m$
where $\{(loc_1 t_1) \rightarrow \dots \rightarrow (loc_n t_n)\}$ is a path, $s_1, \dots, s_p$ are sensitive attributes, and $d_1, \dots, d_m$ are quasi-identifying (QID) attributes associated with the object.

RFID Data Mining

Object-Specific Path Table
EPC | Path | Name | Diagnosis
1 | McGill 7 -> Concordia 8 -> McGill 17 | Bob | Flu
2 | Atwater 7 -> Concordia 8 -> Vendome 13 -> Cote-Vertu 18 -> St-Laurent 22 | Joe | HIV
3 | LaSalle 8 -> Concordia 9 -> Snowdon 18 -> Place-D'Armes 19 -> Longueuil 24 | Alice | Flu
4 | Cote-Vertu 7 -> Concordia 8 -> Cote-Vertu 17 | Ken | SARS
5 | Atwater 7 -> Concordia 8 -> Vendome 13 -> Cote-Vertu 18 -> Atwater 20 | Julie | HIV

Privacy Act
"Under agreement with the Québec privacy commission, any data used for analytical purpose has user identification stripped out. Access by law enforcement agencies is permitted only by court order." - Steve Munro

A Simple Attack
EPC | Path | Name | Diagnosis
1 | McGill 7 -> Concordia 8 -> McGill 17 | Bob | Flu
2 | Atwater 7 -> Concordia 8 -> Vendome 13 -> Cote-Vertu 18 -> St-Laurent 22 | Joe | HIV
3 | LaSalle 8 -> Concordia 9 -> Snowdon 18 -> Place-D'Armes 19 -> Longueuil 24 | Alice | Flu
4 | Cote-Vertu 7 -> Concordia 8 -> Cote-Vertu 17 | Ken | SARS
5 | Atwater 7 -> Concordia 8 -> Vendome 13 -> Cote-Vertu 18 -> Atwater 20 | Julie | HIV

A Simple Attack
EPC | Path | Diagnosis
1 | McGill 7 -> Concordia 8 -> McGill 17 | Flu
2 | Atwater 7 -> Concordia 8 -> Vendome 13 -> Cote-Vertu 18 -> St-Laurent 22 | HIV
3 | LaSalle 8 -> Concordia 9 -> Snowdon 18 -> Place-D'Armes 19 -> Longueuil 24 | Flu
4 | Cote-Vertu 7 -> Concordia 8 -> Cote-Vertu 17 | SARS
5 | Atwater 7 -> Concordia 8 -> Vendome 13 -> Cote-Vertu 18 -> Atwater 20 | HIV

RFID Data Privacy Threats
– Record linkage: If a path in the table is so specific that few people match it, releasing the RFID data may allow an adversary to link the victim to her record and, therefore, to her diagnosis.
– Attribute linkage: If a sensitive value occurs frequently together with some combination of (location, timestamp) pairs, the sensitive information can be inferred from that combination even though the victim's exact record cannot be identified.
Our goal: preserve data privacy while preserving data usefulness.
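Both threats can be illustrated on the five-record table from the preceding slides (names removed). The sketch below is an illustration only: the function names are hypothetical, and it assumes the adversary's knowledge is matched as a subsequence of a path (order preserved, gaps allowed).

```python
# The published table: (path, diagnosis) per record, in EPC order 1..5.
TABLE = [
    ([("McGill", 7), ("Concordia", 8), ("McGill", 17)], "Flu"),
    ([("Atwater", 7), ("Concordia", 8), ("Vendome", 13),
      ("Cote-Vertu", 18), ("St-Laurent", 22)], "HIV"),
    ([("LaSalle", 8), ("Concordia", 9), ("Snowdon", 18),
      ("Place-D'Armes", 19), ("Longueuil", 24)], "Flu"),
    ([("Cote-Vertu", 7), ("Concordia", 8), ("Cote-Vertu", 17)], "SARS"),
    ([("Atwater", 7), ("Concordia", 8), ("Vendome", 13),
      ("Cote-Vertu", 18), ("Atwater", 20)], "HIV"),
]

def contains(path, q):
    """True if q is a subsequence of path (assumed matching rule)."""
    it = iter(path)
    return all(pair in it for pair in q)

def matching_records(table, q):
    """EPC numbers (1-based) of the records whose path contains q."""
    return [i + 1 for i, (path, _) in enumerate(table) if contains(path, q)]

def confidence(table, q, s):
    """Fraction of the records matching q that carry sensitive value s."""
    group = [diag for path, diag in table if contains(path, q)]
    return group.count(s) / len(group) if group else 0.0

# Record linkage: one known pair singles out a record.
print(matching_records(TABLE, [("LaSalle", 8)]))                   # [3]

# Attribute linkage: two records match, but both carry the same diagnosis.
q = [("Atwater", 7), ("Vendome", 13)]
print(matching_records(TABLE, q), confidence(TABLE, q, "HIV"))     # [2, 5] 1.0
```

In the second query the adversary cannot tell records 2 and 5 apart, yet still learns the diagnosis with full confidence.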

Problem of Traditional K-Anonymity in High-Dimensional, Sparse Data
– Increasing the number of attributes increases the information loss (e.g., 50 × 12 = 600 dimensions), which leads to a high distortion rate.
– Assume the attacker's prior knowledge is bounded by at most L pairs of location and timestamp.
– Ensure that every possible subsequence q with maximum length L of any path in an RFID data table is shared by at least K records, and that the confidence of inferring any sensitive value is not more than C.

LK-Anonymity
An object-specific path table T satisfies LK-anonymity if and only if |G(q)| ≥ K for any subsequence q with |q| ≤ L of any path in T, where K is a positive anonymity threshold. Here q models the adversary's prior knowledge, and G(q) is the group of records in T that this knowledge could identify, i.e., the records whose paths contain q.

LC-Dilution
Let S be a set of data holder-specified sensitive values from sensitive attributes S_1,…,S_m. An object-specific path table T satisfies LC-dilution if and only if Conf(s|G(q)) ≤ C for any s ∈ S and for any subsequence q with |q| ≤ L of any path in T, where 0 ≤ C ≤ 1 is a confidence threshold. Conf(s|G(q)) is the percentage of the records in G(q) containing s.

LKC-Privacy
An object-specific path table T satisfies LKC-privacy if T satisfies both LK-anonymity and LC-dilution.
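The definition can be checked by brute force on a small table. The sketch below is a hedged illustration, not the efficient algorithm of the talk: the function names and the toy table are hypothetical, subsequences are matched with gaps allowed, and every subsequence of length at most L is enumerated directly.

```python
from itertools import combinations

def contains(path, q):
    """True if q is a subsequence of path (assumed matching rule)."""
    it = iter(path)
    return all(pair in it for pair in q)

def satisfies_lkc(table, L, K, C, sensitive):
    """Brute-force test of LKC-privacy: for every subsequence q (|q| <= L)
    of every path, require |G(q)| >= K and Conf(s|G(q)) <= C for s in S."""
    seen = set()
    for path, _ in table:
        for size in range(1, min(L, len(path)) + 1):
            for q in combinations(path, size):  # order-preserving subsequences
                if q in seen:
                    continue
                seen.add(q)
                group = [s for p, s in table if contains(p, q)]  # G(q)
                if len(group) < K:                               # LK-anonymity fails
                    return False
                if any(group.count(s) / len(group) > C for s in sensitive):
                    return False                                 # LC-dilution fails
    return True

# Hypothetical toy table: (path, sensitive value).
toy = [
    ([("a", 1), ("b", 2)], "HIV"),
    ([("a", 1), ("b", 2)], "Flu"),
    ([("a", 1), ("b", 2), ("c", 3)], "Flu"),
    ([("a", 1), ("c", 3)], "Flu"),
]
print(satisfies_lkc(toy, L=1, K=2, C=0.5, sensitive={"HIV"}))
print(satisfies_lkc(toy, L=2, K=2, C=0.5, sensitive={"HIV"}))
```

With L = 1 every single pair is shared by at least two records; with L = 2 the subsequence (b, 2) -> (c, 3) matches only one record, so the requirement fails.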

Problem Definition
We can transform an object-specific path table T to satisfy LKC-privacy by performing a sequence of suppressions on selected pairs from T. In this paper, we employ global suppression, meaning that if a pair p is chosen to be suppressed, all instances of p in T are suppressed.
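Global suppression is simple to state in code. The following is a minimal sketch under the assumption that a table is a list of (path, sensitive value) records and a path is a list of (location, timestamp) pairs; the function name is hypothetical.

```python
def suppress_globally(table, pairs):
    """Remove every instance of each chosen pair from every path in the table."""
    banned = set(pairs)
    return [([p for p in path if p not in banned], s) for path, s in table]

table = [
    ([("Atwater", 7), ("Concordia", 8), ("Vendome", 13)], "HIV"),
    ([("McGill", 7), ("Concordia", 8), ("McGill", 17)], "Flu"),
]
anonymous = suppress_globally(table, [("Concordia", 8)])
print(anonymous)
```

Because the suppression is global, (Concordia, 8) disappears from both records, not only from the one that triggered the decision.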

Algorithm
– Phase 1: Identifying critical violations
– Phase 2: Removing critical violations

Phase 1 – Violation
Let q be a subsequence of a path in T with |q| ≤ L and |G(q)| > 0. q is a violation with respect to an LKC-privacy requirement if |G(q)| < K or Conf(s|G(q)) > C for some sensitive value s ∈ S.

Phase 1 – Critical Violation
A violation q is a critical violation if every proper subsequence of q is a non-violation.
Observation: A table T' satisfies LKC-privacy if and only if T' contains no critical violation, because each violation is a supersequence of a critical violation. Thus, if T' contains no critical violations, then T' contains no violations.

Phase 1 – Efficient Search and Apriori Algorithm
We propose an algorithm to efficiently identify all critical violations in T with respect to an LKC-privacy requirement. We generate all critical violations of size i+1, denoted by V_{i+1}, by incrementally extending non-violations of size i, denoted by U_i, with an additional pair.
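The level-wise idea can be sketched as follows. This is a simplified illustration, not the authors' implementation: the names and the toy table are hypothetical, candidates of size i+1 are drawn from the paths themselves and kept only if all of their size-i subsequences are in U_i, and subsequences are matched with gaps allowed.

```python
from itertools import combinations

def contains(path, q):
    """True if q is a subsequence of path (assumed matching rule)."""
    it = iter(path)
    return all(pair in it for pair in q)

def is_violation(table, q, K, C, sensitive):
    """q violates LKC-privacy if |G(q)| < K or Conf(s|G(q)) > C for some s."""
    group = [s for path, s in table if contains(path, q)]
    if len(group) < K:
        return True
    return any(group.count(s) / len(group) > C for s in sensitive)

def critical_violations(table, L, K, C, sensitive):
    """Level-wise search: V_{i+1} is found among extensions of U_i."""
    critical = set()
    non_violations = {()}  # U_0: the empty sequence
    for size in range(1, L + 1):
        candidates = set()
        for path, _ in table:
            for q in combinations(path, size):
                # keep q only if every size-(i) subsequence is a known non-violation
                if all(sub in non_violations for sub in combinations(q, size - 1)):
                    candidates.add(q)
        next_level = set()
        for q in candidates:
            if is_violation(table, q, K, C, sensitive):
                critical.add(q)      # goes to V_size
            else:
                next_level.add(q)    # goes to U_size
        non_violations = next_level
        if not non_violations:
            break
    return critical

# Hypothetical toy table: (path, sensitive value).
toy = [
    ([("a", 1), ("b", 2), ("c", 3)], "HIV"),
    ([("a", 1), ("b", 2)], "Flu"),
    ([("a", 1), ("c", 3)], "Flu"),
    ([("b", 2), ("c", 3)], "Flu"),
    ([("d", 4), ("a", 1)], "Flu"),
]
print(critical_violations(toy, L=2, K=2, C=0.5, sensitive={"HIV"}))
```

In this toy table the pair (d, 4) occurs in a single record, so it is a critical violation of size 1, and no longer sequence containing it is ever generated as a candidate.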

Phase 1 – Identifying Violations

Phase 2 – Removing Critical Violations
We now have the set of critical violations. A naïve approach is to remove all of them.

Phase 2 – Critical Violation Tree (Example)

Phase 2 – Score Function

Greedy Algorithm: RFID Data Anonymizer
Input: Raw RFID path table T
Input: Thresholds L, K, C
Input: Sensitive values S
Output: Anonymous T' that satisfies LKC-privacy
V = Call Gen Violations(T, L, K, C, S) in Algorithm 1;
build the Critical Violation Tree (CVT) with Score Table;
while Score Table is not empty do
    select winner pair w that has the highest Score;
    delete all critical violations containing w in CVT;
    update Score of a candidate;
    remove w in Score Table;
    add w to Sup
end while
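The greedy loop can be sketched in a few lines. The Score function itself is not reproduced in this transcript, so the sketch assumes a stand-in: the number of remaining critical violations containing a pair, divided by the number of instances of that pair in T. That ratio, the flat list used instead of a Critical Violation Tree, and all names are assumptions made for illustration.

```python
def greedy_suppress(critical_violations, instance_count):
    """Pick pairs to suppress until no critical violation remains.
    critical_violations: iterable of tuples of (location, timestamp) pairs.
    instance_count: dict mapping each pair to its number of instances in T.
    Score (assumed): violations eliminated / instances lost."""
    remaining = [set(v) for v in critical_violations]
    suppressed = []  # plays the role of Sup in the pseudocode
    while remaining:
        candidates = {p for v in remaining for p in v}

        def score(p):
            return sum(p in v for v in remaining) / instance_count[p]

        # sorted() makes tie-breaking deterministic; max keeps the first maximum
        winner = max(sorted(candidates), key=score)
        suppressed.append(winner)
        # delete all critical violations containing the winner
        remaining = [v for v in remaining if winner not in v]
    return suppressed

violations = [
    (("d", 4),),
    (("a", 1), ("b", 2)),
    (("a", 1), ("c", 3)),
    (("b", 2), ("c", 3)),
]
counts = {("a", 1): 4, ("b", 2): 3, ("c", 3): 3, ("d", 4): 1}
print(greedy_suppress(violations, counts))
```

Recomputing the scores from the remaining violations in every round corresponds to the "update Score" step; a tree with a score table would avoid rescanning, which this sketch does not attempt.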

Empirical Study – Implementation Environment
All experiments were conducted on a PC with an Intel Core2 Quad 2.4GHz processor and 2GB of RAM. The employed data set is a simulation of the travel routes of 20,000 passengers.

Empirical Study – Distortion Analysis

Empirical Study – Score Function

Empirical Study – Efficiency and Scalability

Powerful LKC Model with Other Data

Conclusion
– We illustrate the privacy threats caused by publishing RFID data.
– We formally define a privacy model, called LKC-privacy, for high-dimensional, sparse RFID data.
– We propose an efficient anonymization algorithm to transform an RFID data set to satisfy a given LKC-privacy requirement.

Paper
Our paper titled "Privacy Protection for RFID Data" has been accepted at ACM SAC.
B. C. M. Fung, M. Cao, B. C. Desai, and H. Xu. Privacy protection for RFID data. In Proceedings of the 24th ACM SIGAPP Symposium on Applied Computing (SAC 2009) Special Track on Database Theory, Technology, and Applications (DTTA), Honolulu, HI: ACM Press, March

Future Work
– Implement different anonymization methods: generalization or permutation
– New attack scenario with QID
– Enhanced Score function

Acknowledgement
The research is supported in part by the Discovery Grants ( ) from the Natural Sciences and Engineering Research Council of Canada (NSERC).

Reference:
– KDD 08 Tutorial: Mining Massive RFID, Trajectory, and Traffic Data Sets. Jiawei Han, Jae-Gil Lee, Hector Gonzalez, Xiaolei Li. ACM SIGKDD'08 Conference Tutorial, Las Vegas, NV
– Office of the Privacy Commissioner