1 Agenda 1. What is (Web) data mining? And what does it have to do with privacy? – a simple view – 2. Examples of data mining and "privacy-preserving data.

Presentation transcript:

1 Agenda
1. What is (Web) data mining? And what does it have to do with privacy? – a simple view –
2. Examples of data mining and "privacy-preserving data mining":
   - Association-rule mining (& privacy-preserving AR mining)
   - Collaborative filtering (& privacy-preserving collaborative filtering)
3. A second look at ... privacy
4. A second look at ... Web / data mining
5. The goal: More than modelling and hiding – Towards a comprehensive view of Web mining and privacy. Threats, opportunities and solution approaches.
6. An outlook: Data mining for privacy

2 Privacy Problems: Example 1
Technical background of the problem: The dataset allows for Web mining (e.g., which search queries lead to which site choices), but it violates k-anonymity (e.g., "Lilburn" → a likely k = number of inhabitants of Lilburn).
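The k-anonymity notion used on this slide can be made concrete with a small sketch. It assumes a table with one row per person and a chosen set of quasi-identifier attributes; the function name, attribute names and toy rows are hypothetical and not taken from the dataset discussed in the talk. The dataset's k is the size of the smallest group of people who share the same quasi-identifier values.

```python
from collections import Counter


def k_anonymity(records, quasi_identifiers):
    """Return k: the size of the smallest group of records that agree
    on every quasi-identifier attribute."""
    groups = Counter(
        tuple(record[attr] for attr in quasi_identifiers) for record in records
    )
    return min(groups.values())


# Hypothetical toy population, one row per person (illustrative values only).
people = [
    {"town": "Lilburn", "age_band": "60-69"},
    {"town": "Atlanta", "age_band": "30-39"},
    {"town": "Atlanta", "age_band": "30-39"},
    {"town": "Atlanta", "age_band": "30-39"},
]

# One person is alone in their town, so the table is only 1-anonymous.
k_by_town = k_anonymity(people, ["town"])
```

Every further attribute that can be read off a person's queries adds a quasi-identifier and can only shrink the groups, which is why a town name alone already bounds k by the town's population.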

3 Privacy Problems: Example 2
Where do people live who will buy the Koran soon?
Technical background of the problem: A mashup of different data sources
- Amazon wishlists
- Yahoo! People (addresses)
- Google Maps
each with insufficient k-anonymity, allows for attribute matching and thereby inferences.
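Attribute matching across sources amounts to a join on shared attributes. The following is a minimal sketch with two hypothetical tables standing in for a wishlist source and an address directory; the records, field names and the function `match_on_attributes` are invented for illustration and do not reflect the real services' data formats.

```python
def match_on_attributes(left, right, keys):
    """Join two lists of dict records on the attributes named in keys."""
    index = {}
    for row in right:
        index.setdefault(tuple(row[k] for k in keys), []).append(row)
    joined = []
    for row in left:
        for other in index.get(tuple(row[k] for k in keys), []):
            # Merge the two views of what is presumably the same person.
            joined.append({**other, **row})
    return joined


# Hypothetical records; neither table is sensitive on its own.
wishlists = [
    {"name": "A. Example", "city": "Springfield", "wishlist_item": "Book X"},
    {"name": "B. Sample", "city": "Shelbyville", "wishlist_item": "Book Y"},
]
directory = [
    {"name": "A. Example", "city": "Springfield", "street": "1 Main St"},
    {"name": "C. Other", "city": "Springfield", "street": "2 Oak Ave"},
]

linked = match_on_attributes(wishlists, directory, ["name", "city"])
```

The single linked record now carries both a reading interest and a street address, which is the kind of inference the slide warns about; plotting the address on a map would be one more join of the same kind.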

4 Privacy Problems: Example 3
Predicting political affiliation from Facebook profile and link data (1): Most Conservative Traits
Trait Name | Trait Value | Weight Conservative
Group | george w bush is my homeboy |
Group | college republicans |
Group | texas conservatives |
Group | bears for bush |
Group | kerry is a fairy |
Group | aggie republicans |
Group | keep facebook clean |
Group | i voted for bush |
Group | protect marriage one man one woman |
Lindamood et al. 09 & Heatherly et al. 09

5 Predicting political affiliation from Facebook profile and link data (2): Most Liberal Traits per Trait Name
Trait Name | Trait Value | Weight Liberal
activities | amnesty international |
Employer | hot topic |
favorite tv shows | queer as folk |
grad school | computer science |
hometown | mumbai |
Relationship Status | in an open relationship |
religious views | agnostic |
looking for | whatever i can get |
Lindamood et al. 09 & Heatherly et al. 09

6 "Privacy-preserving Web mining" example: find patterns, unlink personal data
Volvo S40 website targets people in 20s
- Are visitors in their 20s or 40s?
- Which demographic groups like/dislike the website?
- An example of the "Randomization Approach" to PPDM: R. Agrawal and R. Srikant, "Privacy Preserving Data Mining", SIGMOD 2000.

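7 Randomization Approach Overview
[Diagram] Records before the Randomizer: 50 | 40K | ..., 30 | 70K | ..., ...
→ Randomizer →
Records after the Randomizer: 65 | 20K | ..., 25 | 60K | ..., ...
→ Reconstruct distribution of Age; Reconstruct distribution of Salary
→ Data Mining Algorithms → Model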
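The pipeline on this slide can be sketched in a few lines: each value is perturbed with noise from a known distribution before disclosure, and only the distribution of the original values is estimated afterwards, never the individual values. The sketch below assumes Gaussian noise and a fixed set of bin centers, and uses an iterative Bayesian update in the spirit of the Agrawal and Srikant paper; it is a simplified illustration, not the paper's exact procedure, and all function names and parameters are hypothetical.

```python
import math
import random


def randomize(values, sigma, rng):
    """Add independent Gaussian noise to each value before it is disclosed."""
    return [v + rng.gauss(0.0, sigma) for v in values]


def reconstruct_distribution(randomized, bin_centers, sigma, iterations=50):
    """Estimate the share of original values per bin from the noisy values.

    Starts from a uniform guess and repeatedly reweights each bin by how
    well it explains every randomized value (iterative Bayesian update).
    """
    # likelihood[i][j] is proportional to P(noisy value i | original in bin j).
    likelihood = [
        [math.exp(-((w - a) ** 2) / (2 * sigma ** 2)) for a in bin_centers]
        for w in randomized
    ]
    estimate = [1.0 / len(bin_centers)] * len(bin_centers)
    for _ in range(iterations):
        updated = [0.0] * len(bin_centers)
        for row in likelihood:
            posterior = [l * p for l, p in zip(row, estimate)]
            total = sum(posterior)
            for j, value in enumerate(posterior):
                updated[j] += value / total
        estimate = [u / len(randomized) for u in updated]
    return estimate


# Hypothetical visitor ages: three quarters in their 20s, one quarter in their 40s.
rng = random.Random(0)
ages = [rng.uniform(20, 30) for _ in range(1500)]
ages += [rng.uniform(40, 50) for _ in range(500)]

noisy_ages = randomize(ages, sigma=10.0, rng=rng)
centers = [15, 25, 35, 45, 55, 65]
shares = reconstruct_distribution(noisy_ages, centers, sigma=10.0)
```

A single noisy age says little about the person it came from, yet `shares` still puts most of its mass on the younger bins, which is enough to answer the slide's question about whether visitors are in their 20s or 40s.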

8 Seems to work well!

9 What is collaborative filtering? "People like what people like them like" – regardless of support and confidence

10 User-based Collaborative Filtering
- Idea: People who agreed in the past are likely to agree again
- To predict a user’s opinion for an item, use the opinion of similar users
- Similarity between users is decided by looking at their overlap in opinions for other items
- Next step: build a model of user types → "global model" rather than "local patterns" as mining result
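The prediction step described on this slide can be sketched as a similarity-weighted average. The version below assumes ratings are stored as a dict of dicts and uses cosine similarity over co-rated items; that is one common choice among several (Pearson correlation is another), and the users, items and function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Similarity of two users, computed over the items both have rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = math.sqrt(sum(a[i] ** 2 for i in shared))
    norm_b = math.sqrt(sum(b[i] ** 2 for i in shared))
    return dot / (norm_a * norm_b)


def predict(ratings, user, item):
    """Predict a rating as the similarity-weighted mean of other users' ratings."""
    weighted_sum = 0.0
    weight_total = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        similarity = cosine_similarity(ratings[user], other_ratings)
        weighted_sum += similarity * other_ratings[item]
        weight_total += similarity
    # No usable neighbour means no prediction.
    return weighted_sum / weight_total if weight_total else None


# Hypothetical ratings on a 1-5 scale.
ratings = {
    "ann": {"A": 5, "B": 1},
    "bob": {"A": 5, "B": 1, "C": 5},
    "cat": {"A": 1, "B": 5, "C": 1},
}

prediction = predict(ratings, "ann", "C")
```

Here `ann` agrees with `bob` on every shared item and mostly disagrees with `cat`, so the prediction for item C (roughly 3.9) lands much closer to bob's rating of 5 than to cat's rating of 1.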

11 1. Privacy as confidentiality: "the right to be let alone" – and to hide data
[Figure label: "Data"]
Is this all there is to privacy?

12 2. Privacy as control: informational self-determination
[Figure labels: "Data", "Don't do THIS!"]
- e.g. data privacy: "the right of the individual to decide what information about himself should be communicated to others and under what circumstances" (Westin, 1970)
- behind much of data-protection legislation (see Eleni Kosta's talk)

13 Discussion item: What is this an example of? Tracing anonymous edits in Wikipedia

14 [Method: Attribute matching]

15 Results (an example)