Hummingbird: Privacy at the time of Twitter

Slides:



Advertisements
Similar presentations
Chapter 10 Real world security protocols
Advertisements

Off-the-Record Communication, or, Why Not To Use PGP
Expressive Privacy Control with Pseudonyms Seungyeop Han, Vincent Liu, Qifan Pu, Simon Peter, Thomas Anderson, Arvind Krishnamurthy, David Wetherall University.
P3: Toward Privacy-Preserving Photo Sharing Moo-Ryong Ra, Ramesh Govindan, and Antonio Ortega Networked Systems Laboratory & Signal and Image Processing.
SPORC: Group Collaboration using Untrusted Cloud Resources Ariel J. Feldman, William P. Zeller, Michael J. Freedman, Edward W. Felten Published in OSDI’2010.
ITIS 6200/ Secure multiparty computation – Alice has x, Bob has y, we want to calculate f(x, y) without disclosing the values – We can only do.
Non-tracking Web Analytics Istemi Ekin Akkus 1, Ruichuan Chen 1, Michaela Hardt 2, Paul Francis 1, Johannes Gehrke 3 1 Max Planck Institute for Software.
WEP 1 WEP WEP 2 WEP  WEP == Wired Equivalent Privacy  The stated goal of WEP is to make wireless LAN as secure as a wired LAN  According to Tanenbaum:
Offline Untrusted Storage with Immediate Detection of Forking and Replay Attacks Marten van Dijk, Jonathan Rhodes, Luis Sarmenta Srini Devadas MIT Computer.
Efficient Private Techniques for Verifying Social Proximity Michael J. Freedman and Antonio Nicolosi Discussion by: A. Ziad Hatahet.
Polymorphic blending attacks Prahlad Fogla et al USENIX 2006 Presented By Himanshu Pagey.
 Guarantee that EK is safe  Yes because it is stored in and used by hw only  No because it can be obtained if someone has physical access but this can.
CSCE 715 Ankur Jain 11/16/2010. Introduction Design Goals Framework SDT Protocol Achievements of Goals Overhead of SDT Conclusion.
APPLAUS: A Privacy-Preserving Location Proof Updating System for Location-based Services Zhichao Zhu and Guohong Cao Department of Computer Science and.
Privacy Preserving Data Mining Yehuda Lindell & Benny Pinkas.
PRIAM: PRivate Information Access Management on Outsourced Storage Service Providers Mark Shaneck Karthikeyan Mahadevan Jeff Yongdae Kim.
Towards Online Spam Filtering in Social Networks Hongyu Gao, Yan Chen, Kathy Lee, Diana Palsetia and Alok Choudhary Lab for Internet and Security Technology.
1 Authors: Anirudh Ramachandran, Nick Feamster, and Santosh Vempala Publication: ACM Conference on Computer and Communications Security 2007 Presenter:
WARNINGBIRD: A Near Real-time Detection System for Suspicious URLs in Twitter Stream.
Cryptanalysis of Two Dynamic ID-based Authentication
Overview of Privacy Preserving Techniques.  This is a high-level summary of the state-of-the-art privacy preserving techniques and research areas  Focus.
Social Networking with Frientegrity: Privacy and Integrity with an Untrusted Provider Prateek Basavaraj April 9 th 2014.
Privacy & Security Online Ivy, Kris & Neil Privacy Threat - Ivy Is Big Brother Watching You? - Kris Identity Theft - Kris Medical Privacy - Neil Children’s.
Web Security : Secure Socket Layer Secure Electronic Transaction.
Security protocols  Authentication protocols (this lecture)  Electronic voting protocols  Fair exchange protocols  Digital cash protocols.
Security protocols and their verification Mark Ryan University of Birmingham Midlands Graduate School University of Birmingham April 2005 Steve Kremer.
Presented by: Sanketh Beerabbi University of Central Florida.
By Gianluca Stringhini, Christopher Kruegel and Giovanni Vigna Presented By Awrad Mohammed Ali 1.
P2: Privacy-Preserving Communication and Precise Reward Architecture for V2G Networks in Smart Grid P2: Privacy-Preserving Communication and Precise Reward.
Protocols for public-key management. Key management –two problems Distribution of public keys (for public- key cryptography) Distribution of secret keys.
Twitter Games: How Successful Spammers Pick Targets Vasumathi Sridharan, Vaibhav Shankar, Minaxi Gupta School of Informatics and Computing, Indiana University.
Leveraging Delivery for Spam Mitigation.
Predicting the Location and Time of Mobile Phone Users by Using Sequential Pattern Mining Techniques Mert Özer, Ilkcan Keles, Ismail Hakki Toroslu, Pinar.
Identifying “Best Bet” Web Search Results by Mining Past User Behavior Author: Eugene Agichtein, Zijian Zheng (Microsoft Research) Source: KDD2006 Reporter:
Spamming Botnets: Signatures and Characteristics Yinglian Xie, Fang Yu, Kannan Achan, Rina Panigrahy, Microsoft Research, Silicon Valley Geoff Hulten,
Key Generation Protocol in IBC Author : Dhruti Sharma and Devesh Jinwala 論文報告 2015/12/24 董晏彰 1.
CPSC FALL 2015TEAM P6 Real-time Detection System for Suspicious URLs Submitted by T.ANUPCHANDRA V.KRANTHI SUDHA CH.KRISHNAPRASAD Under Guidance.
BY S.S.SUDHEER VARMA (13NT1D5816)
Snapshots, checkpoints, rollback, and restart
Threat Modeling for Cloud Computing
Searchable Encryption in Cloud
Data Management on Opportunistic Grids
Improving searches through community clustering of information
USAGE OF CRYPTOGRAPHY IN NETWORK SECURITY
Cryptography and Network Security
PAYMENT GATEWAY Presented by SHUJA ASHRAF SHAH ENROLL: 4471
What is network security?
Practical Censorship Evasion Leveraging Content Delivery Networks
Outline Introduction Characteristics of intrusion detection systems
Using Social Media to Build and Support Partnerships
Binder Attack Surface in Android
Cryptography.
Are these Ads Safe: Detecting Hidden A4acks through Mobile App-Web Interfaces Vaibhav Rastogi, Rui Shao, Yan Chen, Xiang Pan, Shihong Zou, and Ryan Riley.
Privacy Preserving Record Linkage
563.10: Bloom Cookies Web Search Personalization without User Tracking
Lab for Internet and Security Technology Yan Chen
IT Security awareness Training.
Cryptography and Network Security
0x1A Great Papers in Computer Security
Pooja programmer,cse department
Network Security – Kerberos
Smart Portal To Protect Child Online
KERBEROS.
Public-Key, Digital Signatures, Management, Security
Privacy preserving cloud computing
Slalom: Fast, Verifiable and Private Execution of Neural Networks in Trusted Hardware Kriti shreshtha.
Outline The spoofing problem Approaches to handle spoofing
Cryptography and Network Security
Report from the trenches of an HTML5 game provider
Exploiting Unintended Feature Leakage in Collaborative Learning
Presentation transcript:

Hummingbird: Privacy at the time of Twitter Emiliano De Cristofaro Claudio Soriente Gene Tsudik Andrew Williams Presented by Hongyu Gao, Northwestern University

Motivation Recall the three types of OSN privacy breaches? Facts: Breach from the service provider Breach from other user Breach from 3rd party apps Facts: Now Twitter boasts over 100 million subscribers Users have very little control over privacy Second, we target at designing a system that can be deployed and protect OSNs in real-time. …. In addition, we want it to be tolerant for incomplete training data, since we can never collect all spam campaigns to train the system. At last, the system doesn’t need frequent re-training, which lowers the maintenance cost after deployment. There is an abundance of work in OSNs spam study. Ourselves did a offline spam analysis more than 1 year ago. It received wide media coverage. Grier et.al. did a similar study and reached similar conclusion. Thomas et.al deveoped the Monarch system that classifies URLs in real-time to identify spam messages. Our approach is complementary to theirs in the sense that we mainly use information in the messages and the senders, rather than the landing pages. … , featured in Wall Street Journal, MIT Technology Review, and ACM Tech News

Motivation, cont’d Motivating examples Looking for tweets with #TeaParty might expose one’s political views. Search for #HIVcure might reveal one’s medical condition. What could happen if data is stored in the server in clear text? Mining by service provider Hacker break-in Insider attack Second, we target at designing a system that can be deployed and protect OSNs in real-time. …. In addition, we want it to be tolerant for incomplete training data, since we can never collect all spam campaigns to train the system. At last, the system doesn’t need frequent re-training, which lowers the maintenance cost after deployment. There is an abundance of work in OSNs spam study. Ourselves did a offline spam analysis more than 1 year ago. It received wide media coverage. Grier et.al. did a similar study and reached similar conclusion. Thomas et.al deveoped the Monarch system that classifies URLs in real-time to identify spam messages. Our approach is complementary to theirs in the sense that we mainly use information in the messages and the senders, rather than the landing pages. … , featured in Wall Street Journal, MIT Technology Review, and ACM Tech News

Related Works A Lot different! Quite different xBook, protects user privacy from 3rd party apps Hummingbird, protects user privacy from the server Quite different De-centralized OSNs [7,15] Hummingbird preserves the central server to guarantee availability Second, we target at designing a system that can be deployed and protect OSNs in real-time. …. In addition, we want it to be tolerant for incomplete training data, since we can never collect all spam campaigns to train the system. At last, the system doesn’t need frequent re-training, which lowers the maintenance cost after deployment. There is an abundance of work in OSNs spam study. Ourselves did a offline spam analysis more than 1 year ago. It received wide media coverage. Grier et.al. did a similar study and reached similar conclusion. Thomas et.al deveoped the Monarch system that classifies URLs in real-time to identify spam messages. Our approach is complementary to theirs in the sense that we mainly use information in the messages and the senders, rather than the landing pages. … , featured in Wall Street Journal, MIT Technology Review, and ACM Tech News

Related Works, cont’d Other designs that keep the central server #h00t [6] encrypts/decrypts tweets and the contained hashtag with secret shared within a user group. Facecloak [38] provides fake info to the server. VPSN [17] also provides fake into to the server. … Second, we target at designing a system that can be deployed and protect OSNs in real-time. …. In addition, we want it to be tolerant for incomplete training data, since we can never collect all spam campaigns to train the system. At last, the system doesn’t need frequent re-training, which lowers the maintenance cost after deployment. There is an abundance of work in OSNs spam study. Ourselves did a offline spam analysis more than 1 year ago. It received wide media coverage. Grier et.al. did a similar study and reached similar conclusion. Thomas et.al deveoped the Monarch system that classifies URLs in real-time to identify spam messages. Our approach is complementary to theirs in the sense that we mainly use information in the messages and the senders, rather than the landing pages. … , featured in Wall Street Journal, MIT Technology Review, and ACM Tech News

Privacy Goals Server: learns minimal information beyond that obtained from performing the matching function. Tweeter: learns who subscribes to its hashtags but not which hashtags have been subscribed to. Follower: learns nothing beyond its own subscriptions. It learns no information about other subscribers or any tweets that do not match its subscriptions. Second, we target at designing a system that can be deployed and protect OSNs in real-time. …. In addition, we want it to be tolerant for incomplete training data, since we can never collect all spam campaigns to train the system. At last, the system doesn’t need frequent re-training, which lowers the maintenance cost after deployment. There is an abundance of work in OSNs spam study. Ourselves did a offline spam analysis more than 1 year ago. It received wide media coverage. Grier et.al. did a similar study and reached similar conclusion. Thomas et.al deveoped the Monarch system that classifies URLs in real-time to identify spam messages. Our approach is complementary to theirs in the sense that we mainly use information in the messages and the senders, rather than the landing pages. … , featured in Wall Street Journal, MIT Technology Review, and ACM Tech News

Privacy Non-Goals Server learns who follows whom. Server learns whenever multiple tweets from a given tweeter contain the same hashtag. Server learns whenever multiple followers are subscribed to the same hashtag of a given tweeter. Second, we target at designing a system that can be deployed and protect OSNs in real-time. …. In addition, we want it to be tolerant for incomplete training data, since we can never collect all spam campaigns to train the system. At last, the system doesn’t need frequent re-training, which lowers the maintenance cost after deployment. There is an abundance of work in OSNs spam study. Ourselves did a offline spam analysis more than 1 year ago. It received wide media coverage. Grier et.al. did a similar study and reached similar conclusion. Thomas et.al deveoped the Monarch system that classifies URLs in real-time to identify spam messages. Our approach is complementary to theirs in the sense that we mainly use information in the messages and the senders, rather than the landing pages. … , featured in Wall Street Journal, MIT Technology Review, and ACM Tech News

Roadmap Key System Design System Prototype Performance Overhead Discussions and Conclusions Next in this talk, I will first give our detailed system design, followed by the experimental evaluation of the system. After that, I’d like to share some potential future work, and finally conclude this talk. 8 8

Hummingbird Protocol After deployment, the system sees a stream of incoming messages. There is one new message, and a set of messages that have been observed in the past. The past messages have already been organized into clusters. The system performs incremental clustering, to update the clustering result given the new message.

Crucial Background: OPRF Name: Oblivious PseudoRandom Functions Effect: Securely compute fs(x) Input: s from sender and x from receiver Guarantee: Sender learns nothing about x Receiver only learns the value of fs(x) Incremental clustering requires that the clustering result of (k+1) messages can be obtained by updating the result of the first k messages with the k+1th message. There’s no need to repeat the whole clustering process.

Key Design Bob encrypts a message and Alice decrypts it Bob and Alice share the secret fs(ht) fs(ht) is a cryptographic primitive that prevents Bob from learning ht (OPRF technique) The server forwards Bob’s message to Alice Both Bob and Alice submit a cryptographic token, H2(fs(ht)), to the server Incremental clustering requires that the clustering result of (k+1) messages can be obtained by updating the result of the first k messages with the k+1th message. There’s no need to repeat the whole clustering process.

Privacy Goals, re-visit Server: learns minimal information beyond that obtained from performing the matching function. Tweeter: learns who subscribes to its hashtags but not which hashtags have been subscribed to. Follower: learns nothing beyond its own subscriptions. It learns no information about other subscribers or any tweets that do not match its subscriptions. Second, we target at designing a system that can be deployed and protect OSNs in real-time. …. In addition, we want it to be tolerant for incomplete training data, since we can never collect all spam campaigns to train the system. At last, the system doesn’t need frequent re-training, which lowers the maintenance cost after deployment. There is an abundance of work in OSNs spam study. Ourselves did a offline spam analysis more than 1 year ago. It received wide media coverage. Grier et.al. did a similar study and reached similar conclusion. Thomas et.al deveoped the Monarch system that classifies URLs in real-time to identify spam messages. Our approach is complementary to theirs in the sense that we mainly use information in the messages and the senders, rather than the landing pages. … , featured in Wall Street Journal, MIT Technology Review, and ACM Tech News

System Prototype Another key component of our system is the trained classifier, which requires a careful selection of features. It ensures that it is relatively hard for spammers to manipulate the campaigns to evade detection. The reason is that some campaigns may be absent in the training set. In addition, after the training phase, new campaigns will emerge. Given these two problems, the features that are specific to any campaigns would have limited effectiveness.

Performance Overhead <1ms Also negligible comparing to web transactions Comparable to current Twitter 14 14

Discussions The collusion disaster: By colluding with Alice, for all hashtags that Alice follows, the server can learn H2(fs(ht)), thus learn the identify of all other followers on ht. By colluding with Bob, the server can learn the interest (the subscribed hashtag) of all Bob’s followers. Incremental clustering requires that the clustering result of (k+1) messages can be obtained by updating the result of the first k messages with the k+1th message. There’s no need to repeat the whole clustering process.

Discussions BIG (??) sacrifice of functionality: No retweets. No replying. No tweets without hashtags. No following a user (must follow a (user, hashtag) pair). Do you still want to use Hummingbird? Incremental clustering requires that the clustering result of (k+1) messages can be obtained by updating the result of the first k messages with the k+1th message. There’s no need to repeat the whole clustering process.

Conclusions One the first efforts to mitigate privacy breach from service providers. Propose Hummingbird architecture. Implemented a prototype and demonstrated its low performance overhead Incremental clustering requires that the clustering result of (k+1) messages can be obtained by updating the result of the first k messages with the k+1th message. There’s no need to repeat the whole clustering process.

Thank you!