Security Issues in P2P
Di Wu, Polytechnic Institute of NYU
Many Apps are Migrating to P2P File sharing Peer-assisted file and patch distribution Live video streaming Video on demand VoIP Hybrid CDN/P2P
Today’s Focus: P2P Security P2P is potentially more vulnerable than client server. Decentralized More difficult to manage and control Need to understand the security issues for architecting future P2P apps Attacks from entertainment industry reveal weak spots in P2P
Attacks On & From Attacks on P2P systems: Attacks from P2P Systems:
Today’s Talk Survey earlier work on P2P security: Pollution attack on file systems Poisoning attack on file systems Poisoning DDoS attack on arbitrary systems Discuss work in progress on BitTorrent attacks
Joint Work With Jian Liang Xiaojun Hei Rakesh Kumar Naoum Naoumov Prithula Dhungel Di Wu
Attacks on P2P sharing
Two types:
  Pollution: file corruption (targets the file content)
  Index poisoning (targets the file index)
Investigated in two networks:
  FastTrack/Kazaa: unstructured P2P network
  Overnet/Kad: structured (DHT) P2P network, part of eDonkey/eMule
File Pollution: Infocom 05 original content polluted content pollution company
File Pollution pollution server pollution company file sharing network
File Pollution Unsuspecting users spread pollution ! Alice Bob
File Pollution Unsuspecting users spread pollution ! Yuck
Index Poisoning: Infocom 06
Index before poisoning:
  title      location
  bigparty   123.12.7.98
  smallfun   23.123.78.6
  heyhey     234.8.89.20
Index after poisoning:
  title      location
  bigparty   123.12.7.98
  smallfun   23.123.78.6
  heyhey     234.8.89.20
  bighit     111.22.22.22
Addresses shown in the file sharing network figure: 23.123.78.6, 123.12.7.98, 234.8.89.20, 111.22.22.22
FastTrack/Kazaa Overlay ON = ordinary node SN = super node SN ON ON ON Each SN maintains a local index
FastTrack Query Alice ON = ordinary node SN = super node SN ON ON ON
FastTrack Download ON = ordinary node SN = super node HTTP request for hash value HTTP request for hash value SN ON ON ON Bob
FastTrack Download ON = ordinary node P2P file transfer SN = super node P2P file transfer SN ON ON ON
What’s a DHT ? Distributed index over peers Records <key, data> distributed over peers Provide API to insert and locate records Efficient Peers have IDs Record is placed in peer whose ID is closest to key Efficient algos to find peer with closest ID
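The placement rule above ("record is placed in peer whose ID is closest to key") can be sketched in a few lines. This is a minimal illustration, assuming a Kademlia-style XOR distance and made-up 4-bit peer IDs; the slide itself only says "closest", so the metric and the function name `closest_peer` are assumptions.

```python
def closest_peer(key: int, peer_ids: list[int]) -> int:
    """Return the peer ID closest to `key` under the XOR metric (assumed metric)."""
    if not peer_ids:
        raise ValueError("no peers in the overlay")
    return min(peer_ids, key=lambda peer_id: peer_id ^ key)


# Hypothetical 4-bit overlay, for illustration only.
peers = [0b0001, 0b0011, 0b0100, 0b0101, 0b1000, 0b1010, 0b1100, 0b1111]

# A record with key 1101 would be stored at (and later looked up from) this peer.
owner = closest_peer(0b1101, peers)
print(format(owner, "04b"))
```

A real DHT does not scan a global peer list; it reaches the same peer through a small number of routing hops. The sketch only shows which peer the record ends up on.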
Overnet: DHT
Node IDs shown: 0001, 0011, 0100, 0101, 1000, 1010, 1100, 1111
Steps shown: Publish <keyword, fileID>; Publish <fileID, location>; Query with keyword; Query location; Download (Bob)
In Overnet/Kad Two record types: <hash_keyword, hash_file> <hash_file, IP address>
Titles, versions, hashes & copies The title is the title of song/movie/software A given title can have thousands of versions Each version has its own hash = version_id Each version can have thousands of copies A title can also have non-existent versions, each identified by a hash
Index Poisoning in FastTrack and Overnet FastTrack/Kazaa Advertise to supernodes (target_song, bogus_IP) for many bogus IP’s, many versions of target_song Overnet/E-donkey Advertise record: (hash_target_keyword, bogus_version_id)
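The two Overnet record types and the effect of a poisoned keyword record can be sketched with a toy in-memory index. This is a sketch under stated assumptions, not the Overnet protocol: the class `ToyIndex`, its method names, and the use of SHA-1 as a stand-in hash are all hypothetical, and a single dictionary replaces the distributed index.

```python
import hashlib
from collections import defaultdict


def h(text: str) -> str:
    """Stand-in hash (SHA-1); the real network's hash function is not modeled."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


class ToyIndex:
    """Single-process stand-in for the two record types of the DHT."""

    def __init__(self) -> None:
        # <hash_keyword, hash_file> records
        self.versions_by_keyword: dict[str, set[str]] = defaultdict(set)
        # <hash_file, IP address> records
        self.locations_by_version: dict[str, set[str]] = defaultdict(set)

    def publish_keyword(self, keyword: str, version_id: str) -> None:
        self.versions_by_keyword[h(keyword)].add(version_id)

    def publish_location(self, version_id: str, ip: str) -> None:
        self.locations_by_version[version_id].add(ip)

    def search(self, keyword: str) -> dict[str, set[str]]:
        """Map every version advertised for `keyword` to its known locations."""
        versions = self.versions_by_keyword.get(h(keyword), set())
        return {v: set(self.locations_by_version.get(v, set())) for v in versions}


index = ToyIndex()

# A legitimate publisher advertises one version and where to get it.
real_version = h("bigparty-version-1")
index.publish_keyword("bigparty", real_version)
index.publish_location(real_version, "123.12.7.98")

# Poisoning: a keyword record pointing at a version id that no peer holds.
bogus_version = h("no-such-version")
index.publish_keyword("bigparty", bogus_version)

results = index.search("bigparty")
undownloadable = [v for v, locations in results.items() if not locations]
```

The bogus version shows up in the search results next to the real one but has no location, so a download attempt fails.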
Attacks: How Effective?
For a given title, what fraction of the “displayed copies” are clean? Poisoned? Polluted?
Brute-force approach:
  attempt to download all versions
  versions that don’t download are poisoned
  for those versions that download, listen to or watch each one
How do we determine pollution levels without downloading?
Definition of Pollution and Poisoning Levels (t, t+ Δ): investigation interval V: set of all versions of title T V1, V2, V3: sets of poisoned, polluted, clean versions Cv: number of advertised copies of version v
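The formulas for the levels did not survive in the slide text. A minimal sketch, assuming each level is the copy-weighted share of advertised copies observed during the investigation interval (for example, poisoning level = sum of Cv over V1 divided by sum of Cv over V); the function name and the sample numbers are hypothetical.

```python
def attack_levels(copies: dict[str, int],
                  poisoned: set[str],
                  polluted: set[str]) -> dict[str, float]:
    """Copy-weighted shares of poisoned (V1), polluted (V2) and clean (V3) versions.

    copies maps each version id v in V to Cv, its number of advertised copies.
    """
    total = sum(copies.values())
    if total == 0:
        raise ValueError("no advertised copies for this title")

    def share(versions: set[str]) -> float:
        return sum(copies.get(v, 0) for v in versions) / total

    clean = set(copies) - poisoned - polluted
    return {
        "poisoning": share(poisoned),
        "pollution": share(polluted),
        "clean": share(clean),
    }


# Made-up counts for one title with three versions.
example = attack_levels({"a": 60, "b": 30, "c": 10}, poisoned={"a"}, polluted={"b"})
```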
How to Estimate?
Need Cv, v ∈ V
Need V1, V2, V3
Don’t want to download and listen to files!
Solution:
  Harvest version ids and copy locations
    FastTrack: crawl
    Overnet: insert node, receive publish messages
  Heuristic for classifying versions into V1, V2, V3
Copies at Users FastTrack Overnet For certain titles, a tiny fraction of users advertise the majority of the copies
Heuristic Identify heavy and light advertisers Hh = set of hashes from heavy advertisers Hl = set of hashes from light advertisers polluted versions Hh Hl clean versions poisoned versions
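One reading of the Venn diagram labels above: versions advertised only by heavy advertisers are poisoned, versions advertised by both heavy and light advertisers are polluted, and versions advertised only by light advertisers are clean. The sketch below follows that reading. The copy-count threshold separating heavy from light advertisers is an assumption, since the slide does not state how the split is made.

```python
from collections import Counter


def classify_versions(adverts: list[tuple[str, str]],
                      heavy_threshold: int) -> dict[str, set[str]]:
    """Classify version hashes from (advertiser, version_hash) pairs.

    Each pair is one advertised copy. An advertiser with at least
    `heavy_threshold` copies counts as heavy (assumed rule).
    """
    copies_per_advertiser = Counter(advertiser for advertiser, _ in adverts)
    heavy = {a for a, n in copies_per_advertiser.items() if n >= heavy_threshold}

    hh = {version for advertiser, version in adverts if advertiser in heavy}
    hl = {version for advertiser, version in adverts if advertiser not in heavy}

    return {
        "poisoned": hh - hl,   # only heavy advertisers list it
        "polluted": hh & hl,   # heavy advertisers plus ordinary users
        "clean": hl - hh,      # only ordinary users list it
    }


# Made-up harvest: one heavy advertiser, two ordinary users.
harvest = (
    [("attacker", "v_bogus")] * 50
    + [("attacker", "v_polluted")] * 50
    + [("alice", "v_polluted"), ("bob", "v_clean")]
)
classes = classify_versions(harvest, heavy_threshold=20)
```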
FastTrack Copies
Overnet Copies
Defenses
Authenticate source of advertisement
  Require advertisements to be sent over TCP
  Require advertised IP addresses to be consistent with source IP address
Rating versions
  Moderated sites tracking verified hashes
Rating and blacklisting sources
  IP blocks that advertise a large number of copies are assigned low recommendations
  Distributed reputation, e.g., EigenTrust
Blacklisting Assign reputations to /n subnets Bad reputation to subnets with large number of advertised copies of any title Obtain reputations locally; share with distributed algorithm Locally blacklist /n subnets with bad reputations
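The local part of the scheme above can be sketched as follows: group advertised addresses into /n subnets, count copies per subnet and title, and blacklist subnets whose count for any title exceeds a limit. The prefix length, the limit `max_copies`, and the function names are assumptions chosen for illustration; sharing reputations between peers is not modeled.

```python
import ipaddress
from collections import Counter


def subnet_of(ip: str, prefix_len: int):
    """Return the /prefix_len network containing `ip`."""
    return ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)


def local_blacklist(adverts: list[tuple[str, str]],
                    prefix_len: int = 24,
                    max_copies: int = 100) -> set:
    """Blacklist subnets advertising more than `max_copies` copies of any one title.

    adverts is a list of (advertised_ip, title) pairs, one per advertised copy.
    """
    copies = Counter((subnet_of(ip, prefix_len), title) for ip, title in adverts)
    return {subnet for (subnet, _title), n in copies.items() if n > max_copies}


# Made-up observations: 150 copies of one title from a single /24, plus one ordinary copy.
observed = [(f"111.22.22.{i}", "bighit") for i in range(1, 151)]
observed.append(("123.12.7.98", "bigparty"))
blocked = local_blacklist(observed, prefix_len=24, max_copies=100)
```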
Blacklisting: More
The Inverse Attack (2006) Attacks on P2P systems: But can also exploit P2P systems for DDoS attacks against innocent host:
DDoS Attacks From: Poison Index
Example: target_IP = www.poly.edu
Advertise records: (Madonna_hit, target_IP:target_port)
Users attempt to download popular title
Generates fully-open TCP connections from user peers to target
Nastier than SYN-flood DDoS attack
DDoS Poison Index: Connections
DDoS Poison Index: Connection Durations
DDoS Attacks in DHT: Poison Routing Tables Example Routing advertisements: (node_ID, target_IP) Many nodes will absorb advertisement Query, publish, overlay messages arrive at poisoned node Some are directed at target
DDoS Poison Routing: UDP Bandwidth
DDoS Poison Routing: IP sources
Summary Pollution and Poison attacks Can attack arbitrary title Developed methodology to evaluate success of attack without downloading content Leads to distributed blacklisting scheme Can also DDoS attack arbitrary node with poisoning Index poisoning Routing poisoning
BitTorrent Security
Di Wu, Polytechnic Institute of NYU
Today’s Talk What is BitTorrent? Who attacks BitTorrent? How vulnerable is BitTorrent? Conclusion
What is BitTorrent?
BitTorrent A Peer-to-Peer Content Distribution Protocol/ Program Developed by Bram Cohen in 2001 Bram grew up in upper west side of Manhattan, NYC First version written in Python
BitTorrent
torrent: group of peers exchanging chunks of a file
tracker: tracks peers in torrent; provides peer list
torrent index server: search for torrents; provides .torrent file
peers: trade chunks
BitTorrent Ecosystem Open protocol 50+ client implementations Dozens of tracker implementations Dozens of torrent location sites 5 million simultaneous users & growing Evolving: Peer discovery: DHTs, gossiping Proprietary protocols, private torrents
BitTorrent Basics Seeds and leechers File divided into 256KB pieces. Each piece is 16 blocks. Download blocks and assemble pieces Hash piece to check integrity Peers advertise pieces they have to neighbors Peer sends blocks to four neighbors currently sending it data at the highest rate – Tit-for-Tat And also to one random neighbor
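The upload rule on this slide (four neighbors sending at the highest rate, plus one random neighbor) can be sketched as a selection function. This is a simplified sketch: the function name and sample rates are hypothetical, and the timing of real clients (how often the choice is re-evaluated) is not modeled.

```python
import random
from typing import Optional


def choose_unchoked(rate_from_neighbor: dict[str, float],
                    regular_slots: int = 4,
                    rng: Optional[random.Random] = None) -> set[str]:
    """Pick the neighbors to upload to.

    rate_from_neighbor maps each neighbor to the rate at which it is
    currently sending data to us.
    """
    rng = rng or random.Random()
    by_rate = sorted(rate_from_neighbor, key=rate_from_neighbor.get, reverse=True)
    unchoked = set(by_rate[:regular_slots])   # tit-for-tat: best uploaders
    others = by_rate[regular_slots:]
    if others:
        unchoked.add(rng.choice(others))      # one random neighbor
    return unchoked


# Made-up rates in kbps.
rates = {"A": 50.0, "B": 40.0, "C": 30.0, "D": 20.0, "E": 10.0, "F": 5.0}
chosen = choose_unchoked(rates, rng=random.Random(0))
```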
Who attacks BitTorrent?
The P2P Battle Record/movie companies limit piracy in P2P. Three different approaches: Sue P2P file sharing companies Sue individual users Launch large scale Internet attacks Led to demise of Kazaa, eDonkey Question: does it work for BitTorrent?
Does it work for BitTorrent?
Sue P2P file sharing companies
  BitTorrent swarms and clients not controlled by small set of companies
  Could sue torrent web sites and tracker services
    too many
    tracker services being decentralized: DHT & gossiping
Sue individual users
  Painstaking and unpopular
Large scale Internet attacks
  Currently being used: hire technical companies
MediaSentry
MediaDefender
MacroVision
How vulnerable is BitTorrent?
Our Objective Identified attacks against BT components How vulnerable is BitTorrent to attacks? Methodology: Active measurements Passive measurements PlanetLab experiments
Attack BitTorrent swarm: tracker: seed: leecher: trading chunks torrent index server: e.g., piratebay… peer
Classes of BitTorrent Attacks Attacks against an existing torrent against leechers against initial seed against peer discovery against torrent discovery Decoy attacks: attacker creates own torrent Seeding a polluted file Seeding a file and delivering only 99%
Fake Block Attack One kind of Leecher Attack Attacker establishes TCP connections with legitimate peers Peer downloads one fake block from attacker and 15 good blocks from legit peers Hash failure – download is prolonged
Fake Block Attack
BitTorrent file divided into 256KB pieces. Each piece consists of 16 blocks.
Attacker and victim peer: 1. TCP Connection 2. Fake BitMap 3. Block Request 4. Fake Block 5. Hash Fail
Genuine peers: send genuine blocks to the victim peer
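Why a single fake block is enough to cause a hash failure can be shown with the piece and block sizes from the slide. This is a minimal sketch using SHA-1 per piece as in the BitTorrent protocol; the piece contents and the helper name `piece_ok` are made up for illustration.

```python
import hashlib

PIECE_SIZE = 256 * 1024                        # 256 KB piece
BLOCKS_PER_PIECE = 16
BLOCK_SIZE = PIECE_SIZE // BLOCKS_PER_PIECE    # 16 KB block


def piece_ok(blocks: list[bytes], expected_sha1: bytes) -> bool:
    """Assemble the blocks and compare the piece hash with the published one."""
    return hashlib.sha1(b"".join(blocks)).digest() == expected_sha1


# Made-up piece contents and the hash a .torrent file would carry for it.
original_piece = bytes(range(256)) * 1024
expected = hashlib.sha1(original_piece).digest()
genuine_blocks = [original_piece[i:i + BLOCK_SIZE]
                  for i in range(0, PIECE_SIZE, BLOCK_SIZE)]

# 15 genuine blocks plus one fake block from the attacker.
tampered_blocks = list(genuine_blocks)
tampered_blocks[7] = b"\x00" * BLOCK_SIZE

print(piece_ok(genuine_blocks, expected))
print(piece_ok(tampered_blocks, expected))
```

The check is per piece, so the failure by itself does not reveal which of the 16 blocks was fake.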
Connection attack One kind of leecher attack. Attacker establishes many TCP connections to each target peer. Doesn’t upload any blocks Chatty peer: keeps connection active with repeated BT handshake messages
Passive Measurements Recent album under attack: “Foo Fighters” Collect traces while downloading Azureus and uTorrent clients DSL and Ethernet connections Total 54 downloads To estimate download time without attack: Obtain blacklist from torrentfreak.com Use Peer Guardian to prevent connections to and from blacklisted peers Developed parser to analyze BT trace
Download is NOT being prolonged by more than 40%
Azureus results:
            w/ IP-filtering             w/o IP-filtering   Delay Ratio
  Ethernet  15.52 mins (6 downloads)    20.99 mins         35.2%
  DSL       19.98 mins                  25.88 mins         29.5%
Delay Ratio = (Td w/o IP Filtering - Td w/ IP Filtering) / Td w/ IP Filtering
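As a quick check, the delay ratio definition can be applied to the Azureus download times on this slide. The function name is hypothetical; the numbers are the ones reported above.

```python
def delay_ratio(td_with_filtering: float, td_without_filtering: float) -> float:
    """Relative increase in download time when attackers are not filtered out."""
    return (td_without_filtering - td_with_filtering) / td_with_filtering


# Average download times in minutes, taken from the Azureus results.
ethernet = delay_ratio(15.52, 20.99)
dsl = delay_ratio(19.98, 25.88)

print(f"Ethernet: {ethernet:.1%}")
print(f"DSL: {dsl:.1%}")
```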
Zoom in one Azureus trace Chatty-peers make up a major fraction of the useful peers.
Handshake messages sent by chatty peers
Download is NOT being prolonged by more than 60% for DSL
uTorrent results:
            w/ IP-filtering             w/o IP-filtering   Delay Ratio
  Ethernet  9.17 mins (10 downloads)    9.42 mins          2.7%
  DSL       18.32 mins (5 downloads)    28.93 mins         57.9%
Zoom in one uTorrent trace Even a low percentage of fake block attack peers (2%) resulted in a delay ratio of 60%
Conclusions on Leecher Attack Anti-P2P companies applying different strategies for different BT clients Largely ineffective for Ethernet clients For DSL, download time increases by 30-60%
Seed Attack Attack initial seed: “Nip in the bud” Goal: to diminish the seed’s ability to upload blocks. Two Types of Seed Attacks: Bandwidth Attack Connection Attack
Bandwidth Attack Bandwidth Attack: Connect to seed, download at high rate attempt to consume the majority of seed upload bandwidth Rationale: Conventional algo gives all its bandwidth to 5 highest downloaders Seed: no tit-for-tat
Connection Attack Connection Attack: grab most of connection slots of seed Prevent legitimate peers from connecting to the seed. Rationale: a seed has limited conn. slots.
PlanetLab experiments
Attack environment:
  Seed: Azureus, BitTornado, and uTorrent
  Attacker: modified BitTornado, download only
  Flash crowd: start seed, start 5 leechers, start attack peers, start 25 leechers
Nodes:
  30 leechers: PL nodes, DL 1.5 Mbps, UL 384 kbps
  1 seed: Poly campus
  1 tracker: PL node
  Bandwidth attack peers: PL nodes, no DL limit
  Connection attack peers: PL nodes, DL 160 kbps, UL 80 kbps
Azureus Seed: Bandwidth Attack
40 attackers, 100 MB file
Delay Ratio = Avg. Download Time w/ attack / Avg. Download Time w/o attack
  Seed Bandwidth   Delay Ratio
  480 Kbps         2.24, 2.58
  800 Kbps         1.45, 1.37
  1120 Kbps        1.08, 1.04
Azureus: Seed BW Distribution Attackers cannot always occupy the upload slots.
Azureus Seeding Algorithm
Azureus seed prefers peers with:
  Higher download rate
  Fewer bytes downloaded from seed so far
Azureus Seeding Algorithm
Seed maintains two lists of peers:
  Ascending ordered list based on download rate
  Descending ordered list based on download amount
  rank   dl rate list     dl amount list
  1      P1, 10 kbps      P4, 30 MB
  2      P2, 20 kbps      P3, 20 MB
  3      P3, 50 kbps      P1, 10 MB
  4      P4, 100 kbps     P2, 0 MB
Combined list (sum of ranks): P1, 1+3=4; P2, 2+4=6; P3, 3+2=5; P4, 4+1=5
Sort: P2, P3, P4, P1
Top n-1 get upload slots!
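The rank-sum ordering in the worked example on this slide can be reproduced with a short sketch. It is not taken from the Azureus source: the function name is hypothetical, ties are broken by input order (an assumption that happens to match the slide's P3, P4 order), and n = 4 is chosen only for the example.

```python
def rank_for_upload(peers: dict[str, tuple[float, float]]) -> list[str]:
    """Order peers by combined rank, best candidate first.

    peers maps a peer name to (download rate in kbps, MB downloaded from seed).
    """
    names = list(peers)
    by_rate = sorted(names, key=lambda p: peers[p][0])                   # ascending rate
    by_amount = sorted(names, key=lambda p: peers[p][1], reverse=True)   # descending amount
    score = {p: (by_rate.index(p) + 1) + (by_amount.index(p) + 1) for p in names}
    # Highest combined rank first; sorted() is stable, so ties keep input order.
    return sorted(names, key=lambda p: score[p], reverse=True)


# The four peers from the slide.
slide_peers = {
    "P1": (10, 10),
    "P2": (20, 0),
    "P3": (50, 20),
    "P4": (100, 30),
}
ordering = rank_for_upload(slide_peers)

n = 4                              # hypothetical slot parameter
with_slots = ordering[: n - 1]     # "Top n-1 get upload slots"
```

In this example P1 is left out: it has the lowest download rate and has already received 10 MB, while P2 ranks first because it has received nothing from the seed yet.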
Azureus Seed: Connection Attack
800 attackers, 500 MB file
Run multiple attackers on one node
  Seed Bandwidth   Delay Ratio
  480 Kbps         All leechers stuck at 1.2%
  800 Kbps         All leechers stuck at 22.3%; all leechers stuck at 40.3%
  1120 Kbps        All leechers stuck at 44.9%; all leechers stuck at 16.2%
Azureus: Seed BW Distribution
Azureus Connection Management Optimistic Disconnect Every 30 secs, Select the peer that has not received any data from seed for the longest period Tmax. If Tmax > 5 mins, disconnect that peer. Every 30 secs, if conn slots are full, perform Optimistic Disconnect.
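The disconnect rule above can be sketched as a function that a seed would call every 30 seconds. This is an illustration of the rule as stated on the slide, not Azureus code; the function name, the data structure, and the sample timestamps are assumptions.

```python
from typing import Optional

CHECK_INTERVAL_SECS = 30      # how often the check runs
IDLE_LIMIT_SECS = 5 * 60      # Tmax threshold


def optimistic_disconnect(last_data_sent: dict[str, float],
                          now: float,
                          max_connections: int) -> Optional[str]:
    """Return the peer to disconnect, or None.

    last_data_sent maps each connected peer to the time (in seconds) at which
    the seed last sent it data. The rule only applies when all slots are in use.
    """
    if len(last_data_sent) < max_connections:
        return None
    # Peer that has gone longest without receiving data from the seed.
    idlest = min(last_data_sent, key=last_data_sent.get)
    t_max = now - last_data_sent[idlest]
    return idlest if t_max > IDLE_LIMIT_SECS else None


# Made-up state at time 1000 s with all three slots occupied.
connections = {"leecher1": 990.0, "leecher2": 950.0, "idle_peer": 600.0}
victim = optimistic_disconnect(connections, now=1000.0, max_connections=3)
```

Under this rule at most one idle peer is dropped per check.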
Azureus: Detailed Seed BW Dist.
Azureus Summary Bandwidth attack is unsuccessful. Connection attack is very effective.
BitTornado Seed
Seeding Algorithm
  Pure bandwidth-first algorithm
  Peers with highest download rate get unchoked.
Connection Management
  Incoming connections: accept if room.
  Outgoing connections: seed initiates conn.
  Dead connections are removed: data not delivered for 2+ times.
BitTornado: Bandwidth Attack
40 Attackers, 500 MB file
  Seed Bandwidth   Delay Ratio
  400 Kbps         4.11, 4.44
  800 Kbps         1.95, 1.39
  1200 Kbps        1.08, 1.06
BitTornado: Seed BW Dist.
Explanation
Download rate = min{b, u/n}
  b: peer download capacity (leechers/attackers)
  u: seed upload capacity
  n: upload slots at the seed
Explanation
When bA > bL >= u/n, the attacker has no advantage over leechers.
  bA: bandwidth attacker, unlimited
  bL: leecher, 1.5 Mbps
  u/n: 400/4 to 1200/4 kbps
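The model above can be evaluated for the experiment's parameters. This is a small sketch of the formula Download rate = min{b, u/n}, with n = 4 upload slots as implied by the slide's u/n values; the function name is hypothetical.

```python
import math


def download_rate(b_kbps: float, u_kbps: float, n_slots: int) -> float:
    """Rate a peer obtains from the seed: min{b, u/n}."""
    return min(b_kbps, u_kbps / n_slots)


N_SLOTS = 4
LEECHER_KBPS = 1500.0        # bL: 1.5 Mbps
ATTACKER_KBPS = math.inf     # bA: no download limit

rates = {}
for u in (400.0, 800.0, 1200.0):
    attacker = download_rate(ATTACKER_KBPS, u, N_SLOTS)
    leecher = download_rate(LEECHER_KBPS, u, N_SLOTS)
    rates[u] = (attacker, leecher)
    print(f"u={u:.0f} kbps: attacker {attacker:.0f} kbps, leecher {leecher:.0f} kbps")
```

For every seed bandwidth tested, the per-slot share u/n is the bottleneck for both kinds of peer, so the attacker's extra download capacity does not help it win slots.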
BitTornado: Connection Attack
800 attackers, 500 MB file
  Seed Bandwidth   Delay Ratio
  400 Kbps         9.52, 8.99
  800 Kbps         1.00
  1200 Kbps        1.03
BitTornado: Seed BW Dist.
Explanation Connection attackers’ download rate is smaller than leechers’ download rate. bL > u/n > bA, when u = 800kbps, 1200kbps bA: conn. attacker, 160kbps bL: leecher, 1.5Mbps u/n: 400/4~1200/4 kbps When u=400kbps, a combined bandwidth and connection attack.
BitTornado Summary Both bandwidth attack and connection attack are ineffective for BitTornado. The results about uTorrent are similar to BitTornado.
Decoy Attack
Create your own torrent:
  Seed file
  Use your own tracker or other’s
  Upload .torrent files to torrent Web sites
Seeded file:
  Fake file
  Or deliver only 99% of file
Own tracker:
  Exaggerate size of torrent to torrent sites.
Countermeasure: users observe problem, notify torrent site administrators, or post comments
BitTorrent Conclusions
Fake-block and chatty-peer attacks are generally not highly effective
Connection attack is potentially harmful for seed
Decoy attack? Nip-in-the-bud?
Talk Conclusion P2P is potentially more vulnerable than client server. Decentralized More difficult to manage and control Need to understand the security issues for architecting future P2P apps Attacks from entertainment industry reveal weak spots in P2P
Papers
P. Dhungel, X. Hei, K.W. Ross, Is BitTorrent Unstoppable?, work in progress
P. Dhungel, X. Hei, K.W. Ross, N. Saxena, The Pollution Attack in P2P Live Video Streaming: Measurement Results and Defenses, Sigcomm P2P-TV Workshop, Kyoto, 2007
N. Naoumov and K.W. Ross, Exploiting P2P Systems for DDoS Attacks, International Workshop on Peer-to-Peer Information Management, Hong Kong, May 2006
J. Liang, N. Naoumov, and K.W. Ross, The Index Poisoning Attack in P2P File-Sharing Systems, Infocom 2006, Barcelona, 2006
J. Liang, N. Naoumov, and K.W. Ross, Efficient Blacklisting and Pollution-Level Estimation in P2P File-Sharing Systems, Asian Internet Engineering Conference, Bangkok, December 2005
J. Liang, R. Kumar, Yongjian Xi, K.W. Ross, Pollution in P2P File Sharing Systems, Infocom 05, Miami, 2005