OpenVoIP An Open Peer-to-Peer VoIP and IM System

Presentation transcript:

OpenVoIP An Open Peer-to-Peer VoIP and IM System Salman Abdul Baset, Gaurav Gupta, and Henning Schulzrinne Columbia University

Agenda: What is a peer-to-peer VoIP and IM system? Why P2P? Why not Skype or OpenDHT? Design challenges. OpenVoIP architecture and design. Implementation issues. Demo. Relay selection in a P2P VoIP system. Performance monitoring of a P2P VoIP system.

A Peer-to-Peer VoIP and IM System. Components: establish media session (in the presence of NATs), directory service, presence, monitoring, PSTN connectivity. P2P for all of these?

Why P2P? Cost. Scale: 10 million Skype online users (comScore); 23 million MSN online users (comScore). Media session load: 100,000 calls per minute (1,666 calls per second); 106 Mb/s (64 kb/s voice); 426 Mb/s (256 kb/s video). Presence load: 1000 notifications per second (500 B per notification); 4 Mb/s. Monitoring load: call minutes; number of online users.
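The load figures on this slide can be reproduced with simple arithmetic. The sketch below assumes the slide multiplies the per-second call rate by one stream's bit rate (the slide does not state call duration or direction), and all function names are illustrative.

```python
# Back-of-the-envelope check of the slide's load figures.
# Assumption: load = (calls per second) x (bit rate of one stream).

def calls_per_second(calls_per_minute: int) -> int:
    # The slide rounds 1666.67 down to 1,666.
    return calls_per_minute // 60

def media_load_mbps(calls: int, kbps_per_call: int) -> float:
    # kb/s -> Mb/s using decimal units (1 Mb = 1000 kb).
    return calls * kbps_per_call / 1000

def presence_load_mbps(notifications_per_second: int, bytes_each: int) -> float:
    # bytes -> bits -> Mb/s.
    return notifications_per_second * bytes_each * 8 / 1_000_000

if __name__ == "__main__":
    cps = calls_per_second(100_000)
    print(cps, "calls per second")
    print(int(media_load_mbps(cps, 64)), "Mb/s voice")
    print(int(media_load_mbps(cps, 256)), "Mb/s video")
    print(presence_load_mbps(1000, 500), "Mb/s presence")
```

Under a different reading (for example, counting both directions of every call in progress) the totals would be larger, which only strengthens the slide's argument for spreading load across peers.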

Why not Skype? Median call latency through a relay: 96 ms (~6K calls), measured between two machines behind NAT in our lab (ping < 1 ms). Call success rate: 7.3% when the host cache is deleted and the call peers are behind NAT (4.5K call attempts); 74% when traffic is blocked between call peers (11K call attempts). User annoyance: Skype relays calls through a machine whose user needs the bandwidth! That user may shut down the application, resulting in a call drop. Hard to obtain statistics for this user annoyance, though. Closed and proprietary solution (plug P2P in existing SIP phones).

Why not OpenDHT? Actively maintained? 22 nodes as of Sep 7, 2008 [1]. NAT traversal: non-OpenDHT nodes cannot fully participate in the overlay. [1] http://opendht.org/servers.txt

Design Challenges: the usual list… #1 Scalability. #2 Reliability. #3 Robustness. #4 Bootstrap. #5 NAT traversal. #6 Security: data, storage, routing (hard). #7 Management (monitoring). #8 Debugging. Annotations: at bounded bw, cpu, mem per node (<500 B/s); a must for any commercial p2p network.

Design Challenges: the not so usual list… #1 Scalability, but how? Planet Lab has ~500 machines online (~400 in August). Beyond Planet Lab? Which DHT, or unstructured? Any? #2 Robustness? A realistic churn model? At best Skype, p2p traces. #3 Maintenance? OpenDHT only running on 22 nodes (Sep 7, 2008 [1]). #4 NAT traversal: nodes behind NAT fully participating in the overlay? Maybe, but at what cost? Planet Lab alive nodes: http://summer.cs.princeton.edu/status/tabulator.cgi?table=table_nodeviewshort&select=%27resptime%20%3E%200%27 [1] http://opendht.org/servers.txt

OpenVoIP. Design goals: meet the challenges; distributed directory service (Chord, Kademlia, Pastry, Gia); protocol vs. algorithm; common protocol / encoding mechanisms; establish media session between peers [behind NAT] (STUN / TURN / ICE); use of peers as relays; distributed monitoring / statistics gathering. Implementation goals: multiplatform; pluggable with open source SIP phones; ease of debugging. Performance goals: relay selection and performance monitoring mechanisms; beat Skype!

OpenVoIP architecture [figure: bootstrap / authentication server; monitoring server / Google Maps; Overlay1 and Overlay2; a peer in P2PSIP and a client, some behind NAT; example users alice@domain.com and bob@example.com. Protocol stack of a peer: SIP, P2P, STUN, NAT, TLS / SSL.]

Peer-to-Peer Protocol (P2PP): A binary protocol. Geared towards IP telephony but equally applicable to file sharing, streaming, and p2p-VoD. Multiple DHT and unstructured p2p protocol support. Application API. NAT traversal using STUN, TURN and ICE. Request routing: recursive, iterative, parallel; chosen per message. Supports hierarchy (super nodes [peers], ordinary nodes [clients]). Central entities (e.g., authentication server).

Peer-to-Peer Protocol (P2PP): Reliable or unreliable transport (TCP/TLS or UDP/DTLS). Security: DTLS, TLS, storage security. Multiple hash function support: SHA1, SHA256, MD4, MD5. Monitoring: ewma_bytes_sent [rcvd], CPU utilization, routing table.
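The monitoring counter ewma_bytes_sent suggests an exponentially weighted moving average of traffic. The slides do not give the smoothing factor or the sampling interval, so the following is only a generic EWMA sketch; the class name and the default alpha are assumptions.

```python
# Generic exponentially weighted moving average (EWMA).
# Assumption: alpha and the class name are illustrative; P2PP's actual
# smoothing parameters are not given in the slides.

class Ewma:
    def __init__(self, alpha: float = 0.25) -> None:
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must be in (0, 1]")
        self.alpha = alpha
        self.value = None  # no sample seen yet

    def update(self, sample: float) -> float:
        """Fold one sample (e.g. bytes sent in the last interval) into the average."""
        if self.value is None:
            self.value = float(sample)
        else:
            self.value = self.alpha * sample + (1.0 - self.alpha) * self.value
        return self.value

if __name__ == "__main__":
    bytes_sent = Ewma(alpha=0.5)
    for sample in (100, 200, 0):
        print(bytes_sent.update(sample))
```

A larger alpha reacts faster to bursts, while a smaller one gives a steadier figure for a monitoring display.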

OpenVoIP features: Kademlia, Bamboo, Chord. SHA1, SHA256, MD5, MD4. Hash base: multiple of 2. Recursive and iterative routing. Windows XP / Vista, Linux. Integrated with OpenWengo; can connect to OpenWengo and P2PP network. Buddy lists and IM. 1000 node Planet Lab network on ~300 machines. Integrated with Google Maps. Demo video: http://youtube.com/?v=g-3_p3sp2MY

OpenVoIP snapshots direct call through a NAT call through a relay

OpenVoIP snapshots Google Map interface

OpenVoIP snapshots Tracing lookup request on Google Maps

OpenVoIP snapshots

OpenVoIP snapshots Resource consumption of a node

Why may calls fail in OpenVoIP? Cannot find a user: the user is online, but p2p cannot find it. NAT and firewall issues: SIP messages get through and the call succeeds, but the media? Relay: the relay is shut down. System reliability depends on all of them (search + NAT traversal + relay).

Facts of Peer-to-Peer Life: Routing loops happen. Byzantine failures arise. Nodes become disconnected. System does not always scale! Automated maintenance does not always work. Planet Lab quirks: cleans the directory. DoS attacks on open ports. Bootstrap server is attacked: someone trying their own protocol with our bootstrap server, probably to create a buffer overflow.

OpenVoIP: Key techniques. Randomization is our best friend! Send the maintenance messages within a bounded random time. Churn recovery is on demand and periodic. Insert a new entry in the routing table after checking liveness. Periodically republish SIP records; not feasible for large records. Avoid overly complex mechanisms; they can backfire!
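Sending maintenance messages "within a bounded random time" can be sketched as a jittered timer, so that nodes started together do not all probe at the same instant. The base period and jitter bound below are placeholders, not values from OpenVoIP.

```python
import random

# Jittered maintenance timer.
# Assumption: base_s and jitter_s are illustrative; the slides only say
# the delay is random and bounded.

def next_maintenance_delay(base_s: float, jitter_s: float, rng=random) -> float:
    """Return a delay drawn uniformly from [base_s, base_s + jitter_s]."""
    return base_s + rng.uniform(0.0, jitter_s)

if __name__ == "__main__":
    rng = random.Random(42)  # seeded only to make the demo repeatable
    for _ in range(3):
        print(round(next_maintenance_delay(30.0, 10.0, rng), 2), "seconds")
```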

OpenVoIP: Debugging. Black-box: lookup request for a random key. State acquisition: remotely obtain the resource and storage utilization of a node; set and unset a data value on a node, such as BW or CPU utilization, to test a relay selection algorithm. Remotely enable and disable logging; control log size. Finding a faulty node is hard: centralized vs. distributed approach.

OpenVoIP: releasing an update. Three-step process: (1) Check in a local network (10-15 nodes). (2) Deploy the update on a managed node that fully participates in the overlay and test its functionality. (3) Release the update. Planet Lab deployment: churn one quarter of the network, deploy the update, continue until done.
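The Planet Lab deployment step (take down a quarter of the network, update it, repeat) amounts to splitting the node list into batches. This sketch assumes each round covers a quarter of the original node count; the function name is illustrative and the actual deployment tooling is not described in the slides.

```python
import math

# Rolling update in quarters.
# Assumption: "one quarter" is measured against the full node list, so a
# rollout takes about four rounds.

def rollout_batches(nodes: list, fraction: float = 0.25) -> list:
    """Split nodes into consecutive batches of roughly `fraction` of the network."""
    size = max(1, math.ceil(len(nodes) * fraction))
    return [nodes[i:i + size] for i in range(0, len(nodes), size)]

if __name__ == "__main__":
    for round_no, batch in enumerate(rollout_batches(list(range(8))), start=1):
        # A real rollout would stop these nodes, install the update,
        # restart them, and wait for the overlay to stabilise.
        print("round", round_no, "updates nodes", batch)
```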

OpenVoIP: Bootstrap. Returns a list of twenty nodes if available: recently joined nodes and some managed nodes.
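A bootstrap reply of this kind could be assembled as follows. The split between recently joined and managed nodes is not stated in the slides, so `managed_slots` is purely an assumption, as are the function and parameter names.

```python
# Building a bootstrap reply of at most twenty nodes.
# Assumption: a fixed number of slots is reserved for managed nodes; the
# real mixing policy is not given in the slides.

def bootstrap_response(recent: list, managed: list,
                       limit: int = 20, managed_slots: int = 5) -> list:
    """Return up to `limit` nodes: the most recently joined plus some managed ones."""
    managed_part = list(managed[:min(managed_slots, limit)])
    room = limit - len(managed_part)
    # `recent` is assumed to be ordered oldest first, so take from the tail.
    recent_part = list(recent[-room:]) if room > 0 else []
    return recent_part + managed_part

if __name__ == "__main__":
    recent = ["p%d" % i for i in range(30)]
    managed = ["m%d" % i for i in range(10)]
    print(bootstrap_response(recent, managed))
```

Including managed nodes gives a joining peer at least a few contacts that are unlikely to have churned away, which is presumably why the slide mentions them.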

Thank you.

NAT traversal P2PP SIP Media

NAT traversal: solution space. Tunnel SIP and RTP within P2PP. Tunnel SIP within P2PP. NAT traversal for P2PP, SIP, RTP: tunnel within STUN, multiplexing; different ports, same port.

Implementation issues. Routing table: hash table. Routing table maintenance: insert a new entry after a ‘keep-alive’; max entries per row (currently 5); proximity neighbor selection [disabled]. Churn recovery: send keep-alive to nodes after a random time; on demand; get routing table of a randomly selected node. Bootstrap: bootstrap server and 20 bootstrap peers; returns recently joined nodes and some bootstrap nodes. [figure: routing table rows x+2^i, x+2^(i+1), x+2^(i+2), x+2^(i+3)]

Implementation design [figure]. Application pluggability: insert (key, value, callback), callback (resp), lookup (key, callback). Components: Bootstrap, Client, KadPeer, BambooPeer, OtherPeer; Node, Distance, Routing table, Neighbor table, Parser / encoder, BigInt, Transactions. Multiplatform layer: Sys, Transport / timers (DTLS, TLS, UDP, TCP).
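The callback-style application API named on this slide, insert (key, value, callback) and lookup (key, callback), can be illustrated with an in-memory stand-in. Only the two call shapes come from the slide; the class name, the response dictionaries and the status codes are hypothetical, and a real peer would route the request through the overlay before invoking the callback.

```python
from typing import Callable, Dict

# In-memory stand-in for the callback-based API on the slide.
# Assumption: InMemoryOverlay and the response format are hypothetical.

Response = Dict[str, object]
Callback = Callable[[Response], None]

class InMemoryOverlay:
    def __init__(self) -> None:
        self._store: Dict[str, str] = {}

    def insert(self, key: str, value: str, callback: Callback) -> None:
        self._store[key] = value
        callback({"status": 200})

    def lookup(self, key: str, callback: Callback) -> None:
        if key in self._store:
            callback({"status": 200, "value": self._store[key]})
        else:
            callback({"status": 404})  # hypothetical "not found" code

if __name__ == "__main__":
    overlay = InMemoryOverlay()
    overlay.insert("alice@domain.com", "192.0.2.1:5060", print)
    overlay.lookup("alice@domain.com", print)
```

Keeping the application behind callbacks like these is what lets a SIP phone stay unaware of which overlay algorithm (KadPeer, BambooPeer, and so on) sits underneath.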

Implementation issues. Request routing: recursive (per message state); iterative. Loop detection: iterative [machine], recursive [using message state]. Replication vs. republish: periodically republish [30 s to 1 minute]; [pro] learn about the topology; [con] republishing large data incurs bw overhead. Logging: log mechanism.

Implementation issues. Diagnostics protocol: command-line (showrt, shownt, showro, showcp, insert [key] [value], rlookup, ulookup; getrt, getnt, getro [IPaddr] [port]); graphical. Platform independence: thread (3 functions): createthread, waitforthread [pthread_join]; sys (3 functions): strcasecmp, getopt, gettimeofday (GetSystemTimeAsFileTime); net (4 functions): close [closesocket], inet_aton [inet_addr], select timer, getsockopt.

Join
Participants: JP (joining peer, becomes P10), BS (bootstrap server), P5, P7, P9.
Message sequence: 1. Bootstrap; 2. 200 (P5, P30, P2P-Options); 3+. STUN (ICE candidate gathering); 4. Join; 5. Join; 6. 200; 7. 200 N(P9, P15); 8. Join; 9. 200; 10. PublishObject; 11. 200.
P2P-Options = P2P algorithm, hash algorithm, logarithm base.
1) The joining peer (JP) first sends a query message to the bootstrap server to discover P2P-Options and other peers in the network.
2) It then discovers its NAT type and gathers ICE candidates.
3) It sends a join request, which is recursively forwarded.
4) JP will be inserted between P7 and P9.
5) P9 is responsible for all objects between [P7, P9]. It transfers the relevant objects to JP.
6) JP's gathered candidates are sent in the Peer-Info TLV.

Call establishment. Participants: P1, P3, P5, P7. Message sequence: 1. LookupObject (P7); 4. 200 (P7 PeerInfo); 5. 200 (P7 PeerInfo); 6. 200 (P7 PeerInfo); 7. INVITE; 8. 200 OK; 9. ACK; then media.

Chord [figure: node id=x with a neighbor table and a routing table. Routing table rows start at x+2^i, x+2^(i+1), x+2^(i+2), x+2^(i+3); a row may hold any node in the interval. A node on the path can be skipped.]
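The row labels x+2^i in this figure are the start points of Chord's routing table intervals, taken modulo the size of the identifier space. The sketch below computes them for a small ring; `id_bits` and the function names are illustrative, and the slide's variant (any node in the interval may fill a row) is not modelled here.

```python
from typing import List

# Chord routing table start points and successor lookup on a small ring.
# Assumption: function and parameter names are illustrative.

def finger_starts(node_id: int, id_bits: int) -> List[int]:
    """Return (node_id + 2^i) mod 2^id_bits for i = 0 .. id_bits - 1."""
    modulus = 1 << id_bits
    return [(node_id + (1 << i)) % modulus for i in range(id_bits)]

def successor(key: int, node_ids: List[int]) -> int:
    """First node clockwise from key, wrapping around the ring."""
    ring = sorted(node_ids)
    for node in ring:
        if node >= key:
            return node
    return ring[0]

if __name__ == "__main__":
    nodes = [1, 4, 6]
    for start in finger_starts(1, 3):
        print("row starting at", start, "-> node", successor(start, nodes))
```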

Kademlia (XOR) [figure: node id=x with a routing table and no neighbor table. Routing table rows cover XOR distances starting at 2^i, 2^(i+1), 2^(i+2). A node on the path can be skipped.]
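In Kademlia the distance between two identifiers is their bitwise XOR, and row i of the routing table holds nodes whose distance falls in [2^i, 2^(i+1)). A minimal sketch with illustrative function names:

```python
from typing import List

# XOR metric and routing table row selection as used by Kademlia.
# Assumption: function names are illustrative.

def xor_distance(a: int, b: int) -> int:
    return a ^ b

def bucket_index(own_id: int, other_id: int) -> int:
    """Row i such that 2^i <= distance < 2^(i+1)."""
    distance = own_id ^ other_id
    if distance == 0:
        raise ValueError("a node does not store itself")
    return distance.bit_length() - 1

def closest_nodes(target: int, node_ids: List[int]) -> List[int]:
    """Sort known nodes by XOR distance to the target."""
    return sorted(node_ids, key=lambda node: node ^ target)

if __name__ == "__main__":
    print(xor_distance(0b1010, 0b0110))
    print(bucket_index(0b1010, 0b0110))
    print(closest_nodes(5, [1, 4, 7]))
```

Because XOR is symmetric, a node learns useful routing entries from every request it receives, which is one reason no separate neighbor table is needed.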

Chord: recursive routing [figure: node id=x with a neighbor table and a routing table; routing table rows start at x+2^i, x+2^(i+1). A node on the path can be skipped.]

Chord: iterative routing [figure: node id=x with a neighbor table and a routing table; routing table rows start at x+2^i, x+2^(i+1). A node on the path can be skipped.]

Relay selection. Using peers as relays: a peer acting as a relay can preallocate a fixed number of calls (Skype: one voice/video call per relay), or can preallocate resources (CPU, bw), as long as the user of the relay machine is not ‘annoyed’. What does annoy mean?

Relay selection. Annoyance function af(): threshold based (af() < threshold, use as a relay) or real-valued. Input parameters: CPU utilization, interactivity, bytes sent/rcvd. Relay selection approach: constraint (RTT, loss rate, uptime); select a relay set; load-balance approach; annoyance function approach.
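The threshold-based variant (use a peer as a relay when af() < threshold) can be sketched as follows. The slides name the inputs (CPU utilization, interactivity, bytes sent/rcvd) but not how they are combined, so the weights, the bandwidth budget and every name below are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List

# Threshold-based relay eligibility using a made-up annoyance function.
# Assumption: the weights (0.5, 0.3, 0.2) and the bandwidth budget are
# invented; the slides do not define af().

@dataclass
class PeerLoad:
    cpu_utilization: float    # fraction of CPU in use, 0.0 .. 1.0
    user_interactive: bool    # is the user actively using the machine?
    bytes_per_second: float   # current traffic sent plus received

def annoyance(load: PeerLoad, bandwidth_budget: float = 50_000.0) -> float:
    bandwidth_share = min(load.bytes_per_second / bandwidth_budget, 1.0)
    interactivity = 1.0 if load.user_interactive else 0.0
    return 0.5 * load.cpu_utilization + 0.3 * interactivity + 0.2 * bandwidth_share

def eligible_relays(peers: Dict[str, PeerLoad], threshold: float) -> List[str]:
    """Peers whose annoyance score is below the threshold."""
    return [name for name, load in peers.items() if annoyance(load) < threshold]

if __name__ == "__main__":
    peers = {
        "idle": PeerLoad(0.1, False, 0.0),
        "busy": PeerLoad(0.9, True, 50_000.0),
    }
    print(eligible_relays(peers, threshold=0.5))
```

The real-valued variant mentioned on the slide would rank candidates by the same score instead of cutting them off at a threshold.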

Relay selection algorithm. Routing table based: call load to number of relays in routing table. AS number based: select a relay within the same AS, but there may be too many machines in one AS, or none … IP prefix based. Random.
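As one illustration, IP prefix based selection can be combined with the random strategy as a fallback. That combination, the prefix length and all names are assumptions (the slide lists the strategies as separate alternatives), and the sketch handles IPv4 addresses only.

```python
import ipaddress
import random
from typing import List

# Prefix-based relay choice with a random fallback.
# Assumption: the fallback, the default /16 prefix and the function names
# are illustrative, not taken from OpenVoIP.

def same_prefix(a: str, b: str, prefix_len: int = 16) -> bool:
    """True if address b lies in the /prefix_len network containing a."""
    network = ipaddress.ip_network("%s/%d" % (a, prefix_len), strict=False)
    return ipaddress.ip_address(b) in network

def pick_relay(caller_ip: str, candidates: List[str],
               prefix_len: int = 16, rng=random) -> str:
    """Prefer a relay sharing the caller's prefix; otherwise pick at random."""
    nearby = [c for c in candidates if same_prefix(caller_ip, c, prefix_len)]
    return rng.choice(nearby or candidates)

if __name__ == "__main__":
    relays = ["198.51.100.1", "192.0.2.77"]
    print(pick_relay("192.0.2.10", relays, prefix_len=24))
```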

Relay selection algorithm. Churn: what happens when a relay goes down? Active vs. passive approach. Active: send redundant traffic through alternate relays. Passive: detect the failure and then switch. Different relays for media traversing in each direction: for 18% of calls (18K total), Skype uses a different relay from caller to callee and vice versa.