Peer-to-peer systems for autonomic VoIP and web hotspot handling


Peer-to-peer systems for autonomic VoIP and web hotspot handling
Kundan Singh, Weibin Zhao and Henning Schulzrinne
Internet Real Time Laboratory, Computer Science Dept., Columbia University, New York
IBM Delhi (Jan. 2006)
http://www.cs.columbia.edu/IRT/p2p-sip
http://www.cs.columbia.edu/IRT/dotslash

P2P for autonomic computing
- Autonomic at the application layer:
  - robust against partial network faults
  - resources grow as the user population grows
  - self-configuring
- Traditional p2p systems: file storage
  - motivation is often legal, not technical efficiency
  - usually unstructured, optimized for Zipf-like popularity
- Other p2p applications:
  - Skype demonstrates usefulness for VoIP: identifier lookup; NAT traversal for media
  - OpenDHT (and similar) as emerging common infrastructure?
  - non-DHT systems with smaller scope: web hotspot rescue
  - network management (see our IRTF slides)

Aside: middle services instead of middleware
- Common and successful network services:
  - identifier lookup: ARP, DNS
  - network storage: proprietary (Yahoo, .mac, ...)
  - storage + computation: CDNs
- Emerging network services:
  - peer-to-peer identifier lookup
  - network storage
  - network computation ("utility"), maybe programmable; already found as web hosting and grid computing

What is P2P?
- Share the resources of individual peers: CPU, disk, bandwidth, information, ...
- Taxonomy of computer systems: centralized (mainframes, workstations) vs. distributed; distributed systems are client-server (flat: DNS, mount, RPC, HTTP; or hierarchical) or peer-to-peer (pure: Gnutella, Chord; or hybrid: Napster, Groove, Kazaa)
- Application classes: file sharing (Napster, Gnutella, Kazaa, Freenet, Overnet), communication and collaboration (Magi, Groove, Skype), distributed computing (SETI@Home, folding@Home)
[Speaker notes] Another category is the infrastructure component, which defines the routing mechanism to use: Chord, CAN, etc. Groove is a proprietary P2P system for office collaboration; every peer in the group sees the same user space, synchronized across all peers, and all communications are signed. Magi aimed to be standards-based (HTTP, XML, WebDAV) for document sharing and communication on the Internet; it uses dynamic DNS to register Magi nodes, and all communications are signed.

Distributed Hash Table (DHT)
- Types of search:
  - central index (Napster)
  - distributed index with flooding (Gnutella)
  - distributed index with hashing (Chord, Bamboo, ...)
- Basic operations: find(key), insert(key, value), delete(key), but no search(*) (see the interface sketch below)
- Design points and their costs:

  Design                              Search time/messages   Join/leave messages
  every peer has the complete table   O(1)
  partial table per peer (Chord)      O(log N)               O((log N)^2)
  every peer holds one key/value      O(N)
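To make the interface concrete, here is a minimal single-process sketch in Python of the three operations named above. The class name, the SHA-1 keying and the multi-value handling are our illustration, not code from the talk; a real DHT partitions this table across peers and routes each key to its responsible node.

```python
import hashlib

def dht_key(name: str) -> int:
    """Map an arbitrary identifier to a 160-bit key (as Chord does with SHA-1)."""
    return int(hashlib.sha1(name.encode()).hexdigest(), 16)

class LocalDHT:
    """Single-process stand-in for the DHT interface: find/insert/delete, no search(*)."""
    def __init__(self):
        self.table = {}

    def insert(self, name: str, value) -> None:
        self.table.setdefault(dht_key(name), []).append(value)

    def find(self, name: str):
        return self.table.get(dht_key(name), [])

    def delete(self, name: str) -> None:
        self.table.pop(dht_key(name), None)

dht = LocalDHT()
dht.insert("alice@example.com", "128.59.16.1")
print(dht.find("alice@example.com"))   # ['128.59.16.1']
```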

CAN: Content Addressable Network
- Divide a d-dimensional coordinate space into zones, one per node
- Each key maps to one point in the space
- Each node is responsible for all the keys in its zone
[Figure: a 2-D unit coordinate space (0.0 to 1.0 on each axis) divided into zones owned by nodes A-E]

CAN (continued)
- State: 2d neighbors per node; search: O(d * N^(1/d)) hops (a key-to-point sketch follows)
- Join: node Z picks a point, e.g. (x,y) = (.3,.1); node X, which owns the enclosing zone, locates the point and splits its zone with Z; when joining, use the neighbor that is least loaded
- Fault tolerance: know your neighbors' neighbors; when a node fails, one of its neighbors takes over its zone
- Node removal: use heartbeats to detect failure, then recover the structure and repair routing (background zone reassignment)
- If adjacent nodes fail at the same time, use flooding (an expanding ring search) to rebuild neighbor information before repairing the simultaneous failures
[Figure: node Z joins the 2-D space; node X locates the point (.3,.1) and hands over part of its zone]
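As a small illustration of how a key lands in the coordinate space, the sketch below hashes a name to a point in the d-dimensional unit cube; the node whose zone contains that point stores the key. The byte-slicing construction is our assumption for illustration, not CAN's specified hash.

```python
import hashlib

def can_point(key: str, d: int = 2) -> tuple:
    """Hash a key to a point in [0,1)^d; the zone owner of that point stores the key."""
    digest = hashlib.sha1(key.encode()).digest()
    coords = []
    for i in range(d):
        # Use 4 bytes of the digest per dimension, scaled into [0, 1).
        chunk = int.from_bytes(digest[4 * i: 4 * i + 4], "big")
        coords.append(chunk / 2**32)
    return tuple(coords)

print(can_point("alice@example.com"))  # e.g. (0.27..., 0.81...)
```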

Chord
- Identifier circle; keys are assigned to their successor node
- Keys and nodes are evenly distributed
- Lookup is based on a skiplist-like structure: search in O(log N), state O(log N)
[Figure: identifier circle with nodes such as 1, 8, 14, 21, 32, 38, 42, 47, 54 and 58, and keys such as 10, 24 and 30 stored at their successors]

Chord (continued)
- Finger table of size log N: the i-th finger points to the first node that succeeds n by at least 2^(i-1)
- Stabilization repairs the structure after join/leave
- Lookup: forward to the furthest finger-table node that precedes the key
- In a system with N nodes and K keys, with high probability:
  - each node receives at most K/N keys
  - each node maintains information about O(log N) other nodes
  - lookups are resolved with O(log N) hops
- No network locality; replicas need explicit consistency

Finger table of node 8 (reproduced in code below):
  start (8 + 2^(i-1))   successor node
  8 + 1  = 9            14
  8 + 2  = 10           14
  8 + 4  = 12           14
  8 + 8  = 16           21
  8 + 16 = 24           32
  8 + 32 = 40           42

[Speaker notes] Optimizations: weight neighbor nodes by RTT; when routing, choose the neighbor closer to the destination with the lowest RTT from here, reducing path latency. Run multiple virtual nodes per physical node. What if a node leaves?
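The sketch below reproduces the node-8 finger table for the ring shown on these slides (nodes 1, 8, 14, 21, 32, 38, 42, 47, 54, 58, identifier space of size 64); the helper names are ours.

```python
M = 64  # 6-bit identifier circle
NODES = sorted([1, 8, 14, 21, 32, 38, 42, 47, 54, 58])

def successor(ident: int) -> int:
    """First node clockwise from ident on the circle."""
    for n in NODES:
        if n >= ident:
            return n
    return NODES[0]  # wrap around

def finger_table(n: int):
    """The i-th finger is successor(n + 2^(i-1)) for i = 1..log2(M)."""
    starts = [(n + 2 ** i) % M for i in range(6)]
    return [(s, successor(s)) for s in starts]

for start, node in finger_table(8):
    print(f"start {start:2d} -> node {node}")
# start  9 -> node 14 ... start 40 -> node 42, matching the table above.
```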

Tapestry
- Node IDs use digits of base B = 2^b
- Route to the node numerically closest to the given key, fixing one more digit per hop: **4 => *64 => 364 (a toy routing sketch follows)
- Similar to CIDR, but suffix-based
- The routing table has O(B) columns, one per digit value; level N holds neighbors sharing the last N digits with this node

Routing table of node 364 ('?' = any digit):
  level N=2: 064 164 264 364 464 564 664
  level N=1: ?04 ?14 ?24 ?34 ?44 ?54 ?64
  level N=0: ??0 ??1 ??2 ??3 ??4 ??5 ??6

[Figure: example mesh with nodes 427, 763, 364, 123, 324, 365, 135 and 564]
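A toy rendering of suffix routing in Python, using the node IDs from the figure. The flat peer list and the tie-breaking rule are our simplification: a real node holds one routing-table entry per level and digit, so each hop extends the matching suffix by exactly one digit.

```python
def suffix_match_len(a: str, b: str) -> int:
    """Number of trailing digits on which a and b agree."""
    n = 0
    for x, y in zip(reversed(a), reversed(b)):
        if x != y:
            break
        n += 1
    return n

def next_hop(current: str, key: str, peers: list) -> str:
    """Pick a peer that extends the suffix match by at least one digit."""
    need = suffix_match_len(current, key) + 1
    candidates = [p for p in peers if suffix_match_len(p, key) >= need]
    if not candidates:
        return current  # no closer node known
    # Picking the weakest qualifying match mimics the one-digit-per-hop progression.
    return min(candidates, key=lambda p: suffix_match_len(p, key))

peers = ["427", "763", "364", "123", "324", "365", "135", "564"]
hop, key = "123", "364"
path = [hop]
while hop != key:
    nxt = next_hop(hop, key, peers)
    if nxt == hop:
        break
    hop = nxt
    path.append(hop)
print(" -> ".join(path))  # 123 -> 324 -> 564 -> 364: **4 => *64 => 364
```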

Pastry
- Prefix-based routing: route to a node whose ID shares a prefix with the key that is at least one digit longer than this node's shared prefix
- Each node keeps a neighbor set, a leaf set and a routing table
- Routing step: find a node ID in the routing table with a longer shared prefix than this node's; failing that, find a node in the leaf set with the same prefix length but numerically closer to the key
[Figure: Route(d46a1c) from node 65a1fc via d13da3, d4213f and d462ba, with nearby nodes d467c4 and d471f1]
[Speaker notes] The leaf set |L| handles concurrent failures: delivery is guaranteed unless more than |L|/2 nodes with adjacent IDs fail simultaneously. A neighbor set |M| of physically close nodes is also maintained. If the key falls in the leaf-set range, route directly to the numerically closest node ID.

Other schemes
- Distributed trie, Viceroy, Kademlia, SkipGraph, Symphony, ...
[Speaker notes]
- Distributed trie: no explicit leave; operations are initial join, insert and lookup. Each trie node has l entries and 2^m child tables; each entry holds a peer address a and a timestamp t (a leaf entry means peer a held the value at time t; an entry in the i-th table means a held that child node at t). If a peer holds a node, it must hold all its ancestors. Attack-resistant, since no fixed node owns a key; good for frequently accessed keys, but with stale views it degenerates toward broadcast.
- Viceroy: based on a butterfly network; a node sits at one of log N levels, with a binary search tree at each node.
- Kademlia: XOR distance, which is symmetric (see the small example below); an interval of nodes instead of a single node per level; lookup paths converge to the same nodes, so caching is possible.
- SkipNet/SkipGraph: exploits locality, since it sorts by keys rather than numeric IDs; bad when node failure probability is high, as searches may fail on node failures.
- Symphony: similar to Chord, but with ranges of nodes and randomization.
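For the Kademlia note above, a tiny example of the XOR metric; the four-bit IDs are invented for illustration. Because the distance is symmetric, lookups for the same key from different starting points converge along the same paths, which is what makes caching effective.

```python
def xor_distance(a: int, b: int) -> int:
    """Kademlia's distance metric: symmetric, since a ^ b == b ^ a."""
    return a ^ b

nodes = [0b0001, 0b0101, 0b1011, 0b1110]
key = 0b1010
closest = min(nodes, key=lambda n: xor_distance(n, key))
print(bin(closest))  # 0b1011 -- differs from the key only in the last bit
```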

DHT comparison

  Property     Unstructured           CAN            Chord          Tapestry/Pastry
  Routing      O(N) or no guarantee   d * N^(1/d)    O(log N)       O(log_B N)
  State        constant               2d             O(log N)       B * log_B N
  Join/leave                                         O((log N)^2)

  Reliability and fault resilience:
  - Unstructured: data at multiple locations; retry on failure; finding popular content is efficient
  - CAN: multiple peers for each data item; retry on failure; multiple paths to the destination
  - Chord: replicate data on consecutive peers; retry on failure
  - Tapestry/Pastry: replicate data on multiple peers; keep multiple paths to each peer
  - Viceroy: routing load is evenly distributed among participating lookup servers

Server-based vs. peer-to-peer
- Reliability, failover latency:
  - server: DNS-based; depends on client retry timeout, DB replication latency, registration refresh interval
  - p2p: DHT self-organization and periodic registration refresh; depends on client timeout and registration refresh interval
- Scalability, number of users:
  - server: depends on the number of servers in the two stages
  - p2p: depends on refresh rate, join/leave rate, uptime
- Call setup latency:
  - server: one or two steps
  - p2p: O(log N) steps
- Security:
  - server: TLS, digest authentication, S/MIME
  - p2p: additionally needs a reputation system and ways to work around spy nodes
- Maintenance, configuration:
  - server: administrator handles DNS, database, middle-boxes
  - p2p: automatic; only one-time bootstrap node addresses
- PSTN interoperability:
  - server: gateways, TRIP, ENUM
  - p2p: interact with server-based infrastructure, or co-locate a peer node with the gateway

The basic SIP service
- HTTP retrieves a resource identified by a URI; SIP translates an address-of-record SIP URI (sip:alice@example.com) to one or more contacts (hosts or other AORs, e.g., sip:alice@128.59.16.1)
- Single user -> multiple hosts (e.g., home, office, mobile, secretary), which can be tried equally or ordered sequentially
- Thus, SIP is (also) a binding protocol, similar in spirit to mobile IP, except at the application layer and without some of the related issues
- This function is performed by the SIP proxy for the AOR's domain and logically delegated to a location server
- It is this function that p2p approaches replace

What is SIP? Why P2P-SIP?
- Client-server SIP: (1) Alice's host REGISTERs alice@columbia.edu => 128.59.19.194 with the columbia.edu server; (2) Bob's host sends INVITE alice@columbia.edu; (3) the server answers with Contact: 128.59.19.194
- Problem with client-server: maintenance, configuration, controlled infrastructure
- P2P-SIP: replace the central server with a peer-to-peer network; (1) REGISTER into the overlay, (2) INVITE alice, (3) 128.59.19.194
- No central server, but more lookup latency
- What we gain: reliability, scalability
- What we lose: latency bounds on INVITE; search features

How to combine SIP + P2P?
- SIP-using-P2P: replace the SIP location service by a P2P protocol (P2P lookup, SIP proxies on top)
  - reuses an optimized and well-defined external P2P network
  - defines a P2P location-service interface to be used in SIP
  - extends to other signaling protocols (H.323)
  - doesn't overload SIP or REGISTER; lookup is separate from call signaling
  - flow: Alice INSERTs (alice, 128.59.19.194) into the P2P network; the caller FINDs alice, then sends INVITE sip:alice@128.59.19.194 directly
- P2P-over-SIP: additionally, implement the P2P maintenance itself using SIP messaging
  - no change in SIP semantics; no dependence on an external P2P network
  - reuses existing features such as forking (for voice mail)
  - built-in NAT/media relays
  - additional message overhead due to SIP
  - flow: REGISTER maintains the P2P-SIP overlay; INVITE alice is routed through the overlay

Design alternatives
- Use a DHT within a server farm (servers form the ring; clients attach to servers)
- Use a DHT for all clients - but some are resource-limited
- Use a DHT among super-nodes only, with ordinary nodes attached (hierarchy)
- Dynamically adapt between these
[Figure: a Chord ring of servers (IDs such as 1, 8, 14, 21, 32, 38, 42, 47, 54, 58) with attached clients, and a Pastry-style overlay routing Route(d46a1c)]

Deployment scenarios
- There are three components in a client-server SIP architecture: user agents, proxies and databases; a P2P network can be formed at any of these levels
- P2P clients: plug and play; may use adaptors; untrusted peers
- P2P proxies: zero-configuration server farm; trusted servers and user identities
- P2P database: global (e.g., OpenDHT); clients or proxies can use it; trusted deployed peers
- The scenarios trade off ease of deployment, ease of integration with existing SIP clients and proxies, and reusability with other protocols and applications
- Different scenarios have different trust models!
- Interoperate among these!

Hybrid architecture
- To honor administrative boundaries and allow incremental deployment, multiple P2P networks need to interoperate with each other and with the client-server SIP architecture
- Option 1: cross-register user registrations from one P2P network into the others; this is problematic with many networks, since every network must store all registrations from the others
- Option 2: locate the destination user in the other network during call setup; the global domain lookup can use DNS, or a P2P-SIP hierarchy (e.g., a global P2P-SIP network for domain lookups); lookup latency remains O(log N)

What else can be P2P?
- Rendezvous/signaling (SIP)
- Configuration storage: user profiles, encrypted passwords/keys
- Media storage (e.g., voice mail): P2P storage with redundancy/replication and message-waiting notifications
- Identity assertion (?): P2P node and user identity verification instead of relying on central CAs; this in turn helps detect and identify malicious nodes
- PSTN gateway (?): locate the best-cost PSTN gateway for a particular number on the P2P network
- NAT/media relay: find the best relay, at call start vs. during the call (if the old relay node leaves)
- Trust models differ across these components!

What is our P2P-SIP?
- Unlike the server-based SIP architecture: robust and efficient lookup using a DHT
- Unlike the proprietary Skype architecture: interoperability; the DHT algorithm uses SIP communication; hybrid architecture; lookup in SIP+P2P
- Unlike file-sharing applications: different requirements for data storage, caching, delay and reliability
- Disadvantages: lookup delay and security

Implementation: SIPpeer
- Platform: Unix (Linux), C++
- Modes:
  - Chord: using SIP for P2P maintenance (join, leave, failure, lookup; ordinary node vs. super-node; node naming via URI; authentication via email; REGISTER as the maintenance message)
  - OpenDHT: external P2P data storage based on the Bamboo DHT, running on PlanetLab nodes (connect to one or more OpenDHT servers, refreshing dynamically if a server leaves; signing and verification of contacts)
- Scenarios: P2P clients, P2P proxies, an adaptor for existing phones (Cisco, X-Lite, Windows Messenger, SIPc), server farm

P2P-SIP: identifier lookup
- The P2P network serves as the SIP location server: address-of-record -> contacts, e.g., alice@example.com -> 128.59.16.1, 128.72.50.13
- Multi-valued: (key_n, value_1), (key_n, value_2), with limited TTL (a minimal sketch follows)
- Variant: point to a SIP proxy server, either operated by a supernode or a traditional server
  - allows registration of non-p2p SIP domains (*@example.com)
  - easier to provide call routing services (e.g., CPL)
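A minimal sketch of this location-service mapping: multi-valued bindings (AOR -> several contacts), each with a limited TTL so stale contacts expire on their own. The class and method names are illustrative, not SIPpeer's API.

```python
import time

class LocationService:
    """Toy stand-in for the p2p location service: AOR -> {contact: expiry}."""
    def __init__(self):
        self.bindings = {}

    def register(self, aor: str, contact: str, ttl: float = 3600) -> None:
        self.bindings.setdefault(aor, {})[contact] = time.time() + ttl

    def lookup(self, aor: str) -> list:
        now = time.time()
        live = {c: t for c, t in self.bindings.get(aor, {}).items() if t > now}
        self.bindings[aor] = live   # drop expired bindings as a side effect
        return list(live)

ls = LocationService()
ls.register("alice@example.com", "128.59.16.1")
ls.register("alice@example.com", "128.72.50.13")
print(ls.lookup("alice@example.com"))  # both contacts, until their TTLs expire
```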

Background: DHT (Chord) - recap
- Identifier circle; keys assigned to their successor; evenly distributed keys and nodes
- Finger table of size log N: the i-th finger points to the first node that succeeds n by at least 2^(i-1); stabilization handles join/leave
- Lookup forwards to the furthest finger-table node that precedes the key
- With N nodes and K keys, with high probability: each node receives at most K/N keys, maintains information about O(log N) other nodes, and lookups resolve in O(log N) hops
- No network locality; replicas need explicit consistency
[Figure: the node-8 finger table and identifier circle from the earlier Chord slides]
[Speaker notes] Optimizations: weight neighbors by RTT and route via the closer-to-destination neighbor with the lowest RTT; multiple virtual nodes per physical node. What if a node leaves?

Implementation: SIPpeer (architecture)
This is just an example architecture.
[Architecture diagram: a user interface (buddy list, etc.) on top of a user location module and a DHT (Chord); SIP carries REGISTER, INVITE and MESSAGE; media uses RTP/RTCP, codecs and audio devices; ICE and multicast REGISTER handle NAT detection and peer discovery. Events: on startup, discover and join; signup and find buddies; IM and call; on reset, sign out and transfer; leave. The split illustrates SIP-over-P2P vs. P2P-using-SIP.]

P2P vs. server-based SIP
- Prediction: P2P for smaller and quick-setup scenarios; server-based for corporate and carrier deployments
- Need a federated system: multiple p2p systems, identified by DNS domain names, with gateway nodes
- Scale: 2,000 requests/second ≈ 7 million registered users (7,000,000 users re-registering once an hour generate about 7,000,000 / 3600 ≈ 1,944 requests/second)

Open issues
- Presence and IM: where to store presence information; needs access authorization
- Performance: how many supernodes are needed? (Skype: ~1000)
- Reliability: P2P nodes generally replicate data; if the proxy or presence agent sits at a leaf, its data needs replication too
- Security: Sybil attacks (blackholing supernodes); identifier protection (protect the first registrant against identity theft); anonymity and encryption; protecting voicemails on storage nodes
- Optimization: locality, proximity, media routing
- Deployment: SIP-using-P2P vs. P2P-over-SIP; intranets; ISP servers
- Motivation: why should I run as a super-node?

Comparison of P2P and server-based systems

  Property     Server-based                      P2P
  Scaling      scales with server count          scales with user count, but limited by supernode count
  Efficiency   most efficient                    DHT maintenance = O((log N)^2)
  Security     trust the server provider;        trust most supernodes;
               binary trust                      probabilistic trust
  Reliability  server redundancy; catastrophic   unreliable supernodes; catastrophic
               failure possible                  failure unlikely

Using P2P for binding updates
- Proxies do more than plain identifier translation:
  - translation may depend on who's asking, the time of day, ... (e.g., based on script output)
  - hide the full range of contacts from the caller
  - sequential and parallel forking
  - disconnected services, e.g., forward to voicemail if no answer
- Using a DHT as a location service -> only plain translation; instead:
  - run services on end systems, or
  - run proxy services on supernode(s) and use the proxy as the contact -> needs replication for reliability (the Skype approach)

Reliability and scalability
- Two-stage architecture for CINEMA: stateless first-stage proxies (s1, s2, s3) fan out to master/slave server groups; a1/a2 serve a*@example.com and b1/b2 serve b*@example.com; one group can become the backup for another

  example.com    _sip._udp  SRV 0 40 s1.example.com
                            SRV 0 40 s2.example.com
                            SRV 0 20 s3.example.com
                            SRV 1 0  ex.backup.com
  a.example.com  _sip._udp  SRV 0 0  a1.example.com
                            SRV 1 0  a2.example.com
  b.example.com  _sip._udp  SRV 0 0  b1.example.com
                            SRV 1 0  b2.example.com

- Requests such as sip:bob@example.com are proxied to sip:bob@b.example.com
- Request rate = f(#stateless, #groups)
- Bottleneck: CPU, memory or bandwidth? Failover latency: ?

SIP p2p summary
- Advantages: out-of-box experience; robust (catastrophic failure unlikely); inherently scalable (more so with more nodes)
- Status: IETF involvement; Columbia SIPpeer
- Security issues: trust and reputation (malicious nodes, Sybil attacks); SPAM, DDoS; privacy and anonymity (?)
- Other issues: lookup latency and proximity; P2P-over-SIP vs. SIP-using-P2P; why should I run as a super-node?
- More: http://www.p2psip.org and http://www.cs.columbia.edu/IRT/p2p-sip

DotSlash: An Automated Web Hotspot Rescue System
Weibin Zhao and Henning Schulzrinne

The problem
- Web hotspots, also known as flash crowds or the Slashdot effect: short-term, dramatic load spikes at web servers
- Existing mechanisms are not sufficient:
  - over-provisioning: inefficient for rare events, and difficult because the peak load is hard to predict
  - CDNs: expensive for the small web sites that experience the Slashdot effect

The challenges
- Automate hotspot handling: eliminate human intervention to react quickly and improve availability during critical periods ("15 minutes of fame")
- Allocate resources dynamically: static configuration is insufficient for unexpected, dramatic load spikes
- Address different bottlenecks: access network, web server, application server and database server

Our approach
- DotSlash: an automated web hotspot rescue system that builds an adaptive, distributed web server system on the fly
- Advantages:
  - fully self-configuring, no manual configuration: service discovery, adaptive control, dynamic virtual hosting
  - scalable and easy to use
  - works for static and LAMP applications; handles network, CPU and database server bottlenecks
  - transparent to clients (cf. CoralCache)

DotSlash overview
- Rescue model: a mutual-aid community using spare capacity; potential usage by web hosting companies
- Components: workload monitoring, rescue server discovery, load migration (request redirection), dynamic virtual hosting, adaptive rescue and overload control

Handling load spikes
- Request redirection: DNS round-robin reduces the arrival rate; HTTP redirect increases the service rate (a sketch of probabilistic redirection follows the table)
- Handle different bottlenecks:

  Technique                        Bottleneck addressed
  cache static content             network, web server
  replicate scripts dynamically    application server
  cache query results on demand    database server
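The probabilistic-redirection sketch promised above: the origin answers each request locally or bounces it to a rescue server with some probability, which the adaptive controller raises or lowers with load. The rescue host names and the fixed probability are invented for illustration, and DotSlash itself implements this inside Apache, not in Python.

```python
import random

rescue_servers = ["rescue1.dot-slash.net", "rescue2.dot-slash.net"]  # illustrative hosts
redirect_probability = 0.6   # in DotSlash, set by the overload controller

def handle(request_path: str) -> str:
    """Redirect with probability p to shed load; otherwise serve locally."""
    if rescue_servers and random.random() < redirect_probability:
        target = random.choice(rescue_servers)
        return f"302 Found: http://{target}{request_path}"
    return f"200 OK: serving {request_path} locally"

for _ in range(5):
    print(handle("/index.html"))
```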

Rescue example: cache static content
[Figure: client1 contacts the origin server (1), receives an HTTP redirect (2), then fetches the content from the rescue server acting as a reverse proxy (3, 4); client2 queries the DNS server (1, 2), which uses DNS round robin to return the rescue server's address, and fetches from it directly (3, 4)]

Rescue example (2): replicate scripts dynamically
[Figure: the client is redirected from the origin server (Apache + PHP) to the rescue server (Apache + PHP), which fetches the PHP script from the origin on demand (steps 1-8); both servers' scripts query the origin's MySQL database server]

Rescue example (3): cache query results on demand
[Figure: both the origin server and the rescue server run a data driver with a query result cache in front of the origin's database server; client queries are answered from the caches when possible]

Allocate rescue server
- Server states (sketched in code below):
  - normal state
  - SOS state (origin server, getting help from others): entered by allocating a rescue server; "release all rescues" returns to normal
  - rescue state (rescue server, providing help to others): entered by accepting an SOS request; "shutdown all rescues" returns to normal
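The state machine rendered in Python; the method names mirror the transitions on the slide but the class itself is our invention, not DotSlash code.

```python
from enum import Enum

class State(Enum):
    NORMAL = "normal"
    SOS = "SOS"        # origin server: getting help from others
    RESCUE = "rescue"  # rescue server: providing help to others

class Server:
    def __init__(self):
        self.state = State.NORMAL

    def allocate_rescue(self):          # origin becomes overloaded
        assert self.state in (State.NORMAL, State.SOS)
        self.state = State.SOS

    def release_all_rescues(self):      # the load spike is over
        assert self.state is State.SOS
        self.state = State.NORMAL

    def accept_sos_request(self):       # spare capacity offered to a peer
        assert self.state in (State.NORMAL, State.RESCUE)
        self.state = State.RESCUE

    def shutdown_all_rescues(self):
        assert self.state is State.RESCUE
        self.state = State.NORMAL

s = Server()
s.accept_sos_request()      # normal -> rescue
s.shutdown_all_rescues()    # rescue -> normal
```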

Handling load spikes (continued)
- Load migration: DNS round-robin reduces the arrival rate; HTTP redirect increases the service rate; together they increase throughput
- Benefits:
  - reduce the origin server's network load by caching static content at rescue servers
  - reduce the origin web server's CPU load by replicating scripts dynamically to rescue servers

Adaptive overload control
- Objective: keep CPU and network load in the desired load region (a control-loop sketch follows)
- Origin server: allocate/release rescue servers; adjust the redirect probability
- Rescue server: accept SOS requests; shut down rescues; adjust the allowed redirect rate
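A minimal sketch of such a control loop, assuming a desired load region of 60-80% (the next slide mentions ~70%) and a fixed 10% adjustment step; both numbers are illustrative, not DotSlash's tuned parameters.

```python
DESIRED_LOW, DESIRED_HIGH = 0.6, 0.8   # desired load region (illustrative)

def adjust(redirect_prob: float, utilization: float) -> float:
    """Raise the redirect probability when overloaded, lower it when idle."""
    if utilization > DESIRED_HIGH:
        redirect_prob = min(1.0, redirect_prob + 0.1)   # shed more load
    elif utilization < DESIRED_LOW:
        redirect_prob = max(0.0, redirect_prob - 0.1)   # keep more traffic
    return redirect_prob

p = 0.0
for load in [0.95, 0.90, 0.85, 0.75, 0.50]:   # measured each control period
    p = adjust(p, load)
    print(f"load={load:.2f} -> redirect probability {p:.1f}")
```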

Self-configuring
- Rescue server discovery via SLP and DNS SRV
- Dynamic virtual hosting: serve a new site's content on the fly, using "pre-positioned" Apache virtual hosts
- Workload monitoring: network and CPU, taking request headers and responses into account
- Adaptive rescue control: the precise load-handling capacity of rescue servers is unknown (particularly for active content), so establish a desired load region (typically ~70%), periodically measure and adjust the redirect probability, and convey it via the protocol

Implementation
- Based on LAMP (Linux, Apache, MySQL, PHP)
- Components: an Apache module (mod_dots), the DotSlash daemon (dotsd), and the DotSlash rescue protocol (DSRP)
- Dynamic DNS using BIND with the dot-slash.net domain
- Service discovery using enhanced SLP (mSLP)
[Diagram: HTTP clients reach Apache with mod_dots, which shares state with dotsd over shared memory; dotsd speaks DSRP to other dotsd instances, SLP to mSLP, and DNS to BIND]

Handling file inclusions
- The problem: a replicated script may include files that are located at the origin server (assumption: included files live under DocumentRoot)
- Approaches:
  - rename inclusion statements: requires parsing scripts - heavyweight
  - customized error handler: catch inclusion errors - lightweight (sketched below)
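A rough Python rendering of the error-handler idea: on an inclusion miss, fetch the file from the origin server, cache it under the local DocumentRoot, and proceed. DotSlash implements this inside its Apache/PHP environment, not in Python; the origin URL, paths and helper function here are hypothetical.

```python
import os
import urllib.request

ORIGIN = "http://origin.example.com"   # hypothetical origin server
DOC_ROOT = "/tmp/rescue-docroot"       # local cache standing in for DocumentRoot

def include_file(rel_path: str) -> str:
    """Serve an included file, fetching it from the origin on a local miss."""
    local = os.path.join(DOC_ROOT, rel_path)
    if not os.path.exists(local):              # the "inclusion error" case
        os.makedirs(os.path.dirname(local), exist_ok=True)
        with urllib.request.urlopen(f"{ORIGIN}/{rel_path}") as resp:
            with open(local, "wb") as f:
                f.write(resp.read())           # cache for later includes
    with open(local) as f:
        return f.read()
```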

Evaluation
- Workload generation: httperf for static content; RUBBoS (a bulletin-board benchmark) for dynamic content
- Testbed: a LAN cluster and WAN (PlanetLab) nodes; Red Hat Linux 9.0, Apache 2.0.49, MySQL 4.0.18, PHP 4.3.6
- Metrics: maximum supported request rate and data rate

Results in LANs
[Figures: request rate, redirect rate and rescue rate over time; data rate over time]

Handling worst-case workload
- Settling time: 24 seconds
- Timeouts: 921 of 113,565 requests

Results for dynamic content
- Configuration: the origin server and the database server run on high-capacity (HC) machines; nine rescue servers run on low-capacity (LC) machines
- No rescue: R = 118 requests/s; CPU: origin 100%, DB 45%
- With rescue (9 rescue servers): R = 245 requests/s; CPU: origin 55%, DB 100%
- 245/118 > 2: throughput more than doubles

Caching TTL and hit ratio (read-only)
[Figure: query-result cache hit ratio as a function of caching TTL for the read-only workload]

CPU utilization (read-only)
[Figure, three panels: with rescue, no cache; READ4 - with rescue, with co-located cache; READ5 - with rescue, with shared cache]

Request rate (read-only)
[Figure, three panels: with rescue, no cache; READ4 - with rescue, with co-located cache; READ5 - with rescue, with shared cache]

CPU utilization (submission)
[Figure, three panels: with rescue, no cache; SUB5 - with rescue, with cache, no invalidation; SUB6 - with rescue, with cache, with invalidation]

Request rate (submission)
[Figure, three panels: with rescue, no cache; SUB5 - with rescue, with cache, no invalidation; SUB6 - with rescue, with cache, with invalidation]

Performance
- Static content (httperf): 10-fold improvement; relieves the network and web server bottlenecks
- Dynamic content (RUBBoS): completely removes the web/application server bottleneck and relieves the database server bottleneck
- Overall improvement: 10x for the read-only mix, 5x for the submission mix

Conclusion
- DotSlash prototype: applicable to both static and dynamic content; promising performance improvement; released as open-source software
- On-going work: address security issues in deployment; extensible to SIP servers? web services?
- For further information: http://www.cs.columbia.edu/IRT/dotslash
  - DotSlash framework: WCW 2004
  - dynamic script replication: Global Internet 2005
  - on-demand query result cache: TR CUCS-035-05 (under submission)