Reliable and Scalable Internet Telephony by Kundan Singh Advisor: Henning Schulzrinne Computer Science Department, Columbia University, New York Feb 14, 2005
2 Agenda for the presentation What is Internet telephony? What is the problem? Why is it important? Results so far Difference with related work 30 slides
3 Internet telephony Multimedia calls over the Internet Signaling SIP: Session Initiation Protocol Where is located? Media Audio/video codecs RTP: Real-time Transport Protocol Elements (devices) End system, server (proxy/registrar), gateway, MCU, …
4 Internet telephony infrastructure
CINEMA: Columbia InterNet Extensible Multimedia Architecture
CINEMA servers: sipd (proxy, redirect, registrar server); rtspd (media server); sipum (unified messaging); sipconf (conference server); siph323 (SIP-H.323 translator); SIP VXML (vxml, cgi)
Supporting elements: SQL database; web server (web based configuration); SIP/PSTN gateway; department PBX and telephone switch (internal telephone extensions 7040, 713x); local/long distance PSTN
Clients: NetMeeting (H.323); Quicktime and other RTSP clients (RTSP)
[Figure legend: My work]
5 CINEMA My contribution in design and implementation
CINEMA applications: sipd (SIP proxy server); sip323 (SIP/H.323 gateway); sipconf (SIP/RTP conferencing); sipum (SIP/RTSP unified messaging); sipvxml (SIP/VoiceXML browser); rtspd (RTSP media server); … and web-based GUI
CINEMA libraries: libNT (Win32 stub); libcine (utilities, parsing, IPv6); libsip (basic SIP library); libsipapi (SIP UA library); libconf (RTP audio mixer); libdict (hash table); libdb++ (MySQL interface); librtsp (RTSP client); rtplib++ (RTP library); libsnmp (SIP MIB); libcanon (canonicalize); libmedia (recording, files)
Third-party components: Xerces-C, OpenH323, MySQL, PWLib, Resparse, Flite
C/C++: 58K out of 187 KLOC
Tcl: 30 KLOC
6 My research background India India H.323 client gateway SIP-H.323 translator SIP-RTSP voice mail SIP conferencing Libsip++ (SIP library) P2P VoIP using SIP SIP Failover/load sharing Enterprise VoIP infrastructure Interactive voice response CINEMA user interface Multimedia collaboration Mobile NAT Reliability and scalability VoIP infrastructure
7 Telephone reliability (PSTN: Public Switched Telephone Network)
“bearer” network: telephone switches (SSP)
Signaling network (SS7): signaling routers (STP); databases (SCP) for freephone, calling card, …
Local telephone switch (class 5 switch): 10,000 customers, 20,000 calls/hour
Regional telephone switch (class 4 switch): 100,000 customers, 150,000 calls/hour
Signaling router (STP): 1 million customers, 1.5 million calls/hour
Database (SCP): 10 million customers, 2 million lookups/hour
8 Internet telephony (SIP: Session Initiation Protocol)
[Figure: domains yahoo.com and example.com, with DNS, DB, and the REGISTER and INVITE messages]
9 SIP network architecture Scalability requirement depends on role GW MG IP network PSTN SIP/PSTN SIP/MGC Carrier network ISP Cybercafe IP PSTN GW PBX IP phones PSTN phones T1 PRI/BRI
10 Reliability and scalability for call routing, registration, conferencing, voicemail
Requirements
  Reliable: Mean Time Between Failures (MTBF), Mean Time To Recover (MTTR), percentage availability
  Scalable: registration rate, call rate, #requests/s
  Server and network components
Proposed solutions
  Server redundancy: apply existing web-redundancy designs; evaluate quantitatively
  Peer-to-peer: novel P2P-SIP architecture; evaluate quantitatively
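The reliability metrics named on this slide can be related with the standard steady-state formula, availability = MTBF / (MTBF + MTTR). The following is only a sketch: the slides do not give this formula, and the function names and sample numbers are illustrative assumptions.

```python
# Sketch only: steady-state availability from MTBF and MTTR.
# The formula is the textbook one; the slides do not state it explicitly.

MINUTES_PER_YEAR = 365 * 24 * 60


def availability(mtbf_hours, mttr_hours):
    """Fraction of time the service is up, given mean time between
    failures and mean time to recover (both in hours)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def downtime_minutes_per_year(avail):
    """Expected downtime per year for a given availability fraction."""
    return (1.0 - avail) * MINUTES_PER_YEAR


if __name__ == "__main__":
    # Hypothetical server: fails about every 999 hours, recovers in 1 hour.
    a = availability(999.0, 1.0)
    print(f"availability = {a:.5f}")
    print(f"downtime at 99.999% = {downtime_minutes_per_year(0.99999):.2f} min/year")
```

Under this model, "five nines" leaves only a few minutes of downtime per year, which is why failover latency matters as much as failure frequency.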
11 Server redundancy
The problem: failure or overload
Replicate registration or search on call
[Figure: REGISTER and INVITE requests sent to redundant servers]
12 Server redundancy Known techniques Client-based Cisco phones: primary and backup proxy DNS NAPTR, SRV IP address takeover Database redundancy
13 High availability Failover in our test bed - CINEMA Slave/ master Web scripts D2 P2 Master/ slave Web scripts D1 P1 phone.cs.columbia.edu sip2.cs.columbia.edu REGISTER proxy1 = phone.cs backup = sip2.cs _sip._udp SRV phone.cs.columbia.edu SRV sip2.cs.columbia.edu replication
14 High availability More issues
Client re-sends INVITE to P2: immediately on an ICMP error, or after 10 s otherwise
sipd has an in-memory cache
Refresh registration well before expiry
Cisco phone registers with both P1 and P2
Web access gets delayed information
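The client-side retry rule on this slide (fail over to P2 immediately on an ICMP error, or after a 10 s timeout otherwise) can be sketched as below. This is a simulation, not code from sipd or any phone: the `send` callback, its outcome strings, and the function name are hypothetical, and the delay is accumulated rather than actually slept.

```python
# Sketch only: simulated client failover between proxies.
# `send` is a hypothetical callback returning "ok", "icmp_error", or "timeout".


def send_with_failover(servers, send, retry_timeout=10.0):
    """Try each server in order.

    Returns (server_that_answered, seconds_spent_waiting).
    An ICMP error triggers immediate failover; a silent server costs
    one retry timeout before the next server is tried.
    """
    waited = 0.0
    for server in servers:
        outcome = send(server)
        if outcome == "ok":
            return server, waited
        if outcome == "timeout":
            waited += retry_timeout
        # "icmp_error": move on immediately, nothing added to the delay
    raise RuntimeError("all servers failed")


if __name__ == "__main__":
    # P1 is down and silent, P2 answers.
    state = {"P1": "timeout", "P2": "ok"}
    print(send_with_failover(["P1", "P2"], lambda s: state[s]))
```

The sketch makes the latency cost visible: a crashed host that still returns ICMP errors adds no delay, while a silent one adds the full retry timeout to call setup.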
15 High availability Measurements on failover
Call setup latency: client retry timeout (T1), DNS TTL
User unavailability: none (refresh; double register); otherwise registration refresh interval (Tr), cache refresh interval (Tc), client retry timeout (T2), DB replication delay, DNS TTL
Web access latency
#servers: tradeoff between reliability and capacity
[Figure: caller, callee, DNS, proxies P1 and P2, master/slave databases D1 and D2, annotated with T1, Tr, T2, Tc, Td]
16 Scalability Load sharing: redundant proxies and databases
REGISTER: write to D1 & D2
INVITE: read from D1 or D2
Database write/synchronization traffic becomes the bottleneck
[Figure: proxies P1, P2, P3 and databases D1, D2]
17 Scalability Load sharing: divide the user space
Proxy and database on the same host
Stateless proxy can become overloaded: use many
Hashing: static vs dynamic
[Figure: P1, P2, P3 and D1, D2, D3, with the user space split as a-h, i-q, r-z]
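Dividing the user space can be illustrated with two mapping functions: a static split by first letter (the a-h, i-q, r-z ranges on the slide) and a hash-based split. This is a minimal sketch under assumptions: the server labels, the fallback for non-letter names, and the use of SHA-1 are illustrative choices, not details taken from CINEMA.

```python
# Sketch only: two ways to map a user name to a proxy/database host.
import hashlib

# Static ranges as drawn on the slide; the server labels are assumed.
RANGES = [("a", "h", "P1"), ("i", "q", "P2"), ("r", "z", "P3")]


def static_partition(user):
    """Pick a server from the first letter of the user name."""
    first = user[0].lower()
    for low, high, server in RANGES:
        if low <= first <= high:
            return server
    return RANGES[0][2]  # assumed fallback for names not starting with a-z


def hashed_partition(user, servers):
    """Pick a server by hashing the user name (spreads load more evenly)."""
    digest = hashlib.sha1(user.lower().encode("utf-8")).digest()
    return servers[int.from_bytes(digest[:4], "big") % len(servers)]


if __name__ == "__main__":
    for name in ("alice", "kundan", "sam"):
        print(name, static_partition(name), hashed_partition(name, ["P1", "P2", "P3"]))
```

A static split is easy to configure in a stateless first-stage proxy, while hashing avoids hot spots when names cluster on a few letters; neither handles a change in the number of servers without remapping users.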
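18 Scalability Comparison of the two designs
Total time per DB:
  Redundant databases (write to all, read from any): $((tr/D) + 1)\,TN = (A/D) + B$
  Divided user space: $((tr + 1)/D)\,TN = (A/D) + (B/D)$
  so that $A = trTN$ (read load) and $B = TN$ (write load)
D = number of database servers
N = number of writes (REGISTER)
r = #reads/#writes = (INV+REG)/REG
T = write latency
t = read latency/write latency
Low reliability, high scale
[Figure: the two designs side by side, P1, P2, P3 with D1, D2, and the a-h, i-q, r-z split]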
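The two total-time expressions on this slide can be compared numerically. The sketch below simply evaluates both formulas with the slide's symbols (D, N, r, T, t); the function names and the sample values are assumptions chosen for illustration.

```python
# Sketch only: evaluate the two "total time per DB" expressions.
# D = database servers, N = writes, r = reads/writes,
# T = write latency, t = read latency / write latency.


def replicated_time(D, N, r, T, t):
    """Redundant databases: reads are shared by D servers, but every
    write goes to every server: ((t*r/D) + 1) * T * N."""
    return (t * r / D + 1) * T * N


def partitioned_time(D, N, r, T, t):
    """Divided user space: both reads and writes are shared by D
    servers: ((t*r + 1) / D) * T * N."""
    return ((t * r + 1) / D) * T * N


if __name__ == "__main__":
    # Illustrative values: 4 databases, 1000 REGISTERs, r = 4, t = 0.5.
    args = dict(D=4, N=1000, r=4, T=1.0, t=0.5)
    print("replicated :", replicated_time(**args))
    print("partitioned:", partitioned_time(**args))
```

With D = 1 the two designs cost the same; as D grows, the replicated design is bounded below by the write term TN, while the partitioned design keeps scaling, at the price of no redundancy for any single user's data.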
19 Reliability and scalability Two stage architecture for CINEMA
First stage (s1, s2, s3, with backup ex):
  example.com
    _sip._udp SRV 0 40 s1.example.com
              SRV 0 40 s2.example.com
              SRV 0 20 s3.example.com
              SRV 1 0 ex.backup.com
Second stage (master/slave pairs a1/a2 and b1/b2):
  a.example.com
    _sip._udp SRV 0 0 a1.example.com
              SRV 1 0 a2.example.com
  b.example.com
    _sip._udp SRV 0 0 b1.example.com
              SRV 1 0 b2.example.com
Request-rate = f(#stateless, #groups)
Bottleneck: CPU, memory, bandwidth?
Failover latency: ?
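The SRV records on this slide combine failover (priority) with load sharing (weight). The sketch below shows one way a client might order the targets: lower priority first, weighted random choice within a priority. It is a simplified take on the selection rule of RFC 2782, not the exact algorithm, and the function name and record tuples are assumptions.

```python
# Sketch only: order SRV targets by priority, then by weighted random pick.
# Records are (priority, weight, target) tuples copied from the slide.
import random

EXAMPLE_COM = [
    (0, 40, "s1.example.com"),
    (0, 40, "s2.example.com"),
    (0, 20, "s3.example.com"),
    (1, 0, "ex.backup.com"),
]


def order_srv(records, rng=random):
    """Return targets in the order a client would try them."""
    ordered = []
    for priority in sorted({r[0] for r in records}):
        group = [r for r in records if r[0] == priority]
        while group:
            weights = [r[1] for r in group]
            if sum(weights) == 0:
                idx = rng.randrange(len(group))  # all weights zero: pick any
            else:
                idx = rng.choices(range(len(group)), weights=weights)[0]
            ordered.append(group.pop(idx)[2])
    return ordered


if __name__ == "__main__":
    print(order_srv(EXAMPLE_COM))
```

With these records, s1 and s2 each receive roughly 40% of first attempts and s3 about 20%, while ex.backup.com is tried only after all priority-0 servers have failed.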
20 Reliability and scalability Future work: analysis and measurement
When is the stateless proxy stage needed?
What are the optimal values of S, B, P for the required scalability (1-10 million BHCA) and reliability (99.999%) using commodity hardware?
[Figure: two stage architecture with s1, s2, s3 and master/slave pairs a1/a2, b1/b2; S=3, B=2, P=1+1; load from REGISTER, INVITE, etc.]
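To put the 1-10 million BHCA (busy hour call attempts) target in per-second terms, a small conversion helps. The sketch below is illustrative: the per-server capacity passed to `servers_needed` is a hypothetical number, not a measurement from the slides.

```python
# Sketch only: convert busy-hour call attempts to calls per second and
# estimate a server count for an assumed per-server capacity.
import math


def bhca_to_cps(bhca):
    """Busy-hour call attempts to average calls per second."""
    return bhca / 3600.0


def servers_needed(bhca, per_server_cps):
    """Minimum number of servers, ignoring redundancy and burstiness."""
    return math.ceil(bhca_to_cps(bhca) / per_server_cps)


if __name__ == "__main__":
    for bhca in (1_000_000, 10_000_000):
        print(bhca, "BHCA =", round(bhca_to_cps(bhca), 1), "calls/s")
    # Hypothetical capacity of 800 calls/s per server.
    print("servers for 10M BHCA:", servers_needed(10_000_000, 800))
```

This gives only a lower bound; choosing S, B, and P also has to account for registration traffic and for the spare capacity needed to survive a failure.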
21 Server-based vs peer-to-peer
Server-based: cost (maintenance, configuration); central points of failure; controlled infrastructure (e.g., DNS)
Peer-to-peer: robust (no central dependency); self organizing, no configuration; scalability?
[Figure: clients (C) around a server (S), compared with a mesh of peers (P)]
22 We propose: P2P-SIP Unlike server-based SIP architecture Unlike proprietary Skype architecture Robust and efficient lookup using DHT Interoperability DHT algorithm uses SIP communication Hybrid architecture Lookup in SIP+P2P Unlike file-sharing applications Data storage, caching, delay, reliability Disadvantages Lookup delay and security
23 P2P-SIP Background: DHT (Chord)
Identifier circle
Keys assigned to successor
Evenly distributed keys and nodes
Finger table: $\log N$ entries; the $i$-th finger points to the first node that succeeds $n$ by at least $2^{i-1}$
Stabilization for join/leave
[Figure: identifier circle with the key/node finger table of node 8, starting at 8+1]
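The finger-table rule on this slide can be checked with a few lines of code. This is a sketch, not the sippeer implementation: the 6-bit identifier space and the node identifiers are assumed for illustration (they follow the commonly used example ring from the Chord paper, which appears to be what the slide's figure shows).

```python
# Sketch only: Chord successor and finger table on a small identifier circle.
from bisect import bisect_left

M = 6  # assumed identifier size in bits: identifiers are 0..63
NODES = sorted([1, 8, 14, 21, 32, 38, 42, 48, 51, 56])  # assumed node ids


def successor(key, nodes=NODES, m=M):
    """First node whose identifier is equal to or follows `key` on the circle."""
    key %= 2 ** m
    i = bisect_left(nodes, key)
    return nodes[i % len(nodes)]  # wrap around past the largest node


def finger_table(n, nodes=NODES, m=M):
    """The i-th finger (i = 1..m) is successor(n + 2^(i-1))."""
    return [successor(n + 2 ** (i - 1), nodes, m) for i in range(1, m + 1)]


if __name__ == "__main__":
    for i, node in enumerate(finger_table(8), start=1):
        print(f"8+{2 ** (i - 1):2d} -> node {node}")
```

Because each finger roughly halves the remaining distance to the target, a lookup needs on the order of log N hops, which is the source of the O(log(N)) call setup latency discussed later.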
24 P2P-SIP Design Alternatives
Use DHT in server farm
Use DHT for all clients; but some are resource limited
Use DHT among super-nodes
[Figure: servers and clients; Route(d46a1c) forwarded across nodes 65a1fc, d13da3, d4213f, d462ba, d467c4, d471f1 toward d46a1c]
25 P2P-SIP Node architecture: registrar, proxy, user agent
DHT communication using SIP REGISTER
Known node: Unknown node: User:
Modules: user interface (buddy list, etc.); SIP; ICE; RTP/RTCP; codecs; audio devices; DHT (Chord)
Events and actions: on startup; discover; user location; multicast REG; peer found/detect NAT; REG; REG, INVITE, MESSAGE; signup, find buddies; join; find; leave; on reset; signout, transfer; IM, call
26 P2P-SIP Problems Mapping node identifier to SIP URI Node startup Discovery, join, maintenance Node shutdown or failure Adaptor for existing phones NAT/firewall traversal Offline messages Multi-party conferencing Security
27 P2P-SIP Implementation
sippeer: C++, Linux, Chord
Nodes join and form the DHT
Node failure is detected and the DHT is updated
Registrations are transferred on node shutdown
Co-located sipc can use the sippeer service
28 P2P-SIP Future work: scalability evaluation
#messages depends on: keep-alive and finger table refresh rate; call arrival distribution; user registration refresh interval; node join, leave, failure rates
$M = \{r_s + r_f (\log N)^2\} + c \log N + (k/t) \log N + (\log N)^2 / N$
#nodes = f(capacity, rates): CPU, memory, bandwidth
Verify by measurement and profiling
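The message-rate expression on this slide can be evaluated for different network sizes. The sketch below is hedged throughout: the slide does not define the symbols, so the readings used here (r_s and r_f as keep-alive and finger refresh rates, c as call rate, k/t as registration refresh rate) and the base-2 logarithm are assumptions, as are the sample values.

```python
# Sketch only: evaluate
#   M = {r_s + r_f (log N)^2} + c log N + (k/t) log N + (log N)^2 / N
# Symbol meanings and the log base are assumptions, not from the slide.
import math


def messages_per_node(N, r_s, r_f, c, k, t):
    """Message rate M for a DHT of N nodes under the assumed model."""
    log_n = math.log2(N)  # assumed base 2, as in Chord analyses
    maintenance = r_s + r_f * log_n ** 2
    calls = c * log_n
    registrations = (k / t) * log_n
    churn = log_n ** 2 / N
    return maintenance + calls + registrations + churn


if __name__ == "__main__":
    for n in (1024, 1_000_000):
        m = messages_per_node(n, r_s=1.0, r_f=0.1, c=0.5, k=10, t=100)
        print(f"N={n}: M ~ {m:.2f}")
```

Under these assumptions the finger refresh term grows fastest with N, which suggests the refresh rate is the first parameter worth tuning in a measurement study.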
29 P2P-SIP Future work: reliability and call setup latency evaluation
User availability depends on: super-node failure distribution; node keep-alive and finger refresh rate; user registration refresh rate
Replicate user registration
Measure the effect of each
Call setup latency: same as DHT lookup latency, O(log(N)); calls to known locations (“buddies”) are direct; DHT optimization can further reduce latency; user availability and retransmission timers
Measure the effect of each
30 Summary and future work Internet telephony infrastructure Server redundancy Two stage architecture evaluation Server-less/peer-to-peer VoIP Quantitative evaluation Multi-domain deployment PSTN interworking
31 Publications Conference, workshop, technical report, magazine
H. Schulzrinne, K. Singh and X. Wu, "Programmable Conference Server", Columbia University Technical Report CUCS, NY, Oct.
K. Singh and H. Schulzrinne, "Peer-to-peer Internet Telephony using SIP", New York Metro Area Networking Workshop, CUNY, NY, Sep.
K. Singh and H. Schulzrinne, "Peer-to-peer Internet Telephony using SIP", Columbia University Technical Report CUCS, NY, Oct.
K. Singh and H. Schulzrinne, "Failover and Load Sharing in SIP Telephony", Columbia University Technical Report CUCS, NY, May.
K. Singh, Xiaotao Wu, J. Lennox and H. Schulzrinne, "Comprehensive Multi-platform Collaboration", MMCN SPIE Conference on Multimedia Computing and Networking, Santa Clara, CA, Jan.
K. Singh, Xiaotao Wu, J. Lennox and H. Schulzrinne, "Comprehensive Multi-platform Collaboration", Columbia University Technical Report CUCS, NY, Nov.
M. Buddhikot, A. Hari, K. Singh and S. Miller, "MobileNAT: A new Technique for Mobility across Heterogeneous Address Spaces", WMASH ACM International Workshop on Wireless Mobile Applications and Services on WLAN Hotspots, San Diego, CA, Sep.
K. Singh, A. Nambi and H. Schulzrinne, "Integrating VoiceXML with SIP services", ICC Global Services and Infrastructure for Next Generation Networks, Anchorage, Alaska, May.
K. Singh, A. Nambi and H. Schulzrinne, "Integrating VoiceXML with SIP services", Second New York Metro Area Networking Workshop, Columbia University, NY, Sep.
K. Singh, W. Jiang, J. Lennox, S. Narayanan and H. Schulzrinne, "CINEMA: Columbia InterNet Extensible Multimedia Architecture", Columbia University Technical Report CUCS, NY, May.
W. Jiang, J. Lennox, H. Schulzrinne and K. Singh, "Towards Junking the PBX: Deploying IP Telephony", NOSSDAV.
W. Jiang, J. Lennox, S. Narayanan, H. Schulzrinne, K. Singh and X. Wu, "Integrating Internet Telephony Services", IEEE Internet Computing (magazine), May/June 2002 (Vol. 6, No. 3).
K. Singh, Gautam Nair and H. Schulzrinne, "Centralized Conferencing using SIP", 2nd IP-Telephony Workshop (IPTel'2001), April.
K. Singh and H. Schulzrinne, "Unified Messaging using SIP and RTSP", IP Telecom Services Workshop 2000, Atlanta, Georgia, U.S.A, Sept.
K. Singh and H. Schulzrinne, "Unified Messaging using SIP and RTSP", Columbia University Technical Report CUCS, NY, Oct.
K. Singh and H. Schulzrinne, "Interworking Between SIP/SDP and H.323", 1st IP-Telephony Workshop (IPTel'2000), April.
K. Singh and H. Schulzrinne, "Interworking Between SIP/SDP and H.323", Columbia University Technical Report CUCS, NY, May 2000.
32 Backup slides
33 MobileNAT Architecture Two IP addresses Virtual IP (fixed host-id) Actual IP (routable; changes) DHCP, NAT, mobility manager A= moves V= Actual IP Virtual IP MN CN Application Socket TCP/UDP IP Addr “A” Shim Layer Addr “V” Net IF Anchor node (AN)
34 MobileNAT Comparison with other work MIPCIPHawaiiHMIP (RR) IDMP TeleMIP MIP LR MIP RO SIPIPv 6 Mobile NAT Virtual NAT MIP messagingYNYYY--NYNN Inter-tunnelYYYYYNYNOON Intra-tunnel-NNYY---OON PagingOYYYY--NYUDN Host IDHA CoA LCoA--SIPHACoAvirtual signalingYDataYYYYYYYDHCP/ MM Y CN modify?NNNNNYY-NNY MN modify?YYYYYYY-YYY Router modify?FAYY ---ONN NAT supportY1Y1 YYYYIN Y Y Non-mobile IP nodes YNYYY---YYIN Triangular routeYYYYYNNNNN/YN Y: yes N: no - :N/A O: optional IN:independent UD: Under Development 1: We assume Mobile IP with UDP tunneling for NAT
35 Related work IP telephony and multimedia communication Unlike low cost VoIP: Vonage, AT&T We provide enterprise infrastructure There are enterprise IPtel: Cisco, Nortel But redundancy architecture, interoperability, distributed components model differ Collaboration: CSCW, SIGGROUP Unlike web-centric, or application specific We provide standard-based multimedia collaboration platform Multimedia conferencing: Mbone, H.323 Ours is SIP-based infrastructure, reuse existing tools and protocols such as RTSP, media server
36 Related work Comprehensive multi-platform collaboration
Goal: alternate between synchronous and asynchronous communication, and access from different devices and clients.
Synchronous (tightly coupled): video conference, IM, screen sharing, floor control, …
Asynchronous (loosely coupled): file sharing, message board, …
Messaging and notifications
Personalized view: per-user calendar, access control, address book
We try to incorporate…
  Long lived groups: design teams, committees, college classes
  Asymmetric events: lecture and lecture series
  Short-lived spontaneous interaction
Current practice
  Email, teleconference
  Vendor specific tools, platform dependence
  Application specific: e.g., collaborative software development
37 Multi-party collaboration What is done, and what is left. Sipconf: conference server Audio, video, IM, screen, shared browsing, floor control No XCON yet: use web interface Small to medium size conferences Cascaded conference mixer #participants, audio delay Failover State sharing between servers
38 Related work Availability for (web) servers
Availability = f(reliability, maintainability)
  Reliability: time to failure pdf
  Maintainability: time to recover pdf
Existing work on failover: TCP connection migration; IP address takeover; MAC address takeover
Reliable server pooling: requires new protocol support in clients
Reliability analysis tools
Availability in the face of (DoS) attacks
39 Related work Scalability for (web) servers Existing work Connection dispatcher Content/session-based redirection DNS-based load sharing HTTP vs SIP UDP+TCP, signaling not bandwidth intensive, no caching of response, read/write ratio is comparable for DB SIP scalability bottleneck Signaling (chapter 4), real-time media data, gateway 302 redirect to less loaded server, REFER session to another location, signal upstream to reduce
40 Related work SIPStone: SIP server performance metric Steady state rate for successful registration, forwarding and unsuccessful call attempts measured using 15 min test runs. Measure: #requests/s with given delay constraint. Performance=f(#user,#DNS,UDP/TCP,g(request),L) where g=type and arrival pdf (#request/s), L=logging? For register, outbound proxy, redirect, proxy480, proxy200. Parameters Measurement interval, transaction response time, register/s, calls/s, transaction failure probability<5%, Shortcomings: does not consider forking, scripting, Via header, packet size, different call rates, SSL. Is there linear combination of results?
41 Related work 3GPP (release 5)’s IP Multimedia core network Subsystem uses SIP Proxy-CSCF (call session control function) First contact in visited network. 911 lookup. Dialplan. Interrogating-CSCF First contact in operator’s network. Locate S-CSCF for register Serving-CSCF User policy and privileges, session control service Registrar Connection to PSTN MGCF and MGW
42 Related work: Skype
From the KaZaA community
Host cache of some super nodes; bootstrap IP addresses
Auto-detect NAT/firewall settings: similar to STUN and TURN
Protocol among super nodes: ??
Allows searching for a user (e.g., kun*)
History of known buddies
All communication is encrypted
Promote to super node: based on availability, capacity
Conferencing
Problems: proprietary, single service, centralized login
[Figure: mesh of peers (P)]
43 Related work P2P P2P networks Unstructured (Kazaa, Gnutella,…) Structured (DHT: Chord, CAN,…) Skype and related systems Flooding based chat, groove, Magi P2P-SIP telephony Proprietary: NimX, Peerio, damaka File sharing: SIPShare
44 Why did we choose Chord?
Chord can be replaced by another DHT, as long as it can map to SIP
High node join/leave rates
Provable probabilistic guarantees
Easy to implement
Not addressed (X): proximity based routing; security, malicious nodes
45 Related work JXTA vs Chord in P2P-SIP JXTA Protocol for communication (peers, groups, pipes, etc.) Stems from unstructured P2P P2P-SIP Instead of SIP, JXTA can also be used Separate search (JXTA) from signaling (SIP)
46 P2P-SIP Node Startup SIP REGISTER with SIP registrar DHT Discover peers: multicast REGISTER Join DHT using node-key=Hash(ip) REGISTER with DHT using user- Dialing out Call, instant message, etc. INVITE MESSAGE Last seen, SIP NAPTR/SRV, DHT REGISTER DB sipd Detect peers columbia.edu REGISTER alice=42 REGISTER bob=12
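The startup steps on this slide derive a node key from the node's IP address and register users under a key in the same identifier space. The sketch below illustrates that idea only: SHA-1, the 16-bit identifier size, the sample addresses, and the helper names are assumptions, and the slide's own example keys (alice=42, bob=12) are not reproduced by this hash.

```python
# Sketch only: derive DHT keys and find the node responsible for a user.
import hashlib
from bisect import bisect_left


def dht_key(text, m=16):
    """Hash a string (IP address or user identifier) onto an m-bit circle.
    SHA-1 and m=16 are illustrative assumptions."""
    digest = hashlib.sha1(text.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** m)


def responsible_node(user_key, node_keys):
    """The user's registration is stored at the successor of its key."""
    nodes = sorted(node_keys)
    i = bisect_left(nodes, user_key)
    return nodes[i % len(nodes)]  # wrap around the identifier circle


if __name__ == "__main__":
    # Hypothetical peers (documentation-range IP addresses) and user.
    node_keys = [dht_key(ip) for ip in ("192.0.2.10", "192.0.2.20", "192.0.2.30")]
    user_key = dht_key("alice@example.com")
    print("user key:", user_key, "stored at node:", responsible_node(user_key, node_keys))
```

A REGISTER for the user would then be routed through the DHT to the responsible node, and an INVITE for the same user follows the same lookup.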
47 P2P-SIP Node Leaves Graceful leave Un-REGISTER Transfer registrations Failure Attached nodes detect and re-REGISTER New REGISTER goes to new super-nodes Super-nodes adjust DHT accordingly DHT REGISTER key=42 OPTIONS 42 REGISTER
48 P2P-SIP Advanced services
Offline messages: INVITE or MESSAGE fails => responsible node stores voicemail, instant message
Conferencing: mixer, full mesh, multicast
49 P2P-SIP Security – open issues (threats, solutions, issues) More threats than server-based Privacy, confidentiality Malicious node Don’t forward all calls, log call history (spy),… “free riding”, motivation to become super-node Existing solutions Focus on file-sharing (non-real time) Centralized components (boot-strap, CA) Assume co-operating peers works for server farm in DHT Collusion Hide security algorithm (e.g., yahoo, skype) Chord Recommendations, design principles, …
50 Server-based vs peer-to-peer
Reliability, failover latency
  Server-based: DNS-based. Depends on client retry timeout, DB replication latency, registration refresh interval.
  Peer-to-peer: DHT self organization and periodic registration refresh. Depends on client timeout, registration refresh interval.
Scalability, number of users
  Server-based: Depends on number of servers in the two stages.
  Peer-to-peer: Depends on refresh rate, join/leave rate, uptime.
Call setup latency
  Server-based: One or two steps.
  Peer-to-peer: O(log(N)) steps.
Security
  Server-based: TLS, digest authentication, S/MIME.
  Peer-to-peer: Additionally needs a reputation system, working around spy nodes.
Maintenance, configuration
  Server-based: Administrator: DNS, database, middle-box.
  Peer-to-peer: Automatic: one time bootstrap node addresses.
PSTN interoperability
  Server-based: Gateways, TRIP, ENUM.
  Peer-to-peer: Interact with server-based infrastructure or co-locate peer node with the gateway.
51 My contribution in CINEMA Sip-h323: signaling translator Background: ITU-T’s H.323 Binary ASN.1 PER, collection of protocols (H.245, H.225.0, Q.931, RAS, H.450.x) H.323 gatekeeper similar but not same as SIP server Problems in interworking Multi-stage dialing in H.323v1 Fast start in v2 is optional User registration Both SIP and H.323 users should be reachable Session description is more complex End system should select the codecs Security and QoS: end-to-end or not? Solution List different scenarios No modification in SIP or H.323 Direct RTP traffic if possible Implementation
52 My contribution in CINEMA Sipum: Unified messaging using SIP and RTSP
Problem
  Existing systems keep voicemail with the PBX or phone, or send voice attachments in email
  Downloading the whole message is not desirable
Solution
  Using existing standards (RTSP, SIP) and tools (web, media player)
  Distributed components for different architectures (PBX, phone, service provider)
  Many ways to retrieve your message (RTSP, SIP, phone, web)
  Message deletion issues
  Call reclaiming
Implementation
53 My contribution in CINEMA Sipconf: Centralized conferencing using SIP/RTP Problem Multicast is not available and ad hoc conference is useful for small number of users Heterogeneous clients (some have video also; or different audio codecs) Solution Audio mixer, video forwarder IM, VNC screen sharing, shared web browsing Playout delay adjustments Web based configuration, floor control G.711 A/Mu, G.721, DVI, ADPCM, G.722, … Modular: libconf, libmedia, rtplib++ Implementation and performance evaluation
54 My contribution in CINEMA Sipvxml: SIP-based VoiceXML browser Background VoiceXML for touch tone-based service programming Backend scripts (CGI) or servlets Problem Then existing solutions were PSTN based Solution First SIP-VoiceXML implementation SIP interface (works with PSTN via a gateway) Example cgi scripts Calling card service Joining a conference (Ajay) Accessing voice mail (Ajay) by phone (Pimrampai) Auto attendant (Sean)
55 My contribution in CINEMA libsip++: SIP user agent library in C++ All the applications (sipum, sipconf, siph323, sipvxml) use a common underlying library Similar API for H.323 defined using wrapper around openH323 Unlike JAIN-SIP or SIP servlet, libsip++ is more high level with facility to access low level features Dialog, call, endpoint, registration are defined as objects (JAIN-SIP 1.1 added dialog as object) Uses underlying transaction and parsing library shared with sipd Test user agent (sipua) is used as tools, e.g., for sipconf testing Documentation is at
56 My contribution in CINEMA GUI: web-based user interface
Configuration, user profile, etc., stored in SQL DB
Front end as web-based GUI; CGI scripts in Tcl
About 100 pages for various configuration
User friendly (beginner vs advanced, context help)
Asynchronous collaboration: voicemail, file sharing, IM archive, groups, address book, calendar
Undergone three iterations
See current version at