Backup slides.

My contributions Server redundancy Peer-to-peer (P2P)
Implemented failover using database replication
Two-stage architecture for SIP load sharing
Comparison of thread models for SIP server
Peer-to-peer (P2P) SIP servers using external P2P network
Additionally, P2P maintenance using SIP
Enterprise IP telephony: multi-platform collaboration using SIP; scalable centralized conferencing architecture; interworking between SIP/SDP and H.323
MTTR depends on a number of components such as DNS TTL, ARP cache, DHCP timers, SIP registration and call setup latency, depending on the failover architecture.
100 requests/s with a 1 hr = 3600 s refresh interval => 360,000 users. Complicated by mobility (higher registration rate) and authentication. Our proxy server does 300 registrations/s and 90 calls/s. Same infrastructure can be used for generic multimedia communication on the Internet.
New architecture, new algorithm or approach, implementation, evaluation

Server redundancy The problem: failure or overload
REGISTER INVITE Replicate registration vs search on call: Tradeoff: registration traffic vs Call setup latency. Types of servers: stateless, stateful, registrar, call stateful, etc.

Scalability Load sharing: redundant proxies and databases
REGISTER: write to D1 and D2. INVITE: read from D1 or D2. Database write/synchronization traffic becomes the bottleneck. With MySQL 4.0 it is not possible to use database replication here, since a write can happen to any database, so each SIP server must write to all databases. (Figure: proxies P1, P2, P3 and databases D1, D2.)

Scalability Load sharing: divide the user space
Proxy and database on the same host. Stateless proxy can become overloaded. Use many. Any study on dynamic hashing for web? What are the issues with dynamic hashing: transfer registrations to the new DB. (Figure: P1/D1 serves users a-h, P2/D2 serves i-q, P3/D3 serves r-z.)
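
The static split of the user space can be sketched in a few lines. This is a minimal illustration, not the deployed code: the server labels come from the figure, while the function name `server_for` and the fallback for names outside a-z are assumptions.

```python
# Static partition taken from the slide: the first letter of the user name
# selects the proxy/database host.
PARTITIONS = [("a", "h", "P1/D1"), ("i", "q", "P2/D2"), ("r", "z", "P3/D3")]


def server_for(user: str) -> str:
    """Return the host responsible for a user name."""
    first = user.strip().lower()[:1]
    for low, high, server in PARTITIONS:
        if low <= first <= high:
            return server
    # Assumed fallback (not on the slide): spread other names by a byte sum.
    return PARTITIONS[sum(user.encode("utf-8")) % len(PARTITIONS)][2]


print(server_for("alice"), server_for("kundan"), server_for("zoe"))
```

Changing the ranges later means moving registrations between databases, which is the dynamic hashing issue raised in the notes.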

Server performance Results of my measurement
Event-based performs 30% better than existing thread-pool architecture on single-CPU Two stage thread-pool architecture gives better performance on multi-CPU 60% more on 4xPentium Both Pentium and Sparc took 2 MHz of CPU cycles per call/s on single-CPU Earlier memory pool gave 30% improvement Stateless: 1550 CPS, i.e., S3P3 16m BHCA Stateful: 1150 CPS

How to combine SIP + P2P? P2P-over-SIP SIP-using-P2P
SIP-using-P2P: replace the SIP location service by a P2P protocol. P2P-over-SIP: additionally, implement P2P using SIP messaging.
SIP-using-P2P: Reuse optimized and well-defined external P2P network. Define P2P location service interface to be used in SIP. Extends to other signaling protocols (H.323). Don’t overload SIP or REGISTER. Lookup is separate from call signaling.
P2P-over-SIP: No change in semantics of SIP. No dependence on external P2P network. Reuse existing features such as forking (for voice mail). Built-in NAT/media relays. Additional message overhead due to SIP.
(Figure: message flows REGISTER, INSERT, FIND and INVITE for alice through the P2P network and the P2P-SIP overlay, with maintenance and lookup marked as P2P or SIP.)

SIP-using-P2P Logical Operations
Contact management put (user id, signed contact) Key storage User certificates and private configurations Presence put (subscribee id, signed encrypted subscriber id) Composition needs service model Offline message put (recipient, signed encrypted message) NAT and firewall traversal STUN and TURN server discovery needs service model P2P-SIP design consists of many logical operations. The contact management deals with storing and retrieving user contacts as in SIP location service. The contacts are signed by the user on put and verified on get before making a call. Key storage deals with storing the certificate and encrypted private key of the user. The caller uses this certificate to verify. Presence deals with the subscribers updating the watcher list of the given subscribee such that only he can read the identifiers of the subscribers. Similarly, offline message deals with putting the signed and encrypted messages for the recipient such that only he can read and delete it. For NAT and firewall traversal, it provides P2P service discovery of a STUN or TURN server. XML-based data format

P2P-over-SIP Architecture and implementation
DHT (Chord) algorithm using SIP messages with query and update semantics of REGISTER Has SIP registrar, proxy, user agent Other: discovery, NAT traversal, failover Adaptor: allows existing SIP devices to become P2P

P2P-over-SIP Analysis: scalability
Computed message load as a function of refresh rate (keep-alive, finger table, user registration), call arrival rate, churn (join, leave, failure), and scale (number of peer nodes and users). Number of nodes = f(individual node capacity). Measured performance: 800 register/s. Assuming a conservative 10 register/s capacity, and aggressive refresh and call rate of 1/min, it gives more than 16 million peers (super nodes) in the network. Scalability of Chord: load balancing: 99th percentile in 10^4 nodes, 10^6 keys, is 4.6 x mean; max is 10x. Keys per node is linear in number of keys; the PDF looks exponential. Path length in a 10^4 node network: average 6, 99th percentile 11; looks like a Poisson distribution.

P2P-over-SIP Analysis: availability and call setup latency
To increase user availability: fast failure detection (increase keep-alive rate); reduce unavailability (frequent registration refresh); replicate user and node registrations. Call setup latency: same as DHT lookup latency, O(log(N)); calls to known locations (“buddies”) are direct; Chord: 10,000 nodes => 6 hops; at most a few seconds. User availability and retransmission timers. Advanced services also possible: the ConChord system for certificate storage, the Tarzan system for anonymous communications, and the proposal to use Chord as replacement of standard DNS hierarchies by Cox, Muthitacharoen, and Morris; this last proposal is most similar in spirit to your work.

SIP-using-P2P vs P2P-over-SIP
Not SIP-specific, hence no implementation overhead for applications that are P2P but not VoIP; low transport and transaction overhead; no P2P security burden on SIP; no dependency on single DHT implementation. Reuse SIP naming, routing, security, NAT/firewall traversal; easily reuse existing SIP components without change (voicemail, conference); single DHT implementation; readily supports service model.

Conference server Performance evaluation of audio mixer
On commodity PC About 480 participants in a single conference with one active speaker About 80 four-party conferences, with one speaker each Both Pentium and Sparc took 6 MHz per participant Commodity PC: 3 GHz, Pentium 4, with 1 GB memory Memory: 20 kB/participant Delay: less than 20 ms Packetization interval of 40 ms gives 700 participants, but increases delay

SIP network architecture Scalability requirement depends on role
(Figure: IP network with cybercafe, ISPs, IP phones and SIP/PSTN gateways (GW); carrier network with SIP/MGC and media gateways (MG); PSTN side with PBX, PSTN phones, T1 PRI/BRI.) Mention about requirements in #customers, e.g., a mobile service provider in 3GPP with millions of subscribers.

Reliability of DNS, network, web
[Wenyu’03]: VoIP network 98% (.5+1.5) [Pang’04]: local DNS:98%;auth DNS:99% Minor correlation to load MTBF: order of days, weeks or longer (1%: 10days) MTTR: order of hours (60%:5hr; 90%:15) [Sohar’00]: overall 99.57% availability Main: 99.6%, Backbone: 99.8%, Cisco switches: 99.98%, PC server: 99.7%, DB server: 99.96%, Solaris OS: 99.99% Hosted web portals: Usually give 99.5% in SLA; but achieve 99.9% except for scheduled downtime

Availability and “nines”
Availability   Downtime per month   Downtime per year
98%            14 hr                7.3 days
99%            7 hr                 3 days, 15 hr
99.5%          3.5 hr               1 day, 19 hr
99.9%          44 min               8 hr, 46 min
99.95%         22 min               4 hr, 23 min
99.99%         4.4 min              52 min, 36 s
99.999%        26 s                 5 min, 15 s
99.9999%       2.7 s                32 s
Examples marked on the slide: PC, VoIP calls; IP-PBX software (guess); PSTN, network; US power, Solaris OS; switches.
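
The downtime columns follow directly from the availability figure. The sketch below shows the arithmetic, assuming a 365-day year and a month of one twelfth of a year; the function and constant names are illustrative only.

```python
YEAR_HOURS = 365 * 24          # assumed: non-leap year
MONTH_HOURS = YEAR_HOURS / 12  # assumed: average month


def downtime_seconds(availability_percent: float, period_hours: float) -> float:
    """Expected downtime in seconds over a period for a given availability."""
    return (1 - availability_percent / 100.0) * period_hours * 3600


for nines in (99.0, 99.9, 99.999):
    per_year = downtime_seconds(nines, YEAR_HOURS)
    per_month = downtime_seconds(nines, MONTH_HOURS)
    print(f"{nines}%: {per_month:.0f} s/month, {per_year:.0f} s/year")
```

For 99.999% this gives about 315 s per year, which is the 5 min, 15 s entry.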

Related work: web server reliability
Dispatcher or stateful NAT Not data dominated so becomes bottleneck IP address takeover (linux-ha) or MAC takeover Requires servers on same subnet; reachability Can’t use for scalability for stateful proxy IP anycast Probably closer/faster server Client-based Cisco phones: primary and backup proxy; implementation dependent; what is servers IP change. DNS Dynamic: load; caching problem; not every impl respects TTL NAPTR, SRV Rserpool: to maintain server pool and name server pool Requires new protocol support; local network; Database redundancy TCP connection migration/process migration

Related work: SIP server reliability
Cisco: proprietary protocol; same subnet Dynamicsoft (bought by cisco): backend database replication Ubiquity: load balancer Vovida: some kind of SIP replication; sync problem after recovery Iptel: IP anycast with state replication Requires transaction state sharing among servers because replication is not visible to the client

High availability Failover implementation in our test-bed - CINEMA
MySQL: no locking protocol between master and slave. Race if insert into D1 and remove from D2. sipd has in-memory cache: REGISTER refresh much before expiry; web gets delayed data; not an issue for Cisco phones.
(Figure: web scripts write to D1 and D2, which replicate master/slave and slave/master; proxies P1 and P2.)
Known techniques: client-based (Cisco phones: primary and backup proxy); DNS (NAPTR, SRV); IP address takeover (requires same subnet); database redundancy.
MySQL 4.0 has no locking protocol between master and slave. They proposed it for 5.0 but it is not done yet. So only the master should be updated; if the slave is updated while the master is running, there may be a problem. This is mostly not visible because SIP registrations are additive. If a contact is added to D1 and removed from D2, there is a race condition. But the primary is preferred over the secondary, so all replication happens from D1 to D2 unless D1 is down. Make sure the databases are consistent before a failed server is brought up.
MySQL Cluster has delivered five 9s (in other words, 99.999 percent) availability in testing, according to company officials. That works out to five minutes of downtime per year. The technology has been tested on as many as 48 nodes, with failover response times running between five and 10 milliseconds, according to MySQL Vice President of Marketing Zack Urlocker.
Reducing the refresh interval to much before expiry makes the user appear available when he is not (phone dies without un-registering), e.g., refresh interval = (Expires/2)-e.
Database replication causes a problem for redundant voicemail: different vmail contacts for P1 and P2 => don’t replicate, or use DNS NAPTR/SRV. Conference server: must be tied with conference server state replication also.
Replication performance: a simple best-case experiment on lightly loaded servers indicates that 95% of the time the record is available at the slave on the first attempt, and 99.3% of the time within the first two attempts.
phone.cs.columbia.edu sip2.cs.columbia.edu REGISTER _sip._udp SRV phone.cs.columbia.edu SRV sip2.cs.columbia.edu proxy1 = phone.cs backup = sip2.cs INVITE to P2 either on ICMP error or after 10 s

High availability Analysis
System reliability: $1-(1-R)^2$
Call setup latency: $T_R\,(1-R)\,P[t_M<T_D]$, where $T_D$ is the DNS TTL, $t_M$ is the time-to-repair, and $P[t_M<T_D] = 1 - e^{-T_D/T_M}$
User unavailability: none (refresh; double register). For first-time registration, the probability that the server goes down before replication is 1 - e^(-(Td/+Tc)/TF), where TF is the mean-time-to-failure.
Redundant servers: tradeoff of reliability vs capacity.
R is the single-server reliability (probability that the system is up). The latency equation assumes no network delay, R approximately 1, and MTTF much larger than MTTR. TODO: Web access latency to work out.
(Figure: caller, DNS, proxies P1 and P2, databases D1 and D2 replicating master/slave and slave/master, callee; intervals Tc, Td, Tr, TR and point A marked on the timeline.)
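
The two closed-form expressions on this slide are easy to evaluate numerically. The sketch below is an illustration under stated assumptions: independent server failures, and an exponentially distributed repair time with mean T_M; the function names and sample values are not from the slides.

```python
import math


def redundant_reliability(r: float, n: int = 2) -> float:
    """Probability that at least one of n independent servers is up: 1-(1-R)^n."""
    return 1.0 - (1.0 - r) ** n


def p_repair_within(t_d: float, t_m: float) -> float:
    """P[t_M < T_D] = 1 - exp(-T_D/T_M) for exponential repair time of mean T_M."""
    return 1.0 - math.exp(-t_d / t_m)


# Illustrative values only: 99% single-server reliability,
# DNS TTL of 1 hour and mean repair time of 5 hours.
print(redundant_reliability(0.99))
print(p_repair_within(t_d=1.0, t_m=5.0))
```

With R = 0.99, the master/slave pair reaches 0.9999, which is the gain the slide attributes to redundancy.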

Bulk arrival What if all servers come up immediately after power recovery? Carrier (gradually) vs enterprise (all at once) Wait timer in SIP configuration based server capacity Don’t query the server for config on reboot Not a problem if server gracefully drops requests without degrading the performance Clients retry – and succeed over an hour

What does scalability depend on? Depends on traffic type
Registration (uniform): authentication, mobile users. Call routing (Poisson): stateful vs stateless proxy, redirect, programmable scripts. Beyond telephony (don’t know): instant message, presence (including sensors), device control. Stateful calls (Poisson arrival, exponential call duration): firewall, conference, voicemail. Transport type: UDP/TCP/TLS (cost of security). 100 requests/s with a 1 hr = 3600 s refresh interval => 360,000 users. Our proxy server does 300 registrations/s and 90 calls/s. Same infrastructure can be used for generic multimedia communication on the Internet.

Related work: web server scalability
Existing work Connection dispatcher (NAT) Same IP address Content/session-based redirection DNS-based load sharing HTTP vs SIP UDP+TCP, signaling not bandwidth intensive, no caching of response, read/write ratio is comparable for DB SIP scalability bottleneck Signaling, real-time media data, gateway 302 redirect to less loaded server, REFER session to another location, signal upstream to reduce Multi-homing. Geographic load balancing. Round-robin DNS – no account of server load, and server failure. Load balancer – can use server response time, with damping. Or load information is pushed. Active content verification by load balancer Session and context – use external DB or sticky session in load balancer

SIP vs HTTP server Signaling (vs data) bound Transactions
No File I/O (exception: scripts, logging) No caching; DB read and write frequency are comparable Transactions Stateful wait for response Depends on external entities DNS, SQL database Transport UDP in addition to TCP/TLS; may not need TCP state Goals Carrier class scaling using commodity hardware Try not to customize/recompile OS or implement (parts of) server in kernel (khttpd, AFPA)

What is SIPstone? SIP server performance metrics
Steady state rate for successful registration, forwarding and unsuccessful call attempts, measured using 15 min test runs. Measure: #requests/s with given delay constraint. Performance = f(#user, #DNS, UDP/TCP, g(request), L), where g = type and arrival pdf (#request/s) and L = logging? Tests: register, outbound proxy, redirect, proxy480, proxy200.
Parameters: measurement interval, transaction response time, RPS (registers/s), CPS (calls/s), transaction failure probability < 5%. Delay budget: R1 < 500 ms, R2 < 2000 ms.
Shortcomings: does not consider forking, scripting, Via header, packet size, different call rates, SSL. Is there a linear combination of results? Whitebox measurements: turnaround time. Extend to SIMPLEstone.
User population: BHCA = f(#users, %active, call interval); f(20000, 1/4, 3 min) = 100,000/hr = 28 calls/s. Request modeling: Poisson arrival, INVITE transaction duration (20s=.7x8.5s+.3x38s), UAS latency < 100 ms. Transaction response time < 200 ms for register, < 100 ms for 1xx, and < 2 s for 2xx of INVITE. At 100 calls/s with 1500 bytes each, signaling requires 1.2 Mb/s. E.721 says at most 6 s, 8 s, 11 s respectively for local, toll, international call setup delay.
(Figure: loader, server, handler and SQL database; R1 is marked on the REGISTER / 200 OK exchange and R2 on the INVITE, 100 Trying, 180 Ringing, 200 OK, ACK, BYE, 200 OK exchange.)
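
The user-population and bandwidth figures above can be reproduced with two small helpers. This is only a sketch of the arithmetic; the function names are assumptions, and each active user is assumed to place one call per call interval.

```python
def bhca(users: int, active_fraction: float, call_interval_min: float) -> float:
    """Busy hour call attempts: active users times calls per user per hour."""
    return users * active_fraction * (60.0 / call_interval_min)


def signaling_mbps(calls_per_s: float, bytes_per_call: int) -> float:
    """Signaling bandwidth in Mb/s for a given call rate and bytes per call."""
    return calls_per_s * bytes_per_call * 8 / 1e6


attempts = bhca(20000, 0.25, 3)
print(attempts, round(attempts / 3600))   # attempts per hour, calls per second
print(signaling_mbps(100, 1500))          # Mb/s at 100 calls/s
```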

Comparison of two designs
(a) Redundant proxies and databases: low scalability, high reliability.
Total time per DB $= \left(\frac{t r}{D} + 1\right) T N = \frac{A}{D} + B$
System reliability $= \left(1-(1-R_p)^P\right)\left(1-(1-R_d)^D\right)$
(b) Divided user space (a-h, i-q, r-z): high scalability, low reliability.
Total time per DB $= \left(\frac{t r + 1}{D}\right) T N = \frac{A}{D} + \frac{B}{D}$
System reliability $= R_0 \, (R_p)^D \, (R_d)^D$
How derived: (a) writes = NT, total read = rN.tT; (b) total writes = NT, total read = rN.tT. Hence $A = t r T N$ and $B = T N$.
D = number of database servers
N = number of writes (REGISTER)
r = #reads/#writes = (INV+REG)/REG
T = write latency
t = read latency/write latency
P = number of proxy servers
Rp = reliability of the proxy server
Rd = reliability of the database server
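
A small numeric comparison makes the scalability/reliability tradeoff between the two designs concrete. The sketch below simply evaluates the slide's formulas; the function names and sample values are assumptions, and `R0` is left as a parameter defaulting to 1 because the slide does not define it.

```python
def time_per_db_redundant(N, r, T, t, D):
    """Design (a): every DB takes all N writes; reads are shared among D DBs."""
    return (t * r / D + 1) * T * N


def time_per_db_partitioned(N, r, T, t, D):
    """Design (b): both reads and writes are shared among D DBs."""
    return ((t * r + 1) / D) * T * N


def reliability_redundant(Rp, Rd, P, D):
    """Design (a): up if any proxy and any database is up."""
    return (1 - (1 - Rp) ** P) * (1 - (1 - Rd) ** D)


def reliability_partitioned(Rp, Rd, D, R0=1.0):
    """Design (b): all D proxy/database hosts must be up; R0 as on the slide."""
    return R0 * Rp ** D * Rd ** D


# Illustrative values: 1000 writes, r = 2 reads per write, 1 ms write latency,
# reads half as costly as writes, two databases, 99% reliable components.
print(time_per_db_redundant(1000, 2, 0.001, 0.5, 2),
      time_per_db_partitioned(1000, 2, 0.001, 0.5, 2))
print(reliability_redundant(0.99, 0.99, 2, 2),
      reliability_partitioned(0.99, 0.99, 2))
```

With these inputs, design (b) spends less time per database but has visibly lower reliability, matching the labels on the slide.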

Two stage architecture
(Figure: two-stage architecture. First stage: servers s1, s2, s3 (S=3) receiving REGISTER, INVITE, etc. Second stage: B=2 groups a and b, each a master/slave pair a1/a2 and b1/b2 (P=1+1), with the load divided by B.) When is the stateless proxy stage needed? What are the optimal values for S, B, P for the required scalability (1-10 million BHCA) and reliability (99.999%) using commodity hardware?

Performance evaluation Scalability result (UDP, stateless, no DNS, no mempool)
(Figure: calls/s for stage I (s) and stage II (p).) This means 10 million BHCA (busy hour call attempts) using S3P3. Stateful proxy gave similar graphs with 650 CPS for a single server. Line segments are due to non-uniform distribution in stage II; I have verified uniform distribution also. Registration test also gave similar graphs with about 2400 RPS (no auth). This means 10 million subscribers using S3P3. On commodity hardware: 3 GHz Pentium 4, 1 GB memory.

User diversity on incoming calls Is uniform hashing an issue?
Didn’t find any existing evaluation. Adaptive load balancing for identifier-based distribution: the issue is transferring registrations to the new DB. Related results: file popularity is Zipf; user popularity is more uniform (Poisson). Dynamic load balancing in web is useful if the service time range is more than two orders of magnitude (but all servers have the same pages). Intuitive: with 100,000 subscribers per server, you need to be very unlucky for one of those to be equivalent to thousands of subscribers.

User diversity on incoming calls Is uniform hashing an issue?
My argument (not sure): Assume user popularity for calls is Poisson with mean L. (independent calls for users) X is a random variable indicating number of calls a user receives If there are two servers, and assuming the user population gets uniformly distributed, each server gets roughly half the user population for each call rank. So user popularity again is Poisson with mean L. Thus mean (and variance) remains the same However if user popularity is Zipf, then it is different; variance can be high Alternative: If you plot number of calls vs users in decreasing order. Assume this is exponential. (analogy: calls/time to calls/users) Having two servers basically shrinks the x-axis to half. So each server has exponential graph with mean L/2

Reliability of two-stage
If S=P=B=3 and each server is 99% available, total: $(1-(1-R)^S)\,(1-(1-R)^B)^P = 99.9996\%$. P doesn’t have to be 1+1 for each group, but can be N+1 for N groups. One backup keeps all records in memory, but gets only updates in contact rather than each refresh. It doesn’t have to survive reboots. It clones itself to another DB when a failure occurs.

What is the best architecture?
Event-based: reactive system. Process pool: each pool process receives and processes to the end (SER). Thread pool: either receive and hand over to a pool thread (sipd), or each pool thread receives and processes to the end. Staged event-driven: each stage has a thread pool.
(Figure: message processing stages: recvfrom or accept/recv; parse; for a response, match transaction and modify response; for a request, REGISTER updates the DB, other requests look up the DB and are redirected/rejected or proxied; build request, DNS; sendto, send or sendmsg; stateless proxy path when no transaction is found.)

Stateless proxy UDP, no DNS, six messages per call
(Figure: processing stages for the stateless proxy: recvfrom or accept/recv, parse, match transaction, update or lookup DB, redirect/reject or proxy, build or modify request and response, DNS, sendto/send/sendmsg.) 16 processes or threads in the pool.

Stateful proxy UDP, no DNS, eight messages per call
Event-based single thread: socket listener + scheduler/timer Thread-per-message pool_schedule => pthread_create Thread-pool1 (sipd) Thread-pool2 N event-based threads Each handles specific subset of requests (hash(call-id)) Receive & hand over to the correct thread poll in multiple threads => bad on multi-CPU Process pool Not finished yet

Server performance Results of my measurements; effect of multi-processor
Both Pentium and Sparc took approx 2 MHz CPU cycles per call/s on single-processor.
Calls/s for stateless proxy, UDP, no DNS, 6 msg/call
Hardware columns, in order: 1 Pentium IV 3 GHz, 1 GB, Linux 2.4.20 (1xP); 4 Pentium, 450 MHz, 512 MB, Linux 2.4.20 (4xP); 1 UltraSparc-IIi, 300 MHz, 64 MB, Solaris (1xS); 2 UltraSparc-II, 300 MHz, 256 MB, Solaris (2xS). Values are listed in the order they appear on the slide; rows with fewer than four values have blank cells.
Event-based: 1550, 400, 150, 600
Thread per msg: 1300, 500, 100
Pool-thread per msg: 1400, 850, 110
Thread-pool: 1500, 152, 750
Process-pool: 1600, 1350, 160, 1000
Better performance as this includes mempool changes.
Calls/s for stateful proxy, UDP, no DNS, 8 msg/call
Software architecture further improves performance: S3P3 can support 16 million BHCA.
Hardware columns, in order: 1 Pentium IV 3 GHz, 1 GB, Linux 2.4.20 (1xP); 4 Pentium, 450 MHz, 512 MB, Linux 2.4.20 (4xP); 1 UltraSparc-IIi, 360 MHz, 256 MB, Solaris 5.9 (1xS); 2 UltraSparc-II, 300 MHz, 256 MB, Solaris 5.8 (2xS).
Event-based: 1150, 300, 160, 400
Thread per msg: 600, 175, 90
Thread-pool: 850, 340, 120
2 stage thread-pool: 1100, 550, 155, 500

Not much concurrency in stateful mode: needs more investigation

What is the best architecture?
Stateless: CPU is the bottleneck; memory is constant; process pool is the best; event-based is not good for multi-CPU; thread/msg and thread-pool are similar; thread-pool2 is close to process-pool. Stateful: memory can become the bottleneck; thread-pool2 is good, but not N x CPU; not good if P  CPU; process pool may be better (?)

In memory database in sipd
Call routing involves ( 1) contact lookups 10 ms per query (approx) Cache (FastSQL) Loading entire database is easy Periodic refresh Potentially useful for DNS lookups Web config SQL database Periodic Refresh Cache < 1 ms [2002:Narayanan] Single CPU Sun Ultra10 Turnaround time vs RPS

Thread-per request doesn’t scale
One thread per message Doesn’t scale Too many threads over a short timescale Stateless: 2-4 threads per transaction Stateful: 30s holding time Thread pool + queue Thread overhead less; more useful processing Pre-fork processes for SIP-CGI Overload management Graceful failure, drop requests over responses Not enough if holding time is high Each request holds (blocks) a thread Incoming Requests R1-4 Fixed number of threads R1 R2 R3 R4 Incoming Requests R1-4 Thread pool with overload control Throughput Thread per request Load

Avoid blocking calls in sipd
DNS: 10-25 ms (29 queries); cache: 110 to 900 CPS; internal vs external non-blocking. Logger: lazy logger as a separate thread. Date formatter: strftime() is 10% of REG processing; update the date variable every second. random32(): cache gethostid() (37 µs). Logger: while (1) { lock; writeall; unlock; sleep; }
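
The lazy logger loop above (`lock; writeall; unlock; sleep`) can be sketched as a background thread that drains a shared buffer. This is an illustrative Python analogue, not the sipd implementation, which is not shown in the slides. The class name `LazyLogger` and its parameters are assumptions, and unlike the slide's loop this version swaps the buffer under the lock and writes after releasing it.

```python
import threading


class LazyLogger:
    """Request threads append to a buffer; one thread writes it out periodically."""

    def __init__(self, sink, interval=0.05):
        self._sink = sink              # callable that writes one line
        self._interval = interval      # seconds to sleep between flushes
        self._lock = threading.Lock()
        self._pending = []
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def log(self, line):
        # Cheap for the caller: no I/O on the request path.
        with self._lock:
            self._pending.append(line)

    def _flush(self):
        with self._lock:
            batch, self._pending = self._pending, []
        for line in batch:
            self._sink(line)

    def _run(self):
        while not self._stop.is_set():
            self._flush()
            self._stop.wait(self._interval)   # the "sleep" step

    def close(self):
        self._stop.set()
        self._thread.join()
        self._flush()                          # write anything left over


lines = []
logger = LazyLogger(lines.append, interval=0.01)
logger.log("REGISTER processed")
logger.log("INVITE processed")
logger.close()
print(lines)
```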

Resource management in sipd
Socket management. Problems: OS limit (1024), “liveness” detection, retransmission. One socket per transaction does not scale. Global socket if downstream server is alive, soft state; works for UDP. Hard for TCP/TLS: apply connection reuse. Socket buffer size 64 KB to 128 KB; tradeoff: memory per socket vs number of sockets. For 64 KB: packet drop started at 1200 CPS, and throughput saturated at 1600 CPS for process pool. For 128 KB: packet drop started and saturated at 1700 CPS.
Memory management. Problems: too many malloc/free, leaks. Memory pool: transaction-specific memory, free once; also, less memcpy. About 30% performance gain. Stateful: 650 to 800 CPS; stateless: 900 to 1200 CPS.
Stateless processing time (µs), columns as labeled on the slide: INV 180 200 ACK BYE REG
W/o mempool: 155 67 95 139 62 237 70
W/ mempool: 111 49 48 64 106 41 202
Improvement (%): 28 27 33 24 34 15 31

Optimized processing in SER
Reduce copying and string operations Data lumps, counted strings (+5-10%) Reduce URI comparison to local User part as a keyword, use r2 parameters Parser Lazy parsing (2-6x), incremental parsing 32-bit header parser (2-3.5x) Use padding to align Fast for general case (canonicalized) Case compare Hash-table, sixth bit Database Cache is divided into domains for locking [2003:Jan Janak] SIP proxy server effectiveness, Master’s thesis, Czech Technical University

Bottlenecks and other scalability concerns addressed in SER
Protocol bottlenecks Parsing Order of headers Host names vs IP address Line folding Scattered headers (Via, Route) Authentication Reuse credentials in subsequent requests TCP Message length unknown until Content-Length Other scalability concerns Configuration: broken digest client, wrong password, wrong expires Overuse of features Use stateless instead of stateful if possible Record route only when needed Avoid outbound proxy if possible

Comparison of sipd and SER
sipd: thread pool; events (reactive system); memory pool; Pentium IV 3 GHz, 1 GB => 1200 CPS, 2400 RPS (no auth). SER: process pool; custom memory management; Pentium III 850 MHz, 512 MB => 2000 CPS, 1800 RPS.

Results of our measurement
Stateless proxy S=1050, P=900 CPS S3P3 => 10 million BHCA (busy hour call attempts) Stateful proxy S=800, P=650 CPS Registration (no auth) S=2500, P=2400 RPS S3P3 => 10 million subscribers (1 hour refresh) Memory pool and thread-pool2/event-based further increase the capacity (approx 1.8x)
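
The BHCA figure can be approximated from the per-server call rates. The sketch below assumes that capacity scales linearly with the number of servers per stage and that the slower stage limits the system; both are simplifying assumptions for illustration, not the measured result.

```python
def s3p3_bhca(stage1_cps: float, stage2_cps: float, servers_per_stage: int = 3) -> float:
    """Busy hour call attempts when the slower stage limits throughput."""
    system_cps = servers_per_stage * min(stage1_cps, stage2_cps)
    return system_cps * 3600


# Stateless proxy numbers from the slide: S=1050 CPS, P=900 CPS.
print(s3p3_bhca(1050, 900))
```

This gives about 9.7 million, in line with the quoted 10 million BHCA.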

3GPP’s IMS 3GPP (release 5)’s IP Multimedia core network Subsystem uses SIP
Proxy-CSCF (call session control function) First contact in visited network. 911 lookup. Dialplan. Interrogating-CSCF First contact in operator’s network. Locate S-CSCF for register Serving-CSCF User policy and privileges, session control service Registrar Connection to PSTN MGCF and MGW

Peer-to-peer Internet telephony

Comparison: Server-based, structured and unstructured P2P
Architecture: server-based (two-stage) | structured P2P (Chord/log(N)) | unstructured P2P (blind search)
Reliability (or user record availability): $1-(1-R)^P$ | upper bound | no guarantee
Performance (delay): low | log(N) | no guarantee; implementation dependent
Scalability (number of users): linear with number of servers | exponential with per-node capacity | if constant degree, then no limit

P2P vs server-based
Aspect: server-based | P2P
Scaling: server count | scales with user count, but limited by supernode count
Efficiency: most efficient | DHT maintenance = O((log N)^2), lookup = O(log N)
Security: trust server provider; binary | trust most supernodes; probabilistic
Reliability: server redundancy; catastrophic failure possible | unreliable supernodes; catastrophic failure unlikely

Structured vs. unstructured P2P-SIP
Unstructured P2P alone is not good: no guarantee (upper bound) in lookup; falls back to a server, so it has a central dependency. Caching is not very useful as contacts change often; caching is good for non-mutable data (e.g., certificates) because replication and caching of mutable data don’t go together well. Skype works: few super nodes are too burdened; node:supernode ~ 400:1; probably some kind of structure (user name prefix) among super nodes, and falls back to central server.

NAT traversal and P2P-SIP
Joining a DHT from behind a NAT: open issue. NATed nodes act as users of the DHT. Media traversal: STUN (not symmetric: 18%?), TURN, ICE. Hole punching: works with 60% symmetric.

Security open issues (threats, solutions, issues)
More threats than server-based Privacy, confidentiality Malicious node Don’t forward all calls, log call history (spy),… “free riding”, motivation to become super-node Existing solutions Focus on file-sharing (non-real time) Centralized components (boot-strap, CA) Assume co-operating peers works for server farm in DHT Collusion Hide security algorithm (e.g., yahoo, skype) Chord Recommendations, design principles, … Design principles: define verifiable system invariants and verify them Allow querier to observe the lookup progress Assign keys to nodes in a verifiable manner Server selection in routing may be abused Cross check routing tables using random queries Avoid single point of responsibility Pastry - exclude untrusted nodes with help of central certificate authority. Incorrect data served: verify by self certifying path names

Security Random thoughts
Un-trusted peers: Misrouting: easy to detect No-routing: hard (redundancy, reputation, “call itself”) Problems: Malicious program, copyright violation, stolen identity, privacy, free riding, sybil, programmable scripts, aliases Separate DHT from application Managed DHT: more trusted

Why P2P-SIP? What is our P2P-SIP?
Server-based: maintenance and configuration cost (dedicated administrator); central points of failure (catastrophic failures); depends on controlled infrastructure (e.g., DNS). Peer-to-peer: self-adjusting, robust against catastrophic failures, highly scalable, and no configuration; call setup and user search latency is higher, O(log(N)); security: how to handle malicious peers? Identity protection? Our P2P-SIP: hybrid architecture that works with both P2P and server-based; built-in P2P network: acts as a service node for proxy, registrar, presence, offline storage, and media relay; external P2P network: managed and trusted peer nodes; identity protection: email identifier == SIP identifier.

Deployment scenarios
Different scenarios have different trust models! P2P proxies P2P clients Plug and play; May use adaptors; Untrusted peers; Super-nodes Zero-conf server farm; Trusted servers and user identities Interoperate among these!

SIP-using-P2P Using an External P2P network (distributed hash table - DHT)
Data model: treat the DHT as a database. Service model: join the DHT to provide service.
There are two approaches to do the P2P-SIP operations. In the data model, the DHT is treated as a database with a put, get, remove API, and all operations are performed using this. In the service model, every P2P-SIP node joins the DHT as a service node and serves as registrar, proxy, presence agent and STUN/TURN server for other nodes. It uses a lookup, join and leave API. It is possible to layer them on one another: data model on top of service model is straightforward. Additionally, OpenDHT shows that service model on top of data model is also possible using the ReDiR interface.
Service model steps: [1] join( ) [2] lookup(H(bob)) gives [3] REGISTER sip:bob to [4] lookup(H(bob)) gives [5] INVITE sip:bob to
Data model steps: [1] put(k, ), k is H(bob) [2] get(k) gives [3] INVITE sip:bob to
(Figure: alice, bob, the DHT and service nodes, annotated with the numbered steps above.)
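
The data model steps can be sketched against a stand-in DHT. Everything here is hypothetical: `DictDHT` is an in-memory stub rather than the OpenDHT client API, the contact address is a documentation-range example, SHA-1 is assumed for H, and the signing and verification of contacts described elsewhere in the slides is omitted.

```python
import hashlib


class DictDHT:
    """Hypothetical in-memory stand-in for an external DHT (put/get/remove)."""

    def __init__(self):
        self._store = {}

    def put(self, key, value):
        self._store.setdefault(key, set()).add(value)

    def get(self, key):
        return sorted(self._store.get(key, ()))

    def remove(self, key, value):
        self._store.get(key, set()).discard(value)


def H(user: str) -> str:
    """Key for a user identifier (SHA-1 assumed here for illustration)."""
    return hashlib.sha1(user.encode("utf-8")).hexdigest()


dht = DictDHT()
# [1] bob registers: put(k, contact), k is H(bob)
dht.put(H("bob"), "sip:bob@192.0.2.10:5060")
# [2] alice looks up bob: get(k) gives the contact
contacts = dht.get(H("bob"))
# [3] alice would now send INVITE sip:bob to each returned contact
print(contacts)
# bob leaves: remove the contact
dht.remove(H("bob"), "sip:bob@192.0.2.10:5060")
print(dht.get(H("bob")))
```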

SIP-using-P2P Implementation in SIPc with the help of Xiaotao Wu
OpenDHT: trusted nodes; robust; fast enough (<1 s). Identity protection: certificate-based; SIP id == email. P2P for calls, IM, presence, offline message, STUN server discovery and name search. P2P clients better than proxies: fewer DHT calls; OpenDHT quota per client limits puts by proxies.
We have implemented P2P-SIP in our multimedia collaboration client, sipc, using OpenDHT running on PlanetLab with about 200 nodes. The advantage of using an externally managed DHT is that we can trust to some extent that the nodes are not malicious and perform the DHT operations (get/put) correctly. Thus the security problem is mostly avoided. Identity protection is provided using a well-known CA such as ours, which gives out the certificate to the user for her email address, so that the user can securely use her email address as the SIP identifier in P2P-SIP. The implementation includes the P2P modes for calls, IM, presence, offline message storage, STUN server discovery and name search (find the user identifier for “Firstname Lastname”). OpenDHT is robust and fast enough for our needs. Lookups on average take less than a second. We implemented redundancy and failover so that if one OpenDHT node is unavailable, it uses another randomly chosen closer node.

P2P-over-SIP Node architecture: registrar, proxy, user agent
(Figure: node architecture. User interface (buddy list, etc.): signup, find buddies, IM, call, signout, transfer. User location: discover and join on startup, leave on reset, find. DHT (Chord) using REGISTER, INVITE, MESSAGE. Peer found / detect NAT, multicast REGISTER, ICE. SIP, RTP/RTCP, codecs, audio devices. Layers labeled SIP-over-P2P and P2P-using-SIP.) DHT communication using SIP REGISTER. Known node: Unknown node: User:

P2P-over-SIP Implementation
sippeer: C++, Linux, Chord. Nodes join and form the DHT. Node failure is detected and the DHT updated. Registrations are transferred on node shutdown. Co-located sipc can use the sippeer service. (Figure: Chord ring with node ids 1, 9, 11, 15, 19, 25, 26, 29, 30, 31.)

Chord background Finger table: logN Stabilization for join/leave
Identifier circle. Keys assigned to successor. Evenly distributed keys and nodes. Finger table: log N entries; the ith finger points to the first node that succeeds n by at least $2^{i-1}$. Stabilization for join/leave.
Finger table of node 8 (key => node): 8+1 = 9, 8+2 = 10, 8+4 = 12 => 14; 8+8 = 16 => 21; 8+16 = 24 => 32; 8+32 = 40 => 42.
Lookup in the finger table for the furthest node that precedes the key. In a system with N nodes and K keys, with high probability: each node receives at most K/N keys; each node maintains information about O(log N) other nodes; lookups are resolved with O(log N) hops.
No network locality. Replicas need to have explicit consistency. Optimizations: location: weight neighbor nodes by RTT; when routing, choose the neighbor that is closer to the destination with the lowest RTT from me, to reduce path latency. Multiple physical nodes per virtual node. What if a node leaves?
(Figure: identifier circle with node labels 1, 8, 10, 14, 21, 24, 30, 32, 38, 42, 47, 54, 58.)
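
The finger table for node 8 and the "furthest node that precedes the key" lookup rule can be reproduced with a short sketch. The 6-bit identifier space and the exact node set below are assumptions chosen to match the finger table on the slide. This is a centralized toy model of Chord routing, not the sippeer code.

```python
from bisect import bisect_left

M_BITS = 6                                                # assumed: 2**6 = 64 ids
NODES = sorted([1, 8, 14, 21, 32, 38, 42, 47, 54, 58])    # assumed node set


def successor(key, nodes=NODES):
    """First node clockwise at or after key on the identifier circle."""
    key %= 2 ** M_BITS
    return nodes[bisect_left(nodes, key) % len(nodes)]


def finger_table(n, nodes=NODES):
    """The ith finger is successor(n + 2^(i-1)), for i = 1..M_BITS."""
    keys = [(n + 2 ** (i - 1)) % 2 ** M_BITS for i in range(1, M_BITS + 1)]
    return [(k, successor(k, nodes)) for k in keys]


def in_open(x, a, b):
    """True if x lies in the circular open interval (a, b)."""
    return a < x < b if a < b else (x > a or x < b)


def lookup(start, key, nodes=NODES):
    """Iterative lookup; returns (responsible node, number of hops)."""
    n, hops = start, 0
    while True:
        succ = successor(n + 1, nodes)
        if key == succ or in_open(key, n, succ):
            return succ, hops
        # Furthest finger that still precedes the key.
        nxt = next((f for _, f in reversed(finger_table(n, nodes))
                    if in_open(f, n, key)), n)
        if nxt == n:
            return succ, hops
        n, hops = nxt, hops + 1


print(finger_table(8))
print(lookup(8, 54))
```

In this toy ring, a lookup for key 54 starting at node 8 goes through nodes 42 and 47 before reaching node 54.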

Design alternatives Use DHT in server farm
Use DHT in server farm. Use DHT for all clients; but some are resource limited. Use DHT among super-nodes. Hierarchy. Dynamically adapt. (Figures: Chord ring of clients and servers; prefix routing example Route(d46a1c) through nodes 65a1fc, d13da3, d4213f, d462ba, d467c4, d471f1, d46a1c.)

SIP messages DHT (Chord) maintenance
Query the node at distance 2^k with node id 11: REGISTER To: From: SIP/ OK Contact: Update my neighbor about me: To: Contact: (Figure labels: nodes 1, 10, 22, 7, 15; Find(11) gives 15.)

SIP messages User registration Call setup and instant messaging
REGISTER To: Contact: Call setup and instant messaging INVITE To: From:

Adaptor for existing phones
Use P2P-SIP node as an outbound proxy ICE for NAT/firewall traversal STUN/TURN server in the node

Hybrid architecture Cross register, or Locate during call setup
DNS, or P2P-SIP hierarchy

P2P-over-SIP Analysis: scalability
Number of messages depends on:
Number of peer nodes (N)
Keep-alive (rs) and finger table refresh rate (rf)
Call arrival distribution (c)
Node join, leave, failure rates (λ)
Number of users (k.N)
User registration refresh rate (t)
M = {rs + rf.(log(N))^2} + c.log(N) + (k/t).log(N) + λ.(log(N))^2/N
Number of nodes = f(node-capacity): Nmax ≈ min[2^(M/(r+c)), 2^(M/r)] for large N
Measured M = 800 reg/s and assuming aggressive refresh and call rate of 1/min, it gives 2^2219 nodes >> 2^160. Even for a conservative 10 req/s capacity, it gives more than 16 million nodes (super nodes) in the network.

P2P-over-SIP Analysis: availability and call setup latency
To increase user availability: Increase keep-alive rate (fast failure detection) Increase user registration refresh rate (reduce unavailability) Replicate user and node registrations Call setup latency: Same as DHT lookup latency: O(log(N)) Calls to known locations (“buddies”) are direct DHT optimization further reduces latency Chord: nodes => 6 hops At most a few seconds User availability and retransmission timers Advanced services also possible. Related systems: the ConChord system for certificate storage, the Tarzan system for anonymous communications, and the proposal by Cox, Muthitacharoen, and Morris to use Chord as a replacement for the standard DNS hierarchy; this last proposal is the most similar in spirit to this work.

Skype From the KaZaA community
Host cache of some super nodes; bootstrap IP addresses. Auto-detect NAT/firewall settings, similar to STUN and TURN. Protocol among super nodes – ?? Allows searching a user (e.g., kun*). History of known buddies. All communication is encrypted. Promote to super node based on availability, capacity. Conferencing. Problems: proprietary, single service, centralized login.
Super-node election: capacity (bandwidth, storage, CPU) and availability (connection time, public address). Read on STUN and TURN – what kind of NAT they support etc. Symmetric NAT: connections from the same internal IP:port to different destinations use different external addresses. Full cone, restricted cone and port-restricted cone: the same internal IP:port maps to the same external address for every destination; inbound packets are accepted from any source, only from a destination IP already contacted, or only from a destination IP:port already contacted, respectively. TURN: acts as a relay for UDP or TCP. ICE: the caller collects all its external addresses and sends them to the callee; the callee tries STUN to all of them and sends packets to whichever works. The callee also collects all its external addresses and sends them to the caller; the caller reuses the callee's address or tries STUN. Both periodically keep trying STUN.

Related work P2P networks Skype and related systems P2P-SIP telephony
Unstructured (Kazaa, Gnutella,…) Structured (DHT: Chord, CAN,…) Skype and related systems Flooding based chat, groove, Magi Skype-in/out uses SIP P2P-SIP telephony Proprietary: NimX, Peerio, File sharing: SIPShare Now in IETF: W&M, Avaya, Panasonic, … Mercora – legal music sharing. S2S (science to science) is a P2P search engine built using JXTA. SOS: source -> access point -> beacon -> secret servlet -> filter/firewall -> target. The target picks secret servlets. Beacons know about secret servlets.

Why we chose Chord? Just as a simple example DHT
Chord can be replaced by another As long as it can map to SIP High node join/leave rates Provable probabilistic guarantees Easy to implement X proximity based routing X security, malicious nodes

Why we chose Chord? Reliability of Chord under churn
If 50% of nodes fail, only 3% of lookups fail. If the successor list length is O(log N) and the node failure probability is 50% in a stable network, lookups still complete in O(log N) hops. Becomes strongly stable after O(N^2) rounds of strong stabilization. If at most N/2 failures happen in log N steps, then it becomes stable in O(N^3).

Chord stabilization vs availability
Increasing message rate to 1.25 msg/sec/node allows < 0.4% failed lookups

JXTA vs Chord in P2P-SIP JXTA P2P-SIP
Protocol for communication (peers, groups, pipes, etc.) Stems from unstructured P2P P2P-SIP Instead of SIP, JXTA can also be used Separate search (JXTA) from signaling (SIP) JXTA: peer discovery, resolver, information, rendezvous, pipe binding, endpoint routing protocols. Peers, groups, messages, pipes, advertisement,

Node startup SIP DHT Dialing out REGISTER with SIP registrar
columbia.edu DB sipd SIP REGISTER with SIP registrar DHT Discover peers: multicast REGISTER Join DHT using node-key=Hash(ip) REGISTER with DHT using Dialing out Call, instant message, etc. INVITE MESSAGE Last seen, SIP NAPTR/SRV, DHT REGISTER Detect peers REGISTER alice=42 42 58 12 14 REGISTER bob=12 32

Node leaves Graceful leave Failure Un-REGISTER Transfer registrations
Attached nodes detect and re-REGISTER New REGISTER goes to new super-nodes Super-nodes adjust DHT accordingly REGISTER key=42 REGISTER DHT OPTIONS 42 42

Advanced services Offline messages Conferencing Inter-domain
Offline messages: INVITE or MESSAGE fails => responsible node stores voicemail, instant message. Conferencing: mixer, full mesh, multicast. Inter-domain: local DHT; connected by DNS or global DHT.

Enterprise IP telephony

Goal of my work. Beyond voice: video, text, IM, presence, screen sharing, shared web browsing, … Beyond SIP phone: regular telephone, email, web, … Beyond synchronous communication: offline mails, discussion forum, file sharing, … (Figure: Internet telephony protocol stack. Labels: physical layer, link layer, network (IPv4, IPv6), transport (TCP, UDP), application layer; signaling, quality of service, media transport, media; SIP, SAP, SDP, RTSP, RSVP, RTCP, RTP, G.711, MPEG; program call routing, VoiceXML, DTMF, mixing, speech/text; Internet telephony, radio/TV, messaging and presence, interactive voice response, unified messaging, video conferencing.)

Related work IP telephony and multimedia communication
Unlike low cost VoIP: Vonage, AT&T We provide enterprise infrastructure There are enterprise IPtel: Cisco, Nortel But redundancy architecture, interoperability, distributed components model differ Collaboration: CSCW, SIGGROUP, Breeze Unlike web-centric, or application specific We provide standard-based multimedia collaboration platform Multimedia conferencing: Mbone, H.323 Ours is SIP-based infrastructure, reuse existing tools and protocols such as RTSP, media server Distributed software development – CHIME (kaiser)

Related work Comprehensive multi-platform collaboration
Goal: Alternate between synchronous and asynchronous communication, and access from different devices and clients. Synchronous (tightly coupled): video conference, IM, screen sharing, floor control, … Asynchronous (loosely coupled): file sharing, message board, … Messaging and notifications. Personalized view: per-user calendar, access control, address book. We try to incorporate… Long-lived groups: design teams, committees, college classes. Asymmetric events: lecture and lecture series. Short-lived spontaneous interaction. Current practice: email, teleconference; vendor-specific tools, platform dependence; application specific, e.g., collaborative software development.

Multi-party collaboration What is done, and what is left.
Sipconf: conference server Audio, video, IM, screen, shared browsing, floor control No XCON yet: use web interface Small to medium size conferences Cascaded conference mixer #participants, audio delay Failover State sharing between servers

Communication to collaboration
Comprehensive. Personalized view: calendar, address book, groups and access control. Synchronous (tightly-coupled) collaboration: conferencing with audio, video, IM, white-board, screen sharing, shared web browsing. Asynchronous (loosely-coupled) collaboration: unified messaging, shared files, discussion forum, notification. Multi-platform (device): telephone with touch tone input and audio (IVR); PC with multimedia client, email, IM. Reuse existing protocols and tools. Unified messaging: the gaps among different media (audio, video, text), devices (PC, phone) and means of communications (email, SIP, IM) disappear for messaging.
Current practice is to use email and telephone conferences, vendor-specific or platform-dependent tools, or application-specific collaboration. Our goal is to provide synchronous and asynchronous collaboration, to alternate between the two, and to allow access from a variety of devices such as telephone and PC using tools such as email, IM, PC client or audio client.

What’s next? from multimedia communications to collaboration
Synchronous communications: conferencing, IM. Asynchronous communications: voicemails, message board, file sharing. Ubiquitous computing: i-button, ID-card. Service creation: user friendly, end-point/network. One of the most important advantages of Internet telephony is that it can provide more innovative services and can create services more efficiently.

Voicemail Goals Why SIP and RTSP? Universal access Scalability
Goals: universal access, scalability, provider independent. Why SIP and RTSP? Reuse existing infrastructure and tools. Design goals: message recording and playback; answering machine and voicemail; universal access: web, email, VoIP, PSTN, notification; scalability for large domains; separable from ITSP or ISP; reuse existing infrastructure; media-agnostic; tool-agnostic; telephony interface (DTMF). (Figure: message flow involving sipum and rtspd: (1) INVITE, INVITE, OK, CANCEL; (2) SETUP; (3) OK; (4) RTP; (5) BYE.)

Web interface Retrieval Configuration Web interface
rtsp://server/alice /inbox/1677.au, press 1 to listen… Configuration: folders, options. Retrieval or deletion: retrieval using RTSP clients (Quicktime), SIP user agent (e*phone) or Web browser. Features: integration with web/email for more control over voicemail configuration (e.g., folder management, notification). Web-based voice mail accounts for users (similar to Hotmail).

Conferencing models (non-multicast)
(Figure: participants A, B, C, D; each receives the mix of the other three, e.g. B receives A+C+D.) Topology: star, full-mesh, ad-hoc. Advantages: heterogeneous simple clients; no central point of failure. Disadvantages: external server with high bandwidth link; complex endpoints; complex signaling; typically only three party conferences.

Conference

Conference D E A B C G711 DVI GSM M=A+B+C M - A=B+C M - B M - C
Linear Playout delay Periodic timer M=A+B+C Mixed linear M - A=B+C M - B M - C Receive Send G.711, GSM, DVI, Speex, G.722 mixing (decode-mix-encode) Video replication; IM; text; VNC screen sharing; floor control; IVR for joining Optimization possible for same codecs
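The mixing rule on this slide (M = A+B+C, and each participant is sent M minus its own contribution) can be illustrated with a small Python sketch. It assumes the frames have already been decoded to 16-bit linear samples of equal length; the function name and the clipping step are assumptions for illustration, not the libconf API.

```python
def mix_minus(frames):
    """Mix one packetization interval of decoded (linear PCM) audio.

    frames maps a participant name to a list of linear samples.
    Returns, per participant, the mix of everyone else: M - own samples.
    """
    # M = A + B + C, computed sample by sample
    total = [sum(samples) for samples in zip(*frames.values())]

    def clip(sample):
        # keep the result inside the 16-bit linear range before re-encoding
        return max(-32768, min(32767, sample))

    return {
        name: [clip(m - own) for m, own in zip(total, samples)]
        for name, samples in frames.items()
    }


if __name__ == "__main__":
    decoded = {"A": [100, -200], "B": [10, 20], "C": [1, 2]}
    for name, mixed in mix_minus(decoded).items():
        print(name, "hears", mixed)
```

Computing the total once and subtracting per participant keeps the work linear in the number of participants; each result would then be re-encoded with that participant's codec.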

Performance evaluation
Increase in parameter value CPU Bandwidth Delay Packetization interval (T) Reduces Increases Codec bitrate (B) N/A Codec complexity (M) Network jitter (J)
CPU usage = (α.P + β).C, where α = (Me + a.B’ + b), β = (Md + c.B’ + d) and B’ = B + 320/T, for C conferences, each with P participants. a, b, c, d are constants; b and d are comparatively insignificant.
For the G.711 codec, Me and Md are insignificant (5.5 and 1.7 µs), thus CPU = C.(a.P + c).(B + 320/T).
For GSM, G.722 (or G.723.1), Me and Md are dominant (70 and 30/50 µs), thus CPU = C.(Me.P + Md).

Performance evaluation
Delay less than 20 ms: increases from first to last participant in a conference About 480 participants in a single conference with one speaker Packetization interval of 40 ms gives better performance: 720, but increases delay too About 80 four-party conferences Memory used 20 kB per call or participant Both Pentium and Sparc took about 6 MHz/participant

CINEMA My contribution in design and implementation
CINEMA Applications RTSP media server SIP/VoiceXML browser SIP/H.323 gateway SIP/RTP conferencing SIP/RTSP unified messaging SIP proxy server rtspd sipvxml sip323 sipconf sipum sipd Flite Xerces-C Xerces-C OpenH323 CINEMA Libraries libsip Basic SIP library rtplib++ RTP library libcine Utilities parsing IPv6 librtsp RTSP client libsipapi SIP UA library libconf RTP audio mixer libmedia Recording, files libNT Win32 stub libdict Hash table libdb++ mySQL interface libsnmp SIP MIB libcanon canonicalize MySQL PWLib Resparse … and web-based GUI C/C++: 60K out of 187 KLOC Tcl: 30 KLOC

Contribution Sip-h323: signaling translator
Background: ITU-T’s H.323 Binary ASN.1 PER, collection of protocols (H.245, H.225.0, Q.931, RAS, H.450.x) H.323 gatekeeper similar but not same as SIP server Problems in interworking Multi-stage dialing in H.323v1 Fast start in v2 is optional User registration Both SIP and H.323 users should be reachable Session description is more complex End system should select the codecs Security and QoS: end-to-end or not? Solution List different scenarios No modification in SIP or H.323 Direct RTP traffic if possible Implementation

Contribution Sipum: Unified messaging using SIP and RTSP
Problem: existing systems have voicemail with PBX or phone, or send voice attachments in email. Downloading the whole message is not desirable. Solution: using existing standards (RTSP, SIP) and tools (web, media player). Distributed components for different architectures (PBX, phone, service provider). Many ways to retrieve your message (RTSP, SIP, phone, web). Message deletion issues. Call reclaiming. Implementation.

Contribution Sipconf: Centralized conferencing using SIP/RTP
Problem Multicast is not available and ad hoc conference is useful for small number of users Heterogeneous clients (some have video also; or different audio codecs) Solution Audio mixer, video forwarder IM, VNC screen sharing, shared web browsing Playout delay adjustments Web based configuration, floor control G.711 A/Mu, G.721, DVI, ADPCM, G.722, … Modular: libconf, libmedia, rtplib++ Implementation and performance evaluation

Contribution Sipvxml: SIP-based VoiceXML browser
Background VoiceXML for touch tone-based service programming Backend scripts (CGI) or servlets Problem Then existing solutions were PSTN based Solution First SIP-VoiceXML implementation SIP interface (works with PSTN via a gateway) Example cgi scripts Calling card service Joining a conference (Ajay) Accessing voice mail (Ajay) by phone (Pimrampai) Auto attendant (Sean)

Contribution libsip++: SIP user agent library in C++
All the applications (sipum, sipconf, siph323, sipvxml) use a common underlying library Similar API for H.323 defined using wrapper around openH323 Unlike JAIN-SIP or SIP servlet, libsip++ is more high level with facility to access low level features Dialog, call, endpoint, registration are defined as objects (JAIN-SIP 1.1 added dialog as object) Uses underlying transaction and parsing library shared with sipd Test user agent (sipua) is used as tools, e.g., for sipconf testing Documentation is at

Contribution GUI: web-based user interface
Configuration, user profile, etc., stored in SQL DB. Front end as web-based GUI: CGI scripts in Tcl; about 100 pages for various configuration; user friendly (beginner vs advanced, context help). Asynchronous collaboration: voicemail, file sharing, IM archive, groups, address book, calendar. Undergone three iterations. See current version at

Interworking between SIP and H.323
Transport Layer SIP SDP RTP Codecs RTCP Terminal Control/Devices IP and lower layers TCP UDP TPKT Q.931 H.245 RAS RTCP RTP Codecs Terminal Control/Devices Both use RTP for media thus allows scalable translator SDP is simple. H.245 is very exhaustive and can represent inter-codec dependencies. H.323 has multi-stage call setup. SIP has single stage. H.323 single step fast connect is optional Basic calls are possible to translate. Complete interworking is not possible without modifying (conference, security).

SIP vs H.323 Both use RTP/RTCP over UDP/IP Binary ASN.1 PER encoding
Text based request response SDP (media types and media transport address) Server roles: registrar, proxy, redirect Binary ASN.1 PER encoding Sub-protocols: H.245, H.225 (Q.931, RAS, RTP/RTCP), H.450.x... H.323 Gatekeeper SIP and H.323 are different, particularly, in terms of complexity, scope of feature definitions and session description. Both use RTP/RTCP over UDP/IP

Interworking Problems Call setup translation
H.323 SIP Q.931 SETUP INVITE Destination address Q.931 CONNECT 200 OK Terminal Capabilities Media capabilities (audio/video) Terminal Capabilities ACK Open Logical Channel Media transport address (RTP/RTCP receive) Open Logical Channel
Problems in call setup translation: three pieces of information are needed for call setup: the destination signaling address, the self and remote media capabilities, and the self and remote media transport addresses. SIP carries these in INVITE and its response. H.323 spreads them across different stages. Mapping multi-stage dialing in H.323 to single-stage dialing in SIP is not trivial. H.323v2 Fast Start supports single-stage dialing, but it is optional.

Interworking Problems User Registration
? H.323 H.323 Gatekeeper SIP registrar SIP H.323 terminal SIP user agent Alias: Henry E164: 7040 Problems in user registration: User may be registered in either H.323 or SIP network, and should be callable from either H.323 or SIP endsystems. Location independent user identifier ? Use information from both networks

Interworking Problems Media Description
SIP/SDP (dynamically choose from listed modes) List of alternative set of algorithms. audio G.711 Mu law, G.723.1, G.728 video H.261 H.323/H.245 (declare your exact modes) Supports inter-media constraints { [G.711 Mu law, G.711 A law][H.261 video]} { [G.723.1] [no video] } H.245 can represent media constraints as shown in the example. SDP is very simple and can not specify such constraints. How do we translate H.245 capabilities to SDP ? One approach is to use multiple SDP in SIP message. Our approach of “maximal intersection” does not need such things. Other problem, how to allow selection of audio/video algorithms by the end-systems, instead of by the signaling gateway. Translation in both directions Algorithm selection by end-systems

Interworking Problems Common subset of media capabilities
One side advertises descriptors D1 = {[a1,a2,a3],[v1,v2]} and D2 = {[a1,a4,a5],[v1]}; the other side advertises D1’ = {[a1,a4,a2],[v1,v2,v3]} and D2’ = {[a1,a2,a5],[v1,v3]}.
Within a pair of descriptors, S1^S1’, S2^S2’: {[a1,a2],[v1,v2]}; S1^S2’, S2^S1’: {[ ],[ ]}.
D1 and D1’: {[a1,a2,a3],[v1,v2]} with {[a1,a4,a2],[v1,v2,v3]} gives {[a1,a2],[v1,v2]}
D2 and D1’: {[a1,a4,a5],[v1]} with {[a1,a4,a2],[v1,v2,v3]} gives {[a1,a4],[v1]}
D1 and D2’: {[a1,a2,a3],[v1,v2]} with {[a1,a2,a5],[v1,v3]} gives {[a1,a2],[v1]}
D2 and D2’: {[a1,a4,a5],[v1]} with {[a1,a2,a5],[v1,v3]} gives {[a1,a5],[v1]}
Result: {[a1,a2],[v1,v2]} {[a1,a4],[v1]} {[a1,a5],[v1]}
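A minimal Python sketch of a "maximal intersection" computation that reproduces the result on this slide. The representation is an assumption: each descriptor is a tuple (audio set, video set), descriptors are intersected pairwise per media type, and any intersection that is contained in another one is dropped. The function names are hypothetical and this is not the sip323 code.

```python
from itertools import product


def intersect(d1, d2):
    """Intersect two capability descriptors media type by media type."""
    return tuple(s1 & s2 for s1, s2 in zip(d1, d2))


def contained_in(d, other):
    """True if d offers nothing that other does not already offer."""
    return d != other and all(s <= o for s, o in zip(d, other))


def maximal_intersection(side1, side2):
    """Pairwise intersections of both sides, keeping only the maximal ones."""
    candidates = {intersect(d1, d2) for d1, d2 in product(side1, side2)}
    candidates = {d for d in candidates if any(d)}   # drop all-empty results
    return {d for d in candidates
            if not any(contained_in(d, o) for o in candidates)}


def desc(audio, video):
    """Build a descriptor from strings such as 'a1 a2' and 'v1 v2'."""
    return (frozenset(audio.split()), frozenset(video.split()))


SIDE_1 = [desc("a1 a2 a3", "v1 v2"), desc("a1 a4 a5", "v1")]        # D1, D2
SIDE_2 = [desc("a1 a4 a2", "v1 v2 v3"), desc("a1 a2 a5", "v1 v3")]  # D1', D2'

if __name__ == "__main__":
    for audio, video in maximal_intersection(SIDE_1, SIDE_2):
        print(sorted(audio), sorted(video))
```

Here {[a1,a2],[v1]} is dropped because it is contained in {[a1,a2],[v1,v2]}, which leaves the three descriptors listed as the result.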

Interworking Problems Call Services
H.323 Conferencing: centralized signaling control, MC (Multi-point Controller) Supplementary services, like call transfer: H.450.x SIP Conferencing: centralized bridged + decentralized distributed REFER, 3pcc
H.323 has centralized control for conferencing. SIP supports both centralized and distributed conferencing. H.323 defines a separate H.450.x protocol for each supplementary service (call transfer, call diversion, call hold, and so on). In SIP, crucial pieces are identified (e.g., REFER) and additional services are built upon them.

Interworking Problems Security and QoS
H.323 uses H.235, whereas SIP uses Basic, Digest, TLS, S/MIME. Media traffic end-to-end; QoS? SIP relies on an external protocol for QoS. In H.323, QoS goes hand-in-hand with call establishment. Since QoS requires end-to-end RSVP signaling (if RSVP is used), you may not be able to have direct RTP channels between SIP and H.323 entities.

What we want ? Transparent translation
Minimum modification in SIP or H.323 Use features from both SIP and H.323 Direct RTP/RTCP traffic; end-to-end User should be able to dial an internet address or an alias address or an URI and should be able to reach the intended destination, no matter whether it is in SIP or H.323 (or PSTN) network. Interworking should be able to utilize the services of SIP servers (with CGI/CPL capabilities) as well as that of H.323 gatekeepers. If an H.320-H.323 gateway is available somewhere, then a SIP user should be able to reach an H.320 user via sip-h323 and H.323-H.320 gateways, for example. Since both SIP and H.323 use RTP/RTCP for media, it is desirable to have direct media connection without routing media packets through the signaling gateway, to reduce delay.

User registration Registration info to foreign network
REGISTER Contact:pc1 SIP registrar server H.323 Gatekeeper + SGW pc1.office.com home.com INVITE RRQ Contact: 3xx Moved Contact:pc1 SIP user agent H.323 terminal To solve the user registration problem: registration information should be transported to the foreign network. This allows for three different architectures: Signaling gateway co-located with H.323 gatekeeper Signaling gateway co-located with SIP registrar/proxy Independent signaling gateway. The first scenario is shown in an example. H.323 terminal’s registration information is transported to SIP network. Now any SIP entity can also reach the H.323 terminal. In the second architecture, SIP registration information is transported to H.323 gatekeepers, which can then resolve the SIP addresses also. If the signaling gateway is independent of the registration servers, it can use SIP OPTIONS and/or H.323 Location Request (LRQ) to check the validity of the address in SIP and H.323 networks respectively. use SIP REGISTER and/or H.323 RRQ/RCF

SIP registrar server + SGW
User registration Registration info from foreign network LRQ/LCF SIP registrar server + SGW H.323 Gatekeeper pc1.office.com home.com INVITE RRQ Contact: 200 OK SIP user agent H.323 terminal use SIP OPTIONS and/or H.323 LRQ/LCF

User registration Different Architectures
SGW co-located with H.323 gatekeeper SGW co-located with SIP registrar/proxy server Independent SGW

Call Setup with H.323v2 Fast Start
(Almost) One-to-one mapping between SIP and H.323 messages. H323 SIP Setup/FastStart INVITE Connect/FastStart 200 OK. ACK With H.323 fast start, there is a one-to-one mapping between the SIP and H.323 messages, as the media description is also carried in H.323 Setup and Connect messages. Translation is simple. RTP/RTCP Reverse direction is similar

Call Setup without Fast Start
SIP Q.931 SETUP INVITE Destination address Q.931 CONNECT 200 OK Terminal Capabilities Media capabilities (audio/video) Terminal Capabilities ACK Open Logical Channel Media transport address (RTP/RTCP receive) Open Logical Channel Remember the three pieces of information for call setup. Accept the call from H.323, forward to SIP after OLC ? Not desirable.

Call Setup without Fast Start, SIP to H.323
Setup/Q931 INVITE Signaling Gateway Connect/Q931 Capabilities/H245 Capabilities/H245. Media Transport Address 200 OK. Open Logical Channel/ H245 Acknowledgement Open Logical Channel / H245 ACK Acknowledgement RTP/RTCP
Without Fast Start, translation is simple in the SIP to H.323 direction, because all three pieces of information needed for call setup are available in the SIP INVITE message. They can be split across multiple H.323 messages. Once the media transport addresses of the H.323 endpoint are known from OpenLogicalChannelAck, a confirmation can be sent to the SIP endpoint. RTP and RTCP are carried directly between end systems, since each end system knows the media transport address of the other.

Call Setup without Fast Start, H.323 to SIP
Setup/Q931 INVITE Signaling Gateway Connect/Q931 200 OK Capability Exchange ACK Media Transport Address Open Logical Channel Re-INVITE/SIP+SDP Acknowledgement Open Logical Channel Acknowledgement RTP/RTCP
Without Fast Start in the H.323 to SIP direction: the signaling gateway forwards an INVITE to SIP on receipt of Setup from H.323. This INVITE contains dummy SDP. Once the confirmation is received from the SIP endpoint, Connect is sent on H.323. Since the INVITE response from SIP contains the SDP of the SIP side, it can be used to send and acknowledge H.323 capability negotiation and logical channel messages. Once the acknowledgements for all the logical channel messages in the SIP to H.323 direction are received, the signaling gateway knows the media transport address of the H.323 endpoint and can send a re-INVITE with new SDP to the SIP endpoint. RTP and RTCP are carried directly between end systems, since each end system knows the media transport address of the other.
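The H.323 to SIP sequence described above can be sketched as a small event-driven translator. This is a simplified illustration under stated assumptions: a single logical channel, event and message names chosen for readability, and no error handling. It is not the sip323 implementation.

```python
def translate_h323_to_sip(events):
    """Map incoming events at the signaling gateway to the messages it sends.

    events is a list of (name, payload) pairs; the return value is a list of
    (network, message, payload) triples in the order they would be sent.
    """
    sent = []
    for name, payload in events:
        if name == "Setup":
            # H.323 caller has not revealed its media address yet,
            # so the INVITE carries placeholder (dummy) SDP.
            sent.append(("SIP", "INVITE", "dummy SDP"))
        elif name == "200 OK":
            # The SIP answer carries the SIP side's SDP; use it to
            # drive the H.245 capability and logical channel exchange.
            sent.append(("SIP", "ACK", None))
            sent.append(("H.323", "Connect", None))
            sent.append(("H.323", "TerminalCapabilitySet", payload))
            sent.append(("H.323", "OpenLogicalChannel", payload))
        elif name == "OpenLogicalChannelAck":
            # Now the H.323 media transport address is known:
            # update the SIP endpoint with a re-INVITE.
            sent.append(("SIP", "re-INVITE", payload))
    return sent


if __name__ == "__main__":
    flow = translate_h323_to_sip([
        ("Setup", None),
        ("200 OK", "SDP of SIP endpoint"),
        ("OpenLogicalChannelAck", "media address of H.323 endpoint"),
    ])
    for network, message, payload in flow:
        print(f"{network:6s} {message:22s} {payload or ''}")
```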

Media Capability Modify SIP/SDP : multiple capability sets, or...
Let the SGW choose a sub-set of capabilities for the SIP side. Re-INVITE or change in H.323 mode or logical channels, whenever it changes. The signaling gateway maintains capability sets of both SIP and H.323 endpoints in each direction. These capability sets are used to calculate the maximal intersection of capability sets in each direction. Then an operating mode is derived for each direction (for each media type) by selecting one algorithm from the alternative set in the maximal intersection. The H.323 logical channels and SIP SDP are formatted using these operating modes. Whenever there is a change in operating modes, SIP re-INVITE and/or H.323 ModeRequest/Open and CloseLogicalChannel are sent as appropriate.

Session description INVITE alice@home.com
Bob (INVITE): I can support µ-law and G.729. Send me audio at :6780. Alice (OK): I can support µ-law. Send me audio at :8000. RTP to port 8000; RTP to port 6780.
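The exchange on this slide can be sketched as an SDP offer and answer built in Python. The host addresses (taken from the 192.0.2.0/24 documentation range) and the helper names are assumptions for illustration; the payload type numbers 0 (PCMU, i.e. µ-law) and 18 (G729) are the static RTP assignments.

```python
PAYLOAD_TYPES = {"PCMU": 0, "G729": 18}   # static RTP/AVP payload types


def make_sdp(user, host, port, codecs):
    """Build a minimal audio-only session description."""
    payloads = " ".join(str(PAYLOAD_TYPES[c]) for c in codecs)
    lines = [
        "v=0",
        f"o={user} 0 0 IN IP4 {host}",
        "s=-",
        f"c=IN IP4 {host}",
        "t=0 0",
        f"m=audio {port} RTP/AVP {payloads}",
    ]
    lines += [f"a=rtpmap:{PAYLOAD_TYPES[c]} {c}/8000" for c in codecs]
    return "\r\n".join(lines) + "\r\n"


def choose_codecs(offered, supported):
    """Answerer keeps the offered codecs it also supports, in offer order."""
    return [c for c in offered if c in supported]


if __name__ == "__main__":
    # Bob's offer in the INVITE: mu-law and G.729, receive on port 6780
    offer = make_sdp("bob", "192.0.2.10", 6780, ["PCMU", "G729"])
    # Alice's answer in the 200 OK: mu-law only, receive on port 8000
    common = choose_codecs(["PCMU", "G729"], {"PCMU"})
    answer = make_sdp("alice", "192.0.2.20", 8000, common)
    print(offer)
    print(answer)
```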

Presence/event notification
Y Presence/event notification office.com Presence server PUA PA REGISTER SUBSCRIBE PUA + PA NOTIFY PUA registrar General event notification method for Internet presence, conferencing, device control Instant messaging MESSAGE with text body

Columbia IM and presence

Network call control SIP-CGI CPL SIP servlets SIP proxy Y Urgent
Priority.pl SIP_FROM SIP_TO stdin CGI-PROXY-REQUEST stdout SIP-CGI CPL SIP servlets SIP proxy Urgent Low-priority Voicemail Phone if (defined $ENV{SIP_FROM} && $ENV{SIP_FROM} =~ { foreach $reg (get_regs()) print "CGI-PROXY-REQUEST $reg SIP/2.0\n"; print "Priority: urgent\n\n"; } SIP-CGI: Programming language independent. Maintains state via an opaque token. For SIP proxies and endpoints: call routing, controlling forking, call rejection, call modification (Priority, Call-Info). RFC Upload via web or REGISTER

Call transfer REFER Blind/ consultation/ attended A B C active call
REFER C Referred-By: B INVITE C Referred-By: B BYE A active call

B2BUA and third-party call control
Back-to-back UA Incoming call triggers outgoing call Services Calling card Anonymizer B SIP A C INVITE OK (SDP1) ACK INVITE (SDP1) OK (SDP2) ACK INVITE (SDP2) OK ACK

Voicemail Problems in PSTN Design alternatives Issues
Voicemail. Endpoint redirects. Redirect after 10s. Proxy controls. Voicemail acts like a phone. vmail.pl. CFB = call forward busy. If no response, accept after 15s. (CGI script interface: SIP_FROM, SIP_TO, stdin, CGI-PROXY-REQUEST, stdout.)
Problems in PSTN: voice mail system tied to PBX or phone company. Integration of video, fax, whiteboard? How to integrate with Internet telephony? How to integrate with email, web and other user applications?
Existing solutions: Voice Profile for Internet Messaging (VPIM); web-based unified messaging systems with personalized PSTN voice mail number.
Issues: call reclaiming; deleting voice/video mail; integration with PSTN phone; integration with VPIM, IMAP, POP3.

VoiceXML PSTN Internet Gateway sipvxml Telephone Internet user
Voice gateway Web server Service logic (CGI, servlet, JSP) Voice and telephony functions VoiceXML browser Internet user VXML HTML Internet Telephone IVR platform Voice and telephony functions (ASR, TTS, DTMF) Service logic (application specific)

VoiceXML contd.
VoiceXML: <form> <field name="id"> <prompt> Your ID, please. </prompt> </field> <block> <submit next="url"/> </block> </form>
HTML: <form action="url"> Enter your Id: <input name="id"> <input type="submit"> </form>
Telephony, speech synthesis or audio output, user input and grammar, program flow, variable and properties, error handling, …

Columbia sipvxml PSTN Internet Unified messaging access Email by phone
Telephone SIP/PSTN gateway Unified messaging access by phone Event notification and scheduling Audio volume control for conference Advanced conference control TTS, ASR, DTMF, XML, HTTP, RTSP sipvxml SIP phone Media server rtspd Web server .tcl

Email + phone Email by phone Email procmail important mails Email to
Inbox procmail important mails Internet to IM Login formatting Listen, reply, delete, compose, forward Navigation -next, previous, jump SIP SIP formatting SIP based Text-to-speech VoiceXML browser TTS IM Call SIP HTTP Internet JSP servlet Inbox DB to phone

IM + voice call Who can initiate? Feedback Talk-spurt detection
Y IM + voice call Who can initiate? authentication, billing, … Feedback to voice user to IM user initial IM greeting Talk-spurt detection Speech recognition SIP MESSAGE TTS ASR IM Call SIP INVITE RTP

Notification Calendar Events Conferences Schedule from a browser
SIP call IM Web server Calendar.cgi “at 6:00pm”

Phone announcement server
Destinations. Text or audio input. Range: 93970??. List: A, B, C. Example: announcement, emergency. Issues: voicemail, failure detection. (Figure: TTS, PAS, SIP.)

RTSP + TTS + ASR TTS ASR Media server SETUP rtsp://server/tts.cgi
SETUP rtsp://server/tts.cgi ?text=How+are+you. SETUP rtsp://server/asr.cgi PLAY RECORD RTP RTP SET_PARAMETER Text=I am fine, thank you.
Ask the server to stream converted speech to the client. The TTS text can be in the URL, in the body of PLAY, or given as an HTTP file name in the URL. Also useful for SIP-based components.

Centralized conferencing
Conference as URL On the fly conferences Basic task: join/leave Dial in, Refer dial in Dial out, Refer dial out INVITE INVITE REFER INVITE server REFER

Conference + VoiceXML Call transfer vs bridged mode 1. INVITE sipvxml
1. INVITE sipvxml
2. Call accepted
3. Enter your four digit PIN
4. Entered
5. Authenticate user, 4683=>Alice
6. Enter the conference identifier
7. Entered 2-3-#
8. Permission to join, 23=>meet
9. REFER
10. Terminate the old call
11. INVITE sipconf
Call transfer vs bridged mode. (Figure: caller, sipvxml, sipconf.)

New conference applications
The ease and flexibility of sipvxml enable us to build custom telephony applications to suit our needs, e.g., a volume check application:
1. INVITE sipvxml
2. Menu: 1. Vol Check, 2. Mic Check
3. User enters 2
4. User speaks out a voice sample
5. Voice sample is analyzed
6. SipVXML: Vol level too high/low/…
7. User adjusts the vol level.
8. User now joins conference.
(Figure: caller, sipvxml, sipconf.)

Conferencing Netmeeting Automatic volume adjustments
Y Conferencing SIP323 Netmeeting Automatic volume adjustments Automatic load balancing Delay adjustments Conference recording Local or RTSP sipc SIP/PSTN Recording in a media server

PSTN interworking Translating audio (PCMU/PCMA)
Telephone network Telephone subscriber SIP/PSTN gateway SIP server IP endpoint Translating audio (PCMU/PCMA) Translating signaling (PRI/T1,ISUP) Overlap signaling Advanced features in SIP are lost in PSTN Translating identifiers (phone number) Determining transition points

PSTN to IP Gateway knows the SIP server ENUM DNS
ENUM DNS => e164.arpa => Suitable for relatively “static” contacts
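For reference, the ENUM mapping from a telephone number to a domain name under e164.arpa can be sketched as follows. The telephone number is a made-up example, and the function name is hypothetical; the sketch only builds the domain name and does not perform the DNS NAPTR query itself.

```python
def enum_domain(e164_number):
    """Convert an E.164 number to its ENUM domain name.

    Non-digit characters are dropped, the digits are reversed and
    separated by dots, and the e164.arpa suffix is appended.
    """
    digits = [ch for ch in e164_number if ch.isdigit()]
    return ".".join(reversed(digits)) + ".e164.arpa"


if __name__ == "__main__":
    # Hypothetical number used only for illustration
    print(enum_domain("+1-212-555-0100"))
```

A gateway would then query this name for NAPTR records to find the SIP URI, which is why the scheme suits relatively static contacts.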

IP to PSTN Static mapping ITGW information is dynamic: TRIP
xxxx ITGW information is dynamic: Overlapping networks Multiple providers Load balancing TRIP Route advertisement Can be implemented in outbound proxy Suitable for current hierarchical network @service.mci.com at 4¢/min @nyc.gw.com at 1¢/min @itgw1.columbia.edu free
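A simplified sketch of how an outbound proxy might pick a gateway from advertised routes. The gateway host names and prices come from the slide; the number prefixes attached to them, the data layout and the function name are assumptions for illustration. Real TRIP route selection involves more attributes than price.

```python
# Advertised routes; the prefixes are hypothetical, chosen for the example.
ROUTES = [
    {"prefix": "+1", "gateway": "service.mci.com", "cents_per_min": 4},
    {"prefix": "+1212", "gateway": "nyc.gw.com", "cents_per_min": 1},
    {"prefix": "+1212939", "gateway": "itgw1.columbia.edu", "cents_per_min": 0},
]


def cheapest_gateway(number, routes=ROUTES):
    """Return the cheapest gateway whose advertised prefix covers the number."""
    matching = [r for r in routes if number.startswith(r["prefix"])]
    if not matching:
        return None          # no gateway advertises a route to this number
    return min(matching, key=lambda r: r["cents_per_min"])["gateway"]


if __name__ == "__main__":
    for dialed in ("+12129397000", "+12125550100", "+14155550100"):
        print(dialed, "->", cheapest_gateway(dialed))
```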

30 second version: I developed reliability and scalability techniques for Internet telephony and showed that it is on par with the existing telephone system in reliability and scalability, at a much lower cost. I developed an interoperable architecture to build a self-organizing network of Internet telephones, without depending on managed infrastructure or servers. I developed tools and components for a multi-platform multimedia collaboration system using existing standards and showed that it can scale to a large number of simultaneous participants.

Scientific contribution
That linear cluster scalability can be observed in SIP servers. That we don’t need to depend on servers or managed infrastructure for Internet telephony. That we can build highly scalable and reliable Internet telephony systems using existing standards on commodity hardware.
