Robust Low-Latency Voice and Video Communication over Best-Effort Networks
Yi Liang
Department of Electrical Engineering, Stanford University
March 12, 2003
http://www.stanford.edu/~yiliang/

Media Delivery over IP Networks
[Figure: media delivery over the Internet]

QoS Concerns and Challenges
Communication over best-effort networks …
Delay: impairs interactivity of conversational services
  Voice over IP: recommended one-way delay < 150 ms [ITU-T G.114]
Packet loss: impairs perceptual quality
Delay jitter: obstructs sequential and continuous media output

Outline of Contributions
[Figure: server, packet network, and client, labeled I. Client side, II. Transport, III. Network-adaptive coding]
I. Client side: adaptive playout scheduling for VoIP that reduces latency and packet loss
II. Transport: packet path diversity and applications in low-latency communications
III. Network-adaptive coding: low-latency video communication that does not require packet retransmission

Outline
I. Client side: adaptive playout scheduling for VoIP that reduces latency and packet loss
II. Transport: packet path diversity and applications in low-latency communications
III. Network-adaptive coding: low-latency video communication that does not require packet retransmission

Delay Jitter and Buffering
[Figure: fixed playout schedule, showing buffering delay and late loss; plot of late loss rate (%) versus average buffering delay (ms)]

Adaptive Playout Scheduling (1)
[Figure: adaptive playout schedule compared with the fixed schedule, showing buffering delay]

Adaptive Playout Scheduling (2)
Requires media scaling: slow down, speed up
[Figure: packets 1 to 8 at the sender, at the receiver, and at playout over time; packetization time]
1. How to set the playout schedule?
2. How to scale the media?
3. Quality of scaled voice?

Determine the Playout Schedule
History-based estimation using the past w delays
Next packet: given the acceptable loss rate, find the playout deadline
[Figure: delay histogram, probability versus delay (ms), with the deadline and the loss probability marked]
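
The history-based estimate on this slide can be sketched as follows. This is a minimal illustration under stated assumptions, not the presenter's implementation: it assumes the deadline is read directly off the empirical distribution of the last w delays, and the function name `playout_deadline` and the sample delays are hypothetical.

```python
def playout_deadline(past_delays, w, acceptable_loss):
    """Pick the smallest observed delay such that the fraction of the last
    w packets arriving later than it does not exceed acceptable_loss."""
    if not 0.0 <= acceptable_loss < 1.0:
        raise ValueError("acceptable_loss must be in [0, 1)")
    window = sorted(past_delays[-w:])  # history window of the past w delays
    n = len(window)
    if n == 0:
        raise ValueError("no delay history available")
    # Number of packets in the window that may miss the deadline.
    # The small epsilon guards against floating-point round-off.
    allowed_late = int(acceptable_loss * n + 1e-9)
    return window[max(n - 1 - allowed_late, 0)]


# Hypothetical one-way delays in ms; the oldest sample falls outside w = 10.
history = [200, 40, 42, 45, 47, 50, 52, 55, 60, 80, 120]
deadline = playout_deadline(history, w=10, acceptable_loss=0.1)  # 80
```

A larger acceptable loss rate moves the deadline earlier and shortens the buffering delay, which is the tradeoff the histogram on the slide illustrates.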

Voice Scaling Using Time-Scale Modification
Based on WSOLA [Verhelst ’93]
Preserves pitch
Improved to scale short individual voice packets; no delay
[Figure: packet expansion, showing the pitch periods of the input and the output, the template segment, and the similar segment]

Examples of Time-Scale Modification
Speech scaling
Audio scaling
[Audio samples: original, 130%, 70%]

Quality of Time-Scale Modified Voice
Packets scaled: 18.4%
Scaling ratio: 50 - 200%
DMOS: 4.5 out of 5 [ITU-T P.800]
[Figure: adaptive playout schedule; audio samples: original, modified]

Results and Comparison
Algorithms:
1. Fixed playout schedule
2. Only adjust playout schedule during silence periods [Ramjee ’94; Moon ’98]
3. Adaptive playout scheduling
[Figure: results for algorithms 1, 2, and 3]

Overall Performance
Trace: Stanford to Chicago
Alg.      Loss rate   MOS
Alg. 2    10%         2.6
Alg. 3    4%          3.7
MOS scale: 1 - 5 [ITU-T P.800]
[Figure annotation: -50%]

Subjective Listening Test Results
Traces from Stanford to: 1. Chicago 2. Germany 3. MIT 4. China
[Figure: MOS per trace for Alg. 2 and Alg. 3]

Summary: Adaptive Playout Scheduling
Improves the tradeoff between buffering delay and packet loss
Time-scale modification-based speech processing does not impair speech quality
Overall speech quality improves by 1 on a 5-point MOS scale
The passive algorithm can be easily implemented on the client
Audacity T2, 8X8, Inc.

Outline
I. Client side: adaptive playout scheduling for VoIP that reduces latency and packet loss
II. Transport: packet path diversity and applications in low-latency communications
III. Network-adaptive coding: low-latency video communication that does not require packet retransmission

Packet Path Diversity
Motivation
Typically a better alternative path exists [Savage, SigComm ’99]
Uncorrelated packet loss on independent paths [Apostolopoulos ’01]
Low-latency requirement
[Figure: sender and receiver connected by paths 1 and 2 through relay servers]

Internet Experiments
[Figure: experiment topology with sender, relay server, and receiver. Hosts: Santa Clara, CA (192.84.16.176); MIT (18.184.0.50); Harvard (140.247.62.110). Networks: Qwest, Exodus Comm., BBN Planet. Delays incurred on a link or ISP network: 5 ms, 45 ms, 40 ms, 5 ms]

Measured Packet Delay Trace

Adaptive Playout Scheduling for Two-Stream

Multiple Description Speech Coding
Complementary and redundant descriptions of media
Stream 1: even samples, finer quantization; odd samples, coarser quantization
Stream 2: vice versa [Jiang, Ortega ’00]
[Figure: streams s1 and s2 over time, alternating even (E) and odd (O) samples within each packet length]
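
The even/odd split described on this slide can be illustrated with a small sketch. It assumes uniform scalar quantizers and integer samples; the step sizes, the function names, and the sample values are hypothetical and are not taken from the talk or from [Jiang, Ortega ’00].

```python
def quantize(x, step):
    """Uniform scalar quantizer: snap x to the nearest multiple of step."""
    return step * round(x / step)


def make_descriptions(samples, fine_step=1, coarse_step=8):
    """Stream 1 quantizes even-indexed samples finely and odd-indexed ones
    coarsely; stream 2 does the opposite."""
    s1 = [quantize(x, fine_step if i % 2 == 0 else coarse_step)
          for i, x in enumerate(samples)]
    s2 = [quantize(x, coarse_step if i % 2 == 0 else fine_step)
          for i, x in enumerate(samples)]
    return s1, s2


def reconstruct(s1, s2):
    """Use the finely quantized version of each sample when both streams
    arrive; fall back to the single received stream otherwise."""
    if s1 is None and s2 is None:
        return None  # both descriptions lost
    if s2 is None:
        return list(s1)
    if s1 is None:
        return list(s2)
    return [a if i % 2 == 0 else b for i, (a, b) in enumerate(zip(s1, s2))]


samples = [3, 10, -5, 22]             # hypothetical integer speech samples
s1, s2 = make_descriptions(samples)
both = reconstruct(s1, s2)            # fine quality from both descriptions
only_first = reconstruct(s1, None)    # coarser quality, but still decodable
```

Either stream alone yields a usable signal, and receiving both restores the finer quantization for every sample, which is what makes the descriptions both complementary and redundant.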

Determine the Playout Schedule
To minimize the Lagrangian cost function
[Figure: delay probability distributions of Stream 1 and Stream 2, with delay d and loss probabilities p1 and p2]

Overall Performance
[Figure: loss rate (%) versus average end-to-end delay (ms); annotation: -35 ms]

Summary: Packet Path Diversity
Exploitation of statistically uncorrelated delay jitter and packet loss behavior
Adaptive playout scheduling for multiple streams provides lower latency and reduced distortion

Outline
I. Client side: adaptive playout scheduling for VoIP that reduces latency and packet loss
II. Transport: packet path diversity and applications in low-latency communications
III. Network-adaptive coding: low-latency video communication that does not require packet retransmission

Low-Latency Video Communication
Motivation for low-latency video
  Real-time conversational services
  Interactive video streaming
Voice vs. video
  Voice over IP                          Typical video streaming
  < 150 ms                               5 ~ 15 seconds pre-roll time
  Weak or no dependency across packets   Strong dependency across packets due to motion-compensated coding

Low-Latency Video: Challenges
What the problems are
  Packet dependency due to hybrid motion-compensated coding
  Large receiver buffer and packet retransmission employed
[Figure: the "P-I" scheme, frames I P P P P P P P over time with interframe prediction and a transmission error]

Approaches
Goal: achieve VoIP-like latency
Approach
  Eliminate the need for retransmission
  Robust network-adaptive coding by optimal packet dependency management

Coding Mode
[Figure: coding modes P1, P2, P5, …, INTRA, ordered by increasing error-resilience]

Error-Resilience vs. Compression Efficiency
Foreman sequence coded at PSNR = 35.9 dB (H.26L TML8.5, 30 fps, 270 frames)
[Figure: rate (Kbps) per coding mode up to INTRA; increased error-resilience, decreased compression efficiency]

Determine R-D Optimized Coding Modes
Select the prediction mode that minimizes the R-D cost
v: coding mode
Long-term memory V
Candidates: P1: (R_1, D_1); P2: (R_2, D_2); …; PV: (R_V, D_V); I: (R, D)
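
The mode decision on this slide can be sketched in a few lines. The slide's cost expression is not shown here, so the sketch assumes the common Lagrangian form J_v = D_v + lambda * R_v; the multiplier values, the function name `select_mode`, and the (rate, distortion) numbers are hypothetical.

```python
def select_mode(candidates, lam):
    """Return the coding mode v that minimizes J_v = D_v + lam * R_v.

    candidates maps a mode label to a (rate, distortion) pair.
    """
    def cost(mode):
        rate, distortion = candidates[mode]
        return distortion + lam * rate
    return min(candidates, key=cost)


# Hypothetical (rate, expected distortion) pairs for three candidate modes.
candidates = {
    "P1": (10.0, 50.0),  # cheapest to code, most exposed to past losses
    "P2": (12.0, 40.0),
    "I": (40.0, 20.0),   # INTRA: most robust, most expensive
}
best = select_mode(candidates, lam=1.0)  # "P2"
```

With a small multiplier the decision favors the robust INTRA mode; with a large multiplier, which penalizes rate more heavily, it falls back to the cheapest prediction mode.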

Estimation of Distortion
Channel feedback utilized at the source coder
[Figure: loss-pattern trees over frames n-3, n-2, n-1, n; each branch is taken with probability $p$ or $1-p$]
Mode P1: $D_{11}$ with $p_{11} = (1-p)^3$; $D_{12}$ with $p_{12} = (1-p)^2 p$; …; $D_{18}$ with $p_{18} = p^3$
Mode P2: $D_{21}$ with $p_{21} = (1-p)^2$; $D_{22}$ with $p_{22} = (1-p)p$; $D_{23}$ with $p_{23} = p(1-p)$; $D_{24}$ with $p_{24} = p^2$
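
The probabilities on this slide are those of independent packet losses with loss rate p, so the expected distortion of a mode is the probability-weighted sum over its loss patterns. The sketch below enumerates those patterns. It assumes independent losses; the distortion model passed in is a hypothetical placeholder, since the talk's actual distortion values come from the decoder and are not given here.

```python
from itertools import product


def pattern_probabilities(num_frames, p):
    """Enumerate every loss pattern over num_frames frames.

    A pattern is a tuple of booleans (True means the frame was lost).
    With independent losses, a pattern with k losses has probability
    p**k * (1 - p)**(num_frames - k).
    """
    result = []
    for pattern in product((False, True), repeat=num_frames):
        lost = sum(pattern)
        result.append((pattern, p ** lost * (1 - p) ** (num_frames - lost)))
    return result


def expected_distortion(distortion_of, num_frames, p):
    """Probability-weighted distortion over all loss patterns."""
    return sum(prob * distortion_of(pattern)
               for pattern, prob in pattern_probabilities(num_frames, p))


# Mode P1 in the slide depends on three earlier frames: 2**3 = 8 patterns.
patterns = pattern_probabilities(3, p=0.1)

# Hypothetical distortion model: a base value plus a penalty per lost frame.
avg = expected_distortion(lambda pat: 1.0 + 10.0 * sum(pat), 3, p=0.1)
```

The first pattern (nothing lost) has probability (1-p)^3 and the last (everything lost) has probability p^3, matching the first and last leaves listed for mode P1.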

Experimental Results
Comparing
1. Rate-distortion optimized dependency management
2. Simple P-I
[Figure: schemes 1 and 2; P-I frame sequence I P P P P P …]

R-D Performance (1)
No retransmission; no algorithm delay
Channel loss rate = 10%
[Figure annotations: 1.2 dB, 36%]

R-D Performance (2)
No retransmission; no algorithm delay
Channel loss rate = 10%

R-D Performance (3)
Bitrate 200 Kbps, various channel loss rates

Video Demo (1)
R-D optimized vs. simple P-I
Foreman, 109 Kbps, 10% channel loss
No retransmission; no algorithm delay

Video Demo (2)
R-D optimized vs. simple P-I
Mother-Daughter, 318 Kbps, 10% channel loss
No retransmission; no algorithm delay

Summary: Network-Adaptive Packet Dependency Management
R-D optimization improves the tradeoff between error-resilience and compression efficiency
Eliminated the need for packet retransmission; achieved VoIP-like low latency

Summary of Contributions
[Figure: server, packet network, and client, labeled I. Client side, II. Transport, III. Network-adaptive coding]
I. Client side: adaptive playout scheduling that reduces latency and packet loss
II. Transport: packet path diversity that further reduces communication delay and distortion
III. Network-adaptive coding: a video communication system that requires no packet retransmission, which allows VoIP-like low latency

Other Contributions
Other contributions not covered in this presentation
  A low-latency loss concealment scheme
  Packet path diversity for robust low-latency video communication
  A layered coding structure to avoid mismatch error for streaming of pre-coded video
  An accurate model to quantify video distortion as a result of packet losses
  A prescient scheme that optimizes the dependency for a group of packets for video streaming

Publications
Journal publications: 3
  IEEE Transactions on Multimedia
  Journal of Wireless Communication and Mobile Computing
  IEEE Transactions on Circuits and Systems for Video Technology
Invited papers: 4
Papers in conference proceedings: 8
  Proceedings ACM Multimedia (SigMM)
  …

Media Delivery over IP Networks
[Figure: media delivery over the Internet]

Low-Latency Media Communication

Acknowledgements
Committee members, EE faculty
My family members
Our sponsors
IVMS group members and alumni, and assistants
Many friends, in ISL, EE, and Stanford

Backup Slides
The following backup slides may or may not be used …

Determine the Playout Schedule
[Figure: delay histogram, percentage versus delay (ms)]

Likelihood Ratio Factor [Gibbon, Little, ’96]

More Samples for Time-Scale Modification
Audio scaling
[Audio samples: original, expanded by 20%, compressed by 20%]

Low-Latency Loss Concealment
Earlier work [Stenger ’96]
Algorithm delay reduced to one packet time
Nicely integrates into adaptive playout system
Alignment found by correlation
[Audio samples, 20% random packet loss: original, loss, concealed]
[Figure: packets i-2 to i+2 over time with packet i lost; segment lengths labeled L, 2L, and 1.3L]

Speech Samples
Alg.      Loss rate   MOS
Alg. 2    10%         2.6
Alg. 3    4%          3.7
Original              4.4

Overall Performance
Stanford -> 1. Chicago 2. Germany 3. MIT 4. China
[Figure: four panels, one per trace 1 to 4]

Multi-Stream Playout Scheduling
[Figure: packets 1 to 6 sent and received on path 1 and on path 2 over time, and the resulting playout]
Packet path diversity reduces effective delay jitter and therefore late loss rate
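
The effect stated on this slide can be checked with a toy calculation. The sketch assumes that a packet counts as late only when its copies or descriptions on both paths miss the playout deadline, so the effective delay is the smaller of the two path delays; the delay values and function names are hypothetical.

```python
def late_loss_rate(delays, deadline):
    """Fraction of packets whose delay exceeds the playout deadline."""
    return sum(d > deadline for d in delays) / len(delays)


def effective_delays(path1, path2):
    """Per-packet delay when either path's arrival is enough to play out."""
    return [min(a, b) for a, b in zip(path1, path2)]


# Hypothetical per-packet delays in ms; the delay spikes hit the two paths
# at different packets, as expected for uncorrelated jitter.
path1 = [40, 90, 45, 42, 120, 44]
path2 = [60, 62, 110, 61, 63, 130]
deadline = 80

single1 = late_loss_rate(path1, deadline)                            # 2 of 6 late
single2 = late_loss_rate(path2, deadline)                            # 2 of 6 late
diverse = late_loss_rate(effective_delays(path1, path2), deadline)   # none late
```

Because the spikes rarely coincide, the minimum of the two delays has far less jitter than either path alone, so the same deadline produces fewer late losses.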

Path Diversity – Voice Demo
[Audio samples: original; path diversity; single-stream with FEC at same data rate]
Path diversity: average total end-to-end delay 84 ms; error concealment: speech segment repetition
Single-stream with FEC at same data rate: average total end-to-end delay 84 ms; error concealment: speech segment repetition

More Experiment Results
Results obtained by varying 2 while keeping 1 fixed
With higher delay: better chances to play both descriptions
Observed lower playout rate variation by using multiple streams
Jitter averaged; lower STD of min(d_i, d_j)

PESQ Results
Perceptual Evaluation of Speech Quality (ITU-T Rec. P.862, Feb. 2001)
PESQ can be used for end-to-end quality assessment
Ranges from -0.5 to 4.5 but usually produces MOS-like scores between 1.0 and 4.5

Internet Experiment (2)
[Figure: experiment topology. Hosts: New Jersey (165.230.227.81); Harvard (140.247.62.110); Erlangen (131.188.130.136). Networks: VBNS IP Backbone Service, DANTE Operations, UUNET Tech., AT&T. Delays: 7 ms, 40 ms, 5 ms, 10 ms]
Path 1 (direct): N. J. – Erlangen
Path 2 (alternative): N. J. – Harvard – Erlangen

Results (2)
Path 1 (direct): N. J. – Germany
Path 2 (alternative): N. J. – Harvard – Germany
Mean delay: 61.3 / 65.0 ms
Link loss: 0.6% / 1.1%
Significant reduction of late loss and end-to-end delay by packet path diversity

Video Streaming Using Path Diversity
[Figure: frames n-5 to n sent over Path 1 and Path 2; next frame to encode and send: n]
Goal: minimize distortion under rate constraint
(1) Path selection to minimize the loss probability of frame n and maximize the benefit of path diversity
  Alternate when both channels are good
  Send small probe packets over the channel in bad state [Setton, Liang, Girod, ICME’03, submitted]
(2) Source coding

Determine Prediction Mode
Long-term memory V = 5
Prediction modes: v = 1, 2, …, V, I
[Figure: frames n-5 to n on Path 1 and Path 2, with references labeled V=1, V=2, V=3, V=5]

Results (1)
Channel loss rate 1 = loss rate 2 = 15%
Avg burst len = 8
Feedback delay = 6
Comparing to
  RPS-NACK [Lin, ICME’01]
  Video redundancy coding (VRC) [H.263++]

Results (2)
Channel loss rate 1 = loss rate 2 = 15%
Avg burst len = 8
Feedback delay = 6

Path Diversity Gain with Shared Link

TCP-Friendly Streaming
[Mahdavi, Floyd, ’97] [Floyd, Handley, Padhye, Widmer, ’00]
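
For orientation, the simple TCP-friendly rate bound associated with the first citation can be computed as below. The slide body is not available here, so whether the talk used exactly this expression is an assumption; the function name and the example numbers are hypothetical. The formula is T = 1.22 * MTU / (RTT * sqrt(loss)).

```python
import math


def tcp_friendly_rate(mtu_bytes, rtt_s, loss_rate):
    """Upper bound on a TCP-friendly sending rate, in bytes per second,
    using the simple model T = 1.22 * MTU / (RTT * sqrt(loss))."""
    if rtt_s <= 0 or not 0 < loss_rate <= 1:
        raise ValueError("rtt_s must be positive and loss_rate in (0, 1]")
    return 1.22 * mtu_bytes / (rtt_s * math.sqrt(loss_rate))


# Hypothetical path: 1500-byte packets, 100 ms round-trip time, 1% loss.
rate = tcp_friendly_rate(1500, 0.1, 0.01)  # about 183,000 bytes per second
```

The rate falls with the square root of the loss rate and inversely with the round-trip time, so a streaming sender that respects this bound backs off in roughly the way a TCP flow on the same path would.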

Long-Term Memory Prediction and Packet Dependency
To manage prediction dependency
  Long-term Memory (LTM) prediction on macroblock level [Wiegand, Zhang, Färber, Girod, ’99, ’00]
  Reference Picture Selection (RPS) [Annex N H.263+, Annex U H.263++, H.26L]
  NEWPRED [ISO/IEC MPEG-4]
  NACK

R-D Optimization
[H.26L TML 8.5] [Wiegand, Girod, ICIP’01]

Dynamic PSNRs

Streaming of Pre-Encoded Media
Media pre-coded and pre-stored offline
Bit-stream assembly at streaming times
Pre-coded content benefits large number of users
One potential problem …

Potential Mismatch Error
[Figure: encoded streams S1 and S2 (P P P P P P … and I I I I I I …); transmitted stream P P P I P P …; decoded stream P P P I P P … with mismatch]
Previous schemes using S-frame [Färber, ICIP’97], SP-frame [H.26L] alleviate or solve the problem at the cost of higher bitrate

Layered Coding Structure for Bitstream Assembly
[Figure: LAYER I, LAYER II, and LAYER III built from I and P5 frames, with V = 5 and T_GOP = 25]
SYNC-frames: allow switching

P-I for Comparison
I P P P P P P P P P I P …
I P P P P P P P P P I …

Cost of Error-Resilience (1)
Error-resilience / low-latency is not free
PSNR (dB): distortion at the encoder
PSNR (dB)   Bitrate increase for 5% loss   Bitrate increase for 10% loss
33.4        17%                            39%
35.9        20%                            43%
37.8        14%                            35%

Cost of Error-Resilience (2)
PSNR (dB): distortion at the encoder
PSNR (dB)   Bitrate increase for 5% loss   Bitrate increase for 10% loss
35.0        20%                            52%
36.4        17%                            45%
39.3        22%                            46%
40.0        16%                            40%

Cost of Layered Coding Structure (1)
Lossless channel
[Figure annotations: 23%, 25%, 30%, 32%]

Cost of Layered Coding Structure (2)
Channel loss rate = 5%

Comparing Different Error-Resilience Schemes
Scheme               Latency       R-D cost      Resilience to burst loss
ARQ                  High          Low
FEC                  Medium-low    Medium-high   Medium-low, depending on delay
Dependency control   Very low      Medium-high   High