1998-09-16 1/31 “Large Scale Audio Distribution on the Internet” A technical perspective by Kåre Synnes.


1 unicorn@cdt.luth.se 1998-09-16 1/31 “Large Scale Audio Distribution on the Internet” A technical perspective by Kåre Synnes

2 Born 1969 in Sollefteå, Sweden
Books, games, sports, food, film, music, company
Engaged to Maggie

3 Large Scale Audio Distribution on the Internet
Techniques for Packet-Loss Repair of Audio Streams
Layering of Audio Data
Adaptive Audio Applications

4 Large Scale Audio Distribution on the Internet
Large Scale = Many receivers
Audio = Prioritized temporal data
Distribution = One-to-Many
Internet = Best-effort (lossy)

5 Issues at hand
Distribution needs to be scalable for very large groups: multicast RTP/UDP/IP
Best-effort IP transport results in:
–delay (~400 ms acceptable)
–delay variation (buffering)
–loss (congestion, jitter, overload, delay variation)

6 IP Multicast
What’s HOT!
–Minimum traffic load
–Scalable...
–Effective protocols (RTP/UDP/IP)
–Cheap, no special network equipment needed (i.e. MTUs)
What’s NOT!
–Turned off by default
–Complex distribution tree management
–No back-off for UDP at congestion
–Lossy
–Few applications

7 Loss - a generalization
Low loss
–Single packets are lost
–Losses are 'almost' evenly distributed
Medium and high loss
–Packets are lost in twos or threes
–Losses are 'clustered'
Also, given a large group:
–Most receivers will have 2-5% loss
–A small number of receivers will have greater loss
–Each packet is assumed to be lost by at least one receiver

8 Techniques for Packet-Loss Repair of Audio Streams
Receiver-Only Repairs
–Silence Substitution
–Waveform Substitution: White Noise, Repetition, (Predictive) Interpolation
Sender-Initiated Repairs
–Piggy-backed Redundancy
–Forward Error Correction
–Parallel Redundancy
Receiver-Initiated Repairs
–Semi-Reliable Transmissions

9 Silence Substitution
Very simple to implement
Adequate performance for:
–small packets (<32 ms)
–low loss (<1%)
Not very good otherwise (clipping)

10 White Noise
Also very simple to implement
Better than Silence Substitution
Subconscious repairs
–Applies to noise but not silence
Tolerance of 5-10% loss

11 Self-similarity
Speech waveforms often exhibit a degree of self-similarity.
Generation of a replacement packet with similar spectral qualities is possible.
Clips shorter than 30 ms are recommended (phonemes).

12 Repetition
Again, very simple to implement
Significantly improves audio quality at 5-15% loss
Bad effects if overdone (echo/reverberation)
An amplitude gain shift is good
Experience: 50% decrease for at most 2 consecutive 40 ms clips
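The repetition rule on this slide can be sketched in a few lines of Java. This is only an illustration, assuming 16-bit PCM samples held in a `short[]`. The class name `RepetitionRepair` and its methods are hypothetical and do not come from the talk.

```java
final class RepetitionRepair {
    /** Returns a copy of the clip with every sample scaled by the given gain. */
    static short[] attenuate(short[] clip, double gain) {
        short[] out = new short[clip.length];
        for (int i = 0; i < clip.length; i++) {
            out[i] = (short) Math.round(clip[i] * gain);
        }
        return out;
    }

    /**
     * Replacement for the k-th consecutive lost clip (k starts at 1).
     * The last good clip is repeated with the gain halved for each repetition;
     * after maxRepeats repetitions the method gives up and returns silence.
     */
    static short[] conceal(short[] lastGood, int k, int maxRepeats) {
        if (k < 1 || k > maxRepeats) {
            return new short[lastGood.length]; // all zeros, i.e. silence
        }
        return attenuate(lastGood, Math.pow(0.5, k));
    }
}
```

With `maxRepeats = 2` this follows the experience quoted above: the first lost clip is replayed at 50% amplitude, the second at 25%, and anything beyond that is no longer repeated.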

13 (Predictive) Interpolation
Interpolation can be done in two ways:
–Use the two surrounding clips (additional delay)
–Use two or more earlier clips (less accurate)
Not so common due to complexity
Gives better results than Repetition

14 Interleaving
Spread the contents of each packet over several packets, so a loss leaves smaller gaps to repair
Phonemes are ~20 ms
Additional delay
No extra bandwidth cost
Uncertain results (intelligibility)
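A simple block interleaver illustrates the idea. The sketch below works on unit indices rather than real audio, and the class name `Interleaver` is hypothetical. It assumes the number of units is a multiple of the number of packets.

```java
final class Interleaver {
    /**
     * Distributes consecutive audio units round-robin over the given number
     * of packets, so packet p carries units p, p + packets, p + 2*packets, ...
     */
    static int[][] interleave(int[] units, int packets) {
        if (packets <= 0 || units.length % packets != 0) {
            throw new IllegalArgumentException("unit count must be a multiple of packets");
        }
        int perPacket = units.length / packets;
        int[][] out = new int[packets][perPacket];
        for (int i = 0; i < units.length; i++) {
            out[i % packets][i / packets] = units[i];
        }
        return out;
    }
}
```

With 12 units over 3 packets, losing the middle packet removes units 1, 4, 7 and 10. That leaves four isolated one-unit gaps instead of one four-unit gap, at the price of buffering all three packets before playout.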

15 Audio Formats
Several new codecs have been developed: proprietary ones go down to 1.2 kbps!

16 Redundancy
Synthetic low-quality, low bit-rate encodings can be used as redundant repairs.
LPC is considered to contain ~60% of a speech signal, while preserving the frequency spectra.
GSM is even better, but at more than double the bit-rate: 13 vs 4.8 kbps.
Multiple redundancy is also an option.

17 Piggy-backed Redundancy
High tolerance of loss (25-40%).
A single redundancy using PCM (64 kbps) and GSM (13 kbps) is common.
Degree of loss determines optimal delay.
Receivers that are not redundancy-capable may be able to skip the redundant encoding(s).
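The sketch below models piggy-backed redundancy at the packet level, to show why the offset between a frame and its redundant copy matters. It assumes packet n carries the primary encoding of frame n plus a low bit-rate copy of frame n - offset. The class `PiggybackRedundancy` and its methods are hypothetical names used for illustration only.

```java
import java.util.Set;

final class PiggybackRedundancy {
    enum Quality { PRIMARY, REDUNDANT, LOST }

    /**
     * Returns, for each frame, whether it can be played from its primary
     * encoding, from the redundant copy in packet n + offset, or not at all.
     */
    static Quality[] playout(int frames, Set<Integer> lostPackets, int offset) {
        Quality[] result = new Quality[frames];
        for (int n = 0; n < frames; n++) {
            if (!lostPackets.contains(n)) {
                result[n] = Quality.PRIMARY;
            } else if (n + offset < frames && !lostPackets.contains(n + offset)) {
                result[n] = Quality.REDUNDANT; // repaired from a later packet
            } else {
                result[n] = Quality.LOST;
            }
        }
        return result;
    }

    /** Bandwidth added by the redundant encoding, relative to the primary one. */
    static double overhead(double primaryKbps, double redundantKbps) {
        return redundantKbps / primaryKbps;
    }
}
```

If packets 2 and 3 are lost together, an offset of 1 repairs only frame 3, while an offset of 2 repairs both, at the cost of one more packet of delay. For the PCM plus GSM combination above, `overhead(64, 13)` comes to roughly 20%.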

18 Forward Error Correction
Redundancy is added with XOR methods
50% extra overhead in the example, but the redundancy can be recoded
Other options are possible as well, e.g.:
1. a, f(a,b), b, f(b,c), c, ...
2. a, b, c, x(a,b,c), d, e, f, x(d,e,f), ...
3. a, b, c, x(a,c), d, x(b,d), e, x(c,e), ...
4. x(a,b), x(b,c), x(a,b,c), ...
Better than simple redundancy, but more CPU expensive
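Option 2 above (one XOR parity packet per group of three) can be sketched as follows. The sketch assumes all packets in a group have the same length. The class name `XorFec` is hypothetical.

```java
final class XorFec {
    /**
     * Byte-wise XOR of equally sized packets. Applied to the data packets of
     * a group it yields the parity packet; applied to the surviving packets
     * plus the parity packet it yields the single missing packet.
     */
    static byte[] xor(byte[]... packets) {
        byte[] out = new byte[packets[0].length];
        for (byte[] p : packets) {
            for (int i = 0; i < out.length; i++) {
                out[i] ^= p[i];
            }
        }
        return out;
    }
}
```

A sender would transmit a, b, c and `xor(a, b, c)`. A receiver that lost b can rebuild it as `xor(a, c, parity)`, but two losses within the same group cannot be repaired this way.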

19 Parallel Redundancy
The idea is to use several channels.
Division of the bandwidth need:
–Main transmission in one channel
–Redundancy over another channel
Can be applied to any scheme
Receivers can decide how much redundancy, or even which encoding, they prefer
Additional overhead (headers)

20 Semi-Reliable Transmissions
1. The sender transmits a packet
2. A receiver sends a NACK if it is lost
3. The sender retransmits the packet, if it is still in the queue
A time-limited repair is achieved
Protocols such as SRRTP can be used.
This can be used for small groups on networks with low delay
Other redundancy schemes are preferable
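The three steps above can be sketched from the sender's side as a queue with a limited lifetime. This is an illustration only and not the SRRTP protocol itself. The class `RetransmitQueue`, its method names and the millisecond lifetime parameter are all assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class RetransmitQueue {
    private static final class Entry {
        final byte[] payload;
        final long sentAtMs;

        Entry(byte[] payload, long sentAtMs) {
            this.payload = payload;
            this.sentAtMs = sentAtMs;
        }
    }

    private final long lifetimeMs;
    private final Map<Integer, Entry> queue = new LinkedHashMap<>();

    RetransmitQueue(long lifetimeMs) {
        this.lifetimeMs = lifetimeMs;
    }

    /** Step 1: remember each packet at the time it is sent. */
    void onSend(int seq, byte[] payload, long nowMs) {
        expire(nowMs);
        queue.put(seq, new Entry(payload, nowMs));
    }

    /** Steps 2 and 3: returns the payload to retransmit, or null if it has left the queue. */
    byte[] onNack(int seq, long nowMs) {
        expire(nowMs);
        Entry e = queue.get(seq);
        return e == null ? null : e.payload;
    }

    /** Drops packets older than the lifetime; repairing them would arrive too late. */
    private void expire(long nowMs) {
        queue.values().removeIf(e -> nowMs - e.sentAtMs > lifetimeMs);
    }
}
```

The lifetime bounds the repair in time: a NACK that arrives after the packet's playout deadline is simply ignored, which is what makes the transmission semi-reliable rather than reliable.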

21 mAudio

22 mAudio Recovery
int cnt = 0; // Number of consecutive lost packets

// Returns the audio to play for packet n. The helpers received(), recode(),
// amplify(), noice(), the buffer methods and silence are not shown on the slide.
byte[] read() {
    if (received(n)) {          // main or redundant packet
        decreaseBuffer();       // adaptive buffering
        cnt = 0;
        return recode(n);
    }
    // Packet n is lost!
    increaseBuffer();
    cnt++;
    if (cnt == 1)               // Repeat with 50% amplitude
        return amplify(n - 1, 0.5);
    if (cnt == 2)               // Repeat with 25% amplitude
        return amplify(n - 2, 0.25);
    if (cnt < 10)               // Feed noise with correct amplitude
        return noice(n - cnt);
    return silence;             // Feed silence
}

23 Layered Encodings
Allows the receivers to adapt to network conditions
–Main parts are sent over one channel
–Additional parts over other channels
Example, 6 layers:
–50%, 25%, 12%, 6%, 4%, 3%
Can be CPU expensive
This is tricky for audio, simpler for video
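A receiver's choice of layers can be sketched as a cumulative bandwidth check. This assumes layers must be joined in order, base layer first. The class `LayerSelection` is hypothetical, and so is the 64 kbps total used in the example below.

```java
final class LayerSelection {
    /**
     * Returns how many layers, joined in order from the base layer, fit
     * within the available bandwidth.
     */
    static int layersToJoin(double[] layerKbps, double availableKbps) {
        double used = 0;
        int joined = 0;
        for (double rate : layerKbps) {
            if (used + rate > availableKbps) {
                break; // this layer and all higher ones do not fit
            }
            used += rate;
            joined++;
        }
        return joined;
    }
}
```

Splitting an assumed 64 kbps stream by the six percentages above gives layers of 32, 16, 7.68, 3.84, 2.56 and 1.92 kbps. A receiver with 56 kbps available would join the first three layers (55.68 kbps), while one with 33.6 kbps would receive only the base layer.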

24 Simple Layering
[Figure: amplitude (dB) over time (ms), showing 8 kHz and 16 kHz layers]
32 kHz sampling: 8, 16, 24, 32 kHz
Audio artifacts when only merged (frequency overtones)
–‘tin can’ sound
–reverberation
Filtering needed

25 Wavelet Encoding
[Figure: amplitude (dB) over frequency (Hz), with ticks at 8, 16, 24 and 32 and the speech range marked]
Transform the data to the frequency domain, and divide it there
Computationally difficult (expensive)
Longer delays due to buffering
Very good division

26 Adaptive Audio Applications
How can we support heterogeneous environments?
–Network: 56k modem, ISDN, xDSL, Ethernet
–Load: congestion, hardware jitter, delay variation
–Client: Mobile phone, PDA, NC, PC, Workstation
Allow scaling of quality
Do NOT use a least common denominator!
Senders should adapt slowly while receivers adapt more rapidly, i.e. highly adaptive clients

27 RTP/RTCP Receiver Reports
The receivers report on:
–Loss rate (long-term congestion)
–Delay variation (short-term congestion)
–Throughput
–Additional (load, encoding etc.)
Can be used to change:
–Encoding
–Redundancy
–Layering
How do we do this for many receivers? Voting?
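The slide leaves the many-receivers question open. One possible form of "voting", offered here purely as an assumption and not as the method from the talk, is to let the sender act on a percentile of the reported loss rates, so a few badly connected receivers do not dictate the settings for everyone. The class `LossVoting` is hypothetical.

```java
import java.util.Arrays;

final class LossVoting {
    /**
     * Returns the loss rate at percentile p (0 < p <= 1) of the receivers'
     * reports, using the nearest-rank method on a sorted copy.
     */
    static double percentile(double[] lossRates, double p) {
        double[] sorted = lossRates.clone();
        Arrays.sort(sorted);
        int idx = (int) Math.ceil(p * sorted.length) - 1;
        idx = Math.max(0, Math.min(idx, sorted.length - 1));
        return sorted[idx];
    }
}
```

For reports of 2%, 3%, 5%, 2% and 30% loss, the median is 3% while the maximum is 30%. A sender tuning redundancy to the median serves the typical receiver, whereas tuning to the maximum would amount to the least common denominator the previous slide warns against.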

28 Summary
Receiver-only techniques are good for low loss and small packets
Loss rates of up to 40% can be repaired intelligibly, using redundancy schemes
There is a trade-off between delays and buffering, which affects response times
Much can be done to enhance audio quality

29 Questions?
E-mail: unicorn@cdt.luth.se
URL: http://www.cdt.luth.se/~unicorn/

30 Future Work
Use real network statistics to model loss, while studying receiver report effects
Try different combinations of recovery, to achieve optimal adaptation
Measure gain (intelligibility) vs. cost (network and CPU load)

