The Good, the Bad and the Muffled: the Impact of Different Degradations on Internet Speech Anna Watson and M. Angela Sasse Dept. of CS University College.

Slides:



Advertisements
Similar presentations
IP Cablecom and MEDIACOM 2004 Prediction and Monitoring of Quality for VoIP services Quality for VoIP services Vincent Barriac – France Télécom R&D SG12.
Advertisements

Introduction to MP3 and psychoacoustics Material from website by Mark S. Drew
Measuring Perceived Quality of Speech and Video in Multimedia Conferencing Applications Anna Watson and M. Angela Sasse Dept. of CS University College.
QoS Impact on User Perception and Understanding of Multimedia Video Clips G. Ghinea and J.P. Thomas Department of Computer Science University of Reading,
Advanced Audio Setup Troubleshooting your audio Users’ Reference.
RTP: A Transport Protocol for Real-Time Applications Provides end-to-end delivery services for data with real-time characteristics, such as interactive.
Speech Compression. Introduction Use of multimedia in personal computers Requirement of more disk space Also telephone system requires compression Topics.
The Essentials of 2-Level Design of Experiments Part I: The Essentials of Full Factorial Designs The Essentials of 2-Level Design of Experiments Part I:
The Good, the Bad and the Muffled: the Impact of Different Degradations on Internet Speech Anna Watson and M. Angela Sasse Department of CS University.
Receiver-driven Layered Multicast S. McCanne, V. Jacobsen and M. Vetterli University of Calif, Berkeley and Lawrence Berkeley National Laboratory SIGCOMM.
Experimental Design and Efficient Research Dr. Richy Hetherington and Dr. Kim Pearce.
IT-101 Section 001 Lecture #8 Introduction to Information Technology.
A Basic Multimedia Quality Model Majid Bagheri
Bolo – A Simple Audioconference CS525u Multimedia Computing Due date: Project 2.
Evaluation of Speak Project 2b Due March 24th. Overview Experiments to evaluate performance of your audioconference (proj2) Focus not only on how your.
A New Adaptive FEC Loss Control Algorithm for Voice Over IP Applications Chinmay Padhye, Kenneth Christensen and Wilfirdo Moreno Department of Computer.
Evaluation of Speak Project 2b Due March 30 th. Overview Experiments to evaluate performance of your audioconference (proj2) Focus not only on how your.
Successful Multiparty Audio Communication over the Internet Vicky Hardman, M. Angela Sasse and Isidor Kouvelas Department of Computer Science University.
An Empirical Study of Delay Jitter Management Policies D. Stone and K. Jeffay Computer Science Department University of North Carolina, Chapel Hill ACM.
Performance metrics and configuration strategies for group network communication Tom Z. J. FU Dah Ming Chiu John C. S. Lui.
Experiments in Computer Science Mark Claypool. Introduction Some claim computer science is not an experimental science –Computers are man-made, predictable.
Nov. 3, 2000 Adaptive Playout Scheduling in Packet Voice Communications.
The Effects of Jitter on the Perceptual Quality of Video Mark Claypool and Jonathan Tanner Computer Science Department Worcester Polytechnic Institute.
Using Redundancy and Interleaving to Ameliorate the Effects of Packet Loss in a Video Stream Yali Zhu, Mark Claypool and Yanlin Liu Department of Computer.
Successful Multiparty Audio Communication over the Internet Vicky Hardman, M. Angela Sasse and Isidor Kouvelas Department of Computer Science University.
Using Interleaving to Ameliorate the Effects of Packet Loss in a Video Stream Mark Claypool and Yali Zhu Computer Science Department Worcester Polytechnic.
Objective and Subjective Degradations of Transcoded Voice for Heterogeneous Radio Networks Interoperability Ľubica Blašková 1, Jan Holub 1, Michael Street.
Improving Spoken English NativeAccent™. What is NativeAccent? New internet-delivered technology that assesses a student’s English pronunciation skills.
Hypothesis testing – mean differences between populations
V Telecommunications Industry AssociationTR XXX.
Analyze Assure Accelerate Network Model for Evaluating Multimedia Transmission Performance Over Internet Protocol PN Will become TIA/EIA-921 Jack.
ETSI STQ, Taiwan Workshop, February 13th, Recent improvements in transmission quality assessment : Background noise transmission Results of STF.
Tutor: Prof. A. Taleb-Bendiab Contact: Telephone: +44 (0) CMPDLLM002 Research Methods Lecture 8: Quantitative.
Copyright 1998, S.D. Personick. All Rights Reserved1 Telecommunications Networking I Lectures 2 & 3 Representing Information as a Signal.
Subjective Sound Quality Assessment of Mobile Phones for Production Support Thorsten Drascher, Martin Schultes Workshop on Wideband Speech Quality in Terminals.
Estimation of Statistical Parameters
Chapter 8 Experimental Design: Dependent Groups and Mixed Groups Designs.
Interaction Modeling. Introduction (1) Third leg of the modeling tripod. It describes interaction within a system. The class model describes the objects.
GSM voice quality Final report Tunis, 15 th of July 2004 Customer: Tunisiana.
An Empirical Evaluation of VoIP Playout Buffer Dimensioning in Skype, Google Talk, and MSN Messenger Chen-Chi Wu, Kuan-Ta Chen, Yu-Chun Chang, and Chin-Laung.
Week Five: Assure Case Study Ruth Vega CMP/555 September 26, 2005.
Understanding Research Design Can have confusing terms Research Methodology The entire process from question to analysis Research Design Clearly defined.
Definition and Coordination of Signal Processing Functions for telephone connections involving automotive speakerphones Scott Pennock Senior Hands-Free.
Compression No. 1  Seattle Pacific University Data Compression Kevin Bolding Electrical Engineering Seattle Pacific University.
1 Presented by Jari Korhonen Centre for Quantifiable Quality of Service in Communication Systems (Q2S) Norwegian University of Science and Technology (NTNU)
University of Plymouth United Kingdom {L.Sun; ICC 2002, New York, USA1 Lingfen Sun Emmanuel Ifeachor Perceived Speech Quality.
CS529 Multimedia Networking Experiments in Computer Science.
CHAPTER 12 Descriptive, Program Evaluation, and Advanced Methods.
Analogue & Digital. Analogue Sound Storage Devices.
Wireless communications and mobile computing conference, p.p , July 2011.
Research Design Week 6 Part February 2011 PPAL 6200.
Time-Dependent Dynamics in Networked Sensing and Control Justin R. Hartman Michael S. Branicky Vincenzo Liberatore.
>> HIGHERVIEW Team: A. Sasse J. D. McCarthy D. Miras J. Riegelsberger Presentation to UCL Network Group: 3rd March 2004.
 Face to face  Oral  Written  Visual  Electronic Communication in Administration 2.
Evaluation of Speak Project 2b Due November 4 th.
Mitigating Congestion in Wireless Sensor Networks Bret Hull, Kyle Jamieson, Hari Balakrishnan MIT Computer Science and Artificial Intelligence Laborartory.
RTP and playout delay compensation Henning Schulzrinne Dept. of Computer Science Columbia University Fall 2003.
Producing Data: Experiments BPS - 5th Ed. Chapter 9 1.
1 Video and Voice over IP performance over a Satellite link Bob Dixon, Ohio State University/OARnet Prasad Calyam, OARnet Joint Techs Workshops, Columbus,
Quality Is in the Eye of the Beholder: Meeting Users ’ Requirements for Internet Quality of Service Anna Bouch, Allan Kuchinsky, Nina Bhatti HP Labs Technical.
Olfactory Cues Modulate Facial Attractiveness Dematte, Osterbauer, & Spence (2007)
1 Topic 14 – Experimental Design Crossover Nested Factors Repeated Measures.
Evaluation of Speak Project 2b Due: March 20 th, in class.
Using Speech Recognition to Predict VoIP Quality
Aim To test Cherry’s findings on attention ‘more rigorously’. Sample
VoIP over Wireless Networks
Between-Subjects, within-subjects, and factorial Experimental Designs
Analogue & Digital.
Kouhei Fujimoto Graduate School of Engineering Science
The Essentials of 2-Level Design of Experiments Part I: The Essentials of Full Factorial Designs Developed by Don Edwards, John Grego and James Lynch.
Presentation transcript:

The Good, the Bad and the Muffled: the Impact of Different Degradations on Internet Speech Anna Watson and M. Angela Sasse Dept. of CS University College London, London, UK Proceedings of ACM Multimedia November 2000

Introduction Multimedia conference is a growing area Well-known that need good quality audio for conferencing to be successful Much research focused on improving delay, jitter, loss Many think bandwidth will fix –But bandwidth has been increasing exponentially while quality not!

Motivation Large field trial from –13 UK institutions –150 participants Recorded user Perceptual Quality Matched with objective network performance metrics Suggested that network was not primary influence on PQ!

Example: Missing Words Throughout But loss usually far less than 5%! - 1 hour Meeting - UCL to Glasgow

Problems Cited Missing Words –Likely causes: packet loss, poor speech detection, machine glitches Variation in volume –Likely causes: insufficient volume settings (mixer), poor headset quality Variation in quality among participants –Likely causes: high background noise, open microphone, poor headset quality Experiments to measure which affect quality

Outline Introduction Experiments Results Conclusions

Audioconference Fixed Parameters Robust Audio Tool –Home brewed in UCL –Limited repair of packet loss Coded in DVI 40 ms sample size Use “repetition” to repair lost packets

Audioconference Variables Packet loss rates –5% (typical) and 20% (upper limit to tolerate) ‘Bad’ microphone –Hard to measure, but Altai A087F Volume differences –Quiet, normal, loud through “pilot studies” Echo –From open microphone

Measurement Methods: PQ Not ITU (see previous paper) Subjective through “slightly” labeled scale “Fully subscribe that … speech quality should not be treated as a unidimensional phenomenon…” But …

Measurement Method: Physiological User “cost” –Fatigue, discomfort, physical strain Measure user stress –Using a sensor on the finger Blood Volume Pulse (BVP) –Decreases under stress Heart Rate (HR) –Increases under stress (“Fight” or “Flight)

Experimental Material Take script from ‘real’ audioconference Act-out by two males without regional accents Actors on Sun Ultra workstations on a LAN –Only audio recorded –16 bit samples –Used RAT –Used silence deletion (hey, proj1!) Vary volume and feedback (speakers to mic) Split into 2-minute files, 8Khz, 40 ms packets Repetition when loss

Experimental Conditions Reference – non-degraded 5% loss – both voices, with repetition 20% loss – both voices, with repetition Echo – one had open mic, not headset Quiet – one recorded low volume, other norm Loud – one recorded high volume, other norm Bad mic – one had low quality mic, other norm Determined “Intelligibility” not affected by above

Subjects 24 subjects –12 men –12 women All had good hearing Age 18 – 28 None had previous experience in Internet audio or videoconferencing

Procedure Each listened to seven 2-minute test files twice –Played with audio tool First file had no degradations (“Perfect”) –Users adjusted volume –Were told it was “best” Randomized order of files –Except “perfect” was 1 st and 8 th –So, 7 conditions heard once than another order Baseline physiological readings for 15 min When done, use slider and explain rating (tape-recorded)

Outline Introduction Experiments Results Conclusions

Quality Under Degradation Statistically significant?

Statistical Significance Tests Anova Test –For comparing means of two groups: first hearing and second hearing –No statistical difference between the two groups Analysis of variance –Degradation effect significant –Reference and 5% loss the same –Reference and Quiet the same –Reference and all others are different –5% Loss and Quiet the same –20% Loss and Echo and Loud the same

Physiological Results: HR

Physiological Results: BVP Statistically significant?

Physiological Statistical Significance Tests Bad mic, loud and 20% loss all significantly more stressful than quiet and 5% loss Echo significantly more stressful than quiet in the HR data only Contrast to quality! –Mic worse than 20% loss –Least stressful were quiet and 5% loss

Qualitative Results Asked subjects to describe why each rating Could clearly identify –quiet, loud and echo Bad mic –‘distant’, ‘far away’ or ‘muffled’ –‘on the telephone’, ‘walkie-talkie’ or ‘in a box’

Qualitative Results of Loss 5% loss –‘fuzzy’ and ‘buzzy’ (13 of 24 times) +From waveform changing in the missing packet and not being in the repeated packet –‘robotic’, ‘metallic’, ‘electronic’ (7 times) 20% loss –‘robotic’, ‘metallic’, ‘digital’, ‘electronic’ (15 times) –‘broken up’ and ‘cutting out’ (10 times) –‘fuzzy’ and ‘buzzy’ infrequently (2 times) 5 said ‘echo’, 10 described major volume changes –Not reliably see the cause of the degradation

Discussion 5% loss is different than reference condition (despite stats) because of descriptions –But subjects cannot identify it well –Need a tool to identify impairments 20% loss is worse than bad mic based on quality, but is the same based on physiological results –need to combine physiological and subjective Methodology of field trials to design controlled experiments can help understand media quality issues

Conclusion Audio quality degradation not primarily from loss –Volume, mic and echo are worse –And these are easy to fix! Educating users harder. By getting descriptions, should be easier to allow users to diagnose problems –Ex: ‘fuzzy’ or ‘buzzy’ to repetition for repair Volume changes harder –Could be reflected back to the user –Could do expert system to make sure certain quality before being allowed in

Future Work Delay and jitter compared with other degradations Interactive environments rather than just listening –Ex: echo probably worse Combination effects –Ex: bad mic plus too loud

Evaluation of Science? Category of Paper Space devoted to Experiments? Good Science? –1-10 –See if scale meshes with amount of experimental validation