2nd Workshop on Wideband Speech Quality - June Perceptual Wideband Audio Quality Assessments Using PEAQ Christian Schmidmer Opticom GmbH, Erlangen
Contents: Quality, definitions; User expectation; Subjective tests; Psychoacoustics; PEAQ; PESQ vs. PEAQ
Aspects of Perceived Quality: Conversational Quality = ...
What is “Quality”? “Quality is the difference between what we perceive and what we expect.” (from the habilitation thesis of Prof. Ute Jekosch) “…they are used to phones that sound like a phone.” (Frank Meier, Infineon) Maybe more important: …is for free.
Differences in Perception of Voice and Audio: Experience, a priori knowledge; Expectation; Cognitive effects; “Error correction”. Different subjective tests require different models.
The Problem of Subjective Scales
High Quality:
  Bitrate      MOS
  256 kbit/s   5
  128 kbit/s   4
  …
  64 kbit/s    1
Intermediate Quality:
  Bitrate      MOS
  128 kbit/s   5
  64 kbit/s    4
  …
  16 kbit/s    1
The range of qualities in the subjective test defines the subjective scale!
MOS acc. to P.800: Standardized listening test procedure according to ITU-T P.800ff. Absolute Category Rating (ACR) test, no comparison to a reference signal (original). Question: “How good does it sound?” 5-point grading scale ('opinion scale'), table columns Impairment / Grade: Excellent, Good, Fair, Poor, Bad. Averaging over test subjects gives the MOS ('Mean Opinion Score'). Language dependent!
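The averaging step can be illustrated with a minimal sketch. The numeric coding 5 (Excellent) down to 1 (Bad) is the usual P.800 ACR convention; the function name and the votes are hypothetical and serve only as an illustration.

```python
from statistics import mean

# Usual numeric coding of the ACR categories (5 = Excellent ... 1 = Bad).
ACR_SCALE = {"Excellent": 5, "Good": 4, "Fair": 3, "Poor": 2, "Bad": 1}


def mean_opinion_score(votes):
    """Average the category votes of all subjects for one test condition."""
    if not votes:
        raise ValueError("at least one vote is required")
    return mean(ACR_SCALE[vote] for vote in votes)


# Hypothetical votes of six subjects for a single condition.
votes = ["Good", "Fair", "Good", "Excellent", "Poor", "Good"]
mos = mean_opinion_score(votes)
print(f"MOS = {mos:.2f}")
```

A real test would also report a confidence interval per condition, since the MOS alone hides how much the subjects disagree.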
Subjective Assessment in ITU-R BS.1116: Standardised assessment procedure for 'small impairments' in audio systems (ITU-R 1994). Comparison between reference and test signal. Very sensitive to subtle distortions. Double-blind triple-stimulus with hidden reference. Stimuli: Original, A, B (A / B = original / coded or coded / original).
Subjective Assessment in ITU-R BS.1116: Continuous grading scale with “anchors”. “Subjective Difference Grade” (SDG). Question: “How different do the files sound?”
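The slide does not spell out how the SDG is formed. A minimal sketch, assuming the common convention that the SDG is the grade given to the coded signal minus the grade given to the hidden reference, both on a 1.0 to 5.0 scale; the function name and the grades are hypothetical.

```python
def subjective_difference_grade(grade_test, grade_hidden_reference):
    """SDG = grade of the signal under test minus grade of the hidden reference.

    Both grades are assumed to lie on a continuous 1.0 .. 5.0 scale.
    """
    for grade in (grade_test, grade_hidden_reference):
        if not 1.0 <= grade <= 5.0:
            raise ValueError("grades must lie between 1.0 and 5.0")
    return grade_test - grade_hidden_reference


# Hypothetical trial: the subject identifies the reference correctly (5.0)
# and grades the coded signal 3.5.
sdg = subjective_difference_grade(3.5, 5.0)
print(f"SDG = {sdg:+.1f}")
```

Under this convention a negative SDG means the coded signal was heard as impaired, and a positive SDG means the subject mistook the hidden reference for the coded signal.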
Subjective Testing of Intermediate Audio Quality (IAQ): “MUSHRA”, Multi Stimulus Test with Hidden Reference and Anchors. Developed by EBU working group B/AIM; targets IAQ. ITU-R BS.1534.
MUSHRA Test, Training of Subjects: subjects can randomly access all types of codecs at similar bitrate; comparison with CD-quality reference; two low-pass 'anchors' (7 kHz, 3.5 kHz) included.
MUSHRA Test, Scoring Phase: comparison with CD reference, hidden reference included; two low-pass 'anchors' (7 kHz, 3.5 kHz) included; subjects can randomly assess all codecs under test of similar bitrate at the same time; subjects adjust slider, no score involved; slider mapped to
Comparison of Subjective Test Methods
Temporal Masking: pre-, simultaneous and postmasking around the masker (figure axes: t [ms], SL [dB]). Premasking: 2-5 ms. Postmasking: 120 ms. Depending on the signal characteristics of the masker.
Pitch Scale / Critical Bands: A sine tone and a noise of critical bandwidth with the same center frequency and energy density are perceived equally loud.
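The pitch scale can be illustrated by converting frequency to critical-band rate in Bark. The sketch below uses the widely quoted Zwicker and Terhardt approximation; that formula is not taken from the slides and is not necessarily the mapping PEAQ uses, and the function name is hypothetical.

```python
import math


def hz_to_bark(frequency_hz):
    """Approximate critical-band rate (Bark) for a frequency in Hz.

    Zwicker & Terhardt approximation; an assumption for illustration only.
    """
    if frequency_hz < 0:
        raise ValueError("frequency must be non-negative")
    return (13.0 * math.atan(0.00076 * frequency_hz)
            + 3.5 * math.atan((frequency_hz / 7500.0) ** 2))


for f in (100, 1000, 3500, 7000, 16000):
    print(f"{f:>6} Hz -> {hz_to_bark(f):5.2f} Bark")
```

The mapping is roughly linear at low frequencies and compresses towards high frequencies, which is why equal steps in Bark cover ever wider ranges in Hz.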
Threshold in Quiet - Masked Threshold (figure label: Threshold in Quiet)
PEAQ
PEAQ is based on:
– PAQM: KPN Research, Netherlands / OPTICOM
– NMR: Fraunhofer, Germany / OPTICOM
– DIX: TU Berlin / Deutsche Telekom Berkom
– POM: CCETT, France
– PERCEVAL: CRC, Canada
– "Tool box": IRT, Germany
ITU-R TG 10/4: Call for proposals (1995)
Jan released as ITU-R Rec. BS.1387
Intrusive Testing (diagram: A, Network X, Network Y, B)
Comparison with known stimulus:
+ Very high accuracy
+ Black box approach (no knowledge of DUT)
- Requires a reference signal
- Generates traffic
Alternatively both signals may be captured by the test system!
Two Versions of PEAQ: PEAQ “Basic”: computational efficiency, realtime performance. PEAQ “Advanced”: highest possible accuracy.
Structure of a perceptual measurement tool: Reference (= sent file) -> Perceptual Model -> a; Test (= received file) -> Perceptual Model -> b; a, b -> Feature Extractor -> Cognitive Model -> MOS (Quality Measure)
Perceptual Model, PEAQ “Basic”: Input Signal (fs = 48 kHz or 44.1 kHz) -> FFT & Scaling (2048 points, 42.6 ms / 23.4 Hz; Listening Level in dB SPL) -> Outer and Middle Ear Weighting -> Grouping into Critical Bands (¼ Bark, “Pitch”) -> + Internal Noise -> Spreading -> Temporal Masking (forward masking) -> Excitation
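The frame figures quoted for the FFT stage follow directly from the transform length and the sample rate. A short sketch of that arithmetic; the function name is hypothetical.

```python
def frame_parameters(fft_size, sample_rate_hz):
    """Return (frame duration in ms, frequency resolution in Hz)."""
    duration_ms = 1000.0 * fft_size / sample_rate_hz
    resolution_hz = sample_rate_hz / fft_size
    return duration_ms, resolution_hz


for fs in (48000, 44100):
    duration_ms, resolution_hz = frame_parameters(2048, fs)
    print(f"fs = {fs} Hz: {duration_ms:.2f} ms per frame, "
          f"{resolution_hz:.2f} Hz per bin")
```

At 48 kHz this gives about 42.67 ms and 23.44 Hz, which matches the slide's 42.6 ms / 23.4 Hz up to rounding; at 44.1 kHz the frame is slightly longer and the bins slightly narrower.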
MOVs used in PEAQ “Basic” Version
Perceptual Model, PEAQ “Advanced”
PEAQ vs. MUSHRA: EBU Tests of Internet Audio Codecs. Codecs: Microsoft Windows Media 4; MPEG-4 AAC (Fraunhofer); MP3 (Fraunhofer); Quicktime 4, Music-Codec 2 (QDesign); Real Audio 5.0; RealAudio G2; MPEG-4 TwinVQ (Yamaha)
Constraints of MUSHRA Testing
– no absolute scores -> scores depend on the test condition
– low-pass anchors are only one quality dimension -> disturbance of artefacts is another one
– spreading of the scale from best to worst -> what about adding new items to an existing test?
In order to verify PEAQ performance we must adjust the best and worst item (not the anchors!)
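The slide does not say how the best and worst item are adjusted. One possible reading, sketched here purely as an assumption, is a linear rescaling of the objective scores so that they coincide with the subjective scores at the best and worst item of the test. All names and numbers are hypothetical.

```python
def align_to_subjective(objective, subjective):
    """Linearly map objective scores onto the subjective scale.

    The mapping is pinned at the subjectively best and worst items
    (codecs under test, not the anchors). Assumed procedure, for illustration.
    """
    best = max(subjective, key=subjective.get)
    worst = min(subjective, key=subjective.get)
    span = objective[best] - objective[worst]
    if span == 0:
        raise ValueError("best and worst item have identical objective scores")
    scale = (subjective[best] - subjective[worst]) / span
    return {item: subjective[worst] + scale * (score - objective[worst])
            for item, score in objective.items()}


# Hypothetical MUSHRA scores (0..100) and objective scores for three codecs.
subjective = {"codec_a": 90.0, "codec_b": 60.0, "codec_c": 25.0}
objective = {"codec_a": -0.5, "codec_b": -2.0, "codec_c": -3.5}
aligned = align_to_subjective(objective, subjective)
for item in subjective:
    print(f"{item}: subjective {subjective[item]:.1f}, aligned {aligned[item]:.1f}")
```

After such an alignment only the items between the two pinned end points carry information about how well the objective measure tracks the listeners.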
PEAQ vs. MUSHRA (EBU Test)
Results
When to use PEAQ or PESQ
Is it a BS.1116 or MUSHRA experiment? Use PEAQ!
Is the subjective test P.800? Is it speech?
– Yes: Is the bandwidth <= 8 kHz?
  » Yes: Use PESQ!
  » No: Use PEAQ with care!
– No:
  » Use PEAQ with care!
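The decision tree above can be written down as a small helper. This is only a sketch of the slide's logic; the function name, parameter names and return strings are hypothetical.

```python
def recommend_metric(subjective_test, is_speech=False, bandwidth_hz=float("inf")):
    """Return the measure suggested by the decision tree on the slide.

    subjective_test: "BS.1116", "MUSHRA" or "P.800" (case-insensitive).
    is_speech and bandwidth_hz only matter for P.800 experiments.
    """
    test = subjective_test.strip().upper()
    if test in {"BS.1116", "MUSHRA"}:
        return "PEAQ"
    if test == "P.800":
        if is_speech and bandwidth_hz <= 8000:
            return "PESQ"
        return "PEAQ with care"
    raise ValueError(f"test method not covered by the slide: {subjective_test!r}")


print(recommend_metric("MUSHRA"))
print(recommend_metric("P.800", is_speech=True, bandwidth_hz=3400))
print(recommend_metric("P.800", is_speech=True, bandwidth_hz=16000))
print(recommend_metric("P.800", is_speech=False))
```

Test methods that the slide does not mention raise an error rather than returning a guess.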
Final Question: Can I use PESQ instead of PEAQ?
– Perception of voice differs from perception of music
– PESQ time alignment fails on music
– PEAQ and PESQ are modelling different subjective tests
No!
OPTICOM Germany More Information: Thank you!