Information-Theoretic Listening

Presentation transcript:

Information-Theoretic Listening
Paris Smaragdis
Machine Listening Group, MIT Media Lab

Outline
- Defining a global goal for computational audition
- Example 1: Developing a representation
- Example 2: Developing grouping functions
- Conclusions

Auditory Goals
- The goals of computational audition are all over the place; should they be?
- Most theories lack formal rigor
- Computational listening amounts to fitting psychoacoustic experiment data

Auditory Development
- What really made audition?
- How did our hearing evolve?
- How did our environment shape our hearing?
- Can we evolve, rather than instruct, a machine to listen?

Goals of our Sensory System
- Distinguish independent events
  - Object formation
  - Gestalt grouping
- Minimize thinking and effort
  - Perceive as few objects as possible
  - Think as little as possible

Entropy Minimization as a Sensory Goal
- Long history linking entropy and perception
  - Barlow, Attneave, Atick, Redlich, etc.
- Entropy can measure statistical dependencies
- Entropy can measure economy in both ‘thought’ (algorithmic entropy) and ‘information’ (Shannon entropy)

What is Entropy?
- Shannon entropy: a measure of order, predictability, information, correlations, simplicity, stability, redundancy, ...
- High entropy = little order
- Low entropy = lots of order
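
As a concrete reading of the slide above, Shannon entropy of a discrete distribution is H = -sum_i p_i log2 p_i, measured in bits. The sketch below estimates it from symbol counts; the function name and the two test sequences are illustrative choices, not taken from the slides.

```python
import math
from collections import Counter

def shannon_entropy(symbols):
    """Entropy in bits of the empirical distribution of `symbols`."""
    counts = Counter(symbols)
    total = len(symbols)
    # Each distinct symbol contributes -p * log2(p), with p its relative frequency.
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())

# Lots of order: one symbol repeated, so the next value is fully predictable.
ordered = [0] * 16
# Little order: 16 equally likely symbols, so nothing is predictable.
disordered = list(range(16))

h_ordered = shannon_entropy(ordered)
h_disordered = shannon_entropy(disordered)
```

The repeated sequence has the lowest possible entropy and the 16 equally likely symbols the highest for an alphabet of that size, matching the "low entropy = lots of order" reading.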

Representation in Audition
- Frequency decompositions
- Cochlear hint
- Easier to look at data!
- Sinusoidal bases
- Signal processing framework

Evolving a Representation
- Develop a basis decomposition
- Bases should be statistically independent
  - Satisfies the minimal-entropy idea
- Decomposition should be data driven
  - Account for different domains

Method
- Use short snippets of natural sounds to derive bases
- Analyze these snippets with independent component analysis (ICA)
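
For readers new to ICA, the standard formulation behind this method can be written as below. The notation is chosen here and does not come from the slides: x is a short frame of sound samples, s the independent components, A the basis matrix, and W a square invertible unmixing matrix.

```latex
\begin{aligned}
\mathbf{x} &= \mathbf{A}\,\mathbf{s}, \qquad \mathbf{u} = \mathbf{W}\,\mathbf{x}, \\
I(\mathbf{u}) &= \sum_{i} H(u_i) - H(\mathbf{u}) \;\ge\; 0, \\
H(\mathbf{u}) &= H(\mathbf{x}) + \log\lvert \det \mathbf{W} \rvert .
\end{aligned}
```

The mutual information I(u) is zero exactly when the components of u are independent. Because H(x) is fixed by the data, minimizing I(u) over W is the same as minimizing the summed marginal entropies minus log|det W|, which is one way to read the minimal-entropy goal. The columns of the inverse of W are then the learned bases.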

Results
- We obtain sinusoidal bases!
- Transform is driven by the environment
- Uniform procedure for different domains

Auditory Grouping
- Heuristics
  - Hard to implement on computers
  - Require even more heuristics to resolve ambiguity
- Weak definitions
- Bootstrapped to individual domains
  - Vision Gestalt, Auditory Gestalt, …
- Examples: Common AM, Common FM, Good Continuation

Method
- Goal: Find the grouping that minimizes scene entropy
- Steps: Parameterized auditory scene s(t,n), then density estimation Ps(i), then Shannon entropy calculation
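
The three steps of this method (scene, density estimate, entropy) can be sketched as follows. This is a toy under stated assumptions, not the author's implementation. The slides do not define n, so here n is a hypothetical parameter that sets the amplitude-modulation rate of a second sine, with n = 0.5 giving both sines the same modulation. The density estimate is a joint histogram of the two quantized envelopes, which is one of several possible choices.

```python
import math
from collections import Counter

def entropy_bits(symbols):
    """Shannon entropy in bits of the empirical distribution of `symbols`."""
    counts = Counter(symbols)
    total = len(symbols)
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())

def quantize(a, bins):
    """Map an envelope value in [0, 1] to a histogram bin index."""
    return min(int(a * bins), bins - 1)

def scene(n, samples=2000, base_rate=3.0):
    """Hypothetical s(t, n): amplitude envelopes of two sines over t in [0, 1).

    Sine 1 is modulated at base_rate; sine 2 at base_rate * n / 0.5,
    so the two share a common modulation only when n = 0.5.
    """
    rate2 = base_rate * n / 0.5
    pairs = []
    for k in range(samples):
        t = k / samples
        a1 = 0.5 + 0.5 * math.sin(2 * math.pi * base_rate * t)
        a2 = 0.5 + 0.5 * math.sin(2 * math.pi * rate2 * t)
        pairs.append((a1, a2))
    return pairs

def scene_entropy(n, bins=16):
    """Density estimation by joint histogram, then Shannon entropy."""
    cells = [(quantize(a1, bins), quantize(a2, bins)) for a1, a2 in scene(n)]
    return entropy_bits(cells)

# Sweep the scene parameter and find where the scene entropy is lowest.
sweep = {n / 10: scene_entropy(n / 10) for n in range(1, 10)}
best_n = min(sweep, key=sweep.get)
```

In this toy the entropy is lowest when the two envelopes move together, because the second envelope then adds no information beyond the first.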

Common Modulation - Frequency
[Figure: scene description and entropy measurement; axes: time, frequency; label: n = 0.5]

Common Modulation - Amplitude
[Figure: scene description and entropy measurement; axes: time, sine 1 amplitude, sine 2 amplitude; label: n = 0.5]

Common Modulation - Onset/Offset
[Figure: scene description and entropy measurement; axes: time, sine 1 amplitude, sine 2 amplitude; label: n = 0.5]

Similarity/Proximity - Harmonicity I
[Figure: scene description and entropy measurement; axes: time, frequency]

Similarity/Proximity - Harmonicity II
[Figure: scene description and entropy measurement; axes: time, frequency]

Simple Scene Analysis Example
- 5 sinusoids, 2 groups
- Simulated annealing algorithm
  - Input: Raw sinusoids
  - Goal: Entropy minimization
  - Output: Expected grouping
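
A minimal sketch of such an experiment, under assumptions that are not in the slides: each sinusoid is represented only by a quantized amplitude envelope, tracks 0 to 2 share one modulator and tracks 3 to 4 another, the cost of a grouping is the sum of the joint entropies within each group, and both groups must stay non-empty. All names and settings (scene_cost, anneal, temperatures, step count) are illustrative.

```python
import math
import random
from collections import Counter

def entropy_bits(symbols):
    """Shannon entropy in bits of the empirical distribution of `symbols`."""
    counts = Counter(symbols)
    total = len(symbols)
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())

def envelope_bins(rate, samples=400, bins=8):
    """Quantized envelope 0.5 + 0.5*sin(2*pi*rate*t) for t in [0, 1)."""
    out = []
    for k in range(samples):
        a = 0.5 + 0.5 * math.sin(2 * math.pi * rate * k / samples)
        out.append(min(int(a * bins), bins - 1))
    return out

# Hypothetical scene: 5 sinusoids, two underlying modulators.
TRACKS = [envelope_bins(3.0)] * 3 + [envelope_bins(7.0)] * 2

def scene_cost(labels):
    """Sum over the two groups of the joint entropy of their tracks."""
    cost = 0.0
    for group in (0, 1):
        members = [TRACKS[i] for i, g in enumerate(labels) if g == group]
        cost += entropy_bits(list(zip(*members)))
    return cost

def anneal(steps=2000, t_start=2.0, cooling=0.995, seed=0):
    rng = random.Random(seed)
    labels = [rng.randint(0, 1) for _ in TRACKS]
    while len(set(labels)) < 2:  # start with both groups populated
        labels = [rng.randint(0, 1) for _ in TRACKS]
    cost = scene_cost(labels)
    best, best_cost = labels[:], cost
    temp = t_start
    for _ in range(steps):
        candidate = labels[:]
        i = rng.randrange(len(candidate))
        candidate[i] = 1 - candidate[i]  # move one track to the other group
        if len(set(candidate)) == 2:  # reject moves that empty a group
            new_cost = scene_cost(candidate)
            delta = new_cost - cost
            # Always accept improvements; accept worse moves with Boltzmann probability.
            if delta <= 0 or rng.random() < math.exp(-delta / temp):
                labels, cost = candidate, new_cost
                if cost < best_cost:
                    best, best_cost = labels[:], cost
        temp *= cooling  # geometric cooling schedule
    return best, best_cost

best, best_cost = anneal()
```

The non-empty constraint matters for this particular cost: without it, putting every track in one group would never cost more than the intended split, since a joint entropy cannot exceed the sum of its parts.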

Important Notes
- No definition of time
- Developed a concept of frequency
- No parameter estimation requirement
  - Operations on data, not parameters
- No parameter setting!

Conclusions
- Elegant and consistent formulation
  - No constraint over data representation
  - Uniform over different domains (cross-modal!)
  - No parameter estimation
  - No parameter tuning!
- Biological plausibility
  - Barlow et al. ...
- Insight into perception development

Future Work
- Good cost function?
  - Joint entropy vs. entropy of sums
  - Shannon entropy vs. Kolmogorov complexity
  - Joint statistics (cumulants, moments)
- Incorporate time
  - Sounds have time dependencies I’m ignoring
- Generalize to include perceptual functions

Teasers
- Dissonance and Entropy
- Pitch Detection
- Instrument Recognition