SPARSITY & SPEECH SCIENCE? TOWARDS A DATA-DRIVEN CHARACTERIZATION OF SPEECH MOTOR CONTROL V IKRAM R AMANARAYANAN University of Southern California, Los.

Slides:



Advertisements
Similar presentations
Mining customer ratings for product recommendation using the support vector machine and the latent class model William K. Cheung, James T. Kwok, Martin.
Advertisements

SPPA 403 Speech Science1 Unit 3 outline The Vocal Tract (VT) Source-Filter Theory of Speech Production Capturing Speech Dynamics The Vowels The Diphthongs.
Causation, Control, and the Evolution of Complexity H. H. Pattee Synopsis Steve B. 3/12/09.
Challenge in neuroscience  Neuroscience is a very broad field. It covers everything from gene expression, to a single neuron firing, to activity across.
Speech Group INRIA Lorraine
Presented by: Mingyuan Zhou Duke University, ECE April 3, 2009
Forward and Inverse Kinematics CSE 3541 Matt Boggus.
Image Denoising using Locally Learned Dictionaries Priyam Chatterjee Peyman Milanfar Dept. of Electrical Engineering University of California, Santa Cruz.
NDT Mary Rose Franjoine PT, DPT, MS, PCS
1 Finite-Length Scaling and Error Floors Abdelaziz Amraoui Andrea Montanari Ruediger Urbanke Tom Richardson.
CSCE 641: Forward kinematics and inverse kinematics Jinxiang Chai.
“Random Projections on Smooth Manifolds” -A short summary
History, Theory, and Philosophy of Science (In SMAC + RT) 7th smester -Fall 2005 Institute of Media Technology and Engineering Science Aalborg University.
CSCE 641: Forward kinematics and inverse kinematics Jinxiang Chai.
Reporter : Mac Date : Multi-Start Method Rafael Marti.
CSCE 689: Forward Kinematics and Inverse Kinematics
Lecture outline Support vector machines. Support Vector Machines Find a linear hyperplane (decision boundary) that will separate the data.
Near-Optimal Decision-Making in Dynamic Environments Manu Chhabra 1 Robert Jacobs 2 1 Department of Computer Science 2 Department of Brain & Cognitive.
Slide# Ketter Hall, North Campus, Buffalo, NY Fax: Tel: x 2400 Control of Structural Vibrations.
Nonlinear Dimensionality Reduction by Locally Linear Embedding Sam T. Roweis and Lawrence K. Saul Reference: "Nonlinear dimensionality reduction by locally.
SOMTIME: AN ARTIFICIAL NEURAL NETWORK FOR TOPOLOGICAL AND TEMPORAL CORRELATION FOR SPATIOTEMPORAL PATTERN LEARNING.
Holistic Music Therapy and Rehabilitation Jennifer Townsend NMT, MT-BC Neurologic Music Therapist Music Therapist-Board Certified The National Flute Association.
ENGINEERING BIOMEDICAL Michael G. Heinz Background: PhD at MIT –Speech and Hearing Bioscience and Technology Postdoctoral Fellow at Johns Hopkins Univ.
Trends in Motor Control
Natural Language Understanding
THEORIES OF MIND: AN INTRODUCTION TO COGNITIVE SCIENCE Jay Friedenberg and Gordon Silverman.
Manifold learning: Locally Linear Embedding Jieping Ye Department of Computer Science and Engineering Arizona State University
By Paul Cottrell, BSc, MBA, ABD. Author Complexity Science, Behavioral Finance, Dynamic Hedging, Financial Statistics, Chaos Theory Proprietary Trader.
Multiple-access Communication in Networks A Geometric View W. Chen & S. Meyn Dept ECE & CSL University of Illinois.
SOMAYEH SHAHSAVARANI 90/1/29 Speech Production. Language SpeechSigningWritingPainting.
Department of Information Technology Indian Institute of Information Technology and Management Gwalior AASF hIQ 1 st Nov ‘09 Department of Information.
A Framework for Distributed Model Predictive Control
Integrating Neural Network and Genetic Algorithm to Solve Function Approximation Combined with Optimization Problem Term presentation for CSC7333 Machine.
Cs: compressed sensing
IRCS/CCN Summer Workshop June 2003 Speech Recognition.
Support Vector Machines Reading: Ben-Hur and Weston, “A User’s Guide to Support Vector Machines” (linked from class web page)
Compressive sampling and dynamic mode decomposition Steven L. Brunton1, Joshua L. Proctor2, J. Nathan Kutz1 , Journal of Computational Dynamics, Submitted.
USC Linguistics Resonance tuning in soprano singing and vocal tract shaping: Comparison of sung and spoken vowels 2pSC29 Shrikanth Narayanan *^, Erik Bresch.
Computer Vision Lab. SNU Young Ki Baik Nonlinear Dimensionality Reduction Approach (ISOMAP, LLE)
Approximation of Protein Structure for Fast Similarity Measures Fabian Schwarzer Itay Lotan Stanford University.
CSCE 441: Computer Graphics Forward/Inverse kinematics Jinxiang Chai.
Motor Control. Beyond babbling Three problems with motor babbling: –Random exploration is slow –Error-based learning algorithms are faster but error signals.
Human Brain and Behavior Laboratory Center for Complex Systems & Brain Sciences Neural mechanisms of social coordination: a continuous EEG analysis using.
Introduction: Brain Dynamics Jaeseung Jeong, Ph.D Department of Bio and Brain Engineering, KAIST.
Raquel A. Romano 1 Scientific Computing Seminar May 12, 2004 Projective Geometry for Computer Vision Projective Geometry for Computer Vision Raquel A.
Data Mining Course 0 Manifold learning Xin Yang. Data Mining Course 1 Outline Manifold and Manifold Learning Classical Dimensionality Reduction Semi-Supervised.
Concept learning, Regression Adapted from slides from Alpaydin’s book and slides by Professor Doina Precup, Mcgill University.
Article Summary of The Structural Complexity of Software: An Experimental Test By Darcy, Kemerer, Slaughter and Tomayko In IEEE Transactions of Software.
Causality, symmetry, brain, evolution, DNA, and a new theory of Physics Sergio Pissanetzky 1.
Model-based learning: Theory and an application to sequence learning P.O. Box 49, 1525, Budapest, Hungary Zoltán Somogyvári.
Speech Science II Capturing and representing speech.
Support Vector Machines Reading: Ben-Hur and Weston, “A User’s Guide to Support Vector Machines” (linked from class web page)
Dynamic Causal Model for evoked responses in MEG/EEG Rosalyn Moran.
Chapter 4 Motor Control Theories Concept: Theories about how we control coordinated movement differ in terms of the roles of central and environmental.
Scientific Data Analysis via Statistical Learning Raquel Romano romano at hpcrd dot lbl dot gov November 2006.
Asynchronous Control for Coupled Markov Decision Systems Michael J. Neely University of Southern California Information Theory Workshop (ITW) Lausanne,
CSCE 441: Computer Graphics Forward/Inverse kinematics Jinxiang Chai.
Neural Network Approximation of High- dimensional Functions Peter Andras School of Computing and Mathematics Keele University
RiskTeam/ Zürich, 6 July 1998 Andreas S. Weigend, Data Mining Group, Information Systems Department, Stern School of Business, NYU 2: 1 Nonlinear Models.
GEPPETO 1 : A modeling approach to study the production of speech gestures Pascal Perrier (ICP – Grenoble) with Stéphanie Buchaillard (PhD) Matthieu Chabanas.
Machine Learning Artificial Neural Networks MPλ ∀ Stergiou Theodoros 1.
An Articulatory Analysis of Phonological Transfer Using Real-Time MRI Joseph Tepperman, Erik Bresch, Yoon-Chul Kim, Sungbok Lee, Louis Goldstein, and Shrikanth.
Summary of points for WS04 – from Weiss & Jeannerod (1998)
CSCE 441: Computer Graphics Forward/Inverse kinematics
Character Animation Forward and Inverse Kinematics
Machine Learning Basics
CSCE 441: Computer Graphics Forward/Inverse kinematics
Jessica McKee Speech, Language and Hearing Sciences
Hydrology Modeling in Alaska: Modeling Overview
Intuition for latent factors and dynamical systems.
Presentation transcript:

SPARSITY & SPEECH SCIENCE? TOWARDS A DATA-DRIVEN CHARACTERIZATION OF SPEECH MOTOR CONTROL V IKRAM R AMANARAYANAN University of Southern California, Los Angeles

SPEECH PRODUCTION INVOLVES COMPLEX COORDINATION! 2 Scientific understanding S. Narayanan et al (2004), JASA. Clinical rehabilitation LIPS THROAT How do we model speech motor control? Understanding languages WHY STUDY THIS? Robotics/prosthetics

THE SPEECH COMMUNICATION CHAIN 3 Image adapted from: Jianwu Dang, The speech communication system is complex! GOAL: Build a systems-level model of the speech motor controller.

CONCEPTUAL OUTLINE OF MY WORK 4 CONTROLLER BEHAVIOR Hypotheses: kinematic & postural planning – pauses Use results : insight into motor control models Movement primitives: low-dim. space for control and coordination Build better control models TOP-DOWNBOTTOM-UP

ATTACKING THE COORDINATION PROBLEM 5 One perspective: Each degree-of-freedom behaves in a well defined way as defined by a central controller Coordination: bringing of parts into proper relation. Problem: Too many degrees of freedom! Bernstein’s suggestion: The system self-organizes its multiple degrees of freedom to work in only a few principal dimensions Bernstein (1967). Coordination and regulation of movements.

6 ARTICULATORY PRIMITIVES A set of time-varying functional units (“synergies”) or basis functions, weighted combinations of which can be used to represent any movement of articulators in the vocal tract. RECONSTRUCTIO N ACCURACY MODEL-DEPENDENT SMOOTHNESS OR SPARSITY V. Ramanarayanan et al, JASA, under review. OPTIMIZATION GOAL: Find a spatiotemporal dictionary of primitives that trades off : We designed an algorithm that solves this optimization problem

EXAMPLE: /kʌt/ as in “CUT” 7 Captures linguistically meaningful events Release of /k/ Vowel Formation of /t/

ONGOING: FINDING SYNERGIES OF CONTROL INPUTS THAT DRIVE THIS DYNAMICAL SYSTEM 8 Chhabra and Jacobs (2006). Neural Computation Goal: To find the optimal control signals u(t) that minimize cost J Problem: Computationally intractable for nonlinear systems Solution: Use approx. methods, find locally optimal solns Li and Todorov (2005). Iterative LQR/LQG methods ACCURACY OF MOVEMENT TRAJECTORY SMOOTHNESS OPTIMIZATION GOAL: Find a dictionary of control signals u(t) that trades off : VERY LARGE CONTROL VALUES

CONTRIBUTIONS/FINDINGS 9 CONTROLLER BEHAVIOR STRUCTURE (VOCAL APPARATUS) 1. An algorithmic framework that can learn interpretable spatio-temporal dictionaries of movement primitives from input timeseries data. 2. A control framework that is based on a small dictionary of control signals, i.e., a lower dimensional control manifold.

OUTLOOK: DIVERSE EE PERSPECTIVES 10 CONTROL perspective – Low-dimensional control manifold: Can we find synergies of control inputs that can drive this stochastic nonlinear dynamical system? COMMUNICATION perspective – Production-perception links: Do we produce speech movements (and thus, primitives) for optimal perception of sounds? SIGNAL PROCESSING perspective – Better machine learning: How can we define better models that capture an optimal number of primitives of variable temporal lengths? ACKNOWLEDGEMENTS: SAIL and SPAN groups, USC and NIH Grant DC