Harvestworks Part 3: Audio analysis & machine learning
Rebecca Fiebrink, Princeton University


Real-time audio analysis
Goal: Analyze audio within the same sample-synchronous framework as synthesis & interaction.

The Unit Analyzer
[Figure: signal-flow diagram. Labels: impulse generator ("send impulse"), BiQuad filter (center freq, radius), DAC, FFT, spectral feature extractors, IFFT, time-domain feature extractors. Legend: UGen = Unit Generator (old); UAna = Unit Analyzer (new).]

The Unit Analyzer
Like a unit generator
– Black box for computation
– Plug into a directed graph/network/patch
Unlike a unit generator
– Input is samples, data, and/or metadata
– Output is samples, data, and/or metadata
– Not tied to sample rate; computed on-demand

=> chuck
=^ upchuck
See upchuck_operator.ck, upchuck_function.ck, continuous_feature_extraction.ck

The UAnaBlob
Upchucked by UAna
Generic representation for metadata
– Real and complex arrays
– Spectra, feature values, or user-defined
– Timestamped
One associated with each UAna
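The on-demand, pull-based behaviour described above can be sketched outside ChucK. The following Python is only an illustration, not ChucK's implementation: the classes `Blob` and `Analyzer` and the method name `upchuck` are hypothetical stand-ins chosen to mirror the slide's vocabulary.

```python
import math
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Blob:
    """Hypothetical stand-in for a UAnaBlob: timestamped analysis results."""
    when: int = 0                                  # time (in samples) of the last computation
    fvals: List[float] = field(default_factory=list)


class Analyzer:
    """Hypothetical stand-in for a UAna: it computes only when asked."""

    def __init__(self, compute: Callable[[List[float]], List[float]],
                 upstream: Optional["Analyzer"] = None):
        self.compute = compute
        self.upstream = upstream
        self.blob = Blob()      # one blob associated with each analyzer
        self.calls = 0          # counts how often this analyzer actually ran

    def upchuck(self, now: int) -> Blob:
        # Pull results from upstream first, then compute our own.
        inputs = self.upstream.upchuck(now).fvals if self.upstream else []
        self.blob = Blob(when=now, fvals=self.compute(inputs))
        self.calls += 1
        return self.blob


# A source that "analyzes" to two fixed values, feeding an RMS feature extractor.
source = Analyzer(lambda _inputs: [3.0, -4.0])
rms = Analyzer(lambda v: [math.sqrt(sum(x * x for x in v) / len(v))], upstream=source)

blob = rms.upchuck(now=512)   # nothing runs until this call
```

Nothing in the chain is evaluated until `rms.upchuck(...)` is called, which is the sense in which analysis is decoupled from the sample rate.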

FFT/IFFT
Takes care of:
– Buffering input / overlap-adding output
– Maintaining window and FFT sizes
– Mediating audio rate and analysis “rate”
FFT outputs complex spectrum as well as magnitude spectrum
– Low-level: access/modify contents manually
– High-level: connect FFT to spectral processing UAnae
See ifft.ck, ifft_transformation.ck
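To make the bookkeeping listed above concrete, here is a minimal Python sketch of what an FFT analyzer has to manage: buffering the input into overlapping frames, applying a window, and producing both a complex spectrum and a magnitude spectrum. It uses a naive DFT for clarity rather than a fast transform, and the function names, frame size, and hop size are assumptions for illustration, not taken from ChucK.

```python
import cmath
import math


def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def dft(frame):
    """Naive O(n^2) discrete Fourier transform; returns the complex spectrum."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def frames(signal, size, hop):
    """Buffer the input into overlapping frames (hop < size gives overlap)."""
    for start in range(0, len(signal) - size + 1, hop):
        yield signal[start:start + size]


def magnitude_spectra(signal, size=16, hop=8):
    """One magnitude spectrum (bins 0..size/2) per analysis frame."""
    window = hann(size)
    result = []
    for frame in frames(signal, size, hop):
        windowed = [s * w for s, w in zip(frame, window)]
        spectrum = dft(windowed)                       # complex spectrum
        result.append([abs(c) for c in spectrum[: size // 2 + 1]])
    return result


# A sinusoid that completes exactly 4 cycles per 16-sample frame.
signal = [math.sin(2 * math.pi * 4 * t / 16) for t in range(64)]
spectra = magnitude_spectra(signal)
peak_bin = max(range(len(spectra[0])), key=lambda k: spectra[0][k])
```

With a hop of 8 and a frame size of 16, one spectrum is produced every 8 samples, which is the analysis "rate" as distinct from the audio rate.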

Example: Cross-synthesis
Apply the spectral envelope of one sound to another sound
– Ex: xsynth_robot123.ck, xsynth_guitar123.ck
– Voice spectrum taken from:
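As a rough illustration of the idea, the Python sketch below cross-synthesizes a single frame: it multiplies the complex spectrum of one sound (the carrier) by a smoothed magnitude spectrum of another (the modulator) and transforms back. This is a simplified sketch under stated assumptions, not the method used in the .ck examples: a real-time version would work frame by frame with windowing and overlap-add, and the function names and the moving-average envelope are hypothetical choices.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive DFT; with inverse=True, the inverse transform (scaled by 1/n)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
           for k in range(n)]
    return [c / n for c in out] if inverse else out


def smooth(mags, width):
    """Circular moving average; a crude estimate of the spectral envelope."""
    n = len(mags)
    return [sum(mags[(k + d) % n] for d in range(-width, width + 1)) / (2 * width + 1)
            for k in range(n)]


def cross_synthesize(carrier, modulator, width=1):
    """Shape the carrier's spectrum with the modulator's spectral envelope."""
    carrier_spectrum = dft(carrier)
    envelope = smooth([abs(m) for m in dft(modulator)], width)
    shaped = [c * e for c, e in zip(carrier_spectrum, envelope)]
    # The envelope is real and symmetric, so the result is real up to rounding.
    return [v.real for v in dft(shaped, inverse=True)]


n = 16
carrier = [1.0] + [0.0] * (n - 1)                                  # impulse: flat spectrum
modulator = [math.cos(2 * math.pi * 2 * t / n) for t in range(n)]  # energy at bin 2 only
output = cross_synthesize(carrier, modulator, width=0)             # width=0: no smoothing
```

Because the impulse has a flat spectrum, the output here takes on exactly the modulator's magnitude spectrum; with a richer carrier, the carrier's fine structure would remain while the modulator's envelope shapes it.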

Machine learning for live performance
Problem: How do we use audio and gestural features?
– There is a semantic gap between the raw data that computers use and the musical, cultural, and aesthetic meanings that humans perceive and assign.

One solution: A lot of code
What algorithm would you design to tell a computer whether a picture contains a human face?

The problem
– If your algorithm doesn’t work, how can you fix it?
– You can’t easily reuse it to do a similar task (e.g., recognizing non-human faces, such as monkey faces)
– There’s no “theory” for how to write a good algorithm
– It’s a lot of work!

Another solution: Machine learning (Classification)
Classification is a data-driven approach for applying labels to data. Once a classifier has been trained on a training set that includes the true labels, it will predict labels for new data it hasn’t seen before.

Train the classifier on a labeled dataset
Data set: a feature vector and class for every data point
[Figure labels: Data Set, Classifier]

Run the trained classifier on new data
[Figure labels: Classifier, “NO!”]

Candidates for classification
– Which gesture did the performer just make with the iCube?
– Which instruments are playing right now?
– Who is singing? What language are they singing in?
– Is this chord major or minor?
– Is this dancer moving quickly or slowly?
– Is this music happy or sad?
– Is anyone standing near the camera?

An example algorithm: kNN
– The features of an example are treated as its coordinates in n-dimensional space
– To classify a new example, the algorithm looks for its k (maybe 10) nearest neighbors in that space, and chooses the most popular class.

kNN space: Basketball or Sumo?
[Figure sequence over four slides. Axes: Feature 1: Weight, Feature 2: Height. An unlabeled point “?” is added, then K=3 is shown, and the final slide shows “S” labels.]
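The kNN procedure can be sketched in a few lines of Python. This is an illustrative sketch, not the SMIRK or ChucK implementation; the function name `knn_classify` is hypothetical, and the weight and height values are made up for the example rather than taken from the slides.

```python
import math
from collections import Counter


def knn_classify(train, query, k=3):
    """Label `query` by majority vote among its k nearest training examples.

    `train` is a list of (feature_vector, label) pairs; distance is Euclidean.
    """
    nearest = sorted(train, key=lambda example: math.dist(example[0], query))[:k]
    votes = Counter(label for _features, label in nearest)
    return votes.most_common(1)[0][0]


# Made-up (weight in kg, height in cm) training examples, for illustration only.
train = [
    ((150, 180), "sumo"),
    ((160, 185), "sumo"),
    ((140, 178), "sumo"),
    ((100, 205), "basketball"),
    ((95, 200), "basketball"),
    ((110, 210), "basketball"),
]

label = knn_classify(train, (145, 182), k=3)
```

Note that "training" here is just storing the labeled examples; all the work happens at classification time. In practice, features on different scales are usually normalized first so that no single feature dominates the distance.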

SMIRK (small music information retrieval toolkit)
For real-time application of machine learning
– Learning in ChucK
– E.g., kNN gesture classification, musical audio genre/artist classification

Interaction & on-the-fly learning
Can we make the process of training a classifier interactive? Performative?

Another technique: Neural networks
– Very early method
– Inspired by the brain
– Results in highly non-linear functions from input to output
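To show what "non-linear function from input to output" means in the smallest possible case, the Python sketch below hand-wires a network with two hidden units that computes XOR, a function no single linear unit can represent. The weights are chosen by hand purely for illustration; a trained network would learn its weights from data and would typically use a smooth activation rather than the step function assumed here.

```python
def step(x):
    """Threshold activation: fires (1.0) when the weighted input is positive."""
    return 1.0 if x > 0 else 0.0


def layer(inputs, weights, biases, activation):
    """One fully connected layer: activation(weights . inputs + bias) per unit."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


# Hand-picked weights (an assumption for illustration, not learned).
HIDDEN_W = [[1.0, 1.0],    # unit 1 fires if at least one input is on
            [1.0, 1.0]]    # unit 2 fires only if both inputs are on
HIDDEN_B = [-0.5, -1.5]
OUTPUT_W = [[1.0, -1.0]]   # "at least one" minus "both"
OUTPUT_B = [-0.5]


def network(x1, x2):
    hidden = layer([x1, x2], HIDDEN_W, HIDDEN_B, step)
    return layer(hidden, OUTPUT_W, OUTPUT_B, step)[0]


outputs = [network(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]]
```

The hidden layer is what makes the mapping non-linear: each hidden unit draws one linear boundary, and the output unit combines them.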

Combining Techniques with Wekinator
Example: Wekinator
– ChucK: Pass features to Java, receive results back and use them to make sound
– Java: Train a neural network to map features to sounds
– ChucK and Java communicate via OSC
See performance video at

Review
Machine learning can be used to:
– Apply meaningful labels (classification)
– Learn (& re-learn) functions from inputs to outputs (e.g., neural networks)
Appropriate for camera, audio, sensors, and many other types of data
Live, interactive performance is a very interesting application area

Wrap-up
– Thanks for coming, thanks to Harvestworks!
– See resources on handout; workshop webpage with slides & code
– Please fill out evaluation forms!