Songsmith: Dan Morris, Ian Simon, Sumit Basu, and the MSR Advanced Development Team Microsoft Research Using Machine Learning to Help People Make Music.

Presentation transcript:

Songsmith: Dan Morris, Ian Simon, Sumit Basu, and the MSR Advanced Development Team Microsoft Research Using Machine Learning to Help People Make Music

Computational User Experiences (CUE) group In general: HCI + (sensors, devices, machine learning, health, physiology)

Computational User Experiences (CUE) group
• Using physiological signals for input
• Health and wellness
• Creativity support tools

What is Songsmith? Songsmith Your music You, singing... “Automatic accompaniment generation for vocal melodies”

High-Risk Live Demo (What other kind of live demo is there?)

Who is Songsmith for?

Today’s Talk
• Overview and demo
• How Songsmith works
• Exposing machine learning parameters
• What are people doing with Songsmith?
• Creativity support
Microsoft Research

Songsmith: 5000’ Overview G Amin C Daug

Chords from Melody
• Songsmith’s core: Hidden Markov Model
[Diagram: Song start → Chord 1 → Chord 2 → Chord 3 → Chord 4 → Song end. Hidden states: chords. Observations: note vectors.]

HMMs in 5 Minutes or Less
[Diagram: States, Observations]
• What does an HMM do for me?
• What does an HMM want from me?

HMMs in 5 Minutes or Less
[Diagram: States, Observations]
• Possible states: C major, C minor, C diminished, …
• Transition probabilities: P(C Major | A Minor)? P(C Major | D Major)? P(F# Major | F# Major)?
• Observation probabilities: P(note vector | A Minor)? P(note vector | D Major)?

Building our HMM
• Things my HMM wants from me:
  • Possible states
  • Transition probabilities
  • Observation probabilities
  • Observations

Finding the Probabilities
• Data-driven
• Not heuristic-driven

Training Data ~300 lead sheets (vocal melodies with chords)

Processing the Database
• Convert all chords to five basic triads
• Transpose every song into the same key
• Count the number of transitions from every chord to every other chord (transition probabilities):

            C Major   C Minor   C# Major   C# Minor   D Major   D Minor   …
  C Major   …         …         …          …          …         …         …
  C Minor   …         …         …          …          …         …         …
  C# Major  …         …         …          …          …         …         …
  C# Minor  …         …         …          …          …         …         …
  D Major   …         …         …          …          …         …         …
  D Minor   …         …         …          …          …         …         …
  …         …         …         …          …          …         …         …

• Count the total duration of each melody note occurring while each chord is playing (observation probabilities):
  [Figure: notes played over C Major]
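
The two counting steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the lead-sheet format (a list of chord segments, each with the melody notes sung over it), the toy songs, and the names `count_statistics` and `normalize` are all hypothetical, and the triad-reduction and transposition steps are assumed to have been done already.

```python
from collections import defaultdict

# Hypothetical lead-sheet format: each song is a list of (chord, notes) segments,
# where notes is a list of (pitch_class, duration_in_beats) sung over that chord.
# Pitch classes run 0-11 with 0 = C. The songs are made-up toy data.
songs = [
    [("C Major", [(0, 2.0), (4, 2.0)]),
     ("F Major", [(5, 3.0), (9, 1.0)]),
     ("C Major", [(7, 4.0)])],
    [("C Major", [(4, 4.0)]),
     ("G Major", [(7, 2.0), (11, 2.0)])],
]

def count_statistics(songs):
    """Count chord-to-chord transitions and per-chord melody note durations."""
    transitions = defaultdict(lambda: defaultdict(float))
    note_durations = defaultdict(lambda: [0.0] * 12)
    for song in songs:
        for i, (chord, notes) in enumerate(song):
            if i + 1 < len(song):
                next_chord = song[i + 1][0]
                transitions[chord][next_chord] += 1
            for pitch_class, duration in notes:
                note_durations[chord][pitch_class] += duration
    return transitions, note_durations

def normalize(counts):
    """Turn a list of non-negative counts into probabilities."""
    total = sum(counts)
    return [c / total for c in counts] if total else counts

transitions, note_durations = count_statistics(songs)

# Transition probabilities out of C Major (one row of the matrix).
row = transitions["C Major"]
row_total = sum(row.values())
transition_probs = {nxt: n / row_total for nxt, n in row.items()}

# Observation distribution for C Major: share of melody time per pitch class.
c_major_notes = normalize(note_durations["C Major"])
```

A real database would also need smoothing for chord pairs that never occur; the transcript mentions "frequency smoothing" and a "conjugate prior" only as UI labels, so none is modeled here.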

Building our HMM  Things my HMM wants from me: Possible states Transition probabilities Observation probabilities Observations

Observations: what did the user sing? Input: Hard!

Building a Pitch Histogram FFT
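
As a rough illustration of what a pitch histogram is, the sketch below folds per-frame frequency estimates into 12 pitch-class bins. The transcript only names the FFT, so the input here is assumed to be the output of some FFT-based pitch tracker (one frequency per frame, or `None` for unvoiced frames); the function names and the sample frequencies are hypothetical.

```python
import math

def pitch_class(frequency_hz, a4_hz=440.0):
    """Map a frequency to a pitch class 0-11 (0 = C) in equal temperament."""
    midi_note = round(69 + 12 * math.log2(frequency_hz / a4_hz))
    return midi_note % 12

def pitch_histogram(frame_frequencies):
    """Build a normalized 12-bin histogram from per-frame pitch estimates.

    frame_frequencies holds one frequency in Hz per analysis frame, or None
    for unvoiced or silent frames (assumed output of a pitch tracker).
    """
    histogram = [0.0] * 12
    for frequency in frame_frequencies:
        if frequency is not None and frequency > 0:
            histogram[pitch_class(frequency)] += 1.0
    total = sum(histogram)
    return [count / total for count in histogram] if total else histogram

# Made-up frames: roughly C4, C4, silence, E4, G4, G4.
frames = [261.6, 262.0, None, 329.6, 392.0, 392.5]
hist = pitch_histogram(frames)
```

Counting frames weights each pitch class by how long it was sung, which mirrors the duration-based counts used for the observation probabilities.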

Building our HMM
• Things my HMM wants from me:
  • Possible states
  • Transition probabilities
  • Observation probabilities
  • Observations
• Run Viterbi algorithm, get “best” sequence of chords… thank you, HMM!
• One hitch: key determination…
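
For readers who have not seen it, here is a minimal log-space Viterbi sketch. It is a generic textbook version, not Songsmith's code: the two-chord model, its probabilities, and the use of a single note name per measure as the observation are all simplifying assumptions, and key determination is not handled.

```python
import math

def viterbi(observations, states, log_start, log_trans, log_emit):
    """Return the most likely state sequence for a list of observations.

    log_start[s] and log_trans[prev][s] are log-probabilities;
    log_emit[s][obs] is the log-probability of seeing obs in state s.
    """
    best = [{s: log_start[s] + log_emit[s][observations[0]] for s in states}]
    back = []
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            # Best predecessor for state s at this step.
            prev = max(states, key=lambda p: best[-1][p] + log_trans[p][s])
            scores[s] = best[-1][prev] + log_trans[prev][s] + log_emit[s][obs]
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final state.
    state = max(states, key=lambda s: best[-1][s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]

def logs(table):
    """Convert a dict (or dict of dicts) of probabilities to log-probabilities."""
    return {k: logs(v) if isinstance(v, dict) else math.log(v)
            for k, v in table.items()}

# Toy model with made-up numbers.
states = ["C Major", "A Minor"]
start = {"C Major": 0.6, "A Minor": 0.4}
trans = {"C Major": {"C Major": 0.7, "A Minor": 0.3},
         "A Minor": {"C Major": 0.4, "A Minor": 0.6}}
emit = {"C Major": {"C": 0.5, "E": 0.3, "A": 0.2},
        "A Minor": {"C": 0.2, "E": 0.3, "A": 0.5}}

chords = viterbi(["C", "A", "A"], states, logs(start), logs(trans), logs(emit))
```

Working in log space avoids numerical underflow on long songs, since the products of many small probabilities become sums.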

Today’s Talk
• Overview and demo
• How Songsmith works
• Exposing machine learning parameters
• What are people doing with Songsmith?
• Creativity support

So we can choose optimal chords... so what?
• My demo took 10 seconds… is that fun?
• What would a songwriter do next?
• How can we build creative exploration into a learning-driven system?

UI: Exposing Learning Parameters
• There are always hard-coded “magic numbers” in machine learning
• Machine learning also uses lots of learned parameters
• Can we let users control those numbers?

A Bad User Interface
Songsmith: A fun way to make music (if you have a PhD in math and/or computer science)
[Mock-up of a dialog with these controls:]
• Observation weight
• Transition matrix (edit me!), rows and columns C, C#, D, D#, E
• Expected pitch histograms (edit me!), rows C, C#, D, D#, E, F
• Frequency smoothing
• Conjugate prior

UI: Exposing Learning Parameters
• Can we let users control those numbers?
• Can we let users intuitively control those numbers?

Exposing Model Parameters in Songsmith

The “Happy Factor”

Happy Factor: Implementation
• Partition database into two databases (major and minor songs) using clustering
• Build separate transition probability matrix for each database
• When we actually run our HMM, blend the two transition matrices together according to user input…
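
One simple way to blend two transition matrices is row-wise linear interpolation, sketched below. The transcript does not give the actual blending formula, so the linear form, the function name `blend_transitions`, and the toy matrices are assumptions made for illustration only.

```python
def blend_transitions(major, minor, happy):
    """Linearly interpolate two row-stochastic matrices.

    major and minor map previous chord -> {next chord: probability}.
    happy is the slider value in [0, 1]; 1 means "all major database".
    """
    if not 0.0 <= happy <= 1.0:
        raise ValueError("happy must be between 0 and 1")
    blended = {}
    for prev in major:
        row = {nxt: happy * major[prev][nxt] + (1.0 - happy) * minor[prev][nxt]
               for nxt in major[prev]}
        # Renormalize to guard against rounding drift.
        total = sum(row.values())
        blended[prev] = {nxt: p / total for nxt, p in row.items()}
    return blended

# Made-up transition matrices for the two song databases.
major_db = {"C": {"C": 0.6, "Am": 0.4}, "Am": {"C": 0.7, "Am": 0.3}}
minor_db = {"C": {"C": 0.2, "Am": 0.8}, "Am": {"C": 0.3, "Am": 0.7}}

half = blend_transitions(major_db, minor_db, 0.5)
```

A convex combination of two probability rows is itself a probability row, so the blended matrix can be passed straight to the decoder.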

The “Happy Factor” Bonus question: what’s wrong with this equation?

Another “Happy Factor” Example Happy Sad

The “Jazz Factor”

Jazz Factor: Implementation
• When running our HMM, we need to make chords match the voice and each other
• Computing how well each chord fits at a given position:
  k log( P(this chord | what the user sang) ) + (1-k) log( P(this chord | the previous chord) )
• Just put k on a slider!
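
The weighted score above is easy to try out. In this sketch the candidate chords and their probabilities are made up, and `chord_score` and `best_chord` are hypothetical helper names; it only shows how moving k shifts the choice between fitting the melody and fitting the previous chord.

```python
import math

def chord_score(p_melody, p_transition, k):
    """k * log P(chord | melody) + (1 - k) * log P(chord | previous chord)."""
    return k * math.log(p_melody) + (1.0 - k) * math.log(p_transition)

# Made-up candidates: (P(chord | what the user sang), P(chord | previous chord)).
candidates = {
    "C Major": (0.60, 0.10),  # fits the melody well, unusual transition
    "A Minor": (0.30, 0.50),  # fits the melody less well, common transition
}

def best_chord(k):
    """Pick the candidate with the highest weighted log score for slider value k."""
    return max(candidates, key=lambda c: chord_score(*candidates[c], k))

melody_heavy = best_chord(0.9)      # weight mostly on the melody term
transition_heavy = best_chord(0.1)  # weight mostly on the transition term
```

In a full decoder this score would replace the plain sum of log-probabilities inside the Viterbi recursion rather than being applied to one position in isolation.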

Chord Locking
• Global sliders are very coarse
• Chords can be “locked” by the user
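
The transcript does not say how locking is implemented. One plausible approach, shown here purely as an assumption, is to mask the per-position chord scores so that every chord except the locked one gets a score of negative infinity before decoding. The `apply_locks` helper and the score table are hypothetical.

```python
import math

def apply_locks(scores, locks):
    """Force locked positions to their chosen chord.

    scores is a list with one {chord: log_score} dict per position.
    locks maps a position index to the chord the user locked there.
    Returns a new list; the input is not modified.
    """
    masked = []
    for position, row in enumerate(scores):
        if position in locks:
            keep = locks[position]
            masked.append({chord: (value if chord == keep else -math.inf)
                           for chord, value in row.items()})
        else:
            masked.append(dict(row))
    return masked

# Made-up log scores for two positions and two chords.
scores = [{"C Major": -1.0, "A Minor": -0.5},
          {"C Major": -0.2, "A Minor": -2.0}]

locked = apply_locks(scores, {0: "C Major"})
```

Because the decoder still runs over the whole song, the unlocked chords around a locked one can adapt to it through the transition probabilities.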

“Suggested Chords”
• Songwriters will often explore “chord substitutions”…
• …but we’re assuming our audience doesn’t know that much music theory…
• Expose suboptimal marginal probabilities at each node as “suggestions”
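
Per-position marginal probabilities in an HMM are usually computed with the forward-backward algorithm. The sketch below is a generic, unscaled version on a made-up three-chord model; the `suggestions` helper, which returns the runners-up after the most probable chord, is a hypothetical reading of how "suggestions" might be derived.

```python
def posterior_marginals(observations, states, start, trans, emit):
    """Forward-backward: P(state at t | all observations) for each t.

    Unscaled, so only suitable for short sequences (long ones underflow).
    """
    fwd = [{s: start[s] * emit[s][observations[0]] for s in states}]
    for obs in observations[1:]:
        prev = fwd[-1]
        fwd.append({s: emit[s][obs] * sum(prev[p] * trans[p][s] for p in states)
                    for s in states})
    bwd = [{s: 1.0 for s in states}]
    for obs in reversed(observations[1:]):
        nxt = bwd[0]
        bwd.insert(0, {s: sum(trans[s][q] * emit[q][obs] * nxt[q] for q in states)
                       for s in states})
    marginals = []
    for t in range(len(observations)):
        unnorm = {s: fwd[t][s] * bwd[t][s] for s in states}
        total = sum(unnorm.values())
        marginals.append({s: v / total for s, v in unnorm.items()})
    return marginals

def suggestions(marginals_t, top_n=2):
    """Return the next-most-probable chords after the top choice at one position."""
    ranked = sorted(marginals_t, key=marginals_t.get, reverse=True)
    return ranked[1:1 + top_n]

# Toy model with made-up numbers.
states = ["C Major", "A Minor", "F Major"]
start = {s: 1.0 / 3.0 for s in states}
trans = {"C Major": {"C Major": 0.5, "A Minor": 0.3, "F Major": 0.2},
         "A Minor": {"C Major": 0.3, "A Minor": 0.4, "F Major": 0.3},
         "F Major": {"C Major": 0.6, "A Minor": 0.2, "F Major": 0.2}}
emit = {"C Major": {"C": 0.7, "A": 0.3},
        "A Minor": {"C": 0.2, "A": 0.8},
        "F Major": {"C": 0.5, "A": 0.5}}

marginals = posterior_marginals(["C", "A", "C"], states, start, trans, emit)
alternatives = suggestions(marginals[1])
```

Ranking by marginals rather than by the Viterbi path lets each position offer alternatives without recomputing the whole sequence.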

Interactive Machine Learning
• Songsmith is one example of IML
  • Roughly: moving what used to be in the domain of ML experts into users’ hands
• Related work: image classification
  • Fogarty et al., CHI 2008: CueFlik
  • Fails and Olsen, IUI 2003: Crayons
• Why?
  • Harness end-user knowledge
  • Use ML as a tool for data exploration
  • Use ML as a tool for creative expression

Today’s Talk
• Overview and demo
• How Songsmith works
• Exposing machine learning parameters
• What are people doing with Songsmith?
• Creativity support

Dynamic Mapping of Physical Controls for Tabletop Groupware (Fiebrink, CHI 2009)
One project, two problems:
1. Direct-touch, tabletop input is great for collaboration… but suffers from serious precision issues.
2. Working on music alone is boring.

Dynamic Mapping of Physical Controls for Tabletop Groupware (Fiebrink, CHI 2009)
• Incorporate high-precision controllers into a tabletop environment
• Evaluate in a collaborative audio-editing app

Data-Driven Exploration of Musical Chord Sequences (Nichols, IUI 2009)
• Problem: let people create and explore music by moving around in a reduced-dimensionality space
• Genres, artists make for intuitive labels

Data-Driven Exploration of Musical Chord Sequences (Nichols, IUI 2009)
• Why isn’t this easy?
• Solution(s): Divergence-maximizing clustering, PCA

Future work
• Make Songsmith freakin’ amazing, and port it to a bajillion platforms, and make an amazing community Web site, and build it into audio hosts… etc…
• Other applications of machine learning in creativity support tools… Writing? Painting? Web Design? CAD? Music? Future work?

Songsmith Dan Morris, Ian Simon, Sumit Basu, and the MSR Advanced Development Team Microsoft Research research.microsoft.com/songsmith Live/Google: songsmith