DeepMIDI: Music Generation

Arindam Bhattacharya, Jonathan Burge, Bryce Codell | MSiA 490-30 Deep Learning | Spring 2017 | Northwestern University

Problem Statement

There is constant demand for new musical content for a multitude of uses, ranging from artistic expression, to jingles for new TV shows, to elevator music.

Objective: to generate original music content using deep learning.

Audio files were converted from MIDI to text and fed into a text generation model, the output of which was converted back to MIDI. The primary metric of success was whether or not the generated music demonstrated the typical characteristics of jazz.

[Image: legendary jazz bassist Charles Mingus]

Dataset

Our dataset was obtained from freemidi.org. We converted these MIDI files to csv format prior to model fitting. Our training set consisted of bass lines from 8 jazz classics, all converted to MIDI by Mel Webb.

Below are samples of what our data looks like before and after pre-processing.

[Figure: data samples before and after pre-processing]

Data cleaning involved several steps:
Removing files of text that did not directly pertain to the notes being played
Standardization of note format
Subsetting specific channels from individual MIDI files

We faced several data processing challenges: identifying a method for batch conversion of MIDI files into csv format, understanding the details of how MIDI files are converted into audio, generalizing our data processing to account for the significant variance in syntax and structure among MIDI files, and enforcing proper output format after converting model output back to MIDI format.

Technical Approach

1. Pre-Process: Convert midi files to csv. Compute derived features. Subset features.
2. Fit Model: Vectorize text using a “musical” vocabulary. Train models (LSTM, GRU). Provide new ‘seeds’. Save generated data to csv.
3. Post-Process: Reformat data into midi-csv structure. Provide proper midi metadata. Convert from csv back to midi format.

Most significant challenge: deriving the appropriate features and vocabulary necessary to generate meaningful output.

Derived features: note duration and note delay (start time of current note - start time of previous note). These alleviated several otherwise difficult-to-address constraints, e.g. generated start times needing to be in strictly ascending order.

Vocabulary: instead of generating all four note components with the standard text generation vocabulary (e.g. ‘1’, ‘2’, etc.), we made four distinct vocabularies, each corresponding to the unique numeric values of one of the four note components.

We fit four models in parallel, one for each of the four components of the notes being generated.

We experimented with a variety of diversity, dropout, and memory/window sizes to avoid common traps (e.g. getting stuck in a loop).

Model Training and Music Generation

Our model consisted of 4 LSTM neural networks, each of which handled a single one of the features that together compose a note. The “note” and “velocity” models had fewer hidden nodes because their vocabularies were significantly smaller than those of “duration” and “delay.”

Results

Our most successful attempt at generating a jazzy bass line resulted from seeding with “Autumn Leaves,” a classic 1940s piece that strongly exhibits typical characteristics of jazz harmony and music. We have visualized how strongly our generated music exhibits these characteristics below:

[Figure: visualization of jazz characteristics in the generated music]

Conclusion

Overall, DeepMIDI showed promising potential for music generation.

It demonstrated the ability to identify and integrate key features of jazz bass into its generated music (particularly those which are rhythmic in nature).

However, it struggled a bit more with tonality, perhaps due to the small training data set and/or the inclusion of multiple key signatures.

Moderate diversity (0.6 - 0.8) produced the best sounding results.

One-hot encoding numeric values (effectively transforming them into categorical variables) has several major advantages, but also imposes limitations on the range of output prediction possibilities.

Additional limitations are inherent to the parallelized computation undertaken here: the models ignore relationships between the four note components (e.g. shorter delays tend to be associated with shorter durations).

References and Related Work

Freemidi.org - Free Midi Music Songs Download. Retrieved from https://freemidi.org.
Keras/Theano Jazz Generation - https://github.com/jisungk/deepjazz
LSTM Text Generation - https://github.com/fchollet/keras/blob/master/examples/lstm_text_generation.py
MidiCSV - Convert midi file to and from csv. Retrieved from http://www.fourmilab.ch/webtools/midicsv/
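
As a rough illustration of the derived features described in the technical approach, the Python sketch below computes note duration and note delay from absolute start and end times, builds one vocabulary per note component, and rebuilds absolute start times from generated delays. It is not the poster authors' code: the names Note, derive_features, build_vocabularies, and restore_start_times are hypothetical, times are assumed to be integer MIDI ticks, and the first note's delay is assumed to be measured from time 0.

```python
from typing import NamedTuple


class Note(NamedTuple):
    start: int     # absolute start time (assumed to be MIDI ticks)
    end: int       # absolute end time
    pitch: int     # MIDI note number
    velocity: int  # MIDI velocity


def derive_features(notes):
    """Turn absolute times into (pitch, velocity, duration, delay) rows."""
    rows = []
    prev_start = 0  # assumption: the first delay is measured from time 0
    for n in sorted(notes, key=lambda n: n.start):
        duration = n.end - n.start
        delay = n.start - prev_start  # current start minus previous start
        rows.append((n.pitch, n.velocity, duration, delay))
        prev_start = n.start
    return rows


def build_vocabularies(rows):
    """Build one value-to-index vocabulary per note component."""
    columns = zip(*rows)  # pitch, velocity, duration, delay columns
    return [{v: i for i, v in enumerate(sorted(set(col)))} for col in columns]


def restore_start_times(rows):
    """Rebuild absolute times by accumulating delays."""
    notes = []
    start = 0
    for pitch, velocity, duration, delay in rows:
        start += delay  # non-negative delays keep start times in order
        notes.append(Note(start, start + duration, pitch, velocity))
    return notes


notes = [Note(0, 480, 40, 90), Note(480, 720, 43, 90), Note(960, 1440, 45, 80)]
rows = derive_features(notes)
vocabs = build_vocabularies(rows)
rebuilt = restore_start_times(rows)
```

Because every delay in a vocabulary built this way is non-negative, any generated sequence of delays yields start times that never decrease, which is one way to read the poster's point that the derived features remove the ordering constraint from the model's output.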