Joe Cabrera Locating partial duplicate songs using beat-chroma features: A Codebook Approach
Partial Duplicate Songs (Remixes): A partial duplicate song is a song that has been modified to make it appear different from the original song. Are these the same song? Hey Jude; Hey Jude (partial duplicate)
Partial Duplicate Songs (Remixes): Song modifications include pitch alteration, time alteration, resampling, noise amplification, and time chunks. Some modifications make retrieval harder.
Applications: Copyright Song Detection: YouTube, Last.fm, Echoprint, GraceNote
Applications: Song Wikipedia. A query song is passed to the Large-Scale Partial-Duplicate Song Search, which returns song metadata (Artist: Beatles (1960–present); Title: Hey Jude; Date: 1968; Album: Revolution; Genre: Rock) alongside Wikipedia excerpts about the song "Hey Jude" and about The Beatles. [Figure: Wikipedia excerpts for "Hey Jude" and The Beatles]
Applications: Mobile Song Search
Challenge: Large-scale music retrieval over 1 million songs; beat-aligned chroma feature retrieval from very noisy partial duplicate songs (remixes); clustering a large number of features. Each song has 400 features on average, so 1 million × 400 features = 400,000,000 features.
Goal: Given an original song, can we retrieve a list of its partial duplicate songs from our database? The database contains 1 million songs. Retrieval should be accurate and efficient.
Basic Idea: The Bag of Words model is effective in text Information Retrieval (IR). Retrieve: "After weeks of tremendous growth, Google+ is starting to show signs of slowing down according to the latest report from the web-analytics company Experian Hitwise." "If everyone here hasn't been on Google+ today, it's doomed. They are off to a rocky start."
Basic Idea: Inverted Index over Doc 1 and Doc 2: Big {doc 1, doc 2}; Bad {doc 1}; Wolf {doc 1}; Play {doc 2}. Query: Big Bad Wolf. Return: Doc 1.
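The inverted index on this slide can be sketched in a few lines of Python. This is only an illustration: the slide lists which words occur in which document, so the document texts below (`"big bad wolf"`, `"big play"`) and the function names are assumptions chosen to reproduce those postings.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercase word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def query_all_words(index, query):
    """Return the ids of documents that contain every word of the query."""
    postings = [index.get(word, set()) for word in query.lower().split()]
    if not postings:
        return set()
    return set.intersection(*postings)


# Hypothetical document texts, chosen to match the postings on the slide.
docs = {
    "doc 1": "big bad wolf",
    "doc 2": "big play",
}
index = build_inverted_index(docs)
result = query_all_words(index, "Big Bad Wolf")
print(result)
```

Only doc 1 contains all three query words, so it is the only document returned.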
Basic Idea: Inverted codeword index: the codewords of the query song point to {song 1} and {song 1, song 2}. Return: Song 1, Song 2.
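The same idea carries over to songs once each song is represented by codeword ids. The sketch below is an assumption about how such an index could be queried: the slide does not say how candidates are ranked, so counting shared codewords as votes is a stand-in, and the codeword ids (17, 42, 99) are made up.

```python
from collections import Counter, defaultdict


def build_codeword_index(encoded_songs):
    """Map each codeword id to the set of songs that contain it."""
    index = defaultdict(set)
    for song_id, codewords in encoded_songs.items():
        for codeword in codewords:
            index[codeword].add(song_id)
    return index


def rank_songs(index, query_codewords):
    """Rank songs by how many distinct query codewords they share."""
    votes = Counter()
    for codeword in set(query_codewords):
        for song_id in index.get(codeword, ()):
            votes[song_id] += 1
    # Most votes first; ties broken by song id for a stable order.
    return sorted(votes, key=lambda song_id: (-votes[song_id], song_id))


# Hypothetical encodings: codeword 17 occurs only in song 1,
# codeword 42 occurs in both songs.
encoded_songs = {"song 1": [17, 42], "song 2": [42, 99]}
index = build_codeword_index(encoded_songs)
ranking = rank_songs(index, [17, 42])
print(ranking)
```

Song 1 shares both query codewords and song 2 shares one, which matches the order in which the slide returns them.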
Codebook: A collection of the most important features in a dataset, so we don't have to represent all the features from the dataset. In our approach, the codewords are found by clustering. Query songs and partial duplicate songs are encoded with these codewords to improve query time.
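A minimal sketch of learning a codebook and encoding a song with it is shown below. The slides only say that codewords are found by clustering; plain k-means (Lloyd's algorithm) is used here as an assumption, and the function names, the 2-dimensional toy features, and the parameters `k`, `iterations`, and `seed` are all hypothetical.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_codeword(codebook, feature):
    """Index of the codeword closest to the given feature vector."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(codebook[i], feature))


def learn_codebook(features, k, iterations=20, seed=0):
    """Learn k codewords with plain k-means (an assumed clustering choice)."""
    rng = random.Random(seed)
    codebook = [list(f) for f in rng.sample(features, k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for feature in features:
            clusters[nearest_codeword(codebook, feature)].append(feature)
        for i, members in enumerate(clusters):
            if members:  # keep the old codeword if its cluster is empty
                codebook[i] = [sum(col) / len(members) for col in zip(*members)]
    return codebook


def encode_song(codebook, song_features):
    """Replace every feature vector of a song by its nearest codeword id."""
    return [nearest_codeword(codebook, f) for f in song_features]


# Toy 2-D features forming two well-separated groups (illustrative only).
features = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
            (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
codebook = learn_codebook(features, k=2)
codes = encode_song(codebook, features)
print(codes)
```

After encoding, a song is a short list of integer ids rather than a list of raw feature vectors, which is what makes an inverted codeword index possible.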
Beat-aligned chroma pattern: Western music can be represented by 12 semitones. Beat-aligned chroma measures the intensity of each semitone at each beat. Patterns are grouped by time signature.
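One way to picture beat alignment is to average frame-level 12-bin chroma vectors inside each beat interval. The sketch below assumes that frame-level chroma and beat times are already available from some upstream extractor; the function name, the averaging rule, and the toy input are assumptions, not the procedure used in this work.

```python
def beat_align_chroma(frame_times, chroma_frames, beat_times):
    """Average the 12-bin chroma frames that fall inside each beat interval.

    frame_times   -- start time in seconds of each chroma frame
    chroma_frames -- one 12-element intensity vector per frame
    beat_times    -- beat boundaries in seconds, in ascending order
    """
    aligned = []
    for start, end in zip(beat_times, beat_times[1:]):
        members = [chroma for t, chroma in zip(frame_times, chroma_frames)
                   if start <= t < end]
        if members:
            aligned.append([sum(col) / len(members) for col in zip(*members)])
        else:
            aligned.append([0.0] * 12)  # no frames fell inside this beat
    return aligned


def one_hot(semitone):
    """Toy chroma frame with all energy in a single semitone bin."""
    return [1.0 if i == semitone else 0.0 for i in range(12)]


# Four toy frames and two beats of half a second each.
frame_times = [0.0, 0.25, 0.5, 0.75]
chroma_frames = [one_hot(0), one_hot(1), one_hot(11), one_hot(11)]
beat_times = [0.0, 0.5, 1.0]
aligned = beat_align_chroma(frame_times, chroma_frames, beat_times)
print(aligned)
```

Each beat ends up with a single 12-element vector, so songs played at different tempos become comparable beat by beat.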
Our Approach: Million Song Dataset; extract features; cluster features; encode new song with codewords; form dendrogram with codewords as nodes
Partial Duplicate Song Retrieval: Quick retrieval of partial duplicates
Experiments: Collected 20 original query songs; collected 10 partial duplicates for each original song; constructed 4 different codebook sizes (1k, 5k, 10k, 25k); added the partial duplicate songs to the Million Song Dataset; measured precision, recall, and query time.
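For reference, precision and recall for a single query can be computed as below. The song ids and the retrieved list are invented purely to show the calculation and are not results from these experiments; only the count of 10 partial duplicates per query follows the setup above.

```python
def precision_recall(retrieved, relevant):
    """Precision and recall of a retrieved list against the relevant set."""
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = len(retrieved_set & relevant_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Hypothetical query: 10 known partial duplicates, 5 songs retrieved,
# 4 of which are true partial duplicates.
relevant = [f"dup_{i}" for i in range(10)]
retrieved = ["dup_0", "dup_1", "dup_2", "dup_3", "unrelated_song"]
precision, recall = precision_recall(retrieved, relevant)
print(precision, recall)
```

Here 4 of the 5 retrieved songs are relevant and 4 of the 10 relevant songs are found.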
Results
Conclusions: Precision and recall increase with a larger codebook: a larger codebook helps discriminate between features. Query time decreases with a larger codebook: larger codebooks lead to faster identification of features.
Considerations: It is difficult to extract beat-aligned chroma from songs that are very noisy; songs may be encoded with the incorrect codeword. Example: Hey Jude (another partial duplicate)
Considerations: It may be important to consider positional information in the analysis; currently there are no known baselines for partial song retrieval.
Accomplishments: Proposed a feature learning approach for a large dataset of 1 million songs; created a framework for the identification of partial duplicate songs; found a successful approach to identifying partial duplicate songs in a large song dataset; performed state-of-the-art partial song retrieval.
Demo