Download presentation
Presentation is loading. Please wait.
1
Michael Kuhn Distributed Computing Group (DISCO) ETH Zurich The MusicExplorer Project: Mapping the World of Music
2
„Today, I woud like to listen to something cheerful.“ „Something like Lenny Kravitz would be great.“ „Who can help me to discover my collection?“
3
„In my shelf AC/DC is next to the ZZ Top...“
4
„play random songs that match my mood“
5
What‘s the Talk about? basic idea: map of music constructing the map (MDS, PLSA) using the map (Youtube, Android)
6
Map of Music: What is it good for?
7
Similar songs are close to each other Similar songs are close to each other Quickly find nearest neighbors Quickly find nearest neighbors Span (and play) volumes Span (and play) volumes Create smooth playlists by interpolation Create smooth playlists by interpolation Visualize a collection Visualize a collection Low memory footprint Low memory footprint – Well suited for mobile domain Advantages of a Map convenient basis to build music software
8
How to Construct a Map of Music?
9
Similar or different???
10
Music Similarity Audio Analysis Usage Data
11
From Usage Data to Similarity Collaborative Filtering Folksonomies (Tags)
12
Collaborative Filtering and MDS Method 1:
13
Basic Idea d = ? item-to-item collaborative filtering 1 1
14
Item-to-Item Collaborative Filtering People Who Listen To This Song also Listen to... [Linden et al., 2003]
15
Users who listen to A also listen to B Users who listen to A also listen to B – Top-50 listened songs per user – Normalization (cosine similarity) Pairwise Similarity #common users (co-occurrences) (co-occurrences) Occurrences of song A Occurrences of song B
16
Graph Embedding EEBB DDCC AA 22 33 33 44 55 44 55 BB AA EE DD CC
17
Well-known for dimensionality reduction Well-known for dimensionality reduction – first described by Young and Householder, 1938 Principal Component Analysis (PCA): Principal Component Analysis (PCA): – Project on hyperplane that maximizes variance. – Computed by solving an eigenvalue problem. Basic idea of MDS: Basic idea of MDS: – Assume that the exact positions y 1,...,y N in a high-dimensional space are given. – It can be shown that knowing only the distances d(y i, y j ) between points we can calculate the same result as applying PCA to y 1,...,y N. Problem: Complexity O(n 2 log n) Problem: Complexity O(n 2 log n) Classical Multidimensional Scaling (MDS)
18
Select k landmarks and embed them using MDS Select k landmarks and embed them using MDS For the remaining points: For the remaining points: – Place according to distances from landmarks Complexity: O(k n log n) Complexity: O(k n log n) Landmark MDS [de Silva and Tenenbaum, 1999]
19
Assumption: some links erroneously shortcut certain paths Assumption: some links erroneously shortcut certain paths Idea: Use embedding as estimator for distance Idea: Use embedding as estimator for distance – Shortcut edges get stretched – Remove edges with worst stretch and re-embed Example: Kleinberg graph (20x20 grid with random edges) Example: Kleinberg graph (20x20 grid with random edges) Iterative Embedding Original embedding (spring embedder) After 6 rounds After 12 rounds After 30 rounds
20
Evaluation: Dimensionality 10 Dimensions give a reasonable quality Example Neighborhoods in 10D Space
21
Social Tags and PLSA Method 2:
22
Similarity from Social Tagging
23
Probabilistic Latent Semantic Analysis (PLSA) documents latent semantic classes wordssongs latent music style classes tags [Hofmann, 1999]
24
PLSA: Interpretation as Space can be seen as a vector that defines a point in space [Hofmann, 1999] K small: Dimensionality reduction
25
PLSA Space latent class 2 latent class 3 latent class 1 Probabilities sum to 1: K-1 dimensional hyperplane Probabilities sum to 1: K-1 dimensional hyperplane Similar documents (songs) are close to each other music space: 32 dimensions
26
Advantages of LMDS: Advantages of LMDS: – Same accurracy at lower dimensionality (10 vs. 32) Advantages of PLSA: Advantages of PLSA: – Natural meaning of tags – Assignment of tags to songs (probabilistic) LMDS vs. PLSA Space Current sizes (approx.): LMDS: 600K tracks PLSA: 1.1M tracks Current sizes (approx.): LMDS: 600K tracks PLSA: 1.1M tracks
27
Using the Map
28
Visualization? high-dimensional!high-dimensional!
29
Identify relevant tags Identify relevant tags Find centroids of these tags in 10D Find centroids of these tags in 10D Apply Principal Component Analysis (PCA) to these centroids Apply Principal Component Analysis (PCA) to these centroids From 10D to 2D
31
What people have chosen during the researcher‘s night in Zurich
32
YouJuke – The YouTube Jukebox
33
YouTube as media source YouTube as media source Music map to create smart playlist
36
Reaching YouJuke www.youjuke.orgwww.youjuke.org apps.facebook.com/youjukeapps.facebook.com/youjuke
37
μseek „play random songs that match my mood“ „In my shelf John Lennon is next to the Beatles...“ μseek: The map of music on Android μseek: The map of music on Android
38
An Intelligent iPod-Shuffle
39
Realization After only few skips, we know pretty well which songs match the user‘s mood After only few skips, we know pretty well which songs match the user‘s mood
40
Work in Progress: Who is Dancing? AC/DCAC/DC BeatlesBeatles ProdigyProdigy
41
„In my shelf AC/DC is next to the ZZ Top...“ Browsing Covers
42
Video
43
www.musicexplorer.org/museekwww.musicexplorer.org/museek Internet connection only required at first startup!
44
Thanks to: Thanks to: – Lukas Bossard – Mihai Calin – Olga Goussevskaia – Michael Lorenzi – Roger Wattenhofer – Samuel Welten URLs: URLs: – www.musicexplorer.org/museek – www.youjuke.org – apps.facebook.com/youjuke E-Mail: E-Mail: – kuhnmi@tik.ee.ethz.ch (Michael Kuhn) Questions?
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.