1
Social Audio Features: An Intuitive Guide to the Music Galaxy
Michael Kuhn
Distributed Computing Group (DISCO), ETH Zurich
kuhnmi@tik.ee.ethz.ch
2
„Today, I would like to listen to something cheerful.“
„Something like Lenny Kravitz would be great.“
„Who can help me to discover my collection?“
3
„Half of the time I spend skipping songs...“
4
„On my shelf, AC/DC is next to ZZ Top...“
5
Similar or different???
6
Cover Flow looks better
7
Does not represent perceived similarity well
[Figure labels: Miles Davis, Beatles, Fatboy Slim, Avril Lavigne]
9
We want to have something that…
– …well reflects perceived music similarity.
– …is as convenient to use as an audio feature space.
Social Audio Features
10
socially derived music similarity + mapping into Euclidean space = Social Audio Features
11
Advantages of a Feature Space
– Similar songs are close to each other
– Quickly find nearest neighbors
– Span (and play) volumes
– Create smooth playlists by interpolation
– Visualize a collection
– Low memory footprint (well suited for the mobile domain)
A convenient basis for building music software
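Two of the advantages listed above, nearest-neighbor lookup and playlists by interpolation, can be illustrated with a few lines of NumPy. This is only a minimal sketch: the function names, the toy 2-D coordinates, and the rule "pick the closest unplayed song at each step" are assumptions made for illustration, not the implementation behind the talk.

```python
import numpy as np

def nearest_neighbors(features, query, k=3):
    """Return the indices of the k songs closest to `query` (Euclidean distance)."""
    dists = np.linalg.norm(features - query, axis=1)
    return np.argsort(dists)[:k]

def smooth_playlist(features, start, end, steps=5):
    """Walk along the straight line from song `start` to song `end` and pick,
    at each step, the closest song that is not in the playlist yet."""
    playlist = []
    for t in np.linspace(0.0, 1.0, steps):
        point = (1.0 - t) * features[start] + t * features[end]
        for idx in np.argsort(np.linalg.norm(features - point, axis=1)):
            if idx not in playlist:
                playlist.append(int(idx))
                break
    return playlist

# Hypothetical toy collection: 6 songs in a 2-D feature space.
songs = np.array([[0.0, 0.0], [1.0, 0.1], [2.0, 0.0],
                  [3.0, 0.1], [4.0, 0.0], [0.0, 5.0]])

print(nearest_neighbors(songs, songs[0], k=2))
print(smooth_playlist(songs, start=0, end=4, steps=5))
```

A brute-force scan like this is linear in the collection size; for large collections a spatial index would be the usual replacement, which is exactly what a Euclidean feature space makes possible.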
12
Creating Social Audio Features, Method 1: Collaborative Filtering and MDS
14
„Users who listen to Muse also listen to Oasis...“
[Figure: occurrences of song A and occurrences of song B, overlapping in the number of common users (co-occurrences)]
Problem: Only pairwise similarity, but no global view!
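Counting common users per song pair can be sketched as follows. The slide only names the ingredients (occurrences of A, occurrences of B, number of common users); the cosine-style normalisation used here is an assumption, as are the function name and the toy listening histories.

```python
from collections import Counter
from itertools import combinations
from math import sqrt

def cooccurrence_similarity(listening_histories):
    """listening_histories: iterable of sets of song ids, one set per user.
    Returns {(a, b): similarity} for every pair sharing at least one user."""
    occurrences = Counter()     # number of users per song
    cooccurrences = Counter()   # number of common users per song pair
    for songs in listening_histories:
        occurrences.update(songs)
        cooccurrences.update(combinations(sorted(songs), 2))
    # Assumed normalisation (cosine): common users / sqrt(occ(A) * occ(B)).
    return {
        pair: n / sqrt(occurrences[pair[0]] * occurrences[pair[1]])
        for pair, n in cooccurrences.items()
    }

# Hypothetical toy data: four users and the artists they listen to.
users = [{"Muse", "Oasis"}, {"Muse", "Oasis", "Blur"}, {"Muse"}, {"Blur"}]
sim = cooccurrence_similarity(users)
print(sim)
```

The result is a set of pairwise values only; pairs without any common user get no entry at all, which is the "no global view" problem the slide points out.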
15
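Getting a global view...
[Figure: graph of pairwise similarities with edge lengths 1; the distance between two songs that are not directly connected is unknown (d = ?)]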
16
Classical Multidimensional Scaling (MDS)
Principal Component Analysis (PCA):
– Projects onto the hyperplane that maximizes variance.
– Computed by solving an eigenvalue problem.
Basic idea of MDS:
– Assume that the exact positions y_1, ..., y_N in a high-dimensional space are given.
– It can be shown that, knowing only the distances d(y_i, y_j) between the points, we can calculate the same result as applying PCA to y_1, ..., y_N.
Problem: Complexity O(n^2 log n)
– Use an approximation: LMDS [de Silva and Tenenbaum, 2002]
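The claim that distances alone are enough can be checked on a toy example. The sketch below is textbook classical MDS (double centering followed by an eigendecomposition), not the landmark approximation (LMDS) used for the real data; the hidden positions are random numbers chosen only for illustration.

```python
import numpy as np

def classical_mds(dist, dims=2):
    """Embed n points in `dims` dimensions from an n x n distance matrix."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    # Double centering turns squared distances into inner products (Gram matrix).
    gram = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)       # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:dims]        # keep the largest ones
    return eigvecs[:, top] * np.sqrt(np.maximum(eigvals[top], 0.0))

# Hidden positions y_1..y_N; only their pairwise distances are handed to MDS.
rng = np.random.default_rng(0)
hidden = rng.normal(size=(6, 3))
dist = np.linalg.norm(hidden[:, None, :] - hidden[None, :, :], axis=-1)

embedded = classical_mds(dist, dims=3)
recovered = np.linalg.norm(embedded[:, None, :] - embedded[None, :, :], axis=-1)
print(np.allclose(dist, recovered))
```

The embedding matches the hidden positions only up to rotation, reflection, and translation, which is why the check compares distances rather than coordinates.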
17
Problem: Some links erroneously shortcut certain paths
Solution: Use the embedding as an estimator for distance: remove the edges that get stretched most and re-embed
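The edge-selection step of one such round might look like the sketch below. It assumes that "stretch" means embedded length divided by graph edge length and that a fixed fraction of edges is removed per round; both are assumptions, and the re-embedding itself (running MDS again on the pruned graph) is left out.

```python
from math import dist

def most_stretched_edges(edges, positions, fraction=0.1):
    """edges: {(a, b): edge length in the graph}; positions: {node: coordinates}.
    Returns the edges whose embedded length exceeds their graph length most."""
    stretch = {
        edge: dist(positions[edge[0]], positions[edge[1]]) / length
        for edge, length in edges.items()
    }
    n_remove = max(1, int(len(edges) * fraction))
    return sorted(stretch, key=stretch.get, reverse=True)[:n_remove]

def prune(edges, positions, fraction=0.1):
    """Return a copy of `edges` without the most stretched ones."""
    drop = set(most_stretched_edges(edges, positions, fraction))
    return {edge: length for edge, length in edges.items() if edge not in drop}

# Hypothetical toy graph: a chain A-B-C-D plus an erroneous shortcut A-D.
edges = {("A", "B"): 1.0, ("B", "C"): 1.0, ("C", "D"): 1.0, ("A", "D"): 1.0}
positions = {"A": (0.0, 0.0), "B": (1.0, 0.0), "C": (2.0, 0.0), "D": (3.0, 0.0)}

print(most_stretched_edges(edges, positions, fraction=0.25))
print(sorted(prune(edges, positions, fraction=0.25)))
```

In the toy graph the shortcut A-D claims length 1 but spans a distance of 3 in the embedding, so it is the first edge to go.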
18
[Figures: original embedding vs. embedding after 30 rounds of iterative embedding]
19
Example Neighborhoods in 10D Space (0.5M songs)
10 dimensions give a reasonable quality
Neighborhood 1:
– Pink Floyd - Time
– Pink Floyd - On the Run
– Pink Floyd - Any Colour You Like
– Pink Floyd - The Great Gig in the Sky
– Pink Floyd - Eclipse
– Pink Floyd - Us and Them
– Pink Floyd - Brain Damage
– Pink Floyd - Speak to Me
– Pink Floyd - Money
– Pink Floyd - Breathe
– Pink Floyd - One of These Days
Neighborhood 2:
– Miles Davis - So What
– Horace Silver - Song For My Father
– Bill Evans - All of You
– Miles Davis - Freddie Freeloader
– Nat King Cole - The More I See You
– Miles Davis - So Near
– Miles Davis - Flamenco Sketches
– Charles Mingus - Eat That Chicken
– Jimmy Smith - On the Sunny Side
– Julie London - Daddy
– Bill Evans - My Man's Gone Now
20
Creating Social Audio Features, Method 2: Social Tags and PLSA
22
Meaningful labels, but sparse data
Good similarity information, but no labels
Let's combine this information
23
Combining Usage Data and Social Tags
24
Probabilistic Latent Semantic Analysis (PLSA)
[Figure: example word groups: (art, painting, artist, music, collection); (approach, psychology, feeling, female, subjective); (audio, signal, music, beat, timbre)]
Each word w of a document d is assumed to be generated in two steps:
1) Select a latent class z with probability P(z|d)
2) Select a word w with probability P(w|z)
PLSA: find the probabilities that best approximate the observed word distribution:
$P(w \mid d) = \sum_z P(w \mid z)\, P(z \mid d)$
25
Probabilistic Latent Semantic Analysis (PLSA)
Example document: "Everyone has a photographic memory… some just don't have film."
The words of the document are generated by the same two steps: select a latent class z with probability P(z|d), then select a word w with probability P(w|z).
26
PLSA: Interpretation as Space
– The class probabilities P(z|d) of a document d can be seen as a vector that defines a point in space [Hofmann, 1999]
– K small: dimensionality reduction
[Figure labels: songs, latent music style classes, tags]
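Fitting the two-step model and reading each document's class probabilities as coordinates can be sketched with a plain EM loop. This is the standard textbook EM for PLSA on a dense toy matrix, written under assumptions (function name, random initialisation, fixed iteration count, toy counts); it says nothing about how the 1.1M-song space was actually computed.

```python
import numpy as np

def plsa(counts, n_classes, n_iter=50, seed=0):
    """Fit PLSA by EM. counts[d, w] = how often word (tag) w occurs in document (song) d.
    Returns P(z|d) with shape (D, K) and P(w|z) with shape (K, W)."""
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape
    p_z_d = rng.random((n_docs, n_classes))
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    p_w_z = rng.random((n_classes, n_words))
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # E-step: posterior P(z|d,w), proportional to P(z|d) * P(w|z).
        joint = p_z_d[:, :, None] * p_w_z[None, :, :]        # shape (D, K, W)
        joint /= joint.sum(axis=1, keepdims=True) + 1e-12
        # M-step: re-estimate both distributions from the expected counts.
        weighted = counts[:, None, :] * joint                # shape (D, K, W)
        p_w_z = weighted.sum(axis=0)
        p_w_z /= p_w_z.sum(axis=1, keepdims=True) + 1e-12
        p_z_d = weighted.sum(axis=2)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True) + 1e-12
    return p_z_d, p_w_z

# Hypothetical toy data: 4 songs x 4 tags, two obvious groups.
counts = np.array([[5, 3, 0, 0],
                   [4, 6, 0, 0],
                   [0, 0, 7, 2],
                   [0, 0, 3, 5]])
p_z_d, p_w_z = plsa(counts, n_classes=2)
print(p_z_d.round(2))   # each row is one song's point in the K-dimensional space
```

Each row of `p_z_d` sums to 1, so the points lie on a simplex; with K much smaller than the number of tags this is the dimensionality reduction the slide refers to.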
27
Applying PLSA to Music and Tags
– Green Day – Basket Case: rock, punk, pop-punk
– Madonna – Like a Prayer: pop, dance, female vocalists
– Beatles – Hey Jude: 60's, classic rock, british
– …
32 latent classes (= dimensions), 1.1M songs
28
Evaluation
– Artist clustering
– Comparison to collaborative filtering
– Tag consistency
29
LMDS vs. PLSA Space
Advantages of LMDS:
– Same accuracy at lower dimensionality (10 vs. 32)
Advantages of PLSA:
– Natural meaning of tags
– Assignment of tags to songs (probabilistic)
Current sizes (approx.):
– LMDS: 600K tracks
– PLSA: 1.1M tracks
30
Using the Social Audio Features
31
High-dimensional!
32
Visualization in 2D
– Identify relevant tags
– Find the centroids of these tags in the high-dimensional space
– Apply Principal Component Analysis (PCA) to these centroids
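The three steps above can be sketched as follows. How "relevant" tags are identified is not specified on the slide, so the tag list is simply passed in; the function names, the random toy features, and the use of an SVD to compute the PCA are assumptions.

```python
import numpy as np

def tag_centroids(features, song_tags, tags):
    """Mean position of all songs carrying each tag (one row per tag)."""
    return np.array([
        features[[i for i, ts in enumerate(song_tags) if tag in ts]].mean(axis=0)
        for tag in tags
    ])

def fit_pca(points, dims=2):
    """Return (mean, components) of a PCA fitted to `points`."""
    mean = points.mean(axis=0)
    # Rows of vt are the principal directions, sorted by decreasing variance.
    _, _, vt = np.linalg.svd(points - mean, full_matrices=False)
    return mean, vt[:dims]

def project(features, mean, components):
    """Map high-dimensional points to the plane spanned by `components`."""
    return (features - mean) @ components.T

# Hypothetical toy data: 6 songs in a 4-D feature space, each with a set of tags.
rng = np.random.default_rng(1)
features = rng.normal(size=(6, 4))
song_tags = [{"rock"}, {"rock", "pop"}, {"pop"}, {"jazz"}, {"jazz"}, {"pop"}]
relevant_tags = ["rock", "jazz", "pop"]

centroids = tag_centroids(features, song_tags, relevant_tags)
mean, components = fit_pca(centroids, dims=2)
map_2d = project(features, mean, components)   # 2-D position of every song
print(map_2d.round(2))
```

Fitting the PCA on the tag centroids rather than on all songs chooses the plane that spreads the selected tags apart, so the tags can serve as landmarks on the map.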
34
What people chose during the Researchers' Night in Zurich
35
YouJuke – The YouTube Jukebox
36
– YouTube as media source
– Social Audio Features to create smart playlists
39
www.youjuke.org
apps.facebook.com/youjuke
40
„Half of the time I spend skipping songs“
41
I only want to listen to songs that match my mood...
42
After only a few skips, we know pretty well which songs match the user's mood
43
Work in Progress: Who is Dancing?
[Figure labels: AC/DC, Beatles, Prodigy]
44
Browsing Covers
„On my shelf, AC/DC is next to ZZ Top...“
45
www.museek.ethz.ch
46
Video
47
Selected Comments from museek Users
– Your software is a pathetic piece of crap! […]
– Does a good job learning my tastes […]
– […] easy browse and make playlists. Auto play related music is very good.
– Runs well on the Nexus One without any stuttering; the UI is good too! (translated from Korean)
– [...] Love the ability to automatically play similar music. [...]
– Good potential, but album art is tiny & blurry […]
– Just got it and want to put more music on my sd card now. Pretty cool once you get the hang of it.
– The algorithm that selects playlists according to how your mood evolves is a real gem. Congratulations […] (translated from French)
– Awesome app beating the ipod genius feature and coverflow. […]
48
Questions?
Thanks to:
– Lukas Bossard
– Mihai Calin
– Matthias Flückiger
– Olga Goussevskaia
– Michael Lorenzi
– Roger Wattenhofer
– Samuel Welten
– Martin Wirz
URLs:
– www.museek.ethz.ch
– www.youjuke.org
– apps.facebook.com/youjuke
E-Mail:
– kuhnmi@tik.ee.ethz.ch (Michael Kuhn)
49
Publications
– Sensing Dance Engagement for Collaborative Music Control. Michael Kuhn, Martin Wirz, Matthias Flückiger, Roger Wattenhofer, and Gerhard Tröster. (accepted at ISWC 2011)
– Social Audio Features for Advanced Music Retrieval Interfaces. Michael Kuhn, Roger Wattenhofer, and Samuel Welten. ACM Multimedia, Florence, October 2010.
– Visually and Acoustically Exploring the High-Dimensional Space of Music. Lukas Bossard, Michael Kuhn, and Roger Wattenhofer. IEEE International Conference on Social Computing (SocialCom), Vancouver, Canada, August 2009.
– From Web to Map: Exploring the World of Music. Olga Goussevskaia, Michael Kuhn, Michael Lorenzi, and Roger Wattenhofer. IEEE/WIC/ACM International Conference on Web Intelligence (WI), Sydney, Australia, December 2008.
– Exploring Music Collections on Mobile Devices. Olga Goussevskaia, Michael Kuhn, and Roger Wattenhofer. International Conference on Human-Computer Interaction with Mobile Devices and Services (MobileHCI), Amsterdam, Netherlands, September 2008.