CS 445/656 Computer & New Media

Audio, Speech and Music

Topics for Monday & Wednesday
- General audio
- Speech
- Music
- Music management support

Music
Music processing can support a variety of activities:
- Composition: from traditional to interactive
- Selection: e.g., iTunes, Pandora; use in shared spaces
- Playback: e.g., MobiLenin
- Management & summarization: e.g., MusicWiz, Apollo
- Games: Guitar Hero, Rock Band, etc.

MobiLenin
- Enables interaction with music in a public space
- Not karaoke; voting, as in many pub/bar games
- The audience can affect which version of the music and video is shown

Lessons
- Gave a focal point for interaction between members of a group
- Content variety is necessary for continued engagement
- A lottery for free beer motivated participation

Music Summarization (Music Thumbnailing)
Supports effective browsing and the indexing, retrieval, and management of stored tracks.
The typical assumption is that the most repeated pattern is the most representative part of a piece:
- Leitmotifs (key phrases)
- ABACAB pattern (A = verse, B = chorus, C = bridge)
Summarization methods:
- Signal analysis: automatically detect repeated patterns in the musical signal using self-similarity analysis, or semantics (keys, pitch and length of notes, tempo, etc.)
- Clustering or HMMs to find the key phrases of songs, using a similarity matrix based on MFCCs
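The self-similarity idea can be sketched in a few lines. This is a toy illustration, not the actual summarization pipeline: the two-dimensional "frames" stand in for real MFCC feature vectors, and repeated sections show up as matching off-diagonal entries in the matrix.

```python
import math

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def self_similarity(frames):
    """N x N matrix comparing every frame to every other frame.

    Repeated sections of the song appear as high-valued off-diagonal
    stripes, which a summarizer can then scan for."""
    return [[cosine(fi, fj) for fj in frames] for fi in frames]

# Toy stand-in "MFCC" frames following an A B A C A B structure
A, B, C = [1.0, 0.0], [0.0, 1.0], [0.7, 0.7]
S = self_similarity([A, B, A, C, A, B])
# Frames 0 and 2 are both "A", so S[0][2] == 1.0, while S[0][1] == 0.0
```

In a real system each frame would be an MFCC vector computed from a short audio window, and the repeated-stripe detection would follow.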

Music Summarization
Most summaries on commercial sites are either the first phrase or a single selected musical phrase.
A study examined whether 22-second multi-phrase music summaries would make better previews:
- Three algorithms varied the selection of components between phrases that are sonically distinct and phrases that are repeated more often
- In a comparative evaluation, multi-phrase previews were selected in 87% of cases over the preview consisting of the first 22 seconds of the song

Managing Personal Music Collections
Music management is mainly based on:
- Explicit attributes (e.g., metadata values such as artist, composer, and genre)
- Explicit feedback (e.g., ratings of preference and relevance)
Benefits: easy to understand; formal (consistent updating and access); context-free.
Question: how can music be accessed based on the feelings or memories it triggers? E.g., music that sounds happy, makes us feel gloomy, or reminds us of a person.

Current Practices
- Common metadata tags are usually not sufficient to describe mood, feelings, memories, and complex concepts
- Effort/benefit trade-off issues: personal reactions to music change over time
- Explicit feedback and usage statistics are helpful in retrieving preferred music
Question: how would people organize music if there were a low-effort way to express their personalized interpretation of it?
- Adding tags, or customizing existing ones, can be tedious
- Additional ratings tied to specific music attributes can overwhelm the user

MusicWiz
12 participants were asked to organize songs and create playlists using spatial hypertext. In spatial hypertext, information has visual attributes and a spatial layout that can be changed to express associations.
The majority found spatial hypertext helpful for organizing.
Participants appreciated:
- The expressive power and freedom of the workspace
- Directly accessible metadata for each piece of music
- Music previews for remembering music
Participants missed:
- Interactive hierarchical/tree views
- Music previews for understanding music

Preliminary Study: Organization Using Categories & Subcategories with Labels
The figure shows part of the finished workspace for one participant. The songs are divided into those the participant knew and those he did not. The unknown songs were organized by the participant's opinion of the artist ("generally like the artist", "neutral about the artist"). The songs he knew were grouped by personal assessments of the music ("like but hard to listen to", "cheesy", "hate", "fun songs", and "too slow") and by the associations the music held for him ("remind me of my wife"). Some categories had further subcategories, such as "I swear my wife has these songs on a mix-CD" under "remind me of my wife" and "classics" under "fun songs". This participant's workspace shows a greater degree of structure and interpretation than the workspaces created by most participants.

Music Access & Implicit Attributes
There is considerable research into extracting and using implicit cues for associating music, in order to overcome:
- The limitations of metadata and usage statistics in describing music concepts
- Users' unwillingness to provide explicit feedback
- The cost of employing human experts to assess music similarity
Music management can be extended with:
- Signal features (e.g., intensity, timbre, and rhythm)
- Collaborative filtering: e.g., Last.fm, Genius, Music Gathering Application, Flytrap, Musicovery, MusicSim, Musicream

MusicWiz Architecture
[Architecture diagram: the MusicWiz interface and the inference engine (metadata, audio signal, lyrics, workspace expression, and artist modules) exchange workspace status, similarity values, and related song titles through a relatedness table, drawing on the music collection's songs and metadata, scraped lyrics, and artist-similarity statistics from the Internet.]
MusicWiz is a music management environment that combines:
- Explicit information
- Implicit information
- Non-verbal expression of personal interpretation
It has two basic components:
- An interface for interacting with the music collection
- An inference engine for assessing music relatedness

The MusicWiz Interface
[Screenshot: hierarchical folder tree view, workspace, playlist pane, related songs & search results view, playback controls.]
- Folder tree view: provides a location-based hierarchical view of the music collection
- Related songs & search results view: displays songs similar to those currently selected in the tree view, or the results of a search; songs can be dragged and dropped from this list into the workspace or the playlist pane to update collections and playlists

MusicWiz Inference Engine
Five modules extract, process, and compare artists, metadata, audio content, lyrics, and workspace expression. Each module produces a relatedness assessment normalized to [0, 1] (0 = songs very dissimilar, 1 = songs almost identical). These are combined as:

OverallSimilarity(S1, S2) =
    W1 * MetadataSimilarity(S1, S2)
  + W2 * AudioSignalSimilarity(S1, S2)
  + W3 * LyricsSimilarity(S1, S2)
  + W4 * WorkspaceExpressionSimilarity(S1, S2)

where S1 and S2 are the songs under comparison and Wn (n = 1..4) are the user-adjusted weights of the specialized similarity assessments.
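The weighted combination is straightforward to sketch. The module names, the sample scores, and the division by the total weight (to keep the result in [0, 1] for arbitrary weights) are all illustrative assumptions, not details from MusicWiz itself:

```python
def overall_similarity(scores, weights):
    """Weighted combination of per-module similarity scores.

    `scores` and `weights` are dicts keyed by module name; each score is
    assumed already normalized to [0, 1]. Dividing by the total weight is
    an added assumption so the result stays in [0, 1]."""
    total_w = sum(weights.values())
    return sum(weights[m] * scores[m] for m in scores) / total_w

# Hypothetical per-module assessments for one pair of songs
scores = {"metadata": 0.8, "audio": 0.6, "lyrics": 0.4, "workspace": 1.0}
weights = {"metadata": 1.0, "audio": 1.0, "lyrics": 1.0, "workspace": 1.0}
# With equal weights this reduces to a plain average:
# (0.8 + 0.6 + 0.4 + 1.0) / 4 = 0.7
```

Raising one weight (say, `weights["lyrics"] = 3.0`) lets the user steer relatedness toward the module they trust most, which matches the slide's "user adjusted weights".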

MusicWiz Inference Engine – Artist Module
Assesses relatedness using online resources:
- Human evaluations of artist similarity, from the Similar Artists lists of the All Music Guide website
- Co-occurrence of artists in playlists, from the OpenNap file-sharing network and the Art of the Mix website
Its output is used directly by the metadata module when comparing artist names.

MusicWiz Inference Engine – Metadata Module
Evaluates the pairwise similarity of the metadata values of all songs. String comparison is applied to the title, genre, album name, and year of the songs, as well as the file-system path where they are stored, using a distance metric that combines the Soundex and Monge-Elkan algorithms:
- Soundex is a phonetic algorithm, valuable for identifying similarity between transliterated or misspelled names. It uses six phonetic classes of human speech sounds to convert the input into a code shared by words that are pronounced alike.
- Monge-Elkan identifies similarity among expressions whose words are listed in a different order; it is a dynamic-programming algorithm that calculates the distance between two strings based on the cost of the transformations required to convert one expression into the other.
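Soundex itself is a standard, well-specified algorithm (this shows only the Soundex half of the combined metric, not how MusicWiz mixes it with Monge-Elkan). A compact implementation of the American Soundex rules:

```python
def soundex(name):
    """American Soundex: map a name to a one-letter, three-digit code.

    Consonants fall into six phonetic classes; vowels are dropped,
    adjacent duplicate codes collapse, and 'h'/'w' do not separate
    equal codes. The result is padded/truncated to four characters."""
    groups = {"bfpv": "1", "cgjkqsxz": "2", "dt": "3",
              "l": "4", "mn": "5", "r": "6"}
    code = {c: d for letters, d in groups.items() for c in letters}
    name = name.lower()
    first = name[0]
    digits = []
    prev = code.get(first, "")
    for c in name[1:]:
        d = code.get(c, "")
        if d and d != prev:
            digits.append(d)
        if c not in "hw":          # h and w are transparent separators
            prev = d
    return (first.upper() + "".join(digits) + "000")[:4]

# "Robert" and "Rupert" sound alike and share the code R163
```

Misspelled artist names like "Robert"/"Rupert" collapse to the same code, which is why Soundex helps when comparing hand-typed metadata.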

MusicWiz Inference Engine – Audio Signal Module
Uses signal-processing techniques to analyze the music content. It extracts and compares information about the harmonic structure and acoustic attributes of the music: beat, brightness, pitch, starting note, and potential key (musical scale). The greater the distance in beat, brightness, and pitch levels, the less likely two songs are to be perceived as similar in style or mood.

MusicWiz Inference Engine – Lyrics Module
Analyzes the lyrics textually. Lyrics are scraped from a pool of popular websites, both for display in music objects and for comparison. Lyrical comparison uses term-vector cosine similarity:

LyricsSimilarity(S1, S2) = cos(θ)

The more words two lyrics have in common, the greater the possibility that the songs are motivated by, or describe, related themes. For example, with the dictionary {dog, cat, lion}:
- Document 1, "cat cat" → (0, 2, 0)
- Document 2, "cat cat cat" → (0, 3, 0)
- Document 3, "lion cat" → (0, 1, 1)
- Document 4, "cat lion" → (0, 1, 1)
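The toy documents above can be run through cosine similarity directly (a minimal sketch; real lyric comparison would at least tokenize and weight terms, e.g. with tf-idf):

```python
import math

def term_vector(doc, vocab):
    """Raw term counts over a fixed vocabulary."""
    words = doc.split()
    return [words.count(w) for w in vocab]

def cosine_sim(a, b):
    """cos(theta) between two term vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

vocab = ["dog", "cat", "lion"]
d1 = term_vector("cat cat", vocab)        # (0, 2, 0)
d2 = term_vector("cat cat cat", vocab)    # (0, 3, 0)
d3 = term_vector("lion cat", vocab)       # (0, 1, 1)
# d1 and d2 point in the same direction: cosine similarity 1.0,
# even though their lengths differ.
# cos(d1, d3) = 2 / (2 * sqrt(2)) ~= 0.707
```

Note that Documents 3 and 4 get identical vectors: word order is discarded, which is exactly the bag-of-words assumption.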

MusicWiz Inference Engine – Workspace Expression Module
Music objects can be related visually and spatially. A spatial parser identifies relations between the music objects, recognizing three types of spatial structure: lists, stacks, and composites.
[Figure: examples of a list, a stack, and a composite.]
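A spatial parser of this kind can be illustrated with bounding boxes. This toy classifier is not MusicWiz's actual parser; the rules (overlap means stack, edge alignment without overlap means list, anything else is composite) are simplifying assumptions:

```python
def classify_group(boxes):
    """Classify a group of (x, y, w, h) rectangles as a spatial structure.

    Illustrative rules only:
    - "stack": each box overlaps the next one
    - "list": boxes share an x or y alignment and do not overlap
    - "composite": anything else"""
    def overlaps(a, b):
        ax, ay, aw, ah = a
        bx, by, bw, bh = b
        return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah

    pairs = list(zip(boxes, boxes[1:]))
    if pairs and all(overlaps(a, b) for a, b in pairs):
        return "stack"
    aligned_x = all(a[0] == b[0] for a, b in pairs)
    aligned_y = all(a[1] == b[1] for a, b in pairs)
    if (aligned_x or aligned_y) and not any(overlaps(a, b) for a, b in pairs):
        return "list"
    return "composite"

# Vertically aligned, non-overlapping boxes -> "list"
# classify_group([(0, 0, 10, 10), (0, 20, 10, 10)])
# Offset but overlapping boxes -> "stack"
# classify_group([(0, 0, 10, 10), (4, 4, 10, 10)])
```

A real parser would also have to handle proximity thresholds and nested groups, which is where the "composite" category earns its keep.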

MusicWiz Functionality
The music collection can be explored by filtering on:
- Attribute values (i.e., ID3 tags, audio-signal attributes, and lyrics)
- Similarity values (i.e., overall similarity)
Playlists can be created:
- Manually: songs can be added from the left-side views and the workspace
- Automatically:
  - Filter-based mode: selection based on ID3 tags
  - Similarity-based mode: selection based on relatedness to the songs already on the playlist
(ID3 is a metadata container that lets information such as the title, artist, album, and track number be stored in the audio file itself.)

MusicWiz Evaluation
20 participants were asked to:
- Task 1: organize 50 rock songs into sub-collections according to their preference
- Task 2: form three twenty-minute playlists based on three different moods or occasions of their choice
- Task 3: form three six-song playlists, each related to a provided "seed" song (not among the fifty songs of the original collection)

MusicWiz Evaluation
Four groups of system use (2x2: workspace x suggestions):

 Configuration | No Suggestions | Suggestions
 No Workspace  | Group 1        | Group 2
 Workspace     | Group 3        | Group 4

- Group 1 (no workspace / no suggestions): completed the three tasks using MusicWiz's browsing, searching, and playback functionality, with Windows Explorer folders to form the collections and playlists
- Group 2 (no workspace / with suggestions): same features as Group 1, plus suggestions from the similarity inference engine
- Group 3 (with workspace / no suggestions): same features as Group 1, but used the MusicWiz workspace to create the collections and playlists
- Group 4 (with workspace / with suggestions): all MusicWiz features

Task 1 – Organization of Music
Ratings from 1 ("I strongly disagree") to 7 ("I strongly agree"):

 Statement                                                        | Group 1 | Group 2 | Group 3 | Group 4
 The system support in organizing effortlessly/quickly was enough | 4.4     | 5.4     | 5.6     | 6.2
 Enjoyed doing the task                                           | 5.8     | 6.4     | 6       | –
 The organization will be easily understood by others             | 4.2     | –       | –       | –

Tasks 2 & 3 – Playlist Creation
Ratings from 1 ("I strongly disagree") to 7 ("I strongly agree"):

 Statement                                     | Task  | Group 1 | Group 2 | Group 3 | Group 4
 System support for quick selection was enough | Two   | 4.8     | 6.2     | 5.8     | –
                                               | Three | 4.4     | 6.8     | 5.6     | –
 System support for finding music              |       | 6       | 5.4     | 4.6     | 6.4
 Enjoyed doing the task                        |       | 5.2     | 6.6     | –       | –

Apollo
- A hierarchically structured set of freeform canvases for creating and manipulating musical ideas (text and audio recordings)
- Supports searching the hierarchy with either melodic or text-based queries
- A less structured and prescriptive approach to recording and developing musical inspirations

Bainbridge, David, Brook J. Novak, and Sally Jo Cunningham. "A user-centered design of a personal digital library for music exploration." Proceedings of the 10th Annual Joint Conference on Digital Libraries. ACM, 2010.

Apollo
[Screenshots of the Apollo interface.]

Bag of Audio Words
- Audio can be treated as a special kind of document, composed of an unordered collection of audio words
- The features of each audio frame map to a particular audio word
- A piece of music = a bag of audio words
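The mapping from frames to words can be sketched with nearest-neighbor quantization against a codebook. The two-dimensional "features" and the three-word codebook below are illustrative stand-ins; in practice the codebook comes from clustering (e.g., k-means over MFCC frames):

```python
import math

def nearest_word(frame, codebook):
    """Index of the codeword closest (Euclidean) to this frame's features."""
    return min(range(len(codebook)),
               key=lambda i: math.dist(frame, codebook[i]))

def bag_of_audio_words(frames, codebook):
    """Histogram of codeword counts: frame order is discarded, exactly
    as word order is discarded in a textual bag of words."""
    hist = [0] * len(codebook)
    for frame in frames:
        hist[nearest_word(frame, codebook)] += 1
    return hist

# Toy 2-D "features" and a 3-word codebook (values are made up)
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
frames = [(0.1, 0.1), (0.9, 0.1), (0.1, 0.9), (0.2, 0.0)]
bag = bag_of_audio_words(frames, codebook)  # -> [2, 1, 1]
```

Once music is a histogram like this, all the text-retrieval machinery from earlier slides (term vectors, cosine similarity) applies to audio unchanged.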

Topics Summary
- General audio: audio cues, spatialized audio
- Speech: segmentation, speaker identification, recognition
- Music: interactive music, summarization, organization, bag of audio words