SEASR Applications and Future Work
University of Illinois at Urbana-Champaign
Outline
–Audio Analytics: NEMA
–Text Analytics: HTRC
–SEASR Central
–Future Meandre Features
–Future Meandre Workbench Features
–Attendee Plan Presentations
Defining Music Information Retrieval?
Music Information Retrieval (MIR) is the process of searching for, and finding, music objects, or parts of music objects, via a query framed musically and/or in musical terms.
–Music objects: scores, parts, recordings (WAV, MP3, etc.)
–Musically framed query: singing, humming, keyboard, notation-based, MIDI file, sound file, etc.
–Musical terms: genre, style, tempo, etc.
NEMA: Networked Environment for Music Analysis
–UIUC, McGill (CA), Goldsmiths (UK), Queen Mary (UK), Southampton (UK), Waikato (NZ)
–Multiple geographically distributed locations with access to different audio collections
–Distributed computation to extract a set of features and/or build and apply models
Work – NEMA
Executes a SEASR flow for each run
–Loads audio data
–Extracts features from every 10-second moving window of audio
–Loads models
–Applies the models
–Sends results back to the WebUI
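The per-run steps above can be sketched in a few lines of Python. This is a minimal illustration, not the NEMA flow itself: the 5-second hop between windows, the two stand-in features (RMS energy and zero-crossing rate), and the names `moving_windows`, `extract_features`, and `run_flow` are all assumptions introduced here. The model is any callable that maps a feature dictionary to a label.

```python
import math


def moving_windows(samples, sample_rate, window_seconds=10.0, hop_seconds=5.0):
    """Yield (start_time_in_seconds, window) for every full window.

    The 10-second window comes from the slide; the 5-second hop is an
    assumption, since the slide does not say how far the window moves.
    """
    window = int(window_seconds * sample_rate)
    hop = int(hop_seconds * sample_rate)
    if window <= 0 or hop <= 0:
        raise ValueError("window and hop must cover at least one sample")
    for start in range(0, len(samples) - window + 1, hop):
        yield start / sample_rate, samples[start:start + window]


def extract_features(window):
    """Compute two simple stand-in features for one window of samples."""
    rms = math.sqrt(sum(x * x for x in window) / len(window))
    # Count sign changes between neighbouring samples.
    crossings = sum(1 for a, b in zip(window, window[1:]) if (a < 0) != (b < 0))
    zcr = crossings / (len(window) - 1) if len(window) > 1 else 0.0
    return {"rms": rms, "zcr": zcr}


def run_flow(samples, sample_rate, model):
    """Extract features per window, apply the model, and collect results."""
    results = []
    for start, window in moving_windows(samples, sample_rate):
        features = extract_features(window)
        results.append({"start": start, "label": model(features)})
    return results


if __name__ == "__main__":
    # Hypothetical input: 25 seconds of a toy signal at 4 samples per second.
    toy_signal = [1.0, -1.0] * 50
    toy_model = lambda f: "loud" if f["rms"] > 0.5 else "quiet"
    for row in run_flow(toy_signal, sample_rate=4, model=toy_model):
        print(row)
```

In a real flow the loading of audio and models and the return of results to the WebUI would be separate components; here they are reduced to function arguments and a returned list to keep the sketch self-contained.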
NEMA Flow – Blinkie
NEMA Vision
The vision is for researchers at Lab A to easily build a virtual collection from Library B and Lab C, acquire the necessary ground truth from Lab D, incorporate a feature extractor from Lab E, combine the extracted features with those provided by Lab F, build a set of models based on a pair of classifiers from Labs G and H, and validate the results against another virtual collection taken from Lab I and Library J. Once completed, the results and newly created feature sets would, in turn, be made available for others to build upon.
Do It Yourself (DIY)
DIY Options
DIY Job List
DIY Job View
NESTER: Cardinal Annotation
–Audio tagging environment
–Green boxes indicate a tag by a researcher
–Given the tags, automated approaches learn the tagged pattern and are then applied to find untagged occurrences of it
NESTER: Cardinal Audio Analysis
HathiTrust Research Center (HTRC)
HTRC Storage and Services
[Architecture diagram. Components: Portal, Agent, Registry, Solr index, NoSQL text repository, CI Logon authentication service, ingest service, HathiTrust corpora. Resources shown: algorithm scripts, compute resources, data resources, collections, URLs to service resources. Labeled interactions: log in, authenticate credentials, spawns, fetch and register items, resources and items, deploys, query and fetch, produces, returns, crawls and indexes, rsync pushes.]
HTRC API
Data API
–RESTful web service allowing clients to retrieve digital text by providing volume IDs or page IDs
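As a rough illustration of how a client might address such a service, the sketch below only builds a request URL from a list of volume or page IDs. Everything specific in it is an assumption made for the example, not the documented HTRC interface: the base URL, the `volumes` and `pages` resource names, the `volumeIDs` and `pageIDs` parameter names, the `|` separator, and the sample IDs.

```python
from urllib.parse import urlencode

# Hypothetical base URL; replace with the real service location.
BASE_URL = "https://example.org/data-api"


def build_text_request(base_url, ids, id_kind="volume"):
    """Return a GET URL asking for the text of the given volumes or pages.

    Resource names, parameter names, and the "|" separator are assumptions
    made for this sketch; consult the service documentation for the real ones.
    """
    if id_kind not in ("volume", "page"):
        raise ValueError("id_kind must be 'volume' or 'page'")
    if not ids:
        raise ValueError("at least one ID is required")
    resource = "volumes" if id_kind == "volume" else "pages"
    param = "volumeIDs" if id_kind == "volume" else "pageIDs"
    query = urlencode({param: "|".join(ids)})
    return f"{base_url.rstrip('/')}/{resource}?{query}"


if __name__ == "__main__":
    # Made-up identifiers, used only to show the call shape.
    print(build_text_request(BASE_URL, ["vol.1", "vol.2"]))
    print(build_text_request(BASE_URL, ["vol.1.p3"], id_kind="page"))
```

Keeping URL construction separate from the network call makes the request easy to inspect and test before any credentials or HTTP client are involved.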
HTRC Spellcheck Flow
HTRC Mallet Flow
HTRC Announcements HTRC Uncamp, September 10-11, 2012
SEASR Central
[Mock-up of the SEASR Central home page. Top links: feedback, login, search central. Navigation: Categories, Recently Added, Top 50, Submit, About, RSS. Featured Component: Word Counter by Jane Doe. Description: "Amazing component that, given a text stream, counts all the different words that appear in the text". Rights: NCSA/UofI open source license. Featured Flow: FPGrowth by Joe Doe. Browse panel: Type (Component, Flows); Categories (Image, JSTOR, Zotero); Name; Author. Sample listing by Joe Doe, Rights: NCSA/UofI, Description: "Webservices given a Zotero entry tries to retrieve the content and measure its". Other labels shown: Centrality, Readability, Upload Fedora.]
SEASR Central Use Cases
–Register for an account
–Search for components / flows
–Browse components / flows / categories
–Upload component / flow
–Share component / flow with everyone or a group
–Unshare component / flow
–Create group / delete group
–Join group / leave group
–Create collection
–Generate location URL (permalink) for components, flows, collections (the location URL can be used inside the Workbench to gain access to that component or flow)
–View latest activity in public space / my groups
Meandre Workbench Futures
–Add custom property editor for types (checkbox, lists, etc.)
–Ability to specify parallel computation like in ZigZag
–Ability to use flows within flows (for grouping of functionality)
Survey You’ll find the comment / evaluations form OR
Discussion Questions
–How can SEASR benefit my research?
–What does SEASR need to look like for the future of humanities research?
–What scholarly questions do I have from my research for what to do with a million books?