SEASR Applications and Future Work National Center for Supercomputing Applications University of Illinois at Urbana-Champaign.



Outline Audio Applications Future Hands-On

Defining Music Information Retrieval?
Music Information Retrieval (MIR) is the process of searching for, and finding, music objects, or parts of music objects, via a query framed musically and/or in musical terms
– Music Objects: Scores, Parts, Recordings (WAV, MP3, etc.), etc.
– Musically framed query: Singing, Humming, Keyboard, Notation-based, MIDI file, Sound file, etc.
– Musical terms: Genre, Style, Tempo, etc.

NEMA
Networked Environment for Music Analysis
– UIUC, McGill (CA), Goldsmiths (UK), Queen Mary (UK), Southampton (UK), Waikato (NZ)
– Multiple geographically distributed locations with access to different audio collections
– Distributed computation to extract a set of features and/or build and apply models

Work – NEMA
Executes a SEASR flow for each run
– Loads audio data
– Extracts features from every 10-second moving window of audio
– Loads models
– Applies the models
– Sends results back to the WebUI
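The feature-extraction step can be sketched in Python as follows. This is a minimal illustration, not the NEMA flow itself: the function names, the 5-second hop between windows, and the RMS energy feature are all assumptions chosen for the example; the slide states only the 10-second window length.

```python
from typing import Iterator, List, Sequence


def moving_windows(samples: Sequence[float], sample_rate: int,
                   window_seconds: float = 10.0,
                   hop_seconds: float = 5.0) -> Iterator[Sequence[float]]:
    """Yield successive fixed-length windows over an audio signal.

    window_seconds matches the 10-second window on the slide;
    hop_seconds is an assumed value (the slide does not give one).
    """
    window = int(window_seconds * sample_rate)
    hop = int(hop_seconds * sample_rate)
    if window <= 0 or hop <= 0:
        raise ValueError("window and hop must be positive")
    # Only full windows are produced; a trailing partial window is skipped.
    for start in range(0, len(samples) - window + 1, hop):
        yield samples[start:start + window]


def rms(window: Sequence[float]) -> float:
    """Root-mean-square energy, a stand-in for a real audio feature."""
    return (sum(x * x for x in window) / len(window)) ** 0.5


def extract_features(samples: Sequence[float], sample_rate: int) -> List[float]:
    """Compute one feature value per moving window."""
    return [rms(w) for w in moving_windows(samples, sample_rate)]
```

A real extractor would compute richer features per window, but the windowing logic would have the same shape.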

NEMA Flow – Blinkie

NEMA Vision
Enable researchers at Lab A to easily build a virtual collection from Library B and Lab C, acquire the necessary ground-truth from Lab D, incorporate a feature extractor from Lab E, combine the extracted features with those provided by Lab F, build a set of models based on a pair of classifiers from Labs G and H, and validate the results against another virtual collection taken from Lab I and Library J. Once completed, the results and newly created feature sets would, in turn, be made available for others to build upon.

Do It Yourself (DIY)

DIY Options

DIY Job List

DIY Job View

Nester: Cardinal Annotation
– Audio tagging environment
– Green boxes indicate a tag by a researcher
– Given tags, automated approaches to learn the pattern are applied to find untagged patterns

NESTER: Cardinal Audio Analysis

Examining Audio Collection
– Tagged a set of examples
– Male and Female

SEASR Central
[Screenshot of the SEASR Central front page. Navigation: feedback, login, search central; Categories, Recently Added, Top 50, Submit, About, RSS. Featured Component: Word Counter by Jane Doe. Description: "Amazing component that given text stream, counts all the different words that appear on the text". Rights: NCSA/UofI open source license. Featured Flow: FPGrowth by Joe Does. Browse panel: entries by Joe Doe (Rights: NCSA/UofI; Description: "Webservices given a Zotero entry tries to retrieve the content and measure its"), filterable by Type (Component, Flows), Categories (Image, JSTOR, Zotero), Name, and Author; listed items include Centrality, Readability, Upload Fedora.]

SEASR Central Use Cases
– register for an account
– search for components / flows
– browse components / flows / categories
– upload component / flow
– share component / flow with: everyone or group
– unshare component / flow
– create group / delete group
– join group / leave group
– create collection
– generate location URL (permalink) for components, flows, collection (the location URL can be used inside the Workbench to gain access to that component or flows)
– view latest activity in public space / my groups

Hot topics on 1.4.X
– Complex concurrency model based on traditional semaphores written in Java
– Server performance bounded by JENA's persistent model implementation
– State caching on individual servers increases the complexity of single-image clusters
– Cloud-deployable, but not cloud-friendly

How did the 1.5 efforts turn into 2.0?
– Cloud-friendly infrastructure required rethinking core functionalities
– Drastic redesign of backend state storage
– Revisited execution engine to support distributed flow execution
– Changes to the API that render returned JSON documents incompatible with 1.4.X

What's New in the 2.0 Series?
Rewritten from scratch in Scala
RDBMS backend via Jena/JDBC has been dropped
MongoDB for state management and scalability
Meandre 2.0 server is stateless
Meandre API revised
– Revised response documents
– Simplified API (reduced the number of services)
– New Job API: submit jobs for execution; track them (monitor state, kill, etc.); inspect console and logs in real time

What's New Since the 1.4.X Series?
– New HTML interaction interface
– Off-the-shelf full-fledged single-image cluster
– Revised flow execution lifecycle: Queued, Preparing, Running, Done, Failed, Killed, Aborted
– Flow execution as a separate spawned process; multiple execution engines are available
– Running flows can be killed on demand
– Rewritten execution engine (Snowfield)
– Support for distributed flow fragment execution
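The flow execution lifecycle listed above can be modeled as a small state machine. The sketch below is in Python and purely illustrative: the seven state names come from the slide, but the transition table, the `JobState` enum, and the `advance` and `is_terminal` helpers are assumptions, not Meandre's actual implementation or API.

```python
from enum import Enum


class JobState(Enum):
    """The seven lifecycle states named on the slide."""
    QUEUED = "Queued"
    PREPARING = "Preparing"
    RUNNING = "Running"
    DONE = "Done"
    FAILED = "Failed"
    KILLED = "Killed"
    ABORTED = "Aborted"


# Assumed transition table: the slide lists the states but not the
# allowed moves between them, so this mapping is a plausible guess.
ALLOWED = {
    JobState.QUEUED: {JobState.PREPARING, JobState.KILLED, JobState.ABORTED},
    JobState.PREPARING: {JobState.RUNNING, JobState.FAILED,
                         JobState.KILLED, JobState.ABORTED},
    JobState.RUNNING: {JobState.DONE, JobState.FAILED,
                       JobState.KILLED, JobState.ABORTED},
    JobState.DONE: set(),
    JobState.FAILED: set(),
    JobState.KILLED: set(),
    JobState.ABORTED: set(),
}


def is_terminal(state: JobState) -> bool:
    """A state is terminal when no further transition is allowed."""
    return not ALLOWED[state]


def advance(current: JobState, target: JobState) -> JobState:
    """Return the new state, or raise if the move is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target
```

Keeping the transitions in one table makes it easy to check, for instance, that a job which is already Done cannot be killed.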

Meandre 2.0
Meandre 2.0 requires at least 2 separate services running
– A MongoDB for shared state storage and management holds all server state, job related information, and system information
– A Meandre server to provide services and facilitate execution (customizable execution engines)
A single-image Meandre cluster scales horizontally by adding new Meandre servers and sharding the MongoDB store
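To illustrate why stateless servers over a shared store scale horizontally, here is a minimal Python sketch. Everything in it is hypothetical: `SharedStore` is an in-memory stand-in for the MongoDB store, and `StatelessServer`, `submit`, and `status` are invented names for the example, not the Meandre API.

```python
from typing import Dict, Optional


class SharedStore:
    """In-memory stand-in for the shared state store (MongoDB in the slides)."""

    def __init__(self) -> None:
        self._jobs: Dict[str, dict] = {}

    def put_job(self, job_id: str, doc: dict) -> None:
        self._jobs[job_id] = dict(doc)

    def get_job(self, job_id: str) -> Optional[dict]:
        doc = self._jobs.get(job_id)
        return dict(doc) if doc is not None else None


class StatelessServer:
    """Hypothetical server that keeps no job state of its own."""

    def __init__(self, name: str, store: SharedStore) -> None:
        self.name = name
        self._store = store  # the only state is a handle to the shared store

    def submit(self, job_id: str, flow: str) -> None:
        # Every write goes straight to the shared store.
        self._store.put_job(
            job_id, {"flow": flow, "state": "Queued", "accepted_by": self.name}
        )

    def status(self, job_id: str) -> Optional[str]:
        # Every read comes from the shared store, so any server can answer.
        doc = self._store.get_job(job_id)
        return doc["state"] if doc is not None else None
```

Because no server caches job state locally, a job submitted through one server is immediately visible through any other, and adding a server requires no state migration.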

Meandre and Cloud Computing
Next generation data-intensive applications will:
– Use cloud computing technologies and conduits
– Require adaptation of programming paradigms
– Leverage a flexible and modular architecture
– Promote processing and resources at scale
Meandre
– Data-intensive execution engine
– Component-based programming architecture
– Distributed data flow designs to allow processing to be co-located with data sources and enable transparent scalability
– Orchestrate cloud deployments
– Leverage cloud conduits

Meandre Workbench Futures
– Copy and paste (between and within flows)
– Add custom property editor for types (checkbox, lists, etc)
– Ability to specify parallel computation like in ZigZag
– Ability to use flows within flows (for grouping of functionality)

Projects
SEASR Follow-On with Mellon Foundation
– Collaborators: Stanford, University of Maryland, George Mason University
Hathi-Trust Research Center
– NCSA as a Computational Site
– Collaboration with Indiana University
– HTRC reception at Digital Humanities :00pm - 7:30pm (PDT) on Monday, June 20
Bamboo
– Deploy a set of analytical services

Demonstration

Discussion Questions
– How can SEASR benefit my research?
– What does SEASR need to look like for the future of humanities research?
– What scholarly questions do I have from my research for what to do with a million books?