Chad Berkley National Center for Ecological Analysis and Synthesis (NCEAS), University of California, Santa Barbara February.

Presentation transcript:

Chad Berkley
National Center for Ecological Analysis and Synthesis (NCEAS), University of California, Santa Barbara
February 13, 2007, Berkeley, California
The Kepler Actor Repository: Enabling Remote Storage, Query and Retrieval of Workflow Components

Purpose of the Repository
- Easy method for workflow authors to share components
- A common archive for components
- Enables strong versioning
- Workflow components can become metadata for research papers
- Helps with lineage tracking
[Diagram labels: Repository, Client, Component]

Functional Requirements
- Kepler users should be able to easily locate and use components
- Kepler users should be able to easily add components to the archive
- Users should be able to restrict access to their components
- Components should be browsable or searchable

Important Differences Between Kepler and Ptolemy
- Kepler Object Manager (OM)
  - Database of all objects registered with the system
  - Objects are read in at startup
  - OM organizes objects based on an ontology
- Kepler Objects
  - Each component has one or more semantic types
  - Domain-specific ordering via semantic types/ontologies
  - Each object has a unique LSID (Life Science Identifier)
- Ontology: specification of a conceptualization within a knowledge domain
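The LSID and semantic-type ideas on this slide can be sketched in a few lines of Python. The sketch assumes the standard LSID URN layout (`urn:lsid:authority:namespace:object[:revision]`). The `ObjectManager` class, its method names and the `example.org` identifiers are hypothetical illustrations, not Kepler's actual API.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LSID:
    """A parsed Life Science Identifier."""
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str] = None

    @classmethod
    def parse(cls, text: str) -> "LSID":
        parts = text.split(":")
        # Expect urn:lsid:authority:namespace:object plus an optional revision.
        if len(parts) not in (5, 6) or [p.lower() for p in parts[:2]] != ["urn", "lsid"]:
            raise ValueError(f"not an LSID: {text!r}")
        return cls(*parts[2:])

    def __str__(self) -> str:
        fields = [self.authority, self.namespace, self.object_id]
        if self.revision is not None:
            fields.append(self.revision)
        return "urn:lsid:" + ":".join(fields)


class ObjectManager:
    """Hypothetical registry: objects keyed by LSID and indexed by semantic type."""

    def __init__(self) -> None:
        self._objects = {}                 # LSID -> component name
        self._by_type = defaultdict(set)   # semantic type -> set of LSIDs

    def register(self, lsid: LSID, name: str, semantic_types) -> None:
        if lsid in self._objects:
            raise KeyError(f"{lsid} is already registered")
        self._objects[lsid] = name
        for semantic_type in semantic_types:
            self._by_type[semantic_type].add(lsid)

    def get(self, lsid: LSID) -> str:
        return self._objects[lsid]

    def find_by_type(self, semantic_type: str):
        """Return the names of all components tagged with the given semantic type."""
        lsids = self._by_type.get(semantic_type, ())
        return sorted(self._objects[lsid] for lsid in lsids)


om = ObjectManager()
om.register(LSID.parse("urn:lsid:example.org:actor:1:1"), "Linear Regression", ["Statistics"])
om.register(LSID.parse("urn:lsid:example.org:actor:2:1"), "Histogram", ["Statistics", "Visualization"])
print(om.find_by_type("Statistics"))
```

Because each component can carry several semantic types, the same object may appear under more than one branch of an ontology-driven tree while still being stored once under its LSID.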

Kepler Archive (KAR) Files
- Used for component transfer and archiving
- OM can create or ingest KAR files
- Consists of actor metadata, a manifest and, eventually, class/jar files
- Each object has its LSID listed in the manifest
- The KAR itself has an LSID
- KAR files are used for transporting components between the client and server
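To make the archive structure concrete, here is a minimal Python sketch of an archive that carries its own LSID plus one LSID per contained object in a manifest. It assumes a zip container and a made-up manifest layout (`KAR-LSID`, `Name` and `LSID` keys). Both are assumptions for illustration, and the real KAR format may differ.

```python
import io
import zipfile

MANIFEST_PATH = "META-INF/MANIFEST.MF"


def build_kar(kar_lsid, entries):
    """Pack entries ({path: (lsid, payload_bytes)}) into an in-memory zip archive.

    The manifest layout (a KAR-LSID header followed by Name/LSID pairs)
    is an assumption made for this sketch.
    """
    lines = [f"KAR-LSID: {kar_lsid}", ""]
    for path, (lsid, _) in entries.items():
        lines += [f"Name: {path}", f"LSID: {lsid}", ""]
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(MANIFEST_PATH, "\n".join(lines))
        for path, (_, payload) in entries.items():
            archive.writestr(path, payload)
    return buffer.getvalue()


def read_manifest(kar_bytes):
    """Return (kar_lsid, {path: lsid}) read from the archive's manifest."""
    with zipfile.ZipFile(io.BytesIO(kar_bytes)) as archive:
        text = archive.read(MANIFEST_PATH).decode("utf-8")
    kar_lsid, lsids, current = None, {}, None
    for line in text.splitlines():
        key, _, value = line.partition(": ")
        if key == "KAR-LSID":
            kar_lsid = value
        elif key == "Name":
            current = value
        elif key == "LSID" and current is not None:
            lsids[current] = value
            current = None
    return kar_lsid, lsids


kar = build_kar(
    "urn:lsid:example.org:kar:7:1",
    {"Histogram.xml": ("urn:lsid:example.org:actor:2:1", b"<entity name='Histogram'/>")},
)
print(read_manifest(kar))
```

Listing every LSID in the manifest means a receiver can decide what an archive contains, and whether it already has those objects, without unpacking the payload files.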

Architecture

The Repository
- All services provided via the EarthGrid (formerly known as the EcoGrid)
- Web Services: Get/Query, Put, Auth
- Web interface allows users to search for and download components outside of Kepler
- Component Storage
  - KAR files
  - Metadata file external to the KAR for indexing
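The split between stored KAR files and an external metadata file can be illustrated with a small in-memory stand-in for the Put, Query and Get services. The `Repository` class, its method signatures and the rule that an existing LSID cannot be overwritten are assumptions made for this Python sketch. The actual EarthGrid web service interfaces are not shown in the slides.

```python
class Repository:
    """Hypothetical in-memory stand-in for the Put, Query and Get services."""

    def __init__(self):
        self._kars = {}      # LSID string -> raw KAR bytes
        self._metadata = {}  # LSID string -> metadata record kept outside the KAR

    def put(self, lsid, kar_bytes, metadata):
        # Assumption: an LSID names exactly one revision, so it is never overwritten.
        if lsid in self._kars:
            raise ValueError(f"{lsid} already exists; upload a new revision instead")
        self._kars[lsid] = kar_bytes
        self._metadata[lsid] = dict(metadata)

    def query(self, term):
        """Search only the external metadata; no KAR is opened."""
        term = term.lower()
        return sorted(
            lsid
            for lsid, record in self._metadata.items()
            if any(term in str(value).lower() for value in record.values())
        )

    def get(self, lsid):
        return self._kars[lsid]


repo = Repository()
repo.put("urn:lsid:example.org:kar:7:1", b"...kar bytes...", {"name": "Histogram", "type": "Visualization"})
print(repo.query("histogram"))
```

Keeping the searchable record outside the archive is what lets a query run across the whole collection without reading any component payloads.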

Client Interface
- Object Manager
  - Handles local get/put/query
  - Handles remote get/put/query through the EarthGrid interface
- User right-clicks on a component to upload it
- Remote search results are integrated into the actor library
- User drags a component from the search results to download it
- Internal database of LSIDs is synced to the server via EarthGrid interfaces

Uploading and Searching

Downloading to the Client
- Remote components are downloaded when dragged to the canvas
- For initial display (in the results tree), only actor metadata is loaded
- After initial download, the KAR file is cached
- Want and need dynamic class loading to make this more useful (back to that in a minute)
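The metadata-first, download-on-demand behavior described here amounts to a lazy cache. The Python sketch below shows one way it could work. `ComponentCache` and the two fetch callables are hypothetical names, and the remote calls are replaced by plain functions.

```python
class ComponentCache:
    """Hypothetical client-side cache: metadata for display, KAR bytes on demand."""

    def __init__(self, fetch_metadata, fetch_kar):
        self._fetch_metadata = fetch_metadata  # callable: lsid -> metadata dict
        self._fetch_kar = fetch_kar            # callable: lsid -> KAR bytes
        self._kars = {}

    def describe(self, lsid):
        """Used to populate the results tree: only metadata is requested."""
        return self._fetch_metadata(lsid)

    def load(self, lsid):
        """Used when a component is dragged to the canvas: download once, then reuse."""
        if lsid not in self._kars:
            self._kars[lsid] = self._fetch_kar(lsid)
        return self._kars[lsid]


downloads = []


def fake_fetch_kar(lsid):
    downloads.append(lsid)
    return b"kar bytes for " + lsid.encode()


cache = ComponentCache(lambda lsid: {"name": "Histogram"}, fake_fetch_kar)
cache.describe("urn:lsid:example.org:kar:7:1")   # no download yet
cache.load("urn:lsid:example.org:kar:7:1")       # first use downloads the KAR
cache.load("urn:lsid:example.org:kar:7:1")       # second use hits the cache
print(len(downloads))
```

Because an LSID identifies one fixed revision, a cached KAR keyed by LSID never needs to be invalidated in this sketch.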

Authentication
- Uses the EarthGrid interface
- Backend is currently LDAP
- Currently, a component can be either public or private, but control could be finer grained
- Authentication interface in Kepler is extensible and provides for other authentication schemes, such as GAMA
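A pluggable authentication backend combined with a public/private flag can be sketched as follows. This is a Python illustration under stated assumptions: the `Authenticator` protocol, `StaticAuthenticator` and `AccessControl` are invented names, the owner-only rule for private components is an assumption, and the dictionary backend merely stands in for LDAP or another scheme.

```python
from typing import Protocol


class Authenticator(Protocol):
    """Any backend (LDAP, GAMA, ...) only has to provide this one method."""

    def authenticate(self, username: str, password: str) -> bool: ...


class StaticAuthenticator:
    """Stand-in backend; a real deployment would query a directory service."""

    def __init__(self, accounts):
        self._accounts = dict(accounts)

    def authenticate(self, username, password):
        return username in self._accounts and self._accounts[username] == password


class AccessControl:
    """Hypothetical public/private check, with private meaning owner-only."""

    def __init__(self, authenticator: Authenticator):
        self._authenticator = authenticator
        self._owners = {}     # LSID string -> owner username
        self._public = set()  # LSID strings readable by anyone

    def register(self, lsid, owner, public):
        self._owners[lsid] = owner
        if public:
            self._public.add(lsid)

    def can_read(self, lsid, username=None, password=None):
        if lsid in self._public:
            return True
        if username is None or password is None:
            return False
        return (
            self._owners.get(lsid) == username
            and self._authenticator.authenticate(username, password)
        )


acl = AccessControl(StaticAuthenticator({"alice": "s3cret"}))
acl.register("urn:lsid:example.org:actor:1:1", "alice", public=True)
acl.register("urn:lsid:example.org:actor:2:1", "alice", public=False)
print(acl.can_read("urn:lsid:example.org:actor:2:1"))
print(acl.can_read("urn:lsid:example.org:actor:2:1", "alice", "s3cret"))
```

Finer-grained control would replace the single public set with per-user or per-group permissions, without changing the authenticator interface.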

Documentation
- Kepler uses the Ptolemy documentation system
- For displaying docs remotely, Kepler uses a custom attribute and inserts the docs directly into the actor metadata on the server
- Docs can then be transformed for viewing on the website

Viewing Documentation

Future Work
- Use the repository as the main storage location for components instead of shipping Kepler with an extensive library
- Make the web interface more usable
- Dynamic class loading…

Dynamic Class Loading
- Motivation: allow the use of multiple versions of the same class in one workflow execution
- Problems
  - Loading classes isn't too hard, but reloading classes requires removing the entire classloader
  - Two different actors may use two different versions of the same class
  - We don't always want to use the class (with the same name) that is already loaded
  - Security issues with not always using the preloaded Java classes
- Potential Solutions
  - Create a custom classloader that allows reloading/co-loading
  - Create a new classloader for each loaded class (that can be removed if necessary)
- Suggestions?
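The "one loader per loaded unit" idea can be shown by analogy. Kepler is a Java system and the slide is about Java classloaders. The sketch below is not Java classloader code but a Python analogy using the standard `importlib` machinery: two versions of a same-named module are loaded side by side under separate, caller-chosen names and are never registered globally. The `load_isolated` helper, the file layout and the module names are all made up for the example.

```python
import importlib.util
import pathlib
import tempfile


def load_isolated(path, unique_name):
    """Load the module at `path` under `unique_name` without adding it to sys.modules."""
    spec = importlib.util.spec_from_file_location(unique_name, str(path))
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


with tempfile.TemporaryDirectory() as tmp:
    root = pathlib.Path(tmp)
    # Two versions of a module that share the same file name.
    for version in ("1", "2"):
        (root / f"v{version}").mkdir()
        (root / f"v{version}" / "stats.py").write_text(f"VERSION = {version}\n")

    # Each "actor" gets its own private copy, so the versions do not collide.
    old = load_isolated(root / "v1" / "stats.py", "actor_a_stats")
    new = load_isolated(root / "v2" / "stats.py", "actor_b_stats")

print(old.VERSION, new.VERSION)
```

In this analogy, discarding the references to `old` or `new` plays the role of removing a per-class classloader: nothing global has to be unloaded. The security concern raised on the slide still applies, because code loaded this way bypasses whatever is already installed.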

More info:
This material is based upon work supported by the National Science Foundation under award and others.

SEEK Partner Institutions
- University of New Mexico
- Napier University, Edinburgh Scotland
- University of Kansas
- University of Vermont
- University of California, Santa Barbara
- National Center for Ecological Analysis and Synthesis
- University of California, Davis
- Arizona State University
- University of North Carolina
- San Diego Supercomputer Center

Kepler Partner Projects
- SEEK
- Ptolemy
- ROADNet
- SDM/SPA
- SDM/CPES
- GEON
- Resurgence