Measurement Data Archive – Integration Effort GEC11 July 2011 Giridhar Manepalli Corporation for National Research Initiatives.

Presentation transcript:

Measurement Data Archive – Integration Effort GEC11 July 2011 Giridhar Manepalli Corporation for National Research Initiatives

Measurement Data Archive: Status
- Deployed a prototype of the measurement data archive that includes a temporary storage space, also known as the workspace.
- A hierarchical storage system that allows making collections of objects.
- Mints a persistent identifier that resolves to the data.
- Indexes metadata to support queries and data discovery.
- Supports SFTP, SCP, SMB, REST, and a Web-based interface into the system.
- Early adopters in GENI:
  - OnTimeMeasure - Ohio State University
  - INSTOOLS - University of Kentucky

Success Criteria for an Archive
- An archive cannot be just a store-and-retrieve service. An eco-system surrounding the archive is needed to motivate communities to use it. Visualization, policy enforcement, and dissemination are examples of services an archive could provide.
- To build such an eco-system, a basic understanding of what we store is necessary:
  - #1: Data Model. How do you define a data object? (Not how it is serialized, e.g., databases, file-systems, etc.) Do we need a data-agnostic archive? Do we manage relationships across data objects? Too many storage systems have failed for lack of a proper data model.
  - #2: Metadata. What constitutes a metadata record? How is it associated with a data object? Without metadata, an archive holds only a pile of bytes, and building an eco-system of services on a pile of bytes is impossible.
  - #3: API. How is data (and metadata) pushed into an archive? What are the end-point definitions and data structures?
- #1 and #2 are more important.
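One way to make the data-model and metadata questions concrete is a minimal sketch of what a data object might carry. This is an illustration only, not the archive's actual model: the class name `DataObject`, its fields, the `derivedFrom` relation, and the identifier strings are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DataObject:
    """Hypothetical data object: opaque bytes plus the parts that describe them."""

    identifier: str  # persistent identifier; the format used here is an assumption
    payload: bytes   # the data itself; the archive does not interpret it
    metadata: Dict[str, str] = field(default_factory=dict)       # metadata record
    related: Dict[str, List[str]] = field(default_factory=dict)  # relation -> identifiers

    def relate(self, relation: str, other: "DataObject") -> None:
        """Record a named relationship to another object by its identifier."""
        self.related.setdefault(relation, []).append(other.identifier)


# Two illustrative objects: a raw capture and a summary derived from it.
raw = DataObject("example/raw-1", b"\x00\x01", {"title": "Raw capture"})
summary = DataObject("example/summary-1", b"mean=0.5", {"title": "Summary"})
summary.relate("derivedFrom", raw)
```

Keeping the payload as plain bytes corresponds to the data-agnostic option, while the `metadata` and `related` fields are what would keep the object from being just a pile of bytes. Whether relationships belong in the model at all is exactly the open question the slide raises.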

Integration: Next Steps
- Step #1: Define a data object.
  - Is data just a series of bytes? Or do we pack X, Y, & Z into it?
  - Are relationships across objects required or not? (Not nice-to-have, but are they required?)
  - Do we have data visibility criteria? Permissions, etc.
- Step #2: Validate the metadata recommendation. Projects should generate a few metadata records with these goals:
  - To identify which elements are needed, which are optional, and which are not required.
  - To capture different profiles of data. Perhaps some elements are needed for one class of data, and other elements are needed for other classes of data. This may result in a few profiles. Although unlimited profiles are hard to manage, a limited number will result in fewer optional fields.
  - To validate the suggested controlled vocabulary for some of the elements, and to identify where vocabulary is missing. Controlled vocabulary brings some order into metadata and discovery.
- Step #3: Identify the API. What end-points and data structures are reasonable for a given project? REST+XML, XML-RPC, etc.
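The validation exercise in Step #2 could be sketched as a small check of a metadata record against a profile and a controlled vocabulary. Everything named here is an assumption made for illustration: the profile name `throughput`, the element names, and the vocabulary terms do not come from the metadata recommendation itself.

```python
from typing import Dict, List

# Hypothetical profile: which elements one class of data needs or may carry.
PROFILES = {
    "throughput": {
        "required": {"title", "creator", "units"},
        "optional": {"description"},
    },
}

# Hypothetical controlled vocabulary for selected elements.
VOCABULARY = {"units": {"bps", "ms", "packets"}}


def validate(record: Dict[str, str], profile_name: str) -> List[str]:
    """Return a list of problems; an empty list means the record fits the profile."""
    profile = PROFILES[profile_name]
    present = set(record)
    problems = []

    # Elements the profile requires but the record lacks.
    for element in sorted(profile["required"] - present):
        problems.append(f"missing required element: {element}")

    # Elements the record carries that the profile does not define.
    allowed = profile["required"] | profile["optional"]
    for element in sorted(present - allowed):
        problems.append(f"element not in profile: {element}")

    # Values that fall outside the controlled vocabulary.
    for element in sorted(VOCABULARY):
        if element in record and record[element] not in VOCABULARY[element]:
            problems.append(
                f"{element}={record[element]!r} is not in the controlled vocabulary"
            )
    return problems


problems = validate({"title": "Run 7", "creator": "lab", "units": "furlongs"}, "throughput")
```

Running a few real records from each project through a check like this would show which elements are always filled in, which are never used, and where the vocabulary has gaps.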