Metadata Management and Cataloging Breakout
Jim Myers, Line Pouchard
Ann Chervenak, Richard Mount, Larry Rahn, Greg Riccardi, Sonja Tidemann, Steve Wiley




Wrong Title?
“I could do more science if I could: Automate workflow, Search my data faster. …”
“My paper is too short, I need more metadata…”

Drivers for Metadata
Extracting more value from data
  - Data useful beyond grad student lifetime
  - Single-user efficiency
Dealing with Moore’s Law, CS Advances
  - Managing more complex experiments: unique names are no longer sufficient
  - Componentization of codes
  - (Decomposition of concerns (aspects))

Drivers for Metadata
Changing Science: moving beyond an oral tradition
  - Need to share context-dependent data across community(ies) (data dissemination/discovery)
  - Support mapping between data models (across domains, over time)
  - Managing non-hierarchical data relationships / multiple hierarchies at once
  - Describing hypotheses / statements of trust / reification (statements about other statements)
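Note on reification: the last driver above (statements about other statements) can be illustrated with a minimal sketch. It assumes a simple subject/predicate/object model. The `Statement` class, the `statements_about` helper, and all sample values are hypothetical illustrations, not anything presented in the breakout.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Statement:
    """A subject-predicate-object assertion (hypothetical model).

    The subject or object may itself be a Statement, which is what
    makes a statement about another statement (reification) possible.
    """
    subject: Union[str, "Statement"]
    predicate: str
    obj: Union[str, "Statement"]


def statements_about(target, statements):
    """Return every statement whose subject is `target`."""
    return [s for s in statements if s.subject == target]


# Hypothetical sample data: one measurement plus two statements about it.
measurement = Statement("run-42", "hasFlameTemperature", "1850 K")
trust = Statement(measurement, "endorsedBy", "reviewer-A")
hypothesis = Statement(measurement, "supportsHypothesis", "H1")

catalog = [measurement, trust, hypothesis]
for s in statements_about(measurement, catalog):
    print(s.predicate, "->", s.obj)
```

Because a statement can be the subject of another statement, trust and hypothesis information attaches to a specific claim rather than to the whole data set.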

Catch 22
Everybody says metadata is important, but few actually record it
  - Frog in the pot
  - Tragedy of the Commons
  - Paradigm shift
What’s changing?
  - New Science drivers require it
  - New technologies will simplify capture and management

Uses
  - Provenance (original conditions, subsequent workflow)
      Reproducing experiments and analysis
      Virtual Data
      Workflow-by-example
  - Data Discovery
      Metadata-based search (features, subsets, …)
  - Data Quality
      Evaluation/review, endorsement
      Curation/records information
  - Annotation
      Data context
      Relation to other data
  - Discovery/Mining/Inference/Monitoring
  - Not discussed much: metadata applies not only to data but also to services, programs, machines, instruments
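Note on provenance and discovery: two of the uses above, metadata-based search and provenance, can be sketched with a small in-memory catalog. This is an illustration under stated assumptions, not a design from the breakout. The `Record` and `Catalog` classes, the `derived_from` field, and the sample ids and attributes are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """Metadata for one data item (hypothetical schema)."""
    data_id: str
    attributes: dict
    derived_from: list = field(default_factory=list)  # ids of direct inputs


class Catalog:
    def __init__(self):
        self._records = {}

    def register(self, record):
        self._records[record.data_id] = record

    def search(self, **criteria):
        """Return sorted ids of records whose attributes match every criterion."""
        return sorted(
            r.data_id
            for r in self._records.values()
            if all(r.attributes.get(k) == v for k, v in criteria.items())
        )

    def lineage(self, data_id):
        """Follow derived_from links back to the original inputs."""
        seen = []
        stack = list(self._records[data_id].derived_from)
        while stack:
            parent = stack.pop()
            if parent not in seen:
                seen.append(parent)
                if parent in self._records:
                    stack.extend(self._records[parent].derived_from)
        return seen


# Hypothetical sample data: raw data, a calibrated product, and a figure.
catalog = Catalog()
catalog.register(Record("raw-001", {"instrument": "spectrometer", "stage": "raw"}))
catalog.register(Record("cal-001", {"instrument": "spectrometer", "stage": "calibrated"},
                        derived_from=["raw-001"]))
catalog.register(Record("fig-001", {"stage": "figure"}, derived_from=["cal-001"]))

print(catalog.search(instrument="spectrometer"))
print(catalog.lineage("fig-001"))
```

The same `derived_from` links that answer "where did this figure come from?" are what a workflow system would need in order to re-run the steps that produced it.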

R&D Challenges
  - What to standardize, what to record?
      Infrastructure is general; some schemas should be too (workflow, experiment mgmt), but most are domain specific
  - Metadata Services
      Scalable, distributed, schema-independent (semantic federation/ontology mapping, derived indexes/info retrieval service, global ids, rich authorization models, data granularity, inference services, curation (tuning based on access, etc.))
      Usability: metadata input/capture (automation, cultural change, rewards, Google precedent)
      Maintainability: automated quality management

Conceptual Services (from workshop 1)
  - Logical Naming
  - Lifecycle
  - Discovery (data, schema)
  - Basic Management (ingest, storage, query, update, notification, ...)
  - Reasoning (mapping, inference, …)
  - Records (signing, nonrepudiation)
  - Migration (schema, formats, signatures, ...)
  - Archival/versioning (copies of external data, services, …)
  - Policy enforcement (fit for purpose, adheres to common data model, …)
  - Federation & aggregation
  - Collection and/or Compounding
  - Curation (e.g. conflict detection & resolution)
  - Workflow “Proxy”
  - Provenance
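Note on logical naming and migration: one way to read these two services together is that a stable logical name stays fixed while the physical copies behind it move. The sketch below assumes that reading. The `LogicalNameService` class, the `lfn://` naming scheme, and the file locations are hypothetical and were not part of the breakout material.

```python
class LogicalNameService:
    """Maps stable logical names to the physical locations holding the data."""

    def __init__(self):
        self._locations = {}  # logical name -> list of physical locations

    def register(self, logical_name, location):
        """Record one more physical copy of the named data set."""
        copies = self._locations.setdefault(logical_name, [])
        if location not in copies:
            copies.append(location)

    def resolve(self, logical_name):
        """Return a copy of the known locations (empty list if unknown)."""
        return list(self._locations.get(logical_name, []))

    def migrate(self, logical_name, old_location, new_location):
        """Swap one physical location for another; the logical name is unchanged."""
        copies = self._locations[logical_name]
        copies[copies.index(old_location)] = new_location


# Hypothetical names and locations, for illustration only.
names = LogicalNameService()
names.register("lfn://combustion/run-42", "file:///siteA/disk3/run42.h5")
names.register("lfn://combustion/run-42", "file:///siteB/tape/run42.h5")
names.migrate("lfn://combustion/run-42",
              "file:///siteA/disk3/run42.h5",
              "file:///siteA/disk7/run42.h5")
print(names.resolve("lfn://combustion/run-42"))
```

Metadata and provenance records that refer only to the logical name stay valid after a migration, which is the practical reason to separate the two kinds of name.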

Program Scope
  - Research: metadata services (see above)
  - Pilot: use of rich metadata to support grand challenge projects
  - Develop/deploy: general metadata capture tools (capture from workflows, problem solving environments)
  - Maintain: metadata management as cyberinfrastructure (requires research on scaling, maintainability, …)