ORNL DAAC Semi-Automated Data Ingest Process Daine Wright Suresh Vannan, Tammy Beaty, Bob Cook, Yaxing Wei, Ranjeet Deverakonda, Harold.

Presentation transcript:

ORNL DAAC Semi-Automated Data Ingest Process
Daine Wright
Suresh Vannan, Tammy Beaty, Bob Cook, Yaxing Wei, Ranjeet Deverakonda, Harold Shanafield
ESIP Summer Meeting 2015, July
ORNL DAAC

Ingest “Semi-automation”: Why did we do this?
- Provide the ability to track a data set from acceptance to publication
- Automate steps that can be automated to improve efficiencies and reduce redundancy
- Provide a centralized system to manage the various aspects of ingest
  – Data files
  – Documentation
  – Code
  – Communications, internal and external
- Update legacy ingest infrastructure

Key Components
- Archival interest form: identifies an investigator’s data set for archival
- Data Provider Questions (DPQ): online form that serves as the basis for a metadata record
- DAAC Ingest Dashboard (DID) and Ingest Kit: data file management system, including PI upload and movement to archive area
- Semi-automated QA evaluation
- DAAC Online Metadata Editor (DAACOME): metadata editor capable of producing the data set documentation
- Seamless publication

Archival Interest Form

DAAC-Ingest Dashboard (DID)
Format: custom Drupal (PHP) module with MySQL schema
- Adds links to navigation menu
- Initiates data set submission
- Emails data provider with instructions for data provider questions and data upload
- Monitors data upload and data provider questions progress
- Assigns QA; emails assignees and coordinator
- Assigns documentation; emails assignee and coordinator
- Displays the life cycle of a data set submission with completion dates for simplified reporting
- Includes DAAC-ingest database schema

Data Provider Questions (DPQ)
Language: Perl / HTML / JavaScript / MySQL
- Answers should be readily available
- Form should only take about 20 minutes to complete
- Gathers preliminary metadata on data sets
- Travels with data set throughout archival process
AACSUB/repos/data-provider-questions
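As a rough illustration of how form answers could become the basis of a metadata record, here is a minimal Python sketch. It is not the DAAC's Perl implementation: the field names in `REQUIRED_FIELDS` and the function `build_preliminary_metadata` are hypothetical, chosen only for the example.

```python
# Hypothetical question identifiers; the real DPQ form fields are not given in the slides.
REQUIRED_FIELDS = ("title", "investigator", "start_date", "end_date")


def build_preliminary_metadata(answers):
    """Turn raw form answers into a cleaned preliminary metadata record.

    Raises ValueError listing any required question left unanswered.
    """
    missing = [
        field for field in REQUIRED_FIELDS
        if not str(answers.get(field, "")).strip()
    ]
    if missing:
        raise ValueError("Unanswered questions: " + ", ".join(missing))
    # Keep only the known fields, with surrounding whitespace removed.
    return {field: str(answers[field]).strip() for field in REQUIRED_FIELDS}
```

A record built this way could then be stored alongside the submission so that it travels with the data set through the later stages.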

Ingest Kit
Language: Perl
- Records emails between data provider and DAAC
- Monitors data upload area
- Copies files from upload area to storage and QA area
- Collects granule-level metadata
- Backs up MySQL database
ts/DAACSUB/repos/ingest-kit
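The copy-and-collect steps above can be sketched in Python as follows. This is an illustrative sketch, not the Ingest Kit itself (which is written in Perl): the directory layout, the function names, and the choice of name, size, and MD5 checksum as granule-level metadata are all assumptions.

```python
import hashlib
import shutil
from pathlib import Path


def collect_granule_metadata(path):
    """Return basic file-level metadata for one uploaded file (assumed fields)."""
    data = path.read_bytes()
    return {
        "name": path.name,
        "size_bytes": len(data),
        "md5": hashlib.md5(data).hexdigest(),
    }


def ingest_upload_area(upload_dir, storage_dir, qa_dir):
    """Copy every file in the upload area to storage and QA areas.

    Returns one metadata record per copied file.
    """
    storage_dir.mkdir(parents=True, exist_ok=True)
    qa_dir.mkdir(parents=True, exist_ok=True)
    records = []
    for path in sorted(upload_dir.iterdir()):
        if not path.is_file():
            continue  # this sketch ignores subdirectories
        shutil.copy2(path, storage_dir / path.name)
        shutil.copy2(path, qa_dir / path.name)
        records.append(collect_granule_metadata(path))
    return records
```

Running such a function on a schedule would be one simple way to "monitor" an upload area; the slides do not say how the actual monitoring is triggered.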

DAAC Ingest Automation Swimlanes
Stages: Interest, Submission, QA, Documentation, Publication
Roles: DP = Data Provider, IC = Ingest Coordinator, QA = Quality Assurance, DL = Documentation Lead, DS = DAAC Scientist
Steps shown:
- Assemble Metadata in database
- Archival Interest Form
- Create ORNL XCAMS account
- Answer Data Provider Questions
- Upload data
- Confirm Submission
- DAAC Appropriate?
- Email DP with appropriate alternate archives
- Collect initial metadata
- Assign QA staff member
- Verify Data Set completeness
- Publish Data Set
- Monitor submission
- Initiate data set submission
- Send initial email to DP
- Perform QA for granule data & metadata
- Iterate with DP/DL/IC
- Verify QA and distribution package
- Assign Documentation Coordinator
- Scientific Review / Approval DSP
- Create/Edit Metadata
- Output landing page and guide doc
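One way to picture tracking a submission through the five stages named on this slide, with a completion date per stage, is the Python sketch below. The class `SubmissionLifecycle` and its methods are hypothetical and are not taken from the DID's Drupal module or database schema; the strict one-stage-at-a-time ordering is also an assumption.

```python
from datetime import date

# Stage names as they appear on the swimlane slide.
STAGES = ["Interest", "Submission", "QA", "Documentation", "Publication"]


class SubmissionLifecycle:
    """Tracks which stages of a data set submission are complete, and when."""

    def __init__(self, title):
        self.title = title
        self.completed = {}  # stage name -> completion date

    def current_stage(self):
        """Return the first stage not yet completed, or None if published."""
        for stage in STAGES:
            if stage not in self.completed:
                return stage
        return None

    def complete(self, stage, when):
        """Mark the current stage as complete; stages cannot be skipped."""
        if stage != self.current_stage():
            raise ValueError(
                f"cannot complete {stage!r}; current stage is {self.current_stage()!r}"
            )
        self.completed[stage] = when

    def report(self):
        """Return (stage, completion date or None) pairs for reporting."""
        return [(stage, self.completed.get(stage)) for stage in STAGES]
```

The `report()` output corresponds loosely to the idea of displaying a life cycle with completion dates for simplified reporting.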

Questions?
Daine Wright, ORNL DAAC

Initiate Data Set Submission

Send initial email to DP

Answer Data Provider Questions

Upload data: FTP upload area

Monitor Submission: Pending Data Set Submissions

Close Submission

Assign QA Staff Member

Assign QA Staff Member: View QA Assignment

Collect initial granule metadata

Monitor QA: Pending QA Assignments

Perform QA for granule data & metadata
Example: NDVI Growing Season Trends
- Issue: A netCDF file was provided but was not described in the documentation. It was also not CF compliant.
  – Resolution: The PI was contacted and explained that the netCDF file was provided as an accessory file to a multiband GeoTIFF that contained identical information. Since the data were not multidimensional, the GeoTIFF was chosen for archival.
- Issue: The data in the provided GeoTIFF did not exactly match the data shown in a similar figure in the research paper.
  – Resolution: The PI was contacted and explained that the GeoTIFF he provided had been updated since the paper was published.
- Issue: According to the research paper, yearly growing season NDVI data were produced, but these data were not submitted to the DAAC.
  – Resolution: A request for these data was submitted to the PI, who produced GeoTIFFs for each year. The DAAC staff created a netCDF file that incorporated all of the GeoTIFF data as well as a time dimension.
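The first issue above (a submitted file that the documentation never mentions) is the kind of check that lends itself to semi-automation. The Python sketch below is only an illustration under simple assumptions: the function name and the file names are hypothetical, and matching is a plain case-insensitive search for each file name in the documentation text, which is far cruder than a real QA evaluation.

```python
def find_undocumented_files(file_names, documentation_text):
    """Return the submitted file names that never appear in the documentation.

    Matching is a case-insensitive substring search, so it only catches
    files whose exact names are missing from the text.
    """
    text = documentation_text.lower()
    return sorted(name for name in file_names if name.lower() not in text)


# Hypothetical example inputs.
docs = "The archive contains ndvi_trends.tif, a multiband GeoTIFF."
flagged = find_undocumented_files(["ndvi_trends.tif", "ndvi_trends.nc"], docs)
```

Files flagged this way would still need a person to follow up with the PI, as in the resolutions described on the slide.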

Assign Documentation Coordinator

Monitor Documentation: Pending Documentation Assignments

Create/Edit Metadata: DAAC Online Metadata Editor (DAACOME)

Output landing page and guide doc: DAACOME Guide Doc

Scientific Review / Approval DSP

Monitor Submissions: Pending Documentation Assignments

Published Data Set: Data Set Landing Page, Guide Documentation

Ongoing discussions with NODC on
- Approaches
- Possible collaborations
- Best Practices