1 DSARCH OVERVIEW Dataset Archiving Utility Overview By Zaihua Ji.

Presentation transcript:

1 DSARCH OVERVIEW Dataset Archiving Utility Overview By Zaihua Ji

2 Outline
-- Definitions and how DSARCH fits into RDA functions
-- Purpose of DSARCH
-- Introduction to DSARCH usage
-- Procedures for using DSARCH to archive data
-- Converting existing datasets
-- Publishing MSS/RDA Server data filelists

3 RDA Components

4 Definition of Metadata
-- Metadata that summarizes a dataset, such as author, title, summary, etc.
-- Metadata that defines external properties of RDA files, such as file locations on MSS/RDA Server, sizes, file packaging (tar, COS block, etc.), dataset sub-groups of files, archive file type (P, B, W, …), etc.
-- Metadata that defines internal properties of RDA files, such as data format (GRIB, ASCII, etc.), variables, spatial & temporal ranges, internal metrics (e.g., number of grids or stations), etc.

5 Metadata Coverage

6 Current Data Archive Flow

7 Dsarch Work Flow for Existing dataset

8 Dsarch Work Flow for New Dataset

9 What DSARCH Is For
-- Archives/retrieves data files to/from MSS and RDA server
-- Records/retrieves dataset metadata (mainly VSN metadata) to/from RDADB
-- Organizes archived data files for one dataset into sub-datasets, called groups

10 What DSARCH Does
-- Defines group and dataset information in RDADB
-- Automatically selects the MSS VSN by default; other options are available
-- Archives data files on the MSS and/or RDA server and saves transaction records into the RDADB
-- Retrieves dataset/group/file information from the RDADB
-- Modifies/corrects information in the RDADB
-- Copies MSS files back to local computers (RDA Server, bison, etc.)
-- Moves MSS/Web data files from one dataset/group to another
-- Removes files from MSS or RDA server
RDADB Maintenance Functions
-- Backs up dataset/group/file information from RDADB into the CVS Archive
-- Restores dataset/group/file information into RDADB from the CVS Archive

11 Advantage of Using DSARCH

12 General DSARCH Usage
dsarch [[-DS] dsnnn.n] [Action Option] [Mode Options] [Information Options]
-- Square brackets [] indicate optional items
-- There are three option categories: Action, Mode, and Information (Info for short) options
-- Action options - specify which tasks the utility executes
-- Mode options - modify the behavior of the given actions
-- Info options - pass information, one or multiple values, to DSARCH
-- An option can be given by either its short or its long name, e.g. -DS or -Dataset
-- All Action and Mode options, as well as the Info option -IF (-InputFile), must be given on the command line; all other Info options can be given either on the command line or in one or multiple input files specified by option -IF
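The option ordering on this slide can be sketched as a small helper that assembles an argument list. This is only an illustration under stated assumptions: the slide does not name any concrete Action or Mode options, so `-ACTION` below is a hypothetical placeholder, and passing several files after a single `-IF` is a guess at how "one or multiple input files" is written. Only `-DS` and `-IF` come from the slide.

```python
import re

# Dataset IDs follow the dsnnn.n pattern shown on the slide, e.g. ds540.1.
DATASET_ID = re.compile(r"^ds\d{3}\.\d$")


def build_dsarch_argv(dataset, action, modes=(), info=None, input_files=()):
    """Assemble a dsarch argument list in the order given on the slide.

    `action` and `modes` are passed through unchanged because the slide
    does not list the real option names; callers supply them.
    """
    if not DATASET_ID.match(dataset):
        raise ValueError(f"not a dsnnn.n dataset ID: {dataset!r}")
    argv = ["dsarch", "-DS", dataset, action, *modes]
    # -IF must be on the command line; other Info options may instead
    # live inside the input files it names.
    if input_files:
        argv += ["-IF", *input_files]  # assumption: one -IF, many files
    for option, values in (info or {}).items():
        argv += [option, *values]
    return argv


if __name__ == "__main__":
    # "-ACTION" is a hypothetical stand-in for a real Action option.
    print(build_dsarch_argv("ds540.1", "-ACTION", input_files=["ds540.1.mss"]))
```

Building a list rather than a shell string keeps each value a separate argument, so file names with spaces need no quoting.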

13 Categories of Action Options
-- Dataset Actions - create, modify and retrieve dataset information in RDADB
-- Group Actions - create, delete, modify and retrieve dataset group information in RDADB
-- File Actions - archive files onto MSS and RDA server; move and delete files on MSS and RDA server; create, delete, modify and retrieve information about data files in RDADB
-- File-Name Actions - generate, release, retrieve and archive MSS file names (VSN - Volume Serial Number) per RDADB card bank
-- Info Actions - create, modify and retrieve all information for datasets, groups in datasets, on MSS and RDA Server
-- Backup Actions - archive, restore, and check CVS backup history information for datasets, groups, and MSS and RDA Server files

14 Procedure for Creating a New Dataset
-- Create the SCCS dataset archive file with ‘Search & Discovery Metadata’ only (for the dataset main webpage)
-- Create the initial dataset record in RDADB with the ‘filldataset’ utility
-- Set the flag for using RDADB to ‘Y’ and modify the dataset info
-- Create group information if needed
-- Archive local files onto MSS and/or RDA Server
-- Set the flag for using RDADB to ‘P’ or ‘I’, and publish dataset file lists with the ‘publish_filelist’ utility

15 Procedure to Convert an Existing Dataset
-- Set the flag for using RDADB to ‘Y’
-- Edit the VSN section from the SCCS dataset document to create a file named ‘dsnnn.n.sccs’
-- Use the ‘myconvert’ utility to reformat ‘dsnnn.n.sccs’ into ‘dsnnn.n.mss’, which is an input file designed for RDADB
-- Use DSARCH to enter the input file ‘dsnnn.n.mss’ into RDADB
-- Set the flag for using RDADB to ‘P’ or ‘I’, and use the ‘publish_filelist’ utility to publish dataset file lists

16 Convert SCCS Metadata for an Existing Dataset
myconvert dsnnn.n.sccs > dsnnn.n.mss
-- dsnnn.n.sccs - MSS file information modified from the VSN file section of the SCCS dataset metadata file by inserting conversion control keys for information about the dataset, groups and files. See examples later
-- dsnnn.n.mss - DSARCH input file holding dataset, group, and/or MSS file information
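The redirect `myconvert dsnnn.n.sccs > dsnnn.n.mss` can be scripted when many datasets are converted. The following is a minimal sketch, assuming `myconvert` is on the PATH and writes the converted text to standard output as the command line above implies. The function name, the `directory` parameter, and the injectable `run` argument are illustrative choices, not part of DSARCH.

```python
import subprocess
from pathlib import Path


def convert_sccs_to_mss(dataset, directory=".", run=subprocess.run):
    """Run `myconvert <dataset>.sccs` and capture stdout in `<dataset>.mss`.

    `run` defaults to subprocess.run; it is a parameter only so the
    function can be exercised without the real myconvert utility.
    """
    directory = Path(directory)
    sccs_name = f"{dataset}.sccs"
    mss_path = directory / f"{dataset}.mss"
    with open(mss_path, "w") as out:
        # Equivalent to the shell redirect on the slide; check=True raises
        # CalledProcessError if the real utility exits with an error.
        run(["myconvert", sccs_name], stdout=out, cwd=directory, check=True)
    return mss_path


if __name__ == "__main__":
    # Requires the real myconvert utility and ds540.1.sccs in the
    # current directory.
    print(convert_sccs_to_mss("ds540.1"))
```

The resulting `.mss` file is what the slide describes as the DSARCH input file, which would then be passed with `-IF`.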

17 Conversion Key Categories
-- Dataset Keys - mark public and/or internal MSS dataset notes
-- Group Keys - build up group information, such as group index, group name (ID), group titles, and public and/or internal MSS group notes
-- MSS File Keys - set up MSS file information, such as the format key, which specifies what information is collected from the description part of a file line, and keys for data and file formats
-- A special key, LB, which can be used to turn on (LB, for example) or off (LB ) an HTML line-break symbol for multi-line notes

18 Dataset Keys DM - public MSS dataset note; description lines following ‘DM ’ are collected, including empty lines DI - internal MSS dataset note; description lines following ‘DI ’ are collected, including empty lines

19 Group Keys
-- GN - group name or group ID, up to 20 characters. It is given in the format GN GroupID, and is optional if GI is present, e.g. GN List-A
-- GI - group index number. It is automatically assigned in order if GN is present. It is given in the format GI GroupIndex, e.g. GI 5
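The two group-key formats can be checked with a few lines of code. This is a sketch of the rules stated on the slide only (GN up to 20 characters, GI an integer index); it is not the `myconvert` parser, and the function name and the decision to return `None` for non-group lines are assumptions made for illustration.

```python
MAX_GROUP_NAME = 20  # "up to 20 characters" per the slide


def parse_group_key(line):
    """Return ('GN', name) or ('GI', index) for a group key line.

    Lines that do not start with a group key followed by a value
    yield None.
    """
    parts = line.split(None, 1)
    if len(parts) != 2:
        return None
    key, value = parts[0], parts[1].strip()
    if key == "GN":
        if len(value) > MAX_GROUP_NAME:
            raise ValueError(f"group name longer than {MAX_GROUP_NAME} characters")
        return key, value
    if key == "GI":
        return key, int(value)  # raises ValueError if not a number
    return None


if __name__ == "__main__":
    for text in ("GN List-A", "GI 5", "DE some file note"):
        print(repr(text), "->", parse_group_key(text))
```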

20 MSS File Keys
-- DE - file note or description
-- SD - shared note for multiple files. Insert a line of 'SD ' to record description lines as a shared note
-- LF - local file name for an MSS VSN file name
-- FF - file format, up to 10 characters, e.g. 'FF BI.TAR' means the following files are binary COS-blocked and then tarred
-- TF - data format, up to 10 characters, e.g. 'TF ASC.IMMA' means the following files hold data files in both ASCII and IMMA formats
-- RL - length of each record in a file
-- RN - number (count) of records in a file
-- FMT - format for the description part of file information lines; e.g. 'FMT LF,,DE' means that the description part of each line is split into three columns by the delimiter ','; the first column is the local file name, the second is ignored and the third is the file note
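The FMT rule can be illustrated with a short sketch: a spec such as `LF,,DE` names the columns of the description part, and an empty name means the column is ignored. The function names are hypothetical, and letting the last column keep any further commas is an assumption; the slide does not say how extra delimiters are handled.

```python
def parse_fmt(spec):
    """Turn an FMT spec like 'LF,,DE' into ['LF', None, 'DE'].

    Empty column names become None, meaning "ignore this column".
    """
    return [name.strip() or None for name in spec.split(",")]


def apply_fmt(columns, description, delimiter=","):
    """Split a description into the named columns of an FMT spec.

    Assumption: the final column absorbs any remaining delimiters, so a
    file note may itself contain commas.
    """
    values = description.split(delimiter, len(columns) - 1)
    return {
        key: value.strip()
        for key, value in zip(columns, values)
        if key is not None
    }


if __name__ == "__main__":
    columns = parse_fmt("LF,,DE")
    # The description text here is made up for the demonstration.
    print(apply_fmt(columns, "local.tar, skipped, a note, with a comma"))
```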

21 Example of ds540.1.sccs p- TF BINARY FF TAR FMT LF, GN MSGSTD2 p- MSG, Standard Statistics, 2x2, py61279 stdg tar, less than.5 MB py61281 stdg tar, less than.5 MB …. FF Z.TAR GN MSGSTD2_R2.2 p- MSG, Standard Statistics, 2x2, release 2.2, p- Release 2.2 is exclusively new data for For convenience, data from p- Group ID MSGSTD2 for have been included in these files. py88809 msg_2deg/stdg tar, MB by88810 b/u y88809 py88811 msg_2deg/stdg tar, MB by88812 b/u y88811 ….

22 Snapshots of Web Display …..

23 Publish MSS/Web Filelists
publish_filelist [-t] dsnnn.n
-- Set the flag for using RDADB to P or I before publishing a dataset (Y is fine for option -t to publish test filelists)
-- Generates an html file of the MSS public filelist in the root directory of the given dataset
-- Generates an html file of the MSS internal filelist in the internal dataset directory
-- Generates html index files of RDA Server data in the data and sub-data directories of the given dataset, unless manually created index html files already exist
-- All html filelist files are built dynamically and cached
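The flag rule on this slide (P or I to publish, Y acceptable only with `-t`) can be expressed as a guard in front of the command. This is a sketch under assumptions: how the RDADB flag is actually read is not described in the slides, so it is passed in as a plain argument, and the function name is illustrative.

```python
def publish_filelist_argv(dataset, rdadb_flag, test=False):
    """Build a publish_filelist argument list after checking the flag.

    P or I is required for a normal publish; Y is also accepted when
    publishing test filelists with -t.
    """
    allowed = {"P", "I"}
    if test:
        allowed.add("Y")
    if rdadb_flag not in allowed:
        raise ValueError(
            f"RDADB flag {rdadb_flag!r} does not allow publishing {dataset}"
        )
    argv = ["publish_filelist"]
    if test:
        argv.append("-t")
    argv.append(dataset)
    return argv


if __name__ == "__main__":
    print(publish_filelist_argv("ds540.1", "P"))
    print(publish_filelist_argv("ds540.1", "Y", test=True))
```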

24 Update Dataset With New Data
-- Run DSARCH to archive new data files onto MSS and/or RDA server
-- There is no need to republish filelists
-- Filelists that include the newly archived data files are re-cached automatically when users access the filelist web pages