Metadata and Data Management activities at CSIRO Marine Research, Australia Kim Finney & Tony Rees Divisional Data Centre CSIRO Marine Research, Hobart.


The Australian MarLIN Connection
Great Minds Think Alike!
– Almost simultaneous emergence of UK and Australian MarLIN projects.
– Different emphases but many overlapping problems.
Why Are We Here?
– To exchange ideas, make some new data friends and, hopefully, build on UK developments that can also address Australian marine data issues.
Who Are We?
– CSIRO Division of Marine Research (CMR), an Australian Commonwealth Government research agency with approximately 300 staff. One of a number of such agencies (others include AIMS and GBRMPA).

Orientation information...
[Map and photographs: RV Franklin, oceanographic research vessel; FRV Southern Surveyor, fisheries research vessel; CMR; 16 million km² ocean territory]

CMR - Data Centre
Established in 1997
– 12 staff (multidisciplinary),
– services the Division and two ships,
– focal point for promoting a data management culture within CMR.
Data Management Strategy
– developed in 1997,
– outlines actions that CMR must take to move its data management practices into the 21st century,
– covers policy, technology issues, data handling procedures and standards development/adoption; available on the Data Centre web site.

What Are Some Of The Issues We Face?
– Corporate knowledge of datasets held (internally & externally sourced).
– Purchase & sharing of externally sourced data.
– Access to & re-use of data generated by individuals.
– Data archiving for re-use.
– Coordination of external data exchange/data provision.
– Data pricing policies.
– Divisional use of WWW & database technology.
– Conformance with national & international standards (data exchange, data processing, data documentation).
– Contribution to national data management issues & activities.
– Data management tools (availability, development for re-use, divisional software libraries).
– Integration of data, records, publications and financial systems.

What Is Our Approach?
[Diagram: Divisional Data Policies; E.Commerce Module; Data Licensing Module; Basic WWW Metadata Directory; Hyperlinked Data Files; Hyperlinked Publications; Hyperlinked Databases; Standards]

Development Of CMR’s Research Database
[Diagram, approach: Divisional Data Policies; E.Commerce Module; Data Licensing Module; Basic WWW Metadata Directory; Hyperlinked Data Files; Hyperlinked Publications; Hyperlinked Databases; Standards]
[Diagram, architecture: Client (Java Applet, or Browser); Network Protocol (RMI, HTTP); Server; Servlet (Database Access Program); JDBC; ORACLE Database; Spatial Option]
[Diagram, database contents: Video Data; Catalogue; Conceptual/Physical Deployment; Project Information; Model Sources; GIS Sources; Device Sources; Time Series Data Types; Profile Data Types; Photo Data; Catch Data; Model Data; Image Data; Meteorological Data; Sediment Data; Sample Data. Legend: Yet to be included]

[Diagram, database contents: Video Data; Catalogue {long table indexing all features in the database}; Conceptual/Physical Deployment; Project Information; Model Sources; GIS Sources; Device Sources; Time Series Data Types; Profile Data Types; Photo Data; Catch Data; Model Data; Image Data; Meteorological Data; Sediment Data; Sample Data]

Concluding Remarks

MarLIN - Marine Laboratories Information Network and CAAB - Codes for Australian Aquatic Biota

Situation at CMR pre-MarLIN
[Diagram: Centrally-held data; Derived products; CMR-produced reference works & guides; Scientific publications; project/voyage/person details; Supporting information; CAAB taxonomic database; Externally sourced data; Indexes and catalogues; Dispersed data (numerous dispersed resources)]

MarLIN metadatabase as at July 1999: showing pointers/links to ( ) or information sourced from ( )
[Diagram: Centrally-held data; Derived products; CMR-produced reference works & guides; Scientific publications; project/voyage/person details; Supporting information; CAAB taxonomic database; Externally sourced data; Indexes and catalogues; Dispersed data]

MarLIN design questions...
How to make data querying, entry and maintenance easily user-accessible (but maintain metadata standards)?
– use www interfaces, but moderate user entries and updates
What information to store, in what manner?
– use ANZLIC and “Blue Pages” elements, plus additional ones as deemed useful for Divisional needs
What metadata standards, thesauri, etc. to follow?
– mostly follow ANZLIC & “Blue Pages”, with some extensions & replacements
How to handle taxon-level information?
– store taxonomic codes in MarLIN, referenced to scientific and common names from Division’s “CAAB” taxonomic database
What about subject-based searching?
– use “MarLIN subject categories”, developed from ASFA (R) scheme

MarLIN metadatabase implementation
Oracle database, with www front end and HTML forms/Java interfaces
– www used for searching and metadata submission/metadata update, also for most administrative functions
Relational design
– aspects common to numerous records (e.g. project, voyage, person information) stored in separate tables
Data entry and update is via user logon (restricted to users on CMR computer domain)
– enterer details, time, etc. are automatically logged and added to record on submission
“Submitted” records reside in separate (parallel) tables until approved by database administrator
Nightly script runs to generate CMR’s “Blue Pages” entries from MarLIN metadata records
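The submission and approval flow described on this slide can be sketched roughly as follows. This is only an illustration, not the MarLIN schema: it uses Python's built-in sqlite3 module as a stand-in for Oracle, and every table, column and function name (project, dataset_submitted, dataset_approved, submit, approve, search) is a hypothetical choice made for the example.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical column set shared by the "submitted" and "approved" tables.
DATASET_COLUMNS = """
    dataset_id INTEGER PRIMARY KEY,
    title      TEXT NOT NULL,
    project_id INTEGER REFERENCES project(project_id),
    entered_by TEXT NOT NULL,
    entered_at TEXT NOT NULL
"""

def open_db():
    """Create an in-memory database with shared and parallel tables."""
    conn = sqlite3.connect(":memory:")
    # Information common to many records lives in its own table.
    conn.execute(
        "CREATE TABLE project (project_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )
    # Two parallel tables: one for pending entries, one for approved ones.
    for table in ("dataset_submitted", "dataset_approved"):
        conn.execute(f"CREATE TABLE {table} ({DATASET_COLUMNS})")
    conn.commit()
    return conn

def submit(conn, title, project_id, user):
    """Store a new record as 'submitted', logging who entered it and when."""
    stamp = datetime.now(timezone.utc).isoformat()
    cur = conn.execute(
        "INSERT INTO dataset_submitted (title, project_id, entered_by, entered_at) "
        "VALUES (?, ?, ?, ?)",
        (title, project_id, user, stamp),
    )
    conn.commit()
    return cur.lastrowid

def approve(conn, submitted_id):
    """Move a record from the submitted table to the approved table."""
    with conn:  # commits on success, rolls back on error
        row = conn.execute(
            "SELECT title, project_id, entered_by, entered_at "
            "FROM dataset_submitted WHERE dataset_id = ?",
            (submitted_id,),
        ).fetchone()
        if row is None:
            raise KeyError(submitted_id)
        cur = conn.execute(
            "INSERT INTO dataset_approved (title, project_id, entered_by, entered_at) "
            "VALUES (?, ?, ?, ?)",
            row,
        )
        conn.execute(
            "DELETE FROM dataset_submitted WHERE dataset_id = ?", (submitted_id,)
        )
        return cur.lastrowid

def search(conn, word):
    """Search approved records only; pending entries stay invisible."""
    rows = conn.execute(
        "SELECT title FROM dataset_approved WHERE title LIKE ? ORDER BY dataset_id",
        (f"%{word}%",),
    )
    return [r[0] for r in rows]
```

Keeping pending entries in a separate table means the search query never needs a status filter, which is one plausible reason for the parallel-table design; a single table with a status column would be an equally reasonable alternative.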

MarLIN metadata elements
# = “Blue Pages” extension to ANZLIC standard, * = new element added for MarLIN
Dataset...: Title; * Identifier/Short Title; # Data Type; Custodian Organisation; * Contributors; * Acknowledgements; # References; * Publication Date; Abstract; * Author's Comments; On-Line Links (Data, Graphics, Documentation); Location Keywords; Bounding Coordinates
Subject Categories and Search Words: * MarLIN Subject Categories; # Habitat Keywords; # Taxonomy Keywords; * CAAB Species Codes; # Parameters Measured; # Equipment Used; # Blue Pages Themes; ANZLIC Search Words
Project, vessel and voyage details: # Originating Project Name; * Project Details; # Platform/Vessel Name; * Voyage Identifier; * Voyage Details
Data Currency and Status: Date range (Beginning and End Dates); Progress; Maintenance
Data Access: Stored Data Format(s); * Stored Data Volume; * Stored Data Location; * Specific Data Location; * Specific Software Requirements; * Stored Data Documentation; Available Format Type(s); Access Constraints
Data Quality: Data Source, Processing, and Quality Control; * GIS Datum and scale used (if relevant); Logical Consistency Report; Positional Accuracy; Parameter Accuracy; Completeness
Contact point: Contact Person and Details
Metadata Information: * Related MarLIN Datasets; Additional Metadata; * Metadata Availability; Metadata Created On/By... (date, person); * Metadata Last Updated On/By... (date, person)

Aspects of MarLIN “Search” interface...

Example search results: lists of titles; summary information; links to voyage tracks

External MarLIN linkages (July 1999)
[Diagram: Hyperlinks to documents, data, etc.; Selected details exported to...; Online link back to...; Internet search engines; “Blue Pages” HTML documents (many organisations’ records); MarLIN database (CMR’s records); Blue Pages search facility; MarLIN search facility]

MarLIN continuing development...
– Incorporate “live” links to other databases, e.g. CAAB, CMR corporate databases, library systems
– Increase data coverage; try to maintain currency and consistency of entries
– Continue to “sell the concept” for users to document their own data
– Make a “view” of MarLIN records visible to ASDD
– Possible future links with metadata systems based on other standards, using “crosswalks”
– MarLIN v.2 to be developed in c. 12 months, closely integrated with new Divisional data storage system (with parallel development of interfaces etc., automated retrieval of data as well as metadata)
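A “crosswalk” between metadata standards is essentially an element-to-element mapping table. The sketch below shows one minimal way such a mapping could be applied. The left-hand names are taken from the MarLIN metadata elements slide; the right-hand target names, the CROSSWALK table and the apply_crosswalk function are hypothetical and do not represent any particular standard or the Data Centre's actual approach.

```python
# Hypothetical crosswalk: MarLIN element name -> element name in some other scheme.
CROSSWALK = {
    "Title": "title",
    "Abstract": "description",
    "Custodian Organisation": "publisher",
    "Access Constraints": "rights",
}

def apply_crosswalk(record, crosswalk):
    """Translate a metadata record using an element-name mapping.

    Returns the translated record plus a sorted list of source elements
    that have no counterpart in the target scheme, so that nothing is
    dropped silently.
    """
    mapped = {}
    unmapped = []
    for element, value in record.items():
        target = crosswalk.get(element)
        if target is None:
            unmapped.append(element)
        else:
            mapped[target] = value
    return mapped, sorted(unmapped)

# Example record with made-up values.
example = {
    "Title": "Example dataset",
    "Abstract": "Made-up abstract text.",
    "Voyage Identifier": "EX0001",
}
translated, left_over = apply_crosswalk(example, CROSSWALK)
```

Reporting the unmapped elements matters in practice because extensions added for local needs (such as the voyage details here) often have no equivalent in the target standard.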

MarLIN present ( ) and future ( ) operation
[Diagram: Centrally-held data; Derived products; CMR-produced reference works & guides; Scientific publications; project/voyage/person details; Supporting information; CAAB taxonomic database; Externally sourced data; Indexes and catalogues; Dispersed data]

CAAB Codes for Australian Aquatic Biota

Example CAAB codes (hammerhead sharks) (dogfishes)

CAAB rationale/historic reasons for existence
– Taxonomists needed a tool for organising specimen collections and supporting information
– Field biologists needed a tool for rapid data entry (to include categories corresponding to “non orthodox groups”)
– Data custodians needed a system for storing taxon-related information in a long-term, stable form (independent of future name changes)
– Use of “intelligent” codes permits rapid human- or computer-based sorting of taxa, and retrieval of supporting information

CAAB implementation
CAAB has 47 “major categories” (e.g. fish, mammals, Algae - Phaeophyta, angiosperms), each with up to 999,999 available codes for allocation to Australian aquatic taxa
Coverage of Australian fish species (c. 4,500) is essentially complete, also some smaller groups (marine reptiles and mammals)
Other categories - populated on “as needs” basis (e.g. 300 molluscs, 350 crustaceans, 60 angiosperms - plus ongoing additions)
2-digit prefix (category code) and 3-digit family code are machine-sortable - e.g.:
– 37 = fish = fish family = fish family 1 species 1
– families are in contiguous blocks, e.g. families to are all types of sharks
Numeric code is attached to taxon, independent of changes of scientific or common name (gives relative stability for data storage)
Master CAAB database stores taxon/voucher specimen details, present and any previous scientific names, common names, comments and other information
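To illustrate why such “intelligent” codes sort well by machine, here is a minimal sketch of splitting a code into its parts. It assumes an 8-digit layout: the 2-digit category and 3-digit family come from the slide, while the 3-digit species suffix is inferred from the 999,999 codes available per category. The code values used in the comments and the names CaabCode, parse_caab, format_caab and in_family_range are made up for the example.

```python
from typing import NamedTuple

class CaabCode(NamedTuple):
    category: int  # 2-digit major category, e.g. 37 = fish
    family: int    # 3-digit family number within the category
    species: int   # assumed 3-digit species number within the family

def parse_caab(code: str) -> CaabCode:
    """Split a code such as '37 001001' into category, family and species."""
    digits = code.replace(" ", "")
    if len(digits) != 8 or not (digits.isascii() and digits.isdigit()):
        raise ValueError(f"expected 8 digits, got {code!r}")
    return CaabCode(int(digits[:2]), int(digits[2:5]), int(digits[5:]))

def format_caab(code: CaabCode) -> str:
    """Render a parsed code back into the 'CC FFFSSS' text form."""
    return f"{code.category:02d} {code.family:03d}{code.species:03d}"

def in_family_range(code: str, category: int, first: int, last: int) -> bool:
    """True if the code falls in a contiguous block of families."""
    parsed = parse_caab(code)
    return parsed.category == category and first <= parsed.family <= last

# Illustrative, made-up codes: sorting by the parsed tuple groups taxa by
# category first, then family, then species.
codes = ["37 010002", "28 005001", "37 001001"]
ordered = sorted(codes, key=parse_caab)
```

Because families sit in contiguous blocks, a range test like in_family_range is enough to select a whole higher-level group without consulting a lookup table.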

Present usage of CAAB information
[Diagram: CMR-produced reference works & guides; CAAB taxonomic database; CAAB-generated species lists; Other organisations’ databases; CMR databases (including MarLIN); used in...; generates...; Quoted in...]

Intended future CAAB operation
[Diagram: Links to on-line information; CAAB taxonomic database; CAAB species lists - on-line generation; CAAB www interface; CAAB taxon-level report; Additional search facilities - e.g. MarLIN, other CMR databases, ITIS, www, etc.; Users’ databases]

CAAB continuing tasks...
– Taxon-level information from other local databases to be incorporated into CAAB (coverage will gradually be extended to most groups of aquatic organisms)
– Database structure will be improved to suit external www user access to the database
– Species common names to be handled in a structured way, permitting user-definable output formats, more comprehensive searching, etc.
– Hyperlinks will be incorporated to electronic versions of maps, images, etc., as available
– On-line links to other databases from CAAB will be enabled (and vice versa)

Selected data and metadata developments elsewhere in Australia
– On-line data, data products, and summaries
– Collection-based information
– On-line references
– Other metadata systems