GEM METADATA DEVELOPMENT
Xiaoping Wang, Macrosearch
Allen Macklin, PMEL
Bernard Megrey, AFSC

TOPICS
- Introduction to the EML metadata standard
- GEM metadatabase development
- Advantages of Oracle Database

GOAL
Generate EML-compliant metadata documents for the datasets of interest to GEM (Gulf of Alaska Ecosystem Monitoring, a program of the Exxon Valdez Oil Spill Trustee Council).

WHAT IS EML
- Stands for Ecological Metadata Language.
- Exists as a set of XML Schema documents.
- Allows metadata elements to be expressed in a structured form.

ADVANTAGES OF EML
- Includes almost all metadata elements covered by other metadata standards.
- Can be used in a modular and extensible manner.
- Can be used to describe:
  - Dataset
  - Literature
  - Software
  - Protocol

USE OF DATASET MODULE
- Data table
- Spatial raster
- Spatial vector
- Stored procedure
- Other entity

METADATA ELEMENTS (1)
General Information
- Dataset title, abstract and purpose
- Data creator(s), metadata provider(s) and contact information
- Keywords
- Data maintenance
- Data distribution
- Geographic/time coverage

METADATA ELEMENTS (2)
Research Project Information
- Project title and description
- Participants and their roles
- Funding sources
- Study area description
- Design description

METADATA ELEMENTS (3)
Method Information
- Method description
- Sampling
- Instruments
- Software

METADATA ELEMENTS (4)
Data Information
- Table name and description
- Attribute name and definition
- Attribute domain code and definition
- Data unit
- Data precision
- Missing value code
- Accuracy

METADATABASE DEVELOPMENT (1)
Database Table Design
- Main table: one row for each dataset
- Other tables: one or more rows for each dataset
  - Keywords
  - Personnel
  - Data tables
  - Table attributes
  - Attribute domain codes
  - Instruments

METADATABASE DEVELOPMENT (2)
Integrity Constraints
- Primary key: dataset record ID in the main table
- Foreign key: dataset record ID in the other tables
- Check constraints: allowed values of EML elements
- NOT NULL constraints: mandatory EML elements

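The four constraint kinds listed above can be illustrated with a small in-memory Java sketch. This is not the Oracle schema: in the metadatabase the constraints are enforced by the database itself. The class name `MetadatabaseSketch`, the personnel columns, and the allowed role values are assumptions made up for the illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class MetadatabaseSketch {
    // Hypothetical allowed values for a personnel role (stands in for a CHECK constraint).
    static final Set<String> ALLOWED_ROLES =
            new HashSet<>(Arrays.asList("creator", "metadataProvider", "contact"));

    // Main table: one row per dataset, keyed by the dataset record ID.
    private final Map<Integer, String> datasetTitles = new HashMap<>();
    // Child table: any number of rows per dataset, each carrying the dataset record ID.
    private final List<String[]> personnel = new ArrayList<>();

    void addDataset(int datasetId, String title) {
        if (title == null) {
            throw new IllegalArgumentException("NOT NULL violated: title is mandatory");
        }
        if (datasetTitles.containsKey(datasetId)) {
            throw new IllegalArgumentException("primary key violated: duplicate dataset ID " + datasetId);
        }
        datasetTitles.put(datasetId, title);
    }

    void addPersonnel(int datasetId, String surname, String role) {
        if (surname == null) {
            throw new IllegalArgumentException("NOT NULL violated: surname is mandatory");
        }
        if (!datasetTitles.containsKey(datasetId)) {
            throw new IllegalArgumentException("foreign key violated: no dataset with ID " + datasetId);
        }
        if (!ALLOWED_ROLES.contains(role)) {
            throw new IllegalArgumentException("check violated: role not allowed: " + role);
        }
        personnel.add(new String[] {String.valueOf(datasetId), surname, role});
    }

    // Counts the child rows that reference one dataset record ID.
    int personnelCount(int datasetId) {
        int count = 0;
        for (String[] row : personnel) {
            if (row[0].equals(String.valueOf(datasetId))) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        MetadatabaseSketch db = new MetadatabaseSketch();
        db.addDataset(1, "Example dataset");
        db.addPersonnel(1, "Surname", "creator");
        try {
            db.addPersonnel(2, "Surname", "creator"); // no dataset 2 exists
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Each rejected call corresponds to one constraint on the slide, so a row that fails here is a row the database would refuse to store.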
METADATABASE DEVELOPMENT (3)
Stored Procedures
- Handle repeated database operations
- Load large text files into the database

METADATA FILE GENERATION
- Java program development
  - Read data from the metadatabase
  - Generate EML-compliant metadata files
- Validate metadata files against the EML schemas
  - No XML errors
  - No EML errors

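The generation step can be sketched with the XML classes in the Java standard library. This is a minimal illustration, not the GEM program itself. The class `EmlSketch`, the `system` value, and the choice of fields are assumptions. The element names and the namespace URI follow the EML 2.0.1 dataset module as understood here and should be checked against the EML version in use. The metadatabase read is replaced by method arguments, and the check covers only the "no XML errors" step (well-formedness). Validating against the EML schemas would additionally require the schema files.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

final class EmlSketch {
    // Assumed namespace URI for EML 2.0.1; change it to match the EML version in use.
    static final String EML_NS = "eml://ecoinformatics.org/eml-2.0.1";

    /** Builds a minimal EML-style document from values that would come from the metadatabase. */
    static String buildEml(String packageId, String title, String creatorSurname,
                           String abstractText, List<String> keywords) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().newDocument();

            Element root = doc.createElementNS(EML_NS, "eml:eml");
            root.setAttributeNS("http://www.w3.org/2000/xmlns/", "xmlns:eml", EML_NS);
            root.setAttribute("packageId", packageId);
            root.setAttribute("system", "gem-sketch"); // hypothetical system identifier
            doc.appendChild(root);

            Element dataset = child(doc, root, "dataset", null);
            child(doc, dataset, "title", title);
            surname(doc, child(doc, dataset, "creator", null), creatorSurname);
            child(doc, child(doc, dataset, "abstract", null), "para", abstractText);
            Element keywordSet = child(doc, dataset, "keywordSet", null);
            for (String keyword : keywords) {
                child(doc, keywordSet, "keyword", keyword);
            }
            surname(doc, child(doc, dataset, "contact", null), creatorSurname);

            // Serialize the DOM tree; the transformer escapes special characters.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.INDENT, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not build EML document", e);
        }
    }

    // Appends an unqualified child element, optionally with text content.
    private static Element child(Document doc, Element parent, String name, String text) {
        Element element = doc.createElementNS(null, name);
        if (text != null) {
            element.setTextContent(text);
        }
        parent.appendChild(element);
        return element;
    }

    // Adds individualName/surName under a creator or contact element.
    private static void surname(Document doc, Element party, String surname) {
        child(doc, child(doc, party, "individualName", null), "surName", surname);
    }

    /** Parses XML text; throws IllegalArgumentException if it is not well-formed. */
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new DefaultHandler()); // rethrows fatal errors without printing
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("not well-formed XML", e);
        }
    }

    static boolean isWellFormed(String xml) {
        try {
            parse(xml);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String xml = buildEml("example.1.1", "Example dataset title", "Surname",
                "Example abstract.", Arrays.asList("keyword one", "keyword two"));
        System.out.println(xml);
        System.out.println("well-formed: " + isWellFormed(xml));
    }
}
```

Building the document through the DOM rather than by string concatenation means characters such as `&` in a title are escaped automatically, which removes one common source of XML errors.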
ORACLE DATABASE
Advantages
- Can be used on multiple platforms (Windows, Unix, and Linux…)
- Has the best security features
- Has the highest availability and reliability
- Has a powerful language (PL/SQL) for data query and manipulation
Disadvantage
- More expensive

FUNDAMENTAL DATA SECURITY REQUIREMENTS
- Confidentiality: users can see only the data that they are supposed to see.
- Integrity: data is protected from deletion and corruption.
- Availability: data is available to authorized users without delay.

DATA AVAILABILITY (1)
Real Application Clusters

DATA AVAILABILITY (2)
Replication

DATA AVAILABILITY (3)
Data Guard

DATA AVAILABILITY (4)
Streams

DATA MANAGEMENT
- Database management: data storage
- Metadata management: data documentation
- Data availability: online data sharing