PDS4 Data Standards IPDA - JAXA July 22, 2015. Topics Schedule and Status of the PDS4 Information Model (IM) Data Design Working Group and Task Statuses.


1 PDS4 Data Standards IPDA - JAXA July 22, 2015

2 Topics
Schedule and Status of the PDS4 Information Model (IM)
Data Design Working Group and Task Statuses
Other work

3 Schedule
Released IM V1.4.0.0 for Build 5b Integration and Test on March 31, 2015.
Staged IM V1.4.1.0 as a point release within Build 5b, late July.
Supporting requirements for LADEE, MAVEN, InSight, OSIRIS-REx, BepiColombo, ExoMars, and JUICE.

4 IM Status
The common model remains stable.
One new product has been added, Product_Ancillary.
The majority of changes involve Schematron rules and new permissible values.
Release V1.4.1.0 has been staged, and CCB-approved SCRs and bug fixes are being implemented.
Key discipline dictionaries (cartography and geometry) are under review.
The international community is engaged.

5 Information Model Release Process

6 Approved changes implemented for Information Model V1.4.0.0
CCB Approved Changes:
  CCB-83 - Align IM and SR Base Types for ASCII_NonNegative_Integer
  CCB-84 - Provide Definitions for Enumerated Values
  CCB-85 - Improve Attribute and Enumerated Value Definitions (Simple List)
  CCB-88 - Validation that counts match occurrences
  CCB-96 - Include the Array class in File_Area_Observational, File_Area_Observational_Supplemental, and File_Area_Browse
  CCB-99 - Remove Array_1D from model
  CCB-101 - Remove ASCII_Date, ASCII_Date_Time and ASCII_Date_Time_UTC from the Field classes
  CCB-102 - Change Cardinality of Axis_Array in Array and Its Subclasses
  CCB-105 - Add name and description to Group class
  CCB-109 - Add “Plasma Stream” to Target.type permissible values list and revise the Solar Wind context product
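
CCB-88 asks that declared counts match the number of occurrences. The slides do not show the rule itself, so the Python sketch below is only an illustration of the idea on a simplified, namespace-free label fragment. It assumes a record class that declares `fields` and `groups` counts next to its child field and group elements; the helper name `count_mismatches` and the sample label are hypothetical, and the code is a stand-in for the Schematron rule, not a copy of it.

```python
import xml.etree.ElementTree as ET


def count_mismatches(record, pairs):
    """Compare declared counts with the number of matching child elements.

    `pairs` is a list of (count_tag, child_tag) tuples, e.g.
    ("fields", "Field_Character"). Returns one message per mismatch.
    """
    problems = []
    for count_tag, child_tag in pairs:
        # A missing count element is treated as a declared count of zero.
        declared = int(record.findtext(count_tag, default="0"))
        actual = len(record.findall(child_tag))  # direct children only
        if declared != actual:
            problems.append(
                f"{count_tag}={declared} but found {actual} <{child_tag}>"
            )
    return problems


# Hypothetical, simplified label fragment: declares 3 fields, contains 2.
SAMPLE_RECORD = """<Record_Character>
  <fields>3</fields>
  <groups>0</groups>
  <Field_Character><name>A</name></Field_Character>
  <Field_Character><name>B</name></Field_Character>
</Record_Character>"""

CHECKS = [("fields", "Field_Character"), ("groups", "Group_Field_Character")]
sample_problems = count_mismatches(ET.fromstring(SAMPLE_RECORD), CHECKS)
```

In this sample only the `fields` count is reported, because the declared `groups` count of zero matches the absence of group elements.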

7 Approved changes implemented for Information Model V1.4.0.0
Bug Fixes:
  Lower case JSON exported file extension
  Change cardinality of Document_Edition from 0..* to 1..* for Document
  Filter out ANAME meta-attribute from Product_Attribute_Definitions
  Replace angle brackets "<" with escape sequence in product_attribute_definition
  Modify the schemaLocation attribute in product_attribute_definition
  Clean up IM to remove all references to MD5 software. Only the algorithm should be referenced.
  Clean several ASCII_Date* and ASCII_Date_Time* patterns
  Fix LID for Product_*_Definition; lowercase LID and add bundle:context
  OPS – Added Property_Map to Ingest_LDD to support PDS4 Query Models and the auto-generation of SISes

8 Approved changes implemented for Information Model V1.4.1.0
CCB Approved Changes:
  CCB-78 - Formalize List of Acceptable Supplemental Formats for Document_File.document_standard_id
  CCB-77 - Augment Product Update with
  CCB-87 - Fix omission of Excel from list of Acceptable Supplemental formats for Document_File.document_standard_id (Prior to 1.4.0.0 ?)
  CCB-91 - Add new legal values for instrument type
  CCB-106 - Move Description Attribute to Field from Its Subclasses
  CCB-110 - Updates to Appendix F of the DPH -- context product LIDs
  CCB-111 - Add Product_Ancillary
  CCB-112 - Request to add "Observation Geometry" to "purpose" value set in "Primary_Result_Summary"
  CCB-114 - Poorly defined record_delimiter in Table_Binary
  CCB-115 - Undefined unit "SFU"
  CCB-116 - instrument_host_version_id is undefined
  CCB-117 - Change collection inventory file extension from ".tab" to ".csv" in PDS4 documents
  CCB-118 - Error in Generating Data Dictionary from Information Model
  CCB-119 - Cannot supply context information for Product_XML_Schema
  CCB-120 - Bug fixes for point release Version 1.4.1.0

9 Approved changes implemented for Information Model V1.4.1.0
Bug Fixes:
  Added subclasses to Data_Object (e.g., Ancillary_Object) for use in the decision tree for product types
  Added {Information_Model_Version, LDD_Version_Id} to the Product_XML_Schema LID
  Fixed descriptions for Software_Binary and Software_Source; both used "Script"
  Augmented Product_Service with the Service class
  Added Information_Model_Version to the LID for Product_Attribute_Definition and Product_Class_Definition
  Added "esa:psa" as allowed tokens in PDS4 product LIDs
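
Two of the LID fixes listed for these releases (lowercase LIDs, and "esa:psa" as allowed tokens) can be pictured with a small Python check. This is a sketch under stated assumptions, not the IM's validation rule: it assumes LIDs are colon-separated and begin `urn:<agency>:<archive>:`, with `nasa:pds` as the pre-existing token pair, and the function name `check_lid` and the example identifiers are hypothetical.

```python
# Token pairs accepted after the "urn" prefix. "esa:psa" is the pair named on
# the slide; "nasa:pds" is assumed here to be the pre-existing pair.
ALLOWED_AUTHORITIES = {("nasa", "pds"), ("esa", "psa")}


def check_lid(lid):
    """Return a list of problems found in a logical identifier (LID)."""
    problems = []
    if lid != lid.lower():
        problems.append("LID must be lowercase")
    parts = lid.lower().split(":")
    if len(parts) < 4 or parts[0] != "urn":
        problems.append("expected urn:<agency>:<archive>:<bundle>[:...]")
    elif (parts[1], parts[2]) not in ALLOWED_AUTHORITIES:
        problems.append(f"unrecognised tokens {parts[1]}:{parts[2]}")
    return problems
```

For example, `check_lid("urn:esa:psa:example_bundle")` returns an empty list, while an identifier containing uppercase letters is reported as a lowercase violation.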

10 International Interactions
IPDA/PSA provides excellent review comments for PDS4 standards.
PDS4 Implementation Project:
  Review PDS4 Data Standards documentation
  Analysis of existing validation rules
  Identification and analysis of guidelines for consistent use of keywords within PDS4-compliant products (e.g. CDF, NetCDF, FITS)

11 Documents (posted to http://pds.nasa.gov/pds4)
Standards Documents:
  Information Model Specification – Version 1.4.0.0 - (1.4.1.0)
  XML Schemas – Version 1.4.0.0 - (1.4.1.0)
  Data Dictionary – Version 1.4.0.0 - (1.4.1.0)
  PDS4 Example Products - Version 1.4.0.0
  Standards Reference - Version 1.3.0 – (1.4.0)
Support Documents:
  DD Tutorial
  Glossary
  Concepts Document
  Data Provider’s Handbook
  PAG (Proposers Archive Guide)

12 DDWG Status and Tasks
The DDWG meets for one-hour teleconferences on alternate weeks (these have been known to exceed the allocated time).
Agenda:
  Research, additional study, prototyping, and proofs of concept
  Discipline Team Tasks
Tasks:
  Geometry (flyby and orbital missions) – in Review
  Cartography – in Review
  NSSDCA/PDS4 Interface – in Testing
  Metadata Consistency – in Implementation
  Document Improvement – on-going

13 Task - Geometry
Team lead and members: E. Guinness, M. Gordon, A. Raugh, C. Isbell, S. McLaughlin, B. Semenov, C. Acton, E. Rye, S. Hughes
Short Description: Capture geometry requirements from across the disciplines, obtain a consensus model, and write a geometry dictionary. Currently focusing on classes for flyby/orbital and landed missions.
Goals: Cross-discipline Geometry Model and Dictionary
Schedule (Major milestones):
  Done - Requirements captured in white paper
  April - Release of Geometry dictionary for orbital and landed missions
  Post-April – Review of dictionary by DDWG and IPDA

14 Task - Cartography
Team leads: C. Isbell, E. Rye, [R. Joyner]
Short Description/Goals: Capture planetary cartography requirements and definitions across disciplines and projects. Produce a cross-discipline Cartography dictionary compliant with the PDS4 IM.
Schedule (Major milestones):
  Done - Initiated Federal Geographic Data Committee (FGDC) based model, completed initial testing and sample product labels, extended FGDC model for Planetary (v1.2.0.0), presented status to IPDA
  Recent - Additional extensions, then v1.4.0.0 release (14 April 2015) for next-level testing and review
  Next - As part of review/testing, receive review input, coordinate with Geometry team, update/add attributes and enumerations
  May 2015 - Complete current review, implement updates
  June 2015 - Release first operational version for wide, full review

15 Task - NSSDCA/PDS4 Interface
Team lead and members: McLaughlin, Bell, McCaslin, Kodis, Hughes, Hardman, Joyner, Guinness, Huber, Isbell, King, Raugh
Short Description & Goals:
  Develop an interface for transferring PDS4 data to the NSSDCA deep archive
  Automate deliveries; eliminate multiple ingests of products
  Provide the ability to return basic products, collections, and bundles from the deep archive; report on those entities
  Perform data integrity checks
Schedule (Major milestones): Testing continues.

16 Task – Metadata Consistency
Team lead and members: S. Hughes, J. Mafi, E. Guinness, S. Hardman, M. Martin, A. Raugh, M. Gordon
Short Description: Define explicit set(s) of attributes to be used as search parameters for products within the PDS4 archive. Note: an implicit set of search parameters already exists, namely the required attributes for Product_Observational.
Goals:
  Define the PDS4 “query model” template
  Develop and test use cases for selected product types
  Develop recommendations for how the query models are designed and maintained, used to configure system components, used to guide data providers, and used in product label validation
Schedule (Major milestones):
  Past 4 months – A proposed implemented solution was presented at the Planetary Data Workshop.
  September 2015 – Documentation for Build 6.

17 Task - Document Improvement - 1
Team members: D. Simpson, M. Gordon, R. Joyner
Short Description: Address issues associated with the PDS4 Data Standards documents.
Goals: Consistent data standard documents and information model.
Schedule (continuing work): Support Documents – Planned Reviews and Updates
  DD Tutorial (rewrite but probably not for v1.4)
  Glossary and List of Acronyms and Abbreviations (update)
  Concepts Document (revise for v1.4)
  Data Provider's Handbook (RJ and MG)

18 Task - Document Improvement - 2
Schedule (continuing work): Standards Reference
  Receive comments on SR from IPDA (expected soon)
  Update SR for v1.4

19 Information Model Extracts
Projects have requested extracts of the contents of the PDS4 Information Model in other data definition languages:
  JSON [1] – APPS (AMMOS-PDS Pipeline Service)
  SKOS [2] and OWL [5] – Linked Open Data [3] Project (Bernd Ritschel – GFZ Potsdam)
  RDF [4] – Various requesters
[1] JavaScript Object Notation (JSON) is a lightweight data-interchange format.
[2] Simple Knowledge Organization System (SKOS) is a common data model for sharing and linking knowledge organization systems via the Web.
[3] Linked Open Data (LOD) is a way of publishing structured data that allows metadata to be connected and enriched, so that different representations of the same content can be found, and links made between related resources.
[4] Resource Description Framework (RDF) is a family of World Wide Web Consortium (W3C) specifications used as a general method for conceptual description or modeling of information that is implemented in web resources, using a variety of syntax notations and data serialization formats.
[5] The Web Ontology Language (OWL) is a family of knowledge representation languages or ontology languages for authoring ontologies or knowledge bases.

20 Query Models in Context
[Diagram: the PDS4 Information Model (Requirements, Engineering Specification) shown with the Concepts Document (Big Picture), Standards Reference (Requirements, User Friendly), Data Provider’s Handbook, Data Dictionary (Definitions), XML Schemas (Blueprints), PDS4 Product Labels (Deliverables), Registry Configuration File (Object Descriptions), Registry (Product Tracking and Cataloging), Query Models (Identify Search Attributes), Property_Maps (Identify Search Attributes), and the Search Engine (Apache Solr). Arrow labels: derive, generate, reference, create / validate, instruct, configure, ingest. Legend: Informative Document, Standards Document, File System.]

21 Papers and Presentations
Paper presented at E-SCIENCE 2014 - 10th IEEE International Conference on e-Science: A Scalable Planetary Science Information Architecture for Big Science Data
Paper presented at ADASS XXIV: An Overview of PDS4 Archival File Formats
Several papers/posters presented at AGU 2014
GSAW Conference

22 Summary
The “common” part of the PDS4 Information Model remains stable.
Most changes are of the type expected, for example new permissible values and Schematron fixes.
Good progress is being made on discipline dictionaries.
Special teams focused on one issue are effective.
New focus on the IM’s role in supporting user access.
The international community is engaged.
Comments received (12/6/2014):
  “At least for a closet computer geek like me, the new standard is a thing of beauty.”
  “I must admit that I came away from this particular DDWG with a new appreciation of the Information Model.”

23 Questions and Answers

24 Backup

25 Definition and Questions
Query Model – A subset of attributes from the PDS4 dictionaries, either common or local, to be used as search parameters.
  Used to configure the harvester and search engine
  Can be used as guidance to data providers
  Are mapped to registry slots
  A select few are implemented as facets
Facet – A facet corresponds to an attribute in the PDS4 Information Model. For example the attribute could be mapped to a facet. In the PDS4 search engine a faceted classification system allows users to explore products by applying multiple filters.
Open questions:
  How many query models are needed? Current general search use cases are:
    Product_Observational
    Product_Collection (mission science data collection)
    Product_Document
  How do they guide the data provider?
    Paragraph in Standards Reference
    Schematron Rules
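
The definitions on this slide can be made concrete with a small data-structure sketch in Python. It is illustrative only and does not describe the PDS4 registry or search engine: the class `QueryModel`, the functions `search` and `facet_counts`, and the attribute names in the usage note are all hypothetical. The sketch treats a query model as the set of attributes that may be used as search parameters, with facets as a subset of those attributes that can also be counted for filtering.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class QueryModel:
    """Search attributes for a set of product types (illustrative only)."""
    product_types: tuple
    search_attributes: tuple
    facets: tuple = ()

    def __post_init__(self):
        # Facets are "a select few" of the search attributes.
        extra = set(self.facets) - set(self.search_attributes)
        if extra:
            raise ValueError(f"facets not in query model: {sorted(extra)}")


def search(products, model, **filters):
    """Return products matching every filter; filters must be in the model."""
    unknown = set(filters) - set(model.search_attributes)
    if unknown:
        raise ValueError(f"not a search attribute: {sorted(unknown)}")
    return [p for p in products
            if all(p.get(name) == value for name, value in filters.items())]


def facet_counts(products, model, facet):
    """Count products per value of one facet, as a faceted UI would show."""
    if facet not in model.facets:
        raise ValueError(f"{facet!r} is not a facet of this query model")
    return Counter(p[facet] for p in products if facet in p)
```

A caller might build `QueryModel(("Product_Observational",), ("target_name", "instrument_name"), ("target_name",))` and pass it to `search` together with a list of dictionaries holding harvested attribute values; applying several keyword filters at once corresponds to the multiple-filter exploration the slide describes.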

26 Testing
Oxygen is used to test the generated XML Schemas and Schematron files for validity and well-formedness.
File Difference (Ultra-Edit Compare) is used to identify differences between current and prior versions of generated files after updates:
  XML Schema, Schematron, Information Model Specification, ISO/IEC-11179 Data Dictionary files
Regression tests are performed against a suite of example products.
Regression tests are performed against selected node bundles (Imaging, Atmospheres).
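
The file-difference step can be approximated in a script. The slide names Ultra-Edit Compare as the tool actually used; the sketch below swaps in Python's standard `difflib` module purely for illustration, and the function name `changed_lines` and the sample text are hypothetical.

```python
import difflib


def changed_lines(prior_text, current_text):
    """List lines removed ('-') or added ('+') between two file versions."""
    diff = difflib.unified_diff(
        prior_text.splitlines(),
        current_text.splitlines(),
        lineterm="",
        n=0,  # no context lines, so only real changes remain
    )
    body = list(diff)[2:]  # drop the '---' / '+++' header pair
    return [line for line in body if not line.startswith("@@")]


# Hypothetical excerpts of a generated file before and after an update.
PRIOR = "value: Comet\nvalue: Planet\n"
CURRENT = "value: Comet\nvalue: Plasma Stream\nvalue: Planet\n"
```

Here `changed_lines(PRIOR, CURRENT)` reports the single added line, which is the kind of expected difference a reviewer would confirm against the approved change list before release.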

27 Acknowledgements*
* Anyone who sat through a DDWG 2-hour telecon or provided useful input.
Ed Bell, Richard Chen, Dan Crichton, Amy Culver, Patty Garcia, Ed Grayzeck, Ed Guinness, Mitch Gordon, Sean Hardman, Lyle Huber, Steve Hughes, Chris Isbell, Steve Joy, Ronald Joyner, Debra Kazden, Todd King, John Kodis, Joe Mafi, Mike Martin, Thomas Morgan, Lynn Neakrase, Paul Ramirez, Anne Raugh, Shannon Rees, Mark Rose, Matias Roybal, Elizabeth Rye, Boris Semenov, Dick Simpson, Susie Slavney, Dillon White, Peter Allan, David Heather, Michel Gangloff, Santa Martinez, Thomas Roatsch, Alain Sarkissian

28 Metadata Consistency
What is the problem? A single query model [1] is not appropriate for PDS4.
What solution was planned for this problem? The product-centric PDS4 architecture allows custom query models to be defined for sets of one or more product types.
Who is responsible?
  Definition - Node discipline experts
  Validation and Processing - PDS4 Infrastructure
What are the goals?
  Define the PDS4 “query model” template
  Develop and test use cases for selected product types
  Develop recommendations for how the query models are designed and maintained, used to configure system components, used to guide data providers, and used in product label validation
[1] Query Model - A set of attributes to be used as search parameters for products within a defined set of products.

29 Role of the Query Model
[Diagram: Query Models, informed by Discipline Expertise and the PDS4 IM, configure the Harvester and the Search Engine and provide guidance for Data Engineering. The Harvester harvests from metadata sources (1. Product Label, 2. Index Table, Dynamic, Derived), including metadata external to the label. Other elements shown: Registry, Repository.]

30 Release Process for the IM
Update IM with CCB approved SCRs and Bug Fixes
Generate documents from the IM
Perform difference and regression tests
Release for Unit Testing
  Fix bugs found during Unit Testing
  Nodes are requested to participate
Release for Model Integration and Testing (MI&T)
  Fix bugs found during MI&T
Release for System Integration and Testing as part of the system Build process

31 PDS4 Documents and their Relationships
[Diagram: the PDS4 Information Model Specification (Requirements, Engineering Specification) shown with the Concepts Document (Big Picture), Standards Reference (Requirements, User Friendly), Data Dictionary Tutorial (Informative), Data Provider’s Handbook (Cookbook), Data Dictionary (Definitions), XML Schemas (Blueprints), PDS4 Product Labels (Deliverables), Registry Configuration File (Object Descriptions), and Registry (Product Tracking and Cataloging). Arrow labels: derive, generate, references, create / validate, instruct, configure. Legend: Informative Document, Standards Document, File System.]

