Data Model & DDWG Update
  Management Council Face-to-Face
  Flagstaff, Arizona
  August 22-23, 2011
Topics
  - Design Process
  - Builds
  - Calendar
  - Build 1b Review Issues
Data Standards Design Process
"Build"
  What exactly has to happen?
  - Freeze the Information Model
  - Finalize the System
  - Generate Schema
  - Freeze the Document Set
    - Introduction
    - Concepts Document
    - Glossary
    - Jump Start
    - Data Provider's Handbook
    - Standards Reference
    - Dictionary Tutorial
    - Data Dictionary
    - Example Set
  Legend: Generated, Reasonably Stable, Human Intervention
"Build"
  What this translates to is "lead time". Right now we're looking at two to three weeks of lead time from "freeze the model" to "flip the switch" on the build. Let's look at a calendar.
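The lead-time arithmetic can be sketched in a few lines of Python. This is only an illustration: the function name `freeze_window` and the build date used below are hypothetical and do not come from the project calendar. Only the two-to-three-week range is taken from the slide.

```python
from datetime import date, timedelta


def freeze_window(build_date, min_weeks=2, max_weeks=3):
    """Return the (conservative, optimistic) model-freeze deadlines.

    The conservative deadline assumes the full three weeks of lead time,
    the optimistic one assumes only two.
    """
    conservative = build_date - timedelta(weeks=max_weeks)
    optimistic = build_date - timedelta(weeks=min_weeks)
    return conservative, optimistic


# Hypothetical build date, chosen only for illustration.
conservative, optimistic = freeze_window(date(2011, 10, 3))
print(conservative, optimistic)
```

Under these assumptions, any model change that arrives after the optimistic deadline cannot make the build, and anything after the conservative deadline is at risk.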
Objects on the Calendar
Objects on the Calendar Are Closer Than They Appear
Internal Review Issues
  The 1b Review produced more than 200 separate issues/comments.
  Issues fell into two broad categories:
  - Documentation issues: clarity, consistency, completeness, integration.
  - Concerns about the model contents & implementation.
  The status of each review issue falls into one of two categories:
  - Open
  - Closed
Internal Review Issues
  Open
  - Still working for Build 2.
  - Will address after Build 2.
  - Have not decided whether or not to implement.
  Closed
  - We have implemented.
  - Model-related issue arose from misunderstanding some aspect of PDS4.
  - We disagree:
    - Incompatible with PDS4 requirements.
    - Incompatible with the model approach we're using.
    - Not possible to implement within our time & budget constraints.
Internal Review: Some Closed Issues
  Implemented
  - Document set integration.
  - Need analogs for PDS3 spreadsheet & container.
  Misunderstanding
  - New structures don't support qubes.
  - Volatile metadata in a static archive (redelivery issue).
  Disagree
  - Labels that describe multiple data objects don't really work.
  - Do away with character tables.
  - Other space science archives: consider using VOTABLE, CDM & OPeNDAP approach, class="variable" & named "dimension".
Internal Review: Some Open Issues
  - Documentation issues: still working many of them.
  - Need robust, global metadata.
  - New structures don't support some EDRs, telemetry, DSN data.
  - Use a standard bundle entry (bundle index.html).
  - Consider a nomenclature review.
  - There is a proposed alternate XML implementation.
    - Starts with XML Schema 1.0 or 1.1?
  - Perceived complexity.
  - Too many subclasses.
Open Issue: Too many Subclasses (1)
  - Going back to the original reviews, the issue concerns the number of variations expanded from the four base structural types.
  - The underlying concerns are overhead and confusion.
  - There have been a lot of changes since Build 1b.
  - As we look at this issue now, we have to ask three questions:
    - What do we count?
    - Are there too many?
    - If the numbers are reasonable, do we have the right ones?
Open Issue: Too many Subclasses (2-4)
  What do we count?
  - Count what the data providers and end users see.
  - Schemas, specifically the Product_* schemas.
  - We have 40 Product schemas. Wait for it …
Open Issue: Too many Subclasses (5-10)
  40 Product schemas, by function:
  - Aggregations: 2 (probably will be 3)
  - Observational Data: 10 (probably will add 1 or 2)
  - Observational Support: 10 (e.g., browse, document)
  - Context: 5
  - Operations: 13 (includes 5 PDS3 Context)
  Providers see 27, end users see 22.
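The counts above can be cross-checked with a short Python sketch. The slides do not say which categories make up the 27 schemas that providers see or the 22 that end users see. The helper `groupings_summing_to` (a hypothetical name) only lists the category groupings that are arithmetically consistent with those totals; it does not confirm the intended breakdown.

```python
from itertools import combinations

# Counts per functional category, as listed on the slide.
PRODUCT_SCHEMA_COUNTS = {
    "Aggregations": 2,
    "Observational Data": 10,
    "Observational Support": 10,
    "Context": 5,
    "Operations": 13,
}


def total(counts):
    """Sum the per-category schema counts."""
    return sum(counts.values())


def groupings_summing_to(counts, target):
    """Return every group of categories whose counts add up to target."""
    names = list(counts)
    matches = []
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            if sum(counts[name] for name in combo) == target:
                matches.append(combo)
    return matches


print(total(PRODUCT_SCHEMA_COUNTS))
print(groupings_summing_to(PRODUCT_SCHEMA_COUNTS, 27))
print(groupings_summing_to(PRODUCT_SCHEMA_COUNTS, 22))
```

With these counts, the category totals sum to 40, and each of the two audience totals matches exactly one grouping: everything except Operations for 27, and everything except Operations and Context for 22.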
Open Issue: Too many Subclasses (11)
  Are there too many?
  Comparing to PDS3 tends to be an apples-and-oranges situation, but the number of:
  - PDS4 observational data products is roughly equivalent to the corresponding subset of PDS3 Data Objects.
  - PDS4 context products is roughly equivalent to the corresponding subset of PDS3 Catalog Objects.
  - PDS4 observational data support products is substantially greater than the corresponding subset of PDS3 Data Objects.
Open Issue: Too many Subclasses (12)
  Do we have the correct set?
  - We're close, but will probably add and subtract a few.
  - The set may be significantly affected by the potential change in the XML Schema implementation.
Questions?
Backups
Acknowledgements*
  Ed Bell, Richard Chen, Dan Crichton, Amy Culver, Patty Garcia, Ed Grayzeck, Ed Guinness, Mitch Gordon, Sean Hardman, Lyle Huber, Steve Hughes, Chris Isbell, Steve Joy, Ronald Joyner, Debra Kazden, Todd King, Joe Mafi, Mike Martin, Thomas Morgan, Lynn Neakrase, Paul Ramirez, Anne Raugh, Mark Rose, Elizabeth Rye, Boris Semenov, Dick Simpson, Susie Slavney
  Peter Allan, David Heather, Michel Gangloff, Santa Martinez, Thomas Roatsch, Alain Sarkissian
  * Anyone who sat through a DDWG 2-hour telecon or provided useful input.
PDS4 Documents and their Relationships
  [Diagram. Labels as extracted: Concepts Document; Big Picture; Standards Reference; Requirements; User Friendly; XML Schemas; Blueprints; PDS4 Product Labels; Deliverables; Data Dictionary; Definitions; PDS4 Information Model Specification; Requirements; Engineering Specification; Informative; Data Provider's Handbook; Cookbook; Registry Configuration File; Object Descriptions; Registry; Product Tracking and Cataloging; Introduction to PDS4 Documentation; Jumpstart; Glossary; Data Dictionary Tutorial.]
  [Relationship labels: derive, generates, references, creates / validates, instruct, configures.]
  [Legend: Complete, Some TBD.]
PDS4 Information Model and Generated Documents
  [Diagram. Labels as extracted: Requirements & Domain Knowledge; PDS4 Information Model; Query Models; Information Model Specification; XML Schema (Generic); Filter and Translator; Information Modeling Tool; PDS4 Data Dictionary (Doc and DB); XML Schema (Specific); XML Document (Label); XMI/UML; Registry Configuration Parameters; PDS4 Data Dictionary (ISO/IEC 11179).]