Published by Melissa Bond Modified over 9 years ago
1
Design Considerations for Catalogs
Joseph A. Hourclé
2008-02-20 NSO-Tucson
2
About Me
3
Types of Catalogs
Data Catalogs
  Used to track all data available
  May be ‘observation’ centric or ‘file’ centric
  Typically maintained by the mission or PI
Event / Feature Catalogs
  Added science input
  Typically a byproduct of other research
4
Why catalogs?
Too much data
  SDO : ~100k discrete observations per day
Why repeat the work?
  New science builds on previous work
  Draws on experience from domain experts
Adds value to the data
  Annotates the data
  Makes data of interest more ‘findable’
5
Common Catalog Problems
6
What are we cataloging?
May need a common concept of ‘record’ to create meaningful unions
‘Observations’ vs. ‘Files’
  May have multiple files that contain a given observation
  Browse products, different processing, different file formats
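One way to picture the ‘Observations’ vs. ‘Files’ distinction is a minimal Python sketch that collapses file-centric rows into one record per observation. The `FileRecord` class, its field names, and the sample rows are all hypothetical and exist only for illustration; a real catalog would need its own agreed notion of what identifies an observation.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class FileRecord:
    """One row of a hypothetical file-centric catalog."""
    observation_id: str  # assumed key shared by every file of one observation
    path: str
    kind: str            # e.g. "science", "reprocessed", "browse"
    file_format: str     # e.g. "FITS", "JPEG"


def group_by_observation(files):
    """Collapse file-centric rows into one entry per observation."""
    grouped = defaultdict(list)
    for record in files:
        grouped[record.observation_id].append(record)
    return dict(grouped)


# Invented sample rows: one observation stored as three different files.
files = [
    FileRecord("obs-001", "obs-001_lev0.fits", "science", "FITS"),
    FileRecord("obs-001", "obs-001_lev1.fits", "reprocessed", "FITS"),
    FileRecord("obs-001", "obs-001_thumb.jpg", "browse", "JPEG"),
    FileRecord("obs-002", "obs-002_lev0.fits", "science", "FITS"),
]

observations = group_by_observation(files)
print(len(files), "files,", len(observations), "observations")
```

Two catalogs can only be meaningfully unioned if they are first reduced to the same kind of record; here that would mean comparing the grouped observations rather than the raw file rows.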
7
Data Permutations
8
Lack of Documentation
What does ‘red’ mean?
  ‘LightRed’ vs ‘DarkRed’
How do we translate the PI’s terms to discipline concepts?
  Does the catalog use an unconventional definition of a term?
  Has usage of the term changed since the catalog was started?
9
Catalogs Have Intended Purposes / Users
Catalogs may have to be manipulated to get the information you want
A Solar Physics catalog may not be useful to someone in Heliospheric Physics
  Trying to answer different questions
Camera Filter Position Polarization Position 021 110 201
10
No Entry ≠ No Event
Lack of a record may be from lack of data
LASCO CME catalog
  What about periods when LASCO wasn’t observing?
Event catalogs show when we know something existed, not when we know it didn’t exist.
Event catalogs need to disclose gaps in the data they’re based on
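The point about gaps can be sketched in Python: if a catalog publishes its observing coverage alongside its events, a query can give three answers instead of two. Everything here is an assumption made for illustration, including the `Answer` enum, the function names, and the dates, which are invented and do not describe real LASCO coverage. For brevity the sketch only treats a query window as covered when it fits inside a single coverage interval.

```python
from datetime import datetime
from enum import Enum


class Answer(Enum):
    EVENT = "event recorded"
    NO_EVENT = "observed, no event recorded"
    UNKNOWN = "no data for this period"


def overlaps(start_a, end_a, start_b, end_b):
    """True if half-open intervals [start_a, end_a) and [start_b, end_b) overlap."""
    return start_a < end_b and start_b < end_a


def covered(start, end, coverage):
    """True if [start, end) lies entirely inside one coverage interval."""
    return any(c_start <= start and end <= c_end for c_start, c_end in coverage)


def query(start, end, events, coverage):
    """Answer 'was there an event?' without treating a data gap as a 'no'."""
    if any(overlaps(start, end, e_start, e_end) for e_start, e_end in events):
        return Answer.EVENT
    if covered(start, end, coverage):
        return Answer.NO_EVENT
    return Answer.UNKNOWN


# Invented coverage with a gap from 10 to 15 January, and one invented event.
coverage = [
    (datetime(2000, 1, 1), datetime(2000, 1, 10)),
    (datetime(2000, 1, 15), datetime(2000, 1, 31)),
]
events = [(datetime(2000, 1, 3, 6), datetime(2000, 1, 3, 12))]

print(query(datetime(2000, 1, 3), datetime(2000, 1, 4), events, coverage))
print(query(datetime(2000, 1, 5), datetime(2000, 1, 6), events, coverage))
print(query(datetime(2000, 1, 11), datetime(2000, 1, 12), events, coverage))
```

Without the coverage list, the second and third queries would be indistinguishable: both would simply return no rows.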
11
Catalogs are not just flat files
Catalogs can be databases
  Don’t need to be 2 dimensional
  Relational databases
  Can have multiple views into the data
    Each may have a different purpose
    Each may have a different audience
Storage of catalogs
  Beware of proprietary formats
    May not be useful to all audiences
    May not be available in the future
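As a rough sketch of "multiple views into the data", the following uses Python's standard `sqlite3` module to keep one relational catalog and expose two views for different audiences. The table names, column names, view names, and sample rows are hypothetical; this is one possible layout, not a recommended schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE observation (
    id INTEGER PRIMARY KEY,
    instrument TEXT NOT NULL,
    start_time TEXT NOT NULL          -- ISO 8601 text
);
CREATE TABLE data_file (
    id INTEGER PRIMARY KEY,
    observation_id INTEGER NOT NULL REFERENCES observation(id),
    path TEXT NOT NULL,
    kind TEXT NOT NULL                -- 'science' or 'browse'
);

-- View for people browsing: one row per observation that has a preview image.
CREATE VIEW browse_view AS
    SELECT o.id AS observation_id, o.instrument, o.start_time, f.path AS preview
    FROM observation AS o
    JOIN data_file AS f ON f.observation_id = o.id
    WHERE f.kind = 'browse';

-- View for data managers: how many files exist for each observation.
CREATE VIEW file_count_view AS
    SELECT o.id AS observation_id, COUNT(f.id) AS n_files
    FROM observation AS o
    LEFT JOIN data_file AS f ON f.observation_id = o.id
    GROUP BY o.id;
""")

# Invented sample rows.
conn.executemany(
    "INSERT INTO observation (id, instrument, start_time) VALUES (?, ?, ?)",
    [(1, "imager-A", "2008-02-20T00:00:00"), (2, "imager-A", "2008-02-20T00:10:00")],
)
conn.executemany(
    "INSERT INTO data_file (observation_id, path, kind) VALUES (?, ?, ?)",
    [(1, "obs1.fits", "science"), (1, "obs1.jpg", "browse"), (2, "obs2.fits", "science")],
)

browse_rows = conn.execute(
    "SELECT observation_id, preview FROM browse_view"
).fetchall()
count_rows = conn.execute(
    "SELECT observation_id, n_files FROM file_count_view ORDER BY observation_id"
).fetchall()
print(browse_rows)
print(count_rows)
```

SQLite is used here only because it ships with Python and stores data in an openly documented file format, which fits the caution about proprietary formats; any relational database would serve the same purpose.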
12
Added Value
Catalogs can link to other catalogs
  Or browse products
  Or applications
    Search Systems
    Visualization Tools
13
Other Types of Catalogs
Link Publications to Data
  Where can I find the data from this article?
  Has this data been written about?
Catalogs of Services / Applications
  What data is out there? (VxOs)
  Translators / reprocessors? (SSW, CoSEC)
  Visualization tools? (SSW, CoSEC?)
Annotation Services
Observing / Campaign Catalogs
14
There is help
Database Administrators
  Data Modeling
Library and Information Science
  Cataloging / Library Theory
  Information Architecture
Terminology varies
  AGU : Science Informatics
  NSF : Cyber Infrastructure
  ARL : E-science
15
Summary
Documentation!
  Define terms / columns / rows
  Note the unknown periods
Consult experts
  Database admins / Librarians / etc
Consider non-flat structures
  Multiple views into the data
Consider accessibility
  Can people read/understand this format?
17
Sunspot on 15 July 2002 from the Swedish 1-m Solar Telescope on La Palma
18
http://virtualsolar.org/ joseph.a.hourcle@nasa.gov
20
Functional Requirements for Bibliographic Records
21
And it mostly works