Design Considerations for Catalogs. Joseph A. Hourclé, 2008-02-20, NSO-Tucson.


1 Design Considerations for Catalogs. Joseph A. Hourclé, 2008-02-20, NSO-Tucson

2 About Me

3 Types of Catalogs
- Data Catalogs
  - Used to track all data available
  - May be ‘observation’ centric or ‘file’ centric
  - Typically maintained by the mission or PI
- Event / Feature Catalogs
  - Added science input
  - Typically a byproduct of other research

4 Why catalogs?
- Too much data
  - SDO: ~100k discrete observations per day
- Why repeat the work?
  - New science builds on previous work
  - Draws on experience from domain experts
- Adds value to the data
  - Annotates the data
  - Makes data of interest more ‘findable’

5 Common Catalog Problems

6 What are we cataloging?
- May need a common concept of ‘record’ to create meaningful unions
- ‘Observations’ vs. ‘Files’
  - May have multiple files that contain a given observation
  - Browse products, different processing, different file formats
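One way to picture a common concept of ‘record’ is an observation that owns any number of files. The sketch below is only an illustration: the class names, field names, and the `group_by_observation` helper are assumptions made for this example, not something taken from the slides.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class FileRecord:
    """One physical file that holds a given observation (hypothetical fields)."""
    path: str
    file_format: str       # e.g. "FITS" or "JPEG"
    processing_level: str  # e.g. "raw", "calibrated", "browse"


@dataclass
class Observation:
    """The shared 'record': one observation, however many files contain it."""
    observation_id: str
    instrument: str
    start_time: str  # ISO 8601 string, kept simple for the sketch
    files: List[FileRecord] = field(default_factory=list)


def group_by_observation(
    rows: Iterable[Tuple[str, str, str, FileRecord]]
) -> Dict[str, Observation]:
    """Turn a file-centric listing into observation-centric records."""
    observations: Dict[str, Observation] = {}
    for obs_id, instrument, start_time, file_record in rows:
        obs = observations.setdefault(
            obs_id, Observation(obs_id, instrument, start_time)
        )
        obs.files.append(file_record)
    return observations
```

Once two file-centric catalogs are both reduced to observation-centric records like this, a union over `observation_id` counts each observation once, regardless of how many browse products or reprocessed copies exist.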

7 Data Permutations

8 Lack of Documentation
- What does ‘red’ mean?
  - ‘LightRed’ vs ‘DarkRed’
- How do we translate the PI’s terms to discipline concepts?
  - Does the catalog use an unconventional definition of a term?
  - Has usage of the term changed since the catalog was started?
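A lightweight guard against this problem is a data dictionary that ships with the catalog, plus a check that every term in use is defined. The sketch below assumes a hypothetical `color` column; the definitions are placeholders that the catalog's maintainer would have to supply, and `undocumented_terms` is a name invented for this example.

```python
# Hypothetical data dictionary. The definitions are placeholders: only the
# catalog's maintainer can say what each term actually means.
DATA_DICTIONARY = {
    "color": {
        "definition": "PLACEHOLDER: what the PI means by this column",
        "values": {
            "LightRed": "PLACEHOLDER: definition supplied by the PI",
            "DarkRed": "PLACEHOLDER: definition supplied by the PI",
        },
    },
}


def undocumented_terms(rows, dictionary):
    """Return sorted (column, value) pairs used in rows but not documented.

    A value of None in a pair means the whole column is undocumented.
    """
    missing = set()
    for row in rows:
        for column, value in row.items():
            entry = dictionary.get(column)
            if entry is None:
                missing.add((column, None))
            elif value not in entry["values"]:
                missing.add((column, value))
    return sorted(missing, key=lambda pair: (pair[0], str(pair[1])))
```

Running a check like this whenever the catalog is updated would surface a new value such as a bare ‘red’ before users have to guess what it means.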

9 Catalogs Have Intended Purposes / Users
- Catalogs may have to be manipulated to get the information you want
- A Solar Physics catalog may not be useful to someone in Heliospheric Physics
  - Trying to answer different questions
[Example table on slide. Columns: Camera, Filter Position, Polarization Position. Values: 021 110 201]

10 No Entry ≠ No Event
- Lack of a record may be from lack of data
- LASCO CME catalog
  - What about periods when LASCO wasn’t observing?
- Event catalogs show when we know something existed, not when we know it didn’t exist
- Event catalogs need to disclose gaps in the data they’re based on
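If a catalog publishes its observing coverage alongside its events, a lookup can give a three-way answer instead of a misleading yes/no. This is a minimal sketch under that assumption; the `classify` function and the interval representation are invented for illustration and do not describe how any existing catalog is queried.

```python
from datetime import datetime


def classify(when, events, coverage):
    """Return 'event', 'no event', or 'unknown' for a moment in time.

    events and coverage are lists of (start, end) datetime pairs,
    treated as half-open intervals [start, end).
    """
    def inside(intervals):
        return any(start <= when < end for start, end in intervals)

    if inside(events):
        return "event"
    if inside(coverage):
        # The instrument was observing and nothing was recorded.
        return "no event"
    # No data for this time, so absence of a record tells us nothing.
    return "unknown"
```

The key design point is the last branch: without the coverage list, the second and third cases are indistinguishable to anyone using the catalog.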

11 Catalogs are not just flat files
- Catalogs can be databases
  - Don’t need to be 2 dimensional
  - Relational databases
  - Can have multiple views into the data
    - Each may have a different purpose
    - Each may have a different audience
- Storage of catalogs
  - Beware of proprietary formats
    - May not be useful to all audiences
    - May not be available in the future
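As a rough illustration of a non-flat catalog with multiple views, the sketch below uses Python's built-in `sqlite3` module. The table, column, and view names are assumptions made up for this example; a real catalog's schema would come out of proper data modeling.

```python
import sqlite3


def build_catalog():
    """Create an in-memory catalog with two tables and two views."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE observation (
            obs_id     INTEGER PRIMARY KEY,
            instrument TEXT NOT NULL,
            start_time TEXT NOT NULL
        );
        CREATE TABLE file (
            file_id INTEGER PRIMARY KEY,
            obs_id  INTEGER NOT NULL REFERENCES observation(obs_id),
            path    TEXT NOT NULL,
            format  TEXT NOT NULL
        );
        -- View for people asking "what was observed?"
        CREATE VIEW observation_summary AS
            SELECT o.obs_id, o.instrument, o.start_time,
                   COUNT(f.file_id) AS n_files
            FROM observation o
            LEFT JOIN file f ON f.obs_id = o.obs_id
            GROUP BY o.obs_id;
        -- View for people asking "which files can I download?"
        CREATE VIEW file_listing AS
            SELECT f.path, f.format, o.instrument, o.start_time
            FROM file f
            JOIN observation o ON o.obs_id = f.obs_id;
    """)
    return conn
```

Both views read from the same underlying tables, so each audience gets a shape suited to its questions without anyone maintaining two copies of the data.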

12 Added Value
- Catalogs can link to other catalogs
  - Or browse products
  - Or applications
    - Search Systems
    - Visualization Tools

13 Other Types of Catalogs
- Link Publications to Data
  - Where can I find the data from this article?
  - Has this data been written about?
- Catalogs of Services / Applications
  - What data is out there? (VxOs)
  - Translators / reprocessors? (SSW, CoSEC)
  - Visualization tools? (SSW, CoSEC?)
- Annotation Services
- Observing / Campaign Catalogs

14 There is help
- Database Administrators
  - Data Modeling
- Library and Information Science
  - Cataloging / Library Theory
  - Information Architecture
- Terminology varies
  - AGU: Science Informatics
  - NSF: Cyber Infrastructure
  - ARL: E-science

15 Summary
- Documentation!
  - Define terms / columns / rows
  - Note the unknown periods
- Consult experts
  - Database admins / Librarians / etc.
- Consider non-flat structures
  - Multiple views into the data
- Consider accessibility
  - Can people read/understand this format?


17 Sunspot on 15 July 2002 from the Swedish 1-m Solar Telescope on La Palma

18 http://virtualsolar.org/ joseph.a.hourcle@nasa.gov


20 Functional Requirements for Bibliographic Records

21 And it mostly works


