Published by Shonda Louisa Tucker. Modified over 9 years ago.
1
Event and Feature Catalogs in the Virtual Solar Observatory
Joseph A. Hourclé and the VSO Team
SP54A-07 : 2008 May 30
2
Types of Catalogs
Data Catalogs
  - Used to track all data available
  - May be ‘observation’ centric or ‘file’ centric
  - Typically maintained by the mission or PI
  - Have basic similarities across archives
Event / Feature Catalogs
  - Added science input
  - Typically a byproduct of other research
  - Very heterogeneous
3
What are we cataloging?
Features:
  - Active Regions
  - Sunspots
  - CMEs
  - Filaments
  - Prominences
  - Bright points
  - Coronal Loops
  - Oscillations
  - Coronal Holes
  - EIT Waves
Events:
  - Radio Bursts
  - Flares
  - Campaigns
Non Events:
  - Data Gaps
Data: *
  - Publications
  - Annotation
4
Processing Catalogs
Ingestion
  - Reading the information
  - Understanding the information
Storage
Presentation
  - Single Catalog Use
  - Multi Catalog Use
5
Ingestion
Who is the authoritative source?
  - What if there are multiple value-added derivative products?
  - Which is the best format for ingestion?
What is being cataloged?
  - What is the unit for each record?
What data is in each record?
  - Columns != Data Fields
  - May need to infer values from other fields
6
Ingestion, cont’d
Formatting issues?
  - Fixed width values overflowing column
  - Formatting may store information
      - Color / Font effects
  - May vary the record depending on info
  - May be maintained by hand
      - … by multiple maintainers
  - Data in sub-headings
Missing Data?
  - How are null values / error issues marked?
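The formatting and missing-data questions above can be made concrete with a minimal Python sketch. It assumes a hypothetical three-column fixed-width layout and a hypothetical set of null markers; neither comes from any real catalog, and a real ingest step would take both from the catalog's own documentation.

```python
from typing import List, Optional

# Hypothetical null markers; each real catalog uses (and may not document) its own.
NULL_MARKERS = {"N/A", "n/a", "9999", "-9999"}


def parse_field(raw: str) -> Optional[str]:
    """Strip padding and map empty, all-dash, or sentinel values to None."""
    text = raw.strip()
    if text.strip("-") == "" or text in NULL_MARKERS:
        return None
    return text


def split_fixed_width(line: str, widths: List[int]) -> List[Optional[str]]:
    """Cut a line into consecutive fields of the given widths."""
    fields = []
    pos = 0
    for width in widths:
        fields.append(parse_field(line[pos:pos + width]))
        pos += width
    return fields


def looks_overflowed(line: str, widths: List[int]) -> bool:
    """Flag lines longer than the declared layout, a hint that a value
    spilled out of its column and shifted everything after it."""
    return len(line.rstrip("\n")) > sum(widths)


# Assumed layout: date (11 chars), flare class (5 chars), region (4 chars).
WIDTHS = [11, 5, 4]
record = split_fixed_width("2008-05-30 M9   ----", WIDTHS)
overflow = looks_overflowed("2008-05-30 X10.5 ----", WIDTHS)
```

A length check like this only catches overflow that lengthens the line; a value that overwrites its neighbor's padding needs per-field validation instead.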
7
Storage
IDL
  - Difficult to access from other platforms
XML
  - Self documenting; good for interchange, not so great for use
Flat file
  - Compact, but must be loaded into another system to ‘do science’. Good for interchange if well formed & documented
RDBMS
  - Good for searching & filtering
  - … may not be able to handle all data types / multi-value fields without normalization
Hierarchical databases
  - Can handle multi-value, but not designed for record cross-correlation
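As an illustration of the RDBMS point about multi-value fields, here is a small sketch using Python's built-in sqlite3 module. The table names, column names, and region labels ("AR-A", "AR-B") are placeholders invented for this example, not taken from any VSO schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One event may be associated with several regions. Rather than packing
# them into a single text column, normalization moves them to a child table.
conn.executescript("""
    CREATE TABLE event (
        id         INTEGER PRIMARY KEY,
        start_time TEXT NOT NULL
    );
    CREATE TABLE event_region (
        event_id INTEGER NOT NULL REFERENCES event(id),
        region   TEXT NOT NULL
    );
""")

# Placeholder record: one event linked to two regions.
conn.execute("INSERT INTO event VALUES (1, '2008-05-30T12:00:00')")
conn.executemany(
    "INSERT INTO event_region VALUES (?, ?)",
    [(1, "AR-A"), (1, "AR-B")],
)

# Searching on one of the multiple values is now an ordinary join.
rows = conn.execute(
    "SELECT e.id, e.start_time FROM event e "
    "JOIN event_region r ON r.event_id = e.id "
    "WHERE r.region = ?",
    ("AR-B",),
).fetchall()
```

The cost is the one the slide names: the catalog's original single-row record no longer exists as one row and has to be reassembled for display.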
8
Storage of Field Values
Columns may be multiple fields:
  - (value) or (comment)
  - (value) and (units)
What do we store?
  - Store both a ‘display’ and a more useful value?
      - E.g., store display value & units, but also store value in a fixed unit.
  - Ensure correct sorting, e.g. for X-ray flares:
      - M9 < X2 < X10 … store as M09 / X02 / X10
      - What if X100? Store as W/m²?
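The sorting problem can be sketched in Python. The example keeps the display label and derives a numeric peak flux in W/m² as the sort key, using the standard GOES soft X-ray class scale (A = 1e-8 up to X = 1e-4 W/m² per unit of the multiplier). The function name and the label pattern are assumptions made for this sketch.

```python
import re

# Base peak flux in W/m^2 for each GOES flare class letter.
CLASS_BASE_FLUX = {"A": 1e-8, "B": 1e-7, "C": 1e-6, "M": 1e-5, "X": 1e-4}

# A class letter followed by a multiplier such as "9", "2.5", or "100".
_LABEL = re.compile(r"^([ABCMX])(\d+(?:\.\d+)?)$")


def flare_class_to_flux(label: str) -> float:
    """Convert a display label like 'M9' or 'X10' to peak flux in W/m^2."""
    match = _LABEL.match(label.strip().upper())
    if match is None:
        raise ValueError(f"unrecognized flare class: {label!r}")
    letter, multiplier = match.groups()
    return CLASS_BASE_FLUX[letter] * float(multiplier)


labels = ["X10", "M9", "X100", "X2"]
by_text = sorted(labels)                           # plain string sort
by_flux = sorted(labels, key=flare_class_to_flux)  # sort on the stored flux
```

Sorting the raw strings puts X100 ahead of X2, while the flux key gives M9, X2, X10, X100. Zero-padding (M09 / X02) fixes the string sort only until a value needs one more digit, which is the "What if X100?" case.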
12
Presentation
Can I mimic the original display?
  - Does that limit the uses of the catalog?
Do the concepts need to be adjusted?
  - Changes in definitions or accepted community standards
Do columns need to be linked to make sense?
  - Min / Max / Units for a range
  - Field may not be sortable without another column
Other presentation issues come from use.
13
Single Catalog Use: How do we access the catalog?
SQL
  - Very powerful, but difficult to learn / use. How do we export to ‘do science’ with it?
IDL
  - Good for scientists who use IDL, allows them to ‘do science’ without conversion
HTML GUI
  - JavaScript allows more processing of tables. Still need to export for more complicated science
APIs
  - Yes, but what format output, and what features do we need to support?
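One possible answer to the export question is to filter in SQL and hand the result back as CSV text. The sketch below uses only Python's standard library; the `flare` table, its columns, and its two rows are placeholders made up for illustration, and `export_csv` is a hypothetical helper rather than part of any VSO API.

```python
import csv
import io
import sqlite3


def export_csv(conn: sqlite3.Connection, query: str, params: tuple = ()) -> str:
    """Run a query and return the result as CSV text with a header row."""
    cursor = conn.execute(query, params)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # cursor.description holds one tuple per result column; item 0 is its name.
    writer.writerow([column[0] for column in cursor.description])
    writer.writerows(cursor)
    return buffer.getvalue()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE flare (peak_time TEXT, goes_class TEXT, flux_wm2 REAL)")
conn.executemany(
    "INSERT INTO flare VALUES (?, ?, ?)",
    [
        ("2008-01-01T00:00", "M9", 9e-5),  # placeholder rows, not real events
        ("2008-01-02T00:00", "X2", 2e-4),
    ],
)

csv_text = export_csv(
    conn,
    "SELECT peak_time, goes_class FROM flare "
    "WHERE flux_wm2 >= ? ORDER BY peak_time",
    (1e-4,),
)
```

CSV is only one candidate output; the same function shape would work for other formats, which is the open question the APIs bullet raises.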
14
Single Catalog Use, cont’d
What is the best presentation format for a given use of the catalog?
  - Simple display / browsing
  - Searching / Filtering
  - … are there other common science tasks?
Different uses may suggest or require different formats
17
Multi-Catalog Use
Need to understand what the fields mean so we can cross-correlate the tables.
  - If correlations done by hand, is O(n²)
  - Just because it’s of the same unit doesn’t mean it’s directly comparable.
  - VOTable has ‘UCD+’, but may not be specific enough
Some concepts don’t translate well:
  - Carrington Coordinates to Heliographic
  - Observations from well off the sun-earth line
  - Of items off the solar disk
18
Multi-Catalog Use, cont’d
Ontologies
  - More descriptive
  - Can define relationships between column types (e.g., how to convert)
  - Expensive to start
      - VSTO & SWEET could serve as a foundation
      - SESDI already has prototypes
  - Becomes an O(n) problem
      - Describe each catalog individually
      - Reasoner determines how to join them
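The scaling argument can be shown with a toy Python sketch. Each catalog is described once by mapping its columns to shared concept labels, and a lookup finds the columns two catalogs have in common. The catalog names, column names, and concept labels are all invented for this example, and the lookup is a stand-in for a real ontology reasoner, which would also handle conversions and subclass relationships.

```python
from typing import Dict, List, Tuple

# Hypothetical per-catalog descriptions: column name -> shared concept label.
DESCRIPTIONS: Dict[str, Dict[str, str]] = {
    "flares": {"peak": "time.peak", "cls": "xray.class"},
    "cmes": {"first_seen": "time.start", "speed": "velocity.radial"},
    "radio": {"t_peak": "time.peak", "freq": "frequency"},
}


def joinable_columns(left: str, right: str) -> List[Tuple[str, str]]:
    """Return (left column, right column) pairs that share a concept."""
    return sorted(
        (left_col, right_col)
        for left_col, left_concept in DESCRIPTIONS[left].items()
        for right_col, right_concept in DESCRIPTIONS[right].items()
        if left_concept == right_concept
    )


def pairwise_mappings(n: int) -> int:
    """Hand-written mappings needed to relate every pair of n catalogs."""
    return n * (n - 1) // 2


flare_radio = joinable_columns("flares", "radio")
flare_cme = joinable_columns("flares", "cmes")
```

With 10 catalogs, mapping every pair by hand means 45 mappings, against 10 descriptions when each catalog is described once. The empty result for flares and CMEs also echoes the earlier caution: both have time columns, but a peak time and a start time are not directly comparable.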
19
For the Future
Virtual Solar Observatory
  - Ingesting solar-related catalogs so they can be served via an API for other projects
  - Being tested by HelioViewer
Heliophysics Event List Manager
  - Define API requirements for catalogs
  - Deal with cross-catalog issues
20
http://virtualsolar.org/ joseph.a.hourcle@nasa.gov
22
And now … For those playing along at home:
  - Look at the example image
  - See how many things you can find that might be a problem with the catalog
  - Go to the next slide for a list of (some of) the issues