Event and Feature Catalogs in the Virtual Solar Observatory Joseph A. Hourclé and the VSO Team SP54A-07 : 2008 May 30.

1 Event and Feature Catalogs in the Virtual Solar Observatory Joseph A. Hourclé and the VSO Team SP54A-07 : 2008 May 30

2 Types of Catalogs
  - Data Catalogs
      - Used to track all data available
      - May be ‘observation’ centric or ‘file’ centric
      - Typically maintained by the mission or PI
      - Have basic similarities across archives
  - Event / Feature Catalogs
      - Added science input
      - Typically a byproduct of other research
      - Very heterogeneous

3 What are we cataloging?
  - Features:
      - Active Regions
      - Sunspots
      - CMEs
      - Filaments
      - Prominences
      - Bright points
      - Coronal Loops
      - Oscillations
      - Coronal Holes
      - EIT Waves
  - Events:
      - Radio Bursts
      - Flares
      - Campaigns
  - Non Events:
      - Data Gaps
  - Data: *
      - Publications
      - Annotation

4 Processing Catalogs
  - Ingestion
      - Reading the information
      - Understanding the information
  - Storage
  - Presentation
      - Single Catalog Use
      - Multi Catalog Use

5 Ingestion
  - Who is the authoritative source?
      - What if there are multiple value-added derivative products?
      - Which is the best format for ingestion?
  - What is being cataloged?
      - What is the unit for each record?
  - What data is in each record?
      - Columns != Data Fields
      - May need to infer values from other Fields

6 Ingestion, cont’d
  - Formatting issues?
      - Fixed width values overflowing column
      - Formatting may store information
      - Color / Font effects
      - May vary the record depending on info
      - May be maintained by hand
          - … by multiple maintainers
      - Data in sub-headings
  - Missing Data?
      - How are null values / error issues marked?
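
The ingestion questions above (fixed-width overflow, null markers) can be illustrated with a minimal sketch. The column layout, the field names, and the set of null markers below are all hypothetical; a real catalog would need its own layout and its own documented conventions.

```python
from typing import Optional

# Hypothetical fixed-width layout: (field name, start, end) character offsets.
LAYOUT = [("date", 0, 10), ("flare_class", 11, 16), ("region", 17, 22)]

# Hypothetical null markers; each real catalog marks missing data its own way.
NULL_MARKERS = {"", "-", "----", "N/A"}


def parse_record(line: str) -> dict[str, Optional[str]]:
    """Slice one fixed-width line into fields, mapping null markers to None."""
    record: dict[str, Optional[str]] = {}
    for name, start, end in LAYOUT:
        raw = line[start:end].strip()
        record[name] = None if raw in NULL_MARKERS else raw
    return record


def overflowed(line: str) -> bool:
    """Flag lines where a value appears to spill past its column.

    Assumes each column is followed by one separator space, so a
    non-space character at a boundary position suggests an overflow.
    """
    boundaries = [end for _, _, end in LAYOUT[:-1]]
    return any(len(line) > pos and line[pos] != " " for pos in boundaries)
```

Checking `overflowed` before `parse_record` lets an ingest script set suspect lines aside for manual review instead of silently storing truncated values.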

7 Storage
  - IDL:
      - Difficult to access from other platforms
  - XML
      - Self documenting; good for interchange, not so great for use
  - Flat file
      - Compact, but must be loaded into another system to ‘do science’. Good for interchange if well formed & documented
  - RDBMS
      - Good for searching & filtering
      - … may not be able to handle all data types / multi-value fields without normalization
  - Hierarchical databases
      - Can handle multi-value, but not designed for record cross-correlation
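
As a sketch of what normalization means for a multi-value field in an RDBMS, the example below splits a slash-separated list of active regions into a separate table, using Python's built-in sqlite3 module. The table names, column names, and the "/" separator are assumptions made for illustration, not the schema of any actual catalog.

```python
import sqlite3


def build_catalog(rows):
    """Load (event_id, start_time, regions_text) rows into an in-memory database.

    regions_text is a hypothetical multi-value field such as "10998/10999";
    each value becomes its own row in event_region.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE event (id INTEGER PRIMARY KEY, start_time TEXT NOT NULL);
        CREATE TABLE event_region (
            event_id INTEGER NOT NULL REFERENCES event(id),
            region   INTEGER NOT NULL
        );
        """
    )
    for event_id, start_time, regions_text in rows:
        conn.execute("INSERT INTO event VALUES (?, ?)", (event_id, start_time))
        for region in regions_text.split("/"):
            if region.strip():  # skip empty entries (no associated region)
                conn.execute(
                    "INSERT INTO event_region VALUES (?, ?)",
                    (event_id, int(region)),
                )
    return conn


def events_in_region(conn, region):
    """Return the ids of all events associated with one region number."""
    cursor = conn.execute(
        "SELECT e.id FROM event e JOIN event_region r ON r.event_id = e.id "
        "WHERE r.region = ? ORDER BY e.id",
        (region,),
    )
    return [row[0] for row in cursor]
```

Once the field is split out, "all events in region N" is a plain join. The cost is that the original single-column display has to be reassembled when the catalog is presented.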

8 Storage of Field Values
  - Columns may be multiple fields:
      - (value) or (comment)
      - (value) and (units)
  - What do we store?
      - Store both a ‘display’ and a more useful value?
      - Eg, store display value & units, but also store value in a fixed unit.
      - Ensure correct sorting, eg for X-ray flares:
          - M9 < X2 < X10 … store as M09 / X02 / X10
          - What if X100? Store as W/m²?
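
The flare-sorting question can be sketched in a few lines: keep the display string, and derive both a zero-padded key and a peak flux in W/m². The letter-to-flux table follows the standard GOES X-ray classification (A = 1e-8 through X = 1e-4 W/m² per unit multiplier). The function names and the accepted input pattern are assumptions for this example.

```python
import re

# GOES X-ray class letters as flux per unit multiplier, in W/m^2.
DECADE_FLUX = {"A": 1e-8, "B": 1e-7, "C": 1e-6, "M": 1e-5, "X": 1e-4}

# Assumed display format: a class letter followed by a number, e.g. "M9" or "X2.3".
_CLASS_PATTERN = re.compile(r"^([ABCMX])(\d+)(\.\d+)?$")


def _split(display: str) -> tuple[str, str, str]:
    """Split a display value into (letter, whole digits, fractional part)."""
    match = _CLASS_PATTERN.match(display.strip().upper())
    if match is None:
        raise ValueError(f"unrecognised flare class: {display!r}")
    letter, whole, fraction = match.groups()
    return letter, whole, fraction or ""


def flare_flux(display: str) -> float:
    """Convert a display value such as 'X2' to a peak flux in W/m^2."""
    letter, whole, fraction = _split(display)
    return DECADE_FLUX[letter] * float(whole + fraction)


def padded_key(display: str, width: int = 2) -> str:
    """Zero-pad the multiplier so string sorting works, e.g. 'M9' -> 'M09'."""
    letter, whole, fraction = _split(display)
    return letter + whole.zfill(width) + fraction
```

With two-digit padding, `padded_key` orders M9, X2, X10 correctly, but "X100" still sorts before "X20" as a string. Sorting on `flare_flux` avoids having to pick a padding width in advance.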

12 Presentation
  - Can I mimic the original display?
      - Does that limit the uses of the catalog?
  - Do the concepts need to be adjusted?
      - Changes in definitions or accepted community standards
  - Do columns need to be linked to make sense?
      - Min / Max / Units for a range
      - Field may not be sortable without another column
  - Other presentation issues come from use.

13 Single Catalog Use: How do we access the catalog?
  - SQL
      - very powerful, but difficult to learn / use. How do we export to ‘do science’ with it?
  - IDL
      - good for scientists who use IDL, allows them to ‘do science’ without conversion
  - HTML GUI
      - Javascript allows more processing of tables. Still need to export for more complicated science
  - APIs
      - Yes, but what format output, and what features do we need to support?
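
One way to approach the output-format question for an API is to keep records in a neutral in-memory form and serialize on request. The sketch below uses only the Python standard library; the function name, the `fmt` parameter, and the choice of JSON and CSV are illustrative assumptions, not a description of any existing VSO interface.

```python
import csv
import io
import json


def export_records(records: list[dict], fmt: str = "json") -> str:
    """Serialize catalog records (a list of dicts) as JSON or CSV text."""
    if fmt == "json":
        return json.dumps(records, indent=2)
    if fmt == "csv":
        # Use the union of all keys so records with missing fields still fit.
        fields = sorted({key for record in records for key in record})
        buffer = io.StringIO()
        writer = csv.DictWriter(buffer, fieldnames=fields, lineterminator="\n")
        writer.writeheader()
        writer.writerows(records)  # None and absent values become empty cells
        return buffer.getvalue()
    raise ValueError(f"unsupported format: {fmt!r}")
```

JSON preserves the difference between a null and an empty string, while CSV flattens both to an empty cell. That is the same null-marking problem that comes up at ingestion.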

14 Single Catalog Use, cont’d
  - What is the best presentation format for a given use of the catalog?
      - Simple display / browsing
      - Searching / Filtering
      - … are there other common science tasks?
  - Different uses may suggest or require different formats

17 Multi-Catalog Use
  - Need to understand what the fields mean so we can cross-correlate the tables.
      - If correlations done by hand, is O(n²)
      - Just because it’s of the same unit doesn’t mean it’s directly comparable.
      - VOTable has ‘UCD+’, but may not be specific enough
  - Some concepts don’t translate well:
      - Carrington Coordinates to Heliographic
      - Observations from well off the sun-earth line
      - Of items off the solar disk

18 Multi-Catalog Use, cont’d
  - Ontologies
      - More descriptive
      - Can define relationships between column types (eg, how to convert)
      - Expensive to start
      - VSTO & SWEET could serve as a foundation
      - SESDI already has prototypes
      - Becomes an O(n) problem
      - Describe each catalog individually
      - Reasoner determines how to join them
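
The O(n²) versus O(n) argument can be made concrete with a toy sketch. Each catalog is described once, by mapping its columns to shared concept labels, and a very simple stand-in for a reasoner finds the joinable columns for any pair. The catalog names, column names, and concept labels are invented for this example and are not taken from UCD, VSTO, or SWEET.

```python
# Hypothetical per-catalog descriptions: column name -> shared concept label.
DESCRIPTIONS = {
    "flares": {"t_start": "time.start", "noaa_ar": "active_region.number"},
    "cmes": {"first_seen": "time.start", "pa": "position_angle"},
    "regions": {"ar_num": "active_region.number", "obs_date": "time.start"},
}


def join_columns(left: str, right: str) -> list[tuple[str, str]]:
    """Return (left column, right column) pairs that describe the same concept."""
    right_by_concept = {concept: col for col, concept in DESCRIPTIONS[right].items()}
    return sorted(
        (col, right_by_concept[concept])
        for col, concept in DESCRIPTIONS[left].items()
        if concept in right_by_concept
    )


def pairwise_mappings(n: int) -> int:
    """Number of hand-written catalog-to-catalog mappings needed for n catalogs."""
    return n * (n - 1) // 2
```

For 10 catalogs, hand-built correlation needs `pairwise_mappings(10)`, or 45, separate mappings, against 10 individual descriptions. Matching on a shared label is only a starting point: as the previous slide notes, two columns with the same unit or label are not necessarily directly comparable, which is where richer ontology relationships (such as conversions) would come in.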

19 For the Future
  - Virtual Solar Observatory
      - Ingesting solar-related catalogs so they can be served via an API for other projects
      - Being tested by HelioViewer
  - Heliophysics Event List Manager
      - Define API requirements for catalogs
      - Deal with cross-catalog issues

20 http://virtualsolar.org/ joseph.a.hourcle@nasa.gov

22 And now …
  - For those playing along at home:
      - Look at the example image
      - See how many things you can find that might be a problem with the catalog
      - Go to the next slide for a list of (some of) the issues


