Published by Shon Doyle. Modified over 9 years ago.
1
Virtual Observatories and Data Interfaces for Atmospheric Science
12th EISCAT International Workshop
Incoherent Scatter Radar School
Swedish Institute of Space Physics, Space Campus, Kiruna, Sweden
Bill Rideout
MIT Haystack, Route 40, Westford, MA, USA
1-781-981-5624
brideout@haystack.mit.edu
26 August 2005
2
Outline Virtual Observatories Madrigal –Web interface –Remote API –Extending/Contributing –Madrigal 2.4 Cedar Database Other data sources
3
A day in the life of an Atmospheric Scientist I have done an experiment with my instrument, but now I need to … –Search numerous websites for data –Figure out their parameters, units –Figure out their coordinate system, date format –Figure out how to determine data quality –Write code to download data, or (worse) manually download –Write code to convert to your format –Finally, do science
4
How can Virtual observatories help? Virtual Observatories – one stop data shopping!
5
Virtual Observatories
Ideally…
–Provides a single interface to access all data
–Knows about all data sources
–Allows simple, powerful searches to discover unknown data sources
–Always gets the most up-to-date data
–Uses a single set of well-defined parameters
–Provides data in consistent format(s)
–Provides data in consistent coordinates
–Informs user of contact information and rules-of-the-road for all data
6
Two approaches Top down Bottom up Build an interface Build a standard data source
7
How do they work?
Top-down approach:
–Accept that all data sources will be forever incompatible
–Build a data model so metadata can be shared
–Build a unique interface to each new data source
–Effort scales linearly with the number of data sources
–Works best with more uniform data (e.g., astronomical images)
Bottom-up approach:
–Standardize data format and semantics
–Standardize data provider API
–Approach taken by Madrigal/Cedar
–Try for community acceptance
8
Outline Virtual Observatories Madrigal –Web interface –Remote API –Extending/Contributing –Madrigal 2.4 Cedar Database Other data sources
9
What is the Madrigal database?
An open-source, web-based database designed to hold one group’s data
–www.openmadrigal.org has all code and downloads
Built upon the Cedar database format established over 20 years ago
Fundamentally a data source – allows local owners to improve/correct their data
Designed to be used for a wide variety of instruments
New installations always welcome!
10
Madrigal Data Model Madrigal site (typically a facility with scientists and a Madrigal installation) ↓ Instruments (ground-based, typically with a set location) ↓ Experiments (typically of limited duration, with a single contact) ↓ Experiment Files (represents data from one analysis of the experiment) ↓ Records (measurement over one period of time) ↓ Data shared among all Madrigal sites Data unique to one Madrigal site
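The containment hierarchy above can be sketched as nested Python dataclasses. This only illustrates how the levels nest; the class names, field names, and sample values are assumptions made for this sketch and are not the classes used in the Madrigal code base.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Record:
    """One measurement over one period of time."""
    start: str
    end: str


@dataclass
class ExperimentFile:
    """Data from one analysis of an experiment."""
    name: str
    records: List[Record] = field(default_factory=list)


@dataclass
class Experiment:
    """Typically of limited duration, with a single contact."""
    name: str
    contact: str
    files: List[ExperimentFile] = field(default_factory=list)


@dataclass
class Instrument:
    """Ground-based, typically with a set location."""
    name: str
    experiments: List[Experiment] = field(default_factory=list)


@dataclass
class Site:
    """A facility with a Madrigal installation."""
    name: str
    instruments: List[Instrument] = field(default_factory=list)


def count_records(site: Site) -> int:
    """Walk the hierarchy from the site down to individual records."""
    return sum(
        len(f.records)
        for inst in site.instruments
        for exp in inst.experiments
        for f in exp.files
    )


# Hypothetical content, for illustration only
example_file = ExperimentFile(
    "file_a", [Record("12:30", "12:31"), Record("12:31", "12:32")]
)
example_exp = Experiment("Example run", "someone@example.org", [example_file])
site = Site("Example site", [Instrument("Example radar", [example_exp])])
print(count_records(site))
```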
11
Madrigal Records Records (measurement over one period of time) Three types: Catalog record –descriptive information about entire experiment Header record –descriptive information about one section of experiment Data record –Stores values –All parameters defined by Cedar Database standard –Contains 3 parts Prolog 1D records 2D records
12
Madrigal Data Records
Prolog
–Start and end time
–Instrument id
–Kind of data id
1D records (scalar)
–Single value parameters
2D records (vector)
–Multiple value parameters
–All parameters must have same number of rows
–Meant to allow multiple spatial measurements
–Not meant for time variation – conflicts with Prolog!
Example data record:
–Prolog
–1D (scalar) – S/N = 2.5
–2D (vector) – Altitudes = 100, 150, 200, 250, 300, 350
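A minimal sketch of the three-part data record, assuming a plain Python container rather than the real Cedar binary layout. The class name, the `sn` key, and the sample values are placeholders chosen for this illustration; the one rule it enforces is the one stated above, that every 2D parameter must have the same number of rows.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, List


@dataclass
class DataRecord:
    # Prolog: start and end time, instrument id, kind of data id
    start_time: datetime
    end_time: datetime
    kinst: int
    kindat: int
    # 1D part: one value per parameter
    one_d: Dict[str, float] = field(default_factory=dict)
    # 2D part: one list of values per parameter (for example, one per altitude)
    two_d: Dict[str, List[float]] = field(default_factory=dict)

    def nrow(self) -> int:
        """Return the common row count, or raise if the 2D parameters disagree."""
        lengths = {len(values) for values in self.two_d.values()}
        if len(lengths) > 1:
            raise ValueError("all 2D parameters must have the same number of rows")
        return lengths.pop() if lengths else 0


rec = DataRecord(
    start_time=datetime(2005, 3, 19, 12, 30),
    end_time=datetime(2005, 3, 19, 12, 31),
    kinst=30,
    kindat=3408,
    one_d={"sn": 2.5},  # placeholder key for the S/N value
    two_d={"gdalt": [100.0, 150.0, 200.0, 250.0, 300.0, 350.0]},
)
print(rec.nrow())
```

Because the prolog carries a single start and end time, a record like this describes one time interval; the rows of the 2D part are meant for spatial variation only.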
13
Cedar/Madrigal Database
All parameters in file defined
–http://cedarweb.hao.ucar.edu/documents/parameters_list.txt
Ranges of parameters for each instrument
Data stored in one or two 16 bit ints
–Additional increment parameters
Error parameters
–Mnemonics start with D
–Code is negative of parameter
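The error-parameter convention can be written down in a few lines. This is a sketch of the naming rule only; the function names are made up for this illustration, and the code value used in the example is illustrative and should be checked against the parameter list rather than taken from here.

```python
def error_mnemonic(mnemonic: str) -> str:
    """Error mnemonics are the parameter mnemonic prefixed with D."""
    return "D" + mnemonic.upper()


def error_code(parameter_code: int) -> int:
    """The error parameter's code is the negative of the parameter's code."""
    if parameter_code <= 0:
        raise ValueError("expected the positive code of a measured parameter")
    return -parameter_code


# Illustrative values only: a parameter TI with an assumed code of 550
print(error_mnemonic("Ti"), error_code(550))
```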
14
Cedar/Madrigal Database, continued
Special values
–missing
–assumed (error value only)
–knownbad (error value only)
Defined in
–http://cedarweb.hao.ucar.edu/cgi-bin/cedar_file_access.pl?filename=documents/cedar_fmt.pdf
15
Cedar Database parameters Example additional increment parameter
16
Cedar parameters - continued Madrigal contains many “derived only” parameters –Not included in Cedar standard –Cannot be stored in Cedar file New python API hides the existence of additional increment parameters –All values are doubles –Exceptions occur on overflow –More later…
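To make the idea of an additional increment concrete, here is a hedged sketch of splitting a double into two 16-bit integers and raising on overflow. The scale factors, rounding rule, and function names are assumptions for illustration; the actual Cedar encoding is defined in the format document and may differ in detail.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def split_value(value: float, scale: float, incr_scale: float):
    """Split value into a main 16-bit int and an additional-increment 16-bit int."""
    main = int(value / scale)              # coarse part, truncated toward zero
    remainder = value - main * scale       # what the coarse part could not hold
    incr = int(round(remainder / incr_scale))
    for part in (main, incr):
        if not INT16_MIN <= part <= INT16_MAX:
            raise OverflowError("value does not fit in 16-bit storage")
    return main, incr


def join_value(main: int, incr: int, scale: float, incr_scale: float) -> float:
    """Recombine the two stored integers into a single double."""
    return main * scale + incr * incr_scale


# Assumed scales: 1.0 for the main parameter, 1e-4 for the increment
main, incr = split_value(123.4567, 1.0, 1e-4)
restored = join_value(main, incr, 1.0, 1e-4)
print(main, incr, restored)
```

A caller that only ever sees `restored` never needs to know that two stored integers were involved, which is the kind of hiding the Python API is described as doing.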
17
Madrigal Derivation Engine Derived parameters appear to be in file Assumes information can be derived from records –Time from prolog –Position either as 1D or 2D –Other parameters Engine determines all parameters that can be derived
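One way to picture how an engine could determine every derivable parameter is a fixed-point loop over a list of methods, each with input and output parameters. This is a sketch under that assumption, not the engine's actual algorithm; the method list and the parameter names `slt`, `kp`, and `made_up` are used here purely as examples.

```python
def derivable_parameters(measured, methods):
    """Return the set of parameters that can be derived from the measured ones.

    methods is a list of (inputs, outputs) tuples. A method can run once all
    of its inputs are available; its outputs may then enable further methods.
    """
    available = set(measured)
    changed = True
    while changed:
        changed = False
        for inputs, outputs in methods:
            if set(inputs) <= available and not set(outputs) <= available:
                available |= set(outputs)
                changed = True
    return available - set(measured)


# Hypothetical method list, for illustration only
methods = [
    (("ut1", "glon"), ("slt",)),
    (("ut1", "ut2", "gdlat", "glon", "gdalt"), ("tsyg_eq_xgsm", "tsyg_eq_ygsm")),
    (("slt", "kp"), ("made_up",)),  # cannot run: kp is not available here
]
measured = {"ut1", "ut2", "gdlat", "glon", "gdalt"}
print(sorted(derivable_parameters(measured, methods)))
```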
18
Classes of derived parameters Space, time –Examples: Local time, shadow height Geophysical –Examples: Kp, Dst, Imf, F10.7 Magnetic –Examples: Bmag, Mag conjugate lat and long, Tsyganenko magnetic equatorial plane intercept MSIS –Examples: Tn, Nol
19
Outline Virtual Observatories Madrigal –Web interface –Remote API –Extending/Contributing –Madrigal 2.4 Cedar Database Other data sources
20
Madrigal web interface - homepage All Madrigal sites Access Data
21
Three ways to access Madrigal data Data in individual experiments Data across experiments Plot data across experiments
22
Searching for experiments By default, all Madrigal sites are searched By default, view only most recent files Choose one or more instruments Find any experiments with any overlap with these dates
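The "any overlap with these dates" rule is the standard interval-overlap test. A minimal sketch, assuming inclusive start and end times; the function name is an assumption and this is not taken from the Madrigal source.

```python
from datetime import datetime


def overlaps(exp_start, exp_end, search_start, search_end):
    """True if the experiment shares at least one instant with the search range."""
    return exp_start <= search_end and exp_end >= search_start


search = (datetime(1998, 1, 1), datetime(1998, 12, 31, 23, 59, 59))

# An experiment that starts before the range and ends inside it still matches
print(overlaps(datetime(1997, 12, 30), datetime(1998, 1, 2), *search))
# An experiment entirely after the range does not
print(overlaps(datetime(1999, 1, 5), datetime(1999, 1, 6), *search))
```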
23
Madrigal experiment listing These links could be to experiments at any site
24
Madrigal experiment files – part 1 These two files have no catalog or header records, otherwise there would be a link Data browser (isprint) allows viewing both measured and derived parameters with filtering
25
Madrigal experiment files – part 2 Madrigal allows any additional web-compatible files to be added to the experiment Image-conversion feature written at Eiscat Notes can be added by users – also written at Eiscat
26
Data browser (isprint) – part 1 Users can define filter sets that select certain filters and parameters with one click Filters to reduce data: Time, Altitude, Azimuth, Elevation, …
27
Data browser (isprint) – part 2 Filters, continued Filter data using any parameter, or the sum, difference, product or quotient of two parameters. Example: Nel - DNel > 1
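A filter on one parameter, or on the sum, difference, product, or quotient of two, can be sketched as a small predicate. The function name, argument layout, and sample numbers below are assumptions for illustration, and the bounds are treated as inclusive here; this is not the isprint implementation.

```python
import operator

OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def passes(row, parm1, lower=None, upper=None, op=None, parm2=None):
    """Check one row (a dict of parameter values) against an optional range."""
    value = row[parm1]
    if op is not None:
        value = OPS[op](value, row[parm2])
    if lower is not None and value < lower:
        return False
    if upper is not None and value > upper:
        return False
    return True


# Made-up rows, for illustration only
rows = [
    {"nel": 11.5, "dnel": 10.2},
    {"nel": 10.9, "dnel": 10.4},
]
kept = [r for r in rows if passes(r, "nel", lower=1.0, op="-", parm2="dnel")]
print(kept)
```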
28
Data browser (isprint) – part 3 Choose parameters to display Measured in bold Derived in normal font Listed by category Click on any parameter for a full description See a full description of all parameters
29
Data browser (isprint) – part 4 User clicked on CHISQ Some parameters have a more complete description
30
Data browser (isprint) – part 5 Longer description of CHISQ
31
Data browser (isprint) – part 6 Show header for each record option String to indicate missing data
32
Data output Display only text Save text version to file Summary of selected filters Headers were on in this example
33
Second approach – Global search Global search for data
34
Global search – part 1 Choose one or more instruments Choose date range (optional) Choose kinds of data Choose seasonal filter (optional)
35
Global search – part 2 Filter by experiment name (optional) Select parameters to display Filter using any parameter, just like isprint
36
Global search – selecting parameters Parameters with categories and pop-up definitions as on isprint page
37
Global search – review search Review all aspects of the global search before submitting
38
Global search – returned message Message returns the number of files being searched, along with a rough estimate of the time required. Since reports may take a long time to generate, an email with a link is sent when done
39
Third approach – Plotting across experiments Plotting data from various instruments across experiments
40
Creating plots Select one or more instruments. In this example Svalbard and Millstone (not visible) selected. Click here to see a list of all experiments Select date range (can cross experiment boundaries) Select a scatter plot or pcolor of altitude versus time
41
Choose single parameter to plot Same pop-up listing of parameters as in isprint Radio buttons, since only one parameter can be selected
42
Set up limits and filters Set limits on the parameter you selected If a pcolor plot, can set altitude limits Data can also be filtered using another parameter
43
Pcolor plot output Single request generates Millstone and Eiscat plots Plots are requested from each site simultaneously to improve performance Rules of the road for each site shown
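Requesting from several sites at once can be sketched with a thread pool. This assumes a caller-supplied `fetch` function standing in for whatever actually contacts a site; the function names and site labels are hypothetical, and the real plotting code may be organized differently.

```python
from concurrent.futures import ThreadPoolExecutor


def request_all(sites, fetch):
    """Call fetch(site) for every site concurrently and return {site: result}."""
    with ThreadPoolExecutor(max_workers=max(1, len(sites))) as pool:
        results = list(pool.map(fetch, sites))  # map preserves input order
    return dict(zip(sites, results))


def fake_fetch(site):
    """Stand-in for a remote request; a real version would contact the site."""
    return "plot from " + site


print(request_all(["site_a", "site_b"], fake_fetch))
```

With one worker per site, the total wait is roughly that of the slowest site rather than the sum of all of them.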
44
Pcolor plot output – part 2 Can add more stacked plots with different parameters, or start over
45
Adding additional plots Now add a scatter plot of DST with same time scale
46
Adding additional plots – part 2 Time scales align with stacked plots if times not changed
47
Outline Virtual Observatories Madrigal –Web interface –Remote API –Extending/Contributing –Madrigal 2.4 Cedar Database Other data sources
48
Remote Access to Madrigal Data Built on web services Like the web, available from anywhere on any platform Complete Matlab and Python API written More APIs available on request or via contribution
49
Madrigal Web Services Simple delimited output via CGI scripts Not based on SOAP or XML-RPC, since languages such as Matlab do not support them CGI arguments and output fully documented at http://www.haystack.edu/madrigal/remoteAPIs.html
50
Madrigal Web Services – part 2 To write a new API, each method must –Take input arguments and generate the correct CGI URL –Parse the delimited text –Return data to user
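The three steps above can be sketched in Python using only the standard library. The script name `exampleService.py`, the argument names, the comma delimiter, and the sample response text are all assumptions for illustration; the real script names, arguments, and output layout are those given in the remote API documentation.

```python
from urllib.parse import urlencode


def build_url(cgi_base, script, **args):
    """Step 1: turn input arguments into the CGI URL."""
    return cgi_base.rstrip("/") + "/" + script + "?" + urlencode(args)


def parse_delimited(text, delimiter=","):
    """Step 2: parse delimited text into a list of rows of string fields."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if line:  # skip blank lines
            rows.append([item.strip() for item in line.split(delimiter)])
    return rows


# Step 3 would return the parsed rows to the user; here they are just printed.
url = build_url("http://example.org/cgi-bin/", "exampleService.py", kinst=30, year=1998)
sample_response = "30,Instrument A\n\n31,Instrument B\n"  # made-up response text
print(url)
print(parse_delimited(sample_response))
```

Fetching the URL (for example with `urllib.request.urlopen`) is left out so that the sketch runs without network access.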
51
Matlab Remote API Methods –getInstrumentsWeb –getExperimentsWeb –getExperimentFilesWeb –getParametersWeb –isprintWeb –madCalculatorWeb Methods match Madrigal model
52
Simple Matlab example

filename = '/usr/local/madroot/experiments/2003/tro/05jun03/NCAR_2003-06-05_tau2pl_60_uhf.bin';
eiscat_cgi_url = 'http://www.eiscat.se/madrigal/cgi-bin/';

% download the following parameters from the above file: ut, gdalt, ti
parms = 'ut,gdalt,ti';
filterStr = 'filter=gdalt,200,600 filter=ti,0,5000';

% returns a three dimensional array of double with the dimensions:
%
%   [Number of rows, number of parameters requested, number of records]
%
% If error or no data returned, will return error explanation string instead.

% Matlab Madrigal API call
data = isprintWeb(eiscat_cgi_url, filename, parms, filterStr);
53
Simple Matlab example, continued
In real code, higher-level methods would be used to search for the filename
Entire web interface could be built via remote calls
See http://madrigal.haystack.edu/madrigal/remoteMatlabAPI.html for complete documentation and more examples
54
Simple Python example

import madrigalWeb.madrigalWeb

# create the main object to get all needed info from Madrigal
madrigalUrl = 'http://www.haystack.mit.edu/madrigal'
testData = madrigalWeb.madrigalWeb.MadrigalData(madrigalUrl)

# get all MLH experiments in 1998
expList = testData.getExperiments(30, 1998,1,1,0,0,0, 1998,12,31,23,59,59)
for exp in expList:
    # print out all experiments
    print(exp)

# print list of all files in first experiment
fileList = testData.getExperimentFiles(expList[0].id)
for thisfile in fileList:
    print(thisfile)
55
Python Remote API
Similar methods to Matlab
Fully documented with examples
Used to implement plotting across multiple sites
Used by SuperDarn to constantly poll for real-time Millstone Hill data
See http://madrigal.haystack.edu/madrigal/remotePythonAPI.html for documentation and more examples
56
Outline Virtual Observatories Madrigal –Web interface –Remote API –Extending/Contributing –Madrigal 2.4 Cedar Database Other data sources
57
Extending/contributing to Madrigal
Madrigal is completely open source
See www.openmadrigal.org for CVS
All new code is C/Python, with some Tcl.
Extending the Madrigal derivation engine is simple
58
Extending the Madrigal derivation engine
Simply a list of methods with input Madrigal parameters and output Madrigal parameters
–int methodName(int inCount, double * inputArr, int outCount, double * outputArr, FILE * errFile)
Register parameters in list
Details at http://madrigal.haystack.edu/madrigal/extendingMaddata.html
59
Example – Tsyganenko parameters

/***********************************************************************
 * getTsygan derives field line crossing points using Tsyganenko model.
 *
 * arguments:
 *   inCount (num inputs) = 5 (UT1, UT2, GDLAT, GLON, GDALT)
 *   inputArr - double array holding:
 *     UT1 - UT at record start
 *     UT2 - UT at record end
 *     GDLAT - geodetic latitude
 *     GLON - geodetic longitude
 *     GDALT - geodetic altitude
 *   outCount (num outputs) = 4
 *   outputArr - double array holding:
 *     TSYG_EQ_XGSM - X GSM value where field line crosses GSM XY plane
 *     TSYG_EQ_YGSM - Y GSM value where field line crosses GSM XY plane
 *     TSYG_EQ_XGSE - X GSE value where field line crosses GSE XY plane
 *     TSYG_EQ_YGSE - Y GSE value where field line crosses GSE XY plane
 *
 * Algorithm: See Geopack_2003.f, T01_01.f
 * returns - 0 (successful)
 */
int getTsygan(int inCount, double * inputArr,
              int outCount, double * outputArr,
              FILE * errFile)
60
Outline Virtual Observatories Madrigal –Web interface –Remote API –Extending/Contributing –Madrigal 2.4 Cedar Database Other data sources
61
New features of Madrigal 2.4 Plotting (as demonstrated) Automatic updating of all geophysical data Capture of user name, email, organization –Web –Remote API Simple python class to create/edit Madrigal files Simple scripts/API to create experiments, add files, update metadata
62
Creating files with python – example

"""create a file with two data records"""

import datetime

import madrigal.metadata
import madrigal.cedar

################# sample data #################
kinst = 30      # instrument identifier of Millstone Hill ISR
modexp = 230    # id of mode of experiment
kindat = 3408   # id of kind of data processing
nrow = 5        # all data records have 5 2D rows

SYSTMP = (120.0, 122.0)
TFREQ = (4.4E8, 4.4E8)
GDALT = ((70.0, 100.0, 200.0, 300.0, 400.0),
         (70.0, 100.0, 200.0, 300.0, 400.0))
GDLAT = ((42.0, 42.0, 42.0, 42.0, 42.0),
         (42.0, 42.0, 42.0, 42.0, 42.0))
GLON = ((270.0, 270.0, 270.0, 270.0, 270.0),
        (270.0, 270.0, 270.0, 270.0, 270.0))
TR = (('missing', 1.0, 1.0, 2.3, 3.0),
      ('missing', 1.0, 1.7, 2.4, 3.1))
DTR = (('missing', 'assumed', 'assumed', 0.3, 0.7),
       ('missing', 'assumed', 0.7, 0.4, 0.5))
63
Creating files with python – part 2

newFile = '/tmp/testCedar.dat'

# create a new Madrigal file
cedarObj = madrigal.cedar.MadrigalCedarFile(newFile, True)

# create all data records - each record lasts one minute
startTime = datetime.datetime(2005, 3, 19, 12, 30, 0, 0)
recTime = datetime.timedelta(0,60)
for recno in range(2):
    endTime = startTime + recTime
    dataRec = madrigal.cedar.MadrigalDataRecord(kinst, kindat,
                                                startTime.year, startTime.month, startTime.day,
                                                startTime.hour, startTime.minute, startTime.second,
                                                startTime.microsecond // 10000,
                                                endTime.year, endTime.month, endTime.day,
                                                endTime.hour, endTime.minute, endTime.second,
                                                endTime.microsecond // 10000,
                                                ('systmp', 'tfreq'),
                                                ('gdalt', 'gdlat', 'glon', 'tr', 'dtr'),
                                                nrow)
64
Creating files with python – part 3

    # set 1d values
    dataRec.set1D('systmp', SYSTMP[recno])
    dataRec.set1D('tfreq', TFREQ[recno])

    # set 2d values
    for n in range(nrow):
        dataRec.set2D('gdalt', n, GDALT[recno][n])
        dataRec.set2D('gdlat', n, GDLAT[recno][n])
        dataRec.set2D('glon', n, GLON[recno][n])
        dataRec.set2D('tr', n, TR[recno][n])
        dataRec.set2D('dtr', n, DTR[recno][n])

    # append new data record
    cedarObj.append(dataRec)

    startTime += recTime

# write new file
cedarObj.write()
65
Editing files with python

"""increases all values of Ti by 20%"""

import madrigal.metadata
import madrigal.cedar

orgFile = '/opt/madrigal/experiments/1998/mlh/20jan98/mil980120g.003'
newFile = '/tmp/mil980120g.003'

# read the Madrigal file into memory
cedarObj = madrigal.cedar.MadrigalCedarFile(orgFile)

# loop through each record, increasing all Ti values by a factor of 1.2
for record in cedarObj:
    # skip header and catalog records
    if record.getType() == 'data':
        # loop through each 2D row
        for row in range(record.getNrow()):
            presentTi = record.get2D('Ti', row)
            # make sure it's not a special string value, e.g. 'missing'
            if not isinstance(presentTi, str):
                record.set2D('Ti', row, presentTi*1.2)

# write edited file
cedarObj.write('Madrigal', newFile)
66
Python File creation/editing - summary
Creates, edits catalog, header, data records
Hides details of Cedar file formats
–Various flavors of file format
–Use of 16 bit integers to store data
–Use of “additional increment” parameters
See http://madrigal.haystack.edu/madrigal/pythonCedarTutorial.html for complete documentation
67
Outline Virtual Observatories Madrigal –Web interface –Remote API –Extending/Contributing –Madrigal 2.4 Cedar Database Other data sources
68
Cedar Database Outgrowth of the Madrigal Database A central repository –Data persistence –Wider variety of data Has model result/tools Wider variety of output formats Data not as actively updated Does not (yet) derive parameters Does not separate data by experiment See http://cedarweb.hao.ucar.edu/documents/dbexamples.html
69
Cedar – simple example Click on Data Services Click on Get/Plot Data
70
Cedar – select instrument Select instrument
71
Cedar instrument – part 2 Select instrument
72
Cedar date – part 1 Select year In the next three pages you are selecting a starting day. UI is designed to ensure that only a date with data can be selected.
73
Cedar date – part 2 Select month
74
Cedar date – part 3 Select starting day Select number of days to view
75
Cedar output format Choose output format Data filtering available (optional)
76
Cedar TAB output TAB format By default, shows all measured parameters
77
Cedar Database – for more info
More complex examples at http://cedarweb.hao.ucar.edu/documents/dbexamples.html
Contacts:
–Barbara Emery (emery@ucar.edu)
–Jose Garcia (jgarcia@ucar.edu)
78
Outline Virtual Observatories Madrigal –Web interface –Remote API –Extending/Contributing –Madrigal 2.4 Cedar Database Other data sources
79
Arecibo database Simple interface focused on their data http://www.naic.edu/aisr/database/html/framedoc.html Site-specific Easy to use
80
Virtual Solar Observatory UI allows filtering of data. Based on uniform data model.
81
Virtual Space Physics Observatory Based on SPASE data model Development slowed by budget issues
82
National Geophysical Data Center (NOAA) – Solar Terrestrial Physics SPIDR provides a Virtual-Observatory-like interface to many of the datasets See SPIDR tutorial at http://spidr.ngdc.noaa.gov/spidr/tutorial.do
83
Many other data sources
World Data Center
–http://www.ngdc.noaa.gov/wdc/wdcmain.html
Canadian Space Science Data Portal
–http://www.ssdp.ca/
NASA, ESA satellite sites
Magnetometer arrays
–http://www-ssc.igpp.ucla.edu/gem/worldmag/index.html
NASA's Space Physics Data Facility
–http://spdf.gsfc.nasa.gov/
And many more…
84
Summary Virtual Observatory concept beginning to influence data gathering Future success may depend on standardization Submit suggestions, or write improvements to Madrigal –www.openmadrigal.org –openmadrigal-users@openmadrigal.org