DateADASS
How to Navigate VO Datasets Using VO Protocols
Ray Plante (NCSA/UIUC), Thomas McGlynn and Eric Winter (NASA/GSFC)
THE US NATIONAL VIRTUAL OBSERVATORY
Summary
Data Discovery: Using the VO Registry
Data Recovery:
–Protocols
–VOTable
–UCDs
Example Using the Registry
    use SOAP::Lite;                               # Requires the Perl SOAP library (SOAP::Lite).
    my $soap = SOAP::Lite                         # Locate the SOAP service
        -> uri('...')
        -> on_action( sub { join '/', '...', $_[1] } )
        -> proxy('...');
    my $method = SOAP::Data->name('QueryRegistry')    # The method to invoke
        ->attr({xmlns => '...'});
    # Specify the parameters of the method call
    my @params = (SOAP::Data->name("predicate" =>
        "ServiceType LIKE 'SIAP%' and ContentLevel='Research'") );
    my $result = $soap->call($method => @params);     # Query the remote service.
    ...
    # Loop over the results
    foreach ($result->valueof('//SimpleResource')) {
        ...
        if ($$_{ServiceType} eq "SIAP/Cutout") {
            # ... Handle a cutout service
        } elsif (($$_{ServiceType} eq "SIAP/Archive") && ($$_{Title} eq $cxc)) {
            # ... Handle an Archive service
        } else {
            # ... Default
        }
    }
Registry usage issues
Straightforward, fast and flexible access.
Using the Registry as a Web service:
–SOAP must be installed for the environment to be used.
–The interface is not yet standardized, so details of the specific implementation are exposed.
–Queries are SQL-style (i.e., an SQL WHERE clause). The ultimate syntax may use something more like XQuery.
–Cryptic magic in some calls needs to be done properly (e.g., the on_action argument in the constructor). Users need to copy from working examples.
Content of the Registry is still in some flux:
–Detailed and final specification of service metadata.
–Hierarchical database issues.
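Since the registry query is an SQL WHERE clause passed as a string, it can help to assemble the predicate from separate conditions rather than typing it by hand. The following is a minimal sketch in core Perl, not part of any registry client: the helpers `sql_quote` and `build_predicate` are hypothetical names introduced here, and the only column names used are the two that appear in the example query (ServiceType and ContentLevel).

```perl
use strict;
use warnings;

# Hypothetical helper: quote a value as an SQL string literal,
# doubling any embedded single quotes.
sub sql_quote {
    my ($value) = @_;
    (my $escaped = $value) =~ s/'/''/g;
    return "'$escaped'";
}

# Hypothetical helper: join [column, operator, value] triples with "and".
sub build_predicate {
    my (@conditions) = @_;
    return join ' and ',
        map { "$_->[0] $_->[1] " . sql_quote($_->[2]) } @conditions;
}

my $predicate = build_predicate(
    [ 'ServiceType',  'LIKE', 'SIAP%'    ],
    [ 'ContentLevel', '=',    'Research' ],
);
print "$predicate\n";
```

The resulting string could then be passed as the "predicate" parameter of the registry call, assuming the registry accepts this WHERE-clause form.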
By querying the registry, we can easily obtain lists of various kinds of resources and use the associated metadata to organize them however we wish.
Protocols
Cone search provides access to anything that returns a table regarding a position.
–Object tables: lists of distinct astronomical objects
–Observation tables: lists of pointed observations
There is no standard link from observation tables to archival data yet, but dataset IDs may provide one.
SIAP Archives
–Users get static, often ‘rawish’, data (Chandra, ADIL).
–Many images may be returned from the same dataset (e.g., lots of Chandra images of a given field).
SIAP Services
–Users get data customized to their invocation (DPOSS, SkyView).
–Typically only one or a few images come from a given service, but several different services may be returned by the same SIAP server. E.g., SkyView returns images from many different surveys, but only one of each.
SIAP retrievals are a two-step process. The SIAP server is in essence a registry service listing the data available at a given location.
Cone Search and SIAP Examples
    BEGIN {
        # Avoid HTTP 1.1 chunking (Perl doesn’t like it!)
        $ENV{PERL_LWP_USE_HTTP_10} = 1;
    }
    use LWP::UserAgent;    # Standard Perl libraries from CPAN
    use URI::URL;
    use HTTP::Request;
    # ... Set up a base URL for the service in $url ...
    $url .= "POS=$ra,$dec&SIZE=$size";        # SIAP
    # $url .= "RA=$ra&DEC=$dec&SR=$size";     # Cone search
    my $u = URI::URL->new($url);
    my $req = HTTP::Request->new(GET => $url);
    my $ua = LWP::UserAgent->new();
    my $resp = $ua->request($req);
    # ... Process the response ...
Protocol Issues
SIAP and Cone Search are invoked almost identically for the minimal interface.
–Lots of additional capabilities may be available in SIAP, but very few are required to be supported by the server.
–Metadata queries use special forms. SR=0 in a Cone Search request asks for metadata on the returned table. The FORMAT=METADATA keyword is used to get metadata from SIAP.
Both return VOTables.
–In SIAP the VOTable describes the available images. SIAP may return multiple entries for the same image in different formats. Links between these entries are not standardized.
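The near-identical invocation of the two protocols, and their special metadata forms, can be made concrete with a small URL builder. This is a sketch in core Perl under stated assumptions: the base URLs are placeholders (example.org), the helper names are hypothetical, values are assumed to need no URL escaping, and the SIAP metadata request sends only the FORMAT=METADATA keyword because the slide does not say whether POS and SIZE must accompany it.

```perl
use strict;
use warnings;

# Hypothetical helper: append KEY=value pairs to a base URL that
# already ends in '?' or '&'. No URL escaping is attempted.
sub build_query {
    my ($base, @pairs) = @_;
    my @parts;
    while (my ($key, $value) = splice(@pairs, 0, 2)) {
        push @parts, "$key=$value";
    }
    return $base . join('&', @parts);
}

# Cone search: RA, DEC and search radius SR.
sub cone_url {
    my ($base, $ra, $dec, $sr) = @_;
    return build_query($base, RA => $ra, DEC => $dec, SR => $sr);
}

# Cone search metadata query: the same request with SR=0.
sub cone_metadata_url {
    my ($base, $ra, $dec) = @_;
    return cone_url($base, $ra, $dec, 0);
}

# SIAP: POS is "ra,dec" and SIZE is the region size.
sub siap_url {
    my ($base, $ra, $dec, $size) = @_;
    return build_query($base, POS => "$ra,$dec", SIZE => $size);
}

# SIAP metadata query (assumption: keyword sent on its own).
sub siap_metadata_url {
    my ($base) = @_;
    return build_query($base, FORMAT => 'METADATA');
}

my $cone_base = 'http://example.org/cone?';   # placeholder, not a real service
my $siap_base = 'http://example.org/siap?';   # placeholder, not a real service

print cone_url($cone_base, 83.63, 22.01, 0.2), "\n";
print cone_metadata_url($cone_base, 83.63, 22.01), "\n";
print siap_url($siap_base, 83.63, 22.01, 0.2), "\n";
print siap_metadata_url($siap_base), "\n";
```

Any of these URLs could then be fetched with LWP::UserAgent as in the preceding example slide.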
Reading VOTables
    use VOTable::Document;    # VOTable library.
    ...
    my $doc = VOTable::Document->new_from_string($xstring);
    my @votarr = $doc->get_votable();
    my $vot = $votarr[0];
    my @resources = $vot->get_resource();
    foreach my $res (@resources) {          # Loop over the resources in the VOTable
        my @tables = $res->get_table();
        foreach my $tab (@tables) {         # Loop over the tables within the Resource
            my $data = $tab->get_data();
            my $nRow = 0;
            if ($data) { $nRow = $data->get_num_rows(); }
            # Find RA/Dec columns
            my $ra  = $tab->get_field_position_by_ucd("POS_EQ_RA_MAIN");
            my $dec = $tab->get_field_position_by_ucd("POS_EQ_DEC_MAIN");
            my @fields = $tab->get_field();
            for (my $i = 0; $i < $nRow; $i += 1) {             # Loop over the rows within the table
                my @rowdata = $tab->get_row($i);
                for (my $j = 0; $j <= $#rowdata; $j += 1) {    # Loop over the columns within the row
                    my $element = $rowdata[$j];
                    # ... This is the row_i, column_j element in the table.
                }
            }
        }
    }
VOTable Issues
VOTables can be complex
–Most current tables are simple, but the ID attribute may be useful for complex VOTables.
–Need to handle arrays of resources and tables.
–Formats of SIAP and Cone search results are better constrained.
Streaming versus trees
–Most libraries support one paradigm easily and the other with some difficulty. Trees are easier but run into limits handling > 10^5 rows.
UCDs versus column names
–Protocols refer to UCDs but particular applications may require specific columns. Support for aggregate quantities (e.g., ra,dec->position) is likely in updates.
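To illustrate the streaming side of the streaming-versus-trees trade-off, the sketch below reads one table row at a time instead of building the whole document in memory. It is a deliberately simplified illustration in core Perl, not a VOTable parser: it assumes plain `<TR>`/`<TD>` elements with no attributes, namespaces or nesting, the sample data is invented, and `stream_rows` is a hypothetical name. A real application would use a proper streaming XML parser.

```perl
use strict;
use warnings;

# Invented sample data standing in for the TABLEDATA part of a VOTable.
my $votable_fragment = <<'XML';
<TABLEDATA>
<TR><TD>10.5</TD><TD>-20.1</TD></TR>
<TR><TD>11.0</TD><TD>-19.7</TD></TR>
<TR><TD>12.25</TD><TD>-18.3</TD></TR>
</TABLEDATA>
XML

# Hypothetical helper: call $callback once per row, holding only one
# row in memory at a time. Returns the number of rows seen.
sub stream_rows {
    my ($fh, $callback) = @_;
    local $/ = '</TR>';                  # read up to the end of each row
    my $count = 0;
    while (my $record = <$fh>) {
        next unless $record =~ /<TR>/;   # skip text after the last row
        my @cells = $record =~ m{<TD>(.*?)</TD>}gs;
        $callback->(\@cells);
        $count++;
    }
    return $count;
}

# An in-memory filehandle stands in for a network response or a file.
open my $fh, '<', \$votable_fragment
    or die "Cannot open in-memory filehandle: $!";
my $sum_ra = 0;
my $nrows  = stream_rows($fh, sub { $sum_ra += $_[0][0] });
close $fh;

print "rows=$nrows, sum of first column=$sum_ra\n";
```

Memory use here depends on the size of one row rather than the size of the table, which is the property that matters once tables grow past the row counts where tree-based libraries struggle.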
ClassX Correlation
[Diagram labels: Target Data; Remote catalogs A, B, C, D; Correlator; Results (VOTABLE). Callouts: "Defines the services to be queried or UCDs we are interested in. What fields are needed in the results?"; "Single query results"; "Join criteria and output filter".]
The ClassX cross-correlator uses small XML files to describe which VOTable-enabled services to query, which fields to extract, and how to combine information from multiple tables. With consistently defined protocols and output formats, only these small control files need to be changed to correlate tables from VizieR, the HEASARC and many other sites.
UCDs
The SIAP and Cone search protocols require that columns with certain UCDs be present.
–Position
–Links to the actual data file and its format, for SIAP
–These UCDs are pretty much the only thing you are guaranteed to get in the output.
UCDs may indicate appropriate candidates for cross-correlation.
UCD structure is likely to change in the near term.
–Modifiers like ‘main’, ‘error’
–UCDs for aggregate quantities
Use UCDs for column discovery (i.e., when the structure of the returned table is unknown); use column names for column query.
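The distinction between discovery by UCD and query by column name can be sketched with plain Perl data structures. This is an illustration under stated assumptions, not the VOTable library's API: the field list, column names and sample row are invented, and `column_by_ucd` and `column_by_name` are hypothetical helpers. The two position UCDs are the ones used in the VOTable reading example.

```perl
use strict;
use warnings;

# Invented field metadata, in column order, as it might be read
# from the FIELD elements of a returned table.
my @fields = (
    { name => 'obj_id', ucd => 'ID_MAIN'         },
    { name => 'ra_deg', ucd => 'POS_EQ_RA_MAIN'  },
    { name => 'de_deg', ucd => 'POS_EQ_DEC_MAIN' },
);

# Discovery: find a column by UCD when the table layout is unknown.
# Returns the column index, or undef if no field carries that UCD.
sub column_by_ucd {
    my ($fields, $ucd) = @_;
    for my $i (0 .. $#$fields) {
        my $field_ucd = $fields->[$i]{ucd};
        return $i if defined $field_ucd && $field_ucd eq $ucd;
    }
    return;
}

# Query: find a column by name when the table layout is already known.
sub column_by_name {
    my ($fields, $name) = @_;
    for my $i (0 .. $#$fields) {
        return $i if $fields->[$i]{name} eq $name;
    }
    return;
}

my @row = ('M31', 10.68, 41.27);    # invented sample row

my $ra_col  = column_by_ucd(\@fields, 'POS_EQ_RA_MAIN');
my $dec_col = column_by_ucd(\@fields, 'POS_EQ_DEC_MAIN');
my $id_col  = column_by_name(\@fields, 'obj_id');

print "id=$row[$id_col] ra=$row[$ra_col] dec=$row[$dec_col]\n";
```

Looking up by UCD keeps the code working across services that name their position columns differently, while the name lookup is appropriate once a specific table has been chosen.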
Summary
Use registries to find resources
–Example:
Use standard protocols to query resources
–Cone search :
–SIAP :
Descend the hierarchical structure of the VOTable
–VOTable specification :
–Libraries : Perl : Java : C/C++:
Use UCDs to find columns of interest.
–UCD info and tools: