Published by Chastity Dawson. Modified over 8 years ago.
1
XML and related technologies tutorial
Developed using material at http://www.w3schools.com/xml
2
Topics covered
XML (.xml) - describe data
XML Schema (.xsd) - validate data
XPath - navigate data
XSLT (.xsl) - transform data
3
XML (.xml) - describe data
XML stands for Extensible Markup Language.
XML is a markup language much like HTML.
XML was designed to describe data.
XML tags are not predefined. You must define your own tags.
XML uses an XML Schema (.xsd) to validate the data (or a Document Type Definition (.dtd), ...).
XML with an XML Schema is designed to be self-descriptive.
XML can be used to create other XML-based languages.
4
XML pros/cons
Format
Pro - Data and data description tags are written in text, so they are portable and do not depend on proprietary formats or conversion processes.
Con - Because data is described verbosely, large datasets (e.g. model outputs) or binary formats (e.g. images) can be poor candidates for pure XML adoption. The common solution is to leave these types of data in their raw formats and use some XML to describe the useful metadata of the file (observation type, temporal/spatial range, ...).
Structure
Pro - XML structure can be easily extended by adding elements/attributes as needed.
Con - The initial XML structure has to be decided from the application's use of the data, which can vary widely.
5
XML Syntax
<note id="100">
<to>John</to>
<from>Jane</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
<note> is the 'root' element of the document. <note> is the parent of <to>, <from>, <heading> and <body>; each of these is a child element of <note> and a sibling of the others.
id="100" is an attribute of the <note> element. Attribute use should be limited, but is generally considered OK when referring to element metadata.
All elements must have a closing tag and be properly nested.
Tags are case sensitive.
Attribute values must be quoted.
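The syntax rules on this slide can be checked mechanically with any XML parser. The following is a minimal sketch using Python's standard library; the opening `<note id="100">` tag and the `<body>` wrapper are assumed here (only the closing `</note>` tag and the `id="100"` attribute survive in the slide text), and `is_well_formed` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Sample document. The <note id="100"> and <body> tags are assumptions;
# the slide text only preserves </note> and the id="100" attribute.
NOTE = """<note id="100">
<to>John</to>
<from>Jane</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>"""


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


root = ET.fromstring(NOTE)
child_tags = [child.tag for child in root]  # children of the root, in order

print(root.tag, root.get("id"), child_tags)
print(is_well_formed("<note><to>John</note>"))  # missing </to>
print(is_well_formed("<Note></note>"))          # tags are case sensitive
print(is_well_formed("<note id=100></note>"))   # attribute value not quoted
```

Each of the three malformed strings breaks one rule from the slide (closing tags and nesting, case sensitivity, quoted attribute values), so the parser rejects all of them.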
6
Element Naming
XML elements must follow these naming rules:
Names can contain letters, numbers, and other characters.
Names must not start with a number or punctuation character (a leading underscore is allowed).
Names must not start with the letters xml (or XML or Xml ...).
Names cannot contain spaces (substitute an underscore (_) instead) and should not use the colon (:) or dash (-) characters.
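The naming rules above can be sketched as a small checker. This is an illustration of the slide's simplified rules only, not the full Name production of the XML specification; `check_element_name` is a hypothetical helper.

```python
def check_element_name(name):
    """Return a list of rule violations; an empty list means the name passes.

    Implements only the simplified rules from the slide, not the complete
    XML 1.0 grammar for names.
    """
    if not name:
        return ["name is empty"]
    problems = []
    if " " in name:
        problems.append("contains a space (use an underscore instead)")
    if not (name[0].isalpha() or name[0] == "_"):
        problems.append("starts with a number or punctuation character")
    if name.lower().startswith("xml"):
        problems.append("starts with the reserved letters 'xml'")
    if ":" in name or "-" in name:
        problems.append("uses a colon or dash (discouraged)")
    return problems


for candidate in ["wind_speed", "2nd_sensor", "XmlData", "sea surface", "sea-surface"]:
    print(candidate, "->", check_element_name(candidate) or "ok")
```

Returning a list rather than a boolean lets a caller distinguish hard errors (spaces, leading digits) from the merely discouraged colon and dash.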
7
XML Schema (.xsd) - validate data
An XML Schema:
defines elements that can appear in a document
defines attributes that can appear in a document
defines which elements are child elements
defines the order of child elements
defines the number of child elements
defines whether an element is empty or can include text
defines data types for elements and attributes
defines default and fixed values for elements and attributes
8
(XML Schema example; the markup was lost in extraction, leaving only the sample values John and Smith.)
Several elements can refer to the same complex type.
9
XPath - navigate data
XPath is a syntax for defining parts of an XML document.
XPath uses path expressions to navigate in XML documents.
XPath contains a library of standard functions.
XPath is a major element in XSLT.
10
#file obs_system.xml
(XML markup lost in extraction. Surviving values, in order: no, yes, bar, 1050, 900, yes, yes, bar, 1050, 900)
11
#!perl
# Print the <online> status of platform SUN2 from obs_system.xml.
use strict;
use warnings;
use XML::XPath;

my $xp = XML::XPath->new(filename => 'obs_system.xml');

# findnodes() returns every node that matches the XPath expression.
foreach my $element ($xp->findnodes('/system/platform[@id="SUN2"]/online')) {
    print $element->string_value() . "\n";
}
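For readers without the Perl XML::XPath module, the same lookup can be sketched with Python's standard library, which supports a subset of XPath. The document below is an assumption: only the path /system/platform[@id="SUN2"]/online comes from the slide, while the SUN1 platform and the exact layout are made up for illustration, and `online_status` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for obs_system.xml. Only the element names on the
# path /system/platform[@id="SUN2"]/online come from the slide.
OBS_SYSTEM = """<system>
  <platform id="SUN1">
    <online>no</online>
  </platform>
  <platform id="SUN2">
    <online>yes</online>
  </platform>
</system>"""


def online_status(xml_text, platform_id):
    """Return the text of every <online> element for the given platform id."""
    root = ET.fromstring(xml_text)
    # ElementTree paths are evaluated relative to the root element (<system>),
    # so the leading /system step of the Perl expression is dropped.
    path = f"./platform[@id='{platform_id}']/online"
    return [node.text for node in root.findall(path)]


print(online_status(OBS_SYSTEM, "SUN2"))
```

The `[@id='...']` predicate selects a platform by attribute value, which is the same filtering step the Perl expression performs.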
12
XSLT (.xsl) - transform data
XSL stands for Extensible Stylesheet Language.
XSLT stands for XSL Transformations.
XSLT is the most important part of XSL.
XSLT transforms an XML document into another XML document or into other output such as plain text (see xsl:output method="text").
XSLT uses XPath to navigate in XML documents.
13
(XML input for the transform; the markup was lost in extraction. Surviving values of the first record: Empire Burlesque, Bob Dylan, USA, Columbia, 10.90, 1985 ...)
14
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
(Template markup lost in extraction. Surviving text: the heading "My CD Collection" and the column labels "Title" and "Artist".)
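Python's standard library has no XSLT processor, so the sketch below is not XSLT. It only imitates by hand what a stylesheet with text output would do: walk the input with path expressions and emit a Title/Artist listing. The element names (catalog, cd, title, artist and so on) are assumptions, since only the values survive in the slide text, and `catalog_to_text` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Element names are assumed; the values are the ones shown on the slide.
CATALOG = """<catalog>
  <cd>
    <title>Empire Burlesque</title>
    <artist>Bob Dylan</artist>
    <country>USA</country>
    <company>Columbia</company>
    <price>10.90</price>
    <year>1985</year>
  </cd>
</catalog>"""


def catalog_to_text(xml_text):
    """Render the catalog as tab-separated text: a heading, a header row,
    then one Title/Artist row per <cd> element."""
    root = ET.fromstring(xml_text)
    lines = ["My CD Collection", "Title\tArtist"]
    for cd in root.findall("cd"):
        lines.append(f"{cd.findtext('title')}\t{cd.findtext('artist')}")
    return "\n".join(lines)


print(catalog_to_text(CATALOG))
```

The loop over `root.findall("cd")` plays the role that a for-each over the records would play in a stylesheet; a real XSLT engine would take that logic from the .xsl file instead of from code.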
16
Salinity Workshop
Possible web services which return data
IOOS XML Schema for data returned
Salty Slim - more documentation at http://twiki.sura.org/twiki/bin/view/Main/SalinityWorkshop
The same set of possible web services could also be used by generalized observing systems (NEON or GEOSS) or as an example for USGS, NDBC, OBIS, etc. web services.
17
Possible web services
## tell me the services you offer
# GetCapabilities
returns a list of the available methods with their associated input variables and outputs.
## give me reference handles to the platforms, their position and what they collect
# GetPlatformList
returns a list of platform_ids with their associated geographic position and the observation types (standard_names) they collect.
18
#####
## give me all the data
# GetLatest(optional: platform_id/bounding_box)
returns a list of platform_ids and their corresponding latest observations (optionally for a specific platform_id or within the selected geographic bounding_box).
## give me just the observations requested
# GetLatestByObservation(observation_standard_name[list?], optional: platform_id/bounding_box)
returns a list of platform_ids and only the selected observation[list?] for the latest data (optionally for a specific platform_id or within the selected geographic bounding_box).
#####
## instead of just the latest data, give me data for the specified date range
## give me all the data
# GetByDateRange(optional: platform_id/bounding_box, start_datetime, end_datetime)
returns a list of platform_ids and observations for data within the date range (optionally for a specific platform_id or within the selected geographic bounding_box).
## give me just the observations requested
# GetByDateRangeByObservation(observation_standard_name[list?], optional: platform_id/bounding_box, start_datetime, end_datetime)
returns a list of platform_ids and only the selected observation[list?] for data within the date range (optionally for a specific platform_id or within the selected geographic bounding_box).
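The filtering behaviour described for GetLatest and GetLatestByObservation can be sketched over an in-memory list. Everything below is an assumption made for illustration: the `Platform` class, the function names, the dictionary return shape, and the bounding_box order (min_lon, min_lat, max_lon, max_lat). The platform ids, positions and values are taken from the example slides in this deck.

```python
from dataclasses import dataclass, field


@dataclass
class Platform:
    # Hypothetical record type; the slides do not define a data model.
    platform_id: str
    latitude: float
    longitude: float
    latest: dict = field(default_factory=dict)  # standard_name -> latest value


def in_bbox(platform, bbox):
    """bbox is assumed to be (min_lon, min_lat, max_lon, max_lat)."""
    min_lon, min_lat, max_lon, max_lat = bbox
    return (min_lon <= platform.longitude <= max_lon
            and min_lat <= platform.latitude <= max_lat)


def get_latest(platforms, platform_id=None, bounding_box=None):
    """Latest observations per platform, optionally filtered."""
    result = {}
    for p in platforms:
        if platform_id is not None and p.platform_id != platform_id:
            continue
        if bounding_box is not None and not in_bbox(p, bounding_box):
            continue
        result[p.platform_id] = dict(p.latest)
    return result


def get_latest_by_observation(platforms, standard_names,
                              platform_id=None, bounding_box=None):
    """Same as get_latest, but keep only the requested standard_names."""
    wanted = set(standard_names)
    latest = get_latest(platforms, platform_id, bounding_box)
    return {pid: {name: value for name, value in obs.items() if name in wanted}
            for pid, obs in latest.items()}


PLATFORMS = [
    Platform("ndbc_41004", 32.50, -79.09, {"sea_surface_temperature": 20.6}),
    Platform("carocoops_CAP1_wls", 32.86, -79.71,
             {"air_pressure": 1015.97, "air_temperature": 26.98}),
]

print(get_latest(PLATFORMS, platform_id="ndbc_41004"))
print(get_latest_by_observation(PLATFORMS, ["air_pressure"]))
```

The two date-range methods would follow the same pattern with an extra time filter; they are left out because the slides do not say how timestamps are stored.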
19
Salty Slim - more documentation at http://twiki.sura.org/twiki/bin/view/Main/SalinityWorkshop
<ioos_data
xmlns="http://localhost/xml_schema"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://10.203.10.47/xml_schema ioos_sst.xsd">
(Child element tags lost in extraction. Surviving values, in order: ndbc, http://www.ndbc.noaa.gov, NameOfReferenceSystem, meters, NGVD88, NAD83)
20
<platform>
(Child element tags lost in extraction. Surviving values, in order: 41004, ndbc_41004, fixed_point, 32.50, -79.09, 2.5)
<url>http://www.ndbc.noaa.gov/station_page.php?station=41004</url>
<fgdc_metadata_url>http://url_to_metadata</fgdc_metadata_url>
<opendap_url>http://url_to_opendap</opendap_url>
(Tags lost in extraction. Surviving values, in order: http://url_to_qc_documentation, 10)
21
(Observation element; tags lost in extraction. Surviving values, in order: sstPrimary, ndbc_41004_sstPrimary, sea_surface_temperature, degree_Celsius, 30, 20.6, 20.4)
22
*fixed_profile (wls, adcp, ctd) - with free_depth represented on a per attribute basis. The corresponding fixed element (name lost in extraction) is omitted.
(Example markup lost in extraction. Surviving values: 20.6, 21.6)
*fixed_depth (ships, floaters) - with free_latitude, free_longitude represented on a per attribute basis. The two corresponding fixed elements (names lost in extraction) are omitted.
(Example markup lost in extraction. Surviving values: 20.6, 21.6)
*free (subs, tagged species) - 'free_latitude', 'free_longitude', 'free_depth' represented on a per attribute basis. The three corresponding fixed elements (names lost in extraction) are omitted.
(Example markup lost in extraction. Surviving values: 20.6, 21.6)
Technically speaking, everything could be represented in the 'free' type format, but it might be useful from a metadata/processing standpoint to know what the collection type is.
24
#the following URL (which is the same as 'GetLatest') supports the Carolinas coast website latest observations
http://nautilus.baruch.sc.edu/wfs/seacoos_in_situ?SERVICE=WFS&VERSION=1.0.0&REQUEST=GETFEATURE&BBOX=-91.5,22,-71.5,36.5&typename=latest_in_situ_obs
#returns the following XML document (only one platform listing shown)
(XML markup lost in extraction. Surviving coordinates: overall bounds -91.320000,22.030000 -72.230000,36.480000; platform position -79.710000,32.860000)
25
(Platform listing continued; tags lost in extraction. Surviving values, in order:)
carocoops_CAP1_wls
2005-11-14 08:00:00
2005-11-14 09:54:00
2005-09-16 17:54:00
1015.97 mb @ 3m
http://nautilus.baruch.sc.edu/portal_rs/query_details_air_pressure_in_situ.phtml?hour_range=24&station_id=carocoops_CAP1_wls&lon=-79.71&lat=32.86&air_pressure_table=air_pressure_prod&archive_flag=&time_stamp=2005_09_16_17_54_00&pressure_units=MB
30.00 in Hg (0 deg C) @ 3m
http://nautilus.baruch.sc.edu/portal_rs/query_details_air_pressure_in_situ.phtml?hour_range=24&station_id=carocoops_CAP1_wls&lon=-79.71&lat=32.86&air_pressure_table=air_pressure_prod&archive_flag=&time_stamp=2005_09_16_17_54_00&pressure_units=INCHES_MERCURY
2005-09-17 03:54:00
26.98 deg C @ 3m
http://nautilus.baruch.sc.edu/portal_rs/query_details_air_temperature.phtml?hour_range=24&station_id=carocoops_CAP1_wls&lon=-79.71&lat=32.86&air_temperature_table=air_temperature_prod&archive_flag=&time_stamp=2005_09_17_03_54_00&degree_units=C
80.56 deg F @ 3m
http://nautilus.baruch.sc.edu/portal_rs/query_details_air_temperature.phtml?hour_range=24&station_id=carocoops_CAP1_wls&lon=-79.71&lat=32.86&air_temperature_table=air_temperature_prod&archive_flag=&time_stamp=2005_09_17_03_54_00&degree_units=F
......
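The WFS request URL shown above, just before this platform listing, is an ordinary query string, so it can be assembled from its parts rather than typed by hand. A minimal sketch follows; the base URL and parameter names are taken from the slide, while `build_getfeature_url` is a hypothetical helper and the bounding box order (min_lon, min_lat, max_lon, max_lat) is inferred from the values in the slide's URL.

```python
from urllib.parse import urlencode

BASE_URL = "http://nautilus.baruch.sc.edu/wfs/seacoos_in_situ"


def build_getfeature_url(bbox, typename="latest_in_situ_obs"):
    """Build the WFS GetFeature request URL for a bounding box.

    bbox is assumed to be (min_lon, min_lat, max_lon, max_lat).
    """
    params = {
        "SERVICE": "WFS",
        "VERSION": "1.0.0",
        "REQUEST": "GETFEATURE",
        "BBOX": ",".join(str(value) for value in bbox),
        "typename": typename,
    }
    # safe="," keeps the commas in BBOX readable instead of encoding them as %2C
    return BASE_URL + "?" + urlencode(params, safe=",")


url = build_getfeature_url((-91.5, 22, -71.5, 36.5))
print(url)
```

Building the URL from a dictionary avoids the stray spaces and line breaks that creep in when long query strings are copied between documents.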
26
SEACOOS XML Services
http://nautilus.baruch.sc.edu/twiki_dmcc/bin/view/Main/CodeRepositorySeacoosXMLServices
Using an XML descriptor file to describe ASCII column-oriented data for later processing
Web forms simplify the process of creating the needed XML
A service currently exists for fixed point data which converts ASCII to SEACOOS netCDF
Data scout currently converts netCDF to SQL for relational database population, but future efforts may skip the netCDF step entirely
27
time,wind_speed,wind_from_direction,sea_surface_temperature
2004-10-22 14:00:00+00_SEP_5.0_SEP_120.0_SEP_12.0
2004-10-22 15:00:00_SEP_6.0_SEP_125.0_SEP_13.0
2004-10-22 16:00:00_SEP_7.0_SEP_130.0_SEP_14.0
2004-10-22 17:00:00_SEP_8.0_SEP_135.0_SEP_15.0
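The sample above pairs a comma-separated list of column names with data rows delimited by the _SEP_ token. A minimal sketch of reading such rows into records follows; `parse_sep_rows` is a hypothetical helper, and treating every column after the first as a float is an assumption based on the sample values.

```python
def parse_sep_rows(header_line, data_lines, sep="_SEP_"):
    """Turn delimited rows into dicts keyed by the column names.

    The first column (time) is kept as a string; the remaining columns are
    assumed to be numeric and converted to float.
    """
    columns = [name.strip() for name in header_line.split(",")]
    records = []
    for line in data_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        fields = line.split(sep)
        if len(fields) != len(columns):
            raise ValueError(f"expected {len(columns)} fields, got {len(fields)}: {line!r}")
        record = {columns[0]: fields[0]}
        for name, value in zip(columns[1:], fields[1:]):
            record[name] = float(value)
        records.append(record)
    return records


HEADER = "time,wind_speed,wind_from_direction,sea_surface_temperature"
ROWS = [
    "2004-10-22 15:00:00_SEP_6.0_SEP_125.0_SEP_13.0",
    "2004-10-22 16:00:00_SEP_7.0_SEP_130.0_SEP_14.0",
]

records = parse_sep_rows(HEADER, ROWS)
print(records[0])
```

A multi-character separator such as _SEP_ is unlikely to occur inside a timestamp or a number, which is presumably why it is used instead of a comma or a space.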
29
(XML descriptor file; most markup lost in extraction. Surviving content, in order:)
CF-1.0
SEACOOS-NETCDF-2.0
SEACOOS-XML-1.0
<!-- format_category list
[fixed-point,fixed-profiler,fixed-map,moving-point-2D,moving-point-3D,moving-profiler]
-->
fixed-point
Jeremy Cothran (jcothran@carocoops.org)
Baruch Institute, University of South Carolina at Columbia
http://carocoops.org
carocoops
CAP2
buoy
30
(Descriptor file continued; markup lost in extraction. Surviving values, in order:)
http://trident.baruch.sc.edu/storm_surge_data/latest
2
_SEP_
31
<dependent_variables>
(Child markup lost in extraction. Surviving values, in order: 2, wind_speed, m s-1, 3.0)