Federated Network for Sharing Air Quality Data and Processing Services Center for Air Pollution Impact and Trend Analysis (CAPITA) Washington University,


1 Federated Network for Sharing Air Quality Data and Processing Services Center for Air Pollution Impact and Trend Analysis (CAPITA) Washington University, St. Louis, MO 63130 April 2005, rhusar@me.wustl.edu DRAFT Project Coordinators: Software Architecture: R. Husar Software Implementation: K. Höijärvi Data and Applications: S. Falke, R. Husar

2 DataFed Project at CAPITA DataFed Goal: To facilitate sharing of air quality data and processing services. DataFed Relationship to other Data Networks: There are several evolving data sharing federations (e.g. OGC, OPeNDAP). DataFed is intended to be a ‘socially well behaving’ federation: beyond coexistence, it actively seeks linking, co-evolution and fusion with other data federations. Through its Service Oriented Architecture, DataFed can link to OGC, OPeNDAP and other federations. The NASA REASoN Project and the NASA SEEDS Program are well suited for inter-federation linking, learning and co-evolution. Project Funding and Status: Funding from NSF (2001-4) and NASA (2004-9) for data-sharing middleware web-services infrastructure. Funding from CEC, EPA and RPOs to develop DataFed-based web applications for specific projects: NAM Emission, FASNET, CATT & Tools as services.

3 Project Background and Rationale Air Quality Data and Analysis: Shift from primary to secondary pollutants. Ozone and PM2.5 travel 500+ miles across state or international boundaries and their sources are not well established. New regulatory approach. Compliance evaluation is based on ‘weight of evidence’ and on tracking the effectiveness of controls. Shift from command & control to participatory management. Inclusion of federal, state, local, industry and international stakeholders. Information System Challenges: Broader user community. The information systems need to be extended to reach all the stakeholders (federal, state, local, industry, international). A richer set of data and analysis. Establishing causality, ‘weight of evidence’ and emissions tracking requires more data and air quality analysis. Opportunities: Rich AQ data availability. Abundant high-quality routine and research monitoring data from EPA, NASA, NOAA and other agencies are now available. New information technologies. Web-based sharing of data and tools now allows cooperation (sharing), coordination and communication among diverse groups.

4 The Air Quality Analyst's Challenge “The researcher cannot get access to the data; if he can, he cannot read them; if he can read them, he does not know how good they are; and if he finds them good he cannot merge them with other data.” Information Technology and the Conduct of Research: The User's View, National Academy Press, 1989. Thus, researchers encountering these resistances need help: A catalog of distributed data resources for easy data ‘discovery’. Uniform data coding and formatting for easy access, transfer and merging. A rich and flexible metadata structure to encode the knowledge about data. Powerful shared tools to access, merge and ‘analyze’ the data.

5 The ‘Decision Support System’ Processing of AQ data into ‘productive’ knowledge occurs through a value-chain. Currently, much of the decision support is done by human analysts and advisers. Information technologies could automate the Data-to-Information transformation. This would liberate more resources for Analyzing and Judging activities. More productive or ‘actionable’ knowledge would lead to better decision making. Diagram (after Taylor, 1987): Data -> Organizing (Grouping, Classifying, Formatting, Displaying) -> Information -> Analyzing (Separating, Evaluating, Interpreting, Synthesizing) -> Informing Knowledge -> Judging (Options, Quality, Advantages, Disadvantages) -> Productive Knowledge -> Deciding (Matching goals, Compromising, Bargaining, Deciding) -> Action. Examples: CIRA VIEWS, Langley IDEA, AQ Manager, WG Summary Rpt.

6 Data Flow & Processing in AQ Management AQ data arise from diverse sources, each having specific history, driving forces, formats, quality, etc. Data analysis, i.e. turning the raw data into ‘actionable’ knowledge, requires combining data from these sources. The three major data ‘processing’ operations (services) are filtering, aggregation and fusion. Diagram: Primary Data (Diverse Providers): AQ DATA (EPA Networks, IMPROVE Visibility, Satellite-PM Pattern); METEOROLOGY (Met. Data, Satellite-Transport, Forecast model); EMISSIONS (National Emissions, Local Inventory, Satellite Fire Locs). Data ‘Refining’ Processes: Filtering, Aggregation, Fusion. ‘Knowledge’ Derived from Data (AQ Management Reports): Status and Trends, AQ Compliance, Exposure Assess., Network Assess., Tracking Progress.
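The three operations named on this slide can be illustrated with a minimal sketch. The station names, dates, values, the negative "missing" sentinel and the function names below are all hypothetical; they are not taken from DataFed.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical AQ records: (station, day, PM2.5 value); values are made up.
aq_obs = [
    ("STL", "2005-04-01", 12.0),
    ("STL", "2005-04-02", 18.0),
    ("CHI", "2005-04-01", 20.0),
    ("CHI", "2005-04-02", -999.0),  # assumed sentinel for a missing value
]
# Hypothetical meteorology records: (station, day) -> wind speed.
met_obs = {
    ("STL", "2005-04-01"): 3.5,
    ("STL", "2005-04-02"): 1.2,
    ("CHI", "2005-04-01"): 5.0,
}

def filter_valid(records):
    """Filtering: drop records carrying a negative sentinel value."""
    return [r for r in records if r[2] >= 0]

def aggregate_by_station(records):
    """Aggregation: average the values for each station."""
    groups = defaultdict(list)
    for station, _day, value in records:
        groups[station].append(value)
    return {station: mean(values) for station, values in groups.items()}

def fuse(records, met):
    """Fusion: join air quality and meteorology on (station, day)."""
    return [(s, d, v, met[(s, d)]) for s, d, v in records if (s, d) in met]

valid = filter_valid(aq_obs)
station_means = aggregate_by_station(valid)
fused = fuse(valid, met_obs)
```

Each function takes and returns plain records, so the steps can be applied in any order or exposed separately as services.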

7 DataFed Data Model: Multidimensional Data Cube k j i

8 4D Geo-Environmental Data Cube (X, Y, Z, T) Environmental data represent measurements in the physical world, which has space (X, Y, Z) and time (T) as its dimensions. The specific inherent dimensions for geo-environmental data are: Longitude X, Latitude Y, Elevation Z and DateTime T. Additional dimensions may include parameters, pollutant source, etc. The need for finding, sharing and integrating geo-environmental data requires that data be coded in this multidimensional data space.

9 Environmental Data: Multi-Dimensional Data can be distributed over 1, 2, … n dimensions: 1 Dimensional (i), e.g. Time dimension; 2 Dimensional (i, j), e.g. Location & Time; 3 Dimensional (i, j, k), e.g. Location, Time & Parameter. Diagram labels: Data Granule, Data Space, View 1, View 2. Views are generally orthogonal slices through multidimensional data cubes. Spatial and temporal slices through the data space are most common.

10 Common Views (slices) through 4D Data Space XY Map: Z,T fixed. Vertical Profile: X,Y,T fixed. Time Chart: X,Y,Z fixed. Vertical Cross Sect.: Y,T fixed. Vertical Profile Trend: X,Y fixed.
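The idea of views as slices with some dimensions held fixed can be sketched in a few lines. This is an illustration only: the axis values, the synthetic cell values and the `view` helper are assumptions, not the DataFed data model.

```python
from itertools import product

# Hypothetical coordinate axes for a tiny 4D cube (X, Y, Z, T).
LONS = [-90.0, -89.0]
LATS = [38.0, 39.0]
ELEVS = [0, 1000]
TIMES = ["2005-04-01", "2005-04-02"]

# Fill the cube with a synthetic value so the slices are easy to check.
cube = {
    (x, y, z, t): float(ix + 10 * iy + 100 * iz + 1000 * it)
    for (ix, x), (iy, y), (iz, z), (it, t) in product(
        enumerate(LONS), enumerate(LATS), enumerate(ELEVS), enumerate(TIMES)
    )
}

def view(cube, x=None, y=None, z=None, t=None):
    """Return the cells matching the fixed coordinates; None leaves a dimension free."""
    fixed = (x, y, z, t)
    return {
        key: value
        for key, value in cube.items()
        if all(f is None or f == k for f, k in zip(fixed, key))
    }

xy_map = view(cube, z=0, t="2005-04-01")                # XY map: Z, T fixed
time_chart = view(cube, x=-90.0, y=38.0, z=0)           # time chart: X, Y, Z fixed
profile = view(cube, x=-90.0, y=38.0, t="2005-04-01")   # vertical profile: X, Y, T fixed
```

Fixing two dimensions leaves a two-dimensional view (a map), while fixing three leaves a one-dimensional view (a time chart or a profile).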

11 Monitor Storage Data 1 Virtual Int. Data Integrated Data 1 VIEWS Integration Integrated Data n Monitor IMPROVE Data 2 Monitor Storage Data n

12 Web Services Service Broker Service Provider Publish Find Bind Service User

13 Software as Service In a restaurant, one does not have to instruct the cook how to make crêpes and tell the waiter how to bring out the food. There is a well-defined service-interaction between the customer, the waiter and the cook. (Revive the butler metaphor?) Software services could offer the same kind of convenience to ease usage of the service. In the web services paradigm, all operations are viewed as published services with well-defined interfaces and behavior. In a Service-Oriented Architecture (SOA), the tedious software tasks of data access, transformations and rendering can be performed by user-driven service ‘orchestration’. (background music becomes more forceful…picture the user as she creates a web app with the wand.. )

14 Web Services as Program Components A Web Service is a URL-addressable resource that returns requested data, e.g. current weather or the map for a neighborhood. Web Services use standard web protocols (HTTP, XML, SOAP, WSDL) that allow computer-to-computer communication, regardless of language or platform. Web Services are reusable components, like ‘LEGO blocks’, that allow agile development of richer applications with less effort. Web services can transform the web from a medium for viewing and downloading into a data/knowledge-exchange and distributed computing platform. (Berners-Lee)

15 Web Service Components and Standards Each operation is governed by standard protocols: Discovery and Integration: UDDI (Universal Description, Discovery and Integration). Service Description: WSDL (Web Services Description Language). Content Envelope: SOAP (Simple Object Access Protocol). Data Encoding: XML (Extensible Markup Language). Diagram: Service Provider to Service Broker: Publish (UDDI, WSDL); Service User to Service Broker: Find (UDDI, WSDL); Service User to Service Provider: Access (SOAP, XML). Service providers publish services to a service broker. Service users find the needed service and get an access key from a service broker. With the access key, users bind to the service provider. The result is a dynamic binding mechanism between the service users and providers. Components: Provider – User – Broker. Actions: Publish – Find – Bind.
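The Publish – Find – Bind cycle can be illustrated with a small in-memory sketch. It deliberately leaves out UDDI, WSDL and SOAP: the `ServiceBroker` class, the `daily_mean_service` function and the keyword search are hypothetical stand-ins, and a plain Python callable plays the role of the access key.

```python
class ServiceBroker:
    """In-memory stand-in for a service registry (illustrative only)."""

    def __init__(self):
        self._registry = {}

    def publish(self, name, description, endpoint):
        # The provider registers a description and a callable endpoint.
        self._registry[name] = {"description": description, "endpoint": endpoint}

    def find(self, keyword):
        # The user searches the descriptions and gets matching service names.
        return sorted(
            name
            for name, entry in self._registry.items()
            if keyword.lower() in entry["description"].lower()
        )

    def bind(self, name):
        # The user obtains the endpoint and then calls the provider directly.
        return self._registry[name]["endpoint"]


def daily_mean_service(values):
    """Hypothetical provider operation: average a list of numbers."""
    return sum(values) / len(values)


broker = ServiceBroker()
broker.publish("daily_mean", "Aggregation: daily mean of hourly values", daily_mean_service)
matches = broker.find("aggregation")
service = broker.bind(matches[0])
result = service([10.0, 20.0, 30.0])
```

After binding, the broker is no longer involved in the call, which is the dynamic binding the slide describes.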

16 Interoperability through a Layered Protocol Stack Web Services are implemented on a layered stack of technologies and standards. The lower layers enable connectivity, binding and message exchange. Higher layers of the stack enable interoperability and service integration. Standards: TCP/IP, HTTP, FTP; ASCII, XML, etc.; HTML, XML, OGC-GML; OGC Coverage, CoordTransform, WMS; HTTP, SOAP; WSDL; UDDI, OGC Catalog; WSFL, RPNChain. Interoperability layers: Connectivity, Comm. Protocols, Data Encoding, Data Schema, Data Binding, Web Service, Service Descript., Service Discovery, Service Integr.

17 Web Application: Chained Web Services A Web Service Provider may also be a User of other services. Multiple web services can be chained into an interactive workflow system. The result is an agile application that can be created ‘just in time’ by the user for a specific need. Diagram labels: Service Broker, Service Provider/User, Service Provider, Service User; Publish, Find, Bind, Chain.
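Service chaining can be sketched as function composition, where the output of one service is the input of the next. The four services below, their names and the calibration factor are hypothetical and only show the pattern; real services would be remote calls.

```python
from functools import reduce

def chain(*services):
    """Compose services left to right: each output feeds the next service."""
    return lambda data: reduce(lambda acc, service: service(acc), services, data)

# Hypothetical services; after access, each works on a list of (hour, value) pairs.
def access(_request):
    # Stand-in for a data access service; the values are made up.
    return [(0, 8.0), (1, -1.0), (2, 12.0), (3, 10.0)]

def drop_negative(rows):
    # Filtering step: remove invalid (negative) values.
    return [(h, v) for h, v in rows if v >= 0]

def apply_calibration(rows):
    # Transform step with a made-up factor of 2.0.
    return [(h, v * 2.0) for h, v in rows]

def render_text(rows):
    # Rendering step: produce a simple text portrayal.
    return ", ".join(f"{h:02d}h={v:.1f}" for h, v in rows)

time_series_app = chain(access, drop_negative, apply_calibration, render_text)
output = time_series_app({"station": "STL"})
```

Because `chain` returns an ordinary callable, a chained application can itself be passed to `chain` as a service, which mirrors a provider that is also a user of other services.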

18 Anatomy of a Wrapper Service: TOMS Satellite Image Data Given the URL template and the image description, the wrapper service can access the image for any day and any spatial subset using an HTTP URL or the SOAP protocol. Diagram labels: src_img_width, src_img_height, src_margin_right, src_margin_left, src_margin_top, src_margin_bottom, src_lon_min, src_lat_max, src_lat_min, src_lon_max. Image Description for Data Access: src_image_width=502 src_image_height=329 src_margin_bottom=105 src_margin_left=69 src_margin_right=69 src_margin_top=46 src_lat_min=-70 src_lat_max=70 src_lon_min=-180 src_lon_max=180 The daily TOMS images reside on the FTP archive, e.g. ftp://toms.gsfc.nasa.gov/pub/eptoms/images/aerosol/y2000/ea000820.gif URL template: ftp://toms.gsfc.nasa.gov/pub/eptoms/images/aerosol/y[yyyy]/ea[yy][mm][dd].gif Transparent colors for overlays: RGB(89,140,255) RGB(41,117,41) RGB(23,23,23) RGB(0,0,0)
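The two ingredients of this wrapper, the URL template and the image description, can be exercised with a short sketch. The template and the numbers come from the slide; the function names are hypothetical, and the pixel mapping assumes that the map area inside the margins is a linear (equirectangular) projection, which the slide does not state.

```python
from datetime import date

URL_TEMPLATE = (
    "ftp://toms.gsfc.nasa.gov/pub/eptoms/images/aerosol/"
    "y[yyyy]/ea[yy][mm][dd].gif"
)

# Image description values as listed on the slide.
IMAGE = {
    "src_image_width": 502, "src_image_height": 329,
    "src_margin_left": 69, "src_margin_right": 69,
    "src_margin_top": 46, "src_margin_bottom": 105,
    "src_lon_min": -180, "src_lon_max": 180,
    "src_lat_min": -70, "src_lat_max": 70,
}

def image_url(day, template=URL_TEMPLATE):
    """Fill the date placeholders of the URL template for one day."""
    return (
        template.replace("[yyyy]", f"{day.year:04d}")
        .replace("[yy]", f"{day.year % 100:02d}")
        .replace("[mm]", f"{day.month:02d}")
        .replace("[dd]", f"{day.day:02d}")
    )

def lonlat_to_pixel(lon, lat, img=IMAGE):
    """Map lon/lat to (column, row), assuming a linear map inside the margins."""
    map_w = img["src_image_width"] - img["src_margin_left"] - img["src_margin_right"]
    map_h = img["src_image_height"] - img["src_margin_top"] - img["src_margin_bottom"]
    fx = (lon - img["src_lon_min"]) / (img["src_lon_max"] - img["src_lon_min"])
    fy = (img["src_lat_max"] - lat) / (img["src_lat_max"] - img["src_lat_min"])
    return (img["src_margin_left"] + fx * map_w, img["src_margin_top"] + fy * map_h)

url = image_url(date(2000, 8, 20))
center = lonlat_to_pixel(0, 0)
```

For 20 August 2000 the template expands to the example file name given on the slide, `ea000820.gif`. With these margins the map area is 364 by 178 pixels, so a spatial subset request reduces to cropping between two pixel positions.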

19 Data Access through a Proxy Service Any web content can be delivered as a Web Service through a Proxy Server. The Proxy Server supplies a web server-to-web service ‘wrapper’. The Proxy Server publishes the web service to the Broker. The User accesses the Proxy to get the Web Server data. Diagram: Web Server to Service Proxy: Web Publish (HTTP, FTP); Service Proxy to Service Broker: Publish (UDDI, WSDL); Service Consumer to Service Broker: Find (UDDI, WSDL); Service Consumer to Service Proxy: Access (SOAP, XML). Other labels: Service User, Chain.

20 Service-Oriented Architecture Service-oriented architecture (SOA): An architecture built around a collection of reusable components with well-defined interfaces. Loosely coupled: The use of well-defined interfaces to connect services; SOAs are built using a loosely coupled approach, where a change in one service does not require changes in linked services. Enterprise service bus: A software infrastructure that uses a standard interface and messaging to integrate applications; one way to implement an SOA.

21 Problems of Current Client-Server Architecture Client-Server Architecture: The user accesses, processes and integrates data with customized tools. User Tasks: Find servers; Reformat data; Custom process. User problems in combining data: Legacy data systems cannot be altered to support integration; Data systems use different terms, or different meanings of similar terms; Some data sources have neither a schema nor formal access methods. Result: User Carries the Burden. In current client-server architectures the user carries much of the burden of integration.

22 Mediator-Based Integration Architecture (Wiederhold, 1992) Software agents (mediators) can perform many of the data integration chores. Heterogeneous sources are wrapped by software that translates from the local to the global language. Mediators (web services) obtain data from wrappers or other mediators and pass it on … Wrappers remove the technical heterogeneity, while mediators resolve the logical heterogeneity. The job of the mediator is to provide an answer to a user query. In the database sense, a mediator is a view of the data found in one or more sources (Ullman, 1997). Busse et al., 1999. Diagram labels: Wrapper, Service, Query, View, Mediators.
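A minimal sketch of the wrapper/mediator split follows. The two sources, their field names, the units and the global schema (`station`, `date`, `pm25`) are all hypothetical, and the division of work between wrappers and mediator is simplified compared with the architecture cited on the slide.

```python
import csv
import io

# Two hypothetical sources with different layouts and units.
CSV_SOURCE = "site,date,pm25_ugm3\nSTL,2005-04-01,12.5\n"
DICT_SOURCE = [{"station_id": "CHI", "day": "2005-04-01", "pm25_mgm3": 0.020}]

def csv_wrapper(text):
    """Wrapper: translate the CSV layout into the global schema."""
    return [
        {"station": row["site"], "date": row["date"], "pm25": float(row["pm25_ugm3"])}
        for row in csv.DictReader(io.StringIO(text))
    ]

def dict_wrapper(records):
    """Wrapper: rename fields and convert mg/m3 to ug/m3."""
    return [
        {"station": r["station_id"], "date": r["day"], "pm25": r["pm25_mgm3"] * 1000.0}
        for r in records
    ]

def mediator(query_date, wrapped_sources):
    """Mediator: answer a date query as a view over all wrapped sources."""
    rows = [
        row
        for source in wrapped_sources
        for row in source
        if row["date"] == query_date
    ]
    return sorted(rows, key=lambda row: row["station"])

answer = mediator("2005-04-01", [csv_wrapper(CSV_SOURCE), dict_wrapper(DICT_SOURCE)])
```

The mediator sees only the global schema, so adding a third source means writing one more wrapper and leaving the mediator unchanged.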

23 Service Oriented Architecture Data, as well as services and users (of data and services), are distributed. Users compose data processing chains from reusable services. Intermediate and resulting data are also exposed for possible further use. Processing chains can be further linked into value-adding processes. User Tasks: Find data and services; Compose service chains; Expose output. User Carries less Burden: In a service-oriented peer-to-peer architecture, the user is aided by software ‘agents’. Diagram labels: Peer-to-peer network representation (Control, Data, Process, Service Catalog); Service chain representation (Data, Service, Chain 1, Chain 2, Chain 3).

24 Web Programs Built on DataFed Infrastructure

25 Generic Data Flow and Processing in DataFed Physical Data: Resides in autonomous servers; accessed by view-specific wrappers which yield abstract data ‘slices’. Abstract Data: Abstract data slices are requested by viewers; uniform data are delivered by wrapper services. Processed Data: Data passed through filtering, aggregation, fusion and other web services. View Data: Processed data are delivered to the user as multi-layer views by portrayal and overlay web services. Diagram labels: Physical Data, View Wrapper, Abstract Data Access, Abstract Data, Process Data, Processed Data, Portrayal/Render, Portrayed Data, DataView 1, DataView 2, DataView 3.

26 Service Chaining in Spatio-Temporal Data Browser Diagram labels: Data Sources (Vector GIS Data, XDim Data, SQL Table, OLAP, Satellite Images); Wrapper, Mediator, Homogenizer, Catalog; Find/Bind Data, Maintain Data; nDim Data Cube; Spatial Slice, Time Slice; OGC-Compliant GIS Services (Spatial Portrayal, Spatial Overlay); Time-Series Services (Time Portrayal, Time Overlay); Portray, Overlay, Render; Client Browser, Cursor/Controller.

27 An Application Program: Voyager Data Browser The web program consists of a stable core and adaptive input/output layers. The core maintains the state and executes the data selection, access and render services. The adaptive, abstract I/O layers connect the core to evolving web data, flexible displays and a configurable user interface: Wrappers encapsulate the heterogeneous external data sources and homogenize the access; Device Drivers translate generic, abstract graphic objects to specific devices and formats; Ports connect the internal parameters of the program to external controls; WSDL web service description documents. Diagram labels: Data Sources, Controls, Displays; I/O Layer (Device Drivers, Wrappers, Ports); Core (App State, Data Flow Interpreter); Web Services, WSDL.

