Federated Network for Sharing Air Quality Data and Processing Services Center for Air Pollution Impact and Trend Analysis (CAPITA) Washington University,

Slides:



Advertisements
Similar presentations
Web Services Implementation Case Study: DataFed Air Quality Data & Services Project Coordinators: Software Architecture: R. Husar Software Implementation:
Advertisements

Federated PM and Haze Data Warehouse Project a sub- project of (enter your sticker & logo here ) Nov 20, 2001, RBH St. Louis Midwest Supersite Project.
1 Introduction to XML. XML eXtensible implies that users define tag content Markup implies it is a coded document Language implies it is a metalanguage.
Latest techniques and Applications in Interprocess Communication and Coordination Xiaoou Zhang.
Service-Oriented Architecture
A New Computing Paradigm. Overview of Web Services Over 66 percent of respondents to a 2001 InfoWorld magazine poll agreed that "Web services are likely.
The Frontier of GIS GIS Web Services Nadine Alameh, Global Science & Technology Next Generation of Community Statistical Systems Tampa, Florida March 14,
Stefan Falke Center for Air Pollution Impact and Trend Analysis Washington University in St. Louis Networked Data and Tools for Environmental Management.
Distributed Data Analysis & Dissemination System (D-DADS) Prepared by Stefan Falke Rudolf Husar Bret Schichtel June 2000.
Web-based Portal for Discovery, Retrieval and Visualization of Earth Science Datasets in Grid Environment Zhenping (Jane) Liu.
Select, Overlay, Explore; Multidimensional data Maintain Distributed Data; Heterogeneous coding, access Connect providers to users; Homogenize data access.
SOA, BPM, BPEL, jBPM.
Distributed Voyager (DVoy) Web Services
DRAFT June 6, 2005 ESIP AQ Cluster, Air Quality Cluster Air Quality Cluster TechTrack Earth Science Information Partners Partners NASA.
What is Service Oriented Architecture ? CS409 Application Services Even Semester 2007.
Web Services (SOAP, WSDL, UDDI) SNU OOPSLA Lab. October 2005.
Instrument Builders Information Specialists (ESIP) Scientists Curriculum Developers Teachers Decision Analysts Decision Makers Reports From Kim Kastens.
REASoN REASoN Project to link NASA's data, modeling and systems to users in research, applications and education Application of NASA ESE Data and Tools.
Web Services based e-Commerce System Sandy Liu Jodrey School of Computer Science Acadia University July, 2002.
Air Quality Focus Group Discussion Summary ESIP Winter Meeting January 2005 Air Quality is one of 12 Applications of National Priority as defined by NASA.
Federated Network for Sharing Air Quality Data and Processing Services Center for Air Pollution Impact and Trend Analysis (CAPITA) Washington University,
1 Advanced Software Architecture Muhammad Bilal Bashir PhD Scholar (Computer Science) Mohammad Ali Jinnah University.
REASoN REASoN Project to link NASA's data, modeling and systems to users in research, education and applications Application of NASA ESE Data and Tools.
VOYAGER Data Explorer: Architecture and Technologies See also Design and ApplicationsDesign and Applications Built and used Used by a Virtual Community.
Spatio-Temporal Data Sharing using XML Web Services Presented at the Workgroup Meeting on Web-based Environmental Information System for Global Emission.
Current Air Quality Information ‘Ecosystem’ (Draft for Feedback) AQ information includes emissions, ambient & satellite data and model outputs The distributed.
Stefan Falke Center for Air Pollution Impact and Trend Analysis Washington University in St. Louis Brooke Hemming US EPA – Office of Research and Development.
Application of ESE Data and Tools to Particulate Air Quality Management The CAPITA REASoN Project August 15, 2003 Stefan Falke and Rudolf Husar Center.
Air Quality Cluster Air Quality Cluster TechTrack Earth Science Information Partners Partners(?) NASA NOAA EPA USGS DOE NSF Industry… Data Flow Technologies.
Accessing and Using Fire-Related Data with the CAPITA DataFed.net* Services Framework Stefan Falke Rudolf Husar Kari Hoijarvi Washington University in.
1 Application Scenario: Smoke Impact REASoN Project: Application of NASA ESE Data and Tools to Particulate Air Quality Management (PPT/PDF)Application.
WS Roadmap. The pathway to a service-oriented architecture The pathway to a service-oriented architecture Bob Sutor, IBM IBM identified four steppingstones.
RSISIPL1 SERVICE ORIENTED ARCHITECTURE (SOA) By Pavan By Pavan.
1 Using the GEOSS Common Infrastructure in the Air Quality & Health SBA: Wildfire & Smoke Assessment Prepared by the GEOSS AIP-2 Air Quality & Health Working.
Select, Overlay, Explore; Integration of diverse data Distributed Data Heterogeneous coding, access Connects providers to users; Homogenize data access.
Stefan Falke and Rudolf Husar Center for Air Pollution Impact and Trend Analysis Washington University in St. Louis A NSF Digital Government Pilot Project.
VOYAGER Data Explorer: Architecture and Technologies See also the the Voyager Developer Website and early ApplicationsDeveloper WebsiteApplications Layered.
Kemal Baykal Rasim Ismayilov
David Smiley SOA Technology Evangelist Software AG Lead, follow or get out of the way Here Comes SOA.
COMMUNITY. Data Acquisition and Usage Value Chain.
Distributed Data Analysis & Dissemination System (D-DADS ) Special Interest Group on Data Integration June 2000.
Dvoy Related Ideas. Data Acquisition and Usage Value Chain.
Fire Emissions Network Sept. 4, 2002 A white paper for the development of a NSF Digital Government Program proposal Stefan Falke Washington University.
NASA REASoN Project SHAirED: S ervices for H elping the Air -quality Community use E SE D ata Stefan Falke, Kari Höijärvi and Rudolf Husar, Washington.
NASA REASoN Project SHAirED: S ervices for H elping the Air -quality Community use E SE D ata Stefan Falke, Kari Höijärvi and Rudolf Husar, Washington.
Providing web services to mobile users: The architecture design of an m-service portal Minder Chen - Dongsong Zhang - Lina Zhou Presented by: Juan M. Cubillos.
Processes of the Information Value Chain Informing Knowledge ActionProductive Knowledge Information Organizing Grouping Classifying Formatting Geo-referencing.
Architecture and Technologies for an Agile, User-Oriented Air Quality Data System Rudolf B. Husar Washington University, St. Louis Presented at the workshop.
Architecture and Technologies for an Agile, User-Oriented Air Quality Data System Rudolf B. Husar Washington University, St. Louis Presented at the workshop.
Web Services-Based Mediator of Distributed Data Flow and Processing Project Coordinators: Software Architecture: R. Husar Software Implementation: K. Höijärvi.
ESIP Air Quality Jan Air Quality Cluster Air Quality Cluster Technology Track Earth Science Information Partners Partners NASA NOAA EPA (?) USGS.
: Data Sharing/Processing Infrastructure Data Catalog and Access Dozens of datasets on aerosols, emissions, fire, meteorology,
The Federated Data System DataFed: Experiences in Data Homogenization and Networking R.B. Husar, K. Hoijarvi, S. R. Falke, E. M. Robinson, Washington University,
Copyright 2007, Information Builders. Slide 1 iWay Web Services and WebFOCUS Consumption Michael Florkowski Information Builders.
1 SEEDS IT Vision Scenario: Smoke Impact REASoN Project: Application of NASA ESE Data and Tools to Particulate Air Quality Management (PPT/PDF)Application.
MEDIATORS. Mediation Typical file-sharing systems have a single global schema for describing their data P2P networks have to consider heterogeneous schemas.
DRAFT June 6, 2005 ESIP AQ Cluster, Contact R. Husar Air Quality Cluster Air Quality Cluster TechTrack Earth Science Information Partners.
Application of NASA ESE Data and Tools to Particulate Air Quality Management A proposal to NASA Earth Science REASoN Solicitation CAN-02-OES-01 REASoN:
Voyager Data Services Services for Finding, Exploring and Presenting Distributed Environmental Data Outline Prepared by Voyager Interest Group on Environmental.
Federated Network for Sharing Air Quality Data and Processing Services Center for Air Pollution Impact and Trend Analysis (CAPITA) Washington University,
Fire, Smoke & Air Quality: Tools for Data Exploration & Analysis : Data Sharing/Processing Infrastructure This project integrates.
Software Architecture Patterns (3) Service Oriented & Web Oriented Architecture source: microsoft.
Environmental Data Content and Form Stuff. 4 D Geo-Environmental Data Cube (X, Y, Z, T) Environmental data represent measurements in the physical world.
NATIONAL AERONAUTICS AND SPACE ADMINISTRATION ESDS Reuse Working Group Earth Science Data Systems Reuse Working Group Case Study: SHAirED Services for.
DATAFED Application Programs. Dvoy Data Flow and Processes DataView 1 View Data Abstract Portrayal Device Portrayal Render Device View Portrayal Device.
ESIP Air Quality Jan Air Quality Cluster Air Quality Cluster Technology Track Earth Science Information Partners Partners NASA NOAA EPA (?) USGS.
Service Oriented Architecture (SOA) Prof. Wenwen Li School of Geographical Sciences and Urban Planning 5644 Coor Hall
Sabri Kızanlık Ural Emekçi
Presentation transcript:

Federated Network for Sharing Air Quality Data and Processing Services Center for Air Pollution Impact and Trend Analysis (CAPITA) Washington University, St. Louis, MO April 2005, DRAFT Project Coordinators: Software Architecture: R. Husar Software Implementation: K. Höijärvi Data and Applications: S. Falke, R. Husar

DataFed Project at CAPITA DataFed Goal To facilitate sharing of air quality data and processing services DataFed Relationship to other Data Networks There are several evolving data sharing federations (e.g. OGC, OPenDAP) DataFed is intended to be ‘socially well behaving’ federation: beyond coexistence, it actively seeks linking, co-evolution and fusion with other data federations. Through the Service Oriented Architecture of DataFed it can link to OGC, O PeNDAP and other federations. The NASA REASoN Project and the NASA SEEDS Program is well suited for the inter-federation linking, learning and co-evolution. Project Funding and Status Funding from NSF (2001-4) and NASA (2004-9) for data-sharing middleware web-services infrastructure Funding from CEC, EPA, RPOs, to develop DataFed-based web-applications for specific projects, NAM Emission, FASNET, CATT & Tools as services

Project Background and Rationale Air Quality Data and Analysis Shift from primary to secondary pollutants. Ozone and PM2,5 travel miles across state or international boundaries and their sources are not well established New Regulatory approach. Compliance evaluation based on ‘weight of evidence’ and tracking the effectiveness of controls Shift from command & control to participatory management. Inclusion of federal, state, local, industry, international stakeholders. Information System Challenges Broader user community. The information systems need to be extended to reach all the stakeholders ( federal, state, local, industry, international) A richer set of data and analysis. Establishing causality, ‘weight of evidence’, emissions tracking requires more data and air quality analysis Opportunities Rich AQ data availability. Abundant high-quality routine and research monitoring data from EPA, NASA, NOAA and other agencies are now available. New information technologies. Web-based data and tools sharing now allows cooperation (sharing) and coordination and communication among diverse groups.

The Air Quality Analyst's Challenge “The researcher cannot get access to the data; if he can, he cannot read them; if he can read them, he does not know how good they are; and if he finds them good he cannot merge them with other data.” Information Technology and the Conduct of Research: The Users View National Academy Press, 1989 Thus, researchers encountering these resistances need help: A catalog of distributed data resources for easy data ‘discovery’ Uniform data coding and formatting for easy access, transfer and merging Rich and flexible metadata structure to encode the knowledge about data Powerful shared tools to access, merge and ‘analyze’ the data

The ‘Decision Support System’ Processing of AQ data into ‘productive’ knowledge occurs through a value-chain Currently, much of the decision support is done by human analysts and advisers Information technologies could automate the Data-to-Information transformation This would liberate more resources for Analyzing and Judging activities More productive or ‘actionable’ knowledge would lead to better decision making (after Taylor, 1987) Informing Knowledge ActionProductive Knowledge Information Organizing Grouping Classifying Formatting Displaying Analyzing Separating Evaluating Interpreting Synthesizing Judging Options Quality Advantages Disadvantages Deciding Matching goals, Compromising Bargaining Deciding CIRA VIEWSLangley IDEAAQ Manager WG Summary Rpt Data Examples:

Data Flow & Processing in AQ Management AQ data arise from diverse sources, each having specific history, driving forces, formats, quality, etc. Data analysis, i.e. turning the raw data into ‘actionable’ knowledge, requires combining data from these sources The three major data ‘processing’ operations (services) are filtering, aggregation and fusion AQ DATA EPA Networks IMPROVE Visibility Satellite-PM Pattern METEOROLOGY Met. Data Satellite-Transport Forecast model EMISSIONS National Emissions Local Inventory Satellite Fire Locs Status and Trends AQ Compliance Exposure Assess. Network Assess. Tracking Progress AQ Management Reports ‘Knowledge’ Derived from Data Primary Data Diverse Providers Data ‘Refining’ Processes Filtering, Aggregation, Fusion

DataFed Data Model: Multidimensional Data Cube k j i

4 D Geo-Environmental Data Cube (X, Y, Z, T) Environmental data represent measurements in the physical world which has space (X, Y, Z) and time (T) as its dimensions. The specific inherent dimensions for geo- environmental data are: Longitude X, Latitude Y, Elevation Z and DateTime T. Additional dimensions may include parameters, pollutant source, etc. The needs for finding, sharing and integration of geo-environmental data requires that data are coded in this multidimensional data space

Environmental Data: Multi-Dimensional Data can be distributed over 1,2, …n dimensions 1 Dimensional e.g. Time dimension i j k j i Data Granule i 1 Dimensional e.g. Location & Time 1 Dimensional e.g. Location, Time & Parameter View 1 Data Space View 2 Views are generally orthogonal slices through multidimensional data cubes Spatial and temporal slices through the data space are most common

Common Views (slices) through 4D Data Space XY MAP: Z,T fixed Vertical Profile:XYT fixed Time Chart: X,Y,Z fixed Vertical Cross sect: YT fixed Vertical Profile Trend: X,Y fixed

Monitor Storage Data 1 Virtual Int. Data Integrated Data 1 VIEWS Integration Integrated Data n Monitor IMPROV E Data 2 Monitor Storage Data n

Web Services Service Broker Service Provider Publish Find Bind Service User

Software as Service In a restaurant, one does not have to instruct the cook how to make crêpes and tell the waiter how to bring out the food. There is a well- defined service-interaction between the customer, the waiter and the cook. (Revive the butler metaphor?) Software services could offer the same kind of convenience to ease usage of the service. In the web services paradigm, all operations are viewed as published services with well defined interfaces and behavior. In a Service-Oriented Architecture (SOA), the tedious software tasks of data access, transformations and rendering can be performed by user-driven service ‘orchestration’. (background music becomes more forceful…picture the user as she creates a web app with the wand.. )

Web Services as Program Components A Web Service is a URL addressable resource that returns requested data, e.g. current weather or the map for a neighborhood. Web Services use standard web protocols: HTTP, XML, SOAP, WSDL allow computer to computer communication, regardless of their language or platform. Web Services are reusable components, like ‘LEGO blocks’, that allow agile development of richer applications with less effort. Web services can transform the web from a medium for viewing and downloading to data/knowledge-exchange and distributed computing platform. (Berners-Lee)

Web Service Components and Standards Each operation is governed by standard protocols: Discovery and Integration: UDDI (Universal Description, Discovery and Integration) Service Description: WSDL (Web Services Description Language) Content Envelope: SOAP (Simple Object Access Protocol) Data Encoding: XML (Extensible Modeling Language) Service Broker Service Provider Publish UDDI, WSDL Find UDDI, WSDL Access SOAP, XML Service User Service providers publish services to a service broker. Service users find the needed service and get access key from a service broker With the access key, users bind to the service provider The result is a dynamic binding mechanism between the service users and providers Components:Provider – User – Broker Actions: Publish – Find - Bind

Interoperability through a Layered Protocol Stack Web Services are implemented on a layered stack of technologies and standards The lower layers enable connectivity binding and message exchange Higher layers of the stack enable interoperability and service integration TCP/IP, HTTP, FTP ASCII, XML, etc. HTML, XML OGC -GML OGC Coverage, CoordTransfom, WMS HTTP, SOAP WSDL UDDI OGC Catalog WSFL, RPNChain Standards Interoperability Comm. Protocols Data Encoding Data Schema Data Binding Web Service Service Integr. Service Discovery Service Descript. Connectivity

Web Application: Chained Web Services A Web Service Provider may also be a User of other services Multiple web services can be chained into an interactive workflow system The result is an agile application that can be created ‘just in time’ by the user for a specific need Service Broker Service Provider/User Publish Find Bind Service User Chain Service Provider Bind Chain

Anatomy of a Wrapper Service: TOMS Satellite Image Data Given the URL template and the image description, the wrapper service can access the image for any day, any spatial subset using a HTTP URL or SOAP protocol: src_img_width src_img_height src_margin_rightsrc_margin_left src_margin_top src_margin_bottom src_lon_min src_lat_max src_lat_min src_lon_max Image Description for Data Access: src_image_width=502 src_image_height=329 src_margin_bottom=105 src_margin_left=69 src_margin_right=69 src_margin_top=46 src_lat_min=-70 src_lat_max=70 src_lon_min=-180 src_lon_max=180 The daily TOMS images reside on the FTP archive, e.g. ftp://toms.gsfc.nasa.gov/pub/eptoms/images/aerosol/y2000/ea gif ftp://toms.gsfc.nasa.gov/pub/eptoms/images/aerosol/y2000/ea gif URL template: ftp://toms.gsfc.nasa.gov/pub/eptoms/images/aerosol/y[yyyy]/ea[yy][mm][dd].gif Transparent colors for overlays RGB(89,140,255) RGB(41,117,41) RGB(23,23,23) RGB(0,0,0)

Web Publish HTTP, FTP Data Access though a Proxy Service Service Broker Publish UDDI, WSDL Service Consumer Find UDDI, WSDL Access SOAP, XML Any web-content can be delivered as a Web Service through a Proxy Server. The Proxy Server supplies a web server-to-web service ‘wrapper’ The Proxy Server publishes the web service to the Broker The User accesses the Proxy to get the Web Server data Service Proxy Web Server Service User Chain

Service-Oriented Architecture Service-oriented architecture (SOA): An architecture built around a collection of reusable components with well- defined interfaces. Loosely coupled: The use of well-defined interfaces to connect services; SOAs are built using a loosely coupled approach, where a change in one service does not require changes in linked services. Enterprise service bus: A software infrastructure that uses a standard interface and messaging to integrate applications; one way to implement an SOA.

Problems of Current Client-Server Architecture User Tasks: Fi nd servers Reformat data Custom process Server Client-Server Architecture: User accesses, processes and integrates data by customized tools User Problems in combining data: –Legacy data systems can not be altered to support integration –Data systems use different terms or meaning of similar terms –Some data sources, do not have a schema nor formal access methods Result: User Carries the Burden In current client-server architectures the user carries much of the burden of integration.

Mediator-Based Integration Architecture (Wiederhold, 1992) Software agents (mediators) can perform many of the data integration chores Heterogeneous sources are wrapped by translation software local to global language Mediators (web services) obtain data from wrappers or other mediators and pass it on … Wrappers remove technical, while mediators resolve the logical heterogeneity The job of the mediator is to provide an answer to a user query In database sense, a mediator is a view of the data found in one or more sources (Ullman, 1997)Ullman, 1997 Busse et. al, 1999 Wrapper Service Query View Mediators

Service Oriented Architecture Control Data Process Peer-to-peer network representation Data Service Catalog Process Data, as well as services and users (of data and services) are distributed Users compose data processing chains form reusable services Intermediate and resulting data are also exposed for possible further use Processing chains can be further linked into value-adding processes Service chain representation User Tasks: Fi nd data and services Compose service chains Expose output Chain 2 Chain 1 Chain 3 Data Service User Carries less Burden In service-oriented peer-to peer architecture, the user is aided by software ‘agents’

Web Programs Built on DataFed Infrastructure

Generic Data Flow and Processing in DataFed DataView 1 DataProcessed Data Portrayed Data Process Data Portrayal/ Render Abstract Data Access View Wrapper Physical Data Abstract Data Physical Data Resides in autonomous servers; accessed by view- specific wrappers which yield abstract data ‘slices’ Abstract Data Abstract data slices are requested by viewers; uniform data are delivered by wrapper services DataView 2 DataView 3 View Data Processed data are delivered to the user as multi-layer views by portrayal and overlay web services Processed Data Data passed through filtering, aggregation, fusion and other web services

Render Service Chaining in Spatio-Temporal Data Browser Spatial Slice Find/Bind Data nDim Data Cube Time Slice Time Portrayal Spatial PortrayalSpatial Overlay Time Overlay OGC-Compliant GIS Services Time-Series Services PortrayOverlay Homogenizer Catalog Wrapper Mediator Client Browser Cursor/Controller Maintain Data Vector GIS Data XDim Data SQL Table OLAP Satellite Images Data Sources

An Application Program: Voyager Data Browser The web-program consists of a stable core and adoptive input/output layers The core maintains the state and executes the data selection, access and render services The adoptive, abstract I/O layers connects the core to evolving web data, flexible displays and to the a configurable user interface: –Wrappers encapsulate the heterogeneous external data sources and homogenize the access –Device Drivers translate generic, abstract graphic objects to specific devices and formats –Ports connect the internal parameters of the program to external controls –WDSL web service description documents Data Sources Controls Displays I/O Layer Device Drivers Wrappers App State Data Flow Interpreter Core Web Services WSDL Ports