Dvoy Networking Ideas

OpenGIS Web Services
Mission: Definition and specification of geospatial web services.
A Web service is an application that can be published, located, and dynamically invoked across the Web. Applications and other Web services can discover and invoke the service.
The sponsors of the Web services initiative include:
- Federal Geographic Data Committee
- Natural Resources Canada
- Lockheed Martin
- National Aeronautics and Space Administration
- U.S. Army Corps of Engineers Engineer Research and Development Center
- U.S. Environmental Protection Agency EMPACT Program
- U.S. Geological Survey
- US National Imagery and Mapping Agency
Phase I - February 2002:
- Common Architecture: OGC Services Model, OGC Registry Services, and Sensor Model Language.
- Web Mapping: Map Server (raster), Feature Server (vector), Coverage Server (image), Coverage Portrayal Services.
- Sensor Web: OpenGIS Sensor Collection Service for accessing data from a variety of land, water, air and other sensors.

D-DADS Architecture

The D-DADS Components
- Data Providers supply primary data to the system through SQL or other data servers.
- Standardized Description & Format: populates and describes the data cubes and other data types using standard metadata.
- Data Access and Manipulation: tools that provide a unified interface to data cubes, GIS data layers, etc. for accessing and processing (filtering, aggregating, fusing) data and integrating data into virtual data cubes.
- Users are the analysts who access the D-DADS and produce knowledge from the data.
The multidimensional data access and manipulation component of D-DADS will be implemented using OLAP.

Interoperability
One requirement for an effective distributed environmental data system is interoperability, defined as “the ability to freely exchange all kinds of spatial information about the Earth and about objects and phenomena on, above, and below the Earth’s surface; and to cooperatively, over networks, run software capable of manipulating such information.” (Buehler & McKee, 1996)
Such a system has two key elements:
- Exchange of meaningful information
- Cooperative and distributed data management

On-line Analytical Processing: OLAP
- A multidimensional data model that makes it easy to select, navigate, integrate and explore the data.
- An analytical query language that provides the power to filter, aggregate and merge data as well as explore complex data relationships.
- The ability to create calculated variables from expressions based on other variables in the database.
- Pre-calculation of frequently queried aggregated values, e.g. monthly averages, which enables fast response to ad hoc queries.
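The pre-calculation idea in the last bullet can be illustrated with a small sketch. This is not how D-DADS or any OLAP product implements it; the record layout `(site, day, value)` and the function name `monthly_averages` are assumptions made for the example.

```python
from collections import defaultdict
from datetime import date


def monthly_averages(records):
    """Pre-aggregate (site, day, value) records into {(site, year, month): mean}."""
    sums = defaultdict(lambda: [0.0, 0])  # key -> [running total, count]
    for site, day, value in records:
        acc = sums[(site, day.year, day.month)]
        acc[0] += value
        acc[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


# Hypothetical daily observations for one site.
records = [
    ("STL", date(2000, 6, 1), 10.0),
    ("STL", date(2000, 6, 2), 14.0),
    ("STL", date(2000, 7, 1), 8.0),
]

# Computed once, ahead of time; later ad hoc queries become dictionary lookups.
cube = monthly_averages(records)
june_mean = cube[("STL", 2000, 6)]
```

The trade-off is the usual one: the aggregate table must be refreshed when new data arrive, in exchange for lookups that no longer scan the raw records.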

User Interaction with D-DADS
[Diagram labels: Query; Data View (Table, Map, etc.); Distributed Database; XML data]

Metadata Standardization
Metadata standards for describing air quality data are currently being actively pursued by several organizations, including:
- The Supersite Data Management Workgroup
- NARSTO
- FGDC

Potential D-DADS Nodes
The following organizations are potential nodes in a distributed data analysis and dissemination system:
- CAPITA
- NPS-CIRA
- EPA Supersites
  - California
  - Texas
  - St. Louis

Summary
In the past, data analysis has been hampered by data flow resistances. However, the tools and framework to overcome each of these resistances now exist, including:
- World Wide Web
- XML
- OLAP
- OpenGIS
- Metadata standards
Incorporating these tools will initiate a distributed data analysis and dissemination system.

‘Global’ and ‘Local’ AQ Analysis (National and Local AQ Analysis)
- AQ data analysis needs to be performed at both global and local levels.
- ‘Global’ refers to regional, national, and global analysis. It establishes the larger-scale context.
- ‘Local’ analysis focuses on the specific and detailed local features.
- Both global and local analyses are needed for full understanding.
- Global-local interaction (information flow) needs to be established for effective management.

Data Re-Use and Synergy
- Data producers maintain their own workspace and resources (data, reports, comments).
- Part of the resources is shared by creating common virtual resources.
- Web-based integration of the resources can be across several dimensions:
  - Spatial scale: local-global data sharing
  - Data content: combination of data generated internally and externally
- The main benefits of sharing are data re-use, data complementing and synergy.
- The goal of the system is to have the benefits of sharing outweigh the costs.
[Diagram labels: Content; User; Local; Global; Virtual Shared Resources (Data, Knowledge, Tools, Methods); Shared part of resources]

Integration for Global-Local Activities
Global Activity => Local Benefit
- Global data, tools => Improved local productivity
- Global data analysis => Spatial context; initial analysis
- Analysis guidance => Standardized analysis, reporting
Local Activity => Global Benefit
- Local data, tools => Improved global productivity
- Local data analysis => Elucidate, expand initial analysis
- Identify relevant issues => Responsive, relevant global analysis
Global and local activities are both needed, e.g. ‘think global, act local’.
‘Global’ and ‘Local’ here refer to relative, not absolute, spatial scale.

Content Integration for Multiple Uses (Reports)
- Data from multiple measurements are shared by their providers or custodians.
- Data are integrated, filtered, aggregated and fused in the process of analysis.
- Reports use the analysis for Status and Trends; Exposure Assessment; Compliance …
- The creation of the needed reports requires data sharing and integration from multiple sources.

Federated Data Warehouse Features
- As much as possible, data should reside in their respective home environment. ‘Uprooted’ data in decoupled databases tend to decay, i.e. they cannot be easily updated, maintained or enriched.
- Data Providers would need to ‘open up’ their SQL data servers for limited data subsets and queries, in accordance with a ‘contract’. However, the data structures of the Providers will not need to be changed.
- Data from the providers will be transferred to the ‘federated data warehouse’ through (1) on-line DataAdapters, (2) manual web submission and (3) semi-automated transfer from the NARSTO archive.
- Retrieval of uniform data from the data warehouse facilitates integration and comparison along the key dimensions (space, time, parameter, method).
- The open-architecture data warehouse (see Web Services) promotes the building of further value chains: Data Viewers, Data Integration Programs, Automatic Report Generators, etc.

DVoy: Components and Data Flow
[Diagram labels: Navigation Service; Catalog Service; Data Services; Presentation Services; Legacy Data; Publish (DataSet); DataSet Records (Provider Descript., Service Descript., Measure Access); Find (Measure); Selected Measure Record; Dvoy Data Wrapping WebService; Bind (Measure, FocusCube); Data Delivery WebService; FocusCube, GlobCursor, Measure, Granule; DataToView; DataForCursorAndView; Time Chart; Layered Map]

Data provided by each dimension of a View:
- Dim1.Type, Dim1.Min, Dim1.Max
- Dim2.Type, Dim2.Min, Dim2.Max
- ….
Current Dim.Types: Latitude, Longitude, DateTime, Elevation
[Diagram labels: Data; Focus; Range; Rendering; Cursor; Viewer; Layers; Dim1: Lon; Dim2: Lat]

Federated Data Services Architecture
- Distributed data of multiple types (spatial, temporal, text).
- The Broker handles the views, connections, data access and cursor.
- Data are rendered by linked Data Views (map, time, text).
[Diagram labels: Data Warehouse Tier (Satellite, Vector GIS Data, XDim Data, OLAP Cube, SQL Table, Text Data, Web Page); XML Web Services; HTTP Services; OpenGIS Services; Data View Manager; Connection Manager; Data Access Manager; Cursor-Query Manager; Data View & Process Tier (Layered Map, Time Chart, Scatter Chart, Text/Table, Cursor)]

Dvoy Federated Information System
- Dvoy offers a homogeneous, read-only access mechanism to a dynamically changing collection of heterogeneous, autonomous and distributed information sources.
- Data access uses a global multidimensional schema consisting of spatial, temporal and parameter dimensions.
- The uniform global schema is suitable for data browsing and online analytical processing (OLAP).
- The limited global query capabilities yield slices along the spatial, temporal and parameter dimensions of the multidimensional data cubes.
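A slice query against such a global schema can be sketched as follows. This is only an illustration under stated assumptions: the `Obs` record, the `slice_cube` function and the sample values are hypothetical and are not part of Dvoy. Bounds are inclusive `(min, max)` pairs, and dates are ISO strings so that plain string comparison orders them correctly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Obs:
    """One cell of a hypothetical data cube in the global schema."""
    lon: float
    lat: float
    time: str   # ISO date, e.g. "2001-07-01"
    param: str
    value: float


def _within(x, bounds):
    # bounds is None (dimension unconstrained) or an inclusive (min, max) pair
    return bounds is None or bounds[0] <= x <= bounds[1]


def slice_cube(cube, lon=None, lat=None, time=None, param=None):
    """Return the observations inside the requested spatial/temporal/parameter slice."""
    return [
        o for o in cube
        if _within(o.lon, lon) and _within(o.lat, lat)
        and _within(o.time, time)
        and (param is None or o.param == param)
    ]


cube = [
    Obs(-90.2, 38.6, "2001-07-01", "PM25", 18.0),
    Obs(-90.2, 38.6, "2001-07-02", "PM25", 22.0),
    Obs(-118.2, 34.1, "2001-07-01", "PM25", 30.0),
    Obs(-90.2, 38.6, "2001-07-01", "SO4", 5.0),
]

# Map view: one parameter at one time, all locations.
map_slice = slice_cube(cube, time=("2001-07-01", "2001-07-01"), param="PM25")
# Time-chart view: one parameter near one location, all times.
time_series = slice_cube(cube, lon=(-91, -90), lat=(38, 39), param="PM25")
```

The same query function serves both views; only the constrained dimensions differ, which is the sense in which a small, uniform query vocabulary can cover browsing needs.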

Architecture of DATAFED Federated Data System (after Busse et al., 1999)
- The main software components of Dvoy are wrappers, which encapsulate sources and remove technical heterogeneity, and mediators, which resolve the logical heterogeneity.
- Wrapper classes are available for geo-spatial (incl. satellite) images, SQL servers, text files, etc.
- The mediator classes are implemented as web services for uniform data access to n-dimensional data.

Integration Architecture (Ullman, 1997)
- Heterogeneous sources are wrapped by software that translates between the source’s local language, model and concepts and the shared global concepts.
- Mediators obtain information from one or more components (wrappers or other mediators) and pass it on to other mediators or to external users.
- In a sense, a mediator is a view of the data found in one or more sources; it does not hold the data, but it acts as if it did. The job of the mediator is to go to the sources and provide an answer to the query.
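The wrapper/mediator pattern can be sketched in a few lines. This is a minimal illustration, not the DATAFED implementation: the class names `CsvWrapper`, `RecordWrapper` and `Mediator`, the `query(param)` interface and the global record shape (`site`, `param`, `value`) are all assumptions made for the example.

```python
import csv
import io


class CsvWrapper:
    """Wraps CSV text whose columns use source-specific names."""

    def __init__(self, text, column_map):
        self._text = text
        self._column_map = column_map  # global field name -> source column name

    def query(self, param):
        m = self._column_map
        for row in csv.DictReader(io.StringIO(self._text)):
            if row[m["param"]] == param:
                yield {"site": row[m["site"]], "param": param,
                       "value": float(row[m["value"]])}


class RecordWrapper:
    """Wraps in-memory (site, param, value) tuples, standing in for an SQL source."""

    def __init__(self, rows):
        self._rows = rows

    def query(self, param):
        for site, p, value in self._rows:
            if p == param:
                yield {"site": site, "param": p, "value": value}


class Mediator:
    """Holds no data; answers a query by going to its components."""

    def __init__(self, *components):
        self._components = components  # wrappers or other mediators

    def query(self, param):
        for component in self._components:
            yield from component.query(param)


csv_source = CsvWrapper(
    "station,species,conc\nA,SO4,1.5\nA,NO3,0.7\n",
    {"site": "station", "param": "species", "value": "conc"},
)
sql_like = RecordWrapper([("B", "SO4", 2.0)])
mediator = Mediator(csv_source, Mediator(sql_like))  # mediators can nest
result = list(mediator.query("SO4"))
```

Because wrappers and mediators share one interface, a mediator cannot tell whether a component is a raw source or another mediator, which is what allows the layering described in the slide.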

Federated PM and Haze Data Warehouse Project
a sub-project of the St. Louis Midwest Supersite Project
Nov 20, 2001, RBH
[Diagram labels: Regional Planning Organization (RPO); EPA Supersites (SupSite); NARSTO PM (NARSTO); EPA Division1, Division2, Division2 (EPA); Me and my dog for our aerosol project (Me)]

PM/Haze Data Flow in Support of AQ Management
- There are numerous organizations in need of data relevant to PM/Haze.
- Most interested parties (stakeholders) are both producers and consumers of PM and haze data.
- There is a general willingness to share data, but the resistances to data flow and processing are too high.
- PM and haze data are used for many parts of AQ management, mostly in the form of Reports.
- The variety of pertinent (ambient, emission) data comes from many different sources.
- To produce relevant reports, the data need to be ‘processed’ (integrated, filtered, aggregated).
[Diagram labels: Shared PM/Haze Data; RPO (Regional Planning Orgs); FLM (Federal Land Managers); EPA (EPA Regul. & Research); Industry; Academic; NARSTO; SuperSite; Other: Private, Academic]

Scientific and Administrative Rationale for Resource Sharing
Scientific Rationale:
- Regional haze and its precursors have a km airshed. (Smoke, Dust, Haze) – Data integration
- Substantial fraction of haze originates from natural sources or from out-of-jurisdiction man-made sources
- Cross-RPO data and knowledge sharing yields better operational and science support to AQ management
Management Rationale:
- Haze control within some RPOs cannot yield
- Data sharing saves money and ….

A Strategy for the Federated PM/Haze Data Warehouse
- Negotiate with the data providers to ‘open up’ their data servers for limited, controlled access in accordance with a clear ‘access contract’ with the Federated Warehouse.
- Design an interface to the warehoused datasets that has simple data access and satisfies the data needs of most integrating users. (oxymoron ????)
- Facilitate the development of shared value-adding processes (analysis tools, methods) that refine the raw data into useful knowledge.

Three-Tier Federated Data Warehouse Architecture
1. Provider Tier: Back-end servers containing heterogeneous data, maintained by the federation members
2. Proxy Tier: Retrieves designated Provider data and homogenizes it into common, uniform Datasets
3. User Tier: Accesses the Proxy Server and uses the uniform data for presentation, integration or processing
(Note: In this context, ‘Federated’ differs from ‘Federal’ in the direction of the driving force. ‘Federated’ is meant to indicate a driving force for sharing from the ‘bottom up’, i.e. from the members, not dictated from ‘above’ by the Feds.)
[Diagram labels: Provider Tier (heterogeneous data in distributed SQL Servers); Proxy Tier (data homogenization, transformation); User Tier (data presentation, processing); Federated Data Warehouse]

Federated Data Warehouse Interactions
The Provider servers interact only with the Proxy Server in accordance with the Federation Contract:
- The contract sets the rules of interaction (accessible data subsets, types of queries)
- Strong server security measures are enforced, e.g. through Secure Sockets Layer
The data User interacts only with the generic Proxy Server using a flexible Web Services interface:
- Generic data queries, applicable to all data in the Warehouse (e.g. data sub-cube by space, time, parameter)
- The data query is addressed to the Web Service provided by the Proxy Server
- Uniform, self-describing data packages are passed to the user for presentation or further processing
[Diagram labels: Provider Tier, Heterogeneous Data, Member Servers (SQLServer1, SQLServer2, LegacyServer); Fire Wall, Federation Contract; Proxy Tier, Data Homogenization, etc., Proxy Server (SQLDataAdapter1, SQLDataAdapter2, CustomDataAdapter); Web Service, Uniform Query & Data; User Tier, Data Consumption (Presentation, Processing, Integration); Data Access & Use; Federated Data Warehouse]
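One way to picture the contract-governed interaction is the sketch below. It is an assumption-laden illustration, not the project's design: the `Contract` fields (`allowed_params`, `max_rows`), the `ProxyServer` class and the adapter functions are hypothetical, and real security measures such as SSL are out of scope here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contract:
    """Hypothetical federation contract: what a provider agrees to expose."""
    provider: str
    allowed_params: frozenset
    max_rows: int


class ProxyServer:
    """Users talk only to the proxy; the proxy talks to providers via adapters."""

    def __init__(self):
        self._members = {}  # provider name -> (contract, adapter callable)

    def register(self, contract, adapter):
        self._members[contract.provider] = (contract, adapter)

    def query(self, param):
        """Generic query; returns uniform rows tagged with their provider."""
        out = []
        for contract, adapter in self._members.values():
            if param not in contract.allowed_params:
                continue  # outside this provider's contract: never forwarded
            rows = adapter(param)[: contract.max_rows]
            out.extend({"provider": contract.provider, **row} for row in rows)
        return out


# Stand-in adapters; real ones would query SQL or legacy servers.
def sql_adapter(param):
    return [{"param": param, "value": v} for v in (1.0, 2.0, 3.0)]


def legacy_adapter(param):
    return [{"param": param, "value": 9.0}]


proxy = ProxyServer()
proxy.register(Contract("SQLServer1", frozenset({"SO4"}), max_rows=2), sql_adapter)
proxy.register(Contract("LegacyServer", frozenset({"SO4", "NO3"}), max_rows=10), legacy_adapter)

so4 = proxy.query("SO4")
no3 = proxy.query("NO3")
```

Keeping the contract check inside the proxy means a provider's server only ever sees queries it has agreed to answer.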

Data Model (Ray Plante, Virtual Obs)
What’s the difference between Data Models and Metadata?
Intertwined:
- metadatum: a datum with a name or semantic tag that refers to the data
- data model: a description of the relationships between metadata: structural & logical relationships between compound objects & their components, and operations that can be performed on them (really -
- framework: the architecture/process used to define metadata/data models that enables their ready use in applications
Formalized data modeling process:
- encourages community involvement for defining standard models & metadata
- structure enables easy verification, dissemination, & automated use
- “standard” metadata should point directly to components of the “standard” models
- allows groups to define metadata independent of the “standard” metadata
Practical difference?
- a data model captures as complete a picture of a concept as possible
- metadata represent the instantiation of a portion of the model’s components
Data access through a data model (Wrapper Classes for each data model)

Integration for Global-Local Activities
Global Activity => Local Benefit
- Global data, tools => Improved local productivity
- Global data analysis => Spatial context; initial analysis
- Analysis guidance => Standardized analysis, reporting
Local Activity => Global Benefit
- Local data, tools => Improved global productivity
- Local data analysis => Elucidate, expand initial analysis
- Identify relevant issues => Responsive, relevant global work
Global and local activities are both needed, e.g. ‘think global, act local’.
‘Global’ and ‘Local’ here refer to relative, not absolute, scale.

Federated Data System Features
- Data reside in their respective home environment, where they can mature. ‘Uprooted’ data in centralized databases are not easily updated, maintained or enriched.
- Abstract (universal) query/retrieval facilitates integration and comparison along the key dimensions (space, time, parameter, method).
- The open data query based on Web Services promotes the building of further value chains: Data Viewers, Data Integration Programs, Automatic Report Generators, etc.
- Data access through the Proxy server protects the data providers and the data users from security breaches and excessive detail.

Integration for Global-Local Activities
Global Activity => Local Benefit
- Global data & analysis => Spatial context; initial analysis
- Analysis guidance => Standardized analysis, reporting
Local Activity => Global Benefit
- Local data & analysis => Elucidate, expand initial analysis
- Identify relevant issues => Responsive, relevant global analysis
Global and local activities are both needed, e.g. ‘think global, act local’.
‘Global’ and ‘Local’ here refer to relative, not absolute, spatial scale.

Federated Information System
- Providers maintain their own workspace and resources (data, tools, reports).
- Part of the private resources are exposed as shared (federated) resources.
- The Federation facilitates finding, accessing and using the shared resources.
Data sharing federations:
- Open GIS Consortium (GIS data layers)
- NASA SEEDS network (Satellite data)
- NSF Digital Government
- EPA’s National Env. Info Exch. Network
[Diagram labels: Info; Data; Shared (Federated) Resources (Data, Services, Tools, Methods); Shared; Private; Other Federations; Providers/Users]

Data Federation Concept and the FASTNET Network
- Schematic representation of data sharing in a federated information system, based on the premise that providers expose part of their data (green) to others.
- Schematic of the value-adding network proposed for FASTNET.
- Components embedded in the federated value network.

Data Acquisition and Usage Value Chain
[Diagram labels: Monitor, Store, Data 1; Monitor, Store, Data 2; Monitor, Store, Data n; Monitor, Store, Data m; IntData1; IntData2; IntDatan; Virtual Int. Data]

Processes of the Information Value Chain (after Taylor, 1975)
Data => Information => Informing Knowledge => Productive Knowledge => Action
- Organizing: Grouping, Classifying, Formatting, Displaying
- Analyzing: Separating, Evaluating, Interpreting, Synthesizing
- Judging: Options, Quality, Advantages, Disadvantages
- Deciding: Matching goals, Compromising, Bargaining, Deciding
Examples: CIRA VIEWS; Langley IDEA; AQ Manager; WG Summary Rpt

Data Flow and Processing

CATT: A Community Tool! Part of an Analysis Value Chain
[Diagram labels: AEROSOL: Aerosol Data Collection (IMP., EPA Aerosol Sensors); Integration (VIEWS); Integrated AerData; AerData Cube; Aggreg. Aerosol. TRANSPORT: Weather Data; Assimilate (NWS); Gridded Meteor.; Trajectory (ARL); Traject. Data; TrajData Cube; Aggreg. Traject. CATT-In; CATT (CAPITA); Why? How? When? Where?; There! Not There!; Next Process; Further Analysis (GIS, Grid Processing, Emission Comparison)]

FASTNET: Fast Aerosol Sensing Tools for Natural Event Tracking
- Analysts Console
- Community Website

Distributed Programming: Interpreted and Compiled
Web services allow processing of distributed data:
- Data are distributed and maintained by their custodians
- Processing nodes (web services) are also distributed
- ‘Interpreted’ web programs for data processing can be created ad hoc by end users
However, ‘interpreted’ web programs are slow, fragile and uncertain:
- Slow due to large data transfers between nodes
- Fragile due to instability of connections
- Uncertain due to failures of data provider and processing nodes
One solution is to ‘compile’ the data and processing services:
- Data compilation transforms the data for fast, effective access (e.g. OLAP)
- Web service compilation combines processes for effective execution
Interpreted or compiled?
- Interpreted web programs are simpler and up to date but slow, fragile and uncertain
- Compiled versions are more elaborate and latent but also faster and more robust
- Frequently used datasets and processing chains should be compiled and kept current
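The difference between the two execution styles can be sketched by counting remote invocations. Everything in this example is an assumption made for illustration: the `Network` class merely counts calls in place of real HTTP traffic, and `point_access`, `grid` and `render` are toy stand-ins for services of the same kind as those named in these slides.

```python
from functools import reduce


class Network:
    """Stand-in for the Internet: counts how many remote invocations occur."""

    def __init__(self):
        self.calls = 0

    def invoke(self, service, data):
        self.calls += 1
        return service(data)


def point_access(_):
    # Hypothetical point data as (x, y, value) triples.
    return [(0, 0, 4.0), (0, 0, 6.0), (1, 0, 3.0)]


def grid(points):
    # Average the point values that fall in each grid cell.
    cells = {}
    for x, y, v in points:
        cells.setdefault((x, y), []).append(v)
    return {cell: sum(vs) / len(vs) for cell, vs in cells.items()}


def render(cells):
    return ", ".join(f"{cell}={value:.1f}" for cell, value in sorted(cells.items()))


def run_interpreted(net, chain, data=None):
    """Each service is a separate remote call; data crosses the network each time."""
    for service in chain:
        data = net.invoke(service, data)
    return data


def compile_chain(*chain):
    """Combine the services into one aggregate service that runs in one place."""
    def compiled(data=None):
        return reduce(lambda acc, service: service(acc), chain, data)
    return compiled


chain = (point_access, grid, render)
net_a, net_b = Network(), Network()
interpreted = run_interpreted(net_a, chain)
compiled = net_b.invoke(compile_chain(*chain), None)
```

Both runs produce the same rendering, but the interpreted run makes one remote call per service while the compiled run makes a single call, which is the source of the speed and robustness gain claimed for compilation.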

Interpreted and Compiled Service
- Interpreted Service: processes are distributed; data flow on the Internet.
- Compiled Service: processes are in the same place; data flow within the aggregate service; controllers, e.g. zoom, can be shared.
[Diagram labels: Point Access; Point Grid; Grid Render; Point Render; PtGrid Overlay; Data Flow; Control Flow]

Voyager: The Program
- The Voyager program consists of a stable core and an adaptive input/output section.
- The core executes the data selection, access and portrayal tasks.
- The adaptive, abstract I/O layer connects the core to evolving web data, flexible displays and a configurable user interface:
  - Wrappers encapsulate the heterogeneous external data sources and homogenize the access
  - Device Drivers translate generic, abstract graphic objects to specific devices and formats
  - Ports expose the internal parameters of Voyager to external controls
[Diagram labels: Data Sources; Controls; Displays; Voyager Core (Data Selection, Data Access, Data Portrayal); Adaptive Abstract I/O Layer (Wrappers, Device Drivers, Ports)]

Dvoy_Services: Generic Software Components
User Interface Module (UIM):
- UIM extracts relevant UI parameters from STATE
- User changes UI parameters
- UIM transmits modified UI parameters to STATE
Service Chain STATE Module:
- Contains the state params for all services in the chain
- Has ports for getting/setting state params
Service Adaptor Module:
- Gets input data from upstream service
- Gets service params from STATE
- Makes service call
Web Service (Service Module):
- Gets service call from Adaptor module
- Executes service
- Returns output data
[Diagram labels: Webservice (Param1, Param2); Service state; Webservice Adaptor; User Interface Module; Controller; state I/O ports; Web service calls; Web service input data; Web service output data]

PointAccess->Grid->GridRender Service Chain
The service chain interpreter makes ONLY 2 sequential calls, stated in the data flow program:
- GetMapPointDataAdaptor
- RenderMapviewPointAdaptor
Service state:
- GetMapPointData: dataset_abbr: IMPROVE; Param_abber: SOILf; datatime: ; sql_filter:
- RenderMapviewPoint: dataset_url: ; output_format: ; out_image_width: ; etc.
[Diagram labels: GetMapPointData Adaptor; RenderMapviewPoint Adaptor; GetMapPointData Selector; RenderMapviewPoint Selector; state I/O ports; Web service calls; Web service output data; Service state]

PointAccess->Grid->GridRender Service Chain
The service chain interpreter makes ONLY 3 sequential calls, stated in the data flow program:
- GetMapPointDataAdaptor
- GridMapviewPointAdaptor
- RenderMapviewGridAdaptor
Service state:
- GetMapPointData: dataset_abbr: IMPROVE; Param_abber: SOILf; datatime: ; sql_filter:
- GridMapviewPoint: dataset_url: ; output_format: ; out_image_width: ; etc.
- RenderMapviewGrid: dataset_url: ; output_format: ; out_image_width: ; etc.
[Diagram labels: GetMapPointData Adaptor; GridMapviewPoint Adaptor; RenderMapviewGrid Adaptor; GetMapPointData Selector; GridMapviewPoint Selector; RenderMapviewGrid Selector; state I/O ports; Web service calls; Web service output data; Service state]
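The division of labor between a shared STATE module and per-service adaptors might look roughly like the sketch below. The service names come from the slide, but everything else is assumed for illustration: the `State` and `Adaptor` classes, the stand-in service functions, and in particular the `cell_size` parameter, which does not appear in the source.

```python
class State:
    """Holds the parameters of every service in the chain, with get/set ports."""

    def __init__(self, params):
        self._params = {svc: dict(p) for svc, p in params.items()}

    def get(self, service):
        return dict(self._params[service])

    def set(self, service, name, value):
        self._params[service][name] = value


class Adaptor:
    """Gets input from upstream, gets params from STATE, makes the service call."""

    def __init__(self, name, service, state):
        self.name, self.service, self.state = name, service, state

    def call(self, upstream):
        return self.service(upstream, **self.state.get(self.name))


def run_chain(adaptors):
    data = None
    for adaptor in adaptors:  # one sequential call per adaptor
        data = adaptor.call(data)
    return data


# Stand-in services; real ones would be remote web services.
def get_map_point_data(_upstream, dataset_abbr, param_abbr):
    return [(0, 0, 1.0), (0, 0, 3.0), (1, 1, 5.0)]  # hypothetical (x, y, value)


def grid_mapview_point(points, cell_size):  # cell_size is a made-up parameter
    cells = {}
    for x, y, v in points:
        cells.setdefault((x // cell_size, y // cell_size), []).append(v)
    return {c: sum(vs) / len(vs) for c, vs in cells.items()}


def render_mapview_grid(cells, output_format):
    return f"{output_format}:{len(cells)} cells"


state = State({
    "GetMapPointData": {"dataset_abbr": "IMPROVE", "param_abbr": "SOILf"},
    "GridMapviewPoint": {"cell_size": 1},
    "RenderMapviewGrid": {"output_format": "png"},
})
chain = [
    Adaptor("GetMapPointData", get_map_point_data, state),
    Adaptor("GridMapviewPoint", grid_mapview_point, state),
    Adaptor("RenderMapviewGrid", render_mapview_grid, state),
]

first = run_chain(chain)
state.set("GridMapviewPoint", "cell_size", 2)  # what a UI module would do
second = run_chain(chain)
```

Because the adaptors read their parameters from STATE at call time, a user-interface change only has to touch STATE; the chain itself is rerun unchanged.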

VOYAGER Web Services
- Users: Select, Overlay, Explore; Multidimensional data
- Providers: Maintain distributed data; Heterogeneous coding, access
- Voyager Web Services: Homogenize data access; Catalog, access, transform data
[Diagram labels: Layered Map; Time Chart; Scatter Chart; Vector GIS Data; XDim Data; SQL Tables; Web Images; Publish, Find, Bind; Catalog, Data & Tools; Uniform Access; Support; Coordination; Technologies; Community]

Services Program Execution: Reverse Polish Notation
Writing the WS program:
- Write the program on the command line of a URL call
- Services are written sequentially using RPN
- Replacements
Connector/Adaptor:
- Reads the service name from the command line and loads its WSDL
- Scans the input WSDL
- The schema walker populates the service input fields from:
  - the data on the command line
  - the data output of the upstream process
  - the catalog for the missing data
Service Execution, for each service:
- Reads the command line, one service at a time
- Passes the service parameters to the above Connector/Adaptor, which prepares the service
- Executes the service
- It also handles the data stack for RPN
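The RPN execution model can be sketched with a plain data stack. This is an illustration under assumptions, not the Dvoy interpreter: the `/` token separator, the `run_rpn` function, the arity table and the string-building stand-in services are all hypothetical, and WSDL loading and catalog lookup are left out.

```python
def run_rpn(program, services):
    """Evaluate a '/'-separated RPN program: data tokens are pushed, service
    tokens pop their inputs from the stack and push their output."""
    stack = []
    for token in program.split("/"):
        if token in services:
            arity, func = services[token]
            # Pop the inputs, then restore left-to-right order.
            args = [stack.pop() for _ in range(arity)][::-1]
            stack.append(func(*args))
        else:
            stack.append(token)  # plain data from the command line
    return stack


# Stand-in services that only record how they were called.
services = {
    "GetMapPointData": (2, lambda dataset, param: f"points({dataset},{param})"),
    "GridMapviewPoint": (1, lambda points: f"grid({points})"),
    "RenderMapviewGrid": (1, lambda grid: f"image({grid})"),
}

program = "IMPROVE/SOILf/GetMapPointData/GridMapviewPoint/RenderMapviewGrid"
result = run_rpn(program, services)
```

In RPN the operands precede the operator, so each service finds the output of its upstream process already on top of the stack, and no parentheses or nesting are needed in the URL.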

Lecture Notes by M. Small, Pg 18-19; 23-24

Value Network_Anderson