Published by Katherine Sims. Modified over 8 years ago.
1
VOYAGER Data Explorer: Architecture and Technologies
See also the Voyager Developer Website and early Applications.
[Diagram labels: Providers (Vector GIS Data, XDim Data, SQL Tables, Web Images); Voyager Web Services (Publish, Find, Bind; Data & Tool Catalog; Uniform Access); Users (Layered Map, Time Chart, Scatter Chart); Support; Coord./Cooperation; Technologies]
2
Voyager Spatio-Temporal Data Explorer: Built and Used by a Virtual Community
- Select, Overlay, Explore; Multidimensional data
- Maintain Distributed Data; Heterogeneous coding, access
- Connect providers to users; Homogenize data access
[Diagram labels: Providers (Vector GIS Data, XDim Data, SQL Table, Web Images); Voyager Web Services (Publish, Find, Bind; Data & Tool Catalog; Uniform Access/Retrieval); Users (Layered Map, Time Chart, Scatter Chart)]
3
The Dvoy Project
- DVOY is a graphic browser for distributed, heterogeneous, multidimensional datasets.
- The initial Dvoy infrastructure was developed at CAPITA, with NSF support.
- Further services for data access, processing and viewing are expected from the community.
- The project is expected to evolve by riding the 'web services wave' of the Internet.
CAPITA Support:
- NSF ITR Workgroup Collab. Tool: Aug 2001 - Aug 2004
- EPA Web-based Visibility: Aug 2001 - Apr 2003
- NOAA ASOS Visibility: Aug 2001 - May 2002
- MARAMA Chemical Trajectory Tool: Aug 2002 - July 2003
- EPA OAQPS Global Transport Analysis: Nov 2002 - Oct 2003
In-kind support by organizations participating in DVOY-based data sharing.
4
The Dvoy Project
- DVOY is a Federated Information System for heterogeneous, multidimensional datasets.
- Voyager is a generic graphic browser for the federated DVOY data.
- The initial Dvoy infrastructure is being developed at CAPITA, with NSF support.
- Further services for data access, processing and viewing are expected from the community.
- The project is expected to evolve by riding the 'web services wave' of the Internet.
CAPITA projects which use DVOY:
- NSF ITR Workgroup Collaboration Tool: Aug 2001 - Aug 2004
- EPA Web-based Visibility: Aug 2001 - Apr 2003
- NOAA ASOS Visibility: Sep 2001 - Sep 2002
- MARAMA Chemical Trajectory Tool: Aug 2002 - July 2003
- EPA OAQPS Global Transport Analysis: Nov 2002 - Oct 2003
- NSF DigiGov Fire and Smoke Network: May 2003 - Apr 2006 (pending)
- NASA ESE Satellite Appl. to PM Management: June 2003 - May 2008 (pending)
In-kind support by organizations participating in DVOY-based federated data sharing.
Collaborators/Partners: CIRA (Schichtel), NRL (Westphal), NASA (Goddard), and many data sources.
CAPITA cast: R. Husar, S. Falke, K. Hoijarvi, J. Colson, R. Zager
5
Related CAPITA Projects
- EPA Network Design Project (~$150K/yr - April 2003). Development of novel quantitative methods of network optimization. The network performance evaluation is conducted using the complete PM FRM data set in AIRS, which will be available for input into the SRDS.
- EPA WebVis Project (~$120K/yr - April 2003). Delivery of current visibility data to the public through a web-based system. The surface met data are being transferred into the SQL database (since March 2001) and will be available to SRDS.
- NSF Collaboration Support Project (~$140K/yr - Dec 2004). Continuing development of interactive web sites for community discussions and for web-based data sharing (directly applicable to this project).
- NOAA ASOS Analysis Project (~$50K/yr - May 2002). Evaluate the potential utility of the ASOS visibility sensors (900 sites, one-minute resolution) as a PM surrogate. Data now available for April-October 2001 can be incorporated into the Supersite Relational Data System.
- St. Louis Supersite Project website (~$50K/yr - Dec 2003). The CAPITA group maintains the St. Louis Supersite website and some auxiliary data. It will also be used for this project.
6
NSF-NOAA-EPA/EMAP (NASA)? Project: Real-Time Aerosol Watch System
- Real-Time Virtual PM Monitoring Dashboard. A web page for one-stop access to pre-set views of current PM monitoring data, including surface PM, satellite, weather and model data.
- Virtual Workgroup Website. An interactive website which facilitates the active participation of diverse members in the interpretation, discussion, summary and assessment of the on-line PM monitoring data.
- Air Quality Managers Console. Helps PM managers make decisions during major aerosol events; delivers a subset of the PM data relevant to the AQ managers, including summary reports prepared by the Virtual Workgroups.
7
Features of the DVOY XML Web Service Architecture
- Interoperability: Platform and language independence; based on Web Services (XML, SVG).
- Legacy Support: Encapsulating existing data and exposing them as Web Services. (Access to standard HTTP/FTP servers)
- Just-in-time integration: Discovery of, access to and ad-hoc chaining of services. (Future: Agile app building)
8
DVOY Web Services
DVOY draws on three basic web services:
- The DataCatalogWrapping service is for data registration, finding and wrapping information.
- The DataAccess service provides uniform access to heterogeneous, distributed, multidimensional data.
- The DataPortrayal service prepares input data in a form suitable for rendering.
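The way the three services could chain together can be sketched as follows. This is a minimal illustration, not the DVOY implementation: the class and function names (`CatalogEntry`, `DataCatalog`, `data_access`, `data_portrayal`), the record layout and the sample values are all assumptions made for the example, and an in-memory tuple stands in for a remote data source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    """Hypothetical catalog record describing one registered dataset."""
    name: str
    keywords: tuple
    rows: tuple  # stand-in for a remote source: (time, location, value)


class DataCatalog:
    """Registration and lookup of datasets (the 'publish' and 'find' steps)."""

    def __init__(self):
        self._entries = {}

    def register(self, entry):
        self._entries[entry.name] = entry

    def find(self, keyword):
        return [e for e in self._entries.values() if keyword in e.keywords]


def data_access(entry, location):
    """Return uniform records for one location (the 'bind' step)."""
    return [
        {"time": t, "location": loc, "value": v}
        for (t, loc, v) in entry.rows
        if loc == location
    ]


def data_portrayal(records):
    """Turn uniform records into time-ordered (x, y) points for a chart."""
    return sorted((r["time"], r["value"]) for r in records)


# Made-up sample values, for illustration only.
catalog = DataCatalog()
catalog.register(CatalogEntry(
    name="demo_visibility",
    keywords=("visibility", "surface"),
    rows=(("2001-08-02", "STL", 12.0),
          ("2001-08-01", "STL", 9.5),
          ("2001-08-01", "ORD", 15.0)),
))

entry = catalog.find("visibility")[0]
points = data_portrayal(data_access(entry, "STL"))
```

Each step only depends on the output of the previous one, which is what would allow a different viewer or a different data source to be swapped in without touching the rest of the chain.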
9
Dvoy Federated Information System
- Dvoy offers a homogeneous, read-only access mechanism to a dynamically changing collection of heterogeneous, autonomous and distributed information sources.
- Data access uses a global multidimensional schema consisting of spatial, temporal and parameter dimensions.
- The uniform global schema is suitable for data browsing and online analytical processing (OLAP).
- The limited global query capabilities yield slices along the spatial, temporal and parameter dimensions of the multidimensional data cubes.
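A slice query over such a cube can be sketched in a few lines. The example below is only an assumption about what a space/time/parameter query might look like; `Observation`, `CubeQuery` and `slice_cube` are hypothetical names, the schema is reduced to point observations, and the values are made up.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Observation:
    """One cell of the cube in a simplified global schema."""
    time: str        # ISO date, so string comparison orders correctly
    lat: float
    lon: float
    parameter: str
    value: float


@dataclass(frozen=True)
class CubeQuery:
    """Optional constraints on each dimension; None means 'no constraint'."""
    time_range: Optional[Tuple[str, str]] = None
    bbox: Optional[Tuple[float, float, float, float]] = None  # min_lat, min_lon, max_lat, max_lon
    parameter: Optional[str] = None


def slice_cube(cube, query):
    """Return the observations that satisfy every constraint in the query."""
    def keep(obs):
        if query.time_range is not None:
            start, end = query.time_range
            if not (start <= obs.time <= end):
                return False
        if query.bbox is not None:
            min_lat, min_lon, max_lat, max_lon = query.bbox
            if not (min_lat <= obs.lat <= max_lat and min_lon <= obs.lon <= max_lon):
                return False
        if query.parameter is not None and obs.parameter != query.parameter:
            return False
        return True

    return [obs for obs in cube if keep(obs)]


# Made-up sample values, for illustration only.
CUBE = [
    Observation("2001-08-01", 38.6, -90.2, "PM25", 14.0),
    Observation("2001-08-02", 38.6, -90.2, "PM25", 22.0),
    Observation("2001-08-01", 41.9, -87.6, "PM25", 11.0),
    Observation("2001-08-01", 38.6, -90.2, "VIS", 9.5),
]

# Fix space and parameter, vary time: data for a time chart.
time_series = slice_cube(CUBE, CubeQuery(bbox=(38.0, -91.0, 39.0, -90.0), parameter="PM25"))

# Fix time and parameter, vary space: data for a map layer.
map_slice = slice_cube(CUBE, CubeQuery(time_range=("2001-08-01", "2001-08-01"), parameter="PM25"))
```

Fixing two dimensions and leaving the third free gives the two views named on the earlier slides: a time chart and a layered map.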
10
Scientific and Administrative Rationale for Resource Sharing
Scientific Rationale:
- Regional haze and its precursors have a 1000-10000 km airshed. (Smoke, Dust, Haze) - Data integration
- A substantial fraction of haze originates from natural sources or from out-of-jurisdiction man-made sources.
- Cross-RPO data and knowledge sharing yields better operational and science support to AQ management.
Management Rationale:
- Haze control within some RPOs cannot yield a complete answer.
- Data sharing saves money and human resources.
11
Data Re-Use and Synergy
- Data producers maintain their own workspace and resources (data, reports, comments).
- Part of the resources is shared by creating a common virtual resource.
- Web-based integration of the resources can be across several dimensions:
  Spatial scale: Local - global data sharing
  Data content: Combination of data generated internally and externally
- The main benefits of sharing are data re-use, data complementing and synergy.
- The goal of the system is to have the benefits of sharing outweigh the costs.
[Diagram labels: Content; User; Local; Global; Virtual Shared Resources (Data, Knowledge, Tools, Methods); Shared part of resources]
13
A Strategy for the Federated PM/Haze Data Warehouse
- Negotiate with the data providers to 'open up' their data servers for limited, controlled access in accordance with a clear 'access contract' with the Federated Warehouse.
- Design an interface to the warehoused datasets that has simple data access and satisfies the data needs of most integrating users. (oxymoron ????)
- Facilitate the development of shared value-adding processes (analysis tools, methods) that refine the raw data into useful knowledge.
14
Three-Tier Federated Data Warehouse Architecture
(Note: In this context, 'Federated' differs from 'Federal' in the direction of the driving force. 'Federated' is meant to indicate a driving force for sharing from the 'bottom up', i.e. from the members, not dictated from 'above' by the Feds.)
1. Provider Tier: Back-end servers containing heterogeneous data, maintained by the federation members
2. Proxy Tier: Retrieves designated Provider data and homogenizes it into common, uniform Datasets
3. User Tier: Accesses the Proxy Server and uses the uniform data for presentation, integration or processing
[Diagram labels: Provider Tier (heterogeneous data in distributed SQL Servers); Proxy Tier (data homogenization, transformation); User Tier (data presentation, processing); Federated Data Warehouse]
15
Federated Data Warehouse Interactions
The Provider servers interact only with the Proxy Server in accordance with the Federation Contract
- The contract sets the rules of interaction (accessible data subsets, types of queries)
- Strong server security measures enforced, e.g. through Secure Socket Layer
The data User interacts only with the generic Proxy Server using a flexible Web Services interface
- Generic data queries, applicable to all data in the Warehouse (e.g. data sub-cube by space, time, parameter)
- The data query is addressed to the Web Service provided by the Proxy Server
- Uniform, self-describing data packages are passed to the user for presentation or further processing
[Diagram labels: Provider Tier, Heterogeneous Data, Member Servers (SQLServer1, SQLServer2, LegacyServer); Proxy Tier, Data Homogenization etc., Proxy Server (SQLDataAdapter1, SQLDataAdapter2, CustomDataAdapter); User Tier, Data Consumption (Presentation, Processing, Integration), Data Access & Use; Federated Data Warehouse; Fire Wall, Federation Contract; Web Service, Uniform Query & Data]
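The adapter arrangement in the diagram can be illustrated with a small sketch. It assumes one adapter per provider that converts the provider's native format into a shared record layout, and a proxy that answers one generic query over all of them. The classes below (`DataAdapter`, `SqlRowAdapter`, `CsvTextAdapter`, `ProxyServer`) are hypothetical stand-ins loosely modeled on the diagram labels, the provider formats and values are made up, and the contract and security layers are left out.

```python
from abc import ABC, abstractmethod


class DataAdapter(ABC):
    """Wraps one provider and emits records in the shared layout."""

    @abstractmethod
    def fetch(self):
        """Return a list of dicts with keys: time, site, parameter, value."""


class SqlRowAdapter(DataAdapter):
    """Provider that returns SQL-style row tuples (obs_date, station, pm25)."""

    def __init__(self, rows):
        self._rows = rows

    def fetch(self):
        return [
            {"time": d, "site": s, "parameter": "PM25", "value": float(v)}
            for (d, s, v) in self._rows
        ]


class CsvTextAdapter(DataAdapter):
    """Legacy provider that returns 'site;date;value' text lines."""

    def __init__(self, text, parameter):
        self._text = text
        self._parameter = parameter

    def fetch(self):
        records = []
        for line in self._text.strip().splitlines():
            site, date, value = line.split(";")
            records.append({"time": date, "site": site,
                            "parameter": self._parameter,
                            "value": float(value)})
        return records


class ProxyServer:
    """Single entry point for users; hides the providers behind adapters."""

    def __init__(self, adapters):
        self._adapters = list(adapters)

    def query(self, parameter=None, site=None):
        results = []
        for adapter in self._adapters:
            for rec in adapter.fetch():
                if parameter is not None and rec["parameter"] != parameter:
                    continue
                if site is not None and rec["site"] != site:
                    continue
                results.append(rec)
        # Same ordering regardless of which provider a record came from.
        return sorted(results, key=lambda r: (r["time"], r["site"]))


# Made-up sample values, for illustration only.
proxy = ProxyServer([
    SqlRowAdapter([("2001-08-01", "STL", 14), ("2001-08-02", "STL", 22)]),
    CsvTextAdapter("ORD;2001-08-01;11.0\nSTL;2001-08-03;18.5", "PM25"),
])
stl = proxy.query(parameter="PM25", site="STL")
```

The user-side call never mentions a provider: records from the SQL-style source and the legacy text source come back in one list with identical keys, which is the homogenization role the Proxy Tier plays.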
16
Federated Data Warehouse Features
- Data reside in their respective home environment, where they can mature. 'Uprooted' data in separated databases are not easily updated, maintained or enriched.
- Abstract (universal) query/retrieval facilitates integration and comparison along the key dimensions (space, time, parameter, method).
- The open data query based on Web Services promotes the building of further value chains: Data Viewers, Data Integration Programs, Automatic Report Generators, etc.
- Data access through the Proxy Server protects the data providers and the data users from security breaches and excessive detail.
17
INMON
The applications that were built in the early days of the information processing industry were designed to satisfy a set of applications. These applications did such things as collect data, store data, and provide online data access. After an organization had built or otherwise acquired many applications, the organization discovered that the application data did not fulfill the need for information in the corporation. In particular, the organization discovered that applications did not provide integrated data, historical data, and summary data.
As an example of the lack of integrated data, the corporation had no integrated, corporate definition for such things as CUSTOMER, PRODUCT, TRANSACTION, and so forth. Each application had its own notion of these things.
As an example of the lack of historical data, corporations could tell a bank customer how much money was in an account today, but could not tell the customer what his/her average account balance had been for each month for the past twelve months.
18
SRDS and Federated Data Warehouse Technologies
- Server hardware: 2 identical Dell PowerEdge 4400 servers (SQL Server and Web Server); dual-processor Xeon, 1 GB memory, 260 GB RAID drives
- SQL Software: Microsoft SQL 2000 Enterprise Development Server
- Web Server: Microsoft IIS 2000, including Data Transformation Services for data ingestion
- Programming Environment: Microsoft .NET languages (VB, C#) for creating and programming web objects, and ASP.NET to create the distributed web pages
Note: The rapid development of distributed applications was recently made possible by the ubiquity of SOAP/XML as a data transport protocol, and Web Services/.NET as the distributed programming environment. In fact, .NET is still in version Beta 2.
20
The Winds of Change
- Shift from primary to secondary pollutants. Ozone and PM2.5 travel 500+ miles across state or international boundaries, and their sources are not well established.
- New regulatory approach. Compliance evaluation based on 'weight of evidence' and tracking the effectiveness of controls.
- Shift from command & control to participatory management. Inclusion of federal, state, local, industry and international stakeholders.
21
Challenges
- Broader user community. The information systems need to be extended to reach all the stakeholders (federal, state, local, industry, international).
- A richer set of data and analysis. Establishing causality, 'weight of evidence' and emissions tracking requires the analysis of air quality, meteorology, emissions and effects data.
Opportunities
- Rich AQ data availability. Abundant high-grade routine and research monitoring data from EPA and other agencies are now available.
- New information technologies. DBMS, data exploration tools and web-based communication now allow cooperation (sharing) and coordination among diverse groups.
22
Recap: Harnessing the Winds
- Secondary pollutants, along with a more open environmental management style, are placing increasing demand on data analysis. Meanwhile, rich AQ data sets and computer and communications technologies offer unique opportunities.
- It appears timely to consider the development of a web-based, open, distributed air quality data integration, analysis and dissemination system.
- The challenge is to learn how to harness the winds of change, as sailors have learned to use the winds for going from A to B.
23
Infrastructure support for a distributed system
- Data sharing standards. A set of open standards for the sharing of AQ data, tools and reports. Examples: TCP/IP, HTML, XML, FGDC
- Data catalog. A virtual centralized catalog with search and retrieval facilities. Examples: GCMD, web indexes
- Web-based shared workspace. A place to share comments, feedback, plans, ...
24
Earth Science Research & Application Matrix