Federated PM and Haze Data Warehouse Project, a sub-project of the St. Louis Midwest Supersite Project. Nov 20, 2001, RBH.

Presentation transcript:

Title slide, with placeholder stickers and logos for participating projects: Regional Planning Organization (RPO), EPA Supersites, NARSTO PM, EPA divisions, and individual aerosol projects.

PM/Haze Data Flow in Support of AQ Management
- Numerous organizations need data relevant to PM and haze.
- Most interested parties (stakeholders) are both producers and consumers of PM and haze data.
- There is a general willingness to share data, but the resistance to data flow and processing is too high.
- PM and haze data are used for many parts of AQ management, mostly in the form of reports.
- The pertinent (ambient, emission) data come from many different sources.
- To produce relevant reports, the data need to be ‘processed’ (integrated, filtered, aggregated).
Diagram labels: RPO (Regional Planning Orgs); FLM (Federal Land Managers); EPA (Regul. & Research); Industry; Academic; NARSTO; SuperSite; Other (Private, Academic); center: Shared PM/Haze Data.

Scientific and Administrative Rationale for Resource Sharing
Scientific rationale:
- Regional haze and its precursors have a km airshed. (Smoke, Dust, Haze) - Data integration
- A substantial fraction of haze originates from natural sources or from out-of-jurisdiction man-made sources.
- Cross-RPO data and knowledge sharing yields better operational and science support to AQ management.
Management rationale:
- Haze control within some RPOs cannot yield
- Data sharing saves money and ….

A Strategy for the Federated PM/Haze Data Warehouse
- Negotiate with the data providers to ‘open up’ their data servers for limited, controlled access, in accordance with a clear ‘access contract’ with the Federated Warehouse.
- Design an interface to the warehoused datasets that offers simple data access and satisfies the data needs of most integrating users (an oxymoron?).
- Facilitate the development of shared value-adding processes (analysis tools, methods) that refine the raw data into useful knowledge.

Three-Tier Federated Data Warehouse Architecture
Note: In this context, ‘Federated’ differs from ‘Federal’ in the direction of the driving force. ‘Federated’ is meant to indicate that the drive for sharing comes from the ‘bottom up’, i.e. from the members, and is not dictated from ‘above’ by the Feds.
1. Provider Tier: Back-end servers containing heterogeneous data, maintained by the federation members.
2. Proxy Tier: Retrieves designated Provider data and homogenizes it into common, uniform Datasets.
3. User Tier: Accesses the Proxy Server and uses the uniform data for presentation, integration or processing.
Diagram labels: Provider Tier (heterogeneous data in distributed SQL Servers); Proxy Tier (data homogenization, transformation); User Tier (data presentation, processing); the whole is labeled Federated Data Warehouse.
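The three tiers can be sketched in a few lines of Python. This is only an illustration of the layering, not the project's implementation: the class names `ProviderServer`, `ProxyServer` and `Observation`, the native field names, and all data values are assumptions made up for the example.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Observation:
    """Uniform record shared by the proxy and user tiers (hypothetical shape)."""
    location: str
    time: str        # ISO 8601 date string
    parameter: str
    value: float


class ProviderServer:
    """Provider tier: a member server that keeps data in its own native layout."""

    def __init__(self, native_rows: list[dict]):
        self._rows = native_rows

    def fetch(self) -> list[dict]:
        return list(self._rows)


class ProxyServer:
    """Proxy tier: pulls designated provider data and homogenizes it."""

    def __init__(self) -> None:
        self._members: list[tuple[ProviderServer, Callable[[dict], Observation]]] = []

    def register(self, provider: ProviderServer,
                 to_uniform: Callable[[dict], Observation]) -> None:
        # Each member comes with a function mapping its native rows to Observation.
        self._members.append((provider, to_uniform))

    def uniform_dataset(self) -> list[Observation]:
        return [to_uniform(row)
                for provider, to_uniform in self._members
                for row in provider.fetch()]


# Two members with different native field names (all values are made up).
member_a = ProviderServer([{"SiteCode": "SITE_A", "ObsDate": "2001-07-04",
                            "Param": "SO4", "Val": 2.1}])
member_b = ProviderServer([{"station": "SITE_B", "day": "2001-07-04",
                            "species": "SO4", "conc": 5.4}])

proxy = ProxyServer()
proxy.register(member_a, lambda r: Observation(r["SiteCode"], r["ObsDate"], r["Param"], r["Val"]))
proxy.register(member_b, lambda r: Observation(r["station"], r["day"], r["species"], r["conc"]))

# User tier: sees only uniform records, never the native layouts.
dataset = proxy.uniform_dataset()
for obs in dataset:
    print(obs)
```

The point of the sketch is that only the proxy knows about native layouts; adding a member means registering one more mapping, with no change on the user side.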

Federated Data Warehouse Interactions
- The Provider servers interact only with the Proxy Server, in accordance with the Federation Contract.
  - The contract sets the rules of interaction (accessible data subsets, types of queries).
  - Strong server security measures are enforced, e.g. through Secure Sockets Layer (SSL).
- The data User interacts only with the generic Proxy Server, using a flexible Web Services interface.
  - Generic data queries apply to all data in the Warehouse (e.g. a data sub-cube by space, time, parameter).
  - The data query is addressed to the Web Service provided by the Proxy Server.
  - Uniform, self-describing data packages are passed to the user for presentation or further processing.
Diagram labels: Provider Tier (heterogeneous data; member servers SQLServer1, SQLServer2, LegacyServer); firewall and Federation Contract; Proxy Tier (Proxy Server with SQLDataAdapter1, SQLDataAdapter2, CustomDataAdapter; data homogenization, etc.); Web Service with uniform query and data; User Tier (data consumption: presentation, processing, integration; data access and use). The whole is labeled Federated Data Warehouse.
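A minimal sketch of these interaction rules follows, assuming a contract that limits the queryable parameters and the length of the time window. The names `UniformQuery`, `FederationContract` and `answer`, the package layout, the unit, and all data values are hypothetical; the slides do not specify them.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class UniformQuery:
    """Generic sub-cube query by parameter, space and time."""
    parameter: str
    lat_range: tuple[float, float]
    lon_range: tuple[float, float]
    time_range: tuple[date, date]


@dataclass(frozen=True)
class FederationContract:
    """Rules of interaction: which parameters and how long a time window."""
    allowed_parameters: frozenset[str]
    max_days: int

    def permits(self, query: UniformQuery) -> bool:
        start, end = query.time_range
        return (query.parameter in self.allowed_parameters
                and 0 <= (end - start).days <= self.max_days)


# Homogenized records held by the proxy: (parameter, lat, lon, day, value).
# All values are made up for illustration.
RECORDS = [
    ("SO4", 38.6, -90.2, date(2001, 7, 4), 5.4),
    ("SO4", 44.4, -68.3, date(2001, 7, 4), 2.1),
    ("SO4", 38.6, -90.2, date(2001, 8, 1), 4.0),
]


def answer(query: UniformQuery, contract: FederationContract, records=RECORDS) -> dict:
    """Check the query against the contract, then return a self-describing package."""
    if not contract.permits(query):
        raise PermissionError("query is outside the federation contract")
    (lat0, lat1), (lon0, lon1) = query.lat_range, query.lon_range
    t0, t1 = query.time_range
    rows = [r for r in records
            if r[0] == query.parameter
            and lat0 <= r[1] <= lat1
            and lon0 <= r[2] <= lon1
            and t0 <= r[3] <= t1]
    return {
        "columns": ["parameter", "lat", "lon", "day", "value"],
        "units": {"value": "ug/m3"},  # assumed unit
        "rows": rows,
    }


contract = FederationContract(frozenset({"SO4"}), max_days=31)
query = UniformQuery("SO4", (35.0, 40.0), (-95.0, -85.0),
                     (date(2001, 7, 1), date(2001, 7, 31)))
package = answer(query, contract)
print(package["columns"], package["rows"])
```

Carrying the column names and units inside the package is what lets a viewer or processing tool consume it without prior knowledge of any provider's schema.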

Live Demo of the Data Warehouse Prototype
- Uniform data query regardless of the native schema: query by parameter, location, time, method.
- Currently, online data are accessible from the CIRA (IMPROVE) and CAPITA SQL servers.
- The hidden DataAdapter:
  - accepts the uniform query
  - accesses the data server
  - transforms the original data to uniform data
  - delivers uniform DataSets
- A rudimentary viewer displays the data in a table for browsing.
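The four adapter steps listed above can be illustrated with Python's standard `sqlite3` module standing in for the provider SQL servers. Everything here is an assumption for the sake of the example: the `SQLDataAdapter` class is a hypothetical sketch rather than the prototype's code, and the table layouts, site names and values are invented, not the CIRA or CAPITA schemas.

```python
import sqlite3

UNIFORM_COLUMNS = ("parameter", "location", "time", "method", "value")


class SQLDataAdapter:
    """Hypothetical adapter: one instance per provider, hiding its native schema."""

    def __init__(self, conn: sqlite3.Connection, native_sql: str):
        self.conn = conn
        # The native SQL must select its columns in UNIFORM_COLUMNS order.
        self.native_sql = native_sql

    def query(self, parameter: str, location: str, start: str, end: str) -> list[dict]:
        # Steps 1 and 2: accept the uniform query and run it on the data server.
        cursor = self.conn.execute(
            self.native_sql,
            {"parameter": parameter, "location": location, "start": start, "end": end},
        )
        # Steps 3 and 4: relabel native rows and deliver a uniform dataset.
        return [dict(zip(UNIFORM_COLUMNS, row)) for row in cursor.fetchall()]


def make_server(ddl: str, insert_sql: str, rows: list[tuple]) -> sqlite3.Connection:
    """Create an in-memory database that plays the role of a provider server."""
    conn = sqlite3.connect(":memory:")
    conn.execute(ddl)
    conn.executemany(insert_sql, rows)
    return conn


# Two providers with different (invented) native schemas and made-up values.
server_a = make_server(
    "CREATE TABLE obs (site_code TEXT, obs_date TEXT, param TEXT, method_code TEXT, val REAL)",
    "INSERT INTO obs VALUES (?, ?, ?, ?, ?)",
    [("SITE_A", "2001-07-04", "SO4", "filter", 2.1)],
)
server_b = make_server(
    "CREATE TABLE measurements (station TEXT, day TEXT, species TEXT, technique TEXT, conc REAL)",
    "INSERT INTO measurements VALUES (?, ?, ?, ?, ?)",
    [("SITE_B", "2001-07-04", "SO4", "continuous", 5.4)],
)

adapter_a = SQLDataAdapter(
    server_a,
    "SELECT param, site_code, obs_date, method_code, val FROM obs "
    "WHERE param = :parameter AND site_code = :location "
    "AND obs_date BETWEEN :start AND :end ORDER BY obs_date",
)
adapter_b = SQLDataAdapter(
    server_b,
    "SELECT species, station, day, technique, conc FROM measurements "
    "WHERE species = :parameter AND station = :location "
    "AND day BETWEEN :start AND :end ORDER BY day",
)

# The same uniform call works against both native schemas.
rows_a = adapter_a.query("SO4", "SITE_A", "2001-07-01", "2001-07-31")
rows_b = adapter_b.query("SO4", "SITE_B", "2001-07-01", "2001-07-31")
print(rows_a + rows_b)
```

In this sketch the schema knowledge lives entirely in the per-provider SQL string, so the caller sees identical keys from every server. Dates are stored as ISO 8601 text so that `BETWEEN` compares them correctly as strings.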

‘Global’ and ‘Local’ AQ Analysis
- AQ data analysis needs to be performed at both global and local levels.
- ‘Global’ refers to regional, national, and global analysis. It establishes the larger-scale context.
- ‘Local’ analysis focuses on the specific and detailed local features.
- Both global and local analyses are needed for full understanding.
- Global-local interaction (information flow) needs to be established for effective management.
Figure label: National and Local AQ Analysis

Integration for Global-Local Activities
Global Activity => Local Benefit
- Global data, tools => Improved local productivity
- Global data analysis => Spatial context; initial analysis
- Analysis guidance => Standardized analysis, reporting
Local Activity => Global Benefit
- Local data, tools => Improved global productivity
- Local data analysis => Elucidate, expand initial analysis
- Identify relevant issues => Responsive, relevant global work
Global and local activities are both needed, e.g. ‘think global, act local’.
‘Global’ and ‘Local’ here refer to relative, not absolute, scale.

Data Re-Use and Synergy
- Data producers maintain their own workspace and resources (data, reports, comments).
- Part of the resources is shared by creating common virtual resources.
- Web-based integration of the resources can be across several dimensions:
  - Spatial scale: local-global data sharing
  - Data content: combination of data generated internally and externally
- The main benefits of sharing are data re-use, data complementing and synergy.
- The goal of the system is to have the benefits of sharing outweigh the costs.
Diagram labels: Content; User; Local; Global; Virtual Shared Resources (Data, Knowledge, Tools, Methods); User; Shared part of resources.

Federated Data Warehouse Features
- Data reside in their respective home environment, where they can mature. ‘Uprooted’ data in separate databases are not easily updated, maintained, or enriched.
- Abstract (universal) query/retrieval facilitates integration and comparison along the key dimensions (space, time, parameter, method).
- The open data query based on Web Services promotes the building of further value chains: data viewers, data integration programs, automatic report generators, etc.
- Data access through the Proxy Server protects the data providers and the data users from security breaches and excessive detail.
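As an illustration of a downstream ‘value chain’ step, the sketch below shows a tiny report generator that works on uniform records only, so it needs no knowledge of where the data came from. The record layout, the function names `mean_by_location` and `render_report`, and the values are assumptions made up for this example.

```python
from collections import defaultdict
from statistics import mean


def mean_by_location(records: list[dict], parameter: str) -> dict:
    """Compare one parameter across locations, whichever provider supplied it."""
    grouped = defaultdict(list)
    for rec in records:
        if rec["parameter"] == parameter:
            grouped[rec["location"]].append(rec["value"])
    return {loc: mean(vals) for loc, vals in sorted(grouped.items())}


def render_report(records: list[dict], parameter: str) -> str:
    """Format the per-location means as a plain-text report."""
    lines = [f"Mean {parameter} by location"]
    for loc, avg in mean_by_location(records, parameter).items():
        lines.append(f"{loc}: {avg:.2f}")
    return "\n".join(lines)


# Uniform records as they might arrive from the proxy (made-up values).
uniform_records = [
    {"parameter": "SO4", "location": "SITE_A", "time": "2001-07-04", "value": 2.0},
    {"parameter": "SO4", "location": "SITE_A", "time": "2001-07-05", "value": 3.0},
    {"parameter": "SO4", "location": "SITE_B", "time": "2001-07-04", "value": 5.5},
    {"parameter": "NO3", "location": "SITE_B", "time": "2001-07-04", "value": 1.0},
]

report = render_report(uniform_records, "SO4")
print(report)
```

Because the grouping key is one of the shared dimensions, the same function could group by time or method instead by changing a single field name.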