PNC, “Collaboration: Tools and Infrastructure,” December 7, 2012. Michael Frenklach. Supported by AFOSR, Fung. PrIMe: Integrated Infrastructures for Data and Analysis.



IMPACT ON SOCIETY
– Energy (power plants, car and jet engines, rockets, …)
– Defense (engines, rockets, …)
– Environment (pollutants, global modeling, …)
– Space exploration
– Astrophysics
– Material synthesis
ESTABLISHED PRACTICE OF COLLABORATION
– Across different disciplines
– Across different countries
THERE IS AN ACCUMULATING EXPERIMENTAL PORTFOLIO
THEORY/MODELING LINKS THE FUNDAMENTAL TO THE APPLIED LEVEL

[Diagram: combustion modeling workflow]
– mechanism of: ignition, laminar flames, NOx, soot, …
– individual reactions; theory; experiments; numerical simulations
– model; model reduction; analysis (sensitivity, reaction path, …)

Methane Combustion: CH4 + 2 O2 → CO2 + 2 H2O
1970’s: 15 reactions, 12 species
1980’s: 75 reactions, 25 species
1990’s: 300+ reactions, 50+ species
Larger molecular-size fuels:
2000’s: 1,000+ reactions, 100+ species
2010’s: 10,000+ reactions, species

Methane Combustion: CH4 + 2 O2 → CO2 + 2 H2O
– The networks are complex, but the governing equations (rate laws) are known.
– Uncertainty exists, but much is known about where the uncertainty lies (rate parameters).
– Numerical simulations with parameters fixed to certain values may be performed “reliably.”
– There is an accumulating experimental portfolio on the system.
…and yet:
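Since the rate laws are known once the parameters are chosen, the simulation step can be sketched in a few lines. This is a minimal, illustrative sketch: a single global step CH4 + 2 O2 → CO2 + 2 H2O with a made-up rate constant, integrated by forward Euler. Real mechanisms, as the previous slide notes, involve thousands of elementary reactions and stiff solvers.

```python
# Toy mass-action kinetics for the global step CH4 + 2 O2 -> CO2 + 2 H2O.
# The rate constant k and units are illustrative, not recommended values.

def integrate_global_step(y0, k, dt, steps):
    """y0 = [CH4, O2, CO2, H2O] concentrations; rate = k*[CH4]*[O2]**2."""
    ch4, o2, co2, h2o = y0
    history = [tuple(y0)]
    for _ in range(steps):
        rate = k * ch4 * o2**2      # mass-action rate law for the global step
        ch4 -= rate * dt            # stoichiometric coefficients: 1, 2, 1, 2
        o2  -= 2 * rate * dt
        co2 += rate * dt
        h2o += 2 * rate * dt
        history.append((ch4, o2, co2, h2o))
    return history

traj = integrate_global_step([1.0, 2.0, 0.0, 0.0], k=0.5, dt=0.01, steps=1000)
final = traj[-1]
# Element conservation is a built-in sanity check: carbon (CH4 + CO2) is constant.
```

The point of the slide survives even in this toy: the equations are mechanical; the predictive difficulty lives in the parameters (here, `k`) and their uncertainties.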

Methane Combustion: CH4 + 2 O2 → CO2 + 2 H2O
– Lack of predictability
– Lack of consensus
…but still:

Current inability of truly predictive modeling:
– conflicting data in/among sources
– poor documentation of data/models
– no uncertainty reporting or analysis
– not much focus on integration of data
Resistance to data sharing:
– no personal incentives
– no easy-to-use technology
No recognition of the problem.

Models are not additive. Data are not additive. We need a system for synthesis of data.

PrIMe: Process Informatics Model
– Data sharing
– App sharing
– Automation

Registered members: ~400
Countries: ~15
Data records: ~100,000
Apps: ~20
Active “players”: UCB (lead), NCSU, Stanford, MIT, Cambridge, KAUST, Tsinghua

DATA ORGANIZATION: conceptual abstraction → practical realization

[Diagram: data model]
A Chemical Kinetics Model is composed of Chemical Reactions, which involve Chemical Species, which are composed of Chemical Elements.
– Elements have atomic masses.
– Species have thermo data and transport data.
– Reactions have rate-law data: parameter values, uncertainties, reference.
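The hierarchy in the diagram can be sketched directly as data structures. The class and field names below are illustrative, not the actual PrIMe schema; they only mirror the relationships the slide shows.

```python
# Sketch of the slide's data hierarchy: elements with atomic masses, species
# with thermo/transport data, reactions with rate-law data (parameter values,
# uncertainties, reference). Names are illustrative, not the PrIMe schema.

from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Element:
    symbol: str
    atomic_mass: float                 # g/mol

@dataclass
class Species:
    name: str
    composition: dict                  # element symbol -> atom count
    thermo: dict = field(default_factory=dict)     # e.g. polynomial coefficients
    transport: dict = field(default_factory=dict)

@dataclass
class Reaction:
    equation: str
    rate_parameters: dict              # e.g. {"A": ..., "n": ..., "Ea": ...}
    uncertainty: Optional[float] = None  # e.g. uncertainty factor on A
    reference: str = ""

@dataclass
class KineticsModel:
    elements: list
    species: list
    reactions: list

model = KineticsModel(
    elements=[Element("C", 12.011), Element("H", 1.008), Element("O", 15.999)],
    species=[Species("CH4", {"C": 1, "H": 4})],
    reactions=[Reaction("CH4 + 2 O2 <=> CO2 + 2 H2O",
                        {"A": 1.0e9, "n": 0.0, "Ea": 30000.0},
                        uncertainty=2.0, reference="illustrative entry")],
)
```

The design point the diagram makes is that uncertainty and provenance (reference) live on the rate-law data itself, not as an afterthought.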

[Diagram: communities contributing data]
– reactions: combustion modeling, quantum chemistry, diagnostics, thermosciences
– thermo: molecular structure, spectra, absorption coefficient

Data Attribute (QOI, “target”): a specific feature extracted for modeling:
– peak value
– peak location
– induction time
– ratio of peaks (from multiple experiments)
– …
Experimental Record:
– reference
– apparatus
– conditions
– observations (inner: XML; remote: HDF5, …)
– uncertainties
– additional items (links, docs, video files, …)
Archival record: VVUQ data, instrumental model.
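Extracting these attributes from a raw observation trace is mechanical once the conventions are fixed. A minimal sketch follows; note that the half-peak definition of induction time used here is one common convention for illustration, and real definitions vary by experiment and diagnostic.

```python
# Sketch: pulling the slide's QOIs (peak value, peak location, induction time)
# out of a raw time series. The half-peak induction-time rule is illustrative.

def extract_qois(times, signal):
    peak_value = max(signal)
    peak_location = times[signal.index(peak_value)]
    half = 0.5 * peak_value
    # first time the signal reaches half its peak (one convention among several)
    induction_time = next(t for t, s in zip(times, signal) if s >= half)
    return {"peak_value": peak_value,
            "peak_location": peak_location,
            "induction_time": induction_time}

# Toy trace: signal ramps up, peaks, then decays.
times = [i * 0.1 for i in range(11)]            # 0.0 .. 1.0
signal = [0, 1, 2, 4, 8, 10, 9, 7, 5, 3, 1]
qois = extract_qois(times, signal)
```

Storing the extracted QOI alongside the full archival record, rather than instead of it, is what lets later VVUQ analyses re-derive features under different conventions.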

Initial model: “Upload your data to the PrIMe Warehouse” (“give me your data”).
New, distributed model: “You may, if you choose, connect your data to the communal system,” with a switch in the OFF position: “you can use the communal data and tools, but your own data is private to you only” … “but please flip the switch to the ON position when you are ready to share your own data.”
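The switch model reduces to a simple access rule: connected data stays owner-private until the owner opts in. A minimal sketch, with illustrative names (this is not the actual PrIMe access-control code):

```python
# Sketch of the "switch" model: a dataset connected to the communal system
# is readable only by its owner until the owner flips sharing ON.

class ConnectedDataset:
    def __init__(self, owner, records):
        self.owner = owner
        self.records = records
        self.shared = False            # switch starts in the OFF position

    def flip_switch(self, on=True):
        self.shared = on               # owner opts in (or back out)

    def read(self, requester):
        """Communal reads succeed only after the owner has opted in."""
        if requester == self.owner or self.shared:
            return self.records
        raise PermissionError("dataset is private; owner has not shared it")

ds = ConnectedDataset("alice", ["shock-tube run #1"])
blocked = False
try:
    ds.read("bob")                     # OFF position: communal read is refused
except PermissionError:
    blocked = True
ds.flip_switch(True)                   # owner is ready to share
visible = ds.read("bob")               # ON position: communal read succeeds
```

The asymmetry is the incentive mechanism of the slide: even with the switch OFF, the owner can use the communal data and tools.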

“Connect your code to the communal system”: you control your own code:
– release version
– user access, licenses
– collect fees, if desired

Remote server app: PrIMe Web Services (PWS)
– no restrictions on platform
– no restrictions on data formats
– no restrictions on local programming language(s)
The PrIMe Workflow Interface (PWI) is the only “standard,” developed, maintained, and controlled by the community.
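One way to read "no restrictions on platform or language, one shared interface" is that services exchange platform-neutral messages that any language can produce and parse. The field names below are purely illustrative, not the real PWI schema:

```python
# Sketch: a platform-neutral task message. Because only the interface is
# standardized, the service itself can be written in any language on any
# platform. Field names here are illustrative, not the actual PWI schema.

import json

task = {
    "service": "simulate",                       # hypothetical remote app name
    "inputs": {
        "model": "dataset-link-placeholder",     # illustrative identifier
        "conditions": {"T": 1500, "P": 1.0},
    },
    "outputs": ["induction_time"],
}

wire = json.dumps(task)        # what would travel between client and service
received = json.loads(wire)    # a Fortran, C++, or MATLAB wrapper parses the same text
```

The design trade-off is standard for federated systems: fixing only the message contract keeps the barrier to connecting legacy codes low.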

[Diagram: client machine and client data connect through PrIMe web services to the PrIMe Data Flow Network, coordinated by the PrIMe Dispatcher]

Excessively large data sets: do not move the data; instead use “smart agents” (e.g., HTML5 walkers): web services with user-uploaded tasks that fetch data features for user-requested analysis.

Workflow project:
– User specifies conditions of interest.
– A workflow component retrieves archived data from the Data Warehouse:
  – a set of relevant targets
  – target values and their uncertainty ranges
  – surrogate models developed for relevant targets
  – active variables and their uncertainty ranges
– A workflow component then performs:
  – retrieves the pertinent kinetics model (via a link in the dataset)
  – performs simulations on the fly for the conditions specified and builds a new surrogate model
  – performs UQ analysis combining the new surrogate model with the archived ones and the rest of the pertinent data
  – reports results
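The surrogate-plus-UQ step above can be sketched in miniature. This is an illustrative stand-in, not the PrIMe implementation: the expensive simulation is replaced by a toy function, the surrogate is a local linear model in the active variables (finite-difference sensitivities about the nominal point), and "UQ" is reduced to worst-case interval propagation over box-shaped variable bounds.

```python
# Sketch: build a linear surrogate of a simulation in the active variables,
# then propagate the variables' uncertainty ranges to the target.

def simulation(x1, x2):
    # Toy stand-in for an on-the-fly kinetics simulation returning one target.
    return 3.0 + 2.0 * x1 - 0.5 * x2

def linear_surrogate(f, nominal, h=1e-6):
    """y ~ y0 + sum_i s_i * (x_i - x0_i), sensitivities by finite differences."""
    y0 = f(*nominal)
    sens = []
    for i, x0 in enumerate(nominal):
        pert = list(nominal)
        pert[i] = x0 + h
        sens.append((f(*pert) - y0) / h)
    return y0, sens

def propagate_interval(y0, sens, nominal, bounds):
    """Worst-case target range over box uncertainties on the active variables."""
    lo = hi = y0
    for s, x0, (a, b) in zip(sens, nominal, bounds):
        terms = [s * (a - x0), s * (b - x0)]
        lo += min(terms)
        hi += max(terms)
    return lo, hi

nominal = (1.0, 2.0)
y0, sens = linear_surrogate(simulation, nominal)
lo, hi = propagate_interval(y0, sens, nominal, [(0.5, 1.5), (1.0, 3.0)])
```

Archiving the surrogate alongside the target, as the slide describes, is what makes the later "enrichment" pass cheap: new data is combined with stored surrogates instead of re-running every simulation.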

Workflow project: Data Warehouse enrichment
– User specifies a new set of data.
– A workflow component retrieves archived data from the Data Warehouse:
  – a set of relevant targets
  – target values and their uncertainty ranges
  – surrogate models developed for relevant targets
  – active variables and their uncertainty ranges
– A workflow component then performs:
  – retrieves the pertinent kinetics model (via a link in the dataset)
  – performs simulations on the fly for the new data and builds a new surrogate model
  – performs UQ analysis combining the new surrogate model with the archived ones and the rest of the pertinent data
  – reports results
  – adds the new data to the dataset and archives it in the Warehouse

What causes/skews model predictiveness?
Are there new experiments to be performed, old ones to be repeated, theoretical studies to be carried out?
What impact could a planned experiment have?
What is the information content of the data?
What would it take to bring a given model to a desired level of accuracy?

From an algorithm-centric view to a data-centric view.
[Diagram: input → code → output; data]