17 May 2006 Rapid Prototyping Capability for Earth-Sun System Sciences Preliminary Design Robert J. Moorhead Mississippi State University.


Approach

Formulate architectures and develop baseline capacities that integrate applied sciences systems tools into configurations to support efficient evaluation of the prospects of integrating research results from NASA’s Earth observation systems (with emphasis on spacecraft instruments on missions recently launched or planned for near-term launch) and associated Earth system models.

- systems engineering tools
- enterprise architecture tools
- information visualization and analysis tools
- uncertainty characterization tools
- performance assessment tools

Source: “NASA Earth Science and Space Systems benefiting Society: Evolving Systems Engineering Capacity,” presentation by Ron Birk, August 24, 2005, SSC

System Scope

- Reduce the amount of time that has typically been required to consider the utility of new or future data streams on model outcomes.
- Systematically evaluate research capabilities in a simulated operational environment in order to evaluate components and/or configurations that could be considered for verification, validation, and benchmarking for transition from research to operations and/or into an integrated system solution (ISS).

Figure 1 illustrates the interface between the RPC and external systems that include the SN and ISS components of NASA’s Earth Science Application Plan.

RPC Interface

System Context

The RPC will integrate and provide access to the tools needed to evaluate the use of a wide variety of current and future NASA sensors and research results, model outputs, and knowledge, collectively referred to as “resources”. The resources are assumed to be geographically distributed, so the RPC will support location transparency for them.

RPC node

- Local and remote computing and storage facilities
- Remote data providers
- Model configuration
- Input data sets configuration
- Experiment design and execution
- Analysis
- System administration and maintenance

System modes and states

Before an experiment (a particular model using a particular data source) can be performed, two conditions must be satisfied:

- First, the model must be installed at some computing facility accessible to RPC users, and configured to run.
- Second, the data must be configured so that it can be used by the model. The data configuration may involve developing tools for the data conversions (format translations, subsetting, deriving values of variables not included in the original data products, geo-processing, etc.).

From the point of view of performing a particular experiment and analysis, the RPC can be in two distinct states:

- ready for the experiment and analysis by end users
- requiring action by specialists to install and configure the model and its data

During the life cycle of the RPC node, new resources and tools will be integrated with it, increasing the repertoire of experiments and analyses that can be performed.
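
The two conditions and two states described above can be sketched as a small readiness check. This is only an illustration under stated assumptions: the class `RpcNode`, its two registries, and the data source name `sensor_A` are hypothetical and do not come from the slides, which do not say how the node tracks installed models or configured data.

```python
from dataclasses import dataclass, field
from enum import Enum


class RpcState(Enum):
    READY = "ready for the experiment and analysis by end users"
    NEEDS_SPECIALIST = "requires specialist action on the model or its data"


@dataclass
class RpcNode:
    # Hypothetical registries (assumption, not part of the original design).
    installed_models: set = field(default_factory=set)
    # (model, data_source) pairs whose data has been configured for that model.
    configured_data: set = field(default_factory=set)

    def state_for(self, model: str, data_source: str) -> RpcState:
        """Return the node state for one experiment (one model, one data source)."""
        model_ok = model in self.installed_models
        data_ok = (model, data_source) in self.configured_data
        if model_ok and data_ok:
            return RpcState.READY
        return RpcState.NEEDS_SPECIALIST


node = RpcNode()
node.installed_models.add("WRF")              # condition 1: model installed
print(node.state_for("WRF", "sensor_A").name)  # data not configured yet
node.configured_data.add(("WRF", "sensor_A"))  # condition 2: data configured
print(node.state_for("WRF", "sensor_A").name)
```

The state is computed per experiment rather than stored globally, which matches the slide's wording that the state is relative to "a particular experiment and analysis".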

Major Categories of Experiments

[Figure: two panels. “Different sources”: one numerical model, several sets of model results, one analysis. “Different models”: numerical model 1 and numerical model 2, their model results, one analysis.]

Capabilities Required

1. Discovery, semantic understanding, secure access, and transport mechanisms for data products available from known data providers (Science Data Manager)
2. Data assimilation and geo-processing tools for all data transformations needed to match a given data product (or products) to the model input requirements, and support for organizing the data processing into workflows built from reusable and interoperable modules, including both the workflow specification mechanisms and the workflow enacting engine (Interoperable Geo-processing Environment)

Capabilities Required (cont.)

3. Model management:
   a. Catalog of available models, model metadata catalog (including input and output model requirements), and mechanisms for integrating new models with the RPC
   b. Mechanisms for creating runtime environments; data staging (in and out); job scheduling, remote execution, and monitoring
   c. Mechanisms for storing model outputs together with metadata and provenance information (all information needed to recreate the output data set); the metadata necessary to enable search and discovery of model outputs
4. Tools for model output analysis (including visualizations), tools for quantitatively comparing model outputs, and tools for model benchmarking (Performance Metrics Workbench)

Major System Constraints

- Only models and data made available to RPC users and integrated with the RPC node can be used to perform experiments.
- Installation and/or integration of models, as well as integration and geo-processing of data, need to be performed by the respective specialist, and the time needed to accomplish each task will depend on the complexity of the particular model and data set(s).
- Running a model may take a long time, depending on the complexity and configuration of the model. The experiments will not necessarily be performed in real time.

User Categories

1. System administrators: responsible for deployment, configuration, and maintenance of the system and its users (for access control purposes)
2. Application specialists: responsible for installation and configuration of models on computational systems accessible to the RPC users, and for integrating these models with the RPC (which includes definition of the input and output data requirements)
3. Data processing specialists: responsible for the development and deployment of the tools for data transformations
4. Domain specialists: responsible for defining, configuring (creating workflows for data processing, setting model parameters, etc.), and executing experiments
5. Domain specialists performing the data analysis
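
The five user categories lend themselves to a role-to-responsibility mapping used for access control. The sketch below is one possible encoding, not the RPC design: the enum member names, the action strings, and the helper `is_allowed` are all hypothetical labels chosen to paraphrase the responsibilities listed above.

```python
from enum import Enum


class Role(Enum):
    SYSTEM_ADMINISTRATOR = 1
    APPLICATION_SPECIALIST = 2
    DATA_PROCESSING_SPECIALIST = 3
    DOMAIN_SPECIALIST = 4
    ANALYST = 5  # "domain specialists performing the data analysis"


# Hypothetical action names paraphrasing each category's responsibilities.
PERMISSIONS = {
    Role.SYSTEM_ADMINISTRATOR: {"deploy_system", "configure_system", "manage_users"},
    Role.APPLICATION_SPECIALIST: {"install_model", "integrate_model"},
    Role.DATA_PROCESSING_SPECIALIST: {"develop_transformation_tool", "deploy_transformation_tool"},
    Role.DOMAIN_SPECIALIST: {"define_experiment", "configure_experiment", "execute_experiment"},
    Role.ANALYST: {"analyze_results"},
}


def is_allowed(roles, action):
    """True if any of the user's roles grants the requested action."""
    return any(action in PERMISSIONS[role] for role in roles)


# One person may hold several roles, e.g. run experiments and analyze them.
user_roles = {Role.DOMAIN_SPECIALIST, Role.ANALYST}
print(is_allowed(user_roles, "execute_experiment"))
print(is_allowed(user_roles, "manage_users"))
```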

Assumptions and Dependencies

- The RPC will depend on data and models provided by third parties.
- Access to remote computational and storage facilities will be controlled according to policies established by the facility owners (stakeholders). It is assumed that these policies will allow RPC users to submit and monitor jobs on these systems, which may require passing through firewalls.
- Access privileges may differ from user to user, depending on organizational membership, nationality, or other factors beyond the control of the RPC system developers.

Operational Scenario Summary

- Design of the experiment: identification of the models and data sets to be used
- Assessment of whether the models and data are currently integrated with the RPC node
- Filing requests with model and data specialists, as needed; the specialists issue a notification when the models and data are available
- Configuration of the experiment (setting the model parameters, configuring the data (e.g., ROI, timeframe, etc.))
- Asynchronous run and monitoring of the model
- Analysis

Physical Issues

- The RPC node will be installed on a dedicated, stand-alone system consisting of standard commercially available computing nodes, data storage, and hosting middleware servers.
- Core RPC modular capabilities (SDM, IGE, MM, PMW) will be executed on separate computing nodes.
- The RPC node will be complemented with remote resources: high performance computing and storage facilities as needed by the models to be used in the experiments.
- The RPC node can be moved from one geographical location to another.
- Access to the remote resources will require standard internet connections.

System Performance Characteristics

The primary goal of the RPC node is to provide the capability to rapidly prototype the assimilation of new or future NASA data products and/or model-derived data streams into model applications that have generated demonstrable scientific results of merit and stakeholder interest. However, there is no established benchmark that specifies quantitatively what “rapid” means. The reference point is current practice, that is, manual configuration of data and models. The expectation is that the RPC approach will considerably speed up the process, particularly for repeated experiments, once the baseline data and models are set up. The initial phase of setting up the baseline data and models may nevertheless prove time-consuming, because it will involve model integration, data acquisition and simulation, and the development of new components for geoprocessing the data.

System Performance Characteristics (cont.)

“Rapid Prototyping” performance benefits will be best realized through the reusability of configured geoprocessing tasks to provide model-ready input data to a model that has been fully integrated into the RPC. It is this “reuse” capability that will enable the rapid evaluation of new data types. By associating existing geoprocessing workflows with new data types, the rapid assimilation of next-generation data into configured models should be readily achievable.

Policy and Regulation

As the RPC develops into a viable simulation system, it is expected that activities requiring RPC resources will be requested and coordinated among those selecting an RPC for evaluation, the RPC team conducting a specific evaluation, and RPC developers who will be required to maintain and evolve the RPC to support requirements for integrating new model applications, data products, and geoprocessing tasks. As the RPC evolves to meet new or changing requirements, configuration management practices, version control, and developmental practices will be followed to ensure that capabilities in development will be isolated from operational RPC capabilities.

Policy and Regulation (cont.)

Simply stated, development activities, testing, and integration of new functionalities into the RPC should be “contained” through the use of segregated physical or virtual systems that may be isolated from the operational instance of the RPC. As new capabilities mature through development processes, configuration “check-in” procedures will be followed to ensure the orderly integration of the new “proven” capabilities. It is likely that such activities will involve proactive participation of an RPC technical working group.

System Interfaces

The RPC node has five categories of users, each requiring a dedicated interface. In addition, the RPC interacts with two classes of external systems: data providers and remote computing and storage facilities. Each interface is described in the remaining slides.

System Administrator Interface

The administrator interface must support the administrator tasks:

- registering and de-registering users and assigning roles
- maintaining the user credentials needed to access remote resources
- monitoring the system status and usage
- backing up and restoring data and software; recovery from faults
- deployment of new software components and services

Model Specialist Interface

The model specialist is responsible for deploying and integrating the models into the RPC environment. The models can be installed locally on RPC node hardware and/or at a remote computing facility. To integrate a model with the RPC, the specialist must “register” the model, that is, generate a metadata record that describes the model in terms of its functionality, the runtime requirements (location of the executable, environment variables, the structure of the working directory, etc.), model parameters, and definition of the input and output datasets. The model specialist interface must thus support the registration of new models and editing of the metadata of existing models. In addition, the model specialist interface must support testing the correctness of the model deployment.
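
A model registration record of the kind described above might look like the following sketch. The field names, the `ModelCatalog` class, and the example values (the path, the environment variable, the parameter and dataset names) are hypothetical; the slides list what the metadata must describe but not how it is stored.

```python
from dataclasses import dataclass, field


@dataclass
class ModelRecord:
    name: str
    functionality: str                                # what the model does
    executable: str                                   # location of the executable
    environment: dict = field(default_factory=dict)   # environment variables
    working_dir_layout: list = field(default_factory=list)
    parameters: dict = field(default_factory=dict)    # model parameters
    inputs: list = field(default_factory=list)        # input dataset definitions
    outputs: list = field(default_factory=list)       # output dataset definitions


class ModelCatalog:
    """In-memory stand-in for a model metadata catalog (assumption)."""

    def __init__(self):
        self._records = {}

    def register(self, record: ModelRecord) -> None:
        # Reject records that omit the items the slide says must be described.
        if not (record.name and record.functionality and record.executable):
            raise ValueError("name, functionality and executable are required")
        if not record.inputs or not record.outputs:
            raise ValueError("input and output datasets must be defined")
        self._records[record.name] = record

    def edit(self, name: str, **changes) -> ModelRecord:
        record = self._records[name]
        for key, value in changes.items():
            if not hasattr(record, key):
                raise AttributeError(f"unknown metadata field: {key}")
            setattr(record, key, value)
        return record

    def get(self, name: str) -> ModelRecord:
        return self._records[name]


catalog = ModelCatalog()
catalog.register(ModelRecord(
    name="WRF",
    functionality="numerical weather prediction",
    executable="/opt/models/wrf/wrf.exe",             # hypothetical path
    environment={"WRF_HOME": "/opt/models/wrf"},      # hypothetical variable
    working_dir_layout=["input/", "output/", "logs/"],
    parameters={"time_step_s": 60},
    inputs=["boundary_conditions"],
    outputs=["forecast_fields"],
))
catalog.edit("WRF", parameters={"time_step_s": 30})
print(catalog.get("WRF").parameters)
```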

Data Specialist Interface

The data specialist identifies the data providers and designs the geo-processing procedure for transforming the original data product to match the model input data requirements. The design of the geo-processing may require the development and deployment of software components to perform specified tasks. The data specialist interface must provide support for:

- searching data products from known data providers
- assessing the structure and syntax of available data products
- assessing the model input data requirements
- discovering and evaluating the geo-processing modules already integrated with the RPC node
- integrating new geo-processing modules within the RPC node
- composing the geo-processing procedure from available components
- testing the correctness of the geo-processing procedure
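
Composing a geo-processing procedure from reusable modules, as listed above, can be illustrated with a small module registry and a pipeline builder. Everything here is a hypothetical sketch: the registry, the module names `subset` and `metres_to_km`, and the dictionary layout of a data product are assumptions made for illustration and are not the Interoperable Geo-processing Environment itself.

```python
from typing import Callable, Dict, List

Module = Callable[[dict], dict]

# Hypothetical registry of geo-processing modules integrated with the node.
REGISTRY: Dict[str, Module] = {}


def register(name: str):
    """Decorator that integrates a new module under the given name."""
    def decorator(fn: Module) -> Module:
        REGISTRY[name] = fn
        return fn
    return decorator


@register("subset")
def subset(product: dict) -> dict:
    # Keep only the values inside the index range given as the region of interest.
    lo, hi = product["roi"]
    return {**product, "values": product["values"][lo:hi]}


@register("metres_to_km")
def metres_to_km(product: dict) -> dict:
    # Unit conversion so the product matches the model's expected input units.
    return {**product, "values": [v / 1000 for v in product["values"]], "units": "km"}


def compose(step_names: List[str]) -> Module:
    """Build one procedure from registered modules; KeyError if one is missing."""
    steps = [REGISTRY[name] for name in step_names]

    def run(product: dict) -> dict:
        for step in steps:
            product = step(product)
        return product

    return run


procedure = compose(["subset", "metres_to_km"])
result = procedure({"values": [1000, 2000, 3000, 4000], "units": "m", "roi": (1, 3)})
print(result["values"], result["units"])  # [2.0, 3.0] km
```

Because a composed procedure is just a list of module names, the same procedure can be re-applied to a new data product, which is the reuse the slides on performance characteristics rely on.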

Domain Specialist Interface

To support the design and execution of experiments, the domain specialist interface must support:

- Discovery of available models and data through the RPC facilities
- Receiving and filling requests for new models and data
- Configuring experiments by
  - Connecting a particular model with particular data
  - Setting the model parameters
  - Configuring datasets (region of interest, timeframe, etc.)
- Submitting models for execution
- Monitoring the model progress
- Controlling the model execution (e.g., aborting it, if needed)
- Verifying that the model completed successfully (e.g., by examining a log file generated by the model, running a test application, etc.)
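
The submit, monitor, abort, and verify operations listed above imply a job life cycle. A minimal sketch of one such life cycle follows; the state names, the allowed transitions, the `ExperimentJob` class, and the log marker string are assumptions, since the slides name the operations but not the states.

```python
from enum import Enum


class JobState(Enum):
    CONFIGURED = "configured"
    SUBMITTED = "submitted"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"
    ABORTED = "aborted"


# Allowed transitions (assumption; not specified in the slides).
TRANSITIONS = {
    JobState.CONFIGURED: {JobState.SUBMITTED},
    JobState.SUBMITTED: {JobState.RUNNING, JobState.ABORTED},
    JobState.RUNNING: {JobState.COMPLETED, JobState.FAILED, JobState.ABORTED},
    JobState.COMPLETED: set(),
    JobState.FAILED: set(),
    JobState.ABORTED: set(),
}


class ExperimentJob:
    def __init__(self, model, dataset, parameters, region_of_interest, timeframe):
        # Configuration: connect a model with data, set parameters, ROI, timeframe.
        self.model = model
        self.dataset = dataset
        self.parameters = parameters
        self.region_of_interest = region_of_interest
        self.timeframe = timeframe
        self.state = JobState.CONFIGURED
        self.history = [JobState.CONFIGURED]  # lets the user monitor progress

    def move_to(self, new_state: JobState) -> None:
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(f"cannot go from {self.state.value} to {new_state.value}")
        self.state = new_state
        self.history.append(new_state)

    def abort(self) -> None:
        self.move_to(JobState.ABORTED)


def completed_successfully(log_text: str, marker: str = "RUN COMPLETE") -> bool:
    """Verify success by looking for a marker in the model log (hypothetical marker)."""
    return marker in log_text


job = ExperimentJob("WRF", "sensor_A", {"time_step_s": 60}, (30.0, -90.0, 35.0, -85.0), ("2006-05-01", "2006-05-17"))
job.move_to(JobState.SUBMITTED)
job.move_to(JobState.RUNNING)
job.abort()
print([state.value for state in job.history])
```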

Analyst Interface

The analyst analyzes the experiment outcome. The analyst interface must:

- Allow queries of the output databases to find the model outputs of interest
- Provide access to model outputs
- Provide access to model provenance (when and under what circumstances the model was run, e.g., which input data sets were used, the values of the model parameters, etc.)
- Provide access to tools (visualizations or otherwise) enabling access to the results of the experiments
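
Querying model outputs by their provenance, as described above, can be sketched with a simple record type and a filter function. The record layout, the function `find_outputs`, and the sample values are hypothetical; the slides state only that provenance covers when the model was run, its input data sets, and its parameter values.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class OutputRecord:
    output_id: str
    model: str
    run_at: datetime                      # when the model was run
    input_datasets: Tuple[str, ...]       # which input data sets were used
    parameters: Tuple[Tuple[str, object], ...]  # (name, value) pairs


def find_outputs(records: List[OutputRecord],
                 model: Optional[str] = None,
                 input_dataset: Optional[str] = None) -> List[OutputRecord]:
    """Return records matching every criterion that is not None."""
    hits = []
    for record in records:
        if model is not None and record.model != model:
            continue
        if input_dataset is not None and input_dataset not in record.input_datasets:
            continue
        hits.append(record)
    return hits


records = [
    OutputRecord("out-1", "WRF", datetime(2006, 5, 1), ("sensor_A",), (("time_step_s", 60),)),
    OutputRecord("out-2", "WRF", datetime(2006, 5, 2), ("sensor_B",), (("time_step_s", 60),)),
    OutputRecord("out-3", "LIS", datetime(2006, 5, 3), ("sensor_A",), (("layers", 4),)),
]
print([r.output_id for r in find_outputs(records, model="WRF")])
print([r.output_id for r in find_outputs(records, input_dataset="sensor_A")])
```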

Data Provider Interface

The RPC must define interfaces that allow acceptance of data streams coming from data providers.

Remote Resources Interface

The RPC must define interfaces for invoking Grid services, such as allocating and monitoring remote resources, accepting notifications about status changes (e.g., a job has completed), and transferring data between the RPC node and remote resources as well as between remote resources. The defined interfaces must support delegation of user credentials to satisfy the access control requirements and policies of the remote resources.
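
Accepting notifications about status changes can be illustrated with a plain callback (observer) registry. This is a generic sketch and makes no claim about the Grid middleware the RPC would actually use: the class `JobNotifier`, the job identifier `job-1`, and the status strings are hypothetical.

```python
from collections import defaultdict


class JobNotifier:
    """Dispatches status-change notifications to callbacks subscribed per job."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, job_id, callback):
        # callback is called as callback(job_id, status)
        self._subscribers[job_id].append(callback)

    def publish(self, job_id, status):
        """Deliver a notification; returns how many callbacks were invoked."""
        callbacks = self._subscribers.get(job_id, [])
        for callback in callbacks:
            callback(job_id, status)
        return len(callbacks)


events = []
notifier = JobNotifier()
notifier.subscribe("job-1", lambda job_id, status: events.append((job_id, status)))

# A remote resource reporting that the job has completed would trigger this call.
notifier.publish("job-1", "completed")
print(events)
```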

The End

Backup slides follow.

The baseline system

This four-tier architecture follows OGSA recommendations.

Functional computational capabilities of the RPC system

[Figure: Evaluations leading to new understanding & ideas for ISS. Diagram labels: MyRPC, LIS, IGE, Authorization, Authentication, Notification, Monitoring, Workflow, Security, ESMF, GCMD, THREDDS, ESML, Ontology, Query, Host environment, GPIR, Execution description, Application description, Grid enabled OGC Services, WorldWinds]

Service oriented architecture for Computational RPC Node [based on NSF LEAD (Droegemeier et al., 2006)]

[Figure. Diagram labels: RPC Portal, MyRPC, GCMD, WRF, HSPF, LIS, RAMS, DAACs, CLASS, Evaluation, ESMF, GEOLEM, OGC Services]

Systems framework for CRPN, consisting of interacting subsystems in the secure and stable RPC computational grid [based on NSF LEAD (Droegemeier et al., 2006)]

[Figure. Diagram labels: CRPN, WRF, ESMF, IGE, GCMD, MyRPC workspace, LIS, WorldWinds]