USC INFORMATION SCIENCES INSTITUTE. CAT: Composition Analysis Tool. Interactive Composition of Computational Pathways. Yolanda Gil, Jihie Kim, Varun Ratnakar.

Presentation transcript:

1 USC INFORMATION SCIENCES INSTITUTE
CAT: Composition Analysis Tool
Interactive Composition of Computational Pathways
Yolanda Gil, Jihie Kim, Varun Ratnakar
Students: Marc Spraragen (USC), Sid Shaw (USC), Dan Wu (U Maryland), Ronggang Yu (UT), Edward Kim (USC)

2 SCEC/IT Architecture for a Community Modeling Environment

3 Interactive Knowledge Acquisition: Summary of Activities
- Accessibility of complex models to end users (DOCKER)
  - Showing appropriate descriptions of models and constraints
  - Handling errors due to complex constraint violations
- Assisting model developers to publish code (DOCKER)
  - Describing code behavior is not sufficient
  - Documenting appropriate use of model formally and informally
- Interactive composition of computational pathways (CAT)
  - User selects and connects models to create a sketch of pathway
  - Automatic error checking and completion support (details in a minute)
- Execution on the Grid environment (Pegasus)
  - Isolate unsophisticated user from complexity of distributed computing environments

4 A Computational Pathway Specification
[Figure: pathway diagram. Task result: hazard curve (SA vs. prob. exc.). Components shown: Hazard Curve Calculator; Field (2000) IMR (SA exc. prob.); Basin-Depth Calculator; UTM Converter (get-Lat-Long-given-UTM); CVM-get-Velocity-at-point; Ruptures PEER-Fault Gaussian Dist No Truncation. Parameters shown: Site VS30, Site Basin-Depth-2.5, SA Period, Gaussian Truncation, Std. Dev. Type, Lat. Long., UTM, Velocity, Total Moment Rate, Duration-Year, Fault-Grid-Spacing, Rupture Offset, Mag-Length-sigma, Dip, Rake, Magnitude (min), Magnitude (max), Magnitude (mean).]

5 Interactive Composition of Computational Pathways
Goal: support users in creating a specification of a pathway
- Automatic tracking of pathway constraints
  - System ensures consistency and completeness of pathway so user does not have to keep track of many computational details
- Provide flexible interaction
  - User can start from initial data, from data products, or steps
  - User can specify abstract descriptions of steps and later specialize them
- Intelligent assistance
  - System should not just point out problems but help user by suggesting fixes

6 Approach
Mixed-initiative system that helps users create, reuse, and combine workflows by exploiting:
- Knowledge-based descriptions of components
  - Ontology of components and component types based on common features and parameter constraints
- Analysis of (partially constructed) pathways based on AI planning techniques
  - Relate steps to goals and initial states, and interpret user actions in terms of incremental plan generation
  - Exploit existing techniques and algorithms
[Kim et al, IUI 04] [Kim & Gil AAAI SSS 04] [Kim & Gil ISWC 03] [Kim et al, submitted] [Spraragen, submitted]

7 Supporting Knowledge Base
- Component ontology: hierarchies of components that describe abstract-level components as well as specific executable components
  - e.g. IMR → Field-2000 → Field-2000-SA-Prob-Exc
- Domain ontology: data types for representing input and output parameters and the constraints associated with them
  - e.g. Field-2000 needs VS30, Basin-Depth, Fault-type, etc.
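
The two ontologies on this slide can be pictured with a small sketch. It uses only the example names from the slide (IMR, Field-2000, Field-2000-SA-Prob-Exc, VS30, Basin-Depth, Fault-type). The dictionaries, the helper functions `ancestors`, `is_a`, and `required_inputs`, and the idea that subtypes inherit the input requirements of their parents are assumptions made for illustration, not CAT's actual representation. The slide's "etc." means the real parameter list is longer than shown.

```python
# Hypothetical component hierarchy (child -> parent), taken from the slide's example.
COMPONENT_PARENT = {
    "Field-2000": "IMR",
    "Field-2000-SA-Prob-Exc": "Field-2000",
}

# Hypothetical parameter requirements; the slide lists these three "etc.".
REQUIRED_INPUTS = {
    "Field-2000": ["VS30", "Basin-Depth", "Fault-type"],
}


def ancestors(name):
    """Return the chain of more abstract components above `name`."""
    chain = []
    while name in COMPONENT_PARENT:
        name = COMPONENT_PARENT[name]
        chain.append(name)
    return chain


def is_a(specific, general):
    """True if `specific` is `general` or one of its specializations."""
    return specific == general or general in ancestors(specific)


def required_inputs(name):
    """Collect inputs declared on `name` and, by assumption, on its ancestors."""
    needed = []
    for component in [name] + ancestors(name):
        for parameter in REQUIRED_INPUTS.get(component, []):
            if parameter not in needed:
                needed.append(parameter)
    return needed


if __name__ == "__main__":
    print(is_a("Field-2000-SA-Prob-Exc", "IMR"))
    print(required_inputs("Field-2000-SA-Prob-Exc"))
```

A hierarchy like this is what lets a user place an abstract step such as IMR in a pathway first and specialize it later.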

8 Domain Ontology and Component Ontology
[Figure: ontology diagram. Data-type labels shown: Hazard-Level; Hazard-Level-with-SA, Hazard-Level-with-PGA, Hazard-Level-with-PGV; Hazard-Level-with-SA-Median, Hazard-Level-with-SA-Std-Dev, Hazard-Level-with-SA-Prob-Exc; Hazard-Level-with-Median, Hazard-Level-with-Std-Dev; F2-Hazard-Level; F2-SA-Median-wrt-VS30; Parameter, IMR-Input-Parameter, Field-2000-Input-Parameter; Fault-Type, Basin-Depth, Distance, ...; IMT probability-function, IMR probability-function. Component labels shown: Compute-Hazard-Level-given-IMR-input-parameters; Compute-Hazard-Level-with-SA-given-IMR-input-parameters, Compute-Hazard-Level-with-PGA-given-IMR-input-parameters, Compute-Hazard-Level-with-PGV-given-IMR-input-parameters; Compute-Hazard-Level-with-SA-Median-given-IMR-input-parameters, Compute-Hazard-Level-with-SA-Std-Dev-given-IMR-input-parameters, Compute-Hazard-Level-with-SA-Prob-Exc-given-IMR-input-parameters; Compute-F2-Hazard-Level-given-Field-2000-input-parameters; Compute-F2-SA-Median-given-Field-2000-input-parameters; F2-operation-SA-Median-Distance-JB, F2-operation-SA-Median-VS30; Compute-F2-SA-Median-wrt-Distance-JB-given-Fault-Type-&-Basin-Depth-&-...; Compute-F2-SA-MEDIAN-wrt-VS30-given-Fault-Type-&-Basin-Depth-&-...]

9 Pathway analysis based on AI planning framework
Formalize pathway features and define desirable properties:
- Purposeful (pathway)
- For each step S in pathway:
  - Satisfied (S)
  - Justified (S)
  - Executable (S)
- For each link L in pathway:
  - Consistent (L)
  - Unique (L)
If ErrorScan does not generate any error messages for a given pathway, the pathway is purposeful, executable, satisfied, justified, unique, and consistent → correct pathway specification.

ErrorScan Algorithm
Input: pathway W
Output: list of errors and corresponding fix suggestions
I. If W is not purposeful, return Error.
   Suggestion: define end result e using types from the KB, AddEndResult(e).
II. For each component c in W:
   a. If c is not Justified, return Error.
      Suggestion: ∀ p that is output-parameter(c), find components cj in the pathway or the KB that have pj as input-parameter(cj) and subsumes(pj, p), AddLink(c, p, cj, pj).
   b. If c is not Executable, return Error.
      Suggestion: ∀ cj ∈ FindDirectSubtypes(c), SpecializeComponent(c, cj).
   c. For each i in input-parameter(c):
      1. If i is not Satisfied, return Error.
         Suggestion: ∀ cj ∈ C with output parameter pj such that subsumes(range(c, i), range(cj, pj)), AddLink(cj, pj, c, i).
         Suggestion: ∀ cj ∈ FindMatchingOutput(i), AddLink(cj, pj, c, i).
         Suggestion: AddAndLinkComponent(W, AddInitialInput(i), range(i), c, i).
III. For each link L in W:
   a. If L is not Consistent, return Error.
      Suggestion: ∀ ci ∈ FindInterPosingComponent(L), InterposeComponent(ci, L).
      Suggestion: RemoveLink(L).
   b. If L is Redundant, return Error.
      Suggestion: RemoveLink(L).
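
The overall shape of the ErrorScan checks can be sketched in Python. This is a minimal illustration under stated assumptions, not the CAT implementation. The slide names the properties but does not define them, so the definitions used here are guesses inferred from the fix suggestions: a component is justified if an output feeds a link or an end result, executable if it is not abstract, an input is satisfied if it is linked or supplied as initial data, a link is consistent if the produced type is subsumed by the expected type, and a link is redundant if another link already feeds the same input. The classes, `TYPE_PARENT`, and `demo_pathway` are hypothetical. Unlike the slide's version, which returns on an error, this sketch collects every error it finds.

```python
from dataclasses import dataclass, field

# Hypothetical toy type hierarchy (child -> parent); not the real SCEC ontology.
TYPE_PARENT = {"SA-Prob-Exc": "Hazard-Level", "VS30": "Site-Parameter"}


def subsumes(general, specific):
    """True if `general` equals `specific` or is one of its ancestors."""
    current = specific
    while current is not None:
        if current == general:
            return True
        current = TYPE_PARENT.get(current)
    return False


@dataclass
class Component:
    name: str
    inputs: dict              # input parameter name -> data type
    outputs: dict             # output parameter name -> data type
    abstract: bool = False    # assumption: abstract components cannot execute


@dataclass(frozen=True)
class Link:
    src: str
    src_param: str
    dst: str
    dst_param: str


@dataclass
class Pathway:
    components: dict = field(default_factory=dict)     # name -> Component
    links: list = field(default_factory=list)
    end_results: set = field(default_factory=set)      # desired data types
    initial_inputs: set = field(default_factory=set)   # (component, parameter)


def error_scan(w):
    """Return (error message, suggested fix) pairs for pathway `w`."""
    errors = []
    # I. Purposeful: the pathway must declare at least one end result.
    if not w.end_results:
        errors.append(("pathway is not purposeful", "AddEndResult"))
    # II. Checks on each component.
    for c in w.components.values():
        feeds_link = any(l.src == c.name for l in w.links)
        feeds_goal = any(subsumes(goal, out_type)
                         for goal in w.end_results
                         for out_type in c.outputs.values())
        if not (feeds_link or feeds_goal):
            errors.append((f"{c.name} is not justified", "AddLink"))
        if c.abstract:
            errors.append((f"{c.name} is not executable", "SpecializeComponent"))
        for param in c.inputs:
            linked = any(l.dst == c.name and l.dst_param == param for l in w.links)
            if not linked and (c.name, param) not in w.initial_inputs:
                errors.append((f"{c.name}.{param} is not satisfied",
                               "AddLink or AddInitialInput"))
    # III. Checks on each link.
    seen_targets = set()
    for l in w.links:
        produced = w.components[l.src].outputs[l.src_param]
        expected = w.components[l.dst].inputs[l.dst_param]
        if not subsumes(expected, produced):
            errors.append((f"link into {l.dst}.{l.dst_param} is not consistent",
                           "InterposeComponent or RemoveLink"))
        if (l.dst, l.dst_param) in seen_targets:
            errors.append((f"link into {l.dst}.{l.dst_param} is redundant",
                           "RemoveLink"))
        seen_targets.add((l.dst, l.dst_param))
    return errors


def demo_pathway():
    """One executable component whose only input is not yet supplied."""
    imr = Component("Field-2000-SA-Prob-Exc", {"vs30": "VS30"}, {"prob": "SA-Prob-Exc"})
    return Pathway(components={imr.name: imr}, end_results={"Hazard-Level"})


if __name__ == "__main__":
    w = demo_pathway()
    print(error_scan(w))
    w.initial_inputs.add(("Field-2000-SA-Prob-Exc", "vs30"))
    print(error_scan(w))
```

Pairing each error with a fix name mirrors the slide's "list of errors and corresponding fix suggestions"; a fuller version would compute concrete fix arguments from the knowledge base.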

10 CAT (Composition Analysis Tool) architecture
[Figure: architecture diagram. Labels shown: CAT User Interface; Interaction Manager; ErrorScan; Pathway Reasoning; Constraint Reasoning; Component Ontology; Domain Ontology; Web Services.]

11 CAT Interface
[Figure: screenshot of a user building a pathway specification from a library of components, with errors and fixes generated by the ErrorScan algorithm.]

12 User Studies
Preliminary study of system's capabilities
- 4 subjects
- 2 pathways with 7 steps and 20 links each
- Users composed complete workflows in a short amount of time (avg. 13 min)
- CAT's error messages were routinely used by the users (4.2/5)
- Mixed-initiative interaction seems useful

13 Evaluating CAT with Synthetic Scenarios: Results with "artificial user mistakes"

Results with 1 mistake
                                                           SHA        TRIP       Total
  Total # of mistakes                                      20         20         40
  # of mistakes detected (# of error msgs for detection)   20 (195)   20 (63)    40 (258)
  # of mistakes with direct fixes (# of direct fixes)      15 (44)    19 (38)    34 (82)

Results with 2 mistakes
                                                           SHA        TRIP       Total
  Total # of mistakes                                      40         40         80
  # of mistakes detected (# of error msgs for detection)   39 (184)   40 (140)   79 (324)
  # of mistakes with direct fixes (# of direct fixes)      35 (92)    37 (56)    72 (148)

Results with 3 mistakes
                                                           SHA        TRIP       Total
  Total # of mistakes                                      [values not preserved in transcript]
  # of mistakes detected (# of error msgs for detection)   59 (335)   60 (185)   119 (520)
  # of mistakes with direct fixes (# of direct fixes)      49 (133)   52 (98)    101 (231)

Each table also had rows for total # of error msgs, total # of fixes, and avg # of fixes per error msg; their values are not preserved in the transcript.

How well can CAT detect mistakes given uncertainty of user actions and interactions between mistakes?
  → CAT virtually always detected artificial user mistakes and made suggestions to user (238/240 cases)
Can CAT propose useful fixes?
  → For 87% of the mistakes detected, CAT's fixes would correct the mistakes directly (direct fixes)
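
The two headline figures can be checked against the totals reported on this slide. The short calculation below uses only the "Total" column for detected mistakes and direct fixes in the three scenarios, plus the 240 cases quoted in the summary. The variable names are illustrative.

```python
# "Total" column values from the slide, keyed by number of injected mistakes.
detected = {1: 40, 2: 79, 3: 119}
direct_fixes = {1: 34, 2: 72, 3: 101}
total_cases = 240  # denominator quoted in the slide's summary

total_detected = sum(detected.values())      # mistakes CAT flagged
total_direct = sum(direct_fixes.values())    # flagged mistakes with a direct fix

detection_rate = total_detected / total_cases
direct_fix_rate = total_direct / total_detected

if __name__ == "__main__":
    print(f"detected {total_detected}/{total_cases} = {detection_rate:.1%}")
    print(f"direct fixes {total_direct}/{total_detected} = {direct_fix_rate:.1%}")
```

The sums give 238 detected mistakes and 207 direct fixes, so the direct-fix rate is 207/238, which rounds to the 87% stated on the slide.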

14 Year 3 work: Automatic Completion of Partial Workflows
Goal: User specifies a sketch of the workflow at a high level, system fills out details
Approach:
- User creates partial workflow with CAT
- Use AI planning techniques to complete partial workflow
  - AutoCAT system to complete workflow spec
  - Pegasus to add grid-related execution details
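
One simple way to picture completing a partial workflow is backward chaining from the desired result: pick a component that produces a missing data type, then recursively supply that component's inputs. The sketch below illustrates that idea only. The slides do not describe AutoCAT's algorithm, so this is a stand-in technique, and the catalog, its wiring, and the function `complete_workflow` are hypothetical (component names are borrowed loosely from the pathway example earlier in the talk).

```python
# Hypothetical component catalog: each entry lists the data types it needs
# and the single data type it makes. The wiring is illustrative only.
CATALOG = {
    "Hazard-Curve-Calculator": {"needs": ["SA-Prob-Exc"], "makes": "Hazard-Curve"},
    "Field-2000-SA-Prob-Exc": {"needs": ["VS30", "Basin-Depth", "Rupture"],
                               "makes": "SA-Prob-Exc"},
    "Basin-Depth-Calculator": {"needs": ["Lat-Long"], "makes": "Basin-Depth"},
}


def complete_workflow(goal, initial_data, catalog):
    """Return an ordered list of steps that produces `goal` from `initial_data`.

    Raises ValueError if some required data type cannot be produced.
    """
    have = set(initial_data)   # data types available so far
    plan = []                  # steps in execution order
    pending = set()            # data types currently being resolved

    def achieve(data_type):
        if data_type in have:
            return
        if data_type in pending:
            raise ValueError(f"cyclic dependency on {data_type}")
        producers = [name for name, spec in catalog.items()
                     if spec["makes"] == data_type]
        if not producers:
            raise ValueError(f"no component produces {data_type}")
        step = producers[0]    # naive choice; a planner would compare options
        pending.add(data_type)
        for need in catalog[step]["needs"]:
            achieve(need)
        pending.discard(data_type)
        plan.append(step)
        have.add(data_type)

    achieve(goal)
    return plan


if __name__ == "__main__":
    print(complete_workflow("Hazard-Curve", {"VS30", "Rupture", "Lat-Long"}, CATALOG))
```

A planner in the sense of the slide would also weigh alternative producers and constraints rather than taking the first match, which is where the AI planning techniques come in.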

15 Pegasus: Workflow Generation for Computational Grids
[Deelman et al 03; Blythe et al 03; Gil et al 04]
Given: desired result and constraints
- A desired result
- A set of application components described in the Grid
- A set of resources in the Grid (dynamic, distributed)
- A set of constraints and preferences on solution quality
Find: an executable job workflow
- A configuration of components that generates the desired result
- A specification of resources where components can be executed and data can be stored
Approach:
- Use AI planning techniques to search the solution space and evaluate tradeoffs
- Exploit heuristics to direct the search for solutions and represent optimality and policy criteria

16 Future Work
- Integration with execution environment (ongoing)
  - Use Pegasus planner and SCEC Grid testbed
- Ontology creation from component descriptions
  - Design integrated approach by extending MCS
- Release CAT to SCEC community
