February 1999 CHAIMS1
Prof. Gio Wiederhold, Dr. Dorothea Beringer, several Ph.D. and master's students
Stanford University
Cost Estimation in CPAM, an Access Protocol for Remote and Autonomous Services
CHAIMS
Objective: Investigate new approaches to large-scale software composition.
Approach: Develop and validate a composition-only language, a protocol for large, distributed, heterogeneous, and autonomous megamodules, and a supporting system.
June 1998 CHAIMS2
Composition of Services...
versus composition of Components
»reusing small components via copy/paste or locally installed shared libraries
»large distributed components within the same “domain” as the composition, e.g. within one bank or airline
versus composition and integration of Data
»data warehouses
»wrapping data available on the web
CPAM/CHAIMS:
»composing processes
»composing services of remote, autonomous, large megamodules
Assumption
Composition of services that are remote, autonomous, heterogeneous, and computation-intensive ==> specific requirements
[Figure: architecture sketch. Labels: domain expert; client workstation; IO modules; megamodules a, b, c, d, e on servers at the providers' sites; client; CPAM and distribution systems; control and data flows.]
Challenge: Autonomy
Megamodules are autonomous:
»responsibility for maintenance lies with the provider
»the client has no direct control over the availability of the services and resources provided
»heterogeneity concerning implementation languages, server platforms, distribution systems, and interface definitions (ontologies ==> SKC project)
»yet the client might be able to choose among several providers
Megamodule A at site Stanford, provided by InfoLab
Megamodule B at site SLAC, provided by Admin
Megamodule C at site NewCom, provided by BestCalc
Challenge: Heavy-weight Services
Services are not free for a client:
»execution time of a service
»transfer time and fee for data
»fees for services
What we would like:
==> monitoring progress of a service
==> possibility to choose cheapest or fastest service
==> exploiting parallelism among services
Parallelism, Invocation Scheduling
Distributed services ==> potential parallel execution
Focus is on parallelism of remote (long) services, not on parallelism of local operations
[Figure: two timelines of invocations i1 to i5 and result extractions e1 to e5 for megamodules a to e; legend: dataflow dependency.]
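To make the benefit of parallel invocation concrete, the following minimal sketch compares the total time of running five services one after another with running them as early as their dataflow dependencies allow. The durations and the dependency graph are hypothetical illustrations; the slide's figure does not give numbers.

```python
def sequential_time(durations):
    """Total time when every service runs only after the previous one finished."""
    return sum(durations.values())


def parallel_time(durations, deps):
    """Total time when each service starts as soon as its inputs are available."""
    finish = {}

    def finish_time(name):
        if name not in finish:
            # A service can start once all services it depends on have finished.
            start = max((finish_time(d) for d in deps.get(name, ())), default=0)
            finish[name] = start + durations[name]
        return finish[name]

    return max(finish_time(name) for name in durations)


# Hypothetical execution times (arbitrary units) for megamodules a to e.
durations = {"a": 4, "b": 3, "c": 9, "d": 5, "e": 2}
# Hypothetical dataflow dependency: e consumes the results of a, b, c, and d.
deps = {"e": ["a", "b", "c", "d"]}

print(sequential_time(durations))      # 23
print(parallel_time(durations, deps))  # 11
```

With these assumed numbers the longest dependency chain (c followed by e) determines the parallel completion time.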
Cost Estimation for Scheduling (1)
Scheduling of invocations:
»defer shorter invocations so that results can be extracted from all services at about the same time
[Figure: timeline of invocations i1 to i5 and extractions e1 to e5 for megamodules a to e, where d is shorter than c (d < c) and c is longer than a and b together (c > a+b).]
==> ESTIMATE (methodname) returns the estimated execution time, fee, and data volume of the results
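The deferral rule above can be sketched as follows. The `Estimate` record mirrors the three values the slide says ESTIMATE returns; its field names, the helper `deferred_starts`, and all numbers are assumptions made for illustration, not part of CPAM.

```python
from typing import NamedTuple


class Estimate(NamedTuple):
    """Hypothetical container for the values returned by ESTIMATE."""
    time: float       # estimated execution time
    fee: float        # estimated fee
    datavolume: int   # estimated volume of the results


def deferred_starts(estimates):
    """Delay each invocation so that all of them finish at about the same time."""
    latest = max(e.time for e in estimates.values())
    # The longest invocation starts immediately; shorter ones are deferred.
    return {name: latest - e.time for name, e in estimates.items()}


# Hypothetical estimates for two independent invocations, with d shorter than c.
estimates = {
    "c": Estimate(time=9.0, fee=2.0, datavolume=500),
    "d": Estimate(time=5.0, fee=1.0, datavolume=100),
}

starts = deferred_starts(estimates)
print(starts)  # {'c': 0.0, 'd': 4.0}
```

Deferring d by four time units means its results are not held at the provider while the client is still waiting for c.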
Cost Estimation for Scheduling (2)
Scheduling of invocations:
»control-flow dependency (conditional block), yet no dataflow dependency
»results of c are needed only under certain conditions, determined by b
[Figure: timeline of invocations i1 to i5 and extractions e1 to e5 for megamodules a to e, with c inside a conditional; legend: dataflow dependency, control-flow dependency.]
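Cost Estimation for Scheduling (3)
==> cost function based on cost information from a, b, and c
»invoking c before the result of b is known: risk of wasting resources and money
»invoking c only after the result of b is known: risk of wasting time
[Figure: two alternative timelines of invocations i1 to i5 and extractions e1 to e5 for megamodules a to e, one labeled "Risk: waste of resources and money", the other "Risk: waste of time".]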
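One possible shape of such a cost function is sketched below. It is only an illustration: the slides do not define the function, so the probability `p_needed` that c's results are required, the factor `time_value` that converts time into money, and both formulas are assumptions.

```python
def eager_cost(t_b, t_c, fee_c, time_value):
    """Invoke c in parallel with b: c's fee is always paid, but no time is lost."""
    return max(t_b, t_c) * time_value + fee_c


def lazy_cost(t_b, t_c, fee_c, p_needed, time_value):
    """Invoke c only if b says it is needed: fee and time of c are paid with probability p_needed."""
    expected_time = t_b + p_needed * t_c
    expected_fee = p_needed * fee_c
    return expected_time * time_value + expected_fee


def invoke_c_early(t_b, t_c, fee_c, p_needed, time_value):
    """Return True if speculative invocation of c has the lower expected cost."""
    return eager_cost(t_b, t_c, fee_c, time_value) < lazy_cost(
        t_b, t_c, fee_c, p_needed, time_value
    )


# Hypothetical estimates: b takes 3 units, c takes 9 units and costs a fee of 10.
print(invoke_c_early(t_b=3, t_c=9, fee_c=10, p_needed=0.5, time_value=2))  # False
print(invoke_c_early(t_b=3, t_c=9, fee_c=10, p_needed=0.9, time_value=2))  # True
```

Under these assumptions, speculative invocation pays off only when c is likely to be needed or when time is expensive relative to the fee.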
Cost Estimation for Selection
Choosing megamodules:
»RouteChoose: Optimum, …
»BestPick: BestRoute, …
CHAIMS client, to RouteChoose:
SETUP ()
SETPARAM ( attributes essential for cost estimation )
(f1=fee, t1=time) = ESTIMATE (“Optimum”)
CHAIMS client, to BestPick:
SETUP ()
SETPARAM ( attributes essential for cost estimation )
(f2=fee, t2=time) = ESTIMATE (“BestRoute”)
calculate cost function ==> today BestPick is better
BestPick.INVOKE (“BestRoute”, …)
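The selection step can be sketched as a small client-side computation. The megamodule and method names come from the slide; the fee and time values, the weighted-sum cost function, and the helper names `cost` and `choose` are hypothetical.

```python
from typing import NamedTuple


class Estimate(NamedTuple):
    """Hypothetical (fee, time) pair as returned by ESTIMATE on the slide."""
    fee: float
    time: float


def cost(estimate, fee_weight=1.0, time_weight=1.0):
    """Assumed cost function: a weighted sum of fee and time."""
    return fee_weight * estimate.fee + time_weight * estimate.time


def choose(estimates, **weights):
    """Return the (megamodule, method) pair with the lowest cost."""
    return min(estimates, key=lambda key: cost(estimates[key], **weights))


# Hypothetical answers to ESTIMATE("Optimum") and ESTIMATE("BestRoute").
estimates = {
    ("RouteChoose", "Optimum"): Estimate(fee=12.0, time=40.0),
    ("BestPick", "BestRoute"): Estimate(fee=15.0, time=25.0),
}

print(choose(estimates))                   # ('BestPick', 'BestRoute')
print(choose(estimates, time_weight=0.1))  # ('RouteChoose', 'Optimum')
```

Because the estimates are obtained at run time, the same client program may pick a different provider on another day or with different weights.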
Why Run-time Cost Estimation?
Static cost information in repository or catalog:
–upper bound? average?
–fluctuation in resources and load of server
–dependence on specific input data
Run-time cost information from megamodule:
–reflects actual load and resources
–takes into account autonomy: no out-of-date cost information in a central repository, no daily updates
–easily fits into CPAM and the concept of having several primitives for remote execution
–yet: requires in the megamodule either statistics of costs over typical loads and input data, gained from previous invocations, or special functionality in the (wrapped) server software
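The first option, statistics gained from previous invocations, could look roughly like the sketch below on the megamodule side. The class `CostStatistics` and its methods are hypothetical, and a plain running mean per method is only one of many possible statistics; it ignores load and input data.

```python
class CostStatistics:
    """Hypothetical per-method running mean of observed execution times."""

    def __init__(self):
        self._count = {}
        self._mean = {}

    def record(self, method, seconds):
        """Update the running mean after an invocation of `method` has finished."""
        n = self._count.get(method, 0) + 1
        mean = self._mean.get(method, 0.0)
        self._count[method] = n
        # Incremental mean update: no need to store all past observations.
        self._mean[method] = mean + (seconds - mean) / n

    def estimate(self, method):
        """Return the mean execution time, or None if the method was never run."""
        return self._mean.get(method)


stats = CostStatistics()
for seconds in (10.0, 20.0, 30.0):
    stats.record("BestRoute", seconds)

print(stats.estimate("BestRoute"))  # 20.0
print(stats.estimate("Optimum"))    # None
```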
Monitoring: Incremental Extraction
CPAM primitive EXTRACT:
»Partial extract: all results are ready, but results are extracted step by step (always available)
»Partial extract: only some of the results are ready and can be extracted before the rest of the results are ready
»Progressive extract: a preliminary version of a result is ready and can be extracted
EXTRACT takes a list of result attributes to be returned and returns the values of these attributes
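The extraction modes above can be imitated with a small in-memory mock. Everything here is hypothetical: the class `MockInvocation`, the `publish` helper that stands in for the server producing results, and the choice to silently omit attributes that are not ready yet.

```python
class MockInvocation:
    """Hypothetical stand-in for one running method invocation."""

    def __init__(self):
        self._results = {}

    def publish(self, name, value):
        """Server side: make a (possibly preliminary) result attribute available."""
        self._results[name] = value

    def extract(self, names):
        """EXTRACT: return the values of the requested attributes that are ready."""
        return {n: self._results[n] for n in names if n in self._results}


inv = MockInvocation()

# Partial extract: only "route" is ready, "price" is still being computed.
inv.publish("route", ["SFO", "ORD", "JFK"])
first = inv.extract(["route", "price"])

# Progressive extract: a refined version of "route" replaces the preliminary one.
inv.publish("route", ["SFO", "JFK"])
inv.publish("price", 420)
second = inv.extract(["route", "price"])

print(first)   # {'route': ['SFO', 'ORD', 'JFK']}
print(second)  # {'route': ['SFO', 'JFK'], 'price': 420}
```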
Monitoring: Progress Information
CPAM primitive EXAMINE:
EXAMINE can ask for the completion status of:
»an invocation: DONE, NOT_DONE, PARTIAL, ERROR
EXAMINE can ask for the progress of:
»a result attribute: returns current accuracy
»an invocation: returns current progress
Progress information is optional and is not provided by every megamodule.
==> partial and progressive extraction
==> rescheduling of invocations, stopping slow invocations
==> getting preliminary results which influence program flow (e.g. invocation parameters)
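A client might use EXAMINE in a polling loop like the one sketched below, giving up on invocations that progress too slowly. The status names come from the slide; the mock class, the `(status, progress)` return shape, the `poll` helper, and the client-side outcomes `TOO_SLOW` and `TIMEOUT` are assumptions.

```python
DONE, NOT_DONE, PARTIAL, ERROR = "DONE", "NOT_DONE", "PARTIAL", "ERROR"


class MockInvocation:
    """Hypothetical invocation that advances by one step per EXAMINE call."""

    def __init__(self, steps):
        self._steps = steps
        self._tick = 0

    def examine(self):
        """EXAMINE: return (status, progress), with progress between 0.0 and 1.0."""
        self._tick += 1
        progress = min(self._tick / self._steps, 1.0)
        if progress >= 1.0:
            return DONE, progress
        if progress >= 0.5:
            return PARTIAL, progress
        return NOT_DONE, progress


def poll(invocation, max_polls, min_progress_per_poll):
    """Poll until the invocation is done, fails, or falls behind the expected pace."""
    for n in range(1, max_polls + 1):
        status, progress = invocation.examine()
        if status in (DONE, ERROR):
            return status
        if progress < n * min_progress_per_poll:
            return "TOO_SLOW"  # client-side decision; candidate for TERMINATE
    return "TIMEOUT"


print(poll(MockInvocation(steps=4), max_polls=10, min_progress_per_poll=0.1))    # DONE
print(poll(MockInvocation(steps=100), max_polls=10, min_progress_per_poll=0.1))  # TOO_SLOW
```

A real client would wait between polls; the mock advances on each call so that the example runs instantly.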
Primitives in CPAM
Pre-invocation:
»SETUP: set up a connection to a megamodule
»SETPARAM, GETPARAM: preset / get attributes in a megamodule
»ESTIMATE: get cost estimation for optimization
Invocation and result gathering:
»INVOKE: start a specific method
»EXAMINE: test status and progress of an invoked method
»EXTRACT: extract results from an invoked method
Termination:
»TERMINATE: terminate a method invocation
»TERMINATEALL: terminate the connection to a megamodule
All primitives are procedure calls ==> asynchrony in method invocation
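The sequence of primitives can be illustrated with an in-process mock. This is a sketch under strong assumptions: the class `MockMegamodule`, all Python signatures, the fixed ESTIMATE answer, and the lazy evaluation inside `extract` are invented for the example and are not the CPAM wire protocol.

```python
import itertools


class MockMegamodule:
    """Hypothetical in-process stand-in for a remote megamodule."""

    def __init__(self, methods):
        self._methods = methods          # method name -> Python callable
        self._params = {}
        self._invocations = {}
        self._ids = itertools.count(1)
        self.connected = False

    def setup(self):
        self.connected = True

    def setparam(self, **attrs):
        self._params.update(attrs)

    def getparam(self, name):
        return self._params.get(name)

    def estimate(self, method):
        # Fixed, made-up answer; a real megamodule would compute this.
        return {"time": 1.0, "fee": 0.0, "datavolume": 0}

    def invoke(self, method, **args):
        # Returns a handle immediately; results are fetched later via extract.
        handle = next(self._ids)
        self._invocations[handle] = (method, {**self._params, **args})
        return handle

    def examine(self, handle):
        return "DONE" if handle in self._invocations else "ERROR"

    def extract(self, handle, names):
        method, args = self._invocations[handle]
        results = self._methods[method](**args)
        return {n: results[n] for n in names}

    def terminate(self, handle):
        self._invocations.pop(handle, None)

    def terminateall(self):
        self._invocations.clear()
        self.connected = False


best_pick = MockMegamodule(
    {"BestRoute": lambda origin, destination: {"route": [origin, destination], "hops": 1}}
)
best_pick.setup()
best_pick.setparam(origin="SFO")
handle = best_pick.invoke("BestRoute", destination="JFK")
status = best_pick.examine(handle)
result = best_pick.extract(handle, ["route"])
best_pick.terminate(handle)
best_pick.terminateall()
print(status, result)  # DONE {'route': ['SFO', 'JFK']}
```

Splitting invocation from result gathering is what lets a client start several methods before extracting from any of them.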
CHAIMS Architecture
[Figure: architecture diagram. Client side: the Composer writes a CLAM program; the CHAIMS Compiler generates the Client Side Run Time. Server side: the Megamodule Provider provides megamodules (a to e), uses Wrapper Templates, and adds information to the CHAIMS Repository. Client and server sides communicate via the CPAM protocol on top of Distribution Systems (CORBA, RMI…).]
CPAM and WfMS
Composing services of autonomous, computation-intensive servers ==> issues like cost estimation are also important for WfMS
Managing workflow across organizations:
»computation-intensive, autonomous services
»run-time cost estimation, progress monitoring