Monitoring Infrastructure for Grid Scientific Workflows
Bartosz Baliś, Marian Bubak
Institute of Computer Science and ACC CYFRONET AGH, Kraków, Poland
WORKS08, Austin, Texas, November 17th, 2008
Outline
Challenges in Monitoring of Grid Scientific Workflows
GEMINI infrastructure
Event model for workflow execution monitoring
On-line workflow monitoring support
Information model for recording workflow executions
Motivation
Monitoring of Grid Scientific Workflows is important in many scenarios:
On-line & off-line performance analysis, dynamic resource reconfiguration, on-line steering, performance optimization, provenance tracking, experiment mining, experiment repetition, ...
Consumers of monitoring data: humans (provenance) and processes
On-line & off-line scenarios
Historic records: provenance, retrospective analysis (enhancement of next executions)
Grid Scientific Workflows
Traditional scientific applications: parallel, homogeneous, tightly coupled
Scientific workflows: distributed, heterogeneous, loosely coupled; legacy applications often in the backends; Grid environment
Challenges for monitoring arise
Challenges
Monitoring infrastructure that conceals workflow heterogeneity
Event subscription and instrumentation requests
Standardized event model for Grid workflow execution
Currently events are tightly coupled to workflow environments
On-line monitoring support
Existing Grid information systems not suitable for fast notification-based discovery
Monitoring information model to record executions
GEMINI: monitoring infrastructure
Standardized, abstract interfaces for subscription and instrumentation
Complex Event Processing: subscription management via continuous querying
Event representation:
XML: self-describing, extensible, but poor performance
Google Protocol Buffers: under investigation
Monitors: query & subscription engine, event caching, services
Sensors: lightweight collectors of events
Mutators: manipulation of monitored entities (e.g. dynamic instrumentation)
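To make the XML representation concrete, here is a minimal Java sketch of the kind of self-describing event payload such a monitoring infrastructure might exchange; the element and attribute names (event, wfId, attr) and the WorkflowEvent class are invented for this illustration and are not the actual GEMINI schema.

import java.time.Instant;
import java.util.Map;

// Hypothetical sketch only: not the actual GEMINI event class or schema.
public class WorkflowEvent {
    // A minimal self-describing event: type, workflow id, timestamp, free-form attributes.
    public static String toXml(String type, String workflowId, Map<String, String> attrs) {
        StringBuilder xml = new StringBuilder();
        xml.append("<event type=\"").append(type)
           .append("\" wfId=\"").append(workflowId)
           .append("\" timestamp=\"").append(Instant.now()).append("\">");
        for (Map.Entry<String, String> a : attrs.entrySet()) {
            xml.append("<attr name=\"").append(a.getKey()).append("\">")
               .append(a.getValue()).append("</attr>");
        }
        return xml.append("</event>").toString();
    }

    public static void main(String[] args) {
        System.out.println(toXml("started.workflow", "Wf_1234", Map.of("initiator", "enactment engine")));
    }
}

The free-form attributes are what keep such a format extensible, which makes XML attractive despite its serialization overhead (hence the interest in Protocol Buffers).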
Outline
Event model for workflow execution monitoring
On-line workflow monitoring support
Information model for recording workflow executions
Workflow execution events
Motivation: capture commonly used monitoring measurements concerning workflow execution
Attempts to standardize monitoring events exist, but are oriented towards resource monitoring:
GGF DAMED 'Top N'
GGF NMWG Network Performance Characteristics
Typically, monitoring systems introduce a single event type for application events
Workflow Execution Events – taxonomy
Extension of GGF DAMED Top N events
Extensible hierarchy; example extensions:
Loop entered – started.codeRegion.loop
MPI app invocation – invoking.application.MPI
MPI calls – started.codeRegion.call.MPISend
Application-specific events
Events for initiators and performers: invoking, invoked; started, finished
Events for various execution levels: workflow, task, code region, data operations
Events for various execution states: failed, suspended, resumed, ...
Events for execution metrics: progress, rate
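One reason a dotted naming scheme is convenient is that subscriptions can match whole subtrees of the hierarchy by prefix. The Java sketch below illustrates that idea under an assumed prefix-subsumption rule; the rule and the helper are illustrative, not part of the published taxonomy.

// Hypothetical sketch of the dotted, extensible event-type hierarchy.
// The prefix-matching rule below is an assumption, not a documented GEMINI semantics.
public class EventTypes {
    public static final String STARTED_LOOP     = "started.codeRegion.loop";
    public static final String STARTED_MPI_SEND = "started.codeRegion.call.MPISend";
    public static final String INVOKING_MPI_APP = "invoking.application.MPI";

    // A subscriber interested in "started.codeRegion" would also match the more
    // specific loop and MPI-call events, if prefix subsumption is used.
    public static boolean isSubtypeOf(String eventType, String ancestor) {
        return eventType.equals(ancestor) || eventType.startsWith(ancestor + ".");
    }

    public static void main(String[] args) {
        System.out.println(isSubtypeOf(STARTED_MPI_SEND, "started.codeRegion")); // true
        System.out.println(isSubtypeOf(INVOKING_MPI_APP, "started.codeRegion")); // false
    }
}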
Outline
Event model for workflow execution monitoring
On-line workflow monitoring support
Information model for recording workflow executions
On-line Monitoring of Grid Workflows
Motivation: reaction to time-varying resource availability and application demands; up-to-date execution status
Typical scenario: 'subscribe to all execution events related to workflow Wf_1234'
Distributed producers, not known a priori
Prerequisite: automatic resource discovery of workflow components
New producers are automatically discovered and transparently receive appropriate active subscription requests
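A hedged sketch of how the typical scenario might look on the consumer side; the Monitor/Subscription interfaces and the query syntax below are invented for illustration and do not reflect the actual GEMINI API.

import java.util.function.Consumer;

// Hypothetical consumer-side sketch; this Monitor/Subscription API is invented
// for illustration and is not the actual GEMINI interface.
public class OnlineMonitoringClient {

    interface Subscription { void cancel(); }

    interface Monitor {
        // Register a continuous query; matching events are pushed to the callback as
        // they arrive, including events from producers discovered only later.
        Subscription subscribe(String continuousQuery, Consumer<String> onEvent);
    }

    public static void main(String[] args) {
        // In-memory stub standing in for a real Monitor endpoint.
        Monitor monitor = (query, onEvent) -> {
            onEvent.accept("started.workflow wfId=Wf_1234");  // demo event
            return () -> { /* nothing to cancel in the stub */ };
        };

        // The typical scenario from the slide: all execution events of workflow Wf_1234.
        Subscription sub = monitor.subscribe(
            "SELECT * FROM events WHERE wfId = 'Wf_1234'",
            event -> System.out.println("execution event: " + event));
        sub.cancel();
    }
}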
Resource discovery in workflow monitoring
Challenge: complex execution life cycle of a Grid workflow
Abstract workflows: mapping of tasks to resources at runtime
Many services involved: enactment engines, resource brokers, schedulers, queue managers, execution managers, ...
No single place to subscribe for notifications about new workflow components
Discovery for monitoring must proceed bottom-up: (1) local discovery, (2) global advertisement, (3) global discovery
Problem: current Grid information services are not suitable
Oriented towards query performance; slow propagation of resource status changes
Example: average delay from event occurrence to notification in the EGEE infrastructure ~ 200 seconds (Berger et al., Analysis of Overhead and Waiting Times in the EGEE Production Grid)
Resource discovery: solution
What kind of resource discovery is required?
Identity-based, not attribute-based
Full-blown information service functionality not needed – just a simple, efficient key-value store
Solution: a DHT infrastructure federated with the monitoring infrastructure to store the shared state of monitoring services
Key = workflow identifier
Value = producer record (Monitoring service URL, etc.)
Multiple values (= producers) can be registered
Efficient key-value stores:
OpenDHT
Amazon Dynamo: efficiency, high availability, scalability; lack of strong data consistency ('eventual consistency')
Avg get/put delay ~ 15/30 ms; 99th percentile ~ 200/300 ms (DeCandia et al., Dynamo: Amazon's Highly Available Key-value Store)
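The put/get contract that the monitoring services need from the DHT is small. The following Java sketch mimics it with an in-memory map purely to illustrate the register/lookup pattern (key = workflow id, list of producer records as values); the class and method names are invented, and a real deployment would sit on OpenDHT or a Dynamo-style store instead.

import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical sketch of the key-value usage pattern described on the slide.
// The in-memory map only illustrates the put/get contract, not the distributed store.
public class ProducerRegistry {
    private final Map<String, List<String>> dht = new ConcurrentHashMap<>();

    // Local discovery + global advertisement: a Monitor that starts observing a
    // workflow component registers itself under the workflow id.
    public void register(String workflowId, String producerRecord) {
        dht.computeIfAbsent(workflowId, k -> new CopyOnWriteArrayList<>()).add(producerRecord);
    }

    // Global discovery: a consumer looks up all producers for a workflow and can
    // then forward its active subscriptions to each of them.
    public List<String> lookup(String workflowId) {
        return dht.getOrDefault(workflowId, List.of());
    }

    public static void main(String[] args) {
        ProducerRegistry registry = new ProducerRegistry();
        registry.register("Wf_1234", "http://monitor-a.example.org/gemini");
        registry.register("Wf_1234", "http://monitor-b.example.org/gemini");
        System.out.println(registry.lookup("Wf_1234"));
    }
}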
Monitoring + DHT (simplified architecture)
DHT-based scenario
Evaluation
Goal: measure performance & scalability; comparison with a centralized approach
Main characteristic measured: delay between the occurrence of a new workflow component and the beginning of data transfer, for different workloads
Two methodologies:
Queuing Network models with multiple classes, analytical solution
Simulation models (CSIM simulation package)
1st methodology: Queuing Networks
Solved analytically
(a) DHT solution QN model
(b) Centralized solution QN model
2nd methodology: discrete-event simulation
CSIM simulation package
Input parameters for models
Workload intensity
Measured in job arrivals per second
Taken from EGEE, a large-scale production infrastructure: 3000 to ... jobs per day
Assumed range: from 0.3 to 10 job arrivals per second
Service demands
Monitors and Coordinator: prototypes built and measured
DHT: from available reports on large-scale deployments (OpenDHT, Amazon Dynamo)
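As a rough illustration of the analytical methodology above, the Java sketch below evaluates a single-class open queueing network over the assumed arrival-rate range using the standard operational formulas U = λD and R = D/(1−U) per queueing center; the per-center service demands in the code are illustrative placeholders, not the measured values from the service demand matrices, and the models actually used are multi-class.

// Simplified, single-class open queueing network sketch (the real models are multi-class);
// the service demands below are illustrative placeholders, not measured values.
public class OpenQnModel {

    // Residence time at a single queueing center: R = D / (1 - U), with U = lambda * D.
    static double residenceTime(double arrivalRate, double serviceDemand) {
        double utilization = arrivalRate * serviceDemand;
        if (utilization >= 1.0) return Double.POSITIVE_INFINITY; // saturated center
        return serviceDemand / (1.0 - utilization);
    }

    public static void main(String[] args) {
        double[] demands = {0.030, 0.015, 0.005}; // e.g. Monitor, DHT put, DHT get (illustrative, seconds)
        for (double lambda = 0.3; lambda <= 10.0; lambda *= 2) { // job arrivals per second
            double total = 0.0;
            for (double d : demands) total += residenceTime(lambda, d);
            System.out.printf("lambda=%.1f/s -> end-to-end delay %.3f s%n", lambda, total);
        }
    }
}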
Service demand matrices
Results (centralized model)
Results (DHT model)
Scalability comparison: centralized vs. DHT
Conclusion: DHT solution scalable as expected, but the centralized solution can still handle relatively large workloads before saturation
Outline
Event model for workflow execution monitoring
On-line workflow monitoring support
Information model for recording workflow executions
Information model for wf execution records
Motivation: need for structured information about past experiments executed as scientific workflows in e-Science environments
Provenance querying
Mining over past experiments
Experiment repetition
Execution optimization based on history
State of the art:
Monitoring information models do exist, but for resource monitoring (GLUE), not execution monitoring
Provenance models are not sufficient
Repositories for performance data are oriented towards event traces or simple performance-oriented information
Experiment Information (ExpInfo) model
Ontologies used to describe the model and represent the records
ExpInfo model
A simplified example with particular domain ontologies
ExpInfo model: set of ontologies
General experiment information: purpose, execution stages, input/output data sets
Provenance information: who, where, why, data dependencies
Performance information: duration of computation stages, scheduling, queueing, performance metrics (possible)
Resource information: physical resources (hosts, containers) used in the computation
Connection with domain ontologies: data sets with the Data ontology, execution stages with the Application ontology
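For readers who prefer code to ontology diagrams, the sketch below flattens the four information areas of one ExpInfo record into a plain Java class; the field names are invented for illustration, since the actual model is expressed as OWL ontologies rather than Java types.

import java.time.Duration;
import java.time.Instant;
import java.util.List;

// Hypothetical flattening of one ExpInfo record; field names are illustrative only.
public class ExperimentRecord {
    // General experiment information
    String purpose;
    List<String> executionStages;
    List<String> inputDataSets;
    List<String> outputDataSets;

    // Provenance information
    String who;
    String where;
    String why;
    List<String> dataDependencies;

    // Performance information
    Instant started;
    Instant finished;
    Duration computationTime;
    Duration queueingTime;

    // Resource information
    List<String> hosts;
    List<String> containers;

    // Links to domain ontologies (data sets -> Data ontology, stages -> Application ontology)
    List<String> dataOntologyRefs;
    List<String> applicationOntologyRefs;
}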
Aggregation of data to information
From monitoring events to ExpInfo records
Standardized process described by aggregation rules and derivation rules
Aggregation rules specify how to instantiate individuals; ontology classes are associated with aggregation rules through object properties
Derivation rules specify how to compute attributes, including object properties (= associations between individuals); attributes are associated with derivation rules via annotations
The Semantic Aggregator collects wf execution events and produces ExpInfo records according to the aggregation and derivation rules
Aggregation rules

ExperimentAggregation:
  eventTypes = started.workflow, finished.workflow
  instantiatedClass = protos/Experiment
  ecidCoherency = 1

ComputationAggregation:
  eventTypes = invoking.wfTask, invoked.wfTask
  instantiatedClass = protos/Computation
  acidCoherency = 2
Derivation rules (expressed as ext-ns:Derivation XML annotations)

The simplest case – an XML element mapped directly to a functional property:
  MonitoringData/experimentStarted/ownerLogin

A more complex case specifies which XML elements are needed and how to compute the attribute via a delegate class:
  delegate: cyfronet.gs.aggregator.delegates.ExpPlugin
  required elements:
    software.execution.started.application/time
    software.execution.finished.application/time
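To show how a delegate-based derivation rule might work end to end, here is a hypothetical Java sketch of the kind of computation a class like cyfronet.gs.aggregator.delegates.ExpPlugin could perform, deriving a duration attribute from the two listed time elements; the derive signature and the choice of derived attribute are assumptions, not the actual plugin interface.

import java.time.Duration;
import java.time.Instant;
import java.util.Map;

// Hypothetical sketch of a derivation-rule delegate; the interface and the derived
// attribute (a duration) are assumptions, not the actual ExpPlugin API.
public class DurationDerivation {

    // The aggregator would pass in the values of the XML elements listed in the rule,
    // keyed by their paths, and expect the derived attribute value back.
    public static Object derive(Map<String, String> elements) {
        Instant started  = Instant.parse(elements.get("software.execution.started.application/time"));
        Instant finished = Instant.parse(elements.get("software.execution.finished.application/time"));
        return Duration.between(started, finished);
    }

    public static void main(String[] args) {
        System.out.println(derive(Map.of(
            "software.execution.started.application/time",  "2008-11-17T10:00:00Z",
            "software.execution.finished.application/time", "2008-11-17T10:12:30Z")));
    }
}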
Applications
Coordinated Traffic Management
Executed within the K-Wf Grid infrastructure for workflows
Workflows with legacy backends
Instrumentation & tracing
Drug Resistance application
Executed within the ViroLab virtual laboratory for infectious diseases (virolab.cyfronet.pl)
Recording executions, provenance querying, visual ontology-based querying based on the ExpInfo model
Conclusion
Several monitoring challenges specific to Grid scientific workflows
Standardized taxonomy for workflow execution events
DHT infrastructure to improve performance of resource discovery and enable on-line monitoring
Information model for recording workflow executions
Future Work
Enhancement of event & information models
Work in progress; requires an extensive review of existing systems to enhance the event taxonomy, event data structures and the information model
Model enhancement & validation
Performance of large-scale deployment
Classification of workflows w.r.t. generated workloads (preliminary study: S. Ostermann, R. Prodan, and T. Fahringer, A Trace-Based Investigation of the Characteristics of Grid Workflows)
Information model for workflow status
Similar to resource status in information systems