TF meeting – July 13, 2006
Support for taking actions in MonALISA
Costin Grigoras
Overview of data flow in MonALISA
[Diagram. Components shown: Job, Site services, Site VOBox (three sites), Central services, Repository, DB, Web pages]
Internal data flow
● Data acquisition (monitoring modules)
● Filters / aggregation
● Actions based on monitoring data
● Consumers (clients, storage...)
Actions
● Actions can be taken when values exceed given thresholds, or based on the existence or non-existence of some monitoring data. Condition evaluation can be periodic or event-based.
● Three states can be signaled: ok, error and flip-flop, each with customizable thresholds.
● This mechanism is implemented at both the service and the repository level, so actions can be taken based on local conditions or on global states.
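The threshold idea above can be sketched as a small state tracker. This is a minimal illustration in Python, not MonALISA code: it assumes a threshold means "number of consecutive identical evaluations required before a state is signaled", which the slides do not spell out. The class name `StateTracker` and its fields are hypothetical, and flip-flop detection is left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StateTracker:
    """Hypothetical tracker: a state is signaled only after N consecutive
    identical rule results (assumed meaning of the thresholds)."""
    success_threshold: int = 5
    error_threshold: int = 5
    state: str = "unknown"               # last signaled state
    _streak_value: Optional[bool] = None  # result of the current streak
    _streak_len: int = 0                  # length of the current streak

    def update(self, ok: bool) -> Optional[str]:
        """Feed one rule evaluation; return the new state if it changed, else None."""
        if ok == self._streak_value:
            self._streak_len += 1
        else:
            self._streak_value = ok
            self._streak_len = 1

        needed = self.success_threshold if ok else self.error_threshold
        candidate = "ok" if ok else "error"
        if self._streak_len >= needed and candidate != self.state:
            self.state = candidate
            return candidate  # the caller would trigger the configured actions here
        return None
```

Each periodic or event-based evaluation would call `update()` once, and actions would fire only when it returns a new state.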
Actions (cont.)
● The following action plugins are currently available:
  - Writing a message to a log file
  - Executing a command
  - Sending an e-mail (using the embedded mail server)
  - Sending an instant message
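One way to picture a plugin mechanism like this is a registry that maps an action type to a handler. The sketch below is an assumption-laden illustration in Python, not the MonALISA plugin API: the type names `"log"` and `"command"`, the `params` keys `file` and `command`, and all function names are hypothetical.

```python
import subprocess
from typing import Callable, Dict

# Hypothetical registry: action type name -> handler(params, message)
ACTIONS: Dict[str, Callable[[dict, str], None]] = {}


def register(name: str):
    """Decorator that registers a handler under the given action type name."""
    def wrap(fn):
        ACTIONS[name] = fn
        return fn
    return wrap


@register("log")
def log_action(params: dict, message: str) -> None:
    # Append the message to the log file named in the (assumed) "file" parameter.
    with open(params["file"], "a", encoding="utf-8") as fh:
        fh.write(message + "\n")


@register("command")
def command_action(params: dict, message: str) -> None:
    # Run the (assumed) "command" parameter with the message as its only argument.
    subprocess.run([params["command"], message], check=False)


def run_action(action_type: str, params: dict, message: str) -> None:
    """Dispatch one action; unknown types are reported as a ValueError."""
    try:
        handler = ACTIONS[action_type]
    except KeyError:
        raise ValueError(f"unknown action type: {action_type!r}") from None
    handler(params, message)
```

New action kinds (e-mail, instant message) would be added by registering further handlers, without touching the dispatcher.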
Actions (cont.)
● When the state changes for one monitored series, any number of actions can be taken.
● Each action set is defined in one simple configuration file.
● Changing, adding or removing configuration files does not require restarting the service or the repository.
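Picking up configuration changes without a restart is commonly done by polling the configuration directory for modification times. The slides do not say how MonALISA does it, so the following Python sketch only illustrates that general technique; `ConfigWatcher` and its method names are hypothetical.

```python
import os


class ConfigWatcher:
    """Hypothetical watcher: detects added, changed and removed files
    in one directory by comparing modification times between polls."""

    def __init__(self, directory: str):
        self.directory = directory
        self._mtimes = {}  # file name -> st_mtime_ns seen at the last poll

    def poll(self):
        """Return (added, changed, removed) sets of file names since the last poll."""
        current = {}
        for name in os.listdir(self.directory):
            path = os.path.join(self.directory, name)
            if os.path.isfile(path):
                current[name] = os.stat(path).st_mtime_ns

        added = set(current) - set(self._mtimes)
        removed = set(self._mtimes) - set(current)
        changed = {
            name for name in current
            if name in self._mtimes and current[name] != self._mtimes[name]
        }
        self._mtimes = current
        return added, changed, removed
```

A service loop would call `poll()` at a fixed interval and load, reload or drop the corresponding action sets.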
Sample configuration file - watching if the site services are alive -
series.count=1
series.0=$QSELECT name FROM abping_aliases;
period=60
rule=$Eif(zero_if_null($Ct#0/MonaLisa/localhost/CPU_usr;)>now()-90000; 1; 0)
threshold.success=5
threshold.error=5
actions.count=1
action.0.report_err=offline
action.0.type=
action.0.subject=Service #0 is #MSG
action.0.body=Service #0 is #MSG
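To make the layout of the sample file concrete, here is a minimal Python sketch that reads such key=value lines into a structure. It is not MonALISA's own loader: the function name `parse_action_config` and the shape of the returned dictionary are assumptions, and the `$Q`/`$E` expressions are kept as opaque strings rather than evaluated.

```python
SAMPLE = """\
series.count=1
series.0=$QSELECT name FROM abping_aliases;
period=60
rule=$Eif(zero_if_null($Ct#0/MonaLisa/localhost/CPU_usr;)>now()-90000; 1; 0)
threshold.success=5
threshold.error=5
actions.count=1
action.0.report_err=offline
action.0.type=
action.0.subject=Service #0 is #MSG
action.0.body=Service #0 is #MSG
"""


def parse_action_config(text: str) -> dict:
    """Parse key=value lines (split on the first '=') into a nested dict."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()

    # series.N entries, as many as series.count announces
    series = [props[f"series.{i}"] for i in range(int(props.get("series.count", 0)))]

    # action.N.* entries grouped per action index
    actions = []
    for i in range(int(props.get("actions.count", 0))):
        prefix = f"action.{i}."
        actions.append(
            {k[len(prefix):]: v for k, v in props.items() if k.startswith(prefix)}
        )

    return {
        "series": series,
        "period": int(props.get("period", 0)),
        "rule": props.get("rule", ""),
        "thresholds": {
            "success": int(props.get("threshold.success", 0)),
            "error": int(props.get("threshold.error", 0)),
        },
        "actions": actions,
    }
```

Calling `parse_action_config(SAMPLE)` yields one series query, one rule string and a single action whose `type` is empty, exactly as it appears in the sample.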
Suggestions?