March Availability Report for EGEE Sites based on Nagios
  James Casey, David Collados
SAM and Nagios in WLCG
  Currently
    Site availability is calculated in SAM
  March availability computations
    Parallel computations based on SAM and Nagios probes
    Equivalent metrics for CE, SRMv2 and sBDII
    Using the same algorithm (GridView)
March - Nagios Based Report
  March availability reports for EGEE sites
    Official SAM report: https://edms.cern.ch/file/963325/1/EGEE_Mar2010.pdf
    Unofficial Nagios report: https://edms.cern.ch/file/963325/1/Unofficial_Nagios_EGEE_Mar2010.pdf
  315 EGEE sites in the report
    Sites whose availability changed > 10%: 40 = 12.7%
    Sites whose availability increased > 10%: 19 = 6.0%
    Sites whose availability decreased > 10%: 21 = 6.7%
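
The counts above can be reproduced from per-site availability figures with a comparison along the following lines. This is only a sketch: the function names, the input dictionaries and the reading of "> 10%" as a difference of more than 10 percentage points are assumptions, not taken from the reports themselves.

```python
def classify_changes(sam, nagios, threshold=10.0):
    """Compare per-site availability (in percent) between two reports.

    `sam` and `nagios` are hypothetical dicts mapping site name to
    availability. Returns the sites whose Nagios availability is more than
    `threshold` points above or below the SAM value (assumed interpretation).
    """
    increased, decreased = [], []
    # Only sites present in both reports can be compared.
    for site in sorted(sam.keys() & nagios.keys()):
        delta = nagios[site] - sam[site]
        if delta > threshold:
            increased.append(site)
        elif delta < -threshold:
            decreased.append(site)
    return increased, decreased


def share(count, total):
    """Percentage of `total` represented by `count`, to one decimal place."""
    return round(100.0 * count / total, 1)


TOTAL_SITES = 315  # number of EGEE sites in the March report
# Percentages quoted on the slide: changed, increased, decreased.
print(share(40, TOTAL_SITES), share(19, TOTAL_SITES), share(21, TOTAL_SITES))
```

The three shares are all taken relative to the 315 sites in the report, and the increased and decreased counts (19 and 21) add up to the 40 sites that changed.
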
March - Nagios Based Report
  Sites whose availability decreased > 10%: 21 = 6.7%
    12 failed due to messaging broker discovery; now OK
    5 failed due to timeouts in job submission or missing libraries on WNs; now 4 OK, 1 still failing
    4 were failing sBDII tests; now 2 OK, 2 still failing
  Differences understood and corrected in Nagios
  Sites can and should check their current Nagios status
    In the GridView Nagios portal: http://gvdev.cern.ch/NAGIOS/same_index.php
    Or in their corresponding ROC Nagios or MyEGEE instances: https://twiki.cern.ch/twiki/bin/view/EGEE/NagiosROCURL#Production_installations_Nagios
    And report any issue through the Nagios Support Unit in GGUS
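
As a quick consistency check, the breakdown of the 21 sites can be tallied as below. The numbers come from the slide; the dictionary layout and key names are illustrative choices only.

```python
# Breakdown of the sites whose availability decreased by more than 10%.
# Counts are those quoted on the slide; labels are illustrative.
causes = {
    "messaging broker discovery": {"sites": 12, "now_ok": 12, "still_failing": 0},
    "job submit timeouts or missing WN libraries": {"sites": 5, "now_ok": 4, "still_failing": 1},
    "sBDII tests": {"sites": 4, "now_ok": 2, "still_failing": 2},
}

# Each row must account for all of its sites.
for name, row in causes.items():
    assert row["now_ok"] + row["still_failing"] == row["sites"], name

total = sum(row["sites"] for row in causes.values())
now_ok = sum(row["now_ok"] for row in causes.values())
still_failing = sum(row["still_failing"] for row in causes.values())

print(total, now_ok, still_failing)  # prints: 21 18 3
```

The three causes account for all 21 sites, of which 18 are now OK and 3 are still failing.
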
April - Availability
  Official availability reports for April will not change
    Still calculated based on SAM probe results
  We will also generate availability reports based on Nagios results
    Continue the validation of Nagios availability by sites and project
    Sites should check their current Nagios status and report any issue