Published by Phyllis Reynolds. Modified over 9 years ago.
1
EU 2nd Year Review – 04-05 Feb. 2003 – WP4 demo – n° 1
WP4 demonstration: Fabric Monitoring and Fault Tolerance
Sylvain Chapeland, Lord Hess
2
Fabric Monitoring and Fault Tolerance in the global picture
[Diagram labels: Workload Management (WP1); Data Management (WP2); Information Service (WP3); Fabric Management (WP4); Storage Element (WP5); Networking (WP7)]
3
Outline
- System architecture
  - Fabric Monitoring
  - Fault Tolerance
- Demonstration
  - Hardware setup
  - Use case
- Summary
- Questions
4
Fabric Monitoring architecture
[Diagram labels: Monitored nodes; Sensor (x3); Monitoring Sensor Agent (MSA); Cache; Measurement Repository (MR); Database; Consumer (x4)]
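The data path named on this slide (sensors, MSA with a cache, repository, consumers) can be sketched in a few lines. This is only an illustration under stated assumptions: the class names, method names, and the tuple layout of a sample are hypothetical and are not taken from the WP4 software, and the cache is modeled as a plain in-memory buffer.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

# Hypothetical sample layout: (node, metric, timestamp, value)
Sample = Tuple[str, str, float, object]


class MeasurementRepository:
    """Hypothetical stand-in for the MR: stores samples and notifies consumers."""

    def __init__(self) -> None:
        self.history: Dict[Tuple[str, str], List[Tuple[float, object]]] = defaultdict(list)
        self.consumers: List[Callable[[Sample], None]] = []

    def subscribe(self, consumer: Callable[[Sample], None]) -> None:
        self.consumers.append(consumer)

    def store(self, sample: Sample) -> None:
        node, metric, timestamp, value = sample
        self.history[(node, metric)].append((timestamp, value))
        # Every subscribed consumer is told about the new metric value.
        for consumer in self.consumers:
            consumer(sample)


class MonitoringSensorAgent:
    """Hypothetical stand-in for the MSA: polls sensors, buffers, then forwards."""

    def __init__(self, node: str, repository: MeasurementRepository) -> None:
        self.node = node
        self.repository = repository
        self.sensors: Dict[str, Callable[[], object]] = {}
        self.cache: List[Sample] = []  # assumption: a simple local buffer

    def register(self, metric: str, sensor: Callable[[], object]) -> None:
        self.sensors[metric] = sensor

    def poll(self, now: float) -> None:
        # Read every sensor once and keep the result in the local cache.
        for metric, sensor in self.sensors.items():
            self.cache.append((self.node, metric, now, sensor()))

    def flush(self) -> int:
        # Transport all cached samples to the repository, oldest first.
        sent = 0
        while self.cache:
            self.repository.store(self.cache.pop(0))
            sent += 1
        return sent


# Usage: one node, one sensor, one consumer that just records what it sees.
mr = MeasurementRepository()
seen: List[Sample] = []
mr.subscribe(seen.append)
msa = MonitoringSensorAgent("node01", mr)
msa.register("load", lambda: 0.5)
msa.poll(now=1.0)
sent = msa.flush()
```

Keeping the buffer on the agent side means a sensor reading is not lost if it cannot be forwarded immediately; whether the real MSA cache plays exactly this role is not stated on the slide.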
5
Fault Tolerance architecture
[Diagram labels: Local Node; Sensor (x2); MSA; Cache; monitoring; Fault Tolerance daemon (FTd); Decision unit; Rules; Actuator agent; Actuator]
6
Demonstration setup
[Diagram labels: Laptop; Beamer 1; Beamer 2; Slides; Monitoring data; Shells; Monitored node; Server node; MSA; FTd; MR; FT Rule editor]
7
Demonstration
- Use case based on daemon monitoring
- Fabric Monitoring
  - Check a daemon status with the monitoring system while killing and restarting it
- Fault Tolerance
  - Edit a rule to restart the daemon automatically
  - Kill the daemon while following its status in monitoring
8
MSA monitors a daemon status. Information is propagated to repository and consumers.
Status display: consumer application connected to repository.
[Diagram labels: Monitored node; Server node; daemon; MSA; MR; Check ok; Transport; Store; Notify; Daemon status: Daemon ok]
9
When daemon killed, MSA updates the daemon status in the repository. Consumers are notified of the new metric value.
Status display: consumer application connected to repository.
[Diagram labels: Monitored node; Server node; daemon; MSA; MR; Shell: Kill; Check not ok; Transport; Store; Notify; Daemon status: Daemon dead]
10
A manual operation is required to get back to normal status.
Status display: consumer application connected to repository.
[Diagram labels: Monitored node; Server node; MSA; MR; Shell: Relaunch daemon; Check ok; Transport; Store; Notify; Daemon status: Daemon ok]
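The check, kill, and manual relaunch sequence of the last three slides can be reproduced with a small status sensor. This is a sketch under assumptions, not the sensor used in the demo: it is POSIX-only, it uses a short-lived Python child process as a stand-in for the daemon, and the function name `daemon_status` is hypothetical. The status strings mirror the "ok" / "not ok" labels on the slides.

```python
import os
import subprocess
import sys


def daemon_status(pid: int) -> str:
    """Report "ok" if a process with this PID exists, else "not ok" (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 delivers nothing; it only checks the PID
    except ProcessLookupError:
        return "not ok"
    except PermissionError:
        return "ok"  # the process exists but belongs to another user
    return "ok"


def launch() -> subprocess.Popen:
    # Stand-in for the monitored daemon: a child process that just sleeps.
    return subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])


daemon = launch()
status_running = daemon_status(daemon.pid)

daemon.kill()   # the "Kill" step typed in the shell
daemon.wait()   # reap the child so its PID is really gone
status_killed = daemon_status(daemon.pid)

daemon = launch()  # the manual "Relaunch daemon" step
status_relaunched = daemon_status(daemon.pid)

daemon.kill()   # clean up the stand-in process
daemon.wait()
```

A PID check is the simplest possible probe; a production sensor would more likely verify that the daemon also answers requests.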
11
A rule is added to automatically restart the daemon when dead.
Rule editor accessed by web browser.
[Diagram labels: Monitored node; Server node; Web browser; Rule editor; FTd; HTTP; rule; Transport]
12
When daemon killed, FTd is notified and triggers recovery action as specified in rule.
Status display: consumer application connected to repository.
[Diagram labels: Monitored node; Server node; daemon; MSA; MR; FTd; rule; Shell: Kill; Check; Transport; Store; Notify; Relaunch; Daemon status: Daemon ok, Daemon dead, Daemon ok]
13
Recovery actions are also fed back to the monitoring.
Log viewer: consumer application connected to repository.
[Diagram labels: Monitored node; Server node; daemon; MSA; MR; FTd; Transport; Store; Notify; Log; Daemon ok; Daemon dead; Daemon restarted]
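The rule-driven recovery shown on the last two slides (FTd is notified, runs the action named in the rule, and reports the action back to monitoring) can be sketched as follows. Everything here is an assumption made for illustration: the `Rule` fields, the `FaultToleranceDaemon` class, and the log message format are hypothetical and do not reproduce the FTd rule syntax or API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Rule:
    """Hypothetical rule: when `metric` takes `trigger_value`, run `actuator`."""
    metric: str
    trigger_value: str
    actuator: str


class FaultToleranceDaemon:
    """Hypothetical decision unit: matches notifications against rules."""

    def __init__(self, actuators: Dict[str, Callable[[], None]],
                 report: Callable[[str], None]) -> None:
        self.actuators = actuators
        self.report = report  # feeds recovery actions back to monitoring
        self.rules: List[Rule] = []

    def add_rule(self, rule: Rule) -> None:
        self.rules.append(rule)

    def notify(self, metric: str, value: str) -> None:
        for rule in self.rules:
            if rule.metric == metric and rule.trigger_value == value:
                self.actuators[rule.actuator]()
                self.report(f"{rule.actuator}: recovery action run for {metric}={value}")


# Usage: a fake daemon state, one actuator, one rule.
state = {"daemon": "dead"}
log: List[str] = []


def relaunch() -> None:
    state["daemon"] = "ok"  # stand-in for actually restarting the daemon


ftd = FaultToleranceDaemon({"relaunch": relaunch}, log.append)
ftd.add_rule(Rule(metric="daemon_status", trigger_value="dead", actuator="relaunch"))
ftd.notify("daemon_status", "dead")
```

Passing the reporting function in from outside keeps the decision logic separate from the transport, so the same log entries could be sent to a repository and displayed by a log viewer.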
14
Metric history is available in the measurement repository.
History on web browser.
[Diagram labels: Monitored node; Server node; MSA; MR; Web browser; HTTP]
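A history view of this kind amounts to a time-window query over stored samples. The sketch below assumes samples are kept as `(timestamp, value)` pairs sorted by timestamp; the function name and the sample values are hypothetical and say nothing about how the real repository stores or serves its data.

```python
from bisect import bisect_left, bisect_right
from typing import List, Tuple

Sample = Tuple[float, str]  # hypothetical layout: (timestamp, value)


def history_between(samples: List[Sample], start: float, end: float) -> List[Sample]:
    """Return the samples with start <= timestamp <= end.

    Assumes `samples` is sorted by timestamp.
    """
    timestamps = [timestamp for timestamp, _ in samples]
    first = bisect_left(timestamps, start)
    last = bisect_right(timestamps, end)
    return samples[first:last]


# Made-up daemon status history, for illustration only.
samples = [(10.0, "ok"), (20.0, "dead"), (25.0, "ok"), (40.0, "ok")]
recent = history_between(samples, 15.0, 30.0)
```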
15
Summary
- Monitoring system to get live status of a node
  - Centralization of data
  - Measures available remotely
- Fault Tolerance as monitoring data consumer
  - Rule edition of recovery actions
  - Automatic actions taken according to monitoring status
- Deployment status
  - Monitoring agent runs in production mode on ~1000 nodes in CERN computer center
  - Will be available in next EDG release