
1 The network monitoring in grid context: Operations Perspective
Emir Imamagic / SRCE
EGEE’09, Barcelona, Spain
EGEE-III INFSO-RI-222667, Enabling Grids for E-sciencE, www.eu-egee.org
EGEE and gLite are registered trademarks

2 Overview
Monitoring In Operations
Service Availability Monitoring
  – Architecture
  – Network Monitoring
Performance Monitoring
Possible Future Work
Conclusion

3 Monitoring In Operations
Provide site and grid operators with the means to monitor their resources
Focus on improving availability and reliability by spotting problems and issuing alarms
Define procedures for escalation and resolution of more complex problems

4 Service Availability Monitoring
(Schema provided by Karolis Eigelis)

5 The New Architecture
(Schema provided by Karolis Eigelis)

6 The New Architecture

7 Which Other Systems Are Used?
Database components
  – Aggregated Topology Provider (ATP)
  – Metric Description Database (MDDB)
Operations services
  – GOCDB, ENOC, OIM
Grid information services
  – BDII

8 What Do We Check?
SAM probes
  – various grid services (CE, WN and SRM)
WLCG probes (SRCE, CERN)
  – various grid services (e.g. GridFTP, LFC)
BDII & Gstat probes
  – validation of content in the BDII information system
Nagios native probes
  – standard services (e.g. web, ftp, ssh servers)

9 Network Monitoring
Collaboration with ENOC
  – integration of ENOC Downcollector features into SAM
Added lightweight service checks
  – based on nmap
  – executed with high frequency
  – used for masking other alarms
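
The idea of a lightweight check that masks other alarms can be sketched as follows. This is only an illustration: the slides say the real checks are based on nmap, whereas this sketch substitutes a plain TCP connect from the Python standard library. The function names and the `lightweight_port_check` label are hypothetical. The numeric status codes follow the standard Nagios plugin convention.

```python
import socket

# Nagios plugin status codes (standard plugin convention).
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Stand-in for the nmap-based check mentioned on the slide.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def mask_alarms(port_reachable, probe_results):
    """Return the alarms to raise, given the lightweight check outcome.

    probe_results maps a probe name to its Nagios status code.
    If the service port is unreachable, the failures of the heavier
    probes are treated as consequences and hidden behind a single
    CRITICAL alarm for the lightweight check (hypothetical label).
    """
    if not port_reachable:
        return {"lightweight_port_check": CRITICAL}
    # Port is reachable: report every probe that is not OK.
    return {name: status for name, status in probe_results.items()
            if status != OK}
```

A caller would run `tcp_port_open` frequently and pass its result, together with the latest results of the slower probes, to `mask_alarms`, so that a closed port produces one alarm instead of one per dependent probe.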

10 Network Monitoring
Integrated network topology data
  – ENOC provided a static list of border routers for all sites
  – Nagios supports network hierarchy
  – in case of router failure, site resources are flagged as unreachable
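
The effect of a network hierarchy on host states can be sketched as below. This is a simplified model, not the Nagios implementation: it assumes each host has at most one parent and that the hierarchy is a tree. The function name and the host names in the usage note are hypothetical. A host that does not respond is reported as DOWN when its parent is up and as UNREACHABLE when its parent is not.

```python
def classify_hosts(parents, responding):
    """Classify each host as "UP", "DOWN" or "UNREACHABLE".

    parents:    dict mapping host -> parent host (None for top-level hosts)
    responding: set of hosts that answered their last check
    Assumes a tree-shaped hierarchy with a single parent per host.
    """
    states = {}

    def state_of(host):
        if host in states:
            return states[host]
        if host in responding:
            states[host] = "UP"
        else:
            parent = parents.get(host)
            # A silent host behind a non-UP parent cannot be judged itself.
            if parent is not None and state_of(parent) != "UP":
                states[host] = "UNREACHABLE"
            else:
                states[host] = "DOWN"
        return states[host]

    for host in parents:
        state_of(host)
    return states
```

For example, with `parents = {"router.site-a": None, "ce.site-a": "router.site-a", "srm.site-a": "router.site-a"}` and an empty `responding` set, only the router is DOWN and both site services are UNREACHABLE, so a single router alarm stands in for the whole site.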

11 Performance Monitoring - Grid
Several grid systems gather performance data
  – BDII, GridFTP transfers
  – Dashboards and VO-specific systems
Some raise alarms based on performance data

12 Performance Monitoring - Network
Majority of sites are without dedicated links
  – without SLAs what should we alarm on?
Severe degradation of network performance
  – e.g. failure of primary link
  – interpreted as service unavailability

13 Possible Future Work – Availability Monitoring
Improved lightweight checks?
Dynamic network topology info?
Better integration with network monitoring systems?
End-to-end monitoring between sites?

14 Possible Future Work – Performance Monitoring
Dynamic performance testing
  – to distinguish between failure and severe degradation
  – interesting for grid services (job & file transfer management)
With dedicated links
  – monitoring network parameters
  – raising alarms in case of degradation
Monitoring dynamic link reservation
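
One way to distinguish failure from severe degradation is to compare recent throughput measurements against a baseline. The sketch below is a minimal illustration under stated assumptions: the function name, the use of the median, and the 20% threshold are all hypothetical choices, not something the slides specify.

```python
from statistics import median


def classify_link(samples_mbps, baseline_mbps, degraded_fraction=0.2):
    """Classify a link from recent throughput tests.

    samples_mbps:      measured throughputs in Mbit/s; None marks a test
                       that could not complete at all
    baseline_mbps:     throughput expected under normal conditions
    degraded_fraction: assumed threshold; a median below this share of
                       the baseline counts as severe degradation
    """
    successful = [s for s in samples_mbps if s is not None]
    if not successful:
        # No test completed: treat as a failure, not as degradation.
        return "FAILURE"
    if median(successful) < degraded_fraction * baseline_mbps:
        return "SEVERE_DEGRADATION"
    return "OK"
```

With a baseline of 1000 Mbit/s, samples of `[50, 80, None]` would be classified as severe degradation, while `[None, None]` would be classified as a failure. On links without an SLA the baseline itself is an assumption, which is the difficulty the earlier slide on network performance raises.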

15 Conclusion
Multilevel monitoring provides the means for administrators to better monitor their services
Integration with existing components to automate operations of monitoring instances
Network monitoring is mainly focused on end-to-end links

16 Links
OAT web page
  – https://twiki.cern.ch/twiki/bin/view/EGEE/OAT_EGEE_III
OAT Multi-level monitoring architecture
  – https://twiki.cern.ch/twiki/bin/view/EGEE/MultiLevelMonitoringOverview

17 Thank You! Questions?

