Trigger Supervisor Monitoring & Alarms Workshop, 2008 Christos Lazaridis Marc Magrans de Abril Ildefons Magrans de Abril.


1 Trigger Supervisor Monitoring & Alarms Workshop, 2008 Christos Lazaridis Marc Magrans de Abril Ildefons Magrans de Abril

2 TS Monitoring & Alarms Workshop, XXXXX, 2008
Outline
  – Logs
  – Alarms
  – Monitoring
  – Severity levels
  – If an error occurs
  – LogCollector & Chainsaw
  – Summary
  – Workshop agenda

3 Logs architecture
[Architecture diagram. Extracted labels: LAS general; Dashboard; pulser; WSEventing; Collector; Subsystem group; Cell Worker SimpleXdaq Cell Supervisor; TStore; CMS_OMDS_LB; xmas::store; exception; Log Collector; log; Chainsaw; Cmsrc-trigger; /tmp]

4 Logs
Logging should be treated as 'cout' statements
  – Not shifter-oriented
  – Include information for developing/debugging
Five logging levels/macros to choose from:
  – DEBUG
  – INFO
  – WARN
  – ERROR
  – FATAL
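As an illustration of how the five levels order messages, here is a minimal, self-contained sketch of threshold filtering. It does not use the workshop's logging library: `LogLevel` and `SketchLogger` are hypothetical names introduced only for this example, and the ordering DEBUG < INFO < WARN < ERROR < FATAL is the conventional one, assumed here.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the five levels on the slide, lowest severity first.
enum class LogLevel { Debug, Info, Warn, Error, Fatal };

// Hypothetical minimal logger: records only messages at or above a threshold.
class SketchLogger {
public:
    explicit SketchLogger(LogLevel threshold) : threshold_(threshold) {}

    // Returns true if the message passed the threshold and was recorded.
    bool log(LogLevel level, const std::string& message) {
        if (level < threshold_) return false;
        records_.push_back(std::string(name(level)) + ": " + message);
        return true;
    }

    const std::vector<std::string>& records() const { return records_; }

    // Maps a level to the name used on the slide.
    static const char* name(LogLevel level) {
        switch (level) {
            case LogLevel::Debug: return "DEBUG";
            case LogLevel::Info:  return "INFO";
            case LogLevel::Warn:  return "WARN";
            case LogLevel::Error: return "ERROR";
            case LogLevel::Fatal: return "FATAL";
        }
        return "UNKNOWN";
    }

private:
    LogLevel threshold_;
    std::vector<std::string> records_;
};
```

With a threshold of `LogLevel::Warn`, a DEBUG message is dropped while an ERROR message is kept, which is the usual way developer-oriented detail is separated from more serious reports.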

5 Alarms architecture
[Architecture diagram. Extracted labels: pulser; WSEventing; Subsystem group; Cell Worker SimpleXdaq Cell Supervisor; TStore; CMS_OMDS_LB; xmas::store; Log Collector; log; Chainsaw; Cmsrc-trigger; /tmp; LAS general; Dashboard; Collector; exception]

6 Alarms
Alarm messages are propagated to the subsystem sentinel dashboard
  – Common L1Trigger setup needed for central display
Alarms exist to inform the trigger shifter
  – Clear information
  – Alarm cause/possible actions
Can be raised:
  – During configuration
  – From monitorable items (DataSource / Periodic DataSource)
[Slide callout: Planned]
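Raising an alarm from a monitorable item can be pictured as a threshold check that produces a severity string and a shifter-oriented message. The sketch below is only an illustration under stated assumptions: `AlarmSketch`, `checkLinkErrors`, the link-error counter, and the thresholds are all hypothetical and are not part of the Trigger Supervisor API.

```cpp
#include <cassert>
#include <string>

// Hypothetical alarm record: a severity string plus a message for the shifter.
struct AlarmSketch {
    bool raised;
    std::string severity;
    std::string message;
};

// Hypothetical check of one monitorable item (an error counter) against
// a warning threshold and an error threshold.
AlarmSketch checkLinkErrors(unsigned errorCount, unsigned warnAt, unsigned errorAt) {
    assert(warnAt <= errorAt);
    const std::string cause = "Link errors: " + std::to_string(errorCount) + ". ";
    if (errorCount >= errorAt) {
        // State the cause and a possible action (placeholder text).
        return AlarmSketch{true, "error", cause + "Possible action: (placeholder)."};
    }
    if (errorCount >= warnAt) {
        return AlarmSketch{true, "warning", cause + "Possible action: (placeholder)."};
    }
    return AlarmSketch{false, "", ""};
}
```

Keeping the cause and the possible action in the message follows the slide's point that alarms are written for the shifter, not for the developer.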

7 Monitoring architecture
[Architecture diagram. Extracted labels: pulser; WSEventing; Subsystem group; Cell Worker SimpleXdaq Cell Supervisor; TStore; CMS_OMDS_LB; xmas::store; Cmsrc-trigger; /tmp; Log Collector; log; Chainsaw; LAS general; Dashboard; Collector; exception]

8 Monitoring
Retrieve and publish the status of trigger subsystems
  – Typed metrics
  – System retrofitting
DataSource (uses xdaq pulser)
  – Pull mode: automatic refresh method; data always sent to flashlist & DB
  – Push mode: no autorefresh!
Periodic DataSource (push mode; uses own timer)
  – Reduce rate to DB
  – Periodic hardware checks
  – Children cells status

Example (push mode):
    // The template argument of dynamic_cast was lost in extraction; restored from the declared type.
    CellContext* c = dynamic_cast<CellContext*>(getContext());
    MonitorSource* monSource = c->getDataSource();
    const std::string item = "itemNonAutoRefresh";
    // Set the new value, then publish it explicitly (no autorefresh in push mode).
    monSource->put(item, xdata::String("a value"));
    monSource->push(item);
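The rate reduction attributed to the Periodic DataSource can be sketched as a source that accepts values at any time but publishes only on every N-th timer tick. This is a simplified model, not the framework's implementation: `PeriodicSourceSketch`, `tick()`, and the in-memory `published()` list are hypothetical names standing in for the real timer and database path.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of a push-mode source that owns its publication schedule.
class PeriodicSourceSketch {
public:
    // period: number of timer ticks between two publications (must be > 0).
    explicit PeriodicSourceSketch(unsigned period) : period_(period), ticks_(0) {
        assert(period > 0);
    }

    // Store the latest value; nothing is published yet.
    void put(const std::string& value) { latest_ = value; }

    // Called on every timer tick. Publishes the latest value only on every
    // period-th tick and returns true when it did.
    bool tick() {
        ++ticks_;
        if (ticks_ % period_ != 0) return false;
        published_.push_back(latest_);
        return true;
    }

    // Values that reached the (simulated) database.
    const std::vector<std::string>& published() const { return published_; }

private:
    unsigned period_;
    unsigned ticks_;
    std::string latest_;
    std::vector<std::string> published_;
};
```

With a period of 3, three successive `put` calls followed by `tick` result in a single publication holding the last value, so intermediate updates never reach the database.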

9 Severity Levels
Various logging/alarm levels
The message severity sent to the alarms dashboard can be defined (through an arbitrary string)

10 If an error occurs...
Log the error with appropriate severity
Report to the sentinel dashboard
If it happens in a CellCommand/CellOperation descendant:
  – Can and should be handled there
  – Return a reply with a warning message of the same level
Anywhere else:
  – Throw an exception

Example, inside a CellCommand/CellOperation descendant:
    // Report to the sentinel dashboard
    XCEPT_DECLARE(tsexception::MonitoringError, e, "Monitorable in WARN");
    getContext()->getCell()->notifyQualified("warning", e);
    // Log the error
    LOG4CPLUS_WARN(getLogger(), "Monitorable in WARN");
    // Return a reply with a warning of the same level
    getWarning().setMessage("Monitorable in WARN");
    getWarning().setLevel(tsframework::CellWarning::WARNING);

Example, anywhere else:
    // Throw an exception
    XCEPT_RAISE(tsexception::CellException, "Monitorable in WARN");

In these examples "Monitorable in WARN" is the exception string.
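The split between "handle it in the command" and "throw anywhere else" can be sketched without the framework. In the example below, `readMonitorable` plays the role of code outside a command and throws, while `runCommand` plays the role of a CellCommand-like entry point that catches the exception and turns it into a reply carrying a warning. All names (`Reply`, `WarningLevel`, `readMonitorable`, `runCommand`) are hypothetical, and `std::runtime_error` stands in for the framework's exception types.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical warning levels attached to a command reply.
enum class WarningLevel { None, Warning, Error };

// Hypothetical reply: payload plus a warning level and message.
struct Reply {
    std::string payload;
    WarningLevel level;
    std::string message;
};

// Code outside a command: it cannot build a reply, so it throws.
int readMonitorable(bool healthy) {
    if (!healthy) throw std::runtime_error("Monitorable in WARN");
    return 42;
}

// Command-like entry point: handles the error locally and returns a reply
// whose warning level matches the severity of the problem.
Reply runCommand(bool healthy) {
    Reply reply{"", WarningLevel::None, ""};
    try {
        reply.payload = std::to_string(readMonitorable(healthy));
    } catch (const std::runtime_error& e) {
        reply.level = WarningLevel::Warning;
        reply.message = e.what();
    }
    return reply;
}
```

The caller of `runCommand` always receives a reply; whether something went wrong is visible from the warning level and message rather than from an escaping exception.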

11 LogCollector & Chainsaw
Log Collector should be used for logging. Few subsystems do...
  – Central Cell, GCT, RCT, CSCTF, ECAL
  – Recipe to send logs to persistent storage:
    https://savannah.cern.ch/cookbook/?func=detailitem&item_id=168
  – Postmortem reports: plain logfiles are overwritten
  – Chainsaw during operations: message filtering
  – Python script to convert XML logfiles to human-readable format:
    http://triggersupervisor.cern.ch/uploads/api_docs/v1.6/logreader.py
[Image label: chainsaw]

12 Summary
Logs and Alarms have different orientation
  – Logs to debug the subsystem
  – Alarms to inform the shifter
Monitoring information
  – Report current status
  – Raise alarms
  – Retrofit the system
Abide by the severity levels guidelines
  – Uniform definitions will help avoid confusion

13 Workshop agenda
Error handling in a transition
Push mode in CellOperation
Monitoring persistency (DataSource & pulser)
  – Reducing rate using a Periodic DataSource
Checking children cell status

