Online remote monitoring facilities for the ATLAS experiment

Slides:



Advertisements
Similar presentations
GNAM and OHP: Monitoring Tools for the ATLAS Experiment at LHC GNAM and OHP: Monitoring Tools for the ATLAS Experiment at LHC M. Della Pietra, P. Adragna,
Advertisements

Development of the System remote access real time (SRART) at JINR for monitoring and quality assessment of data from the ATLAS LHC experiment (Concept.
1 CS 502: Computing Methods for Digital Libraries Lecture 22 Web browsers.
Servlets and a little bit of Web Services Russell Beale.
CADDLAB Medical Imaging on Remote Compute Servers.
Cloud Computing Lecture #7 Introduction to Ajax Jimmy Lin The iSchool University of Maryland Wednesday, October 15, 2008 This work is licensed under a.
March 2003 CHEP Online Monitoring Software Framework in the ATLAS Experiment Serguei Kolos CERN/PNPI On behalf of the ATLAS Trigger/DAQ Online Software.
First year experience with the ATLAS online monitoring framework Alina Corso-Radu University of California Irvine on behalf of ATLAS TDAQ Collaboration.
Screen Snapshot Service Kurt Biery LAFS Meeting, 08-May-2007.
Web Programming Language Dr. Ken Cosh Week 1 (Introduction)
CERN - IT Department CH-1211 Genève 23 Switzerland t Monitoring the ATLAS Distributed Data Management System Ricardo Rocha (CERN) on behalf.
DIRAC Web User Interface A.Casajus (Universitat de Barcelona) M.Sapunov (CPPM Marseille) On behalf of the LHCb DIRAC Team.
Screen Snapshot Service Kurt Biery SiTracker Monitoring Meeting, 23-Jan-2007.
14th IEEE-NPSS Real Time Conference 2005, 8 June Stockholm.
Test Of Distributed Data Quality Monitoring Of CMS Tracker Dataset H->ZZ->2e2mu with PileUp - 10,000 events ( ~ 50,000 hits for events) The monitoring.
Systems technologies // healthcare solutions. What is messenger? Multi-user messaging platform Sends messages to: Wireless Telephones (WiFi / DECT) Pocket.
Introduction to Internet Programming (Web Based Application)
1 Apache. 2 Module - Apache ♦ Overview This module focuses on configuring and customizing Apache web server. Apache is a commonly used Hypertext Transfer.
Proprietary & Confidential Java WebStart Created by Bob Hays.
Tracker data quality monitoring based on event display M.S. Mennea – G. Zito University & INFN Bari - Italy.
A.Golunov, “Remote operational center for CMS in JINR ”, XXIII International Symposium on Nuclear Electronics and Computing, BULGARIA, VARNA, September,
Jian Gui WANG New Implementation of Agriculture Models APAN19---Jan New Implementations of Agriculture Models Using Mediate Architecture.
CMS pixel data quality monitoring Petra Merkel, Purdue University For the CMS Pixel DQM Group Vertex 2008, Sweden.
INTRODUCTION TO WEB APPLICATION Chapter 1. In this chapter, you will learn about:  The evolution of the Internet  The beginning of the World Wide Web,
A Collaborative Framework for Scientific Data Analysis and Visualization Jaliya Ekanayake, Shrideep Pallickara, and Geoffrey Fox Department of Computer.
CHEP 2013, Amsterdam Reading ROOT files in a browser ROOT I/O IN JAVASCRIPT B. Bellenot, CERN, PH-SFT B. Linev, GSI, CS-EE.
SWGData and Software Access - 1 UCB, Nov 15/16, 2006 THEMIS SCIENCE WORKING TEAM MEETING Data and Software Access Ken Bromund GST Inc., at NASA/GSFC.
Java for networking Module Introduction Data Communications Communication architecture Application.
CSI 3125, Preliminaries, page 1 SERVLET. CSI 3125, Preliminaries, page 2 SERVLET A servlet is a server-side software program, written in Java code, that.
Tom Meyer, Iowa State SCT/Pixel Online Workshop June, 2001 CORBA Common Object Request Broker Architecture.
Introduction to ASP.NET development. Background ASP released in 1996 ASP supported for a minimum 10 years from Windows 8 release ASP.Net 1.0 released.
Data Hosting and Security Overview January, 2011.
M. Caprini IFIN-HH Bucharest DAQ Control and Monitoring - A Software Component Model.
Online Data Monitoring Framework Based on Histogram Packaging in Network Distributed Data Acquisition Systems Tomoyuki Konno 1, Anatael Cabrera 2, Masaki.
Introduction to Node.js® Jitendra Kumar Patel Saturday, January 31, 2015.
WARCS (Wide Area Remote Control for SPring-8)‏ A. Yamashita and Y.Furukawa SPring-8, Japan Control System Cyber-Security Workshop (CS)2/HEP Oct
Introduction to Information Systems SSD1: Introduction to Information Systems Unit 1. The World Wide Web Unit 2. Introduction to Java and Object- Oriented.
Norwalk LUG May Meeting
SmartCode Brad Argue INLS /19/2001.
Progress Apama Fundamentals
CX Introduction to Web Programming
Applications Active Web Documents Active Web Documents.
Understanding Web Server Programming
Web Programming Language
Distributed Control and Measurement via the Internet
WP18, High-speed data recording Krzysztof Wrona, European XFEL
WWW and HTTP King Fahd University of Petroleum & Minerals
JavaScript and Ajax (Internet Background)
Overview of the LHD Central Control Room Data Monitoring Environment
CNIT 131 Internet Basics & Beginning HTML
LHC experiments Requirements and Concepts ALICE
Outline SOAP and Web Services in relation to Distributed Objects
CMS Centres Worldwide CMS Centres Worldwide
Experience in CMS with Analytic Tools for Data Flow and HLT Monitoring
Processes The most important processes used in Web-based systems and their internal organization.
Outline SOAP and Web Services in relation to Distributed Objects
HTML5 based Notification System for Updating
PHP / MySQL Introduction
File Transfer Olivia Irving and Cameron Foss
Node.Js Server Side Javascript
Monitoring of the infrastructure from the VO perspective
Design and Maintenance of Web Applications in J2EE
Grid Canada Testbed using HEP applications
Outline Overview Development Tools
AIMS Equipment & Automation monitoring solution
Java Analysis Studio - Status
The Performance and Scalability of the back-end DAQ sub-system
BOF #1 – Fundamentals of the Web
Client-Server Model: Requesting a Web Page
Web Application Development Using PHP
Presentation transcript:

Online remote monitoring facilities for the ATLAS experiment Serguei Kolos (University of California, Irvine) Of behalf of the ATLAS remote monitoring team 10/19/2010 CHEP 2010 Conference

Outline Introduction Remote monitoring requirements in ATLAS: ATLAS collaboration Motivation for remote monitoring Remote monitoring requirements in ATLAS: Remote monitoring roles ATLAS security considerations Remote monitoring services: ATLAS online monitoring architecture Remote monitoring design and implementations Experience of the exploitation Summary and conclusion 10/19/2010 CHEP 2010 Conference

ATLAS collaboration ATLAS is one of the 4 major LHC experiments at CERN ATLAS detector and TDAQ system construction took about 15 years: 140 000 000 channels Over 3000 people from: 169 institutes 38 counties 10/19/2010 CHEP 2010 Conference

The motivations for remote monitoring (1/2) After the first year of the ATLAS exploitation the DAQ efficiency for 7TeV data is about 93% Availability of expertise from the people who were participating in the ATLAS construction is the key point to the success: Many experts are not permanently located at CERN Coming to CERN is possible but not too often due to various reasons (other commitments, budget limitations, teaching duties, etc.) Even those who are at CERN can not be present in the Control Room all day 10/19/2010 CHEP 2010 Conference

The motivations for remote monitoring (2/2) ATLAS experiment has its private set of networks: Data, Technical and Control Connection from the CERN Global Public Network is provided via dedicated gateway server: Closed during data taking Experts can log-in only in case of serious issues after the approval of the ATLAS Shift Leader Even experts based at CERN don’t have direct access to the monitoring data, unless sitting in one of the control rooms 10/19/2010 CHEP 2010 Conference

ATLAS Remote Monitoring Roles and Requirements Any member of the ATLAS collaboration shall be able to: View overall high-level status of the currently ongoing data taking activities Small amount of information is made available to everybody Updated at fixed time intervals (every couple of minutes) ATLAS sub-system expert shall be able to: Request and obtain an up-to-date value of a given monitoring information ATLAS sub-system shifter shall be able to: Permanently monitor the status of the given sub-system in real-time while data are being taken Reduces the burden on the sub-system experts, who are contacted only in case of problems (-) Amount of data (+) (-) Number of users (+) 10/19/2010 CHEP 2010 Conference

ATLAS Online Monitoring Architecture All monitoring information is kept in the Information Service: It’s complete It’s up-to-date IS provides read-write and subscribe-callback APIs: Java C++ Python Trigger/DAQ system Data Quality Monitoring Framework Visual Displays DQ Status Status, counters, etc. Histograms Information Service (IS) Histograms Event samples Event Analysis Frameworks 10/19/2010 CHEP 2010 Conference

General public remote monitoring Very limited set of data Practically unlimited number of users 10/19/2010 CHEP 2010 Conference

General public remote monitoring: Design Trigger/DAQ system Data Quality Monitoring Framework Visual Displays DQ Status Histograms Status, counters, etc. Information Service (IS) ATLAS Operations Web Server Periodically regenerated static HTML pages Histograms Event samples Event Analysis Frameworks Web Monitoring Interface (WMI) 10/19/2010 CHEP 2010 Conference

General public remote monitoring: Implementation WMI is a C++ framework: It is running in the ATLAS private network It fires plug-ins execution at configurable time intervals It provides data-out C++ API for the plug-ins It converts monitoring data chosen by the plug-ins to HTML files It copies those files to the ATLAS Web server WMI plug-ins: Select information to be put to HTML files Defines layout of the generated HTML pages 10/19/2010 CHEP 2010 Conference

WMI DAQ pages https://atlasop.cern.ch/atlas-point1/wmi/current/Run%20Status_wmi/ATLAS.html 10/19/2010 CHEP 2010 Conference

WMI Data Quality pages https://atlasop.cern.ch/atlas-point1/wmi/current/Data%20Quality%20Monitoring_wmi/ATLAS.html 10/19/2010 CHEP 2010 Conference

Hits statistics for the ATLAS Operations Web Server (first week of October) 10/19/2010 CHEP 2010 Conference

Expert mode remote monitoring Any monitoring information can be provided on request Overall request rate is limited 10/19/2010 CHEP 2010 Conference

Expert mode remote monitoring: Design Trigger/DAQ system Data Quality Monitoring Framework Visual Displays Web Browser XML over HTTPS DQ Status Histograms Status, counters, etc. Information Service (IS) ATLAS Operations Web Server Histograms XML over HTTPS Event samples Event Analysis Frameworks WebIS (Python) 10/19/2010 CHEP 2010 Conference

Expert mode remote-monitoring: Implementation WebIS (Python) IS ATLAS Operations Web Server (Apache) Web Browser WebIS (Python) Each object in IS has a unique URL, e.g.: https://atlasop.cern.ch/info/current/ATLAS/is/RunParams/RunParams.RunParams WebIS – Python HTTP server: Sends back the content of the IS object designated by the given URLs Stateless which allows using multiple servers for scalability Apache cache is used to reduce the number of requests Generic facility which can be used to construct complex WEB based GUIs using: CSS, JavaScript, Java applets, standalone Java applications 10/19/2010 CHEP 2010 Conference

WebIS: Histogram presenter https://atlasop.cern.ch/atlas-point1/tdaq/web_is/ohp/ATLAS.html 10/19/2010 CHEP 2010 Conference

WebIS: L1 Calorimeter remote monitoring application (Java) https://twiki.cern.ch/twiki/bin/view/Atlas/LevelOneCaloRemoteMonitoring 10/19/2010 CHEP 2010 Conference

Hits statistics for the ATLAS Operations Web Server (first week of October) 10/19/2010 CHEP 2010 Conference

Shifter mode remote monitoring Provides the same desktop and GUI monitors as in the ATLAS Control Room Supports limited number of users 10/19/2010 CHEP 2010 Conference

Shifter mode remote monitoring: Design Trigger/DAQ system Data Quality Monitoring Framework Visual Displays DQ Status Histograms Status, counters, etc. Information Service (IS) IS Mirror IS Streaming makes full real-time copy of the Information in IS Histograms Event samples Event Analysis Frameworks 10/19/2010 CHEP 2010 Conference

Shifter mode remote monitoring: Implementation Mirror IS nodes have 2 NICs: One can be connected only from Atlas Private Network Another one is restricted to “Remote X session” nodes Remote users: Open ssh-tunnel on a CERN public node Open X (NX) session on one of the “Remote X session” nodes Use the same GUI displays as in the ATLAS Control Room CERN Global public network ATLAS Technical Control Network Mirror IS (node 1) Remote X(NX) sessions (node 1) CERN Public Node ATLAS Information Service Remote node Mirror IS (node 2) Remote X(NX) sessions (node 2) Mirror IS (node 3) 10/19/2010 CHEP 2010 Conference

NX technology Plain X11 is very slow over long distance connections Several technologies were evaluated: VNC, Sun Secure Global Desktop, NX Finally NX technology has been chosen: The fastest, the cheapest and with small CPU/Memory overhead What is NX: Initially proposed and implemented by NoMachine company Provides compression and caching over standard X11 protocol We are using FreeNX server (GPL) Clients can download NoMachine NX client: It is free for non-commercial use It is available on Linux, Windows, MacOS 10/19/2010 CHEP 2010 Conference

Shifter mode remote monitoring: Scalability This approach is almost “infinitely” scalable: The information in the “IS mirror” service can be streamed to the “Secondary IS mirror”, which in turn can stream to the “Tertiary IS mirror”, etc. At every level the streaming delay is O(1) ms Mirror IS (node 1) Remote X (NX) sessions (node 1) CERN Public Node Remote node Mirror IS (node 2) Remote X (NX) sessions (node 2) ATLAS Technical Control Network Mirror IS (node 3) CERN Global public network Mirror IS (node 1) Remote X (NX) sessions (node 1) CERN Public Node Remote node ATLAS Information Service Mirror IS (node 2) Remote X (NX) sessions (node 2) Mirror IS (node 3) 10/19/2010 CHEP 2010 Conference

Shifter mode remote monitoring: The highlights Mirrors full status of the ATLAS data taking session in real-time: Information transfer delay is at the order of a few milliseconds Provides the same GUI applications as being used in the ATLAS Control Room: No additional learning curve involved No overhead for the SW development and maintenance Current configuration: Two “Remote X session nodes” handle now up to 20 concurrent remote users sessions With recurrent mirroring, any practical number of remote sessions can be easily handled: It’s just a matter of available HW resources 10/19/2010 CHEP 2010 Conference

Summary and conclusion Three types of services have been implemented to provide remote monitoring for ATLAS Each service is oriented to a specific group of ATLAS collaboration members The information available via WEB covers most of the experts needs for the monitoring The number of log-in requests to the ATLAS private network was drastically reduced when the services were put in place Remote NX sessions are working nicely for people logged in from USA, Japan, Russia and other long distance sites: Remote shifts become regular for some ATLAS sub-systems 10/19/2010 CHEP 2010 Conference