CASTOR logging at RAL
Rob Appleyard, James Adams and Kashyap Manjusha


The plan
Implement CERN’s new CASTOR logging system at RAL.
Start work at the source of messages and work through to the destination.
Simple, right?

simple-log-producer
Easy to set up, ran well.
We weren’t sure why custom code was written for this; we are aware of off-the-shelf products that do the same thing.

Apollo/ActiveMQ
CERN planned to move from Apollo to ActiveMQ, so we decided to go straight to the final destination.
The ActiveMQ broker proved difficult to set up.
– Arcane config.
– Dept. experience: they got it running once and hoped never to touch it again!
ActiveMQ seemed like overkill.
– It is a heavyweight bit of software.
– Our use case is extremely simple: take messages from multiple sources and forward them on.
– Is there a simpler way of doing this?

Message Broker: The solution!
Replace the ActiveMQ broker with some rsyslog config that does the same thing.
– We use rsyslog for all other Tier 1 logging at RAL.
– Lightweight solution that does what we needed.
– Simply send unprocessed log messages over TCP.
– A couple of lines in rsyslog.conf were all that was necessary.
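
The slides do not show the rsyslog lines themselves. As an illustration of the underlying idea only (raw lines sent unmodified over a TCP connection), here is a minimal Python sketch. It is not the RAL configuration: `LineCollector`, `forward_lines`, the loopback address and the sample messages are all assumptions made for the example.

```python
import socket
import socketserver
import threading

received = []  # lines collected by the stand-in receiver


class LineCollector(socketserver.StreamRequestHandler):
    """Hypothetical receiver: stores each newline-terminated line as-is."""

    def handle(self):
        for raw in self.rfile:
            received.append(raw.decode("utf-8").rstrip("\n"))


def forward_lines(lines, host, port):
    """Send log lines unprocessed, one per line, over one TCP connection."""
    with socket.create_connection((host, port)) as conn:
        for line in lines:
            conn.sendall(line.encode("utf-8") + b"\n")


def demo():
    received.clear()
    # Port 0 asks the OS for any free port; a real receiver would use a fixed one.
    with socketserver.TCPServer(("127.0.0.1", 0), LineCollector) as server:
        host, port = server.server_address
        worker = threading.Thread(target=server.handle_request)
        worker.start()
        forward_lines(["first raw message", "second raw message"], host, port)
        worker.join()  # returns once the sender has closed the connection
    return list(received)
```

The point of the sketch is that the forwarding step does no parsing at all, which is why any processing has to happen somewhere downstream.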

But…
The simple-log-producer (slp) is not simply a forwarding mechanism.
Messages are processed locally before transmission.
We need to do the processing done by slp somewhere downstream.
Combine slp and the consumer scripts into one script that runs on the ‘viewer’ node.
We could also eliminate the rsyslog broker and just send directly to the viewer.

smooshed-log-producer
Attempt to combine simple-log-producer with the consumers.
James spent several days working on this.
– Thought that the system could more easily be re-implemented using standard software.
– Let’s have a go!

The Off-the-Shelf Approach
Use logstash feeding ElasticSearch and Kibana.
All three components are affiliated open-source products.
Less than a day to produce a working prototype.
We already have a solution for long-term archival of log messages: central loggers that capture all Tier 1 messages.

Logstash
Open source log management tool.
Input -> Filters -> Output
Can interface with… more or less anything.
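
The Input -> Filters -> Output model can be sketched in a few lines of Python. This is only an illustration of the data flow, not Logstash code; `run_pipeline` and the three example filters are hypothetical names chosen for the sketch.

```python
def run_pipeline(source, filters, sink):
    """Pass each event through every filter in order, then hand it to the sink.

    A filter returns the (possibly modified) event, or None to drop it.
    """
    for event in source:
        for apply_filter in filters:
            event = apply_filter(event)
            if event is None:
                break  # event dropped, skip the output stage
        else:
            sink(event)


def demo():
    out = []
    run_pipeline(
        source=["  first message ", "   ", "second message"],
        filters=[
            str.strip,                  # tidy whitespace
            lambda e: e or None,        # drop empty lines
            lambda e: {"message": e},   # wrap into a document
        ],
        sink=out.append,
    )
    return out
```

Each stage only knows about events, so inputs and outputs can be swapped without touching the filters.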

Logstash
Our setup receives syslog messages over TCP, tokenises them and forwards them to ElasticSearch.
RAL has experimented with it in the past for other applications.
See: logstash.net
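
To make the tokenising step concrete, here is a rough Python sketch that splits one syslog-style line into named fields ready to be stored as a JSON document. The line layout (timestamp, host, daemon[pid], then KEY=value pairs), the field names and the sample values are assumptions for illustration; the slides do not give the actual CASTOR message format or the Logstash filter configuration.

```python
import re

# Assumed layout: "<timestamp> <host> <daemon>[<pid>]: KEY=value KEY="quoted value" ..."
HEADER = re.compile(
    r"^(?P<timestamp>\S+) (?P<host>\S+) (?P<daemon>[^\[\s]+)\[(?P<pid>\d+)\]: (?P<body>.*)$"
)
PAIR = re.compile(r'(\w+)=("([^"]*)"|\S+)')


def tokenise(line):
    """Turn one log line into a dict of fields; tag lines that do not parse."""
    match = HEADER.match(line)
    if match is None:
        return {"message": line, "tags": ["unparsed"]}
    doc = {key: match.group(key) for key in ("timestamp", "host", "daemon")}
    doc["pid"] = int(match.group("pid"))
    for key, raw, quoted in PAIR.findall(match.group("body")):
        # Quoted values keep their inner text; bare values are used as-is.
        doc[key] = quoted if raw.startswith('"') else raw
    return doc


SAMPLE = (
    '2013-11-01T12:00:00+00:00 node01 exampled[4242]: '
    'LVL=Info MSG="Request processed" REQID=abc-123'
)
```

Calling `tokenise(SAMPLE)` yields a flat dictionary, so every key becomes a field that can be searched on individually. Lines that do not match are kept whole and tagged rather than discarded.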

ElasticSearch
Distributed real-time search and analysis tool.
Based on Apache Lucene (JSON document-based search).
Horizontal construction – need more capacity? Just add more nodes.
Currently running on a two-machine cluster.
Accepting messages from the preproduction instance.

Kibana
Web front end for ElasticSearch.
Index and full text of every CASTOR log message.
All tokenised and searchable.
– Search on any message field.
– Arbitrary queries.
Lots of graphs and analytics.
Much faster than DLF, at least with preprod.
No Oracle or MySQL database required.
Current implementation: LINK!
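
As a hedged illustration of what "search on any message field" looks like underneath the web interface, the sketch below builds a search request body in the JSON query DSL used by recent Elasticsearch releases (the syntax of the version running at the time may have differed). `field_query` and the `REQID` field are hypothetical; `@timestamp` is the timestamp field Logstash adds by default. Nothing is sent over the network here.

```python
import json


def field_query(field, value, since="now-1h", size=50):
    """Build a search body: match one field, restricted to a recent time window."""
    return {
        "query": {
            "bool": {
                "must": [{"match": {field: value}}],
                "filter": [{"range": {"@timestamp": {"gte": since}}}],
            }
        },
        "size": size,
    }


# Hypothetical example: all messages for one request ID in the last hour.
body = field_query("REQID", "abc-123")
body_json = json.dumps(body, indent=2)
```

Because each token is stored as its own field, the same function works for any field name without schema changes.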

The Result
We have a system that appears to be capable of fulfilling our needs better than DLF.
Faster.
Able to run arbitrary queries.
Components are all off-the-shelf.
Correctly handles all CASTOR messages that DLF did (which isn’t everything…)
– Needs some help to interpret and deal with a few anomalies.
– The xroot logs are weird.

Future Plans
Currently working against preprod during stress testing.
During stress testing, received 16 GB/day of CASTOR logs.
No dependency on
We aim to start testing against our production instances before Christmas.
– Scalability?
The plan is to have one message index per CASTOR instance.
Possible future development:
– Reconfigure rsyslog on source nodes to send JSON to logstash rather than syslog (should be pretty trivial).
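
Two of the points above lend themselves to a quick sketch: routing messages to one index per CASTOR instance, and getting a feel for what 16 GB/day means as a sustained rate. The naming scheme (`castor-<instance>-<date>`, following the daily-index convention common with Logstash), the instance name and the date are assumptions, as is treating 1 GB as 10^9 bytes arriving evenly over the day.

```python
from datetime import date

SECONDS_PER_DAY = 86_400


def index_name(instance, day):
    """Hypothetical scheme: one index per CASTOR instance per day."""
    return f"castor-{instance.lower()}-{day:%Y.%m.%d}"


def average_rate_bytes_per_sec(gb_per_day):
    """Average ingest rate, assuming decimal gigabytes and uniform arrival."""
    return gb_per_day * 10**9 / SECONDS_PER_DAY


example_index = index_name("Preprod", date(2013, 11, 1))
stress_test_rate = average_rate_bytes_per_sec(16)  # roughly 185 kB/s on average
```

Real traffic is bursty, so the average understates the peak rate that the cluster has to absorb.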

Questions?