GridICE: a monitoring service for Grid Systems. OUTLINE GridICE Server Installation – Brief Introduction – System Requirements – Core Packages & Dependencies.

Slides:



Advertisements
Similar presentations
Database Planning, Design, and Administration
Advertisements

Database System Concepts and Architecture
70-290: MCSE Guide to Managing a Microsoft Windows Server 2003 Environment, Enhanced Chapter 9: Implementing and Using Group Policy.
Network+ Guide to Networks, Fourth Edition Chapter 10 Netware-Based Networking.
70-290: MCSE Guide to Managing a Microsoft Windows Server 2003 Environment Chapter 1: Introduction to Windows Server 2003.
70-290: MCSE Guide to Managing a Microsoft Windows Server 2003 Environment Chapter 9: Implementing and Using Group Policy.
70-290: MCSE Guide to Managing a Microsoft Windows Server 2003 Environment Chapter 1: Introduction to Windows Server 2003.
Lecture Nine Database Planning, Design, and Administration
70-290: MCSE Guide to Managing a Microsoft Windows Server 2003 Environment, Enhanced Chapter 1: Introduction to Windows Server 2003.
Sharepoint Portal Server Basics. Introduction Sharepoint server belongs to Microsoft family of servers Integrated suite of server capabilities Hosted.
The SAM-Grid Fabric Services Gabriele Garzoglio (for the SAM-Grid team) Computing Division Fermilab.
MCTS Guide to Configuring Microsoft Windows Server 2008 Active Directory Chapter 3: Introducing Active Directory.
1 Kyung Hee University Prof. Choong Seon HONG Network Control.
Database System Development Lifecycle © Pearson Education Limited 1995, 2005.
Overview of the Database Development Process
Active Monitoring in GRID environments using Mobile Agent technology Orazio Tomarchio Andrea Calvagna Dipartimento di Ingegneria Informatica e delle Telecomunicazioni.
Cluster Reliability Project ISIS Vanderbilt University.
1 Apache. 2 Module - Apache ♦ Overview This module focuses on configuring and customizing Apache web server. Apache is a commonly used Hypertext Transfer.
Performance of the Relational Grid Monitoring Architecture (R-GMA) CMS data challenges. The nature of the problem. What is GMA ? And what is R-GMA ? Performance.
70-290: MCSE Guide to Managing a Microsoft Windows Server 2003 Environment, Enhanced Chapter 1: Introduction to Windows Server 2003.
INFSO-RI Enabling Grids for E-sciencE GridICE: a monitoring service for Grid Systems Sergio Andreozzi INFN (Italy)
Grid Technologies  Slide text. What is Grid?  The World Wide Web provides seamless access to information that is stored in many millions of different.
Contents 1.Introduction, architecture 2.Live demonstration 3.Extensibility.
INFSO-RI Enabling Grids for E-sciencE GridICE: a monitoring service for Grid Systems Giuseppe Misurelli INFN-CNAF (Italy) giuseppe.misurelli.
1 DIRAC – LHCb MC production system A.Tsaregorodtsev, CPPM, Marseille For the LHCb Data Management team CHEP, La Jolla 25 March 2003.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Information System on gLite middleware Vincent.
Computer Emergency Notification System (CENS)
Database Planning, Design, and Administration Transparencies
Distributed monitoring system. Why Monitor? Solve them! Identify Problems Ensure conduct Requirements Manage many computers Spot trends in the system.
MTA SZTAKI Hungarian Academy of Sciences Introduction to Grid portals Gergely Sipos
Getting started DIRAC Project. Outline  DIRAC information system  Documentation sources  DIRAC users and groups  Registration with DIRAC  Getting.
FP6−2004−Infrastructures−6-SSA E-infrastructure shared between Europe and Latin America BDII Server Installation and Configuration.
E-infrastructure shared between Europe and Latin America FP6−2004−Infrastructures−6-SSA gLite Information System Pedro Rausch IF.
FP6−2004−Infrastructures−6-SSA E-infrastructure shared between Europe and Latin America gLite Information System Claudio Cherubino.
INFSO-RI Enabling Grids for E-sciencE /10/20054th EGEE Conference - Pisa1 gLite Configuration and Deployment Models JRA1 Integration.
Hyperion Artifact Life Cycle Management Agenda  Overview  Demo  Tips & Tricks  Takeaways  Queries.
Module 6: Administering Reporting Services. Overview Server Administration Performance and Reliability Monitoring Database Administration Security Administration.
Copyright 2007, Information Builders. Slide 1 iWay Web Services and WebFOCUS Consumption Michael Florkowski Information Builders.
Enabling Grids for E-sciencE CMS/ARDA activity within the CMS distributed system Julia Andreeva, CERN On behalf of ARDA group CHEP06.
Gennaro Tortone, Sergio Fantinel – Bologna, LCG-EDT Monitoring Service DataTAG WP4 Monitoring Group DataTAG WP4 meeting Bologna –
DataTAG is a project funded by the European Union International School on Grid Computing, 23 Jul 2003 – n o 1 GridICE The eyes of the grid PART I. Introduction.
FESR Trinacria Grid Virtual Laboratory gLite Information System Muoio Annamaria INFN - Catania gLite 3.0 Tutorial Trigrid Catania,
Building Preservation Environments with Data Grid Technology Reagan W. Moore Presenter: Praveen Namburi.
Co-ordination & Harmonisation of Advanced e-Infrastructures for Research and Education Data Sharing Research Infrastructures Grant Agreement n
DataTAG is a project funded by the European Union CERN, 8 May 2003 – n o 1 / 10 Grid Monitoring A conceptual introduction to GridICE Sergio Andreozzi
II EGEE conference Den Haag November, ROC-CIC status in Italy
TIFR, Mumbai, India, Feb 13-17, GridView - A Grid Monitoring and Visualization Tool Rajesh Kalmady, Digamber Sonvane, Kislay Bhatt, Phool Chand,
1 Copyright © 2008, Oracle. All rights reserved. Repository Basics.
E-science grid facility for Europe and Latin America Updates on Information System Annamaria Muoio - INFN Tutorials for trainers 01/07/2008.
The EPIKH Project (Exchange Programme to advance e-Infrastructure Know-How) gLite Grid Introduction Salma Saber Electronic.
INFSO-RI Enabling Grids for E-sciencE GridICE: status and plans for gLite integration and user level job monitoring Sergio Andreozzi.
Enabling Grids for E-sciencE Claudio Cherubino INFN DGAS (Distributed Grid Accounting System)
Databases and DBMSs Todd S. Bacastow January 2005.
Job monitoring and accounting data visualization
gLite Information System
GridICE: a monitoring service for Grid Systems
Brief overview on GridICE and Ticketing System
Monitoring: problems, solutions, experiences
Sergio Fantinel, INFN LNL/PD
GridICE monitoring for the EGEE infrastructure
Objectives Differentiate between the different editions of Windows Server 2003 Explain Windows Server 2003 network models and server roles Identify concepts.
a VO-oriented perspective
Copyright © 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 2 Database System Concepts and Architecture.
Data, Databases, and DBMSs
Author: Laurence Field (CERN)
Database Systems Instructor Name: Lecture-3.
The EU DataGrid Fabric Management Services
Information System (BDII)
Information Services Claudio Cherubino INFN Catania Bologna
Presentation transcript:

GridICE: a monitoring service for Grid Systems

OUTLINE GridICE Server Installation – Brief Introduction – System Requirements – Core Packages & Dependencies – APT Installation – Apache Configuration – PostgreSQL Optimization – The GridICE Configuration Script – After The Discovery Process Why Monitoring – A Use Case Perspective – VO manager viewpoint – Grid Operations viewpoint – Site Administrator viewpoint What is Grid Monitoring – Our Definition – Concepts & Terminology – Requirements – The Four Main Phases of Monitoring 2 The GridICE Approach –Generating Events –Distributing Events –Presenting Events Monitoring a Grid –Challenges for Data Collection –Challenges for Data Presentation –VO manager utilization –Grid Operations manager utilization –Site Administrator utilization

GridICE Server Installation 3

Brief Introduction GridICE: – is a distributed monitoring tool for grid systems – integrates with local monitoring systems – offers a web interface for publishing monitoring data at the Grid level – fully integrated in the LCG-2 Middleware gridice-clients data collector installation and configuration for each site ralized by the Yaim scripts. 4

System Requirements Suggested Operating system is Scientific Linux with a minimal installation The GridICE server should be installed on a performant machine – PostgreSQL service - RAM intensive demand – Apache web server - RAM-CPU intensive demand 5

Core Packages & Dependencies The GridICE server software is composed by three core packages: 1.gridice-core (setup and maintenance scripts / discovery components) 2.gridice-www (web interface scripts and components) 3.gridice-plugins (monitoring scripts) Plus several dependencies: – Apache http web server – PostgreSQL database server – Nagios monitoring tool –... 6

APT Installation Fully automated process thanks to APT package manager 7 Add in the /etc/apt/source.list.d/sl.d the GridICE repository: ### GridICE APT Repository ### rpm gridice/packages/sl/3.0.3/i386 \ gridicehttp://infnforge.cnaf.infn.it Update your new repository list with the command: update Upgrade your system with the command: apt-get upgrade (it takes a while) Install the GridICE meta-package with the command: apt-get install gridice-server

Apache Configurations 1.HTTPD to disable dir indexes and manage the.htaccess related files: In /etc/httpd/conf/httpd.conf Modify Options Indexes FollowSymLinks with Option –Indexexs FollowSymLinks After few lines modify AllowOverride None with AllowOverride All Save and exit Then create the two symbolic links for the jpgraph and ADODB libraries ln –s /var/www/jpgraph- /var/www/html/gridice/external/jpgraph ln –s /var/www/adodb /var/www/html/gridice/external/adodb 8

PostgreSQL Optimization 1.All PostgreSQL databases and configurations files are locate in /var/lib/pgsql/data If this directory does not exist (or it is empty) then launch: 2.For a database performances optimization we suggest to set the following attibutes/values in the /var/lib/pgsql/data/postgresql.conf file as follows: listen_addresses = ‘*’ max_connections = 256 work_men = 2048 maintenance_work_men = fsync = false enable_hashjoin = true enable_indexscan = true enble_nestloop = true enable_seqscan = true enavble_tidiscan = true effective_cache_size = random_page_cost = 2 9 – postgres –D /var/lib/pqsql/data

The GridICE Configuration File Choosing your Grid being monitored… Create a GridICE server configuration file 10 /opt/gridice/setup/ gridice-server.cfg.template gridice-server.cfg Edit /opt/gridice/setup/gridice-server.cfg Modify the following attributes: hostname FQDN of the GridICE server addr IP address of theGridICE server dbadminpass Choose a password for PostgreSQL connections (it refers to the‘postgres’ Linux user) dbhost FQDN of the GridICE server dbpass Choose a password for the GridICE DB connections (It refers to the ‘gridiceadmin’ PostgreSQL user) blacklist Define a regular expression in order to exclude one or more sites from discovery process (separate each site wit “|” following the reported example) Default is no Grid site excluded giisgroup BDII list to use for the Grid being monitored Note that Every giisgroup indicates a BDII so that you can have more than one monitored Grid For each group you can insert more than one BDII for backup pupose in terms of ldap queries to the related BDII. Now you can launch the GridICE configuration scripts: –-cfg \ /opt/gridice/setup/gridice-server.cfg

Final Configurations GridICE Database creation (plus patches for the new geo view) GridICE cron jobs to perform maintenance routines and periodic discovery GridICE discovery script to explore and collect all the monitoring data about your Grid (It queries the Information Service of your Grid and inserts into the RDMS all the data retrieved) 11 – postgres –U gridiceadmin GridICEdb < \ /opt/gridice/setup/pgsql/mondb.sql /opt/gridice/utils/gridice-cronjobs /etc/cron.d

After The Discovery Process Be sure that the following services are running 1.nagios 2.postgresql 3.httpd (check also if the http port is open) To see your Grid monitored data, point the web browser to the URL: /gridice 12

Why Monitoring 13

A Use Case Perspective – Grid resources availability is subject to failures. – Resources observability is necessary for the Grid utilization. 14 Need for analyzing the usage, behavior and performance of a Grid depending on different users: 1. VO manager 2. Grid operations manager 3. Site administrator

VO manager viewpoint Visualization of the actual set of resources accessible to its members. Evaluation of members’demand satisfaction on the Grid mapping functionalities. Evaluation of the Service Level Agreement (SLA) for the global Grid service offers. 15

Grid operations manager viewpoint Detection and prediction of fault situations related to wide area distributed resources. Coordination of the deployment and upgrade of the Grid middleware installed at several sites. Investigation on Grid resources for statistical purpose. 16

Site Administrator viewpoint Detection of fault situations related to the own resources. Control how the own resources are used and appear to the Grid. 17

What is Grid Monitoring 18

Our Definition Grid Monitoring – the activity of measuring significant Grid resources related parameters – in order to analyze usage, behavior and performance of the grid detect and notify fault situations 19

Concepts & Terminology Entity: any networked and useful resources having a considerable lifetime (e.g. processors, memories, disk capacity, etc.). Events: collection of timestamped data, associated with the attribute of an entity. Event Schema (or Schema): the typed structure and semantics of the all events, so that given an event type, one can find the structure and interpret the semantics of the corresponding event. Sensor: process monitoring an entity and generating events. 20

Requirements Scalability: monitoring systems have to cope efficiently with a growing number of resources, events and users. Extensibility: monitoring systems must be extensible with respect to the supported resources. Data delivery models: monitoring systems must integrate different measurement policies (e.g. periodic, on-demand). Portability: any encapsulated measurement must be platform independent. Security: monitoring systems must deal with security concerns such as privacy, data integration and confidentiality. 21

The Four Main Phases of Monitoring 22 Generation Distributing Presenting Processing Sensors inquiring entities and encoding the measurements according to a schema Transmission of the events from the source to any interested parties (data delivery model: push vs. pull; periodic vs. aperiodic) Processing and abstract the number of received events in order to enable the consumer to draw conclusions about the operation of the monitored system e.g., filtering according to some predefined criteria, or summarising a group of events

The GridICE Approach 23

Generating Events Generation of events: – Sensors: typically perl scripts or c programs. – Schema: GLUE Schema v GridICE extension. – System related (e.g., CPU load, CPU Type, Memory size). – Grid service related (e.g., CE ID, queued jobs). – Network related (e.g., Packet loss). – Job usage (e.g., CPU Time, Wall Time). – All sensors are executed in a periodic fashion. 24

Distributing Events Distribution of events: – Hierarchical model. Intra-site: by means of the local monitoring service – default choice, LEMON ( Inter-site: by offering data through the Grid Information Service. Final Consumer: depending on the client application. – Mixed data delivery model. Intra-site: depending on the local monitoring service (push for lemon). Inter-site: depending on the GIS (current choice, MDS 2.x, pull). Final consumer: pull (browser/application), push (publish/subscribe notification service coming on the next release). 25

Presenting Events Data stored in a RDBMS used to build aggregated statistics. Data retrieved from the RDBMS are encoded in XML files. XSL to XHTML transformations to publish aggregated data in a Web context. 26

Monitoring a Grid 27

Challenges for Data Collection The distribution of monitoring data is strongly characterised by significant requirements (e.g., Scalability, Heterogeneity, Security, System Health) None of the existing tools satisfy all of these requirements Grid data collection should be customized depending on what are the needs of your Grid users selected 28

Challenges for Data Presentation Different Grid users are interested in different subset of Grid data and different aggregation levels Usability principles should be taken into account to help users finding relevant Grid monitoring information A sintetic data aggregation is crucial to permit a drill- down navigation (from the general to te detailed) of the Grid data 29

30

VO manager utilization Mostly interested in: – Resources available to the VO Computing elements where VO users can submit jobs. Storage elements where VO users can store/retrieve data. – Job monitoring How many jobs are running or queued? – For the whole VO? In each site? Submitted by a certain RB? How many jobs have been executed? – For the whole VO? In each site? Submitted by a certain RB? 31

Grid operations manager utilization Mostly interested in: – General status of the managed Grid How many sites compose the managed Grid and where they are located. How many resources (cpu#, WN, etc.) are available. – Highlighted problems Is there any Grid service (e.g., CE, SE, BDII) which related processes have problems? Is the Grid Information Service working properly? 32

Site administrator utilization Mostly interested in: – Status of their resources What is the cpu load at the moment? What is the percentage of the busy storage space? Are there any jobs running or queued in my site and in which Worker Node? – Highlighted problems Is there any Grid service (e.g., CE, SE, BDII) which related processes have problems? Is the Grid Information Service working properly? 33