WORKER NODE Alfonso Pardo EPIKH School, System Admin Tutorial Beijing, August 30th – September 3rd, 2010.


OUTLINE
– OVERVIEW
– INSTALLATION & CONFIGURATION
– TESTING
– FIREWALL SETUP
– TROUBLESHOOTING

OVERVIEW
The Worker Node is the service where jobs actually run. Its main functions are to:
– execute the jobs
– report the status of the jobs back to the Computing Element
It can run several kinds of batch system client:
– Torque
– LSF
– SGE
– Condor

TORQUE client
The Torque client consists of:
– pbs_mom, the daemon which places the job into execution. It is also responsible for returning the job’s output to the user.

Worker Node installation & configuration using YAIM

WHAT KIND OF WN?
There are several metapackages to choose from:
– glite_WN – “generic” Worker Node.
– glite_WN_noafs – like glite_WN but without AFS.
– glite_WN_LSF – LSF Worker Node. IMPORTANT: provided for consistency; it does not install the LSF software, but it applies some fixes via ig_configure_node.
– glite_WN_LSF_noafs – like glite_WN_LSF but without AFS.
– glite_WN_torque – Torque Worker Node.
– glite_WN_torque_noafs – like glite_WN_torque but without AFS.
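As a quick sanity check on the naming scheme above, the choice boils down to two site decisions: which batch system client you run and whether you want AFS. The sketch below is illustrative only (the BATCH and AFS variables are hypothetical helpers, not YAIM settings); the metapackage names it produces are the ones listed above.

```shell
# Illustrative only: derive the metapackage name from two site choices.
# BATCH and AFS are hypothetical helper variables, not YAIM settings.
BATCH="torque"        # "torque", "LSF", or "" for the generic WN
AFS="no"              # "yes" or "no"

PKG="glite_WN"
[ -n "$BATCH" ] && PKG="${PKG}_${BATCH}"
[ "$AFS" = "no" ] && PKG="${PKG}_noafs"
echo "$PKG"           # glite_WN_torque_noafs for the choices above
```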

Repository settings
REPOS="lcg-ca dag ig gilda glite-wn_torque"
Download and store the repo files:
for name in $REPOS; do wget http://grid-it.cnaf.infn.it/mrepo/repos/sl5/x86_64/$name.repo -O /etc/yum.repos.d/$name.repo; done
wget -O /etc/yum.repos.d/gilda.repo http://grid018.ct.infn.it/mrepo/repos/gilda.repo
wget -O /etc/yum.repos.d/jpackage.repo http://grid-it.cnaf.infn.it/mrepo/repos/jpackage.repo

INSTALLATION
yum install jdk java sun-compat
yum install lcg-CA
yum install lcg-WN
yum install glite-TORQUE_utils
yum install glite-TORQUE_client
GILDA rpms:
yum install gilda_utils gilda_applications

Customize ig-site-info.def
Copy the users and groups example files to /opt/glite/yaim/etc/gilda/:
cp /opt/glite/yaim/examples/ig-groups.conf /opt/glite/yaim/etc/gilda/
cp /opt/glite/yaim/examples/ig-users.conf /opt/glite/yaim/etc/gilda/
Append the GILDA users and groups definitions to /opt/glite/yaim/etc/gilda/ig-users.conf and ig-groups.conf:
cat /opt/glite/yaim/etc/gilda/gilda_ig-users.conf >> /opt/glite/yaim/etc/gilda/ig-users.conf
cat /opt/glite/yaim/etc/gilda/gilda_ig-groups.conf >> /opt/glite/yaim/etc/gilda/ig-groups.conf

Customize ig-site-info.def
Copy the ig-site-info.def template file provided by ig_yaim into the gilda dir and customize it:
cp /opt/glite/yaim/examples/siteinfo/ig-site-info.def /opt/glite/yaim/etc/gilda/
Open the copied file under /opt/glite/yaim/etc/gilda/ with a text editor and set the following values according to your grid environment:
CE_HOST=
TORQUE_SERVER=$CE_HOST

Customize ig-site-info.def
WN_LIST=/opt/glite/yaim/etc/gilda/wn-list.conf
The file specified in WN_LIST must contain the hostnames of all your WNs.
WARNING: it is important to set it up before running the configure command.
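For illustration, the WN list file is just a plain list of WN hostnames, one per line. The sketch below writes such a file to a temporary path with made-up hostnames; on a real site the file goes to the path you set in WN_LIST and lists your actual WNs.

```shell
# Sketch: build a WN list file (one WN hostname per line).
# The /tmp path and the hostnames are placeholders for this example.
WN_LIST=/tmp/wn-list.conf
cat > "$WN_LIST" <<'EOF'
wn1.gilda.local
wn2.gilda.local
EOF
cat "$WN_LIST"
```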

Customize ig-site-info.def
GROUPS_CONF=/opt/glite/yaim/etc/gilda/ig-groups.conf
USERS_CONF=/opt/glite/yaim/etc/gilda/ig-users.conf
JAVA_LOCATION="/usr/bin/java/jdk"
JOB_MANAGER=lcgpbs
BATCH_BIN_DIR=/usr/bin
BATCH_VERSION=torque
VOS="gilda"
ALL_VOMS="gilda"

Customize ig-site-info.def
QUEUES="short long infinite"
SHORT_GROUP_ENABLE=$VOS
LONG_GROUP_ENABLE=$VOS
INFINITE_GROUP_ENABLE=$VOS
To configure a queue for a single VO:
QUEUES="short long infinite gilda"
SHORT_GROUP_ENABLE=$VOS
LONG_GROUP_ENABLE=$VOS
INFINITE_GROUP_ENABLE=$VOS
GILDA_GROUP_ENABLE="gilda"

WN Torque CONFIGURATION
Now we can configure the node:
/opt/glite/yaim/bin/ig_yaim -n glite-WN -n glite-TORQUE_client -n glite-TORQUE_utils

Worker Node testing

Testing
Verify that pbs_mom is active and that the node state is free:
root]# /etc/init.d/pbs_mom status
pbs_mom (pid 3692) is running...
root]# pbsnodes -a
wn.localdomain
state = free
np = 2
properties = lcgpro
ntype = cluster
status = arch=linux,uname=Linux wn.localdomain EL.cern 1 Tue Oct 4 16:45:05 CEST 2005 i686,sessions= ,3584,nsessions=6,nusers=1,idletime=1569,totmem=254024kb,availmem=69852kb,physmem=254024kb,ncpus=1,loadave=0.30,rectime=

First of all, check that a generic user on the WN can ssh to the CE without typing a password:
root]# su - gilda001
gilda001] ssh ce
gilda001]
The same test has to pass between the WNs in order to run MPI jobs:
gilda001] ssh wn1
gilda001]
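The manual checks above can also be looped over every host at once. This is a sketch with placeholder hostnames (substitute your CE and your WN list); BatchMode=yes makes ssh exit non-zero instead of prompting, so a password prompt shows up as a failure.

```shell
# Sketch: check passwordless SSH from this WN to the CE and each peer WN.
# The hostnames in HOSTS are placeholders for your site.
HOSTS="ce wn1 wn2"
FAILED=""
for h in $HOSTS; do
  ssh -o BatchMode=yes -o ConnectTimeout=5 "$h" true 2>/dev/null \
    || FAILED="$FAILED $h"
done
if [ -z "$FAILED" ]; then
  echo "passwordless ssh OK on all hosts"
else
  echo "no passwordless ssh to:$FAILED"
fi
```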

FIREWALL setup

/etc/sysconfig/iptables
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:RH-Firewall-1-INPUT - [0:0]
-A INPUT -j RH-Firewall-1-INPUT
-A FORWARD -j RH-Firewall-1-INPUT
-A RH-Firewall-1-INPUT -i lo -j ACCEPT
-A RH-Firewall-1-INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -s --dport 22 -j ACCEPT
-A RH-Firewall-1-INPUT -p all -s -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m tcp --syn -j REJECT
-A RH-Firewall-1-INPUT -j REJECT --reject-with icmp-host-prohibited
COMMIT

IPTABLES STARTUP
/sbin/chkconfig iptables on
/etc/init.d/iptables start

Troubleshooting

Troubleshooting
root]# su - gilda001
gilda001] ssh ce
password:
Probably this WN's hostname is not in /etc/ssh/shosts.equiv, or its ssh keys were not created and stored in /etc/ssh/ssh_known_hosts on the CE.
Solution (to run on the CE):
Ensure that the WN is in the pbs node list:
root]# pbsnodes -a
And then:
root]# /opt/edg/sbin/edg-pbs-shostsequiv
root]# /opt/edg/sbin/edg-pbs-known-hosts

Troubleshooting
root]# pbsnodes -a
wn.localdomain
state = down
np = 2
properties = lcgpro
ntype = cluster
Solution:
root]# /etc/init.d/pbs_mom restart
