DPM Administration
Jean-Philippe Baud (Sophie Lemaitre)
INFSO-RI-508833 – Enabling Grids for E-sciencE – www.eu-egee.org


DPM in a nutshell
– Description
– Installation
– DPM as a service
– Troubleshooting
– Documentation / support


What is a DPM?
Disk Pool Manager
– Manages storage on disk servers
– SRM support (1.1, 2.1 and 2.2)
Deployment status
– ~200 DPMs in production
– 70 VOs supported

Architecture (diagram)
– The databases store the namespace, authorization information, replicas, the DPM configuration and all requests (SRM, transfers…) – very important to back up! (see the sketch below)
– Disk servers store the physical files
– SRM provides the standard storage interface
– All components can be installed on a single machine
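Given the emphasis on backing up the databases, a minimal dump sketch could look like this (the database names dpm_db and cns_db are the usual defaults, and the backup path and credentials are placeholders – check the connection strings in /opt/lcg/etc/DPMCONFIG and /opt/lcg/etc/NSCONFIG first):

  # Dump the DPM and DPNS databases to timestamped files
  # (dpm_db / cns_db are assumed default names - verify them)
  DATE=$(date +%Y%m%d)
  mysqldump --single-transaction -u root -p dpm_db > /backup/dpm_db-$DATE.sql
  mysqldump --single-transaction -u root -p cns_db > /backup/cns_db-$DATE.sql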

Daemons
– DPM server
– DPM Name Server
– SRM servers
   srm v1
   srm v2
   srm v2.2
– RFIO server
– DPM-enabled GridFTP server
   Normal GridFTP server with a DPM plugin
   Can be used in place of the normal GridFTP server
   But the normal GridFTP server cannot be used for the DPM
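A quick way to verify that these daemons are all up, for instance after a restart (the init script names below are assumptions based on the usual gLite packaging – adjust them to your installation, and run the rfiod/dpm-gsiftp checks on the disk servers):

  # Report the status of each DPM-related service
  for svc in dpm dpnsdaemon srmv1 srmv2.2 rfiod dpm-gsiftp; do
      service $svc status
  done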

DPM in a nutshell: Description · Installation · DPM as a service · Troubleshooting · Documentation / support

Before installing
Five questions to answer before installation:
– For each VO, what is the expected load?
   Does the DPM need to be installed on a separate machine? (Yes, this is recommended!)
– How many disk servers do I need?
   Disk servers can easily be added or removed later
– Which operating system?
– Which file system type?
– At my site, can I open the required ports? (see the firewall sketch below)
   5010 (Name Server)
   5015 (DPM server)
   8443 (srmv1)
   8446 (srmv2.2)
   5001 (rfio)
   rfio data port range
   2811 (DPM GridFTP control port)
   DPM GridFTP data port range
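If the ports have to be opened by hand, a minimal iptables sketch for the head node could look like this (the fixed ports come from the list above; the rfio and GridFTP data port ranges are site-specific and deliberately left out – add the ranges configured at your site):

  # Open the fixed DPM service ports; data port ranges for rfio and
  # GridFTP must be added according to the site configuration
  for port in 5010 5015 8443 8446 5001 2811; do
      iptables -A INPUT -p tcp --dport $port -j ACCEPT
  done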

Operating System
RPMs provided for:
– SLC4 and SL5
Packages provided for:
– Debian 5 and OpenSolaris
Builds on:
– Mac OS X

File Systems
Dedicated file systems (VERY IMPORTANT – see the sketch below)
– DPM cannot work properly if each file system is not dedicated:
   Reported space will be incorrect
   Thus space reservation, garbage collection, etc. will not work as expected
Supported file systems?
– ext2: not recommended
– ext3, xfs: OK
– ReiserFS, ext4: ?
NFS-mounted file systems?
– Problems seen:
   DPM sometimes hangs
   File corruption
   Bottlenecks
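Setting up a dedicated file system for a pool might look like this (the device name and mount point are illustrative; xfs is one of the types listed as OK above):

  # Create and mount a file system used exclusively by the DPM
  mkfs.xfs /dev/sdb1
  mkdir -p /data
  mount /dev/sdb1 /data
  echo '/dev/sdb1 /data xfs defaults 0 2' >> /etc/fstab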

Which File System type?
From Greig A. Cowan (HEPIX 2006):
– xfs seems to perform better
– But problems have been seen with xfs and kernel 2.6…

YAIM
YAIM configuration:
– VO_<vo-name>_DEFAULT and VO_<vo-name>_STORAGE_DIR are not used for DPM
On the DPM server:
  ./install_node site-info.def glite-SE_dpm_mysql
  ./configure_node site-info.def glite-SE_dpm_mysql
On each disk server (except the DPM server, in the current distribution):
  ./install_node site-info.def glite-SE_dpm_disk
  ./configure_node site-info.def glite-SE_dpm_disk
Relevant site-info.def variables:
– $MY_DOMAIN
– $DPM_HOST, $DPM_DB_HOST, $DPM_DB_USER, $DPM_DB_PASSWORD, $MYSQL_PASSWORD
– $DPMPOOL
– $DPM_FILESYSTEMS, e.g. "$DPM_HOST:/data01 disk1.$MY_DOMAIN:/data02"
– $DPMFSIZE: default amount of space reserved for a file – check the VOs' typical file size
– $VOS
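As an illustration, the DPM-related part of site-info.def might look like this (only the variable names come from the slide; every value is a placeholder to adapt to your site):

  # site-info.def excerpt - illustrative values only
  MY_DOMAIN=example.org
  DPM_HOST=dpm01.$MY_DOMAIN
  DPM_DB_HOST=$DPM_HOST
  DPM_DB_USER=dpmmgr
  DPM_DB_PASSWORD=change_me
  MYSQL_PASSWORD=change_me_too
  DPMPOOL=TutorialPool
  DPM_FILESYSTEMS="$DPM_HOST:/data01 disk1.$MY_DOMAIN:/data02"
  DPMFSIZE=200M
  VOS="dteam atlas"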

YAIM installation
YAIM installs:
– On $DPM_HOST:
   DPM server
   DPM Name Server
   SRM servers (srmv1 & srmv2.2)
– On each disk server specified in $DPM_FILESYSTEMS:
   RFIO server
   DPM-enabled GridFTP server
YAIM creates:
– One pool including all the file systems specified
The only configuration files are:
– /opt/lcg/etc/DPMCONFIG: DPM DB connection string
– /opt/lcg/etc/NSCONFIG: DPNS DB connection string
– /etc/shift.conf: TRUST rules for the different servers/disk servers

How does it look?
DPM pool:
  export DPM_HOST=dpm01.cern.ch
  export DPNS_HOST=dpm01.cern.ch
  dpm-qryconf
  POOL TutorialPool DEFSIZE …M GC_START_THRESH 5 GC_STOP_THRESH 15 DEFPINTIME 0 PUT_RETENP … FSS_POLICY maxfreespace GC_POLICY lru RS_POLICY fifo GID 0 S_TYPE -
  (GC_START_THRESH 5: cleanup starts when 5% space is left; GC_STOP_THRESH 15: cleanup stops when 15% space is left)
  CAPACITY 52.77G FREE 43.79G (83.0%)
  dpm01.cern.ch /data CAPACITY 17.27G FREE 14.24G (82.4%)
  disk01.cern.ch /storage CAPACITY 17.75G FREE 14.78G (83.3%)
  disk01.cern.ch /data CAPACITY 17.75G FREE 14.78G (83.3%)
Name Server:
  export DPNS_HOST=dpm01.cern.ch
  dpns-ls -l /dpm/cern.ch/home
  drwxrwxr-x 0 root May 22 14:43 atlas
  drwxrwxr-x 2 root May 22 14:53 dteam
  drwxrwxr-x 0 root May 22 14:43 geant4
  drwxrwxr-x 0 root May 22 14:43 lhcb
  (cern.ch: your domain; the directories correspond to the supported VOs; groups are virtual gids)

Test installation (examples)
lcg_utils (from a UI):
– If the DPM is not in the site BDII yet:
  export LCG_GFAL_INFOSYS=dpm01.cern.ch:2135
  lcg-cr -v --vo dteam -d dpm01.cern.ch file:/path/to/file
– Otherwise:
  export LCG_GFAL_INFOSYS=my_bdii.cern.ch:2170
  lcg-cr -v --vo dteam -d dpm01.cern.ch file:/path/to/file
rfio (from a UI):
  export LCG_RFIO_TYPE=dpm
  export DPNS_HOST=dpm01.cern.ch
  export DPM_HOST=dpm01.cern.ch
  rfdir /dpm/cern.ch/home/dteam/myVO

Test installation (examples)
srmcp (from a UI):
  /opt/d-cache/srm/bin/srmcp file:////path/to/file srm://dpm01.cern.ch:8443/dpm/cern.ch/home/dteam/my_dir/my_file
  (8443 is the SRM server port)
globus-url-copy (from a UI):
  lcg-gt srm://dpm01.cern.ch/dpm/cern.ch/home/dteam/generated/…/file5b517db2-30dc-4df0-9ac9-0e30433e10da gsiftp
  TURL returned:
  gsiftp://disk01.cern.ch/disk01.cern.ch:/data/dteam/…/file5b517db2-30dc-4df0-9ac9-0e30433e10da.1.0
  globus-url-copy file:/path/to/file gsiftp://disk01.cern.ch/disk01.cern.ch:/data/dteam/…/file5b517db2-30dc-4df0-9ac9-0e30433e10da.1.0

DPM in a nutshell: Description · Installation · DPM as a service (Administration, Monitoring) · Troubleshooting · Documentation / support

Admin Tasks (1)
Modify configuration:
  dpm-modifypool --poolname myPool --gc_start_thresh 5 --gc_stop_thresh 10
  dpm-modifyfs --server disk01.cern.ch --fs /data --st RDONLY
Add a disk server:
  ./install_node site-info.def glite-SE_dpm_disk
  ./configure_node site-info.def glite-SE_dpm_disk
 Add it to the TRUST rules in the DPM server's /etc/shift.conf file (see the sketch below)
Remove a disk server:
  dpm-drain --server disk02.cern.ch --fs /data
  dpm-rmfs --server disk02.cern.ch --fs /data
 Remove it from the TRUST rules in the DPM server's /etc/shift.conf file
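The TRUST entries in /etc/shift.conf might look like this (a sketch assuming the usual shift.conf TRUST syntax; the hostnames are illustrative and the exact set of required lines should be checked against the DPM Admin Guide):

  # /etc/shift.conf on the DPM head node - trust the disk servers
  DPM   TRUST  disk01.cern.ch disk02.cern.ch
  DPNS  TRUST  disk01.cern.ch disk02.cern.ch
  RFIOD TRUST  disk01.cern.ch disk02.cern.ch
  RFIOD WTRUST disk01.cern.ch disk02.cern.ch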

Admin Tasks (2)
Load balancing
– The DPM automatically round-robins between file systems
Example
– disk01: one 1 TB file system
– disk02: very fast, one 5 TB file system
– Solution 1: one file system per disk server
   A file will be stored on either disk with equal probability, as long as space is left
– Solution 2: one file system on disk01, two file systems on disk02
   A file will more often end up on disk02, which is better from the space point of view, but could put too much load on disk02

Admin Tasks (3)
Replicate
– Users and admins can do it
– Useful if a file is often accessed
  dpm-replicate --s_type P /dpm/cern.ch/home/dteam/my_file
Drain
– Sets the pool/file system as READ-ONLY
– Lets ongoing operations finish
– Replicates to another pool/file system:
   Pinned files
   Permanent files
  dpm-drain --poolname myPool
  dpm-drain --server disk01.cern.ch
  dpm-drain --server disk01.cern.ch --fs /data
– The emptied pool/file system can then be removed physically

Admin Tasks (4)
Remove Name Server entries for lost files
– Use rfrm (as user or root):
  export DPNS_HOST=dpm01.cern.ch
  export DPM_HOST=dpm01.cern.ch
  rfrm /dpm/cern.ch/home/dteam/my_dir/my_file
– Removes:
   Files on disk
   Replicas and Logical File Names in the DPM Name Server

Quotas?
Dedicate a pool to a VO
– By default, a pool can be used by all supported groups/VOs
– But a pool can be restricted:
  dpm-modifypool --poolname VOpool --def_filesize 200M --gid VO_gid
  dpm-modifypool --poolname VOpool --def_filesize 200M --group VO_group_name
– Hard limit:
   What to do if the pool is full?
Quotas (not implemented yet)
– Maximum space allowed per group/VO
– Softer limit:
   Not bound to pools / physical file systems
Dynamic space reservation
– Defined by the user on request
– Maximum value configurable
– The maximum value for admins could be higher than for standard users

Monitoring
What to monitor?
– Are all processes running?
– Are all DPNS threads busy? (see LFC)
– Are all DPM threads busy?
   20 threads for fast operations
   20 threads for slow operations (get, put, copy)
   Note: there is also one garbage collector thread per disk pool defined, plus one thread that removes expired put requests
– RFIO/GridFTP: too many transfers to the same file system?
– Number of requests (in dpm_db) per week/month? (monitor these separately; see the sketch below)
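To track how the request load evolves, one option is to count rows in the request database at regular intervals and compare week over week (a sketch; the table name dpm_req and the account are assumptions about the standard schema – verify them on your instance before relying on the numbers):

  # Total requests currently recorded in dpm_db
  # ("dpm_req" is the assumed request table name)
  mysql -u dpmmgr -p dpm_db -e "SELECT COUNT(*) FROM dpm_req;"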

Other features
Request DB cleanup
– The request DB only grows:
   Example: two more rows in the dpm_db database every time a file is opened
– Old requests can now be removed automatically
– Question: how long should requests be kept?
   1 week?
   1 month? (holidays…)

In the plan
– User/group banning
– Quotas
– Limiting the number of transfers to the same file system
– Automated database backups for MySQL:
   MySQL replication
   Save/back up the data from the slave in a safe place (a Tier1?)

DPM in a nutshell: Description · Installation · DPM as a service · Troubleshooting · Documentation / support

Log Files
– DPM server: /var/log/dpm/log
– DPM Name Server: /var/log/dpns/log
– SRM servers: /var/log/srmv1/log and /var/log/srmv2.2/log
– RFIO server: /var/log/rfiod/log
– DPM-enabled GridFTP: /var/log/dpm-gsiftp/gridftp.log and /var/log/dpm-gsiftp/dpm-gsiftp.log
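When chasing a problem, a reasonable first pass is to scan all of these logs for error-like lines around the time of the failure (the paths come from the list above; the "error" pattern is a loose assumption, since DPM log messages vary):

  # Show the most recent error-like lines across the DPM logs
  grep -i error /var/log/dpm/log /var/log/dpns/log \
      /var/log/srmv1/log /var/log/srmv2.2/log /var/log/rfiod/log \
      2>/dev/null | tail -n 50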

Permissions problem
DPM Name Server
– Based on the same code as the LFC:
   Virtual uids/gids
   /opt/lcg/etc/lcgdm-mapfile (used with grid-proxy-init or a simple voms-proxy-init)
   VOMS support
– More details: see the LFC tutorial

DPM in a nutshell: Description · Installation · DPM as a service · Troubleshooting · Documentation / support

Another problem?
– Main LFC/DPM documentation page
– DPM Admin Guide
– Troubleshooting page

Still no clue?
Contact GGUS
– Your ROC will help
– If needed, DPM experts will help

Questions?
Jean-Philippe Baud

Example: SRM put processing (1)–(6) – diagram sequence stepping through the handling of an SRM put request