dCache at Tier3
Joe Urbanski, University of Chicago
US ATLAS Tier3/Tier2 Meeting, Bloomington, June 20, 2007


2 What is dCache?
From "A system for storing and retrieving huge amounts of data, distributed among a large number of heterogenous server nodes, under a single virtual filesystem tree"

3 Features & Advantages
What can dCache do?
- Capable of combining hundreds of commodity disk servers into a single petabyte-scale data store
- Allows several copies of a single file for distributed data access
- Has internal load balancing using cost metrics and transfers between the site's pools
- Has automatic file replication on high load ("hotspot detection")

4 What does dCache look like?
The single virtual filesystem is provided by pnfs (Perfectly Normal File System).
It is mounted and unmounted with mount, umount, and /etc/fstab, much like traditional NFS.
pnfs is "POSIX-like": the mount exposes the namespace, while file contents are read and written through the dCache clients.
- can use: ls, mkdir, find
- cannot use: cp, md5sum

5 Clients
How do I access files in dCache?
- dCap: dCache's native method, uses dccp. Easiest for local transfers.
- gridFTP: via globus-url-copy with a valid X.509 proxy
- SRM: via srmcp and srmls with a valid X.509 proxy

6 Architecture
What about the backend?
- Admin nodes: Provide basic admin services. One or more per install.
- Door nodes: Provide I/O access via SRM or GridFTP. One or more; may reside on an admin node in small installs.
- pnfs node: Provides the unified namespace. Only one per install.
- Pool nodes: Provide storage. Can be installed alongside any other type of node.

7 UC Tier3

8 UC Tier3 (cont'd)
3 Admin nodes:
- uct3-edge1: gridFTP, dCap
- uct3-edge2: pnfs
- uct3-edge3: admin, SRM
25 Pool nodes:
- Currently: 22 compute nodes x 1.9TB + 3 admin nodes x 1.9TB = 47TB
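The pool capacity figure above can be reproduced with a few lines of shell. This is only a sketch: the node counts and the 1.9TB per node come from the slide, while the function name pool_capacity_tb is made up for illustration. The exact product is 47.5, which the slide reports as 47TB.

```shell
# Sketch: raw pool capacity from node counts and per-node disk size.
# pool_capacity_tb is an illustrative name, not part of dCache or the VDT installer.
pool_capacity_tb() {
    # $1 = compute nodes, $2 = admin nodes, $3 = TB of pool space per node
    awk -v c="$1" -v a="$2" -v tb="$3" 'BEGIN { printf "%.1f\n", (c + a) * tb }'
}

# Numbers from the slide: 22 compute nodes + 3 admin nodes, 1.9TB each.
pool_capacity_tb 22 3 1.9
```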

9 Installing dCache
The VDT Installer
- Greatly simplifies and speeds up the install process:
  - Automatically configures which services to run, and on which nodes to run them.
  - Installs the needed rpms.
  - Configures dCache and its postgresql databases.
- The latest VDT installer version is v1.1.8. Not to be confused with the dCache version, whose latest is v1.7.0.

10 Running the VDT Installer Download the latest vdt tarball, untar and cd into the install directory. Run to generate site-info.def file

11
[root@uct3-edge2 install]./
How many admin nodes (non-pool and non-door nodes) do you have? 2
The recommended services for node 1 are:
lmDomain poolManager adminDoor httpDomain utilityDomain gplazmaService infoProvider srm
The recommended services for node 2 are:
pnfsManager dirDomain
Enter the FQDN for the node 1:
Which services do you wish to run on node (Enter for defaults)?
Enter the FQDN for the node 2:
Which services do you wish to run on node (Enter for defaults)?
How many door nodes do you have? 1
Enter the FQDN of door number 1:

12 (cont'd)
Enter the private network that the pools are in. If this does not apply, just press enter to skip:
Enter the number of dcap doors to run on each door node [default 1]: 1
Enter a pool FQDN name(Press Enter when all are done):
Enter the first storage location (Press Enter when all are done)): /dcache
Enter another storage location (Press Enter when all are done)):
--SNIP--
Enter another pool FQDN name(Press Enter when all are done):
Enter the first storage location (Press Enter when all are done)): /dcache
Enter another storage location (Press Enter when all are done)):
Enter another pool FQDN name(Press Enter when all are done):
Created site-info.def file.
[root@uct2-mgt install]

13 Running the VDT Installer (cont'd)
Copy the vdt tarball and the site-info.def file to all nodes.
Run './ -d' for a dry run. (This will be very verbose.)
If successful, run the actual install.
Start dCache services in the following order:
1. pnfs node core services
2. other admin nodes core services
3. all dcache pool services
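With 25 or more nodes it is easy to get the start order wrong, so one option is to generate the order from a node list. The sketch below only prints the order and starts nothing. The "role host" input format, the function name startup_order, and the pool host uct3-c001 are assumptions made for illustration; the edge node names come from the UC Tier3 slide.

```shell
# Sketch: print nodes in the start order from the slide:
# pnfs node first, then the other admin nodes, then the pools.
# Input on stdin: one "<role> <host>" pair per line, role = pnfs | admin | pool.
startup_order() {
    nodes=$(cat)
    for role in pnfs admin pool; do
        printf '%s\n' "$nodes" | awk -v r="$role" '$1 == r { print r ": " $2 }'
    done
}

# uct3-c001 is a hypothetical pool host name.
printf '%s\n' \
    'pool uct3-c001' \
    'admin uct3-edge3' \
    'pnfs uct3-edge2' \
    'admin uct3-edge1' | startup_order
```

The output can then drive whatever mechanism the site uses to start services on each host.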

14 Verifying the install Check the status webpage: 

15 Verifying the install (cont'd)
Test the doors:
- dCap: use dccp
- gridFTP: use globus-url-copy
- SRM: use srmcp
globus-url-copy -dbg \
file:////tmp/test.file \
gsi
dccp -d999 /tmp/test.file \
/pnfs/
srmcp -debug file:////tmp/test.file \
srm://
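To make the three door tests repeatable, the commands can be generated from a host name and a target directory. The following is a dry-run sketch that only prints the commands. The client flags are the ones shown on the slide; the host door.example.org, the directory /pnfs/example.org/data, the function name, and the ports 2811 (gridFTP) and 8443 (SRM) are assumptions, so check them against your own site configuration.

```shell
# Sketch: print (do not run) one test transfer per door.
# Host, ports, and pnfs directory are placeholders, not values from the slides.
door_test_commands() {
    host=$1       # door host (hypothetical in the example call below)
    pnfs_dir=$2   # site-specific directory under /pnfs
    src=/tmp/test.file
    echo "dccp -d999 $src $pnfs_dir/test.dcap"
    echo "globus-url-copy -dbg file:///$src gsiftp://$host:2811$pnfs_dir/test.gridftp"
    echo "srmcp -debug file:///$src srm://$host:8443$pnfs_dir/test.srm"
}

door_test_commands door.example.org /pnfs/example.org/data
```

The gridFTP and SRM commands still need a valid X.509 proxy when actually run.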

16 Troubleshooting Always check the status page!

17 Troubleshooting (cont'd)
Check the logs:
- Most dCache cells: /var/log/*Domain.log. Each cell, or service, will generate a log with the appropriate name.
- SRM: /opt/d-cache/libexec/apache-tomcat-5.5.20/logs/catalina.out
- pnfs: /var/log/pnfsd.log, /var/log/pmountd.log, /var/log/dbserver.log
dCache, The Book:
Ask for help:
- support:
- OSG-storage:
- OSG Storage Activities Meeting: Every Thursday utes
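A quick first pass over the cell logs is to list which ones mention an error at all. This is a minimal sketch: it assumes the /var/log/*Domain.log naming from the slide and that a case-insensitive match on "error" is a useful filter, which may be too broad or too narrow for your logs. The function name logs_with_errors is illustrative.

```shell
# Sketch: list the *Domain.log files in a directory that contain "error"
# (case-insensitive). Defaults to /var/log, as on the slide.
logs_with_errors() {
    dir=${1:-/var/log}
    grep -il 'error' "$dir"/*Domain.log 2>/dev/null
}

# Usage, on a dCache node:
#   logs_with_errors            # scans /var/log
#   logs_with_errors /some/dir  # scans another directory
```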

18 UC Tier3 Pool Usage

19 Wormholes
pnfs provides a way to distribute configuration information to all directories in the pnfs filesystem via 'wormholes'.
A wormhole is accessed via the '.(config)()' subdirectory, which acts like a symlink to /pnfs/fs/admin/etc/config/
By default, reading the files there (such as dCache/dcache.conf) is disabled, so you'll need to enable access to them.
If that file cannot be read, pnfs won't be able to find the dcap door, and dccp transfers won't work!
WARNING: enabling access to a file empties it, so note its contents first and write them back afterwards!

20 Wormholes (cont'd)
[root@uct3-edge2 dcache-upgrade-v1.1.4]# cd \
/pnfs/fs/admin/etc/config/dCache
[root@uct3-edge2 dCache]# cat dcache.conf
[root@uct3-edge2 dCache]# touch ".(fset)(dcache.conf)(io)(on)"
[root@uct3-edge2 dCache]# echo "" > \
/pnfs/fs/admin/etc/config/dCache/dcache.conf

21 Authorization with gPLAZMA
Grid-aware PLuggable AuthoriZation MAnagement
Works in a manner similar to PAM
Four available methods:
- kpwd: 'legacy method'. A flat file maps DNs to a local username, then username to uid, gid, and rootpath.
- grid-mapfile: uses a grid-mapfile, then a second file, storage-authzdb, to map username to uid, gid, and rootpath.
- gplazmalite-vorole-mapping: concatenates DN + Role, then provides uid, gid, and rootpath via storage-authzdb.
- saml-vo-mapping: uses GUMS to map to a username; uid, gid, and rootpath may come from GUMS or from storage-authzdb.
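The first stage of the grid-mapfile method, DN to local username, can be sketched with awk. This assumes the usual grid-mapfile layout of a quoted DN followed by a username on each line; the DN, the sample username, and the function name lookup_user are hypothetical. The second stage (username to uid, gid, and rootpath via storage-authzdb) is not shown.

```shell
# Sketch: look up the local username for a DN in a grid-mapfile.
# Assumes lines of the form: "<DN>" <username>
lookup_user() {
    # $1 = certificate DN, $2 = path to the grid-mapfile
    awk -F'"' -v dn="$1" '
        $2 == dn { gsub(/^[ \t]+|[ \t]+$/, "", $3); print $3; exit }
    ' "$2"
}

# Example with a throwaway mapfile; the DN and username are made up.
map=$(mktemp)
printf '%s\n' '"/DC=org/DC=example/OU=People/CN=Jane Doe" usatlas1' > "$map"
lookup_user "/DC=org/DC=example/OU=People/CN=Jane Doe" "$map"
rm -f "$map"
```

A DN with no entry prints nothing, which mirrors the user being unmapped at this stage.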

22 The Admin Interface
dCache provides a shell-like interface accessed via ssh.
[root@uct3-edge3 config]# ssh -c blowfish -p 22223 -1 admin@uct3-edge3
dCache Admin (VII) (user=admin)
(local) admin > cd uct3-edge2_1
(uct3-edge2_1) admin > pnfs register
(uct3-edge2_1) admin > ..
(local) admin > cd uct3-edge3_1
(uct3-edge3_1) admin > pnfs register
(uct3-edge3_1) admin > ..

23 Autovacuuming Postgres
The pnfs database files can grow very large and fill up your filesystem; autovacuum keeps them in check by reclaiming space from dead rows.
To turn autovacuum on, uncomment all the entries in the AUTOVACUUM PARAMETERS section and change 'autovacuum = off' to 'on' in /var/lib/pgsql/data/postgresql.conf, then restart postgresql.
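The main switch can be flipped with sed instead of by hand. This sketch only handles the 'autovacuum = off' line, commented out or not, and writes the result to stdout; the remaining entries in the AUTOVACUUM PARAMETERS section still need to be uncommented as described above. The function name enable_autovacuum is illustrative, and it is safest to try it on a copy of postgresql.conf first.

```shell
# Sketch: print a postgresql.conf with "autovacuum = off" (commented or not)
# rewritten to "autovacuum = on". Other autovacuum_* entries are left untouched.
enable_autovacuum() {
    # $1 = path to a postgresql.conf (preferably a copy)
    sed 's/^#*[[:space:]]*autovacuum[[:space:]]*=[[:space:]]*off/autovacuum = on/' "$1"
}

# Usage:
#   cp /var/lib/pgsql/data/postgresql.conf /tmp/postgresql.conf
#   enable_autovacuum /tmp/postgresql.conf > /tmp/postgresql.conf.new
```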

24 Useful URLs
- dCache homepage
- VDT Installer homepage
- VDT Installer HOWTO
- OSG Storage Activities Meeting
- OSG dCache overview
- USATLAS dCache at BNL

