Martina Franca (TA), 07 November 2007 - Installation, configuration, testing and troubleshooting of a Storage Element.

Presentation transcript:

Installation, configuration, testing and troubleshooting of a Storage Element (SE) [dCache]
Giacinto Donvito, INFN-Bari
First INFN training course for Grid site administrators (I Corso di formazione INFN per amministratori di siti Grid)
Martina Franca (TA), 07 November 2007

Outline
– Introduction to SRM
– Introduction to dCache
– How it works: what is happening under the hood
– Installation of dCache, in theory:
  – layout of a standard installation
  – layout of a complex installation
– Installation of dCache, in practice:
  – installing with YAIM
  – installing by hand
– dCache: news, future, issues, etc.
– Conclusions

SRM Overview
SRM ("Storage Resource Manager") is a control protocol, implemented as a web service (over GSI HTTP).
What it does:
– asks the storage system to make a file ready for upload/download
– basic metadata (size, checksum, ...)
– many components are optional
What it does not do:
– data transfer (although it can drive third-party transfers, as in the sketch below)
– access control and permissions (although some implementations have already tried to provide them)
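Since SRM is only the control channel, a copy between two SEs is negotiated via SRM while the bytes move over a transfer protocol such as GridFTP. A minimal sketch with the dCache srmcp client (the same client used in the testing slide later on; hostnames and paths are placeholders):

  # third-party copy between two SEs: SRM negotiates, GridFTP moves the data
  srmcp -debug=true \
    srm://se1.example.org:8443/pnfs/example.org/data/myvo/file1 \
    srm://se2.example.org:8443/pnfs/example.org/data/myvo/file1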

SRM functionalities
Features from SRM v1.1:
– get
– put
– copy
– getFileMetaData
– getRequestStatus
– getProtocols
– advisoryDelete
Features from SRM v2.2:
– file types ("storage classes")
– space reservation
– permission functions
– directory functions
– data transfer control functions
– relative paths
– query of supported protocols
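Clients of this era default to SRM v1.1, so the v2.2 interface usually has to be requested explicitly. A hedged example with srmcp (the -srm_protocol_version option is the one documented for the dCache client; verify it against your client version; host and path are placeholders):

  # upload via the SRM v2.2 interface
  srmcp -srm_protocol_version=2 -debug=true \
    file:////tmp/test_file \
    srm://se1.example.org:8443/pnfs/example.org/data/myvo/test1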

SRM v2.2 storage classes (data type vs. storage type):
– Tape: 1, Disk: 0 ("T1D0", custodial-nearline)
– Tape: 1, Disk: 1 ("T1D1", custodial-online)
– Tape: 0, Disk: 1 ("T0D1", replica-online)

dCache overview
dCache is developed in a large collaboration between DESY and FNAL (plus some other minor contributions).
Goals:
– to build a distributed storage system that can use cheap disk servers to achieve high performance and high availability
– to provide an abstraction of the whole disk space under a unique NFS-like file system (for metadata operations only)
– to optionally support a site's own MSS system: only 2 or 3 scripts (put/get/remove) are needed, as sketched below
– to provide a system that scales to:
  – hundreds of TB of disk cache
  – hundreds of pool nodes
  – hundreds of TB per day delivered to clients
File access:
– local and remote access (POSIX-like) with many protocols (dcap, ftp), both with and without authentication (GSI or Kerberos)
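A minimal sketch of one of those MSS hook scripts, assuming the conventional dCache HSM callout (the pool invokes the script roughly as "script put|get <pnfsId> <localFile> -si=<storageInfo>"; check the dCache Book for the exact signature of your version, and note that the MSS path below is hypothetical):

  #!/bin/sh
  # dCache HSM callout sketch: copy files to/from a mounted MSS area
  ACTION=$1; PNFSID=$2; FILE=$3
  TAPE_STORE=/mss/store                     # hypothetical MSS-mounted path

  case "$ACTION" in
    put)    cp "$FILE" "$TAPE_STORE/$PNFSID" ;;   # archive a new file
    get)    cp "$TAPE_STORE/$PNFSID" "$FILE" ;;   # restore on a cache miss
    remove) rm -f "$TAPE_STORE/$PNFSID" ;;        # delete from the MSS
    *)      echo "unknown action: $ACTION" >&2; exit 1 ;;
  esac
  exit $?                                   # 0 tells dCache the operation succeeded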

dCache overview (2)
Access management: access priority and load balancing are obtained through the use of different queues.
– Multiple copies of a file can be spread over different pools to improve performance and high availability: automatic (or manual) pool-to-pool transfers.
– Dynamic "match-making" between pools, according to parameters chosen by the administrator (based on disk space, load, network, type of access, etc.).
– Different types of "access point" (doors) can be split onto different nodes.
– All the files in a pool can be moved away to put it into a "scheduled downtime", or you can choose exactly which files to move and where.
– The "central services" too can be split onto different nodes to improve scalability.
Most of these operations are driven from the admin interface, as in the sketch below.
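A sketch of a typical admin-interface session (port 22223 and the blowfish cipher are the defaults of this era; the well-known default password should of course be changed; pool names are placeholders):

  ssh -c blowfish -p 22223 admin@my-admin.gs.ba.infn.it

  # inside the admin shell:
  #   cd PoolManager
  #   psu ls pool            # list the configured pools
  #   cm ls -r               # cost-module view of pool load
  #   ..                     # back to the top level
  #   cd pool1_01            # enter a pool cell
  #   rep ls                 # list the replicas this pool holds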

dCache overview (3)
Pool management:
– pools can be grouped ("storage classes": read, write, cache; per VO, per user or per use case)
– this can be useful for quota management
Web monitoring, a statistics module (with rate plots) and SRM monitoring are included (a quick check follows below).
– The SRM layer can be used as stand-alone software (on a standard Unix file system).
– You can choose how much space a dCache pool uses within a partition (many "services" can share the same partition).
– A Java GUI is available for administration.
– The xrootd protocol is also supported.
– The accounting system is flat-file or DB based (not user friendly, but it carries a lot of information), including space used per VO.
– WN disks (or other "non-reliable" space) can be used to improve performance for local access.
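The web monitoring normally runs in the httpd cell on the admin node; port 2288 is the usual default of this era (an assumption to verify against your setup). A quick liveness check:

  # fetch the front page of the dCache monitoring web interface
  curl -s http://my-admin.gs.ba.infn.it:2288/ | head -20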

dCache overview (4) [architecture diagram; not reproduced in the transcript]

dCache: a file read (what is happening under the hood)
[A sequence of animation slides; the diagrams are not in the transcript. In outline, describing standard dCache behaviour rather than the lost figures: the client contacts a door, the door resolves the file through the namespace (PNFS), the PoolManager selects a pool that holds (or stages) the file, and the client is redirected to a mover on that pool, which serves the data directly.]

dCache: advanced installation layout
– Admin node: dCache core services
– PNFS server node: PNFS server
– DB server node: Postgres DB
– Pool node 1: SRM door, gsidcap door, GridFTP door, pool service (read)
– Pool node 2: SRM door, gsidcap door, GridFTP door, pool service (write)
– Pool node 3: SRM door, gsidcap door, GridFTP door, pool service (xrootd)
– dCap door and xrootd door behind a DNS alias
A node_config sketch for such a pool node follows.
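A hedged sketch of the /opt/d-cache/etc/node_config entries for one of the pool nodes above (the key names follow the node_config.template of the 1.7/1.8 era from memory; verify them against the template shipped with your version):

  # /opt/d-cache/etc/node_config on a pool node that also runs doors
  NODE_TYPE=custom                      # pick the services by hand
  SERVER_ID=gs.ba.infn.it               # the site-wide dCache domain
  ADMIN_NODE=my-admin.gs.ba.infn.it     # where the core services run
  GRIDFTP=yes                           # GridFTP door on this node
  GSIDCAP=yes                           # gsidcap door on this node
  SRM=no                                # SRM door runs elsewhere
  # the pools themselves are declared in /opt/d-cache/config/<hostname>.poollist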

YAIL: Yet Another Installation Layout [diagram only; not reproduced in the transcript]

Best practices
– The admin node must be "resilient".
– The PNFS databases ("admin", "data1", ...) are crucial: losing one of these DBs means losing all the files in the corresponding directory tree.
  – It is better to create one PNFS DB per VO or per type of usage: better performance and scalability (a sketch follows below).
– All the other DBs are not crucial.
– Use the latest Postgres version: it is more stable and performs better.
– It is better to have many small pools.
– Doors should always be replicated.
– Fully automatic installation with YAIM should be avoided: it is better to install and configure the system manually and then run "configure_node" with YAIM.
– If the service is heavily loaded, the PNFS server can be split off onto a separate machine (usually not needed for a Tier-2 site).
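A sketch of creating a per-VO PNFS database, following the PNFS chapter of the dCache Book (the paths, the VO name and the database ID are examples; run this on the PNFS server):

  . /usr/etc/pnfsSetup                  # defines $pnfs
  PATH=$PATH:$pnfs/tools

  touch /opt/pnfsdb/pnfs/databases/atlas
  mdb create atlas /opt/pnfsdb/pnfs/databases/atlas
  mdb update                            # activate the new database
  mdb show                              # note the new database ID, e.g. 5

  # attach a directory backed by database 5 under the data tree:
  cd /pnfs/gs.ba.infn.it/data
  mkdir '.(5)(atlas)'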

Reference
– Site:
– Installation instructions:
– YAIM installation instructions:
– Main documentation:

Prerequisites
– SLC3 for the admin node (SLC4 already available, but maybe not as stable).
– Few problems with any other OS on pool nodes.
– Java >= 1.4 for the 1.7.x series; Java >= 1.5 for the 1.8.x series.
– Host certificates for all pool nodes.
– APT repository (not yet for 1.8.x):
  – echo 'rpm / ' > /etc/apt/sources.list.d/desy_dcache.list
– A lot of patience, and a bit of perseverance.
A quick pre-flight check is sketched below.
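A simple sketch of checking the prerequisites on a node before installing (the paths are the conventional ones):

  java -version 2>&1 | head -1          # >= 1.4 for 1.7.x, >= 1.5 for 1.8.x
  ls -l /etc/grid-security/hostcert.pem /etc/grid-security/hostkey.pem
  cat /etc/redhat-release               # SLC3 expected on the admin node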

YAIM installation
YAIM installation instructions:
Needed parameters:
– MY_DOMAIN=gs.ba.infn.it
– JAVA_LOCATION="/usr/java/j2sdk1.5.x_x"
– DCACHE_ADMIN="my-admin.gs.ba.infn.it"
– DCACHE_POOLS="dcache.desy.de:7:/dCachePools/pool1 dcache.desy.de:7:/dCachePools/pool2"   # the pools: hostname:size:path
– DCACHE_DOOR_SRM="my-admin.gs.ba.infn.it"
– DCACHE_DOOR_GSIFTP="my-admin.gs.ba.infn.it"
– DCACHE_DOOR_GSIDCAP="my-admin.gs.ba.infn.it"
– DCACHE_DOOR_DCAP="my-admin.gs.ba.infn.it"
– RESET_DCACHE_CONFIGURATION=yes
– RESET_DCACHE_PNFS=yes
– RESET_DCACHE_RDBMS=yes
– VOS="ops dteam"
Starting from dCache 1.8.x, only Java 1.5.x is supported.

YAIM installation
For admin nodes:
– /opt/glite/yaim/scripts/install_node ~/site-info.def glite-SE_dcache_admin_postgres
– /opt/glite/yaim/scripts/configure_node ~/site-info.def glite-SE_dcache_admin_postgres
For pool nodes:
– /opt/glite/yaim/scripts/install_node ~/site-info.def glite-SE_dcache_pool
– /opt/glite/yaim/scripts/configure_node ~/site-info.def glite-SE_dcache_pool

Manual installation
Installation instructions:
– "wget" all the RPMs from:
The subsequent steps are sketched below.
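A hedged sketch of the manual installation steps for the 1.7/1.8 era, from the dCache Book as remembered (the RPM names are indicative; verify against the files you downloaded):

  rpm -ivh pnfs-*.rpm                   # PNFS server (admin/PNFS node only)
  rpm -ivh d-cache-*.rpm                # dCache server RPMs

  # edit /opt/d-cache/etc/node_config (see the layout sketch above),
  # then generate the configuration and start the services:
  /opt/d-cache/install/install.sh
  /opt/d-cache/bin/dcache-core start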

And now... check your installation
Look at:
– dccp -d 3 /tmp/test_file dcap://your-host.gs.ba.infn.it/pnfs/gs.ba.infn.it/data/test1
– srmcp -debug=true file:////tmp/test_file srm://your-host.gs.ba.infn.it:8443/pnfs/gs.ba.infn.it/data/test1
– ls -ltr /var/log/*Domain*.log
– tail -n40 -f /opt/d-cache/libexec/apache-tomcat-*/logs/catalina.out   (only to debug SRM)
– tail -n 30 -f /opt/d-cache/billing/YYYY/MM/billing-YYYY.MM.DD
– tail -n 30 -f /opt/d-cache/billing/YYYY/MM/billing-error-YYYY.MM.DD
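A small helper to get a first impression from the flat-file billing logs referenced above (a simple sketch; it only compares the number of logged transfers against the number of logged errors for today):

  TODAY=$(date +%Y.%m.%d); YM=$(date +%Y/%m)
  wc -l /opt/d-cache/billing/$YM/billing-$TODAY \
        /opt/d-cache/billing/$YM/billing-error-$TODAY 2>/dev/null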

Go on debugging...
– Look at the web monitoring pages.

dCache new release (1)
– Full SRM v2.2 support.
– gPlazma authorization added (for VOMS support); GsiFtp and SRM understand extended proxies.
– Pools prepared to run on Windows XP.
– dCap (client and server) now supports passive connections [the firewall issue is solved].
– Error type "Fatal" added: this allows advanced actions (e-mail, SMS, fire alarm).
– dCap door: improved permission handling.
– FTP door: chmod and rmdir commands added.
– Cost calculation for multiple I/O queues: fast cost prediction was added.
– Files can be automatically replicated on arrival in dCache.
– For pool-to-pool transfers, pool destinations are treated separately from "read" pool selection.
– A set of important parameters can now be defined per "dCache partition".
– SRM monitoring system.
– xrootd protocol (as in 1.7.0) integrated like any other protocol.

dCache new release (2)
– Bugs fixed.
– Support for multiple PNFS servers on different machines.
– dCap: large-file problem fixed; the dCap library always opens local files with O_LARGEFILE.
– srmcp client:
  – reliable srmcp return codes: the return code is 0 only if all individual file transfers succeed; on any failure it is 1
  – several new command-line options added: gss_expected_name, globus_tcp_port_range, streams_num and server_mode

dCache issues
– It is written in Java: CPU and memory issues.
– Configuring the advanced features is not so easy.
– The documentation has improved, but the system is still complex rather than easy.
– Support is on a best-effort basis (the user forum is really helpful).
– The license is free, but not completely open source.

dCache future plans
– New software (Chimera) replacing PNFS will improve performance:
  – Chimera: you may run … or 1.8 with Chimera
– ACLs available in September for testing; ACLs in production by the end of the year (might be sooner).
– The StorageInfoQuotaObserver cell: advanced quota support; quotas will come with Chimera.
– NFSv4.1 is already in very good shape.

Conclusions
dCache is a complex system.
– GOOD:
  – a powerful system
  – many advanced functionalities, a complete set of functionalities
  – proven scalability (at the Tier-1 level)
  – easily portable to many architectures (it also works on operating systems other than SLC3, e.g. Solaris)
– BAD:
  – Java
  – single points of failure
  – a little bit more difficult to manage (compared with DPM or the Classic SE)
