1
Optimisation of Grid Enabled Storage at Small Sites
Jamie K. Ferguson – University of Glasgow (email – J.Ferguson@physics.gla.ac.uk)
Graeme A. Stewart – University of Glasgow
Greig A. Cowan – University of Edinburgh
2
Introduction
● Typical Tier 2 & purpose of the inbound transfer tests
● Details of the hardware/software configuration for the file transfers
● Analysis of results
3
LHC and the LCG
● LHC – the most powerful instrument ever built in the field of physics
● Generates huge amounts of data every second it is running
● Retention of 10 PB annually, to be processed at sites
● Use case is typically files of size ~GB, many of which are cascaded down to be stored at T2s until analysis jobs process them
4
Typical Tier 2 – Definition
● Limited hardware resources
– (In GridPP) using dCache or dpm as the SRM
– Few (one or two) disk servers
– Few terabytes of RAIDed disk
● Limited manpower
– Not enough time to configure and/or administer a sophisticated storage system
– Ideally want something that just works “out of the box”
5
Importance of Good Write (and Read) Rates
● Experiments desire good in/out rates
– Write is more stressful than read, hence our focus
– Expected data transfer rates (T1 ==> T2) will be directly proportional to the storage at a T2 site
– Few hundred Mbps for small(ish) sites, up to several Gbps for large CMS sites
● The limiting factor could be one of many things
– I know this from recently coordinating 24-hour tests between all 19 of the GridPP T2 member institutes
● We also recorded file transfer failure rates
6
gLite File Transfer Service
● Used FTS to manage transfers
– Easy-to-use file transfer management software
– Uses SURLs for source and destination
– The experiments will also use this software
– Able to set the channel parameters Nf and Ns
– Able to monitor each job, and each transfer within each job
● Pending, Active, Done, Failed, etc.
● A minimal submission/monitoring sketch is given below
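To make the FTS workflow above concrete, here is a minimal sketch of how a single transfer could be submitted and monitored from Python, in the spirit of the filetransfer.py driver used in these tests. The glite-transfer-submit and glite-transfer-status commands are the standard gLite FTS CLI tools, but the SURLs, the polling interval, and the surrounding helper functions are illustrative assumptions rather than the actual test script.

    import subprocess, time

    # Illustrative SURLs only -- the real tests used 30 x 1 GB files on a dpm.
    SOURCE = "srm://source.example.ac.uk/dpm/example.ac.uk/home/dteam/file0001"
    DEST   = "srm://dest.example.ac.uk/dpm/example.ac.uk/home/dteam/file0001"

    def submit(source_surl, dest_surl):
        """Submit one FTS transfer job and return the job identifier."""
        out = subprocess.check_output(
            ["glite-transfer-submit", source_surl, dest_surl])
        return out.decode().strip()      # the CLI prints the job ID on stdout

    def wait_for(job_id, poll_seconds=30):
        """Poll the job until it leaves the Submitted/Pending/Active states."""
        while True:
            state = subprocess.check_output(
                ["glite-transfer-status", job_id]).decode().strip()
            if state not in ("Submitted", "Pending", "Active"):
                return state             # e.g. Done or Failed
            time.sleep(poll_seconds)

    if __name__ == "__main__":
        job = submit(SOURCE, DEST)
        print(job, wait_for(job))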
7
What Variables Were Investigated?
● Destination SRM
– dCache (v1.6.6-5)
– dpm (v1.4.5)
● The underlying filesystem on the destination
– ext2, ext3, jfs, xfs
● Two transfer-channel parameters
– No. of parallel files (Nf)
– No. of GridFTP streams (Ns)
– Example => Nf=5, Ns=3
8
Software Components
● dcap and rfio are the transport layers for dCache and dpm respectively
● Under this software stack is the filesystem itself, e.g. ext2
● Above this stack was the filetransfer.py script
– See http://www.physics.gla.ac.uk/~graeme/scripts/filetransfer#filetransfer
9
Software Components – dpm
● All the daemons of the destination dpm were running on the same machine
● dCache had a similar setup, with everything housed on a single node
10
Hardware Components
● Source was a dpm
– High-performance machine
● Destination was a single node with a dual-core Xeon CPU
● Machines were on the same network
– Connected via a 1 Gbps link which carried negligible other traffic
– No firewall between source and destination
● No iptables loaded
● Destination had three 1.7 TB partitions
– RAID 5
– 64 KB stripe
11
Kernels and Filesystems
● A CERN-contributed rebuild of the standard SL kernel was used to investigate xfs
– This differs from the standard kernel only in the addition of xfs support
– Instructions on how to install the kernel at http://www.gridpp.ac.uk/wiki/XFS_Kernel_Howto
– Necessary RPMs available from ftp://ftp.scientificlinux.org/linux/scientific/305/i386/contrib/RPMS/xfs/
12
Method
● 30 source files, each of size 1 GB, were used
– This size is typical of the LCG files that will be used by LHC experiments
● Both dCache and dpm were used during testing
● Each kernel/filesystem pair was tested – 4 such pairs
● Values of 1, 3, 5, 10 were used for No. of Files and No. of Streams, giving a matrix of 16 test results
● Each test was repeated 4 times to obtain a mean
– Outlying results (~ < 50% of the other results) were retested
● This prevented failures in higher-level components, e.g. FTS, from adversely affecting results
● A sketch of this measurement loop is given below
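The measurement loop can be summarised by the following Python sketch. The run_batch helper is a hypothetical stand-in for the real filetransfer.py/FTS driver (here it just returns a dummy rate so the sketch runs end to end); the filesystem list, the Nf/Ns values, the 4 repeats, and the ~50% outlier-retest rule come from the slide above.

    import statistics

    FILESYSTEMS = ["ext2", "ext3", "jfs", "xfs"]
    NF_VALUES   = [1, 3, 5, 10]   # No. of parallel files on the FTS channel
    NS_VALUES   = [1, 3, 5, 10]   # No. of GridFTP streams per file
    REPEATS     = 4

    def run_batch(fs, nf, ns):
        """Placeholder for one 30 x 1 GB transfer batch; in the real tests this
        was driven by filetransfer.py/FTS and returned the measured rate in Mbps."""
        return 300.0              # dummy value so the sketch is runnable

    def mean_rate(fs, nf, ns):
        rates = [run_batch(fs, nf, ns) for _ in range(REPEATS)]
        mean = statistics.mean(rates)
        # Retest outliers (roughly < 50% of the other results) so that failures
        # in higher-level components such as FTS do not skew the average.
        rates = [run_batch(fs, nf, ns) if r < 0.5 * mean else r for r in rates]
        return statistics.mean(rates)

    results = {(fs, nf, ns): mean_rate(fs, nf, ns)
               for fs in FILESYSTEMS for nf in NF_VALUES for ns in NS_VALUES}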
13
Results – Average Rates
● All results are in Mbps – see the worked conversion below
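For reference, a rate in Mbps is simply the total number of bits moved divided by the wall-clock time of the batch. For the 30 x 1 GB batches used here, the conversion looks like this (the elapsed time is an assumed example, not a measured result):

    FILES      = 30
    FILE_BYTES = 1_000_000_000   # 1 GB files (decimal units assumed)
    ELAPSED_S  = 600.0           # example wall-clock time for one batch (assumption)

    rate_mbps = FILES * FILE_BYTES * 8 / ELAPSED_S / 1_000_000
    print(round(rate_mbps))      # 30 GB moved in 600 s -> 400 Mbps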
14
Average dCache rate vs. Nf
15
Average dCache rate vs. Ns
16
Average dpm rate vs. Nf
17
Average dpm rate vs. Ns
18
Results – Average Rates
● In our tests, dpm outperformed dCache in average rate for every combination of Nf and Ns
19
Results – Average Rates
● Transfer rates are greater when using jfs or xfs rather than ext2 or ext3
● Rates for ext2 are better than ext3 because ext2 does not incur journalling overheads
20
Results – Average Rates
● Having more than one parallel file (Nf > 1) on the channel substantially improves the transfer rate for both SRMs and for all filesystems
– For both SRMs, the average rate is similar for Nf = 3, 5, 10
● dCache
– Ns = 1 is the optimal value for all filesystems
● dpm
– Ns = 1 is the optimal value for ext2 and ext3
– For jfs and xfs the rate seems independent of Ns
● For both SRMs, the average rate is similar for Ns = 3, 5, 10
21
Results – Error (Failure) Rates
● Failures, in both cases, tended to be caused by a failure to correctly call srmSetDone() in FTS, resulting from high machine load
● Recommended to separate the SRM daemons and disk servers, especially at larger sites
22
Results – Error (Failure) Rates
● dCache
– Small number of errors for the ext2 and ext3 filesystems
● Caused by high machine load
– No errors for the jfs and xfs filesystems
● dpm
– All filesystems had errors
● As in the dCache case, caused by high machine load
– The error rate for jfs was particularly high, but this was down to many errors in one single transfer
23
Results – FTS Parameters
● Nf
– Initial tests indicate that setting Nf to a high value (15) causes a large load on the machine when the first batch of files completes; subsequent batches time out
– Caused by post-transfer SRM protocol negotiations occurring simultaneously
● Ns
– Ns > 1 caused slower rates for 3/4 of the SRM/filesystem combinations
– Multiple streams cause a file to be split up and sent down different TCP channels
– This results in “random writes” to the disk (see the sketch below)
– A single stream causes the data packets to arrive sequentially, so they can be written sequentially too
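The “random writes” point can be illustrated with a toy model. The sketch below assumes each GridFTP stream carries one contiguous slice of the file and that blocks arrive round-robin across the streams; this is a simplification for illustration, not the actual GridFTP striping algorithm, but it shows why a single stream yields sequential write offsets while several streams make consecutive writes jump around the file.

    FILE_MB = 12                  # toy file of 12 one-MB blocks (assumption)

    def write_order(n_streams):
        """Offsets (in MB) in the order the receiver writes them, assuming each
        stream carries one contiguous slice and blocks arrive round-robin."""
        slice_len = FILE_MB // n_streams
        order = []
        for i in range(slice_len):
            for s in range(n_streams):
                order.append(s * slice_len + i)
        return order

    print(write_order(1))   # [0, 1, 2, ..., 11]      -> sequential writes
    print(write_order(3))   # [0, 4, 8, 1, 5, 9, ...] -> writes jump around the file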
24
Future Work
● Use SL4 as the OS – allows testing of the 2.6 kernel
● Different stripe sizes for the RAID configuration
● TCP read and write buffer sizes
– Linux kernel networking tuning parameters
● Additional hardware, e.g. more disk servers
● More realistic simulation
– Simultaneous reading/writing
– Local file access
● Other filesystems?
– e.g. ReiserFS, but this filesystem is more applicable to holding small files, not the sizes that will exist on the LCG
25
Conclusions
● Choice of SRM application should be made at site level, based on the resources available
● Using a newer high-performance filesystem, jfs or xfs, increases the inbound rate
– How to move to an xfs filesystem without losing data: http://www.gridpp.ac.uk/wiki/DPM_Filesystem_XFS_Formatting_Howto
● High value for Nf
– Although too high a value will cause other problems
● Low value for Ns
– I recommend Ns = 1 and Nf = 8 for the GridPP inter-T2 tests that I'm currently conducting