Published by Paloma Burrough. Modified over 9 years ago.
1
DPM Status & Roadmap
Ricardo Rocha (on behalf of the DPM team)
EMI INFSO-RI-261611
2
DPM Overview
[Diagram: CLIENT talks to the HEAD NODE (DPNS, DPM, SRM, HTTP, NFS) for FILE METADATA OPS and to the DISK NODE(s) (GRIDFTP, RFIO, HTTP, NFS, XROOT) for FILE ACCESS OPS]
3
DPM Core 1.8.2, Testing, Roadmap
4
DPM 1.8.2 – Highlights
Improved scalability of all frontend daemons
  – Especially with many concurrent clients
  – By having a configurable number of threads (fast/slow in the case of the dpm daemon)
Faster DPM drain
  – Disk server retirement, replacement, …
Better balancing of data among disk nodes
  – By assigning different weights to each filesystem
Log to syslog
GLUE2 support
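The slides do not say how DPM turns filesystem weights into placement decisions. As a rough illustration of the idea only, the sketch below picks a filesystem with probability proportional to its weight. The filesystem names, the weights, and the use of random selection are all assumptions made for this example, not DPM's algorithm.

```python
import random
from collections import Counter

# Hypothetical filesystems and weights (not taken from any real DPM setup).
# A weight of 0 means "never choose this filesystem for new files".
WEIGHTS = {"disk01:/fs1": 1, "disk02:/fs1": 3, "disk03:/fs1": 0}


def pick_filesystem(weights, rng=random):
    """Return one filesystem name, chosen proportionally to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


def simulate(n_files, seed=0):
    """Place n_files files and count how many land on each filesystem."""
    rng = random.Random(seed)
    return Counter(pick_filesystem(WEIGHTS, rng) for _ in range(n_files))


if __name__ == "__main__":
    for name, count in sorted(simulate(4000).items()):
        print(name, count)
```

With these example weights, disk02 should receive roughly three times as many files as disk01, and disk03 none.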
5
DPM Core – Testing Activity
Improved validation & testing
  – Collaboration with ASGC for this purpose (thanks!)
  – Hammercloud tests running regularly
  – They started with a 400-core setup; we looked at the issues and are now moving to 1000 cores to increase load
Example run
  – http://hammercloud.cern.ch/atlas/10006472/test/
To be used extensively for stress testing
  – Covering all components: DPM, RFIO, GRIDFTP, NFS, HTTP, …
Results will benefit other sites too
6
DPM Core – Testing
Example: GridFTP vs RFIO
[Plots: HC using GridFTP; HC using RFIO]
Thanks to ShuTing for the plots (preliminary results)
7
DPM Core – Testing
Big contribution from openlab student
  – Martin Hellmich, University of Edinburgh
Detailed analysis of DPM internals
  – Detecting bottlenecks in specific transfer / access phases
Example… but we have a lot more results which we are now investigating
8
DPM Core – Roadmap
Planned across releases 1.8.3, 1.8.4 and 1.8.5:
Package consolidation: EPEL compliance
Fixes in multi-threaded clients
Replace httpg with https on the SRM
Improve dpm-replicate (dirs and FSs)
GUIDs in DPM
Synchronous GET requests
Reports on usage information
Quotas
Accounting metrics
HOT file replication
9
Beta Components HTTP/DAV, NFS, Nagios, Puppet, Perfsuite, Catalog Sync, Contrib Tools
10
Beta Components: Overview
Faster releases
  – Monthly releases since June
Separate yum repository
Already in use by several sites
  – Including sites in the UK
https://svnweb.cern.ch/trac/lcgdm/wiki/Dpm/Dev/Components
11
Beta Components: PerfSuite Overview
12
Performance Suite
Set of tools to easily trigger batches of tests
  – With different configurations
Common wrapper, many tests
Existing suites
  – POSIX Transfers: RFIO, NFS
  – GET/PUT Transfers: HTTP, GSIFTP
  – ROOT
  – More coming…
Used for most results presented later
https://svnweb.cern.ch/trac/lcgdm/wiki/Dpm/Admin/Performance#Perfsuite
13
Performance Suite: Sample Configuration
test_rfcp(c:5,s:{1M 2M 4M 8M 16M 32M 64M 128M 256M 512M 1G})x3
test_nfs(m:/mnt/nfs41,c:5,s:{1M 2M 4M 8M 16M 32M 64M 128M 256M 512M 1G})x3
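The sample configuration lines have a regular shape: a test name, a parenthesised list of key:value parameters, and a trailing xN. The sketch below parses that shape in Python purely from the two lines shown. It is not the Perfsuite parser, and the slides do not define the keys, so reading `s` as a list of file sizes and `xN` as a repeat count is an assumption.

```python
import re

# name(params)xN, e.g. test_rfcp(c:5,s:{1M 2M})x3
LINE_RE = re.compile(r"^(?P<name>\w+)\((?P<params>.*)\)x(?P<repeat>\d+)$")
# key:value pairs; a value is either a {brace-delimited list} or text up to a comma
PARAM_RE = re.compile(r"(\w+):(\{[^}]*\}|[^,]*)")


def parse_line(line):
    """Parse one configuration line into (name, params, repeat)."""
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognised configuration line: {line!r}")
    params = {}
    for key, value in PARAM_RE.findall(match.group("params")):
        # Brace-delimited values become lists of space-separated items.
        params[key] = value[1:-1].split() if value.startswith("{") else value
    return match.group("name"), params, int(match.group("repeat"))


def expand(line):
    """List (name, size, run) tuples, assuming s = sizes and xN = repeats."""
    name, params, repeat = parse_line(line)
    return [(name, size, run)
            for run in range(1, repeat + 1)
            for size in params.get("s", [])]


if __name__ == "__main__":
    sample = "test_nfs(m:/mnt/nfs41,c:5,s:{1M 2M 4M})x3"
    print(parse_line(sample))
    print(len(expand(sample)), "individual runs")
```

Under that reading, each sample line on the slide expands to 11 sizes times 3 repeats, i.e. 33 runs.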
14
Beta Components: HTTP / DAV Overview, Performance, Roadmap
15
HTTP / DAV: Overview
https://svnweb.cern.ch/trac/lcgdm/wiki/Dpm/WebDAV
[Diagram: CLIENT, LFC, DPM HEAD and DPM DISK; numbered steps 1 to 3 labelled GET, REDIRECT, GET / PUT and DATA]
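The overview diagram suggests a chain of redirects: the client asks a catalogue or head node, which points it at the node that actually holds the data. The sketch below imitates that pattern with three throwaway local HTTP servers from the Python standard library. The 302 status code, the paths, and the stand-in "lfc", "head" and "disk" servers are assumptions for illustration and say nothing about the real DPM implementation.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlsplit

DISK_BODY = b"file contents from disk node"


class DiskHandler(BaseHTTPRequestHandler):
    """Stand-in for a disk node: serves the data itself."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Length", str(len(DISK_BODY)))
        self.end_headers()
        self.wfile.write(DISK_BODY)

    def log_message(self, *args):  # keep the demo quiet
        pass


def make_redirector(target_port):
    """Stand-in for the LFC or head node: redirects to the next hop."""
    class Redirector(BaseHTTPRequestHandler):
        def do_GET(self):
            self.send_response(302)
            self.send_header("Location",
                             f"http://127.0.0.1:{target_port}{self.path}")
            self.send_header("Content-Length", "0")
            self.end_headers()

        def log_message(self, *args):
            pass
    return Redirector


def start(handler):
    """Start a server on a free local port in a background thread."""
    server = HTTPServer(("127.0.0.1", 0), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def get_following_redirects(url, max_hops=5):
    """GET url, following redirects by hand; return (hops, body)."""
    hops = []
    for _ in range(max_hops):
        parts = urlsplit(url)
        conn = http.client.HTTPConnection(parts.hostname, parts.port, timeout=5)
        try:
            conn.request("GET", parts.path or "/")
            resp = conn.getresponse()
            body = resp.read()
            hops.append((url, resp.status))
            if resp.status not in (301, 302, 303, 307, 308):
                return hops, body
            url = resp.getheader("Location")
        finally:
            conn.close()
    raise RuntimeError("too many redirects")


if __name__ == "__main__":
    disk = start(DiskHandler)
    head = start(make_redirector(disk.server_address[1]))
    lfc = start(make_redirector(head.server_address[1]))
    hops, body = get_following_redirects(
        f"http://127.0.0.1:{lfc.server_address[1]}/example/file")
    for url, status in hops:
        print(status, url)
    print(body.decode())
```

Following the redirects by hand, rather than letting a library do it, makes each hop visible, which is the point of the diagram.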
17
HTTP: Client Support

            curl    browser
OS          Any     Any
GUI         NO      YES
CLI         YES     NO
X509        YES     YES
Proxies     YES     Only IE so far
Redirect    YES     YES
PUT         YES     NO

Recommendation: browser/curl for GET, curl for PUT
Chrome Issue 9056 submitted for proxy support
18
DAV: Client Support
Clients compared (OS / environment): TrailMix (Firefox < 4), Cadaver (*nix), Davlib (Mac OS X), Shared Folder (Windows), DavFS2 (*nix), Nautilus (Gnome), Dolphin (KDE)
Table values as extracted, in row order:
GUI: YES, NO, YES, N/A, YES
CLI: NO, YES, NO, N/A, NO
X509: YES, NO, YES, NO
Proxies: ?, NO, YES, NO
Redirect: YES, NO, YES, Not
PUT: NO, YES
Updated analysis based on initial one from dCache
Recommendation: Cadaver for *nix, Windows explorer
19
HTTP vs GridFTP: Multiple streams
Not explicit in the HTTP protocol
But needed for even higher performance
  – Especially in the WAN
So we added it, with some semantics
  – Small wrapper around libcurl
  – A PUT with 0 bytes and no Content-Range signals end of write
Submitted patch to libcurl to allow SSL session reuse among parallel requests
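To make the multi-stream semantics concrete, the sketch below splits a file into byte ranges, one per stream, and builds the headers each partial PUT might carry, plus the zero-byte PUT that marks the end of the write. Only the end-of-write rule comes from the slide. The even split and the use of the standard `bytes first-last/total` Content-Range syntax on each chunk are assumptions.

```python
def plan_streams(total_size, n_streams):
    """Split bytes [0, total_size) into up to n_streams contiguous ranges.

    Returns a list of inclusive (start, end) byte offsets.
    """
    if total_size <= 0 or n_streams <= 0:
        raise ValueError("total_size and n_streams must be positive")
    base, extra = divmod(total_size, n_streams)
    ranges = []
    start = 0
    for i in range(n_streams):
        # The first `extra` streams carry one additional byte each.
        length = base + (1 if i < extra else 0)
        if length == 0:
            continue  # more streams than bytes
        end = start + length - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


def chunk_put_headers(start, end, total_size):
    """Headers for one partial PUT (assumed layout, see lead-in)."""
    return {
        "Content-Length": str(end - start + 1),
        "Content-Range": f"bytes {start}-{end}/{total_size}",
    }


def end_of_write_headers():
    """Zero-byte PUT without Content-Range: the end-of-write marker."""
    return {"Content-Length": "0"}


if __name__ == "__main__":
    size = 10
    for start, end in plan_streams(size, 3):
        print("PUT", chunk_put_headers(start, end, size))
    print("PUT", end_of_write_headers())
```

Each range could then be sent on its own connection, with the final marker sent once all partial PUTs have completed.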
20
HTTP vs GridFTP: 3rd Party Copies
Implemented using WEBDAV COPY
Requires proxy certificate delegation
  – Using gridsite delegation, with a small wrapper client
Requires some common semantics to copy between SEs (to be agreed)
  – Common delegation portType location and port
  – No prefix in the URL (just http:// / )
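For readers unfamiliar with WebDAV COPY, the sketch below assembles the pieces of such a request in Python: the COPY method on the source path and a Destination header naming the target, as defined by the WebDAV standard. The host names and paths are placeholders, and the proxy delegation step mentioned above is left out entirely.

```python
from urllib.parse import urlsplit


def build_copy_request(source_url, destination_url, overwrite=True):
    """Return (host, method, path, headers) for a WebDAV COPY.

    The request is addressed to the source server; the Destination
    header tells it where the copy should be written.
    """
    src = urlsplit(source_url)
    headers = {
        "Destination": destination_url,
        # WebDAV uses "T"/"F" to allow or forbid overwriting the target.
        "Overwrite": "T" if overwrite else "F",
    }
    return src.netloc, "COPY", src.path or "/", headers


if __name__ == "__main__":
    # Placeholder storage elements, not real endpoints.
    host, method, path, headers = build_copy_request(
        "https://se-a.example.org/data/file1",
        "https://se-b.example.org/data/file1",
    )
    print(method, path, "on", host)
    for key, value in headers.items():
        print(f"{key}: {value}")
```

The returned values can be passed to `http.client.HTTPSConnection(host).request(method, path, headers=headers)` once authentication and delegation are in place.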
21
HTTP vs GridFTP: 3rd Party Copies
Example of FTS usage
22
HTTP / DAV: Performance (Ongoing Evaluation)
No difference detected in LAN with different number of streams
  – But early results do show a big difference on the WAN
lcg-cp configured to use gridftp
File registration & transfer times considered in both cases
Test setup: Xeon 4 cores 2.27GHz, 12 GB RAM, 1 Gbit/s links
23
HTTP / DAV: Issues & Roadmap
Towards a first production release
  – Testing with large number of concurrent clients
  – Finish up the WAN performance tests
And after that
  – Further testing of 3rd party copy with larger files
  – Finish validation against other implementations
  – Validate usage via ROOT
  – Improved GET on the LFC
  – PUT support on the LFC (?)
24
Beta Components: NFS 4.1 / pNFS Overview, Performance, Roadmap
25
NFS 4.1/pNFS: Why?
Industry standard (IBM, NetApp, EMC, …)
Free clients (with free caching)
Strong security (GSSAPI)
Parallel data access
Easier maintenance
…
But you know all this by now…
26
NFS 4.1/pNFS: Overview
https://svnweb.cern.ch/trac/lcgdm/wiki/Dpm/NFS41
[Diagram: CLIENT, METADATA SERVER and DISK SERVER(s); numbered steps 1 to 7 with the operations OPEN, LAYOUTGET, GETDEVICEINFO, OPEN, READ / WRITE and CLOSE]
27
NFS4.1 / pNFS: Client
pNFS support in the Linux kernel >= 2.6.38
nfs-utils >= 1.2.3
Latest Fedora and Debian Sid have it
We provide packages for EL5
  – Enabled pNFS in the elrepo mainline kernel
  – We package nfs-utils and the AFS module ourselves
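The two minimum versions above are easy to check mechanically. The sketch below compares dotted version strings against those minimums. The helper names are made up for this example, and a version number alone does not prove that pNFS was enabled when a given kernel was built.

```python
import re

# Minimum versions quoted on the slide.
MIN_KERNEL = (2, 6, 38)
MIN_NFS_UTILS = (1, 2, 3)


def parse_version(text):
    """Turn the leading dotted-numeric part of a version into a tuple.

    For example "2.6.38-8.el5" becomes (2, 6, 38); the suffix is ignored.
    """
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(version_text, minimum):
    """True if version_text is at least the given minimum tuple."""
    return parse_version(version_text) >= minimum


if __name__ == "__main__":
    import platform
    release = platform.release()  # on Linux, the running kernel release
    try:
        ok = meets_minimum(release, MIN_KERNEL)
    except ValueError:
        ok = False  # release string does not start with a version number
    print(release, "meets the kernel minimum:", ok)
```

Tuple comparison handles releases with fewer components too, so "3.0" compares as newer than 2.6.38.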
28
NFS4.1 / pNFS: Performance (Ongoing Evaluation)
IOZONE Results
Server
  – Xeon 4 Cores 2.27GHz
  – 12 GB RAM
  – 1 Gbit/s links
Client
  – Dual core
  – 2 GB RAM
  – 100 Mbit/s link
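The client's 100 Mbit/s link puts a hard ceiling on any throughput these tests can show, whatever the server can deliver. The short calculation below works out that ceiling and the minimum time to move a file, ignoring protocol overhead and using decimal units (1 MB = 10^6 bytes); the 1 GB file size is just an example.

```python
def link_ceiling_mb_per_s(link_mbit_per_s):
    """Upper bound on throughput in MB/s for a link rated in Mbit/s."""
    return link_mbit_per_s / 8


def min_transfer_seconds(size_bytes, link_mbit_per_s):
    """Shortest possible transfer time over the link, with no overhead."""
    return size_bytes * 8 / (link_mbit_per_s * 10**6)


if __name__ == "__main__":
    for label, mbit in (("client, 100 Mbit/s", 100), ("server, 1 Gbit/s", 1000)):
        print(f"{label}: at most {link_ceiling_mb_per_s(mbit)} MB/s, "
              f"1 GB takes at least {min_transfer_seconds(10**9, mbit)} s")
```

On the client link this gives at most 12.5 MB/s, so measured rates near that value reflect the network rather than the protocol under test.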
29
NFS4.1 / pNFS: Performance (Ongoing Evaluation)
NFS vs RFIO
RFIO read misbehaving in this test… investigating
8 KB block sizes
Server
  – Xeon 4 Cores 2.27GHz
  – 12 GB RAM
  – 1 Gbit/s links
Client
  – Dual core
  – 2 GB RAM
  – 100 Mbit/s link
30
NFS4.1 / pNFS: Issues & Roadmap
Towards a first production release
  – Tests with a faster network link
  – Testing with a larger number of concurrent clients
  – WAN testing
  – Enable bigger block sizes
And after that
  – X509 certificate support (still not figured out… needs a strong focus)
  – Further validation with other implementations
31
Beta Components: Even more… Puppet, Nagios, Contrib, Catalog Sync
32
Even more components…
Catalog Synchronization
  – Check Fabrizio’s talk next Monday (EGI Forum Lyon)
DPM Admin contrib package
  – Contribution from GridPP
  – Now packaged and distributed with the DPM components
  – http://www.gridpp.ac.uk/wiki/DPM-admin-tools
Nagios monitoring plugins for DPM
  – Available now
  – https://svnweb.cern.ch/trac/lcgdm/wiki/Dpm/Admin/Monitoring
Puppet templates
  – Available now in beta
  – https://svnweb.cern.ch/trac/lcgdm/wiki/Dpm/Admin/Puppet
33
Conclusion
1.8.2 fixes many scalability and performance issues
  – But we continue testing and improving
Popular requests coming in next versions
  – Accounting, quotas, easier replication
Beta components getting to production state
  – Standards compliant data access
  – Simplified setup, configuration, maintenance
  – Metadata consistency and synchronization
And much more extensive testing
  – Performance test suites, regular large scale tests