Scalable sync-and-share service with dCache. Paul Millar, on behalf of the dCache team. Workshop on Cloud Services for Synchronisation and Sharing (CS3), Amsterdam, The Netherlands, 2017-01-30 .. 2017-02-01. https://cs3.surfsara.nl/
Direct NFS mount of dCache: looks like a regular file-system; in use at DESY-Cloud. Direct dCache WebDAV support: similar to NFS, but supporting more authn options (e.g., X.509, OpenID-Connect). Tie QoS to a directory; e.g., an 'important' folder – multiple disk copies; an 'archive' folder – contents written to tape.
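Because dCache speaks standard WebDAV, any plain HTTP client can read and write files directly. A minimal sketch of such access, assuming a WebDAV door at a hypothetical https://dcache.example.org:2880 and an already-obtained OpenID-Connect bearer token; the host, port, path, and token handling are illustrative, not taken from the talk.

```python
# Minimal WebDAV read/write against a dCache door (illustrative sketch).
# The door URL, port, and path are hypothetical; TOKEN is assumed to hold
# a valid OpenID-Connect access token.
import requests

DOOR = "https://dcache.example.org:2880"  # hypothetical WebDAV door
TOKEN = "..."                             # OIDC access token (assumption)
headers = {"Authorization": f"Bearer {TOKEN}"}

# Upload a file (WebDAV PUT).
with open("report.pdf", "rb") as f:
    r = requests.put(f"{DOOR}/home/alice/report.pdf", data=f, headers=headers)
    r.raise_for_status()

# Download it again (plain HTTP GET).
r = requests.get(f"{DOOR}/home/alice/report.pdf", headers=headers)
r.raise_for_status()
```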
What is this talk about? [Architecture diagram: web-clients and sync-clients connect to 𝑥Cloud, which accesses dCache through an NFS-mounted filesystem.]
What is this 'dCache' cloud thing? Managed, distributed storage: an aggregation of heterogeneous storage systems. Mature open-source project; core team based at three organisations. In production for over 15 years and deployed throughout the world, with instances holding tens of petabytes of data and O(10^8) files. Different QoS within a single dCache instance: support for different media types: random-IO (SSD, HDD, CEPH) and whole-file copying (HSM/tape, S3, …); different replication policies; data placement / caching policies. Through the INDIGO-DataCloud (Horizon 2020) project: working within RDA to standardise QoS definitions, within SNIA to standardise QoS access models, and within dCache to improve QoS support.
Media locality QoS in dCache. A different QoS is chosen per directory: different users have different QoS, and a user chooses a QoS by writing the file into the corresponding directory. Options: single copy; redundant copies across distinct nodes | racks | buildings | campuses | …; disk + tape copy. All QoS options are available in 𝑥Cloud: media placement, geographical locality, replication factor, … Principally admin-configured: the user chooses into which directory they write, and the target directory determines which policy is used; e.g., a 'tape' directory goes to tape, a 'redundant' directory gets multiple copies.
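Concretely, the user-visible part of this is just a choice of target directory. A minimal sketch over an NFS-mounted dCache, where the mount point and the 'tape' and 'redundant' directory names are hypothetical examples of an admin-configured layout:

```python
# Selecting QoS by target directory on an NFS-mounted dCache instance.
# The mount point and directory names ('tape', 'redundant') are
# hypothetical examples of an admin-configured layout.
import shutil

MOUNT = "/dcache/home/alice"  # hypothetical NFS mount of dCache

# A file written here would be migrated to tape, per the directory's policy...
shutil.copy("results.tar", f"{MOUNT}/tape/results.tar")

# ...while a file written here would be replicated across distinct
# nodes/racks, per that directory's policy.
shutil.copy("thesis.tex", f"{MOUNT}/redundant/thesis.tex")
```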
QoS media type support: CEPH. dCache can use any POSIX storage, typically RAID storage with some local filesystem. Admins wanted to be able to use CEPH. Now supported: one storage-node per CEPH pool, with support for redundant storage-nodes being added. This also benefits cluster file-systems: Lustre, GPFS, …
Geographic placement QoS, part #1
Geographic placement QoS, part #2
Geographic placement QoS, part #3
Updates in dCache: hardening the service. Matching NFS semantics to 𝑥Cloud expectations: allow file truncation (for when 𝑥Cloud overwrites existing files); better support for multi-open (multiple sync-clients); implementing more advanced features from the NFS specification. Real-life conditions / operational intervention: better support for planned storage-node intervention, better user experience after unintended storage-node outage/restart, more stable IO when the network is flaky. High-availability dCache: all dCache components may be deployed redundantly, load-balancing for core components, users unaware of (patch-level) upgrades and (planned) hardware intervention.
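The two 𝑥Cloud-driven behaviours named above are easy to picture from the client side. A minimal sketch against an NFS-mounted dCache path; the mount point and file name are hypothetical examples:

```python
# Two client behaviours the NFS door now has to support, exercised
# against an NFS-mounted dCache path (mount point is hypothetical).
import os

path = "/dcache/home/alice/notes.txt"  # hypothetical NFS-mounted file

# Truncation: overwriting an existing file, as xCloud does, opens it
# with O_TRUNC rather than deleting and recreating it.
with open(path, "w") as f:  # mode "w" implies O_TRUNC on an existing file
    f.write("new contents\n")

# Multi-open: two sync-clients may hold the same file open concurrently.
fd1 = os.open(path, os.O_RDONLY)
fd2 = os.open(path, os.O_RDONLY)
os.close(fd1)
os.close(fd2)
```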
Looking towards the future
dCache vision. 𝑥Cloud and dCache each maintain a duplicate namespace: keeping these in sync is a pain. Knowledge of shared files exists only in the 𝑥Cloud layer: shared files are not visible when interacting with dCache directly.
Teaching dCache about sharing. Users can share files or directories: you can share what you own, plus anything you've been given permission to share (e.g., the ACL ADMINISTRATE bit). A SHARE object is written into each receiving user's 'incoming' directory (e.g., their home directory). Only the receiving user sees the SHARE; they can rename, move, delete, and cd/read it, similar to a regular sym-link. A SHARE has a share-mode for access control: a share can be read-only or read-write, independent of the SHARE's filesystem permissions. Renaming the source does not affect the SHARE. Unsharing a subset of recipients removes their SHAREs; removing the source removes all SHAREs.
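The SHARE life-cycle described above can be summarised in a toy model; everything here (class names, fields, helpers) is invented for illustration and is not dCache code:

```python
# Toy model of the SHARE semantics described above; all names are
# invented for illustration.
from dataclasses import dataclass, field

@dataclass
class Source:
    owner: str
    shares: list = field(default_factory=list)   # active SHAREs on this file/directory

@dataclass
class Share:
    source: Source
    recipient: str            # only this user sees the SHARE
    mode: str = "read-only"   # or "read-write"; independent of filesystem permissions

def share(source, recipients, mode="read-only"):
    """Write a SHARE object into each recipient's 'incoming' directory."""
    for user in recipients:
        source.shares.append(Share(source, user, mode))

def unshare(source, recipients):
    """Unsharing a subset of recipients removes only their SHAREs."""
    source.shares = [s for s in source.shares if s.recipient not in recipients]

def remove_source(source):
    """Removing the source also removes all SHAREs."""
    source.shares.clear()

# Example: alice shares with bob (read-only) and carol (read-write).
doc = Source(owner="alice")
share(doc, ["bob"])
share(doc, ["carol"], mode="read-write")
unshare(doc, ["bob"])     # bob's SHARE disappears; carol's remains
remove_source(doc)        # all remaining SHAREs disappear
```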
QoS improvements. Work coming through the INDIGO-DataCloud EU Horizon-2020 project: standardising QoS descriptions (SNIA, RDA); increasing choice and improving flexibility with more QoS options; moving the decision from the admin to the user. What it looks like: a RESTful interface to dCache, plus an 𝑥Cloud application to drive that RESTful interface, exposing the choice to sync-n-share users. Examples: a user chooses the desired QoS for a directory, or changes the QoS later on: this data is actually important (multiple copies, store on tape, etc.).
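What driving such a RESTful interface could look like, sketched with Python's requests; the host, port, endpoint layout, and QoS label are assumptions modelled on dCache's RESTful frontend, not verified against any particular release:

```python
# Sketch of driving a QoS REST interface. The host, port, endpoint
# layout, and QoS label are assumptions, not a documented dCache API.
import requests

API = "https://dcache.example.org:3880/api/v1"  # hypothetical frontend URL
auth = ("alice", "secret")                      # illustrative credentials

# Query the current QoS of a directory.
r = requests.get(f"{API}/namespace/home/alice/thesis",
                 params={"qos": "true"}, auth=auth)
print(r.json())

# Later decide the data is important: request a QoS transition to tape.
r = requests.post(f"{API}/namespace/home/alice/thesis",
                  json={"action": "qos", "target": "tape"}, auth=auth)
r.raise_for_status()
```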
Summary. NFS support hardened through extensive use in real-world conditions. Extending QoS options in dCache. Plans: expose QoS options in 𝑥Cloud; implement SHARE in dCache; reduce the dependency on the 𝑥Cloud database; investigate getDirectDownload for direct-from-dCache client HTTP transfers; finish sync-client support within dCache.