dCache Scientific Cloud

Slides:



Advertisements
Similar presentations
30-31 Jan 2003J G Jensen, RAL/WP5 Storage Elephant Grid Access to Mass Storage.
Advertisements

Steve Traylen Particle Physics Department Experiences of DCache at RAL UK HEP Sysman, 11/11/04 Steve Traylen
Consistency and Replication Chapter 7 Part II Replica Management & Consistency Protocols.
Storage: Futures Flavia Donno CERN/IT WLCG Grid Deployment Board, CERN 8 October 2008.
Preservasi Informasi Digital.  It will never happen here!  Common Causes of Loss of Data  Accidental Erasure (delete, power, backup)  Viruses and.
CSC 456 Operating Systems Seminar Presentation (11/13/2012) Leon Weingard, Liang Xin The Google File System.
INFSO-RI Enabling Grids for E-sciencE SRMv2.2 experience Sophie Lemaitre WLCG Workshop.
Evolution, by tackling new challenges| CHEP 2015, Japan | Patrick Fuhrmann | 16 April 2015 | 1 Patrick Fuhrmann On behave of the project team Evolution,
Grid Lab About the need of 3 Tier storage 5/22/121CHEP 2012, The need of 3 Tier storage Dmitri Ozerov Patrick Fuhrmann CHEP 2012, NYC, May 22, 2012 Grid.
WebFTS File Transfer Web Interface for FTS3 Andrea Manzi On behalf of the FTS team Workshop on Cloud Services for File Synchronisation and Sharing.
Future home directories at CERN
Jens G Jensen RAL, EDG WP5 Storage Element Overview DataGrid Project Conference Heidelberg, 26 Sep-01 Oct 2003.
Sync and Exchange Research Data b2drop.eudat.eu This work is licensed under the Creative Commons CC-BY 4.0 licence B2DROP EUDAT’s Personal.
LHC Computing, CERN, & Federated Identities
File Transfer And Access (FTP, TFTP, NFS). Remote File Access, Transfer and Storage Networks For different goals variety of approaches to remote file.
Introduction to AFS IMSA Intersession 2003 An Overview of AFS Brian Sebby, IMSA ’96 Copyright 2003 by Brian Sebby, Copies of these slides.
CMS: T1 Disk/Tape separation Nicolò Magini, CERN IT/SDC Oliver Gutsche, FNAL November 11 th 2013.
The GridPP DIRAC project DIRAC for non-LHC communities.
1 5/4/05 Fermilab Mass Storage Enstore, dCache and SRM Michael Zalokar Fermilab.
GGUS summary (3 weeks) VOUserTeamAlarmTotal ALICE7029 ATLAS CMS LHCb Totals
Sync n share and QoS in dCache | NEC 2015, Montenegro| Patrick Fuhrmann | 02 Oct 2015 | 1 Patrick Fuhrmann On behave of the dCache team by especially Tigran.
Scientific Data Processing Portal and Heterogeneous Computing Resources at NRC “Kurchatov Institute” V. Aulov, D. Drizhuk, A. Klimentov, R. Mashinistov,
Cremlin Kick Off | DESY-IT | V. Gülzow | Seite 1 Untertitel durch Klicken bearbeiten e-Infrastructures & Big Data Handling Volker Guelzow DESY Moscow,
Nov 05, 2008, PragueSA3 Workshop1 A short presentation from Owen Synge SA3 and dCache.
Security recommendations for dCache
Open Object Storage as the Foundation for Open Sync and Share in Science Simon Traill – © SwiftStack All.
CERN IT-Storage Strategy Outlook Alberto Pace, Luca Mascetti, Julien Leduc
Federating Data in the ALICE Experiment
dCache Paul Millar, on behalf of the dCache Team
Open-E Data Storage Software (DSS V6)
CernVM-FS vs Dataset Sharing
MERANTI Caused More Than 1.5 B$ Damage
IWDW-WAQS Technical Committee
Scalable sync-and-share service with dCache
By: Raza Usmani SaaS, PaaS & TaaS By: Raza Usmani
DAaM summary Ian Bird MB 29th June 2010.
WP18, High-speed data recording Krzysztof Wrona, European XFEL
StratusLab First Periodic Review
StoRM: a SRM solution for disk based storage systems
Status of the SRM 2.2 MoU extension
BNL Tier1 Report Worker nodes Tier 1: added 88 Dell R430 nodes
Future of WAN Access in ATLAS
Data Bridge Solving diverse data access in scientific applications
dCache – protocol developments and plans
StoRM Architecture and Daemons
Introduction to Data Management in EGI
Taming the protocol zoo
Storage for Science at CERN
Usage of Openstack Cloud Computing Architecture in COE Seowon Jung Systems Administrator, COE
ATLAS Sites Jamboree, CERN January, 2017
Study course: “Computing clusters, grids and clouds” Andrey Y. Shevel
Quality Control in the dCache team.
EGI UMD Storage Software Repository (Mostly former EMI Software)
Ákos Frohner EGEE'08 September 2008
DCache things Paul Millar … on behalf of the dCache team.
Computing Infrastructure for DAQ, DM and SC
dCache: new and exciting features
Research Data Archive - technology
AWS. Introduction AWS launched in 2006 from the internal infrastructure that Amazon.com built to handle its online retail operations. AWS was one of the.
Cloud vs Local: Better Data Storage Device
A Web-Based Data Grid Chip Watson, Ian Bird, Jie Chen,
IT INFRASTRUCTURES Business-Driven Technologies
INFO 344 Web Tools And Development
Cloud computing mechanisms
Technical Capabilities
Building Serverless Enterprise Applications
Windows Azure Hybrid Architectures and Patterns
Exploring Multi-Core on
Rucio & objectstores James Perry.
Building a Windows Azure Application
Presentation transcript:

dCache Scientific Cloud Paul Millar (on behalf of DESY and dCache team) Workshop on Cloud Services for File Synchronisation and Sharing, CERN

DESY: a HEP lab, with photon sciences + Optimising writing of data into dCache Need to cope with “bad” writers: Writing small (a few MiB) files at ~100 Hz dCache was originally written for low frequency, high data-rates. Newer (major) versions of dCache have optimisations folded in, improved performance

Need a sync-n-share service at DESY Hard requirements: Easy to use, Store everything at DESY, Integrate with existing infrastructure. Anticipated future usage: change data between syncing and non-syncing storage, like Amazon, provide different QoS with different costs, share data without syncing, 3rd party transfers between sites, immediate access to sync space from compute facilities.

A scientific cloud vision

How we solved it at DESY Looked around, chose two open-source projects: dCache: powerful managed storage system Integration with scientific data life-cycle; “Hot” data can be stored on SSDs, “cold” on cheaper HDDs, “archive” tape; … but no sync and share facilities. ownCloud: popular front-end Our collaborators adopting ownCloud makes it more attractive; … but assumes storage is managed. Combining these two gives DESY the best of both worlds: dCache is mounted on ownCloud server with NFS v4.1/pNFS, running community edition. Integrated with DESY Kerberos, LDAP and Registry.

What is dCache? Collaborations Core team Student mentor programme 1.5 FTEs 2 FTEs 5 FTEs 3 students Look at adding bandwidth usage as a metric of importance. dCache is important throughout the world for storing Big Data in Science

The DESY Cloud service Status: production, for “friends” in two weeks “by invitation” (selected power-users, -groups); 1st January general availability. Required minor patches to ownCloud & dCache. Changes pushed into dCache: Running unpatched dCache release Have a blueprint for any site to deploy ownCloud+dCache. Changes pushed upstream to ownCloud: Not all were accepted for v7, so running patched until v8 is released.

Integration within DESY infrastructure

Development and future work Files in dCache have user-ownership, not ownCloud: Plan to expose files directly from dCache: NFS mount, 3rd party transfers, direct access from any grid worker-node, … Couldn't fix ownCloud: work-around within dCache Consistency between ACLs and shares: dCache ACLs to honour ownCloud shares and vice versa Integrity; e.g., propagate and handling checksums, Notification: avoid client polling, Redirection support for sync-client: ownCloud server proxying data is bottleneck; want syncing to be more efficient by taking data from where its stored.

NFS v4.1/PNFS vs ownCloud (currently)

ownCloud: currently vs with redirect

Experience: problems with ownCloud If underlying FS disappears, all sync-clients delete all data. If underlying FS returns EIO on read, sync-client creates 0-length file: impossible to recover. Bulk delete through web interface is unreliable (under investigation). Rename directory causes client to delete all files and upload them again. Admin interface awkward with O(5k) users.

Not just ownCloud ... dCache team hosted a two-day workshop with project- and technical-lead of DCORE Provides cloud storage with features beyond ownCloud Some “big name” customers Initial “lite integration” by December 2014 (includes redirection support) Then providing “tight integration” with shared namespace

Thanks for listening … any questions?

Backup slides

Thinking about sync-and-share Like other systems, small fraction of data is “hot” SSDs provide better performance, but can't afford only SSDs; nice to have system that places hot data on SSDs, cold data on HDD. Amazon had a smart idea: allow people to choose how much to pay Let users choose between Normal and Glacial QoS; e.g., disable sync for Glacial-like storage but allow access via web interface

WLCG dCache instances (only WLCG sites shown)

Over 10 years “Big Data” experience Disk cache Grid Storage Generic Storage Cloud Storage Era Industry dCache is evolving to match demands from an ever increasing number of communities Additional Communities Trusted host X.509, Kerberos Username+PW SAML, OpenID, OAuth, Token, ... Additional Authen- tication