dCache “Intro”: a layperson's perspective. Frank Würthwein, UCSD

dCache docs: “The Book” (http://www.dcache.org/manuals/Book) is by far the most useful documentation on dCache!

dCache Goal (T2 perspective)
- “Virtualize” many disks into a single namespace. Allow for seemingly arbitrary growth in disk space!
- “Virtualize” many IO channels via a single entry point. Allow for seemingly arbitrary growth in aggregate IO!
- Manage flows by scheduling Xfer queues.
- Allow for replication to improve data availability and avoid hotspots among disk hardware.

Techniques to meet the goals
- Separate physical and logical namespace: PNFS = namespace mapper.
- Separate “file request” from “file open”: doors manage requests, pools manage Xfer.
- Fundamental problem: access costs for LFN -> PFN translation (max of ~50 Hz?) and SRM request handling (max of ~1 Hz?). Need large files for decent performance!
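
As an illustration of the logical/physical split: a file visible under the PNFS mount is stored on a pool under its PNFS ID, not under its logical name. A minimal sketch, assuming a hypothetical site mount point /pnfs/example.edu and pool data directory /dcache/pool1/data (the PNFS ID shown is made up):

  # Logical name, as seen by users through the NFS-mounted PNFS namespace
  ls -l /pnfs/example.edu/data/cms/run123/bigfile.root

  # Ask PNFS for the ID behind that logical name
  # (the ".(id)(<name>)" pseudo-file is a PNFS "dot command")
  cat "/pnfs/example.edu/data/cms/run123/.(id)(bigfile.root)"

  # On the pool node, the replica is stored under that PNFS ID
  ls -l /dcache/pool1/data/000100000000000000001060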

Xfer Protocols
- “Streaming” (i.e. put & get, but no seek): gftp, dccp.
- Random access: dcap, xrootd (alpha version?).
- Where does SRM fit in?
  - Provides a single entry point for many gftp doors => spreads the load.
  - Implements higher-level functionality: retries, space reservation (more on this later).
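
A few hedged command-line sketches of these access paths (hostnames and paths below are placeholders; the usual default ports 2811 for GridFTP, 22125 for dcap and 8443 for SRM are assumed):

  # GridFTP (streaming) through a gftp door
  globus-url-copy file:///data/bigfile.root \
      gsiftp://gftp-door.example.edu:2811/pnfs/example.edu/data/cms/bigfile.root

  # dccp: simple copy client speaking the dcap protocol
  dccp dcap://dcap-door.example.edu:22125/pnfs/example.edu/data/cms/bigfile.root /tmp/bigfile.root

  # srmcp: goes through SRM, which picks a gftp door and can retry for you
  srmcp file:////data/bigfile.root \
      srm://srm.example.edu:8443/pnfs/example.edu/data/cms/bigfile.root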

Performance Conclusions (so far)
- The PNFS server wants its own hardware
  - ... and be stingy with PNFS mounts, to minimize the risk of fools killing PNFS with find!!!
  - ... and avoid more than a few hundred files per directory!!!
- SRM wants its own hardware
  - ... and be “reasonable” with the number of files per srmcp; there is a per-file overhead and a per-srmcp latency!
  - ... but many dcap doors on the same node are ok!
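
One way to amortize the per-srmcp latency is to batch several files into one srmcp invocation. A hedged sketch, assuming the dCache/FNAL srmcp client's copyjobfile option (each line of the file lists a source URL and a destination URL); hostnames and paths are placeholders:

  # copyjob.txt contains one "source destination" pair per line, e.g.:
  #   file:////data/run123/file1.root srm://srm.example.edu:8443/pnfs/example.edu/data/cms/run123/file1.root
  #   file:////data/run123/file2.root srm://srm.example.edu:8443/pnfs/example.edu/data/cms/run123/file2.root

  # One srmcp invocation, one round of SRM negotiation, many files transferred
  srmcp -copyjobfile=copyjob.txt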

dCache Basic Design: components involved in data storage and data access (concept by P. Fuhrmann)
- Door: provides a specific end point for the client connection; exists as long as the client process is alive; acts as the client’s proxy within dCache.
- Name Space Provider: interface to a file system name space; maps dCache name space operations to filesystem operations; stores extended file metadata.
- Pool Manager: performs pool selection.
- Pool: data repository handler (i.e. has the physical files); launches the requested data transfer protocols.
- Mover: data transfer handler: (gsi)dCap, (Grid)FTP, http, HSM hooks.
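
These components show up as cells that can be inspected through the dCache admin interface. A hedged sketch (the legacy admin ssh port 22223, the "admin" account and the cell names are common defaults but may differ per installation; "pool1" is a made-up pool name):

  # Connect to the admin interface (legacy ssh interface)
  ssh -p 22223 admin@dcache-admin.example.edu

  # Then, inside the admin shell (prompts omitted):
  #   cd PoolManager    # enter the Pool Manager cell
  #   info              # show its status
  #   ..                # back to the top level
  #   cd pool1          # enter a pool cell
  #   info
  #   logoff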

Places to queue ... (details in ASR’s talk)
- SRM = “site global queuing”: queuing of requests; algorithm to pick the gftp door that handles each request.
- PoolManager = select the “best” pool, or replicate: hard policy (e.g. dir-tree fixed to pools); # of requests / max. allowed; space available.
- Local queuing @ pool: multiple queues for multiple purposes.

PoolManager Selection
- Determine which pools are allowed to perform the request: a static decision based on configuration; pools may be assigned by client IP & path.
- Ask all allowed pools to provide the cost of performing the request; cost is configurable per pool.
- Decide if the cost is low enough:
  - cost too high => replicate the file onto a lower-cost pool ...
  - ... otherwise assign the lowest-cost pool to service the request.
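
The "allowed pools" step is driven by the PoolManager configuration. A hedged sketch in the classic PoolManager.conf (psu) syntax; all unit, group, pool and link names are made up for illustration, and real setups differ per site and dCache version:

  # Selection units: what a request can match on (storage class, client network)
  psu create unit -store cms:data@osm
  psu create unit -net 192.168.1.0/255.255.255.0

  # Unit groups that a link can refer to
  psu create ugroup cms-store
  psu addto ugroup cms-store cms:data@osm
  psu create ugroup farm-net
  psu addto ugroup farm-net 192.168.1.0/255.255.255.0

  # Pools and the pool group allowed to serve matching requests
  psu create pool pool1
  psu create pool pool2
  psu create pgroup cmsPools
  psu addto pgroup cmsPools pool1
  psu addto pgroup cmsPools pool2

  # Link: matching requests may be served by this pool group,
  # with separate preferences for read, write and cache (staging) access
  psu create link cms-link cms-store farm-net
  psu add link cms-link cmsPools
  psu set link cms-link -readpref=10 -writepref=10 -cachepref=10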

Different Queues for different Protocols, because typical IO differs between protocols. (Courtesy: P. Fuhrmann)
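
A hedged sketch of what this looks like administratively: per-queue mover limits on a pool, and doors pointed at a named queue. Exact commands and property names vary across dCache versions, so the queue names (wan, lan) and the door settings below are illustrative assumptions:

  # On a pool cell, via the admin interface: separate mover limits per queue
  mover set max active 6 -queue=wan     # a few fat WAN (gftp) streams
  mover set max active 100 -queue=lan   # many lightweight LAN (dcap) reads
  save

  # In the site configuration, doors would then be told which queue to use,
  # e.g. something like:  gsiftpIoQueue=wan  and  dcapIoQueue=lan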

Advanced Topics
- Private/public network boundaries
- Replica Manager
- Implicit space reservation
- Overwriting files
- UIDs and file access management

All pools on a public network
- Each pool may dynamically instantiate a gftp server as needed.
- No need for SRM to pick the gftp door, because it is determined by the pool selection mechanism alone.

All pools on a private network
- Need dual-homed gftp doors as “application-level proxy servers”.
- SRM selects the gftp door independently of pool selection.
- The gftp door flows data from the WAN into a memory buffer, and from the memory buffer over the LAN onto the pool, or vice versa.
- 3rd-party Xfer between sites with private pools is not possible!
- You design your IO by the number of gftp servers you provision.

Replica Manager
- Wants to sit on its own hardware!
- Allows specification of the number of replicas of files within boundaries: more performance, more robustness against failure.
- Guarantees the replica count via lazy replication.
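
A hedged sketch of the kind of configuration this implies; the key names below are illustrative assumptions only, since the actual syntax differs between dCache releases (check “The Book” for the version in use):

  # Illustrative only: run the Replica Manager and keep 2-3 copies of each file
  replicaManager=yes
  replicaMin=2
  replicaMax=3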

Implicit Space Reservation
- SRM guarantees that only pools with enough space for the file being written are selected.
- This feature is turned on by default.

Overwriting of files
- You can NOT modify a file after it is closed!
- Reason: a file may be replicated, and dCache has no means to guarantee that all replicas are modified. Different replicas of a file may be accessed simultaneously by different clients. There is no “cache coherence” in dCache.
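
In practice, “changing” a file therefore means removing the old entry and writing a new one. A minimal sketch, assuming a writable NFS-mounted PNFS namespace under a hypothetical /pnfs/example.edu path and a dcap door:

  # Remove the old logical entry (its pool replicas are cleaned up later)
  rm /pnfs/example.edu/data/cms/calib/constants.root

  # Write the new content as a fresh file
  dccp /home/user/constants.root \
      dcap://dcap-door.example.edu:22125/pnfs/example.edu/data/cms/calib/constants.root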

UIDs and file access
- The UID space in dCache and on the compute cluster does not have to be the same!
- Instead, it is possible to require all accesses to be “cert”-based (x509 or krb5), and thus have an arbitrary mapping between dCache and the compute cluster!!!
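
For x509-based access this boils down to mapping certificate DNs to dCache-side accounts. A hedged sketch of a standard grid-mapfile entry (the DN and account name are invented); GSI-enabled doors such as gsidcap and gsiftp then authorize against a mapping like this, with the details (e.g. dcache.kpwd entries) depending on the setup:

  # /etc/grid-security/grid-mapfile
  # "certificate subject DN"                         account used inside dCache
  "/DC=org/DC=example/OU=People/CN=Jane Physicist"   cmsprod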

Questions ?

Two example deployments

UCSD: 38 TB across 70+ pool nodes, LAN & WAN queue per pool.
- Core node: LM, PM, Spy, httpd, billing
- Dcap node: 5 dcap doors
- SRM node
- PNFS node
- RM node: RM, admin door
- 6 gftp nodes

FNAL: 110 TB across 23 pool nodes with 2 Gbps each, LAN & WAN queue per pool.
- Core node: LM, PM, HSM
- Dcap node: 3 dcap doors
- 2nd dcap node: 4 dcap doors, 1 gftp door
- SRM node
- PNFS node
- RM node: RM, 1 gftp door
- InfoSys node: billing, Spy, httpd, infoProvider
- Management node: CM, dcap, gftp