The lightweight Grid-enabled Disk Pool Manager (DPM)

Slides:



Advertisements
Similar presentations
30-31 Jan 2003J G Jensen, RAL/WP5 Storage Elephant Grid Access to Mass Storage.
Advertisements

Data Management Expert Panel. RLS Globus-EDG Replica Location Service u Joint Design in the form of the Giggle architecture u Reference Implementation.
DPM Name Server (DPNS) Namespace Authorization Location of physical files DPM Server Requests queuing and processing Space Management SRM Servers v1.1,
EGEE is a project funded by the European Union under contract IST Using SRM: DPM and dCache G.Donvito,V.Spinoso INFN Bari
Storage: Futures Flavia Donno CERN/IT WLCG Grid Deployment Board, CERN 8 October 2008.
GGF Toronto Spitfire A Relational DB Service for the Grid Peter Z. Kunszt European DataGrid Data Management CERN Database Group.
The LCG File Catalog (LFC) Jean-Philippe Baud – Sophie Lemaitre IT-GD, CERN May 2005.
SRM at Clemson Michael Fenn. What is a Storage Element? Provides grid-accessible storage space. Is accessible to applications running on OSG through either.
DPM CCRC - 1 Research and developments DPM status and plans Jean-Philippe Baud.
LFC tutorial Jean-Philippe Baud, IT-GT, CERN July 2010.
StoRM Some basics and a comparison with DPM Wahid Bhimji University of Edinburgh GridPP Storage Workshop 31-Mar-101Wahid Bhimji – StoRM.
EGEE-III INFSO-RI Enabling Grids for E-sciencE The Medical Data Manager : the components Johan Montagnat, Romain Texier, Tristan.
Data Management The GSM-WG Perspective. Background SRM is the Storage Resource Manager A Control protocol for Mass Storage Systems Standard protocol:
The LCG File Catalog (LFC) Jean-Philippe Baud – Sophie Lemaitre IT-GD, CERN May 2005.
D C a c h e Michael Ernst Patrick Fuhrmann Tigran Mkrtchyan d C a c h e M. Ernst, P. Fuhrmann, T. Mkrtchyan Chep 2003 Chep2003 UCSD, California.
Introduction to dCache Zhenping (Jane) Liu ATLAS Computing Facility, Physics Department Brookhaven National Lab 09/12 – 09/13, 2005 USATLAS Tier-1 & Tier-2.
Author - Title- Date - n° 1 Partner Logo EU DataGrid, Work Package 5 The Storage Element.
Author - Title- Date - n° 1 Partner Logo WP5 Summary Paris John Gordon WP5 6th March 2002.
INFSO-RI Enabling Grids for E-sciencE DPM Administration Jean-Philippe Baud (Sophie Lemaitre)
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE middleware: gLite Data Management EGEE Tutorial 23rd APAN Meeting, Manila Jan.
Enabling Grids for E-sciencE Introduction Data Management Jan Just Keijser Nikhef Grid Tutorial, November 2008.
Light weight Disk Pool Manager experience and future plans Jean-Philippe Baud, IT-GD, CERN September 2005.
INFSO-RI Enabling Grids for E-sciencE gLite Data Management and Interoperability Peter Kunszt (JRA1 DM Cluster) 2 nd EGEE Conference,
INFSO-RI Enabling Grids for E-sciencE Experiences with LFC and comparison with RNS Erwin Laure Jean-Philippe.
CERN SRM Development Benjamin Coutourier Shaun de Witt CHEP06 - Mumbai.
Jens G Jensen RAL, EDG WP5 Storage Element Overview DataGrid Project Conference Heidelberg, 26 Sep-01 Oct 2003.
Managing Data DIRAC Project. Outline  Data management components  Storage Elements  File Catalogs  DIRAC conventions for user data  Data operation.
INFSO-RI Enabling Grids for E-sciencE Introduction Data Management Ron Trompert SARA Grid Tutorial, September 2007.
Enabling Grids for E-sciencE EGEE-II INFSO-RI Medical Data Manager 1 Dicom retrieval : overview of the DPM One command line to retrieve a file:
Database authentication in CORAL and COOL Database authentication in CORAL and COOL Giacomo Govi Giacomo Govi CERN IT/PSS CERN IT/PSS On behalf of the.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Data management in LCG and EGEE David Smith.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks gLite Data Management Components Presenter.
INFSO-RI Enabling Grids for E-sciencE SRMv2.2 in DPM Sophie Lemaitre Jean-Philippe.
SESEC Storage Element (In)Security hepsysman, RAL 0-1 July 2009 Jens Jensen.
SRM-iRODS Interface Development WeiLong UENG Academia Sinica Grid Computing 1.
Site Authorization Service Local Resource Authorization Service (VOX Project) Vijay Sekhri Tanya Levshina Fermilab.
EGEE-II INFSO-RI Enabling Grids for E-sciencE Architecture of LHC File Catalog Valeria Ardizzone INFN Catania – EGEE-II NA3/NA4.
CMS User Support and Beijing Site Xiaomei Zhang CMS IHEP Group Meeting March
Bologna, March 30, 2006 Riccardo Zappi / Luca Magnoni INFN-CNAF, Bologna.
Introduction to Storage Element Hsin-Wei Wu Academia Sinica Grid Computing Center, Taiwan.
Security recommendations DPM Jean-Philippe Baud CERN/IT.
Riccardo Zappi INFN-CNAF SRM Breakout session. February 28, 2012 Ingredients 1. Basic ingredients (Fabric & Conn. level) 2. (Grid) Middleware ingredients.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks SE Security Rémi Mollon, Ákos Frohner EGEE'08,
Data Management & Information Systems
Enabling Grids for E-sciencE EGEE-II INFSO-RI The Development of SRM interface for SRB Fu-Ming Tsai Academia Sinica Grid Computing.
Enabling Grids for E-sciencE INFSO-RI Virtual Ids and VOMS integration DPM supports virual Ids and VOMS : –each user/group is internally mapped.
INFSO-RI Enabling Grids for E-sciencE Security needs in the Medical Data Manager EGEE MWSG, March 7-8 th, 2006 Ákos Frohner on behalf.
EGEE Data Management Services
a brief summary for users
Jean-Philippe Baud, IT-GD, CERN November 2007
Classic Storage Element
StoRM: a SRM solution for disk based storage systems
Status of the SRM 2.2 MoU extension
Hints for DPM Administration
Troubleshooting su Installazione SE [DPM]
Cross-health enterprises Medical Data Management on the EGEE grid
Medical Data Manager use case: 3D medical images analysis workflow.
GGF OGSA-WG, Data Use Cases Peter Kunszt Middleware Activity, Data Management Cluster EGEE is a project funded by the European.
Jean-Philippe Baud - Sophie Lemaitre IT-GD, CERN May 2005
StoRM Architecture and Daemons
Introduction to Data Management in EGI
SRM Developers' Response to Enhancement Requests
Encrypted Data Store, Hydra & Delegation Interface
Hands-On Session: Data Management
Ákos Frohner EGEE'08 September 2008
Data Management cluster summary
Grid Security M. Jouvin / C. Loomis (LAL-Orsay)
Data services in gLite “s” gLite and LCG.
Architecture of the gLite Data Management System
INFNGRID Workshop – Bari, Italy, October 2004
Presentation transcript:

The lightweight Grid-enabled Disk Pool Manager (DPM) Sophie Lemaitre – Jean-Philippe Baud EGEE-OSG workshop 25 June 2007

DPM architecture SRMv2.2 VOMS and virtual ids What’s next ? Issues Agenda DPM architecture SRMv2.2 VOMS and virtual ids What’s next ? Issues

DPM architecture

Functionality offered Management of disk space on geographically distributed disk servers Management of name space (including ACLs) Control interfaces socket, SRM v1.0, SRM v2.1, SRM v2.2 (no srmCopy) Data access protocols secure RFIO, gsiFTP, HTTPS, and to come HTTP

SRM-enabled client, etc. DPM architecture /dpm /domain /home CLI, C API, SRM-enabled client, etc. /vo DPM head node file DPM Name Server Namespace Authorization Physical files location DPM Server Requests queuing and processing Space management SRM Servers (v1.1, v2.1, v2.2) Disk Servers Physical files Direct data transfer from/to disk server (no bottleneck) data transfer … DPM disk servers

DPM administration Feedback from DPM administrators Intuitive commands “Easy to install and configure” “It works for us !” “Our DPM has been running untouched for months” “Very good online documentation” Intuitive commands As similar to UNIX commands as possible Ex: dpns-ls, dpns-mkdir, dpns-getacl, etc. DPM architecture is database centric No configuration file Support for MySQL and Oracle Scalability All servers (except the DPM one) can be replicated if needed (DNS load balancing)

Platforms Supported platforms From next release onwards SL(C)3 SL(C)4 MAC OS X From next release onwards GridFTP 2 instead of GridFTP 1 GridFTP 2 plugin Allowed to have a cleaner implementation Much simpler than GridFTP 1 to interface to

SRMv2.2

What’s new ? SRMv2.2 5 new method types Biggest effort of last year Required significant changes in DPM server code 5 new method types Space reservation srmReserveSpace, srmReleaseSpace, … Namespace operations srmMkdir, srmLs, … Permissions and ACLs srmSetPermission, srmGetPermission, … Transfer functions srmPrepareToPut, srmPerpareToGet, … Admin functions srmPing

What’s new ? Retention policies Access latency File storage type Given quality of disks, admin defines quality of service Replica, Output, Custodial Access latency Online, Nearline Nearline will be used for BIOMED DICOM integration File storage type Volatile, Permanent File pinning Extend TURL lifetime (srmPrepareToGet, srmPrepareToPut) Extend file lifetime in space (srmBringOnline)

Space reservation Static space reservation (admin) $ dpm-reservespace --gspace 20G --lifetime Inf --group atlas --token_desc Atlas_ESD $ dpm-reservespace --gspace 100M --lifetime 1h --group dteam/Role=lcgadmin --token_desc LcgAd $ dpm-updatespace --token_desc myspace --gspace 5G $ dpm-releasespace --token_desc myspace Dynamic space reservation (user) Defined by user on request dpm-reservespace srmReserveSpace Limitation on duration and size of space reserved

VOMS & Virtual Ids

How to support VOMS ? Lightweight VOMS handling in DPM Check that VOMS proxy signature comes from a trusted host For scalability reasons, we didn’t want to contact another server for every authorization Why virtual ids ? We didn’t want to use local users / groups That admins would need to create a priori DPM instead uses virtual ids Stored in the DPM Name Server database Created automatically when user first connects with a valid proxy

DPM virtual ids Each user’s DN Each VOMS group, each VOMS role Name Server (uid1, gid1) Each user’s DN Is mapped to a unique virtual uid Each VOMS group, each VOMS role Is mapped to a unique virtual gid Virtual uids / gids are created automatically the first time a given user / group contacts the DPM

Virtual uids mapping (example) Virtual gids mapping (example) DPM virtual ids DPM Name Server (uid1, gid1) Ex: (102, 101) DB Virtual uids mapping (example) /C=CH/O=CERN/OU=GRID/CN=Sophie Lemaitre 2268 101 /C=CH/O=CERN/OU=GRID/CN=Simone Campana 7461 102 Virtual gids mapping (example) $ grid-proxy-init $ voms-proxy-init --vo atlas Simone will be mapped to (uid, gid) = (102, 101) atlas 101 atlas/Role=lcgadmin 102 atlas/Role=production 103

Virtual uids mapping (example) Virtual gids mapping (example) DPM secondary groups DPM Name Server (uid1, gid1) Ex: (102, 103, 101) DB Virtual uids mapping (example) /C=CH/O=CERN/OU=GRID/CN=Sophie Lemaitre 2268 101 /C=CH/O=CERN/OU=GRID/CN=Simone Campana 7461 102 $ voms-proxy-init –voms atlas:/atlas/Role=production Simone will be mapped to (uid, gid, …) = (102, 103, 101) Simone still belongs to “atlas” Virtual gids mapping (example) atlas 101 atlas/Role=lcgadmin 102 atlas/Role=production 103

ACLs on files DPM supports Posix ACLs based on Virtual Ids Example Access Control Lists on files and directories Default Access Control Lists on directories: they are inherited by the sub-directories and files under the directory Example dpns-mkdir /dpm/cern.ch/home/dteam/jpb dpns-setacl -m d:u::7,d:g::7,d:o:5 /dpm/cern.ch/home/dteam/jpb dpns-getacl /dpm/cern.ch/home/dteam/jpb # file: /dpm/cern.ch/home/dteam/jpb # owner: /C=CH/O=CERN/OU=GRID/CN=Jean-Philippe Baud 7183 # group: dteam user::rwx group::r-x #effective:r-x other::r-x default:user::rwx default:group::rwx default:other::r-x

ACLs on pools DPM terminology By default, pools are generic A DPM pool is a set of filesystems on DPM disk servers By default, pools are generic Possibility to dedicate a pool to several groups dpm-addpool --poolname poolA --group alice dpm-addpool --poolname poolB --group atlas,cms,lhcb Easy to add or remove groups dpm-modifypool --poolname poolA --group +atlas,-alice

Authorization models Follow the UNIX model Namespace: primary and secondary groups Space reservation: primary group only For disk space accounting (and quotas later) Who actually uses the space gets to pay the bill…

What’s next ?

What’s next ? Next release Short term (autumn 2007) DPM Name Server as local LFC Short term (autumn 2007) Quotas srmCopy daemon Medical data management Encryption DICOM backend Medium term (beginning 2008) NFSv4.1

Local LFC DPM Name Server Advantages Available in next release Can act as a local LFC (LCG File Catalog) Advantages Only one service to run instead of two (LFC + DPM) Transparent to the users Available in next release

DPM quotas DPM terminology Unix-like quotas A DPM pool is a set of filesystems on DPM disk servers Unix-like quotas Quotas are defined per disk pool Usage in a given pool is per DN and per VOMS FQAN Primary group gets charged for usage Quotas in a given pool can be defined/enabled per DN and/or per VOMS FQAN Quotas can be assigned by admin Default quotas can be assigned by admin and applied to new users/groups contacting the DPM

DPM quotas Unix-like quota interfaces User interface dpns-quota gives quota and usage information for a given user/group (restricted to the own user information) Administrator interface dpns-quotacheck to compute the current usage on an existing system dpns-repquota to list the usage and quota information for all users/groups dpns-setquota to set or change quotas for a given user/group

DPM with NFSv4.1 NFSv4.1 and DPM have similar architectures Separate metadata server Direct access to physical files Easy NFSv4.1 integration

Encrypted Storage Medical community as the principal user large amount of images are produced in DICOM privacy concerns vs. processing needs ease of use (image production and application)‏ Strong security requirements anonymity (patient data is separate)‏ fine grained access control privacy (even storage administrator cannot read) data is encrypted (DICOM-SE) and decrypted (client) in memory Hydra KeyStore AMGA metadata Hydra KeyStore Hydra KeyStore 2. keys 1. patient look-up 3.1.1 keys MDM = Medicam Data Management DICOM = Digital Image and Communication in Medicine grid SE = SRM + gridftp + I/O DICOM gridftp 3. get TURL 5. decrypt 3.1.2 image DICOM plug-in 3.1 get enc. image SRMv2 4. read enc. image I/O DICOM-SE

Issues

Issues DPM stable and reliable service but… No NFS support yet For several sites, reason for not moving from Classic SE to DPM Lack of experience with big sites Lack of internal monitoring Ex1: automatically disable a file system that is down Ex2: automatically limit the number of transfers to a disk server Different VO types (HEP, BIOMED, etc.) Need to develop different features for different needs

Number of Storage Element instances published in EGEE top BDII Summary DPM service Manages space on distributed disks Easy to configure and administer Easy and transparent to use Stable and reliable Grid service Widely deployed 125 DPM instances in EGEE 138 VOs supported Short term Quotas NFSv4 support Number of Storage Element instances published in EGEE top BDII

Help ? DPM online documentation Support General questions https://twiki.cern.ch/twiki/bin/view/LCG/DataManagementDocumentation Support helpdesk@ggus.org General questions hep-service-dpm@cern.ch