Andrew McNab - Manchester HEP - 29 January 2002 SlashGrid (“/grid”) Motivation: dynamic-accounts issues Local storage: implementation alternatives Generalisation:

Presentation transcript:

Andrew McNab - Manchester HEP - 29 January 2002
SlashGrid (“/grid”): a framework for Grid-aware filesystems
Motivation: dynamic-accounts issues
Local storage: implementation alternatives
Generalisation: remote file access
Implementation: Coda, ACLs, plugins
Current status
Future work

Motivation: dynamic accounts
For TB1 we provided a patch for the Globus gatekeeper, gsi_wuftpd etc. to associate Unix UIDs from a pool with the Grid DN identities of incoming requests.
This is fine when all that jobs do on the machine in question is computation.
But (1) any files created by a pool UID need to be cleaned up before the account can be reallocated.
But (2) it is no good for long-term storage, since there is no promise to maintain the UID-DN association in the long term.
But (3) what if a malicious user creates a cron entry, writes to some obscure writeable directory we didn’t think of, etc.?
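The pool-account idea and problem (1) can be illustrated with a toy lease table. This is only a sketch under stated assumptions: the class name, method names and in-memory storage are hypothetical and are not taken from the actual gatekeeper patch, whose mechanism is not described here.

```python
class AccountPool:
    """Toy in-memory lease table mapping Grid DNs to pool UIDs (hypothetical)."""

    def __init__(self, uids):
        self.free = list(uids)   # pool UIDs not currently leased
        self.leases = {}         # DN -> leased UID

    def uid_for(self, dn):
        # Reuse the existing lease if this DN already holds one.
        if dn in self.leases:
            return self.leases[dn]
        if not self.free:
            raise RuntimeError("no free pool accounts")
        uid = self.free.pop(0)
        self.leases[dn] = uid
        return uid

    def release(self, dn, cleanup):
        # Problem (1): everything owned by the UID has to be cleaned up
        # before the account can be handed to a different DN.
        uid = self.leases.pop(dn)
        cleanup(uid)
        self.free.append(uid)
```

Once release() has run, nothing ties the old DN to the UID any more, which is exactly why problem (2) arises for files left on disk.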

Solution: get away from UID filesystems
All these problems arise fundamentally because files are owned according to UID, but we want the UID to have no long-term meaning.
The obvious solution is a filesystem where file ownership depends on Grid DNs, not temporary UIDs.
– Can then ban user processes from writing anywhere else (straightforward to impose with a modified ext2 device driver: eg no disk files can be created if UID > 99)
UID becomes as transitory as a Process Group ID.
The problem now becomes: how to implement a DN/Grid-aware filesystem?

Implementation alternatives
1) Fake a filesystem by making the user process use modified versions of the open(), read() etc. system calls.
– Can do this by relinking, or by an interposition / bypass library that is preloaded before the real, shared libc.
– But this cannot enforce access restrictions on files accessible on local disk (since you can use a static binary and ignore permissions).
– Need to put the filesystem behind a server, accessed via TCP ports, a named pipe, or shared memory (all the usual X tricks). This is going to be slow for streaming large files: the very thing we need to be fast.
2) Put the filesystem into the kernel.
– Lets the kernel enforce access control. Potentially as fast as normal disk.
– A user-space daemon is useful to parse proxies and do any remote IO.

Coda
A suitable kernel module already exists for Linux: Coda
– Introduced into the main kernel tree in 1997 (during 2.1) and present in all 2.2 and 2.4 stable kernels.
This is part of the Coda project at CMU, an open source fork of AFS2.
Very similar architecture to AFS
– Kernel module and client-side cache daemon (Venus)
– Kerberos based
Already used “parasitically” by other Linux projects
– eg AVFS maps files to virtual filesystems (eg cd into a tar file…)
Coda kernel module / Venus also available for *BSD and Windows 98/NT upwards.

Implementation with Coda
The Coda kernel module talks to the client cache daemon by exchanging messages via /dev/cfs0
Since we already have the kernel module, we just need to write a Venus-like daemon: SlashGrid (“/grid”)
The Coda implementation allows efficient streaming:
– open(), close(), stat() handled by calls to the Venus/SlashGrid daemon
– coda_open call returns the inode of the cached copy to the kernel
– subsequent read() and write() operations handled by the kernel itself, without the daemon being involved.
– So streaming a local copy is just as fast as reading/writing a normal disk file.
Since SlashGrid is called for open()s etc., it can enforce DN-based access control at that point.
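The division of labour described above (the daemon decides at open time, the kernel does the I/O afterwards) can be sketched as follows. This is an illustration only: the function names and the is_allowed callback are hypothetical, the real daemon talks to the kernel through Coda messages rather than Python calls, and only the /grid and /var/spool/slashgrid/fcache paths come from the slides.

```python
import posixpath

MOUNT_POINT = "/grid"
CACHE_ROOT = "/var/spool/slashgrid/fcache"   # cache directory named in the slides

def cache_path(grid_path):
    """Map a path under /grid to its backing file in the cache directory."""
    norm = posixpath.normpath(grid_path)
    if norm != MOUNT_POINT and not norm.startswith(MOUNT_POINT + "/"):
        raise ValueError("not under " + MOUNT_POINT)
    rel = norm[len(MOUNT_POINT):].lstrip("/")
    return posixpath.join(CACHE_ROOT, rel) if rel else CACHE_ROOT

def handle_open(dn, grid_path, is_allowed):
    """Decide one open() request; later read()/write() never reach this code."""
    if not is_allowed(dn, grid_path):
        raise PermissionError(grid_path)
    # The real coda_open reply carries the inode of this cached copy.
    return cache_path(grid_path)
```

Because the check happens once per open(), its cost does not grow with the size of the file being streamed.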

System calls with SlashGrid
[Diagram. Labels: Standard Unix user process; system calls open(), read(), write(), stat(); kernel; SlashGrid; /dev/cfs0; /var/spool/slashgrid/fcache; ordinary directory; /grid/...; a real (ext2) disk.]

Remote file access
Another idea that has been around a while: an AFS-like system using Grid protocols.
All the usual advantages of a global filesystem
– Makes a lot of the tedious management of “parameter” files needed by jobs just another operating system service.
– Very useful for interactive users: they just see the Grid as one big file system.
– Makes all applications (even ls) Grid-enabled immediately.
Already using URLs to refer to remote files, so easy to find an appropriate mapping into a filesystem space.
So we want to design a system that can be generalised to remote file access too.

ACL format
Need to specify permissions in some way.
A commonly used compromise between granularity and simplicity is the per-directory ACL (cf AFS)
We’ve used the same format as the GridSite website management system (used for WP6 and GridPP websites):
– admin: can modify ACL
– write: can write/create files
– list: can get a directory listing
– read: can read a named file
– ACL consists of lines:
Currently only implement but in future will add VO groups, CAS authorisation symbols etc (when dust settles...)
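A per-directory permission check over the four levels listed above might look like the sketch below. Two things are assumptions rather than facts from the slides: that the levels form an ordered hierarchy (read weakest, admin strongest), and that an ACL can be modelled as a plain mapping from DN to level. The on-disk line format is not reproduced here, so no parser is attempted.

```python
# Assumed ordering, weakest first; the slides list the levels but not their order.
LEVELS = ["read", "list", "write", "admin"]

def permits(acl, dn, wanted):
    """Return True if dn's granted level in this directory covers `wanted`.

    acl is a hypothetical in-memory form: a dict mapping DN -> granted level.
    """
    granted = acl.get(dn)
    if granted is None:
        return False          # DNs not mentioned in the ACL get nothing
    return LEVELS.index(granted) >= LEVELS.index(wanted)
```

With a per-directory ACL, one such lookup covers every file in the directory, which is the granularity/simplicity trade-off mentioned above.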

ACL implementation
Each directory has, or appears to have, a read-only file .grid-acl consisting of ACL lines in format.
Can easily be transferred via existing protocols
– eg if the cache daemon fetches a file from a remote gsi-ftp server, it can fetch the .grid-acl from the same directory without modifying gsi_wuftpd or the GridFTP protocol.
Modification of the ACL is done by accessing “virtual files”: these operations are trapped by SlashGrid and the ACL is updated
– cf. Coda’s .CONTROL mechanism
– eg remove file .grid-acl-write-%url-encoded-DN% to change the DN’s permission level to write
Provide command line tools to hide this from users
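The virtual-file naming scheme can be sketched as a pair of helpers that build and recognise such names. This assumes the name is the literal prefix .grid-acl-, then the level, then a hyphen, then the percent-encoded DN; that reading of %url-encoded-DN% is an interpretation, and the function names are hypothetical.

```python
from urllib.parse import quote, unquote

PREFIX = ".grid-acl-"
LEVELS = ("admin", "write", "list", "read")

def encode_request(level, dn):
    """Build the virtual file name a command line tool would operate on."""
    return PREFIX + level + "-" + quote(dn, safe="")

def decode_request(filename):
    """Return (level, dn) if filename is an ACL request, otherwise None."""
    if not filename.startswith(PREFIX):
        return None
    # Split on the first hyphen only: an encoded DN may itself contain hyphens.
    level, sep, encoded = filename[len(PREFIX):].partition("-")
    if not sep or level not in LEVELS:
        return None
    return level, unquote(encoded)
```

A daemon trapping the operation would call something like decode_request() on the file name and treat any None result as an ordinary file.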

Plugin framework
Avoid making a monolithic system since:
– Lots of interesting filesystems possible: anon ftp, http, https, gsi-ftp, rfio, ldap, SQL databases (cf. Oracle 8i) …
– Lots of uncertainty about which caching strategies to use.
– Some people will want some but not all of this on their systems.
Have /etc/slashgrid.conf that specifies mount points and then which loadable module handles which part of the file system (cf. /etc/fstab)
At start time, load dynamic modules which all export a common API.
SlashGrid daemon hands each request to the right plugin
– user: stat() => coda_getattr => PluginStat() => plugin: stat()
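Handing each request to the right plugin amounts to a longest-prefix match on mount points. The sketch below assumes that matching rule and uses hypothetical class names; the real plugins are loadable modules exporting a common API, whose exact signatures are not given in the slides.

```python
class EchoPlugin:
    """Stand-in plugin that just reports which plugin handled the call."""

    def __init__(self, name):
        self.name = name

    def stat(self, path):
        return (self.name, path)


class Dispatcher:
    def __init__(self):
        self.mounts = {}   # mount point (relative to /grid) -> plugin object

    def register(self, mount, plugin):
        self.mounts[mount.rstrip("/") or "/"] = plugin

    def find(self, path):
        """Return the plugin whose mount point is the longest prefix of path."""
        best = None
        for mount in self.mounts:
            prefix = "" if mount == "/" else mount
            if path == mount or path.startswith(prefix + "/"):
                if best is None or len(mount) > len(best):
                    best = mount
        if best is None:
            raise LookupError(path)
        return self.mounts[best]

    def stat(self, path):
        # Mirrors the chain: stat() => coda_getattr => PluginStat() => plugin
        return self.find(path).stat(path)
```

Matching on whole path components means a name such as /gsiftpx falls through to the root plugin rather than being claimed by /gsiftp.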

Example configuration
/etc/slashgrid.conf
    [/]
    plugin=certfs.so
    [/gsiftp]
    plugin=gsiftpfs.so
/grid - mount point for Coda kernel module fs
/var/spool/slashgrid/fcache/ => /grid/
/var/spool/slashgrid/fcache/tmp/ => /grid/tmp/
/var/spool/slashgrid/fcache/gsiftp/ => /grid/gsiftp/
/usr/lib/slashgrid/plugins/certfs.so, gsiftpfs.so...
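The example file happens to look like an INI file, so its mount-point table can be read with Python's standard configparser, as sketched below. Treating the format as INI is an assumption made for illustration; the real daemon's parser and any further keys it accepts are not described in the slides.

```python
import configparser

# The two sections shown in the example configuration.
EXAMPLE = """\
[/]
plugin=certfs.so
[/gsiftp]
plugin=gsiftpfs.so
"""

def parse_mounts(text):
    """Return a dict mapping each mount point to its plugin file name."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return {section: parser[section]["plugin"] for section in parser.sections()}
```

Calling parse_mounts(EXAMPLE) yields one entry per bracketed section, keyed by the mount point.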

Remote file access strategies
SlashGrid framework allows several options: none is “the best”
simplest: make a local copy when the coda_open call is received, and return the copy’s inode when the transfer finishes
– ok for small files
– awful for very big files: need lots of disk cache and have to wait
pure streaming: plugin forks a process to stream the file from the remote server; makes a temporary named pipe and returns its inode to the kernel; writes the incoming file to the pipe; kernel (and therefore user) reads the file as it comes in; tidies up the pipe when coda_close is received.
– good when we have a copy on a “close” file server (cf. NFS)
both: stream the file down a named pipe, but keep a copy too.
Writing is even more complicated: when to transfer the local write-cache?
– do we need consistency for different machines viewing the same server?
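The pure streaming strategy can be imitated in user space with a named pipe, as in the Unix-only sketch below. A thread stands in for the forked fetch process and an in-memory list of chunks stands in for the remote server; both substitutions, and the function name, are assumptions made to keep the example self-contained.

```python
import os
import tempfile
import threading

def stream_through_fifo(chunks):
    """Feed chunks into a named pipe from a thread; return what a reader sees."""
    workdir = tempfile.mkdtemp()
    fifo = os.path.join(workdir, "stream")
    os.mkfifo(fifo)

    def producer():
        # Stands in for the forked process fetching from the remote server.
        with open(fifo, "wb") as pipe:
            for chunk in chunks:
                pipe.write(chunk)

    worker = threading.Thread(target=producer)
    worker.start()
    try:
        # Stands in for the user process reading the file as it arrives.
        with open(fifo, "rb") as pipe:
            data = pipe.read()
    finally:
        worker.join()
        os.remove(fifo)    # tidy up the pipe, as on coda_close
        os.rmdir(workdir)
    return data
```

Nothing is kept on disk here; the "both" strategy would additionally have the producer write each chunk to a cache file.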

Current status
Have implemented the SlashGrid daemon and one plugin to provide local file storage with ACLs (certfs.so)
SlashGrid obtains the DN of a UID from /tmp/x509up_uUID
– so you do grid-proxy-init to get started
stat / read / creat / mkdir / write / remove / rename / chmod system calls working for files and directories
Can already do normal shell commands (ls etc), edit files with emacs, even copy the SlashGrid and certfs sources into the filesystem and build them with make and gcc.
Some things not yet done
– hard and soft links (means I can’t try building a Linux kernel yet…)
– modifying ACLs: these still have to be set manually as root

Future work
Finish certfs and ACL tools
Implement an example remote IO plugin
– probably anonymous ftp since simplest
Document the plugin API
– Encourage other people to write plugins for things they need.
Write plugins for the major protocols: gsi-ftp and https
Investigate specialised filesystems for dynamic accounts, automated cleanup, extra logging / auditing,...
Look at porting to other OSs:
– Coda kernel module exists for *BSD and Windows already
– The Linux Coda module was only 4000 lines of C...

Conclusion
Have implemented a read/write filesystem for Linux, based on Grid DNs rather than Unix UIDs.
Have done this in an extensible way using plugins for different filesystem types.
Should be straightforward to write a plugin for your favourite remote file access protocol.
System is efficient for streaming local copies of files
– But can still accommodate many different strategies for fetching, caching and streaming files from remote servers.
(Thanks to Anders, Cal and Fabio of Integration Team for useful discussions about all these issues.)

More information... –(now) –(later today) WP6 CVS repository –(later this week)