Host and Callback Tracking in OpenAFS Jeffrey Altman, Secure Endpoints, Inc Derrick Brashear, Sine Nomine Associates.

AFS concepts Cell: The unit of administrative control for AFS filesystems (an organization or part of one). Volume: A relocatable (path- and storage-wise) piece of the AFS filesystem. FID: A volume, vnode (file or directory) and uniquifier (an incrementing version) which corresponds uniquely to one revision of one object. A FID has no cell identifier.
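A minimal sketch of the FID as described above, to make the "no cell identifier" point concrete. The class and function names (`Fid`, `cache_key`) are illustrative assumptions, not OpenAFS identifiers.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Fid:
    """Hypothetical model of a FID: volume, vnode and uniquifier."""
    volume: int
    vnode: int
    uniquifier: int


def cache_key(cell: str, fid: Fid) -> Tuple[str, Fid]:
    # A FID carries no cell identifier, so anything that handles objects
    # from more than one cell has to pair the FID with the cell itself.
    return (cell, fid)
```

Two FIDs compare equal only when all three fields match, and the same FID under two different cells yields two different keys.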

AFS concepts UUID: a (theoretically) globally unique identifier on each client and server. Used to track when we’re talking to the same machine at a different address. “Multi” requests: allow us to make a request to a list of addresses all at once.

10 mile view of AFS AFS uses UDP as its data transport. This means there is no concept of “connected”. The fileserver must track the client so that as the client moves, or goes away, status can be maintained or purged as appropriate. Clients may have many addresses (host and port pairs), treated equivalently when requests are handled. Exactly one, the “callback connection” address, will be used for messages which originate at the server.

10 mile view of AFS Client-server architecture. Clients work like a traditional network FS, except they cache. In order to maintain cache coherency, help from the server is needed.

100 foot view of AFS To avoid unnecessary network traffic, the fileserver tracks which clients are interested in what objects, and offers a callback, which is a coherency guarantee for a given duration. If the object changes before that time expires, the client is notified. This is a “callback break”. It will be sent via the callback connection. A client can at break or expiration re-obtain a callback. Data is re-fetched only if needed.

6 inch view of AFS Callbacks are granted on a sliding time scale. More clients interested in a file means shorter callbacks. This is probably dumb; it should be based upon the likelihood of change. For .readonly volumes, callbacks are tracked at the volume level.

Callback “buckets” Since we lower the duration of a callback as more clients become interested, we keep track of FIDs in “buckets”. When too many people for the given bucket care, it goes in the next bucket. Buckets are 4 hours (up to 7 users), 1 hour (up to 15 users), and descend to 7 minutes above 63 users. Volume callbacks have a 2 hour duration.
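The bucket rule above can be sketched as a simple lookup. The 4 hour, 1 hour, 7 minute and 2 hour figures come from the slide; the two intermediate tiers (16 to 31 and 32 to 63 users) are not given there, so the values used for them below are placeholders assumed for illustration only. All names are hypothetical.

```python
# (maximum number of interested users, callback duration in seconds)
BUCKETS = [
    (7, 4 * 60 * 60),   # from the slide: 4 hours, up to 7 users
    (15, 1 * 60 * 60),  # from the slide: 1 hour, up to 15 users
    (31, 30 * 60),      # ASSUMED placeholder, not stated on the slide
    (63, 15 * 60),      # ASSUMED placeholder, not stated on the slide
]
FLOOR_SECONDS = 7 * 60              # from the slide: 7 minutes above 63 users
VOLUME_CALLBACK_SECONDS = 2 * 60 * 60  # from the slide: volume callbacks


def callback_duration(users: int) -> int:
    """Return the callback duration for a FID with `users` interested clients."""
    for max_users, seconds in BUCKETS:
        if users <= max_users:
            return seconds
    return FLOOR_SECONDS
```

When an eighth client becomes interested in a FID, the lookup moves it from the 4 hour bucket to the 1 hour bucket, which is the "goes in the next bucket" step.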

Breaking callbacks When a file is edited. When a directory is modified (including ACLs and owner/group/mode). When any object is unlocked. When a volume is released or restored. When we reach the number of callbacks the server is capable of tracking.

Host Tracking TellMeAboutYourself allows us to ask the client for its host, port, and capabilities lists, or no capabilities with WhoAreYou. The addresses we receive from the client cannot be trusted. Given NAT, it is likely that all of them are useless. We should be able to reply to the sender, though.

Enter the UUID ProbeUUID allows us to group host/port combinations which are all the same client. Remember it’s always safe to use the address which sent to us. Update the address list tracked with the UUID if this changes.
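A rough sketch of grouping addresses under a UUID, assuming the server already knows the UUID for an incoming request. `Host`, `HostTable` and `note_request` are hypothetical names and this is not the fileserver's actual data structure; it only models the two rules stated above: the sender's address is always safe to use, and the tracked address list is updated when it changes.

```python
from typing import Dict, Set, Tuple

Address = Tuple[str, int]  # (host, port) pair


class Host:
    """Hypothetical per-client record keyed by UUID."""

    def __init__(self, uuid: str, sender: Address) -> None:
        self.uuid = uuid
        self.addresses: Set[Address] = {sender}
        self.callback_addr: Address = sender


class HostTable:
    def __init__(self) -> None:
        self.by_uuid: Dict[str, Host] = {}

    def note_request(self, uuid: str, sender: Address) -> Host:
        """Record that the client with `uuid` just sent to us from `sender`."""
        host = self.by_uuid.get(uuid)
        if host is None:
            host = Host(uuid, sender)
            self.by_uuid[uuid] = host
            return host
        # Same client, possibly at a different address: remember the
        # address and prefer it, since it is the one that just reached us.
        host.addresses.add(sender)
        host.callback_addr = sender
        return host
```

A client that reappears from a new host and port keeps a single record, so state tied to that record does not have to be rebuilt.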

Timeouts No, you don’t have to sit in the corner. The common case is a 57 second delay, during which retransmits happen at slowing rates, until we decide the other end is gone. Ideally, state is tracked so the client never has to sit idle for this time.

Follow the callback connection The first reply to a multi break callback request. The first reply to a multi probe alternate address request. The primary address of a host when we remove the address previously in use for the connection. A new address when a client switches to it to talk to us.

Client Server Request TellMeAboutYourself UUID, interface list, capabilities Answer Creates a new client New client InitCallBackState

Client Server Request TellMeAboutYourself UUID, interface list, capabilities Answer Old client, new address ProbeUuid old address ? Drop old address

Client Server Request TellMeAboutYourself UUID, interface list, capabilities Answer Old client, new address, old address reused ProbeUuid old address Drop old address Other client

Host Tracking Background checks in the server poll clients to make sure they’re still around, and do the same basic operations. Older Windows clients don’t use UUIDs; Instead, tracking uses host and port pairs, only.

Callback tracking Clients have hashed linked lists of callbacks. This is keyed from expire time, not the FID. The server’s lists can be dumped by a signal: kill -XCPU (pid of fileserver) You then need “cbd” to read the list. What can be done to reduce the bottlenecks and improve performance?

Derailed Just because the client and server are online doesn’t mean everything will work. Many NATs aggressively expire port mappings. The client will need to establish a new one. The server then gets to discover it’s the same client. Callbacks thus become delayed.

Delayed callbacks Because a client can legitimately be unreachable for periods while not actually rebooting, we must track state while the client is gone. Enter “delayed callbacks”. The next time the client talks to us, “and by the way, these callbacks have been broken”.
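The delayed-callback idea can be sketched as follows. This is a toy model under stated assumptions: `Client`, `break_callback` and `on_contact` are hypothetical names, FIDs are plain tuples, and "sending" a break is modelled by appending to a list rather than by any real RPC.

```python
from typing import List, Set, Tuple

FidTuple = Tuple[int, int, int]  # (volume, vnode, uniquifier)


class Client:
    """Hypothetical server-side view of one client."""

    def __init__(self) -> None:
        self.online = True
        self.delayed: Set[FidTuple] = set()   # breaks waiting for next contact
        self.received: List[FidTuple] = []    # breaks the client has been told about


def break_callback(client: Client, fid: FidTuple) -> bool:
    """Break a callback now if possible; otherwise remember it as delayed."""
    if client.online:
        client.received.append(fid)
        return True
    client.delayed.add(fid)
    return False


def on_contact(client: Client) -> List[FidTuple]:
    """The client talked to us again: deliver every delayed break first."""
    client.online = True
    broken = sorted(client.delayed)
    client.delayed.clear()
    client.received.extend(broken)
    return broken
```

The point of the design is that an unreachable client costs the server a set entry rather than a blocked thread, and the client still learns about every change before it trusts its cache again.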

A special case Because a volume release or restore is dealt with by a single thread within the fileserver, all communications relating to that were done in that (“fssync”) thread. The thread could block for long periods waiting on down or offline clients, causing issues with consecutive releases.

Procrastination The fssync thread has an interface to mark a callback immediately void, but be broken “later”. Another thread looks for these callbacks and breaks them, freeing the fssync thread to do its job. “Later” callbacks become delayed just like any other callback.

BreakCallBack Host online: done. Host offline: becomes delayed, broken on first contact. (Diagram: server, online client, offline client.)

Questions? Jeff: endpoints.com Derrick: Mailing list: openafs- endpoints.com