Host and Callback Tracking in OpenAFS. Jeffrey Altman, Secure Endpoints, Inc.; Derrick Brashear, Sine Nomine Associates
AFS concepts Cell: The unit of administrative control for AFS filesystems. (An organization or part of one) Volume: A relocatable (path- and storage- wise) piece of the AFS filesystem. FID: A volume, vnode (file or directory) and uniquifier (an incrementing version) which corresponds uniquely to one revision of one object. A FID has no cell identifier.
AFS concepts UUID: a (theoretically) globally unique identifier on each client and server. Used to track when we’re talking to the same machine at a different address. “Multi” requests: allow us to make a request to a list of addresses all at once.
10 mile view of AFS AFS uses UDP as its data transport. This means there is no concept of “connected”. The fileserver must track each client so that, as the client moves or goes away, state can be maintained or purged as appropriate. Clients may have many addresses (host and port pairs), which are treated equivalently when requests are handled. Exactly one, the “callback connection” address, is used for messages which originate at the server.
10 mile view of AFS Client-server architecture. Clients work like a traditional network filesystem, except that they cache. To maintain cache coherency, help from the server is needed.
100 foot view of AFS To avoid unnecessary network traffic, the fileserver tracks which clients are interested in which objects, and offers each a callback, which is a coherency guarantee for a given duration. If the object changes before that time expires, the client is notified. This is a “callback break”. It is sent via the callback connection. At break or expiration, a client can re-obtain a callback. Data is re-fetched only if needed.
6 inch view of AFS Callbacks are granted on a sliding time scale. More clients interested in a file means shorter callbacks. This is probably dumb; the duration should be based upon the likelihood of change. For .readonly volumes, callbacks are tracked at the volume level.
Callback “buckets” Since we lower the duration of a callback as more clients become interested, we keep track of FIDs in “buckets”. When more clients care about a FID than its bucket allows, the FID moves to the next bucket. Buckets are 4 hours (up to 7 users), 1 hour (up to 15 users), and descend to 7 minutes above 63 users. Volume callbacks have a 2 hour duration.
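The bucket rule can be sketched as a lookup from the number of interested clients to a callback duration. This is only an illustration: the 4 hour, 1 hour, and 7 minute figures and the 7/15/63 thresholds come from the slide, while the two intermediate rows and all names (`BUCKETS`, `callback_duration`) are assumptions made up for the example.

```python
HOUR = 3600
MINUTE = 60

# (maximum interested clients, callback duration in seconds)
BUCKETS = [
    (7, 4 * HOUR),      # from the slide
    (15, 1 * HOUR),     # from the slide
    (31, 30 * MINUTE),  # assumed intermediate step, not stated in the slide
    (63, 15 * MINUTE),  # assumed intermediate step, not stated in the slide
]
FLOOR = 7 * MINUTE          # above 63 clients (from the slide)
VOLUME_CALLBACK = 2 * HOUR  # volume-level callbacks (from the slide)


def callback_duration(interested_clients: int) -> int:
    """Return the callback duration, in seconds, for one FID."""
    for limit, seconds in BUCKETS:
        if interested_clients <= limit:
            return seconds
    return FLOOR
```

A FID watched by 8 clients would get a 1 hour callback under this table; once a 64th client shows interest, new callbacks on it drop to 7 minutes.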
Breaking callbacks When a file is edited. When a directory is modified (including ACLs and owner/group/mode). When any object is unlocked. When a volume is released or restored. When we reach the number of callbacks the server is capable of tracking.
Host Tracking TellMeAboutYourself allows us to ask the client for its host and port list and its capabilities list; WhoAreYou asks for the same without capabilities. The addresses we receive from the client cannot be trusted. Given NAT, it is likely that all of them are useless. We should be able to reply to the sender, though.
Enter the UUID ProbeUUID allows us to group host/port combinations which all belong to the same client. Remember that it is always safe to use the address a request came from. Update the address list tracked with the UUID if this changes.
Timeouts No, you don’t have to sit in the corner. The common case is a 57 second delay, during which retransmits happen at slowing rates, until we decide the other end is gone. Ideally, state is tracked so that the client never has to sit idle for this long.
Follow the callback connection The callback connection moves to: the first address to reply to a multi break callback request; the first address to reply to a multi probe alternate address request; the primary address of a host, when we remove the address previously in use for the connection; a new address, when a client switches to it to talk to us.
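New client (exchange between client and server). Client to server: Request. Server to client: TellMeAboutYourself. Client to server: Answer (UUID, interface list, capabilities). Server creates a new client. Server to client: InitCallBackState.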
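Old client, new address (exchange between client and server). Client to server: Request. Server to client: TellMeAboutYourself. Client to server: Answer (UUID, interface list, capabilities). Server to old address: ProbeUuid. No answer (?). Server drops the old address.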
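Old client, new address, old address reused (exchange between client and server). Client to server: Request. Server to client: TellMeAboutYourself. Client to server: Answer (UUID, interface list, capabilities). Server to old address: ProbeUuid. Another client answers. Server drops the old address.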
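The three exchanges above can be sketched as one server-side table keyed by client UUID. This is a hypothetical illustration, not the fileserver's data structure: `HostTable`, `on_request`, and the `probe_uuid` hook are invented names, and `probe_uuid` merely stands in for a ProbeUuid-style call that returns the UUID answering at an address, or `None` when nothing replies.

```python
from typing import Callable, Dict, Optional, Set, Tuple

Address = Tuple[str, int]  # (host, port)


class HostTable:
    """Hypothetical table grouping addresses by client UUID."""

    def __init__(self, probe_uuid: Callable[[Address], Optional[str]]):
        self.probe_uuid = probe_uuid  # assumed hook, see lead-in
        self.addresses: Dict[str, Set[Address]] = {}
        self.callback_addr: Dict[str, Address] = {}

    def on_request(self, uuid: str, sender: Address) -> str:
        """Handle a request from `sender` whose client reported `uuid`."""
        if uuid not in self.addresses:
            # New client: a real server would now reset its callback state.
            self.addresses[uuid] = {sender}
            self.callback_addr[uuid] = sender
            return "new client"
        known = self.addresses[uuid]
        if sender in known:
            return "known address"
        # Old client, new address: probe each old address and drop it when
        # nothing answers or a different client answers there.
        for old in list(known):
            if self.probe_uuid(old) != uuid:
                known.discard(old)
        known.add(sender)
        # The address the client now uses becomes the callback connection.
        self.callback_addr[uuid] = sender
        return "new address"
```

The sender's address is always kept, which matches the rule that replying to the address a request came from is safe even when the client's self-reported interface list is not.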
Host Tracking Background checks in the server poll clients to make sure they’re still around, and do the same basic operations. Older Windows clients don’t use UUIDs; instead, tracking uses host and port pairs only.
Callback tracking Clients have hashed linked lists of callbacks. This is keyed from expire time, not the FID. The server’s lists can be dumped by a signal: kill -XCPU (pid of fileserver) You then need “cbd” to read the list. What can be done to reduce the bottlenecks and improve performance?
Derailed Just because the client and server are online doesn’t mean everything will work. Many NATs aggressively expire port mappings. The client will need to establish a new one. The server then has to discover that it’s the same client. Callbacks thus become delayed.
Delayed callbacks Because a client can legitimately be unreachable for periods while not actually rebooting, we must track state while the client is gone. Enter “delayed callbacks”. The next time the client talks to us, we tell it: “and by the way, these callbacks have been broken”.
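A minimal sketch of the delayed-callback idea, assuming a transport hook that reports whether a break reached the client. `CallbackBreaker`, `send_break`, and `on_contact` are hypothetical names for illustration only.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Set, Tuple

Fid = Tuple[int, int, int]  # (volume, vnode, uniquifier)


class CallbackBreaker:
    """Hypothetical tracker for breaks that could not be delivered."""

    def __init__(self, send_break: Callable[[str, Fid], bool]):
        # send_break is an assumed hook: True when the client acknowledged.
        self.send_break = send_break
        self.delayed: Dict[str, Set[Fid]] = defaultdict(set)

    def break_callback(self, host: str, fid: Fid) -> None:
        """Try to break now; remember the break if the host is unreachable."""
        if not self.send_break(host, fid):
            self.delayed[host].add(fid)

    def on_contact(self, host: str) -> List[Fid]:
        """Return, and forget, the breaks the host missed while away."""
        return sorted(self.delayed.pop(host, set()))
```

When an unreachable client next contacts the server, `on_contact` yields the FIDs whose callbacks were broken in its absence, so the client can be told before its request is served.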
A special case Because a volume release or restore is dealt with by a single thread within the fileserver, all communications relating to that were done in that (“fssync”) thread. The thread could block for long periods waiting on down or offline clients, causing issues with consecutive releases.
Procrastination The fssync thread has an interface to mark a callback immediately void, but to be broken “later”. Another thread looks for these callbacks and breaks them, freeing the fssync thread to do its job. “Later” callbacks become delayed just like any other callback.
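The hand-off between the two threads can be sketched with a queue and a worker. This is an assumption-laden illustration, not the fileserver's code: `LaterBreaker`, `mark_later`, and the `break_now` hook are hypothetical, and `break_now` stands for whatever possibly slow call actually notifies a client.

```python
import queue
import threading
from typing import Callable


class LaterBreaker:
    """Hypothetical 'void now, break later' hand-off to a worker thread."""

    def __init__(self, break_now: Callable[[str, int], None]):
        self.break_now = break_now  # assumed hook; may block on slow clients
        self.pending: "queue.Queue" = queue.Queue()
        self.void = set()           # (host, volume) pairs already invalid
        self.lock = threading.Lock()
        self.worker = threading.Thread(target=self._run, daemon=True)
        self.worker.start()

    def mark_later(self, host: str, volume: int) -> None:
        """Called by the release thread; does no network I/O itself."""
        with self.lock:
            self.void.add((host, volume))
        self.pending.put((host, volume))

    def _run(self) -> None:
        # Worker loop: performs the slow breaks off the release thread.
        while True:
            host, volume = self.pending.get()
            try:
                self.break_now(host, volume)
            finally:
                self.pending.task_done()
```

The release thread only pays for a set insertion and a queue put, so a client that takes the full timeout to fail no longer stalls consecutive releases.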
BreakCallBack Host online: done. Host offline: becomes delayed, broken on first contact. (Diagram: server, online client, offline client.)
Questions? Jeff: endpoints.com Derrick: Mailing list: openafs- endpoints.com