Caching in the Sprite Network File System
Scale and Performance in a Distributed File System
COMP 520, September 21, 2004
Agenda
– The Sprite file system
– Basic cache design
– Concurrency issues
– Benchmarking
– Andrew
The Sprite file system is functionally similar to UNIX
– Read, write, open, and close calls provide access to files
– Sprite communicates kernel-to-kernel
– Remote procedure calls (RPCs) allow kernels to talk to each other
Sprite uses caching on both the client and server side
Two different caching mechanisms:
– Server workstations use caching to reduce delays caused by disk accesses
– Client workstations use caching to minimize the number of calls made to non-local disks
[Diagram: file traffic from client caches crosses the network as server traffic to the server cache, which turns it into disk traffic]
Three main issues are addressed by Sprite’s caching system
1. Should client caches be kept in main memory or on local disk?
2. What structure and addressing scheme should be used for caching?
3. What should happen when a block is written back to disk?
Sprite caches client data in main memory, not on local disk
– Allows clients to be diskless (cheaper, quieter)
– Data access is faster
– Physical memory is large enough to provide a high hit ratio, and memory sizes will continue to grow
– A single caching mechanism can be used for both client and server
A virtual addressing structure is used for caching
– Data is organized into 4-Kbyte blocks
– Blocks are virtually addressed by a unique file identifier plus a block index
– Both client and server cache data blocks (see the sketch below)
– The server also caches naming information, addressed by physical (disk) address
– All naming operations (open, close, etc.) are passed to the server
– Cached file information is lost if the server crashes
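A minimal sketch of a virtually addressed block cache in this style, keyed by (file identifier, block index). The BlockCache class and the fetch callback are illustrative assumptions, not Sprite's actual kernel code.

```python
BLOCK_SIZE = 4 * 1024  # Sprite caches 4-Kbyte blocks

class BlockCache:
    """Blocks are looked up by virtual address: (file_id, block_index)."""
    def __init__(self):
        self.blocks = {}  # (file_id, block_index) -> bytes

    def read_block(self, file_id, block_index, fetch):
        key = (file_id, block_index)
        if key not in self.blocks:
            # Miss: one RPC to the server (or a disk read on the server side)
            self.blocks[key] = fetch(file_id, block_index)
        return self.blocks[key]

# Usage: BlockCache().read_block(fid, 0, fetch=lambda f, i: b"\0" * BLOCK_SIZE)
```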
Sprite uses a delayed-write policy to write dirty blocks to disk
– Every 30 seconds, dirty blocks that have not been changed in the last 30 seconds are written to disk (see the sketch below)
– Blocks written by a client reach the server's cache within 30-60 seconds, and the server's disk within 30-60 more seconds
– Limits server traffic
– Minimizes the damage in a crash
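A minimal sketch of the periodic delayed-write sweep; the dirty-block table layout and the write_back callback are illustrative assumptions.

```python
import time

def flush_old_blocks(dirty, write_back, max_age=30.0):
    """Run every 30 seconds: write back blocks untouched for max_age seconds.

    dirty maps (file_id, block_index) -> (data, last_modified_time),
    with times taken from time.monotonic()."""
    now = time.monotonic()
    for key, (data, last_modified) in list(dirty.items()):
        if now - last_modified >= max_age:
            write_back(key, data)  # to the server's cache (client) or disk (server)
            del dirty[key]
```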
Agenda
– The Sprite file system
– Basic cache design
– Concurrency issues
– Benchmarking
– Andrew
Two unusual design optimizations differentiate the system and solve problems
Consistency guaranteed
– All clients see the most recent version of a file
– Provides transparency to the user
– Concurrent and sequential write-sharing permitted
Cache size changes
– The virtual memory system and the file system negotiate over physical memory
– Cache space is reallocated dynamically
Concurrent write-sharing makes the file system more user-friendly
Concurrent write-sharing occurs when:
– A file is opened by multiple clients, and
– At least one client has the file open for writing
Concurrent write-sharing can jeopardize file consistency
– The server detects concurrent write-sharing
– The server instructs the writing client to flush all its dirty blocks back to the server
– The server notifies all clients that the file is no longer cacheable, and clients remove all cached blocks for it
– All future access requests are sent to the server, which serializes them
– The file becomes cacheable again once it is no longer open and undergoing write-sharing
(A sketch of the server-side check follows.)
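A sketch of how a server might detect concurrent write-sharing on open and disable caching. The state bookkeeping and the recall_dirty and disable_caching callbacks are illustrative assumptions, not Sprite's actual interface.

```python
class ServerFileState:
    def __init__(self):
        self.readers, self.writers = set(), set()
        self.cacheable = True

    def open(self, client, for_write, recall_dirty, disable_caching):
        (self.writers if for_write else self.readers).add(client)
        # Concurrent write-sharing: some client has the file open for
        # writing while any other client also has it open.
        if self.writers and len(self.readers | self.writers) > 1:
            for writer in self.writers:
                recall_dirty(writer)                   # flush dirty blocks to server
            disable_caching(self.readers | self.writers)
            self.cacheable = False                     # all I/O now goes to the server
```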
Sequential write-sharing provides transparency, but not without risks
– Occurs when a file is modified by a client, closed, and then opened by a second client
– Clients are always guaranteed to see the most recent version of the file
Sequential write-sharing: Problem 1
Problem:
– Client A modifies a file and closes it
– Client B opens the file using out-of-date cached blocks
– Client B sees an out-of-date version of the file
Solution: version numbers; on open, the client compares the server's current version number against the version of its cached blocks and discards them if they are stale (see the sketch below)
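A sketch of the version check on open; the server and cache objects and their methods are illustrative stand-ins.

```python
def client_open(file_id, cache, server):
    """Validate cached blocks against the server's version number."""
    version = server.open(file_id)         # server returns current version
    if cache.version(file_id) not in (None, version):
        cache.invalidate(file_id)          # stale: discard all cached blocks
    cache.set_version(file_id, version)
    return version
```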
Sequential write-sharing: Problem 2
Problem: the last client to write the file may not have flushed its dirty blocks yet (because of delayed write)
Solution:
– The server keeps track of the last writer, the only client allowed to have dirty blocks
– When the server receives an open request, it notifies the last writer
– The writer flushes any dirty blocks to the server
– Ensures the reader receives up-to-date information (sketch below)
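Continuing the sketch, the server side of an open might recall dirty blocks from the tracked last writer before answering; the names are again illustrative.

```python
def server_open(file_id, state, recall):
    """state.last_writer is the only client that may hold dirty blocks."""
    if state.last_writer is not None:
        recall(state.last_writer, file_id)  # last writer flushes dirty blocks
    return state.version                    # opener now sees up-to-date data
```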
Cache consistency does increase server traffic
– Client caching reduced server traffic by over 70%
– 25% of all traffic is a result of cache consistency
Table 2 (reproduced in the backup slides):
– Gives an upper bound on the savings any cache consistency algorithm could achieve
– Unrealistic, since ignoring consistency produced incorrect results
Dynamic cache allocation also sets Sprite apart
– Virtual memory and the file system battle over main memory
– Both modules keep a time of last access for each page or block
– Each compares its oldest page against the other module's oldest page, and the overall oldest page is recycled (see the sketch below)
– Virtual memory keeps pages in approximate LRU order using a clock algorithm; the file system keeps blocks in perfect LRU order by tracking read and write calls
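A minimal sketch of the negotiation, assuming each module can report the last-access time of its oldest page; purely illustrative.

```python
def pick_victim(vm_oldest_age, fs_oldest_age):
    """Return which module surrenders a page: the one holding the
    globally least-recently-used (smallest timestamp) page gives it up."""
    return "virtual-memory" if vm_oldest_age <= fs_oldest_age else "file-cache"

# Usage: pick_victim(vm.oldest_access_time(), fs.oldest_access_time())
```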
Negotiations could cause double-caching
Problem:
– Pages read from backing files could wind up in both the file cache and the virtual memory pool
– A page eliminated from the virtual memory pool could be forced into the file cache
– That page would then have to wait another 30 seconds before being sent to the server
Solution:
– When writing and reading backing files, virtual memory bypasses the local file cache
Multi-block pages create problems in shared caching
Problem:
– Virtual memory pages are big enough to hold multiple file blocks
– Which block's age should represent the LRU time of the page?
– What should be done with the other blocks once one is relinquished?
Solution (sketched below):
– The age of the page is the age of the youngest block
– All blocks in a page are removed together
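A sketch of the page-aging rule under these assumptions; block_ages holds last-access timestamps for the blocks packed into one page.

```python
def page_age(block_ages):
    """The page's LRU time is that of its youngest (most recent) block."""
    return max(block_ages)

def evict_page(page_block_keys, cache_blocks):
    """All blocks sharing the page are relinquished together."""
    for key in page_block_keys:
        cache_blocks.pop(key, None)
```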
Agenda
– The Sprite file system
– Basic cache design
– Concurrency issues
– Benchmarking
– Andrew
Micro-benchmarks show reading from a server cache is fast
– Gives an upper limit on remote file access costs
Two important results:
– A client can access its own cache 6-8 times faster than the server's cache
– A client can write and read the server's cache about as quickly as a local disk
Macro-benchmarks indicate disks and caching together run fastest
– With a warm start and client caching, diskless machines were at most 12% slower than machines with disks
– Without client caching, machines were 10-50% slower
Agenda
– The Sprite file system
– Basic cache design
– Concurrency issues
– Benchmarking
– Andrew
Andrew's caching is notably different from Sprite's
Vice: a group of trusted servers
– Stores data and status information in separate files
– Has a directory hierarchy
Venus: a user-level process on each client workstation
– Status cache: stored in virtual memory for quick status checks
– Data cache: stored on local disk
[Diagram: Sprite caches data blocks in memory on both client and server; Andrew's Venus caches whole data files on the client's local disk and status info in memory, exchanging data files with Vice on open and close]
…the pathname conventions are also very different
Two-level naming
– Each Vice file or directory is identified by a unique fid
– Venus maps Vice pathnames to fids; servers see only fids
Each fid has 3 parts and is 96 bits long (see the sketch below)
– 32-bit volume number
– 32-bit vnode number (index into the volume)
– 32-bit uniquifier (guarantees no fid is ever used twice)
– Contains no location information
Volume locations are maintained in a Volume Location Database found on each server
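A sketch of packing and unpacking the 96-bit fid layout described above; the field order within the integer is an assumption for illustration.

```python
MASK32 = 0xFFFFFFFF

def pack_fid(volume, vnode, uniquifier):
    """Pack 32-bit volume number, vnode number, and uniquifier into 96 bits."""
    assert all(0 <= part <= MASK32 for part in (volume, vnode, uniquifier))
    return (volume << 64) | (vnode << 32) | uniquifier

def unpack_fid(fid):
    return (fid >> 64) & MASK32, (fid >> 32) & MASK32, fid & MASK32

# Usage: unpack_fid(pack_fid(7, 42, 1)) == (7, 42, 1)
```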
Andrew uses a write-on-close convention
Sprite: delayed-write policy
– Changes written back every 30 seconds
– Avoids writing changes that are quickly erased
– Decreases damage in the event of a crash
– Rations network traffic
Andrew: write-on-close policy (sketched below)
– Write changes are visible to the network only after the file is closed
– Little information is lost in a crash, since caching is on local disk, not main memory
– The network will not see a file's changes if the client crashes before closing it
– 75% of files are open less than 0.5 seconds; 90% less than 10 seconds
– Could result in higher server traffic
– Delays the closing process
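A sketch of Andrew-style write-on-close behavior; the CachedFile class and the store_to_server callback (standing in for Venus shipping the whole file to Vice) are illustrative assumptions.

```python
class CachedFile:
    def __init__(self, fid, data=b""):
        self.fid, self.data, self.dirty = fid, bytearray(data), False

    def write(self, offset, buf):
        self.data[offset:offset + len(buf)] = buf
        self.dirty = True            # visible only locally until close

    def close(self, store_to_server):
        if self.dirty:
            store_to_server(self.fid, bytes(self.data))  # whole file at once
            self.dirty = False
```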
Sequential consistency is guaranteed in Andrew and Sprite
– Clients are guaranteed to see the latest version of a file
– Venus assumes that cached entries are valid
– The server maintains callbacks on cached entries (see the sketch below)
– The server notifies callback holders before allowing a file to be modified
– The server may break callbacks to reclaim storage
– Reduces server utilization, since communication occurs only when a file is changed
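A sketch of server-side callback bookkeeping; the class and the notify callback are illustrative, not AFS's actual interface.

```python
class CallbackTable:
    def __init__(self):
        self.holders = {}                    # fid -> set of client ids

    def register(self, fid, client):
        """Record a callback when a client fetches and caches the file."""
        self.holders.setdefault(fid, set()).add(client)

    def break_callbacks(self, fid, notify):
        """Before a store (or to reclaim storage), tell every holder."""
        for client in self.holders.pop(fid, set()):
            notify(client, fid)              # client invalidates its cached copy
```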
Concurrent write-sharing consistency is not guaranteed in Andrew
– Different workstations can perform the same operation on a file at the same time
– There is no implicit file locking
– Applications must coordinate among themselves if synchronization is an issue
Comparison to other systems
– With one client, Sprite is around 30% faster than NFS and 35% faster than Andrew
– Andrew has the greatest scalability: each Andrew client used about 2.4% of the server CPU, versus 5.4% in Sprite and 20% in NFS
Summary: Sprite vs. Andrew
– Cache location: Sprite caches in memory; Andrew caches on local disk
– Client caching: Sprite clients cache data blocks; Andrew clients cache whole data files plus status info
– Cache size: variable in Sprite; fixed in Andrew
– File path lookups: performed by the server in Sprite; by clients in Andrew
– Concurrent write-sharing: Sprite uses a 30-second write delay with consistency guaranteed; Andrew writes on close with consistency not guaranteed
– Sequential write-sharing: both guarantee consistency; Sprite servers know which workstations have a file cached, while Andrew servers maintain callbacks to identify which workstations cache files
– Cache validation: Sprite validates on open; the Andrew server notifies clients when a file is modified
– Kernel vs. user level: Sprite uses kernel-to-kernel communication; in Andrew the OS intercepts file system calls and forwards them to the user-level Venus
Conclusion: Both file systems have benefits and drawbacks
Sprite benefits:
– Guarantees sequential and concurrent consistency
– Faster runtime with a single client, due to memory caching and kernel-to-kernel communication
– Files can be cached in blocks
Sprite drawbacks:
– Lacks the scalability of Andrew
– Writing only every 30 seconds can lose recent data in a crash
– Fewer files can be cached in main memory than on disk
Andrew benefits:
– Better scalability, due in part to shifting path lookup to the client
– Transferring entire files reduces communication with the server (no read and write calls)
– Tracking entire files is easier than tracking individual pages
Andrew drawbacks:
– Lacks concurrent write-sharing consistency guarantees
– Caching to disk slows runtime
– Files larger than the local disk cannot be cached
Backup
Cache consistency does increase server traffic
– Client caching reduced server traffic by over 70%
– 25% of all traffic is a result of cache consistency
Table 2:
– Gives an upper bound on the savings any cache consistency algorithm could achieve
– Unrealistic, since ignoring consistency produced incorrect results

Server Traffic with Cache Consistency
Client Cache Size   Blocks Read   Blocks Written   Total Traffic   Ratio
0 Mbyte             445815        172546           618361          100%
0.5 Mbyte           102469        96866            199335          32%
1 Mbyte             84017         96796            180813          29%
2 Mbytes            77445         96796            174241          28%
4 Mbytes            75322         96796            172118          28%
8 Mbytes            75088         96796            171884          28%

Server Traffic, Ignoring Cache Consistency
Client Cache Size   Blocks Read   Blocks Written   Total Traffic   Ratio
0 Mbyte             445815        172546           618361          100%
0.5 Mbyte           80754         93663            174417          28%
1 Mbyte             52377         93258            145635          24%
2 Mbytes            41767         93258            135025          22%
4 Mbytes            38165         93258            131423          21%
8 Mbytes            37007         93258            130265          21%
Micro-benchmarks show reading from a server cache is fast
– Gives an upper limit on remote file access costs
Two important results:
– A client can access its own cache 6-8 times faster than the server's cache
– A client can write and read the server's cache about as quickly as a local disk

Read and Write Throughput, Kbytes/second (maximum read and write rates in various places)
         Local Cache   Server Cache   Local Disk   Server Disk
Read     3269          475            224          212
Write    2893          380            197          176
Macro-benchmarks indicate disks and caching together run fastest
– With a warm start and client caching, diskless machines were at most 12% slower than machines with disks
– Without client caching, machines were 10-50% slower

Benchmark results: time in seconds (normalized time in parentheses)
Benchmark   Local Disk, with Cache    Diskless, Server Cache Only   Diskless, Client and Server Caches
            Cold        Warm          Cold        Warm              Cold        Warm
Andrew      261 (105%)  249 (100%)    373 (150%)  363 (146%)        291 (117%)  280 (112%)
Fs-make     660 (102%)  649 (100%)    855 (132%)  843 (130%)        698 (108%)  685 (106%)
Simulator   161 (109%)  147 (100%)    168 (114%)  153 (104%)        167 (114%)  147 (100%)
Sort        65 (107%)   61 (100%)     74 (121%)   72 (118%)         66 (108%)   61 (100%)
Diff        22 (165%)   8 (100%)      27 (225%)   12 (147%)         27 (223%)   8 (100%)
Nroff       53 (103%)   51 (100%)     57 (112%)   56 (109%)         53 (105%)   52 (102%)
Status info on Andrew and Sprite
The Sprite management cache contains:
– File maps
– Disk management info
Volumes in Andrew
– A volume is a collection of files forming a partial subtree of the Vice name space
– Volumes are joined at mount points
– A volume resides in a single disk partition
– Volumes can be moved from server to server easily for load balancing
– Volumes enable quotas and backup
Sprite caching improves speed and reduces overhead
Client-side caching enables diskless workstations
– Caching on diskless workstations improves runtime by 10-40%
– Diskless workstations with caching are only 0-12% slower than workstations with disks
Caching on both the server and client side improves the overall system
– Server utilization is reduced from 5-18% to 1-9% per active client
– The file-intensive benchmark completed 30-35% faster on Sprite than on other systems