Nomadic File Systems Uri Moszkowicz 05/02/02
Nomadic File Systems Overview Previous papers Coda Bayou Andrew File System Epidemic protocols
Coda General idea Ability to continue work when file servers are inaccessible through use of caching Transparent operation through optimistic conflict resolution
Coda
Coda Venus States Hoarding – Hoard data anticipating disconnect Emulation – Use of local cache Reintegration – Updating local cache
Coda Hoarding priorities Hoard walking Recent usage Hoard profiles – user specified Children and descendants may be included (ie directories) Hoard walking Since priority based on recent usage, every once in a while need to update file system to reflect priorities 10 min intervals chosen?
Coda Emulation Coda is faithful; all updates accepted At reintegration, validity is checked by servers All writes are logged to be replayed at reintegration Logs may use up all disk space; not gracefully handled (writes disabled)
Coda Reintegration Updates propagated to servers and vice versa (one volume at a time) 4 stages Log parsed, files locked Validation: conflict detection, security, integrity Fetching: shadow files are transferred (as needed) Commit: locks released and changes finalized Failures must be manually examined from logs
Coda Performance Reintegration time Disk size Likelihood of conflicts Significantly affected by hardware (~1990) and application
Bayou Anti-entropy Address Coda limitation Also Mobile replicas cannot reconcile amongst themselves in Coda Also Incremental progress (not volume) Efficient storage management Light-weight replica creation
Bayou Anti-entropy Simple anti-entropy protocol One-way operation between pairs Propagation of write operations Accept-order maintained Replicas can be ordered in any topology Amount of data propagated proportional to update activity at replicas, not size of data being replicated
Bayou Anti-entropy Basic operation Write log version vector sent from one replica to another Receiving server traverses write log and requests all writes it has not yet seen Write log Prune write log entries when stable ie all replicas have seen it Re-execute subsequent writes when learn of earlier write Each write log entry maintains a primary replica to track its stability or encode stability in messages
Bayou Anti-entropy
Bayou Anti-entropy Tuple Store Database (SQL) obtained by executing writes in order; cache for read requests Two views: full and committed Write Log - Ordered list of all writes ever received Undo Log - Facilitates reordering of write log “O” vector – for each vector timestamp of latest write that has been discarded (avoid re-accepting earlier writes because later one omitted) “C” vector – tracks committed writes “F” vector – tracks full writes “C”+”F” used to determine what to needs update
Bayou Anti-entropy Conflict Detection & Resolution Application specified and included in write Dependency checks (detection) write/write + read/write Merge procedures (resolution) Failures resolved later by user through application
Bayou Anti-entropy
Bayou Anti-entropy Consistency Servers can receive writes from other servers & clients, but writes are applied immediately when received Non-determinism must be removed since servers can be in different states Environment information (ie system clock) Exceeding resource limits
Bayou Anti-entropy Security Early version relied on central authentication server No servers? Possible invalid updates maintained on replicas until connected Later version relies on cryptography
Bayou Anti-entropy Protocol extensions Transportation of write log through portable media Causal order through logical clocks & total order of eventual consistency Light-weight server creation and deletion (not specified at volume creation time)
Bayou Anti-entropy Disadvantages Security Large network traffic during checkpoint synchronization Every replica must know about every other replica: complexity grows with size of network
Bayou Anti-entropy Policies When to reconcile Periodic, Manual, System Triggered Choosing which replicas to reconcile with Reachable, updatedness, primary status Deciding when to truncate write-log Available HD & RAM space, network usage Selecting server to create replicas Connection bandwidth, completeness of write-logs
Bayou Anti-entropy Performance ~1000 servers Hierarchy schemes to expand network size Size(tentative writes) = 10*size(committed writes) due to access control certificate Linear increase in anti-entropy time with number of replicas
Bayou Anti-entropy Examples Meeting room scheduler Users select several possible meeting times Upon connection, meeting times resolved based on chosen times or prompted for new time if irresolvable conflict Tentative meeting times exposed to other users
Bayou Anti-entropy Examples Bibliographic database Users enter bibliographic entries At resolution, dependency check can access database to see if two entries are the same (just typed differently), merge, or prompt user User may opt to be disconnected to avoid expensive cellular fees