Review Session for Fourth Quiz Jehan-François Pâris Summer 2011
Blue File System
According to the designers of the Blue File System, what are the two limitations of flash drives?
Blue File System They can be lost. It is hard to keep them synchronized.
Blue File System The Blue File System is said to have a dynamic storage hierarchy. What does it mean?
Blue File System The ranking of the storage devices in the storage hierarchy depends on their states. A disk that is powered down will have a lower priority in the hierarchy than the remote server. A disk that is powered up will have a higher priority than the same server.
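The state-dependent ranking described above can be sketched as follows. This is a minimal illustration, not the Blue File System's actual cost model: the `Device` class, the `access_cost` function, and all cost numbers are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Device:
    name: str
    powered_up: bool
    active_cost: float  # hypothetical cost of an access when the device is ready
    wakeup_cost: float  # hypothetical extra cost of leaving a low-power state

def access_cost(dev: Device) -> float:
    # A powered-down device pays its wake-up cost on top of the normal access cost.
    return dev.active_cost + (0.0 if dev.powered_up else dev.wakeup_cost)

def rank(devices):
    # Cheapest device first: the hierarchy is recomputed from the current states.
    return sorted(devices, key=access_cost)

disk = Device("local disk", powered_up=False, active_cost=1.0, wakeup_cost=10.0)
server = Device("remote server", powered_up=True, active_cost=5.0, wakeup_cost=0.0)

order_down = [d.name for d in rank([disk, server])]
disk.powered_up = True
order_up = [d.name for d in rank([disk, server])]
```

With these made-up costs, the remote server ranks first while the disk is powered down, and the disk ranks first once it is powered up.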
Blue File System How does the Blue file system operate its device write queues?
Blue File System It empties them when it flushes them to disk. Much more could be said.
Blue File System Explain how the Blue file system saves energy by aggregating writes to local disks.
Blue File System Aggregating writes to local disks saves energy by amortizing disk power state transitions across multiple writes.
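The amortization argument can be checked with a little arithmetic. The sketch below assumes, for illustration only, that the disk powers down between flushes and that each flush costs one power state transition; the energy figures (in millijoules) are invented.

```python
def energy_mj(num_writes: int, writes_per_flush: int,
              transition_mj: int, write_mj: int) -> int:
    # One power state transition per flush, plus the cost of the writes themselves.
    flushes = -(-num_writes // writes_per_flush)  # ceiling division
    return flushes * transition_mj + num_writes * write_mj

# Hypothetical numbers: 5000 mJ per transition, 100 mJ per write.
unbatched = energy_mj(100, 1, transition_mj=5000, write_mj=100)
batched = energy_mj(100, 20, transition_mj=5000, write_mj=100)
```

Under these assumptions, 100 unbatched writes cost 510000 mJ, while flushing them 20 at a time costs 35000 mJ, because the transition cost is paid 5 times instead of 100.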
Blue File System True or false: Most of the Blue FS functionality is handled by a user-level server.
Blue File System True
Pergamum
Pergamum What equipment failures can be corrected by intratome redundancy?
Pergamum Irrecoverable read errors
Pergamum What would be the main drawback of a Pergamum system having plenty of intratome redundancy but no intertome redundancy?
Pergamum It would not tolerate full disk failures.
Pergamum How do intradisk parity blocks help reduce the power consumption of the system?
Pergamum They allow the local recovery of bad blocks without having to power up other tomes
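A minimal sketch of local recovery, assuming a plain XOR parity block over a group of blocks on the same disk. The actual intradisk code used by Pergamum may differ; `xor_blocks` and the sample blocks are illustrative.

```python
from functools import reduce

def xor_blocks(blocks):
    # Byte-wise XOR of equally sized blocks.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three data blocks stored on the same tome, plus one parity block.
data = [b"tome", b"data", b"blks"]
parity = xor_blocks(data)

# Suppose data[1] hits an irrecoverable read error: the surviving blocks
# and the parity block, all on the same disk, are enough to rebuild it.
recovered = xor_blocks([data[0], data[2], parity])
```

Everything needed for the repair sits on the tome that is already spinning, so no other tome has to be powered up.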
Pergamum What are the two main functions of Pergamum digital signatures? Where are they stored? Why?
Pergamum Their two main functions are: 1. To verify the integrity of the tome’s contents. 2. By exchanging them with other Pergamum tomes, to verify the integrity of distributed data.
Pergamum Where are they stored? Why?
Pergamum They are stored in a small flash drive so they can be consulted without powering the tome’s hard drive.
Pergamum What is disk scrubbing?
Pergamum Disk scrubbing periodically verifies that a given range of disk blocks can be retrieved and reconstitutes the contents of the blocks that it could not access due to an irrecoverable read error.
Pergamum Which feature of Pergamum reduces the need for frequent full-disk scrubs?
Pergamum Pergamum intratome parity reduces the need for frequent disk scrubs as it provides an additional way to reconstitute the contents of the blocks that caused irrecoverable read errors.
Pergamum How does Pergamum reconstitute data contained on a tome that failed?
Pergamum 1. Pergamum replaces the failed tome with a new tome. 2. One after the other, each tome in the same parity stripe as the failed tome sends its contents to the new tome.
Pergamum Why?
Pergamum To avoid powering up too many tomes at the same time
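The one-at-a-time rebuild can be sketched as an incremental XOR, assuming a single XOR parity tome per stripe (an assumption made here for simplicity; the function name `rebuild` is hypothetical).

```python
def rebuild(surviving_tomes):
    # The new tome starts as all zeros and folds in one surviving tome at a time,
    # so only one source tome needs to be powered up at any moment.
    new_tome = bytearray(len(surviving_tomes[0]))
    for contents in surviving_tomes:
        for i, byte in enumerate(contents):
            new_tome[i] ^= byte
    return bytes(new_tome)

tome_a = b"alpha"
tome_b = b"bravo"
parity_tome = bytes(x ^ y for x, y in zip(tome_a, tome_b))

# tome_a fails; the remaining members of the stripe are sent in turn.
rebuilt = rebuild([tome_b, parity_tome])
```

Because XOR is associative, the new tome can accumulate the contributions in any order without ever needing two source tomes online at once.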
Pergamum How do the system’s workload and intended use(s) affect the tradeoffs to consider when deciding the right amount of intra-disk and inter-disk redundancy in a storage system?
Pergamum Intra-disk redundancy saves energy in archival file systems because it allows local reconstruction of blocks lost to irrecoverable read errors. We might prefer using more inter-disk redundancy in conventional file systems, as inter-disk redundancy protects data against both irrecoverable read errors and disk failures.
FARSITE
FARSITE How does FARSITE store users’ secret keys? Why?
FARSITE FARSITE encrypts the secret keys of its users with a symmetric key derived from the user's password and stores them in a globally readable directory. It does so because these keys are typically too long to be memorized by the user.
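The password-to-key step can be illustrated with a standard key derivation function. This is only a sketch: the slides do not say which derivation FARSITE uses, so PBKDF2, the salt, and the iteration count below are assumptions, and the encryption of the secret key itself is not shown.

```python
import hashlib

def derive_key(password: str, salt: bytes, iterations: int = 100_000) -> bytes:
    # Deterministically stretch a memorable password into a 32-byte symmetric key.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt,
                               iterations, dklen=32)

salt = b"per-user-salt"  # hypothetical value, stored alongside the encrypted key
key_a = derive_key("correct horse", salt)
key_b = derive_key("correct horse", salt)
key_c = derive_key("wrong guess", salt)
```

The same password always yields the same key, so the user only has to remember the password, while the encrypted secret key can sit in a directory anyone can read.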
FARSITE What characterizes a Byzantine failure?
FARSITE 1. The failed node keeps communicating with the other nodes. 2. We have no easy way to detect such a failed node.
FARSITE How does Farsite guarantee the availability and the integrity of its directory data?
FARSITE Farsite replicates directories and manages them through a Byzantine fault-tolerant protocol that ensures their integrity (as long as fewer than one third of the machines misbehave in any manner).
FARSITE In addition to using a Byzantine agreement protocol in its directory host, which steps does Farsite take to protect user files against malicious behaviors by its file hosts?
FARSITE 1. File blocks are encrypted so that file hosts cannot access their contents. 2. File blocks are also replicated on different hosts so that a single file host cannot maliciously destroy a file. 3. Farsite ensures that all copies of a given file block will be spread over machines controlled by different owners.
FARSITE You are to design a FARSITE file system that can tolerate two Byzantine failures. What is the minimum number of members in each directory host?
FARSITE Each directory host should have at least seven members.
FARSITE What is the minimum number of copies each data block should have?
FARSITE Each data block should have at least three copies.
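Both answers follow from simple formulas: a Byzantine agreement group needs 3f + 1 members to tolerate f faulty ones (so that fewer than one third misbehave), while encrypted file blocks only need f + 1 copies so that at least one copy survives. The function names below are illustrative.

```python
def min_directory_members(f: int) -> int:
    # Byzantine agreement requires fewer than one third faulty members: n >= 3f + 1.
    return 3 * f + 1

def min_block_copies(f: int) -> int:
    # One copy must survive f malicious or failed file hosts: f + 1 copies.
    return f + 1

members_for_two = min_directory_members(2)
copies_for_two = min_block_copies(2)
```

For f = 2 this gives seven directory host members and three copies of each data block, matching the answers above.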
FARSITE What is a Sybil attack? How does Farsite protect itself against them?
FARSITE A Sybil attack is an attack where one or more rogue nodes assume multiple identities. To prevent that, Farsite requires each node entering the system to have a verifiable unique ID issued by a trusted authority.
FARSITE Which actions does FARSITE take when the owner of a file grants or revokes access to a given file?
FARSITE When the owner of a file grants access to the file to another user, FARSITE encrypts a copy of the file key with the public key of the new user. When that access is revoked, FARSITE deletes that copy.
FARSITE How does the effect of a revoke differ from that of the same revoke on a conventional UNIX system?
FARSITE The user who has lost the right to access the file could still be able to read it if he/she has kept a copy of the file key on his/her own workstation.
FARSITE What could FARSITE do to implement the semantics of a UNIX access right revocation?
FARSITE It would require encrypting the file with a new key.
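The difference between the two revocation semantics can be modeled with a toy class. No real cryptography is performed here: a stored copy of the file key stands in for "the file key encrypted with the user's public key", and `SharedFile` and its methods are hypothetical names.

```python
import secrets

class SharedFile:
    def __init__(self):
        self.key = secrets.token_bytes(16)  # current file key
        self.key_copies = {}                # user -> copy of the file key

    def grant(self, user):
        # Stand-in for encrypting a copy of the file key with the user's public key.
        self.key_copies[user] = self.key

    def revoke(self, user):
        # FARSITE-style revoke as described above: only the key copy is deleted.
        self.key_copies.pop(user, None)

    def revoke_unix_style(self, user):
        # UNIX-like semantics: re-encrypt the file under a new key for remaining users.
        self.revoke(user)
        self.key = secrets.token_bytes(16)
        for remaining in self.key_copies:
            self.key_copies[remaining] = self.key

    def can_read(self, key):
        return key == self.key

f = SharedFile()
f.grant("alice")
f.grant("bob")
bob_cached_key = f.key_copies["bob"]  # bob keeps a copy on his workstation

f.revoke("bob")
readable_after_plain_revoke = f.can_read(bob_cached_key)

f.revoke_unix_style("bob")
readable_after_rekey = f.can_read(bob_cached_key)
alice_still_reads = f.can_read(f.key_copies["alice"])
```

After the plain revoke, the cached key still opens the file; only re-encrypting under a fresh key locks the former user out while keeping the remaining users' access intact.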
FARSITE What does FARSITE do to improve its less than stellar response time? Hint: Answer has two parts
FARSITE 1. Files are cached for up to one week on the client machines. 2. Farsite uses background ("lazy") propagation of directory updates.
Farsite What is a lease?
Farsite A lease is a time-limited contract between the file server and a client guaranteeing that the server will not accept any update for a given file or set of files during the duration of the lease without first notifying the client. Typical lease durations are fairly short.
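A lease can be sketched as an expiry time kept by the server for each (file, client) pair. This is a generic illustration of the definition above, not Farsite's implementation; `LeaseServer`, its methods, and the explicit `now` argument are hypothetical.

```python
class LeaseServer:
    def __init__(self, duration):
        self.duration = duration
        self.leases = {}    # (file, client) -> expiry time
        self.notified = []  # (client, file) notifications sent so far

    def grant(self, file, client, now):
        self.leases[(file, client)] = now + self.duration

    def update(self, file, writer, now):
        # Before accepting the update, notify every other client
        # that still holds an unexpired lease on the file.
        for (leased_file, client), expiry in self.leases.items():
            if leased_file == file and client != writer and expiry > now:
                self.notified.append((client, file))
        return True  # the update is accepted after the notifications

server = LeaseServer(duration=10)
server.grant("a.txt", "client1", now=0)
server.update("a.txt", "client2", now=5)   # lease still valid: client1 is notified
after_first = list(server.notified)
server.update("a.txt", "client2", now=20)  # lease expired: nobody to notify
after_second = list(server.notified)
```

Short lease durations keep the set of clients to notify small and bound how long a crashed client can delay an update.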
Zyzzyva
Zyzzyva Why may a Zyzzyva replica sometimes store two checkpoints?
Zyzzyva Zyzzyva replicas have two checkpoints whenever their latest checkpoint contains non-committed history. (That checkpoint is then called a tentative checkpoint.) As a result, the replica must keep its previous checkpoint until the new checkpoint becomes a committed checkpoint.
Zyzzyva When does a Zyzzyva tentative checkpoint become a committed checkpoint?
Zyzzyva A checkpoint becomes a committed checkpoint as soon as all the history it contains has become committed history.
Zyzzyva What are the four exchanges of messages that occur during the gracious execution of the Zyzzyva Byzantine fault-tolerant protocol?
Zyzzyva 1. The client sends a message to the primary replica. 2. The primary replica sends a message to all secondary replicas. 3. The secondary replicas send a message to the client. 4. The client sends a message to all replicas (not included in the paper's figures).
FAWN
FAWN How is the FAWN datastore organized?
FAWN As a log operating in append mode.
FAWN Why?
FAWN Because flash memory performs sequential writes much faster than random writes.
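A minimal sketch of an append-only log with an in-memory index, in the spirit of the organization described above. The class `LogStore` is illustrative: it keeps full keys in a Python dict, whereas the hash table discussed later in these slides stores only key fragments.

```python
class LogStore:
    def __init__(self):
        self.log = bytearray()  # stands in for the on-flash log
        self.index = {}         # key -> (offset, length) of the latest value

    def put(self, key, value: bytes):
        # Every write is appended at the end of the log: no random writes.
        offset = len(self.log)
        self.log += value
        self.index[key] = (offset, len(value))

    def get(self, key) -> bytes:
        # One index lookup, then one read at the recorded offset.
        offset, length = self.index[key]
        return bytes(self.log[offset:offset + length])

store = LogStore()
store.put("k", b"old value")
store.put("k", b"new value")
latest = store.get("k")
log_size = len(store.log)
```

An overwrite never touches the old bytes in place; it appends a new version and repoints the index, leaving the stale entry to be reclaimed later.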
FAWN What is the purpose of allocating several randomly selected virtual nodes to each FAWN node?
FAWN To spread the workload of a failed physical node among several physical nodes.
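The effect of virtual nodes can be sketched with a small consistent hashing ring. This is an illustration under assumptions: virtual node positions are derived by hashing hypothetical labels such as "A#3" rather than drawn at random, and the node names and counts are made up.

```python
import hashlib
from bisect import bisect_right

def ring_position(label: str) -> int:
    # Map a label to a pseudo-random position on the ring.
    return int.from_bytes(hashlib.sha1(label.encode()).digest()[:8], "big")

def build_ring(nodes, vnodes_per_node):
    # Each physical node owns several virtual nodes scattered around the ring.
    return sorted((ring_position(f"{n}#{i}"), n)
                  for n in nodes for i in range(vnodes_per_node))

def owner(ring, key):
    # A key belongs to the first virtual node clockwise from its position.
    positions = [p for p, _ in ring]
    return ring[bisect_right(positions, ring_position(key)) % len(ring)][1]

nodes = ["A", "B", "C", "D"]
keys = [f"key{i}" for i in range(200)]
before = build_ring(nodes, 8)
after = build_ring([n for n in nodes if n != "A"], 8)  # physical node A fails

# Where did the keys formerly owned by A go?
moved = {k: owner(after, k) for k in keys if owner(before, k) == "A"}
inheritors = set(moved.values())
```

Only the keys that belonged to A change owner. Since A's virtual nodes were scattered around the ring, `inheritors` will typically contain several of the remaining physical nodes rather than a single overloaded successor.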
FAWN Why do Pergamum and FAWN select very different CPUs for their nodes?
FAWN The CPU of a Pergamum tome controls a hard drive that is likely to be powered down 90 to 95 percent of the time. Power savings are paramount. The CPU of a FAWN node controls a faster flash drive that is very frequently accessed. The emphasis is on the best performance-per-watt ratio.
FAWN Consider a variant of FAWN tailored to a workload with infrequent requests to a very large data set. How would that affect your choice of a storage device?
FAWN We should store the FAWN datastore on a disk drive, as the capacity of the storage device becomes more important than its access times.
FAWN How would your choice affect the organization of the in-memory hash table—and the size of the main memory?
FAWN We would need a bigger main memory: 1. There would be many more hash table entries. 2. Each hash table entry should contain a much larger key fragment to minimize false positives, because disk reads are much more expensive than flash memory reads.
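The tradeoff between key fragment size and wasted reads can be estimated with a back-of-the-envelope model. It assumes that fragment bits are uniformly distributed and that one candidate entry is compared per lookup; the fragment widths and read latencies below are illustrative, not measurements.

```python
def false_positive_rate(fragment_bits: int) -> float:
    # Chance that a different key happens to share the same fragment.
    return 2.0 ** -fragment_bits

def expected_wasted_ms(fragment_bits: int, read_ms: float) -> float:
    # Expected time lost per lookup to a read that turns out to be the wrong key.
    return false_positive_rate(fragment_bits) * read_ms

# Hypothetical latencies: 0.1 ms for a flash read, 10 ms for a disk read.
flash_waste = expected_wasted_ms(15, read_ms=0.1)
disk_waste_same_bits = expected_wasted_ms(15, read_ms=10.0)
disk_waste_more_bits = expected_wasted_ms(31, read_ms=10.0)
```

With the same fragment width, each false positive costs far more on disk than on flash, so the disk-based variant has to spend extra bits per entry to push the false positive rate down, which adds to the memory already needed for the larger number of entries.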