Storage management and caching in PAST Antony Rowstron and Peter Druschel Presented to cs294-4 by Owen Cooper.


1 Storage management and caching in PAST Antony Rowstron and Peter Druschel Presented to cs294-4 by Owen Cooper

2 Outline PAST goals PAST api File storage overview File and replica diversion Replica management Caching Performance Discussion

3 PAST (non)goals P2P global storage network –Use properties of existing p2p systems (Pastry) –Support for strong persistence Via a core set of replicas –High availability Via local caching –Scalable Obtain high storage utilization via local cooperation –Secure Design goals do not include –Replacing the file system –Updatable files –Directory or lookup service

4 Security Model Pastry node ids are a hash of a public key Smartcard based security –Provides keys –Quota management Nodeid and fileid generation controlled –Try to stop nodes from getting consecutive ids –Or clients from overloading parts of the network But node id and real world identity may not be linked Data not encrypted

5 PAST API In PAST, files are immutable fileid = Insert(filename, credentials, k, file) –Inserts k copies of the file into the network, or fails. –Fileid is a signed (filename, credentials, salt) –Successful if acked with receipts from k nodes file = Lookup(fileid) –Returns a copy of the file if it exists Reclaim(fileid, credentials) –Reclaim accepted if requested by the owner –Allows, but does not require, storage reclamation

6 File insertion Insert(name, c, k, file) –Computes a storage certificate Contains fileid, hash of content, k, salt –Deducts k*filesize from quota –Routes file and storage certificate using pastry using fileid. –Node verifies the integrity of the file, stores it, and asks k-1 closest nodes to store the file. K-1 nodes in leaf set (k-1 <= l) –Node returns ack with k signed storage receipts, or a nak.
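The client-side bookkeeping on this slide can be sketched roughly as follows. This is an illustration only: `StorageCertificate`, `prepare_insert`, `verify_content`, and `QuotaExceeded` are hypothetical names, SHA-1 is assumed as the content hash, and the smartcard signature over the certificate is left out.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageCertificate:
    """Fields listed on the slide; the signature is omitted in this sketch."""
    fileid: str
    content_hash: str
    k: int
    salt: bytes


class QuotaExceeded(Exception):
    """Raised when the client cannot pay for k replicas (hypothetical)."""


def prepare_insert(fileid: str, content: bytes, k: int, salt: bytes, quota: int):
    """Build the certificate and deduct k * filesize from the quota."""
    cost = k * len(content)
    if cost > quota:
        raise QuotaExceeded(f"need {cost} bytes of quota, have {quota}")
    cert = StorageCertificate(
        fileid=fileid,
        content_hash=hashlib.sha1(content).hexdigest(),  # assumed hash function
        k=k,
        salt=salt,
    )
    return cert, quota - cost


def verify_content(cert: StorageCertificate, content: bytes) -> bool:
    """Integrity check a storing node could run before accepting the file."""
    return hashlib.sha1(content).hexdigest() == cert.content_hash
```

A storing node would run something like `verify_content` before keeping the file and forwarding it to the other k-1 nodes.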

7 Lookup and Reclamation Pastry ensures replica is found –Since a lookup is routed to the closest nodeid Reclamation –Client generates a reclaim certificate –Sends it to the fileid via pastry –Recipients verify the certificate & issue receipt –Client reclaims quota

8 Diversion A file or replica can be relocated For a replica, to another close node –If one of the K closest is overloaded For a file, to another set of nodes in the idspace –If the nodes around a fileid are (possibly locally) congested Why is this necessary? –Differing storage capacity at nodes –Differing file size for inserted files

9 Replica Diversion Node responsible for fileid asks k-1 neighbors to store the file Neighbor (N) may divert a copy to a node in its leaf set –Pointer to copy inserted at N –N issues storage certificate –N also inserts a pointer on the k+1th closest node No orphan if N fails N remains responsible for pointer maintenance

10 File Diversion Replica diversion is local –Allows storage choice between nodes around fileid File Diversion –Triggered when an insert with a fileid fails –Insert is tried a total of three times –New fileid generated by changing the salt
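A minimal sketch of the file-diversion retry loop described on this slide. The names are hypothetical: `try_insert` stands in for a full PAST insert at a given fileid, and deriving the fileid with SHA-1 over the name, owner key, and salt is an assumption made for illustration.

```python
import hashlib
import os


def make_fileid(filename: str, owner_key: bytes, salt: bytes) -> str:
    """Derive a fileid from name, owner key, and salt (assumed construction)."""
    h = hashlib.sha1()
    h.update(filename.encode("utf-8"))
    h.update(owner_key)
    h.update(salt)
    return h.hexdigest()


def insert_with_file_diversion(filename, owner_key, try_insert,
                               max_attempts=3,
                               new_salt=lambda: os.urandom(8)):
    """Retry a failed insert under a new fileid by changing the salt.

    try_insert(fileid) -> bool is a stand-in for the real insert.
    Returns (fileid, attempts_used), or (None, max_attempts) on failure.
    """
    for attempt in range(1, max_attempts + 1):
        # A fresh salt moves the file to a different part of the id space.
        fileid = make_fileid(filename, owner_key, new_salt())
        if try_insert(fileid):
            return fileid, attempt
    return None, max_attempts
```

The default of three attempts follows the slide; it is a parameter so other limits can be tried.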

11 Storage Policy How does a node choose to accept or reject a replica? –Computes sizeof(file)/sizeof(free_space) –Compares to T_pri or T_div, depending on the node's role –T_pri > T_div How is a node chosen for replica diversion? –Search leaf set for the node that Has maximal free space Doesn't already hold a diverted or primary replica File diversion –Triggered when K copies cannot be located (via primary or diversion)
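The acceptance test and the choice of diversion target can be sketched as below. `Node`, `accepts_replica`, and `choose_diversion_target` are hypothetical names, and the threshold values in the usage note are example numbers rather than settings taken from the slides. The sketch assumes a replica is rejected when the ratio exceeds the threshold.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set


@dataclass
class Node:
    node_id: str
    free_space: int  # bytes


def accepts_replica(file_size: int, free_space: int, threshold: float) -> bool:
    """Accept unless file_size / free_space exceeds the threshold."""
    if free_space <= 0:
        return False
    return file_size / free_space <= threshold


def choose_diversion_target(leaf_set: Iterable[Node], file_size: int,
                            t_div: float, holders: Set[str]) -> Optional[Node]:
    """Pick the leaf-set node with maximal free space that holds no replica.

    holders is the set of node ids that already store a primary or
    diverted replica of this file. Returns None if the best candidate
    would still reject the file under t_div.
    """
    candidates = [n for n in leaf_set if n.node_id not in holders]
    if not candidates:
        return None
    best = max(candidates, key=lambda n: n.free_space)
    return best if accepts_replica(file_size, best.free_space, t_div) else None
```

For example, with `t_pri = 0.1` and `t_div = 0.05`, an 80-byte file offered to a node with 1000 bytes free is accepted as a primary replica but rejected as a diverted one, since T_pri > T_div makes nodes stricter about replicas diverted to them.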

12 Replica maintenance Node join/leave causes responsibility shift –Pastry node failure detection will cause leaf set updates PAST detects responsibility shifts this way Newly responsible node must copy files –Make a copy immediately, OR –Keep a pointer to the old owner & copy lazily Diverted replicas –Target of diversion may move out of leaf set Node to store replica can be any one in leaf set –Must exchange keepalive messages themselves –Should be relocated

13 Replica maintenance (2) Node failure may cause storage shortage –No node in leaf set can take over ownership Search space is widened –Ask most extreme nodes to locate storage Increases search space to 2l nodes –If no storage space found, fail.

14 Caching Pastry’s locality based routing will tend to direct requests to nearby copies PAST also stores cached copies –Along routing path between client and fileid –For insert and lookup operations –Cache maintained using GD-size algorithm Weight per file: 1/size(file) Eviction: –Pick file with minimum weight –Subtract weight of evicted file from all others
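The eviction rule on this slide can be sketched as a small cache class. This is a simplified illustration with hypothetical names (`GDSizeCache`, `access`, `insert`); it assumes a cache hit resets a file's weight to 1/size, as in the usual GreedyDual-Size formulation, and it does not model any policy about which files are worth caching in the first place.

```python
class GDSizeCache:
    """Simplified GreedyDual-Size cache with weight 1/size per file."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.used = 0
        self.weight = {}  # fileid -> current weight
        self.size = {}    # fileid -> size in bytes

    def access(self, fileid) -> bool:
        """Return True on a hit and restore the file's weight (assumed)."""
        if fileid in self.weight:
            self.weight[fileid] = 1.0 / self.size[fileid]
            return True
        return False

    def insert(self, fileid, size: int) -> bool:
        """Cache a file, evicting minimum-weight files until it fits."""
        if size <= 0 or size > self.capacity:
            return False
        if self.access(fileid):
            return True
        while self.used + size > self.capacity:
            victim = min(self.weight, key=self.weight.get)
            victim_weight = self.weight.pop(victim)
            self.used -= self.size.pop(victim)
            # Subtract the evicted file's weight from all remaining files.
            for other in self.weight:
                self.weight[other] -= victim_weight
        self.weight[fileid] = 1.0 / size
        self.size[fileid] = size
        self.used += size
        return True
```

Because the weight is 1/size, large files are evicted first, and the subtraction step ages files that have not been accessed recently.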

15 Experiments: without diversion Experiments use –Large trace from web server –Files from local web server The case for diversion with web trace –Without diversion: 51.1% of insertions failed 60.8% storage utilization

16 Experiments (2): with diversion With diversion –Bigger leaf set size a plus

17 Experiments (3): Varying T_pri Effects of varying T_pri # files stored vs. size of file

18 Experiments (4): Varying T_div Varying T_div T_pri is constant

19 File and Replica Diversion

20 Caching 8 traces combined Requests from clients in each trace are mapped to close PAST nodes

