Download presentation
Presentation is loading. Please wait.
Published byAdam Stephens Modified over 9 years ago
1
Storage Management and Caching in PAST A Large-scale persistent peer-to-peer storage utility Presented by Albert Tannous CSE 598D: Storage Systems – Dr. Bhuvan Urgaonkar
2
Introduction to PAST (1) P2P systems can be characterized as distributed systems in which all nodes have identical capabilities and responsibilities and all communication is symmetric PAST: An Internet-based, P2P global storage utility, which aims to provide strong persistence, high availability, scalability and security PAST is based on a self-organizing, Internet based overlay network of storage nodes that cooperatively route file queries, store multiple replicas of files, and cache additional copies of popular files Storage nodes and files are each assigned uniformly distributed identifiers
3
Introduction to PAST (2) PAST is composed of nodes connected to the Internet, where each node is capable of initiating and routing client requests to insert or retrieve files PAST nodes form a self-organizing overlay network Inserted files are replicated across multiple nodes for availability Advantages of PAST: Exploits the multitude and diversity of nodes in the Internet to achieve strong persistence and high availability No need for physical transport of storage media to protect backup and archival data No need for explicit mirroring to ensure high availability and throughput for shared data
4
Introduction to PAST (3) Files stored are associated with a quasi-unique fileId that is generated at the time of the file’s insertion into PAST Files stored are immutable Files can be shared by the owner by distributing fileId (and decryption key if necessary) Pastry is the routing scheme used to route clients’ requests to proper nodes
5
Introduction to PAST (4) A client needs the fileId (and maybe the decryption key) to retrieve a file PAST does not provide facilities for searching, directory lookup, or key distribution. PAST is intended as an archival storage and content distribution utility, not a general-purpose FS
6
PAST Overview (1) Any host connected to the Internet can be a PAST node, just needs the software Set of operations exported to PAST clients: fileId = Insert(name, owner-credentials, k, file) file = Lookup(fileId) Reclaim(fileId, owner-credentials) Each PAST node is assigned a 128-bit node identifier, called a nodeId A set of nodeIds is an excellent candidate for storing the replicas of a file, since the nodes are likely to be diverse in all aspects
7
PAST Overview (2) During an insert operation, PAST stores the file on the k PAST nodes whose nodeIds are numerically closest to the 128 msb of the file’s fileId k is chosen to meet the availability needs of a file, relative to the expected failure rates of individual nodes PAST is layered on top of Pastry
8
Pastry Pastry: A P2P request routing and content location scheme Pastry is efficient, scalable, fault resilient and self- organizing Pastry routes an associated message towards the node whose nodeId is numerically closest to the 128 msb of the fileId, among all live nodes A file can be located unless the k nodes whose nodeIds are numerically closest to the 128 msbs of the fileId have failed within a recovery period
9
PAST Operations (1) Insert Request: a fileId is computed as the SHA-1 hashcode of the file’s textual name, the client’s public key, and a random salt. The required storage (file size times k) is debited against the client’s storage quota, and a file certificate is issued and signed with the owner’s private key The file and the certificate are routed to the first node closest to the fileId. The node accepts the replica and forwards it to the k-1 other nodes When the k nodes have accepted the replica, the client recieves an acknowledgement
10
PAST Operations (2) Lookup request: Client node sends a request message, using the requested fileId as the destination. When a node that stores the file receives the request, node responds with the content and the stored file certificate Reclaim request: Like an insert request, the client’s node issues a reclaim certificate, which allows the replica storing nodes to verify that the file’s legitimate owner is requesting the operation. The storing nodes each issue and return a reclaim receipt, which the client node verifies for a credit against the user’s storage quota
11
Security Each PAST node and each user of the system hold a smartcard (read-only clients don’t need a card). A private/public key pair is associated with each card Each smartcard’s public key is signed with the smartcard issuer’s private key for certification purposes The smartcards ensure the integrity of nodeId and fileId assignments, thus preventing an attacker from controlling adjacent nodes in the nodeId space, or directing file insertions to a specific portion of the fileId space.
12
Storage Management (1) The goal is to allow high global storage utilization and graceful degradation as the system approaches its maximal utilization Responsibilities of the storage management: Balance the remaining free storage space among nodes in the PAST network as the system-wide storage utilization is approaching 100% Maintain the invariant that copies of each file are maintained by the k nodes with nodeIds closest to the fileId
13
Storage Management (2) 2 ways to resolve the conflicting responsibilities: Replica diversion: Allows a node that is not one of the k numerically closest nodes to the fileId to alternatively store the file, if it is in the leaf set of one of those k nodes, to accommodate differences in the storage capacity and utilization of nodes within a leaf set File diversion: A file is diverted to a different part of the nodeId space by choosing a different salt in the generation of its fileId when a node’s entire leaf set is reaching capacity. Its purpose is to achieve more global load balancing across large portions of the nodeId space
14
Causes of Storage Load Imbalance If not all the k closest nodes can accommodate a replica (insufficient storage), but k nodes exist within the leaf sets of the k nodes that can accommodate the file That imbalance in the available storage among the l + k nodes in the intersection of the k leaf sets can arise for several reasons The number of files assigned to each node may differ because of statistical variation in the assignment of nodeIds and fileIds The size distribution of inserted files may have high variance and may be heavy tailed. The storage capacity of individual PAST nodes differs
15
Per-Node Storage Assumption: The storage capacities of individual PAST nodes differ by no more than two orders of magnitude at a given time PAST controls the distribution of per-node storage capacities by comparing the advertised storage capacity of a newly joining node with the average storage capacity of nodes in its leaf set: If the node is too large, it is asked to split and join under multiple nodeId If a node is too small, it is rejected A node is free to advertise only a fraction of its actual disk space for use by PAST. The advertised capacity is used as the basis for the admission decision
16
Replica Diversion Goal: Balance the remaining free storage space among the nodes in a leaf set 3 policies are used: Acceptance of replicas into a node’s local store Selecting a node to store a diverted replica Deciding when to divert a file to a different part of the nodeId space
17
File Diversion Goal: Balance the remaining free storage space among different portions of the nodeId space in PAST When a file insert operation fails, a negative acknowledgment is returned to the client node The client node in turn generates a new fileId using a different salt value and retries the insert operation A client node repeats this process for up to 3 times. If, after 4 attempts the insert operation still fails, the operation is aborted and an insert failure is reported to the application
18
Maintaining Replicas PAST maintains k copies of each inserted file on different nodes within a leaf set: Neighboring nodes in the nodeId space periodically exchange keep-alive messages When a new node joins the system or a previously failed node gets back on-line, it’s included and another node is dropped from each of the previous leaf sets A node that becomes one of the k closest nodes for certain files has to acquire a replica of each file, re- creating replicas that were previously held by the failed node A node that ceases to be one of the k nodes for certain files can discard the copies
19
File Encoding Reed-Solomon encoding: adding m additional checksum blocks to n original data blocks (all of equal size) allows recovery from up to m losses of data or checksum blocks This reduces the storage overhead required to tolerate m failures from m to (m + n)/n times the file size By fragmenting a file into a large number of data blocks, the storage overhead for availability can be made very small Storing fragments of a file at separate nodes (and thereby striping the file over several disks) can also improve bandwidth
20
Caching (1) Goal: Minimize client access latencies (fetch distance), maximize the query throughput and balance the query load in the system Maintaining replicas lead to availability, query load balancing and latency reduction A highly popular file may demand many more than k replicas in order to sustain its lookup load while minimizing client latency and network traffic If a file is popular among one or more local clusters of clients, it’s advantageous to store a copy near each cluster
21
Caching (2) PAST nodes use the unused portion of their advertised disk space to cache files Cached copies can be evicted and discarded at any time: when a node stores a new primary or redirected replica of a file, it evicts one or more cached files to make room for the replica Cache insertion policy: A file routed through a node as part of a lookup or insert operation is inserted into the local disk cache if its size is less than a fraction c of the node’s current cache size (the portion of the node’s storage not currently used to store primary or diverted replicas) Cache replacement policy: Based on the GreedyDual-Size (GD-S)
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.