
1 A Replica Location Service
The Globus Project
USC Information Sciences Institute
Argonne National Laboratory

2 Motivation
- In a Data Grid, it may be desirable to create remote, read-only copies (replicas) of storage elements (files)
  - To reduce latency of data accesses
  - To increase robustness
- Need a mechanism for locating replicas
- Replica Location Problem: Given a unique logical identifier for data content, determine physical locations of one or more copies of that content
- Replica Location Service: a Data Grid component that maintains and provides access to information about physical locations of copies
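
To make the Replica Location Problem concrete, here is a minimal sketch of the mapping it describes: one logical identifier, zero or more physical locations. The class name `ReplicaCatalog`, its methods, and the example URLs are illustrative assumptions, not part of the presented system.

```python
from collections import defaultdict


class ReplicaCatalog:
    """Toy in-memory map from logical identifiers to physical locations.

    Hypothetical illustration only; not the API of any real RLS.
    """

    def __init__(self):
        self._locations = defaultdict(set)

    def register(self, logical_id, physical_url):
        # Record that a copy of the content exists at physical_url.
        self._locations[logical_id].add(physical_url)

    def unregister(self, logical_id, physical_url):
        # Forget one copy; drop the logical entry when no copies remain.
        self._locations[logical_id].discard(physical_url)
        if not self._locations[logical_id]:
            del self._locations[logical_id]

    def lookup(self, logical_id):
        # Return every known physical location, or an empty list.
        return sorted(self._locations.get(logical_id, ()))


catalog = ReplicaCatalog()
catalog.register("lfn://run42/events", "gsiftp://site-a.example.org/data/events")
catalog.register("lfn://run42/events", "gsiftp://site-b.example.org/data/events")
print(catalog.lookup("lfn://run42/events"))
```

A query for an unknown identifier returns an empty list rather than raising, which matches the idea that the service reports what it knows and nothing more.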

3 A Replica Location Service Framework
- Applications may operate at different scales, have different resources and different tolerances to inconsistent RLS information
- We define a flexible RLS framework
- Allows users to make tradeoffs among:
  - consistency
  - space overhead
  - reliability
  - update costs
  - query costs
- By different combinations of 5 essential elements, the framework supports a variety of RLS designs

4 RLS Requirements
- Support read-only files
  - Mutable files require greater consistency, must use a separate mechanism
- Scale of Data Grid (e.g., High Energy Physics)
  - 200 replica sites
  - 50 million logical files total
  - 500 million physical files (replicas) total
  - 20 million physical files at a replica site
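
The scale figures imply a few averages that are useful when sizing an index. The short calculation below uses only the numbers from the slide. Comparing the per-site figure of 20 million with the computed average assumes that it describes a single large site, which the slide does not state.

```python
# Figures taken directly from the slide.
replica_sites = 200
logical_files = 50_000_000
physical_files = 500_000_000
files_at_one_site = 20_000_000

# Average number of copies per logical file: 10.0
avg_replicas_per_logical_file = physical_files / logical_files

# Average number of physical files per site if spread evenly: 2,500,000.0
avg_physical_files_per_site = physical_files / replica_sites

# The per-site figure is 8.0 times the even-spread average
# (assumption: it describes a single large site).
site_to_average_ratio = files_at_one_site / avg_physical_files_per_site

print(avg_replicas_per_logical_file, avg_physical_files_per_site, site_to_average_ratio)
```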

5 RLS Requirements (Cont.)
- Data Grid Performance (e.g., High Energy Physics)
  - Avg. query response time: 10 milliseconds
  - Max. query response time: 5 seconds
  - Max. query rates: 10 to 100 per second
  - Max. update/insertion rates: 5 to 20 per second

6 RLS Requirements (Cont.)
- Security Issues:
  - Authorization: Verify that users are allowed to perform requested operations
  - Privacy: Knowledge of existence, location and content of data must be controlled
  - Integrity: Prevent adversary from tampering with replica location results returned from RLS queries
- RLS: protects information about existence and location of data
- Individual storage systems: protect privacy and integrity of data contents

7 RLS Requirements (Cont.)
- Consistency
  - Relaxed consistency: RLS is not required to maintain strict consistency
  - Strict consistency would require that RLS always returns a complete and accurate list of copies of specified content
  - Difficult or impossible to achieve in a Grid
  - Local sites may delete replicas or become disconnected without warning

8 RLS Requirements (Cont.)
- Reliability
  - No single point of failure: No one RLS site, if it fails or becomes inaccessible, can render entire service inoperable
  - Decoupling of local and global state: Failure or inaccessibility of remote RLS components should not affect local access to local replicas
- Checksums
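
The decoupling requirement can be illustrated with a lookup that always answers from local state and treats the global index as optional. This is a minimal sketch: the function `locate`, the dictionary used as a local catalog, and the `query_global` callable are hypothetical names introduced here.

```python
def locate(lfn, local_catalog, query_global):
    """Return known locations for lfn, tolerating a global-index failure.

    local_catalog: dict mapping logical names to lists of local locations.
    query_global:  callable that may raise ConnectionError (assumed interface).
    """
    # Local state is consulted first and never depends on remote components.
    locations = list(local_catalog.get(lfn, []))
    try:
        locations.extend(query_global(lfn))
    except ConnectionError:
        # Global index unreachable: local replicas are still returned.
        pass
    return locations


def unreachable_index(lfn):
    # Stand-in for a remote index that is down.
    raise ConnectionError("global index unavailable")


local = {"lfn://a": ["file:///data/a"]}
print(locate("lfn://a", local, unreachable_index))
```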

9 A Flexible RLS Framework
Five essential elements:
- Reliable Local State
- Unreliable Global State
- Soft State mechanisms for maintaining global state
- Compression of state updates
- Membership protocol
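
As a rough illustration of the soft-state element, the sketch below keeps global state that is only valid while it is being refreshed: each site periodically re-sends its list of logical names, and entries not refreshed within a timeout are dropped. The class `SoftStateIndex`, its methods, and the explicit `now` argument are assumptions made for this example; the slides do not specify an interface.

```python
class SoftStateIndex:
    """Global index whose entries expire unless refreshed (illustrative only)."""

    def __init__(self, timeout_seconds):
        self.timeout = timeout_seconds
        self._entries = {}  # site name -> (frozenset of logical names, last refresh time)

    def refresh(self, site, lfns, now):
        # A periodic update from a site replaces whatever it sent before.
        self._entries[site] = (frozenset(lfns), now)

    def expire(self, now):
        # Drop state from sites that have not refreshed within the timeout.
        stale = [s for s, (_, t) in self._entries.items() if now - t > self.timeout]
        for site in stale:
            del self._entries[site]

    def sites_for(self, lfn, now):
        # Answer from whatever unexpired state is currently held.
        self.expire(now)
        return sorted(s for s, (lfns, _) in self._entries.items() if lfn in lfns)


index = SoftStateIndex(timeout_seconds=60)
index.refresh("siteA", ["lfn1"], now=0)
index.refresh("siteB", ["lfn1", "lfn2"], now=30)
print(index.sites_for("lfn1", now=50))  # both sites still fresh
print(index.sites_for("lfn1", now=70))  # siteA has timed out
```

Because stale entries simply age out, a site that disconnects without warning needs no explicit cleanup message, which fits the relaxed-consistency requirement.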

10 Example 1: A Centralized, Nonredundant Global Index
- All updates sent to a centralized GRIN
- Not scalable: All queries serviced by a single index
- Not reliable: Single point of failure

11 Example 2: An RLS with LFN Partitioning, Redundancy and Bloom Filter Compression
- Updates to specific, redundant GRINs based on LFN (logical file name)
- More scalable, reliable
- Limited storage and communication costs
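
The two ideas in this example can be sketched together: a Bloom filter as a compact, lossy summary of the logical names a site holds, and a hash of the LFN to choose which redundant GRINs receive the update. Everything below is an assumption for illustration, including the filter size, the SHA-256 based hashing, and the names `BloomFilter` and `grins_for`; the slides do not describe the actual scheme.

```python
import hashlib


class BloomFilter:
    """Fixed-size bit array; may report false positives, never false negatives."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions from salted SHA-256 digests.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))


def grins_for(lfn, num_grins, redundancy):
    """Pick `redundancy` distinct index nodes for an LFN (hypothetical scheme)."""
    if not 1 <= redundancy <= num_grins:
        raise ValueError("redundancy must be between 1 and num_grins")
    digest = hashlib.sha256(lfn.encode("utf-8")).digest()
    primary = int.from_bytes(digest[:8], "big") % num_grins
    # Redundant copies go to the next nodes in a fixed ring order.
    return [(primary + r) % num_grins for r in range(redundancy)]


summary = BloomFilter(num_bits=1024, num_hashes=3)
summary.add("lfn://run42/events")
print("lfn://run42/events" in summary)
print(grins_for("lfn://run42/events", num_grins=8, redundancy=2))
```

The filter costs a fixed number of bits regardless of how many names it summarizes, which is where the limited storage and communication costs come from. The price is that a membership test can occasionally answer yes for a name that was never added.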

12 Example 3: An RLS with Redundancy, Compression and Partitioning of Logical Collections
- Send collection information to GRINs (lossy)
- Advantage: Partition intelligently based on file contents, creation or access patterns

13 Example 4: Hierarchical Index with Partitioning, Bloom Compression, Redundancy
- GRINs can exchange soft state updates
- Allows large variety of global index configurations
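
One way to picture a hierarchical index is a tree in which each index node records which child reported a logical name and forwards the update to its parent. The sketch below assumes that shape. The class `IndexNode`, the node names, and the site names are hypothetical, and expiry and compression of updates are left out to keep it short.

```python
class IndexNode:
    """Index node that forwards updates to an optional parent (illustrative only)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.known = {}  # logical name -> set of children or sites that reported it

    def receive_update(self, source, lfns):
        # Remember who reported each name, then pass the update up the tree.
        for lfn in lfns:
            self.known.setdefault(lfn, set()).add(source)
        if self.parent is not None:
            self.parent.receive_update(self.name, lfns)

    def where(self, lfn):
        # Names of the children or sites to ask next for this logical name.
        return sorted(self.known.get(lfn, ()))


root = IndexNode("root")
region_1 = IndexNode("region-1", parent=root)
region_2 = IndexNode("region-2", parent=root)
region_1.receive_update("site-a", ["f1"])
region_2.receive_update("site-b", ["f1", "f2"])
print(root.where("f1"))      # which regional nodes know about f1
print(region_1.where("f1"))  # which sites under region-1 reported f1
```

A query that starts at the root is narrowed one level at a time, so no single node has to hold the full site-level mapping.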

