1
A Replica Location Service
The Globus Project
USC Information Sciences Institute
Argonne National Laboratory
2
Motivation
In a Data Grid, it may be desirable to create remote, read-only copies (replicas) of storage elements (files)
  To reduce latency of data accesses
  To increase robustness
Need a mechanism for locating replicas
Replica Location Problem: Given a unique logical identifier for data content, determine physical locations of one or more copies of that content
Replica Location Service: a Data Grid component that maintains and provides access to information about physical locations of copies
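The Replica Location Problem can be pictured as a lookup from one logical identifier to a set of physical locations. The following is a minimal sketch under stated assumptions, not the RLS implementation: the class name `LocalReplicaCatalog`, its methods, and the example URLs are all hypothetical.

```python
from collections import defaultdict


class LocalReplicaCatalog:
    """Hypothetical in-memory map from a logical identifier to physical locations."""

    def __init__(self):
        # logical identifier -> set of physical locations (replicas)
        self._replicas = defaultdict(set)

    def add(self, logical_id, physical_location):
        """Record that a copy of the content exists at physical_location."""
        self._replicas[logical_id].add(physical_location)

    def remove(self, logical_id, physical_location):
        """Forget one replica; drop the logical entry when no copies remain."""
        locations = self._replicas.get(logical_id)
        if locations is None:
            return
        locations.discard(physical_location)
        if not locations:
            del self._replicas[logical_id]

    def locate(self, logical_id):
        """Return the known physical locations, or an empty list if none."""
        return sorted(self._replicas.get(logical_id, ()))


catalog = LocalReplicaCatalog()
catalog.add("lfn://example/run42.dat", "gsiftp://site-a.example.org/data/run42.dat")
catalog.add("lfn://example/run42.dat", "gsiftp://site-b.example.org/store/run42.dat")
print(catalog.locate("lfn://example/run42.dat"))
```

A query for an unknown identifier returns an empty list rather than raising, which matches the idea that a client may simply find no replicas.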
3
A Replica Location Service Framework
Applications may operate at different scales, have different resources and different tolerances to inconsistent RLS information
We define a flexible RLS framework
Allows users to make tradeoffs among:
  consistency
  space overhead
  reliability
  update costs
  query costs
By different combinations of 5 essential elements, the framework supports a variety of RLS designs
4
RLS Requirements
Support read-only files
  Mutable files require greater consistency, must use a separate mechanism
Scale of Data Grid (e.g., High Energy Physics)
  200 replica sites
  50 million logical files total
  500 million physical files (replicas) total
  20 million physical files at a replica site
5
RLS Requirements (Cont.)
Data Grid Performance (e.g., High Energy Physics)
  Avg. query response time: 10 milliseconds
  Max. query response time: 5 seconds
  Max. query rates: 10 to 100 per second
  Max. update/insertion rates: 5 to 20 per second
6
RLS Requirements (Cont.)
Security Issues:
  Authorization: Verify that users are allowed to perform requested operations
  Privacy: Knowledge of existence, location and content of data must be controlled
  Integrity: Prevent adversary from tampering with replica location results returned from RLS queries
RLS: protects information about existence and location of data
Individual storage systems: protect privacy and integrity of data contents
7
RLS Requirements (Cont.)
Consistency
  Relaxed consistency: RLS is not required to maintain strict consistency
  Strict consistency would require that RLS always returns a complete and accurate list of copies of specified content
    Difficult or impossible to achieve in a Grid
    Local sites may delete replicas or become disconnected without warning
8
RLS Requirements (Cont.)
Reliability
  No single point of failure: No one RLS site, if it fails or becomes inaccessible, can render entire service inoperable
  Decoupling of local and global state: Failure or inaccessibility of remote RLS components should not affect local access to local replicas
Checksums
9
A Flexible RLS Framework
Five essential elements:
  Reliable Local State
  Unreliable Global State
  Soft State mechanisms for maintaining global state
  Compression of state updates
  Membership protocol
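Soft state means that global entries are kept only as long as the sites owning them keep refreshing them; entries that are not refreshed time out, which is how a deleted replica or a disconnected site eventually disappears from the global view. The following is a rough sketch of that idea, not the framework's protocol: the class `SoftStateIndex`, the timeout value, and the site names are assumptions made for illustration, and time is passed in explicitly to keep the example deterministic.

```python
class SoftStateIndex:
    """Hypothetical global index whose entries expire unless refreshed."""

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s
        # (logical name, site) -> time at which the entry expires
        self._expiry = {}

    def refresh(self, site, logical_names, now):
        """Apply a periodic soft state update sent by a local site."""
        for name in logical_names:
            self._expiry[(name, site)] = now + self.timeout_s

    def expire(self, now):
        """Discard entries that were not refreshed in time."""
        stale = [key for key, deadline in self._expiry.items() if deadline <= now]
        for key in stale:
            del self._expiry[key]

    def sites_for(self, logical_name, now):
        """Return sites believed to hold a replica; the answer may be stale or incomplete."""
        return sorted(
            site
            for (name, site), deadline in self._expiry.items()
            if name == logical_name and deadline > now
        )


index = SoftStateIndex(timeout_s=60)
index.refresh("site-a", ["file1", "file2"], now=0)
index.refresh("site-b", ["file1"], now=30)
print(index.sites_for("file1", now=50))  # both sites still fresh
print(index.sites_for("file1", now=70))  # site-a timed out without a refresh
```

Because the global state can be rebuilt from later updates, losing it is tolerable, which is why the framework can treat it as unreliable while keeping local state reliable.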
10
Example 1: A Centralized, Nonredundant Global Index
All updates sent to a centralized GRIN
Not scalable: All queries serviced by a single index
Not reliable: Single point of failure
11
Example 2: An RLS with LFN Partitioning, Redundancy and Bloom Filter Compression
Updates to specific, redundant GRINs based on LFN
More scalable, reliable
Limited storage and communication costs
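The two mechanisms named in this example can be sketched together: a Bloom filter summarizes the set of LFNs held at a site in a fixed number of bits, and a hash of the LFN picks which index receives the update. This is an illustrative sketch only. The class `BloomFilter`, the function `index_for`, the SHA-256 based hashing, and the filter parameters are assumptions, not the design described in the slides.

```python
import hashlib


class BloomFilter:
    """Fixed-size bit array; membership tests may give false positives, never false negatives."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting the item with the hash number.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))


def index_for(lfn, num_indexes):
    """Pick an index partition for an LFN by hashing it (hypothetical partitioning rule)."""
    digest = hashlib.sha256(lfn.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_indexes


site_lfns = ["lfn://example/a.dat", "lfn://example/b.dat", "lfn://example/c.dat"]
summary = BloomFilter(num_bits=1024, num_hashes=3)
for lfn in site_lfns:
    summary.add(lfn)

# The site would send `summary.bits` instead of the full LFN list.
print(len(summary.bits), "bytes summarize", len(site_lfns), "LFNs")
print(all(summary.might_contain(lfn) for lfn in site_lfns))
```

The filter trades exactness for size: a positive answer means "possibly at this site" and must be confirmed locally, which fits the relaxed consistency the requirements already allow. Redundancy could be layered on by sending each update to more than one partition.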
12
Example 3: An RLS with Redundancy, Compression and Partitioning of Logical Collections
Send collection information to GRINs (lossy)
Advantage: Partition intelligently based on file contents, creation or access patterns
13
Example 4: Hierarchical Index with Partitioning, Bloom Compression, Redundancy
GRINs can exchange soft state updates
Allows large variety of global index configurations