1 Awareness Services for Digital Libraries Arturo Crespo Hector Garcia-Molina Stanford University.


1 Awareness Services for Digital Libraries
Arturo Crespo
Hector Garcia-Molina
Stanford University

2 Awareness Services for Digital Libraries
Digital library repository: Data store
Other components:
–Indexers
–Name manager
–Replica manager
–Etc.

3 Data Stores and Clients
[Diagram: Clients (DB Indexer, CS Indexer) and Data Stores (DB Tech Reports, AI Tech Reports, HCI Tech Reports)]

4 Data Store Services
Object access
–Via a handle
Object awareness
–Clients must be aware of changes at the store

5 A Case Study: CS-TR and SIFT
SIFT: a selective dissemination service
CS-TR: a digital library of technical reports from about 50 universities
–Awareness based on timestamps
Problems:
–File system timestamps
–Application timestamps
–Deletions

6 Contributions
Survey of the spectrum of awareness options
–Advantages and disadvantages of each one
–All mechanisms can be captured by a single algorithm: the UNI-AWARE algorithm
Enhancements for signature-based schemes
–Reduced computation
–Reduced communication costs

7 Related Work
Database replica maintenance
Remote file comparison
Deployment of programs over the network

8 The Client-Store Design Space
Push vs. pull
Stateful vs. stateless stores and clients
Cognizant clients and sources
Number of clients per data store

9 The UNI-AWARE Algorithm
A unified algorithm that “covers” known schemes:
–Snapshot algorithm
–Timestamps and versions
–Logs
–Triggers
–Signatures
The algorithm is tailored to a specific scheme through the definition of “custom functions”
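One way to picture tailoring by custom functions is a single polling routine that takes the scheme as a parameter. The sketch below is an illustration only, not the UNI-AWARE algorithm itself: `poll`, `by_timestamp`, `by_signature`, and the object layout (a dict with `content` and `mtime` keys) are all hypothetical names chosen for this example.

```python
import zlib
from typing import Callable, Hashable

Obj = dict          # assumed layout: {"content": bytes, "mtime": int}
Token = Hashable    # whatever the chosen scheme uses to detect change

def by_timestamp(obj: Obj) -> Token:
    """Custom function for a timestamp scheme (hypothetical)."""
    return obj["mtime"]

def by_signature(obj: Obj) -> Token:
    """Custom function for a signature scheme, using CRC-32 (hypothetical)."""
    return zlib.crc32(obj["content"])

def poll(store: dict[str, Obj], remembered: dict[str, Token],
         token_of: Callable[[Obj], Token]):
    """Compare the store against remembered tokens under one scheme."""
    current = {handle: token_of(obj) for handle, obj in store.items()}
    changed = sorted(h for h, t in current.items() if remembered.get(h) != t)
    deleted = sorted(h for h in remembered if h not in current)
    return changed, deleted, current

store = {"a": {"content": b"x", "mtime": 1},
         "b": {"content": b"y", "mtime": 1}}
seen_ts = poll(store, {}, by_timestamp)[2]
seen_sig = poll(store, {}, by_signature)[2]

store["a"]["content"] = b"x2"   # content changes but mtime is not updated
del store["b"]                  # object is deleted

ts_result = poll(store, seen_ts, by_timestamp)[:2]
sig_result = poll(store, seen_sig, by_signature)[:2]
```

In this toy run the timestamp variant misses the change to `a` because its `mtime` was never updated, while the signature variant reports it; both notice the deletion only because the client keeps per-object state.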

10 UNI-AWARE: Signature Algorithm
Signature: a token associated with each document that has a high probability of being unique and that changes when the content of the object changes
Examples: CRC, checksums
Advantages:
–Robust: it does not require metadata maintenance
–Easy to manage consistently when the store fails or an object migrates

11 UNI-AWARE: Signature Algorithm
[Diagram: exchange between Client and Data Store. Labels: All signatures transferred; Request Documents; Document; Signature]
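The exchange on this slide (all signatures transferred, then changed documents requested) can be sketched as follows. This is a minimal illustration under assumptions: CRC-32 from Python's `zlib` stands in for the signature function, documents are held in plain dicts keyed by handle, and the function names are hypothetical rather than taken from the presentation.

```python
import zlib

def signature(content: bytes) -> int:
    """CRC-32 of the content; stands in for any signature function."""
    return zlib.crc32(content)

def store_signatures(store: dict[str, bytes]) -> dict[str, int]:
    """Data store side: one signature per handle, recomputed from content."""
    return {handle: signature(content) for handle, content in store.items()}

def diff_signatures(known: dict[str, int], current: dict[str, int]):
    """Client side: compare cached signatures with those just received."""
    added = sorted(h for h in current if h not in known)
    changed = sorted(h for h in current if h in known and current[h] != known[h])
    deleted = sorted(h for h in known if h not in current)
    return added, changed, deleted

old_store = {"tr-1": b"draft", "tr-2": b"final", "tr-3": b"notes"}
new_store = {"tr-1": b"draft v2", "tr-2": b"final", "tr-4": b"new report"}

known = store_signatures(old_store)            # what the client saw last time
added, changed, deleted = diff_signatures(known, store_signatures(new_store))
to_request = added + changed                   # documents the client asks for
```

Because the signatures are derived from the content alone, the store can recompute them at any time; the cost is that every poll transfers one signature per document.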

12 DIST-UNI-AWARE Algorithm
Objective: reduce the amount of data exchanged between data store and clients
DIST-UNI-AWARE:
–Unified algorithm that can be tailored to different schemes:
»Hierarchical signatures
»Hierarchical timestamps

13 DIST-UNI-AWARE
[Diagram: exchange between Client and Data Store. Labels: Signatures of Buckets transferred; Request more Signatures; Request Documents; Document; Signature]
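A hierarchical-signature exchange of this kind can be sketched as a search over buckets: the client compares one bucket signature, and only asks for finer-grained signatures where the bucket differs. The code below is an assumption-laden sketch, not the DIST-UNI-AWARE algorithm as published: the fixed number of leaf buckets (`LEAVES`), the hashing of handles to leaves, and all function names are hypothetical.

```python
import zlib

LEAVES = 8  # hypothetical number of leaf buckets (a power of two)

def leaf_of(handle: str) -> int:
    """Assign a handle to a leaf bucket; both sides must agree on this."""
    return zlib.crc32(handle.encode()) % LEAVES

def bucket_signature(sigs: dict[str, int], lo: int, hi: int) -> int:
    """Signature over all (handle, signature) pairs in leaves [lo, hi)."""
    items = sorted((h, s) for h, s in sigs.items() if lo <= leaf_of(h) < hi)
    return zlib.crc32(repr(items).encode())

def find_changed_leaves(client_sigs, store_sigs, fanout=2):
    """Return (differing leaves, rounds, bucket signatures sent by the store)."""
    frontier = [(0, LEAVES)]          # start with one bucket covering everything
    changed, rounds, sent = [], 0, 0
    while frontier:
        rounds += 1                   # one request/response exchange
        next_frontier = []
        for lo, hi in frontier:
            sent += 1                 # the store sends this bucket's signature
            if bucket_signature(client_sigs, lo, hi) == bucket_signature(store_sigs, lo, hi):
                continue              # nothing below this bucket changed
            if hi - lo == 1:
                changed.append(lo)    # a differing leaf: request its documents
            else:
                step = max(1, (hi - lo) // fanout)
                next_frontier += [(a, min(a + step, hi)) for a in range(lo, hi, step)]
        frontier = next_frontier
    return changed, rounds, sent

client = {f"tr-{i}": i for i in range(20)}   # toy per-document signatures
store = dict(client)
store["tr-7"] = 999                          # one document changed at the store
leaves, rounds, sent = find_changed_leaves(client, store)
```

In this sketch the round count includes the initial comparison of the top-level bucket, so a single change with 8 leaves and a binary split takes 4 rounds. Passing a larger `fanout` trades fewer rounds for more signatures per round.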

14 Advantages of Signature Algorithms
Support both the push and pull models
No need for reliable storage of additional data structures: if signatures are lost or corrupted, they can be recomputed
Efficient in their use of network, client, and data store resources
Scale well in the number of clients and documents

15 DIST-UNI-AWARE: Performance
Performance depends on the number of changes:
–No changes: only one round is required
–Single change: log_2 n rounds
–2 changes: log_2 n rounds, but twice as much data
–…
–Eventually, DIST-UNI-AWARE starts behaving worse than UNI-AWARE
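A rough worked example makes the trade-off concrete. The figures below are not from the presentation; they assume binary splitting, n a power of two, and one document per leaf bucket, and they count signatures sent rather than bytes.

```latex
% Single change, n = 1024 documents:
\[ \log_2 n = \log_2 1024 = 10 \ \text{rounds} \]
% With k changes, at most k buckets differ at each level,
% and each differing bucket causes 2 child signatures to be sent:
\[ S_{\text{hier}}(k) \le 1 + 2k \log_2 n, \qquad S_{\text{flat}} = n \]
% The bound stays below the flat cost while
\[ k < \frac{n - 1}{2 \log_2 n} = \frac{1023}{20} \approx 51 \]
```

Under these assumptions the hierarchical exchange is guaranteed to send fewer signatures than the flat one for up to about 51 changes out of 1024 documents. The bound is conservative: changes that fall into the same buckets share signatures, so the real crossover can come later.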

16 DIST-UNI-AWARE: Enhancements
Increase group split factor
Client sends additional information at split time
Clustering of changed objects

17 Conclusions
Awareness mechanism for digital libraries
Separation of storage functionality and other services
Awareness schemes must be resilient to computer environment changes and bugs
UNI-AWARE and DIST-UNI-AWARE

18 Awareness Services for Digital Libraries
Arturo Crespo
Hector Garcia-Molina
Stanford University

