Awareness Services for Digital Libraries. Arturo Crespo, Hector Garcia-Molina, Stanford University.


1 Awareness Services for Digital Libraries
Arturo Crespo, Hector Garcia-Molina
Stanford University

2 Data Storage Motivation
Our objective: create the next generation of Data Repositories tailored to Digital Library needs:
  – Persistence, Distribution, Intellectual Property, Indexing and Cataloging, Replication, ...
[Diagram labels: Indexers, Replica, Naming, Data Storage Clients]

3 Data Stores and Clients
[Diagram labels: Clients; Data Stores; DB Indexer; CS Indexer; DB Tech Reports; AI Tech Reports; HCI Tech Reports]

4 Data Store Services
Object access
  – Via a handle
Object awareness
  – Clients must be aware of changes at the store

5 A Case Study: CS-TR and SIFT
SIFT: a selective dissemination service
CS-TR: a digital library of technical reports from about 50 universities
  – Awareness based on timestamps
Problems:
  – File system timestamps
  – Application timestamps
  – Deletions

6 The Problem
How can a Data Storage Client detect the changes that have happened in remote Data Storages since the last update?
There is no “Perfect Algorithm”:
  – The best algorithm for solving this problem depends on the characteristics of the relationship between the Data Storage and the client

7 The Design Space
Ratio of Data Storages per Client
Stateful versus Stateless Data Storages in relation to the Clients
Push versus Pull Model
Update Frequency
  – How often the repository changes
  – How often the client is updated
Client awareness of Data Storages
Complexity of the Algorithm

8 Standard Mechanisms for Client Updating
Key Query Algorithm
Snapshot Differential Algorithm
Timestamps and Versions
Logs
Triggers
Signatures

9 Contributions
Survey of the spectrum of awareness options
  – Advantages and disadvantages of each one
  – All mechanisms can be captured by a single algorithm: the UNI-AWARE algorithm
Enhancements for signature-based schemes
  – Reduced computation
  – Reduced communication costs

10 Related Work
Database replica maintenance
Remote file comparison
Deployment of programs over the network

11 The UNI-AWARE Algorithm
A unified algorithm that “covers” known schemes:
  – Snapshot algorithm
  – Timestamps and versions
  – Logs
  – Triggers
  – Signatures
The algorithm is tailored to a specific scheme through the definition of “custom functions”
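One way to picture tailoring through custom functions is a generic change detector that takes the scheme-specific part as a parameter. The sketch below is only an illustration under stated assumptions: the slides do not list the actual custom functions, so `changed_handles`, `token_of`, `by_version`, `by_signature`, and the dictionary layout of a stored object are all hypothetical names chosen here, not the authors' definitions.

```python
import zlib
from typing import Any, Callable, Dict, Hashable, List

# Hypothetical object layout (an assumption): {"content": bytes, "version": int}
StoredObject = Dict[str, Any]


def changed_handles(
    store: Dict[str, StoredObject],
    remembered: Dict[str, Hashable],
    token_of: Callable[[StoredObject], Hashable],
) -> List[str]:
    """Return handles whose current token differs from the token the client remembers."""
    return sorted(h for h, obj in store.items() if remembered.get(h) != token_of(obj))


def by_version(obj: StoredObject) -> Hashable:
    """Custom function for a version-based scheme (hypothetical)."""
    return obj["version"]


def by_signature(obj: StoredObject) -> Hashable:
    """Custom function for a signature-based scheme (hypothetical), using CRC-32."""
    return zlib.crc32(obj["content"])


# State the client saw at its last update.
old = {
    "tr-1": {"content": b"draft v1", "version": 1},
    "tr-2": {"content": b"final", "version": 3},
}
seen_versions = {h: by_version(o) for h, o in old.items()}
seen_signatures = {h: by_signature(o) for h, o in old.items()}

# tr-1 is edited, but the application fails to bump its version number.
new = {
    "tr-1": {"content": b"draft v2", "version": 1},
    "tr-2": old["tr-2"],
}

print(changed_handles(new, seen_versions, by_version))
print(changed_handles(new, seen_signatures, by_signature))
```

The same detector runs under both schemes; only the function passed as `token_of` changes. In this example the version-based run reports nothing because the metadata was never updated, while the signature-based run reports `tr-1`.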

12 UNI-AWARE: Signature Algorithm
Signature: a token associated with each document that has a high probability of being unique and that changes when the content of the object changes
Examples: CRC, checksums
Advantages:
  – Robust, as it does not require metadata maintenance
  – Easy to manage consistently when the store fails or an object migrates
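As a minimal sketch of such a signature, the snippet below uses CRC-32 from Python's standard `zlib` module. The function name `signature` and the sample contents are assumptions made for illustration; the slides name CRC and checksums only as examples and do not prescribe a particular function.

```python
import zlib


def signature(content: bytes) -> int:
    """Return the CRC-32 checksum of an object's content, used here as its signature."""
    return zlib.crc32(content)


# The signature depends only on the content, so it can be recomputed at any time.
original = signature(b"Technical report, version 1")
modified = signature(b"Technical report, version 2")

print(f"{original:08x} {modified:08x}")
print("changed" if original != modified else "unchanged")
```

Because the token is derived from the content alone, no separate metadata has to be kept in step with the object.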

13 UNI-AWARE: Signature Algorithm
[Diagram: Client and Data Store, with documents and signatures. Labels: "All signatures transferred", "Request Documents"]
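The exchange in this diagram can be sketched as a comparison of two signature tables. This is an illustrative reading of the diagram labels, not the authors' code: `detect_changes`, the handle names, and the integer signatures are hypothetical.

```python
from typing import Dict, List, Tuple


def detect_changes(
    client_sigs: Dict[str, int], store_sigs: Dict[str, int]
) -> Tuple[List[str], List[str]]:
    """Compare the client's remembered signatures with the full list sent by the store.

    Returns (handles to request, handles that no longer exist at the store).
    """
    # New or modified documents: signature missing or different on the client.
    to_request = sorted(h for h, sig in store_sigs.items() if client_sigs.get(h) != sig)
    # Documents the client knows about that the store no longer lists.
    deleted = sorted(h for h in client_sigs if h not in store_sigs)
    return to_request, deleted


# Signatures remembered by the client from its last update (made-up values).
client = {"tr-1": 111, "tr-2": 222, "tr-3": 333}
# "All signatures transferred": the store's current table.
store = {"tr-1": 111, "tr-2": 999, "tr-4": 444}

to_request, deleted = detect_changes(client, store)
print("Request Documents:", to_request)
print("Deleted at store:", deleted)
```

In this sketch a handle that is absent from the store's table shows up directly as a deletion, and the cost is one signature per document at the store on every update.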

14 DIST-UNI-AWARE Algorithm
Objective: reduce the amount of data exchanged between data store and clients
DIST-UNI-AWARE:
  – Unified algorithm that can be tailored to different schemes:
    » Hierarchical signatures
    » Hierarchical timestamps

15 DIST-UNI-AWARE
[Diagram: Client and Data Store, with documents and signatures. Labels: "Signatures of Buckets transferred", "Request more Signatures", "Request Documents"]
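A two-level version of the bucket exchange can be sketched as follows. This is a simplified illustration under assumptions, not the DIST-UNI-AWARE algorithm itself: the bucket count, the assignment of handles to buckets, the use of SHA-256 for bucket signatures, and the names `bucket_of`, `bucket_signatures`, and `sync` are all hypothetical, and only one level of grouping is shown.

```python
import hashlib
import zlib
from typing import Dict, List, Tuple

NUM_BUCKETS = 4  # hypothetical; both sides must use the same value


def bucket_of(handle: str) -> int:
    """Assign a handle to a bucket (assumed rule: CRC-32 of the handle modulo NUM_BUCKETS)."""
    return zlib.crc32(handle.encode()) % NUM_BUCKETS


def bucket_signatures(doc_sigs: Dict[str, int]) -> Dict[int, str]:
    """Compute one signature per bucket from the signatures of its member documents."""
    members: Dict[int, List[str]] = {}
    for handle in sorted(doc_sigs):
        members.setdefault(bucket_of(handle), []).append(f"{handle}:{doc_sigs[handle]}")
    return {b: hashlib.sha256(";".join(m).encode()).hexdigest() for b, m in members.items()}


def sync(client_sigs: Dict[str, int], store_sigs: Dict[str, int]) -> Tuple[List[str], int]:
    """Return (handles to request, number of document signatures the store had to send)."""
    # Round 1: "Signatures of Buckets transferred".
    client_b = bucket_signatures(client_sigs)
    store_b = bucket_signatures(store_sigs)
    mismatched = {b for b in client_b.keys() | store_b.keys() if client_b.get(b) != store_b.get(b)}
    # Round 2: "Request more Signatures", only for buckets that differ.
    sent = {h: s for h, s in store_sigs.items() if bucket_of(h) in mismatched}
    # Round 3: "Request Documents" whose signatures differ.
    to_fetch = sorted(h for h, s in sent.items() if client_sigs.get(h) != s)
    return to_fetch, len(sent)


store = {f"tr-{i}": zlib.crc32(f"report {i}".encode()) for i in range(20)}
client = dict(store)
store["tr-7"] = zlib.crc32(b"report 7 (revised)")

to_fetch, transferred = sync(client, store)
print(to_fetch, f"{transferred} of {len(store)} document signatures transferred")
```

When nothing has changed, only the bucket signatures cross the network; when one document changes, the store sends document signatures for that document's bucket alone.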

16 Advantages of Signature Algorithms
Support the push and pull models
No need for reliable storage of additional data structures: if signatures are lost or corrupted, they can be recomputed
Efficient in usage of network resources, clients and data stores
Scale well in number of clients and documents

17 DIST-UNI-AWARE: Enhancements
Increase group split factor
Client sends additional information at split time
Clustering of changed objects

18 Conclusions
Awareness mechanism for digital libraries
Separation of storage functionality and other services
Awareness schemes must be resilient to computer environment changes and bugs
UNI-AWARE and DIST-UNI-AWARE

19 Reference
Arturo Crespo, Hector Garcia-Molina. "Awareness Services for Digital Libraries." ECDL'97.
http://www-db.stanford.edu/~crespo/publications/

20 Awareness Services for Digital Libraries
Arturo Crespo, Hector Garcia-Molina
Stanford University

