1
Peer-to-peer archival data trading
Brian Cooper
Joint work with Hector Garcia-Molina (and others)
Stanford University
cooperb@stanford.edu
www-db.stanford.edu/peers/
www-diglib.stanford.edu
2
Motivation
  Data: easy to create, hard to preserve
    Broken tapes
    Human deletions
    Going out of business
Goal
  Reliable replication of digital collections
  Given that: resources are limited, sites are autonomous, not all sites are equal
Metric
  Reliability
  Not necessarily “efficiency”
3
Deeds
  A right to use space at another site
  Bookkeeping mechanism for trades
  Used, saved, split, or transferred
  Example deed: “Deed for space. For use by: Library of Congress or for transfer. 623 gigabytes”
Trading algorithm:
  1. Sites trade deeds
  2. Sites exercise deeds to replicate collections
[Figure: sites A and B trading space for Collection 1. Messages: “623 GB?”, “Okay”, “Thanks”. Space traded: 623 GB]
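The deed bookkeeping on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the class names `Deed` and `Site`, the methods `split`, `transfer`, `grant_deed`, and `exercise`, and the 1000 GB site capacities are all hypothetical. Only the 623 GB figure and the operations "used, saved, split, or transferred" come from the slide.

```python
from dataclasses import dataclass


@dataclass
class Deed:
    """A right to use space at another site (all field names are assumptions)."""
    issuer: str      # site that set the space aside
    holder: str      # site currently entitled to use it
    gigabytes: int   # space still covered by the deed

    def split(self, gigabytes):
        """Carve a smaller deed off this one; both remain valid at the issuer."""
        if not 0 < gigabytes < self.gigabytes:
            raise ValueError("split size must be smaller than the deed")
        self.gigabytes -= gigabytes
        return Deed(self.issuer, self.holder, gigabytes)

    def transfer(self, new_holder):
        """Hand the deed to another site."""
        self.holder = new_holder


class Site:
    def __init__(self, name, free_gb):
        self.name = name
        self.free_gb = free_gb
        self.replicas = {}  # collection name -> size in GB stored for others

    def grant_deed(self, holder, gigabytes):
        """Reserve local space and issue a deed for it."""
        if gigabytes > self.free_gb:
            raise ValueError("not enough free space")
        self.free_gb -= gigabytes
        return Deed(self.name, holder, gigabytes)

    def exercise(self, deed, collection, size_gb):
        """Store a copy of a collection against a deed this site issued."""
        if deed.issuer != self.name:
            raise ValueError("deed was issued by another site")
        if size_gb > deed.gigabytes:
            raise ValueError("collection does not fit in the deed")
        deed.gigabytes -= size_gb
        self.replicas[collection] = size_gb


# Step 1: sites A and B trade deeds. Step 2: A exercises its deed at B.
a = Site("A", 1000)
b = Site("B", 1000)
deed_for_a = b.grant_deed("A", 623)
deed_for_b = a.grant_deed("B", 623)
b.exercise(deed_for_a, "Collection 1", 623)
```

In this sketch B's deed at A is simply saved: B can exercise it later, split it, or transfer it to a third site.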
4
Replication architecture
[Figure: local archive and remote archive connected over the Internet. Labels: Users, Filesystem, InfoMonitor, SAV Archive, Service layer, Reliability layer, Archived data]
Data trading
  Data trading is the replication component of a digital archive
  Geographic dispersal of data protects from a variety of failures
Architecture developed with Arturo Crespo
5
Benefits of trading
  High reliability
    Framework for replication
  Site autonomy
    Make local decisions
  Fairness
    Contribute more = more reliability
    Must contribute resources
  Adapts to dynamic situation
    Just make new trades
Other solutions
  Central control
    Loses autonomy
    Still must adapt to dynamicity
  Client based
    Rare collections not protected
  Random
    We can do better
    Especially with limited resources
6
Decisions facing an archive
  Who to trade with
  How much to trade
  When to ask for a trade
  Providing space
  Advertising space
  Picking a number of copies
  Coping with varying site reliabilities
  What to do with acquired resources
  How to deliver other services
Many, many degrees of freedom! How to find the best decisions?
Trading simulator
  1. Generate scenarios
  2. Simulate trading with different policies
  3. Compare reliability
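The three simulator steps can be illustrated with a toy loop. This is a heavily simplified sketch and not the authors' simulator: it does not simulate trading itself, and it assumes that each site fails independently with a fixed probability, so that a collection survives if at least one site holding a copy survives. All function names, the two example policies, and the probability range are hypothetical.

```python
import random
from math import prod


def make_scenario(rng, num_sites=10):
    """Step 1: one scenario is a failure probability per site (assumed model)."""
    return [rng.uniform(0.01, 0.3) for _ in range(num_sites)]


def survival_probability(fail_probs):
    """Chance that at least one copy survives, assuming independent failures."""
    return 1.0 - prod(fail_probs)


def keep_one_remote_copy(scenario, owner):
    """Hypothetical policy: replicate to the first other site only."""
    others = [s for s in range(len(scenario)) if s != owner]
    return others[:1]


def keep_two_remote_copies(scenario, owner):
    """Hypothetical policy: replicate to the first two other sites."""
    others = [s for s in range(len(scenario)) if s != owner]
    return others[:2]


def evaluate(policy, scenarios, owner=0):
    """Step 2: apply a policy to every scenario and average the reliability."""
    total = 0.0
    for scenario in scenarios:
        holders = [owner] + policy(scenario, owner)
        total += survival_probability([scenario[s] for s in holders])
    return total / len(scenarios)


rng = random.Random(0)  # fixed seed so runs are repeatable
scenarios = [make_scenario(rng) for _ in range(100)]

# Step 3: compare reliability across policies.
results = {
    policy.__name__: evaluate(policy, scenarios)
    for policy in (keep_one_remote_copy, keep_two_remote_copies)
}
```

A policy here is just a function from a scenario to a list of partner sites, so further policies can be compared on the same scenarios by adding them to the tuple.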
7
Example: Advertising policy
[Figure: three panels, each showing sites A and B and stored data]
  “I have 120 GB” (120 GB)
  Space fractional policy: “I have 60 GB” (60 GB)
  Data proportional policy: “I have 40 GB” (40 GB)
Data proportional is best policy
  Reserves space for future
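The advertising example can be written as three small functions. This is a sketch under assumptions: the slide only gives the advertised amounts (120, 60, and 40 GB), so the reading that the first panel advertises all free space, the fraction of 0.5, the 20 GB of local data, and the multiple of 2 are all guesses chosen to reproduce those numbers. The function names are hypothetical.

```python
def advertise_all(free_gb):
    """Assumed baseline: advertise every free gigabyte."""
    return free_gb


def space_fractional(free_gb, fraction=0.5):
    """Advertise a fixed fraction of free space (fraction is an assumption)."""
    return free_gb * fraction


def data_proportional(free_gb, data_gb, multiple=2.0):
    """Advertise a multiple of the site's own data size, capped at free space.

    Both data_gb and multiple are assumptions; the slide states only the result.
    """
    return min(free_gb, data_gb * multiple)


free_gb = 120
offers = {
    "advertise_all": advertise_all(free_gb),
    "space_fractional": space_fractional(free_gb),
    "data_proportional": data_proportional(free_gb, data_gb=20),
}
```

Under these assumptions the data proportional policy holds back the most space, which matches the slide's remark that it reserves space for the future.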
8
Extensions
  Some sites > others
    More reliable
    Better reputation
    “Good friends”
    Clustering: trade with trusted partners
    ClosestReliability: trade with other sites that are as reliable as you
    MostReliable: trade with the most reliable site
  Freedom to negotiate
    “Bid trading”
    A asks: “How much do I pay for 100 GB of your space?”
    Bids: “80 GB”, “95 GB”, “120 GB”
    Choose bid based on local situation
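The partner-selection policies named on this slide can be sketched as simple selection functions. The policy names come from the slide, but everything else is an assumption made for illustration: the function signatures, the reliability scores, the trusted-partner sets, and the rule that a bidder picks the cheapest bid (the slide says only that the choice depends on the local situation).

```python
def clustering(me, candidates, trusted):
    """Clustering: keep only candidates in this site's trusted set."""
    return [site for site in candidates if site in trusted[me]]


def closest_reliability(me, candidates, reliability):
    """ClosestReliability: pick the site whose reliability is nearest to ours."""
    return min(candidates, key=lambda site: abs(reliability[site] - reliability[me]))


def most_reliable(candidates, reliability):
    """MostReliable: pick the site with the highest reliability."""
    return max(candidates, key=lambda site: reliability[site])


def cheapest_bid(bids):
    """Bid trading, assumed rule: accept the bid that costs the least space."""
    return min(bids, key=bids.get)


# Hypothetical inputs for a site A choosing among B, C, and D.
reliability = {"A": 0.90, "B": 0.50, "C": 0.85, "D": 0.99}
trusted = {"A": {"B", "C"}}
others = ["B", "C", "D"]

picks = {
    "clustering": clustering("A", others, trusted),
    "closest_reliability": closest_reliability("A", others, reliability),
    "most_reliable": most_reliable(others, reliability),
    # Space in GB each site asks for in return for 100 GB of its own space.
    "cheapest_bid": cheapest_bid({"B": 80, "C": 95, "D": 120}),
}
```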
9
Malicious sites
  Secure services
    Publish: Makes copies to survive failures
    Search: Find documents
    Retrieve: Get a copy of a document
  Challenges
    Attacker may delete copy
    Attacker may provide fake search results
    Attacker may provide altered document
    Attacker may disrupt message routing
    …
Joint work with Mayank Bawa and Neil Daswani