Peer-to-peer archival data trading Brian Cooper Joint work with Hector Garcia-Molina (and others) Stanford University www-db.stanford.edu/peers/

Presentation transcript:

Peer-to-peer archival data trading
Brian Cooper
Joint work with Hector Garcia-Molina (and others)
Stanford University
www-db.stanford.edu/peers/
www-diglib.stanford.edu

Motivation
  Data: easy to create, hard to preserve
    Broken tapes
    Human deletions
    Going out of business
Goal
  Reliable replication of digital collections
  Given that: resources are limited, sites are autonomous, not all sites are equal
Metric
  Reliability
  Not necessarily “efficiency”

Deeds
  A right to use space at another site
  Bookkeeping mechanism for trades
  Used, saved, split, or transferred
  Example deed: “Deed for space. For use by: Library of Congress or for transfer. 623 gigabytes.”
Trading algorithm:
  1. Sites trade deeds
  2. Sites exercise deeds to replicate collections
[Figure: sites A and B with Collection 1; messages “623 GB?”, “Okay”, “Thanks”; 623 GB]
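The deed mechanism and the two-step trading algorithm above can be sketched in a few lines of Python. This is only an illustrative model, not the authors' implementation: the class names `Deed` and `Site`, the functions `trade` and `exercise`, and the assumption that a trade swaps deeds of equal size are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Deed:
    issuer: str    # site that provides the space
    holder: str    # site entitled to use (or transfer) the space
    size_gb: int

    def split(self, size_gb):
        """Carve a smaller deed out of this one; the remainder stays here."""
        if not 0 < size_gb < self.size_gb:
            raise ValueError("split size must be smaller than the deed")
        self.size_gb -= size_gb
        return Deed(self.issuer, self.holder, size_gb)

    def transfer(self, new_holder):
        """Hand the right to use the space to another site."""
        self.holder = new_holder


@dataclass
class Site:
    name: str
    free_gb: int
    deeds: list = field(default_factory=list)    # deeds this site holds (saved)
    stored: dict = field(default_factory=dict)   # remote collections stored here

    def issue_deed(self, holder, size_gb):
        """Set aside local space and issue a deed for it."""
        if size_gb > self.free_gb:
            raise ValueError("not enough free space")
        self.free_gb -= size_gb
        return Deed(self.name, holder, size_gb)


def trade(a, b, size_gb):
    """Step 1: sites trade deeds (assumed here: an equal-size swap)."""
    if size_gb > a.free_gb or size_gb > b.free_gb:
        raise ValueError("one of the sites cannot cover the trade")
    deed_for_a = b.issue_deed(a.name, size_gb)
    deed_for_b = a.issue_deed(b.name, size_gb)
    a.deeds.append(deed_for_a)
    b.deeds.append(deed_for_b)
    return deed_for_a, deed_for_b


def exercise(deed, sites, collection, size_gb):
    """Step 2: the holder uses a deed to replicate a collection at the issuer."""
    if size_gb > deed.size_gb:
        raise ValueError("collection does not fit in the deed")
    deed.size_gb -= size_gb
    sites[deed.issuer].stored[collection] = size_gb
```

For example, `trade(a, b, 623)` followed by `exercise(deed_for_a, sites, "Collection 1", 623)` mirrors the “623 GB?” / “Okay” / “Thanks” exchange on the slide; a deed that is not exercised simply stays in the holder's `deeds` list.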

Replication architecture
[Figure labels: Users; Filesystem; InfoMonitor; SAV Archive; Internet; Local archive; Remote archive; Service layer; Reliability layer; Archived data]
Data trading
  Geographic dispersal of data protects from a variety of failures
  Data trading is the replication component of a digital archive
  Architecture developed with Arturo Crespo

Benefits of trading
  High reliability
    Framework for replication
  Site autonomy
    Make local decisions
  Fairness
    Contribute more = more reliability
    Must contribute resources
  Adapts to dynamic situation
    Just make new trades
Other solutions
  Central control
    Loses autonomy
    Still must adapt to dynamicity
  Client based
    Rare collections not protected
  Random
    We can do better
    Especially with limited resources

Decisions facing an archive
  Who to trade with
  How much to trade
  When to ask for a trade
  Providing space
  Advertising space
  Picking a number of copies
  Coping with varying site reliabilities
  What to do with acquired resources
  How to deliver other services
Many, many degrees of freedom! How to find the best decisions?
Trading simulator
  1. Generate scenarios
  2. Simulate trading with different policies
  3. Compare reliability
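The three simulator steps can be illustrated with a toy comparison loop. This is a heavily simplified sketch, not the authors' simulator: a "scenario" is assumed to be just one failure probability per site, a "policy" only picks which sites hold copies, site failures are assumed independent, and all function names and parameter ranges are hypothetical.

```python
import random
from math import prod


def generate_scenario(num_sites, rng):
    """Step 1: a scenario here is one failure probability per site (assumed range)."""
    return [rng.uniform(0.05, 0.30) for _ in range(num_sites)]


def survival_probability(failure_probs):
    """Probability that at least one copy survives, assuming independent failures."""
    return 1.0 - prod(failure_probs)


def random_policy(scenario, copies, rng):
    """Place copies at sites chosen uniformly at random."""
    return rng.sample(range(len(scenario)), copies)


def most_reliable_policy(scenario, copies, rng):
    """Place copies at the sites least likely to fail."""
    ranked = sorted(range(len(scenario)), key=lambda s: scenario[s])
    return ranked[:copies]


def compare(policies, num_scenarios=200, num_sites=10, copies=3, seed=0):
    """Steps 2 and 3: run every policy on every scenario, average the reliability."""
    rng = random.Random(seed)
    totals = {name: 0.0 for name in policies}
    for _ in range(num_scenarios):
        scenario = generate_scenario(num_sites, rng)
        for name, policy in policies.items():
            placement = policy(scenario, copies, rng)
            totals[name] += survival_probability([scenario[s] for s in placement])
    return {name: total / num_scenarios for name, total in totals.items()}


if __name__ == "__main__":
    results = compare({"random": random_policy, "most_reliable": most_reliable_policy})
    for name, reliability in results.items():
        print(f"{name}: {reliability:.4f}")
```

A realistic simulator would also have to model storage limits, deeds, and the trades themselves; the point here is only the generate / simulate / compare structure.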

Example: Advertising policy
[Figure: three panels, each showing sites A and B]
  “I have 120 GB” (120 GB)
  Space fractional policy: “I have 60 GB” (60 GB)
  Data proportional policy: “I have 40 GB” (40 GB; Data)
Data proportional is the best policy
  Reserves space for the future
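One way to read the advertising example is as three small functions. The sketch below is an interpretation, not something the slide states: the function names, the fraction of 0.5 (chosen because 60 GB is half of 120 GB), and the assumption that the data proportional policy advertises a multiple of the site's own data (here 40 GB of data with a factor of 1.0) are all hypothetical.

```python
def advertise_all(free_gb):
    """Advertise all free space (the unnamed first panel, as read here)."""
    return free_gb


def space_fractional(free_gb, fraction=0.5):
    """Advertise a fixed fraction of the free space (fraction is an assumption)."""
    return free_gb * fraction


def data_proportional(free_gb, local_data_gb, factor=1.0):
    """Advertise space proportional to the site's own data, capped by free space."""
    return min(free_gb, local_data_gb * factor)


if __name__ == "__main__":
    free_gb, local_data_gb = 120, 40   # local_data_gb is an assumed value
    print(advertise_all(free_gb))                      # 120
    print(space_fractional(free_gb))                   # 60.0
    print(data_proportional(free_gb, local_data_gb))   # 40.0
```

Under this reading, the data proportional policy holds back the most space, which matches the slide's remark that it reserves space for the future.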

Extensions
  Some sites > others
    More reliable
    Better reputation
    “Good friends”
  Partner selection
    Clustering: trade with trusted partners
    ClosestReliability: trade with other sites that are as reliable as you
    MostReliable: trade with the most reliable site
  Freedom to negotiate
    “Bid trading”
    A asks: “How much do I pay for 100 GB of your space?”
    Bids: “80 GB”, “95 GB”, “120 GB”
    Choose bid based on local situation
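The partner-selection policies and bid trading can be sketched as simple selection functions. Everything below is illustrative: the reliability scores, the site names, and the rule that the requester accepts the lowest bid are assumptions, since the slide only says that each site chooses its bid based on its local situation.

```python
def clustering(candidates, trusted):
    """Clustering: only trade with partners in a trusted set."""
    return [site for site in candidates if site in trusted]


def closest_reliability(me, candidates, reliability):
    """ClosestReliability: pick the site whose reliability is nearest to your own."""
    return min(candidates, key=lambda site: abs(reliability[site] - reliability[me]))


def most_reliable(candidates, reliability):
    """MostReliable: pick the most reliable site."""
    return max(candidates, key=lambda site: reliability[site])


def choose_bid(bids_gb):
    """Bid trading: accept the cheapest offer (assumed rule for the requester)."""
    return min(bids_gb, key=bids_gb.get)


if __name__ == "__main__":
    # Hypothetical reliability scores in [0, 1]; higher is more reliable.
    reliability = {"A": 0.90, "B": 0.95, "C": 0.60, "D": 0.88}
    others = ["B", "C", "D"]

    print(clustering(others, trusted={"C", "D"}))
    print(closest_reliability("A", others, reliability))
    print(most_reliable(others, reliability))

    # A asks what it must pay for 100 GB; each site answers with a price in GB.
    print(choose_bid({"B": 80, "C": 95, "D": 120}))
```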

Malicious sites
  Secure services
    Publish: Makes copies to survive failures
    Search: Find documents
    Retrieve: Get a copy of a document
  Challenges
    Attacker may delete copy
    Attacker may provide fake search results
    Attacker may provide altered document
    Attacker may disrupt message routing
    …
  Joint work with Mayank Bawa and Neil Daswani