Automated P2P Backup
Group 1
Anderson, Bowers, Johnson, Walker
Motivation
  Hard drive space is cheap
  Network connectivity is cheap
  Losing data is expensive
  We’d like to pool our resources and collectively maintain backups with little effort
Background – Distributed Hashes
  Most academic P2P systems are built on “distributed hash tables”
    –Ask “the system” for a key, and get the content back
  How the distributed hash and the hash lookup are implemented characterizes the P2P system
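The "ask for a key, get the content back" idea can be sketched with a toy, in-memory table. This is only an illustration: the class name `ToyDHT`, the use of SHA-1 for keys, and the modulo placement of keys on nodes are all assumptions made for the example, not details of any system named in these slides.

```python
import hashlib


class ToyDHT:
    """In-memory stand-in for a distributed hash table (illustrative only)."""

    def __init__(self, node_count=4):
        # Each "node" is a local dict; a real system spreads these across machines.
        self.nodes = [dict() for _ in range(node_count)]

    def _node_for(self, key):
        # Assumption: place a key by reducing its numeric value modulo the node count.
        return self.nodes[int(key, 16) % len(self.nodes)]

    def put(self, content: bytes) -> str:
        # The key is derived from the content itself.
        key = hashlib.sha1(content).hexdigest()
        self._node_for(key)[key] = content
        return key

    def get(self, key: str) -> bytes:
        return self._node_for(key)[key]


dht = ToyDHT()
key = dht.put(b"hello backup")
assert dht.get(key) == b"hello backup"
```

Real systems differ mainly in how `_node_for` is realized across a network, which is the point the slide makes about lookup characterizing the system.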
Related Work – P2P
  Pastry
    –Each node has an ID, messages routed to node nearest the hash key
    –N-order graph used to route
  OceanStore
    –Write-once, has versioning
    –Emphasizes local storage
Related Work – P2P (cont.)
  CHORD
    –Routes similarly to Pastry
    –Circular routing space
  Freenet
    –Write-once
    –Many security and anonymity features
    –All resources encrypted by their hash key
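A circular routing space can be illustrated by finding which node is responsible for a key on a ring of IDs. The sketch below uses the Chord-style rule (the first node at or after the key, wrapping around); it only computes the responsible node and does not model multi-hop routing. The 16-bit ring size and the helper names `ring_id` and `successor` are assumptions for the example.

```python
import bisect
import hashlib

RING_BITS = 16  # assumption: a small ID space, chosen only to keep numbers readable


def ring_id(name: str) -> int:
    """Hash a name onto the ring [0, 2**RING_BITS)."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (1 << RING_BITS)


def successor(sorted_node_ids, key_id):
    """Return the first node ID at or after key_id, wrapping around the ring."""
    i = bisect.bisect_left(sorted_node_ids, key_id)
    return sorted_node_ids[i % len(sorted_node_ids)]


nodes = sorted([100, 2000, 40000])
assert successor(nodes, 150) == 2000    # next node clockwise
assert successor(nodes, 50000) == 100   # past the last node, wraps to the first
responsible = successor(nodes, ring_id("dir/one"))
```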
Related Work – Distributed Backup
  Distributed Internet Backup System
    –Not P2P – uses direct connections
    –Encrypts user data on other drives
  Pastiche
    –Built on Pastry
  PAST
    –Also built on Pastry
  pStore
    –Uses own P2P architecture
Design
P2P Adapter
  Abstracts interfacing with a P2P system’s distributed hash
  Writing another adapter could make this work with another system
  We used Freenet because it is the only working P2P system publicly available
  When backing up a file, returns a key that the adapter can later use to retrieve the file
    –P2P system specific
    –Freenet example:
  Done in Python
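One way to picture the adapter boundary is an abstract interface with an insert operation that returns a key and a retrieve operation that takes one. The sketch below is not the project's adapter and does not talk to Freenet: `P2PAdapter`, `LocalDirAdapter`, and the method names are hypothetical, and the local-directory backend exists only so the example runs without a network.

```python
import hashlib
import shutil
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path


class P2PAdapter(ABC):
    """Hypothetical adapter interface: hides one P2P system behind two calls."""

    @abstractmethod
    def insert(self, path: Path) -> str:
        """Store the file and return a system-specific retrieval key."""

    @abstractmethod
    def retrieve(self, key: str, dest: Path) -> None:
        """Fetch the content stored under key and write it to dest."""


class LocalDirAdapter(P2PAdapter):
    """Stand-in backend that keeps files in a local directory (not a P2P system)."""

    def __init__(self, store_dir: Path):
        self.store_dir = store_dir
        self.store_dir.mkdir(parents=True, exist_ok=True)

    def insert(self, path: Path) -> str:
        # Assumption: use the SHA-1 of the content as the retrieval key.
        key = hashlib.sha1(path.read_bytes()).hexdigest()
        shutil.copyfile(path, self.store_dir / key)
        return key

    def retrieve(self, key: str, dest: Path) -> None:
        shutil.copyfile(self.store_dir / key, dest)


with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    source = tmp / "example"
    source.write_text("some data worth keeping\n")
    adapter = LocalDirAdapter(tmp / "store")
    key = adapter.insert(source)
    adapter.retrieve(key, tmp / "restored")
    assert (tmp / "restored").read_bytes() == source.read_bytes()
```

Supporting another P2P system would then mean writing another subclass, leaving the caller unchanged.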
Engine
  Implements backup and retrieval logic
  Drives the P2P adapter to insert and retrieve files
  Stores keys from P2P adapter for retrieval
  Done in Java
UI
  Allows the user to select files or directories for backup
    % mark_dir
    > backup
    > done
  Allows user to initiate backup or retrieval based on selection
  Stores selections in “backup.txt” as a comma-delimited file storing the filename, date of last backup, and retrieval key (if any)
  Done in Java
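The backup.txt layout described above (filename, date of last backup, retrieval key if any) could be read and written roughly as follows. This is a sketch under assumptions: the slides do not give the date format or say how a missing key is represented, so dates are kept as plain text and a missing field becomes an empty string. The names `Selection`, `load_selections`, and `save_selections` are hypothetical.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class Selection:
    filename: str
    last_backup: str    # date format is not specified in the slides; kept as text
    retrieval_key: str  # empty string when no key has been stored yet


def load_selections(stream):
    """Parse comma-delimited rows, padding missing trailing fields with ''."""
    selections = []
    for row in csv.reader(stream):
        if not row:
            continue  # skip blank lines
        filename, last_backup, key = (row + ["", ""])[:3]
        selections.append(Selection(filename, last_backup, key))
    return selections


def save_selections(selections, stream):
    writer = csv.writer(stream)
    for s in selections:
        writer.writerow([s.filename, s.last_backup, s.retrieval_key])


# Round trip through an in-memory buffer instead of a real backup.txt.
buffer = io.StringIO()
save_selections([Selection("dir/one", "", "")], buffer)
assert load_selections(io.StringIO(buffer.getvalue())) == [Selection("dir/one", "", "")]
```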
Example Run
  ls -l
  total 8
  drwx bowersj2 student 4096 Apr 16 13:58 dir
  -rw bowersj2 student 27 Apr 16 13:58 example
  java Backup_UI
  % mark_dir
  > dir
  > done
  % mark_file
  > example
  > done
  % backup
  Backing up dir/one... retrieval key
  Backing up dir/two... retrieval key
  Backing up example... retrieval key
  % exit
Results
  One limit of the Freenet library used was that files must be no larger than 64KB
    –Not fundamental to Freenet
  A Freenet file takes approximately 4-5 seconds to insert into the system
  Retrieval was very fast since it was always from the local drive cache
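A per-file size limit like this is commonly worked around by splitting data into pieces that each fit under the limit. The slides do not say the project did this; the sketch below is only an illustration of the idea, and it assumes "64KB" means 65,536 bytes. The names `CHUNK_LIMIT` and `split_into_chunks` are hypothetical.

```python
CHUNK_LIMIT = 64 * 1024  # assumption: "64KB" read as 65,536 bytes


def split_into_chunks(data: bytes, limit: int = CHUNK_LIMIT):
    """Split data into consecutive pieces of at most `limit` bytes each."""
    return [data[i:i + limit] for i in range(0, len(data), limit)]


payload = b"x" * (2 * CHUNK_LIMIT + 1)
chunks = split_into_chunks(payload)
assert all(len(c) <= CHUNK_LIMIT for c in chunks)
assert b"".join(chunks) == payload  # rejoining the pieces restores the original
```

Each piece would need its own retrieval key, so a scheme like this also has to record the keys in order.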
Design Benefits
  Careful design allows each component to be implemented in any language
    –UI and Engine communicate through backup.txt
    –Engine and P2P adapter communicate through command lines
  Problems getting other P2P systems running
    –Most not publicly available yet
    –PASTRY could not be compiled
    –Shipped source had Java exception handling errors
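Communicating through command lines means one component can start the other as a child process and read its output, whatever language each is written in. The sketch below shows that pattern under assumptions: the `insert` subcommand, the argument order, and the convention that the adapter prints the key on standard output are all hypothetical, and `FAKE_ADAPTER` is a stand-in program rather than the project's adapter.

```python
import subprocess
import sys


def insert_via_cli(adapter_cmd, file_path):
    """Run the adapter as a child process and return the key it prints on stdout."""
    result = subprocess.run(
        list(adapter_cmd) + ["insert", file_path],
        capture_output=True,  # collect stdout/stderr instead of inheriting them
        text=True,            # decode output as text
        check=True,           # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout.strip()


# Stand-in adapter: a one-line program that prints a fake key for the given path.
FAKE_ADAPTER = [sys.executable, "-c", "import sys; print('key-for-' + sys.argv[2])"]

key = insert_via_cli(FAKE_ADAPTER, "example")
assert key == "key-for-example"
```

Because only the command line and the printed text cross the boundary, the caller could just as well be written in Java and the adapter in Python.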
Conclusion
  Modern P2P systems will provide a good substrate for this sort of application
    –When they are released and working!
  Writing a basic version of this kind of application is fairly easy
  Effectiveness depends on the underlying P2P system
    –Freenet doesn’t chunk files, some P2P systems do
    –Freenet has no retention guarantees, some P2P systems do
    –Freenet natively prevents snooping by other users, some P2P systems don’t