1
Large Scale File Distribution
Troy Raeder & Tanya Peters
2
The Problem: Distribute a large file to some number of machines, which is useful for deploying new programs and distributing data. Chirp_distribute was implemented last year and distributes files using a spanning tree. We want to improve upon the existing methods to transfer files more efficiently. Choke points exist – multiple machines will all transfer files through a single router/switch. We also want to minimize failures, including permissions errors.
3
The Solution: Take advantage of network topology – transfer across routers and switches as soon as possible, and then let machines in the same cluster transfer to each other. Using traceroute, we build a graph that represents the network. This is done as needed and saved in a file, which is loaded at run time. Access Control Lists: if we know a source machine doesn’t have permission to transfer to some target, we don’t even try.
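A minimal sketch of how a traceroute-derived graph might be built, clustered, and cached in a file. It is an illustration under stated assumptions, not the authors' implementation: the hop lists are assumed to have been parsed from traceroute output already, hosts are assumed to share a cluster when they share a last-hop router, and the names build_graph, clusters_by_last_hop, save_graph, load_graph, and the JSON file format are all hypothetical.

```python
import json
from collections import defaultdict


def build_graph(routes, origin="source"):
    """Build an undirected graph from traceroute-style hop lists.

    `routes` maps each host to the list of router hops seen on the way to it,
    with the hop nearest the host last. (Assumed input format.)
    """
    graph = defaultdict(set)
    for host, hops in routes.items():
        path = [origin] + list(hops) + [host]
        # Link each consecutive pair of nodes on the path.
        for a, b in zip(path, path[1:]):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def clusters_by_last_hop(routes):
    """Group hosts that sit behind the same last-hop router (an assumption)."""
    clusters = defaultdict(list)
    for host, hops in routes.items():
        key = hops[-1] if hops else "local"
        clusters[key].append(host)
    return {name: sorted(members) for name, members in clusters.items()}


def save_graph(graph, path):
    """Write the graph to a JSON file so it can be reused at run time."""
    with open(path, "w") as f:
        json.dump({node: sorted(nbrs) for node, nbrs in graph.items()}, f)


def load_graph(path):
    """Load a graph previously written by save_graph."""
    with open(path) as f:
        return {node: set(nbrs) for node, nbrs in json.load(f).items()}
```

With routes such as {"a1": ["r1", "sw1"], "a2": ["r1", "sw1"], "b1": ["r1", "sw2"]}, hosts a1 and a2 fall into one cluster and b1 into another, so a copy would cross r1 once per cluster rather than once per host.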
4
Network Topology
5
Picking a Target: Check if all clusters in the graph contain a copy of the file. If some cluster does not, we copy to it. Next, if some node within your cluster doesn't have the file, transfer to it. Otherwise, pick some other node that doesn't have the file. If a node is unable to transfer to nodes that don't have the file yet, it is removed from the list of possible sources.
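The selection order above can be sketched roughly as follows. This is an illustration only: the data structures (a dict of cluster name to member list, a set of nodes that already hold the file) and the allowed callback standing in for the access control check are assumptions, and pick_target and cluster_of are hypothetical names.

```python
def cluster_of(node, clusters):
    """Return the name of the cluster containing `node`, or None."""
    for name, members in clusters.items():
        if node in members:
            return name
    return None


def pick_target(source, clusters, has_file, allowed=lambda src, dst: True):
    """Return the next node `source` should copy to, or None if there is none.

    clusters: dict mapping cluster name -> list of nodes (assumed layout)
    has_file: set of nodes that already hold the file
    allowed:  hypothetical ACL check; returns False if src may not write to dst
    """
    def ok(node):
        return node not in has_file and allowed(source, node)

    # 1. Seed any cluster that does not hold a copy yet.
    for name in sorted(clusters):
        members = clusters[name]
        if not any(m in has_file for m in members):
            for node in sorted(members):
                if ok(node):
                    return node

    # 2. Otherwise fill in the source's own cluster.
    own = cluster_of(source, clusters)
    if own is not None:
        for node in sorted(clusters[own]):
            if ok(node):
                return node

    # 3. Otherwise pick any other node that still lacks the file.
    for name in sorted(clusters):
        for node in sorted(clusters[name]):
            if ok(node):
                return node

    # Nothing reachable: the caller can drop `source` from the source list.
    return None
```

A return value of None corresponds to the last rule on the slide: the node has nothing left it can transfer to, so it would be removed from the list of possible sources.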
6
Initial Results: The current version of the algorithm doesn’t always do better. As expected, for smaller files and/or smaller numbers of hosts, the overhead costs us. For larger files and/or larger numbers of hosts, things like timeouts can wash out the relative gains.
7
What's Next... Pick source and target more intelligently: if an initial attempt to copy from some cluster A to cluster B fails, don't try transferring between these two clusters again unless no other possibilities exist. Try to manage straggler transfers: dynamically set the timeout for transferring a single copy to some multiple of the max or average transfer time seen so far. The end result, we hope, is a significant improvement over the existing algorithm.
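A rough sketch of the two proposed refinements, written as a possible starting point rather than as the planned implementation. The class TransferStats, the function order_targets, the default initial timeout of 60 seconds, and the multiplier of 3 are all assumptions chosen for illustration.

```python
class TransferStats:
    """Track completed transfer times and derive a per-copy timeout."""

    def __init__(self, initial_timeout=60.0, factor=3.0, use_max=False):
        self.initial_timeout = initial_timeout  # used before any data exists
        self.factor = factor                    # the "some multiple" from the slide
        self.use_max = use_max                  # base on max instead of average
        self.durations = []

    def record(self, seconds):
        """Record the duration of one successful transfer."""
        self.durations.append(seconds)

    def timeout(self):
        """Return factor * (max or average) of the transfer times seen so far."""
        if not self.durations:
            return self.initial_timeout
        if self.use_max:
            base = max(self.durations)
        else:
            base = sum(self.durations) / len(self.durations)
        return self.factor * base


def order_targets(source_cluster, candidate_clusters, failed_pairs):
    """Order candidate clusters so pairs that already failed come last.

    failed_pairs is a set of (source_cluster, target_cluster) tuples. Failed
    pairs are kept at the end rather than dropped, so they are retried only
    when no other possibility exists.
    """
    return sorted(
        candidate_clusters,
        key=lambda c: ((source_cluster, c) in failed_pairs, c),
    )
```

For example, after transfers of 10 and 20 seconds, the average-based timeout with a factor of 3 is 45 seconds, and a cluster pair that has already failed is tried only after every other candidate cluster.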