Rapid Content Distribution Using An Ordered Seeding Torrent Raja Naresh Dr. Javed Khan Media Communications Networking Research Lab Kent State University
Outline Problem Description and Background; Terms and Assumptions; Overlay Network; The Technique; Seeding; Distribution; Process; Time Complexity Analysis; Big Picture; Experimental Evaluation; Bound Achieved
Problem Description and Background A Content Distribution Network (CDN) is a system of computers holding copies of data at various nodes of a network. Companies sign up with CDN providers for their services. Content must be synchronized between the CDN nodes. Given N nodes in a network, we want to distribute a file to all of them as fast as possible.
Terms and Assumptions There are two kinds of nodes. The Seed holds the file/content to be distributed; there is only one Seed. The Non-Seed nodes (Sinks/Peers) are the nodes to which the content has to be distributed. Each Sink is identified by a unique binary nodeID. Two Sink nodes are connected in the overlay network when the Hamming distance between their nodeIDs is 1, which gives a hypercube structure. The Seed is connected to all the Sink nodes. Transferring the whole file between any two nodes takes 1 unit of time.
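The overlay rule (Sinks are linked when their nodeIDs differ in exactly one bit) can be sketched in Python. This is a minimal illustration; the function names `hamming_distance`, `neighbors` and `format_id` are assumptions made for this sketch and do not come from the slides.

```python
def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which the two IDs differ."""
    return bin(a ^ b).count("1")


def neighbors(node_id: int, d: int) -> list[int]:
    """Overlay neighbours of a Sink in a d-dimensional hypercube.

    Flipping each of the d bits once yields exactly the nodes at
    Hamming distance 1.
    """
    return [node_id ^ (1 << bit) for bit in range(d)]


def format_id(node_id: int, d: int) -> str:
    """Render an ID as a d-bit binary string, e.g. 1 -> '001' for d = 3."""
    return format(node_id, f"0{d}b")


if __name__ == "__main__":
    d = 3
    for node in range(2 ** d):
        linked = ", ".join(format_id(n, d) for n in neighbors(node, d))
        print(f"{format_id(node, d)} -> {linked}")
```

Each Sink therefore has d = log2(N) overlay neighbours, in addition to its direct link to the Seed.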
Overlay Network [Figures: example hypercube overlays, one of them a 3-D hypercube]
The Technique Fragmentation and Broadcasting. The file is split into Fragment 1, Fragment 2, …, Fragment N. Fragments also have unique binary IDs. Each fragment has size 1/N, so transferring a fragment between any two nodes takes 1/N units of time. We call this one timestamp.
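A minimal sketch of the fragmentation step, assuming the file is available as a byte string and that N is a power of two so that fragment IDs line up with hypercube nodeIDs. The function name `fragment` and the ceiling-division sizing are assumptions of this sketch, not details given in the slides.

```python
def fragment(data: bytes, n: int) -> dict[str, bytes]:
    """Split data into n fragments keyed by a binary fragment ID.

    Assumes n is a power of two. Fragments are equal-sized except that
    trailing fragments may be shorter (or empty) when len(data) is not
    a multiple of n.
    """
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    width = max(1, n.bit_length() - 1)   # number of ID bits, d = log2(n)
    size = -(-len(data) // n)            # ceiling division
    return {
        format(i, f"0{width}b"): data[i * size:(i + 1) * size]
        for i in range(n)
    }


if __name__ == "__main__":
    pieces = fragment(b"abcdefghijklmnop", 8)
    for frag_id, payload in pieces.items():
        print(frag_id, payload)
```

Concatenating the fragments in ID order restores the original content.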
Seeding The Seed is responsible for fragmenting the file into N fragments. It sends each fragmentID to the Sink with the same nodeID. To do so, it formulates an array consisting of the destination nodeIDs; the array index represents the sequence in which each fragmentID is transferred to its nodeID. [Figure: 3-D illustration of the Seeding algorithm (8 nodes)]
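The seeding step can be sketched as a schedule built from the destination array. The slides do not spell out the ordering itself in text, so this sketch takes the order as an input; the function name `seeding_schedule` and the example order used below are illustrative assumptions only.

```python
def seeding_schedule(order: list[int], d: int) -> list[tuple[int, str, str]]:
    """Return (timestamp, fragmentID, nodeID) triples for the Seed's transfers.

    order is the Seed's destination array: order[i] is the nodeID served
    at timestamp i + 1. The fragment sent always carries the same ID as
    the destination node.
    """
    schedule = []
    for index, node in enumerate(order):
        ident = format(node, f"0{d}b")
        schedule.append((index + 1, ident, ident))
    return schedule


if __name__ == "__main__":
    # Hypothetical order, used only to show the output format.
    for timestamp, frag_id, node_id in seeding_schedule([0, 7, 3, 4], 3):
        print(f"t={timestamp}: fragment [{frag_id}] -> node {node_id}")
```

Because the Seed sends one fragment per timestamp, any such schedule occupies N timestamps.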
Distribution The general idea is the same as broadcasting in a parallel network such as a hypercube. Consider fragmentID [000] being distributed in the hypercube. Analysis of node 001 for fragment [000]: 001 --> {011, 101}, i.e. node 001 forwards the fragment to nodes 011 and 101.
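A sketch of one forwarding rule that reproduces the example for node 001, assuming the standard binomial-tree hypercube broadcast: a node forwards only along dimensions above the highest bit in which its nodeID differs from the fragmentID. The slides do not state the rule in this form, so the rule and the names `forward_targets` and `reached_nodes` are assumptions of this sketch.

```python
def forward_targets(node: int, fragment: int, d: int) -> list[int]:
    """Nodes to which `node` forwards `fragment` in a d-dimensional hypercube.

    rel is the node's address relative to the fragment's home node.
    Dimensions up to the highest set bit of rel are treated as already
    covered, so the node forwards along the remaining higher dimensions.
    """
    rel = node ^ fragment
    first_dim = rel.bit_length()
    return [node ^ (1 << bit) for bit in range(first_dim, d)]


def reached_nodes(fragment: int, d: int) -> set[int]:
    """All nodes reached when forwarding starts at the fragment's home node."""
    seen = {fragment}
    frontier = [fragment]
    while frontier:
        node = frontier.pop()
        for target in forward_targets(node, fragment, d):
            seen.add(target)
            frontier.append(target)
    return seen


if __name__ == "__main__":
    targets = forward_targets(0b001, 0b000, 3)
    print([format(t, "03b") for t in targets])
```

Under this rule, a node whose most significant bit differs from the fragmentID has no targets left and acts as a leaf of the broadcast tree.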
Process [Figure: snapshots of the hypercube at Timestamps 1 through 6, showing fragments such as [000], [111], [011], [100], [001] and [110] spreading through the overlay]
Time Complexity Analysis Discard half of the fragments: if the last bit of the nodeID differs from that of the fragmentID, the node doesn't need to forward the fragment to any other node, because it is the last node to receive the fragment. Hence we only consider $d-1$ bits among the $d$ bits. The minimum number of timestamps required is $N$, since the seed forwards at least $N$ fragments. => $N + \dots$
Time Complexity Analysis Consider the last node, with nodeID = $1_{d-1}\,1_d$. When fragmentID = nodeID, it distributes the fragment in $d$ timestamps. When the $(d-2)$th bit is different, it distributes the fragment in $1 \times 2^{d-2}$ timestamps. When the $(d-3)$th bit is different, it distributes the fragment in $2 \times 2^{d-3}$ timestamps, which is $2 \times (2^{d-2}/2)$. => $N + d + 1 \times 2^{d-2} + 2 \times (2^{d-2}/2)$
Time Complexity Analysis => $N + d + 1 \times 2^{d-2} + 2 \times (2^{d-2}/2) + \dots + (d-1) \times (2^{d-2}/2^{d-2})$ => $N + N - 1$ => $2N - 1$ timestamps in the worst case => $2 - (1/N)$ units of actual time taken to distribute the file.
Big Picture Sequential: $N$. Broadcasting: $\log N$. Fragmentation and Broadcasting: $O(1)$.
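The closed form can be checked numerically. The sketch below evaluates the sum $N + d + \sum_{k=1}^{d-1} k \cdot 2^{d-1-k}$ with $N = 2^d$ and compares it with $2N - 1$; the function names are assumptions made for this check and are not part of the slides.

```python
from fractions import Fraction


def worst_case_timestamps(d: int) -> int:
    """Evaluate N + d + sum_{k=1}^{d-1} k * 2^(d-1-k), with N = 2^d."""
    n = 2 ** d
    tail = sum(k * 2 ** (d - 1 - k) for k in range(1, d))
    return n + d + tail


def distribution_time(d: int) -> Fraction:
    """Worst-case time in whole-file transfer units (one timestamp = 1/N)."""
    return Fraction(worst_case_timestamps(d), 2 ** d)


if __name__ == "__main__":
    for d in range(1, 6):
        n = 2 ** d
        print(f"N={n}: {worst_case_timestamps(d)} timestamps, "
              f"time {distribution_time(d)} (2N-1 = {2 * n - 1})")
```

For the 8-node example (d = 3) the sum is 8 + 3 + 2 + 2 = 15 timestamps, i.e. 15/8 of one whole-file transfer time, which stays below 2 for every N.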
Experimental Evaluation Experimental Time Complexity = 1 + O((log 2 N)/N)
Further Extensions Optimality of the Seeding algorithm. Include the header cost and find the optimal N. Thank you