Peer-to-Peer Based Multimedia Distribution Service Zhe Xiang, Qian Zhang, Wenwu Zhu, Zhensheng Zhang IEEE Transactions on Multimedia, Vol. 6, No. 2, April 2004 Presented by Ho Tsz Kin 14/04/2004
Agenda Introduction Architecture Topology-aware Overlay Replication Strategies Intergroup Replication Intragroup Replication Performance Evaluation Conclusion
Multimedia distribution services Centralized multimedia distribution Mirroring, proxy caching Bottleneck bandwidth problem Measurement between the University of Washington and a set of 13,656 servers Over 90% of the measured bottleneck bandwidths are below 10 Mbps Not scalable Content distribution network (CDN) Deploys a large number of servers at the edge of the network Objective is to efficiently redirect user requests to appropriate servers so that request latency is reduced and load among servers is balanced
Multimedia distribution services Capacity of an edge server is not large enough to support multimedia service Where and when to place edge servers is a difficult problem Peer-to-peer network Some rely on servers to disseminate information Single point of failure Overlay network in a P2P system is not aware of the underlying topology Availability depends on peers' reliability Cannot provide good QoS provisioning Propose a novel framework based on a P2P network
Architecture Determine how many replicas to create and where to place them Determine the grouping among peers Clients join the P2P network and contribute resources
Topology-aware Overlay Routing overhead is a key performance metric If the overlay is constructed randomly, neighbors in the overlay may actually be far apart in the underlying network Nearby peers in the underlying network are clustered into groups A group consists of a set of nodes that are close to each other Two nodes are close if the distance between them is less than some predefined value Distance can be network latency or round-trip time
Topology-aware Overlay Two different groups communicate with each other over the shortest-distance path between them Predefined distance threshold Chosen for a given transmission delay requirement
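The grouping rule above can be sketched as follows. This is a minimal illustration, not the paper's algorithm: the function name `group_peers`, the greedy "join the first group where every member is close" rule, and the sample round-trip times are all assumptions made for the example.

```python
def group_peers(peers, distance, threshold):
    """Greedily cluster peers: a peer joins the first group whose members
    are all closer than `threshold`; otherwise it starts a new group."""
    groups = []
    for p in peers:
        for g in groups:
            if all(distance(p, q) < threshold for q in g):
                g.append(p)
                break
        else:
            # No existing group is close enough, so open a new one.
            groups.append([p])
    return groups


# Hypothetical round-trip times in milliseconds between four peers.
RTT = {
    ("a", "b"): 5, ("a", "c"): 80, ("a", "d"): 90,
    ("b", "c"): 85, ("b", "d"): 95, ("c", "d"): 8,
}


def rtt(p, q):
    """Symmetric lookup into the RTT table; a peer is at distance 0 from itself."""
    if p == q:
        return 0
    return RTT[(p, q)] if (p, q) in RTT else RTT[(q, p)]


groups = group_peers(["a", "b", "c", "d"], rtt, threshold=20)
print(groups)
```

With a 20 ms threshold the two nearby pairs end up in separate groups; a larger threshold would trade fewer groups for higher intragroup latency.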
Content delivery When a request for certain content is issued If found within the same group Content can be distributed directly to the requesting peer Peer may decide to replicate it according to the replication strategies If not found, a flooding search is carried out A shortest communication path is set up between the two groups The content is first sent from the source to some host in the target group, and that host then sends it to the requester
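The two-step lookup above can be sketched as follows. The names `locate_content`, `holdings`, and `group_distance` are hypothetical, and the flooding search is replaced by a simple scan over all groups that keeps the nearest provider; the real protocol messages are not described on the slide.

```python
def locate_content(content, requester_group, holdings, group_distance):
    """Return (provider_group, needs_relay) for a request.

    holdings: dict mapping group id -> set of content ids stored in that group.
    group_distance: function (g1, g2) -> distance between two groups.
    """
    # Step 1: look inside the requester's own group.
    if content in holdings.get(requester_group, set()):
        return requester_group, False
    # Step 2: stand-in for the flooding search: scan the other groups
    # that hold the content and keep the nearest one.
    candidates = [g for g, items in holdings.items()
                  if content in items and g != requester_group]
    if not candidates:
        return None, False
    nearest = min(candidates, key=lambda g: group_distance(requester_group, g))
    # The content travels: source group -> a host in the requester's group
    # -> requester, so a relay hop is needed.
    return nearest, True


# Hypothetical groups placed on a line, distance = absolute difference.
position = {"g1": 0, "g2": 10, "g3": 3}
holdings = {"g1": {"clip1"}, "g2": {"clip2"}, "g3": {"clip2"}}


def dist(a, b):
    return abs(position[a] - position[b])


print(locate_content("clip1", "g1", holdings, dist))
print(locate_content("clip2", "g1", holdings, dist))
```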
Replication Strategies A global replication decision relies on complete information about the network, such as distances between groups or between peers and the storage capacity of each group and each peer Such global information is difficult to obtain in a distributed environment Divide the problem into two sub-problems Intergroup and intragroup replication
Intergroup Replication Provide low-latency, QoS-aware service at the group level Seed Group-level replica Number of seeds of a content = number of groups holding it Seed capacity is the total capacity of a group to store different seeds Minimize the average distance between the requesting group and the group providing the content Subject to the constraint of each group's seed capacity
Intergroup Replication Variation of the K-center problem NP-complete Ignore the seed capacity of each group and consider only the total seed capacity Idea of the heuristic L × L 2D Euclidean space Average distance Seeds of each content c_i should be uniformly distributed over the network, let number be
Intergroup Replication Average access distance Modified problem, where S is the total capacity and r_i is the popularity of content c_i Weighted average minimum distance Storage capacity constraints Applying the Lagrange function
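The slide's own equations did not survive, so the following is only a sketch of how the Lagrangian step could go. It assumes M contents of equal size, that content c_i has n_i seeds spread uniformly over an L × L area, and that the average access distance then scales as kL/√n_i for some constant k. The symbols M, n_i, k, and λ are introduced here for illustration.

```latex
\begin{aligned}
\min_{n_1,\dots,n_M}\ & \sum_{i=1}^{M} r_i \,\frac{kL}{\sqrt{n_i}}
\quad \text{subject to} \quad \sum_{i=1}^{M} n_i = S, \\
\mathcal{L}(n,\lambda) &= \sum_{i=1}^{M} r_i \,\frac{kL}{\sqrt{n_i}}
 + \lambda\Bigl(\sum_{i=1}^{M} n_i - S\Bigr), \\
\frac{\partial \mathcal{L}}{\partial n_i} &= -\frac{kL\,r_i}{2\,n_i^{3/2}} + \lambda = 0
\ \Longrightarrow\ n_i = \Bigl(\frac{kL\,r_i}{2\lambda}\Bigr)^{2/3}, \\
n_i &= S\,\frac{r_i^{2/3}}{\sum_{j=1}^{M} r_j^{2/3}} .
\end{aligned}
```

Under these assumptions the number of seeds grows with popularity, but more slowly than linearly (as r_i to the power 2/3), so unpopular contents still keep some seeds.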
Intergroup Replication Proposed heuristic If the distance between the requester and the peer that has a replica is larger than a threshold distance, then replicate Substitute back to find the average distance Estimated using local information
Intragroup Replication Improving the availability of the content Replicas are copies of the content within the group Replica replication matrix Availability of content c_i N peers Reliability of p_j
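The slide's availability formula is not preserved. A common way to express it, assuming peers fail independently, is one minus the probability that every peer holding a replica is down. The sketch below uses that assumption; the function name `availability` and the sample reliabilities are illustrative.

```python
from math import prod


def availability(holders, reliability):
    """Probability that at least one peer holding the content is up,
    assuming peers fail independently of each other."""
    return 1.0 - prod(1.0 - reliability[p] for p in holders)


# Hypothetical peer reliabilities within one group.
reliability = {"p1": 0.5, "p2": 0.8}

print(availability(["p1"], reliability))        # one replica
print(availability(["p1", "p2"], reliability))  # two replicas, roughly 0.9
```

Adding a replica on any peer with nonzero reliability raises the content's availability, which is what the intragroup strategy exploits.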
Intragroup Replication Optimization problem Variation of the knapsack problem NP-complete Size of content c_i Storage capacity of peer p_j
Intragroup Replication Proposed heuristic Hill-climbing based algorithm Adding a new replica of content c_r improves its availability Deleting a stored content c_i decreases that content's availability A(c_r): availability of content c_r A'(c_i): availability of content c_i if we delete this replica If A'(c_i) > A(c_r) Deleting c_i does not conflict with the objective
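One way to read the replacement rule above is sketched below. This is an interpretation, not the paper's code: the function name `try_replicate`, the victim ordering (highest A' first), and the "all or nothing" eviction are assumptions. The availabilities are passed in as plain dictionaries rather than computed.

```python
def try_replicate(new, size, stored, capacity, avail, avail_if_deleted):
    """Decide whether a peer should store `new`, evicting contents if needed.

    stored: dict content -> size currently held by this peer (modified in place).
    avail: dict content -> current availability A(c) in the group.
    avail_if_deleted: dict content -> availability A'(c) if this peer drops c.
    Returns the list of evicted contents, or None if `new` is not stored.
    """
    free = capacity - sum(stored.values())
    evicted = []
    # Consider victims that stay most available after deletion first.
    for c in sorted(stored, key=lambda c: avail_if_deleted[c], reverse=True):
        if free >= size:
            break
        # Evict only if c would still be more available than `new` is now.
        if avail_if_deleted[c] > avail[new]:
            evicted.append(c)
            free += stored[c]
    if free < size:
        return None  # not enough safe victims; leave storage untouched
    for c in evicted:
        del stored[c]
    stored[new] = size
    return evicted


# Hypothetical peer with 100 MB of storage holding two 40 MB clips.
stored = {"c1": 40, "c2": 40}
result = try_replicate("c3", 40, stored, 100,
                       avail={"c3": 0.3},
                       avail_if_deleted={"c1": 0.9, "c2": 0.2})
print(result, stored)
```

Here c1 is dropped because it stays well replicated elsewhere, while c2 is kept because deleting it would leave it less available than the new content already is.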
Performance Evaluation Network topology Euclidean space model Nodes are randomly located Edge length is fixed at 3000 ms 200 groups are generated Latency within a group is very small Packet loss model Loss is mainly due to congestion at routers Number of hops between two peers increases linearly with the distance between them Maximum hop count is ten Link bandwidth ranges from 800 Kbps to 1.4 Mbps, with an average of about 1.2 Mbps
Performance Evaluation Content distribution 10,000 MPEG-4 video clips encoded at 1.28 Mbps Length follows a normal distribution in the range of 3 min to 5 min, corresponding to file sizes of 37.8 MB to 48 MB Request distribution Zipf distribution Truncated Geometric Distribution (TGD) Truncated Pareto Distribution (TPD)
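As an illustration of the Zipf request model, the sketch below draws requests over 10,000 ranked clips. The skew value of 1.0, the fixed seed, and the function name `zipf_weights` are assumptions for the example; the slide does not state the parameters used in the paper.

```python
import random


def zipf_weights(n, skew):
    """Normalized Zipf popularity: the weight of rank k is proportional to 1/k**skew."""
    raw = [1.0 / (k ** skew) for k in range(1, n + 1)]
    total = sum(raw)
    return [w / total for w in raw]


weights = zipf_weights(10_000, skew=1.0)

rng = random.Random(0)  # fixed seed so the run is repeatable
# Draw five requests; each value is the popularity rank of the requested clip.
requests = rng.choices(range(1, 10_001), weights=weights, k=5)
print(requests)
```

A larger skew concentrates requests on the top-ranked clips, which is the regime where popularity-aware replication should help most.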
Performance Evaluation Peer storage capacity and reliability Storage contributed by a peer follows a normal distribution in the range of 300 MB to 2 GB, which supports approximately 8 to 50 video clips Peer reliability of sustaining service follows a normal distribution in the range of 0.1 to 0.9 Comparison Freenet Always makes a replica of each requested content LRU replacement policy Random replication system Contents are uniformly distributed into peers' storage
Performance Evaluation Performance metrics Average latency Average access distance between the requestor peer and the content provider peer Video quality Perceived video quality by the client PSNR Weighted availability Represents the service availability provided by contents in a certain area (within distance d) Defined as:
Performance Evaluation Average latency Varying number of content from 8000 to Varying skew factor with content
Performance Evaluation Video quality Varying peer storage Varying average packet loss ratio of network links with peer storage capacity as 960 MB
Performance Evaluation Availability Varying distance d
Conclusion Proposed and analyzed A topology-aware overlay Replication strategies Intergroup replication Intragroup replication Comments: Equal content sizes are assumed in intergroup replication, but different sizes in intragroup replication Topology-aware techniques can also be applied to clustering in SLVoD How to formulate and resolve striping strategies