Presentation is loading. Please wait.

Presentation is loading. Please wait.

1 Efficient Massive Sharing of Content among Peers by Peter Triantafillou, Chryssani Xiruhaki and Manolis Koubarakis Dept. of Electronics and Computer.

Similar presentations


Presentation on theme: "1 Efficient Massive Sharing of Content among Peers by Peter Triantafillou, Chryssani Xiruhaki and Manolis Koubarakis Dept. of Electronics and Computer."— Presentation transcript:

1 1 Efficient Massive Sharing of Content among Peers by Peter Triantafillou, Chryssani Xiruhaki and Manolis Koubarakis Dept. of Electronics and Computer Engineering Technical University of Crete Chania, 73100 Greece

2 2 Outline Background Aim System Architecture Experimental Result

3 3 Background Internet content sharing system has become very popular recently(e.g. Napster) Content sharing systems could consists of very large number of nodes, offering resources and content(documents).

4 4 Background The central control of information at special nodes is undesirable – Central points of failure – Peformance bottleneck Need to create a P2P system Architecture in which nodes can collaborate with each other.

5 5 Aim Ensuring high performance in large-scale P2P content sharing systems. This paper focus on two particular sub-goals: – Ensuring load balancing across all nodes of such a system. – Ensuring short response times to user requests.

6 6 System Architecture Divide documents into a number of semantic categories. It uniquely maps each document to a semantic category Category 1Category 2 Document 1 Document 2 Document 3 Document 4 Document 5Document 6 ……..

7 7 System Architecture The nodes will be divided into a number of clusters. Node 2Node 1 Node 7Node 6 Node 5 Node 4 Node 3 Cluster 1Cluster 2

8 8 System Architecture Each semantic category can only be found in one and only one clusters Cluster 1 Cluster 2 Category 1 Category 3 Category 2 Category 4

9 9 System Architecture Each node store the cluster metadata describing which category of documents are stored by which cluster nodes Global load balance if – Inter-cluster load balance – Intra-cluster load balance

10 10 User Request Processing Request can be either to publish(contribute) or to retrieve(download) documents. – Identified the semantic category of the document. – The request will be forwarded to the proper clusters – the request will be forwarded to one randomly chosen node of the cluster – If the node does/could not store the requested documents, it will forward the request to some other nodes of the same cluster.

11 11 User Request Processing With this approach, the paper claims that – The load is balanced within a cluster – The response time is low and it is bounded by the number of nodes in the cluster.

12 12 Inter-cluster Load Balance Assumption: – Each node contribute documents belonging to a single semantic category – The content has known popularity, with document popularities following the Zipf distribution – Each node have the same storage and processing power The system contain N nodes

13 13 Inter-cluster Load Balance Question: Is there a partition of N nodes into k(given) clusters N 1, N 2,…, N k such that the following two constraint are satisfied? – If two documents belong to the same semantic category, then the nodes that contributed/store these documents belongs to the same cluster – Clusters have equal normalized popularities Formally, p(Ni)/| Ni | = p(Nj)/| Nj |,for all 1<=i,j <=k

14 14 Inter-cluster Load Balance Cluster 3 Cluster 1 Cluster 2

15 15 Inter-cluster Load Balance Cluster 1Cluster 2 Cluster 3

16 16 Inter-cluster Load Balance The problem is NP-complete Consider partitioning the given set of nodes N into clusters with nearly equal normalized popularities Minimizing the following quantity:

17 17 Inter-cluster Load Balance The paper proposed a greedy algorithm, called MinDiff – Initially all cluster are empty – The MinDiff considers each semantic category in turn – It assign the category to the cluster of nodes which minimize the quantity of the above equation

18 18 Inter-cluster Load Balance MinDiff is incomplete(it might miss the optimal solution) MinDiff runs in polynomial time and achieves very good results.

19 19 Experimental Results Documents: 19800 Semantic categories: 300 Clusters: 50 Popularity distribution of documents are Zipf-like with parameter O = 0.3 and O = 0.7

20 20 Experimental Results O=0.3 (more skewed)

21 21 Experimental Results O=0.6 (less skewed)


Download ppt "1 Efficient Massive Sharing of Content among Peers by Peter Triantafillou, Chryssani Xiruhaki and Manolis Koubarakis Dept. of Electronics and Computer."

Similar presentations


Ads by Google