A Statistical Physics Approach for Modeling P2P Systems

Presentation transcript:

A Statistical Physics Approach for Modeling P2P Systems
Giovanna Carofiglio 1, R. Gaeta 2, M. Garetto 1, P. Giaccone 1, E. Leonardi 1, M. Sereno 2
1 Politecnico di Torino, 2 Università di Torino, Italy
MAMA Workshop, joint with ACM SIGMETRICS 2005, Banff, June 6-10

MAMA Workshop, Sigmetrics '05
Outline
• Motivation
• Basic Model
• Extended Model
• Content Search
• Download effects

P2P System Architecture
[Figure labels: peers, clients, server]
• A possible definition: decentralized, self-organizing distributed systems in which all or most communication is symmetric.

Peer-to-Peer traffic
• P2P is the single largest generator of traffic
• P2P traffic significantly outweighs web traffic
• P2P traffic is continuing to grow

P2P Applications
• Communication
  - Voice over IP: Skype
  - Instant Messaging
• Distributed Computation
  - UnitedDevices, Distributed Science
• File Sharing
  - BitTorrent, KaZaA, Gnutella, eDonkey, Napster, etc.
• DHTs
  - Chord, CAN, Pastry, Tapestry
• Wireless Ad hoc Networking

Motivation
• Most of the Internet traffic is generated by p2p applications.
• Performance studies of p2p systems may be useful to drive the design of future applications.
• Analytical models help analyze large and complex p2p networks.

Modeling techniques
• Traditional Markov models: a detailed microscopic description is provided, but with a huge state space. Analyzing large systems such as p2p systems (with millions of users and shared contents) is computationally expensive.
• Fluid models: network dynamics are described at a higher level of abstraction, neglecting stochastic information. Scalability: the model is based on a set of differential equations that is invariant with respect to the size of the network (number of users, link capacity).

Model description
• We model a generic p2p system without focusing on a particular implementation.
• Based on a fluid approach as in [1] and [2], our model evolves into a second-order diffusion approximation in which the stochasticity of the network dynamics plays a relevant role.
• The model provides a description of users/contents dynamics both in transient and in steady state.
[1] F. Clevenot, P. Nain, “A Simple Model for the Analysis of SQUIRREL”, Infocom 2004, Hong Kong, Mar.
[2] D. Qiu, R. Srikant, “Modeling and Performance Analysis of BitTorrent like Peer-to-Peer Networks”, Sigcomm 2004, U.S.A.

Model structure
[Diagram blocks: users dynamics, contents dynamics, search phase, download phase]

Users dynamics (1)
• The number of users joining the p2p network changes dynamically according to:
  - Enter-leave dynamics: λ_u = new users' arrival rate; 1/μ_u = average subscription time
  - Active-sleeping mode: 1/μ_as = average active time; 1/μ_sa = average sleeping time
• Users in sleeping mode do not interact at all with the other users of the community.

Users dynamics (2)
The evolution of the number of users in active or sleeping mode, U_a and U_s respectively, can be described by two fluid differential equations, whose terms account for:
• new users
• sleeping users who become active
• active users who become sleeping
• active users who leave the system
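The equations themselves are not preserved in this transcript. A minimal sketch, assuming the linear fluid form suggested by the term labels and by the parameters of the previous slide, namely dU_a/dt = λ_u - (μ_u + μ_as) U_a + μ_sa U_s and dU_s/dt = μ_as U_a - μ_sa U_s. The function names, the forward-Euler integrator and the step size are illustrative choices, not taken from the slides.

```python
def simulate_users(lam_u, mu_u, mu_as, mu_sa, ua0, us0, t_end, dt=1.0):
    """Integrate the assumed fluid equations with forward Euler.

    dUa/dt = lam_u - (mu_u + mu_as) * Ua + mu_sa * Us
    dUs/dt = mu_as * Ua - mu_sa * Us
    """
    ua, us = float(ua0), float(us0)
    for _ in range(int(t_end / dt)):
        dua = lam_u - (mu_u + mu_as) * ua + mu_sa * us
        dus = mu_as * ua - mu_sa * us
        ua += dt * dua
        us += dt * dus
    return ua, us


def steady_state(lam_u, mu_u, mu_as, mu_sa):
    """Fixed point of the assumed equations (both derivatives set to zero)."""
    ua = lam_u / mu_u            # arrivals balance departures of active users
    us = mu_as * ua / mu_sa      # active->sleeping flow balances sleeping->active flow
    return ua, us
```

Under these assumptions, the parameters of the content-disappearance case (λ_u = 0.1 users/s, 1/μ_u = 4000 s, 1/μ_as = 1/μ_sa = 400 s) give a fixed point of about 400 active and 400 sleeping users.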

Content Dynamics
The evolution of the number of available copies of a content is driven by two phenomena:
• the generation of new copies (downloads or off-on transitions)
• the cancellation of existing copies
Parameters: θ = average request rate; 1/μ_h, 1/μ'_h = average content holding time for active/sleeping users.
Note: p_s = p_s(μ'_h) is the probability that sleeping users have the considered content when they become active.

Brownian Motion
• Content dynamics are modelled through a second-order diffusion approximation.
• Each content is a particle with instantaneous position x(t) moving according to a Brownian motion.
• The position x(t) obeys a Langevin equation; the evolution of its pdf f(x,t) follows the associated Fokker-Planck equation.
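The slide's own formulas are not preserved in this transcript. For reference, the textbook forms of the two equations are sketched below, assuming a drift m(x,t) and a variance σ²(x,t) (the diffusion parameters named later in the talk), a unit white Gaussian noise ξ(t) and the Itô interpretation. The authors' exact notation may differ.

```latex
% Langevin equation for the position x(t) of a content-particle
\frac{dx(t)}{dt} = m(x,t) + \sigma(x,t)\,\xi(t)

% Fokker-Planck equation for the pdf f(x,t) of the position
\frac{\partial f(x,t)}{\partial t}
  = -\frac{\partial}{\partial x}\bigl[m(x,t)\,f(x,t)\bigr]
    + \frac{1}{2}\,\frac{\partial^{2}}{\partial x^{2}}\bigl[\sigma^{2}(x,t)\,f(x,t)\bigr]
```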

Content diffusion equation
• A content can disappear when no more copies of it are available; the model gives the rate at which a content disappears.
• The pdf F(x,t) of the number of copies follows the Fokker-Planck equation, with a term for the introduction of new contents in the system and with boundary conditions.
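One way to get a feel for content disappearance is to simulate the diffusion directly. The sketch below is not the authors' method: it is a Monte Carlo estimate by Euler-Maruyama, treating x = 0 as an absorbing barrier. The callables `drift` and `sigma` are hypothetical stand-ins for m(x,t) and the square root of σ²(x,t) (taken as time-independent here), and all default values are arbitrary.

```python
import math
import random


def disappearance_probability(x0, drift, sigma, t_end, dt=0.1, runs=2000, seed=0):
    """Estimate P(content disappears before t_end) by Euler-Maruyama.

    drift(x) and sigma(x) are hypothetical stand-ins for the model's drift
    and standard deviation; x = 0 is treated as an absorbing barrier.
    """
    rng = random.Random(seed)
    steps = int(t_end / dt)
    absorbed = 0
    for _ in range(runs):
        x = float(x0)
        for _ in range(steps):
            # One Euler-Maruyama step: deterministic drift plus Gaussian noise
            x += drift(x) * dt + sigma(x) * math.sqrt(dt) * rng.gauss(0.0, 1.0)
            if x <= 0.0:  # no copies left: the content has disappeared
                absorbed += 1
                break
    return absorbed / runs
```

Because the barrier is checked only at discrete steps, the estimate slightly undercounts crossings; a smaller `dt` reduces this bias.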

Diffusion parameters
• h_h = coefficient of variation of the holding time
• h_r = coefficient of variation of the inter-request time
• m(x,t) expresses the average speed at which the content-particle moves along the x axis.
• The variance σ²(x,t) expresses the burstiness of the processes.

Case: content disappearance (1)
• In a single-content scenario we study the probability that the content disappears as a function of the users' dynamics.
Network parameters:
• λ_u = users' arrival rate = 0.1 users/s
• 1/μ_u = avg subscription time = 4000 s
• 1/μ_as = avg active period = 400 s
• 1/μ_sa = avg sleeping period = 400 s
• θ = average request rate
• 1/μ_h, 1/μ'_h = avg content holding time for active/sleeping users = 100 s
Initial condition:
• Active users = 10
• Sleeping users = 10
• Available copies = 1

Case: content disappearance (2)
Which plot do we show? Model compared with Michele's simulator? Model only?

Dual distribution
• Relations between users' and contents' dynamics:
  - the number of active and sleeping users at time t
  - the number of copies available at time t

Dual equations
• G_a(x,t) and G_s(x,t) are the pdfs of the number of active and sleeping users having x contents. The terms of their evolution equations account for:
  - new users
  - active users who become sleeping or leave the system
  - sleeping users who become active

Diffusion parameters
• As for the content diffusion equation, m(x,t) expresses the average speed at which the copy-particle moves along the x axis, while σ²(x,t) expresses the variance of the associated process.
• r_a = rate of generation of new copies
• d_a/s = rate of cancellation of existing copies (active/sleeping users)

Multi-contents case (1)
• In a multi-content scenario, still assuming ideal search and download, we study the steady-state distribution of the contents among users.
Network parameters:
• λ_u = users' arrival rate = 0 users/s
• 1/μ_u = avg subscription time = ∞
• 1/μ_as = avg active period = 6 h
• 1/μ_sa = avg sleeping period = 18 h
• θ = average request rate = 2 contents/h
• λ_c = contents' introduction rate = 1/600 contents/s
• 1/μ_h, 1/μ'_h = avg content holding time for active/sleeping users = 10 h, 8 h
Initial condition:
• Active users = 2500
• Sleeping users = 7500
• Available copies = 1

Multi-contents case (2)
Which plots do we show? Model compared with Michele's simulator? Model only?

The contents' transfer rate
• In a non-ideal p2p system the transfer rate of the contents changes dynamically according to:
  - p_hit(x,t), the probability of a successful search (related to content diffusion and to the search algorithm)
  - p_down(x,t), the probability of a successful download (related to network congestion, user impatience and on-off dynamics)
• The effective retrieval rate is obtained from the request rate θ through p_hit(x,t) and p_down(x,t).
• Both the search and the download models need F(x,t) as input and, in turn, affect how it evolves over time.

Search Phase
• Search algorithm: flooding in an unstructured p2p network. For each content request, a query message is forwarded to all the neighbors up to distance max_ttl.
• Graph model: the P2P network topology is modeled as a finite random graph. We consider a Generalized Random Graph (GRG) to allow an arbitrary vertex degree distribution.
[Figure legend: active peer; application-level connection]
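The flooding rule can be illustrated with a breadth-first search bounded by max_ttl. This is only a sketch under simplifying assumptions: `random_graph` builds an Erdős-Rényi-style graph as a simple stand-in for a GRG, and both function names and the `holders` set (peers storing the content) are introduced here for illustration.

```python
import random
from collections import deque


def random_graph(n, avg_degree, seed=0):
    """Erdos-Renyi-style graph as adjacency lists (a simple stand-in for a GRG)."""
    rng = random.Random(seed)
    p = avg_degree / (n - 1)
    adj = {v: [] for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].append(v)
                adj[v].append(u)
    return adj


def flood_search(adj, source, holders, max_ttl):
    """Flood a query from `source`; True if a peer within max_ttl hops holds the content."""
    visited = {source}
    frontier = deque([(source, 0)])
    while frontier:
        node, dist = frontier.popleft()
        if dist == max_ttl:
            continue  # TTL exhausted: the query is not forwarded further
        for nb in adj[node]:
            if nb in visited:
                continue  # each peer processes the query only once
            visited.add(nb)
            if nb in holders:
                return True
            frontier.append((nb, dist + 1))
    return False
```

Averaging `flood_search` over many random sources gives an empirical hit probability that can be compared with an analytical p_hit.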

GRG Model
• Given the probability distribution {p_k} that a vertex has k edges departing from it, we can define its generating function.
• It can be shown that the generating function of the number of first neighbors with a copy of the content depends on α = x/U_a, where x = number of copies and U_a = number of active users.
• The composition of these generating functions gives the generating function of the number of neighbors at distance h.
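The formulas on this slide are not preserved in the transcript. A hedged reconstruction follows, assuming the standard generating-function formalism for random graphs with a given degree distribution, and assuming that each neighbor holds a copy independently with probability α. The symbols G_0, G_1 and H_1 are names introduced here, not the authors' notation.

```latex
% Generating function of the vertex degree distribution {p_k}
G_0(z) = \sum_{k \ge 0} p_k\, z^{k}

% Generating function of the number of further edges of a vertex
% reached by following a randomly chosen edge
G_1(z) = \frac{G_0'(z)}{G_0'(1)}

% Number of first neighbors holding a copy, if each neighbor holds one
% independently with probability \alpha = x / U_a (binomial thinning)
H_1(z) = G_0\bigl(1 - \alpha + \alpha z\bigr)
```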

GRG Topology
• To compute the pdf of the GRG node degree we adopt an M/M/∞ queue: assuming that an external observer joins the network, the number of customers in the queue corresponds to the number of connections established by the observer.
• We can then define the generating function for the number of neighbors at distance up to max_ttl that have a copy of the content; the hit probability follows from it.
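Since the stationary number of customers in an M/M/∞ queue is Poisson distributed, the degree distribution in this construction is Poisson. The sketch below computes a hit probability for that case under a tree-like (no short cycles) approximation, with each peer holding a copy independently with probability α. It illustrates the generating-function argument and is not necessarily the authors' exact expression; `mean_degree` is a hypothetical parameter name.

```python
import math


def hit_probability(alpha, mean_degree, max_ttl):
    """P(at least one peer within max_ttl hops holds the content).

    Assumes a tree-like graph with Poisson degrees, for which the degree and
    excess-degree generating functions coincide: G(z) = exp(c * (z - 1)).
    """
    if max_ttl < 1:
        return 0.0

    def g(z):
        return math.exp(mean_degree * (z - 1.0))

    # q = P(a neighbor, and everything still reachable through it, holds no copy)
    q = 1.0 - alpha  # neighbor at the last hop: only the neighbor itself counts
    for _ in range(max_ttl - 1):
        q = (1.0 - alpha) * g(q)  # the neighbor misses and so do all its subtrees
    return 1.0 - g(q)
```

For max_ttl = 1 this reduces to 1 - exp(-c·α), the probability that a Poisson number of neighbors with mean c·α is nonzero.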

Download Phase
• Assumptions:
  - The transport network is ideal.
  - Infinite bandwidth on the client side.
  - The peer from which the desired content is downloaded is chosen at random among those storing that content.
• The dynamics of the download at each peer are modelled by an M/G/1-PS queue.
• Problem: the download request rate arriving at the peers is not known a priori! It depends on:
  - the contents' distribution at peers
  - the policy used by the system to distribute the load among peers

Probability of successful download (1): single-content case
• Let θ be the popularity of a content present in x copies in a network with U_a active peers; these quantities determine the download request rate.
• Assuming that the requests form a Poisson process, the queue at each peer becomes an M/G/1-PS queue, whose average delay is known in closed form.
• Given a download rate y = θ_s p_hit, the probability of successful download follows.
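The slide's formula for the average delay is not preserved, but the mean sojourn time of an M/G/1 processor-sharing queue is a standard result: E[T] = E[S] / (1 - ρ) with ρ = λ·E[S]. A small sketch, with parameter names chosen here for illustration:

```python
def mg1_ps_mean_delay(arrival_rate, mean_service_time):
    """Mean sojourn time of an M/G/1 processor-sharing queue.

    E[T] = E[S] / (1 - rho), with rho = arrival_rate * E[S]; the result
    depends on the service-time distribution only through its mean.
    """
    rho = arrival_rate * mean_service_time
    if rho >= 1.0:
        raise ValueError("unstable queue: utilization rho must be below 1")
    return mean_service_time / (1.0 - rho)
```

For instance, if a content takes 100 s to upload in isolation and requests arrive at 0.005 per second (both numbers hypothetical), the utilization is 0.5 and the mean download time doubles to 200 s.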

Probability of successful download (2): multiple-content case
• From F(x), the pdf of the number of copies available for a content, we derive the probability that a peer has k contents present in x copies.
• From this we obtain the overall download request rate seen by a peer.
• The overall probability of successful download follows.

Probability of successful download (3)
• Since all Z(x) are independent, we can approximate the distribution of Y around its average with a normal distribution.
• The probability of successful download then becomes an integral over this normal distribution.
Notes:
• m_y and σ_y are the mean and standard deviation of Y.
• The integral is restricted to a finite interval for numerical reasons.
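The integral itself is not preserved in the transcript. As a sketch of the numerical step only, the function below averages a per-rate success probability `g(y)` against a normal density with mean m_y and standard deviation σ_y over a finite interval, using the trapezoidal rule. `g`, the interval bounds and the number of sub-intervals are all hypothetical inputs; the authors' integrand and interval are not known.

```python
import math


def expected_under_normal(g, mean, std, lo, hi, n=2000):
    """Trapezoidal estimate of the integral of g(y) * N(y; mean, std^2) over [lo, hi]."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        y = lo + i * h
        pdf = math.exp(-0.5 * ((y - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))
        weight = 0.5 if i in (0, n) else 1.0  # end points count half
        total += weight * g(y) * pdf
    return total * h
```

Restricting the interval to a few standard deviations around the mean loses very little probability mass: with bounds at ±4σ and g ≡ 1, the result is erf(4/√2), about 0.99994.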

Conclusions
• We defined a stochastic fluid model of a p2p system able to describe users and contents dynamics in both the transient and the stationary regime.
• A support model makes it possible to account for the effects of search and download on system performance.
Work in progress:
• Analytical solution of the equations in steady state
• Model extension to classes of different users
• Model extension to classes of different contents
• Comparison between model and simulations in realistic scenarios

Thank you!