Published by Christal Ferguson. Modified over 9 years ago.
2
Immune System and Search Technology
Designing a Fast Search Algorithm for P2P Network using concepts from Immune Systems
Niloy Ganguly (Zentrum für Hochleistungsrechnen (ZHR) – TU Dresden)
Project funded by the Future and Emerging Technologies arm of the IST Programme
3
Overview of the Presentation
● P2P Network – Paradigm for Decentralised Computing
● Immune System Features
● Experimental Setup
● Simulation Results
4
Peer To Peer Network
● Most Direct Method of Connecting Computers
– Simple
– Inexpensive
– No Boss
– No Regulation
5
Peer To Peer Network
● PCs at the edge of the network are called “Peers”
● Peers can retrieve objects directly from each other
Advantages of a P2P Network
● A large collection of peers may be available for content distribution, sometimes millions
● Users take advantage of the network’s currently available resources
6
Peer To Peer Network
● Problem of Hugeness – Emergence of Protocol
● Network Structure
● Degree of Centralization

                        Unstructured Network         Loosely Structured Network   Structured Network
Hybrid Decentralized    Napster
Pure Decentralized      Gnutella                     Freenet                      CAN, CHORD
Partially Centralized   FastTrack, Kazaa, Morpheus
7
P2P: Hybrid Decentralized (Napster)
● When a peer connects, it informs the central server of its:
– IP address
– content
● Alice queries the centralized directory server for “Das Wunder von Bern”
● Alice requests the file from Bob
● While file transfer is decentralized, locating content is highly centralized
[Figure: centralized directory server and peers, including Alice and Bob, with numbered message steps]
8
P2P: Hybrid Decentralized (Napster)
● Fast
● Single point of failure – Application crash
● Performance bottleneck
● Huge database to maintain
● Copyright infringement – Legal proceedings may result in the company having to shut down the directory server
9
P2P: Intermediate Arrangement (Kazaa)
Feature
● Has a centralized server that maintains user registrations, logs users into the system to keep statistics, and provides downloads of client software
● Two client types are supported:
– Supernodes (fast CPUs + high-bandwidth connections)
– Nodes (slower CPUs and/or connections)
● Supernode addresses are provided in the initial download. Supernodes also maintain searchable indexes and proxy search requests for users
10
P2P: Pure Decentralized (Gnutella)
Basic Feature
● No hierarchy; peers have similar responsibilities: no group leader
● No peer maintains directory info
● Highly decentralized
Joining Algorithm
● Use a bootstrap node to learn about others
● Join message
11
P2P: Pure Decentralized (Gnutella)
Message Query:
● Send the query to neighbors
● If a queried peer has the object, it sends a message back to the querying peer
● The queried peer also forwards the query to its immediate neighbors
● The results are carried back to the user
● Message flooding occurs
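The query steps above can be sketched as a breadth-first flood. This is a minimal illustration, not the Gnutella wire protocol: the hop limit `ttl`, the adjacency-dict `graph`, and the `has_object` predicate are assumptions introduced here, and the message counter simply counts every forwarded copy.

```python
from collections import deque

def flood_query(graph, has_object, start, ttl):
    """Flood a query from `start`; return (peers that hold the object, messages sent).

    graph      -- dict mapping a peer to the list of its neighbours (assumed layout)
    has_object -- hypothetical predicate: does this peer hold the searched object?
    ttl        -- assumed hop limit after which peers stop forwarding
    """
    hits = []
    messages = 0
    visited = {start}
    frontier = deque([(start, 0)])
    while frontier:
        peer, hops = frontier.popleft()
        if peer != start and has_object(peer):
            hits.append(peer)          # the peer would answer the querying peer
        if hops == ttl:
            continue                   # hop limit reached: do not forward further
        for neighbour in graph[peer]:
            messages += 1              # every forwarded copy costs one message
            if neighbour not in visited:   # peers drop queries they already saw
                visited.add(neighbour)
                frontier.append((neighbour, hops + 1))
    return hits, messages

# Toy overlay: only peer "E" holds the object.
graph = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}
hits, messages = flood_query(graph, lambda p: p == "E", "A", ttl=3)
print(hits, messages)
```

Even on this five-peer overlay the number of messages exceeds the number of peers, which is the traffic cost the slides attribute to flooding.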
12
P2P: Pure Decentralized (Gnutella)
Pros:
● Totally decentralized query
● Robust: a query doesn't stop when one of the nodes breaks down
● Fresh results: no outdated index
Cons:
● Query radius: the query radius can be long
● Excessive query traffic: 25% of the total traffic is query traffic
● Total traffic in the Gnutella network is 1.7 Gbps, 1.7% of total traffic in the US Internet backbone
13
P2P: Pure Decentralized (Gnutella)
Challenges Ahead:
● Reduce query time
● Stop flooding; use an intelligent search method to avoid network congestion
Relation Between Data and Topology
● Structured and Loosely Structured Topology
14
P2P: Structured Decentralized Network
Distributed Hash Table:
● Data or metadata is carefully placed across nodes in a deterministic fashion
● Every file and every node (IP) generates a unique hash address, which helps in the placement of data
● Each node has to keep information about only a limited number of neighbors
● Search is very fast, typically of the order log(n)
● Extremely scalable
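The deterministic placement described above can be illustrated with a small sketch. It assumes a Chord-style successor rule on a ring of 2^16 identifiers and SHA-1 as the hash; the identifier size, the IP addresses, and the file name are illustrative choices, not details from the slides. The log(n) lookup of real systems relies on per-node routing tables, which this sketch does not model.

```python
import hashlib
from bisect import bisect_left

M = 16  # assumed identifier size in bits (illustrative)

def ident(name):
    """Hash a file name or node IP onto the identifier ring [0, 2**M)."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** M)

def successor(key_id, node_ids):
    """Return the first node id >= key_id, wrapping around the ring.

    node_ids must be sorted in ascending order.
    """
    index = bisect_left(node_ids, key_id)
    return node_ids[index % len(node_ids)]

# Hypothetical peers and file, hashed into the same identifier space.
node_ids = sorted(ident(ip) for ip in ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"])
owner = successor(ident("some-file.mp3"), node_ids)
print(owner)
```

Because both files and nodes are hashed with the same function, any peer can compute which node is responsible for a key without asking a central directory.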
15
P2P: Structured Decentralized Network
Disadvantages
● Locality is destroyed.
– Data items (i.e. files) from a single site are not usually co-located, meaning that opportunities for enhanced browsing, pre-fetching and efficient searching are lost.
● Useful application-level information is lost.
– The data used by many applications is naturally described using hierarchies, which expose relationships between items near to each other. The virtualization of the file namespace by generating keys discards this information.
● P2P networks are extremely transient
● It is difficult to support keyword search and other non-exact-match queries
16
P2P: Loosely Structured Network
● Freenet is in between the two.
● File locations are affected by routing hints, but they are not completely specified, so not all searches succeed.
● It essentially pools unused disk space in peer computers to create a collaborative virtual file system.
● Files are replicated when they are searched.
17
Search
Search Mechanism
● Topology
● Data Placement
● Message Routing
Search Criterion
● Expressiveness (Key-lookup, Keyword, Ranked Keyword)
● Efficiency (Bandwidth, Processing Power, Storage)
● Quality of Service (Number of Results, Response Time)
● Robustness (Stability in the presence of failures)
System Requirement
● Autonomy (Freedom to choose how much data to store and where to store it)
18
Artificial Immune System
● Relatively new branch of computer science
– Using the natural immune system as a metaphor for solving computational problems
– Not modelling the immune system
● Variety of applications so far …
– Fault diagnosis (Ishida)
– Computer security (Forrest, Kim)
– Novelty detection (Dasgupta)
– Robot behaviour (Lee)
– Machine learning (Hunt, Timmis, de Castro)
● AIS are computational systems, inspired by theoretical immunology and observed immune functions, which are applied to complex problem domains (Timmis, 2001)
19
Why the Immune System?
● Recognition
– Anomaly detection
– Noise tolerance
● Robustness
● Feature extraction
● Diversity
● Memory
● Distributed
● Adaptive
20
Role of the Immune System
● Protect our bodies from infection
● Primary immune response
– Launch a response to invading pathogens
● Secondary immune response
– Remember past encounters
– Faster response the second time around
[Figure: lymphatic vessels, lymph nodes, thymus, spleen, tonsils and adenoids, bone marrow, appendix and Peyer’s patches, grouped into primary and secondary lymphoid organs]
21
Role of the Immune System
[Figure: immune response pathway, steps (I) to (VII): antigen, APC, MHC protein, peptide, T-cell, activated T-cell, lymphokines, B-cell, activated B-cell (plasma cell)]
22
Role of the Immune System
[Figure: immune response pathway, steps (I) to (VII), with B-cell receptors binding the epitopes of an antigen]
Immune recognition is based on the complementarity between the binding region of the receptor and a portion of the antigen called the epitope.
23
Role of the Immune System
[Figure: immune response pathway, steps (I) to (VII), continued]
25
Role of the Immune System
● Autoimmune Reaction (Self/Non-Self Discrimination)
● Self presented at beginning
26
General Framework for AIS
● Application Domain: P2P Network Search
● Representation: Search Item - Antigen
● Affinity Measures: Similarity (message, search item)
● Immune Algorithms: ImmuneSearch Algorithm
● Solution
27
Reiterating the Perspective
AIS framework: P2P Network Search; Search Item - Antigen; Similarity (message, search item); ImmuneSearch Algorithm; Solution
Search Mechanism
● Topology
● Message Routing
Search Criterion
● Efficiency (Stop Packet Flooding)
● Quality of Service (Number of Results)
● Robustness (Stability in the presence of failures)
System Requirement
● Autonomy (Freedom from storing data)
28
Modeling the Network
● Information Profile – Pop
● Search Profile – Classical
● The profile is thought to be continuous
● It is represented by a 10-bit binary string; that is, it is assumed there are 1024 categories
● Profiles close to each other (pop, rap) are close in terms of Hamming distance
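As a small illustration of the profile encoding, the sketch below stores each 10-bit profile as an integer and computes the Hamming distance with an XOR. The specific bit patterns assigned to pop, rap and classical are hypothetical values chosen for the example; the slides do not give the actual encoding.

```python
PROFILE_BITS = 10                    # 10-bit profiles, as stated on the slide
NUM_CATEGORIES = 2 ** PROFILE_BITS   # 1024 categories

def hamming_distance(a, b):
    """Number of bit positions in which two profiles differ."""
    return bin(a ^ b).count("1")

# Hypothetical encodings: related genres differ in few bits.
pop = 0b1011001110
rap = 0b1011001100
classical = 0b0100110001

print(hamming_distance(pop, rap))        # close profiles
print(hamming_distance(pop, classical))  # distant profiles
```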
29
Modeling the Network
Zipf's Law (Information and Search Profile)
● Zipf's law is a power law used to calculate the probability of occurrence of a pattern r: $P_r \propto 1/i^{a}$, where r is the i-th most frequent keyword and a is a constant close to 1
● $N_r = K/i^{a}$, with $\sum_r N_r = N$
● Example: N = 16, K = 7.68, a = 1, allocated as whole counts: K/1 → 7, K/2 → 4, K/3 → 3, K/4 → 2
[Figure: 16-node example with token assignments 1 1 1 1 1 1 1 3 0 3 0 0 0 2 2 3]
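The constant K in the example follows from requiring the counts to sum to N. The sketch below recomputes it for N = 16, four tokens and a = 1; the function name and the choice of four tokens are taken from the example, not from a stated implementation. The slide reports whole-number counts (7, 4, 3, 2), but the rounding rule is not given, so the sketch returns the unrounded values only.

```python
def zipf_counts(n_nodes, n_tokens, a=1.0):
    """Return (K, [K / i**a for each rank i]) such that the counts sum to n_nodes."""
    harmonic = sum(1.0 / i ** a for i in range(1, n_tokens + 1))
    k = n_nodes / harmonic
    counts = [k / i ** a for i in range(1, n_tokens + 1)]
    return k, counts

k, counts = zipf_counts(n_nodes=16, n_tokens=4, a=1.0)
print(round(k, 2))                      # 7.68
print([round(c, 2) for c in counts])    # [7.68, 3.84, 2.56, 1.92]
```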
30
Search the Network – Flooding
Flooding essentially implies sending the message packet to all the neighboring nodes
31
Search the Network – Random Walk
A message packet travels at will, moving to a randomly chosen neighboring node at each step
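A random walk can be sketched in a few lines. This is an illustrative version with a single packet and a fixed step budget; the ring overlay, the `has_object` predicate and the seeded generator are assumptions made for the example.

```python
import random

def random_walk_query(graph, has_object, start, steps, rng):
    """Move one packet to a uniformly random neighbour for `steps` hops.

    Returns the distinct peers holding the object that the packet visited.
    """
    hits = []
    peer = start
    for _ in range(steps):
        peer = rng.choice(graph[peer])   # the packet "travels at will"
        if has_object(peer) and peer not in hits:
            hits.append(peer)
    return hits

# Toy ring overlay of six peers; peer 3 holds the object.
ring = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
found = random_walk_query(ring, lambda p: p == 3, start=0, steps=50,
                          rng=random.Random(42))
print(found)
```

Each step costs exactly one message, so the traffic is bounded by the step budget, at the price of possibly missing the object.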
32
Search the Network – Immune Search Algorithm
Consists of two parts
1. The movement of Message Packets
– Proliferation
– Mutation
2. Rearrangement of Topology
[Figure label: High Concentration of Packets]
34
Search the Network – Immune Search
Aim
● Cluster similar nodes (similar in Information and Search Profile)
Algorithm
● Move nodes similar to the user node closer to the user (change their neighborhood)
35
Search the Network – Immune Search
Movement depends on
1. The distance from the user node
2. Amount of matching
3. Age
37
Search the Network – Immune Search
[Figure label: No Movement]
38
Experimental Results
Experiment: run for 100 generations, without changing the participating nodes
● In each generation, 100 searches by randomly selected users
Efficiency: number of search items found in 50 time steps
Comparison: Random Walk, Flooding, Proliferation
39
Experimental Results
Fairness Criteria
● The search criterion is the same: HD(Search, query)
● The number of query packets is the same
– The initial number of packets in Random Walk is higher than in Proliferation
– Flooding is not continued for 50 time steps
40
Experimental Results
Fairness Criteria
● Proliferation 1 and ImmuneSearch have the same proliferation rate: proliferate if HD(Search, query) < 2
● Proliferation 2 has a higher proliferation rate: proliferate if HD(Search, query) < 3
● Proliferation 2 has almost the same number of packets as ImmuneSearch
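The two proliferation thresholds can be written as a simple predicate. This sketch assumes that HD is the Hamming distance between two 10-bit profiles stored as integers; which two strings are compared at each node, and how many copies a proliferating packet produces, are not specified on the slides and are not modelled here. The constant names are hypothetical.

```python
def hamming_distance(a, b):
    """Number of differing bits between two profiles stored as integers."""
    return bin(a ^ b).count("1")

# Thresholds taken from the fairness criteria above.
PROLIFERATION_1_THRESHOLD = 2   # also used by ImmuneSearch
PROLIFERATION_2_THRESHOLD = 3   # looser match, hence a higher proliferation rate

def should_proliferate(search_profile, query, threshold):
    """Proliferate only when the two bit strings are closer than the threshold."""
    return hamming_distance(search_profile, query) < threshold

query = 0b0000000000
two_bits_off = 0b0000000011
print(should_proliferate(two_bits_off, query, PROLIFERATION_1_THRESHOLD))  # False
print(should_proliferate(two_bits_off, query, PROLIFERATION_2_THRESHOLD))  # True
```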
41
Experimental Results (Cost)
Number of packets staying for 50 time steps
● Limited Flooding – 16
● ImmuneSearch – 2
● Proliferation 1 – 2
Proliferation is self-regulatory
[Figure: performance of ImmuneSearch, Proliferation 1, Proliferation 2, RandomWalk and Limited Flooding]
42
Clustering (Most Frequent Token)
● Clusters form very fast, within 24 generations
● Not one cluster but two or three clusters
● Information Profile and Search Profile intermingle, so clusters are not very tight
● This allows proliferation to flourish without much waste
● Less frequent tokens can also cluster
[Figure legend: Information Profile – Pop, Search Profile – Classical]
43
Clustering (Less Frequent Token)
[Figure: clustering of second, third and eleventh most frequent tokens]
44
Experimental Results
Experiment: change 5%, 10%, ..., 50% of the nodes at each generation
46
Experimental Results
Observations
● ImmuneSearch is better than simple proliferation up to 50% replacement
● 5% replacement is sometimes better than the scheme without replacement
47
Clustering (In Changing Condition)
[Figure: clustering of most frequent tokens with 5%, 10% and 20% replacement]
48
Experimental Results (Amount of Change in Neighborhood)
● Change of 20% of the nodes after 100 generations without replacement
● The neighborhood change rate drops after some time
● With 5% continuous replacement, the neighborhood keeps changing at a more or less constant rate
● The new nodes participate in this change
49
Discussion and Future Work
● The network works as a self-correcting/self-organizing system
● The proliferation–mutation combination is a good alternative to random walk and flooding
● Topology evolution helps in enhancing the performance of the network
● The design is robust
● Simulate it on other overlay topologies
50
Questions and Answers