1 RESOURCE DISCOVERY Presenter: Cù Nguyễn Phương Hà.



2 Overview: Introduction of problems; Several approaches; Solution model


4 Introduction The goal of the VN-Grid project is to connect the computational resources available on the network so that the resources of the participating sites can be used to solve large scientific problems. This requires knowing which resources every Grid site offers and finding the Grid sites whose resources are currently available => Resource Discovery services.

5 Functions of Resource Discovery The Resource Discovery service at each Grid site must be able to know, find, and provide resource information from the other sites. Its main function: when it receives a specific resource request from a client, it must return reliable information about the Grid sites in the network whose available resources satisfy the query.

6 Resources in VN-Grid There are three kinds of resources:
– Resources for executing jobs (computing resources): information about the resources used to execute a submitted job, for example computational power, data storage, and network bandwidth.
– Information about services: information about the services a user wants to learn about, for example Information Services and Resource Discovery Services.
– Information about applications: information about special applications deployed on the Grid, such as MPI and POP-C++.

7 Resources in VN-Grid Characteristics of resources in the VN-Grid environment:
– Resources are heterogeneous, not only across the network but also within each Grid site.
– Resources have a variety of properties with different data types.
– Existing resources vary continuously, especially computing resources such as CPUs, memory, disk, and network bandwidth.
– New resources are continually being published.

8 Forwarding in VN-Grid The proposed VN-Grid infrastructure is modeled on a peer-to-peer network in which clients, rather than servers, control the networking, so peers can exchange information directly.
– Interaction is limited to known peers.
– All peers are treated as equals.
– The number of peers participating in the Grid can grow enormously.

9 Summary A good resource discovery service must:
– Provide accurate, up-to-date, and sufficient information in a timely manner.
– Be flexible with respect to resource features such as variety, heterogeneity, and newly added resources.
– Scale as the number of peers in the Grid environment grows.
– Reduce the cost of transmitting information in the P2P environment.

10 Overview: Introduction of problems; Several approaches; Solution model

11 Several Approaches
Resource description:
– Global Grid Forum: JSDL
– EDAGrid: RDSResult
Matching method:
– EDAGrid: Ranking, Set-matching
Forwarding method:
– Napster: Centralized Indexing
– Gnutella: Flooding Query
– Chord: Indexing Using Distributed Hash Tables
– HyperCuP system: Interests-based
– Ant Colony Optimizing


13 JSDL JSDL (Job Submission Description Language) is used to describe the requirements of computational jobs for submission to resources, particularly in Grid environments.

14 JSDL [Figure: JSDL structure; only the occurrence markers ?, *, * survive]

Resources
Name of attribute | Type | Description
CandidateHosts | complex type | This element is a complex type specifying the set of named hosts which may be selected for running the job.
FileSystem | complex type | This element describes a filesystem that is required by the job.
ExclusiveExecution | xsd:boolean | This is a boolean that designates whether the job must have exclusive access to the resources allocated to it by the consuming system.
OperatingSystem | complex type | This is a complex type that defines the operating system required by the job.

Resources
Name of attribute | Type | Description
IndividualCPUSpeed | jsdl:RangeValue_Type | This element is a range value specifying the speed of each CPU required by the job in the execution environment.
IndividualCPUTime | jsdl:RangeValue_Type | This element is a range value specifying the total number of CPU seconds required on each resource to execute the job.
IndividualCPUCount | jsdl:RangeValue_Type | This element is a range value specifying the number of CPUs for each of the resources to be allocated to the job submission.
IndividualNetworkBandwidth | jsdl:RangeValue_Type | This element is a range value specifying the bandwidth requirements of each individual resource.
IndividualPhysicalMemory | jsdl:RangeValue_Type | This element is a range value specifying the amount of physical memory required on each individual resource.
IndividualVirtualMemory | jsdl:RangeValue_Type | This element is a range value specifying the required amount of virtual memory for each of the resources to be allocated for this job submission.
IndividualDiskSpace | jsdl:RangeValue_Type | This is a range value that describes the required amount of disk space for each resource allocated to the job.

Resources
Name of attribute | Type | Description
TotalCPUTime | jsdl:RangeValue_Type | This element is a range value specifying total number of CPU seconds required, across all CPUs used to execute the job.
TotalCPUCount | jsdl:RangeValue_Type | This element is a range value specifying the total number of CPUs required for this job submission.
TotalPhysicalMemory | jsdl:RangeValue_Type | This element is a range value specifying the required amount of physical memory for the entire job across all resources.
TotalVirtualMemory | jsdl:RangeValue_Type | This element is a range value specifying the required total amount of virtual memory for the job submission.
TotalDiskSpace | jsdl:RangeValue_Type | This is a range value that describes the required total amount of disk space that should be allocated to the job.
TotalResourceCount | jsdl:RangeValue_Type | This element is a range value specifying the total number of resources required by the job.

CandidateHosts
Name of attribute | Type | Description
HostName | xsd:string | This element is a simple type containing a single name of a host.

Resources: FileSystem
Name of attribute | Type | Description
Description | xsd:string |
MountPoint | xsd:string | This is a string that describes a local location that MUST be made available in the allocated resources for the job.
MountSource | xsd:string | This is a string that describes a remote location that MUST be made available locally for the job.
DiskSpace | jsdl:RangeValue_Type | This is a range value that describes the required amount of disk space on the containing FileSystem element for the job.
FileSystemType | jsdl:FileSystemTypeEnumeration | This is a token that describes the type of filesystem of the containing FileSystem element.

Resources: OperatingSystem
Name of attribute | Type | Description
OperatingSystemType | complex type | This is a complex type that contains the name of the operating system.
OperatingSystemVersion | xsd:string | This element is a string that defines the version of the operating system required by the job.
Description | xsd:string |

Resources: OperatingSystemType
Name of attribute | Type | Description
OperatingSystemName | jsdl:OperatingSystemTypeEnumeration | This is a token type that contains the name of the operating system.

Resources: CPUArchitecture
Name of attribute | Type | Description
CPUArchitectureName | jsdl:ProcessorArchitectureEnumeration | This element is a token specifying the CPU architecture required by the job in the execution environment.

22 RDSResults: EDAGrid [Figure: RDSResults structure; only the markers >, ?, +, * survive]

23 RDSResults [Figure: RDSResults structure, continued; only the markers >, *, ?, * survive]


25 Ranking: EDAGrid
FUNCTION Ranking_Algorithm
    FOR each resource Ri that satisfies the individual conditions
        Rank(Ri) = 0
        FOR each attribute Aj of the job
            Rank(Ri) = Rank(Ri) + (w[j] * R[i,j] / A[j])
        NEXT Aj
    NEXT Ri
    Sort the resource set by rank
    RETURN the ordered list of resources
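The ranking step on this slide can be sketched in Python as follows. This is a minimal illustration, not the EDAGrid implementation: the dictionary layout, the attribute names (`cpu_speed`, `memory`), the sample sites, and the `meets_individual` filter are all assumptions made for the example, and the sort is assumed to put the highest rank first.

```python
def rank_resources(resources, job, weights, meets_individual):
    """Return the resources that pass the individual conditions, best rank first."""
    ranked = []
    for res in resources:
        if not meets_individual(res, job):
            continue
        # Rank(Ri) = sum over job attributes j of w[j] * R[i,j] / A[j]
        score = sum(weights[attr] * res[attr] / required
                    for attr, required in job.items())
        ranked.append((score, res))
    # Assumption: a higher score means a better match, so sort descending.
    ranked.sort(key=lambda pair: pair[0], reverse=True)
    return [res for _, res in ranked]


def meets_individual(res, job):
    # Hypothetical individual condition: every attribute meets the requested amount.
    return all(res[attr] >= required for attr, required in job.items())


job = {"cpu_speed": 2.0, "memory": 4.0}          # requested amounts A[j]
weights = {"cpu_speed": 0.5, "memory": 0.5}      # weights w[j]
resources = [
    {"name": "site-a", "cpu_speed": 2.0, "memory": 8.0},
    {"name": "site-b", "cpu_speed": 3.0, "memory": 4.0},
    {"name": "site-c", "cpu_speed": 1.0, "memory": 16.0},
]
ordered = rank_resources(resources, job, weights, meets_individual)
print([r["name"] for r in ordered])
```

Here site-c is filtered out because its CPU speed is below the request, and site-a (score 1.5) is ranked ahead of site-b (score 1.25).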

26 Set-matching: EDAGrid The set-matching algorithm:
– Create an empty set.
– Add resources to the set one at a time, from the highest rank to the lowest.
– After each addition, check whether the Total condition is met and whether the InteractBandwidth constraint is violated.
– Terminate the loop when these conditions are satisfied or when the number of grid nodes reaches the number the user requested.
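One possible reading of the set-matching loop, sketched in Python. The slide does not say what happens when the bandwidth constraint is violated, so this sketch assumes the offending candidate is simply skipped. The `cpus` and `bandwidth` fields, the `bandwidth_ok` check, and the sample data are hypothetical.

```python
def set_matching(ranked, total_cpus, max_nodes, bandwidth_ok):
    """Greedily pick resources, highest rank first, until the total is met."""
    selected = []                      # start from an empty set
    for res in ranked:                 # 'ranked' is already ordered best-first
        candidate = selected + [res]
        if not bandwidth_ok(candidate):
            continue                   # assumption: skip a node that violates bandwidth
        selected = candidate
        total_met = sum(r["cpus"] for r in selected) >= total_cpus
        if total_met or len(selected) >= max_nodes:
            break                      # Total condition met, or node limit reached
    return selected


def bandwidth_ok(nodes, minimum=100):
    # Hypothetical stand-in for the InteractBandwidth check.
    return all(n["bandwidth"] >= minimum for n in nodes)


ranked = [
    {"name": "site-a", "cpus": 8, "bandwidth": 1000},
    {"name": "site-b", "cpus": 4, "bandwidth": 50},
    {"name": "site-c", "cpus": 8, "bandwidth": 200},
    {"name": "site-d", "cpus": 16, "bandwidth": 500},
]
chosen = set_matching(ranked, total_cpus=16, max_nodes=3, bandwidth_ok=bandwidth_ok)
print([n["name"] for n in chosen])
```

In this run site-b is skipped for low bandwidth, and the loop stops after site-c because the two selected sites together provide the 16 requested CPUs.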


28 Centralized Indexing This approach manages the resources of all Grid sites centrally. One server machine holds the index of all available resources on the network. Users start a query by sending the request to the index server. The server answers based on the information it has stored.
Advantage: fast lookups.
Disadvantages:
– The server machine is a bottleneck.
– The index must be updated continuously.
– It does not fit the P2P model.
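A toy index server makes the trade-off concrete. This is only a sketch under assumed names (`IndexServer`, `publish`, `withdraw`, `query`); it is not Napster's protocol.

```python
from collections import defaultdict


class IndexServer:
    """Single server holding the index of resources for every site."""

    def __init__(self):
        self._index = defaultdict(set)   # resource type -> names of sites offering it

    def publish(self, site, resource_types):
        # Every site must push its updates here, which is the continuous-update cost.
        for rtype in resource_types:
            self._index[rtype].add(site)

    def withdraw(self, site):
        for sites in self._index.values():
            sites.discard(site)

    def query(self, rtype):
        # One lookup answers the query, but every query hits this one machine.
        return sorted(self._index.get(rtype, set()))


server = IndexServer()
server.publish("site-a", ["cpu", "storage"])
server.publish("site-b", ["cpu"])
print(server.query("cpu"))
server.withdraw("site-a")
print(server.query("cpu"))
```

A query is a single dictionary lookup, which is why the approach is fast, while all publishes and queries converge on the same object, which is the bottleneck the slide refers to.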

29 Flooding A query is forwarded from a peer to all of its neighbors, and in this way spreads throughout the network. If a peer finds the requested resources in its local storage, it sends the answer to the peer that originated the request. A Time-To-Live (TTL) value limits the number of hops a request can travel, so that after a fixed number of forwards the request is dropped from the network.
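TTL-limited flooding can be sketched as a breadth-first traversal. The topology, the `holdings` map, and the function name are assumptions for illustration. The `visited` set is a simplification that stands in for duplicate suppression, and answers are collected in a list rather than sent back as messages.

```python
from collections import deque


def flood_query(neighbors, holdings, origin, wanted, ttl):
    """Return the peers reached within `ttl` hops that hold `wanted`."""
    hits = []
    visited = {origin}                     # simplification: each peer handles a query once
    queue = deque([(origin, ttl)])
    while queue:
        peer, remaining = queue.popleft()
        if wanted in holdings.get(peer, ()):
            hits.append(peer)              # this peer would answer the originator
        if remaining == 0:
            continue                       # TTL exhausted: the query is dropped here
        for nxt in neighbors.get(peer, ()):
            if nxt not in visited:
                visited.add(nxt)
                queue.append((nxt, remaining - 1))
    return hits


neighbors = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A"], "D": ["B", "E"], "E": ["D"]}
holdings = {"D": {"cpu"}, "E": {"cpu"}}
print(flood_query(neighbors, holdings, "A", "cpu", ttl=2))
print(flood_query(neighbors, holdings, "A", "cpu", ttl=3))
```

With a TTL of 2 the query from A reaches D but not E; raising the TTL to 3 also reaches E, at the price of more messages.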

30 Indexing Using Distributed Hash Tables In this method, each peer in the network holds one partition of the hash table. Each entry in the hash table maps a key in the key space to the peer where the requested file can be found. When a file is requested, its name is hashed with a uniform hash function. The hash value and the hash table are then used to look up the location, which is returned to the requester. The cost of this method consists of the cost of building and updating the hash table and of routing the query to the peer that holds the file. Disadvantage: it does not support complex queries.
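The partitioning idea can be sketched with a small hash ring in which a key belongs to the first peer whose identifier follows the key's hash. This is a simplified, Chord-like assignment only: it assumes every caller can see the whole ring and omits Chord's finger-table routing. The class and peer names are hypothetical.

```python
import bisect
import hashlib


def ring_id(name, bits=16):
    """Map a name onto the identifier ring with a uniform hash (SHA-1)."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** bits)


class TinyDHT:
    def __init__(self, peers):
        self._ring = sorted((ring_id(p), p) for p in peers)
        self._ids = [pid for pid, _ in self._ring]
        self.tables = {p: {} for p in peers}   # each peer's partition of the table

    def owner(self, key):
        # Successor rule: first peer id >= hash(key), wrapping around the ring.
        idx = bisect.bisect_left(self._ids, ring_id(key)) % len(self._ring)
        return self._ring[idx][1]

    def put(self, key, location):
        self.tables[self.owner(key)][key] = location

    def get(self, key):
        return self.tables[self.owner(key)].get(key)


dht = TinyDHT(["site-a", "site-b", "site-c"])
dht.put("matrix.dat", "site-b:/data/matrix.dat")
print(dht.owner("matrix.dat"), dht.get("matrix.dat"))
print(dht.get("matrix"))   # a partial name hashes elsewhere, so nothing is found
```

The last line illustrates the limitation noted on the slide: only an exact key hashes to the right entry, so range or substring queries cannot be answered by the lookup alone.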

31 Interests-based Methods This method is based on the interests of users. The idea is to search the peers that are likely to hold what the user has requested. To achieve this, peers are organized into groups with similar interests. Search queries are then forwarded to the matching interest group, which raises the hit rate and avoids spending time searching other peers.
Disadvantages:
– A peer's interests may change over time.
– A peer may have more than one interest.
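A minimal sketch of interest-based forwarding, assuming each peer declares a set of topics. The function names, topics, and the fallback to all peers are illustrative choices, not details of the HyperCuP system.

```python
from collections import defaultdict


def build_groups(peer_interests):
    """Group peers by declared interest; a peer may appear in several groups."""
    groups = defaultdict(set)
    for peer, interests in peer_interests.items():
        for topic in interests:
            groups[topic].add(peer)
    return groups


def targets_for(groups, all_peers, topic):
    """Forward to the matching interest group, or to everyone if no group exists."""
    members = groups.get(topic)
    return sorted(members) if members else sorted(all_peers)


peer_interests = {
    "p1": {"chemistry"},
    "p2": {"chemistry", "imaging"},
    "p3": {"imaging"},
    "p4": {"physics"},
}
groups = build_groups(peer_interests)
print(targets_for(groups, peer_interests, "chemistry"))
print(targets_for(groups, peer_interests, "biology"))
```

Peer p2 shows the second disadvantage from the slide: with two interests it has to be tracked in two groups, and the groups must be rebuilt whenever a peer's interests change.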

32 Ant Colony Optimizing Ants start from their nests and wander randomly. Ants that find food return to their nests from memory and drop pheromone on the trail. Other ants that come across such a trail follow it to check for food instead of wandering randomly. If they find the food, they return home and reinforce the pheromone on the trail. A key point is that the pheromone evaporates over time: the longer an ant takes to travel back to its nest, the more pheromone has evaporated. When an ant reaches an intersection, it has to decide which branch to take. Ants that take a short branch complete the trip faster than those that take a long branch, so the pheromone density on the short branch remains higher. Other ants are then more likely to choose a branch according to its pheromone density. Eventually, all the ants that go to get the food take the shortest branch.

33 Ant Colony Optimizing Initially, the ants may take the paths of A→B→C→E, A→B→E, or A→B→D→E. After the initial stage, most of the ants will take the shortest path A→B→E.
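The convergence on the shortest path can be illustrated with a small deterministic model of the three paths above. It is a sketch under stated assumptions: path length is the hop count, ants split across paths in proportion to pheromone, each ant deposits an amount inversely proportional to path length, and the evaporation rate, ant count, and round count are arbitrary values chosen for the example.

```python
def simulate(lengths, rounds=100, ants=100, evaporation=0.1):
    """Return each path's share of the pheromone after `rounds` iterations."""
    pheromone = {path: 1.0 for path in lengths}      # all paths start equal
    for _ in range(rounds):
        total = sum(pheromone.values())
        # Ants choose a path with probability proportional to its pheromone.
        share = {path: pheromone[path] / total for path in lengths}
        for path, length in lengths.items():
            deposit = ants * share[path] / length    # shorter trip, more pheromone
            pheromone[path] = (1 - evaporation) * pheromone[path] + deposit
    total = sum(pheromone.values())
    return {path: pheromone[path] / total for path in lengths}


# Hop counts of the three paths from A to E.
lengths = {"A→B→C→E": 3, "A→B→E": 2, "A→B→D→E": 3}
shares = simulate(lengths)
best = max(shares, key=shares.get)
print(best, round(shares[best], 3))
```

Because the two-hop path receives a larger deposit per ant in every round, its share of the pheromone grows steadily, which mirrors the behaviour described on the slide.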

34 Overview: Introduction of problems; Several approaches; Solution model

35


37 Thank You!