1 RESOURCE DISCOVERY Presenter: Cù Nguyễn Phương Hà.

2 Overview
– Introduction of problems
– Several approaches
– Solution model

4 Introduction
The goal of the VN-Grid project is to connect the computational resources available on the network so that resources from the participating sites can be used together to solve large scientific problems. This requires knowing which resources exist at every Grid site and finding which Grid sites currently have resources available => Resource Discovery services.

5 Functions of Resource Discovery
The Resource Discovery service at each Grid site must be able to know, find, and provide resource information from the other sites. Its main function: on receiving a specific resource request from a client, the service must return reliable information about the Grid sites in the network that have available resources satisfying the query.

6 Resources in VN-Grid
There are three kinds of resources:
– Resources for executing jobs (computing resources): information about the resources used to execute a submitted job, for example computational power, data storage, and network bandwidth.
– Information about services: information about the services a user wants to learn about, for example Information Services and Resource Discovery Services.
– Information about applications: information about special applications deployed on the Grid, such as MPI and POP C.

7 Resources in VN-Grid
Characteristics of resources in the VN-Grid environment:
– Resources are heterogeneous, not only across the network but also within each Grid site.
– Resources have a variety of properties with different data types.
– Existing resources change continuously, especially computing resources such as CPUs, memory, disk, and network bandwidth.
– New resources are continually being published.

8 Forwarding in VN-Grid
The proposed VN-Grid infrastructure follows a Peer-to-Peer model in which clients, rather than servers, control the networking; that is, peers can exchange information directly.
– Interaction is limited to known peers.
– All peers are treated equally.
– The number of peers participating in the Grid can grow enormously.

9 Summary
Good resource discovery services must:
– Provide accurate, up-to-date, and sufficient information in a timely manner.
– Be flexible with respect to resource features such as variety, heterogeneity, and newly added resources.
– Scale as the number of peers in the Grid environment grows.
– Reduce the cost of transmitting information in the P2P environment.

11 Several Approaches
Resource description:
– Global Grid Forum: JSDL
– EDAGrid: RDSResult
Matching method:
– EDAGrid: Ranking, Set-matching
Forwarding method:
– Napster: Centralized Indexing
– Gnutella: Flooding Query
– Chord: Indexing Using Distributed Hash Tables
– HyperCuP system: Interests-based
– Ant Colony Optimizing

13 JSDL
JSDL (Job Submission Description Language) is used to describe the requirements of computational jobs for submission to resources, particularly in Grid environments.

14 JSDL
[Figure: JSDL document structure]

Resources
Name of attribute | Type | Description
CandidateHosts | complex type | This element is a complex type specifying the set of named hosts which may be selected for running the job.
FileSystem | complex type | This element describes a filesystem that is required by the job.
ExclusiveExecution | xsd:boolean | This is a boolean that designates whether the job must have exclusive access to the resources allocated to it by the consuming system.
OperatingSystem | complex type | This is a complex type that defines the operating system required by the job.

Resources
Name of attribute | Type | Description
IndividualCPUSpeed | jsdl:RangeValue_Type | This element is a range value specifying the speed of each CPU required by the job in the execution environment.
IndividualCPUTime | jsdl:RangeValue_Type | This element is a range value specifying the total number of CPU seconds required on each resource to execute the job.
IndividualCPUCount | jsdl:RangeValue_Type | This element is a range value specifying the number of CPUs for each of the resources to be allocated to the job submission.
IndividualNetworkBandwidth | jsdl:RangeValue_Type | This element is a range value specifying the bandwidth requirements of each individual resource.
IndividualPhysicalMemory | jsdl:RangeValue_Type | This element is a range value specifying the amount of physical memory required on each individual resource.
IndividualVirtualMemory | jsdl:RangeValue_Type | This element is a range value specifying the required amount of virtual memory for each of the resources to be allocated for this job submission.
IndividualDiskSpace | jsdl:RangeValue_Type | This is a range value that describes the required amount of disk space for each resource allocated to the job.

Resources
Name of attribute | Type | Description
TotalCPUTime | jsdl:RangeValue_Type | This element is a range value specifying the total number of CPU seconds required, across all CPUs used to execute the job.
TotalCPUCount | jsdl:RangeValue_Type | This element is a range value specifying the total number of CPUs required for this job submission.
TotalPhysicalMemory | jsdl:RangeValue_Type | This element is a range value specifying the required amount of physical memory for the entire job across all resources.
TotalVirtualMemory | jsdl:RangeValue_Type | This element is a range value specifying the required total amount of virtual memory for the job submission.
TotalDiskSpace | jsdl:RangeValue_Type | This is a range value that describes the required total amount of disk space that should be allocated to the job.
TotalResourceCount | jsdl:RangeValue_Type | This element is a range value specifying the total number of resources required by the job.

CandidateHosts
Name of attribute | Type | Description
HostName | xsd:string | This element is a simple type containing a single name of a host.

Resources
FileSystem
Name of attribute | Type | Description
Description | xsd:string |
MountPoint | xsd:string | This is a string that describes a local location that MUST be made available in the allocated resources for the job.
MountSource | xsd:string | This is a string that describes a remote location that MUST be made available locally for the job.
DiskSpace | jsdl:RangeValue_Type | This is a range value that describes the required amount of disk space on the containing FileSystem element for the job.
FileSystemType | jsdl:FileSystemTypeEnumeration | This is a token that describes the type of filesystem of the containing FileSystem element.

Resources
OperatingSystem
Name of attribute | Type | Description
OperatingSystemType | complex type | This is a complex type that contains the name of the operating system.
OperatingSystemVersion | xsd:string | This element is a string that defines the version of the operating system required by the job.
Description | xsd:string |

Resources
OperatingSystemType
Name of attribute | Type | Description
OperatingSystemName | jsdl:OperatingSystemTypeEnumeration | This is a token type that contains the name of the operating system.
CPUArchitecture
Name of attribute | Type | Description
CPUArchitectureName | jsdl:ProcessorArchitectureEnumeration | This element is a token specifying the CPU architecture required by the job in the execution environment.

22 RDSResults: EDAGrid
[Figure: RDSResults document structure]

23 RDSResults
[Figure: RDSResults document structure, continued]

25 Ranking: EDAGrid
FUNCTION Ranking_Algorithm
  FOR each Ri (resource satisfying the individual conditions)
    Rank(Ri) = 0
    FOR each Aj (attribute of the job)
      Rank(Ri) = Rank(Ri) + (w[j] * R[i,j] / A[j])
    NEXT Aj
  NEXT Ri
  Sort the resource set by rank
  RETURN the ordered list of resources
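The ranking loop can be sketched in Python as follows. This is a minimal illustration, not the EDAGrid implementation: the function name, the dictionary layout, the attribute names, and the sample sites are all assumptions, and the sort is assumed to be descending so that the best-ranked resource comes first.

```python
from typing import Dict, List, Tuple


def rank_resources(
    resources: Dict[str, Dict[str, float]],
    job: Dict[str, float],
    weights: Dict[str, float],
) -> List[Tuple[str, float]]:
    """Return (resource, rank) pairs, highest rank first.

    resources[i][j] plays the role of R[i,j], job[j] of A[j],
    and weights[j] of w[j] in the slide's pseudocode.
    """
    ranked = []
    for name, offered in resources.items():
        # Rank(Ri) = sum over job attributes of w[j] * R[i,j] / A[j]
        score = sum(weights[attr] * offered[attr] / job[attr] for attr in job)
        ranked.append((name, score))
    # Assumption: a higher rank is better, so sort in descending order.
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked


# Hypothetical job request and candidate sites (already filtered by the
# individual conditions).
job = {"cpu_count": 4, "memory_gb": 8}
weights = {"cpu_count": 0.5, "memory_gb": 0.5}
resources = {
    "site-a": {"cpu_count": 8, "memory_gb": 8},
    "site-b": {"cpu_count": 4, "memory_gb": 8},
    "site-c": {"cpu_count": 16, "memory_gb": 32},
}
ranking = rank_resources(resources, job, weights)
```

A resource that offers exactly what the job asks for on every attribute scores the sum of the weights; offering more raises the score proportionally.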

26 Set-matching: EDAGrid
The set-matching algorithm:
– Create an empty set.
– Add resources to the set one at a time, from the highest rank to the lowest.
– After each addition, check whether the Total condition is met and whether the InteractBandwidth condition is violated.
– Terminate the loop when these conditions are satisfied or when the number of grid nodes reaches the number the user requested.
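One possible reading of the set-matching steps is sketched below. The slide does not say how a resource that violates InteractBandwidth is handled, so this sketch assumes such a candidate is simply skipped; the Total condition is assumed to be a required total CPU count. The function name, field names, and sample data are hypothetical.

```python
from typing import Dict, List, Tuple


def set_match(
    ranked: List[Dict],
    required_total_cpus: int,
    min_bandwidth: float,
    max_nodes: int,
) -> Tuple[List[str], bool]:
    """Greedily build a resource set from a list sorted by descending rank.

    Returns the selected resource names and whether the Total condition
    (here: enough CPUs in total) was met.
    """
    selected: List[str] = []
    total_cpus = 0
    for resource in ranked:
        # Assumption: a candidate whose bandwidth is below the required
        # InteractBandwidth would violate the condition, so it is skipped.
        if resource["bandwidth"] < min_bandwidth:
            continue
        selected.append(resource["name"])
        total_cpus += resource["cpus"]
        # Stop once the Total condition holds or the node limit is reached.
        if total_cpus >= required_total_cpus or len(selected) >= max_nodes:
            break
    return selected, total_cpus >= required_total_cpus


# Hypothetical candidates, already ordered from highest to lowest rank.
ranked = [
    {"name": "site-c", "cpus": 16, "bandwidth": 100.0},
    {"name": "site-a", "cpus": 8, "bandwidth": 10.0},
    {"name": "site-b", "cpus": 4, "bandwidth": 100.0},
]
chosen, satisfied = set_match(ranked, required_total_cpus=20,
                              min_bandwidth=50.0, max_nodes=3)
```

Because the loop is greedy, it returns the first set that satisfies the conditions in rank order rather than the smallest possible set.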

28 Centralized Indexing
This approach proposes centralized management of the resources of all Grid sites.
– A server machine holds the index of all available resources on the network.
– Users start a query by sending the request to the index server.
– The server answers based on the information it has stored.
Advantages: fast lookups.
Disadvantages:
– The server machine becomes a bottleneck.
– The index information must be updated continuously.
– Not suitable for a P2P environment.
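A toy version of the central index might look like the sketch below. The class, its methods, and the attribute names are hypothetical and only illustrate the publish/query flow; a real index server would also need networking, expiry of stale entries, and concurrency control.

```python
from typing import Dict, List


class IndexServer:
    """Hypothetical central index: Grid site name -> advertised resources."""

    def __init__(self) -> None:
        self._index: Dict[str, Dict[str, float]] = {}

    def publish(self, site: str, resources: Dict[str, float]) -> None:
        # Each site must re-publish whenever its resources change, which is
        # the continuous-update cost noted in the disadvantages.
        self._index[site] = dict(resources)

    def query(self, requirements: Dict[str, float]) -> List[str]:
        # Answer purely from stored information: every required attribute
        # must be offered in at least the requested amount.
        return sorted(
            site
            for site, offered in self._index.items()
            if all(offered.get(attr, 0) >= needed
                   for attr, needed in requirements.items())
        )


server = IndexServer()
server.publish("site-a", {"cpu_count": 8, "memory_gb": 16})
server.publish("site-b", {"cpu_count": 2, "memory_gb": 64})
matches = server.query({"cpu_count": 4, "memory_gb": 8})
```

Every query and every update passes through the single `IndexServer` object, which is where the bottleneck arises.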

29 Flooding
– A query is routed from a peer to all of its neighbors, and in this way spreads throughout the network.
– If a peer finds the requested resources in its local storage, it sends the answer to the peer that originated the request.
– A Time-To-Live (TTL) value limits the number of hops a request can travel; once it has been forwarded that many times, the request is dropped from the network.
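The effect of the TTL can be illustrated with a small breadth-first simulation. This is a sketch under assumptions: the topology and peer names are invented, and peers are assumed to ignore a query they have already seen, which the slide does not state.

```python
from collections import deque
from typing import Dict, List, Set, Tuple


def flood_query(
    neighbors: Dict[str, List[str]],
    holders: Set[str],
    origin: str,
    ttl: int,
) -> Tuple[List[str], int]:
    """Flood a query from `origin`; return (peers that answered, messages sent)."""
    seen = {origin}
    hits: List[str] = []
    messages = 0
    queue = deque([(origin, ttl)])
    while queue:
        peer, remaining = queue.popleft()
        if peer in holders:
            hits.append(peer)  # this peer would reply to the origin
        if remaining == 0:
            continue  # TTL exhausted: the query is not forwarded further
        for neighbor in neighbors[peer]:
            messages += 1  # every forward costs one message
            # Assumption: a peer drops a query it has already processed.
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, remaining - 1))
    return hits, messages


# Hypothetical overlay: E holds the resource and is three hops from A.
overlay = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}
near = flood_query(overlay, {"E"}, "A", ttl=2)
far = flood_query(overlay, {"E"}, "A", ttl=3)
```

With a TTL of 2 the query stops one hop short of E; raising the TTL to 3 finds it at the price of more messages, which is the trade-off flooding has to manage.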

30 Indexing Using Distributed Hash Tables
– Each peer in the network holds a partition of the hash table.
– Each entry in the hash table maps a key in the key space to the peer where the requested file can be found.
– When a file is requested, its name is hashed with a uniform hash function. Based on the hash value and the hash table, the lookup value is found and returned to the requester.
– The cost of this method consists of building and updating the hash table and routing the query to the location of the requested file.
Disadvantage: it does not support complex queries.
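The key-to-peer mapping can be sketched as a simplified hash ring. This is an illustration only: it assigns each key to the first peer whose identifier follows the key's hash, in the spirit of Chord, but it omits Chord's finger tables and multi-hop routing. The class name, peer names, and file names are hypothetical.

```python
import hashlib
from bisect import bisect_left
from typing import Dict, List, Optional


def ring_id(name: str) -> int:
    """Hash a name uniformly into the identifier space (full SHA-1 digest)."""
    return int.from_bytes(hashlib.sha1(name.encode("utf-8")).digest(), "big")


class TinyDHT:
    """Each peer holds the partition of the table whose keys hash to it."""

    def __init__(self, peers: List[str]) -> None:
        self._ring = sorted((ring_id(p), p) for p in peers)
        self._ids = [identifier for identifier, _ in self._ring]
        self.tables: Dict[str, Dict[str, str]] = {p: {} for p in peers}

    def responsible_peer(self, key: str) -> str:
        # First peer clockwise from the key's hash, wrapping around the ring.
        position = bisect_left(self._ids, ring_id(key))
        return self._ring[position % len(self._ring)][1]

    def put(self, key: str, location: str) -> None:
        self.tables[self.responsible_peer(key)][key] = location

    def get(self, key: str) -> Optional[str]:
        return self.tables[self.responsible_peer(key)].get(key)


dht = TinyDHT(["site-a", "site-b", "site-c"])
dht.put("dataset.tar", "site-b:/data/dataset.tar")
location = dht.get("dataset.tar")
```

Only exact names can be looked up this way: a query such as "any site with at least 8 CPUs" has no single hash value, which is why complex queries are listed as the disadvantage.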

31 Interests-based Methods
This method is based on the interests of users. The idea is to search the peers that are likely to hold what users have requested. To achieve this, peers are organized into groups with similar interests, and search queries are forwarded to the matching interest group, which gives a high hit rate and avoids spending time searching other peers.
Disadvantages:
– A peer's interest may change over time.
– A peer may have more than one interest.

32 Ant Colony Optimizing
Ants start from their nests and wander randomly. Ants that find food return to their nests from memory and drop pheromone on the trail. Other ants that come across such a trail follow it to check for food instead of wandering randomly. If they find the food, they return home and reinforce the pheromone on the trail. A key point is that the pheromone evaporates over time: the longer an ant takes to travel back to its nest, the more pheromone evaporates. When an ant reaches an intersection, it has to decide which branch to take. Ants that take a short branch complete the trip faster than those that take a long branch, so the pheromone density on the short branch remains higher. Other ants are then more likely to choose that branch because of its pheromone density. Eventually, all the ants that go to get the food take the shortest branch.

33 Ant Colony Optimizing Initially, the ants may take the paths of A→B→C→E, A→B→E, or A→B→D→E. After the initial stage, most of the ants will take the shortest path A→B→E.
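The convergence toward A→B→E can be reproduced with a small deterministic model. This is a sketch, not the algorithm from the cited work: instead of simulating individual ants, it tracks the expected share of ants on each path, and it assumes that the pheromone deposited per round is proportional to that share divided by the path length in hops (shorter trips are completed more often). The evaporation rate and round count are arbitrary.

```python
from typing import Dict


def simulate_pheromone(
    path_lengths: Dict[str, int],
    rounds: int = 50,
    evaporation: float = 0.5,
) -> Dict[str, float]:
    """Return the fraction of ants expected on each path after `rounds`."""
    # All paths start with the same pheromone, so the first choice is uniform.
    pheromone = {path: 1.0 for path in path_lengths}
    for _ in range(rounds):
        total = sum(pheromone.values())
        # Ants choose a path with probability proportional to its pheromone.
        share = {path: level / total for path, level in pheromone.items()}
        for path, length in path_lengths.items():
            # Evaporate, then reinforce; shorter paths receive more deposit
            # per round (assumption: deposit = share / length).
            pheromone[path] = ((1 - evaporation) * pheromone[path]
                               + share[path] / length)
    total = sum(pheromone.values())
    return {path: level / total for path, level in pheromone.items()}


# Path lengths in hops, taken from the three routes on the slide.
paths = {"A-B-C-E": 3, "A-B-E": 2, "A-B-D-E": 3}
shares = simulate_pheromone(paths)
best = max(shares, key=shares.get)
```

Each round multiplies the pheromone ratio between the two-hop path and a three-hop path by a factor greater than one, so the shortest path's share keeps growing until nearly all ants use it.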

37 Thank You!
