Published by Horatio Wilson. Modified over 9 years ago.
1
Scalable Web Server on Heterogeneous Cluster
CHEN Ge
2
Current Approaches
Why use a Global Object Space?
–Current Web server-side caching is generally based on a single-node design:
  Limited physical memory
  Usually only URL mapping tables are cached
  File content caching relies largely on the OS file system cache
  Web cluster support emphasizes load-distribution algorithms, not cluster-wide file content caching
3
Current Approaches
–Problems with current Web server caching approaches
  With limited physical memory, file content caching results in either thrashing of cached contents or complex cache management algorithms that bring slight performance improvement at great computation or storage overhead
  A single-node design is not scalable when applied to a cluster environment. Some current cluster-wide file content support (such as Rice’s LARD policy) is not scalable
  Relying on the OS file system’s file content caching is not efficient, and it lacks support for cluster-wide file content caching
4
Current Approaches
–Current Web server cluster approaches
  Usually emphasize load distribution, but rarely address the cluster-wide caching problem
  Usually adopt L4/L5 switches to distribute load among the cluster nodes
  Require a homogeneous hardware and software cluster environment
5
Current Approaches
–Problems with current Web server cluster approaches
  Load distribution based on per-node policies and an L4/L5 switch cannot handle ‘hot’ objects well
  Almost all cluster support requires a homogeneous cluster environment, which cannot utilize resources on different hardware and software platforms
6
Global Object Space
Global Object Space has two main aims:
–Utilize the large total physical memory a cluster system can provide to cache object content
–Use global objects to provide uniform access to resources on various platforms, which is achieved by using Java
7
Global Object Space
Current Web server
–Limited physical memory for file content caching
–Complex cache management
–‘Hot’ object problem
–Requires L4/L5 switch
–Not scalable
–Requires homogeneous cluster
GOS
–Java
–Better load balance
–Good response time for hot objects
–Large throughput
–Good scalability
–Heterogeneous cluster support
–Uniform access to resources of different platforms
8
Global Object Space
Global Object Space – Physical Relationship of Components
[Figure labels: Physical memory of a node; Inter-node high-speed network; Cached object (file) content]
9
Global Object Space
Global Object Space – Logical Relationship of Components
[Figure labels: Jigsaw’s Request Handle Object; Global Object Space Service Interface Protocol (GOSSIP); Global Object Space]
10
Global Object Space
Macro-view of Request Handling
–A node gets an HTML document request: http://www.dotcom.com/doc/year2k/index.html
  The Request Handle Object calls GOS for the requested document http://www.dotcom.com/doc/year2k/index.html
–GOS uses GOSSIP to make up the Reply Object, which is returned to the Request Handle Object
–The Request Handle Object replies to the client with the returned Reply Object
11
Global Object Space
How to GOSSIP?
–There are two tables on each node:
  Global Object Space (GOS) Table
  –Holds an entry for each object in the system
  Hot Object Cache (HOC) Table
  –Holds entries for locally duplicated objects that are hot, meaning they are accessed very frequently
–When an object request arrives from the Request Handle Object, the URL is looked up in the HOC table. If it is in the table, the object is cached in local physical memory, and the Reply Object is formed from the cached entry.
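The HOC lookup described above can be sketched in Java as follows. This is a minimal illustration, not the presentation's implementation: the class name `HocTable` and its methods are hypothetical, and a plain in-memory map stands in for whatever structure the real server uses.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of a per-node Hot Object Cache (HOC) table.
// Names are illustrative; they do not come from the slides.
class HocTable {
    // URL -> object content duplicated into this node's physical memory.
    private final Map<String, byte[]> entries = new ConcurrentHashMap<>();

    // Adds a locally duplicated hot object.
    void put(String url, byte[] content) {
        entries.put(url, content);
    }

    // Returns the cached content on a hit, or an empty Optional on a miss.
    // On a miss the caller would fall back to the GOS table.
    Optional<byte[]> lookup(String url) {
        return Optional.ofNullable(entries.get(url));
    }

    // Drops an object that is no longer hot.
    void remove(String url) {
        entries.remove(url);
    }
}
```

A request handler would call `lookup` first and build the Reply Object directly from the returned bytes when the Optional is present.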
12
Global Object Space
How to GOSSIP
–If the requested URL has no entry in the HOC Table, it is parsed one item at a time until it reaches an item in the GOS table
  For example: http://www.dotcom.com/doc/year2k/index.html
  If this document is cached on another node, the entry will have something like
  GOSEntry.key=“http://www.dotcom.com/doc”
  GOSEntry.nodeaddr=10.8.102.2
  Then the GOS Object creates a connection with the remote GOS Service Object and sends the remaining URL to it, here “/year2k/index.html”
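The item-by-item URL parsing above might look like the following Java sketch. It assumes the URL is extended one path segment at a time from the host onward and that the first matching prefix wins; the slides do not state the direction or the tie-breaking rule, so both are assumptions. `GosTable` and `Route` are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of GOS table resolution by URL prefix.
class GosTable {
    // Result of a lookup: the node holding the object and the rest of the URL.
    static final class Route {
        final String nodeAddr;
        final String remainingPath;

        Route(String nodeAddr, String remainingPath) {
            this.nodeAddr = nodeAddr;
            this.remainingPath = remainingPath;
        }
    }

    // Key (URL prefix) -> address of the node caching objects under it.
    private final Map<String, String> entries = new HashMap<>();

    void put(String key, String nodeAddr) {
        entries.put(key, nodeAddr);
    }

    // Extends the prefix one path item at a time until it matches an entry.
    // Returns null when no prefix of the URL is in the table.
    Route resolve(String url) {
        int schemeEnd = url.indexOf("://");
        int from = schemeEnd < 0 ? 0 : schemeEnd + 3; // skip "http://"
        int slash = url.indexOf('/', from);
        while (true) {
            String prefix = slash < 0 ? url : url.substring(0, slash);
            String node = entries.get(prefix);
            if (node != null) {
                return new Route(node, url.substring(prefix.length()));
            }
            if (slash < 0) {
                return null; // whole URL tried, no entry found
            }
            slash = url.indexOf('/', slash + 1);
        }
    }
}
```

With the entry from the slide (key `http://www.dotcom.com/doc`, node `10.8.102.2`), resolving the example URL yields that node address and the remaining path `/year2k/index.html`, which is what would be sent to the remote GOS Service Object.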
13
Global Object Space
How to GOSSIP
–The remote GOS Service Object tries to fetch the cached object and sends back its content, reading it from disk first if there is a cache miss
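A minimal Java sketch of this fetch-or-load behaviour is shown below. The class `GosServiceCache`, the `docRoot` parameter, and the unbounded in-memory map are all assumptions made for illustration; a real service would bound the cache and validate the path against directory traversal.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the serving side: answer from memory when possible,
// otherwise read the file from disk and keep it cached.
class GosServiceCache {
    private final Path docRoot; // assumed local directory holding the files
    private final Map<String, byte[]> memory = new ConcurrentHashMap<>();

    GosServiceCache(Path docRoot) {
        this.docRoot = docRoot;
    }

    // remainingPath is the part of the URL sent by the requesting node,
    // for example "/year2k/index.html".
    byte[] fetch(String remainingPath) {
        return memory.computeIfAbsent(remainingPath, p -> {
            // Cache miss: load from disk. No path sanitising in this sketch.
            String relative = p.startsWith("/") ? p.substring(1) : p;
            try {
                return Files.readAllBytes(docRoot.resolve(relative));
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
    }

    boolean isCached(String remainingPath) {
        return memory.containsKey(remainingPath);
    }
}
```

The first `fetch` for a path reads the file; later calls for the same path are served from memory without touching the disk.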
14
Global Object Space
How to GOSSIP: For normal objects
[Figure labels: Client; Node 1; Request: http://www.dotcom.com/doc/year2k/index.html; Node 2; /year2k/index.html; Real object content]
15
Global Object Space
How to GOSSIP: For hot node
[Figure labels: Client; Node 1; Request: http://www.dotcom.com/doc/year2k/index.html; Node 2; Real object content; HOC Table Hit; http redirect]
16
Global Object Space
How to GOSSIP: For hot objects on hot node
[Figure labels: Client; Node 1; Request: http://www.dotcom.com/doc/year2k/index.html; Node 2; Real object content; HOC Table Hit]
17
Global Object Space
How to GOSSIP
–The GOS Service Object maintains a field in the local object mapping table that records the access frequency of local objects. When it finds that an object has become hot, it sets the hot flag in the reply object, so that the remote GOS object adds an entry to its HOC table and caches the object content in its local memory
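One way to picture the access-frequency field and the hot flag is the Java sketch below. It simplifies "frequency" to a raw access count compared against a fixed threshold; the slides do not say how frequency is measured or where the cut-off lies, so the threshold and the name `AccessTracker` are assumptions.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: per-object access counter used to decide
// whether the reply should carry the hot flag.
class AccessTracker {
    private final long hotThreshold; // assumed cut-off, not from the slides
    private final Map<String, AtomicLong> counts = new ConcurrentHashMap<>();

    AccessTracker(long hotThreshold) {
        this.hotThreshold = hotThreshold;
    }

    // Records one access and returns true when the object has reached the
    // threshold, meaning the reply's hot flag should be switched on.
    boolean recordAccess(String path) {
        long n = counts.computeIfAbsent(path, k -> new AtomicLong()).incrementAndGet();
        return n >= hotThreshold;
    }
}
```

A requesting node that sees the flag set would then store the returned content in its own HOC table. A production version would likely decay or window the counts so that objects can cool down again.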
18
Global Object Space
How to GOSSIP
–When a GOS Service Object finds that a previously hot object is no longer hot, it broadcasts this to all the nodes in the system, so that the other nodes remove the object from their HOC tables and local memory caches
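The cool-down broadcast can be sketched in Java as below. The network is replaced by direct method calls on in-process peers, and the names `GosPeer`, `InMemoryPeer`, and `CooldownBroadcaster` are hypothetical; the slides do not describe the actual message format or transport.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical view of another node as seen by the broadcasting node.
interface GosPeer {
    void evictHot(String url);
}

// In-process stand-in for a remote node, holding only the URLs in its HOC table.
class InMemoryPeer implements GosPeer {
    final Set<String> hocUrls = new HashSet<>();

    @Override
    public void evictHot(String url) {
        hocUrls.remove(url); // drop the HOC entry and its cached content
    }
}

// Notifies every known peer that an object is no longer hot.
class CooldownBroadcaster {
    private final List<GosPeer> peers = new ArrayList<>();

    void addPeer(GosPeer peer) {
        peers.add(peer);
    }

    void broadcastNoLongerHot(String url) {
        for (GosPeer peer : peers) {
            peer.evictHot(url);
        }
    }
}
```

Peers that never cached the object simply ignore the notification, since removing an absent entry is a no-op.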
19
Global Object Space
Further thoughts about GOSSIP
–Cache distribution
  With such a mechanism, it is not necessary to cache the file content on the node where the file actually resides.
[Figure labels: Requested File; Cached File Content]
20
Global Object Space
How GOS Achieves the Goals
–Better Load Balance
  GOS distributes load according to the requested object, directing each request to the node where the object is cached, rather than relying on the simple or complex round-robin balancing of an L4/L5 switch
  Copies of “hot” objects are duplicated among the cluster nodes, so load is more evenly distributed even when there are intensive requests for a few objects in the system
  Because GOS can place an object’s cache in other nodes’ physical memory, it can solve the problem of a hot server whose files are accessed much more frequently
21
Global Object Space
–Good Response Time for Hot Objects
  Because GOS duplicates copies of hot objects among the nodes, requests for hot objects are redirected to different nodes. The response time for hot objects is shortened, since no server is as busy as it would be if one node had to process every request for the same object
22
Global Object Space
–Large Throughput
  With GOS, requests are distributed to the individual nodes, and each node can set up connections with clients directly to serve them any object in the system. The throughput of the whole Web server is therefore higher than with L4/L5 switches, which can become bottlenecks
  Moreover, the better load balance under hot-object conditions makes its throughput larger than that of current L4/L5 switch setups
23
Global Object Space
–Good Scalability
  GOS does not require each node to hold the whole URL mapping table of the system. In some current LARD systems, each node has a complete mapping table of all the files in the system, which becomes extremely large as the system scales
  GOS does not rely on a single L4/L5 switch to redistribute requests. This eliminates a potential bottleneck when the system scales to a large number of nodes
24
Global Object Space
–Runs on Heterogeneous Clusters
  Written in Java, the Web server can run on nodes of different platforms
  With the support of GOS, a request received by any node can access the resources of all nodes in the system, which provides uniform access to the different platforms in the system
25
Global Object Space
Problems to solve in building GOS
–How to redirect HTTP requests efficiently
  As we will not use L4/L5 switches, the traditional methods of distributing requests among nodes are not suitable for GOS
  With Java as the implementation language, we cannot do much at the lower levels of network communication
–GOS introduces extra overhead when fetching object contents from other nodes’ physical memory, so an efficient implementation is needed
26
Global Object Space
Further thoughts about GOSSIP
–Dynamic content caching is a rather difficult task according to currently available references. We can consider a load-balancing scheme that distributes runs of the same dynamic generation process across several nodes. The script file itself can be cached as normal file content