
1 EFFECTIVE LOAD-BALANCING VIA MIGRATION AND REPLICATION IN SPATIAL GRIDS
ANIRBAN MONDAL, KAZUO GODA, MASARU KITSUREGAWA
INSTITUTE OF INDUSTRIAL SCIENCE, UNIVERSITY OF TOKYO, JAPAN
{anirban,kgoda,kitsure}@tkl.iis.u-tokyo.ac.jp

2 PRESENTATION OUTLINE
INTRODUCTION
RELATED WORK
SYSTEM OVERVIEW
MIGRATION AND REPLICATION
LOAD-BALANCING
PERFORMANCE STUDY
CONCLUSION AND FUTURE WORK

3 INTRODUCTION
Prevalence of spatial applications: GIS, CAD, VLSI
 Resource management, development planning, emergency planning, scientific research
Unprecedented growth of available spatial data at geographically distributed locations
 The need for efficient networking
Emergence of GRID computing and powerful networks
These trends motivate the design of a SPATIAL GRID.

4 CHALLENGES
Scale
Heterogeneity
Dynamism
Cross-domain administrative issues
Efficient search and load-balancing mechanisms
 We focus on load-balancing.
 Load-balancing in GRIDs is much more complicated than in traditional environments.

5 LOAD-BALANCING
Some nodes become hot
 Skewed workloads
 Dynamic access patterns
These hot nodes become bottlenecks
 Increased waiting times
 High response times
MAIN CONTRIBUTIONS
 Viewing a spatial GRID as comprising several clusters, each of which is a LAN
 Proposal of an inter-cluster load-balancing algorithm which uses migration/replication of data
 Presentation of a scalable technique for dynamic data placement

6 RELATED WORK
Ongoing GRID projects
 Earth Systems Grid (ESG)
 NASA Information Power Grid (IPG)
 Grid Physics Network (GriPhyN)
 European DataGrid
[Thain01] Binding of execution and storage sites together into I/O communities; the Kangaroo data-movement system
Load-balancing
 STATIC (BUBBA, tile technique)
 DYNAMIC (disk cooling)
Job (process) MIGRATION in CONDOR
Spatial indexes: R-tree [Guttman:84]

7 SYSTEM OVERVIEW
Viewing the GRID as a set of clusters
Distance between two clusters
 Communication time between the cluster leaders
Neighbours
Definition of load
 Number of disk I/Os in a certain time interval
 Normalized w.r.t. CPU power
Cluster leaders
 Coordinate cluster activities
 Maintain meta-information about data stored at their own cluster and at their neighbours
Hotspot detection via access statistics
 Use only recent statistics (a sketch follows below)
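To make the load metric concrete, here is a minimal sketch (my illustration, not code from the paper) of a per-node counter that keeps only recent disk-I/O statistics and normalizes by CPU power; the window length and the API are assumptions:

```python
import time
from collections import deque

class NodeStats:
    """Tracks disk I/Os over a sliding window and normalizes load by CPU power.
    The 60-second window and the normalization formula are illustrative assumptions."""

    def __init__(self, cpu_power: float, window_secs: float = 60.0):
        self.cpu_power = cpu_power      # relative CPU speed of this node
        self.window_secs = window_secs  # keep only recent statistics
        self.io_events = deque()        # timestamps of disk I/Os

    def record_io(self) -> None:
        self.io_events.append(time.time())

    def load(self) -> float:
        """Load = number of disk I/Os in the recent window, normalized w.r.t. CPU power."""
        cutoff = time.time() - self.window_secs
        while self.io_events and self.io_events[0] < cutoff:
            self.io_events.popleft()    # discard stale statistics
        return len(self.io_events) / self.cpu_power
```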

8 DATA MOVEMENT IN GRIDs
MIGRATION & REPLICATION
 Unlike replication, migration implies deletion of the hot data at the source node.
Which option is better: migration or replication?
 Load-balancing
 Data availability
 Disk space usage
 Periodic cleanup
REPLICA CONSISTENCY??
Decisions concerning migration/replication should be taken at run-time (see the sketch below).
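As one illustration of such a run-time decision, the sketch below weighs the criteria listed above; the inputs and thresholds are my assumptions, not the paper's exact policy:

```python
def choose_movement(data_size_mb: float, free_space_mb: float,
                    update_fraction: float,
                    large_threshold_mb: float = 100.0) -> str:
    """Illustrative run-time choice between migration and replication.

    update_fraction = fraction of accesses that are updates (assumed input).
    - Replication improves availability but consumes extra disk space and
      raises replica-consistency concerns for frequently updated data.
    - Migration frees space at the source, since the hot data is deleted there.
    """
    if update_fraction > 0.5:
        return "migrate"      # write-heavy: avoid replica-consistency overhead
    if data_size_mb > large_threshold_mb or data_size_mb > free_space_mb:
        return "migrate"      # too large to copy comfortably to the peer
    return "replicate"        # small, read-mostly hot data: replicate
```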

9 DATA MOVEMENT (Cont.)
Impact of heterogeneity on data movement
 Administrative policies (e.g., security)
 Data management techniques (indexing, hotspot detection, etc.)
 CPU
 Disk space
Moving data entails movement of indexes.
To address variations in indexing schemes, we extract the data from the index at the source node and rebuild the index at the destination node (see the sketch below).
Each node has two indexes
 An index for its own data
 An index for moved data
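A minimal sketch of the extract-and-rebuild step, assuming a generic SpatialIndex interface; the interface and method names are hypothetical, since the paper uses R-trees but does not prescribe an API:

```python
from typing import Iterable, Tuple

Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)

class SpatialIndex:
    """Hypothetical index interface; any scheme (e.g., an R-tree) could back it."""
    def entries(self) -> Iterable[Tuple[Rect, object]]: ...
    def insert(self, rect: Rect, item: object) -> None: ...

def ship_region(src: SpatialIndex, dst_moved_data_index: SpatialIndex,
                hot_region: Rect) -> None:
    """Extract data falling in the hot region from the source index and
    rebuild it at the destination's separate index for moved data."""
    def overlaps(r: Rect) -> bool:
        return not (r[2] < hot_region[0] or r[0] > hot_region[2] or
                    r[3] < hot_region[1] or r[1] > hot_region[3])

    for rect, item in src.entries():
        if overlaps(rect):
            # Re-inserting entry by entry rebuilds the destination index in its
            # own native scheme, sidestepping differences between index formats.
            dst_moved_data_index.insert(rect, item)
```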

10 DATA MOVEMENT (Cont.)
Impact of variations in disk space on data movement (a sketch follows below)
 'Pushing' non-hot data to large-capacity peers
  Large-sized data: migration
  Small-sized data: replication
 Replicating small-sized hot data at small-capacity peers
 Large-sized hot data: migration to large-capacity peers if such peers are available, otherwise replication
Deletion of infrequently accessed replicas
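The size/capacity rules above can be read as a small decision table; the sketch below is my rendering of it, with the size thresholds and the peer-capacity split as assumptions:

```python
def place_data(size_mb: float, is_hot: bool, peer_free_mb: dict,
               large_data_mb: float = 100.0,
               large_peer_mb: float = 10_000.0) -> tuple:
    """Returns (action, chosen_peer); action is 'migrate' or 'replicate'.
    peer_free_mb maps a peer name to its free disk space in MB (assumed shape)."""
    large_peers = [p for p, free in peer_free_mb.items()
                   if free >= large_peer_mb and free >= size_mb]
    small_peers = [p for p, free in peer_free_mb.items()
                   if free < large_peer_mb and free >= size_mb]

    if not is_hot:
        # Push non-hot data to large-capacity peers: migrate if large, else replicate.
        target = large_peers[0] if large_peers else None
        return ("migrate" if size_mb >= large_data_mb else "replicate", target)

    if size_mb < large_data_mb:
        # Small-sized hot data can be replicated even at small-capacity peers.
        return ("replicate", small_peers[0] if small_peers else None)

    # Large-sized hot data: migrate to a large-capacity peer if one exists,
    # otherwise fall back to replication.
    return ("migrate", large_peers[0]) if large_peers else ("replicate", None)
```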

11 INTER-CLUSTER LOAD-BALANCING
Periodic exchange of load information between neighbours
Leader L considers itself overloaded if its load exceeds that of its neighbours by 10%.
 L determines its hot regions and informs its neighbours about the disk space requirement of each hot region.
 The number of hot regions depends upon the load imbalance.
Neighbours with enough disk space reply to L with their load status and disk space information.
These leaders are sorted in ascending order of load in List1.
L assigns hot regions to the members of List1 in a round-robin manner (see the sketch below).
 The hottest region is moved to the first member of List1, the second hottest region to the second member, and so on.
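The following sketch puts these steps together. I interpret "exceeds that of its neighbours by 10%" as exceeding the average neighbour load; that reading, and the data shapes, are assumptions:

```python
def balance(leader_load: float, neighbour_info: dict, hot_regions: list) -> dict:
    """One round of the inter-cluster step, as sketched from the slide above.
    neighbour_info maps a neighbour leader to (load, free_disk_mb);
    hot_regions is a list of (region, space_needed_mb), hottest first."""
    avg_load = sum(load for load, _ in neighbour_info.values()) / len(neighbour_info)
    if leader_load <= 1.10 * avg_load:
        return {}                       # not overloaded: nothing to move

    # Keep only neighbours that replied with enough free disk space
    # for at least the smallest hot region.
    min_space = min(space for _, space in hot_regions)
    candidates = [(load, peer) for peer, (load, free) in neighbour_info.items()
                  if free >= min_space]
    list1 = [peer for _, peer in sorted(candidates)]  # ascending order of load

    # Round-robin: hottest region to the least-loaded leader, and so on.
    assignment = {}
    for i, (region, _space) in enumerate(hot_regions):
        if list1:
            assignment[region] = list1[i % len(list1)]
    return assignment
```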

12 PERFORMANCE STUDY
16 SUN workstations, each with a 143 MHz Sun UltraSparc I processor and 256 MB RAM, running the Solaris 2.5.1 operating system
Connected by a relatively high-speed switch (200 Mbyte/s), the APnet
Each cluster is modeled by one workstation node; we simulated a transfer rate of 1 Mbit/second among the clusters.
We implemented an R-tree at each cluster to organize the data allocated to it.
A real dataset (Greece Roads); each cluster had more than 200,000 data rectangles.
A Zipf distribution was used to model workload skews (see the sketch below).
We investigated only migration in this study.
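For readers unfamiliar with Zipf-skewed workloads, this is one common way such a generator can look; it is my framing, not the paper's exact generator, and the convention that factor 0 means uniform is an assumption:

```python
import random

def zipf_query_targets(num_clusters: int, num_queries: int, theta: float) -> list:
    """Skew queries across clusters with a Zipf factor theta.
    theta = 0 gives a uniform workload; larger theta concentrates queries
    on a few clusters, creating the hotspots the scheme must dissipate."""
    weights = [1.0 / (rank ** theta) for rank in range(1, num_clusters + 1)]
    return random.choices(range(num_clusters), weights=weights, k=num_queries)

# Example: theta = 0.5 (one of the Zipf factors shown in the snapshots).
targets = zipf_query_targets(num_clusters=16, num_queries=10_000, theta=0.5)
```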

13 PERFORMANCE OF OUR PROPOSED SCHEME

14 SNAPSHOT OF LOAD-BALANCING FOR ZIPF FACTOR OF 0.1

15 VARIATIONS IN WORKLOAD SKEW

16 SNAPSHOT OF LOAD DISTRIBUTION FOR ZIPF FACTOR OF 0.5

17 SUMMARY
Huge amounts of available spatial data worldwide, coupled with the emergence of GRID technologies and powerful networks, motivate the design of a spatial GRID.
For performance reasons, effective load-balancing is necessary in such a spatial GRID.
We view a GRID as a set of clusters.
We proposed a dynamic inter-cluster load-balancing strategy via migration/replication in GRIDs.

18 FUTURE SCOPE OF WORK
FAIRNESS IN LOAD-BALANCING
GRANULARITY OF DATA MOVEMENT
DETAILED PERFORMANCE STUDY
 REPLICATION
 DIFFERENT WORKLOAD TYPES
 SCALABILITY
 INTEGRATION INTO EXISTING GRIDs

