Chameleon: A Resource Scheduler in a Data Grid Environment
Sang Min Park, Jai-Hoon Kim
Ajou University, South Korea

Contents
- Introduction to Data Grid
- Related Works
- Scheduling Model
- Scheduler Implementation
- Testbed and Application
- Results
- Conclusions

Introduction to Data Grid
Data Grid motivations
- Petabyte-scale data production
- Distributed data storage, with each site storing part of the data
- Distributed computing resources that process the data
Two most important approaches for Data Grid
- Secure, reliable, and efficient data transport protocol (e.g., GridFTP)
- Replication (e.g., replica catalog)
Replication
- Large files are partially replicated among sites
- Replication reduces data access time
- Application scheduling and dynamic replication issues are emerging

Related Works
Data Grid
- Replica catalog: mapping from logical file name to physical instance
- GridFTP: secure, reliable, and efficient file transfer protocol
Job scheduling
- Various scheduling algorithms exist for the computational Grid
- Application Level Scheduling (AppLeS)
- Large data collections have not been considered
Job scheduling in Data Grid
- Only rough analytical and simulation studies have been presented
- Our work defines a more in-depth scheduling model

Scheduling Model - Assumptions
- Each site has both data storage and computing facilities
- Files are replicated at a subset of the Grid sites
- Each site has a different amount of computational capability
- Grid users request job execution through job schedulers

Scheduling Model - System Factors
Dynamic system factors (factors that change over time)
- Network bandwidth
  - Data transfer time is inversely proportional to network bandwidth
  - NWS: a tool for measuring and forecasting network bandwidth
- Available computing nodes
  - Determine the execution time of jobs
  - Decided according to the job load on a site
System attributes
- Machine architecture (clusters, MPPs, etc.)
- Processor speed, available memory, I/O performance, etc.
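The relationship between bandwidth and transfer time can be illustrated with a minimal sketch. It assumes transfer time is simply data size divided by forecast bandwidth, and it uses a plain mean of recent measurements as a stand-in forecaster; NWS applies its own forecasting methods, which are not reproduced here. The function names and the sample bandwidth values are hypothetical.

```python
from statistics import fmean


def forecast_bandwidth(samples_mb_per_s):
    """Naive stand-in forecast: the mean of recent bandwidth measurements."""
    if not samples_mb_per_s:
        raise ValueError("need at least one bandwidth measurement")
    return fmean(samples_mb_per_s)


def transfer_time_s(size_mb, bandwidth_mb_per_s):
    """Estimated transfer time in seconds: size divided by bandwidth."""
    if bandwidth_mb_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return size_mb / bandwidth_mb_per_s


recent = [8.0, 10.0, 12.0]  # hypothetical measurements in MB/s
bandwidth = forecast_bandwidth(recent)

# 502 MB is the replica size reported later in these slides.
print(transfer_time_s(502, bandwidth))
# Halving the bandwidth doubles the estimated transfer time.
print(transfer_time_s(502, bandwidth / 2))
```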

Scheduling Model - System Factors
Application-specific factors (factors unique to Data Grid applications)
- Size of input data (replica)
  - If the data is not at the computing site, it must be fetched
  - Transferring large data consumes much time
- Size of application code
  - Application code must be migrated to the sites that perform the computation
  - Not critical to overall performance (small size)
- Size of produced output data
  - When the job runs at a remote site, the result data must be returned to the local site
  - Strongly related to the size of the input data

Scheduling Model - Application Scenarios
The model consists of 5 distinct application scenarios:
1. Local Data and Local Execution
2. Local Data and Remote Execution
3. Remote Data and Local Execution
4. Remote Data and Same Remote Execution
5. Remote Data and Different Remote Execution

Scheduling Model - Application Scenarios
Terms used in the scenarios:
- Number of available computing nodes at the site
- Size of input data (replica)
- Size of application code
- Size of produced output data
- Bandwidth of the WAN connection between sites
- Bandwidth of the LAN connection between nodes
- Expected execution time of jobs

Scheduling Model - Application Scenarios
1. Local Data and Local Execution
The input data (replica) is located at the local site, and processing is performed on the locally available processors.
Data moved:
- Input data (replica)
- Application code
- Output data
Cost consists of:
1. Data transfer time between the master and the computing nodes via LAN
2. Job execution time using local processors

Scheduling Model - Application Scenarios
2. Local Data and Remote Execution
The locally held replica is transferred to the remote computation site.
Cost consists of:
1. Data (input + code + output) movement time via WAN between the local and remote sites
2. Data movement time via LAN within the remote site
3. Job execution time at the remote site

Scheduling Model - Application Scenarios
3. Remote Data and Local Execution
The remote replica is copied to the local site, and processing is performed locally.
Cost consists of:
1. Input data movement time via WAN between the local and remote sites
2. Data movement time via LAN within the local site
3. Job execution time on local processors

Scheduling Model - Application Scenarios
4. Remote Data and Same Remote Execution
The remote site that holds the replica performs the computation.
Cost consists of:
1. Data (code + output) movement time via WAN between the local and remote sites
2. Data movement time via LAN within the remote site
3. Job execution time at the remote site

Scheduling Model - Application Scenarios
5. Remote Data and Different Remote Execution
Remote site j performs the computation with a replica copied from remote site i.
Cost consists of:
1. Input replica movement time via WAN between remote sites i and j
2. Data (code + output) movement time via WAN between the local site and remote site j
3. Data movement time via LAN within remote site j
4. Job execution time at remote site j
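The cost components of the five scenarios can be sketched as functions. The transcript does not preserve the original cost formulas, so this is only an approximation under stated assumptions: every movement time is size divided by bandwidth, the LAN step at the executing site moves input, code, and output together (as listed for scenario 1), and the expected execution time at the executing site is supplied by the caller. All names and the sample job are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    input_mb: float   # size of input data (replica)
    code_mb: float    # size of application code
    output_mb: float  # size of produced output data

    @property
    def all_mb(self):
        return self.input_mb + self.code_mb + self.output_mb

    @property
    def code_output_mb(self):
        return self.code_mb + self.output_mb


def move_s(size_mb, bw_mb_s):
    """Assumed movement time: size divided by bandwidth."""
    return size_mb / bw_mb_s


def scenario1(job, lan, exec_s):
    # Local data, local execution: LAN movement + execution.
    return move_s(job.all_mb, lan) + exec_s


def scenario2(job, wan, lan, exec_s):
    # Local data, remote execution: input+code+output over the WAN.
    return move_s(job.all_mb, wan) + move_s(job.all_mb, lan) + exec_s


def scenario3(job, wan, lan, exec_s):
    # Remote data, local execution: only the input crosses the WAN.
    return move_s(job.input_mb, wan) + move_s(job.all_mb, lan) + exec_s


def scenario4(job, wan, lan, exec_s):
    # Remote data, same remote execution: only code+output cross the WAN.
    return move_s(job.code_output_mb, wan) + move_s(job.all_mb, lan) + exec_s


def scenario5(job, wan_ij, wan_local_j, lan, exec_s):
    # Remote data at site i, execution at site j.
    return (move_s(job.input_mb, wan_ij)
            + move_s(job.code_output_mb, wan_local_j)
            + move_s(job.all_mb, lan)
            + exec_s)


job = Job(input_mb=500, code_mb=10, output_mb=90)  # hypothetical job
print(scenario1(job, lan=100, exec_s=100))
print(scenario4(job, wan=10, lan=100, exec_s=100))
```

In practice the execution-time argument would differ per site, since it depends on the number of available nodes there; passing it in keeps the sketch independent of any particular performance model.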

Scheduling Model - Scheduler
Operations of the scheduler:
1. Predict the response time of each scenario
2. Compare the response times of the scenarios
3. Choose the best scenario, the site holding the data, and the site to perform job execution
4. Request data movement and job execution
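The four operations can be sketched as a small selection routine. This is an illustration rather than Chameleon's actual code: predicted response times (step 1) are taken as given inputs, and step 4 is represented by a list of textual requests instead of real Grid service calls. The `Plan` type, the function names, the site names, and the predicted times are all hypothetical.

```python
from typing import NamedTuple


class Plan(NamedTuple):
    scenario: int       # scenario number, 1-5
    data_site: str      # site holding the replica to use
    exec_site: str      # site that runs the job
    predicted_s: float  # predicted response time (step 1), in seconds


def choose_plan(plans):
    """Steps 2-3: compare predictions and return the fastest plan."""
    if not plans:
        raise ValueError("no candidate plans")
    return min(plans, key=lambda p: p.predicted_s)


def requests_for(plan, local_site):
    """Step 4: describe the data movement and execution requests."""
    steps = []
    if plan.data_site != plan.exec_site:
        steps.append(f"move replica {plan.data_site} -> {plan.exec_site}")
    if plan.exec_site != local_site:
        steps.append(f"send code {local_site} -> {plan.exec_site}")
    steps.append(f"run job at {plan.exec_site}")
    if plan.exec_site != local_site:
        steps.append(f"return output {plan.exec_site} -> {local_site}")
    return steps


# Hypothetical predictions for three candidate plans.
candidates = [
    Plan(1, "local", "local", 410.0),
    Plan(4, "siteE", "siteE", 185.0),
    Plan(5, "siteE", "siteD", 260.0),
]
best = choose_plan(candidates)
print(best.scenario, requests_for(best, "local"))
```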

Scheduler Implementation
- A scheduler prototype, called Chameleon, was developed to evaluate the scheduling model
- Built on top of services provided by Globus
  - GRAM
  - MDS
  - GridFTP
  - Replica Catalog
- NWS is used for measuring and forecasting network bandwidth
- The scheduling algorithms are based on the scheduling model presented

Testbed for Experiments

Site             Location   Number of proc.   Local scheduler
Ajou University  S. Korea   8                 PBS
Yonsei Univ. 1   S. Korea   12                PBS
Yonsei Univ. 2   S. Korea   12                PBS
KISTI            S. Korea   36                LSF
KUT              S. Korea   6                 PBS
Chonbuk Univ.    S. Korea   1                 Fork
Pusan Univ.      S. Korea   24                PBS
POSTECH          S. Korea   8                 PBS
AIST             Japan      10                SGE

Applications
Gene sequence comparison applications (bioinformatics)
- Computationally intensive analysis of a large protein database
- Bio-scientists predict the structure and functions of a newly found protein by comparing it with a well-known protein database
- The size of the database exceeds 500 MB
- There are various versions of the protein database
- Large databases are replicated in the Data Grid
Two well-known applications, BLAST and FASTA, are executed

Applications - Parameters

Parameters                                 PSI-BLAST   FASTA
Size of input replica (protein database)   502 MB (both applications)
Size of output data                        10 MB       200 MB
Size of application code                   7 MB        1 MB
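To see how these parameters interact with the five scenarios, the sketch below tallies the data that would cross the WAN for one job of each application. The sizes come from the table above; the mapping from scenario to WAN traffic follows the cost lists in the scenario slides. The dictionary and function names are illustrative, and in scenario 5 the total is spread over two different WAN links (site i to site j, and local to site j).

```python
# Sizes in MB, taken from the parameter table.
APPS = {
    "PSI-BLAST": {"input": 502, "code": 7, "output": 10},
    "FASTA": {"input": 502, "code": 1, "output": 200},
}


def wan_volume_mb(app, scenario):
    """Total MB crossing WAN links for one job under scenarios 1-5."""
    sizes = APPS[app]
    everything = sizes["input"] + sizes["code"] + sizes["output"]
    code_and_output = sizes["code"] + sizes["output"]
    volumes = {
        1: 0,                # local data, local execution
        2: everything,       # input + code + output to and from the remote site
        3: sizes["input"],   # replica fetched to the local site
        4: code_and_output,  # replica is already at the executing site
        5: everything,       # replica i -> j, code/output local <-> j
    }
    return volumes[scenario]


for app in APPS:
    print(app, [wan_volume_mb(app, n) for n in range(1, 6)])
```

Under these assumptions, scenario 4 moves 17 MB over the WAN for PSI-BLAST but 201 MB for FASTA, so FASTA's large output makes remote execution comparatively more expensive.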

Experimental Results (1)
[Figure: replication scenario]
[Figure: results when executing PSI-BLAST]

Experimental Results (2)
[Figure: results when executing FASTA in the replication scenario of the previous slide]

Experimental Results (3)
No replication takes place
[Figure: results when executing PSI-BLAST]

Experimental Results (4)

Number of replicas   Sites with replica
1                    Local
2                    Local, E
3                    Local, E, D
4                    Local, E, D, F
5                    Local, E, D, F, G
6                    Local, E, D, F, G, H
7                    Local, E, D, F, G, H, B
8                    Local, E, D, F, G, H, B, A
9                    Local, E, D, F, G, H, B, A, C

Increasing the number of replicas decreases the response time

Conclusions
- Job scheduling models for the Data Grid were presented
- The models consist of 5 distinct scenarios
- A scheduler prototype, called Chameleon, was developed based on the presented scheduling models
- Experiments were performed with Chameleon on a constructed Grid testbed
- Better performance is achieved by considering data locations as well as computational capabilities