Published by Lizbeth Thornton. Modified over 9 years ago.
1
Track 1: Cluster and Grid Computing
NBCR Summer Institute
Session 2.2: Cluster and Grid Computing: Case studies
Condor introduction
August 9, 2006
Nadya Williams
nadya@sdsc.edu
2
© 2005 UC Regents
Condor
- developed at the University of Wisconsin; see http://www.cs.wisc.edu/condor
- a system that creates a High-Throughput Computing (HTC) environment
- a specialized workload management system for compute-intensive jobs
3
Condor is:
- a software system that runs on a cluster of workstations to harness wasted CPU cycles
- a Condor pool consists of any number of machines
  - possibly different architectures
  - possibly different operating systems
  - connected by a network
4
Unique features
- Transparent process checkpoint and migration
  - only migrates processes between machines of the same architecture
  - only migrates processes within its own server pool
- Remote system calls
  - system calls are executed on the submit machine
- ClassAds
- Use of idle resources
5
When to use Condor?
- parameter studies
- embarrassingly parallel problems
- high-throughput computing where subjobs do not need to communicate
- long computations
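A parameter study maps naturally onto independent jobs. The following is a minimal Python sketch, not taken from the slides, that builds the text of a submit description with one queued job per parameter pair. The executable name `simulate`, the `--alpha`/`--beta` arguments, and the output file names are hypothetical.

```python
from itertools import product


def sweep_submit_description(executable, alphas, betas, log="sweep.log"):
    """Return submit-description text with one queued job per (alpha, beta) pair."""
    # Settings shared by every job in the sweep.
    lines = [
        "universe = vanilla",
        f"executable = {executable}",
        f"log = {log}",
    ]
    # Each parameter combination is an independent job: no communication needed.
    for job_id, (alpha, beta) in enumerate(product(alphas, betas)):
        lines += [
            "",
            f"arguments = --alpha {alpha} --beta {beta}",  # hypothetical flags
            f"output = sweep.{job_id}.out",
            f"error = sweep.{job_id}.err",
            "queue",
        ]
    return "\n".join(lines) + "\n"


# Hypothetical sweep: 2 alpha values x 3 beta values = 6 independent jobs.
text = sweep_submit_description("simulate", [0.1, 0.2], [1, 2, 3])
```

Writing `text` to a file and passing it to condor_submit would queue the six jobs, assuming the pool accepts vanilla-universe jobs.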
6
Condor Pool on Rocks
- one Condor pool per Rocks cluster
- frontend: central manager, submit
- compute node: submit, execute
7
Condor daemons
frontend
- condor_master: manages other daemons
- condor_collector: collects info about computers and jobs
- condor_negotiator: decides what/where to run
- condor_schedd: allows job submission
- condor_shadow: watches the running job (only when jobs are active)
compute node
- condor_master: manages other daemons
- condor_startd: allows jobs to be started
- condor_schedd: allows job submission
8
Basic commands
- condor_q: shows the job queue
- condor_submit: submit a job
- condor_rm: remove jobs from the queue
- condor_compile: link with Condor libraries
- condor_config_val: query configuration values
- condor_status: shows pool status
9
Roadmap to run Condor jobs
- code preparation
  - the job must run as a background batch job (no user I/O)
  - if input is required, create files with the needed input/keystrokes
  - if possible, relink with the Condor libraries
- create submit description files
- submit jobs
- monitor jobs
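The preparation steps above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the presenter's procedure: the program name `my_program`, the keystrokes, and the file names `job.in` and `job.sub` are all hypothetical. It stores the would-be keystrokes in a file and points the job's standard input at it with the `input` submit command.

```python
from pathlib import Path
import tempfile


def prepare_batch_job(workdir, executable, keystrokes):
    """Write a canned-input file and a submit description; return the submit file path."""
    workdir = Path(workdir)

    # Replace interactive input with a file holding one keystroke line per entry.
    input_file = workdir / "job.in"
    input_file.write_text("\n".join(keystrokes) + "\n")

    # Submit description: stdin comes from job.in, stdout/stderr go to files.
    submit_file = workdir / "job.sub"
    submit_file.write_text(
        "universe = vanilla\n"
        f"executable = {executable}\n"
        f"input = {input_file.name}\n"
        "output = job.out\n"
        "error = job.err\n"
        "log = job.log\n"
        "queue\n"
    )
    return submit_file


with tempfile.TemporaryDirectory() as tmp:
    submit_path = prepare_batch_job(tmp, "my_program", ["y", "42", "quit"])
    submit_text = submit_path.read_text()
    input_text = (Path(tmp) / "job.in").read_text()
```

The remaining roadmap steps would then be run from the shell: condor_submit with the generated file to submit, and condor_q to monitor.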
10
Condor universes
Universe: the run-time environment
- Standard
  - handles system calls by returning them to the submit machine
  - provides mechanisms to checkpoint and migrate a partially completed job
  - must relink with the Condor libraries
- Vanilla
  - no checkpoint or migration
  - no relinking (works with 3rd-party binaries)
  - input/output files reside on a shared file system or use the Condor transfer mechanism
- PVM
- MPI (deprecated)
- Java
- Parallel
- Globus
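The difference between the two main universes shows up in the submit description. The sketch below is an assumption-laden illustration rather than material from the slides: the helper `universe_lines` is hypothetical, and it assumes a Condor version that supports the `should_transfer_files`, `when_to_transfer_output`, and `transfer_input_files` submit commands. A vanilla job without a shared file system asks Condor to transfer its files, while a standard job needs no transfer settings because its system calls are returned to the submit machine.

```python
def universe_lines(universe, shared_filesystem=True, input_files=()):
    """Return the universe-specific lines of a submit description (hypothetical helper)."""
    universe = universe.lower()
    if universe not in ("standard", "vanilla"):
        raise ValueError(f"this sketch only covers standard and vanilla, not {universe!r}")

    lines = [f"universe = {universe}"]

    # Standard universe: the executable must already be relinked with
    # condor_compile; file access goes through remote system calls.
    # Vanilla universe without a shared file system: use Condor's file transfer.
    if universe == "vanilla" and not shared_filesystem:
        lines.append("should_transfer_files = YES")
        lines.append("when_to_transfer_output = ON_EXIT")
        if input_files:
            lines.append("transfer_input_files = " + ", ".join(input_files))
    return lines


standard = universe_lines("standard")
vanilla_shared = universe_lines("vanilla", shared_filesystem=True)
vanilla_transfer = universe_lines(
    "vanilla", shared_filesystem=False, input_files=["params.txt", "mesh.dat"]
)
```

The file names `params.txt` and `mesh.dat` are placeholders for whatever inputs a real job reads.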
11
DAG jobs
- complex sequence of jobs
[Figure: job graph with nodes A, B1, B2, B3, and C]
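A job graph like this is usually described to Condor's DAGMan in a small input file of `JOB` and `PARENT ... CHILD` lines. The Python sketch below generates such a file. It assumes a fan-out/fan-in shape (A first, then B1, B2, and B3, then C), which the slide text does not confirm, and the per-node submit file names are hypothetical.

```python
def dag_description(jobs, edges):
    """Build DAGMan input text from {node: submit_file} and (parents, children) pairs."""
    # One JOB line per node, naming the submit description that runs it.
    lines = [f"JOB {name} {submit_file}" for name, submit_file in jobs.items()]
    # One PARENT/CHILD line per dependency group.
    for parents, children in edges:
        lines.append(f"PARENT {' '.join(parents)} CHILD {' '.join(children)}")
    return "\n".join(lines) + "\n"


# Hypothetical submit files: a.sub, b1.sub, b2.sub, b3.sub, c.sub.
jobs = {name: f"{name.lower()}.sub" for name in ["A", "B1", "B2", "B3", "C"]}

# Assumed dependencies: A before every B job, every B job before C.
edges = [
    (["A"], ["B1", "B2", "B3"]),
    (["B1", "B2", "B3"], ["C"]),
]

dag_text = dag_description(jobs, edges)
```

Saved to a file, this text would typically be handed to condor_submit_dag, which then submits each node's job once its parents have finished.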
12
More info on jobs
- Condor documentation: http://www.cs.wisc.edu/condor/manual/v6.7.12
- For security reasons, do not run jobs
  - as root
  - as any user with GID=0 (wheel)
- Length limits in submit files:
  - path names < 256
  - command line args < 4096
- If there is an error, check the log files
  - specified by the user for a job
  - specified by the admin in config files
- To find the configuration files: condor_config_val -config
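The length limits above can be checked before submitting. The following Python sketch is a hypothetical pre-submit check, not a Condor tool. It assumes the limits count characters and that a value must be strictly below the limit, as the "<" on the slide suggests.

```python
MAX_PATH_LENGTH = 256   # path names must be shorter than this (from the slide)
MAX_ARGS_LENGTH = 4096  # command line args must be shorter than this (from the slide)


def check_submit_limits(paths, arguments):
    """Return a list of human-readable problems; an empty list means no limit is exceeded."""
    problems = []
    for path in paths:
        # Assumption: the limit counts characters, not bytes.
        if len(path) >= MAX_PATH_LENGTH:
            problems.append(f"path too long ({len(path)} characters): {path[:40]}...")
    if len(arguments) >= MAX_ARGS_LENGTH:
        problems.append(f"arguments too long ({len(arguments)} characters)")
    return problems


# A short path and short arguments pass; values at the limit are reported.
ok = check_submit_limits(["/home/user/run.sh"], "--n 10")
too_long = check_submit_limits(["x" * 256], "a" * 4096)
```

Running a check like this before condor_submit keeps an over-long path or argument string from turning into an error that must be traced through the log files.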