Beowulf Software. Monitoring and Administration Beowulf Watch 


1 Beowulf Software

2 Monitoring and Administration

Beowulf Watch
  - http://www.kaybee.org/~kirk/html/linux.html
  - Tcl/Tk and rsh based
  - shows memory, users, and process info

EPCKPT (Checkpoint for Linux)
  - http://www.cos.ufrj.br/~edpin/epckpt
  - checkpointing kernel patch for Linux
  - saves a snapshot of a running process for later restart
  - useful for fault tolerance, process tracing/debugging, rollback transactions, and migration
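
The save-and-restart idea behind checkpointing can be illustrated at the application level. The sketch below is not how EPCKPT works (EPCKPT is a kernel patch that snapshots a whole process); it swaps in a much simpler technique, pickling an explicit state dictionary, and the function names `checkpoint`, `restart`, and `count_to` are hypothetical.

```python
import pickle


def checkpoint(state, path):
    """Write the program's explicit state to disk (application-level, not kernel-level)."""
    with open(path, "wb") as f:
        pickle.dump(state, f)


def restart(path):
    """Load a previously saved state so the computation can continue from it."""
    with open(path, "rb") as f:
        return pickle.load(f)


def count_to(limit, state=None):
    """Resumable loop: `state` carries everything needed to continue the sum."""
    state = state or {"i": 0, "total": 0}
    while state["i"] < limit:
        state["total"] += state["i"]
        state["i"] += 1
    return state
```

A run can stop at any point, call `checkpoint`, and a later run can pass the result of `restart` back into `count_to` to finish the work. The same pattern gives the rollback and fault-tolerance uses listed on the slide.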

3 Monitoring and Administration

lperfex
  - http://www.osc.edu/~troy/lperfex
  - performance monitoring and analysis tool
  - for Linux/IA32 systems
  - uses information from the P-pro/PIII status registers (???)

Compaq CMU
  - http://www.compaq.com/solutions/customsystems/hps/linux-cmu.html
  - Disk image cloning
      - can do network installation and disk partitioning
  - Console broadcasting
      - serial console connected to each compute node

4 Monitoring and Administration

SCMS
  - http://smile.cpe.ku.ac.th/software/scms/index.html
  - Parallel Unix commands
      - pls, pps, …
  - Display node status
      - CPU, memory, device info
  - Administration
      - shutdown, reboot, remote login, remote command execution

FAI (Fully Automatic Installation)
  - http://www.informatik.uni-koeln.de/fai
  - automatic installation across cluster PCs
  - for Debian Linux, no interaction needed
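
Parallel Unix commands such as pls and pps run an ordinary command on every node and gather the results. The following is a minimal sketch of that idea, not SCMS code: the helper names `remote_run` and `run_on_nodes`, the node names, and the choice of `ssh` as the remote shell are all assumptions made for illustration.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def remote_run(node, command):
    """Run `command` on `node` over ssh (assumed transport) and return its stdout."""
    result = subprocess.run(
        ["ssh", node, command],
        capture_output=True, text=True, timeout=30,
    )
    return result.stdout


def run_on_nodes(nodes, command, runner=remote_run):
    """Run the same command on every node concurrently; return {node: output}."""
    if not nodes:
        return {}
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        # pool.map preserves input order, so outputs line up with `nodes`.
        outputs = pool.map(lambda node: runner(node, command), nodes)
        return dict(zip(nodes, outputs))
```

For example, `run_on_nodes(["node1", "node2"], "ps aux")` would behave roughly like a parallel `ps`. The `runner` parameter exists so the fan-out logic can be exercised without a real cluster.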

5 Global Process Space

BPROC
  - http://beowulf.gsfc.nasa.gov/software/bproc.html
  - remote process start without remote login
  - Ghost processes, implemented with kernel threads
      - a ghost process on the master node corresponds to a real process running on a remote node
  - PID masquerading
      - a daemon controls operations related to masqueraded PIDs
  - Starting processes
      - rexec: similar to the execve syscall; nodes must be homogeneous
      - move or rfork: saves the process's memory region and recreates it on the remote node
      - can transport the binary and anything mmap'ed (e.g., DLLs)
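
The relationship between ghost processes and masqueraded PIDs can be pictured as a lookup table on the master node. The toy model below is only an illustration of that bookkeeping under stated assumptions; BPROC does this inside the kernel, and the class name `GhostTable`, its methods, and the starting PID are hypothetical.

```python
class GhostTable:
    """Toy model: master-side ghost PIDs mapped to (node, real PID) pairs."""

    def __init__(self):
        self._ghosts = {}
        self._next_pid = 1000  # arbitrary starting PID for this sketch

    def register(self, node, real_pid):
        """Create a ghost entry for a remote process; return its master-side PID."""
        ghost_pid = self._next_pid
        self._next_pid += 1
        self._ghosts[ghost_pid] = (node, real_pid)
        return ghost_pid

    def route_signal(self, ghost_pid, signal_name):
        """Translate a signal aimed at a ghost PID into a (node, real PID, signal) request."""
        node, real_pid = self._ghosts[ghost_pid]
        return (node, real_pid, signal_name)
```

A signal sent to the ghost's PID on the master is looked up and forwarded to the real process, which is why remote processes can be managed from the master without logging in to each node.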

6 Global Process Space

bexec (brexec)
  - ftp://ftp.parl.clemson.edu/pub/beowulf/bexec-1.1.2.tgz
  - uses a daemon to start tasks and deliver signals
  - user-level implementation

7 Load-balancing & Allocations

job manager
  - http://bond.imm.dtu.dk/jobd
  - load balancing and queue control of jobs
  - addresses the problems of batch-queue computing systems

Condor
  - http://www.cs.wisc.edu/project/condor
  - load balancing over a large number of systems owned by different people
  - process migration, node status monitoring, resource allocation
  - Condor + BPROC ??
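
As a rough illustration of load balancing across nodes, the sketch below places each job on whichever node currently has the least accumulated work. This greedy policy is an assumption chosen for simplicity; it is not the algorithm used by the job manager or by Condor, and the function name `assign_jobs` and the job cost values are hypothetical.

```python
import heapq


def assign_jobs(jobs, nodes):
    """Greedily give each (name, cost) job to the node with the least total load."""
    heap = [(0, node) for node in nodes]  # (current load, node name)
    heapq.heapify(heap)
    placement = {node: [] for node in nodes}
    for name, cost in jobs:
        load, node = heapq.heappop(heap)  # least-loaded node so far
        placement[node].append(name)
        heapq.heappush(heap, (load + cost, node))
    return placement
```

With one expensive job and several cheap ones, the cheap jobs pile up on the other node until the loads even out, which is the basic behavior a queueing system aims for.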

8 Cluster Networking

Channel Bonding
  - http://pdsf.nersc.gov/linux
  - allows multiple network devices to be used as one in order to improve bandwidth
  - low-level approach
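
One way to picture how several devices act as one is round-robin distribution of outgoing frames. The sketch below models only that policy in user space, assuming round-robin is the chosen scheme; real bonding happens in the kernel network driver, and the function name `bond_round_robin` and the device names are hypothetical.

```python
from itertools import cycle


def bond_round_robin(frames, devices):
    """Assign each frame to the next device in turn; return {device: [frames]}."""
    queues = {device: [] for device in devices}
    for frame, device in zip(frames, cycle(devices)):
        queues[device].append(frame)
    return queues
```

With two devices, each carries about half of the frames, so the aggregate bandwidth available to a single connection approaches the sum of the individual links.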

9 File Systems

GFS (Global File System)
  - http://www.globalfilesystem.org
  - multiple nodes can share storage over the network

SFS (Secure File System)
  - http://elbe.borg.umn.edu
  - stores files securely on remote sites using normal network protocols (FTP, HTTP, NFS, …)
  - uses smartcards for authentication and signatures

10 File Systems

PVFS (Parallel Virtual File System)
  - http://ece.clemson.edu/parl/pvfs
  - improves performance of coarse-grain parallel applications with large I/O requirements
  - operates at the user level
  - no kernel modifications needed
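
A parallel file system improves I/O throughput by spreading a file's data across several I/O nodes so that many nodes can serve one file at once. The sketch below shows the address arithmetic for simple round-robin striping with a fixed stripe size. Both the layout and the function `locate` with its parameters are assumptions for illustration and do not reflect PVFS's actual interface or defaults.

```python
def locate(offset, stripe_size, num_io_nodes):
    """Map a file byte offset to (I/O node index, byte offset in that node's local file)."""
    stripe_index = offset // stripe_size
    node = stripe_index % num_io_nodes
    # Full stripes already stored on this node, plus the position inside the current stripe.
    local_offset = (stripe_index // num_io_nodes) * stripe_size + offset % stripe_size
    return node, local_offset
```

Consecutive stripes land on different nodes, so a large sequential read from a coarse-grain parallel application is served by all I/O nodes in parallel.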

