Published by Jean-François Léger. Modified over 6 years ago.
1
CMU Access via Launch Cluster Management Utility GUI
4
EnginFrame Access via:
5
Submitting Simulation Job
6
Monitoring Simulation Job
7
Cluster Info
8
My Jobs
9
All Jobs
10
LSF
Monitoring compute nodes:
  bhosts
  HOST_NAME STATUS JL/U MAX NJOBS RUN SSUSP USUSP RSV
  houxpccn ok
  houxpccn ok
  *********************SNIP************************
  houxpccn ok
Monitoring queues/jobs:
  bjobs -a -u all
  chowdhu DONE Eclipse houxplsf01 houxpccn11 *e_THP5020 Mar 22 17:38
  chowdhu DONE Eclipse houxplsf01 houxpccn10 *PERM_HM_2 Mar 22 17:46
  chowdhu DONE Eclipse houxplsf01 houxpccn10 *PERM_HM_2 Mar 22 17:52
  pulligb RUN Eclipse houxplsf01 6*houxpccn1 PULLIG9 Mar 23 08:43
      6*houxpccn12
      6*houxpccn13
      6*houxpccn14
      6*houxpccn15
      2*houxpccn02
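On a large cluster it can help to filter the bhosts listing down to the nodes that need attention. The following is a minimal sketch, not part of the original slides: hosts_not_ok is a hypothetical helper name, and it assumes the first line of input is the header row and that STATUS is the second column, as in the listing above.

```shell
# Hypothetical helper: read `bhosts` output on stdin and print every host
# whose STATUS column is anything other than "ok".
# Assumes line 1 is the header and STATUS is column 2.
hosts_not_ok() {
  awk 'NR > 1 && $2 != "ok" { print $1, $2 }'
}
```

Typical use would be `bhosts | hosts_not_ok`; no output means every listed host reported ok.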
11
LSF
To obtain more info for troubleshooting:
  bjobs -l <jobID>

  Job <1292>, Job Name <TIMEDEP_E300_SRV4_NEW_1_SIGMA_DUALPERM_HM_2>, User <chowdhurys>, Project <Houston-standard>, License Project <Houston-standard>, Status <DONE>, Queue <Eclipse>, Command <cd /data/NAM/L48/Eclipse/Subhadeep_BG_Dallas ; ./TIMEDEP_E300_SRV4_NEW_1_SIGMA_DUALPERM_HM_ >
  Thu Mar 22 17:52:24: Submitted from host <houxplsf01>, CWD </data/NAM/L48/Eclipse/Subhadeep_BG_Dallas>, Requested Resources <select[type==any] rusage[eclipse=1:compositional=1]>;
  Thu Mar 22 17:52:28: Started on <houxpccn10>, Execution Home </home/chowdhurys>, Execution CWD </data/NAM/L48/Eclipse/Subhadeep_BG_Dallas>;
  Thu Mar 22 17:53:57: Done successfully. The CPU time used is 61.5 seconds.

  SCHEDULING PARAMETERS:
            r15s r1m r15m ut pg io ls it tmp swp mem
  loadSched
  loadStop

  EXTERNAL MESSAGES:
  MSG_ID FROM POST_TIME MESSAGE ATTACHMENT
  chowdhurys Mar 22 17:52 EF_SPOOLER_URI Y

  Datafile name
  Location of datafile and relevant files for debugging
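Because bjobs -l wraps long records across lines, pulling out the directory that holds the datafile takes a little text handling. The sketch below is an illustration under assumptions: job_exec_cwd is a hypothetical helper name, and it assumes wrapped continuation lines start with whitespace while new records start in column one.

```shell
# Hypothetical helper: read `bjobs -l` output on stdin and print the
# execution working directory, where the datafile and run files live.
# Assumes continuation lines of a wrapped record begin with whitespace.
job_exec_cwd() {
  awk '
    # Continuation line: strip the indent and glue it onto the current record.
    /^[[:space:]]+/ { sub(/^[[:space:]]+/, ""); buf = buf $0; next }
    # New record: flush the previous one, then start collecting again.
    { if (buf != "") print buf; buf = $0 }
    END { if (buf != "") print buf }
  ' | sed -n 's/.*Execution CWD <\([^>]*\)>.*/\1/p'
}
```

Typical use would be `bjobs -l <jobID> | job_exec_cwd`, followed by a cd into the printed directory to inspect the run files.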
12
LSF
Jobs completed over 60 minutes ago:
  bhist -a -u <username>
Killing jobs:
  bkill <jobid>
  -bash-3.2$ bkill 1293
  Job <1293> is being terminated
Queue setup:
  vi /apps/lsf/conf/lsbatch/Houston/configdir/lsb.queues
  badmin reconfig
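Since a bad edit to lsb.queues affects every queue, it may be worth keeping a timestamped copy before opening the file in vi. This is a minimal sketch rather than a documented site procedure; backup_queue_config and the .bak.<timestamp> suffix are hypothetical choices.

```shell
# Hypothetical helper: copy a config file to <file>.bak.<timestamp>
# and print the backup path so it can be restored if the edit goes wrong.
backup_queue_config() {
  conf=$1
  backup="$conf.bak.$(date +%Y%m%d%H%M%S)"
  cp -p "$conf" "$backup" && echo "$backup"
}
```

Typical use would be `backup_queue_config /apps/lsf/conf/lsbatch/Houston/configdir/lsb.queues` before editing. Running `badmin ckconfig` after the edit checks the configuration for errors before `badmin reconfig` applies it.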
13
LSF
InfiniBand Issues
If you suspect InfiniBand issues, ping another node on the ib0 device. IP range is: x
If the other system doesn't respond, check the OpenSM service on the head node. Restart if necessary by running:
  service opensmd restart
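The ping check can be wrapped in a small function so the result and the next step are reported together. This is a sketch under assumptions: check_ib and the PING_CMD override are hypothetical, the peer address must come from the site's ib0 range (not given in the slide), and `-I ib0` relies on the Linux ping option for selecting the outgoing interface.

```shell
# Hypothetical helper: ping a peer over the ib0 interface and report the result.
# PING_CMD can be overridden to dry-run the logic without a real network.
check_ib() {
  peer=$1
  ping_cmd=${PING_CMD:-ping}
  if "$ping_cmd" -c 3 -I ib0 "$peer" >/dev/null 2>&1; then
    echo "ib0: $peer responded"
  else
    echo "ib0: no response from $peer; check opensmd on the head node"
    return 1
  fi
}
```

Typical use would be `check_ib <peer_ib0_address>`; a non-zero return is the cue to look at the OpenSM service as described above.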
14
ECLIPSE
Launching Simulation from Command Line
  eclrun -s houxplsf01 -q <QUEUENAME> <application> <DATASET>
Example:
  eclrun -s houxplsf01 -q Eclipse eclipse BIG_RESERVOIR
Application will be: eclipse, e300, or frontsim
Modifying benchmarks to run on x cores:
Change the PARALLEL section in the .DATA file
  eclipse: PARALLEL 32 'DISTRIBUTED' /
  E300: 36 /
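To catch a mistyped application name before anything reaches the queue, the command line can be assembled and checked first. The sketch below only prints the command; eclrun_cmd is a hypothetical helper, and the server name houxplsf01 is taken from the example above.

```shell
# Hypothetical helper: build the eclrun command line for a queue, application
# and dataset, rejecting applications other than eclipse, e300 or frontsim.
# It prints the command instead of running it, so it is safe as a dry run.
eclrun_cmd() {
  queue=$1 app=$2 dataset=$3
  case $app in
    eclipse|e300|frontsim) ;;
    *) echo "unknown application: $app" >&2; return 1 ;;
  esac
  echo "eclrun -s houxplsf01 -q $queue $app $dataset"
}
```

Typical use would be `eclrun_cmd Eclipse eclipse BIG_RESERVOIR`, then running the printed line once it looks right.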
15
ECLIPSE
Troubleshooting
Check the following for errors:
  <SIMULATION_NAME>.OUT
  <SIMULATION_NAME>.PRT
  <SIMULATION_NAME>.LOG
  <SIMULATION_NAME>.ECLRUN
The simulation name is the datafile name without .DATA.
Run either a benchmark or sample datafiles to rule out dataset issues (/apps/eclipse/benchmarks/).
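Checking the four output files can be done in one pass. This is a minimal sketch: scan_sim_errors is a hypothetical helper, and the case-insensitive search for the word "error" is an assumed pattern that may need tightening for real ECLIPSE output.

```shell
# Hypothetical helper: given a datafile or simulation name, search the
# .OUT, .PRT, .LOG and .ECLRUN files in the current directory for "error".
scan_sim_errors() {
  sim=${1%.DATA}   # simulation name = datafile name without .DATA
  for ext in OUT PRT LOG ECLRUN; do
    f="$sim.$ext"
    if [ -f "$f" ]; then
      # Print matching lines as file:line:text; no match is not a failure.
      grep -i -n -H 'error' "$f" || true
    else
      echo "$f: not found"
    fi
  done
}
```

Typical use would be `scan_sim_errors BIG_RESERVOIR.DATA` from the directory that holds the run files.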
16
ECLIPSE
Run the benchmarks for various configurations from the command line. Files will be run from the following structure:
  ECLIPSE E100 Parallel: /apps/eclipse/benchmarks/parallel/data
  ECLIPSE E100 Serial: /apps/eclipse/benchmarks/serial/e100
  ECLIPSE E300 Serial: /apps/eclipse/benchmarks/serial/e300
  ECLIPSE E300 Parallel: /apps/eclipse/benchmarks/2MMbenchmark/E300
Schlumberger sample datasets are also included in the file structure:
  ECLIPSE Sample: /apps/eclipse/benchmarks/sample
The commands for running the benchmarks are below:
  ECLIPSE E100 Parallel benchmarks: eclrun -s Houston -q eclipse -u <user_name> eclipse ONEM
  ECLIPSE E100 Serial benchmarks: eclrun -s Houston -q eclipse -u <user_name> eclipse E100
  ECLIPSE E300 Parallel benchmarks: eclrun -s Houston -q eclipse -u <user_name> e300 MMx
  ECLIPSE E300 Serial benchmarks: eclrun -s Houston -q eclipse -u <user_name> e300 E300
If you submit a ticket with Schlumberger, they will ask for the *.OUT, *.DATA, *.LOG and *.PRT files.
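Gathering the files Schlumberger asks for can be scripted so nothing is left out of a ticket. The sketch below is illustrative only: bundle_ticket_files and the -support.tar.gz archive name are hypothetical, and it looks for the files in the current directory.

```shell
# Hypothetical helper: pack the .OUT, .DATA, .LOG and .PRT files for one
# simulation into <name>-support.tar.gz and print the archive name.
bundle_ticket_files() {
  sim=${1%.DATA}
  archive="$sim-support.tar.gz"
  set --   # reuse the positional parameters as the list of files to pack
  for ext in OUT DATA LOG PRT; do
    if [ -f "$sim.$ext" ]; then
      set -- "$@" "$sim.$ext"
    fi
  done
  if [ $# -eq 0 ]; then
    echo "no files found for $sim" >&2
    return 1
  fi
  tar -czf "$archive" "$@" && echo "$archive"
}
```

Typical use would be `bundle_ticket_files BIG_RESERVOIR` from the run directory; files that do not exist are skipped rather than treated as an error.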