CMU Access via http://houxplsf01 Launch Cluster Management Utility GUI.


1 CMU Access via http://houxplsf01 Launch Cluster Management Utility GUI

4 EnginFrame Access via:

5 Submitting Simulation Job

6 Monitoring Simulation Job

7 Cluster Info

8 My Jobs

9 All Jobs

10 LSF
Monitoring compute nodes:
bhosts
HOST_NAME STATUS JL/U MAX NJOBS RUN SSUSP USUSP RSV
houxpccn ok
houxpccn ok
*********************SNIP************************
houxpccn ok
Monitoring queues/jobs:
bjobs -a -u all
chowdhu DONE Eclipse houxplsf01 houxpccn11 *e_THP5020 Mar 22 17:38
chowdhu DONE Eclipse houxplsf01 houxpccn10 *PERM_HM_2 Mar 22 17:46
chowdhu DONE Eclipse houxplsf01 houxpccn10 *PERM_HM_2 Mar 22 17:52
pulligb RUN Eclipse houxplsf01 6*houxpccn1 PULLIG9 Mar 23 08:43
6*houxpccn12
6*houxpccn13
6*houxpccn14
6*houxpccn15
2*houxpccn02
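
On a cluster with many compute nodes, it can help to reduce the bhosts listing to the hosts that need attention. The following is a minimal sketch, assuming the column layout of the header row above (host name first, status second). The function name not_ok_hosts is hypothetical and not part of LSF.

```shell
#!/usr/bin/env bash
# Print the name and status of every host whose STATUS column is not "ok".
# Reads bhosts-style text on stdin; assumes line 1 is the header row.
not_ok_hosts() {
    awk 'NR > 1 && NF >= 2 && $2 != "ok" { print $1, $2 }'
}

# Usage on the cluster (requires the LSF commands on PATH):
#   bhosts | not_ok_hosts
```

An empty result means every listed host reported ok.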

11 LSF
To obtain more info for troubleshooting:
bjobs -l <jobID>
Job <1292>, Job Name <TIMEDEP_E300_SRV4_NEW_1_SIGMA_DUALPERM_HM_2>, User <chowdhurys>, Project <Houston-standard>, License Project <Houston-standard>, Status <DONE>, Queue <Eclipse>, Command <cd /data/NAM/L48/Eclipse/Subhadeep_BG_Dallas ; ./TIMEDEP_E300_SRV4_NEW_1_SIGMA_DUALPERM_HM_ >
Thu Mar 22 17:52:24: Submitted from host <houxplsf01>, CWD </data/NAM/L48/Eclipse/Subhadeep_BG_Dallas>, Requested Resources <select[type==any] rusage[eclipse=1:compositional=1]>;
Thu Mar 22 17:52:28: Started on <houxpccn10>, Execution Home </home/chowdhurys>, Execution CWD </data/NAM/L48/Eclipse/Subhadeep_BG_Dallas>;
Thu Mar 22 17:53:57: Done successfully. The CPU time used is 61.5 seconds.
SCHEDULING PARAMETERS:
r15s r1m r15m ut pg io ls it tmp swp mem
loadSched
loadStop
EXTERNAL MESSAGES:
MSG_ID FROM POST_TIME MESSAGE ATTACHMENT
chowdhurys Mar 22 17:52 EF_SPOOLER_URI Y
Datafile name
Location of datafile and relevant files for debugging

12 LSF
Jobs completed over 60 minutes ago:
bhist -a -u <username>
Killing jobs:
bkill <jobid>
-bash-3.2$ bkill 1293
Job <1293> is being terminated
Queue setup:
vi /apps/lsf/conf/lsbatch/Houston/configdir/lsb.queues
badmin reconfig

13 LSF InfiniBand Issues
If you suspect InfiniBand issues, ping another node on the ib0 device.
IP range is: x
If the other system doesn’t respond, check the OpenSM service on the head node. Restart it if necessary by running:
service opensmd restart
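
The ping-then-check sequence above can be wrapped in a small helper. This is a sketch under assumptions: the function name check_ib_peer and the PING_CMD override are hypothetical (the override exists only so the logic can be exercised without InfiniBand hardware), and the peer address has to be supplied by the operator because the IP range is not given here. The helper only reports; it does not restart opensmd itself.

```shell
#!/usr/bin/env bash
# Ping a peer over the ib0 device and report whether OpenSM needs a look.
# PING_CMD is a hypothetical override hook; by default the real ping is used.
check_ib_peer() {
    local peer="$1"
    local ping_cmd="${PING_CMD:-ping}"
    # -c 3: send three probes; -I ib0: send them from the ib0 interface.
    if "$ping_cmd" -c 3 -I ib0 "$peer" > /dev/null 2>&1; then
        echo "ib0 OK: $peer responded"
    else
        echo "ib0 FAIL: $peer did not respond; check opensmd on the head node"
        return 1
    fi
}

# Usage (replace <peer_ib_address> with a node address on the ib0 network):
#   check_ib_peer <peer_ib_address> || service opensmd restart
```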

14 ECLIPSE Launching Simulation from Command Line
eclrun -s houxplsf01 -q <QUEUENAME> <application> <DATASET>
Example:
eclrun -s houxplsf01 -q Eclipse eclipse BIG_RESERVOIR
Application will be: eclipse, e300, or frontsim
Modifying benchmarks to run on x cores:
Change the PARALLEL section in the .DATA file.
eclipse:
PARALLEL
32 'DISTRIBUTED' /
E300:
36 /
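
Because the application argument must be one of eclipse, e300, or frontsim, a small wrapper can catch a mistyped name before anything is submitted. A minimal sketch, assuming the argument order from the example above; the function name build_eclrun_cmd and the ECL_SERVER override are hypothetical. It only prints the command so it can be inspected before being run by hand.

```shell
#!/usr/bin/env bash
# Build (but do not run) an eclrun command line, rejecting unknown applications.
# Arguments: <queue> <application> <dataset>.
# ECL_SERVER is a hypothetical override; the server defaults to houxplsf01.
build_eclrun_cmd() {
    local queue="$1" app="$2" dataset="$3"
    local server="${ECL_SERVER:-houxplsf01}"
    case "$app" in
        eclipse|e300|frontsim) ;;
        *) echo "unknown application: $app" >&2; return 1 ;;
    esac
    echo "eclrun -s $server -q $queue $app $dataset"
}

# Usage:
#   build_eclrun_cmd Eclipse eclipse BIG_RESERVOIR
```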

15 ECLIPSE Troubleshooting
Check the following for errors:
<SIMULATION_NAME>.OUT
<SIMULATION_NAME>.PRT
<SIMULATION_NAME>.LOG
<SIMULATION_NAME>.ECLRUN
The simulation name is the datafile name without .DATA.
Run either a benchmark or the sample datafiles to rule out dataset issues (/apps/eclipse/benchmarks/).
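
Checking the four output files by hand gets tedious across many runs, so a quick first pass can be scripted. The sketch below assumes that problems show up as lines containing the word "error" (case-insensitive), which is an assumption about the file contents rather than a documented format. The function name scan_sim_outputs is hypothetical.

```shell
#!/usr/bin/env bash
# Count lines mentioning "error" in each output file of one simulation.
# Arguments: <directory> <simulation_name> (datafile name without .DATA).
# Matching on the word "error" is an assumption about the file contents.
scan_sim_outputs() {
    local dir="$1" name="$2" ext file
    for ext in OUT PRT LOG ECLRUN; do
        file="$dir/$name.$ext"
        if [ -f "$file" ]; then
            echo "$name.$ext: $(grep -ci 'error' "$file") matching line(s)"
        else
            echo "$name.$ext: missing"
        fi
    done
}

# Usage, from the directory that holds the datafile:
#   scan_sim_outputs . BIG_RESERVOIR
```

A file reported as missing is itself a useful clue, since it suggests the run stopped before that stage.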

16 ECLIPSE
Run the benchmarks for various configurations from the command line. Files will be run from the following structure:
ECLIPSE E100 Parallel: /apps/eclipse/benchmarks/parallel/data
ECLIPSE E100 Serial: /apps/eclipse/benchmarks/serial/e100
ECLIPSE E300 Serial: /apps/eclipse/benchmarks/serial/e300
ECLIPSE E300 Parallel: /apps/eclipse/benchmarks/2MMbenchmark/E300
Schlumberger sample datasets are also included in the file structure:
ECLIPSE Sample: /apps/eclipse/benchmarks/sample
The commands for running the benchmarks are below:
ECLIPSE E100 Parallel benchmarks: eclrun -s Houston -q eclipse -u <user_name> eclipse ONEM
ECLIPSE E100 Serial benchmarks: eclrun -s Houston -q eclipse -u <user_name> eclipse E100
ECLIPSE E300 Parallel benchmarks: eclrun -s Houston -q eclipse -u <user_name> e300 MMx
ECLIPSE E300 Serial benchmarks: eclrun -s Houston -q eclipse -u <user_name> e300 E300
If you submit a ticket with Schlumberger, they will ask for the *.OUT, *.DATA, *.LOG, and *.PRT files.
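
Since a Schlumberger ticket needs the .OUT, .DATA, .LOG, and .PRT files, they can be gathered into one archive ahead of time. A minimal sketch, assuming all four files share the simulation name and sit in one directory; the function name bundle_ticket_files and the _ticket.tar.gz archive name are hypothetical.

```shell
#!/usr/bin/env bash
# Collect the files requested for a support ticket into one gzip tarball.
# Arguments: <directory> <simulation_name>; the archive name is hypothetical.
bundle_ticket_files() {
    local dir="$1" name="$2" ext
    set --    # reuse the positional parameters as the list of files found
    for ext in OUT DATA LOG PRT; do
        if [ -f "$dir/$name.$ext" ]; then
            set -- "$@" "$name.$ext"
        else
            echo "warning: $name.$ext not found" >&2
        fi
    done
    [ "$#" -gt 0 ] || { echo "nothing to bundle" >&2; return 1; }
    # -C makes tar store bare file names instead of full paths.
    tar -czf "$dir/${name}_ticket.tar.gz" -C "$dir" "$@"
    echo "$dir/${name}_ticket.tar.gz"
}

# Usage:
#   bundle_ticket_files /path/to/run/directory <SIMULATION_NAME>
```

Missing files produce a warning rather than a failure, so a partial bundle can still be sent if a run died early.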

