Published by Sterling Beel. Modified over 9 years ago.
1
www.ci.anl.gov www.ci.uchicago.edu
Matlab on the Cray XE6 Beagle
Beagle Team (beagle-support@ci.uchicago.edu)
Computation Institute, University of Chicago & Argonne National Laboratory
2
Matlab on Beagle – beagle-support@ci.uchicago.edu
Outline
– Introduction to high performance computing
– Some relevant facts about Beagle’s hardware
– Basics about the work environment
– Data transfer using Globus Online
– Use of the compilers (C, C++, and Fortran)
– Launching and monitoring applications
– Using Matlab on Beagle
3
What the Heck is Supercomputing?
Credit: Henry Neeman, Director, OU Supercomputing Center for Education & Research
http://www.oscer.ou.edu/education.php
Contact: hneeman@ou.edu
4
Why Beagle?
Not the kind of problem we can handle with Matlab at this point
5
What affects performance? Accessing data
Examples:
1. Data array too big to fit into cache (12 MB): we need to use main memory (32 GB)
2. An image too big to fit into memory (32 GB): use disk space or distributed memory (23 TB)
3. Too many genomes to fit on local storage (~50 TB max per user): use network disks
6
What affects performance? Repetition
Examples:
1. Unrelated experiments (e.g., CT image reconstruction and molecular dynamics modeling) can be run at the same time
2. Each genome in an experiment can be analyzed independently
3. Slices or sub-images can be processed at the same time
7
Matlab examples:
1) If analyzing a single image is time consuming (or images are large): slices or sub-images can be processed at the same time using different threads (e.g., with parallel tools, which are not working on Beagle yet)
2) If images are small: different threads can analyze different images (not really shared memory, just in the same memory)
11
Some relevant facts about Beagle’s hardware
http://beagle.ci.uchicago.edu/
Contact: beagle-support@ci.uchicago.edu
12
Beagle: hardware overview
13
Beagle “under the hood”
14
Compute nodes
– 2 AMD Opteron 6100 “Magny-Cours” 12-core processors (24 cores per node), 2.1 GHz
– 32 GB RAM (8 GB per processor)
– No disk on node (mounts DVS and Lustre network filesystems)
To know more: http://www.ci.uchicago.edu/wiki/bin/view/Beagle/SystemSpecs#Overview
15
Details about the Processors (sockets)
Superscalar:
– 3 integer ALUs
– 3 floating point ALUs (can do 4 FP per cycle)
Cache hierarchy:
– Victim cache
– 64 KB L1 instruction cache
– 64 KB L1 data cache (latency 3 cycles)
– 512 KB L2 cache per processor core (latency 9 cycles)
– 12 MB shared L3 cache (latency 45 cycles)
To know more: http://www.ci.uchicago.edu/wiki/bin/view/Beagle/SystemSpecs
16
Basics about the work environment
http://www.ci.uchicago.edu/wiki/bin/view/Beagle/ComputeOnBeagle
Contact: beagle-support@ci.uchicago.edu
17
Beagle’s operating system
The Cray XE6 uses Cray Linux Environment v3 (CLE3), which is SuSE Linux-based
Compute nodes use Compute Node Linux (CNL)
Login and sandbox nodes use a more standard Linux
The two are different (relevant to Matlab).
Compute nodes can operate in
– ESM (extreme scalability mode), to optimize performance for large multi-node calculations
– CCM (cluster compatibility mode), for out-of-the-box compatibility with Linux/x86 versions of software, more or less without recompilation or relinking! (It doesn’t work yet.)
To know more: http://www.ci.uchicago.edu/wiki/bin/view/Beagle/ComputeOnBeagle#Basics_about_the_work_environmen
18
Beagle’s filesystems
– /lustre/beagle: local Lustre filesystem (read-write). This is where all input and output files should be; however, NO BACKUP!
– /gpfs/pads: PADS GPFS (read-write), for permanent storage
– /home: CI home directories, largely useless: you can’t write there from the compute nodes!
To know more: http://www.ci.uchicago.edu/wiki/bin/view/Beagle/ComputeOnBeagle#How_to_work_on_the_filesystem
19
How to move data to and from Beagle
Beagle is not HIPAA-compliant: no PHI data on Beagle
Factors for choosing a data movement tool:
– how many files there are and how large they are
– how much fault tolerance is desired
– performance
– security requirements
– the overhead needed for software setup
Recommended tools:
– scp/sftp can be OK for moving a few small files (less than a couple of GB)
  o pros: quick to initiate
  o cons: slow and not scalable
– For optimal speed and reliability we recommend Globus Online, especially if you know you’ll be moving a lot of data or find scp too slow or unreliable:
  o high performance (i.e., fast)
  o reliable and easy to use, from either a command line or a web browser
  o provides fault-tolerant, fire-and-forget transfers
To know more: http://www.ci.uchicago.edu/wiki/bin/view/Beagle/ComputeOnBeagle#How_to_move_data_to_and_from_Bea
20
Applications on Beagle
Applications on Beagle are (mostly) run from the command line, e.g.:
    aprun -n 17664 myapp >& this.log
How do I know if an application is on Beagle?
– http://beagle.ci.uchicago.edu/software/
– http://www.ci.uchicago.edu/wiki/bin/view/Beagle/SoftwareOnBeagle
– On Beagle, use module avail, e.g.:
    lpesce@login2:~> module avail 2>&1 | grep -i matlab
    Matlab/7.13(default)
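The module check above can be wrapped in a small helper. This is only a sketch: the function name find_module is hypothetical, and the guard assumes that the module command may not exist on the machine where the snippet is tried.

```shell
# Hypothetical helper: print the lines of a module listing that mention a package.
# The listing is read from stdin, so it can come from `module avail 2>&1`
# on Beagle or from a saved listing elsewhere.
find_module() {
    grep -i -- "$1"
}

# Only query the module system if it exists on this machine (assumption:
# outside Beagle it usually does not, so this step is skipped).
if command -v module >/dev/null 2>&1; then
    module avail 2>&1 | find_module matlab || echo "matlab: no module found"
fi
```

Reading from stdin keeps the helper usable on a login node and on a laptop with a copied listing.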
21
Applications on Beagle
GUIs are in general not supported (true for both Matlab and Simulink)
Licensing is similar to any other uchicago.edu machine
– Packages charged by number of cores can be expensive on Beagle and aren’t usually supported
– Packages which have a campus license can simply be installed and used on Beagle
Octave is available at no charge and can in principle be installed on Beagle (upon serious request), even though porting is not easy
22
Matlab on Beagle
http://www.ci.uchicago.edu/wiki/bin/view/Beagle/MATLAB
http://beagle.ci.uchicago.edu/
Contact: beagle-support@ci.uchicago.edu
23
Matlab on Beagle: GUI
The Matlab GUI is not supported and most likely will not be in the future:
– In our experience, standard Matlab is not very effective at exploiting massively parallel supercomputers such as Beagle
– Parallel tools have the potential to overcome at least some of these issues, but licensing and other practical issues make this approach infeasible at this time
– If you have suggestions about how to use the GUI and parallel tools, let us know.
24
Matlab on Beagle: Compile code
However, executables compiled from Matlab code can easily be run on Beagle:
– MATLAB programs should be compiled using mcc (Matlab compiler) and run as command-line executables with MCR (Matlab Compiler Runtime). In our experience, Matlab has shown very limited ability to exploit multi-core processors effectively.
– Therefore, to exploit parallelism, executables are compiled single-threaded and run in parallel using a scripting language such as Bash or Swift.
– We are working on including parallel tools in the compiled programs, but we have no working solution at this point.
– Suggestions?
25
Compiling Matlab: Matlab code
The Matlab environment can compile any Matlab function of the form foofunc(x1,x2,...,xn)
Matlab functions can call other Matlab functions from other files; usually leaving those files in the compilation directory is sufficient
Calling parameters (x1, x2, ..., xn above) become command-line arguments for the executable. However, they are passed as strings, so numeric arguments must be converted inside the function:
    if (isdeployed)
        x1 = str2num(x1);
        x2 = str2num(x2);
        ...
        xn = str2num(xn);
    end
26
Compiling Matlab: mcc and MCR
The Matlab compiler (mcc) produces executables that require the Matlab Compiler Runtime (MCR) in order to run. MCR is a set of shared libraries that enables the execution of Matlab files without an installed version of Matlab or a license.
The mcc compiler is loaded with the command
    module load matlab
See also http://www.mathworks.com/help/toolbox/compiler
27
Compiling Matlab: mcc and MCR
Compilation can be done on other systems, as long as the MCR version corresponding to the mcc used to compile is installed on Beagle.
Specific versions of MCR can be installed by users in their directories on Lustre. Please contact us if you encounter problems while trying to do so.
Currently MCR is available as
– /soft/matlab/7.13/
– /soft/mcr/v714/
– (if you require other versions, let us know).
28
Compiling Matlab on Beagle: mcc options
We recommend compiling with
    mcc -R -singleCompThread -R -nojvm -R -nodisplay -mv myapp.m -o my_app
– -m generates a standalone application
– -v (verbose) displays all the compilation steps; e.g., it helps identify which third-party compiler is used and which environment variables are referenced
– -R specifies run-time options for MCR:
  o -R -nojvm disables the Java virtual machine
  o -R -nodisplay eliminates functions that would produce a display
  o -R -singleCompThread runs MCR single-threaded
At this stage, there does not appear to be a way to control how MATLAB creates threads, nor can it run a multi-threaded program efficiently on a 24-core Cray XE6 node: MATLAB reads /proc/cpuinfo directly to determine how many cores are available for a calculation and uses all of them, regardless of the instructions given to the aprun command.
To know more: http://www.mathworks.com/help/toolbox/compiler/f0-985134.html
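To avoid retyping the recommended options, the command line can be generated by a small function. This is a sketch only: mcc_command is a hypothetical name, and the function just prints the command from this slide without running mcc.

```shell
# Print (do not run) the mcc command line recommended on this slide.
# $1 = MATLAB source file, $2 = name of the executable to produce.
mcc_command() {
    local src="$1" exe="$2"
    printf 'mcc -R -singleCompThread -R -nojvm -R -nodisplay -mv %s -o %s\n' "$src" "$exe"
}

mcc_command myapp.m my_app
```

Printing the command first makes it easy to check the options before compiling; once the matlab module is loaded, the printed line can be pasted into the shell.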
29
Matlab on Beagle: mcc output
After the compilation, a number of files will be generated:
– mccExcludedFiles.log: don’t worry about this one
– my_app: the executable you will need to copy to Beagle
– readme.txt: contains information such as where to find the version of MCRInstaller.bin for your specific MATLAB, which you will need if it differs from the versions available on Beagle
– run_my_app.sh: a shell script that is used to run each copy of my_app. We recommend that you use it to avoid having to take care of too many variables in your PBS scripts. However, you will need to modify this script when using it on Beagle; see the next slide
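Before copying anything to Beagle, it can help to confirm that the two files needed at run time were actually produced. A minimal sketch, assuming the file names follow the my_app / run_my_app.sh pattern shown above; the function name missing_mcc_outputs is hypothetical.

```shell
# List which of the run-time files expected from mcc are missing.
# $1 = directory where mcc was run, $2 = application name (my_app on the slide).
missing_mcc_outputs() {
    local dir="$1" app="$2" f
    for f in "$app" "run_${app}.sh"; do
        [ -e "$dir/$f" ] || echo "$f"
    done
}
```

For example, `missing_mcc_outputs . my_app` prints nothing when both my_app and run_my_app.sh are present in the current directory.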
30
Matlab on Beagle: changes to run_my_app.sh
To prevent the various copies of the script from blocking each other, you can add something like the following lines at the beginning of the script, right after the initial comments (the series of lines starting with "#"):
    #Added to run on Beagle after August 2011
    #TMP must be defined by the calling PBS script
    tmp=`mktemp -d $TMP/matlabcachedir.XXXXXXXXXXX`
    echo $tmp
    export MCR_CACHE_ROOT=$tmp;
    # end added
To remove the temporary cache directories, after the line
    eval "${exe_dir}"/my_app $args
add
    #Added to run on Beagle after August 2011
    rm -rf $tmp
    #end added
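The same idea can be packaged as two functions, so that the cache directory is also removed when the application fails. This is a sketch rather than the script mcc generates: the function names and the MCR_CACHE_DIR variable are hypothetical, and the fallback to /tmp when TMP is unset is an assumption made so the snippet can be tried outside a PBS job.

```shell
# Create a private MCR cache directory under $TMP and export MCR_CACHE_ROOT.
# Assumption: fall back to /tmp when TMP is not set by the calling PBS script.
setup_mcr_cache() {
    local base="${TMP:-/tmp}"
    MCR_CACHE_DIR=$(mktemp -d "$base/matlabcachedir.XXXXXXXXXX") || return 1
    export MCR_CACHE_ROOT="$MCR_CACHE_DIR"
}

# Remove the cache directory created above; safe to call more than once.
cleanup_mcr_cache() {
    [ -n "${MCR_CACHE_DIR:-}" ] && rm -rf "$MCR_CACHE_DIR"
    MCR_CACHE_DIR=
}
```

In run_my_app.sh one would call `setup_mcr_cache` and then `trap cleanup_mcr_cache EXIT` before the eval line, so the directory is removed however the script ends.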
31
Using Matlab on Beagle: scripting
Run multiple copies of the single-threaded run_my_app.sh using a scripting language:
– Bash shell + PBS (batch submission)
– Swift
Remember that Beagle provides only 32 GB per node. Any request above that value will produce an Out of Memory (OOM) error, which terminates the process, so be mindful of how tightly you “pack” calculations onto a node
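The packing limit can be estimated with a few lines of shell arithmetic. A rough sketch: the 24 cores and 32 GB per node come from the hardware slides, the function name max_copies_per_node is hypothetical, and the estimate ignores the memory used by the operating system and by MCR itself, so the real limit is lower.

```shell
# Largest number of single-threaded copies that fit on one node.
# $1 = memory needed by one copy, in whole GB.
max_copies_per_node() {
    local mem_gb="$1" cores=24 node_gb=32
    [ "$mem_gb" -gt 0 ] || return 1          # reject zero or negative sizes
    local by_mem=$(( node_gb / mem_gb ))     # copies allowed by memory alone
    if [ "$by_mem" -lt "$cores" ]; then
        echo "$by_mem"
    else
        echo "$cores"
    fi
}

max_copies_per_node 4   # prints 8: memory is the limit
max_copies_per_node 1   # prints 24: the core count is the limit
```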
32
Using Matlab on Beagle: Bash + PBS
    #!/bin/bash
    #PBS -N myTestMatlab
    #PBS -l walltime=0:10:00
    #PBS -l mppwidth=24
    #PBS -j oe

    # Load modules and set for dynamic environment
    . /opt/modules/3.2.6.6/init/bash
    # Sets the shared library environment
    export CRAY_ROOTFS=DSL

    # Set the env variable where the root of MCR is
    # (you might need to change this if you need a specific version of MCR)
    #export MCRROOT=/soft/mcr/v714
    export MCRROOT=/soft/matlab/7.13/

    # Create, if necessary, a directory on /lustre to run the simulations
    LUSTREDIR=/lustre/beagle/`whoami`/testMatlab/magicsquare${PBS_JOBID}
    mkdir -p $LUSTREDIR

    # Set up TMP and a cache root dir for MCR, it won't work if it isn't set
    LUSTRETMP=${LUSTREDIR}/${PBS_JOBID}/tmp
    mkdir -p $LUSTRETMP
    export TMP=$LUSTRETMP
    export MCR_CACHE_ROOT=$LUSTRETMP

    # Copy the files to the run dir and run the code
    cd $PBS_O_WORKDIR
    cp run_my_app.sh my_app $LUSTREDIR
    cd $LUSTREDIR
    aprun -b -n 1 -d 1 ./run_my_app.sh $MCRROOT 5 &> test_my_app.log
To know more (e.g., packing and loops): http://www.ci.uchicago.edu/wiki/bin/view/Beagle/MATLAB#How_to_run_MATLAB_executables_vi
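The script above runs a single copy. One generic way to run several copies side by side is to start each in the background and wait for all of them. The sketch below shows only that shell pattern: run_packed and the LOGDIR variable are hypothetical names, and how such a wrapper should be launched under aprun on Beagle is not shown on the slide (see the wiki link above).

```shell
# Run one copy of a command per argument, all in the background, then wait.
# Prints the number of copies that failed; returns non-zero if any failed.
# Usage: run_packed CMD ARG1 ARG2 ...
run_packed() {
    local cmd="$1" pids="" failed=0 i=0 arg pid
    shift
    for arg in "$@"; do
        i=$((i + 1))
        # One log file per copy, in $LOGDIR (default: current directory).
        "$cmd" "$arg" > "${LOGDIR:-.}/copy_${i}.log" 2>&1 &
        pids="$pids $!"
    done
    for pid in $pids; do
        wait "$pid" || failed=$((failed + 1))
    done
    echo "$failed"
    [ "$failed" -eq 0 ]
}
```

For instance, `run_packed ./run_one.sh 3 4 5` would start three copies, where run_one.sh is a hypothetical wrapper that calls `./run_my_app.sh $MCRROOT "$1"`. Keep the number of arguments within the memory limit of the node.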
33
Matlab on Beagle: note
We are happy to help you use a scripting language effectively:
– Bash shell
– Swift (a presentation about it follows)
In general, compiled Matlab executables do not use Beagle very efficiently (in terms of both CPU and memory), and this should be considered carefully when planning large calculations.
Let us know if we can help with any of the steps involved in using Matlab on Beagle
34
Acknowledgments
– BSD for funding most of the operational costs of Beagle
– A lot of the images and content were taken or learned from Cray documentation or Cray staff (Dave Strenski, mostly)
– Globus for providing us with many slides and support; special thanks to Mary Bass, manager for communications and outreach at the CI
– NERSC and its personnel provided us with both material and direct instruction; special thanks to Katie Antypas, group leader of the User Services Group at NERSC
– All the people at the CI who supported our work, from administering the facilities to taking pictures of Beagle
– Beagle users who helped with the content about using Matlab and Python
35
Thanks! We look forward to working with you.
Questions? (or later: beagle-support@ci.uchicago.edu)