Common User Environments - Update
Shawn T. Brown, PSC
CUE Working Group Lead
TG Quarterly
Team Members
Shawn Brown (PSC, Lead)
Kevin Colby (Purdue)
Dan Lapine (NCSA)
David McWilliams (NICS)
Derek Simmel (PSC)
Rich Raymond (PSC, Managing Lead)
Jerry Greenberg (SDSC)
Roberto Gomez (PSC)
John Lockman (TACC)
Jim Lupo (LONI)
Diana Diehl (SDSC, TG Documentation, volunteer)
Philosophy
Create commonality without destroying diversity.
Focus on user requirements and experience.
We are not developing a gateway.
We are not catering to the hero users.
TeraGrid Resources
CUED (CUE Documentation): A centrally located, clearly itemized area for documentation of resources, with both web and CLI based access.
CUEMS (CUE Management System): A single common command line system for managing one's environment, with a single entry to load the CUE.
CUETP (CUE Testing Platform): A simple program or set of programs that can be compiled and executed through the CUE and will help to illustrate its use.
CUBE (Common User Build Environment): An attempt to make common the tools needed for building usable scientific code across resources.
CUEVC (CUE Variable Collection): A set of environment variables that will be common across the TeraGrid, making job submission and resource discovery easier.
How did we proceed?
Targeted RP "liaisons" to work on implementation.
Developed implementation documents outlining the "rules" of the implementation.
  – Done in consultation with: RP liaisons, SW Int working group, Campus Champions
Worked to implement the CUEMS and CUEVC portions on current TG machines.
The Machines We Are Working With
Abe
Queen Bee
Steele
Lonestar
Ranger
Kraken
Pople
Dash
Future Systems
CUEMS – Environment Management
–Implementation of the Modules software environment manager on all systems
–Five basic modules:
    cue-login-env: Contains the CUEVC definitions for environment variables
    cue-math: A wrapper for the modules cue-mkl, cue-fftw, cue-lapack, cue-scalapack
    cue-build: A wrapper for the module cue-compile
    cue-comm: A wrapper for the default MPI stack
    cue-tg: Contains already defined TG variables for the site
–Application modules: cue-namd, cue-gamess, cue-hdf5, etc.
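The wrapper relationships listed on this slide can be sketched as a small lookup. This is only an illustration under stated assumptions: the real CUE modules are modulefiles handled by the Modules software, not Python data, and the names `WRAPPERS`, `expand`, and `load_commands` are hypothetical. `cue-comm` is left out of the table because the slide does not name the MPI module it wraps.

```python
# Wrapper relationships as listed on the slide (illustrative only).
WRAPPERS = {
    "cue-math": ["cue-mkl", "cue-fftw", "cue-lapack", "cue-scalapack"],
    "cue-build": ["cue-compile"],
}


def expand(module):
    """Return the modules a wrapper stands for, or the module itself."""
    return list(WRAPPERS.get(module, [module]))


def load_commands(requested):
    """Build 'module load' command lines, skipping duplicates, in request order."""
    seen = []
    for name in requested:
        for member in expand(name):
            if member not in seen:
                seen.append(member)
    return ["module load " + member for member in seen]


if __name__ == "__main__":
    for line in load_commands(["cue-login-env", "cue-math", "cue-build"]):
        print(line)
```

The point of the wrapper design is that a user asks for one name, such as cue-math, and each site decides which underlying libraries that name brings in.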
CUEVC – Variable Collection
Proposed CUE Variable Collection (Environment Variable / Definition / Example Values)

CUE_HOME
    Definition: Path to the current user's home directory, visible on login nodes and compute nodes.
    Example values: /usr/users/0/janedoe, /nics/j/home/janedoe, /home/ncsa/janedoe, /home/janedoe
CUE_DOCS
    Definition: URL for documentation specific to the current system.
    Example values: http:// en.php
CUE_APPS
    Definition: Path to the directory on the current system containing common software applications.
    Example values: /usr/local/apps, /sw/xt5, /usr/local/packages/tg, /software/linux-rhel4-ia64
CUE_COMMUNITY
    Definition: Path to the directory containing subdirectories for specific user communities, in which their applications are installed.
    Example values: /usr/projects, /usr/local/packages/tg, /soft/community
CUE_EXAMPLES
    Definition: Path to the directory containing example files for user tools.
    Example values: /usr/local/packages/tg/examples, /usr/local/examples, /soft/community/examples
CUE_NODE_SCRATCH
    Definition: Path on a compute node to local scratch file space for that node (not necessarily visible to other compute nodes); node scratch filesystems local to the node may be deleted upon job completion.
    Example values: /scr, /lustre/scratch/johndoe, /bessemer/johndoe
CUE_NODE_SCRATCH_TYPE
    Definition: Filesystem type of the node local scratch filesystem.
    Example values: lustre, ext3, gpfs, posix
CUE_SCRATCH
    Definition: Path to the user's scratch directory on a shared filesystem visible to all compute nodes.
    Example values: /gpfs_scratch1/janedoe, /lustre/scratch/janedoe, /scratcha/janedoe, /scratch/gpfs/local/janedoe
CUE_SCRATCH_TYPE
    Definition: Filesystem type of the scratch filesystem visible to all compute nodes.
    Example values: lustre, ext3, gpfs, posix
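To illustrate how a common variable collection could make scripts portable across sites, here is a minimal sketch that reads the proposed variables from the environment. The variable names come from the table above; the function names `collect_cue_variables` and `job_scratch_dir`, and the idea of a per-job subdirectory, are assumptions made for this example and are not part of the CUE proposal.

```python
import os
import posixpath

# Variable names taken from the proposed CUE Variable Collection.
CUE_VARIABLES = (
    "CUE_HOME", "CUE_DOCS", "CUE_APPS", "CUE_COMMUNITY", "CUE_EXAMPLES",
    "CUE_NODE_SCRATCH", "CUE_NODE_SCRATCH_TYPE",
    "CUE_SCRATCH", "CUE_SCRATCH_TYPE",
)


def collect_cue_variables(env=None):
    """Return the CUE variables that are set in env (default: os.environ)."""
    source = os.environ if env is None else env
    return {name: source[name] for name in CUE_VARIABLES if name in source}


def job_scratch_dir(job_name, env=None):
    """Hypothetical helper: name a per-job directory under CUE_SCRATCH."""
    cue = collect_cue_variables(env)
    if "CUE_SCRATCH" not in cue:
        raise KeyError("CUE_SCRATCH is not set; has the CUE been loaded?")
    return posixpath.join(cue["CUE_SCRATCH"], job_name)


if __name__ == "__main__":
    # Print whatever CUE variables the current login environment defines.
    for name, value in collect_cue_variables().items():
        print(name + "=" + value)
```

Because the script refers only to CUE_SCRATCH, the same code would resolve to /lustre/scratch/janedoe on one system and /gpfs_scratch1/janedoe on another without any site-specific branches.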
CUEMS – Environment Management
Current Policy – Opt-in approach
–Provide users a clear and simple procedure for implementing CUE as default.
    .nosoft – Tells the system that you want modules as your default environment management.
    .modules – Contains commented-out cue modules that can be enabled at login.
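As a rough sketch of the opt-in procedure, the check below reports whether the two files named on this slide are present in a home directory. It assumes that the mere presence of `.nosoft` is what signals the opt-in and says nothing about the contents of `.modules`, whose format the slide does not specify; the function name `opt_in_status` is hypothetical.

```python
from pathlib import Path

# File names as given on the slide.
OPT_IN_FILES = (".nosoft", ".modules")


def opt_in_status(home):
    """Report which of the CUE opt-in files exist in the given home directory."""
    home = Path(home)
    return {name: (home / name).is_file() for name in OPT_IN_FILES}


if __name__ == "__main__":
    for name, present in opt_in_status(Path.home()).items():
        print(name, "present" if present else "missing")
```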
CUED – Documentation
Working with the documentation group to add modules documentation to TG Docs.
A getting started guide on how to activate modules.
Rolling out
Announce to the TG User Services group at next meeting.
  –Ask for feedback and testing.
Ask Campus Champions to test out the implementation.
Incorporate into the QA testing procedures.
  –Already underway
  –Current implementation… The Jerry Test
Announcement and opening to public.
Not stopping…
Discussion of common queue names.
Continue work on CUED incorporation.
Finish fitting this into the TG SW Integration Kits.
  –Derek Simmel (PSC)