High Performance Computing Service

User environment and modules

Default environment

The operating system on all Darwin nodes is currently Scientific Linux 6.4 (a rebuild of Red Hat Enterprise Linux 6, aka RHEL6).

The user environment is set up using environment modules. By default, the module default-impi is loaded automatically during login to a Westmere/Nehalem or Sandy Bridge node. This module autoloads a collection of other modules which configure the environment for compiling applications and submitting jobs to the cluster, by providing access to the required utilities and the recommended development and MPI software.

It is possible to change the environment which is loaded upon login by editing the shell initialisation file ~/.bashrc. Note that this will affect all future login sessions, and all batch jobs not yet started. Since some modules are required for proper operation of the account, caution is needed before removing any autoloaded modules. Changes to the shell initialisation file will take effect at the next login or at the creation of the next shell. At any time one can check the currently loaded modules via:

module list

which currently produces the following output by default on a login-sand (Sandy Bridge) login node:

Currently Loaded Modulefiles:
  1) dot                     4) turbovnc/1.1            7) global                 10) intel/mkl/10.3.10.319
  2) scheduler               5) vgl/2.3.1/64            8) intel/cce/12.1.10.319  11) default-impi
  3) java/jdk1.7.0_21        6) intel/impi/4.1.3.045    9) intel/fce/12.1.10.319

From this list it is possible to see, for example, that the version of the Intel C/C++ compiler obtained by invoking icc is 12.1.10.319 (check this with which icc).
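
For example (illustrative output only; the installation path shown is an assumption based on the modulefile layout described later in this document):

which icc
# expected to print something like:
# /usr/local/Cluster-Apps/intel/cce/12.1.10.319/bin/intel64/icc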

Please note that modules work by setting environment variables such as PATH and LD_LIBRARY_PATH. Therefore, if you need to modify these variables directly, it is essential to retain the original values to avoid breaking loaded modules (and potentially rendering essential software "not found") - e.g. do

export PATH=$PATH:/home/abc123/custom_bin_directory
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/abc123/custom_lib_directory

and not

export PATH=/home/abc123/custom_bin_directory
export LD_LIBRARY_PATH=/home/abc123/custom_lib_directory

Modules

On a complex computer system, on which it is necessary to make available a wide choice of software packages in multiple versions, it can be quite difficult to set up the user environment so as to always find the required executables and libraries. (This is particularly true where different implementations or versions use the same names for files). Environment modules provide a way to selectively activate and deactivate modifications to the user environment which allow particular packages and versions to be found.

The basic command to use is module:

module
   (no arguments)              print usage instructions
   avail or av                 list available software modules
   whatis                      as above with brief descriptions
   load <modulename>           add a module to your environment
   unload <modulename>         remove a module
   purge                       remove all modules
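
For example, a typical session switching compiler versions might look like the following (the alternative version 11.0.081 is taken from the modulefile example later in this section; the versions actually available will vary):

module list                           # see what is currently loaded
module unload intel/cce/12.1.10.319   # remove the default C/C++ compiler module
module load intel/cce/11.0.081        # load an older version instead
which icc                             # confirm which compiler is now found first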

Enter the command module avail to see the entire collection of currently available modules.

Some modules refer to administrative software and are not of interest to users (e.g. cluster-tools is the ClusterVision management and control suite); also, some modules load other modules. Alternative versions of the Intel compilers and parallel libraries can be used by explicitly loading the corresponding modules. By default the login environment loads several modules required for the normal operation of the account: e.g. the default versions of the Intel compilers and Intel Math Kernel Library, the batch scheduling system, and the recommended MPI for the particular flavour of compute node hardware. One can list the modules actually loaded by issuing the command module list.

In some cases, alternative modules may exist with sandybridge in their name. These are the best choices for use on the Sandy Bridge nodes of Darwin, but they may not work on the Westmere or Tesla nodes (as they may contain instructions not supported by Westmere/Nehalem CPUs).

For historical reasons there may also be modules with nehalem in their name. These were originally created to provide Westmere/Nehalem-optimised builds of software using Intel MPI, as distinguished from versions intended to run on the old Woodcrest nodes (which had an older CPU type and a different recommended MPI). Since the Woodcrests have been decommissioned, the need for a separate category of nehalem modules has disappeared (nehalem has become the default for new builds); please check whether a more recent version of the same module exists without the nehalem label before using one of these modules. In some cases the same build of the software may be referenced under both names while the Woodcrest-specific variants are being cleared away.
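
To check which variants of a package exist, the full module list can be searched; note that the classic module command writes its listing to stderr, hence the redirection below (fftw is a placeholder package name):

module avail 2>&1 | grep -i sandybridge    # find all modules with sandybridge in the name
module avail fftw                          # list available versions of one package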

Since the home directory is shared with the slave nodes, the same default environment is inherited when a user's job commences via the queueing system. It is good practice to explicitly set up the module state in the submission script to remove ambiguity - the default template scripts under /usr/local/Cluster-Docs/Torque/ contain a section dealing with this.
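
A minimal sketch of such a section is shown below; the resource requests and application name are placeholders, and the template scripts under /usr/local/Cluster-Docs/Torque/ remain the authoritative reference:

#!/bin/bash
#PBS -l nodes=1:ppn=12              # placeholder resource request
#PBS -l walltime=01:00:00           # placeholder walltime

# Start from a known module state rather than relying on the
# environment inherited from the login shell.
module purge                        # remove all inherited modules
module load default-impi            # reload the recommended default environment

cd $PBS_O_WORKDIR                   # move to the directory the job was submitted from
mpirun ./my_application             # placeholder application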

Making your own modules

It is possible to create new modules and add them to your environment. For example, after installing an application in your own filespace, create a personal module directory called ~/privatemodules (recall that ~ is UNIX shorthand for your home directory). Then enable this directory by adding the following line to your ~/.bashrc file (this will take effect for future shells; issue the same command on the command line to affect the current shell):

module load use.own

Now module files created under ~/privatemodules will be recognised as modules.
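
For example, to package an application installed in your own filespace (myapp and the version number 1.0 are placeholder names):

mkdir -p ~/privatemodules/myapp     # one subdirectory per application
vi ~/privatemodules/myapp/1.0       # create a modulefile named after the version
module load use.own                 # if not already loaded via ~/.bashrc
module load myapp/1.0               # the new module is now available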

The contents of a module file look like this (see the default module directory /usr/local/Cluster-Config/modulefiles for more examples):

#%Module -*- tcl -*-
##
## modulefile
##
proc ModulesHelp { } {
  puts stderr "\tAdds Intel C/C++ compilers (11.0.081) to your environment."
}

module-whatis "adds Intel C/C++ compilers (11.0.081) to your environment"

set               root                 /usr/local/Cluster-Apps/intel/cce/11.0.081
prepend-path      PATH                 $root/bin/intel64
prepend-path      MANPATH              $root/man
prepend-path      LD_LIBRARY_PATH      $root/lib/intel64

...

Typically, one would set at least PATH, MANPATH and LD_LIBRARY_PATH to pick up the appropriate application directories. For more information, please load the modules module (module load modules), and refer to the module and modulefile man pages (i.e. man module and man modulefile).

A good approach is to create a subdirectory within the modules directory for each separate application, and to place the new module file in that directory, named to reflect the version number of the application. For example, the Intel C compiler module file above is /usr/local/Cluster-Config/modulefiles/intel/cce/11.0.081. The command module load intel/cce will automatically try to load the highest version number it finds under /usr/local/Cluster-Config/modulefiles/intel/cce (NB highest in lexical ordering, which is often but not necessarily the same as numerical ordering: e.g. 8 is higher than 1 and therefore also higher than 10).
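
The following shell sketch illustrates this lexical ordering caveat with hypothetical version numbers:

# Given these (hypothetical) private modulefiles:
#   ~/privatemodules/myapp/1.9
#   ~/privatemodules/myapp/1.10
# the bare command
module load myapp
# loads myapp/1.9 rather than myapp/1.10, because "9" sorts after "1"
# lexically. To be safe, load the intended version explicitly:
module load myapp/1.10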