All users should connect to Darwin or Wilkes initially through the login nodes. To access these do:
Individual login nodes login-sand1 - login-sand8 may be explicitly accessed in the same way (the generic names login and login-sand select a random node from this set).
See this faq for an explanation of "Sandy Bridge" Intel CPU types. All types of node have the same access to filesystems. In addition node login-gfx1 also has NVIDIA graphics cards and can be used (by both paying and non-paying users) for remote visualisation.
Note that compiling code with maximum optimisation requires the compiler to make different choices according to the type of CPU on which the final binary is intended to run. Incorrect choices of optimisation options may create a binary that, for example, will run on Sandy Bridge nodes but fail to run on the older Westmere nodes. To generate a binary capable of running efficiently on all current types of CPU present in Darwin and Wilkes, use the -xSSE4.2 -axAVX Intel compiler options (without -xHOST).
The HPCS uses CRSids where they exist and so the <username>@ can usually be omitted if your local system does also. Note that although systems in cam.ac.uk should be able to connect, other systems may not - please see this faq and contact support'at'hpc.cam.ac.uk in case of difficulty.
It is possible to change your initial password using the usual unix command passwd on a login node, but note that this will make it different to your UIS Password - see the UIS Password Management Application (https://password.csx.cam.ac.uk) for changing the latter. Note that the security of both users' data and the service itself depends strongly on choosing the password sensibly, which in the age of automated cracking programs unfortunately means the following:
- Use at least 10 characters
- Use a mixture of upper and lower case letters, numbers and at least one non-alphanumeric character
- Do not use dictionary words, common proper nouns or simple rearrangements of these
- Do not use family names, user identifiers, car registrations, media references, ...
- Do not re-use a password in use on another system (this is for damage limitation in case of a compromise somewhere).
Passwords should be treated like credit card numbers (and not left around, emailed or shared etc). The above rules are similar to those which apply to systems elsewhere.
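The mechanical parts of the rules above (length and character classes) can be checked with a few lines of shell. This is only a sketch: check_password is a hypothetical helper, not an HPCS tool, and it cannot detect dictionary words, names or re-use, which remain the user's responsibility.

```shell
# check_password CANDIDATE: hypothetical helper, not part of the HPCS software.
# Returns 0 only if CANDIDATE has at least 10 characters and contains an
# upper-case letter, a lower-case letter, a digit and a non-alphanumeric character.
check_password() {
    [ "${#1}" -ge 10 ] || return 1
    case "$1" in *[[:upper:]]*) ;; *) return 1 ;; esac
    case "$1" in *[[:lower:]]*) ;; *) return 1 ;; esac
    case "$1" in *[[:digit:]]*) ;; *) return 1 ;; esac
    case "$1" in *[![:alnum:]]*) ;; *) return 1 ;; esac
    return 0
}

# Example use:
#   check_password "$candidate" && echo "passes the basic checks"
```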
Please see here for a brief summary of available filesystems and the rules governing them.
We use the modules environment extensively. A module can for instance be associated with a particular version of Intel compiler, or different MPI libraries etc. Loading a module establishes the environment required to find the related include and library files at compile-time and run-time.
By default the environment is such that the most commonly required modules are already loaded. It is possible to see what modules are loaded by using the command 'module list':
[sjr20@login-sand1 ~]$ module list
Currently Loaded Modulefiles:
  1) dot                 4) turbovnc/1.1           7) global                      10) intel/mkl/10.3.10.319
  2) scheduler           5) vgl/2.3.1/64           8) intel/cce/184.108.40.2069   11) default-impi
  3) java/jdk1.7.0_21    6) intel/impi/4.1.3.045   9) intel/fce/220.127.116.119
The above shows that torque and the scheduler (the job queueing system software), as well as the Intel compilers and the Intel MPI environment are loaded (these are actually loaded as a result of loading the default- module, which is loaded automatically on login).
To permanently change what modules are loaded by default, edit your ~/.bashrc file, e.g. adding
module load cfitsio/intel/3.300
will ensure that the next shell (or the current shell if you issue source ~/.bashrc) will also have the Intel compiler-compiled version of cfitsio 3.300 loaded into the environment.
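If the line is added by a script rather than by hand, it is worth making the edit idempotent so that repeated runs do not stack up duplicate module load lines. A minimal sketch, assuming a hypothetical helper called add_module_line:

```shell
# add_module_line FILE MODULE: hypothetical helper.
# Appends "module load MODULE" to FILE unless that exact line is already present.
add_module_line() {
    line="module load $2"
    # -q: quiet, -x: match the whole line, -F: treat the pattern as a fixed string
    grep -qxF "$line" "$1" 2>/dev/null || printf '%s\n' "$line" >> "$1"
}

# Intended use (then start a new shell or run "source ~/.bashrc"):
#   add_module_line ~/.bashrc cfitsio/intel/3.300
```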
module load <module>    -> load module
module unload <module>  -> unload module
module purge            -> unload all modules
module list             -> show currently loaded modules
module avail            -> show available modules
module whatis           -> show available modules with brief explanation
In some cases, alternative modules may exist with sandybridge in their name. Historically, these were the best choices for use on the Sandy Bridge compute nodes, but in general if a more recent module exists without this prefix, then that module should be used in preference.
The default environment should be correctly established automatically via the modules system and the shell initialization scripts. For example, essential system software for compilation, credit and quota management, job execution and scheduling, error-correcting wrappers and MPI recommended settings are all applied in this way. This works by setting the PATH and LD_LIBRARY_PATH environment variables, amongst others, to particular values. Please be careful when editing your ~/.bashrc file, if you wish to do so, as this can wreck the default settings and create many problems if done incorrectly, potentially rendering the account unusable until administrative intervention. In particular, if you wish to modify PATH or LD_LIBRARY_PATH please be sure to preserve the existing settings, e.g. do
export PATH=/your/custom/path/element:$PATH
export LD_LIBRARY_PATH=/your/custom/librarypath/element:$LD_LIBRARY_PATH
and don't simply overwrite the existing values, or you will have problems. If you are trying to add directories relating to centrally-installed software, please note that there is probably a module available which can be loaded to adjust the environment correctly.
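Because ~/.bashrc may be sourced more than once, a guarded prepend avoids PATH growing with repeated entries while still preserving the existing settings. The following is a sketch only; path_prepend is a hypothetical helper name, and the same pattern could be applied to LD_LIBRARY_PATH.

```shell
# path_prepend DIR: hypothetical helper.
# Prepends DIR to PATH, keeping the existing value, unless DIR is already listed.
path_prepend() {
    case ":$PATH:" in
        *":$1:"*) ;;                        # already present: leave PATH unchanged
        *) PATH="$1${PATH:+:$PATH}" ;;      # prepend and preserve the old value
    esac
    export PATH
}

# Example use in ~/.bashrc:
#   path_prepend /your/custom/path/element
```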
Users who are returning to Darwin after some time should check their ~/.bashrc files and if necessary DELETE any pre-existing line
module load default-infinipath
as this will now interfere with the proper environment settings on the Sandy Bridge/Westmere nodes.
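The check and removal can be scripted if preferred. This is a sketch under the assumption that the line appears on its own in the file; remove_infinipath is a hypothetical helper, and it keeps a backup copy so that the change can be undone.

```shell
# remove_infinipath FILE: hypothetical helper.
# Deletes any "module load default-infinipath" line from FILE,
# leaving the original contents in FILE.bak.
remove_infinipath() {
    cp "$1" "$1.bak" || return 1
    # grep -v exits non-zero when no lines remain, which is not an error here
    grep -v '^[[:space:]]*module load default-infinipath[[:space:]]*$' "$1.bak" > "$1" || true
}

# Intended use:
#   remove_infinipath ~/.bashrc
```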
Note that the default-impi module, which is loaded by default, arranges for mpicc, mpif90 etc to be found and to use the Intel compilers automatically when invoked. These wrapper commands supply the correct flags for compiling with Intel MPI (which is the recommended and default MPI implementation for all current compute nodes).
When compiling code, it is usually possible to remove any direct MPI library references in your Makefile as mpicc & mpif90 will take care of these details. In the Makefile, simply set
etc, or define
etc before compilation.
To generate a binary capable of running efficiently on all current types of CPU present in Darwin, use the -xSSE4.2 -axAVX Intel compiler options.
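As an illustration of combining the MPI wrappers with these options, the environment could be prepared as follows before building. The variable names CC, FC, CFLAGS and FFLAGS are conventional Makefile names and are an assumption here, as is the source file prog.c; only the wrapper names and the -xSSE4.2 -axAVX options come from the text.

```shell
# Assumed conventional variable names; adjust to whatever your Makefile uses.
export CC=mpicc                    # C wrapper provided by default-impi
export FC=mpif90                   # Fortran wrapper provided by default-impi
export CFLAGS="-xSSE4.2 -axAVX"    # portable across the current CPU types
export FFLAGS="$CFLAGS"

# A Makefile that does not itself assign these variables can then be built with:
#   make
# or a single (hypothetical) source file compiled directly:
#   $CC $CFLAGS -o prog prog.c
```

Note that assignments inside a Makefile take precedence over the environment, so in that case the values would need to be passed on the make command line instead (e.g. make CC=mpicc).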
If some required libraries are missing, please let us know and we can try to install them centrally (as a module).
Please note that the following resource limits apply:
- On Darwin, SL1 and SL2 users are limited to 1024 cores on Sandy Bridge in use at any one time and a maximum wallclock runtime of 36 hours per job. On Wilkes, SL1 and SL2 are limited to 288 cores (48 GPUs). Note that DiRAC usage (which is classed as SL1) is confined to the Sandy Bridge sector of the cluster unless GPU resources have been awarded.
- SL3 users are limited to 256 cores (Darwin), 12 GPUs (Wilkes) and 12 hours per job.
- SL4 users are limited to one node and 12 hours per job.
For more information, please see this full description of service levels (SLs).
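The Darwin limits above can be expressed as a simple pre-submission check. This is a rough sketch: within_darwin_limits is a hypothetical helper, it treats the core limit as applying to a single request (whereas the SL1/SL2 limit is on cores in use at any one time), and it takes "one node" to mean the 16 cores of a Sandy Bridge node.

```shell
# within_darwin_limits SL CORES HOURS: hypothetical helper, not an HPCS tool.
# Returns 0 if the request fits the Darwin limits listed above,
# 1 if it exceeds them, and 2 for an unrecognised service level.
within_darwin_limits() {
    case "$1" in
        SL1|SL2) max_cores=1024; max_hours=36 ;;
        SL3)     max_cores=256;  max_hours=12 ;;
        SL4)     max_cores=16;   max_hours=12 ;;   # one node, assumed to have 16 cores
        *)       return 2 ;;
    esac
    [ "$2" -le "$max_cores" ] && [ "$3" -le "$max_hours" ]
}

# Example use:
#   within_darwin_limits SL3 128 12 && echo "request is within limits"
```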
Please see the example job submission scripts under /usr/local/Cluster-Docs/SLURM. There are example scripts for launching an MPI application on Darwin and Wilkes via the queueing system:
Copying this example file and then modifying the top section (where indicated) will create a script suitable for submission to the batch queueing system via the command sbatch.
Darwin and Wilkes operate the SLURM batch queueing system:
Some useful commands:
showq                        -> show global cluster information
sinfo                        -> show global cluster information
sview                        -> show global cluster information
scontrol show job nnnn       -> examine the job with jobid nnnn
scontrol show node nodename  -> examine the node with name nodename
sbatch                       -> submits an executable script to the queueing system
sintr                        -> submits an interactive job to the queueing system
srun                         -> run a command either as a new job or within an existing job
scancel                      -> delete a job
mybalance                    -> show current balance of core hour credits
Once your application is compiled, e.g. to a binary called prog, it can be submitted to the queueing system as follows (we assume it is destined for Darwin).
Firstly, copy the template SLURM submission script:
cp /usr/local/Cluster-Docs/SLURM/slurm_submit.darwin slurm_submit
(Note that for convenience newer users may have symbolic links to these template files in their home directories; these are read-only, so making a copy is still necessary.) Edit the copy slurm_submit, setting application to "prog" and workdir to the correct working directory. Set options to contain any desired command line options, e.g. ">outfile 2>errfile" would redirect stdout and stderr to files which could be monitored while the job runs. Note the comment lines in the script:
#! Which project should be charged:
#SBATCH -A CHANGEME
#! How many whole nodes should be allocated?
#SBATCH --nodes=2
#! How many (MPI) tasks will there be in total? (<= nodes*16)
#SBATCH --ntasks=32
#! How much wallclock time will be required?
#SBATCH --time=02:00:00
These are comments to bash, but are interpreted by SLURM as requests to use 2 nodes, with 32 tasks in total (which, because each node has 16 cores, completely utilizes each node), for 2 hours of wall time (i.e. actual time as measured by a clock on the wall, rather than CPU time). Finally, the following command submits the job to the queueing system:
sbatch slurm_submit
Please note that each Sandy Bridge node has 16 CPU cores, and one should normally be careful not to start more working processes or threads per node than there are cores per node. Fewer cores per node than the maximum may be specified, if for example your MPI job requires more memory per task than is available per core (and so you want to use fewer cores in each node), or if each task is expected to spawn additional parallel threads (e.g. if you have a hybrid MPI/OpenMP application). In these cases, the smaller number of total tasks to launch should be set via the --ntasks directive, but make sure that if you do this, sufficient memory per node is also requested, via --mem=mb_per_node (up to a maximum on the Sandy Bridge nodes of 63900 MB which represents the entire node).
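The arithmetic for a memory-bound job can be sketched as follows. tasks_per_node is a hypothetical helper; the figures of 16 cores and 63900 MB per Sandy Bridge node are those quoted above.

```shell
# tasks_per_node MB_PER_TASK: hypothetical helper.
# Prints how many tasks fit on one Sandy Bridge node when each task
# needs MB_PER_TASK megabytes (0 means a task does not fit on one node).
tasks_per_node() {
    node_mb=63900      # maximum --mem value for a Sandy Bridge node
    node_cores=16      # cores per Sandy Bridge node
    n=$(( node_mb / $1 ))          # integer division: tasks that fit in memory
    if [ "$n" -gt "$node_cores" ]; then
        n=$node_cores              # never more tasks than cores
    fi
    echo "$n"
}
```

For example, tasks needing 8000 MB each give 7 tasks per node, so a two-node job would be requested with --nodes=2 --ntasks=14 --mem=63900.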
The job's status in the queue can be monitored with squeue; alternatively use qstat or showq (add -u username to focus on a particular user's jobs).
The job can be deleted with scancel <job_id> or qdel <job_id>.
When the job finishes (in error or correctly) there will normally be one file created in the submission directory with a name of the form slurm-NNNN.out (where NNNN is the job id).
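On success, sbatch normally prints a line of the form "Submitted batch job NNNN", from which the name of the output file can be derived. A minimal sketch, assuming that message format and a hypothetical helper output_file_for:

```shell
# output_file_for MESSAGE: hypothetical helper.
# Takes the line printed by sbatch ("Submitted batch job NNNN")
# and prints the corresponding default output file name.
output_file_for() {
    jobid=${1##* }                 # keep the text after the last space
    echo "slurm-${jobid}.out"
}

# Intended use on a login node, from the submission directory:
#   msg=$(sbatch slurm_submit)
#   echo "Output will appear in $(output_file_for "$msg")"
```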