CM cluster documentation#

CM is the cluster of the Theory and Simulation of Condensed Matter group @UniTS!

Overview#

[Figure: _images/monitor_cm.svg]

Computing nodes#

The cluster comprises the following computing nodes.

Node   CPU                 Cores (CPUs x TDP)   Frequency   RAM                 PSU
----   -----------------   ------------------   ---------   -----------------   ---------
cm01   Xeon Gold 5218      32 (2x125W)          2.30 GHz    96 GB (DDR4-2666)   550W (x2)
cm02   Xeon Gold 5118      24 (2x105W)          2.30 GHz    64 GB (DDR4-2400)   550W (x2)
cm03   Xeon Gold 5118      24 (2x105W)          2.30 GHz    64 GB (DDR4-2400)   550W (x2)
cm04   Xeon Silver 4210R   10 (1x100W)          2.40 GHz    16 GB (DDR4-2400)   550W (x1)

General rules#

  • The login node is cm01.units.it

  • Do not ssh into the other nodes or run processes on them manually

  • Submit jobs from the login node using SLURM (see the next section)

  • Avoid running long processes on the login node

  • Keep an eye on the data you store on disk (du -sh ~)

  • Regularly remove data you do not need anymore
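To see where the space goes, du can be combined with sort. The helper below is a minimal sketch and not part of the cluster setup: the function name largest_dirs is hypothetical, and it assumes a du that supports the -d (depth) option, as GNU coreutils does.

```shell
# Hypothetical helper: list the largest directories directly under a folder.
# Sizes are in KiB; the first line is the total for the folder itself.
largest_dirs() {
    dir=${1:-.}        # folder to inspect (defaults to the current one)
    count=${2:-10}     # number of lines to print
    du -k -d 1 "$dir" 2>/dev/null | sort -rn | head -n "$count"
}
```

For example, largest_dirs ~ 5 prints the total size of your home directory followed by its four largest subdirectories, hidden ones included.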

Submitting jobs#

Interactive job

srun <command>
nohup srun -n 16 sleep 10 &

Sample batch job

#!/bin/bash
#SBATCH --ntasks-per-node=16
# SLURM_NTASKS is set by SLURM to the number of tasks allocated to the job
mpirun -n "$SLURM_NTASKS" sleep 10
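A slightly fuller batch script might also set a job name, a time limit and an output file. The sketch below writes such a script to job.sh. The job name, file names and time limit are placeholder values, not cluster defaults, and mpirun is assumed to be available on the computing nodes as in the sample above.

```shell
# Write an example batch script to job.sh (all values are placeholders).
# The quoted 'EOF' keeps $SLURM_NTASKS unexpanded until the job runs.
cat > job.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=sleep-test
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=16
#SBATCH --time=00:10:00
#SBATCH --output=job-%j.out

# SLURM_NTASKS: total number of tasks allocated to this job
mpirun -n "$SLURM_NTASKS" sleep 10
EOF
echo "Wrote job.sh"
```

In the --output file name, SLURM replaces %j with the job ID, so successive runs do not overwrite each other's output.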

Batch job

sbatch <script>

Jobs queue

squeue

Frequently asked questions (FAQ)#

How do I install python package X?#

It is recommended to install Python packages in a Python virtual environment. The following command

python3 -m venv env

creates a virtual environment in the folder env/. To activate the environment and install packages within it, run

. env/bin/activate
pip install <package_to_install>

Once you have finished using the environment and want to go back to the default Python distribution, run

deactivate

You can have several environments (e.g. one per project). An environment can be deleted at any time by simply deleting its folder, here env/.

Can I use docker images?#

Yes, ask the admin to add you to the docker group.

How can I submit a job on a specific node?#

Typically, you should let the scheduler find the first available node for your job. If you need to run it on a specific node, say cm03, pass the -w option to sbatch:

sbatch -w cm03 job.sh
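The node can also be requested inside the batch script, since --nodelist is the long form of -w. A minimal sketch, assuming a 16-task job like the sample batch job; uname -n simply prints the name of the machine the script runs on, which is a convenient check that the job landed on cm03.

```shell
#!/bin/bash
#SBATCH --nodelist=cm03
#SBATCH --ntasks-per-node=16

# The batch script itself runs on the first allocated node;
# print its host name to confirm where the job ended up.
echo "Running on $(uname -n)"
```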

How can I use the Intel compiler?#

To use the Intel compilers (ifort and icc) and the MKL libraries, source the oneAPI setup script:

. /opt/intel/oneapi/setvars.sh

You can put this command in your ~/.bashrc file, so that it is executed every time you log in.
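If the command goes into ~/.bashrc, it can help to guard it and silence its output, because text printed from ~/.bashrc can interfere with non-interactive sessions such as scp. A minimal sketch; the variable name INTEL_SETVARS is only for illustration.

```shell
# Path taken from the text above; the variable name is hypothetical
INTEL_SETVARS=/opt/intel/oneapi/setvars.sh

# Source the script only if it exists, and discard the banner it prints
if [ -f "$INTEL_SETVARS" ]; then
    . "$INTEL_SETVARS" > /dev/null
fi
```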

Can squeue print the number of cores used by jobs?#

Yes. Add the following alias to your ~/.bashrc file; the %C field prints the number of CPUs allocated to each job

alias squeue='squeue -o "%.7i %.9P %.8j %.8u %.2t %.10M %.6D %C %R"'

Log out and log in again (or run source ~/.bashrc), then type squeue
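Two related sketches, assuming a standard SLURM installation. First, squeue also reads its default output format from the SQUEUE_FORMAT environment variable, which works without an alias. Second, summing the %C column gives the total number of cores used by your own jobs; the function name sum_cores is hypothetical.

```shell
# Alternative to the alias: squeue takes its default output format from SQUEUE_FORMAT
export SQUEUE_FORMAT="%.7i %.9P %.8j %.8u %.2t %.10M %.6D %C %R"

# Hypothetical helper: add up one number per line read from standard input
sum_cores() {
    awk '{ total += $1 } END { print total + 0 }'
}

# Total cores used by your jobs (-h drops the header, -o "%C" prints only the CPU count).
# Runs only where squeue is installed.
if command -v squeue >/dev/null 2>&1; then
    squeue -h -u "$(id -un)" -o "%C" | sum_cores
fi
```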

How do I get the number of idle cores in the cluster?

Type

free_cpus
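How free_cpus is implemented is not described here. As a rough equivalent with standard SLURM tools, sinfo can print CPU counts in the form allocated/idle/other/total through its %C format field. The sketch below extracts the idle count from such a string; the function name idle_cores is hypothetical, and on a cluster with several partitions sinfo may print more than one line.

```shell
# Hypothetical helper: extract the idle count from an
# "allocated/idle/other/total" string, e.g. "40/50/0/90"
idle_cores() {
    printf '%s\n' "$1" | cut -d/ -f2
}

# Idle cores as reported by sinfo (-h drops the header).
# Runs only where sinfo is installed.
if command -v sinfo >/dev/null 2>&1; then
    idle_cores "$(sinfo -h -o "%C")"
fi
```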

Updating this web page#

Log in to the cm01 node and go to

cd /var/www/docs

Edit the file index.rst (in reStructuredText syntax, see for instance this tutorial).

Compile and deploy the webpage

make deploy

and check the result by reloading this page (ignore the warnings). When you are satisfied, commit your changes

git commit -am "...describe your changes here..."