How to Submit Jobs to SLURM#

Some jobs are time-consuming, and we want to avoid reserving a powerful node just for interaction. Instead, we can run the ‘interactive part’ (the Tomwer node) with fewer resources (no GPU, for example) and submit the heavy processing to a node that is released once the processing is done.

In Tomwer, resource-consuming tasks can be executed on SLURM:

  • Slice / volume reconstruction

    • nabu slice

    • nabu volume

    • sa-axis

    • sa-delta-beta

  • Volume casting

Warning: To launch jobs on SLURM, you need to be on a SLURM client of the ESRF cluster. A simple way to check is to make sure you have access to the sbatch command (see the sketch below).
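For instance, here is a quick way to run this check from Python (a minimal sketch; running which sbatch in a terminal works just as well):

import shutil

# Look for the sbatch executable on the PATH.
# If it is missing, this machine is not a SLURM client and cannot submit jobs.
if shutil.which("sbatch") is None:
    print("sbatch not found: this machine is not a SLURM client")
else:
    print("sbatch found: SLURM jobs can be submitted from here")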

Then you will need to handle the job submission and the resulting future object with two widgets:

  • SLURM cluster widget

  • Future supervisor widget

Note: Both are part of the ‘cluster’ group.

SLURM Cluster Widget#

This widget is used to define the partition and the resource configuration of the nodes to be created.

You need to add it to the canvas and configure it as shown below:

SLURM cluster configuration
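The exact fields depend on your Tomwer version, but conceptually the widget gathers standard SLURM resource parameters. As an illustration only (the names below are hypothetical and do not reflect Tomwer's actual API), the configuration amounts to something like:

# Hypothetical sketch of the kind of resources the SLURM cluster widget
# collects; the real widget exposes these as GUI fields, not a Python dict.
cluster_config = {
    "partition": "p9gpu",  # hypothetical partition name
    "n_cpus": 8,           # CPU cores requested per job
    "n_gpus": 1,           # GPUs requested for the reconstruction
    "memory_gb": 64,       # RAM requested per job
    "n_jobs": 1,           # number of SLURM jobs to spawn
}
print(cluster_config)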

Then you can connect it to one of the widgets that can use it (listed above) through the ‘cluster_config’ input. Here we connect it to the nabu volume widget.

SLURM cluster example

This time, the widget launching the job on SLURM returns a ‘future tomo obj’. This is a handle that lets the workflow wait for the remote processing to be done.

To wait for the processing to finish and convert the future back to a ‘standard’ data or volume object, we need to use the Future supervisor widget.
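The pattern is analogous to Python's standard futures: submission returns immediately with a handle, and the result is collected later. A minimal, Tomwer-independent sketch of that pattern:

from concurrent.futures import ThreadPoolExecutor

def remote_processing(x):
    # Stand-in for a reconstruction job running on a SLURM node.
    return x * 2

with ThreadPoolExecutor() as executor:
    # Submission returns immediately with a future object.
    future = executor.submit(remote_processing, 21)
    # The workflow can keep doing other work here...
    result = future.result()  # blocks until the 'job' is done
    print(result)  # 42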

Future Supervisor Widget#

This widget gives feedback on the different jobs currently being executed remotely.

By default, once a job is finished, its result can be used again downstream in your workflow.

Future supervisor with nabu volume
SLURM cluster interface

Bash Script#

Each process compatible with SLURM will create a bash script (.sh) under the ‘slurm_scripts’ folder.

Log files related to those scripts have the same name but a different extension (.out).

If needed, you can relaunch them from the command line using the sbatch command, as sketched below.
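For example, from Python on a SLURM client (the script name below is hypothetical; use one of the files actually present in the ‘slurm_scripts’ folder):

import subprocess

# Resubmit a previously generated script; equivalent to running
# `sbatch slurm_scripts/nabu_volume.sh` in a shell.
subprocess.run(["sbatch", "slurm_scripts/nabu_volume.sh"], check=True)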


Demo Video of a Submission to SLURM#

Here is a video demonstration of this feature.

Note: The interface of the SLURM cluster widget has been simplified since the video was recorded, but the behavior/logic behind it is the same.

[1]:
from IPython.display import YouTubeVideo

YouTubeVideo("HYufV7Ya9v8", height=500, width=800)

Hands-on - Exercise A#

Reconstruct a slice of the crayon.nx NXtomo (available from /scisoft/tomo_training/part6_edit_nxtomo/) or another dataset if you prefer.

Then reconstruct a small portion of the volume (about 20 slices) by submitting the volume reconstruction to the cluster.