Using Slurm Cluster

../_images/slurm_cluster.png

Slurm (originally the Simple Linux Utility for Resource Management) is a free and open-source job scheduler for Linux and Unix-like systems.

tomwer can use a Slurm cluster to run processing remotely. For now, only a limited set of tasks can be run on a remote Slurm node.

Remote processing lets you request dedicated, appropriately sized resources from the cluster instead of holding them in an interactive job. Two categories of widgets support remote processing:

  • Category A (cat A): Widgets in this category can request remote processing on the cluster and produce a Future Data Object, which can be supervised and converted back to a Data Object if the processing is successful.

  • Category B (cat B): Widgets in this category embed a display of the processing and expect user feedback before continuing. They also request remote processing, but they never generate a Future Data Object. Instead, they wait for the future to complete, then update the GUI and ask for user feedback.
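The difference between the two categories can be illustrated with Python's standard `concurrent.futures` module. This is only a conceptual sketch: a local thread pool stands in for the Slurm cluster, and `remote_processing`, `category_a`, and `category_b` are hypothetical names, not part of the tomwer API.

```python
from concurrent.futures import Future, ThreadPoolExecutor


def remote_processing(scan: str) -> str:
    """Stand-in for a job executed on a Slurm node (hypothetical)."""
    return f"{scan}: processed"


def category_a(executor: ThreadPoolExecutor, scan: str) -> Future:
    """Cat A: submit and hand back the future right away, without blocking."""
    return executor.submit(remote_processing, scan)


def category_b(executor: ThreadPoolExecutor, scan: str) -> str:
    """Cat B: submit, then wait for completion so a GUI could be updated."""
    future = executor.submit(remote_processing, scan)
    return future.result()  # blocks until the job has finished


with ThreadPoolExecutor(max_workers=1) as executor:
    future_data = category_a(executor, "scan_0001")
    # Downstream code converts the future back to data once it has succeeded.
    data_from_a = future_data.result()
    data_from_b = category_b(executor, "scan_0002")
```

In the cat A case the caller receives a future and decides when to wait for it. In the cat B case the wait happens inside the widget, so no future ever leaves it.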


Requirements:

  • Have (local) access to Slurm.

  • Target a Slurm partition with GPUs (p9gpu, gpu, p9gpu-long).

  • Have tomwer version 0.9 or later installed.

  • Have dask-jobqueue installed.
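Some of these requirements can be checked from Python before launching tomwer. The sketch below uses only the standard library. It assumes that the presence of the `sbatch` command on `PATH` is a reasonable indicator of local Slurm access, and it does not check the partition. `version_tuple` and `check_requirements` are hypothetical helper names.

```python
import importlib.metadata
import importlib.util
import shutil


def version_tuple(version: str) -> tuple:
    """Turn '1.2.3' into (1, 2, 3), stopping at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def check_requirements(min_tomwer: str = "0.9") -> dict:
    """Report which of the listed requirements appear to be met locally."""
    try:
        installed = importlib.metadata.version("tomwer")
        tomwer_ok = version_tuple(installed) >= version_tuple(min_tomwer)
    except importlib.metadata.PackageNotFoundError:
        tomwer_ok = False
    return {
        # Assumption: `sbatch` on PATH means Slurm is reachable locally.
        "slurm": shutil.which("sbatch") is not None,
        "tomwer": tomwer_ok,
        "dask_jobqueue": importlib.util.find_spec("dask_jobqueue") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_requirements().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```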

Note

You can create a dedicated Slurm cluster per process. You are not required to reuse the same one each time.
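As an illustration of one cluster per process, the sketch below builds a separate set of options for each process and creates a `dask_jobqueue.SLURMCluster` from them on demand. It is not how tomwer configures clusters internally. The process names, core counts, memory sizes, and walltimes are assumptions chosen for the example, and `cluster_options` and `start_cluster` are hypothetical helpers.

```python
def cluster_options(partition: str, cores: int, memory: str, walltime: str) -> dict:
    """Keyword arguments for dask_jobqueue.SLURMCluster (values are illustrative)."""
    return {"queue": partition, "cores": cores, "memory": memory, "walltime": walltime}


def start_cluster(options: dict):
    """Create one SLURMCluster; imported lazily so this loads without dask-jobqueue."""
    from dask_jobqueue import SLURMCluster

    cluster = SLURMCluster(**options)
    cluster.scale(jobs=1)  # ask Slurm for a single job
    return cluster


# One option set per process: nothing forces them to share a cluster.
# The process names below are hypothetical.
options_by_process = {
    "reconstruction": cluster_options("gpu", cores=4, memory="32GB", walltime="01:00:00"),
    "preview": cluster_options("p9gpu", cores=2, memory="16GB", walltime="00:30:00"),
}
```

Calling `start_cluster(options_by_process["reconstruction"])` on a machine with Slurm access would then submit a job for that process only.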