Interactive Jobs

Certain applications require direct user input via a terminal. For these, SLURM provides interactive jobs, which let you run applications and commands in a shell on a compute node. There are two ways to start an interactive job: the srun command and the salloc command.

Interactive jobs are intended for very short-running, very specific applications and commands.
Do not use interactive jobs for long-running jobs or for regular workloads.
Submit those with the sbatch command instead.
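
For comparison with the interactive examples on this page, here is a minimal sketch of a batch submission. It assumes the same resource options used elsewhere on this page; write_batch_script is a hypothetical helper name and hostname is a placeholder for the real application command.

```shell
#!/bin/bash
# Sketch: write a minimal batch script to the path given as $1.
# The resource values mirror the interactive examples on this page;
# "hostname" is a placeholder for the real application command.
write_batch_script() {
    cat > "$1" <<'EOF'
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --mem=4G
#SBATCH --time=00:05:00

# Placeholder workload: replace with your own command.
hostname
EOF
}

# Usage (on a login node):
#   write_batch_script job.sh
#   sbatch job.sh
```

Adjust the memory and time values to what your job actually needs.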

SRUN

The srun command runs an interactive job in a shell that SLURM schedules on a compute node.

Here is an example that requests one task on one node, with 4 GB of memory, for 5 minutes:

srun --nodes=1 --ntasks=1 --mem=4G --time=00:05:00 --pty /bin/bash

Notice how the prompt changes, indicating that a new shell has been spawned on one of the compute nodes:

peregrine0:~$ 

You can now run your interactive application or command. When you are done, type exit at the command prompt to quit the shell and end the SLURM job.
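
Besides the prompt, the environment gives another way to confirm that a shell is running inside a job. The following is a minimal sketch, assuming SLURM exports the SLURM_JOB_ID variable into the shell started by srun; describe_session is a hypothetical helper name.

```shell
# Sketch: report whether the current shell is running inside a SLURM job.
# Assumes SLURM sets SLURM_JOB_ID in the environment of the job's shell.
# "describe_session" is a hypothetical helper name.
describe_session() {
    if [ -n "${SLURM_JOB_ID:-}" ]; then
        echo "inside SLURM job ${SLURM_JOB_ID}"
    else
        echo "not inside a SLURM job"
    fi
}

describe_session
```

Run on the login node, this should report that no job is active; run in the shell started by srun, it should print the job id.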

SALLOC

For situations where you would like to come back to your interactive session after disconnecting from it, you can use SLURM's salloc command to allocate resources up front and keep the job running. The process looks like this:

  • Use salloc to create the resource allocation up front
  • Use srun to connect to it, as many times as needed during the job time frame.

Run the command below to allocate resources:

salloc --nodes=1 --ntasks=1 --mem=4G --time=00:20:00

Here we are allocating 4 GB of memory and one CPU on a node for 20 minutes. The command displays a job id. Make a note of it, as you will need it to connect to the interactive shell.

salloc: Granted job allocation 235
salloc: Waiting for resource configuration
salloc: Nodes peregrine0 are ready for job

Notice that this time the prompt did not change. salloc only allocates resources for your job; it does not start a shell on the compute node. To connect to an interactive shell inside the job, use the srun command and specify the job id (which you noted in the salloc step):

srun --jobid=235 --pty /bin/bash

You are now in an interactive shell on a compute node.

peregrine0:~$
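
Instead of noting the job id by hand, it can be pulled out of the salloc message with a small helper. This is a minimal sketch that assumes the message has the "Granted job allocation" form shown above; extract_job_id is a hypothetical helper name.

```shell
# Sketch: pull the job id out of salloc's "Granted job allocation" message.
# "extract_job_id" is a hypothetical helper name.
extract_job_id() {
    printf '%s\n' "$1" |
        sed -n 's/^salloc: Granted job allocation \([0-9][0-9]*\).*$/\1/p'
}

job_id=$(extract_job_id "salloc: Granted job allocation 235")
echo "$job_id"   # prints 235

# Reconnect later with the stored id:
#   srun --jobid="$job_id" --pty /bin/bash
```

Lines that do not match the expected message produce an empty result, so it is worth checking that job_id is non-empty before passing it to srun.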

You can now exit the shell and reconnect later by running srun with the same job id. When you no longer need the allocation, cancel the job with the scancel command:

scancel 235
salloc: Job allocation 235 has been revoked.
Hangup