TensorFlow

In this example, we'll train a small classifier on the MNIST dataset, along the lines of https://www.tensorflow.org/tutorials/keras/classification. Save the following Python code as mnist.py:

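import tensorflow as tf

# Load MNIST and scale pixel values from [0, 255] to [0, 1]
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# Flatten each 28x28 image, then one hidden layer and 10 output logits
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(10)
])

# The last layer has no softmax, so the loss is computed from logits
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

# Train for 10 epochs, then evaluate on the held-out test set
model.fit(x_train, y_train, epochs=10)
test_loss, test_acc = model.evaluate(x_test, y_test, verbose=2)

print('\nTest accuracy:', test_acc)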
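
The final Dense(10) layer has no activation, so the model outputs raw logits, which is why the loss is created with from_logits=True. As a rough illustration of what that loss computes for a single sample, here is a sketch in plain Python rather than TensorFlow. The function name and the example logits are made up for illustration; Keras additionally averages this value over each batch.

```python
import math

def sparse_categorical_crossentropy_from_logits(logits, label):
    """Cross-entropy of one sample given raw logits and an integer class label."""
    # Log-sum-exp with the maximum subtracted for numerical stability.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    # -log(softmax(logits)[label]) simplifies to log_sum_exp - logits[label].
    return log_sum_exp - logits[label]

# Hypothetical logits for a 3-class problem; class 0 has the largest logit.
logits = [2.0, 0.5, -1.0]
loss_if_label_0 = sparse_categorical_crossentropy_from_logits(logits, 0)
loss_if_label_2 = sparse_categorical_crossentropy_from_logits(logits, 2)
print(loss_if_label_0, loss_if_label_2)
```

The loss is small when the true label matches the largest logit and grows when the true label has a low logit.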

We will now use the following SLURM script tf-gpu.sh to run the code:

#!/bin/bash

#SBATCH --job-name="TensorFlow-GPU-Demo"   # job name
#SBATCH --partition=peregrine-gpu          # partition to which job should be submitted
#SBATCH --qos=gpu_debug                    # qos type
#SBATCH --nodes=1                          # node count
#SBATCH --ntasks=1                         # total number of tasks across all nodes
#SBATCH --cpus-per-task=1                  # cpu-cores per task
#SBATCH --mem=4G                           # total memory per node
#SBATCH --gres=gpu:nvidia_a100_3g.39gb:1   # request 1 GPU (A100 40GB)
#SBATCH --time=00:05:00                    # wall time

module purge
module load python/anaconda

python mnist.py
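
If you want to confirm that TensorFlow actually sees the GPU requested with --gres, a short check along the following lines could be run inside the job before mnist.py. This is only a sketch: the file name check_gpu.py and the helper visible_gpus are hypothetical, and the script assumes that the python/anaconda module provides a GPU-enabled TensorFlow build.

```python
# check_gpu.py (hypothetical file name)

def visible_gpus():
    """Return the names of GPUs TensorFlow can see, or None if TensorFlow is missing."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return [device.name for device in tf.config.list_physical_devices('GPU')]

gpus = visible_gpus()
if gpus is None:
    print("TensorFlow is not available in this environment")
elif gpus:
    print("TensorFlow sees these GPUs:", gpus)
else:
    print("No GPU visible; training would fall back to the CPU")
```

An empty list usually means the job did not receive a GPU or the installed TensorFlow build has no GPU support.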

Submit the job with:

sbatch tf-gpu.sh

The output will be saved in a file named slurm-####.out, where #### is the job ID, and should look like:

Epoch 1/10
1875/1875 [==============================] - 2s 958us/step - loss: 0.2587 - accuracy: 0.9252
Epoch 2/10
1875/1875 [==============================] - 2s 955us/step - loss: 0.1135 - accuracy: 0.9660
Epoch 3/10
1875/1875 [==============================] - 2s 956us/step - loss: 0.0772 - accuracy: 0.9764

-----------------------------------------------------------
-------------------TRUNCATED-------------------------------
-----------------------------------------------------------

1875/1875 [==============================] - 2s 956us/step - loss: 0.0285 - accuracy: 0.9910
Epoch 8/10
1875/1875 [==============================] - 2s 956us/step - loss: 0.0245 - accuracy: 0.9920
Epoch 9/10
1875/1875 [==============================] - 2s 955us/step - loss: 0.0184 - accuracy: 0.9943
Epoch 10/10
1875/1875 [==============================] - 2s 956us/step - loss: 0.0172 - accuracy: 0.9942
313/313 - 0s - loss: 0.0815 - accuracy: 0.9771 - 330ms/epoch - 1ms/step

Test accuracy: 0.9771000146865845
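
To pull the per-epoch loss and accuracy out of a log like the one above, a small parser can help. The following is a minimal sketch, assuming the log contains clean "Epoch N/M" headers followed by one summary line each, as in the listing; the function name parse_training_log is hypothetical, and the sample text is copied from the output shown here.

```python
import re

# Matches the loss/accuracy part of a Keras summary line, e.g.
# "1875/1875 [====] - 2s 958us/step - loss: 0.2587 - accuracy: 0.9252"
METRICS_RE = re.compile(r"loss: (?P<loss>\d+\.\d+) - accuracy: (?P<acc>\d+\.\d+)")
EPOCH_RE = re.compile(r"Epoch (\d+)/(\d+)")

def parse_training_log(text):
    """Return a list of (epoch, loss, accuracy) tuples found in the log text."""
    results = []
    epoch = None
    for line in text.splitlines():
        header = EPOCH_RE.match(line.strip())
        if header:
            epoch = int(header.group(1))
            continue
        metrics = METRICS_RE.search(line)
        # Only accept a metrics line that directly belongs to an epoch header;
        # this skips the final evaluation line, which has no header.
        if metrics and epoch is not None:
            results.append((epoch, float(metrics.group("loss")), float(metrics.group("acc"))))
            epoch = None
    return results

sample = """Epoch 1/10
1875/1875 [==============================] - 2s 958us/step - loss: 0.2587 - accuracy: 0.9252
Epoch 2/10
1875/1875 [==============================] - 2s 955us/step - loss: 0.1135 - accuracy: 0.9660
313/313 - 0s - loss: 0.0815 - accuracy: 0.9771 - 330ms/epoch - 1ms/step
"""

for epoch, loss, acc in parse_training_log(sample):
    print(f"epoch {epoch}: loss={loss:.4f} accuracy={acc:.4f}")
```

In practice the text would be read from the slurm-####.out file of your job instead of the inline sample.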