This small example, modeled on the TensorFlow tutorial at https://www.tensorflow.org/tutorials/keras/classification, trains a classifier on the MNIST digits dataset.
Save the following Python code as mnist.py:
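import tensorflow as tf

# Load MNIST and scale pixel values from [0, 255] to [0, 1]
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# Flatten each 28x28 image, then one hidden layer and 10 output logits
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(10)
])

# from_logits=True because the last layer applies no softmax
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

model.fit(x_train, y_train, epochs=10)

# Evaluate on the held-out test set and report the accuracy
test_loss, test_acc = model.evaluate(x_test, y_test, verbose=2)
print('Test accuracy:', test_acc)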
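Before training on a GPU node, it can be worth confirming that TensorFlow actually sees a GPU. The following is a minimal sketch, not part of mnist.py: the helper name visible_gpus is hypothetical, and it relies only on tf.config.list_physical_devices.

```python
def visible_gpus():
    """Return the GPU device names TensorFlow reports, or None if TensorFlow cannot be imported."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return [device.name for device in tf.config.list_physical_devices('GPU')]


gpus = visible_gpus()
if gpus is None:
    print('TensorFlow is not installed in this environment')
elif gpus:
    print('TensorFlow sees', len(gpus), 'GPU(s):', gpus)
else:
    print('No GPU visible; training would fall back to the CPU')
```

Running this inside the batch job, rather than on a login node, shows what the job itself was allocated.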
We will now use the following Slurm script, tf-gpu.sh, to run the code:
#!/bin/bash
#SBATCH --job-name="TensorFlow-GPU-Demo" # job name
#SBATCH --partition=peregrine-gpu # partition to which job should be submitted
#SBATCH --qos=gpu_debug # qos type
#SBATCH --nodes=1 # node count
#SBATCH --ntasks=1 # total number of tasks across all nodes
#SBATCH --cpus-per-task=1 # cpu-cores per task
#SBATCH --mem=4G # total memory per node
#SBATCH --gres=gpu:nvidia_a100_3g.39gb:1 # request 1 GPU (A100 MIG instance, 3g.39gb profile)
#SBATCH --time=00:05:00 # wall time
module purge
module load python/anaconda
python mnist.py
Submit the job with:
sbatch tf-gpu.sh
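When the submission is scripted, the job ID can be captured from the message sbatch prints. The sketch below assumes the usual "Submitted batch job <id>" confirmation and a script that sets no --output option; the helper names (parse_job_id, submit, default_output_file) and the job ID 123456 are hypothetical.

```python
import re
import subprocess


def parse_job_id(sbatch_output):
    """Extract the numeric job ID from sbatch's confirmation message."""
    match = re.search(r'Submitted batch job (\d+)', sbatch_output)
    if match is None:
        raise ValueError('unexpected sbatch output: ' + repr(sbatch_output))
    return int(match.group(1))


def submit(script='tf-gpu.sh'):
    """Run sbatch on the given script and return the new job's ID."""
    result = subprocess.run(['sbatch', script], capture_output=True,
                            text=True, check=True)
    return parse_job_id(result.stdout)


def default_output_file(job_id):
    """Name of the output file Slurm uses when the script sets no --output option."""
    return 'slurm-{}.out'.format(job_id)


# Example with a made-up job ID (no job is submitted here)
print(default_output_file(parse_job_id('Submitted batch job 123456')))
```

On a login node, default_output_file(submit()) would submit tf-gpu.sh and return the name of the file to watch.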
The output will be saved in a file named slurm-####.out, where #### is the job ID, and should look like:
Epoch 1/10
1875/1875 [==============================] - 2s 958us/step - loss: 0.2587 - accuracy: 0.9252
Epoch 2/10
1875/1875 [==============================] - 2s 955us/step - loss: 0.1135 - accuracy: 0.9660
Epoch 3/10
1875/1875 [==============================] - 2s 956us/step - loss: 0.0772 - accuracy: 0.9764
-----------------------------------------------------------
-------------------TRUNCATED-------------------------------
-----------------------------------------------------------
1875/1875 [==============================] - 2s 956us/step - loss: 0.0285 - accuracy: 0.9910
Epoch 8/10
1875/1875 [==============================] - 2s 956us/step - loss: 0.0245 - accuracy: 0.9920
Epoch 9/10
1875/1875 [==============================] - 2s 955us/step - loss: 0.0184 - accuracy: 0.9943
Epoch 10/10
1875/1875 [==============================] - 2s 956us/step - loss: 0.0172 - accuracy: 0.9942
313/313 - 0s - loss: 0.0815 - accuracy: 0.9771 - 330ms/epoch - 1ms/step
Test accuracy: 0.9771000146865845
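To check a finished run without reading the whole log, the final metrics can be pulled out of the output file. This is a rough sketch that assumes the log lines look like the sample above; the function name summarize_log is hypothetical, and the sample string simply repeats the last lines of that output.

```python
import re

NUMBER = r'(\d+\.\d+(?:e[+-]?\d+)?)'
METRICS = re.compile(r'loss: ' + NUMBER + r' - accuracy: ' + NUMBER)
TEST_ACC = re.compile(r'Test accuracy: ' + NUMBER)


def summarize_log(text):
    """Return the last training (loss, accuracy) pair and the reported test accuracy."""
    last_train = None
    for line in text.splitlines():
        found = METRICS.search(line)
        # Training lines carry a progress bar; the evaluate() line does not
        if found and '[' in line:
            last_train = (float(found.group(1)), float(found.group(2)))
    test = TEST_ACC.search(text)
    return {
        'last_train': last_train,
        'test_accuracy': float(test.group(1)) if test else None,
    }


sample = """Epoch 10/10
1875/1875 [==============================] - 2s 956us/step - loss: 0.0172 - accuracy: 0.9942
313/313 - 0s - loss: 0.0815 - accuracy: 0.9771 - 330ms/epoch - 1ms/step
Test accuracy: 0.9771000146865845
"""
print(summarize_log(sample))
```

For a real job, the text would come from the job's own output file, for example by reading slurm-####.out with the actual job ID filled in.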