Kubernetes

When you choose Kubernetes as the job orchestrator, the Instant Cluster is provisioned with Kubernetes (k8s) installed.

Features

The cluster is configured to run multi-node InfiniBand jobs out of the box.

  • Each worker node's /mnt/local_disk is available as a StorageClass
  • mpi-operator is deployed
  • cilium, nvidia-device-plugin, and nvidia-network-operator are installed with Helm
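
To confirm that these components are present, something like the following sketch can be run on the machine where kubectl is configured. It only uses standard kubectl and helm subcommands. The CRD name mpijobs.kubeflow.org is inferred from the mpijob.kubeflow.org resource shown in the job submission output, and the namespaces the components live in are not documented here, so the sketch lists across all namespaces.

```shell
# Assumption: kubectl (and optionally helm) are on PATH with a working kubeconfig.
MPIJOB_CRD="mpijobs.kubeflow.org"   # CRD name inferred from the MPIJob API group

if command -v kubectl >/dev/null 2>&1; then
  kubectl get storageclass            # the StorageClass backed by /mnt/local_disk should appear here
  kubectl get crd "$MPIJOB_CRD"       # present when mpi-operator is deployed
  kubectl get pods --all-namespaces   # cilium, device plugin and network operator pods
else
  echo "kubectl not found; run this where kubectl is configured"
fi

if command -v helm >/dev/null 2>&1; then
  helm list --all-namespaces          # Helm releases in every namespace
fi
```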

Currently, selecting Kubernetes makes both the Kubernetes and SLURM job orchestrators available, and the two do not coordinate with each other. To disable SLURM, run systemctl disable --now slurmctld on the jumphost.

Using kubectl

Admin credentials are stored in /root/.kube/config and /home/ubuntu/.kube/config.
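
kubectl reads ~/.kube/config by default, so the root and ubuntu users should not need extra flags. The sketch below makes the path explicit and runs a quick connectivity check. It assumes you are logged in on the machine that holds the kubeconfig, and it keeps any KUBECONFIG value you have already set.

```shell
# Assumption: logged in as ubuntu on the node that holds the kubeconfig.
export KUBECONFIG="${KUBECONFIG:-/home/ubuntu/.kube/config}"

if command -v kubectl >/dev/null 2>&1 && [ -r "$KUBECONFIG" ]; then
  kubectl config current-context   # which cluster/user pair is active
  kubectl get nodes -o wide        # worker nodes should be listed as Ready
else
  echo "kubectl or $KUBECONFIG not available on this machine"
fi
```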

Submitting a job

/home/ubuntu/verda_k8s_all_reduce_perf_2_nodes.yml is provided as an example. It runs the nccl-tests all_reduce_perf benchmark and sets the NCCL_PKEY=1 environment variable. This variable is required: it tells the nodes which InfiniBand P_Key to use.
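
When adapting the example for your own workload, it can help to see exactly where NCCL_PKEY is set so that the same entry is carried over. A minimal sketch, assuming the example file exists at the path above:

```shell
MANIFEST="/home/ubuntu/verda_k8s_all_reduce_perf_2_nodes.yml"

if [ -r "$MANIFEST" ]; then
  # Print each NCCL_PKEY match with line numbers and two lines of context.
  grep -n -B2 -A2 'NCCL_PKEY' "$MANIFEST"
else
  echo "$MANIFEST not found; run this on the cluster"
fi
```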

$ kubectl create -f /home/ubuntu/verda_k8s_all_reduce_perf_2_nodes.yml 
mpijob.kubeflow.org/nccl-test-2n-wcq4s created

Job Details

$ kubectl get pods 
NAME                                READY   STATUS      RESTARTS   AGE
nccl-test-2n-8cnbc-launcher-przrf   0/1     Completed   4          30m
$
$ kubectl logs -f nccl-test-2n-8cnbc-launcher-przrf | tail -10
  4294967296    1073741824     float     sum      -1  9230.25  465.31  872.46       0  9221.35  465.76  873.31       0
  8589934592    2147483648     float     sum      -1  18376.5  467.44  876.45       0  18337.6  468.43  878.31       0
# Out of bounds values : 0 OK
# Avg bus bandwidth    : 827.441 
#
# Collective test concluded: all_reduce_perf
#


=== NCCL test completed ===

The job may take a while to start, because the container image has to be downloaded on all workers.
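
While waiting, the job's progress can be followed with standard kubectl commands. The sketch below assumes the job was created in the current namespace. Image downloads normally show up as Pulling and Pulled events.

```shell
EVENT_LINES=20   # how many recent events to show

if command -v kubectl >/dev/null 2>&1; then
  kubectl get mpijobs                # MPIJob resources managed by mpi-operator
  kubectl get pods -o wide           # launcher and worker pods, with the node each runs on
  kubectl get events --sort-by=.lastTimestamp | tail -n "$EVENT_LINES"
else
  echo "kubectl not found; run this where kubectl is configured"
fi
```

Adding --watch to kubectl get pods streams status changes until interrupted.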

Container Registry

It is a good idea to authenticate when pulling images. See https://docs.verda.com/containers/container-registries for details.
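
One common way to authenticate in Kubernetes is an image pull secret. The following is a sketch rather than the documented Verda procedure: the secret name regcred, the server registry.example.com, and the REGISTRY_USER and REGISTRY_PASSWORD variables are hypothetical placeholders. Only the kubectl subcommands themselves are standard.

```shell
# Hypothetical values: replace with your own registry and secret name.
SECRET_NAME="regcred"
REGISTRY_SERVER="registry.example.com"

if command -v kubectl >/dev/null 2>&1 \
   && [ -n "${REGISTRY_USER:-}" ] && [ -n "${REGISTRY_PASSWORD:-}" ]; then
  # Store the registry credentials as a docker-registry secret in the current namespace.
  kubectl create secret docker-registry "$SECRET_NAME" \
    --docker-server="$REGISTRY_SERVER" \
    --docker-username="$REGISTRY_USER" \
    --docker-password="$REGISTRY_PASSWORD"
  # Let pods that run under the default service account use the secret.
  kubectl patch serviceaccount default \
    -p "{\"imagePullSecrets\": [{\"name\": \"$SECRET_NAME\"}]}"
else
  echo "Set REGISTRY_USER and REGISTRY_PASSWORD and run where kubectl is configured"
fi
```

Pods that run under a different service account would instead need the secret listed under imagePullSecrets in their pod spec.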