DeepOps on vScaler POC
The Ubuntu image with the vGPU driver and all the components required for a simple Kubernetes deployment with GPU support (1 master, 1 worker) are available in the ireland.south1 region. Below is a description of how to use them.
First, as an admin, share the DGX image with your project:
[root@lhc-headnode ~]# source /etc/kolla/admin-openrc.sh
[root@lhc-headnode ~]# openstack image add project ubuntu-software-config-dgx <your-project-name>
[root@lhc-headnode ~]# openstack image member list ubuntu-software-config-dgx
The last command will show the image membership in the "pending" state. Log in as your regular user and accept the image either through Horizon or with the following command:
$ openstack image set --accept ubuntu-software-config-dgx
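As an optional sanity check (not required for the rest of the walkthrough), the shared image should now show up in your project's image list:
$ openstack image list
$ openstack image show ubuntu-software-config-dgx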
This image-sharing step is needed for now because the vGPU driver and the settings for connecting to our license server are baked into the image.
Next, in the dashboard, head to Container Infra -> Cluster Templates, where you will see the "deepops-poc-template" template. It was created with the following command:
[root@lhc-headnode ~]# openstack coe cluster template create --coe kubernetes --image ubuntu-software-config-dgx --external-network public1 --flavor g1.large.1xk80 --master-flavor m1.medium --docker-storage-driver overlay --public --floating-ip-disabled deepops-poc-template
Create a cluster from the template by clicking "Create Cluster" next to it and specifying the cluster name and your keypair (leave all other parameters at their defaults). You can also create a cluster with these commands:
$ openstack coe cluster template list
$ openstack coe cluster create --cluster-template deepops-poc-template --keypair <your-keypair> <cluster-name>
$ openstack coe cluster list
Now, head to Orchestration -> Stacks and wait for the stack associated with the cluster to transition into the "Create Complete" state. When this is done, check the list of your instances in Compute -> Instances and find the master node of the cluster. Assign a floating IP to it and SSH into it.
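The same can be done from the CLI. Below is a minimal sketch, assuming the external network public1 used by the template, the master instance name as shown in the instance list, and the image's default ubuntu login user:
$ openstack floating ip create public1
$ openstack server list
$ openstack server add floating ip <master-instance-name> <floating-ip-address>
$ ssh ubuntu@<floating-ip-address>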
Confirm Kubernetes works by launching a test pod:
kubectl get nodes
git clone https://github.com/NVIDIA/deepops
cd deepops/
kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v1.12/nvidia-device-plugin.yml
kubectl apply -f tests/gpu-test-job.yml
kubectl get pods
Wait for the pod to become ready and run these commands to confirm the pod can access the vGPU:
kubectl describe pod gpu-pod
kubectl exec -ti gpu-pod -- nvidia-smi
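Rather than polling kubectl get pods by hand, you can also block until the pod reports ready (assuming kubectl 1.11+, where kubectl wait is available):
kubectl wait --for=condition=Ready pod/gpu-pod --timeout=300s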