DeepOps on vScaler POC
The Ubuntu image with the vGPU driver and all the components required for a simple Kubernetes deployment with GPU support (1 master, 1 worker) are available in the ireland.south1 region. Below is a description of how to use them.
First, as an admin, share the DGX image with your project:
[root@lhc-headnode ~]# source /etc/kolla/admin-openrc.sh
[root@lhc-headnode ~]# openstack image add project ubuntu-software-config-dgx <your-project-name>
[root@lhc-headnode ~]# openstack image member list ubuntu-software-config-dgx
The last command will show the image membership in the "pending" state. Log in as your regular user and accept the image either through Horizon or with the following command:
$ openstack image set --accept ubuntu-software-config-dgx
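As an optional sanity check (not required for the rest of the walkthrough), the shared image should now show up in your project's image list:
$ openstack image list
$ openstack image show ubuntu-software-config-dgx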
This image-sharing step is needed for now because the vGPU driver and the settings for connecting to our license server are baked into the image.
Next, in the dashboard, head to Container Infra -> Cluster Templates, where you will see the "deepops-poc-template" template. It was created with the following command:
[root@lhc-headnode ~]# openstack coe cluster template create --coe kubernetes --image ubuntu-software-config-dgx --external-network public1 --flavor g1.large.1xk80 --master-flavor m1.medium --docker-storage-driver overlay --public --floating-ip-disabled deepops-poc-template
Create a cluster from the template by clicking "Create Cluster" next to it and specifying the cluster name and your keypair (leave all other parameters at their defaults). You can also create a cluster with these commands:
$ openstack coe cluster template list
$ openstack coe cluster create --cluster-template deepops-poc-template --keypair <your-keypair> <cluster-name>
$ openstack coe cluster list
Now, head to Orchestration -> Stacks and wait for the stack associated with the cluster to transition into the "Create Complete" state. When this is done, check the list of your instances in Compute -> Instances and find the master node of the cluster. Assign a floating IP to it and SSH into it.
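The same can be done from the CLI. Below is a minimal sketch, assuming the external network public1 used by the template, the master instance name as shown in the instance list, and the image's default ubuntu login user:
$ openstack floating ip create public1
$ openstack server list
$ openstack server add floating ip <master-instance-name> <floating-ip-address>
$ ssh ubuntu@<floating-ip-address>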
Confirm Kubernetes works by launching a test pod:
kubectl get nodes
git clone https://github.com/NVIDIA/deepops
cd deepops/
kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v1.12/nvidia-device-plugin.yml
kubectl apply -f tests/gpu-test-job.yml
kubectl get pods
Wait for the pod to become ready and run these commands to confirm the pod can access the vGPU:
kubectl describe pod gpu-pod
kubectl exec -ti gpu-pod -- nvidia-smi
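Rather than polling kubectl get pods by hand, you can also block until the pod reports ready (assuming kubectl 1.11+, where kubectl wait is available):
kubectl wait --for=condition=Ready pod/gpu-pod --timeout=300s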