In previous blogs I worked with Rook/Ceph on Kubernetes, demonstrating how to set up a Ceph cluster and even replace failed OSDs. With that in mind, I wanted to shift gears a bit and bring things more into alignment with OpenShift and Container Native Virtualization (CNV).
The following blog will guide us through a simple OpenShift deployment with Rook/Ceph and CNV configured. I will also demonstrate the use of a Rook PVC that provides the back-end storage for a CNV-deployed virtual instance.
The lab configuration consists of four virtual machines: one node acts as both master and compute, and the other three nodes are compute only. Each node has a base install of Red Hat Enterprise Linux 7, and the physical host they run on allows for nested virtualization.
Before we start installing any software, let's do a bit of user setup to ensure the install runs smoothly. The next few steps need to be done on all nodes so that a user called origin (this could be any non-root user) is created and has sudo rights without the use of a password:
# useradd origin
# passwd origin
# echo -e 'Defaults:origin !requiretty\norigin ALL = (root) NOPASSWD:ALL' | tee /etc/sudoers.d/openshift
# chmod 440 /etc/sudoers.d/openshift
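To quickly confirm the sudo configuration took effect, switch to the origin user and verify passwordless sudo works (an optional sanity check, shown here on the master but worth repeating on every node):
# su - origin
[origin@ocp-master ~]$ sudo -n true && echo "passwordless sudo is working"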
Then we need to perform the following steps to set up keyless authentication for the origin user from the master node to the rest of the nodes that will make up the cluster:
# ssh-keygen -q -N ""
# vi /home/origin/.ssh/config
Host ocp-master
  Hostname ocp-master.schmaustech.com
  User origin
Host ocp-node1
  Hostname ocp-node1.schmaustech.com
  User origin
Host ocp-node2
  Hostname ocp-node2.schmaustech.com
  User origin
Host ocp-node3
  Hostname ocp-node3.schmaustech.com
  User origin
# chmod 600 /home/origin/.ssh/config
# ssh-copy-id ocp-master
# ssh-copy-id ocp-node1
# ssh-copy-id ocp-node2
# ssh-copy-id ocp-node3
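With the keys copied over, a quick loop confirms the origin user can reach every node without a password prompt (optional, using the host aliases we just defined):
[origin@ocp-master ~]$ for node in ocp-master ocp-node1 ocp-node2 ocp-node3; do ssh $node hostname; done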
Now we can move on to enabling the necessary repositories on all nodes so we have access to the packages we will need for installation:
[origin@ocp-master ~]$ sudo subscription-manager repos --enable=rhel-7-server-rpms --enable=rhel-7-server-extras-rpms --enable=rhel-7-server-rh-common-rpms --enable=rhel-7-server-ose-3.11-rpms --enable=rhel-7-server-ansible-2.6-rpms --enable=rhel-7-server-cnv-1.4-tech-preview-rpms
Next, let's install the initially required packages on all the nodes:
[origin@ocp-master ~]$ sudo yum -y install openshift-ansible docker-1.13.1 kubevirt-ansible kubevirt-virtctl
On the master node, let's configure the Ansible hosts file for our OpenShift installation. The following is the example I used; I simply replaced /etc/ansible/hosts with it.
[OSEv3:children]
masters
nodes
etcd
[OSEv3:vars]
# admin user created in previous section
ansible_ssh_user=origin
ansible_become=true
oreg_url=registry.access.redhat.com/openshift3/ose-${component}:${version}
openshift_deployment_type=openshift-enterprise
# use HTPasswd for authentication
openshift_master_identity_providers=[{'name': 'htpasswd_auth', 'login': 'true', 'challenge': 'true', 'kind': 'HTPasswdPasswordIdentityProvider'}]
# define default sub-domain for Master node
openshift_master_default_subdomain=apps.schmaustech.com
# allow unencrypted connection within cluster
openshift_docker_insecure_registries=172.30.0.0/16
[masters]
ocp-master.schmaustech.com openshift_schedulable=true containerized=false
[etcd]
ocp-master.schmaustech.com
[nodes]
# defined values for [openshift_node_group_name] in the file below
# [/usr/share/ansible/openshift-ansible/roles/openshift_facts/defaults/main.yml]
ocp-master.schmaustech.com openshift_node_group_name='node-config-all-in-one'
ocp-node1.schmaustech.com openshift_node_group_name='node-config-compute'
ocp-node2.schmaustech.com openshift_node_group_name='node-config-compute'
ocp-node3.schmaustech.com openshift_node_group_name='node-config-compute'
With the Ansible hosts file in place, we are ready to run the OpenShift prerequisites playbook:
[origin@ocp-master ~]$ ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/prerequisites.yml
Once the prerequisites playbook executes successfully, we can run the OpenShift deploy cluster playbook:
[origin@ocp-master ~]$ ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/deploy_cluster.yml
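Because the inventory uses the HTPasswd identity provider, you may also want to create a user once the deployment finishes so you can log into the web console. The following is a rough sketch, assuming the default /etc/origin/master/htpasswd file location in OpenShift 3.11, that the httpd-tools package is installed, and that you are logged in with cluster-admin rights for the policy step:
[origin@ocp-master ~]$ sudo htpasswd -c /etc/origin/master/htpasswd admin
[origin@ocp-master ~]$ oc adm policy add-cluster-role-to-user cluster-admin admin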
Let's validate that OpenShift is up and running:
[origin@ocp-master ~]$ oc get nodes
NAME STATUS ROLES AGE VERSION
ocp-master Ready compute,infra,master 15h v1.11.0+d4cacc0
ocp-node1 Ready compute 14h v1.11.0+d4cacc0
ocp-node2 Ready compute 14h v1.11.0+d4cacc0
ocp-node3 Ready compute 14h v1.11.0+d4cacc0
[origin@ocp-master ~]$ oc get pods --all-namespaces -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE
default docker-registry-1-g4hgd 1/1 Running 0 14h 10.128.0.4 ocp-master <none>
default registry-console-1-zwhrd 1/1 Running 0 14h 10.128.0.6 ocp-master <none>
default router-1-v8pkp 1/1 Running 0 14h 192.168.3.100 ocp-master <none>
kube-service-catalog apiserver-gxjst 1/1 Running 0 14h 10.128.0.17 ocp-master <none>
kube-service-catalog controller-manager-2v6qs 1/1 Running 3 14h 10.128.0.18 ocp-master <none>
openshift-ansible-service-broker asb-1-d8clq 1/1 Running 0 14h 10.128.0.21 ocp-master <none>
openshift-console console-566f847459-pk52j 1/1 Running 0 14h 10.128.0.12 ocp-master <none>
openshift-monitoring alertmanager-main-0 3/3 Running 0 14h 10.128.0.14 ocp-master <none>
openshift-monitoring alertmanager-main-1 3/3 Running 0 14h 10.128.0.15 ocp-master <none>
openshift-monitoring alertmanager-main-2 3/3 Running 0 14h 10.128.0.16 ocp-master <none>
openshift-monitoring cluster-monitoring-operator-79d6c544f5-c8rfs 1/1 Running 0 14h 10.128.0.7 ocp-master <none>
openshift-monitoring grafana-8497b48bd5-bqzxb 2/2 Running 0 14h 10.128.0.10 ocp-master <none>
openshift-monitoring kube-state-metrics-7d8b57fc8f-ktdq4 3/3 Running 0 14h 10.128.0.19 ocp-master <none>
openshift-monitoring node-exporter-5gmbc 2/2 Running 0 14h 192.168.3.103 ocp-node3 <none>
openshift-monitoring node-exporter-fxthd 2/2 Running 0 14h 192.168.3.102 ocp-node2 <none>
openshift-monitoring node-exporter-gj27b 2/2 Running 0 14h 192.168.3.101 ocp-node1 <none>
openshift-monitoring node-exporter-r6vjs 2/2 Running 0 14h 192.168.3.100 ocp-master <none>
openshift-monitoring prometheus-k8s-0 4/4 Running 1 14h 10.128.0.11 ocp-master <none>
openshift-monitoring prometheus-k8s-1 4/4 Running 1 14h 10.128.0.13 ocp-master <none>
openshift-monitoring prometheus-operator-5677fb6f87-4czth 1/1 Running 0 14h 10.128.0.8 ocp-master <none>
openshift-node sync-7rqcb 1/1 Running 0 14h 192.168.3.103 ocp-node3 <none>
openshift-node sync-829ql 1/1 Running 0 14h 192.168.3.101 ocp-node1 <none>
openshift-node sync-mwq6v 1/1 Running 0 14h 192.168.3.102 ocp-node2 <none>
openshift-node sync-vc4hw 1/1 Running 0 15h 192.168.3.100 ocp-master <none>
openshift-sdn ovs-n55b8 1/1 Running 0 14h 192.168.3.101 ocp-node1 <none>
openshift-sdn ovs-nvtgq 1/1 Running 0 14h 192.168.3.103 ocp-node3 <none>
openshift-sdn ovs-t8dgh 1/1 Running 0 14h 192.168.3.102 ocp-node2 <none>
openshift-sdn ovs-wgw2v 1/1 Running 0 15h 192.168.3.100 ocp-master <none>
openshift-sdn sdn-7r9kn 1/1 Running 0 14h 192.168.3.101 ocp-node1 <none>
openshift-sdn sdn-89284 1/1 Running 0 15h 192.168.3.100 ocp-master <none>
openshift-sdn sdn-hmgjg 1/1 Running 0 14h 192.168.3.103 ocp-node3 <none>
openshift-sdn sdn-n7lzh 1/1 Running 0 14h 192.168.3.102 ocp-node2 <none>
openshift-template-service-broker apiserver-md5sr 1/1 Running 0 14h 10.128.0.22 ocp-master <none>
openshift-web-console webconsole-674f79b6fc-cjrhw 1/1 Running 0 14h 10.128.0.9 ocp-master <none>
With OpenShift up and running, we can move on to installing the Rook/Ceph cluster. The first step is to clone the Rook Git repo to the master node and adjust the kubelet plugins path. Please note that I am cloning a colleague's Rook repo here and not cloning directly from the Rook project:
[origin@ocp-master ~]$ git clone https://github.com/ksingh7/ocp4-rook.git
[origin@ocp-master ~]$ sed -i.bak s+/etc/kubernetes/kubelet-plugins/volume/exec+/usr/libexec/kubernetes/kubelet-plugins/volume/exec+g /home/origin/ocp4-rook/ceph/operator.yaml
With the repository cloned, we can now apply the security context constraints needed by the Rook pods using scc.yaml and then launch the Rook operator with operator.yaml:
[origin@ocp-master ~]$ oc create -f /home/origin/ocp4-rook/ceph/scc.yaml
[origin@ocp-master ~]$ oc create -f /home/origin/ocp4-rook/ceph/operator.yaml
Let's validate that the Rook operator came up:
[origin@ocp-master ~]$ oc get pods -n rook-ceph-system
NAME READY STATUS RESTARTS AGE
rook-ceph-agent-77x5n 1/1 Running 0 1h
rook-ceph-agent-cdvqr 1/1 Running 0 1h
rook-ceph-agent-gz7tl 1/1 Running 0 1h
rook-ceph-agent-rsbwh 1/1 Running 0 1h
rook-ceph-operator-b76466dcd-zmscb 1/1 Running 0 1h
rook-discover-6p5ht 1/1 Running 0 1h
rook-discover-fnrf4 1/1 Running 0 1h
rook-discover-grr5w 1/1 Running 0 1h
rook-discover-mllt7 1/1 Running 0 1h
Once the operator is up, we can proceed with deploying the Ceph cluster, and once that is up, deploy the Ceph toolbox pod:
[origin@ocp-master ~]$ oc create -f /home/origin/ocp4-rook/ceph/cluster.yaml
[origin@ocp-master ~]$ oc create -f /home/origin/ocp4-rook/ceph/toolbox.yaml
Let's validate that the Ceph cluster is up:
[origin@ocp-master ~]$ oc get pods -n rook-ceph
NAME READY STATUS RESTARTS AGE
rook-ceph-mgr-a-785ddd6d6c-d4w56 1/1 Running 0 1h
rook-ceph-mon-a-67855c796b-sdvqm 1/1 Running 0 1h
rook-ceph-mon-b-6d58cd7656-xkrdz 1/1 Running 0 1h
rook-ceph-mon-c-869b8d9d9-m7544 1/1 Running 0 1h
rook-ceph-osd-0-d6cbd5776-987p9 1/1 Running 0 1h
rook-ceph-osd-1-cfddf997-pzq69 1/1 Running 0 1h
rook-ceph-osd-2-79fc94c6d5-krtnj 1/1 Running 0 1h
rook-ceph-osd-3-f9b55c4d6-7jp7c 1/1 Running 0 1h
rook-ceph-osd-prepare-ocp-master-ztmhs 0/2 Completed 0 1h
rook-ceph-osd-prepare-ocp-node1-mgbcd 0/2 Completed 0 1h
rook-ceph-osd-prepare-ocp-node2-98rtw 0/2 Completed 0 1h
rook-ceph-osd-prepare-ocp-node3-ngscg 0/2 Completed 0 1h
rook-ceph-tools 1/1 Running 0 1h
Let's also validate from the Ceph toolbox that the cluster health is OK:
[origin@ocp-master ~]$ oc -n rook-ceph rsh rook-ceph-tools
sh-4.2# ceph status
cluster:
id: 6ddab3e4-1730-412f-89b8-0738708adac8
health: HEALTH_OK
services:
mon: 3 daemons, quorum b,a,c
mgr: a(active)
osd: 4 osds: 4 up, 4 in
data:
pools: 1 pools, 100 pgs
objects: 281 objects, 1.1 GiB
usage: 51 GiB used, 169 GiB / 220 GiB avail
pgs: 100 active+clean
Now that we have confirmed the Ceph cluster is deployed, let's configure a Ceph storage class and make it the default storage class for the environment:
[origin@ocp-master ~]$ oc create -f /home/origin/ocp4-rook/ceph/storageclass.yaml
[origin@ocp-master ~]$ oc patch storageclass rook-ceph-block -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
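For context, the storageclass.yaml being applied above defines a replicated Ceph block pool plus a storage class that uses the Rook flex volume provisioner. Your copy may differ, but it roughly mirrors the upstream Rook examples along these lines (the pool name, replica count, and fstype shown here are the usual defaults and are assumptions on my part):
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: replicapool
  namespace: rook-ceph
spec:
  replicated:
    size: 3
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-ceph-block
provisioner: ceph.rook.io/block
parameters:
  blockPool: replicapool
  clusterNamespace: rook-ceph
  fstype: xfs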
Now if we display the storage classes, we can see Rook/Ceph is our default:
[origin@ocp-master ~]$ oc get storageclass
NAME PROVISIONER AGE
rook-ceph-block (default) ceph.rook.io/block 6h
Proceeding with our stack installation, let's get CNV installed. Thanks to the Ansible inventory we built earlier for OpenShift, this is a relatively easy task:
[origin@ocp-master ~]$ oc login -u system:admin
[origin@ocp-master ~]$ cd /usr/share/ansible/kubevirt-ansible
[origin@ocp-master ~]$ ansible-playbook -i /etc/ansible/hosts -e @vars/cnv.yml playbooks/kubevirt.yml -e apb_action=provision
Once the installation completes, let's run the following command to ensure the pods for CNV are up:
[origin@ocp-master ~]$ oc get pods --all-namespaces -o wide|egrep "kubevirt|cdi"
cdi cdi-apiserver-7bfd97d585-tqjgt 1/1 Running 0 6h 10.129.0.11 ocp-node3
cdi cdi-deployment-6689fcb476-4klcj 1/1 Running 0 6h 10.131.0.12 ocp-node1
cdi cdi-operator-5889d7588c-wvgl4 1/1 Running 0 6h 10.130.0.12 ocp-node2
cdi cdi-uploadproxy-79c9fb9f59-pkskw 1/1 Running 0 6h 10.129.0.13 ocp-node3
cdi virt-launcher-f29vm-h6mc9 1/1 Running 0 6h 10.129.0.15 ocp-node3
kubevirt-web-ui console-854d4585c8-hgdhv 1/1 Running 0 6h 10.129.0.10 ocp-node3
kubevirt-web-ui kubevirt-web-ui-operator-6b4574bb95-bmsw7 1/1 Running 0 6h 10.130.0.11 ocp-node2
kubevirt kubevirt-cpu-node-labeller-fvx9n 1/1 Running 0 6h 10.128.0.29 ocp-master
kubevirt kubevirt-cpu-node-labeller-jr858 1/1 Running 0 6h 10.131.0.13 ocp-node1
kubevirt kubevirt-cpu-node-labeller-tgq5g 1/1 Running 0 6h 10.129.0.14 ocp-node3
kubevirt kubevirt-cpu-node-labeller-xqpbl 1/1 Running 0 6h 10.130.0.13 ocp-node2
kubevirt virt-api-865b95d544-hg58l 1/1 Running 0 6h 10.129.0.8 ocp-node3
kubevirt virt-api-865b95d544-jrkxh 1/1 Running 0 6h 10.131.0.10 ocp-node1
kubevirt virt-controller-5c89d4978d-q79lh 1/1 Running 0 6h 10.130.0.8 ocp-node2
kubevirt virt-controller-5c89d4978d-t58l7 1/1 Running 0 6h 10.130.0.10 ocp-node2
kubevirt virt-handler-gblbk 1/1 Running 0 6h 10.128.0.28 ocp-master
kubevirt virt-handler-jnwx6 1/1 Running 0 6h 10.130.0.9 ocp-node2
kubevirt virt-handler-r94fb 1/1 Running 0 6h 10.129.0.9 ocp-node3
kubevirt virt-handler-z7775 1/1 Running 0 6h 10.131.0.11 ocp-node1
kubevirt virt-operator-68984b585c-265bq 1/1 Running 0 6h 10.129.0.7 ocp-node3
Now that CNV is up and running, let's pull down a Fedora 29 image and upload it into a PVC on the default storage class, which of course is Rook/Ceph:
[origin@ocp-master ~]$ curl -L -o /home/origin/f29.qcow2 http://ftp.usf.edu/pub/fedora/linux/releases/29/Cloud/x86_64/images/Fedora-Cloud-Base-29-1.2.x86_64.qcow2
[origin@ocp-master ~]$ virtctl image-upload --pvc-name=f29vm --pvc-size=5Gi --image-path=/home/origin/f29.qcow2 --uploadproxy-url=https://`oc describe route cdi-uploadproxy-route|grep Endpoints|cut -f2` --insecure
We can execute the following to see that the PVC has been created:
[origin@ocp-master ~]$ oc get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
f29vm Bound pvc-4815df9e-4987-11e9-a732-525400767d62 5Gi RWO rook-ceph-block 6h
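Since the PVC is backed by Rook, you can also cross-check from the Ceph toolbox that an RBD image was created for it (optional; replicapool is the default pool name from the Rook examples and may be named differently in your cluster):
[origin@ocp-master ~]$ oc -n rook-ceph rsh rook-ceph-tools rbd ls replicapool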
Besides the PVC, we will also need a virtual machine configuration YAML file. The example below is what will be used in this demonstration (saved as /home/origin/f29vm.yaml):
apiVersion: kubevirt.io/v1alpha3
kind: VirtualMachine
metadata:
  creationTimestamp: null
  labels:
    kubevirt-vm: f29vm
  name: f29vm
spec:
  running: true
  template:
    metadata:
      creationTimestamp: null
      labels:
        kubevirt.io/domain: f29vm
    spec:
      domain:
        cpu:
          cores: 2
        devices:
          disks:
          - disk:
              bus: virtio
            name: osdisk
            volumeName: osdisk
          - disk:
              bus: virtio
            name: cloudinitdisk
            volumeName: cloudinitvolume
          interfaces:
          - name: default
            bridge: {}
        resources:
          requests:
            memory: 1024M
      terminationGracePeriodSeconds: 0
      volumes:
      - name: osdisk
        persistentVolumeClaim:
          claimName: f29vm
      - name: cloudinitdisk
        cloudInitNoCloud:
          userData: |-
            #cloud-config
            password: ${PASSWORD}
            disable_root: false
            chpasswd: { expire: False }
            ssh_authorized_keys:
              - "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDUs1KbLraX74mBM/ksoGwbsEejfpCVeMzbW7JLJjGXF8G1jyVAE3T0Uf5mO8nbNOfkjAjw24lxSsEScF2wslBzA5MIm+GB6Z+ZzR55FcRlZeouGVrfLmb67mYc2c/F/mq35TruHdRk2G5Y0+6cf8cfDs414+yiVA0heHQvWNfO7kb1z9kIOhyD6OOwdNT5jK/1O0+p6SdP+pEal51BsEf6GRGYLWc9SLIEcqtjoprnundr5UPvmC1l/pkqFQigMehwhthrdXC4GseWiyj9CnBkccxQCKvHjzko/wqsWGQLwDG3pBsHhthvbY0G5+VPB9a8YV58WJhC6nHpUTDA8jpB origin@ocp-master"
      networks:
      - name: default
        pod: {}
At this point we have all the necessary components to launch our containerized virtual machine instance. The following command creates it using the YAML file from the previous step:
[origin@ocp-master ~]$ oc create -f /home/origin/f29vm.yaml
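Behind the scenes KubeVirt schedules a virt-launcher pod that hosts the VM process, so one quick way to see progress is to watch for that pod before checking the VM objects themselves:
[origin@ocp-master ~]$ oc get pods --all-namespaces -o wide | grep virt-launcher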
There are multiple ways to validate that the virtual machine has been instantiated. I like to do the following to confirm the instance is running and has an IP address:
[origin@ocp-master ~]$ oc get vms
NAME AGE RUNNING VOLUME
f29vm 6h true
[origin@ocp-master ~]$ oc get vmi
NAME AGE PHASE IP NODENAME
f29vm 6h Running 10.129.0.15 ocp-node3
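If the instance IP is not reachable from where you are working, virtctl (installed earlier from the kubevirt-virtctl package) can also attach to the serial console of the VMI:
[origin@ocp-master ~]$ virtctl console f29vm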
One final check is to actually log into the instance, assuming an SSH key was set in the YAML file:
[origin@ocp-master ~]$ ssh -i /home/origin/.ssh/id_rsa -o "StrictHostKeyChecking=no" fedora@10.129.0.15
[fedora@f29vm ~]$ cat /etc/fedora-release
Fedora release 29 (Twenty Nine)
Hopefully this demonstrated how easy it is to get OpenShift, Rook, and CNV up and running, and how one can then leverage Rook storage to provide a backend for the virtual instance that gets spun up in CNV. What is awesome is that I have taken the steps above and put them into a DCI job, so I can automatically rerun the deployment against newer versions of the code base for testing. If you are not familiar with DCI, I will leave you with this teaser link: https://doc.distributed-ci.io/