
Deploy Rook/Ceph Cluster on Dedicated Networks


Recently a colleague of mine was trying to get Rook to deploy a Ceph cluster that used dedicated public and private networks to separate the Ceph replication traffic from the client access traffic to the OSDs of the cluster. In a regular Ceph deployment this is rather trivial, but in the context of Kubernetes it becomes a little more complex given that Rook is deploying the cluster containers. The following is the procedure I applied to ensure my OSDs were listening on the appropriate networks.

Before we get into the steps on how to achieve this configuration, let's quickly take a look at the setup I used. First, I have a three-node Kubernetes configuration (one master with scheduling allowed and two workers):

# kubectl get nodes
NAME STATUS ROLES AGE VERSION
kube-master Ready master 2d22h v1.14.0
kube-node1 Ready worker 2d22h v1.14.0
kube-node2 Ready worker 2d22h v1.14.0

On each of the nodes I have three network interfaces: eth0 on 10.0.0.0/24 (Kubernetes public), eth1 on 192.168.100.0/24 (Ceph private/cluster) and eth2 on 192.168.200.0/24 (Ceph public):

# ip a|grep eth[0-2]
2: eth0: mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
inet 10.0.0.81/24 brd 10.0.0.255 scope global noprefixroute eth0
3: eth1: mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
inet 192.168.100.81/24 brd 192.168.100.255 scope global noprefixroute eth1
4: eth2: mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
inet 192.168.200.81/24 brd 192.168.200.255 scope global noprefixroute eth2

Before we begin, let's look at the current vanilla pods and namespaces on the Kubernetes cluster:

# kubectl get pods --all-namespaces -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kube-system coredns-fb8b8dccf-h6wfn 1/1 Running 0 3d 10.244.1.2 kube-node2
kube-system coredns-fb8b8dccf-mv7p5 1/1 Running 0 3d 10.244.0.7 kube-master
kube-system etcd-kube-master 1/1 Running 0 3d 10.0.0.81 kube-master
kube-system kube-apiserver-kube-master 1/1 Running 0 3d 10.0.0.81 kube-master
kube-system kube-controller-manager-kube-master 1/1 Running 1 3d 10.0.0.81 kube-master
kube-system kube-flannel-ds-amd64-szhg9 1/1 Running 0 3d 10.0.0.83 kube-node2
kube-system kube-flannel-ds-amd64-t4fxs 1/1 Running 0 3d 10.0.0.82 kube-node1
kube-system kube-flannel-ds-amd64-wbsdp 1/1 Running 0 3d 10.0.0.81 kube-master
kube-system kube-proxy-sn7j7 1/1 Running 0 3d 10.0.0.83 kube-node2
kube-system kube-proxy-wtzm5 1/1 Running 0 3d 10.0.0.81 kube-master
kube-system kube-proxy-xlwd9 1/1 Running 0 3d 10.0.0.82 kube-node1
kube-system kube-scheduler-kube-master 1/1 Running 1 3d 10.0.0.81 kube-master

# kubectl get ns
NAME STATUS AGE
default Active 3d
kube-node-lease Active 3d
kube-public Active 3d
kube-system Active 3d

Before we can deploy the cluster we need to create a configmap for the rook-ceph namespace. This namespace is normally created when the cluster is deployed; however, we want specific configuration items to be incorporated into the cluster upon deployment, so we will create the rook-ceph namespace ourselves and apply our configmap to it before deploying.

First, create a configmap file that looks like the following; notice that it references my two Ceph cluster networks. The empty public addr and cluster addr entries are there so the daemons are not pinned to a single address and instead pick their addresses from the two networks. I will save this file with an arbitrary name like config-override.yaml:

apiVersion: v1
kind: ConfigMap
metadata:
  name: rook-config-override
  namespace: rook-ceph
data:
  config: |
    [global]
    public network = 192.168.200.0/24
    cluster network = 192.168.100.0/24
    public addr = ""
    cluster addr = ""

Next I will create the rook-ceph namespace:

# kubectl create namespace rook-ceph
namespace/rook-ceph created
# kubectl get ns
NAME STATUS AGE
default Active 3d1h
kube-node-lease Active 3d1h
kube-public Active 3d1h
kube-system Active 3d1h
rook-ceph Active 5s

Now we can apply the configmap we created to the newly created namespace and validate it's there:

# kubectl create -f config-override.yaml 
configmap/rook-config-override created
# kubectl get configmap -n rook-ceph
NAME DATA AGE
rook-config-override 1 66s
# kubectl describe configmap -n rook-ceph
Name: rook-config-override
Namespace: rook-ceph
Labels: <none>
Annotations: <none>

Data
====
config:
----
[global]
public network = 192.168.200.0/24
cluster network = 192.168.100.0/24
public addr = ""
cluster addr = ""

Events: <none>
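
Note that this override only influences daemons that start after it exists, which is why we create it before deploying the cluster. Once the cluster is up, you can also ask a daemon which values it is actually running with. A rough check from the toolbox, assuming a Mimic or newer Ceph release (option names use underscores there), might look like:

[root@rook-ceph-tools]# ceph config show osd.0 | grep -E 'public_network|cluster_network'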


Before we actually start the deployment we need to update one more thing in our Rook cluster.yaml: change hostNetwork from its default of false to true. With the default pod networking the Ceph daemons would only ever see the pod network interface, so hostNetwork is required for them to bind to the node's dedicated interfaces:

# sed -i 's/hostNetwork: false/hostNetwork: true/g' cluster.yaml
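
For context, in the Rook release I was using this setting lives under the network section of the CephCluster spec in cluster.yaml (later Rook releases have changed this field, so treat the fragment below as a trimmed, illustrative sketch rather than a complete manifest):

apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  dataDirHostPath: /var/lib/rook
  mon:
    count: 3
  network:
    # run the Ceph daemon pods in the host network namespace so they can
    # bind to eth1/eth2 instead of the flannel pod network
    hostNetwork: true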

Now we can begin the process of deploying the Rook/Ceph cluster, which includes launching the operator, cluster and toolbox from the Rook example manifests. I will place sleep statements in between each command to ensure the pods are up before I run the next command (an alternative to the fixed sleeps is sketched after the output below). Also note there will be an error when creating the cluster about the rook-ceph namespace already existing; this is expected since we created that namespace ourselves above:

# kubectl create -f operator.yaml
namespace/rook-ceph-system created
customresourcedefinition.apiextensions.k8s.io/cephclusters.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephfilesystems.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephobjectstores.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephobjectstoreusers.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephblockpools.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/volumes.rook.io created
clusterrole.rbac.authorization.k8s.io/rook-ceph-cluster-mgmt created
role.rbac.authorization.k8s.io/rook-ceph-system created
clusterrole.rbac.authorization.k8s.io/rook-ceph-global created
clusterrole.rbac.authorization.k8s.io/rook-ceph-mgr-cluster created
serviceaccount/rook-ceph-system created
rolebinding.rbac.authorization.k8s.io/rook-ceph-system created
clusterrolebinding.rbac.authorization.k8s.io/rook-ceph-global created
deployment.apps/rook-ceph-operator created
# sleep 60
# kubectl create -f cluster.yaml 
serviceaccount/rook-ceph-osd created
serviceaccount/rook-ceph-mgr created
role.rbac.authorization.k8s.io/rook-ceph-osd created
role.rbac.authorization.k8s.io/rook-ceph-mgr-system created
role.rbac.authorization.k8s.io/rook-ceph-mgr created
rolebinding.rbac.authorization.k8s.io/rook-ceph-cluster-mgmt created
rolebinding.rbac.authorization.k8s.io/rook-ceph-osd created
rolebinding.rbac.authorization.k8s.io/rook-ceph-mgr created
rolebinding.rbac.authorization.k8s.io/rook-ceph-mgr-system created
rolebinding.rbac.authorization.k8s.io/rook-ceph-mgr-cluster created
cephcluster.ceph.rook.io/rook-ceph created
Error from server (AlreadyExists): error when creating "cluster.yaml": namespaces "rook-ceph" already exists
# sleep 60
# kubectl create -f toolbox.yaml 
pod/rook-ceph-tools created
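
If you would rather not guess at sleep durations, one rough alternative is to wait for the operator deployment to finish rolling out and then watch the rook-ceph pods until the mons and OSDs show Running before creating the toolbox. These are plain kubectl commands against the names created above:

# kubectl -n rook-ceph-system rollout status deployment/rook-ceph-operator
# kubectl -n rook-ceph get pods -w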

Let's validate that the Rook/Ceph operator, cluster and toolbox are up and running:

# kubectl get pods --all-namespaces -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kube-system coredns-fb8b8dccf-h6wfn 1/1 Running 0 3d1h 10.244.1.2 kube-node2 <none> <none>
kube-system coredns-fb8b8dccf-mv7p5 1/1 Running 0 3d1h 10.244.0.7 kube-master <none> <none>
kube-system etcd-kube-master 1/1 Running 0 3d1h 10.0.0.81 kube-master <none> <none>
kube-system kube-apiserver-kube-master 1/1 Running 0 3d1h 10.0.0.81 kube-master <none> <none>
kube-system kube-controller-manager-kube-master 1/1 Running 1 3d1h 10.0.0.81 kube-master <none> <none>
kube-system kube-flannel-ds-amd64-szhg9 1/1 Running 0 3d1h 10.0.0.83 kube-node2 <none> <none>
kube-system kube-flannel-ds-amd64-t4fxs 1/1 Running 0 3d1h 10.0.0.82 kube-node1 <none> <none>
kube-system kube-flannel-ds-amd64-wbsdp 1/1 Running 0 3d1h 10.0.0.81 kube-master <none> <none>
kube-system kube-proxy-sn7j7 1/1 Running 0 3d1h 10.0.0.83 kube-node2 <none> <none>
kube-system kube-proxy-wtzm5 1/1 Running 0 3d1h 10.0.0.81 kube-master <none> <none>
kube-system kube-proxy-xlwd9 1/1 Running 0 3d1h 10.0.0.82 kube-node1 <none> <none>
kube-system kube-scheduler-kube-master 1/1 Running 1 3d1h 10.0.0.81 kube-master <none> <none>
rook-ceph-system rook-ceph-agent-55fqp 1/1 Running 0 17m 10.0.0.83 kube-node2 <none> <none>
rook-ceph-system rook-ceph-agent-5v9v5 1/1 Running 0 17m 10.0.0.81 kube-master <none> <none>
rook-ceph-system rook-ceph-agent-spx29 1/1 Running 0 17m 10.0.0.82 kube-node1 <none> <none>
rook-ceph-system rook-ceph-operator-57547fc866-ltp8z 1/1 Running 0 18m 10.244.2.4 kube-node1 <none> <none>
rook-ceph-system rook-discover-brxmt 1/1 Running 0 17m 10.244.2.5 kube-node1 <none> <none>
rook-ceph-system rook-discover-hl748 1/1 Running 0 17m 10.244.1.8 kube-node2 <none> <none>
rook-ceph-system rook-discover-qj5kd 1/1 Running 0 17m 10.244.0.9 kube-master <none> <none>
rook-ceph rook-ceph-mgr-a-5dbb44d7f8-vzs46 1/1 Running 0 16m 10.0.0.82 kube-node1 <none> <none>
rook-ceph rook-ceph-mon-a-5fb9568cb4-gvqln 1/1 Running 0 16m 10.0.0.81 kube-master <none> <none>
rook-ceph rook-ceph-mon-b-b65c555bf-vz7ps 1/1 Running 0 16m 10.0.0.82 kube-node1 <none> <none>
rook-ceph rook-ceph-mon-c-69cf744c4d-8g4l6 1/1 Running 0 16m 10.0.0.83 kube-node2 <none> <none>
rook-ceph rook-ceph-osd-0-77499f547-d2vjx 1/1 Running 0 15m 10.0.0.81 kube-master <none> <none>
rook-ceph rook-ceph-osd-1-698f76d786-lqn4w 1/1 Running 0 15m 10.0.0.82 kube-node1 <none> <none>
rook-ceph rook-ceph-osd-2-558c59d577-wfdlr 1/1 Running 0 15m 10.0.0.83 kube-node2 <none> <none>
rook-ceph rook-ceph-osd-prepare-kube-master-p55sw 0/2 Completed 0 15m 10.0.0.81 kube-master <none> <none>
rook-ceph rook-ceph-osd-prepare-kube-node1-q7scn 0/2 Completed 0 15m 10.0.0.82 kube-node1 <none> <none>
rook-ceph rook-ceph-osd-prepare-kube-node2-8rm4d 0/2 Completed 0 15m 10.0.0.83 kube-node2 <none> <none>
rook-ceph rook-ceph-tools 1/1 Running 0 3m24s 10.244.1.9 kube-node2 <none> <none>
# kubectl -n rook-ceph exec -it rook-ceph-tools -- /bin/bash
bash: warning: setlocale: LC_CTYPE: cannot change locale (en_US.UTF-8): No such file or directory
bash: warning: setlocale: LC_COLLATE: cannot change locale (en_US.UTF-8): No such file or directory
bash: warning: setlocale: LC_MESSAGES: cannot change locale (en_US.UTF-8): No such file or directory
bash: warning: setlocale: LC_NUMERIC: cannot change locale (en_US.UTF-8): No such file or directory
bash: warning: setlocale: LC_TIME: cannot change locale (en_US.UTF-8): No such file or directory
[root@rook-ceph-tools /]# ceph status
cluster:
id: b58f2a5c-2fc7-43e7-b410-2d541e78a90e
health: HEALTH_OK

services:
mon: 3 daemons, quorum a,b,c
mgr: a(active)
osd: 3 osds: 3 up, 3 in

data:
pools: 0 pools, 0 pgs
objects: 0 objects, 0 B
usage: 57 GiB used, 49 GiB / 105 GiB avail
pgs:

[root@rook-ceph-tools /]# exit
exit

At this point we have a fully operational cluster, but is it really using the dedicated networks for OSD public and private traffic? Let's explore that a bit further by first running the netstat command on any node in the cluster that has an OSD pod running. Since my cluster is small I will show all three nodes below:

[root@kube-master]# netstat -tulpn | grep LISTEN | grep osd
tcp 0 0 192.168.100.81:6800 0.0.0.0:* LISTEN 29719/ceph-osd
tcp 0 0 192.168.200.81:6800 0.0.0.0:* LISTEN 29719/ceph-osd
tcp 0 0 192.168.200.81:6801 0.0.0.0:* LISTEN 29719/ceph-osd
tcp 0 0 192.168.100.81:6801 0.0.0.0:* LISTEN 29719/ceph-osd

[root@kube-node1]# netstat -tulpn | grep LISTEN | grep osd
tcp 0 0 192.168.100.82:6800 0.0.0.0:* LISTEN 18770/ceph-osd
tcp 0 0 192.168.100.82:6801 0.0.0.0:* LISTEN 18770/ceph-osd
tcp 0 0 192.168.200.82:6801 0.0.0.0:* LISTEN 18770/ceph-osd
tcp 0 0 192.168.200.82:6802 0.0.0.0:* LISTEN 18770/ceph-osd

[root@kube-node2]# netstat -tulpn | grep LISTEN | grep osd
tcp 0 0 192.168.100.83:6800 0.0.0.0:* LISTEN 22659/ceph-osd
tcp 0 0 192.168.200.83:6800 0.0.0.0:* LISTEN 22659/ceph-osd
tcp 0 0 192.168.200.83:6801 0.0.0.0:* LISTEN 22659/ceph-osd
tcp 0 0 192.168.100.83:6801 0.0.0.0:* LISTEN 22659/ceph-osd

From the above we can see the OSD processes listening on the public and private networks we configured in the configmap. However, let's confirm further by going back into the toolbox and doing a ceph osd dump; each osd line in the dump should list addresses on both the 192.168.200.0/24 public network and the 192.168.100.0/24 cluster network:

# kubectl -n rook-ceph exec -it rook-ceph-tools -- /bin/bash
bash: warning: setlocale: LC_CTYPE: cannot change locale (en_US.UTF-8): No such file or directory
bash: warning: setlocale: LC_COLLATE: cannot change locale (en_US.UTF-8): No such file or directory
bash: warning: setlocale: LC_MESSAGES: cannot change locale (en_US.UTF-8): No such file or directory
bash: warning: setlocale: LC_NUMERIC: cannot change locale (en_US.UTF-8): No such file or directory
bash: warning: setlocale: LC_TIME: cannot change locale (en_US.UTF-8): No such file or directory

[root@rook-ceph-tools]# ceph osd dump
epoch 14
fsid 05a8b767-e3e8-42aa-b792-69f479c807f7
created 2019-04-02 13:24:24.549423
modified 2019-04-02 13:25:28.441850
flags sortbitwise,recovery_deletes,purged_snapdirs
crush_version 7
full_ratio 0.95
backfillfull_ratio 0.9
nearfull_ratio 0.85
require_min_compat_client jewel
min_compat_client firefly
require_osd_release mimic
max_osd 3
osd.0 up in weight 1 up_from 11 up_thru 0 down_at 0 last_clean_interval [0,0) 192.168.200.81:6800/29719 192.168.100.81:6800/29719 192.168.100.81:6801/29719 192.168.200.81:6801/29719 exists,up 2feb0edf-6652-4148-8264-6ba52d04ff80
osd.1 up in weight 1 up_from 14 up_thru 0 down_at 0 last_clean_interval [0,0) 192.168.200.82:6801/18770 192.168.100.82:6800/18770 192.168.100.82:6801/18770 192.168.200.82:6802/18770 exists,up f8df61b4-4ac8-4705-9f97-eb09a1cc0d6c
osd.2 up in weight 1 up_from 14 up_thru 0 down_at 0 last_clean_interval [0,0) 192.168.200.83:6800/22659 192.168.100.83:6800/22659 192.168.100.83:6801/22659 192.168.200.83:6801/22659 exists,up db555c80-9d81-4662-aed9-4bce1c0d5d78
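
Since those osd lines are dense, a quick and purely illustrative way to pull out just the addresses is to grep the dump for the two subnets; every OSD should contribute addresses from both ranges:

[root@rook-ceph-tools]# ceph osd dump | grep '^osd' | grep -oE '192\.168\.(100|200)\.[0-9]+'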

As you can see, it is fairly straightforward to configure Rook to deploy a Ceph cluster using segmented networks, ensuring the replication traffic runs on a dedicated network and does not interfere with client-facing performance. Hopefully this quick demonstration showed that.
