Install Cassandra
Much thanks to IBM for providing a starting point for deploying Cassandra as a StatefulSet. The difference in this case is that we are going to deploy one StatefulSet per Kubernetes cluster, and these clusters share nothing except a network route between them. (Kubernetes Federation does not yet support StatefulSets; once it does, this should become simpler.) The code is available from my GitHub repository.
There are several Kubernetes-specific problems we need to solve here. The first is node identity: how will the Cassandra nodes (i.e., pods) uniquely identify themselves? Remember that in Kubernetes, the IP address of a pod can change whenever Kubernetes restarts it, so it is important to ensure that the nodes have unique, stable FQDNs.
The rules for assigning FQDNs to members of a StatefulSet are described in the "Pod Identity" section of the Kubernetes documentation. The formula is:
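$(statefulset name)-$(ordinal).$(service name).$(namespace).svc.cluster.local

where $(service name) is the headless service that governs the StatefulSet (this is the form given in the Kubernetes StatefulSet documentation).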
So we need to make one of these components unique to each cluster. I chose the namespace, since it allows us to define CoreDNS stub domains that point to the different cloud providers in a service-agnostic way. In other words, we can use the stub domain azure-1.svc.cluster.local for all services in the Azure cluster, and aws-1.svc.cluster.local for all services in the AWS cluster.
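For example, assuming the headless service governing each StatefulSet is named cassandra (the deploy output below shows service/cassandra created), the first pod in each cluster gets a stable FQDN like:

cassandra-0.cassandra.aws-1.svc.cluster.local     (AWS cluster)
cassandra-0.cassandra.azure-1.svc.cluster.local   (Azure cluster)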
So each cluster should deploy a StatefulSet in a different namespace. I used aws-1 and azure-1 to allow for expansion to multiple data centers per cloud.
$ kubectl config use-context kube-azure
$ cat <<EOF | kubectl create -f -
kind: Namespace
apiVersion: v1
metadata:
  name: azure-1
  labels:
    name: azure-1
EOF
$ kubectl config use-context kube-aws
$ cat <<EOF | kubectl create -f -
kind: Namespace
apiVersion: v1
metadata:
  name: aws-1
  labels:
    name: aws-1
EOF
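A quick check that both namespaces exist, using kubectl's --context flag to avoid switching contexts:

$ kubectl --context kube-aws get namespace aws-1
$ kubectl --context kube-azure get namespace azure-1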
In order to set up the kube-dns service stub domains to forward cross-cluster DNS requests to the DNS service on the remote cluster, we need stable IP addresses for each cluster's DNS. This can be achieved by configuring kube-dns as a NodePort service and using the IP addresses of the Kubernetes worker nodes.
Change kube-dns to type NodePort on each cluster:
$ kubectl config use-context kube-aws
$ kubectl patch service kube-dns --namespace=kube-system -p '{"spec": {"type": "NodePort"}}'
$ kubectl config use-context kube-azure
$ kubectl patch service kube-dns --namespace=kube-system -p '{"spec": {"type": "NodePort"}}'
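To sanity-check the forwarding targets before generating the ConfigMaps, you can look up the NodePort allocated for DNS and the internal IPs of the worker nodes. A sketch, assuming the stock kube-dns service definition with a UDP port named dns:

$ kubectl get service kube-dns --namespace=kube-system \
    -o jsonpath='{.spec.ports[?(@.name=="dns")].nodePort}'
$ kubectl get nodes \
    -o jsonpath='{.items[*].status.addresses[?(@.type=="InternalIP")].address}'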
Run the script generate-dns-configmap-coredns.sh to update the CoreDNS ConfigMaps:
$ ./generate-dns-configmap-coredns.sh
This will generate the ConfigMaps for DNS forwarding between 2 clusters
Context 1: kube-aws
Namespace 1: aws-1
Context 2: kube-azure
Namespace 2: azure-1
Switched to context "kube-azure".
Switched to context "kube-aws".
Updating ConfigMap on cluster kube-aws
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    .:53 {
        errors
        health
        kubernetes cluster.local in-addr.arpa ip6.arpa {
            pods insecure
            upstream
            fallthrough in-addr.arpa ip6.arpa
        }
        prometheus :9153
        proxy . /etc/resolv.conf
        cache 30
        loop
        reload
        loadbalance
        import custom/*.override
    }
    azure-1.svc.cluster.local {
        forward . 10.19.1.5:32478
    }
    import custom/*.server
configmap/coredns configured
Switched to context "kube-aws".
Switched to context "kube-azure".
Updating ConfigMap on cluster kube-azure
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    .:53 {
        errors
        health
        kubernetes cluster.local in-addr.arpa ip6.arpa {
            pods insecure
            upstream
            fallthrough in-addr.arpa ip6.arpa
        }
        prometheus :9153
        proxy . /etc/resolv.conf
        cache 30
        loop
        reload
        loadbalance
        import custom/*.override
    }
    aws-1.svc.cluster.local {
        forward . 10.9.1.111:32275 10.9.2.227:32275 10.9.2.72:32275
    }
    import custom/*.server
configmap/coredns configured
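Once both ConfigMaps are in place (and once Cassandra is deployed below), cross-cluster resolution can be verified from a throwaway pod. A hypothetical check from the AWS cluster, assuming the cassandra service name used later:

$ kubectl run dns-test --rm -it --image=busybox --restart=Never -- \
    nslookup cassandra-0.cassandra.azure-1.svc.cluster.local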
Now that DNS is configured, we can deploy the Cassandra StatefulSets on each cluster. The script deploy-cassandra.sh uses the cassandra.yaml template, replacing the placeholders for context, namespace, datacenter, and storage class. I named the datacenters AWS-1 and AZURE-1.
$ ./deploy-cassandra.sh
This will deploy Cassandra as a 3-node cluster in each cloud provider.
Context 1: kube-aws
Datacenter 1: AWS-1
Namespace 1: aws-1
Storage class 1: gp2
Context 2: azure
Datacenter 2: AZURE-1
Namespace 2: azure-1
Storage class 2: default
Hit ENTER to deploy Cassandra to aws
Switched to context "aws".
service/cassandra created
statefulset.apps/cassandra created
Hit ENTER to deploy Cassandra to azure
Switched to context "azure".
service/cassandra created
statefulset.apps/cassandra created
Note that CASSANDRA_SEEDS includes the cassandra-0 node from each cluster.
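For illustration, the rendered env section for the AWS datacenter might look roughly like this. This is a sketch only: the variable names follow the conventions of the common gcr.io/google-samples/cassandra image, and the actual cassandra.yaml template may differ:

env:
  - name: CASSANDRA_SEEDS
    value: "cassandra-0.cassandra.aws-1.svc.cluster.local,cassandra-0.cassandra.azure-1.svc.cluster.local"
  - name: CASSANDRA_DC
    value: "AWS-1"
  - name: CASSANDRA_RACK
    value: "Rack1"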
The pods may take a while to start up, since they enter an error-retry loop until the persistent volumes are created and attached. Use nodetool status to see when the clusters are ready.
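While waiting, you can also watch the pods directly (assuming the app=cassandra label used by the upstream example manifests):

$ kubectl get pods --namespace=aws-1 -l app=cassandra -w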
$ kubectl config use-context kube-aws
Switched to context "kube-aws".
$ kubectl exec cassandra-0 --namespace=aws-1 -- nodetool status
Datacenter: AWS-1
=================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address     Load        Tokens  Owns (effective)  Host ID                               Rack
UN  10.9.2.249  69.89 KiB   256     35.6%             856df155-8bf8-4656-a504-a509d2816193  Rack1
UN  10.9.1.75   114.48 KiB  256     32.6%             c64d4b37-4114-410a-ac15-6fac91525e13  Rack1
UN  10.9.1.225  103.67 KiB  256     34.4%             64ab45b0-4ac7-4438-80c5-0f83084ef411  Rack1
Datacenter: AZURE-1
===================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address     Load        Tokens  Owns (effective)  Host ID                               Rack
UN  10.19.1.52  93.98 KiB   256     32.9%             04b5411a-ef91-4066-84cc-5ebd2b3ddad8  Rack1
UN  10.19.1.79  108.6 KiB   256     32.3%             0aa07cbb-5e61-4707-9565-5b8e54e9c613  Rack1
UN  10.19.1.13  69.92 KiB   256     32.2%             10aa0560-c9e1-4d83-855a-aca733789937  Rack1
Great! Our multi-datacenter Cassandra cluster is up and running. Next, we’ll run some tests to show that the cluster is working and is tolerant of node failures.