Install Cassandra
Many thanks to IBM for providing a starting point for deploying Cassandra as a StatefulSet. The difference in this case is that we are going to deploy one StatefulSet per Kubernetes cluster, and these clusters share nothing except a network route between them. (Kubernetes Federation does not yet support StatefulSets; once it does, this should become simpler.) The code is available from my GitHub repository.
There are several Kubernetes-specific problems we need to solve here. The first is node identity: how will the Cassandra nodes (i.e. pods) uniquely identify themselves? Remember that in Kubernetes, the IP address of a pod can change whenever Kubernetes restarts or reschedules it, so it is important to ensure that the nodes have unique, stable FQDNs.
The rules for assigning FQDNs to members of a StatefulSet are described here in the section “Pod Identity”. The formula is:

$(pod name).$(service name).$(namespace).svc.cluster.local
So we need to make one of these unique to each cluster. I chose the namespace since it allows us to define CoreDNS stub domains that point to the different cloud providers in a service-agnostic way. In other words, we can use the stub domain azure-1.svc.cluster.local for all services in the Azure cluster, and aws-1.svc.cluster.local for all services in the AWS cluster.
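For example, with the StatefulSet and its headless service both named cassandra (as in the manifests deployed below), the first pod in each cluster is addressable as:

cassandra-0.cassandra.aws-1.svc.cluster.local
cassandra-0.cassandra.azure-1.svc.cluster.local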
So each cluster should deploy a StatefulSet in a different namespace. I used aws-1 and azure-1 to allow for expansion to multiple data centers per cloud.
$ kubectl config use-context kube-azure
$ cat <<EOF | kubectl create -f -
kind: Namespace
apiVersion: v1
metadata:
  name: azure-1
  labels:
    name: azure-1
EOF
$ kubectl config use-context kube-aws
$ cat <<EOF | kubectl create -f -
kind: Namespace
apiVersion: v1
metadata:
  name: aws-1
  labels:
    name: aws-1
EOF
To set up the kube-dns stub domains that forward cross-cluster DNS requests to the DNS service on the remote cluster, we need stable IP addresses for each cluster’s DNS. This can be achieved by exposing kube-dns as a NodePort service and using the IP addresses of the Kubernetes worker nodes.
Change kube-dns to type NodePort on each cluster:
$ kubectl config use-context kube-aws
$ kubectl patch service kube-dns --namespace=kube-system -p '{"spec": {"type": "NodePort"}}'
$ kubectl config use-context kube-azure
$ kubectl patch service kube-dns --namespace=kube-system -p '{"spec": {"type": "NodePort"}}'
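The forward targets in the generated Corefiles below are simply each worker node’s internal IP paired with the DNS service’s NodePort. The generator script discovers these automatically, but if you want to check them yourself, something like the following should work (the jsonpath assumes the DNS port is named dns, as in a standard kube-dns service):

$ kubectl get nodes -o jsonpath='{.items[*].status.addresses[?(@.type=="InternalIP")].address}'
$ kubectl get service kube-dns --namespace=kube-system -o jsonpath='{.spec.ports[?(@.name=="dns")].nodePort}'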
Run the script generate-dns-configmap-coredns.sh to update the CoreDNS ConfigMaps on both clusters:
$ ./generate-dns-configmap-coredns.sh
This will generate the ConfigMaps for DNS forwarding between 2 clusters
Context 1: kube-aws
Namespace 1: aws-1
Context 2: kube-azure
Namespace 2: azure-1
Switched to context "kube-azure".
Switched to context "kube-aws".
Updating ConfigMap on cluster kube-aws
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    .:53 {
        errors
        health
        kubernetes cluster.local in-addr.arpa ip6.arpa {
          pods insecure
          upstream
          fallthrough in-addr.arpa ip6.arpa
        }
        prometheus :9153
        proxy . /etc/resolv.conf
        cache 30
        loop
        reload
        loadbalance
        import custom/*.override
    }
    azure-1.svc.cluster.local {
        forward . 10.19.1.5:32478
    }
    import custom/*.server
configmap/coredns configured
Switched to context "kube-aws".
Switched to context "kube-azure".
Updating ConfigMap on cluster kube-azure
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    .:53 {
        errors
        health
        kubernetes cluster.local in-addr.arpa ip6.arpa {
          pods insecure
          upstream
          fallthrough in-addr.arpa ip6.arpa
        }
        prometheus :9153
        proxy . /etc/resolv.conf
        cache 30
        loop
        reload
        loadbalance
        import custom/*.override
    }
    aws-1.svc.cluster.local {
        forward . 10.9.1.111:32275 10.9.2.227:32275 10.9.2.72:32275
    }
    import custom/*.server
configmap/coredns configured
Now that DNS is configured, we can deploy the Cassandra StatefulSets on each cluster. The script deploy-cassandra.sh uses the cassandra.yaml template, replacing the placeholders for context, namespace, datacenter and storage class. I named the datacenters AWS-1 and AZURE-1.
$ ./deploy-cassandra.sh
This will deploy Cassandra as a 3-node cluster in each cloud provider.
Context 1: kube-aws
Datacenter 1: AWS-1
Namespace 1: aws-1
Storage class 1: gp2
Context 2: azure
Datacenter 2: AZURE-1
Namespace 2: azure-1
Storage class 2: default
Hit ENTER to deploy Cassandra to aws
Switched to context "aws".
service/cassandra created
statefulset.apps/cassandra created
Hit ENTER to deploy Cassandra to azure
Switched to context "azure".
service/cassandra created
statefulset.apps/cassandra created
Note that CASSANDRA_SEEDS includes the cassandra-0 node from each cluster, so the two datacenters join as a single Cassandra cluster.
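The full manifest is in the repository, but a minimal sketch of the container env for the AWS cluster, assuming the Cassandra image honours the usual CASSANDRA_DC and CASSANDRA_SEEDS environment variables, would look something like this:

# Sketch only: the real cassandra.yaml template fills these in from the script's placeholders.
env:
  - name: CASSANDRA_DC
    value: "AWS-1"
  - name: CASSANDRA_SEEDS
    value: "cassandra-0.cassandra.aws-1.svc.cluster.local,cassandra-0.cassandra.azure-1.svc.cluster.local"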
The pods may take a while to start up since they enter an error-retry loop until the persistent volumes are created and attached. Use nodetool status to see when the clusters are ready.
$ kubectl config use-context kube-aws
Switched to context "kube-aws".
$ kubectl exec cassandra-0 --namespace=aws-1 -- nodetool status
Datacenter: AWS-1
=================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address     Load        Tokens  Owns (effective)  Host ID                               Rack
UN  10.9.2.249  69.89 KiB   256     35.6%             856df155-8bf8-4656-a504-a509d2816193  Rack1
UN  10.9.1.75   114.48 KiB  256     32.6%             c64d4b37-4114-410a-ac15-6fac91525e13  Rack1
UN  10.9.1.225  103.67 KiB  256     34.4%             64ab45b0-4ac7-4438-80c5-0f83084ef411  Rack1

Datacenter: AZURE-1
===================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address     Load        Tokens  Owns (effective)  Host ID                               Rack
UN  10.19.1.52  93.98 KiB   256     32.9%             04b5411a-ef91-4066-84cc-5ebd2b3ddad8  Rack1
UN  10.19.1.79  108.6 KiB   256     32.3%             0aa07cbb-5e61-4707-9565-5b8e54e9c613  Rack1
UN  10.19.1.13  69.92 KiB   256     32.2%             10aa0560-c9e1-4d83-855a-aca733789937  Rack1
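As an optional sanity check that the stub-domain forwarding is what lets the two rings see each other, you can resolve a remote pod’s FQDN from the other cluster, for example with a throwaway busybox pod (pod and service names as deployed above):

$ kubectl run dns-test --namespace=aws-1 --image=busybox:1.28 --restart=Never --rm -it -- nslookup cassandra-0.cassandra.azure-1.svc.cluster.local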
Great! Our multi-datacenter Cassandra cluster is up and running. Next, we’ll run some tests to show that the cluster is working and is tolerant of node failures.