RKE is a CNCF-certified Kubernetes distribution that runs entirely within Docker containers. It can be configured and provisioned from a simple configuration file. Combined with the easy-to-use Rancher Server web UI running on the cluster, we can have a truly self-hosted Kubernetes cluster.

Prepare Nodes

All nodes will be running CentOS 7. The systems are installed from the Server DVD installation media with the Compute Node flavor chosen in software selection. A local user container is created for further operations.
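
A minimal sketch of that user setup, assuming sudo access via the wheel group and the SSH key referenced later in the cluster file (the skg key name and node hostname are carried over from the examples below):

# On each node: create the user and allow sudo through the wheel group
$ sudo useradd container
$ sudo passwd container
$ sudo usermod -aG wheel container
# From the machine that will run RKE: install the public key for SSH access
$ ssh-copy-id -i ~/.ssh/skg.pub container@container-1.paas.central.sakuragawa.cloud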

After the installation, perform a full upgrade, then install and set up Docker.

$ sudo yum update -y
$ sudo yum install -y yum-utils
$ sudo yum-config-manager \
    --add-repo \
    https://download.docker.com/linux/centos/docker-ce.repo
$ sudo yum install -y docker-ce docker-ce-cli containerd.io
$ sudo usermod -aG docker $(whoami)
https://docs.docker.com/engine/install/centos/
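
Docker also needs to be running before RKE can provision anything; start and enable the service, and re-log in so the docker group membership takes effect:

$ sudo systemctl enable --now docker
$ docker version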

Open all required ports for Kubernetes and Rancher.

#!/bin/bash
# HTTP and HTTPS for the ingress controller
firewall-cmd --permanent --add-service=http --add-service=https
# Kubernetes API server
firewall-cmd --permanent --add-port 6443/tcp
# Docker daemon TLS port
firewall-cmd --permanent --add-port 2376/tcp
# etcd client and peer communication
firewall-cmd --permanent --add-port 2379/tcp
firewall-cmd --permanent --add-port 2380/tcp
# Canal/Flannel VXLAN overlay network
firewall-cmd --permanent --add-port 8472/udp
# Canal/Flannel health checks
firewall-cmd --permanent --add-port 9099/tcp
# kubelet
firewall-cmd --permanent --add-port 10250/tcp
# Kubernetes node ports
firewall-cmd --permanent --add-port 30000-32767/tcp
# Masquerade traffic leaving the overlay network
firewall-cmd --permanent --add-masquerade
firewall-cmd --reload
configure-ports.sh
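
Run the script as root on every node:

$ sudo bash configure-ports.sh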

Create Cluster Description File

The topology and options of an RKE cluster can easily be defined in a YAML file. Here we will use 1 control plane node and 1 worker node, both of which will also run etcd.

Referring to the official example, we will leave most of the options at their defaults but change the nodes and cluster name accordingly.

nodes:
    - address: container-1.paas.central.sakuragawa.cloud
      user: container
      role:
        - controlplane
        - etcd
    - address: container-2.paas.central.sakuragawa.cloud
      user: container
      role:
        - etcd
        - worker
ignore_docker_version: false
ssh_key_path: ~/.ssh/skg
cluster_name: skg-central
kubernetes_version: v1.20.5-rancher1-1
authorization:
    mode: rbac
network:
    plugin: canal
dns:
    provider: coredns
ingress:
    provider: nginx
    options:
        use-forwarded-headers: 'true'
skg-central-rke.yaml

Note the ingress.options.use-forwarded-headers option: it is required for external TLS termination to work (which I will be configuring in a later step).
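
Before running RKE, it is worth confirming that the machine doing the provisioning can reach each node over SSH with the configured key, and that the container user can talk to Docker (hostnames and key path are taken from the cluster file above):

$ ssh -i ~/.ssh/skg container@container-1.paas.central.sakuragawa.cloud docker version
$ ssh -i ~/.ssh/skg container@container-2.paas.central.sakuragawa.cloud docker version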

Provision RKE Cluster
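
RKE itself is a single static binary. The commands below fetch the v1.2.7 release used in this post from the rancher/rke GitHub releases page (verify the URL and pick the release that matches your environment):

$ curl -LO https://github.com/rancher/rke/releases/download/v1.2.7/rke_linux-amd64
$ chmod +x rke_linux-amd64

With the binary in place, provision the cluster from the description file: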

$ RKE_CLUSTER_NAME=skg-central-rke
$ ./rke_linux-amd64 up --config ${RKE_CLUSTER_NAME}.yaml
INFO[0000] Running RKE version: v1.2.7
INFO[0000] Initiating Kubernetes cluster
....
INFO[0000] Building Kubernetes cluster
....
INFO[0234] Finished building Kubernetes cluster successfully

When the building process is finished, check the cluster status with the generated kubeconfig:

$ RKE_CLUSTER_NAME=skg-central-rke
$ export KUBECONFIG=$(pwd)/kube_config_${RKE_CLUSTER_NAME}.yaml
$ kubectl get nodes
NAME                                        STATUS   ROLES               AGE     VERSION
container-1.paas.central.sakuragawa.cloud   Ready    controlplane,etcd   3h49m   v1.20.5
container-2.paas.central.sakuragawa.cloud   Ready    etcd,worker         3h49m   v1.20.5
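
The system pods (Canal, CoreDNS, the ingress controller, and so on) should also settle into Running state within a few minutes:

$ kubectl get pods --all-namespaces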

Install Rancher Server via Helm

My $HOME is mounted on NFS, which causes Helm errors. Therefore I point Helm's cache, configuration, and data directories somewhere else first.

$ export HELM_CACHE_HOME=/mnt/home/asaba/.cache/helm
$ export HELM_CONFIG_HOME=/mnt/home/asaba/.config/helm
$ export HELM_DATA_HOME=/mnt/home/asaba/.local/share/helm
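
If Helm 3 itself is not installed yet, the official installer script is one way to get it (a sketch; check the Helm documentation for the currently recommended method):

$ curl -fsSL https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash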

Then add the stable chart repository from Rancher and install with external TLS termination. Since I have only 1 worker node, I will also set replicas to 1.

$ helm repo add rancher-stable https://releases.rancher.com/server-charts/stable
$ export RANCHER_HOSTNAME=rancher.central.sakuragawa.cloud
$ export RANCHER_NS=cattle-system
$ kubectl create namespace $RANCHER_NS
$ helm install rancher rancher-stable/rancher \
    --namespace $RANCHER_NS \
    --set hostname=$RANCHER_HOSTNAME \
    --set tls=external \
    --set replicas=1
NAME: rancher
LAST DEPLOYED: Thu Apr 15 01:09:01 2021
NAMESPACE: cattle-system
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
Rancher Server has been installed.
NOTE: Rancher may take several minutes to fully initialize. Please standby while Certificates are being issued and Ingress comes up.

Check out our docs at https://rancher.com/docs/rancher/v2.x/en/

Browse to https://rancher.central.sakuragawa.cloud                                                                      
Happy Containering!

Check the rollout progress of Rancher server:

$ kubectl -n cattle-system rollout status deploy/rancher
Waiting for deployment spec update to be observed...
Waiting for deployment "rancher" rollout to finish: 0 out of 1 new replicas have been updated...
Waiting for deployment "rancher" rollout to finish: 0 of 1 updated replicas are available...
deployment "rancher" successfully rolled out

Configure Reverse Proxy

Referring to the official example Nginx configuration, create a configuration that proxies all traffic to the cluster's ingress controller; the upstream servers are the IP addresses of the cluster nodes.

upstream rancher {
    server 192.168.1.21:80;                                                                                                        
    server 192.168.1.22:80;
}

server {
    listen 80;
    listen [::]:80;
    server_name rancher.central.sakuragawa.cloud;

    location / {
        return 301 https://$host$request_uri;
    }
}

map $http_upgrade $connection_upgrade {
    default Upgrade;
    ''      close;
}

server {
    listen 443 http2 ssl;
    listen [::]:443 http2 ssl;
    server_name rancher.central.sakuragawa.cloud;

    ssl_certificate /etc/letsencrypt/live/central.sakuragawa.cloud/fullchain.pem;                                                  
    ssl_certificate_key /etc/letsencrypt/live/central.sakuragawa.cloud/privkey.pem;                                                
    include /etc/nginx/ssl.conf;

    client_max_body_size 100G;

    location / {
        proxy_pass http://rancher;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_set_header X-Forwarded-Port $server_port;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;                                                               
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection $connection_upgrade;
        # This allows the ability for the execute shell window to remain open for up to 15 minutes. Without this parameter, the default is 1 minute and will automatically close.
        proxy_read_timeout 900s;
    }
}
rancher.central.sakuragawa.cloud.conf
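
After placing the file on the reverse proxy host (a standard Nginx setup with per-site configuration files is assumed here), validate and reload Nginx:

$ sudo nginx -t
$ sudo systemctl reload nginx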

When the deployment is successfully rolled out, navigate to the Rancher hostname and the first-start page should be accessible.

Increase Ingress Controller Request Body Size

By default Nginx limits the request body size, which will lead to 413 Request Entity Too Large errors for some HTTP requests.

This can be fixed by patching the Nginx ingress controller's ConfigMap, then checking its logs to confirm the change was observed:

$ kubectl patch configmap nginx-configuration -n ingress-nginx -p '{"data":{"proxy-body-size":"0"}}'
$ kubectl logs -n ingress-nginx -l app=ingress-nginx | grep UPDATE | grep ConfigMap
load balancing 413 Request Entity Too Large · Issue #14323 · rancher/rancher

Remove RKE Cluster

RKE provides a remove option to completely tear down the cluster. The rke remove command removes the following components from each node listed in the cluster.yml:

  • etcd
  • kube-apiserver
  • kube-controller-manager
  • kubelet
  • kube-proxy

$ ./rke_linux-amd64 remove --config skg-central-rke.yaml
INFO[0000] Running RKE version: v1.2.7
Are you sure you want to remove Kubernetes cluster [y/n]: y
INFO[0001] Tearing down Kubernetes cluster
....
INFO[0061] Removing local admin Kubeconfig: ./kube_config_skg-central-rke.yaml
INFO[0061] Local admin Kubeconfig removed successfully
INFO[0061] Removing state file: ./skg-central-rke.rkestate
INFO[0061] State file removed successfully
INFO[0061] Cluster removed successfully