Prepare Host
Raw Block Device for GlusterFS
OpenShift can be integrated with multiple distributed storage solutions. One popular option is containerized GlusterFS; to provide a management API for OpenShift, Heketi is also required in this case.
Unfortunately, a GlusterFS deployment with Heketi requires at least one raw block device, such as a bare partition or a standalone disk. Note that LVM logical volumes and already-formatted filesystems are not supported by Heketi.
In this case, each bare-metal host has a 500 GB HDD, partitioned into a 400 GiB partition for GlusterFS with the rest left for system usage.
NAME   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda      8:0    0 465.8G  0 disk
├─sda1   8:1    0   500M  0 part /boot
├─sda2   8:2    0   400G  0 part /glusterfs
├─sda3   8:3    0  15.7G  0 part [SWAP]
├─sda4   8:4    0     1K  0 part
└─sda5   8:5    0  49.6G  0 part /
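Before handing /dev/sda2 over to Heketi, it is worth double-checking that the partition is not mounted and carries no leftover filesystem signature. A minimal sketch (the device name follows the layout above; drop any /etc/fstab entry for it first, list the existing signatures with wipefs, then wipe them all, which is destructive):
# umount /dev/sda2
# wipefs /dev/sda2
# wipefs --all /dev/sda2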
Setup Internal CA
If external registries or authentication services are needed and they are secured by certificates signed by an internal self-signed CA, that CA certificate needs to be added to the system's trusted CAs.
# wget --no-check-certificate https://example.com/ca.pem -O /etc/pki/ca-trust/source/anchors/ca.pem
# update-ca-trust extract
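A quick way to verify the CA is now trusted (the registry hostname below is only an example) is to hit the secured endpoint without disabling certificate verification:
$ curl -I https://registry.example.com/v2/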
Request Certificates for OpenShift
By default, OpenShift generates self-signed certificates for the master API, etcd, the web console and the router. For easier access to the web console and to applications hosted on the platform, customized certificates can be applied.
Note that if you want to customize the router certificate, it is better to use a wildcard certificate for *.your.router.domain.example.com.
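If such a certificate still needs to be requested from the internal CA, a minimal sketch of generating a key and a CSR with openssl (the domain follows the example above; the actual submission process depends on your CA):
$ openssl req -new -newkey rsa:2048 -nodes \
    -keyout router.key -out router.csr \
    -subj "/CN=*.your.router.domain.example.com"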
Install OpenShift
Prepare Ansible Runtime
The latest Ansible playbooks for setting up OpenShift 3.11 require ansible < 2.7:
Ansible >= 2.6.5, Ansible 2.7 is not yet supported and known to fail
To simplify the process, we can use pipenv to install the dependency and build an isolated Ansible runtime.
$ pipenv --three
$ pipenv install "ansible~=2.6.0"
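The playbooks themselves come from the openshift-ansible project; a minimal sketch of fetching them on the 3.11 release branch, assuming the repository layout on GitHub:
$ git clone https://github.com/openshift/openshift-ansible.git
$ cd openshift-ansible
$ git checkout release-3.11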
Configure Inventory File
There is a nice guide in the official documentation on preparing an inventory file. However, for GlusterFS storage, we need to add a glusterfs group and specify the hosts and devices used for storage (the matching [nodes] entries are sketched after the snippet below).
[OSEv3:children]
...
glusterfs
...
....
[glusterfs]
rack02-03.raw-infra.sakuragawa.cloud glusterfs_devices='["/dev/sda2"]'
rack02-04.raw-infra.sakuragawa.cloud glusterfs_devices='["/dev/sda2"]'
rack02-06.raw-infra.sakuragawa.cloud glusterfs_devices='["/dev/sda2"]'
rack02-11.raw-infra.sakuragawa.cloud glusterfs_devices='["/dev/sda2"]'
rack02-14.raw-infra.sakuragawa.cloud glusterfs_devices='["/dev/sda2"]'
rack02-15.raw-infra.sakuragawa.cloud glusterfs_devices='["/dev/sda2"]'
....
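For completeness, in a converged deployment the GlusterFS hosts must also be schedulable nodes of the cluster, so they appear in the [nodes] group as well. A minimal sketch, assuming the default 3.11 node group names (adjust to whatever groups your inventory already uses):
[nodes]
....
rack02-03.raw-infra.sakuragawa.cloud openshift_node_group_name='node-config-compute'
rack02-04.raw-infra.sakuragawa.cloud openshift_node_group_name='node-config-compute'
....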
Configure Environment Variables
The OpenShift Ansible installation uses a lot of variables to customize the process. Luckily, we can put all of them into one file for clearer configuration.
Environment variables can be written in yaml or json format and neatly saved in a single file. This file is an alternative to the OSEv3:vars section, so most of the variables are explained in the official documentation. However, since our deployment plan includes some customization, we need to pay special attention to the following parts.
First, since the bare-metal hosts are provisioned by Beaker, the default SSH user is root.
ansible_ssh_user: root
# No need to add ansible_become or ansible_sudo
Then, we choose internal LDAP as the identity provider.
openshift_master_identity_providers:
  - name: ldap_auth
    kind: LDAPPasswordIdentityProvider
    challenge: true
    login: true
    bindDN: ''
    bindPassword: ''
    ca: 'ca.pem'
    insecure: false
    url: ldaps://ldap.central.sakuragawa.cloud/ou=users,dc=sakuragawa,dc=cloud?uid
    attributes:
      id: ['dn']
      email: ['mail']
      name: ['cn']
      preferredUsername: ['uid']
Then, in our environment variables, we need to tell OpenShift to deploy GlusterFS and expose it as a storageclass.
....
# Containerized Gluster FS
openshift_storage_glusterfs_namespace: openshift-storage
openshift_storage_glusterfs_storageclass: true
openshift_storage_glusterfs_storageclass_default: false
openshift_storage_glusterfs_block_deploy: true
openshift_storage_glusterfs_block_host_vol_size: 400
openshift_storage_glusterfs_block_storageclass: true
openshift_storage_glusterfs_block_storageclass_default: false
....
Also, don't forget to set all the *_storage_kind variables to glusterfs.
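For instance, pointing the integrated registry at GlusterFS-backed storage would look like the following sketch (variable names as documented for 3.11; check which *_storage_kind variables apply to the components you actually enable):
openshift_hosted_registry_storage_kind: glusterfs
openshift_hosted_registry_storage_volume_size: 10Gi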
Finally, apply our own certificates to the OpenShift hosted router. Given that we requested a certificate covering both ocp.central.sakuragawa.cloud and *.ocp.central.sakuragawa.cloud, we also need to overwrite the named certificate for the master.
# Customize master and web console certificates
openshift_master_overwrite_named_certificates: true
openshift_master_named_certificates: [{"certfile": "cert.crt", "keyfile": "priv.key", "names": ["ocp.central.sakuragawa.cloud"], "cafile": "2015-RH-IT-Root-CA.pem"}]
# Customize router certificates
openshift_hosted_router_certificate: {"certfile": "cert.crt", "keyfile": "priv.key", "cafile": "ca.pem"}
Roll Out the Deployment
When the inventory file and environment variables are ready, we can easily start the deployment.
Before running Kubernetes and other basic services, we need to bootstrap the hosts. The prerequisites.yml playbook installs the required packages and tunes the systems.
# Make sure you are in openshift-ansible project folder
$ pipenv shell
$ ansible-playbook -i inventory.file -e @env-vars.yaml -e ansible_ssh_private_key_file=/path/to/ssh.key playbooks/prerequisites.yml
Then, run deploy_cluster.yml
playbook to finish the installation.
$ ansible-playbook -i inventory.file -e @env-vars.yaml -e ansible_ssh_private_key_file=/path/to/ssh.key playbooks/deploy_cluster.yml
When the deployment reaches the etcd installation, Ansible may fail and stop. This is unexpected, but we can run the etcd installation playbook manually and then continue the deployment.
$ ansible-playbook -i inventory.file -e @env-vars.yaml -e ansible_ssh_private_key_file=/path/to/ssh.key playbooks/openshift-etcd/config.yml
$ ansible-playbook -i inventory.file -e @env-vars.yaml -e ansible_ssh_private_key_file=/path/to/ssh.key playbooks/deploy_cluster.yml
Have a few cups of coffee, and OpenShift should be ready to access via the CLI and the web console.
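A quick sanity check from the first master confirms that the nodes registered and the GlusterFS pods came up (the namespace matches openshift_storage_glusterfs_namespace set earlier):
$ oc get nodes
$ oc get pods -n openshift-storage
$ oc get storageclass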
Problems and Solutions
Missing CA for LDAP Config during upgrade
In case Ansible does not copy the LDAP CA certificate to the proper location, we need to manually copy the certificate file to /etc/origin/master/ldap_auth_ldap_ca.crt on each master.
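A minimal sketch of doing that with an ad-hoc Ansible command against the masters group, assuming ca.pem is the same CA file referenced in the identity provider configuration:
$ ansible masters -i inventory.file -m copy \
    -a "src=ca.pem dest=/etc/origin/master/ldap_auth_ldap_ca.crt"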
Failed to Deploy Containerized GlusterFS
...ext4 signature detected on /dev/sda2 at offset XXX. Wipe it? [y/n]: [n]\n Aborted wiping of dos.\n 1 existing signature left on the device....
This problem can be worked around by creating and then removing an LVM physical volume, which wipes the stale signature:
# pvcreate /dev/sdx
WARNING: dos signature detected on /dev/sdx at offset 0. Wipe it? [y/n]: y
# pvremove /dev/sdx
Then we can continue to deploy GlusterFS:
# ansible-playbook -i openshift.inv -e ... openshift-ansible/playbooks/openshift-glusterfs/config.yml
Cannot Log into Cluster Console
When a self-signed CA certificate is used for the web console, the cluster console newly introduced in OpenShift 3.11 cannot authenticate against OAuth due to x509 errors. The solution is simple: rebuild the cluster console image to add your own CA certificate.
FROM quay.io/openshift/origin-console:v3.11
USER root
RUN cd /etc/pki/ca-trust/source/anchors && \
    curl -k -O http://example.com/ca.crt && \
    update-ca-trust
USER 1001
CMD [ "/opt/bridge/bin/bridge", "--public-dir=/opt/bridge/static" ]
500 Internal Error When Logging into Grafana and Prometheus
This has the same cause as the cluster console login problem. The difference is that these two services are protected by openshift/oauth-proxy instead of handling authentication themselves. According to the official documentation, openshift/oauth-proxy can accept extra CA certificates through a ca-certificate=/path/to/ca.crt parameter.
Unfortunately, since OpenShift 3.11 the official monitoring stack is managed and deployed by the Cluster Monitoring Operator. We therefore have to build a new oauth-proxy image that trusts our CA certificate and modify the operator's configmap.
FROM openshift/oauth-proxy:v1.1.0
RUN cd /etc/pki/ca-trust/source/anchors && \
    curl -k -O http://example.com/ca.crt && \
    update-ca-trust
ENTRYPOINT ["/bin/oauth-proxy"]
Build the image, tag it carefully with the same v1.1.0 tag as the original, and push it to your own registry.
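A minimal sketch with docker, using the internal registry referenced in the config map below:
$ docker build -t registry.central.sakuragawa.cloud/openshift/oauth-proxy:v1.1.0 .
$ docker push registry.central.sakuragawa.cloud/openshift/oauth-proxy:v1.1.0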
Switch to the namespace of the monitoring stack, which is openshift-monitoring in this case, and edit the cluster-monitoring-config config map.
$ oc edit cm cluster-monitoring-config
....
auth:
  baseImage: registry.central.sakuragawa.cloud/openshift/oauth-proxy
Note that the baseImage value does not include the image tag at the end. Save it and wait for the operator to update the replicaSets; authentication will then work as expected.
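The rollout can be followed by watching the pods in the monitoring namespace; once the Prometheus and Grafana pods have been recreated, logging in should succeed:
$ oc get pods -n openshift-monitoring -w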