SDD 0020 - OpenShift Hive
Author | Simon Rüegg
Owner |
Reviewers (SIG) |
Date | 2020-06-10
Status | accepted
Summary
This document describes how we want to integrate OpenShift Hive into Syn to automatically provision OpenShift clusters on supported cloud providers.
Motivation
The OpenShift Hive operator provides fully automated provisioning of new OpenShift clusters using the OpenShift installer. We want to integrate this into the Syn project in order to provision OpenShift clusters. Currently the following cloud providers are supported by Hive:
- AWS
- Azure
- Google Cloud Platform
Design Proposal
Hive Overview
Hive is an operator and works on the following CRDs:
- ClusterImageSet to define the OpenShift version
- ClusterDeployment to define a cluster
- MachinePool to define the sizing/scaling of a cluster
In addition to these CRDs the operator needs various secrets for the following information:
- Image pull secret for the OpenShift images
- Credentials for the cloud provider
- SSH keypair to access machines of the provisioned cluster
- An install-config.yaml to configure the OpenShift installer
Most of the heavy lifting Hive does is implemented by the OpenShift installer, which is also why the install-config.yaml file is required. This file uses the same format as described in the installer docs. It needs to be provided in a secret and is only changed by the operator to set the pullSecret property to the referenced image pull secret.
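To make these pieces concrete, a minimal sketch of a ClusterImageSet and ClusterDeployment for a GCP cluster could look roughly like the following. All resource and secret names are placeholders, and the exact field names should be verified against the Hive documentation for the deployed Hive version:

```yaml
apiVersion: hive.openshift.io/v1
kind: ClusterImageSet
metadata:
  name: openshift-v4.4.3
spec:
  # Release image defining the OpenShift version to install
  releaseImage: quay.io/openshift-release-dev/ocp-release:4.4.3-x86_64
---
apiVersion: hive.openshift.io/v1
kind: ClusterDeployment
metadata:
  name: example-cluster
spec:
  clusterName: example-cluster
  baseDomain: clusters.example.com
  platform:
    gcp:
      region: europe-west6
      # Secret containing the GCP service account credentials
      credentialsSecretRef:
        name: gcp-credentials
  provisioning:
    # References the ClusterImageSet above
    imageSetRef:
      name: openshift-v4.4.3
    # Secret containing the install-config.yaml
    installConfigSecretRef:
      name: example-cluster-install-config
    # Secret containing the SSH private key for the cluster machines
    sshPrivateKeySecretRef:
      name: example-cluster-ssh-key
  # Secret containing the image pull secret
  pullSecretRef:
    name: example-cluster-pull-secret
```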
Implementation Details
A controller is implemented which creates the objects needed to provide Hive with the information required to provision an OpenShift cluster. Based on certain conditions, the controller creates a set of objects (secrets and Hive CRs). For example, if a cluster object has the annotation syn.tools/cluster-provisioner=hive set, an OpenShift cluster should be provisioned via Hive.
Provisioning Information
In order for this controller to create the necessary objects, it needs to receive certain information. All confidential information (like cloud provider credentials or image pull secrets) should be stored in Kubernetes secrets and referenced. In a first PoC phase, all information that’s not yet present in a cluster’s facts must be provided in annotations on the cluster object:
- hive.syn.tools/gcp-project: GCP project name
- hive.syn.tools/base-domain: Base domain
- hive.syn.tools/credentials-secret: Name of the secret containing the cloud credentials
The reason for using annotations is that we don’t have to change the cluster CRD for the PoC. In a second step, once this design is validated and accepted, the information can be added as a typed struct to the cluster CRD.
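As a sketch, an annotated Lieutenant Cluster object for the PoC could look like this; all names and values are illustrative:

```yaml
apiVersion: syn.tools/v1alpha1
kind: Cluster
metadata:
  name: example-cluster
  annotations:
    # Tells the controller to provision this cluster via Hive
    syn.tools/cluster-provisioner: hive
    # PoC provisioning information, not yet part of the cluster CRD
    hive.syn.tools/gcp-project: my-gcp-project
    hive.syn.tools/base-domain: clusters.example.com
    hive.syn.tools/credentials-secret: gcp-credentials
spec:
  displayName: Example Hive-provisioned cluster
```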
Cluster Scaling
Scaling of a cluster shouldn’t be done via Hive. For provisioning we use a default setup of three master and three worker nodes, as sketched below. Once a cluster is provisioned, scaling will be implemented via other means, for example in a Commodore component.
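For the initial sizing, the controller could create a default worker MachinePool roughly like the following (GCP assumed; machine type and names are illustrative, exact fields per the Hive documentation). Ongoing scaling would then happen outside of Hive:

```yaml
apiVersion: hive.openshift.io/v1
kind: MachinePool
metadata:
  name: example-cluster-worker
spec:
  clusterDeploymentRef:
    name: example-cluster
  # Pool name as known to the OpenShift installer / machine API
  name: worker
  # Default worker count used for provisioning
  replicas: 3
  platform:
    gcp:
      type: n1-standard-4
```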
Cluster Synfection
Once a cluster is provisioned via Hive, the next step is to synfect it (install Steward). Hive provides the SyncSet CRD to create and patch arbitrary resources on a provisioned cluster. While this mechanism could be used to install the Steward agent on a new cluster, it also has some downsides:
- We would need to duplicate the installation manifests (currently implemented in the Lieutenant API) to create a SyncSet out of them
- Once Syn is fully bootstrapped on the cluster, Steward itself will be managed by GitOps. This would end up with two systems managing the same resources (Hive and GitOps)
Instead of using a SyncSet, we use the credentials of a provisioned cluster and run a Job which installs Syn. This can be implemented relatively easily since the credentials (kubeconfig) for a cluster are stored in a secret. The controller can create a Kubernetes Job which mounts this secret and runs kubectl apply -f against the install URL of the respective cluster.
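A rough sketch of such a Job, assuming the admin kubeconfig secret created by Hive (whose actual name is referenced from the ClusterDeployment) and an illustrative Steward install URL served by the Lieutenant API:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: synfect-example-cluster
spec:
  backoffLimit: 3
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: synfect
          # Any image providing kubectl; image choice is illustrative
          image: bitnami/kubectl:1.18
          command:
            - kubectl
            - --kubeconfig=/secrets/kubeconfig
            - apply
            - -f
            # Steward install URL of the respective cluster (illustrative)
            - https://lieutenant.example.com/install/steward.json?token=<bootstrap-token>
          volumeMounts:
            - name: admin-kubeconfig
              mountPath: /secrets
              readOnly: true
      volumes:
        - name: admin-kubeconfig
          secret:
            # Secret created by Hive containing the cluster's admin kubeconfig
            secretName: example-cluster-admin-kubeconfig
```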
Risks and Mitigations
With this design we’re relatively tightly coupled to the Hive operator, since the created CRs (the API) are defined by it. If Hive changes its API, we have to implement these changes as well. As long as Hive follows the concept of a Kubernetes operator (acting on CRDs), the basic idea of this design should always apply though.
Alternatives
An alternative approach would be to leave the creation of the Hive CRs out of Syn and implement it in another component. This could be an option if Project Syn shouldn’t provide specific provisioning options for Kubernetes distributions. The basic idea of this design would still apply though, and it could be implemented separately from the Syn project.
References
- OpenShift Hive - github.com/openshift/hive
- OpenShift installer - github.com/openshift/installer
- Hive architecture - github.com/openshift/hive/blob/master/docs/architecture.md
- Hive SyncSet - github.com/openshift/hive/blob/master/docs/syncset.md