SDD 0007 - Lieutenant - Management API
Author |
Marco Fretz, Tobias Brunner |
Owner |
SIG Syn |
Reviewers (SIG) |
|
Date |
2019-10-18 |
Status |
implemented |
Summary
This SDD documents the design of the inventory and management API service which holds all information about Syn managed Kubernetes clusters. |
Motivation
A central inventory helps to keep track of all Syn managed Kubernetes clusters, their content and it’s the central hub for all operational needs. It provides inventory data to query for all aspects about these clusters. All clusters must be registered in Lieutenant so that they can get recognized by the various components of Syn. It presents itself as a REST API, providing all the needed functionality. This API gets used by frontends like the VSHN Portal, by CLI clients or by other means of automation.
Goals
The API provides several key features for the Syn platform. It’s architected around many helper services and orchestrates them to do their job well.
-
Be the central hub of Syn managed services and clusters including tenant information, fully ready for multitenancy operations
-
Provide insights to all running services and clusters
-
Be the key part of config management by providing management functionality for GitOps Git repositories
API Personas
Consumers of the API
Consumer | Use Cases |
---|---|
Commodore |
|
Steward |
|
Web Admin |
Full administration possibilities:
|
CLI |
Full administration possibilities |
Design Proposal
Lieutenant is an OpenAPI 3.0 REST API written in Go. It makes use of specialized Kubernetes Operators running on Kubernetes, implementing the core functionality and reconciliation processes. The Lieutenant API application runs in Kubernetes on works directly with the Kubernetes API, managing the objects of the Operators.
REST API
- Tenant Management
-
-
Management of tenants for assigning clusters and managing access rights. CRUD functionality.
-
For VSHN: The idea is to sync this data from the portal. A customer can be Syn enabled which then automatically syncs the information to Lieutenant and makes it available in the API.
-
Data storage location:
Tenant
CRD
-
- Cluster Registry
-
-
Registry of all Syn managed Kubernetes clusters with the following features:
-
Create (register) cluster
-
Read (list) all clusters with filtering capabilities
-
Update (edit) cluster
-
Delete cluster
-
-
The following information is stored:
-
Automatically generated cluster ID (string with 6 lowercase chars and numbers) - not editable
-
Cluster display name
-
Initial registration key for Steward
-
Lifetime
-
Validity
-
-
Kubernetes API endpoint
-
Reference to Kubernetes credentials in Vault
-
-
Data storage location:
Cluster
CRD
-
- Facts Inventory
-
-
Inventory of facts about the clusters and services with a query interface.
-
The key of the facts is the cluster ID.
-
There are two type of facts:
-
Static facts: Manually configured meta information about a cluster, stored in cluster Object: Tenant, SLA, Contact information, Cloud information, Kubernetes distribution
-
Dynamic facts: Facts collected by an in-cluster agent and regularly sent to Lieutenant, stored in Timeseries DB: Kubernetes version, Deployed managed services and their version, Node details (OS, CPU, RAM, labels) and number of nodes, important objects, Syn predefined annotations on objects, per object contact information and responsibilities via annotations, Ingress objects
spec.rules[].host
(helps to find which cluster hosts which URL).
-
-
Facts aren’t actively collected by polling, they’re pushed to the API by Steward.
-
All dynamic facts are versioned (and update generates a new version) which allows historical information when has what changed. Facts have timestamps, so that a freshness indicator is possible.
-
Data storage location: InfluxDB
-
- Component state
-
-
Live state of the managed services
-
Querying Argo CD, running on the cluster
-
Data storage location: Argo CD
-
- Steward Configuration
-
-
Deployment and configuration delivery for Steward bootstrapping
-
The API provides the Steward deployment JSON for the initial cluster synfection.
-
The following data is delivered to Steward:
-
URL to cluster catalog git repository
-
SSH host key of git repository
-
-
- Vault Integration
-
-
Vault bootstrapping for cluster
-
The API configures Vault for the cluster:
-
Create access credentials
-
Configure authorization
-
Steward authentication token handling
-
Bootstrap token can only be used once and for a limited time
-
-
-
- Maintenance Utilities
-
-
Helper functions to facilitate proper maintenance
-
Integration with Renovate, f.e. list all open Renovate Merge Request on different Git repository instances, including the possibility to merge them.
-
Integration with to-be-invented host maintenance agent?
-
Data storage location: Git Merge/Pull requests
-
REST OpenAPI Spec
All objects are stored in the same Kubernetes namespace as the API and the Operator is running.
The OpenAPI specification is stored in Git: openapi.yaml.
API Authentication
Bearer Token
Authentication to the API is handled via Kubernetes service account tokens. Except for the /docs
, /healthz
and /install/steward.json
endpoints, every request must contain a bearer token. The HTTP header Authorization
must be set to Bearer <token>
with <token>
beign a valid JWT token. This JWT token will then be used by the API to authenticate against the Kubernetes cluster.
Bootstrap Token
The /install/steward.json
endpoint must provide a query parameter token
which contains the bootstrap token of a cluster. Such a token can only be used once and has a short (for example ~30 minutes) expiry time. The API uses it’s own service account to authenticate to Kubernetes and search the clusters for the provided bootstrap token. Once a cluster is found and the bootstrap token is still valid, the installation manifests will be returned and the token marked invalid.
API Authorization
With the exception of the /install/steward.json
endpoint, authorization of all API requests is fully delegated to the Kubernetes cluster.
Kubernetes RBAC
Kubernets RBAC rules will be used to restrict the various service accounts to their required scope.
Open Policy Agent
The Open Policy Agent (OPA) will be used to further restrict access of the service accounts to the cluster. More advanced features like a tenant accessing all his clusters can be implemented with the OPA which can not be represented by RBAC alone.
Cluster Bootstrap
The /install/steward.json
endpoint serves as an entrypoint to bootstrap (adopt) a new cluster. It delivers the manifests required to bootstrap the SYN system on a cluster. Authorization is done via a Bootstrap Token which identifies the cluster to bootstrap. The API searches the cluster and checks the validity of the token: if the token has expired or was already used (attribute valid=false
).
Different Entities
- Cluster
-
A service account per cluster is created with an RBAC role restricting it’s access to the cluster it belongs to (using resourceNames).
- Tenant
-
TBD
- Commodore
-
TBD
Operator and Custom Resource Definitions (CRDs)
CRD | Description |
---|---|
Tenant |
When a tenant is created, a GitRepo object is created to create the tenant configuration repository. |
GitRepo |
Git repository management (CRUD repos on GitLab, GitHub and Gitea). Lieutenant manages the CR objects and queries the status fields to get the status. The Operator manages the following objects: GitRepo
|
Cluster |
When a Cluster object is created:
When a Cluster object is deleted:
|
Proxy |
Manages the deployment and configuration of an Inlets server per Syn Kubernetes cluster. Details tbd |
Operator Common
The first iteration is a single Operator consisting of several controllers, sharing CR Go structs as the objects depend on each other. A later iteration could split these controllers into their own Operator if it makes sense then. The Operator will be implemented using the operator-sdk in Go.
Property | Value |
---|---|
API group |
|
API version |
|
Controller: Tenant
-
Responsible for the
Tenant
kind -
Manages the
GitRepo
object for the tenant configuration -
Admission webhook denies deletion of
Tenant
object when there are stillCluster
objects available referencing the Tenant object to be deleted
apiVersion: syn.tools/v1alpha1
kind: Tenant
metadata:
name: aezoo6
namespace: syn-lieutenant
spec:
displayName: Big Corp.
gitRepoTemplate:
spec:
<GitRepoSpec>
---
apiVersion: syn.tools/v1alpha1
kind: Tenant
metadata:
name: os3ce3
namespace: syn-lieutenant
spec:
displayName: Acme Corp. (Subtenant of Big Corp)
tenantRef:
name: aezoo6
gitRepoURL: https://github.com/acmecorp/commodore-config.git
Field | Description |
---|---|
|
Display name of the tenant. |
|
Git repository storing the tenant configuration. If
this is set, no |
|
Template for managing the |
Controller: Cluster Registry
Functionality:
-
Responsible for the
Cluster
kind -
Manages the
GitRepo
object for cluster configuration catalog -
(Manages
Proxy
object for providing an Inlets server endpoint) later -
Stores the registration key and it’s validity for initial Steward bootstrapping
-
Admission webhook denies creation of
Cluster
object when the referencedTenant
object isn’t available
apiVersion: syn.tools/v1alpha1
kind: Cluster
metadata:
name: ae3oso
namespace: syn-lieutenant
labels:
syn.tools/tenant: aezoo6
spec:
displayName: Big Corp. Production Cluster
gitRepoTemplate:
spec:
<GitRepoSpec>
tenantRef:
name: aezoo6
tokenLifetime: 4h
facts:
distribution: openshift3
cloud: cloudscale
region: rma1
status:
bootstrapToken:
token: sdaf445ftzzt
valid: true
validUntil: 2020-01-27 12:00
---
apiVersion: syn.tools/v1alpha1
kind: Cluster
metadata:
name: etahv3
namespace: syn-lieutenant
labels:
syn.tools/tenant: os3ce3
spec:
displayName: Acme Corp. Dev Cluster
gitRepoURL: https://github.com/acmecorp/gitops-mycluster.git
gitHostKeys: |
gitlab.com ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIAfuCHKVTjquxvt6CM6tdG4SLp1Btn/nOeHHE5UOzRdf
gitlab.com ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBFSMqzJeV9rUzU4kWitGjeR4PWSa29SPqJ1fVkhtj3Hw9xjLVXVYrU9QlYWrOLXBpQ6KWjbjTDTdDkoohFzgbEY=
tenantRef:
name: os3ce3
tokenLifetime: 4h
facts:
distribution: openshift4
cloud: aws
region: eu-west-1
status:
bootstrapToken:
token: sdaf445ftzzt
valid: true
validUntil: 2020-01-27 12:00
Field | Description |
---|---|
|
Autosynced from |
|
Display name of cluster which could be different
from |
|
Git repository storing the cluster configuration
catalog. If this is set, no |
|
SSH host key of the git server. Only needed if not standard (GitHub, GitLab.com). Will be set by the git operator. |
|
Template for managing the |
|
Reference to |
|
Key/value pairs of statically configured cluster facts which don’t change on a cluster. |
|
This key is used only once for Steward to register.
When the initial registration is done, |
Controller: Git Repository
-
Responsible for the
GitRepo
kind -
Manages Git repository on the configured remote (CRUD)
-
Configures deploy keys on the remote Git repository
-
It’s always referenced to the
Tenant
as owner -
Admission webhook denies creation of
GitRepo
object when the referencedTenant
object isn’t available
apiVersion: syn.tools/v1alpha1
kind: GitRepo
metadata:
name: ohj4iu
namespace: syn-lieutenant
labels:
syn.tools/tenant: aezoo6
spec:
apiSecretRef:
name: my-api-secret
deployKeys:
steward:
type: ssh-ed25519
key: AAAA...
writeAccess: true
path: bigcorp
repoName: commodore-config
repoType: auto
tenantRef:
name: aezoo6
status:
phase: created
type: GitHub
url: https://github.com/bigcorp/commodore-config.git
hostKeys: |
gitlab.com ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIAfuCHKVTjquxvt6CM6tdG4SLp1Btn/nOeHHE5UOzRdf
gitlab.com ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBFSMqzJeV9rUzU4kWitGjeR4PWSa29SPqJ1fVkhtj3Hw9xjLVXVYrU9QlYWrOLXBpQ6KWjbjTDTdDkoohFzgbEY=
Field | Description |
---|---|
|
Autosynced from |
|
Reference to secret containing connection information:
|
|
Optional dict of SSH deploy keys. If not set, no deploy keys will be configured.
|
|
Path to Git repository |
|
Name of Git repository |
|
Type of the repo. Can be one of |
|
Reference to |
|
Updated by Operator with current phase. Can be:
There’s no deleted phase as a deleted repo will be re-created on the next reconcile run. |
|
Autodiscovered Git repo type. Can be:
|
|
Computed Git repository URL |
|
SSH host keys of the git server. Will be synced from the |