SDD 0006 - Commodore Configuration Catalog Compilation

Author

Simon Gerber

Owner

Simon Gerber

Reviewers (SIG)

Date

2019-10-18

Status

implemented

Summary

The catalog compilation engine collects configuration information from different global and customer-specific Git repositories, collates this data into an inventory for Kapitan and compiles the set of manifests (the "catalog") for a SYN-managed cluster.

Motivation

We want to be able to define defaults for the SYN platform configuration in a hierarchical fashion and compile a catalog for a specific cluster from a hierarchical set of input config repositories.

Terminology

Engine: the catalog compilation engine described in this document
Inventory: the Kapitan inventory used by the engine
Component: an Engine component consisting of Kapitan component templates, a Kapitan component class, and engine postprocessing filters
Catalog: the set of rendered templates described by the inventory

Goals

The engine is responsible for collating parts of a Kapitan inventory from different Git repositories
The engine gets some information to help with this task directly from the Lieutenant API
The engine is responsible for fetching all the Kapitan components that make up the SYN platform from Git repositories. The list of components is generated from the inventory
The engine is responsible for ensuring that the inventory is a valid Kapitan inventory
The engine runs Kapitan to produce a set of YAML files that can be applied to a cluster (the "catalog")
The engine commits the compiled catalog to the cluster catalog repository on a branch
The engine pushes the branch to the upstream catalog repository and creates a merge request
It must be possible to override versions for SYN platform Kapitan components for a cluster
We want to be able to define default versions for SYN platform Kapitan components

Non-Goals

The exact way of how and where the engine runs is outside the scope of this SDD
Currently, supporting configuration sources which aren’t a Git repository is outside the scope of the compilation engine

Design Proposal

The high level idea is to build a tool around Kapitan which can collect fragments of a Kapitan inventory from different Git repos and organize them in such a way that Kapitan can compile a catalog for the cluster.

Background

Kapitan is a tool to render templates in different templating languages (currently Jsonnet, and Jinja2 are fully supported, Kadet and Helm are in alpha status. Kadet is a python-based, Jinja-inspired templating language for YAML). Kapitan works with components and an inventory. Kapitan components are collections of templates which should be rendered. The Kapitan inventory is similar to Puppet’s hieradata and consists of "classes" and "targets." A Kapitan target describes a unit which will be compiled by Kapitan (for example all the manifests required to deploy an application on a cluster). Each Kapitan component also requires a "component class" which describes how the component should be rendered. This includes listing which inputs to render, and where to save the outputs, the input format (on of the supported templating languages), and the output format (for example YAML or JSON). Additionally, Kapitan itself can download dependencies of a component, for example the helm chart if using helm as the templating language.

Overall architecture for the compilation engine

For the catalog compilation engine, we define all of the platform configuration in a single Kapitan target named "cluster," which is constructed from data provided by the Lieutenant API. The target pulls in a generic inventory class for global configuration (named global.common), cloud-specific configuration (for example class global.cloud.aws) and cluster distribution specific configuration (for example global.distribution.rancher). Additionally the target also pulls in a class for the cluster itself (for example class thecustomer.thecluster). Finally, the compilation engine traverses the inventory to gather all the components which are referenced, downloads them, and ensures that a class components.thecomponentname and defaults.thecomponentname (to define component defaults) is available in the Kapitan inventory. See Component Repositories below for a more detailed description of how we manage Kapitan components.

During compilation, we enable Kapitan’s dependency fetching to allow components to use the dependency mechanism to download their internal dependencies. We configure Kapitan’s search path to include the dependencies directory. This allows components to omit the leading dependencies/ in their compilation descriptions and enables the framework to provide extra Jsonnet (or other templating) libraries that are available to all components. The compilation engine downloads a list of libraries (defined in the inventory) and makes them available under dependencies/lib which in turn is visible under lib from the perspective of components.

The engine also provides a postprocessing step which runs after Kapitan. This postprocessing step can modify the rendered output, for example to add an explicit namespace into a rendered Helm chart, as helm doesn’t support adding namespace information when using helm template . The postprocessing step works similar to Kapitan’s compile step, and is heavily inspired by it. Postprocessing filters can be defined in component repositories alongside the implementation of the Kapitan component. This proposal defines Jsonnet as the only templating language which is available for postprocessing filters. Postprocessing filters have access to the Kapitan inventory in the same way as the Kapitan component templates. in addition to inventory access, postprocessing filters are provided with a method to load the contents of a YAML file into Jsonnet. A postprocessing filter should have the same structure as a Kapitan Jsonnet template. The engine renders the Jsonnet, and collects all the keys of the resulting JSON document. The engine then writes the content of each key into a file with the same name as the key. The component’s filters.yml can define an output directory per filter which is used as the base directory for the file names returned by the filter.

Kapitan’s secrets management is leveraged to ensure no secrets appear in any Git repository (input or output) and the in-cluster GitOps engine is configured to run kapitan refs --reveal on the cluster catalog before applying the manifests to the cluster. The Kapitan secrets management is configured to retrieve the secrets from Vault. The Vault instance is configurable in the inventory.

Inventory Repositories

The inventory is collected from various configuration Git repositories, which are either global or in the scope of a customer. A customer may have one or more configuration repositories for their different SYN platforms. The inventory repositories aren’t versioned, as they always reflect the desired state. It must be possible to pull global and customer-specific repositories from different Git hosting platforms.

Inventory repositories are cloned directly into the Kapitan inventory directory to make their contents available as inventory classes in Kapitan.

Component versions are tracked in the inventory parameters in key component_versions and default to master. An example:

parameters:
  component_versions:
    argocd:
      version: v1.0.0

Component versions can be any git tree-ish.

Component Repositories

Components are versioned as part of the inventory repositories, which comes from global defaults and the cluster / customer specific repository
Each component is managed in an individual Git repository, comparable to a Puppet module.
Components bundle the Kapitan component templates (for example Jsonnet) with the component class that will be referenced in the Kapitan inventory and optional postprocessing filters which are applied on the compiled Kapitan catalog.
The engine downloads the component repository into dependencies and symlinks the component class into inventory/classes/components to make the class available in the Kapitan inventory structure.
A component repository must have the file class/<component-name>.yml which contains the Kapitan component class which defines how the component is compiled
A component repository must define default parameters for the templates in the file class/defaults.yml in order to allow components to be included at any level in the configuration hierarchy without overriding component parameters which have been configured in the hierarchy earlier than the component was included. The defaults.yml file of a component is symlinked to inventory/classes/defaults/<compoent-name>.yml and included in the Kapitan target before global.common.
If a component has postprocessing filters, it must have the file postprocess/filters.yml defining which postprocessing filters the compilation engine should apply.
A component repository must be named the same as the component.

Repository Discovery

A default base URL is configurable to define the base path where component repositories will be searched. If a component is hosted on a different Git server or namespace, it’s full URL can be configured in the global config file. If no specific URL is provided for a component, the default base URL and the component name are concatenated to form the Git repository URL:

ssh://git@git.example.com/components/ + <component-name> + .git

The inventory can override the URL of a component by configuring the url parameter under component_versions:

parameters:
  component_versions:
    argocd:
      version: feature/my-pr
      url: ssh://git@git.example.com/my-user/my-fork.git

This allows for example using forks of components for some part of the hierarchy.

Secrets management

Commodore automates the more tedious parts of Kapitan’s secret management, allowing users to simply specify Kapitan secret references (denoted by ?{…}) in the configuration parameters. Commodore currently only supports Vault KV as back-end for storing secrets. Commodore eliminates the need for users to manually create Kapitan secret files (using kapitan refs --write …), by scanning the configuration parameters (everything defined under parameters: in Kapitan classes) for secret references and generating secret files in catalog/refs before running kapitan compile. This ensures that the secret files and the catalog are always in sync. All secret references MUST be made in the configuration parameters, otherwise Commodore can’t discover them. Secret references can use reclass references to define dynamic defaults, for example ?{vaultkv:${cluster:tenant}/${cluster:name}/thesecret/thekey}.

Secret file generation

Commodore generates the secret files and their contents according to specific rules. A Kapitan secret reference, for example ?{vaultkv:path/to/secret/thekey}, always refers to a key named thekey in a secret named path/to/secret in Vault’s KV back-end. The address of the Vault instance and the name of the back-end are configurable:

parameters:
  secret_management:
    vault_addr: https://vault.syn.vshn.net
    # Name of the back-end (called mount in Vault)
    vault_mount: kv

For the secret reference mentioned above, Commodore generates a Kapitan secret file in catalog/refs/path/to/secret/thekey with path/to/secret:thekey as the reference to the Vault secret.

Kapitan’s vaultkv secret engine is configured in the class global.common under the dict secret_management. The configuration defaults to vault.syn.vshn.net and a back-end with name kv. This can be overridden at any level of the inventory. The GitOps engine on the cluster uses the Vault Kubernetes authentication (in a sidecar) to lease and renew a token which can be used by Kapitan.

Implementation Details/Notes/Constraints

Currently the engine is implemented in Python. The implementation provides a local mode where it reuses the existing (probably downloaded by the engine) directory structure to run only local operations. This allows local development of new components without having to re-download all the information on every run of the engine.

The engine uses the Python Jsonnet library to render the postprocessing filters and provides native callbacks to Jsonnet for accessing the Kapitan inventory and for loading YAML files in Jsonnet.

Component versions are extracted from the Kapitan inventory using Kapitan’s inventory_reclass method.

Directory structure

The current implementation of the engine produces and manages the following directories:

compiled contains the Kapitan output
catalog is a checkout of the catalog repository which is updated from the contents of compiled/cluster
dependencies contains all the dependencies downloaded by the engine (component repositories, template libraries, …)
- dependencies/libs/<libname> contains downloaded template library repositories
- dependencies/<component-name> is the checked out component repository for component "component-name"
- dependencies/lib contains symlinks to the template libraries
inventory contains the collated Kapitan inventory. Configuration repositories are directly cloned into inventory
- inventory/targets/cluster.yml contains the Kapitan target definition (generated by Commodore)
- inventory/classes/global contains globally managed configuration classes
- inventory/classes/<customer-name> contains customer-specific configuration classes

Constraints

Currently, a Commodore component can define only one Kapitan class

Risks and Mitigations

RISK: The current architecture is built around generating an inventory and classes to use with Kapitan. Thus we’ve a hard dependency on Kapitan in the current design iteration.
MITIGATIONS:
- Kapitan is written in Python, and currently under active development. If worst comes to worst, we potentially could keep a Kapitan fork alive.
- It would be entirely possible to replace Kapitan with another tool (either 3rd party or developed in the context of SYN) which does the inventory resolution and template rendering.

Drawbacks

Templates for the output of a Kapitan component (K8s YAMLs for SYN) are mainly written in Jsonnet. Kapitan is developing an alternative called Kadet which is based on YAML. Having the component templates in YAML would potentially make more sense as we’re more familiar with writing YAML than JSON.

Alternatives

It would be possible to put much more of the inventory value lookup magic behind an API (Lieutenant or otherwise), f.e. with hiera. With hiera, this approach would use the same Hieradata structure as we’ve in VMSFv1, with all its advantages and drawbacks.
Early experimental versions of the engine tried different approaches in regard to how the inventory and components are organized:
- The first experimental version of the engine collected four Git repositories which define classes (one each for "global," "cluster," "customer," and "components") and put them into predefined spots in the Kapitan inventory. This implementation used the SYNventory API to determine which source repository to clone for each of the four categories. Each of the source repositories would define non-specific classes (for example "cluster.distribution"), which allowed for writing non-specific Kapitan targets. This was deemed "too magic," as it has too much configuration magic stored in the SYNventory API.
- The second experimental version of the engine collected a number of Git repositories which define classes that are specific for a particular configuration (for example "cluster.rancher") and kept the "components" repository which contained Kapitan classes for each component. The generation of the Kapitan target was outsourced to the SYNventory API. The API then would provide the engine with a customized target which contains the exact classes that are required for a customer’s configuration. This version briefly also contained customer-specific configuration in the target itself.
- The current design of the engine closely resembles the second experimental version, as the target specification is retrieved from the Lieutenant API in addition to some information about the cluster’s project and stage. The big change from the second experimental version is that the different Kapitan components are now all managed in self-contained repositories, the contents of which are strategically inserted in the right places in the Kapitan directory structure. This makes the overall workflow simpler, as updating a component happens within a single repository.