# SDD 0006 - Commodore Configuration Catalog Compilation

 Author Simon Gerber Owner Simon Gerber Reviewers (SIG) Date 2019-10-18 Status implemented
 Summary The catalog compilation engine collects configuration information from different global and customer-specific Git repositories, collates this data into an inventory for Kapitan and compiles the set of manifests (the "catalog") for a SYN-managed cluster.

## Motivation

We want to be able to define defaults for the SYN platform configuration in a hierarchical fashion and compile a catalog for a specific cluster from a hierarchical set of input config repositories.

### Terminology

• Engine: the catalog compilation engine described in this document

• Inventory: the Kapitan inventory used by the engine

• Component: an Engine component consisting of Kapitan component templates, a Kapitan component class, and engine postprocessing filters

• Catalog: the set of rendered templates described by the inventory

### Goals

• The engine is responsible for collating parts of a Kapitan inventory from different Git repositories

• The engine gets some information to help with this task directly from the Lieutenant API

• The engine is responsible for fetching all the Kapitan components that make up the SYN platform from Git repositories. The list of components is generated from the inventory

• The engine is responsible for ensuring that the inventory is a valid Kapitan inventory

• The engine runs Kapitan to produce a set of YAML files that can be applied to a cluster (the "catalog")

• The engine commits the compiled catalog to the cluster catalog repository on a branch

• The engine pushes the branch to the upstream catalog repository and creates a merge request

• It must be possible to override versions for SYN platform Kapitan components for a cluster

• We want to be able to define default versions for SYN platform Kapitan components

### Non-Goals

• The exact way of how and where the engine runs is outside the scope of this SDD

• Currently, supporting configuration sources which aren’t a Git repository is outside the scope of the compilation engine

## Design Proposal

The high level idea is to build a tool around Kapitan which can collect fragments of a Kapitan inventory from different Git repos and organize them in such a way that Kapitan can compile a catalog for the cluster.

### Background

Kapitan is a tool to render templates in different templating languages (currently Jsonnet, and Jinja2 are fully supported, Kadet and Helm are in alpha status. Kadet is a python-based, Jinja-inspired templating language for YAML). Kapitan works with components and an inventory. Kapitan components are collections of templates which should be rendered. The Kapitan inventory is similar to Puppet’s hieradata and consists of "classes" and "targets." A Kapitan target describes a unit which will be compiled by Kapitan (for example all the manifests required to deploy an application on a cluster). Each Kapitan component also requires a "component class" which describes how the component should be rendered. This includes listing which inputs to render, and where to save the outputs, the input format (on of the supported templating languages), and the output format (for example YAML or JSON). Additionally, Kapitan itself can download dependencies of a component, for example the helm chart if using helm as the templating language.

### Overall architecture for the compilation engine

For the catalog compilation engine, we define all of the platform configuration in a single Kapitan target named "cluster," which is constructed from data provided by the Lieutenant API. The target pulls in a generic inventory class for global configuration (named `global.common`), cloud-specific configuration (for example class `global.cloud.aws`) and cluster distribution specific configuration (for example `global.distribution.rancher`). Additionally the target also pulls in a class for the cluster itself (for example class `thecustomer.thecluster`). Finally, the compilation engine traverses the inventory to gather all the components which are referenced, downloads them, and ensures that a class `components.thecomponentname` and `defaults.thecomponentname` (to define component defaults) is available in the Kapitan inventory. See Component Repositories below for a more detailed description of how we manage Kapitan components.

During compilation, we enable Kapitan’s dependency fetching to allow components to use the dependency mechanism to download their internal dependencies. We configure Kapitan’s search path to include the `dependencies` directory. This allows components to omit the leading `dependencies/` in their compilation descriptions and enables the framework to provide extra Jsonnet (or other templating) libraries that are available to all components. The compilation engine downloads a list of libraries (defined in the inventory) and makes them available under `dependencies/lib` which in turn is visible under `lib` from the perspective of components.

The engine also provides a postprocessing step which runs after Kapitan. This postprocessing step can modify the rendered output, for example to add an explicit namespace into a rendered Helm chart, as helm doesn’t support adding namespace information when using `helm template` . The postprocessing step works similar to Kapitan’s compile step, and is heavily inspired by it. Postprocessing filters can be defined in component repositories alongside the implementation of the Kapitan component. This proposal defines Jsonnet as the only templating language which is available for postprocessing filters. Postprocessing filters have access to the Kapitan inventory in the same way as the Kapitan component templates. in addition to inventory access, postprocessing filters are provided with a method to load the contents of a YAML file into Jsonnet. A postprocessing filter should have the same structure as a Kapitan Jsonnet template. The engine renders the Jsonnet, and collects all the keys of the resulting JSON document. The engine then writes the content of each key into a file with the same name as the key. The component’s `filters.yml` can define an output directory per filter which is used as the base directory for the file names returned by the filter.

Kapitan’s secrets management is leveraged to ensure no secrets appear in any Git repository (input or output) and the in-cluster GitOps engine is configured to run `kapitan refs --reveal` on the cluster catalog before applying the manifests to the cluster. The Kapitan secrets management is configured to retrieve the secrets from Vault. The Vault instance is configurable in the inventory.

### Inventory Repositories

The inventory is collected from various configuration Git repositories, which are either global or in the scope of a customer. A customer may have one or more configuration repositories for their different SYN platforms. The inventory repositories aren’t versioned, as they always reflect the desired state. It must be possible to pull global and customer-specific repositories from different Git hosting platforms.

Inventory repositories are cloned directly into the Kapitan `inventory` directory to make their contents available as inventory classes in Kapitan.

Component versions are tracked in the inventory parameters in key `component_versions` and default to `master`. An example:

``````parameters:
component_versions:
argocd:
version: v1.0.0``````

Component versions can be any git tree-ish.

### Component Repositories

• Components are versioned as part of the inventory repositories, which comes from global defaults and the cluster / customer specific repository

• Each component is managed in an individual Git repository, comparable to a Puppet module.

• Components bundle the Kapitan component templates (for example Jsonnet) with the component class that will be referenced in the Kapitan inventory and optional postprocessing filters which are applied on the compiled Kapitan catalog.

• The engine downloads the component repository into `dependencies` and symlinks the component class into `inventory/classes/components` to make the class available in the Kapitan inventory structure.

• A component repository must have the file `class/<component-name>.yml` which contains the Kapitan component class which defines how the component is compiled

• A component repository must define default parameters for the templates in the file `class/defaults.yml` in order to allow components to be included at any level in the configuration hierarchy without overriding component parameters which have been configured in the hierarchy earlier than the component was included. The `defaults.yml` file of a component is symlinked to `inventory/classes/defaults/<compoent-name>.yml` and included in the Kapitan target before `global.common`.

• If a component has postprocessing filters, it must have the file `postprocess/filters.yml` defining which postprocessing filters the compilation engine should apply.

• A component repository must be named the same as the component.

#### Repository Discovery

A default base URL is configurable to define the base path where component repositories will be searched. If a component is hosted on a different Git server or namespace, it’s full URL can be configured in the global config file. If no specific URL is provided for a component, the default base URL and the component name are concatenated to form the Git repository URL:

``ssh://git@git.example.com/components/ + <component-name> + .git``

The inventory can override the URL of a component by configuring the `url` parameter under `component_versions`:

``````parameters:
component_versions:
argocd:
version: feature/my-pr
url: ssh://git@git.example.com/my-user/my-fork.git``````

This allows for example using forks of components for some part of the hierarchy.

### Secrets management

Commodore automates the more tedious parts of Kapitan’s secret management, allowing users to simply specify Kapitan secret references (denoted by `?{…​}`) in the configuration parameters. Commodore currently only supports Vault KV as back-end for storing secrets. Commodore eliminates the need for users to manually create Kapitan secret files (using `kapitan refs --write …​`), by scanning the configuration parameters (everything defined under `parameters:` in Kapitan classes) for secret references and generating secret files in `catalog/refs` before running `kapitan compile`. This ensures that the secret files and the catalog are always in sync. All secret references MUST be made in the configuration parameters, otherwise Commodore can’t discover them. Secret references can use reclass references to define dynamic defaults, for example `?{vaultkv:${cluster:tenant}/${cluster:name}/thesecret/thekey}`.

#### Secret file generation

Commodore generates the secret files and their contents according to specific rules. A Kapitan secret reference, for example `?{vaultkv:path/to/secret/thekey}`, always refers to a key named thekey in a secret named `path/to/secret` in Vault’s KV back-end. The address of the Vault instance and the name of the back-end are configurable:

``````parameters:
secret_management:
# Name of the back-end (called mount in Vault)
vault_mount: kv``````

For the secret reference mentioned above, Commodore generates a Kapitan secret file in `catalog/refs/path/to/secret/thekey` with `path/to/secret:thekey` as the reference to the Vault secret.

Kapitan’s `vaultkv` secret engine is configured in the class `global.common` under the dict `secret_management`. The configuration defaults to vault.syn.vshn.net and a back-end with name `kv`. This can be overridden at any level of the inventory. The GitOps engine on the cluster uses the Vault Kubernetes authentication (in a sidecar) to lease and renew a token which can be used by Kapitan.

### Implementation Details/Notes/Constraints

Currently the engine is implemented in Python. The implementation provides a local mode where it reuses the existing (probably downloaded by the engine) directory structure to run only local operations. This allows local development of new components without having to re-download all the information on every run of the engine.

The engine uses the Python Jsonnet library to render the postprocessing filters and provides native callbacks to Jsonnet for accessing the Kapitan inventory and for loading YAML files in Jsonnet.

Component versions are extracted from the Kapitan inventory using Kapitan’s `inventory_reclass` method.

#### Directory structure

The current implementation of the engine produces and manages the following directories:

• `compiled` contains the Kapitan output

• `catalog` is a checkout of the catalog repository which is updated from the contents of `compiled/cluster`

• `dependencies` contains all the dependencies downloaded by the engine (component repositories, template libraries, …​)

• `dependencies/libs/<libname>` contains downloaded template library repositories

• `dependencies/<component-name>` is the checked out component repository for component "component-name"

• `dependencies/lib` contains symlinks to the template libraries

• `inventory` contains the collated Kapitan inventory. Configuration repositories are directly cloned into inventory

• `inventory/targets/cluster.yml` contains the Kapitan target definition (generated by Commodore)

• `inventory/classes/global` contains globally managed configuration classes

• `inventory/classes/<customer-name>` contains customer-specific configuration classes

#### Constraints

• Currently, a Commodore component can define only one Kapitan class

### Risks and Mitigations

• RISK: The current architecture is built around generating an inventory and classes to use with Kapitan. Thus we’ve a hard dependency on Kapitan in the current design iteration.

• MITIGATIONS:

• Kapitan is written in Python, and currently under active development. If worst comes to worst, we potentially could keep a Kapitan fork alive.

• It would be entirely possible to replace Kapitan with another tool (either 3rd party or developed in the context of SYN) which does the inventory resolution and template rendering.

## Drawbacks

• Templates for the output of a Kapitan component (K8s YAMLs for SYN) are mainly written in Jsonnet. Kapitan is developing an alternative called Kadet which is based on YAML. Having the component templates in YAML would potentially make more sense as we’re more familiar with writing YAML than JSON.

## Alternatives

• It would be possible to put much more of the inventory value lookup magic behind an API (Lieutenant or otherwise), f.e. with hiera. With hiera, this approach would use the same Hieradata structure as we’ve in VMSFv1, with all its advantages and drawbacks.

• Early experimental versions of the engine tried different approaches in regard to how the inventory and components are organized:

• The first experimental version of the engine collected four Git repositories which define classes (one each for "global," "cluster," "customer," and "components") and put them into predefined spots in the Kapitan inventory. This implementation used the SYNventory API to determine which source repository to clone for each of the four categories. Each of the source repositories would define non-specific classes (for example "cluster.distribution"), which allowed for writing non-specific Kapitan targets. This was deemed "too magic," as it has too much configuration magic stored in the SYNventory API.

• The second experimental version of the engine collected a number of Git repositories which define classes that are specific for a particular configuration (for example "cluster.rancher") and kept the "components" repository which contained Kapitan classes for each component. The generation of the Kapitan target was outsourced to the SYNventory API. The API then would provide the engine with a customized target which contains the exact classes that are required for a customer’s configuration. This version briefly also contained customer-specific configuration in the target itself.

• The current design of the engine closely resembles the second experimental version, as the target specification is retrieved from the Lieutenant API in addition to some information about the cluster’s project and stage. The big change from the second experimental version is that the different Kapitan components are now all managed in self-contained repositories, the contents of which are strategically inserted in the right places in the Kapitan directory structure. This makes the overall workflow simpler, as updating a component happens within a single repository.