Periodic Builds And Deployments

Use cases

  1. Continuous delivery.
  2. Monitoring deployability of branches.

Current state

Currently, periodic builds are handled by Ocim and scheduled twice a day. When a build fails, the relevant build logs are sent to the Build, Test, and Release workgroup's Google Group to notify its members. Sensitive values (such as database passwords) are masked, so we don't leak credentials.

In this document we set the boundaries between Grove and GitLab's CI/CD pipelines. Although Grove handles the deployment, it should not be responsible for scheduling. Scheduling would be handled by GitLab's pipelines, including the pipeline configuration and environment variables.

Proposed solution

Scheduling

Scheduling of GitLab pipelines could be done in two ways:

  1. Periodically calling the pipeline triggering API, or
  2. Using pipeline Schedules

As the second approach is the recommended and expected way of scheduling pipelines, we would utilize that feature.

Scheduled pipelines are tied to the person who created the schedule, so every schedule has an owner. Ownership can be taken over at any time by a person with the necessary permissions.

We have the possibility to add multiple Schedules, therefore we can define periodic builds for instances with diverging intervals.

To create schedules, we could either add them manually or use the Schedules API, wrapped by a bash script. For now, Grove won't provide helper scripts for creating schedules, though that may be changed later.
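As an illustration of the Schedules API route, here is a minimal sketch in Python rather than bash. It only builds the HTTP request and does not send it. The project ID, token placeholder, branch name, and cron expression are assumptions made for the example, and `build_schedule_request` is a hypothetical helper, not something Grove ships.

```python
import urllib.parse
import urllib.request

# Assumption: the project lives on gitlab.com; a self-hosted instance has another base URL.
GITLAB_API = "https://gitlab.com/api/v4"


def build_schedule_request(project_id, token, description, ref, cron, timezone="UTC"):
    """Build (but do not send) the request that creates a pipeline schedule."""
    form = urllib.parse.urlencode({
        "description": description,
        "ref": ref,                  # target branch or tag of the schedule
        "cron": cron,                # standard cron syntax
        "cron_timezone": timezone,
    }).encode()
    return urllib.request.Request(
        f"{GITLAB_API}/projects/{project_id}/pipeline_schedules",
        data=form,
        headers={"PRIVATE-TOKEN": token},
        method="POST",
    )


request = build_schedule_request(
    project_id=12345,                # hypothetical project ID
    token="<access token>",          # placeholder, never commit a real token
    description="Periodic build, twice a day",
    ref="main",                      # hypothetical branch name
    cron="0 */12 * * *",             # 00:00 and 12:00
)
# urllib.request.urlopen(request) would actually create the schedule.
```

Creating schedules with different intervals then comes down to calling the helper once per schedule with a different cron expression.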

Deployments

Although pipeline scheduling is easily solved by Schedules, the current pipeline generation and triggering works based on commit messages. To be compatible with the approach we have, we are going to set commit-related environment variables as pipeline variable overrides.

To generate a child pipeline properly, we need to set CI_COMMIT_MESSAGE as a variable on the pipeline schedule. The CI_COMMIT_REF_NAME variable is determined by the Target branch or tag field in the pipeline schedule settings.
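If the schedule is managed through the API instead of the UI, the variable could be attached with the schedule variables endpoint. The following is a sketch under assumptions: the project ID, schedule ID, and instance name are made up, `build_schedule_variable_request` is a hypothetical helper, and the request is only built, not sent.

```python
import urllib.parse
import urllib.request

# Assumption: the project lives on gitlab.com.
GITLAB_API = "https://gitlab.com/api/v4"


def build_schedule_variable_request(project_id, schedule_id, token, key, value):
    """Build (but do not send) the request that adds a variable to a schedule."""
    form = urllib.parse.urlencode({"key": key, "value": value}).encode()
    url = (
        f"{GITLAB_API}/projects/{project_id}"
        f"/pipeline_schedules/{schedule_id}/variables"
    )
    return urllib.request.Request(
        url,
        data=form,
        headers={"PRIVATE-TOKEN": token},
        method="POST",
    )


variable_request = build_schedule_variable_request(
    project_id=12345,                # hypothetical project ID
    schedule_id=7,                   # hypothetical schedule ID
    token="<access token>",          # placeholder
    key="CI_COMMIT_MESSAGE",
    value="[AutoDeploy][Update] my-instance",  # hypothetical instance name
)
# urllib.request.urlopen(variable_request) would actually add the variable.
```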

Creating a new schedule

Pipeline commit messages of the form [AutoDeploy][Update] <INSTANCE_NAME|DEPLOYMENT ID>[,<INSTANCE_NAME|DEPLOYMENT ID>,...] are already parsed, and only the instance names are extracted from them. This commit message format fits the redeployment purpose well. An example commit message for periodic deployments could be [AutoDeploy][Update] <INSTANCE_NAME>|periodic or simply [AutoDeploy][Update] <INSTANCE_NAME>. The latter is preferred because it does not introduce non-numeric characters where users may expect integers, as with incremental deployment IDs.
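To make the format concrete, the sketch below extracts instance names from such a message. It is not Grove's actual parser, which is not shown in this document; `parse_instance_names` and the sample instance names are hypothetical.

```python
PREFIX = "[AutoDeploy][Update]"


def parse_instance_names(commit_message):
    """Return the instance names from an [AutoDeploy][Update] commit message.

    Anything after a "|" (the deployment ID, or a marker such as "periodic")
    is dropped. Messages without the prefix yield an empty list.
    """
    lines = commit_message.strip().splitlines()
    first_line = lines[0] if lines else ""
    if not first_line.startswith(PREFIX):
        return []

    names = []
    for target in first_line[len(PREFIX):].split(","):
        name = target.split("|", 1)[0].strip()
        if name:
            names.append(name)
    return names


# Both message variants discussed above resolve to the same instance name.
with_marker = parse_instance_names("[AutoDeploy][Update] my-instance|periodic")
without_marker = parse_instance_names("[AutoDeploy][Update] my-instance")
```

Because everything after the `|` is ignored, both variants trigger the same deployment, which is why the shorter form is sufficient.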

List of schedules

By reusing the commit message pattern above, it is possible to periodically deploy multiple instances as part of one schedule. Alternatively, we can create one schedule per instance as necessary.

Running pipeline

Pipeline schedules do not create new commits, so scheduled pipelines show the latest commit message of the repository. This is the expected behaviour, as we only override the trigger's commit message.

Filtering credentials

To prevent leaking credentials, we use masked variable values for the CI/CD pipelines, so no credentials should be leaked on that end. On the other end, Grove does not print sensitive values to the console, so we don't leak credentials there either.

Sending notification emails

GitLab automatically sends emails for failed (and recovered) pipeline runs, though these notifications are sent to the person who triggered the pipeline.

To work around this, it is possible to set up "integrations" for a project. Most of the integrations are satisfactory for those who use Grove, though in the case of OpenCraft, we want to test that named releases can be deployed until deprecation.

From this point on, this section is OpenCraft-specific; treat it as such.

Unfortunately, none of the integrations provides the same email sending feature that we have with Ocim. The best option with the current integrations would be to send all build failures automatically, though that would spam the workgroup's mailing list.

As an alternative, we could use webhooks to listen for pipeline status changes. When the status changes, GitLab sends a POST request with a payload that contains all the information needed to decide whether to send an email. It includes a segment about the commit:

"commit": {
    "id":"c9e4ad12ba088d2b8e7564228d9a058dbaa03a68",
    "message":"chore: bump grove version",
    "title":"chore: bump grove version",
    "timestamp":"2022-04-20T11:51:33+02:00",
    "url":"https://gitlab.com/opencraft/ops/grove-stage-digitalocean/-/commit/c9e4ad12ba088d2b8e7564228d9a058dbaa03a68",
    "author":{
        "name":"Gabor Boros",
        "email":"[REDACTED]"
    }
},

Using the commit message pattern discussed above, we could set a rule to send an email to the workgroup's mailing list if the commit message matches the [AutoDeploy][Update] <INSTANCE_NAME> pattern (note that we don't include the deployment ID).
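The rule could look roughly like the sketch below. It assumes the webhook is a GitLab pipeline event, whose payload also carries `object_kind` and `object_attributes.status` next to the `commit` segment shown above, and it assumes that only failed pipelines should trigger an email. `should_notify` is a hypothetical name.

```python
import re

# Matches "[AutoDeploy][Update] <INSTANCE_NAME>[,<INSTANCE_NAME>...]" where no
# name carries a "|<DEPLOYMENT ID>" suffix, i.e. periodic deployments only.
PERIODIC_MESSAGE = re.compile(r"^\[AutoDeploy\]\[Update\] [^\s|,]+(,[^\s|,]+)*$")


def should_notify(payload):
    """Decide whether a pipeline webhook payload warrants an email."""
    if payload.get("object_kind") != "pipeline":
        return False
    # Assumption: only failures are reported to the mailing list.
    if payload.get("object_attributes", {}).get("status") != "failed":
        return False

    message = payload.get("commit", {}).get("message", "")
    lines = message.strip().splitlines()
    first_line = lines[0] if lines else ""
    return bool(PERIODIC_MESSAGE.match(first_line))


example_payload = {
    "object_kind": "pipeline",
    "object_attributes": {"status": "failed"},
    "commit": {"message": "[AutoDeploy][Update] my-instance"},  # hypothetical instance
}
notify = should_notify(example_payload)
```

A regular deployment such as `[AutoDeploy][Update] my-instance|42` does not match, so failures of manually triggered deployments would not reach the mailing list.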

The webhook still needs an API endpoint to parse the request. We have two options for that:

  1. Use Ocim for parsing webhooks, or
  2. Set up FaaS on Kubernetes (using OpenFaaS)

The OpenFaaS option would be more beneficial and could provide a generic solution for Grove users. We could set it up using Helm charts and Terraform. Since this is not a feature that everyone would need, we can make it an optional component installed by Terraform.

Email content for build status changes

The email currently contains a notification message, the build configuration, and the most relevant log lines. Since the cluster repository is private, we need to get the log lines using GitLab's Jobs API. The log file should be downloaded and attached to the email notification, as we do now.
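A rough sketch of the two pieces involved: the Jobs API URL for a job's log (its "trace"), and an email with the log attached. Downloading the log and sending the email are left out. The addresses, subject line, file name, and helper names are assumptions made for the example.

```python
from email.message import EmailMessage

# Assumption: the project lives on gitlab.com.
GITLAB_API = "https://gitlab.com/api/v4"


def job_trace_url(project_id, job_id):
    """URL of the Jobs API endpoint that returns a job's log as plain text."""
    return f"{GITLAB_API}/projects/{project_id}/jobs/{job_id}/trace"


def build_failure_email(sender, recipient, instance_name, summary, log_text):
    """Compose a notification email with the job log attached as a text file."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = recipient
    message["Subject"] = f"Periodic build failed: {instance_name}"
    message.set_content(summary)
    # The log is expected to have been downloaded from job_trace_url(...)
    # with an authenticated request, since the repository is private.
    message.add_attachment(log_text, filename="build.log")
    return message


email = build_failure_email(
    sender="grove@example.com",          # hypothetical addresses
    recipient="workgroup@example.com",
    instance_name="my-instance",
    summary="The periodic build of my-instance failed. The job log is attached.",
    log_text="ERROR: deployment failed\n",
)
```

The resulting message could be handed to `smtplib` or any other mail transport the function runtime provides.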

Feature deprecation in Ocim

Although Grove and Ocim are separate tools, in some ways Ocim is tied to Grove. As the deployment method is changing from OpenStack to Kubernetes and the builds are managed by Grove, we should remove the periodic build feature from Ocim as it is not relevant anymore.

The feature removal depends on reaching feature parity for periodic builds, as described above.