Onboarding a New Component for Testing and Merge Automation
Overview
This document gives an overview of the workflow for onboarding new public component repositories to OpenShift CI. Private repositories are not supported at this time, but will be in the future.
If you are thinking about adding new images to OpenShift release payloads, read this section first, to avoid doing work you might have to adjust later.
Granting Robots Privileges and Installing the GitHub App
To add labels, move PRs and issues into milestones, merge PRs, and so on, the robots need write access to your repository.
If your repository is in the OpenShift organization and was created by the Dev Productivity team, it probably has Team OpenShift Robots added by default as part of the repo creation process. To check whether your repo has the robots added, ask in Slack in #forum-dp-platform or open a ticket. If your repo does not have the robots, open a new ticket in their Jira project with the details.
If your component repository is not in this organization:
- Invite openshift-ci-robot and openshift-merge-robot into your organization or add them as collaborators for the repository.
- The invitations will be accepted automatically via the periodic-openshift-release-master-accept-invitations-ci-robot and periodic-openshift-release-master-accept-invitations-merge-robot jobs in no more than 4 hours.
- The openshift-merge-robot must be given admin permissions to support either of the following use cases:
  - In order for tide to automatically merge PRs, it must be allowed to verify the repo’s allowed merge methods.
  - By default, we enable branch-protection for all prow-controlled repos. We can disable it in prow’s config.yaml.
- If a repository is enrolled in centralized branch management and no write permission is granted to openshift-merge-robot, ensure that the tide/merge-blocker label exists on the repository. Otherwise, the periodic-openshift-release-merge-blockers job would fail. See how to create a label in GitHub’s documentation.
- All repositories need the Openshift CI and the Openshift Merge Bot GitHub Apps installed. Go to the app and click Configure. Repeat the process for the second app. We plan to eventually replace the bot accounts entirely with these apps, but that work is not yet done. Both apps are required for automations to work properly; if one is missing, you might experience failures in tests like pull-ci-openshift-release-check-gh-automation.
Prow Configuration
Prow is the k8s-native upstream CI system, with source code hosted in the kubernetes-sigs/prow repository. Prow interacts with GitHub to provide the automation UX that developers use on their pull requests, and it orchestrates test workloads for those pull requests.
Bootstrapping Configuration for a new Repository
From the root of the openshift/release repository, run the following target and use the interactive tool to bootstrap a configuration for your repository:
$ make new-repo
This should fully configure your repository, so the changes that it produces are ready to be submitted in a pull request. The resulting YAML file, called $org-$repo-$branch.yaml, will be found in the ci-operator/config/$org/$repo directory.
Enabling Plugins
Plugins implement many portions of CI system functionality. They must be configured before they can be used, and they can be enabled as needed and configured for your repository independently of other repositories in the organization. After your repository is configured to deliver webhooks to Prow, those webhooks will be ignored until plugins are configured to consume and react to them. A live list of plugins and their descriptions is hosted on our website; please consult that list while reading the following section for more detail.
After initializing your repository with the make new-repo target, you can create a new plugin configuration for your repository with the following target:
|
|
This will place a new _pluginconfig.yaml file in the /core-services/prow/02_config/$org/$repo directory. This file is used to configure the specific plugins for your repository.
Default plugin configuration is stored in _plugins.yaml in the openshift/release repository. Plugins are enabled for a repository or organization under the plugins key. Plugin-specific configuration is under keys like label or owners. The set of a repository’s enabled plugins is the union of plugins configured for the repository’s organization (found at the plugins.yaml["plugins"]["$org"] key) and the repository itself (found in the /core-services/prow/02_config/$org/$repo/_pluginconfig.yaml file).
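To make the union rule concrete, here is a small shell sketch. The function name enabled_plugins and the idea of passing two plain-text lists are assumptions made for illustration only; Prow computes the effective set internally from the YAML configuration.

```shell
# enabled_plugins ORG_LIST REPO_LIST
# Hypothetical helper, not part of Prow or openshift/release.
# Each argument is a whitespace-separated list of plugin names.
# Prints the union of both lists, one name per line, sorted and de-duplicated.
enabled_plugins() {
    # $1 and $2 are intentionally unquoted so the shell splits them into words.
    printf '%s\n' $1 $2 | sed '/^$/d' | sort -u
}
```

For instance, calling enabled_plugins with the organization list "approve lgtm" and the repository list "lgtm" prints each plugin once, which mirrors why a repository only needs to list plugins it does not already inherit.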
Most individual plugins can be configured to change their behavior. Only some plugins allow for granular configuration at a repository level; many only expose global configuration options for all repositories that Prow monitors. If you think you need to configure an individual plugin, consult a CI administrator. While we work on a better solution, documentation for these options lives in the type Configuration struct here.
Repositories Under Existing Organizations
If you are onboarding a repository in an organization for which plugins are already configured, you will only need to enable plugins that you do not already inherit from the organization by adding those plugins under a new plugins.yaml["plugins"]["$org/$repo"] key. The one plugin you may want to configure at a repository-scoped level is:
Name | Description |
---|---|
approve | enables the /approve functionality with OWNERS files |
Review the list of plugins enabled for your owning organization and the live plugin catalog when choosing which plugins you want on your repo.
If your repository does not have OWNERS files, you will not be able to opt into the /approve process or automatic pull request review assignment. OWNERS file format and interaction details can be found upstream. You will also need to add Bugzilla component information to your OWNERS file. The format can be found here.
Repositories Under New Organizations
If you are onboarding a component that is not in any organization we already have configured, consider copying the openshift organization’s plugin configuration for your organization under a new plugins.yaml["plugins"]["$org"] key, or setting your repo’s configuration without adding any organization-wide configuration by adding the full list of plugins you require under a new plugins.yaml["plugins"]["$org/$repo"] key.
Describing Tests
Prow provides the following test trigger types:
Type Name | Trigger | Target | Purpose |
---|---|---|---|
presubmit | Push to a PR | A single PR merged into the branch it is targeting | Testing commits within a PR before they are merged |
postsubmit | Push/merge to a branch | User-specified set of branches | Integration tests after a PR is merged |
periodic | cron-like schedule | User-specified set of branches | Scheduled test runs |
Configuration for your repository’s tests lives in YAML files in the openshift/release repository. Jobs are stored in many files, sharded by branch and job type, found under: ci-operator/jobs/$org/$repo/$org-$repo-$branch-$jobtype.yaml.
The org and repo names are repeated because the basename of your YAML file must be unique under the ci-operator/jobs tree. More detailed Prow job configuration documentation lives upstream.
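The naming conventions above can be spelled out as two small shell helpers. This is only a sketch: the function names are hypothetical, and the example arguments are placeholders rather than a statement about any real repository.

```shell
# Hypothetical helpers that expand the path conventions described above.

# ci_config_path ORG REPO BRANCH
# Prints the ci-operator configuration path for one branch of a repository.
ci_config_path() {
    printf 'ci-operator/config/%s/%s/%s-%s-%s.yaml\n' "$1" "$2" "$1" "$2" "$3"
}

# prow_jobs_path ORG REPO BRANCH JOBTYPE
# Prints the Prow job file path for one branch and job type.
prow_jobs_path() {
    printf 'ci-operator/jobs/%s/%s/%s-%s-%s-%s.yaml\n' "$1" "$2" "$1" "$2" "$3" "$4"
}
```

Because the org and repo appear in both the directory and the file name, two repositories with the same branch name can never produce the same basename.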
Generating Prow jobs from ci-operator configuration files
The Test Platform team created a tool to generate Prow job configuration files from ci-operator configuration files. The generator knows the naming and directory structure conventions of the openshift/release repository. Provided you have put the ci-operator configuration file into the ci-operator/config/$org/$repo directory of that repository (as described in the Containerized Tests section), you can generate the needed Prow files by running this command from the root of the openshift/release repository:
|
|
This will create all necessary files under ci-operator/jobs/$org/$repo, creating a good default set of Prow jobs.
Note
This will, however, completely regenerate jobs for all configuration files in openshift/release, which can take a long time due to the size of the repository. Generating jobs for one or a few repositories is fast, and can be done by specifying a subdirectory of ci-operator/config using the WHAT parameter.
Setting up team ownership of ci-operator and Prow config files
While the initial PR to openshift/release will need to be reviewed and approved by root approvers, once the component config is in place, it should be owned by the component team. To achieve this, an OWNERS file mirroring what exists upstream should be placed into both the ci-operator/config/$org/$repo and ci-operator/jobs/$org/$repo directories.
Shortly following the merge of the onboarding PR, the periodic-prow-auto-owners job will sync the OWNERS file from the component repository to the relevant directories. Assuming this file exists in the repo’s base directory, this sync is performed periodically, and will sync all members in that file that are also members of the openshift GitHub org. This means that the component’s OWNERS file is the only one that needs to be manually updated. See #forum-pge-cloud-ops to gain membership in the openshift org.
Enabling Automatic Merges
Prow’s tide component periodically searches for pull requests that fit merge criteria (for instance, presence of an lgtm label and absence of the do-not-merge/hold label) and merges them. Before a pull request is considered for merging, Tide furthermore requires not only that all required tests in the Prow configuration succeed and all posted statuses on the GitHub pull request are green, but also that the tests tested the latest commit in the pull request on top of the latest commit in the branch that the pull request is targeting.
To enable Tide, place a new _prowconfig.yaml file in the /core-services/prow/02_config/$org/$repo directory and configure Tide under the top-level ["tide"] key. The easiest way to get the correct config is to copy it from an existing repo with similar requirements. Tide’s configuration options are documented upstream.
If your repository does not have OWNERS files, or if you have not chosen to opt into the /approve process, it is suggested that you require only the lgtm label and not approve as well.
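As a rough illustration of an lgtm-only setup, the sketch below writes a minimal _prowconfig.yaml. The helper name write_tide_config is hypothetical, and the queries, repos, labels and missingLabels field names are taken from upstream Prow's Tide query options as best understood here; verify them against the upstream documentation and prefer copying a real configuration from a similar repository, as suggested above.

```shell
# write_tide_config DEST_DIR ORG REPO
# Hypothetical helper: writes DEST_DIR/_prowconfig.yaml containing one Tide
# query for ORG/REPO that requires the lgtm label and skips pull requests
# labelled do-not-merge/hold. Real configurations usually list more labels.
write_tide_config() {
    mkdir -p "$1" || return 1
    cat > "$1/_prowconfig.yaml" <<EOF
tide:
  queries:
  - repos:
    - $2/$3
    labels:
    - lgtm
    missingLabels:
    - do-not-merge/hold
EOF
}
```

From the root of openshift/release, a call such as write_tide_config core-services/prow/02_config/example-org/example-repo example-org example-repo would produce the file in the directory named in this section.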
Note
Due to GitHub’s rate limiting, it is possible that the Tide check will not appear on any given PR. The PR will still merge with Tide, and this is purely a cosmetic issue.
Who can /approve?
The repo’s OWNERS and OWNERS_ALIASES files define the list. This is a concept of the Prow workflow. Those files also define who can be selected as reviewers of PRs in that repo. See Prow’s doc on this topic.
Who can /lgtm?
GitHub users who are the repo’s collaborators. This is a GitHub concept. Contributors should follow this mojo to become a collaborator for repositories in the openshift org. See Prow’s doc on this topic.
CI Operator Configuration
CI-Operator is a second-level orchestrator which translates Prow’s testing requests into OpenShift-native test workloads. Think translating “run integration tests on my PR” into “trigger an OpenShift Build to create a container with test artifacts and a Pod to run the integration test using that container image”.
While the Prow configuration describes when to run a test, the CI Operator configuration describes the test’s content. Consult the ci-operator configuration reference document for information on specific fields.
Containerized Tests
Adding a containerized test is as simple as adding an entry to the tests array in the CI Operator configuration file and a Prow job configuration that runs the test with --target=$target. Consult the documentation for more details on how to configure containerized tests and test them locally.
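To show what such an entry might look like, the sketch below prints one item for the tests array. The helper name print_test_entry is hypothetical, and the as, commands and container.from fields (with src as the default image) are assumptions recalled from the ci-operator schema; check them against the configuration reference before use.

```shell
# print_test_entry NAME COMMAND [FROM_IMAGE]
# Hypothetical helper: prints one entry for the "tests" array of a ci-operator
# configuration file. FROM_IMAGE defaults to "src" (assumed here to be the
# image that holds the repository source). COMMAND is emitted verbatim, so
# commands containing YAML special characters would need quoting by hand.
print_test_entry() {
    cat <<EOF
- as: $1
  commands: $2
  container:
    from: ${3:-src}
EOF
}
```

With an entry named unit, the matching Prow job would run the test with --target=unit.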
We recommend breaking up your tests into logical sections when adding tests here. More granular test reporting will allow for higher parallelism during test execution and more efficient re-testing if one suite fails.
Containerized tests are configured first-class in CI Operator configuration files in the openshift/release repository, sharded by branch, at: ci-operator/config/$org/$repo/$org-$repo-$branch.yaml. The org and repo name redundancy is due to a requirement that all filenames be unique under ci-operator/config.
End-to-End Tests
Tests that require a running OpenShift cluster should use one of the provided step registry workflows; more details are here.
Non-OpenShift organization users
Please read the section about creating your own cluster profile. It is needed for all parties outside of the OpenShift organization to operate on the test platform using cloud accounts provided by your organization. Alternatively, you may be interested in creating your own cluster pool.
Image Publishing and Mirroring
When container images are declared as release artifacts for a repository in the CI Operator configuration file under the images list (like this), a synthetic [images] target is available for CI Operator execution that will simply build all release images. In order for the container images built from your repository to be published, the Prow job generator will configure a Prow postsubmit job that uses the CI Operator --target=[images] and --promote flags.
Information on how to publish images to an external registry can be found in a separate document.
Product builds and becoming part of an OpenShift release
Some images become part of the OpenShift release because they are core to the platform. To be part of the release, there are additional requirements (beyond those described in the previous section). Adding images to releases is a significant addition, and requires an enhancement to explain why the new functionality is important, and why it should be delivered via OpenShift releases (instead of via an OLM-installed operator). It also requires the approval of image names for mirroring. You can delay the enhancement proposal and image naming discussion if you need to get through some later steps in order to figure out what to put in the enhancement, but it is good to check with likely enhancement and image-naming approvers first. Delaying comes with the risk that you sink in some work and then enhancement or image-name review ends up rejecting your idea or requesting invasive changes.
There are two types of component images in the release payloads:
- Operators managed by the CVO, known as second level operators. If a Dockerfile contains LABEL io.openshift.release.operator true, the component is a second level operator.
- Operands managed by second level operators. These images are pulled into the release payload by virtue of being specified in an image-references file by a second level operator.
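A quick way to tell which kind of component a Dockerfile describes is to search it for that label. The helper below is a heuristic sketch with a hypothetical name; it only inspects single-line LABEL instructions and is not how the release tooling itself decides.

```shell
# is_second_level_operator DOCKERFILE
# Hypothetical helper: exits 0 if DOCKERFILE declares the release operator
# label, non-zero otherwise. Matches "LABEL io.openshift.release.operator true"
# and the key=value spelling, but not LABEL instructions that carry several
# labels or continue across lines.
is_second_level_operator() {
    grep -Eq '^[[:space:]]*LABEL[[:space:]]+io\.openshift\.release\.operator[[:space:]=]+"?true"?[[:space:]]*$' "$1"
}
```

It can be used in a condition, for example: if is_second_level_operator Dockerfile; then echo "second level operator"; fi.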
All product teams that will ship an image to product must ensure their image is built in OSBS at least once and published back to the nightly test jobs BEFORE they reference it from another component (via the image-references file), or before they set the image label io.openshift.release.operator to get automatically included.
To include the build version, you can consume environment variables like OS_GIT_VERSION and BUILD_VERSION set by ART’s Doozer in your Dockerfile. For example, here is build-machinery-go passing OS_GIT_VERSION, SOURCE_GIT_COMMIT, and SOURCE_GIT_TREE_STATE through to Go. CVO manifests may also use the 0.0.1-snapshot placeholder to have the OCP release version injected at release-assembly time.
- Ensure you have successfully published your image to the CI integration stream.
- Follow the ART instructions to have them build your image.
  - On the dist-git part of the process, it is critical that you ensure your component/image names match as described in the bulleted criteria.
- Ensure a single successful build is run (sync with ART to confirm). You can also check for your new image name in both ART and CI ImageStreams by checking CI registries:

    $ podman login -u=$(oc --context app.ci whoami) -p=$(oc --context app.ci whoami -t) quay-proxy.ci.openshift.org --authfile /tmp/t.c
    $ oc image info quay-proxy.ci.openshift.org/openshift/ci:ocp_4.15_cluster-config-api -a /tmp/t.c | head -n2 # looking for the 'cluster-config-api' name in the CI ImageStream for 4.15
    Name: quay-proxy.ci.openshift.org/openshift/ci:ocp_4.15_cluster-config-api
    Digest: sha256:59ec20828d39a8b7c971d2d6e6142d2b3c4a45996038fe0d19ada0372a775598
    $ oc image info registry.ci.openshift.org/ocp/4.15-art-latest:cluster-config-api | head -n2 # looking for the 'cluster-config-api' name in the ART ImageStream for 4.15
    Name: registry.ci.openshift.org/ocp/4.15-art-latest:cluster-config-api
    Digest: sha256:aefb9a83bac984a1bb54d3976f38ff25c60bc101a40117e63396bfb8891f190a

- Open the PR to add LABEL io.openshift.release.operator true or to add your new image to another component (your operator, usually).
- Once the PR is merged, verify that the nightly builds continue to pass (usually 2-3 hours after your PR merges) and that you didn’t break the OCP CI test.
Renaming or removing components in the OpenShift release payload
It is occasionally necessary to rename or remove existing components in the OpenShift release payload. This must be done with care. Missteps may cause production and CI automation to be unable to create new release payloads – impacting the entire organization. Observing the process as it unfolds and maintaining good communication with the ART team is crucial.
Note
Before proceeding, it is important to understand the name you are changing. The remainder of this section describes how to change the component names used within the release payload (i.e. the names listed as a result of running oc adm release info ...).
This procedure will not change the repository to which an image is published on registry.redhat.io (e.g. registry.redhat.io/openshift4/ose-cluster-etcd-rhel8-operator…).
To change the registry.redhat.io repository name, create a copy of this ART team template describing the desired change. The repository on registry.redhat.io is called the “Comet Repo” and has no impact on the content / construction of the release payload.
Changing the component name of a second level operator
It is uncommon for other components to directly reference the release payload component name of a second level operator. However, if your second level operator is referenced, follow the procedure for changing the component name of an operand.
Note
To detect if another component references the component about to be changed via an image-references file, you can use oc and --exclude the old name of the component.
In this example, cluster-etcd-operator references etcd.
Failure to follow all steps can lead to important and difficult to detect disparities between what is tested in CI and what is shipped to customers.
Steps:
- If a name change was not directly requested by a staff engineer, ensure that a staff engineer agrees on the name change (@aos-staff-engineers on Slack).
- Open and /hold a pull request, PR1, against github.com/openshift/release to change the component’s name in CI. In virtually all cases, this will be a PR against the component’s ci-config for its main branch. A component’s name should be specified in the images.to: stanza of a component’s CI configuration. The new tag name should match exactly what will appear in oc adm release info when the component is listed.
- Open and /hold a pull request, PR2, against github.com/openshift/ocp-build-data in the openshift-4.x branch of the targeted release(s). Note that ART may have already cut a new release branch, meaning you need to open a PR for openshift-4.x+1 as well. The PR should change the name field in the component’s metadata. Note that in ART metadata, the desired name should be prefixed with ose-. For example, the cluster-etcd-operator payload component name is defined here, for openshift-4.11.
- Submit a copy of this ART team template to communicate to ART that a component name change is desired. Include the PRs in the Jira ticket.
- All PRs should be passing tests and ready to merge. No PRs other than PR1 should be merged in the component’s github.com repository during the following process. In a synchronous chat with an ART team member or release manager (@release-artists in #aos-art on Slack) the following should be performed:
  - On the central app.ci CI cluster, a release-artist should check the current registry.ci.openshift.org image associated with the component in the -n ocp is/4.{minor} release image stream. For example, oc -n ocp get istag 4.11:{old-component-name} -o=json | jq .tag.from.name should output a pullspec like registry.ci.openshift.org/ocp/4.11@sha256:... for 4.11.
  - Unhold PR1 and allow it to merge.
  - Once PR1 merges, the release-artist should run oc -n ocp tag {existing-registry.ci-openshift-org@sha256..} 4.{minor}:{new-component-name} followed immediately by oc -n ocp tag 4.{minor}:{old-component-name} -d to remove the old component name from CI.
  - Unhold PR2 and have the release-artist merge it.
- Monitor the subsequent CI payloads on the amd64 OpenShift release controller. Continue to do so until you see a CI payload produced which reports the new component name in oc adm release info <ci-release-payload-pullspec>. If steps in this process were missed, following the hyperlink for a payload name in the release controller will simply display an error message stating that the release controller was unable to create a release, with a few details about the problem’s cause. If this error is displayed, immediately report the issue to @team-technical-release and @release-artists so that the incident can be recovered as quickly as possible.
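The release-artist commands from the steps above can be laid out in order with a dry-run helper. This is a sketch only: the function name is hypothetical, nothing is executed against a cluster, and the pullspec argument stands for whatever the first command actually reports.

```shell
# print_operator_rename_commands MINOR OLD_NAME NEW_NAME PULLSPEC
# Hypothetical dry-run helper: prints, in order, the oc commands quoted in the
# steps above for renaming a second level operator. It runs nothing itself.
print_operator_rename_commands() {
    minor=$1 old=$2 new=$3 pullspec=$4
    echo "# 1. Before PR1 merges, record the current image for the old name:"
    echo "oc -n ocp get istag 4.${minor}:${old} -o=json | jq .tag.from.name"
    echo "# 2. Once PR1 has merged, tag that image under the new name:"
    echo "oc -n ocp tag ${pullspec} 4.${minor}:${new}"
    echo "# 3. Immediately afterwards, remove the old name:"
    echo "oc -n ocp tag 4.${minor}:${old} -d"
}
```

Printing the commands first gives both parties in the synchronous chat something to review before anything is changed on app.ci.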
Changing the component name of an operand
Release payload operand component names are referenced in second level operator github.com repositories. A change must merge in the second level operator’s image-references file, or release payloads will fail to assemble after the operand component name is changed.
Steps:
- If a name change was not directly requested by a staff engineer, ensure that a staff engineer agrees on the name change (@aos-staff-engineers on Slack).
- Open and /hold a pull request, PR1, against github.com/openshift/release to change the operand component’s name in CI. In virtually all cases, this will be a PR against the component’s ci-config for its main branch. A component’s name should be specified in the images.to: stanza of a component’s CI configuration. The new tag name should match exactly what will appear in oc adm release info when the component is listed.
- Open and /hold a pull request, PR2, against the repo(s) of the operator(s) which reference the old operand component name in their image-references file. PR2 should update image-references to use the new component name.
- Open and /hold a pull request, PR3, against github.com/openshift/ocp-build-data in the openshift-4.x branch of the targeted release(s). Note that ART may have already cut a new release branch, meaning you need to open a PR for openshift-4.x+1 as well. The PR should change the name field in the component’s metadata. Note that in ART metadata, the desired name should be prefixed with ose-. For example, the etcd operand component’s name definition can be seen here, for openshift-4.11.
- Submit a copy of this ART team template to communicate to ART that a component name change is desired. Include the PRs in the Jira ticket.
- PR1 and PR3 should be passing tests and ready to merge (PR2 will be failing at this point). No PRs other than PR1 and PR2 should be merged in their respective github.com repositories during the following process. In a synchronous chat with an ART team member or release manager (@release-artists in #aos-art on Slack) the following should be performed:
  - On the central app.ci CI cluster, a release-artist should check the current registry.ci.openshift.org image associated with the component in the -n ocp is/4.{minor} release image stream. For example, oc -n ocp get istag 4.11:{old-component-name} -o=json | jq .tag.from.name should output a pullspec like registry.ci.openshift.org/ocp/4.11@sha256:... for 4.11.
  - The release-artist should run oc -n ocp tag {existing-registry.ci-openshift-org@sha256..} 4.{minor}:{new-component-name} to establish a tag with the new operand component name for CI.
  - Run /retest on PR2(s). Tests should now pass.
  - Unhold and merge PR1.
  - Unhold and merge PR2(s).
  - Unhold and have the release-artist merge PR3. The time between PR2 and PR3 merging should be kept to a minimum to avoid ART nightlies failing to assemble.
  - It is not time sensitive, but before the ART Jira ticket is closed, the release-artist must remove the old component name from CI: oc -n ocp tag 4.{minor}:{old-component-name} -d.
- Monitor the subsequent CI payloads on the amd64 OpenShift release controller. Continue to do so until you see a CI payload produced which reports the new component name in oc adm release info <ci-release-payload-pullspec>. If steps in this process were missed, following the hyperlink for a payload name in the release controller will simply display an error message stating that the release controller was unable to create a release, with a few details about the problem’s cause. If this error is displayed, immediately report the issue to @team-technical-release and @release-artists so that the incident can be recovered as quickly as possible.
- During the ensuing work day, check ART nightlies on the s390x release controller. Continue to do so until you see an ART nightly produced which reports the new component name in oc adm release info <art-s390x-nightly-release-payload-pullspec>. If steps in this process were missed, following the hyperlink for a payload name in the release controller will simply display an error message stating that the release controller was unable to create a release, with a few details about the problem’s cause. If this error is displayed, immediately report the issue to @team-technical-release and @release-artists so that the incident can be recovered as quickly as possible. Changes will also eventually be apparent on the amd64 release controller, but, due to differences in acceptance testing, they will be evident in the s390x release controller much sooner.
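The monitoring steps above come down to checking whether the new name shows up in the release info listing. The filter below is a minimal sketch with a hypothetical name; it assumes the listing prints one component per line with the component name in the first column, which should be confirmed against real output.

```shell
# payload_has_component NAME
# Hypothetical helper: reads the output of "oc adm release info <pullspec>" on
# standard input and exits 0 if any line's first field equals NAME.
# Assumes one component per line with the name in the first column.
payload_has_component() {
    awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}
```

It would be used as oc adm release info <ci-release-payload-pullspec> | payload_has_component {new-component-name}, and run again with the old name to confirm that the old name is gone.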
Removing a component from the OpenShift release payload
Steps:
- Submit a copy of this ART team template to communicate to ART that a component change is desired. Take care to mention whether the image should also be removed from the 4.x+1 branch of ART’s metadata in case it has already been branched.
- Open and /hold a pull request, PR1, removing references to the old component from any second level operator which includes the component in its image-references file.
- Open and /hold a pull request, PR2, to github.com/openshift/release which, minimally, removes the promotion: stanza from the ci-operator configuration for the component and affected release(s). If the component is being completely removed from CI, PR2 can be the complete deletion of the ci-operator configuration file for the component / branch.
- Via the ART Jira ticket, have a release-artist prepare a github.com/openshift/ocp-build-data pull request, PR3, which either removes the component metadata or prevents its inclusion in the release payload (for_payload: false).
- In a synchronous chat with an ART team member or release manager (@release-artists in #aos-art on Slack) the following should be performed:
  - Unhold and merge PR1.
  - Unhold and merge PR2.
  - Unhold and have the release-artist merge PR3.
  - The release-artist should then remove the old component name from CI: oc -n ocp tag 4.{minor}:{old-component-name} -d.