This section describes in detail some aspects of the implementation of
ci-operator and associated programs, with the intent of serving as an
auxiliary guide for developers working with the
openshift/ci-tools repository.
ci-operator has a number of foundational principles:
To achieve this, ci-operator requires configuration to understand the build
process for each component as well as every output container image. This
document overviews the workflow that ci-operator uses to build components and
structure tests.
Every invocation of ci-operator creates a workspace to isolate test execution,
seeds it with build inputs and the published component images from the OpenShift
release if the component under test is a part of one, then schedules test
workflows as Kubernetes and OpenShift objects.
ci-operator is at its core a task scheduling program. The input configuration
is processed and used to build a task graph, which is then executed until
completion, failure, or interruption. Thus, the execution flow of ci-operator
can be divided into these major phases:
In the code base, these phases correspond to the following modules in the
pkg directory:
- api: Go types used by all phases
- load: I/O operations for types
- registry: configuration resolution using the step registry
- config: input configuration processing
- validation: input configuration validation
- defaults: mapping from inputs to tasks
- steps: definitions for each task type, task execution

To avoid repeating work, ci-operator needs to determine when work can be
re-used. The tool identifies a build of any specific job with a hash of:
With such an identifier, ci-operator can determine if two builds are using the
same configuration on the same inputs and can therefore re-use common work.
This identifier is used to create the Kubernetes Namespace in which the test
workloads will run and is furthermore available to tests via the NAMESPACE
environment variable.
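The idea of deriving a Namespace name from a hash of the inputs can be sketched in Go as follows. This is a minimal illustration, not ci-operator's actual code: the helper `namespaceFor`, the choice of SHA-256 with a base32 suffix, and the example input strings are all assumptions. Only the `ci-op-` prefix is taken from the sample output in this document.

```go
package main

import (
	"crypto/sha256"
	"encoding/base32"
	"fmt"
	"strings"
)

// namespaceFor derives a deterministic Namespace name from a list of inputs.
// Hypothetical helper: the inputs and the encoding are illustrative
// assumptions, not ci-operator's real hashing scheme.
func namespaceFor(inputs ...string) string {
	h := sha256.New()
	for _, in := range inputs {
		// Prefix each input with its length so that ("ab", "c") and
		// ("a", "bc") do not produce the same byte stream.
		fmt.Fprintf(h, "%d:%s;", len(in), in)
	}
	enc := base32.StdEncoding.WithPadding(base32.NoPadding).EncodeToString(h.Sum(nil))
	// Lowercase base32 only uses a-z and 2-7, which is valid in a Namespace name.
	return "ci-op-" + strings.ToLower(enc[:8])
}

func main() {
	a := namespaceFor("config-v1", "master@ca6de370", "#2883@3530bc01")
	b := namespaceFor("config-v1", "master@ca6de370", "#2883@3530bc01")
	c := namespaceFor("config-v2", "master@ca6de370", "#2883@3530bc01")
	fmt.Println(a == b) // identical inputs map to the same Namespace
	fmt.Println(a == c) // changing any input maps to a different Namespace
}
```

Two jobs that compute the same name land in the same Namespace, which is what makes the re-use described above possible.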
Input resolution can be identified in the ci-operator output by all of the
steps that precede the creation of the test Namespace:
INFO[2022-07-11T17:03:22Z] ci-operator version v20220708-ca6de370c
INFO[2022-07-11T17:03:22Z] Loading configuration from https://config.ci.openshift.org for openshift/ci-tools@master
INFO[2022-07-11T17:03:22Z] Resolved source https://github.com/openshift/ci-tools to master@ca6de370, merging: #2883 3530bc01 @smg247
INFO[2022-07-11T17:03:23Z] Using namespace https://console-openshift-console.apps.build04.34d2.p2.openshiftapps.com/k8s/cluster/projects/ci-op-022dqmlr
Most log messages during input resolution have debug priority, so the log file is more informative in this case:
{"level":"info","msg":"ci-operator version v20220708-ca6de370c","time":"2022-07-11T17:03:22Z"}
{"level":"info","msg":"Loading configuration from https://config.ci.openshift.org for openshift/ci-tools@master","time":"2022-07-11T17:03:22Z"}
{"level":"debug","msg":"performing request method GET url https://config.ci.openshift.org/config?branch=master&org=openshift&repo=ci-tools","time":"2022-07-11T17:03:22Z"}
{"level":"info","msg":"Resolved source https://github.com/openshift/ci-tools to master@ca6de370, merging: #2883 3530bc01 @smg247","time":"2022-07-11T17:03:22Z"}
{"level":"debug","msg":"Determining if build cache build-cache/openshift-ci-tools:master can be used in place of root ci/ci-tools-build-root:1.18","time":"2022-07-11T17:03:22Z"}
{"level":"debug","msg":"Resolved build cache build-cache/openshift-ci-tools:master to sha256:de666c2027b84ff2b1ca1a4cafb08959aadcd5e8ba9e6a40c2767bbda87d1599","time":"2022-07-11T17:03:22Z"}
{"level":"debug","msg":"Build cache build-cache/openshift-ci-tools:master is based on root image at sha256:f8c36a557d17e88976fea1349279a656e546b299d034e985b4ae43309003153d","time":"2022-07-11T17:03:22Z"}
{"level":"debug","msg":"Resolved root image ci/ci-tools-build-root:1.18 to sha256:f8c36a557d17e88976fea1349279a656e546b299d034e985b4ae43309003153d","time":"2022-07-11T17:03:22Z"}
{"level":"debug","msg":"Using build cache build-cache/openshift-ci-tools:master as root image.","time":"2022-07-11T17:03:22Z"}
{"level":"debug","msg":"Resolved build-cache/openshift-ci-tools:master (root) to sha256:de666c2027b84ff2b1ca1a4cafb08959aadcd5e8ba9e6a40c2767bbda87d1599.","time":"2022-07-11T17:03:22Z"}
{"level":"debug","msg":"Resolved origin/centos:stream8 (base_image: os) to sha256:ad7d81f622a590e73c34ec20b4ae6a0ff162b1e7306d121d3d634949bfae6b45.","time":"2022-07-11T17:03:22Z"}
{"level":"debug","msg":"Resolved ocp/4.10:cli (base_image: cli) to sha256:954f2ea4c53cfdc6b439ba691180b6a3fcba64cc7d71d49bf81f269066fe4af6.","time":"2022-07-11T17:03:22Z"}
{"level":"debug","msg":"Resolved ci/golangci-lint:v1.45.2 (base_image: golangci-lint) to sha256:50d12acf8ef7c41545ae8e1fe14cfc22e690baca7d2bb41d40da8feb8a78cabe.","time":"2022-07-11T17:03:23Z"}
{"level":"trace","msg":"Using binary as hash: /usr/bin/ci-operator 1657322166 69194954","time":"2022-07-11T17:03:23Z"}
{"level":"info","msg":"Using namespace https://console-openshift-console.apps.build04.34d2.p2.openshiftapps.com/k8s/cluster/projects/ci-op-022dqmlr","time":"2022-07-11T17:03:23Z"}
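Because the log file is a stream of JSON objects, one per line, it is easy to filter by level. The following Go sketch assumes only the three fields visible in the excerpt above (`level`, `msg`, `time`); the `entry` type and `filterLevel` helper are hypothetical names introduced for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// entry mirrors the fields visible in the log excerpt.
type entry struct {
	Level string `json:"level"`
	Msg   string `json:"msg"`
	Time  string `json:"time"`
}

// filterLevel returns the messages of all log lines with the given level.
// Lines that are not valid JSON are skipped.
func filterLevel(lines []string, level string) []string {
	var msgs []string
	for _, line := range lines {
		var e entry
		if err := json.Unmarshal([]byte(line), &e); err != nil {
			continue
		}
		if e.Level == level {
			msgs = append(msgs, e.Msg)
		}
	}
	return msgs
}

func main() {
	log := `{"level":"info","msg":"ci-operator version v20220708-ca6de370c","time":"2022-07-11T17:03:22Z"}
{"level":"debug","msg":"Using build cache build-cache/openshift-ci-tools:master as root image.","time":"2022-07-11T17:03:22Z"}`
	for _, m := range filterLevel(strings.Split(log, "\n"), "debug") {
		fmt.Println(m)
	}
}
```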
The hash created from input resolution is used to create a Namespace as an
isolated workspace for the test; the Namespace is subsequently initialized for
use by the test workloads.
All input images for the tests that are described in the configuration YAML are
tagged in, as are all images that form the larger release that the test is a
part of. Images that are used for the build graph, like those identified with
the base_images, base_rpm_images, and build_root stanzas, have
ImageStreamTags created for them in the pipeline ImageStream in the test
Namespace. Images that are part of the release that the test exists within,
as specified with the optional releases stanza, are mirrored to
ImageStreamTags in the stable ImageStream within the test Namespace.
In order to ensure that resources from tests do not leak on the cluster the
tests are executed on, two TTLs are set on each Namespace, and the
ci-ns-ttl-controller
enforces them by reaping the Namespace once either has expired. The hard TTL
(cleanupDuration, currently 24 hours) describes how much time can pass after
the creation of the Namespace before it is reaped; the soft TTL
(idleCleanupDuration, currently 1 hour) describes how
much time can pass without any active Pods in the Namespace before it is
reaped. Whichever TTL is reached first triggers reaping.
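The reaping rule reduces to a simple predicate. The sketch below is not the controller's code: `shouldReap` is a hypothetical function, and treating a TTL as expired when the elapsed time is greater than or equal to the limit is an assumption. The constant names and values come from the text above.

```go
package main

import (
	"fmt"
	"time"
)

const (
	cleanupDuration     = 24 * time.Hour // hard TTL, counted from Namespace creation
	idleCleanupDuration = 1 * time.Hour  // soft TTL, counted from the last active Pod
)

// shouldReap reports whether a Namespace should be deleted.
// age is the time since the Namespace was created; idle is the time
// since it last had an active Pod. Either TTL expiring is sufficient.
func shouldReap(age, idle time.Duration) bool {
	return age >= cleanupDuration || idle >= idleCleanupDuration
}

func main() {
	fmt.Println(shouldReap(2*time.Hour, 90*time.Minute)) // idle too long: reaped
	fmt.Println(shouldReap(25*time.Hour, 0))             // too old, even though active: reaped
	fmt.Println(shouldReap(3*time.Hour, 10*time.Minute)) // young and recently active: kept
}
```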
A configuration file for ci-operator defines build steps, test targets and
output images for a component git repository. A graph of build dependencies
is built from this configuration in order to determine what concrete actions
need to occur for any specific target. Each invocation of ci-operator
specifies one or more targets with the --target flag; for each target, the
build graph is traversed to execute dependent steps first.
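Executing dependent steps first amounts to a depth-first, post-order walk of the graph. The Go sketch below illustrates that idea under stated assumptions: the `step` type, the `order` function, and the example graph (including the `root` and `unit` nodes and the edges between steps) are hypothetical and do not reflect ci-operator's types. It also runs sequentially, whereas a real scheduler may run independent steps concurrently.

```go
package main

import "fmt"

// step is a node in a hypothetical build graph: a name plus the names
// of the steps that must finish before it can run.
type step struct {
	name     string
	requires []string
}

// order returns the steps needed for target, dependencies first.
// It returns an error for unknown steps and for dependency cycles.
func order(graph map[string]step, target string) ([]string, error) {
	var out []string
	state := map[string]int{} // 0 = unvisited, 1 = in progress, 2 = done
	var visit func(name string) error
	visit = func(name string) error {
		s, ok := graph[name]
		if !ok {
			return fmt.Errorf("unknown step %q", name)
		}
		switch state[name] {
		case 1:
			return fmt.Errorf("dependency cycle at %q", name)
		case 2:
			return nil // already scheduled via another path
		}
		state[name] = 1
		for _, dep := range s.requires {
			if err := visit(dep); err != nil {
				return err
			}
		}
		state[name] = 2
		out = append(out, name)
		return nil
	}
	if err := visit(target); err != nil {
		return nil, err
	}
	return out, nil
}

func main() {
	graph := map[string]step{
		"root":     {name: "root"},
		"src":      {name: "src", requires: []string{"root"}},
		"bin":      {name: "bin", requires: []string{"src"}},
		"test-bin": {name: "test-bin", requires: []string{"src"}},
		"unit":     {name: "unit", requires: []string{"test-bin"}},
	}
	steps, err := order(graph, "unit")
	fmt.Println(steps, err) // [root src test-bin unit] <nil>
}
```

Note that `bin` is not part of the result: only steps reachable from the requested target are executed.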
The ci-operator configuration file creates some implicit build steps:
| Output ImageStreamTag | Action |
|---|---|
| pipeline:src | clones the refs under test |
| pipeline:bin | runs the binary_build_commands |
| pipeline:test-bin | runs the test_binary_build_commands |
| pipeline:rpms | runs the rpm_build_commands |
Container image builds, whether from the implicit pipeline steps above or
from explicit image build configurations in the images stanza, are executed
using OpenShift Builds. Test targets in the tests stanza are executed using
Kubernetes Pods. As all of the test workflow execution objects are created in
a Namespace shared for all jobs with the same input, re-use is achieved by
deterministic naming. For instance, the src Build that creates the
pipeline:src ImageStreamTag will be created only once in a given
Namespace; other builds of jobs that require this build step will see the
Build running and wait for it to complete or see the ImageStreamTag existing
and consider the build step finished.
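The create, wait, or skip decision described above can be modelled with an in-memory stand-in for the Namespace. This Go sketch is only an illustration of the decision logic: the `namespace` type and its `ensure` and `complete` methods are hypothetical, and a real implementation would query the cluster API and handle concurrent creation conflicts.

```go
package main

import "fmt"

// namespace is an in-memory stand-in for the objects in a test Namespace.
type namespace struct {
	builds map[string]bool // Build name -> true once the Build has completed
	tags   map[string]bool // ImageStreamTags that already exist
}

// ensure decides what a job must do for the step that produces
// pipeline:<name>. Because names are deterministic, every job with the
// same inputs looks up the same objects.
func (ns *namespace) ensure(name string) string {
	tag := "pipeline:" + name
	if ns.tags[tag] {
		return "skip: " + tag + " exists"
	}
	if _, exists := ns.builds[name]; exists {
		return "wait: Build " + name + " is running"
	}
	ns.builds[name] = false
	return "create: Build " + name
}

// complete marks a Build as finished and publishes its output tag.
func (ns *namespace) complete(name string) {
	ns.builds[name] = true
	ns.tags["pipeline:"+name] = true
}

func main() {
	ns := &namespace{builds: map[string]bool{}, tags: map[string]bool{}}
	fmt.Println(ns.ensure("src")) // the first job creates the Build
	fmt.Println(ns.ensure("src")) // a second job finds it running and waits
	ns.complete("src")
	fmt.Println(ns.ensure("src")) // later jobs see the tag and skip the step
}
```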
Below is a graph showing the various pipeline images:

Solid boxes are images, solid lines are dependencies. The dashed stable box
represents the “internal” promotion to the stable stream prior to the execution
of tests. Dashed lines represent edges not fully depicted since they are
optional and can be added to any image in the pipeline:
- Each image referenced in operator.substitutions makes src-bundle depend on that image.
- The operator.base_index entry, if specified, makes all index generator images depend on that image.