Add a New Cluster Profile
This document lays out the process of getting a new cluster profile added to the test platform so that jobs can begin using it.
Info
Adding a cluster profile is just one of the steps necessary to enable CI for new platforms. For high-level information about the platform enablement process, please see the OpenShift Infrastructure Provider Onboarding Guide.
What Is a Cluster Profile?
The `cluster_profile` is a `ci-operator` concept that bundles together several pieces of configuration to make it easier to set up jobs and steps that can operate on different cloud infrastructures. When a `cluster_profile` is added to a job or workflow, the following actions occur:
- all steps in the workflow will have credentials mounted at `$CLUSTER_PROFILE_DIR` for cloud accounts, image registries, etc.
- the test will implicitly ask for a lease and expose it with `$LEASED_RESOURCE`
- all steps in the test will implicitly declare dependencies on imported OpenShift release images
- all steps will have a number of environment variables set, such as `$CLUSTER_TYPE`, `$IMAGE_FORMAT`, and `$KUBECONFIG`
Generally, the major difference between `cluster_profile`s is the content of the credentials.
Prepare the cloud account
In order for most workflows to operate with the cluster profile, the cloud account must be prepared, including creating a new IAM user as described in the OCP documentation (AWS, GCP).
In addition to the permissions specified by the OCP documentation, include the following, which are required for running tests in the environment:
AWS Policies:
- CloudFormationFullAccess
- AmazonEC2ContainerRegistryFullAccess

GCP Roles:
- Deployment Manager Editor
- Compute Image User
- Role Administrator
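For AWS, the additional policies can be attached to the installer's IAM user with the CLI, for example (a sketch: the user name is a placeholder, and the managed policy ARNs should be verified, since they can differ slightly from the display names above):

```bash
# attach the extra managed policies to the CI IAM user (names illustrative)
for policy in \
    arn:aws:iam::aws:policy/AWSCloudFormationFullAccess \
    arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryFullAccess; do
  aws iam attach-user-policy --user-name openshift-ci --policy-arn "$policy"
done
```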
The default AWS quotas need to be increased to ensure the AWS account is capable of creating and running multiple clusters at the same time. The Configuring an AWS account section of the OpenShift documentation includes general instructions on configuring the quotas. We make the following recommendations:
- Simple Storage Service (S3): Bucket Limit: 1000
- Amazon Elastic Compute Cloud (Amazon EC2): EC2-VPC Elastic IPs: 500
- Amazon Elastic Compute Cloud (Amazon EC2): Running On-Demand Standard (A, C, D, H, I, M, R, T, Z) instances: 3700
- Amazon Virtual Private Cloud (Amazon VPC): VPCs per Region: 250
- Amazon Virtual Private Cloud (Amazon VPC): NAT gateways per Availability Zone: 200
- Amazon Virtual Private Cloud (Amazon VPC): Internet gateways per Region: 200
- Elastic Load Balancing (ELB): Classic Load Balancers per Region: 250
For GCP, we need to increase the following quotas:
- Cloud Filestore API: Basic HDD (Standard) capacity (GB) per region: 20 TB
Adding a New Cluster Profile
When adding a new `cluster_profile`, three major steps must be taken: registering the profile inside of `ci-operator`, adding the new leases to Boskos, and providing the credentials.
Registering a New Profile
As `cluster_profile`s are handled as first-class items in the `ci-operator` configuration, a new pull request (example) must be sent to the `openshift/ci-tools` repository in order to register a new profile. The next sections detail the requirements for opening this pull request. All changes required in `openshift/ci-tools` are isolated to a single file, `pkg/api/types.go`.
The process of creating a new cluster profile involves adding the following (a compressed sketch follows the list):
- `ClusterProfile`: a new constant for the name of the profile.
- `ClusterProfiles()`: a new item in the list of valid test profiles.
- `ClusterProfile::ClusterType()`: a mapping from the profile to its cluster type.
- `ClusterProfile::LeaseType()`: a mapping from the profile to its lease type.
- `LeaseTypeFromClusterType()`: a mapping from cluster type to lease type, if a new type is being added (this is only used for legacy template tests).
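The sketch below illustrates these additions for a hypothetical `newcloud` profile; the real `pkg/api/types.go` defines many more profiles and cases, and the exact signatures may differ:

```go
package api

// Illustrative sketch only: the additions for a hypothetical "newcloud"
// profile, compressed into one snippet.

// ClusterProfile is the ci-operator type that names a profile.
type ClusterProfile string

// ClusterProfileNewCloud is the new constant for the profile name.
const ClusterProfileNewCloud ClusterProfile = "newcloud"

// ClusterProfiles returns the list of valid test profiles; the new
// constant is appended to the existing entries.
func ClusterProfiles() []ClusterProfile {
	return []ClusterProfile{
		// ...existing profiles...
		ClusterProfileNewCloud,
	}
}

// ClusterType maps the profile to its cluster type.
func (p ClusterProfile) ClusterType() string {
	switch p {
	// ...existing cases...
	case ClusterProfileNewCloud:
		return "newcloud"
	default:
		return ""
	}
}

// LeaseType maps the profile to its implicit lease type.
func (p ClusterProfile) LeaseType() string {
	switch p {
	// ...existing cases...
	case ClusterProfileNewCloud:
		return "newcloud-quota-slice"
	default:
		return ""
	}
}
```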
Cluster type
This value is passed to tests via the `CLUSTER_TYPE` environment variable, as mentioned in the introduction. It is used for cloud-provider-specific behavior by step registry components such as the OpenShift installer steps (e.g.).
For profiles created for completely new platforms, this should be a unique value and will probably require corresponding changes to the installation steps. Profiles which are derivatives of existing ones should likely retain the cluster type unless they require special treatment in the installation process.
Adding New Leases
In the pull request to `openshift/ci-tools`, the mapping between a `cluster_profile` and the implicit lease that will be requested is determined. The standard is to use leases named `<name>-quota-slice`, so the `aws` profile uses `aws-quota-slice`. The resources for leasing must be registered with our leasing server (example).
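For illustration, the lease registration for a hypothetical `newcloud` profile might look roughly like this in the Boskos configuration (the counts are placeholders chosen per platform capacity):

```yaml
resources:
  - type: "newcloud-quota-slice"
    state: free
    min-count: 10
    max-count: 10
```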
Providing Credentials
The credentials provided to tests that declare a `cluster_profile` are a mix of content owned by the test platform and content owned by the users adding a new `cluster_profile`. The secret used to hold this content is `cluster-secrets-<name>`, so the `aws` profile uses `cluster-secrets-aws`.
When adding a new profile, a pull request must change the `ci-secret-bootstrap` configuration to seed this credential with content owned by the platform, like central pull secrets for image registries (example). In addition, any user-provided secrets must be added to the clusters using the self-service portal, with the following keys in Vault (the destination namespace/name needs to match the item added to the `ci-secret-bootstrap` config):
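For example, assuming the conventional secret name, the Vault item would carry synchronization keys along these lines (a sketch based on the secret-sync convention; confirm the exact keys and target namespace against the self-service portal documentation):

```
secretsync/target-namespace: "ci"
secretsync/target-name: "cluster-secrets-<name>"
```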
Credentials provided for the test containers are mounted using a simple Secret mount.
- The secret name must match the item in the `ci-secret-bootstrap` configuration.
- If the convention for the secret to be named `cluster-secrets-<name>` is followed, no other action is required.
- To use a custom secret name for the new cluster profile, e.g. when the new profile shares credentials with another profile, a new field must be added to the cluster profiles configuration to override the default naming:
```yaml
- profile: <name>
  secret: custom-cluster-profile-secret
```

In this case, two pull requests are required; the first pull request should modify the `ci-secret-bootstrap` configuration and must be merged before proceeding with the second pull request, which modifies the cluster profiles configuration.
Storing AWS credentials as secrets
Some workflows will provision resources before installing the cluster (e.g., `ipi-aws`). These types of workflows require AWS credentials exposed to the workflow as a Vault secret. It's important to note that the secret must contain a key-value pair where the key name is `.awscred` and the value is the contents of an AWS credentials file.
Replace the placeholder values as appropriate:
```ini
[default]
aws_access_key_id=<placeholder-id>
aws_secret_access_key=<placeholder-secret>
```
Storing SSH key pairs
In addition to credentials, some workflows (e.g., `ipi-aws`) require SSH keys, which allow you to access CI clusters. This can be important for debugging issues. We recommend generating a new key pair specifically for CI usage.

SSH key pairs are also stored in Vault, just like provider credentials. They should be stored within the same secret as the provider credentials, but as separate key-value pairs. You'll have two new key-value pairs, where the keys are `ssh-publickey` and `ssh-privatekey` and they store the file contents of the SSH public and private key, respectively.
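For example, a dedicated, passphrase-less key pair can be generated locally and its two files uploaded under the keys above (the file name is arbitrary):

```bash
# generate an Ed25519 key pair dedicated to CI usage
ssh-keygen -t ed25519 -N "" -C "openshift-ci" -f ./ci-key
# ./ci-key      -> store as the ssh-privatekey value
# ./ci-key.pub  -> store as the ssh-publickey value
```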
Supplying an AWS IP Pool
It is possible to take advantage of AWS’s BYOIP feature to allow your test’s ephemeral clusters to utilize configured IP Pools for cost savings. To utilize this, a few steps must be taken:
- Set up an IP Pool in the utilized region(s). See the AWS docs. Note that this feature can only handle one IP Pool per region.
- Add Boskos configuration for every region you intend to utilize (see the example, and the sketch after this list), noting the `name` should be `{cluster-profile-name}-ip-pools-{region}`.
- Add your lease name (without the region suffix) to the `ci-tools` configuration for your cluster profile.
- (optional) If you don't want the standard OpenShift branch validation to determine which tests can utilize the pools, modify this function to return `false` for your cluster profile.
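A Boskos entry following the naming rule above might look like this for a hypothetical `newcloud` profile in `us-east-1` (a sketch; the counts are placeholders):

```yaml
resources:
  - type: "newcloud-ip-pools-us-east-1"
    state: free
    min-count: 1
    max-count: 1
```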
Private Cluster Profiles
To restrict the usage of your cluster profile to specific organizations and repositories, you can create a pull request in the `openshift/release` repository. Within the pull request, add your repository or organization to the cluster profiles configuration file. For detailed instructions, please refer to the README file.
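As a sketch (assuming the schema described in the README; field names should be verified there), an entry restricting a profile to one organization and repository might look like:

```yaml
- profile: <name>
  owners:
    - org: <org>
      repos:
        - <repo>
```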
VPN connection
For platforms that need access to restricted environments, `ci-operator` supports adding a dedicated VPN connection to each test step. Since this is a requirement of specific platforms, it is triggered by special files in the cluster profile(s) associated with those platforms. This process is transparent to the test command: when a VPN connection is requested at the test level, it is set up automatically by the test platform with no changes required to individual tests.
Note
Details of the implementation can be found in the design document.
Cluster profile
VPN connections are requested by the presence of a special file named `vpn.yaml` in the cluster profile, detected when the test steps are about to be executed. This file should have the following fields (an illustrative example follows the list):
- `image`: pull spec of the image to be used for the VPN client container.
- `commands`: the name of another file in the cluster profile (e.g. `openvpn.sh`) which contains the VPN client's entry point script. This script is effectively executed as `bash -c "$(<f)"`, where `f` is the value associated with the `commands` key.
- `wait_timeout`: the maximum amount of time the step script should wait before starting (detailed in the next section). This ensures the steps are not blocked until the test timeout (several hours) expires if any problems occur with the VPN client.
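For instance, with placeholder values (the exact duration format for `wait_timeout` should be verified against the design document):

```yaml
image: <pull spec of the VPN client image>
commands: openvpn.sh
wait_timeout: 5m
```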
Image
The image used for the VPN client should contain the packages necessary to establish the connection, as well as the `bash` shell to execute the container's entry point script.

The pull spec placed in the cluster profile can point to images stored anywhere, but the simplest setup is to build and store them in the central CI cluster. Builds are configured in `openshift/release` in the supplemental images directory (see for example the OpenVPN image build).
Once the `BuildConfig` is merged into `master` and the image is built and tagged, the cluster profile can reference the public pull spec. For the OpenVPN image stream from the example above, that would be:
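The exact pull spec is not reproduced here; for an image stream named `openvpn` in the `ci` namespace it would follow the central registry pattern, roughly (stream name and tag are hypothetical):

```
registry.ci.openshift.org/ci/openvpn:latest
```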
Client container
The container should execute the VPN client to establish the connection. It will reside in the same network namespace as the test container, so no special provisions are necessary to make the networking changes usable by the test program. When executed, the entry point program will be located in the directory where the cluster profile files are mounted, so all secrets will be available and can be referenced with a relative path.
In addition to executing the client, the container also has two synchronization points:
- When the connection has been established and the test script should start, a file named `/tmp/vpn/up` should be created. For OpenVPN, for example, the following options can be used:
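The options from the original example are not reproduced here; standard OpenVPN directives along these lines achieve the described behavior (a sketch):

```
# allow the client to run external commands
script-security 2
# create the marker file once the connection is up and routes are installed
route-up "/bin/sh -c 'mkdir -p /tmp/vpn && touch /tmp/vpn/up'"
```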
- The script should watch for the creation of a file indicating that the main test has finished and then exit so that the test can terminate properly. This marker file is created automatically by the CI infrastructure at `/logs/marker-file.txt`. The client can perform these actions with a script such as:
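A minimal entry point sketch, assuming an OpenVPN client and the file locations described above (`config.ovpn` is a hypothetical configuration file stored alongside the script in the cluster profile):

```bash
#!/bin/bash
# Sketch: establish the VPN connection, signal readiness via /tmp/vpn/up,
# and exit once the CI infrastructure creates the marker file.
set -euo pipefail

mkdir -p /tmp/vpn

# paths are relative to the cluster profile mount, where this script runs
openvpn --config config.ovpn \
        --script-security 2 \
        --route-up "/bin/sh -c 'touch /tmp/vpn/up'" &
vpn=$!

# wait for the marker file created when the main test finishes
until [[ -e /logs/marker-file.txt ]]; do
    sleep 5
done

kill "$vpn"
wait "$vpn" || true
```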