Job Primer

Job Primer is the ci-tool used to generate job names for BigQuery.

Overview

JobPrimer is the ci-tool that is used to populate the BigQuery Jobs table. The Jobs table dictates which periodic jobs are ingested during disruption data gathering.

JobPrimer runs periodically in our CI. The two sub commands described below run on a cron schedule.

  • The configuration for generate-job-names can be found here. It updates the generated_job_names.txt file, then opens and merges the PR via the OpenShift bot. (example PR)

  • The configuration for prime-job-table, which updates the table in BigQuery, can be found in the DPCR Job Aggregation Configs (private repo):

    https://github.com/openshift/continuous-release-jobs/tree/master/config/clusters/dpcr/services/dpcr-ci-job-aggregation/job-table-updater-cronjob.yaml
    

High Level Diagram

(Diagram: Job Primer fetches the list of jobs from the openshift/release repo on GitHub, generates the job-names text file, parses the job names for network, topology, and type, and updates the Jobs table in the openshift-ci-data-analysis dataset in BigQuery on GCP.)

How The Data Flows

  1. We first look at the openshift/release repo to gather a list of the current release jobs. The command below is run to scan the current configuration and generate the job names.

    ./job-run-aggregator generate-job-names > pkg/jobrunaggregator/jobtableprimer/generated_job_names.txt
    
  2. That generated_job_names.txt is then committed to the repo.

    You must then rebuild the binary so the newly generated list is correctly embedded.

  3. We then create the jobs in the BigQuery table by running the prime-job-table command. This uses the embedded generated_job_names.txt data and generates the Jobs rows based on the naming convention (see below). Afterwards, the Jobs table should be updated with the latest jobs.

    ./job-run-aggregator prime-job-table
    
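The rebuild in step 2 matters because the job list is compiled into the binary rather than read from disk at runtime. The sketch below illustrates the idea; a string literal stands in for the embedded file so it runs on its own, the real tool presumably wires the file in with a go:embed directive, and the job names shown are made up:

```go
package main

import (
	"fmt"
	"strings"
)

// In the real tool the file is compiled in at build time, roughly:
//
//	//go:embed generated_job_names.txt
//	var generatedJobNames string
//
// Here a literal stands in so the sketch is self-contained.
// These job names are illustrative, not real jobs.
var generatedJobNames = `periodic-ci-openshift-release-4.12-e2e-aws-sdn-serial
periodic-ci-openshift-release-4.12-e2e-gcp-ovn-upgrade
`

// jobNames splits the embedded file into one job name per line,
// skipping blanks. Because the data is baked in at compile time,
// updating the list requires rebuilding the binary.
func jobNames(raw string) []string {
	var names []string
	for _, line := range strings.Split(raw, "\n") {
		line = strings.TrimSpace(line)
		if line == "" {
			continue
		}
		names = append(names, line)
	}
	return names
}

func main() {
	for _, n := range jobNames(generatedJobNames) {
		fmt.Println(n)
	}
}
```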

Naming Convention

Please make sure your job names follow the convention defined below. All job names must include adequate information to allow proper data aggregation.

  • Platform:
    • aws, gcp, azure, etc…
  • Architecture: (default: amd64)
    • arm64, ppc64le, s390x
  • Upgrade: (default: assumes NOT upgrade)
    • upgrade
  • Network: (default: sdn && ipv4)
    • sdn, ovn
    • ipv6, ipv4
  • Topology: (default: assumes ha)
    • single
  • Serial: (default: assumes parallel)
    • serial
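To make the defaults concrete, here is a small runnable sketch (not the actual tool code) showing how the non-platform attributes could be inferred from a job name with the same kind of substring matching the primer uses; the struct, function, and example job name are all illustrative:

```go
package main

import (
	"fmt"
	"strings"
)

// jobAttributes is an illustrative struct, not the tool's real schema.
type jobAttributes struct {
	Upgrade  bool   // default: not an upgrade job
	Network  string // default: sdn
	IPMode   string // default: ipv4
	Topology string // default: ha
	Serial   bool   // default: parallel
}

// inferAttributes starts from the naming convention's defaults and
// overrides each one when the corresponding token appears in the name.
func inferAttributes(name string) jobAttributes {
	attrs := jobAttributes{Network: "sdn", IPMode: "ipv4", Topology: "ha"}
	if strings.Contains(name, "upgrade") {
		attrs.Upgrade = true
	}
	if strings.Contains(name, "ovn") {
		attrs.Network = "ovn"
	}
	if strings.Contains(name, "ipv6") {
		attrs.IPMode = "ipv6"
	}
	if strings.Contains(name, "single") {
		attrs.Topology = "single"
	}
	if strings.Contains(name, "serial") {
		attrs.Serial = true
	}
	return attrs
}

func main() {
	a := inferAttributes("periodic-ci-openshift-release-4.12-e2e-azure-ovn-upgrade-single-node")
	fmt.Printf("%+v\n", a)
	// → {Upgrade:true Network:ovn IPMode:ipv4 Topology:single Serial:false}
}
```

Every token is optional, which is why the defaults above must be unambiguous: a name that says nothing about network is assumed to be sdn/ipv4, and so on.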
The jobtableprimer code infers these attributes from the job name; an excerpt:
func newJob(name string) *jobRowBuilder {
	platform := ""
	switch {
	case strings.Contains(name, "gcp"):
		platform = gcp
	case strings.Contains(name, "aws"):
		platform = aws
	case strings.Contains(name, "azure"):
		platform = azure
	case strings.Contains(name, "metal"):
		platform = metal
	case strings.Contains(name, "vsphere"):
		platform = vsphere
	case strings.Contains(name, "ovirt"):
		platform = ovirt
	case strings.Contains(name, "openstack"):
		platform = openstack
	case strings.Contains(name, "libvirt"):
		platform = libvirt
	}

	architecture := ""
	switch {
	case strings.Contains(name, "arm64"):
		architecture = arm64
	case strings.Contains(name, "ppc64le"):
		architecture = ppc64le
	case strings.Contains(name, "s390x"):
		architecture = s390x
	default:
		architecture = amd64
	}

...