Code Implementation
Overview
Note!
In the examples below we use the Backend Disruption tests, but the same holds true for the alert durations.
To measure our ability to provide upgrades to OCP clusters with minimal downtime, the Disruption Testing framework monitors select backends and records disruptions in the backend service availability. This document serves as an overview of the framework used to provide disruption testing and how to configure new disruption tests when needed.
Matcher Code Implementation
Now that we have a better understanding of how the disruption test data is generated and updated, let’s discuss how the code makes use of it.
Best Matcher
The origin/pkg/synthetictests/allowedbackenddisruption/query_results.json file that we updated previously is embedded into the openshift-tests binary. At runtime, we ingest the raw data and create a historicaldata.NewMatcher() object which implements the BestMatcher interface.
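As a rough illustration of the load step, the sketch below parses a small JSON document into an in-memory lookup keyed by backend name and job type. The JSON field names, the JobType fields, newExampleMatcher and lookup are all hypothetical stand-ins: the real schema of query_results.json and the real historicaldata.NewMatcher() signature are not reproduced here, and an inline string takes the place of the embedded file.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rawJSON stands in for the embedded query_results.json file.
// The field names and values are hypothetical, not the real schema.
const rawJSON = `[
  {"BackendName": "kube-api-new-connections", "Platform": "aws", "Release": "4.12", "FromRelease": "4.11", "P95": 2.5},
  {"BackendName": "kube-api-new-connections", "Platform": "gcp", "Release": "4.12", "FromRelease": "4.11", "P95": 4.0}
]`

// JobType is a simplified, assumed description of a CI job variant.
type JobType struct {
	Platform    string
	Release     string
	FromRelease string
}

// row mirrors one record of the hypothetical JSON above.
type row struct {
	BackendName string
	Platform    string
	Release     string
	FromRelease string
	P95         float64
}

// dataKey identifies one historical entry: a backend plus a job type.
type dataKey struct {
	BackendName string
	JobType     JobType
}

// exampleMatcher holds the parsed historical data for exact lookups.
type exampleMatcher struct {
	data map[dataKey]float64
}

// newExampleMatcher parses the raw JSON and indexes it by backend and job type.
func newExampleMatcher(raw []byte) (*exampleMatcher, error) {
	var rows []row
	if err := json.Unmarshal(raw, &rows); err != nil {
		return nil, fmt.Errorf("parsing historical data: %w", err)
	}
	m := &exampleMatcher{data: make(map[dataKey]float64, len(rows))}
	for _, r := range rows {
		jt := JobType{Platform: r.Platform, Release: r.Release, FromRelease: r.FromRelease}
		m.data[dataKey{BackendName: r.BackendName, JobType: jt}] = r.P95
	}
	return m, nil
}

// lookup returns the stored value for an exact backend and job type match.
func (m *exampleMatcher) lookup(backend string, jt JobType) (float64, bool) {
	v, ok := m.data[dataKey{BackendName: backend, JobType: jt}]
	return v, ok
}

func main() {
	m, err := newExampleMatcher([]byte(rawJSON))
	if err != nil {
		panic(err)
	}
	jt := JobType{Platform: "aws", Release: "4.12", FromRelease: "4.11"}
	p95, ok := m.lookup("kube-api-new-connections", jt)
	fmt.Println(p95, ok)
}
```

Indexing by a comparable struct key keeps the exact-match check to a single map lookup; anything fuzzier is left to a separate step.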
Best Guesser
The core logic of the current best matcher first checks whether we have an exact match in the historical data. An exact match is one that contains the same Backend Name and JobType. When we don’t have an exact match, we make a best-guess effort by doing a fuzzy match for the data we don’t have. Fuzzy matching iterates through all the nextBestGuessers, stops at the first guess that fits our criteria, and checks whether it is contained in the data set.
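The lookup order described above can be sketched as follows. This is a minimal illustration, not the code in origin: the guesser signature, the key type, bestMatch and the asMicroUpgrade guesser are hypothetical, and the sketch assumes that a guess missing from the data set is skipped so the next guesser gets a chance.

```go
package main

import "fmt"

// JobType is a simplified, assumed description of a CI job variant.
type JobType struct {
	Release     string
	FromRelease string
	Platform    string
}

// key identifies one historical entry (hypothetical layout).
type key struct {
	Backend string
	JobType JobType
}

// guesser proposes an alternative JobType; ok is false when it does not apply.
type guesser func(in JobType) (out JobType, ok bool)

// bestMatch tries an exact match first, then each guesser in order.
// It reports the value found and whether the match was "exact" or "fuzzy".
func bestMatch(data map[key]float64, guessers []guesser, backend string, jt JobType) (float64, string, bool) {
	if v, ok := data[key{Backend: backend, JobType: jt}]; ok {
		return v, "exact", true
	}
	for _, g := range guessers {
		guess, ok := g(jt)
		if !ok {
			continue // this guesser does not apply to the job type
		}
		if v, found := data[key{Backend: backend, JobType: guess}]; found {
			return v, "fuzzy", true
		}
	}
	return 0, "", false
}

// asMicroUpgrade is a hypothetical guesser that treats a minor upgrade
// as an upgrade within the target release.
func asMicroUpgrade(in JobType) (JobType, bool) {
	if in.FromRelease == "" || in.FromRelease == in.Release {
		return in, false
	}
	out := in
	out.FromRelease = in.Release
	return out, true
}

func main() {
	data := map[key]float64{
		key{Backend: "image-registry", JobType: JobType{Release: "4.11", FromRelease: "4.11", Platform: "aws"}}: 3,
	}
	want := JobType{Release: "4.11", FromRelease: "4.10", Platform: "aws"}
	v, how, ok := bestMatch(data, []guesser{asMicroUpgrade}, "image-registry", want)
	fmt.Println(v, how, ok)
}
```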
Default Next Best Guessers
Next Best Guessers are functions that can be chained together and return either true or false depending on whether the current JobType matches the desired logic. In the code snippet below, we check if MicroReleaseUpgrade matches the current JobType; if it returns false, we continue down the list. The combine helper function gives you the option to chain and compose a more sophisticated check. In the example below, if we can do a PreviousReleaseUpgrade, the result of that is fed into MicroReleaseUpgrade, and if no function returns false during this chain, we have successfully fuzzy matched and can now check whether the historical data has information for this match.
nextBestGuessers
origin/pkg/synthetictests/historicaldata/next_best_guess.go
PreviousReleaseUpgrade
origin/pkg/synthetictests/historicaldata/next_best_guess.go
MicroReleaseUpgrade
origin/pkg/synthetictests/historicaldata/next_best_guess.go
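To make the chaining idea concrete, here is a minimal sketch of a combine-style helper. It is not the code from next_best_guess.go: the guesser signature is assumed, and previousReleaseUpgrade and microReleaseUpgrade are hypothetical stand-ins whose bodies are only a guess at the intent suggested by the real function names.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// JobType is a simplified, assumed description of an upgrade job.
type JobType struct {
	Release     string
	FromRelease string
}

// guesser proposes an alternative JobType; ok is false when it does not apply.
type guesser func(in JobType) (out JobType, ok bool)

// previousMinor turns "4.12" into "4.11"; ok is false if that is not possible.
func previousMinor(release string) (string, bool) {
	major, minor, found := strings.Cut(release, ".")
	if !found {
		return "", false
	}
	n, err := strconv.Atoi(minor)
	if err != nil || n == 0 {
		return "", false
	}
	return major + "." + strconv.Itoa(n-1), true
}

// previousReleaseUpgrade (hypothetical) shifts the job one minor release back.
func previousReleaseUpgrade(in JobType) (JobType, bool) {
	rel, ok := previousMinor(in.Release)
	if !ok {
		return in, false
	}
	out := JobType{Release: rel}
	if in.FromRelease != "" {
		from, ok := previousMinor(in.FromRelease)
		if !ok {
			return in, false
		}
		out.FromRelease = from
	}
	return out, true
}

// microReleaseUpgrade (hypothetical) treats a minor upgrade as an upgrade
// within the target release.
func microReleaseUpgrade(in JobType) (JobType, bool) {
	if in.FromRelease == "" || in.FromRelease == in.Release {
		return in, false
	}
	return JobType{Release: in.Release, FromRelease: in.Release}, true
}

// combine feeds the output of each guesser into the next one and fails
// as soon as any guesser in the chain returns false.
func combine(guessers ...guesser) guesser {
	return func(in JobType) (JobType, bool) {
		cur := in
		for _, g := range guessers {
			next, ok := g(cur)
			if !ok {
				return in, false
			}
			cur = next
		}
		return cur, true
	}
}

func main() {
	chained := combine(previousReleaseUpgrade, microReleaseUpgrade)
	guess, ok := chained(JobType{Release: "4.12", FromRelease: "4.11"})
	fmt.Printf("%+v %v\n", guess, ok)
}
```

Because the combined value is itself a guesser, it can sit in the same list as the simple guessers and be tried in the same loop.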
Adding new disruption tests
Currently disruption tests are focused on disruptions created during upgrades. To add a new backend to monitor during the upgrade test, add a new backendDisruptionTest via NewBackendDisruptionTest to the e2e upgrade AllTests.
NewBackendDisruptionTest
origin/test/extended/util/disruption/backend_sampler_tester.go
AllTests
origin/test/e2e/upgrade/upgrade.go
NewKubeAvailableWithNewConnectionsTest
origin/test/extended/util/disruption/controlplane/controlplane.go
If this is a completely new backend being tested, then query_results data will need to be added or, if preferable, NewBackendDisruptionTestWithFixedAllowedDisruption can be used instead of NewBackendDisruptionTest and the allowable disruption hardcoded.
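The trade-off between the two constructors can be illustrated with a small sketch in which a test takes its allowed disruption either from historical data or from a hardcoded value. Every name below (allowedDisruptionFunc, historicalAllowedDisruption, fixedAllowedDisruption, backendTest, check) is hypothetical and is not the API in backend_sampler_tester.go; the backend names and durations are made up.

```go
package main

import (
	"fmt"
	"time"
)

// allowedDisruptionFunc returns how much disruption a backend may show.
type allowedDisruptionFunc func(backend string) (time.Duration, error)

// historicalAllowedDisruption looks the limit up in previously recorded data
// and fails when the backend has no entry yet.
func historicalAllowedDisruption(data map[string]time.Duration) allowedDisruptionFunc {
	return func(backend string) (time.Duration, error) {
		allowed, ok := data[backend]
		if !ok {
			return 0, fmt.Errorf("no historical data for backend %q", backend)
		}
		return allowed, nil
	}
}

// fixedAllowedDisruption always returns the same hardcoded limit.
func fixedAllowedDisruption(allowed time.Duration) allowedDisruptionFunc {
	return func(string) (time.Duration, error) { return allowed, nil }
}

// backendTest pairs a backend with the source of its allowed disruption.
type backendTest struct {
	backend string
	allowed allowedDisruptionFunc
}

// check returns an error when the limit is unknown or was exceeded.
func (t backendTest) check(observed time.Duration) error {
	allowed, err := t.allowed(t.backend)
	if err != nil {
		return err
	}
	if observed > allowed {
		return fmt.Errorf("%s: observed %v exceeds allowed %v", t.backend, observed, allowed)
	}
	return nil
}

func main() {
	historical := map[string]time.Duration{"kube-api": 2 * time.Second}

	known := backendTest{backend: "kube-api", allowed: historicalAllowedDisruption(historical)}
	noData := backendTest{backend: "my-new-backend", allowed: historicalAllowedDisruption(historical)}
	fixed := backendTest{backend: "my-new-backend", allowed: fixedAllowedDisruption(5 * time.Second)}

	fmt.Println(known.check(time.Second))
	fmt.Println(noData.check(time.Second))
	fmt.Println(fixed.check(6 * time.Second))
}
```

In this sketch the historical variant fails outright for a backend with no recorded data, which is the situation the fixed variant is meant to cover.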
Updating test data
Allowable disruption values can be added or updated in query_results. Disruption data can be queried from BigQuery using p95Query.
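For readers unfamiliar with the term, the sketch below shows what a 95th percentile of past disruption totals looks like when used as an allowable value. It only illustrates the statistic suggested by the name p95Query: the nearest-rank method, the percentile function and the sample numbers are assumptions, and the sketch says nothing about how the actual BigQuery query computes its result.

```go
package main

import (
	"fmt"
	"sort"
)

// percentile returns the p-th percentile (1..100) of values using the
// nearest-rank method. ok is false for empty input or an out-of-range p.
func percentile(values []float64, p int) (result float64, ok bool) {
	if len(values) == 0 || p < 1 || p > 100 {
		return 0, false
	}
	sorted := append([]float64(nil), values...) // copy so the caller's slice is untouched
	sort.Float64s(sorted)
	rank := (p*len(sorted) + 99) / 100 // ceil(p/100 * n) in integer arithmetic
	return sorted[rank-1], true
}

func main() {
	// Hypothetical disruption totals, in seconds, from ten past job runs.
	seconds := []float64{0, 1, 0, 2, 1, 0, 3, 1, 0, 12}
	p95, ok := percentile(seconds, 95)
	fmt.Println(p95, ok)
}
```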
Disruption test framework overview
To check for disruptions while upgrading OCP clusters:
- The tests are defined by AllTests
- The disruption is defined by clusterUpgrade
- These are passed into disruption.Run, which creates a new Chaosmonkey and executes the disruption monitoring tests and the disruption
- The backendDisruptionTest is responsible for:
  - Creating the event broadcaster, recorder and monitor
  - Attempting to query the backend and timing out after the max interval (1 second typically)
  - Analyzing the disruption events for disruptions that exceed allowable values
- When the disruption is complete, the disruption tests are validated via Matches / BestMatcher to find periods that exceed allowable thresholds
  - Matches will look for an entry in query_results; if an exact match is not found, it will utilize BestMatcher to look for data with the closest variants match
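The sampling and validation steps in the list above can be sketched roughly as follows. This is a simplified model rather than the framework's code: probe, sample, flakyProbe, hangingProbe and exceedsAllowed are hypothetical names, each failed or timed-out probe is assumed to count as one full interval of disruption, and fake probes replace real requests so the example runs without a cluster.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// probe performs one availability check against a backend.
type probe func(ctx context.Context) error

// sample runs count probes, each bounded by timeout, and returns the total
// disruption, counting every failed or timed-out probe as one interval.
func sample(p probe, count int, interval, timeout time.Duration) time.Duration {
	var disrupted time.Duration
	for i := 0; i < count; i++ {
		ctx, cancel := context.WithTimeout(context.Background(), timeout)
		err := p(ctx)
		cancel()
		if err != nil {
			disrupted += interval
		}
	}
	return disrupted
}

// flakyProbe fails on the given zero-based attempt numbers.
func flakyProbe(failOn map[int]bool) probe {
	attempt := 0
	return func(ctx context.Context) error {
		n := attempt
		attempt++
		if failOn[n] {
			return errors.New("backend unavailable")
		}
		return nil
	}
}

// hangingProbe never answers; it returns only when the timeout fires.
func hangingProbe(ctx context.Context) error {
	<-ctx.Done()
	return ctx.Err()
}

// exceedsAllowed reports whether the observed disruption is above the limit.
func exceedsAllowed(observed, allowed time.Duration) bool {
	return observed > allowed
}

func main() {
	// Five samples one second apart, two of which fail.
	observed := sample(flakyProbe(map[int]bool{1: true, 2: true}), 5, time.Second, time.Second)
	fmt.Println(observed, exceedsAllowed(observed, time.Second))

	// A backend that never responds is cut off by the per-probe timeout.
	timedOut := sample(hangingProbe, 2, time.Second, 10*time.Millisecond)
	fmt.Println(timedOut)
}
```

The allowed value passed to exceedsAllowed is where the result of the Matches / BestMatcher lookup would plug in.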