168 lines
8.0 KiB
Markdown
168 lines
8.0 KiB
Markdown
# Federation Buildcop Guide and Playbook
|
|
|
|
Federation runs two classes of tests: CI and Presubmits.
|
|
|
|
## CI
|
|
|
|
* These tests run on the HEADs of master and release branches (starting
|
|
from Kubernetes v1.6).
|
|
* As a result, they run on code that's already merged.
|
|
* As the name suggests, they run continuously. Currently, they are
|
|
configured to run
|
|
[at least once every 30 minutes](https://github.com/kubernetes/test-infra/blob/22c38cfb64137086373e1b89d5e7d98766560747/prow/config.yaml#L3686).
|
|
* Federation CI tests run as
|
|
[periodic jobs on prow](https://github.com/kubernetes/test-infra/blob/22c38cfb64137086373e1b89d5e7d98766560747/prow/config.yaml#L3686).
|
|
* CI jobs always run sequentially. In other words, no single CI job
|
|
can have two instances of the job running at the same time.
|
|
|
|
### Configuration
|
|
|
|
Configuration steps are described in https://github.com/kubernetes/test-infra/blob/0c56d2c9d32307c0a0f8fece85ef6919389e77fd/jenkins/README.md#how-to-work-with-jenkins-jobs
|
|
|
|
The configuration of CI tests are stored in:
|
|
|
|
* Jenkins config: https://github.com/kubernetes/test-infra/blob/0c56d2c9d32307c0a0f8fece85ef6919389e77fd/jenkins/job-configs/kubernetes-jenkins/bootstrap-ci.yaml
|
|
* Test job/bootstrap config: https://github.com/kubernetes/test-infra/blob/0c56d2c9d32307c0a0f8fece85ef6919389e77fd/jobs/config.json
|
|
* Test grid config: https://github.com/kubernetes/test-infra/blob/0c56d2c9d32307c0a0f8fece85ef6919389e77fd/testgrid/config/config.yaml
|
|
* Job specific config: https://github.com/kubernetes/test-infra/tree/0c56d2c9d32307c0a0f8fece85ef6919389e77fd/jobs
|
|
|
|
### Results
|
|
|
|
Results of all the federation CI tests, including the soak tests, are
|
|
listed in the corresponding tabs on the Cluster Federation page in the
|
|
testgrid.
|
|
https://k8s-testgrid.appspot.com/sig-federation
|
|
|
|
### Playbook
|
|
|
|
#### Triggering a new run
|
|
|
|
Please ping someone who has access to the Jenkins UI/dashboard and ask
|
|
them to login and click the "Build Now" link on the Jenkins page
|
|
corresponding to the CI job you want to manually start.
|
|
|
|
#### Quota cleanup
|
|
|
|
Please ping someone who has access to the GCP project. Ask them to
|
|
look at the quotas and delete the leaked resources by clicking the
|
|
delete button corresponding to those leaked resources on Google Cloud
|
|
Console.
|
|
|
|
|
|
## Presubmit
|
|
|
|
* We only have one presubmit test, but it is configured very
|
|
differently than the CI tests.
|
|
* The presubmit test is currently configured to run on the master
|
|
branch and any release branch that's 1.7 or newer.
|
|
* Federation presubmit infrastructure is composed of two separate test
|
|
jobs:
|
|
* Deploy job: This job runs in the background and recycles federated
|
|
clusters every time it runs. Although this job supports federation
|
|
presubmit tests, it is configured as a CI/Soak job. More on
|
|
configuration later. Since recycling federated clusters is an
|
|
expensive operation, we do not want to run this often. Hence, this
|
|
job is configured to run once every 24 hours, around midnight
|
|
Pacific time.
|
|
* Test job: This is the job that runs federation presubmit tests on
|
|
every PR in the core repository, i.e.
|
|
[kubernetes/kubernetes](https://github.com/kubernetes/kubernetes).
|
|
These jobs can run in parallel on the PRs in the repository.
|
|
|
|
### Two-jobs setup
|
|
|
|
The deploy job runs once every 24 hours at around midnight Pacific
|
|
time. It is configured to turn up and tear down 3 federated clusters.
|
|
It starts out by downloading the latest Kubernetes release built from
|
|
[kubernetes/kubernetes](https://github.com/kubernetes/kubernetes)
|
|
master. It then tears down the existing federated clusters and turns
|
|
up new ones. As the clusters are created, their kubeconfigs are
|
|
written to a local kubeconfig file where the job runs. Once all the
|
|
clusters are successfully turned up, the local kubeconfig is then
|
|
copied to a pre-configured GCS bucket. Any existing kubeconfig in the
|
|
bucket will be overwritten.
|
|
|
|
The test job on the other hand starts by copying the latest kubeconfig
|
|
from the pre-configured GCS bucket. It uses this kubeconfig to deploy
|
|
a new federation control plane on one of the clusters in the
|
|
kubeconfig. It then joins all the clusters in the kubeconfig, including
|
|
the host cluster where federation control plane is deployed, as members
|
|
to the newly created federation control plane. The test job then runs
|
|
the federation presubmit tests on this control plane and tears down the
|
|
control plane in the end.
|
|
|
|
Since federated clusters are recycled only once every 24 hours, all
|
|
presubmit runs in that period share the federated clusters. And since
|
|
there could be multiple presubmit tests running in parallel, each
|
|
instance of the test gets its own namespace where it deploys the
|
|
federation control plane. These federation control planes deployed in
|
|
separate namespaces are independent of each other and do not interfere
|
|
with other federation control planes in any way.
|
|
|
|
### Configuration
|
|
|
|
The two jobs are configured differently.
|
|
|
|
#### Deploy job
|
|
|
|
The deploy job is configured as a CI/Soak job in Jenkins.
|
|
Configuration steps are described in https://github.com/kubernetes/test-infra/blob/0c56d2c9d32307c0a0f8fece85ef6919389e77fd/jenkins/README.md#how-to-work-with-jenkins-jobs
|
|
|
|
The configuration of the deploy job is stored in:
|
|
|
|
* Jenkins config: https://github.com/kubernetes/test-infra/blob/0c56d2c9d32307c0a0f8fece85ef6919389e77fd/jenkins/job-configs/kubernetes-jenkins/bootstrap-ci-soak.yaml#L76
|
|
* Test job/bootstrap config: https://github.com/kubernetes/test-infra/blob/0c56d2c9d32307c0a0f8fece85ef6919389e77fd/jobs/config.json#L3996
|
|
* Test grid config: https://github.com/kubernetes/test-infra/blob/0c56d2c9d32307c0a0f8fece85ef6919389e77fd/testgrid/config/config.yaml#L152
|
|
* Job specific config: https://github.com/kubernetes/test-infra/blob/0c56d2c9d32307c0a0f8fece85ef6919389e77fd/jobs/ci-kubernetes-pull-gce-federation-deploy.env
|
|
|
|
#### Test job
|
|
|
|
The test job is
|
|
[configured in prow](https://github.com/kubernetes/test-infra/blob/35ceb37e999bb0589218708262634951b79dfe05/prow/config.yaml#L236),
|
|
but it runs in Jenkins mode. The configuration steps are described in
|
|
https://github.com/kubernetes/test-infra/blob/0c56d2c9d32307c0a0f8fece85ef6919389e77fd/README.md#create-a-new-job
|
|
|
|
The configuration of the test job is stored in:
|
|
|
|
* Prow config: https://github.com/kubernetes/test-infra/blob/0c56d2c9d32307c0a0f8fece85ef6919389e77fd/prow/config.yaml#L244
|
|
* Test job/bootstrap config: https://github.com/kubernetes/test-infra/blob/0c56d2c9d32307c0a0f8fece85ef6919389e77fd/jobs/config.json#L4691
|
|
* Job specific config: https://github.com/kubernetes/test-infra/blob/0c56d2c9d32307c0a0f8fece85ef6919389e77fd/jobs/pull-kubernetes-federation-e2e-gce.env
|
|
|
|
### Results
|
|
|
|
Aggregated results are available on the Gubernator dashboard page for
|
|
the federation presubmit tests.
|
|
|
|
https://k8s-gubernator.appspot.com/builds/kubernetes-jenkins/pr-logs/directory/pull-kubernetes-federation-e2e-gce
|
|
|
|
### Metrics
|
|
|
|
We track the flakiness metrics of all the presubmit jobs and
|
|
individual tests that run against PRs in
|
|
[kubernetes/kubernetes](https://github.com/kubernetes/kubernetes).
|
|
|
|
* The metrics that we track are documented in https://github.com/kubernetes/test-infra/blob/0c56d2c9d32307c0a0f8fece85ef6919389e77fd/metrics/README.md#metrics.
|
|
* Job-level metrics are available in - [http://storage.googleapis.com/k8s-metrics/job-flakes-latest.json]().
|
|
* As of this writing, federation presubmits have a [success rate of
|
|
93.4%](http://storage.googleapis.com/k8s-metrics/job-flakes-latest.json).
|
|
|
|
### Playbook
|
|
|
|
#### Triggering a new deploy job run
|
|
|
|
Please ping someone who has access to the Jenkins UI/dashboard and ask
|
|
them to login and click the "Build Now" link on the Jenkins page
|
|
corresponding to the CI job you want to manually start.
|
|
|
|
#### Triggering a new test run
|
|
|
|
Use the `/test` command on the PR to retrigger the test. The exact
|
|
incantation is: `/test pull-kubernetes-federation-e2e-gce`
|
|
|
|
#### Quota cleanup
|
|
|
|
Please ping someone who has access to `k8s-jkns-pr-bldr-e2e-gce-fdrtn`
|
|
GCP project. Ask them to look at the quotas and delete the leaked
|
|
resources by clicking the delete button corresponding to those leaked
|
|
resources on Google Cloud Console.
|