Merge pull request #735 from kubernetes/foxish-patch-1
Update buildcop instructions
commit
c9912ffb89
|
|
@@ -1,117 +1,49 @@
|
|||
## Kubernetes "Github and Build-cop" Rotation
|
||||
# Kubernetes BuildCop Workflow
|
||||
|
||||
### Prerequisites
|
||||
June 2017
|
||||
|
||||
* Ensure you have [write access to http://github.com/kubernetes/kubernetes](https://github.com/orgs/kubernetes/teams/kubernetes-maintainers)
|
||||
* Test your admin access by e.g. adding a label to an issue.
|
||||
## Objective
|
||||
|
||||
### Traffic sources and responsibilities
|
||||
This document describes the responsibilities and the workflow of a person assuming the buildcop role.
|
||||
The current buildcop can be found [here](https://storage.googleapis.com/kubernetes-jenkins/oncall.html).
|
||||
|
||||
* GitHub Kubernetes [issues](https://github.com/kubernetes/kubernetes/issues):
|
||||
Your job is to be
|
||||
the first responder to all new issues. If you are not equipped to do
|
||||
this (which is fine!), it is your job to seek guidance!
|
||||
## Prerequisites for build-copping
|
||||
|
||||
* Support issues should be closed and redirected to Stack Overflow (see example
|
||||
response [here](on-call-user-support.md#user-support-response-example)).
|
||||
- Ensure you have admin access to [http://github.com/kubernetes/kubernetes](http://github.com/kubernetes/kubernetes)
|
||||
- Check your membership in the GitHub team: [kubernetes-build-cops](https://github.com/orgs/kubernetes/teams/kubernetes-build-cops/members).
|
||||
If you are not a member, contact one of the team maintainers to get yourself added to it.
|
||||
- Test your admin access by e.g. adding a label to an issue.
|
||||
- You must communicate any concerns/actions via the **#sig-release** Slack channel to ensure that
|
||||
the release team has context on the current state of the submit queue.
|
||||
- You must attend the release burndown meeting to provide an update on the current state of the submit-queue.
|
||||
|
||||
* All incoming issues should be tagged with a team label
|
||||
(team/{api,ux,control-plane,node,cluster,csi,redhat,mesosphere,gke,release-infra,test-infra,none});
|
||||
for issues that overlap teams, you can use multiple team labels
|
||||
## Responsibilities
|
||||
|
||||
* There is a related concept of "Github teams" which allow you to @ mention
|
||||
a set of people; feel free to @ mention a Github team if you wish, but this is
|
||||
not a substitute for adding a team/* label, which is required
|
||||
The build-cop's primary responsibility is to ensure that automatic merges are happening at a
|
||||
**reasonable** rate. This may include manually merging PRs that fix test flakes when the pre-submits
|
||||
are failing repeatedly. The buildcop must be familiar with the
|
||||
[queue labels](https://submit-queue.k8s.io/#/info) and apply them as necessary to critical fixes.
|
||||
The priority labels are defunct and no longer respected by the submit-queue. As of June 2017,
|
||||
the merge rate is ~30 PRs per day if there are that many PRs in the queue. The previous
|
||||
responsibilities of this role included classification of incoming issues, but that is no
|
||||
longer a part of the mandate.
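
As a rough illustration of what checking for a "reasonable" merge rate could look like, here is a minimal sketch. It assumes you already have a list of merge timestamps and the current queue length from somewhere; the function names and the way the target is capped by queue length are hypothetical and not part of the submit-queue itself.

```python
from datetime import datetime, timedelta

# Figure quoted in the text above (June 2017): roughly 30 merges per day.
TARGET_MERGES_PER_DAY = 30


def merges_per_day(merge_times, now, window=timedelta(days=1)):
    """Count merges inside the trailing window and scale the count to a daily rate."""
    recent = [t for t in merge_times if now - window <= t <= now]
    return len(recent) * (timedelta(days=1) / window)


def queue_is_healthy(merge_times, queue_length, now):
    """Hypothetical check: compare the observed rate with the expected one.

    The target only applies "if there are that many PRs in the queue",
    so the expectation is capped by the queue length (an assumption).
    """
    expected = min(TARGET_MERGES_PER_DAY, queue_length)
    return merges_per_day(merge_times, now) >= expected
```

With ten merges in the last day, a queue of fifty PRs would be flagged as slow, while a queue of eight PRs would not.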
|
||||
|
||||
* [Google teams](https://github.com/orgs/kubernetes/teams?utf8=%E2%9C%93&query=goog-)
|
||||
* [Redhat teams](https://github.com/orgs/kubernetes/teams?utf8=%E2%9C%93&query=rh-)
|
||||
* [SIGs](https://github.com/orgs/kubernetes/teams?utf8=%E2%9C%93&query=sig-)
|
||||
## Workflow
|
||||
|
||||
* If the issue is reporting broken builds, broken e2e tests, or other
|
||||
obvious P0 issues, label the issue with priority/P0 and assign it to someone.
|
||||
This is the only situation in which you should add a priority/* label
|
||||
* non-P0 issues do not need a reviewer assigned initially
|
||||
|
||||
* Assign any issues related to Vagrant to @derekwaynecarr (and @mention him
|
||||
in the issue)
|
||||
|
||||
* Keep in mind that you can @ mention people in an issue to bring it to
|
||||
their attention without assigning it to them. You can also @ mention GitHub
|
||||
teams, such as @kubernetes/goog-ux or @kubernetes/kubectl
|
||||
|
||||
* If you need help triaging an issue, consult with (or assign it to)
|
||||
@brendandburns, @thockin, @bgrant0607, @davidopp, @dchen1107,
|
||||
@lavalamp (all U.S. Pacific Time) or @fgrzadkowski (Central European Time).
|
||||
|
||||
* At the beginning of your shift, please add team/* labels to any issues that
|
||||
have fallen through the cracks and don't have one. Likewise, be fair to the next
|
||||
person in rotation: try to ensure that every issue that gets filed while you are
|
||||
on duty is handled. The Github query to find issues with no team/* label is:
|
||||
[here](https://github.com/kubernetes/kubernetes/issues?utf8=%E2%9C%93&q=is%3Aopen+is%3Aissue+-label%3Ateam%2Fcontrol-plane+-label%3Ateam%2Fmesosphere+-label%3Ateam%2Fredhat+-label%3Ateam%2Frelease-infra+-label%3Ateam%2Fnone+-label%3Ateam%2Fnode+-label%3Ateam%2Fcluster+-label%3Ateam%2Fux+-label%3Ateam%2Fapi+-label%3Ateam%2Ftest-infra+-label%3Ateam%2Fgke+-label%3A"team%2FCSI-API+Machinery+SIG"+-label%3Ateam%2Fhuawei+-label%3Ateam%2Fsig-aws).
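
The linked query simply excludes every known team label. A minimal sketch of how such a search URL could be assembled is shown below; the helper names are hypothetical, and the label list you pass in is whatever set of team/* labels is current.

```python
from urllib.parse import quote_plus


def unlabeled_issue_query(team_labels):
    """Build a GitHub issue search string that excludes each given label."""
    terms = ["is:open", "is:issue"]
    for label in team_labels:
        # Labels containing spaces must be quoted in GitHub search syntax.
        if " " in label:
            terms.append(f'-label:"{label}"')
        else:
            terms.append(f"-label:{label}")
    return " ".join(terms)


def search_url(query, repo="kubernetes/kubernetes"):
    """Return the issues search URL for the repository (repo is a default assumption)."""
    return f"https://github.com/{repo}/issues?q={quote_plus(query)}"
```

For example, `search_url(unlabeled_issue_query(["team/api", "team/ux"]))` yields a link listing open issues that carry neither label.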
|
||||
|
||||
### Build-copping
|
||||
|
||||
* The [merge-bot submit queue](http://submit-queue.k8s.io/)
|
||||
([source](https://github.com/kubernetes/contrib/tree/master/mungegithub/mungers/submit-queue.go))
|
||||
should auto-merge all eligible PRs for you once they've passed all the relevant
|
||||
checks mentioned below and all [critical e2e tests]
|
||||
(https://goto.google.com/k8s-test/view/Critical%20Builds/) are passing. If the
|
||||
merge-bot has been disabled for some reason, or tests are failing, you might need to
|
||||
do some manual merging to get things back on track.
|
||||
|
||||
* Once a day or so, look at the [flaky test builds]
|
||||
(https://goto.google.com/k8s-test/view/Flaky/); if they are timing out, clusters
|
||||
are failing to start, or tests are consistently failing (instead of just
|
||||
flaking), file an issue to get things back on track.
|
||||
|
||||
* Jobs that are not in [critical e2e tests](https://goto.google.com/k8s-test/view/Critical%20Builds/)
|
||||
or [flaky test builds](https://goto.google.com/k8s-test/view/Flaky/) are not
|
||||
your responsibility to monitor. The `Test owner:` in the job description will be
|
||||
automatically emailed if the job is failing.
|
||||
|
||||
* If you are oncall, ensure that PRs conforming to the following
|
||||
pre-requisites are being merged at a reasonable rate:
|
||||
|
||||
* [Have been LGTMd](https://github.com/kubernetes/kubernetes/labels/lgtm)
|
||||
* Pass Travis and Jenkins per-PR tests.
|
||||
* Author has signed CLA if applicable.
|
||||
|
||||
|
||||
* Although the shift schedule shows you as being scheduled Monday to Monday,
|
||||
working on the weekend is neither expected nor encouraged. Enjoy your time
|
||||
off.
|
||||
|
||||
* When the build is broken, roll back the PRs responsible ASAP
|
||||
|
||||
* If the build job itself fails, Jenkins will not try again automatically and everything will halt. You can trigger one at http://kubekins.mtv.corp.google.com/job/ci-kubernetes-build/#. Click `log in`, then click `Build Now` in the left margin.
|
||||
|
||||
* When E2E tests are unstable, a "merge freeze" may be instituted. During a
|
||||
merge freeze:
|
||||
|
||||
* Oncall should slowly merge LGTMd changes throughout the day while monitoring
|
||||
E2E to ensure stability.
|
||||
|
||||
* Ideally the E2E run should be green, but some tests are flaky and can fail
|
||||
randomly (not as a result of a particular change).
|
||||
* If a large number of tests fail, or tests that normally pass fail, that
|
||||
is an indication that one or more of the PR(s) in that build might be
|
||||
problematic (and should be reverted).
|
||||
* Use the Test Results Analyzer to see individual test history over time.
|
||||
|
||||
|
||||
* Flake mitigation
|
||||
|
||||
* Tests that flake (fail a small percentage of the time) need an issue filed
|
||||
against them. Please read [this](flaky-tests.md#filing-issues-for-flaky-tests);
|
||||
the build cop is expected to file issues for any flaky tests they encounter.
|
||||
|
||||
* It's reasonable to manually merge PRs that fix a flake or otherwise mitigate it.
|
||||
|
||||
### Contact information
|
||||
|
||||
[@k8s-oncall](https://github.com/k8s-oncall) will reach the current person on
|
||||
call.
|
||||
|
||||
<!-- BEGIN MUNGE: GENERATED_ANALYTICS -->
|
||||
[]()
|
||||
<!-- END MUNGE: GENERATED_ANALYTICS -->
|
||||
1. Check the Prow batch dashboard: [https://prow.k8s.io/?type=batch](https://prow.k8s.io/?type=batch)
|
||||
to ensure that merges are occurring regularly.
|
||||
2. If there are post-submit blocking jobs (see [link](https://submit-queue.k8s.io/#/e2e)), ensure
|
||||
that those builds are green and allowing merges to occur.
|
||||
3. If several batch merges are failing, file an issue for that job and describe the possible
|
||||
causes for the failure. Debug if possible, else triage and assign to a particular SIG, and
|
||||
@-mention the maintainers. For example, see:
|
||||
[#47135](https://github.com/kubernetes/kubernetes/issues/47135)
|
||||
4. Communicate the actions to **#sig-release** via Slack and ensure that the issue is being worked on.
|
||||
5. If the issue is not worked on for several hours, please escalate to the release team.
|
||||
The release team members can be found via the [features](https://github.com/kubernetes/features) repo.
|
||||
For example, the Kubernetes 1.7 release team members are listed [here](https://github.com/kubernetes/features/blob/master/release-1.7/release_team.md).
|
||||
Notify the release manager/release team members via GitHub mentions and Slack.
|
||||
6. When the SIG member sends a fix, manually merge if necessary, after verifying that pre-submits pass,
|
||||
or use the 'retest-not-required' label with the appropriate 'queue/*' label to ensure merge of the
|
||||
flake fix.
|
||||
7. Issue an update to the **#sig-release** channel on the merge rate and the PR that was used to fix the queue.
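
The decision points in the steps above can be sketched as a small function. This is only an illustration: the thresholds are assumptions (the text says "several" batch failures and "several hours" without giving numbers), and the function name and inputs are hypothetical rather than anything provided by Prow or the submit-queue.

```python
from datetime import timedelta

# Assumed thresholds; the workflow text does not define "several".
FAILED_BATCHES_BEFORE_ISSUE = 3
IDLE_BEFORE_ESCALATION = timedelta(hours=4)


def next_action(recent_batch_results, issue_filed, time_since_last_activity):
    """Suggest the build-cop's next step.

    recent_batch_results: list of booleans, oldest first, True meaning the batch passed.
    issue_filed: whether an issue for the failing job already exists.
    time_since_last_activity: timedelta since anyone last worked on that issue.
    """
    # Count consecutive failures at the end of the history.
    failures = 0
    for passed in reversed(recent_batch_results):
        if passed:
            break
        failures += 1

    if failures < FAILED_BATCHES_BEFORE_ISSUE:
        return "monitor"
    if not issue_filed:
        return "file issue and notify #sig-release"
    if time_since_last_activity >= IDLE_BEFORE_ESCALATION:
        return "escalate to release team"
    return "wait for fix"
```

For instance, three failed batches in a row with no issue filed suggests filing one; if the issue then sits untouched past the assumed idle limit, the suggestion becomes escalation.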
|
||||