[flake finders] fix episode 000 readme formatting
Fixes code block formatting in guide for flake finders episode 000.

Signed-off-by: hasheddan <georgedanielmangum@gmail.com>
parent
d3ae637cf3
commit
1016d17ce3
|
@@ -28,73 +28,74 @@ collaborate across teams to resolve test maintenance issues.
|
|||
[build-master-canary](https://testgrid.k8s.io/sig-release-master-informing#build-master-canary)
|
||||
|
||||
### Breaking PRs
|
||||
- [Use buildx in favor of `FROM --platform` syntax
|
||||
](https://github.com/kubernetes/kubernetes/pull/98529)
|
||||
- [Use buildx in favor of `FROM --platform`
|
||||
syntax](https://github.com/kubernetes/kubernetes/pull/98529)
|
||||
- [Switch to `docker buildx` for conformance
|
||||
image](https://github.com/kubernetes/kubernetes/pull/98569)
|
||||
|
||||
## Investigation
|
||||
|
||||
1. Desire to move from Google-owned infrastructure to Kubernetes community
|
||||
infrastructure. Thus the introduction of a **canary** build job to test
|
||||
building and pushing artifacts with new infrastructure.
|
||||
infrastructure. Thus the introduction of a **canary** build job to test
|
||||
building and pushing artifacts with new infrastructure.
|
||||
1. Desire to move off of `bootstrap.py` job (currently being used for canary
|
||||
job) to `krel` tooling.
|
||||
job) to `krel` tooling.
|
||||
1. Separate job existed (`ci-kubernetes-build-no-bootstrap`) that was doing the
|
||||
same thing as the canary job, but with `krel` tooling.
|
||||
same thing as the canary job, but with `krel` tooling.
|
||||
1. The `no-bootstrap` job was running smoothly, so [updated to use it for the
|
||||
canary job](https://github.com/kubernetes/test-infra/pull/20663).
|
||||
canary job](https://github.com/kubernetes/test-infra/pull/20663).
|
||||
1. Right before the update, we [switched to using buildx for multi-arch
|
||||
images](https://github.com/kubernetes/kubernetes/pull/98529).
|
||||
images](https://github.com/kubernetes/kubernetes/pull/98529).
|
||||
1. Job started failing, which showed up in [some interesting
|
||||
ways](https://kubernetes.slack.com/archives/C09QZ4DQB/p1612269558032700).
|
||||
ways](https://kubernetes.slack.com/archives/C09QZ4DQB/p1612269558032700).
|
||||
1. Triage begins! Issue
|
||||
[opened](https://github.com/kubernetes/kubernetes/issues/98646) and release
|
||||
management team is pinged in Slack.
|
||||
[opened](https://github.com/kubernetes/kubernetes/issues/98646) and release
|
||||
management team is pinged in Slack.
|
||||
1. The `build-master`
|
||||
[job](https://testgrid.k8s.io/sig-release-master-blocking#build-master) was
|
||||
still passing though... interesting.
|
||||
[job](https://testgrid.k8s.io/sig-release-master-blocking#build-master) was
|
||||
still passing though... interesting.
|
||||
1. Both eventually call `make release`, so the environments must be different.
|
||||
1. Let's look inside!
|
||||
|
||||
``` docker run -it --entrypoint /bin/bash
|
||||
gcr.io/k8s-testimages/bootstrap:v20210130-12516b2 ```
|
||||
```
|
||||
docker run -it --entrypoint /bin/bash gcr.io/k8s-testimages/bootstrap:v20210130-12516b2
|
||||
```
|
||||
|
||||
``` docker run -it
|
||||
gcr.io/k8s-staging-releng/k8s-ci-builder:v20201128-v0.6.0-6-g6313f696-default
|
||||
/bin/bash ```
|
||||
```
|
||||
docker run -it gcr.io/k8s-staging-releng/k8s-ci-builder:v20201128-v0.6.0-6-g6313f696-default /bin/bash
|
||||
```
|
||||
|
||||
1. A few directions we could go here:
|
||||
1. Update the `k8s-ci-builder` image to use a newer version of Docker
|
||||
1. Update the `k8s-ci-builder` image to ensure that
|
||||
`DOCKER_CLI_EXPERIMENTAL=enabled` is set
|
||||
`DOCKER_CLI_EXPERIMENTAL=enabled` is set
|
||||
1. Update the `release.sh` script to set `DOCKER_CLI_EXPERIMENTAL=enabled`
|
||||
|
||||
1. Making the `release.sh` script more flexible serves the community better
|
||||
because it allows for building with more environments. Would also be good to
|
||||
update the `k8s-ci-builder` image for this specific case as well.
|
||||
because it allows for building with more environments. Would also be good to
|
||||
update the `k8s-ci-builder` image for this specific case as well.
|
||||
1. And we get a new
|
||||
[failure](https://storage.googleapis.com/kubernetes-jenkins/logs/ci-kubernetes-build-canary/1356704759045689344/build-log.txt)!
|
||||
[failure](https://storage.googleapis.com/kubernetes-jenkins/logs/ci-kubernetes-build-canary/1356704759045689344/build-log.txt)!
|
||||
1. Let's see what is going on in those images again...
|
||||
1. Why would this cause an error in one but not the other if we have
|
||||
`DOCKER_CLI_EXPERIMENTAL=enabled`?
|
||||
([this](https://github.com/docker/buildx/pull/403) is why)
|
||||
`DOCKER_CLI_EXPERIMENTAL=enabled`?
|
||||
([this](https://github.com/docker/buildx/pull/403) is why)
|
||||
1. In the meantime we went ahead and [re-enabled the bootstrap
|
||||
job](https://github.com/kubernetes/test-infra/pull/20712) (consumers of those
|
||||
images need them!)
|
||||
job](https://github.com/kubernetes/test-infra/pull/20712) (consumers of those
|
||||
images need them!)
|
||||
1. Decided to [increase logging
|
||||
verbosity](https://github.com/kubernetes/kubernetes/pull/98568) on failures to
|
||||
see if that would give us a clue into what was going wrong (and to remove those
|
||||
annoying `quiet currently not implemented` warnings).
|
||||
verbosity](https://github.com/kubernetes/kubernetes/pull/98568) on failures
|
||||
to see if that would give us a clue into what was going wrong (and to remove
|
||||
those annoying `quiet currently not implemented` warnings).
|
||||
1. Job turns green! But how?
|
||||
1. [Buildx](https://github.com/docker/buildx) is versioned separately from
|
||||
Docker itself. Turns out that the `--quiet` flag warning was [actually an
|
||||
error](https://github.com/docker/buildx/pull/403) until `v0.5.1` of Buildx.
|
||||
Docker itself. Turns out that the `--quiet` flag warning was [actually an
|
||||
error](https://github.com/docker/buildx/pull/403) until `v0.5.1` of Buildx.
|
||||
1. The `build-master` job was running with buildx `v0.5.1` while the `krel` job
|
||||
was running with `v0.4.2`. This meant the quiet flag was causing an error in the
|
||||
`krel` job, and removing it alleviated the error.
|
||||
was running with `v0.4.2`. This meant the quiet flag was causing an error in
|
||||
the `krel` job, and removing it alleviated the error.
|
||||
1. Finished up by once again [removing the `bootstrap`
|
||||
job](https://github.com/kubernetes/test-infra/pull/20731).
|
||||
job](https://github.com/kubernetes/test-infra/pull/20731).
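
The version mismatch found above can be checked up front. The following is a minimal sketch, not part of `release.sh` or `krel`: the helper names `version_ge` and `buildx_tolerates_quiet` are hypothetical, and it assumes GNU `sort -V` is available in the build image. The only fact taken from the investigation is the `v0.5.1` threshold, below which the `--quiet` flag was an error.

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeeds if version $1 is greater than or equal to $2.
# Relies on `sort -V` (GNU coreutils version sort), assumed to be installed.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

# Hypothetical helper: succeeds if the given buildx version (with or without
# a leading "v") is at least 0.5.1, where --quiet stopped being an error.
buildx_tolerates_quiet() {
  version_ge "${1#v}" "0.5.1"
}

# Example against a real Docker install. This assumes `docker buildx version`
# prints the version as its second whitespace-separated field:
#   export DOCKER_CLI_EXPERIMENTAL=enabled
#   buildx_tolerates_quiet "$(docker buildx version | awk '{print $2}')" \
#     || echo "buildx is older than v0.5.1; avoid --quiet"
```

With the versions from this episode, `buildx_tolerates_quiet v0.5.1` succeeds (the `build-master` environment) and `buildx_tolerates_quiet v0.4.2` fails (the `krel` environment).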
|
||||
|
||||
### Fixes
|
||||
|
||||
|
@@ -133,8 +134,8 @@ Brand new to the project?
|
|||
Setup already and interested in maintaining tests?
|
||||
- Check out [this video](https://www.youtube.com/watch?v=Ewp8LNY_qTg) from
|
||||
Jordan Liggitt, who describes strategies and tactics to deflake flaking tests
|
||||
([Jordan's show notes for that
|
||||
talk](https://gist.github.com/liggitt/6a3a2217fa5f846b52519acfc0ffece0))
|
||||
([Jordan's show notes for that
|
||||
talk](https://gist.github.com/liggitt/6a3a2217fa5f846b52519acfc0ffece0))
|
||||
|
||||
Here's how the CI Signal Team actively monitors CI during a release cycle:
|
||||
- [A Tour of CI on the Kubernetes
|
||||
|
|