In dev docker we've always used a single schema (`boulder_sa`), with two
environments (`test` and `integration`) making for a combined total of two
databases sharing the same users and schema (e.g. `boulder_sa_test` and
`boulder_sa_integration`). There are also two versions of this schema: `db` and
`db-next`. The former is the schema as it should exist in production and the
latter is everything from `db` with some un-deployed schema changes. This change
adds support for additional schemas with the same aforementioned environments
and versions.
- Add support for additional schemas in `test/create_db.sh` and `sa/migrations.sh`
- Add new schema `incidents_sa` with its own users
- Replace `bitbucket.org/liamstask/goose/` with `github.com/rubenv/sql-migrate`
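As a rough illustration of the naming scheme described above, the following sketch enumerates one database per schema and environment. It assumes the new schema follows the same `<schema>_<environment>` pattern as the `boulder_sa` examples; the function name is made up for this example and is not part of `test/create_db.sh`.

```python
from itertools import product

# Schema and environment names taken from the description above.
SCHEMAS = ["boulder_sa", "incidents_sa"]
ENVIRONMENTS = ["test", "integration"]


def database_names(schemas, environments):
    """Return one database name per (schema, environment) pair.

    Assumes the <schema>_<environment> pattern seen in boulder_sa_test.
    """
    return [f"{schema}_{env}" for schema, env in product(schemas, environments)]


for name in database_names(SCHEMAS, ENVIRONMENTS):
    print(name)
```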
Part of #6328
When rsyslog receives multiple identical log lines in a row, it can
collapse those lines into a single instance of the log line and a
follow-up line saying "message repeated X times". However, that
rsyslog-generated line does not contain our log line checksum, so it
immediately causes log-validator to complain about the line. In
addition, the rsyslog docs themselves state that this feature is a
misfeature and should never be turned on. Despite this, Ubuntu turns the
feature on by default when the rsyslog package is installed from apt.
Add an additional command to our dockerfile which overwrites Ubuntu's
default setting to disable this misfeature, and update our test
environment to use the new docker image.
Fixes #6252
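To illustrate why the rsyslog-generated line trips validation, here is a minimal sketch of a checksum check. The checksum format used here (an 8-digit hex CRC32 prefix) is a hypothetical stand-in, not Boulder's actual log line format, and the function names are invented for the example.

```python
import zlib


def checksum(message: str) -> str:
    """Hypothetical checksum: CRC32 of the message as 8 hex digits."""
    return format(zlib.crc32(message.encode("utf-8")), "08x")


def with_checksum(message: str) -> str:
    """Prefix a message with its checksum, as a logging library might."""
    return f"{checksum(message)} {message}"


def is_valid(line: str) -> bool:
    """Return True only if the line starts with the checksum of the rest."""
    prefix, sep, message = line.partition(" ")
    return bool(sep) and prefix == checksum(message)


lines = [
    with_checksum("certificate issued"),
    # Generated by rsyslog itself, so it never carries a checksum prefix.
    "message repeated 3 times",
]
for line in lines:
    print(is_valid(line), line)
```

Any validator of this shape rejects the collapsed line, because the line was never produced by the code that computes the checksum.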
Run the Boulder unit and integration tests with go1.19.
In addition, make a few small changes to allow both sets of
tests to run side-by-side. Mark a few tests, including our lints
and generate checks, as go1.18-only. Reformat a few doc
comments, particularly lists, to abide by go1.19's stricter gofmt.
Causes #6275
Add a new code path to the ctpolicy package which enforces Chrome's new
CT Policy, which requires that SCTs come from logs run by two different
operators, rather than one Google and one non-Google log. To achieve
this, invert the "race" logic: rather than assuming we always have two
groups, and racing the logs within each group against each other, we now
race the various groups against each other, and pick just one arbitrary
log from each group to attempt submission to.
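The inverted race can be sketched roughly as follows. This is an illustration in Python rather than the ctpolicy package's Go code; the `groups` shape, the `submit` callable, and the function name are all assumptions made for the example.

```python
import random
from concurrent.futures import ThreadPoolExecutor, as_completed


def get_scts(groups, submit, needed=2):
    """Race operator groups against each other.

    groups: dict mapping operator name -> non-empty list of log names
            (hypothetical shape).
    submit: callable taking a log name and returning an SCT, or raising.
    Returns `needed` (operator, sct) pairs, each from a different operator.
    """
    results = []
    with ThreadPoolExecutor(max_workers=max(1, len(groups))) as pool:
        # Pick one arbitrary log per group and submit to every group at once.
        futures = {
            pool.submit(submit, random.choice(logs)): operator
            for operator, logs in groups.items()
        }
        for future in as_completed(futures):
            if future.exception() is not None:
                continue  # this group's log failed; rely on the other groups
            results.append((futures[future], future.result()))
            if len(results) == needed:
                break
    if len(results) < needed:
        raise RuntimeError("not enough SCTs from distinct operators")
    return results
```

Because each group contributes at most one submission, any two results necessarily come from different operators. Note that this sketch waits for outstanding submissions when the pool shuts down instead of cancelling them.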
Ensure that the new code path does the right thing by adding a new zlint
which checks that the two SCTs embedded in a certificate come from logs
run by different operators. To support this lint, which needs to have a
canonical mapping from logs to their operators, import the Chrome CT Log
List JSON Schema and autogenerate Go structs from it so that we can
parse a real CT Log List. Also add flags to all services which run these
lints (the CA and cert-checker) to let them load a CT Log List from disk
and provide it to the lint.
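The core check performed by such a lint might look like the sketch below. It is not the zlint implementation; the mapping shape is an assumption, and treating an unknown log as a failure is a choice made for this example only.

```python
def scts_from_different_operators(sct_log_ids, operator_by_log_id):
    """Return True if the SCTs' logs are run by at least two distinct operators.

    sct_log_ids: log IDs taken from a certificate's embedded SCTs.
    operator_by_log_id: mapping from log ID to operator name, built from a
                        CT Log List (hypothetical shape).
    """
    operators = set()
    for log_id in sct_log_ids:
        operator = operator_by_log_id.get(log_id)
        if operator is None:
            # Assumption for this sketch: an unknown log cannot prove diversity.
            return False
        operators.add(operator)
    return len(operators) >= 2


mapping = {"log-1": "Operator A", "log-2": "Operator A", "log-3": "Operator B"}
print(scts_from_different_operators(["log-1", "log-3"], mapping))
print(scts_from_different_operators(["log-1", "log-2"], mapping))
```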
Finally, since we now have the ability to load a CT Log List file
anyway, use this capability to simplify configuration of the RA. Rather
than listing all of the details for each log we're willing to submit to,
simply list the names (technically, Descriptions) of each log, and look
up the rest of the details from the log list file.
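A minimal sketch of that lookup, assuming a log list with the `operators` / `logs` / `description` layout used by the Chrome log list format. The sample data, URLs, and function names are invented for illustration and do not reflect the RA's actual configuration code.

```python
import json

# Tiny made-up log list; real files contain many more fields per log.
LOG_LIST_JSON = """
{
  "operators": [
    {"name": "Operator A", "logs": [
      {"description": "A Log 2022", "log_id": "aWQtYQ==",
       "key": "a2V5LWE=", "url": "https://a.example/2022/"}
    ]},
    {"name": "Operator B", "logs": [
      {"description": "B Log", "log_id": "aWQtYg==",
       "key": "a2V5LWI=", "url": "https://b.example/"}
    ]}
  ]
}
"""


def index_by_description(log_list):
    """Map each log's description to its details plus its operator's name."""
    index = {}
    for operator in log_list["operators"]:
        for log in operator["logs"]:
            index[log["description"]] = {"operator": operator["name"], **log}
    return index


def resolve(names, index):
    """Look up the configured log names, failing loudly on unknown ones."""
    missing = [name for name in names if name not in index]
    if missing:
        raise KeyError(f"logs not found in log list: {missing}")
    return [index[name] for name in names]


index = index_by_description(json.loads(LOG_LIST_JSON))
for log in resolve(["A Log 2022", "B Log"], index):
    print(log["operator"], log["url"])
```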
To support this change, SRE will need to deploy log list files (the real
Chrome log list for prod, and a custom log list for staging) and then
update the configuration of the RA, CA, and cert-checker. Once that
transition is complete, the deletion TODOs left behind by this change
will be able to be completed, removing the old RA configuration and old
ctpolicy race logic.
Part of #5938
Update:
- golangci-lint from v1.42.1 to v1.46.2
- protoc from v3.15.6 to v3.20.1
- protoc-gen-go from v1.26.0 to v1.28.0
- protoc-gen-go-grpc from v1.1.0 to v1.2.0
- fpm from v1.14.0 to v1.14.2
Also remove a reference to go1.17.9 from one last place.
This does result in updating all of our generated .pb.go files, but only
to update the version number embedded in each file's header.
Fixes #6123
go1.17.9 (released 2022-04-12) includes security fixes to the crypto/elliptic and encoding/pem packages, as well as bug fixes to the linker and runtime. See the [Go 1.17.9 milestone](https://github.com/golang/go/issues?q=milestone%3AGo1.17.9+label%3ACherryPickApproved) on our issue tracker for details.
go1.18.1 (released 2022-04-12) includes security fixes to the crypto/elliptic, crypto/x509, and encoding/pem packages, as well as bug fixes to the compiler, linker, runtime, the go command, vet, and the bytes, crypto/x509, and go/types packages. See the [Go 1.18.1 milestone](https://github.com/golang/go/issues?q=milestone%3AGo1.18.1+label%3ACherryPickApproved) on our issue tracker for details.
- Remove GOPATH-style path structure, which isn't needed with Go
modules.
- Remove check for the existence of a docker buildx builder instance, since it
was unreliable.
This requires using GODEBUG to enable a couple of things turned off by go1.18 (TLS 1.0/1.1, SHA-1 CSRs).
Also add help for a failure mode of cross builds.
When looping over multiple Go versions, this script currently exits with an
error because it attempts to create a cross-compiling node even though one
already exists. Reusing the existing node also lets subsequent builds make use
of the Docker cache, reducing the build time by ~400 seconds.
- Only create the cross-compiling node if it doesn't exist
- No longer remove the cross-compiling node on exit
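The create-only-if-missing behaviour can be sketched as below. The script itself is a shell script, so this Python version is only an illustration; the builder name is hypothetical, and the sketch assumes `docker buildx inspect` exits non-zero when the named builder does not exist.

```python
import subprocess

BUILDER_NAME = "cross-builder"  # hypothetical builder name


def run(args):
    """Run a command and return its exit code, discarding its output."""
    return subprocess.run(
        args, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    ).returncode


def ensure_builder(name, run=run):
    """Create the buildx builder only if inspecting it fails.

    Returns True if a builder was created, False if one already existed.
    """
    if run(["docker", "buildx", "inspect", name]) == 0:
        return False  # already exists; reuse it and its build cache
    if run(["docker", "buildx", "create", "--name", name]) != 0:
        raise RuntimeError(f"could not create builder {name!r}")
    return True
```

Passing `run` as a parameter keeps the decision logic testable without a Docker daemon.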
Build a new docker container for the new Go 1.17.5 security release,
which includes a fix for the `net/http` package. Update our CI to run
tests on both our current and the new go versions.
Currently, if `docker buildx` fails, the cross-compilation node created before
the build starts is never deleted. This change ensures that the
cross-compilation node is always deleted before `tag_and_upload.sh` exits.
Update the version of golangci-lint we use in our docker image,
and update the version of the docker image we use in our tests.
Fix a couple places where we were violating lints (ineffective assign
and calling `t.Fatal` from outside the main test goroutine), and add
one lint (using math/rand) to the ignore list.
Fixes #5710
Add go1.17beta1 docker images to the set of things we build,
and integrate go1.17beta1 into the set of environments CI runs.
Fix one test which breaks due to an underlying refactoring in
the `crypto/x509` stdlib package. Fix one other test which breaks
due to new guarantees in the stdlib's TLS ALPN implementation.
Also removes go1.16.5 from CI so we're only running 2 versions.
Fixes #5480
protoc now generates grpc code in a separate file from protobuf code.
Also, grpc servers are now required to embed an "unimplemented"
interface from the generated .pb.go file, which provides forward
compatibility.
Update the generate.go files since the invocation for protoc has changed
with the split into .pb.go and _grpc.pb.go.
Fixes #5368
- Remove `goveralls`, `gover`, and `cover` from `build.sh`.
- Remove `--coverage` option from `test.sh`.
- Update Docker image in `docker-compose.yml` and
`.github/workflows/boulder-ci.yml`
Fixes #5357
- Remove `.travis.yml`
- Remove references to Travis in `test.sh`
- Update documentation in `test/boulder-tools/README.md`, `README.MD`,
and `CONTRIBUTING.MD`
- Update comments in `.github/workflows/boulder-ci.yml`
Fixes #5329
Remove mock-vendor, which ensured that mockgen was
available, because we no longer use mockgen. As a result,
remove mockgen from our docker build script. Finally, make
the mock package an indirect dependency since we are no
longer using it directly.
- Add 1.16.1 to the GitHub CI test matrix
- Fix tlsalpn tests for go 1.16.1 but maintain compatibility with 1.15.x
- Fix integration tests.
Fix: #5301
Fix: #5316
- Add GitHub actions workflow for Boulder CI tests in parity with Travis
CI except the coverage test.
- Change boulder-tools docker image to push to a static docker repo
instead of creating a new one each time. Use docker version tags and git
hash to identify go versions in the repo.
- Change docker-compose to pull from the static boulder-tools repo. This
breaks using the TRAVIS_GO_VERSION env variable to pull the docker image, but
the default will still work, with the intent of decommissioning Travis CI in
favor of GitHub CI.
Fix: #5289
Modified the Dockerfile to build using Debian Buster, an upgrade from
Debian Stretch. The default Python 3 version for Stretch is 3.5.x, which
is soon to be deprecated by python-cryptography, a dependency we rely on
for our integration test suite. The default Python 3 version for Debian
Buster is 3.7.x.
In the .travis.yml file we are instructing travis to provision Xenial
instances and install two versions of Go. This change bumps Xenial
(16.04) -> Focal (20.04) and removes the installation of the two Go
versions; all of our testing happens inside of a docker container so
having Go installed on the Docker parent isn't necessary.
In the docker-compose.yml file we configure which docker image to pull
from Dockerhub. I've updated these to reflect the Debian Buster images
already built and pushed.
Modified build.sh to install mariadb-client-core 10.3. There is no 10.1
install candidate for Debian Buster, and release notes for 10.2 and 10.3
indicate that these were both security releases.
Modified test.sh to use python3 instead of system python (usually 2.7)
for test/grafana/lints.py
Fixes #5180
Go version 1.15.5 is a security release which introduces fixes
both to the `math/big` package (whose `big.Int` type we use) and the go
compiler itself (which we use).
Release notes: https://golang.org/doc/go1.15
This change builds go1.15.5 versions of our docker containers,
adds tests on the new version to our travis config, and sets the
default to be the new version.
Fixes #5173
Go 1.15rc2 was released today. The diff from rc1 only includes one
change to the crypto/ package, but worth upgrading just to be ready
for the official 1.15 stable release.
This enables the gosec linter. It also disables a number of
warnings which it emits on the current codebase. Some of these
(e.g. G104: Errors unhandled) we expect to leave disabled
permanently; others (e.g. G601: Implicit memory aliasing in for loop)
we expect to fix and then enable to prevent regressions.
Part of #4948
This ended up taking a lot more work than I expected. In order to make the implementation more robust a bunch of stuff we previously relied on has been ripped out in order to reduce unnecessary complexity (I think I insisted on a bunch of this in the first place, so glad I can kill it now).
In particular this change:
* Removes bhsm and pkcs11-proxy: softhsm and pkcs11-proxy don't play well together, and any softhsm manipulation would need to happen on bhsm, then require a restart of pkcs11-proxy to pull in the on-disk changes. This makes manipulating softhsm from the boulder container extremely difficult, and because of the need to initialize new slots on each run (described below) we need direct access to the softhsm2 tools since pkcs11-tool cannot do slot initialization operations over the wire. I originally argued for bhsm as a way to mimic a network attached HSM, mainly so that we could do network level fault testing. In reality we've never actually done this, and the extra complexity is not really realistic for a handful of reasons. It seems better to just rip it out and operate directly on a local softhsm instance (the other option would be to use pkcs11-proxy locally, but this still would require manually restarting the proxy whenever softhsm2-util was used, and wouldn't really offer any realistic benefit).
* Initializes the softhsm slots on each integration test run, rather than when creating the docker image (this is necessary to prevent churn in test/cert-ceremonies/generate.go, which would need to be updated to reflect the new slot IDs each time a new boulder-tools image was created since slot IDs are randomly generated)
* Installs softhsm from source so that we can use a more up to date version (2.5.0 vs. 2.2.0 which is in the debian repo)
* Generates the root and intermediate private keys in softhsm and writes out the root and intermediate public keys to /tmp for use in integration tests (the existing test-{ca,root} certs are kept in test/ because they are used in a whole bunch of unit tests. At some point these should probably be renamed/moved to be more representative of what they are used for, but that is left for a follow-up in order to keep the churn in this PR as related to the ceremony work as possible)
Another follow-up item here is that we should really be zeroing out the database at the start of each integration test run, since certain things like certificates and ocsp responses will be signed by a key/issuer that is no longer in use/doesn't match the current key/issuer.
Fixes #4832.
There are some changes to the code generated in the latest version, so
this modifies every .pb.go file.
Also, the way protoc-gen-go decides where to put files has changed, so
each generate.go gets the --go_opt=paths=source_relative flag to
tell protoc to continue placing output next to the input.
Remove staticcheck from build.sh; we get it via golangci-lint now.
Pass --no-document to gem install fpm; this is recommended in the fpm docs.
We used a template and sed in #3622 because common versions of Docker
didn't support build args. But now they do, so we can use the convenient
build args feature to parameterize which Go version to use.
Also, remove the --no-cache flag to docker build, which slows things
down unnecessarily.
For now this mainly provides an example config and confirms that
log-validator can start up and shut down cleanly, as well as provide a
stat indicating how many log lines it has handled.
This introduces a syslog config to the boulder-tools image that will write
logs to /var/log/program.log. It also tweaks the various .json config
files so they have non-default syslogLevel, to ensure they actually
write something for log-validator to verify.
This makes it easier to configure additional linters, and provides us an
easy command to run locally.
The initial set of linters reflects those we are already running:
govet gofmt ineffassign errcheck misspell staticcheck
Note that misspell is in addition to the Python codespell package.
Since the invocation of these linters from golangci-lint is slightly
different from how we currently invoke them, there are some new
findings. This PR won't pass tests until #4763, #4764, and #4765 are
merged.
Incidentally, rename strat -> strategy to appease misspell.
This adds staticcheck to our "lints" CI, with a list of excluded checks. Some of these are checks that we don't care about much (like error string capitalization). Others are nice to fix (possible nil pointer dereferences in _test.go files), but we'd like to land the automated checking first to catch any new issues, then later winnow down the list.
This builds on #4726, #4725, and #4722, which addressed some of the categories of findings from staticcheck.
Instead of installing Certbot from the repo, install the python-acme
library (the only piece we need) from the apt repository. This also
allows us to skip installing build dependencies for Certbot.
Uninstall cmake after building.
Clean the various Go caches.
Move codespell and acme into requirements.txt. Don't use virtualenv anymore.
This reduces image size from 1.4 GB to 1.0 GB.
Incidentally, move the Go install to its own phase in the Dockerfile.
This will give it its own image layer, making rebuilds faster.
As part of the process, pin specific versions of protoc-gen-go, mockgen,
and goveralls. Protoc-gen-go recently released a version that was incompatible
with our current version of gRPC. Mockgen has a version that was generating
spurious diffs in our generate test phase, and goveralls recently added
some code that calls git branch --format=..., which breaks on the version of
git in our Docker image.
Pinning versions required forcing go get into module-aware mode, since the
old-style go get doesn't understand versions.
The `codespell` tool will be run during the "lints" phase of `test.sh`.
See `.codespell.ignore.txt` for ignored words. Note that these ignored
words should be listed one per line, in **lowercase** form.
The boulder-tools `build.sh` script is updated to include `codespell` in
the tools image. I built and pushed new images with this script that are
ref'd by `docker-compose.yml`.
Resolves #4635
Python 2 is over in 1 month 4 days: https://pythonclock.org/
This rolls forward most of the changes in #4313.
The original change was rolled back in #4323 because it
broke `docker-compose up`. This change fixes those original issues by
(a) making sure `requests` is installed and (b) sourcing a virtualenv
containing the `requests` module before running start.py.
Other notable changes in this:
- Certbot has changed the developer instructions to install specific packages
rather than rely on `letsencrypt-auto --os-packages-only`, so we follow suit.
- Python3 now has a `bytes` type that is used in some places that used to
provide `str`, and all `str` are now Unicode. That means going from `bytes` to
`str` and back requires explicit `.decode()` and `.encode()`.
- Moved from urllib2 to requests in many places.
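The `bytes`/`str` point can be illustrated with a short round trip. The payload below is invented for the example and is not taken from the integration tests.

```python
import base64
import json

payload = {"status": "valid"}

# json.dumps returns str; base64 functions operate on bytes, so encode first.
body = json.dumps(payload).encode("utf-8")
encoded = base64.urlsafe_b64encode(body)

# The result is bytes again; decode it before embedding it in text.
as_text = encoded.decode("ascii")

# Going back requires the reverse conversions at each step.
decoded = base64.urlsafe_b64decode(as_text.encode("ascii"))
round_tripped = json.loads(decoded.decode("utf-8"))

print(type(encoded).__name__, type(as_text).__name__)
```

In Python 2 the same code worked without the explicit conversions, because `str` there was a byte string.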
A unit test is included to verify that a TLS-ALPN-01 challenge to
a TLS 1.3 only server doesn't succeed when the `GODEBUG` value to
disable TLS 1.3 in `docker-compose.yml` is set. Without this env var
the test fails on the Go 1.13 build because of the new default:
```
=== RUN TestTLSALPN01TLS13
--- FAIL: TestTLSALPN01TLS13 (0.04s)
tlsalpn_test.go:531: expected problem validating TLS-ALPN-01 challenge against a TLS 1.3 only server, got nil
FAIL
FAIL github.com/letsencrypt/boulder/va 0.065s
```
With the env var set the test passes, getting the expected connection
problem reporting a tls error:
```
=== RUN TestTLSALPN01TLS13
2019/09/13 18:59:00 http: TLS handshake error from 127.0.0.1:51240: tls: client offered only unsupported versions: [303 302 301]
--- PASS: TestTLSALPN01TLS13 (0.03s)
PASS
ok github.com/letsencrypt/boulder/va 1.054s
```
Since we plan to eventually enable TLS 1.3 support and the `GODEBUG`
mechanism tested in the above test is platform-wide vs package
specific I decided it wasn't worth the time investment to write a
similar HTTP-01 unit test that verifies the TLS 1.3 behaviour on a
HTTP-01 HTTP->HTTPS redirect.
Resolves https://github.com/letsencrypt/boulder/issues/4415
This reverts commit 796a7aa2f4.
People's tests have been breaking on `docker-compose up` with the following output:
```
ImportError: No module named requests
```
Fixes #4322
* integration: move to Python3
- Add parentheses to all print and raise calls.
- Python3 distinguishes bytes from strings. Add encode() and
decode() calls as needed to provide the correct type.
- Use requests library consistently (urllib2 is not in Python3).
- Remove shebang from Python files without a main, and update
shebang for integration-test.py.
Precursor to #4116. Since some of our dependencies impose a minimum
version on these two packages higher than what we have in Godeps, we'll
have to bump them anyhow. Bumping them independently of the modules
update should keep things a little simpler.
In order to get protobuf tests to pass, I had to update protoc-gen-go in
boulder-tools. Now we download a prebuilt binary instead of using the
Ubuntu package, which is stuck on 3.0.0. This also meant I needed to
re-generate our pb.go files, since the new version generates somewhat
different output.
This happens to change the tag for pbutil, but it's not a substantive change - they just added a tagged version where there was none.
```
$ go test github.com/miekg/dns/...
ok github.com/miekg/dns 4.675s
ok github.com/miekg/dns/dnsutil 0.003s
ok github.com/golang/protobuf/descriptor (cached)
ok github.com/golang/protobuf/jsonpb (cached)
? github.com/golang/protobuf/jsonpb/jsonpb_test_proto [no test files]
ok github.com/golang/protobuf/proto (cached)
? github.com/golang/protobuf/proto/proto3_proto [no test files]
? github.com/golang/protobuf/proto/test_proto [no test files]
ok github.com/golang/protobuf/protoc-gen-go (cached)
? github.com/golang/protobuf/protoc-gen-go/descriptor [no test files]
ok github.com/golang/protobuf/protoc-gen-go/generator (cached)
ok github.com/golang/protobuf/protoc-gen-go/generator/internal/remap (cached)
? github.com/golang/protobuf/protoc-gen-go/grpc [no test files]
? github.com/golang/protobuf/protoc-gen-go/plugin [no test files]
ok github.com/golang/protobuf/ptypes (cached)
? github.com/golang/protobuf/ptypes/any [no test files]
? github.com/golang/protobuf/ptypes/duration [no test files]
? github.com/golang/protobuf/ptypes/empty [no test files]
? github.com/golang/protobuf/ptypes/struct [no test files]
? github.com/golang/protobuf/ptypes/timestamp [no test files]
? github.com/golang/protobuf/ptypes/wrappers [no test files]
```
The upstream Certbot project acme module supports initiating TLS-ALPN-01
challenges again and so we can remove the version pin we had in place.
This lets us keep the Certbot version we're testing with in-sync with
master at the time of building the tools image again.
When a Golang security release comes out, we don't want to wait on
the upstream Docker image to update and upload before we can start
running tests on it. This changes our boulder-tools image so it
can download and install Golang itself.
Also, fix some issues in build.sh:
Unparallelize the `go get` with the `apt install`. The `go get`
pulls in `go-sqlite3`, which needs `cmake`, installed by `apt install`.
Also, don't remove build-essential and cmake at the end.
1. Updates both boulder tools images to use an updated `pebble-challtestsrv`
2. Updates the Go 1.11.3 boulder tools image to Go 1.11.4
3. Updates the vendored `challtestsrv` dep to 1.0.2
This fixes a panic in the `challtestsrv` library and prepares us to move directly
to 1.11.4 after we've resolved the outstanding issues keeping us on the 1.10.x
stream in prod/staging.
There are no unit tests to run for item 3.
Now that Pebble has a `pebble-challtestsrv` we can remove the `challtestsrv`
package and associated command from Boulder. I switched CI to use
`pebble-challtestsrv`. Notably this means that we have to add our expected mock
data using the HTTP management interface. The Boulder-tools images are
regenerated to include the `pebble-challtestsrv` command.
Using this approach also allows separating the TLS-ALPN-01 and HTTPS HTTP-01
challenges by binding each challenge type in the `pebble-challtestsrv` to
different interfaces, both using the same VA HTTPS port. Mock DNS directs the
VA to the correct interface.
The load-generator command that was previously using the `challtestsrv` package
from Boulder is updated to use a vendored copy of the new
`github.com/letsencrypt/challtestsrv` package.
Vendored dependencies change in two ways:
1) Gomock is updated to the latest release (matching what the boulder-tools image
provides)
2) A couple of new subpackages in `golang.org/x/net/` are added by way of
transitive dependency through the challtestsrv package.
Unit tests are confirmed to pass for `gomock`:
```
~/go/src/github.com/golang/mock/gomock$ git log --pretty=format:'%h' -n 1
51421b9
~/go/src/github.com/golang/mock/gomock$ go test ./...
ok github.com/golang/mock/gomock 0.002s
? github.com/golang/mock/gomock/internal/mock_matcher [no test files]
```
For `/x/net` all tests pass except two `/x/net/icmp` `TestDiag.go` test cases
that we have agreed are OK to ignore.
Resolves https://github.com/letsencrypt/boulder/issues/3962 and
https://github.com/letsencrypt/boulder/issues/3951
Resolves https://github.com/letsencrypt/boulder/issues/3872
**Note to reviewers**: There's an outstanding bug that I've tracked down to the `--load` stage of the integration tests that causes one of the remote VA instances in the `test/config-next` configuration under Go 1.11.1 to fail to shut down cleanly. I'm working on finding the root cause but in the meantime I've disabled `--load` during CI so we can unblock moving forward with getting Go 1.11.1 in dev/CI. Tracking this in https://github.com/letsencrypt/boulder/issues/3889
`test/setup.sh` is removed because it was a relic of the "slowstart" instructions we removed from the README previously. One less place to keep in-sync.
`test/boulder-tools/build.sh` is updated to fix three problems:
1) the `github.com/golang/lint/golint` package moved to `golang.org/x/lint/golint`
2) it did not abort if a background process (e.g. `apt` or `go get`) failed for any reason.
3) Certbot master has removed tls-alpn-01 support
Problem 1 was causing `go get github.com/golang/lint/golint` to fail. Problem 2 was causing Problem 1 to be ignored, resulting in Docker images that were missing all of the dev tools (missing `goose` is the first error to indicate this state).
Problem 3 breaks the TLS-ALPN-01 integration tests from `test/config-next` and requires using a specific version of Certbot (even though we don't rely on Certbot for solving the TLS-ALPN-01 challenge, we do rely on it to POST the initiation of the challenge and that does not function with Certbot master).
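The fix for Problem 2 amounts to checking the exit status of every background job before continuing. `build.sh` is a shell script, so the Python below is only a sketch of the idea; the commands are harmless stand-ins for `apt` and `go get`, and the function name is invented.

```python
import subprocess
import sys


def run_in_parallel(commands):
    """Start every command in the background, then fail if any of them failed."""
    procs = [(cmd, subprocess.Popen(cmd)) for cmd in commands]
    # Wait for all jobs so none is left running, then report the failures.
    failed = [cmd for cmd, proc in procs if proc.wait() != 0]
    if failed:
        raise RuntimeError(f"background commands failed: {failed}")


# Stand-ins for the real background jobs.
ok = [sys.executable, "-c", "pass"]
bad = [sys.executable, "-c", "import sys; sys.exit(3)"]

run_in_parallel([ok, ok])
try:
    run_in_parallel([ok, bad])
except RuntimeError as err:
    print("aborting build:", err)
```

Without the exit-status check, the second call would appear to succeed and the build would continue with missing tools, which is the state described above.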
We've migrated the production/staging Boulder instances to builds using
Go 1.10.3 and can now remove the Go 1.10.2 builds from the travis matrix
and the `tag_and_upload.sh` Boulder tools script.
Production/staging have been updated to a release built with Go 1.10.2.
This allows us to remove the Go 1.10.1 builds from the travis matrix and
default to Go 1.10.2.
This PR adds Go 1.10.2 to the build matrix along with Go 1.10.1.
After staging/prod have been updated to Go 1.10.2 we can remove Go 1.10.1.
Resolves #3680
We have updated staging/prod Boulder builds to use Go 1.10.1. This means
we no longer need to support Go 1.10.0 in dev docker images, CI, and our
image building tools.
This PR updates the `test/boulder-tools/tag_and_upload.sh` script to template a `Dockerfile` for building multiple copies of `boulder-tools`: one per supported Go version. Unfortunately this is required because only Docker 17+ supports an env var in a Dockerfile `FROM`. It's best if we can stay on package manager installed versions of Docker which precludes 17+ 😞.
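The templating step can be pictured with the sketch below. The real script does this in shell; the template contents, the `golang` base image, and the function name here are assumptions made for illustration.

```python
from string import Template

# Hypothetical template; the real Dockerfile template has more steps.
DOCKERFILE_TEMPLATE = Template("FROM golang:${GO_VERSION}\n")


def render_dockerfiles(go_versions):
    """Return one rendered Dockerfile per Go version, keyed by version."""
    return {
        version: DOCKERFILE_TEMPLATE.substitute(GO_VERSION=version)
        for version in go_versions
    }


for version, dockerfile in render_dockerfiles(["1.10", "1.10.1"]).items():
    print(version, "->", dockerfile.strip())
```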
The `docker-compose.yml` is updated to version "3" to allow specifying a `GO_VERSION` env var in the respective Boulder `image` directives. This requires `docker-compose` version 1.10.0+ which in turn requires Docker engine version 1.13.0+. The README is updated to reflect these new requirements. This Docker engine version is commonly available in package managers (e.g. Ubuntu 16.04). A sufficient `docker-compose` version is not, but this is a simple one binary Go application that is easy to update outside of package managers.
The `.travis.yml` config file is updated to set the `GO_VERSION` in the build matrix, allowing build tasks for different Go versions. Since the `docker-compose.yml` now requires `docker-compose` 1.10.0+ the `.travis.yml` also gains a new `before_install` for setting up a modern `docker-compose` version.
Lastly tools and images are updated to support both Go 1.10 (our current Go version) and Go 1.10.1 (the new point release). By default Go 1.10 is used; we can switch this once staging/prod are updated.
_*TODO*: One thing I haven't implemented yet is a `sed` expression in `tag_and_upload.sh` that updates both `image` lines in `docker-compose.yml` with an up-to-date tag. Putting this up for review while I work on that last creature comfort._
Resolves https://github.com/letsencrypt/boulder/issues/3551
Replaces https://github.com/letsencrypt/boulder/pull/3620 (GH got stuck from a yaml error)
This pulls in an updated Certbot repo with a fix to content-type for the
revoke method.
Also, adds a convenience replacement in tag_and_upload.sh to update
docker-compose.yml.
This change updates boulder-tools to use Go 1.10, and references a
newly-pushed image built using that new config.
Since boulder-tools pulls in the latest Certbot master at the time of
build, this also pulls in the latest changes to Certbot's acme module,
which now supports ACME v2. This means we no longer have to check out
the special acme-v2-integration branch in our integration tests.
This also updates chisel2.py to reflect some of the API changes that
landed in the acme module as it was merged to master.
Since we don't need additional checkouts to get the ACMEv2-compatible
version of the acme module, we can include it in the default RUN set for
local tests.
This was added on the theory that it would stop some errors running
`docker compose go generate ./...` but in tests today, that command
succeeds right now even after removing this from the build steps and
rebuilding. This is not surprising, because as Roland pointed out, the
/alt-gopath/ path was never actually referenced anywhere.
Boulder is fairly noisy about gRPC connection errors. This is a mixed
blessing: Our gRPC configuration will try to reconnect until it hits
an RPC deadline, and most likely eventually succeed. In that case,
we don't consider those to really be errors. However, in cases where
a connection is repeatedly failing, we'd like to see errors in the
logs about connection failure, rather than "deadline exceeded." So
we want to keep logging of gRPC errors.
However, right now we get a lot of these errors logged during
integration tests. They make the output hard to read, and may disguise
more serious errors. So we'd like to avoid causing such errors in
normal integration test operation.
This change reorders the startup of Boulder components by their gRPC
dependencies, so everything's backend is likely to be up and running
before it starts. It also reverses that order for clean shutdowns,
and waits for each process to exit before signalling the next one.
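Ordering components by their dependencies is a topological sort, and the shutdown order is simply its reverse. The sketch below uses Python's standard `graphlib` module (Python 3.9+); the dependency map is illustrative only and is not Boulder's real gRPC dependency graph.

```python
from graphlib import TopologicalSorter

# Illustrative map: component -> components it depends on (its backends).
DEPENDS_ON = {
    "sa": [],
    "ca": ["sa"],
    "va": [],
    "publisher": [],
    "ra": ["sa", "ca", "va", "publisher"],
    "wfe": ["ra", "sa"],
}


def startup_order(depends_on):
    """Return components ordered so every backend starts before its clients."""
    return list(TopologicalSorter(depends_on).static_order())


def shutdown_order(depends_on):
    """Return the reverse order, so clients stop before their backends."""
    return list(reversed(startup_order(depends_on)))


print("start:", startup_order(DEPENDS_ON))
print("stop: ", shutdown_order(DEPENDS_ON))
```

A real shutdown loop would additionally wait for each process to exit before signalling the next one, as described above.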
With these changes, I still got connection errors. Taking listenbuddy
out of the gRPC path fixed them. I believe the issue is that
listenbuddy is not a truly transparent proxy. In particular, it
accepts an inbound TCP connection before opening an outbound TCP
connection. If opening that outbound connection results in "connection
refused," it closes the inbound connection. That means gRPC sees a
"connection closed" (or "connection reset"?) rather than "connection
refused". I'm guessing it handles those cases differently, explaining
the different error results.
We've been using listenbuddy to trigger disconnects while Boulder is
running, to ensure that gRPC's reconnect code works. I think we can
probably rely on gRPC's reconnect to work. The initial problem that
led us to start testing this was a configuration problem; now that
we have the configuration we want, we should be fine and don't need
to keep testing reconnects on every integration test run.