This is a follow-up fix to
https://github.com/letsencrypt/boulder/pull/6987 which mistakenly did
not update the docker compose default BOULDER_TOOLS_TAG variable from
go1.20.5 to go1.20.6. This would only manifest on developer machines
manually running unit/integration tests rather than CI which explicitly
tests against a matrix of BOULDER_TOOLS_TAG versions.
```
Error response from daemon: manifest for letsencrypt/boulder-tools:go1.20.5_2023-07-11 not found: manifest unknown: manifest unknown
```
This version includes a fix that seems relevant to us:
> The HTTP/1 client did not fully validate the contents of the Host
header. A maliciously crafted Host header could inject additional
headers or entire requests. The HTTP/1 client now refuses to send
requests containing an invalid Request.Host or Request.URL.Host value.
>
> Thanks to Bartek Nowotarski for reporting this issue.
>
> Includes security fixes for CVE-2023-29406 and Go issue
https://go.dev/issue/60374
Add go1.21rc2 to the matrix of go versions we test against.
Add a new step to our CI workflows (boulder-ci, try-release, and
release) which sets the "GOEXPERIMENT=loopvar" environment variable if
we're running go1.21. This experiment makes it so that loop variables
are scoped only to their single loop iteration, rather than to the whole
loop. This prevents bugs such as our CAA Rechecking incident
(https://bugzilla.mozilla.org/show_bug.cgi?id=1619047). Also add a line
to our docker setup to propagate this environment variable into the
container, where it can affect builds.
Finally, fix one TLS-ALPN-01 test to have the fake subscriber server
actually willing to negotiate the acme-tls/1 protocol, so that the ACME
server's tls client actually waits to (fail to) get the certificate,
instead of dying immediately. This fix is related to the upgrade to
go1.21, not the loopvar experiment.
Fixes https://github.com/letsencrypt/boulder/issues/6950
- Update consul container from `1.13.1` to `1.14.2` to match production.
- Specify `grpc_tls`, now required instead of defaulted to `8503` when
`enable_agent_tls_for_checks` is specified.
Part of #6911
Enable SA gRPC health checks in Consul ahead of further changes for
#6878. Calls to the `Check` method of the SA's grpc.health.v1.Health
service must respond `SERVING` before the `sa` service will be
advertised in Consul DNS. Consul will continue to poll this service
every 5 seconds.
- Add `bconsul` docker service to boulder `bluenet` and `rednet`
- Add TLS credentials for `consul.boulder`:
```shell
$ openssl x509 -in consul.boulder/cert.pem -text | grep DNS
DNS:consul.boulder
```
- Update `test/grpc-creds/generate.sh` to add `consul.boulder`
- Update test SA configs to allow `consul.boulder` to access to
`grpc.health.v1.Health`
Part of #6878
These external port bindings are not necessary, as the integration test
configs resolve the bjaeger container directly. In addition, these
external port bindings cause problems for rootless docker, so let's
remove them.
This adds Jaeger's all-in-one dev container (with no persistent storage)
to boulder's dev docker-compose. It configures config-next/ to send all
traces there.
A new integration test creates an account and issues a cert, then
verifies the trace contains some set of expected spans.
This test found that async finalize broke spans, so I fixed that and a
few related spots where we make a new context.
Add an upstream ProxySQL container to our docker-compose. Configure
ProxySQL to manage database connections for our unit and integration
tests.
Fixes#5873
## Remove cfssl recommendation
While it is a valuable PKI toolkit, it really isn't an alternative to
boulder -- there are other private ACME CA projects, and I don't think
we should be in the business of recommending other software when there's
many tradeoffs to be made.
## Remove references to "two API versions"
This removes the reference to running two WFEs, and simplifies some of
the description around "objects" being passed around, which I don't
think is helpful for understanding how Boulder works as the RPCs aren't
generally broadcasting updated objects in the way the removed paragraph
suggests.
## Update information about solving ACME challenges
In #6619 we removed the VA PortConfig, so information about ports 5001
and 5002 are obsolete.
As well, the docker host IP is almost always (barring a user changing
it) the same, so while there's a longer explanation in the README, a
comment in docker-compose.yml is a useful quick reference.
Only build arm64 images for one version of Go.
Split build.sh into two scripts: build.sh (which installs apt and
Python) and install-go.sh (which installs a specific Go version and Go
dependencies). This allows reusing a cached layer for the build.sh step
across multiple Go versions.
Remove installation of fpm from build.sh. This is no longer needed since
#6669 and allows us to get rid of `rpm`, `ruby`, and `ruby-dev`.
Remove apt dependency on pkg-config, libtool, autoconf, and automake.
These were introduced in
https://github.com/letsencrypt/boulder/pull/4832 but aren't needed
anymore because we don't build softhsm2 ourselves (we get it from apt).
Remove apt dependency on cmake, libssl-dev, and openssl. I'm not totally
sure what these were needed for but they're not needed anymore.
Running this locally on my laptop for our current 3 GO_CI_VERSIONS and 1
GO_DEV_VERSION takes 23 minutes of wall time, dominated by the cross
build for arm64.
Remove `example.com` domain name, which was used by the deleted OldTLS
tests.
Remove GODEBUG=x509sha1=1.
Add a longer comment for the Consul DNS fallback in docker-compose.yml.
Use the "dnsAuthority" field for all gRPC clients in config-next,
instead of implicitly relying on the system DNS. This matches what we do
in prod.
Make "dnsAuthority" field of GRPCClientConfig mandatory whenever
SRVLookup or SRVLookups is used.
Make test/config/ocsp-responder.json use ServerAddress instead of
SRVLookup, like the rest of test/config.
Add go1.20 as a new version to run tests on, and to build release
artifacts from. Fix one test which was failing because it was
accidentally relying on consistent (i.e. unseeded) non-cryptographic
random number generation, which go1.20 now automatically seeds at import
time.
Update the version of golangci-lint used in our docker containers to the
new version that has go1.20 support. Remove a number of nolint comments
that were required due to an old version of the gosec linter.
Clean up several spots where we were behaving differently on
go1.18 and go1.19, now that we're using go1.19 everywhere. Also
re-enable the lint and generate tests, and fix the various places where
the two versions disagreed on how comments should be formatted.
Also clean up the OldTLS codepaths, now that both go1.19 and our
own feature flags have forbidden TLS < 1.2 everywhere.
Fixes#6011
- Add a dedicated Consul container
- Replace `sd-test-srv` with Consul
- Add documentation for configuring Consul
- Re-issue all gRPC credentials for `<service-name>.service.consul`
Part of #6111
In dev docker we've always used a single schema (`boulder_sa`), with two
environments (`test` and `integration`) making for a combined total of two
databases sharing the same users and schema (e.g. `boulder_sa_test` and
`boulder_sa_integration`). There are also two versions of this schema. `db` and
`db-next`. The former is the schema as it should exist in production and the
latter is everything from `db` with some un-deployed schema changes. This change
adds support for additional schemas with the same aforementioned environments
and versions.
- Add support for additional schemas in `test/create_db.sh` and sa/migrations.sh
- Add new schema `incidents_sa` with its own users
- Replace `bitbucket.org/liamstask/goose/` with `github.com/rubenv/sql-migrate`
Part of #6328
When rsyslog receives multiple identical log lines in a row, it can
collapse those lines into a single instance of the log line and a
follow-up line saying "message repeated X times". However, that
rsyslog-generated line does not contain our log line checksum, so it
immediately causes log-validator to complain about the line. In
addition, the rsyslog docs themselves state that this feature is a
misfeature and should never be turned on. Despite this, Ubuntu turns the
feature on by default when the rsyslog package is installed from apt.
Add an additional command to our dockerfile which overwrites Ubuntu's
default setting to disable this misfeature, and update our test
environment to use the new docker image.
Fixes#6252
Run the Boulder unit and integration tests with go1.19.
In addition, make a few small changes to allow both sets of
tests to run side-by-side. Mark a few tests, including our lints
and generate checks, as go1.18-only. Reformat a few doc
comments, particularly lists, to abide by go1.19's stricter gofmt.
Causes #6275
Update:
- golangci-lint from v1.42.1 to v1.46.2
- protoc from v3.15.6 to v3.20.1
- protoc-gen-go from v1.26.0 to v1.28.0
- protoc-gen-go-grpc from v1.1.0 to v1.2.0
- fpm from v1.14.0 to v1.14.2
Also remove a reference to go1.17.9 from one last place.
This does result in updating all of our generated .pb.go files, but only
to update the version number embedded in each file's header.
Fixes#6123
Redis recently released version 7.0.0, which has several breaking
changes. The go-redis library that we rely on does not yet support
communicating with a Redis 7.0.0 cluster.
Pin ourselves to the latest non-7.0.0 version, 6.2.7, until such time
as go-redis releases a version with support for 7.0.0.
Fixes#6071
This adds three features flags: SHA1CSRs, OldTLSOutbound, and
OldTLSInbound. Each controls the behavior of an upcoming deprecation
(except OldTLSInbound, which isn't yet scheduled for a deprecation
but will be soon). Note that these feature flags take advantage of
`features`' default values, so they can default to "true" (that is, each
of these features is enabled by default), and we set them to "false"
in the config JSON to turn them off when the time comes.
The unittest for OldTLSOutbound requires that `example.com` resolves
to 127.0.0.1. This is because there's logic in the VA that checks
that redirected-to hosts end in an IANA TLD. The unittest relies on
redirecting, and we can't use e.g. `localhost` in it because of that
TLD check, so we use example.com.
Fixes#6036 and #6037
go1.17.9 (released 2022-04-12) includes security fixes to the crypto/elliptic and encoding/pem packages, as well as bug fixes to the linker and runtime. See the [Go 1.17.9 milestone](https://github.com/golang/go/issues?q=milestone%3AGo1.17.9+label%3ACherryPickApproved) on our issue tracker for details.
go1.18.1 (released 2022-04-12) includes security fixes to the crypto/elliptic, crypto/x509, and encoding/pem packages, as well as bug fixes to the compiler, linker, runtime, the go command, vet, and the bytes, crypto/x509, and go/types packages. See the [Go 1.18.1 milestone](https://github.com/golang/go/issues?q=milestone%3AGo1.18.1+label%3ACherryPickApproved) on our issue tracker for details.
- Remove GOPATH-style path structure, which isn't needed with Go
modules.
- Remove check for existing of docker buildx builder instance, since it
was unreliable.
- Add a CI workflow which publishes a GitHub Release containing a Debian package
when a release tag is pushed
- Add a script, called by the CI host, that installs all of the dependencies
necessary to `make` a Debian package
- Remove the, now defunct, goreleaser config file
Fixes#5970
This requires using GODEBUG to enable a couple of thing turned off by go1.18 (TLS 1.0/1.1, SHA-1 CSRs).
Also add help for a failure mode of cross builds.
Build a new docker container for the new Go 1.17.5 security release,
which includes a fix for the `net/http` package. Update our CI to run
tests on both our current and the new go versions.
These tests are testing functionality that is no longer in use in
production deployments of Boulder. As we go about removing wfe1
functionality, these tests will break, so let's just remove them
wholesale right now. I have verified that all of the tests removed in
this PR are duplicated against wfe2.
One of the changes in this PR is to cease starting up the wfe1 process
in the integration tests at all. However, that component was serving
requests for the AIA Issuer URL, which gets queried by various OCSP and
revocation tests. In order to keep those tests working, this change also
adds an integration-test-only handler to wfe2, and updates the CA
configuration to point at the new handler.
Part of #5681