The summary here is:
- Move test/cert-ceremonies to test/certs
- Move .hierarchy (generated by the above) to test/certs/webpki
- Remove our mapping of .hierarchy to /hierarchy inside docker
- Move test/grpc-creds to test/certs/ipki
- Unify the generation of both test/certs/webpki and test/certs/ipki
into a single script at test/certs/generate.sh
- Make that script the entrypoint of a new docker compose service
- Have t.sh and tn.sh invoke that service to ensure keys and certs are
created before tests run
No production changes are necessary, the config changes here are just
for testing purposes.
Part of https://github.com/letsencrypt/boulder/issues/7476
Export new prometheus metrics for the `notBefore` and `notAfter` fields
to track internal certificate validity periods when calling the `Load()`
method for a `*tls.Config`. Each metric is labeled with the `serial`
field.
```
tlsconfig_notafter_seconds{serial="2152072875247971686"} 1.664821961e+09
tlsconfig_notbefore_seconds{serial="2152072875247971686"} 1.664821960e+09
```
Fixes https://github.com/letsencrypt/boulder/issues/6829
Boulder builds a single binary which is symlinked to the different binary names, which are included in its releases.
However, requiring symlinks isn't always convenient.
This change makes the base `boulder` command usable as any of the other binary names. If the binary is invoked as boulder, runs the second argument as the command name. It shifts off the `boulder` from os.Args so that all the existing argument parsing can remain unchanged.
This uses the subcommand versions in integration tests, which I think is important to verify this change works, however we can debate whether or not that should be merged, since we're using the symlink method in production, that's what we want to test.
Issue #6362 suggests we want to move to a more fully-featured command-line parsing library that has proper subcommand support. This fixes one fragment of that, by providing subcommands, but is definitely nowhere near as nice as it could be with a more fully fleshed out library. Thus this change takes a minimal-touch approach to this change, since we know a larger refactoring is coming.
- Add a new gRPC client config field which overrides the dNSName checked in the
certificate presented by the gRPC server.
- Revert all test gRPC credentials to `<service>.boulder`
- Revert all ClientNames in gRPC server configs to `<service>.boulder`
- Set all gRPC clients in `test/config` to use `serverAddress` + `hostOverride`
- Set all gRPC clients in `test/config-next` to use `srvLookup` + `hostOverride`
- Rename incorrect SRV record for `ca` with port `9096` to `ca-ocsp`
- Rename incorrect SRV record for `ca` with port `9106` to `ca-crl`
Resolves#6424
- Add a dedicated Consul container
- Replace `sd-test-srv` with Consul
- Add documentation for configuring Consul
- Re-issue all gRPC credentials for `<service-name>.service.consul`
Part of #6111
- Add new configuration key `throughput`, a mapping which contains all
throughput related akamai-purger settings.
- Deprecate configuration key `purgeInterval` in favor of `purgeBatchInterval` in
the new `throughput` configuration mapping.
- When no `throughput` or `purgeInterval` is provided, the purger uses optimized
default settings which offer 1.9x the throughput of current production settings.
- At startup, all throughput related settings are modeled to ensure that we
don't exceed the limits imposed on us by Akamai.
- Queue is now `[][]string`, instead of `[]string`.
- When a given queue entry is purged we know all 3 of it's URLs were purged.
- At startup we know the size of a theoretical request to purge based on the
number of queue entries included
- Raises the queue size from ~333-thousand cached OCSP responses to
1.25-million, which is roughly 6 hours of work using the optimized default
settings
- Raise `purgeInterval` in test config from 1ms, which violates API limits, to 800ms
Fixes#5984
Reverts letsencrypt/boulder#5963
Turns out the tests are still flaky -- using the `grpc.WaitForReady(true)`
connection option results in sometimes seeing 9 entries added to the
purger queue, and sometimes 10 entries. Reverting because flakiness
on main should not be tolerated.
Bumps [google.golang.org/grpc](https://github.com/grpc/grpc-go) from 1.36.1 to 1.44.0.
- [Release notes](https://github.com/grpc/grpc-go/releases)
- [Commits](https://github.com/grpc/grpc-go/compare/v1.36.1...v1.44.0)
Also update akamai-purger integration test to avoid experimental API.
The `conn.GetState()` API is marked experimental and may change behavior
at any time. It appears to have changed between v1.36.1 and v1.44.0,
and so the akamai-purger integration tests which rely on it break.
Rather than writing our own loop which polls `conn.GetState()`, just
use the stable `WaitForReady(true)` connection option, and apply it to
all connections by setting it as a default option in the dial options.
Light cleanup of akamai-purger and the akamai cache-client. This does not make
any material changes to logic.
- Use `errors.New` and `errors.Is` instead of a custom `ErrFatal` type and
`errors.As`
- Add whitespace to separate chunks of execution and error checking from one
another
- Use `logger.Infof` and `logger.Errorf` instead of wrapped calls to
`fmt.Sprintf`
- Remove capital letters from the beginning of error messages
- Additional comments and removal of some that are no longer accurate
Update the version of golangci-lint we use in our docker image,
and update the version of the docker image we use in our tests.
Fix a couple places where we were violating lints (ineffective assign
and calling `t.Fatal` from outside the main test goroutine), and add
one lint (using math/rand) to the ignore list.
Fixes#5710