Update from go1.23.1 to go1.23.6 for our primary CI and release builds.
This brings in a few security fixes that aren't directly relevant to us.
Add go1.24.0 to our matrix of CI and release versions, to prepare for
switching to this next major version in prod.
Replace all of Boulder's usage of the Go stdlib "math/rand" package with
the newer "math/rand/v2" package which first became available in go1.22.
This package has an improved API and faster performance across the
board.
See https://go.dev/blog/randv2 and https://go.dev/blog/chacha8rand for
details.
Although #6771 significantly cleaned up how gRPC services stop and clean
up, it didn't make any changes to our HTTP servers or our non-server
(e.g. crl-updater, log-validator) processes. This change finishes the
work.
Add a new helper method cmd.WaitForSignal, which simply blocks until one
of the three signals we care about is received. This easily replaces all
calls to cmd.CatchSignals which passed `nil` as the callback argument,
with the added advantage that it doesn't call os.Exit() and therefore
allows deferred cleanup functions to execute. This new function is
intended to be the last line of main(), allowing the whole process to
exit once it returns.
Reimplement cmd.CatchSignals as a thin wrapper around cmd.WaitForSignal,
but with the added callback functionality. Also remove the os.Exit()
call from CatchSignals, so that the main goroutine is allowed to finish
whatever it's doing, call deferred functions, and exit naturally.
Update all of our non-gRPC binaries to use one of these two functions.
The vast majority use WaitForSignal, as they run their main processing
loop in a background goroutine. A few (particularly those that can run
either in run-once or in daemonized mode) still use CatchSignals, since
their primary processing happens directly on the main goroutine.
The changes to //test/load-generator are the most invasive, simply
because that binary needed to have a context plumbed into it for proper
cancellation, but it already had a custom struct type named "context"
which needed to be renamed to avoid shadowing.
Fixes https://github.com/letsencrypt/boulder/issues/6794
When using `ct-test-srv` with Boulder infrastructure, it's important now
that the logIDs are correctly configured. This is a nice-to-have that
prints the logID for the provided EC privkey on startup, the same way
that `ct-test-srv` prints the EC pubkey.
Enable the "unparam" linter, which checks for unused function
parameters, unused function return values, and parameters and
return values that always have the same value every time they
are used.
In addition, fix many instances where the unparam linter complains
about our existing codebase. Remove error return values from a
number of functions that never return an error, remove or use
context and test parameters that were previously unused, and
simplify a number of (mostly test-only) functions that always take the
same value for their parameter. Most notably, remove the ability to
customize the RSA Public Exponent from the ceremony tooling,
since it should always be 65537 anyway.
Fixes#6104
Run the Boulder unit and integration tests with go1.19.
In addition, make a few small changes to allow both sets of
tests to run side-by-side. Mark a few tests, including our lints
and generate checks, as go1.18-only. Reformat a few doc
comments, particularly lists, to abide by go1.19's stricter gofmt.
Causes #6275
The iotuil package has been deprecated since go1.16; the various
functions it provided now exist in the os and io packages. Replace all
instances of ioutil with either io or os, as appropriate.
Add a new code path to the ctpolicy package which enforces Chrome's new
CT Policy, which requires that SCTs come from logs run by two different
operators, rather than one Google and one non-Google log. To achieve
this, invert the "race" logic: rather than assuming we always have two
groups, and racing the logs within each group against each other, we now
race the various groups against each other, and pick just one arbitrary
log from each group to attempt submission to.
Ensure that the new code path does the right thing by adding a new zlint
which checks that the two SCTs embedded in a certificate come from logs
run by different operators. To support this lint, which needs to have a
canonical mapping from logs to their operators, import the Chrome CT Log
List JSON Schema and autogenerate Go structs from it so that we can
parse a real CT Log List. Also add flags to all services which run these
lints (the CA and cert-checker) to let them load a CT Log List from disk
and provide it to the lint.
Finally, since we now have the ability to load a CT Log List file
anyway, use this capability to simplify configuration of the RA. Rather
than listing all of the details for each log we're willing to submit to,
simply list the names (technically, Descriptions) of each log, and look
up the rest of the details from the log list file.
To support this change, SRE will need to deploy log list files (the real
Chrome log list for prod, and a custom log list for staging) and then
update the configuration of the RA, CA, and cert-checker. Once that
transition is complete, the deletion TODOs left behind by this change
will be able to be completed, removing the old RA configuration and old
ctpolicy race logic.
Part of #5938
Spamming runs of the `TestPrecertificateRevocation` integration test from
1cd9733c24 found two ways it would flake on rare
occasion:
1. A [data race in the
`ct-test-srv`](https://gist.github.com/cpu/761c176cb72e0eaa52656d3322423202)
would kill the test log process and the integration test would be unable to
reach the mock API. This causes the test failure flagged in #4460. The root
cause is addressed by refactoring the `ct-test-srv`'s
`addChainOrPre` function to use a separate function for checking/extending the
rejected list with the correct locking in place.
2. Occasionally the integration test wasn't able to find a matching precert in
the very first configured ct-test-srv. This produces a test failure like:
```
--- FAIL: TestPrecertificateRevocation (4.95s)
--- FAIL: TestPrecertificateRevocation/revocation_by_certificate_key (1.27s)
revocation_test.go:110: finding rejected precertificate: no matching ct-test-srv rejection found
FAIL
FAIL github.com/letsencrypt/boulder/test/integration 4.961s
FAIL
```
I believe this is addressed by changing the integration test logic to check all of
the configured `ct-test-srv` instances for a matching precert instead of just
the first.
Resolves https://github.com/letsencrypt/boulder/issues/4460
This test adds support in ct-test-srv for rejecting precertificates by
hostname, in order to artificially generate a condition where a
precertificate is issued but no final certificate can be issued. Right
now the final check in the test is temporarily disabled until the
feature is fixed.
Also, as our first Go-based integration test, this pulls in the
eggsampler/acme Go client, and adds some suport in integration-test.py.
This also refactors ct-test-srv slightly to use a ServeMux, and fixes
a couple of cases of not returning immediately on error.
- Move fakeclock, get_future_output, and random_domain to helpers.py.
- Remove tempdir handling from integration-test.py since it's already
done in helpers.py
- Consolidate handling of config dir into helpers.py, and add
CONFIG_NEXT boolean.
- Move RevokeAtRA config gating into verify_revocation to reduce
redundancy.
- Skip load-balancing test when filter is enabled.
- Ungate test_sct_embedding
- Rework test_ct_submissions, which was out of date. In particular, have a couple of
logs where submitFinalCert: false, and make ct-test-srv store submission counts
by hostnames for better test case isolation.
Remove various unnecessary uses of fmt.Sprintf - in particular:
- Avoid calls like t.Error(fmt.Sprintf(...)), where t.Errorf can be used directly.
- Use strconv when converting an integer to a string, rather than using
fmt.Sprintf("%d", ...). This is simpler and can also detect type errors at
compile time.
- Instead of using x.Write([]byte(fmt.Sprintf(...))), use fmt.Fprintf(x, ...).
In publisher and in the integration test, check that SCTs are in a
reasonable range. Also, update CreateTestingSignedSCT (used by
ct-test-srv) to produce SCTs correctly with a timetamp in Unix epoch
milliseconds.
For the upcoming SCT embedding changes, it will be useful to have a CT test server that blocks for nontrivial amounts of time before responding. This change introduces a config file for `ct-test-srv` that can be used to set up multiple "personalities" on various ports. Each personality can have a "latencySchedule" that determines how long it will sleep before servicing responding to a submission.
This change also introduces two new "personalities" on :4510 and :4511, plus configures CTLogGroups in the RA. Having four CT log personalities allows us to simulate two nontrivial log groups.
Note: This triggers Publisher to emit audit errors on timed-out submissions. We may want to make Publisher not treat those as errors, and instead only log an error if a whole log group fails.
Fixes#3020.
In order to write integration tests for some features, especially related to rate limiting, rechecking of CAA, and expiration of authzs, orders, and certs, we need to be able to fake the passage of time in integration tests.
To do so, this change switches out all clock.Default() instances for cmd.Clock(), which can be set manually with the FAKECLOCK environment variable. integration-test.py now starts up all servers once before the main body of tests, with FAKECLOCK set to a date 70 days ago, and does some initial setup for a new integration test case. That test case tries to fetch a 70-day-old authz URL, and expects it to 404.
In order to make this work, I also had to change a number of our test binaries to shut down cleanly in response to SIGTERM. Without that change, stopping the servers between the setup phase and the main tests caused startservers.check() to fail, because some processes exited with nonzero status.
Note: This is an initial stab at things, to prove out the technique. Long-term, I think we will want to use an idiom where test cases are classes that have a number of optional setup phases that may be run at e.g. 70 days prior and 5 days prior. This could help us avoid a proliferation of global state as we add more time-dependent test cases.
Switches imports from `github.com/google/certificate-transparency` to `github.com/google/certificate-transparency-go` and vendors the new code. Also fixes a number of small breakages caused by API changes since the last time we vendored the code. Also updates `github.com/cloudflare/cfssl` since you can't vendor both `github.com/google/certificate-transparency` and `github.com/google/certificate-transparency-go`.
Side note: while doing this `godep` tried to pull in a number of imports under the `golang.org/x/text` repo that I couldn't find actually being used anywhere so I just dropped the changes to `Godeps/Godeps.json` and didn't add the vendored dir to the tree, let's see if this breaks any tests...
All tests pass
```
$ go test ./...
ok github.com/google/certificate-transparency-go 0.640s
ok github.com/google/certificate-transparency-go/asn1 0.005s
ok github.com/google/certificate-transparency-go/client 22.054s
? github.com/google/certificate-transparency-go/client/ctclient [no test files]
ok github.com/google/certificate-transparency-go/fixchain 0.133s
? github.com/google/certificate-transparency-go/fixchain/main [no test files]
ok github.com/google/certificate-transparency-go/fixchain/ratelimiter 27.752s
ok github.com/google/certificate-transparency-go/gossip 0.322s
? github.com/google/certificate-transparency-go/gossip/main [no test files]
ok github.com/google/certificate-transparency-go/jsonclient 25.701s
ok github.com/google/certificate-transparency-go/merkletree 0.006s
? github.com/google/certificate-transparency-go/preload [no test files]
? github.com/google/certificate-transparency-go/preload/dumpscts/main [no test files]
? github.com/google/certificate-transparency-go/preload/main [no test files]
ok github.com/google/certificate-transparency-go/scanner 0.013s
? github.com/google/certificate-transparency-go/scanner/main [no test files]
ok github.com/google/certificate-transparency-go/tls 0.033s
ok github.com/google/certificate-transparency-go/x509 1.071s
? github.com/google/certificate-transparency-go/x509/pkix [no test files]
? github.com/google/certificate-transparency-go/x509util [no test files]
```
```
$ ./test.sh
...
ok github.com/cloudflare/cfssl/api 1.089s coverage: 81.1% of statements
ok github.com/cloudflare/cfssl/api/bundle 1.548s coverage: 87.2% of statements
ok github.com/cloudflare/cfssl/api/certadd 13.681s coverage: 86.8% of statements
ok github.com/cloudflare/cfssl/api/client 1.314s coverage: 55.2% of statements
ok github.com/cloudflare/cfssl/api/crl 1.124s coverage: 75.0% of statements
ok github.com/cloudflare/cfssl/api/gencrl 1.067s coverage: 72.5% of statements
ok github.com/cloudflare/cfssl/api/generator 2.809s coverage: 33.3% of statements
ok github.com/cloudflare/cfssl/api/info 1.112s coverage: 84.1% of statements
ok github.com/cloudflare/cfssl/api/initca 1.059s coverage: 90.5% of statements
ok github.com/cloudflare/cfssl/api/ocsp 1.178s coverage: 93.8% of statements
ok github.com/cloudflare/cfssl/api/revoke 2.282s coverage: 75.0% of statements
ok github.com/cloudflare/cfssl/api/scan 2.729s coverage: 62.1% of statements
ok github.com/cloudflare/cfssl/api/sign 2.483s coverage: 83.3% of statements
ok github.com/cloudflare/cfssl/api/signhandler 1.137s coverage: 26.3% of statements
ok github.com/cloudflare/cfssl/auth 1.030s coverage: 68.2% of statements
ok github.com/cloudflare/cfssl/bundler 15.014s coverage: 85.1% of statements
ok github.com/cloudflare/cfssl/certdb/dbconf 1.042s coverage: 78.9% of statements
ok github.com/cloudflare/cfssl/certdb/ocspstapling 1.919s coverage: 69.2% of statements
ok github.com/cloudflare/cfssl/certdb/sql 1.265s coverage: 65.7% of statements
ok github.com/cloudflare/cfssl/cli 1.050s coverage: 61.9% of statements
ok github.com/cloudflare/cfssl/cli/bundle 1.023s coverage: 0.0% of statements
ok github.com/cloudflare/cfssl/cli/crl 1.669s coverage: 57.8% of statements
ok github.com/cloudflare/cfssl/cli/gencert 9.278s coverage: 83.6% of statements
ok github.com/cloudflare/cfssl/cli/gencrl 1.310s coverage: 73.3% of statements
ok github.com/cloudflare/cfssl/cli/genkey 3.028s coverage: 70.0% of statements
ok github.com/cloudflare/cfssl/cli/ocsprefresh 1.106s coverage: 64.3% of statements
ok github.com/cloudflare/cfssl/cli/revoke 1.081s coverage: 88.2% of statements
ok github.com/cloudflare/cfssl/cli/scan 1.217s coverage: 36.0% of statements
ok github.com/cloudflare/cfssl/cli/selfsign 2.201s coverage: 73.2% of statements
ok github.com/cloudflare/cfssl/cli/serve 1.133s coverage: 39.0% of statements
ok github.com/cloudflare/cfssl/cli/sign 1.210s coverage: 54.8% of statements
ok github.com/cloudflare/cfssl/cli/version 2.475s coverage: 100.0% of statements
ok github.com/cloudflare/cfssl/cmd/cfssl 1.082s coverage: 0.0% of statements
ok github.com/cloudflare/cfssl/cmd/cfssljson 1.016s coverage: 4.0% of statements
ok github.com/cloudflare/cfssl/cmd/mkbundle 1.024s coverage: 0.0% of statements
ok github.com/cloudflare/cfssl/config 2.754s coverage: 67.7% of statements
ok github.com/cloudflare/cfssl/crl 1.063s coverage: 68.3% of statements
ok github.com/cloudflare/cfssl/csr 27.016s coverage: 89.6% of statements
ok github.com/cloudflare/cfssl/errors 1.081s coverage: 81.2% of statements
ok github.com/cloudflare/cfssl/helpers 1.217s coverage: 80.4% of statements
ok github.com/cloudflare/cfssl/helpers/testsuite 7.658s coverage: 65.8% of statements
ok github.com/cloudflare/cfssl/initca 205.809s coverage: 74.2% of statements
ok github.com/cloudflare/cfssl/log 1.016s coverage: 59.3% of statements
ok github.com/cloudflare/cfssl/multiroot/config 1.107s coverage: 77.4% of statements
ok github.com/cloudflare/cfssl/ocsp 1.524s coverage: 77.7% of statements
ok github.com/cloudflare/cfssl/revoke 1.775s coverage: 79.6% of statements
ok github.com/cloudflare/cfssl/scan 1.022s coverage: 1.1% of statements
ok github.com/cloudflare/cfssl/selfsign 1.119s coverage: 70.0% of statements
ok github.com/cloudflare/cfssl/signer 1.019s coverage: 20.0% of statements
ok github.com/cloudflare/cfssl/signer/local 3.146s coverage: 81.2% of statements
ok github.com/cloudflare/cfssl/signer/remote 2.328s coverage: 71.8% of statements
ok github.com/cloudflare/cfssl/signer/universal 2.280s coverage: 67.7% of statements
ok github.com/cloudflare/cfssl/transport 1.028s
ok github.com/cloudflare/cfssl/transport/ca/localca 1.056s coverage: 94.9% of statements
ok github.com/cloudflare/cfssl/transport/core 1.538s coverage: 90.9% of statements
ok github.com/cloudflare/cfssl/transport/kp 1.054s coverage: 37.1% of statements
ok github.com/cloudflare/cfssl/ubiquity 1.042s coverage: 88.3% of statements
ok github.com/cloudflare/cfssl/whitelist 2.304s coverage: 100.0% of statements
```
Fixes#2746.
Pulls in logging improvements in OCSP Responder and the CT client, plus a handful of API changes. Also, the CT client verifies responses by default now.
This change includes some Boulder diffs to accommodate the API changes.
That change broke the certbot tests because it switched to a MariaDB
10.1-specific syntax. certbot/certbot#3058 changes the certbot tests to use
Boulder's docker-compose.yml, so they will get MariaDB 10.1 automatically.
* MariaDB 10.1
* MariaDB 10.1 in Docker
* Run docker stuff.
* Improve test.js error.
* Lower log level
* Revert dockerfile to master
* Export debug ports, set FAKE_DNS, and remove container_name.
* Remove typo.
* Make integration-test.py wait for debug ports.
* Use 10.1 and export more Boulder ports.
* Test updates for Docker
Listen on 0.0.0.0 for utility servers.
Make integration-test.py just wait for ports rather than calling startservers.
Run docker-compose in test.sh.
Remove bypass when database exists.
Separate mailer test into its own function in integration test.
Print better errors in test.js.
* Always bring up mysql container.
* Wait for MySQL to come up.
* Put it in travis-before-install.
* Use 127
* Remove manual docker-up.
* Add ifconfig
* Switch to docker-compose run
* It works!
* Remove some spurious env vars.
* Add bash
* try running it
* Add all deps.
* Pass through env.
* Install everything in the Dockerfile.
* Fix install of ruby
* More improvements
* Revert integration test to run directly
Also remove .git from dockerignore and add some packages.
* Revert integration-test.py to master.
* Stop ignoring test/js
* Start from boulder-tools.
* Add boulder-tools.
* Tweak travis.yml
* Separate out docker-compose pull as install.
* Build in install phase; don't bother with go install in Dockerfile
* Add virtualenv
* Actually build rabbitmq-setup
* Remove FAKE_DNS
* Trivial change
* Pull boulder-tools as a separate step so it gets its own timing info.
* Install certbot and protobuf from repos.
* Use cerbot from debian backports.
* Fix clone
* Remove CERTBOT_PATH
* Updates
* Go back to letsencrypt for build.sh
* Remove certbot volume.
* go back to preinstalled letsencrypt
* Restore ENV
* Remove BASH_ENV
* Adapt reloader test so it psses when run as root.
* Fixups for review.
* Revert test.js
* Revert startservers.py
* Revert Makefile.
This restores an earlier version of
https://github.com/letsencrypt/boulder/pull/1282. During code review on
that change, I requested that we replace the realtime signing with a static
signature. However, that is now generating failures. I think the signature
validation is now validating more of the content.