Bumps [google.golang.org/grpc](https://github.com/grpc/grpc-go) from 1.36.1 to 1.44.0.
- [Release notes](https://github.com/grpc/grpc-go/releases)
- [Commits](https://github.com/grpc/grpc-go/compare/v1.36.1...v1.44.0)
Also update akamai-purger integration test to avoid experimental API.
The `conn.GetState()` API is marked experimental and may change behavior
at any time. It appears to have changed between v1.36.1 and v1.44.0,
and so the akamai-purger integration tests which rely on it break.
Rather than writing our own loop which polls `conn.GetState()`, just
use the stable `WaitForReady(true)` connection option, and apply it to
all connections by setting it as a default option in the dial options.
- Make maximum queue size configurable via a new configuration key:
'MaxQueueSize'.
- Default 'MaxQueueSize' to the previous value (1M) when 'MaxQueueSize'
isn't specified.
- akamaiPurger.purge() will only place the URLs starting at the first entry of
the failed batch where a failure was encountered instead of the entire set
that was originally passed.
- Add a test to ensure that these changes are working as intended.
- Make the purge batching easier to understand with some minor changes
to variable names
- Responses whose HTTP status code is not 201 will no longer be unmarshaled
- Logs will explicitly call out if a response indicates that we've exceeded any
rate limits imposed by Akamai.
Fixes#5917
Make minor updates to our implementation of the HTTP-01 validation method based
on in-depth review of BRs Section 3.2.2.4.19 and RFC 8555 Section 8.3.
- Move the HTTP response code check above parsing the body.
- Explicitly check for 301, 302, 307, and 308 redirect codes, so that if the go
stdlib updates to allow additional redirects we don't follow suit.
- Trim additional forms of white-space from the key authorization.
Add a new gRPC method `UpdateRevokedCertificate` to the SA. This
method takes the same argument as the existing `RevokeCertificate` RPC,
but only operates on certificates that have already been revoked with a
reason other than keyCompromise (c.f. `RevokeCertificate`, which only
operates on certificates that have not been revoked).
One thing to be careful of here is that storing an updated revocation reason
should not also change the revocation date. To support this, add a new field
to the existing `RevokeCertificateRequest` that allows us to differentiate the
time at which the new OCSP response was created, and the time at which
the revocation went into effect.
Part of #5936
Adapted from https://github.com/golangci/golangci-lint-action#how-to-use.
Uses the same version we've been using in boulder-tools.
Part of #5946
Note: we will eventually want to go back to doing this in boulder-tools,
so it's easy to run the lints locally. But this is useful so we can
unblock testing on go 1.18beta2.
Light cleanup of akamai-purger and the akamai cache-client. This does not make
any material changes to logic.
- Use `errors.New` and `errors.Is` instead of a custom `ErrFatal` type and
`errors.As`
- Add whitespace to separate chunks of execution and error checking from one
another
- Use `logger.Infof` and `logger.Errorf` instead of wrapped calls to
`fmt.Sprintf`
- Remove capital letters from the beginning of error messages
- Additional comments and removal of some that are no longer accurate
Use newly-available HandshakeContext instead of firing off a goroutine.
https://pkg.go.dev/crypto/tls#Conn.HandshakeContext
Don't set explicit min and max TLS versions. Go's defaults do a good job
protecting us here (and the max version prevented us from using TLS
1.3).
The `go` directive inside go.mod determines certain behaviors of
the go command. Since we're using go 1.17 everywhere, we should
update our module's go directive to reflect that, and update its contents
to match the new behavior.
Particularly, updating to 1.17 here means that all indirect dependencies
are listed directly inside go.mod (in a separate block, to keep things clean),
and the go.sum and go.mod files are deleted from vendored dependencies
so that the go tool can correctly find the root of the module even when run
from a vendored dependency's subdirectory.
When inside a closure, it is important to not accidentally assign
to variables declared outside the scope of the closure. Doing so
causes static analysis tools (such as `errcheck`) to be unable to
evaluate the lifetime of the variable, and unable to determine if
it is appropriately read from before being assigned to again.
Fix two instances where we assign to a variable declared in the
closure's enclosing scope, rather than declaring a new variable
with the same name.
We have the `core.AcmeStatus` type precisely so that we don't have to
rely on correctly typing the word `"pending"` everywhere. Use those values
instead of raw strings when converting between string-like Authz statuses
and the database bitfield.
Taking a config object as an argument to a constructor is an anti-
pattern we avoid elsewhere. Resolve the TODO previously put in
place when this first written by instead increasing the number of
arguments `New()` takes and pulling those values out of the config
ahead of time.
Fixes#5904
The stderr, when reviewing logs, saying that the sample size was 0
implies that cert-checker sampled no certificates. In reality, it
is working fine; that zero indicates that the JSON array dumped to
stdout has 0 entries, which is actually good.
Let's just fix this stderr message (which isn't parsed by any tooling we
have) so that the next poor SRE that sees it doesn't freak out and file
tickets to start incident response.
- Break message body construction out into a testable method.
- Ensure that in the event of a missing key, an informative error is returned instead
of allowing the message to be populated with the zero value of the key.
- Add message body construction tests for success, empty map, and missing key.
- Comment the `recipient` struct and it's `Data` field to make it clear that SRE
must be informed of any modifications.
Fixes#5921
Enforce that authorizationLifetimeDays and
pendingAuthorizationLifetimeDays values are configured and set
to a value compliant with the v1.8.1 of the CA/B Forum Baseline
Requirements.
Fixes#5923
Use the T.TempDir method from the testing package to create temporary
directories for tests. Directories created by T.TempDir are automatically
removed when tests complete.
Completely refactor the way we organize our code related to OCSP.
- Move it all into one `//ocsp/` package, rather than having multiple
top-level packages.
- Merge the OCSP updater's config sub-package with its parent
(since it isn't necessary to break it out to avoid cyclic imports).
- Remove all `Source` logic from ocsp-responder's `main.go`, because
it was difficult to mentally trace the control flow there.
- Replace that logic with a set of composable `Source`s in the
`//ocsp/responder/` package, each of which is good at just one thing.
- Update the way the filters work to make sure that the request's
`IssuerKeyHash` and the response's `ResponderName` can both
be derived from the same issuer certificate, ensuring that the req and
resp are correctly matched.
- Split the metrics into a separate metric for each `Source`, so we can
tell what all of them are doing, not just aggregate behavior.
- Split the tests into individual files for each `Source`, and update them
for the new public interfaces.
RFC 8737 says "The client prepares for validation by constructing a self-signed
certificate...". Add a check for whether the challenge certificate is
self-signed by ensuring its issuer and subject are equal, and checking its
signature with its own public key.
Also slightly refactor the helper methods to return only a single cert, since we
only care about the first one returned. And add a test.
We have decided that we don't like the if err := call(); err != nil
syntax, because it creates confusing scopes, but we have not cleaned up
all existing instances of that syntax. However, we have now found a
case where that syntax enables a bug: It caused readers to believe that
a later err = call() statement was assigning to an already-declared err
in the local scope, when in fact it was assigning to an
already-declared err in the parent scope of a closure. This caused our
ineffassign and staticcheck linters to be unable to analyze the
lifetime of the err variable, and so they did not complain when we
never checked the actual value of that error.
This change standardizes on the two-line error checking syntax
everywhere, so that we can more easily ensure that our linters are
correctly analyzing all error assignments.
Detect when a non-compliant ACME client presents a non-compliant x.509
certificate that contains multiple copies of the SubjectAlternativeName or ACME
Identifier extensions; this should be forbidden.
Boulder components initialize their gorp and gorp-less (non-wrapped) database
clients via two new SA helpers. These helpers handle client construction,
database metric initialization, and (for gorp only) debug logging setup.
Removes transaction isolation parameter `'READ-UNCOMMITTED'` from all database
connections.
Fixes#5715Fixes#5889
We insert the precertificate and its ocsp status inside a transaction,
to make sure that they either both succeed or both fail. However, we
were failing to check the error return after inserting the ocsp response.
Add a check to ensure that we fail the whole transaction if the certStatus
insert fails, and add a test to cover this case.
RFC 8737 states that the certificate presented during the "acme-tls/1"
handshake has "a subjectAltName extension containing the dNSName
being validated and no other entries". We were checking that it contained
no other dNSNames, but not requiring that it not have any other kinds of
Subject Alternative Names.
Factor all of our SAN checks into a helper function. Have that function
construct the expected bytes of the SAN extension from the one DNS
name we expect to see, and assert that the actual bytes match the
expectation. Add non-DNS-name identifiers to our error output when we
encounter a cert whose SANs don't match. And add tests which check
that we fail the validation when the cert has multiple SANs.
We have no automated processes which transition Authorizations into
the "revoked" status. Therefore the code which compute the status of
an Order based on the status of its associated Authorizations did not
handle Authorizations with status `core.StatusRevoked`. This code
started failing (by falling through to its default case) when Revoked
Authorizations were introduced to the database.
Unfortunately, the `statusForOrder` function is called during /new-order
handling, to determine whether an existing Order can be reused for
the new request. This means that clients attempting to place a new
order for which there was a candidate existing order were getting 500s,
rather than getting either a new or reused Order object.
To fix the error, introduce a new case which handles `core.StatusRevoked`.
Also collapse the various integer counters which were all just counting
"any status other than Valid or Pending" into a single counter.
This bug would have been avoided by Rust's requirement that case
switches cover all of their possible branches.
Make the `grpcLogger`'s `Warning` methods call the underlying
`Warning` methods on our own logger, rather than upgrading the
severity to `Error`. Move our two filters from the error methods to
the warning methods, since they are warning-level messages in
gRPC. Improve code organization and documentation to make
this struct easier to read next time.
Current metrics show that subscribers present certificates using the
obsolete OID to identify their id-pe-acmeIdentifier extension about
an order of magnitude less often than they present the correct OID.
Remove support for the never-standardized OID.
RFC 8737, Section 4, states "ACME servers that implement "acme-tls/1"
MUST only negotiate TLS 1.2 [RFC5246] or higher when connecting to
clients for validation." Enforce that our outgoing connections to validate
TLS-ALPN-01 challenges do not negotiate TLS1.1.
Incidents of key compromise where proof is supplied in the form of a private key
have historically been labor intensive for SRE. This PR seeks to automate the
process of embedded public key validation , query for issuance, revocation, and
blocking by SPKI hash.
For an example of private keys embedding a mismatched public key, see:
https://blog.hboeck.de/archives/888-How-I-tricked-Symantec-with-a-Fake-Private-Key.html.
Adds two new sub-commands (private-key-block and private-key-revoke) and one new
flag (-dry-run) to admin-revoker. Both new sub-commands validate that the
provided private key and provide the operator with an issuance count. Any
blocking and revocation actions are gated by the new '-dry-run' flag, which is
'true' by default.
private-key-block: if -dry-run=false, will immediately block issuance for the
provided key. The operator is informed that bad-key-revoker will eventually
revoke any certificates using the provided key.
private-key-revoke: if -dry-run=false, will revoke all certificates using the
provided key and then blocks future issuance. This avoids a race with the
bad-key-revoker. This command will execute successfully even if issuance for the
provided key is already blocked.
- Add support for blocking issuance by private key to admin-revoker
- Add support for revoking certificates by private key to admin-revoker
- Create new package called 'privatekey'
- Move private key loading logic from 'issuance' to 'privatekey'
- Add embedded public key verification to 'privatekey'
- Add new field `skipBlockKey` to `AdministrativelyRevokeCertificate` protobuf
- Add check in RA to ensure that only KeyCompromise revocations use
`skipBlockKey`
Fixes#5785
Add `stylecheck` to our list of lints, since it got separated out from
`staticcheck`. Fix the way we configure both to be clearer and not
rely on regexes.
Additionally fix a number of easy-to-change `staticcheck` and
`stylecheck` violations, allowing us to reduce our number of ignored
checks.
Part of #5681
These gRPC methods were only used by the ACMEv1 code paths.
Now that boulder-wfe has been fully removed, we can be confident
that no clients ever call these methods, and can remove them from
the gRPC service interface.
Part of #5816
Move the precertificate ocsp response creation to happen before the
precertificate is issued to avoid a case where a precertificate is issued
but an OCSP response generation error could cause the precertificate to
not be stored in the database.
Fixes#5887
The boulder-wfe binary was responsible for running the ACMEv1 API. This
API has been turned off, and all infrastructure supporting it no longer
expects the binary to exist.
The wfe library performed all of the heavy lifting for the boulder-wfe
binary. It is not used by any other code, and much of its logic is
duplicated in the wfe2 library.
The ACME is dead, long live the ACME.
Fixes#5681
Running an older version (v0.0.1-2020.1.4) of `staticcheck` in
whole-program mode (`staticcheck --unused.whole-program=true -- ./...`)
finds various instances of unused code which don't normally show up
as CI issues. I've used this to find and remove a large chunk of the
unused code, to pave the way for additional large deletions accompanying
the WFE1 removal.
Part of #5681
Back when CRL capabilities were first introduced to the Boulder
codebase for the sake of the `ceremony` tool, they weren't available
in the version of the Go compiler and standard library used by
Boulder at that time. So the CRL capabilities from go1.15 were
backported verbatim into our own `x509crl` library.
This is no longer necessary, as go1.17.5 is our lowest supported
version.
Part of #5681
We can scan metadata and get the age of responses.
We can scan responses and print them in base64.
Note: this issues a GET for each key, and blocks on the result. For much
faster scanning we will want to introduce parallel GETs in a subsequent
PR.
Also, add a `get` operation to get a single entry.
Fixes#5830
rocsp-tool load-from-db scans in batches. On each iteration, it is
supposed to update its starting position based on the highest seen ID from
the last batch. However, it was always setting its starting position to
the same value, and not making progress if the DB was larger than the
batch size.