solve https://github.com/letsencrypt/boulder/issues/8088
RFC8555 6.2 requires badSignatureAlgorithm on unacceptable JWS signing
algorithm, but current boulder return malform:failed to parse jws error
instead
Its because this only checks about JWS protected header's signature
algorithm, current checkAlgorithm is while too late to catch parse time
error but not redundant, as it checks against a key and signed message
---------
Co-authored-by: Samantha Frank <hello@entropy.cat>
Change all of the helper methods and functions in verify.go to return an
`error` instead of a `probs.ProblemDetails`. Add a few new types to our
errors package, and support for those types in ProblemDetailsForError,
to maintain the same public-facing error types. Update the tests to
check for specific errors instead of specific problems.
This is a building block towards making the probs.ProblemDetails type
not implement the Error interface, and only be used when rendering
errors to the user (i.e. not within Boulder logic itself).
Part of https://github.com/letsencrypt/boulder/issues/4980
Change how goodkey.KeyPolicy keeps track of allowed RSA and ECDSA key
sizes, to make it slightly more flexible while still retaining the very
locked-down allowlist of only 6 acceptable key sizes (RSA 2048, 3076,
and 4092, and ECDSA P256, P384, and P521). Add a new constructor which
takes in a collection of allowed key sizes, so that users of the goodkey
package can customize which keys they accept. Rename the existing
constructor to make it clear that it uses hardcoded default values.
With these new constructors available, make all of the goodkey.KeyPolicy
member fields private, so that a KeyPolicy can only be built via these
constructors.
Replace "mocks.StorageAuthority" with "sapb.StorageAuthorityClient" in
our test mocks. The improves them by removing implementations of the
methods the tests don't actually need, instead of inheriting lots of
extraneous methods from the huge and cumbersome mocks.StorageAuthority.
This reduces our usage of mocks.StorageAuthority to only the WFE tests
(which create one in the frequently-used setup() function), which will
make refactoring those mocks in the pursuit of
https://github.com/letsencrypt/boulder/issues/7476 much easier.
Part of https://github.com/letsencrypt/boulder/issues/7476
Upgrade from the old go-jose v2.6.1 to the newly minted go-jose v4.0.1.
Cleans up old code now that `jose.ParseSigned` can take a list of
supported signature algorithms.
Fixes https://github.com/letsencrypt/boulder/issues/7390
---------
Co-authored-by: Aaron Gable <aaron@letsencrypt.org>
Fix an issue related to the custom gRPC Picker implementation introduced
in #6618. When a nonce contained a prefix not associated with a known
backend, the Picker would continuously rebuild, re-resolve DNS, and
eventually throw a 500 "Server Error" at RPC timeout. The Picker now
promptly returns a 400 "Bad Nonce" error as expected, in response the
requesting client should retry their request with a fresh nonce.
Additionally:
- WFE unit tests use derived nonces when `"BOULDER_CONFIG_DIR" ==
"test/config-next"`.
- `Balancer.Build()` in "noncebalancer" forces a rebuild until non-zero
backends are available. This matches the
[balancer/roundrobin](d524b40946/balancer/roundrobin/roundrobin.go (L49-L53))
implementation.
- Nonces with no matching backend increment "jose_errors" with label
`"type": "JWSInvalidNonce"` and "nonce_no_backend_found".
- Nonces of incorrect length are now rejected at the WFE and increment
"jose_errors" with label `"type": "JWSMalformedNonce"` instead of
`"type": "JWSInvalidNonce"`.
- Nonces not encoded as base64url are now rejected at the WFE and
increment "jose_errors" with label `"type": "JWSMalformedNonce"` instead
of `"type": "JWSInvalidNonce"`.
Fixes#6969
Part of #6974
The contacts field of an account can be very verbose, and is irrelevant
to the vast majority -- e.g. creating orders, validating challenges, and
downloading certificates -- of requests made by an account. To reduce
the length of our WFE log lines, remove the Contacts field from all
logs. When we actually need it, we can get it from the database.
Also remove the RequestEvent.TLS field, which is unused.
In WFE, we do a goodkey check when validating a self-authenticated POST
(i.e. when creating an account). For a while, that was a purely local
check, looking at a list of bad keys or bad moduluses, or checking for
factorability. At some point we also added a backend check, querying the
SA to see if a key was blocked. However, we did not update this one code
path to distinguish "bad key" from "timeout querying SA." That meant
that sometimes we would give a badPublicKey Problem Document when we
should have given an internalServerError.
Related:
https://github.com/letsencrypt/boulder/issues/6795#issuecomment-1574217398
Define a `bJSONWebSignature` struct which embeds a
`*jose.JSONWebSignature`. The only method that can produce a
`bJSONWebSignature` is `wfe.parseJWS` so that we can ensure
safety/sanity checks are performed on the incoming data. Restricts
several methods and functions to take a `jose.Header` as an input
parameter, rather than a full JWS.
Fixes https://github.com/letsencrypt/boulder/issues/5676.
Also remove CSRDNSNames, CSRIPAddresses and CSREmailAddresses.
And add a new log field "DNSNames", for use in new-order, finalize, and
revoke requests.
Add a "RevocationReason" field in the "Extra" section for revoke
requests.
Originally, WFEs had a built-in nonce service. Then we added a "remote
nonce service" via gRPC, but we kept a fallback path for when the remote
nonce service was not configured, to use a built-in nonce service. This
PR removes that fallback path.
Since the fallback path was relied on by the unittests, this also
refactors the unittests to use a gRPC-style nonce service (but in-memory
for the unittests).
Fixes#6530
In the WFE, ocsp-responder, and crl-updater, switch from using
StorageAuthorityClients to StorageAuthorityReadOnlyClients. This ensures
that these services cannot call methods which write to our database.
Fixes#6454
When a client cancels their HTTP request, that propagates into
gRPC-level cancellation. But we don't want those canceled RPCs to show
up as 500s, we want them to show up as 408s. We do that by producing a
special ProblemDetails at the gRPC level. However, for that trick to
work, we need to make sure errors from RPC methods get passed through
web.ProblemDetailsForError. There were some places where we didn't do
this and instead created a from-scratch ProblemDetails. This resulted in
spurious 500s.
My methodology was to look at every method call on each of the WFE's
fields that represents a gRPC backend: `ra`, `sa`, `accountGetter`,
`nonceService`, `remoteNonceServe`. If the error handling for that call
did not use web.ProblemDetailsForError, I changed it to use that.
Fixes#6524
In draft because still needs tests.
This special casing was originally added to catch a bug where
the underlying library said that the EC *private* key was the
wrong length. We were also interested in exposing this specific
error to clients because EC key-length strictness was added to
go-jose and we wanted to ensure that upgrading to the new
version wouldn't break existing clients.
We are now several years past that upgrade, and hit this error
case very infrequently. There is no need to continue special-
casing it.
Followup from #5839.
I chose groupcache/lru as our LRU cache implementation because it's part
of the golang org, written by one of the Go authors, and very simple
and easy to read.
This adds an `AccountGetter` interface that is implemented by both the
AccountCache and the SA. If the WFE config includes an AccountCache field,
it will wrap the SA in an AccountCache with the configured max size and
expiration time.
We set an expiration time on account cache entries because we want a
bounded amount of time that they may be stale by. This will be used in
conjunction with a delay on account-updating pathways to ensure we don't
allow authentication with a deactivated account or changed key.
The account cache stores corepb.Registration objects because protobufs
have an established way to do a deep copy. Deep copies are important so
the cache can maintain its own internal state and ensure nothing external
is modifying it.
As part of this process I changed construction of the WFE. Previously,
"SA" and "RA" were public fields that were mutated after construction. Now
they are parameters to the constructor, along with the new "accountGetter"
parameter.
The cache includes stats for requests categorized by hits and misses.
These tests are testing functionality that is no longer in use in
production deployments of Boulder. As we go about removing wfe1
functionality, these tests will break, so let's just remove them
wholesale right now. I have verified that all of the tests removed in
this PR are duplicated against wfe2.
One of the changes in this PR is to cease starting up the wfe1 process
in the integration tests at all. However, that component was serving
requests for the AIA Issuer URL, which gets queried by various OCSP and
revocation tests. In order to keep those tests working, this change also
adds an integration-test-only handler to wfe2, and updates the CA
configuration to point at the new handler.
Part of #5681
Remove the last of the gRPC wrapper files. In order to do so:
- Remove the `core.StorageGetter` interface. Replace it with a new
interface (whose methods include the `...grpc.CallOption` arg)
inside the `sa/proto/` package.
- Remove the `core.StorageAdder` interface. There's no real use-case
for having a write-only interface.
- Remove the `core.StorageAuthority` interface, as it is now redundant
with the autogenerated `sapb.StorageAuthorityClient` interface.
- Replace the `certificateStorage` interface (which appears in two
different places) with a single unified interface also in `sa/proto/`.
- Update all test mocks to include the `_ ...grpc.CallOption` arg in
their method signatures so they match the gRPC client interface.
- Delete many methods from mocks which are no longer necessary (mostly
because they're mocking old authz1 methods that no longer exist).
- Move the two `test/inmem/` wrappers into their own sub-packages to
avoid an import cycle.
- Simplify the `satest` package to satisfy one of its TODOs and to
avoid an import cycle.
- Add many methods to the `test/inmem/sa/` wrapper, to accommodate all
of the methods which are called in unittests.
Fixes#5600
Remove all error checking and type transformation from the gRPC wrappers
for the following methods on the SA:
- GetRegistration
- GetRegistrationByKey
- NewRegistration
- UpdateRegistration
- DeactivateRegistration
Update callers of these methods to construct the appropriate protobuf
request messages directly, and to consume the protobuf response messages
directly. In many cases, this requires changing the way that clients
handle the `Jwk` field (from expecting a `JSONWebKey` to expecting a
slice of bytes) and the `Contacts` field (from expecting a possibly-nil
pointer to relying on the value of the `ContactsPresent` boolean field).
Implement two new methods in `sa/model.go` to convert directly between
database models and protobuf messages, rather than round-tripping
through `core` objects in between. Delete the older methods that
converted between database models and `core` objects, as they are no
longer necessary.
Update test mocks to have the correct signatures, and update tests to
not rely on `JSONWebKey` and instead use byte slices.
Fixes#5531
Update all of our tests to use `AssertMetricWithLabelsEquals`
instead of combinations of the older `CountFoo` helpers with
simple asserts. This coalesces all of our prometheus inspection
logic into a single function, allowing the deletion of four separate
helper functions.
Normally we do this with a reverse proxy in front of Boulder, but it's
nice to have an additional layer of protection in case someone deploys
Boulder without a reverse proxy.
Fixes#4730.
Errors that were being returned in the checkAlgorithm methods of both wfe and
wfe2 didn't really match up to what was actually being checked. This change
attempts to bring the errors in line with what is actually being tested.
Fixes#4452.
To make log analysis easier we choose to elevate the pseudo ACME HTTP
method "POST-as-GET" to the `web.RequestEvent.Method` after processing
a valid POST-as-GET request, replacing the "POST" method value that will
have been set by the outermost handler.
Also excises the existing bad padding metrics code, adds a special error for when we encounter badly padded keys, and adds a test for the new special error.
Fixes#4070 and fixes#3964.
This allows POST-as-GET requests to Orders, Authorizations, Challenges, Certificates and Accounts. Legacy GET support remains for Orders, Authorizations, Challenges and Certificates. Legacy "POST {}" support for Accounts remains.
Resolves https://github.com/letsencrypt/boulder/issues/3871
Removes the checks for a handful of deployed feature flags in preparation for removing the flags entirely. Also moves all of the currently deprecated flags to a separate section of the flags list so they can be more easily removed once purged from production configs.
Fixes#3880.
This commit updates the `wfe2/verify.go` implementation of
`acctIDFromURL` to include the invalid `kid` value being rejected as
part of the rejection error message. This will hopefully help
users/client developers understand the problem faster. Unit tests are
updated accordingly.
While we intended to allow legacy ACME v1 accounts created through the WFE to work with the ACME v2 implementation and the WFE2 we neglected to consider that a legacy account would have a Key ID URL that doesn't match the expected for a V2 account. This caused `wfe2/verify.go`'s `lookupJWK` to reject all POST requests authenticated by a legacy account unless the ACME client took the extra manual step of "fixing" the URL.
This PR adds a configuration parameter to the WFE2 for an allowed legacy key ID prefix. The WFE2 verification logic is updated to allow both the expected key ID prefix and the configured legacy key ID prefix. This will allow us to specify the correct legacy URL in configuration for both staging/prod to allow unmodified V1 ACME accounts to be used with ACME v2.
Resolves https://github.com/letsencrypt/boulder/issues/3674
Remove various unnecessary uses of fmt.Sprintf - in particular:
- Avoid calls like t.Error(fmt.Sprintf(...)), where t.Errorf can be used directly.
- Use strconv when converting an integer to a string, rather than using
fmt.Sprintf("%d", ...). This is simpler and can also detect type errors at
compile time.
- Instead of using x.Write([]byte(fmt.Sprintf(...))), use fmt.Fprintf(x, ...).
This commit adds a new WFE2 feature flag "EnforceV2ContentType". When
enabled, the WFE2's validPostRequest function will enforce that the
request carries a Content-Type header equal to
application/jose+json. This is required by ACME draft-10 per section
6.2 "Request Authentication".
This is behind a feature flag because it is likely to break
some number of existing ACMEv2 clients that may not be sending the
correct Content-Type.
We are defaulting to not setting the new feature flag in test/config-next
because it currently break's Certbot's acme module's revocation support
and we rely on this in our V2 integration tests.
Resolves#3529
Add a logging statement that fires when a remote VA fail causes
overall failure. Also change remoteValidationFailures into a
counter that counts the same thing, instead of a histogram. Since
the histogram had the default bucket sizes, it failed to collect
what we needed, and produced more metrics than necessary.
The ACME specification no longer describes "registrations" since this is
a fairly overloaded term. Instead the term used is "account". This
commit updates the WFE2 & tests throughout to replace occurrences of
"reg" and "registration" to use "acct" and "account".
NOTE: This change is strictly limited to the wfe2 package. E.g. the
RA/SA and many core objects still refer to registrations.
Resolves#2986
Per #3001 we should not be adding new StatsD code for metrics anymore.
This commit updates all of the WFE2 to use 1st class Prometheus stats.
Unit tests are updated accordingly.
I have broken the error stats into two counts:
1. httpErrorCount for all of the http layer client request errors (e.g.
no POST body, no content-length)
2. joseErrorCount, for all of the JOSE layer client request errors (e.g.
malformed JWS, broken signature, invalid JWK)
This commit also removes the stubbed out `TestValidKeyRollover` function
from `wfe2/verify_test.go`. This was committed accidentally and the same
functionality is covered by the `wfe2/wfe_test.go` `TestKeyRollover`
function.
RFC 7515 section 7.2.1 "General JWS JSON Serialization Syntax" describes
an optional "signatures" field that contains an array of JSON objects,
each representing a signature or MAC. ACME only uses the mandatory
"signature" field that contains the BASE64URL of a signature.
We previously checked that the parsed JWS had only one signature and
rejected accordingly but in order to be safe and ensure that nothing is
read from this "signatures" array when we intended to be using the
"signature" field this commit updates the check to explicitly reject the
"signatures" field prior to parsing with go-jose similar to how the
unprotected header is handled.
This PR reworks the original WFE2 JWS post validation code (primarily
from `verifyPOST()` in WFE1) to use the new "ACME v2" style of JWS verification.
For most endpoints this means switching to a style where the JWS does
*not* contain an embedded JWK and instead contains a Key ID that is used
to lookup the JWK to verify the JWS from the database. For some special
endpoints (e.g. new-reg) there is a self-authenticated JWS style that
uses the old method of embedding a JWK instead of using a Key ID
(because no account to reference by ID exists yet).
The JWS validation now lives in `wfe2/verify.go` to keep the main WFEv2
code cleaner. Compared to `verifyPOST` there has been substantial work
done to create smaller easier to test functions instead of one big
validation function. The existing WFE unit tests that were copied to the
WFE2 are largely left as they were (e.g. cruddy) and updated as
minimally as possible to support the new request validation. All tests
for new code were written in a cleaner subtest style. Cleaning up the
existing tests will be follow-up work (See https://github.com/letsencrypt/boulder/issues/2928).
Since the POST validation for the key-change and revocation endpoints
requires special care they were left out of the WFE2 implementation for now
and will return a "not implemented" error if called.
_Note to reviewers_: this is a large diff to `wfe2/wfe.go` and `wfe2/verify.go`
that Github will hide by default. You will need to click to view the diffs.
Resolves https://github.com/letsencrypt/boulder/issues/2858
This PR renames wfe2/jose.go to wfe2/verify.go to better reflect
its purpose.
Additionally this PR moves signatureValidationError, extractJWSKey
and verifyPOST from wfe2/wfe.go to wfe2/verify.go. This is in
preparation of refactoring for the ACME v2 POST verification logic to
help keep diffs reviewable.