A very large number of the logger calls are of the form `log.Function(fmt.Sprintf(...))`.
Rather than sprinkling `fmt.Sprintf` at every logger call site, provide formatting versions
of the logger functions and call these directly with the format and arguments.
While here, remove some unnecessary trailing newlines and calls to String/Error.
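A minimal sketch of the idea, using a hypothetical logger type rather than Boulder's actual log package:

```go
package main

import (
	"fmt"
	"log"
)

// Logger is a stand-in for the project's logging interface.
type Logger struct{}

func (l Logger) Err(msg string)     { log.Print("ERR: " + msg) }
func (l Logger) Warning(msg string) { log.Print("WARN: " + msg) }

// Errf and Warningf are the formatting variants, so call sites can pass a
// format string and arguments instead of wrapping fmt.Sprintf themselves.
func (l Logger) Errf(format string, a ...interface{})     { l.Err(fmt.Sprintf(format, a...)) }
func (l Logger) Warningf(format string, a ...interface{}) { l.Warning(fmt.Sprintf(format, a...)) }

func main() {
	logger := Logger{}
	// Before: logger.Err(fmt.Sprintf("failed to load cert %q: %s", serial, err))
	// After:
	logger.Errf("failed to load cert %q: %s", "deadbeef", "file not found")
}
```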
In #3651 we introduced a new parameter to sa.AddCertificate to allow specifying the Issued date. If nil, we defaulted to the current time to maintain our deployability guidelines.
Now that this has been deployed everywhere, this PR updates SA.AddCertificate and the gRPC wrappers such that a nil issued timestamp is rejected with an error.
Unit tests that were previously using nil for the issued time are updated to explicitly set the issued time to the fake clock's `Now()`.
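A minimal sketch of the new behaviour, with a hypothetical request type standing in for the real gRPC message:

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// addCertificateRequest is a hypothetical stand-in for the SA protobuf request.
type addCertificateRequest struct {
	DER    []byte
	RegID  int64
	Issued *int64 // nanoseconds since the epoch; nil means "not provided"
}

// addCertificate now rejects a nil issued timestamp instead of defaulting to
// the current time.
func addCertificate(req *addCertificateRequest) error {
	if req.Issued == nil {
		return errors.New("AddCertificate requires an explicit issued timestamp")
	}
	issued := time.Unix(0, *req.Issued)
	fmt.Printf("storing certificate issued at %s\n", issued)
	return nil
}

func main() {
	now := time.Now().UnixNano()
	_ = addCertificate(&addCertificateRequest{Issued: &now}) // accepted
	err := addCertificate(&addCertificateRequest{})          // rejected
	fmt.Println(err)
}
```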
Resolves #3657
Now that #3638 has been deployed to all of the RA instances there are no
more RPC clients using the SA's `CountCertificatesRange` RPC.
This commit deletes the implementation, the RPC definition & wrappers,
and all the test code/mocks.
The Boulder orphan-finder command uses the SA's AddCertificate RPC to add orphaned certificates it finds back to the DB. Prior to this commit this RPC always set the core.Certificate.Issued field to the
current time. For the orphan-finder case this meant that the Issued date would incorrectly be set to when the certificate was found, not when it was actually issued. This could cause cert-checker to alarm based on the unusual delta between the cert NotBefore and the core.Certificate.Issued value.
This PR updates the AddCertificate RPC to accept an optional issued timestamp in the request arguments. In the SA layer we address deployability concerns by setting a default value of the current time when none is explicitly provided. This matches the classic behaviour and will let an old RA communicate with a new SA.
This PR updates the orphan-finder to provide an explicit issued time to sa.AddCertificate. The explicit issued time is calculated using the found certificate's NotBefore and the configured backdate.
This lets the orphan-finder set the true issued time in the core.Certificate object, avoiding any cert-checker alarms.
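As a rough illustration (field and flag names here are placeholders), the issued time is recovered by adding the configured backdate back onto the certificate's NotBefore:

```go
package main

import (
	"fmt"
	"time"
)

// issuedTime reconstructs the true issuance time of an orphaned certificate.
// Boulder backdates NotBefore by a configured duration at issuance, so adding
// that backdate back onto NotBefore yields the original issuance time.
func issuedTime(notBefore time.Time, backdate time.Duration) time.Time {
	return notBefore.Add(backdate)
}

func main() {
	notBefore := time.Date(2018, 4, 1, 11, 0, 0, 0, time.UTC)
	backdate := time.Hour
	fmt.Println(issuedTime(notBefore, backdate)) // 2018-04-01 12:00:00 +0000 UTC
}
```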
Resolves #3624
In #3614 we adjusted the `SA.NewOrder` function to conditionally call
`ssa.statusForOrder` on the new order when `features.OrderReadyStatus`
was enabled. Unfortunately this call to `ssa.statusForOrder` happened
*before* the `req.BeganProcessing` field was initialized with a pointer
to a `false` bool. The `ssa.statusForOrder` function (correctly) assumes
that `req.BeganProcessing == nil` is illegal and doesn't correspond to
a known status. This results in NewOrder requests returning a 500 error
of the form:
> Internal error - Error creating new order - Order XXX is in an invalid
> state. No state known for this order's authorizations
Our integration tests missed this because we didn't have a test case
that issued for a set of names with one account, and then issued again
for the same set of names with the same account.
This commit fixes the original bug by moving the `BeganProcessing`
initialization before the call to `statusForOrder`. This commit also
adds an integration test to catch this sort of bug again in the
future.
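A minimal sketch of the ordering fix, with placeholder types standing in for the SA's order protobuf and the real `statusForOrder`:

```go
package main

import "fmt"

// order is a placeholder for the protobuf order used by the SA's NewOrder.
type order struct {
	BeganProcessing *bool
	Status          string
}

// statusForOrder (correctly) treats a nil BeganProcessing as an illegal state.
func statusForOrder(o *order) (string, error) {
	if o.BeganProcessing == nil {
		return "", fmt.Errorf("order is in an invalid state")
	}
	return "pending", nil
}

// newOrder shows the corrected ordering: BeganProcessing is initialized
// *before* statusForOrder is consulted.
func newOrder(req *order) error {
	began := false
	req.BeganProcessing = &began
	status, err := statusForOrder(req)
	if err != nil {
		return err
	}
	req.Status = status
	return nil
}

func main() {
	o := &order{}
	err := newOrder(o)
	fmt.Println(err, o.Status) // <nil> pending
}
```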
Prior to the SA fix this test failed with the 500 server internal error
observed by the Certbot team. With the SA fix in place the test passes
again.
* SA: Add Order "Ready" status, feature flag.
This commit adds the new "Ready" status to `core/objects.go` and updates
`sa.statusForOrder` to use it conditionally for orders with all valid
authorizations that haven't been finalized yet. This state is used
conditionally based on the `features.OrderReadyStatus` feature flag
since it will likely break some existing clients that expect status
"Processing" for this state. The SA unit test for `statusForOrder` is
updated with a "ready" status test case.
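A hedged sketch of the conditional status calculation (a simplification of the real `sa.statusForOrder`; the fallback status follows the note above about clients expecting "Processing"):

```go
package main

import "fmt"

// orderStatus sketches the conditional "ready" calculation: when every
// authorization is valid and no certificate has been issued, the order is
// "ready" only if the feature flag is set; otherwise the pre-existing
// status is returned.
func orderStatus(authzStatuses []string, certificateSerial string, orderReadyStatus bool) string {
	for _, s := range authzStatuses {
		if s != "valid" {
			return "pending" // simplified: the real calculation handles more states
		}
	}
	if certificateSerial != "" {
		return "valid"
	}
	if orderReadyStatus {
		return "ready"
	}
	return "processing"
}

func main() {
	authzs := []string{"valid", "valid"}
	fmt.Println(orderStatus(authzs, "", true))  // ready
	fmt.Println(orderStatus(authzs, "", false)) // processing
}
```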
* RA: Enforce order ready status conditionally.
This commit updates the RA to conditionally expect orders that are being
finalized to be in the "ready" status instead of "pending". This is
conditionally enforced based on the `OrderReadyStatus` feature flag.
Along the way the SA was changed to dynamically calculate the status of
the order returned by `sa.NewOrder`, now that it can be
something other than "pending".
* WFE2: Conditionally enforce order ready status for finalization.
Similar to the RA the WFE2 should conditionally enforce that an order's
status is either "ready" or "pending" based on the "OrderReadyStatus"
feature flag.
* Integration: Fix `test_order_finalize_early`.
This commit updates the V2 `test_order_finalize_early` test for the
"ready" status. A nice side-effect of the ready state change is that we
no longer invalidate an order when it is finalized too soon, because we
can reject the finalization in the WFE. As a result the
`test_order_finalize_early` test case is also smaller.
* Integration: Test classic behaviour w/o feature flag.
In the previous commit I fixed the integration test for the
`config/test-next` run that has the `OrderReadyStatus` feature flag set
but broke it for the `config/test` run without the feature flag.
This commit updates the `test_order_finalize_early` test to work
correctly based on the feature flag status in both cases.
In `getAuthorizations`, we had a single loop to both select the freshest authz and
fetch challenges corresponding to authzs. This meant that in some cases, we
would fetch challenges only to throw them away. Since each challenge fetch is a
DB round trip, this would cause extreme slowness when called for domains that
have a large number of authorizations.
This change splits that into two loops: One to select the freshest authzs, and
another to fetch challenges for those authzs.
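A simplified sketch of the two-pass structure, with placeholder types and a stand-in for the per-authz challenge query:

```go
package main

import "fmt"

type authz struct {
	ID         string
	Identifier string
	Expires    int64 // larger means fresher
	Challenges []string
}

// freshestAuthzs is the first pass: pick the freshest authz per identifier
// without touching the challenges table.
func freshestAuthzs(rows []authz) map[string]authz {
	byName := make(map[string]authz)
	for _, a := range rows {
		if cur, ok := byName[a.Identifier]; !ok || a.Expires > cur.Expires {
			byName[a.Identifier] = a
		}
	}
	return byName
}

// fetchChallenges is a stand-in for the per-authz database round trip.
func fetchChallenges(id string) []string {
	return []string{"http-01:" + id, "dns-01:" + id}
}

func main() {
	rows := []authz{
		{ID: "a", Identifier: "example.com", Expires: 1},
		{ID: "b", Identifier: "example.com", Expires: 2},
	}
	// Second pass: fetch challenges only for the authzs we actually kept.
	for name, a := range freshestAuthzs(rows) {
		a.Challenges = fetchChallenges(a.ID)
		fmt.Println(name, a.ID, a.Challenges)
	}
}
```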
Prior to this commit an order's expiry was set based on
`ra.orderLifetime`, while pending and valid authorization expiry was set
based on `ra.pendingAuthorizationLifetime` and
`ra.authorizationLifetime`. Since orders reuse existing valid/pending
authorizations this could lead to a case where an order has an expiry
beyond the associated authorization expiries. In that case, when an
authorization expired the order became inactionable and the extra order
lifetime was not useful.
This commit addresses this problem in two ways:
1. The SA `GetAuthorizations` function used to find authzs to reuse for
`ra.NewOrder` is adjusted to only return authorizations at least 24 hours
away from expiry.
2. Order expiry is now calculated by the RA in newOrder
as the minimum of the order's own expiry and the soonest
associated authorization expiry. This properly reflects the order's true
lifetime based on the authorization lifetime.
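A small sketch of point 2, assuming the RA already has the expiries of the (possibly reused) authorizations in hand:

```go
package main

import (
	"fmt"
	"time"
)

// orderExpiry caps the order's expiry at the soonest expiry among its
// (possibly reused) authorizations.
func orderExpiry(created time.Time, orderLifetime time.Duration, authzExpiries []time.Time) time.Time {
	expiry := created.Add(orderLifetime)
	for _, e := range authzExpiries {
		if e.Before(expiry) {
			expiry = e
		}
	}
	return expiry
}

func main() {
	now := time.Date(2018, 3, 1, 0, 0, 0, 0, time.UTC)
	authzExpiries := []time.Time{now.Add(48 * time.Hour)}
	fmt.Println(orderExpiry(now, 7*24*time.Hour, authzExpiries)) // capped at the authz expiry
}
```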
The RA/SA unit tests are updated accordingly.
Resolves #3498
This commit fixes `sa.GetAuthorizations` to ensure that both the valid
**and** the pending authorizations it finds for reuse by `ra.NewOrder`
are required to have rows in the order authz join table.
The `sa.GetAuthorizations` unit test is updated to ensure the
expected V2 reuse behaviour is enforced.
Resolves https://github.com/letsencrypt/boulder/issues/3565
Previously we introduced the concept of a "pending orders per account
ID" rate limit. After struggling with making an implementation of this
rate limit perform well we reevaluated the problem and decided a "new
orders per account per time window" rate limit would be a better fit for
ACMEv2 overall.
This commit introduces the new newOrdersPerAccount rate limit. The RA
now checks this before creating new pending orders in `ra.NewOrder`. It
does so after order reuse takes place, ensuring the rate limit is only
applied in cases where a distinct new pending order row would be created.
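A rough sketch of the check, with a hypothetical count function standing in for the SA-side query over recently created orders:

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// countNewOrders stands in for the SA query that counts orders created for a
// given registration ID since the start of the rate limit window.
type countNewOrders func(regID int64, earliest time.Time) (int, error)

// checkNewOrdersPerAccountLimit sketches the RA-side check: it runs only for
// genuinely new orders (after order reuse), and rejects with a rate limit
// error once the count within the window reaches the threshold.
func checkNewOrdersPerAccountLimit(count countNewOrders, regID int64, window time.Duration, threshold int, now time.Time) error {
	n, err := count(regID, now.Add(-window))
	if err != nil {
		return err
	}
	if n >= threshold {
		return errors.New("too many new orders recently")
	}
	return nil
}

func main() {
	fakeCount := func(regID int64, earliest time.Time) (int, error) { return 300, nil }
	err := checkNewOrdersPerAccountLimit(fakeCount, 1, 3*time.Hour, 300, time.Now())
	fmt.Println(err) // too many new orders recently
}
```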
To accomplish this, a migration adds a new `created` field to the orders
table and an index over `created` and `registrationID`. It would be possible to
use the existing `expires` field for this like we've done in the past, but that
was primarily to avoid running a migration on a large table in prod. Since
we don't have that problem yet for V2 tables we can Do The Right Thing
and add a column.
For deployability the deprecated pendingOrdersPerAccount code & SA
gRPC bits are left around. A follow-up PR will be needed to remove
those (#3502).
Resolves #3410
This commit updates `sa.getAllOrderAuthorizations` to not exclude
expired authorizations. The expired authorizations are used in
`sa.statusForOrder` to set the overall order status to invalid when one
or more authorizations are expired.
Fixes #3499
Removes usage of the `EnforceChallengeDisable` feature. The feature itself is not removed since it is still configured in staging/production; once that is fixed I'll submit another PR removing the actual flag.
This keeps the behavior that when authorizations are retrieved from the SA they have their challenges populated, because that seems to make the most sense to me? It also retains TLS re-validation.
Fixes #3441.
Prior to this commit, when building up the authorizations for a new-order
request we looked for any unexpired pending/valid authorizations owned
by the account and used them for the order. This allowed a client to use
the V1 new-authz endpoint in combination with the V2 new-order endpoint,
and we do not want to support this behaviour. All V2 authorizations
should be sourced from other V2 orders. This commit implements a new
parameter for the SA's getAuthorizations function that allows filtering
out legacy V1 authorizations by doing a JOIN on the order-to-authorizations
join table.
Resolves #3328
This commit resolves the case where an error occurs during finalization.
Prior to this commit, if an error (expected or otherwise) occurred after
setting an order to status "processing" at the start of order
finalization, the order would be stuck processing forever.
The SA now has a `SetOrderError` RPC that can be used by the RA to
persist an error onto an order. The order status calculation can use
this error to decide if the order is invalid. The WFE is updated to
write the error to the order JSON when displaying the order information.
Prior to this commit the order protobuf had the error field as
a `[]byte`. That doesn't seem like the right decision: we have
a specific protobuf type for ProblemDetails, so this commit switches
the error field to use it. The conversion to/from `[]byte` is done in
the model by the SA.
An integration test is included that prior to this commit left an order
in a stuck processing state. With this commit the integration test
passes as expected.
Resolves https://github.com/letsencrypt/boulder/issues/3403
The SA RPC previously named `GetOrderAuthorizations` returns only
**valid, unexpired** authorizations. This commit updates the name to
emphasize that it only returns valid order authzs.
This PR is a rework of what was originally https://github.com/letsencrypt/boulder/pull/3382, integrating the design feedback proposed by @jsha: https://github.com/letsencrypt/boulder/pull/3382#issuecomment-359912549
This PR removes the stored Order status field and replaces it with a value that is calculated on-the-fly by the SA when fetching an order, based on the order's associated authorizations.
In summary (and order of precedence):
* If any of the order's authorizations are invalid, the order is invalid.
* If any of the order's authorizations are deactivated, the order is deactivated.
* If any of the order's authorizations are pending, the order is pending.
* If all of the order's authorizations are valid, and there is a certificate serial, the order is valid.
* If all of the order's authorizations are valid, and we have begun processing, but there is no certificate serial, the order is processing.
* If all of the order's authorizations are valid, and we haven't begun processing, then the order is pending, awaiting a finalization request.
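A compact sketch of that precedence, using plain status strings in place of the real authorization and order models:

```go
package main

import "fmt"

// statusForOrder sketches the precedence described above; it is a
// simplification of the real sa.statusForOrder (expiry handling is omitted).
func statusForOrder(authzStatuses []string, beganProcessing bool, certificateSerial string) string {
	counts := map[string]int{}
	for _, s := range authzStatuses {
		counts[s]++
	}
	switch {
	case counts["invalid"] > 0:
		return "invalid"
	case counts["deactivated"] > 0:
		return "deactivated"
	case counts["pending"] > 0:
		return "pending"
	case counts["valid"] == len(authzStatuses) && certificateSerial != "":
		return "valid"
	case counts["valid"] == len(authzStatuses) && beganProcessing:
		return "processing"
	default:
		return "pending" // all valid, processing not begun: awaiting finalization
	}
}

func main() {
	fmt.Println(statusForOrder([]string{"valid", "pending"}, false, "")) // pending
	fmt.Println(statusForOrder([]string{"valid", "valid"}, true, ""))    // processing
}
```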
This avoids having to explicitly update the order status when an associated authorization changes status.
The RA's implementation of new-order is updated to only reuse an existing order if the calculated status is pending. This avoids giving back invalid or deactivated orders to clients.
Resolves #3333
This change adds a feature flag, TLSSNIRevalidation. When it is enabled, Boulder
will create new authorization objects with TLS-SNI challenges if the requesting
account has issued a certificate with the relevant domain name, and was the most
recent account to do so*. This setting overrides the configured list of
challenges in the PolicyAuthority, so even if TLS-SNI is disabled in general, it
will be enabled for revalidation.
Note that this interacts with EnforceChallengeDisable. Because
EnforceChallengeDisable causes additional checks at validation time and at
issuance time, we need to update those two places as well. We'll send a
follow-up PR with that.
*We chose to make this work only for the most recent account to issue, even if
there were overlapping certificates, because it significantly simplifies the
database access patterns and should work for 95+% of cases.
Note that this change will let an account revalidate and reissue for a domain
even if the previous issuance on that account used http-01 or dns-01. This also
simplifies implementation, and fits within the intent of the mitigation plan: If
someone previously issued for a domain using http-01, we have high confidence
that they are actually the owner, and they are not going to "steal" the domain
from themselves using tls-sni-01.
Also note: this change doesn't work properly with `ReusePendingAuthz: true`.
Specifically, if you attempted issuance in the last couple days and failed
because there was no tls-sni challenge, you'll still have an http-01 challenge
lying around, and we'll reuse that; then your client will fail due to lack of
tls-sni challenge again.
This change was joint work between @rolandshoemaker and @jsha.
This commit adds pending order reuse. Subsequent to this commit, multiple
new-order requests from the same account ID for the same set of order
names will result in only one order being created. Orders are only
reused while they are not expired. Finalized orders will not be reused
for subsequent new-order requests, allowing for duplicate order issuance.
Note that this is a second level of reuse, building on the pending
authorization reuse that's done between separate orders already.
To efficiently find an appropriate order ID given a set of names,
a registration ID, and the current time, a new `orderFqdnSets` table is
added with appropriate indexes and foreign keys.
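For illustration, a rough in-memory model of the lookup; the hash construction and row fields here are assumptions, not the actual schema:

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sort"
	"strings"
	"time"
)

// hashNames produces a stable digest for a set of names, mirroring the idea of
// the fqdnSets hash used to key the orderFqdnSets table (details simplified).
func hashNames(names []string) [32]byte {
	lowered := make([]string, len(names))
	for i, n := range names {
		lowered[i] = strings.ToLower(n)
	}
	sort.Strings(lowered)
	return sha256.Sum256([]byte(strings.Join(lowered, ",")))
}

// orderRow is a stand-in for a row in the reuse lookup.
type orderRow struct {
	ID      int64
	RegID   int64
	Expires time.Time
}

// reusableOrderID models the lookup: same account, same name set, not expired.
func reusableOrderID(index map[[32]byte][]orderRow, names []string, regID int64, now time.Time) (int64, bool) {
	for _, row := range index[hashNames(names)] {
		if row.RegID == regID && row.Expires.After(now) {
			return row.ID, true
		}
	}
	return 0, false
}

func main() {
	now := time.Now()
	idx := map[[32]byte][]orderRow{
		hashNames([]string{"example.com", "www.example.com"}): {{ID: 42, RegID: 1, Expires: now.Add(time.Hour)}},
	}
	fmt.Println(reusableOrderID(idx, []string{"WWW.example.com", "example.com"}, 1, now)) // 42 true
}
```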
Resolves #3258
If two requests simultaneously update a challenge for the same
authorization there is a chance that `UpdatePendingAuthorization` will
encounter a Gorp optimistic lock error having read one LockCol value
from a `Select` on the `pendingAuthorizations` table only for it to have
changed by the time an `Update` on the same row is performed.
After closer examination this `Update` is unnecessary! Only
`RA.UpdateAuthorization` calls `SA.UpdatePendingAuthorization` and it
does so only to record updated challenge information by way of
`UpdatePendingAuthorization`'s call to `updateChallenges`. Since no data
in the `pendingAuthorizations` row is being changed, we don't need to do
this `Update` at all, avoiding both a potential race condition and some
database load.
This commit removes the `Update` entirely. Several SA unit tests had to
be updated because they were (ab)using `UpdatePendingAuthorization` to
mutate pendingAuthz rows.
This commit adds a new rate limit to restrict the number of outstanding
pending orders per account. If the threshold for this rate limit is
crossed, subsequent new-order requests will return a 429 response.
Note: since the rate limit object itself defines an `Enabled()`
test based on whether or not it has been configured, there is **not**
a feature flag for this change.
Resolves https://github.com/letsencrypt/boulder/issues/3246
This PR implements issuance for wildcard names in the V2 order flow. By policy, pending authorizations for wildcard names only receive a DNS-01 challenge for the base domain. We do not re-use authorizations for the base domain that do not come from a previous wildcard issuance (e.g. a normal authorization for example.com turned valid by way of a DNS-01 challenge will not be reused for a *.example.com order).
The wildcard prefix is stripped off of the authorization identifier value in two places:
* When presenting the authorization to the user - ACME forbids having a wildcard character in an authorization identifier.
* When performing validation - we validate the base domain name without the `*.` prefix.
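For illustration, the stripping itself is trivial (the helper name here is hypothetical):

```go
package main

import (
	"fmt"
	"strings"
)

// baseDomain strips the wildcard prefix from an order name so the resulting
// identifier (and the domain handed to validation) contains no "*".
func baseDomain(name string) string {
	return strings.TrimPrefix(name, "*.")
}

func main() {
	fmt.Println(baseDomain("*.example.com")) // example.com
	fmt.Println(baseDomain("example.com"))   // example.com
}
```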
This PR is largely a rewrite/extension of #3231. Instead of using a pseudo-challenge-type (DNS-01-Wildcard) to indicate an authorization & identifier correspond to the base name of a wildcard order name we instead allow the identifier to take the wildcard order name with the *. prefix.
As described in #3201, a concurrent challenge POST would result in 500 errors if the pending authz row was deleted by a promotion to the authz table underneath another request.
This PR adjusts the SA & RA so that if a pending authz is promoted to a final authz between updates, a not found error will be returned instead of a server internal error.
Resolves https://github.com/letsencrypt/boulder/issues/3201
When counting certificates for rate limiting, we attempted to impose a limit on
the query results to ensure we did not receive so many results that they caused
slowness on the database or SA side. However, that check has never actually been
executed correctly. The check was fixed in #3126, but rolling out that fix
broke issuance for subscribers with rate limit overrides that have allowed them
to exceed the limit.
Because this limit has not been needed in practice over the years, remove it
rather than refining it. The size of the results is loosely governed by our
rate limits (and overrides), and if result sizes from this query become a
performance issue in the future, we can address it then. For now, opt for
simplification.
Fixes #3214.
In #1864 we discussed possible
optimizations to how expiration-mailer and ocsp-updater query the
certificateStatus table. In #2177 we
added the notAfter and isExpired fields for more efficient querying.
However, we forgot to add indexes on these fields. This change adds new indexes
and drops the old indexes, and should result in much more efficient querying in
those two components.
Also, remove a comment that goose couldn't understand.
Running EXPLAINs to show the difference:
For expiration-mailer, before:
MariaDB [boulder_sa_integration]> EXPLAIN SELECT cs.serial FROM certificateStatus AS cs WHERE cs.notAfter > DATE_ADD(NOW(), INTERVAL 21 DAY) AND cs.notAfter < DATE_ADD(NOW(), INTERVAL 10 DAY) AND cs.status != "revoked" AND COALESCE(TIMESTAMPDIFF(SECOND, cs.lastExpirationNagSent, cs.notAfter) > 10 * 86400, 1) ORDER BY cs.notAfter ASC LIMIT 100000;
+------+-------------+-------+------+------------------------------+------+---------+------+------+-----------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+-------+------+------------------------------+------+---------+------+------+-----------------------------+
| 1 | SIMPLE | cs | ALL | status_certificateStatus_idx | NULL | NULL | NULL | 486 | Using where; Using filesort |
+------+-------------+-------+------+------------------------------+------+---------+------+------+-----------------------------+
1 row in set (0.00 sec)
For expiration-mailer, after:
MariaDB [boulder_sa_integration]> EXPLAIN SELECT cs.serial FROM certificateStatus AS cs WHERE cs.notAfter < DATE_ADD(NOW(), INTERVAL 21 DAY) AND cs.notAfter < DATE_ADD(NOW(), INTERVAL 10 DAY) AND cs.status != "revoked" AND COALESCE(TIMESTAMPDIFF(SECOND, cs.lastExpirationNagSent, cs.notAfter) > 10 * 86400, 1) ORDER BY cs.notAfter ASC LIMIT 100000;
+------+-------------+-------+-------+---------------+--------------+---------+------+------+------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+-------+-------+---------------+--------------+---------+------+------+------------------------------------+
| 1 | SIMPLE | cs | range | notAfter_idx | notAfter_idx | 6 | NULL | 1 | Using index condition; Using where |
+------+-------------+-------+-------+---------------+--------------+---------+------+------+------------------------------------+
For ocsp-updater, before:
MariaDB [boulder_sa_integration]> EXPLAIN SELECT cs.serial, cs.status, cs.revokedDate, cs.notAfter FROM certificateStatus AS cs WHERE cs.ocspLastUpdated > DATE_SUB(NOW(), INTERVAL 10 DAY) AND cs.ocspLastUpdated < DATE_SUB(NOW(), INTERVAL 3 DAY) AND NOT cs.isExpired ORDER BY cs.ocspLastUpdated ASC LIMIT 100000;
+------+-------------+-------+-------+---------------------------------------+---------------------------------------+---------+------+------+------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+-------+-------+---------------------------------------+---------------------------------------+---------+------+------+------------------------------------+
| 1 | SIMPLE | cs | range | ocspLastUpdated_certificateStatus_idx | ocspLastUpdated_certificateStatus_idx | 5 | NULL | 1 | Using index condition; Using where |
+------+-------------+-------+-------+---------------------------------------+---------------------------------------+---------+------+------+------------------------------------+
1 row in set (0.00 sec)
For ocsp-updater, after:
MariaDB [boulder_sa_integration]> EXPLAIN SELECT cs.serial, cs.status, cs.revokedDate, cs.notAfter FROM certificateStatus AS cs WHERE cs.ocspLastUpdated > DATE_SUB(NOW(), INTERVAL 10 DAY) AND cs.ocspLastUpdated < DATE_SUB(NOW(), INTERVAL 3 DAY) AND NOT cs.isExpired ORDER BY cs.ocspLastUpdated ASC LIMIT 100000;
+------+-------------+-------+-------+-------------------------------+-------------------------------+---------+------+------+-----------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+-------+-------+-------------------------------+-------------------------------+---------+------+------+-----------------------+
| 1 | SIMPLE | cs | range | isExpired_ocspLastUpdated_idx | isExpired_ocspLastUpdated_idx | 7 | NULL | 1 | Using index condition |
+------+-------------+-------+-------+-------------------------------+-------------------------------+---------+------+------+-----------------------+
1 row in set (0.00 sec)
This PR implements order finalization for the ACME v2 API.
In broad strokes this means:
* Removing the CSR from order objects & the new-order flow
* Adding identifiers to the order object & new-order
* Providing a finalization URL as part of orders returned by new-order
* Adding support to the WFE's Order endpoint to receive finalization POST requests with a CSR
* Updating the RA to accept finalization requests and to ensure orders are fully validated before issuance can proceed
* Updating the SA to allow finding order authorizations & updating orders.
* Updating the CA to accept an Order ID to log when issuing a certificate corresponding to an order object
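A rough sketch of the RA-side "fully validated" guard mentioned above, with placeholder types rather than Boulder's actual models:

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

type authorization struct {
	Status  string
	Expires time.Time
}

// checkOrderReadyToFinalize sketches the guard: issuance for an order may only
// proceed once every associated authorization is valid and unexpired.
func checkOrderReadyToFinalize(authzs []authorization, now time.Time) error {
	for _, a := range authzs {
		if a.Status != "valid" || !a.Expires.After(now) {
			return errors.New("order is not fully validated, refusing to issue")
		}
	}
	return nil
}

func main() {
	now := time.Now()
	ok := []authorization{{Status: "valid", Expires: now.Add(time.Hour)}}
	bad := []authorization{{Status: "pending", Expires: now.Add(time.Hour)}}
	fmt.Println(checkOrderReadyToFinalize(ok, now), checkOrderReadyToFinalize(bad, now))
}
```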
Resolves #3123
This is only the migration, so far. Rather than doing the feature-switch dance,
we can wait for this migration to be applied, and then commit the code to start
setting it, with a feature switch to start checking it, which can be turned on
once we've been setting the bit in production for a week.
Having this as an indexed bit on issuedNames allows us to cheaply exclude
renewals from our rate limit queries, so we can avoid the ordering dependency
for renewals vs new issuances on the same domain.
Fixes #3161
Since we can make up to 100 SQL queries from this method (based on the 100-SAN
limit), sometimes it is too slow and we get a timeout for large certificates. By
running some of those queries in parallel, we can speed things up and stop
getting timeouts.
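A minimal sketch of the parallelization pattern, using a stand-in query function and the standard library's sync primitives rather than the actual SA code:

```go
package main

import (
	"fmt"
	"sync"
)

// countForName is a stand-in for the single-name database query.
func countForName(name string) int {
	return len(name) // placeholder work
}

// parallelCounts runs one query per name concurrently (bounded by the number
// of names, itself capped by the 100-SAN limit) and sums the results.
func parallelCounts(names []string) int {
	var (
		wg    sync.WaitGroup
		mu    sync.Mutex
		total int
	)
	for _, name := range names {
		wg.Add(1)
		go func(n string) {
			defer wg.Done()
			c := countForName(n)
			mu.Lock()
			total += c
			mu.Unlock()
		}(name)
	}
	wg.Wait()
	return total
}

func main() {
	fmt.Println(parallelCounts([]string{"example.com", "example.org"}))
}
```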
For the new-order endpoint only. This does some refactoring of the order of operations in `ra.NewAuthorization` as well, in order to reduce the duplication of code relating to creating pending authorizations; existing tests still seem to work as intended... A close eye should be given to this since we don't have integration tests yet that test it end to end. This also changes the inner type of `grpc.StorageAuthorityServerWrapper` to `core.StorageAuthority` so that we can avoid a circular import that is created by needing to import `grpc.AuthzToPB` and `grpc.PBToAuthz` in `sa/sa.go`.
This is a big change but should considerably improve the performance of the new-order flow.
Fixes #2955.
In 2fb247488f we consolidated the
`regModelV2` and `regModelv1` structs to one `regModel` type. In the
process we accidentally lost the explicit assignment of the
to-be-updated registration model's `LockCol` with the value of the
existing registration's `LockCol`. This meant that the Update was
occurring with a where clause `LockCol=0` (the default value).
In practice this meant that the first reg update would succeed (since
the reg row starts with LockCol=0), but any reg that had already been
updated once before would modify 0 rows in the update (because the where
clause on `LockCol` failed), and this in turn was translated into
a ServerInternal error since we knew the reg being updated did exist.
This commit updates the SA's `UpdateRegistration` function to properly
set the `LockCol` on the to-be-updated row.
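To illustrate the failure mode, a self-contained simulation of gorp-style optimistic locking (the model and helper here are hypothetical, not the real SA code):

```go
package main

import (
	"errors"
	"fmt"
)

// regModel mimics a gorp model with an optimistic-lock column.
type regModel struct {
	ID      int64
	Contact string
	LockCol int64
}

// update mimics gorp's optimistic locking: the UPDATE only matches when the
// incoming model carries the row's current LockCol, and then increments it.
func update(table map[int64]*regModel, m regModel) error {
	row, ok := table[m.ID]
	if !ok || row.LockCol != m.LockCol {
		return errors.New("0 rows updated (optimistic lock mismatch)")
	}
	m.LockCol++
	table[m.ID] = &m
	return nil
}

func main() {
	table := map[int64]*regModel{1: {ID: 1, Contact: "a@example.com", LockCol: 3}}

	// Bug: LockCol left at the zero value, so any row updated before fails.
	fmt.Println(update(table, regModel{ID: 1, Contact: "b@example.com"}))

	// Fix: copy the existing row's LockCol onto the to-be-updated model.
	fixed := regModel{ID: 1, Contact: "b@example.com", LockCol: table[1].LockCol}
	fmt.Println(update(table, fixed))
}
```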
This commit additionally adds an integration test for registration
contact information updating to ensure we don't fall into this trap in
the future.
This commit adds a new boulder error type WrongAuthorizationState.
This error type is returned by the SA when UpdateAuthorization is
provided an authz that isn't pending. The RA and WFE are updated
accordingly such that this error percolates back through the API to the
user as a Malformed problem with a sensible description. Previously this
behaviour resulted in a ServerInternal error.
Resolves #3032