boulder

Commit Graph

Author	SHA1	Message	Date
Daniel McCarney	b4c13ce6c7	RA: fix valid authz reuse control for V2 newOrder. (#4027 ) Before fixing `ra.NewOrder` the `TestNewOrderAuthzReuseDisabled` test from this branch fails as expected based on #4026: ``` === RUN TestNewOrderAuthzReuseDisabled --- FAIL: TestNewOrderAuthzReuseDisabled (0.24s) ra_test.go:2477: "reused-valid-authz" == "reused-valid-authz" FAIL FAIL github.com/letsencrypt/boulder/ra 0.270s ``` Afterwards it passes: ``` === RUN TestNewOrderAuthzReuseDisabled --- PASS: TestNewOrderAuthzReuseDisabled (0.26s) PASS ok github.com/letsencrypt/boulder/ra 1.291s ```	2019-01-23 13:00:44 -05:00
Daniel McCarney	b0f407dcf0	RA: Remove deprecated UpdateAuthorization RPC. (#3993 ) Staging and prod both deployed the PerformValidationRPC feature flag. All running WFE/WFE2 instances are using the more accurately named PerformValidation RPC and we can strip out the old UpdateAuthorization bits. The feature flag for PerformValidationRPC remains until we clean up the staging/prod configs. Resolves #3947 and completes the last of #3930	2019-01-07 16:35:27 -08:00
Daniel McCarney	8f5de538c1	RA: Add PerformValidation RPC to replace UpdateAuthorization. (#3942 ) The existing RA `UpdateAuthorization` RPC needs replacing for two reasons: 1. The name isn't accurate - `PerformValidation` better captures the purpose of the RPC. 2. The `core.Challenge` argument is superfluous since Key Authorizations are not sent in the initiation POST from the client anymore. The corresponding unmarshal and verification is now removed. Notably this means broken clients that were POSTing the wrong thing and failing pre-validation will now likely fail post-validation. To remove `UpdateAuthorization` the new `PerformValidation` RPC is added alongside the old one. WFE and WFE2 are updated to use the new RPC when the perform validation feature flag is enabled. We can remove `UpdateAuthorization` and its associated wrappers once all WFE instances have been updated. Resolves https://github.com/letsencrypt/boulder/issues/3930	2018-11-28 10:12:47 -05:00
Roland Bracewell Shoemaker	465be64f3f	Remove unnecessary usage of UpdatePendingAuthorization in RA (#3927 ) Removes superfluous usage of `UpdatePendingAuthorization` in the RA to update the key authorization and test if the authorization is pending and instead uses the result of the initial `GetAuthorization` call in the WFE. Fixes #3923.	2018-11-12 17:23:56 -08:00
Jacob Hoffman-Andrews	b31894d0e7	Remove crufty old TODO from UpdateAuthorization. (#3910 ) At one point, the POST to a challenge object was in theory updating a resource, and might have a "Type" field. We considered adding a check that the "Type" field matched the stored "Type" field. In ACMEv2, the POST body is always simply `{}` so this check is irrelevant. And we're unlikely to change the behavior in ACMEv1, so let's get rid of the counter and TODO.	2018-10-29 09:11:26 -04:00
Daniel McCarney	3319246a97	Dev/CI: Add Go 1.11.1 builds (#3888 ) Resolves https://github.com/letsencrypt/boulder/issues/3872 Note to reviewers: There's an outstanding bug that I've tracked down to the `--load` stage of the integration tests that results in one of the remote VA instances in the `test/config-next` configuration under Go 1.11.1 to fail to cleanly shut down. I'm working on finding the root cause but in the meantime I've disabled `--load` during CI so we can unblock moving forward with getting Go 1.11.1 in dev/CI. Tracking this in https://github.com/letsencrypt/boulder/issues/3889	2018-10-19 09:38:20 -07:00
Roland Bracewell Shoemaker	a9a0846ee9	Remove checks for deployed features (#3881 ) Removes the checks for a handful of deployed feature flags in preparation for removing the flags entirely. Also moves all of the currently deprecated flags to a separate section of the flags list so they can be more easily removed once purged from production configs. Fixes #3880.	2018-10-17 20:29:18 -07:00
Roland Bracewell Shoemaker	196f019851	Add support for temporal CT logs (#3853 ) Required a little bit of rework of the RA issuance flow (to add parsing of the precert to determine the expiration date, and moving final cert parsing before final cert submission) and RA tests, but I think it shouldn't create any issues... Fixes #3197.	2018-09-14 16:14:42 -07:00
Daniel McCarney	d39babdcf3	RA: Remove vestigial DNS config/setup. (#3854 ) In `db01b0b` we removed email validation from the RA. This was the only use of the `bdns` package by the RA and so we can go one step further and delete the remaining setup, configuration and `bdns` fields.	2018-09-13 13:39:23 -04:00
Daniel McCarney	db01b0b5bc	RA: Remove email DNS validations. (#3851 ) Performing DNS lookups to check the A/AAAA/MX records of a provided contact e-mail address adds variability to the RA's NewRegistration/UpdateRegistration functions and requires that the RA be able to reach out to the EFN. Since this is simply a convenience to prevent some classes of registration errors we can remove it to improve performance and to tighten up our security posture slightly. Resolves https://github.com/letsencrypt/boulder/issues/3849	2018-09-11 18:42:34 -07:00
Roland Bracewell Shoemaker	e27f370fd3	Excise code relating to pre-SCT embedding issuance flow (#3769 ) Things removed: * features.EmbedSCTs (and all the associated RA/CA/ocsp-updater code etc) * ca.enablePrecertificateFlow (and all the associated RA/CA code) * sa.AddSCTReceipt and sa.GetSCTReceipt RPCs * publisher.SubmitToCT and publisher.SubmitToSingleCT RPCs Fixes #3755.	2018-06-28 08:33:05 -04:00
Joel Sing	9c2859c87b	Add support for CAA account-uri validation. (#3736 ) This adds support for the account-uri CAA parameter as specified by section 3 of https://tools.ietf.org/html/draft-ietf-acme-caa-04, allowing issuance to be restricted to one or more ACME accounts as specified by CAA records.	2018-06-08 12:08:03 -07:00
Daniel McCarney	8583e42964	RA: Forbid contact addresses for IANA example domains. (#3748 ) We see a fair number of ACME accounts/registrations with contact addresses for the RFC2606 Section 3 "Reserved Example Second Level Domain Names" (`example.com`, `example.net`, `example.org`). These are not real contact addresses and are likely the result of the user copy-pasting example configuration. These users will miss out on expiration emails and other subscriber communications :-( This commit updates the RA's `validateEmail` function to reject any contact addresses for reserved example domain names. The corresponding unit test is updated accordingly. Resolves https://github.com/letsencrypt/boulder/issues/3719	2018-06-08 13:42:51 -04:00
Joel Sing	2540d59296	Implement CAA validation-methods checking. (#3716 ) When performing CAA checking respect the validation-methods parameter (if present) and restrict the allowed authorization methods to those specified. This allows a domain to restrict authorization methods that can be used with Let's Encrypt. This is largely based on PR #3003 (by @lukaslihotzki), which was landed and then later reverted due to issue #3143. The bug that resulted in the previous code being reverted has been addressed (likely inadvertently) by `76973d0f`. This implementation also includes integration tests for CAA validation-methods. Fixes issue #3143.	2018-05-23 14:32:31 -07:00
Joel Sing	5bc7fe639d	Distinguish between recheckCAA failures. (#3710 ) When rechecking CAA, the existing code maps all failures to a CAAError. This means that any other non-CAA failure (for example, an internal server error) gets hidden. Avoid this by reworking recheckCAA to return errors and if we find a non-CAAError, we return that directly. Revise tests to cover both situations. Updates issue #3143.	2018-05-15 17:57:35 -07:00
metaclassing	f6c49adc30	Make datetime consistent for authz expiration (#3706 ) So I took a quick stab at one possible solution for the authz expiration variance discussed over at community.letsencrypt.org/t/inconsistent-datetime-format-in-some-responses/61452 Golang's nanosecond precision results in newly created pending authz having one expires datetime, but subsequent requests have a different expires datetime due to database storage throwing away fractional second information. "expires": "2018-04-14T04:43:00.105818284Z", ... "expires": "2018-04-14T04:43:00Z", I am not a Go expert so there might be some more widely accepted approach to accomplishing the same thing, please let me know if you would prefer a different solution.	2018-05-11 11:57:52 -07:00
Joel Sing	f03c2517c6	Simplify the recheckCAA function. (#3701 ) Using both a sync.WaitGroup and channel is somewhat unnecessary - instead synchronize directly on the channel. Additionally, strings in Go are immutable - as such using string concatenation in a loop results in reallocation. Use a string slice and strings.Join, which not only avoids this, but also cuts down on the lines of code needed.	2018-05-11 11:56:27 -07:00
Joel Sing	8ebdfc60b6	Provide formatting logger functions. (#3699 ) A very large number of the logger calls are of the form log.Function(fmt.Sprintf(...)). Rather than sprinkling fmt.Sprintf at every logger call site, provide formatting versions of the logger functions and call these directly with the format and arguments. While here remove some unnecessary trailing newlines and calls to String/Error.	2018-05-10 11:06:29 -07:00
Roland Bracewell Shoemaker	1271a15be7	Submit final certs to CT logs (#3640 ) Submits final certificates to any configured CT logs. This doesn't introduce a feature flag as it is config gated: any log we want to submit final certificates to needs to have its log description updated to include the `"submitFinalCerts": true` field. Fixes #3605.	2018-04-13 12:02:01 -04:00
Jacob Hoffman-Andrews	6ca5c51e8e	Fix newOrder ready status regression, restore ready status. (#3644 ) In #3614 we adjusted the `SA.NewOrder` function to conditionally call `ssa.statusForOrder` on the new order when `features.OrderReadyStatus` was enabled. Unfortunately this call to `ssa.statusForOrder` happened before the `req.BeganProcessing` field was initialized with a pointer to a `false` bool. The `ssa.statusForOrder` function (correctly) assumes that `req.BeganProcessing == nil` is illegal and doesn't correspond to a known status. This results in `NewOrder` requests returning a 500 error of the form: > Internal error - Error creating new order - Order XXX is in an invalid state. No state known for this order's authorizations Our integration tests missed this because we didn't have a test case that issued for a set of names with one account, and then issued again for the same set of names with the same account. This PR fixes the original bug by moving the `BeganProcessing` initialization before the call to `statusForOrder`. This PR also adds an integration test to catch this sort of bug again in the future. Prior to the SA fix this test failed with the 500 server internal error observed by the Certbot team. With the SA fix in place the test passes again. Finally, this PR disables the `OrderReadyStatus` feature flag in `test/config-next/sa.json`. Certbot's ACME implementation breaks when this flag is enabled (See https://github.com/certbot/certbot/issues/5856). Since Certbot runs integration tests against Boulder with config-next we should be courteous and leave this flag disabled until we are closer to being able to turn it on for staging/prod.	2018-04-12 13:55:46 -07:00
Daniel McCarney	74d5decc67	Remove `TotalCertificates` rate limit. (#3638 ) The `TotalCertificates` rate limit serves to ensure we don't accidentally exceed our OCSP signing capacity by issuing too many certificates within a fixed period. In practice this rate limit has been fragile and the associated queries have been linked to performance problems. Since we now have better means of monitoring our OCSP signing capacity this commit removes the rate limit and associated code.	2018-04-12 13:25:47 -07:00
Daniel	ac6672bc71	Revert "Revert "V2: implement "ready" status for Order objects (#3614 )" (#3643 )" This reverts commit `3ecf841a3a`.	2018-04-12 13:20:47 -04:00
Jacob Hoffman-Andrews	3ecf841a3a	Revert "V2: implement "ready" status for Order objects (#3614 )" (#3643 ) This reverts commit `1d22f47fa2`. According to https://github.com/letsencrypt/boulder/pull/3614#issuecomment-380615172, this broke Certbot's tests. We'll investigate, and then roll forward once we understand what broke.	2018-04-12 10:46:57 -04:00
Daniel McCarney	1d22f47fa2	V2: implement "ready" status for Order objects (#3614 ) * SA: Add Order "Ready" status, feature flag. This commit adds the new "Ready" status to `core/objects.go` and updates `sa.statusForOrder` to use it conditionally for orders with all valid authorizations that haven't been finalized yet. This state is used conditionally based on the `features.OrderReadyStatus` feature flag since it will likely break some existing clients that expect status "Processing" for this state. The SA unit test for `statusForOrder` is updated with a "ready" status test case. * RA: Enforce order ready status conditionally. This commit updates the RA to conditionally expect orders that are being finalized to be in the "ready" status instead of "pending". This is conditionally enforced based on the `OrderReadyStatus` feature flag. Along the way the SA was changed to calculate the order status for the order returned in `sa.NewOrder` dynamically now that it could be something other than "pending". * WFE2: Conditionally enforce order ready status for finalization. Similar to the RA the WFE2 should conditionally enforce that an order's status is either "ready" or "pending" based on the "OrderReadyStatus" feature flag. * Integration: Fix `test_order_finalize_early`. This commit updates the V2 `test_order_finalize_early` test for the "ready" status. A nice side-effect of the ready state change is that we no longer invalidate an order when it is finalized too soon because we can reject the finalization in the WFE. Subsequently the `test_order_finalize_early` testcase is also smaller. * Integration: Test classic behaviour w/o feature flag. In the previous commit I fixed the integration test for the `config/test-next` run that has the `OrderReadyStatus` feature flag set but broke it for the `config/test` run without the feature flag. This commit updates the `test_order_finalize_early` test to work correctly based on the feature flag status in both cases.	2018-04-11 10:31:25 -07:00
Daniel McCarney	fa5c917665	RA: Don't lose CA error types when prefixing err msg. (#3633 ) Previously we updated the RA's issueCertificateInner function to prefix errors returned from the CA with meaningful information about which CA RPC caused the failure. Unfortunately by using fmt.Errorf to do this we're discarding the underlying error type. This can cause unexpected server internal errors downstream if (for e.g.) the CA rejects a CSR with a malformed error (see #3632). This PR updates the issueCertificateInner error message prefixing to maintain the error type if the underlying error is a berrors.BoulderError. A RA unit test with several mock CAs is added to test the prefixing occurs as expected without loss of error type. This PR also adds an integration test that ensures we reject a CSR with >100 names with a malformed error. This is not strictly related to this PR but since I wrote it while debugging the root issue I thought I'd include it. To allow this test to pass the pendingAuthorizationsPerAccount in test/rate-limit-policies.yml and associated tests had to be adjusted. Resolves #3632	2018-04-10 11:28:03 -07:00
Daniel McCarney	8a03a6848e	RA: Log Authz ID and solved-by Chal type at Issuance (#3601 ) This PR updates the RA such that certificateRequestEvent objects created during issuance and written to the audit log as JSON also include a new Authorizations field. This field is a map of the form map[string]certificateRequestAuthz and can be used to map from an identifier name appearing in the associated certificate to a certificateRequestAuthz object. Each of the certificateRequestAuthz objects holds an authorization ID and the type of challenge that made the authorization valid. Example Audit log output (with the JSON pulled out and pretty-printed): { "ID":"0BjPk94KlxExRRIQ3kslRVSJ68KMaTh416chRvq0wyA", "Requester":666, "SerialNumber":"ff699d91cab5bc84f1bc97fc71e4e27abc1a", "VerifiedFields":["subject.commonName","subjectAltName"], "CommonName":"rand.44634cbf.xyz", "Names":["rand.44634cbf.xyz"], "NotBefore":"2018-03-28T19:50:07Z", "NotAfter":"2018-06-26T19:50:07Z", "RequestTime":"2018-03-28T20:50:07.234038859Z", "ResponseTime":"2018-03-28T20:50:07.278848954Z", "Authorizations": { "rand.44634cbf.xyz" : { "ID":"jGt37Rnvfr0nhZn-wLkxrQxc2HBfV4t6TSraRiGnNBM", "ChallengeType":"http-01" } } } Resolves #3253	2018-04-04 20:59:38 +01:00
Roland Bracewell Shoemaker	8167abd5e3	Use internet facing appropriate histogram buckets for DNS latencies (#3616 ) Also instead of repeating the same bucket definitions everywhere just use a single top level var in the metrics package in order to discourage copy/pasting. Fixes #3607.	2018-04-04 08:01:54 -04:00
Jacob Hoffman-Andrews	c7e5fc1d41	Add better logging to errors in issueCertificateInner. (#3575 ) Also, remove some of the assignments to logEvent.Error, since these are just overridden with the returned error in `issueCertificate`.	2018-03-26 13:29:47 -04:00
Daniel McCarney	866627ee29	Return descriptive error when SCTs policy can't be met. (#3586 ) This commit updates CTPolicy & the RA to return a distinct error when the RA is unable to fetch the required SCTs for a certificate when processing an issuance. This error type is plumbed up to the WFE/WFE2 where the `web/probs.go` code converts it into a server internal error with a suitable user facing error.	2018-03-22 13:10:08 -07:00
Daniel McCarney	21a17f0baf	Harmonize order expiry with assoc. authz expiry. (#3567 ) Prior to this commit an order's expiry was set based on ra.orderLiftime while pending and valid authorization expiry was set based on ra.pendingAuthorizationLifetime and ra.authorizationLifetime. Since orders reused existing valid/pending authorizations this can lead to a case where an order has an expiry beyond the associated authorization expiries. In this case when an authorization expired the order becomes inactionable and the extra order lifetime is not useful. This commit addresses this problem in two ways: 1. The SA GetAuthorizations function used to find authzs to reuse for ra.NewOrder is adjusted to only return authorizations at min 24hr away from expiry. 2. Order expiry is now calculated by the RA in newOrder as the min of the order's own expiry or the soonest authorization expiry. This properly reflects the order's true lifetime based on the authorization lifetime. The RA/SA unit tests are updated accordingly. Resolves #3498	2018-03-16 20:42:21 +00:00
Jacob Hoffman-Andrews	700604dda1	Overlapping wildcard errors are 400, not 500. (#3561 ) Return a malformed error for these requests. Also add an integration test. Fixes #3558	2018-03-14 13:18:25 -07:00
Roland Bracewell Shoemaker	9c9e944759	Add SCT embedding (#3521 ) Adds SCT embedding to the certificate issuance flow. When an issuance is requested a precertificate (the requested certificate but poisoned with the critical CT extension) is issued and submitted to the required CT logs. Once the SCTs for the precertificate have been collected a new certificate is issued with the poison extension replaced with an SCT list extension containing the retrieved SCTs. Fixes #2244, fixes #3492 and fixes #3429.	2018-03-12 11:58:30 -07:00
Jacob Hoffman-Andrews	c621cbd33f	Reject overlaps with wildcards. (#3542 ) Requesting a certificate with "*.example.com" and "www.example.com" as separate SANs doesn't make sense, because "www.example.com" is covered by the wildcard. #3524	2018-03-10 06:49:36 +00:00
Jacob Hoffman-Andrews	11434650b7	Check safe browsing at validation time (#3539 ) Right now we check safe browsing at new-authz time, which introduces a possible external dependency when calling new-authz. This is usually fine, since most safe browsing checks can be satisfied locally, but when requests have to go external, it can create variance in new-authz timing. Fixes #3491.	2018-03-09 11:15:05 +00:00
Jacob Hoffman-Andrews	c8ee2a1719	Fix error logging in issueCertificate. (#3534 ) The logEvent setup we had in issueCertificate depended on all error handling code setting logEvent.Error in addition to returning the error. Since this is not the common pattern in Go, there were a few places where logEvent.Error wasn't set, making it hard to find the root cause of errors in the RA logs (though the errors would get propagated to the WFE for logging). This change wraps issueCertificate such that all errors returned get logged.	2018-03-08 18:16:54 +00:00
Daniel McCarney	49d55d9ab5	Make POSTing KeyAuthorization optional, V2 don't echo it. (#3526 ) This commit updates the RA to make the notion of submitting a KeyAuthorization value as part of the ra.UpdateAuthorization call optional. If set, the value is enforced against expected and an error is returned if the provided authorization isn't correct. If it isn't set the RA populates the field with the computed authorization for the VA to enforce against the value it sees in challenges. This retains the legacy behaviour of the V1 API. The V2 API will never unmarshal a provided key authorization. The ACMEv2/WFEv2 prepChallengeForDisplay function is updated to strip the ProvidedKeyAuthorization field before sending the challenge object back to a client. ACMEv1/WFEv1 continue to return the KeyAuthorization in challenges to avoid breaking clients that are relying on this legacy behaviour. For deployability ease this commit retains the name of the core.Challenge.ProvidedKeyAuthorization field even though it should be called core.Challenge.ComputedKeyAuthorization now that it isn't set based on the client's provided key authz. This will be easier as a follow-up change. Resolves #3514	2018-03-06 20:33:01 +00:00
Jacob Hoffman-Andrews	2d1d895da1	Copy authz.challenges to fix data race. (#3520 ) This race was uncovered by running the load generator as part of our CI. Also, update ra_test.go. It was previously testing that the returned authz and the stored authz should be identical, which is not actually a property of UpdateAuthorization; in general, they will not be identical.	2018-03-03 09:19:56 -08:00
Jacob Hoffman-Andrews	5d1a3bbf36	Copy authz for goroutine in RA. (#3508 ) There's a minor data race in UpdateAuthorization where the authorization can be changed in the goroutine concurrently with returning it. This fixes it by copying the authorization as an argument to the goroutine. Fixes #3506.	2018-03-02 10:52:24 -08:00
Daniel McCarney	f2d3ad6d52	Enforce new orders per acct per window rate limit. (#3501 ) Previously we introduced the concept of a "pending orders per account ID" rate limit. After struggling with making an implementation of this rate limit perform well we reevaluated the problem and decided a "new orders per account per time window" rate limit would be a better fit for ACMEv2 overall. This commit introduces the new newOrdersPerAccount rate limit. The RA now checks this before creating new pending orders in ra.NewOrder. It does so after order reuse takes place ensuring the rate limit is only applied in cases when a distinct new pending order row would be created. To accomplish this a migration for a new orders field (created) and an index over created and registrationID is added. It would be possible to use the existing expires field for this like we've done in the past, but that was primarily to avoid running a migration on a large table in prod. Since we don't have that problem yet for V2 tables we can Do The Right Thing and add a column. For deployability the deprecated pendingOrdersPerAccount code & SA gRPC bits are left around. A follow-up PR will be needed to remove those (#3502). Resolves #3410	2018-03-02 10:47:39 -08:00
Jacob Hoffman-Andrews	3568ad29ea	Turn CT failures into hard failures in RA. (#3496 ) When we originally deployed the inline CT submission code, we wanted to be conservative in case it increased our issuance error rate. However, we've established that the success rate is quite good, so we'll remove some complexity and make things more realistic by removing the code that avoids returning errors when CT submission fails.	2018-02-28 15:22:10 -08:00
Roland Bracewell Shoemaker	9bd8dc5d0f	Revert to using ParseAddress instead of ParseAddressList (#3475 ) Fixes #1558.	2018-02-23 21:48:31 -05:00
Jacob Hoffman-Andrews	92c9340fe8	Deprecate CountCertificatesExact. (#3462 ) This is now enabled in prod and we can make it the default.	2018-02-20 14:34:03 -08:00
Jacob Hoffman-Andrews	9ae81b822a	Add more error details on NewCertificate problems. (#3461 ) We occasionally get deadline exceeded errors inside RA's NewCertificate method, however it's not clear which SA queries cause these: we just see "deadline exceeded" errors. This prefixes errors from the most relevant queries with details about the request being made.	2018-02-20 14:32:52 -08:00
Roland Bracewell Shoemaker	8446571b46	Remove EnforceChallengeDisable (#3444 ) Removes usage of the `EnforceChallengeDisable` feature, the feature itself is not removed as it is still configured in staging/production, once that is fixed I'll submit another PR removing the actual flag. This keeps the behavior that when authorizations are retrieved from the SA they have their challenges populated, because that seems to make the most sense to me? It also retains TLS re-validation. Fixes #3441.	2018-02-14 13:21:26 -08:00
Daniel McCarney	f3d2dc50d9	Fix RA V2 wildcard authz reuse safety check. (#3434 ) Prior to this commit a logical error in the RA's `NewOrder` caused a safety check that prevents authorization reuse with a non-wildcard authz for a wildcard name to not work. This commit adds a test for the condition that the safety check is designed for and fixes the logical error. Prior to fixing the logical error the test fails. With the corrected safety check the test passes.	2018-02-08 15:35:11 -08:00
Roland Bracewell Shoemaker	9e23edf850	Use ctpolicy package in RA (#3422 ) And collect the metrics on success/failure rates. Built on top of #3414. Fixes #3413.	2018-02-08 13:33:42 -08:00
Daniel McCarney	4ac109ac25	Do not reuse legacy authzs in V2 new-order. (#3432 ) Prior to this commit when building up the authorizations for a new-order request we looked for any unexpired pending/valid authorizations owned by the account and used them for the order. This allows a client to use the V1 new-authz endpoint in combination with the V2 new-order endpoint and we do not want to support this behaviour. All V2 authorizations should be sourced from other V2 orders. This commit implements a new parameter for the SA's getAuthorizations function that allows filtering out legacy V1 authorizations by doing a JOIN on the order to authorizations join table. Resolves #3328	2018-02-08 12:31:04 -08:00
Daniel McCarney	d7bfb542c0	Handle order finalization errors. (#3404 ) This commit resolves the case where an error during finalization occurs. Prior to this commit if an error (expected or otherwise) occurred after setting an order to status processing at the start of order finalization the order would be stuck processing forever. The SA now has a `SetOrderError` RPC that can be used by the RA to persist an error onto an order. The order status calculation can use this error to decide if the order is invalid. The WFE is updated to write the error to the order JSON when displaying the order information. Prior to this commit the order protobuf had the error field as a `[]byte`. It doesn't seem like this is the right decision, we have a specific protobuf type for ProblemDetails and so this commit switches the error field to use it. The conversion to/from `[]byte` is done with the model by the SA. An integration test is included that prior to this commit left an order in a stuck processing state. With this commit the integration test passes as expected. Resolves https://github.com/letsencrypt/boulder/issues/3403	2018-02-07 16:34:07 -05:00
Daniel McCarney	67ae7f75b4	`sa.GetOrderAuthorizations` -> `sa.GetValidOrderAuthorizations`. (#3411 ) The SA RPC previously called `GetOrderAuthorizations` only returns valid, unexpired authorizations. This commit updates the name to emphasize that it only returns valid order authzs.	2018-02-07 11:54:18 -08:00
Daniel McCarney	eea049da40	Fix order reuse, calc order status by authz status (#3402 ) This PR is a rework of what was originally https://github.com/letsencrypt/boulder/pull/3382, integrating the design feedback proposed by @jsha: https://github.com/letsencrypt/boulder/pull/3382#issuecomment-359912549 This PR removes the stored Order status field and replaces it with a value that is calculated on-the-fly by the SA when fetching an order, based on the order's associated authorizations. In summary (and order of precedence): * If any of the order's authorizations are invalid, the order is invalid. * If any of the order's authorizations are deactivated, the order is deactivated. * If any of the order's authorizations are pending, the order is pending. * If all of the order's authorizations are valid, and there is a certificate serial, the order is valid. * If all of the order's authorizations are valid, and we have begun processing, but there is no certificate serial, the order is processing. * If all of the order's authorizations are valid, and we haven't begun processing, then the order is pending, awaiting a finalization request. This avoids having to explicitly update the order status when an associated authorization changes status. The RA's implementation of new-order is updated to only reuse an existing order if the calculated status is pending. This avoids giving back invalid or deactivated orders to clients. Resolves #3333	2018-02-01 16:33:42 -05:00
Roland Bracewell Shoemaker	2adf5a54ab	Move CN to SAN in v2 API (#3394 ) Fixes #3368. Basically just adds a `csr.VerifyCSR` call in `ra.FinalizeOrder` that mirrors what we have in `ra.NewCertificate`, this moves the CN to SAN as expected if included.	2018-01-29 13:21:12 -08:00
Roland Bracewell Shoemaker	7e4d44e172	Don't mask sa.GetValidAuthorization error in ra.NewAuthorization (#3381 )	2018-01-18 15:53:14 -05:00
Jacob Hoffman-Andrews	cfc7823cdd	Remove EnforceChallengeDisable check at issuance. (#3362 ) Per https://community.letsencrypt.org/t/2018-01-11-update-regarding-acme-tls-sni-and-shared-hosting-infrastructure/50188/3, we are planning to treat prior issuance by an account as reason to whitelist that account for reissuance via TLS-SNI. By extension, reusing validations that occurred prior to disclosure of the TLS-SNI issue is reasonably safe, so this change removes the issuance-time check for whether a challenge has been disabled. This saves us significant complexity and database load in implementing TLSSNIRevalidation (https://github.com/letsencrypt/boulder/pull/3361), since ChallengeTypeEnabled returns false, so we'd have to plumb through data about whether an issuance was based on a revalidation. Instead, we can safely delete this code. Note that "EnforceChallengeDisable" is implemented in three places: new-authz, validation time, and issuance time. We're keeping it in place at new-authz for now because it's intertwined with the account whitelisting code. We're keeping it in place at validation time, because there's a small chance that someone could have created a pending authz for a domain they don't control before the TLS-SNI issue was announced, and that authz could still be pending, and they could find out that that domain is hosted on a vulnerable provider, and use the vulnerability now that they know about it. A tiny chance, but may as well be careful.	2018-01-12 11:35:23 -08:00
Jacob Hoffman-Andrews	8153b919be	Implement TLSSNIRevalidation (#3361 ) This change adds a feature flag, TLSSNIRevalidation. When it is enabled, Boulder will create new authorization objects with TLS-SNI challenges if the requesting account has issued a certificate with the relevant domain name, and was the most recent account to do so. This setting overrides the configured list of challenges in the PolicyAuthority, so even if TLS-SNI is disabled in general, it will be enabled for revalidation. Note that this interacts with EnforceChallengeDisable. Because EnforceChallengeDisable causes additional checks at validation time and at issuance time, we need to update those two places as well. We'll send a follow-up PR with that. We chose to make this work only for the most recent account to issue, even if there were overlapping certificates, because it significantly simplifies the database access patterns and should work for 95+% of cases. Note that this change will let an account revalidate and reissue for a domain even if the previous issuance on that account used http-01 or dns-01. This also simplifies implementation, and fits within the intent of the mitigation plan: If someone previously issued for a domain using http-01, we have high confidence that they are actually the owner, and they are not going to "steal" the domain from themselves using tls-sni-01. Also note: This change also doesn't work properly with ReusePendingAuthz: true. Specifically, if you attempted issuance in the last couple days and failed because there was no tls-sni challenge, you'll still have an http-01 challenge lying around, and we'll reuse that; then your client will fail due to lack of tls-sni challenge again. This change was joint work between @rolandshoemaker and @jsha.	2018-01-12 11:00:06 -08:00
Maciej Dębski	44984cd84a	Implement regID whitelist for allowed challenge types. (#3352 ) This updates the PA component to allow authorization challenge types that are globally disabled if the account ID owning the authorization is on a configured whitelist for that challenge type.	2018-01-10 13:44:53 -05:00
Roland Shoemaker	d92713826c	remove debug statements	2018-01-09 20:58:53 -08:00
Roland Shoemaker	400ffede3d	More fixes	2018-01-09 20:48:16 -08:00
Roland Shoemaker	1a3a76438c	Fix tests and GetOrderAuthorizations	2018-01-09 20:38:52 -08:00
Roland Shoemaker	dcd2b438f4	Fix previous impl, add valid authz reuse fix and existing authz validation fix	2018-01-09 19:53:48 -08:00
Roland Shoemaker	5ca646c5dd	Disallow the use of valid authorizations that used currently disabled challenges for issuance	2018-01-09 18:52:29 -08:00
Daniel McCarney	7bb16ff21e	ACMEv2: Add pending order reuse (#3290 ) This commit adds pending order reuse. Subsequent to this commit multiple add-order requests from the same account ID for the same set of order names will result in only one order being created. Orders are only reused while they are not expired. Finalized orders will not be reused for subsequent new-order requests allowing for duplicate order issuance. Note that this is a second level of reuse, building on the pending authorization reuse that's done between separate orders already. To efficiently find an appropriate order ID given a set of names, a registration ID, and the current time a new orderFqdnSets table is added with appropriate indexes and foreign keys. Resolves #3258	2018-01-02 13:27:16 -08:00
Jacob Hoffman-Andrews	e60251d8ca	Add documentation link to rate limit errors. (#3286 ) Resolves #2951	2017-12-19 15:46:18 -08:00
Daniel McCarney	de5fbbdb67	Implement CAA issueWild enforcement for wildcard names (#3266 ) This commit implements RFC 6844's description of the "CAA issuewild property" for CAA records. We check CAA in two places: at the time of validation, and at the time of issuance when an authorization is more than 8 hours old. Both locations have been updated to properly enforce issuewild when checking CAA for a domain corresponding to a wildcard name in a certificate order. Resolves https://github.com/letsencrypt/boulder/issues/3211	2017-12-13 12:09:33 -05:00
Daniel McCarney	0684d5fc73	Add pending orders rate limit to new-order. (#3257 ) This commit adds a new rate limit to restrict the number of outstanding pending orders per account. If the threshold for this rate limit is crossed subsequent new-order requests will return a 429 response. Note: Since the rate limit object itself defines an `Enabled()` test based on whether or not it has been configured there is not a feature flag for this change. Resolves https://github.com/letsencrypt/boulder/issues/3246	2017-12-04 16:36:48 -05:00
Daniel McCarney	1c99f91733	Policy based issuance for wildcard identifiers (Round two) (#3252 ) This PR implements issuance for wildcard names in the V2 order flow. By policy, pending authorizations for wildcard names only receive a DNS-01 challenge for the base domain. We do not re-use authorizations for the base domain that do not come from a previous wildcard issuance (e.g. a normal authorization for example.com turned valid by way of a DNS-01 challenge will not be reused for a *.example.com order). The wildcard prefix is stripped off of the authorization identifier value in two places: When presenting the authorization to the user - ACME forbids having a wildcard character in an authorization identifier. When performing validation - We validate the base domain name without the *. prefix. This PR is largely a rewrite/extension of #3231. Instead of using a pseudo-challenge-type (DNS-01-Wildcard) to indicate an authorization & identifier correspond to the base name of a wildcard order name we instead allow the identifier to take the wildcard order name with the *. prefix.	2017-12-04 12:18:10 -08:00
Roland Bracewell Shoemaker	d5db80ab12	Various publisher CT fixes (#3219 ) Makes a couple of changes: * Change `SubmitToCT` to make submissions to each log in parallel instead of in serial, this prevents a single slow log from eating up the majority of the deadline and causing submissions to other logs to fail * Remove the 'submissionTimeout' field on the publisher since it is actually bounded by the gRPC timeout and is misleading * Add a timeout to the CT client's internal HTTP client so that when log servers hang indefinitely we actually do retries instead of just using the entire submission deadline. Currently set at 2.5 minutes Fixes #3218.	2017-11-09 10:05:26 -05:00
Daniel McCarney	2f263f8ed5	ACME v2 Finalize order support (#3169 ) This PR implements order finalization for the ACME v2 API. In broad strokes this means: * Removing the CSR from order objects & the new-order flow * Adding identifiers to the order object & new-order * Providing a finalization URL as part of orders returned by new-order * Adding support to the WFE's Order endpoint to receive finalization POST requests with a CSR * Updating the RA to accept finalization requests and to ensure orders are fully validated before issuance can proceed * Updating the SA to allow finding order authorizations & updating orders. * Updating the CA to accept an Order ID to log when issuing a certificate corresponding to an order object Resolves #3123	2017-11-01 12:39:44 -07:00
JP Phillips	e83480f86b	Return accurate error description for invalid authz limit (#3156 ) Fixes #3144	2017-10-11 22:21:51 -07:00
Daniel McCarney	1794c56eb8	Revert "Add CAA parameter to restrict challenge type (#3003 )" (#3145 ) This reverts commit `23e2c4a836`.	2017-10-04 12:00:44 -07:00
lukaslihotzki	23e2c4a836	Add CAA parameter to restrict challenge type (#3003 ) This commit adds CAA `issue` parameter parsing and the `challenge` parameter to permit a single challenge type only. By setting `challenge=dns-01`, the nameserver keeps control over every issued certificate.	2017-10-02 11:59:47 -07:00
Jacob Hoffman-Andrews	b80b129d1a	Remove unused requestMethod and VerificationMethods. (#3129 ) These were added as part of #62, based on the original CPS at https://letsencrypt.org/documents/ISRG-CPS-May-5-2015.pdf. Request method was an odd thing to log because for Let's Encrypt it will always be "online", never "in person." And VerificationMethods is logged separately during the authz validation process. The newest CPS at https://letsencrypt.org/documents/isrg-cps-v2.0/ no longer requires these specific fields, so we're removing them for clarity.	2017-09-29 12:35:58 -07:00
Roland Bracewell Shoemaker	b7bca87134	Batch fetching of existing authorizations and creation of pending authorizations (#3058 ) For the new-order endpoint only. This does some refactoring of the order of operations in `ra.NewAuthorization` as well in order to reduce the duplication of code relating to creating pending authorizations, existing tests still seem to work as intended... A close eye should be given to this since we don't have integration tests yet that test it end to end. This also changes the inner type of `grpc.StorageAuthorityServerWrapper` to `core.StorageAuthority` so that we can avoid a circular import that is created by needing to import `grpc.AuthzToPB` and `grpc.PBToAuthz` in `sa/sa.go`. This is a big change but should considerably improve the performance of the new-order flow. Fixes #2955.	2017-09-25 09:10:59 -07:00
Daniel McCarney	f0a9e9aa0e	Fix duplicate cert limit off-by-one. (#3117 ) The RA's `checkCertificatesPerFQDNSetLimit` function was using `>` where it should have been using `>=` when evaluating a FQDNSet count against the rate limit threshold. This resulted in an off by one error where we allowed 1 more duplicate certificate than intended. This commit fixes the off-by-one error and adds a short unit test. The unit test failed the `TestCheckExactCertificateLimit/FQDN_set_issuances_equal_to_limit` subtest before the fix was applied and passes afterwards.	2017-09-22 14:45:48 -07:00
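The off-by-one described in commit f0a9e9aa0e comes down to a single comparison operator. The sketch below shows it in isolation. The function names and the threshold of 5 are hypothetical and are not taken from `checkCertificatesPerFQDNSetLimit`.

```go
package main

import "fmt"

// overLimitBuggy reports whether a new issuance should be refused, using
// the strict comparison described as the bug: a count equal to the
// threshold is still allowed through.
func overLimitBuggy(count, threshold int) bool {
	return count > threshold
}

// overLimitFixed refuses as soon as the existing count reaches the threshold.
func overLimitFixed(count, threshold int) bool {
	return count >= threshold
}

func main() {
	const threshold = 5 // hypothetical limit
	existing := 5       // certificates already issued for this FQDN set
	fmt.Println("buggy refuses:", overLimitBuggy(existing, threshold))
	fmt.Println("fixed refuses:", overLimitFixed(existing, threshold))
}
```

With five certificates already issued against a limit of five, the strict comparison lets a sixth through, which is the "1 more duplicate certificate than intended" the commit message mentions.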
Kleber Correia	72b1c69761	Remove RecheckCAA feature flag (#3103 ) Updates #2692.	2017-09-18 11:59:58 -07:00
Jacob Hoffman-Andrews	9ab2ff4e03	Add CAA-specific error. (#3051 ) Previously, CAA problems were lumped in under "ConnectionProblem" or "Unauthorized". This should make things clearer and easier to differentiate. Fixes #3043	2017-09-14 14:11:41 -07:00
Jacob Hoffman-Andrews	3b6a5ff63c	Fix a bad error return in RA (#3061 ) RA.NewRegistration checked for an error return from SA.NewRegistration, but failed to properly propagate the error. It was just setting the err variable and falling through to the successful return case. This fixes that issue and adds a unittest.	2017-09-08 13:15:09 -07:00
Daniel McCarney	d18e1dbcff	Add WrongAuthorizationState error code for UpdateAuthorization (#3053 ) This commit adds a new boulder error type WrongAuthorizationState. This error type is returned by the SA when UpdateAuthorization is provided an authz that isn't pending. The RA and WFE are updated accordingly such that this error percolates back through the API to the user as a Malformed problem with a sensible description. Previously this behaviour resulted in a ServerInternal error. Resolves #3032	2017-09-07 11:22:02 -07:00
Kleber Correia	710c814720	Remove AllowKeyRollover flag (#3037 )	2017-09-06 08:43:15 -04:00
Jacob Hoffman-Andrews	18f15b2b3d	Remove unused error types (#3041 ) * Remove all of the errors under core. Their purpose is now served by errors, and they were almost entirely unused. The remaining uses were switched to errors. * Remove errors.NotSupportedError. It was used in only one place (ca.go), and that usage is more appropriately a ServerInternal error.	2017-09-05 16:51:32 -07:00
Jacob Hoffman-Andrews	45f42f6583	Switch recheckCAA error to Unauthorized. (#3040 ) ConnectionFailure is only used during validation, and so isn't handled by WFE's problemDetailsFromBoulderError. This led to returning ServerInternal instead of the intended error code, and hiding the error detail. Unauthorized is probably a better error type for now anyhow, but long-term we should switch to a specific CAA error type. This PR will allow clients to see the detailed list of problem domains when new-cert returns an error due to CAA rechecking.	2017-09-05 10:53:47 -07:00
Roland Bracewell Shoemaker	191a043585	Implement handler for retrieving an order object and SA RPC (#3016 ) Fixes #2984 and fixes #2985.	2017-09-01 15:26:36 -07:00
Jacob Hoffman-Andrews	ba95793628	Remove all named returns from RA. (#3021 ) This is a followup from https://github.com/letsencrypt/boulder/pull/3017, in which we identified a data race caused by the use of named returns. This also reverts the change from that PR, which was only a surface level fix. Fixes #3019.	2017-08-30 12:03:27 -07:00
Jacob Hoffman-Andrews	4ec662ee59	Fix data race when MatchesCSR fails (#3017 )	2017-08-29 10:05:09 -04:00
Jacob Hoffman-Andrews	b0c7bc1bee	Recheck CAA for authorizations older than 8 hours (#3014 ) Fixes #2889. VA now implements two gRPC services: VA and CAA. These both run on the same port, but this allows implementation of the IsCAAValid RPC to skip using the gRPC wrappers, and makes it easier to potentially separate the service into its own package in the future. RA.NewCertificate now checks the expiration times of authorizations, and will call out to VA to recheck CAA for those authorizations that were not validated recently enough.	2017-08-28 16:40:57 -07:00
Roland Bracewell Shoemaker	90ba766af9	Add NewOrder RPCs + methods to SA and RA (#2907 ) Fixes #2875, #2900 and #2901.	2017-08-11 14:24:25 -04:00
Jacob Hoffman-Andrews	8bc1db742c	Improve recycling of pending authzs (#2896 ) The existing ReusePendingAuthz implementation had some bugs: It would recycle deactivated authorizations, which then couldn't be fulfilled. (#2840) Since it was implemented in the SA, it wouldn't get called until after the RA checks the Pending Authorizations rate limit. Which means it wouldn't fulfill its intended purpose of making accounts less likely to get stuck in a Pending Authorizations limited state. (#2831) This factors out the reuse functionality, which used to be inside an "if" statement in the SA. Now the SA has an explicit GetPendingAuthorization RPC, which gets called from the RA before calling NewPendingAuthorization. This happens to obsolete #2807, by putting the recycling logic for both valid and pending authorizations in the RA.	2017-07-26 14:00:30 -07:00
Brian Smith	ac63c78313	CA: Have IssueCertificate use IssueCertificateRequest directly. (#2886 ) This is a step towards the long-term goal of eliminating wrappers and a step towards the short-term goal of making it easier to refactor ca/ca_test.go to add testing of precertificate-based issuance.	2017-07-25 13:35:25 -04:00
Roland Bracewell Shoemaker	05d869b005	Rename DNSResolver -> DNSClient (#2878 ) Fixes #639. This resolves something that has bugged me for two+ years, our DNSResolverImpl is not a DNS resolver, it is a DNS client. This change just makes that obvious.	2017-07-18 08:37:45 -04:00
Daniel McCarney	b18c3aaa96	Improve the `emptyDNSResponseError` message. (#2812 ) When the RA's `validateEmail` function performs DNS lookups for the MX and A records for the email domain it returns the `emptyDNSResponseError` message if neither an MX nor an A record is found. Prior to this commit the message only said "empty DNS response" and didn't indicate what the DNS lookup was being used for, or which records were empty. This commit updates the message to hopefully be less confusing and more immediately actionable by a user.	2017-06-15 11:40:39 -07:00
Daniel McCarney	fbd87b1757	Splits CountRegistrationsByIP to exact-match and by /48. (#2782 ) Prior to this PR the SA's `CountRegistrationsByIP` treated IPv6 differently than IPv4 by counting registrations within a /48 for IPv6 as opposed to exact matches for IPv4. This PR updates `CountRegistrationsByIP` to treat IPv4 and IPv6 the same, always matching exactly. The existing RegistrationsPerIP rate limit policy will be applied against this exact matching count. A new `CountRegistrationsByIPRange` function is added to the SA that performs the historic matching process, e.g. for IPv4 it counts exactly the same as `CountRegistrationsByIP`, but for IPv6 it counts within a /48. A new `RegistrationsPerIPRange` rate limit policy is added to allow configuring the threshold/window for the fuzzy /48 matching registration limit. Stats for the "Exceeded" and "Pass" events for this rate limit are separated into a separate `RegistrationsByIPRange` stats scope under the `RateLimit` scope to allow us to track it separate from the exact registrations per IP rate limit. Resolves https://github.com/letsencrypt/boulder/issues/2738	2017-05-30 15:12:20 -07:00
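The "exact for IPv4, /48 for IPv6" matching that commit fbd87b1757 assigns to `CountRegistrationsByIPRange` can be illustrated with the standard `net/netip` package. This is a sketch under assumptions: `rangeKey` is a hypothetical helper that returns the bucket an address would be counted under, and it says nothing about how the SA actually queries its database.

```go
package main

import (
	"fmt"
	"net/netip"
)

// rangeKey returns the bucket a registration IP falls into: the exact
// address for IPv4, or the enclosing /48 network for IPv6.
func rangeKey(s string) (string, error) {
	addr, err := netip.ParseAddr(s)
	if err != nil {
		return "", err
	}
	addr = addr.Unmap() // treat IPv4-mapped IPv6 addresses as IPv4
	if addr.Is4() {
		return addr.String(), nil
	}
	prefix, err := addr.Prefix(48) // keeps only the top 48 bits
	if err != nil {
		return "", err
	}
	return prefix.String(), nil
}

func main() {
	for _, ip := range []string{"192.0.2.1", "2001:db8:1:2::1", "2001:db8:1:ffff::9"} {
		key, err := rangeKey(ip)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(ip, "->", key)
	}
}
```

The two IPv6 addresses in `main` share a /48, so they land in the same bucket, while each IPv4 address is its own bucket.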
Daniel McCarney	7393db6d59	Fixes RPC bug for CountCertificatesExact feature flag. (#2759 ) With the CountCertificatesExact feature flag enabled if the RA's checkCertificatesPerNameLimit was called with names only containing domains exactly matching a public suffix entry then the legacy ra.enforceNameCounts function will be called with an empty tldNames argument. This in turn will cause the RA->SA RPC to fail with an "incomplete gRPC request message error". This commit fixes this bug by only calling ra.enforceNameCounts when len(tldNames) > 0. Resolves #2758	2017-05-12 15:21:00 -04:00
Daniel McCarney	b9369a4814	Uses `UniqueLowerNames` for domain/suffix rl funcs. (#2725 ) Both `ra.domainsForRateLimiting` and `ra.suffixesForRateLimiting` were doing their own "unique"ing with a `map[string]struct{}` when they could have used `core.UniqueLowerNames`. This commit updates them both to do so and adjusts the unit tests to expect the new sorted order return.	2017-05-04 12:37:05 +01:00
Daniel McCarney	1ed34a4a5d	Fixes cert count rate limit for exact PSL matches. (#2703 ) Prior to this PR if a domain was an exact match to a public suffix list entry the certificates per name rate limit was applied based on the count of certificates issued for that exact name and all of its subdomains. This PR introduces an exception such that exact public suffix matches correctly have the certificate per name rate limit applied based on only exact name matches. In order to accomplish this a new RPC is added to the SA `CountCertificatesByExactNames`. This operates similar to the existing `CountCertificatesByNames` but does not include subdomains in the count, only exact matches to the names provided. The usage of this new RPC is feature flag gated behind the "CountCertificatesExact" feature flag. The RA unit tests are updated to test the new code paths both with and without the feature flag enabled. Resolves #2681	2017-05-02 13:43:35 -07:00
Jacob Hoffman-Andrews	d99800ecb1	Remove some last traces of AMQP. (#2687 ) Fixes #2665	2017-04-20 10:43:17 -07:00
David Calavera	cc5ee3906b	Refactor IsSane and IsSane* to return useful errors. (#2685 ) This change changes the returning values from boolean to error. It makes `checkConsistency` an internal function and removes the optional argument in favor of making checks explicit where they are used. It also renames those functions to CheckConsistency* to not give the impression of still returning boolean values. Signed-off-by: David Calavera <david.calavera@gmail.com>	2017-04-19 12:08:47 -04:00
Brian Smith	497a027842	RA: Parse issued certificate only once. (#2657 ) Previously RegistrationAuthorityImpl.NewCertificate would call MatchesCSR() and then verify that the certificate can be successfully parsed. However, MatchesCSR() itself parses the certificate, so the latter check was pointless. Instead, parse the certificate once, fail if it can't be parsed, then pass the parsed certificate to MatchesCSR().	2017-04-06 14:44:32 -04:00
Daniel McCarney	594e31b724	Logs authz ID when ra.onValidationUpdate fails. (#2662 ) `0e112ae` updates ra/ra.go such that when onValidationUpdate returns a non-nil error the AuditErr message includes the affected authorization ID in addition to the registration ID. Resolves #2661.	2017-04-06 11:16:18 -07:00
Roland Bracewell Shoemaker	e2b2511898	Overhaul internal error usage (#2583 ) This patch removes all usages of the `core.XXXError` and almost all usages of `probs` outside of the WFE and VA and replaces them with a unified internal error type. Since the VA uses `probs.ProblemDetails` quite extensively in challenges, and currently stores them in the DB I've saved this change for another change (it'll also require a migration). Since `ProblemDetails` should only ever be exposed to end-users all of its related logic should be moved into the `WFE` but since it still needs to be exposed to the VA and SA I've left it in place for now. The new internal `errors` package offers the same convenience functions as `probs` does as well as a new simpler type testing method. A few small changes have also been made to error messages, mainly adding the library and function name to internal server errors for easier debugging (i.e. where a number of functions return the exact same errors and there is no other way to distinguish which method threw the error). Also adds proper encoding of internal errors transferred over gRPC (the current encoding scheme is kept for `core` and `probs` errors since it'll ideally be removed after we deploy this and follow-up changes) using `grpc/metadata` instead of the gRPC status codes. Fixes #2507. Updates #2254 and #2505.	2017-03-22 23:27:31 -07:00
Daniel	e88db3cd5e	Revert "Revert "Copy all statsd stats to Prometheus. (#2474 )" (#2541 )" This reverts commit `9d9e4941a5` and restores the statsd prometheus code.	2017-02-01 15:48:18 -05:00
Daniel McCarney	9d9e4941a5	Revert "Copy all statsd stats to Prometheus. (#2474 )" (#2541 ) This reverts commit `58ccd7a71a`. We are seeing multiple boulder components restart when they encounter the stat registration race condition described in https://github.com/letsencrypt/boulder/issues/2540	2017-02-01 12:50:27 -05:00
Jacob Hoffman-Andrews	6c93b41f20	Add a limit on failed authorizations (#2513 ) Fixes #976. This implements a new rate limit, InvalidAuthorizationsPerAccount. If a given account fails authorization for a given hostname too many times within the window, subsequent new-authz attempts for that account and hostname will fail early with a rateLimited error. This mitigates the misconfigured clients that constantly retry authorization even though they always fail (e.g., because the hostname no longer resolves). For the new rate limit, I added a new SA RPC, CountInvalidAuthorizations. I chose to implement this only in gRPC, not in AMQP-RPC, so checking the rate limit is gated on gRPC. See #2406 for some description of the how and why. I also chose to directly use the gRPC interfaces rather than wrapping them in core.StorageAuthority, as a step towards what we will want to do once we've moved fully to gRPC. Because authorizations don't have a created time, we need to look at the expires time instead. Invalid authorizations retain the expiration they were given when they were created as pending authorizations, so we use now + pendingAuthorizationLifetime as one side of the window for rate limiting, and look backwards from there. Note that this means you could maliciously bypass this rate limit by stacking up pending authorizations over time, then failing them all at once. Similarly, since this limit is by (account, hostname) rather than just (hostname), you can bypass it by creating multiple accounts. It would be more natural and robust to limit by hostname, like our certificate limits. However, we currently only have two indexes on the authz table: the primary key, and (`registrationID`,`identifier`,`status`,`expires`) Since this limit is intended mainly to combat misconfigured clients, I think this is sufficient for now. Corresponding PR for website: letsencrypt/website#125	2017-01-23 11:22:51 -08:00
Jacob Hoffman-Andrews	9dacdd5443	Fix SA wrappers for maps. (#2498 ) We turn arrays into maps with a range command. Previously, we were taking the address of the iteration variable in that range command, which meant incorrect results since the iteration variable gets reassigned. Also change the integration test to catch this error. Fixes #2496	2017-01-17 14:07:07 -08:00
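Commit 9dacdd5443 describes a classic Go pitfall: before Go 1.22 the `range` iteration variable was a single variable reused on every pass, so storing its address in a map left every entry pointing at the same value. One way to avoid the problem on any Go version is to take the address of the slice element by index. The `authz` type and `byID` helper below are hypothetical and only illustrate the pattern; they are not the SA wrapper code.

```go
package main

import "fmt"

// authz is a hypothetical stand-in for an authorization record.
type authz struct {
	ID string
}

// byID turns a slice into a map of pointers keyed by ID. It takes
// &items[i] rather than the address of a range iteration variable, so
// each map entry points at a distinct slice element.
func byID(items []authz) map[string]*authz {
	m := make(map[string]*authz, len(items))
	for i := range items {
		m[items[i].ID] = &items[i]
	}
	return m
}

func main() {
	items := []authz{{ID: "a"}, {ID: "b"}, {ID: "c"}}
	for id, p := range byID(items) {
		fmt.Println(id, "->", p.ID)
	}
}
```

Indexing into the slice keeps the pointers tied to the original elements, so later changes to `items[i]` are visible through the map as well.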
Jacob Hoffman-Andrews	58ccd7a71a	Copy all statsd stats to Prometheus. (#2474 ) We have a number of stats already expressed using the statsd interface. During the switchover period to direct Prometheus collection, we'd like to make those stats available both ways. This change automatically exports any stats exported using the statsd interface via Prometheus as well. This is a little tricky because Prometheus expects all stats to be registered exactly once. Prometheus does offer a mechanism to gracefully recover from registering a stat more than once by handling a certain error, but it is not safe for concurrent access. So I added a concurrency-safe wrapper that creates Prometheus stats on demand and memoizes them. In the process, made a few small required side changes: - Clean "/" from method names in the gRPC interceptors. They are allowed in statsd but not in Prometheus. - Replace "127.0.0.1" with "boulder" as the name of our testing CT log. Prometheus stats can't start with a number. - Remove ":" from the CT-log stat names emitted by Publisher. Prometheus stats can't include it. - Remove a stray "RA" in front of some rate limit stats, since it was duplicative (we were emitting "RA.RA..." before). Note that this means two stat groups in particular are duplicated: - Gostats* is duplicated with the default process-level stats exported by the Prometheus library. - gRPCClient* are duplicated by the stats generated by the go-grpc-prometheus package. When writing dashboards and alerts in the Prometheus world, we should be careful to avoid these two categories, as they will disappear eventually. As a general rule, if a stat is available with an all-lowercase name, choose that one, as it is probably the Prometheus-native version. In the long run we will want to create most stats using the native Prometheus stat interface, since it allows us to add labels to metrics, which is very useful. For instance, currently our DNS stats distinguish types of queries by appending the type to the stat name. This would be more natural as a label in Prometheus.	2017-01-10 10:30:15 -05:00
Daniel McCarney	9b89aa7d2c	Logs failed SA.UpdatePendingAuthorization. (#2321 ) This commit resolves #2303 by updating the comment, and returned error type produced when the RA calls `SA.UpdatePendingAuthorization` and it fails. Previously this produced a `MalformedRequestError` that was described as only happening when the client corrupted the challenge data. Now this is returned as more descriptive `ServerInternalError` and the underlying error from the SA is logged as a warning for further debugging.	2016-11-11 09:26:53 -08:00
Roland Bracewell Shoemaker	c5f99453a9	Switch CT submission RPC from CA -> RA (#2304 ) With the current gRPC design the CA talks directly to the Publisher when calling SubmitToCT which crosses security boundaries (secure internal segment -> internet facing segment) which is dangerous if (however unlikely) the Publisher is compromised and there is a gRPC exploit that allows memory corruption on the caller end of a RPC which could expose sensitive information or cause arbitrary issuance. Instead we move the RPC call to the RA which is in a less sensitive network segment. Switching the call site from the CA -> RA is gated on adding the gRPC PublisherService object to the RA config. Fixes #2202.	2016-11-08 11:39:02 -08:00
Daniel McCarney	a6f2b0fafb	Updates `go-jose` dep to v1.1.0 (#2314 ) This commit updates the `go-jose` dependency to [v1.1.0](https://github.com/square/go-jose/releases/tag/v1.1.0) (Commit: aa2e30fdd1fe9dd3394119af66451ae790d50e0d). Since the import path changed from `github.com/square/...` to `gopkg.in/square/go-jose.v1/` this means removing the old dep and adding the new one. The upstream go-jose library added a `[]x509.Certificate` member to the `JsonWebKey` struct that prevents us from using a direct equality test against two `JsonWebKey` instances. Instead we now must compare the inner `Key` members. The `TestRegistrationContactUpdate` function from `ra_test.go` was updated to populate the `Key` members used in testing instead of only using KeyID's to allow the updated comparisons to work as intended. The `Key` field of the `Registration` object was switched from `jose.JsonWebKey` to `*jose.JsonWebKey` to make it easier to represent a registration w/o a Key versus using a value with a nil `JsonWebKey.Key`. I verified the upstream unit tests pass per contributing.md: ``` daniel@XXXXX:~/go/src/gopkg.in/square/go-jose.v1$ git show commit aa2e30fdd1fe9dd3394119af66451ae790d50e0d Merge: 139276c e18a743 Author: Cedric Staub <cs@squareup.com> Date: Thu Sep 22 17:08:11 2016 -0700 Merge branch 'master' into v1 * master: Better docs explaining embedded JWKs Reject invalid embedded public keys Improve multi-recipient/multi-sig handling daniel@XXXXX:~/go/src/gopkg.in/square/go-jose.v1$ go test ./... ok gopkg.in/square/go-jose.v1 17.599s ok gopkg.in/square/go-jose.v1/cipher 0.007s ? gopkg.in/square/go-jose.v1/jose-util [no test files] ok gopkg.in/square/go-jose.v1/json 1.238s ```	2016-11-08 13:56:50 -05:00
Daniel McCarney	eb67ad4f88	Allow `validateEmail` to timeout w/o error. (#2288 ) This PR reworks the validateEmail() function from the RA to allow timeouts during DNS validation of MX/A/AAAA records for an email to be non-fatal and match our intention to verify emails best-effort. Notes: bdns/problem.go - DNSError.Timeout() was changed to also include context cancellation and timeout as DNS timeouts. This matches what DNSError.Error() was doing to set the error message and supports external callers to Timeout not duplicating the work. bdns/mocks.go - the LookupMX mock was changed to support always.error and always.timeout in a manner similar to the LookupHost mock. Otherwise the TestValidateEmail unit test for the RA would fail when the MX lookup completed before the Host lookup because the error wouldn't be correct (empty DNS records vs a timeout or network error). test/config/ra.json, test/config-next/ra.json - the dnsTries and dnsTimeout values were updated such that dnsTries * dnsTimeout was <= the WFE->RA RPC timeout (currently 15s in the test configs). This allows the dns lookups to all timeout without the overall RPC timing out. Resolves #2260.	2016-10-27 11:56:12 -07:00
Daniel McCarney	39c7ce7b69	Properly abort `NewAuthorization` when SA RPC fails. (#2286 ) The RA performs an RPC to the SA's `GetValidAuthorizations` function when attempting to find existing valid authorizations to reuse. Prior to this commit, if the RPC fails (e.g. due to a timeout) the calling code logs the failure as a warning but fails to return the error and cease processing. This results in a nil panic when we later try to index `auths`. This commit inserts the missing `return` to ensure we don't process further, thereby resolving #2274. A test for this fix is provided with `TestReuseAuthorizationFaultySA`. Without `f52f340` applied this test recreates the panic observed in #2274 and produces: ``` go test -p 1 -v -race --test.run TestReuseAuthorizationFaultySA github.com/letsencrypt/boulder/ra === RUN TestReuseAuthorizationFaultySA --- FAIL: TestReuseAuthorizationFaultySA (0.04s) panic: runtime error: invalid memory address or nil pointer dereference [recovered] panic: runtime error: invalid memory address or nil pointer dereference [signal 0xb code=0x1 addr=0x20 pc=0x4be2b8] ``` With `f52f340` it passes. Yay!	2016-10-27 08:58:26 -07:00
Roland Bracewell Shoemaker	ce679bad41	Implement key rollover (#2231 ) Fixes #503. Functionality is gated by the feature flag `AllowKeyRollover`. Since this functionality is only specified in ACME draft-03 and we mostly implement the draft-02 style this takes some liberties in the implementation, which are described in the updated divergences doc. The `key-change` resource is used to side-step draft-03 `url` requirement.	2016-10-27 10:22:09 -04:00
Jacob Hoffman-Andrews	1958dc9065	Update total issued count asynchronously. (#2246 ) Previously the lock on total issued count would exacerbate problems when the count query was slow, which it often is. Fixes #1809.	2016-10-20 14:17:34 -07:00
Jacob Hoffman-Andrews	404e9682b1	Improve error messages. (#2256 ) Quote rejected hostnames. Include term "global" when rejecting based on global rate limit. Fixes #2252	2016-10-18 10:15:21 -07:00
Blake Griffith	d2cf6ee126	Fixes RA `DeactivateRegistration` err message typo (#2222 )	2016-10-03 15:20:43 -04:00
Roland Bracewell Shoemaker	c6e3ef660c	Re-apply 2138 with proper gating (#2199 ) Re-applies #2138 using the new style of feature-flag gated migrations. Account deactivation is gated behind `features.AllowAccountDeactivation`.	2016-09-29 17:16:03 -04:00
Roland Bracewell Shoemaker	2c966c61b2	Revert "Allow account deactivation (#2138 )" (#2188 ) This reverts commit `6f3d078414`, reversing changes made to `c8f1fb3e2f`.	2016-09-19 11:20:41 -07:00
Jacob Hoffman-Andrews	6f3d078414	Allow account deactivation (#2138 ) Fixes #2011.	2016-09-07 19:36:54 -04:00
Roland Bracewell Shoemaker	c8f1fb3e2f	Remove direct usages of go-statsd-client in favor of using metrics.Scope (#2136 ) Fixes #2118, fixes #2082.	2016-09-07 19:35:13 -04:00
Blake Griffith	344a312905	Remove audit comments -- closes #2129 (#2139 ) Closes #2129 * Remove audit comments. * Nuke doc/requirements/*	2016-08-25 18:23:42 -07:00
Roland Shoemaker	dbf9afa7d6	Review fixes pt. 1	2016-08-25 16:28:58 -07:00
Roland Shoemaker	c7e5ed1262	RA/WFE tests	2016-08-24 12:36:41 -07:00
Roland Shoemaker	aa7f85d3f5	Merge branch 'master' into reg-deact	2016-08-24 11:51:15 -07:00
Roland Bracewell Shoemaker	51ee04e6a9	Allow authorization deactivation (#2116 ) Implements `valid` and `pending` authz deactivation.	2016-08-23 16:25:06 -04:00
Roland Bracewell Shoemaker	91bfd05127	Revert #2088 (#2137 ) * Remove oldx509 usage * Un-vendor old crypto/x509, crypto/x509/pkix, and encoding/asn1	2016-08-23 14:01:37 -04:00
Roland Shoemaker	003158c9e3	Initial impl	2016-08-18 14:12:09 -07:00
Ben Irving	8ed5b1e6a1	Replace *AcmeURL with string (#2117 ) Removes core.AcmeURL from boulder and uses string instead. Fixes #1996	2016-08-11 13:27:19 -07:00
Ben Irving	8b622c805a	Move MergeUpdate out of core (#2114 ) Fixes #2022	2016-08-08 17:12:52 -07:00
Roland Bracewell Shoemaker	fc39781274	Allow user specified revocation reason (#2089 ) Fixes #140. This patch allows users to specify the following revocation reasons based on my interpretation of the meaning of the codes but could use confirmation from others. * unspecified (0) * keyCompromise (1) * affiliationChanged (3) * superseded (4) * cessationOfOperation (5)	2016-08-08 14:26:52 -07:00
Jacob Hoffman-Andrews	474b76ad95	Import forked x509 for parsing of CSRs with empty integers (#2088 ) Part of #2080. This change vendors `crypto/x509`, `crypto/x509/pkix`, and `encoding/asn1` from `1d5f6a765d`. That commit is a direct child of the Go 1.5.4 release tag, so it contains the same code as the current Go version we are using. In that commit I rewrote imports in those packages so they depend on each other internally rather than calling out to the standard library, which would cause type disagreements. I changed the imports in each place where we're parsing CSRs, and imported under a different name `oldx509`, both to avoid collisions and make it clear what's going on. Places that only use `x509` to parse certificates are not changed, and will use the current standard library. This will unblock us from moving to Go 1.6, and subsequently Go 1.7.	2016-07-28 10:38:33 -04:00
Patrick Figel	8cd74bf766	Make (pending)AuthorizationLifetime configurable (#2028 ) Introduces the `authorizationLifetimeDays` and `pendingAuthorizationLifetimeDays` configuration options for `RA`. If the values are missing from configuration, the code defaults back to the current values (300/7 days). fixes #2024	2016-07-12 15:18:22 -04:00
Daniel McCarney	7e946eaacc	Registration update optimizations (#2001 ) This PR adds two optimizations to fix the optimistic lock errors observed in #1986. First, the WFE now returns early for registration POSTs (before invoking the RA and SA) when the POST body is the trivial update (`{"resource":"reg"}`). This prevents any DB operations from being performed when there is no work to be done. Second, the RA now tracks whether an update actually changes the base registration's `Contact` slice or `Agreement` string. If the proposed update doesn't change either of these fields then the RA will return early before handing the update to the SA. Both changes save database operations from being performed needlessly and will help avoid the optimistic lock errors we observed when a problematic client was POSTing the trivial update repeatedly in a short period. The fix was verified as follows: I checked out master and artificially introduced lock contention into the SA by adding a 2s sleep into `UpdateRegistration` between fetching the `existingRegModel` to get the `LockCol` value and calling `ssa.dbMap.Update`. With the sleep in place & two certbot clients posting matching registration updates the lock contention error is produced as expected. After checking out the `empty-reg-updates` branch, re-adding the sleep to the SA, and performing the same two client reg updates no error is produced.	2016-07-07 13:40:55 -04:00
Simone Carletti	7172e49650	Replace x/net/publicsuffix with weppos/publicsuffix-go (#1969 ) This PR replaces the `x/net/publicsuffix` package with `weppos/publicsuffix-go`. The conversations that led to this decision are #1479 and #1374. To summarize the discussion, the main issue with `x/net/publicsuffix` is that the package compiles the list into the Go source code and doesn't provide a way to easily pull updates (e.g. by re-parsing the original PSL) unless the entire package is recompiled. The PSL update frequency is almost daily, which makes it very hard to recompile the official Golang package to stay up-to-date with all the changes. Moreover, Golang maintainers expressed some concerns about rebuilding and committing changes with a frequency that would keep the package in sync with the original PSL. See https://github.com/letsencrypt/boulder/issues/1374#issuecomment-182429297 `weppos/publicsuffix-go` contains a compiled version of the list that is updated weekly (or more frequently). Moreover, the package can read and parse a PSL from a String or a File which will effectively decouple the Boulder source code from the list itself. The main benefit is that it will be possible to update the definition by simply downloading the latest list and restarting the application (assuming the list is persisted in memory).	2016-06-30 15:03:14 -07:00
Daniel McCarney	3a6a254c5c	Expose last issuance timestamp via expvar (#1982 ) This PR adds an expvar.Int published under the key "lastIssuance" that contains the timestamp of the last successful certificate issuance. This allows easy creation of a script that monitors the RA debug server (port 8002) to ensure that there has been a successful issuance within a set period (e.g. last five minutes). The underlying expvar.Int code uses the atomic package to ensure safe updates/reads across multiple goroutines. This resolves #1945 and was selected in place of the more complex circular bucket design. While the timestamp approach doesn't provide the issuance volume as readily it is less complex and meets the immediate need of a reliable external monitoring process hook. https://github.com/letsencrypt/boulder/pull/1982	2016-06-27 10:38:35 -07:00
Ben Irving	d3db851403	remove regID from WillingToIssue (#1957 ) The `regID` parameter in the PA's `WillingToIssue` function was originally used for whitelisting purposes, but is not used any longer. This PR removes it.	2016-06-22 12:21:07 -04:00
Ben Irving	77e64fef79	Disallow non-ASCII email addresses (#1953 ) This PR adds a check in the registration authority for non-ASCII encoded characters in an email address. This is due to a 'funky email implementation'. Fixes #1350	2016-06-21 17:53:38 -07:00
Jacob Hoffman-Andrews	4e0f96d924	Remove last vestiges of challenge.AccountKey. (#1949 ) This is a followup from https://github.com/letsencrypt/boulder/pull/1942. That PR stopped setting challenge.AccountKey. This one removes it entirely. Fixes #1948	2016-06-21 16:25:58 -07:00
Jacob Hoffman-Andrews	0535ac78d7	Stop setting AccountKey in challenges (#1942 ) In https://github.com/letsencrypt/boulder/pull/774 we introduced an account key stored with the challenge. This was a stopgap fix to the now-defunct SimpleHTTP and DNS challenges in the face of https://mailarchive.ietf.org/arch/msg/acme/F71iz6qq1o_QPVhJCV4dqWf-4Yc. However, we no longer offer or implement those challenges, so the extra field is unnecessary. It also takes up a huge amount of space in the challenges table, which is our biggest table. SimpleHTTP and DNS challenges were removed in https://github.com/letsencrypt/boulder/pull/1247. We can provide a follow-up migration to delete the column later, once we have a plan for large migrations without downtime. Fixes #1909	2016-06-20 14:26:53 -07:00
Daniel McCarney	778c9bba86	Fix FQDNSet rate limit exemption. (#1935 ) As reported in #1925 the Certificates per Domain rate limit was being incorrectly enforced on certificate renewals for FQDN sets that have been previously issued. This is counter to the described rate limit policies[0] that detail a separate rate limit for certificates issued for the "exact same set of Fully Qualified Domain Names". The bug was caused by the result of `domainsForRateLimiting` overwriting the original `names []string` provided to the RA's `checkCertificatesPerNameLimit` function. This meant instead of looking for an existing FQDN set for the full set of domain names being requested we checked for an FQDN set for just the eTLD+1's of the domains. (e.g "www.example.com, foo.example.com, bar.example.com" vs "example.com"). This commit preserves the original `names` values for doing an FQDN set lookup and uses the `tldNames` from `domainsForRateLimiting` elsewhere. This fixes #1925. A test is added to ensure that `checkCertificatesPerNameLimit` does the correct thing both with and without an existing FQDN set. [0] https://community.letsencrypt.org/t/rate-limits-for-lets-encrypt/6769	2016-06-16 13:50:39 -07:00
Daniel McCarney	cd2d1c4f6b	Allow removing registration contact. (#1923 ) The RA UpdateRegistration function merges a base registration object with an update by calling Registration.MergeUpdate. Prior to this commit MergeUpdate only allowed the updated registration object to overwrite the Contact field of the existing registration if the updated reg. defined at least one AcmeURL. This prevented clients from being able to outright remove the contact associated with an existing registration. This commit removes the len() check on the input.Contact in MergeUpdate to allow the r.Contact field to be overwritten by a []core.AcmeURL(nil) Contact field. Subsequently clients can now send an empty contacts list in the update registration POST in order to remove their reg contact. Fixes #1846 Allow removing registration contact. * Adds a test for `MergeUpdate` contact removal. * Change `Registration.Contact` type to `[]core.AcmeURL`. * End validateContacts early for empty contacts * Test removing reg. contact more thoroughly.	2016-06-13 11:02:29 -07:00
Daniel McCarney	9abc212448	Reuse valid authz for subsequent new authz requests (#1921 ) Presently clients may request a new AuthZ be created for a domain that they have already proved authorization over. This results in unnecessary bloat in the authorizations table and duplicated effort. This commit alters the `NewAuthorization` function of the RA such that before going through the work of creating a new AuthZ it checks whether there already exists a valid AuthZ for the domain/regID that expires in more than 24 hours from the current date. If there is, then we short circuit creation and return the existing AuthZ. When this case occurs the `RA.ReusedValidAuthz` counter is incremented to provide visibility. Since clients requesting a new AuthZ and getting an AuthZ back expect to turn around and post updates to the corresponding challenges we also return early in `UpdateAuthorization` when asked to update an AuthZ that is already valid. When this case occurs the `RA.ReusedValidAuthzChallenge` counter is incremented. All of the above behaviour is gated by a new RA config flag `reuseValidAuthz`. In the default case (false) the RA does not reuse any AuthZ's and instead maintains the historic behaviour: always creating a new AuthZ when requested, regardless of whether there are already valid AuthZ's that could be reused. In the true case (enabled only in `boulder-config-next.json`) the AuthZ reuse described above is enabled. Resolves #1854	2016-06-10 16:44:16 -04:00
Ben Irving	438580f206	Remove last of UseNewVARPC (#1914 ) `UseNewVARPC` is no longer necessary and is safe to be removed. We default to using the newer VA RPC code.	2016-06-09 10:12:46 -04:00
Daniel McCarney	4c289f2a8f	Reload ratelimit policy automatically at runtime (#1894 ) Resolves #1810 by automatically updating the RA ratelimit.RateLimitConfig whenever the backing config file is changed. Much like the Policy Authority uses a reloader instance to support updating the Hostname policy on the fly, this PR changes the Registration Authority to use a reloader for the rate limit policy file. Access to the ra.rlPolicies member is protected with a RWMutex now that there is a potential for the values to be reloaded while a reader is active. A test is introduced to ensure that writing a new policy YAML to the policy config file results in new values being set in the RA's rlPolicies instance. https://github.com/letsencrypt/boulder/pull/1894	2016-06-08 12:11:46 -07:00
Ben Irving	1336c42813	Replace all log.Err calls with log.AuditErr (#1891 ) * remove calls to log.Err() * go fmt * remove more occurrences * change AuditErr argument to string and replace occurrences	2016-06-06 16:27:16 -04:00
Jacob Hoffman-Andrews	92df4d0fc2	Rename authorities to shorter names. (#1878 ) Fixes #1875.	2016-06-03 13:35:28 -07:00


442 Commits