Commit Graph

45 Commits

Author SHA1 Message Date
Daniel McCarney de5fbbdb67
Implement CAA issueWild enforcement for wildcard names (#3266)
This commit implements RFC 6844's description of the "CAA issuewild
property" for CAA records.

We check CAA in two places: at the time of validation, and at the time
of issuance when an authorization is more than 8hours old. Both
locations have been updated to properly enforce issuewild when checking
CAA for a domain corresponding to a wildcard name in a certificate
order.

Resolves https://github.com/letsencrypt/boulder/issues/3211
2017-12-13 12:09:33 -05:00
Daniel McCarney 1c99f91733 Policy based issuance for wildcard identifiers (Round two) (#3252)
This PR implements issuance for wildcard names in the V2 order flow. By policy, pending authorizations for wildcard names only receive a DNS-01 challenge for the base domain. We do not re-use authorizations for the base domain that do not come from a previous wildcard issuance (e.g. a normal authorization for example.com turned valid by way of a DNS-01 challenge will not be reused for a *.example.com order).

The wildcard prefix is stripped off of the authorization identifier value in two places:

When presenting the authorization to the user - ACME forbids having a wildcard character in an authorization identifier.
When performing validation - We validate the base domain name without the *. prefix.
This PR is largely a rewrite/extension of #3231. Instead of using a pseudo-challenge-type (DNS-01-Wildcard) to indicate an authorization & identifier correspond to the base name of a wildcard order name we instead allow the identifier to take the wildcard order name with the *. prefix.
2017-12-04 12:18:10 -08:00
Daniel McCarney 55dd1020c0 Increase VA SingleDialTimeout to 10s. (#3260)
This PR changes the VA's singleDialTimeout value from 5 * time.Second to 10 * time.Second. This will give slower servers a better chance to respond, especially for the multi-VA case where n requests arrive ~simultaneously.

This PR also bumps the RA->VA timeout by 5s and the WFE->RA timeout by 5s to accommodate the increased dial timeout. I put this in a separate commit in case we'd rather deal with this separately.
2017-12-04 09:53:26 -08:00
Roland Bracewell Shoemaker 9da1bea433 Update histogram buckets for latencies that measure things over the internet (#3254)
Updates the buckets for histograms in the publisher, va, and expiration-mailer which are used to measure the latency of operations that go over the internet and therefore are liable to take a lot longer than the default buckets can measure. Uses a standard set of buckets for all three instead of attempting to tune for each one.

Fixes #3217.
2017-11-29 15:13:14 -08:00
Jacob Hoffman-Andrews 51991cd264 Fix logging of hostname in VA. (#3149)
The pbToAuthzMeta method in rpc/pb-marshalling.go only propagates ID and
registrationID, not hostname. So log the "domain" parameter instead.
2017-10-06 11:10:02 -07:00
Daniel McCarney 1794c56eb8 Revert "Add CAA parameter to restrict challenge type (#3003)" (#3145)
This reverts commit 23e2c4a836.
2017-10-04 12:00:44 -07:00
lukaslihotzki 23e2c4a836 Add CAA parameter to restrict challenge type (#3003)
This commit adds CAA `issue` paramter parsing and the `challenge` parameter to permit a single challenge type only. By setting `challenge=dns-01`, the nameserver keeps control over every issued certificate.
2017-10-02 11:59:47 -07:00
Jacob Hoffman-Andrews d2883f12c1 Remove TimingDuration call from VA (#3122)
Also switch over tests.

Fixes #3100
2017-09-28 14:25:22 -07:00
Daniel McCarney 966e02313f Forbid HTTP redirects to non-80/443 ports. (#3115)
Prior to this commit the VA would follow redirects from the initial
HTTP-01 challenge request on port 80 to any other port. In practice the
Let's Encrypt production environment has network egress firewall rules
that drop outbound requests that are not on port 80 or 443. In effect
this meant any challenge request that was redirected from 80 to a port
other than 80/443 was turned into a mysterious connection timeout error.

We have decided to preserve the egress firewall rule and continue to act
conservatively. Only port 80 and 443 should be allowed in redirects.

This commit updates the VA to return a clear error message when
a non-80/443 redirect is made.

To aid in testing/configuration the actual ports enforced are specified
by the va.httpPort and va.httpsPort that are used for the initial
outbound HTTP-01 connection.

The VA TestHTTPRedirectLookup unit test is updated accordingly to test
that a non-80/443 redirect fails with the expected message.

Resolves #3049
2017-09-25 10:19:10 -07:00
Jacob Hoffman-Andrews 568407e5b8 Remote VA logging and stats (#3063)
Add a logging statement that fires when a remote VA fail causes
overall failure. Also change remoteValidationFailures into a
counter that counts the same thing, instead of a histogram. Since
the histogram had the default bucket sizes, it failed to collect
what we needed, and produced more metrics than necessary.
2017-09-11 12:50:50 -07:00
Jacob Hoffman-Andrews 0d69b24fcc Move VA's CAA code into separate file (#3010)
va.go is quite a large file. This splits out the CAA-related code and tests into its own file for simplicity. This is a simple move; no code has been changed, and there is no package split.
2017-08-28 11:24:03 -07:00
Roland Bracewell Shoemaker 05d869b005 Rename DNSResolver -> DNSClient (#2878)
Fixes #639.

This resolves something that has bugged me for two+ years, our DNSResolverImpl is not a DNS resolver, it is a DNS client. This change just makes that obvious.
2017-07-18 08:37:45 -04:00
Daniel McCarney 2f53c202a6 Rename `validationTimeout`, document and clarify purpose. (#2866)
This commit renames the `validationTimeout` to `singleDialTimeout` and
adds a comment describing its purpose.
2017-07-12 11:35:14 -07:00
Roland Bracewell Shoemaker 77f1364e9a Return more detailed error for connection reset in va (#2860)
If we hit a `syscall.ECONNRESET` error return a more useful error than `Error getting validation data`, updates the `TestDetailedError` test to cover this case.

Fixes #2851.
2017-07-11 14:29:31 -07:00
Daniel McCarney 957a68c72b Fix HTTP-01 IPv6 to IPv4 fallback with fresh dialer per conn. (#2852)
The implementation of the dialer used by the HTTP01 challenge, constructed with `resolveAndConstructDialer`, used the same wrapped `net.Dialer` for both the initial IPv6 connection, and any subsequent IPv4 fallback connections. This caused the IPv4 fallback to never succeed for cases where the initial IPv6 connection expended the `validationTimeout`.

This commit updates the http01Dialer (newly renamed from `dialer` since it is in fact specific to HTTP01 challenges) to use a fresh dialer for each connection. To facilitate testing the http01Dialer maintains
a count of how many dialer instances it has constructed. We use this in a unit test to ensure the correct behaviour without a great deal of new mocking/interfaces.

Resolves #2770
2017-07-10 15:41:49 -04:00
Daniel McCarney 7120d72197 Improve error message for IPv6 failure with no IPv4 fallback. (#2844)
This commit improves the rather vague error message that was previously returned if an IPv6 challenge validation failed when IPv6First was enabled and there were no IPv4 addresses left to try as a fallback.

Resolves #2821
2017-07-05 15:54:30 -04:00
Jacob Hoffman-Andrews f710e574b3 Provide specific error for connection refused (#2843)
Fixes #2830.
2017-06-30 16:22:19 -07:00
Roland Bracewell Shoemaker 088b872287 Implement multi VA validation (#2802)
Adds basic multi-path validation functionality. A new method `performRemoteValidation` is added to `boulder-va` which is called if it is configured with a list of remote VA gRPC addresses. In this initial implementation the remote VAs are only used to check the validation result of the main VA, if all of the remote validations succeed but the local validation failed, the overall validation will still fail. Remote VAs use the exact same code as the local VA to perform validation. If the local validation succeeds then a configured quorum of the remote VA successes must be met in order to fully complete the validation.

This implementation assumes that metrics are collected from the remote VAs in order to have visibility into their individual validation latencies etc.

Fixes #2621.
2017-06-29 14:11:01 -07:00
Roland Bracewell Shoemaker 764658ab84 Remove CAA distributed resolver (#2804)
We never used it and it's been superseded by the multi-VA design.
2017-06-15 13:35:50 -04:00
Jacob Hoffman-Andrews f236ca522f Improve validation error messages. (#2791)
Previously, a lot of validations problems would give the message "Failed to
connect to X for ..." This was misleading because the issue was not always a
connection error, and when it was, it was valuable to distinguish between
connection refused and timeout. Also, for HTTP, this message would echo the
first URL in a redirect chain, when we really want the URL that failed.

Renames the misleading "parseHTTPConnError" and removes an inaccurate
check for temporary errors. It also eliminates the "detail" argument, instead
generating all messages inside the function.

Improves the handling of tls.alert errors to actually pass through the error
message, rather than just quietly changing the problem type (which was very
easy to miss).

Gives a specific error message for timeouts.

Preserves the URL from url.Error types and incorporates it into error messages.

Splits the HTTP timeout test into its own test case.
2017-06-12 10:35:22 -04:00
Jacob Hoffman-Andrews 0bfb542514 Use fields, not globals, for stats (#2790)
Following up on #2752, we don't need to use global vars for our Prometheus stats. We already have a custom registry plumbed through using Scope objects. In this PR, expose the MustRegister method of that registry through the Scope interface, and move existing global vars to be fields of objects. This should improve testability somewhat.

Note that this has a bit of an unfortunate side effect: two instances of the same stats-using class (e.g. VA) can't use the same Scope object, because their MustRegister calls will conflict. In practice this is fine since we never instantiate duplicates of the the classes that use stats, but it's something we should keep an eye on.

Updates #2733
2017-06-06 12:09:31 -07:00
Daniel c905cfb8db
Rewords IPv6 -> IPv4 fallback error messages. 2017-05-11 10:35:30 -04:00
Daniel McCarney 47452d6c6c Prefer IPv6 addresses, fall back to IPv4. (#2715)
This PR introduces a new feature flag "IPv6First". 

When the "IPv6First" feature is enabled the VA's HTTP dialer and TLS SNI
(01 and 02) certificate fetch requests will attempt to automatically
retry when the initial connection was to IPv6 and there is an IPv4
address available to retry with.

This resolves https://github.com/letsencrypt/boulder/issues/2623
2017-05-08 13:00:16 -07:00
David Calavera cc5ee3906b Refactor IsSane and IsSane* to return useful errors. (#2685)
This change changes the returning values from boolean to error.

It makes `checkConsistency` an internal function and removes the
optional argument in favor of making checks explicit where they are
used.

It also renames those functions to CheckConsistency* to not
give the impression of still returning boolean values.

Signed-off-by: David Calavera <david.calavera@gmail.com>
2017-04-19 12:08:47 -04:00
Jacob Hoffman-Andrews d849f58cec Fix "valiation typo in VA. (#2676) 2017-04-12 11:12:22 -07:00
Jacob Hoffman-Andrews 4b665e35a6 Use Prometheus stats for VA, WFE, and OCSP Responder (#2628)
Rename HTTPMonitor to MeasuredHandler.
Remove inflight stat (we didn't use it).
Add timing stat by method, endpoint, and status code.
The timing stat subsumes the "rate" stat, so remove that.
WFE now wraps in MeasuredHandler, instead of relying on its cmd/main.go.
Remove FBAdapter stats.
MeasuredHandler tracks stats by method, status code, and endpoint.

In VA, add a Prometheus histogram for validation timing.
2017-04-03 17:03:04 -07:00
Roland Bracewell Shoemaker e2b2511898 Overhaul internal error usage (#2583)
This patch removes all usages of the `core.XXXError` and almost all usages of `probs` outside of the WFE and VA and replaces them with a unified internal error type. Since the VA uses `probs.ProblemDetails` quite extensively in challenges, and currently stores them in the DB I've saved this change for another change (it'll also require a migration). Since `ProblemDetails` should only ever be exposed to end-users all of its related logic should be moved into the `WFE` but since it still needs to be exposed to the VA and SA I've left it in place for now.

The new internal `errors` package offers the same convenience functions as `probs` does as well as a new simpler type testing method. A few small changes have also been made to error messages, mainly adding the library and function name to internal server errors for easier debugging (i.e. where a number of functions return the exact same errors and there is no other way to distinguish which method threw the error).

Also adds proper encoding of internal errors transferred over gRPC (the current encoding scheme is kept for `core` and `probs` errors since it'll be ideally be removed after we deploy this and follow-up changes) using `grpc/metadata` instead of the gRPC status codes.

Fixes #2507. Updates #2254 and #2505.
2017-03-22 23:27:31 -07:00
David Calavera c71c3cff80 Implement TLS-SNI-02 challenge validations. (#2585)
I think these are all the necessary changes to implement TLS-SNI-02 validations, according to the section 7.3 of draft 05:

https://tools.ietf.org/html/draft-ietf-acme-acme-05#section-7.3

I don't have much experience with this code, I'll really appreciate your feedback.

Signed-off-by: David Calavera <david.calavera@gmail.com>
2017-03-22 10:17:59 -07:00
Daniel McCarney e81f7477a3 Fixes outdated IPv6 TODO on `getAddr`. (#2601)
The VA's `getAddr` function prior to this commit had an outdated comment
& a pointer to a TODO for Boulder Issue #593. That issue has been closed
and bdns' `LookupHost` supports AAAA records now. This commit updates
the comment to match the current behaviour and removes the TODO.
2017-03-09 13:20:03 -05:00
Daniel McCarney 3fa950ac58 Improve VA TLS-SNI-01 challenge failure error. (#2527)
Previous to this PR the VA's validateTLSWithZName function would
return an error message containing the SAN names of the leaf certificate
when the validation failed. This commit updates that message to include
the Subject Common Name of the leaf cert in addition to the SANs. The
names are deduplicated to prevent listing a Subj CN twice if its also
a SAN. This will help debug cases where a cert with no SANs is returned
by the server.

In addition, the number of certificates in the chain received from the
server is included in the message. This will hopefully further help
users identify misconfiguration since a TLS SNI 01 challenge response
should have a chain length of 1.

Resolves #2468
2017-01-27 10:05:42 -08:00
Roland Bracewell Shoemaker 170e37c675 Add a special error message if we are trying to talk TLS to a HTTP-only server (#2511)
If the VA fails to validate a TLS-SNI-01 challenge because it is trying to talk TLS to a HTTP-only server return a special error message that is slightly more informative.
2017-01-20 11:36:39 -05:00
Josh Soref 8adf9d41cf Spelling (#2500)
Various spelling fixes.
2017-01-16 10:44:52 -05:00
Daniel McCarney d4902820ca Adds unique VA DNS validation error for empty TXTs. (#2401)
Presently when the VA performs a DNS-01 challenge verification it
returns the same error for the case where the remote nameserver had the
**wrong** TXT value, and when the remote nameserver had an **empty**
response for the TXT query. It would aid debugging if the user was told
which of the two failure cases was responsible for the overall challenge
failure.

This commit adds a unique error message for the empty TXT records case,
and a unit test/mock to exercise the new the error message.

Resolves #2326
2016-12-08 11:27:28 -08:00
Jacob Hoffman-Andrews ec7af70aad Log hostname in VA validation result. (#2386)
This makes it easier to analyze logs for repeated failures on the same hostname.
2016-12-02 18:06:46 -08:00
Daniel 970f3aa422
Drops unneeded else block 2016-11-30 13:40:14 -05:00
Roland Bracewell Shoemaker e2155388a1 Remove caa-checker from the tree (#2351)
The VA can internally check CAA and this additional code was deemed unneeded complexity that could be hoisted outside of Boulder. Fixes #2346.
2016-11-23 08:42:33 -05:00
Roland Bracewell Shoemaker c8f1fb3e2f Remove direct usages of go-statsd-client in favor of using metrics.Scope (#2136)
Fixes #2118, fixes #2082.
2016-09-07 19:35:13 -04:00
Ben Irving ce0016bc08 HTTP-01 gives misleading "could not connect" error (#2142)
Fixes #2122
2016-08-29 11:45:32 -07:00
Blake Griffith 344a312905 Remove audit comments -- closes #2129 (#2139)
Closes #2129

* Remove audit comments.
* Nuke doc/requirements/*
2016-08-25 18:23:42 -07:00
Jacob Hoffman-Andrews fadc1f5baf Log cert in TLS-SNI challenge. (#2127)
Fixes #2126
2016-08-15 11:40:28 -07:00
Roland Bracewell Shoemaker 6264706557 Fix distributed CAA resolver (#1813)
While testing with real proxies I noticed the original CDR implementation was actually pretty broken, this refactors a bit and fixes a number of bugs. With this patch fallback to GPDNS over three distributed test proxies worked perfectly.

(Side note: `nginx` is not a viable forward proxy for this use as it doesn't support SSL, and a bunch of other _real_ forward proxy features, I ended up just using `squid3`.)

The main error in the previous implementation was the fallback was implemented in `getCAASet` which is only called in the old code path (the local CAA impl instead of the remote service) which mean't it wasn't actually being tested in the integration test. This also refactors a few repeated blocks into their own functions. Also there was a unicode encoding problem somewhere with the query string but for the life of me I can't figure out why it was broken now.
2016-06-23 11:30:20 -07:00
Jacob Hoffman-Andrews 0535ac78d7 Stop setting AccountKey in challenges (#1942)
In https://github.com/letsencrypt/boulder/pull/774 we introduced and account key stored with the challenge. This was a stopgap fix to the now-defunct SimpleHTTP and DNS challenges in the face of https://mailarchive.ietf.org/arch/msg/acme/F71iz6qq1o_QPVhJCV4dqWf-4Yc. However, we no longer offer or implement those challenges, so the extra field is unnecessary. It also take up a huge amount of space in the challenges table, which is our biggest table. SimpleHTTP and DNS challenges were removed in https://github.com/letsencrypt/boulder/pull/1247.

We can provide a follow-up migration to delete the column later, once we have a plan for large migrations without downtime.

Fixes #1909
2016-06-20 14:26:53 -07:00
Ben Irving 7e7ccde5c9 Change error message for invalid IPs (#1934)
Clarifies the `UnknownHost` problem details error message created in the VA's `getAddr` when there is no valid IP address for the domain. 

Previously this was reported as "No IPv4 addresses found for x" leading to user confusion (ref #1790) when a domain resolved to a private IP.
2016-06-16 09:42:33 -04:00
Ben Irving 1336c42813 Replace all log.Err calls with log.AuditErr (#1891)
* remove calls to log.Err()
* go fmt
* remove more occurrences
* change AuditErr argument to string and replace occurrences
2016-06-06 16:27:16 -04:00
Jacob Hoffman-Andrews 92df4d0fc2 Rename authorities to shorter names. (#1878)
Fixes #1875.
2016-06-03 13:35:28 -07:00