Commit Graph

581 Commits

Author SHA1 Message Date
Daniel McCarney ddfc620c44
va: exempt multi-va enforcement by domain/acct ID. (#4458)
In order to move multi perspective validation forward we need to support policy
in Boulder configuration that can relax multi-va requirements temporarily.

A similar mechanism was used in support of the gradual deprecation of the
TLS-SNI-01 challenge type and with the introduction of CAA enforcement and has
shown to be a helpful tool to have available when introducing changes that are
expected to break sites.

When the VA "multiVAPolicyFile" is specified it is assumed to be a YAML file
containing two lists:

1. disabledNames - a list of domain names that are exempt from multi VA
   enforcement.
2. disabledAccounts - a list of account IDs that are exempt from multi VA
   enforcement.

When a hostname or account ID is added to the policy we'll begin communication
with the related ACME account contact to establish that this is a temporary
measure and the root problem will need to be addressed before an eventual
cut-off date.

Resolves https://github.com/letsencrypt/boulder/issues/4455
2019-10-07 16:43:11 -04:00
Daniel McCarney 93902965e5 Add Go 1.13 support, temporarily disable TLS 1.3 default. (#4435)
A unit test is included to verify that a TLS-ALPN-01 challenge to
a TLS 1.3 only server doesn't succeed when the `GODEBUG` value to
disable TLS 1.3 in `docker-compose.yml` is set. Without this env var
the test fails on the Go 1.13 build because of the new default:

```
=== RUN   TestTLSALPN01TLS13
--- FAIL: TestTLSALPN01TLS13 (0.04s)
    tlsalpn_test.go:531: expected problem validating TLS-ALPN-01 challenge against a TLS 1.3 only server, got nil
    FAIL
    FAIL        github.com/letsencrypt/boulder/va       0.065s
```

With the env var set the test passes, getting the expected connection
problem reporting a tls error:

```
=== RUN   TestTLSALPN01TLS13
2019/09/13 18:59:00 http: TLS handshake error from 127.0.0.1:51240: tls: client offered only unsupported versions: [303 302 301]
--- PASS: TestTLSALPN01TLS13 (0.03s)
PASS
ok      github.com/letsencrypt/boulder/va       1.054s
```

Since we plan to eventually enable TLS 1.3 support and the `GODEBUG`
mechanism tested in the above test is platform-wide vs package
specific I decided it wasn't worth the time investment to write a
similar HTTP-01 unit test that verifies the TLS 1.3 behaviour on a
HTTP-01 HTTP->HTTPS redirect.

Resolves https://github.com/letsencrypt/boulder/issues/4415
2019-09-17 11:00:58 -07:00
Daniel McCarney d67d76388c
va: include hostname in remote VA differentials. (#4411)
Also rename the `RemoteVA.Addresses` field. The address is always
a singular value.
2019-08-30 13:32:44 -04:00
Daniel McCarney fe23dabd69 va: add challenge type to remote VA differentials. (#4410)
This will make data analysis of the differentials easier. Along the way
I also added a unit test for `logRemoteValidationDifferentials`.
2019-08-29 14:41:14 -07:00
Daniel McCarney 4a6e34fc4e
va: clean up DNS error handling for HTTP-01 challenges. (#4409)
This PR changes the VA to return `dns` problem type for errors when performing
HTTP-01 challenges for domains that have no IP addresses, or errors looking up
the IP addresses.

The `va.getAddrs` function is internal to the VA and can return
`berrors.BoulderError`s with a DNS type when there is an error, allowing the
calling code to convert this to a problem when required
using an updated `detailedError` function. This avoids some clunky conversion
the HTTP-01 code was doing that misrepresented DNS level errors as connection
problems with a DNS detail message.

In order to add an integration test for challenge validation that results in
`getAddrs` DNS level errors the Boulder tools image had to be bumped to a tag
that includes the latest `pebble-challtestsrv` that
supports mocking SERVFAILs. It isn't possible to mock this case with internal IP
addresses because our VA test configuration does not filter internal addresses
to support the testing context.

Additionally this branch removes the `UnknownHostProblem` from the `probs`
package:

1. It isn't used anywhere after 532c210
2. It's not a real RFC 8555 problem type. We should/do use the
   DNS type for this.

Resolves https://github.com/letsencrypt/boulder/issues/4407
2019-08-28 15:47:35 -04:00
alexzorin df2909a7ca va: Send extValue in TLSALPN unauthorized response (#4330)
Brings it to be more in line with the responses from the other two challenges and
will hopefully make the challenge a lot easier to debug (like in the recent community 
thread).

```json
"error": {
  "type": "urn:ietf:params:acme:error:unauthorized",
  "detail": "Incorrect validation certificate for tls-alpn-01 challenge. Expected acmeValidationV1 extension value 836bf5358f8a32826c61faeff2e0225b00756f935b00ed3002cabb9d536b9f53 for this challenge but got 8539b12e31c306b81a0aedab4128722c6ad71f71f46316a3c71612f47df0e532",
  "status": 403
},
```
2019-07-11 09:08:14 -07:00
Jacob Hoffman-Andrews c8dbbf005d Handle unprintable characters in HTTP responses. (#4312)
Fixes #4244.
2019-07-02 13:42:55 -04:00
Roland Bracewell Shoemaker 6f93942a04 Consistently used stdlib context package (#4229) 2019-05-28 14:36:16 -04:00
Roland Bracewell Shoemaker e839042bae dns: Remove Authorities field from ValidationRecord (#4230) 2019-05-28 14:11:32 -04:00
Daniel McCarney ea9871de1e core: split identifier types into separate package. (#4225)
This will allow implementing sub-problems without creating a cyclic
dependency between `core` and `problems`.

The `identifier` package is somewhat small/single-purpose and in the
future we may want to move more "ACME" bits beyond the `identifier`
types into a dedicated package outside of `core`.
2019-05-23 13:24:41 -07:00
Daniel McCarney 1d9de1cae0
va: fix flaky test_http2_http01_challenge int. test. (#4222)
In some rare conditions the malformed HTTP response error message that
we match in the VA for HTTP-01 connections to HTTP/2 servers will be
returned as a raw `http.badStringError` that doesn't have a transport
connection broken prefix. In these cases the existing
`test_http2_http01_challenge` integration tests fails because the
`h2SettingsFrameErrRegex` doesn't match the returned error.

To accommodate this we make the `h2SettingsFrameErrRegex` optionally
match the transport connection broken prefix.
2019-05-23 12:42:58 -04:00
alexzorin 105fe3b8e1 va: case-insensitivity of suffixes in http redirs (#4218)
An URI host is supposed to be case-insensitive.
2019-05-16 10:52:05 -04:00
Daniel McCarney 4229a29142
va: fix validationTime metric w/ multi-va full results no enforce. (#4217) 2019-05-15 12:59:46 -04:00
Jacob Hoffman-Andrews 4c420e2bc2 bdns: Remove LookupMX. (#4202)
We used to use this for checking email domains on registration, but not
anymore.
2019-05-06 09:29:44 -04:00
Daniel McCarney e050820fcc
va: add specific error for HTTP-01 to HTTP/2 server. (#4172)
In practice it seems the only way to add a specific error for when an initial HTTP-01 challenge request is made to an HTTP/2 server mis-configured on `:80` is with a regex on the error string.

The error returned from the stdlib `http.Client` for HTTP to an HTTP/2 server is just an `errors.ErrorString` instance without any context (once you peel it out of the wrapping `url.Error`):
>  Err:(*errors.errorString)(0xc420609bf0)}] errStr=[Get http://example.com/.well-known/acme-challenge/xxxxxxx: net/http: HTTP/1.x transport connection broken: malformed HTTP response "\x00\x00\x12\x04\x00\x00\x00\x00\x00\x00\x03\x00\x00\x00\x80\x00\x04\x00\x01\x00\x00\x00\x05\x00\xff\xff\xff\x00\x00\x04\b\x00\x00\x00\x00\x00\u007f\xff\x00\x00\x00\x00\b\a\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01"]

Even directly in the stdlib code at the place in `http/response.go` that generates the error it's using a `&badStringError{}` and just putting the byte string that `textproto` read into it.

To detect this case in `detailedError` I added a pre-compiled regex that will match the net/http malformed HTTP response error for raw bytes matching an arbitrarily sized HTTP/2 SETTINGS frame. Per RFC "A SETTINGS frame MUST be sent by both endpoints at the start of a connection" and so this seems like a fairly reliable indicator of an unexpected HTTP/2 response in an HTTP/1.1 context.

Thanks to @mnordhoff for the detailed notes (and RFC refs) in #3416 It made this a lot easier!

Resolves #3416.
2019-04-23 14:56:37 -04:00
Daniel McCarney 2f3c703a72 va: add specific error for broken HTTP-01 redirects. (#4171)
Often folks will mis-configure their webserver to send an HTTP redirect missing
a `/' between the FQDN and the path.

E.g. in Apache using:

Redirect / https://bad-redirect.org

Instead of:

Redirect / https://bad-redirect.org/

Will produce an invalid HTTP-01 redirect target like:

https://bad-redirect.org.well-known/acme-challenge/xxxx

This happens frequently enough we want to return a distinct error message
for this case by detecting the redirect targets ending in ".well-known".

After the "Simple HTTP-01" code landed this case was previously getting an error
message of the form:
> "Invalid hostname in redirect target, must end in IANA registered TLD"

Resolves https://github.com/letsencrypt/boulder/issues/3606
2019-04-23 10:50:47 -07:00
Daniel McCarney cc0d15841f va: split up va.go by challenge type. (#4170)
When I introduced the new HTTP-01 code I did it in `va/http.go` intending to try and make the very large `va.go` file a little bit smaller. This is the continuation of that work.

* f96ad92 - moves remaining HTTP-01 specific code to `va/http.go`.
* 1efb9a1 - moves TLS-ALPN-01 code into `va/tlsalpn.go`.
* 95ea567 - moves DNS-01 code into `va/dns.go`.
* 6ff0395 - moves unit tests from `va/va_test.go` into `va/http_test.go`, `va/tlsalpn_test.go` and `va/dns_test.go`.

In the end `va/va.go` contains code related to metrics, top level RPCs (e.g. `PerformValidation`), and the multi-VA code. This makes the file lengths much more manageable overall.

Note: There is certainly room for cleaning up some of the older unit test cruft from `va/va_test.go`. For now I only moved it as-is into the challenge specific test files.
2019-04-19 11:34:58 -07:00
Jacob Hoffman-Andrews d2d5f0a328 Update miekg/dns and golang/protobuf. (#4150)
Precursor to #4116. Since some of our dependencies impose a minimum
version on these two packages higher than what we have in Godeps, we'll
have to bump them anyhow. Bumping them independently of the modules
update should keep things a little simpler.

In order to get protobuf tests to pass, I had to update protoc-gen-go in
boulder-tools. Now we download a prebuilt binary instead of using the
Ubuntu package, which is stuck on 3.0.0. This also meant I needed to
re-generate our pb.go files, since the new version generates somewhat
different output.

This happens to change the tag for pbutil, but it's not a substantive change - they just added a tagged version where there was none.

$ go test github.com/miekg/dns/...
ok      github.com/miekg/dns    4.675s
ok      github.com/miekg/dns/dnsutil    0.003s

ok      github.com/golang/protobuf/descriptor   (cached)
ok      github.com/golang/protobuf/jsonpb       (cached)
?       github.com/golang/protobuf/jsonpb/jsonpb_test_proto     [no test files]
ok      github.com/golang/protobuf/proto        (cached)
?       github.com/golang/protobuf/proto/proto3_proto   [no test files]
?       github.com/golang/protobuf/proto/test_proto     [no test files]
ok      github.com/golang/protobuf/protoc-gen-go        (cached)
?       github.com/golang/protobuf/protoc-gen-go/descriptor     [no test files]
ok      github.com/golang/protobuf/protoc-gen-go/generator      (cached)
ok      github.com/golang/protobuf/protoc-gen-go/generator/internal/remap       (cached)
?       github.com/golang/protobuf/protoc-gen-go/grpc   [no test files]
?       github.com/golang/protobuf/protoc-gen-go/plugin [no test files]
ok      github.com/golang/protobuf/ptypes       (cached)
?       github.com/golang/protobuf/ptypes/any   [no test files]
?       github.com/golang/protobuf/ptypes/duration      [no test files]
?       github.com/golang/protobuf/ptypes/empty [no test files]
?       github.com/golang/protobuf/ptypes/struct        [no test files]
?       github.com/golang/protobuf/ptypes/timestamp     [no test files]
?       github.com/golang/protobuf/ptypes/wrappers      [no test files]
2019-04-09 09:27:28 -07:00
Jacob Hoffman-Andrews ff3129247d Put features.Reset in unitest setup functions. (#4129)
Previously we relied on each instance of `features.Set` to have a
corresponding `defer features.Reset()`. If we forget that, we can wind
up with unexpected behavior where features set in one test case leak
into another test case. This led to the bug in
https://github.com/letsencrypt/boulder/issues/4118 going undetected.

Fix #4120
2019-04-02 10:14:38 -07:00
Daniel McCarney 063a98f02a
VA: additional feature flag control for multiVA. (#4122)
* `EnforceMultiVA` to allow configuring multiple VAs but not changing the primary VA's result based on what the remote VAs return.
* `MultiVAFullResults` to allow collecting all of the remote VA results. When all results are collected a JSON log line with the differential between the primary/remote VAs is logged.

Resolves https://github.com/letsencrypt/boulder/issues/4066
2019-03-25 12:23:53 -04:00
Daniel McCarney 56c18f6d96
VA: reject HTTP-01 redirects to bad hostnames earlier. (#4123)
* A redirect without a hostname is obviously bad and should get
a distinct error message as early as possible.

* A redirect to a hostname that doesn't end in an IANA registered TLD is
also obviously bad and should get a distinct error message as early as
possible.
2019-03-19 08:31:38 -04:00
Jacob Hoffman-Andrews 7891cec305 Fix includes. (#4119)
Two recent PRs had a bad merge and resulted in this unused include left behind.
2019-03-15 11:05:57 -07:00
Jacob Hoffman-Andrews 677b9b88ad Remove GSB support. (#4115)
This is no longer enabled in prod; cleaning up the code.

https://community.letsencrypt.org/t/let-s-encrypt-no-longer-checking-google-safe-browsing/82168
2019-03-15 10:24:44 -07:00
Jacob Hoffman-Andrews d1e6d0f190 Remove TLS-SNI-01 (#4114)
* Remove the challenge whitelist
* Reduce the signature for ChallengesFor and ChallengeTypeEnabled
* Some unit tests in the VA were changed from testing TLS-SNI to testing the same behavior
  in TLS-ALPN, when that behavior wasn't already tested. For instance timeouts during connect 
  are now tested.

Fixes #4109
2019-03-15 09:05:24 -04:00
Jacob Hoffman-Andrews 72b361d7a7 Shave 200ms off context for HTTP validations. (#4101)
Our integration test test_http_challenge_timeout occasionally fails with

boulder-ra [AUDIT] Could not communicate with VA: rpc
  error: code = DeadlineExceeded desc = context deadline exceeded

In at least one of these cases, the VA correctly timed-out its HTTP
request and logged a validation error with the correct error message.
I believe that there is a race between the VA returning its validation
error to the RA, and the RA timing out its gRPC call. By shaving some
time off the context we should more reliably get the response back to
the RA.

The order the primary VA calls `PerformValidation` on configured 
remote VAs is also changed to be done in a random order.

Resolves #4087
2019-03-11 13:46:56 -04:00
Daniel McCarney 9f5c1b9e25 VA: Remove legacy HTTP-01 validation code. (#4102)
We're only using the simplified HTTP-01 code from `va/http.go` now 🎉 The old unit tests that still seem relevant are left in place in `va/va_test.go` instead of being moved to `va/http_test.go` to signal that they're a bit crufty and could probably use a separate cleanup. For now I'm hesitant to remove test coverage so I updated them in-place without moving them to a new home.

Resolves https://github.com/letsencrypt/boulder/issues/4089
2019-03-08 11:57:39 -08:00
Romuald Brunet cc4ce59d7d VA: fix typo in checkCAARecords comment (#4103) 2019-03-08 13:03:39 -05:00
Jacob Hoffman-Andrews f816ca5e0d Improve error in TestPerformRemoveValidationFail (#4100)
When this test fails, it logs the fact that it got the wrong type of
ProblemDetail, but not what the actual ProblemDetail was. Fixing that
will make it easier to track down intermittent failures.
2019-03-07 13:53:27 -08:00
Roland Bracewell Shoemaker 232a5f828f Fix ineffectual assignments (#4052)
* in boulder-ra we connected to the publisher and created a publisher gRPC client twice for no apparent reason
* in the SA we ignored errors from `getChallenges` in `GetAuthorizations` which could result in a nil challenge being returned in an authorization
2019-02-13 15:39:58 -05:00
Jacob Hoffman-Andrews 57f97eb550 Remove ValidationRecords from VA logEvent. (#4054)
These are included in the challenge object, which also gets logged, so
including them twice was approximately doubling the size of the VA logs.
2019-02-13 10:57:16 -08:00
Daniel McCarney 2b12c6acc8 VA: Promote singleDialTimeout, add preresolvedDialer timeout test. (#4049)
The `singleDialTimeout` field was previously a global `const` in the
`va` package. Making it a field of the VA impl (and the dialer structs)
makes it easier to test that it is working as expected with a smaller
than normal value.

A new `TestPreresolvedDialerTimeout` unit test is added that tests the
fix from https://github.com/letsencrypt/boulder/pull/4046

Without the fix applied:
```
=== RUN   TestPreresolvedDialerTimeout
--- FAIL: TestPreresolvedDialerTimeout (0.49s)
    http_test.go:86: fetch didn't timeout after 50ms
    FAIL
    FAIL  github.com/letsencrypt/boulder/va 0.512s
```

With the fix applied:
```
=== RUN   TestPreresolvedDialerTimeout
--- PASS: TestPreresolvedDialerTimeout (0.05s)
PASS
ok    github.com/letsencrypt/boulder/va 1.075s
```
2019-02-12 13:59:51 -08:00
Daniel McCarney 1c0be52e53 VA: Add integration test for HTTP timeouts. (#4050)
Also update `TestHTTPTimeout` to test with the `SimplifiedVAHTTP`
feature flag enabled.
2019-02-12 13:42:01 -08:00
Daniel McCarney c37355b40b VA: Use correct Timeout for SimplifiedVAHTTP reqs. (#4046)
The `DefaultTransport`'s `DialContext` sets a `Timeout` and `KeepAlive`
of 30 seconds. When configured with the `SimplifiedVAHTTP` feature flag
we need to use a shorter `Timeout`.
2019-02-11 11:03:03 -08:00
Daniel McCarney 98663717d8
VA: Rework SimplifiedVAHTTP for pre-resolved dials. (#4016)
The URL construction approach we were previously using for the refactored VA HTTP-01 validation code was nice but broke SNI for HTTP->HTTPS redirects. In order to preserve this functionality we need to use a custom `DialContext` handler on the HTTP Transport that overrides the target host to use a pre-resolved IP.

Resolves https://github.com/letsencrypt/boulder/issues/3969
2019-01-21 15:08:40 -05:00
Roland Bracewell Shoemaker 93ac7fbe9e
Modify authorization creation to allow for new style storage schema (#3998)
Adds a feature which gates creation of authorizations following the style required for the new schema (and which can be used for gating the reset of our new schema code later down the road).

There was an internal discussion about an issue this creates regarding a predictable ordering of challenges within a challenge due to sequential challenge IDs which will always be static for each challenge type. It was suggested we could add some kind of obfuscation to the challenge ID when presented to the user to prevent this. This hasn't been done in this PR as it would only be focused in the WFE and would be better suited as its own changeset.

Fixes #3981.
2019-01-17 17:09:38 -08:00
Daniel McCarney 11433e1ea0
VA: Fix SimplifiedVAHTTP01 redirect query param handling. (#3988)
When the `SimplifiedVAHTTP01` feature flag is enabled we need to
preserve query parameters when reconstructing a redirect URL for the
resolved IP address.

To add integration testing for this condition the Boulder tools images
are updated to in turn pull in an updated `pebble-challtestsrv` command
that tracks request history.

A new Python wrapper for the `pebble-challtestsrv` HTTP API is added to
centralize interacting with the chall test srv to add mock data and to
get the history of HTTP requests that have been processed.
2019-01-04 14:20:44 -05:00
Daniel McCarney 5fbed0c49e
VA: Replace invalid UTF-8 in cert contents, proactively marshal. (#3973)
Marshaling invalid UTF-8 strings to protocol buffers causes an error. This can
happen in VA `PerformValidation` RPC responses if remote servers return invalid
UTF-8 in some ACME challenge contexts. We previously fixed this for HTTP-01 and
DNS-01 but missed a case where TLS-ALPN-01/TLS-SNI-01 challenge response
certificate content was included in error messages without replacing invalid
UTF-8. That's now fixed & unit tests are added.

To aid in diagnosing any future instances the VA is also updated to proactively
attempt to marshal its `PerformValidation` results before handing off to the RPC
wrappers that will do the same. This way if we detect an error in marshaling the
VA can audit log the escaped content for investigation purposes.

Hopefully with these two efforts combined we can avoid any future VA RPC errors
from UTF-8 encoding.

Resolves https://github.com/letsencrypt/boulder/issues/3838
2018-12-07 14:58:12 -05:00
Jacob Hoffman-Andrews 1ad8c70c36 Lower log level for following redirects. (#3972) 2018-12-06 16:47:40 -08:00
Daniel McCarney bd4c254942
Use Challtestsrv for HTTP-01 integration tests, add redirect tests (#3960)
To complete https://github.com/letsencrypt/boulder/issues/3956 the `challtestsrv` is updated such that its existing TLS-ALPN-01 challenge test server will serve HTTP-01 responses with a self-signed certificate when a non-TLS-ALPN-01 request arrives. This lets the TLS-ALPN-01 challenge server double as a HTTPS version of the HTTP challenge server. The `challtestsrv` now also supports adding/remove redirects that will be served to clients when requesting matching paths.

The existing chisel/chisel2 integration tests are updated to use the `challtestsrv` instead of starting their own standalone servers. This centralizes our mock challenge responses and lets us bind the `challtestsrv` to the VA's HTTP port in `startservers.py` without clashing ports later on.

New integration tests are added for HTTP-01 redirect scenarios using the updated `challtestserv`. These test cases cover:
* valid HTTP -> HTTP redirect
* valid HTTP -> HTTPS redirect
* Invalid HTTP -> non-HTTP/HTTPS port redirect
* Invalid HTTP-> non-HTTP/HTTPS protocol scheme redirect
* Invalid HTTP-> bare IP redirect
* Invalid HTTP redirect loop

The new integration tests shook out two fixes that were required for the legacy VA HTTP-01 code (afad22b) and one fix for the challtestsrv mock DNS (59b7d6d).

Resolves https://github.com/letsencrypt/boulder/issues/3956
2018-11-30 17:20:10 -05:00
Daniel McCarney 8a610e5828
VA: Fix SimplifiedHTTP01 w/ HTTPS redirects. (#3959)
The VA code activated by the `SimplifiedHTTP01` flag had a regression
that constructed validation URLs for HTTP-01 challenge redirects to
a HTTPS host incorrectly.

The regression is fixed and the unit tests are updated to cover this
case. An integration test is not included in this commit but will be
done as follow-up work since it requires adjusting existing integration
tests for HTTP-01 to use the challtestrv instead of Certbot's ACME
standalone server.
2018-11-30 08:12:24 -05:00
Daniel McCarney d9d2f4e9b0
VA: Simplified HTTP-01 w/ IP address URLs (#3939)
Continued bugs from the custom dialer approach used by the VA for HTTP-01 (most recently https://github.com/letsencrypt/boulder/issues/3889) motivated a rewrite.

Instead of using a custom dialer to be able to control DNS resolution for HTTP validation requests we can construct URLs for the IP addresses we resolve and overload the Host header. This avoids having to do address resolution within the dialer and eliminates the complexity of the dialer `addrInfoChan`. The only thing left for our custom dialer now is to shave some time off of the provided context to help us discern timeouts before/after connect.

The existing IP preference & fallback behaviour is preserved: e.g. if a host has both IPv6 and IPv4 addresses we connect to the first IPv6 address. If there is a network error connecting to that address (e.g. an error during "dial"), we try once more with the first IPv4 address. No other retries are done. Matching existing behaviour no fallback is done for HTTP level failures on an IPv6 address (e.g. mismatched webroots, redirect loops, etc). A new Prometheus counter "http01_fallbacks" is used to keep track of the number of fallbacks performed.

As a result of moving the layer at which the retry happens a fallback like described above will now produce two validation records: one for the initial IPv6 connection, and one for the IPv4 connection. Neither will have the "addressesTried" field populated, just "addressesResolved" and "addressUsed". Previously with the dialer doing the retry we would have created just one validation record with an IPv4 "addressUsed" field and both an IPv6 and IPv4 address in the "addressesTried" field.

Because this is a big diff for a key part of the VA the new code is gated by the `SimplifiedVAHTTP` feature flag.

Resolves #3889
2018-11-19 14:15:39 -05:00
Jacob Hoffman-Andrews b1be4ccaed Fix latency logging. (#3937)
In the VA, we were rendering a Duration to JSON, which gave an integer
number of nanoseconds rather than a float64 of seconds. Also, in both VA
and WFE we were rendering way more precision than we needed. Millisecond
precision is enough, and since we log latency for every WFE response,
the extra bytes are worth saving.
2018-11-14 10:52:48 -05:00
Jacob Hoffman-Andrews 714457badc Add a deadline to TLS handshake. (#3921)
Previously, if a TLS handshake timed out, we would block forever in
`conn.Handshake()`, leaking both a TCP connection and a goroutine.
This sets a deadline on the underlying TCP connection, ensuring that
`conn.Handshake()` eventually times out.

Fixes #3915
2018-11-05 15:44:08 -05:00
Roland Bracewell Shoemaker 3c2888a49e Add a counter for the tls alpn OID that is used (#3914)
Fixes #3913.
2018-10-31 13:12:11 -04:00
Roland Bracewell Shoemaker a9a0846ee9
Remove checks for deployed features (#3881)
Removes the checks for a handful of deployed feature flags in preparation for removing the flags entirely. Also moves all of the currently deprecated flags to a separate section of the flags list so they can be more easily removed once purged from production configs.

Fixes #3880.
2018-10-17 20:29:18 -07:00
Roland Bracewell Shoemaker 15ccea65f7
Record latency of validation instead of request/response time (#3879)
Fixes #3862.
2018-10-05 10:59:53 -04:00
Jacob Hoffman-Andrews 69f4f666b6 Add timeout values to VA RoundTripper. (#3869)
Fixes #3868.
2018-09-24 16:11:23 -04:00
Daniel McCarney 43b61f5c25 VA: Fix q -> %q format specifier (#3870) 2018-09-24 09:59:22 -07:00
Jacob Hoffman-Andrews b25b431266 Filter invalid UTF-8 from error responses. (#3845)
For HTTP-01 challenges that return incorrect responses, the
VA tries to put the first little bit of the HTTP response in the problem
detail.

However, VA needs to be able to serialize the problem detail as a
protobuf to send it to the RA, and protobufs require string types to be
UTF-8. Filter out any invalid UTF-8 sequences and replace them with
REPLACEMENT CHARACTER.
2018-09-17 14:35:46 -04:00
Joel Sing a64928bc3d Rework CAA value parameter parsing to match RFC 6644bis draft. (#3805)
This switches from whitespace to semi-colon separated tag/value parameters,
while implementing stricter checks on valid tag and value values (to match
the RFC). Test coverage is added for CAA value parameter parsing, along with
some additional tests for CAA records with multiple parameter values.

Fixes issue #3795.
2018-09-05 17:09:10 -07:00
Daniel McCarney 94bcebd658
VA: Ignore cancelled errs from remote VAs. (#3827)
If the context provided to a remote VA's `PerformValidation` is
cancelled we should not treat the returned context cancelled error as an
unexpected error and should instead ignore it as an expected result.
2018-08-27 12:20:54 -04:00
Roland Bracewell Shoemaker 1ef93c3809 Support both obsolete and new TLS-ALPN OID (#3819) 2018-08-20 10:51:33 -04:00
Roland Bracewell Shoemaker 1e6699d03e Remove hyphens from ACME-CAA parameters (#3772)
The hyphens were incompatible with RFC 6844 (but not RFC 6844bis), and
broke some CAA-processing software in practice. Hugo revised the ACME-CAA
draft (https://datatracker.ietf.org/doc/html/draft-ietf-acme-caa-05) to remove
the hyphens.
2018-06-21 13:49:48 -07:00
Daniel McCarney 2dadd5e09a VA: Log exceptional non-problem remote VA errors. (#3760)
Previously, if a remote VA returned an error that is not a ProblemDetail, the
primary VA would log a ServerInternalProblem but not the underlying error.
This commit updates performRemoteValidation to always return the full error it
receives from a remote VA.

This commit also adds a unittest that checks that the VA still returns a
ServerInternalProblem to the RA, and that the VA audit logs the underlying
error.

Resolves https://github.com/letsencrypt/boulder/issues/3753
2018-06-15 10:53:16 -07:00
Roland Bracewell Shoemaker 813aa788e9 Assume acmeValidation-v1 is wrapped OCTET STRING (#3752)
As defined by the spec.
2018-06-11 14:44:13 -07:00
Joel Sing 9c2859c87b Add support for CAA account-uri validation. (#3736)
This adds support for the account-uri CAA parameter as specified by
section 3 of https://tools.ietf.org/html/draft-ietf-acme-caa-04, allowing
issuance to be restricted to one or more ACME accounts as specified by CAA
records.
2018-06-08 12:08:03 -07:00
Maciej Dębski bb9ddb124e Implement TLS-ALPN-01 and integration test for it (#3654)
This implements newly proposed TLS-ALPN-01 validation method, as described in
https://tools.ietf.org/html/draft-ietf-acme-tls-alpn-01 This challenge type is disabled 
except in the config-next tree.
2018-06-06 13:04:09 -04:00
Daniel McCarney b29fe6559d VA: Log challenge type in `checkCAA`. (#3742)
This commit updates the VA's checkCAA function to include the provided
challengeType parameter as part of the audit log line for the CAA
check result. If challengeType == nil then the value "none" is logged.
Unit tests are updated accordingly.

This change will make it easier to distinguish cases where "Valid for
issuance" is false because of a validation-methods restriction.

Resolves #3740
2018-06-04 09:19:10 -07:00
Joel Sing 2540d59296 Implement CAA validation-methods checking. (#3716)
When performing CAA checking respect the validation-methods parameter (if
present) and restrict the allowed authorization methods to those specified.
This allows a domain to restrict authorization methods that can be used with
Let's Encrypt.

This is largely based on PR #3003 (by @lukaslihotzki), which was landed and
then later reverted due to issue #3143. The bug the resulted in the previous
code being reverted has been addressed (likely inadvertently) by 76973d0f.

This implementation also includes integration tests for CAA validation-methods.

Fixes issue #3143.
2018-05-23 14:32:31 -07:00
Jacob Hoffman-Andrews 5ad14170fb Ignore canceled IsSafeDomain calls. (#3730)
Fixes #3681.
2018-05-23 12:50:30 -07:00
Daniel McCarney 3df93d2230 VA: Log observed CAA records. (#3726)
This is a quick first pass at audit logging the *dns.CAA records in JSON format from within the VA's IsCAAValid function. This will provide more information for post-hoc analysis of CAA decisions.
2018-05-18 15:28:45 -07:00
Daniel McCarney 084819b011
VA: CAA tag identifiers are case insensitive. (#3722)
Per RFC 6844 Section 5.1 the matching of CAA tag identifiers (e.g.
"Issue") is case insensitive. This commit updates the CAA tag processing
to be case insensitive as required by the RFC.

To exercise the fix this commit adds a test case to the `caaMockDNS`
`LookupCAA` implementation for a hostname (`mixedcase.com`) that has
a CAA record with a mixed case `Issue` tag. Prior to the fix from this
branch being included in `va/caa.go` the test fails:
```
    --- FAIL: TestCAAChecking/Bad_(Reserved,_Mixed_case_Issue) (0.00s)
          caa_test.go:292: checkCAARecords validity mismatch for
          mixedcase.com: got true expected false
```

With the fix applied, the test passes.
2018-05-18 09:34:57 -04:00
Joel Sing f8a023e49c Remove various unnecessary uses of fmt.Sprintf (#3707)
Remove various unnecessary uses of fmt.Sprintf - in particular:

- Avoid calls like t.Error(fmt.Sprintf(...)), where t.Errorf can be used directly.

- Use strconv when converting an integer to a string, rather than using
  fmt.Sprintf("%d", ...). This is simpler and can also detect type errors at
  compile time.

- Instead of using x.Write([]byte(fmt.Sprintf(...))), use fmt.Fprintf(x, ...).
2018-05-11 11:55:25 -07:00
Joel Sing 9990d14654 Convert the probs functions to be formatters. (#3708)
Many of the probs.XYZ calls are of the form probs.XYZ(fmt.Sprintf(...)).
Convert these functions to take a format string and optional arguments,
following the same pattern used in the errors package. Convert the
various call sites to remove the now redundant fmt.Sprintf calls.
2018-05-11 11:51:16 -07:00
Joel Sing 8ebdfc60b6 Provide formatting logger functions. (#3699)
A very large number of the logger calls are of the form log.Function(fmt.Sprintf(...)).
Rather than sprinkling fmt.Sprintf at every logger call site, provide formatting versions
of the logger functions and call these directly with the format and arguments.

While here remove some unnecessary trailing newlines and calls to String/Error.
2018-05-10 11:06:29 -07:00
Roland Bracewell Shoemaker b2a2a24dc3 Stop using validation record as an input/output (#3694)
This change cleans up how `va.http01Dialer` works with regards to `core.ValidationRecord`s. Instead of using the record as both an input and a output it now uses a set of inputs and outputs information about addresses via a channel. The validation record is then constructed in the parent scope or in the redirect function instead of the dialer itself.

Fixes #2730, fixes #3109, and fixes #3663.
2018-05-09 11:55:14 -04:00
Roland Bracewell Shoemaker a5ac5fa078 Deprecate IPv6First feature flag (#3684) 2018-05-02 10:22:25 -07:00
Jacob Hoffman-Andrews d0a510664b Remove Timeout field from VA's http.Client. (#3661)
This field was set to singleDialTimeout, but the net/http library treats
it as covering all of dial, write headers, and read headers and body.
Since http01Dialer also uses singleDialTimeout, there's a race between
http01Dialer and net/http to see who will time out first. The result is
that sometimes we give "Timeout after connect" when the error really
should be "Timeout during connect." This issue also inhibits IPv6 to
IPv4 fallback, and tickles a data race that was causing a rare panic in
VA: https://github.com/letsencrypt/boulder/issues/3109.

After this change, the overall HTTP request will get the full deadline
allowed by the RPC context. The dialer will continue to use
singleDialTimeout for each of its two possible dial attempts.
2018-04-23 09:24:23 -04:00
Jacob Hoffman-Andrews a1b98d9163
Use a context when dialing TLS for TLS-SNI (#3648)
This allows us to have fast-running unittests without modifying the global state in singleDialTimeout,
which can become a const.

Fixes #3628.

Builds on top of #3629, review that first.
2018-04-16 15:06:56 -07:00
Jacob Hoffman-Andrews 339ea954bd Give more detailed validation errors in VA (#3629)
In particular, differentiate timeouts during connect (which are usually a firewall problem) from timeouts after connect (which are usually a software problem). In the process, refactor the tests and add testing for specific problem detail messages.

This also switches over the HTTP challenge's dialer to use DialContext, and to shave a little bit of headroom off of the context deadline, so that the dial can report its timeout before the overall context expires, which would lead to an overly generic "deadline exceeded" error, which would then get translated (incorrectly) into a "timeout after connect."

There is an additional error case, Timeout during %s (your server may be slow or overloaded), (where %s can be read or write) which doesn't have any unittests. I believe it may not be possible to trigger this, since read and write timeouts get subsumed by the HTTP or TLS library, but it's worth having as a fallback case. We'll see if it shows up in the logs.

Among the test refactorings, I shortened the timeout on the TLS timeout test to 50ms. Previously this was the long pole making the whole test take 10s. Now it takes ~500 ms overall.

I recommend starting review at https://github.com/letsencrypt/boulder/compare/detailed-va-errors?expand=1#diff-4c51d1d7ca3ec3022d14b42809af0d7eR671 (the changes to detailedError), then reviewing the Dial -> DialContext changes, then the tests.
2018-04-16 12:01:08 -07:00
Jacob Hoffman-Andrews 699c7e4c44 Add a DNS problem type. (#3625)
As specified in ACME. Also, include problem type in the stats.

Fixes #3613.
2018-04-09 12:21:02 -04:00
Roland Bracewell Shoemaker 8167abd5e3 Use internet facing appropriate histogram buckets for DNS latencies (#3616)
Also instead of repeating the same bucket definitions everywhere just use a single top level var in the metrics package in order to discourage copy/pasting.

Fixes #3607.
2018-04-04 08:01:54 -04:00
Jacob Hoffman-Andrews 11434650b7 Check safe browsing at validation time (#3539)
Right now we check safe browsing at new-authz time, which introduces a possible
external dependency when calling new-authz. This is usually fine, since most safe
browsing checks can be satisfied locally, but when requests have to go external,
it can create variance in new-authz timing.

Fixes #3491.
2018-03-09 11:15:05 +00:00
Daniel McCarney 49d55d9ab5 Make POSTing KeyAuthorization optional, V2 don't echo it. (#3526)
This commit updates the RA to make the notion of submitting
a KeyAuthorization value as part of the ra.UpdateAuthorization call
optional. If set, the value is enforced against expected and an error is
returned if the provided authorization isn't correct. If it isn't set
the RA populates the field with the computed authorization for the VA to
enforce against the value it sees in challenges. This retains the legacy
behaviour of the V1 API. The V2 API will never unmarshal a provided
key authorization.

The ACMEv2/WFEv2 prepChallengeForDisplay function is updated to strip
the ProvidedKeyAuthorization field before sending the challenge object
back to a client. ACMEv1/WFEv1 continue to return the KeyAuthorization
in challenges to avoid breaking clients that are relying on this legacy
behaviour.

For deployability ease this commit retains the name of the
core.Challenge.ProvidedKeyAuthorization field even though it should
be called core.Challenge.ComputedKeyAuthorization now that it isn't
set based on the client's provided key authz. This will be easier as
a follow-up change.

Resolves #3514
2018-03-06 20:33:01 +00:00
Daniel McCarney 28cc969814 Remove TLS-SNI-02 implementation. (#3516)
This code was never enabled in production. Our original intent was to
ship this as part of the ACMEv2 API. Before that could happen flaws were
identified in TLS-SNI-01|02 that resulted in TLS-SNI-02 being removed
from the ACME protocol. We won't ever be enabling this code and so we
might as well remove it.
2018-03-02 10:56:13 -08:00
Jacob Hoffman-Andrews 1fe8aa8128 Improve errors for DNS challenge (#3317)
Before this change, we would just log "Correct value not found for DNS challenge"
when we got a TXT record that didn't match what we expected. This was different
from the error when no TXT records were found at all, but viewing the error out of
context doesn't make that clear. This change improves the error to specifically say
that we found a TXT record, but it was the wrong one.

Also in this change: if we found multiple TXT records, we mention the number;
and we trim the length of the echoed TXT record.
2018-01-03 15:37:23 -05:00
Daniel McCarney de5fbbdb67
Implement CAA issueWild enforcement for wildcard names (#3266)
This commit implements RFC 6844's description of the "CAA issuewild
property" for CAA records.

We check CAA in two places: at the time of validation, and at the time
of issuance when an authorization is more than 8hours old. Both
locations have been updated to properly enforce issuewild when checking
CAA for a domain corresponding to a wildcard name in a certificate
order.

Resolves https://github.com/letsencrypt/boulder/issues/3211
2017-12-13 12:09:33 -05:00
Daniel McCarney 1c99f91733 Policy based issuance for wildcard identifiers (Round two) (#3252)
This PR implements issuance for wildcard names in the V2 order flow. By policy, pending authorizations for wildcard names only receive a DNS-01 challenge for the base domain. We do not re-use authorizations for the base domain that do not come from a previous wildcard issuance (e.g. a normal authorization for example.com turned valid by way of a DNS-01 challenge will not be reused for a *.example.com order).

The wildcard prefix is stripped off of the authorization identifier value in two places:

When presenting the authorization to the user - ACME forbids having a wildcard character in an authorization identifier.
When performing validation - We validate the base domain name without the *. prefix.
This PR is largely a rewrite/extension of #3231. Instead of using a pseudo-challenge-type (DNS-01-Wildcard) to indicate an authorization & identifier correspond to the base name of a wildcard order name we instead allow the identifier to take the wildcard order name with the *. prefix.
2017-12-04 12:18:10 -08:00
Daniel McCarney 55dd1020c0 Increase VA SingleDialTimeout to 10s. (#3260)
This PR changes the VA's singleDialTimeout value from 5 * time.Second to 10 * time.Second. This will give slower servers a better chance to respond, especially for the multi-VA case where n requests arrive ~simultaneously.

This PR also bumps the RA->VA timeout by 5s and the WFE->RA timeout by 5s to accommodate the increased dial timeout. I put this in a separate commit in case we'd rather deal with this separately.
2017-12-04 09:53:26 -08:00
Roland Bracewell Shoemaker 9da1bea433 Update histogram buckets for latencies that measure things over the internet (#3254)
Updates the buckets for histograms in the publisher, va, and expiration-mailer which are used to measure the latency of operations that go over the internet and therefore are liable to take a lot longer than the default buckets can measure. Uses a standard set of buckets for all three instead of attempting to tune for each one.

Fixes #3217.
2017-11-29 15:13:14 -08:00
Jacob Hoffman-Andrews 2fd2f9e230 Remove LegacyCAA implementation. (#3240)
Fixes #3236
2017-11-20 16:09:00 -05:00
Jacob Hoffman-Andrews bf9ce64aca Update GSB library (#3192)
This pulls in google/safebrowsing#74, which introduces a new LookupURLsContext that allows us to pass through timeout information nicely.

Also, update calling code to use LookupURLsContext instead of LookupURLs.
2017-10-24 08:33:03 -04:00
Jacob Hoffman-Andrews 51991cd264 Fix logging of hostname in VA. (#3149)
The pbToAuthzMeta method in rpc/pb-marshalling.go only propagates ID and
registrationID, not hostname. So log the "domain" parameter instead.
2017-10-06 11:10:02 -07:00
Daniel McCarney 1794c56eb8 Revert "Add CAA parameter to restrict challenge type (#3003)" (#3145)
This reverts commit 23e2c4a836.
2017-10-04 12:00:44 -07:00
lukaslihotzki 23e2c4a836 Add CAA parameter to restrict challenge type (#3003)
This commit adds CAA `issue` paramter parsing and the `challenge` parameter to permit a single challenge type only. By setting `challenge=dns-01`, the nameserver keeps control over every issued certificate.
2017-10-02 11:59:47 -07:00
Jacob Hoffman-Andrews d2883f12c1 Remove TimingDuration call from VA (#3122)
Also switch over tests.

Fixes #3100
2017-09-28 14:25:22 -07:00
Daniel McCarney 966e02313f Forbid HTTP redirects to non-80/443 ports. (#3115)
Prior to this commit the VA would follow redirects from the initial
HTTP-01 challenge request on port 80 to any other port. In practice the
Let's Encrypt production environment has network egress firewall rules
that drop outbound requests that are not on port 80 or 443. In effect
this meant any challenge request that was redirected from 80 to a port
other than 80/443 was turned into a mysterious connection timeout error.

We have decided to preserve the egress firewall rule and continue to act
conservatively. Only port 80 and 443 should be allowed in redirects.

This commit updates the VA to return a clear error message when
a non-80/443 redirect is made.

To aid in testing/configuration the actual ports enforced are specified
by the va.httpPort and va.httpsPort that are used for the initial
outbound HTTP-01 connection.

The VA TestHTTPRedirectLookup unit test is updated accordingly to test
that a non-80/443 redirect fails with the expected message.

Resolves #3049
2017-09-25 10:19:10 -07:00
Daniel McCarney 3408b62720 Include the domain name in problems from IsCAAValid. (#3116)
For certificates with many domains it can be difficult to associate
a given CAA error with the specific domain that caused it. To make this
easier this commit explicitly prefixes all of the problems that can be
returned from `va.IsCAAValid` with the domain name in question.

A small unit test is included to check a CAA problem's detail message is
suitably prefixed with the affected domain.
2017-09-25 13:11:50 -04:00
Jacob Hoffman-Andrews fce975a1e6 Move CAA mocks into caa_test. (#3084)
There were a bunch of test fixtures in bdns/mocks.go that were only used in va/caa_test.go. This moves them to be in the same file so we have less spooky action at a distance.

One side-effect: We can't construct bdns.DNSError with the internal fields we want, because those fields are unexported. So we switch a couple of mock cases to just return a generic error, and the corresponding test cases to expect that error.
2017-09-18 13:10:01 -07:00
Roland Bracewell Shoemaker 0ab1e2ff46 Raise treeClimbingLookupCAA limit (#3098) 2017-09-15 18:30:29 -07:00
Roland Bracewell Shoemaker 8a2ad13a87 Don't tree climb for trees we've already climbed (#3096)
Prevents repeated lookups in traditional CNAME or tree based CNAME loops
2017-09-15 19:29:35 -04:00
Roland Bracewell Shoemaker d1d6cab8ce Fix CAA test (#3092) 2017-09-14 16:02:54 -07:00
Jacob Hoffman-Andrews 9ab2ff4e03 Add CAA-specific error. (#3051)
Previously, CAA problems were lumped in under "ConnectionProblem" or
"Unauthorized". This should make things clearer and easier to differentiate.

Fixes #3043
2017-09-14 14:11:41 -07:00
Jacob Hoffman-Andrews 4266853092 Implement legacy form of CAA (#3075)
This implements the pre-erratum 5065 version of CAA, behind a feature flag.

This involved refactoring DNSClient.LookupCAA to return a list of CNAMEs in addition to the CAA records, and adding an alternate lookuper that does tree-climbing on single-depth aliases.
2017-09-13 10:16:12 -04:00
Jacob Hoffman-Andrews 568407e5b8 Remote VA logging and stats (#3063)
Add a logging statement that fires when a remote VA fail causes
overall failure. Also change remoteValidationFailures into a
counter that counts the same thing, instead of a histogram. Since
the histogram had the default bucket sizes, it failed to collect
what we needed, and produced more metrics than necessary.
2017-09-11 12:50:50 -07:00
Roland Bracewell Shoemaker e91349217e Switch to using go 1.9 (#3047)
* Switch to using go 1.9

* Regenerate with 1.9

* Manually fix import path...

* Upgrade mockgen and regenerate

* Update github.com/golang/mock
2017-09-06 16:30:13 -04:00
Daniel McCarney baf32878c0 Prefix problem type with namespace at runtime. (#3039)
To support having problem types that use either the classic
"urn:acme:error" namespace or the new "urn:ietf:params:acme:error"
namespace as appropriate we need to prefix the problem type at runtime
right before returning it through the WFE to the user as JSON. This
commit updates the WFE/WFE2 to do this for both problems sent through
sendError as well as problems embedded in challenges. For the latter
we do not modify problems with a type that is already prefixed to
support backwards compatibility.

Resolves #2938

Note: We should cut a follow-up issue to devise a way to share some
common code between the WFE and WFE2. For example, the
prepChallengeForDisplay should probably be hoisted to a common
"web" package
2017-09-06 12:55:10 -07:00
Jacob Hoffman-Andrews b0c7bc1bee Recheck CAA for authorizations older than 8 hours (#3014)
Fixes #2889.

VA now implements two gRPC services: VA and CAA. These both run on the same port, but this allows implementation of the IsCAAValid RPC to skip using the gRPC wrappers, and makes it easier to potentially separate the service into its own package in the future.

RA.NewCertificate now checks the expiration times of authorizations, and will call out to VA to recheck CAA for those authorizations that were not validated recently enough.
2017-08-28 16:40:57 -07:00
Jacob Hoffman-Andrews 0d69b24fcc Move VA's CAA code into separate file (#3010)
va.go is quite a large file. This splits out the CAA-related code and tests into its own file for simplicity. This is a simple move; no code has been changed, and there is no package split.
2017-08-28 11:24:03 -07:00
Jacob Hoffman-Andrews 9026f6cbf8 Remove global state from VA test (#3009)
The VA test had a global:

`var ident = core.AcmeIdentifier{Type: core.IdentifierDNS, Value: "localhost"}`

Evidently this was meant as a convenience to avoid having to retype this common value, but it wound up being mutated independently by different tests. This PR replaces it with a convenience function `dnsi()` that generates a DNS-type identifier with the given hostname. Makes the VA test much more reliable locally.
2017-08-25 16:55:38 -07:00
Daniel McCarney 2bc7b604bd Allow authz creation in absence of GSB response. (#2972)
This commit updates the VA's `IsSafeDomain` RPC to treat errors from the
Google Safe Browsing client as a positive response. Subsequently the VA
will only block authz creation in the case that the GSB API returns
a true negative (e.g. confirms an unsafe domain). If the database is in
an inconsistent state due to an API outage we will allow the authz to be
created.
2017-08-11 11:22:53 -07:00
Daniel McCarney 57252c3b07 Remove letsencrypt/go-safe-browsing-api dependency. (#2905)
We have migrated from our fork of `go-safe-browsing-api` to Google's
safebrowsing v4 library. This commit removes the legacy safe browsing
code.
2017-07-26 13:57:57 -07:00
Daniel McCarney 2a84bc2495 Replace go-jose v1 with go-jose v2. (#2899)
This commit replaces the Boulder dependency on
gopkg.in/square/go-jose.v1 with gopkg.in/square/go-jose.v2. This is
necessary both to stay in front of bitrot and because the ACME v2 work
will require a feature from go-jose.v2 for JWS validation.

The largest part of this diff is cosmetic changes:

Changing import paths
jose.JsonWebKey -> jose.JSONWebKey
jose.JsonWebSignature -> jose.JSONWebSignature
jose.JoseHeader -> jose.Header
Some more significant changes were caused by updates in the API for
for creating new jose.Signer instances. Previously we constructed
these with jose.NewSigner(algorithm, key). Now these are created with
jose.NewSigner(jose.SigningKey{},jose.SignerOptions{}). At present all
signers specify EmbedJWK: true but this will likely change with
follow-up ACME V2 work.

Another change was the removal of the jose.LoadPrivateKey function
that the wfe tests relied on. The jose v2 API removed these functions,
moving them to a cmd's main package where we can't easily import them.
This function was reimplemented in the WFE's test code & updated to fail
fast rather than return errors.

Per CONTRIBUTING.md I have verified the go-jose.v2 tests at the imported
commit pass:

ok      gopkg.in/square/go-jose.v2      14.771s
ok      gopkg.in/square/go-jose.v2/cipher       0.025s
?       gopkg.in/square/go-jose.v2/jose-util    [no test files]
ok      gopkg.in/square/go-jose.v2/json 1.230s
ok      gopkg.in/square/go-jose.v2/jwt  0.073s

Resolves #2880
2017-07-26 10:55:14 -07:00
Roland Bracewell Shoemaker 05d869b005 Rename DNSResolver -> DNSClient (#2878)
Fixes #639.

This resolves something that has bugged me for two+ years, our DNSResolverImpl is not a DNS resolver, it is a DNS client. This change just makes that obvious.
2017-07-18 08:37:45 -04:00
Daniel McCarney 2f53c202a6 Rename `validationTimeout`, document and clarify purpose. (#2866)
This commit renames the `validationTimeout` to `singleDialTimeout` and
adds a comment describing its purpose.
2017-07-12 11:35:14 -07:00
Roland Bracewell Shoemaker 77f1364e9a Return more detailed error for connection reset in va (#2860)
If we hit a `syscall.ECONNRESET` error return a more useful error than `Error getting validation data`, updates the `TestDetailedError` test to cover this case.

Fixes #2851.
2017-07-11 14:29:31 -07:00
Jacob Hoffman-Andrews 9c7482fa94 Remove error return from Scope interface. (#2857)
This was inherited from the statsd interface but never used. This allows us to
remove one of our errcheck exceptions.
2017-07-11 10:54:06 -07:00
Daniel McCarney 957a68c72b Fix HTTP-01 IPv6 to IPv4 fallback with fresh dialer per conn. (#2852)
The implementation of the dialer used by the HTTP01 challenge, constructed with `resolveAndConstructDialer`, used the same wrapped `net.Dialer` for both the initial IPv6 connection, and any subsequent IPv4 fallback connections. This caused the IPv4 fallback to never succeed for cases where the initial IPv6 connection expended the `validationTimeout`.

This commit updates the http01Dialer (newly renamed from `dialer` since it is in fact specific to HTTP01 challenges) to use a fresh dialer for each connection. To facilitate testing the http01Dialer maintains
a count of how many dialer instances it has constructed. We use this in a unit test to ensure the correct behaviour without a great deal of new mocking/interfaces.

Resolves #2770
2017-07-10 15:41:49 -04:00
Daniel McCarney 7120d72197 Improve error message for IPv6 failure with no IPv4 fallback. (#2844)
This commit improves the rather vague error message that was previously returned if an IPv6 challenge validation failed when IPv6First was enabled and there were no IPv4 addresses left to try as a fallback.

Resolves #2821
2017-07-05 15:54:30 -04:00
Jacob Hoffman-Andrews f710e574b3 Provide specific error for connection refused (#2843)
Fixes #2830.
2017-06-30 16:22:19 -07:00
Roland Bracewell Shoemaker 088b872287 Implement multi VA validation (#2802)
Adds basic multi-path validation functionality. A new method `performRemoteValidation` is added to `boulder-va` which is called if it is configured with a list of remote VA gRPC addresses. In this initial implementation the remote VAs are only used to check the validation result of the main VA, if all of the remote validations succeed but the local validation failed, the overall validation will still fail. Remote VAs use the exact same code as the local VA to perform validation. If the local validation succeeds then a configured quorum of the remote VA successes must be met in order to fully complete the validation.

This implementation assumes that metrics are collected from the remote VAs in order to have visibility into their individual validation latencies etc.

Fixes #2621.
2017-06-29 14:11:01 -07:00
Roland Bracewell Shoemaker 764658ab84 Remove CAA distributed resolver (#2804)
We never used it and it's been superseded by the multi-VA design.
2017-06-15 13:35:50 -04:00
Jacob Hoffman-Andrews f236ca522f Improve validation error messages. (#2791)
Previously, a lot of validations problems would give the message "Failed to
connect to X for ..." This was misleading because the issue was not always a
connection error, and when it was, it was valuable to distinguish between
connection refused and timeout. Also, for HTTP, this message would echo the
first URL in a redirect chain, when we really want the URL that failed.

Renames the misleading "parseHTTPConnError" and removes an inaccurate
check for temporary errors. It also eliminates the "detail" argument, instead
generating all messages inside the function.

Improves the handling of tls.alert errors to actually pass through the error
message, rather than just quietly changing the problem type (which was very
easy to miss).

Gives a specific error message for timeouts.

Preserves the URL from url.Error types and incorporates it into error messages.

Splits the HTTP timeout test into its own test case.
2017-06-12 10:35:22 -04:00
Jacob Hoffman-Andrews 0bfb542514 Use fields, not globals, for stats (#2790)
Following up on #2752, we don't need to use global vars for our Prometheus stats. We already have a custom registry plumbed through using Scope objects. In this PR, expose the MustRegister method of that registry through the Scope interface, and move existing global vars to be fields of objects. This should improve testability somewhat.

Note that this has a bit of an unfortunate side effect: two instances of the same stats-using class (e.g. VA) can't use the same Scope object, because their MustRegister calls will conflict. In practice this is fine since we never instantiate duplicates of the the classes that use stats, but it's something we should keep an eye on.

Updates #2733
2017-06-06 12:09:31 -07:00
Daniel eab44959f3
Review feedback 2017-05-18 15:08:53 -04:00
Daniel 701cb5cb44
Merge master, review feedback. 2017-05-17 12:13:36 -04:00
Jacob Hoffman-Andrews 2d0734d35b Fix flakiness in VA test. (#2767)
When the VA validates a domain, it runs both CAA check and validation check in
parallel. If the validation check returns a failed result before the CAA check is done,
the VA returns immediately. However, the CAA check increments a stat. So there's
a flaky condition where, depending on the timing, that stat increment may or may
not get called on the MockScope. Add `.AnyTimes()` to our mock statements to
indicate we don't care whether Inc gets called.
2017-05-15 16:44:59 -07:00
Daniel McCarney e1bff721a7 Fixes master's unit tests, test failure. (#2763)
* Fixes post-merge artifact breaking VA unit tests.

* Fixes CI unit tests to fail build if tests fail.
2017-05-15 14:33:31 -04:00
Jacob Hoffman-Andrews b17b5c72a6 Remove statsd from Boulder (#2752)
This removes the config and code to output to statsd.

- Change `cmd.StatsAndLogging` to output a `Scope`, not a `Statter`.
- Remove the prefixing of component name (e.g. "VA") in front of stats; this was stripped by `autoProm` but now no longer needs to be.
- Delete vendored statsd client.
- Delete `MockStatter` (generated by gomock) and `mocks.Statter` (hand generated) in favor of mocking `metrics.Scope`, which is the interface we now use everywhere.
- Remove a few unused methods on `metrics.Scope`, and update its generated mock.
- Refactor `autoProm` and add `autoRegisterer`, which can be included in a `metrics.Scope`, avoiding global state. `autoProm` now registers everything with the `prometheus.Registerer` it is given.
- Change va_test.go's `setup()` to not return a stats object; instead the individual tests that care about stats override `va.stats` directly.

Fixes #2639, #2733.
2017-05-15 10:19:54 -04:00
Daniel c905cfb8db
Rewords IPv6 -> IPv4 fallback error messages. 2017-05-11 10:35:30 -04:00
Daniel d0029fcc41
Cleans up duplicated port management code. 2017-05-10 10:50:02 -04:00
Daniel McCarney 47452d6c6c Prefer IPv6 addresses, fall back to IPv4. (#2715)
This PR introduces a new feature flag "IPv6First". 

When the "IPv6First" feature is enabled the VA's HTTP dialer and TLS SNI
(01 and 02) certificate fetch requests will attempt to automatically
retry when the initial connection was to IPv6 and there is an IPv4
address available to retry with.

This resolves https://github.com/letsencrypt/boulder/issues/2623
2017-05-08 13:00:16 -07:00
David Calavera cc5ee3906b Refactor IsSane and IsSane* to return useful errors. (#2685)
This change changes the returning values from boolean to error.

It makes `checkConsistency` an internal function and removes the
optional argument in favor of making checks explicit where they are
used.

It also renames those functions to CheckConsistency* to not
give the impression of still returning boolean values.

Signed-off-by: David Calavera <david.calavera@gmail.com>
2017-04-19 12:08:47 -04:00
Jacob Hoffman-Andrews d849f58cec Fix "valiation typo in VA. (#2676) 2017-04-12 11:12:22 -07:00
Jacob Hoffman-Andrews 4b665e35a6 Use Prometheus stats for VA, WFE, and OCSP Responder (#2628)
Rename HTTPMonitor to MeasuredHandler.
Remove inflight stat (we didn't use it).
Add timing stat by method, endpoint, and status code.
The timing stat subsumes the "rate" stat, so remove that.
WFE now wraps in MeasuredHandler, instead of relying on its cmd/main.go.
Remove FBAdapter stats.
MeasuredHandler tracks stats by method, status code, and endpoint.

In VA, add a Prometheus histogram for validation timing.
2017-04-03 17:03:04 -07:00
Roland Bracewell Shoemaker 08f4dda038 Update github.com/grpc-ecosystem/go-grpc-prometheus and google.golang.org/grpc (#2637)
Updates the various gRPC/protobuf libs (google.golang.org/grpc/... and github.com/golang/protobuf/proto) and the boulder-tools image so that we can update to the newest github.com/grpc-ecosystem/go-grpc-prometheus. Also regenerates all of the protobuf definition files.

Tests run on updated packages all pass.

Unblocks #2633 fixes #2636.
2017-04-03 11:13:48 -07:00
Roland Bracewell Shoemaker e2b2511898 Overhaul internal error usage (#2583)
This patch removes all usages of the `core.XXXError` and almost all usages of `probs` outside of the WFE and VA and replaces them with a unified internal error type. Since the VA uses `probs.ProblemDetails` quite extensively in challenges, and currently stores them in the DB I've saved this change for another change (it'll also require a migration). Since `ProblemDetails` should only ever be exposed to end-users all of its related logic should be moved into the `WFE` but since it still needs to be exposed to the VA and SA I've left it in place for now.

The new internal `errors` package offers the same convenience functions as `probs` does as well as a new simpler type testing method. A few small changes have also been made to error messages, mainly adding the library and function name to internal server errors for easier debugging (i.e. where a number of functions return the exact same errors and there is no other way to distinguish which method threw the error).

Also adds proper encoding of internal errors transferred over gRPC (the current encoding scheme is kept for `core` and `probs` errors since it'll be ideally be removed after we deploy this and follow-up changes) using `grpc/metadata` instead of the gRPC status codes.

Fixes #2507. Updates #2254 and #2505.
2017-03-22 23:27:31 -07:00
David Calavera c71c3cff80 Implement TLS-SNI-02 challenge validations. (#2585)
I think these are all the necessary changes to implement TLS-SNI-02 validations, according to the section 7.3 of draft 05:

https://tools.ietf.org/html/draft-ietf-acme-acme-05#section-7.3

I don't have much experience with this code, I'll really appreciate your feedback.

Signed-off-by: David Calavera <david.calavera@gmail.com>
2017-03-22 10:17:59 -07:00
Daniel McCarney e81f7477a3 Fixes outdated IPv6 TODO on `getAddr`. (#2601)
The VA's `getAddr` function prior to this commit had an outdated comment
& a pointer to a TODO for Boulder Issue #593. That issue has been closed
and bdns' `LookupHost` supports AAAA records now. This commit updates
the comment to match the current behaviour and removes the TODO.
2017-03-09 13:20:03 -05:00
David Calavera 0dc2513d2d
Generate GRPC objects with Go 1.8.
Signed-off-by: David Calavera <david.calavera@gmail.com>
2017-02-21 12:11:17 +01:00
Daniel McCarney 3fa950ac58 Improve VA TLS-SNI-01 challenge failure error. (#2527)
Previous to this PR the VA's validateTLSWithZName function would
return an error message containing the SAN names of the leaf certificate
when the validation failed. This commit updates that message to include
the Subject Common Name of the leaf cert in addition to the SANs. The
names are deduplicated to prevent listing a Subj CN twice if its also
a SAN. This will help debug cases where a cert with no SANs is returned
by the server.

In addition, the number of certificates in the chain received from the
server is included in the message. This will hopefully further help
users identify misconfiguration since a TLS SNI 01 challenge response
should have a chain length of 1.

Resolves #2468
2017-01-27 10:05:42 -08:00
Roland Bracewell Shoemaker 170e37c675 Add a special error message if we are trying to talk TLS to a HTTP-only server (#2511)
If the VA fails to validate a TLS-SNI-01 challenge because it is trying to talk TLS to a HTTP-only server return a special error message that is slightly more informative.
2017-01-20 11:36:39 -05:00
Josh Soref 8adf9d41cf Spelling (#2500)
Various spelling fixes.
2017-01-16 10:44:52 -05:00
Daniel McCarney 74e281c1ce Switch to Google's v4 safebrowsing library. (#2446)
Right now we are using a third-party client for the Google Safe Browsing API, but Google has recently released their own [Golang library](https://github.com/google/safebrowsing) which also supports the newer v4 API. Using this library will let us avoid fixing some lingering race conditions & unpleasantness in our fork of `go-safebrowsing-api`.

This PR adds support for using the Google library & the v4 API in place of our existing fork when the `GoogleSafeBrowsingV4` feature flag is enabled in the VA "features" configuration.

Resolves https://github.com/letsencrypt/boulder/issues/1863

Per `CONTRIBUTING.md` I also ran the unit tests for the new dependency:
```
daniel@XXXXXXXXXX:~/go/src/github.com/google/safebrowsing$ go test ./...
ok  	github.com/google/safebrowsing	3.274s
?   	github.com/google/safebrowsing/cmd/sblookup	[no test files]
?   	github.com/google/safebrowsing/cmd/sbserver	[no test files]
?   	github.com/google/safebrowsing/cmd/sbserver/statik	[no test files]
?   	github.com/google/safebrowsing/internal/safebrowsing_proto	[no test files]
ok  	github.com/google/safebrowsing/vendor/github.com/golang/protobuf/jsonpb	0.012s
?   	github.com/google/safebrowsing/vendor/github.com/golang/protobuf/jsonpb/jsonpb_test_proto	[no test files]
ok  	github.com/google/safebrowsing/vendor/github.com/golang/protobuf/proto	0.062s
?   	github.com/google/safebrowsing/vendor/github.com/golang/protobuf/proto/proto3_proto	[no test files]
?   	github.com/google/safebrowsing/vendor/github.com/golang/protobuf/protoc-gen-go	[no test files]
?   	github.com/google/safebrowsing/vendor/github.com/golang/protobuf/protoc-gen-go/descriptor	[no test files]
ok  	github.com/google/safebrowsing/vendor/github.com/golang/protobuf/protoc-gen-go/generator	0.017s
?   	github.com/google/safebrowsing/vendor/github.com/golang/protobuf/protoc-gen-go/grpc	[no test files]
?   	github.com/google/safebrowsing/vendor/github.com/golang/protobuf/protoc-gen-go/plugin	[no test files]
ok  	github.com/google/safebrowsing/vendor/github.com/golang/protobuf/ptypes	0.009s
?   	github.com/google/safebrowsing/vendor/github.com/golang/protobuf/ptypes/any	[no test files]
?   	github.com/google/safebrowsing/vendor/github.com/golang/protobuf/ptypes/duration	[no test files]
?   	github.com/google/safebrowsing/vendor/github.com/golang/protobuf/ptypes/empty	[no test files]
?   	github.com/google/safebrowsing/vendor/github.com/golang/protobuf/ptypes/struct	[no test files]
?   	github.com/google/safebrowsing/vendor/github.com/golang/protobuf/ptypes/timestamp	[no test files]
?   	github.com/google/safebrowsing/vendor/github.com/golang/protobuf/ptypes/wrappers	[no test files]
?   	github.com/google/safebrowsing/vendor/github.com/rakyll/statik	[no test files]
?   	github.com/google/safebrowsing/vendor/github.com/rakyll/statik/fs	[no test files]
ok  	github.com/google/safebrowsing/vendor/golang.org/x/net/idna	0.003s
```
2016-12-27 11:18:11 -05:00
Daniel McCarney d4902820ca Adds unique VA DNS validation error for empty TXTs. (#2401)
Presently when the VA performs a DNS-01 challenge verification it
returns the same error for the case where the remote nameserver had the
**wrong** TXT value, and when the remote nameserver had an **empty**
response for the TXT query. It would aid debugging if the user was told
which of the two failure cases was responsible for the overall challenge
failure.

This commit adds a unique error message for the empty TXT records case,
and a unit test/mock to exercise the new the error message.

Resolves #2326
2016-12-08 11:27:28 -08:00
Jacob Hoffman-Andrews ec7af70aad Log hostname in VA validation result. (#2386)
This makes it easier to analyze logs for repeated failures on the same hostname.
2016-12-02 18:06:46 -08:00
Daniel 970f3aa422
Drops unneeded else block 2016-11-30 13:40:14 -05:00
Daniel bcc389d109
Fixes gofmt -s diffs 2016-11-30 13:30:03 -05:00
Roland Bracewell Shoemaker e2155388a1 Remove caa-checker from the tree (#2351)
The VA can internally check CAA and this additional code was deemed unneeded complexity that could be hoisted outside of Boulder. Fixes #2346.
2016-11-23 08:42:33 -05:00
Daniel McCarney a6f2b0fafb Updates `go-jose` dep to v1.1.0 (#2314)
This commit updates the `go-jose` dependency to [v1.1.0](https://github.com/square/go-jose/releases/tag/v1.1.0) (Commit: aa2e30fdd1fe9dd3394119af66451ae790d50e0d). Since the import path changed from `github.com/square/...` to `gopkg.in/square/go-jose.v1/` this means removing the old dep and adding the new one.

The upstream go-jose library added a `[]*x509.Certificate` member to the `JsonWebKey` struct that prevents us from using a direct equality test against two `JsonWebKey` instances. Instead we now must compare the inner `Key` members.

The `TestRegistrationContactUpdate` function from `ra_test.go` was updated to populate the `Key` members used in testing instead of only using KeyID's to allow the updated comparisons to work as intended.

The `Key` field of the `Registration` object was switched from `jose.JsonWebKey` to `*jose.JsonWebKey ` to make it easier to represent a registration w/o a Key versus using a value with a nil `JsonWebKey.Key`.

I verified the upstream unit tests pass per contributing.md:
```
daniel@XXXXX:~/go/src/gopkg.in/square/go-jose.v1$ git show
commit aa2e30fdd1fe9dd3394119af66451ae790d50e0d
Merge: 139276c e18a743
Author: Cedric Staub <cs@squareup.com>
Date:   Thu Sep 22 17:08:11 2016 -0700

    Merge branch 'master' into v1
    
    * master:
      Better docs explaining embedded JWKs
      Reject invalid embedded public keys
      Improve multi-recipient/multi-sig handling

daniel@XXXXX:~/go/src/gopkg.in/square/go-jose.v1$ go test ./...
ok  	gopkg.in/square/go-jose.v1	17.599s
ok  	gopkg.in/square/go-jose.v1/cipher	0.007s
?   	gopkg.in/square/go-jose.v1/jose-util	[no test files]
ok  	gopkg.in/square/go-jose.v1/json	1.238s
```
2016-11-08 13:56:50 -05:00
Roland Bracewell Shoemaker 313bc75c05 Switch to Golang 1.7.3 in travis (#2305)
Protobuf files need to be regenerated because (I think) Golang 1.7.3 uses a somewhat different method of ordering fields in a struct when marshaling to bytes.
2016-11-07 10:49:23 -08:00
Jacob Hoffman-Andrews 580c25286a Switch back to go 1.5 in Travis. (#2261)
* Switch back to go 1.5 in Travis.

* Add back GO15VENDOREXPERIMENT.

* Add GO15VENDOREXPERIMENT to Dockerfile

* Revert FAKE_DNS change.

* Revert "Properly close test servers (#2110)"

* Revert "Close VA HTTP test servers (#2111)"

* Change Godep version to 1.5.

* Standardize on issue number
2016-10-20 14:11:26 -04:00
Roland Bracewell Shoemaker c8f1fb3e2f Remove direct usages of go-statsd-client in favor of using metrics.Scope (#2136)
Fixes #2118, fixes #2082.
2016-09-07 19:35:13 -04:00
Ben Irving ce0016bc08 HTTP-01 gives misleading "could not connect" error (#2142)
Fixes #2122
2016-08-29 11:45:32 -07:00
Blake Griffith 344a312905 Remove audit comments -- closes #2129 (#2139)
Closes #2129

* Remove audit comments.
* Nuke doc/requirements/*
2016-08-25 18:23:42 -07:00
Jacob Hoffman-Andrews fadc1f5baf Log cert in TLS-SNI challenge. (#2127)
Fixes #2126
2016-08-15 11:40:28 -07:00
Roland Bracewell Shoemaker 4092d0d5d9 Close VA HTTP test servers (#2111)
Fixes #661 / follow-up to #2110.
2016-08-06 08:57:42 -07:00
Jacob Hoffman-Andrews fd095b8881 Fix Docker test builds: latest certbot and grpc (#1991)
The `letsencrypt/boulder-tools` image was recently updated, pulling in version
0.8.0 of certbot. That version stores the output of `certonly` requests in a
different path. In test.sh, we check out a specific tagged release of certbot in
order to get its integration tests. Prior to this commit, we were using
certbot 0.8.0 with the integration tests from version 0.6.0 of certbot,
which looked for `certonly` output in the wrong place, and failed.

This commit changes test.sh to checkout the 0.8.0 branch, and also removes a
temporary shim we used to make the `certbot` command call out to the
`letsencrypt` command.

Also, since the latest version of `letsencrypt/boulder-tools` includes an updated
`protoc-gen-go`, this change also updates the support packages to match.
2016-06-29 10:54:03 -07:00
Roland Bracewell Shoemaker 6264706557 Fix distributed CAA resolver (#1813)
While testing with real proxies I noticed the original CDR implementation was actually pretty broken, this refactors a bit and fixes a number of bugs. With this patch fallback to GPDNS over three distributed test proxies worked perfectly.

(Side note: `nginx` is not a viable forward proxy for this use as it doesn't support SSL, and a bunch of other _real_ forward proxy features, I ended up just using `squid3`.)

The main error in the previous implementation was the fallback was implemented in `getCAASet` which is only called in the old code path (the local CAA impl instead of the remote service) which mean't it wasn't actually being tested in the integration test. This also refactors a few repeated blocks into their own functions. Also there was a unicode encoding problem somewhere with the query string but for the life of me I can't figure out why it was broken now.
2016-06-23 11:30:20 -07:00
Jacob Hoffman-Andrews 4e0f96d924 Remove last vestiges of challenge.AccountKey. (#1949)
This is a followup from https://github.com/letsencrypt/boulder/pull/1942. That PR stopped setting challenge.AccountKey. This one removes it entirely.

Fixes #1948
2016-06-21 16:25:58 -07:00
Jacob Hoffman-Andrews 0535ac78d7 Stop setting AccountKey in challenges (#1942)
In https://github.com/letsencrypt/boulder/pull/774 we introduced and account key stored with the challenge. This was a stopgap fix to the now-defunct SimpleHTTP and DNS challenges in the face of https://mailarchive.ietf.org/arch/msg/acme/F71iz6qq1o_QPVhJCV4dqWf-4Yc. However, we no longer offer or implement those challenges, so the extra field is unnecessary. It also take up a huge amount of space in the challenges table, which is our biggest table. SimpleHTTP and DNS challenges were removed in https://github.com/letsencrypt/boulder/pull/1247.

We can provide a follow-up migration to delete the column later, once we have a plan for large migrations without downtime.

Fixes #1909
2016-06-20 14:26:53 -07:00
Ben Irving 7e7ccde5c9 Change error message for invalid IPs (#1934)
Clarifies the `UnknownHost` problem details error message created in the VA's `getAddr` when there is no valid IP address for the domain. 

Previously this was reported as "No IPv4 addresses found for x" leading to user confusion (ref #1790) when a domain resolved to a private IP.
2016-06-16 09:42:33 -04:00
Jacob Hoffman-Andrews 87d7d787aa Remove TestDNSValidationLive (#1905)
This functionality is already tested (with mocks) in TestDNSValidationOK, and
TestDNSValidationLive doesn't succeed anyway.
2016-06-08 08:36:22 -07:00
Ben Irving 1336c42813 Replace all log.Err calls with log.AuditErr (#1891)
* remove calls to log.Err()
* go fmt
* remove more occurrences
* change AuditErr argument to string and replace occurrences
2016-06-06 16:27:16 -04:00
Jacob Hoffman-Andrews 92df4d0fc2 Rename authorities to shorter names. (#1878)
Fixes #1875.
2016-06-03 13:35:28 -07:00
Ben Irving a713823cd5 Remove most instances of blog.Get() (#1870)
In this PR, logger is passed to the following callers:
  NewWebFrontEndImpl
  NewCertificateAuthorityImpl
  NewValidationAuthorityImpl
  NewAmqpRPCServer
  newUpdater
  NewRegistrationAuthorityServer

This reduces the usage of a global singleton logger and allows tests to consistently use a mock logger.

Fixes #1642

* remove blog.Get() in wfe
* remove blog.Get() from va
* remove Blog.Get() from ca
* remove blog.Get() from oscp updater, ampq rpc server, registration authority server
* removed some pointless logging code
* remove one added newline
* fix format issue
* fix setup function to return *blog.Mock instead of being passed in
* remove useless blog.NewMock() call
2016-06-02 14:26:39 -07:00
Jacob Hoffman-Andrews cf65d7d595 Improve VA and RA initialization (#1867)
- Run both gRPC and AMQP servers simultaneously
- Take explicit constructor parameters and unexport fields that were previously set by users
- Remove transitional DomainCheck code in RA now that GSB is enabled.
- Remove some leftover UpdateValidation dummy methods.
2016-06-02 11:57:37 -07:00
Kane York 0430c08388 Implement gRPC for VA's RPCs (#1738)
Adds appropriate protobufs, and sets up RA / VA to use the gRPC versions of the calls in boulder-config-next.json.

Fixes #1626.
2016-05-31 16:44:48 -07:00
Daniel McCarney 77030c3eb1 Make it easier to instantiate ProblemDetails (#1851)
Several of the `ProblemType`s had convenience functions to instantiate `ProblemDetails`s using their type and a detail message. Where these existed I did a quick scan of the codebase to convert places where callers were explicitly constructing the `ProblemDetails` to use the convenience function.

For the `ProblemType`s that did not have such a function, I created one and then converted callers to use it.

Solves #1837.
2016-05-31 14:05:37 -07:00
Roland Bracewell Shoemaker 54573b36ba Remove all stray copyright headers and appends the initial line to LICENSE.txt (#1853) 2016-05-31 12:32:04 -07:00
Jacob Hoffman-Andrews d8d1753168 Remove OnValidationUpdate callback in RA (#1848)
Also, UpdateValidations in VA.
2016-05-31 12:27:26 -07:00
Jacob Hoffman-Andrews dbc81e0ede Improve error messages in validation cases. (#1812)
* Improve error messages in validation cases.

* Add challenge type to error detail.

This makes the errors easier to search for.

* %v->%s

* Revert boulder-config to master.
2016-05-12 20:30:56 -07:00
Roland Bracewell Shoemaker efeb81d4be Use a LimitedReader when reading from HTTP response bodies (#1789)
Instead of ioutil.ReadAll
2016-05-05 13:58:14 -07:00
Roland Bracewell Shoemaker 35b6e83e81 Implement CAA quorum checking after failure (#1763)
When a CAA request to Unbound times out, fall back to checking CAA via Google Public DNS' HTTPS API, through multiple proxies so as to hit geographically distributed paths. All successful multipath responses must be identical in order to succeed, and at most one can fail.

Fixes #1618
2016-05-05 11:16:58 -07:00
Kane York 801626fb15 Limit the length of logged HTTP response (#1778)
* Limit the length of logged HTTP response

Fixes #1777

* Fix crash, add tests
* Fix utf-8 truncation
* move more logic into helper method
* Add unit test for truncateBody

https://github.com/letsencrypt/boulder/pull/1778
2016-04-29 11:25:51 -07:00
Kane York dc15f6a55e Readd timing stats for VA (#1759)
* Fix newVARPC sanity check logic error

* Redo style of sanity check in RA to match VA

* Switch to ServerInternal(), readd log.Info

* Add tests to make sure correct metric measured

* comments on exported symbols

* review fixes

* build an array instead of storing last
2016-04-27 11:57:21 -07:00
Jacob Hoffman-Andrews 6d5348f975 Run go generate in Travis (#1762)
* Fix go generate command in metrics.

The previous command only worked on OS X. This one works on Linux but not
OS X.

Also add generate phase of test.sh.

* Add mockgen to test setup.

* Fix github-pr-status output.

* Fix envvar style.

* Set xtrace.

* Fix test.sh

* Fix test.sh some more.

* Fix mockgen command.

* Add dependencies for running `go generate`.

* Add protoc-gen-go.

* Fix go get command.

* Fix generate.

* Wait for all.

* Fix generate.

* Update generated pb.

* Fix generate commands for vendored world.

* Update documentation for new vendor style.

* Update grpc package to latest.

* Update caaChecker proto with latest.

* Run go generate only over TESTPATHS

* See if Travis passes under 1.6

* Switch back to 1.5.

* Trim run command.

* Run stringer from correct directory.

* Move generate command.

* Restore and generate

* Fix path.

* list contents of GOPATH.

* Fix stringer by prebuilding.

* Try another import path.

* regenerate bcode_string.

* remove excess package

* pull jsha fork of protoc-gen-go that echoes

* Echo protoc version.

* install from source

* CD back.

* Go back to normal protoc-gen-go

* Fix path

* Move protobuf install into test/setup.sh

* Move before_install to install.

* Set PATH.

* Follow 301 with curl.

* Shuffle test order.

* Swap back test order.

* Restore all tests.

* Restore 1.5.3 to Travis.

* Remove unnecessary wait-or-exit

* Generate metrics mock with latest mockgen.

* Wrap TESTPATHS in curlies

* Remove spurious bracket
2016-04-21 15:23:06 -07:00
Kane York b7cf618f5d context.Context as the first parameter of all RPC calls (#1741)
Change core/interfaces to put context.Context as the first parameter of all RPC calls in preparation for gRPC.
2016-04-19 11:34:36 -07:00
Kane York 37fd71d21e Rework how KeyAuthorization works (#1688)
* Enhance error message in NewKeyAuthFromString

* va: generate expected response and string compare

* NewKeyAuth can return error, handle that...

* wip commit

* convert ch.KeyAuthorization to string and rename

* more wip commit

* wip 3

* wip 4

* delete NewKeyAuthorizationFromString

* change to ServerInternalProblem

* fix compile error

* semantic merge conflicts are the worst

actually compiled this time.

* Replace new error with statsd increment
2016-04-19 10:54:39 -07:00
Jacob Hoffman-Andrews e6c17e1717 Switch to new vendor style (#1747)
* Switch to new vendor style.

* Fix metrics generate command.

* Fix miekg/dns types_generate.

* Use generated copies of files.

* Update miekg to latest.

Fixes a problem with `go generate`.

* Set GO15VENDOREXPERIMENT.

* Build in letsencrypt/boulder.

* fix travis more.

* Exclude vendor instead of godeps.

* Replace some ...

* Fix unformatted cmd

* Fix errcheck for vendorexp

* Add GO15VENDOREXPERIMENT to Makefile.

* Temp disable errcheck.

* Restore master fetch.

* Restore errcheck.

* Build with 1.6 also.

* Match statsd.*"

* Skip errcheck unles Go1.6.

* Add other ignorepkg.

* Fix errcheck.

* move errcheck

* Remove go1.6 requirement.

* Put godep-restore with errcheck.

* Remove go1.6 dep.

* Revert master fetch revert.

* Remove -r flag from godep save.

* Set GO15VENDOREXPERIMENT in Dockerfile and remove _worskpace.

* Fix Godep version.
2016-04-18 12:51:36 -07:00
Jacob Hoffman-Andrews deb49a4048 Remove CAA timeout warning in VA. (#1729)
Because this event can be caused by end users, we don't want it higher than
Info. Also, it's redundant here: if this winds up being the gating factor, it
will show up in the "Validation result" log entry.
2016-04-13 23:55:42 -07:00
Roland Bracewell Shoemaker 8eaf247ee9 Split CAA checking out to its own service (#1647)
* Split out CAA checking service (minus logging etc)
* Add example.yml config + follow general Boulder style
* Update protobuf package to correct version
* Add grpc client to va
* Add TLS authentication in both directions for CAA client/server
* Remove go lint check
* Add bcodes package listing custom codes for Boulder
* Add very basic (pull-only) gRPC metrics to VA + caa-service
2016-04-12 23:02:41 -07:00
Kane York 25b45a45ec Errcheck errors fixed (#1677)
* Fix all errcheck errors
* Add errcheck to test.sh
* Add a new sa.Rollback method to make handling errors in rollbacks easier.
This also causes a behavior change in the VA. If a HTTP connection is
abruptly closed after serving the headers for a non-200 response, the
reported error will be the read failure instead of the non-200.
2016-04-12 16:54:01 -07:00
Jacob Hoffman-Andrews ecc04e8e61 Refactor log package (#1717)
- Remove error signatures from log methods. This means fewer places where errcheck will show ignored errors.
- Pull in latest cfssl to be compatible with errorless log messages.
- Reduce the number of message priorities we support to just those we actually use.
- AuditNotice -> AuditInfo
- Remove InfoObject (only one use, switched to Info)
- Remove EmergencyExit and related functions in favor of panic
- Remove SyslogWriter / AuditLogger separate types in favor of a single interface, Logger, that has all the logging methods on it.
- Merge mock log into logger. This allows us to unexport the internals but still override them in the mock.
- Shorten names to be compatible with Go style: New, Set, Get, Logger, NewMock, etc.
- Use a shorter log format for stdout logs.
- Remove "... Starting" log messages. We have better information in the "Versions" message logged at startup.

Motivation: The AuditLogger / SyslogWriter distinction was confusing and exposed internals only necessary for tests. Some components accepted one type and some accepted the other. This made it hard to consistently use mock loggers in tests. Also, the unnecessarily fat interface for AuditLogger made it hard to meaningfully mock out.
2016-04-08 16:12:20 -07:00
Jacob Hoffman-Andrews 090565a711 Accept = in orphan-finder.
Also, when a certificate already exists, treat that as info, not error.

Update mock logger to allow matching by log level, and fix WFE and VA tests
correspondingly.
2016-04-05 17:46:51 -07:00
Patrick Figel 93a57993de Ignore Carriage Return in http-01 validation
Windows uses \r\n as a new line character. This change should fix
issues some Windows users are seeing when they include a new line
at the end of their challenge tokens.
2016-03-30 13:36:06 +02:00
Jacob Hoffman-Andrews a40ac55d61 AuditErr -> AuditNotice for a common log entry. 2016-03-25 09:29:03 -07:00
Kane York d778115c5d Add PerformValidation RA→VA rpc call
The RPC call returns only the updated Challenge object instead of the
whole authz, because the VA shouldn't be allowed to update any of the
other fields of the Authorization object.
2016-03-23 10:39:23 -07:00
Kane York 0e85854acc Improve logging of failures in the VA
There's a consistent format to the messages to aid in searching. They
start with [va-err-log], and if an error value is present, end with
Err=%#v.
2016-03-18 13:05:32 -07:00
Roland Shoemaker 00b617b59a Switch to upstream square/go-jose + pull latest 2016-03-15 13:54:22 -07:00
Kane York b1c1f84929 Reword CAA failure message 2016-03-14 09:05:19 -07:00
Kane York 31535f5b89 Perform CAA lookups in parallel.
Also, stop skipping CAA lookups for the root TLDs. The RFC is unclear on
the desired behavior here, but the ICANNTLD function is nonstandard and
the behavior is strictly more conservative than what we had before.

This unblocks the removal of the ICANNTLD function, which allows us to
stop forking upstream.

Closes #1522
2016-03-04 11:07:14 -08:00
Kane York db8bea7b05 Specify severity of all log.Audit calls
Fixes #1534
2016-02-25 10:55:01 -08:00
Jessica Frazelle 3df2e942be
go fmt fixes
Signed-off-by: Jessica Frazelle <acidburn@docker.com>
2016-02-16 12:19:15 -08:00
Jeff Hodges fb657de5f1 Merge branch 'master' into delete_check_caa 2016-02-12 13:09:10 -08:00
Jeff Hodges c36bc382dd remove unused CheckCAA RPC
Helpful for #1486 and was already in my local repo.
2016-02-12 13:01:05 -08:00
Hugo Landau 1bc980b04a Change to use non-naked returns in checkCAARecords 2016-02-10 20:21:54 +00:00
Hugo Landau ea9853a35b Remove issuewild support from CAA patch 2016-01-31 02:01:34 +00:00
Hugo Landau 4f27c24cf3 Make CAA checking more compliant with the RFC; CAA refactoring
The CAA response checking method has been refactored to have a
easier to follow straight-line control flow. Several bugs in it have
been fixed:

  - Firstly, parameters for issue and issuewild directives were not
    parsed, so any attempt to specify parameters would result in
    a string mismatch with the CA CAA identity (e.g. "letsencrypt.org").
    Moreover, the syntax as specified permits leading and trailing
    whitespace, so a parameter-free record such as
    "  letsencrypt.org ;  " would not be considered a match.

    This has been fixed by stripping whitespace and parameters. The RFC
    does not specify the criticality of parameters, so unknown
    parameters (currently all parameters) are considered noncritical.
    I justify this as follows:

    If someone decides to nominate a CA in a CAA record, they can,
    with trivial research, determine what parameters, if any, that
    CA supports, and presumably in trusting them in the first place
    is able to adequately trust that the CA will continue to support
    those parameters. The risk from other CAs is zero because other CAs
    do not process the parameters because the records in which they
    appear they do not relate to them.

  - Previously, all of the flag bits were considered to effectively mean
    'critical'. However, the RFC specifies that all bits except for the
    actual critical bit (decimal 128) should be ignored. In practice,
    many people have misunderstood the RFC to mean that the critical bit
    is decimal 1, so both bits are interpreted to mean 'critical', but
    this change ignores all of the other bits. This ensures that the
    remaining six bits are reasonably usable for future standards action
    if any need should arise.

  - Previously, existence of an "issue" directive but no "issuewild"
    directive was essentially equivalent to an unsatisfiable "issuewild"
    directive, meaning that no wildcard identifiers could pass the CAA
    check. This is contrary to the RFC, which states that issuewild
    should default to what is specified for "issue" if no issuewild
    directives are specified. (This is somewhat moot since boulder
    doesn't currently support wildcard issuance.)

  - Conversely, existence of an "issuewild" directive but no "issue"
    directive would cause CAA validation for a non-wildcard identifier
    to fail, which was contrary to the RFC. This has been fixed.

  - More generally, existence of any unknown non-critical directive, say
    "foobar", would cause the CAA checking code to act as though an
    unsatisfiable "issue" directive existed, preventing any issuance.
    This has been fixed.

Test coverage for corner cases is enhanced and provides regression
testing for these bugs.

statsd statistics have been added for tracking the relative frequency
of occurrence of different CAA features and outcomes. I added these on
a whim suspecting that they may be of interest.

Fixes #1436.
2016-01-31 01:51:28 +00:00
Roland Shoemaker b55a0e64ec Add small test 2016-01-26 13:59:25 -08:00
Roland Shoemaker 327253d1ee Remove incorrect CAA critical flag check 2016-01-26 13:50:14 -08:00
Romain Fliedel 5d9191b537 add test to ensure dns validation succeed when dns reply does not contain authority 2016-01-21 16:17:48 +01:00
Jacob Hoffman-Andrews 40167f3da3 Merge remote-tracking branch 'le/master' into dns-errors-fix
Conflicts:
	bdns/dns.go
	bdns/dns_test.go
	mocks/mocks.go
	ra/registration-authority.go
	ra/registration-authority_test.go
2016-01-08 14:07:05 -08:00
Roland Shoemaker b4c761aa5d Merge branch 'master' into dns-meta 2016-01-06 15:09:28 -08:00
Jeff Hodges adbbbec962 Merge branch 'master' into retry_dns 2016-01-04 16:38:27 -08:00
Jeff Hodges a00094faa8 Merge branch 'master' into dns-meta 2016-01-04 16:16:33 -08:00
Jacob Hoffman-Andrews df4ba7aaa8 Report DNS errors properly.
Previously we would return a detailed errorString, which ProblemDetailsFromDNSError
would turn into a generic, uninformative "Server failure at resolver".

Now we return a new internal dnsError type, which ProblemDetailsFromDNSError can
turn into a more informative message to be shown to the user.
2016-01-04 16:07:02 -08:00
Alex Gaynor cbeffe96a6 Fixed a bunch of typos 2016-01-04 18:39:34 -05:00
Jeff Hodges 116ce96326 add retries and context deadlines to DNSResolver
This provides a means to add retries to DNS look ups, and, with some
future work, end retries early if our request deadline is blown. That
future work is tagged with #1292.

Updates #1258
2016-01-04 14:59:10 -08:00
Roland Shoemaker 2e91b6515b Merge branch 'master' into dns-meta 2016-01-04 14:55:39 -08:00