When we get a DNS error that has an internal cause (like connection
refused), we return a generic message like "networking error" to the
user to avoid revealing details that would be confusing. However, when
debugging problems with our own services, it's useful to have the
underlying errors.
This adds a helper method in the VA and calls it from each place we use
DNS errors.
This patch removes all usages of the `core.XXXError` and almost all usages of `probs` outside of the WFE and VA and replaces them with a unified internal error type. Since the VA uses `probs.ProblemDetails` quite extensively in challenges, and currently stores them in the DB I've saved this change for another change (it'll also require a migration). Since `ProblemDetails` should only ever be exposed to end-users all of its related logic should be moved into the `WFE` but since it still needs to be exposed to the VA and SA I've left it in place for now.
The new internal `errors` package offers the same convenience functions as `probs` does as well as a new simpler type testing method. A few small changes have also been made to error messages, mainly adding the library and function name to internal server errors for easier debugging (i.e. where a number of functions return the exact same errors and there is no other way to distinguish which method threw the error).
Also adds proper encoding of internal errors transferred over gRPC (the current encoding scheme is kept for `core` and `probs` errors since it'll be ideally be removed after we deploy this and follow-up changes) using `grpc/metadata` instead of the gRPC status codes.
Fixes#2507. Updates #2254 and #2505.
This PR reworks the validateEmail() function from the RA to allow timeouts during DNS validation of MX/A/AAAA records for an email to be non-fatal and match our intention to verify emails best-effort.
Notes:
bdns/problem.go - DNSError.Timeout() was changed to also include context cancellation and timeout as DNS timeouts. This matches what DNSError.Error() was doing to set the error message and supports external callers to Timeout not duplicating the work.
bdns/mocks.go - the LookupMX mock was changed to support always.error and always.timeout in a manner similar to the LookupHost mock. Otherwise the TestValidateEmail unit test for the RA would fail when the MX lookup completed before the Host lookup because the error wouldn't be correct (empty DNS records vs a timeout or network error).
test/config/ra.json, test/config-next/ra.json - the dnsTries and dnsTimeout values were updated such that dnsTries * dnsTimeout was <= the WFE->RA RPC timeout (currently 15s in the test configs). This allows the dns lookups to all timeout without the overall RPC timing out.
Resolves#2260.
Several of the `ProblemType`s had convenience functions to instantiate `ProblemDetails`s using their type and a detail message. Where these existed I did a quick scan of the codebase to convert places where callers were explicitly constructing the `ProblemDetails` to use the convenience function.
For the `ProblemType`s that did not have such a function, I created one and then converted callers to use it.
Solves #1837.
When a CAA request to Unbound times out, fall back to checking CAA via Google Public DNS' HTTPS API, through multiple proxies so as to hit geographically distributed paths. All successful multipath responses must be identical in order to succeed, and at most one can fail.
Fixes#1618
* Split out CAA checking service (minus logging etc)
* Add example.yml config + follow general Boulder style
* Update protobuf package to correct version
* Add grpc client to va
* Add TLS authentication in both directions for CAA client/server
* Remove go lint check
* Add bcodes package listing custom codes for Boulder
* Add very basic (pull-only) gRPC metrics to VA + caa-service
Previously we would return a detailed errorString, which ProblemDetailsFromDNSError
would turn into a generic, uninformative "Server failure at resolver".
Now we return a new internal dnsError type, which ProblemDetailsFromDNSError can
turn into a more informative message to be shown to the user.
Moves the DNS code from core to dns and renames the dns package to bdns
to be clearer.
Fixes#1260 and will be good to have while we add retries and such.