Commit Graph

92 Commits

Author SHA1 Message Date
Aaron Gable 8cb01a0c34
Enable additional linters (#6106)
These new linters are almost all part of golangci-lint's collection
of default linters, that would all be running if we weren't setting
`disable-all: true`. By adding them, we now have parity with the
default configuration, as well as the additional linters we like.

Adds the following linters:
* unconvert
* deadcode
* structcheck
* typecheck
* varcheck
* wastedassign
2022-05-11 13:58:58 -07:00
James Renken 13214b87ac
Correct privateNetworks range typo (#5915)
Fixes #5914
2022-01-28 15:40:13 -08:00
Aaron Gable 18389c9024
Remove dead code (#5893)
Running an older version (v0.0.1-2020.1.4) of `staticcheck` in
whole-program mode (`staticcheck --unused.whole-program=true -- ./...`)
finds various instances of unused code which don't normally show up
as CI issues. I've used this to find and remove a large chunk of the
unused code, to pave the way for additional large deletions accompanying
the WFE1 removal.

Part of #5681
2022-01-19 12:23:06 -08:00
Aaron Gable 2f2bac4bf2
Improve readability of A and AAAA lookup errors (#5843)
When we query DNS for a host, and both the A and AAAA lookups fail or
are empty, combine both errors into a single error rather than only
returning the error from the A lookup.

Fixes #5819
Fixes #5319
2022-01-03 10:39:25 -08:00
Jacob Hoffman-Andrews 11bda3e486
Add error counter for TLD (#5717) 2021-10-19 15:57:31 -07:00
Jacob Hoffman-Andrews 4205400a98
Lower logDNSError to info level. (#5701)
These log lines are sometimes useful for debugging, but are a normal
part of operation, not an error: Unbound will allow a response to
timeout if the remote server is too slow.
2021-10-12 10:44:54 -06:00
Samantha 6eee230d69
BDNS: Ensure DNS server addresses are dialable (#5520)
- Add function `validateServerAddress()` to `bdns/servers.go` which ensures that
  DNS server addresses are TCP/ UDP dial-able per: https://golang.org/src/net/dial.go?#L281
- Add unit test for `validateServerAddress()` in `bdns/servers_test.go`
- Update `cmd/boulder-va/main.go` to handle `bdns.NewStaticProvider()`
  potentially returning an error.
- Update unit tests in `bdns/dns_test.go`:
  - Handle `bdns.NewStaticProvider()` potentially returning an error
  - Add an IPv6 address to `TestRotateServerOnErr`
- Ensure DNS server addresses are validated by `validateServerAddress` whenever:
  - `dynamicProvider.update() is called`
  - `staticProvider` is constructed
- Construct server addresses using `net.JoinHostPost()` when
  `dynamicProvider.Addrs()` is called

Fixes #5463
2021-07-20 10:11:11 -07:00
Aaron Gable a19ebfa0e9
VA: Query SRV to preload/cache DNS resolver addrs (#5360)
Abstract out the way that the bdns library keeps track of the
resolvers it uses to do DNS lookups. Create one implementation,
the `StaticProvider`, which behaves exactly the same as the old
mechanism (providing whatever names or addresses were given
in the config). Create another implementation, `DynamicProvider`,
which re-resolves the provided name on a regular basis.

The dynamic provider consumes a single name, does a lookup
on that name for any SRV records suggesting that it is running a
DNS service, and then looks up A records to get the address of
all the names returned by the SRV query. It exports its successes
and failures as a prometheus metric.

Finally, update the tests and config-next configs to work with
this new mechanism. Give sd-test-srv the capability to respond
to SRV queries, and put the names it provides into docker's
default DNS resolver.

Fixes #5306
2021-04-20 10:11:53 -07:00
Jacob Hoffman-Andrews a89b79cb7d
Remove unused methods on bdns.Error (#5395) 2021-04-19 15:30:42 -07:00
Jacob Hoffman-Andrews 6a8bec395f
Distinguish cancellation from timeout in DNS. (#5385)
Under normal circumstances, I believe we should never have cause to
return a cancellation-related error to the user. This change should
distinguish that case in the logs so we can look for it. If it turns out
we do sometimes return cancellation-related errors to the user, we
should do further digging and figure out why.

Related #5346
2021-04-05 15:44:27 -07:00
Aaron Gable ef1d3c4cde
Standardize on `AssertMetricWithLabelsEquals` (#5371)
Update all of our tests to use `AssertMetricWithLabelsEquals`
instead of combinations of the older `CountFoo` helpers with
simple asserts. This coalesces all of our prometheus inspection
logic into a single function, allowing the deletion of four separate
helper functions.
2021-04-01 15:20:43 -07:00
Jacob Hoffman-Andrews d36ab3b9b9
Report metrics by server host, not host:port (#5262)
Fixes #5257
2021-02-08 12:22:17 -08:00
Jacob Hoffman-Andrews 2a8f0fe6ac
Rename several items in bdns (#5260)
[Go style says](https://blog.golang.org/package-names):

> Avoid stutter. Since client code uses the package name as a prefix
> when referring to the package contents, the names for those contents
> need not repeat the package name. The HTTP server provided by the
> http package is called Server, not HTTPServer. Client code refers to
> this type as http.Server, so there is no ambiguity.

Rename DNSClient, DNSClientImpl, NewDNSClientImpl,
NewTestDNSClientImpl, DNSError, and MockDNSClient to follow those
guidelines.

Unexport DNSClientImpl and MockTimeoutError (was only used internally).

Make New and NewTest return the Client interface rather than a concrete
`impl` type.
2021-01-29 17:20:35 -08:00
Samantha 802d4fed9d
Return full CAA RR response from bdns to va (#5181)
When the VA encounters CAA records, it logs the contents of those
records. When those records were the result of following a chain of
CNAMEs, the CNAMEs are included as part of the response from our
recursive resolver. However, the current flow for logging the responses
logs only the CAA records, not the CNAMEs.

This change returns the complete dig-style RR response from bdns to the
va where the response of the authoritative CAA RR is string-quoted and
logged.

This dig-style RR response is quite verbose, however it is only ever
returned from bdns.LookupCAA when a CAA response is non-empty. If the CAA
response is empty only an empty string is returned.

Fixes #5082
2020-12-10 18:17:04 -08:00
Aaron Gable 294d1c31d7
Use error wrapping for berrors and tests (#5169)
This change adds two new test assertion helpers, `AssertErrorIs`
and `AssertErrorWraps`. The former is a wrapper around `errors.Is`,
and asserts that the error's wrapping chain contains a specific (i.e.
singleton) error. The latter is a wrapper around `errors.As`, and
asserts that the error's wrapping chain contains any error which is
of the given type; it also has the same unwrapping side effect as
`errors.As`, which can be useful for further assertions about the
contents of the error.

It also makes two small changes to our `berrors` package, namely
making `berrors.ErrorType` itself an error rather than just an int,
and giving `berrors.BoulderError` an `Unwrap()` method which
exposes that inner `ErrorType`. This allows us to use the two new
helpers above to make assertions about berrors, rather than
having to hand-roll equality assertions about their types.

Finally, it takes advantage of the two changes above to greatly
simplify many of the assertions in our tests, removing conditional
checks and replacing them with simple assertions.
2020-11-06 13:17:11 -08:00
Samantha feebb4017e
bdns: replace direct type assertions with errors.As (#5122)
errors.As checks for a specific error in a wrapped error chain
(see https://golang.org/pkg/errors/#As) as opposed to asserting
that an error is of a specific type

Part of #5010
2020-10-13 17:31:42 -07:00
Samantha 7ca12212c4
Replace direct type casts with `errors.As` in BDNS (#5121)
`errors.As` checks for a specific error in a wrapped error chain
(see https://golang.org/pkg/errors/#As) as opposed to asserting
that an error is of a specific type.

Part of #5010
2020-10-08 17:20:33 -07:00
Jacob Hoffman-Andrews 2d7337dcd0
Remove newlines from log messages. (#4777)
Since Boulder's log system adds checksums to lines, but log-validator
processes entries on a per-line basis, including newlines in log
messages can cause a validation failure.
2020-04-16 16:49:08 -07:00
Jacob Hoffman-Andrews bef02e782a
Fix nits found by staticcheck (#4726)
Part of #4700
2020-03-30 10:20:20 -07:00
Jacob Hoffman-Andrews b58e5453e8
Fix output of logDNSError. (#4691)
The message had hostname and queryType backwards.
2020-03-02 08:39:21 -08:00
alexzorin 03090a0e80 bdns: friendly error text for NXDOMAIN, SERVFAIL (#4642)
Providing additional explanatory text in the error message may help
guide users who are unfamiliar with DNS error codes.
2020-01-14 08:54:33 -08:00
Roland Bracewell Shoemaker 5b2f11e07e Switch away from old style statsd metrics wrappers (#4606)
In a handful of places I've nuked old stats which are not used in any alerts or dashboards as they either duplicate other stats or don't provide much insight/have never actually been used. If we feel like we need them again in the future it's trivial to add them back.

There aren't many dashboards that rely on old statsd style metrics, but a few will need to be updated when this change is deployed. There are also a few cases where prometheus labels have been changed from camel to snake case, dashboards that use these will also need to be updated. As far as I can tell no alerts are impacted by this change.

Fixes #4591.
2019-12-18 11:08:25 -05:00
Daniel McCarney 6ed4ce23a8
bdns: move logDNSError to exchangeOne, log ErrId specially. (#4553)
We've found we need the context offered from logging the error closer to when it
happens in the `bdns` package rather than in the `va`. Adopting the function
requires adapting it slightly. Specifically in the new location we know it won't
be called with any timeout results, with a non-dns error, or with a nil
underlying error.

Having the logging done in `bdns` (and specifically from `exchangeOne`) also
lets us log the wire format of the query and response when we get a `dns.ErrId`
error indicating a query/response ID mismatch. A small unit test is included
that ensures the logging happens as expected.

In case it proves useful for matching against other metrics the DNS ID mismatch
error case also now increments a dedicated prometheus counter vector stat,
`dns_id_mismatch`. The stat is labelled by resolver and query type.

Resolves https://github.com/letsencrypt/boulder/issues/4532
2019-11-15 16:03:45 -05:00
Jacob Hoffman-Andrews 7f6caddc5b VA: log internal DNS errors. (#4520)
When we get a DNS error that has an internal cause (like connection
refused), we return a generic message like "networking error" to the
user to avoid revealing details that would be confusing. However, when
debugging problems with our own services, it's useful to have the
underlying errors.

This adds a helper method in the VA and calls it from each place we use
DNS errors.
2019-11-04 09:09:24 -05:00
Roland Bracewell Shoemaker 6f93942a04 Consistently used stdlib context package (#4229) 2019-05-28 14:36:16 -04:00
Roland Bracewell Shoemaker e839042bae dns: Remove Authorities field from ValidationRecord (#4230) 2019-05-28 14:11:32 -04:00
Jacob Hoffman-Andrews 4c420e2bc2 bdns: Remove LookupMX. (#4202)
We used to use this for checking email domains on registration, but not
anymore.
2019-05-06 09:29:44 -04:00
Roland Bracewell Shoemaker 97d1788a18 Add resolver to DNS metrics (#3874)
Helpful for debugging stuff in multi-resolver setups.
2018-10-01 11:16:45 -07:00
Daniel McCarney cca4a0c14a BDNS: Rotate the DNS server between query retries. (#3861)
When a retryable error occurs and there are multiple DNS servers configured it is prudent to change servers before retrying the query. This helps ensure that one dead DNS server won't result in queries failing.

Resolves https://github.com/letsencrypt/boulder/issues/3846
2018-09-19 08:06:09 -07:00
Joel Sing f8a023e49c Remove various unnecessary uses of fmt.Sprintf (#3707)
Remove various unnecessary uses of fmt.Sprintf - in particular:

- Avoid calls like t.Error(fmt.Sprintf(...)), where t.Errorf can be used directly.

- Use strconv when converting an integer to a string, rather than using
  fmt.Sprintf("%d", ...). This is simpler and can also detect type errors at
  compile time.

- Instead of using x.Write([]byte(fmt.Sprintf(...))), use fmt.Fprintf(x, ...).
2018-05-11 11:55:25 -07:00
Jacob Hoffman-Andrews 49511538d0 Make DNS timeout stat more specific. (#3627)
Distinguish between deadline exceeded vs canceled. Also, combine those
two cases with "out of retries" into a single stat with a label
determining type.
2018-04-09 09:29:07 -04:00
Roland Bracewell Shoemaker 8167abd5e3 Use internet facing appropriate histogram buckets for DNS latencies (#3616)
Also instead of repeating the same bucket definitions everywhere just use a single top level var in the metrics package in order to discourage copy/pasting.

Fixes #3607.
2018-04-04 08:01:54 -04:00
Roland Bracewell Shoemaker 5748792a07 Fix IP blacklist typo (#3478) 2018-02-26 09:31:33 -08:00
Jacob Hoffman-Andrews 4f2e5f11e5 Accept large responses from the resolver. (#3467)
This fixes some use cases where a domain being validated has a very
large number of CAA records.
2018-02-21 08:54:48 -05:00
Jacob Hoffman-Andrews 1fe8aa8128 Improve errors for DNS challenge (#3317)
Before this change, we would just log "Correct value not found for DNS challenge"
when we got a TXT record that didn't match what we expected. This was different
from the error when no TXT records were found at all, but viewing the error out of
context doesn't make that clear. This change improves the error to specifically say
that we found a TXT record, but it was the wrong one.

Also in this change: if we found multiple TXT records, we mention the number;
and we trim the length of the echoed TXT record.
2018-01-03 15:37:23 -05:00
Jacob Hoffman-Andrews f16c3af335 Active UDPDNS by default. (#3285)
Active UDPDNS by default
2017-12-15 12:26:45 -08:00
Roland Bracewell Shoemaker bdea281ae0 Remove CAA SERVFAIL exceptions code (#3262)
Fixes #3080.
2017-12-05 14:39:37 -08:00
Jacob Hoffman-Andrews 2fd2f9e230 Remove LegacyCAA implementation. (#3240)
Fixes #3236
2017-11-20 16:09:00 -05:00
Jacob Hoffman-Andrews 90278c80fe Revert "Reject CAA responses containing DNAMEs (#3082)" (#3188)
This reverts commit 08d2018c10.

Feedback from root programs:

https://cabforum.org/pipermail/public/2017-October/012293.html
https://cabforum.org/pipermail/public/2017-October/012297.html
https://cabforum.org/pipermail/public/2017-October/012358.html
https://cabforum.org/pipermail/public/2017-October/012320.html

Resolves #3130.
2017-10-23 11:14:56 -07:00
Jacob Hoffman-Andrews 4e68fb2ff6 Switch to udp for internal DNS. (#3135)
We used to use TCP because we would request DNSSEC records from Unbound, and
they would always cause truncated records when present. Now that we no longer
request those (#2718), we can use UDP. This is better because the TCP serving
paths in Unbound are likely less thoroughly tested, and not optimized for high
load. In particular this may resolve some availability problems we've seen
recently when trying to upgrade to a more recent Unbound.

Note that this only affects the Boulder->Unbound path. The Unbound->upstream
path is already UDP by default (with TCP fallback for truncated ANSWERs).
2017-10-10 10:06:33 -04:00
Jacob Hoffman-Andrews fce975a1e6 Move CAA mocks into caa_test. (#3084)
There were a bunch of test fixtures in bdns/mocks.go that were only used in va/caa_test.go. This moves them to be in the same file so we have less spooky action at a distance.

One side-effect: We can't construct bdns.DNSError with the internal fields we want, because those fields are unexported. So we switch a couple of mock cases to just return a generic error, and the corresponding test cases to expect that error.
2017-09-18 13:10:01 -07:00
Jacob Hoffman-Andrews 08d2018c10 Reject CAA responses containing DNAMEs (#3082)
Since the legacy CAA spec does the wrong thing with DNAMEs (treating them as
CNAMEs), and it's hard to reconcile this approach with CNAME handling, and
DNAMEs are extremely rare, reject outright any CAA responses containing DNAMEs.

Also, in the process, fix a bug in the previous LegacyCAA implementation.
Because the processing of records in LookupCAA was gated by `if
answer.Header().RRType == dnsType`, non-CAA responses were filtered out. This
wasn't caught by previous testing, because it was unittesting that mocked out
bdns.
2017-09-13 10:54:48 -07:00
Jacob Hoffman-Andrews 4266853092 Implement legacy form of CAA (#3075)
This implements the pre-erratum 5065 version of CAA, behind a feature flag.

This involved refactoring DNSClient.LookupCAA to return a list of CNAMEs in addition to the CAA records, and adding an alternate lookuper that does tree-climbing on single-depth aliases.
2017-09-13 10:16:12 -04:00
Roland Bracewell Shoemaker 9d34af6a82 Set AD bit in the header of DNS queries (#3068)
Fixes broken DNSSEC metrics, lack of this bit being set in queries had no security implications.
2017-09-12 09:28:07 -07:00
Jacob Hoffman-Andrews 568407e5b8 Remote VA logging and stats (#3063)
Add a logging statement that fires when a remote VA fail causes
overall failure. Also change remoteValidationFailures into a
counter that counts the same thing, instead of a histogram. Since
the histogram had the default bucket sizes, it failed to collect
what we needed, and produced more metrics than necessary.
2017-09-11 12:50:50 -07:00
Roland Bracewell Shoemaker eadbc19c43 Switch DNS metrics from statsd to prometheus (#2994)
Makes the DNS stats code much nicer if I don't say so myself. Should make investigating DNS problems much easier now as well.

Fixes #2956.
2017-08-22 14:33:36 -07:00
Roland Bracewell Shoemaker 05d869b005 Rename DNSResolver -> DNSClient (#2878)
Fixes #639.

This resolves something that has bugged me for two+ years, our DNSResolverImpl is not a DNS resolver, it is a DNS client. This change just makes that obvious.
2017-07-18 08:37:45 -04:00
Daniel c905cfb8db
Rewords IPv6 -> IPv4 fallback error messages. 2017-05-11 10:35:30 -04:00
Daniel McCarney 47452d6c6c Prefer IPv6 addresses, fall back to IPv4. (#2715)
This PR introduces a new feature flag "IPv6First". 

When the "IPv6First" feature is enabled the VA's HTTP dialer and TLS SNI
(01 and 02) certificate fetch requests will attempt to automatically
retry when the initial connection was to IPv6 and there is an IPv4
address available to retry with.

This resolves https://github.com/letsencrypt/boulder/issues/2623
2017-05-08 13:00:16 -07:00
Roland Bracewell Shoemaker b82c244e65 Add stat for how often DNS responses are signed (#2716)
I'm interested in seeing both how often DNS responses we see are signed (mainly for CAA, but
also interested in other query types). This change adds a new counter, `Authenticated`, that can
be compared against the `Successes` counter to find the percentage of signed responses we see.
The counter is incremented if the `msg.AuthenticatedData` bit is set by the upstream resolver.
2017-05-02 10:57:11 -07:00