3797 Commits

Author SHA1 Message Date
Daniel McCarney 361e7d4caa Clean up `berrors` (#2724)
This PR removes two berrors that aren't used anywhere in the codebase:

* TooManyRequests, a holdover from AMQP that is no longer used.
* UnsupportedIdentifier, used just for rejecting IDNs, which we no longer do.
In addition, the SignatureValidation error was only used by the WFE so it is moved there and unexported.

Note for reviewers: To remove berrors.UnsupportedIdentifierError I replaced the errIDNNotSupported error in policy/pa.go with a berrors.MalformedError with the same name. This allows removing UnsupportedIdentifierError ahead of #2712 which removes the IDNASupport feature flag. This seemed OK to me, but I can restore UnsupportedIdentifierError and clean it up after 2712 if that's preferred.

Resolves #2709
2017-05-04 10:56:26 +01:00
Daniel McCarney 40663ba66c Fixes test speeds by splitting `-race` from coverage runs. (#2721)
The unit test runs in CI have been taking ~20 minutes. The root cause is
using `-race` on every individual `go test` invocation. We can't switch
to one big `go test` with `-race` instead of individuals if we want test
coverage to be reported. The workaround is to do one big `go test` with
`-race` first, and then many individual `go test`'s to collect coverage
*without* `-race`. This is still faster overall than the current state
of affairs.

Resolves https://github.com/letsencrypt/boulder/issues/2695
2017-05-02 14:57:32 -07:00
Daniel McCarney 1ed34a4a5d Fixes cert count rate limit for exact PSL matches. (#2703)
Prior to this PR if a domain was an exact match to a public suffix
list entry the certificates per name rate limit was applied based on the
count of certificates issued for that exact name and all of its
subdomains.

This PR introduces an exception such that exact public suffix
matches correctly have the certificate per name rate limit applied based
on only exact name matches.

In order to accomplish this a new RPC, `CountCertificatesByExactNames`,
is added to the SA. This operates similarly to the existing
`CountCertificatesByNames` but does *not* include subdomains in the
count, only exact matches to the names provided. The usage of this new
RPC is gated behind the "CountCertificatesExact" feature flag.

The RA unit tests are updated to test the new code paths both with and
without the feature flag enabled.

Resolves #2681
2017-05-02 13:43:35 -07:00
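A minimal sketch of the counting difference described in 1ed34a4a5d above. The helper names `countByName` and `countByExactName` and the in-memory slice are hypothetical stand-ins for illustration; the real RPCs query the SA database and count certificates rather than bare names.

```go
package main

import (
	"fmt"
	"strings"
)

// countByName counts issued names that equal name or are subdomains of it,
// mirroring the subdomain-inclusive behaviour the commit describes.
func countByName(issued []string, name string) int {
	n := 0
	for _, d := range issued {
		if d == name || strings.HasSuffix(d, "."+name) {
			n++
		}
	}
	return n
}

// countByExactName counts only names identical to name (no subdomains).
func countByExactName(issued []string, name string) int {
	n := 0
	for _, d := range issued {
		if d == name {
			n++
		}
	}
	return n
}

func main() {
	// "co.uk" stands in for a name that exactly matches a public suffix entry.
	issued := []string{"co.uk", "example.co.uk", "a.example.co.uk", "other.co.uk"}
	fmt.Println("including subdomains:", countByName(issued, "co.uk"))
	fmt.Println("exact matches only:  ", countByExactName(issued, "co.uk"))
}
```

With the subdomain-inclusive count, every registrant under the suffix would contribute to the limit for the suffix itself, which is the behaviour the exact-match count avoids.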
Roland Bracewell Shoemaker b82c244e65 Add stat for how often DNS responses are signed (#2716)
I'm interested in seeing how often the DNS responses we see are signed (mainly for CAA, but
also for other query types). This change adds a new counter, `Authenticated`, that can
be compared against the `Successes` counter to find the percentage of signed responses we see.
The counter is incremented if the `msg.AuthenticatedData` bit is set by the upstream resolver.
2017-05-02 10:57:11 -07:00
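The comparison between the two counters in b82c244e65 above can be sketched as follows. The `dnsStats` type and its methods are hypothetical; only the counter names `Authenticated` and `Successes` come from the commit message, and the boolean argument stands in for the AD bit on a response.

```go
package main

import "fmt"

// dnsStats is a hypothetical stand-in for the two counters named in the commit.
type dnsStats struct {
	Successes     int
	Authenticated int
}

// record updates the counters for one successful response. authenticatedData
// stands in for the AD bit set by the upstream resolver.
func (s *dnsStats) record(authenticatedData bool) {
	s.Successes++
	if authenticatedData {
		s.Authenticated++
	}
}

// signedPercent returns the share of successful responses that were signed.
func (s *dnsStats) signedPercent() float64 {
	if s.Successes == 0 {
		return 0
	}
	return 100 * float64(s.Authenticated) / float64(s.Successes)
}

func main() {
	var s dnsStats
	for _, ad := range []bool{true, false, false, true} {
		s.record(ad)
	}
	fmt.Printf("%.1f%% of responses were signed\n", s.signedPercent())
}
```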
Roland Bracewell Shoemaker 2ecb8bf7a5 Remove confusing SetEdns0 call (#2718)
Remove `SetEdns0` call in `bdns.exchangeOne`. Since we talk over TCP to the production
resolver and we don't do any local validation of DNSSEC records, adding the EDNS0 OPT
record is pointless and confusing. Testing against a local `unbound` instance shows you
don't need to set the DO bit for DNSSEC requests/validation to be done at the resolver
level.
2017-05-02 10:55:47 -07:00
Jacob Hoffman-Andrews 8e80a22493 Remove RequestTime and ResponseTime from WFE log (#2708)
Fixes #2707.
2017-04-27 14:55:31 -07:00
Daniel McCarney 0282f9f48e Embeds detail msg for RejectedIdentifier and InvalidEmail probs. (#2704)
In #2583 the internal error usage was reworked. Previously the rejected
identifier and invalid email problems were constructed directly with
a meaningful detail message and then piped straight through the
`core.ProblemDetailsForError` function unaltered allowing the detail to
make it all the way through to the error returned by the WFE to the
client.

Since the refactor Boulder has not been appending the detail message for
these two problem types in `problemDetailsForBoulderError`, making the
errors harder to diagnose client-side.

This commit restores the previous behaviour by updating
`problemDetailsForBoulderError`. The `TestProblemDetailsFromError` unit
test is also updated to check that the correct amount of detail is being
embedded in the problem detail based on the error type.
2017-04-26 14:56:54 -07:00
Jacob Hoffman-Andrews d542960a35 Remove statsd version of RPC stats (#2693)
* Remove statsd-style RPC stats.

* Remove tests for old code.
2017-04-25 10:10:35 -04:00
Jacob Hoffman-Andrews d99800ecb1 Remove some last traces of AMQP. (#2687)
Fixes #2665
2017-04-20 10:43:17 -07:00
Jacob Hoffman-Andrews d59188c676 Use pattern to determine endpoint metrics. (#2689)
This ensures we don't create infinite metrics based on users hitting
non-existent endpoints.
2017-04-20 13:14:47 -04:00
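One way to get the bounded metric labels described in d59188c676 above is to label by the registered mux pattern rather than the raw request path. This is a sketch under that assumption, not necessarily how the commit does it; the endpoint paths and the helper names `newMux` and `endpointLabel` are hypothetical.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// newMux registers a couple of illustrative endpoints (hypothetical paths).
func newMux() *http.ServeMux {
	mux := http.NewServeMux()
	ok := func(w http.ResponseWriter, r *http.Request) {}
	mux.HandleFunc("/directory", ok)
	mux.HandleFunc("/acme/cert/", ok)
	return mux
}

// endpointLabel returns the registered pattern that would serve path, or
// "unknown" when nothing matches, so the label set stays finite.
func endpointLabel(mux *http.ServeMux, path string) string {
	r := httptest.NewRequest(http.MethodGet, path, nil)
	_, pattern := mux.Handler(r)
	if pattern == "" {
		return "unknown"
	}
	return pattern
}

func main() {
	mux := newMux()
	paths := []string{"/directory", "/acme/cert/0001", "/acme/cert/0002", "/no/such/thing"}
	for _, p := range paths {
		fmt.Println(p, "->", endpointLabel(mux, p))
	}
}
```

Because the label comes from the set of registered patterns, requests for arbitrary non-existent paths all collapse into a single label instead of creating a new metric each.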
Jacob Hoffman-Andrews eccca3ccd4 Dockerfile/docker-compose.yml cleanups (#2682)
Removes the reliance on `$GOPATH` being set in order to use `docker-compose`. Also removes
a few unnecessary commands from the `Dockerfile` that were no longer doing anything. If you
get weird errors along the lines of `oci runtime error: cannot chdir to ...` you will need to
`docker-compose rm; docker-compose build; docker-compose up` to fix them.

Fixes #2660.
2017-04-20 09:08:55 -07:00
Brian Smith 2781549618 CA: Use Prometheus for counting issued certificates. (#2690)
Prepare for counting precertificates by parameterizing the counter
on "purpose".
2017-04-19 11:06:04 -07:00
David Calavera cc5ee3906b Refactor IsSane and IsSane* to return useful errors. (#2685)
This change switches the return values from boolean to error.

It makes `checkConsistency` an internal function and removes the
optional argument in favor of making checks explicit where they are
used.

It also renames those functions to CheckConsistency* to not
give the impression of still returning boolean values.

Signed-off-by: David Calavera <david.calavera@gmail.com>
2017-04-19 12:08:47 -04:00
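The boolean-to-error refactor in cc5ee3906b above follows a common Go pattern, sketched here. The `challenge` type, its fields, and the specific rules are hypothetical and far simpler than Boulder's real checks; only the `checkConsistency` name is taken from the commit message.

```go
package main

import (
	"errors"
	"fmt"
)

// challenge is a hypothetical, simplified object to validate.
type challenge struct {
	Type  string
	Token string
}

// checkConsistency replaces a boolean "is sane" check: it returns nil when
// the value is consistent and a descriptive error otherwise.
func (c challenge) checkConsistency() error {
	if c.Type == "" {
		return errors.New("challenge type is empty")
	}
	if len(c.Token) < 32 {
		return fmt.Errorf("token length %d is below the minimum of 32", len(c.Token))
	}
	return nil
}

func main() {
	bad := challenge{Type: "http-01", Token: "short"}
	if err := bad.checkConsistency(); err != nil {
		// The caller learns why the check failed, not merely that it did.
		fmt.Println("inconsistent challenge:", err)
	}
}
```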
Jacob Hoffman-Andrews 17e697fde8 Add boulderdash Grafana dashboard in JSON. (#2683) 2017-04-18 09:54:03 -04:00
Lucas Amorim 3c4873bb5c Returns an Internal Server Error on grpc/db timeouts (#2624) 2017-04-17 19:50:03 -07:00
Roland Shoemaker 0131a96a71 Dockerfile/docker-compose.yml cleanups 2017-04-14 14:59:31 -07:00
Jacob Hoffman-Andrews 9c1e8e6764 Call logging.basicConfig() in chisel. (#2679)
Without this, chisel would fail to log even with LOGLEVEL set to 0.
2017-04-14 10:08:53 -04:00
Jacob Hoffman-Andrews 6155ec9ad2 Update divergences doc to describe rel=next. (#2678)
Based on a conversation with an implementer who found this confusing (since
Certbot 0.11 uses them).
2017-04-14 10:08:10 -04:00
Jacob Hoffman-Andrews 99ce9cc51c Add Prometheus collection for WFE to default config. (#2675) 2017-04-13 10:07:28 -07:00
Roland Bracewell Shoemaker b38077e02e Change prefixdb semantics (#2674)
Instead of executing the prefix for every statement, only do it when creating the connection.
Leaves most of the existing naming conventions alone but updates the relevant comments
to reflect that variables are now set at the connection level instead of the statement level.

Fixes #2673.
2017-04-12 21:57:58 -07:00
Jacob Hoffman-Andrews d849f58cec Fix "valiation typo in VA. (#2676) 2017-04-12 11:12:22 -07:00
Jacob Hoffman-Andrews dcbe7e0895 Increase readTimeout in TestTimeouts. (#2671)
Should decrease flakiness in this test. Fixes #2564
2017-04-09 15:32:02 -04:00
Brian Smith 647b448cc1 CA: Don't add cert DER to audit log for OCSP failures. (#2667)
First, commit c0ad8d9040 (PR #2658)
had a minor bug: It didn't update the "pem=" in the audit log
message to "cert=" to be consistent with the rest of the code.

But, more importantly, we don't need to include the cert DER in the
audit log at this point because we've already logged the DER and its
serial number prior to this. Thus, at this point logging the serial
number is good enough.
2017-04-07 13:50:59 -07:00
Daniel McCarney 4bc28ff0c4 Relaxes CT integration test hack further. (#2670)
In 18f4c5c we introduced a workaround for the CT submission integration
test to allow either exactly the expected number of CT log submissions or
twice as many, to account for the case where the ocsp-updater and the CA race.
This didn't completely patch over the issue because the number of
submissions can fall between `n` and `2n`.

This commit updates the hack to be even hackier (twice as hacky or your
money back). Now we consider any value *between* `n` and `2n` as a test
pass.
2017-04-07 16:02:40 -04:00
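The two acceptance rules (the original workaround and the relaxed one in 4bc28ff0c4 above) differ only in their predicate. A small Go sketch, assuming the range is inclusive at both ends; the function names are hypothetical and the actual integration test is not written in Go.

```go
package main

import "fmt"

// exactOrDouble is the earlier rule: accept exactly n or exactly 2n submissions.
func exactOrDouble(got, n int) bool {
	return got == n || got == 2*n
}

// withinDouble is the relaxed rule: accept anything from n to 2n inclusive.
func withinDouble(got, n int) bool {
	return got >= n && got <= 2*n
}

func main() {
	n := 2
	for got := 1; got <= 5; got++ {
		fmt.Printf("got=%d exactOrDouble=%v withinDouble=%v\n",
			got, exactOrDouble(got, n), withinDouble(got, n))
	}
}
```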
Brian Smith 29bc4033ed CA: Clarify error handling in IssueCertificate. (#2659)
It is obvious `err` can never be non-nil in these places so remove
any suggestion to the contrary.
2017-04-06 14:48:25 -04:00
Brian Smith c0ad8d9040 CA: Consistently use hex.EncodeToString(certDER) in audit log. (#2658)
`certPEM` should only be used when it cannot be decoded. Otherwise
`hex.EncodeToString(certDER)` should be used, as is done everywhere
else.
2017-04-06 14:45:37 -04:00
Brian Smith 497a027842 RA: Parse issued certificate only once. (#2657)
Previously RegistrationAuthorityImpl.NewCertificate would call
MatchesCSR() and then verify that the certificate can be successfully
parsed. However, MatchesCSR() itself parses the certificate, so the
latter check was pointless.

Instead, parse the certificate once, fail if it can't be parsed,
then pass the parsed certificate to MatchesCSR().
2017-04-06 14:44:32 -04:00
Daniel McCarney 594e31b724 Logs authz ID when ra.onValidationUpdate fails. (#2662)
0e112ae updates ra/ra.go such that when onValidationUpdate returns a non-nil error the AuditErr
message includes the affected authorization ID in addition to the registration ID.

Resolves #2661.
2017-04-06 11:16:18 -07:00
Brian Smith c0b0163a06 Add “.idea” to .gitignore. (#2655)
The JetBrains Gogland IDE stores its stuff in .idea.
2017-04-06 11:15:32 -07:00
Daniel McCarney e6c63f1f11 Removes notafter-backfiller configs. (#2650)
This commit removes the notafter-backfiller config files. The actual
tool was already removed and these are just cruft.
2017-04-05 18:26:05 -07:00
Jacob Hoffman-Andrews 02f3c3be8e Add checkocsp and ocsp_forever. (#2632)
These are monitoring tools, originally from
https://github.com/jsha/go/tree/master/ocsp. We'd like to formalize their role
in monitoring Boulder, so I'm adding them to the Boulder repo and getting them
reviewed.
2017-04-05 12:05:06 -07:00
Jacob Hoffman-Andrews cef0a630b3 Remove old-style gRPC TLS configs (#2495)
* Switch Publisher gRPC to use new "tls" block.

* Remove old-style GRPC TLS configs.

* Fix incorrect TLS blocks.

* Remove more config.
2017-04-05 12:41:41 -04:00
Roland Bracewell Shoemaker a46d30945c Purge remaining AMQP code (#2648)
Deletes github.com/streadway/amqp and the various RabbitMQ setup tools etc. Changes how listenbuddy is used to proxy all of the gRPC client -> server connections so we test reconnection logic.

+49 -8,221 😁

Fixes #2640 and #2562.
2017-04-04 15:02:22 -07:00
Roland Bracewell Shoemaker ccf8c45eea Purge everything that would be expired in a year at start of eap test (#2649)
Instead of running it at the current time to clean out leftover cruft, run it with a FAKECLOCK of +1 year so that we catch everything that could get in the way.
2017-04-04 14:11:42 -07:00
Roland Bracewell Shoemaker fd561ef842 Block issuance on first OCSP response generation (#2633)
Generate first OCSP response in ca.IssueCertificate instead of ocsp-updater.newCertificateTick
if features.GenerateOCSPEarly is enabled. Adds a new field to the sa.AddCertificate RPC for
the OCSP response and only adds it to the certificate status + sets ocspLastUpdated if it is a
non-empty slice. ocsp-updater.newCertificateTick stays the same so we can catch certificates
that were successfully signed + stored but for which an OCSP response couldn't be generated
(for whatever reason).

Fixes #2477.
2017-04-04 11:28:09 -07:00
Jacob Hoffman-Andrews f4c11a673c Revert BOULDER_CONFIG_DIR to test/config. (#2644)
This was accidentally changed in
https://github.com/letsencrypt/boulder/pull/2634/, and broke Certbot's tests.

This also includes an update to chisel that fetches the certificate chain, which
would have caught this error.
2017-04-03 17:29:51 -07:00
Jacob Hoffman-Andrews 4b665e35a6 Use Prometheus stats for VA, WFE, and OCSP Responder (#2628)
Rename HTTPMonitor to MeasuredHandler.
Remove inflight stat (we didn't use it).
Add timing stat by method, endpoint, and status code.
The timing stat subsumes the "rate" stat, so remove that.
WFE now wraps in MeasuredHandler, instead of relying on its cmd/main.go.
Remove FBAdapter stats.
MeasuredHandler tracks stats by method, status code, and endpoint.

In VA, add a Prometheus histogram for validation timing.
2017-04-03 17:03:04 -07:00
Daniel McCarney ca3a2e0e3c Update publicsuffix-go to `fb1fc94` (#2642)
This PR updates the `publicsuffix-go` dependency to `fb1fc94`, the
latest autopull and the HEAD of master at the time of writing.

Per CONTRIBUTING.md the tests were verified to pass:
```
?       github.com/weppos/publicsuffix-go/cmd/load      [no test files]
ok      github.com/weppos/publicsuffix-go/net/publicsuffix      0.007s
ok      github.com/weppos/publicsuffix-go/publicsuffix  0.027s

```
2017-04-03 12:23:29 -07:00
Roland Bracewell Shoemaker 08f4dda038 Update github.com/grpc-ecosystem/go-grpc-prometheus and google.golang.org/grpc (#2637)
Updates the various gRPC/protobuf libs (google.golang.org/grpc/... and github.com/golang/protobuf/proto) and the boulder-tools image so that we can update to the newest github.com/grpc-ecosystem/go-grpc-prometheus. Also regenerates all of the protobuf definition files.

Tests run on updated packages all pass.

Unblocks #2633 fixes #2636.
2017-04-03 11:13:48 -07:00
Jacob Hoffman-Andrews 6719dc17a6 Remove AMQP config and code (#2634)
We now use gRPC everywhere.
2017-04-03 10:39:39 -04:00
Roland Bracewell Shoemaker 98addd5f36 expiration-mailer daemon mode (#2631)
Adds a daemon mode to `expiration-mailer` that is triggered by using the flag `--daemon` in order to follow deployability guidelines. If the `--daemon` flag is used, the `mailer.runPeriod` config field is checked for a tick duration; if the value is `0`, it exits.

Super lightweight implementation. OCSP-Updater has some custom ticker code which we use to do fancy things when the method being invoked in the loop takes longer than expected, but that isn't necessary here.

Fixes #2617.
2017-03-30 16:16:41 -07:00
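A lightweight ticker loop of the kind 98addd5f36 above describes might look like the sketch below. The `runDaemon` and `runForMillis` names and the use of a context for shutdown are assumptions for illustration; only the idea of checking a run period and refusing to start when it is `0` comes from the commit message.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// runDaemon calls work once per runPeriod until ctx is done. A zero (or
// negative) period is rejected up front so the caller can exit.
func runDaemon(ctx context.Context, runPeriod time.Duration, work func()) error {
	if runPeriod <= 0 {
		return errors.New("daemon mode requires a non-zero runPeriod")
	}
	t := time.NewTicker(runPeriod)
	defer t.Stop()
	for {
		select {
		case <-ctx.Done():
			return nil
		case <-t.C:
			work()
		}
	}
}

// runForMillis is a demo helper that runs the loop for a bounded time.
func runForMillis(totalMs, periodMs int, work func()) error {
	ctx, cancel := context.WithTimeout(context.Background(),
		time.Duration(totalMs)*time.Millisecond)
	defer cancel()
	return runDaemon(ctx, time.Duration(periodMs)*time.Millisecond, work)
}

func main() {
	runs := 0
	err := runForMillis(120, 20, func() { runs++ })
	fmt.Println("runs:", runs, "err:", err)
	fmt.Println("zero period:", runForMillis(120, 0, func() {}))
}
```

A plain `time.Ticker` drops ticks when `work` overruns the period, which is the simple behaviour the commit message contrasts with OCSP-Updater's custom ticker code.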
Roland Bracewell Shoemaker cefb153ea7 Fix missing berrors.InvalidEmail -> probs.ProblemDetails mapping (#2630)
This fixes an issue caused by #2583. Prior to that PR, we would serve the "invalidEmail" problem type when a DNS lookup for an email base domain failed. After that PR, we would map "berrors.InvalidEmail" to the "InternalServerError" problem type, which caused 500 errors to be returned to the user.

This PR restores the behavior of returning "type": "...invalidEmail" to the user.
2017-03-29 15:31:33 -07:00
Roland Bracewell Shoemaker acbd9ed3a7 Purge both pending and finalized authorizations as well as challenges (#2149)
Fixes #2148.

Instead of just doing a blanket `DELETE FROM ...` this changes the `expired-authz-purger` to select all of the expired IDs (for both pending and finalized authorizations) then loop over them deleting each and its associated challenges from their respective tables.

Local testing indicates the performance of this is not awful but we should do a test run on staging to verify. If it ends up taking way too long to run there the easiest optimization would be to turn the slice of IDs into a channel and run multiple workers looping over the channel deleting stuff instead of just a single one.

Makes a few small integration test changes in order to facilitate deleting both pending and finalized authorizations.
2017-03-24 11:04:35 -07:00
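The optimization floated in acbd9ed3a7 above (feeding the IDs through a channel to several workers) can be sketched as follows. This is the suggested follow-up, not what the commit implements; `purge` and `deleteOne` are hypothetical names, and `deleteOne` stands in for deleting one authorization and its challenges.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// purge sends every ID through a channel to a fixed pool of workers and
// returns how many deletions succeeded. deleteOne must be safe to call
// from several goroutines at once.
func purge(ids []string, workers int, deleteOne func(id string) error) int64 {
	if workers < 1 {
		workers = 1
	}
	work := make(chan string)
	var deleted int64
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for id := range work {
				if err := deleteOne(id); err == nil {
					atomic.AddInt64(&deleted, 1)
				}
			}
		}()
	}
	for _, id := range ids {
		work <- id
	}
	close(work) // lets the workers' range loops finish
	wg.Wait()
	return deleted
}

func main() {
	ids := make([]string, 0, 10)
	for i := 0; i < 10; i++ {
		ids = append(ids, fmt.Sprintf("authz-%d", i))
	}
	n := purge(ids, 3, func(id string) error { return nil })
	fmt.Println("deleted:", n)
}
```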
Roland Bracewell Shoemaker e2b2511898 Overhaul internal error usage (#2583)
This patch removes all usages of the `core.XXXError` and almost all usages of `probs` outside of the WFE and VA and replaces them with a unified internal error type. Since the VA uses `probs.ProblemDetails` quite extensively in challenges, and currently stores them in the DB, I've saved that conversion for another change (it'll also require a migration). Since `ProblemDetails` should only ever be exposed to end-users, all of its related logic should be moved into the `WFE`, but since it still needs to be exposed to the VA and SA I've left it in place for now.

The new internal `errors` package offers the same convenience functions as `probs` does as well as a new simpler type testing method. A few small changes have also been made to error messages, mainly adding the library and function name to internal server errors for easier debugging (i.e. where a number of functions return the exact same errors and there is no other way to distinguish which method threw the error).

Also adds proper encoding of internal errors transferred over gRPC (the current encoding scheme is kept for `core` and `probs` errors since it'll ideally be removed after we deploy this and follow-up changes) using `grpc/metadata` instead of the gRPC status codes.

Fixes #2507. Updates #2254 and #2505.
2017-03-22 23:27:31 -07:00
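A unified internal error type with convenience constructors and a simple type test, as e2b2511898 above describes, could be shaped roughly like this. It is a sketch only: the type names, the set of error kinds, and the function signatures here are assumptions and may not match Boulder's actual `errors` package.

```go
package main

import "fmt"

// ErrorType enumerates a few illustrative error kinds (hypothetical subset).
type ErrorType int

const (
	InternalServer ErrorType = iota
	Malformed
	NotFound
)

// BoulderError is a single internal error type carrying a kind and a detail.
type BoulderError struct {
	Type   ErrorType
	Detail string
}

func (be *BoulderError) Error() string { return be.Detail }

// New builds an error of the given kind with a formatted detail message.
func New(t ErrorType, msg string, args ...interface{}) error {
	return &BoulderError{Type: t, Detail: fmt.Sprintf(msg, args...)}
}

// NotFoundError is a convenience constructor for one kind.
func NotFoundError(msg string, args ...interface{}) error {
	return New(NotFound, msg, args...)
}

// Is reports whether err is a BoulderError of kind t.
func Is(err error, t ErrorType) bool {
	be, ok := err.(*BoulderError)
	return ok && be.Type == t
}

func main() {
	err := NotFoundError("no registration with ID %d", 7)
	fmt.Println(err, "| not found:", Is(err, NotFound), "| malformed:", Is(err, Malformed))
}
```

Callers branch on `Is(err, kind)` instead of type-switching over many distinct error structs, and only the outermost layer needs to translate kinds into client-facing problem documents.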
Roland Bracewell Shoemaker 194a55d7c7 Remove RabbitMQ + AMQP references from README (#2616)
Fixes #2407.
2017-03-22 12:43:43 -07:00
David Calavera c71c3cff80 Implement TLS-SNI-02 challenge validations. (#2585)
I think these are all the necessary changes to implement TLS-SNI-02 validations, according to section 7.3 of draft 05:

https://tools.ietf.org/html/draft-ietf-acme-acme-05#section-7.3

I don't have much experience with this code, so I'd really appreciate your feedback.

Signed-off-by: David Calavera <david.calavera@gmail.com>
2017-03-22 10:17:59 -07:00
Daniel McCarney 8f1de3b57e Allows expiration-mailer to use template as subject. (#2613)
This commit resolves #2599 by adding support to the expiration-mailer to
treat the subject for email messages as a template. This allows for the
dynamic subject lines from #2435 to be used with a prefix for staging
emails.
2017-03-21 16:57:28 -07:00
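Treating the subject line as a template, as in 8f1de3b57e above, can be sketched with Go's `text/template`. The `subjectData` type, its `ExpirationSubject` field, and the `renderSubject` helper are hypothetical names chosen for this example; the commit message does not specify which variables the template receives.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// subjectData holds the values available to the subject template
// (hypothetical field name).
type subjectData struct {
	ExpirationSubject string
}

// renderSubject parses tmplText as a template and executes it with data.
func renderSubject(tmplText string, data subjectData) (string, error) {
	tmpl, err := template.New("subject").Parse(tmplText)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := tmpl.Execute(&sb, data); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	data := subjectData{ExpirationSubject: "Certificate expiration notice for example.com"}
	// A staging deployment could configure a prefixed subject template.
	subject, err := renderSubject("[staging] {{.ExpirationSubject}}", data)
	fmt.Println(subject, err)
}
```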
Daniel McCarney 2114596e58 Workaround #2610 for flaky ct submission test. (#2611)
Presently the CA and the ocsp-updater can race on the initial
submission of a certificate to the configured logs. This results
in double submitting certificates. In integration tests with the fake CT
server this manifests as an occasional failure of the
`test_ct_submission` test (Issue #2579).

The race we currently experience is expected to be fixed in
the future by a planned redesign so for now this commit works around the
failure by allowing either the expected number of submissions, or
exactly double the expected. This fixes #2579. The need to fix the
underlying race was captured in #2610.

The workaround was verified by submitting 10 builds to travis, all
succeeded.
2017-03-20 09:03:54 -04:00
Daniel McCarney e81f7477a3 Fixes outdated IPv6 TODO on `getAddr`. (#2601)
The VA's `getAddr` function prior to this commit had an outdated comment
& a pointer to a TODO for Boulder Issue #593. That issue has been closed
and bdns' `LookupHost` supports AAAA records now. This commit updates
the comment to match the current behaviour and removes the TODO.
2017-03-09 13:20:03 -05:00
Roland Bracewell Shoemaker 8a1adbdc9a Switch to gorp.v2 (#2598)
Switch from `gorp.v1` to `gorp.v2`. Removes `vendor/gopkg.in/gorp.v1` and vendors `vendor/gopkg/go-gorp/gorp.v2`, all tests pass.

Changes between `v1.7.1` and `v2.0.0`: c87af80f3c...4deece6103

Fixes #2490.
2017-03-08 12:20:22 -05:00