Commit Graph

3641 Commits

Author SHA1 Message Date
Daniel McCarney 63e8d25394 Fixes Mandrill UNSUB merge tag (#2524)
Per the documentation we should be using *|UNSUB:url|* not |UNSUB:url|. I also confirmed that the template that predates the Boulder data/ version used *|UNSUB:https://mandrillapp.com/unsub|*.

I also moved the "WARNING" onto the preceding line. The unsubscribe URL is quite long and gnarly once its been filed in and the warning could be missed in a MUA that doesn't wrap lines well.

This resolves #2523
2017-01-25 10:53:17 -08:00
Jacob Hoffman-Andrews 056defba86 Refactor integration test. (#2515)
Add a new tiny client called chisel, in place of test.js. This reduces the
number of language runtimes Boulder depends on for its tests. Also, since chisel
uses the acme Python library, we get more testing of that library, which
underlies Certbot. This also gives us more flexibility to hook different parts
of the issuance flows in our tests.

Reorganize integration-test.py itself. There was not clear separation of
specific test cases. Some test cases were added as part of run_node_test; some
were wrapped around it. There is now much closer to one function per test case.
Eventually we may be able to adopt Python's test infrastructure for these test
cases.

Remove some unused imports; consolidate on urllib2 instead of urllib.

For getting serial number and expiration date, replace shelling out to OpenSSL
with using pyOpenSSL, since we already have an in-memory parsed certificate.

Replace ISSUANCE_FAILED, REVOCATION_FAILED, MAILER_FAILED with simple die, since
we don't use these. Later, I'd like to remove the other specific exit codes. We
don't make very good use of them, and it would be more effective to just use
stack traces or, even better, reporting of which test cases failed.

Make test_single_ocsp_sign responsible for its own subprocess lifecycle.

Skip running startservers if WFE is already running, to make it easier to
iterate against a running Boulder (saves a few seconds of Boulder startup).
2017-01-25 10:50:04 -08:00
Roland Bracewell Shoemaker 7853532972 Encode challenge errors and validation records when handling protobufs (#2520)
Previously we had `Error` and `ValidationRecords` fields in the `Challenge` protobuf but they were never populated which mean't that when using gRPC these fields wouldn't be sent to the SA from the RA on a `FinalizeAuthorization` call. This change populates those fields and updates the PB marshaling tests to verify the correct behavior.

Fixes #2514.
2017-01-25 09:39:35 -05:00
Jacob Hoffman-Andrews ad3738bbf5 Robustify expired_authz_purger test. 2017-01-24 18:02:35 -08:00
Jacob Hoffman-Andrews ecd8d558f3 Review feedback. 2017-01-24 17:45:19 -08:00
Jacob Hoffman-Andrews 94bd21c082 Merge branch 'master' of github.com:letsencrypt/boulder into chisel2 2017-01-23 13:30:11 -08:00
Jacob Hoffman-Andrews 6c93b41f20 Add a limit on failed authorizations (#2513)
Fixes #976.

This implements a new rate limit, InvalidAuthorizationsPerAccount. If a given account fails authorization for a given hostname too many times within the window, subsequent new-authz attempts for that account and hostname will fail early with a rateLimited error. This mitigates the misconfigured clients that constantly retry authorization even though they always fail (e.g., because the hostname no longer resolves).

For the new rate limit, I added a new SA RPC, CountInvalidAuthorizations. I chose to implement this only in gRPC, not in AMQP-RPC, so checking the rate limit is gated on gRPC. See #2406 for some description of the how and why. I also chose to directly use the gRPC interfaces rather than wrapping them in core.StorageAuthority, as a step towards what we will want to do once we've moved fully to gRPC.

Because authorizations don't have a created time, we need to look at the expires time instead. Invalid authorizations retain the expiration they were given when they were created as pending authorizations, so we use now + pendingAuthorizationLifetime as one side of the window for rate limiting, and look backwards from there. Note that this means you could maliciously bypass this rate limit by stacking up pending authorizations over time, then failing them all at once.

Similarly, since this limit is by (account, hostname) rather than just (hostname), you can bypass it by creating multiple accounts. It would be more natural and robust to limit by hostname, like our certificate limits. However, we currently only have two indexes on the authz table: the primary key, and

(`registrationID`,`identifier`,`status`,`expires`)

Since this limit is intended mainly to combat misconfigured clients, I think this is sufficient for now.

Corresponding PR for website: letsencrypt/website#125
2017-01-23 11:22:51 -08:00
Daniel McCarney 15e73edc5a Google Safe Browsing V4 Improvements (#2504)
This PR has three primary contributions:

1. The existing code for using the V4 safe browsing API introduced in #2446 had some bugs that are fixed in this PR.
2. A gsb-test-srv is added to provide a mock Google Safebrowsing V4 server for integration testing purposes.
3. A short integration test is added to test end-to-end GSB lookup for an "unsafe" domain.

For 1) most notably Boulder was assuming the new V4 library accepted a directory for its database persistence when it instead expects an existing file to be provided. Additionally the VA wasn't properly instantiating feature flags preventing the V4 api from being used by the VA.

For 2) the test server is designed to have a fixed set of "bad" domains (Currently just honest.achmeds.discount.hosting.com). When asked for a database update by a client it will package the list of bad domains up & send them to the client. When the client is asked to do a URL lookup it will check the local database for a matching prefix, and if found, perform a lookup against the test server. The test server will process the lookup and increment a count for how many times the bad domain was asked about.

For 3) the Boulder startservers.py was updated to start the gsb-test-srv and the VA is configured to talk to it using the V4 API. The integration test consists of attempting issuance for a domain pre-configured in the gsb-test-srv as a bad domain. If the issuance succeeds we know the GSB lookup code is faulty. If the issuance fails, we check that the gsb-test-srv received the correct number of lookups for the "bad" domain and fail if the expected isn't reality.

Notes for reviewers:

* The gsb-test-srv has to be started before anything will use it. Right now the v4 library handles database update request failures poorly and will not retry for 30min. See google/safebrowsing#44 for more information.
* There's not an easy way to test for "good" domain lookups, only hits against the list. The design of the V4 API is such that a list of prefixes is delivered to the client in the db update phase and if the domain in question matches no prefixes then the lookup is deemed unneccesary and not performed. I experimented with sending 256 1 byte prefixes to try and trick the client to always do a lookup, but the min prefix size is 4 bytes and enumerating all possible prefixes seemed gross.
* The test server has a /add endpoint that could be used by integration tests to add new domains to the block list, but it isn't being used presently. The trouble is that the client only updates its database every 30 minutes at present, and so adding a new domain will only take affect after the client updates the database.

Resolves #2448
2017-01-23 11:07:20 -08:00
Jacob Hoffman-Andrews 7705b18a70 Refactor integration test.
Add a new tiny client called chisel, in place of test.js. This reduces the
number of language runtimes Boulder depends on for its tests. Also, since chisel
uses the acme Python library, we get more testing of that library, which
underlies Certbot. This also gives us more flexibility to hook different parts
of the issuance flows in our tests.

Reorganize integration-test.py itself. There was not clear separation of
specific test cases. Some test cases were added as part of run_node_test; some
were wrapped around it. There is now much closer to one function per test case.
Eventually we may be able to adopt Python's test infrastructure for these test
cases.

Remove some unused imports; consolidate on urllib2 instead of urllib.

For getting serial number and expiration date, replace shelling out to OpenSSL
with using pyOpenSSL, since we already have an in-memory parsed certificate.

Replace ISSUANCE_FAILED, REVOCATION_FAILED, MAILER_FAILED with simple die, since
we don't use these. Later, I'd like to remove the other specific exit codes. We
don't make very good use of them, and it would be more effective to just use
stack traces or, even better, reporting of which test cases failed.

Make single_ocsp_sign responsible for its own subprocess lifecycle.

Skip running startservers if WFE is already running, to make it easier to
iterate against a running Boulder (saves a few seconds of Boulder startup).
2017-01-22 20:51:27 -08:00
Roland Bracewell Shoemaker 170e37c675 Add a special error message if we are trying to talk TLS to a HTTP-only server (#2511)
If the VA fails to validate a TLS-SNI-01 challenge because it is trying to talk TLS to a HTTP-only server return a special error message that is slightly more informative.
2017-01-20 11:36:39 -05:00
Roland Bracewell Shoemaker b2a4a1692b Add counter for signatures (#2510)
Add a super basic counter for certificate and OCSP signatures so we have a slightly less noisy idea of our current HSM signing performance and where it is going.

Fixes #2438.
2017-01-20 11:33:09 -05:00
Jacob Hoffman-Andrews 16ab736c07 Temporarily switch to SIGKILL for startservers shutdown. (#2512)
Unfortunately our clean shutdown code paths are too noisy, and often obscure
real errors. We can turn this back to SIGTERM once that's fixed.
2017-01-19 16:45:43 -08:00
Roland Bracewell Shoemaker 7d7adabe44 Allow probs.ProblemDetails to be passed across gRPC layer (#2506)
Currently services will pass both `core.XXXError` and `probs.XXX` type errors across the gRPC layer. In the future (#2505) we intend to stop passing `probs.XXX` type errors across this layer but for now we need to support them until that change is landed. This patch takes the easiest path to allow this by encoding the `probs.ProblemDetails` to JSON and storing it in the gRPC error body so that it can be passed around.

Fixes #2497.
2017-01-19 14:59:44 -08:00
Roland Bracewell Shoemaker cb64fee358 Purge POST'd OCSP responses as well as GETs (#2449)
Akamai expects a special URL to be used to purge responses from POST'd requests, this change adds those URLs to the existing GET URLs when purging OCSP responses.

Note this assumes all POST requests encode their OCSP request in the same manner that Golang does, which is most likely not the case in some situations. In order to mimic all the possible formats we would need to write a bunch of custom ASN.1 definitions that conform with specific field use/order that browsers etc use and generate a URL for each of those as well.

Fixes #996.
2017-01-18 09:30:15 -08:00
Jacob Hoffman-Andrews 714ec98a0c Update OCSP load testing doc. (#2486)
Prefer up over start to allow prometheus container to find boulder.
Use ocspMinTimeToExpiry: 0h trick instead of updating DB manually.

Offer command to fill DB.

Offer Prometheus link to throughput graph.
2017-01-17 16:32:31 -08:00
Jacob Hoffman-Andrews 9dacdd5443 Fix SA wrappers for maps. (#2498)
We turn arrays into maps with a range command. Previously, we were taking the
address of the iteration variable in that range command, which meant incorrect
results since the iteration variable gets reassigned.

Also change the integration test to catch this error.

Fixes #2496
2017-01-17 14:07:07 -08:00
Roland Bracewell Shoemaker 80600fbadb Only send cache purge request if generated OCSP has 'revoked' status (#2487)
* Only send cache purge request if generated OCSP has 'revoked' status

* Only call sendPurge from generateRevokedResponse
2017-01-17 17:02:41 -05:00
Josh Soref 281de6d90e Fix auth line in email test (#2499)
Previously the comment was wrong (left off the leading `\0`), but the base64-encoded `AUTH` line was right.

This change also modified the password from `paswd` to `passwd`.
2017-01-17 09:46:58 -08:00
Josh Soref 8adf9d41cf Spelling (#2500)
Various spelling fixes.
2017-01-16 10:44:52 -05:00
Jacob Hoffman-Andrews 0a367962d6 Make restarting boulder in docker nicer. (#2492)
* Make restarting boulder in docker nicer.

Handle SIGTERM in startservers.py.
Forcibly remove rsyslog pid to avoid error.

* Add explanatory comment.

* Send SIGTERM instead of kill.

* Further improvements.

- Handle SIGINT too.
- Use unbuffered mode for Python so the print statements (like "all servers
  running") get printed right away rather than at shutdown
- Squelch an unnecessary OSError about interrupting the wait() call.
2017-01-13 11:55:28 -05:00
Jacob Hoffman-Andrews 373ff015a2 Update cfssl, CT, and OCSP dependencies (#2170)
Pulls in logging improvements in OCSP Responder and the CT client, plus a handful of API changes. Also, the CT client verifies responses by default now.

This change includes some Boulder diffs to accommodate the API changes.
2017-01-12 16:01:14 -08:00
Jacob Hoffman-Andrews 82a048cfb9 Use config overrides for expiration-mailer. (#2473)
Previously, the expiration-mailer would always run with the default config, even
if BOULDER_CONFIG_DIR was used to point at config-next. This led to missing some
config parse problems that should have been test failures.
2017-01-12 12:13:27 -08:00
Jacob Hoffman-Andrews 93a5e1284a Add dial timeout to SQL DSNs. (#2491)
We're planning to add dial timeouts in prod, and want to make it consistent with dev.

Read timeout has to be fairly generous because it has to be at least as high as the longest query we expect to run. Dial timeout can be much more aggressive, because in all normal cases, dial should complete very quickly. This allows us to timeout connections more quickly when the database is slow or unavailable, returning 500 immediately rather than piling up connections and eventually returning 500.

This is mainly useful for services that serve user traffic directly and have a high number of maxDBConns. It's not as important for admin tools like the cert-checker, mailer, and so on.
2017-01-12 10:52:33 -08:00
Daniel McCarney 74c7566904 Mention the non-default challenge ports in README. (#2493)
By default the Boulder dev env's VA config test/config/va.json has
a portConfig that connects to port 5002 for HTTP-01 challenges and
port 5001 for TLS-SNI-01 challenges.

This can be confusing to someone that follows the FAKE_DNS guidelines
for using a client from the host machine but expects the VA to reach out
on port 80 or 443 like production/staging's VA.

This commit updates the README to include a note on the non-standard
ports and how to change them.
2017-01-12 10:41:32 -08:00
Jacob Hoffman-Andrews cbde78d58f Harmonize and tweak configs (#2479)
Set authorizationLifetimeDays to 60 across both config and config-next.

Set NumSessions to 2 in both config and config-next. A decrease from 10 because pkcs11-proxy (or pkcs11-daemon?) seems to error out under load if you have more sessions than CPUs.

Reorder parallelGenerateOCSPRequests to match config-next.

Remove extra tags for parsing yaml in config objects.
2017-01-10 13:46:38 -08:00
Jacob Hoffman-Andrews 902d14d7ba Add Prometheus to docker-compose. (#2481)
This makes it easier to test monitoring of local Boulder instances, and look at stats when load testing.

Note that this creates a dependency from prom -> boulder that can only be satisfied by docker-compose
up or docker-compose up boulder, not by docker-compose run --service-ports boulder (since
docker-compose doesn't see the latter as a full-fledged boulder instance).
2017-01-10 13:45:32 -08:00
Jacob Hoffman-Andrews d6ba7fcba9 Add some timing histogram stats (#2482)
Previously our gRPC client code called the wrong function, enabling server-side instead of client-side histograms.

Also, add a timing stat for the generate / store combination in OCSP Updater.
2017-01-10 11:02:41 -08:00
Jacob Hoffman-Andrews 5aa78a578c Skip parsing of certificates in OCSP Updater. (#2480)
The result is thrown away, so this adds a slight performance overhead for no
benefit. In theory it would catch malformed certificates in the DB, but that is
the job of cert-checker.
2017-01-10 10:36:34 -05:00
Jacob Hoffman-Andrews 58ccd7a71a Copy all statsd stats to Prometheus. (#2474)
We have a number of stats already expressed using the statsd interface. During
the switchover period to direct Prometheus collection, we'd like to make those
stats available both ways. This change automatically exports any stats exported
using the statsd interface via Prometheus as well.

This is a little tricky because Prometheus expects all stats to by registered
exactly once. Prometheus does offer a mechanism to gracefully recover from
registering a stat more than once by handling a certain error, but it is not
safe for concurrent access. So I added a concurrency-safe wrapper that creates
Prometheus stats on demand and memoizes them.

In the process, made a few small required side changes:
 - Clean "/" from method names in the gRPC interceptors. They are allowed in
   statsd but not in Prometheus.
 - Replace "127.0.0.1" with "boulder" as the name of our testing CT log.
   Prometheus stats can't start with a number.
 - Remove ":" from the CT-log stat names emitted by Publisher. Prometheus stats
   can't include it.
 - Remove a stray "RA" in front of some rate limit stats, since it was
   duplicative (we were emitting "RA.RA..." before).

Note that this means two stat groups in particular are duplicated:
 - Gostats* is duplicated with the default process-level stats exported by the
   Prometheus library.
 - gRPCClient* are duplicated by the stats generated by the go-grpc-prometheus
   package.

When writing dashboards and alerts in the Prometheus world, we should be careful
to avoid these two categories, as they will disappear eventually. As a general
rule, if a stat is available with an all-lowercase name, choose that one, as it
is probably the Prometheus-native version.

In the long run we will want to create most stats using the native Prometheus
stat interface, since it allows us to use add labels to metrics, which is very
useful. For instance, currently our DNS stats distinguish types of queries by
appending the type to the stat name. This would be more natural as a label in
Prometheus.
2017-01-10 10:30:15 -05:00
Jacob Hoffman-Andrews 328589ba0f Add Prometheus to docker-compose. 2017-01-07 13:38:51 -08:00
Jacob Hoffman-Andrews 510e279208 Simplify gRPC TLS configs. (#2470)
Previously, a given binary would have three TLS config fields (CA cert, cert,
key) for its gRPC server, plus each of its configured gRPC clients. In typical
use, we expect all three of those to be the same across both servers and clients
within a given binary.

This change reuses the TLSConfig type already defined for use with AMQP, adds a
Load() convenience function that turns it into a *tls.Config, and configures it
for use with all of the binaries. This should make configuration easier and more
robust, since it more closely matches usage.

This change preserves temporary backwards-compatibility for the
ocsp-updater->publisher RPCs, since those are the only instances of gRPC
currently enabled in production.
2017-01-06 14:19:18 -08:00
Jacob Hoffman-Andrews 7e187773b2 Expose more debug ports. (#2471)
Some Boulder services offered debug ports but did not expose them in
docker-compose.yml.
2017-01-06 16:48:25 -05:00
Jacob Hoffman-Andrews 9b8dacab03 Split out separate RPC services for issuing and for signing OCSP (#2452)
This allows finer-grained control of which components can request issuance. The OCSP Updater should not be able to request issuance.

Also, update test/grpc-creds/generate.sh to reissue the certs properly.

Resolves #2417
2017-01-05 15:08:39 -08:00
Daniel McCarney 74c5e68491 Fixes the `config/publisher.json` clientNames list. (#2466)
In https://github.com/letsencrypt/boulder/pull/2453 we created
individual client certificates for each gRPC client. The "clientNames"
list in the `config-next/publisher.json` was updated for the new
component-specific SANs but we neglected to updated
`config/publisher.json`. This caused the `ocsp-updater` (which uses gRPC
in the base `config/` to talk to the `publisher`) to fail to connect.

This commit updates `config/publisher.json` to have the same clientNames
as `config-next/publisher.json` and resolves #2465
2017-01-03 10:10:01 -08:00
Jacob Hoffman-Andrews eadce69146 Improve Docker instructions. (#2464)
Previously the instructions assumed you had Go setup on your host, which
somewhat defeates the point of running Boulder inside Docker, since it requires
more initial setup. These instructions make first-time users less likely to hit
the oci runtime error described later in the README.
2017-01-03 09:43:44 -08:00
Jacob Hoffman-Andrews 089a270453 Add instructions on load testing OCSP generation. (#2459) 2017-01-02 11:36:03 -08:00
Simone Carletti b5bac90efd Update the publicsuffix dep to v0.3.1 (#2462)
We recently made changes to the IANA suffixes, and you may want to pull them into the latest Boulder version.

```
➜  publicsuffix-go git:(master) git show -s

commit 3ea542729b4d7056a9d1356c9baf27bcad2bda7f
Author: Simone Carletti <weppos@weppos.net>
Date:   Mon Jan 2 18:28:57 2017 +0100

    Release 0.3.1
```

```
➜  publicsuffix-go git:(master) go test ./...
?   	github.com/weppos/publicsuffix-go/cmd/gen	[no test files]
?   	github.com/weppos/publicsuffix-go/cmd/load	[no test files]
ok  	github.com/weppos/publicsuffix-go/net/publicsuffix	0.045s
ok  	github.com/weppos/publicsuffix-go/publicsuffix	0.091s
```

v0.3.1 is tagged and signed with my PGP key.
https://github.com/weppos/publicsuffix-go/releases/tag/v0.3.1
2017-01-02 11:05:19 -08:00
IntiGabriel 18341d0e3f Add instructions to fetch boulder into $GOPATH (#2460)
When cloning this repository it is not clear that you need to get boulder first. Starting directly with `docker-compose up` fails.
2016-12-30 01:23:18 -08:00
Jacob Hoffman-Andrews 0c665b2053 Split up gRPC certificates by service. (#2453)
Previously, all gRPC services used the same client and server certificates. Now,
each service has its own certificate, which it uses for both client and server
authentication, more closely simulating production.

This also adds aliases for each of the relevant hostnames in /etc/hosts. There
may be some issues if Docker decides to rewrite /etc/hosts while Boulder is
running, but this seems to work for now.
2016-12-29 14:53:59 -08:00
Jacob Hoffman-Andrews 3abb9d1780 Make client certificate errors more verbose. (#2451)
Echo the expected list of names and the received list of names.

Also, change the unittest to use its own testdata directory rather than
borrowing.
2016-12-29 14:52:12 -08:00
Daniel McCarney d26a54b3e9 Adds 'kid' divergence to docs (#2458)
Resolves #2455
2016-12-29 14:51:47 -08:00
Daniel McCarney 74e281c1ce Switch to Google's v4 safebrowsing library. (#2446)
Right now we are using a third-party client for the Google Safe Browsing API, but Google has recently released their own [Golang library](https://github.com/google/safebrowsing) which also supports the newer v4 API. Using this library will let us avoid fixing some lingering race conditions & unpleasantness in our fork of `go-safebrowsing-api`.

This PR adds support for using the Google library & the v4 API in place of our existing fork when the `GoogleSafeBrowsingV4` feature flag is enabled in the VA "features" configuration.

Resolves https://github.com/letsencrypt/boulder/issues/1863

Per `CONTRIBUTING.md` I also ran the unit tests for the new dependency:
```
daniel@XXXXXXXXXX:~/go/src/github.com/google/safebrowsing$ go test ./...
ok  	github.com/google/safebrowsing	3.274s
?   	github.com/google/safebrowsing/cmd/sblookup	[no test files]
?   	github.com/google/safebrowsing/cmd/sbserver	[no test files]
?   	github.com/google/safebrowsing/cmd/sbserver/statik	[no test files]
?   	github.com/google/safebrowsing/internal/safebrowsing_proto	[no test files]
ok  	github.com/google/safebrowsing/vendor/github.com/golang/protobuf/jsonpb	0.012s
?   	github.com/google/safebrowsing/vendor/github.com/golang/protobuf/jsonpb/jsonpb_test_proto	[no test files]
ok  	github.com/google/safebrowsing/vendor/github.com/golang/protobuf/proto	0.062s
?   	github.com/google/safebrowsing/vendor/github.com/golang/protobuf/proto/proto3_proto	[no test files]
?   	github.com/google/safebrowsing/vendor/github.com/golang/protobuf/protoc-gen-go	[no test files]
?   	github.com/google/safebrowsing/vendor/github.com/golang/protobuf/protoc-gen-go/descriptor	[no test files]
ok  	github.com/google/safebrowsing/vendor/github.com/golang/protobuf/protoc-gen-go/generator	0.017s
?   	github.com/google/safebrowsing/vendor/github.com/golang/protobuf/protoc-gen-go/grpc	[no test files]
?   	github.com/google/safebrowsing/vendor/github.com/golang/protobuf/protoc-gen-go/plugin	[no test files]
ok  	github.com/google/safebrowsing/vendor/github.com/golang/protobuf/ptypes	0.009s
?   	github.com/google/safebrowsing/vendor/github.com/golang/protobuf/ptypes/any	[no test files]
?   	github.com/google/safebrowsing/vendor/github.com/golang/protobuf/ptypes/duration	[no test files]
?   	github.com/google/safebrowsing/vendor/github.com/golang/protobuf/ptypes/empty	[no test files]
?   	github.com/google/safebrowsing/vendor/github.com/golang/protobuf/ptypes/struct	[no test files]
?   	github.com/google/safebrowsing/vendor/github.com/golang/protobuf/ptypes/timestamp	[no test files]
?   	github.com/google/safebrowsing/vendor/github.com/golang/protobuf/ptypes/wrappers	[no test files]
?   	github.com/google/safebrowsing/vendor/github.com/rakyll/statik	[no test files]
?   	github.com/google/safebrowsing/vendor/github.com/rakyll/statik/fs	[no test files]
ok  	github.com/google/safebrowsing/vendor/golang.org/x/net/idna	0.003s
```
2016-12-27 11:18:11 -05:00
Daniel McCarney 5acce8ba38 Removes `ProblemDetailsForError` from `verifyPOST`. (#2444)
Prior to this commit, when there was an err from
`wfe.SA.GetRegistrationByKey`, and that error wasn't an instance of
`core.NoSuchRegistrationError`, `verifyPOST` converted the error into
a problem by sending it through `core.ProblemDetailsForError(err, "")`.

In this case, this isn't an appropriate strategy. The only possible
errors that can be sent through this function will not match any of the
`case` statements in `core.ProblemDetailsForError` and will be returned
by the `default` case:

```
default:
  // Internal server error messages may include sensitive data, so we do
  // not include it.
  return probs.ServerInternal(msg)
```

Since `verifyPOST` calls this function with `msg = ""`,
`ProblemDetailsForError` will return an empty `ServerInternalProblem`. When the
caller of `verifyPOST` gives the returned serv internal problem to `sendError`
it will produce: `"Internal error -  - %s<nil>"` because the problem's detail
is "" and the error code given to `sendError` is nil.

Since having examined the code paths in `verifyPOST` before
`core.ProblemDetailsForError` won't ever match anything but the default case
producing a blank message it seems the proper fix here is to not use
`ProblemDetailsForError` at all and instead directly instantiate a
`ServerInternalProblem` with a suitable message.
2016-12-21 13:00:48 -08:00
Roland Bracewell Shoemaker 5ca43d2985 Fix akamai cache purger bugs (#2443)
Fixes two bugs in the Akamai cache purging library and one in the `ocsp-updater` and adds some tests to the Akamai library.

* The first was that the backoff logic was broken, the backoff was calculated but discarded as it was assumed the sleep happened inside `core.RetryBackoff` instead of it returning the amount of time to backoff.
* The second was that the internal HTTP client would only log errors if they were fatal which was superfluous as the caller would also log the fatal errors and masked what the actual issue was during retries.
* The last in `ocsp-updater` was that `path.Join` was used to create a URL which is not an intended use of the method as it attempts to clean paths. This meant that the scheme prefix `http://` would be 'cleaned' to `http:/`, since Akamai has no idea what the malformed URLs referred to it would return 403 Forbidden which we could consider a temporary error and retry until failure.
2016-12-21 09:05:49 -05:00
Jacob Hoffman-Andrews d710dc9a2f Updates divergences after more feedback (#2441)
As pointed out by @webczat (Thanks again!), my last update to `acme-divergences.md`
https://github.com/letsencrypt/boulder/pull/2402 was still a little bit off-the-mark accuracy wise.

This PR resolves the problems (fingers crossed! 🎌) that remained with the
documentation of the Section 6.3 "URL field" divergence and the Section 5.4.1
"existing registration" divergence.

Resolves https://github.com/letsencrypt/boulder/issues/2414
2016-12-19 15:00:49 -08:00
Daniel McCarney e74e7ad14b Include domain name in email subj (#2435)
This PR modifies the `expiration-mailer` utility to change the subject used in the reminder emails to include a domain name from the expiring certificate.

Previously unless otherwise specified using the `Mailer.Subject` configuration parameter all reminder emails were sent with the subject `Certificate expiration notice`. Both the `test/config/` and `test/config-next` expiration mailer configurations do not override the subject and were using the default.

With this PR, if no `Mailer.Subject` configuration parameter is provided then reminder emails are sent with the subject `Certificate expiration notice for domain "foo.bar.com"` in the case of only one domain in the expiring certificate, and `Certificate expiration for domain "foo.bar.com" (and $(n-1) more)` for the case where there are n > 1 domains (e.g. "(and 1 more)", "(and 2 more)" ...). I explicitly left support for the `Mailer.Subject` override to allow legacy configurations to function.

I didn't explicitly add a new unit test for this behaviour because the existing unit tests were exercising both the "configuration override" portion of the subject behaviour, and matching the new expected subject. It would be entirely duplicated code to write a separate test for the subject template.

Resolves #2411
2016-12-19 17:12:37 -05:00
Daniel McCarney 1083db5a15 Optimize expiration-mailer queries (#2440)
This PR splits up the expiration-mailer's `findExpiringCertificates` query into two parts:
1. One query to find `certificateStatus` serial numbers that match the search criteria
2. Sequential queries to find each `certificate` row for the results from 1.

This removes the `JOIN` on two large tables from the original `findExpiringCertificates` query and lets us shift load away from the database. https://github.com/letsencrypt/boulder/issues/2432 wasn't sufficient to reduce the load of this query.

Resolves https://github.com/letsencrypt/boulder/issues/2425
2016-12-19 14:29:37 -05:00
Daniel 2cf2b97358
Updates divergences after more feedback 📣 2016-12-19 11:45:43 -05:00
Jacob Hoffman-Andrews 0cc0ab9d9b Add admin-revoker integration tests for serial-revoke and auth-revoke (#2421)
Skips adding tests for reg-revoke as it would require significant changes to how test.js
functions that would additionally require re-working a number of the other integration
tests.

Updates #2340.
2016-12-15 16:54:32 -08:00
Daniel McCarney 5c3482d2dd `certificateStatus` table optimizations (Part Four) (#2432)
Similar to #2431 the expiration-mailer's `findExpiringCertificates()` query can be optimized slightly by using `certificateStatus.NotAfter` in place of `certificate.expires` in the `WHERE` clause of its query when the `CertStatusOptimizationsMigrated` feature is enabled.

Resolves https://github.com/letsencrypt/boulder/issues/2425
2016-12-15 12:53:54 -08:00