Commit Graph

4894 Commits

Author SHA1 Message Date
Roland Bracewell Shoemaker 17d84ce18b Add GetAuthorizationsPerf feature to test/config-next/sa.json (#4534)
Seem to have forgotten to do this in #4512.
2019-11-08 11:29:54 -08:00
Daniel McCarney 32ad79a0df deps: rerun the gopkg.in/go-gorp/gorp.v2 go get. (#4531)
Newer Go versions seem to give a different psuedoversion for this
dependency at the same commit than when we initially switched to Go
modules for Boulder. Fixing the psuedoversion now so it won't trip up
future updates unexpectedly.
2019-11-07 10:21:28 -08:00
Daniel McCarney 6da5e18a1e deps: update CFSSL to v1.4.0 (#4529)
This keeps us on a tagged release and includes only small bugfixes/doc
updates.
2019-11-06 14:23:22 -08:00
Jacob Hoffman-Andrews 5e608ccbb8 tests: use relative paths in Go integration tests. (#4526)
This makes it simpler to specify paths to testdata and config files.

Fixes #4508
2019-11-05 10:22:20 -05:00
Roland Bracewell Shoemaker e402156c1c Revert "Revert "Remove remaining old format authorization code from SA/… (#4502)" (#4524)
This reverts commit dc2ce4ca84.
2019-11-04 09:45:19 -05:00
Jacob Hoffman-Andrews 49043a4156 Clarify public key blocklist documentation. (#4523)
Previously, we referred to "DER encoded PKIX public keys", but PKIX (RFC 5280)
doesn't define a standalone "public key" type. Instead, it defines
SubjectPublicKeyInfo, containing an algorithm and a BIT STRING. As a
result, SPKI and SPKI hash are more commonly used terms, and we're more
likely to get reports based on those. We should mirror that terminology
in our documentation.
2019-11-04 09:10:36 -05:00
Jacob Hoffman-Andrews 7f6caddc5b VA: log internal DNS errors. (#4520)
When we get a DNS error that has an internal cause (like connection
refused), we return a generic message like "networking error" to the
user to avoid revealing details that would be confusing. However, when
debugging problems with our own services, it's useful to have the
underlying errors.

This adds a helper method in the VA and calls it from each place we use
DNS errors.
2019-11-04 09:09:24 -05:00
stilez 2461c9fc5b PA: make user-facing error messages more intuitive (#4513)
LE is popular and aims to popularise certificate issuance. End users who see
error messages cannot be assumed to be as DNS-experienced as previously. The
user-facing error messages in the policy authority file are terse and unobvious
to the point that they are often unlikely to be well understood by those they
are intended to inform, who may be "just trying to get a LE cert for their
domain".
2019-11-01 10:25:55 -04:00
Roland Bracewell Shoemaker e49b6d7c61 SA: remove JOIN from GetAuthorizations2 and filter in code (#4512)
Previously we used a JOIN on the orderToAuthz2 table in order to make sure
we only returned authorizations created using the ACME v2 API. Each time an
order is created a pivot row (order ID + authz ID) is added to the
orderToAuthz2 table. If a large number of orders are created that all contain
the same authorization, due to reuse, then the JOINd query would return a full
authorization row for each entry in the orderToAuthz2 table with the authorization
ID.

Instead we now filter out these authorizations by doing a second query against
the orderToAuthz2 table. Using this query still requires examining a large number
of rows, but because we don't need to construct a temporary table for the JOIN
and fill it with all the full authorization rows we should save resources.

Fixes #4500.
2019-10-31 13:25:32 -04:00
Daniel McCarney 7b60b57c33
va: log account ID in multi VA differential JSON. (#4521)
This will reduce the amount of analysis time required to identify
large integrators that aren't compatible with multi VA.
2019-10-31 13:12:28 -04:00
Roland Bracewell Shoemaker e5eb8f8736 wfe/wfe2: make JWS signature alg error msgs match reality (#4519)
Errors that were being returned in the checkAlgorithm methods of both wfe and
wfe2 didn't really match up to what was actually being checked. This change
attempts to bring the errors in line with what is actually being tested.

Fixes #4452.
2019-10-31 09:55:11 -04:00
Daniel McCarney e448e81dc4 deps: update square/go-jose to v2.4.0 (#4518)
This branch also updates the WFE2 parseJWS function to match the error string fixed in the upstream project for the case where a JWS EC public key fails to unmarshal due to an incorrect length.

Resolves #4300
2019-10-30 10:59:41 -07:00
alexzorin a36a90519b Fix start.py/docker-compose (#4510)
fakeclock became a mandatory parameter in #4509
2019-10-28 11:00:43 -07:00
Jacob Hoffman-Andrews 0c9ca050ab Tidy up default_config_dir in integration test (#4509)
We now expect that the config dir is always set, so we make that
explicit in the integration test and error if that's not true.

This change also renames the variable to just "config_dir", and removes
the parameter to startservers.start, which is currently never set to
anything other than its default value.

This also explicitly sets the environment variable in .travis.yml.
2019-10-25 09:51:48 -07:00
Jacob Hoffman-Andrews d4168626ad Fix orphan-finder (#4507)
This creates the correct type of backend service for the OCSP generator.
It also adds an invocation of orphan-finder during the integration
tests.

This also adds a minor safety check to SA that I hit while writing the
test. Without this safety check, passing a certificate with no DNSNames
to AddCertificate would result in an obscure MariaDB syntax error
without enough context to track it down. In normal circumstances this
shouldn't be hit, but it will be good to have a solid error message if
we hit it in tests sometime.

Also, this tweaks the .travis.yml so it explicitly sets BOULDER_CONFIG_DIR
to test/config in the default case. Because the docker-compose run
command uses -e BOULDER_CONFIG_DIR="${BOULDER_CONFIG_DIR}",
we were setting a blank BOULDER_CONFIG_DIR in default case.
Since the Python startservers script sets a default if BOULDER_CONFIG_DIR
is not set, we haven't noticed this before. But since this test case relies
on the actual environment variable, it became an issue.

Fixes #4499
2019-10-25 09:51:14 -07:00
Jacob Hoffman-Andrews 329e4154cd
Deprecate EarlyOrderRateLimit and FasterGetOrder (#4497)
These feature flags are already turned on in production.
2019-10-24 10:47:29 -07:00
Roland Bracewell Shoemaker 83aafd1884
Address #4476 issues (#4504)
Addresses two issues introduced in #4476:
* Keep setting the V2 field in modelToAuthzPB so RPCs returned from new components to old don't cause panics
* Don't return expired orders from the SA, so that users requesting old orders that contain old style authorizations don't cause breakage in the RA
2019-10-23 13:08:32 -07:00
Daniel McCarney 3175b4f9eb
web: strip :443/:80 unconditionally w/ features.StripDefaultSchemePort (#4505)
Only removing :443 when the http.Request.TLS is not nil breaks when
Boulder's WFE/WFE2 are running HTTP behind a separate ingress proxy that
terminates HTTPS on its behalf.
2019-10-23 15:17:13 -04:00
Roland Bracewell Shoemaker dc2ce4ca84
Revert "Remove remaining old format authorization code from SA/… (#4502)
We need to apply some fixes for bugs introduced in #4476 before it can be deployed, as such we need to revert #4495 as there needs to be a full deploy cycle between these two changes.

This reverts commit 3ae1ae1.

😭
2019-10-23 10:45:29 -07:00
Roland Bracewell Shoemaker 3ae1ae1493 Remove remaining old format authorization code from SA/protos (#4495) 2019-10-23 09:08:38 -04:00
Jacob Hoffman-Andrews 672bdcfdcb orphan-finder: Rename CAService in config. (#4496)
OCSPGeneratorService matches the semantics better, and is what
ocsp-updater uses. It also matches what's in the config-next.

This wasn't caught by integration tests because we don't currently
run orphan-finder in the integration tests. We don't have a good way
to induce failures in the SA on demand.
2019-10-22 09:25:11 -07:00
Roland Bracewell Shoemaker 46e0468220 Make authz2 the default storage format (#4476)
This change set makes the authz2 storage format the default format. It removes
most of the functionality related to the previous storage format, except for
the SA fallbacks and old gRPC methods which have been left for a follow-up
change in order to make these changes deployable without introducing
incompatibilities.

Fixes #4454.
2019-10-21 15:29:15 -04:00
Jacob Hoffman-Andrews 75e1902524 publisher: allow custom UA for CT submissions. (#4492)
Configure "User-Agent: boulder/1.0" for publisher CT submissions.
2019-10-21 15:08:03 -04:00
Roland Bracewell Shoemaker 02ff51a815 publisher: remove redundant SCT signature verification (#4494)
SCT signature verification already happens within the CT client, so we are
currently doing it twice for no reason. This change removes the redundant check
we perform.
2019-10-21 10:18:58 -04:00
Roland Bracewell Shoemaker 308960cbdd log-validator: add cmd/daemon for verifying log integrity (#4482)
In f32fdc4 the Boulder logging framework was updated to emit a CRC32-IEEE
checksum in log lines. The `log-validator` command verifies these checksums in
one of two ways:

1. By running as a daemon process, tailing logs and verifying checksums as they
arrive.
2. By running as a one-off command, verifying checksums of every line in a log
file on disk.
2019-10-21 10:12:55 -04:00
Daniel McCarney eb4445be6c
log: panic if bothWriter write to stdout errs. (#4491)
If we can't write to stdout we prefer to panic immediately rather than
potentially lose logs we capture from redirecting stdout as a syslog backup.

A unit test is included to verify the panic behaviour. Prior to the `log` diff
in this branch the test failed because the non-nil `err` result from
`fmt.Printf` was being away:

```
=== RUN   TestStdoutFailure
=== PAUSE TestStdoutFailure
=== CONT  TestStdoutFailure
FAIL	github.com/letsencrypt/boulder/log	0.011s
FAIL
```

After the `log` package diff in this branch is applied the test passes.

I additionally tested this end-to-end by redirecting stdout to a full
filesystem volume mounted into the Boulder docker image. It provoked the
expected panic when a component tried to write to stdout and the filesystem was
full.
2019-10-18 13:53:00 -04:00
Daniel McCarney 7b513de6a5 orphan-finder: adopt orphan precerts. (#4483)
Since 9906c93 the CA has logged orphan log lines for precertificates as well
as certificates. The orphan-finder needs to handle them similar to final certificates.

Resolves https://github.com/letsencrypt/boulder/issues/4479
2019-10-17 13:14:57 -07:00
Jacob Hoffman-Andrews 5ff750076c Move to Go 1.13.2. (#4490) 2019-10-17 15:39:04 -04:00
Daniel McCarney 2926074a29
CI/Dev: enable TLS 1.3 (#4489)
Also update the VA's TLS-ALPN-01 TLS 1.3 unit test to not expect
a failure.
2019-10-17 14:01:38 -04:00
Daniel McCarney 117df57e8c
cmd: remove stale package comment. (#4488)
The idea expressed in this comment isn't representative of the
Boulder cmds. E.g. There's no top level "App Shell" in use and the
`NewAppShell`, `Action` and `Run` functions ref'd do not exist.
2019-10-17 13:40:32 -04:00
Roland Bracewell Shoemaker 2de47bcdee WFE/WFE2: Remove old authz/challenge support (#4486)
Does what it says on the tin. Also requires some mocks changes that will also be
used by RA changes in the next part of this change series.
2019-10-17 10:19:04 -04:00
Daniel McCarney 2e7333d9ab
SA: Move GetPrecertificate next to AddPrecertificate. (#4484)
We have a nice `sa/precertificates.go` file that holds `AddPrecertificate`
(and other precert functions). Let's put `GetPrecertificate` there
too instead of in the more generic `sa/sa.go` file.
2019-10-16 15:15:51 -04:00
Daniel McCarney 3c543cca68
travis: remove stale rabbitmq hosts addon. (#4485)
We've long since killed off RabbitMQ usage in Boulder.
2019-10-16 13:39:44 -04:00
Roland Bracewell Shoemaker f32fdc4639
Include a CRC32-IEEE checksum in log lines (#4478)
Adds a CRC32-IEEE checksum to our log lines. At most this adds 8 bytes per line, and at least adds 2 bytes. Given this a relatively minor change I haven't bothered flagging it, although if we have anything in place that assumes the current structure of log lines we may want to add a flag in order to prevent immediate breakage before things can be altered.

Fixes #4474.
2019-10-14 13:57:43 -07:00
Daniel McCarney d35c20db75 boulder-janitor: switch workbatch gauge to counter. (#4477)
A gauge wasn't the appropriate stat type choice for this usage.

Switching the stat to be a counter instead of a gauge means we can't
detect when the janitor is finished its work in the integration test by
watching for this stat to drop to zero for all the table labels we're
concerned with. Instead the test is updated to watch for the counter
value to stabilize for a period longer than the workbatch sleep.
2019-10-11 14:40:59 -07:00
Daniel McCarney 83882abf46
tests: fix TestPrecertificateRevocation integration test (#4475)
Spamming runs of the `TestPrecertificateRevocation` integration test from
1cd9733c24 found two ways it would flake on rare
occasion:

1. A [data race in the
`ct-test-srv`](https://gist.github.com/cpu/761c176cb72e0eaa52656d3322423202)
would kill the test log process and the integration test would be unable to
reach the mock API. This causes the test failure flagged in #4460. The root
cause is addressed by refactoring the `ct-test-srv`'s
`addChainOrPre` function to use a separate function for checking/extending the
rejected list with the correct locking in place.

2. Occasionally the integration test wasn't able to find a matching precert in
the very first configured ct-test-srv. This produces a test failure like:

```
--- FAIL: TestPrecertificateRevocation (4.95s)
    --- FAIL: TestPrecertificateRevocation/revocation_by_certificate_key (1.27s)
        revocation_test.go:110: finding rejected precertificate: no matching ct-test-srv rejection found
FAIL
FAIL	github.com/letsencrypt/boulder/test/integration	4.961s
FAIL
```

I believe this is addressed by changing the integration test logic to check all of 
the configured `ct-test-srv` instances for a matching precert instead of just 
the first.

Resolves https://github.com/letsencrypt/boulder/issues/4460
2019-10-10 13:23:49 -04:00
Jacob Hoffman-Andrews 3cfec7c9e1 CA: Return error when OCSP signing fails. (#4466)
Since OCSP signing happens before precertificate signing, it is simpler
and safer to consider the issuance a failure if OCSP signing fails than to
continue with signing the precertificate and try to sign OCSP later.

Fixes #4429
2019-10-10 13:13:25 -04:00
Jacob Hoffman-Andrews fead807c7c Make PrecertificateOCSP the default behavior. (#4465)
In the process, rename generateOCSPAndStoreCertificate to just
storeCertificate, because that function doesn't generate OCSP anymore;
instead the OCSP is generated (and stored) at precertificate issuance
time.
2019-10-09 17:11:58 -07:00
Daniel McCarney ecca3492e9 csr: return berrors in VerifyCSR. (#4473)
This also adds the badCSR error type specified by RFC 8555. It is a natural fit for the errors in VerifyCSR that aren't covered by badPublicKey. The web package function for converting a berror to
a problem is updated for the new badCSR error type.

The callers (RA and CA) are updated to return the berrors from VerifyCSR as is instead of unconditionally wrapping them as a berrors.MalformedError instance. Unit/integration tests are updated accordingly.

Resolves #4418
2019-10-09 17:11:11 -07:00
Daniel McCarney ddfc620c44
va: exempt multi-va enforcement by domain/acct ID. (#4458)
In order to move multi perspective validation forward we need to support policy
in Boulder configuration that can relax multi-va requirements temporarily.

A similar mechanism was used in support of the gradual deprecation of the
TLS-SNI-01 challenge type and with the introduction of CAA enforcement and has
shown to be a helpful tool to have available when introducing changes that are
expected to break sites.

When the VA "multiVAPolicyFile" is specified it is assumed to be a YAML file
containing two lists:

1. disabledNames - a list of domain names that are exempt from multi VA
   enforcement.
2. disabledAccounts - a list of account IDs that are exempt from multi VA
   enforcement.

When a hostname or account ID is added to the policy we'll begin communication
with the related ACME account contact to establish that this is a temporary
measure and the root problem will need to be addressed before an eventual
cut-off date.

Resolves https://github.com/letsencrypt/boulder/issues/4455
2019-10-07 16:43:11 -04:00
Daniel McCarney f885d153c2
tests: add select on precerts grant for ocsp_update user. (#4470) 2019-10-07 15:38:23 -04:00
Roland Bracewell Shoemaker 215818a13a Remove godeps cruft (#4467)
Boulder is fully transitioned to using 1st party Go modules/dependency management.
2019-10-07 14:55:14 -04:00
Jacob Hoffman-Andrews d3b9107059 orphan-finder: add OCSP generation (#4457)
Fixes #4428
2019-10-07 14:40:36 -04:00
Jacob Hoffman-Andrews 58ec8747da CA: add a 1-second error backoff for orphan queue. (#4462)
If there's something preventing the CA from processing its orphan queue
(and therefore returning errors), we only want to retry once per second.
2019-10-07 14:15:59 -04:00
Roland Bracewell Shoemaker 1d1da08816 gRPC: Fix client side error checking in a handful of wrappers (#4464)
A handful of the gRPC client-side wrappers are checking that the request is not
nil when they should be checking that the response is not nil. This could lead
to derefs where callers assume that the thing being returned is non-nil.

Fixes #4463.
2019-10-07 14:12:32 -04:00
Roland Bracewell Shoemaker 3359ec349b ocsp-responder: Integrate CFSSL OCSP responder code (#4461)
Integrates the cfssl/ocsp responder code directly into boulder. I've tried to
pare down the existing code to only the bits we actually use and have removed
some generic interfaces in places in favor of directly using our boulder
specific interfaces.

Fixes #4427.
2019-10-07 14:05:37 -04:00
Daniel McCarney ab26662fc8
ocsp-updater: fix generateResponse for precerts w/o certs (#4468)
Since 9906c93217 when
`features.PrecertificateOCSP` is enabled it is possible for there to be
`certificateStatus` rows that correspond to `precertificates` that do not have
a matching final `certificates` row. This happens in the case where we began
serving OCSP for a precert and weren't able to issue a final certificate.

Prior to the fix in this branch when the `ocsp-updater` would find stale OCSP
responses by querying the `certificateStatus` table it would error in
`generateResponse` when it couldn't find a matching `certificates` row. This
branch updates the logic so that when `features.PrecertificateOCSP` is enabled
it will also try finding the ocsp update DER from the `precertificates` table
when there is no matching serial in the `certificates` table.
2019-10-07 13:11:31 -04:00
Jacob Hoffman-Andrews a3c3f521ef sa: Add check for duplicates in AddPrecertificate (#4459)
This allows the CA's orphan queue to specially handle
`berrors.Duplicate` as a success case, meaning that affected
certificates can be cleared off the queue.
2019-10-03 16:29:45 -04:00
Roland Bracewell Shoemaker 31ed590edd
Strip default scheme ports from Host headers (#4448)
Fixes #4447.
2019-09-27 16:14:40 -07:00
Daniel McCarney fdbf87679b
wfe/wfe2: handle bad authz2 IDs as malformed req. (#4453)
Prev. the WFE2 would return a 500 error in the case where an
authorization ID was invalid. The WFE1 would return a 404. Returning
a malformed request problem reports the true cause of the error as
an invalid client request.
2019-09-27 17:27:02 -04:00