Errors that were being returned in the checkAlgorithm methods of both wfe and
wfe2 didn't really match up to what was actually being checked. This change
attempts to bring the errors in line with what is actually being tested.
Fixes#4452.
This branch also updates the WFE2 parseJWS function to match the error string fixed in the upstream project for the case where a JWS EC public key fails to unmarshal due to an incorrect length.
Resolves#4300
We now expect that the config dir is always set, so we make that
explicit in the integration test and error if that's not true.
This change also renames the variable to just "config_dir", and removes
the parameter to startservers.start, which is currently never set to
anything other than its default value.
This also explicitly sets the environment variable in .travis.yml.
This creates the correct type of backend service for the OCSP generator.
It also adds an invocation of orphan-finder during the integration
tests.
This also adds a minor safety check to SA that I hit while writing the
test. Without this safety check, passing a certificate with no DNSNames
to AddCertificate would result in an obscure MariaDB syntax error
without enough context to track it down. In normal circumstances this
shouldn't be hit, but it will be good to have a solid error message if
we hit it in tests sometime.
Also, this tweaks the .travis.yml so it explicitly sets BOULDER_CONFIG_DIR
to test/config in the default case. Because the docker-compose run
command uses -e BOULDER_CONFIG_DIR="${BOULDER_CONFIG_DIR}",
we were setting a blank BOULDER_CONFIG_DIR in default case.
Since the Python startservers script sets a default if BOULDER_CONFIG_DIR
is not set, we haven't noticed this before. But since this test case relies
on the actual environment variable, it became an issue.
Fixes#4499
Addresses two issues introduced in #4476:
* Keep setting the V2 field in modelToAuthzPB so RPCs returned from new components to old don't cause panics
* Don't return expired orders from the SA, so that users requesting old orders that contain old style authorizations don't cause breakage in the RA
Only removing :443 when the http.Request.TLS is not nil breaks when
Boulder's WFE/WFE2 are running HTTP behind a separate ingress proxy that
terminates HTTPS on its behalf.
We need to apply some fixes for bugs introduced in #4476 before it can be deployed, as such we need to revert #4495 as there needs to be a full deploy cycle between these two changes.
This reverts commit 3ae1ae1.
😭
OCSPGeneratorService matches the semantics better, and is what
ocsp-updater uses. It also matches what's in the config-next.
This wasn't caught by integration tests because we don't currently
run orphan-finder in the integration tests. We don't have a good way
to induce failures in the SA on demand.
This change set makes the authz2 storage format the default format. It removes
most of the functionality related to the previous storage format, except for
the SA fallbacks and old gRPC methods which have been left for a follow-up
change in order to make these changes deployable without introducing
incompatibilities.
Fixes#4454.
SCT signature verification already happens within the CT client, so we are
currently doing it twice for no reason. This change removes the redundant check
we perform.
In f32fdc4 the Boulder logging framework was updated to emit a CRC32-IEEE
checksum in log lines. The `log-validator` command verifies these checksums in
one of two ways:
1. By running as a daemon process, tailing logs and verifying checksums as they
arrive.
2. By running as a one-off command, verifying checksums of every line in a log
file on disk.
If we can't write to stdout we prefer to panic immediately rather than
potentially lose logs we capture from redirecting stdout as a syslog backup.
A unit test is included to verify the panic behaviour. Prior to the `log` diff
in this branch the test failed because the non-nil `err` result from
`fmt.Printf` was being away:
```
=== RUN TestStdoutFailure
=== PAUSE TestStdoutFailure
=== CONT TestStdoutFailure
FAIL github.com/letsencrypt/boulder/log 0.011s
FAIL
```
After the `log` package diff in this branch is applied the test passes.
I additionally tested this end-to-end by redirecting stdout to a full
filesystem volume mounted into the Boulder docker image. It provoked the
expected panic when a component tried to write to stdout and the filesystem was
full.
The idea expressed in this comment isn't representative of the
Boulder cmds. E.g. There's no top level "App Shell" in use and the
`NewAppShell`, `Action` and `Run` functions ref'd do not exist.
We have a nice `sa/precertificates.go` file that holds `AddPrecertificate`
(and other precert functions). Let's put `GetPrecertificate` there
too instead of in the more generic `sa/sa.go` file.
Adds a CRC32-IEEE checksum to our log lines. At most this adds 8 bytes per line, and at least adds 2 bytes. Given this a relatively minor change I haven't bothered flagging it, although if we have anything in place that assumes the current structure of log lines we may want to add a flag in order to prevent immediate breakage before things can be altered.
Fixes#4474.
A gauge wasn't the appropriate stat type choice for this usage.
Switching the stat to be a counter instead of a gauge means we can't
detect when the janitor is finished its work in the integration test by
watching for this stat to drop to zero for all the table labels we're
concerned with. Instead the test is updated to watch for the counter
value to stabilize for a period longer than the workbatch sleep.
Spamming runs of the `TestPrecertificateRevocation` integration test from
1cd9733c24 found two ways it would flake on rare
occasion:
1. A [data race in the
`ct-test-srv`](https://gist.github.com/cpu/761c176cb72e0eaa52656d3322423202)
would kill the test log process and the integration test would be unable to
reach the mock API. This causes the test failure flagged in #4460. The root
cause is addressed by refactoring the `ct-test-srv`'s
`addChainOrPre` function to use a separate function for checking/extending the
rejected list with the correct locking in place.
2. Occasionally the integration test wasn't able to find a matching precert in
the very first configured ct-test-srv. This produces a test failure like:
```
--- FAIL: TestPrecertificateRevocation (4.95s)
--- FAIL: TestPrecertificateRevocation/revocation_by_certificate_key (1.27s)
revocation_test.go:110: finding rejected precertificate: no matching ct-test-srv rejection found
FAIL
FAIL github.com/letsencrypt/boulder/test/integration 4.961s
FAIL
```
I believe this is addressed by changing the integration test logic to check all of
the configured `ct-test-srv` instances for a matching precert instead of just
the first.
Resolves https://github.com/letsencrypt/boulder/issues/4460
Since OCSP signing happens before precertificate signing, it is simpler
and safer to consider the issuance a failure if OCSP signing fails than to
continue with signing the precertificate and try to sign OCSP later.
Fixes#4429
In the process, rename generateOCSPAndStoreCertificate to just
storeCertificate, because that function doesn't generate OCSP anymore;
instead the OCSP is generated (and stored) at precertificate issuance
time.
This also adds the badCSR error type specified by RFC 8555. It is a natural fit for the errors in VerifyCSR that aren't covered by badPublicKey. The web package function for converting a berror to
a problem is updated for the new badCSR error type.
The callers (RA and CA) are updated to return the berrors from VerifyCSR as is instead of unconditionally wrapping them as a berrors.MalformedError instance. Unit/integration tests are updated accordingly.
Resolves#4418
In order to move multi perspective validation forward we need to support policy
in Boulder configuration that can relax multi-va requirements temporarily.
A similar mechanism was used in support of the gradual deprecation of the
TLS-SNI-01 challenge type and with the introduction of CAA enforcement and has
shown to be a helpful tool to have available when introducing changes that are
expected to break sites.
When the VA "multiVAPolicyFile" is specified it is assumed to be a YAML file
containing two lists:
1. disabledNames - a list of domain names that are exempt from multi VA
enforcement.
2. disabledAccounts - a list of account IDs that are exempt from multi VA
enforcement.
When a hostname or account ID is added to the policy we'll begin communication
with the related ACME account contact to establish that this is a temporary
measure and the root problem will need to be addressed before an eventual
cut-off date.
Resolves https://github.com/letsencrypt/boulder/issues/4455
A handful of the gRPC client-side wrappers are checking that the request is not
nil when they should be checking that the response is not nil. This could lead
to derefs where callers assume that the thing being returned is non-nil.
Fixes#4463.
Integrates the cfssl/ocsp responder code directly into boulder. I've tried to
pare down the existing code to only the bits we actually use and have removed
some generic interfaces in places in favor of directly using our boulder
specific interfaces.
Fixes#4427.
Since 9906c93217 when
`features.PrecertificateOCSP` is enabled it is possible for there to be
`certificateStatus` rows that correspond to `precertificates` that do not have
a matching final `certificates` row. This happens in the case where we began
serving OCSP for a precert and weren't able to issue a final certificate.
Prior to the fix in this branch when the `ocsp-updater` would find stale OCSP
responses by querying the `certificateStatus` table it would error in
`generateResponse` when it couldn't find a matching `certificates` row. This
branch updates the logic so that when `features.PrecertificateOCSP` is enabled
it will also try finding the ocsp update DER from the `precertificates` table
when there is no matching serial in the `certificates` table.
This allows the CA's orphan queue to specially handle
`berrors.Duplicate` as a success case, meaning that affected
certificates can be cleared off the queue.
Prev. the WFE2 would return a 500 error in the case where an
authorization ID was invalid. The WFE1 would return a 404. Returning
a malformed request problem reports the true cause of the error as
an invalid client request.
A unit test is included to verify that a TLS-ALPN-01 challenge to
a TLS 1.3 only server doesn't succeed when the `GODEBUG` value to
disable TLS 1.3 in `docker-compose.yml` is set. Without this env var
the test fails on the Go 1.13 build because of the new default:
```
=== RUN TestTLSALPN01TLS13
--- FAIL: TestTLSALPN01TLS13 (0.04s)
tlsalpn_test.go:531: expected problem validating TLS-ALPN-01 challenge against a TLS 1.3 only server, got nil
FAIL
FAIL github.com/letsencrypt/boulder/va 0.065s
```
With the env var set the test passes, getting the expected connection
problem reporting a tls error:
```
=== RUN TestTLSALPN01TLS13
2019/09/13 18:59:00 http: TLS handshake error from 127.0.0.1:51240: tls: client offered only unsupported versions: [303 302 301]
--- PASS: TestTLSALPN01TLS13 (0.03s)
PASS
ok github.com/letsencrypt/boulder/va 1.054s
```
Since we plan to eventually enable TLS 1.3 support and the `GODEBUG`
mechanism tested in the above test is platform-wide vs package
specific I decided it wasn't worth the time investment to write a
similar HTTP-01 unit test that verifies the TLS 1.3 behaviour on a
HTTP-01 HTTP->HTTPS redirect.
Resolves https://github.com/letsencrypt/boulder/issues/4415
When the `features.PrecertificateRevocation` feature flag is enabled the WFE2
will allow revoking certificates for a submitted precertificate. The legacy WFE1
behaviour remains unchanged (as before (pre)certificates issued through the V1
API will be revocable with the V2 API).
Previously the WFE2 vetted the certificate from the revocation request by
looking up a final certificate by the serial number in the requested
certificate, and then doing a byte for byte comparison between the stored and
requested certificate.
Rather than adjust this logic to handle looking up and comparing stored
precertificates against requested precertificates (requiring new RPCs and an
additional round-trip) we choose to instead check the signature on the requested
certificate or precertificate and consider it valid for revocation if the
signature validates with one of the WFE2's known issuers. We trust the integrity
of our own signatures.
An integration test that performs a revocation of a precertificate (in this case
one that never had a final certificate issued due to SCT embedded errors) with
all of the available authentication mechanisms is included.
Resolves https://github.com/letsencrypt/boulder/issues/4414