Precursor to #4116. Since some of our dependencies impose a minimum
version on these two packages higher than what we have in Godeps, we'll
have to bump them anyhow. Bumping them independently of the modules
update should keep things a little simpler.
In order to get protobuf tests to pass, I had to update protoc-gen-go in
boulder-tools. Now we download a prebuilt binary instead of using the
Ubuntu package, which is stuck on 3.0.0. This also meant I needed to
re-generate our pb.go files, since the new version generates somewhat
different output.
This happens to change the tag for pbutil, but it's not a substantive change - they just added a tagged version where there was none.
```
$ go test github.com/miekg/dns/...
ok github.com/miekg/dns 4.675s
ok github.com/miekg/dns/dnsutil 0.003s
ok github.com/golang/protobuf/descriptor (cached)
ok github.com/golang/protobuf/jsonpb (cached)
? github.com/golang/protobuf/jsonpb/jsonpb_test_proto [no test files]
ok github.com/golang/protobuf/proto (cached)
? github.com/golang/protobuf/proto/proto3_proto [no test files]
? github.com/golang/protobuf/proto/test_proto [no test files]
ok github.com/golang/protobuf/protoc-gen-go (cached)
? github.com/golang/protobuf/protoc-gen-go/descriptor [no test files]
ok github.com/golang/protobuf/protoc-gen-go/generator (cached)
ok github.com/golang/protobuf/protoc-gen-go/generator/internal/remap (cached)
? github.com/golang/protobuf/protoc-gen-go/grpc [no test files]
? github.com/golang/protobuf/protoc-gen-go/plugin [no test files]
ok github.com/golang/protobuf/ptypes (cached)
? github.com/golang/protobuf/ptypes/any [no test files]
? github.com/golang/protobuf/ptypes/duration [no test files]
? github.com/golang/protobuf/ptypes/empty [no test files]
? github.com/golang/protobuf/ptypes/struct [no test files]
? github.com/golang/protobuf/ptypes/timestamp [no test files]
? github.com/golang/protobuf/ptypes/wrappers [no test files]
```
When using the `load-generator` with a config that
specifies an `"externalState"` file and `"dontSaveState": false`, it's
clunky to have to manually create the file ahead of running the
`load-generator` (and put `{}` into it to avoid an unmarshalling error).
* load-generator: remove unused GET support, rework req totals.
The `State.get` function isn't required now that POST-as-GET is being
used throughout. Similarly tracking the # of POST requests and the # of
GET requests doesn't make sense anymore so now the `load-generator` just
tracks overall HTTP requests.
* load-generator: clean up getNonce
* Support enabling POST-as-GET requests in state config.
* Don't send key authorization in challenge initiation POSTs
* Use `RawURLEncoding` for CSR in finalization requests to strip padding.
* Fix bug that mixed up GET/POST request totals in output.
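
The padding change in the list above can be illustrated with the standard library alone. This is a minimal sketch, not the load-generator's code: `encodeCSR` is a hypothetical helper and the input bytes are placeholders rather than a real CSR.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// encodeCSR returns the unpadded base64url form of DER bytes.
// (Hypothetical helper name, used only for this illustration.)
func encodeCSR(der []byte) string {
	return base64.RawURLEncoding.EncodeToString(der)
}

func main() {
	// Placeholder bytes standing in for a DER-encoded CSR (not a real CSR).
	der := []byte{0x30, 0x82, 0x01, 0x0a}

	// URLEncoding appends "=" padding; RawURLEncoding omits it.
	fmt.Println("URLEncoding:   ", base64.URLEncoding.EncodeToString(der))
	fmt.Println("RawURLEncoding:", encodeCSR(der))
}
```

Both encodings use the same URL-safe alphabet; the only difference is the trailing `=` padding.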
## CI: restore load-generator run.
This restores running the `load-generator` during CI to make sure it doesn't bitrot. It was previously removed while we debugged the VA getting jammed up and not cleanly shutting down.
Since the global `pebble-challtestsrv` and the `load-generator`'s internal chall test srv will conflict, this requires moving the `load-generator` run to the end of the integration tests and updating `startservers.py` to allow the load gen integration test code to stop the `pebble-challtestsrv` before starting the `load-generator`.
The `load-generator` and associated config are updated to allow specifying bind addresses for the DNS interface of the internal challtestsrv. Multiple addresses are supported so that the `load-generator`'s chall test srv can listen on the DNS ports Boulder is configured to use. The `load-generator` config now accepts a `fakeDNS` parameter that can be used to specify the default IPv4 address returned by the `load-generator`'s DNS server for A queries.
## load-generator: support different challenges/strategies.
Updates the load-generator to support HTTP-01, DNS-01, and TLS-ALPN-01 challenge response servers. A new challenge selection configuration parameter (`ChallengeStrategy`) can be set to `"http-01"`, `"dns-01"`, or `"tls-alpn-01"` to solve only challenges of that type. Using `"random"` will let the load-generator choose a challenge type randomly.
Resolves https://github.com/letsencrypt/boulder/issues/3900
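
One way the challenge selection described above could work is sketched below. The function name `pickChallenge` and its error messages are assumptions made for illustration; only the strategy values (`"http-01"`, `"dns-01"`, `"tls-alpn-01"`, `"random"`) come from the description.

```go
package main

import (
	"fmt"
	"math/rand"
)

// pickChallenge chooses one challenge type from those offered by the server.
// strategy is either a specific type ("http-01", "dns-01", "tls-alpn-01")
// or "random". (Hypothetical helper, not the load-generator's actual code.)
func pickChallenge(strategy string, offered []string) (string, error) {
	if len(offered) == 0 {
		return "", fmt.Errorf("no challenges offered")
	}
	if strategy == "random" {
		return offered[rand.Intn(len(offered))], nil
	}
	for _, typ := range offered {
		if typ == strategy {
			return typ, nil
		}
	}
	return "", fmt.Errorf("strategy %q matched none of the offered challenges", strategy)
}

func main() {
	offered := []string{"http-01", "dns-01", "tls-alpn-01"}
	for _, strategy := range []string{"dns-01", "random", "bogus-01"} {
		typ, err := pickChallenge(strategy, offered)
		fmt.Printf("strategy=%q chosen=%q err=%v\n", strategy, typ, err)
	}
}
```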
- Move fakeclock, get_future_output, and random_domain to helpers.py.
- Remove tempdir handling from integration-test.py since it's already
done in helpers.py
- Consolidate handling of config dir into helpers.py, and add
CONFIG_NEXT boolean.
- Move RevokeAtRA config gating into verify_revocation to reduce
redundancy.
- Skip load-balancing test when filter is enabled.
- Ungate test_sct_embedding
- Rework test_ct_submissions, which was out of date. In particular, have a couple of
logs where submitFinalCert: false, and make ct-test-srv store submission counts
by hostnames for better test case isolation.
FinalizeAuthorization deleted from pendingAuthorizations and then added
to authz. DeactivateAuthorization did it in opposite order. This tweaks
them so they always do the insert / delete in the same order as each
other, to avoid deadlocks.
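
The underlying idea is consistent lock ordering. The sketch below uses in-memory maps and mutexes as a stand-in for the two database tables, which is an assumption for illustration only; the real change concerns SQL statements inside transactions. All type and function names here are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// table stands in for a database table guarded by one lock.
type table struct {
	mu   sync.Mutex
	rows map[string]string // authz ID -> status
}

type store struct {
	pending table
	final   table
}

func newStore() *store {
	return &store{
		pending: table{rows: map[string]string{}},
		final:   table{rows: map[string]string{}},
	}
}

// move deletes id from pending and inserts it into final with the given
// status. Every caller goes through this one function, so the two locks are
// always taken in the same order (pending, then final) and cannot deadlock.
func (s *store) move(id, status string) {
	s.pending.mu.Lock()
	defer s.pending.mu.Unlock()
	s.final.mu.Lock()
	defer s.final.mu.Unlock()

	delete(s.pending.rows, id)
	s.final.rows[id] = status
}

func (s *store) finalize(id string)   { s.move(id, "valid") }
func (s *store) deactivate(id string) { s.move(id, "deactivated") }

func main() {
	s := newStore()
	ids := make([]string, 100)
	for i := range ids {
		ids[i] = fmt.Sprintf("authz-%d", i)
		s.pending.rows[ids[i]] = "pending"
	}

	// Run both operations concurrently against every ID.
	var wg sync.WaitGroup
	for _, id := range ids {
		id := id
		wg.Add(2)
		go func() { defer wg.Done(); s.finalize(id) }()
		go func() { defer wg.Done(); s.deactivate(id) }()
	}
	wg.Wait()
	fmt.Println("pending:", len(s.pending.rows), "final:", len(s.final.rows))
}
```

If one of the two functions instead took the locks in the opposite order, two concurrent calls could each hold one lock while waiting for the other.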
Removing hard-coded paths and using the server directory to bootstrap endpoint URLs improves RFC 8555 compatibility.
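
As a rough sketch of directory bootstrapping: an RFC 8555 directory object is JSON with well-known field names (`newNonce`, `newAccount`, `newOrder`, and so on), so a client only needs the directory URL up front. The `directory` type, `parseDirectory` function, and example URLs below are assumptions for illustration, not the load-generator's code.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// directory holds the endpoint URLs from an RFC 8555 directory object.
type directory struct {
	NewNonce   string `json:"newNonce"`
	NewAccount string `json:"newAccount"`
	NewOrder   string `json:"newOrder"`
	RevokeCert string `json:"revokeCert"`
	KeyChange  string `json:"keyChange"`
}

// parseDirectory decodes a directory response body and checks that the
// endpoints this sketch relies on are present.
func parseDirectory(body []byte) (*directory, error) {
	var d directory
	if err := json.Unmarshal(body, &d); err != nil {
		return nil, err
	}
	if d.NewNonce == "" || d.NewAccount == "" || d.NewOrder == "" {
		return nil, errors.New("directory is missing a required endpoint")
	}
	return &d, nil
}

func main() {
	// Example body; in practice this comes from an HTTP GET of the directory URL.
	body := []byte(`{
		"newNonce": "https://acme.example.com/acme/new-nonce",
		"newAccount": "https://acme.example.com/acme/new-acct",
		"newOrder": "https://acme.example.com/acme/new-order"
	}`)
	d, err := parseDirectory(body)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("newOrder endpoint:", d.NewOrder)
}
```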
This branch also updates the `github.com/letsencrypt/challtestsrv` vendored dep to the latest release. There are no upstream unit tests to run.
Updates https://github.com/letsencrypt/boulder/issues/4086 - there are still a few Pebble compatibility issues to work out. I started on what became a near total rewrite of the load-generator and decided it was best to pull out some smaller PRs and re-evaluate. I'm optimistic that stashing little bits of a Go testing/boulder focused ACME client in `test/load-generator` will one day help https://github.com/letsencrypt/boulder/issues/4127
Previously we relied on each instance of `features.Set` to have a
corresponding `defer features.Reset()`. If we forget that, we can wind
up with unexpected behavior where features set in one test case leak
into another test case. This led to the bug in
https://github.com/letsencrypt/boulder/issues/4118 going undetected.
Fix #4120
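
One general way to make the reset impossible to forget is to pair the set and the reset inside a single helper. The sketch below uses a tiny stand-in flag registry; `enabled`, `set`, `reset`, and `withFeatures` are hypothetical names and do not reflect Boulder's `features` package or the actual fix in this change.

```go
package main

import "fmt"

// enabled is a stand-in for a process-wide feature flag registry.
var enabled = map[string]bool{}

// set turns the given flags on or off.
func set(flags map[string]bool) {
	for name, on := range flags {
		enabled[name] = on
	}
}

// reset restores every flag to its default (off).
func reset() {
	enabled = map[string]bool{}
}

// withFeatures sets flags, runs fn, and always resets afterwards, even if
// fn panics, so flags cannot leak into whatever runs next.
func withFeatures(flags map[string]bool, fn func()) {
	set(flags)
	defer reset()
	fn()
}

func main() {
	withFeatures(map[string]bool{"ExampleFlag": true}, func() {
		fmt.Println("inside:", enabled["ExampleFlag"])
	})
	fmt.Println("after: ", enabled["ExampleFlag"])
}
```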
* Use %T in GoodKey error fmt to handle nil keys
This is to prevent nil keys from generating format errors such as:
unknown key type %!s(<nil>)
* Add GoodKey test for nil keys
The fmt package says:
>All errors begin with the string "%!" followed sometimes by a single character (the verb) and end with a parenthesized description.
This test ensures an error is generated and that '%!' marker is not
present.
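
The difference between the two verbs is easy to reproduce with `fmt` alone. In this sketch the two helper functions are hypothetical; they only mirror the error text quoted above.

```go
package main

import (
	"fmt"
	"strings"
)

// describeWithS formats the key with %s, which produces a "%!" format
// error marker when key is nil.
func describeWithS(key interface{}) string {
	return fmt.Sprintf("unknown key type %s", key)
}

// describeWithT formats the key's dynamic type with %T, which is well
// defined for nil.
func describeWithT(key interface{}) string {
	return fmt.Sprintf("unknown key type %T", key)
}

func main() {
	for _, msg := range []string{describeWithS(nil), describeWithT(nil)} {
		fmt.Printf("%q contains format error marker: %v\n", msg, strings.Contains(msg, "%!"))
	}
}
```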
This commit updates the `github.com/weppos/publicsuffix-go` dependency
to 34e9f38 - the tip of master at the time of writing.
Unit tests are confirmed to pass:
```
~/go/src/github.com/weppos/publicsuffix-go$ git log --pretty=format:'%h' -n 1
34e9f38
~/go/src/github.com/weppos/publicsuffix-go$ go test ./...
? github.com/weppos/publicsuffix-go/cmd/load [no test files]
ok github.com/weppos/publicsuffix-go/net/publicsuffix 0.005s
ok github.com/weppos/publicsuffix-go/publicsuffix 0.006s
```
* `EnforceMultiVA` to allow configuring multiple VAs but not changing the primary VA's result based on what the remote VAs return.
* `MultiVAFullResults` to allow collecting all of the remote VA results. When all results are collected a JSON log line with the differential between the primary/remote VAs is logged.
Resolves https://github.com/letsencrypt/boulder/issues/4066
We don't intend to load test the legacy WFE implementation in the future
and if we need to we can always revive this code from git. Removing it
will make refactoring the ACME v2 code to be closer to RFC 8555 easier.
Previously the v2_integration tests were imported to the global
namespace in integration-test.py. As a result, some were shadowed and
didn't run, or called methods that were in the main namespace rather
than their own.
This PR imports and runs them under their own namespace. It also fixes
some tests that were broken. Notably:
- Fixes chisel2.expect_problem.
- Fixes incorrect namespacing on some expect_problem calls.
- Remove unused ValidationError from v2_integration.
- Replace client.key with client.net.key.
* A redirect without a hostname is obviously bad and should get
a distinct error message as early as possible.
* A redirect to a hostname that doesn't end in an IANA registered TLD is
also obviously bad and should get a distinct error message as early as
possible.
* Remove the challenge whitelist
* Reduce the signature for ChallengesFor and ChallengeTypeEnabled
* Some unit tests in the VA were changed from testing TLS-SNI to testing the same behavior
in TLS-ALPN, when that behavior wasn't already tested. For instance timeouts during connect
are now tested.
Fixes #4109
Our integration test test_http_challenge_timeout occasionally fails with
boulder-ra [AUDIT] Could not communicate with VA: rpc
error: code = DeadlineExceeded desc = context deadline exceeded
In at least one of these cases, the VA correctly timed-out its HTTP
request and logged a validation error with the correct error message.
I believe that there is a race between the VA returning its validation
error to the RA, and the RA timing out its gRPC call. By shaving some
time off the context we should more reliably get the response back to
the RA.
The primary VA now also calls `PerformValidation` on the configured
remote VAs in a random order.
Resolves #4087
We're only using the simplified HTTP-01 code from `va/http.go` now 🎉 The old unit tests that still seem relevant are left in place in `va/va_test.go` instead of being moved to `va/http_test.go` to signal that they're a bit crufty and could probably use a separate cleanup. For now I'm hesitant to remove test coverage so I updated them in-place without moving them to a new home.
Resolves https://github.com/letsencrypt/boulder/issues/4089
When this test fails, it logs the fact that it got the wrong type of
ProblemDetail, but not what the actual ProblemDetail was. Fixing that
will make it easier to track down intermittent failures.
Go 1.11+ updated the `sql.DBStats` struct with new fields that are of
interest to us. This PR routes these stats to Prometheus by replacing
the existing autoprom stats code with new first-class Prometheus
metrics. Resolves https://github.com/letsencrypt/boulder/issues/4095
The `max_db_connections` stat from the SA is removed because the Go 1.11+
`sql.DBStats.MaxOpenConnections` field will give us a better view of
the same information.
The autoprom "reused_authz" stat that was being incremented in
`SA.GetPendingAuthorization` was also removed. It wasn't doing what it
says it was (counting reused authorizations) and was instead counting
the number of times `GetPendingAuthorization` returned an authz.
This changeset implements the logic required for the WFE to retrieve v2 authorizations and their associated challenges while still maintaining the logic to retrieve old authorizations/challenges. Challenge IDs for v2 authorizations are obfuscated using a pretty simple scheme in order to prevent hard coding of indexes. A `V2` field is added to the `core.Authorization` object and populated using the existing field of the same name from the protobuf for convenience. v2 authorizations and challenges use a `v2` prefix in all their URLs in order to easily differentiate between v1 and v2 URLs (e.g. `/acme/authz/v2/asd` and `/acme/challenge/v2/asd/123`). Once v1 authorizations cease to exist this prefix can be safely removed. As v2 authorizations use int IDs, this change switches from string IDs to int IDs; this mainly affects tests.
Integration tests are put off for #4079 as they really need #4077 and #4078 to be properly effective.
Fixes #4041.
Update the `github.com/weppos/publicsuffix-go` dependency to 26bf87f,
the tip of master at the time of writing.
Unit tests are confirmed to pass:
```
~/go/src/github.com/weppos/publicsuffix-go$ git log --pretty=format:'%h' -n 1
26bf87f
~/go/src/github.com/weppos/publicsuffix-go$ go test ./...
? github.com/weppos/publicsuffix-go/cmd/load [no test files]
ok github.com/weppos/publicsuffix-go/net/publicsuffix 0.005s
ok github.com/weppos/publicsuffix-go/publicsuffix 0.005s
```
I tried dropping the RA->VA timeout to make the
`test_http_challenge_timeout` integration test faster. It seems to flake
in CI so I'm restoring the original 20s timeout. This makes
`test_http_challenge_timeout` slower but c'est la vie.
If an order for a given set of names will fail finalization because of certificate rate limits (certs per domain, certs per FQDN set), there isn't any point in allowing an order for those names to be created. We can stop a lot of requests earlier by enforcing the cert rate limits at new order time as well as finalization time. A new RA `EarlyOrderRateLimit` feature flag controls whether this is done or not.
Resolves #3975
If there is a rate limit override for both the key being examined and the regID in use, then the ratelimit `GetThreshold` function should return the larger of the two.
Resolves #4072
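
The "larger of the two" rule can be sketched as follows. The `policy` type, its field names, and the behaviour when only one override exists are assumptions for illustration; only the both-overrides case is taken from the description above.

```go
package main

import "fmt"

// policy is a simplified, hypothetical stand-in for a rate limit policy.
type policy struct {
	threshold    int              // default limit
	keyOverrides map[string]int   // per-key overrides (e.g. per domain)
	regOverrides map[int64]int    // per-registration-ID overrides
}

// getThreshold returns the limit that applies to key and regID. When both
// a key override and a regID override exist, the larger one wins.
func (p policy) getThreshold(key string, regID int64) int {
	keyLimit, hasKey := p.keyOverrides[key]
	regLimit, hasReg := p.regOverrides[regID]
	switch {
	case hasKey && hasReg:
		if keyLimit > regLimit {
			return keyLimit
		}
		return regLimit
	case hasKey:
		return keyLimit
	case hasReg:
		return regLimit
	default:
		return p.threshold
	}
}

func main() {
	p := policy{
		threshold:    50,
		keyOverrides: map[string]int{"example.com": 100},
		regOverrides: map[int64]int{42: 300},
	}
	fmt.Println(p.getThreshold("example.com", 42)) // both overrides apply
	fmt.Println(p.getThreshold("example.com", 7))  // key override only
	fmt.Println(p.getThreshold("example.org", 7))  // no override
}
```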
We've been using the newer "ready" order status for longer than the lifetime of any previously "pending" orders. I believe this means we can drop the legacy allowance for finalizing pending orders and enforce that finalization only occurs for "ready" orders, without any feature flags. This is implemented in [c85d4b0](c85d4b097b).
There is a new error type added in the draft spec (`orderNotReady`) that should be returned to clients that finalize an order in a state other than "ready". This is implemented in [6008202](6008202357).
Resolves #4073
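
A minimal sketch of the status check, assuming a simplified problem type: `problem` and `checkReadyForFinalize` are hypothetical names, and the detail text is invented. The error type URN follows the ACME naming for `orderNotReady`.

```go
package main

import "fmt"

// problem is a pared-down stand-in for an ACME problem document.
type problem struct {
	Type   string
	Detail string
}

func (p *problem) Error() string { return p.Type + " :: " + p.Detail }

// checkReadyForFinalize rejects finalization for any order whose status is
// not "ready", using the orderNotReady error type.
func checkReadyForFinalize(status string) error {
	if status != "ready" {
		return &problem{
			Type:   "urn:ietf:params:acme:error:orderNotReady",
			Detail: fmt.Sprintf("order status is %q, not \"ready\"", status),
		}
	}
	return nil
}

func main() {
	for _, status := range []string{"pending", "ready", "valid"} {
		fmt.Printf("%-8s -> %v\n", status, checkReadyForFinalize(status))
	}
}
```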
The vendored copy of `github.com/zmap/zlint` is updated to
f38bd22 - the tip of master at the time of writing.
This pulls in a new deprecated gTLD: `.active`.
Unit tests are confirmed to pass:
```
~/go/src/github.com/zmap/zlint$ git log --pretty=format:'%h' -n 1
f38bd22
~/go/src/github.com/zmap/zlint$ go test ./...
ok github.com/zmap/zlint 0.220s
? github.com/zmap/zlint/cmd/zlint [no test files]
? github.com/zmap/zlint/cmd/zlint-gtld-update [no test files]
ok github.com/zmap/zlint/lints 0.270s
ok github.com/zmap/zlint/util 0.015s
```
The existing (but undeployed) `AllowRenewalFirstRL` feature flag is used to gate whether the SA `CountCertificatesByNames` and `CountCertificatesByExactNames` RPCs will exclude renewals from the returned counts using the `issuedNames` table's `renewal` field.
The previous implementation of `AllowRenewalFirstRL` is deleted. It wasn't performant in specific corner cases.
There's no new integration test in this branch because the existing `test_renewal_exemption` integration test from the first `AllowRenewalFirstRL` implementation provides the required coverage for `config-next` runs.
Resolves #4060
Resolves #4006
The `renewal` field of the `issuedNames` table is indexed
to allow more efficient processing of rate limit decisions
w.r.t. renewals for the certificates per domain rate limit (see
https://github.com/letsencrypt/boulder/pull/3178). Before we can start
using this field in rate limit calculations we need to populate it
appropriately. A new `SetIssuedNamesRenewalBit` feature flag for the SA
controls whether we do so or not.
Resolves https://github.com/letsencrypt/boulder/issues/4008
The vendored copy of `github.com/zmap/zlint` is updated to b2aa746 - the
tip of master at the time of writing.
This pulls in two deprecated gTLDs (`.zippo`, `.epost`).
```
~/go/src/github.com/zmap/zlint$ git log --pretty=format:'%h' -n 1
b2aa746
~/go/src/github.com/zmap/zlint$ go test ./...
ok github.com/zmap/zlint 0.212s
? github.com/zmap/zlint/cmd/zlint [no test files]
? github.com/zmap/zlint/cmd/zlint-gtld-update [no test files]
ok github.com/zmap/zlint/lints 0.210s
ok github.com/zmap/zlint/util 0.006s
```
The vendored copy of `github.com/zmap/zlint` is updated to fbc0b69 - the
tip of master at the time of writing.
This pulls in a deprecated gTLD (`.blanco`).
Unit tests are confirmed to pass:
```
~/go/src/github.com/zmap/zlint$ git log --pretty=format:'%h' -n 1
fbc0b69
~/go/src/github.com/zmap/zlint$ go test ./...
ok github.com/zmap/zlint 0.215s
? github.com/zmap/zlint/cmd/zlint [no test files]
? github.com/zmap/zlint/cmd/zlint-gtld-update [no test files]
ok github.com/zmap/zlint/lints 0.270s
ok github.com/zmap/zlint/util 0.007s
```
Implements a feature that enables immediate revocation instead of marking a certificate revoked and waiting for the OCSP-Updater to generate the OCSP response. This means that as soon as the request returns from the WFE the revoked OCSP response should be available to the user. This feature requires that the RA be configured to use the standalone Akamai purger service.
Fixes #4031.