These feature flags are no longer referenced in any test, staging, or
production configuration. They were removed in:
- StoreRevokerInfo: IN-8546
- ROCSPStage6 and ROCSPStage7: IN-8886
- CAAValidationMethods and CAAAccountURI: IN-9301
Enable SA gRPC health checks in Consul ahead of further changes for
#6878. The `Check` method of the SA's grpc.health.v1.Health service
must return `SERVING` before the `sa` service will be advertised in
Consul DNS. Consul will continue to poll this service
every 5 seconds.
- Add `bconsul` docker service to boulder `bluenet` and `rednet`
- Add TLS credentials for `consul.boulder`:
```shell
$ openssl x509 -in consul.boulder/cert.pem -text | grep DNS
DNS:consul.boulder
```
- Update `test/grpc-creds/generate.sh` to add `consul.boulder`
- Update test SA configs to allow `consul.boulder` access to
  `grpc.health.v1.Health`
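For reference, a minimal sketch of how a Go service exposes this health
service using the stock grpc-go health package (illustrative, not the SA's
exact wiring):
```go
package main

import (
	"google.golang.org/grpc"
	"google.golang.org/grpc/health"
	healthpb "google.golang.org/grpc/health/grpc_health_v1"
)

// registerHealth attaches the stock gRPC health service to srv and marks the
// overall server as SERVING, which is what Consul's gRPC health check polls.
func registerHealth(srv *grpc.Server) *health.Server {
	hs := health.NewServer()
	healthpb.RegisterHealthServer(srv, hs)
	// The empty service name covers the server as a whole.
	hs.SetServingStatus("", healthpb.HealthCheckResponse_SERVING)
	return hs
}
```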
Part of #6878
Deprecate the ROCSPStage6 feature flag. Remove all references to the
`ocspResponse` column from the SA, both when reading from and when
writing to the `certificateStatus` table. This makes it safe to fully
remove that column from the database.
IN-8731 enabled this flag in all environments, so it is safe to
deprecate.
Part of #6285
- Consistently format existing test JSON config files
- Add a small Python script which loads and dumps JSON files
- Add a JSON lint test to CI
---------
Co-authored-by: Aaron Gable <aaron@aarongable.com>
In the WFE, ocsp-responder, and crl-updater, switch from using
StorageAuthorityClients to StorageAuthorityReadOnlyClients. This ensures
that these services cannot call methods which write to our database.
Fixes #6454
These flags are set in both staging and prod. Deprecate them, make
all code gated behind them the only path, and delete code (multi_source)
which was only accessible when these flags were not set.
Part of #6285
- Add a new gRPC client config field which overrides the dNSName checked in the
  certificate presented by the gRPC server (see the sketch after this list)
- Revert all test gRPC credentials to `<service>.boulder`
- Revert all ClientNames in gRPC server configs to `<service>.boulder`
- Set all gRPC clients in `test/config` to use `serverAddress` + `hostOverride`
- Set all gRPC clients in `test/config-next` to use `srvLookup` + `hostOverride`
- Rename incorrect SRV record for `ca` with port `9096` to `ca-ocsp`
- Rename incorrect SRV record for `ca` with port `9106` to `ca-crl`
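As a sketch of how such a dNSName override typically works at the TLS layer
in Go (the names and function here are hypothetical, not Boulder's actual
config plumbing):
```go
package main

import (
	"crypto/tls"
	"crypto/x509"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials"
)

// dialWithHostOverride dials addr (for example, an address resolved via a
// Consul SRV lookup) but verifies the server certificate against
// hostOverride (for example, "sa.boulder") rather than the dialed address.
func dialWithHostOverride(addr, hostOverride string, roots *x509.CertPool, cert tls.Certificate) (*grpc.ClientConn, error) {
	creds := credentials.NewTLS(&tls.Config{
		ServerName:   hostOverride,
		RootCAs:      roots,
		Certificates: []tls.Certificate{cert},
	})
	return grpc.Dial(addr, grpc.WithTransportCredentials(creds))
}
```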
Resolves #6424
- Add a dedicated Consul container
- Replace `sd-test-srv` with Consul
- Add documentation for configuring Consul
- Re-issue all gRPC credentials for `<service-name>.service.consul`
Part of #6111
Honeycomb was emitting logs directly to stderr like this:
```
WARN: Missing API Key.
WARN: Dataset is ignored in favor of service name. Data will be sent to service name: boulder
```
Fix this by providing a fake API key and replacing "dataset" with "serviceName" in configs. Also add missing Honeycomb configs for crl-updater.
For the stdout-only logger, include checksums and escape newlines.
Create a new service named crl-updater. It is responsible for
maintaining the full set of CRLs we issue: one "full and complete" CRL
for each currently-active Issuer, split into a number of "shards" which
are essentially CRLs with arbitrary scopes.
The crl-updater is modeled after the ocsp-updater: it is a long-running
standalone service that wakes up periodically, does a large amount of
work in parallel, and then sleeps. The period at which it wakes to do
work is configurable. Unlike the ocsp-updater, it does all of its work
every time it wakes, so we expect to set the update frequency at 6-24
hours.
Maintaining CRL scopes is done statelessly. Every certificate belongs to
a specific "bucket", given its notAfter date. This mapping is generally
unchanging over the life of the certificate, so revoked certificate
entries will not be moving between shards upon every update. The only
exception is if we change the number of shards, in which case all of the
bucket boundaries will be recomputed. For more details, see the comment
on `getShardBoundaries`.
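As a rough, hypothetical sketch of the idea (the real `getShardBoundaries`
logic differs in detail), the shard a certificate lands in depends only on
its notAfter, the shard width, and the shard count:
```go
package main

import "time"

// shardFor illustrates stateless CRL sharding: time is divided into
// fixed-width windows, and a certificate's shard is derived solely from
// which window its notAfter falls into, modulo the shard count. Changing
// numShards recomputes every assignment, matching the caveat above.
func shardFor(notAfter time.Time, numShards int, shardWidth time.Duration) int {
	window := notAfter.UnixNano() / int64(shardWidth)
	return int(window % int64(numShards))
}
```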
It uses the new SA.GetRevokedCerts method to collect all of the revoked
certificates whose notAfter timestamps fall within the boundaries of
each shard's time-bucket. It uses the new CA.GenerateCRL method to sign
the CRLs. In the future, it will send signed CRLs to the crl-storer to
be persisted outside our infrastructure.
Fixes #6163
This copies over settings from config-next that are now deployed in prod.
Also, I updated a comment in sd-test-srv to more accurately describe how SRV records work.
Add Honeycomb tracing to all Boulder components which act as
HTTP servers, gRPC servers, or gRPC clients. Add many values
which we currently emit to logs to the trace spans. Add a way to
configure the Honeycomb integration in our config files, and by
default configure all of our tests to "mute" (send nothing).
Followup changes will refine the configuration, attempt to reduce
the new dependency load, and introduce better sampling.
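Roughly, the initialization looks like the following sketch; the field names
follow beeline-go's Config and may not match Boulder's own config wrapper
exactly:
```go
package main

import beeline "github.com/honeycombio/beeline-go"

// initTracing sketches how the Honeycomb beeline is initialized. Tests are
// configured to mute the integration so nothing is sent.
func initTracing(writeKey, serviceName string, mute bool) {
	beeline.Init(beeline.Config{
		WriteKey:    writeKey,    // tests use a placeholder key
		ServiceName: serviceName, // e.g. "boulder"
		Mute:        mute,        // true in tests: send nothing
	})
}
```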
Part of https://github.com/letsencrypt/dev-misc-tickets/issues/218
In #5235 we replaced MaxDBConns with MaxOpenConns.
One week ago MaxDBConns was removed from all dev, staging, and
production configurations. This change completes the removal of
MaxDBConns from all components and test/config.
Fixes #5249
This adds a new tool, `health-checker`, which is a client of the new
Health Checker Service that has been integrated into all of our
boulder components. This tool takes an address, a timeout, and a
config file. It then attempts to connect to a gRPC Health Service at
the given address, retrying until it hits its timeout, using credentials
specified by the config file.
This is then wrapped by a new function `waithealth` in our Python
helpers, which serves much the same purpose as `waitport`, but
specifically for services which surface a gRPC Health Service.
This in turn requires slight modifications to `startservers`, namely
specifying the address and port on which each service starts its
gRPC listener.
Finally, this change also introduces new credentials for this
health-checker, and adds those credentials as a valid client to
all services' json configs. A similar change would have to be made
to our production configs if we were to establish a long-lived
health checker/prober in prod.
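A minimal sketch of the core of such a check in Go, using the standard
grpc.health.v1 client (illustrative, not the actual health-checker tool):
```go
package main

import (
	"context"
	"time"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials"
	healthpb "google.golang.org/grpc/health/grpc_health_v1"
)

// waitHealthy polls the standard gRPC health service at addr until it
// reports SERVING or the timeout expires.
func waitHealthy(addr string, creds credentials.TransportCredentials, timeout time.Duration) error {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	conn, err := grpc.DialContext(ctx, addr, grpc.WithTransportCredentials(creds))
	if err != nil {
		return err
	}
	defer conn.Close()

	client := healthpb.NewHealthClient(conn)
	for {
		resp, err := client.Check(ctx, &healthpb.HealthCheckRequest{})
		if err == nil && resp.Status == healthpb.HealthCheckResponse_SERVING {
			return nil
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-time.After(time.Second):
			// not serving yet; retry
		}
	}
}
```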
Fixes #5074
The StoreKeyHashes feature flag controls whether rows are added to the
keyHashToSerial table. This feature is now enabled everywhere, so the
flag-protected code can be turned on unconditionally and the flag
removed from configs.
Related to #4895
This copies over a number of features flags and other settings from
test/config-next that have been applied in prod.
Also, remove the config-next gate on various tests.
For now this mainly provides an example config and confirms that
log-validator can start up and shut down cleanly, as well as providing a
stat indicating how many log lines it has handled.
This introduces a syslog config to the boulder-tools image that will write
logs to /var/log/program.log. It also tweaks the various .json config
files so they have a non-default syslogLevel, to ensure they actually
write something for log-validator to verify.
This also removes some awkward dancing we did in integration_test.py to
run setup_twenty_days_ago under the opposite config of whatever we were
about to run tests under.
Reverts most of #4288 and #4290.
To make this work, I changed the twenty_days_ago setup to use
`config-next` when the main test phase is running `config`. That, in
turn, made the recheck_caa test fail, so I added a tweak to that.
I also moved the authzv2 migrations into `db`. Without that change,
the integration test would fail during the twenty_days_ago setup because
Boulder would attempt to create authzv2 objects but the table wouldn't
exist yet.
For authzv1, this actually executes a SQL DELETE for the unused challenges
when an authorization is updated upon validation.
For authzv2, this doesn't perform a delete, but changes the authorizations that
are returned so they don't include unused challenges.
In order to test the flag for both authz storage models, I set the feature flag in
both config/ and config-next/.
Fixes #4352
Things removed:
* features.EmbedSCTs (and all the associated RA/CA/ocsp-updater code etc)
* ca.enablePrecertificateFlow (and all the associated RA/CA code)
* sa.AddSCTReceipt and sa.GetSCTReceipt RPCs
* publisher.SubmitToCT and publisher.SubmitToSingleCT RPCs
Fixes #3755.
Removes usage of the `EnforceChallengeDisable` feature. The feature itself is not removed, as it is still configured in staging/production; once that is fixed I'll submit another PR removing the actual flag.
This keeps the behavior that authorizations retrieved from the SA have their challenges populated, since that seems to make the most sense. It also retains TLS re-validation.
Fixes #3441.
- Remove spinner from test.js. It made Travis logs hard to read.
- Listen on all interfaces for debugAddr. This makes it possible to check
Prometheus metrics for instances running in a Docker container.
- Standardize DNS timeouts on 1s and 3 retries across all configs. This ensures
DNS completes within the relevant RPC timeouts.
- Remove RA service queue from VA, since VA no longer uses the callback to RA on
completing a challenge.