979 Commits

Author SHA1 Message Date
Daniel McCarney 75dcac2272
deps: update github.com/zmap/zlint to latest. (#4375)
Notably this brings in:
* A mild perf. boost from an updated transitive zcrypto dep and a reworked util func.
* A new KeyUsage lint for ECDSA keys.
* Updated gTLD data.
* A required `LintStatus` deserialization fix that will unblock a CFSSL update.

The `TestIgnoredLint` unit test is updated to no longer expect a warning from the
`w_serial_number_low_entropy` lint. This lint was removed in the upstream project.
2019-07-31 13:10:44 -04:00
Daniel McCarney bb005e1c79
integration: add test for boulder-janitor. (#4364) 2019-07-29 16:13:10 -04:00
Jacob Hoffman-Andrews ba5a5a5ac9 cmd: Log less from gRPC, no INFO level. (#4367)
The gRPC INFO log lines clutter up integration test output, and we've never
had a use for them in production (they are mostly about details of
connection status).
2019-07-26 10:02:34 -04:00
Daniel McCarney 9e896325f7
boulder-janitor: add initial daemon for tidying certificate resources. (#4354)
A new `boulder-janitor` command is added that provides a long-running
daemon that cleans up rows associated with expired certificate
resources. At present this is rows from the following tables:

* certificates
* certificateStatus
* certificatesPerName

Adding cleanup of tables associated with Order resources is the next step.

Three prometheus stats are exported:

* janitor_deletions - CounterVec for the number of deletions by table the 
  boulder-janitor has performed.
* janitor_workbatch - GaugeVec for the number of items of work by table
  the boulder-janitor queued for deletion.
* janitor_errors - CounterVec for the number of errors by table and error
  type the boulder-janitor has experienced.
2019-07-24 15:09:04 -04:00
Jacob Hoffman-Andrews d077d3346e wfe/wfe2: remove AllowAuthzDeactivation flag. (#4345)
Fixes #4339
2019-07-17 16:30:27 -04:00
Jacob Hoffman-Andrews a4fc143a54 wfe/wfe2: clean up AcceptRevocationReason flag. (#4342)
Fixes #4340
2019-07-17 10:33:47 -04:00
Roland Bracewell Shoemaker 3ea77270e3
Use primary key as cursor in cert-checker rather than serial (#4316)
`cert-checker` relies on undefined MySQL behavior that only sometimes holds, so we sometimes select fewer certificates than we expect. Instead of adding an explicit ORDER BY we switch to cursoring on the primary key, which also makes much more efficient use of indexes.
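
A minimal sketch of cursoring on a primary key, using an in-memory slice in place of the database. The `row` type, `selectAfter`, and `collectAll` are hypothetical names for illustration and are not taken from `cert-checker`; `selectAfter` only mimics what a query of the form `WHERE id > ? ORDER BY id LIMIT ?` would return.

```go
package main

import (
	"fmt"
	"sort"
)

// row is a hypothetical stand-in for a certificates table row.
type row struct {
	ID     int64
	Serial string
}

// selectAfter mimics `WHERE id > cursor ORDER BY id LIMIT limit`
// against an in-memory table.
func selectAfter(table []row, cursor int64, limit int) []row {
	sorted := append([]row(nil), table...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].ID < sorted[j].ID })
	var out []row
	for _, r := range sorted {
		if r.ID > cursor && len(out) < limit {
			out = append(out, r)
		}
	}
	return out
}

// collectAll pages through the table in batches, moving the cursor to the
// last primary key seen so that no row is skipped or visited twice.
func collectAll(table []row, batch int) []string {
	var serials []string
	cursor := int64(0)
	for {
		page := selectAfter(table, cursor, batch)
		if len(page) == 0 {
			return serials
		}
		for _, r := range page {
			serials = append(serials, r.Serial)
		}
		cursor = page[len(page)-1].ID
	}
}

func main() {
	table := []row{{3, "cc"}, {1, "aa"}, {2, "bb"}, {5, "ee"}, {4, "dd"}}
	fmt.Println(collectAll(table, 2))
}
```

Because each batch starts strictly after the last ID seen, the result does not depend on whatever order the storage engine happens to return rows in.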

Fixes #4315.
2019-07-03 12:05:48 -07:00
Daniel McCarney 8a94ce053f
cert-checker: treat info/warning lint results as errs. (#4314) 2019-07-01 12:50:38 -04:00
Roland Bracewell Shoemaker af41bea99a Switch to more efficient multi nonce-service design (#4308)
Basically a complete re-write/re-design of the forwarding concept introduced in
#4297 (sorry for the rapid churn here). Instead of nonce-services blindly
forwarding nonces around to each other in an attempt to find out who issued the
nonce we add an identifying prefix to each nonce generated by a service. The
WFEs then use this prefix to decide which nonce-service to ask to validate the
nonce.

This requires a slightly more complicated configuration at the WFE/2 end, but
overall I think it ends up being a much cleaner, more understandable
implementation that is easier to reason about. When configuring the WFE you need
to provide two forms of gRPC config:

* one gRPC config for retrieving nonces, this should be a DNS name that
resolves to all available nonce-services (or at least the ones you want to
retrieve nonces from locally, in a two DC setup you might only configure the
nonce-services that are in the same DC as the WFE instance). This allows
getting a nonce from any of the configured services and is load-balanced
transparently at the gRPC layer. 
* a map of nonce prefixes to gRPC configs, this maps each individual
nonce-service to its prefix and allows the WFE instances to figure out which
nonce-service to ask to validate a nonce it has received (in a two DC setup
you'd want to configure this with all the nonce-services across both DCs so
that you can validate a nonce that was generated by a nonce-service in another
DC).
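
The prefix lookup described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from this change: the fixed prefix length, the `validator` interface, the `fakeService` type, and the example prefixes are all hypothetical, and a real WFE would hold gRPC clients rather than in-memory fakes.

```go
package main

import (
	"errors"
	"fmt"
)

// prefixLen is an assumed fixed prefix length for this sketch.
const prefixLen = 4

// validator is a hypothetical stand-in for a per-service gRPC client.
type validator interface {
	Redeem(nonce string) bool
}

// fakeService remembers the nonces it issued and redeems each one once.
type fakeService struct {
	issued map[string]bool
}

func (s *fakeService) Redeem(nonce string) bool {
	if s.issued[nonce] {
		delete(s.issued, nonce)
		return true
	}
	return false
}

var errUnknownPrefix = errors.New("no nonce-service configured for prefix")

// redeem picks the nonce-service by the nonce's prefix and asks only that
// service to validate it, instead of forwarding between services.
func redeem(byPrefix map[string]validator, nonce string) (bool, error) {
	if len(nonce) <= prefixLen {
		return false, errors.New("nonce too short")
	}
	svc, ok := byPrefix[nonce[:prefixLen]]
	if !ok {
		return false, errUnknownPrefix
	}
	return svc.Redeem(nonce), nil
}

func main() {
	dc1 := &fakeService{issued: map[string]bool{"dc1aXYZ": true}}
	dc2 := &fakeService{issued: map[string]bool{"dc2aQRS": true}}
	byPrefix := map[string]validator{"dc1a": dc1, "dc2a": dc2}

	ok, err := redeem(byPrefix, "dc2aQRS")
	fmt.Println(ok, err)
	ok, err = redeem(byPrefix, "dc2aQRS") // a replayed nonce is rejected
	fmt.Println(ok, err)
}
```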

This balancing is implemented in the integration tests.

Given the current remote nonce code hasn't been deployed anywhere yet this
makes a number of hard breaking changes to both the existing nonce-service
code, and the forwarding code.

Fixes #4303.
2019-06-28 12:58:46 -04:00
Roland Bracewell Shoemaker 66f4a48b1b nonce-service: switch to proto3 (#4304) 2019-06-27 10:07:17 -04:00
Roland Bracewell Shoemaker 844ae26b65
Allow forwarding of nonce-service Redeem RPCs from one service… (#4297)
Fixes #4295.
2019-06-26 13:04:31 -07:00
Roland Bracewell Shoemaker 352899ba2f Remove RevokeAuthorizationsByDomain/2 functionality (#4302)
* Remove RevokeAuthorizationsByDomain/2 functionality
* Remove old integration test
2019-06-26 15:48:18 -04:00
Roland Bracewell Shoemaker acc44498d1 RA: Make RevokeAtRA feature standard behavior (#4268)
Now that it is live in production and is working as intended we can remove
the old ocsp-updater functionality entirely.

Fixes #4048.
2019-06-20 14:32:53 -04:00
Daniel McCarney 8d3a246adb
cert-checker: allow ignoring lints by name. (#4272)
This updates the `cert-checker` utility configuration with a new allow list of
ignored lints so we can exclude known false-positives/accepted info results by
name instead of result level. To start only the `n_subject_common_name_included`
lint is excluded in `test/config-next/cert-checker.json`. Once this lands we can
treat info/warning lint results as errors as a follow-up to not break
deployability guarantees.
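
A rough sketch of filtering lint findings by name rather than by result level. The `lintResult` type, the status labels, and the `w_example_other_lint` name are made up for illustration; only `n_subject_common_name_included` comes from the text above, and the real `cert-checker` configuration may be shaped differently.

```go
package main

import "fmt"

// lintResult is a hypothetical, simplified lint finding.
type lintResult struct {
	Name   string
	Status string // e.g. "info", "warn", "error" (labels assumed for this sketch)
}

// filterIgnored drops findings whose lint name is on the ignore list,
// regardless of their result level.
func filterIgnored(results []lintResult, ignored map[string]bool) []lintResult {
	var kept []lintResult
	for _, r := range results {
		if !ignored[r.Name] {
			kept = append(kept, r)
		}
	}
	return kept
}

func main() {
	ignored := map[string]bool{"n_subject_common_name_included": true}
	results := []lintResult{
		{Name: "n_subject_common_name_included", Status: "info"},
		{Name: "w_example_other_lint", Status: "warn"}, // made-up lint name
	}
	for _, r := range filterIgnored(results, ignored) {
		fmt.Printf("%s (%s)\n", r.Name, r.Status)
	}
}
```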

Resolves https://github.com/letsencrypt/boulder/issues/4271
2019-06-20 13:09:10 -04:00
Roland Bracewell Shoemaker 098a761c02 ocsp-updater: Remove integrated akamai purger (#4258)
This is now an external service.

Also bumps up the deadline in the integration test helper which checks for
purging because using the remote service from the ocsp-updater takes a little
longer. Once we remove ocsp-updater revocation support that can probably be
cranked back down to a more reasonable timeframe.
2019-06-12 09:36:53 -04:00
Roland Bracewell Shoemaker 3532dce246 Excise grpc maxConcurrentStreams configuration (#4257) 2019-06-12 09:35:24 -04:00
Jacob Hoffman-Andrews 65086c6976 ocsp-updater: Remove stale TODO (#4253)
The query referenced in the comment has already been updated to use the
isExpired field.
2019-06-06 15:02:22 -04:00
Roland Bracewell Shoemaker 4ca01b5de3
Implement standalone nonce service (#4228)
Fixes #3976.
2019-06-05 10:41:19 -07:00
Daniel McCarney 7dd176e9a4 Implement suberrors for policy blocked names. (#4234)
When validating a CSR's identifiers, or a new order's identifiers there may be more than one identifier that is blocked by policy. We should return an error that has suberrors identifying each bad identifier individually in this case.
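
One way to picture an error that carries a suberror per blocked identifier is sketched below. The `policyError` and `subProblem` types and the `checkNames` helper are hypothetical and do not reflect Boulder's actual `problems` types; the blocklist is a plain map for the sake of the example.

```go
package main

import (
	"fmt"
	"strings"
)

// subProblem describes one rejected identifier (hypothetical type).
type subProblem struct {
	Identifier string
	Detail     string
}

// policyError is a top-level error that carries one subProblem per bad name.
type policyError struct {
	Detail      string
	SubProblems []subProblem
}

func (e *policyError) Error() string {
	parts := make([]string, 0, len(e.SubProblems))
	for _, sp := range e.SubProblems {
		parts = append(parts, sp.Identifier+": "+sp.Detail)
	}
	return e.Detail + " (" + strings.Join(parts, "; ") + ")"
}

// checkNames checks every name instead of stopping at the first blocked one,
// so the caller learns about all bad identifiers at once.
func checkNames(names []string, blocked map[string]bool) error {
	var subs []subProblem
	for _, n := range names {
		if blocked[strings.ToLower(n)] {
			subs = append(subs, subProblem{Identifier: n, Detail: "blocked by policy"})
		}
	}
	if len(subs) == 0 {
		return nil
	}
	return &policyError{
		Detail:      fmt.Sprintf("%d identifier(s) blocked by policy", len(subs)),
		SubProblems: subs,
	}
}

func main() {
	blocked := map[string]bool{"a.example": true, "c.example": true}
	err := checkNames([]string{"a.example", "b.example", "c.example"}, blocked)
	fmt.Println(err)
}
```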

Updates https://github.com/letsencrypt/boulder/issues/4193
Resolves https://github.com/letsencrypt/boulder/issues/3727
2019-05-31 15:00:17 -07:00
Roland Bracewell Shoemaker 11d16df3a6
Add authz2 expired-authz-purger tool (#4226)
Fixes #4188.
2019-05-30 14:01:01 -07:00
Roland Bracewell Shoemaker 6f93942a04 Consistently use stdlib context package (#4229) 2019-05-28 14:36:16 -04:00
Daniel McCarney ea9871de1e core: split identifier types into separate package. (#4225)
This will allow implementing sub-problems without creating a cyclic
dependency between `core` and `problems`.

The `identifier` package is somewhat small/single-purpose and in the
future we may want to move more "ACME" bits beyond the `identifier`
types into a dedicated package outside of `core`.
2019-05-23 13:24:41 -07:00
Daniel McCarney e627f58f97
publisher: remove HTTP GET log probing. (#4223)
We added this diagnostic probing while debugging an issue that has
since been resolved.
2019-05-23 12:42:26 -04:00
Daniel McCarney 443c949180
tidy: cleanup JSON hostname policy support. (#4214)
We transitioned this data to YAML to have support for comments and can
remove the legacy JSON support/test data.
2019-05-14 17:06:36 -04:00
Jacob Hoffman-Andrews 0759d2d440 cmd: Split out config structs (#4200)
This follows up on some refactoring we had done previously but not
completed. This removes various binary-specific config structs from the
common cmd package, and moves them into their appropriate packages. In
the case of CT configs, they had to be moved into their own package to
avoid a dependency loop between RA and ctpolicy.
2019-05-06 11:11:08 -04:00
Daniel McCarney a3d35f51ff
admin-revoker: use authz2 SA revocation RPC. (#4182)
The `RevokeAuthorizationsByDomain` SA RPC is deprecated and `RevokeAuthorizationsByDomain2`
should be used in its place. Which RPC to use is controlled by the `NewAuthorizationSchema` feature
flag. When it is true the `admin-revoker` will use the new RPC. 

Resolves https://github.com/letsencrypt/boulder/issues/4178
2019-05-02 14:55:43 -04:00
Jacob Hoffman-Andrews 0c700143bb Clean up README and test configs (#4185)
- docker-rebuild isn't needed now that boulder and bhsm containers run directly off
 the boulder-tools image.
- Remove DNS options from RA config.
- Remove GSB options from VA config.
2019-04-30 13:26:19 -07:00
Daniel McCarney 5be559debb
sa: remove CertStatus.LockCol and SubscriberApproved. (#4175)
Both of these database fields are not being used.
2019-04-23 15:14:48 -04:00
Jacob Hoffman-Andrews 7b49849f87 For challenge deletions, select before deleting. (#4173)
This may work around an issue where deleting by authorization ID causes
a full table scan.
2019-04-23 13:47:40 -04:00
Jacob Hoffman-Andrews ff3129247d Put features.Reset in unit test setup functions. (#4129)
Previously we relied on each instance of `features.Set` to have a
corresponding `defer features.Reset()`. If we forget that, we can wind
up with unexpected behavior where features set in one test case leak
into another test case. This led to the bug in
https://github.com/letsencrypt/boulder/issues/4118 going undetected.
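
The leak and its fix can be illustrated with a toy flag registry. Everything here is a hypothetical stand-in written for this sketch (`setFeatures`, `resetFeatures`, `enabled`, `setup`, and the `ExampleFlag` name); it is not the real `features` package, and the "tests" are plain functions so the example runs on its own.

```go
package main

import "fmt"

// flags is a toy global feature registry (hypothetical stand-in).
var flags = map[string]bool{}

func setFeatures(f map[string]bool) {
	for k, v := range f {
		flags[k] = v
	}
}

func resetFeatures() { flags = map[string]bool{} }

func enabled(name string) bool { return flags[name] }

// setup stands in for a per-test setup helper. Resetting here means a test
// never depends on an earlier test having remembered its deferred reset.
func setup() { resetFeatures() }

// testA sets a flag and deliberately "forgets" to reset it afterwards.
func testA() bool {
	setup()
	setFeatures(map[string]bool{"ExampleFlag": true})
	return enabled("ExampleFlag")
}

// testB expects default flags; it only sees them because setup resets.
func testB() bool {
	setup()
	return enabled("ExampleFlag")
}

func main() {
	fmt.Println("testA sees flag:", testA())
	fmt.Println("testB sees flag:", testB())
}
```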

Fix #4120
2019-04-02 10:14:38 -07:00
Jacob Hoffman-Andrews 677b9b88ad Remove GSB support. (#4115)
This is no longer enabled in prod; cleaning up the code.

https://community.letsencrypt.org/t/let-s-encrypt-no-longer-checking-google-safe-browsing/82168
2019-03-15 10:24:44 -07:00
Jacob Hoffman-Andrews d1e6d0f190 Remove TLS-SNI-01 (#4114)
* Remove the challenge whitelist
* Reduce the signature for ChallengesFor and ChallengeTypeEnabled
* Some unit tests in the VA were changed from testing TLS-SNI to testing the same behavior
  in TLS-ALPN, when that behavior wasn't already tested. For instance timeouts during connect 
  are now tested.

Fixes #4109
2019-03-15 09:05:24 -04:00
Daniel McCarney 0ecdf80709 SA: refactor DB stat collection & collect more stats. (#4096)
Go 1.11+ updated the `sql.DBStats` struct with new fields that are of
interest to us. This PR routes these stats to Prometheus by replacing
the existing autoprom stats code with new first-class Prometheus
metrics. Resolves https://github.com/letsencrypt/boulder/issues/4095

The `max_db_connections` stat from the SA is removed because the Go 1.11+
`sql.DBStats.MaxOpenConnections` field will give us a better view of
the same information.
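
For reference, a small sketch of reading the Go 1.11+ `sql.DBStats` fields and flattening them into named values. The metric names and the `statsToGauges` and `sampleStats` helpers are made up for illustration; the actual change registers Prometheus metrics, which this stdlib-only sketch does not attempt to reproduce.

```go
package main

import (
	"database/sql"
	"fmt"
	"time"
)

// statsToGauges flattens sql.DBStats into name/value pairs.
// The names are hypothetical and chosen only for this sketch.
func statsToGauges(s sql.DBStats) map[string]float64 {
	return map[string]float64{
		"max_open_connections":  float64(s.MaxOpenConnections),
		"open_connections":      float64(s.OpenConnections),
		"in_use":                float64(s.InUse),
		"idle":                  float64(s.Idle),
		"wait_count":            float64(s.WaitCount),
		"wait_duration_seconds": s.WaitDuration.Seconds(),
		"max_idle_closed":       float64(s.MaxIdleClosed),
		"max_lifetime_closed":   float64(s.MaxLifetimeClosed),
	}
}

// sampleStats returns fixed values; in real use the struct would come
// from (*sql.DB).Stats().
func sampleStats() sql.DBStats {
	return sql.DBStats{
		MaxOpenConnections: 100,
		OpenConnections:    7,
		InUse:              5,
		Idle:               2,
		WaitCount:          3,
		WaitDuration:       1500 * time.Millisecond,
	}
}

func main() {
	g := statsToGauges(sampleStats())
	fmt.Println(g["max_open_connections"], g["open_connections"], g["wait_duration_seconds"])
}
```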

The autoprom "reused_authz" stat that was being incremented in
`SA.GetPendingAuthorization` was also removed. It wasn't doing what it
says it was (counting reused authorizations) and was instead counting
the number of times `GetPendingAuthorization` returned an authz.
2019-03-06 17:08:53 -08:00
Roland Bracewell Shoemaker 3e54cea295 Implement direct revocation at RA (#4043)
Implements a feature that enables immediate revocation instead of marking a certificate revoked and waiting for the OCSP-Updater to generate the OCSP response. This means that as soon as the request returns from the WFE the revoked OCSP response should be available to the user. This feature requires that the RA be configured to use the standalone Akamai purger service.

Fixes #4031.
2019-02-14 14:47:42 -05:00
Roland Bracewell Shoemaker 232a5f828f Fix ineffectual assignments (#4052)
* in boulder-ra we connected to the publisher and created a publisher gRPC client twice for no apparent reason
* in the SA we ignored errors from `getChallenges` in `GetAuthorizations` which could result in a nil challenge being returned in an authorization
2019-02-13 15:39:58 -05:00
Jacob Hoffman-Andrews 9fda3fb77d Switch to DSNs (#4044)
* Switch to DSNs

We used to use "mysql+tcp://" URLs but we don't need those anymore,
and there aren't any more of them in prod.

* Fix test.
2019-02-11 10:46:07 -08:00
Roland Bracewell Shoemaker 046955e99c Add a standalone akamai purger service (#4040)
Fixes #4030.
2019-02-05 09:00:31 -08:00
Roland Bracewell Shoemaker 064001203b Continue work on more SMTP errors (#4039)
Instead of just on 401. I pulled the various error codes from a handful of SMTP docs I
could find; they could probably use a second once-over by others, though.
2019-01-28 22:23:25 -08:00
Daniel McCarney d1daeee831 Config: serverAddresses -> serverAddress. (#4035)
The plural `serverAddresses` field in gRPC config has been deprecated for a bit now. We've removed the last usages of it in our staging/prod environments and can clear out the related code. Moving forward we only support a singular `serverAddress` and rely on DNS to direct to multiple instances of a given server.
2019-01-25 10:50:53 -08:00
Daniel McCarney b00c03e65a ocsp-updater: exploit isExpired index for revoked query. (#4032)
Before modifying the `findRevokedCertificatesToUpdate` query to include `NOT
isExpired` the query shows no `possible_keys` in `EXPLAIN` output.

```
MariaDB [boulder_sa_integration]> explain SELECT serial, status, ocspLastUpdated, revokedDate, revokedReason, lastExpirationNagSent, ocspResponse, notAfter, isExpired FROM certificateStatus WHERE status = 'revoked' AND ocspLastUpdated <= revokedDate LIMIT 1000;
+------+-------------+-------------------+------+---------------+------+---------+------+------+-------------+
| id   | select_type | table             | type | possible_keys | key  | key_len | ref  | rows | Extra       |
+------+-------------+-------------------+------+---------------+------+---------+------+------+-------------+
|    1 | SIMPLE      | certificateStatus | ALL  | NULL          | NULL | NULL    | NULL |  208 | Using where |
+------+-------------+-------------------+------+---------------+------+---------+------+------+-------------+
1 row in set (0.01 sec)
```

Afterwards we see `isExpired_ocspLastUpdated_idx` is considered:

```
MariaDB [boulder_sa_integration]> explain SELECT serial, status, ocspLastUpdated, revokedDate, revokedReason, lastExpirationNagSent, ocspResponse, notAfter, isExpired FROM certificateStatus WHERE NOT isExpired AND status = 'revoked' AND ocspLastUpdated <= revokedDate LIMIT 1000;
+------+-------------+-------------------+------+-------------------------------+------+---------+------+------+-------------+
| id   | select_type | table             | type | possible_keys                 | key  | key_len | ref  | rows | Extra       |
+------+-------------+-------------------+------+-------------------------------+------+---------+------+------+-------------+
|    1 | SIMPLE      | certificateStatus | ALL  | isExpired_ocspLastUpdated_idx | NULL | NULL    | NULL |  208 | Using where |
+------+-------------+-------------------+------+-------------------------------+------+---------+------+------+-------------+
1 row in set (0.00 sec)

MariaDB [boulder_sa_integration]>
```
2019-01-24 15:41:32 -08:00
Jacob Hoffman-Andrews 4b9fd1f97e notify-mailer: Support CSV and parameters (#4024)
Fixes #4018 

This rearranges notify-mailer so we can give it CSV input and interpolate fields from that CSV.
It removes the old-style JSON input so we don't have to support two different input styles.

When multiple accounts have the same email address, their recipient data is consolidated under
that address so they only receive a single email. The CSV data can be interpolated using
the `range` operator in Golang templates.

Because we're now operating on the resolved email addresses instead of purely on accounts,
this PR also changes the checkpointing mode. Instead of a numeric start and end, it takes
a pair of strings, and only sends to email addresses between those two strings.
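
A simplified sketch of the consolidation, templating, and checkpointing ideas above. It assumes the CSV already carries an `email` column (the real tool works from account data), and the column names, the `groupByEmail`, `render`, and `inCheckpoint` helpers, and the half-open range semantics are all assumptions made for this example.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"sort"
	"strings"
	"text/template"
)

// recipient holds one CSV row keyed by column name.
type recipient map[string]string

// groupByEmail parses CSV with a header row and consolidates rows that
// share the same address, so each address gets a single message.
func groupByEmail(csvData, emailCol string) (map[string][]recipient, error) {
	rows, err := csv.NewReader(strings.NewReader(csvData)).ReadAll()
	if err != nil {
		return nil, err
	}
	if len(rows) == 0 {
		return nil, fmt.Errorf("empty CSV")
	}
	header := rows[0]
	out := map[string][]recipient{}
	for _, row := range rows[1:] {
		rec := recipient{}
		for i, col := range header {
			rec[col] = row[i]
		}
		out[rec[emailCol]] = append(out[rec[emailCol]], rec)
	}
	return out, nil
}

// body uses `range` to interpolate every consolidated row.
var body = template.Must(template.New("body").Parse(
	"Affected names:\n{{range .}}- {{.domain}}\n{{end}}"))

func render(recs []recipient) (string, error) {
	var sb strings.Builder
	if err := body.Execute(&sb, recs); err != nil {
		return "", err
	}
	return sb.String(), nil
}

// inCheckpoint reports whether addr falls in [start, end); the half-open
// range is an assumption of this sketch.
func inCheckpoint(addr, start, end string) bool {
	return addr >= start && addr < end
}

func main() {
	data := "email,domain\nb@example.com,b.example\na@example.com,a1.example\na@example.com,a2.example\n"
	groups, err := groupByEmail(data, "email")
	if err != nil {
		panic(err)
	}
	addrs := make([]string, 0, len(groups))
	for a := range groups {
		addrs = append(addrs, a)
	}
	sort.Strings(addrs)
	for _, a := range addrs {
		if !inCheckpoint(a, "a", "b") {
			continue
		}
		text, err := render(groups[a])
		if err != nil {
			panic(err)
		}
		fmt.Printf("To: %s\n%s", a, text)
	}
}
```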
2019-01-22 16:07:17 -08:00
Daniel McCarney 1cf44c0546 RA: Exit early if CTLogGroups2 config is invalid. (#4025)
The `boulder-ra` component should fail to start if the `CTLogGroups2` configuration is empty, or if any of the configured log groups have no logs specified. This avoids more ambiguous errors down the road.
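
The startup check might look something like the sketch below. The `logGroup` type and its fields are hypothetical and are not copied from the RA's real config structs; the point is only that an empty list, or a group with no logs, is rejected before the service starts.

```go
package main

import (
	"errors"
	"fmt"
)

// logGroup is a hypothetical, simplified CT log group config entry.
type logGroup struct {
	Name string
	Logs []string
}

// validateLogGroups rejects an empty configuration and any group that has
// no logs, so a misconfiguration surfaces at startup.
func validateLogGroups(groups []logGroup) error {
	if len(groups) == 0 {
		return errors.New("CTLogGroups2 must not be empty")
	}
	for _, g := range groups {
		if len(g.Logs) == 0 {
			return fmt.Errorf("CT log group %q has no logs", g.Name)
		}
	}
	return nil
}

func main() {
	groups := []logGroup{
		{Name: "groupA", Logs: []string{"https://log-a.example/"}},
		{Name: "groupB"}, // misconfigured: no logs
	}
	if err := validateLogGroups(groups); err != nil {
		fmt.Println("refusing to start:", err)
	}
}
```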

This PR also removes the deprecated `CTLogGroups` field from the RA struct. It isn't being used in any configurations.

Based on initial work in https://github.com/letsencrypt/boulder/pull/3990 by @michalmedvecky. Resolves #3941.
2019-01-22 13:14:26 -08:00
Daniel McCarney 1a68cc2225 notify-mailer: warn for bad rcpt, don't exit. (#4022)
Resolves https://github.com/letsencrypt/boulder/issues/4019

I can't find RFC chapter and verse for "401 4.1.3" errors, but [IANA's registry of SMTP enhanced status codes](https://www.iana.org/assignments/smtp-enhanced-status-codes/smtp-enhanced-status-codes.xhtml) does show an entry matching `x.1.3`:
```
X.1.3 | Bad destination mailbox address syntax | 501 | The destination address was syntactically invalid. This can apply to any field in the address. This code is only useful for permanent failures. | [RFC3463] (Standards Track) | G. Vaudreuil | IESG
```
However that entry from IANA says the "associated basic code" is 501, not 401.

Since we wrote this tool to talk to exactly one SMTP server in the world and it definitely is returning "401 4.1.3" in some cases I think it's reasonable to handle as I've done in this PR. Alternative suggestions welcome.
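
As an illustration only, a check for this specific reply could be written against `net/textproto.Error`. The sketch assumes the enhanced status code appears at the start of the reply message; `isBadRecipient` and `smtpErr` are hypothetical helpers and not the code in this PR.

```go
package main

import (
	"errors"
	"fmt"
	"net/textproto"
	"strings"
)

// isBadRecipient reports whether err is an SMTP reply with basic code 401
// and enhanced status code 4.1.3, which is treated as "warn and skip".
func isBadRecipient(err error) bool {
	var tpErr *textproto.Error
	if !errors.As(err, &tpErr) {
		return false
	}
	return tpErr.Code == 401 && strings.HasPrefix(tpErr.Msg, "4.1.3")
}

// smtpErr builds an SMTP-style error for the demo below.
func smtpErr(code int, msg string) error {
	return &textproto.Error{Code: code, Msg: msg}
}

func main() {
	errs := []error{
		smtpErr(401, "4.1.3 Bad recipient address syntax"),
		fmt.Errorf("sending mail: %w", smtpErr(401, "4.1.3 Bad recipient address syntax")),
		smtpErr(550, "5.1.1 User unknown"),
	}
	for _, err := range errs {
		if isBadRecipient(err) {
			fmt.Println("warn and skip:", err)
		} else {
			fmt.Println("fatal:", err)
		}
	}
}
```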
2019-01-18 14:14:30 -08:00
Daniel McCarney ed01d6bc14 notify-mailer: skip invalid contact emails (#4021)
Resolves #4020
2019-01-18 11:47:21 -08:00
Jacob Hoffman-Andrews bb15685a0f
No new certificates tick (#4012)
Since #2633 we generate OCSP at first issuance, so we no longer need 
this loop to check for new certificates that need OCSP status generated.
Since the associated SQL query is slow, we should just turn it off.

Also remove the configuration fields for the MissingSCTTick. The code
for that was already deleted.
2019-01-17 14:43:06 -08:00
Jacob Hoffman-Andrews 281e2546f3
De-duplicate email addresses in notify-mailer. (#4015)
Resolves #4003
2019-01-17 11:34:04 -08:00
Daniel 901e3fc1ed Merge remote-tracking branch 'le/master' into cpu-notify-off-by-one 2019-01-16 09:30:22 -05:00
Daniel 7f626a1c79 Clearer fix 2019-01-16 09:30:13 -05:00
Jacob Hoffman-Andrews cdc01df24f Revert "Increase default MaxIdleConns. (#3164)" (#4007)
This reverts commit 600640294d,
removing the maxIdleDBConns config setting.
2019-01-16 08:41:21 -05:00
Daniel 7d8f55d64c notify-mailer: fix off-by-one in printStatus args 2019-01-15 16:25:37 -05:00