Commit Graph

522 Commits

Author SHA1 Message Date
Samantha Frank 14c0b2c3bb
ratelimits: Check at NewOrder and SpendOnly later (#7669)
- Check `CertificatesPerDomain` at newOrder and spend at Finalize time.
- Check `CertificatesPerAccountPerDomain` at newOrder and spend at
Finalize time.
- Check `CertificatesPerFQDNSet` at newOrder and spend at Finalize time.
- Fix a bug
in`FailedAuthorizationsPerDomainPerAccountSpendOnlyTransaction()` which
results in failed authorizations being spent for the exact FQDN, not the
eTLD+1.
- Remove redundant "max names" check at transaction construction time
- Enable key-value rate limits in the RA
2024-08-15 19:08:17 -04:00
Samantha Frank 6a3e9d725b
ratelimits: Provide verbose user-facing rate limit errors (#7653)
- Instruct callers to call *Decision.Result() to check the result of
rate limit transactions
- Preserve the Transaction within the resulting *Decision
- Generate consistently formatted verbose errors using the metadata
found in the *Decision
- Fix broken key-value rate limits integration test in
TestDuplicateFQDNRateLimit

Fixes #7577
2024-08-12 16:14:15 -04:00
Aaron Gable 7b6935d223
Configure lints separately for each profile (#7636)
Move the two lint-configuration keys, LintConfig and IgnoreLints, from
the top-level CA.Issuance config stanza into each individual
CA.Issuance.CertProfiles stanza. This allows us to have
differently-configured lints for different profiles, to ensure that our
linting regime is as strict as possible.

Without this change, it would be necessary for us to ignore both the
"common name included" and the "no subject key id" lints at the
top-level, when in fact each of those warnings only triggers on one of
our two profiles.

Fixes https://github.com/letsencrypt/boulder/issues/7635
2024-08-01 10:01:46 -07:00
Aaron Gable c6c7617851
Profiles: allow for omission of KU, EKU, and SKID (#7622)
Add three new keys to the CA's ProfileConfig:
- OmitKeyEncipherment causes the keyEncipherment Key Usage to be omitted
from certificates with RSA public keys. We currently include it for
backwards compatibility with TLS 1.1 servers that don't support modern
cipher suites, but this KU is completely useless as of TLS 1.3.
- OmitClientAuth causes the tlsClientAuthentication Extended Key Usage
to be omitted from all certificates. We currently include it to support
any subscribers who may be relying on it, but Root Programs are moving
towards single-purpose hierarchies and its inclusion is being
discouraged.
- OmitSKID causes the Subject Key Identifier extension to be omitted
from all certificates. We currently include this extension because it is
recommended by RFC 5280, but it serves little to no practical purpose
and consumes a large number of bytes, so it is now NOT RECOMMENDED by
the Baseline Requirements.

Make substantive changes to issuer.requestValid and issuer.Prepare to
implement the desired behavior for each of these options. Make a very
slight change to ra.matchesCSR to generally allow for serverAuth-only
EKUs. Improve the unit tests of both the //ca and //issuance packages to
cover the new behavior.

Part of https://github.com/letsencrypt/boulder/issues/7610
2024-07-31 11:08:11 -07:00
Aaron Gable cf8e5aa1b1
Use profile to determine backdate and validity (#7621)
One of our goals with profiles is to allow different profiles to have
different validity periods. While the profiles already had the ability
to enforce different maximum backdates and validities, the CA still had
separate global configuration for what the backdate and validity period
should actually be.

Move the computation of the notBefore and notAfter timestamps into the
issuance package, so that it can be based on the profile's configured
backdate and validity durations. Deprecate the global "backdate" and
"expiry" config fields, as they are no longer used. Finally, add more
validation for the profile's backdate and validity.

Part of https://github.com/letsencrypt/boulder/issues/7610
2024-07-25 13:47:51 -07:00
Samantha Frank 986c78a2b4
WFE: Reject new orders containing paused identifiers (#7599)
Part of #7406
Fixes #7475
2024-07-25 13:46:40 -04:00
Aaron Gable ff851f7107
WFE: Include profile name in returned Order json (#7626)
Integration testing revealed that the WFE was not rendering the profile
name in the Order JSON object. Fix the one spot where it was missed.

Part of https://github.com/letsencrypt/boulder/issues/7332
2024-07-24 14:30:24 -07:00
Aaron Gable 6b484f44ba
Profiles: replace AllowCommonName with OmitCommonName (#7620)
Add a new profile config key named "OmitCommonName" which, if set to
`true`, causes the issuance package to exclude the CN from the resulting
certificate even if the initiating IssuanceRequest specified one.
Deprecate the old "AllowCommonName" config key, so that it no longer has
any effect, rather than causing the issuance package to fully reject
IssuanceRequests containing a CN.

This allows for more graceful variation between profiles, since we know
that excluding the Common Name is always safe.

Part of https://github.com/letsencrypt/boulder/issues/7610
2024-07-24 11:44:26 -07:00
Aaron Gable 48439e4532
Advertise available profiles in directory resource (#7603)
Change the way profiles are configured at the WFE to allow them to be
accompanied by descriptive strings. Augment the construction of the
directory resource's "meta" sub-object to include these profile names
and descriptions.

This config swap is safe, since no Boulder WFE instance is configured
with `CertificateProfileNames` yet.

Fixes https://github.com/letsencrypt/boulder/issues/7602
2024-07-22 15:31:08 -07:00
Aaron Gable 848a9ea696
Deprecate AllowCTPoison and AllowSCTList profile settings (#7611)
These profile variables are set to "true" everywhere, and we have no
intention of ever setting them to "false" anywhere. Deprecate them so
that they can be removed in the future, and to reduce the chances of
confusion when new profile variables are introduced in the near future.

Part of https://github.com/letsencrypt/boulder/issues/7610
2024-07-22 15:27:56 -07:00
Aaron Gable a3e99432bb
goodkey: default to 110 rounds of Fermat factorization (#7579)
This change guarantees compliance with CA/BF Ballot SC-073 "Compromised
and Weak Keys", which requires that at least 100 rounds of Fermat
Factorization be attempted:

> Section 6.1.1.3 Subscriber Key Pair Generation
> The CA SHALL reject a certificate request if... The Public Key
corresponds to an industry-demonstrated weak Private Key. For requests
submitted on or after November 15, 2024,... In the case of Close Primes
vulnerability (https://fermatattack.secvuln.info/), the CA SHALL reject
weak keys which can be factored within 100 rounds using Fermat’s
factorization method.

We choose 110 rounds to ensure a margin above and beyond the requirements.

Fixes https://github.com/letsencrypt/boulder/issues/7558
2024-07-17 16:05:30 -07:00
Phil Porada 30c6e592f7
sfe: Implement self-service frontend for account pausing/unpausing (#7500)
Adds a new boulder component named `sfe` aka the Self-service FrontEnd
which is dedicated to non-ACME related Subscriber functions. This change
implements one such function which is a web interface and handlers for
account unpausing.

When paused, an ACME client receives a log line URL with a JWT parameter
from the WFE. For the observant Subscriber, manually clicking the link
opens their web browser and displays a page with a pre-filled HTML form.
Upon clicking the form button, the SFE sends an HTTP POST back to itself
and either validates the JWT and issues an RA gRPC request to unpause
the account, or returns an HTML error page.

The SFE and WFE should share a 32 byte seed value e.g. the output of
`openssl rand -hex 16` which will be used as a go-jose symmetric signer
using the HS256 algorithm. The SFE will check various [RFC
7519](https://datatracker.ietf.org/doc/html/rfc7519) claims on the JWT
such as the `iss`, `aud`, `nbf`, `exp`, `iat`, and a custom `apiVersion`
claim.

The SFE should not yet be relied upon or deployed to staging/production
environments. It is very much a work in progress, but this change is big
enough as-is.

Related to https://github.com/letsencrypt/boulder/issues/7406
Part of https://github.com/letsencrypt/boulder/issues/7499
2024-07-10 10:52:33 -04:00
Samantha Frank 55c274d132
ratelimits: Exempt renewals from NewOrdersPerAccount and CertificatesPerDomain (#7513)
- Rename `NewOrderRequest` field `LimitsExempt` to `IsARIRenewal`
- Introduce a new `NewOrderRequest` field, `IsRenewal`
- Introduce a new (temporary) feature flag, `CheckRenewalExemptionAtWFE`

WFE:
- Perform renewal detection in the WFE when `CheckRenewalExemptionAtWFE`
is set
- Skip (key-value) `NewOrdersPerAccount` and `CertificatesPerDomain`
limit checks when renewal detection indicates the the order is a
renewal.

RA:
- Leave renewal detection in the RA intact
- Skip renewal detection and (legacy) `NewOrdersPerAccount` and
`CertificatesPerDomain` limit checks when `CheckRenewalExemptionAtWFE`
is set and the `NewOrderRequest` indicates that the order is a renewal.

Fixes #7508
Part of #5545
2024-06-27 16:39:31 -04:00
Phil Porada 9207669755
Deprecate ECDSAForAll feature and remove ECDSAAllowList (#7560)
`ECDSAForAll` feature is now enabled by default (due to it not being
referenced in any issuance path) and as a result the `ECDSAAllowlist`
has been deleted.

Fixes https://github.com/letsencrypt/boulder/issues/7535
2024-06-26 10:38:51 -04:00
Aaron Gable c3c278a1a2
Deprecate EnforceMultiVA and MultiVAFullResults feature flags (#7520)
These flags have been true and false, respectively, for years. We do not
expect to change them at any time in the future, and their continued
existence makes certain parts of the VA code significantly more complex.

Remove all references to them, preserving behavior in the "enforce, but
not full results" configuration.

IN-10358 tracks the corresponding config changes
2024-06-04 11:57:03 -07:00
Aaron Gable 0d8efb9b38
Purger: compute throughput values from number of instances (#7502)
Give akamai-purger a new "Throughput.TotalInstances" config value, to
inform it how many instances of itself are competing for akamai rate
limit quote. Combine the `useOptimizedDefaults` and `validate` functions
into a single `optimizeAndValidate` function which sets default values
according to the number of active instances, and confirms that the
results still fall within the rate limits.

Fixes https://github.com/letsencrypt/boulder/issues/7487
2024-05-24 13:30:46 -04:00
Aaron Gable 5be3650e56
Remove deprecated WFE.RedeemNonceServices (#7493)
Fixes https://github.com/letsencrypt/boulder/issues/6610
2024-05-21 13:13:13 -04:00
Aaron Gable 146b78a0f7
Remove all static minica keys (#7489)
Remove the redis-tls, wfe-tls, and mail-test-srv keys which were
generated by minica and then checked in to the repo. All three are
replaced by the dynamically-generated ipki directory.

Part of https://github.com/letsencrypt/boulder/issues/7476
2024-05-17 11:45:40 -07:00
Aaron Gable 6ae6aa8e90
Dynamically generate grpc-creds at integration test startup (#7477)
The summary here is:
- Move test/cert-ceremonies to test/certs
- Move .hierarchy (generated by the above) to test/certs/webpki
- Remove our mapping of .hierarchy to /hierarchy inside docker
- Move test/grpc-creds to test/certs/ipki
- Unify the generation of both test/certs/webpki and test/certs/ipki
into a single script at test/certs/generate.sh
- Make that script the entrypoint of a new docker compose service
- Have t.sh and tn.sh invoke that service to ensure keys and certs are
created before tests run

No production changes are necessary, the config changes here are just
for testing purposes.

Part of https://github.com/letsencrypt/boulder/issues/7476
2024-05-15 11:31:23 -04:00
Phil Porada 44c0587988
remoteva: Config options to handle alternate deployment models (#7473)
* Adds a `VerifyGRPCClientCertIfGiven` boolean to the `remoteva` config
that cause the RVA server to use the less strict
`tls.VerifyClientCertIfGiven` for use with an Amazon Web Services
Application Load Balancer (ALB) between the `boulder-va` and `remoteva`.
See https://github.com/letsencrypt/boulder/issues/7386.

Part of https://github.com/letsencrypt/boulder/issues/5294

---------

Co-authored-by: Samantha <hello@entropy.cat>
2024-05-13 14:43:40 -04:00
Samantha 16d55ef120
ratelimits: Support new Comment field for each Id entry (#7480)
Fixes #7478
2024-05-13 14:16:51 -04:00
Phil Porada c1561b070b
Add a new remoteva binary (#7437)
* Adds a new `remoteva` binary that takes a distinct configuration from
the existing `boulder-va`
* Removed the `boulder-remoteva` name registration from `boulder-va`. 
  * Existing users of `boulder-remoteva` must either 
1. laterally migrate to `boulder-va` which uses that same config, or
    2. switch to using `remoteva` with a new config.

Part of https://github.com/letsencrypt/boulder/issues/5294
2024-05-06 16:29:29 -04:00
Aaron Gable 939ac1be8f
Add pkilint to CI via custom zlint (#7441)
Add a new "LintConfig" item to the CA's config, which can point to a
zlint configuration toml file. This allows lints to be configured, e.g.
to control the number of rounds of factorization performed by the Fermat
factorization lint.

Leverage this new config to create a new custom zlint which calls out to
a configured pkilint API endpoint. In config-next integration tests,
configure the lint to point at a new pkilint docker container.

This approach has three nice forward-looking features: we now have the
ability to configure any of our lints; it's easy to expand this
mechanism to lint CRLs when the pkilint API has support for that; and
it's easy to enable this new lint if we decide to stand up a pkilint
container in our production environment.

No production configuration changes are necessary at this time.

Fixes https://github.com/letsencrypt/boulder/issues/7430
2024-04-30 09:29:26 -07:00
Phil Porada 57a4995a26
test: Remove n_subject_common_name_included from ignored lint list (#7453)
Fixes https://github.com/letsencrypt/boulder/issues/7261
2024-04-25 13:37:40 -04:00
Aaron Gable 94d14689bf
Implement unpredictable issuance from similar intermediates (#7418)
Replace the CA's "useForRSA" and "useForECDSA" config keys with a single
"active" boolean. When the CA starts up, all active RSA issuers will be
used to issue precerts with RSA pubkeys, and all ECDSA issuers will be
used to issue precerts with ECDSA pubkeys (if the ECDSAForAll flag is
true; otherwise just those that are on the allow-list). All "inactive"
issuers can still issue OCSP responses, CRLs, and (notably) final
certificates.

Instead of using the "useForRSA" and "useForECDSA" flags, plus implicit
config ordering, to determine which issuer to use to handle a given
issuance, simply use the issuer's public key algorithm to determine
which issuances it should be handling. All implicit ordering
considerations are removed, because the "active" certificates now just
form a pool that is sampled from randomly.

To facilitate this, update some unit and integration tests to be more
flexible and try multiple potential issuing intermediates, particularly
when constructing OCSP requests.

For this change to be safe to deploy with no user-visible behavior
changes, the CA configs must contain:
- Exactly one RSA-keyed intermediate with "useForRSALeaves" set to true;
and
- Exactly one ECDSA-keyed intermediate with "useForECDSALeaves" set to
true.

If the configs contain more than one intermediate meeting one of the
bullets above, then randomized issuance will begin immediately.

Fixes https://github.com/letsencrypt/boulder/issues/7291
Fixes https://github.com/letsencrypt/boulder/issues/7290
2024-04-18 10:00:38 -07:00
Aaron Gable ce8986e17b
Make "CRLDPBase" config item optional (#7427)
This was missed in https://github.com/letsencrypt/boulder/pull/7300

Part of https://github.com/letsencrypt/boulder/issues/7296
2024-04-12 11:23:27 -07:00
Aaron Gable 327f96d281
Update integration test hierarchy for the modern era (#7411)
Update the hierarchy which the integration tests auto-generate inside
the ./hierarchy folder to include three intermediates of each key type,
two to be actively loaded and one to be held in reserve. To facilitate
this:
- Update the generation script to loop, rather than hard-coding each
intermediate we want
- Improve the filenames of the generated hierarchy to be more readable
- Replace the WFE's AIA endpoint with a thin aia-test-srv so that we
don't have to have NameIDs hardcoded in our ca.json configs

Having this new hierarchy will make it easier for our integration tests
to validate that new features like "unpredictable issuance" are working
correctly.

Part of https://github.com/letsencrypt/boulder/issues/729
2024-04-08 14:06:00 -07:00
Phil Porada 1e1f6ff254
CA: Load multiple certificate profiles (#7325)
This change introduces a new config key `certProfiles` which contains a
map of `profiles`. Only one of `profile` or `certProfiles` should be
used, because configuring both will result in the CA erroring and
shutting down. Further, the singular `profile` is now
[deprecated](https://github.com/letsencrypt/boulder/issues/7414).

The CA pre-computes several maps at startup; 
* A human-readable name to a `*issuance.Profile` which is referred to as
"name".
* A SHA-256 sum over the entire contents of the given profile to the
`*issuance.Profile`. We'll refer to this as "hash".

Internally, CA methods no longer pass an `*issuance.Profile`, instead
they pass a structure containing maps of certificate profile
identifiers. To determine the default profile used by the CA, a new
config field `defaultCertificateProfileName` has been added to the
Issuance struct. Absence of `defaultCertificateProfileName` will cause
the CA to use the default value of `defaultBoulderCertificateProfile`
such as for the the deprecated `profile`. The key for each given
certificate profile will be used as the "name". Duplicate names or
hashes will cause the CA to error during initialization and shutdown.

When the RA calls `ra.CA.IssuePrecertificate`, it will pass an arbitrary
certificate profile name to the CA triggering the CA to lookup if the
name exists in its internal mapping. The RA maintains no state or
knowledge of configured certificate profiles and relies on the CA to
provide this information. If the name exists in the CA's map, it will
return the hash along with the precertificate bytes in a
`capb.IssuePrecertificateResponse`. The RA will then call
`ra.CA.IssueCertificateForPrecertificate` with that same hash. The CA
will lookup the hash to determine if it exists in its map, and if so
will continue on with certificate issuance.

Precertificate and certificate issuance audit logs will now include the
certificate profile name and hex representation of the hash that they
were issued with.

Fixes https://github.com/letsencrypt/boulder/issues/6966

There are no required config or SQL changes.
2024-04-08 12:52:46 -04:00
Phil Porada 8556eaedca
SA: store and return certificate profile name (#7352)
Adds `certificateProfileName` to the `orders` database table. The
[maximum
length](https://github.com/letsencrypt/boulder/pull/7325/files#diff-a64a0af7cbf484da8e6d08d3eefdeef9314c5d9888233f0adcecd21b800102acR35)
of a profile name matches the `//issuance` package.

Adds a `MultipleCertificateProfiles` feature flag that, when enabled,
will store the certificate profile name from a `NewOrderRequest`. The
certificate profile name is allowed to be empty and the database will
treat that row as [NULL](https://mariadb.com/kb/en/null-values/). When
the SA retrieves this potentially NULL row, it will be cast as the
golang string zero value `""`.

SRE ticket IN-10145 has been filed to perform the database migration and
enable the new feature flag. The migration must be performed before
enabling the feature flag.

Part of https://github.com/letsencrypt/boulder/issues/7324
2024-03-20 13:08:31 -04:00
Samantha c6b50558e6
WFE: Add support for certificate profiles (#7373)
- Parse and validate the `profile` field in `newOrder` requests.
- Pass the `profile` field from `newOrder` calls to the resulting
`RA.NewOrder` call.
- When the client requests a specific profile, ensure that the profile
field is populated in the order returned.

Fixes #7332
Part of #7309
2024-03-20 12:49:45 -04:00
Samantha 5e68cbe552
WFE: Gate ARI limit exemption and replacement tracking on a feature flag (#7383)
Gate checking of replacement orders and exemption for ARI replacements
on the `TrackReplacementCertificatesARI` feature flag.
2024-03-18 12:22:01 -04:00
Samantha 529157ce56
ratelimits: Fix transaction building for Failed Authorizations Limit (#7344)
- Update the failed authorizations limit to use 'enum:regId:domain' for
transactions while maintaining 'enum:regId' for overrides.
- Modify the failed authorizations transaction builder to generate a
transaction for each order name.
- Rename the `FailedAuthorizationsPerAccount` enum to
`FailedAuthorizationsPerDomainPerAccount` to align with its corrected
implementation. This change is possible because the limit isn't yet
deployed in staging or production.

Blocks #7346
Part of #5545
2024-03-06 13:48:32 -05:00
Matthew McPherrin 313e3b93ba
Add DNSStaticResolver option (#7336)
We run the RVAs in AWS, where we don't have all the same service
discovery infrastructure we do for the primary VAs and the rest of
Boulder. The solution for populating SRV records we have today hasn't
been reliable, so we'd like to experiment with bringing up RVAs paired
1:1 with a local DNS resolver. This brings back some of the previous
static DNS resolver configuration, though it's not a clean revert
because other configuration has changed in the meantime
2024-02-23 14:45:01 -08:00
Aaron Gable 78e4e82ffa
Feature cleanup (#7320)
Remove three deprecated feature flags which have been removed from all
production configs:
- StoreLintingCertificateInsteadOfPrecertificate
- LeaseCRLShards
- AllowUnrecognizedFeatures

Deprecate three flags which are set to true in all production configs:
- CAAAfterValidation
- AllowNoCommonName
- SHA256SubjectKeyIdentifier

IN-9879 tracked the removal of these flags.
2024-02-13 17:42:27 -08:00
Aaron Gable ad699af3d4
Add CRL capabilities to issuance package (#7300)
Move the CRL issuance logic -- building an x509.RevocationList template,
populating it with correctly-built extensions, linting it, and actually
signing it -- out of the //ca package and into the //issuance package.
This means that the CA's CRL code no longer needs to be able to reach
inside the issuance package to access its issuers and certificates (and
those fields will be able to be made private after the same is done for
OCSP issuance).

Additionally, improve the configuration of CRL issuance, create
additional checks on CRL's ThisUpdate and NextUpdate fields, and make it
possible for a CRL to contain two IssuingDistributionPoint URIs so that
we can migrate to shorter addresses.

IN-10045 tracks the corresponding production changes.

Fixes https://github.com/letsencrypt/boulder/issues/7159
Part of https://github.com/letsencrypt/boulder/issues/7296
Part of https://github.com/letsencrypt/boulder/issues/7294
Part of https://github.com/letsencrypt/boulder/issues/7094
Part of https://github.com/letsencrypt/boulder/issues/7100
2024-02-13 09:13:36 -08:00
Samantha f10abd27eb
SA/ARI: Add method of tracking certificate replacement (#7284)
Part of #6732
Part of #7038
2024-02-08 14:19:29 -05:00
Aaron Gable 10e894a172
Create new admin tool (#7276)
Create a new administration tool "bin/admin" as a successor to and
replacement of "admin-revoker".

This new tool supports all the same fundamental capabilities as the old
admin-revoker, including:
- Revoking by serial, by batch of serials, by incident table, and by
private key
- Blocking a key to let bad-key-revoker take care of revocation
- Clearing email addresses from all accounts that use them

Improvements over the old admin-revoker include:
- All commands run in "dry-run" mode by default, to prevent accidental
executions
- All revocation mechanisms allow setting the revocation reason,
skipping blocking the key, indicating that the certificate is malformed,
and controlling the number of parallel workers conducting revocation
- All revocation mechanisms do not parse the cert in question, leaving
that to the RA
- Autogenerated usage information for all subcommands
- A much more modular structure to simplify adding more capabilities in
the future
- Significantly simplified tests with smaller mocks

The new tool has analogues of all of admin-revokers unit tests, and all
integration tests have been updated to use the new tool instead. A
future PR will remove admin-revoker, once we're sure SRE has had time to
update all of their playbooks.

Fixes https://github.com/letsencrypt/boulder/issues/7135
Fixes https://github.com/letsencrypt/boulder/issues/7269
Fixes https://github.com/letsencrypt/boulder/issues/7268
Fixes https://github.com/letsencrypt/boulder/issues/6927
Part of https://github.com/letsencrypt/boulder/issues/6840
2024-02-07 09:35:18 -08:00
Samantha 97a19b18d2
WFE: Check NewOrder rate limits (#7201)
Add non-blocking checks of New Order limits to the WFE using the new
key-value based rate limits package.

Part of #5545
2024-01-26 21:05:30 -05:00
Phil Porada 03152aadc6
RVA: Recheck CAA records (#7221)
Previously, `va.IsCAAValid` would only check CAA records from the
primary VA during initial domain control validation, completely ignoring
any configured RVAs. The upcoming
[MPIC](https://github.com/ryancdickson/staging/pull/8) ballot will
require that it be done from multiple perspectives. With the currently
deployed [Multi-Perspective
Validation](https://letsencrypt.org/2020/02/19/multi-perspective-validation.html)
in staging and production, this change brings us in line with the
[proposed phase
3](https://github.com/ryancdickson/staging/pull/8/files#r1368708684).
This change reuses the existing
[MaxRemoteValidationFailures](21fc191273/cmd/boulder-va/main.go (L35))
variable for the required non-corroboration quorum.
> Phase 3: June 15, 2025 - December 14, 2025 ("CAs MUST implement MPIC
in blocking mode*"):
>
>    MUST implement MPIC? Yes
> Required quorum?: Minimally, 2 remote perspectives must be used. If
using less than 6 remote perspectives, 1 non-corroboration is allowed.
If using 6 or more remote perspectives, 2 non-corroborations are
allowed.
>    MUST block issuance if quorum is not met: Yes.
> Geographic diversity requirements?: Perspectives must be 500km from 1)
the primary perspective and 2) all other perspectives used in the
quorum.
>
> * Note: "Blocking Mode" is a nickname. As opposed to "monitoring mode"
(described in the last milestone), CAs MUST NOT issue a certificate if
quorum requirements are not met from this point forward.

Adds new VA feature flags: 
* `EnforceMultiCAA` instructs a primary VA to command each of its
configured RVAs to perform a CAA recheck.
* `MultiCAAFullResults` causes the primary VA to block waiting for all
RVA CAA recheck results to arrive.


Renamed `va.logRemoteValidationDifferentials` to
`va.logRemoteDifferentials` because it can handle initial domain control
validations and CAA rechecking with minimal editing.

Part of https://github.com/letsencrypt/boulder/issues/7061
2024-01-25 16:23:25 -05:00
Jacob Hoffman-Andrews ce5632b480
Remove `service1` / `service2` names in consul (#7266)
These names corresponded to single instances of a service, and were
primarily used for (a) specifying which interface to bind a gRPC port on
and (b) allowing `health-checker` to check individual instances rather
than a service as a whole.

For (a), change the `--grpc-addr` flags to bind to "all interfaces." For
(b), provide a specific IP address and port for health checking. This
required adding a `--hostOverride` flag for `health-checker` because the
service certificates contain hostname SANs, not IP address SANs.

Clarify the situation with nonce services a little bit. Previously we
had one nonce "service" in Consul and got nonces from that (i.e.
randomly between the two nonce-service instances). Now we have two nonce
services in consul, representing multiple datacenters, and one of them
is explicitly configured as the "get" service, while both are configured
as the "redeem" service.

Part of #7245.

Note this change does not yet get rid of the rednet/bluenet distinction,
nor does it get rid of all use of 10.88.88.88. That will be a followup
change.
2024-01-22 09:34:20 -08:00
Matthew McPherrin 56c10c613c
Update zlint (#7252)
Upgrade to zlint v3.6.0

Two new lints are triggered in various places:
aia_contains_internal_names is ignored in integration test
configurations, and unit tests are updated to have more realistic URLs.
The w_subject_common_name_included lint needs to be ignored where we'd
ignored n_subject_common_name_included before.

Related to https://github.com/letsencrypt/boulder/issues/7261
2024-01-16 11:50:37 -08:00
Aaron Gable d38b7b685b
Fix flaky integration test failures (#7262)
This partially reverts commit 20b121138c,
which was landed in https://github.com/letsencrypt/boulder/pull/7254.
Specifically, it reverts the addition of "noWaitForReady" to the
health-checker's gRPC config. This appears to stop the flaky `last
resolver error: produced zero addresses` failures we've been seeing in
the CI integration tests.
2024-01-16 09:50:13 -08:00
Jacob Hoffman-Andrews 20b121138c
health-checker: bail early on handshake failure (#7254)
When we have a problem with our authentication certificates, it's better
to get a clear error early than to wait for health checker to time out.

Also, set noWaitForReady in the config, which prevents detailed errors
from being obscured by "timed out" errors.
2024-01-11 09:36:35 -08:00
Jacob Hoffman-Andrews 7b347dd6c3
Use different ports for instances of the same service (#7246)
Part of #7245.

This just provides a unique port for each instance, and breaks the
service<->port mapping. A subsequent PR will move to listening on the
same IP.

Remove unused `-b` variants of crl-storer and akamai-purger.

The new port scheme is that the first instance of a service is on `93xx`
and the second instance of a service is on `94xx`.

Part of a stacked change with #7243.
2024-01-10 14:32:33 -08:00
Phil Porada 2e951b0105
Remove ca-a and ca-b distinction in test configs (#7238)
Fixes https://github.com/letsencrypt/boulder/issues/7187
2024-01-08 13:19:28 -08:00
Aaron Gable 6b54b61f21
Prevent serial prefixes from beginning with a 1 (#7214)
Change the max value of the CA's `SerialPrefix` config value from 255 (a
byte of all 1s) to 127 (a byte of one 0 followed by seven 1s). This
prevents the serial prefix from ever beginning with a 1.

This is important because serials are interpreted as signed
(twos-complement) integers, and are required to be positive -- a serial
whose first bit is 1 is considered to be negative and therefore in
violation of RFC 5280. The go stdlib fixes this for us by prepending a
zero byte to any serial that begins with a 1 bit, but we'd prefer all
our serials to be the same length.

Corresponding config change was completed in IN-9880.
2023-12-15 07:37:44 -08:00
Aaron Gable 97cba52e09
Remove deprecated and unused feature flags (#7207)
These feature flags are no longer referenced in any test, staging, or
production configuration. They were removed in:
- StoreRevokerInfo: IN-8546
- ROCSPStage6 and ROCSPStage7: IN-8886
- CAAValidationMethods and CAAAccountURI: IN-9301
2023-12-13 13:53:31 -08:00
Samantha 8cd1e60abf
ratelimits: More compact overrides format (#7199)
Support a more compact format for supplying overrides to default rate
limits.

Fixes #7197
2023-12-11 11:23:39 -08:00
Jacob Hoffman-Andrews c21b376623
Implement DoH for validation queries (#7178)
Fixes: #7141
2023-12-11 10:49:00 -08:00
Matthew McPherrin cb5384dcd7
Add --addr and/or --debug-addr flags to all commands (#7175)
Many services already have --addr and/or --debug-addr flags.

However, it wasn't universal, so this PR adds flags to commands where
they're not currently present.

This makes it easier to use a shared config file but listen on different
ports, for running multiple instances on a single host.

The config options are made optional as well, and removed from
config-next/.
2023-12-07 17:41:01 -08:00
Phil Porada 3366be50f1
Use RFC 7093 truncated SHA256 hash for Subject Key Identifier (#7179)
- Adds a feature flag to gate rollout for SHA256 Subject Key Identifiers
for end-entity certificates.
- The ceremony tool will now use the RFC 7093 section 2 option 1 method
for generating Subject Key Identifiers for future root CA, intermediate
CA, and cross-sign ceremonies.

- - - -

[RFC 7093 section 2 option
1](https://datatracker.ietf.org/doc/html/rfc7093#section-2) provides a
method for generating a truncated SHA256 hash for the Subject Key
Identifier field in accordance with Baseline Requirement [section
7.1.2.11.4 Subject Key
Identifier](90a98dc7c1/docs/BR.md (712114-subject-key-identifier)).

> [RFC5280] specifies two examples for generating key identifiers from
>    public keys.  Four additional mechanisms are as follows:
> 
>    1) The keyIdentifier is composed of the leftmost 160-bits of the
>       SHA-256 hash of the value of the BIT STRING subjectPublicKey
>       (excluding the tag, length, and number of unused bits).

The related [RFC 5280 section
4.2.1.2](https://datatracker.ietf.org/doc/html/rfc5280#section-4.2.1.2)
states:
>   For CA certificates, subject key identifiers SHOULD be derived from
>   the public key or a method that generates unique values.  Two common
>   methods for generating key identifiers from the public key are:
>   ...
>   Other methods of generating unique numbers are also acceptable.
2023-12-06 13:44:17 -05:00
Matthew McPherrin 32adaf1846
Make log-validator take glob patterns to monitor for log files (#7172)
To simplify deployment of the log validator, this allows wildcards
(using go's filepath.Glob) to be included in the file paths.

In order to detect new files, a new background goroutine polls the glob
patterns every minute for matches.

Because the "monitor" function is running in its own goroutine, a lock
is needed to ensure it's not trying to add new tailers while shutdown is
happening.
2023-11-27 12:48:46 -08:00
Aaron Gable 16081d8e30
Invert RequireCommonName into AllowNoCommonName (#7139)
The RequireCommonName feature flag was our only "inverted" feature flag,
which defaulted to true and had to be explicitly set to false. This
inversion can lead to confusion, especially to readers who expect all Go
default values to be zero values. We plan to remove the ability for our
feature flag system to support default-true flags, which the existence
of this flag blocked. Since this flag has not been set in any real
configs, inverting it is easy.

Part of https://github.com/letsencrypt/boulder/issues/6802
2023-11-06 10:58:30 -08:00
Aaron Gable 81cb970d30
Remove crlURL from test CA issuer configs (#7132)
This value is always set to the empty string in prod, which (correctly)
results in the issued certificates not having a CRLDP at all. It turns
out our integration test environment has been including CRLDPs in all of
our test certs because we set crlURL to a non-empty value! This change
updates our test configs to match reality.

I'll remove the code which supports this config value as part of my
upcoming CA CRLDP changes.
2023-11-02 11:20:50 -07:00
Samantha 9aef5839b5
WFE: Add new key-value ratelimits implementation (#7089)
Integrate the key-value rate limits from #6947 into the WFE. Rate limits
are backed by the Redis source added in #7016, and use the SRV record
shard discovery added in #7042.

Part of #5545
2023-10-04 14:12:38 -04:00
Aaron Gable 3b880e1ccf
Add CAAAfterValidation feature flag (#7082)
Add a new feature flag "CAAAfterValidation" which, when set to true in
the VA, causes the VA to only begin CAA checks after basic domain
control validation has completed successfully. This will make successful
validations take longer, since the DCV and CAA checks are performed
serially instead of in parallel. However, it will also reduce the number
of CAA checks we perform by up to 80%, since such a high percentage of
validations also fail.

IN-9575 tracks enabling this feature flag in staging and prod
Fixes https://github.com/letsencrypt/boulder/issues/7058
2023-09-18 13:30:31 -07:00
Aaron Gable 102b447e8d
Smoother scheduling and leasing for crl-updater (#7010)
Overhaul crl-updater's default (i.e. non-runOnce) behavior to update
individual CRL shards continuously, rather than updating all shards in a
large batch.

To accomplish this, it spins up one goroutine for each shard of each
issuer this updater is responsible for. Each goroutine is solely
responsible for its assigned shard. It sleeps for a random amount of
time (to stagger their starts), then begins a ticker to wake up every
updateInterval and re-issue its shard.

As part of this change, refactor updater.go into three separate files
(batch.go, continuous.go, and updater.go) containing functions dedicated
to single-run batch processing, long-running continuous processing, and
shared helpers, respectively.

IN-9475 tracks the deprecation of the `updateOffset` config key. The
other configuration changes in this PR do not require production
changes.

Fixes https://github.com/letsencrypt/boulder/issues/7023
2023-09-08 09:16:15 -07:00
Phil Porada 72e01b337a
ceremony: Distinguish between intermediate and cross-sign ceremonies (#7005)
In `//cmd/ceremony`:
* Added `CertificateToCrossSignPath` to the `cross-certificate` ceremony
type. This new input field takes an existing certificate that will be
cross-signed and performs checks against the manually configured data in
each ceremony file.
* Added byte-for-byte subject/issuer comparison checks to root,
intermediate, and cross-certificate ceremonies to detect that signing is
happening as expected.
* Added Fermat factorization check from the `//goodkey` package to all
functions that generate new key material.

In `//linter`: 
* The Check function now exports linting certificate bytes. The idea is
that a linting certificate's `tbsCertificate` bytes can be compared
against the final certificate's `tbsCertificate` bytes as a verification
that `x509.CreateCertificate` was deterministic and produced identical
DER bytes after each signing operation.

Other notable changes:
* Re-orders the issuers list in each CA config to match staging and
production. There is an ordering issue mentioned by @aarongable two
years ago on IN-5913 that didn't make it's way back to this repository.
> Order here matters – the default chain we serve for each intermediate
should be the first listed chain containing that intermediate.
* Enables `ECDSAForAll` in `config-next` CA configs to match Staging.
* Generates 2x new ECDSA subordinate CAs cross-signed by an RSA root and
adds these chains to the WFE for clients to download.
* Increased the test.sh startup timeout to account for the extra
ceremony run time.


Fixes https://github.com/letsencrypt/boulder/issues/7003

---------

Co-authored-by: Aaron Gable <aaron@letsencrypt.org>
2023-08-23 14:01:19 -04:00
Aaron Gable 6a450a2272
Improve CRL shard leasing (#7030)
Simplify the index-picking logic in the SA's leaseOldestCrlShard method.
Specifically, more clearly separate it into "missing" and "non-missing"
cases, which require entirely different logic: picking a random missing
shard, or picking the oldest unleased shard, respectively.

Also change the UpdateCRLShard method to "unlease" shards when they're
updated. This allows the crl-updater to run as quickly as it likes,
while still ensuring that multiple instances do not step on each other's
toes.

The config change for shardWidth and lookbackPeriod instead of
certificateLifetime has been deployed in prod since IN-8445. The config
change changing the shardWidth is just so that the tests neither produce
a bazillion shards, nor have to do a bazillion SA queries for each chunk
within a shard, improving the readability of test logs.

Part of https://github.com/letsencrypt/boulder/issues/7023
2023-08-08 17:05:00 -07:00
Aaron Gable 9a4f0ca678
Deprecate LeaseCRLShards feature (#7009)
This feature flag is enabled in both staging and prod.
2023-08-07 15:17:00 -07:00
Jacob Hoffman-Andrews 725f190c01
ca: remove orphan queue code (#7025)
The `orphanQueueDir` config field is no longer used anywhere.

Fixes #6551
2023-08-02 16:04:28 -07:00
Jacob Hoffman-Andrews 8d7b87c9ca
cert-checker: check for precertificate correspondence (#7015)
This adds a lookup in cert-checker to find the linting precertificate
with the same serial number as a given final certificate, and checks
precertificate correspondence between the two.

Fixes #6959
2023-07-28 12:45:47 -04:00
Aaron Gable 908421bb98
crl-updater: lease CRL shards to prevent races (#6941)
Add a new feature flag, LeaseCRLShards, which controls certain aspects
of crl-updater's behavior.

When this flag is enabled, crl-updater calls the new SA.LeaseCRLShard
method before beginning work on a shard. This prevents it from stepping
on the toes of another crl-updater instance which may be working on the
same shard. This is important to prevent two competing instances from
accidentally updating a CRL's Number (which is an integer representation
of its thisUpdate timestamp) *backwards*, which would be a compliance
violation.

When this flag is enabled, crl-updater also calls the new
SA.UpdateCRLShard method after finishing work on a shard.

In the future, additional work will be done to make crl-updater use the
"give me the oldest available shard" mode of the LeaseCRLShard method.

Fixes https://github.com/letsencrypt/boulder/issues/6897
2023-07-19 15:11:16 -07:00
Aaron Gable e09c5faf5e
Deprecate CAA AccountURI and ValidationMethods feature flags (#7000)
These flags are set to true in all environments.
2023-07-14 14:54:39 -04:00
Jacob Hoffman-Andrews cd24b9db20
ca: deprecate StoreLintingCertificateInsteadOfPrecertificate (#6970)
And turn off the orphan queue in config-next.
2023-07-05 10:44:08 -07:00
Samantha 124c4cc6f5
grpc/sa: Implement deep health checks (#6928)
Add the necessary scaffolding for deep health checking of our various
gRPC components. Each component implementation that also implements the
grpc.checker interface will be checked periodically, and the health
status of the component will be updated accordingly.

Add the necessary methods to SA to implement the grpc.checker interface
and register these new health checks with Consul.

Additionally:
- Update entry point script to check for ProxySQL readiness.
- Increase the poll rate for gRPC Consul checks from 5s to 2s to help
with DNS failures, due to check failures, on startup.
- Change log level for Consul from INFO to ERROR to deal with noisy logs
full of transport failures due to Consul gRPC checks firing before the
SAs are up.

Fixes #6878
Part of #6795
2023-06-12 13:58:53 -04:00
Jacob Hoffman-Andrews 2041e8723b
integration: shorten log output (#6894)
Remove the load test stage of the integration test, which generates
superfluous amounts of log.

Turn down logging on the CA and VA from info to error-only.

Part of https://github.com/letsencrypt/boulder/issues/6890
2023-06-05 13:11:19 -04:00
Jacob Hoffman-Andrews 80e1510819
admin: add clear-email subcommand (#6919)
When a user wants their email address deleted from the database but no
longer has access to their account, this allows an administrator to
clear it.

This adds `admin` as an alias for `admin-revoker`, because we'd like the
clear-email sub-command to be a part of that overall tool, but it's not
really revocation related.

Part of #6864
2023-06-01 14:33:24 -04:00
Samantha f09a94bd74
consul: Configure gRPC health check for SA (#6908)
Enable SA gRPC health checks in Consul ahead of further changes for
#6878. Calls to the `Check` method of the SA's grpc.health.v1.Health
service must respond `SERVING` before the `sa` service will be
advertised in Consul DNS. Consul will continue to poll this service
every 5 seconds.

- Add `bconsul` docker service to boulder `bluenet` and `rednet`
- Add TLS credentials for `consul.boulder`:
  ```shell
  $ openssl x509 -in consul.boulder/cert.pem -text | grep DNS
                DNS:consul.boulder
  ```
- Update `test/grpc-creds/generate.sh` to add `consul.boulder`
- Update test SA configs to allow `consul.boulder` to access to
`grpc.health.v1.Health`

Part of #6878
2023-05-23 13:16:49 -04:00
Aaron Gable fe523f142d
crl-updater: retry failed shards (#6907)
Add per-shard exponential backoff and retry to crl-updater. Each
individual CRL shard will be retried up to MaxAttempts (default 1)
times, with exponential backoff starting at 1 second and maxing out at 1
minute between each attempt.

This can effectively reduce the parallelism of crl-updater: while a
goroutine is sleeping between attempts of a failing shard, it is not
doing work on another shard. This is a desirable feature, since it means
that crl-updater gently reduces the total load it places on the network
and database when shards start to fail.

Setting this new config parameter is tracked in IN-9140
Fixes https://github.com/letsencrypt/boulder/issues/6895
2023-05-22 12:59:09 -07:00
Matthew McPherrin b7d9f8c2e3
In config-next/, opentelemetry -> openTelemetry for consistency (#6888)
In configs, opentelemetry -> openTelemetry

As pointed out in review of #6867, these should match the case of their
corresponding Go identifiers for consistency.

JSON keys are case-insensitive in Go (part of why we've got a fork in
go-jose),
so this change should have no functional impact.
2023-05-15 17:07:29 -04:00
Matthew McPherrin 3aae67b8a9
Opentelemetry: Add option for public endpoints (#6867)
This PR adds a new configuration block specifically for the otelhttp
instrumentation. This block is separate from the existing
"opentelemetry" configuration, and is only relevant when using otelhttp
instrumentation. It does not share any codepath with the existing
configuration, so it is at the top level to indicate which services it
applies to.

There's a bit of plumbing new configuration through. I've adopted the
measured_http package to also set up opentelemetry instead of just
metrics, which should hopefully allow any future changes to be smaller
(just config & there) and more consistent between the wfe2 and ocsp
responder.

There's one option here now, which disables setting
[otelhttp.WithPublicEndpoint](https://pkg.go.dev/go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp#WithPublicEndpoint).
This option is designed to do exactly what we want: Don't accept
incoming spans as parents of the new span created in the server.
Previously we had a setting to disable parent-based sampling to help
with this problem, which doesn't really make sense anymore, so let's
just remove it and simplify that setup path. The default of "false" is
designed to be the safe option. It's set to True in the test/ configs
for integration tests that use traces, and I expect we'll likely set it
true in production eventually once the LBs are configured to handle
tracing themselves.

Fixes #6851
2023-05-12 15:34:34 -04:00
Samantha 310546a14e
VA: Support discovery of DNS resolvers via Consul (#6869)
Deprecate `va.DNSResolver` in favor of backwards compatible
`va.DNSProvider`.

Fixes #6852
2023-05-12 12:54:31 -04:00
Samantha 19c5244088
test: Use consul hostname instead of IP for dnsAuthority (#6883)
Standardize on hostnames for dnsAuthority to match production. 

Related to #6869
2023-05-11 14:13:53 -07:00
Jacob Hoffman-Andrews f295626e4c
ca: remove simulated ISRG OID from config (#6879)
We intend to issue in the future with only the CA/Browser Forum Domain
Validated OID.
2023-05-10 12:39:12 -04:00
Jacob Hoffman-Andrews ac4be89b56
grpc: add NoWaitForReady config field (#6850)
Currently we set WaitForReady(true), which causes gRPC requests to not
fail immediately if no backends are available, but instead wait until
the timeout in case a backend does become available. The downside is
that this behavior masks true connection errors. We'd like to turn it
off.

Fixes #6834
2023-05-09 16:16:44 -07:00
Matthew McPherrin 8427245675
OTel Integration test using jaeger (#6842)
This adds Jaeger's all-in-one dev container (with no persistent storage)
to boulder's dev docker-compose. It configures config-next/ to send all
traces there.

A new integration test creates an account and issues a cert, then
verifies the trace contains some set of expected spans.

This test found that async finalize broke spans, so I fixed that and a
few related spots where we make a new context.
2023-05-05 10:41:29 -04:00
Jacob Hoffman-Andrews 1c7e0fd1d8
Store linting certificate instead of precertificate (#6807)
In order to get rid of the orphan queue, we want to make sure that
before we sign a precertificate, we have enough data in the database
that we can fulfill our revocation-checking obligations even if storing
that precertificate in the database fails. That means:

- We should have a row in the certificateStatus table for the serial.
- But we should not serve "good" for that serial until we are positive
the precertificate was issued (BRs 4.9.10).
- We should have a record in the live DB of the proposed certificate's
public key, so the bad-key-revoker can mark it revoked.
- We should have a record in the live DB of the proposed certificate's
names, so it can be revoked if we are required to revoke based on names.

The SA.AddPrecertificate method already achieves these goals for
precertificates by writing to the various metadata tables. This PR
repurposes the SA.AddPrecertificate method to write "proposed
precertificates" instead.

We already create a linting certificate before the precertificate, and
that linting certificate is identical to the precertificate that will be
issued except for the private key used to sign it (and the AKID). So for
instance it contains the right pubkey and SANs, and the Issuer name is
the same as the Issuer name that will be used. So we'll use the linting
certificate as the "proposed precertificate" and store it to the DB,
along with appropriate metadata.

In the new code path, rather than writing "good" for the new
certificateStatus row, we write a new, fake OCSP status string "wait".
This will cause us to return internalServerError to OCSP requests for
that serial (but we won't get such requests because the serial has not
yet been published). After we finish precertificate issuance, we update
the status to "good" with SA.SetCertificateStatusReady.

Part of #6665
2023-04-26 13:54:24 -07:00
Aaron Gable 45329c9472
Deprecate ROCSPStage7 flag (#6804)
Deprecate the ROCSPStage7 feature flag, which caused the RA and CA to
stop generating OCSP responses when issuing new certs and when revoking
certs. (That functionality is now handled just-in-time by the
ocsp-responder.) Delete the old OCSP-generating codepaths from the RA
and CA. Remove the CA's internal reference to an OCSP implementation,
because it no longer needs it.

Additionally, remove the SA's "Issuers" config field, which was never
used.

Fixes #6285
2023-04-12 17:03:06 -07:00
Aaron Gable 7e994a1216
Deprecate ROCSPStage6 feature flag (#6770)
Deprecate the ROCSPStage6 feature flag. Remove all references to the
`ocspResponse` column from the SA, both when reading from and when
writing to the `certificateStatus` table. This makes it safe to fully
remove that column from the database.

IN-8731 enabled this flag in all environments, so it is safe to
deprecate.

Part of #6285
2023-04-04 15:41:51 -07:00
Aaron Gable 8c67769be4
Remove ocsp-updater from Boulder (#6769)
Delete the ocsp-updater service, and the //ocsp/updater library that
supports it. Remove test configs for the service, and remove references
to the service from other test files.

This service has been fully shut down for an extended period now, and is
safe to remove.

Fixes #6499
2023-03-31 14:39:04 -07:00
Samantha b2224eb4bc
config: Add validation tags to all configuration structs (#6674)
- Require `letsencrypt/validator` package.
- Add a framework for registering configuration structs and any custom
validators for each Boulder component at `init()` time.
- Add a `validate` subcommand which allows you to pass a `-component`
name and `-config` file path.
- Expose validation via exported utility functions
`cmd.LookupConfigValidator()`, `cmd.ValidateJSONConfig()` and
`cmd.ValidateYAMLConfig()`.
- Add unit test which validates all registered component configuration
structs against test configuration files.

Part of #6052
2023-03-21 14:08:03 -04:00
Aaron Gable 6d6f3632da
Change SetCommonName to RequireCommonName (#6749)
Change the SetCommonName flag, introduced in #6706, to
RequireCommonName. Rather than having the flag control both whether or
not a name is hoisted from the SANs into the CN *and* whether or not the
CA is willing to issue certs with no CN, this updated flag now only
controls the latter. By default, the new flag is true, and continues our
current behavior of failing issuance if we cannot set a CN in the cert.
When the flag is set to false, then we are willing to issue certificates
for which the CSR contains no CN and there is no SAN short enough to be
hoisted into the CN field.

When we have rolled out this change, we can move on to the next flag in
this series: HoistCommonName, which will control whether or not a SAN is
hoisted at all, effectively giving the CSRs (and therefore the clients)
full control over whether their certificate contains a SAN.

This change is safe because no environment explicitly sets the
SetCommonName flag to false yet.

Fixes #5112
2023-03-21 11:07:06 -07:00
Matthew McPherrin 05c9106eba
lints: Consistently format JSON configuration files (#6755)
- Consistently format existing test JSON config files
- Add a small Python script which loads and dumps JSON files
- Add CI JSON lint test to CI

---------

Co-authored-by: Aaron Gable <aaron@aarongable.com>
2023-03-20 18:11:19 -04:00
Matthew McPherrin e1ed1a2ac2
Remove beeline tracing (#6733)
Remove tracing using Beeline from Boulder. The only remnant left behind
is the deprecated configuration, to ensure deployability.

We had previously planned to swap in OpenTelemetry in a single PR, but
that adds significant churn in a single change, so we're doing this as
multiple steps that will each be significantly easier to reason about
and review.

Part of #6361
2023-03-14 15:14:27 -07:00
Aaron Gable 9af4871e59
Add SetCommonName feature flag (#6706)
Add a new feature flag, `SetCommonName`, which defaults to `true`. In
this default state, no behavior changes.

When set to `false` on the CA, this flag will cause the CA to leave the
Subject commonName field of the certificate blank, as is recommended by
the Baseline Requirements Section 7.1.4.2.2(a).

Also slightly modify the behavior of the RA's `matchesCSR()` function,
to allow for both certificates that have a CN and certificates that
don't. It is not feasible to put this behavior behind the same
SetCommonName flag, because that would require an atomic deploy of both
the RA and the CA.

Obsoletes #5112
2023-03-09 13:31:55 -05:00
Aaron Gable 29bf521121
CA: Remove secondary gRPC servers (#6496)
Remove the OCSPGenerator and CRLGenerator gRPC servers that run on
separate ports from the CA's main gRPC server, which exposes both those
and the CertificateAuthority service as well. These additional servers
are no longer necessary, now that all three services are exposed on the
single address/port.

Fixes #6448
2023-03-01 11:45:28 -08:00
Samantha 98ef3bb2b4
VA/config: Remove unused va.CAA service in config (#6697)
GRPC config from `va.VA` is used for both `va.VA` and `va.CAA`.
2023-02-27 13:44:47 -05:00
Samantha a0fe7dc93e
SA: Remove Redis config (#6695)
This field doesn't appear to be in use.

Part of #6052
2023-02-27 09:29:38 -08:00
Aaron Gable cdf1a6f9f9
Add flag to make order finalization async (#6589)
Add the "AsyncFinalize" feature flag. When enabled, this causes the RA
to return almost immediately from FinalizeOrder requests, with the
actual hard work of issuing the precertificate, getting SCTs, issuing
the final certificate, and updating the database accordingly all
occuring in a background goroutine while the client polls the GetOrder
endpoint waiting for the result.

This is implemented by factoring out the majority of the finalization
work into a new `issueCertificateOuter` helper function, and simply
using the new flag to determine whether we call that helper in a
goroutine or not. This makes removing the feature flag in the future
trivially easy.

Also add a new prometheus metric named `inflight_finalizes` which can be
used to count the number of simultaneous goroutines which are performing
finalization work. This metric is exported regardless of the state of
the AsyncFinalize flag, so that we can observe any changes to this
metric when the flag is flipped.

Fixes #6575
2023-02-24 09:57:54 -08:00
Jacob Hoffman-Andrews 79250756bf
expiration-mailer: limit number of mails sent to same address per day (#6675)
This adds a config field, "mailsPerAddressPerDay." Addresses that get
that many mails won't receive any more until the next day (UTC).

Fixes #6508.
2023-02-22 15:24:31 -08:00
Phil Porada 6c84a69043
Remove MandatoryPOSTasGET flag (#6672)
Remove the `MandatoryPOSTasGET` flag from the WFE2.
Update the ACMEv2 divergence doc to note that neither staging nor
production use MandatoryPOSTasGET.

Fixes #6582.
2023-02-17 13:04:31 -05:00
Jacob Hoffman-Andrews cd1bbc0d82
Tidy up integration test environment (#6668)
Remove `example.com` domain name, which was used by the deleted OldTLS
tests.

Remove GODEBUG=x509sha1=1.

Add a longer comment for the Consul DNS fallback in docker-compose.yml.

Use the "dnsAuthority" field for all gRPC clients in config-next,
instead of implicitly relying on the system DNS. This matches what we do
in prod.

Make "dnsAuthority" field of GRPCClientConfig mandatory whenever
SRVLookup or SRVLookups is used.

Make test/config/ocsp-responder.json use ServerAddress instead of
SRVLookup, like the rest of test/config.
2023-02-16 09:33:24 -08:00
Aaron Gable f9e4fb6c06
Add replication lag retries to some SA methods (#6649)
Add a new time.Duration field, LagFactor, to both the SA's config struct
and the read-only SA's implementation struct. In the GetRegistration,
GetOrder, and GetAuthorization2 methods, if the database select returned
a NoRows error and a lagFactor duration is configured, then sleep for
lagFactor seconds and retry the select.

This allows us to compensate for the replication lag between our primary
write database and our read-only replica databases. Sometimes clients
will fire requests in rapid succession (such as creating a new order,
then immediately querying the authorizations associated with that
order), and the subsequent requests will fail because they are directed
to read replicas which are lagging behind the primary. Adding this
simple sleep-and-retry will let us mitigate many of these failures,
without adding too much complexity.

Fixes #6593
2023-02-14 17:25:13 -08:00
Samantha 5c49231ea6
ROCSP: Remove support for Redis Cluster (#6645)
Fixes #6517
2023-02-09 17:14:37 -05:00
Phil Porada 134321040b
Default ReuseValidAuthz to true (#6644)
`ReuseValidAuthz` was introduced
here [1] and enabled in staging and production configs on 2016-07-13. 
There was a brief stint during the TLS-SNI-01 challenge type removal where 
SRE disabled it. However, time has finally come to remove this configuration
option. Issue #6623 will determine the feasibility of shorter authz
lifetimes and potentially the removal of authz reuse.

This change is broken up into two parts to allow SRE to safely remove
the flag from staging and production configs. We'll merge this PR, SRE
will deploy boulder and the config change, then we'll finish removing
`ReuseValidAuthz` configuration from the codebase.

[1] boulder commit 9abc212448

Part 1 of 2 for fixing #2734.
2023-02-09 14:26:06 -05:00
Samantha d73125d8f6
WFE: Add custom balancer implementation which routes nonce redemption RPCs by prefix (#6618)
Assign nonce prefixes for each nonce-service by taking the first eight
characters of the the base64url encoded HMAC-SHA256 hash of the RPC
listening address using a provided key. The provided key must be same
across all boulder-wfe and nonce-service instances.
- Add a custom `grpc-go` load balancer implementation (`nonce`) which
can route nonce redemption RPC messages by matching the prefix to the
derived prefix of the nonce-service instance which created it.
- Modify the RPC client constructor to allow the operator to override
the default load balancer implementation (`round_robin`).
- Modify the `srv` RPC resolver to accept a comma separated list of
targets to be resolved.
- Remove unused nonce-service `-prefix` flag.

Fixes #6404
2023-02-03 17:52:18 -05:00
Jacob Hoffman-Andrews e57c788086
Add checking of validations to cert-checker (#6617)
This includes two feature flags: one that controls turning on the extra
database queries, and one that causes cert-checker to fail on missing
validations. If the second flag isn't turned on, it will just emit error
log lines. This will help us find any edge conditions we need to deal
with before making the new code trigger alerts.

Fixes #6562
2023-02-03 16:25:41 -05:00
Jacob Hoffman-Andrews 9d3f7d8f84
Add timeout config to WFE (#6621) 2023-01-30 10:07:41 -08:00
Aaron Gable a7dc34f127
ocsp-responder: make db config optional (#6601)
In #6293, we gave the ocsp-responder the ability to use a gRPC
connection to the SA to get status information for certificates, rather
than using a database connection directly. However, that change
neglected to make the database connection configuration optional: an
ocsp-responder with an SA gRPC client configured would never use its
database connection, but if it wasn't configured it would refuse to
start. Fix this oversight by making the DBConfig stanza optional.
2023-01-26 15:21:39 -08:00