While working on https://github.com/letsencrypt/boulder/pull/7238, I dug
into why the consul services config has, for example, `[ca-a, ca-b]` in
addition to `[ca1, ca2]`. Boulder test configs use `ca.service.consul`
which will return both CAs (`[ca-a, ca-b]`). For `[ca1, ca2]` though, a
grpc load balancing [integration
test](a55bf19ea0/test/integration-test.py (L121-L143))
targets each service individually to verify that every backend is
working correctly.
This PR addresses a discrepancy between the code comments and the actual
behavior in the challenge construction functions within
`core/challenges.go`. The existing comments suggest that these functions
generate a random token if the supplied token is empty. However, upon
reviewing the relevant code, it's evident that these functions do not
generate a random token; they simply use the token that is passed to
them.
The [only calling
code](a3afce5f75/policy/pa.go (L561-L571))
in `policy/pa.go` demonstrates this behavior:
```go
token := core.NewToken()
for i, t := range challTypes {
c, err := core.NewChallenge(t, token)
// ... additional code ...
}
```
This change corrects the comments to reflect actual behavior.
Protobuf v1.32 fixes a potential stack overflow crash. Boulder doesn't
expose grpc externally so the risk is minimal, but it seems prudent to
upgrade on a regular cadence. In other words, this is not a security fix
for Boulder.
- Update parsing of overrides with Ids formatted as 'fqdnSet' to produce
a hexadecimal string.
- Update validation for Ids formatted as 'fqdnSet' when constructing a
bucketKey for a transaction to validate before identifier construction.
- Skip CertificatesPerDomain transactions when the limit is disabled.
Part of #5545
In MariaDB, `long_query_time`[1] and `max_statement_time`[2] have up to
microsecond granularity (6 digits to the right of the decimal).
Fixes an issue detected by proxysql in staging.
```
MySQL_Session.cpp:6567:handler___status_WAITING_CLIENT_DATA___STATE_SLEEP___MYSQL_COM_QUERY_qpo(): [ERROR] Unable to parse query. If correct, report it as a bug: SET long_query_time=3.9200000000000004
```
1. https://mariadb.com/kb/en/server-system-variables/#long_query_time
2. https://mariadb.com/kb/en/server-system-variables/#max_statement_time
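As a rough illustration of the fix described above, the value can be formatted with a fixed six digits after the decimal point before it is placed in the `SET` statement. This is a minimal sketch, not Boulder's actual code; `formatSeconds` is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"strconv"
)

// formatSeconds renders a number of seconds with exactly six digits after
// the decimal point, i.e. microsecond granularity.
// The function name is hypothetical and used only for this example.
func formatSeconds(seconds float64) string {
	return strconv.FormatFloat(seconds, 'f', 6, 64)
}

func main() {
	// The value taken from the proxysql error message above.
	v := 3.9200000000000004
	fmt.Printf("SET long_query_time=%s\n", formatSeconds(v))
	// prints "SET long_query_time=3.920000"
}
```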
---------
Co-authored-by: Aaron Gable <aaron@letsencrypt.org>
Zlint is deprecating lint.Lint in favour of lint.CertificateLint.
The main difference is that metadata is now its own struct, shared with
lint.RevocationListLint and presumably future lint types.
Revamp WillingToIssueWildcards to WillingToIssue. Remove the need for
identifier.ACMEIdentifiers in the WillingToIssue(Wildcards) method.
Previously, before invoking this method, a slice of identifiers was
created by looping over each dnsName. However, these identifiers were
solely used in error messages.
Segment the validation process into distinct parts for domain
validation, wildcard validation, and exact blocklist checks. This
approach eliminates the necessity of substituting *. with x. in wildcard
domains.
Introduce a new helper, ValidDomain. It checks that a domain is valid
and that it doesn't contain any invalid wildcard characters.
Functionality from the previous ValidDomain is preserved in
ValidNonWildcardDomain.
Fixes #3323
Use policy.ValidEmail to vet email addresses before sending expiration
notifications to them. This same check is performed by notify-mailer,
and it helps reduce the number of invalid addresses we attempt to send
to and the number of email bounces we generate.
Additionally, mark certificates as having had a nag email sent if there
are no valid addresses for us to send to, so that we don't constantly
retry them.
Fixes https://github.com/letsencrypt/boulder/issues/5372
Besides inheriting the ForceAttemptHTTP2 setting, this inherits
reasonable defaults for MaxIdleConns, IdleConnTimeout, DialTimeout, and
so on.
Follow-up for https://github.com/letsencrypt/boulder/pull/7215
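One way to inherit those defaults is to clone `http.DefaultTransport` and then install a custom TLS config on the copy. The sketch below assumes that approach; `newTransport` is a hypothetical helper name, and the TLS config is cloned so later mutation by the caller does not affect the transport.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"fmt"
	"net/http"
)

// newTransport copies http.DefaultTransport so that its defaults
// (ForceAttemptHTTP2, MaxIdleConns, IdleConnTimeout, the dialer, ...)
// carry over, then attaches a private copy of the given TLS config.
// The helper name is hypothetical.
func newTransport(tlsConf *tls.Config) *http.Transport {
	tr := http.DefaultTransport.(*http.Transport).Clone()
	tr.TLSClientConfig = tlsConf.Clone()
	return tr
}

func main() {
	// An empty pool stands in for the custom roots a real config would trust.
	conf := &tls.Config{RootCAs: x509.NewCertPool()}
	tr := newTransport(conf)
	fmt.Println(tr.ForceAttemptHTTP2, tr.MaxIdleConns, tr.IdleConnTimeout)
}
```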
Add support for draft-ietf-acme-ari-02 format alongside the existing
draft-ietf-acme-ari-01 implementation. Both formats are interchangeable.
Fixes #7037
If a client attempts to validate a challenge twice in rapid succession,
we'll kick off two background validation routines. One of these will
complete first, updating the database with success or failure. The other
will fail when it attempts to update the database and finds that there
are no longer any authorizations with that ID in the "pending" state.
Reduce the level at which we log such events, since we don't
particularly care about them.
Fixes https://github.com/letsencrypt/boulder/issues/3995
Change the max value of the CA's `SerialPrefix` config value from 255 (a
byte of all 1s) to 127 (a byte of one 0 followed by seven 1s). This
prevents the serial prefix from ever beginning with a 1.
This is important because serials are interpreted as signed
(twos-complement) integers, and are required to be positive -- a serial
whose first bit is 1 is considered to be negative and therefore in
violation of RFC 5280. The go stdlib fixes this for us by prepending a
zero byte to any serial that begins with a 1 bit, but we'd prefer all
our serials to be the same length.
Corresponding config change was completed in IN-9880.
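The length difference can be seen by DER-encoding two serials that differ only in their first byte. This is a small standalone sketch using `encoding/asn1`, not Boulder's issuance code; the four-byte serials are arbitrary example values.

```go
package main

import (
	"encoding/asn1"
	"fmt"
	"math/big"
)

// derIntegerContent returns the content octets of the DER INTEGER encoding
// of the given big-endian serial bytes. It assumes a short-form length
// (content under 128 bytes), which holds for the small examples here.
func derIntegerContent(serial []byte) ([]byte, error) {
	der, err := asn1.Marshal(new(big.Int).SetBytes(serial))
	if err != nil {
		return nil, err
	}
	// der[0] is the INTEGER tag and der[1] is the length octet.
	return der[2:], nil
}

func main() {
	for _, prefix := range []byte{0x7f, 0xff} {
		content, err := derIntegerContent([]byte{prefix, 0x01, 0x02, 0x03})
		if err != nil {
			panic(err)
		}
		fmt.Printf("prefix %#x -> % x (%d bytes)\n", prefix, content, len(content))
	}
}
```

With the `0x7f` prefix the encoded serial keeps its four bytes; with `0xff` a leading zero byte is added, making it five.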
Per https://pkg.go.dev/net/http#hdr-HTTP_2:
> The http package's Transport and Server both automatically enable
HTTP/2 support for simple configurations.
and https://pkg.go.dev/net/http#Transport:
> // If non-nil, HTTP/2 support may not be enabled by default.
> TLSClientConfig *tls.Config
Since we were setting a non-default TLSClientConfig to trust custom
roots, we accidentally turned off HTTP/2 support. And Unbound requires
HTTP/2 to serve DoH queries.
Also, clone the TLS config just to be safe against possible mutation in
other packages.
These feature flags are no longer referenced in any test, staging, or
production configuration. They were removed in:
- StoreRevokerInfo: IN-8546
- ROCSPStage6 and ROCSPStage7: IN-8886
- CAAValidationMethods and CAAAccountURI: IN-9301
Replace the current three-piece setup (enum of feature variables, map of
feature vars to default values, and autogenerated bidirectional maps of
feature variables to and from strings) with a much simpler one-piece
setup: a single struct with one boolean-typed field per feature. This
preserves the overall structure of the package -- a single global
feature set protected by a mutex, and Set, Reset, and Enabled methods --
although the exact function signatures have all changed somewhat.
The executable config format remains the same, so no deployment changes
are necessary. This change does deprecate the AllowUnrecognizedFeatures
feature, as we cannot tell the json config parser to ignore unknown
field names, but that flag is set to False in all of our deployment
environments already.
Fixes https://github.com/letsencrypt/boulder/issues/6802
Fixes https://github.com/letsencrypt/boulder/issues/5229
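The one-piece shape described above might look roughly like the following. This is a hedged sketch only: the field names are placeholders rather than real Boulder feature flags, and the function names and signatures (including `Get` as the accessor) are assumptions, since the message notes that the exact signatures changed.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sync"
)

// Config has one boolean field per feature. The field names are
// hypothetical placeholders.
type Config struct {
	ExampleFeatureA bool
	ExampleFeatureB bool
}

var (
	mu     sync.RWMutex
	global Config
)

// Set replaces the global feature set.
func Set(c Config) {
	mu.Lock()
	defer mu.Unlock()
	global = c
}

// Reset returns every feature to its zero value (disabled).
func Reset() {
	mu.Lock()
	defer mu.Unlock()
	global = Config{}
}

// Get returns a copy of the current feature set.
func Get() Config {
	mu.RLock()
	defer mu.RUnlock()
	return global
}

func main() {
	// A JSON object of feature names to booleans decodes straight into
	// the struct, so the config format does not need to change.
	var c Config
	if err := json.Unmarshal([]byte(`{"ExampleFeatureA": true}`), &c); err != nil {
		panic(err)
	}
	Set(c)
	fmt.Println(Get().ExampleFeatureA, Get().ExampleFeatureB) // true false
}
```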
Previously we made these a single `RUN` step in the Dockerfile to reduce
the size of the final image. Docker pulls all the dependent layers for
an image, which means that even if you delete intermediate build files
in a later `RUN` step, they still contribute to the overall download
size. You can work around that by deleting the intermediate files within
a single `RUN` step.
However, that has downsides: changing one Go dependency meant
downloading Go and all the other dependencies again. By moving these
back into `RUN` steps we get incremental builds, which are nice. And by
adding the builder pattern (`FROM ... AS godeps`), we can avoid having
intermediate files contribute to the overall image size.
This solves a few problems:
- When producing a new revision of boulder-tools, it often requires
multiple iterations to get it right. This provides a straightforward
path to build those iterations without trying to upload them to a Docker
repository each time.
- It's no longer necessary to produce dev container images in addition
to CI container images. Dev images are built on-demand and cached.
- Cross builds are no longer needed unless building the CI images on
non-amd64.
For third-party integration tests that do `docker compose up`, this may
result in longer build times if they are rebuilding from scratch each
time. That can be improved by keeping docker cache around.
Truncating to the hour does not provide any meaningful protection
against signature preimage attacks, and can cause the thisUpdate and
producedAt fields to differ by up to 59 minutes from each other.
Instead, truncate to the minute, to match how x/crypto/ocsp sets the
producedAt field.
Fixes https://github.com/letsencrypt/boulder/issues/7190
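The size of the gap is easy to see with `time.Truncate`. The sketch below uses an arbitrary example timestamp and hypothetical helper names; it only illustrates the arithmetic and is not the OCSP signing code.

```go
package main

import (
	"fmt"
	"time"
)

// exampleTime returns an arbitrary timestamp late in the hour.
func exampleTime() time.Time {
	return time.Date(2023, time.December, 1, 12, 59, 30, 0, time.UTC)
}

// hourGap reports how far t lies past its truncation to the hour.
func hourGap(t time.Time) time.Duration {
	return t.Sub(t.Truncate(time.Hour))
}

// minuteGap reports how far t lies past its truncation to the minute.
func minuteGap(t time.Time) time.Duration {
	return t.Sub(t.Truncate(time.Minute))
}

func main() {
	now := exampleTime()
	fmt.Println(hourGap(now))   // 59m30s
	fmt.Println(minuteGap(now)) // 30s
}
```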
The servers are invoked such that they have to look up their service
names in DNS in order to bind a port. This means that when consul is
down, they take a long time to start up because the query has to time out.
In the meantime there are a number of messages about timed out health
checks. This winds up obscuring the real error, so let's do a quick DNS
check at startup and give a more meaningful error.
minica by default sets restrictive permissions on the directories it
makes. This produced confusing behavior after regenerating keys: the
`bconsul` container failed to start up because it couldn't access its
TLS keys, which led to other errors during startservers.
Many services already have --addr and/or --debug-addr flags.
However, it wasn't universal, so this PR adds flags to commands where
they're not currently present.
This makes it easier to use a shared config file but listen on different
ports, for running multiple instances on a single host.
The config options are made optional as well, and removed from
config-next/.
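A flag that overrides an optional config value can be sketched as follows. The `config` struct, its `Addr` field, and `resolveAddr` are hypothetical stand-ins rather than Boulder's actual types; only the standard `flag` package is used.

```go
package main

import (
	"flag"
	"fmt"
)

// config stands in for a service's parsed config file. The type and its
// field are hypothetical.
type config struct {
	Addr string
}

// resolveAddr parses args and returns the address to listen on. A
// non-empty --addr flag takes precedence over the config file value.
func resolveAddr(c config, args []string) (string, error) {
	fs := flag.NewFlagSet("example", flag.ContinueOnError)
	addr := fs.String("addr", "", "address to listen on; overrides the config file")
	if err := fs.Parse(args); err != nil {
		return "", err
	}
	if *addr != "" {
		return *addr, nil
	}
	return c.Addr, nil
}

func main() {
	shared := config{Addr: ":9090"}
	// Two instances share one config but listen on different ports.
	first, _ := resolveAddr(shared, nil)
	second, _ := resolveAddr(shared, []string{"--addr", ":9191"})
	fmt.Println(first, second) // :9090 :9191
}
```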
- Move default and override limits, and associated methods, out of the
Limiter to new limitRegistry struct, embedded in a new public
TransactionBuilder.
- Export Transaction and add corresponding Transaction constructor
methods for each limit Name, making Limiter and TransactionBuilder the
API for interacting with the ratelimits package.
- Implement batched Spends and Refunds on the Limiter; the new methods
accept a slice of Transactions.
- Add new boolean fields check and spend to Transaction to support more
complicated cases that can arise in batches:
1. the InvalidAuthorizations limit is checked at New Order time in a
batch with many other limits, but should only be spent when an
Authorization is first considered invalid.
2. the CertificatesPerDomain limit is overridden by
CertificatesPerDomainPerAccount. When this is the case, spends of the
CertificatesPerDomain limit should be "best-effort" but NOT deny the
request if capacity is lacking.
- Modify the existing Spend/Refund methods to support
Transaction.check/spend and 0 cost Transactions.
- Make bucketId private and add a constructor for each bucket key format
supported by ratelimits.
- Move domainsForRateLimiting() from the ra.go to ratelimits. This
avoids a circular import issue in ra.go.
Part of #5545
- Adds a feature flag to gate rollout for SHA256 Subject Key Identifiers
for end-entity certificates.
- The ceremony tool will now use the RFC 7093 section 2 option 1 method
for generating Subject Key Identifiers for future root CA, intermediate
CA, and cross-sign ceremonies.
- - - -
[RFC 7093 section 2 option
1](https://datatracker.ietf.org/doc/html/rfc7093#section-2) provides a
method for generating a truncated SHA256 hash for the Subject Key
Identifier field in accordance with Baseline Requirement [section
7.1.2.11.4 Subject Key
Identifier](90a98dc7c1/docs/BR.md (712114-subject-key-identifier)).
> [RFC5280] specifies two examples for generating key identifiers from
> public keys. Four additional mechanisms are as follows:
>
> 1) The keyIdentifier is composed of the leftmost 160-bits of the
> SHA-256 hash of the value of the BIT STRING subjectPublicKey
> (excluding the tag, length, and number of unused bits).
The related [RFC 5280 section
4.2.1.2](https://datatracker.ietf.org/doc/html/rfc5280#section-4.2.1.2)
states:
> For CA certificates, subject key identifiers SHOULD be derived from
> the public key or a method that generates unique values. Two common
> methods for generating key identifiers from the public key are:
> ...
> Other methods of generating unique numbers are also acceptable.
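
The RFC 7093 method quoted above can be sketched in Go as follows. This is an illustrative, standalone example rather than the ceremony tool's implementation; `skidRFC7093` and `exampleSPKI` are hypothetical names, and the example key is a freshly generated P-256 key.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/sha256"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/asn1"
	"errors"
	"fmt"
)

// subjectPublicKeyInfo mirrors the ASN.1 structure of the same name so
// that the contents of the subjectPublicKey BIT STRING can be extracted.
type subjectPublicKeyInfo struct {
	Algorithm pkix.AlgorithmIdentifier
	PublicKey asn1.BitString
}

// skidRFC7093 returns the leftmost 160 bits (20 bytes) of the SHA-256
// hash of the subjectPublicKey BIT STRING value, excluding the tag,
// length, and unused-bits octet.
func skidRFC7093(spkiDER []byte) ([]byte, error) {
	var spki subjectPublicKeyInfo
	rest, err := asn1.Unmarshal(spkiDER, &spki)
	if err != nil {
		return nil, err
	}
	if len(rest) != 0 {
		return nil, errors.New("trailing data after SubjectPublicKeyInfo")
	}
	sum := sha256.Sum256(spki.PublicKey.Bytes)
	return sum[:20], nil
}

// exampleSPKI generates a throwaway key and returns its DER-encoded
// SubjectPublicKeyInfo.
func exampleSPKI() ([]byte, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	return x509.MarshalPKIXPublicKey(&key.PublicKey)
}

func main() {
	der, err := exampleSPKI()
	if err != nil {
		panic(err)
	}
	skid, err := skidRFC7093(der)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%x (%d bytes)\n", skid, len(skid))
}
```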
When running in manual mode, the `configFile` variable will take the
zero value of `""` while `manualConfigFile` will be provided on the CLI
by the operator. A startup check incorrectly looks only at `configFile`,
finds that it is the zero value `""`, outputs the help text, and exits,
never allowing manual mode to perform any work.
Fixes https://github.com/letsencrypt/boulder/issues/7176
This will make it easier to add a crl.go, holding functionality similar
to cert.go, without making any single file overly complex.
This introduces no functionality changes.
Part of https://github.com/letsencrypt/boulder/issues/7159