Per https://pkg.go.dev/net/http#hdr-HTTP_2:
> The http package's Transport and Server both automatically enable
> HTTP/2 support for simple configurations.
and https://pkg.go.dev/net/http#Transport:
> // If non-nil, HTTP/2 support may not be enabled by default.
> TLSClientConfig *tls.Config
Since we were setting a non-default TLSClientConfig to trust custom
roots, we accidentally turned off HTTP/2 support. And Unbound requires
HTTP/2 to serve DoH queries.
Also, clone the TLS config just to be safe against possible mutation in
other packages.
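As a sketch, the shape of the fix looks roughly like this
(`ForceAttemptHTTP2` is real net/http API; the function and variable
names here are illustrative, not the actual change):
```
package dohclient

import (
	"crypto/tls"
	"crypto/x509"
	"net/http"
)

// newClient builds an HTTP client that trusts custom roots without losing
// HTTP/2: a non-nil TLSClientConfig disables the automatic HTTP/2 upgrade,
// so we opt back in explicitly.
func newClient(base *tls.Config, roots *x509.CertPool) *http.Client {
	tlsConf := base.Clone() // clone to guard against mutation in other packages
	tlsConf.RootCAs = roots
	return &http.Client{Transport: &http.Transport{
		TLSClientConfig:   tlsConf,
		ForceAttemptHTTP2: true,
	}}
}
```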
These feature flags are no longer referenced in any test, staging, or
production configuration. They were removed in:
- StoreRevokerInfo: IN-8546
- ROCSPStage6 and ROCSPStage7: IN-8886
- CAAValidationMethods and CAAAccountURI: IN-9301
Replace the current three-piece setup (enum of feature variables, map of
feature vars to default values, and autogenerated bidirectional maps of
feature variables to and from strings) with a much simpler one-piece
setup: a single struct with one boolean-typed field per feature. This
preserves the overall structure of the package -- a single global
feature set protected by a mutex, and Set, Reset, and Enabled methods --
although the exact function signatures have all changed somewhat.
The executable config format remains the same, so no deployment changes
are necessary. This change does deprecate the AllowUnrecognizedFeatures
feature, as we cannot tell the json config parser to ignore unknown
field names, but that flag is set to False in all of our deployment
environments already.
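A minimal sketch of the one-piece setup, with hypothetical flag names
(the real field names and exact signatures differ):
```
package features

import "sync"

// Config has one boolean-typed field per feature; the JSON config decodes
// directly into this struct, so unrecognized names are rejected.
type Config struct {
	ExampleFlagA bool
	ExampleFlagB bool
}

var (
	mu     sync.RWMutex
	global Config
)

// Set replaces the global feature set.
func Set(c Config) {
	mu.Lock()
	defer mu.Unlock()
	global = c
}

// Reset restores every feature to its zero (disabled) value.
func Reset() {
	mu.Lock()
	defer mu.Unlock()
	global = Config{}
}

// Get returns a copy of the current feature set, for Enabled-style checks
// such as features.Get().ExampleFlagA.
func Get() Config {
	mu.RLock()
	defer mu.RUnlock()
	return global
}
```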
Fixes https://github.com/letsencrypt/boulder/issues/6802
Fixes https://github.com/letsencrypt/boulder/issues/5229
Previously we made these a single `RUN` step in the Dockerfile to reduce
the size of the final image. Docker pulls all the dependent layers for
an image, which means that even if you delete intermediate build files
in a later `RUN` step, they still contribute to the overall download
size. You can work around that by deleting the intermediate files within
a single `RUN` step.
However, that has downsides: changing one Go dependency meant
downloading Go and all the other dependencies again. By moving these
back into `RUN` steps we get incremental builds, which are nice. And by
adding the builder pattern (`FROM ... AS godeps`), we can avoid having
intermediate files contribute to the overall image size.
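A rough sketch of the pattern (image names, tool choices, and paths here
are illustrative, not the actual boulder-tools Dockerfile):
```
# Builder stage: each dependency gets its own RUN step, so changing one
# only invalidates that layer and the ones after it.
FROM golang:1.21 AS godeps
RUN go install golang.org/x/tools/cmd/goimports@latest
RUN go install github.com/go-delve/delve/cmd/dlv@latest

# Final stage: copy only the built artifacts. Intermediate build files in
# the godeps stage never contribute to this image's layers.
FROM ubuntu:22.04
COPY --from=godeps /go/bin /usr/local/bin
```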
This solves a few problems:
- Producing a new revision of boulder-tools often requires multiple
iterations to get it right. This provides a straightforward path to build
those iterations without having to upload them to a Docker repository
each time.
- It's no longer necessary to produce dev container images in addition
to CI container images. Dev images are built on-demand and cached.
- Cross builds are no longer needed unless building the CI images on
non-amd64.
For third-party integration tests that do `docker compose up`, this may
result in longer build times if they are rebuilding from scratch each
time. That can be improved by keeping the Docker cache around.
Truncating to the hour does not provide any meaningful protection
against signature preimage attacks, and can cause the thisUpdate and
producedAt fields to differ by up to 59 minutes from each other.
Instead, truncate to the minute, to match how x/crypto/ocsp sets the
producedAt field.
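A minimal sketch of the change, assuming the signer previously truncated
to the hour (the package and function names are illustrative):
```
package signer

import "time"

// ocspThisUpdate sketches the change: truncate to the minute so thisUpdate
// and producedAt (as set by x/crypto/ocsp) share the same granularity.
func ocspThisUpdate(now time.Time) time.Time {
	// Before: now.Truncate(time.Hour)
	return now.Truncate(time.Minute)
}
```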
Fixes https://github.com/letsencrypt/boulder/issues/7190
The servers are invoked such that they have to look up their service
names in DNS in order to bind a port. This means that when consul is
down, they take a long time to start up: they are waiting for the DNS
queries to time out. In the meantime there are a number of messages about
timed-out health checks. This winds up obscuring the real error, so let's
do a quick DNS check at startup and give a more meaningful error.
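A hypothetical sketch of such a startup check (the hostname and error
wording are illustrative):
```
package cmd

import (
	"context"
	"log"
	"net"
	"time"
)

// checkDNS fails fast with a clear error when service discovery is
// unreachable, instead of timing out later while binding ports.
func checkDNS(hostname string) {
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	if _, err := net.DefaultResolver.LookupHost(ctx, hostname); err != nil {
		log.Fatalf("looking up %q (is consul running?): %s", hostname, err)
	}
}
```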
minica by default sets restrictive permissions on the directories it
makes. This produced confusing behavior after regenerating keys: the
`bconsul` container failed to start up because it couldn't access its
TLS keys, which led to other errors during startservers.
Many services already have --addr and/or --debug-addr flags.
However, this wasn't universal, so this PR adds the flags to the commands
where they're not currently present.
This makes it easier to use a shared config file but listen on different
ports, for running multiple instances on a single host.
The config options are made optional as well, and removed from
config-next/.
- Move the default and override limits, and associated methods, out of
the Limiter to a new limitRegistry struct, embedded in a new public
TransactionBuilder.
- Export Transaction and add corresponding Transaction constructor
methods for each limit Name, making Limiter and TransactionBuilder the
API for interacting with the ratelimits package.
- Implement batched Spends and Refunds on the Limiter; the new methods
accept a slice of Transactions.
- Add new boolean fields check and spend to Transaction to support more
complicated cases that can arise in batches (see the sketch after this
list):
1. the InvalidAuthorizations limit is checked at New Order time in a
batch with many other limits, but should only be spent when an
Authorization is first considered invalid.
2. the CertificatesPerDomain limit is overridden by
CertificatesPerDomainPerAccount; when this is the case, spends of the
CertificatesPerDomain limit should be "best-effort" but should NOT deny
the request if capacity is lacking.
- Modify the existing Spend/Refund methods to support
Transaction.check/spend and zero-cost Transactions.
- Make bucketId private and add a constructor for each bucket key format
supported by ratelimits.
- Move domainsForRateLimiting() from ra.go to ratelimits, which avoids a
circular import issue in ra.go.
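A minimal sketch of the check/spend split (field and constructor names
are assumptions, not the exact boulder API):
```
package ratelimits

// BucketId is a placeholder here; it is built by per-limit constructors.
type BucketId struct {
	key string
}

// Transaction pairs a bucket with a cost, plus flags controlling how it
// participates in a batch.
type Transaction struct {
	bucket BucketId
	cost   int64
	check  bool // counts toward the batch's allow/deny decision
	spend  bool // actually deducts cost from the bucket
}

// A check-only Transaction is evaluated but never deducted, e.g. the
// InvalidAuthorizations limit at New Order time.
func newCheckOnlyTransaction(b BucketId, cost int64) Transaction {
	return Transaction{bucket: b, cost: cost, check: true}
}

// A spend-only Transaction is "best-effort": it is deducted when capacity
// allows but never denies the request, e.g. CertificatesPerDomain when
// CertificatesPerDomainPerAccount overrides it.
func newSpendOnlyTransaction(b BucketId, cost int64) Transaction {
	return Transaction{bucket: b, cost: cost, spend: true}
}
```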
Part of #5545
- Adds a feature flag to gate rollout for SHA256 Subject Key Identifiers
for end-entity certificates.
- The ceremony tool will now use the RFC 7093 section 2 option 1 method
for generating Subject Key Identifiers for future root CA, intermediate
CA, and cross-sign ceremonies.
- - - -
[RFC 7093 section 2 option
1](https://datatracker.ietf.org/doc/html/rfc7093#section-2) provides a
method for generating a truncated SHA256 hash for the Subject Key
Identifier field in accordance with Baseline Requirements section
7.1.2.11.4 (Subject Key Identifier).
> [RFC5280] specifies two examples for generating key identifiers from
> public keys. Four additional mechanisms are as follows:
>
> 1) The keyIdentifier is composed of the leftmost 160-bits of the
> SHA-256 hash of the value of the BIT STRING subjectPublicKey
> (excluding the tag, length, and number of unused bits).
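A small self-contained sketch of that method in Go (the helper names are
ours; the ceremony tool's actual implementation may differ):
```
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/sha256"
	"crypto/x509"
	"encoding/asn1"
	"fmt"
)

// spki mirrors the outer SubjectPublicKeyInfo structure so we can extract
// the raw subjectPublicKey BIT STRING.
type spki struct {
	Algorithm        asn1.RawValue
	SubjectPublicKey asn1.BitString
}

// skidRFC7093 returns the leftmost 160 bits (20 bytes) of the SHA-256
// hash of the subjectPublicKey BIT STRING contents, excluding the tag,
// length, and number of unused bits.
func skidRFC7093(pub any) ([]byte, error) {
	der, err := x509.MarshalPKIXPublicKey(pub)
	if err != nil {
		return nil, err
	}
	var s spki
	if _, err := asn1.Unmarshal(der, &s); err != nil {
		return nil, err
	}
	sum := sha256.Sum256(s.SubjectPublicKey.Bytes)
	return sum[:20], nil
}

func main() {
	key, _ := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	skid, _ := skidRFC7093(key.Public())
	fmt.Printf("%x\n", skid)
}
```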
The related [RFC 5280 section
4.2.1.2](https://datatracker.ietf.org/doc/html/rfc5280#section-4.2.1.2)
states:
> For CA certificates, subject key identifiers SHOULD be derived from
> the public key or a method that generates unique values. Two common
> methods for generating key identifiers from the public key are:
> ...
> Other methods of generating unique numbers are also acceptable.
When running in manual mode, the `configFile` variable will take the
zero value of `""` while `manualConfigFile` will be provided on the CLI
by the operator. A startup check incorrectly dereferences `configFile`,
correctly determines that it is the zero value `""`, outputs the help
text, and exits, never allowing manual mode to perform work.
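A hypothetical sketch of the corrected check (variable names follow the
description above; the real code differs):
```
package main

import (
	"flag"
	"os"
)

var (
	configFile       = flag.String("config", "", "path to config file")
	manualMode       = flag.Bool("manual", false, "run in manual mode")
	manualConfigFile = flag.String("manual-config", "", "path to manual-mode config")
)

func main() {
	flag.Parse()
	// Consult the config path appropriate to the mode, rather than
	// always dereferencing configFile.
	path := *configFile
	if *manualMode {
		path = *manualConfigFile
	}
	if path == "" {
		flag.Usage()
		os.Exit(1)
	}
	_ = path // load the config from path and proceed...
}
```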
Fixes https://github.com/letsencrypt/boulder/issues/7176
This will make it easier to add a crl.go, holding functionality similar
to cert.go, without making any single file overly complex.
This introduces no functionality changes.
Part of https://github.com/letsencrypt/boulder/issues/7159
We had this disabled in one of our probes, but not all. Add a common
dialer that disables the fallback and use it in each applicable prober.
This avoids masking failures of IPv6 connectivity.
Also, change to consistently use contexts instead of timeout parameters.
The shared dialer is in the new `obsdialer` package because putting it
in `observer` results in import cycles.
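The core of the shared dialer can be sketched in a few lines (assuming it
centers on net.Dialer's FallbackDelay knob):
```
// Package obsdialer: a minimal sketch of the shared dialer.
package obsdialer

import "net"

// Dialer disables happy-eyeballs fallback: any negative FallbackDelay
// turns off the IPv4 fallback, so IPv6 connectivity failures surface
// instead of being silently masked.
var Dialer = net.Dialer{FallbackDelay: -1}
```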
This is a cleanup PR finishing the migration from int64 timestamps to
protobuf `*timestamppb.Timestamp`s by removing all usage of the old
int64 fields. In the previous PR,
https://github.com/letsencrypt/boulder/pull/7121, all fields were
switched to read from the protobuf timestamppb fields.
Adds a new case to `core.IsAnyNilOrZero` to check various properties of
a `*timestamppb.Timestamp`, reducing the visual complexity for callers.
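The new case can be sketched like this (assumed semantics, not the exact
boulder check):
```
package core

import "google.golang.org/protobuf/types/known/timestamppb"

// tsIsNilOrZero treats nil, out-of-range, and zero-time timestamps as
// "zero", sparing every caller from repeating these checks.
func tsIsNilOrZero(ts *timestamppb.Timestamp) bool {
	return ts == nil || !ts.IsValid() || ts.AsTime().IsZero()
}
```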
Fixes https://github.com/letsencrypt/boulder/issues/7060
To simplify deployment of the log validator, this allows wildcards
(using Go's filepath.Glob) to be included in the file paths.
In order to detect new files, a new background goroutine polls the glob
patterns every minute for matches.
Because the "monitor" function is running in its own goroutine, a lock
is needed to ensure it's not trying to add new tailers while shutdown is
happening.
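A minimal sketch of that polling loop (names are assumptions; the real
log-validator code differs):
```
package logvalidator

import (
	"path/filepath"
	"sync"
	"time"
)

type validator struct {
	mu           sync.Mutex
	shuttingDown bool
	tailers      map[string]struct{} // one tailer per matched file
}

// pollPatterns checks each glob pattern once a minute and records tailers
// for newly matched files; the lock prevents racing with shutdown.
func (v *validator) pollPatterns(patterns []string) {
	ticker := time.NewTicker(time.Minute)
	defer ticker.Stop()
	for range ticker.C {
		for _, pattern := range patterns {
			matches, err := filepath.Glob(pattern)
			if err != nil {
				continue // malformed pattern; real code would log this
			}
			v.mu.Lock()
			if !v.shuttingDown {
				for _, path := range matches {
					if _, ok := v.tailers[path]; !ok {
						v.tailers[path] = struct{}{} // real code starts a tailer here
					}
				}
			}
			v.mu.Unlock()
		}
	}
}
```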
Have the RA populate the new pubpb.Request.Kind field, alongside the
deprecated pubpb.Request.Precert field, so that the publisher has more
detailed information about what kind of CT submission this is.
Fixes https://github.com/letsencrypt/boulder/issues/7161
Give the publisher a more nuanced view of the three kinds of CT
submissions we do: "sct" (submitting a precert to get SCTs), "info"
(submitting a precert but not caring about the result), and "final"
(submitting a final cert and not caring about the result). Expose these
three kinds in the ct_errors_count and ct_submission_time_seconds
metrics, so that they can be separately grouped and alerted on.
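Hypothetically, the three kinds map to metric label values like these
(the constant names are ours; the label strings come from the
description above):
```
package publisher

// Illustrative label values for ct_errors_count and
// ct_submission_time_seconds.
const (
	kindSCT   = "sct"   // precert submitted to obtain SCTs
	kindInfo  = "info"  // precert submitted; result ignored
	kindFinal = "final" // final cert submitted; result ignored
)
```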
This is an improvement over the current status quo, which only
distinguishes between "precert" and "final" submissions, without being
able to distinguish between SCT-retrieving and purely-informational
submissions of precerts.
This functionality will not be fully operational until the RA begins
informing the publisher of what kind of submission this is.
Part of https://github.com/letsencrypt/boulder/issues/7161
The main function in log-validator is overly big, so this refactors it
in preparation for adding support for file globs, which is started in
the (currently draft) #7134.
This PR should be functionally a no-op, except for one change:
The 1-per-second ratelimiter is moved to be per-file, so it's
1-per-second-per-file. I think this more closely aligns with what we'd
want, as we could potentially miss that a file had bad lines in it if it
was overwhelmed by another log file at the same time.
- Emit override utilization only when resource counts are under
threshold.
- Override utilization accounts for anticipated issuance.
- Correct the limit metric label for `CertificatesPerName` and
`CertificatesPerFQDNSet/Fast`.
Part of #5545
I introduced a bug in https://github.com/letsencrypt/boulder/pull/7162
which led to incorrect type comparisons.
`IsAnyNilOrZero` takes a variadic number of interface values, each
containing a value with an underlying type such as `string`, `int32`, or
`float64`, and checks for that type's zero value in a switch statement.
When multiple numeric types share a single case line, the compiler gives
the untyped constant `0` its default type of `int` rather than inferring
the matched type, leading to an incorrect comparison at runtime. Contrast
that with each numeric type on its own case line, where the value has a
concrete type and a correct comparison can be made.
Per [go.dev/blog/constants](https://go.dev/blog/constants):
> The default type of an untyped constant is determined by its syntax.
> For string constants, the only possible implicit type is string. For
> [numeric constants](https://go.dev/ref/spec#Numeric_types), the implicit
> type has more variety. Integer constants default to int, floating-point
> constants float64, rune constants to rune (an alias for int32), and
> imaginary constants to complex128.
Per
[go.dev/doc/effective_go](https://go.dev/doc/effective_go#interfaces_and_types):
> Constants in Go are just that—constant. They are created at compile
> time, even when defined as locals in functions, and can only be numbers,
> characters (runes), strings or booleans. Because of the compile-time
> restriction, the expressions that define them must be constant
> expressions, evaluatable by the compiler.
See the following code for an explicit example.
```
package main

import "fmt"

func main() {
	for _, val := range []interface{}{1, int8(1), int16(1), int32(1), int64(1)} {
		switch v := val.(type) {
		case int, int8, int16, int32, int64:
			fmt.Printf("Type %T:\t%v\n", v, v == 1)
		}
	}
	fmt.Println("-----------------------------")
	for _, val := range []interface{}{1, int8(1), int16(1), int32(1), int64(1)} {
		switch v := val.(type) {
		case int:
			fmt.Printf("Type %T:\t%v\n", v, v == 1)
		case int8:
			fmt.Printf("Type %T:\t%v\n", v, v == 1)
		case int16:
			fmt.Printf("Type %T:\t%v\n", v, v == 1)
		case int32:
			fmt.Printf("Type %T:\t%v\n", v, v == 1)
		case int64:
			fmt.Printf("Type %T:\t%v\n", v, v == 1)
		}
	}
}
-------
Type int: true
Type int8: false
Type int16: false
Type int32: false
Type int64: false
-----------------------------
Type int: true
Type int8: true
Type int16: true
Type int32: true
Type int64: true
```
The `core.IsAnyNilOrZero` function is used in the critical path for
issuance, OCSP generation, specific ratelimit lookups, and before
performing each domain validation. The original function relies heavily
on the default switch case, which uses the full reflect package; the
lighter-weight reflectlite package can't be used because it does not
implement `IsZero()`. We can reduce CPU utilization and increase the
speed of each call by performing some basic type checking first.
Benchmarking the main branch in `before.txt` and this PR in `after.txt`
shows a noticeable reduction in CPU time between the two. I ran each
test with `go test -bench=. -count=10 | tee ${FILENAME}` and compared
them with
[benchstat](https://pkg.go.dev/golang.org/x/perf/cmd/benchstat). The
only negatively affected case happens to be when a bool is `true`, which
may just be due to the small sample size of 10 test iterations.
```
$ benchstat before.txt after.txt
goos: linux
goarch: amd64
pkg: github.com/letsencrypt/boulder/core
cpu: 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz
│ before.txt │ after.txt │
│ sec/op │ sec/op vs base │
IsAnyNilOrZero/input_0-8 6.349n ± 20% 4.660n ± 1% -26.60% (p=0.000 n=10)
IsAnyNilOrZero/input_1-8 7.385n ± 15% 6.316n ± 4% -14.48% (p=0.000 n=10)
IsAnyNilOrZero/input_0#01-8 6.803n ± 17% 4.284n ± 4% -37.02% (p=0.000 n=10)
IsAnyNilOrZero/input_1#01-8 6.806n ± 7% 4.430n ± 5% -34.91% (p=0.000 n=10)
IsAnyNilOrZero/input_0#02-8 6.321n ± 8% 4.232n ± 2% -33.06% (p=0.000 n=10)
IsAnyNilOrZero/input_0.1-8 7.074n ± 10% 4.176n ± 5% -40.97% (p=0.000 n=10)
IsAnyNilOrZero/input_-8 7.494n ± 9% 3.242n ± 2% -56.73% (p=0.000 n=10)
IsAnyNilOrZero/input_ahoyhoy-8 9.839n ± 25% 3.990n ± 4% -59.45% (p=0.000 n=10)
IsAnyNilOrZero/input_[]-8 3.898n ± 13% 3.405n ± 2% -12.63% (p=0.019 n=10)
IsAnyNilOrZero/input_[0]-8 4.533n ± 6% 4.088n ± 2% -9.82% (p=0.001 n=10)
IsAnyNilOrZero/input_[1]-8 4.626n ± 6% 4.075n ± 5% -11.92% (p=0.000 n=10)
IsAnyNilOrZero/input_<nil>-8 2.854n ± 3% 2.039n ± 2% -28.55% (p=0.000 n=10)
IsAnyNilOrZero/input_false-8 7.269n ± 2% 6.164n ± 3% -15.21% (p=0.000 n=10)
IsAnyNilOrZero/input_true-8 8.047n ± 10% 9.090n ± 11% +12.96% (p=0.003 n=10)
IsAnyNilOrZero/input_<nil>#01-8 9.114n ± 3% 7.582n ± 2% -16.81% (p=0.000 n=10)
IsAnyNilOrZero/input_0001-01-01_00:00:00_+0000_UTC-8 10.630n ± 3% 3.871n ± 4% -63.58% (p=0.000 n=10)
IsAnyNilOrZero/input_2015-06-04_11:04:38_+0000_UTC-8 10.900n ± 4% 4.720n ± 2% -56.70% (p=0.000 n=10)
IsAnyNilOrZero/input_1s-8 9.509n ± 13% 9.024n ± 4% ~ (p=0.393 n=10)
IsAnyNilOrZero/input_0s-8 8.972n ± 4% 7.608n ± 2% -15.20% (p=0.000 n=10)
geomean 6.905n 4.772n -30.89%
```
Part of https://github.com/letsencrypt/boulder/issues/7153
Update cert-checker's logic to only ask the Policy Authority if we are
willing to issue for the Subject Alternative Names in a cert, to avoid
asking the PA about the Common Name when that CN is just the empty
string.
Add a new check to cert-checker to ensure that the Common Name is
exactly equal to one of the SANs, to fill a theoretical gap created by
loosening the check above.
Fixes https://github.com/letsencrypt/boulder/issues/7156
Having govulncheck prevent a PR from merging means that circumstances
entirely outside our control can grind Boulder development to a halt
until they are addressed. When the vulnerability is within Go itself, it
prevents PRs from being merged until we do a production deploy, because
we want our CI to always match what is in production. This is too
strict.
This PR removes govulncheck from the set of jobs depended upon by our
Boulder CI Test Matrix meta-job. It also adds vendorcheck, which was
accidentally omitted in #7123.
The hpcloud version appears abandoned, with numerous unfixed bugs
including ones that can cause it to miss data. The nxadm fork is
maintained.
The updated tail also pulls in an updated fsnotify. We previously had
fsnotify vendored at two paths, so this has the side benefit of leaving
us with just one copy.
The `Limiter` API has been adjusted significantly to improve both
safety and ergonomics, and two `Limit` types have been corrected to
match the legacy implementations.
**Safety**
Previously, the key used for looking up limit overrides and for fetching
individual buckets from the key-value store was constructed within the
WFE. This posed a risk: if the key was malformed, the default limit
would still be enforced, but individual overrides would fail to function
properly. This has been addressed by the introduction of a new
`BucketId` type along with a `BucketId` constructor for each `Limit`
type. Each constructor is responsible for producing a well-formed bucket
key which undergoes the very same validation as any potentially matching
override key.
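A minimal sketch of the idea (the limit name and key format here are
assumptions, not the actual boulder formats):
```
package ratelimits

import (
	"fmt"
	"net"
)

// Name identifies a rate limit; one illustrative value is shown.
type Name int

const NewRegistrationsPerIPAddress Name = iota

// BucketId wraps a validated bucket key; callers can only obtain one via
// a per-limit constructor, so malformed keys are caught up front instead
// of silently falling back to the default limit.
type BucketId struct {
	limit Name
	key   string
}

// NewRegistrationsPerIPAddressBucketId validates its input and produces
// a well-formed key in the same format override keys are validated
// against.
func NewRegistrationsPerIPAddressBucketId(ip string) (BucketId, error) {
	if net.ParseIP(ip) == nil {
		return BucketId{}, fmt.Errorf("invalid IP %q", ip)
	}
	return BucketId{
		limit: NewRegistrationsPerIPAddress,
		key:   fmt.Sprintf("%d:%s", NewRegistrationsPerIPAddress, ip),
	}, nil
}
```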
**Ergonomics**
Previously, each of the `Limiter` methods took a `Limit` name, a bucket
identifier, and a cost to be spent/refunded. To simplify this, each
method now accepts a new `Transaction` type which provides a cost and
wraps a `BucketId` identifying the specific bucket.
The two changes above, when taken together, make the implementation of
batched rate limit transactions considerably easier, as a batch method
can accept a slice of `Transaction`.
**Limit Corrections**
PR #6947 added all of the existing rate limits which could be made
compatible with the key-value approach. Two of these were improperly
implemented: `CertificatesPerDomain` and `CertificatesPerFQDNSet` were
implemented as `CertificatesPerDomainPerAccount` and
`CertificatesPerFQDNSetPerAccount`, respectively.
Since we do not actually associate these limits with a particular ACME
account, the `regID` portion of each of their bucket keys has been
removed.