Commit Graph

6560 Commits

Author SHA1 Message Date
Phil Porada 51e9f39259
Finish migration from int64 durations to durationpb (#7147)
This is a cleanup PR finishing the migration from int64 durations to
protobuf `*durationpb.Duration` by removing all usage of the old int64
fields. In the previous PR
https://github.com/letsencrypt/boulder/pull/7146 all fields were
switched to read from the protobuf durationpb fields.

Fixes https://github.com/letsencrypt/boulder/issues/7097
2023-11-28 12:51:11 -05:00
Phil Porada 6925fad324
Finish migration from int64 timestamps to timestamppb (#7142)
This is a cleanup PR finishing the migration from int64 timestamps to
protobuf `*timestamppb.Timestamps` by removing all usage of the old
int64 fields. In the previous PR
https://github.com/letsencrypt/boulder/pull/7121 all fields were
switched to read from the protobuf timestamppb fields.

Adds a new case to `core.IsAnyNilOrZero` to check various properties of
a `*timestamppb.Timestamp` reducing the visual complexity for receivers.

Fixes https://github.com/letsencrypt/boulder/issues/7060
2023-11-27 13:37:31 -08:00
Matthew McPherrin 32adaf1846
Make log-validator take glob patterns to monitor for log files (#7172)
To simplify deployment of the log validator, this allows wildcards
(using go's filepath.Glob) to be included in the file paths.

In order to detect new files, a new background goroutine polls the glob
patterns every minute for matches.

Because the "monitor" function is running in its own goroutine, a lock
is needed to ensure it's not trying to add new tailers while shutdown is
happening.
2023-11-27 12:48:46 -08:00
Aaron Gable a31429fb3e
RA: inform publisher of CT submission type (#7164)
Have the RA populate the new pubpb.Request.Kind field, along side the
deprecated pubpb.Request.Precert field, so that the publisher has more
detailed information about what kind of CT submission this is.

Fixes https://github.com/letsencrypt/boulder/issues/7161
2023-11-27 10:53:18 -08:00
Aaron Gable e1a8a2ebcd
Publisher: expose submission type in metric labels (#7163)
Give the publisher a more nuanced view of the three kinds of CT
submissions we do: "sct" (submitting a precert to get SCTs), "info"
(submitting a precert but not caring about the result), and "final"
(submitting a final cert and not caring about the result). Expose these
three kinds in the ct_errors_count and ct_submission_time_seconds
metrics, so that they can be separately grouped and alerted on.

This is an improvement over the current status-quo, which only
distinguishes between "precert" and "final" submissions, without being
able to distinguish between SCT-retrieving and purely-informational
submissions of precerts.

This functionality will not be fully operational until the RA begins
informing the publisher of what kind of submission this is.

Part of https://github.com/letsencrypt/boulder/issues/7161
2023-11-27 10:14:10 -08:00
Matthew McPherrin 70a6b1093a
Refactor log-validator to break up the big main function. (#7170)
The main function in log-validator is overly big, so this refactors it
in preparation for adding support for file globs, which is started in
the (currently draft) #7134

This PR should be functionally a no-op, except for one change:
The 1-per-second ratelimiter is moved to be per-file, so it's
1-per-second-per-file. I think this more closely aligns with what we'd
want, as we could potentially miss that a file had bad lines in it if it
was overwhelmed by another log file at the same time.
2023-11-27 13:07:39 -05:00
Matthew McPherrin 54c25f9152
Regenerate redis-tls certs and include script (#7171)
This copies the prelude from grpc-creds/generate.sh into
redis-tls/generate.sh, and regenerates all the certs there, which are
expiring.
2023-11-22 16:45:17 -05:00
Samantha e1312ff319
RA: Fix legacy override utilization metrics (#7124)
- Emit override utilization only when resource counts are under
threshold.
- Override utilization accounts for anticipated issuance.
- Correct the limit metric label for `CertificatesPerName` and
`CertificatesPerFQDNSet/Fast`.

Part of #5545
2023-11-20 14:12:21 -05:00
Phil Porada caf73f2762
Explicitly check each interface type in core.IsAnyNilOrZero (#7168)
I introduced a bug in https://github.com/letsencrypt/boulder/pull/7162
which lead to incorrect type comparisons.

`IsAnyNilOrZero` takes a variadic amount of interfaces with each
interface containing a value with underlying type `string`, `int32`,
`float64`, etc and checks for that type's zero value in a switch
statement. When checking for multiple numeric types on the same case
statement line, the compiler will default to `int` rather than implying
the type of the untyped constant `0` leading to an incorrect result at
runtime. Contrast that to when each numeric type is on its own case
statement line where a correct comparison can be made.

Per [go.dev/blog/constants](https://go.dev/blog/constants):
> The default type of an untyped constant is determined by its syntax.
For string constants, the only possible implicit type is string. For
[numeric constants](https://go.dev/ref/spec#Numeric_types), the implicit
type has more variety. Integer constants default to int, floating-point
constants float64, rune constants to rune (an alias for int32), and
imaginary constants to complex128.

Per
[go.dev/doc/effective_go](https://go.dev/doc/effective_go#interfaces_and_types):
> Constants in Go are just that—constant. They are created at compile
time, even when defined as locals in functions, and can only be numbers,
characters (runes), strings or booleans. Because of the compile-time
restriction, the expressions that define them must be constant
expressions, evaluatable by the compiler.

See the following code for an explicit example.
```
package main
import "fmt"
func main() {
	for _, val := range []interface{}{1, int8(1), int16(1), int32(1), int64(1)} {
		switch v := val.(type) {
		case int, int8, int16, int32, int64:
			fmt.Printf("Type %T:\t%v\n", v, v == 1)
		}
	}
        fmt.Println("-----------------------------")
	for _, val := range []interface{}{1, int8(1), int16(1), int32(1), int64(1)} {
		switch v := val.(type) {
		case int:
			fmt.Printf("Type %T:\t%v\n", v, v == 1)
		case int8:
			fmt.Printf("Type %T:\t%v\n", v, v == 1)
		case int16:
			fmt.Printf("Type %T:\t%v\n", v, v == 1)
		case int32:
			fmt.Printf("Type %T:\t%v\n", v, v == 1)
		case int64:
			fmt.Printf("Type %T:\t%v\n", v, v == 1)
		}
	}
}

-------
Type int:	true
Type int8:	false
Type int16:	false
Type int32:	false
Type int64:	false
-----------------------------
Type int:	true
Type int8:	true
Type int16:	true
Type int32:	true
Type int64:	true
```
2023-11-20 10:31:40 -08:00
Phil Porada 01f2336317
Check more types in core.IsAnyNilOrZero (#7162)
The `core.IsAnyNilOrZero` function is used in the critical path for
issuance, ocsp generation, specific ratelimit lookups, and before
performing each domain validation. The original function heavily relies
upon the default switch case which uses the full reflect package, rather
than reflectlite which can't be used because it does not implement
`IsZero()`. We can reduce CPU utilization and increase the speed of each
call by performing some basic type checking.

Benchmarking the main branch in `before.txt` and this PR in `after.txt`
shows a noticeable reduction in CPU time between the two. I ran each
test with `go test -bench=. -count=10 | tee ${FILENAME}` and compared
them with
[benchstat](https://pkg.go.dev/golang.org/x/perf/cmd/benchstat). The
only negatively affected case happens to be when a bool is `true` which
may just be due to a sample size of 10 test iterations.
```
$ benchstat before.txt after.txt 
goos: linux
goarch: amd64
pkg: github.com/letsencrypt/boulder/core
cpu: 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz
                                                     │  before.txt   │              after.txt               │
                                                     │    sec/op     │    sec/op     vs base                │
IsAnyNilOrZero/input_0-8                                6.349n ± 20%   4.660n ±  1%  -26.60% (p=0.000 n=10)
IsAnyNilOrZero/input_1-8                                7.385n ± 15%   6.316n ±  4%  -14.48% (p=0.000 n=10)
IsAnyNilOrZero/input_0#01-8                             6.803n ± 17%   4.284n ±  4%  -37.02% (p=0.000 n=10)
IsAnyNilOrZero/input_1#01-8                             6.806n ±  7%   4.430n ±  5%  -34.91% (p=0.000 n=10)
IsAnyNilOrZero/input_0#02-8                             6.321n ±  8%   4.232n ±  2%  -33.06% (p=0.000 n=10)
IsAnyNilOrZero/input_0.1-8                              7.074n ± 10%   4.176n ±  5%  -40.97% (p=0.000 n=10)
IsAnyNilOrZero/input_-8                                 7.494n ±  9%   3.242n ±  2%  -56.73% (p=0.000 n=10)
IsAnyNilOrZero/input_ahoyhoy-8                          9.839n ± 25%   3.990n ±  4%  -59.45% (p=0.000 n=10)
IsAnyNilOrZero/input_[]-8                               3.898n ± 13%   3.405n ±  2%  -12.63% (p=0.019 n=10)
IsAnyNilOrZero/input_[0]-8                              4.533n ±  6%   4.088n ±  2%   -9.82% (p=0.001 n=10)
IsAnyNilOrZero/input_[1]-8                              4.626n ±  6%   4.075n ±  5%  -11.92% (p=0.000 n=10)
IsAnyNilOrZero/input_<nil>-8                            2.854n ±  3%   2.039n ±  2%  -28.55% (p=0.000 n=10)
IsAnyNilOrZero/input_false-8                            7.269n ±  2%   6.164n ±  3%  -15.21% (p=0.000 n=10)
IsAnyNilOrZero/input_true-8                             8.047n ± 10%   9.090n ± 11%  +12.96% (p=0.003 n=10)
IsAnyNilOrZero/input_<nil>#01-8                         9.114n ±  3%   7.582n ±  2%  -16.81% (p=0.000 n=10)
IsAnyNilOrZero/input_0001-01-01_00:00:00_+0000_UTC-8   10.630n ±  3%   3.871n ±  4%  -63.58% (p=0.000 n=10)
IsAnyNilOrZero/input_2015-06-04_11:04:38_+0000_UTC-8   10.900n ±  4%   4.720n ±  2%  -56.70% (p=0.000 n=10)
IsAnyNilOrZero/input_1s-8                               9.509n ± 13%   9.024n ±  4%        ~ (p=0.393 n=10)
IsAnyNilOrZero/input_0s-8                               8.972n ±  4%   7.608n ±  2%  -15.20% (p=0.000 n=10)
geomean                                                 6.905n         4.772n        -30.89%
```

Part of https://github.com/letsencrypt/boulder/issues/7153
2023-11-15 14:50:59 -08:00
Phil Porada 104c39335b
VA: Check that maxRemoteValidationFailures is non-negative (#7150)
Prevents a panic when the VA config field `maxRemoteValidationFailures`
is set to a negative number by adding validation tags
 
Fixes https://github.com/letsencrypt/boulder/issues/7149
2023-11-15 11:03:20 -05:00
Matthew McPherrin f57bd30931
Set "CompleteLines" true in TailFile (#7158)
We've observed with the upgraded tailer, it can observe partial lines.
But we only want complete lines to validate them.
2023-11-14 15:44:31 -08:00
dependabot[bot] de58ad1318
build(deps): bump github.com/aws/aws-sdk-go-v2/service/s3 from 1.40.0 to 1.42.0 (#7145)
Bumps
[github.com/aws/aws-sdk-go-v2/service/s3](https://github.com/aws/aws-sdk-go-v2)
from 1.40.0 to 1.42.0.
2023-11-14 16:35:02 -05:00
Aaron Gable 6c9153aa76
cert-checker: allow for empty common name (#7157)
Update cert-checker's logic to only ask the Policy Authority if we are
willing to issue for the Subject Alternative Names in a cert, to avoid
asking the PA about the Common Name when that CN is just the empty
string.

Add a new check to cert-checker to ensure that the Common Name is
exactly equal to one of the SANs, to fill a theoretical gap created by
loosening the check above.

Fixes https://github.com/letsencrypt/boulder/issues/7156
2023-11-14 12:53:30 -08:00
Phil Porada 279a4d539d
Read from durationpb instead of int64 durations (#7146)
Switch to reading grpc duration values from the new durationpb protofbuf
fields, completely ignoring the old int64 fields.

Part 2 of 3 for https://github.com/letsencrypt/boulder/issues/7097
2023-11-13 12:23:46 -05:00
Aaron Gable dc2ef15ac8
CI: don't block on govulncheck, do block on vendorcheck (#7155)
Having govulncheck prevent a PR from merging means that circumstances
entirely outside our control can grind Boulder development to a halt
until they are addressed. When the vulnerability is within Go itself, it
prevents PRs from being merged until we do a production deploy, because
we want our CI to always match what is in production. This is too
strict.

This PR removes govulncheck from the set of jobs depended upon by our
Boulder CI Test Matrix meta-job. It also adds vendorcheck, which was
accidentally omitted in #7123.
2023-11-13 12:22:24 -05:00
Matthew McPherrin 75439eab4b
Replace hpcloud/tail with nxadm/tail (#7152)
The hpcloud version appears abandoned, with numerous unfixed bugs
including ones that can cause it to miss data.  The nxadm fork is
maintained.

The updated tail also pulls in an updated fsnotify. We had it vendored
at two paths before, so this has a side benefit of simplifying us to
having just one copy.
2023-11-09 16:30:15 -08:00
Samantha 1bb8ef6e47
Upgrade from go1.21.3 to go1.21.4 (#7154) 2023-11-09 16:17:35 -05:00
Aaron Gable 19582cee4b
Remove go1.21.1 from CI (#7144)
We are running go1.21.3 in all environments.
2023-11-08 16:31:28 -08:00
Samantha ca6314fa48
ratelimits: API improvements necessary for batches and limit fixes (#7117)
The `Limiter` API has been adjusted significantly to both improve both
safety and ergonomics and two `Limit` types have been corrected to match
the legacy implementations.

**Safety**
Previously, the key used for looking up limit overrides and for fetching
individual buckets from the key-value store was constructed within the
WFE. This posed a risk: if the key was malformed, the default limit
would still be enforced, but individual overrides would fail to function
properly. This has been addressed by the introduction of a new
`BucketId` type along with a `BucketId` constructor for each `Limit`
type. Each constructor is responsible for producing a well-formed bucket
key which undergoes the very same validation as any potentially matching
override key.

**Ergonomics**
Previously, each of the `Limiter` methods took a `Limit` name, a bucket
identifier, and a cost to be spent/ refunded. To simplify this, each
method now accepts a new `Transaction` type which provides a cost, and
wraps a `BucketId` identifying the specific bucket.

The two changes above, when taken together, make the implementation of
batched rate limit transactions considerably easier, as a batch method
can accept a slice of `Transaction`.

**Limit Corrections**
PR #6947 added all of the existing rate limits which could be made
compatible with the key-value approach. Two of these were improperly
implemented;
- `CertificatesPerDomain` and `CertificatesPerFQDNSet`, were implemented
as
- `CertificatesPerDomainPerAccount` and
`CertificatesPerFQDNSetPerAccount`.

Since we do not actually associate these limits with a particular ACME
account, the `regID` portion of each of their bucket keys has been
removed.
2023-11-08 13:29:01 -05:00
Phil Porada d9b97c7863
Use a separate GOCACHE for config and config-next (#7136)
`t.sh` and `tn.sh` now each use a distinct GOCACHE. This causes
back-to-back invocations of the unittests in config and config-next
to not incorrectly skip tests whose behavior changes between
those two environments.
2023-11-06 16:53:38 -08:00
Aaron Gable 16081d8e30
Invert RequireCommonName into AllowNoCommonName (#7139)
The RequireCommonName feature flag was our only "inverted" feature flag,
which defaulted to true and had to be explicitly set to false. This
inversion can lead to confusion, especially to readers who expect all Go
default values to be zero values. We plan to remove the ability for our
feature flag system to support default-true flags, which the existence
of this flag blocked. Since this flag has not been set in any real
configs, inverting it is easy.

Part of https://github.com/letsencrypt/boulder/issues/6802
2023-11-06 10:58:30 -08:00
dependabot[bot] 6d8e6e74f8
build(deps): bump actions/checkout from 3 to 4 (#7125)
Bumps [actions/checkout](https://github.com/actions/checkout) from 3 to 4.
2023-11-03 09:53:57 -04:00
J.C. Jones c2443e2b58
Package ct-test-srv into the Boulder tarball (#7131)
SRE is moving ct-test-srv into Nomad, so we need this binary available
(somehow), and for now this is the means by which to do so.
2023-11-02 14:22:54 -04:00
Aaron Gable 1d31a22245
Export crl shard calculation code for future use (#7127)
Make crl.GetChunkAtTIme an exported function, so that it can be used by
the RA in a future PR without making that PR bigger and harder to
review. Also move it from being a method to an independent function
which takes two new arguments to compensate for the loss of its
receiver.

Also move some tests from batch_test to updater_test, where they should
have been in the first place.

Part of https://github.com/letsencrypt/boulder/issues/7094
2023-11-02 11:21:01 -07:00
Aaron Gable 81cb970d30
Remove crlURL from test CA issuer configs (#7132)
This value is always set to the empty string in prod, which (correctly)
results in the issued certificates not having a CRLDP at all. It turns
out our integration test environment has been including CRLDPs in all of
our test certs because we set crlURL to a non-empty value! This change
updates our test configs to match reality.

I'll remove the code which supports this config value as part of my
upcoming CA CRLDP changes.
2023-11-02 11:20:50 -07:00
Aaron Gable f24ec910ef
Further simplifications to test.ThrowAwayCert (#7129)
Remove ThrowAwayCert's nameCount argument, since it is always set to 1
by all callers. Remove ThrowAwayCertWithSerial, because it has no
callers. Change the throwaway cert's key from RSA512 to ECDSA P-224 for
a two-orders-of-magnitude speedup in key generation. Use this simplified
form in two new places in the RA that were previously rolling their own
test certs.
2023-11-02 09:45:56 -07:00
Aaron Gable 617b6edea4
Update google.golang.org/grpc to v1.59.0 (#7130)
This version contains mitigations for the HTTP/2 rapid-reset DoS vector.
See https://github.com/advisories/GHSA-m425-mq94-257g for details.

Changelog: https://github.com/grpc/grpc-go/compare/v1.54.0...v1.59.0
2023-11-02 10:20:14 -04:00
Aaron Gable 3a3e32514c
Give throwaway test certs reasonable validity intervals (#7128)
Add a new clock argument to the test-only ThrowAwayCert function, and
use that clock to generate reasonable notBefore and notAfter timestamps
in the resulting throwaway test cert. This is necessary to easily test
functions which rely on the expiration timestamp of the certificate,
such as upcoming work about computing CRL shards.

Part of https://github.com/letsencrypt/boulder/issues/7094
2023-11-01 15:24:43 -07:00
Matthew McPherrin 5b3c84d001
Remove the "netaccess" container from the docker-compose dev environment. (#7123)
Remove the "netaccess" container from the docker-compose dev
environment.

It isn't needed during a regular 'docker compose up' developer
environment, and only really serves as a way to use the same tools image
in CI. Two checks run during CI are the govulncheck and verifying go mod
tidy / go vendor. Neither of these checks require anything from the
custom image other than Golang itself, which can be provided directly
from the CI environment.

If a developer is working inside the existing containers, they can still
run `go mod tidy; go mod vendor` themselves, which is a standard Golang
workflow and thus is simpler than using the netaccess image via docker
compose.
2023-11-01 15:11:51 -07:00
Jacob Hoffman-Andrews 776287f22a
ra: send UnknownSerial error from GenerateOCSP (#7110)
The allows the ocsp-responder to log unknown serial numbers (vs just
expired ones).

Follow-up of #7108
Updates #7091
2023-11-01 15:07:00 -07:00
Phil Porada fdaab3e21a
grpc: Read timestamps from timestamppb fields instead of int64 fields (#7121)
Switch to reading grpc timestamp values from the new timestamppb
protofbuf fields, completely ignoring the old int64 fields.

Part 3 of 4 for https://github.com/letsencrypt/boulder/issues/7060
2023-10-30 15:51:33 -04:00
dependabot[bot] 50ec4786e4
build(deps): bump github.com/redis/go-redis/v9 from 9.1.0 to 9.2.1 (#7107)
Bumps https://github.com/redis/go-redis from 9.1.0 to 9.2.1.
- Release notes: https://github.com/redis/go-redis/releases/tag/v9.2.1
- Changelog: https://github.com/redis/go-redis/compare/v9.1.0...v9.2.1
2023-10-27 09:12:54 -07:00
Jacob Hoffman-Andrews c84201c09a
observer: add TCP prober (#7118)
This is potentially useful for diagnosing issues with connection
timeouts, which could have separate causes from HTTP errors. For
instance, a connection timeout is more likely to be caused by network
congestion or high CPU usage on the load balancer.
2023-10-27 09:11:18 -07:00
Matthew McPherrin bba8fd31a6
Upgrade to jaeger 1.50 (#7122)
This has the OTLP collector enabled by default, so we don't need a flag
for that anymore.
2023-10-27 11:49:25 -04:00
Phil Porada 000cd05d54
Remove config live reloader package (#7112)
The CA, RA, and tools importing the PA (policy authority) will no longer
be able to live reload specific config files. Each location is now
responsible for loading the config file.

* Removed the reloader package
* Removed unused `ecdsa_allow_list_status` metric from the CA
* Removed mutex from all ratelimit `limitsImpl` methods

Fixes https://github.com/letsencrypt/boulder/issues/7111

---------

Co-authored-by: Samantha <hello@entropy.cat>
Co-authored-by: Aaron Gable <aaron@letsencrypt.org>
2023-10-26 16:06:31 -04:00
Jacob Hoffman-Andrews 1cde2e8861
observer: add obs_tls_not_before metric (#7120)
This is the counterpart of obs_tls_not_after, and is useful for
satisfying requirements like "the target certificate must not be more
than X days old."

Fixes #7119
2023-10-26 12:59:12 -07:00
Phil Porada b8b105453a
Rename protobuf duration fields to <fieldname>NS and populate new duration fields (#7115)
* Renames all of int64 as a time.Duration fields to `<fieldname>NS` to
indicate they are Unix nanoseconds.
* Adds new `google.protobuf.Duration` fields to each .proto file where
we previously had been using an int64 field to populate a time.Duration.
* Updates relevant gRPC messages.

Part 1 of 3 for https://github.com/letsencrypt/boulder/issues/7097
2023-10-26 10:46:03 -04:00
dependabot[bot] 1aa304063d
build(deps): bump golang.org/x/net from 0.11.0 to 0.17.0 (#7113)
Bumps [golang.org/x/net](https://github.com/golang/net) from 0.11.0 to 0.17.0.
2023-10-19 11:52:58 -04:00
Phil Porada d250a3d7e9
Update to go1.21.3 (#7114)
The [go1.21.3
release](https://groups.google.com/g/golang-announce/c/iNNxDTCjZvo)
contains updates to the `net/http` package for the [HTTP/2 rapid reset
bug](https://cloud.google.com/blog/products/identity-security/how-it-works-the-novel-http2-rapid-reset-ddos-attack).
The fixes in `x/net/http2` will be handled by [another
PR](https://github.com/letsencrypt/boulder/pull/7113).

The following CVEs are fixed in this release:
- [CVE-2023-39325](https://nvd.nist.gov/vuln/detail/CVE-2023-39325)
- [CVE-2023-44487](https://nvd.nist.gov/vuln/detail/CVE-2023-44487)
2023-10-12 15:08:42 -07:00
Phil Porada a5c2772004
Add and populate new protobuf Timestamp fields (#7070)
* Adds new `google.protobuf.Timestamp` fields to each .proto file where
we had been using `int64` fields as a timestamp.
* Updates relevant gRPC messages to populate the new
`google.protobuf.Timestamp` fields in addition to the old `int64`
timestamp fields.
* Added tests for each `<x>ToPB` and `PBto<x>` functions to ensure that
new fields passed into a gRPC message arrive as intended.
* Removed an unused error return from `PBToCert` and `PBToCertStatus`
and cleaned up each call site.

Built on-top of https://github.com/letsencrypt/boulder/pull/7069
Part 2 of 4 related to
https://github.com/letsencrypt/boulder/issues/7060
2023-10-11 12:12:12 -04:00
Jacob Hoffman-Andrews 2fa09d5990
ra: do not emit UnknownSerial yet (#7109)
This is a partial revert of #7103. Because that PR introduces a new
Boulder error code that ocsp-responder needs to be specially aware of,
it has a deployability issue: ocsp-responder needs to be made aware of
the new error code before the RA can start emitting it.

The simple fix is to not emit that error code in the RA until the
ocsp-responder is ready to receive it.
2023-10-04 15:26:41 -07:00
Samantha 9aef5839b5
WFE: Add new key-value ratelimits implementation (#7089)
Integrate the key-value rate limits from #6947 into the WFE. Rate limits
are backed by the Redis source added in #7016, and use the SRV record
shard discovery added in #7042.

Part of #5545
2023-10-04 14:12:38 -04:00
Jacob Hoffman-Andrews b68b21c7a8
cert-checker: improve querying (#7088)
The certificates table has no index on expires, so including expires in
the query was causing unnecessarily slow queries.

Expires was part of the query in order to support the "UnexpiredOnly"
feature. This feature isn't really necessary; if we tell cert-checker to
check certain certificates, we want to know about problems even if the
certificates are expired. Deprecate the feature and always include all
certificates in the time range. Remove "expires" from the queries.

This allows simplifying the query structure a bit and inlining query
arguments in each site where they are used.

Update the error message returned when no rows are returned for the
MIN(id) query.

The precursor to UnexpiredOnly was introduced in
https://github.com/letsencrypt/boulder/pull/1644, but I can't find any
reference to a requirement for the flag.

Fixes #7085
2023-10-03 12:54:03 -07:00
dependabot[bot] 98a3f14ff6
build(deps): bump docker/login-action from 2.2.0 to 3.0.0 (#7105)
Bumps [docker/login-action](https://github.com/docker/login-action) from 2.2.0 to 3.0.0.

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-10-03 12:47:53 -04:00
Phil Porada 5c98bf6724
ceremony: Add support for CRL onlyContainsCACerts (#7064)
* Allows the ceremony tool to add the `onlyContainsCACerts` flag to the
`IssuingDistributionPoint` extension[1] for CRLs.
* Add a lint to detect basic usage of this new flag.
* Add a helper function which doesn't (yet) exist in golang
x/crypto/cryptobyte named `ReadOptionalASN1BooleanWithTag` which
searches for an optional DER-encoded ASN.1 element tagged with a given
tag e.g. onlyContainsUserCerts and reports values back to the caller.
* Each revoked certificate in the CRL config is checked for is `IsCA` to
maintain conformance with RFC 5280 Section 6.3.3 b.2.iii [2].
    >  (iii) If the onlyContainsCACerts boolean is asserted in the
    >        IDP CRL extension, verify that the certificate
    >        includes the basic constraints extension with the cA
    >        boolean asserted.

Fixes https://github.com/letsencrypt/boulder/issues/7047

1. https://datatracker.ietf.org/doc/html/rfc5280#section-5.2.5
2. https://datatracker.ietf.org/doc/html/rfc5280#section-6.3.3
2023-10-02 17:03:36 -07:00
Jacob Hoffman-Andrews c9e9918b80
ocsp-responder(redis): sample logs at configured rate (#7102)
In #6478, we stopped passing through Redis errors to the top-level
Responder object, preferring instead to live-sign. As part of that
change, we logged the Redis errors so they wouldn't disappear. However,
the sample rate for those errors was hard coded to 1-in-1000, instead of
following the LogSampleRate configured in the JSON.

This adds a field to redisSource for logSampleRate, and passes it
through from the JSON config in ocsp-responder/main.go.

Part of #7091
2023-10-02 17:02:24 -07:00
Aaron Gable e15564e9c6
admin-revoker: improve malformed-revoke docs in usage string (#7106) 2023-10-02 13:03:37 -07:00
Aaron Gable bab048d221
SA: Add and use revokedCertificates table (#7095)
Add a new "revokedCertificates" table to the database schema. This table
is similar to the existing "certificateStatus" table in many ways, but
the idea is that it will only have rows added to it when certificates
are revoked, not when they're issued. Thus, it will grow many orders of
magnitude slower than the certificateStatus table does. Eventually, it
will replace that table entirely.

The one column that revokedCertificates adds is the new "ShardIdx"
column, which is the CRL shard in which the revoked certificate will
appear. This way we can assign certificates to CRL shards at the time
they are revoked, and guarantee that they will never move to a different
shard even if we change the number of shards we produce. This will
eventually allow us to put CRL URLs directly into our certificates,
replacing OCSP URLs.

Add new logic to the SA's RevokeCertificate and UpdateRevokedCertificate
methods to handle this new table. If these methods receive a request
which specifies a CRL shard (our CRL shards are 1-indexed, so shard 0
does not exist), then they will ensure that the new revocation status is
written into both the certificateStatus and revokedCertificates tables.
This logic will not function until the RA is updated to take advantage
of it, so it is not a risk for it to appear in Boulder before the new
table has been created.

Also add new logic to the SA's GetRevokedCertificates method. Similar to
the above, this reads from the new table if the ShardIdx field is
supplied in the request message. This code will not operate until the
crl-updater is updated to include this field. We will not perform this
update for a minimum of 100 days after this code is deployed, to ensure
that all unexpired revoked certificates are present in the
revokedCertificates table.

Part of https://github.com/letsencrypt/boulder/issues/7094
2023-10-02 10:21:14 -07:00
Aaron Gable 6c92c3041a
CAA: Treat NXDOMAIN for a TLD as an error (#7104)
Change the CAA NXDOMAIN carve-out to only apply to registered domains
and their subdomains, not to TLDs.

Our CAA lookup function has a carveout that allows queries which receive
an NXDOMAIN response to be treated as though they received a successful
empty response. This is important due to the confluence of three
circumstances:
1) many clients use the DNS-01 method to validate issuance for names
which generally don't have publicly-visible A/AAAA records;
2) many ACME clients remove their DNS-01 TXT record promptly after
validation has completed; and
3) authorizations may be reused more than 8 hours after they were first
validated and CAA was first checked.

When these circumstances combine, the DNS rightly returns NXDOMAIN when
we re-check CAA at issuance time, because no records exist at all for
that name. We have to treat this as permission to issue, the same as any
other domain which has no CAA records.

However, this should never be the case for TLDs: those should always
have at least one record. If a TLD returns NXDOMAIN, something much
worse has happened -- such as a gTLD being unlisted by ICANN -- and we
should treat it as a failure.

This change adds a check that the name in question contains at least one
dot (".") before triggering the existing carve-out, to make it so that
the carve-out does not apply to TLDs.

Fixes https://github.com/letsencrypt/boulder/issues/7056
2023-10-02 10:04:39 -07:00