Commit Graph

155 Commits

Author SHA1 Message Date
Aaron Gable 327f96d281
Update integration test hierarchy for the modern era (#7411)
Update the hierarchy which the integration tests auto-generate inside
the ./hierarchy folder to include three intermediates of each key type,
two to be actively loaded and one to be held in reserve. To facilitate
this:
- Update the generation script to loop, rather than hard-coding each
intermediate we want
- Improve the filenames of the generated hierarchy to be more readable
- Replace the WFE's AIA endpoint with a thin aia-test-srv so that we
don't have to have NameIDs hardcoded in our ca.json configs

Having this new hierarchy will make it easier for our integration tests
to validate that new features like "unpredictable issuance" are working
correctly.

Part of https://github.com/letsencrypt/boulder/issues/729
2024-04-08 14:06:00 -07:00
Aaron Gable 78e4e82ffa
Feature cleanup (#7320)
Remove three deprecated feature flags which have been removed from all
production configs:
- StoreLintingCertificateInsteadOfPrecertificate
- LeaseCRLShards
- AllowUnrecognizedFeatures

Deprecate three flags which are set to true in all production configs:
- CAAAfterValidation
- AllowNoCommonName
- SHA256SubjectKeyIdentifier

IN-9879 tracked the removal of these flags.
2024-02-13 17:42:27 -08:00
Aaron Gable 10e894a172
Create new admin tool (#7276)
Create a new administration tool "bin/admin" as a successor to and
replacement of "admin-revoker".

This new tool supports all the same fundamental capabilities as the old
admin-revoker, including:
- Revoking by serial, by batch of serials, by incident table, and by
private key
- Blocking a key to let bad-key-revoker take care of revocation
- Clearing email addresses from all accounts that use them

Improvements over the old admin-revoker include:
- All commands run in "dry-run" mode by default, to prevent accidental
executions
- All revocation mechanisms allow setting the revocation reason,
skipping blocking the key, indicating that the certificate is malformed,
and controlling the number of parallel workers conducting revocation
- All revocation mechanisms do not parse the cert in question, leaving
that to the RA
- Autogenerated usage information for all subcommands
- A much more modular structure to simplify adding more capabilities in
the future
- Significantly simplified tests with smaller mocks

The new tool has analogues of all of admin-revokers unit tests, and all
integration tests have been updated to use the new tool instead. A
future PR will remove admin-revoker, once we're sure SRE has had time to
update all of their playbooks.

Fixes https://github.com/letsencrypt/boulder/issues/7135
Fixes https://github.com/letsencrypt/boulder/issues/7269
Fixes https://github.com/letsencrypt/boulder/issues/7268
Fixes https://github.com/letsencrypt/boulder/issues/6927
Part of https://github.com/letsencrypt/boulder/issues/6840
2024-02-07 09:35:18 -08:00
Aaron Gable 0358bd7bf3
Ensure gRPC suberror metadata is ascii-only (#7282)
When passing detailed error information between services as gRPC
metadata, ensure that the suberrors being sent contain only ascii
characters, because gRPC metadata is sent as HTTP headers which only
allow visible ascii characters.

Also add a regression test.
2024-02-06 17:40:45 -08:00
Samantha 97a19b18d2
WFE: Check NewOrder rate limits (#7201)
Add non-blocking checks of New Order limits to the WFE using the new
key-value based rate limits package.

Part of #5545
2024-01-26 21:05:30 -05:00
Jacob Hoffman-Andrews ce5632b480
Remove `service1` / `service2` names in consul (#7266)
These names corresponded to single instances of a service, and were
primarily used for (a) specifying which interface to bind a gRPC port on
and (b) allowing `health-checker` to check individual instances rather
than a service as a whole.

For (a), change the `--grpc-addr` flags to bind to "all interfaces." For
(b), provide a specific IP address and port for health checking. This
required adding a `--hostOverride` flag for `health-checker` because the
service certificates contain hostname SANs, not IP address SANs.

Clarify the situation with nonce services a little bit. Previously we
had one nonce "service" in Consul and got nonces from that (i.e.
randomly between the two nonce-service instances). Now we have two nonce
services in consul, representing multiple datacenters, and one of them
is explicitly configured as the "get" service, while both are configured
as the "redeem" service.

Part of #7245.

Note this change does not yet get rid of the rednet/bluenet distinction,
nor does it get rid of all use of 10.88.88.88. That will be a followup
change.
2024-01-22 09:34:20 -08:00
Aaron Gable 26e3646249
Add integration test for account key change (#7208)
Fixes https://github.com/letsencrypt/boulder/issues/3112
Fixes https://github.com/letsencrypt/boulder/issues/7063
2023-12-13 13:54:38 -08:00
Aaron Gable 16081d8e30
Invert RequireCommonName into AllowNoCommonName (#7139)
The RequireCommonName feature flag was our only "inverted" feature flag,
which defaulted to true and had to be explicitly set to false. This
inversion can lead to confusion, especially to readers who expect all Go
default values to be zero values. We plan to remove the ability for our
feature flag system to support default-true flags, which the existence
of this flag blocked. Since this flag has not been set in any real
configs, inverting it is easy.

Part of https://github.com/letsencrypt/boulder/issues/6802
2023-11-06 10:58:30 -08:00
Aaron Gable 19f03da1e4
crl-storer: check number before uploading (#7065)
Have the crl-storer download the previous CRL from S3, parse it, and
compare its number against the about-to-be-uploaded CRL. This is not an
atomic operation, so it is not a 100% guarantee, but it is still a
useful safety check to prevent accidentally uploading CRL shards whose
CRL Numbers are not strictly increasing.

Part of https://github.com/letsencrypt/boulder/issues/6456
2023-09-27 09:12:44 -07:00
Phil Porada 4bd90ea82f
Log version string for more tools at startup (#7087)
This is a followup to https://github.com/letsencrypt/boulder/pull/7086
2023-09-19 12:46:55 -04:00
Phil Porada 72e01b337a
ceremony: Distinguish between intermediate and cross-sign ceremonies (#7005)
In `//cmd/ceremony`:
* Added `CertificateToCrossSignPath` to the `cross-certificate` ceremony
type. This new input field takes an existing certificate that will be
cross-signed and performs checks against the manually configured data in
each ceremony file.
* Added byte-for-byte subject/issuer comparison checks to root,
intermediate, and cross-certificate ceremonies to detect that signing is
happening as expected.
* Added Fermat factorization check from the `//goodkey` package to all
functions that generate new key material.

In `//linter`: 
* The Check function now exports linting certificate bytes. The idea is
that a linting certificate's `tbsCertificate` bytes can be compared
against the final certificate's `tbsCertificate` bytes as a verification
that `x509.CreateCertificate` was deterministic and produced identical
DER bytes after each signing operation.

Other notable changes:
* Re-orders the issuers list in each CA config to match staging and
production. There is an ordering issue mentioned by @aarongable two
years ago on IN-5913 that didn't make it's way back to this repository.
> Order here matters – the default chain we serve for each intermediate
should be the first listed chain containing that intermediate.
* Enables `ECDSAForAll` in `config-next` CA configs to match Staging.
* Generates 2x new ECDSA subordinate CAs cross-signed by an RSA root and
adds these chains to the WFE for clients to download.
* Increased the test.sh startup timeout to account for the extra
ceremony run time.


Fixes https://github.com/letsencrypt/boulder/issues/7003

---------

Co-authored-by: Aaron Gable <aaron@letsencrypt.org>
2023-08-23 14:01:19 -04:00
Aaron Gable 6a450a2272
Improve CRL shard leasing (#7030)
Simplify the index-picking logic in the SA's leaseOldestCrlShard method.
Specifically, more clearly separate it into "missing" and "non-missing"
cases, which require entirely different logic: picking a random missing
shard, or picking the oldest unleased shard, respectively.

Also change the UpdateCRLShard method to "unlease" shards when they're
updated. This allows the crl-updater to run as quickly as it likes,
while still ensuring that multiple instances do not step on each other's
toes.

The config change for shardWidth and lookbackPeriod instead of
certificateLifetime has been deployed in prod since IN-8445. The config
change changing the shardWidth is just so that the tests neither produce
a bazillion shards, nor have to do a bazillion SA queries for each chunk
within a shard, improving the readability of test logs.

Part of https://github.com/letsencrypt/boulder/issues/7023
2023-08-08 17:05:00 -07:00
Aaron Gable 9a4f0ca678
Deprecate LeaseCRLShards feature (#7009)
This feature flag is enabled in both staging and prod.
2023-08-07 15:17:00 -07:00
Jacob Hoffman-Andrews 725f190c01
ca: remove orphan queue code (#7025)
The `orphanQueueDir` config field is no longer used anywhere.

Fixes #6551
2023-08-02 16:04:28 -07:00
Samantha 33109ce384
test: Fix naming of integration test config structs (#7020)
Significantly differentiate configuration struct naming in the
integration package.
2023-08-01 16:24:42 -07:00
Samantha e7cb74b5f8
grpc: Allow for some SRV resolution failures (#7014)
Allow gRPC SRV resolver to succeed even when some names are not resolved
successfully. Cross-DC services (e.g. nonce) will fail to resolve when
the link between DCs is severed or one DC is taken offline, this should
not result in hard gRPC service failures.

Fixes #6974
2023-08-01 12:55:05 -04:00
Samantha b141fa7c78
WFE: Correct Error Handling for Nonce Redemption RPCs with Unknown Prefixes (#7004)
Fix an issue related to the custom gRPC Picker implementation introduced
in #6618. When a nonce contained a prefix not associated with a known
backend, the Picker would continuously rebuild, re-resolve DNS, and
eventually throw a 500 "Server Error" at RPC timeout. The Picker now
promptly returns a 400 "Bad Nonce" error as expected, in response the
requesting client should retry their request with a fresh nonce.

Additionally:
- WFE unit tests use derived nonces when `"BOULDER_CONFIG_DIR" ==
"test/config-next"`.
- `Balancer.Build()` in "noncebalancer" forces a rebuild until non-zero
backends are available. This matches the
[balancer/roundrobin](d524b40946/balancer/roundrobin/roundrobin.go (L49-L53))
implementation.
- Nonces with no matching backend increment "jose_errors" with label
`"type": "JWSInvalidNonce"` and "nonce_no_backend_found".
- Nonces of incorrect length are now rejected at the WFE and increment
"jose_errors" with label `"type": "JWSMalformedNonce"` instead of
`"type": "JWSInvalidNonce"`.
- Nonces not encoded as base64url are now rejected at the WFE and
increment "jose_errors" with label `"type": "JWSMalformedNonce"` instead
of `"type": "JWSInvalidNonce"`.

Fixes #6969
Part of #6974
2023-07-28 12:07:52 -04:00
Aaron Gable 908421bb98
crl-updater: lease CRL shards to prevent races (#6941)
Add a new feature flag, LeaseCRLShards, which controls certain aspects
of crl-updater's behavior.

When this flag is enabled, crl-updater calls the new SA.LeaseCRLShard
method before beginning work on a shard. This prevents it from stepping
on the toes of another crl-updater instance which may be working on the
same shard. This is important to prevent two competing instances from
accidentally updating a CRL's Number (which is an integer representation
of its thisUpdate timestamp) *backwards*, which would be a compliance
violation.

When this flag is enabled, crl-updater also calls the new
SA.UpdateCRLShard method after finishing work on a shard.

In the future, additional work will be done to make crl-updater use the
"give me the oldest available shard" mode of the LeaseCRLShard method.

Fixes https://github.com/letsencrypt/boulder/issues/6897
2023-07-19 15:11:16 -07:00
Jacob Hoffman-Andrews 7d66d67054
It's borpin' time! (#6982)
This change replaces [gorp] with [borp].

The changes consist of a mass renaming of the import and comments / doc
fixups, plus modifications of many call sites to provide a
context.Context everywhere, since gorp newly requires this (this was one
of the motivating factors for the borp fork).

This also refactors `github.com/letsencrypt/boulder/db.WrappedMap` and
`github.com/letsencrypt/boulder/db.Transaction` to not embed their
underlying gorp/borp objects, but to have them as plain fields. This
ensures that we can only call methods on them that are specifically
implemented in `github.com/letsencrypt/boulder/db`, so we don't miss
wrapping any. This required introducing a `NewWrappedMap` method along
with accessors `SQLDb()` and `BorpDB()` to get at the internal fields
during metrics and logging setup.

Fixes #6944
2023-07-17 14:38:29 -07:00
Jacob Hoffman-Andrews 80e1510819
admin: add clear-email subcommand (#6919)
When a user wants their email address deleted from the database but no
longer has access to their account, this allows an administrator to
clear it.

This adds `admin` as an alias for `admin-revoker`, because we'd like the
clear-email sub-command to be a part of that overall tool, but it's not
really revocation related.

Part of #6864
2023-06-01 14:33:24 -04:00
Aaron Gable 3990a08328
Add relevant domain to CAA errors and logs (#6886)
When processing CAA records, keep track of the FQDN at which that CAA
record was found (which may be different from the FQDN for which we are
attempting issuance, since we crawl CAA records upwards from the
requested name to the TLD). Then surface this name upwards so that it
can be included in our own log lines and in the problem documents which
we return to clients.

Fixes https://github.com/letsencrypt/boulder/issues/3171
2023-05-22 15:08:56 -04:00
Aaron Gable 62ff373885
Probs: remove divergences from RFC8555 (#6877)
Remove the remaining divergences from RFC8555 regarding what error types
we use in certain situations. Specifically:
- use "invalidContact" instead of "invalidEmail";
- use "unsupportedContact" for contact addresses that use a protocol
other than "mailto:"; and
- use "unsupportedIdentifier" for identifiers that specify a type other
than "dns".
2023-05-15 12:35:12 -07:00
Samantha 487680629d
cmd: TLSConfig values should be string not *string (#6872)
Fixes #6737
2023-05-08 13:21:42 -04:00
Matthew McPherrin 8427245675
OTel Integration test using jaeger (#6842)
This adds Jaeger's all-in-one dev container (with no persistent storage)
to boulder's dev docker-compose. It configures config-next/ to send all
traces there.

A new integration test creates an account and issues a cert, then
verifies the trace contains some set of expected spans.

This test found that async finalize broke spans, so I fixed that and a
few related spots where we make a new context.
2023-05-05 10:41:29 -04:00
Matthew McPherrin b5118dde36
Stop using DIRECTORY env var in integration tests (#6854)
We only ever set it to the same value, and then read it back in
make_client, so just hardcode it there instead.

It's a bit spooky-action-at-a-distance and is process-wide with no
synchronization, which means we can't safely use different values
anyway.
2023-05-03 09:54:48 -04:00
Jacob Hoffman-Andrews a9fc1cb882
Improve cert_storage_failed_test (#6849)
Replace inline connect string with a new one in test/vars (that points
to boulder_sa_integration).

Remove comments about interpolateParams=false being required; it is not.

Add clauses to getPrecertByName to ensure it follows its documented
constraints (return the latest one).

Follow-up on #6807. Fixes #6848.
2023-05-02 15:43:07 -07:00
Jacob Hoffman-Andrews 1c7e0fd1d8
Store linting certificate instead of precertificate (#6807)
In order to get rid of the orphan queue, we want to make sure that
before we sign a precertificate, we have enough data in the database
that we can fulfill our revocation-checking obligations even if storing
that precertificate in the database fails. That means:

- We should have a row in the certificateStatus table for the serial.
- But we should not serve "good" for that serial until we are positive
the precertificate was issued (BRs 4.9.10).
- We should have a record in the live DB of the proposed certificate's
public key, so the bad-key-revoker can mark it revoked.
- We should have a record in the live DB of the proposed certificate's
names, so it can be revoked if we are required to revoke based on names.

The SA.AddPrecertificate method already achieves these goals for
precertificates by writing to the various metadata tables. This PR
repurposes the SA.AddPrecertificate method to write "proposed
precertificates" instead.

We already create a linting certificate before the precertificate, and
that linting certificate is identical to the precertificate that will be
issued except for the private key used to sign it (and the AKID). So for
instance it contains the right pubkey and SANs, and the Issuer name is
the same as the Issuer name that will be used. So we'll use the linting
certificate as the "proposed precertificate" and store it to the DB,
along with appropriate metadata.

In the new code path, rather than writing "good" for the new
certificateStatus row, we write a new, fake OCSP status string "wait".
This will cause us to return internalServerError to OCSP requests for
that serial (but we won't get such requests because the serial has not
yet been published). After we finish precertificate issuance, we update
the status to "good" with SA.SetCertificateStatusReady.

Part of #6665
2023-04-26 13:54:24 -07:00
Aaron Gable 25cae29f70
Resolve TestAkamaiPurgerDrainQueueFails race (#6844)
Fixes https://github.com/letsencrypt/boulder/issues/6837
2023-04-26 11:30:09 -04:00
Phil Porada 17fb1b287f
cmd: Export prometheus metrics for TLS cert notBefore and notAfter fields (#6836)
Export new prometheus metrics for the `notBefore` and `notAfter` fields
to track internal certificate validity periods when calling the `Load()`
method for a `*tls.Config`. Each metric is labeled with the `serial`
field.

```
tlsconfig_notafter_seconds{serial="2152072875247971686"} 1.664821961e+09
tlsconfig_notbefore_seconds{serial="2152072875247971686"} 1.664821960e+09
```

Fixes https://github.com/letsencrypt/boulder/issues/6829
2023-04-24 16:28:05 -04:00
Aaron Gable 3ddca2d1b8
Update eggsampler/acme and use it for ARI tests (#6811)
Update github.com/eggsampler/acme from v3.3.0 to v3.4.0.
Changelog: https://github.com/eggsampler/acme/compare/v3.3.0...v3.4.0

Update the ARI integration test to use the eggampler/acme client's new
ARI capabilities for making both GET and POST requests. This simplifies
and streamlines the test significantly, and lets us test the POST path.

Fixes #6781
2023-04-19 14:14:43 -07:00
Aaron Gable 94f93361a0
Promote the first SAN from the CSR (#6796)
Rather than promoting the alphabetically-first SAN to be the CN, promote
the SAN which came first in the CSR. This is a reversion to previous
behavior that was changed as a side-effect of:
- https://github.com/letsencrypt/boulder/pull/6706;
- https://github.com/letsencrypt/boulder/pull/6749; and
- https://github.com/letsencrypt/boulder/pull/6757

Fixes https://github.com/letsencrypt/boulder/issues/6801
2023-04-06 14:30:19 -07:00
Aaron Gable 22fd579cf2
ARI: write Retry-After header before body (#6787)
When sending an ARI response, write the Retry-After header before
writing the JSON response body. This is necessary because
http.ResponseWriter implicitly calls WriteHeader whenever Write is
called, flushing all headers to the network and preventing any
additional headers from being written. Unfortunately, the unittests use
httptest.ResponseRecorder, which doesn't seem to enforce this invariant
(it's happy to report headers which were written after the body). Add a
header check to the integration tests, to make up for this deficiency.
2023-03-31 10:48:45 -07:00
Aaron Gable 0d0116dd3f
Implement GetSerialMetadata on StorageAuthorityRO (#6780)
When external clients make POST requests to our ARI endpoint, they're
getting 404s even when a GET request with the same exact CertID
succeeds. Logs show that this is because the SA is returning "method
GetSerialMetadata not implemented" when the WFE attempts that gRPC
request. This is due to an oversight: the GetSerialMetadata method is
not implemented on the SQLStorageAuthorityRO object, only on the
SQLStorageAuthority object. The unit tests did not catch this bug
because they supply a mock SA, which does implement the method in
question.

Update the receiver and add a wrapper so that GetSerialMetadata is
implemented on both the read-write and read-only SA implementation
types. Add a new kind of test assertion which helps ensure this won't
happen again. Add a TODO for an integration test covering the ARI POST
codepath to prevent a regression.

Fixes #6778
2023-03-30 12:32:14 -07:00
Matthew McPherrin 49851d7afd
Remove Beeline configuration (#6765)
In a previous PR, #6733, this configuration was marked deprecated
pending removal.  Here is that removal.
2023-03-23 16:58:36 -04:00
Aaron Gable 6d6f3632da
Change SetCommonName to RequireCommonName (#6749)
Change the SetCommonName flag, introduced in #6706, to
RequireCommonName. Rather than having the flag control both whether or
not a name is hoisted from the SANs into the CN *and* whether or not the
CA is willing to issue certs with no CN, this updated flag now only
controls the latter. By default, the new flag is true, and continues our
current behavior of failing issuance if we cannot set a CN in the cert.
When the flag is set to false, then we are willing to issue certificates
for which the CSR contains no CN and there is no SAN short enough to be
hoisted into the CN field.

When we have rolled out this change, we can move on to the next flag in
this series: HoistCommonName, which will control whether or not a SAN is
hoisted at all, effectively giving the CSRs (and therefore the clients)
full control over whether their certificate contains a SAN.

This change is safe because no environment explicitly sets the
SetCommonName flag to false yet.

Fixes #5112
2023-03-21 11:07:06 -07:00
Aaron Gable 9af4871e59
Add SetCommonName feature flag (#6706)
Add a new feature flag, `SetCommonName`, which defaults to `true`. In
this default state, no behavior changes.

When set to `false` on the CA, this flag will cause the CA to leave the
Subject commonName field of the certificate blank, as is recommended by
the Baseline Requirements Section 7.1.4.2.2(a).

Also slightly modify the behavior of the RA's `matchesCSR()` function,
to allow for both certificates that have a CN and certificates that
don't. It is not feasible to put this behavior behind the same
SetCommonName flag, because that would require an atomic deploy of both
the RA and the CA.

Obsoletes #5112
2023-03-09 13:31:55 -05:00
Phil Porada 9390c0e5f5
Put errors at end of log lines (#6627)
For consistency, put the error field at the end of unstructured log
lines to make them more ... structured.

Adds the `issuerID` field to "orphaning certificate" log line in the CA
to match the "orphaning precertificate" log line.

Fixes broken tests as a result of the CA and bdns log line change.

Fixes #5457
2023-02-03 11:28:38 -05:00
Aaron Gable 86c8a23a1a
Add fermat factorization integration test (#6613)
Add an integration test which verifies that we reject finalize requests
with CSRs containing a fermat-factorizable public key.

Originally this change was also going to remove our Fermat factorization
implementation from good_key.go, and simply rely on the similar check in
zlint's e_rsa_fermat_factorization check. However, while relying solely
on the lint works, it causes us to block such requests with a 500
serverInternal error, because we consider failing lints to be our fault.
This would be a regression from the current status quo, where such
requests are rejected with a 400 badCSR error and details of the
factorization, so we are leaving our goodkey checks in place.
2023-01-27 10:15:38 -08:00
Aaron Gable a7a2afef7a
ARI: Suggest immediate renewal for revoked certs (#6534)
Update our implementation of ARI to return a renewal window entirely in
the past (i.e., suggesting immediate renewal) if the certificate in
question has been revoked for any reason. This will allow clients which
implement ARI to discover that they need to replace their certificate
without having to query OCSP directly, especially as we move into a
future where OCSP is mostly supplanted by aggregated CRLs.

Fixes #6503
2022-12-02 14:33:55 -08:00
Aaron Gable 4f473edfa8
Deprecate 10 feature flags (#6502)
Deprecate these feature flags, which are consistently set in both prod
and staging and which we do not expect to change the value of ever
again:
- AllowReRevocation
- AllowV1Registration
- CheckFailedAuthorizationsFirst
- FasterNewOrdersRateLimit
- GetAuthzReadOnly
- GetAuthzUseIndex
- MozRevocationReasons
- RejectDuplicateCSRExtensions
- RestrictRSAKeySizes
- SHA1CSRs

Move each feature flag to the "deprecated" section of features.go.
Remove all references to these feature flags from Boulder application
code, and make the code they were guarding the only path. Deduplicate
tests which were testing both the feature-enabled and feature-disabled
code paths. Remove the flags from all config-next JSON configs (but
leave them in config ones until they're fully deleted, not just
deprecated). Finally, replace a few testdata CSRs used in CA tests,
because they had SHA1WithRSAEncryption signatures that are now rejected.

Fixes #5171 
Fixes #6476
Part of #5997
2022-11-14 09:24:50 -08:00
Samantha b35fe81d7b
ctpolicy: Remove deprecated codepath and fix metrics (#6485)
- Remove deprecated code for #5938
- Fix broken metrics flagged in #6435
- Make CT operator and log selection random

Fixes #6435
Fixes #5938
Fixes #6486
2022-11-07 11:31:20 -08:00
Samantha 6d519059a3
akamai-purger: Deprecate PurgeInterval config field (#6489)
Fixes #6003
2022-11-04 12:44:35 -07:00
Matthew McPherrin 1d16ff9b00
Add support for subcommands to "boulder" command (#6426)
Boulder builds a single binary which is symlinked to the different binary names, which are included in its releases.
However, requiring symlinks isn't always convenient.

This change makes the base `boulder` command usable as any of the other binary names.  If the binary is invoked as boulder, runs the second argument as the command name.  It shifts off the `boulder` from os.Args so that all the existing argument parsing can remain unchanged.

This uses the subcommand versions in integration tests, which I think is important to verify this change works, however we can debate whether or not that should be merged, since we're using the symlink method in production, that's what we want to test.

Issue #6362 suggests we want to move to a more fully-featured command-line parsing library that has proper subcommand support. This fixes one fragment of that, by providing subcommands, but is definitely nowhere near as nice as it could be with a more fully fleshed out library.  Thus this change takes a minimal-touch approach to this change, since we know a larger refactoring is coming.
2022-10-06 11:21:47 -07:00
Samantha 9c12e58c7b
grpc: Allow static host override in client config (#6423)
- Add a new gRPC client config field which overrides the dNSName checked in the
  certificate presented by the gRPC server.
- Revert all test gRPC credentials to `<service>.boulder`
- Revert all ClientNames in gRPC server configs to `<service>.boulder`
- Set all gRPC clients in `test/config` to use `serverAddress` + `hostOverride`
- Set all gRPC clients in `test/config-next` to use `srvLookup` + `hostOverride`
- Rename incorrect SRV record for `ca` with port `9096` to `ca-ocsp` 
- Rename incorrect SRV record for `ca` with port `9106` to `ca-crl` 

Resolves #6424
2022-10-03 15:23:55 -07:00
Samantha 90eb90bdbe
test: Replace sd-test-srv with consul (#6389)
- Add a dedicated Consul container
- Replace `sd-test-srv` with Consul
- Add documentation for configuring Consul
- Re-issue all gRPC credentials for `<service-name>.service.consul`

Part of #6111
2022-09-19 16:13:53 -07:00
Aaron Gable 7f189f7a3b
Improve how crl-updater formats and surfaces errors (#6369)
Make every function in the Run -> Tick -> tickIssuer -> tickShard chain
return an error. Make that return value a named return (which we usually
avoid) so that we can remove the manual setting of the metric result
label and have the deferred metric handling function take care of that
instead. In addition, let that cleanup function wrap the returned error
(if any) with the identity of the shard, issuer, or tick that is
returning it, so that we don't have to include that info in every
individual error message. Finally, have the functions which spin off
many helpers (Tick and tickIssuer) collect all of their helpers' errors
and only surface that error at the end, to ensure the process completes
even in the presence of transient errors.

In crl-updater's main, surface the error returned by Run or Tick, to
make debugging easier.
2022-09-12 11:36:42 -07:00
Aaron Gable 78fbda1cd2
Enable CRL test in config integration tests (#6368)
Now that both crl-updater and crl-storer are running in prod,
run this integration test in both test environments as well.

In addition, remove the fake storer grpc client that the updater
used when no storer client was configured, as storer clients
are now configured in all environments.
2022-09-09 16:03:49 -07:00
Aaron Gable c706609e79
Update grpc from v1.36.1 to v1.49.0 (#6336)
Changelog: https://github.com/grpc/grpc-go/compare/v1.36.1...v1.49.0

The biggest change for us is that grpc.WithBalancerName has
transitioned from deprecated to fully removed. The fix is to replace
it with a JSON-formatted "default config" object, as demonstrated in
https://github.com/grpc/grpc-go/pull/5232#issuecomment-1106921140.

This should unblock updating other dependencies which want to
transitively update gRPC as well.
2022-09-01 13:29:06 -07:00
Aaron Gable 73b72e8fa2
ARI: Implement GET portion of draft-ietf-acme-ari-00 (#6322)
Update our ACME Renewal Info implementation to parse
the CertID-based request format specified in the current
version of the draft specification.

Part of #6033
2022-08-30 14:03:26 -07:00
Jacob Hoffman-Andrews f98d74c14d
log: emit warnings and errors on stderr (#6325)
Debug and Info messages still go to stdout.

Fix the CAA integration test, which asserted that stderr should be empty
when caa-log-checker finds a problem. That used to be the case because
we never logged to stderr, but now it is the case.

Update the logging docs.

Fixes #6324
2022-08-29 15:00:55 -07:00
Aaron Gable 9c197e1f43
Use io and os instead of deprecated ioutil (#6286)
The iotuil package has been deprecated since go1.16; the various
functions it provided now exist in the os and io packages. Replace all
instances of ioutil with either io or os, as appropriate.
2022-08-10 13:30:17 -07:00
Aaron Gable 6a9bb399f7
Create new crl-storer service (#6264)
Create a new crl-storer service, which receives CRL shards via gRPC and
uploads them to an S3 bucket. It ignores AWS SDK configuration in the
usual places, in favor of configuration from our standard JSON service
config files. It ensures that the CRLs it receives parse and are signed
by the appropriate issuer before uploading them.

Integrate crl-updater with the new service. It streams bytes to the
crl-storer as it receives them from the CA, without performing any
checking at the same time. This new functionality is disabled if the
crl-updater does not have a config stanza instructing it how to connect
to the crl-storer.

Finally, add a new test component, the s3-test-srv. This acts similarly
to the existing mail-test-srv: it receives requests, stores information
about them, and exposes that information for later querying by the
integration test. The integration test uses this to ensure that a
newly-revoked certificate does show up in the next generation of CRLs
produced.

Fixes #6162
2022-08-08 16:22:48 -07:00
Aaron Gable 9ae16edf51
Fix race condition in revocation integration tests (#6253)
Add a new filter to mail-test-srv, allowing test processes to query
for messages sent from a specific address, not just ones sent to
a specific address. This fixes a race condition in the revocation
integration tests where the number of messages sent to a cert's
contact address would be higher than expected because expiration
mailer sent a message while the test was running. Also reduce
bad-key-revoker's maximum backoff to 2 seconds to ensure that
it continues to run frequently during the integration tests, despite
usually not having any work to do.

While we're here, also improve the comments on various revocation
integration tests, remove some unnecessary cruft, and split the tests
out to explicitly test functionality with the MozRevocationReasons
flag both enabled and disabled. Also, change ocsp_helper's default
output from os.Stdout to ioutil.Discard to prevent hundreds of lines
of log spam when the integration tests fail during a test that uses
that library.

Fixes #6248
2022-07-29 09:23:50 -07:00
Aaron Gable 11544756bb
Support new Google CT Policy (#6082)
Add a new code path to the ctpolicy package which enforces Chrome's new
CT Policy, which requires that SCTs come from logs run by two different
operators, rather than one Google and one non-Google log. To achieve
this, invert the "race" logic: rather than assuming we always have two
groups, and racing the logs within each group against each other, we now
race the various groups against each other, and pick just one arbitrary
log from each group to attempt submission to.

Ensure that the new code path does the right thing by adding a new zlint
which checks that the two SCTs embedded in a certificate come from logs
run by different operators. To support this lint, which needs to have a
canonical mapping from logs to their operators, import the Chrome CT Log
List JSON Schema and autogenerate Go structs from it so that we can
parse a real CT Log List. Also add flags to all services which run these
lints (the CA and cert-checker) to let them load a CT Log List from disk
and provide it to the lint.

Finally, since we now have the ability to load a CT Log List file
anyway, use this capability to simplify configuration of the RA. Rather
than listing all of the details for each log we're willing to submit to,
simply list the names (technically, Descriptions) of each log, and look
up the rest of the details from the log list file.

To support this change, SRE will need to deploy log list files (the real
Chrome log list for prod, and a custom log list for staging) and then
update the configuration of the RA, CA, and cert-checker. Once that
transition is complete, the deletion TODOs left behind by this change
will be able to be completed, removing the old RA configuration and old
ctpolicy race logic.

Part of #5938
2022-05-25 15:14:57 -07:00
Aaron Gable dab8a71b0e
Use new RA methods from WFE revocation path (#5983)
Simplify the WFE `RevokeCertificate` API method in three ways:
- Remove most of the logic checking if the requester is authorized to
  revoke the certificate in question (based on who is making the
  request, what authorizations they have, and what reason they're
  requesting). That checking is now done by the RA. Instead, simply
  verify that the JWS is authenticated.
- Remove the hard-to-read `authorizedToRevoke` callbacks, and make the
  `revokeCertBySubscriberKey` (nee `revokeCertByKeyID`) and
  `revokeCertByCertKey` (nee `revokeCertByJWK`) helpers much more
  straight-line in their execution logic.
- Call the RA's new `RevokeCertByApplicant` and `RevokeCertByKey` gRPC
  methods, rather than the deprecated `RevokeCertificateWithReg`.

This change, without any flag flips, should be invisible to the
end-user. It will slightly change some of our log message formats.
However, by now relying on the new RA gRPC revocation methods, this
change allows us to change our revocation policies by enabling the
`AllowDoubleRevocation` and `MozRevocationReasons` feature flags, which
affect the behavior of those new helpers.

Fixes #5936
2022-03-28 14:14:11 -07:00
Samantha 7c22b99d63
akamai-purger: Improve throughput and configuration safety (#6006)
- Add new configuration key `throughput`, a mapping which contains all
  throughput related akamai-purger settings.
- Deprecate configuration key `purgeInterval` in favor of `purgeBatchInterval` in
  the new `throughput` configuration mapping.
- When no `throughput` or `purgeInterval` is provided, the purger uses optimized
  default settings which offer 1.9x the throughput of current production settings.
- At startup, all throughput related settings are modeled to ensure that we
  don't exceed the limits imposed on us by Akamai.
- Queue is now `[][]string`, instead of `[]string`.
  - When a given queue entry is purged we know all 3 of it's URLs were purged.
  - At startup we know the size of a theoretical request to purge based on the
    number of queue entries included
- Raises the queue size from ~333-thousand cached OCSP responses to
  1.25-million, which is roughly 6 hours of work using the optimized default
  settings
- Raise `purgeInterval` in test config from 1ms, which violates API limits, to 800ms

Fixes #5984
2022-03-23 17:23:07 -07:00
Aaron Gable 32973392de
Revert "Bump google.golang.org/grpc from 1.36.1 to 1.44.0" (#5981)
Reverts letsencrypt/boulder#5963

Turns out the tests are still flaky -- using the `grpc.WaitForReady(true)`
connection option results in sometimes seeing 9 entries added to the
purger queue, and sometimes 10 entries. Reverting because flakiness
on main should not be tolerated.
2022-03-08 10:32:30 -08:00
dependabot[bot] 2ec03b377b
Bump google.golang.org/grpc from 1.36.1 to 1.44.0 (#5963)
Bumps [google.golang.org/grpc](https://github.com/grpc/grpc-go) from 1.36.1 to 1.44.0.
- [Release notes](https://github.com/grpc/grpc-go/releases)
- [Commits](https://github.com/grpc/grpc-go/compare/v1.36.1...v1.44.0)

Also update akamai-purger integration test to avoid experimental API.

The `conn.GetState()` API is marked experimental and may change behavior
at any time. It appears to have changed between v1.36.1 and v1.44.0,
and so the akamai-purger integration tests which rely on it break.

Rather than writing our own loop which polls `conn.GetState()`, just
use the stable `WaitForReady(true)` connection option, and apply it to
all connections by setting it as a default option in the dial options.
2022-03-07 17:00:20 -08:00
Samantha 80fe3aed54
akamai-purger: Cleanup (#5949)
Light cleanup of akamai-purger and the akamai cache-client. This does not make
any material changes to logic.

- Use `errors.New` and `errors.Is` instead of a custom `ErrFatal` type and
  `errors.As`
- Add whitespace to separate chunks of execution and error checking from one
  another
- Use `logger.Infof` and `logger.Errorf` instead of wrapped calls to
  `fmt.Sprintf`
- Remove capital letters from the beginning of error messages
- Additional comments and removal of some that are no longer accurate
2022-02-24 20:57:25 -08:00
Aaron Gable ff726dfc9f
Make revocation integration tests comprehensive (#5856)
Overhaul the revocation integration tests to comprehensively test
every combination of:
- revoking a cert vs a precert
- revoking via the cert key, the subscriber key, or a separate account
  that has validation for all of the names in the cert
- revoking for reason Unspecified vs for reason KeyCompromise

Also update a number of the python tests to verify that they cannot
revoke for reason keyCompromise, but can and do revoke with other
reasons.
2021-12-20 14:38:39 -08:00
Aaron Gable 5c02deabfb
Remove wfe1 integration tests (#5840)
These tests are testing functionality that is no longer in use in
production deployments of Boulder. As we go about removing wfe1
functionality, these tests will break, so let's just remove them
wholesale right now. I have verified that all of the tests removed in
this PR are duplicated against wfe2.

One of the changes in this PR is to cease starting up the wfe1 process
in the integration tests at all. However, that component was serving
requests for the AIA Issuer URL, which gets queried by various OCSP and
revocation tests. In order to keep those tests working, this change also
adds an integration-test-only handler to wfe2, and updates the CA
configuration to point at the new handler.

Part of #5681
2021-12-10 12:40:22 -08:00
Aaron Gable 1a1cd24237
Add tests for the experimental `renewalInfo` endpoint (#5750)
Add a unit test and an integration test that both exercise the new
experimental ACME Renewal Info endpoint. These tests do not
yet validate the contents of the response, just that the appropriate
HTTP response code is returned, but they will be developed as the
code under test evolves.

Fixes #5674
2021-10-27 15:00:56 -07:00
Aaron Gable eb5d0e9ba9
Update golangci-lint from v1.29.0 to v1.42.1 (#5745)
Update the version of golangci-lint we use in our docker image,
and update the version of the docker image we use in our tests.
Fix a couple places where we were violating lints (ineffective assign
and calling `t.Fatal` from outside the main test goroutine), and add
one lint (using math/rand) to the ignore list.

Fixes #5710
2021-10-22 16:26:59 -07:00
Jacob Hoffman-Andrews ba0ea090b2
integration: save hierarchy across runs (#5729)
This allows repeated runs using the same hiearchy, and avoids spurious
errors from ocsp-updater saying "This CA doesn't have an issuer cert
with ID XXX"

Fixes #5721
2021-10-20 17:06:33 -07:00
Andrew Gabbitas ba673673a4
Match revocation reason and request signing method (#5713)
Match revocation reason and request signing method

Add more detailed logging about request signing methods
2021-10-14 15:39:22 -06:00
Aaron Gable e0c3e2c1df
Reject unrecognized config keys (#5649)
Instead of using the default `json.Unmarshal`, explicitly
construct and use a `json.Decoder` so that we can set the
`DisallowUnknownFields` flag on the decoder. This causes
any unrecognized config keys to result in errors at boulder
startup time.

Fixes #5643
2021-09-24 10:13:44 -07:00
Aaron Gable 73233fb659
Improve caa-log-checker log output (#5613)
Make it produce audit logs like our other components.
2021-08-30 11:29:56 -07:00
Aaron Gable 5af74b74c2
Invert caa-log-checker's processing to save memory (#5565)
Previously, caa-log-checker's core algorithm was:
1. Load every single VA (CAA) log file, producing an in-memory map of
   names to the time at which they were checked
2. Iterate over the RA (Issuance) log file, checking each issuance event
   to see if it occurred less than 8 hours after an event in the
   in-memory map.

This consumes significant memory, as the map of all CAA checks is
redundant (contains entries for w.x.y.z, x.y.z, and y.z) and holds
unnecessary data (contains entries for CAA checks that occurred much
more than 8 hours before or after any issuance in the RA log).

Invert this algorithm, as such:
1. Load the RA (Issuance) log file, producing an in-memory map of names
   to the time at which they were issued
2. Iterate over each VA (CAA) log file, removing entries from the
   in-memory map if they occurred less than 8 hours after the current
   CAA checking event.

This reduces the memory consumption of caa-log-checker, because the
total number of issuance events is much smaller and the map does not
need to hold redundant data. The tradeoff is that caa-log-checker can no
longer print partial output as it runs; all results are held until the
very end, when it can inspect the in-memory map to see if it is empty.

Fixes #5552
2021-08-20 14:31:55 -07:00
Aaron Gable 923aef5839
Update to go1.16.5 (#5482)
Includes a number of updates to packages we use: math/big,
net, net/http/httputil, and archive/zip. See release notes at
https://golang.org/doc/devel/release#go1.16.minor

Fixes #5464
2021-06-15 11:42:04 -07:00
Aaron Gable 40f9e38088
Add lower, faster duplicate certificate rate limit (#5401)
Add a new rate limit, identical in implementation to the current
`CertificatesPerFQDNSet` limit, intended to always have both a lower
window and a lower threshold. This allows us to block runaway clients
quickly, and give their owners the ability to fix and try again quickly
(on the order of hours instead of days).

Configure the integration tests to set this new limit at 2 certs per 2
hours. Also increase the existing limit from 5 to 6 certs in 7 days, to
allow clients to hit the first limit three times before being fully
blocked for the week. Also add a new integration test to verify this
behavior.

Note that the new ratelimit must have a window greater than the
configured certificate backdate (currently 1 hour) in order to be
useful.

Fixes #5210
2021-05-17 14:50:29 -07:00
Aaron Gable 0196d7e876
orphan-finder: use serial + issuerID for OCSP (#5383)
Update orphan-finder's `generateOCSP` function to make its request to
the CA using the certificate's serial number and issuer ID, rather than
the full DER bytes. To facilitate this, add an `IssuerCerts` item to the
orphan-finder config, and add an `issuers` map to its struct, mimicking
fields of the same name and purpose on the RA. Leave the old code path
in the `generateOCSP` method for now, to be fully removed after the new
config has been deployed.

Also update the unittests to use real on-disk certificates instead of
inline strings, and similarly correct the integration test to use a
certificate with the correct Issuer field.

Part of #5079
Fixes #5149
2021-04-05 09:13:21 -07:00
Aaron Gable 8cf597459d
Add multi-issuer support to ocsp-responder (#5154)
The ocsp-responder takes a path to a certificate file as one of
its config values. It uses this path as one of the inputs when
constructing its DBSource, the object responsible for querying
the database for pregenerated OCSP responses to fulfill requests.

However, this certificate file is not necessary to query the
database; rather, it only acts as a filter: OCSP requests whose
IssuerKeyHash do not match the hash of the loaded certificate are
rejected outright, without querying the DB. In addition, there is
currently only support for a single certificate file in the config.

This change adds support for multiple issuer certificate files in
the config, and refactors the pre-database filtering of bad OCSP
requests into a helper object dedicated solely to that purpose.

Fixes #5119
2020-11-10 09:21:09 -08:00
Aaron Gable 294d1c31d7
Use error wrapping for berrors and tests (#5169)
This change adds two new test assertion helpers, `AssertErrorIs`
and `AssertErrorWraps`. The former is a wrapper around `errors.Is`,
and asserts that the error's wrapping chain contains a specific (i.e.
singleton) error. The latter is a wrapper around `errors.As`, and
asserts that the error's wrapping chain contains any error which is
of the given type; it also has the same unwrapping side effect as
`errors.As`, which can be useful for further assertions about the
contents of the error.

It also makes two small changes to our `berrors` package, namely
making `berrors.ErrorType` itself an error rather than just an int,
and giving `berrors.BoulderError` an `Unwrap()` method which
exposes that inner `ErrorType`. This allows us to use the two new
helpers above to make assertions about berrors, rather than
having to hand-roll equality assertions about their types.

Finally, it takes advantage of the two changes above to greatly
simplify many of the assertions in our tests, removing conditional
checks and replacing them with simple assertions.
2020-11-06 13:17:11 -08:00
Samantha ae7e1541f8
test: replacing error assertions with errors.As (#5137)
Part of #5010
2020-10-21 11:13:42 -07:00
Jacob Hoffman-Andrews d90a2817c4
ocsp_helper: allow suppressing output. (#5103)
This adds a configurable output parameter for ocsp_helper, which
defaults to stdout. This allows suppressing the stdout output when using
ocsp_helper in integration tests. That output was making it hard to
see details about failing tests.
2020-09-25 14:03:50 -07:00
Aaron Gable 17e9e7fbb7
SA: Ensure that IssuerID is set when adding precertificates (#5099)
This change adds `req.IssuerID` to the set of fields that the SA's
`AddPrecertificate` method requires be non-zero.

As a result, this also updates many tests, both unit and integration,
to ensure that they supply a value (usually just 1) for that field. The
most complex part of the test changes is a slight refactoring to the
orphan-finder code, which makes it easier to reason about the
separation between log line parsing and building and sending the
request.

Based on #5096
Fixes #5097
2020-09-23 16:45:19 -07:00
Roland Bracewell Shoemaker 1bf3d5d660
cmd/caa-log-checker: non-zero exit when errors are found (#5041)
Fixes #5033
2020-08-27 13:57:37 -07:00
Tim Geoghegan 8685e7aec2
cmd/caa-log-checker: -earliest and -latest (#5045)
Since we now sync caaChecks logs daily instead of continuously,
caa-log-checker can no longer assume that the validation logs it is
checking cover the exact same span of time as the issuance logs. This
commit adds -earliest and -latest parameters so that the script
that drives this tool can restrict verification to a timespan where we
know the data is valid.

Also adds a -debug flag to caa-log-checker to enable debug logs. At the
moment this makes the tool write to stderr how many issuance messages
were evaluated and how many were skipped due to -earliest and
-latest parameters.
2020-08-25 09:54:20 -07:00
Jacob Hoffman-Andrews 50d404333e
akamai-purger: empty queue on shutdown (#4944) 2020-07-10 13:04:46 -07:00
Jacob Hoffman-Andrews 56d581613c
Update test/config. (#4923)
This copies over a number of features flags and other settings from
test/config-next that have been applied in prod.

Also, remove the config-next gate on various tests.
2020-07-01 17:59:14 -07:00
Aaron Gable afffbb899d
Add -expect-reason flag to checkocsp (#4901)
Adds a new -expect-reason flag to the checkocsp binary to allow for
verifying the revocation reason of the certificate(s) in question.
This flag has a default value of -1, meaning that no particular
revocation reason will be expected or enforced.

Also updates the -expect-status flag to have the same default (-1) and
behavior, so that when the tool is run interactively it can simply
print the revocation status of each certificate.

Finally, refactors the way the ocsp/helper library declares flags and
accesses their values. This unifies the interface and makes it easy to
extend to allow tests to modify parameters other than expectStatus when
desired.

Fixes #4885
2020-06-29 14:15:14 -07:00
Jacob Hoffman-Andrews 7b93e00021
Fix pass-through of revoke reason in ocsp-updater. (#4880)
This ensures that updates to OCSP responses keep the same reason code
as when they were revoked.
2020-06-18 17:37:32 -07:00
Roland Bracewell Shoemaker 7673f02803
Use cmd/ceremony in integration tests (#4832)
This ended up taking a lot more work than I expected. In order to make the implementation more robust a bunch of stuff we previously relied on has been ripped out in order to reduce unnecessary complexity (I think I insisted on a bunch of this in the first place, so glad I can kill it now).

In particular this change:

* Removes bhsm and pkcs11-proxy: softhsm and pkcs11-proxy don't play well together, and any softhsm manipulation would need to happen on bhsm, then require a restart of pkcs11-proxy to pull in the on-disk changes. This makes manipulating softhsm from the boulder container extremely difficult, and because of the need to initialize new on each run (described below) we need direct access to the softhsm2 tools since pkcs11-tool cannot do slot initialization operations over the wire. I originally argued for bhsm as a way to mimic a network attached HSM, mainly so that we could do network level fault testing. In reality we've never actually done this, and the extra complexity is not really realistic for a handful of reasons. It seems better to just rip it out and operate directly on a local softhsm instance (the other option would be to use pkcs11-proxy locally, but this still would require manually restarting the proxy whenever softhsm2-util was used, and wouldn't really offer any realistic benefit).
* Initializes the softhsm slots on each integration test run, rather than when creating the docker image (this is necessary to prevent churn in test/cert-ceremonies/generate.go, which would need to be updated to reflect the new slot IDs each time a new boulder-tools image was created since slot IDs are randomly generated)
* Installs softhsm from source so that we can use a more up to date version (2.5.0 vs. 2.2.0 which is in the debian repo)
* Generates the root and intermediate private keys in softhsm and writes out the root and intermediate public keys to /tmp for use in integration tests (the existing test-{ca,root} certs are kept in test/ because they are used in a whole bunch of unit tests. At some point these should probably be renamed/moved to be more representative of what they are used for, but that is left for a follow-up in order to keep the churn in this PR as related to the ceremony work as possible)
Another follow-up item here is that we should really be zeroing out the database at the start of each integration test run, since certain things like certificates and ocsp responses will be signed by a key/issuer that is no longer is use/doesn't match the current key/issuer.

Fixes #4832.
2020-06-03 15:20:23 -07:00
Roland Bracewell Shoemaker 57ee1543a3
Add caa-log-checker tool (#4804)
Adds a productionized version of our internal tooling to the tree. The
major differences are: it doesn't skip certs with only one name, it
doesn't read in all the va logs in parallel, it only supports reading
one ra log at a time, and it adds unit tests.

Probably it should include a integration test, but that requires
capturing logs on the docker container, which I don't think we currently
do? Probably would make for a good follow-up issue.

Fixes #4698.
2020-05-08 12:12:24 -07:00
Garrett Squire 739686ba88
Bug Fixes (#4798)
Patches:

Make sure all log tailing types call Cleanup
Make sure the http.Response body is closed in all cases
Make sure that the challenge token is always deleted
2020-04-30 11:56:43 -07:00
Roland Bracewell Shoemaker 70ff4d9347
Add bad-key-revoker daemon (#4788)
Adds a daemon which monitors the new blockedKeys table and checks for any unexpired, unrevoked certificates that are associated with the added SPKI hashes and revokes them, notifying the user that issued the certificates.

Fixes #4772.
2020-04-23 11:51:59 -07:00
Roland Bracewell Shoemaker 9df97cbf06
Add a blocked keys table, and use it (#4773)
Fixes #4712 and fixes #4711.
2020-04-15 13:42:51 -07:00
Jacob Hoffman-Andrews 5254844ba2
Make TestValidAuthzExpires non-flaky. (#4778)
Previously, the test called `.Round(time.Minute)` on the expected
and actual expiration times, intending to perform an "approximately
equal" function.

However, when the expected and actual times differed by a second, but
they happened to fall on opposite sides of a rounding interval (i.e. 30
seconds into a minute), they would be rounded in opposite directions,
resulting in a conclusion that they were not equal.

This change instead defines an acceptable range of plus or minus a
minute for the expiration time, and checks that the actual expiration
time is in that interval.
2020-04-15 12:54:53 -07:00
Roland Bracewell Shoemaker 87746dec5c Properly register boulder-wfe2 http metrics (#4654)
Instead of blackholing them.
2020-01-21 12:55:26 -08:00
Jacob Hoffman-Andrews 1be1986894 SA: Rename authz2Model to authzModel. (#4648)
Also remove unused helper functions selectPendingAuthz and selectAuthz,
and pendingAuthzModel and the previously defined but now unused
authzModel.
2020-01-17 17:19:15 -05:00
Daniel McCarney bff9eb0534
CI/Dev: restore allowOrigins wfe/wfe2 config. (#4650)
In 67ec373a96 we removed "unused" WFE and WFE2
config elements. Unfortunately I missed that one of these elements,
`allowOrigins`, **is** used and without this config in
place CORS is broken.

We have unit tests for the CORS headers but we did not have any end-to-end
integration tests that would catch a problem with the WFE/WFE2 missing the
`allowOrigins` config element.

This commit restores the `allowOrigins` config value across the WFE/WFE2
configs and also adds a very small integration test. That test only checks one
CORS header and only for the HTTP ACMEv2 endpoint but I think it's sufficient
for the moment (and definitely better than nothing).

Prior to fixing the config elements the integration test fails as expected:
```
--- FAIL: TestWFECORS (0.00s)
    wfe_test.go:28: "" != "*"
FAIL
FAIL	github.com/letsencrypt/boulder/test/integration	0.014s
FAIL
```
2020-01-17 14:41:20 -05:00
Daniel McCarney 97bd7c53dc RA: Don't extend valid authorization time by pending expiry. (#4619)
The RA should set the expiry of valid authorizations based only on the current time and the configured authorizationLifetime. It should not extend the pending authorization's lifetime by the authorizationLifetime.

Resolves #4617

I didn't gate this with a feature flag. If we think this needs an API announcement and gradual rollout (I don't personally think this change deserves that) then I think we should change the RA config's authorizationLifetimeDays value to 37 days instead of adding a feature flag that we'll have to clean up after the flag date. We can change it back to 30 after the flag date.
2019-12-19 10:11:23 -08:00
Daniel McCarney 6ed62cf746
RA: reject Contacts that marshal too long for DB. (#4575)
In the deep dark history of Boulder we ended up jamming contacts into
a VARCHAR db field. We need to make sure that when contacts are
marshaled the resulting bytes will fit into the column or a 500 will
be returned to the user when the SA RPC fails.

One day we should fix this properly and not return a hacky error message
that's hard for users to understand. Unfortunately that will likely
require a migration or a new DB table. In the shorter term this hack
will prevent 500s which is a clear improvement.
2019-11-22 15:13:53 -05:00
Daniel McCarney fde145ab96
RA: implement stricter email validation. (#4574)
Prev. we weren't checking the domain portion of an email contact address
very strictly in the RA. This updates the PA to export a function that
can be used to validate the domain the same way we validate domain
portions of DNS type identifiers for issuance.

This also changes the RA to use the `invalidEmail` error type in more
places.

A new Go integration test is added that checks these errors end-to-end
for both account creation and account update.
2019-11-22 13:39:31 -05:00
Daniel McCarney a86ed0f753
RA: fix error returned through WFE2 for too big NewOrders. (#4572)
We need the RA's `NewOrder` RPC to return a `berrors.Malformed` instance
when there are too many identifiers. A bare error will be turned into
a server internal problem by the WFE2's `web.ProblemDetailsForError`
call while a `berrors.Malformed` will produce the expected malformed
problem.

This commit fixes the err, updates the unit test, and adds an end-to-end
integration test so we don't mess this up again.
2019-11-21 13:54:49 -05:00
Daniel McCarney 4e9ab5f04e
deps: update to eggsampler/acme/v3, run tidy, re-enable parallel tests (#4568)
This updates the `github.com/eggsampler/acme` dependency used in our Go-based
integration tests to v3. Notably this fixes a data race we encountered in CI.
With the data race fixed this branch can also revert
54a798b7f6 and resolve
https://github.com/letsencrypt/boulder/issues/4542

I ran a `go mod tidy` to cleanup the old `v2` copy of the dep and it also
removed a few stale cfssl/mysql items from the `go.mod`.

Upstream library's tests are confirmed to pass:
```
~/go/src/github.com/eggsampler/acme$ git log --pretty=format:'%h' -n 1
b581dc6

~/go/src/github.com/eggsampler/acme$ make pebble
mkdir -p /home/daniel/go/src/github.com/letsencrypt/pebble
git clone --depth 1 https://github.com/letsencrypt/pebble.git /home/daniel/go/src/github.com/letsencrypt/pebble \
	|| (cd /home/daniel/go/src/github.com/letsencrypt/pebble; git checkout -f master && git reset --hard HEAD && git pull -q)
fatal: destination path '/home/daniel/go/src/github.com/letsencrypt/pebble' already exists and is not an empty directory.
Already on 'master'
Your branch is up-to-date with 'le/master'.
HEAD is now at 6c2d514 wfe: compare Identifier.Type with acme.IndentifierIP (#287)
docker-compose -f /home/daniel/go/src/github.com/letsencrypt/pebble/docker-compose.yml up -d
Creating network "pebble_acmenet" with driver "bridge"
Creating pebble_challtestsrv_1 ... done
Creating pebble_pebble_1       ... done
while ! wget --delete-after -q --no-check-certificate "https://localhost:14000/dir" ; do sleep 1 ; done
go clean -testcache
go test -race -coverprofile=coverage_18.txt -covermode=atomic github.com/eggsampler/acme/v3
ok  	github.com/eggsampler/acme/v3	24.292s	coverage: 83.0% of statements
docker-compose -f /home/daniel/go/src/github.com/letsencrypt/pebble/docker-compose.yml down
Stopping pebble_pebble_1       ... done
Stopping pebble_challtestsrv_1 ... done
Removing pebble_pebble_1       ... done
Removing pebble_challtestsrv_1 ... done
Removing network pebble_acmenet
```
2019-11-21 09:23:12 -05:00
Roland Bracewell Shoemaker f24fd0dfc8 Cleanup leftovers from PrecertificateOCSP deprecation (#4551)
Cleans up a few things that were left out of #4465.
2019-11-14 15:23:48 -08:00
Daniel McCarney df6b507aa9
test: fix TestPrecertificateOCSP flake. (#4536)
Since 6f71c0c switched the Go integration tests to run in parallel the
`TestPrecertificateOCSP` test has been flaky. To fix the flake the test
needs to be changed to be resilient to precertificates other than the
one it is expecting being returned by the ct-test-srv since other tests
are also concurrently using it.
2019-11-08 16:29:30 -05:00
Roland Bracewell Shoemaker 6f71c0c453 tests: run golang integration tests in parallel w/ race detector (#4533) 2019-11-08 15:10:21 -05:00
Jacob Hoffman-Andrews 5e608ccbb8 tests: use relative paths in Go integration tests. (#4526)
This makes it simpler to specify paths to testdata and config files.

Fixes #4508
2019-11-05 10:22:20 -05:00