Things removed:
* features.EmbedSCTs (and all the associated RA/CA/ocsp-updater code etc)
* ca.enablePrecertificateFlow (and all the associated RA/CA code)
* sa.AddSCTReceipt and sa.GetSCTReceipt RPCs
* publisher.SubmitToCT and publisher.SubmitToSingleCT RPCs
Fixes#3755.
Fix: When any of OCSPURL, CRLURL or IssuerURL in the CertProfile are empty, the relevant fields are encoded in ASN.1 as empty arrays.
Fix: KeyUsage is a bitmask requiring bitwise OR.
Tested against relevant hardware for generating both RSA and ECDSA roots and intermediates with keys generated using `gen-key`.
Also this makes a few changes to the `gen-key` tool after further experience with the HSM and more reading of the PCKS#11 specification. Main change is the removal of `compatMode`, which was intended to provide support for two naming schemes for EC used in subsequent PKCS#11 drafts. It turns out these schemes were changes in name only and the underlying structs/ints were the exact same (i.e. `CKA_ECDSA_PARAMS == CKA_EC_PARAMS` and `CKM_ECDSA_KEY_PAIR_GEN == CKM_EC_KEY_PAIR_GEN`) and just allowed using one of the two names based on preference. This meant with `compatMode` enabled or disabled the tool did the exact same thing.
Fixes#3697.
This adds support for the account-uri CAA parameter as specified by
section 3 of https://tools.ietf.org/html/draft-ietf-acme-caa-04, allowing
issuance to be restricted to one or more ACME accounts as specified by CAA
records.
Switch linting library to zmap/zlint.
```
github.com/zmap/zlint$ go test ./...
ok github.com/zmap/zlint 0.190s
? github.com/zmap/zlint/cmd/zlint [no test files]
ok github.com/zmap/zlint/lints 0.216s
ok github.com/zmap/zlint/util (cached)
```
Two fixes that I found while doing work on the gen-cert tool and setting up the HSM again
* Accept an empty PIN argument, this allows purely using the PED for login if not using challenge mode
* Generate 4 byte key ID for public/private key pairs during key gen, the HSM doesn't generate this field itself and `letsencrypt/pkcs11key` relies on this attribute to function
While we intended to allow legacy ACME v1 accounts created through the WFE to work with the ACME v2 implementation and the WFE2 we neglected to consider that a legacy account would have a Key ID URL that doesn't match the expected for a V2 account. This caused `wfe2/verify.go`'s `lookupJWK` to reject all POST requests authenticated by a legacy account unless the ACME client took the extra manual step of "fixing" the URL.
This PR adds a configuration parameter to the WFE2 for an allowed legacy key ID prefix. The WFE2 verification logic is updated to allow both the expected key ID prefix and the configured legacy key ID prefix. This will allow us to specify the correct legacy URL in configuration for both staging/prod to allow unmodified V1 ACME accounts to be used with ACME v2.
Resolves https://github.com/letsencrypt/boulder/issues/3674
Remove various unnecessary uses of fmt.Sprintf - in particular:
- Avoid calls like t.Error(fmt.Sprintf(...)), where t.Errorf can be used directly.
- Use strconv when converting an integer to a string, rather than using
fmt.Sprintf("%d", ...). This is simpler and can also detect type errors at
compile time.
- Instead of using x.Write([]byte(fmt.Sprintf(...))), use fmt.Fprintf(x, ...).
A very large number of the logger calls are of the form log.Function(fmt.Sprintf(...)).
Rather than sprinkling fmt.Sprintf at every logger call site, provide formatting versions
of the logger functions and call these directly with the format and arguments.
While here remove some unnecessary trailing newlines and calls to String/Error.
In #3651 we introduced a new parameter to sa.AddCertificate to allow specifying the Issued date. If nil, we defaulted to the current time to maintain deployability guidelines.
Now that this has been deployed everywhere this PR updates SA.AddCertificate and the gRPC wrappers such that a nil issuer argument is rejected with an error.
Unit tests that were previously using nil for the issued time are updated to explicitly set the issued time to the fake clock's now().
Resolves#3657
We may see RPCs that are dispatched by a client but do not arrive at the server for some time afterwards. To have insight into potential request latency at this layer we want to publish the time delta between when a client sent an RPC and when the server received it.
This PR updates the gRPC client interceptor to add the current time to the gRPC request metadata context when it dispatches an RPC. The server side interceptor is updated to pull the client request time out of the gRPC request metadata. Using this timestamp it can calculate the latency and publish it as an observation on a Prometheus histogram.
Accomplishing the above required wiring a clock through to each of the client interceptors. This caused a small diff across each of the gRPC aware boulder commands.
A small unit test is included in this PR that checks that a latency stat is published to the histogram after an RPC to a test ChillerServer is made. It's difficult to do more in-depth testing because using fake clocks makes the latency 0 and using real clocks requires finding a way to queue/delay requests inside of the gRPC mechanisms not exposed to Boulder.
Updates https://github.com/letsencrypt/boulder/issues/3635 - Still TODO: Explicitly logging latency in the VA, tracking outstanding RPCs as a gauge.
* Randomize order of CT logs when submitting precerts so we maximize the chances we actually exercise all of the logs in a group and not just the first in the list.
* Add metrics for winning logs
The Boulder orphan-finder command uses the SA's AddCertificate RPC to add orphaned certificates it finds back to the DB. Prior to this commit this RPC always set the core.Certificate.Issued field to the
current time. For the orphan-finder case this meant that the Issued date would incorrectly be set to when the certificate was found, not when it was actually issued. This could cause cert-checker to alarm based on the unusual delta between the cert NotBefore and the core.Certificate.Issued value.
This PR updates the AddCertificate RPC to accept an optional issued timestamp in the request arguments. In the SA layer we address deployability concerns by setting a default value of the current time when none is explicitly provided. This matches the classic behaviour and will let an old RA communicate with a new SA.
This PR updates the orphan-finder to provide an explicit issued time to sa.AddCertificate. The explicit issued time is calculated using the found certificate's NotBefore and the configured backdate.
This lets the orphan-finder set the true issued time in the core.Certificate object, avoiding any cert-checker alarms.
Resolves#3624
Submits final certificates to any configured CT logs. This doesn't introduce a feature flag as it is config gated, any log we want to submit final certificates to needs to have it's log description updated to include the `"submitFinalCerts": true` field.
Fixes#3605.
During periods of peak load, some RPCs are significantly delayed (on the order of seconds) by client-side blocking. HTTP/2 clients have to obey a "max concurrent streams" setting sent by the server. In Go's HTTP/2 implementation, this value [defaults to 250](https://github.com/golang/net/blob/master/http2/server.go#L56), so the gRPC default is also 250. So whenever there are more than 250 requests in progress at a time, additional requests will be delayed until there is a slot available.
During this peak load, we aren't hitting limits on CPU or memory, so we should increase the max concurrent streams limit to take better advantage of our available resources. This PR adds a config field to do that.
Fixes#3641.
The `TotalCertificates` rate limit serves to ensure we don't
accidentally exceed our OCSP signing capacity by issuing too many
certificates within a fixed period. In practice this rate limit has been
fragile and the associated queries have been linked to performance
problems.
Since we now have better means of monitoring our OCSP signing capacity
this commit removes the rate limit and associated code.
This commit updates the `boulder-ra` and `boulder-ca` commands to refuse
to start if their configured `MaxNames` is 0 (the default value). This
should always be set to a positive number.
This commit also updates `csr/csr.go` to always apply the max names
check since it will never be 0 after the change above.
Also refactor `FailOnError` to pull out a separate `Fail` function.
Related to https://github.com/letsencrypt/boulder/issues/3632
* Update `globalsign/certlint` to d4a45be.
This commit updates the `github.com/globalsign/certlint` dependency to
the latest tip of master (d4a45be06892f3e664f69892aca79a48df510be0).
Unit tests are confirmed to pass:
```
$ go test ./...
ok github.com/globalsign/certlint 3.816s
ok github.com/globalsign/certlint/asn1 (cached)
? github.com/globalsign/certlint/certdata [no test files]
? github.com/globalsign/certlint/checks [no test files]
? github.com/globalsign/certlint/checks/certificate/aiaissuers [no
test files]
? github.com/globalsign/certlint/checks/certificate/all [no test
files]
? github.com/globalsign/certlint/checks/certificate/basicconstraints
[no test files]
? github.com/globalsign/certlint/checks/certificate/extensions [no
test files]
? github.com/globalsign/certlint/checks/certificate/extkeyusage [no
test files]
ok github.com/globalsign/certlint/checks/certificate/internal
(cached)
? github.com/globalsign/certlint/checks/certificate/issuerdn [no
test files]
? github.com/globalsign/certlint/checks/certificate/keyusage [no
test files]
? github.com/globalsign/certlint/checks/certificate/publickey [no
test files]
? github.com/globalsign/certlint/checks/certificate/publickey/goodkey
[no test files]
ok github.com/globalsign/certlint/checks/certificate/publicsuffix
(cached)
? github.com/globalsign/certlint/checks/certificate/revocation [no
test files]
? github.com/globalsign/certlint/checks/certificate/serialnumber
[no test files]
? github.com/globalsign/certlint/checks/certificate/signaturealgorithm
[no test files]
ok github.com/globalsign/certlint/checks/certificate/subject (cached)
ok github.com/globalsign/certlint/checks/certificate/subjectaltname
(cached)
? github.com/globalsign/certlint/checks/certificate/validity [no
test files]
? github.com/globalsign/certlint/checks/certificate/version [no test
files]
? github.com/globalsign/certlint/checks/certificate/wildcard [no
test files]
? github.com/globalsign/certlint/checks/extensions/adobetimestamp
[no test files]
? github.com/globalsign/certlint/checks/extensions/all [no test
files]
? github.com/globalsign/certlint/checks/extensions/authorityinfoaccess
[no test files]
? github.com/globalsign/certlint/checks/extensions/authoritykeyid
[no test files]
? github.com/globalsign/certlint/checks/extensions/basicconstraints
[no test files]
? github.com/globalsign/certlint/checks/extensions/crldistributionpoints
[no test files]
? github.com/globalsign/certlint/checks/extensions/ct [no test
files]
? github.com/globalsign/certlint/checks/extensions/extkeyusage [no
test files]
? github.com/globalsign/certlint/checks/extensions/keyusage [no test
files]
? github.com/globalsign/certlint/checks/extensions/nameconstraints
[no test files]
ok github.com/globalsign/certlint/checks/extensions/ocspmuststaple
(cached)
? github.com/globalsign/certlint/checks/extensions/ocspnocheck [no
test files]
? github.com/globalsign/certlint/checks/extensions/pdfrevocation
[no test files]
? github.com/globalsign/certlint/checks/extensions/policyidentifiers
[no test files]
? github.com/globalsign/certlint/checks/extensions/smimecapabilities
[no test files]
? github.com/globalsign/certlint/checks/extensions/subjectaltname
[no test files]
? github.com/globalsign/certlint/checks/extensions/subjectkeyid [no
test files]
ok github.com/globalsign/certlint/errors (cached)
? github.com/globalsign/certlint/examples/ct [no test files]
? github.com/globalsign/certlint/examples/specificchecks [no test
files]
```
* Certchecker: Remove OCSP Must Staple err ignore, fix typos.
This commit removes the explicit ignore for OCSP Must Staple errors that
was added when the upstream `certlint` package didn't understand that
PKIX extension. That problem was resolved and so we can remove the
ignore from `cert-checker`.
This commit also fixes two typos that were fixed upstream and needed to
be reflected in expected error messages in the `certlint` unit test.
* Certchecker: Ignore Certlint CN/SAN == PSL errors.
`globalsign/certlint`, used by `cmd/cert-checker` to vet certs,
improperly flags certificates that have subj CN/SANs equal to a private
entry in the public suffix list as faulty.
This commit adds a regex that will skip errors that match the certlint
PSL error string. Prior to this workaround the addition of a private PSL
entry as a SAN in the `TestCheckCert` test cert fails the test:
```
--- FAIL: TestCheckCert (1.72s)
main_test.go:221: Found unexpected problem 'Certificate subjectAltName
"dev-myqnapcloud.com" equals "dev-myqnapcloud.com" from the public
suffix list'.
```
With the workaround in place, the test passes again.
Also instead of repeating the same bucket definitions everywhere just use a single top level var in the metrics package in order to discourage copy/pasting.
Fixes#3607.
This commit addresses two config elements that were defined but not
wired through to the WFE implementation object. Prior to this commit the
`c.WFE.DirectoryCAAIdentity` and `c.WFE.DirectoryWebsite` configuration
values were read and unmarshaled from config but not passed to the WFE.
After this commit these two config options will be picked up by the WFE
impl.
The upstream `certlint` package doesn't understand the RFC 7633 OCSP
Must Staple PKIX Extension and flags its presence as an error. Until
this is resolved upstream this commit updates `cmd/cert-checker` to
ignore the error.
The `TestCheckCert` unit test is updated to add an unsupported extension
and the OCSP must staple extension to its test cert. Only the
unsupported extension should be flagged as a problem.
This commit updates the WFE and WFE2 to have configuration support for
setting a value for the `/directory` object's "meta" field's
optional "caaIdentities" and "website" fields. The config-next wfe/wfe2
configuration are updated with values for these fields. Unit tests are
updated to check that they are sent when expected and not otherwise.
Bonus content: The `test.AssertUnmarshaledEquals` function had a bug
where it would consider two inputs equal when the # of keys differed.
This commit also fixes that bug.
The outer `config.SubscriberAgreementURL` field has been deprecated for
a while in favour of `config.wfe.SubscriberAgreementURL`. After
verifying the prod/staging configurations do not use the legacy field
this commit removes it.
* Reject WFE2 certificate chain PEM files with CRLF endings.
This commit updates the `boulder-wfe2` command's processing of
certificate chains such that it will reject chain files that contain PEM
encoding with Windows CRLF line endings. Boulder is a UNIX service and
throughout we assume UNIX newlines. CRLF endings in a certificate chain
input file is an error that should be resolved by the operator prior to
startup.
* Add trailing newline to PEM chainfiles automatically.
If a PEM encoded chain file doesn't end with a trailing `\n` the WFE2
should add it. This commit updates the chain file loading to handle this
and adds a corresponding unit test.
Prior to this commit, if the expiration-mailer database configuration is
invalid, or the database is unreachable,
`cmd/expiration-mailer/main.go`'s `main()` function tries to call
`sa.SetSQLDebug(dbMap, logger)` before erroring from the DB
initialization failure. This causes a nil panic.
This commit changes the order of two lines such that `sa.SetSQLDebug` is
only called when there was no db setup error.
Adds SCT embedding to the certificate issuance flow. When a issuance is requested a precertificate (the requested certificate but poisoned with the critical CT extension) is issued and submitted to the required CT logs. Once the SCTs for the precertificate have been collected a new certificate is issued with the poison extension replace with a SCT list extension containing the retrieved SCTs.
Fixes#2244, fixes#3492 and fixes#3429.
For a long time now the WFE has generated URLs based on the incoming
request rather than a hardcoded BaseURL. BaseURL is no longer set in the
prod configs.
This also allows factoring out relativeEndpoint into the web package.
Some of the pprof handlers have to be accessed through
pprof.Handler("string"), while some have to be accessed through an
exported var in pprof. We weren't doing the latter before, which meant
some key handlers like Profile weren't available.
This change updates boulder-tools to use Go 1.10, and references a
newly-pushed image built using that new config.
Since boulder-tools pulls in the latest Certbot master at the time of
build, this also pulls in the latest changes to Certbot's acme module,
which now supports ACME v2. This means we no longer have to check out
the special acme-v2-integration branch in our integration tests.
This also updates chisel2.py to reflect some of the API changes that
landed in the acme module as it was merged to master.
Since we don't need additional checkouts to get the ACMEv2-compatible
version of the acme module, we can include it in the default RUN set for
local tests.
Use `t.Helper` and `t.Fatalf` instead of our own versions.
Remove some unused or single-user helpers.
Make the output of `AssetUnmarshaledEquals` clearer by showing one line per field.
Our various main.go functions gated some key code on whether the TLS
and/or GRPC config fields were present. Now that those fields are fully
deployed in production, we can simplify the code and require them.
Also, rename tls to tlsConfig everywhere to avoid confusion with the tls
package.
Avoid assigning to the same err from two different goroutines in
boulder-ca (fix a race).
Add a set of logs which will be submitted to but not relied on for their SCTs,
this allows us to test submissions to a particular log or submit to a log which
is not yet approved by a browser/root program.
Also add a feature which stops cancellations of remaining submissions when racing
to get a SCT from a group of logs.
Additionally add an informational log that always times out in config-next.
Fixes#3464 and fixes#3465.
This commit adds support for the Akamai CCU v3 API. See
https://developer.akamai.com/api/purge/ccu/resources.html for more information.
The V2 and V3 API are close enough to one another that we can support
both with minimal changes. A new OCSP updated configuration parameter
"AkamaiV3Network" is used to determine if the cache client should use
the V2 API or the V3 API. When empty, V2 is used. When set to either
"production" or "staging", the V3 API is used.
Boulder is fairly noisy about gRPC connection errors. This is a mixed
blessing: Our gRPC configuration will try to reconnect until it hits
an RPC deadline, and most likely eventually succeed. In that case,
we don't consider those to really be errors. However, in cases where
a connection is repeatedly failing, we'd like to see errors in the
logs about connection failure, rather than "deadline exceeded." So
we want to keep logging of gRPC errors.
However, right now we get a lot of these errors logged during
integration tests. They make the output hard to read, and may disguise
more serious errors. So we'd like to avoid causing such errors in
normal integration test operation.
This change reorders the startup of Boulder components by their gRPC
dependencies, so everything's backend is likely to be up and running
before it starts. It also reverses that order for clean shutdowns,
and waits for each process to exit before signalling the next one.
With these changes, I still got connection errors. Taking listenbuddy
out of the gRPC path fixed them. I believe the issue is that
listenbuddy is not a truly transparent proxy. In particular, it
accepts an inbound TCP connection before opening an outbound TCP
connection. If opening that outbound connection results in "connection
refused," it closes the inbound connection. That means gRPC sees a
"connection closed" (or "connection reset"?) rather than "connection
refused". I'm guessing it handles those cases differently, explaining
the different error results.
We've been using listenbuddy to trigger disconnects while Boulder is
running, to ensure that gRPC's reconnect code works. I think we can
probably rely on gRPC's reconnect to work. The initial problem that
led us to start testing this was a configuration problem; now that
we have the configuration we want, we should be fine and don't need
to keep testing reconnects on every integration test run.