Commit Graph

522 Commits

Author SHA1 Message Date
Jacob Hoffman-Andrews d3b9107059 orphan-finder: add OCSP generation (#4457)
Fixes #4428
2019-10-07 14:40:36 -04:00
Daniel McCarney ab26662fc8
ocsp-updater: fix generateResponse for precerts w/o certs (#4468)
Since 9906c93217 when
`features.PrecertificateOCSP` is enabled it is possible for there to be
`certificateStatus` rows that correspond to `precertificates` that do not have
a matching final `certificates` row. This happens in the case where we began
serving OCSP for a precert and weren't able to issue a final certificate.

Prior to the fix in this branch when the `ocsp-updater` would find stale OCSP
responses by querying the `certificateStatus` table it would error in
`generateResponse` when it couldn't find a matching `certificates` row. This
branch updates the logic so that when `features.PrecertificateOCSP` is enabled
it will also try finding the ocsp update DER from the `precertificates` table
when there is no matching serial in the `certificates` table.
2019-10-07 13:11:31 -04:00
Roland Bracewell Shoemaker 31ed590edd
Strip default scheme ports from Host headers (#4448)
Fixes #4447.
2019-09-27 16:14:40 -07:00
Roland Bracewell Shoemaker 8d877d93b2
boulder-janitor: Calculate expiry cutoff in code rather than th… (#4439)
Fixes #4431.
2019-09-23 12:33:54 -07:00
Daniel McCarney 1cd9733c24
WFE2: allow revocation of precertificates. (#4433)
When the `features.PrecertificateRevocation` feature flag is enabled the WFE2
will allow revoking certificates for a submitted precertificate. The legacy WFE1
behaviour remains unchanged (as before (pre)certificates issued through the V1
API will be revocable with the V2 API).

Previously the WFE2 vetted the certificate from the revocation request by
looking up a final certificate by the serial number in the requested
certificate, and then doing a byte for byte comparison between the stored and
requested certificate.

Rather than adjust this logic to handle looking up and comparing stored
precertificates against requested precertificates (requiring new RPCs and an
additional round-trip) we choose to instead check the signature on the requested
certificate or precertificate and consider it valid for revocation if the
signature validates with one of the WFE2's known issuers. We trust the integrity
of our own signatures.

An integration test that performs a revocation of a precertificate (in this case
one that never had a final certificate issued due to SCT embedded errors) with
all of the available authentication mechanisms is included.

Resolves https://github.com/letsencrypt/boulder/issues/4414
2019-09-16 16:40:07 -04:00
Jacob Hoffman-Andrews 9906c93217
Generate and store OCSP at precertificate signing time (#4420)
This change adds two tables and two methods in the SA, to store precertificates
and serial numbers.

In the CA, when the feature flag is turned on, we generate a serial number, store it,
sign a precertificate and OCSP, store them, and then return the precertificate. Storing
the serial as an additional step before signing the certificate adds an extra layer of
insurance against duplicate serials, and also serves as a check on database availability.
Since an error storing the serial prevents going on to sign the precertificate, this decreases
the chance of signing something while the database is down.

Right now, neither table has read operations available in the SA.

To make this work, I needed to remove the check for duplicate certificateStatus entry
when inserting a final certificate and its OCSP response. I also needed to remove
an error that can occur when expiration-mailer processes a precertificate that lacks
a final certificate. That error would otherwise have prevented further processing of
expiration warnings.

Fixes #4412

This change builds on #4417, please review that first for ease of review.
2019-09-09 12:21:20 -07:00
Daniel McCarney f02e9da38f
Support admin. blocking public keys. (#4419)
We occasionally have reason to block public keys from being used in CSRs
or for JWKs. This work adds support for loading a YAML blocked keys list
to the WFE, the RA and the CA (all the components already using the
`goodekey` package).

The list is loaded in-memory and is intended to be used sparingly and
not for more complicated mass blocking scenarios. This augments the
existing debian weak key checking which is specific to RSA keys and
operates on a truncated hash of the key modulus. In comparison the
admin. blocked keys are identified by the Base64 encoding of a SHA256
hash over the DER encoding of the public key expressed as a PKIX subject
public key. For ECDSA keys in particular we believe a more thorough
solution would have to consider inverted curve points but to start we're
calling this approach "Good Enough".

A utility program (`block-a-key`) is provided that can read a PEM
formatted x509 certificate or a JSON formatted JWK and emit lines to be
added to the blocked keys YAML to block the related public key.

A test blocked keys YAML file is included
(`test/example-blocked-keys.yml`), initially populated with a few of the
keys from the `test/` directory. We may want to do a more through pass
through Boulder's source code and add a block entry for every test
private key.

Resolves https://github.com/letsencrypt/boulder/issues/4404
2019-09-06 16:54:26 -04:00
Roland Bracewell Shoemaker 62e52f4103 test: weakKeyDirectory -> weakKeyFile in test configs (#4397) 2019-08-12 18:07:56 -04:00
Jacob Hoffman-Andrews 5e7fee0c4a test: update test/config with deployed configs. (#4396) 2019-08-09 12:08:56 -04:00
Roland Bracewell Shoemaker db01830508
Return OCSP unauthorized status if the certificate is expired (#4380)
The ocsp-updater ocspStaleMaxAge config var has to be bumped up to ~7 months so that when it is run after the six-months-ago run it will actually update the ocsp responses generated during that period and mark the certificate status row as expired.

Fixes #4338.
2019-08-01 14:13:27 -07:00
Daniel McCarney 8b518451b4 deps: update github.com/zmap/zlint to latest. (#4384)
* deps: update github.com/zmap/zlint to latest.

Update the `github.com/zmap/zlint` dependency to b126a9b. This captures
a small fix to the `ct_sct_policy_count_unsatisfied` lint that ensures
it isn't run for precertificates.

* config: remove ct_sct_policy_count_unsatisfied from ignored_lints.

With the latest `zlint` the `ct_sct_policy_count_unsatisfied` lint won't
flag precertificates as having an info-level lint result for missing
SCTs. With that fix in place we no longer have to ignore this lint in
the config-next CA configs that enable preissuance linting.
2019-08-01 10:22:30 -07:00
Daniel McCarney eb20b2accd
CA: implement CFSSL/zlint pre-issuance linting. (#4378)
The `test/config-next` CA configs are both updated to use `zlint` to lint TBS
pre-certificates with a throw-away key and treat any lint findings >=
`lints.Pass` as an error, blocking the CA from signing the TBS pre-cert with its
private key.

The CA `issuePrecertificateInner` function is updated to specifically catch
linting related errors from CFSSL to marshal the linting findings to the audit
log. A small unit test for this change is included.

The CA `IssueCertificateForPrecertificate` function remains unchanged: the CFSSL
interface that defines `SignFromPrecert` doesn't facilitate linting. We still
lint final certificates post-issuance with `cert-checker` and accept the
possibility there may be some compliance issues that could occur between the
precertificate passing linting and the final certificate being signed.

Resolves https://github.com/letsencrypt/boulder/issues/4255
2019-07-31 15:08:57 -04:00
Daniel McCarney bb005e1c79
integration: add test for boulder-janitor. (#4364) 2019-07-29 16:13:10 -04:00
Jacob Hoffman-Andrews a68c39ad9b SA: Delete unused challenges (#4353)
For authzv1, this actually executes a SQL DELETE for the unused challenges
when an authorization is updated upon validation.

For authzv2, this doesn't perform a delete, but changes the authorizations that
are returned so they don't include unused challenges.

In order to test the flag for both authz storage models, I set the feature flag in
both config/ and config-next/.

Fixes #4352
2019-07-26 14:04:46 -04:00
Daniel McCarney 9e896325f7
boulder-janitor: add initial daemon for tidying certificate resources. (#4354)
A new `boulder-janitor` command is added that provides a long-running
daemon that cleans up rows associated with expired certificate
resources. At present this is rows from the following tables:

* certificates
* certificateStatus
* certificatesPerName

Adding cleanup of tables associated with Order resources is the next step.

Three prometheus stats are exported:

* janitor_deletions - CounterVec for the number of deletions by table the 
  boulder-janitor has performed.
* janitor_workbatch - GaugeVec for the number of items of work by table
  the boulder-janitor queued for deletion.
* janitor_errors - CounterVec for the number of errors by table and error
  type the boulder-janitor has experienced.
2019-07-24 15:09:04 -04:00
Jacob Hoffman-Andrews 74699486ec Fix FasterGetOrderForNames and add tests. (#4331)
This rolls forward #4326 after it was reverted in #4328.

Resolves https://github.com/letsencrypt/boulder/issues/4329

The older query didn't have a `LIMIT 1` so it was returning multiple results,
but gorp's `SelectOne` was okay with multiple results when the selection was
going into an `int64`. When I changed this to a `struct` in #4326, gorp started
producing errors.

For this bug to manifest, an account needs to create an order, then fail
validation, twice in a row for a given domain name, then create an order once
more for the same domain name - that third request will fail because there are
multiple orders in the orderFqdnSets table for that domain.

Note that the bug condition doesn't happen when an account does three successful
issuances in a row, because finalizing an order (that is, issuing a certificate
for it) deletes the row in orderFqdnSets. Failing an authorization does not
delete the row in orderFqdnSets. I believe this was an intentional design
decision because an authorization can participate in many orders, and those
orders can have many other authorizations, so computing the updated state of
all those orders would be expensive (remember, order state is not persisted in
the DB but is calculated dynamically based on the authorizations it contains).

This wasn't detected in integration tests because we don't have any tests that
fail validation for the same domain multiple times. I filed an issue for an
integration test that would have incidentally caught this:
https://github.com/letsencrypt/boulder/issues/4332. There's also a more specific
test case in #4331.
2019-07-11 13:43:42 -04:00
Jacob Hoffman-Andrews 2131065b2d
Revert "SA: improve performance of GetOrderForNames. (#4326)" (#4328)
This reverts commit 9fa360769e.

This commit can cause "gorp: multiple rows returned for: ..." under certain situations.

See #4329 for details of followup.
2019-07-09 14:33:28 -07:00
Jacob Hoffman-Andrews 9fa360769e SA: improve performance of GetOrderForNames. (#4326)
When there are a lot of potential orders to reuse, the query could scan
unnecessary rows, sometimes leading to timeouts. The new query used 
when the FasterGetOrderForNames feature flag is enabled uses the
available index more effectively and adds a LIMIT clause.
2019-07-09 09:46:06 -04:00
Roland Bracewell Shoemaker af41bea99a Switch to more efficient multi nonce-service design (#4308)
Basically a complete re-write/re-design of the forwarding concept introduced in
#4297 (sorry for the rapid churn here). Instead of nonce-services blindly
forwarding nonces around to each other in an attempt to find out who issued the
nonce we add an identifying prefix to each nonce generated by a service. The
WFEs then use this prefix to decide which nonce-service to ask to validate the
nonce.

This requires a slightly more complicated configuration at the WFE/2 end, but
overall I think ends up being a way cleaner, more understandable, easy to
reason about implementation. When configuring the WFE you need to provide two
forms of gRPC config:

* one gRPC config for retrieving nonces, this should be a DNS name that
resolves to all available nonce-services (or at least the ones you want to
retrieve nonces from locally, in a two DC setup you might only configure the
nonce-services that are in the same DC as the WFE instance). This allows
getting a nonce from any of the configured services and is load-balanced
transparently at the gRPC layer. 
* a map of nonce prefixes to gRPC configs, this maps each individual
nonce-service to it's prefix and allows the WFE instances to figure out which
nonce-service to ask to validate a nonce it has received (in a two DC setup
you'd want to configure this with all the nonce-services across both DCs so
that you can validate a nonce that was generated by a nonce-service in another
DC).

This balancing is implemented in the integration tests.

Given the current remote nonce code hasn't been deployed anywhere yet this
makes a number of hard breaking changes to both the existing nonce-service
code, and the forwarding code.

Fixes #4303.
2019-06-28 12:58:46 -04:00
Roland Bracewell Shoemaker 844ae26b65
Allow forwarding of nonce-service Redeem RPCs from one service… (#4297)
Fixes #4295.
2019-06-26 13:04:31 -07:00
Roland Bracewell Shoemaker 0a16b5f57d Reduce akamai purger interval in integration tests (#4277)
and reduce the verify_akamai_purge deadline/sleep to match the much smaller interval.
2019-06-20 16:31:44 -04:00
Roland Bracewell Shoemaker acc44498d1 RA: Make RevokeAtRA feature standard behavior (#4268)
Now that it is live in production and is working as intended we can remove
the old ocsp-updater functionality entirely.

Fixes #4048.
2019-06-20 14:32:53 -04:00
Daniel McCarney 8d3a246adb
cert-checker: allow ignoring lints by name. (#4272)
This updates the `cert-checker` utility configuration with a new allow list of
ignored lints so we can exclude known false-positives/accepted info results by
name instead of result level. To start only the `n_subject_common_name_included`
lint is excluded in `test/config-next/cert-checker.json`. Once this lands we can
treat info/warning lint results as errors as a follow-up to not break
deployability guarantees.

Resolves https://github.com/letsencrypt/boulder/issues/4271
2019-06-20 13:09:10 -04:00
Roland Bracewell Shoemaker 3532dce246 Excise grpc maxConcurrentStreams configuration (#4257) 2019-06-12 09:35:24 -04:00
Daniel McCarney fea7106927
WFE2: Add feature flag for Mandatory POST-As-GET. (#4251)
In November 2019 we will be removing support for legacy pre RFC-8555
unauthenticated GET requests for accessing ACME resources. A new
`MandatoryPOSTAsGET` feature flag is added to the WFE2 to allow
enforcing this change. Once this feature flag has been activated in Nov
we can remove both it and the WFE2 code supporting GET requests.
2019-06-07 08:36:13 -04:00
Roland Bracewell Shoemaker 4ca01b5de3
Implement standalone nonce service (#4228)
Fixes #3976.
2019-06-05 10:41:19 -07:00
Roland Bracewell Shoemaker 4d40cf58e4
Enable integration tests for authz2 and fix a few bugs (#4221)
Enables integration tests for authz2 and fixes a few bugs that were flagged up during the process. Disables expired-authorization-purger integration tests if config-next is being used as expired-authz-purger expects to purge some stuff but doesn't know about authz2 authorizations, a new test will be added with #4188.

Fixes #4079.
2019-05-23 15:06:50 -07:00
Daniel McCarney e627f58f97
publisher: remove HTTP GET log probing. (#4223)
We adding this diagnostic probing while debugging an issue that has
since been resolved.
2019-05-23 12:42:26 -04:00
Jacob Hoffman-Andrews 76beffe074 Clean up must staple and precert options in CA (#4201)
Precertificate issuance has been the only supported mode for a while now. This
cleans up the remaining flags in the CA code. The same is true of must staple.

This also removes the IssueCertificate RPC call and its corresponding wrappers,
and removes a lot of plumbing in the CA unittests that was used to test the
situation where precertificate issuance was not enabled.
2019-05-21 15:34:28 -04:00
Jacob Hoffman-Andrews 09ba859366 SA: Deprecate FasterRateLimit feature flag (#4210)
This makes the behavior behind that flag the default.
2019-05-09 15:06:21 -04:00
Daniel McCarney a3d35f51ff
admin-revoker: use authz2 SA revocation RPC. (#4182)
The `RevokeAuthorizationsByDomain` SA RPC is deprecated and `RevokeAuthorizationsByDomain2`
should be used in its place. Which RPC to use is controlled by the `NewAuthorizationSchema` feature
flag. When it is true the `admin-revoker` will use the new RPC. 

Resolves https://github.com/letsencrypt/boulder/issues/4178
2019-05-02 14:55:43 -04:00
Jacob Hoffman-Andrews 0c700143bb Clean up README and test configs (#4185)
- docker-rebuild isn't needed now that boulder and bhsm containers run directly off
 the boulder-tools image.
- Remove DNS options from RA config.
- Remove GSB options from VA config.
2019-04-30 13:26:19 -07:00
Daniel McCarney 748f315b1a
PA: Support YAML for hostname policy. (#4180)
Our existing hostname policy configs use JSON. We would like to switch to YAML
to match the rate limit policy configs and to support commenting/tagging
entries in the hostname policy.

The PA is updated to support both JSON and YAML while we migrate the existing
policy data to the new format. To verify there are no changes in functionality
the existing unit tests were updated to test the same policy expressed in YAML
and JSON form.

In integration tests the `config` tree continues to use a JSON hostname policy
file to test the legacy support. In `config-next` a YAML hostname policy file
is used instead.

The new YAML format allows separating out `HighRiskBlockedNames` (primarily
used to meet the requirement of managing issuance for "high risk" domains per
CABF BRs, mostly static) and `AdminBlockedNames` (used after administrative
action is taken to block a domain. Additions are made with some frequency).
Since the rate at which we change entries in these lists differs it is helpful
to separate them in the policy content.
2019-04-26 14:35:28 -04:00
Jacob Hoffman-Andrews e49ffaf94c sa: use a faster query for certificates per name rate limit (#4179)
Right now we run a `SELECT COUNT` query on issuedNames to calculate this rate limit.
Unfortunately, counting large numbers of rows is very slow. This change introduces a
dedicated table for keeping track of that rate limit.

Instead of being keyed by FQDN, this new `certificatesPerName` table is keyed by
the same field as rate limits are calculated on: the base domain. Each row has a base
domain, a time (chunked by hour), and a count. This means calculating the rate limit
status for each domain reads at most 7 * 24 rows.

This should particularly speed up cases when a single domain has lots of subdomains.

Fixes #4152
2019-04-26 10:53:47 -07:00
Roland Bracewell Shoemaker e385ec1247 new authzs: remove feature + add required SQL users (#4176) 2019-04-23 14:56:09 -04:00
Daniel McCarney c2ad80f774 tests: rework multi-va integration tests. (#4169)
Rather than having the `BouncerHTTPRequestHandler` bounce/redirect
requests based on how many have arrived it is more reliable to follow
the way we did this in unit tests and decide based on the request
UA. By setting a distinct UA per VA it's possible to more precisely
define how many requests from each VA should be redirected to the
real key authorization.

Using the count was flaky because occasionally the remote VA
requests would arrive before the primary VA expending the VIP
request quota and bouncing the primary VA validation request. The
`test_http_multiva_threshold_pass` unit test expectation was that
since 2 of the 3 requests would pass that the test would pass but
since the primary VA is treated differently (it must always see a
valid result for the remote VA results to be considered) the test
would fail when the requests were scheduled this way.

Something particular to CI seems to result in the primary VA
being scheduled after the remote VA goroutines more often
than on faster machines/local testing. I had very little
luck reproducing locally but in 10 CI builds of master
with the count-based `BouncerHTTPRequestHandler` tests I saw
`test_http_multiva_threshold_pass` fail 2 builds. After switching to
this branch's UA-based `BouncerHTTPRequestHandler` 10/10 builds passed.

Resolves https://github.com/letsencrypt/boulder/issues/4147
2019-04-18 17:06:24 -07:00
Jacob Hoffman-Andrews 4e20c83d96 Deprecate renewal rate limiting feature flags (#4161) 2019-04-17 12:39:08 -07:00
Roland Bracewell Shoemaker 7ff30cf857 Remove account ID in WFE2 if feature enabled (#4160)
This helps encourage ACME client developers to use the RFC 8555
specified `Location` header instead.

Fixes #4136.
2019-04-17 13:38:25 -04:00
Jacob Hoffman-Andrews 8f578f3a93
Improve integration tests (#4143)
- Move fakeclock, get_future_output, and random_domain to helpers.py.
- Remove tempdir handling from integration-test.py since it's already
  done in helpers.py
- Consolidate handling of config dir into helpers.py, and add
  CONFIG_NEXT boolean.
- Move RevokeAtRA config gating into verify_revocation to reduce
  redundancy.
- Skip load-balancing test when filter is enabled.
- Ungate test_sct_embedding
- Rework test_ct_submissions, which was out of date. In particular, have a couple of
  logs where submitFinalCert: false, and make ct-test-srv store submission counts
  by hostnames for better test case isolation.
2019-04-04 10:59:38 -07:00
Daniel McCarney 063a98f02a
VA: additional feature flag control for multiVA. (#4122)
* `EnforceMultiVA` to allow configuring multiple VAs but not changing the primary VA's result based on what the remote VAs return.
* `MultiVAFullResults` to allow collecting all of the remote VA results. When all results are collected a JSON log line with the differential between the primary/remote VAs is logged.

Resolves https://github.com/letsencrypt/boulder/issues/4066
2019-03-25 12:23:53 -04:00
Jacob Hoffman-Andrews d1e6d0f190 Remove TLS-SNI-01 (#4114)
* Remove the challenge whitelist
* Reduce the signature for ChallengesFor and ChallengeTypeEnabled
* Some unit tests in the VA were changed from testing TLS-SNI to testing the same behavior
  in TLS-ALPN, when that behavior wasn't already tested. For instance timeouts during connect 
  are now tested.

Fixes #4109
2019-03-15 09:05:24 -04:00
Roland Bracewell Shoemaker 51f29b9953
Implement WFE retrieval logic for v2 authorizations (#4085)
This changeset implements the logic required for the WFE to retrieve v2 authorizations and their associated challenges while still maintaining the logic to retrieve old authorizations/challenges. Challenge IDs for v2 authorizations are obfuscated using a pretty simply scheme in order to prevent hard coding of indexes. A `V2` field is added to the `core.Authorization` object and populated using the existing field of the same name from the protobuf for convenience. v2 authorizations and challenges use a `v2` prefix in all their URLs in order to easily differentiate between v1 and v2 URLs (e.g. `/acme/authz/v2/asd` and `/acme/challenge/v2/asd/123`), once v1 authorizations cease to exist this prefix can be safely removed. As v2 authorizations use int IDs this change switches from string IDs to int IDs, this mainly only effects tests.

Integration tests are put off for #4079 as they really need #4077 and #4078 to be properly effective.

Fixes #4041.
2019-02-26 13:14:05 -08:00
Daniel McCarney 279947ade2 CI/Devenv: restore 20s RA->VA timeout. (#4084)
I tried dropping the RA->VA timeout to make the
`test_http_challenge_timeout` integration test faster. It seems to flake
in CI so I'm restoring the original 20s timeout. This makes
`test_http_challenge_timeout` slower but c'est la vie.
2019-02-22 08:53:18 -08:00
Daniel McCarney 16e464a37d RA: apply certificate rate limits at NewOrder time. (#4074)
If an order for a given set of names will fail finalization because of certificate rate limits (certs per domain, certs per fqdn set) there isn't any point in allowing an order for those names to be created. We can stop a lot of requests earlier by enforcing the cert rate limits at new order time as well as finalization time. A new RA `EarlyOrderRateLimit` feature flag controls whether this is done or not.

Resolves #3975
2019-02-21 11:02:40 -08:00
Daniel McCarney d0b6524fa2 SA: Set issuedNames renewal bit in AddCertificate. (#4059)
The `renewal` field of the `issuedNames` table is indexed
to allow more efficient processing of rate limit decisions
w.r.t. renewals for the certificates per domain rate limit (see
https://github.com/letsencrypt/boulder/pull/3178). Before we can start
using this field in rate limit calculations we need to populate it
appropriately. A new `SetIssuedNamesRenewalBit` feature flag for the SA
controls whether we do so or not.

Resolves https://github.com/letsencrypt/boulder/issues/4008
2019-02-18 22:27:56 -08:00
Daniel McCarney 3324989205 CI/Dev: Increase RA->VA timeout to 8s. (#4062)
There has been some flakyness in CI related to RA->VA timeouts.
2019-02-15 13:38:12 -08:00
Roland Bracewell Shoemaker 3e54cea295 Implement direct revocation at RA (#4043)
Implements a feature that enables immediate revocation instead of marking a certificate revoked and waiting for the OCSP-Updater to generate the OCSP response. This means that as soon as the request returns from the WFE the revoked OCSP response should be available to the user. This feature requires that the RA be configured to use the standalone Akamai purger service.

Fixes #4031.
2019-02-14 14:47:42 -05:00
Daniel McCarney 1c0be52e53 VA: Add integration test for HTTP timeouts. (#4050)
Also update `TestHTTPTimeout` to test with the `SimplifiedVAHTTP`
feature flag enabled.
2019-02-12 13:42:01 -08:00
Roland Bracewell Shoemaker 046955e99c Add a standalone akamai purger service (#4040)
Fixes #4030.
2019-02-05 09:00:31 -08:00
Jacob Hoffman-Andrews 92e8e1708a Update config and config-next challenge settings. (#4017)
- Allow tls-alpn-01 challenge in config.
- Disallow tls-sni-01 challenge in config-next.
- Remove gating of tls-alpn integration test.
- Remove TLSSNIRevalidation in config-next.
2019-01-18 10:30:38 -08:00
Jacob Hoffman-Andrews bb15685a0f
No new certificates tick (#4012)
Since #2633 we generate OCSP at first issuance, so we no longer need 
this loop to check for new certificates that need OCSP status generated.
Since the associate SQL query is slow, we should just turn it off.

Also remove the configuration fields for the MissingSCTTick. The code
for that was already deleted.
2019-01-17 14:43:06 -08:00
Daniel McCarney 413028aa1b
WFE2: Restore HeadNonceStatusOK feat. in conf-next (#4011)
This was accidentally removed in 842739bc but hasn't been deployed to
staging/prod yet.
2019-01-16 14:56:47 -05:00
Jacob Hoffman-Andrews cdc01df24f Revert "Increase default MaxIdleConns. (#3164)" (#4007)
This reverts commit 600640294d,
removing the maxIdleDBConns config setting.
2019-01-16 08:41:21 -05:00
Roland Bracewell Shoemaker 842739bccd Remove deprecated features that have been purged from prod and staging configs (#4001) 2019-01-15 16:16:35 -08:00
Daniel McCarney 52395a061a WFE2: Remove ACME13KeyRollover feature and legacy code. (#3999)
The draft 13 ACME v2 key change behaviour is now mandatory and we can remove the legacy WFE2 code.
2019-01-15 15:40:01 -08:00
Daniel McCarney b0f407dcf0 RA: Remove deprecated UpdateAuthorization RPC. (#3993)
Staging and prod both deployed the PerformValidationRPC feature flag. All running WFE/WFE2 instances are using the more accurately named PerformValidation RPC and we can strip out the old UpdateAuthorization bits. The feature flag for PerformValidationRPC remains until we clean up the staging/prod configs.

Resolves #3947 and completes the last of #3930
2019-01-07 16:35:27 -08:00
Daniel McCarney e5b8d530b7
wfe2: Return Status 200 for HEAD to new-nonce endpoint. (#3992)
Previously we mistakenly returned status 204 (no content) for all
requests to new-nonce, including HEAD. This status should only be used
for GET requests.

When the `HeadNonceStatusOK` feature flag is enabled we will now return
the correct status for HEAD requests. When the flag is disabled we return
status 204 to preserve backwards compatibility.
2019-01-07 12:58:30 -05:00
Jacob Hoffman-Andrews 9e39680e3f Enable blocking profiles. (#3957)
Per https://golang.org/pkg/runtime/#SetBlockProfileRate, to use blocking
profiling we need to set a profile rate. This enables it at once per
second. This might help us debug the problem where Publisher sometimes
stalls on submitting to all logs.
2018-12-03 10:25:45 -08:00
Daniel McCarney 8f5de538c1
RA: Add PerformValidation RPC to replace UpdateAuthorization. (#3942)
The existing RA `UpdateAuthorization` RPC needs replacing for
two reasons:

1. The name isn't accurate - `PerformValidation` better captures
the purpose of the RPC.
2. The `core.Challenge` argument is superfluous since Key 
Authorizations are not sent in the initiation POST from the client 
anymore. The corresponding unmarshal and verification is now 
removed. Notably this means broken clients that were POSTing
the wrong thing and failing pre-validation will now likely fail 
post-validation.

To remove `UpdateAuthorization` the new `PerformValidation` 
RPC is added alongside the old one. WFE and WFE2 are 
updated to use the new RPC when the perform validation
feature flag is enabled. We can remove 
`UpdateAuthorization` and its associated wrappers once all 
WFE instances have been updated.

Resolves https://github.com/letsencrypt/boulder/issues/3930
2018-11-28 10:12:47 -05:00
Roland Bracewell Shoemaker ba7a8e8e5d Add fake Akamai purge server for integration testing (#3946)
Fixes #3916.
2018-11-27 09:49:05 -05:00
Daniel McCarney d9d2f4e9b0
VA: Simplified HTTP-01 w/ IP address URLs (#3939)
Continued bugs from the custom dialer approach used by the VA for HTTP-01 (most recently https://github.com/letsencrypt/boulder/issues/3889) motivated a rewrite.

Instead of using a custom dialer to be able to control DNS resolution for HTTP validation requests we can construct URLs for the IP addresses we resolve and overload the Host header. This avoids having to do address resolution within the dialer and eliminates the complexity of the dialer `addrInfoChan`. The only thing left for our custom dialer now is to shave some time off of the provided context to help us discern timeouts before/after connect.

The existing IP preference & fallback behaviour is preserved: e.g. if a host has both IPv6 and IPv4 addresses we connect to the first IPv6 address. If there is a network error connecting to that address (e.g. an error during "dial"), we try once more with the first IPv4 address. No other retries are done. Matching existing behaviour no fallback is done for HTTP level failures on an IPv6 address (e.g. mismatched webroots, redirect loops, etc). A new Prometheus counter "http01_fallbacks" is used to keep track of the number of fallbacks performed.

As a result of moving the layer at which the retry happens a fallback like described above will now produce two validation records: one for the initial IPv6 connection, and one for the IPv4 connection. Neither will have the "addressesTried" field populated, just "addressesResolved" and "addressUsed". Previously with the dialer doing the retry we would have created just one validation record with an IPv4 "addressUsed" field and both an IPv6 and IPv4 address in the "addressesTried" field.

Because this is a big diff for a key part of the VA the new code is gated by the `SimplifiedVAHTTP` feature flag.

Resolves #3889
2018-11-19 14:15:39 -05:00
Jacob Hoffman-Andrews a8ebe1225a Remove userNotice from config-next. (#3922) 2018-11-05 08:10:24 -05:00
Jacob Hoffman-Andrews 4670be1210 Reduce log level for WFE in tests. (#3918)
Our Travis output is quite verbose with the WFE output, and it's very
rare that we have to reference it. I'd like to remove the INFO-level
logs (i.e. the logs of every request) so that it's easier to see real
errors, and faster to scroll to the bottom of logs of failed runs.
2018-11-01 09:50:41 -04:00
Jacob Hoffman-Andrews 48103af5b1 Add timeout to ocsp-responder (#3892)
Right now if ocsp-responder gets flooded with traffic, it will have a number of requests that
spend long enough waiting for an available connection that the reverse proxy will have given
up on them before they get a chance to execute the SQL query. Add a timeout parameter so
ocsp-responder can gracefully shed this load rather than try to do pointless work.
2018-10-22 09:20:08 -04:00
Daniel McCarney 3e61513364
config, config-next: remove deprecated ocsp-updater fields (#3884) 2018-10-12 13:52:15 -04:00
Roland Bracewell Shoemaker 484fd31460 Probe logs from inside the publisher (#3873)
Does a simpler probe than compared to using a `blackbox_exporter`, but directly collects the info we think will aid debugging publisher outages.

Updates #3821.
2018-09-27 14:42:26 -04:00
Roland Bracewell Shoemaker 196f019851 Add support for temporal CT logs (#3853)
Required a little bit of rework of the RA issuance flow (to add parsing of the precert to determine the expiration date, and moving final cert parsing before final cert submission) and RA tests, but I think it shouldn't create any issues...

Fixes #3197.
2018-09-14 16:14:42 -07:00
Daniel McCarney d39babdcf3
RA: Remove vestigial DNS config/setup. (#3854)
In db01b0b we removed email validation from the RA. This was the only
use of the `bdns` package by the RA and so we can go one step further
and delete the remaining setup, configuration and `bdns` fields.
2018-09-13 13:39:23 -04:00
Roland Bracewell Shoemaker 9b94d4fdfe Add a orphan queue to the CA (#3832)
Retains the existing logging of orphaned certs until we are confident that this
solution can fully replace it (even then we may want to keep it just for auditing etc).

Fixes #3636.
2018-09-05 11:12:07 -07:00
Roland Bracewell Shoemaker 876c727b6f Update gRPC (#3817)
Fixes #3474.
2018-08-20 10:55:42 -04:00
Roland Bracewell Shoemaker 3a8f0bc0be Allow ocsp-responder to filter requests by serial prefix (#3815) 2018-08-10 11:16:22 -04:00
Daniel McCarney 0cb28c9e02
WFE2: Implement draft-13 keyrollover with feature flag. (#3813)
ACME draft-13 changed the inner JWS' body to contain the old key
that signed the outer JWS. Because this is a backwards incompatible
change with the draft-12 ACME v2 key rollover we introduce a new feature
flag `features.ACME13KeyRollover` and conditionally use the old or new
key rollover semantics based on its value.
2018-08-07 15:27:25 -04:00
Roland Bracewell Shoemaker b5f7c62460 Remove leftover publisher CT config (#3803) 2018-07-27 08:05:51 -04:00
Jacob Hoffman-Andrews 36a83150ad Add stagger to CT log submissions. (#3794)
This allows each log a chance to respond before we move onto the next,
spreading our load more evenly across the logs in a log group.
2018-07-06 16:25:51 -04:00
Daniel McCarney bbf0102cdc
Remove UseAIAIssuerURL feature flag and code. (#3790)
We aren't going to deploy this as-is and its causing integration test
problems for downstream clients.
2018-07-03 16:29:44 -04:00
Roland Bracewell Shoemaker e27f370fd3 Excise code relating to pre-SCT embedding issuance flow (#3769)
Things removed:

* features.EmbedSCTs (and all the associated RA/CA/ocsp-updater code etc)
* ca.enablePrecertificateFlow (and all the associated RA/CA code)
* sa.AddSCTReceipt and sa.GetSCTReceipt RPCs
* publisher.SubmitToCT and publisher.SubmitToSingleCT RPCs

Fixes #3755.
2018-06-28 08:33:05 -04:00
Joel Sing 9c2859c87b Add support for CAA account-uri validation. (#3736)
This adds support for the account-uri CAA parameter as specified by
section 3 of https://tools.ietf.org/html/draft-ietf-acme-caa-04, allowing
issuance to be restricted to one or more ACME accounts as specified by CAA
records.
2018-06-08 12:08:03 -07:00
Maciej Dębski bb9ddb124e Implement TLS-ALPN-01 and integration test for it (#3654)
This implements newly proposed TLS-ALPN-01 validation method, as described in
https://tools.ietf.org/html/draft-ietf-acme-tls-alpn-01 This challenge type is disabled 
except in the config-next tree.
2018-06-06 13:04:09 -04:00
Daniel McCarney 783784b680 SA: Enable OrderReadyStatus feature flag in config-next. (#3738)
We landed this feature flag disabled pending Certbot's acme library
supporting this status value. That work has landed and so we can enable
this feature in `config-next` ahead of a staging/prod rollout.
2018-05-29 10:32:58 -07:00
Joel Sing 2540d59296 Implement CAA validation-methods checking. (#3716)
When performing CAA checking respect the validation-methods parameter (if
present) and restrict the allowed authorization methods to those specified.
This allows a domain to restrict authorization methods that can be used with
Let's Encrypt.

This is largely based on PR #3003 (by @lukaslihotzki), which was landed and
then later reverted due to issue #3143. The bug the resulted in the previous
code being reverted has been addressed (likely inadvertently) by 76973d0f.

This implementation also includes integration tests for CAA validation-methods.

Fixes issue #3143.
2018-05-23 14:32:31 -07:00
Jacob Hoffman-Andrews dbcb16543e Start using multiple-IP hostnames for load balancing (#3687)
We'd like to start using the DNS load balancer in the latest version of gRPC. That means putting all IPs for a service under a single hostname (or using a SRV record, but we're not taking that path). This change adds an sd-test-srv to act as our service discovery DNS service. It returns both Boulder IP addresses for any A lookup ending in ".boulder". This change also sets up the Docker DNS for our boulder container to defer to sd-test-srv when it doesn't know an answer.

sd-test-srv doesn't know how to resolve public Internet names like `github.com`. Resolving public names is required for the `godep-restore` test phase, so this change breaks out a copy of the boulder container that is used only for `godep-restore`.

This change implements a shim of a DNS resolver for gRPC, so that we can switch to DNS-based load balancing with the currently vendored gRPC, then when we upgrade to the latest gRPC we won't need a simultaneous config update.

Also, this change introduces a check at the end of the integration test that each backend received at least one RPC, ensuring that we are not sending all load to a single backend.
2018-05-23 09:47:14 -04:00
Daniel McCarney 5597a77ba2
WFE2: Allow legacy Key ID prefix for ACME v2 JWS. (#3705)
While we intended to allow legacy ACME v1 accounts created through the WFE to work with the ACME v2 implementation and the WFE2 we neglected to consider that a legacy account would have a Key ID URL that doesn't match the expected for a V2 account. This caused `wfe2/verify.go`'s `lookupJWK` to reject all POST requests authenticated by a legacy account unless the ACME client took the extra manual step of "fixing" the URL.

This PR adds a configuration parameter to the WFE2 for an allowed legacy key ID prefix. The WFE2 verification logic is updated to allow both the expected key ID prefix and the configured legacy key ID prefix. This will allow us to specify the correct legacy URL in configuration for both staging/prod to allow unmodified V1 ACME accounts to be used with ACME v2.

Resolves https://github.com/letsencrypt/boulder/issues/3674
2018-05-11 15:57:56 -04:00
Daniel McCarney 76a3f4a18f RA/CA: Use `doNotForceCN: false` for `test/config`. (#3698)
In staging/prod we use `doNotForceCN: false` for both the RA & CA
config. Switching this to `true` is blocked on CABF work that will
likely take considerable time.

In the short-term we should use `doNotForceCN: false` in `test/config`
and only use `doNotForceCN: true` in `test/config-next`.
2018-05-09 12:54:16 -07:00
Jacob Hoffman-Andrews a4421ae75b Run gRPC backends on multiple IPs instead of multiple ports (#3679)
We're currently stuck on gRPC v1.1 because of a breaking change to certificate validation in gRPC 1.8. Our gRPC balancer uses a static list of multiple hostnames, and expects to validate against those hostnames. However gRPC expects that a service is one hostname, with multiple IP addresses, and validates all those IP addresses against the same hostname. See grpc/grpc-go#2012.

If we follow gRPC's assumptions, we can rip out our custom Balancer and custom TransportCredentials, and will probably have a lower-friction time in general.

This PR is the first step in doing so. In order to satisfy the "multiple IPs, one port" property of gRPC backends in our Docker container infrastructure, we switch to Docker's user-defined networking. This allows us to give the Boulder container multiple IP addresses on different local networks, and gives it different DNS aliases in each network.

In startservers.py, each shard of a service listens on a different DNS alias for that service, and therefore a different IP address. The listening port for each shard of a service is now identical.

This change also updates the gRPC service certificates. Now, each certificate that is used in a gRPC service (as opposed to something that is "only" a client) has three names. For instance, sa1.boulder, sa2.boulder, and sa.boulder (the generic service name). For now, we are validating against the specific hostnames. When we update our gRPC dependency, we will begin validating against the generic service name.

Incidentally, the DNS aliases feature of Docker allows us to get rid of some hackery in entrypoint.sh that inserted entries into /etc/hosts.

Note: Boulder now has a dependency on the DNS aliases feature in Docker. By default, docker-compose run creates a temporary container and doesn't assign any aliases to it. We now need to specify docker-compose run --use-aliases to get the correct behavior. Without --use-aliases, Boulder won't be able to resolve the hostnames it wants to bind to.
2018-05-07 10:38:31 -07:00
Daniel McCarney 054f181458 load-generator: send correct ACMEv2 Content-Type on POST (#3667)
load generator: send correct ACMEv2 Content-Type on POST.

This PR updates the Boulder load-generator to send the correct ACMEv2 Content-Type header when POSTing the ACME server. This is required for ACMEv2 and without it all POST requests to the WFE2 running a test/config-next configuration result in malformed 400 errors. While only required by ACMEv2 this commit sends it for ACMEv1 requests as well. No harm no foul.

integration tests: allow running just the load generator.
Prior to this PR an omission in an if statement in integration-test.py meant that you couldn't invoke test/integration-test.py with just the --load argument to only run the load generator. This commit updates the if to allow this use case.
2018-05-01 12:22:43 -07:00
Roland Bracewell Shoemaker 0a86573a73 Update integration tests 2018-04-20 13:18:40 -07:00
Roland Bracewell Shoemaker ebc86fd778 Fix CT submission cancelations (#3658)
When the WFE calls the RA the RA creates a sub context which is cancelled when
the RPC returns. Because we were spawning the publisher RPC calls in a
goroutine with the context from ra.IssueCertificate as soon as
ra.IssueCertificate returned that context was being canceled which in turn
canceled the publisher RPC calls.

Instead of using the RA RPC context simply use a `context.Background()` so
that the RPC context doesn't break these submissions. Also return to
pre-features.CancelCTSubmissions behavior where precert submissions would
be canceled once we retrieved SCTs from the winning logs instead of relying
on the magic behavior of the RA RPC canceling them itself.
2018-04-20 11:26:02 -04:00
Daniel McCarney f8f9a158c7 orphan-finder: set cert issued date based on notbefore. (#3651)
The Boulder orphan-finder command uses the SA's AddCertificate RPC to add orphaned certificates it finds back to the DB. Prior to this commit this RPC always set the core.Certificate.Issued field to the
current time. For the orphan-finder case this meant that the Issued date would incorrectly be set to when the certificate was found, not when it was actually issued. This could cause cert-checker to alarm based on the unusual delta between the cert NotBefore and the core.Certificate.Issued value.

This PR updates the AddCertificate RPC to accept an optional issued timestamp in the request arguments. In the SA layer we address deployability concerns by setting a default value of the current time when none is explicitly provided. This matches the classic behaviour and will let an old RA communicate with a new SA.

This PR updates the orphan-finder to provide an explicit issued time to sa.AddCertificate. The explicit issued time is calculated using the found certificate's NotBefore and the configured backdate.
This lets the orphan-finder set the true issued time in the core.Certificate object, avoiding any cert-checker alarms.

Resolves #3624
2018-04-19 10:25:12 -07:00
Roland Bracewell Shoemaker 1271a15be7 Submit final certs to CT logs (#3640)
Submits final certificates to any configured CT logs. This doesn't introduce a feature flag as it is config gated, any log we want to submit final certificates to needs to have it's log description updated to include the `"submitFinalCerts": true` field.

Fixes #3605.
2018-04-13 12:02:01 -04:00
Jacob Hoffman-Andrews 2a1cd4981a Allow configuring gRPC's MaxConcurrentStreams (#3642)
During periods of peak load, some RPCs are significantly delayed (on the order of seconds) by client-side blocking. HTTP/2 clients have to obey a "max concurrent streams" setting sent by the server. In Go's HTTP/2 implementation, this value [defaults to 250](https://github.com/golang/net/blob/master/http2/server.go#L56), so the gRPC default is also 250. So whenever there are more than 250 requests in progress at a time, additional requests will be delayed until there is a slot available.

During this peak load, we aren't hitting limits on CPU or memory, so we should increase the max concurrent streams limit to take better advantage of our available resources. This PR adds a config field to do that.

Fixes #3641.
2018-04-12 17:17:17 -04:00
Daniel 14689d8598
config-next: disable `OrderReadyStatus` feature flag.
This commit disables the `OrderReadyStatus` feature flag in
`test/config-next/sa.json`. Certbot's ACME implementation breaks when
this flag is enabled (See
https://github.com/certbot/certbot/issues/5856). Since Certbot runs
integration tests against Boulder with config-next we should be
courteous and leave this flag disabled until we are closer to being able
to turn it on for staging/prod.
2018-04-12 13:24:38 -04:00
Daniel ac6672bc71
Revert "Revert "V2: implement "ready" status for Order objects (#3614)" (#3643)"
This reverts commit 3ecf841a3a.
2018-04-12 13:20:47 -04:00
Jacob Hoffman-Andrews 3ecf841a3a Revert "V2: implement "ready" status for Order objects (#3614)" (#3643)
This reverts commit 1d22f47fa2.

According to
https://github.com/letsencrypt/boulder/pull/3614#issuecomment-380615172,
this broke Certbot's tests. We'll investigate, and then roll forward
once we understand what broke.
2018-04-12 10:46:57 -04:00
Daniel McCarney 1d22f47fa2 V2: implement "ready" status for Order objects (#3614)
* SA: Add Order "Ready" status, feature flag.

This commit adds the new "Ready" status to `core/objects.go` and updates
`sa.statusForOrder` to use it conditionally for orders with all valid
authorizations that haven't been finalized yet. This state is used
conditionally based on the `features.OrderReadyStatus` feature flag
since it will likely break some existing clients that expect status
"Processing" for this state. The SA unit test for `statusForOrder` is
updated with a "ready" status test case.

* RA: Enforce order ready status conditionally.

This commit updates the RA to conditionally expect orders that are being
finalized to be in the "ready" status instead of "pending". This is
conditionally enforced based on the `OrderReadyStatus` feature flag.
Along the way the SA was changed to calculate the order status for the
order returned in `sa.NewOrder` dynamically now that it could be
something other than "pending".

* WFE2: Conditionally enforce order ready status for finalization.

Similar to the RA the WFE2 should conditionally enforce that an order's
status is either "ready" or "pending" based on the "OrderReadyStatus"
feature flag.

* Integration: Fix `test_order_finalize_early`.

This commit updates the V2 `test_order_finalize_early` test for the
"ready" status. A nice side-effect of the ready state change is that we
no longer invalidate an order when it is finalized too soon because we
can reject the finalization in the WFE. Subsequently the
`test_order_finalize_early` testcase is also smaller.

* Integration: Test classic behaviour w/o feature flag.

In the previous commit I fixed the integration test for the
`config/test-next` run that has the `OrderReadyStatus` feature flag set
but broke it for the `config/test` run without the feature flag.

This commit updates the `test_order_finalize_early` test to work
correctly based on the feature flag status in both cases.
2018-04-11 10:31:25 -07:00
Jacob Hoffman-Andrews a4f9de9e35 Improve nesting of RPC deadlines (#3619)
gRPC passes deadline information through the RPC boundary, but client and server have the same deadline. Ideally we'd like the server to have a slightly tighter deadline than the client, so if one of the server's onward RPCs or other network calls times out, the server can pass back more detailed information to the client, rather than the client timing out the server and losing the opportunity to log more detailed information about which component caused the timeout.

In this change, I subtract 100ms from the deadline on the server side of our interceptors, using our existing serverInterceptor. I also check that there is at least 100ms remaining in which to do useful work, so the server doesn't begin a potentially expensive task only to abort it.

Fixes #3608.
2018-04-06 15:40:18 +01:00
Roland Bracewell Shoemaker cc5ec34539 Allow configuration of multiple DNS resolvers (#3612)
* Allow configuration of multiple DNS resolvers
* Use multiple DNS resolvers in integration tests

Fixes #3611.
2018-04-05 11:51:22 -04:00
Daniel McCarney 17922a6d2d
Add CAAIdentities and Website to /directory "meta". (#3588)
This commit updates the WFE and WFE2 to have configuration support for
setting a value for the `/directory` object's "meta" field's
optional "caaIdentities" and "website" fields. The config-next wfe/wfe2
configuration are updated with values for these fields. Unit tests are
updated to check that they are sent when expected and not otherwise.

Bonus content: The `test.AssertUnmarshaledEquals` function had a bug
where it would consider two inputs equal when the # of keys differed.
This commit also fixes that bug.
2018-03-22 16:12:43 -04:00
Jacob Hoffman-Andrews 65b88a8dbc Run certlint in cert-checker (#3550)
This pulls in the certlint dependency, which in turn pulls in publicsuffix as a dependency.

Fixes #3549
2018-03-15 17:42:58 +00:00
Jacob Hoffman-Andrews 700604dda1
Overlapping wildcard errors are 400, not 500. (#3561)
Return a malformed error for these requests. Also add an integration test.

Fixes #3558
2018-03-14 13:18:25 -07:00
Daniel McCarney 5172f03865
Enable WFE2 "EnforceV2ContentType" flag in config-next. (#3562)
We shipped this feature flag disabled because at the time Certbot's acme
module had a bug with revocation that sent the wrong content-type. That
has been fixed upstream and the current boulder-tools image has the fix.
We should be able to turn this flag on now for config-next without
breaking tests.
2018-03-14 16:09:58 -04:00
Roland Bracewell Shoemaker 9c9e944759 Add SCT embedding (#3521)
Adds SCT embedding to the certificate issuance flow. When a issuance is requested a precertificate (the requested certificate but poisoned with the critical CT extension) is issued and submitted to the required CT logs. Once the SCTs for the precertificate have been collected a new certificate is issued with the poison extension replace with a SCT list extension containing the retrieved SCTs.

Fixes #2244, fixes #3492 and fixes #3429.
2018-03-12 11:58:30 -07:00
Jacob Hoffman-Andrews 11434650b7 Check safe browsing at validation time (#3539)
Right now we check safe browsing at new-authz time, which introduces a possible
external dependency when calling new-authz. This is usually fine, since most safe
browsing checks can be satisfied locally, but when requests have to go external,
it can create variance in new-authz timing.

Fixes #3491.
2018-03-09 11:15:05 +00:00
Daniel McCarney 28cc969814 Remove TLS-SNI-02 implementation. (#3516)
This code was never enabled in production. Our original intent was to
ship this as part of the ACMEv2 API. Before that could happen flaws were
identified in TLS-SNI-01|02 that resulted in TLS-SNI-02 being removed
from the ACME protocol. We won't ever be enabling this code and so we
might as well remove it.
2018-03-02 10:56:13 -08:00
Roland Bracewell Shoemaker 0b53063a72 ctpolicy: Add informational logs and don't cancel remaining submissions (#3472)
Add a set of logs which will be submitted to but not relied on for their SCTs,
this allows us to test submissions to a particular log or submit to a log which
is not yet approved by a browser/root program.

Also add a feature which stops cancellations of remaining submissions when racing
to get a SCT from a group of logs.

Additionally add an informational log that always times out in config-next.

Fixes #3464 and fixes #3465.
2018-02-23 21:51:50 -05:00
Jacob Hoffman-Andrews 92c9340fe8 Deprecate CountCertificatesExact. (#3462)
This is now enabled in prod and we can make it the default.
2018-02-20 14:34:03 -08:00
Roland Bracewell Shoemaker 8446571b46 Remove EnforceChallengeDisable (#3444)
Removes usage of the `EnforceChallengeDisable` feature, the feature itself is not removed as it is still configured in staging/production, once that is fixed I'll submit another PR removing the actual flag.

This keeps the behavior that when authorizations are retrieved from the SA they have their challenges populated, because that seems to make the most sense to me? It also retains TLS re-validation.

Fixes #3441.
2018-02-14 13:21:26 -08:00
Jacob Hoffman-Andrews c556a1a20d
Reduce spurious errors in integration test (#3436)
Boulder is fairly noisy about gRPC connection errors. This is a mixed
blessing: Our gRPC configuration will try to reconnect until it hits
an RPC deadline, and most likely eventually succeed. In that case,
we don't consider those to really be errors. However, in cases where
a connection is repeatedly failing, we'd like to see errors in the
logs about connection failure, rather than "deadline exceeded." So
we want to keep logging of gRPC errors.

However, right now we get a lot of these errors logged during
integration tests. They make the output hard to read, and may disguise
more serious errors. So we'd like to avoid causing such errors in
normal integration test operation.

This change reorders the startup of Boulder components by their gRPC
dependencies, so everything's backend is likely to be up and running
before it starts. It also reverses that order for clean shutdowns,
and waits for each process to exit before signalling the next one.

With these changes, I still got connection errors. Taking listenbuddy
out of the gRPC path fixed them. I believe the issue is that
listenbuddy is not a truly transparent proxy. In particular, it
accepts an inbound TCP connection before opening an outbound TCP
connection. If opening that outbound connection results in "connection
refused," it closes the inbound connection. That means gRPC sees a
"connection closed" (or "connection reset"?) rather than "connection
refused". I'm guessing it handles those cases differently, explaining
the different error results.

We've been using listenbuddy to trigger disconnects while Boulder is
running, to ensure that gRPC's reconnect code works. I think we can
probably rely on gRPC's reconnect to work. The initial problem that
led us to start testing this was a configuration problem; now that
we have the configuration we want, we should be fine and don't need
to keep testing reconnects on every integration test run.
2018-02-12 18:17:50 -08:00
Jacob Hoffman-Andrews 2dc3b56fa9
Add variable latency to ct-test-srv (#3435)
For the upcoming SCT embedding changes, it will be useful to have a CT test server that blocks for nontrivial amounts of time before responding. This change introduces a config file for `ct-test-srv` that can be used to set up multiple "personalities" on various ports. Each personality can have a "latencySchedule" that determines how long it will sleep before servicing responding to a submission.

This change also introduces two new "personalities" on :4510 and :4511, plus configures CTLogGroups in the RA. Having four CT log personalities allows us to simulate two nontrivial log groups.

Note: This triggers Publisher to emit audit errors on timed-out submissions. We may want to make Publisher not treat those as errors, and instead only log an error if a whole log group fails.
2018-02-09 13:48:19 -08:00
Roland Bracewell Shoemaker fc5c8f76b6 Remove unused features (#3393)
This removes a number of unused features (i.e. they are never checked anywhere).
2018-01-25 08:55:05 -05:00
Daniel McCarney d6a33d1108 Return full cert chain for V2 cert GET. (#3366)
This commit implements a mapping from certificate AIA Issuer URL to PEM
encoded certificate chain. GET's to the V2 Certificate endpoint will
return a full PEM encoded certificate chain in addition to the leaf cert
using the AIA issuer URL of the leaf cert and the configured mapping.

The boulder-wfe2 command builds the chain mapping by reading the
"wfe" config section's 'certificateChains" field, specifying a list
of file paths to PEM certificates for each AIA issuer URL. At startup
the PEM file contents are ready, verified and separated by a newline.
The resulting populated AIA issuer URL -> PEM cert chain mapping is
given to the WFE for use with the Certificate endpoint.

Resolves #3291
2018-01-19 11:23:44 -08:00
Daniel McCarney ba264a5091 Remove unused WFE2 feature flags. (#3375)
The WFE2 doesn't check any of the feature flags that are configured in
the `test/config/wfe2.json` and `test/config-next/wfe2.json` config
files - we default to acting as if all new features are enabled for the
V2 work. This commit removes the flags from the config to avoid
confusion or expectations that changing the config will disable the
features.
2018-01-17 12:28:19 -08:00
Daniel McCarney c6d56b7a84 Match RA `authorizationLifetimeDays` to prod. (#3370) 2018-01-16 10:39:57 -08:00
Jacob Hoffman-Andrews 8153b919be
Implement TLSSNIRevalidation (#3361)
This change adds a feature flag, TLSSNIRevalidation. When it is enabled, Boulder
will create new authorization objects with TLS-SNI challenges if the requesting
account has issued a certificate with the relevant domain name, and was the most
recent account to do so*. This setting overrides the configured list of
challenges in the PolicyAuthority, so even if TLS-SNI is disabled in general, it
will be enabled for revalidation.

Note that this interacts with EnforceChallengeDisable. Because
EnforceChallengeDisable causes additional checked at validation time and at
issuance time, we need to update those two places as well. We'll send a
follow-up PR with that.

*We chose to make this work only for the most recent account to issue, even if
there were overlapping certificates, because it significantly simplifies the
database access patterns and should work for 95+% of cases.

Note that this change will let an account revalidate and reissue for a domain
even if the previous issuance on that account used http-01 or dns-01. This also
simplifies implementation, and fits within the intent of the mitigation plan: If
someone previously issued for a domain using http-01, we have high confidence
that they are actually the owner, and they are not going to "steal" the domain
from themselves using tls-sni-01.

Also note: This change also doesn't work properly with ReusePendingAuthz: true.
Specifically, if you attempted issuance in the last couple days and failed
because there was no tls-sni challenge, you'll still have an http-01 challenge
lying around, and we'll reuse that; then your client will fail due to lack of
tls-sni challenge again.

This change was joint work between @rolandshoemaker and @jsha.
2018-01-12 11:00:06 -08:00
Maciej Dębski 44984cd84a Implement regID whitelist for allowed challenge types. (#3352)
This updates the PA component to allow authorization challenge types that are globally disabled if the account ID owning the authorization is on a configured whitelist for that challenge type.
2018-01-10 13:44:53 -05:00
Roland Shoemaker 4d7f68de21 Properly flag gate SA authorization challenge population 2018-01-09 20:53:04 -08:00
Roland Shoemaker 1a3a76438c Fix tests and GetOrderAuthorizations 2018-01-09 20:38:52 -08:00
Jacob Hoffman-Andrews 827f7859f2 Fix issuerCert in test configs. (#3310)
Previously, there was a disagreement between WFE and CA as to what the correct
issuer certificate was. Consolidate on test-ca2.pem (h2ppy h2cker fake CA).
    
Also, the CA configs contained an outdated entry for "IssuerCert", which was not
being used: The CA configs now use an "Issuers" array to allow signing by
multiple issuer certificates at once (for instance when rolling intermediates).
Removed this outdated entry, and the config code for CA to load it. I've
confirmed these changes match what is currently in production.

Added an integration test to check for this problem in the future.

Fixes #3309, thanks to @icing for bringing the issue to our attention!

This also includes changes from #3321 to clarify certificates for WFE.
2018-01-09 07:56:39 -05:00
Jacob Hoffman-Andrews fdd854a7e5 Fix various WFE2 bugs. (#3292)
- Encode certificate as PEM.
- Use lowercase for field names.
- Use termsOfServiceAgreed instead of Agreement
- Use a different ToS URL for v2 that points at the v2 HTTPS port.

Resolves #3280
2017-12-19 13:13:29 -08:00
Jacob Hoffman-Andrews cd49316493 Do ROCAChecks by default. (#3283)
This feature flag has been enabled in prod, and we don't expect to want to
turn it off any time soon.
2017-12-15 13:44:39 -08:00
Jacob Hoffman-Andrews f16c3af335 Active UDPDNS by default. (#3285)
Active UDPDNS by default
2017-12-15 12:26:45 -08:00
Roland Bracewell Shoemaker bdea281ae0 Remove CAA SERVFAIL exceptions code (#3262)
Fixes #3080.
2017-12-05 14:39:37 -08:00
Daniel McCarney 1c99f91733 Policy based issuance for wildcard identifiers (Round two) (#3252)
This PR implements issuance for wildcard names in the V2 order flow. By policy, pending authorizations for wildcard names only receive a DNS-01 challenge for the base domain. We do not re-use authorizations for the base domain that do not come from a previous wildcard issuance (e.g. a normal authorization for example.com turned valid by way of a DNS-01 challenge will not be reused for a *.example.com order).

The wildcard prefix is stripped off of the authorization identifier value in two places:

When presenting the authorization to the user - ACME forbids having a wildcard character in an authorization identifier.
When performing validation - We validate the base domain name without the *. prefix.
This PR is largely a rewrite/extension of #3231. Instead of using a pseudo-challenge-type (DNS-01-Wildcard) to indicate an authorization & identifier correspond to the base name of a wildcard order name we instead allow the identifier to take the wildcard order name with the *. prefix.
2017-12-04 12:18:10 -08:00
Daniel McCarney 55dd1020c0 Increase VA SingleDialTimeout to 10s. (#3260)
This PR changes the VA's singleDialTimeout value from 5 * time.Second to 10 * time.Second. This will give slower servers a better chance to respond, especially for the multi-VA case where n requests arrive ~simultaneously.

This PR also bumps the RA->VA timeout by 5s and the WFE->RA timeout by 5s to accommodate the increased dial timeout. I put this in a separate commit in case we'd rather deal with this separately.
2017-12-04 09:53:26 -08:00
Jacob Hoffman-Andrews 2fd2f9e230 Remove LegacyCAA implementation. (#3240)
Fixes #3236
2017-11-20 16:09:00 -05:00
Roland Bracewell Shoemaker d5db80ab12 Various publisher CT fixes (#3219)
Makes a couple of changes:
* Change `SubmitToCT` to make submissions to each log in parallel instead of in serial, this prevents a single slow log from eating up the majority of the deadline and causing submissions to other logs to fail
* Remove the 'submissionTimeout' field on the publisher since it is actually bounded by the gRPC timeout as is misleading
* Add a timeout to the CT clients internal HTTP client so that when log servers hang indefinitely we actually do retries instead of just using the entire submission deadline. Currently set at 2.5 minutes

Fixes #3218.
2017-11-09 10:05:26 -05:00
Jacob Hoffman-Andrews 4296dd985a Use TLS in mailer integration tests (#3213)
* Remove non-TLS support from mailer entirely
* Add a config option for trusted roots in expiration-mailer. If unset, it defaults to the system roots, so this does not need to be set in production.
* Use TLS in mail-test-srv, along with an internal root and localhost certificates signed by that root.
2017-11-06 14:57:14 -08:00
Jacob Hoffman-Andrews 5df083a57e Add ROCA weak key checking (#3189)
Thanks to @titanous for the library!
2017-11-02 08:42:59 -04:00
Jacob Hoffman-Andrews 071fc0120f Remove facebookgo/httpdown. (#3168)
Its purpose is now served by net/http's Shutdown().
2017-10-17 08:55:43 -04:00
Jacob Hoffman-Andrews 600640294d Increase default MaxIdleConns. (#3164)
Go's default is 2: https://golang.org/src/database/sql/sql.go#L686.
Graphs show we are opening 100-200 fresh connections per second on the SA.
Changing this default should reduce that a lot, which should reduce load on both
the SA and MariaDB. This should also improve latency, since every new TCP
connection adds a little bit of latency.
2017-10-16 15:48:17 -07:00
Jacob Hoffman-Andrews 4e68fb2ff6 Switch to udp for internal DNS. (#3135)
We used to use TCP because we would request DNSSEC records from Unbound, and
they would always cause truncated records when present. Now that we no longer
request those (#2718), we can use UDP. This is better because the TCP serving
paths in Unbound are likely less thoroughly tested, and not optimized for high
load. In particular this may resolve some availability problems we've seen
recently when trying to upgrade to a more recent Unbound.

Note that this only affects the Boulder->Unbound path. The Unbound->upstream
path is already UDP by default (with TCP fallback for truncated ANSWERs).
2017-10-10 10:06:33 -04:00
Daniel McCarney 1794c56eb8 Revert "Add CAA parameter to restrict challenge type (#3003)" (#3145)
This reverts commit 23e2c4a836.
2017-10-04 12:00:44 -07:00
Jacob Hoffman-Andrews 8aeb1a6b4d Set parallelism in SA's config-next (#3142) 2017-10-03 20:44:05 -07:00
lukaslihotzki 23e2c4a836 Add CAA parameter to restrict challenge type (#3003)
This commit adds CAA `issue` paramter parsing and the `challenge` parameter to permit a single challenge type only. By setting `challenge=dns-01`, the nameserver keeps control over every issued certificate.
2017-10-02 11:59:47 -07:00
Brian Smith 9d324631a7 CA: Set certificates' validity periods using the CA's clock. (#2983) 2017-09-14 09:40:31 -04:00
Jacob Hoffman-Andrews 4266853092 Implement legacy form of CAA (#3075)
This implements the pre-erratum 5065 version of CAA, behind a feature flag.

This involved refactoring DNSClient.LookupCAA to return a list of CNAMEs in addition to the CAA records, and adding an alternate lookuper that does tree-climbing on single-depth aliases.
2017-09-13 10:16:12 -04:00
Jacob Hoffman-Andrews 8afec60433 Remove unneeded dns config value for WFE. (#3057) 2017-09-08 14:32:36 -04:00
Kleber Correia c2156479dd Remove ResubmitMissingSCTsOnly flag (#3042)
Part of #2692
2017-09-06 10:30:30 -07:00
Jacob Hoffman-Andrews b0c7bc1bee Recheck CAA for authorizations older than 8 hours (#3014)
Fixes #2889.

VA now implements two gRPC services: VA and CAA. These both run on the same port, but this allows implementation of the IsCAAValid RPC to skip using the gRPC wrappers, and makes it easier to potentially separate the service into its own package in the future.

RA.NewCertificate now checks the expiration times of authorizations, and will call out to VA to recheck CAA for those authorizations that were not validated recently enough.
2017-08-28 16:40:57 -07:00
Roland Bracewell Shoemaker 90ba766af9 Add NewOrder RPCs + methods to SA and RA (#2907)
Fixes #2875, #2900 and #2901.
2017-08-11 14:24:25 -04:00
Kleber Correia 338c61171b Remove IDNASupport flag (#2926)
Splitting #2712 into multiple per-flag PRs
2017-08-01 16:51:19 -07:00
Jacob Hoffman-Andrews 3431acfb92 Adjust testing maxNames config to match prod. (#2911) 2017-07-27 15:23:29 -07:00
Jacob Hoffman-Andrews 8bc1db742c Improve recycling of pending authzs (#2896)
The existing ReusePendingAuthz implementation had some bugs:

It would recycle deactivated authorizations, which then couldn't be fulfilled. (#2840)
Since it was implemented in the SA, it wouldn't get called until after the RA checks the Pending Authorizations rate limit. Which means it wouldn't fulfill its intended purpose of making accounts less likely to get stuck in a Pending Authorizations limited state. (#2831)
This factors out the reuse functionality, which used to be inside an "if" statement in the SA. Now the SA has an explicit GetPendingAuthorization RPC, which gets called from the RA before calling NewPendingAuthorization. This happens to obsolete #2807, by putting the recycling logic for both valid and pending authorizations in the RA.
2017-07-26 14:00:30 -07:00
Daniel McCarney 57252c3b07 Remove letsencrypt/go-safe-browsing-api dependency. (#2905)
We have migrated from our fork of `go-safe-browsing-api` to Google's
safebrowsing v4 library. This commit removes the legacy safe browsing
code.
2017-07-26 13:57:57 -07:00
Daniel McCarney bd3e2747ba Duplicate WFE to WFE2. (#2839)
This PR is the initial duplication of the WFE to create a WFE2
package. The rationale is briefly explained in `wfe2/README.md`.

Per #2822 this PR only lays the groundwork for further customization
and deduplication. Presently both the WFE and WFE2 are identical except
for the following configuration differences:

* The WFE offers HTTP and HTTPS on 4000 and 4430 respectively, the WFE2
  offers HTTP on 4001 and 4431.
* The WFE has a debug port on 8000, the WFE2 uses the next free "8000
  range port" and puts its debug service on 8013

Resolves https://github.com/letsencrypt/boulder/issues/2822
2017-07-05 13:32:45 -07:00
Roland Bracewell Shoemaker 088b872287 Implement multi VA validation (#2802)
Adds basic multi-path validation functionality. A new method `performRemoteValidation` is added to `boulder-va` which is called if it is configured with a list of remote VA gRPC addresses. In this initial implementation the remote VAs are only used to check the validation result of the main VA, if all of the remote validations succeed but the local validation failed, the overall validation will still fail. Remote VAs use the exact same code as the local VA to perform validation. If the local validation succeeds then a configured quorum of the remote VA successes must be met in order to fully complete the validation.

This implementation assumes that metrics are collected from the remote VAs in order to have visibility into their individual validation latencies etc.

Fixes #2621.
2017-06-29 14:11:01 -07:00
Daniel McCarney 71f8ae0e87 Improve renewal rate limiting (#2832)
As described in Boulder issue #2800 the implementation of the SA's
`countCertificates` function meant that the renewal exemption for the
Certificates Per Domain rate limit was difficult to work with. To
maximize allotted certificates clients were required to perform all new
issuances first, followed by the "free" renewals. This arrangement was
difficult to coordinate.

In this PR `countCertificates` is updated such that renewals are
excluded from the count reliably. To do so the SA takes the serials it
finds for a given domain from the issuedNames table and cross references
them with the FQDN sets it can find for the associated serials. With the
FQDN sets a second query is done to find all the non-renewal FQDN sets
for the serials, giving a count of the total non-renewal issuances to
use for rate limiting.

Resolves #2800
2017-06-27 15:39:59 -04:00
Jacob Hoffman-Andrews 41df4ae10f Set ReusePendingAuthz in config-next. (#2820) 2017-06-21 09:44:57 -04:00
Roland Bracewell Shoemaker 8ce2f8b432 Basic RSA known weak key checking (#2765)
Adds a basic truncated modulus hash check for RSA keys that can be used to check keys against the Debian `{openssl,openssh,openvpn}-blacklist` lists of weak keys generated during the [Debian weak key incident](https://wiki.debian.org/SSLkeys).

Testing is gated on adding a new configuration key to the WFE, RA, and CA configs which contains the path to a directory which should contain the weak key lists.

Fixes #157.
2017-05-25 09:33:58 -07:00
Jacob Hoffman-Andrews b17b5c72a6 Remove statsd from Boulder (#2752)
This removes the config and code to output to statsd.

- Change `cmd.StatsAndLogging` to output a `Scope`, not a `Statter`.
- Remove the prefixing of component name (e.g. "VA") in front of stats; this was stripped by `autoProm` but now no longer needs to be.
- Delete vendored statsd client.
- Delete `MockStatter` (generated by gomock) and `mocks.Statter` (hand generated) in favor of mocking `metrics.Scope`, which is the interface we now use everywhere.
- Remove a few unused methods on `metrics.Scope`, and update its generated mock.
- Refactor `autoProm` and add `autoRegisterer`, which can be included in a `metrics.Scope`, avoiding global state. `autoProm` now registers everything with the `prometheus.Registerer` it is given.
- Change va_test.go's `setup()` to not return a stats object; instead the individual tests that care about stats override `va.stats` directly.

Fixes #2639, #2733.
2017-05-15 10:19:54 -04:00
Daniel 8c547473b8
Adds "meta" entry w/ ToS to /directory.
This commit adds the acme draft-02+ optional "meta" element for the
/directory response. Presently we only include the optional
"terms-of-service" URL. Whether the meta entry is included is controlled
by two factors:

  1. The state of the "DirectoryMeta" feature flag, which defaults to
     off
  2. Whether the client advertises the UA we know to be intolerant of
     new directory entries.

The TestDirectory unit test is updated to test both states of the flag
and the UA detection.
2017-05-09 15:43:16 -04:00
Roland Bracewell Shoemaker 730318a755 Add GREASE to directory (#2731)
Randomly generates and adds a key to the directory object with the value grease.

Fixes #2415.
2017-05-08 14:13:35 -07:00
Daniel McCarney 47452d6c6c Prefer IPv6 addresses, fall back to IPv4. (#2715)
This PR introduces a new feature flag "IPv6First". 

When the "IPv6First" feature is enabled the VA's HTTP dialer and TLS SNI
(01 and 02) certificate fetch requests will attempt to automatically
retry when the initial connection was to IPv6 and there is an IPv4
address available to retry with.

This resolves https://github.com/letsencrypt/boulder/issues/2623
2017-05-08 13:00:16 -07:00
Daniel McCarney 1ed34a4a5d Fixes cert count rate limit for exact PSL matches. (#2703)
Prior to this PR if a domain was an exact match to a public suffix
list entry the certificates per name rate limit was applied based on the
count of certificates issued for that exact name and all of its
subdomains.

This PR introduces an exception such that exact public suffix
matches correctly have the certificate per name rate limit applied based
on only exact name matches.

In order to accomplish this a new RPC is added to the SA
`CountCertificatesByExactNames`. This operates similar to the existing
`CountCertificatesByNames` but does *not* include subdomains in the
count, only exact matches to the names provided. The usage of this new
RPC is feature flag gated behind the "CountCertificatesExact" feature flag.

The RA unit tests are updated to test the new code paths both with and
without the feature flag enabled.

Resolves #2681
2017-05-02 13:43:35 -07:00
Daniel McCarney e6c63f1f11 Removes notafter-backfiller configs. (#2650)
This commit removes the notafter-backfiller config files. The actual
tool was already removed and these are just cruft.
2017-04-05 18:26:05 -07:00
Jacob Hoffman-Andrews cef0a630b3 Remove old-style gRPC TLS configs (#2495)
* Switch Publisher gRPC to use new "tls" block.

* Remove old-style GRPC TLS configs.

* Fix incorrect TLS blocks.

* Remove more config.
2017-04-05 12:41:41 -04:00
Roland Bracewell Shoemaker a46d30945c Purge remaining AMQP code (#2648)
Deletes github.com/streadway/amqp and the various RabbitMQ setup tools etc. Changes how listenbuddy is used to proxy all of the gRPC client -> server connections so we test reconnection logic.

+49 -8,221 😁

Fixes #2640 and #2562.
2017-04-04 15:02:22 -07:00
Roland Bracewell Shoemaker fd561ef842 Block issuance on first OCSP response generation (#2633)
Generate first OCSP response in ca.IssueCertificate instead of ocsp-updater.newCertificateTick
if features.GenerateOCSPEarly is enabled. Adds a new field to the sa.AddCertiifcate RPC for
the OCSP response and only adds it to the certificate status + sets ocspLastUpdated if it is a
non-empty slice. ocsp-updater.newCertificateTick stays the same so we can catch certificates
that were successfully signed + stored but a OCSP response couldn't be generated (for whatever
reason).

Fixes #2477.
2017-04-04 11:28:09 -07:00
Jacob Hoffman-Andrews 6719dc17a6 Remove AMQP config and code (#2634)
We now use gRPC everywhere.
2017-04-03 10:39:39 -04:00
Roland Bracewell Shoemaker 98addd5f36 expiration-mailer daemon mode (#2631)
Adds a daemon mode to `expiration-mailer` that is triggered by using the flag `--daemon` in order to follow deployability guidelines. If the `--daemon` flag is used the `mailer.runPeriod` config field is checked for a tick duration, if the value is `0` it exits.

Super lightweight implementation, OCSP-Updater has some custom ticker code which we use to do fancy things when the method being invoked in the loop takes longer expected, but that isn't necessary here.

Fixes #2617.
2017-03-30 16:16:41 -07:00
David Calavera c71c3cff80 Implement TLS-SNI-02 challenge validations. (#2585)
I think these are all the necessary changes to implement TLS-SNI-02 validations, according to the section 7.3 of draft 05:

https://tools.ietf.org/html/draft-ietf-acme-acme-05#section-7.3

I don't have much experience with this code, I'll really appreciate your feedback.

Signed-off-by: David Calavera <david.calavera@gmail.com>
2017-03-22 10:17:59 -07:00
Daniel McCarney fcf361c327 Remove CertStatusOptimizationsMigrated Feature Flag & Assoc. Cruft (#2561)
The NotAfter and IsExpired fields on the certificateStatus table
have been migrated in staging & production. Similarly the
CertStatusOptimizationsMigrated feature flag has been turned on after
a successful backfill operation. We have confirmed the optimization is
working as expected and can now clean out the duplicated v1 and v2
models, and the feature flag branching. The notafter-backfill command
is no longer useful and so this commit also cleans it out of the repo.

Note: Some unit tests were sidestepping the SA and inserting
certificateStatus rows explicitly. These tests had to be updated to
set the NotAfter field in order for the queries used by the
ocsp-updater and the expiration-mailer to perform the way the tests
originally expected.

Resolves #2530
2017-02-16 11:35:00 -08:00
Patrick Figel 6ba8aadfd7 Use X.509 AIA Issuer URL in rel="up" link header (#2545)
In order to provide the correct issuer certificate for older certificates after an issuer certificate rollover or when using multiple issuer certificates (e.g. RSA and ECDSA), use the AIA CA Issuer URL embedded in the certificate for the rel="up" link served by WFE. This behaviour is gated behind the UseAIAIssuerURL feature, which defaults to false.

To prevent MitM vulnerabilities in cases where the AIA URL is HTTP-only, it is upgraded to HTTPS.

This also adds a test for the issuer URL returned by the /acme/cert endpoint. wfe/test/178.{crt,key} were regenerated to add the AIA extension required to pass the test.

/acme/cert was changed to return an absolute URL to the issuer endpoint (making it consistent with /acme/new-cert).

Fixes #1663
Based on #1780
2017-02-07 11:19:22 -08:00
Roland Bracewell Shoemaker 18de73f0d8 Pass nil errors through boulder/grpc wrapError/unwrapError (#2544)
Instead of trying to wrap or unwrap them which causes panics.

Also, expand the test_ct_submission integration test to include resubmissions.
2017-02-06 18:19:39 -08:00
Daniel e88db3cd5e
Revert "Revert "Copy all statsd stats to Prometheus. (#2474)" (#2541)"
This reverts commit 9d9e4941a5 and
restores the statsd prometheus code.
2017-02-01 15:48:18 -05:00
Daniel McCarney 9d9e4941a5 Revert "Copy all statsd stats to Prometheus. (#2474)" (#2541)
This reverts commit 58ccd7a71a.

We are seeing multiple boulder components restart when they encounter the stat registration race condition described in https://github.com/letsencrypt/boulder/issues/2540
2017-02-01 12:50:27 -05:00
Daniel McCarney 15e73edc5a Google Safe Browsing V4 Improvements (#2504)
This PR has three primary contributions:

1. The existing code for using the V4 safe browsing API introduced in #2446 had some bugs that are fixed in this PR.
2. A gsb-test-srv is added to provide a mock Google Safebrowsing V4 server for integration testing purposes.
3. A short integration test is added to test end-to-end GSB lookup for an "unsafe" domain.

For 1) most notably Boulder was assuming the new V4 library accepted a directory for its database persistence when it instead expects an existing file to be provided. Additionally the VA wasn't properly instantiating feature flags preventing the V4 api from being used by the VA.

For 2) the test server is designed to have a fixed set of "bad" domains (Currently just honest.achmeds.discount.hosting.com). When asked for a database update by a client it will package the list of bad domains up & send them to the client. When the client is asked to do a URL lookup it will check the local database for a matching prefix, and if found, perform a lookup against the test server. The test server will process the lookup and increment a count for how many times the bad domain was asked about.

For 3) the Boulder startservers.py was updated to start the gsb-test-srv and the VA is configured to talk to it using the V4 API. The integration test consists of attempting issuance for a domain pre-configured in the gsb-test-srv as a bad domain. If the issuance succeeds we know the GSB lookup code is faulty. If the issuance fails, we check that the gsb-test-srv received the correct number of lookups for the "bad" domain and fail if the expected isn't reality.

Notes for reviewers:

* The gsb-test-srv has to be started before anything will use it. Right now the v4 library handles database update request failures poorly and will not retry for 30min. See google/safebrowsing#44 for more information.
* There's not an easy way to test for "good" domain lookups, only hits against the list. The design of the V4 API is such that a list of prefixes is delivered to the client in the db update phase and if the domain in question matches no prefixes then the lookup is deemed unneccesary and not performed. I experimented with sending 256 1 byte prefixes to try and trick the client to always do a lookup, but the min prefix size is 4 bytes and enumerating all possible prefixes seemed gross.
* The test server has a /add endpoint that could be used by integration tests to add new domains to the block list, but it isn't being used presently. The trouble is that the client only updates its database every 30 minutes at present, and so adding a new domain will only take affect after the client updates the database.

Resolves #2448
2017-01-23 11:07:20 -08:00
Jacob Hoffman-Andrews 373ff015a2 Update cfssl, CT, and OCSP dependencies (#2170)
Pulls in logging improvements in OCSP Responder and the CT client, plus a handful of API changes. Also, the CT client verifies responses by default now.

This change includes some Boulder diffs to accommodate the API changes.
2017-01-12 16:01:14 -08:00
Jacob Hoffman-Andrews cbde78d58f Harmonize and tweak configs (#2479)
Set authorizationLifetimeDays to 60 across both config and config-next.

Set NumSessions to 2 in both config and config-next. A decrease from 10 because pkcs11-proxy (or pkcs11-daemon?) seems to error out under load if you have more sessions than CPUs.

Reorder parallelGenerateOCSPRequests to match config-next.

Remove extra tags for parsing yaml in config objects.
2017-01-10 13:46:38 -08:00
Jacob Hoffman-Andrews 58ccd7a71a Copy all statsd stats to Prometheus. (#2474)
We have a number of stats already expressed using the statsd interface. During
the switchover period to direct Prometheus collection, we'd like to make those
stats available both ways. This change automatically exports any stats exported
using the statsd interface via Prometheus as well.

This is a little tricky because Prometheus expects all stats to by registered
exactly once. Prometheus does offer a mechanism to gracefully recover from
registering a stat more than once by handling a certain error, but it is not
safe for concurrent access. So I added a concurrency-safe wrapper that creates
Prometheus stats on demand and memoizes them.

In the process, made a few small required side changes:
 - Clean "/" from method names in the gRPC interceptors. They are allowed in
   statsd but not in Prometheus.
 - Replace "127.0.0.1" with "boulder" as the name of our testing CT log.
   Prometheus stats can't start with a number.
 - Remove ":" from the CT-log stat names emitted by Publisher. Prometheus stats
   can't include it.
 - Remove a stray "RA" in front of some rate limit stats, since it was
   duplicative (we were emitting "RA.RA..." before).

Note that this means two stat groups in particular are duplicated:
 - Gostats* is duplicated with the default process-level stats exported by the
   Prometheus library.
 - gRPCClient* are duplicated by the stats generated by the go-grpc-prometheus
   package.

When writing dashboards and alerts in the Prometheus world, we should be careful
to avoid these two categories, as they will disappear eventually. As a general
rule, if a stat is available with an all-lowercase name, choose that one, as it
is probably the Prometheus-native version.

In the long run we will want to create most stats using the native Prometheus
stat interface, since it allows us to use add labels to metrics, which is very
useful. For instance, currently our DNS stats distinguish types of queries by
appending the type to the stat name. This would be more natural as a label in
Prometheus.
2017-01-10 10:30:15 -05:00
Jacob Hoffman-Andrews 510e279208 Simplify gRPC TLS configs. (#2470)
Previously, a given binary would have three TLS config fields (CA cert, cert,
key) for its gRPC server, plus each of its configured gRPC clients. In typical
use, we expect all three of those to be the same across both servers and clients
within a given binary.

This change reuses the TLSConfig type already defined for use with AMQP, adds a
Load() convenience function that turns it into a *tls.Config, and configures it
for use with all of the binaries. This should make configuration easier and more
robust, since it more closely matches usage.

This change preserves temporary backwards-compatibility for the
ocsp-updater->publisher RPCs, since those are the only instances of gRPC
currently enabled in production.
2017-01-06 14:19:18 -08:00
Jacob Hoffman-Andrews 9b8dacab03 Split out separate RPC services for issuing and for signing OCSP (#2452)
This allows finer-grained control of which components can request issuance. The OCSP Updater should not be able to request issuance.

Also, update test/grpc-creds/generate.sh to reissue the certs properly.

Resolves #2417
2017-01-05 15:08:39 -08:00
Jacob Hoffman-Andrews 089a270453 Add instructions on load testing OCSP generation. (#2459) 2017-01-02 11:36:03 -08:00
Jacob Hoffman-Andrews 0c665b2053 Split up gRPC certificates by service. (#2453)
Previously, all gRPC services used the same client and server certificates. Now,
each service has its own certificate, which it uses for both client and server
authentication, more closely simulating production.

This also adds aliases for each of the relevant hostnames in /etc/hosts. There
may be some issues if Docker decides to rewrite /etc/hosts while Boulder is
running, but this seems to work for now.
2016-12-29 14:53:59 -08:00
Daniel McCarney 74e281c1ce Switch to Google's v4 safebrowsing library. (#2446)
Right now we are using a third-party client for the Google Safe Browsing API, but Google has recently released their own [Golang library](https://github.com/google/safebrowsing) which also supports the newer v4 API. Using this library will let us avoid fixing some lingering race conditions & unpleasantness in our fork of `go-safebrowsing-api`.

This PR adds support for using the Google library & the v4 API in place of our existing fork when the `GoogleSafeBrowsingV4` feature flag is enabled in the VA "features" configuration.

Resolves https://github.com/letsencrypt/boulder/issues/1863

Per `CONTRIBUTING.md` I also ran the unit tests for the new dependency:
```
daniel@XXXXXXXXXX:~/go/src/github.com/google/safebrowsing$ go test ./...
ok  	github.com/google/safebrowsing	3.274s
?   	github.com/google/safebrowsing/cmd/sblookup	[no test files]
?   	github.com/google/safebrowsing/cmd/sbserver	[no test files]
?   	github.com/google/safebrowsing/cmd/sbserver/statik	[no test files]
?   	github.com/google/safebrowsing/internal/safebrowsing_proto	[no test files]
ok  	github.com/google/safebrowsing/vendor/github.com/golang/protobuf/jsonpb	0.012s
?   	github.com/google/safebrowsing/vendor/github.com/golang/protobuf/jsonpb/jsonpb_test_proto	[no test files]
ok  	github.com/google/safebrowsing/vendor/github.com/golang/protobuf/proto	0.062s
?   	github.com/google/safebrowsing/vendor/github.com/golang/protobuf/proto/proto3_proto	[no test files]
?   	github.com/google/safebrowsing/vendor/github.com/golang/protobuf/protoc-gen-go	[no test files]
?   	github.com/google/safebrowsing/vendor/github.com/golang/protobuf/protoc-gen-go/descriptor	[no test files]
ok  	github.com/google/safebrowsing/vendor/github.com/golang/protobuf/protoc-gen-go/generator	0.017s
?   	github.com/google/safebrowsing/vendor/github.com/golang/protobuf/protoc-gen-go/grpc	[no test files]
?   	github.com/google/safebrowsing/vendor/github.com/golang/protobuf/protoc-gen-go/plugin	[no test files]
ok  	github.com/google/safebrowsing/vendor/github.com/golang/protobuf/ptypes	0.009s
?   	github.com/google/safebrowsing/vendor/github.com/golang/protobuf/ptypes/any	[no test files]
?   	github.com/google/safebrowsing/vendor/github.com/golang/protobuf/ptypes/duration	[no test files]
?   	github.com/google/safebrowsing/vendor/github.com/golang/protobuf/ptypes/empty	[no test files]
?   	github.com/google/safebrowsing/vendor/github.com/golang/protobuf/ptypes/struct	[no test files]
?   	github.com/google/safebrowsing/vendor/github.com/golang/protobuf/ptypes/timestamp	[no test files]
?   	github.com/google/safebrowsing/vendor/github.com/golang/protobuf/ptypes/wrappers	[no test files]
?   	github.com/google/safebrowsing/vendor/github.com/rakyll/statik	[no test files]
?   	github.com/google/safebrowsing/vendor/github.com/rakyll/statik/fs	[no test files]
ok  	github.com/google/safebrowsing/vendor/golang.org/x/net/idna	0.003s
```
2016-12-27 11:18:11 -05:00
Daniel McCarney 5c3482d2dd `certificateStatus` table optimizations (Part Four) (#2432)
Similar to #2431 the expiration-mailer's `findExpiringCertificates()` query can be optimized slightly by using `certificateStatus.NotAfter` in place of `certificate.expires` in the `WHERE` clause of its query when the `CertStatusOptimizationsMigrated` feature is enabled.

Resolves https://github.com/letsencrypt/boulder/issues/2425
2016-12-15 12:53:54 -08:00
Daniel McCarney 32890656b8 `certificateStatus` table optimizations (Part Three) (#2431)
Following on to https://github.com/letsencrypt/boulder/pull/2177 and https://github.com/letsencrypt/boulder/issues/2227 this PR adds code to the `ocsp-updater` that takes advantage of the migrations & backfill from the previous optimization PRs.

This has the primary effect of removing the `JOIN` on the `certificates` table in the `findStaleOCSPResponses` query. We expect this to be a big win in terms of query performance.

The `ocsp-updater` is also updated to opportunistically fill in the newly added `isExpired` field of the `CertificateStatus` table as it encounters rows that aren't marked as expired but correspond to an expired certificate.

Resolves https://github.com/letsencrypt/boulder/issues/2238 and #2239
2016-12-15 12:53:43 -08:00
Jacob Hoffman-Andrews 26cf552ff9 Sign OCSP in parallel for better performance. (#2422)
Previously all OCSP signing and storage would be serial, which meant it was hard
to exercise the full capacity of our HSM. In this change, we run a limited
number of update and store requests in parallel.

This change also changes stats generation in generateOCSPResponses so we can
tell the difference between stats produced by new OCSP requests vs existing ones,
and adds a new stat that records how long the SQL query in findStaleOCSPResponses
takes.
2016-12-12 17:22:44 -08:00
Daniel McCarney 6ec93157f7 OCSP Updater "stale max age" parameter. (#2419)
This PR adds a new `OCSPStaleMaxAge` configuration parameter to the `ocsp-updater`. The default value when not provided is 30 days, and this is explicitly added to both `config/ocsp-updater.json` and `config-next/ocsp-updater.json`. 

The OCSP updater uses this new parameter in `findStaleOCSPResponses` as a lower bound on the `ocspLastUpdated` field of the certificateStatus table. This is intended to speed up the processing of this query until we can land the proper fixes that require more intensive migrations & backfilling. 

The `TestGenerateOCSPResponses` and `TestFindStaleOCSPResponses` unit tests had to be updated to explicitly set the `ocspLastUpdated` field of the certificate status rows that the tests add, because otherwise they are left at a default value of `0` and are excluded by the new `OCSPStaleMaxAge` functionality.
2016-12-12 15:57:59 -05:00
Jacob Hoffman-Andrews 1c1449b284 Improvements to tests and test configs. (#2396)
- Remove spinner from test.js. It made Travis logs hard to read.
- Listen on all interfaces for debugAddr. This makes it possible to check
  Prometheus metrics for instances running in a Docker container.
- Standardize DNS timeouts on 1s and 3 retries across all configs. This ensures
  DNS completes within the relevant RPC timeouts.
- Remove RA service queue from VA, since VA no longer uses the callback to RA on
  completing a challenge.
2016-12-05 14:35:27 -08:00
Daniel McCarney a2b8faea1e Only resubmit missing SCTs. (#2342)
This PR introduces the ability for the ocsp-updater to only resubmit certificates to logs that we are missing SCTs from. Prior to this commit when a certificate was missing one or more SCTs we would submit it to every log, causing unnecessary overhead for us and the log operator.

To accomplish this a new RPC endpoint is added to the Publisher service "SubmitToSingleCT". Unlike the existing "SubmitToCT" this RPC endpoint accepts a log URI and public key in addition to the certificate DER bytes. The certificate is submitted directly to that log, and a cache of constructed resources is maintained so that subsequent submissions to the same log can reuse the stat name, verifier, and submission client.

Resolves #1679
2016-12-05 13:54:02 -08:00
Roland Bracewell Shoemaker 03fdd65bfe Add gRPC server to SA (#2374)
Adds a gRPC server to the SA and SA gRPC Clients to the WFE, RA, CA, Publisher, OCSP updater, orphan finder, admin revoker, and expiration mailer.

Also adds a CA gRPC client to the OCSP Updater which was missed in #2193.

Fixes #2347.
2016-12-02 17:24:46 -08:00
Daniel McCarney 78587bae6e Add explicit forbidden names validation to cert-checker (#2373)
In https://groups.google.com/forum/#!topic/mozilla.dev.security.policy/_pSjsrZrTWY, we had a problem with the policy authority configuration, but cert-checker didn't alert about it because it uses the same policy configuration.

This PR adds support for an explicit list of regular expressions used to match forbidden names. The regular expressions are applied after the PA has done its usual validation process in order to act as a defense-in-depth mechanism for cases (.mil, .local, etc) that we know we never want to support, even if the PA thinks they are valid (e.g. due to a policy configuration malfunction).

Initially the forbidden name regexps are:

`^\s*$`,
`\.mil$`,
`\.local$`,
`^localhost$`,
`\.localhost$`,
Additionally, the existing cert-checker.json config in both test/config/ and test/config-next/ was missing the hostnamePolicyFile entry required for operation of cert-checker. This PR adds a hostnamePolicyFile entry pointing at the existing test/hostname-policy.json file. The cert checker can now be used in the dev env with cert-checker -config test/config/cert-checker.json without error.

Resolves #2366
2016-12-02 11:55:24 -08:00
Jacob Hoffman-Andrews 7c624d013e Remove RA->{VA,CA} AMQP configs. (#2371) 2016-11-30 08:55:03 -05:00
Roland Bracewell Shoemaker a87379bc6e Add gRPC server to RA (#2350)
Fixes #2348.
2016-11-29 15:34:35 -08:00
Jacob Hoffman-Andrews 1df986b858 Remove CheckMalformedCSR feature flag. (#2370)
This is now enabled in prod and can default to enabled.
2016-11-29 17:05:05 -05:00
Roland Bracewell Shoemaker e2155388a1 Remove caa-checker from the tree (#2351)
The VA can internally check CAA and this additional code was deemed unneeded complexity that could be hoisted outside of Boulder. Fixes #2346.
2016-11-23 08:42:33 -05:00
Roland Bracewell Shoemaker c5f99453a9 Switch CT submission RPC from CA -> RA (#2304)
With the current gRPC design the CA talks directly to the Publisher when calling SubmitToCT which crosses security bounadries (secure internal segment -> internet facing segment) which is dangerous if (however unlikely) the Publisher is compromised and there is a gRPC exploit that allows memory corruption on the caller end of a RPC which could expose sensitive information or cause arbitrary issuance.

Instead we move the RPC call to the RA which is in a less sensitive network segment. Switching the call site from the CA -> RA is gated on adding the gRPC PublisherService object to the RA config.

Fixes #2202.
2016-11-08 11:39:02 -08:00
Daniel McCarney 6c983e8c9e Implements client whitelisting for gRPC. (#2307)
As described in #2282, our gRPC code uses mutual TLS to authenticate both clients and servers. However, currently our gRPC servers will accept any client certificate signed by the internal CA we use to authenticate connections. Instead, we would like each server to have a list of which clients it will accept. This will improve security by preventing the compromise of one client private key being used to access endpoints unrelated to its intended scope/purpose.

This PR implements support for gRPC servers to specify a list of accepted client names. A `serverTransportCredentials` implementing `ServerHandshake` uses a `verifyClient` function to enforce that the connecting peer presents a client certificate with a SAN entry that matches an entry on the list of accepted client names

The `NewServer` function from `grpc/server.go` is updated to instantiate the `serverTransportCredentials` used by `grpc.NewServer`, specifying an accepted names list populated from the `cmd.GRPCServerConfig.ClientNames` config field.

The pre-existing client and server certificates in `test/grpc-creds/` are replaced by versions that contain SAN entries as well as subject common names. A DNS and an IP SAN entry are added to allow testing both methods of specifying allowed SANs. The `generate.sh` script is converted to use @jsha's `minica` tool (OpenSSL CLI is blech!).

An example client whitelist is added to each of the existing gRPC endpoints in config-next/ to allow the SAN of the test RPC client certificate.

Resolves #2282
2016-11-08 13:57:34 -05:00
Daniel McCarney eb67ad4f88 Allow `validateEmail` to timeout w/o error. (#2288)
This PR reworks the validateEmail() function from the RA to allow timeouts during DNS validation of MX/A/AAAA records for an email to be non-fatal and match our intention to verify emails best-effort.

Notes:

bdns/problem.go - DNSError.Timeout() was changed to also include context cancellation and timeout as DNS timeouts. This matches what DNSError.Error() was doing to set the error message and supports external callers to Timeout not duplicating the work.

bdns/mocks.go - the LookupMX mock was changed to support always.error and always.timeout in a manner similar to the LookupHost mock. Otherwise the TestValidateEmail unit test for the RA would fail when the MX lookup completed before the Host lookup because the error wouldn't be correct (empty DNS records vs a timeout or network error).

test/config/ra.json, test/config-next/ra.json - the dnsTries and dnsTimeout values were updated such that dnsTries * dnsTimeout was <= the WFE->RA RPC timeout (currently 15s in the test configs). This allows the dns lookups to all timeout without the overall RPC timing out.

Resolves #2260.
2016-10-27 11:56:12 -07:00
Roland Bracewell Shoemaker ce679bad41 Implement key rollover (#2231)
Fixes #503.

Functionality is gated by the feature flag `AllowKeyRollover`. Since this functionality is only specified in ACME draft-03 and we mostly implement the draft-02 style this takes some liberties in the implementation, which are described in the updated divergences doc. The `key-change` resource is used to side-step draft-03 `url` requirement.
2016-10-27 10:22:09 -04:00
Jacob Hoffman-Andrews 6b126baa0a Use pkcs11key.NewPool in CA. (#2276)
Right now, we only get single-threaded performance from our HSM, even though it
has multiple cores. We can use the pkcs11key's NewPool function to create a pool
of PKCS#11 sessions, allowing us to take advantage of the HSM's full
performance.
2016-10-24 12:05:23 -07:00
Roland Bracewell Shoemaker 28af65a04b Set feature flags in cert-checker (#2273)
Fixes #2272.
2016-10-23 10:46:43 -07:00
Daniel McCarney 83e713683f Adds `notafter-backfiller` cmd. (#2227)
The "20160817143417_AddCertStatusNotAfter.sql" db migration adds a "notAfter" column to the certificateStatus database table. This field duplicates the contents of the certificates table "expires" column. This enables performance improvements (see #1864) for both the ocsp-updater and the expiration-mailer utilities.

Since existing rows will have a NULL value in the new field the notafter-backfill utility exists to perform a one-time update of the existing certificateStatus rows to set their notAfter column based on the data that exists in the certificates table.

This follows on https://github.com/letsencrypt/boulder/pull/2177 and requires that the migration be applied & the feature flag set accordingly before use.

Fixes #2237.
2016-10-11 14:38:40 -07:00
Roland Bracewell Shoemaker 5fabc90a16 Add IDN support (#2215)
Add feature flagged support for issuing for IDNs, fixes #597.

This patch expects that clients have performed valid IDN2008 encoding on any label that includes unicode characters. Invalid encodings (including non-compatible IDN2003 encoding) will be rejected. No script-mixing or script exclusion checks are performed as we assume that if a name is resolvable that it conforms to the registrar's policies on these matters and if it uses non-standard scripts in sub-domains etc that browsers should be the ones choosing how to display those names.

Required a full update of the golang.org/x/net tree to pull in golang.org/x/net/idna, all test suites pass.
2016-10-06 13:05:37 -04:00
Roland Bracewell Shoemaker 9648e1cf85 Fix config-next features location and registration status validity check (#2225)
Move features sections to the correct JSON object and only test registration validity if regCheck is true

* Pull other flag up to correct level

* Only check status update when status is non-empty
2016-10-05 12:31:59 -04:00
Daniel McCarney 4c9cf065a8 `certificateStatus` table optimizations (Part One) (#2177)
This PR adds a migration to create two new fields on the `certificateStatus` table: `notAfter` and `isExpired`. The rationale for these fields is explained in #1864. Usage of these fields is gated behind `features.CertStatusOptimizationsMigrated` per [CONTRIBUTING.md](https://github.com/letsencrypt/boulder/blob/master/CONTRIBUTING.md#gating-migrations). This flag should be set to true **only** when the `20160817143417_CertStatusOptimizations.sql` migration has been applied.

Points of difference from #2132 (the initial preparatory "all-in-one go" PR):
**Note 1**: Updating the `isExpired` field in the OCSP updater can not be done yet, the `notAfter` field needs to be fully populated first - otherwise a separate query or a messy `JOIN` would have to be used to determine if a certStatus `isExpired` by using the `certificates` table's `expires` field. 
**Note 2**: Similarly we can't remove the `JOIN` on `certificates` from the `findStaleOCSPResponse` query yet until all DB rows have `notAfter` populated. This will happen in a separate **Part Two** PR.
2016-09-30 14:52:19 -04:00
Roland Bracewell Shoemaker c6e3ef660c Re-apply 2138 with proper gating (#2199)
Re-applies #2138 using the new style of feature-flag gated migrations. Account deactivation is gated behind `features.AllowAccountDeactivation`.
2016-09-29 17:16:03 -04:00
Daniel McCarney 409f1623e6 Retires `LookupIPv6` VA flag. (#2205)
The LookupIPv6 flag has been enabled in production and isn't required anymore. This PR removes the flag entirely.

The errA and errAAAA error handling in LookupHost is left as-is, meaning that a non-nil errAAAA will not be returned to the caller. This matches the existing behaviour, and the expectations of the TestDNSLookupHost unit tests.

This commit also removes the tests from TestDNSLookupHost that tested the LookupIPv6 == false behaviours since those are no longer implemented.

Resolves #2191
2016-09-26 18:00:01 -07:00
Roland Bracewell Shoemaker 7f0b7472e2 Add gRPC support to CA (#2193)
Fixes #2171.
2016-09-21 14:13:43 -07:00
Roland Bracewell Shoemaker 239bf9ae0a Very basic feature flag impl (#1705)
Updates #1699.

Adds a new package, `features`, which exposes methods to set and check if various internal features are enabled. The implementation uses global state to store the features so that services embedded in another service do not each require their own features map in order to check if something is enabled.

Requires a `boulder-tools` image update to include `golang.org/x/tools/cmd/stringer`.
2016-09-20 16:29:01 -07:00
Jacob Hoffman-Andrews d75a44baa0 Remove "network" and "server" from syslog configs. (#2159)
We removed these from the config object because we never use anything other than
the default empty string, which means "local socket."
2016-09-08 10:08:18 -04:00
Roland Bracewell Shoemaker 51ee04e6a9 Allow authorization deactivation (#2116)
Implements `valid` and `pending` authz deactivation.
2016-08-23 16:25:06 -04:00
Roland Bracewell Shoemaker fc39781274 Allow user specified revocation reason (#2089)
Fixes #140.

This patch allows users to specify the following revocation reasons based on my interpretation of the meaning of the codes but could use confirmation from others.

* unspecified (0)
* keyCompromise (1)
* affiliationChanged (3)
* superseded (4)
* cessationOfOperation (5)
2016-08-08 14:26:52 -07:00
Jacob Hoffman-Andrews c97f28055c Update tests to use multi-issuer format and ca2 (#1638)
Builds on #1635.
2016-08-05 13:42:03 -07:00
Ben Irving b587d4e663 Simplify KeyPolicy code (#2092)
This PR, removes the allowedSigningAlgos configuration struct and hard codes a key policy.

Fixes #1844
2016-07-30 16:15:19 -07:00
Ben Irving f73328b3cb Split up boulder-config.json (Orphan Finder) (#2059) 2016-07-21 09:30:31 -04:00
Ben Irving 44c573bbca Split up boulder-config.json (Cert Checker) (#2058) 2016-07-21 09:26:53 -04:00
Ben Irving 2ffbed989b Split up boulder-config.json (Admin Revoker) (#2053)
Another step in completing #1962, which will remove the global configuration file and codegangsta/cli from boulder. 3 more to go!

This PR, is a little bit different than others in that there was a lot more reliance on codegangsta/cli especially in the implementation of subcommands. I put some thought into creating our own SubCommand struct, but given the lack of complexity it seemed unnecessary as the same could be accomplished with slightly more advanced usage of os and flag.
2016-07-20 10:59:34 -04:00
Jacob Hoffman-Andrews 031a4022bd Fix dbConnect strings in OCSP Responder. (#2047)
Right now we use the Source field for both DB and file URLs. However, we want to move to the DBConnect config field, so that we can take advantage of the code that reads DSNs from a file on disk.  It turns out the existing code didn't work if you configure a dbConnect string, because it would error out with:

  "source" parameter not found in JSON config

After rearranging, both methods should work.
2016-07-20 10:36:54 -04:00
Ben Irving 1a4f099899 Split up boulder-config.json (Expiration Mailer) (#2036)
Part of #1962.
2016-07-12 15:55:52 -07:00
Patrick Figel 8cd74bf766 Make (pending)AuthorizationLifetime configurable (#2028)
Introduces the `authorizationLifetimeDays` and `pendingAuthorizationLifetimeDays` configuration options for `RA`.

If the values are missing from configuration, the code defaults back to the current values (300/7 days).

fixes #2024
2016-07-12 15:18:22 -04:00
Roland Bracewell Shoemaker a0a9623cb6 Switch to using SoftHSM in Docker for testing (#1920)
Instead of reading the CA key from a file on disk into memory and using that for signing in `boulder-ca` this patch adds a new Docker container that runs SoftHSM and pkcs11-proxy in order to hold the key and perform signing operations. The pkcs11-proxy module is used by `boulder-ca` to talk to the SoftHSM container.

This exercises (almost) the full pkcs11 path through boulder and will allow testing various HSM related failures in the future as well as simplifying tuning signing performance for benchmarking.

Fixes #703.
2016-07-11 11:20:51 -07:00
Ben Irving 0e2ef748b4 Split up boulder-config.json (OCSP Responder) (#2017) 2016-07-07 14:52:08 -04:00
Daniel McCarney 8a585b8691 notify-mailer/contact-exporter bug fixes & documentation (#2016)
For the notify-mailer, this PR fixes a bug with the -end parameter where the default (99999999) would cause a slice index out of range error. This was fixed by setting the -end value to len(m.destinations) in run when it is too large.

For both the notify-mailer and the contact-exporter a bug was fixed that was comparing the required flags against nil when the defaults were set to a non-nil pointer to "". This resulted in confusing errors when the mandatory arguments were not provided.

This PR also adds a separated config example for both the notify-mailer and the contact-exporter into test/config and test/config-next respectively.

Finally a documentation string was added to describe the overall design & usage of both tools, including example invocations.
2016-07-06 14:15:22 -04:00
Ben Irving 653cc004d0 Split Boulder Config (OCSP Updater) (#2013) 2016-07-06 10:00:52 -04:00
Ben Irving cb45bdea67 Split up boulder-config.json (Publisher) (#2008) 2016-07-05 13:31:30 -07:00
Ben Irving bea8e57536 Split up boulder-config.json (VA) (#1979) 2016-07-01 13:06:50 -04:00
Ben Irving 21e0b3bdc7 Split up boulder-config.json (CA) (#1978) 2016-07-01 10:24:19 -04:00
Ben Irving 6162533c00 Split up boulder-config.json (SA) (#1975)
Depends on #1973

https://github.com/letsencrypt/boulder/pull/1975
2016-06-29 15:01:49 -07:00
Ben Irving c4f7fb580d Split up boulder-config.json (RA) (#1974)
Part of #1962
2016-06-29 13:43:55 -07:00
Roland Bracewell Shoemaker 04961d7c66 Add basic ASN.1 structure test for pre-1.0.2 OpenSSL CSRs (#1972)
Adds a test for CSRs generated using a pre-1.0.2 version of OpenSSL and a buggy client which will fail to parse with Golang 1.6+.

This test checks the values of the bytes in the 8th and 9th offsets, which in a properly formatted CSR should be the version integer declaration bytes, and if the malformed values are present will return a error to the user informing them that they are using an old version of OpenSSL and/or a client which doesn't explicitly set the CSR version.

Fixes #1902.
2016-06-28 12:38:52 -07:00
Ben Irving 6007df8f3c Split up boulder-config.json (WFE) (#1973)
Moves the wfe to it's own config file.

Each config will now belong in `test/config` and `test/config-next` analogous to `boulder-config` and `boulder-config-next`.
2016-06-28 10:40:16 -07:00