4866 Commits

Author SHA1 Message Date
Jacob Hoffman-Andrews 5e7fee0c4a test: update test/config with deployed configs. (#4396) 2019-08-09 12:08:56 -04:00
Jacob Hoffman-Andrews b7250c1d43
integration: test for DisableAuthz2Orders. (#4390)
To make this work, I changed the twenty_days_ago setup to use
`config-next` when the main test phase is running `config`. That, in
turn, made the recheck_caa test fail, so I added a tweak to that.

I also moved the authzv2 migrations into `db`. Without that change,
the integration test would fail during the twenty_days_ago setup because
Boulder would attempt to create authzv2 objects but the table wouldn't
exist yet.
2019-08-08 17:07:29 -07:00
Roland Bracewell Shoemaker 62db2d0cae publisher: add label to CT log error metric for http status code (#4391) 2019-08-08 08:30:35 -04:00
Daniel McCarney 652cb6be78
wfe2: set web.RequestEvent.Method for POST-as-GET. (#4395)
To make log analysis easier we choose to elevate the pseudo ACME HTTP
method "POST-as-GET" to the `web.RequestEvent.Method` after processing
a valid POST-as-GET request, replacing the "POST" method value that will
have been set by the outermost handler.
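
As a rough illustration of the override described above, here is a minimal sketch. It assumes, per RFC 8555, that a POST-as-GET request is a POST whose verified JWS payload is the empty string. The `requestEvent` struct and the `logMethod` helper are hypothetical stand-ins, not the actual `web.RequestEvent` type or WFE2 code.

```go
package main

import (
	"fmt"
	"net/http"
)

// requestEvent is a hypothetical stand-in for web.RequestEvent; only the
// Method field mentioned in the commit message is modelled here.
type requestEvent struct {
	Method string
}

// logMethod (hypothetical helper) returns the method to record in the log:
// "POST-as-GET" for a POST whose verified JWS payload is empty, otherwise
// the raw HTTP method unchanged.
func logMethod(httpMethod string, jwsPayload []byte) string {
	if httpMethod == http.MethodPost && len(jwsPayload) == 0 {
		return "POST-as-GET"
	}
	return httpMethod
}

func main() {
	// The outermost handler records the raw HTTP method first.
	ev := requestEvent{Method: http.MethodPost}

	// Once the JWS has been verified, the inner handler replaces it.
	ev.Method = logMethod(ev.Method, []byte(""))
	fmt.Println(ev.Method)
}
```

Keeping the decision in one small function means the log value can only change after the request body has been validated.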
2019-08-08 08:29:53 -04:00
Jacob Hoffman-Andrews e20eb6271d Suppress "transport is closing" errors. (#4394)
These errors show up in the Publisher at shutdown during integration
test runs, because the Publisher is trying to write responses from RPCs
that were slow due to the ct-test-srv's LatencySchedule. This
specifically happens only for the optional submission of "final"
certificates.
2019-08-07 13:39:53 -07:00
Roland Bracewell Shoemaker 751e3b1704 cmd: Set CFSSL log level to debug (#4393) 2019-08-07 14:30:42 -04:00
Daniel McCarney a2d041a2d3
boulder-janitor: fix debug lines for job creation. (#4388) 2019-08-06 19:22:30 -04:00
Roland Bracewell Shoemaker a585f23365
Add feature flag for disabling new domain validations in the V1… (#4385)
Fixes #4307.
2019-08-05 11:34:51 -07:00
Jacob Hoffman-Andrews 1b75ea21e1
Remove unnecessary transaction. (#4387)
In getAllOrderAuthorizationStatuses, we were using a transaction for a series
of SELECTs. Since these SELECTs don't need to be strongly consistent with
each other, that creates needless locking and round trips.
2019-08-05 10:48:04 -07:00
Jacob Hoffman-Andrews 41569572e9 sa: wrap transactions for commits/rollback, part 2 (#4386)
This follows up on #4373, adding the withTransaction handling to the rest of the
functions in SA that use transactions.
2019-08-05 13:23:35 -04:00
Roland Bracewell Shoemaker db01830508
Return OCSP unauthorized status if the certificate is expired (#4380)
The ocsp-updater ocspStaleMaxAge config var has to be bumped up to ~7 months so that, when it runs after the six-months-ago run, it actually updates the OCSP responses generated during that period and marks the certificate status row as expired.

Fixes #4338.
2019-08-01 14:13:27 -07:00
Daniel McCarney 8b518451b4 deps: update github.com/zmap/zlint to latest. (#4384)
* deps: update github.com/zmap/zlint to latest.

Update the `github.com/zmap/zlint` dependency to b126a9b. This captures
a small fix to the `ct_sct_policy_count_unsatisfied` lint that ensures
it isn't run for precertificates.

* config: remove ct_sct_policy_count_unsatisfied from ignored_lints.

With the latest `zlint` the `ct_sct_policy_count_unsatisfied` lint won't
flag precertificates as having an info-level lint result for missing
SCTs. With that fix in place we no longer have to ignore this lint in
the config-next CA configs that enable preissuance linting.
2019-08-01 10:22:30 -07:00
Roland Bracewell Shoemaker 2e4531342d tests: add authorization deactivation integration tests (#4381)
Add pending and valid authorization deactivation integration tests
2019-07-31 17:47:52 -04:00
Daniel McCarney 17cf6fde8d
deps: bump github.com/weppos/publicsuffix-go to latest. (#4383) 2019-07-31 17:46:51 -04:00
Jacob Hoffman-Andrews 16235b6839 sa: wrap transactions in a function for commits/rollbacks (#4373)
In the current SA code, we need to remember to call Rollback on any error.
If we don't, we'll leave dangling transactions, which are hard to spot but eventually
clog up the database and cause availability problems.

This change attempts to deal with rollbacks more rigorously, by implementing a
withTransaction function that takes a closure as input. withTransaction opens
a transaction, applies a context.Context to it, and then runs the closure. If the
closure returns an error, withTransaction rolls back and returns the error; otherwise
it commits and returns nil.

One of the quirks of this implementation is that it relies on the closure modifying
variables from its parent scope in order to return values. An alternate implementation
could define the return value of the closure as interface{}, nil, and have the calling
function do a type assertion. I'm seeking feedback on that; not sure yet which is cleaner.

This is a subset of the functions that need this treatment. I've got more coming, but
some of the changes break tests so I'm checking into why.

Updates #4337
2019-07-31 12:41:51 -07:00
Daniel McCarney eb20b2accd
CA: implement CFSSL/zlint pre-issuance linting. (#4378)
The `test/config-next` CA configs are both updated to use `zlint` to lint TBS
pre-certificates with a throw-away key and treat any lint findings >=
`lints.Pass` as an error, blocking the CA from signing the TBS pre-cert with its
private key.

The CA `issuePrecertificateInner` function is updated to specifically catch
linting related errors from CFSSL to marshal the linting findings to the audit
log. A small unit test for this change is included.

The CA `IssueCertificateForPrecertificate` function remains unchanged: the CFSSL
interface that defines `SignFromPrecert` doesn't facilitate linting. We still
lint final certificates post-issuance with `cert-checker` and accept the
possibility there may be some compliance issues that could occur between the
precertificate passing linting and the final certificate being signed.

Resolves https://github.com/letsencrypt/boulder/issues/4255
2019-07-31 15:08:57 -04:00
Daniel McCarney 17b74cfb55
deps: update github.com/cloudflare/cfssl to v1.3.4 (#4377)
This will unblock pre-issuance linting support by updating the
`github.com/cloudflare/cfssl` dependency to the `1.3.4` tag which
notably includes the zlint integration developed in
cloudflare/cfssl#1015
2019-07-31 14:06:02 -04:00
Daniel McCarney 75dcac2272
deps: update github.com/zmap/zlint to latest. (#4375)
Notably this brings in:
* A mild perf. boost from an updated transitive zcrypto dep and a reworked util func.
* A new KeyUsage lint for ECDSA keys.
* Updated gTLD data.
* A required `LintStatus` deserialization fix that will unblock a CFSSL update.

The `TestIgnoredLint` unit test is updated to no longer expect a warning from the
`w_serial_number_low_entropy` lint. This lint was removed in the upstream project.
2019-07-31 13:10:44 -04:00
Jacob Hoffman-Andrews c777dfece6 Log the Origin header. (#4376)
XHR requests from web-based ACME clients provide the User-Agent
of the browser that initiated the request, but the hostname of the site
that originated the request is sent in the Origin header. This will let
us better analyze web-based ACME traffic.

Fixes #4370
2019-07-31 09:47:44 -07:00
Daniel McCarney bb005e1c79
integration: add test for boulder-janitor. (#4364) 2019-07-29 16:13:10 -04:00
Jacob Hoffman-Andrews 98677b83d8 integration: make test case filter better (#4366) 2019-07-29 09:00:02 -07:00
Jacob Hoffman-Andrews a68c39ad9b SA: Delete unused challenges (#4353)
For authzv1, this actually executes a SQL DELETE for the unused challenges
when an authorization is updated upon validation.

For authzv2, this doesn't perform a delete, but changes the authorizations that
are returned so they don't include unused challenges.

In order to test the flag for both authz storage models, I set the feature flag in
both config/ and config-next/.

Fixes #4352
2019-07-26 14:04:46 -04:00
Roland Bracewell Shoemaker 59ef95230d integration: Fix typo in test/helpers.py (#4369) 2019-07-26 14:02:52 -04:00
Roland Bracewell Shoemaker 52dd3bd9c7 web: Log subproblems in RequestEvent (#4363) 2019-07-26 14:02:18 -04:00
Jacob Hoffman-Andrews 1613082c22 integration: Stop printing an exception for HTTP timeout test. (#4368) 2019-07-26 10:03:53 -04:00
Jacob Hoffman-Andrews ba5a5a5ac9 cmd: Log less from gRPC, no INFO level. (#4367)
The gRPC INFO log lines clutter up integration test output, and we've never
had a use for them in production (they are mostly about details of
connection status).
2019-07-26 10:02:34 -04:00
Roland Bracewell Shoemaker c7debd51b9
Set namespace for sub-problems (#4361)
Fixes #4355.
2019-07-25 10:26:58 -07:00
Daniel McCarney 9e896325f7
boulder-janitor: add initial daemon for tidying certificate resources. (#4354)
A new `boulder-janitor` command is added that provides a long-running
daemon that cleans up rows associated with expired certificate
resources. At present this is rows from the following tables:

* certificates
* certificateStatus
* certificatesPerName

Adding cleanup of tables associated with Order resources is the next step.

Three prometheus stats are exported:

* janitor_deletions - CounterVec for the number of deletions by table the 
  boulder-janitor has performed.
* janitor_workbatch - GaugeVec for the number of items of work by table
  the boulder-janitor queued for deletion.
* janitor_errors - CounterVec for the number of errors by table and error
  type the boulder-janitor has experienced.
2019-07-24 15:09:04 -04:00
Roland Bracewell Shoemaker cc2754cc57 Report the correct identifier when there are suberrors (#4358)
Turns out the test was already flagging this issue.

Fixes #4356.
2019-07-24 09:12:48 -07:00
Jacob Hoffman-Andrews 88992e3f0d sa: remove unused revokeAuthorizations functions. (#4351) 2019-07-22 13:51:19 -04:00
Jacob Hoffman-Andrews 4628c79239
Check invalid authorization limit in parallel. (#4348)
Fixes #3069.
2019-07-19 13:37:12 -07:00
Jacob Hoffman-Andrews 979e00651b sa: fix GetOrderForNames query ORDER BY to match comment. (#4349)
In #4331 I introduced this new more efficient query for
GetOrderForNames, and commented about why we needed an ORDER BY... ASC
to efficiently use the index. However, the actual query did not match
the comment, and it used DESC. This fixes the query.

To demonstrate that the index is actually used with the ASC version,
here's the EXPLAIN output after filling up the table with a bunch of
failed orders:

MariaDB [boulder_sa_integration]> explain select orderID, registrationID FROM orderFqdnSets
    -> WHERE setHash = UNHEX('B60FE34E4A6735D5A575D81C97F4DFED2102DC179B34252E4AA18F6E2A375C98')
    -> AND expires > NOW() ORDER BY EXPIRES ASC LIMIT 1 \G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: orderFqdnSets
         type: range
possible_keys: setHash_expires_idx
          key: setHash_expires_idx
      key_len: 37
          ref: NULL
         rows: 1500
        Extra: Using index condition
1 row in set (0.000 sec)

MariaDB [boulder_sa_integration]> explain select orderID, registrationID FROM orderFqdnSets
    -> WHERE setHash = UNHEX('B60FE34E4A6735D5A575D81C97F4DFED2102DC179B34252E4AA18F6E2A375C98')
    -> AND expires > NOW() ORDER BY EXPIRES DESC LIMIT 1 \G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: orderFqdnSets
         type: range
possible_keys: setHash_expires_idx
          key: setHash_expires_idx
      key_len: 37
          ref: NULL
         rows: 1500
        Extra: Using where
1 row in set (0.000 sec)
2019-07-18 15:24:47 -04:00
Jacob Hoffman-Andrews 5952d89346 wfe: add feature flag to control new acme v1 registrations. (#4346)
Adds `AllowV1Registration` feature flag to the WFE to control new acme v1 registrations.

Fixes #4306.
2019-07-18 15:18:05 -04:00
Jacob Hoffman-Andrews f06e0ac0b1 tests: delete old files that are no longer used. (#4347) 2019-07-18 14:21:29 -04:00
Jacob Hoffman-Andrews d077d3346e wfe/wfe2: remove AllowAuthzDeactivation flag. (#4345)
Fixes #4339
2019-07-17 16:30:27 -04:00
Daniel McCarney 028822d435 sa: drop SCTReceipts table and assoc. code. (#4344)
The `SCTReceipts` database table and associated model linkages are
legacy cruft from before Boulder implemented SCT embedding. We can
safely remove all of this stuff.
2019-07-17 11:12:42 -07:00
Jacob Hoffman-Andrews a4fc143a54 wfe/wfe2: clean up AcceptRevocationReason flag. (#4342)
Fixes #4340
2019-07-17 10:33:47 -04:00
Jacob Hoffman-Andrews 71eea294b9
Use HandleFunc to process authzv2s (in wfe2) (#4341)
Similar to #4334, this fixes a bug where authzs with a randomly generated id
starting with "v2" would incorrectly get treated as v2 authzs.

It accomplishes this change by splitting out v2 authzs into their own path and
using our regular HTTP mux to split them out. It uses a "-v3" name in the
public-facing URLs to avoid confusion.
2019-07-16 11:46:33 -07:00
Jacob Hoffman-Andrews d41282dc3f wfe: use HandleFunc to separately process authzv2s. (#4334)
Previously we made v2 authz handling part of regular authz handling, and
just tried to strip the v2/ prefix. This led to a bug: #4327, where
authzv1s that had an id randomly start with "v2" would 404 because they
were treated as authzv2s.

This change moves authzv2s and related challenges onto their own prefix,
and uses HandleFunc to dispatch them appropriately. To reduce code
duplication, I factored out the code that was common to v1 and v2 into
new Common functions.

I also changed the path names to contain "v3" instead of "v2" to avoid
potential confusion.
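
To illustrate why dispatching on separate prefixes avoids the bug, here is a small sketch using Go's standard `http.ServeMux`. The two path prefixes and the handler bodies are placeholders chosen for the example, not Boulder's actual routes or handlers.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// Hypothetical path prefixes; the real WFE paths may differ.
const (
	authzV1Path = "/acme/authz/"
	authzV2Path = "/acme/authz-v3/"
)

// newMux registers one handler per storage model, so the mux decides which
// kind of authorization is being requested rather than the ID's contents.
func newMux() *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc(authzV1Path, func(w http.ResponseWriter, r *http.Request) {
		id := strings.TrimPrefix(r.URL.Path, authzV1Path)
		fmt.Fprintf(w, "v1 authz %s", id)
	})
	mux.HandleFunc(authzV2Path, func(w http.ResponseWriter, r *http.Request) {
		id := strings.TrimPrefix(r.URL.Path, authzV2Path)
		fmt.Fprintf(w, "v2 authz %s", id)
	})
	return mux
}

// route sends a GET for path through the mux and returns the response body.
func route(mux http.Handler, path string) string {
	rec := httptest.NewRecorder()
	mux.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Body.String()
}

func main() {
	mux := newMux()
	// A v1 ID that happens to start with "v2" still reaches the v1 handler.
	fmt.Println(route(mux, "/acme/authz/v2abc123"))
	fmt.Println(route(mux, "/acme/authz-v3/17"))
}
```

Since the ID is never inspected to choose a handler, no randomly generated ID can be mistaken for the other storage model.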
2019-07-16 09:54:05 -04:00
Daniel McCarney bdc9e17afe test: remove stale comment in sa_db_users.sql (#4335)
I noticed this stale bit of text in `test/sa_db_users.sql` while working on something unrelated.

The CA component does not have its own DB, it relies on the SA and its DB the same as
other components. Consulting a historian tells me that this was once true in a lost age :-)

The explicit license header comment isn't required. We don't follow this practice in any of
the other files and assume the base repo LICENSE file is sufficient.

Similarly, we don't need to drop each user and recreate them: we use
`CREATE USER IF NOT EXISTS` from MariaDB 10.1+. We also don't need the related
`drop_users.sql` script.
2019-07-15 08:35:26 -07:00
Daniel McCarney f5a322006d SA: fix dangling getAllOrderAuthorizationStatuses tx. (#4336)
In the case where the DB `Select()` returns a non-nil `err` result the
SA's `getAllOrderAuthorizationStatuses` function needs to ensure it
rolls back the transaction it opened or it will be leaked.
2019-07-15 07:31:35 -07:00
Jacob Hoffman-Andrews 34a55a9b97 integration: add test for failed validation limit. (#4333)
I introduced test_fail_thrice as a specific regression test for #4329,
but I realized that a more general test of the failed validation limit
would have better coverage and also serve as a regression test at the
same time.

Fixes #4332.
2019-07-11 15:35:35 -04:00
Jacob Hoffman-Andrews 74699486ec Fix FasterGetOrderForNames and add tests. (#4331)
This rolls forward #4326 after it was reverted in #4328.

Resolves https://github.com/letsencrypt/boulder/issues/4329

The older query didn't have a `LIMIT 1` so it was returning multiple results,
but gorp's `SelectOne` was okay with multiple results when the selection was
going into an `int64`. When I changed this to a `struct` in #4326, gorp started
producing errors.

For this bug to manifest, an account needs to create an order, then fail
validation, twice in a row for a given domain name, then create an order once
more for the same domain name - that third request will fail because there are
multiple orders in the orderFqdnSets table for that domain.

Note that the bug condition doesn't happen when an account does three successful
issuances in a row, because finalizing an order (that is, issuing a certificate
for it) deletes the row in orderFqdnSets. Failing an authorization does not
delete the row in orderFqdnSets. I believe this was an intentional design
decision because an authorization can participate in many orders, and those
orders can have many other authorizations, so computing the updated state of
all those orders would be expensive (remember, order state is not persisted in
the DB but is calculated dynamically based on the authorizations it contains).

This wasn't detected in integration tests because we don't have any tests that
fail validation for the same domain multiple times. I filed an issue for an
integration test that would have incidentally caught this:
https://github.com/letsencrypt/boulder/issues/4332. There's also a more specific
test case in #4331.
2019-07-11 13:43:42 -04:00
alexzorin df2909a7ca va: Send extValue in TLSALPN unauthorized response (#4330)
Brings it more in line with the responses from the other two challenges and
will hopefully make the challenge a lot easier to debug (like in the recent community
thread).

```json
"error": {
  "type": "urn:ietf:params:acme:error:unauthorized",
  "detail": "Incorrect validation certificate for tls-alpn-01 challenge. Expected acmeValidationV1 extension value 836bf5358f8a32826c61faeff2e0225b00756f935b00ed3002cabb9d536b9f53 for this challenge but got 8539b12e31c306b81a0aedab4128722c6ad71f71f46316a3c71612f47df0e532",
  "status": 403
},
```
2019-07-11 09:08:14 -07:00
Jacob Hoffman-Andrews 2131065b2d
Revert "SA: improve performance of GetOrderForNames. (#4326)" (#4328)
This reverts commit 9fa360769e.

This commit can cause "gorp: multiple rows returned for: ..." under certain situations.

See #4329 for details of followup.
2019-07-09 14:33:28 -07:00
Jacob Hoffman-Andrews 9fa360769e SA: improve performance of GetOrderForNames. (#4326)
When there are a lot of potential orders to reuse, the query could scan
unnecessary rows, sometimes leading to timeouts. The new query, used
when the FasterGetOrderForNames feature flag is enabled, uses the
available index more effectively and adds a LIMIT clause.
2019-07-09 09:46:06 -04:00
Jacob Hoffman-Andrews e3f797f9dc grpc: Add better error message for timeouts. (#4324)
Right now we sometimes get errors like:

rpc error: code = Unknown desc = rpc error:
  code = DeadlineExceeded desc = context deadline exceeded

For instance, when an SA call times out, and the RA returns that
timed-out error to the WFE. These are kind of confusing because they
have two layers of nested gRPC error, and they don't provide additional
information about which SA call timed out.

This change replaces DeadlineExceeded errors with our own error type
that includes the service and the method that were called, as well as
the amount of time it took (which helps understand if timeouts are
happening because earlier calls ate up time towards the deadline).

When the RA->SA NewOrder call times out, and the RA returns that error to WFE:

"InternalErrors":["rpc error: code = Unknown desc =
  sa.StorageAuthority.NewOrder timed out after 14954 ms"]

When the WFE->RA NewOrder call times out:

"InternalErrors":["ra.RegistrationAuthority.NewOrder timed out after 15000 ms"]

Note that this change only handles timeouts at one level deep, which I
think is sufficient for our needs.
2019-07-08 13:47:25 -04:00
Roland Bracewell Shoemaker 3ea77270e3
Use primary key as cursor in cert-checker rather than serial (#4316)
`cert-checker` relies on undefined MySQL behavior that only sometimes holds, so we sometimes select fewer certificates than expected. Instead of adding an explicit ORDER BY, we switch to cursoring on the primary key, which also makes much more efficient use of indexes.

Fixes #4315.
2019-07-03 12:05:48 -07:00
Jacob Hoffman-Andrews 3af49a16be
Revert "integration: move to Python3 (#4313)" (#4323)
This reverts commit 796a7aa2f4.

People's tests have been breaking on `docker-compose up` with the following output:

```
ImportError: No module named requests
```

Fixes #4322
2019-07-03 11:35:45 -07:00
Roland Bracewell Shoemaker 0d9b48e280 PA: restructure error for single bad name in multi-name req (#4319) 2019-07-03 13:47:31 -04:00