4043 Commits

Author SHA1 Message Date
Jacob Hoffman-Andrews f366e45756 Remove global state from metrics gathering (#3167)
Previously, we used prometheus.DefaultRegisterer to register our stats, which uses global state to export its HTTP stats. We also used net/http/pprof's behavior of registering to the default global HTTP ServeMux, via DebugServer, which starts an HTTP server that uses that global ServeMux.

In this change, I merge DebugServer's functions into StatsAndLogging. StatsAndLogging now takes an address parameter and fires off an HTTP server in a goroutine. That HTTP server is newly defined, and doesn't use DefaultServeMux. On it is registered the Prometheus stats handler, and handlers for the various pprof traces. In the process I split StatsAndLogging internally into two functions: makeStats and MakeLogger. I didn't port across the expvar variable exporting, which serves a similar function to Prometheus stats but which we never use.

One nice immediate effect of this change: Since StatsAndLogging now requires an address, I noticed a bunch of commands that called StatsAndLogging, and passed around the resulting Scope, but never made use of it because they didn't run a DebugServer. Under the old StatsD world, these commands still could have exported their stats by pushing, but since we moved to Prometheus their stats stopped being collected. We haven't used any of these stats, so instead of adding debug ports to all short-lived commands, or setting up a push gateway, I simply removed them and switched those commands to initialize only a Logger, no stats.
2017-10-13 11:58:01 -07:00
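A minimal sketch of the pattern described in f366e45756: a newly created ServeMux, rather than http.DefaultServeMux, carries the pprof handlers and is served from a goroutine. The names newDebugMux, hasRoute and serveDebug are hypothetical and are not Boulder's; the Prometheus handler is left out so the sketch only needs the standard library.

```go
package main

import (
	"fmt"
	"net"
	"net/http"
	"net/http/pprof"
)

// newDebugMux returns a fresh ServeMux carrying the pprof handlers.
// This function registers nothing on http.DefaultServeMux.
func newDebugMux() *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/debug/pprof/", pprof.Index)
	mux.HandleFunc("/debug/pprof/cmdline", pprof.Cmdline)
	mux.HandleFunc("/debug/pprof/profile", pprof.Profile)
	mux.HandleFunc("/debug/pprof/symbol", pprof.Symbol)
	mux.HandleFunc("/debug/pprof/trace", pprof.Trace)
	return mux
}

// hasRoute reports whether mux has a handler registered for path.
func hasRoute(mux *http.ServeMux, path string) bool {
	req, err := http.NewRequest("GET", path, nil)
	if err != nil {
		return false
	}
	_, pattern := mux.Handler(req)
	return pattern != ""
}

// serveDebug listens on addr and serves the debug mux in a goroutine.
// Closing the returned listener stops the server.
func serveDebug(addr string) (net.Listener, error) {
	ln, err := net.Listen("tcp", addr)
	if err != nil {
		return nil, err
	}
	srv := &http.Server{Handler: newDebugMux()}
	go srv.Serve(ln) // returns once ln is closed
	return ln, nil
}

func main() {
	ln, err := serveDebug("127.0.0.1:0")
	if err != nil {
		fmt.Println("listen failed:", err)
		return
	}
	defer ln.Close()
	fmt.Println("debug server listening on", ln.Addr())
	fmt.Println("heap profile routed:", hasRoute(newDebugMux(), "/debug/pprof/heap"))
}
```

Because the server owns its mux, a short-lived command that never calls serveDebug exposes no debug endpoints at all.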
Jacob Hoffman-Andrews 0a72f768a7 Remove ProfileCmd. (#3166)
These stats are now all collected by Prometheus.
2017-10-13 10:02:04 -04:00
Jacob Hoffman-Andrews fa716769e9 Move CAConfig out of cmd. (#3165)
We removed most of the component-specific configs out of cmd a long time ago. We
left CAConfig in, because it is used directly in the NewCertificateAuthority
constructor. That means that moving CAConfig into cmd/boulder-ca would have
resulted in a circular dependency.

Eventually we probably want to decompose CAConfig so it's a set of arguments to
NewCertificateAuthority, but as a short term improvement, move the config into
its own package to break the dependency. This has the advantage of removing a
couple of big dependencies from cmd.
2017-10-12 16:16:38 -07:00
JP Phillips e83480f86b Return accurate error description for invalid authz limit (#3156)
Fixes #3144
2017-10-11 22:21:51 -07:00
Jacob Hoffman-Andrews 9c9c6f3739 Revert "Set timeout for GSB requests. (#3136)" (#3159)
This reverts commit 78f454bdfd.
2017-10-10 09:27:25 -07:00
Jacob Hoffman-Andrews 4e68fb2ff6 Switch to udp for internal DNS. (#3135)
We used to use TCP because we would request DNSSEC records from Unbound, and
they would always cause truncated records when present. Now that we no longer
request those (#2718), we can use UDP. This is better because the TCP serving
paths in Unbound are likely less thoroughly tested, and not optimized for high
load. In particular this may resolve some availability problems we've seen
recently when trying to upgrade to a more recent Unbound.

Note that this only affects the Boulder->Unbound path. The Unbound->upstream
path is already UDP by default (with TCP fallback for truncated ANSWERs).
2017-10-10 10:06:33 -04:00
Daniel McCarney aff1d64605 Clarify ACME divergences doc (#3154)
A frequent point of confusion is which ACME draft Boulder implements. Often people imagine (sensibly!) that there is one draft they can reference to understand Boulder.

This commit updates the divergences doc to clarify that it should be used to compare Boulder to whatever the most current ACME draft is and that Boulder doesn't implement a specific draft. This commit also adds a reference to what ACME v1 is and a link to the ACME v2 blog post.

Small references are also added to the "applications" concept from previous drafts. Otherwise folks who land on older ACME drafts may wonder why the divergences doc doesn't mention "applications", a concept that was renamed to "orders" in subsequent drafts. We do document divergences for "orders" and attention should be directed there.
2017-10-06 14:18:15 -07:00
Daniel McCarney 63f0008cce Fix example goose migration path. (#3155)
The contrib guide to migrations creates a migration with `goose create AddWizards sql` but references it later as
`sa/_db/20160915101011_WizardMigrations.sql`. This commit fixes the
reference to use the correct `AddWizards.sql` filename suffix.
2017-10-06 13:07:41 -07:00
Jacob Hoffman-Andrews 51991cd264 Fix logging of hostname in VA. (#3149)
The pbToAuthzMeta method in rpc/pb-marshalling.go only propagates ID and
registrationID, not hostname. So log the "domain" parameter instead.
2017-10-06 11:10:02 -07:00
Jacob Hoffman-Andrews 84eaf49352 Update to Go 1.9.1 (#3153)
Update the boulder-tools Docker image, and switch to the new tag.
2017-10-06 09:38:49 -04:00
Jacob Hoffman-Andrews 84692841fb Remove references to buser in Dockerfile. (#3152)
We originally planned to run as a non-root user in our Docker setup. We haven't
done that yet, so let's clean up the detritus until we do.
2017-10-06 09:36:44 -04:00
Jacob Hoffman-Andrews f472587a13 In tag_and_upload, ensure correct directory. (#3151)
This avoids accidentally building the Boulder Dockerfile instead of the
boulder-tools one.
2017-10-06 09:36:07 -04:00
Jacob Hoffman-Andrews b8ed544f86 Make fast_finish work. (#3150)
We have had fast_finish in our .travis.yml for a while but it wasn't working properly. I've moved things around so it now works correctly.
2017-10-06 09:35:40 -04:00
Daniel McCarney 1794c56eb8 Revert "Add CAA parameter to restrict challenge type (#3003)" (#3145)
This reverts commit 23e2c4a836.
2017-10-04 12:00:44 -07:00
Jacob Hoffman-Andrews 8aeb1a6b4d Set parallelism in SA's config-next (#3142) 2017-10-03 20:44:05 -07:00
Jacob Hoffman-Andrews 78f454bdfd Set timeout for GSB requests. (#3136)
safebrowsing defaults to a one minute timeout for its requests. Set a much lower
one so that we don't time out new-authzs when communicating with safebrowsing is
slow.
2017-10-03 10:50:30 -07:00
Jacob Hoffman-Andrews dbfb48226d Add parallelism to SA CountCertificatesByNames. (#3133)
Since we can make up to 100 SQL queries from this method (based on the 100-SAN
limit), sometimes it is too slow and we get a timeout for large certificates. By
running some of those queries in parallel, we can speed things up and stop
getting timeouts.
2017-10-02 15:45:08 -04:00
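A rough sketch of the idea behind dbfb48226d: run the per-name queries concurrently, with a cap on how many are in flight at once. The function countByNames, the countFunc type and the fake query are hypothetical stand-ins; the actual SA code and its SQL are not shown in this log.

```go
package main

import (
	"fmt"
	"sync"
)

// countFunc stands in for a single per-name SQL query (hypothetical).
type countFunc func(name string) (int, error)

// countByNames calls countOne for every name with at most `parallelism`
// calls running at once. It returns the first error recorded, if any.
func countByNames(names []string, parallelism int, countOne countFunc) (map[string]int, error) {
	if parallelism < 1 {
		parallelism = 1
	}
	var (
		mu       sync.Mutex
		wg       sync.WaitGroup
		firstErr error
		counts   = make(map[string]int, len(names))
		sem      = make(chan struct{}, parallelism)
	)
	for _, name := range names {
		wg.Add(1)
		sem <- struct{}{} // blocks while `parallelism` queries are running
		go func(name string) {
			defer wg.Done()
			defer func() { <-sem }()
			n, err := countOne(name)
			mu.Lock()
			defer mu.Unlock()
			if err != nil {
				if firstErr == nil {
					firstErr = err
				}
				return
			}
			counts[name] = n
		}(name)
	}
	wg.Wait()
	if firstErr != nil {
		return nil, firstErr
	}
	return counts, nil
}

func main() {
	// The fake query just returns the length of the name.
	fake := func(name string) (int, error) { return len(name), nil }
	counts, err := countByNames([]string{"a.example", "bb.example"}, 2, fake)
	fmt.Println(counts, err)
}
```

A buffered channel used as a semaphore keeps the concurrency limit configurable, which matches the later change that sets parallelism in the SA's config.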
lukaslihotzki 23e2c4a836 Add CAA parameter to restrict challenge type (#3003)
This commit adds CAA `issue` parameter parsing and the `challenge` parameter to permit a single challenge type only. By setting `challenge=dns-01`, the nameserver keeps control over every issued certificate.
2017-10-02 11:59:47 -07:00
Daniel McCarney 3d84bd9b99 Update `publicsuffix-go` to 02da67. (#3131)
Unit tests confirmed to pass:
```
HEAD is now at 02da67f... autopull: 2017-09-11T06:00:45Z (#92)
?   	github.com/weppos/publicsuffix-go/cmd/load	[no test files]
=== RUN   TestPublicSuffix
--- PASS: TestPublicSuffix (0.00s)
=== RUN   TestEffectiveTLDPlusOne
--- PASS: TestEffectiveTLDPlusOne (0.00s)
PASS
ok  	github.com/weppos/publicsuffix-go/net/publicsuffix	0.006s
=== RUN   TestValid
--- PASS: TestValid (0.00s)
=== RUN   TestIncludePrivate
--- PASS: TestIncludePrivate (0.00s)
=== RUN   TestIDNA
--- PASS: TestIDNA (0.00s)
=== RUN   TestPsl
--- PASS: TestPsl (0.01s)
=== RUN   TestNewListFromString
--- PASS: TestNewListFromString (0.00s)
=== RUN   TestNewListFromString_IDNAInputIsUnicode
--- PASS: TestNewListFromString_IDNAInputIsUnicode (0.00s)
=== RUN   TestNewListFromString_IDNAInputIsAscii
--- PASS: TestNewListFromString_IDNAInputIsAscii (0.00s)
=== RUN   TestNewListFromFile
--- PASS: TestNewListFromFile (0.00s)
=== RUN   TestListAddRule
--- PASS: TestListAddRule (0.00s)
=== RUN   TestListFind
--- PASS: TestListFind (0.00s)
=== RUN   TestNewRule_Normal
--- PASS: TestNewRule_Normal (0.00s)
=== RUN   TestNewRule_Wildcard
--- PASS: TestNewRule_Wildcard (0.00s)
=== RUN   TestNewRule_Exception
--- PASS: TestNewRule_Exception (0.00s)
=== RUN   TestNewRule_FromASCII
--- PASS: TestNewRule_FromASCII (0.00s)
=== RUN   TestNewRule_FromUnicode
--- PASS: TestNewRule_FromUnicode (0.00s)
=== RUN   TestNewRuleUnicode_FromASCII
--- PASS: TestNewRuleUnicode_FromASCII (0.00s)
=== RUN   TestNewRuleUnicode_FromUnicode
--- PASS: TestNewRuleUnicode_FromUnicode (0.00s)
=== RUN   TestRuleMatch
--- PASS: TestRuleMatch (0.00s)
=== RUN   TestRuleDecompose
--- PASS: TestRuleDecompose (0.00s)
=== RUN   TestLabels
--- PASS: TestLabels (0.00s)
=== RUN   TestToASCII
--- PASS: TestToASCII (0.00s)
=== RUN   TestCookieJarList
--- PASS: TestCookieJarList (0.00s)
PASS
ok  	github.com/weppos/publicsuffix-go/publicsuffix	0.024s
```
2017-10-02 10:28:58 -07:00
Jacob Hoffman-Andrews b80b129d1a Remove unused requestMethod and VerificationMethods. (#3129)
These were added as part of #62,
based on the original CPS at
https://letsencrypt.org/documents/ISRG-CPS-May-5-2015.pdf. Request method was an
odd thing to log because for Let's Encrypt it will always be "online", never "in
person." And VerificationMethods is logged separately during the authz
validation process. The newest CPS at
https://letsencrypt.org/documents/isrg-cps-v2.0/ no longer requires these
specific fields, so we're removing them for clarity.
2017-09-29 12:35:58 -07:00
Jacob Hoffman-Andrews d2883f12c1 Remove TimingDuration call from VA (#3122)
Also switch over tests.

Fixes #3100
2017-09-28 14:25:22 -07:00
Roland Bracewell Shoemaker e77b886f85 Fix load-generator for multi-va (#3124)
Wait for the authorization state to change before deleting the HTTP-01 challenge from the challenge server.
2017-09-26 14:20:04 -07:00
Jacob Hoffman-Andrews 97265c9184 Factor out context.go from wfe and wfe2. (#3086)
* Move probs.go to web.

* Move probs_test.go

* Factor out probs.go from wfe

* Move context.go

* Extract context.go into web package.

* Add a constructor for TopHandler.
2017-09-26 13:54:14 -04:00
Daniel McCarney 966e02313f Forbid HTTP redirects to non-80/443 ports. (#3115)
Prior to this commit the VA would follow redirects from the initial
HTTP-01 challenge request on port 80 to any other port. In practice the
Let's Encrypt production environment has network egress firewall rules
that drop outbound requests that are not on port 80 or 443. In effect
this meant any challenge request that was redirected from 80 to a port
other than 80/443 was turned into a mysterious connection timeout error.

We have decided to preserve the egress firewall rule and continue to act
conservatively. Only port 80 and 443 should be allowed in redirects.

This commit updates the VA to return a clear error message when
a non-80/443 redirect is made.

To aid in testing/configuration the actual ports enforced are specified
by the va.httpPort and va.httpsPort that are used for the initial
outbound HTTP-01 connection.

The VA TestHTTPRedirectLookup unit test is updated accordingly to test
that a non-80/443 redirect fails with the expected message.

Resolves #3049
2017-09-25 10:19:10 -07:00
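The port restriction from 966e02313f could be sketched roughly as follows, using the redirect hook on Go's http.Client. The names checkRedirectPort, redirectAllowed and newClient are hypothetical, and treating a URL without an explicit port as the scheme default is an assumption of this sketch, not something the commit message states.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"strconv"
)

// checkRedirectPort returns an error unless the redirect target uses one
// of the two allowed ports. A URL without an explicit port is treated as
// using its scheme default (80 for http, 443 for https).
func checkRedirectPort(u *url.URL, httpPort, httpsPort int) error {
	port := 0
	switch {
	case u.Port() != "":
		p, err := strconv.Atoi(u.Port())
		if err != nil {
			return fmt.Errorf("invalid port in redirect target %q", u.Host)
		}
		port = p
	case u.Scheme == "http":
		port = 80
	case u.Scheme == "https":
		port = 443
	default:
		return fmt.Errorf("invalid scheme %q in redirect target", u.Scheme)
	}
	if port != httpPort && port != httpsPort {
		return fmt.Errorf("invalid port in redirect target %q: only ports %d and %d are supported",
			u.Host, httpPort, httpsPort)
	}
	return nil
}

// redirectAllowed is a small convenience wrapper for raw URL strings.
func redirectAllowed(raw string, httpPort, httpsPort int) bool {
	u, err := url.Parse(raw)
	if err != nil {
		return false
	}
	return checkRedirectPort(u, httpPort, httpsPort) == nil
}

// newClient wires the check into http.Client's redirect hook. Setting
// CheckRedirect replaces the default 10-redirect cap, so it is restated.
func newClient(httpPort, httpsPort int) *http.Client {
	return &http.Client{
		CheckRedirect: func(req *http.Request, via []*http.Request) error {
			if len(via) >= 10 {
				return fmt.Errorf("stopped after 10 redirects")
			}
			return checkRedirectPort(req.URL, httpPort, httpsPort)
		},
	}
}

func main() {
	_ = newClient(80, 443)
	for _, raw := range []string{"https://example.com/x", "http://example.com:8080/x"} {
		fmt.Println(raw, "allowed:", redirectAllowed(raw, 80, 443))
	}
}
```

Returning a descriptive error from the hook is what turns the former silent connection timeout into a clear failure message.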
Daniel McCarney 3408b62720 Include the domain name in problems from IsCAAValid. (#3116)
For certificates with many domains it can be difficult to associate
a given CAA error with the specific domain that caused it. To make this
easier this commit explicitly prefixes all of the problems that can be
returned from `va.IsCAAValid` with the domain name in question.

A small unit test is included to check a CAA problem's detail message is
suitably prefixed with the affected domain.
2017-09-25 13:11:50 -04:00
Roland Bracewell Shoemaker b7bca87134 Batch fetching of existing authorizations and creation of pending authorizations (#3058)
For the new-order endpoint only. This also refactors the order of operations in `ra.NewAuthorization` to reduce duplicated code for creating pending authorizations. Existing tests still seem to work as intended, but a close eye should be given to this since we don't yet have integration tests that exercise it end to end. This also changes the inner type of `grpc.StorageAuthorityServerWrapper` to `core.StorageAuthority` so that we can avoid a circular import that is created by needing to import `grpc.AuthzToPB` and `grpc.PBToAuthz` in `sa/sa.go`.

This is a big change but should considerably improve the performance of the new-order flow.

Fixes #2955.
2017-09-25 09:10:59 -07:00
Daniel McCarney 9b922b9feb Ensure `LockCol` is set correctly on reg update. (#3113)
In 2fb247488f we consolidated the
`regModelV2` and `regModelv1` structs to one `regModel` type. In the
process we accidentally lost the explicit assignment of the
to-be-updated registration model's `LockCol` with the value of the
existing registration's `LockCol`. This meant that the Update was
occurring with a where clause `LockCol=0` (the default value).

In practice this meant that the first reg update would succeed (since
the reg row starts with LockCol=0) but any regs that had already been
updated once before would modify 0 rows in the update (because the where
clause on `LockCol` failed) and this in turn was translated into
a ServerInternal error since we knew the reg being updated did exist.

This commit updates the SA's `UpdateRegistration` function to properly
set the `LockCol` on the to-be-updated row.

This commit additionally adds an integration test for registration
contact information updating to ensure we don't fall into this trap in
the future.
2017-09-22 15:41:22 -07:00
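The failure mode described in 9b922b9feb can be reproduced with a toy in-memory table. This is only an illustration of optimistic locking on a version column; the row and table types and the update method are hypothetical and do not reflect Boulder's SA or its database layer.

```go
package main

import "fmt"

// row is a stand-in for a registration row with an optimistic-lock column.
type row struct {
	Contact string
	LockCol int
}

// table maps a registration ID to its stored row.
type table map[int]row

// update mimics "UPDATE ... WHERE id = ? AND LockCol = ?": it writes only
// when the caller's LockCol matches the stored one, then bumps the
// version. It returns the number of rows modified.
func (tb table) update(id int, r row) int {
	cur, ok := tb[id]
	if !ok || cur.LockCol != r.LockCol {
		return 0
	}
	r.LockCol++
	tb[id] = r
	return 1
}

func main() {
	regs := table{1: {Contact: "mailto:a@example.com"}}

	// Buggy pattern: a fresh model whose LockCol is left at the zero value.
	fmt.Println(regs.update(1, row{Contact: "mailto:b@example.com"})) // 1: stored LockCol was 0
	fmt.Println(regs.update(1, row{Contact: "mailto:c@example.com"})) // 0: stored LockCol is now 1

	// Fixed pattern: copy LockCol from the existing row before updating.
	existing := regs[1]
	fmt.Println(regs.update(1, row{Contact: "mailto:c@example.com", LockCol: existing.LockCol})) // 1
}
```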
Daniel McCarney f0a9e9aa0e Fix duplicate cert limit off-by-one. (#3117)
The RA's `checkCertificatesPerFQDNSetLimit` function was using `>` where
it should have been using `>=` when evaluating a FQDNSet count against
the rate limit threshold. This resulted in an off by one error where we
allowed 1 more duplicate certificate than intended.

This commit fixes the off-by-one error and adds a short unit test. The
unit test failed the
`TestCheckExactCertificateLimit/FQDN_set_issuances_equal_to_limit`
subtest before the fix was applied and passes afterwards.
2017-09-22 14:45:48 -07:00
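To make the off-by-one in f0a9e9aa0e concrete, here is a small sketch of the comparison. The names overLimit and issue are hypothetical; only the `>` versus `>=` distinction comes from the commit message.

```go
package main

import "fmt"

// overLimit reports whether a new issuance should be rejected, given the
// number of certificates already issued for the exact same set of names.
// With ">" instead of ">=", a request arriving when count == threshold
// would slip through, allowing threshold+1 certificates in total.
func overLimit(count, threshold int) bool {
	return count >= threshold
}

// issue simulates repeated requests and returns how many were allowed.
func issue(requests, threshold int) int {
	issued := 0
	for i := 0; i < requests; i++ {
		if !overLimit(issued, threshold) {
			issued++
		}
	}
	return issued
}

func main() {
	fmt.Println(issue(10, 5)) // prints 5
}
```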
Kleber Correia 9e763c25fb Remove RandomDirectoryEntry feature flag (#3101) 2017-09-21 09:26:23 -04:00
Jacob Hoffman-Andrews fce975a1e6 Move CAA mocks into caa_test. (#3084)
There were a bunch of test fixtures in bdns/mocks.go that were only used in va/caa_test.go. This moves them to be in the same file so we have less spooky action at a distance.

One side-effect: We can't construct bdns.DNSError with the internal fields we want, because those fields are unexported. So we switch a couple of mock cases to just return a generic error, and the corresponding test cases to expect that error.
2017-09-18 13:10:01 -07:00
Jacob Hoffman-Andrews 4f1f5cd689 Factor out probs.go from wfe and wfe2 (#3085)
This is shared code between both packages. Better to have it in a single shared place.

In the process, remove the unexported signatureValidationError, which was unnecessary; all returned errors from checkAlgorithm get turned into Malformed.
2017-09-18 13:08:18 -07:00
Kleber Correia 72b1c69761 Remove RecheckCAA feature flag (#3103)
Updates #2692.
2017-09-18 11:59:58 -07:00
Kleber Correia 172164848b Remove DirectoryMeta feature flag (#3102)
Fixes #2692.
2017-09-18 11:58:42 -07:00
Roland Bracewell Shoemaker 0ab1e2ff46 Raise treeClimbingLookupCAA limit (#3098) 2017-09-15 18:30:29 -07:00
Roland Bracewell Shoemaker 8a2ad13a87 Don't tree climb for trees we've already climbed (#3096)
Prevents repeated lookups in traditional CNAME or tree-based CNAME loops.
2017-09-15 19:29:35 -04:00
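One way to picture the idea in 8a2ad13a87 is a visited set shared across the whole tree climb, so a name is looked up at most once even when CNAMEs point back into a tree that is already being climbed. The sketch below is an assumption-laden toy: the zone map replaces real DNS, and record, climber and findCAA are hypothetical names, not Boulder's lookup code.

```go
package main

import (
	"fmt"
	"strings"
)

// record is a toy DNS answer: CAA values plus an optional CNAME target.
type record struct {
	caa   []string
	cname string
}

// zone stands in for real DNS lookups, keyed by fully qualified name.
type zone map[string]record

// parent strips the leftmost label; it returns "" at the top of the tree.
func parent(name string) string {
	i := strings.Index(name, ".")
	if i < 0 {
		return ""
	}
	return name[i+1:]
}

// climber carries the visited set and a lookup counter across recursion.
type climber struct {
	z       zone
	visited map[string]bool
	lookups int
}

// climb walks from name towards the root, following CNAMEs, and stops at
// any name whose tree has already been (or is being) climbed.
func (c *climber) climb(name string) []string {
	for n := name; n != ""; n = parent(n) {
		if c.visited[n] {
			return nil // this name and its ancestors are handled elsewhere
		}
		c.visited[n] = true
		c.lookups++
		rec := c.z[n]
		if len(rec.caa) > 0 {
			return rec.caa
		}
		if rec.cname != "" {
			if caa := c.climb(rec.cname); len(caa) > 0 {
				return caa
			}
		}
	}
	return nil
}

// findCAA returns the first CAA set found and the number of lookups made.
func findCAA(name string, z zone) ([]string, int) {
	c := &climber{z: z, visited: map[string]bool{}}
	return c.climb(name), c.lookups
}

func main() {
	z := zone{
		"a.example.com": {cname: "b.example.com"},
		"b.example.com": {cname: "a.example.com"}, // CNAME loop
		"example.com":   {caa: []string{"letsencrypt.org"}},
	}
	caa, n := findCAA("a.example.com", z)
	fmt.Println(caa, n)
}
```

In this toy zone the loop between the two aliases costs one lookup each, and the climb then continues to the shared parent instead of cycling.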
Roland Bracewell Shoemaker 02bad19779 Fix weak-key-search key casting (#3095) 2017-09-15 15:04:17 -04:00
Roland Bracewell Shoemaker d1d6cab8ce Fix CAA test (#3092) 2017-09-14 16:02:54 -07:00
Daniel McCarney 0e4466bb30 Update gopkg.in/go-jose.v2 to v2.1.3. (#3087)
The 2.1.3 release of go-jose.v2 contains a bug fix for a nil panic
encountering null values in JWS headers that affects Boulder. This
commit updates Boulder to use the 2.1.3 release.

Unit tests were confirmed to pass:
```
$ go test ./...
ok      gopkg.in/square/go-jose.v2      13.648s
ok      gopkg.in/square/go-jose.v2/cipher       0.003s
?       gopkg.in/square/go-jose.v2/jose-util    [no test files]
ok      gopkg.in/square/go-jose.v2/json 1.199s
ok      gopkg.in/square/go-jose.v2/jwt  0.064s
```
2017-09-14 14:29:26 -07:00
Jacob Hoffman-Andrews 254537ea48 Reduce duplicated logging of errors in WFE (#3071)
In 7d04ea9 we introduced the notion of a requestEvent, which had an AddError method that could be called to log an error. In that change we also added an AddError call before every wfe.sendError, to ensure errors got logged. In dc58017, we made it so that sendError would automatically add its errors to the request event, so we wouldn't need to write AddError everywhere. However, we never cleaned up the existing AddError calls, and since then have tended to "follow local style" and add a redundant AddError before many of our sendError calls.

This change attempts to undo some of that, by removing all AddError calls that appear to be redundant with the sendError call immediately following. It also adds a section on error handling to CONTRIBUTING.md.
2017-09-14 14:19:40 -07:00
Jacob Hoffman-Andrews 9ab2ff4e03 Add CAA-specific error. (#3051)
Previously, CAA problems were lumped in under "ConnectionProblem" or
"Unauthorized". This should make things clearer and easier to differentiate.

Fixes #3043
2017-09-14 14:11:41 -07:00
Brian Smith 9d324631a7 CA: Set certificates' validity periods using the CA's clock. (#2983) 2017-09-14 09:40:31 -04:00
Jacob Hoffman-Andrews 1b156822a1 Add verifyPOST and NewReg tests when GetRegByKey fails (#3062)
I thought there was a bug in NewRegistration when GetRegByKey returns an error, so I wrote a unit test... and discovered it works correctly. Oh well, now we have more tests!
2017-09-13 17:07:43 -07:00
Jacob Hoffman-Andrews 4128e0d95a Add time-dependent integration testing (#3060)
Fixes #3020.

In order to write integration tests for some features, especially related to rate limiting, rechecking of CAA, and expiration of authzs, orders, and certs, we need to be able to fake the passage of time in integration tests.

To do so, this change switches out all clock.Default() instances for cmd.Clock(), which can be set manually with the FAKECLOCK environment variable. integration-test.py now starts up all servers once before the main body of tests, with FAKECLOCK set to a date 70 days ago, and does some initial setup for a new integration test case. That test case tries to fetch a 70-day-old authz URL, and expects it to 404.

In order to make this work, I also had to change a number of our test binaries to shut down cleanly in response to SIGTERM. Without that change, stopping the servers between the setup phase and the main tests caused startservers.check() to fail, because some processes exited with nonzero status.

Note: This is an initial stab at things, to prove out the technique. Long-term, I think we will want to use an idiom where test cases are classes that have a number of optional setup phases that may be run at e.g. 70 days prior and 5 days prior. This could help us avoid a proliferation of global state as we add more time-dependent test cases.
2017-09-13 12:34:14 -07:00
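The FAKECLOCK mechanism from 4128e0d95a might look roughly like the sketch below. This is an illustration only: the clock interface, realClock, fakeClock and clockFromValue are hypothetical names, and the RFC 3339 timestamp format is an assumption, since the commit message does not say what format cmd.Clock() accepts.

```go
package main

import (
	"fmt"
	"os"
	"time"
)

// clock is the minimal interface callers depend on instead of time.Now.
type clock interface {
	Now() time.Time
}

// realClock reports the actual wall-clock time.
type realClock struct{}

func (realClock) Now() time.Time { return time.Now() }

// fakeClock always reports the same fixed instant.
type fakeClock struct{ at time.Time }

func (f fakeClock) Now() time.Time { return f.at }

// clockFromValue returns the real clock for an empty value, or a fake
// clock pinned to the parsed time. RFC 3339 is an assumed format here.
func clockFromValue(v string) (clock, error) {
	if v == "" {
		return realClock{}, nil
	}
	at, err := time.Parse(time.RFC3339, v)
	if err != nil {
		return nil, fmt.Errorf("parsing FAKECLOCK %q: %w", v, err)
	}
	return fakeClock{at: at}, nil
}

func main() {
	clk, err := clockFromValue(os.Getenv("FAKECLOCK"))
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(clk.Now().Format(time.RFC3339))
}
```

Starting a server with, for example, `FAKECLOCK=2017-07-05T00:00:00Z` set would then make every caller of clk.Now() see that date, which is the property the 70-day-old authz test relies on.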
Roland Bracewell Shoemaker c03d96212b Update vendored github.com/cloudflare/cfssl (#3078) 2017-09-13 15:23:38 -04:00
Roland Bracewell Shoemaker 94e4947c58 Remove RSAPSS signatures from the list of acceptable CSR signing algs (#3079) 2017-09-13 14:07:20 -04:00
Roland Bracewell Shoemaker 2d3fc8c4b4 Add tool to search for certificates containing debian weak keys (#3077)
Fixes #3074.
2017-09-13 10:59:58 -07:00
Jacob Hoffman-Andrews 08d2018c10 Reject CAA responses containing DNAMEs (#3082)
Since the legacy CAA spec does the wrong thing with DNAMEs (treating them as
CNAMEs), and it's hard to reconcile this approach with CNAME handling, and
DNAMEs are extremely rare, reject outright any CAA responses containing DNAMEs.

Also, in the process, fix a bug in the previous LegacyCAA implementation.
Because the processing of records in LookupCAA was gated by `if
answer.Header().RRType == dnsType`, non-CAA responses were filtered out. This
wasn't caught by previous testing, because it was unit testing that mocked out
bdns.
2017-09-13 10:54:48 -07:00
Jacob Hoffman-Andrews 4266853092 Implement legacy form of CAA (#3075)
This implements the pre-erratum 5065 version of CAA, behind a feature flag.

This involved refactoring DNSClient.LookupCAA to return a list of CNAMEs in addition to the CAA records, and adding an alternate lookuper that does tree-climbing on single-depth aliases.
2017-09-13 10:16:12 -04:00
Kleber Correia 2fb247488f Consolidate registration model (#3064)
* Consolidate registration model

* Use regModel instead of empty interface
2017-09-12 12:35:40 -04:00
Roland Bracewell Shoemaker 9d34af6a82 Set AD bit in the header of DNS queries (#3068)
Fixes broken DNSSEC metrics. The lack of this bit in queries had no security implications.
2017-09-12 09:28:07 -07:00