Our proto files had a variety of indentation styles: 2 spaces,
4 spaces, 8 spaces, and tabs; sometimes mixed within the same
file. The proto3 style guide[1] says to use 2-space indents,
so this change standardizes on that.
[1] https://developers.google.com/protocol-buffers/docs/style
And make corresponding changes to call sites and wrappers.
Note that proto2 vs proto3 is distinction in the syntax of the .proto files
and doesn't change the wire format, so this meets the deployability
guidelines.
There are some changes to the code generated in the latest version, so
this modifies every .pb.go file.
Also, the way protoc-gen-go decides where to put files has changed, so
each generate.go gets the --go_opt=paths=source_relative flag to
tell protoc to continue placing output next to the input.
Remove staticcheck from build.sh; we get it via golangci-lint now.
Pass --no-document to gem install fpm; this is recommended in the fpm docs.
In a handful of places I've nuked old stats which are not used in any alerts or dashboards as they either duplicate other stats or don't provide much insight/have never actually been used. If we feel like we need them again in the future it's trivial to add them back.
There aren't many dashboards that rely on old statsd style metrics, but a few will need to be updated when this change is deployed. There are also a few cases where prometheus labels have been changed from camel to snake case, dashboards that use these will also need to be updated. As far as I can tell no alerts are impacted by this change.
Fixes#4591.
SCT signature verification already happens within the CT client, so we are
currently doing it twice for no reason. This change removes the redundant check
we perform.
In Boulder Issue 3821[0] we found that HTTP/2 support was causing hard
to diagnose intermittent freezes in CT submission. Disabling HTTP/2 with
an environment variable resolved the freezes but is not a stable fix.
Per the Go `http` package docs we can make this change persistent by
changing the `http.Transport` config:
Programs that must disable HTTP/2 can do so by setting
Transport.TLSNextProto (for clients) or Server.TLSNextProto (for
servers) to a non-nil, empty map"
[0]: https://github.com/letsencrypt/boulder/issues/3821
Precursor to #4116. Since some of our dependencies impose a minimum
version on these two packages higher than what we have in Godeps, we'll
have to bump them anyhow. Bumping them independently of the modules
update should keep things a little simpler.
In order to get protobuf tests to pass, I had to update protoc-gen-go in
boulder-tools. Now we download a prebuilt binary instead of using the
Ubuntu package, which is stuck on 3.0.0. This also meant I needed to
re-generate our pb.go files, since the new version generates somewhat
different output.
This happens to change the tag for pbutil, but it's not a substantive change - they just added a tagged version where there was none.
$ go test github.com/miekg/dns/...
ok github.com/miekg/dns 4.675s
ok github.com/miekg/dns/dnsutil 0.003s
ok github.com/golang/protobuf/descriptor (cached)
ok github.com/golang/protobuf/jsonpb (cached)
? github.com/golang/protobuf/jsonpb/jsonpb_test_proto [no test files]
ok github.com/golang/protobuf/proto (cached)
? github.com/golang/protobuf/proto/proto3_proto [no test files]
? github.com/golang/protobuf/proto/test_proto [no test files]
ok github.com/golang/protobuf/protoc-gen-go (cached)
? github.com/golang/protobuf/protoc-gen-go/descriptor [no test files]
ok github.com/golang/protobuf/protoc-gen-go/generator (cached)
ok github.com/golang/protobuf/protoc-gen-go/generator/internal/remap (cached)
? github.com/golang/protobuf/protoc-gen-go/grpc [no test files]
? github.com/golang/protobuf/protoc-gen-go/plugin [no test files]
ok github.com/golang/protobuf/ptypes (cached)
? github.com/golang/protobuf/ptypes/any [no test files]
? github.com/golang/protobuf/ptypes/duration [no test files]
? github.com/golang/protobuf/ptypes/empty [no test files]
? github.com/golang/protobuf/ptypes/struct [no test files]
? github.com/golang/protobuf/ptypes/timestamp [no test files]
? github.com/golang/protobuf/ptypes/wrappers [no test files]
Now that Pebble has a `pebble-challtestsrv` we can remove the `challtestrv`
package and associated command from Boulder. I switched CI to use
`pebble-challtestsrv`. Notably this means that we have to add our expected mock
data using the HTTP management interface. The Boulder-tools images are
regenerated to include the `pebble-challtestsrv` command.
Using this approach also allows separating the TLS-ALPN-01 and HTTPS HTTP-01
challenges by binding each challenge type in the `pebble-challtestsrv` to
different interfaces both using the same VA
HTTPS port. Mock DNS directs the VA to the correct interface.
The load-generator command that was previously using the `challtestsrv` package
from Boulder is updated to use a vendored copy of the new
`github.org/letsencrypt/challtestsrv` package.
Vendored dependencies change in two ways:
1) Gomock is updated to the latest release (matching what the Bouldertools image
provides)
2) A couple of new subpackages in `golang.org/x/net/` are added by way of
transitive dependency through the challtestsrv package.
Unit tests are confirmed to pass for `gomock`:
```
~/go/src/github.com/golang/mock/gomock$ git log --pretty=format:'%h' -n 1
51421b9
~/go/src/github.com/golang/mock/gomock$ go test ./...
ok github.com/golang/mock/gomock 0.002s
? github.com/golang/mock/gomock/internal/mock_matcher [no test files]
```
For `/x/net` all tests pass except two `/x/net/icmp` `TestDiag.go` test cases
that we have agreed are OK to ignore.
Resolves https://github.com/letsencrypt/boulder/issues/3962 and
https://github.com/letsencrypt/boulder/issues/3951
Even though we create a different `http.Client` for each log, all those
clients have their `Transport` pointed at `http.DefaultTransport`, which
means they share a connection pool and the relevant locks.
We occasionally see a bug where a single publisher instance will hang on
all log submissions, while other publisher instances continue normally.
One possibility is that `http.Client` is blocking on some internal state
related to connection pooling. To rule that out, we provide a separate
`Transport` for each log.
This updates Boulder's vendored dependency for `github.com/google/certificate-transparency-go` to c25855a, the tip of master at the time of writing.
Unit tests are confirmed to pass:
```
$ git log --pretty=format:'%h' -n 1
c25855a
$ go test ./...
ok github.com/google/certificate-transparency-go (cached)
ok github.com/google/certificate-transparency-go/asn1 (cached)
ok github.com/google/certificate-transparency-go/client 22.985s
? github.com/google/certificate-transparency-go/client/configpb [no test files]
? github.com/google/certificate-transparency-go/client/ctclient [no test files]
ok github.com/google/certificate-transparency-go/ctpolicy (cached)
ok github.com/google/certificate-transparency-go/ctutil (cached)
? github.com/google/certificate-transparency-go/ctutil/sctcheck [no test files]
? github.com/google/certificate-transparency-go/ctutil/sctscan [no test files]
ok github.com/google/certificate-transparency-go/dnsclient (cached)
ok github.com/google/certificate-transparency-go/fixchain 0.091s
? github.com/google/certificate-transparency-go/fixchain/chainfix [no test files]
ok github.com/google/certificate-transparency-go/fixchain/ratelimiter (cached)
ok github.com/google/certificate-transparency-go/gossip (cached)
? github.com/google/certificate-transparency-go/gossip/gossip_server [no test files]
ok github.com/google/certificate-transparency-go/gossip/minimal 0.028s
? github.com/google/certificate-transparency-go/gossip/minimal/configpb [no test files]
? github.com/google/certificate-transparency-go/gossip/minimal/goshawk [no test files]
? github.com/google/certificate-transparency-go/gossip/minimal/gosmin [no test files]
ok github.com/google/certificate-transparency-go/gossip/minimal/x509ext (cached)
ok github.com/google/certificate-transparency-go/ingestor/ranges (cached)
ok github.com/google/certificate-transparency-go/jsonclient 0.007s
ok github.com/google/certificate-transparency-go/logid (cached)
ok github.com/google/certificate-transparency-go/loglist (cached)
? github.com/google/certificate-transparency-go/loglist/findlog [no test files]
ok github.com/google/certificate-transparency-go/loglist2 (cached)
? github.com/google/certificate-transparency-go/preload [no test files]
? github.com/google/certificate-transparency-go/preload/dumpscts [no test files]
? github.com/google/certificate-transparency-go/preload/preloader [no test files]
ok github.com/google/certificate-transparency-go/scanner 0.009s
? github.com/google/certificate-transparency-go/scanner/scanlog [no test files]
ok github.com/google/certificate-transparency-go/tls (cached)
ok github.com/google/certificate-transparency-go/trillian/ctfe (cached)
? github.com/google/certificate-transparency-go/trillian/ctfe/configpb [no test files]
? github.com/google/certificate-transparency-go/trillian/ctfe/ct_server [no test files]
? github.com/google/certificate-transparency-go/trillian/ctfe/testonly [no test files]
ok github.com/google/certificate-transparency-go/trillian/integration 0.023s
? github.com/google/certificate-transparency-go/trillian/integration/ct_hammer [no test files]
? github.com/google/certificate-transparency-go/trillian/migrillian [no test files]
? github.com/google/certificate-transparency-go/trillian/migrillian/configpb [no test files]
ok github.com/google/certificate-transparency-go/trillian/migrillian/core (cached)
? github.com/google/certificate-transparency-go/trillian/mockclient [no test files]
ok github.com/google/certificate-transparency-go/trillian/util (cached)
ok github.com/google/certificate-transparency-go/x509 (cached)
? github.com/google/certificate-transparency-go/x509/pkix [no test files]
? github.com/google/certificate-transparency-go/x509util [no test files]
? github.com/google/certificate-transparency-go/x509util/certcheck [no test files]
? github.com/google/certificate-transparency-go/x509util/crlcheck [no test files]
```
Resolves https://github.com/letsencrypt/boulder/issues/3872
**Note to reviewers**: There's an outstanding bug that I've tracked down to the `--load` stage of the integration tests that results in one of the remote VA instances in the `test/config-next` configuration under Go 1.11.1 to fail to cleanly shut down. I'm working on finding the root cause but in the meantime I've disabled `--load` during CI so we can unblock moving forward with getting Go 1.11.1 in dev/CI. Tracking this in https://github.com/letsencrypt/boulder/issues/3889
Does a simpler probe than compared to using a `blackbox_exporter`, but directly collects the info we think will aid debugging publisher outages.
Updates #3821.
Things removed:
* features.EmbedSCTs (and all the associated RA/CA/ocsp-updater code etc)
* ca.enablePrecertificateFlow (and all the associated RA/CA code)
* sa.AddSCTReceipt and sa.GetSCTReceipt RPCs
* publisher.SubmitToCT and publisher.SubmitToSingleCT RPCs
Fixes#3755.
A very large number of the logger calls are of the form log.Function(fmt.Sprintf(...)).
Rather than sprinkling fmt.Sprintf at every logger call site, provide formatting versions
of the logger functions and call these directly with the format and arguments.
While here remove some unnecessary trailing newlines and calls to String/Error.
Submits final certificates to any configured CT logs. This doesn't introduce a feature flag as it is config gated, any log we want to submit final certificates to needs to have it's log description updated to include the `"submitFinalCerts": true` field.
Fixes#3605.
Also instead of repeating the same bucket definitions everywhere just use a single top level var in the metrics package in order to discourage copy/pasting.
Fixes#3607.
In publisher and in the integration test, check that SCTs are in a
reasonable range. Also, update CreateTestingSignedSCT (used by
ct-test-srv) to produce SCTs correctly with a timetamp in Unix epoch
milliseconds.
This change updates boulder-tools to use Go 1.10, and references a
newly-pushed image built using that new config.
Since boulder-tools pulls in the latest Certbot master at the time of
build, this also pulls in the latest changes to Certbot's acme module,
which now supports ACME v2. This means we no longer have to check out
the special acme-v2-integration branch in our integration tests.
This also updates chisel2.py to reflect some of the API changes that
landed in the acme module as it was merged to master.
Since we don't need additional checkouts to get the ACMEv2-compatible
version of the acme module, we can include it in the default RUN set for
local tests.
With the CT policy changes, we cancel any outstanding requests in a
group once we've gotten a successful response from any log in that
group. That means various function calls will return early with an error
code indicating cancellation. We want to avoid logging such error codes,
because they were not really errors, they were intentional.
This change introduces a small utility package called "canceled", which
checks for both context.Canceled and the gRPC return code indicating
cancelled.
shell_test.go and publisher_test.go had unnecessary references to
../test/test-ca.pem. This change makes them a little more self-contained.
Note: ca/ca_test.go still depends on test-ca.pem, but removing the dependency
turns out to be a little more complicated due to hardcoded expectations in some
of the test cases.
Updates the buckets for histograms in the publisher, va, and expiration-mailer which are used to measure the latency of operations that go over the internet and therefore are liable to take a lot longer than the default buckets can measure. Uses a standard set of buckets for all three instead of attempting to tune for each one.
Fixes#3217.
Makes a couple of changes:
* Change `SubmitToCT` to make submissions to each log in parallel instead of in serial, this prevents a single slow log from eating up the majority of the deadline and causing submissions to other logs to fail
* Remove the 'submissionTimeout' field on the publisher since it is actually bounded by the gRPC timeout as is misleading
* Add a timeout to the CT clients internal HTTP client so that when log servers hang indefinitely we actually do retries instead of just using the entire submission deadline. Currently set at 2.5 minutes
Fixes#3218.
Switches imports from `github.com/google/certificate-transparency` to `github.com/google/certificate-transparency-go` and vendors the new code. Also fixes a number of small breakages caused by API changes since the last time we vendored the code. Also updates `github.com/cloudflare/cfssl` since you can't vendor both `github.com/google/certificate-transparency` and `github.com/google/certificate-transparency-go`.
Side note: while doing this `godep` tried to pull in a number of imports under the `golang.org/x/text` repo that I couldn't find actually being used anywhere so I just dropped the changes to `Godeps/Godeps.json` and didn't add the vendored dir to the tree, let's see if this breaks any tests...
All tests pass
```
$ go test ./...
ok github.com/google/certificate-transparency-go 0.640s
ok github.com/google/certificate-transparency-go/asn1 0.005s
ok github.com/google/certificate-transparency-go/client 22.054s
? github.com/google/certificate-transparency-go/client/ctclient [no test files]
ok github.com/google/certificate-transparency-go/fixchain 0.133s
? github.com/google/certificate-transparency-go/fixchain/main [no test files]
ok github.com/google/certificate-transparency-go/fixchain/ratelimiter 27.752s
ok github.com/google/certificate-transparency-go/gossip 0.322s
? github.com/google/certificate-transparency-go/gossip/main [no test files]
ok github.com/google/certificate-transparency-go/jsonclient 25.701s
ok github.com/google/certificate-transparency-go/merkletree 0.006s
? github.com/google/certificate-transparency-go/preload [no test files]
? github.com/google/certificate-transparency-go/preload/dumpscts/main [no test files]
? github.com/google/certificate-transparency-go/preload/main [no test files]
ok github.com/google/certificate-transparency-go/scanner 0.013s
? github.com/google/certificate-transparency-go/scanner/main [no test files]
ok github.com/google/certificate-transparency-go/tls 0.033s
ok github.com/google/certificate-transparency-go/x509 1.071s
? github.com/google/certificate-transparency-go/x509/pkix [no test files]
? github.com/google/certificate-transparency-go/x509util [no test files]
```
```
$ ./test.sh
...
ok github.com/cloudflare/cfssl/api 1.089s coverage: 81.1% of statements
ok github.com/cloudflare/cfssl/api/bundle 1.548s coverage: 87.2% of statements
ok github.com/cloudflare/cfssl/api/certadd 13.681s coverage: 86.8% of statements
ok github.com/cloudflare/cfssl/api/client 1.314s coverage: 55.2% of statements
ok github.com/cloudflare/cfssl/api/crl 1.124s coverage: 75.0% of statements
ok github.com/cloudflare/cfssl/api/gencrl 1.067s coverage: 72.5% of statements
ok github.com/cloudflare/cfssl/api/generator 2.809s coverage: 33.3% of statements
ok github.com/cloudflare/cfssl/api/info 1.112s coverage: 84.1% of statements
ok github.com/cloudflare/cfssl/api/initca 1.059s coverage: 90.5% of statements
ok github.com/cloudflare/cfssl/api/ocsp 1.178s coverage: 93.8% of statements
ok github.com/cloudflare/cfssl/api/revoke 2.282s coverage: 75.0% of statements
ok github.com/cloudflare/cfssl/api/scan 2.729s coverage: 62.1% of statements
ok github.com/cloudflare/cfssl/api/sign 2.483s coverage: 83.3% of statements
ok github.com/cloudflare/cfssl/api/signhandler 1.137s coverage: 26.3% of statements
ok github.com/cloudflare/cfssl/auth 1.030s coverage: 68.2% of statements
ok github.com/cloudflare/cfssl/bundler 15.014s coverage: 85.1% of statements
ok github.com/cloudflare/cfssl/certdb/dbconf 1.042s coverage: 78.9% of statements
ok github.com/cloudflare/cfssl/certdb/ocspstapling 1.919s coverage: 69.2% of statements
ok github.com/cloudflare/cfssl/certdb/sql 1.265s coverage: 65.7% of statements
ok github.com/cloudflare/cfssl/cli 1.050s coverage: 61.9% of statements
ok github.com/cloudflare/cfssl/cli/bundle 1.023s coverage: 0.0% of statements
ok github.com/cloudflare/cfssl/cli/crl 1.669s coverage: 57.8% of statements
ok github.com/cloudflare/cfssl/cli/gencert 9.278s coverage: 83.6% of statements
ok github.com/cloudflare/cfssl/cli/gencrl 1.310s coverage: 73.3% of statements
ok github.com/cloudflare/cfssl/cli/genkey 3.028s coverage: 70.0% of statements
ok github.com/cloudflare/cfssl/cli/ocsprefresh 1.106s coverage: 64.3% of statements
ok github.com/cloudflare/cfssl/cli/revoke 1.081s coverage: 88.2% of statements
ok github.com/cloudflare/cfssl/cli/scan 1.217s coverage: 36.0% of statements
ok github.com/cloudflare/cfssl/cli/selfsign 2.201s coverage: 73.2% of statements
ok github.com/cloudflare/cfssl/cli/serve 1.133s coverage: 39.0% of statements
ok github.com/cloudflare/cfssl/cli/sign 1.210s coverage: 54.8% of statements
ok github.com/cloudflare/cfssl/cli/version 2.475s coverage: 100.0% of statements
ok github.com/cloudflare/cfssl/cmd/cfssl 1.082s coverage: 0.0% of statements
ok github.com/cloudflare/cfssl/cmd/cfssljson 1.016s coverage: 4.0% of statements
ok github.com/cloudflare/cfssl/cmd/mkbundle 1.024s coverage: 0.0% of statements
ok github.com/cloudflare/cfssl/config 2.754s coverage: 67.7% of statements
ok github.com/cloudflare/cfssl/crl 1.063s coverage: 68.3% of statements
ok github.com/cloudflare/cfssl/csr 27.016s coverage: 89.6% of statements
ok github.com/cloudflare/cfssl/errors 1.081s coverage: 81.2% of statements
ok github.com/cloudflare/cfssl/helpers 1.217s coverage: 80.4% of statements
ok github.com/cloudflare/cfssl/helpers/testsuite 7.658s coverage: 65.8% of statements
ok github.com/cloudflare/cfssl/initca 205.809s coverage: 74.2% of statements
ok github.com/cloudflare/cfssl/log 1.016s coverage: 59.3% of statements
ok github.com/cloudflare/cfssl/multiroot/config 1.107s coverage: 77.4% of statements
ok github.com/cloudflare/cfssl/ocsp 1.524s coverage: 77.7% of statements
ok github.com/cloudflare/cfssl/revoke 1.775s coverage: 79.6% of statements
ok github.com/cloudflare/cfssl/scan 1.022s coverage: 1.1% of statements
ok github.com/cloudflare/cfssl/selfsign 1.119s coverage: 70.0% of statements
ok github.com/cloudflare/cfssl/signer 1.019s coverage: 20.0% of statements
ok github.com/cloudflare/cfssl/signer/local 3.146s coverage: 81.2% of statements
ok github.com/cloudflare/cfssl/signer/remote 2.328s coverage: 71.8% of statements
ok github.com/cloudflare/cfssl/signer/universal 2.280s coverage: 67.7% of statements
ok github.com/cloudflare/cfssl/transport 1.028s
ok github.com/cloudflare/cfssl/transport/ca/localca 1.056s coverage: 94.9% of statements
ok github.com/cloudflare/cfssl/transport/core 1.538s coverage: 90.9% of statements
ok github.com/cloudflare/cfssl/transport/kp 1.054s coverage: 37.1% of statements
ok github.com/cloudflare/cfssl/ubiquity 1.042s coverage: 88.3% of statements
ok github.com/cloudflare/cfssl/whitelist 2.304s coverage: 100.0% of statements
```
Fixes#2746.
Updates the various gRPC/protobuf libs (google.golang.org/grpc/... and github.com/golang/protobuf/proto) and the boulder-tools image so that we can update to the newest github.com/grpc-ecosystem/go-grpc-prometheus. Also regenerates all of the protobuf definition files.
Tests run on updated packages all pass.
Unblocks #2633fixes#2636.
Pulls in logging improvements in OCSP Responder and the CT client, plus a handful of API changes. Also, the CT client verifies responses by default now.
This change includes some Boulder diffs to accommodate the API changes.
We have a number of stats already expressed using the statsd interface. During
the switchover period to direct Prometheus collection, we'd like to make those
stats available both ways. This change automatically exports any stats exported
using the statsd interface via Prometheus as well.
This is a little tricky because Prometheus expects all stats to by registered
exactly once. Prometheus does offer a mechanism to gracefully recover from
registering a stat more than once by handling a certain error, but it is not
safe for concurrent access. So I added a concurrency-safe wrapper that creates
Prometheus stats on demand and memoizes them.
In the process, made a few small required side changes:
- Clean "/" from method names in the gRPC interceptors. They are allowed in
statsd but not in Prometheus.
- Replace "127.0.0.1" with "boulder" as the name of our testing CT log.
Prometheus stats can't start with a number.
- Remove ":" from the CT-log stat names emitted by Publisher. Prometheus stats
can't include it.
- Remove a stray "RA" in front of some rate limit stats, since it was
duplicative (we were emitting "RA.RA..." before).
Note that this means two stat groups in particular are duplicated:
- Gostats* is duplicated with the default process-level stats exported by the
Prometheus library.
- gRPCClient* are duplicated by the stats generated by the go-grpc-prometheus
package.
When writing dashboards and alerts in the Prometheus world, we should be careful
to avoid these two categories, as they will disappear eventually. As a general
rule, if a stat is available with an all-lowercase name, choose that one, as it
is probably the Prometheus-native version.
In the long run we will want to create most stats using the native Prometheus
stat interface, since it allows us to use add labels to metrics, which is very
useful. For instance, currently our DNS stats distinguish types of queries by
appending the type to the stat name. This would be more natural as a label in
Prometheus.
This PR introduces the ability for the ocsp-updater to only resubmit certificates to logs that we are missing SCTs from. Prior to this commit when a certificate was missing one or more SCTs we would submit it to every log, causing unnecessary overhead for us and the log operator.
To accomplish this a new RPC endpoint is added to the Publisher service "SubmitToSingleCT". Unlike the existing "SubmitToCT" this RPC endpoint accepts a log URI and public key in addition to the certificate DER bytes. The certificate is submitted directly to that log, and a cache of constructed resources is maintained so that subsequent submissions to the same log can reuse the stat name, verifier, and submission client.
Resolves#1679
Fixes#1576.
Adds a new package mock_metrics, with code generated by gomock, in order to test the change.
Modifies publisher.New to take a metrics.Scope and an SA, and unexport SA.
Moves core of submission loop into a separate function, singleLogSubmit, which can return an error rather than using the continue keyword. This reduces repetition of AuditErr lines, and makes it easier to put error statting in one place.