Commit Graph

231 Commits

Author SHA1 Message Date
Jacob Hoffman-Andrews 3d9b3d4d20 Restore expvar handler. (#3209)
In #3167 I removed expvar, thinking it was unused, but it turns out the RA
exports the last issuance time, and core/util.go has a function to export
BuildID, both of which are used in monitoring. This wasn't caught at compile
time because the global expvar package was happy to register the exports even
though there was no handler to serve them.
2017-11-02 07:05:54 -07:00
Jacob Hoffman-Andrews 6cd777bd8d Fix up stats after #3167 (#3185)
There were two bugs in #3167:

All process-level stats got prefixed with "boulder", which broke dashboards.
All request_time stats got dropped, because measured_http was using the prometheus DefaultRegisterer.
To fix, this PR plumbs through a scope object to measured_http, and uses an empty prefix when calling NewProcessCollector().
2017-10-18 11:14:59 -07:00
Jacob Hoffman-Andrews f366e45756 Remove global state from metrics gathering (#3167)
Previously, we used prometheus.DefaultRegisterer to register our stats, which uses global state to export its HTTP stats. We also used net/http/pprof's behavior of registering to the default global HTTP ServeMux, via DebugServer, which starts an HTTP server that uses that global ServeMux.

In this change, I merge DebugServer's functions into StatsAndLogging. StatsAndLogging now takes an address parameter and fires off an HTTP server in a goroutine. That HTTP server is newly defined, and doesn't use DefaultServeMux. On it is registered the Prometheus stats handler, and handlers for the various pprof traces. In the process I split StatsAndLogging internally into two functions: makeStats and MakeLogger. I didn't port across the expvar variable exporting, which serves a similar function to Prometheus stats but which we never use.

One nice immediate effect of this change: Since StatsAndLogging now requires and address, I noticed a bunch of commands that called StatsAndLogging, and passed around the resulting Scope, but never made use of it because they didn't run a DebugServer. Under the old StatsD world, these command still could have exported their stats by pushing, but since we moved to Prometheus their stats stopped being collected. We haven't used any of these stats, so instead of adding debug ports to all short-lived commands, or setting up a push gateway, I simply removed them and switched those commands to initialize only a Logger, no stats.
2017-10-13 11:58:01 -07:00
Jacob Hoffman-Andrews 0a72f768a7 Remove ProfileCmd. (#3166)
These stats are now all collected by Prometheus.
2017-10-13 10:02:04 -04:00
Jacob Hoffman-Andrews 4128e0d95a Add time-dependent integration testing (#3060)
Fixes #3020.

In order to write integration tests for some features, especially related to rate limiting, rechecking of CAA, and expiration of authzs, orders, and certs, we need to be able to fake the passage of time in integration tests.

To do so, this change switches out all clock.Default() instances for cmd.Clock(), which can be set manually with the FAKECLOCK environment variable. integration-test.py now starts up all servers once before the main body of tests, with FAKECLOCK set to a date 70 days ago, and does some initial setup for a new integration test case. That test case tries to fetch a 70-day-old authz URL, and expects it to 404.

In order to make this work, I also had to change a number of our test binaries to shut down cleanly in response to SIGTERM. Without that change, stopping the servers between the setup phase and the main tests caused startservers.check() to fail, because some processes exited with nonzero status.

Note: This is an initial stab at things, to prove out the technique. Long-term, I think we will want to use an idiom where test cases are classes that have a number of optional setup phases that may be run at e.g. 70 days prior and 5 days prior. This could help us avoid a proliferation of global state as we add more time-dependent test cases.
2017-09-13 12:34:14 -07:00
Jacob Hoffman-Andrews 20ec1e3e4e Filter spurious shutdown errors. (#3052)
Previously, we would produce an error an a nonzero status code on shutdown,
because gRPC's GracefulStop would cause s.Serve() to return an error. Now we
filter that specific error and treat it as success. This also allows us to kill
process with SIGTERM instead of SIGKILL in integration tests.

Fixes #2410.
2017-09-07 13:45:32 -07:00
Jacob Hoffman-Andrews 63a25bf913 Remove clientName everywhere. (#2862)
This used to be used for AMQP queue names. Now that AMQP is gone, these consts
were only used when printing a version string at startup. This changes
VersionString to just use the name of the current program, and removes
`const clientName = ` from many of our main.go's.
2017-07-12 10:28:54 -07:00
Jacob Hoffman-Andrews b17b5c72a6 Remove statsd from Boulder (#2752)
This removes the config and code to output to statsd.

- Change `cmd.StatsAndLogging` to output a `Scope`, not a `Statter`.
- Remove the prefixing of component name (e.g. "VA") in front of stats; this was stripped by `autoProm` but now no longer needs to be.
- Delete vendored statsd client.
- Delete `MockStatter` (generated by gomock) and `mocks.Statter` (hand generated) in favor of mocking `metrics.Scope`, which is the interface we now use everywhere.
- Remove a few unused methods on `metrics.Scope`, and update its generated mock.
- Refactor `autoProm` and add `autoRegisterer`, which can be included in a `metrics.Scope`, avoiding global state. `autoProm` now registers everything with the `prometheus.Registerer` it is given.
- Change va_test.go's `setup()` to not return a stats object; instead the individual tests that care about stats override `va.stats` directly.

Fixes #2639, #2733.
2017-05-15 10:19:54 -04:00
Jacob Hoffman-Andrews d9b53cd103 Set gRPC logs to go through syslog. (#2403)
StatsAndLogging is called early enough in each program that it precedes any gRPC
setup code that might need SetLogger already to have been set.

Fixes #2383
2016-12-08 15:25:31 -08:00
Daniel c96c8a648f
Removes `cmd.Version()`.
The `Version()` function is a less useful alternative to
`VersionString()` and isn't necessary. It used a fixed "0.1.0" prefix
that doesn't match a release tag, it also doesn't print Go & host
information that `VersionString()` does.

Less code = less bugs!
2016-11-26 16:59:45 -05:00
Roland Bracewell Shoemaker 595204b23f Implement improved signal catching in services that already use it (#2333)
Implements a less RPC focused signal catch/shutdown method. Certain things that probably could also use this (i.e. `ocsp-updater`) haven't been given it as they would require rather substantial changes to allow for a graceful shutdown approach.

Fixes #2298.
2016-11-18 21:05:04 -05:00
Jacob Hoffman-Andrews 9b8b877e42 Add prometheus client. (#2293)
This vendors the Prometheus client code, and exports metrics on the debug port, under `/metrics`.

This will currently export just the default metrics, like `go_goroutines`, `process_cpu_seconds_total`, `process_open_fds`, and `process_resident_memory_bytes`. Later work will start exporting Boulder-specific metrics, but this will allow Ops to start configuring scraping of Prometheus metrics in production.

Tests pass:

```
$ git diff master Godeps/ | sed -ne 's/^+.*ImportPath": "//p' | tr -d '",' | xargs go test
ok      github.com/beorn7/perks/quantile        0.562s
ok      github.com/matttproud/golang_protobuf_extensions/pbutil 0.003s
ok      github.com/prometheus/client_golang/prometheus  34.418s
ok      github.com/prometheus/client_golang/prometheus/promhttp 0.003s
?       github.com/prometheus/client_model/go   [no test files]
ok      github.com/prometheus/common/expfmt     0.019s
ok      github.com/prometheus/common/internal/bitbucket.org/ww/goautoneg        0.002s
ok      github.com/prometheus/common/model      0.003s
ok      github.com/prometheus/procfs    0.008s
```

Part of #2284
2016-10-28 16:13:41 -07:00
Roland Bracewell Shoemaker 239bf9ae0a Very basic feature flag impl (#1705)
Updates #1699.

Adds a new package, `features`, which exposes methods to set and check if various internal features are enabled. The implementation uses global state to store the features so that services embedded in another service do not each require their own features map in order to check if something is enabled.

Requires a `boulder-tools` image update to include `golang.org/x/tools/cmd/stringer`.
2016-09-20 16:29:01 -07:00
Roland Bracewell Shoemaker c8f1fb3e2f Remove direct usages of go-statsd-client in favor of using metrics.Scope (#2136)
Fixes #2118, fixes #2082.
2016-09-07 19:35:13 -04:00
Blake Griffith 344a312905 Remove audit comments -- closes #2129 (#2139)
Closes #2129

* Remove audit comments.
* Nuke doc/requirements/*
2016-08-25 18:23:42 -07:00
Ben Irving 1a4f099899 Split up boulder-config.json (Expiration Mailer) (#2036)
Part of #1962.
2016-07-12 15:55:52 -07:00
Jacob Hoffman-Andrews c8723c4baa Add back version flag for all binaries. (#2034)
Fixes #2020
2016-07-11 09:28:01 -07:00
Ben Irving 0e2ef748b4 Split up boulder-config.json (OCSP Responder) (#2017) 2016-07-07 14:52:08 -04:00
Ben Irving 653cc004d0 Split Boulder Config (OCSP Updater) (#2013) 2016-07-06 10:00:52 -04:00
Ben Irving cb45bdea67 Split up boulder-config.json (Publisher) (#2008) 2016-07-05 13:31:30 -07:00
Ben Irving bea8e57536 Split up boulder-config.json (VA) (#1979) 2016-07-01 13:06:50 -04:00
Ben Irving 21e0b3bdc7 Split up boulder-config.json (CA) (#1978) 2016-07-01 10:24:19 -04:00
Ben Irving 6162533c00 Split up boulder-config.json (SA) (#1975)
Depends on #1973

https://github.com/letsencrypt/boulder/pull/1975
2016-06-29 15:01:49 -07:00
Ben Irving c4f7fb580d Split up boulder-config.json (RA) (#1974)
Part of #1962
2016-06-29 13:43:55 -07:00
Ben Irving 6007df8f3c Split up boulder-config.json (WFE) (#1973)
Moves the wfe to it's own config file.

Each config will now belong in `test/config` and `test/config-next` analogous to `boulder-config` and `boulder-config-next`.
2016-06-28 10:40:16 -07:00
Jacob Hoffman-Andrews 4283fb5dd4 Improve syslog defaults. (#1932)
Under the new defaults, if the syslog section is missing, we'll use the default
config that we use in prod: no logs to stdout, INFO and below to syslog.

This allows us to remove the syslog section from prod configs, and potentially
move it to individual service configs in the future.

* Improve syslog defaults.
* Add stdout logging for purger test.
* Use plain int for sysloglevel.
* Fix JSON syntax
* Fix syslog config for expired-authz-purger.

https://github.com/letsencrypt/boulder/pull/1932
2016-06-17 11:26:11 -07:00
Ben Irving f04b922aff Remove support for TCP-based logging (#1931) 2016-06-14 14:30:21 -07:00
Ben Irving 1336c42813 Replace all log.Err calls with log.AuditErr (#1891)
* remove calls to log.Err()
* go fmt
* remove more occurrences
* change AuditErr argument to string and replace occurrences
2016-06-06 16:27:16 -04:00
Roland Bracewell Shoemaker 54573b36ba Remove all stray copyright headers and appends the initial line to LICENSE.txt (#1853) 2016-05-31 12:32:04 -07:00
Ben Irving d88cce5c72 Add config option to lower syslog level 2016-05-26 09:32:32 -07:00
Jacob Hoffman-Andrews e6c17e1717 Switch to new vendor style (#1747)
* Switch to new vendor style.

* Fix metrics generate command.

* Fix miekg/dns types_generate.

* Use generated copies of files.

* Update miekg to latest.

Fixes a problem with `go generate`.

* Set GO15VENDOREXPERIMENT.

* Build in letsencrypt/boulder.

* fix travis more.

* Exclude vendor instead of godeps.

* Replace some ...

* Fix unformatted cmd

* Fix errcheck for vendorexp

* Add GO15VENDOREXPERIMENT to Makefile.

* Temp disable errcheck.

* Restore master fetch.

* Restore errcheck.

* Build with 1.6 also.

* Match statsd.*"

* Skip errcheck unles Go1.6.

* Add other ignorepkg.

* Fix errcheck.

* move errcheck

* Remove go1.6 requirement.

* Put godep-restore with errcheck.

* Remove go1.6 dep.

* Revert master fetch revert.

* Remove -r flag from godep save.

* Set GO15VENDOREXPERIMENT in Dockerfile and remove _worskpace.

* Fix Godep version.
2016-04-18 12:51:36 -07:00
Kane York 25b45a45ec Errcheck errors fixed (#1677)
* Fix all errcheck errors
* Add errcheck to test.sh
* Add a new sa.Rollback method to make handling errors in rollbacks easier.
This also causes a behavior change in the VA. If a HTTP connection is
abruptly closed after serving the headers for a non-200 response, the
reported error will be the read failure instead of the non-200.
2016-04-12 16:54:01 -07:00
Jacob Hoffman-Andrews ecc04e8e61 Refactor log package (#1717)
- Remove error signatures from log methods. This means fewer places where errcheck will show ignored errors.
- Pull in latest cfssl to be compatible with errorless log messages.
- Reduce the number of message priorities we support to just those we actually use.
- AuditNotice -> AuditInfo
- Remove InfoObject (only one use, switched to Info)
- Remove EmergencyExit and related functions in favor of panic
- Remove SyslogWriter / AuditLogger separate types in favor of a single interface, Logger, that has all the logging methods on it.
- Merge mock log into logger. This allows us to unexport the internals but still override them in the mock.
- Shorten names to be compatible with Go style: New, Set, Get, Logger, NewMock, etc.
- Use a shorter log format for stdout logs.
- Remove "... Starting" log messages. We have better information in the "Versions" message logged at startup.

Motivation: The AuditLogger / SyslogWriter distinction was confusing and exposed internals only necessary for tests. Some components accepted one type and some accepted the other. This made it hard to consistently use mock loggers in tests. Also, the unnecessarily fat interface for AuditLogger made it hard to meaningfully mock out.
2016-04-08 16:12:20 -07:00
Roland Bracewell Shoemaker 32e9e44906 Remove activity-monitor from the tree
* Axe boulder-am
* Also remove the analysis subpackage and references to it, and remove routingKey from rpc/connection.go

https://github.com/letsencrypt/boulder/pull/1682
2016-04-04 12:19:17 -07:00
Roland Bracewell Shoemaker 800b5b0cbf Switch to using a wrapped statter that provides PID
* Switch to using a wrapped statter that provides PID

* Fix tests and change some types to interfaces

* Add hostname to suffix + update comment
2016-04-01 15:43:35 -07:00
Kane York c8614e21c5 Add call to cfssl SetLogger, remove TODO
Fixes #1528
2016-02-29 15:39:57 -08:00
Kane York df6bb5126a Set the mysql logger in shell.go/StatsAndLogging()
Fixes #1507
2016-02-22 16:15:08 -08:00
Alex Gaynor cbeffe96a6 Fixed a bunch of typos 2016-01-04 18:39:34 -05:00
Jeff Hodges 5da1e32f3b Merge branch 'master' into fix-amqp-defaults 2015-11-24 15:02:42 -08:00
Jacob Hoffman-Andrews 169a0b79e3 Export BuildID via expvars. 2015-11-24 14:24:51 -08:00
Jacob Hoffman-Andrews 86596f9355 Fix AMQP defaults 2015-11-23 17:28:35 -08:00
Jacob Hoffman-Andrews 7dcfcd7864 Add configurable RPC timeouts per backend.
In the process, break out AMQP config into its own struct, one per service.
The AMQPConfig struct is included by composition in the config structs that need
it. If any given service lacks an AMQP config of its own, it gets a default
value from the top-level AMQP config struct, for deployability reasons.

Tightens the RPC code to take a specific AMQP config, not an over-broad
cmd.Config.

Shortens construction of specific RPC clients so they instatiate the generic
client connection themselves, simplifying per-service startup code.

Remove unused SetTimeout method on RPC clients.
2015-11-17 19:51:51 -08:00
Jacob Hoffman-Andrews 59b92132e7 Add a TODO. 2015-11-16 09:43:45 -08:00
Jacob Hoffman-Andrews ed1fef72eb Merge branch 'master' of github.com:letsencrypt/boulder into quiet-more-logs 2015-11-16 09:36:54 -08:00
Jacob Hoffman-Andrews 284f472b1e Inihibit more logs from going to stdout.
Follow up on https://github.com/letsencrypt/boulder/issues/852,
there were a couple of spots I missed.
2015-11-15 11:34:31 -08:00
Jacob Hoffman-Andrews 1b0838cf99 Move config structs into config.go.
Part of https://github.com/letsencrypt/boulder/issues/1052. I'll be adding some
new config structs, and want everything in a consistent place.
2015-11-13 14:57:07 -08:00
Jacob Hoffman-Andrews 47bae156e5 Move config structs into config.go. 2015-11-12 12:01:14 -08:00
Jacob Hoffman-Andrews 2fc0f3143e Improve logging.
Consolidate initialization of stats and logging from each main.go into cmd
package.

Define a new config parameter, `StdoutLevel`, that determines the maximum log
level that will be printed to stdout. It can be set to 6 to inhibit debug
messages, or 0 to print only emergency messages, or -1 to print no messages at
all.

Remove the existing config parameter `Tag`. Instead, choose the tag from the
basename of the currently running process. Previously all Boulder log messages
had the tag "boulder", but now they will be differentiated by process, like
"boulder-wfe".

Shorten the date format used in stdout logging, and add the current binary's
basename.

Consolidate setup function in audit-logger_test.go.

Note: Most CLI binaries now get their stats and logging from the parameters of
Action. However, a few of our binaries don't use our custom AppShell, and
instead use codegangsta/cli directly. For those binaries, we export the new
StatsAndLogging method from cmd.

Fixes https://github.com/letsencrypt/boulder/issues/852
2015-11-11 16:52:42 -08:00
Richard Barnes 174011f6d8 Move validation and defaults out of UnmarshalJSON 2015-11-09 15:30:13 -05:00
Richard Barnes f61183e144 Use a map and set defaults 2015-11-07 12:39:57 -05:00