- Replace `gorp.DbMap` with calls that use `sql.DB` directly
- Use `rows.Scan()` and `rows.Next()` to get query results (which opens the door to streaming the results)
- Export function `CertStatusMetadataFields` from `SA`
- Add new function `ScanCertStatusRow` to `SA` (see the sketch below)
- Add new function `NewDbSettingsFromDBConfig` to `SA`
Fixes #5642
Part of #5715
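To illustrate the `rows.Next()`/`rows.Scan()` pattern described in the list above, here is a minimal sketch; the struct, the column list, and the helper shape are illustrative assumptions, not the exact code added to the SA.

```go
package sketch

import (
	"context"
	"database/sql"
	"time"
)

// certStatusRow holds a subset of certificateStatus columns (illustrative).
type certStatusRow struct {
	Serial          string
	Status          string
	OCSPLastUpdated time.Time
	NotAfter        time.Time
	IsExpired       bool
	IssuerID        int64
}

// scanCertStatusRow is a stand-in for a helper like ScanCertStatusRow: it keeps
// the column order in exactly one place.
func scanCertStatusRow(rows *sql.Rows, row *certStatusRow) error {
	return rows.Scan(&row.Serial, &row.Status, &row.OCSPLastUpdated,
		&row.NotAfter, &row.IsExpired, &row.IssuerID)
}

// selectCertStatuses queries with database/sql directly instead of gorp.DbMap,
// scanning each row as it streams back from the database.
func selectCertStatuses(ctx context.Context, db *sql.DB, query string, args ...interface{}) ([]certStatusRow, error) {
	rows, err := db.QueryContext(ctx, query, args...)
	if err != nil {
		return nil, err
	}
	defer rows.Close()

	var out []certStatusRow
	for rows.Next() {
		var row certStatusRow
		if err := scanCertStatusRow(rows, &row); err != nil {
			return nil, err
		}
		out = append(out, row)
	}
	return out, rows.Err()
}
```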
Give ocsp-responder a new map of IssuerNameIDs to keyHashes,
so that it can confirm that OCSP requests have an appropriate
key hash whether the database is storing old-style IssuerIDs or
new-style IssuerNameIDs.
Part of #5152
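A rough sketch of the kind of lookup this map enables; the type names and error messages here are assumptions for illustration, not the real ocsp-responder code.

```go
package sketch

import (
	"bytes"
	"fmt"
)

// issuerNameID stands in for the real IssuerNameID type; how it is derived
// from a certificate is out of scope for this sketch.
type issuerNameID int64

// keyHashByNameID maps each loaded issuer's NameID to the SHA1 hash of its
// public key, as that hash appears in OCSP requests.
type keyHashByNameID map[issuerNameID][]byte

// checkRequestKeyHash confirms that the key hash in an OCSP request matches
// the issuer that (per the database) produced the stored response.
func (m keyHashByNameID) checkRequestKeyHash(id issuerNameID, reqKeyHash []byte) error {
	expected, ok := m[id]
	if !ok {
		return fmt.Errorf("no issuer loaded for NameID %d", id)
	}
	if !bytes.Equal(expected, reqKeyHash) {
		return fmt.Errorf("request key hash does not match issuer %d", id)
	}
	return nil
}
```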
Change `CertificateStatus.IssuerID` from `*int64` to just an
`int64`. It might make sense for this field to be nillable in a world
where we want to distinguish between it being missing and it
being zero, but none of our code actually does that: we error
out either way.
Part of #5152
This changeset adds a second DB connect string for the SA for use in
read-only queries that are not themselves dependencies for read-write
queries. In other words, this is attempting to only catch things like
rate-limit `SELECT`s and other coarse-counting, so we can potentially
move those read queries off the read-write primary database.
It also adds a second DB connect string to the OCSP Updater. This is a
little trickier, as the subsequent `UPDATE`s _are_ dependent on the
output of the `SELECT`, but in this case it's operating on data batches,
and a few seconds' replication latency are several orders of magnitude
below the threshold for update frequency, so any certificates that
aren't caught on run `n` can be caught on run `n+1`.
Since we export DB metrics to Prometheus, this also refactors
`InitDBMetrics` to take a DB Address (host:port tuple) and User out of
the DB connection DSN and include those as labels in the metrics.
Fixes #5550. Fixes #4985.
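A sketch of what pulling the address and user out of the DSN and attaching them as metric labels might look like, assuming the go-sql-driver/mysql DSN parser; the metric and helper names are illustrative.

```go
package sketch

import (
	"database/sql"

	"github.com/go-sql-driver/mysql"
	"github.com/prometheus/client_golang/prometheus"
)

// initDBMetrics registers one example gauge for a database handle, labeled
// with the DB address (host:port) and user parsed out of the DSN.
func initDBMetrics(db *sql.DB, stats prometheus.Registerer, dsn string) error {
	cfg, err := mysql.ParseDSN(dsn)
	if err != nil {
		return err
	}
	labels := prometheus.Labels{"dbAddr": cfg.Addr, "dbUser": cfg.User}

	openConns := prometheus.NewGaugeFunc(prometheus.GaugeOpts{
		Name:        "db_open_connections",
		Help:        "Number of established connections, both in use and idle.",
		ConstLabels: labels,
	}, func() float64 {
		return float64(db.Stats().OpenConnections)
	})
	return stats.Register(openConns)
}
```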
Add Honeycomb tracing to all Boulder components which act as
HTTP servers, gRPC servers, or gRPC clients. Add many values
which we currently emit to logs to the trace spans. Add a way to
configure the Honeycomb integration in our config files, and by
default configure all of our tests to "mute" (send nothing).
Followup changes will refine the configuration, attempt to reduce
the new dependency load, and introduce better sampling.
Part of https://github.com/letsencrypt/dev-misc-tickets/issues/218
A new field named `DB`, in each component configuration struct, acts as the
receiver for the value of `db` when component JSON files are
unmarshalled.
When `cmd.DBConfig` fields are received at the root of a component
configuration struct instead of in `DB`, copy them to the `DB` field of the
component configuration struct.
Move existing `cmd.DBConfig` values from the root of each component's
JSON configuration in `test/config-next` to `db`.
Part of #5275
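A minimal sketch of the config-struct shape and the fallback copy described above; the type and field names are placeholders, not Boulder's actual `cmd.DBConfig`.

```go
package sketch

// DBConfig stands in for cmd.DBConfig; the fields shown are illustrative.
type DBConfig struct {
	DBConnectFile string `json:"dbConnectFile"`
	MaxOpenConns  int    `json:"maxOpenConns"`
}

// ComponentConfig sketches a component config struct: the new DB field
// receives the "db" block, while the embedded DBConfig still catches configs
// that place those fields at the root.
type ComponentConfig struct {
	DB DBConfig `json:"db"`
	DBConfig
}

// normalizeDBConfig copies root-level settings into DB when no "db" block was
// provided, so the rest of the component only ever reads cfg.DB.
func normalizeDBConfig(cfg *ComponentConfig) {
	if cfg.DB.DBConnectFile == "" {
		cfg.DB = cfg.DBConfig
	}
}
```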
In #5235 we replaced MaxDBConns with MaxOpenConns.
One week ago MaxDBConns was removed from all dev, staging, and
production configurations. This change completes the removal of
MaxDBConns from all components and test/config.
Fixes #5249
Historically the only database/sql driver setting exposed via JSON
config was maxDBConns. This change adds support for maxIdleConns,
connMaxLifetime, connMaxIdleTime, and renames maxDBConns to
maxOpenConns. The addition of these settings will give our SRE team a
convenient method for tuning the reuse/closure of database connections.
A new struct, DBSettings, has been added to the SA. The struct, and each of
its fields, has been commented.
All new fields have been plumbed through to the relevant Boulder
components and exported as Prometheus metrics. Tests have been
added/modified to ensure that the fields are being set. There should be
no loss in coverage.
Deployability concerns for the migration from maxDBConns to maxOpenConns
have been addressed with the temporary addition of the helper method
cmd.DBConfig.GetMaxOpenConns(). This method can be removed once
test/config is defaulted to using maxOpenConns. Relevant sections of the
code have TODOs added that link back to a newly opened issue.
Fixes #5199
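A minimal sketch of how such settings plumb through to the `database/sql` pool; the struct and method names are illustrative, and `SetConnMaxIdleTime` requires Go 1.15+.

```go
package sketch

import (
	"database/sql"
	"time"
)

// dbSettings mirrors the kind of struct described above (names illustrative).
type dbSettings struct {
	MaxOpenConns    int
	MaxIdleConns    int
	ConnMaxLifetime time.Duration
	ConnMaxIdleTime time.Duration
}

// apply plumbs each setting through to the database/sql connection pool.
func (s dbSettings) apply(db *sql.DB) {
	db.SetMaxOpenConns(s.MaxOpenConns)
	db.SetMaxIdleConns(s.MaxIdleConns)
	db.SetConnMaxLifetime(s.ConnMaxLifetime)
	db.SetConnMaxIdleTime(s.ConnMaxIdleTime) // Go 1.15+
}
```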
The vast majority of Boulder components no longer care about
anything in the common config block. As such, we hope to
remove it entirely in the near future. So let's put the (not-yet-used)
IssuerCerts config item in the main OCSPResponder block,
rather than in the common block.
Part of #5204
It is possible for a CertificateStatus row to have a nil IssuerID
(there was a period of time in which we didn't write IssuerIDs into
CertificateStatus rows at all) but all such rows should be old and
therefore expired.
Unfortunately, this code was checking the IssuerID before it was
checking the Expiry, and therefore was generating panics when trying
to dereference a nil pointer.
This change simply moves the IssuerID check to be after the Expires
check, so that we'll only try to dereference the IssuerID on recent
CertificateStatus rows, where it is guaranteed to be non-nil.
Fixes #5200
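A sketch of the corrected ordering, with placeholder names:

```go
package sketch

import "time"

// certStatusMeta stands in for the row being inspected; IssuerID is a pointer
// because very old rows may not have one.
type certStatusMeta struct {
	NotAfter time.Time
	IssuerID *int64
}

// issuerForStatus shows the corrected ordering: reject expired certificates
// before ever dereferencing IssuerID, since only ancient (and therefore
// expired) rows can have a nil IssuerID.
func issuerForStatus(status certStatusMeta, now time.Time) (int64, bool) {
	if status.NotAfter.Before(now) {
		return 0, false // expired: never touch IssuerID
	}
	return *status.IssuerID, true
}
```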
When making an OCSP request, the client provides three pieces of
information: the URL which it is querying to get OCSP info, the
hash of the public key of the issuer which issued the cert in question, and
the serial number of the cert in question. In Boulder, the first
of these is only provided implicitly, based on which instance of
ocsp-responder is handling the request: we ensure (via configs)
that the ocsp-responder handling a given OCSP AIA URL has the
corresponding issuer cert loaded in memory.
When handling a request, the ocsp-responder checks three things:
that the request is using SHA1 to hash the issuer public key, that
the requested issuer public key matches one of the loaded issuer
certs, and that the requested serial number is one we could have
issued (i.e. has the correct prefix). It relies on the database
query to filter out requests for non-existent serials.
However, this means that a request to an ocsp-responder instance
with issuer cert A loaded could receive and handle a request which
specifies cert A as the issuer, but names a serial which was actually
issued by issuer cert B. The checks all pass and the database lookup
succeeds. But the returned OCSP response is for a certificate that
was issued by a different issuer, and the response itself was
signed by that other issuer.
In order to resolve this potentially confusing situation, this change
adds one additional check to the ocsp-responder: after it has
retrieved the ocsp response, it looks up which issuer produced that
ocsp response, confirms that it has that issuer cert loaded in
memory, and confirms that its issuer key hash matches that in the
original request.
There is still one wrinkle if issuer certs A and B are both loaded
in one ocsp-responder, and that one ocsp-responder is handling OCSP
requests to both of their AIA OCSP URLs. In this case, it is possible
that a request to a.ocsp.com, but requesting OCSP for a cert issued
by B, could still have its request answered. This is because the
ocsp-responder itself does not know which URL was requested. But
regardless, this change does guarantee that the response will match
the contents of the request (or no response will be given), no matter
what URL that request was sent to.
Fixes #5182
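A sketch of the added check, assuming the standard OCSP issuer key hash construction (SHA1 over the public key BIT STRING from the SubjectPublicKeyInfo); the function names are illustrative.

```go
package sketch

import (
	"bytes"
	"crypto/sha1"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/asn1"
	"fmt"

	"golang.org/x/crypto/ocsp"
)

// issuerKeyHash computes the SHA1 hash of an issuer certificate's public key
// (the BIT STRING inside the SubjectPublicKeyInfo), which is what OCSP
// requests carry as IssuerKeyHash.
func issuerKeyHash(issuer *x509.Certificate) ([]byte, error) {
	var spki struct {
		Algorithm pkix.AlgorithmIdentifier
		PublicKey asn1.BitString
	}
	if _, err := asn1.Unmarshal(issuer.RawSubjectPublicKeyInfo, &spki); err != nil {
		return nil, err
	}
	h := sha1.Sum(spki.PublicKey.RightAlign())
	return h[:], nil
}

// checkResponseMatchesRequest sketches the new post-lookup check: the issuer
// that produced the stored response must be loaded, and its key hash must
// equal the key hash named in the request.
func checkResponseMatchesRequest(req *ocsp.Request, responseIssuer *x509.Certificate) error {
	if responseIssuer == nil {
		return fmt.Errorf("issuer of stored response is not loaded")
	}
	hash, err := issuerKeyHash(responseIssuer)
	if err != nil {
		return err
	}
	if !bytes.Equal(hash, req.IssuerKeyHash) {
		return fmt.Errorf("stored response was produced by a different issuer than the request names")
	}
	return nil
}
```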
The ocsp-responder takes a path to a certificate file as one of
its config values. It uses this path as one of the inputs when
constructing its DBSource, the object responsible for querying
the database for pregenerated OCSP responses to fulfill requests.
However, this certificate file is not necessary to query the
database; rather, it only acts as a filter: OCSP requests whose
IssuerKeyHash do not match the hash of the loaded certificate are
rejected outright, without querying the DB. In addition, there is
currently only support for a single certificate file in the config.
This change adds support for multiple issuer certificate files in
the config, and refactors the pre-database filtering of bad OCSP
requests into a helper object dedicated solely to that purpose.
Fixes #5119
Simplify database interactions
This change is a result of an audit of all places where
Go code directly constructs SQL queries and executes them
against a dbMap, with the goal of eliminating all instances
of constructing a well-known object type (such as a
core.CertificateStatus) from explicitly-listed database columns.
Instead, we should be relying on helper functions defined in the
sa itself to determine which columns are relevant for the
construction of any given object.
This audit did not find many places where this was occurring. It
did reveal a few simplifications, which are contained in this
change:
1) Greater use of existing SelectFoo methods provided by models.go
2) Streamlining of various SelectSingularFoo methods to always
select by serial string rather than by a user-provided WHERE clause
3) One spot (in ocsp-responder) where using a well-known type seemed
better than using a more minimal custom type
Addresses #4899
In a handful of places I've nuked old stats which are not used in any alerts or dashboards as they either duplicate other stats or don't provide much insight/have never actually been used. If we feel like we need them again in the future it's trivial to add them back.
There aren't many dashboards that rely on old statsd-style metrics, but a few will need to be updated when this change is deployed. There are also a few cases where prometheus labels have been changed from camel to snake case; dashboards that use these will also need to be updated. As far as I can tell no alerts are impacted by this change.
Fixes #4591.
New types and related infrastructure are added to the `db` package to allow
wrapping gorp DbMaps and Transactions.
The wrapped versions return a special `db.ErrDatabaseOp` error type when errors
occur. The new error type includes additional information such as the operation
that failed and the related table.
Where possible we determine the table based on the types of the gorp function
arguments. Where that isn't possible (e.g. with raw SQL queries) we try to use
a simple regexp approach to find the table name. This isn't great for general
SQL but works well enough for Boulder's existing SQL queries.
To get additional confidence that my regexps work for all of Boulder's queries,
I temporarily changed the `db` package's `tableFromQuery` function to panic if
the table couldn't be determined. I re-ran the full unit and integration test
suites with this configuration and saw no panics.
Resolves https://github.com/letsencrypt/boulder/issues/4559
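A simplified illustration of the wrapper error and the regexp approach; the actual type and pattern in the `db` package may differ.

```go
package sketch

import (
	"fmt"
	"regexp"
	"strings"
)

// errDatabaseOp mirrors the shape of a wrapper like db.ErrDatabaseOp: the
// failed operation, the table involved, and the underlying error.
type errDatabaseOp struct {
	Op    string
	Table string
	Err   error
}

func (e errDatabaseOp) Error() string {
	return fmt.Sprintf("failed to %s %q: %s", e.Op, e.Table, e.Err)
}

func (e errDatabaseOp) Unwrap() error { return e.Err }

// tableRegexp is deliberately simple: it only needs to cope with Boulder's
// own queries, not arbitrary SQL.
var tableRegexp = regexp.MustCompile(`(?i)(?:FROM|INTO|UPDATE)\s+` + "`?" + `(\w+)`)

// tableFromQuery guesses the table a raw SQL query touches; it returns ""
// when no table can be determined.
func tableFromQuery(query string) string {
	m := tableRegexp.FindStringSubmatch(strings.TrimSpace(query))
	if m == nil {
		return ""
	}
	return m[1]
}
```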
Integrates the cfssl/ocsp responder code directly into boulder. I've tried to
pare down the existing code to only the bits we actually use and have removed
some generic interfaces in places in favor of directly using our boulder
specific interfaces.
Fixes #4427.
The ocsp-updater ocspStaleMaxAge config var has to be bumped up to ~7 months so that, when it runs after the six-months-ago run, it will actually update the OCSP responses generated during that period and mark the certificate status rows as expired.
Fixes #4338.
Go 1.11+ updated the `sql.DBStats` struct with new fields that are of
interest to us. This PR routes these stats to Prometheus by replacing
the existing autoprom stats code with new first-class Prometheus
metrics. Resolves https://github.com/letsencrypt/boulder/issues/4095
The `max_db_connections` stat from the SA is removed because the Go 1.11+
`sql.DBStats.MaxOpenConnections` field will give us a better view of
the same information.
The autoprom "reused_authz" stat that was being incremented in
`SA.GetPendingAuthorization` was also removed. It wasn't doing what its name
suggests (counting reused authorizations); instead it was counting the number
of times `GetPendingAuthorization` returned an authz.
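A sketch of exporting a few of the Go 1.11+ `sql.DBStats` fields as first-class Prometheus metrics; the metric names are illustrative.

```go
package sketch

import (
	"database/sql"

	"github.com/prometheus/client_golang/prometheus"
)

// registerDBStats exports a handful of sql.DBStats fields as gauges.
func registerDBStats(db *sql.DB, stats prometheus.Registerer) error {
	collectors := []prometheus.Collector{
		prometheus.NewGaugeFunc(prometheus.GaugeOpts{
			Name: "db_max_open_connections",
			Help: "Maximum number of open connections to the database.",
		}, func() float64 { return float64(db.Stats().MaxOpenConnections) }),
		prometheus.NewGaugeFunc(prometheus.GaugeOpts{
			Name: "db_wait_count",
			Help: "Total number of connections waited for.",
		}, func() float64 { return float64(db.Stats().WaitCount) }),
		prometheus.NewGaugeFunc(prometheus.GaugeOpts{
			Name: "db_wait_duration_seconds",
			Help: "Total time blocked waiting for a new connection.",
		}, func() float64 { return db.Stats().WaitDuration.Seconds() }),
	}
	for _, c := range collectors {
		if err := stats.Register(c); err != nil {
			return err
		}
	}
	return nil
}
```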
Right now if ocsp-responder gets flooded with traffic, it will have a number of requests that
spend long enough waiting for an available connection that the reverse proxy will have given
up on them before they get a chance to execute the SQL query. Add a timeout parameter so
ocsp-responder can gracefully shed this load rather than try to do pointless work.
Fixes #3836.
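A sketch of the timeout behavior using a `context` deadline around the lookup; the query, column, and function names are assumptions, not the actual ocsp-responder code.

```go
package sketch

import (
	"context"
	"database/sql"
	"time"
)

// responseForSerial sheds load: if the query (including time spent waiting
// for a free connection) exceeds the configured timeout, we give up instead
// of doing work the reverse proxy has already abandoned.
func responseForSerial(db *sql.DB, serial string, timeout time.Duration) ([]byte, error) {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	var response []byte
	err := db.QueryRowContext(ctx,
		"SELECT ocspResponse FROM certificateStatus WHERE serial = ?", serial,
	).Scan(&response)
	if err != nil {
		return nil, err // includes context.DeadlineExceeded when we timed out
	}
	return response, nil
}
```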
```
$ ./test.sh
ok github.com/cloudflare/cfssl/api 1.023s coverage: 81.1% of statements
ok github.com/cloudflare/cfssl/api/bundle 1.464s coverage: 87.2% of statements
ok github.com/cloudflare/cfssl/api/certadd 16.766s coverage: 86.8% of statements
ok github.com/cloudflare/cfssl/api/client 1.062s coverage: 51.9% of statements
ok github.com/cloudflare/cfssl/api/crl 1.075s coverage: 75.0% of statements
ok github.com/cloudflare/cfssl/api/gencrl 1.038s coverage: 72.5% of statements
ok github.com/cloudflare/cfssl/api/generator 1.478s coverage: 33.3% of statements
ok github.com/cloudflare/cfssl/api/info 1.085s coverage: 84.1% of statements
ok github.com/cloudflare/cfssl/api/initca 1.050s coverage: 90.5% of statements
ok github.com/cloudflare/cfssl/api/ocsp 1.114s coverage: 93.8% of statements
ok github.com/cloudflare/cfssl/api/revoke 3.063s coverage: 75.0% of statements
ok github.com/cloudflare/cfssl/api/scan 2.988s coverage: 62.1% of statements
ok github.com/cloudflare/cfssl/api/sign 2.680s coverage: 83.3% of statements
ok github.com/cloudflare/cfssl/api/signhandler 1.114s coverage: 26.3% of statements
ok github.com/cloudflare/cfssl/auth 1.010s coverage: 68.2% of statements
ok github.com/cloudflare/cfssl/bundler 22.078s coverage: 84.5% of statements
ok github.com/cloudflare/cfssl/certdb/dbconf 1.013s coverage: 84.2% of statements
ok github.com/cloudflare/cfssl/certdb/ocspstapling 1.302s coverage: 69.2% of statements
ok github.com/cloudflare/cfssl/certdb/sql 1.223s coverage: 70.5% of statements
ok github.com/cloudflare/cfssl/cli 1.014s coverage: 62.5% of statements
ok github.com/cloudflare/cfssl/cli/bundle 1.011s coverage: 0.0% of statements [no tests to run]
ok github.com/cloudflare/cfssl/cli/crl 1.086s coverage: 57.8% of statements
ok github.com/cloudflare/cfssl/cli/gencert 7.927s coverage: 83.6% of statements
ok github.com/cloudflare/cfssl/cli/gencrl 1.064s coverage: 73.3% of statements
ok github.com/cloudflare/cfssl/cli/gencsr 1.058s coverage: 70.3% of statements
ok github.com/cloudflare/cfssl/cli/genkey 2.718s coverage: 70.0% of statements
ok github.com/cloudflare/cfssl/cli/ocsprefresh 1.077s coverage: 64.3% of statements
ok github.com/cloudflare/cfssl/cli/revoke 1.033s coverage: 88.2% of statements
ok github.com/cloudflare/cfssl/cli/scan 1.014s coverage: 36.0% of statements
ok github.com/cloudflare/cfssl/cli/selfsign 2.342s coverage: 73.2% of statements
ok github.com/cloudflare/cfssl/cli/serve 1.076s coverage: 38.2% of statements
ok github.com/cloudflare/cfssl/cli/sign 1.070s coverage: 54.8% of statements
ok github.com/cloudflare/cfssl/cli/version 1.011s coverage: 100.0% of statements
ok github.com/cloudflare/cfssl/cmd/cfssl 1.028s coverage: 0.0% of statements [no tests to run]
ok github.com/cloudflare/cfssl/cmd/cfssljson 1.012s coverage: 3.4% of statements
ok github.com/cloudflare/cfssl/cmd/mkbundle 1.011s coverage: 0.0% of statements [no tests to run]
ok github.com/cloudflare/cfssl/config 1.023s coverage: 67.7% of statements
ok github.com/cloudflare/cfssl/crl 1.054s coverage: 68.3% of statements
ok github.com/cloudflare/cfssl/csr 8.473s coverage: 89.6% of statements
ok github.com/cloudflare/cfssl/errors 1.014s coverage: 79.6% of statements
ok github.com/cloudflare/cfssl/helpers 1.216s coverage: 80.6% of statements
ok github.com/cloudflare/cfssl/helpers/derhelpers 1.017s coverage: 48.0% of statements
ok github.com/cloudflare/cfssl/helpers/testsuite 7.826s coverage: 65.8% of statements
ok github.com/cloudflare/cfssl/initca 151.314s coverage: 73.2% of statements
ok github.com/cloudflare/cfssl/log 1.013s coverage: 59.3% of statements
ok github.com/cloudflare/cfssl/multiroot/config 1.258s coverage: 77.4% of statements
ok github.com/cloudflare/cfssl/ocsp 1.353s coverage: 75.1% of statements
ok github.com/cloudflare/cfssl/revoke 1.149s coverage: 75.0% of statements
ok github.com/cloudflare/cfssl/scan 1.023s coverage: 1.1% of statements
skipped github.com/cloudflare/cfssl/scan/crypto/md5
skipped github.com/cloudflare/cfssl/scan/crypto/rsa
skipped github.com/cloudflare/cfssl/scan/crypto/sha1
skipped github.com/cloudflare/cfssl/scan/crypto/sha256
skipped github.com/cloudflare/cfssl/scan/crypto/sha512
skipped github.com/cloudflare/cfssl/scan/crypto/tls
ok github.com/cloudflare/cfssl/selfsign 1.098s coverage: 70.0% of statements
ok github.com/cloudflare/cfssl/signer 1.020s coverage: 19.4% of statements
ok github.com/cloudflare/cfssl/signer/local 4.886s coverage: 77.9% of statements
ok github.com/cloudflare/cfssl/signer/remote 2.500s coverage: 70.0% of statements
ok github.com/cloudflare/cfssl/signer/universal 2.228s coverage: 67.7% of statements
ok github.com/cloudflare/cfssl/transport 1.012s
ok github.com/cloudflare/cfssl/transport/ca/localca 1.046s coverage: 94.9% of statements
ok github.com/cloudflare/cfssl/transport/kp 1.050s coverage: 37.1% of statements
ok github.com/cloudflare/cfssl/ubiquity 1.037s coverage: 88.3% of statements
ok github.com/cloudflare/cfssl/whitelist 3.519s coverage: 100.0% of statements
...
$ go test ./... (master✱)
ok golang.org/x/crypto/acme 2.782s
ok golang.org/x/crypto/acme/autocert 2.963s
? golang.org/x/crypto/acme/autocert/internal/acmetest [no test files]
ok golang.org/x/crypto/argon2 0.047s
ok golang.org/x/crypto/bcrypt 4.694s
ok golang.org/x/crypto/blake2b 0.056s
ok golang.org/x/crypto/blake2s 0.050s
ok golang.org/x/crypto/blowfish 0.015s
ok golang.org/x/crypto/bn256 0.460s
ok golang.org/x/crypto/cast5 4.204s
ok golang.org/x/crypto/chacha20poly1305 0.560s
ok golang.org/x/crypto/cryptobyte 0.014s
? golang.org/x/crypto/cryptobyte/asn1 [no test files]
ok golang.org/x/crypto/curve25519 0.025s
ok golang.org/x/crypto/ed25519 0.073s
? golang.org/x/crypto/ed25519/internal/edwards25519 [no test files]
ok golang.org/x/crypto/hkdf 0.012s
ok golang.org/x/crypto/internal/chacha20 0.047s
ok golang.org/x/crypto/internal/subtle 0.011s
ok golang.org/x/crypto/md4 0.013s
ok golang.org/x/crypto/nacl/auth 9.226s
ok golang.org/x/crypto/nacl/box 0.016s
ok golang.org/x/crypto/nacl/secretbox 0.012s
ok golang.org/x/crypto/nacl/sign 0.012s
ok golang.org/x/crypto/ocsp 0.047s
ok golang.org/x/crypto/openpgp 8.872s
ok golang.org/x/crypto/openpgp/armor 0.012s
ok golang.org/x/crypto/openpgp/clearsign 16.984s
ok golang.org/x/crypto/openpgp/elgamal 0.013s
? golang.org/x/crypto/openpgp/errors [no test files]
ok golang.org/x/crypto/openpgp/packet 0.159s
ok golang.org/x/crypto/openpgp/s2k 7.597s
ok golang.org/x/crypto/otr 0.612s
ok golang.org/x/crypto/pbkdf2 0.045s
ok golang.org/x/crypto/pkcs12 0.073s
ok golang.org/x/crypto/pkcs12/internal/rc2 0.013s
ok golang.org/x/crypto/poly1305 0.016s
ok golang.org/x/crypto/ripemd160 0.034s
ok golang.org/x/crypto/salsa20 0.013s
ok golang.org/x/crypto/salsa20/salsa 0.013s
ok golang.org/x/crypto/scrypt 0.942s
ok golang.org/x/crypto/sha3 0.140s
ok golang.org/x/crypto/ssh 0.939s
ok golang.org/x/crypto/ssh/agent 0.529s
ok golang.org/x/crypto/ssh/knownhosts 0.027s
ok golang.org/x/crypto/ssh/terminal 0.016s
ok golang.org/x/crypto/tea 0.010s
ok golang.org/x/crypto/twofish 0.019s
ok golang.org/x/crypto/xtea 0.012s
ok golang.org/x/crypto/xts 0.016s
```
A match of an OCSP request's serial number against *any* of the configured `reqSerialPrefixes` entries is now sufficient for the request to be valid, rather than only a match against the last `reqSerialPrefixes` entry.
Resolves https://github.com/letsencrypt/boulder/issues/3829
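A sketch of the corrected prefix check; the empty-list behavior shown is an assumption for illustration.

```go
package sketch

import "strings"

// serialPrefixAllowed returns true if the serial matches any configured
// prefix, rather than only the last one checked.
func serialPrefixAllowed(reqSerialPrefixes []string, serial string) bool {
	if len(reqSerialPrefixes) == 0 {
		return true // assumption: no prefixes configured means no filtering
	}
	for _, prefix := range reqSerialPrefixes {
		if strings.HasPrefix(serial, prefix) {
			return true
		}
	}
	return false
}
```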
A very large number of the logger calls are of the form log.Function(fmt.Sprintf(...)).
Rather than sprinkling fmt.Sprintf at every logger call site, provide formatting versions
of the logger functions and call these directly with the format and arguments.
While here remove some unnecessary trailing newlines and calls to String/Error.
Previously, all errors were treated as Not Found, but we actually want
to treat database errors differently; for instance, by not caching them,
and by setting tighter alerting thresholds for them.
Fixes #3419.
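A sketch of the distinction, treating `sql.ErrNoRows` as "not found" and everything else as a database error; the query shown is illustrative.

```go
package sketch

import (
	"database/sql"
	"errors"
)

// lookupResponse distinguishes the two cases: a missing row is a "not found"
// (cacheable) condition, while any other database error is surfaced so it can
// be alerted on separately and not cached.
func lookupResponse(db *sql.DB, serial string) ([]byte, bool, error) {
	var response []byte
	err := db.QueryRow(
		"SELECT ocspResponse FROM certificateStatus WHERE serial = ?", serial,
	).Scan(&response)
	switch {
	case errors.Is(err, sql.ErrNoRows):
		return nil, false, nil // not found
	case err != nil:
		return nil, false, err // real database error
	}
	return response, true, nil
}
```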
There were two bugs in #3167:
- All process-level stats got prefixed with "boulder", which broke dashboards.
- All request_time stats got dropped, because measured_http was using the prometheus DefaultRegisterer.
To fix, this PR plumbs through a scope object to measured_http, and uses an empty prefix when calling NewProcessCollector().
Previously, we used prometheus.DefaultRegisterer to register our stats, which uses global state to export its HTTP stats. We also used net/http/pprof's behavior of registering to the default global HTTP ServeMux, via DebugServer, which starts an HTTP server that uses that global ServeMux.
In this change, I merge DebugServer's functions into StatsAndLogging. StatsAndLogging now takes an address parameter and fires off an HTTP server in a goroutine. That HTTP server is newly defined, and doesn't use DefaultServeMux. On it is registered the Prometheus stats handler, and handlers for the various pprof traces. In the process I split StatsAndLogging internally into two functions: makeStats and MakeLogger. I didn't port across the expvar variable exporting, which serves a similar function to Prometheus stats but which we never use.
One nice immediate effect of this change: Since StatsAndLogging now requires an address, I noticed a bunch of commands that called StatsAndLogging, and passed around the resulting Scope, but never made use of it because they didn't run a DebugServer. Under the old StatsD world, these commands still could have exported their stats by pushing, but since we moved to Prometheus their stats stopped being collected. We haven't used any of these stats, so instead of adding debug ports to all short-lived commands, or setting up a push gateway, I simply removed them and switched those commands to initialize only a Logger, no stats.
Fixes #3020.
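A sketch of the kind of non-default mux this describes, registering the Prometheus scrape handler and the pprof handlers; the function name is illustrative.

```go
package sketch

import (
	"net/http"
	"net/http/pprof"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// newDebugMux builds a dedicated ServeMux (not http.DefaultServeMux) carrying
// the Prometheus scrape handler and the pprof handlers, roughly the shape of
// handler the merged StatsAndLogging serves in a goroutine.
func newDebugMux(reg *prometheus.Registry) *http.ServeMux {
	mux := http.NewServeMux()
	mux.Handle("/metrics", promhttp.HandlerFor(reg, promhttp.HandlerOpts{}))
	mux.HandleFunc("/debug/pprof/", pprof.Index)
	mux.HandleFunc("/debug/pprof/profile", pprof.Profile)
	mux.HandleFunc("/debug/pprof/symbol", pprof.Symbol)
	mux.HandleFunc("/debug/pprof/trace", pprof.Trace)
	return mux
}

// Typically served as: go http.ListenAndServe(debugAddr, newDebugMux(reg))
```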
In order to write integration tests for some features, especially related to rate limiting, rechecking of CAA, and expiration of authzs, orders, and certs, we need to be able to fake the passage of time in integration tests.
To do so, this change switches out all clock.Default() instances for cmd.Clock(), which can be set manually with the FAKECLOCK environment variable. integration-test.py now starts up all servers once before the main body of tests, with FAKECLOCK set to a date 70 days ago, and does some initial setup for a new integration test case. That test case tries to fetch a 70-day-old authz URL, and expects it to 404.
In order to make this work, I also had to change a number of our test binaries to shut down cleanly in response to SIGTERM. Without that change, stopping the servers between the setup phase and the main tests caused startservers.check() to fail, because some processes exited with nonzero status.
Note: This is an initial stab at things, to prove out the technique. Long-term, I think we will want to use an idiom where test cases are classes that have a number of optional setup phases that may be run at e.g. 70 days prior and 5 days prior. This could help us avoid a proliferation of global state as we add more time-dependent test cases.
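A sketch of the FAKECLOCK idea using the jmhodges/clock package Boulder already depends on; the date layout and function name here are assumptions, not the exact cmd.Clock() implementation.

```go
package sketch

import (
	"os"
	"time"

	"github.com/jmhodges/clock"
)

// newClock returns a real clock normally, but when FAKECLOCK is set (as
// integration-test.py does), returns a fake clock pinned to that time so
// tests can simulate the passage of time.
func newClock() (clock.Clock, error) {
	if fake := os.Getenv("FAKECLOCK"); fake != "" {
		t, err := time.Parse(time.UnixDate, fake) // layout is an assumption
		if err != nil {
			return nil, err
		}
		fc := clock.NewFake()
		fc.Set(t)
		return fc, nil
	}
	return clock.New(), nil
}
```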
This used to be used for AMQP queue names. Now that AMQP is gone, these consts
were only used when printing a version string at startup. This changes
VersionString to just use the name of the current program, and removes
`const clientName = ` from many of our main.go's.
This removes the config and code to output to statsd.
- Change `cmd.StatsAndLogging` to output a `Scope`, not a `Statter`.
- Remove the prefixing of component name (e.g. "VA") in front of stats; this was stripped by `autoProm` but now no longer needs to be.
- Delete vendored statsd client.
- Delete `MockStatter` (generated by gomock) and `mocks.Statter` (hand generated) in favor of mocking `metrics.Scope`, which is the interface we now use everywhere.
- Remove a few unused methods on `metrics.Scope`, and update its generated mock.
- Refactor `autoProm` and add `autoRegisterer`, which can be included in a `metrics.Scope`, avoiding global state. `autoProm` now registers everything with the `prometheus.Registerer` it is given.
- Change va_test.go's `setup()` to not return a stats object; instead the individual tests that care about stats override `va.stats` directly.
Fixes #2639, #2733.
Uses a special mux for the OCSP Responder so that we stop collapsing double slashes in GET requests, which caused a small number of requests to be considered malformed.
Rename HTTPMonitor to MeasuredHandler.
Remove inflight stat (we didn't use it).
Add timing stat by method, endpoint, and status code.
The timing stat subsumes the "rate" stat, so remove that.
WFE now wraps in MeasuredHandler, instead of relying on its cmd/main.go.
Remove FBAdapter stats.
MeasuredHandler tracks stats by method, status code, and endpoint.
In VA, add a Prometheus histogram for validation timing.
If you are the first person to add a feature to a Boulder command it's very
easy to forget to update the command's config structure to accommodate a
`map[string]bool` entry and to pass it to `features.Set` in `main()`. See
https://github.com/letsencrypt/boulder/issues/2533 for one example. I've
fallen into this trap myself a few times so I'm going to try and save myself
some future grief by fixing it across the board once and for all!
This PR adds a `Features` config entry and a corresponding `features.Set` to:
* ocsp-updater (resolves #2533)
* admin-revoker
* boulder-publisher
* contact-exporter
* expiration-mailer
* expired-authz-purger
* notify-mailer
* ocsp-responder
* orphan-finder
These components were skipped because they already had features supported:
* boulder-ca
* boulder-ra
* boulder-sa
* boulder-va
* boulder-wfe
* cert-checker
I deliberately skipped adding Feature support to:
* single-ocsp (Its only configuration comes from the pkcs11key library and
doesn't support features)
* rabbitmq-setup (No configuration/features and we'll likely soon be rming this
since the gRPC migration)
* notafter-backfill (This is a one-off that will be deleted soon)
Implements a less RPC-focused signal catch/shutdown method. Certain things that could probably also use this (e.g. `ocsp-updater`) haven't been given it, as they would require rather substantial changes to allow for a graceful shutdown approach.
Fixes #2298.
Updates #1699.
Adds a new package, `features`, which exposes methods to set and check if various internal features are enabled. The implementation uses global state to store the features so that services embedded in another service do not each require their own features map in order to check if something is enabled.
Requires a `boulder-tools` image update to include `golang.org/x/tools/cmd/stringer`.
Output base64-encoded DER, as expected by ocsp-responder.
Use flags instead of template for Status, ThisUpdate, NextUpdate.
Provide better help.
Remove old test (wasn't run automatically).
Add it to integration test, and use its output for integration test of issuer ocsp-responder.
Add another slot to boulder-tools HSM image, to store root key.
Right now we use the Source field for both DB and file URLs. However, we want to move to the DBConnect config field, so that we can take advantage of the code that reads DSNs from a file on disk. It turns out the existing code didn't work if you configured a dbConnect string, because it would error out with:
"source" parameter not found in JSON config
After rearranging, both methods should work.
* rename, change params, restructure
* I'm wondering how I managed that one
* use a metrics.Scope
* move method to SA, update callers
* rerun goimports
* fix compile error
* revert cmd/shell.go
https://github.com/letsencrypt/boulder/pull/1805
- Remove error signatures from log methods. This means fewer places where errcheck will show ignored errors.
- Pull in latest cfssl to be compatible with errorless log messages.
- Reduce the number of message priorities we support to just those we actually use.
- AuditNotice -> AuditInfo
- Remove InfoObject (only one use, switched to Info)
- Remove EmergencyExit and related functions in favor of panic
- Remove SyslogWriter / AuditLogger separate types in favor of a single interface, Logger, that has all the logging methods on it.
- Merge mock log into logger. This allows us to unexport the internals but still override them in the mock.
- Shorten names to be compatible with Go style: New, Set, Get, Logger, NewMock, etc.
- Use a shorter log format for stdout logs.
- Remove "... Starting" log messages. We have better information in the "Versions" message logged at startup.
Motivation: The AuditLogger / SyslogWriter distinction was confusing and exposed internals only necessary for tests. Some components accepted one type and some accepted the other. This made it hard to consistently use mock loggers in tests. Also, the unnecessarily fat interface for AuditLogger made it hard to meaningfully mock out.
It's now the default in all cases where it was previously configurable. When we want to
suppress SQL debug messages, we can simply adjust the logging level to suppress
debug messages in general.
Also, pass a logger to SetSQLDebug rather than calling GetAuditLogger.
Some stats services, we believe, report the ocsp-responder as down
because `/` currently returns 400 Bad Request.
Shuffle some code into a new `mux` function to make it easier to test.
Consolidate initialization of stats and logging from each main.go into cmd
package.
Define a new config parameter, `StdoutLevel`, that determines the maximum log
level that will be printed to stdout. It can be set to 6 to inhibit debug
messages, or 0 to print only emergency messages, or -1 to print no messages at
all.
Remove the existing config parameter `Tag`. Instead, choose the tag from the
basename of the currently running process. Previously all Boulder log messages
had the tag "boulder", but now they will be differentiated by process, like
"boulder-wfe".
Shorten the date format used in stdout logging, and add the current binary's
basename.
Consolidate setup function in audit-logger_test.go.
Note: Most CLI binaries now get their stats and logging from the parameters of
Action. However, a few of our binaries don't use our custom AppShell, and
instead use codegangsta/cli directly. For those binaries, we export the new
StatsAndLogging method from cmd.
Fixes https://github.com/letsencrypt/boulder/issues/852
OCSP-Responder attempts to read the OCSP response from the certificateStatus table.
If it cannot find a response there, it reads the ocspResponses table to try to find
one. If neither table contains a response, the not-found bool is passed back to the
Responder.
* Moves revocation from the CA to the OCSP-Updater: the RA will mark certificates as
revoked, then wait for the OCSP-Updater to create a new (final) revoked response
* Merges the ocspResponses table into the certificateStatus table and only uses UPDATEs
to update the OCSP response (vs. INSERT-only, since this happens quite often and would
lead to an extremely large table)
We had a staging deploy go bad because of missing error handling for the case
where the "source" config is not in the JSON. While we debugged, I wrote
some tests.
Fixes #936.
This means after parsing the config file, setting up stats, and dialing the
syslogger. But it is still before trying to initialize the given server. This
means that we are more likely to get version numbers logged for some common
runtime failures.
If two OCSP responses were generated in the same second, the earlier one would
sometimes take priority, leading to a "good" response for revoked
certificates and causing the OCSP integration test to be flaky.