Commit Graph

171 Commits

Author SHA1 Message Date
Samantha 99502b1ffb
ocsp-updater: use rows.Scan() to get query results (#5656)
- Replace `gorp.DbMap` with calls that use `sql.DB` directly
- Use `rows.Scan()` and `rows.Next()` to get query results (which opens the door to streaming the results)
- Export function `CertStatusMetadataFields` from `SA`
- Add new function `ScanCertStatusRow` to `SA`
- Add new function `NewDbSettingsFromDBConfig` to `SA`

Fixes #5642
Part Of #5715
2021-10-18 10:33:09 -07:00
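The `rows.Next()` / `rows.Scan()` pattern named in the commit above can be sketched as follows. This is a minimal illustration, not the code from the change: `rowIterator`, `certStatus`, `streamStatuses`, and `fakeRows` are hypothetical names, and the two-column row is an assumption. The interface only lists methods that `*sql.Rows` really has, so the same function would accept the result of `db.Query`.

```go
package main

import (
	"database/sql"
	"fmt"
)

// rowIterator is the subset of *sql.Rows used below. Declaring it as an
// interface lets this sketch run against an in-memory fake (hypothetical name).
type rowIterator interface {
	Next() bool
	Scan(dest ...interface{}) error
	Err() error
}

// Compile-time check that *sql.Rows satisfies rowIterator.
var _ rowIterator = (*sql.Rows)(nil)

// certStatus is a hypothetical two-column row, not Boulder's real model.
type certStatus struct {
	Serial string
	Status string
}

// streamStatuses hands each row to handle as soon as it is scanned, so the
// full result set never has to be held in memory at once.
func streamStatuses(rows rowIterator, handle func(certStatus) error) error {
	for rows.Next() {
		var cs certStatus
		if err := rows.Scan(&cs.Serial, &cs.Status); err != nil {
			return err
		}
		if err := handle(cs); err != nil {
			return err
		}
	}
	// Err reports any error that ended the iteration early.
	return rows.Err()
}

// fakeRows is an in-memory stand-in for *sql.Rows.
type fakeRows struct {
	data [][2]string
	pos  int
}

func (f *fakeRows) Next() bool {
	f.pos++
	return f.pos <= len(f.data)
}

func (f *fakeRows) Err() error { return nil }

func (f *fakeRows) Scan(dest ...interface{}) error {
	for i, d := range dest {
		p, ok := d.(*string)
		if !ok || i >= 2 {
			return fmt.Errorf("unsupported scan destination %d", i)
		}
		*p = f.data[f.pos-1][i]
	}
	return nil
}

func main() {
	rows := &fakeRows{data: [][2]string{{"01", "good"}, {"02", "revoked"}}}
	err := streamStatuses(rows, func(cs certStatus) error {
		fmt.Println(cs.Serial, cs.Status)
		return nil
	})
	if err != nil {
		fmt.Println("error:", err)
	}
}
```

With a real database, the caller would also `defer rows.Close()` right after a successful `db.Query`.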
Aaron Gable 9ee02b2588
ocsp-responder: handle NameIDs in the database (#5592)
Give ocsp-responder a new map of IssuerNameIDs to keyHashes,
so that it can confirm that OCSP requests have an appropriate
key hash whether the database is storing old-style IssuerIDs or
new-style IssuerNameIDs.

Part of #5152
2021-08-20 18:21:16 -07:00
Aaron Gable 5fcabde592
Make CertificateStatus.IssuerID not a reference (#5594)
Change `CertificateStatus.IssuerID` from `*int64` to just an
`int64`. It might make sense for this field to be nillable in a world
where we want to distinguish between it being missing and it
being zero, but none of our code actually does that: we error
out either way.

Part of #5152
2021-08-20 13:19:29 -07:00
J.C. Jones 7b31bdb30a
Add read-only dbConns to SQLStorageAuthority and OCSPUpdater (#5555)
This changeset adds a second DB connect string for the SA for use in 
read-only queries that are not themselves dependencies for read-write 
queries. In other words, this is attempting to only catch things like 
rate-limit `SELECT`s and other coarse-counting, so we can potentially 
move those read queries off the read-write primary database.

It also adds a second DB connect string to the OCSP Updater. This is a 
little trickier, as the subsequent `UPDATE`s _are_ dependent on the 
output of the `SELECT`, but in this case it's operating on data batches,
and a few seconds' replication latency are several orders of magnitude 
below the threshold for update frequency, so any certificates that 
aren't caught on run `n` can be caught on run `n+1`.

Since we export DB metrics to Prometheus, this also refactors 
`InitDBMetrics` to take a DB Address (host:port tuple) and User out of 
the DB connection DSN and include those as labels in the metrics.

Fixes #5550
Fixes #4985
2021-08-02 11:21:34 -07:00
Aaron Gable 9a12ba7f7f
OCSP: Don't warn on expired responses (#5507)
Downgrade the "ocsp response expired" log from Warning to Info, as
this is a very common occurrence and should be expected.

Fixes #5501
2021-07-09 10:01:20 -07:00
Aaron Gable 9abb39d4d6
Honeycomb integration proof-of-concept (#5408)
Add Honeycomb tracing to all Boulder components which act as
HTTP servers, gRPC servers, or gRPC clients. Add many values
which we currently emit to logs to the trace spans. Add a way to
configure the Honeycomb integration to our config files, and by
default configure all of our tests to "mute" (send nothing).

Followup changes will refine the configuration, attempt to reduce
the new dependency load, and introduce better sampling.

Part of https://github.com/letsencrypt/dev-misc-tickets/issues/218
2021-05-24 16:13:08 -07:00
Aaron Gable f569b15b64
Remove common config from ocsp-responder (#5350)
The old `config.Common.IssuerCert` format is no longer used in any
production configs, and can be removed safely.

Part of #5162
Part of #5242
2021-03-18 17:16:37 -07:00
Samantha 5a92926b0c
Remove dbconfig migration deployability code (#5348)
Default boulder code paths to exclusively use the `db` config key

Fixes #5338
2021-03-18 16:41:15 -07:00
Samantha e2e7dad034
Move cmd.DBConfig fields to their own named sub-struct (#5286)
The named field `DB`, in each component configuration struct, acts as the
receiver for the value of `db` when component JSON files are
unmarshalled.

When `cmd.DBConfig` fields are received at the root of a component
configuration struct instead of in `DB`, copy them to the `DB` field of the
component configuration struct.

Move existing `cmd.DBConfig` values from the root of each component's
JSON configuration in `test/config-next` to `db`

Part of #5275
2021-02-16 10:48:58 -08:00
Samantha 7cb0038498
Deprecate MaxDBConns for MaxOpenConns (#5274)
In #5235 we deprecated MaxDBConns in favor of MaxOpenConns.

One week ago MaxDBConns was removed from all dev, staging, and
production configurations. This change completes the removal of
MaxDBConns from all components and test/config.

Fixes #5249
2021-02-08 12:00:01 -08:00
Samantha e0510056cc
Enhancements to SQL driver tuning via JSON config (#5235)
Historically the only database/sql driver setting exposed via JSON
config was maxDBConns. This change adds support for maxIdleConns,
connMaxLifetime, connMaxIdleTime, and renames maxDBConns to
maxOpenConns. The addition of these settings will give our SRE team a
convenient method for tuning the reuse/closure of database connections.

A new struct, DBSettings, has been added to SA. The struct, and each of
its fields, has been commented.

All new fields have been plumbed through to the relevant Boulder
components and exported as Prometheus metrics. Tests have been
added/modified to ensure that the fields are being set. There should be
no loss in coverage.

Deployability concerns for the migration from maxDBConns to maxOpenConns
have been addressed with the temporary addition of the helper method
cmd.DBConfig.GetMaxOpenConns(). This method can be removed once
test/config is defaulted to using maxOpenConns. Relevant sections of the
code have TODOs added that link back to a newly opened issue.

Fixes #5199
2021-01-25 15:34:55 -08:00
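As a rough sketch of how the four settings in the commit above map onto `database/sql`, the block below applies them through the standard `*sql.DB` setters. The struct name `DBSettings` comes from the commit message; its field names and types, the `poolTuner` interface, and the `recorder` fake are assumptions made so the example runs without a database driver.

```go
package main

import (
	"database/sql"
	"fmt"
	"strings"
	"time"
)

// DBSettings holds the four pool-tuning knobs named in the commit message.
// The field names and types here are assumptions for illustration.
type DBSettings struct {
	MaxOpenConns    int
	MaxIdleConns    int
	ConnMaxLifetime time.Duration
	ConnMaxIdleTime time.Duration
}

// poolTuner lists the *sql.DB setters used below (hypothetical interface).
type poolTuner interface {
	SetMaxOpenConns(n int)
	SetMaxIdleConns(n int)
	SetConnMaxLifetime(d time.Duration)
	SetConnMaxIdleTime(d time.Duration)
}

// Compile-time check that *sql.DB satisfies poolTuner.
var _ poolTuner = (*sql.DB)(nil)

// applySettings copies every setting onto the connection pool.
func applySettings(db poolTuner, s DBSettings) {
	db.SetMaxOpenConns(s.MaxOpenConns)
	db.SetMaxIdleConns(s.MaxIdleConns)
	db.SetConnMaxLifetime(s.ConnMaxLifetime)
	db.SetConnMaxIdleTime(s.ConnMaxIdleTime)
}

// recorder is a fake pool that remembers which setters were called.
type recorder struct{ calls []string }

func (r *recorder) SetMaxOpenConns(n int) {
	r.calls = append(r.calls, fmt.Sprintf("open=%d", n))
}

func (r *recorder) SetMaxIdleConns(n int) {
	r.calls = append(r.calls, fmt.Sprintf("idle=%d", n))
}

func (r *recorder) SetConnMaxLifetime(d time.Duration) {
	r.calls = append(r.calls, fmt.Sprintf("lifetime=%s", d))
}

func (r *recorder) SetConnMaxIdleTime(d time.Duration) {
	r.calls = append(r.calls, fmt.Sprintf("idletime=%s", d))
}

// describe applies s to a recorder and reports the calls that were made.
func describe(s DBSettings) string {
	r := &recorder{}
	applySettings(r, s)
	return strings.Join(r.calls, " ")
}

// exampleSettings returns arbitrary values chosen for the demo.
func exampleSettings() DBSettings {
	return DBSettings{
		MaxOpenConns:    10,
		MaxIdleConns:    5,
		ConnMaxLifetime: time.Hour,
		ConnMaxIdleTime: 30 * time.Second,
	}
}

func main() {
	fmt.Println(describe(exampleSettings()))
}
```

`SetConnMaxIdleTime` requires Go 1.15 or later.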
Aaron Gable 5ca0c343af
ocsp-responder: move IssuerCerts out of common config (#5203)
The vast majority of Boulder components no longer care about
anything in the common config block. As such, we hope to
remove it entirely in the near future. So let's put the (not-yet-used)
IssuerCerts config item in the main OCSPResponder block,
rather than in the common block.

Part of #5204
2020-12-15 16:59:38 -08:00
Aaron Gable 9ba2d3c00b
ocsp-responder: move IssuerID check after Expires check (#5202)
It is possible for a CertificateStatus row to have a nil IssuerID
(there was a period of time in which we didn't write IssuerIDs into
CertificateStatus rows at all) but all such rows should be old and
therefore expired.

Unfortunately, this code was checking the IssuerID before it was
checking the Expiry, and therefore was generating panics when trying
to dereference a nil pointer.

This change simply moves the IssuerID check to be after the Expires
check, so that we'll only try to dereference the IssuerID on recent
CertificateStatus rows, where it is guaranteed to be non-nil.

Fixes #5200
2020-12-15 14:38:21 -08:00
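The ordering fix described in the commit above can be illustrated with a small sketch. The type, field, and error names are hypothetical, and the extra nil check after the expiry check is a defensive addition of this example rather than something the commit message describes.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// certificateStatus is a hypothetical, trimmed-down row. IssuerID is a
// pointer because old rows may not have one.
type certificateStatus struct {
	NotAfter time.Time
	IssuerID *int64
}

var (
	errExpired  = errors.New("certificate expired")
	errNoIssuer = errors.New("row has no issuer ID")
)

// demoNow is a fixed "current time" so the example is deterministic.
var demoNow = time.Date(2020, 12, 15, 0, 0, 0, 0, time.UTC)

// issuerFor checks expiry first, so old rows with a nil IssuerID are
// rejected before the pointer is ever dereferenced.
func issuerFor(cs certificateStatus, now time.Time) (int64, error) {
	if !cs.NotAfter.After(now) {
		return 0, errExpired
	}
	// Defensive check (an assumption of this sketch): never dereference nil.
	if cs.IssuerID == nil {
		return 0, errNoIssuer
	}
	return *cs.IssuerID, nil
}

// demoRow builds a row that expires the given number of hours after demoNow.
func demoRow(hoursUntilExpiry int, issuerID *int64) certificateStatus {
	return certificateStatus{
		NotAfter: demoNow.Add(time.Duration(hoursUntilExpiry) * time.Hour),
		IssuerID: issuerID,
	}
}

func main() {
	id := int64(7)
	fmt.Println(issuerFor(demoRow(-24, nil), demoNow)) // old row, nil issuer
	fmt.Println(issuerFor(demoRow(24, &id), demoNow))  // recent row
}
```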
Aaron Gable fff9794477
ocsp-responder: don't respond for other issuers (#5183)
When making an OCSP request, the client provides three pieces of
information: the URL which it is querying to get OCSP info, the
hash of the issuer public key which issued the cert in question, and
the serial number of the cert in question. In Boulder, the first
of these is only provided implicitly, based on which instance of
ocsp-responder is handling the request: we ensure (via configs)
that the ocsp-responder handling a given OCSP AIA URL has the
corresponding issuer cert loaded in memory.

When handling a request, the ocsp-responder checks three things:
that the request is using SHA1 to hash the issuer public key, that
the requested issuer public key matches one of the loaded issuer
certs, and that the requested serial number is one we could have
issued (i.e. has the correct prefix). It relies on the database
query to filter out requests for non-existent serials.

However, this means that a request to an ocsp-responder instance
with issuer cert A loaded could receive and handle a request which
specifies cert A as the issuer, but names a serial which was actually
issued by issuer cert B. The checks all pass and the database lookup
succeeds. But the returned OCSP response is for a certificate that
was issued by a different issuer, and the response itself was
signed by that other issuer.

In order to resolve this potentially confusing situation, this change
adds one additional check to the ocsp-responder: after it has
retrieved the ocsp response, it looks up which issuer produced that
ocsp response, confirms that it has that issuer cert loaded in
memory, and confirms that its issuer key hash matches that in the
original request.

There is still one wrinkle if issuer certs A and B are both loaded
in one ocsp-responder, and that one ocsp-responder is handling OCSP
requests to both of their AIA OCSP URLs. In this case, it is possible
that a request to a.ocsp.com, but requesting OCSP for a cert issued
by B, could still have its request answered. This is because the
ocsp-responder itself does not know which URL was requested. But
regardless, this change does guarantee that the response will match
the contents of the request (or no response will be given), no matter
what URL that request was sent to.

Fixes #5182
2020-11-30 11:51:32 -08:00
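The additional check described in the commit above (look up which issuer produced the stored response, confirm it is loaded, and confirm its key hash matches the request) might look roughly like this. `issuerSet`, `checkResponse`, the error values, and the integer issuer IDs are all hypothetical; the real responder's types are not shown in the commit message.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
)

var (
	errUnknownIssuer  = errors.New("response issuer is not loaded")
	errIssuerMismatch = errors.New("response issuer does not match request")
)

// issuerSet maps a (hypothetical) issuer ID to the hash of that issuer's
// public key, for every issuer cert loaded in memory.
type issuerSet map[int64][]byte

// checkResponse is run after the stored response has been retrieved. It
// confirms that the issuer which produced the response is loaded, and that
// its key hash equals the issuer key hash named in the original request.
func (s issuerSet) checkResponse(reqKeyHash []byte, respIssuerID int64) error {
	hash, ok := s[respIssuerID]
	if !ok {
		return errUnknownIssuer
	}
	if !bytes.Equal(hash, reqKeyHash) {
		return errIssuerMismatch
	}
	return nil
}

func main() {
	loaded := issuerSet{1: []byte("hash-A"), 2: []byte("hash-B")}
	// Request names issuer A, response was produced by issuer A: accepted.
	fmt.Println(loaded.checkResponse([]byte("hash-A"), 1))
	// Request names issuer A, but the serial was issued by B: rejected.
	fmt.Println(loaded.checkResponse([]byte("hash-A"), 2))
	// Response produced by an issuer this responder has not loaded: rejected.
	fmt.Println(loaded.checkResponse([]byte("hash-A"), 3))
}
```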
Aaron Gable 8cf597459d
Add multi-issuer support to ocsp-responder (#5154)
The ocsp-responder takes a path to a certificate file as one of
its config values. It uses this path as one of the inputs when
constructing its DBSource, the object responsible for querying
the database for pregenerated OCSP responses to fulfill requests.

However, this certificate file is not necessary to query the
database; rather, it only acts as a filter: OCSP requests whose
IssuerKeyHash do not match the hash of the loaded certificate are
rejected outright, without querying the DB. In addition, there is
currently only support for a single certificate file in the config.

This change adds support for multiple issuer certificate files in
the config, and refactors the pre-database filtering of bad OCSP
requests into a helper object dedicated solely to that purpose.

Fixes #5119
2020-11-10 09:21:09 -08:00
Aaron Gable 6f0016262f
Simplify database interactions (#4949)
This change is a result of an audit of all places where
Go code directly constructs SQL queries and executes them
against a dbMap, with the goal of eliminating all instances
of constructing a well-known object type (such as a
core.CertificateStatus) from explicitly-listed database columns.
Instead, we should be relying on helper functions defined in the
sa itself to determine which columns are relevant for the
construction of any given object.

This audit did not find many places where this was occurring. It
did reveal a few simplifications, which are contained in this
change:
1) Greater use of existing SelectFoo methods provided by models.go
2) Streamlining of various SelectSingularFoo methods to always
   select by serial string, rather than user-provided WHERE clause
3) One spot (in ocsp-responder) where using a well-known type seemed
   better than using a more minimal custom type

Addresses #4899
2020-07-20 11:12:52 -07:00
Roland Bracewell Shoemaker e940b6386f
ocsp: switch from cfssl/log to internal log (#4941)
Fixes #4898.
2020-07-08 09:32:23 -07:00
Jacob Hoffman-Andrews 7bddafd45e
Add MaxBytesReader for ocsp-responder. (#4869)
Also, return status code 500 when the OCSP response from
the DB is unparseable.
2020-06-23 11:30:59 -07:00
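For readers unfamiliar with `http.MaxBytesReader`, the following sketch shows the general technique the commit above refers to. The 64-byte limit, the handler, and the choice of a 413 status for oversized bodies are assumptions of this example, not details taken from ocsp-responder.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// maxRequestBytes is a hypothetical limit chosen for the demo.
const maxRequestBytes = 64

// handler caps how much of the request body it is willing to read.
func handler(w http.ResponseWriter, r *http.Request) {
	r.Body = http.MaxBytesReader(w, r.Body, maxRequestBytes)
	body, err := io.ReadAll(r.Body)
	if err != nil {
		// Reading past the limit fails instead of consuming more memory.
		http.Error(w, "request too large", http.StatusRequestEntityTooLarge)
		return
	}
	fmt.Fprintf(w, "read %d bytes", len(body))
}

// statusForBody sends a POST with a body of the given size to handler and
// returns the resulting HTTP status code.
func statusForBody(size int) int {
	body := strings.NewReader(strings.Repeat("x", size))
	req := httptest.NewRequest(http.MethodPost, "/", body)
	rec := httptest.NewRecorder()
	handler(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(statusForBody(10))   // within the limit
	fmt.Println(statusForBody(1000)) // over the limit
}
```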
Roland Bracewell Shoemaker c4813cc340
cmd/ceremony: merge single-ocsp tool into ceremony (#4878)
Fixes #4658.
2020-06-23 11:30:31 -07:00
Jacob Hoffman-Andrews 06ffb57221
Update go-gorp and run go mod tidy. (#4860)
gorp now uses go modules.

```
$ cd ~/go/src/github.com/go-gorp/gorp/
$ git checkout v3.0.1
$ go test ./...
ok      github.com/go-gorp/gorp/v3      0.002s
```
2020-06-10 16:18:37 -07:00
Roland Bracewell Shoemaker d516537394
cmd/ocsp-responder: calculate key hash rather than relying on SKID (#4851)
Rather than just assuming the SKID is the key hash, calculate the
actual hash. Also reject any requests for hashes we don't support.
2020-06-09 17:56:34 -07:00
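One way to calculate an issuer key hash directly, rather than trusting the SKID, is to hash the public key BIT STRING inside the SubjectPublicKeyInfo. The sketch below assumes SHA-1, which an earlier entry in this log names as the hash the responder accepts; `keyHash` and `demoSPKI` are hypothetical helpers, and the throwaway ECDSA key exists only to give the example some input. For a parsed certificate, the input would be `cert.RawSubjectPublicKeyInfo`.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/sha1"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/asn1"
	"errors"
	"fmt"
)

// keyHash returns the SHA-1 hash of the public key bits contained in a
// DER-encoded SubjectPublicKeyInfo, excluding the algorithm identifier.
func keyHash(spkiDER []byte) ([sha1.Size]byte, error) {
	var spki struct {
		Algorithm pkix.AlgorithmIdentifier
		PublicKey asn1.BitString
	}
	rest, err := asn1.Unmarshal(spkiDER, &spki)
	if err != nil {
		return [sha1.Size]byte{}, err
	}
	if len(rest) != 0 {
		return [sha1.Size]byte{}, errors.New("trailing data after SubjectPublicKeyInfo")
	}
	return sha1.Sum(spki.PublicKey.RightAlign()), nil
}

// demoSPKI generates a throwaway key and returns its SubjectPublicKeyInfo.
func demoSPKI() []byte {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		panic(err)
	}
	der, err := x509.MarshalPKIXPublicKey(&key.PublicKey)
	if err != nil {
		panic(err)
	}
	return der
}

func main() {
	hash, err := keyHash(demoSPKI())
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("issuer key hash: %x\n", hash)
}
```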
Jacob Hoffman-Andrews 0e9ac0c638
Use bytes.Equal instead of bytes.Compare == 0 (#4758)
staticcheck cleanup: https://staticcheck.io/docs/checks#S1004
2020-04-08 17:20:56 -07:00
Roland Bracewell Shoemaker 5b2f11e07e Switch away from old style statsd metrics wrappers (#4606)
In a handful of places I've nuked old stats which are not used in any alerts or dashboards as they either duplicate other stats or don't provide much insight/have never actually been used. If we feel like we need them again in the future it's trivial to add them back.

There aren't many dashboards that rely on old statsd-style metrics, but a few will need to be updated when this change is deployed. There are also a few cases where Prometheus labels have been changed from camel to snake case; dashboards that use these will also need to be updated. As far as I can tell no alerts are impacted by this change.

Fixes #4591.
2019-12-18 11:08:25 -05:00
Daniel McCarney 1c9ece3f44
SA: use wrapped database maps/transactions. (#4585)
New types and related infrastructure are added to the `db` package to allow
wrapping gorp DbMaps and Transactions.

The wrapped versions return a special `db.ErrDatabaseOp` error type when errors
occur. The new error type includes additional information such as the operation
that failed and the related table.

Where possible we determine the table based on the types of the gorp function
arguments. Where that isn't possible (e.g. with raw SQL queries) we try to use
a simple regexp approach to find the table name. This isn't great for general
SQL but works well enough for Boulder's existing SQL queries.

To get additional confidence my regexps work for all of Boulder's queries
I temporarily changed the `db` package's `tableFromQuery` function to panic if
the table couldn't be determined. I re-ran the full unit and integration test
suites with this configuration and saw no panics.

Resolves https://github.com/letsencrypt/boulder/issues/4559
2019-12-04 13:03:09 -05:00
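A rough sketch of the idea in the commit above: an error type that records the failed operation and table, plus a simple regexp to guess the table from raw SQL. The names `ErrDatabaseOp` and `tableFromQuery` appear in the commit message, but the fields, the message format, and the regexp here are assumptions and almost certainly differ from the real `db` package.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
)

// ErrDatabaseOp wraps an underlying error with the operation and table
// involved. The field layout is an assumption of this sketch.
type ErrDatabaseOp struct {
	Op    string
	Table string
	Err   error
}

func (e ErrDatabaseOp) Error() string {
	return fmt.Sprintf("failed to %s %s: %s", e.Op, e.Table, e.Err)
}

// Unwrap lets errors.Is and errors.As see the underlying error.
func (e ErrDatabaseOp) Unwrap() error { return e.Err }

// tableRegexp captures the identifier that follows FROM, INTO, or UPDATE.
// It is deliberately simple and not a general SQL parser.
var tableRegexp = regexp.MustCompile(`(?i)\b(?:from|into|update)\s+(\w+)`)

// tableFromQuery returns the first table name found in query, or "".
func tableFromQuery(query string) string {
	m := tableRegexp.FindStringSubmatch(query)
	if m == nil {
		return ""
	}
	return m[1]
}

// errNoRows is a stand-in for a driver-level error such as sql.ErrNoRows.
var errNoRows = errors.New("no rows in result set")

func main() {
	q := "SELECT serial FROM certificateStatus WHERE serial = ?"
	err := ErrDatabaseOp{Op: "select", Table: tableFromQuery(q), Err: errNoRows}
	fmt.Println(err)
	fmt.Println(errors.Is(err, errNoRows))
}
```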
Roland Bracewell Shoemaker 3359ec349b ocsp-responder: Integrate CFSSL OCSP responder code (#4461)
Integrates the cfssl/ocsp responder code directly into boulder. I've tried to
pare down the existing code to only the bits we actually use and have removed
some generic interfaces in places in favor of directly using our boulder
specific interfaces.

Fixes #4427.
2019-10-07 14:05:37 -04:00
Roland Bracewell Shoemaker db01830508
Return OCSP unauthorized status if the certificate is expired (#4380)
The ocsp-updater ocspStaleMaxAge config var has to be bumped up to ~7 months so that when it is run after the six-months-ago run it will actually update the ocsp responses generated during that period and mark the certificate status row as expired.

Fixes #4338.
2019-08-01 14:13:27 -07:00
Daniel McCarney 0ecdf80709 SA: refactor DB stat collection & collect more stats. (#4096)
Go 1.11+ updated the `sql.DBStats` struct with new fields that are of
interest to us. This PR routes these stats to Prometheus by replacing
the existing autoprom stats code with new first-class Prometheus
metrics. Resolves https://github.com/letsencrypt/boulder/issues/4095

The `max_db_connections` stat from the SA is removed because the Go 1.11+
`sql.DBStats.MaxOpenConnections` field will give us a better view of
the same information.

The autoprom "reused_authz" stat that was being incremented in
`SA.GetPendingAuthorization` was also removed. It wasn't doing what it
says it was (counting reused authorizations) and was instead counting
the number of times `GetPendingAuthorization` returned an authz.
2019-03-06 17:08:53 -08:00
Jacob Hoffman-Andrews 48103af5b1 Add timeout to ocsp-responder (#3892)
Right now if ocsp-responder gets flooded with traffic, it will have a number of requests that
spend long enough waiting for an available connection that the reverse proxy will have given
up on them before they get a chance to execute the SQL query. Add a timeout parameter so
ocsp-responder can gracefully shed this load rather than try to do pointless work.
2018-10-22 09:20:08 -04:00
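The load-shedding idea in the commit above can be sketched with `context.WithTimeout`. Everything here is illustrative: `lookupWithTimeout`, `slowQuery`, and `timedOut` are hypothetical names, and the simulated delay stands in for a request waiting on a database connection. With a real `*sql.DB`, the context would be passed to a call such as `QueryContext`.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// lookupWithTimeout runs query with a deadline, so a request that has waited
// too long is abandoned instead of doing work nobody is waiting for.
func lookupWithTimeout(parent context.Context, timeout time.Duration,
	query func(context.Context) (string, error)) (string, error) {
	ctx, cancel := context.WithTimeout(parent, timeout)
	defer cancel()
	return query(ctx)
}

// slowQuery simulates a query that needs delay to complete, but gives up
// as soon as its context is done.
func slowQuery(delay time.Duration) func(context.Context) (string, error) {
	return func(ctx context.Context) (string, error) {
		select {
		case <-time.After(delay):
			return "response", nil
		case <-ctx.Done():
			return "", ctx.Err()
		}
	}
}

// timedOut reports whether a query taking delayMs milliseconds exceeds a
// timeout of timeoutMs milliseconds.
func timedOut(timeoutMs, delayMs int) bool {
	_, err := lookupWithTimeout(
		context.Background(),
		time.Duration(timeoutMs)*time.Millisecond,
		slowQuery(time.Duration(delayMs)*time.Millisecond),
	)
	return errors.Is(err, context.DeadlineExceeded)
}

func main() {
	fmt.Println("fast query timed out:", timedOut(500, 1))
	fmt.Println("slow query timed out:", timedOut(10, 500))
}
```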
Roland Bracewell Shoemaker 00be0627bd Add a stats shim to ocsp-responder (#3841)
Fixes #3836.

```
$ ./test.sh
ok  	github.com/cloudflare/cfssl/api	1.023s	coverage: 81.1% of statements
ok  	github.com/cloudflare/cfssl/api/bundle	1.464s	coverage: 87.2% of statements
ok  	github.com/cloudflare/cfssl/api/certadd	16.766s	coverage: 86.8% of statements
ok  	github.com/cloudflare/cfssl/api/client	1.062s	coverage: 51.9% of statements
ok  	github.com/cloudflare/cfssl/api/crl	1.075s	coverage: 75.0% of statements
ok  	github.com/cloudflare/cfssl/api/gencrl	1.038s	coverage: 72.5% of statements
ok  	github.com/cloudflare/cfssl/api/generator	1.478s	coverage: 33.3% of statements
ok  	github.com/cloudflare/cfssl/api/info	1.085s	coverage: 84.1% of statements
ok  	github.com/cloudflare/cfssl/api/initca	1.050s	coverage: 90.5% of statements
ok  	github.com/cloudflare/cfssl/api/ocsp	1.114s	coverage: 93.8% of statements
ok  	github.com/cloudflare/cfssl/api/revoke	3.063s	coverage: 75.0% of statements
ok  	github.com/cloudflare/cfssl/api/scan	2.988s	coverage: 62.1% of statements
ok  	github.com/cloudflare/cfssl/api/sign	2.680s	coverage: 83.3% of statements
ok  	github.com/cloudflare/cfssl/api/signhandler	1.114s	coverage: 26.3% of statements
ok  	github.com/cloudflare/cfssl/auth	1.010s	coverage: 68.2% of statements
ok  	github.com/cloudflare/cfssl/bundler	22.078s	coverage: 84.5% of statements
ok  	github.com/cloudflare/cfssl/certdb/dbconf	1.013s	coverage: 84.2% of statements
ok  	github.com/cloudflare/cfssl/certdb/ocspstapling	1.302s	coverage: 69.2% of statements
ok  	github.com/cloudflare/cfssl/certdb/sql	1.223s	coverage: 70.5% of statements
ok  	github.com/cloudflare/cfssl/cli	1.014s	coverage: 62.5% of statements
ok  	github.com/cloudflare/cfssl/cli/bundle	1.011s	coverage: 0.0% of statements [no tests to run]
ok  	github.com/cloudflare/cfssl/cli/crl	1.086s	coverage: 57.8% of statements
ok  	github.com/cloudflare/cfssl/cli/gencert	7.927s	coverage: 83.6% of statements
ok  	github.com/cloudflare/cfssl/cli/gencrl	1.064s	coverage: 73.3% of statements
ok  	github.com/cloudflare/cfssl/cli/gencsr	1.058s	coverage: 70.3% of statements
ok  	github.com/cloudflare/cfssl/cli/genkey	2.718s	coverage: 70.0% of statements
ok  	github.com/cloudflare/cfssl/cli/ocsprefresh	1.077s	coverage: 64.3% of statements
ok  	github.com/cloudflare/cfssl/cli/revoke	1.033s	coverage: 88.2% of statements
ok  	github.com/cloudflare/cfssl/cli/scan	1.014s	coverage: 36.0% of statements
ok  	github.com/cloudflare/cfssl/cli/selfsign	2.342s	coverage: 73.2% of statements
ok  	github.com/cloudflare/cfssl/cli/serve	1.076s	coverage: 38.2% of statements
ok  	github.com/cloudflare/cfssl/cli/sign	1.070s	coverage: 54.8% of statements
ok  	github.com/cloudflare/cfssl/cli/version	1.011s	coverage: 100.0% of statements
ok  	github.com/cloudflare/cfssl/cmd/cfssl	1.028s	coverage: 0.0% of statements [no tests to run]
ok  	github.com/cloudflare/cfssl/cmd/cfssljson	1.012s	coverage: 3.4% of statements
ok  	github.com/cloudflare/cfssl/cmd/mkbundle	1.011s	coverage: 0.0% of statements [no tests to run]
ok  	github.com/cloudflare/cfssl/config	1.023s	coverage: 67.7% of statements
ok  	github.com/cloudflare/cfssl/crl	1.054s	coverage: 68.3% of statements
ok  	github.com/cloudflare/cfssl/csr	8.473s	coverage: 89.6% of statements
ok  	github.com/cloudflare/cfssl/errors	1.014s	coverage: 79.6% of statements
ok  	github.com/cloudflare/cfssl/helpers	1.216s	coverage: 80.6% of statements
ok  	github.com/cloudflare/cfssl/helpers/derhelpers	1.017s	coverage: 48.0% of statements
ok  	github.com/cloudflare/cfssl/helpers/testsuite	7.826s	coverage: 65.8% of statements
ok  	github.com/cloudflare/cfssl/initca	151.314s	coverage: 73.2% of statements
ok  	github.com/cloudflare/cfssl/log	1.013s	coverage: 59.3% of statements
ok  	github.com/cloudflare/cfssl/multiroot/config	1.258s	coverage: 77.4% of statements
ok  	github.com/cloudflare/cfssl/ocsp	1.353s	coverage: 75.1% of statements
ok  	github.com/cloudflare/cfssl/revoke	1.149s	coverage: 75.0% of statements
ok  	github.com/cloudflare/cfssl/scan	1.023s	coverage: 1.1% of statements
skipped github.com/cloudflare/cfssl/scan/crypto/md5
skipped github.com/cloudflare/cfssl/scan/crypto/rsa
skipped github.com/cloudflare/cfssl/scan/crypto/sha1
skipped github.com/cloudflare/cfssl/scan/crypto/sha256
skipped github.com/cloudflare/cfssl/scan/crypto/sha512
skipped github.com/cloudflare/cfssl/scan/crypto/tls
ok  	github.com/cloudflare/cfssl/selfsign	1.098s	coverage: 70.0% of statements
ok  	github.com/cloudflare/cfssl/signer	1.020s	coverage: 19.4% of statements
ok  	github.com/cloudflare/cfssl/signer/local	4.886s	coverage: 77.9% of statements
ok  	github.com/cloudflare/cfssl/signer/remote	2.500s	coverage: 70.0% of statements
ok  	github.com/cloudflare/cfssl/signer/universal	2.228s	coverage: 67.7% of statements
ok  	github.com/cloudflare/cfssl/transport	1.012s
ok  	github.com/cloudflare/cfssl/transport/ca/localca	1.046s	coverage: 94.9% of statements
ok  	github.com/cloudflare/cfssl/transport/kp	1.050s	coverage: 37.1% of statements
ok  	github.com/cloudflare/cfssl/ubiquity	1.037s	coverage: 88.3% of statements
ok  	github.com/cloudflare/cfssl/whitelist	3.519s	coverage: 100.0% of statements
...

$ go test ./...
ok  	golang.org/x/crypto/acme	2.782s
ok  	golang.org/x/crypto/acme/autocert	2.963s
?   	golang.org/x/crypto/acme/autocert/internal/acmetest	[no test files]
ok  	golang.org/x/crypto/argon2	0.047s
ok  	golang.org/x/crypto/bcrypt	4.694s
ok  	golang.org/x/crypto/blake2b	0.056s
ok  	golang.org/x/crypto/blake2s	0.050s
ok  	golang.org/x/crypto/blowfish	0.015s
ok  	golang.org/x/crypto/bn256	0.460s
ok  	golang.org/x/crypto/cast5	4.204s
ok  	golang.org/x/crypto/chacha20poly1305	0.560s
ok  	golang.org/x/crypto/cryptobyte	0.014s
?   	golang.org/x/crypto/cryptobyte/asn1	[no test files]
ok  	golang.org/x/crypto/curve25519	0.025s
ok  	golang.org/x/crypto/ed25519	0.073s
?   	golang.org/x/crypto/ed25519/internal/edwards25519	[no test files]
ok  	golang.org/x/crypto/hkdf	0.012s
ok  	golang.org/x/crypto/internal/chacha20	0.047s
ok  	golang.org/x/crypto/internal/subtle	0.011s
ok  	golang.org/x/crypto/md4	0.013s
ok  	golang.org/x/crypto/nacl/auth	9.226s
ok  	golang.org/x/crypto/nacl/box	0.016s
ok  	golang.org/x/crypto/nacl/secretbox	0.012s
ok  	golang.org/x/crypto/nacl/sign	0.012s
ok  	golang.org/x/crypto/ocsp	0.047s
ok  	golang.org/x/crypto/openpgp	8.872s
ok  	golang.org/x/crypto/openpgp/armor	0.012s
ok  	golang.org/x/crypto/openpgp/clearsign	16.984s
ok  	golang.org/x/crypto/openpgp/elgamal	0.013s
?   	golang.org/x/crypto/openpgp/errors	[no test files]
ok  	golang.org/x/crypto/openpgp/packet	0.159s
ok  	golang.org/x/crypto/openpgp/s2k	7.597s
ok  	golang.org/x/crypto/otr	0.612s
ok  	golang.org/x/crypto/pbkdf2	0.045s
ok  	golang.org/x/crypto/pkcs12	0.073s
ok  	golang.org/x/crypto/pkcs12/internal/rc2	0.013s
ok  	golang.org/x/crypto/poly1305	0.016s
ok  	golang.org/x/crypto/ripemd160	0.034s
ok  	golang.org/x/crypto/salsa20	0.013s
ok  	golang.org/x/crypto/salsa20/salsa	0.013s
ok  	golang.org/x/crypto/scrypt	0.942s
ok  	golang.org/x/crypto/sha3	0.140s
ok  	golang.org/x/crypto/ssh	0.939s
ok  	golang.org/x/crypto/ssh/agent	0.529s
ok  	golang.org/x/crypto/ssh/knownhosts	0.027s
ok  	golang.org/x/crypto/ssh/terminal	0.016s
ok  	golang.org/x/crypto/tea	0.010s
ok  	golang.org/x/crypto/twofish	0.019s
ok  	golang.org/x/crypto/xtea	0.012s
ok  	golang.org/x/crypto/xts	0.016s
```
2018-09-04 16:10:03 -07:00
Daniel McCarney 00f94de354 ocsp-responder: check reqSerialPrefixes correctly. (#3830)
A match of an OCSP request's serial number to *any* of the configured `reqSerialPrefixes` entries is sufficient for the request to be valid, not just the last `reqSerialPrefixes` entry.

Resolves https://github.com/letsencrypt/boulder/issues/3829
2018-08-23 14:47:02 -07:00
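The corrected matching rule from the commit above is easy to state in code: a serial is acceptable as soon as any configured prefix matches. `serialAllowed` is a hypothetical name, and treating an empty prefix list as "no filtering" is an assumption of this sketch.

```go
package main

import (
	"fmt"
	"strings"
)

// serialAllowed reports whether serial starts with any of the configured
// prefixes. Returning on the first match is the point: the result must not
// depend only on the last entry in the list.
func serialAllowed(serial string, prefixes []string) bool {
	// Assumption: with no prefixes configured, nothing is filtered.
	if len(prefixes) == 0 {
		return true
	}
	for _, p := range prefixes {
		if strings.HasPrefix(serial, p) {
			return true
		}
	}
	return false
}

func main() {
	prefixes := []string{"ff", "00"}
	fmt.Println(serialAllowed("ff1234", prefixes)) // matches the first prefix
	fmt.Println(serialAllowed("001234", prefixes)) // matches the last prefix
	fmt.Println(serialAllowed("ab1234", prefixes)) // matches none
}
```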
Roland Bracewell Shoemaker 3a8f0bc0be Allow ocsp-responder to filter requests by serial prefix (#3815) 2018-08-10 11:16:22 -04:00
Joel Sing 8ebdfc60b6 Provide formatting logger functions. (#3699)
A very large number of the logger calls are of the form log.Function(fmt.Sprintf(...)).
Rather than sprinkling fmt.Sprintf at every logger call site, provide formatting versions
of the logger functions and call these directly with the format and arguments.

While here remove some unnecessary trailing newlines and calls to String/Error.
2018-05-10 11:06:29 -07:00
Jacob Hoffman-Andrews 6584d2067b
Return 500s from ocsp-responder. (#3423)
Previously, all errors were treated as Not Found, but we actually want
to treat database errors differently; for instance, by not caching them,
and by setting tighter alerting thresholds for them.

Fixes #3419.
2018-02-06 11:37:44 -08:00
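A minimal sketch of the distinction drawn in the commit above, assuming a sentinel error marks the "not found" case. The names `errNotFound`, `errDBDown`, and `statusFor` are hypothetical; the real responder's error types are not shown in the commit message.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// errNotFound is a hypothetical sentinel for "no response for this serial".
var errNotFound = errors.New("no OCSP response for serial")

// errDBDown stands in for any other database failure.
var errDBDown = errors.New("database unavailable")

// statusFor maps a lookup error to an HTTP status. Only a genuine miss is a
// 404; every other failure is a 500, so it can be alerted on separately and
// kept out of caches.
func statusFor(err error) int {
	switch {
	case err == nil:
		return http.StatusOK
	case errors.Is(err, errNotFound):
		return http.StatusNotFound
	default:
		return http.StatusInternalServerError
	}
}

// wrappedDBError returns a database failure wrapped with extra context.
func wrappedDBError() error {
	return fmt.Errorf("looking up response: %w", errDBDown)
}

func main() {
	fmt.Println(statusFor(nil), statusFor(errNotFound), statusFor(wrappedDBError()))
}
```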
Roland Bracewell Shoemaker 2a04a85c49 Export max DB connections in boulder-sa and ocsp-responder (#3388)
Fixes #3387.
2018-01-24 09:11:01 -05:00
Jacob Hoffman-Andrews 6cd777bd8d Fix up stats after #3167 (#3185)
There were two bugs in #3167:

- All process-level stats got prefixed with "boulder", which broke dashboards.
- All request_time stats got dropped, because measured_http was using the prometheus DefaultRegisterer.

To fix, this PR plumbs through a scope object to measured_http, and uses an empty prefix when calling NewProcessCollector().
2017-10-18 11:14:59 -07:00
Jacob Hoffman-Andrews 071fc0120f Remove facebookgo/httpdown. (#3168)
Its purpose is now served by net/http's Shutdown().
2017-10-17 08:55:43 -04:00
Jacob Hoffman-Andrews f366e45756 Remove global state from metrics gathering (#3167)
Previously, we used prometheus.DefaultRegisterer to register our stats, which uses global state to export its HTTP stats. We also used net/http/pprof's behavior of registering to the default global HTTP ServeMux, via DebugServer, which starts an HTTP server that uses that global ServeMux.

In this change, I merge DebugServer's functions into StatsAndLogging. StatsAndLogging now takes an address parameter and fires off an HTTP server in a goroutine. That HTTP server is newly defined, and doesn't use DefaultServeMux. On it is registered the Prometheus stats handler, and handlers for the various pprof traces. In the process I split StatsAndLogging internally into two functions: makeStats and MakeLogger. I didn't port across the expvar variable exporting, which serves a similar function to Prometheus stats but which we never use.

One nice immediate effect of this change: Since StatsAndLogging now requires an address, I noticed a bunch of commands that called StatsAndLogging, and passed around the resulting Scope, but never made use of it because they didn't run a DebugServer. Under the old StatsD world, these commands could still have exported their stats by pushing, but since we moved to Prometheus their stats stopped being collected. We haven't used any of these stats, so instead of adding debug ports to all short-lived commands, or setting up a push gateway, I simply removed them and switched those commands to initialize only a Logger, no stats.
2017-10-13 11:58:01 -07:00
Jacob Hoffman-Andrews 0a72f768a7 Remove ProfileCmd. (#3166)
These stats are now all collected by Prometheus.
2017-10-13 10:02:04 -04:00
Jacob Hoffman-Andrews 4128e0d95a Add time-dependent integration testing (#3060)
Fixes #3020.

In order to write integration tests for some features, especially related to rate limiting, rechecking of CAA, and expiration of authzs, orders, and certs, we need to be able to fake the passage of time in integration tests.

To do so, this change switches out all clock.Default() instances for cmd.Clock(), which can be set manually with the FAKECLOCK environment variable. integration-test.py now starts up all servers once before the main body of tests, with FAKECLOCK set to a date 70 days ago, and does some initial setup for a new integration test case. That test case tries to fetch a 70-day-old authz URL, and expects it to 404.

In order to make this work, I also had to change a number of our test binaries to shut down cleanly in response to SIGTERM. Without that change, stopping the servers between the setup phase and the main tests caused startservers.check() to fail, because some processes exited with nonzero status.

Note: This is an initial stab at things, to prove out the technique. Long-term, I think we will want to use an idiom where test cases are classes that have a number of optional setup phases that may be run at e.g. 70 days prior and 5 days prior. This could help us avoid a proliferation of global state as we add more time-dependent test cases.
2017-09-13 12:34:14 -07:00
Roland Bracewell Shoemaker c03d96212b Update vendored github.com/cloudflare/cfssl (#3078) 2017-09-13 15:23:38 -04:00
Jacob Hoffman-Andrews 63a25bf913 Remove clientName everywhere. (#2862)
This used to be used for AMQP queue names. Now that AMQP is gone, these consts
were only used when printing a version string at startup. This changes
VersionString to just use the name of the current program, and removes
`const clientName = ` from many of our main.go's.
2017-07-12 10:28:54 -07:00
Jacob Hoffman-Andrews b17b5c72a6 Remove statsd from Boulder (#2752)
This removes the config and code to output to statsd.

- Change `cmd.StatsAndLogging` to output a `Scope`, not a `Statter`.
- Remove the prefixing of component name (e.g. "VA") in front of stats; this was stripped by `autoProm` but now no longer needs to be.
- Delete vendored statsd client.
- Delete `MockStatter` (generated by gomock) and `mocks.Statter` (hand generated) in favor of mocking `metrics.Scope`, which is the interface we now use everywhere.
- Remove a few unused methods on `metrics.Scope`, and update its generated mock.
- Refactor `autoProm` and add `autoRegisterer`, which can be included in a `metrics.Scope`, avoiding global state. `autoProm` now registers everything with the `prometheus.Registerer` it is given.
- Change va_test.go's `setup()` to not return a stats object; instead the individual tests that care about stats override `va.stats` directly.

Fixes #2639, #2733.
2017-05-15 10:19:54 -04:00
Roland Bracewell Shoemaker bd045b9325 Fix OCSP-Responder double slash collapsing (#2748)
Uses a special mux for the OCSP Responder so that we stop collapsing double slashes in GET requests, which caused a small number of requests to be considered malformed.
2017-05-10 09:51:10 -04:00
Jacob Hoffman-Andrews d59188c676 Use pattern to determine endpoint metrics. (#2689)
This ensures we don't create infinite metrics based on users hitting
non-existent endpoints.
2017-04-20 13:14:47 -04:00
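To illustrate why labelling by pattern bounds the number of metrics, the sketch below asks an `http.ServeMux` which registered pattern matches a request and uses that as the label. The routes, the `"unknown"` fallback label, and the helper names are assumptions; only `ServeMux.Handler`, which returns the matched pattern, is standard library behaviour.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// newMux registers two hypothetical routes: one exact, one subtree.
func newMux() *http.ServeMux {
	mux := http.NewServeMux()
	noop := func(w http.ResponseWriter, r *http.Request) {}
	mux.HandleFunc("/directory", noop)
	mux.HandleFunc("/acme/cert/", noop)
	return mux
}

// endpointLabel returns the registered pattern that matches r, so the set
// of possible labels is fixed by the routes rather than by user input.
func endpointLabel(mux *http.ServeMux, r *http.Request) string {
	_, pattern := mux.Handler(r)
	if pattern == "" {
		// No route matched; fold all such requests into one label.
		return "unknown"
	}
	return pattern
}

// labelFor is a convenience wrapper that builds a GET request for path.
func labelFor(path string) string {
	return endpointLabel(newMux(), httptest.NewRequest(http.MethodGet, path, nil))
}

func main() {
	for _, p := range []string{"/directory", "/acme/cert/0123", "/wp-login.php"} {
		fmt.Printf("%s -> %s\n", p, labelFor(p))
	}
}
```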
Jacob Hoffman-Andrews 4b665e35a6 Use Prometheus stats for VA, WFE, and OCSP Responder (#2628)
Rename HTTPMonitor to MeasuredHandler.
Remove inflight stat (we didn't use it).
Add timing stat by method, endpoint, and status code.
The timing stat subsumes the "rate" stat, so remove that.
WFE now wraps in MeasuredHandler, instead of relying on its cmd/main.go.
Remove FBAdapter stats.
MeasuredHandler tracks stats by method, status code, and endpoint.

In VA, add a Prometheus histogram for validation timing.
2017-04-03 17:03:04 -07:00
Daniel McCarney 00d11f126b Parse feature flags in all cmd's (#2534)
If you are the first person to add a feature to a Boulder command it's very
easy to forget to update the command's config structure to accommodate a
`map[string]bool` entry and to pass it to `features.Set` in `main()`. See
https://github.com/letsencrypt/boulder/issues/2533 for one example. I've
fallen into this trap myself a few times so I'm going to try and save myself
some future grief by fixing it across the board once and for all!

This PR adds a `Features` config entry and a corresponding `features.Set` to:
* ocsp-updater (resolves #2533)
* admin-revoker
* boulder-publisher
* contact-exporter
* expiration-mailer
* expired-authz-purger
* notify-mailer
* ocsp-responder
* orphan-finder

These components were skipped because they already had features supported:
* boulder-ca
* boulder-ra
* boulder-sa
* boulder-va
* boulder-wfe
* cert-checker

I deliberately skipped adding Feature support to:
* single-ocsp (Its only configuration comes from the pkcs11key library and
  doesn't support features)
* rabbitmq-setup (No configuration/features and we'll likely soon be rming this
  since the gRPC migration)
* notafter-backfill (This is a one-off that will be deleted soon)
2017-01-27 16:29:46 -05:00
Roland Bracewell Shoemaker 595204b23f Implement improved signal catching in services that already use it (#2333)
Implements a less RPC-focused signal catch/shutdown method. Certain things that probably could also use this (e.g. `ocsp-updater`) haven't been given it as they would require rather substantial changes to allow for a graceful shutdown approach.

Fixes #2298.
2016-11-18 21:05:04 -05:00
Jacob Hoffman-Andrews 32c03f942b Don't start DebugServer until server's ready. (#2271)
This makes availability of DebugServer a better proxy for readiness of the
component.
2016-10-21 16:57:14 -04:00
Roland Bracewell Shoemaker 239bf9ae0a Very basic feature flag impl (#1705)
Updates #1699.

Adds a new package, `features`, which exposes methods to set and check if various internal features are enabled. The implementation uses global state to store the features so that services embedded in another service do not each require their own features map in order to check if something is enabled.

Requires a `boulder-tools` image update to include `golang.org/x/tools/cmd/stringer`.
2016-09-20 16:29:01 -07:00
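The global-state design described in the commit above can be sketched as follows. This is a simplified illustration, not the real `features` package: it keys flags by plain strings rather than generated enum values, and only the `Set(map[string]bool)` shape is grounded in the log (a later commit passes a `map[string]bool` to `features.Set`). The `Enabled` function and the flag name are assumptions.

```go
package main

import (
	"fmt"
	"sync"
)

// Package-level state: any code in the process can check a flag without
// being handed its own features map.
var (
	mu      sync.RWMutex
	enabled = map[string]bool{}
)

// Set applies the flags parsed from a command's config.
func Set(flags map[string]bool) {
	mu.Lock()
	defer mu.Unlock()
	for name, on := range flags {
		enabled[name] = on
	}
}

// Enabled reports whether a flag is on; unknown flags read as off.
// (Hypothetical name for the check method.)
func Enabled(name string) bool {
	mu.RLock()
	defer mu.RUnlock()
	return enabled[name]
}

func main() {
	Set(map[string]bool{"ExampleFlag": true}) // "ExampleFlag" is made up
	fmt.Println(Enabled("ExampleFlag"), Enabled("OtherFlag"))
}
```

The trade-off of global state is that tests must reset it between cases, but embedded services get flag checks for free.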
Jacob Hoffman-Andrews 87fee12d6c Improve single-ocsp command (#2181)
Output base64-encoded DER, as expected by ocsp-responder.
Use flags instead of template for Status, ThisUpdate, NextUpdate.
Provide better help.
Remove old test (wasn't run automatically).
Add it to integration test, and use its output for integration test of issuer ocsp-responder.

Add another slot to boulder-tools HSM image, to store root key.
2016-09-15 15:28:54 -07:00
Roland Bracewell Shoemaker c8f1fb3e2f Remove direct usages of go-statsd-client in favor of using metrics.Scope (#2136)
Fixes #2118, fixes #2082.
2016-09-07 19:35:13 -04:00
Jacob Hoffman-Andrews 031a4022bd Fix dbConnect strings in OCSP Responder. (#2047)
Right now we use the Source field for both DB and file URLs. However, we want to move to the DBConnect config field, so that we can take advantage of the code that reads DSNs from a file on disk.  It turns out the existing code didn't work if you configure a dbConnect string, because it would error out with:

  "source" parameter not found in JSON config

After rearranging, both methods should work.
2016-07-20 10:36:54 -04:00
Ben Irving 0e2ef748b4 Split up boulder-config.json (OCSP Responder) (#2017) 2016-07-07 14:52:08 -04:00
Ben Irving 1336c42813 Replace all log.Err calls with log.AuditErr (#1891)
* remove calls to log.Err()
* go fmt
* remove more occurrences
* change AuditErr argument to string and replace occurrences
2016-06-06 16:27:16 -04:00
Roland Bracewell Shoemaker 54573b36ba Remove all stray copyright headers and append the initial line to LICENSE.txt (#1853) 2016-05-31 12:32:04 -07:00
Kane York fef60a8fd6 Add statsd reporting of current DB connection count (#1805)
* rename, change params, restructure
* I'm wondering how I managed that one
* use a metrics.Scope
* move method to SA, update callers
* rerun goimports
* fix compile error
* revert cmd/shell.go

https://github.com/letsencrypt/boulder/pull/1805
2016-05-12 20:33:23 -07:00
Jacob Hoffman-Andrews b3bc3d8e41 Add a MaxDBConns config parameter. (#1793) 2016-05-09 14:21:15 -07:00
Kane York 7a4aa49add Return false when ocsp blob is empty (#1771)
2016-05-06 17:22:19 -07:00
Jacob Hoffman-Andrews e6c17e1717 Switch to new vendor style (#1747)
* Switch to new vendor style.

* Fix metrics generate command.

* Fix miekg/dns types_generate.

* Use generated copies of files.

* Update miekg to latest.

Fixes a problem with `go generate`.

* Set GO15VENDOREXPERIMENT.

* Build in letsencrypt/boulder.

* fix travis more.

* Exclude vendor instead of godeps.

* Replace some ...

* Fix unformatted cmd

* Fix errcheck for vendorexp

* Add GO15VENDOREXPERIMENT to Makefile.

* Temp disable errcheck.

* Restore master fetch.

* Restore errcheck.

* Build with 1.6 also.

* Match statsd.*"

* Skip errcheck unless Go1.6.

* Add other ignorepkg.

* Fix errcheck.

* move errcheck

* Remove go1.6 requirement.

* Put godep-restore with errcheck.

* Remove go1.6 dep.

* Revert master fetch revert.

* Remove -r flag from godep save.

* Set GO15VENDOREXPERIMENT in Dockerfile and remove _workspace.

* Fix Godep version.
2016-04-18 12:51:36 -07:00
Jacob Hoffman-Andrews ecc04e8e61 Refactor log package (#1717)
- Remove error signatures from log methods. This means fewer places where errcheck will show ignored errors.
- Pull in latest cfssl to be compatible with errorless log messages.
- Reduce the number of message priorities we support to just those we actually use.
- AuditNotice -> AuditInfo
- Remove InfoObject (only one use, switched to Info)
- Remove EmergencyExit and related functions in favor of panic
- Remove SyslogWriter / AuditLogger separate types in favor of a single interface, Logger, that has all the logging methods on it.
- Merge mock log into logger. This allows us to unexport the internals but still override them in the mock.
- Shorten names to be compatible with Go style: New, Set, Get, Logger, NewMock, etc.
- Use a shorter log format for stdout logs.
- Remove "... Starting" log messages. We have better information in the "Versions" message logged at startup.

Motivation: The AuditLogger / SyslogWriter distinction was confusing and exposed internals only necessary for tests. Some components accepted one type and some accepted the other. This made it hard to consistently use mock loggers in tests. Also, the unnecessarily fat interface for AuditLogger made it hard to meaningfully mock out.
2016-04-08 16:12:20 -07:00
Jacob Hoffman-Andrews a3533f0bba Reduce log levels in OCSP responder. (#1702)
* Reduce log levels in OCSP responder.
* Use mock log in test.
* Update upstream cfssl.
2016-04-08 14:41:14 -07:00
Roland Bracewell Shoemaker 800b5b0cbf Switch to using a wrapped statter that provides PID
* Switch to using a wrapped statter that provides PID

* Fix tests and change some types to interfaces

* Add hostname to suffix + update comment
2016-04-01 15:43:35 -07:00
Jacob Hoffman-Andrews 39d0240793 Remove SQLDebug config option.
It's now the default in all cases that it was configurable. When we want to
suppress SQL debug messages, we can simply adjust the logging level to suppress
debug messages in general.

Also, pass a logger to SetSQLDebug rather than calling GetAuditLogger.
2016-03-29 23:32:02 -07:00
Jacob Hoffman-Andrews 0fda27e15a Remove checking of ocspResponses table.
We now use the certificateStatus table.
2016-02-09 10:36:41 -08:00
Jeff Hodges 57b6dd5bb5 make HTTPMonitor a http.Handler 2016-02-01 22:01:21 -08:00
Jeff Hodges c156f99106 ocsp-responder: 200 on GET /
Some stat services, we believe, are saying the ocsp-responder is down
because / returns 400 Bad Request currently.

Shuffle some code into a new `mux` function to make it easier to test.
2016-02-01 20:03:45 -08:00
Jacob Hoffman-Andrews feaf6bd230 Merge branch 'master' into secrets 2015-11-30 14:14:47 -08:00
Jacob Hoffman-Andrews c5de989796 Move FailOnError to correct place. 2015-11-26 11:30:18 -08:00
Jacob Hoffman-Andrews b71a850501 Fix DBConfig references. 2015-11-24 16:41:53 -08:00
Jacob Hoffman-Andrews 8aa3c6cd7b Add comment and override for Source. 2015-11-24 11:01:33 -08:00
Jacob Hoffman-Andrews f008c46a77 Run godep update and godep save -r.
Also, remove cache-control code from ocsp-responder, since caching headers are
now handled in cfssl.
2015-11-20 16:48:43 -08:00
Jacob Hoffman-Andrews 2fc0f3143e Improve logging.
Consolidate initialization of stats and logging from each main.go into cmd
package.

Define a new config parameter, `StdoutLevel`, that determines the maximum log
level that will be printed to stdout. It can be set to 6 to inhibit debug
messages, or 0 to print only emergency messages, or -1 to print no messages at
all.
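The `StdoutLevel` threshold can be illustrated with a short sketch, assuming the standard syslog severity numbering (0 = emergency through 7 = debug) that the values above imply. The function and constant names are hypothetical.

```go
package main

import "fmt"

// Syslog severities: lower numbers are more severe.
const (
	levelEmerg = 0
	levelInfo  = 6
	levelDebug = 7
)

// printsToStdout reports whether a message at msgLevel is printed when the
// configured maximum is stdoutLevel.
func printsToStdout(msgLevel, stdoutLevel int) bool {
	return msgLevel <= stdoutLevel
}

func main() {
	fmt.Println(printsToStdout(levelDebug, 6))  // false: debug inhibited
	fmt.Println(printsToStdout(levelInfo, 6))   // true
	fmt.Println(printsToStdout(levelEmerg, 0))  // true: only emergencies
	fmt.Println(printsToStdout(levelEmerg, -1)) // false: nothing printed
}
```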

Remove the existing config parameter `Tag`. Instead, choose the tag from the
basename of the currently running process. Previously all Boulder log messages
had the tag "boulder", but now they will be differentiated by process, like
"boulder-wfe".

Shorten the date format used in stdout logging, and add the current binary's
basename.

Consolidate setup function in audit-logger_test.go.

Note: Most CLI binaries now get their stats and logging from the parameters of
Action. However, a few of our binaries don't use our custom AppShell, and
instead use codegangsta/cli directly. For those binaries, we export the new
StatsAndLogging method from cmd.

Fixes https://github.com/letsencrypt/boulder/issues/852
2015-11-11 16:52:42 -08:00
Roland Shoemaker d24c73bb1b Review fixes 2015-10-20 19:15:39 -07:00
Roland Shoemaker 02cd06ad0b Rename and cleanup the dbMap wrapper interface 2015-10-16 16:27:03 -07:00
Roland Shoemaker 980d87aa14 Add test to catch logging of failed SQL calls 2015-10-16 13:58:16 -07:00
Roland Shoemaker e7c71bb7ac Log OCSP responder SQL errors 2015-10-16 11:59:04 -07:00
Roland Shoemaker 54a79fa640 Review fixes pt. 1 2015-10-12 13:19:50 -07:00
Roland Shoemaker 2b604eef6e Review fixes pt. 2 2015-10-09 18:02:02 -07:00
Roland Shoemaker 8d1ea7291f Address review comments
OCSP-Responder first attempts to read the OCSP response from the certificateStatus table.
If it cannot find a response there, it reads the ocspResponses table instead. If neither
table contains a response, the not-found bool is passed back to the Responder.
2015-10-09 15:48:09 -07:00
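The two-table fallback described in the commit above can be sketched with in-memory maps standing in for the database tables. This is an illustration of the lookup order only; the `table` type, the `lookup` function, and the serials are hypothetical, and the real responder queries SQL. An empty blob is treated as missing, in line with the later "Return false when ocsp blob is empty" commit.

```go
package main

import "fmt"

// table maps a certificate serial to a stored OCSP response blob.
type table map[string][]byte

// lookup checks certificateStatus first, then ocspResponses. The bool is
// false when neither table holds a non-empty response.
func lookup(serial string, certificateStatus, ocspResponses table) ([]byte, bool) {
	if resp := certificateStatus[serial]; len(resp) > 0 {
		return resp, true
	}
	if resp := ocspResponses[serial]; len(resp) > 0 {
		return resp, true
	}
	return nil, false
}

func main() {
	status := table{"01": []byte("new"), "02": {}}
	legacy := table{"02": []byte("old")}

	for _, serial := range []string{"01", "02", "03"} {
		resp, found := lookup(serial, status, legacy)
		fmt.Printf("%s: %q found=%v\n", serial, resp, found)
	}
}
```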
Roland Shoemaker 10b6bb5548 Refactor certificate revocation and OCSP generation workflows
* Moves revocation from the CA to the OCSP-Updater, the RA will mark certificates as
  revoked then wait for the OCSP-Updater to create a new (final) revoked response
* Merges the ocspResponses table with the certificateStatus table and only use UPDATES
  to update the OCSP response (vs INSERT-only since this happens quite often and will
  lead to an extremely large table)
2015-10-08 18:55:11 -07:00
Jeff Hodges 28a4eecad0 ocsp-responder: error on missing source and tests
We had a staging deploy go bad because of the missing error handling on
the "source" config not being in the JSON. While we debugged, I wrote
some tests.

Fixes #936.
2015-10-06 21:50:44 -07:00
Jacob Hoffman-Andrews a0ba72ea35 Merge branch 'master' into ocsp-decoding
Conflicts:
	test/amqp-integration-test.py
2015-10-01 17:48:26 -07:00
Jacob Hoffman-Andrews 9191aad304 Don't use default handler in ocsp responder. 2015-10-01 16:42:52 -07:00
Roland Shoemaker 44373307b9 Merge branch 'fb-to-statsd' of github.com:letsencrypt/boulder into fb-to-statsd 2015-10-01 15:50:01 -07:00
Roland Shoemaker 9b0586dfdc Add and use clock 2015-10-01 15:49:50 -07:00
Jeff Hodges eb7f318fdc Merge branch 'master' into fb-to-statsd 2015-10-01 15:40:27 -07:00
Jacob Hoffman-Andrews e97880aaa7 Audit log version info as early as possible.
This means after parsing the config file, setting up stats, and dialing the
syslogger. But it is still before trying to initialize the given server. This
means that we are more likely to get version numbers logged for some common
runtime failures.
2015-09-29 17:16:03 -07:00
Roland Shoemaker 081b81d170 Add a facebookgo/stats client that sends StatsD metrics for facebookgo/httpdown 2015-09-26 21:38:05 -07:00
Jeff Hodges 601cf9f0fb add Cache-Control headers to ocsp-responder
Also, adds a JSONDuration to clean up some of the config code. It will
get used more in later PRs.

Fixes #797
2015-09-25 11:26:44 -07:00
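A duration type that decodes from JSON config, as mentioned in the commit above, might look like the sketch below. It is not the actual `JSONDuration` from Boulder: the string format (anything `time.ParseDuration` accepts), the `config` struct, and the `maxAge` key are assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// JSONDuration lets a config file say "12h" instead of a raw nanosecond count.
type JSONDuration struct {
	time.Duration
}

// UnmarshalJSON parses a JSON string such as "90s" or "12h".
func (d *JSONDuration) UnmarshalJSON(b []byte) error {
	var s string
	if err := json.Unmarshal(b, &s); err != nil {
		return err
	}
	parsed, err := time.ParseDuration(s)
	if err != nil {
		return err
	}
	d.Duration = parsed
	return nil
}

// config is a hypothetical fragment of a responder config.
type config struct {
	MaxAge JSONDuration `json:"maxAge"`
}

// parseMaxAge decodes raw JSON and returns the configured max age.
func parseMaxAge(raw string) (time.Duration, error) {
	var c config
	if err := json.Unmarshal([]byte(raw), &c); err != nil {
		return 0, err
	}
	return c.MaxAge.Duration, nil
}

func main() {
	maxAge, err := parseMaxAge(`{"maxAge": "12h"}`)
	if err != nil {
		panic(err)
	}
	// A Cache-Control header takes max-age in whole seconds.
	fmt.Printf("Cache-Control: max-age=%d\n", int(maxAge.Seconds()))
}
```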
Jacob Hoffman-Andrews 3bec0076cd Use file URLs for static responders. 2015-09-24 10:11:20 -07:00
Jacob Hoffman-Andrews 540c792474 Add an OCSP responder that serves from a file.
This is useful for intermediate and root OCSP, which are generated manually once
a year.
2015-09-23 16:34:13 -07:00
Roland Shoemaker 4a47aaed51 Merge master 2015-09-22 14:07:07 -07:00
Roland Shoemaker 91724296a8 Use Facebook's gracefully shutting down HTTP server for WFE & OCSP-Responder 2015-09-21 20:43:38 -07:00
Roland Shoemaker f35643bcaf Merge master 2015-09-15 12:05:58 -07:00
Richard Barnes 0584cfd53c Typo fix 2015-09-13 22:12:14 -04:00
Richard Barnes a7484fb1e7 Use subjectKeyID instead of authorityKeyID in OCSP responder configuration 2015-09-13 21:05:59 -04:00
Roland Shoemaker a3c9f60bec Review fixes 2015-08-30 22:15:13 -07:00
Roland Shoemaker 764169667e Merge master 2015-08-27 11:21:18 -07:00
Jacob Hoffman-Andrews 9b9dd76f54 Fix flaky OCSP.
If two OCSP responses were generated in the same second, the earlier would
previously take priority sometimes, leading to a "good" response for revoked
certificates and causing the OCSP integration test to be flaky.
2015-08-24 15:31:26 -07:00
Roland Shoemaker d6efd496fa Merge master 2015-08-24 12:27:58 -07:00