- Replace `gorp.DbMap` with calls that use `sql.DB` directly
- Use `rows.Scan()` and `rows.Next()` to get query results (which opens the door to streaming the results)
- Export function `CertStatusMetadataFields` from `SA`
- Add new function `ScanCertStatusRow` to `SA` (see the sketch below)
- Add new function `NewDbSettingsFromDBConfig` to `SA`
Fixes #5642
Part of #5715
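To illustrate the `rows.Next()`/`rows.Scan()` pattern described in the list above, here is a minimal sketch; the struct, the column list, and the helper shape are illustrative assumptions, not the exact code added to the SA.

```go
package sketch

import (
	"context"
	"database/sql"
	"time"
)

// certStatusRow holds a subset of certificateStatus columns (illustrative).
type certStatusRow struct {
	Serial          string
	Status          string
	OCSPLastUpdated time.Time
	NotAfter        time.Time
	IsExpired       bool
	IssuerID        int64
}

// scanCertStatusRow is a stand-in for a helper like ScanCertStatusRow: it keeps
// the column order in exactly one place.
func scanCertStatusRow(rows *sql.Rows, row *certStatusRow) error {
	return rows.Scan(&row.Serial, &row.Status, &row.OCSPLastUpdated,
		&row.NotAfter, &row.IsExpired, &row.IssuerID)
}

// selectCertStatuses queries with database/sql directly instead of gorp.DbMap,
// scanning each row as it streams back from the database.
func selectCertStatuses(ctx context.Context, db *sql.DB, query string, args ...interface{}) ([]certStatusRow, error) {
	rows, err := db.QueryContext(ctx, query, args...)
	if err != nil {
		return nil, err
	}
	defer rows.Close()

	var out []certStatusRow
	for rows.Next() {
		var row certStatusRow
		if err := scanCertStatusRow(rows, &row); err != nil {
			return nil, err
		}
		out = append(out, row)
	}
	return out, rows.Err()
}
```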
Give ocsp-responder a new map of IssuerNameIDs to keyHashes,
so that it can confirm that OCSP requests have an appropriate
key hash whether the database is storing old-style IssuerIDs or
new-style IssuerNameIDs.
Part of #5152
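A rough sketch of the kind of lookup this map enables; the type names and error messages here are assumptions for illustration, not the real ocsp-responder code.

```go
package sketch

import (
	"bytes"
	"fmt"
)

// issuerNameID stands in for the real IssuerNameID type; how it is derived
// from a certificate is out of scope for this sketch.
type issuerNameID int64

// keyHashByNameID maps each loaded issuer's NameID to the SHA1 hash of its
// public key, as that hash appears in OCSP requests.
type keyHashByNameID map[issuerNameID][]byte

// checkRequestKeyHash confirms that the key hash in an OCSP request matches
// the issuer that (per the database) produced the stored response.
func (m keyHashByNameID) checkRequestKeyHash(id issuerNameID, reqKeyHash []byte) error {
	expected, ok := m[id]
	if !ok {
		return fmt.Errorf("no issuer loaded for NameID %d", id)
	}
	if !bytes.Equal(expected, reqKeyHash) {
		return fmt.Errorf("request key hash does not match issuer %d", id)
	}
	return nil
}
```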
Change `CertificateStatus.IssuerID` from `*int64` to just an
`int64`. It might make sense for this field to be nillable in a world
where we want to distinguish between it being missing and it
being zero, but none of our code actually does that: we error
out either way.
Part of #5152
This changeset adds a second DB connect string for the SA for use in
read-only queries that are not themselves dependencies for read-write
queries. In other words, this is attempting to only catch things like
rate-limit `SELECT`s and other coarse-counting, so we can potentially
move those read queries off the read-write primary database.
It also adds a second DB connect string to the OCSP Updater. This is a
little trickier, as the subsequent `UPDATE`s _are_ dependent on the
output of the `SELECT`, but in this case it's operating on data batches,
and a few seconds' replication latency are several orders of magnitude
below the threshold for update frequency, so any certificates that
aren't caught on run `n` can be caught on run `n+1`.
Since we export DB metrics to Prometheus, this also refactors
`InitDBMetrics` to take a DB Address (host:port tuple) and User out of
the DB connection DSN and include those as labels in the metrics.
Fixes #5550. Fixes #4985.
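A sketch of what pulling the address and user out of the DSN and attaching them as metric labels might look like, assuming the go-sql-driver/mysql DSN parser; the metric and helper names are illustrative.

```go
package sketch

import (
	"database/sql"

	"github.com/go-sql-driver/mysql"
	"github.com/prometheus/client_golang/prometheus"
)

// initDBMetrics registers one example gauge for a database handle, labeled
// with the DB address (host:port) and user parsed out of the DSN.
func initDBMetrics(db *sql.DB, stats prometheus.Registerer, dsn string) error {
	cfg, err := mysql.ParseDSN(dsn)
	if err != nil {
		return err
	}
	labels := prometheus.Labels{"dbAddr": cfg.Addr, "dbUser": cfg.User}

	openConns := prometheus.NewGaugeFunc(prometheus.GaugeOpts{
		Name:        "db_open_connections",
		Help:        "Number of established connections, both in use and idle.",
		ConstLabels: labels,
	}, func() float64 {
		return float64(db.Stats().OpenConnections)
	})
	return stats.Register(openConns)
}
```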
Add Honeycomb tracing to all Boulder components which act as
HTTP servers, gRPC servers, or gRPC clients. Add many values
which we currently emit to logs to the trace spans. Add a way to
configure the Honeycomb integration in our config files, and by
default configure all of our tests to "mute" (send nothing).
Followup changes will refine the configuration, attempt to reduce
the new dependency load, and introduce better sampling.
Part of https://github.com/letsencrypt/dev-misc-tickets/issues/218
A new field named `DB`, in each component configuration struct, acts as the
receiver for the value of `db` when component JSON files are
unmarshalled.
When `cmd.DBConfig` fields are received at the root of a component
configuration struct instead of in `DB`, copy them to the `DB` field of the
component configuration struct.
Move existing `cmd.DBConfig` values from the root of each component's
JSON configuration in `test/config-next` to `db`.
Part of #5275
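A minimal sketch of the config-struct shape and the fallback copy described above; the type and field names are placeholders, not Boulder's actual `cmd.DBConfig`.

```go
package sketch

// DBConfig stands in for cmd.DBConfig; the fields shown are illustrative.
type DBConfig struct {
	DBConnectFile string `json:"dbConnectFile"`
	MaxOpenConns  int    `json:"maxOpenConns"`
}

// ComponentConfig sketches a component config struct: the new DB field
// receives the "db" block, while the embedded DBConfig still catches configs
// that place those fields at the root.
type ComponentConfig struct {
	DB DBConfig `json:"db"`
	DBConfig
}

// normalizeDBConfig copies root-level settings into DB when no "db" block was
// provided, so the rest of the component only ever reads cfg.DB.
func normalizeDBConfig(cfg *ComponentConfig) {
	if cfg.DB.DBConnectFile == "" {
		cfg.DB = cfg.DBConfig
	}
}
```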
In #5235 we replaced MaxDBConns with MaxOpenConns.
One week ago MaxDBConns was removed from all dev, staging, and
production configurations. This change completes the removal of
MaxDBConns from all components and test/config.
Fixes #5249
Historically the only database/sql driver setting exposed via JSON
config was maxDBConns. This change adds support for maxIdleConns,
connMaxLifetime, connMaxIdleTime, and renames maxDBConns to
maxOpenConns. The addition of these settings will give our SRE team a
convenient method for tuning the reuse/closure of database connections.
A new struct, DBSettings, has been added to the SA. The struct, and each of
its fields, has been commented.
All new fields have been plumbed through to the relevant Boulder
components and exported as Prometheus metrics. Tests have been
added/modified to ensure that the fields are being set. There should be
no loss in coverage.
Deployability concerns for the migration from maxDBConns to maxOpenConns
have been addressed with the temporary addition of the helper method
cmd.DBConfig.GetMaxOpenConns(). This method can be removed once
test/config is defaulted to using maxOpenConns. Relevant sections of the
code have TODOs added that link back to a newly opened issue.
Fixes #5199
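A minimal sketch of how such settings plumb through to the `database/sql` pool; the struct and method names are illustrative, and `SetConnMaxIdleTime` requires Go 1.15+.

```go
package sketch

import (
	"database/sql"
	"time"
)

// dbSettings mirrors the kind of struct described above (names illustrative).
type dbSettings struct {
	MaxOpenConns    int
	MaxIdleConns    int
	ConnMaxLifetime time.Duration
	ConnMaxIdleTime time.Duration
}

// apply plumbs each setting through to the database/sql connection pool.
func (s dbSettings) apply(db *sql.DB) {
	db.SetMaxOpenConns(s.MaxOpenConns)
	db.SetMaxIdleConns(s.MaxIdleConns)
	db.SetConnMaxLifetime(s.ConnMaxLifetime)
	db.SetConnMaxIdleTime(s.ConnMaxIdleTime) // Go 1.15+
}
```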
The vast majority of Boulder components no longer care about
anything in the common config block. As such, we hope to
remove it entirely in the near future. So let's put the (not-yet-used)
IssuerCerts config item in the main OCSPResponder block,
rather than in the common block.
Part of #5204
It is possible for a CertificateStatus row to have a nil IssuerID
(there was a period of time in which we didn't write IssuerIDs into
CertificateStatus rows at all) but all such rows should be old and
therefore expired.
Unfortunately, this code was checking the IssuerID before it was
checking the Expiry, and therefore was generating panics when trying
to dereference a nil pointer.
This change simply moves the IssuerID check to be after the Expires
check, so that we'll only try to dereference the IssuerID on recent
CertificateStatus rows, where it is guaranteed to be non-nil.
Fixes #5200
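A sketch of the corrected ordering, with placeholder names:

```go
package sketch

import "time"

// certStatusMeta stands in for the row being inspected; IssuerID is a pointer
// because very old rows may not have one.
type certStatusMeta struct {
	NotAfter time.Time
	IssuerID *int64
}

// issuerForStatus shows the corrected ordering: reject expired certificates
// before ever dereferencing IssuerID, since only ancient (and therefore
// expired) rows can have a nil IssuerID.
func issuerForStatus(status certStatusMeta, now time.Time) (int64, bool) {
	if status.NotAfter.Before(now) {
		return 0, false // expired: never touch IssuerID
	}
	return *status.IssuerID, true
}
```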
When making an OCSP request, the client provides three pieces of
information: the URL which it is querying to get OCSP info, the
hash of the public key of the issuer which issued the cert in question, and
the serial number of the cert in question. In Boulder, the first
of these is only provided implicitly, based on which instance of
ocsp-responder is handling the request: we ensure (via configs)
that the ocsp-responder handling a given OCSP AIA URL has the
corresponding issuer cert loaded in memory.
When handling a request, the ocsp-responder checks three things:
that the request is using SHA1 to hash the issuer public key, that
the requested issuer public key matches one of the loaded issuer
certs, and that the requested serial number is one we could have
issued (i.e. has the correct prefix). It relies on the database
query to filter out requests for non-existent serials.
However, this means that a request to an ocsp-responder instance
with issuer cert A loaded could receive and handle a request which
specifies cert A as the issuer, but names a serial which was actually
issued by issuer cert B. The checks all pass and the database lookup
succeeds. But the returned OCSP response is for a certificate that
was issued by a different issuer, and the response itself was
signed by that other issuer.
In order to resolve this potentially confusing situation, this change
adds one additional check to the ocsp-responder: after it has
retrieved the ocsp response, it looks up which issuer produced that
ocsp response, confirms that it has that issuer cert loaded in
memory, and confirms that its issuer key hash matches that in the
original request.
There is still one wrinkle if issuer certs A and B are both loaded
in one ocsp-responder, and that one ocsp-responder is handling OCSP
requests to both of their AIA OCSP URLs. In this case, it is possible
that a request to a.ocsp.com, but requesting OCSP for a cert issued
by B, could still have its request answered. This is because the
ocsp-responder itself does not know which URL was requested. But
regardless, this change does guarantee that the response will match
the contents of the request (or no response will be given), no matter
what URL that request was sent to.
Fixes #5182
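A sketch of the added check, assuming the standard OCSP issuer key hash construction (SHA1 over the public key BIT STRING from the SubjectPublicKeyInfo); the function names are illustrative.

```go
package sketch

import (
	"bytes"
	"crypto/sha1"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/asn1"
	"fmt"

	"golang.org/x/crypto/ocsp"
)

// issuerKeyHash computes the SHA1 hash of an issuer certificate's public key
// (the BIT STRING inside the SubjectPublicKeyInfo), which is what OCSP
// requests carry as IssuerKeyHash.
func issuerKeyHash(issuer *x509.Certificate) ([]byte, error) {
	var spki struct {
		Algorithm pkix.AlgorithmIdentifier
		PublicKey asn1.BitString
	}
	if _, err := asn1.Unmarshal(issuer.RawSubjectPublicKeyInfo, &spki); err != nil {
		return nil, err
	}
	h := sha1.Sum(spki.PublicKey.RightAlign())
	return h[:], nil
}

// checkResponseMatchesRequest sketches the new post-lookup check: the issuer
// that produced the stored response must be loaded, and its key hash must
// equal the key hash named in the request.
func checkResponseMatchesRequest(req *ocsp.Request, responseIssuer *x509.Certificate) error {
	if responseIssuer == nil {
		return fmt.Errorf("issuer of stored response is not loaded")
	}
	hash, err := issuerKeyHash(responseIssuer)
	if err != nil {
		return err
	}
	if !bytes.Equal(hash, req.IssuerKeyHash) {
		return fmt.Errorf("stored response was produced by a different issuer than the request names")
	}
	return nil
}
```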
The ocsp-responder takes a path to a certificate file as one of
its config values. It uses this path as one of the inputs when
constructing its DBSource, the object responsible for querying
the database for pregenerated OCSP responses to fulfill requests.
However, this certificate file is not necessary to query the
database; rather, it only acts as a filter: OCSP requests whose
IssuerKeyHash do not match the hash of the loaded certificate are
rejected outright, without querying the DB. In addition, there is
currently only support for a single certificate file in the config.
This change adds support for multiple issuer certificate files in
the config, and refactors the pre-database filtering of bad OCSP
requests into a helper object dedicated solely to that purpose.
Fixes #5119
Simplify database interactions
This change is a result of an audit of all places where
Go code directly constructs SQL queries and executes them
against a dbMap, with the goal of eliminating all instances
of constructing a well-known object type (such as a
core.CertificateStatus) from explicitly-listed database columns.
Instead, we should be relying on helper functions defined in the
sa itself to determine which columns are relevant for the
construction of any given object.
This audit did not find many places where this was occurring. It
did reveal a few simplifications, which are contained in this
change:
1) Greater use of existing SelectFoo methods provided by models.go
2) Streamlining of various SelectSingularFoo methods to always
select by serial string rather than by a user-provided WHERE clause
3) One spot (in ocsp-responder) where using a well-known type seemed
better than using a more minimal custom type
Addresses #4899
In a handful of places I've nuked old stats which are not used in any alerts or dashboards as they either duplicate other stats or don't provide much insight/have never actually been used. If we feel like we need them again in the future it's trivial to add them back.
There aren't many dashboards that rely on old statsd-style metrics, but a few will need to be updated when this change is deployed. There are also a few cases where prometheus labels have been changed from camel to snake case; dashboards that use these will also need to be updated. As far as I can tell no alerts are impacted by this change.
Fixes #4591.
New types and related infrastructure are added to the `db` package to allow
wrapping gorp DbMaps and Transactions.
The wrapped versions return a special `db.ErrDatabaseOp` error type when errors
occur. The new error type includes additional information such as the operation
that failed and the related table.
Where possible we determine the table based on the types of the gorp function
arguments. Where that isn't possible (e.g. with raw SQL queries) we try to use
a simple regexp approach to find the table name. This isn't great for general
SQL but works well enough for Boulder's existing SQL queries.
To get additional confidence that my regexps work for all of Boulder's queries,
I temporarily changed the `db` package's `tableFromQuery` function to panic if
the table couldn't be determined. I re-ran the full unit and integration test
suites with this configuration and saw no panics.
Resolves https://github.com/letsencrypt/boulder/issues/4559
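A simplified illustration of the wrapper error and the regexp approach; the actual type and pattern in the `db` package may differ.

```go
package sketch

import (
	"fmt"
	"regexp"
	"strings"
)

// errDatabaseOp mirrors the shape of a wrapper like db.ErrDatabaseOp: the
// failed operation, the table involved, and the underlying error.
type errDatabaseOp struct {
	Op    string
	Table string
	Err   error
}

func (e errDatabaseOp) Error() string {
	return fmt.Sprintf("failed to %s %q: %s", e.Op, e.Table, e.Err)
}

func (e errDatabaseOp) Unwrap() error { return e.Err }

// tableRegexp is deliberately simple: it only needs to cope with Boulder's
// own queries, not arbitrary SQL.
var tableRegexp = regexp.MustCompile(`(?i)(?:FROM|INTO|UPDATE)\s+` + "`?" + `(\w+)`)

// tableFromQuery guesses the table a raw SQL query touches; it returns ""
// when no table can be determined.
func tableFromQuery(query string) string {
	m := tableRegexp.FindStringSubmatch(strings.TrimSpace(query))
	if m == nil {
		return ""
	}
	return m[1]
}
```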
Integrates the cfssl/ocsp responder code directly into boulder. I've tried to
pare down the existing code to only the bits we actually use and have removed
some generic interfaces in places in favor of directly using our boulder
specific interfaces.
Fixes #4427.
The ocsp-updater ocspStaleMaxAge config var has to be bumped up to ~7 months so that, when it runs after the six-months-ago run, it will actually update the OCSP responses generated during that period and mark the certificate status rows as expired.
Fixes #4338.
Go 1.11+ updated the `sql.DBStats` struct with new fields that are of
interest to us. This PR routes these stats to Prometheus by replacing
the existing autoprom stats code with new first-class Prometheus
metrics. Resolves https://github.com/letsencrypt/boulder/issues/4095
The `max_db_connections` stat from the SA is removed because the Go 1.11+
`sql.DBStats.MaxOpenConnections` field will give us a better view of
the same information.
The autoprom "reused_authz" stat that was being incremented in
`SA.GetPendingAuthorization` was also removed. It wasn't doing what its name
suggests (counting reused authorizations); instead it was counting the number
of times `GetPendingAuthorization` returned an authz.
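A sketch of exporting a few of the Go 1.11+ `sql.DBStats` fields as first-class Prometheus metrics; the metric names are illustrative.

```go
package sketch

import (
	"database/sql"

	"github.com/prometheus/client_golang/prometheus"
)

// registerDBStats exports a handful of sql.DBStats fields as gauges.
func registerDBStats(db *sql.DB, stats prometheus.Registerer) error {
	collectors := []prometheus.Collector{
		prometheus.NewGaugeFunc(prometheus.GaugeOpts{
			Name: "db_max_open_connections",
			Help: "Maximum number of open connections to the database.",
		}, func() float64 { return float64(db.Stats().MaxOpenConnections) }),
		prometheus.NewGaugeFunc(prometheus.GaugeOpts{
			Name: "db_wait_count",
			Help: "Total number of connections waited for.",
		}, func() float64 { return float64(db.Stats().WaitCount) }),
		prometheus.NewGaugeFunc(prometheus.GaugeOpts{
			Name: "db_wait_duration_seconds",
			Help: "Total time blocked waiting for a new connection.",
		}, func() float64 { return db.Stats().WaitDuration.Seconds() }),
	}
	for _, c := range collectors {
		if err := stats.Register(c); err != nil {
			return err
		}
	}
	return nil
}
```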
Right now if ocsp-responder gets flooded with traffic, it will have a number of requests that
spend long enough waiting for an available connection that the reverse proxy will have given
up on them before they get a chance to execute the SQL query. Add a timeout parameter so
ocsp-responder can gracefully shed this load rather than try to do pointless work.
Fixes #3836.
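A sketch of the timeout behavior using a `context` deadline around the lookup; the query, column, and function names are assumptions, not the actual ocsp-responder code.

```go
package sketch

import (
	"context"
	"database/sql"
	"time"
)

// responseForSerial sheds load: if the query (including time spent waiting
// for a free connection) exceeds the configured timeout, we give up instead
// of doing work the reverse proxy has already abandoned.
func responseForSerial(db *sql.DB, serial string, timeout time.Duration) ([]byte, error) {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	var response []byte
	err := db.QueryRowContext(ctx,
		"SELECT ocspResponse FROM certificateStatus WHERE serial = ?", serial,
	).Scan(&response)
	if err != nil {
		return nil, err // includes context.DeadlineExceeded when we timed out
	}
	return response, nil
}
```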
```
$ ./test.sh
ok github.com/cloudflare/cfssl/api 1.023s coverage: 81.1% of statements
ok github.com/cloudflare/cfssl/api/bundle 1.464s coverage: 87.2% of statements
ok github.com/cloudflare/cfssl/api/certadd 16.766s coverage: 86.8% of statements
ok github.com/cloudflare/cfssl/api/client 1.062s coverage: 51.9% of statements
ok github.com/cloudflare/cfssl/api/crl 1.075s coverage: 75.0% of statements
ok github.com/cloudflare/cfssl/api/gencrl 1.038s coverage: 72.5% of statements
ok github.com/cloudflare/cfssl/api/generator 1.478s coverage: 33.3% of statements
ok github.com/cloudflare/cfssl/api/info 1.085s coverage: 84.1% of statements
ok github.com/cloudflare/cfssl/api/initca 1.050s coverage: 90.5% of statements
ok github.com/cloudflare/cfssl/api/ocsp 1.114s coverage: 93.8% of statements
ok github.com/cloudflare/cfssl/api/revoke 3.063s coverage: 75.0% of statements
ok github.com/cloudflare/cfssl/api/scan 2.988s coverage: 62.1% of statements
ok github.com/cloudflare/cfssl/api/sign 2.680s coverage: 83.3% of statements
ok github.com/cloudflare/cfssl/api/signhandler 1.114s coverage: 26.3% of statements
ok github.com/cloudflare/cfssl/auth 1.010s coverage: 68.2% of statements
ok github.com/cloudflare/cfssl/bundler 22.078s coverage: 84.5% of statements
ok github.com/cloudflare/cfssl/certdb/dbconf 1.013s coverage: 84.2% of statements
ok github.com/cloudflare/cfssl/certdb/ocspstapling 1.302s coverage: 69.2% of statements
ok github.com/cloudflare/cfssl/certdb/sql 1.223s coverage: 70.5% of statements
ok github.com/cloudflare/cfssl/cli 1.014s coverage: 62.5% of statements
ok github.com/cloudflare/cfssl/cli/bundle 1.011s coverage: 0.0% of statements [no tests to run]
ok github.com/cloudflare/cfssl/cli/crl 1.086s coverage: 57.8% of statements
ok github.com/cloudflare/cfssl/cli/gencert 7.927s coverage: 83.6% of statements
ok github.com/cloudflare/cfssl/cli/gencrl 1.064s coverage: 73.3% of statements
ok github.com/cloudflare/cfssl/cli/gencsr 1.058s coverage: 70.3% of statements
ok github.com/cloudflare/cfssl/cli/genkey 2.718s coverage: 70.0% of statements
ok github.com/cloudflare/cfssl/cli/ocsprefresh 1.077s coverage: 64.3% of statements
ok github.com/cloudflare/cfssl/cli/revoke 1.033s coverage: 88.2% of statements
ok github.com/cloudflare/cfssl/cli/scan 1.014s coverage: 36.0% of statements
ok github.com/cloudflare/cfssl/cli/selfsign 2.342s coverage: 73.2% of statements
ok github.com/cloudflare/cfssl/cli/serve 1.076s coverage: 38.2% of statements
ok github.com/cloudflare/cfssl/cli/sign 1.070s coverage: 54.8% of statements
ok github.com/cloudflare/cfssl/cli/version 1.011s coverage: 100.0% of statements
ok github.com/cloudflare/cfssl/cmd/cfssl 1.028s coverage: 0.0% of statements [no tests to run]
ok github.com/cloudflare/cfssl/cmd/cfssljson 1.012s coverage: 3.4% of statements
ok github.com/cloudflare/cfssl/cmd/mkbundle 1.011s coverage: 0.0% of statements [no tests to run]
ok github.com/cloudflare/cfssl/config 1.023s coverage: 67.7% of statements
ok github.com/cloudflare/cfssl/crl 1.054s coverage: 68.3% of statements
ok github.com/cloudflare/cfssl/csr 8.473s coverage: 89.6% of statements
ok github.com/cloudflare/cfssl/errors 1.014s coverage: 79.6% of statements
ok github.com/cloudflare/cfssl/helpers 1.216s coverage: 80.6% of statements
ok github.com/cloudflare/cfssl/helpers/derhelpers 1.017s coverage: 48.0% of statements
ok github.com/cloudflare/cfssl/helpers/testsuite 7.826s coverage: 65.8% of statements
ok github.com/cloudflare/cfssl/initca 151.314s coverage: 73.2% of statements
ok github.com/cloudflare/cfssl/log 1.013s coverage: 59.3% of statements
ok github.com/cloudflare/cfssl/multiroot/config 1.258s coverage: 77.4% of statements
ok github.com/cloudflare/cfssl/ocsp 1.353s coverage: 75.1% of statements
ok github.com/cloudflare/cfssl/revoke 1.149s coverage: 75.0% of statements
ok github.com/cloudflare/cfssl/scan 1.023s coverage: 1.1% of statements
skipped github.com/cloudflare/cfssl/scan/crypto/md5
skipped github.com/cloudflare/cfssl/scan/crypto/rsa
skipped github.com/cloudflare/cfssl/scan/crypto/sha1
skipped github.com/cloudflare/cfssl/scan/crypto/sha256
skipped github.com/cloudflare/cfssl/scan/crypto/sha512
skipped github.com/cloudflare/cfssl/scan/crypto/tls
ok github.com/cloudflare/cfssl/selfsign 1.098s coverage: 70.0% of statements
ok github.com/cloudflare/cfssl/signer 1.020s coverage: 19.4% of statements
ok github.com/cloudflare/cfssl/signer/local 4.886s coverage: 77.9% of statements
ok github.com/cloudflare/cfssl/signer/remote 2.500s coverage: 70.0% of statements
ok github.com/cloudflare/cfssl/signer/universal 2.228s coverage: 67.7% of statements
ok github.com/cloudflare/cfssl/transport 1.012s
ok github.com/cloudflare/cfssl/transport/ca/localca 1.046s coverage: 94.9% of statements
ok github.com/cloudflare/cfssl/transport/kp 1.050s coverage: 37.1% of statements
ok github.com/cloudflare/cfssl/ubiquity 1.037s coverage: 88.3% of statements
ok github.com/cloudflare/cfssl/whitelist 3.519s coverage: 100.0% of statements
...
$ go test ./... (master✱)
ok golang.org/x/crypto/acme 2.782s
ok golang.org/x/crypto/acme/autocert 2.963s
? golang.org/x/crypto/acme/autocert/internal/acmetest [no test files]
ok golang.org/x/crypto/argon2 0.047s
ok golang.org/x/crypto/bcrypt 4.694s
ok golang.org/x/crypto/blake2b 0.056s
ok golang.org/x/crypto/blake2s 0.050s
ok golang.org/x/crypto/blowfish 0.015s
ok golang.org/x/crypto/bn256 0.460s
ok golang.org/x/crypto/cast5 4.204s
ok golang.org/x/crypto/chacha20poly1305 0.560s
ok golang.org/x/crypto/cryptobyte 0.014s
? golang.org/x/crypto/cryptobyte/asn1 [no test files]
ok golang.org/x/crypto/curve25519 0.025s
ok golang.org/x/crypto/ed25519 0.073s
? golang.org/x/crypto/ed25519/internal/edwards25519 [no test files]
ok golang.org/x/crypto/hkdf 0.012s
ok golang.org/x/crypto/internal/chacha20 0.047s
ok golang.org/x/crypto/internal/subtle 0.011s
ok golang.org/x/crypto/md4 0.013s
ok golang.org/x/crypto/nacl/auth 9.226s
ok golang.org/x/crypto/nacl/box 0.016s
ok golang.org/x/crypto/nacl/secretbox 0.012s
ok golang.org/x/crypto/nacl/sign 0.012s
ok golang.org/x/crypto/ocsp 0.047s
ok golang.org/x/crypto/openpgp 8.872s
ok golang.org/x/crypto/openpgp/armor 0.012s
ok golang.org/x/crypto/openpgp/clearsign 16.984s
ok golang.org/x/crypto/openpgp/elgamal 0.013s
? golang.org/x/crypto/openpgp/errors [no test files]
ok golang.org/x/crypto/openpgp/packet 0.159s
ok golang.org/x/crypto/openpgp/s2k 7.597s
ok golang.org/x/crypto/otr 0.612s
ok golang.org/x/crypto/pbkdf2 0.045s
ok golang.org/x/crypto/pkcs12 0.073s
ok golang.org/x/crypto/pkcs12/internal/rc2 0.013s
ok golang.org/x/crypto/poly1305 0.016s
ok golang.org/x/crypto/ripemd160 0.034s
ok golang.org/x/crypto/salsa20 0.013s
ok golang.org/x/crypto/salsa20/salsa 0.013s
ok golang.org/x/crypto/scrypt 0.942s
ok golang.org/x/crypto/sha3 0.140s
ok golang.org/x/crypto/ssh 0.939s
ok golang.org/x/crypto/ssh/agent 0.529s
ok golang.org/x/crypto/ssh/knownhosts 0.027s
ok golang.org/x/crypto/ssh/terminal 0.016s
ok golang.org/x/crypto/tea 0.010s
ok golang.org/x/crypto/twofish 0.019s
ok golang.org/x/crypto/xtea 0.012s
ok golang.org/x/crypto/xts 0.016s
```
A match of an OCSP request's serial number against *any* of the configured `reqSerialPrefixes` entries is now sufficient for the request to be valid, rather than only a match against the last `reqSerialPrefixes` entry.
Resolves https://github.com/letsencrypt/boulder/issues/3829
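A sketch of the corrected prefix check; the empty-list behavior shown is an assumption for illustration.

```go
package sketch

import "strings"

// serialPrefixAllowed returns true if the serial matches any configured
// prefix, rather than only the last one checked.
func serialPrefixAllowed(reqSerialPrefixes []string, serial string) bool {
	if len(reqSerialPrefixes) == 0 {
		return true // assumption: no prefixes configured means no filtering
	}
	for _, prefix := range reqSerialPrefixes {
		if strings.HasPrefix(serial, prefix) {
			return true
		}
	}
	return false
}
```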
A very large number of the logger calls are of the form log.Function(fmt.Sprintf(...)).
Rather than sprinkling fmt.Sprintf at every logger call site, provide formatting versions
of the logger functions and call these directly with the format and arguments.
While here remove some unnecessary trailing newlines and calls to String/Error.
Previously, all errors were treated as Not Found, but we actually want
to treat database errors differently; for instance, by not caching them,
and by setting tighter alerting thresholds for them.
Fixes #3419.
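A sketch of the distinction, treating `sql.ErrNoRows` as "not found" and everything else as a database error; the query shown is illustrative.

```go
package sketch

import (
	"database/sql"
	"errors"
)

// lookupResponse distinguishes the two cases: a missing row is a "not found"
// (cacheable) condition, while any other database error is surfaced so it can
// be alerted on separately and not cached.
func lookupResponse(db *sql.DB, serial string) ([]byte, bool, error) {
	var response []byte
	err := db.QueryRow(
		"SELECT ocspResponse FROM certificateStatus WHERE serial = ?", serial,
	).Scan(&response)
	switch {
	case errors.Is(err, sql.ErrNoRows):
		return nil, false, nil // not found
	case err != nil:
		return nil, false, err // real database error
	}
	return response, true, nil
}
```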
There were two bugs in #3167:
- All process-level stats got prefixed with "boulder", which broke dashboards.
- All request_time stats got dropped, because measured_http was using the prometheus DefaultRegisterer.
To fix, this PR plumbs through a scope object to measured_http, and uses an empty prefix when calling NewProcessCollector().
Previously, we used prometheus.DefaultRegisterer to register our stats, which uses global state to export its HTTP stats. We also used net/http/pprof's behavior of registering to the default global HTTP ServeMux, via DebugServer, which starts an HTTP server that uses that global ServeMux.
In this change, I merge DebugServer's functions into StatsAndLogging. StatsAndLogging now takes an address parameter and fires off an HTTP server in a goroutine. That HTTP server is newly defined, and doesn't use DefaultServeMux. On it is registered the Prometheus stats handler, and handlers for the various pprof traces. In the process I split StatsAndLogging internally into two functions: makeStats and MakeLogger. I didn't port across the expvar variable exporting, which serves a similar function to Prometheus stats but which we never use.
One nice immediate effect of this change: Since StatsAndLogging now requires an address, I noticed a bunch of commands that called StatsAndLogging, and passed around the resulting Scope, but never made use of it because they didn't run a DebugServer. Under the old StatsD world, these commands still could have exported their stats by pushing, but since we moved to Prometheus their stats stopped being collected. We haven't used any of these stats, so instead of adding debug ports to all short-lived commands, or setting up a push gateway, I simply removed them and switched those commands to initialize only a Logger, no stats.
Fixes #3020.
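A sketch of the kind of non-default mux this describes, registering the Prometheus scrape handler and the pprof handlers; the function name is illustrative.

```go
package sketch

import (
	"net/http"
	"net/http/pprof"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// newDebugMux builds a dedicated ServeMux (not http.DefaultServeMux) carrying
// the Prometheus scrape handler and the pprof handlers, roughly the shape of
// handler the merged StatsAndLogging serves in a goroutine.
func newDebugMux(reg *prometheus.Registry) *http.ServeMux {
	mux := http.NewServeMux()
	mux.Handle("/metrics", promhttp.HandlerFor(reg, promhttp.HandlerOpts{}))
	mux.HandleFunc("/debug/pprof/", pprof.Index)
	mux.HandleFunc("/debug/pprof/profile", pprof.Profile)
	mux.HandleFunc("/debug/pprof/symbol", pprof.Symbol)
	mux.HandleFunc("/debug/pprof/trace", pprof.Trace)
	return mux
}

// Typically served as: go http.ListenAndServe(debugAddr, newDebugMux(reg))
```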
In order to write integration tests for some features, especially related to rate limiting, rechecking of CAA, and expiration of authzs, orders, and certs, we need to be able to fake the passage of time in integration tests.
To do so, this change switches out all clock.Default() instances for cmd.Clock(), which can be set manually with the FAKECLOCK environment variable. integration-test.py now starts up all servers once before the main body of tests, with FAKECLOCK set to a date 70 days ago, and does some initial setup for a new integration test case. That test case tries to fetch a 70-day-old authz URL, and expects it to 404.
In order to make this work, I also had to change a number of our test binaries to shut down cleanly in response to SIGTERM. Without that change, stopping the servers between the setup phase and the main tests caused startservers.check() to fail, because some processes exited with nonzero status.
Note: This is an initial stab at things, to prove out the technique. Long-term, I think we will want to use an idiom where test cases are classes that have a number of optional setup phases that may be run at e.g. 70 days prior and 5 days prior. This could help us avoid a proliferation of global state as we add more time-dependent test cases.
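A sketch of the FAKECLOCK idea using the jmhodges/clock package Boulder already depends on; the date layout and function name here are assumptions, not the exact cmd.Clock() implementation.

```go
package sketch

import (
	"os"
	"time"

	"github.com/jmhodges/clock"
)

// newClock returns a real clock normally, but when FAKECLOCK is set (as
// integration-test.py does), returns a fake clock pinned to that time so
// tests can simulate the passage of time.
func newClock() (clock.Clock, error) {
	if fake := os.Getenv("FAKECLOCK"); fake != "" {
		t, err := time.Parse(time.UnixDate, fake) // layout is an assumption
		if err != nil {
			return nil, err
		}
		fc := clock.NewFake()
		fc.Set(t)
		return fc, nil
	}
	return clock.New(), nil
}
```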
This used to be used for AMQP queue names. Now that AMQP is gone, these consts
were only used when printing a version string at startup. This changes
VersionString to just use the name of the current program, and removes
`const clientName = ` from many of our main.go's.
This removes the config and code to output to statsd.
- Change `cmd.StatsAndLogging` to output a `Scope`, not a `Statter`.
- Remove the prefixing of component name (e.g. "VA") in front of stats; this was stripped by `autoProm` but now no longer needs to be.
- Delete vendored statsd client.
- Delete `MockStatter` (generated by gomock) and `mocks.Statter` (hand generated) in favor of mocking `metrics.Scope`, which is the interface we now use everywhere.
- Remove a few unused methods on `metrics.Scope`, and update its generated mock.
- Refactor `autoProm` and add `autoRegisterer`, which can be included in a `metrics.Scope`, avoiding global state. `autoProm` now registers everything with the `prometheus.Registerer` it is given.
- Change va_test.go's `setup()` to not return a stats object; instead the individual tests that care about stats override `va.stats` directly.
Fixes #2639, #2733.
Uses a special mux for the OCSP Responder so that we stop collapsing double slashes in GET requests, which caused a small number of requests to be considered malformed.
Rename HTTPMonitor to MeasuredHandler.
Remove inflight stat (we didn't use it).
Add timing stat by method, endpoint, and status code.
The timing stat subsumes the "rate" stat, so remove that.
WFE now wraps in MeasuredHandler, instead of relying on its cmd/main.go.
Remove FBAdapter stats.
MeasuredHandler tracks stats by method, status code, and endpoint.
In VA, add a Prometheus histogram for validation timing.
If you are the first person to add a feature to a Boulder command it's very
easy to forget to update the command's config structure to accommodate a
`map[string]bool` entry and to pass it to `features.Set` in `main()`. See
https://github.com/letsencrypt/boulder/issues/2533 for one example. I've
fallen into this trap myself a few times so I'm going to try and save myself
some future grief by fixing it across the board once and for all!
This PR adds a `Features` config entry and a corresponding `features.Set` to:
* ocsp-updater (resolves #2533)
* admin-revoker
* boulder-publisher
* contact-exporter
* expiration-mailer
* expired-authz-purger
* notify-mailer
* ocsp-responder
* orphan-finder
These components were skipped because they already had features supported:
* boulder-ca
* boulder-ra
* boulder-sa
* boulder-va
* boulder-wfe
* cert-checker
I deliberately skipped adding Feature support to:
* single-ocsp (Its only configuration comes from the pkcs11key library and
doesn't support features)
* rabbitmq-setup (No configuration/features and we'll likely soon be rming this
since the gRPC migration)
* notafter-backfill (This is a one-off that will be deleted soon)
Implements a less RPC-focused signal catch/shutdown method. Certain things that could probably also use this (e.g. `ocsp-updater`) haven't been given it, as they would require rather substantial changes to allow for a graceful shutdown approach.
Fixes #2298.
Updates #1699.
Adds a new package, `features`, which exposes methods to set and check if various internal features are enabled. The implementation uses global state to store the features so that services embedded in another service do not each require their own features map in order to check if something is enabled.
Requires a `boulder-tools` image update to include `golang.org/x/tools/cmd/stringer`.
Output base64-encoded DER, as expected by ocsp-responder.
Use flags instead of template for Status, ThisUpdate, NextUpdate.
Provide better help.
Remove old test (wasn't run automatically).
Add it to integration test, and use its output for integration test of issuer ocsp-responder.
Add another slot to boulder-tools HSM image, to store root key.
Right now we use the Source field for both DB and file URLs. However, we want to move to the DBConnect config field, so that we can take advantage of the code that reads DSNs from a file on disk. It turns out the existing code didn't work if you configured a dbConnect string, because it would error out with:
"source" parameter not found in JSON config
After rearranging, both methods should work.
* rename, change params, restructure
* I'm wondering how I managed that one
* use a metrics.Scope
* move method to SA, update callers
* rerun goimports
* fix compile error
* revert cmd/shell.go
https://github.com/letsencrypt/boulder/pull/1805
- Remove error signatures from log methods. This means fewer places where errcheck will show ignored errors.
- Pull in latest cfssl to be compatible with errorless log messages.
- Reduce the number of message priorities we support to just those we actually use.
- AuditNotice -> AuditInfo
- Remove InfoObject (only one use, switched to Info)
- Remove EmergencyExit and related functions in favor of panic
- Remove SyslogWriter / AuditLogger separate types in favor of a single interface, Logger, that has all the logging methods on it.
- Merge mock log into logger. This allows us to unexport the internals but still override them in the mock.
- Shorten names to be compatible with Go style: New, Set, Get, Logger, NewMock, etc.
- Use a shorter log format for stdout logs.
- Remove "... Starting" log messages. We have better information in the "Versions" message logged at startup.
Motivation: The AuditLogger / SyslogWriter distinction was confusing and exposed internals only necessary for tests. Some components accepted one type and some accepted the other. This made it hard to consistently use mock loggers in tests. Also, the unnecessarily fat interface for AuditLogger made it hard to meaningfully mock out.
It's now the default in all cases where it was previously configurable. When we want to
suppress SQL debug messages, we can simply adjust the logging level to suppress
debug messages in general.
Also, pass a logger to SetSQLDebug rather than calling GetAuditLogger.
Some stats services, we believe, report the ocsp-responder as down
because `/` currently returns 400 Bad Request.
Shuffle some code into a new `mux` function to make it easier to test.
Consolidate initialization of stats and logging from each main.go into cmd
package.
Define a new config parameter, `StdoutLevel`, that determines the maximum log
level that will be printed to stdout. It can be set to 6 to inhibit debug
messages, or 0 to print only emergency messages, or -1 to print no messages at
all.
Remove the existing config parameter `Tag`. Instead, choose the tag from the
basename of the currently running process. Previously all Boulder log messages
had the tag "boulder", but now they will be differentiated by process, like
"boulder-wfe".
Shorten the date format used in stdout logging, and add the current binary's
basename.
Consolidate setup function in audit-logger_test.go.
Note: Most CLI binaries now get their stats and logging from the parameters of
Action. However, a few of our binaries don't use our custom AppShell, and
instead use codegangsta/cli directly. For those binaries, we export the new
StatsAndLogging method from cmd.
Fixes https://github.com/letsencrypt/boulder/issues/852
OCSP-Responder attempts to read the OCSP response from the certificateStatus table.
If it cannot find a response there, it reads the ocspResponses table to try to find
one. If neither table contains a response, the not-found bool is passed back to the
Responder.
* Moves revocation from the CA to the OCSP-Updater: the RA will mark certificates as
revoked, then wait for the OCSP-Updater to create a new (final) revoked response
* Merges the ocspResponses table into the certificateStatus table and only uses UPDATEs
to update the OCSP response (vs. INSERT-only, since this happens quite often and would
lead to an extremely large table)
We had a staging deploy go bad because of missing error handling for the case
where the "source" config is not in the JSON. While we debugged, I wrote
some tests.
Fixes #936.
This means after parsing the config file, setting up stats, and dialing the
syslogger. But it is still before trying to initialize the given server. This
means that we are more likely to get version numbers logged for some common
runtime failures.
If two OCSP responses were generated in the same second, the earlier one would
sometimes take priority, leading to a "good" response for revoked
certificates and causing the OCSP integration test to be flaky.