Commit Graph

217 Commits

Author SHA1 Message Date
Jacob Hoffman-Andrews 507c0d1ab3
expiration-mailer: add stats (#6091) 2022-05-09 16:41:10 -07:00
Aaron Gable 7ef6913e71
Revert "Allow expiration mailer to work in parallel" (#6080)
When deployed, the newly-parallel expiration-mailer encountered
unexpected difficulties and dropped to apparently sending nearly zero
emails despite not throwing any real errors. Reverting the parallelism
change until we understand and can fix the root cause.

This reverts two commits:
- Allow expiration mailer to work in parallel (#6057)
- Fix data race in expiration-mailer test mocks (#6072) 

It also modifies the revert to leave the new `ParallelSends` config key
in place (albeit completely ignored), so that the binary containing this
revert can be safely deployed regardless of config status.

Part of #5682
2022-05-03 13:18:40 -07:00
Jacob Hoffman-Andrews 9629c88d66
Allow expiration mailer to work in parallel (#6057)
Previously, each accounts email would be sent in serial,
along with several reads from the database (to check for
certificate renewal) and several writes to the database (to update
`certificateStatus.lastExpirationNagSent`). This adds a config field
for the expiration mailer that sets the parallelism it will use.

That means making and using multiple SMTP connections as well. Previously,
`bmail.Mailer` was not safe for concurrent use. It also had a piece of
API awkwardness: after you created a Mailer, you had to call Connect on
it to change its state.

Instead of treating that as a state change on Mailer, I split out a
separate component: `bmail.Conn`. Now, when you call `Mailer.Connect()`,
you get a Conn. You can send mail on that Conn and Close it when you're
done. A single Mailer instance can produce multiple Conns, so Mailer is
now concurrency-safe (while Conn is not).

This involved a moderate amount of renaming and code movement, and
GitHub's move detector is not keeping up 100%, so an eye towards "is
this moved code?" may help. Also adding `?w=1` to the diff URL to ignore
whitespace diffs.
2022-04-21 18:04:55 -07:00
Jacob Hoffman-Andrews fe6fab8821
Remove fqdnsets_old workaround (#6054)
Fixes #5670
2022-04-21 16:39:35 -07:00
Andrew Gabbitas 79048cffba
Support writing initial OCSP response to redis (#5958)
Adds a rocsp redis client to the sa if cluster information is provided in the
sa config. If a redis cluster is configured, all new certificate OCSP
responses added with sa.AddPrecertificate will attempt to be written to
the redis cluster, but will not block or fail on errors.

Fixes: #5871
2022-03-21 20:33:12 -06:00
Jacob Hoffman-Andrews d4336e5f4c
Make expiration-mailer clean exit on SIGTERM (#5998)
Plumb a context through everything, and cancel that context when we
catch a shutdown signal.

Fixes #5953
2022-03-16 10:58:18 -07:00
Aaron Gable 305ef9cce9
Improve error checking paradigm (#5920)
We have decided that we don't like the if err := call(); err != nil
syntax, because it creates confusing scopes, but we have not cleaned up
all existing instances of that syntax. However, we have now found a
case where that syntax enables a bug: It caused readers to believe that
a later err = call() statement was assigning to an already-declared err
in the local scope, when in fact it was assigning to an
already-declared err in the parent scope of a closure. This caused our
ineffassign and staticcheck linters to be unable to analyze the
lifetime of the err variable, and so they did not complain when we
never checked the actual value of that error.

This change standardizes on the two-line error checking syntax
everywhere, so that we can more easily ensure that our linters are
correctly analyzing all error assignments.
2022-02-01 14:42:43 -07:00
Samantha f69b57e0e1
Make DB client initialization uniform and stop setting 'READ-UNCOMMITTED' (#5741)
Boulder components initialize their gorp and gorp-less (non-wrapped) database
clients via two new SA helpers. These helpers handle client construction,
database metric initialization, and (for gorp only) debug logging setup.

Removes transaction isolation parameter `'READ-UNCOMMITTED'` from all database
connections.

Fixes #5715 
Fixes #5889
2022-01-31 13:34:23 -08:00
Jacob Hoffman-Andrews 3bf06bb4d8
Export the config structs from our main files (#5875)
This allows our documentation on those structs to show up in our godoc
output.
2022-01-12 12:20:27 -08:00
Phil Porada a30065edeb
Configure buckets for the expiration-mailer processing_latency metric (#5797) 2021-11-16 17:36:21 -08:00
Aaron Gable bbe53e92d0
Revert "Expiry mailer: fetch certificates in bulk" (#5780)
This reverts commit e3ce816425,
which was reviewed in https://github.com/letsencrypt/boulder/pull/5607.

This change caused database queries to exceed the maximum packet size
and fail. Because this was an opportunistic optimization, reverting it
is the safest course moving forward.
2021-11-05 13:26:06 -07:00
Aaron Gable e3ce816425
Expiry mailer: fetch certificates in bulk (#5607)
Use `sa.SelectCertificates` instead of `sa.SelectCertificate` to
fetch the entire batch of certificates all at once, instead of doing
up to 10k individual certificate selections in serial.
2021-10-28 09:33:20 -07:00
Jacob Hoffman-Andrews 23dd1e21f9
Build all boulder binaries into a single binary (#5693)
The resulting `boulder` binary can be invoked by different names to
trigger the behavior of the relevant subcommand. For instance, symlinking
and invoking as `boulder-ca` acts as the CA. Symlinking and invoking as
`boulder-va` acts as the VA.

This reduces the .deb file size from about 200MB to about 20MB.

This works by creating a registry that maps subcommand names to `main`
functions. Each subcommand registers itself in an `init()` function. The
monolithic `boulder` binary then checks what name it was invoked with
(`os.Args[0]`), looks it up in the registry, and invokes the appropriate
`main`. To avoid conflicts, all of the old `package main` are replaced
with `package notmain`.

To get the list of registered subcommands, run `boulder --list`. This
is used when symlinking all the variants into place, to ensure the set
of symlinked names matches the entries in the registry.

Fixes #5692
2021-10-20 17:05:45 -07:00
Samantha 99502b1ffb
oscp-updater: use rows.Scan() to get query results (#5656)
- Replace `gorp.DbMap` with calls that use `sql.DB` directly
- Use `rows.Scan()` and `rows.Next()` to get query results (which opens the door to streaming the results)
- Export function `CertStatusMetadataFields` from `SA`
- Add new function `ScanCertStatusRow` to `SA`
- Add new function `NewDbSettingsFromDBConfig` to `SA`

Fixes #5642
Part Of #5715
2021-10-18 10:33:09 -07:00
Jacob Hoffman-Andrews 1677aeaf95
expiration-mailer: log attempted and failed sends (#5702)
This will help us do better analysis of how many emails we're sending
and figure out any failure patterns.
2021-10-14 16:46:01 -07:00
Aaron Gable bab688b98f
Remove sa-wrappers.go (#5663)
Remove the last of the gRPC wrapper files. In order to do so:

- Remove the `core.StorageGetter` interface. Replace it with a new
  interface (whose methods include the `...grpc.CallOption` arg)
  inside the `sa/proto/` package.
- Remove the `core.StorageAdder` interface. There's no real use-case
  for having a write-only interface.
- Remove the `core.StorageAuthority` interface, as it is now redundant
  with the autogenerated `sapb.StorageAuthorityClient` interface.
- Replace the `certificateStorage` interface (which appears in two
  different places) with a single unified interface also in `sa/proto/`.
- Update all test mocks to include the `_ ...grpc.CallOption` arg in
  their method signatures so they match the gRPC client interface.
- Delete many methods from mocks which are no longer necessary (mostly
  because they're mocking old authz1 methods that no longer exist).
- Move the two `test/inmem/` wrappers into their own sub-packages to
  avoid an import cycle.
- Simplify the `satest` package to satisfy one of its TODOs and to
  avoid an import cycle.
- Add many methods to the `test/inmem/sa/` wrapper, to accommodate all
  of the methods which are called in unittests.

Fixes #5600
2021-09-27 13:25:41 -07:00
Aaron Gable 6c85ae0f2c
expiration-mailer: improve search for renewals (#5673)
Overhaul how expiration-mailer checks to see if a `certIsRenewed`.
First, change the helper function to take the list of names (which
can be hashed into an fqdnSet) and the issuance date. This allows the
search for renewals to be a much simpler linear scan rather than an
ugly outer left join. Second, update the query to examine both the
`fqdnSets` and `fqdnSets_old` tables, to account for the fact that
this code cares about more time (~90d) than the `fqdnSets` table
currently holds.

Also export the SA's `HashNames` method so it can be used by the mailer,
and update the mailer's tests to use correct name hashes instead of
fake human-readable hashes.

Fixes #5672
2021-09-24 13:36:57 -07:00
Aaron Gable b3192bdead
Switch nag group to count serials, not certs (#5605)
If expiry mailer's SELECT statement is regularly returning exactly
as many rows as it is configured to limit at, that means that we are
consistently missing some certs that we should probably be sending
emails about. Change the way this metric is computed to be based
on the number of rows returned directly by the SELECT, rather than
based on the filtered set of rows that ended up corresponding to
final certificates.

This bug was almost certainly causing us to not detect nag groups
being at capacity, given that the expiry-mailer logs consistently say
that nag groups have nearly-but-not-quite as many certs in them as
the configured maximum. This bug was introduced in #4420, when
expiration-mailer gained the ability to silently continue past certain
classes of errors retrieving certificate details.
2021-08-26 12:29:11 -07:00
Aaron Gable b7ce627572
Remove SA Registration gRPC wrappers (#5551)
Remove all error checking and type transformation from the gRPC wrappers
for the following methods on the SA:
- GetRegistration
- GetRegistrationByKey
- NewRegistration
- UpdateRegistration
- DeactivateRegistration

Update callers of these methods to construct the appropriate protobuf
request messages directly, and to consume the protobuf response messages
directly. In many cases, this requires changing the way that clients
handle the `Jwk` field (from expecting a `JSONWebKey` to expecting a
slice of bytes) and the `Contacts` field (from expecting a possibly-nil
pointer to relying on the value of the `ContactsPresent` boolean field).

Implement two new methods in `sa/model.go` to convert directly between
database models and protobuf messages, rather than round-tripping
through `core` objects in between. Delete the older methods that
converted between database models and `core` objects, as they are no
longer necessary.

Update test mocks to have the correct signatures, and update tests to
not rely on `JSONWebKey` and instead use byte slices.

Fixes #5531
2021-08-04 13:33:41 -07:00
J.C. Jones 7b31bdb30a
Add read-only dbConns to SQLStorageAuthority and OCSPUpdater (#5555)
This changeset adds a second DB connect string for the SA for use in 
read-only queries that are not themselves dependencies for read-write 
queries. In other words, this is attempting to only catch things like 
rate-limit `SELECT`s and other coarse-counting, so we can potentially 
move those read queries off the read-write primary database.

It also adds a second DB connect string to the OCSP Updater. This is a 
little trickier, as the subsequent `UPDATE`s _are_ dependent on the 
output of the `SELECT`, but in this case it's operating on data batches,
and a few seconds' replication latency are several orders of magnitude 
below the threshold for update frequency, so any certificates that 
aren't caught on run `n` can be caught on run `n+1`.

Since we export DB metrics to Prometheus, this also refactors 
`InitDBMetrics` to take a DB Address (host:port tuple) and User out of 
the DB connection DSN and include those as labels in the metrics.

Fixes #5550
Fixes #4985
2021-08-02 11:21:34 -07:00
Aaron Gable 9abb39d4d6
Honeycomb integration proof-of-concept (#5408)
Add Honeycomb tracing to all Boulder components which act as
HTTP servers, gRPC servers, or gRPC clients. Add many values
which we currently emit to logs to the trace spans. Add a way to
configure the Honeycomb integration to our config files, and by
default configure all of our tests to "mute" (send nothing).

Followup changes will refine the configuration, attempt to reduce
the new dependency load, and introduce better sampling.

Part of https://github.com/letsencrypt/dev-misc-tickets/issues/218
2021-05-24 16:13:08 -07:00
Samantha 5a92926b0c
Remove dbconfig migration deployability code (#5348)
Default boulder code paths to exclusively use the `db` config key

Fixes #5338
2021-03-18 16:41:15 -07:00
Samantha e2e7dad034
Move cmd.DBConfig fields to their own named sub-struct (#5286)
Named field `DB`, in a each component configuration struct, acts as the
receiver for the value of `db` when component JSON files are
unmarshalled.

When `cmd.DBConfig` fields are received at the root of component
configuration struct instead of `DB` copy them to the `DB` field of the
component configuration struct.

Move existing `cmd.DBConfig` values from the root of each component's
JSON configuration in `test/config-next` to `db`

Part of #5275
2021-02-16 10:48:58 -08:00
Samantha 7cb0038498
Deprecate MaxDBConns for MaxOpenConns (#5274)
In #5235 we replaced MaxDBConns in favor of MaxOpenConns.

One week ago MaxDBConns was removed from all dev, staging, and
production configurations. This change completes the removal of
MaxDBConns from all components and test/config.

Fixes #5249
2021-02-08 12:00:01 -08:00
Samantha e0510056cc
Enhancements to SQL driver tuning via JSON config (#5235)
Historically the only database/sql driver setting exposed via JSON
config was maxDBConns. This change adds support for maxIdleConns,
connMaxLifetime, connMaxIdleTime, and renames maxDBConns to
maxOpenConns. The addition of these settings will give our SRE team a
convenient method for tuning the reuse/closure of database connections.

A new struct, DBSettings, has been added to SA. The struct, and each of
it's fields has been commented.

All new fields have been plumbed through to the relevant Boulder
components and exported as Prometheus metrics. Tests have been
added/modified to ensure that the fields are being set. There should be
no loss in coverage

Deployability concerns for the migration from maxDBConns to maxOpenConns
have been addressed with the temporary addition of the helper method
cmd.DBConfig.GetMaxOpenConns(). This method can be removed once
test/config is defaulted to using maxOpenConns. Relevant sections of the
code have TODOs added that link back to an newly opened issue.

Fixes #5199
2021-01-25 15:34:55 -08:00
Aaron Gable 6f0016262f
Simplify database interactions (#4949)
Simplify database interactions

This change is a result of an audit of all places where
Go code directly constructs SQL queries and executes them
against a dbMap, with the goal of eliminating all instances
of constructing a well-known object type (such as a
core.CertificateStatus) from explicitly-listed database columns.
Instead, we should be relying on helper functions defined in the
sa itself to determine which columns are relevant for the
construction of any given object.

This audit did not find many places where this was occurring. It
did reveal a few simplifications, which are contained in this
change:
1) Greater use of existing SelectFoo methods provided by models.go
2) Streamlining of various SelectSingularFoo methods to always
   select by serial string, rather than user-provided WHERE clause
3) One spot (in ocsp-responder) where using a well-known type seemed
   better than using a more minimal custom type

Addresses #4899
2020-07-20 11:12:52 -07:00
Jacob Hoffman-Andrews bef02e782a
Fix nits found by staticcheck (#4726)
Part of #4700
2020-03-30 10:20:20 -07:00
Roland Bracewell Shoemaker 5b2f11e07e Switch away from old style statsd metrics wrappers (#4606)
In a handful of places I've nuked old stats which are not used in any alerts or dashboards as they either duplicate other stats or don't provide much insight/have never actually been used. If we feel like we need them again in the future it's trivial to add them back.

There aren't many dashboards that rely on old statsd style metrics, but a few will need to be updated when this change is deployed. There are also a few cases where prometheus labels have been changed from camel to snake case, dashboards that use these will also need to be updated. As far as I can tell no alerts are impacted by this change.

Fixes #4591.
2019-12-18 11:08:25 -05:00
Daniel McCarney 1c9ece3f44
SA: use wrapped database maps/transactions. (#4585)
New types and related infrastructure are added to the `db` package to allow
wrapping gorp DbMaps and Transactions.

The wrapped versions return a special `db.ErrDatabaseOp` error type when errors
occur. The new error type includes additional information such as the operation
that failed and the related table.

Where possible we determine the table based on the types of the gorp function
arguments. Where that isn't possible (e.g. with raw SQL queries) we try to use
a simple regexp approach to find the table name. This isn't great for general
SQL but works well enough for Boulder's existing SQL queries.

To get additional confidence my regexps work for all of Boulder's queries
I temporarily changed the `db` package's `tableFromQuery` function to panic if
the table couldn't be determined. I re-ran the full unit and integration test
suites with this configuration and saw no panics.

Resolves https://github.com/letsencrypt/boulder/issues/4559
2019-12-04 13:03:09 -05:00
Daniel McCarney fabfba2e16 expiration-mailer: fix nagsAtCapacity to reset. (#4569)
When a nag group hits capacity we set the nagsAtCapacity gauge to 1.
This gauge also needs to be reset to 0 when the nag group is no longer
at capacity.
2019-11-20 19:32:58 -08:00
Jacob Hoffman-Andrews 9906c93217
Generate and store OCSP at precertificate signing time (#4420)
This change adds two tables and two methods in the SA, to store precertificates
and serial numbers.

In the CA, when the feature flag is turned on, we generate a serial number, store it,
sign a precertificate and OCSP, store them, and then return the precertificate. Storing
the serial as an additional step before signing the certificate adds an extra layer of
insurance against duplicate serials, and also serves as a check on database availability.
Since an error storing the serial prevents going on to sign the precertificate, this decreases
the chance of signing something while the database is down.

Right now, neither table has read operations available in the SA.

To make this work, I needed to remove the check for duplicate certificateStatus entry
when inserting a final certificate and its OCSP response. I also needed to remove
an error that can occur when expiration-mailer processes a precertificate that lacks
a final certificate. That error would otherwise have prevented further processing of
expiration warnings.

Fixes #4412

This change builds on #4417, please review that first for ease of review.
2019-09-09 12:21:20 -07:00
Roland Bracewell Shoemaker 6f93942a04 Consistently used stdlib context package (#4229) 2019-05-28 14:36:16 -04:00
Daniel McCarney 5be559debb
sa: remove CertStatus.LockCol and SubscriberApproved. (#4175)
Both of these database fields are not being used.
2019-04-23 15:14:48 -04:00
Daniel McCarney 0ecdf80709 SA: refactor DB stat collection & collect more stats. (#4096)
Go 1.11+ updated the `sql.DBStats` struct with new fields that are of
interest to us. This PR routes these stats to Prometheus by replacing
the existing autoprom stats code with new first-class Prometheus
metrics. Resolves https://github.com/letsencrypt/boulder/issues/4095

The `max_db_connections` stat from the SA is removed because the Go 1.11+
`sql.DBStats.MaxOpenConnections` field will give us a better view of
the same information.

The autoprom "reused_authz" stat that was being incremented in
`SA.GetPendingAuthorization` was also removed. It wasn't doing what it
says it was (counting reused authorizations) and was instead counting
the number of times `GetPendingAuthorization` returned an authz.
2019-03-06 17:08:53 -08:00
Roland Bracewell Shoemaker 232a5f828f Fix ineffectual assignments (#4052)
* in boulder-ra we connected to the publisher and created a publisher gRPC client twice for no apparent reason
* in the SA we ignored errors from `getChallenges` in `GetAuthorizations` which could result in a nil challenge being returned in an authorization
2019-02-13 15:39:58 -05:00
Joel Sing 8ebdfc60b6 Provide formatting logger functions. (#3699)
A very large number of the logger calls are of the form log.Function(fmt.Sprintf(...)).
Rather than sprinkling fmt.Sprintf at every logger call site, provide formatting versions
of the logger functions and call these directly with the format and arguments.

While here remove some unnecessary trailing newlines and calls to String/Error.
2018-05-10 11:06:29 -07:00
Daniel McCarney aa810a3142 gRPC: publish RPC latency stat in server interceptor. (#3665)
We may see RPCs that are dispatched by a client but do not arrive at the server for some time afterwards. To have insight into potential request latency at this layer we want to publish the time delta between when a client sent an RPC and when the server received it.

This PR updates the gRPC client interceptor to add the current time to the gRPC request metadata context when it dispatches an RPC. The server side interceptor is updated to pull the client request time out of the gRPC request metadata. Using this timestamp it can calculate the latency and publish it as an observation on a Prometheus histogram.

Accomplishing the above required wiring a clock through to each of the client interceptors. This caused a small diff across each of the gRPC aware boulder commands.

A small unit test is included in this PR that checks that a latency stat is published to the histogram after an RPC to a test ChillerServer is made. It's difficult to do more in-depth testing because using fake clocks makes the latency 0 and using real clocks requires finding a way to queue/delay requests inside of the gRPC mechanisms not exposed to Boulder.

Updates https://github.com/letsencrypt/boulder/issues/3635 - Still TODO: Explicitly logging latency in the VA, tracking outstanding RPCs as a gauge.
2018-04-25 15:37:22 -07:00
Roland Bracewell Shoemaker 8167abd5e3 Use internet facing appropriate histogram buckets for DNS latencies (#3616)
Also instead of repeating the same bucket definitions everywhere just use a single top level var in the metrics package in order to discourage copy/pasting.

Fixes #3607.
2018-04-04 08:01:54 -04:00
Daniel McCarney e00ed50cc3 Fix nil panic in mailer logger setup when DB conn fails. (#3545)
Prior to this commit, if the expiration-mailer database configuration is
invalid, or the database is unreachable,
`cmd/expiration-mailer/main.go`'s `main()` function tries to call
`sa.SetSQLDebug(dbMap, logger)` before erroring from the DB
initialization failure. This causes a nil panic.

This commit changes the order of two lines such that `sa.SetSQLDebug` is
only called when there was no db setup error.
2018-03-12 13:46:13 -07:00
Jacob Hoffman-Andrews 9da5a7e1fc Cleanup: TLS and GRPC configs are mandatory. (#3476)
Our various main.go functions gated some key code on whether the TLS
and/or GRPC config fields were present. Now that those fields are fully
deployed in production, we can simplify the code and require them.
    
Also, rename tls to tlsConfig everywhere to avoid confusion with the tls
package.
    
Avoid assigning to the same err from two different goroutines in
boulder-ca (fix a race).
2018-02-26 10:16:50 -05:00
Jacob Hoffman-Andrews 68d5cc3331
Restore gRPC metrics (#3265)
The go-grpc-prometheus package by default registers its metrics with Prometheus' global registry. In #3167, when we stopped using the global registry, we accidentally lost our gRPC metrics. This change adds them back.

Specifically, it adds two convenience functions, one for clients and one for servers, that makes the necessary metrics object and registers it. We run these in the main function of each server.

I considered adding these as part of StatsAndLogging, but the corresponding ClientMetrics and ServerMetrics objects (defined by go-grpc-prometheus) need to be subsequently made available during construction of the gRPC clients and servers. We could add them as fields on Scope, but this seemed like a little too much tight coupling.

Also, update go-grpc-prometheus to get the necessary methods.

```
$ go test github.com/grpc-ecosystem/go-grpc-prometheus/...
ok      github.com/grpc-ecosystem/go-grpc-prometheus    0.069s
?       github.com/grpc-ecosystem/go-grpc-prometheus/examples/testproto [no test files]
```
2017-12-07 15:44:55 -08:00
Roland Bracewell Shoemaker 9da1bea433 Update histogram buckets for latencies that measure things over the internet (#3254)
Updates the buckets for histograms in the publisher, va, and expiration-mailer which are used to measure the latency of operations that go over the internet and therefore are liable to take a lot longer than the default buckets can measure. Uses a standard set of buckets for all three instead of attempting to tune for each one.

Fixes #3217.
2017-11-29 15:13:14 -08:00
Jacob Hoffman-Andrews 975456bb08 Switch nagsAtCapacity to Gauge. (#3224)
Fixes #3186
2017-11-08 15:35:25 -08:00
Jacob Hoffman-Andrews 4296dd985a Use TLS in mailer integration tests (#3213)
* Remove non-TLS support from mailer entirely
* Add a config option for trusted roots in expiration-mailer. If unset, it defaults to the system roots, so this does not need to be set in production.
* Use TLS in mail-test-srv, along with an internal root and localhost certificates signed by that root.
2017-11-06 14:57:14 -08:00
Lucas Amorim 7daecf7b23
fix metric name 2017-10-30 11:26:03 -07:00
Lucas Amorim a7a2eaf035
Add metrics to sendNags errors in expiration-mailer 2017-10-29 21:34:41 -07:00
Jacob Hoffman-Andrews f366e45756 Remove global state from metrics gathering (#3167)
Previously, we used prometheus.DefaultRegisterer to register our stats, which uses global state to export its HTTP stats. We also used net/http/pprof's behavior of registering to the default global HTTP ServeMux, via DebugServer, which starts an HTTP server that uses that global ServeMux.

In this change, I merge DebugServer's functions into StatsAndLogging. StatsAndLogging now takes an address parameter and fires off an HTTP server in a goroutine. That HTTP server is newly defined, and doesn't use DefaultServeMux. On it is registered the Prometheus stats handler, and handlers for the various pprof traces. In the process I split StatsAndLogging internally into two functions: makeStats and MakeLogger. I didn't port across the expvar variable exporting, which serves a similar function to Prometheus stats but which we never use.

One nice immediate effect of this change: Since StatsAndLogging now requires and address, I noticed a bunch of commands that called StatsAndLogging, and passed around the resulting Scope, but never made use of it because they didn't run a DebugServer. Under the old StatsD world, these command still could have exported their stats by pushing, but since we moved to Prometheus their stats stopped being collected. We haven't used any of these stats, so instead of adding debug ports to all short-lived commands, or setting up a push gateway, I simply removed them and switched those commands to initialize only a Logger, no stats.
2017-10-13 11:58:01 -07:00
Jacob Hoffman-Andrews dbfb48226d Add parallelism to SA CountCertificatesByNames. (#3133)
Since we can make up to 100 SQL queries from this method (based on the 100-SAN
limit), sometimes it is too slow and we get a timeout for large certificates. By
running some of those queries in parallel, we can speed things up and stop
getting timeouts.
2017-10-02 15:45:08 -04:00
Daniel McCarney 2a84bc2495 Replace go-jose v1 with go-jose v2. (#2899)
This commit replaces the Boulder dependency on
gopkg.in/square/go-jose.v1 with gopkg.in/square/go-jose.v2. This is
necessary both to stay in front of bitrot and because the ACME v2 work
will require a feature from go-jose.v2 for JWS validation.

The largest part of this diff is cosmetic changes:

Changing import paths
jose.JsonWebKey -> jose.JSONWebKey
jose.JsonWebSignature -> jose.JSONWebSignature
jose.JoseHeader -> jose.Header
Some more significant changes were caused by updates in the API for
for creating new jose.Signer instances. Previously we constructed
these with jose.NewSigner(algorithm, key). Now these are created with
jose.NewSigner(jose.SigningKey{},jose.SignerOptions{}). At present all
signers specify EmbedJWK: true but this will likely change with
follow-up ACME V2 work.

Another change was the removal of the jose.LoadPrivateKey function
that the wfe tests relied on. The jose v2 API removed these functions,
moving them to a cmd's main package where we can't easily import them.
This function was reimplemented in the WFE's test code & updated to fail
fast rather than return errors.

Per CONTRIBUTING.md I have verified the go-jose.v2 tests at the imported
commit pass:

ok      gopkg.in/square/go-jose.v2      14.771s
ok      gopkg.in/square/go-jose.v2/cipher       0.025s
?       gopkg.in/square/go-jose.v2/jose-util    [no test files]
ok      gopkg.in/square/go-jose.v2/json 1.230s
ok      gopkg.in/square/go-jose.v2/jwt  0.073s

Resolves #2880
2017-07-26 10:55:14 -07:00
Roland Bracewell Shoemaker a656408630 Use standard SA methods in the expiration-mailer and refactor tests (#2893)
This makes making changes to the `certificate` and `certificateStatus` tables much easier in the future.
2017-07-24 15:24:33 -04:00
Daniel McCarney f70e262935 Replace autoprom stats with Prometheus style stats. (#2869)
This commit replaces the existing expiration-mailer autoprom stats with
first-class Prometheus style stats.
2017-07-13 15:14:36 -07:00
Jacob Hoffman-Andrews 63a25bf913 Remove clientName everywhere. (#2862)
This used to be used for AMQP queue names. Now that AMQP is gone, these consts
were only used when printing a version string at startup. This changes
VersionString to just use the name of the current program, and removes
`const clientName = ` from many of our main.go's.
2017-07-12 10:28:54 -07:00
Jacob Hoffman-Andrews d47d3c5066 Recycle pending authorizations (#2797)
If the feature flag "ReusePendingAuthz" is enabled, a request to create a new authorization object from an account that already has a pending authorization object for the same identifier will return the already-existing authorization object. This should make it less common for people to get stuck in the "too many pending authorizations" state, and reduce DB storage growth.

Fixes #2768
2017-06-19 13:35:36 -04:00
Jacob Hoffman-Andrews b17b5c72a6 Remove statsd from Boulder (#2752)
This removes the config and code to output to statsd.

- Change `cmd.StatsAndLogging` to output a `Scope`, not a `Statter`.
- Remove the prefixing of component name (e.g. "VA") in front of stats; this was stripped by `autoProm` but now no longer needs to be.
- Delete vendored statsd client.
- Delete `MockStatter` (generated by gomock) and `mocks.Statter` (hand generated) in favor of mocking `metrics.Scope`, which is the interface we now use everywhere.
- Remove a few unused methods on `metrics.Scope`, and update its generated mock.
- Refactor `autoProm` and add `autoRegisterer`, which can be included in a `metrics.Scope`, avoiding global state. `autoProm` now registers everything with the `prometheus.Registerer` it is given.
- Change va_test.go's `setup()` to not return a stats object; instead the individual tests that care about stats override `va.stats` directly.

Fixes #2639, #2733.
2017-05-15 10:19:54 -04:00
Jacob Hoffman-Andrews 6719dc17a6 Remove AMQP config and code (#2634)
We now use gRPC everywhere.
2017-04-03 10:39:39 -04:00
Roland Bracewell Shoemaker 98addd5f36 expiration-mailer daemon mode (#2631)
Adds a daemon mode to `expiration-mailer` that is triggered by using the flag `--daemon` in order to follow deployability guidelines. If the `--daemon` flag is used the `mailer.runPeriod` config field is checked for a tick duration, if the value is `0` it exits.

Super lightweight implementation, OCSP-Updater has some custom ticker code which we use to do fancy things when the method being invoked in the loop takes longer expected, but that isn't necessary here.

Fixes #2617.
2017-03-30 16:16:41 -07:00
Roland Bracewell Shoemaker e2b2511898 Overhaul internal error usage (#2583)
This patch removes all usages of the `core.XXXError` and almost all usages of `probs` outside of the WFE and VA and replaces them with a unified internal error type. Since the VA uses `probs.ProblemDetails` quite extensively in challenges, and currently stores them in the DB I've saved this change for another change (it'll also require a migration). Since `ProblemDetails` should only ever be exposed to end-users all of its related logic should be moved into the `WFE` but since it still needs to be exposed to the VA and SA I've left it in place for now.

The new internal `errors` package offers the same convenience functions as `probs` does as well as a new simpler type testing method. A few small changes have also been made to error messages, mainly adding the library and function name to internal server errors for easier debugging (i.e. where a number of functions return the exact same errors and there is no other way to distinguish which method threw the error).

Also adds proper encoding of internal errors transferred over gRPC (the current encoding scheme is kept for `core` and `probs` errors since it'll be ideally be removed after we deploy this and follow-up changes) using `grpc/metadata` instead of the gRPC status codes.

Fixes #2507. Updates #2254 and #2505.
2017-03-22 23:27:31 -07:00
Daniel McCarney 8f1de3b57e Allows expiration-mailer to use template as subject. (#2613)
This commit resolves #2599 by adding support to the expiration-mailer to
treat the subject for email messages as a template. This allows for the
dynamic subject lines from #2435 to be used with a prefix for staging
emails.
2017-03-21 16:57:28 -07:00
Roland Bracewell Shoemaker 8a1adbdc9a Switch to gorp.v2 (#2598)
Switch from `gorp.v1` to `gorp.v2`. Removes `vendor/gopkg.in/gorp.v1` and vendors `vendor/gopkg/go-gorp/gorp.v2`, all tests pass.

Changes between `v1.7.1` and `v2.0.0`: c87af80f3c...4deece6103

Fixes #2490.
2017-03-08 12:20:22 -05:00
Daniel McCarney fcf361c327 Remove CertStatusOptimizationsMigrated Feature Flag & Assoc. Cruft (#2561)
The NotAfter and IsExpired fields on the certificateStatus table
have been migrated in staging & production. Similarly the
CertStatusOptimizationsMigrated feature flag has been turned on after
a successful backfill operation. We have confirmed the optimization is
working as expected and can now clean out the duplicated v1 and v2
models, and the feature flag branching. The notafter-backfill command
is no longer useful and so this commit also cleans it out of the repo.

Note: Some unit tests were sidestepping the SA and inserting
certificateStatus rows explicitly. These tests had to be updated to
set the NotAfter field in order for the queries used by the
ocsp-updater and the expiration-mailer to perform the way the tests
originally expected.

Resolves #2530
2017-02-16 11:35:00 -08:00
Daniel McCarney 00d11f126b Parse feature flags in all cmd's (#2534)
If you are the first person to add a feature to a Boulder command its very
easy to forget to update the command's config structure to accommodate a
`map[string]bool` entry and to pass it to `features.Set` in `main()`. See
https://github.com/letsencrypt/boulder/issues/2533 for one example. I've
fallen into this trap myself a few times so I'm going to try and save myself
some future grief by fixing it across the board once and for all!

This PR adds a `Features` config entry and a corresponding `features.Set` to:
* ocsp-updater (resolves #2533)
* admin-revoker
* boulder-publisher
* contact-exporter
* expiration-mailer
* expired-authz-purger
* notify-mailer
* ocsp-responder
* orphan-finder

These components were skipped because they already had features supported:
* boulder-ca
* boulder-ra
* boulder-sa
* boulder-va
* boulder-wfe
* cert-checker

I deliberately skipped adding Feature support to:
* single-ocsp (Its only configuration comes from the pkcs11key library and
  doesn't support features)
* rabbitmq-setup (No configuration/features and we'll likely soon be rming this
  since the gRPC migration)
* notafter-backfill (This is a one-off that will be deleted soon)
2017-01-27 16:29:46 -05:00
Josh Soref 8adf9d41cf Spelling (#2500)
Various spelling fixes.
2017-01-16 10:44:52 -05:00
Jacob Hoffman-Andrews 510e279208 Simplify gRPC TLS configs. (#2470)
Previously, a given binary would have three TLS config fields (CA cert, cert,
key) for its gRPC server, plus each of its configured gRPC clients. In typical
use, we expect all three of those to be the same across both servers and clients
within a given binary.

This change reuses the TLSConfig type already defined for use with AMQP, adds a
Load() convenience function that turns it into a *tls.Config, and configures it
for use with all of the binaries. This should make configuration easier and more
robust, since it more closely matches usage.

This change preserves temporary backwards-compatibility for the
ocsp-updater->publisher RPCs, since those are the only instances of gRPC
currently enabled in production.
2017-01-06 14:19:18 -08:00
Daniel McCarney e74e7ad14b Include domain name in email subj (#2435)
This PR modifies the `expiration-mailer` utility to change the subject used in the reminder emails to include a domain name from the expiring certificate.

Previously unless otherwise specified using the `Mailer.Subject` configuration parameter all reminder emails were sent with the subject `Certificate expiration notice`. Both the `test/config/` and `test/config-next` expiration mailer configurations do not override the subject and were using the default.

With this PR, if no `Mailer.Subject` configuration parameter is provided then reminder emails are sent with the subject `Certificate expiration notice for domain "foo.bar.com"` in the case of only one domain in the expiring certificate, and `Certificate expiration for domain "foo.bar.com" (and $(n-1) more)` for the case where there are n > 1 domains (e.g. "(and 1 more)", "(and 2 more)" ...). I explicitly left support for the `Mailer.Subject` override to allow legacy configurations to function.

I didn't explicitly add a new unit test for this behaviour because the existing unit tests were exercising both the "configuration override" portion of the subject behaviour, and matching the new expected subject. It would be entirely duplicated code to write a separate test for the subject template.

Resolves #2411
2016-12-19 17:12:37 -05:00
Daniel McCarney 1083db5a15 Optimize expiration-mailer queries (#2440)
This PR splits up the expiration-mailer's `findExpiringCertificates` query into two parts:
1. One query to find `certificateStatus` serial numbers that match the search criteria
2. Sequential queries to find each `certificate` row for the results from 1.

This removes the `JOIN` on two large tables from the original `findExpiringCertificates` query and lets us shift load away from the database. https://github.com/letsencrypt/boulder/issues/2432 wasn't sufficient to reduce the load of this query.

Resolves https://github.com/letsencrypt/boulder/issues/2425
2016-12-19 14:29:37 -05:00
Daniel McCarney 5c3482d2dd `certificateStatus` table optimizations (Part Four) (#2432)
Similar to #2431 the expiration-mailer's `findExpiringCertificates()` query can be optimized slightly by using `certificateStatus.NotAfter` in place of `certificate.expires` in the `WHERE` clause of its query when the `CertStatusOptimizationsMigrated` feature is enabled.

Resolves https://github.com/letsencrypt/boulder/issues/2425
2016-12-15 12:53:54 -08:00
Jacob Hoffman-Andrews 27a1446010 Move timeouts into client interceptor. (#2387)
Previously we had custom code in each gRPC wrapper to implement timeouts. Moving
the timeout code into the client interceptor allows us to simplify things and
reduce code duplication.
2016-12-05 10:42:26 -05:00
Roland Bracewell Shoemaker 03fdd65bfe Add gRPC server to SA (#2374)
Adds a gRPC server to the SA and SA gRPC Clients to the WFE, RA, CA, Publisher, OCSP updater, orphan finder, admin revoker, and expiration mailer.

Also adds a CA gRPC client to the OCSP Updater which was missed in #2193.

Fixes #2347.
2016-12-02 17:24:46 -08:00
Jacob Hoffman-Andrews 4e90f07d89 Make expiration mailer log standard startup format. (#2352)
Previously it was just logging its name, not its version.
2016-11-23 10:57:49 -08:00
Daniel McCarney a6f2b0fafb Updates `go-jose` dep to v1.1.0 (#2314)
This commit updates the `go-jose` dependency to [v1.1.0](https://github.com/square/go-jose/releases/tag/v1.1.0) (Commit: aa2e30fdd1fe9dd3394119af66451ae790d50e0d). Since the import path changed from `github.com/square/...` to `gopkg.in/square/go-jose.v1/` this means removing the old dep and adding the new one.

The upstream go-jose library added a `[]*x509.Certificate` member to the `JsonWebKey` struct that prevents us from using a direct equality test against two `JsonWebKey` instances. Instead we now must compare the inner `Key` members.

The `TestRegistrationContactUpdate` function from `ra_test.go` was updated to populate the `Key` members used in testing instead of only using KeyID's to allow the updated comparisons to work as intended.

The `Key` field of the `Registration` object was switched from `jose.JsonWebKey` to `*jose.JsonWebKey ` to make it easier to represent a registration w/o a Key versus using a value with a nil `JsonWebKey.Key`.

I verified the upstream unit tests pass per contributing.md:
```
daniel@XXXXX:~/go/src/gopkg.in/square/go-jose.v1$ git show
commit aa2e30fdd1fe9dd3394119af66451ae790d50e0d
Merge: 139276c e18a743
Author: Cedric Staub <cs@squareup.com>
Date:   Thu Sep 22 17:08:11 2016 -0700

    Merge branch 'master' into v1
    
    * master:
      Better docs explaining embedded JWKs
      Reject invalid embedded public keys
      Improve multi-recipient/multi-sig handling

daniel@XXXXX:~/go/src/gopkg.in/square/go-jose.v1$ go test ./...
ok  	gopkg.in/square/go-jose.v1	17.599s
ok  	gopkg.in/square/go-jose.v1/cipher	0.007s
?   	gopkg.in/square/go-jose.v1/jose-util	[no test files]
ok  	gopkg.in/square/go-jose.v1/json	1.238s
```
2016-11-08 13:56:50 -05:00
Jacob Hoffman-Andrews 32c03f942b Don't start DebugServer until server's ready. (#2271)
This makes availability of DebugServer a better proxy for readiness of the
component.
2016-10-21 16:57:14 -04:00
Daniel McCarney 8efc6342bb Mailer reliability improvements (#2262)
### Connect before sending mail, not at startup

Per #2250 when we connect to the remote SMTP server at start-up time by calling `mailer.Connect()` but do not actually call `mailer.SendMail()` until after we have done some potentially expensive/time-consuming work we are liable to have our connection closed due to timeout.

This PR moves the `Connect()` call in `expiration-mailer` and `notify-mailer` to be closer to where the actual messages are sent via `SendMail()` and resolves #2250 

### Handle SMTP 421 errors gracefully

Issue #2249 describes a case where we see this SMTP error code from the remote server when our connection has been idle for too long. This would manifest when connecting to the remote server at startup, running a very long database query, and then sending mail. This commit allows the mailer to treat SMTP 421 errors as an event that should produce a reconnect attempt and resolves #2249.

A unit test is added to the mailer tests to test that reconnection works when the server sends a SMTP 421 error. Prior to b64e51f and support for SMTP 421 reconnection this test failed in a manner matching issue #2249:

```
go test -p 1 -race --test.run TestReconnectSMTP421
github.com/letsencrypt/boulder/mail
Wrote goodbye msg: 421 1.2.3 green.eggs.and.spam Error: timeout exceeded
Cutting off client early
--- FAIL: TestReconnectSMTP421 (0.00s)
  mailer_test.go:257: Expected SendMail() to not fail. Got err: 421
  1.2.3 green.eggs.and.spam Error: timeout exceeded
  FAIL
  FAIL  github.com/letsencrypt/boulder/mail     0.023s
```

With b64e51f the test passes and the client gracefully reconnects.

The existing reconnect testing logic in the `mail-test-srv` integration tests is changed such that half of the forced disconnects are a normal clean connection close and half are a SMTP 421. This allows the existing integration test for server disconnects to be reused to test the 421 reconnect logic.
2016-10-20 14:10:47 -04:00
Roland Bracewell Shoemaker e18f4e7457 revert expiration-mailer clientName change (#2207)
This required a corresponding change to AMQP permissions, to be scheduled in the future.
2016-09-26 10:37:17 -07:00
Roland Bracewell Shoemaker 239bf9ae0a Very basic feature flag impl (#1705)
Updates #1699.

Adds a new package, `features`, which exposes methods to set and check if various internal features are enabled. The implementation uses global state to store the features so that services embedded in another service do not each require their own features map in order to check if something is enabled.

Requires a `boulder-tools` image update to include `golang.org/x/tools/cmd/stringer`.
2016-09-20 16:29:01 -07:00
Roland Bracewell Shoemaker c8f1fb3e2f Remove direct usages of go-statsd-client in favor of using metrics.Scope (#2136)
Fixes #2118, fixes #2082.
2016-09-07 19:35:13 -04:00
Daniel McCarney a584f8de46 Allow `mailer` to reconnect to server. (#2101)
The `MailerImpl` gains a few new fields (`retryBase`, &  `retryMax`). These are used with `core.RetryBackoff` in `reconnect()` to implement exponential backoff in a reconnect attempt loop. Both `expiration-mailer` and `notify-mailer` are modified to add CLI args for these 2 flags and to wire them into the `MailerImpl` via its `New()` constructor.

In `MailerImpl`'s `SendMail()` function it now detects when `sendOne` returns an `io.EOF` error indicating that the server closed the connection unexpectedly. When this case occurs `reconnect()` is invoked. If the reconnect succeeds then we invoke `sendOne` again to try and complete the message sending operation that was interrupted by the disconnect.

For integration testing purposes I modified the `mail-test-srv` to support a `-closeChance` parameter between 0 and 100. This controls what % of `MAIL` commands will result in the server immediately closing the client connection before further processing. This allows us to simulate a flaky mailserver. `test/startservers.py` is modified to start the `mail-test-srv` with a 35% close chance to thoroughly test the reconnection logic during the existing `expiration-mailer` integration tests. I took this as a chance to do some slight clean-up of the `mail-test-srv` code (mostly removing global state).

For unit testing purposes I modified the mailer `TestConnect` test to abstract out a server that can operate similar to `mail-test-serv` (e.g. can close connections artificially).

This is testing a server that **closes** a connection, and not a server that **goes away/goes down**. E.g. the `core.RetryBackoff` sleeps themselves are not being tested. The client is disconnected and attempts a reconnection which always succeeds on the first try. To test a "gone away" server would require a more substantial rewrite of the unit tests and the `mail-test-srv`/integration tests. I think this matches the experience we have with MailChimp/Mandril closing long lived connections.
2016-08-15 14:14:49 -07:00
Ben Irving 8ed5b1e6a1 Replace *AcmeURL with string (#2117)
Removes core.AcmeURL from boulder and uses string instead.

Fixes #1996
2016-08-11 13:27:19 -07:00
Jacob Hoffman-Andrews 243832822a Remove transaction in updateCertStatus. (#2096)
Formerly in expiration-mailer, when we wanted to set `lastNagSent` to `Now()`, we started a transaction, read the object, updated one field, wrote it back, and closed the transaction.

This commit replaces the transaction and instead does a much simpler and more efficient `UPDATE certificateStatus SET lastNagSent = ? where serial = ?;`.
2016-08-09 09:44:02 -04:00
Daniel McCarney b16585be5d `notify-mailer` monitor progress (#2046)
This PR adds a `printStatus` function that is called every iteration of the mailer's `run()` loop. The status output is logged at the `info` level and includes the destination email, the current message being sent, the total number of messages to send, and the elapsed time since `run()` started.

The status output can be disabled by lowering the default syslog level in the `notify-mailer` config.

Additionally, this PR adds stats support for the mailer package. Three new stats are
published during the `MailerImpl`'s `SendMail` function (called in a loop by the mailer utilities):
  `Mailer.SendMail.Attempts`
  `Mailer.SendMail.Successes`
  `Mailer.SendMail.Errors`

This PR removes two stats from the `expiration-mailer` that are redundant copies of the above:
  `Mailer.Expiration.Errors.SendingNag.SendFailure`
  `Mailer.Expiration.Sent`

This resolves #2026.
2016-07-26 11:26:08 -04:00
Jacob Hoffman-Andrews db3792fd02 expiration-mailer: Skip renewed certs faster. (#2060)
Previously, if a certificate was skipped by the expiration mailer due to being
renewed already, we wouldn't update its lastnag time. However, this meant that
already-renewed certificates would clog up the results of the query
expiration-mailer does to find expired certs. Since this query has a limit (1000
in practice), we might find only renewed certificates on each query, even when
there are non-renewed certificates available to alert about. Then we'd never
make forward progress.

This change updates the stored lastExpirationNagSent field when a certificate is
skipped over due to renewal, so that it isn't included in the first-step query.

Fixes #2054
2016-07-18 17:33:45 -07:00
Daniel McCarney 02f3f124f6 Add stat for expiration-mailer at capacity. (#2042)
This PR adds a stat that is emitted when any of the nag groups are operating at capacity. The mailer is considered at capacity when the number of certs returned by the query in findExpiringCertificates is equal to the configured -cert_limit.

The at capacity stat names take the form: "Mailer.Expiration.Errors.Nag-XXXXX.AtCapacity" where XXXXX is the String() representation of the nagCheck offset nag time. Allowing the capacity alert to be specified per-nag group. As an example, a nag time of 48hrs with a nag check of 24hrs would produce a stat: "Mailer.Expiration.Errors.Nag-72h0m0s.AtCapacity" when it reached a capacity state.

This will allow creation of an alert for the conditions that caused issue #2002 to manifest.

In order to unit test with a mock statter it was also required to swap out the time.Since calls to equivalent dateB.sub(dateA) calls using the fake clock.
2016-07-13 17:33:47 -07:00
Ben Irving 1a4f099899 Split up boulder-config.json (Expiration Mailer) (#2036)
Part of #1962.
2016-07-12 15:55:52 -07:00
Jacob Hoffman-Andrews d0eef4b498 Fix crash in expiration-mailer (#1997)
In #1923 we changed reg.Contact to a pointer, which can be nil if the corresponding data from the DB is the literal string "null". This causes panics in expiration-mailer, which we need to fix.

This change fixes modelToRegistration to always return a pointer to a non-nil slice. It also adds an extra sanity check in expiration-mailer itself.

Fixes #1993

https://github.com/letsencrypt/boulder/pull/1997
2016-06-30 10:22:40 -07:00
Daniel McCarney 6bd97b2a6b Adds `notify-mailer` command. (#1936)
This commit adds a new notify-mailer command. Outside of the new command, this PR also:

Adds a new SMTPConfig to cmd/config.go that is shared between the expiration mailer and the notify mailer.
Modifies mail/mailer.go to add an smtpClient interface.
Adds a dryRunClient to mail/mailer.go that implements the smtpClient interface.
Modifies the mail/mailer.go MailerImpl and constructor to use the SMTPConfig and a dialer. The missing functions from the smtpClient interface are added.
The notify-mailer command supports checkpointing through --start and --end parameters. It supports dry runs by using the new dryRunClient from the mail package when given the --dryRun flag. The speed at which emails are sent can be tweaked using the --sleep flag.

Unit tests for notify-mailer's checkpointing behaviour, the checkpoint interval/sleep parameter sanity, the sleep behaviour, and the message content construction are included in main_test.go.

Future work:

A separate command to generate the list of destination emails provided to notify-mailer
Support for using registration IDs as input and resolving the email address at runtime.
Resolves #1928. Credit to @jsha for the initial work - I'm just completing the branch he started.

* Adds `notify-mailer` command.
* Adds a new SMTPConfig to `cmd/config.go` that is shared between the
expiration mailer and the notify mailer.
* Modifies `mail/mailer.go` to add an `smtpClient` interface.
* Adds a `dryRunClient` to `mail/mailer.go` that implements the
  `smtpClient` interface.
* Modifies the `mail/mailer.go` `MailerImpl` and constructor to use the
  SMTPConfig and a dialer. The missing functions from the `smtpClient`
  interface are added.
* Fix errcheck warnings
* Review feedback
* Review feedback pt2
* Fixes #1446 - invalid message-id generation.
* Change -configFile to -config
* Test message ID with friendly email

https://github.com/letsencrypt/boulder/pull/1936
2016-06-16 15:12:31 -07:00
Daniel McCarney cd2d1c4f6b Allow removing registration contact. (#1923)
The RA UpdateRegistration function merges a base registration object with an update by calling Registration.MergeUpdate. Prior to this commit MergeUpdate only allowed the updated registration object to overwrite the Contact field of the existing registration if the updated reg. defined at least one AcmeURL. This prevented clients from being able to outright remove the contact associated with an existing registration.

This commit removes the len() check on the input.Contact in MergeUpdate to allow the r.Contact field to be overwritten by a []*core.AcmeURL(nil) Contact field. Subsequently clients can now send an empty contacts list in the update registration POST in order to remove their reg contact.

Fixes #1846

* Allow removing registration contact.
* Adds a test for `MergeUpdate` contact removal.
* Change `Registration.Contact` type to `*[]*core.AcmeURL`.
* End validateContacts early for empty contacts
* Test removing reg. contact more thoroughly.
2016-06-13 11:02:29 -07:00
Ben Irving 1336c42813 Replace all log.Err calls with log.AuditErr (#1891)
* remove calls to log.Err()
* go fmt
* remove more occurrences
* change AuditErr argument to string and replace occurrences
2016-06-06 16:27:16 -04:00
Roland Bracewell Shoemaker 54573b36ba Remove all stray copyright headers and appends the initial line to LICENSE.txt (#1853) 2016-05-31 12:32:04 -07:00
Kane York fef60a8fd6 Add statsd reporting of current DB connection count (#1805)
* rename, change params, restructure
* I'm wondering how I managed that one
* use a metrics.Scope
* move method to SA, update callers
* rerun goimports
* fix compile error
* revert cmd/shell.go

https://github.com/letsencrypt/boulder/pull/1805
2016-05-12 20:33:23 -07:00
Jacob Hoffman-Andrews b3bc3d8e41 Add a MaxDBConns config parameter. (#1793) 2016-05-09 14:21:15 -07:00
Kane York fb4955c72a Fix expiration-mailer logspam when all certs are renewed (#1770)
Fix expiration-mailer logspam when all certs are renewed
Fixes #1772
2016-05-02 13:48:28 -07:00
Kane York b7cf618f5d context.Context as the first parameter of all RPC calls (#1741)
Change core/interfaces to put context.Context as the first parameter of all RPC calls in preparation for gRPC.
2016-04-19 11:34:36 -07:00
Jacob Hoffman-Andrews e6c17e1717 Switch to new vendor style (#1747)
* Switch to new vendor style.

* Fix metrics generate command.

* Fix miekg/dns types_generate.

* Use generated copies of files.

* Update miekg to latest.

Fixes a problem with `go generate`.

* Set GO15VENDOREXPERIMENT.

* Build in letsencrypt/boulder.

* fix travis more.

* Exclude vendor instead of godeps.

* Replace some ...

* Fix unformatted cmd

* Fix errcheck for vendorexp

* Add GO15VENDOREXPERIMENT to Makefile.

* Temp disable errcheck.

* Restore master fetch.

* Restore errcheck.

* Build with 1.6 also.

* Match statsd.*"

* Skip errcheck unles Go1.6.

* Add other ignorepkg.

* Fix errcheck.

* move errcheck

* Remove go1.6 requirement.

* Put godep-restore with errcheck.

* Remove go1.6 dep.

* Revert master fetch revert.

* Remove -r flag from godep save.

* Set GO15VENDOREXPERIMENT in Dockerfile and remove _worskpace.

* Fix Godep version.
2016-04-18 12:51:36 -07:00
Kane York 25b45a45ec Errcheck errors fixed (#1677)
* Fix all errcheck errors
* Add errcheck to test.sh
* Add a new sa.Rollback method to make handling errors in rollbacks easier.
This also causes a behavior change in the VA. If a HTTP connection is
abruptly closed after serving the headers for a non-200 response, the
reported error will be the read failure instead of the non-200.
2016-04-12 16:54:01 -07:00
Jacob Hoffman-Andrews ecc04e8e61 Refactor log package (#1717)
- Remove error signatures from log methods. This means fewer places where errcheck will show ignored errors.
- Pull in latest cfssl to be compatible with errorless log messages.
- Reduce the number of message priorities we support to just those we actually use.
- AuditNotice -> AuditInfo
- Remove InfoObject (only one use, switched to Info)
- Remove EmergencyExit and related functions in favor of panic
- Remove SyslogWriter / AuditLogger separate types in favor of a single interface, Logger, that has all the logging methods on it.
- Merge mock log into logger. This allows us to unexport the internals but still override them in the mock.
- Shorten names to be compatible with Go style: New, Set, Get, Logger, NewMock, etc.
- Use a shorter log format for stdout logs.
- Remove "... Starting" log messages. We have better information in the "Versions" message logged at startup.

Motivation: The AuditLogger / SyslogWriter distinction was confusing and exposed internals only necessary for tests. Some components accepted one type and some accepted the other. This made it hard to consistently use mock loggers in tests. Also, the unnecessarily fat interface for AuditLogger made it hard to meaningfully mock out.
2016-04-08 16:12:20 -07:00
Jacob Hoffman-Andrews 3018c00519 Testing and logging improvements
Pass log as an argument to SA. This allows us to mock it out.
Use a mockSA in CA test.
Use mockSA in orphan-finder test.
Improve logging from assert functions: Use our own printing style plus FailNow() so that each failure message isn't prefixed by "test-tools.go:60"
Remove duplicate TraceOn.

Part of #1642.

https://github.com/letsencrypt/boulder/pull/1683
2016-04-04 18:42:42 -07:00
Roland Bracewell Shoemaker 800b5b0cbf Switch to using a wrapped statter that provides PID
* Switch to using a wrapped statter that provides PID

* Fix tests and change some types to interfaces

* Add hostname to suffix + update comment
2016-04-01 15:43:35 -07:00
Kane York 98567efdfc Add integration tests for expiry mailer
This creates a new server, 'mail-test-srv', which is a simplistic SMTP
server that accepts mail and can report the received mail over HTTP.

An integration test is added that uses the new server to test the expiry
mailer.

The FAKECLOCK environment variable is used to force the expiry mailer to
think that the just-issued certificate is about to expire.

Additionally, the expiry mailer is modified to cleanly shut down its
SMTP connections.
2016-03-25 10:02:02 -07:00
Roland Shoemaker 00b617b59a Switch to upstream square/go-jose + pull latest 2016-03-15 13:54:22 -07:00
Kane York 21700ffec5 Improve mocks.Mailer to check To: line 2016-03-14 17:08:44 -07:00
Kane York 327a760311 expiration-mailer: don't mail if exact-renewal already
If a certificate has already been issued with the same set of FQDNs, it
is considered to be renewed and no expiration mail is sent.

Also, use the connection string in the test/vars package instead of
copying it all around.
2016-03-09 10:58:56 -08:00