Commit Graph

47 Commits

Author SHA1 Message Date
Phil Porada 439517543b
CI: Run staticcheck standalone (#7055)
Run staticcheck as a standalone binary rather than as a library via
golangci-lint. From the golangci-lint help out,
> staticcheck (megacheck): It's a set of rules from staticcheck. It's
not the same thing as the staticcheck binary. The author of staticcheck
doesn't support or approve the use of staticcheck as a library inside
golangci-lint.

We decided to disable ST1000 which warns about incorrect or missing
package comments.

For SA4011, I chose to change the semantics[1] of the for loop rather
than ignoring the SA4011 lint for that line.

Fixes https://github.com/letsencrypt/boulder/issues/6988

1. https://go.dev/ref/spec#Continue_statements
2023-08-31 21:09:40 -07:00
Aaron Gable cf795aed54
expiration-mailer: don't re-attempt bad addresses (#6246)
When expiration-mailer attempts to send nag emails, and
the result is a "Bad Address" error, mark the certificates in
question as having had their last expiration nag sent, so we
don't keep retrying them every time expiration-mailer runs.

To facilitate this, factor out more of the code which performs
the database updates into a more robust helper function, and
optimize it to perform all of the updates at once.

Fixes #6185
2022-07-25 13:20:17 -07:00
Jacob Hoffman-Andrews 76f987a1df
Reland "Allow expiration mailer to work in parallel" (#6133)
This reverts commit 7ef6913e71.

We turned on the `ExpirationMailerDontLookTwice` feature flag in prod, and it's
working fine but not clearing the backlog. Since
https://github.com/letsencrypt/boulder/pull/6100 fixed the issue that caused us
to (nearly) stop sending mail when we deployed #6057, this should be safe to
roll forward.

The revert of the revert applied cleanly, except for expiration-mailer/main.go
and `main_test.go`, particularly around the contents `processCerts` (where
`sendToOneRegID` was extracted from) and `sendToOneRegID` itself. So those areas
are good targets for extra attention.
2022-05-23 16:16:43 -07:00
Aaron Gable 7ef6913e71
Revert "Allow expiration mailer to work in parallel" (#6080)
When deployed, the newly-parallel expiration-mailer encountered
unexpected difficulties and dropped to apparently sending nearly zero
emails despite not throwing any real errors. Reverting the parallelism
change until we understand and can fix the root cause.

This reverts two commits:
- Allow expiration mailer to work in parallel (#6057)
- Fix data race in expiration-mailer test mocks (#6072) 

It also modifies the revert to leave the new `ParallelSends` config key
in place (albeit completely ignored), so that the binary containing this
revert can be safely deployed regardless of config status.

Part of #5682
2022-05-03 13:18:40 -07:00
Jacob Hoffman-Andrews 9629c88d66
Allow expiration mailer to work in parallel (#6057)
Previously, each accounts email would be sent in serial,
along with several reads from the database (to check for
certificate renewal) and several writes to the database (to update
`certificateStatus.lastExpirationNagSent`). This adds a config field
for the expiration mailer that sets the parallelism it will use.

That means making and using multiple SMTP connections as well. Previously,
`bmail.Mailer` was not safe for concurrent use. It also had a piece of
API awkwardness: after you created a Mailer, you had to call Connect on
it to change its state.

Instead of treating that as a state change on Mailer, I split out a
separate component: `bmail.Conn`. Now, when you call `Mailer.Connect()`,
you get a Conn. You can send mail on that Conn and Close it when you're
done. A single Mailer instance can produce multiple Conns, so Mailer is
now concurrency-safe (while Conn is not).

This involved a moderate amount of renaming and code movement, and
GitHub's move detector is not keeping up 100%, so an eye towards "is
this moved code?" may help. Also adding `?w=1` to the diff URL to ignore
whitespace diffs.
2022-04-21 18:04:55 -07:00
Jacob Hoffman-Andrews b7989d0b5d
Add EPIPE to retryable mailer errors (#5801) 2021-11-16 15:28:59 -08:00
Samantha 401d862354
mail: Rename RecoverableSMTPError to BadAddressSMTPError (#5479)
Rename `RecoverableSMTPError` to `BadAddressSMTPError`. The former
implies that an operation resulting in this error can be retried.
2021-06-15 11:04:56 -07:00
Samantha a27054132b
mail: Reconnect when connection reset by peer (#5461) 2021-06-07 13:30:44 -06:00
Samantha 20bfc65c32
mailer: replacing error assertions with errors.As (#5123)
errors.As checks for a specific error in a wrapped error chain
(see https://golang.org/pkg/errors/#As) as opposed to asserting
that an error is of a specific type.

Part of #5010
2020-10-13 17:34:17 -07:00
Jacob Hoffman-Andrews 3bf6aa4aac
notify-mailer: improve log output (#5094)
One of the log lines describes the most frequent address corresponding
to a number of accounts, but it actually corresponds to a number of
lines in the input CSV.

Also, now that we escape newlines in log output, the dryRunMailer's
output looks messed up. Split the message body into lines and emit one
log message per line.
2020-09-17 09:56:24 -07:00
Jacob Hoffman-Andrews 75024c3ec1
Replace clock.Default() with clock.New() (#4761)
clock.Default is deprecated:
https://godoc.org/github.com/jmhodges/clock#Default
2020-04-08 17:23:43 -07:00
Roland Bracewell Shoemaker 0e02b07aaf mail: populate mailer counter when creating dryRun client (#4646)
If a dryRunClient is used it will cause panics when SendMail is called because the 
constructor doesn't populate the prometheus counter. This fix populates the counter.
2020-01-14 18:17:14 -05:00
Roland Bracewell Shoemaker 5b2f11e07e Switch away from old style statsd metrics wrappers (#4606)
In a handful of places I've nuked old stats which are not used in any alerts or dashboards as they either duplicate other stats or don't provide much insight/have never actually been used. If we feel like we need them again in the future it's trivial to add them back.

There aren't many dashboards that rely on old statsd style metrics, but a few will need to be updated when this change is deployed. There are also a few cases where prometheus labels have been changed from camel to snake case, dashboards that use these will also need to be updated. As far as I can tell no alerts are impacted by this change.

Fixes #4591.
2019-12-18 11:08:25 -05:00
Roland Bracewell Shoemaker 5d6196e228
Fix mailer error bug (#4042)
This fixes two bugs:
1. `resetAndError` would be called on every error, including `io.EOF`, which is returned when the connection is terminated. Calling `m.client.Reset()` after a `io.EOF` will result in another error, causing us to wrap the `io.EOF` with a `errors.errorString`. This broke a check in `sendMail` that was used to cause a reconnect.
2. There was a error type cast that assumed the type without checking it, which could result in a panic when an error of the unexpected type was returned.
2019-02-07 20:25:08 -08:00
Roland Bracewell Shoemaker 064001203b Continue work on more SMTP errors (#4039)
Instead of just on 401. Pulled the various error codes from a handful of SMTP docs I
could find, they could probably use a second once over by others though.
2019-01-28 22:23:25 -08:00
Jacob Hoffman-Andrews fb7b60f42c Always reset the SMTP connection after an error. (#4036)
This prevents problems with "nested MAIL command."

Fixes #3191
2019-01-25 15:08:25 -05:00
Daniel McCarney 1a68cc2225 notify-mailer: warn for bad rcpt, don't exit. (#4022)
Resolves https://github.com/letsencrypt/boulder/issues/4019

I can't find RFC verse and chapter for "401 4.1.3" errors, but [IANA's registry of SMTP enhanced status codes](https://www.iana.org/assignments/smtp-enhanced-status-codes/smtp-enhanced-status-codes.xhtml) does show an entry matching `x.1.3`:
```
X.1.3 | Bad destination mailbox address syntax | 501 | The destination address was syntactically invalid. This can apply to any field in the address. This code is only useful for permanent failures. | [RFC3463] (Standards Track) | G. Vaudreuil | IESG
```
However that entry from IANA says the "associated basic code" is 501, not 401.

Since we wrote this tool to talk to exactly one SMTP server in the world and it definitely is returning "401 4.1.3" in some cases I think its reasonable to handle as I've done in this PR. Alternative suggestions welcome.
2019-01-18 14:14:30 -08:00
Joel Sing 8ebdfc60b6 Provide formatting logger functions. (#3699)
A very large number of the logger calls are of the form log.Function(fmt.Sprintf(...)).
Rather than sprinkling fmt.Sprintf at every logger call site, provide formatting versions
of the logger functions and call these directly with the format and arguments.

While here remove some unnecessary trailing newlines and calls to String/Error.
2018-05-10 11:06:29 -07:00
Jacob Hoffman-Andrews 4296dd985a Use TLS in mailer integration tests (#3213)
* Remove non-TLS support from mailer entirely
* Add a config option for trusted roots in expiration-mailer. If unset, it defaults to the system roots, so this does not need to be set in production.
* Use TLS in mail-test-srv, along with an internal root and localhost certificates signed by that root.
2017-11-06 14:57:14 -08:00
Daniel McCarney 8efc6342bb Mailer reliability improvements (#2262)
### Connect before sending mail, not at startup

Per #2250 when we connect to the remote SMTP server at start-up time by calling `mailer.Connect()` but do not actually call `mailer.SendMail()` until after we have done some potentially expensive/time-consuming work we are liable to have our connection closed due to timeout.

This PR moves the `Connect()` call in `expiration-mailer` and `notify-mailer` to be closer to where the actual messages are sent via `SendMail()` and resolves #2250 

### Handle SMTP 421 errors gracefully

Issue #2249 describes a case where we see this SMTP error code from the remote server when our connection has been idle for too long. This would manifest when connecting to the remote server at startup, running a very long database query, and then sending mail. This commit allows the mailer to treat SMTP 421 errors as an event that should produce a reconnect attempt and resolves #2249.

A unit test is added to the mailer tests to test that reconnection works when the server sends a SMTP 421 error. Prior to b64e51f and support for SMTP 421 reconnection this test failed in a manner matching issue #2249:

```
go test -p 1 -race --test.run TestReconnectSMTP421
github.com/letsencrypt/boulder/mail
Wrote goodbye msg: 421 1.2.3 green.eggs.and.spam Error: timeout exceeded
Cutting off client early
--- FAIL: TestReconnectSMTP421 (0.00s)
  mailer_test.go:257: Expected SendMail() to not fail. Got err: 421
  1.2.3 green.eggs.and.spam Error: timeout exceeded
  FAIL
  FAIL  github.com/letsencrypt/boulder/mail     0.023s
```

With b64e51f the test passes and the client gracefully reconnects.

The existing reconnect testing logic in the `mail-test-srv` integration tests is changed such that half of the forced disconnects are a normal clean connection close and half are a SMTP 421. This allows the existing integration test for server disconnects to be reused to test the 421 reconnect logic.
2016-10-20 14:10:47 -04:00
Roland Bracewell Shoemaker c8f1fb3e2f Remove direct usages of go-statsd-client in favor of using metrics.Scope (#2136)
Fixes #2118, fixes #2082.
2016-09-07 19:35:13 -04:00
Daniel McCarney a584f8de46 Allow `mailer` to reconnect to server. (#2101)
The `MailerImpl` gains a few new fields (`retryBase`, &  `retryMax`). These are used with `core.RetryBackoff` in `reconnect()` to implement exponential backoff in a reconnect attempt loop. Both `expiration-mailer` and `notify-mailer` are modified to add CLI args for these 2 flags and to wire them into the `MailerImpl` via its `New()` constructor.

In `MailerImpl`'s `SendMail()` function it now detects when `sendOne` returns an `io.EOF` error indicating that the server closed the connection unexpectedly. When this case occurs `reconnect()` is invoked. If the reconnect succeeds then we invoke `sendOne` again to try and complete the message sending operation that was interrupted by the disconnect.

For integration testing purposes I modified the `mail-test-srv` to support a `-closeChance` parameter between 0 and 100. This controls what % of `MAIL` commands will result in the server immediately closing the client connection before further processing. This allows us to simulate a flaky mailserver. `test/startservers.py` is modified to start the `mail-test-srv` with a 35% close chance to thoroughly test the reconnection logic during the existing `expiration-mailer` integration tests. I took this as a chance to do some slight clean-up of the `mail-test-srv` code (mostly removing global state).

For unit testing purposes I modified the mailer `TestConnect` test to abstract out a server that can operate similar to `mail-test-serv` (e.g. can close connections artificially).

This is testing a server that **closes** a connection, and not a server that **goes away/goes down**. E.g. the `core.RetryBackoff` sleeps themselves are not being tested. The client is disconnected and attempts a reconnection which always succeeds on the first try. To test a "gone away" server would require a more substantial rewrite of the unit tests and the `mail-test-srv`/integration tests. I think this matches the experience we have with MailChimp/Mandril closing long lived connections.
2016-08-15 14:14:49 -07:00
Daniel McCarney a0a12cb297 Fix nil panic with `-dryRun=true` (#2115)
While finalizing the testing for #2101 I noticed that the `notify-mailer` would panic when `-dryRun=true` (e.g. the default value):

```
E150646 notify-mailer [AUDIT] Panic caused by err: runtime error: invalid memory address or nil pointer dereference
E150646 notify-mailer [AUDIT] Stack Trace (Current frame) goroutine 1 [running]:
github.com/letsencrypt/boulder/log.(*impl).AuditPanic(0xc820167610)
	/home/daniel/go/src/github.com/letsencrypt/boulder/log/log.go:190 +0x190
panic(0x9804a0, 0xc82000e120)
	/usr/local/go/src/runtime/panic.go:443 +0x4e9
github.com/letsencrypt/boulder/metrics.(*StatsdScope).Inc(0x0, 0xa9ad30, 0x11, 0x1, 0x0, 0x0)
	/home/daniel/go/src/github.com/letsencrypt/boulder/metrics/scope.go:68 +0xd5
github.com/letsencrypt/boulder/mail.(*MailerImpl).SendMail(0xc8201dc360, 0xc8203d5150, 0x1, 0x1, 0x7ffeda89737c, 0x1a, 0xc8201f0000, 0x517, 0x0, 0x0)
	/home/daniel/go/src/github.com/letsencrypt/boulder/mail/mailer.go:290 +0x108
main.(*mailer).run(0xc820175ea8, 0x0, 0x0)
	/home/daniel/go/src/github.com/letsencrypt/boulder/cmd/notify-mailer/main.go:108 +0x362
main.main()
	/home/daniel/go/src/github.com/letsencrypt/boulder/cmd/notify-mailer/main.go:365 +0x142e
```

This was caused by the `NewDryRun` constructor not initializing the `stats` member of the `MailerImpl` and is fixed in this commit.
2016-08-11 11:29:40 -07:00
Daniel McCarney b16585be5d `notify-mailer` monitor progress (#2046)
This PR adds a `printStatus` function that is called every iteration of the mailer's `run()` loop. The status output is logged at the `info` level and includes the destination email, the current message being sent, the total number of messages to send, and the elapsed time since `run()` started.

The status output can be disabled by lowering the default syslog level in the `notify-mailer` config.

Additionally, this PR adds stats support for the mailer package. Three new stats are
published during the `MailerImpl`'s `SendMail` function (called in a loop by the mailer utilities):
  `Mailer.SendMail.Attempts`
  `Mailer.SendMail.Successes`
  `Mailer.SendMail.Errors`

This PR removes two stats from the `expiration-mailer` that are redundant copies of the above:
  `Mailer.Expiration.Errors.SendingNag.SendFailure`
  `Mailer.Expiration.Sent`

This resolves #2026.
2016-07-26 11:26:08 -04:00
Ben Irving 77e64fef79 Disallow non-ASCII email addresses (#1953)
This PR, adds a check in registration authority for non-ASCII encoded characters in an email address. This is due to a 'funky email implementation'.

Fixes #1350
2016-06-21 17:53:38 -07:00
Daniel McCarney 6bd97b2a6b Adds `notify-mailer` command. (#1936)
This commit adds a new notify-mailer command. Outside of the new command, this PR also:

Adds a new SMTPConfig to cmd/config.go that is shared between the expiration mailer and the notify mailer.
Modifies mail/mailer.go to add an smtpClient interface.
Adds a dryRunClient to mail/mailer.go that implements the smtpClient interface.
Modifies the mail/mailer.go MailerImpl and constructor to use the SMTPConfig and a dialer. The missing functions from the smtpClient interface are added.
The notify-mailer command supports checkpointing through --start and --end parameters. It supports dry runs by using the new dryRunClient from the mail package when given the --dryRun flag. The speed at which emails are sent can be tweaked using the --sleep flag.

Unit tests for notify-mailer's checkpointing behaviour, the checkpoint interval/sleep parameter sanity, the sleep behaviour, and the message content construction are included in main_test.go.

Future work:

A separate command to generate the list of destination emails provided to notify-mailer
Support for using registration IDs as input and resolving the email address at runtime.
Resolves #1928. Credit to @jsha for the initial work - I'm just completing the branch he started.

* Adds `notify-mailer` command.
* Adds a new SMTPConfig to `cmd/config.go` that is shared between the
expiration mailer and the notify mailer.
* Modifies `mail/mailer.go` to add an `smtpClient` interface.
* Adds a `dryRunClient` to `mail/mailer.go` that implements the
  `smtpClient` interface.
* Modifies the `mail/mailer.go` `MailerImpl` and constructor to use the
  SMTPConfig and a dialer. The missing functions from the `smtpClient`
  interface are added.
* Fix errcheck warnings
* Review feedback
* Review feedback pt2
* Fixes #1446 - invalid message-id generation.
* Change -configFile to -config
* Test message ID with friendly email

https://github.com/letsencrypt/boulder/pull/1936
2016-06-16 15:12:31 -07:00
Roland Bracewell Shoemaker 54573b36ba Remove all stray copyright headers and appends the initial line to LICENSE.txt (#1853) 2016-05-31 12:32:04 -07:00
Jacob Hoffman-Andrews e6c17e1717 Switch to new vendor style (#1747)
* Switch to new vendor style.

* Fix metrics generate command.

* Fix miekg/dns types_generate.

* Use generated copies of files.

* Update miekg to latest.

Fixes a problem with `go generate`.

* Set GO15VENDOREXPERIMENT.

* Build in letsencrypt/boulder.

* fix travis more.

* Exclude vendor instead of godeps.

* Replace some ...

* Fix unformatted cmd

* Fix errcheck for vendorexp

* Add GO15VENDOREXPERIMENT to Makefile.

* Temp disable errcheck.

* Restore master fetch.

* Restore errcheck.

* Build with 1.6 also.

* Match statsd.*"

* Skip errcheck unles Go1.6.

* Add other ignorepkg.

* Fix errcheck.

* move errcheck

* Remove go1.6 requirement.

* Put godep-restore with errcheck.

* Remove go1.6 dep.

* Revert master fetch revert.

* Remove -r flag from godep save.

* Set GO15VENDOREXPERIMENT in Dockerfile and remove _worskpace.

* Fix Godep version.
2016-04-18 12:51:36 -07:00
Kane York 25b45a45ec Errcheck errors fixed (#1677)
* Fix all errcheck errors
* Add errcheck to test.sh
* Add a new sa.Rollback method to make handling errors in rollbacks easier.
This also causes a behavior change in the VA. If a HTTP connection is
abruptly closed after serving the headers for a non-200 response, the
reported error will be the read failure instead of the non-200.
2016-04-12 16:54:01 -07:00
Kane York 98567efdfc Add integration tests for expiry mailer
This creates a new server, 'mail-test-srv', which is a simplistic SMTP
server that accepts mail and can report the received mail over HTTP.

An integration test is added that uses the new server to test the expiry
mailer.

The FAKECLOCK environment variable is used to force the expiry mailer to
think that the just-issued certificate is about to expire.

Additionally, the expiry mailer is modified to cleanly shut down its
SMTP connections.
2016-03-25 10:02:02 -07:00
Jacob Hoffman-Andrews 556afa3547 Reuse a single connection to SMTP server.
Also, add a Subject config field and use TLS-wrapped SMTP if appropriate.
2016-01-11 15:51:54 -08:00
Jacob Hoffman-Andrews b61c2a7e3a Add a From field to mailer config.
Fixes #1351.
2016-01-07 12:24:51 -08:00
Roland Shoemaker 4e569184d5 Disallow non-ASCII addresses 2016-01-06 17:26:59 -08:00
Roland Shoemaker aa94f07081 Review fixes 2016-01-06 17:13:04 -08:00
Roland Shoemaker 95f5a5b0a0 Strip LF from headers and encode body using MIME quoted-printable 2016-01-06 15:54:40 -08:00
Roland Shoemaker 043bfd8041 Add header/body escaping 2016-01-04 10:39:33 -08:00
Roland Shoemaker cb18c22c0d Use crypto/rand for Message-Id 2015-12-31 23:59:35 +00:00
Roland Shoemaker aacafb7ff5 Add unit test 2015-12-29 20:50:26 +00:00
Roland Shoemaker eb52f02d06 Make expiration mailer RFC 822 compliant (and satisfy SpamAssassin) 2015-12-29 11:54:05 +00:00
Roland Shoemaker 145790d9c3 Review fixes 2015-07-27 12:46:09 -07:00
Roland Shoemaker a69021f918 Add sendWarning test 2015-07-24 14:53:50 -07:00
Roland Shoemaker b5f519d22d Rework how the expiration mailer looks for certificates 2015-07-23 15:33:28 -07:00
J.C. Jones 41f5788c77 Correct most `go lint` warnings. (274 -> 5) 2015-06-16 22:18:28 -05:00
J.C. Jones 42302541bd Run `go fmt` for PR #186 2015-05-18 18:44:38 -07:00
Roland Shoemaker 696c558bc9 cleanups 2015-04-17 17:46:12 -07:00
Roland Shoemaker cc5ab9c465 add empty tests 2015-04-17 17:41:37 -07:00
Roland Shoemaker 72d8862371 add simple mailer 2015-04-17 17:37:23 -07:00