Commit Graph

39 Commits

Author SHA1 Message Date
Aaron Gable 146b78a0f7
Remove all static minica keys (#7489)
Remove the redis-tls, wfe-tls, and mail-test-srv keys which were
generated by minica and then checked in to the repo. All three are
replaced by the dynamically-generated ipki directory.

Part of https://github.com/letsencrypt/boulder/issues/7476
2024-05-17 11:45:40 -07:00
Aaron Gable 9c197e1f43
Use io and os instead of deprecated ioutil (#6286)
The iotuil package has been deprecated since go1.16; the various
functions it provided now exist in the os and io packages. Replace all
instances of ioutil with either io or os, as appropriate.
2022-08-10 13:30:17 -07:00
Jacob Hoffman-Andrews 76f987a1df
Reland "Allow expiration mailer to work in parallel" (#6133)
This reverts commit 7ef6913e71.

We turned on the `ExpirationMailerDontLookTwice` feature flag in prod, and it's
working fine but not clearing the backlog. Since
https://github.com/letsencrypt/boulder/pull/6100 fixed the issue that caused us
to (nearly) stop sending mail when we deployed #6057, this should be safe to
roll forward.

The revert of the revert applied cleanly, except for expiration-mailer/main.go
and `main_test.go`, particularly around the contents `processCerts` (where
`sendToOneRegID` was extracted from) and `sendToOneRegID` itself. So those areas
are good targets for extra attention.
2022-05-23 16:16:43 -07:00
Aaron Gable 7ef6913e71
Revert "Allow expiration mailer to work in parallel" (#6080)
When deployed, the newly-parallel expiration-mailer encountered
unexpected difficulties and dropped to apparently sending nearly zero
emails despite not throwing any real errors. Reverting the parallelism
change until we understand and can fix the root cause.

This reverts two commits:
- Allow expiration mailer to work in parallel (#6057)
- Fix data race in expiration-mailer test mocks (#6072) 

It also modifies the revert to leave the new `ParallelSends` config key
in place (albeit completely ignored), so that the binary containing this
revert can be safely deployed regardless of config status.

Part of #5682
2022-05-03 13:18:40 -07:00
Jacob Hoffman-Andrews 9629c88d66
Allow expiration mailer to work in parallel (#6057)
Previously, each accounts email would be sent in serial,
along with several reads from the database (to check for
certificate renewal) and several writes to the database (to update
`certificateStatus.lastExpirationNagSent`). This adds a config field
for the expiration mailer that sets the parallelism it will use.

That means making and using multiple SMTP connections as well. Previously,
`bmail.Mailer` was not safe for concurrent use. It also had a piece of
API awkwardness: after you created a Mailer, you had to call Connect on
it to change its state.

Instead of treating that as a state change on Mailer, I split out a
separate component: `bmail.Conn`. Now, when you call `Mailer.Connect()`,
you get a Conn. You can send mail on that Conn and Close it when you're
done. A single Mailer instance can produce multiple Conns, so Mailer is
now concurrency-safe (while Conn is not).

This involved a moderate amount of renaming and code movement, and
GitHub's move detector is not keeping up 100%, so an eye towards "is
this moved code?" may help. Also adding `?w=1` to the diff URL to ignore
whitespace diffs.
2022-04-21 18:04:55 -07:00
Aaron Gable 305ef9cce9
Improve error checking paradigm (#5920)
We have decided that we don't like the if err := call(); err != nil
syntax, because it creates confusing scopes, but we have not cleaned up
all existing instances of that syntax. However, we have now found a
case where that syntax enables a bug: It caused readers to believe that
a later err = call() statement was assigning to an already-declared err
in the local scope, when in fact it was assigning to an
already-declared err in the parent scope of a closure. This caused our
ineffassign and staticcheck linters to be unable to analyze the
lifetime of the err variable, and so they did not complain when we
never checked the actual value of that error.

This change standardizes on the two-line error checking syntax
everywhere, so that we can more easily ensure that our linters are
correctly analyzing all error assignments.
2022-02-01 14:42:43 -07:00
Samantha 6f9b77a515
mail: Fix data race in unit tests (#5491)
Fix data race introduced in #5461.

- Remove logging of `err` after test case has already returned

Fixes #5466
2021-06-17 10:43:02 -07:00
Samantha 401d862354
mail: Rename RecoverableSMTPError to BadAddressSMTPError (#5479)
Rename `RecoverableSMTPError` to `BadAddressSMTPError`. The former
implies that an operation resulting in this error can be retried.
2021-06-15 11:04:56 -07:00
Samantha a27054132b
mail: Reconnect when connection reset by peer (#5461) 2021-06-07 13:30:44 -06:00
Aaron Gable 294d1c31d7
Use error wrapping for berrors and tests (#5169)
This change adds two new test assertion helpers, `AssertErrorIs`
and `AssertErrorWraps`. The former is a wrapper around `errors.Is`,
and asserts that the error's wrapping chain contains a specific (i.e.
singleton) error. The latter is a wrapper around `errors.As`, and
asserts that the error's wrapping chain contains any error which is
of the given type; it also has the same unwrapping side effect as
`errors.As`, which can be useful for further assertions about the
contents of the error.

It also makes two small changes to our `berrors` package, namely
making `berrors.ErrorType` itself an error rather than just an int,
and giving `berrors.BoulderError` an `Unwrap()` method which
exposes that inner `ErrorType`. This allows us to use the two new
helpers above to make assertions about berrors, rather than
having to hand-roll equality assertions about their types.

Finally, it takes advantage of the two changes above to greatly
simplify many of the assertions in our tests, removing conditional
checks and replacing them with simple assertions.
2020-11-06 13:17:11 -08:00
Samantha 20bfc65c32
mailer: replacing error assertions with errors.As (#5123)
errors.As checks for a specific error in a wrapped error chain
(see https://golang.org/pkg/errors/#As) as opposed to asserting
that an error is of a specific type.

Part of #5010
2020-10-13 17:34:17 -07:00
Jacob Hoffman-Andrews 3a1a08a10b
Remove unused code. (#4722)
Found by staticcheck.
2020-03-27 11:55:42 -07:00
Roland Bracewell Shoemaker 5b2f11e07e Switch away from old style statsd metrics wrappers (#4606)
In a handful of places I've nuked old stats which are not used in any alerts or dashboards as they either duplicate other stats or don't provide much insight/have never actually been used. If we feel like we need them again in the future it's trivial to add them back.

There aren't many dashboards that rely on old statsd style metrics, but a few will need to be updated when this change is deployed. There are also a few cases where prometheus labels have been changed from camel to snake case, dashboards that use these will also need to be updated. As far as I can tell no alerts are impacted by this change.

Fixes #4591.
2019-12-18 11:08:25 -05:00
Roland Bracewell Shoemaker 5d6196e228
Fix mailer error bug (#4042)
This fixes two bugs:
1. `resetAndError` would be called on every error, including `io.EOF`, which is returned when the connection is terminated. Calling `m.client.Reset()` after a `io.EOF` will result in another error, causing us to wrap the `io.EOF` with a `errors.errorString`. This broke a check in `sendMail` that was used to cause a reconnect.
2. There was a error type cast that assumed the type without checking it, which could result in a panic when an error of the unexpected type was returned.
2019-02-07 20:25:08 -08:00
Roland Bracewell Shoemaker 064001203b Continue work on more SMTP errors (#4039)
Instead of just on 401. Pulled the various error codes from a handful of SMTP docs I
could find, they could probably use a second once over by others though.
2019-01-28 22:23:25 -08:00
Jacob Hoffman-Andrews fb7b60f42c Always reset the SMTP connection after an error. (#4036)
This prevents problems with "nested MAIL command."

Fixes #3191
2019-01-25 15:08:25 -05:00
Daniel McCarney 1a68cc2225 notify-mailer: warn for bad rcpt, don't exit. (#4022)
Resolves https://github.com/letsencrypt/boulder/issues/4019

I can't find RFC verse and chapter for "401 4.1.3" errors, but [IANA's registry of SMTP enhanced status codes](https://www.iana.org/assignments/smtp-enhanced-status-codes/smtp-enhanced-status-codes.xhtml) does show an entry matching `x.1.3`:
```
X.1.3 | Bad destination mailbox address syntax | 501 | The destination address was syntactically invalid. This can apply to any field in the address. This code is only useful for permanent failures. | [RFC3463] (Standards Track) | G. Vaudreuil | IESG
```
However that entry from IANA says the "associated basic code" is 501, not 401.

Since we wrote this tool to talk to exactly one SMTP server in the world and it definitely is returning "401 4.1.3" in some cases I think its reasonable to handle as I've done in this PR. Alternative suggestions welcome.
2019-01-18 14:14:30 -08:00
Joel Sing f8a023e49c Remove various unnecessary uses of fmt.Sprintf (#3707)
Remove various unnecessary uses of fmt.Sprintf - in particular:

- Avoid calls like t.Error(fmt.Sprintf(...)), where t.Errorf can be used directly.

- Use strconv when converting an integer to a string, rather than using
  fmt.Sprintf("%d", ...). This is simpler and can also detect type errors at
  compile time.

- Instead of using x.Write([]byte(fmt.Sprintf(...))), use fmt.Fprintf(x, ...).
2018-05-11 11:55:25 -07:00
Jacob Hoffman-Andrews 4296dd985a Use TLS in mailer integration tests (#3213)
* Remove non-TLS support from mailer entirely
* Add a config option for trusted roots in expiration-mailer. If unset, it defaults to the system roots, so this does not need to be set in production.
* Use TLS in mail-test-srv, along with an internal root and localhost certificates signed by that root.
2017-11-06 14:57:14 -08:00
Josh Soref 281de6d90e Fix auth line in email test (#2499)
Previously the comment was wrong (left off the leading `\0`), but the base64-encoded `AUTH` line was right.

This change also modified the password from `paswd` to `passwd`.
2017-01-17 09:46:58 -08:00
Daniel McCarney 8efc6342bb Mailer reliability improvements (#2262)
### Connect before sending mail, not at startup

Per #2250 when we connect to the remote SMTP server at start-up time by calling `mailer.Connect()` but do not actually call `mailer.SendMail()` until after we have done some potentially expensive/time-consuming work we are liable to have our connection closed due to timeout.

This PR moves the `Connect()` call in `expiration-mailer` and `notify-mailer` to be closer to where the actual messages are sent via `SendMail()` and resolves #2250 

### Handle SMTP 421 errors gracefully

Issue #2249 describes a case where we see this SMTP error code from the remote server when our connection has been idle for too long. This would manifest when connecting to the remote server at startup, running a very long database query, and then sending mail. This commit allows the mailer to treat SMTP 421 errors as an event that should produce a reconnect attempt and resolves #2249.

A unit test is added to the mailer tests to test that reconnection works when the server sends a SMTP 421 error. Prior to b64e51f and support for SMTP 421 reconnection this test failed in a manner matching issue #2249:

```
go test -p 1 -race --test.run TestReconnectSMTP421
github.com/letsencrypt/boulder/mail
Wrote goodbye msg: 421 1.2.3 green.eggs.and.spam Error: timeout exceeded
Cutting off client early
--- FAIL: TestReconnectSMTP421 (0.00s)
  mailer_test.go:257: Expected SendMail() to not fail. Got err: 421
  1.2.3 green.eggs.and.spam Error: timeout exceeded
  FAIL
  FAIL  github.com/letsencrypt/boulder/mail     0.023s
```

With b64e51f the test passes and the client gracefully reconnects.

The existing reconnect testing logic in the `mail-test-srv` integration tests is changed such that half of the forced disconnects are a normal clean connection close and half are a SMTP 421. This allows the existing integration test for server disconnects to be reused to test the 421 reconnect logic.
2016-10-20 14:10:47 -04:00
Daniel McCarney d58987e087 Allows OS to choose listener port. (#2223)
This commit fixes #2190 by allowing the OS to choose what port the test
Listener's bind to. This allows the tests to be run concurrently
without panicing when they all try to bind the same TCP port.
2016-10-03 15:04:24 -07:00
Roland Bracewell Shoemaker c8f1fb3e2f Remove direct usages of go-statsd-client in favor of using metrics.Scope (#2136)
Fixes #2118, fixes #2082.
2016-09-07 19:35:13 -04:00
Daniel McCarney a584f8de46 Allow `mailer` to reconnect to server. (#2101)
The `MailerImpl` gains a few new fields (`retryBase`, &  `retryMax`). These are used with `core.RetryBackoff` in `reconnect()` to implement exponential backoff in a reconnect attempt loop. Both `expiration-mailer` and `notify-mailer` are modified to add CLI args for these 2 flags and to wire them into the `MailerImpl` via its `New()` constructor.

In `MailerImpl`'s `SendMail()` function it now detects when `sendOne` returns an `io.EOF` error indicating that the server closed the connection unexpectedly. When this case occurs `reconnect()` is invoked. If the reconnect succeeds then we invoke `sendOne` again to try and complete the message sending operation that was interrupted by the disconnect.

For integration testing purposes I modified the `mail-test-srv` to support a `-closeChance` parameter between 0 and 100. This controls what % of `MAIL` commands will result in the server immediately closing the client connection before further processing. This allows us to simulate a flaky mailserver. `test/startservers.py` is modified to start the `mail-test-srv` with a 35% close chance to thoroughly test the reconnection logic during the existing `expiration-mailer` integration tests. I took this as a chance to do some slight clean-up of the `mail-test-srv` code (mostly removing global state).

For unit testing purposes I modified the mailer `TestConnect` test to abstract out a server that can operate similar to `mail-test-serv` (e.g. can close connections artificially).

This is testing a server that **closes** a connection, and not a server that **goes away/goes down**. E.g. the `core.RetryBackoff` sleeps themselves are not being tested. The client is disconnected and attempts a reconnection which always succeeds on the first try. To test a "gone away" server would require a more substantial rewrite of the unit tests and the `mail-test-srv`/integration tests. I think this matches the experience we have with MailChimp/Mandril closing long lived connections.
2016-08-15 14:14:49 -07:00
Daniel McCarney b16585be5d `notify-mailer` monitor progress (#2046)
This PR adds a `printStatus` function that is called every iteration of the mailer's `run()` loop. The status output is logged at the `info` level and includes the destination email, the current message being sent, the total number of messages to send, and the elapsed time since `run()` started.

The status output can be disabled by lowering the default syslog level in the `notify-mailer` config.

Additionally, this PR adds stats support for the mailer package. Three new stats are
published during the `MailerImpl`'s `SendMail` function (called in a loop by the mailer utilities):
  `Mailer.SendMail.Attempts`
  `Mailer.SendMail.Successes`
  `Mailer.SendMail.Errors`

This PR removes two stats from the `expiration-mailer` that are redundant copies of the above:
  `Mailer.Expiration.Errors.SendingNag.SendFailure`
  `Mailer.Expiration.Sent`

This resolves #2026.
2016-07-26 11:26:08 -04:00
Daniel McCarney 6bd97b2a6b Adds `notify-mailer` command. (#1936)
This commit adds a new notify-mailer command. Outside of the new command, this PR also:

Adds a new SMTPConfig to cmd/config.go that is shared between the expiration mailer and the notify mailer.
Modifies mail/mailer.go to add an smtpClient interface.
Adds a dryRunClient to mail/mailer.go that implements the smtpClient interface.
Modifies the mail/mailer.go MailerImpl and constructor to use the SMTPConfig and a dialer. The missing functions from the smtpClient interface are added.
The notify-mailer command supports checkpointing through --start and --end parameters. It supports dry runs by using the new dryRunClient from the mail package when given the --dryRun flag. The speed at which emails are sent can be tweaked using the --sleep flag.

Unit tests for notify-mailer's checkpointing behaviour, the checkpoint interval/sleep parameter sanity, the sleep behaviour, and the message content construction are included in main_test.go.

Future work:

A separate command to generate the list of destination emails provided to notify-mailer
Support for using registration IDs as input and resolving the email address at runtime.
Resolves #1928. Credit to @jsha for the initial work - I'm just completing the branch he started.

* Adds `notify-mailer` command.
* Adds a new SMTPConfig to `cmd/config.go` that is shared between the
expiration mailer and the notify mailer.
* Modifies `mail/mailer.go` to add an `smtpClient` interface.
* Adds a `dryRunClient` to `mail/mailer.go` that implements the
  `smtpClient` interface.
* Modifies the `mail/mailer.go` `MailerImpl` and constructor to use the
  SMTPConfig and a dialer. The missing functions from the `smtpClient`
  interface are added.
* Fix errcheck warnings
* Review feedback
* Review feedback pt2
* Fixes #1446 - invalid message-id generation.
* Change -configFile to -config
* Test message ID with friendly email

https://github.com/letsencrypt/boulder/pull/1936
2016-06-16 15:12:31 -07:00
Roland Bracewell Shoemaker 54573b36ba Remove all stray copyright headers and appends the initial line to LICENSE.txt (#1853) 2016-05-31 12:32:04 -07:00
Jacob Hoffman-Andrews e6c17e1717 Switch to new vendor style (#1747)
* Switch to new vendor style.

* Fix metrics generate command.

* Fix miekg/dns types_generate.

* Use generated copies of files.

* Update miekg to latest.

Fixes a problem with `go generate`.

* Set GO15VENDOREXPERIMENT.

* Build in letsencrypt/boulder.

* fix travis more.

* Exclude vendor instead of godeps.

* Replace some ...

* Fix unformatted cmd

* Fix errcheck for vendorexp

* Add GO15VENDOREXPERIMENT to Makefile.

* Temp disable errcheck.

* Restore master fetch.

* Restore errcheck.

* Build with 1.6 also.

* Match statsd.*"

* Skip errcheck unles Go1.6.

* Add other ignorepkg.

* Fix errcheck.

* move errcheck

* Remove go1.6 requirement.

* Put godep-restore with errcheck.

* Remove go1.6 dep.

* Revert master fetch revert.

* Remove -r flag from godep save.

* Set GO15VENDOREXPERIMENT in Dockerfile and remove _worskpace.

* Fix Godep version.
2016-04-18 12:51:36 -07:00
Kane York 25b45a45ec Errcheck errors fixed (#1677)
* Fix all errcheck errors
* Add errcheck to test.sh
* Add a new sa.Rollback method to make handling errors in rollbacks easier.
This also causes a behavior change in the VA. If a HTTP connection is
abruptly closed after serving the headers for a non-200 response, the
reported error will be the read failure instead of the non-200.
2016-04-12 16:54:01 -07:00
Jacob Hoffman-Andrews 556afa3547 Reuse a single connection to SMTP server.
Also, add a Subject config field and use TLS-wrapped SMTP if appropriate.
2016-01-11 15:51:54 -08:00
Jacob Hoffman-Andrews b61c2a7e3a Add a From field to mailer config.
Fixes #1351.
2016-01-07 12:24:51 -08:00
Roland Shoemaker 4e569184d5 Disallow non-ASCII addresses 2016-01-06 17:26:59 -08:00
Roland Shoemaker aa94f07081 Review fixes 2016-01-06 17:13:04 -08:00
Roland Shoemaker 95f5a5b0a0 Strip LF from headers and encode body using MIME quoted-printable 2016-01-06 15:54:40 -08:00
Roland Shoemaker 043bfd8041 Add header/body escaping 2016-01-04 10:39:33 -08:00
Roland Shoemaker cb18c22c0d Use crypto/rand for Message-Id 2015-12-31 23:59:35 +00:00
Roland Shoemaker aacafb7ff5 Add unit test 2015-12-29 20:50:26 +00:00
Roland Shoemaker 696c558bc9 cleanups 2015-04-17 17:46:12 -07:00
Roland Shoemaker b89934aadb actually add it... 2015-04-17 17:41:50 -07:00