Boulder is fairly noisy about gRPC connection errors. This is a mixed
blessing: Our gRPC configuration will try to reconnect until it hits
an RPC deadline, and most likely eventually succeed. In that case,
we don't consider those to really be errors. However, in cases where
a connection is repeatedly failing, we'd like to see errors in the
logs about connection failure, rather than "deadline exceeded." So
we want to keep logging of gRPC errors.
However, right now we get a lot of these errors logged during
integration tests. They make the output hard to read, and may disguise
more serious errors. So we'd like to avoid causing such errors in
normal integration test operation.
This change reorders the startup of Boulder components by their gRPC
dependencies, so everything's backend is likely to be up and running
before it starts. It also reverses that order for clean shutdowns,
and waits for each process to exit before signalling the next one.
With these changes, I still got connection errors. Taking listenbuddy
out of the gRPC path fixed them. I believe the issue is that
listenbuddy is not a truly transparent proxy. In particular, it
accepts an inbound TCP connection before opening an outbound TCP
connection. If opening that outbound connection results in "connection
refused," it closes the inbound connection. That means gRPC sees a
"connection closed" (or "connection reset"?) rather than "connection
refused". I'm guessing it handles those cases differently, explaining
the different error results.
We've been using listenbuddy to trigger disconnects while Boulder is
running, to ensure that gRPC's reconnect code works. I think we can
probably rely on gRPC's reconnect to work. The initial problem that
led us to start testing this was a configuration problem; now that
we have the configuration we want, we should be fine and don't need
to keep testing reconnects on every integration test run.
Deletes github.com/streadway/amqp and the various RabbitMQ setup tools etc. Changes how listenbuddy is used to proxy all of the gRPC client -> server connections so we test reconnection logic.
+49 -8,221 😁Fixes#2640 and #2562.
Updates #1699.
Adds a new package, `features`, which exposes methods to set and check if various internal features are enabled. The implementation uses global state to store the features so that services embedded in another service do not each require their own features map in order to check if something is enabled.
Requires a `boulder-tools` image update to include `golang.org/x/tools/cmd/stringer`.
The create_db.sh script needs goose to create and fill the database. This is not
available yet, because it is simultaneously being installed in line 10. By moving the create_db.sh call after the `wait`, we make sure goose is available.
* Fix go generate command in metrics.
The previous command only worked on OS X. This one works on Linux but not
OS X.
Also add generate phase of test.sh.
* Add mockgen to test setup.
* Fix github-pr-status output.
* Fix envvar style.
* Set xtrace.
* Fix test.sh
* Fix test.sh some more.
* Fix mockgen command.
* Add dependencies for running `go generate`.
* Add protoc-gen-go.
* Fix go get command.
* Fix generate.
* Wait for all.
* Fix generate.
* Update generated pb.
* Fix generate commands for vendored world.
* Update documentation for new vendor style.
* Update grpc package to latest.
* Update caaChecker proto with latest.
* Run go generate only over TESTPATHS
* See if Travis passes under 1.6
* Switch back to 1.5.
* Trim run command.
* Run stringer from correct directory.
* Move generate command.
* Restore and generate
* Fix path.
* list contents of GOPATH.
* Fix stringer by prebuilding.
* Try another import path.
* regenerate bcode_string.
* remove excess package
* pull jsha fork of protoc-gen-go that echoes
* Echo protoc version.
* install from source
* CD back.
* Go back to normal protoc-gen-go
* Fix path
* Move protobuf install into test/setup.sh
* Move before_install to install.
* Set PATH.
* Follow 301 with curl.
* Shuffle test order.
* Swap back test order.
* Restore all tests.
* Restore 1.5.3 to Travis.
* Remove unnecessary wait-or-exit
* Generate metrics mock with latest mockgen.
* Wrap TESTPATHS in curlies
* Remove spurious bracket
* Fix all errcheck errors
* Add errcheck to test.sh
* Add a new sa.Rollback method to make handling errors in rollbacks easier.
This also causes a behavior change in the VA. If a HTTP connection is
abruptly closed after serving the headers for a non-200 response, the
reported error will be the read failure instead of the non-200.
This followup for #1639 replaces localhost with boulder-rabbitmq when tests run rabbitmq-setup.
Also fixed log message to point to the server name, not 0.0.0.0, when notifying about trying to connect.
This accomplishes two things:
- setup.sh should now be usable by the client integration test.
- setup.sh can be used by new project members to simplify first setup.
Update the README to indicate the new file, and to correct some out-of-date
information.