Commit Graph

34 Commits

Author SHA1 Message Date
Samantha Frank 6021d4b47d
docker: Update image to Ubuntu 24.04 (#8128)
#8109 updated CI to use 24.04 runners, now update the Docker image to
build 24.04 and CI to use it.

Build fixes:
- Unpin mariadb-client-core, 10.3 is no longer provided in 24.04 apt
repositories
- Use new pip flag --break-system-packages to comply with PEP 668, which
is now enforced in Python 3.12+

Runtime fixes:
- Start rsyslogd directly due to missing symlink (see:
https://github.com/rsyslog/rsyslog/issues/5611)
- Fix SyntaxWarning: invalid escape sequence '\w' error.
- Replace OpenSSL.crypto.load_certificate with
x509.load_pem_x509_certificate due to
d73d0ed417
2025-04-17 13:41:20 -04:00
Aaron Gable 358bdab8f4
Replace pkilint with pkimetal in CI (#8058)
Replace the bpkilint container with a new bpkimetal container. Update
our custom lint which calls out to that API to speak PKIMetal's (very
similar) protocol instead. Update our zlint custom configuration to
configure this updated lint.

Fixes https://github.com/letsencrypt/boulder/issues/8009
2025-03-12 12:21:40 -07:00
Phil Porada 3caa8988c9
test: Wait for a successful pkilint connection before continuing integration tests (#7574)
I occasionally receive timeouts due to pkilint being unresponsive during
local integration tests. Typically this happens after rebooting my
machine, with no containers previously running due to the reboot, and no
container data in disk/memory cache.

Example timeout
```
16:14:40.485848 3 boulder-ca _PeZ5w0 [AUDIT] Preparing precert failed: issuer=[int rsa b] serial=[7f2ba75acba0b729fc4e1ba5e2f6aacd5921] regID=[1] names=[rand.3ce2c964.xyz] certProfileName=[defaultBoulderCertificateProfile] certProfileHash=[de4c8c8866ed46b1d4af0d79e6b7ecf2d1ea625e26adcbbd3979ececd8fbd05a] err=[tbsCertificate linting failed: failed lint(s): e_pkilint_lint_cabf_serverauth_cert (making POST request to pkilint API: Post "http://10.77.77.9/certificate/cabf-serverauth": context deadline exceeded)]
```
2024-07-09 12:38:44 -04:00
Samantha 124c4cc6f5
grpc/sa: Implement deep health checks (#6928)
Add the necessary scaffolding for deep health checking of our various
gRPC components. Each component implementation that also implements the
grpc.checker interface will be checked periodically, and the health
status of the component will be updated accordingly.

Add the necessary methods to SA to implement the grpc.checker interface
and register these new health checks with Consul.

Additionally:
- Update entry point script to check for ProxySQL readiness.
- Increase the poll rate for gRPC Consul checks from 5s to 2s to help
with DNS failures, due to check failures, on startup.
- Change log level for Consul from INFO to ERROR to deal with noisy logs
full of transport failures due to Consul gRPC checks firing before the
SAs are up.

Fixes #6878
Part of #6795
2023-06-12 13:58:53 -04:00
Samantha 5c49231ea6
ROCSP: Remove support for Redis Cluster (#6645)
Fixes #6517
2023-02-09 17:14:37 -05:00
Jacob Hoffman-Andrews 8b9ed777d1
entrypoint: fix quoting (#6178)
Expanding `$@` means that if a positional parameter has an internal
space, e.g. "foo bar", it will be split into two positional parameters
in the resulting command, e.g. "foo" "bar". Expanding `"$@"` ensures
that such parameters are quoted during expansion, so we still get
"foo bar" in the exec command, which is always what we wanted.
2022-06-17 15:52:49 -07:00
Jacob Hoffman-Andrews 3d0a818bef
Quiet the output of wait-for-it (#5775)
When wait-for-it is trying to connect and failing, bash emits errors on
stderr. This captures those errors and sends them to /dev/null.

This also replaces an internal wait_tcp_port function inside
entrypoint.sh with a call to wait-for-it.sh.
2021-11-05 11:38:20 -07:00
Jacob Hoffman-Andrews 7fab32a000
Add rocsp-tool to manually store OCSP responses in Redis (#5758)
This is a sort of proof of concept of the Redis interaction, which will
evolve into a tool for inspection and manual repair of missing entries,
if we find ourselves needing to do that.

The important bits here are rocsp/rocsp.go and
cmd/rocsp-tool/main.go. Also, the newly-vendored Redis client.
2021-11-02 11:04:03 -07:00
Roland Bracewell Shoemaker 7673f02803
Use cmd/ceremony in integration tests (#4832)
This ended up taking a lot more work than I expected. In order to make the implementation more robust a bunch of stuff we previously relied on has been ripped out in order to reduce unnecessary complexity (I think I insisted on a bunch of this in the first place, so glad I can kill it now).

In particular this change:

* Removes bhsm and pkcs11-proxy: softhsm and pkcs11-proxy don't play well together, and any softhsm manipulation would need to happen on bhsm, then require a restart of pkcs11-proxy to pull in the on-disk changes. This makes manipulating softhsm from the boulder container extremely difficult, and because of the need to initialize new on each run (described below) we need direct access to the softhsm2 tools since pkcs11-tool cannot do slot initialization operations over the wire. I originally argued for bhsm as a way to mimic a network attached HSM, mainly so that we could do network level fault testing. In reality we've never actually done this, and the extra complexity is not really realistic for a handful of reasons. It seems better to just rip it out and operate directly on a local softhsm instance (the other option would be to use pkcs11-proxy locally, but this still would require manually restarting the proxy whenever softhsm2-util was used, and wouldn't really offer any realistic benefit).
* Initializes the softhsm slots on each integration test run, rather than when creating the docker image (this is necessary to prevent churn in test/cert-ceremonies/generate.go, which would need to be updated to reflect the new slot IDs each time a new boulder-tools image was created since slot IDs are randomly generated)
* Installs softhsm from source so that we can use a more up to date version (2.5.0 vs. 2.2.0 which is in the debian repo)
* Generates the root and intermediate private keys in softhsm and writes out the root and intermediate public keys to /tmp for use in integration tests (the existing test-{ca,root} certs are kept in test/ because they are used in a whole bunch of unit tests. At some point these should probably be renamed/moved to be more representative of what they are used for, but that is left for a follow-up in order to keep the churn in this PR as related to the ceremony work as possible)
Another follow-up item here is that we should really be zeroing out the database at the start of each integration test run, since certain things like certificates and ocsp responses will be signed by a key/issuer that is no longer is use/doesn't match the current key/issuer.

Fixes #4832.
2020-06-03 15:20:23 -07:00
Jacob Hoffman-Andrews ca47151148
Fix test pubkey files. (#4826)
These files are in PEM, so name them as such. Also, they had extra DER
concatenated at the end; remove it.
2020-05-27 12:30:47 -07:00
Jacob Hoffman-Andrews aad43e4688
Fix entrypoint.sh / docker-compose up. (#4747)
We no longer use virtualenv; we just install our Python dependencies
globally.
2020-04-07 11:35:42 -07:00
Jacob Hoffman-Andrews f9a8e744b7 Update pkcs11key to v4 (#4602)
This is a breaking API change: pkcs11key now takes as input a public key rather than
a private key label. In order to find the private key, it first finds the public key's CKA_ID
in the token, then looks for a private key with the same CKA_ID. From ftp://ftp.rsasecurity.com/pub/pkcs/pkcs-11/v2-30/pkcs-11v2-30b-d6.pdf:

> The CKA_ID field is intended to distinguish among multiple keys. In
the case of public and private keys, this field assists in handling
multiple keys held by the same subject; the key identifier for a
public key and its corresponding private key should be the same.

This does require that both the public key and private key are present and have
appropriate CKA_IDs set. I've verified this is the case in prod. In our integration
testing environment it was not the case, so I've tweaked entrypoint.sh to load
public keys into SoftHSM and set their CKA_ID.

The initial part of this change was written by @cpu. I've reviewed and approved
those commits.
2019-12-09 10:03:33 -08:00
Jacob Hoffman-Andrews 1146eecac3 integration: use python3 (#4582)
Python 2 is over in 1 month 4 days: https://pythonclock.org/

This rolls forward most of the changes in #4313.

The original change was rolled back in #4323 because it
broke `docker-compose up`. This change fixes those original issues by
(a) making sure `requests` is installed and (b) sourcing a virtualenv
containing the `requests` module before running start.py.

Other notable changes in this:
 - Certbot has changed the developer instructions to install specific packages
rather than rely on `letsencrypt-auto --os-packages-only`, so we follow suit.
 - Python3 now has a `bytes` type that is used in some places that used to
provide `str`, and all `str` are now Unicode. That means going from `bytes` to
`str` and back requires explicit `.decode()` and `.encode()`.
 - Moved from urllib2 to requests in many places.
2019-11-28 09:54:58 -05:00
Roland Bracewell Shoemaker ef02f513d9 Fix wait_tcp_port in test/entrypoint.sh (#3763)
* If loop hits max exit with 1

* Increase timeout
2018-06-15 08:45:30 -04:00
Jacob Hoffman-Andrews dbcb16543e Start using multiple-IP hostnames for load balancing (#3687)
We'd like to start using the DNS load balancer in the latest version of gRPC. That means putting all IPs for a service under a single hostname (or using a SRV record, but we're not taking that path). This change adds an sd-test-srv to act as our service discovery DNS service. It returns both Boulder IP addresses for any A lookup ending in ".boulder". This change also sets up the Docker DNS for our boulder container to defer to sd-test-srv when it doesn't know an answer.

sd-test-srv doesn't know how to resolve public Internet names like `github.com`. Resolving public names is required for the `godep-restore` test phase, so this change breaks out a copy of the boulder container that is used only for `godep-restore`.

This change implements a shim of a DNS resolver for gRPC, so that we can switch to DNS-based load balancing with the currently vendored gRPC, then when we upgrade to the latest gRPC we won't need a simultaneous config update.

Also, this change introduces a check at the end of the integration test that each backend received at least one RPC, ensuring that we are not sending all load to a single backend.
2018-05-23 09:47:14 -04:00
Jacob Hoffman-Andrews a4421ae75b Run gRPC backends on multiple IPs instead of multiple ports (#3679)
We're currently stuck on gRPC v1.1 because of a breaking change to certificate validation in gRPC 1.8. Our gRPC balancer uses a static list of multiple hostnames, and expects to validate against those hostnames. However gRPC expects that a service is one hostname, with multiple IP addresses, and validates all those IP addresses against the same hostname. See grpc/grpc-go#2012.

If we follow gRPC's assumptions, we can rip out our custom Balancer and custom TransportCredentials, and will probably have a lower-friction time in general.

This PR is the first step in doing so. In order to satisfy the "multiple IPs, one port" property of gRPC backends in our Docker container infrastructure, we switch to Docker's user-defined networking. This allows us to give the Boulder container multiple IP addresses on different local networks, and gives it different DNS aliases in each network.

In startservers.py, each shard of a service listens on a different DNS alias for that service, and therefore a different IP address. The listening port for each shard of a service is now identical.

This change also updates the gRPC service certificates. Now, each certificate that is used in a gRPC service (as opposed to something that is "only" a client) has three names. For instance, sa1.boulder, sa2.boulder, and sa.boulder (the generic service name). For now, we are validating against the specific hostnames. When we update our gRPC dependency, we will begin validating against the generic service name.

Incidentally, the DNS aliases feature of Docker allows us to get rid of some hackery in entrypoint.sh that inserted entries into /etc/hosts.

Note: Boulder now has a dependency on the DNS aliases feature in Docker. By default, docker-compose run creates a temporary container and doesn't assign any aliases to it. We now need to specify docker-compose run --use-aliases to get the correct behavior. Without --use-aliases, Boulder won't be able to resolve the hostnames it wants to bind to.
2018-05-07 10:38:31 -07:00
Jacob Hoffman-Andrews 84692841fb Remove references to buser in Dockerfile. (#3152)
We originally planned to run as a non-root user in our Docker setup. We haven't
done that yet, so let's clean up the detritus until we do.
2017-10-06 09:36:44 -04:00
Phil Porada 61b246000f Log when the boulder container connects to the database container (#2847)
Added a log message for when the boulder container can successfully talk to the database container
2017-07-07 11:31:05 -04:00
Roland Bracewell Shoemaker a46d30945c Purge remaining AMQP code (#2648)
Deletes github.com/streadway/amqp and the various RabbitMQ setup tools etc. Changes how listenbuddy is used to proxy all of the gRPC client -> server connections so we test reconnection logic.

+49 -8,221 😁

Fixes #2640 and #2562.
2017-04-04 15:02:22 -07:00
Jacob Hoffman-Andrews 0a367962d6 Make restarting boulder in docker nicer. (#2492)
* Make restarting boulder in docker nicer.

Handle SIGTERM in startservers.py.
Forcibly remove rsyslog pid to avoid error.

* Add explanatory comment.

* Send SIGTERM instead of kill.

* Further improvements.

- Handle SIGINT too.
- Use unbuffered mode for Python so the print statements (like "all servers
  running") get printed right away rather than at shutdown
- Squelch an unnecessary OSError about interrupting the wait() call.
2017-01-13 11:55:28 -05:00
Jacob Hoffman-Andrews 0c665b2053 Split up gRPC certificates by service. (#2453)
Previously, all gRPC services used the same client and server certificates. Now,
each service has its own certificate, which it uses for both client and server
authentication, more closely simulating production.

This also adds aliases for each of the relevant hostnames in /etc/hosts. There
may be some issues if Docker decides to rewrite /etc/hosts while Boulder is
running, but this seems to work for now.
2016-12-29 14:53:59 -08:00
Jacob Hoffman-Andrews 87fee12d6c Improve single-ocsp command (#2181)
Output base64-encoded DER, as expected by ocsp-responder.
Use flags instead of template for Status, ThisUpdate, NextUpdate.
Provide better help.
Remove old test (wasn't run automatically).
Add it to integration test, and use its output for integration test of issuer ocsp-responder.

Add another slot to boulder-tools HSM image, to store root key.
2016-09-15 15:28:54 -07:00
Daniel McCarney 134b905574 Add DER form of test-ca key in-tree. (#2041)
The PKCS11 proxy requires `test/test-ca.key.pem` in DER form. Rather
than generating it when it doesn't exist in `test/entrypoint.sh` and
adding it to the gitignore we've opted to check it in directly.
2016-07-12 09:06:59 -07:00
Roland Bracewell Shoemaker a0a9623cb6 Switch to using SoftHSM in Docker for testing (#1920)
Instead of reading the CA key from a file on disk into memory and using that for signing in `boulder-ca` this patch adds a new Docker container that runs SoftHSM and pkcs11-proxy in order to hold the key and perform signing operations. The pkcs11-proxy module is used by `boulder-ca` to talk to the SoftHSM container.

This exercises (almost) the full pkcs11 path through boulder and will allow testing various HSM related failures in the future as well as simplifying tuning signing performance for benchmarking.

Fixes #703.
2016-07-11 11:20:51 -07:00
Jacob Hoffman-Andrews 71e4af43f7 Roll forward "Run Travis tests in Docker (#1830)" (#1838)
That change broke the certbot tests because it switched to a MariaDB
10.1-specific syntax. certbot/certbot#3058 changes the certbot tests to use
Boulder's docker-compose.yml, so they will get MariaDB 10.1 automatically.
2016-05-24 15:11:22 -07:00
Jacob Hoffman-Andrews b954dcc010 Revert "Run Travis tests in Docker (#1830)" (#1834)
This reverts commit 92d94f2 and commit 0b4623f to unbreak the Certbot build.
2016-05-20 15:57:10 -07:00
Jacob Hoffman-Andrews 92d94f2558 Run Travis tests in Docker (#1830)
* MariaDB 10.1

* MariaDB 10.1 in Docker

* Run docker stuff.

* Improve test.js error.

* Lower log level

* Revert dockerfile to master

* Export debug ports, set FAKE_DNS, and remove container_name.

* Remove typo.

* Make integration-test.py wait for debug ports.

* Use 10.1 and export more Boulder ports.

* Test updates for Docker

Listen on 0.0.0.0 for utility servers.
Make integration-test.py just wait for ports rather than calling startservers.
Run docker-compose in test.sh.
Remove bypass when database exists.
Separate mailer test into its own function in integration test.
Print better errors in test.js.

* Always bring up mysql container.

* Wait for MySQL to come up.

* Put it in travis-before-install.

* Use 127

* Remove manual docker-up.

* Add ifconfig

* Switch to docker-compose run

* It works!

* Remove some spurious env vars.

* Add bash

* try running it

* Add all deps.

* Pass through env.

* Install everything in the Dockerfile.

* Fix install of ruby

* More improvements

* Revert integration test to run directly
Also remove .git from dockerignore and add some packages.

* Revert integration-test.py to master.

* Stop ignoring test/js

* Start from boulder-tools.

* Add boulder-tools.

* Tweak travis.yml

* Separate out docker-compose pull as install.

* Build in install phase; don't bother with go install in Dockerfile

* Add virtualenv

* Actually build rabbitmq-setup

* Remove FAKE_DNS

* Trivial change

* Pull boulder-tools as a separate step so it gets its own timing info.

* Install certbot and protobuf from repos.

* Use cerbot from debian backports.

* Fix clone

* Remove CERTBOT_PATH

* Updates

* Go back to letsencrypt for build.sh

* Remove certbot volume.

* go back to preinstalled letsencrypt

* Restore ENV

* Remove BASH_ENV

* Adapt reloader test so it psses when run as root.

* Fixups for review.

* Revert test.js

* Revert startservers.py

* Revert Makefile.
2016-05-19 16:29:45 -07:00
Igor Bukanov 75134fc83f Speed up docker build (#1716)
Make COPY and compilation the last commands in the Dockerfile so in the common case Docker will cache results of EXPOSE, WORKDIR and ENV commands. The CMD is eliminated as entrypoint.sh now defaults to start.py if no arguments are given. The patch eliminates setting MYSQL_CONTAINER in run-docker.sh and docker-compose.yaml as entrypoint.sh sets the variable on its own when calling create_db.sh.

In addition the patch passes arguments passed to run-docker.sh as arguments to the entryscript.sh in the container. This way running `./run-docker.sh ./test.sh ...` allows to execute tests locally.
2016-04-08 09:58:50 -07:00
Igor Bukanov dfc1f9465c Switch rabbitmq-setup to connect to boulder-rabbitmq
This followup for #1639 replaces localhost with boulder-rabbitmq when tests run rabbitmq-setup.
Also fixed log message to point to the server name, not 0.0.0.0, when notifying about trying to connect.
2016-04-05 11:48:45 -07:00
Jacob Hoffman-Andrews d98eb634d1 Docker improvements.
Use bridged networking.

Add some files to .dockerignore to shrink the build state sent to Docker
daemon.

Use specific hostnames to contact services, rather than localhost.

Add instructions for adding those hostnames to /etc/hosts in non-Docker config.

Use DSN-style connect strings for DBs.

Remove localhost / 127.0.0.1 rewrite hack from create_db.sh.

Add hosts section with new hostnames.

Remove bin from .dockerignore.

SQL grants go to %

Short-circuit DB creation if already existing.

Make `go install` a part of Docker image build so that Docker run is much
faster.

Bind to 0.0.0.0 for OCSP responders so they can be reached from host, and
publish / expose their ports.

Remove ToSServerThread and test.js' fetch of ToS.

Increase the registrationsPerIP rate limit threshold. When issuing from a Docker
host, the 127.0.0.1 override doesn't apply, so the limit is quickly hit.

Update docker-compose for bridged networking. Note: docker-compose doesn't currently work, but should be close.

https://github.com/letsencrypt/boulder/pull/1639
2016-04-04 16:05:08 -07:00
Roland Bracewell Shoemaker 32e9e44906 Remove activity-monitor from the tree
* Axe boulder-am
* Also remove the analysis subpackage and references to it, and remove routingKey from rpc/connection.go

https://github.com/letsencrypt/boulder/pull/1682
2016-04-04 12:19:17 -07:00
Reinaldo de Souza Jr 673f927d85 Initialize rabbitmq in Docker entrypoint 2016-01-04 21:10:48 -05:00
Jessica Frazelle a2632fa155
change run-docker.sh to use bash not docker-compose
Signed-off-by: Jessica Frazelle <acidburn@docker.com>
2015-11-02 09:50:25 -08:00
Jessica Frazelle 4d81e3090d
Cleanup of Docker Dev Playground
- Separated RabbitMq into it's own container
- some various Dockerfile-isms cleanup
- updated routes to linked containers
- removed nodejs, I have not been able to figure out why it was being installed
    (so this could be something that is actually needed)

To setup a dev environment:

You now need `docker-compose`, but running the setup with all the
configurations is as simple as:

```
$ docker-compose build
$ docker-compose up
```

Then you can even run the `test.sh` in the container with:

```
$ docker exec -it boulder_boulder_1 bash
root@container $ ./test.sh
```

This is just an _initial_ first pass at refactoring a bunch of this. There is
a bunch more I want to change and make better.

Also with regard to database migration taking awhile I want to try and move
the goose stuff over to the mariadb container, there is just some less savory
things I don't like about starting the db in the background then running the
migration script :/, I like to attach to the process on container start. I do
have some thoughts on a `docker exec` command in the mariadb container which
migrates the db... but trying to think of something better.

Signed-off-by: Jessica Frazelle <acidburn@docker.com>
2015-10-21 12:47:29 -07:00