Historically the only database/sql driver setting exposed via JSON
config was maxDBConns. This change adds support for maxIdleConns,
connMaxLifetime, connMaxIdleTime, and renames maxDBConns to
maxOpenConns. The addition of these settings will give our SRE team a
convenient method for tuning the reuse/closure of database connections.
A new struct, DBSettings, has been added to the SA. The struct, and each of
its fields, has been commented.
All new fields have been plumbed through to the relevant Boulder
components and exported as Prometheus metrics. Tests have been
added/modified to ensure that the fields are being set. There should be
no loss in coverage.
Deployability concerns for the migration from maxDBConns to maxOpenConns
have been addressed with the temporary addition of the helper method
cmd.DBConfig.GetMaxOpenConns(). This method can be removed once
test/config is defaulted to using maxOpenConns. Relevant sections of the
code have TODOs added that link back to a newly opened issue.
Fixes#5199
Currently we do not login to DockerHub for CI and are being rate limited
on pulls.
This adds a docker login using environment variables that are set up in our
Travis repository.
The login credentials are a bot username and a purpose-generated DockerHub
CLI token, stored as environment variables that are not viewable once added
to the TravisCI repository settings.
Risks and best practices for managing these secrets can be found at
https://docs.travis-ci.com/user/best-practices-security.
Fixes#5212
The PrecertificateRevocation flag is turned on everywhere, so the
else case is unused code. This change updates the WFE to always
use the PrecertificateRevocation code path, and deprecates the old
feature flag.
The TestRevokeCertificateWithAuthz method was deleted because
it is redundant with TestRevokeCertificateWithAuthorizations.
Fixes#5240
We intend to delete the v1 API (i.e. `wfe` and its associated codepaths)
in the near future, and as such are not giving it new features or
capabilities. However, before then we intend to allow the v2 API to
provide issuance both from our RSA and from our ECDSA intermediates.
The v1 API cannot gain such capability at the same time.
The CA doesn't know which frontend originated any given issuance
request, so we can't simply gate the single- or double-issuer behavior
based on that. Instead, this change introduces the ability for the
WFE (and the RA, which sits between the WFE and the CA) to request
issuance from a specific intermediate. If the specified intermediate is
not available in the CA, issuance will fail. If no intermediate is
specified (as is the case in requests coming from wfe2), it falls back
to selecting the issuer based on the algorithm of the public key to
be signed.
Fixes#5216
Currently the VA checks to see how many redirects have been followed and
bails out if greater than maxRedirect (10), but it does not check to see
if any redirect URL has been followed twice, which would indicate an
infinite redirect loop. Storing the validation records for these is
relatively expensive because we store a record for each hop in the
redirect.
This change checks the previous redirect records to see if the URL has
been used before, and errors if it has. This will catch a redirect loop
earlier than the maxRedirect value in most cases.
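A rough sketch of the duplicate-URL check, assuming a simplified per-hop record type (the VA's real validation record carries more fields):

```go
package main

import "fmt"

// validationRecord is a simplified stand-in for the VA's per-hop record.
type validationRecord struct {
	URL string
}

// checkRedirectLoop errors if the next URL was already followed, catching
// a redirect loop well before all maxRedirect hops are consumed.
func checkRedirectLoop(prev []validationRecord, next string) error {
	for _, rec := range prev {
		if rec.URL == next {
			return fmt.Errorf("redirect loop detected: %q was already followed", next)
		}
	}
	return nil
}
```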
Fixes#5224
This change deletes the (used, but not useful)
`mockSANoSuchRegistration` from the wfe2 tests, and updates all
places where the wfe tests create a mock SA to take the existing
mock SA as input instead of building another one from scratch from
the fakeclock.
Move responsibility for handling the default path (maxLogLen = 0)
from the ocspLogQueue component to CertificateAuthorityImpl. Now the
ocspLogQueue isn't constructed at all unless we configure it to be used.
Also, remove buffer from OCSP audit log chan. The bug this fixes was
hidden by the large buffer, and on reconsideration the large buffer was
unnecessary.
Verified that with the buffer removed, the integration test fails; adding
the fix for the maxLogLen = 0 case then makes the integration test pass.
Fixes#5228
This adds a new component to the CA, ocspLogQueue, which batches up
OCSP generation events for audit logging. It will log accumulated
events when it reaches a certain line length, or when a maximum amount
of time has passed.
Today, when issuing a certificate based on a precertificate, we choose
the issuer based on algorithm: we find the precert's algorithm, look
it up in our table mapping algs to issuers, and then issue the final
cert from that issuer.
It is therefore hypothetically possible for the precertificate and
final certificate to be issued from different issuers, if the mapping
of algs to issuers were updated between precert and final cert issuance.
We don't expect this to be possible given Boulder's architecture, but
it could become possible given sufficient refactoring.
This change updates the lookup method to be based on the IssuerNameID,
which is a truncated hash computed across the whole Name field of the
issuer certificate (or equivalently across the Issuer field of the
issued precertificate). This ensures that final issuance will fail,
rather than have different issuers for the precert and final cert.
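A rough illustration of the lookup, with a hypothetical nameID width and minimal issuer type:

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// issuer is a minimal stand-in for a loaded issuer certificate.
type issuer struct {
	name string
}

// nameID is a truncated hash over the raw DER bytes of a Name field. The
// 20-byte width here is illustrative. A precertificate's RawIssuer and its
// issuer certificate's RawSubject hash to the same ID.
type nameID [20]byte

func computeNameID(rawName []byte) nameID {
	var id nameID
	h := sha256.Sum256(rawName)
	copy(id[:], h[:20])
	return id
}

// issuerFor fails issuance outright rather than ever letting the final
// certificate come from a different issuer than the precertificate.
func issuerFor(byID map[nameID]*issuer, precertRawIssuer []byte) (*issuer, error) {
	iss, ok := byID[computeNameID(precertRawIssuer)]
	if !ok {
		return nil, fmt.Errorf("no issuer found matching precertificate's Issuer field")
	}
	return iss, nil
}
```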
Part of #5216
The /issuer-cert endpoint was a holdover from the v1 API, where
it is a critical part of the issuance flow. In the v2 issuance
flow, the issuer certificate is provided directly in the response
for the certificate itself. Thus, this endpoint is redundant.
Stats show that it receives approximately zero traffic (less than
one request per week, all of which are now coming from wget or
browser useragents). It also complicates the refactoring necessary
for the v2 API to support multiple issuers.
As such, it is a safe and easy decision to remove it.
Fixes#5196
Previously, configuration of the boulder-janitor was split into
two places: the actual json config file (which controlled which
jobs would be enabled, and what their rate limits should be), and
the janitor code itself (which controlled which tables and columns
those jobs should query). This resulted in significant duplicated
code, as most of the jobs were identical except for their table
and column names.
This change abstracts away the query which jobs use to find work.
Instead of having each job type parse its own config and produce
its own work query (in Go code), now each job supplies just a few
key values (the table name and two column names) in its JSON config,
and the Go code assembles the appropriate query from there. We are
able to delete all of the files defining individual job types, and
replace them with a single slightly smarter job constructor.
This enables further refactorings, namely:
* Moving all of the logic code into its own module;
* Ensuring that the exported interface of that module is safe (i.e.
that a client cannot create and run jobs without them being valid,
because the only exposed methods ensure validity);
* Collapsing validity checks into a single location;
* Various renamings.
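The config-driven query assembly described above might look roughly like this (the struct fields, column names, and query shape are hypothetical):

```go
package main

import "fmt"

// jobConfig holds the few values each janitor job now supplies in its
// JSON config; field and column names here are hypothetical.
type jobConfig struct {
	Table      string `json:"table"`
	ExpiresCol string `json:"expiresColumn"`
	IDCol      string `json:"idColumn"`
}

// workQuery assembles the shared find-work query that was previously
// duplicated across per-job Go files.
func workQuery(c jobConfig) string {
	return fmt.Sprintf(
		"SELECT %s FROM %s WHERE %s <= :cutoff AND %s > :startID ORDER BY %s LIMIT :limit",
		c.IDCol, c.Table, c.ExpiresCol, c.IDCol, c.IDCol)
}
```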
The vast majority of Boulder components no longer care about
anything in the common config block. As such, we hope to
remove it entirely in the near future. So let's put the (not-yet-used)
IssuerCerts config item in the main OCSPResponder block,
rather than in the common block.
Part of #5204
It is possible for a CertificateStatus row to have a nil IssuerID
(there was a period of time in which we didn't write IssuerIDs into
CertificateStatus rows at all) but all such rows should be old and
therefore expired.
Unfortunately, this code was checking the IssuerID before it was
checking the Expiry, and therefore was generating panics when trying
to dereference a nil pointer.
This change simply moves the IssuerID check to be after the Expires
check, so that we'll only try to dereference the IssuerID on recent
CertificateStatus rows, where it is guaranteed to be non-nil.
Fixes#5200
When the VA encounters CAA records, it logs the contents of those
records. When those records were the result of following a chain of
CNAMEs, the CNAMEs are included as part of the response from our
recursive resolver. However, the current flow for logging the responses
logs only the CAA records, not the CNAMEs.
This change returns the complete dig-style RR response from bdns to the
va, where the authoritative CAA RR response is string-quoted and logged.
This dig-style RR response is quite verbose; however, it is only ever
returned from bdns.LookupCAA when a CAA response is non-empty. If the CAA
response is empty, only an empty string is returned.
Fixes#5082
This adds a new job type (and corresponding config) to the janitor
to clean up old rows from the `keyHashToSerial` table. Rows in this
table are no longer relevant after their corresponding certificate
has expired.
Fixes#4792
There are various technical requirements on the maximum age of
an OCSP response. Although ocsp-updater has mechanisms to ensure
that all certificates have responses which are sufficiently recent,
there is the possibility of a bug which results in some OCSP
responses escaping its notice.
This change adds a histogram metric to ocsp-responder which collects
the ages (i.e. now minus the `thisUpdate` timestamp) of the OCSP
responses which it serves. The histogram has equal buckets in 12-hour
increments. During normal operation, the first 7 such buckets
(representing ages 0 to 3.5 days) should have roughly equal counts,
while the latter 7 buckets (3.5 to 7 days) should be empty.
Fixes#5080
RabbitMQ was removed in favor of gRPC in 2016. While there have been
many commits to remove the AMQP-related code, this last mention in
start.py persisted.
When making an OCSP request, the client provides three pieces of
information: the URL which it is querying to get OCSP info, the
hash of the issuer public key which issued the cert in question, and
the serial number of the cert in question. In Boulder, the first
of these is only provided implicitly, based on which instance of
ocsp-responder is handling the request: we ensure (via configs)
that the ocsp-responder handling a given OCSP AIA URL has the
corresponding issuer cert loaded in memory.
When handling a request, the ocsp-responder checks three things:
that the request is using SHA1 to hash the issuer public key, that
the requested issuer public key matches one of the loaded issuer
certs, and that the requested serial number is one we could have
issued (i.e. has the correct prefix). It relies on the database
query to filter out requests for non-existent serials.
However, this means that a request to an ocsp-responder instance
with issuer cert A loaded could receive and handle a request which
specifies cert A as the issuer, but names a serial which was actually
issued by issuer cert B. The checks all pass and the database lookup
succeeds. But the returned OCSP response is for a certificate that
was issued by a different issuer, and the response itself was
signed by that other issuer.
In order to resolve this potentially confusing situation, this change
adds one additional check to the ocsp-responder: after it has
retrieved the ocsp response, it looks up which issuer produced that
ocsp response, confirms that it has that issuer cert loaded in
memory, and confirms that its issuer key hash matches that in the
original request.
There is still one wrinkle if issuer certs A and B are both loaded
in one ocsp-responder, and that one ocsp-responder is handling OCSP
requests to both of their AIA OCSP URLs. In this case, it is possible
that a request to a.ocsp.com, but requesting OCSP for a cert issued
by B, could still have its request answered. This is because the
ocsp-responder itself does not know which URL was requested. But
regardless, this change does guarantee that the response will match
the contents of the request (or no response will be given), no matter
what URL that request was sent to.
Fixes#5182
Modified the Dockerfile to build using Debian Buster, an upgrade from
Debian Stretch. The default Python 3 version for Stretch is 3.5.x, which
is soon to be deprecated by python-cryptography, a dependency we rely on
for our integration test suite. The default Python 3 version for Debian
Buster is 3.7.x.
In the .travis.yml file we are instructing travis to provision Xenial
instances and install two versions of Go. This change bumps Xenial
(16.04) -> Focal (20.04) and removes the installation of the two Go
versions; all of our testing happens inside of a docker container so
having Go installed on the Docker parent isn't necessary.
In the docker-compose.yml file we configure which docker image to pull
from DockerHub; I've updated these to reflect the Debian Buster images
already built and pushed.
Modified build.sh to install mariadb-client-core 10.3; there is no 10.1
install candidate for Debian Buster, and release notes for 10.2 and 10.3
indicate that these were both security releases.
Modified test.sh to use python3 instead of system python (usually 2.7)
for test/grafana/lints.py
Fixes#5180
In all boulder services, we construct a single tls.Config object
and then pass it into multiple gRPC setup methods. In all boulder
services but one, we pass the object into multiple clients, and
just one server. In general, this is safe, because all of the client
setup happens on the main thread, and the server setup similarly
happens on the main thread before spinning off the gRPC server
goroutine.
In the CA, we do the above and pass the tlsConfig object into two
gRPC server setup functions. Thus the first server goroutine races
with the setup of the second server.
This change removes the post-hoc assignment of MinVersion,
MaxVersion, and CipherSuites on the tls.Config object passed
to grpc.ClientSetup and grpc.NewServer, and adds those same
values to cmd.TLSConfig.Load, the method responsible for
constructing the tls.Config object before it is passed to
grpc.ClientSetup and grpc.NewServer.
Part of #5159
The database stores order expiry values as type DATETIME, which only
supports values with second-level accuracy. (Contrast with type
DATETIME(6), which allows microsecond accuracy.) But the order object
being written to the database does not have its expiry value rewritten
by the insert process. This leads to Boulder returning different
values for `expires` depending on whether the order was created fresh
(nanosecond accuracy) or retrieved from the db (second accuracy).
There appears to be only one codepath which suffers from this
discrepancy. Although Authorization objects also have an `expires`
field, they are never returned to the client immediately upon creation;
they are first exposed to the user as URLs within an Order object, and
so are always retrieved from the database.
Fixes#5166
This error class is only used in one instance, and when returned to
the user it is transformed into a `probs.Malformed` anyway. It does
more harm than good to keep this one-off BoulderError around, as
it introduces confusion about what sorts of errors we expose to the
client.
Fixes#5167
This change refactors the sa.NewOrder method to be more readable.
Previously, the outer method both modified and returned the request
object, a confusing violation of abstraction. Simultaneously, the inner
transaction function modified its input order object, but returned the
original request object which it hadn't modified.
Now, the NewOrder method does not modify its input, and instead
returns an all-new Order object. Additionally, the inner transaction
function does not modify its input, and returns an order model with
relevant fields set.
Part of #5166
Go version 1.15.5 is a security release which introduces fixes
both to the big.Int package (which we use) and the go compiler
itself (which we use).
Release notes: https://golang.org/doc/go1.15
This change builds go1.15.5 versions of our docker containers,
adds tests on the new version to our travis config, and sets the
default to be the new version.
Fixes#5173
These configs contained a `common.issuerCert` field which is not
read or used by the akamai-purger service. This change removes
it for clarity.
In addition, these files had irregular indentation. This change cleans
them up to use two-space indents throughout.
Having both of these very similar methods sitting around
only serves to increase confusion. This removes the last
few places which use `cmd.LoadCert` and replaces them
with `core.LoadCert`, and deletes the method itself.
Fixes#5163
Modifies test.sh to provide a menu of long and short option flags
with helpful documentation. Improves UX by printing the settings
used before each run. Improves visual parsability by printing a
bold blue section heading with a title for each test. Using set flags
-eu and an EXIT trap we now exit at any failure or undefined var
and print FAILURE in red on the last line. If the script does not exit
prematurely, then SUCCESS is printed in green on the last line.
Adds the following options: parse and print a full list of available
integration tests, enable the -race flag for unit tests, enable next
config, and add integration-test and unit-test filter arguments.
Fixes the -race flag not being enabled by the previous tests. The script
comments seem to indicate that this is desirable, but either through
error or intention TRAVIS was never being set to true. I confirmed
this behavior by running CI tests and observing which options were
set before making this change. If this is not desirable, we can make
a change in .travis.yml to omit the -e flag on unit test and unit test
with next-config.
Fixed dir var not being set. Since the previous script did not use the
set -u flag, undefined vars would not cause an exit. Once this flag was
set, it was apparent that the var dir was never being defined. Instead, the
script was just using .coverprofile for the coverage test option.
Removes FAILURE and SUCCESS on exit from the Python wrapper
in test/integration-test.py.
Modifies .travis.yml to use the new short option flags instead of
environment variables. Included comments here to describe all of
the available flags.
Modifies README.md to add examples of the new script in use as
well as examples of running tests without the new script. This should
give folks a good idea of what the script is wrapping.
Part of #5139
The ocsp-responder takes a path to a certificate file as one of
its config values. It uses this path as one of the inputs when
constructing its DBSource, the object responsible for querying
the database for pregenerated OCSP responses to fulfill requests.
However, this certificate file is not necessary to query the
database; rather, it only acts as a filter: OCSP requests whose
IssuerKeyHash do not match the hash of the loaded certificate are
rejected outright, without querying the DB. In addition, there is
currently only support for a single certificate file in the config.
This change adds support for multiple issuer certificate files in
the config, and refactors the pre-database filtering of bad OCSP
requests into a helper object dedicated solely to that purpose.
Fixes#5119
The RA is responsible for contacting Akamai to purge cached OCSP
responses when a certificate is revoked and fresh OCSP responses need to
be served ASAP. In order to do so, it needs to construct the same OCSP
URLs that clients would construct, and that Akamai would cache. In order
to do that, it needs access to the issuing certificate to compute a hash
across its Subject Info and Public Key.
Currently, the RA holds a single issuer certificate in memory, and uses
that cert to compute all OCSP URLs, on the assumption that all certs
we're being asked to revoke were issued by the same issuer.
In order to support issuance from multiple intermediates at the same
time (e.g. RSA and ECDSA), and to support rollover between different
issuers of the same type (we may need to revoke certs issued by two
different issuers for the 90 days in which their end-entity certs
overlap), this commit changes the configuration to provide a list of
issuer certificates instead.
In order to support efficient lookup of issuer certs, this change also
introduces a new concept, the Chain ID. The Chain ID is a truncated hash
across the raw bytes of either the Issuer Info or the Subject Info of a
given cert. As such, it can be used to confirm issuer/subject
relationships between certificates. In the future, this may be a
replacement for our current IssuerID (a truncated hash over the whole
issuer certificate), but for now it is used to map revoked certs to
their issuers inside the RA.
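A rough sketch of the Chain ID idea (the 8-byte width is illustrative):

```go
package main

import (
	"bytes"
	"crypto/sha256"
)

// chainID is a truncated hash over the raw DER bytes of a Name field.
// Hashing a certificate's RawIssuer and a candidate issuer's RawSubject
// yields equal IDs exactly when the subject/issuer names match.
func chainID(rawName []byte) []byte {
	h := sha256.Sum256(rawName)
	return h[:8]
}

// issuedBy confirms the issuer/subject relationship via Chain IDs.
func issuedBy(certRawIssuer, issuerRawSubject []byte) bool {
	return bytes.Equal(chainID(certRawIssuer), chainID(issuerRawSubject))
}
```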
Part of #5120
This change adds two new test assertion helpers, `AssertErrorIs`
and `AssertErrorWraps`. The former is a wrapper around `errors.Is`,
and asserts that the error's wrapping chain contains a specific (i.e.
singleton) error. The latter is a wrapper around `errors.As`, and
asserts that the error's wrapping chain contains any error which is
of the given type; it also has the same unwrapping side effect as
`errors.As`, which can be useful for further assertions about the
contents of the error.
It also makes two small changes to our `berrors` package, namely
making `berrors.ErrorType` itself an error rather than just an int,
and giving `berrors.BoulderError` an `Unwrap()` method which
exposes that inner `ErrorType`. This allows us to use the two new
helpers above to make assertions about berrors, rather than
having to hand-roll equality assertions about their types.
Finally, it takes advantage of the two changes above to greatly
simplify many of the assertions in our tests, removing conditional
checks and replacing them with simple assertions.
The akamai purger is now a standalone service, and the ocsp-updater does
not talk to it directly. All of the code in the ocsp-updater related to
the akamai purger is gated behind checks that the corresponding config
values are not nil, and the resulting objects are never used by the
service's business logic.
Now that all of the corresponding config stanzas have been removed
from our production configs, we can remove them from the config
struct definitions as well.
errors.As checks for a specific error in a wrapped error chain
(see https://golang.org/pkg/errors/#As) as opposed to asserting
that an error is of a specific type.
Part of #5010
This adds an ECDSA hierarchy along-side the RSA hierarchy
which our integration tests already rely on. This does not yet
integrate the new hierarchy into the services (none of the
generated keys or certs are referenced from test service config
files yet), but it lays the groundwork for that to happen after our
services all have multi-issuer support.
Part of #5113
The only caller of this function is the RA's `revokeCertificate`
method, which already has the hydrated `x509.Certificate`
version of the cert. There's no need to pass the raw version
and re-parse the DER again, just pass a reference to the
existing cert.