This change replaces [gorp] with [borp].
The changes consist of a mass renaming of the import and comments / doc
fixups, plus modifications of many call sites to provide a
context.Context everywhere, since gorp newly requires this (this was one
of the motivating factors for the borp fork).
This also refactors `github.com/letsencrypt/boulder/db.WrappedMap` and
`github.com/letsencrypt/boulder/db.Transaction` to not embed their
underlying gorp/borp objects, but to have them as plain fields. This
ensures that we can only call methods on them that are specifically
implemented in `github.com/letsencrypt/boulder/db`, so we don't miss
wrapping any. This required introducing a `NewWrappedMap` method along
with accessors `SQLDb()` and `BorpDB()` to get at the internal fields
during metrics and logging setup.
Fixes#6944
Add two new methods, LeaseCRLShard and UpdateCRLShard, to the SA gRPC
interface. These methods work in concert both to prevent multiple
instances of crl-updater from stepping on each others toes, and to lay
the groundwork for a less bursty version of crl-updater in the future.
Introduce a new database table, crlShards, which tracks the thisUpdate
and nextUpdate timestamps of each CRL shard for each issuer. It also has
a column "leasedUntil", which is also a timestamp. Grant the SA user
read-write access to this table.
LeaseCRLShard updates the leasedUntil column of the identified shard to
the given time. It returns an error if the identified shard's
leasedUntil timestamp is already in the future. This provides a
mechanism for crl-updater instances to "lick the cookie", so to speak,
marking CRL shards as "taken" so that multiple crl-updater instances
don't attempt to work on the same shard at the same time. Using a
timestamp has the added benefit that leases are guaranteed to expire,
ensuring that we don't accidentally fail to work on a shard forever.
LeaseCRLShard has a second mode of operation, when a range of potential
shards is given in the request, rather than a single shard. In this
mode, it returns the shard (within the given range) whose thisUpdate
timestamp is oldest. (Shards with no thisUpdate timestamp, including
because the requested range includes shard indices the database doesn't
yet know about, count as older than any shard with any thisUpdate
timestamp.) This allows crl-updater instances which don't care which
shard they're working on to do the most urgent work first.
UpdateCRLShard updates the thisUpdate and nextUpdate timestamps of the
identified shard. This closes the loop with the second mode of
LeaseCRLShard above: by updating the thisUpdate timestamp, the method
marks the shard as no longer urgently needing to be worked on.
IN-9220 tracks creating this table in staging and production
Part of #6897
Previously, we had three chained calls initializing a database:
- InitWrappedDb calls NewDbMap
- NewDbMap calls NewDbMapFromConfig
Since all three are exporetd, this left me wondering when to call one vs
the others.
It turns out that NewDbMap is only called from tests, so I renamed it to
DBMapForTest to make that clear.
NewDbMapFromConfig is only called internally to the SA, so I made it
unexported it as newDbMapFromMysqlConfig.
Also, I copied the ParseDSN call into InitWrappedDb, so it doesn't need
to call DBMapForTest. Now InitWrappedDb and DBMapForTest both
independently call newDbMapFromMysqlConfig.
I also noticed that InitDBMetrics was only called internally so I
unexported it.
Removes the `Hostname` and `Port` fields from an http-01
ValidationRecord model prior to storing the record in the database.
Using `"hostname":"example.com","port":"80"` as a snippet of a whole
validation record, we'll save minimum 36 bytes for each new http-01
ValidationRecord that gets stored. When retrieving the record, the
ValidationRecord `RehydrateHostPort` method will repopulate the
`Hostname` and `Port` fields from the `URL` field.
Fixes the main goal of
https://github.com/letsencrypt/boulder/issues/5231.
---------
Co-authored-by: Samantha <hello@entropy.cat>
As a follow-up to https://github.com/letsencrypt/boulder/issues/5467, I
did an audit of all places where we call SelectOne to ensure that those
queries can never return more than one result. These four functions were
the only places that weren't already constrained to a single result
through the use of "SELECT COUNT", "LIMIT 1", "WHERE uniqueKey =", or
similar. Limit these functions' queries to always only return a single
result, now that their underlying tables no longer have unique key
constraints.
Additionally, slightly refactor selectRegistration to just take a single
column name rather than a whole WHERE clause.
Fixes https://github.com/letsencrypt/boulder/issues/6521
Make minor, non-user-visible changes to how we structure the probs
package. Notably:
- Add new problem types for UnsupportedContact and
UnsupportedIdentifier, which are specified by RFC8555 and which we will
use in the future, but haven't been using historically.
- Sort the problem types and constructor functions to match the
(alphabetical) order given in RFC8555.
- Rename some of the constructor functions to better match their
underlying problem types (e.g. "TLSError" to just "TLS").
- Replace the redundant ProblemDetailsToStatusCode function with simply
always returning a 500 if we haven't properly set the problem's
HTTPStatus.
- Remove the ability to use either the V1 or V2 error namespace prefix;
always use the proper RFC namespace prefix.
Deprecate the ROCSPStage7 feature flag, which caused the RA and CA to
stop generating OCSP responses when issuing new certs and when revoking
certs. (That functionality is now handled just-in-time by the
ocsp-responder.) Delete the old OCSP-generating codepaths from the RA
and CA. Remove the CA's internal reference to an OCSP implementation,
because it no longer needs it.
Additionally, remove the SA's "Issuers" config field, which was never
used.
Fixes#6285
Deprecate the ROCSPStage6 feature flag. Remove all references to the
`ocspResponse` column from the SA, both when reading from and when
writing to the `certificateStatus` table. This makes it safe to fully
remove that column from the database.
IN-8731 enabled this flag in all environments, so it is safe to
deprecate.
Part of #6285
As a follow-up to #6780, add the same style of implementation test to
all of our other gRPC services. This was not included in that PR just to
keep it small and single-purpose.
This file contained both read-only and read-write methods. Its existence
is not reflected in any other gRPC or struct organization; it was easy
to forget that it exists. Merge its contents into both sa.go and
saro.go, so that the methods follow the same organization scheme as the
rest of the SA.
This makes it less likely that bugs like #6778 will happen again.
When external clients make POST requests to our ARI endpoint, they're
getting 404s even when a GET request with the same exact CertID
succeeds. Logs show that this is because the SA is returning "method
GetSerialMetadata not implemented" when the WFE attempts that gRPC
request. This is due to an oversight: the GetSerialMetadata method is
not implemented on the SQLStorageAuthorityRO object, only on the
SQLStorageAuthority object. The unit tests did not catch this bug
because they supply a mock SA, which does implement the method in
question.
Update the receiver and add a wrapper so that GetSerialMetadata is
implemented on both the read-write and read-only SA implementation
types. Add a new kind of test assertion which helps ensure this won't
happen again. Add a TODO for an integration test covering the ARI POST
codepath to prevent a regression.
Fixes#6778
sa: rename AddPrecertificateRequest.IssuerID
to IssuerNameID. This is in preparation for adding a similarly-named
field to AddSerialRequest.
Part of #5152.
Use constants from the go stdlib time package, such as time.DateTime and
time.RFC3339, when parsing and formatting timestamps. Additionally,
simplify or remove some of our uses of parsing timestamps, such as to
set fake clocks in tests.
Add a new time.Duration field, LagFactor, to both the SA's config struct
and the read-only SA's implementation struct. In the GetRegistration,
GetOrder, and GetAuthorization2 methods, if the database select returned
a NoRows error and a lagFactor duration is configured, then sleep for
lagFactor seconds and retry the select.
This allows us to compensate for the replication lag between our primary
write database and our read-only replica databases. Sometimes clients
will fire requests in rapid succession (such as creating a new order,
then immediately querying the authorizations associated with that
order), and the subsequent requests will fail because they are directed
to read replicas which are lagging behind the primary. Adding this
simple sleep-and-retry will let us mitigate many of these failures,
without adding too much complexity.
Fixes#6593
A `core.Authorization` object has lots of fields (e.g. `status`,
`attempted`, `attemptedAt`) which are not relevant to a
newly-created authorization: a brand new authz can only be in
the "pending" state, cannot have been attempted already or have
been validated.
Fix a nil pointer dereference in `sa.NewOrderAndAuthzs` if a
`req *sapb.NewOrderAndAuthzsRequest` is passed into the
function with an inner nil `req.NewOrder`.
Add new tests.
- TestNewOrderAndAuthzs_MissingInnerOrder
- Checks that
the nil pointer dereference no longer occurs
- TestNewOrderAndAuthzs_NewAuthzExpectedFields
- Checks that the `Attempted`, `AttemptedAt`, `ValidationRecords`,
and `ValidationErrors` fields for a brand new authz in the
`pending` state are correctly defaulted to `nil` in
`sa.NewOrderAndAuthzs`.
Add a new test assertion `AssertBoxedNil` that returns true for the
existence of a "boxed nil" - a nil value wrapped in a non-nil interface
type.
Fixes#6535
---------
Co-authored-by: Samantha <hello@entropy.cat>
Add validation of input parameters as unquoted MariaDB identifiers, and
document the regex that does it.
Accept a narrower interface (Queryer) for `Insert()`.
Take a list of fields rather than a string containing multiple fields,
to make validation simpler. Rename retCol to returningColumn.
Document safety properties and requirements.
We use this pattern in several places: there is a query that needs to
have a variable number of placeholders (question marks) in it, depending
on how many items we are inserting or querying for. For instance, when
issuing a precertificate we add that precertificate's names to the
"issuedNames" table. To make things more efficient, we do that in a
single query, whether there is one name on the certificate or a hundred.
That means interpolating into the query string with series of question
marks that matches the number of names.
We have a helper type MultiInserter that solves this problem for simple
inserts, but it does not solve the problem for selects or more complex
inserts, and we still have a number of places that generate their
sequence of question marks manually.
This change updates addIssuedNames to use MultiInserter. To enable that,
it also narrows the interface required by MultiInserter.Insert, so it's
easier to mock in tests.
This change adds the new function db.QuestionMarks, which generates e.g.
`?,?,?` depending on the input N.
In a few places I had to rename a function parameter named `db` to avoid
shadowing the `db` package.
The SA's GetOrder method issues four separate database queries:
- get the Order object itself
- get the list of Names associated with it
- get the list of Authorization IDs associated with it
- get the contents of those Authorizations themselves
These four queries can hit different database replicas with different
amounts of replication lag, and therefore different views of the
universe. This can result in inconsistent results, such as the Order
existing, but not having any Authorization IDs associated with it.
Conduct these four queries all within a single transaction, to ensure
they all hit a single read-replica with a consistent view of the world.
Fixes#6540
Delete the NewOrder and NewAuthorizations2 methods from the SA's gRPC
interface. These methods have been replaced by the unified
NewOrderAndAuthzs method, which performs both sets of insertions in a
single transaction.
Also update the SA and RA unittests to not rely on these methods for
setting up test data that other functions-under-test rely on. In most
cases, replace calls to NewOrder with calls to NewOrderAndAuthzs. In the
SA tests specifically, replace calls to NewAuthorizations2 with a
streamlined helper function that simply does the single necessary
database insert.
Fixes#6510Fixes#5816
The `digest` value in AddCertificate's response message is never used by
any callers. Remove it, replacing the whole response message with
google.protobuf.Empty, to mirror the AddPrecertificate method.
This swap is safe, because message names are not sent on the network,
and empty message fields are omitted from the wire format entirely, so
sending the predefined Empty message is identical to sending an empty
AddCertificateResponse message. Since no client is inspecting the
response to access the digest field, sending an empty response will not
break any clients.
Fixes#6498
- Replace `-1` in return values with `0`. No callers were depending on
`-1`.
- Replace `count(` with `COUNT(` for the sake of readability.
- Replace `COUNT(1)` with `COUNT(*)` (https://mariadb.com/kb/en/count).
Both
versions provide identical outputs but let's standardize on the docs.
Fixes#6494
Create a new gRPC service named StorageAuthorityReadOnly which only
exposes a read-only subset of the existing StorageAuthority service's
methods.
Implement this by splitting the existing SA in half, and having the
read-write half embed and wrap an instance of the read-only half.
Unfortunately, many of our tests use exported read-write methods as part
of their test setup, so the tests are all being performed against the
read-write struct, but they are exercising the same code as the
read-only implementation exposes.
Expose this new service at the SA on the same port as the existing
service, but with (in config-next) different sets of allowed clients. In
the future, read-only clients will be removed from the read-write
service's set of allowed clients.
Part of #6454
Have the database query return timestamps for when certificates
were issued for certain names. Use that information to compute
when the next time that name will be eligible for issuance again.
Include that timestamp in the error message and a Retry-After
HTTP header.
Fixes#6465
These flags are set in both staging and prod. Deprecate them, make
all code gated behind them the only path, and delete code (multi_source)
which was only accessible when these flags were not set.
Part of #6285
This method simply returns the greatest notAfter timestamp in
the certificateStatus table. This will be used by the crl-updater
to ensure that it includes all unexpired certificates in its CRLs,
rather than only those which happen to fall within its configured
bounds.
Part of #6438
- Move incidents tables from `boulder_sa` to `incidents_sa` (added in #6344)
- Grant read perms for all tables in `incidents_sa`
- Modify unit tests to account for new schema and grants
- Add database cleaning func for `boulder_sa`
- Adjust cleanup funcs to omit `sql-migrate` tables instead of `goose`
Resolves#6328
Suggest that subscribers with certificates impacted by an ongoing revocation
incident renew immediately.
- Make SA method `IncidentsForSerial` a callable RPC
Resolves#6282
Add a new `test.AssertNil()` helper to facilitate asserting that a given
unit test result is a non-boxed nil. Update `test.AssertNotNil()` to use
the reflect package's `.IsNil()` method to catch boxed nils.
In Go, variables whose type is constrained to be an interface type (e.g.
a function parameter which takes an interface, or the return value of a
function which returns `error`, itself an interface type) should
actually be thought of as a (T, V) tuple, where T is their underlying
concrete type and V is their underlying value. Thus, there are two ways
for such a variable to be nil-like: it can be truly nil where T=nil and
V is uninitialized, or it can be a "boxed nil" where T is a nillable
type such as a pointer or a slice and V=nil.
Unfortunately, only the former of these is == nil. The latter is the
cause of frequent bugs, programmer frustration, a whole entry in the Go
FAQ, and considerable design effort to remove from Go 2.
Therefore these two test helpers both call `t.Fatal()` when passed a
boxed nil. We want to avoid passing around boxed nils whenever possible,
and having our tests fail whenever we do is a good way to enforce good
nil hygiene.
Fixes#3279
GitHub doesn't auto-rebase changes before merging, and doesn't re-run
tests against the current HEAD of the target branch before merging, so
it didn't catch that all uses of ioutil had been removed in #6286 while
a new usage was added in #6284. Replace the new usage of ioutil with
the appropriate function from the os package instead.
Create a new `ROCSPStage6` feature flag which affects the behavior of
the SA. When enabled, this flag causes the `AddPrecertificate`,
`RevokeCertificate`, and `UpdateRevokedCertificate` methods to ignore
the OCSP response bytes provided by their caller. They will no longer
error out if those bytes are missing, and if the bytes are present they
will still not be written to the database.
This allows us to, in the future, cause the RA and CA to stop generating
those OCSP responses entirely, and stop providing them to the SA,
without causing any errors when we do.
Part of #6079
The iotuil package has been deprecated since go1.16; the various
functions it provided now exist in the os and io packages. Replace all
instances of ioutil with either io or os, as appropriate.
- Modify `ra.checkCertificatesPerFQDNSetLimit()` to use a leaky bucket algorithm
- Return issuance timestamps from `sa.FQDNSetTimestampsForWindow()` in descending order
Resolves#6154
The metadata values were planned to be used for scanning Redis in
ocsp-updater. Since we won't do that, remove it. Happily, this also
allows us to get rid of shortIssuerId.
Removing the issuer check in rocsp_sa.go uncovered a "boxed nil" problem:
SA was doing a nil check against an interface field that in practice was
never nil (because it was promoted from a concrete type at construction
time). So we would always hit the ROCSP path. But one of the first steps
in that path was looking up an issuer ID. Since `test/config` never
had the issuers set, we would look up the issuer ID, not find it, and
return an error before we attempted to call storeResponse. To fix this,
I made `NewSQLStorageAuthority` take a concrete `*rocsp.WritingClient`
instead of an interface, and check for nil before assigning it to an
internal interface field.
Built on top of #6201.
Add a new SA gRPC method named `GetRevokedCerts`. This method takes as
input an `IssuerNameID` and starting and ending timestamps, and queries
the database for all certificates issued by that issuer whose `NotAfter`
timestamp is in the indicated period. It returns a stream of `CRLEntry`,
one message for each row in the database query result.
This query has been shown to be reasonably fast, taking less than 20
seconds to scan 24 hours of prod issuance.
Fixes#6160
When testing the SA from within the SA package itself, we can directly
call its methods without having to go through the gRPC client interface.
Rather than needing to mock out gRPC client and server streams that
talk to each other over a channel, we can simply mock the server stream
and read from it directly.
The inMemSA adapter work will still be necessary when (for example)
testing an RA which uses a real in-memory SA to serve as its test
backend, rather than mocking out the SA. In that case, we need to wrap
the `storageAuthorityImpl` in something which causes it to match the
`sapb.StorageAuthorityClient` interface expected by the RA. That's what
the `//test/inmem/` package is for, and we'll move this wrapper code there
when we need it.
By testing these methods directly, rather than through an inmem wrapper,
we remove a potential cyclic import that was preventing us from moving
the inmem stream adapter into the `//test/inmem` package itself.
- Add protobuf types `SerialsForIncidentRequest` and `IncidentSerial`
- Rename `incidentCertModel` to `incidentSerialModel`
- Add new SA method `SerialsForIncident`
- Add streaming GRPC adapter to allow for unit testing `SerialsForIncident`
Fixes#5947
First commit adding support for tooling to aid in the tracking and remediation
of incidents.
- Add new SA method `IncidentsForSerial`
- Add database models for `incident`s and `incidentCert`s
- Add protobuf type for `incident`
- Add database migrations for `incidents`, `incident_foo`, and `incident_bar`
- Give db user `sa` permissions to `incidents`, `incident_foo`, and
`incident_bar`
Part Of #5947
Adds a rocsp redis client to the sa if cluster information is provided in the
sa config. If a redis cluster is configured, all new certificate OCSP
responses added with sa.AddPrecertificate will attempt to be written to
the redis cluster, but will not block or fail on errors.
Fixes: #5871
Add a new gRPC method `UpdateRevokedCertificate` to the SA. This
method takes the same argument as the existing `RevokeCertificate` RPC,
but only operates on certificates that have already been revoked with a
reason other than keyCompromise (c.f. `RevokeCertificate`, which only
operates on certificates that have not been revoked).
One thing to be careful of here is that storing an updated revocation reason
should not also change the revocation date. To support this, add a new field
to the existing `RevokeCertificateRequest` that allows us to differentiate the
time at which the new OCSP response was created, and the time at which
the revocation went into effect.
Part of #5936
We have the `core.AcmeStatus` type precisely so that we don't have to
rely on correctly typing the word `"pending"` everywhere. Use those values
instead of raw strings when converting between string-like Authz statuses
and the database bitfield.
Running an older version (v0.0.1-2020.1.4) of `staticcheck` in
whole-program mode (`staticcheck --unused.whole-program=true -- ./...`)
finds various instances of unused code which don't normally show up
as CI issues. I've used this to find and remove a large chunk of the
unused code, to pave the way for additional large deletions accompanying
the WFE1 removal.
Part of #5681