This updates va.proto to use proto3 syntax, and updates
all clients of the autogenerated code to use the new types. In
particular, it removes indirection from built-in types (proto3
uses ints, rather than pointers to ints, for example).
Fixes#4956
This updates the ca.proto to use proto3 syntax, and updates
all clients of the autogenerated code to use the new types. In
particular, it removes indirection from built-in types (proto3
uses ints, rather than pointers to ints, for example).
It also updates a few instances where tests were being
conducted to see if various object fields were nil to instead
check for those fields' new zero-value.
Fixes#4940
And make corresponding changes to call sites and wrappers.
Note that proto2 vs proto3 is distinction in the syntax of the .proto files
and doesn't change the wire format, so this meets the deployability
guidelines.
There are some changes to the code generated in the latest version, so
this modifies every .pb.go file.
Also, the way protoc-gen-go decides where to put files has changed, so
each generate.go gets the --go_opt=paths=source_relative flag to
tell protoc to continue placing output next to the input.
Remove staticcheck from build.sh; we get it via golangci-lint now.
Pass --no-document to gem install fpm; this is recommended in the fpm docs.
In a handful of places I've nuked old stats which are not used in any alerts or dashboards as they either duplicate other stats or don't provide much insight/have never actually been used. If we feel like we need them again in the future it's trivial to add them back.
There aren't many dashboards that rely on old statsd style metrics, but a few will need to be updated when this change is deployed. There are also a few cases where prometheus labels have been changed from camel to snake case, dashboards that use these will also need to be updated. As far as I can tell no alerts are impacted by this change.
Fixes#4591.
This avoids needing to send the entire certificate in OCSP generation
RPCs.
Ended up including a few cleanups that made the implementation easier.
Initially I was struggling with how to derive the issuer identification info.
We could just stick the full SPKI hash in certificateStatus, but that takes a
significant amount of space, we could configure unique issuer IDs in the CA
config, but that would require being very careful about keeping the IDs
constant, and never reusing an ID, or we could store issuers in a table in the
database and use that as a lookup table, but that requires figuring out how to
get that info into the table etc. Instead I've just gone with what I found to
be the easiest solution, deriving a stable ID from the cert hash. This means we
don't need to remember to configure anything special and the CA config stays
the same as it is now.
Fixes#4469.
This creates the correct type of backend service for the OCSP generator.
It also adds an invocation of orphan-finder during the integration
tests.
This also adds a minor safety check to SA that I hit while writing the
test. Without this safety check, passing a certificate with no DNSNames
to AddCertificate would result in an obscure MariaDB syntax error
without enough context to track it down. In normal circumstances this
shouldn't be hit, but it will be good to have a solid error message if
we hit it in tests sometime.
Also, this tweaks the .travis.yml so it explicitly sets BOULDER_CONFIG_DIR
to test/config in the default case. Because the docker-compose run
command uses -e BOULDER_CONFIG_DIR="${BOULDER_CONFIG_DIR}",
we were setting a blank BOULDER_CONFIG_DIR in default case.
Since the Python startservers script sets a default if BOULDER_CONFIG_DIR
is not set, we haven't noticed this before. But since this test case relies
on the actual environment variable, it became an issue.
Fixes#4499
We need to apply some fixes for bugs introduced in #4476 before it can be deployed, as such we need to revert #4495 as there needs to be a full deploy cycle between these two changes.
This reverts commit 3ae1ae1.
😭
This change set makes the authz2 storage format the default format. It removes
most of the functionality related to the previous storage format, except for
the SA fallbacks and old gRPC methods which have been left for a follow-up
change in order to make these changes deployable without introducing
incompatibilities.
Fixes#4454.
A handful of the gRPC client-side wrappers are checking that the request is not
nil when they should be checking that the response is not nil. This could lead
to derefs where callers assume that the thing being returned is non-nil.
Fixes#4463.
When the `features.PrecertificateRevocation` feature flag is enabled the WFE2
will allow revoking certificates for a submitted precertificate. The legacy WFE1
behaviour remains unchanged (as before (pre)certificates issued through the V1
API will be revocable with the V2 API).
Previously the WFE2 vetted the certificate from the revocation request by
looking up a final certificate by the serial number in the requested
certificate, and then doing a byte for byte comparison between the stored and
requested certificate.
Rather than adjust this logic to handle looking up and comparing stored
precertificates against requested precertificates (requiring new RPCs and an
additional round-trip) we choose to instead check the signature on the requested
certificate or precertificate and consider it valid for revocation if the
signature validates with one of the WFE2's known issuers. We trust the integrity
of our own signatures.
An integration test that performs a revocation of a precertificate (in this case
one that never had a final certificate issued due to SCT embedded errors) with
all of the available authentication mechanisms is included.
Resolves https://github.com/letsencrypt/boulder/issues/4414
This change adds two tables and two methods in the SA, to store precertificates
and serial numbers.
In the CA, when the feature flag is turned on, we generate a serial number, store it,
sign a precertificate and OCSP, store them, and then return the precertificate. Storing
the serial as an additional step before signing the certificate adds an extra layer of
insurance against duplicate serials, and also serves as a check on database availability.
Since an error storing the serial prevents going on to sign the precertificate, this decreases
the chance of signing something while the database is down.
Right now, neither table has read operations available in the SA.
To make this work, I needed to remove the check for duplicate certificateStatus entry
when inserting a final certificate and its OCSP response. I also needed to remove
an error that can occur when expiration-mailer processes a precertificate that lacks
a final certificate. That error would otherwise have prevented further processing of
expiration warnings.
Fixes#4412
This change builds on #4417, please review that first for ease of review.
Right now we sometimes get errors like:
rpc error: code = Unknown desc = rpc error:
code = DeadlineExceeded desc = context deadline exceeded
For instance, when an SA call times out, and the RA returns that
timed-out error to the WFE. These are kind of confusing because they
have two layers of nested gRPC error, and they don't provide additional
information about which SA call timed out.
This change replaces DeadlineExceeded errors with our own error type
that includes the service and the method that were called, as well as
the amount of time it took (which helps understand if timeouts are
happening because earlier calls ate up time towards the deadline).
When the RA->SA NewOrder call times out, and the RA returns that error to WFE:
"InternalErrors":["rpc error: code = Unknown desc =
sa.StorageAuthority.NewOrder timed out after 14954 ms"]
When the WFE->RA NewOrder call times out:
"InternalErrors":["ra.RegistrationAuthority.NewOrder timed out after 15000 ms"]
Note that this change only handles timeouts at one level deep, which I
think is sufficient for our needs.
If a berror with suberrors is being wrapped then we must marshal the
suberrors as JSON and include this data in the RPC metadata trailer that
also carries the berror type. When unwrapping metadata with JSON
suberrors they should be unmarshalled into the returned berror's
suberrors.
The SA `FinalizeAuthorization2` RPC is used with
`FinalizeAuthorizationRequest` objects that may have a nil
`ValidationRecords` field (notably for DNS-01 challenges that
failed). The RPC wrapper should not reject such messages as incomplete.
We don't typically unit test gRPC wrappers, and adding
an integration test for this will likely conflict with
https://github.com/letsencrypt/boulder/issues/4241 so I tested this
fix manually using Certbot and a local Boulder instance configured
with the authz2 feature flag.
Before applying the fix, failing a DNS-01 challenge left the
authorization stuck in pending state and Certbot would poll until
it gave up. On the server-side a 500 error matching what we observed
in staging is logged: > boulder-ra [AUDIT] Could not record updated
validation: err=[rpc error: code = Unknown desc = Incomplete gRPC
request message] regID=[xxx] authzID=[xxxx]
After applying the fix failing a DNS-01 challenge caused the associated
authorization to be marked invalid immediately. No 500 errors are
logged.
Resolves https://github.com/letsencrypt/boulder/issues/4269
Enables integration tests for authz2 and fixes a few bugs that were flagged up during the process. Disables expired-authorization-purger integration tests if config-next is being used as expired-authz-purger expects to purge some stuff but doesn't know about authz2 authorizations, a new test will be added with #4188.
Fixes#4079.
This will allow implementing sub-problems without creating a cyclic
dependency between `core` and `problems`.
The `identifier` package is somewhat small/single-purpose and in the
future we may want to move more "ACME" bits beyond the `identifier`
types into a dedicated package outside of `core`.
Precertificate issuance has been the only supported mode for a while now. This
cleans up the remaining flags in the CA code. The same is true of must staple.
This also removes the IssueCertificate RPC call and its corresponding wrappers,
and removes a lot of plumbing in the CA unittests that was used to test the
situation where precertificate issuance was not enabled.
This PR implements new SA methods for handling authz2 style authorizations and updates existing SA methods to count and retrieve them where applicable when the `NewAuthorizationSchema` feature is enabled.
Fixes#4093Fixes#4082
Updates #4078
Updates #4077
Early ACME drafts supported a notion of "combinations" of challenges
that had to be completed together. This was removed from subsequent
drafts. Boulder has only ever supported "combinations" that exactly map
to the list of challenges, 1 for 1.
This removes all the plumbing for combinations, and adds a list of
combinations to the authz JSON right before marshaling it in WFE1.
Precursor to #4116. Since some of our dependencies impose a minimum
version on these two packages higher than what we have in Godeps, we'll
have to bump them anyhow. Bumping them independently of the modules
update should keep things a little simpler.
In order to get protobuf tests to pass, I had to update protoc-gen-go in
boulder-tools. Now we download a prebuilt binary instead of using the
Ubuntu package, which is stuck on 3.0.0. This also meant I needed to
re-generate our pb.go files, since the new version generates somewhat
different output.
This happens to change the tag for pbutil, but it's not a substantive change - they just added a tagged version where there was none.
$ go test github.com/miekg/dns/...
ok github.com/miekg/dns 4.675s
ok github.com/miekg/dns/dnsutil 0.003s
ok github.com/golang/protobuf/descriptor (cached)
ok github.com/golang/protobuf/jsonpb (cached)
? github.com/golang/protobuf/jsonpb/jsonpb_test_proto [no test files]
ok github.com/golang/protobuf/proto (cached)
? github.com/golang/protobuf/proto/proto3_proto [no test files]
? github.com/golang/protobuf/proto/test_proto [no test files]
ok github.com/golang/protobuf/protoc-gen-go (cached)
? github.com/golang/protobuf/protoc-gen-go/descriptor [no test files]
ok github.com/golang/protobuf/protoc-gen-go/generator (cached)
ok github.com/golang/protobuf/protoc-gen-go/generator/internal/remap (cached)
? github.com/golang/protobuf/protoc-gen-go/grpc [no test files]
? github.com/golang/protobuf/protoc-gen-go/plugin [no test files]
ok github.com/golang/protobuf/ptypes (cached)
? github.com/golang/protobuf/ptypes/any [no test files]
? github.com/golang/protobuf/ptypes/duration [no test files]
? github.com/golang/protobuf/ptypes/empty [no test files]
? github.com/golang/protobuf/ptypes/struct [no test files]
? github.com/golang/protobuf/ptypes/timestamp [no test files]
? github.com/golang/protobuf/ptypes/wrappers [no test files]
This changeset implements the logic required for the WFE to retrieve v2 authorizations and their associated challenges while still maintaining the logic to retrieve old authorizations/challenges. Challenge IDs for v2 authorizations are obfuscated using a pretty simply scheme in order to prevent hard coding of indexes. A `V2` field is added to the `core.Authorization` object and populated using the existing field of the same name from the protobuf for convenience. v2 authorizations and challenges use a `v2` prefix in all their URLs in order to easily differentiate between v1 and v2 URLs (e.g. `/acme/authz/v2/asd` and `/acme/challenge/v2/asd/123`), once v1 authorizations cease to exist this prefix can be safely removed. As v2 authorizations use int IDs this change switches from string IDs to int IDs, this mainly only effects tests.
Integration tests are put off for #4079 as they really need #4077 and #4078 to be properly effective.
Fixes#4041.
Implements a feature that enables immediate revocation instead of marking a certificate revoked and waiting for the OCSP-Updater to generate the OCSP response. This means that as soon as the request returns from the WFE the revoked OCSP response should be available to the user. This feature requires that the RA be configured to use the standalone Akamai purger service.
Fixes#4031.
The plural `serverAddresses` field in gRPC config has been deprecated for a bit now. We've removed the last usages of it in our staging/prod environments and can clear out the related code. Moving forward we only support a singular `serverAddress` and rely on DNS to direct to multiple instances of a given server.
Staging and prod both deployed the PerformValidationRPC feature flag. All running WFE/WFE2 instances are using the more accurately named PerformValidation RPC and we can strip out the old UpdateAuthorization bits. The feature flag for PerformValidationRPC remains until we clean up the staging/prod configs.
Resolves#3947 and completes the last of #3930
Fixes#3965 and fixes#3949.
This change adds a model for the authz2 style authorization storage and implements methods to transform to/from the protobuf representations and SA methods to store and retrieve the new style authorizations.
Marshaling invalid UTF-8 strings to protocol buffers causes an error. This can
happen in VA `PerformValidation` RPC responses if remote servers return invalid
UTF-8 in some ACME challenge contexts. We previously fixed this for HTTP-01 and
DNS-01 but missed a case where TLS-ALPN-01/TLS-SNI-01 challenge response
certificate content was included in error messages without replacing invalid
UTF-8. That's now fixed & unit tests are added.
To aid in diagnosing any future instances the VA is also updated to proactively
attempt to marshal its `PerformValidation` results before handing off to the RPC
wrappers that will do the same. This way if we detect an error in marshaling the
VA can audit log the escaped content for investigation purposes.
Hopefully with these two efforts combined we can avoid any future VA RPC errors
from UTF-8 encoding.
Resolves https://github.com/letsencrypt/boulder/issues/3838
The existing RA `UpdateAuthorization` RPC needs replacing for
two reasons:
1. The name isn't accurate - `PerformValidation` better captures
the purpose of the RPC.
2. The `core.Challenge` argument is superfluous since Key
Authorizations are not sent in the initiation POST from the client
anymore. The corresponding unmarshal and verification is now
removed. Notably this means broken clients that were POSTing
the wrong thing and failing pre-validation will now likely fail
post-validation.
To remove `UpdateAuthorization` the new `PerformValidation`
RPC is added alongside the old one. WFE and WFE2 are
updated to use the new RPC when the perform validation
feature flag is enabled. We can remove
`UpdateAuthorization` and its associated wrappers once all
WFE instances have been updated.
Resolves https://github.com/letsencrypt/boulder/issues/3930
Resolves https://github.com/letsencrypt/boulder/issues/3872
**Note to reviewers**: There's an outstanding bug that I've tracked down to the `--load` stage of the integration tests that results in one of the remote VA instances in the `test/config-next` configuration under Go 1.11.1 to fail to cleanly shut down. I'm working on finding the root cause but in the meantime I've disabled `--load` during CI so we can unblock moving forward with getting Go 1.11.1 in dev/CI. Tracking this in https://github.com/letsencrypt/boulder/issues/3889
Removes the checks for a handful of deployed feature flags in preparation for removing the flags entirely. Also moves all of the currently deprecated flags to a separate section of the flags list so they can be more easily removed once purged from production configs.
Fixes#3880.
Things removed:
* features.EmbedSCTs (and all the associated RA/CA/ocsp-updater code etc)
* ca.enablePrecertificateFlow (and all the associated RA/CA code)
* sa.AddSCTReceipt and sa.GetSCTReceipt RPCs
* publisher.SubmitToCT and publisher.SubmitToSingleCT RPCs
Fixes#3755.
We'd like to start using the DNS load balancer in the latest version of gRPC. That means putting all IPs for a service under a single hostname (or using a SRV record, but we're not taking that path). This change adds an sd-test-srv to act as our service discovery DNS service. It returns both Boulder IP addresses for any A lookup ending in ".boulder". This change also sets up the Docker DNS for our boulder container to defer to sd-test-srv when it doesn't know an answer.
sd-test-srv doesn't know how to resolve public Internet names like `github.com`. Resolving public names is required for the `godep-restore` test phase, so this change breaks out a copy of the boulder container that is used only for `godep-restore`.
This change implements a shim of a DNS resolver for gRPC, so that we can switch to DNS-based load balancing with the currently vendored gRPC, then when we upgrade to the latest gRPC we won't need a simultaneous config update.
Also, this change introduces a check at the end of the integration test that each backend received at least one RPC, ensuring that we are not sending all load to a single backend.
Remove various unnecessary uses of fmt.Sprintf - in particular:
- Avoid calls like t.Error(fmt.Sprintf(...)), where t.Errorf can be used directly.
- Use strconv when converting an integer to a string, rather than using
fmt.Sprintf("%d", ...). This is simpler and can also detect type errors at
compile time.
- Instead of using x.Write([]byte(fmt.Sprintf(...))), use fmt.Fprintf(x, ...).
In #3651 we introduced a new parameter to sa.AddCertificate to allow specifying the Issued date. If nil, we defaulted to the current time to maintain deployability guidelines.
Now that this has been deployed everywhere this PR updates SA.AddCertificate and the gRPC wrappers such that a nil issuer argument is rejected with an error.
Unit tests that were previously using nil for the issued time are updated to explicitly set the issued time to the fake clock's now().
Resolves#3657
Now that #3638 has been deployed to all of the RA instances there are no
more RPC clients using the SA's `CountCertificatesRange` RPC.
This commit deletes the implementation, the RPC definition & wrappers,
and all the test code/mocks.
This PR updates the Boulder gRPC clientInterceptor to update a Prometheus gauge stat for each in-flight RPC it dispatches, sliced by service and method.
A unit test is included that uses a custom ChillerServer that lets the test block up a bunch of RPCs, check the in-flight gauge value is increased, unblock the RPCs, and recheck that the in-flight gauge is reduced. To check the gauge value for a specific set of labels a new test-tools.go function GaugeValueWithLabels is added.
Updates #3635
We may see RPCs that are dispatched by a client but do not arrive at the server for some time afterwards. To have insight into potential request latency at this layer we want to publish the time delta between when a client sent an RPC and when the server received it.
This PR updates the gRPC client interceptor to add the current time to the gRPC request metadata context when it dispatches an RPC. The server side interceptor is updated to pull the client request time out of the gRPC request metadata. Using this timestamp it can calculate the latency and publish it as an observation on a Prometheus histogram.
Accomplishing the above required wiring a clock through to each of the client interceptors. This caused a small diff across each of the gRPC aware boulder commands.
A small unit test is included in this PR that checks that a latency stat is published to the histogram after an RPC to a test ChillerServer is made. It's difficult to do more in-depth testing because using fake clocks makes the latency 0 and using real clocks requires finding a way to queue/delay requests inside of the gRPC mechanisms not exposed to Boulder.
Updates https://github.com/letsencrypt/boulder/issues/3635 - Still TODO: Explicitly logging latency in the VA, tracking outstanding RPCs as a gauge.
The Boulder orphan-finder command uses the SA's AddCertificate RPC to add orphaned certificates it finds back to the DB. Prior to this commit this RPC always set the core.Certificate.Issued field to the
current time. For the orphan-finder case this meant that the Issued date would incorrectly be set to when the certificate was found, not when it was actually issued. This could cause cert-checker to alarm based on the unusual delta between the cert NotBefore and the core.Certificate.Issued value.
This PR updates the AddCertificate RPC to accept an optional issued timestamp in the request arguments. In the SA layer we address deployability concerns by setting a default value of the current time when none is explicitly provided. This matches the classic behaviour and will let an old RA communicate with a new SA.
This PR updates the orphan-finder to provide an explicit issued time to sa.AddCertificate. The explicit issued time is calculated using the found certificate's NotBefore and the configured backdate.
This lets the orphan-finder set the true issued time in the core.Certificate object, avoiding any cert-checker alarms.
Resolves#3624
Submits final certificates to any configured CT logs. This doesn't introduce a feature flag as it is config gated, any log we want to submit final certificates to needs to have it's log description updated to include the `"submitFinalCerts": true` field.
Fixes#3605.
During periods of peak load, some RPCs are significantly delayed (on the order of seconds) by client-side blocking. HTTP/2 clients have to obey a "max concurrent streams" setting sent by the server. In Go's HTTP/2 implementation, this value [defaults to 250](https://github.com/golang/net/blob/master/http2/server.go#L56), so the gRPC default is also 250. So whenever there are more than 250 requests in progress at a time, additional requests will be delayed until there is a slot available.
During this peak load, we aren't hitting limits on CPU or memory, so we should increase the max concurrent streams limit to take better advantage of our available resources. This PR adds a config field to do that.
Fixes#3641.
gRPC passes deadline information through the RPC boundary, but client and server have the same deadline. Ideally we'd like the server to have a slightly tighter deadline than the client, so if one of the server's onward RPCs or other network calls times out, the server can pass back more detailed information to the client, rather than the client timing out the server and losing the opportunity to log more detailed information about which component caused the timeout.
In this change, I subtract 100ms from the deadline on the server side of our interceptors, using our existing serverInterceptor. I also check that there is at least 100ms remaining in which to do useful work, so the server doesn't begin a potentially expensive task only to abort it.
Fixes#3608.
Adds SCT embedding to the certificate issuance flow. When a issuance is requested a precertificate (the requested certificate but poisoned with the critical CT extension) is issued and submitted to the required CT logs. Once the SCTs for the precertificate have been collected a new certificate is issued with the poison extension replace with a SCT list extension containing the retrieved SCTs.
Fixes#2244, fixes#3492 and fixes#3429.
Previously we introduced the concept of a "pending orders per account
ID" rate limit. After struggling with making an implementation of this
rate limit perform well we reevaluated the problem and decided a "new
orders per account per time window" rate limit would be a better fit for
ACMEv2 overall.
This commit introduces the new newOrdersPerAccount rate limit. The RA
now checks this before creating new pending orders in ra.NewOrder. It
does so after order reuse takes place ensuring the rate limit is only
applied in cases when a distinct new pending order row would be created.
To accomplish this a migration for a new orders field (created) and an
index over created and registrationID is added. It would be possible to
use the existing expires field for this like we've done in the past, but that
was primarily to avoid running a migration on a large table in prod. Since
we don't have that problem yet for V2 tables we can Do The Right Thing
and add a column.
For deployability the deprecated pendingOrdersPerAccount code & SA
gRPC bits are left around. A follow-up PR will be needed to remove
those (#3502).
Resolves#3410
Prior to this commit when building up the authorizations for a new-order
request we looked for any unexpired pending/valid authorizations owned
by the account and used them for the order. This allows a client to use
the V1 new-authz endpoint in combination with the V2 new-order endpoint
and we do not want to support this behaviour. All V2 authorizations
should be sourced from other V2 orders. This commit implements a new
parameter for the SA's getAuthorizations function that allows filtering
out legacy V1 authorizations by doing a JOIN on the order to
authorizations join table.
Resolves#3328
This commit resolves the case where an error during finalization occurs.
Prior to this commit if an error (expected or otherwise) occurred after
setting an order to status processing at the start of order
finalization the order would be stuck processing forever.
The SA now has a `SetOrderError` RPC that can be used by the RA to
persist an error onto an order. The order status calculation can use
this error to decide if the order is invalid. The WFE is updated to
write the error to the order JSON when displaying the order information.
Prior to this commit the order protobuf had the error field as
a `[]byte`. It doesn't seem like this is the right decision, we have
a specific protobuf type for ProblemDetails and so this commit switches
the error field to use it. The conversion to/from `[]byte` is done with
the model by the SA.
An integration test is included that prior to this commit left an order
in a stuck processing state. With this commit the integration test
passes as expected.
Resolves https://github.com/letsencrypt/boulder/issues/3403
The SA RPC previously called `GetOrderAuthorizations` only returns
**valid, unexpired** authorizations. This commit updates the name to
emphasize that it only returns valid order authzs.
This PR is a rework of what was originally https://github.com/letsencrypt/boulder/pull/3382, integrating the design feedback proposed by @jsha: https://github.com/letsencrypt/boulder/pull/3382#issuecomment-359912549
This PR removes the stored Order status field and replaces it with a value that is calculated on-the-fly by the SA when fetching an order, based on the order's associated authorizations.
In summary (and order of precedence):
* If any of the order's authorizations are invalid, the order is invalid.
* If any of the order's authorizations are deactivated, the order is deactivated.
* If any of the order's authorizations are pending, the order is pending.
* If all of the order's authorizations are valid, and there is a certificate serial, the order is valid.
* If all of the order's authorizations are valid, and we have began processing, but there is no certificate serial, the order is processing.
* If all of the order's authorizations are valid, and we haven't processing, then the order is pending waiting a finalization request.
This avoids having to explicitly update the order status when an associated authorization changes status.
The RA's implementation of new-order is updated to only reuse an existing order if the calculated status is pending. This avoids giving back invalid or deactivated orders to clients.
Resolves#3333
This change adds a feature flag, TLSSNIRevalidation. When it is enabled, Boulder
will create new authorization objects with TLS-SNI challenges if the requesting
account has issued a certificate with the relevant domain name, and was the most
recent account to do so*. This setting overrides the configured list of
challenges in the PolicyAuthority, so even if TLS-SNI is disabled in general, it
will be enabled for revalidation.
Note that this interacts with EnforceChallengeDisable. Because
EnforceChallengeDisable causes additional checked at validation time and at
issuance time, we need to update those two places as well. We'll send a
follow-up PR with that.
*We chose to make this work only for the most recent account to issue, even if
there were overlapping certificates, because it significantly simplifies the
database access patterns and should work for 95+% of cases.
Note that this change will let an account revalidate and reissue for a domain
even if the previous issuance on that account used http-01 or dns-01. This also
simplifies implementation, and fits within the intent of the mitigation plan: If
someone previously issued for a domain using http-01, we have high confidence
that they are actually the owner, and they are not going to "steal" the domain
from themselves using tls-sni-01.
Also note: This change also doesn't work properly with ReusePendingAuthz: true.
Specifically, if you attempted issuance in the last couple days and failed
because there was no tls-sni challenge, you'll still have an http-01 challenge
lying around, and we'll reuse that; then your client will fail due to lack of
tls-sni challenge again.
This change was joint work between @rolandshoemaker and @jsha.
This commit adds pending order reuse. Subsequent to this commit multiple
add-order requests from the same account ID for the same set of order
names will result in only one order being created. Orders are only
reused while they are not expired. Finalized orders will not be reused
for subsequent new-order requests allowing for duplicate order issuance.
Note that this is a second level of reuse, building on the pending
authorization reuse that's done between separate orders already.
To efficiently find an appropriate order ID given a set of names,
a registration ID, and the current time a new orderFqdnSets table is
added with appropriate indexes and foreign keys.
Resolves#3258
* Allow nil `Authz` slice in `GetAuthorizations` response.
The `StorageAuthorityClientWrapper` was enforcing that the response to
a `GetAuthorizations` request did not have `resp.Authz == nil`. This
meant that the RA's `NewOrder` function failed when creating an order
for names that had no existing authorizations to reuse.
This commit updates the wrapper to allow `resp.Authz` to be nil - this
is a valid case when there are no authorizations found.
* Fix SA server wrapper `AddPendingAuthorizations` logic.
Prior to this commit the `StorageAuthorityServerWrapper`'s
`AddPendingAuthorizations` function had an error in the boolean logic
for determining if a request was incomplete. It was rejecting any
requests that had a non-nil `Authz`. This commit fixes the logic so that
it rejects requests that have a **nil** `Authz`.
* Add `newOrderValid` for new-order rpc wrappers.
This commit updates the `StorageAuthorityServerWrapper`'s `NewOrder`
function to use a new pb-marshalling utility function `newOrderValid` to
determine if the provided order is valid or not. Previous to this commit
the `NewOrder` server wrapper used `orderValid` which rejected orders
that had a nil `Id`. This is incorrect because **all** orders provided
to `NewOrder` have a nil id! They haven't been added yet :-)
* Fix SA server wrapper `GetOrder` incomplete response check.
Prior to this commit the `StorageAuthorityClientWrapper`'s `GetOrder`
function was validating that the returned order had a non-nil
`CertificateSerial`. This isn't correct - you can GET an order that
hasn't been finalized with a certificate and it should work. This commit
updates the `GetOrder` function to use the utility `orderValid` function
that allows for a nil `CertificateSerial` but enforces all other fields
are populated as expected.
* Allow nil Authz in `GetOrderAuthorizations` response.
This commit fixes the `StorageAuthorityClientWrapper`'s
`GetOrderAuthorizations` function to not consider a response with a nil
`Authz` array incomplete. This condition happens under normal
circumstances when an attempt to finalize an order is made for an order
that has completed no authorizations.
The go-grpc-prometheus package by default registers its metrics with Prometheus' global registry. In #3167, when we stopped using the global registry, we accidentally lost our gRPC metrics. This change adds them back.
Specifically, it adds two convenience functions, one for clients and one for servers, that makes the necessary metrics object and registers it. We run these in the main function of each server.
I considered adding these as part of StatsAndLogging, but the corresponding ClientMetrics and ServerMetrics objects (defined by go-grpc-prometheus) need to be subsequently made available during construction of the gRPC clients and servers. We could add them as fields on Scope, but this seemed like a little too much tight coupling.
Also, update go-grpc-prometheus to get the necessary methods.
```
$ go test github.com/grpc-ecosystem/go-grpc-prometheus/...
ok github.com/grpc-ecosystem/go-grpc-prometheus 0.069s
? github.com/grpc-ecosystem/go-grpc-prometheus/examples/testproto [no test files]
```
This commit adds a new rate limit to restrict the number of outstanding
pending orders per account. If the threshold for this rate limit is
crossed subsequent new-order requests will return a 429 response.
Note: Since this the rate limit object itself defines an `Enabled()`
test based on whether or not it has been configured there is **not**
a feature flag for this change.
Resolves https://github.com/letsencrypt/boulder/issues/3246
This PR implements order finalization for the ACME v2 API.
In broad strokes this means:
* Removing the CSR from order objects & the new-order flow
* Adding identifiers to the order object & new-order
* Providing a finalization URL as part of orders returned by new-order
* Adding support to the WFE's Order endpoint to receive finalization POST requests with a CSR
* Updating the RA to accept finalization requests and to ensure orders are fully validated before issuance can proceed
* Updating the SA to allow finding order authorizations & updating orders.
* Updating the CA to accept an Order ID to log when issuing a certificate corresponding to an order object
Resolves#3123
For the new-order endpoint only. This does some refactoring of the order of operations in `ra.NewAuthorization` as well in order to reduce the duplication of code relating to creating pending authorizations, existing tests still seem to work as intended... A close eye should be given to this since we don't have integration tests yet that test it end to end. This also changes the inner type of `grpc.StorageAuthorityServerWrapper` to `core.StorageAuthority` so that we can avoid a circular import that is created by needing to import `grpc.AuthzToPB` and `grpc.PBToAuthz` in `sa/sa.go`.
This is a big change but should considerably improve the performance of the new-order flow.
Fixes#2955.
* Remove all of the errors under core. Their purpose is now served by errors, and they were almost entirely unused. The remaining uses were switched to errors.
* Remove errors.NotSupportedError. It was used in only one place (ca.go), and that usage is more appropriately a ServerInternal error.
Stub out IssueCertificateForPrecertificate() enough so that we can continue with the PRs that implement & test it in parallel with PRs that implement and test the calling side (via mock implementations of the CA side).
* CA: Stub IssuePrecertificate gPRC method.
* CA: Implement IssuePrecertificate.
* CA: Test Precertificate flow in TestIssueCertificate().
move verification of certificate storage
IssuePrecertificate tests
Add CT precertificate poison extension to CFSSL whitelist.
CFSSL won't allow us to add an extension to a certificate unless that
certificate is in the whitelist.
According to its documentation, "Extensions requested in the CSR are
ignored, except for those processed by ParseCertificateRequest (mainly
subjectAltName)." Still, at least we need to add tests to make sure a
poison extension in a CSR isn't copied into the final certificate.
This allows us to avoid making invasive changes to CFSSL.
* CA: Test precertificate issuance in TestInvalidCSRs().
* CA: Only support IssuePrecertificate() if it is explicitly enabled.
* CA: Test that we produce CT poison extensions in the valid form.
The poison extension must be critical in order to work correctly. It probably wouldn't
matter as much what the value is, but the spec requires the value to be ASN.1 NULL, so
verify that it is.
Switch certificates and certificateStatus to use autoincrement primary keys to avoid performance problems with clustered indexes (fixes#2754).
Remove empty externalCerts and identifierData tables (fixes#2881).
Make progress towards deleting unnecessary LockCol and subscriberApproved fields (#856, #873) by making them NULLable and not including them in INSERTs and UPDATEs.
The existing ReusePendingAuthz implementation had some bugs:
It would recycle deactivated authorizations, which then couldn't be fulfilled. (#2840)
Since it was implemented in the SA, it wouldn't get called until after the RA checks the Pending Authorizations rate limit. Which means it wouldn't fulfill its intended purpose of making accounts less likely to get stuck in a Pending Authorizations limited state. (#2831)
This factors out the reuse functionality, which used to be inside an "if" statement in the SA. Now the SA has an explicit GetPendingAuthorization RPC, which gets called from the RA before calling NewPendingAuthorization. This happens to obsolete #2807, by putting the recycling logic for both valid and pending authorizations in the RA.
This commit replaces the Boulder dependency on
gopkg.in/square/go-jose.v1 with gopkg.in/square/go-jose.v2. This is
necessary both to stay in front of bitrot and because the ACME v2 work
will require a feature from go-jose.v2 for JWS validation.
The largest part of this diff is cosmetic changes:
Changing import paths
jose.JsonWebKey -> jose.JSONWebKey
jose.JsonWebSignature -> jose.JSONWebSignature
jose.JoseHeader -> jose.Header
Some more significant changes were caused by updates in the API for
for creating new jose.Signer instances. Previously we constructed
these with jose.NewSigner(algorithm, key). Now these are created with
jose.NewSigner(jose.SigningKey{},jose.SignerOptions{}). At present all
signers specify EmbedJWK: true but this will likely change with
follow-up ACME V2 work.
Another change was the removal of the jose.LoadPrivateKey function
that the wfe tests relied on. The jose v2 API removed these functions,
moving them to a cmd's main package where we can't easily import them.
This function was reimplemented in the WFE's test code & updated to fail
fast rather than return errors.
Per CONTRIBUTING.md I have verified the go-jose.v2 tests at the imported
commit pass:
ok gopkg.in/square/go-jose.v2 14.771s
ok gopkg.in/square/go-jose.v2/cipher 0.025s
? gopkg.in/square/go-jose.v2/jose-util [no test files]
ok gopkg.in/square/go-jose.v2/json 1.230s
ok gopkg.in/square/go-jose.v2/jwt 0.073s
Resolves#2880
This is a step towards the long-term goal of eliminating wrappers and a step
towards the short-term goal of making it easier to refactor ca/ca_test.go to
add testing of precertificate-based issuance.
Prior to this PR the SA's `CountRegistrationsByIP` treated IPv6
differently than IPv4 by counting registrations within a /48 for IPv6 as
opposed to exact matches for IPv4. This PR updates
`CountRegistrationsByIP` to treat IPv4 and IPv6 the
same, always matching exactly. The existing RegistrationsPerIP rate
limit policy will be applied against this exact matching count.
A new `CountRegistrationsByIPRange` function is added to the SA that
performs the historic matching process, e.g. for IPv4 it counts exactly
the same as `CountRegistrationsByIP`, but for IPv6 it counts within
a /48.
A new `RegistrationsPerIPRange` rate limit policy is added to allow
configuring the threshold/window for the fuzzy /48 matching registration
limit. Stats for the "Exceeded" and "Pass" events for this rate limit are
separated into a separate `RegistrationsByIPRange` stats scope under
the `RateLimit` scope to allow us to track it separate from the exact
registrations per IP rate limit.
Resolves https://github.com/letsencrypt/boulder/issues/2738
This PR introduces a new feature flag "IPv6First".
When the "IPv6First" feature is enabled the VA's HTTP dialer and TLS SNI
(01 and 02) certificate fetch requests will attempt to automatically
retry when the initial connection was to IPv6 and there is an IPv4
address available to retry with.
This resolves https://github.com/letsencrypt/boulder/issues/2623
This PR removes two berrors that aren't used anywhere in the codebase:
TooManyRequests , a holdover from AMQP, and is no longer used.
UnsupportedIdentifier, used just for rejecting IDNs, which we no longer do.
In addition, the SignatureValidation error was only used by the WFE so it is moved there and unexported.
Note for reviewers: To remove berrors.UnsupportedIdentifierError I replaced the errIDNNotSupported error in policy/pa.go with a berrors.MalformedError with the same name. This allows removing UnsupportedIdentifierError ahead of #2712 which removes the IDNASupport feature flag. This seemed OK to me, but I can restore UnsupportedIdentifierError and clean it up after 2712 if that's preferred.
Resolves#2709
Prior to this PR if a domain was an exact match to a public suffix
list entry the certificates per name rate limit was applied based on the
count of certificates issued for that exact name and all of its
subdomains.
This PR introduces an exception such that exact public suffix
matches correctly have the certificate per name rate limit applied based
on only exact name matches.
In order to accomplish this a new RPC is added to the SA
`CountCertificatesByExactNames`. This operates similar to the existing
`CountCertificatesByNames` but does *not* include subdomains in the
count, only exact matches to the names provided. The usage of this new
RPC is feature flag gated behind the "CountCertificatesExact" feature flag.
The RA unit tests are updated to test the new code paths both with and
without the feature flag enabled.
Resolves#2681
Generate first OCSP response in ca.IssueCertificate instead of ocsp-updater.newCertificateTick
if features.GenerateOCSPEarly is enabled. Adds a new field to the sa.AddCertiifcate RPC for
the OCSP response and only adds it to the certificate status + sets ocspLastUpdated if it is a
non-empty slice. ocsp-updater.newCertificateTick stays the same so we can catch certificates
that were successfully signed + stored but a OCSP response couldn't be generated (for whatever
reason).
Fixes#2477.
Updates the various gRPC/protobuf libs (google.golang.org/grpc/... and github.com/golang/protobuf/proto) and the boulder-tools image so that we can update to the newest github.com/grpc-ecosystem/go-grpc-prometheus. Also regenerates all of the protobuf definition files.
Tests run on updated packages all pass.
Unblocks #2633fixes#2636.
This patch removes all usages of the `core.XXXError` and almost all usages of `probs` outside of the WFE and VA and replaces them with a unified internal error type. Since the VA uses `probs.ProblemDetails` quite extensively in challenges, and currently stores them in the DB I've saved this change for another change (it'll also require a migration). Since `ProblemDetails` should only ever be exposed to end-users all of its related logic should be moved into the `WFE` but since it still needs to be exposed to the VA and SA I've left it in place for now.
The new internal `errors` package offers the same convenience functions as `probs` does as well as a new simpler type testing method. A few small changes have also been made to error messages, mainly adding the library and function name to internal server errors for easier debugging (i.e. where a number of functions return the exact same errors and there is no other way to distinguish which method threw the error).
Also adds proper encoding of internal errors transferred over gRPC (the current encoding scheme is kept for `core` and `probs` errors since it'll be ideally be removed after we deploy this and follow-up changes) using `grpc/metadata` instead of the gRPC status codes.
Fixes#2507. Updates #2254 and #2505.
The gRPC client reconnect code needs to be able to check if a error is temporary so that it can decide if it should attempt to reconnect or just fail and kill the client[1]. By wrapping the error we were receiving in our TLS handshake code we were removing the existing `Temporary` interface on the error. This meant that if a client attempted to reconnect to a server that was in the process of being shutdown, the client would consider that server permanently dead and never retry.
Fix is simple: don't wrap errors that we pass back into the gRPC internals so that they can be properly inspected.
[1]: aefc96d792/clientconn.go (L783)
Instead of using `unwrapError/wrapError` in each of the wrapper functions do it in the server/client interceptors instead. This means we now consistently do error unwrapping/wrapping.
Fixes#2509.
Previously we had `Error` and `ValidationRecords` fields in the `Challenge` protobuf but they were never populated which mean't that when using gRPC these fields wouldn't be sent to the SA from the RA on a `FinalizeAuthorization` call. This change populates those fields and updates the PB marshaling tests to verify the correct behavior.
Fixes#2514.
Fixes#976.
This implements a new rate limit, InvalidAuthorizationsPerAccount. If a given account fails authorization for a given hostname too many times within the window, subsequent new-authz attempts for that account and hostname will fail early with a rateLimited error. This mitigates the misconfigured clients that constantly retry authorization even though they always fail (e.g., because the hostname no longer resolves).
For the new rate limit, I added a new SA RPC, CountInvalidAuthorizations. I chose to implement this only in gRPC, not in AMQP-RPC, so checking the rate limit is gated on gRPC. See #2406 for some description of the how and why. I also chose to directly use the gRPC interfaces rather than wrapping them in core.StorageAuthority, as a step towards what we will want to do once we've moved fully to gRPC.
Because authorizations don't have a created time, we need to look at the expires time instead. Invalid authorizations retain the expiration they were given when they were created as pending authorizations, so we use now + pendingAuthorizationLifetime as one side of the window for rate limiting, and look backwards from there. Note that this means you could maliciously bypass this rate limit by stacking up pending authorizations over time, then failing them all at once.
Similarly, since this limit is by (account, hostname) rather than just (hostname), you can bypass it by creating multiple accounts. It would be more natural and robust to limit by hostname, like our certificate limits. However, we currently only have two indexes on the authz table: the primary key, and
(`registrationID`,`identifier`,`status`,`expires`)
Since this limit is intended mainly to combat misconfigured clients, I think this is sufficient for now.
Corresponding PR for website: letsencrypt/website#125
Currently services will pass both `core.XXXError` and `probs.XXX` type errors across the gRPC layer. In the future (#2505) we intend to stop passing `probs.XXX` type errors across this layer but for now we need to support them until that change is landed. This patch takes the easiest path to allow this by encoding the `probs.ProblemDetails` to JSON and storing it in the gRPC error body so that it can be passed around.
Fixes#2497.
We turn arrays into maps with a range command. Previously, we were taking the
address of the iteration variable in that range command, which meant incorrect
results since the iteration variable gets reassigned.
Also change the integration test to catch this error.
Fixes#2496
Previously our gRPC client code called the wrong function, enabling server-side instead of client-side histograms.
Also, add a timing stat for the generate / store combination in OCSP Updater.
We have a number of stats already expressed using the statsd interface. During
the switchover period to direct Prometheus collection, we'd like to make those
stats available both ways. This change automatically exports any stats exported
using the statsd interface via Prometheus as well.
This is a little tricky because Prometheus expects all stats to by registered
exactly once. Prometheus does offer a mechanism to gracefully recover from
registering a stat more than once by handling a certain error, but it is not
safe for concurrent access. So I added a concurrency-safe wrapper that creates
Prometheus stats on demand and memoizes them.
In the process, made a few small required side changes:
- Clean "/" from method names in the gRPC interceptors. They are allowed in
statsd but not in Prometheus.
- Replace "127.0.0.1" with "boulder" as the name of our testing CT log.
Prometheus stats can't start with a number.
- Remove ":" from the CT-log stat names emitted by Publisher. Prometheus stats
can't include it.
- Remove a stray "RA" in front of some rate limit stats, since it was
duplicative (we were emitting "RA.RA..." before).
Note that this means two stat groups in particular are duplicated:
- Gostats* is duplicated with the default process-level stats exported by the
Prometheus library.
- gRPCClient* are duplicated by the stats generated by the go-grpc-prometheus
package.
When writing dashboards and alerts in the Prometheus world, we should be careful
to avoid these two categories, as they will disappear eventually. As a general
rule, if a stat is available with an all-lowercase name, choose that one, as it
is probably the Prometheus-native version.
In the long run we will want to create most stats using the native Prometheus
stat interface, since it allows us to use add labels to metrics, which is very
useful. For instance, currently our DNS stats distinguish types of queries by
appending the type to the stat name. This would be more natural as a label in
Prometheus.
Previously, a given binary would have three TLS config fields (CA cert, cert,
key) for its gRPC server, plus each of its configured gRPC clients. In typical
use, we expect all three of those to be the same across both servers and clients
within a given binary.
This change reuses the TLSConfig type already defined for use with AMQP, adds a
Load() convenience function that turns it into a *tls.Config, and configures it
for use with all of the binaries. This should make configuration easier and more
robust, since it more closely matches usage.
This change preserves temporary backwards-compatibility for the
ocsp-updater->publisher RPCs, since those are the only instances of gRPC
currently enabled in production.
This allows finer-grained control of which components can request issuance. The OCSP Updater should not be able to request issuance.
Also, update test/grpc-creds/generate.sh to reissue the certs properly.
Resolves#2417
This is a roll-forward of 5b865f1, with the QueueDeclare and QueueBind changes
in AMQP-RPC removed, and the startup order changes in test/startservers.py
removed. The AMQP-RPC changes caused RabbitMQ permission problems in production,
and the startup order changes depended on the AMQP-RPC changes but were not
required now that we have a unittest also.
This allows us to restart backends with relatively little interruption in
service, provided the backends come up promptly.
Fixes#2389 and #2408
There is now one file per service, containing both the client-side and
server-side wrappers for that service. This is a straight move of the code, with
the copyright, header comments, package statement, and imports copied into each
new file, and goimports run on the result.
Two custom errors were moved into bcodes.go.
Fixes#2388.
There's an off-the-shelf package that provides most of the stats we care about
for gRPC using interceptors. This change vendors go-grpc-prometheus and its
dependencies, and calls out to the interceptors provided by that package from
our own interceptors.
This will allow us to get metrics like latency histograms by call, status codes
by call, and so on.
Fixes#2390.
This change vendors go-grpc-prometheus and its dependencies. Per contributing guidelines, I've run the tests on these dependencies, and they pass:
go test github.com/davecgh/go-spew/spew github.com/grpc-ecosystem/go-grpc-prometheus github.com/grpc-ecosystem/go-grpc-prometheus/examples/testproto github.com/pmezard/go-difflib/difflib github.com/stretchr/testify/assert github.com/stretchr/testify/require github.com/stretchr/testify/suite
ok github.com/davecgh/go-spew/spew 0.022s
ok github.com/grpc-ecosystem/go-grpc-prometheus 0.120s
? github.com/grpc-ecosystem/go-grpc-prometheus/examples/testproto [no test files]
ok github.com/pmezard/go-difflib/difflib 0.042s
ok github.com/stretchr/testify/assert 0.021s
ok github.com/stretchr/testify/require 0.017s
ok github.com/stretchr/testify/suite 0.012s
This PR introduces the ability for the ocsp-updater to only resubmit certificates to logs that we are missing SCTs from. Prior to this commit when a certificate was missing one or more SCTs we would submit it to every log, causing unnecessary overhead for us and the log operator.
To accomplish this a new RPC endpoint is added to the Publisher service "SubmitToSingleCT". Unlike the existing "SubmitToCT" this RPC endpoint accepts a log URI and public key in addition to the certificate DER bytes. The certificate is submitted directly to that log, and a cache of constructed resources is maintained so that subsequent submissions to the same log can reuse the stat name, verifier, and submission client.
Resolves#1679
`grpc/creds:serverTransportCredentials.validateClient` is meant to ignore the check if the `acceptedSANs` map it is constructed with is `nil`. This never happens as the map is constructed using `make(map[string]struct{})` meaning it can never be `nil`.
Instead start with a `nil` map and only populate it if we have `ClientNames` to whitelist.
Fixes#2375.
Previously we had custom code in each gRPC wrapper to implement timeouts. Moving
the timeout code into the client interceptor allows us to simplify things and
reduce code duplication.
Adds a gRPC server to the SA and SA gRPC Clients to the WFE, RA, CA, Publisher, OCSP updater, orphan finder, admin revoker, and expiration mailer.
Also adds a CA gRPC client to the OCSP Updater which was missed in #2193.
Fixes#2347.
This PR updates our GRPC library dep. to v1.0.3. It's likely we can update to v1.0.4 without much effort but on a first attempt it seems that the SupportPackageIsVersion3 to SupportPackageIsVersion4 change might cause some headaches so I started with 1.0.3.
The grpc/creds.go serverTransportCredentials and clientTransportCredentials needed two new funcs (Clone and OverrideServerName) to conform to the updated credentials.TransportCredentials interface.
It's tempting to remove grpc/creds.go clientTransportCredentials entirely now that the TLSCredentials from upstream has a OverrideServerName function we can use, but unfortunately it only supports one hostname and must be called a-head of ClientHandshake that we still need clientTransportCredentials for our use-case. I tried this and failed, so clientTransportCredentials remains in bgrpc/creds/creds.go.
Per CONTRIBUTING.md I've verified the unit tests pass:
daniel@XXXXXX:~/go/src/google.golang.org/grpc$ git show -s
commit b7f1379d3cbbbeb2ca3405852012e237aa05459e
Merge: 33731fd bac9e1d
Author: Qi Zhao <toqizhao@gmail.com>
Date: Mon Oct 17 16:02:05 2016 -0700
Merge pull request #903 from improbable-io/blocking-graceful-shutdown-fix
Make concurrent Server.GracefulStop calls all behave equivalently.
daniel@XXXXXX:~/go/src/google.golang.org/grpc$ go test ./...
ok google.golang.org/grpc 0.215s
ok google.golang.org/grpc/benchmark 0.017s
? google.golang.org/grpc/benchmark/client [no test files]
? google.golang.org/grpc/benchmark/grpc_testing [no test files]
? google.golang.org/grpc/benchmark/server [no test files]
? google.golang.org/grpc/benchmark/stats [no test files]
? google.golang.org/grpc/benchmark/worker [no test files]
? google.golang.org/grpc/codes [no test files]
ok google.golang.org/grpc/credentials 0.041s
? google.golang.org/grpc/credentials/oauth [no test files]
? google.golang.org/grpc/examples/helloworld/greeter_client [no test files]
? google.golang.org/grpc/examples/helloworld/greeter_server [no test files]
? google.golang.org/grpc/examples/helloworld/helloworld [no test files]
? google.golang.org/grpc/examples/route_guide/client [no test files]
? google.golang.org/grpc/examples/route_guide/routeguide [no test files]
? google.golang.org/grpc/examples/route_guide/server [no test files]
ok google.golang.org/grpc/grpclb 0.047s
? google.golang.org/grpc/grpclb/grpc_lb_v1 [no test files]
? google.golang.org/grpc/grpclog [no test files]
? google.golang.org/grpc/grpclog/glogger [no test files]
? google.golang.org/grpc/health [no test files]
? google.golang.org/grpc/health/grpc_health_v1 [no test files]
? google.golang.org/grpc/internal [no test files]
? google.golang.org/grpc/interop [no test files]
? google.golang.org/grpc/interop/client [no test files]
? google.golang.org/grpc/interop/grpc_testing [no test files]
? google.golang.org/grpc/interop/server [no test files]
ok google.golang.org/grpc/metadata 0.004s
? google.golang.org/grpc/naming [no test files]
? google.golang.org/grpc/peer [no test files]
ok google.golang.org/grpc/reflection 0.029s
? google.golang.org/grpc/reflection/grpc_reflection_v1alpha [no test files]
? google.golang.org/grpc/reflection/grpc_testing [no test files]
? google.golang.org/grpc/stress/client [no test files]
? google.golang.org/grpc/stress/grpc_testing [no test files]
? google.golang.org/grpc/stress/metrics_client [no test files]
ok google.golang.org/grpc/test 94.693s
? google.golang.org/grpc/test/codec_perf [no test files]
? google.golang.org/grpc/test/grpc_testing [no test files]
ok google.golang.org/grpc/transport 12.574s
As described in #2282, our gRPC code uses mutual TLS to authenticate both clients and servers. However, currently our gRPC servers will accept any client certificate signed by the internal CA we use to authenticate connections. Instead, we would like each server to have a list of which clients it will accept. This will improve security by preventing the compromise of one client private key being used to access endpoints unrelated to its intended scope/purpose.
This PR implements support for gRPC servers to specify a list of accepted client names. A `serverTransportCredentials` implementing `ServerHandshake` uses a `verifyClient` function to enforce that the connecting peer presents a client certificate with a SAN entry that matches an entry on the list of accepted client names
The `NewServer` function from `grpc/server.go` is updated to instantiate the `serverTransportCredentials` used by `grpc.NewServer`, specifying an accepted names list populated from the `cmd.GRPCServerConfig.ClientNames` config field.
The pre-existing client and server certificates in `test/grpc-creds/` are replaced by versions that contain SAN entries as well as subject common names. A DNS and an IP SAN entry are added to allow testing both methods of specifying allowed SANs. The `generate.sh` script is converted to use @jsha's `minica` tool (OpenSSL CLI is blech!).
An example client whitelist is added to each of the existing gRPC endpoints in config-next/ to allow the SAN of the test RPC client certificate.
Resolves#2282
This commit updates the `go-jose` dependency to [v1.1.0](https://github.com/square/go-jose/releases/tag/v1.1.0) (Commit: aa2e30fdd1fe9dd3394119af66451ae790d50e0d). Since the import path changed from `github.com/square/...` to `gopkg.in/square/go-jose.v1/` this means removing the old dep and adding the new one.
The upstream go-jose library added a `[]*x509.Certificate` member to the `JsonWebKey` struct that prevents us from using a direct equality test against two `JsonWebKey` instances. Instead we now must compare the inner `Key` members.
The `TestRegistrationContactUpdate` function from `ra_test.go` was updated to populate the `Key` members used in testing instead of only using KeyID's to allow the updated comparisons to work as intended.
The `Key` field of the `Registration` object was switched from `jose.JsonWebKey` to `*jose.JsonWebKey ` to make it easier to represent a registration w/o a Key versus using a value with a nil `JsonWebKey.Key`.
I verified the upstream unit tests pass per contributing.md:
```
daniel@XXXXX:~/go/src/gopkg.in/square/go-jose.v1$ git show
commit aa2e30fdd1fe9dd3394119af66451ae790d50e0d
Merge: 139276c e18a743
Author: Cedric Staub <cs@squareup.com>
Date: Thu Sep 22 17:08:11 2016 -0700
Merge branch 'master' into v1
* master:
Better docs explaining embedded JWKs
Reject invalid embedded public keys
Improve multi-recipient/multi-sig handling
daniel@XXXXX:~/go/src/gopkg.in/square/go-jose.v1$ go test ./...
ok gopkg.in/square/go-jose.v1 17.599s
ok gopkg.in/square/go-jose.v1/cipher 0.007s
? gopkg.in/square/go-jose.v1/jose-util [no test files]
ok gopkg.in/square/go-jose.v1/json 1.238s
```
Based on experience with the new gRPC staging deployment. gRPC generates `FullMethod` names such as `-ServiceName-MethodName` which can be confusing. For client calls to a service we actually want something formatted like `ServiceName-MethodName` and for server requests we want just `MethodName`.
This PR adds a method to clean up the `FullMethod` names returned by gRPC and formats them the way we expect.
Fixes#1880.
Updates google.golang.org/grpc and github.com/jmhodges/clock, both test suites pass. A few of the gRPC interfaces changed so this also fixes those breakages.
While testing with real proxies I noticed the original CDR implementation was actually pretty broken, this refactors a bit and fixes a number of bugs. With this patch fallback to GPDNS over three distributed test proxies worked perfectly.
(Side note: `nginx` is not a viable forward proxy for this use as it doesn't support SSL, and a bunch of other _real_ forward proxy features, I ended up just using `squid3`.)
The main error in the previous implementation was the fallback was implemented in `getCAASet` which is only called in the old code path (the local CAA impl instead of the remote service) which mean't it wasn't actually being tested in the integration test. This also refactors a few repeated blocks into their own functions. Also there was a unicode encoding problem somewhere with the query string but for the life of me I can't figure out why it was broken now.
Adds a server side unary RPC interceptor which includes basic stats. We could also use this to add a server request ID to the context.Context to identify the call through the system, but really I'd rather do that on the client side before the RPC is sent which requires the client interceptor implementation upstream. Also updates google.golang.org/grpc.
Updates #1880.
- Run both gRPC and AMQP servers simultaneously
- Take explicit constructor parameters and unexport fields that were previously set by users
- Remove transitional DomainCheck code in RA now that GSB is enabled.
- Remove some leftover UpdateValidation dummy methods.
* Split out CAA checking service (minus logging etc)
* Add example.yml config + follow general Boulder style
* Update protobuf package to correct version
* Add grpc client to va
* Add TLS authentication in both directions for CAA client/server
* Remove go lint check
* Add bcodes package listing custom codes for Boulder
* Add very basic (pull-only) gRPC metrics to VA + caa-service