Previously, a given binary would have three TLS config fields (CA cert, cert,
key) for its gRPC server, plus each of its configured gRPC clients. In typical
use, we expect all three of those to be the same across both servers and clients
within a given binary.
This change reuses the TLSConfig type already defined for use with AMQP, adds a
Load() convenience function that turns it into a *tls.Config, and configures it
for use with all of the binaries. This should make configuration easier and more
robust, since it more closely matches usage.
This change preserves temporary backwards-compatibility for the
ocsp-updater->publisher RPCs, since those are the only instances of gRPC
currently enabled in production.
This allows finer-grained control of which components can request issuance. The OCSP Updater should not be able to request issuance.
Also, update test/grpc-creds/generate.sh to reissue the certs properly.
Resolves#2417
Previously we had custom code in each gRPC wrapper to implement timeouts. Moving
the timeout code into the client interceptor allows us to simplify things and
reduce code duplication.
Adds a gRPC server to the SA and SA gRPC Clients to the WFE, RA, CA, Publisher, OCSP updater, orphan finder, admin revoker, and expiration mailer.
Also adds a CA gRPC client to the OCSP Updater which was missed in #2193.
Fixes#2347.
Implements a less RPC focused signal catch/shutdown method. Certain things that probably could also use this (i.e. `ocsp-updater`) haven't been given it as they would require rather substantial changes to allow for a graceful shutdown approach.
Fixes#2298.
With the current gRPC design the CA talks directly to the Publisher when calling SubmitToCT which crosses security bounadries (secure internal segment -> internet facing segment) which is dangerous if (however unlikely) the Publisher is compromised and there is a gRPC exploit that allows memory corruption on the caller end of a RPC which could expose sensitive information or cause arbitrary issuance.
Instead we move the RPC call to the RA which is in a less sensitive network segment. Switching the call site from the CA -> RA is gated on adding the gRPC PublisherService object to the RA config.
Fixes#2202.
Add feature flagged support for issuing for IDNs, fixes#597.
This patch expects that clients have performed valid IDN2008 encoding on any label that includes unicode characters. Invalid encodings (including non-compatible IDN2003 encoding) will be rejected. No script-mixing or script exclusion checks are performed as we assume that if a name is resolvable that it conforms to the registrar's policies on these matters and if it uses non-standard scripts in sub-domains etc that browsers should be the ones choosing how to display those names.
Required a full update of the golang.org/x/net tree to pull in golang.org/x/net/idna, all test suites pass.
Updates #1699.
Adds a new package, `features`, which exposes methods to set and check if various internal features are enabled. The implementation uses global state to store the features so that services embedded in another service do not each require their own features map in order to check if something is enabled.
Requires a `boulder-tools` image update to include `golang.org/x/tools/cmd/stringer`.
Fixes#1880.
Updates google.golang.org/grpc and github.com/jmhodges/clock, both test suites pass. A few of the gRPC interfaces changed so this also fixes those breakages.
Introduces the `authorizationLifetimeDays` and `pendingAuthorizationLifetimeDays` configuration options for `RA`.
If the values are missing from configuration, the code defaults back to the current values (300/7 days).
fixes#2024
This PR removes the use of all anonymous struct fields that were introduced by myself as per my work on splitting up boulder-config (#1962).
The root of the bug was related to the loading of the json configuration file into the config struct. The config structs contained several embedded (anonymous) fields. An embedded (anonymous) field in a struct actually results in the flattening of the json structure. This caused json.Unmarshal to look not at the nested level, but at the root level of the json object and hence not find the nested field (i.e. AllowedSigningAlgos).
See https://play.golang.org/p/6uVCsEu3Df for a working example.
This fixes the reported bug: #2018
In #1971 we added the CAASERVFAILExceptions config field and argument to NewDNSResolverImpl. This argument only needs to be passed to the VA, where we do CAA validations. However, I accidentally added code to the RA as well to use this new config field. This changes backs that out.
Presently clients may request a new AuthZ be created for a domain that they have already proved authorization over. This results in unnecessary bloat in the authorizations table and duplicated effort.
This commit alters the `NewAuthorization` function of the RA such that before going through the work of creating a new AuthZ it checks whether there already exists a valid AuthZ for the domain/regID that expires in more than 24 hours from the current date. If there is, then we short circuit creation and return the existing AuthZ. When this case occurs the `RA.ReusedValidAuthz` counter is incremented to provide visibility.
Since clients requesting a new AuthZ and getting an AuthZ back expect to turn around and post updates to the corresponding challenges we also return early in `UpdateAuthorization` when asked to update an AuthZ that is already valid. When this case occurs the `RA.ReusedValidAuthzChallenge` counter is incremented.
All of the above behaviour is gated by a new RA config flag `reuseValidAuthz`. In the default case (false) the RA does **not** reuse any AuthZ's and instead maintains the historic behaviour; always creating a new AuthZ when requested, irregardless of whether there are already valid AuthZ's that could be reused. In the true case (enabled only in `boulder-config-next.json`) the AuthZ reuse described above is enabled.
Resolves#1854
Resolves#1810 by automatically updating the RA ratelimit.RateLimitConfig whenever the backing config file is changed. Much like the Policy Authority uses a reloader instance to support updating the Hostname policy on the fly, this PR changes the Registration Authority to use a reloader for the rate limit policy file.
Access to the ra.rlPolicies member is protected with a RWMutex now that there is a potential for the values to be reloaded while a reader is active.
A test is introduced to ensure that writing a new policy YAML to the policy config file results in new values being set in the RA's rlPolicies instance.
https://github.com/letsencrypt/boulder/pull/1894
In this PR, logger is passed to the following callers:
NewWebFrontEndImpl
NewCertificateAuthorityImpl
NewValidationAuthorityImpl
NewAmqpRPCServer
newUpdater
NewRegistrationAuthorityServer
This reduces the usage of a global singleton logger and allows tests to consistently use a mock logger.
Fixes#1642
* remove blog.Get() in wfe
* remove blog.Get() from va
* remove Blog.Get() from ca
* remove blog.Get() from oscp updater, ampq rpc server, registration authority server
* removed some pointless logging code
* remove one added newline
* fix format issue
* fix setup function to return *blog.Mock instead of being passed in
* remove useless blog.NewMock() call
- Run both gRPC and AMQP servers simultaneously
- Take explicit constructor parameters and unexport fields that were previously set by users
- Remove transitional DomainCheck code in RA now that GSB is enabled.
- Remove some leftover UpdateValidation dummy methods.
The rate limiting code previously lived in the `cmd` package without a clear justification for why. This commit moves the rate limiting code to its own `ratelimit` package and updates import paths as required. Notably all references from the `cmd` package's exported `LoadRateLimitPolicies`, `RateLimitPolicy`, and `RateLimitConfig` were moved to use `ratelimit`. This removed the `cmd` import from a couple of callers (nice!).
* Split CSR testing and name hoisting into own functions, verify CSR in RA & CA
* Move tests around and various other fixes
* 1.5.3 doesn't have the needed stringer
* Move functions to their own lib
* Remove unused imports
* Move MaxCNLength and BadSignatureAlgorithms to csr package
* Always normalizeCSR in VerifyCSR and de-export it
* Update comments
* Delete Policy DB.
This is no longer needed now that we have a JSON policy file.
* Fix tests.
* Revert Dockerfile.
* Fix create_db
* Simplify user addition.
* Fix tests.
* Fix tests
* Review fixes.
https://github.com/letsencrypt/boulder/pull/1773
* Fix all errcheck errors
* Add errcheck to test.sh
* Add a new sa.Rollback method to make handling errors in rollbacks easier.
This also causes a behavior change in the VA. If a HTTP connection is
abruptly closed after serving the headers for a non-200 response, the
reported error will be the read failure instead of the non-200.
- Remove error signatures from log methods. This means fewer places where errcheck will show ignored errors.
- Pull in latest cfssl to be compatible with errorless log messages.
- Reduce the number of message priorities we support to just those we actually use.
- AuditNotice -> AuditInfo
- Remove InfoObject (only one use, switched to Info)
- Remove EmergencyExit and related functions in favor of panic
- Remove SyslogWriter / AuditLogger separate types in favor of a single interface, Logger, that has all the logging methods on it.
- Merge mock log into logger. This allows us to unexport the internals but still override them in the mock.
- Shorten names to be compatible with Go style: New, Set, Get, Logger, NewMock, etc.
- Use a shorter log format for stdout logs.
- Remove "... Starting" log messages. We have better information in the "Versions" message logged at startup.
Motivation: The AuditLogger / SyslogWriter distinction was confusing and exposed internals only necessary for tests. Some components accepted one type and some accepted the other. This made it hard to consistently use mock loggers in tests. Also, the unnecessarily fat interface for AuditLogger made it hard to meaningfully mock out.
This eliminates the need the a database to store the hostname policy,
simplifying deployment. We keep the database for now, as part of our
deployability guidelines: we'll deploy, then switch config to the new style.
This also disables the obsolete whitelist checking code, but doesn't yet change
the function signature for policy.New(), to avoid bloating the pull request.
I'll fully remove the whitelist checking code in a future change when I also
remove the policy database code.
This provides a means to add retries to DNS look ups, and, with some
future work, end retries early if our request deadline is blown. That
future work is tagged with #1292.
Updates #1258
This moves the RTT metrics calculation inside of the DNSResolver. This
cleans up code in the RA and VA and makes some adding retries to the
DNSResolver less ugly to do.
Note: this will put `Rate` and `RTT` after the name of DNS query
type (`A`, `MX`, etc.). I think that's fine and desirable. We aren't
using this data in alerts or many dashboards, yet, so a flag day is
okay.
Fixes#1124
Moves the DNS code from core to dns and renames the dns package to bdns
to be clearer.
Fixes#1260 and will be good to have while we add retries and such.
In the process, break out AMQP config into its own struct, one per service.
The AMQPConfig struct is included by composition in the config structs that need
it. If any given service lacks an AMQP config of its own, it gets a default
value from the top-level AMQP config struct, for deployability reasons.
Tightens the RPC code to take a specific AMQP config, not an over-broad
cmd.Config.
Shortens construction of specific RPC clients so they instatiate the generic
client connection themselves, simplifying per-service startup code.
Remove unused SetTimeout method on RPC clients.
Consolidate initialization of stats and logging from each main.go into cmd
package.
Define a new config parameter, `StdoutLevel`, that determines the maximum log
level that will be printed to stdout. It can be set to 6 to inhibit debug
messages, or 0 to print only emergency messages, or -1 to print no messages at
all.
Remove the existing config parameter `Tag`. Instead, choose the tag from the
basename of the currently running process. Previously all Boulder log messages
had the tag "boulder", but now they will be differentiated by process, like
"boulder-wfe".
Shorten the date format used in stdout logging, and add the current binary's
basename.
Consolidate setup function in audit-logger_test.go.
Note: Most CLI binaries now get their stats and logging from the parameters of
Action. However, a few of our binaries don't use our custom AppShell, and
instead use codegangsta/cli directly. For those binaries, we export the new
StatsAndLogging method from cmd.
Fixes https://github.com/letsencrypt/boulder/issues/852
This allows us to call the Google Safe Browsing calls through the VA.
If the RA config's boolean UseIsSafeDomain is true, the RA will make the RPC
call to the VA during its NewAuthorization.
If the VA config's GoogleSafeBrowsingConfig struct is not nil, the VA
will check the Google Safe Browsing API in
VA.IsSafeDomain. If the GoogleSafeBrowsingConfig struct is nil, it will
always return true.
In order to actually make requests, the VA's GoogleSafeBrowsingConfig
will need to have a directory on disk it can store the local GSB hashes
it will check first and a working Google API key for the GSB API.
Fixes#1058
This means after parsing the config file, setting up stats, and dialing the
syslogger. But it is still before trying to initialize the given server. This
means that we are more likely to get version numbers logged for some common
runtime failures.
Challenge URIs should be determined by the WFE at fetch time, rather than stored
alongside the challenge in the DB. This simplifies a lot of the logic, and
allows to to remove a code path in NewAuthorization where we create an
authorization, then immediately save it with modifications to the challenges.
This change also gives challenges their own endpoint, which contains the
challenge id rather than the challenge's offset within its parent authorization.
This is also a first step towards replacing UpdateAuthorization with
UpdateChallenge: https://github.com/letsencrypt/boulder/issues/760.
The RA did not have any code to test what occurred when a challenge
failed. This let in the authz schema change in #705.
This change sets the expires column in authz back to NULLable and fixes
the RA tests (including, using clock.Clocks in the RA).
Fixes#744.
Currently, the debug http server in every service contains just the
net/http/pprof handlers. This allows us to get CPU, blocking, and memory
profiling remotely.
Along the way, remove all the places we use http.DefaultServeMux (which
includes use of http.Handle and http.HandlerFunc) and use a NewServeMux
for each place.
Fixes#457
- Moved HandlerTimer definition from various cmd/ binaries to cmd/shell.go
- Cleaned up HandlerTimer endpoint metrics
- Moved New... counter metrics from WFE to RA and add Updated... and Finalized... ones
- Added error code and problem type counter metrics to WFE
- Added validation type / status counter metrics to VA
- Consistently return the total RTT from LookupCAA, LookupCNAME, and LookupDNSSEC method
- Added DNS RTT timing metrics to VA for the various Loookup... methods
- Move dbMap construction and type converter into individual files in the sa package.
- Add DB configuration for the OCSP tool to the boulder config:
- left to the user if they want to use different boulder-config.json files
for different purposes.
- Added updater to Makefile
- Fix trailing ',' in the Boulder config, add more panic logging
- Ignore .pem files produced by the integration test
- Change RPC to use per-instance named reply-to queues.
- Finish OCSP Updater logic
- Rework RPC for OCSP to use a transfer object (due to serialization problems of x509.Certificate)
- Auditing for general errors in executables
- Auditing for improper messages received by WFE
- Automatic audit wlogging of software errors
- Audit logging for mis-routed messages
- Audit logging for certificate requests
- Auditing for improper messages received by WFE
- Add audit events table
- Expect more details in TestRegistration in web-front-end_test.go
- Remove "extra" debug details from web-front-end.go per Issue #174
- Improve test coverage of web-front-end.go
- WFE audit updates for revocation support rebase
- Add audit messages to RPC for Improper Messages and Error Conditions
- Also note misrouted messages