Commit Graph

42 Commits

Author SHA1 Message Date
James Renken ac68828f43
Replace most uses of net.IP with netip.Addr (#8205)
Retain `net.IP` only where we directly work with `x509.Certificate` and
friends.

Fixes #5925
Depends on #8196
2025-05-27 15:05:35 -07:00
James Renken 3f879ed0b4
Add Identifiers to Authorization & Order structs (#7961)
Add `identifier` fields, which will soon replace the `dnsName` fields,
to:
- `corepb.Authorization`
- `corepb.Order`
- `rapb.NewOrderRequest`
- `sapb.CountFQDNSetsRequest`
- `sapb.CountInvalidAuthorizationsRequest`
- `sapb.FQDNSetExistsRequest`
- `sapb.GetAuthorizationsRequest`
- `sapb.GetOrderForNamesRequest`
- `sapb.GetValidAuthorizationsRequest`
- `sapb.NewOrderRequest`

Populate these `identifier` fields in every function that creates
instances of these structs.

Use these `identifier` fields instead of `dnsName` fields (at least
preferentially) in every function that uses these structs. When crossing
component boundaries, don't assume they'll be present, for
deployability's sake.

Deployability note: Mismatched `cert-checker` and `sa` versions will be
incompatible because of a type change in the arguments to
`sa.SelectAuthzsMatchingIssuance`.

Part of #7311
2025-03-26 10:30:24 -07:00
Jacob Hoffman-Andrews 04dec59c67
ra: log User-Agent (#7908)
In the WFE, store the User-Agent in a `context.Context` object. In our
gRPC interceptors, pass that field in a Metadata header, and re-add it
to `Context` on the server side.

Add a test in the gRPC interceptors that User-Agent is properly
propagated.

Note: this adds a new `setup()` function for the gRPC tests that is
currently only used by the new test. I'll upload another PR shortly that
expands the use of that function to more tests.

Fixes https://github.com/letsencrypt/boulder/issues/7792
2025-01-14 13:39:41 -08:00
Jacob Hoffman-Andrews d6e163c15d
Revert "wfe: on rate limit error, serve 500 (#7796)" (#7900)
This reverts commit 242d746040 (#7796)

We want to make this change, but it carries some risk that we'd prefer
not to take over the holiday. We'd also like to keep `main` in a state
where it would be reasonable to deploy (even if, in practice, any
over-the-holiday deploy would be a hotfix, not a direct tag from
`main`).
2024-12-20 11:04:19 -08:00
Jacob Hoffman-Andrews 242d746040
wfe: on rate limit error, serve 500 (#7796)
This affects NewAccount and NewOrder.
2024-12-17 17:09:57 -08:00
Jacob Hoffman-Andrews 27e65f3e9f
ratelimits: add detail to error messages (#7871)
For batch operations, include the operation and the number of keys in
the error message. This should help diagnose whether we are getting `i/o
timeout` errors disproportionately for larger requests, or for certain
operations.

Also, make the ignored errors part of the overall WFE request logs,
which allows us to get additional context, like whether certain
requesters or domain names are getting disproportionately many errors.

Related to #7846.
2024-12-05 15:58:26 -08:00
Jacob Hoffman-Andrews 02685602a2
web: add feature flag PropagateCancels (#7778)
This allow client-initiated cancels to propagate through gRPC.

IN-10803 tracks the SRE-side changes to enable this flag.
2024-11-04 14:37:29 -08:00
Aaron Gable aae7fb3551
Use go1.21's context.WithoutCancel (#7073)
This new standard library method returns a context with all of the
original metadata (e.g. tracing spans) still attached, but which will
not be canceled by any cancel funcs, deadlines, or timeouts set on the
parent context. We do this manually in a few places to prevent client
cancellations (usually disconnects) from disrupting our work, so this
just makes that code slightly simpler.

Fixes https://github.com/letsencrypt/boulder/issues/5506
2023-09-11 09:01:50 -07:00
Phil Porada 439517543b
CI: Run staticcheck standalone (#7055)
Run staticcheck as a standalone binary rather than as a library via
golangci-lint. From the golangci-lint help out,
> staticcheck (megacheck): It's a set of rules from staticcheck. It's
not the same thing as the staticcheck binary. The author of staticcheck
doesn't support or approve the use of staticcheck as a library inside
golangci-lint.

We decided to disable ST1000 which warns about incorrect or missing
package comments.

For SA4011, I chose to change the semantics[1] of the for loop rather
than ignoring the SA4011 lint for that line.

Fixes https://github.com/letsencrypt/boulder/issues/6988

1. https://go.dev/ref/spec#Continue_statements
2023-08-31 21:09:40 -07:00
Aaron Gable 9e3b4bec18
Remove contact addresses from WFE logs (#6939)
The contacts field of an account can be very verbose, and is irrelevant
to the vast majority -- e.g. creating orders, validating challenges, and
downloading certificates -- of requests made by an account. To reduce
the length of our WFE log lines, remove the Contacts field from all
logs. When we actually need it, we can get it from the database.

Also remove the RequestEvent.TLS field, which is unused.
2023-06-20 14:56:27 -07:00
Aaron Gable 8224fad20b
Update to go1.20.5 (#6946)
We are already running go1.20.5 in production.
2023-06-20 14:55:37 -07:00
Matthew McPherrin 0060e695b5
Introduce OpenTelemetry Tracing (#6750)
Add a new shared config stanza which all boulder components can use to
configure their Open Telemetry tracing. This allows components to
specify where their traces should be sent, what their sampling ratio
should be, and whether or not they should respect their parent's
sampling decisions (so that web front-ends can ignore sampling info
coming from outside our infrastructure). It's likely we'll need to
evolve this configuration over time, but this is a good starting point.

Add basic Open Telemetry setup to our existing cmd.StatsAndLogging
helper, so that it gets initialized at the same time as our other
observability helpers. This sets certain default fields on all
traces/spans generated by the service. Currently these include the
service name, the service version, and information about the telemetry
SDK itself. In the future we'll likely augment this with information
about the host and process.

Finally, add instrumentation for the HTTP servers and grpc
clients/servers. This gives us a starting point of being able to monitor
Boulder, but is fairly minimal as this PR is already somewhat unwieldy:
It's really only enough to understand that everything is wired up
properly in the configuration. In subsequent work we'll enhance those
spans with more data, and add more spans for things not automatically
traced here.

Fixes https://github.com/letsencrypt/boulder/issues/6361

---------

Co-authored-by: Aaron Gable <aaron@aarongable.com>
2023-04-21 10:46:59 -07:00
Matthew McPherrin e1ed1a2ac2
Remove beeline tracing (#6733)
Remove tracing using Beeline from Boulder. The only remnant left behind
is the deprecated configuration, to ensure deployability.

We had previously planned to swap in OpenTelemetry in a single PR, but
that adds significant churn in a single change, so we're doing this as
multiple steps that will each be significantly easier to reason about
and review.

Part of #6361
2023-03-14 15:14:27 -07:00
Jacob Hoffman-Andrews 67927390e7
wfe: remove Payload from logs (#6639)
Also remove CSRDNSNames, CSRIPAddresses and CSREmailAddresses.

And add a new log field "DNSNames", for use in new-order, finalize, and
revoke requests.

Add a "RevocationReason" field in the "Extra" section for revoke
requests.
2023-02-09 13:45:14 -08:00
Jacob Hoffman-Andrews c23e59ba59
wfe2: don't pass through client-initiated cancellation (#6608)
And clean up the code and tests that were used for cancellation
pass-through.

Fixes #6603
2023-01-26 17:26:15 -08:00
Phil Porada 8bc4005423
Remove some ACMEv1 comments/documentation (#6584)
We shut down the ACMEv1 API in summer of 2021 and no longer have use for
this text. Requested by @aarongable over at
https://github.com/letsencrypt/boulder/pull/6581
2023-01-12 10:56:39 -08:00
Jacob Hoffman-Andrews 21b2ec9c42
Revert "wfe: Log TLS version (#6001)" (#6399)
This reverts commit 7336f1acce.
2022-09-26 11:16:32 -07:00
Aaron Gable 8e156f4dd9
Quiet log output for successful /directory requests (#6096)
Add functionality to the ubiquitous RequestEvent (aka logEvent) to
allow handlers to suppress the final log line that is printed when
a non-500 response is being sent.

Use this functionality to suppress logging GET requests for the /directory
endpoint. We can expand this in the future to quiet other logs that
are not helpful for metrics or analysis.

Fixes #6094
2022-05-13 08:35:34 -07:00
Jacob Hoffman-Andrews 7336f1acce
wfe: Log TLS version (#6001)
This will help inform deprecation of TLS 1.0 and TLS 1.1 for ACME API requests.
2022-03-21 14:01:52 -07:00
Aaron Gable ab79f96d7b
Fixup staticcheck and stylecheck, and violations thereof (#5897)
Add `stylecheck` to our list of lints, since it got separated out from
`staticcheck`. Fix the way we configure both to be clearer and not
rely on regexes.

Additionally fix a number of easy-to-change `staticcheck` and
`stylecheck` violations, allowing us to reduce our number of ignored
checks.

Part of #5681
2022-01-20 16:22:30 -08:00
Aaron Gable 9abb39d4d6
Honeycomb integration proof-of-concept (#5408)
Add Honeycomb tracing to all Boulder components which act as
HTTP servers, gRPC servers, or gRPC clients. Add many values
which we currently emit to logs to the trace spans. Add a way to
configure the Honeycomb integration to our config files, and by
default configure all of our tests to "mute" (send nothing).

Followup changes will refine the configuration, attempt to reduce
the new dependency load, and introduce better sampling.

Part of https://github.com/letsencrypt/dev-misc-tickets/issues/218
2021-05-24 16:13:08 -07:00
Aaron Gable d59e715c9d
Web: Preserve context from request (#5404)
The `http.Request` object can already have a context associated
with it. If it does, preserve that context rather than creating a new
one. If it doesn't, create a new `context.Background` instead.
2021-04-29 14:20:54 -07:00
Andrew Gabbitas 0fdfbe1211
Deprecate StripDefaultSchemePort flag (#5265)
This flag is now enabled in Let's Encrypt staging/prod.

This change deprecates the flag and prepares it for deletion in a future
change. It can then be removed once no staging/prod configs reference the
flag.

Fixes #5236
2021-02-08 11:30:52 -08:00
Roland Bracewell Shoemaker 56898e8953
Log RSA key sizes in WFE/WFE2 and add feature to restrict them (#4839)
Currently 99.99% of RSA keys we see in certificates at Let's Encrypt are
either 2048, 3072, or 4096 bits, but we support every 8 bit increment
between 2048 and 4096. Supporting these uncommon key sizes opens us up to
having to block much larger ranges of keys when dealing with something
like the Debian weak keys incident. Instead we should just reduce the
set of key sizes we support down to what people actually use.

Fixes #4835.
2020-06-08 11:23:27 -07:00
Jacob Hoffman-Andrews bef02e782a
Fix nits found by staticcheck (#4726)
Part of #4700
2020-03-30 10:20:20 -07:00
Daniel McCarney 3175b4f9eb
web: strip :443/:80 unconditionally w/ features.StripDefaultSchemePort (#4505)
Only removing :443 when the http.Request.TLS is not nil breaks when
Boulder's WFE/WFE2 are running HTTP behind a separate ingress proxy that
terminates HTTPS on its behalf.
2019-10-23 15:17:13 -04:00
Roland Bracewell Shoemaker 31ed590edd
Strip default scheme ports from Host headers (#4448)
Fixes #4447.
2019-09-27 16:14:40 -07:00
Jacob Hoffman-Andrews c777dfece6 Log the Origin header. (#4376)
XHR requests from web-based ACME clients provide the User-Agent
of the browser that initiated the request, but the hostname of the site
that originated the request is sent in the Origin header. This will let
us better analyze web-based ACME traffic.

Fixes #4370
2019-07-31 09:47:44 -07:00
Roland Bracewell Shoemaker a1540bd5ec web: log 200 if code not explicitly set (#4296) 2019-06-25 16:59:30 -04:00
Roland Bracewell Shoemaker 6f93942a04 Consistently used stdlib context package (#4229) 2019-05-28 14:36:16 -04:00
Jacob Hoffman-Andrews ee337cd5d1 Update wfe2. 2019-01-09 15:19:04 -08:00
Jacob Hoffman-Andrews 0123b35295 Make Contacts optional in logs. 2019-01-09 14:07:08 -08:00
Jacob Hoffman-Andrews d2d5ba294b Shrink byte size of WFE request logs.
- Log the simple, non-whitespace-containing fields as positional
parameters to avoid the JSON overhead for them.
- Log latency in milliseconds rather than seconds (saves "0.").
- Hoist some fields from the "Extra" sub-object and give
  them shorter names. This saves the bytes for rendering the "Extra"
  field plus the bytes for the longer names.

Example output from integration tests:

Before (1687 bytes):

I205230 boulder-wfe JSON={"Endpoint":"/directory","Method":"GET","UserAgent":"Boulder integration tester","Latency":0.001,"Code":0}
I205230 boulder-wfe JSON={"Endpoint":"/acme/new-reg","Method":"HEAD","Error":"405 :: malformed :: Method not allowed","UserAgent":"Boulder integration tester","Latency":0,"Code":405}
I205230 boulder-wfe JSON={"Endpoint":"/acme/new-reg","Method":"POST","Requester":611,"Contacts":[],"UserAgent":"Boulder integration tester","Latency":0.025,"Code":201,"Payload":"{\n  \"resource\": \"new-reg\"\n}"}
I205230 boulder-wfe JSON={"Endpoint":"/acme/reg/","Slug":"611","Method":"POST","Requester":611,"Contacts":[],"UserAgent":"Boulder integration tester","Latency":0.021,"Code":202,"Payload":"{\n  \"status\": \"valid\", \n  \"resource\": \"reg\", \n  \"agreement\": \"http://boulder:4000/terms/v1\", \n  \"key\": {\n    \"e\": \"AQAB\", \n    \"kty\": \"RSA\", \n    \"n\": \"r1zCJC8Muw5K8ti-pjojivHxyNxOZye-N5aX_i7kBiHrAOp9qxgQUHUyU3COCjFPrSzScTpKoIyCwdL7x-1mPX3pby7CzGugtY9da_LZkDmsDE8LIuQkZ_wRLyh1103OQZEd71AlddMx1iwLLVl4UTICoJFUfYvXHvkqmsE5xhBPJhl-SdSrJM6F7Kn7k0WycA5ig_QPbjVbzJlQq-C65iGDJtc_LvY0FFF4exThZM7xsvucJywJMHCEWZUktm9YB-CBNA1gVbL52u22jQpX-MN52UVdqSh9ZipoJLtxKjZx31DHB_bcdgtJ8YGIE4lY_ZAax1Ut-a5WTJvVq2Hk8w\"\n  }\n}"}
I205230 boulder-wfe JSON={"Endpoint":"/acme/new-authz","Method":"POST","Requester":611,"Contacts":[],"UserAgent":"Boulder integration tester","Latency":0.031,"Code":201,"Payload":"{\n  \"identifier\": {\n    \"type\": \"dns\", \n    \"value\": \"rand.18fe4d73.xyz\"\n  }, \n  \"resource\": \"new-authz\"\n}","Extra":{"AuthzID":"PgF1JQ3TK6c1FR0wVdm_mYows_xWSsyYgyezSvSNI-0","Identifier":{"type":"dns","value":"rand.18fe4d73.xyz"}}}

After (1406 bytes):

I210117 boulder-wfe GET /directory 0 0 0 0.0.0.0 JSON={"ua":"Boulder integration tester"}
I210117 boulder-wfe HEAD /acme/new-reg 0 405 0 0.0.0.0 JSON={"Error":"405 :: malformed :: Method not allowed","ua":"Boulder integration tester"}
I210117 boulder-wfe POST /acme/new-reg 676 201 23 0.0.0.0 JSON={"Contacts":[],"ua":"Boulder integration tester","Payload":"{\n  \"resource\": \"new-reg\"\n}"}
I210117 boulder-wfe POST /acme/reg/ 676 202 23 0.0.0.0 JSON={"Slug":"676","Contacts":[],"ua":"Boulder integration tester","Payload":"{\n  \"status\": \"valid\", \n  \"resource\": \"reg\", \n  \"agreement\": \"http://boulder:4000/terms/v1\", \n  \"key\": {\n    \"e\": \"AQAB\", \n    \"kty\": \"RSA\", \n    \"n\": \"zXSFAzdzwwFGjNysmG0YE7MxAwQ8JkkvLQ7Qs7xB1h5kFM_F-W2jxYEmrRTrA0ylfuzb4RQMBrsLfv0XV8rsDIuP_t92ADBjfd25ajuuia9EGrhpHitFimEUlZjsqGQp8F49xLhDMAqm1SLBY_k1pY8TKSLHeyOyLYIKLaL3Ra9yZ63qB65oGuNhXroKqqx7nUjyZtqtUV5NUPvPgvhJgXgYKMjck3jXWgr4ZGqYyJQqNqydYSk3uJGfruChakZThwl3vbH8aUPaeoXcvPA8KaQl56JUf7jAVY3n9qKKb5mgT96vDKWUpJaI5YE1rMZIJfkaFK-ZZIhFeeKCSsSJlQ\"\n  }\n}"}
I210117 boulder-wfe POST /acme/new-authz 676 201 35 0.0.0.0 JSON={"Contacts":[],"ua":"Boulder integration tester","Payload":"{\n  \"identifier\": {\n    \"type\": \"dns\", \n    \"value\": \"rand.14ebdfd1.xyz\"\n  }, \n  \"resource\": \"new-authz\"\n}","Created":"Z-soxIEhsGlMK3GYyDqYrSlxDFEeH6q3mrd6aoi2iIs","DNSName":"rand.14ebdfd1.xyz"}
2019-01-09 13:03:07 -08:00
Jacob Hoffman-Andrews b1be4ccaed Fix latency logging. (#3937)
In the VA, we were rendering a Duration to JSON, which gave an integer
number of nanoseconds rather than a float64 of seconds. Also, in both VA
and WFE we were rendering way more precision than we needed. Millisecond
precision is enough, and since we log latency for every WFE response,
the extra bytes are worth saving.
2018-11-14 10:52:48 -05:00
Joel Sing 8ebdfc60b6 Provide formatting logger functions. (#3699)
A very large number of the logger calls are of the form log.Function(fmt.Sprintf(...)).
Rather than sprinkling fmt.Sprintf at every logger call site, provide formatting versions
of the logger functions and call these directly with the format and arguments.

While here remove some unnecessary trailing newlines and calls to String/Error.
2018-05-10 11:06:29 -07:00
Roland Bracewell Shoemaker 9821aeb46f Split internal and public errors out in web.RequestEvent (#3682)
Splits out the old `Errors` slice into a public `Error` string and a `InternalErrors` slice. Also removes a number of occurrences of calling `logEvent.AddError` then immediately calling `wfe.sendError` with either the same internal error which caused the same error to be logged twice or no error which is slightly redundant as `wfe.sendError` calls `logEvent.AddError` internally.

Fixes #3664.
2018-05-03 09:13:33 -04:00
Roland Bracewell Shoemaker c3669f9068 Split endpoint and path in WFE+WFE2 web.RequestEvent (#3683) 2018-05-02 10:20:21 -07:00
Jacob Hoffman-Andrews 9e24cad3bb Add latency logging to WFE and WFE2. (#3617)
Fixes #3609.
2018-04-04 21:02:49 +01:00
Jacob Hoffman-Andrews eb23cb3ffc Remove "Terminated request" / "Successful request" (#3484)
The WFE logs these with every request, but with #3483,
they aren't necessary; everything other than 2xx is a failed request.
2018-02-28 15:16:36 -08:00
Jacob Hoffman-Andrews bf5dc8b929 Log status code in WFE JSON logs. (#3483)
This field was introduced to the logs in #2628 but without the code required
to fill it.
2018-02-27 10:16:45 -05:00
Jacob Hoffman-Andrews c0ffa3d5d1 Remove logging of Request/ResponseNonce. (#3421)
These take up a lot of space in the logs, and we almost never reference
them.
2018-02-06 10:17:12 -05:00
Jacob Hoffman-Andrews 97265c9184 Factor out context.go from wfe and wfe2. (#3086)
* Move probs.go to web.

* Move probs_test.go

* Factor out probs.go from wfe

* Move context.go

* Extract context.go into web package.

* Add a constructor for TopHandler.
2017-09-26 13:54:14 -04:00