* Enable golint and exclude some other generated or additional dirs
Also remove `test` ignore, since it's covered by path ignore rule.
* meh
* fixes
* more
* progressing
* further
* like a boss
Common use cases for this webhook involve using Kubernetes's
generateName API to randomise resource names (this is a good
idea in Tekton pipelines, for example, where there are uniqueness
constraints. That means that the webhook metrics here end up with
very high cardinality, which makes Prometheus fall over. Even
without generateName, it is possible to shoot oneself in the foot.
This commit just removes the resource_name label altogether.
* Remove patch detail for webhook being logged on info
It might contain the sensitive data so print it to debug log
* Pass admission review info to the fields in structured logging
* Remove knative namespace from admission webhook logkey
* Use loosely-typed key-value pairs for suggered logger in admission wb
* Move admission specific logkey to local package
* Print patch type with patch body on debug log
We still see users frequently struggle with this and the error message seemingly doesn't help them. This gives a more explicit description of what the user can do to fix the situation.
* Various cleanups around the codebase
- unindent the else after return
- make things private that are not used anywhere
- rearrange params
- etc
* add
* include a filter on control plane namespaces for defaulting and validation webhooks from knative/pkg
* Update unit tests to include control-plane
* adding a comment to explain why we are adding 'control-plane' to the webhook config
* Use two lane queue instead of the regular workqueue
- we need to poll for len in the webhook tests because we have async propagation now, and check at the wrong time will be not correct.
- otherwise just a drop in replacement.
* update test
* cmt
* tests hardened
* Implement a new shared "Drainer" handler.
This implements a new `http.Handler` called `Drainer`, which is intended to wrap some inner `http.Handler` business logic with a new outer handler that can respond to Kubelet probes (successfully until told to "Drain()").
This takes over the webhook's relatively new probe handling and lame duck logic with one key difference. Previously the webhook waited for a fixed period after SIGTERM before exitting, but the new logic waits for this same grace period AFTER THE LAST REQUEST. So if the handler keeps getting (non-probe) requests, the timer will continually reset, and once it stops receiving requests for the configured grace period, "Drain()" will return and the webhook will exit.
The goal of this work is to try to better cope with what we believe to be high tail latencies of the API server seeing that a webhook replica is shutting down.
Related: https://github.com/knative/pkg/issues/1509
* Switch to RWLock
* Allows for webhooks to exclude certain namspaces
Added a namespaces selector to the mutating webhook configuration which
allows for excluding namespaces from the webhook
Fixes#1379
* Updated skipWebhooks key to skip-webhooks for defaulting and validating
webhooks
* Updated table tests with new label
* Updated key name to webhooks.knative.dev/exclude
* Added common name to cert tmpl
Signed-off-by: Jennifer Power <barnabei.jennifer@gmail.com>
* Added a common name test to TestCreateCert function
* Update webhook/certificates/resources/certs_test.go
Co-authored-by: Victor Agababov <vagababov@gmail.com>
* Update webhook/certificates/resources/certs_test.go
Co-authored-by: Victor Agababov <vagababov@gmail.com>
* Applied changes to improve formatting and style
* Update webhook/certificates/resources/certs_test.go
Co-authored-by: Matt Moore <mattmoor@vmware.com>
* Update webhook/certificates/resources/certs_test.go
Co-authored-by: Matt Moore <mattmoor@vmware.com>
* Added Subject to caParsedCert CommonName check
* Fixed failure message to remove reference to diff
* Update webhook/certificates/resources/certs_test.go
Co-authored-by: Victor Agababov <vagababov@gmail.com>
* Update webhook/certificates/resources/certs_test.go
Co-authored-by: Victor Agababov <vagababov@gmail.com>
* Update webhook/certificates/resources/certs.go
Co-authored-by: Matt Moore <mattmoor@vmware.com>
* Update webhook/certificates/resources/certs.go
Co-authored-by: Matt Moore <mattmoor@vmware.com>
* Update webhook/certificates/resources/certs.go
Co-authored-by: Matt Moore <mattmoor@vmware.com>
Co-authored-by: Victor Agababov <vagababov@gmail.com>
Co-authored-by: Matt Moore <mattmoor@vmware.com>
* Check example-hash label for consistency in ConfigMap webhook.
We've seen users try to edit our configuration and falling into the trap of editing the '_example' block a lot. This attempts at guiding the users to do "the right thing" by checking the ConfigMap's '_example' value (if present) against a precomputed hash of the same value (if present). The idea is that we precompute this has using the tool herein in code generation and thus allow us to easily change the example block automatically while making it hard to change it on ppurpose. If the hashes don't match on an upgrade, the webhook will return an error synchronously, guiding the user to the correct behavior.
* Add tests.
* Add hash check to test utilities.
* Add a bit of coverage.
* Rewrite into a table test.
* Rename.
* Reduce test surface.
* Godoc.
* Code nits.
* Docs.
* Use CRC32.
* Nits.
* Kubelet probes would result in the webhook writing the HTTP status twice
Doesn't seem like it affected anything - just writes out some extra
log messages
* nits
* nits
* nits
* nits
* Add the key of the object to the log context
We don't log _what_ we convert, but only _what type_ it is.
And it's not very useful
So log all the stuff
* issues
* redo
* Add helper function to read port number from env var
* Add check for port == 0
* Add the unit test
* Update error message
Co-Authored-By: savitaashture <sashture@redhat.com>
* Panic when env var is set but invalid
* Rename the func to be PortFromEnv.
* Apply suggestions from code review on error message
Co-authored-by: Victor Agababov <vagababov@gmail.com>
* Join defers
Co-authored-by: savitaashture <sashture@redhat.com>
Co-authored-by: Victor Agababov <vagababov@gmail.com>
* Implement the K8s lifecycle in webhook.
The webhook never properly implemented the Kubernetes SIGTERM/SIGKILL
lifecycle, and doesn't even really support readiness probes today. This
change enables folks to use a block like this on their webhook container:
```yaml
readinessProbe: &probe
periodSeconds: 1
httpGet:
scheme: HTTPS
port: 8443
httpHeaders:
- name: k-kubelet-probe
value: "webhook"
livenessProbe: *probe
```
With this, the webhook won't report as `Ready` until a probe has succeeded,
and when the SIGTERM is received, we will start failing probes for a grace
period (so our Endpoint drops) before shutting down the webhook's HTTP Server.
This was uncovered by running the webhook across 10 replicas in Serving with
the "Goose" (https://github.com/knative/pkg/pull/1316) enabled for the e2e
tests. The failure mode I saw was conversion webhook requests failing across
random tests.
This also moves the Serving probe-detection function into PKG.
* Increase the log level when we start to fail probes
* Wait for go routines to terminate on all paths.
* do not record for empty metric config
* Revert "do not record for empty metric config"
This reverts commit 539a5e4dbb.
* add a comment
* fix typo
* fix tests
* revert
* revert tests
* revert
* fix conflicts
* one more test file
* Remove unused code.
* Use raw strings to avoid escaping.
* Remove unneeded type conversions.
* Preallocate slices where possible.
* Use semantic equality in psbinding reconciler.