Since this is used both in real-code and in test-code, let's put it
somewhere it can be re-used.
Kubernetes-commit: 9c447177e85dadaba0f12fa95c5d8ee6c4e54602
some subresources were not properly included in the array due to pointers becoming stale over a resize
Kubernetes-commit: d74b6b2cfa9bd2bc322750db9c42fb575e947982
so that aggregated-apiservers can also take advantage. discovered by e2e tests with feature enabled
Kubernetes-commit: c9b34884004079ed3f184b475f7408984f9226f4
Also make some design changes exposed in testing and review.
Do not remove the ambiguous old metric
`apiserver_flowcontrol_request_concurrency_limit` because reviewers
though it is too early. This creates a problem, that metric can not
keep both of its old meanings. I chose the configured concurrency
limit.
Testing has revealed a design flaw, which concerns the initialization
of the seat demand state tracking. The current design in the KEP is
as follows.
> Adjustment is also done on configuration change … For a newly
> introduced priority level, we set HighSeatDemand, AvgSeatDemand, and
> SmoothSeatDemand to NominalCL-LendableSD/2 and StDevSeatDemand to
> zero.
But this does not work out well at server startup. As part of its
construction, the APF controller does a configuration change with zero
objects read, to initialize its request-handling state. As always,
the two mandatory priority levels are implicitly added whenever they
are not read. So this initial reconfig has one non-exempt priority
level, the mandatory one called catch-all --- and it gets its
SmoothSeatDemand initialized to the whole server concurrency limit.
From there it decays slowly, as per the regular design. So for a
fairly long time, it appears to have a high demand and competes
strongly with the other priority levels. Its Target is higher than
all the others, once they start to show up. It properly gets a low
NominalCL once other levels show up, which actually makes it compete
harder for borrowing: it has an exceptionally high Target and a rather
low NominalCL.
I have considered the following fix. The idea is that the designed
initialization is not appropriate before all the default objects are
read. So the fix is to have a mode bit in the controller. In the
initial state, those seat demand tracking variables are set to zero.
Once the config-producing controller detects that all the default
objects are pre-existing, it flips the mode bit. In the later mode,
the seat demand tracking variables are initialized as originally
designed.
However, that still gives preferential treatment to the default
PriorityLevelConfiguration objects, over any that may be added later.
So I have made a universal and simpler fix: always initialize those
seat demand tracking variables to zero. Even if a lot of load shows
up quickly, remember that adjustments are frequent (every 10 sec) and
the very next one will fully respond to that load.
Also: revise logging logic, to log at numerically lower V level when
there is a change.
Also: bug fix in float64close.
Also, separate imports in some file
Co-authored-by: Han Kang <hankang@google.com>
Kubernetes-commit: feb42277884bc7cfbd6f0bb1d875cc63b1b6caac
This change enables hot reload of encryption config file when api server
flag --encryption-provider-config-automatic-reload is set to true. This
allows the user to change the encryption config file without restarting
kube-apiserver. The change is detected by polling the file and is done
by using fsnotify watcher. When file is updated it's process to generate
new set of transformers and close the old ones.
Signed-off-by: Nilekh Chaudhari <1626598+nilekhc@users.noreply.github.com>
Kubernetes-commit: 761b7822fca569d475f782b135ef433e5b014147
Add new kube-apiserver SLI metric better reflecting that the metric is
an SLI and not an SLO and deprecate the existing
apiserver_request_slo_duration_seconds in 1.27. Although the metric is
still in alpha, we prefer deprecating it for one release since it is a
critical metric used for SLOs and to make sure that users that are using
it have time to make the transition.
Going forward we prefer going with SLI specific metrics, we will use
_sli_ instead of _slo_ so for consistency purposes.
Signed-off-by: Damien Grisonnet <dgrisonn@redhat.com>
Kubernetes-commit: 1493da92d9513e383f8382c7e80316a3fa6c94fa
Co-authored-by: Alexander Zielenski <zielenski@google.com>
Co-authored-by: Joe Betz <jpbetz@google.com>
Kubernetes-commit: e7d83a1fb7b3e4f6a75ed73bc6e410946e12ad9f
This change adds a flag --encryption-provider-config-automatic-reload
which will be used to drive automatic reloading of the encryption
config at runtime. While this flag is set to true, or when KMS v2
plugins are used without KMS v1 plugins, the /healthz endpoints
associated with said plugins are collapsed into a single endpoint at
/healthz/kms-providers - in this state, it is not possible to
configure exclusions for specific KMS providers while including the
remaining ones - ex: using /readyz?exclude=kms-provider-1 to exclude
a particular KMS is not possible. This single healthz check handles
checking all configured KMS providers. When reloading is enabled
but no KMS providers are configured, it is a no-op.
k8s.io/apiserver does not support dynamic addition and removal of
healthz checks at runtime. Reloading will instead have a single
static healthz check and swap the underlying implementation at
runtime when a config change occurs.
Signed-off-by: Monis Khan <mok@microsoft.com>
Kubernetes-commit: 22e540bc48d9bf698c4f381ccb56ed57dea0dae2
would fllake .04% of the time on my machine.
In tests waiting for objects to be reconciled, would erroneously treat the "Not Found" case as an error rather than waiting a bit.
also add some more context to test errors to improve debuggability
Kubernetes-commit: bfbc1f3479423b5c53231cfec58895746ef2de69
The reason for the issue is that the apiserver uses the Scheme in the
global variable pkg/api/legacyscheme/scheme.go, and registers the
DeleteOptions corresponding to each APIGroup in the Scheme. But
DeleteOptions in meta.k8s.io/v1 is not registered, resulting
in a notRegisteredErr.
Use metainternalversionscheme.Codecs as Serializer
Kubernetes-commit: e7d7f4a9e56fe5d9c10da437787118fe9ea9e5af
* add superuser fallback to authorizer
* change the order of authorizers
* change the order of authorizers
* remove the duplicate superuser authorizer
* add integration test for superuser permissions
Kubernetes-commit: f86acbad68baf1a99d6fa153f6f0cdc7b93932e4
Default api server manifest whose liveness check looks like:
"/livez?exclude=etcd&exclude=kms-provider-0&exclude=kms-provider-1"
Which causes spurious messages in apiserver logs every 10 mins:
```
W1017 00:03:39.938956 9 healthz.go:256] cannot exclude some health checks, no health checks are installed matching "kms-provider-0","kms-provider-1"
```
Let's not log excessive messages especially at warning level. We should
do this at a higher level (6 instead of 4).
NOTE: we don't change the message returned to the http request, we keep
that as-is (does not change on log level)
Also see:
https://github.com/aws/eks-distro/blob/v1-19-eks-12/projects/kubernetes/kubernetes/1-19/patches/0016-EKS-PATCH-apiserver-healthz-upper-log-verbosity-for-.patch
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
Kubernetes-commit: 20de240d5bdb7fc50de3fe9b8cdd95f81bf47034
receiver name obj should be consistent with previous receiver name s for SimpleStream
error var hookNotFinished should have name of the form errFoo
Kubernetes-commit: ae385ee874a81cd01ee4fef98efc1bd5c219c9b7
This change updates the API server code to load the encryption
config once at start up instead of multiple times. Previously the
code would set up the storage transformers and the etcd healthz
checks in separate parse steps. This is problematic for KMS v2 key
ID based staleness checks which need to be able to assert that the
API server has a single view into the KMS plugin's current key ID.
Signed-off-by: Monis Khan <mok@microsoft.com>
Kubernetes-commit: f507bc255382b2e2095351053bc17e74f7100d35
- Moves kms proto apis to the staging repo
- Updates generate and verify kms proto scripts to check staging repo
Signed-off-by: Anish Ramasekar <anish.ramasekar@gmail.com>
Kubernetes-commit: c3794e2377016b1c18b1dcb63dc61d686c8ebcbf
return the last request error, instead of last error received
The rate limit allows 1 event per healthcheck timeout / 2
Kubernetes-commit: 510a85c53a5138babb1650fadd328e6f34baa03b
Fix the one path where boundNextDispatchLocked was not being called
after modifying a queue.
Also check for negative work in a request.
These are motivated by
https://github.com/kubernetes/kubernetes/issues/112169 but I do not
have a way to reproduce it and so can not check that these changes
actually remove that symptom. But these changes are good anyway.
Kubernetes-commit: 6ee93e2cee695203a6ce4935da1b9a807b624260
Use GroupResource instead of object reflection when recording the
following metrics:
- etcd_request_duration_seconds
- etcd_bookmark_counts
Add GroupResource to logs and traces where only reflection-based typing
was previously used.
Both of these changes allow us to disginguish between different CRDs,
all of which are represented as *unstructured.Unstructured.
Signed-off-by: Andy Goldstein <andy.goldstein@redhat.com>
Kubernetes-commit: 305fa2add60ad507417304d865f001006d5175fe
Use the group resource instead of objectType in watch cache metrics,
because all CustomResources are grouped together as
*unstructured.Unstructured, instead of 1 entry per type.
Signed-off-by: Andy Goldstein <andy.goldstein@redhat.com>
Kubernetes-commit: d08b69e8d35a5aa73a178c508f9b0e1ad74b882d
All CustomResources are treated as *unstructured.Unstructured, leading
the watch cache to log anything related to CRs as Unstructured. This
change uses the schema.GroupResource instead of object type for all type
related log messages in the watch cache, resulting in distinct output
for each CR type.
Signed-off-by: Andy Goldstein <andy.goldstein@redhat.com>
Kubernetes-commit: 397533a4c2df9639ff4422c907d06fae195a1835
this is specifically so that we have more structured information when
the metric is parsed and stored as a stable metric. This change does not
change the name of the actual metrics.
Change-Id: I861482401ad9a0ae12306b93abf91d6f76d7a407
Kubernetes-commit: 178e57c17b66eb572a961690bd10782aeb3c3582
- add feature gate
- add encrypted object and run generated_files
- generate protobuf for encrypted object and add unit tests
- move parse endpoint to util and refactor
- refactor interface and remove unused interceptor
- add protobuf generate to update-generated-kms.sh
- add integration tests
- add defaulting for apiVersion in kmsConfiguration
- handle v1/v2 and default in encryption config parsing
- move metrics to own pkg and reuse for v2
- use Marshal and Unmarshal instead of serializer
- add context for all service methods
- check version and keyid for healthz
Signed-off-by: Anish Ramasekar <anish.ramasekar@gmail.com>
Kubernetes-commit: f19f3f409938ff9ac8a61966e47fbe9c6075ec90
Replicated from https://github.com/etcd-io/etcd/blob/v3.5.4/client/v3/logger.go#L47
The logic of this function doesn't make a lot of sense to me, but
copying it will avoid any behaviour change.
Signed-off-by: Nic Cope <nicc@rk0n.org>
Kubernetes-commit: c1aa7a0fe73cbcab8e70f7b73a845ae9394f9a71
Currently the API server creates one etcd client per CRD. If clients
aren't provided a logger they'll each create their own. These loggers
can account for ~20% of API server memory consumption on a cluster with
hundreds of CRDs.
Signed-off-by: Nic Cope <nicc@rk0n.org>
Kubernetes-commit: 0c81eabb853e581abbcb37ebf094af3316e1012e
This logger is responsible for 20% of the API server's memory usage when
many CRDs are installed. See the below issue for more context.
https://github.com/kubernetes/kubernetes/issues/111476
Signed-off-by: Nic Cope <nicc@rk0n.org>
Kubernetes-commit: 0e5401c93940126beac45264aa056507b0950075
- Run hack/update-codegen.sh
- Run hack/update-generated-device-plugin.sh
- Run hack/update-generated-protobuf.sh
- Run hack/update-generated-runtime.sh
- Run hack/update-generated-swagger-docs.sh
- Run hack/update-openapi-spec.sh
- Run hack/update-gofmt.sh
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
Kubernetes-commit: a9593d634c6a053848413e600dadbf974627515f
v1.43.0 marked grpc.WithInsecure() deprecated so this commit moves to use
what is the recommended replacement:
grpc.WithTransportCredentials(insecure.NewCredentials())
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
Kubernetes-commit: 2c8bfad9106039aa15233b5bf7282b25a7b7e0a0
add short-circuiting logic for long comaprison
replace timestamps rather than doing a full managed fields deepcopy
add guard
Kubernetes-commit: 7233538008489c189d09bb042fbabca97d9cdbaf
Make the test accept all the legitimate outcomes.
Expand the explanation of how TestPriorityAndFairnessWithPanicRecoveryAndTimeoutFilter/priority_level_concurrency_is_set_to_1,_queue_length_is_1,_first_request_should_time_out_and_second_(enqueued)_request_should_time_out_as_well is supposed to work.
Expand debug information that is available when the test fails.
Kubernetes-commit: 1f450695ffd5b2d028c87328b8b32630a8052129
expiredBookmarkWatchers allows us to schedule the next bookmark event after dispatching not before as it was previously.
It opens a new functionality in which a watcher might decide to change when the next bookmark should be delivered based on some internal state.
Kubernetes-commit: 0576f6a011cba8f0c8550fd3dd31111376c9dcd0
Using a Pod type in a GetList() call in a test
can panic at worst and error out at best. Here,
neither happened because the error condition
being tested for (cacher being stopped or not)
gets returned before the list pointer can be
enforced.
This commit changes the above to use PodList.
Signed-off-by: Madhav Jivrajani <madhav.jiv@gmail.com>
Kubernetes-commit: 487761f4e2543114db158f0d59e598dedc481882
Before:
Create()
BeginCreate()
BeforeCreate()
init UID <---------------------
strategy code
After:
Create()
init UID <-------------------------
BeginCreate()
BeforeCreate()
strategy code
This also wipes UID early (suggested by David) and asserts it is set in
BeforeCreate().
Kubernetes-commit: 5615de51f9e768dd01d7fe49a48e8db756bd8ac8
Our tests are mostly error based and explicit error typing allows
us to test against error types directly. Having made this change also
makes it obvious that our test coverage was lacking in two branches,
specifically, we were previously not testing empty start keys nor were
we testing for invalid start RVs.
Kubernetes-commit: 213e380a2e48830db6c71d2da5485d4226d95625
Members are not used in (waiting,executing) pairs, so stopped
using the wrapper that adds such pairing.
Kubernetes-commit: cd33c7cf2260b351dd345497223a944e80bc7b61
Validating admission webhook evaluation can fail, if uncaught this
crashes a kube-apiserver. Add handling to catch panic while preserving
the behavior of "must not fail".
Kubernetes-commit: d412bf92b3b02bda93707c6aaba945f28bf60c72
The means by which we encode and decode the continue token during a
paginated LIST call is not specific to etcd3. In order to allow for a
generic suite of tests against any storage.Interface implementation, we
need this logic to live outside of the etcd3 package, or import cycles
will exist.
Signed-off-by: Steve Kuznetsov <skuznets@redhat.com>
Kubernetes-commit: eb3aa5be10393968d8083c79f5958501fc029e8d