This change removes our dependency on the hand-rolled kubernetes gateway API bindings from the `k8s-gateway-api` crate and replaces them with the official generated bindings from the `gateway-api` crate.
Signed-off-by: Alex Leong <alex@buoyant.io>
Co-authored-by: Oliver Gould <ver@buoyant.io>
We refactor the policy-controller in anticipation of switching from the `k8s-gateway-api` crate to the `gateway-api` crate for Gateway API type bindings. This change does not introduce any functional changes or changes in dependencies but instead prepares for the dependency change by introducing a number of type aliases to coincide with the type names in the `gateway-api` crate.
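To make that preparation concrete, here is a hedged sketch of the aliasing approach (module layout and paths are illustrative, not the actual code):

```rust
// Hedged sketch only: local aliases whose names line up with the
// `gateway-api` crate, so the eventual dependency swap only changes what
// the aliases point at. Module layout and paths are illustrative.
pub mod gateway {
    // Today the aliases point at the hand-rolled bindings...
    pub type HttpRoute = k8s_gateway_api::HttpRoute;
    pub type HttpRouteRule = k8s_gateway_api::HttpRouteRule;
    // ...and after the switch they can point at the generated bindings from
    // the `gateway-api` crate without touching call sites.
}
```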
---------
Signed-off-by: Alex Leong <alex@buoyant.io>
Co-authored-by: Oliver Gould <ver@buoyant.io>
* docker.io/library/golang from 1.22 to 1.23
* gotestsum from 0.4.2 to 1.12.0
* protoc-gen-go from 1.28.1 to 1.35.2
* protoc-gen-go-grpc from 1.2 to 1.5.1
* docker.io/library/rust from 1.76.0 to 1.83.0
* cargo-deny from 0.14.11 to 0.16.3
* cargo-nextest from 0.9.67 to 0.9.85
* cargo-tarpaulin from 0.27.3 to 0.31.3
* just from 1.24.0 to 1.37.0
* yq from 4.33.3 to 4.44.5
* markdownlint-cli2 from 0.10.0 to 0.15.0
* shellcheck from 0.9.0 to 0.10.0
* actionlint from 1.6.26 to 1.7.4
* protoc from 3.20.3 to 29.0
* step from 0.25.2 to 0.28.2
* kubectl from 1.29.2 to 1.31.3
* k3d from 5.6.0 to 5.7.5
* k3s image shas
* helm from 3.14.1 to 3.16.3
* helm-docs from 1.12.0 to 1.14.2
This change introduces a timeout into the kubernetes lease logic so that patches
do not get stuck indefinitely.
This change also modifies our Cargo.tomls so that kubert and its related
dependencies (kube and k8s-openapi) are defined at the workspace-level.
There are a few things about the policy controller logging that can be cleaned
up for consistency and clarity:
* We frequently log ERROR messages when processing resources with unexpected
values. These messages are more appropriately emitted at WARN; we want to
surface these situations, but they are not really exceptional.
* The leadership status of the status controller is not logged at INFO level, so
it's not possible to know about leadership changes without DEBUG logging.
* We generally use sentence-cased log messages when emitting user-facing
messages. There are a few situations where we are not consistent.
* The status controller reconciliation logging is somewhat noisy and misleading.
* The status controller does not log any messages when patching resources.
```
DEBUG status::Index: linkerd_policy_controller_k8s_status::index: Lease holder has changed
DEBUG status::Index: linkerd_policy_controller_k8s_status::index: Lease holder reconciling cluster index.name=linkerd-destination-74d7fdc45d-xfb8l
DEBUG status::Index: linkerd_policy_controller_k8s_status::index: Lease holder reconciling cluster index.name=linkerd-destination-74d7fdc45d-xfb8l
DEBUG status::Index: linkerd_policy_controller_k8s_status::index: Lease holder reconciling cluster index.name=linkerd-destination-74d7fdc45d-xfb8l
DEBUG status::Index: linkerd_policy_controller_k8s_status::index: Lease holder has changed
DEBUG status::Index: linkerd_policy_controller_k8s_status::index: Lease holder reconciling cluster index.name=linkerd-destination-74d7fdc45d-xfb8l
DEBUG status::Index: linkerd_policy_controller_k8s_status::index: Lease holder reconciling cluster index.name=linkerd-destination-74d7fdc45d-xfb8l
DEBUG status::Index: linkerd_policy_controller_k8s_status::index: Lease holder reconciling cluster index.name=linkerd-destination-74d7fdc45d-xfb8l
DEBUG status::Index: linkerd_policy_controller_k8s_status::index: Lease holder has changed
DEBUG status::Index: linkerd_policy_controller_k8s_status::index: Lease holder reconciling cluster index.name=linkerd-destination-74d7fdc45d-xfb8l
DEBUG status::Index: linkerd_policy_controller_k8s_status::index: Lease holder reconciling cluster index.name=linkerd-destination-74d7fdc45d-xfb8l
DEBUG status::Index: linkerd_policy_controller_k8s_status::index: Lease holder reconciling cluster index.name=linkerd-destination-74d7fdc45d-xfb8l
```
The "Lease holder has changed" message actually indicates that the _lease_ has
changed, though the holder may be unchanged.
To improve logging clarity, this change does the following:
* Adds an INFO level log when the leadership status of the controller changes.
* Adds an INFO level log when the status controller patches resources.
* Adds DEBUG level logs when the status controller patches resources.
* Moves reconciliation housekeeping logging to TRACE level.
* Consistently uses sentence capitalization in user-facing log messages.
* Reduces ERROR messages to WARN when handling invalid user-provided data
(including cluster resources). This ensures that ERRORs are reserved for
exceptional policy controller states.
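As a rough illustration of these conventions (hedged; the function names below are hypothetical, but the `tracing` macros are the ones the controller already uses):

```rust
// Illustrative only: the level conventions described above, expressed with
// the `tracing` macros the controller already uses. Function names are
// hypothetical.
use tracing::{info, trace, warn};

fn on_leadership_change(is_leader: bool) {
    // Leadership changes are surfaced at INFO so they are visible by default.
    info!(%is_leader, "Leadership changed");
}

fn on_patch(namespace: &str, name: &str) {
    // Status patches are logged when they are applied.
    info!(%namespace, %name, "Patching resource status");
}

fn on_invalid_resource(name: &str, reason: &str) {
    // Invalid user-provided data is a WARN, not an ERROR.
    warn!(%name, %reason, "Ignoring invalid resource");
}

fn on_reconcile(name: &str) {
    // Reconciliation housekeeping is demoted to TRACE.
    trace!(%name, "Reconciling cluster");
}
```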
Our other types (e.g. HttpRoute) follow the Rust convention of using "Http"
instead of "HTTP".
This commit updates the policy controller to reflect this convention.
No user-facing changes.
This adds the status stanza to the HTTPLocalRateLimitPolicy CRD, and implements its handling in the policy status controller.
For the controller to accept an HTTPLocalRateLimitPolicy CR it checks that:
- The targetRef is an existing Server
- If there are multiple HTTPLocalRateLimitPolicy CRs pointing to the same server, only accept the oldest one, or if created at the same time, the first in alphabetical order (that logic was moved from the inbound indexer to the status controller).
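The tie-breaking rule can be sketched roughly as follows (hedged; the types and fields are illustrative, not the controller's):

```rust
// Prefer the oldest HTTPLocalRateLimitPolicy targeting a Server, breaking
// ties by name order. Types and fields here are illustrative.
use std::cmp::Ordering;
use std::time::SystemTime;

struct PolicyRef {
    name: String,
    created_at: SystemTime,
}

fn preferred<'a>(a: &'a PolicyRef, b: &'a PolicyRef) -> &'a PolicyRef {
    match a
        .created_at
        .cmp(&b.created_at)
        .then_with(|| a.name.cmp(&b.name))
    {
        Ordering::Greater => b,
        _ => a,
    }
}
```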
6cd7dc22c: Update RL CRD and RBAC to allow patching its status
69aee0129: Update golden files
60f25b716: Implement status handling for HTTPLocalRateLimitPolicy CRD
fc99d3adf: Update existing unit tests
0204acf65: New unit tests
## Examples
Not accepted CR:
```yaml
...
status:
  conditions:
  - lastTransitionTime: "2024-11-12T23:10:05Z"
    message: ""
    reason: RateLimitReasonAlreadyExists
    status: "False"
    type: Accepted
  targetRef:
    group: policy.linkerd.io
    kind: Server
    name: web-http
```
Accepted CR:
```yaml
...
status:
  conditions:
  - lastTransitionTime: "2024-11-12T23:10:05Z"
    message: ""
    reason: Accepted
    status: "True"
    type: Accepted
  targetRef:
    group: policy.linkerd.io
    kind: Server
    name: web-http
```
This PR adds a few notable changes associated with the egress functionality of Linkerd:
- `EgressNetwork` objects are indexed into the outbound index
- outbound policy lookups are classified as either in-cluster or egress based on the `ip:port` combination (see the sketch below)
- `TCPRoute`, `TLSRoute`, `GRPCRoute` and `HTTPRoute` attachments are reflected for both `EgressNetwork` and `Service` targets
- the default traffic policy for `EgressNetwork` is honored by returning the appropriate default (failure/success) routes for all protocols
Note that this PR depends on an unreleased version of the linkerd2-proxy-api repo.
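As a rough illustration of the in-cluster vs. egress classification (hedged; the real index is considerably more involved and these types are illustrative):

```rust
use std::collections::HashSet;
use std::net::SocketAddr;

enum Target {
    InCluster,
    Egress,
}

// Addresses that map to a known in-cluster Service are routed as in-cluster
// traffic; everything else is treated as egress and matched against
// EgressNetwork definitions.
fn classify(addr: SocketAddr, cluster_addrs: &HashSet<SocketAddr>) -> Target {
    if cluster_addrs.contains(&addr) {
        Target::InCluster
    } else {
        Target::Egress
    }
}
```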
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
This PR adds an `EgressNetwork` CRD, whose purpose is to describe networks that are external to the cluster.
In addition, it also adds the `TLSRoute` and `TCPRoute` Gateway API CRDs.
Most of the work in this change is focused on introducing these CRDs and correctly setting their status based on route specificity rules described in: https://gateway-api.sigs.k8s.io/geps/gep-1426/#route-types.
Notable changes include:
- ability to attach TCP and TLS routes to both `EgressNetworks` and `Service` objects
- implemented conflict resolution between routes
- admission validation on the newly introduced resources
- module + integration tests
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
Followup to #12845
This expands the policy controller index in the following ways:
- Adds the new Audit variant to the DefaultPolicy enum
- Expands the function that synthesizes the authorizations for a given default policy (DefaultPolicy::default_authzs) so that it also creates an Unauthenticated client auth and an allow-all NetworkMatch for the new Audit default policy (sketched below).
- Now that a Server can have a default policy different than Deny, when generating InboundServer authorizations (PolicyIndex::client_authzs) make sure to append the default authorizations when DefaultPolicy is Allow or Audit
Also, the admission controller ensures the new accessPolicy field contains a valid value.
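A hedged sketch of the authorization synthesis described above (simplified stand-in types, not the actual index code):

```rust
enum DefaultPolicy {
    Deny,
    Allow,
    Audit,
}

// Simplified stand-in for a synthesized authorization: unauthenticated
// clients from any network.
struct Authz {
    unauthenticated: bool,
    allow_all_networks: bool,
}

fn default_authzs(policy: &DefaultPolicy) -> Vec<Authz> {
    match policy {
        DefaultPolicy::Deny => Vec::new(),
        // Audit synthesizes the same allow-all authorization as Allow; the
        // difference is that the proxy only audits rather than enforcing.
        DefaultPolicy::Allow | DefaultPolicy::Audit => vec![Authz {
            unauthenticated: true,
            allow_all_networks: true,
        }],
    }
}
```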
## Tests
New integration tests added:
- e2e_audit.rs exercises the audit policy first in a Server and then at the namespace level
- in admit_server.rs a new test checks invalid accessPolicy values are rejected.
- in inbound_api.rs server_with_audit_policy verifies the synthesized audit authorization is returned for a Server with accessPolicy=audit
> [!NOTE]
> Please check linkerd/website#1805 for how this is supposed to work from the user's perspective.
Followup to #12844
This new field defines the default policy for Servers, i.e. if a request doesn't match the policy associated with a Server then this policy applies. The values are the same as for `proxy.defaultInboundPolicy` and the `config.linkerd.io/default-inbound-policy` annotation (all-unauthenticated, all-authenticated, cluster-authenticated, cluster-unauthenticated, deny), plus a new value "audit". The default is "deny", thus remaining backwards-compatible.
This field is also exposed as an additional printer column.
We add support for GrpcRoute resources in the policy-controller's status controller. This means that the policy controller will watch GrpcRoute resources in the cluster and keep their status up to date, in the same way that it currently does for HttpRoute resources.
Signed-off-by: Alex Leong <alex@buoyant.io>
When the policy controller updates the status of an HttpRoute resource, we currently have little observability into whether those updates are failing or how long they are taking. We also have no timeout in place to protect the policy controller from extremely slow or hanging status update requests.
We add a generous 5 second timeout for these API calls and add metrics to track success, failures, timeouts, and duration.
```
# HELP resource_status_patch_succeeded_total Counter patches successfully applied to HTTPRoutes.
# TYPE resource_status_patch_succeeded_total counter
resource_status_patch_succeeded_total_total 1711
# HELP resource_status_patch_failed_total Counter patches that fail to apply to HTTPRoutes.
# TYPE resource_status_patch_failed_total counter
resource_status_patch_failed_total_total 0
# HELP resource_status_patch_timeout_total Counter patches that time out when applying to HTTPRoutes.
# TYPE resource_status_patch_timeout_total counter
resource_status_patch_timeout_total_total 0
# HELP resource_status_patch_duration_seconds Histogram of time taken to apply patches to HTTPRoutes.
# TYPE resource_status_patch_duration_seconds histogram
resource_status_patch_duration_seconds_sum 8.930499397
resource_status_patch_duration_seconds_count 1711
resource_status_patch_duration_seconds_bucket{le="0.01"} 1656
resource_status_patch_duration_seconds_bucket{le="0.025"} 1694
resource_status_patch_duration_seconds_bucket{le="0.05"} 1707
resource_status_patch_duration_seconds_bucket{le="0.1"} 1710
resource_status_patch_duration_seconds_bucket{le="0.25"} 1711
resource_status_patch_duration_seconds_bucket{le="0.5"} 1711
resource_status_patch_duration_seconds_bucket{le="1.0"} 1711
resource_status_patch_duration_seconds_bucket{le="2.5"} 1711
resource_status_patch_duration_seconds_bucket{le="5.0"} 1711
resource_status_patch_duration_seconds_bucket{le="+Inf"} 1711
```
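The patching path itself can be sketched roughly as follows (hedged; this is not the controller's actual metrics module, just an illustration using `prometheus-client` counters and a `tokio` timeout):

```rust
use std::time::{Duration, Instant};

use prometheus_client::metrics::{counter::Counter, histogram::Histogram};
use tokio::time::timeout;

pub struct PatchMetrics {
    succeeded: Counter,
    failed: Counter,
    timeouts: Counter,
    duration: Histogram,
}

impl PatchMetrics {
    // Wrap a status patch future in a 5s timeout and record the outcome.
    pub async fn instrument<F, T, E>(&self, patch: F) -> Option<T>
    where
        F: std::future::Future<Output = Result<T, E>>,
    {
        let start = Instant::now();
        match timeout(Duration::from_secs(5), patch).await {
            Ok(Ok(out)) => {
                self.succeeded.inc();
                self.duration.observe(start.elapsed().as_secs_f64());
                Some(out)
            }
            Ok(Err(_)) => {
                self.failed.inc();
                None
            }
            Err(_) => {
                self.timeouts.inc();
                None
            }
        }
    }
}
```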
Signed-off-by: Alex Leong <alex@buoyant.io>
Co-authored-by: Oliver Gould <ver@buoyant.io>
The ExternalWorkload resource we introduced has a minor naming
inconsistency; `Tls` in `meshTls` is not capitalised. Other resources
that we have (e.g. authentication resources) capitalise TLS (as does Go,
which follows a similar naming convention).
We fix this in the workload resource by changing the field's name and
bumping the version to `v1beta1`.
Upgrading the control plane version will continue to work without
downtime. However, if an existing resource exists, the policy controller
will not completely initialise. It will not enter a crashloop backoff,
but it will also not become ready until the resource is edited or
deleted.
Signed-off-by: Matei David <matei@buoyant.io>
ExternalWorkload resources require that a status condition has almost all of its
fields set (with the exception of a date field). The original inspiration for
this design was the HTTPRoute object.
When using the resource, it is more practical to handle many of the fields as
optional; it is cumbersome to fill out the fields when creating an
ExternalWorkload. We change the settings to be in line with a [Pod] object
instead.
[Pod]:
7d1a2f7a73/core/v1/types.go (L3063-L3084)
---------
Signed-off-by: Matei David <matei@buoyant.io>
We introduced an ExternalWorkload CRD along with bindings for mesh
expansion. Currently, the CRD allows users to create ExternalWorkload
resources without adding a meshTls strategy.
This change adds some more validation restrictions to the CRD definition
(i.e. server side validation). When a meshTls strategy is used, we
require both identity and serverName to be present. We also mark meshTls
as the only required field in the spec. Every ExternalWorkload, regardless
of the direction of its traffic, must have it set.
WorkloadIPs and ports now become optional to allow resources to be
created only to configure outbound discovery (VM to workload)
and inbound policy discovery (VM).
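A hedged sketch of the resulting spec shape (field names are illustrative; the real CRD uses its own generated types):

```rust
// Field names here are illustrative. meshTls (with identity and serverName)
// is required; workloadIPs and ports are optional so a resource can be
// created for outbound discovery or inbound policy discovery alone.
use serde::Deserialize;

#[derive(Deserialize)]
#[serde(rename_all = "camelCase")]
struct ExternalWorkloadSpec {
    mesh_tls: MeshTls,
    #[serde(default)]
    workload_ips: Vec<WorkloadIp>,
    #[serde(default)]
    ports: Vec<PortSpec>,
}

#[derive(Deserialize)]
#[serde(rename_all = "camelCase")]
struct MeshTls {
    identity: String,
    server_name: String,
}

#[derive(Deserialize)]
struct WorkloadIp {
    ip: String,
}

#[derive(Deserialize)]
#[serde(rename_all = "camelCase")]
struct PortSpec {
    port: u16,
    #[serde(default)]
    name: Option<String>,
}
```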
---------
Signed-off-by: Matei David <matei@buoyant.io>
This PR adds the ability for a `Server` resource to select over `ExternalWorkload`
resources in addition to `Pods`. For the time being, only one of these selector types
can be specified. This has been realized by incrementing the resource version
to `v1beta2`.
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
We introduced an ExternalWorkload CRD for mesh expansion. This change
follows up by adding bindings for Rust and Go code.
For Go code:
* We add a new schema and ExternalWorkload types
* We also update the code-gen script to generate informers
* We add a new informer type to our abstractions built on top of
client-go, including a function to check if a client has access to the
resource.
For Rust code:
* We add ExternalWorkload bindings to the policy controller.
---------
Signed-off-by: Matei David <matei@buoyant.io>
* kube 0.87.1
* k8s-openapi 0.20.0
* kubert 0.21.1
* k8s-gateway-api 0.15
* ring 0.17
Furthermore, the policy controller's metrics endpoint has been updated
to include tokio runtime metrics.
The policy controller sets incorrect backend metadata when (1) there is
no explicit backend reference specified, and (2) when a backend
reference crosses namespaces.
This change fixes these backend references so that proxy logs and
metrics have the proper metadata references. Outbound policy tests are
updated to validate this.
New versions of the k8s-openapi crate drop support for Kubernetes 1.21.
Kubernetes v1.22 has been considered EOL by the upstream project since
2022-07-08. Major cloud providers have EOL'd it as well (GKE's current
MSKV is 1.24).
This change updates the MSKV to v1.22. It also updates the max version
in _test-helpers.sh to v1.28.
We intermittently see flaky policy integration test failures like:
```
failures:
either
thread 'either' panicked at 'assertion failed: `(left == right)`
left: `7`,
right: `0`: blessed uninjected curl must succeed', policy-test/tests/e2e_server_authorization.rs:293:9
```
This test failure is saying that the curl process is returning an exit code of 7 instead of the expected exit code of 0. This exit code indicates that curl failed to establish a connection. https://everything.curl.dev/usingcurl/returns
It's unclear why this connection occasionally fails in CI and I have not been able to reproduce this failure locally.
However, by looking at the logic of the integration test, we can see that the integration test creates the `web` Service and the `web` Pod and waits for that pod to become ready before unblocking the curl from executing. This means that, theoretically, there could be a race condition between the test and the kubernetes endpoints controller. As soon as the web pod becomes ready, the endpoints controller will update the endpoints resource for the `web` Service and at the same time, our test will unblock the curl command. If the test wins this race, it is possible that curl will run before the endpoints resource has been updated.
We add an additional wait condition to the test to wait until the endpoints resource has an endpoint before unblocking curl.
Since I could not reproduce the test failure locally, it is impossible to say if this is actually the cause of the flakiness or if this change fixes it.
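For reference, the added wait can be sketched along these lines (an illustrative sketch using `kube` directly; the repository's test helpers may differ):

```rust
// Illustrative sketch: poll the Endpoints resource until it has at least one
// ready address before unblocking curl.
use k8s_openapi::api::core::v1::Endpoints;
use kube::{Api, Client};
use tokio::time::{sleep, Duration};

async fn wait_for_endpoints(client: Client, ns: &str, name: &str) -> kube::Result<()> {
    let api: Api<Endpoints> = Api::namespaced(client, ns);
    loop {
        let ep = api.get(name).await?;
        let ready = ep
            .subsets
            .iter()
            .flatten()
            .any(|s| s.addresses.as_ref().is_some_and(|a| !a.is_empty()));
        if ready {
            return Ok(());
        }
        sleep(Duration::from_millis(100)).await;
    }
}
```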
Signed-off-by: Alex Leong <alex@buoyant.io>
This branch updates the policy-controller's dependency on Kubert to
v0.18, `kube-rs` to v0.85, `k8s-gateway-api` to v0.13, and `k8s-openapi`
to v0.19.
All of these crates depend on `kube-rs` and `k8s-openapi`, so they must
all be updated together in one commit. Therefore, this branch updates
all these dependencies.
Add go client codegen for HttpRoute v1beta3. This will be necessary for any of the go controllers (e.g. metrics-api) or go CLI commands to interact with HttpRoute v1beta3 resources in kubernetes.
Signed-off-by: Kevin Ingelman <ki@buoyant.io>
PR #10969 adds support for the GEP-1742 `timeouts` field to the
HTTPRoute CRD. This branch implements actual support for these fields in
the policy controller. The timeout fields are now read and used to set
the timeout fields added to the proxy-api in
linkerd/linkerd2-proxy-api#243.
In addition, I've added code to ensure that the timeout fields are
parsed correctly when a JSON manifest is deserialized. The current
implementation represents timeouts in the bindings as a Rust
`std::time::Duration` type. `Duration` does implement
`serde::Deserialize` and `serde::Serialize`, but its serialization
implementation attempts to (de)serialize it as a struct consisting of a
number of seconds and a number of subsecond nanoseconds. The timeout
fields are instead supposed to be represented as strings in the Go
standard library's `time.ParseDuration` format. Therefore, I've added a
newtype which wraps the Rust `std::time::Duration` and implements the
same parsing logic as Go. Eventually, I'd like to upstream the
implementation of this to `kube-rs`; see kube-rs/kube#1222 for details.
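A heavily reduced sketch of the newtype idea (this toy parser handles only a couple of units; the real implementation mirrors Go's `time.ParseDuration` in full):

```rust
// Toy sketch of a Go-style duration newtype: deserialize from strings like
// "10s" or "250ms". Only two units are handled here, for illustration.
use std::time::Duration;

use serde::{de, Deserialize, Deserializer};

#[derive(Debug, Clone, Copy)]
pub struct GoDuration(pub Duration);

impl<'de> Deserialize<'de> for GoDuration {
    fn deserialize<D: Deserializer<'de>>(d: D) -> Result<Self, D::Error> {
        let s = String::deserialize(d)?;
        parse_go_duration(&s).map(GoDuration).map_err(de::Error::custom)
    }
}

fn parse_go_duration(s: &str) -> Result<Duration, String> {
    // Check "ms" before "s" so "250ms" is not mistaken for seconds.
    if let Some(ms) = s.strip_suffix("ms") {
        ms.parse::<u64>().map(Duration::from_millis).map_err(|e| e.to_string())
    } else if let Some(secs) = s.strip_suffix('s') {
        secs.parse::<u64>().map(Duration::from_secs).map_err(|e| e.to_string())
    } else {
        Err(format!("unsupported duration: {s}"))
    }
}
```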
Depends on #10969
Depends on linkerd/linkerd2-proxy-api#243
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
Add a new version to the HttpRoute CRD: `v1beta3`. This version adds a new `timeouts` struct to the http route rule. This mirrors a corresponding new field in the Gateway API, as described in [GEP-1742](https://github.com/kubernetes-sigs/gateway-api/pull/1997). This field is currently unused, but will eventually be read by the policy controller and used to configure timeouts enforced by the proxy.
The diff between v1beta2 and v1beta3 is:
```
timeouts:
  description: "Timeouts defines the timeouts that can be configured
    for an HTTP request. \n Support: Core \n <gateway:experimental>"
  properties:
    backendRequest:
      description: "BackendRequest specifies a timeout for an
        individual request from the gateway to a backend service.
        Typically used in conjunction with automatic retries,
        if supported by an implementation. Default is the value
        of Request timeout. \n Support: Extended"
      format: duration
      type: string
    request:
      description: "Request specifies a timeout for responding
        to client HTTP requests, disabled by default. \n For example,
        the following rule will timeout if a client request is
        taking longer than 10 seconds to complete: \n ``` rules:
        - timeouts: request: 10s backendRefs: ... ``` \n Support:
        Core"
      format: duration
      type: string
  type: object
```
We update the `storage` version of HttpRoute to be v1beta3 but continue to serve all versions. Since this new field is optional, the Kubernetes API will be able to automatically convert between versions.
Signed-off-by: Alex Leong <alex@buoyant.io>
A route may have two conditions in a parent status: a condition that
states whether it has been accepted by the parents, and a condition that
states whether all backend references (the destinations that traffic matching
the route is sent to) have resolved successfully. Currently, the policy
controller does not support the latter.
This change introduces support for checking and setting a backendRef
specific condition. A successful condition (ResolvedRefs = True) is met
when all backend references point to a supported type, and that type
exists in the cluster. Currently, only Service objects are supported. A
nonexistent object, or an unsupported kind will reject the entire
condition; the particular reason will be reflected in the condition's
message.
Since statuses are set on a route's parents, the same condition will
apply to _all_ parents in a route (since there is no way to elicit
different backends for different parents).
If a route does not have any backend references, then the parent
reference type will be used. As such, any parents that are not Services
will automatically get an invalid backend condition (an exception to the
rule above that a condition is shared by all parents). When the parent is
supported (i.e. a Service) we needn't check its existence, since the parent
condition will already reflect that.
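In rough pseudocode terms (hedged; the reason strings and types below are illustrative, not the exact Gateway API values):

```rust
// All backendRefs must point at a supported kind (Service) that exists in
// the cluster; otherwise the whole condition is rejected with a reason.
struct BackendRef {
    kind: String,
    namespace: String,
    name: String,
}

fn resolved_refs(
    backends: &[BackendRef],
    service_exists: impl Fn(&str, &str) -> bool,
) -> (bool, &'static str) {
    for b in backends {
        if b.kind != "Service" {
            return (false, "InvalidBackendKind");
        }
        if !service_exists(&b.namespace, &b.name) {
            return (false, "BackendDoesNotExist");
        }
    }
    (true, "ResolvedRefs")
}
```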
---
Signed-off-by: Matei David <matei@buoyant.io>
Co-authored-by: Eliza Weisman <eliza@buoyant.io>
Co-authored-by: Oliver Gould <ver@buoyant.io>
Implement the outbound policy API as defined in the proxy api: https://github.com/linkerd/linkerd2-proxy-api/blob/main/proto/outbound.proto
This API is consumed by the proxy for the routing of outbound traffic. It is intended to replace the GetProfile API, which is currently served by the destination controller. It has not yet been released in a proxy-api release, so we take a git dependency on it in the meantime.
This PR adds a new index to the policy controller which indexes HTTPRoutes and Services and uses this information to serve the outbound API. We also add outbound API tests to validate the behavior of this implementation.
Signed-off-by: Alex Leong <alex@buoyant.io>
Co-authored-by: Oliver Gould <ver@buoyant.io>
The policy controller uses the v1alpha1 HTTPRoute type as its internal
representation of HTTPRoute resources. This change updates the resource
version to v1beta2 in anticipation of adding outbound policy support.
To do so, we need to update the e2e tests to create HTTPRoute resources
properly. They currently include a `port` value, though it is not
allowed by our validator. The older resource type does not support this
field and so it was silently ignored.
The previous version of k8s-gateway-api (`v0.10.0`) did not include backendRefs
for HTTP Routes, since the policy controller did not use them for any
specific task or validation. BackendRef support is currently being added
for the status controller, and will be used as more and more route
functionality is added to Linkerd.
This change bumps k8s-gateway to the most recent version and updates the
internal model of the route to include backendRefs. Additionally, this change
fixes any compiler issues that cropped up from adding a field to the struct.
Signed-off-by: Matei David <matei@buoyant.io>
This adds lease claims to the policy status controller so that upon startup, a
status controller attempts to claim the `status-controller` lease in the
`linkerd` namespace. With this lease, we can enforce leader election and ensure
that only one status controller on a cluster is attempting to patch HTTPRoute’s
`status` field.
Upon startup, the status controller now attempts to create the
`status-controller` lease — it will handle failure if the lease is already
present on the cluster. It then spawns a task for attempting to claim this lease
and sends all claim updates to the `Index`.
Currently, `Index.claims` is not used, but in follow-up changes we can check
the current claim to determine whether this status controller is the current
leader on the cluster. If it is, we can decide whether or not to send updates
to the `Controller`.
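A simplified sketch of how the index could consult the claim (hedged; this is not kubert's actual lease API, just an illustration using a watch channel):

```rust
// The claims receiver tracks the current lease holder; the index only emits
// patches while this process holds the lease.
use tokio::sync::watch;

#[derive(Clone)]
struct Claim {
    holder: String,
}

struct Index {
    name: String,
    claims: watch::Receiver<Claim>,
}

impl Index {
    fn is_leader(&self) -> bool {
        self.claims.borrow().holder == self.name
    }

    fn maybe_patch(&self) {
        if self.is_leader() {
            // Only the lease holder sends patches to the Controller task.
        }
    }
}
```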
### Testing
Currently I’ve only manually tested this, but integration tests will definitely
be helpful follow-ups. For manually testing, I’ve asserted that the
`status-controller` is claimed when one or more status controllers startup and
are running on a cluster. I’ve also asserted that when the current leader is
deleted, another status controller claims the lease. Below is a summary of how
I tested it:
```shell
$ linkerd install --ha | kubectl apply -f -
…
$ kubectl get -n linkerd leases status-controller
NAME                HOLDER                                 AGE
status-controller   linkerd-destination-747b456876-dcwlb   15h
$ kubectl delete -n linkerd pod linkerd-destination-747b456876-dcwlb
pod "linkerd-destination-747b456876-dcwlb" deleted
$ kubectl get -n linkerd leases status-controller
NAME                HOLDER                                 AGE
status-controller   linkerd-destination-747b456876-5zpwd   15h
```
Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
This branch updates the dependency on `kubert` to 0.13.0.
- [Release notes](https://github.com/olix0r/kubert/releases)
- [Commits](https://github.com/olix0r/kubert/compare/release/v0.12.0...release/v0.13.0)
Since `kubert` and other Kubernetes API dependencies must be updated in
lockstep, this branch also updates `kube` to 0.78, `k8s-openapi` to
0.13, and `k8s-gateway-api` to 0.9.
`kube-runtime` now depends on a version of the `base64` crate which has
diverged significantly from the version `rustls-pemfile` depends on.
Since both `base64` deps are transitive dependencies which we have no
control over, this branch adds a `cargo deny` exception for duplicate
dependencies on `base64`.
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Eliza Weisman <eliza@buoyant.io>
### Overview
This adds a policy status controller which is responsible for patching Linkerd’s
HTTPRoute resource with a `status` field. The `status` field has a list of
parent statuses — one status for each of its parent references. Each status
indicates whether or not this parent has “accepted” the HTTPRoute.
The status controller runs on its own task in the policy controller and watches
for updates to the resources that it cares about, similar to the policy
controller’s index. One of the main differences is that while the policy
controller’s index watches many resources, the status controller currently only
cares about HTTPRoutes and Servers; HTTPRoutes can still only have parent
references that are Servers so we don’t currently need to consider any other
parent reference resources.
The status controller maintains its own index of resources so that it is
completely separated from the policy controller’s index. This allows the index
to be simpler in its structure, in how it handles `apply` and `delete`, and in
what information it needs to store.
### Follow-ups
There are several important follow-ups to this change. #10124 contains changes
for the policy controller index filtering out HTTPRoutes that are not accepted
by a Server. We don’t want those changes yet. Leaving those out, the status
controller does not actually have any effect on Linkerd policy in the cluster.
We can probably add additional logging in several places in the status controller;
that may even happen as part of the reviews on this change. Additionally, we could
surface the size of the queue of updates to be processed.
Currently, if the status controller fails at any of its potential failure points,
we do not re-queue updates. We probably should do that so that it is more robust
against failure.
In an HA installation, there could be multiple status controllers trying to
patch the same resource. We should explore the k8s lease API so that only one
status controller can patch a resource at a time.
### Implementation
The status controller `Controller` has a k8s client for patching resources,
`index` for tracking resources, and an `updates` channel which handles
asynchronous updates to resources.
#### Index
`Index` synchronously observes changes to resources. It determines which Servers
accept each HTTPRoute and generates a status patch for that HTTPRoute. Again,
the status contains a list of parent statuses, one for each of the HTTPRoute's
parent references.
When a Server is added or deleted, the status controller needs to recalculate
the status for all HTTPRoutes. This is because an HTTPRoute can reference
Servers in other namespaces, so if a Server is added or deleted anywhere in the
cluster it could affect any of the HTTPRoutes on the cluster.
When an HTTPRoute is added, we need to determine the status only for that
HTTPRoute. When it’s deleted we just need to make sure it’s removed from the
index.
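Roughly, the re-indexing strategy looks like this (a hedged sketch with stand-in types, not the actual index):

```rust
use std::collections::HashMap;

struct Server;
struct HttpRoute;

struct Index {
    servers: HashMap<String, Server>,
    routes: HashMap<String, HttpRoute>,
}

impl Index {
    fn send_patch(&self, _route: &HttpRoute) {
        // In the real controller this computes a status patch and sends it to
        // the Controller task over a channel.
    }

    fn apply_server(&mut self, key: String, server: Server) {
        self.servers.insert(key, server);
        // A Server anywhere in the cluster can affect any HTTPRoute, so every
        // route's status is recomputed.
        for route in self.routes.values() {
            self.send_patch(route);
        }
    }

    fn apply_route(&mut self, key: String, route: HttpRoute) {
        // Only the added or updated route needs a fresh status.
        self.send_patch(&route);
        self.routes.insert(key, route);
    }
}
```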
The patches that the `Index` creates are sent to the `Controller` which is
responsible only for applying those patches to HTTPRoutes.
#### Controller
`Controller` asynchronously processes updates and applies patches to HTTPRoutes.
Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
Currently, when the policy controller's validating admission webhook
rejects a Server because it collides with an existing one, it's
difficult to determine which resource the new Server would collide with
(see #10153). Therefore, we should update the error message to include
the existing Server. Additionally, the current error message uses the
word "identical", which suggests to the user that the two Server specs
have the same pod selector. However, this may not actually be the case:
the conflict occurs if the two Servers' pod selectors would select *any*
overlapping pods.
This branch changes the error message to include the name and namespace
of the existing Server whose pod selector overlaps with the new Server.
Additionally, I've reworded the error message to avoid the use of
"identical", and tried to make it clearer that the collision is because
the pod selectors would select one or more overlapping pods, rather than
selecting all the same pods.
Fixes #10153
Fixes #9965
Adds a `path` property to the RedirectRequestFilter in all versions. This property was absent from the CRD even though it appears in the Gateway API documentation and is represented in the internal types. Adding this property to the CRD will also allow users to specify it.
Add a new version to the HTTPRoute CRD: v1beta2. This new version includes two changes from v1beta1:
* Added `port` property to `parentRef` for use when the parentRef is a Service
* Added `backendRefs` property to HTTPRoute rules
We switch the storage version of the HTTPRoute CRD from v1alpha1 to v1beta2 so that these new fields may be persisted.
We also update the policy admission controller to allow an HTTPRoute parentRef type to be Service (in addition to Server).
Signed-off-by: Alex Leong <alex@buoyant.io>
The implementation of the `NotIn` pod selector expression in the policy
controller is backwards. If a value exists for the label in the
expression, and it is contained in the `NotIn` set, the expression will
return `true`, and it will return `false` when the value is _not_ in the
set. This is because it calls `values.contains(v)`, just like the `In`
expression.
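A minimal sketch of the corrected evaluation, where `NotIn` is simply the negation of `In` (illustrative types, not the controller's):

```rust
enum Expr {
    In(Vec<String>),
    NotIn(Vec<String>),
}

fn matches(expr: &Expr, value: Option<&str>) -> bool {
    let contains = |values: &Vec<String>| {
        value.is_some_and(|v| values.iter().any(|x| x.as_str() == v))
    };
    match expr {
        // `In` matches only when the label exists and its value is in the set.
        Expr::In(values) => contains(values),
        // `NotIn` is the negation: missing labels and values outside the set match.
        Expr::NotIn(values) => !contains(values),
    }
}
```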
* Update kubert to v0.10
* Update kube-rs to v0.75 (fixes #9339)
* Update k8s-openapi to v0.16
* Update k8s-gateway-api to v0.7
Signed-off-by: Oliver Gould <ver@buoyant.io>