The ExternalWorkload resource we introduced has a minor naming
inconsistency: `Tls` in `meshTls` is not fully capitalised. Other
resources that we have (e.g. the authentication resources) capitalise
TLS, and so does Go, which follows a similar naming convention.
We fix this in the workload resource by changing the field's name and
bumping the version to `v1beta1`.
Upgrading the control plane version will continue to work without
downtime. However, if an existing ExternalWorkload resource is present,
the policy controller will not completely initialise. It will not enter
a crashloop backoff, but it will also not become ready until the
resource is edited or deleted.
Signed-off-by: Matei David <matei@buoyant.io>
Fixes #11995
If a Server is marking a Pod's port as opaque and then the Server's podSelector is updated to no longer select that Pod, then the Pod's port should no longer be marked as opaque. However, this update does not result in any updates from the destination API's Get stream and the port remains marked as opaque.
We fix this by updating the endpoint watcher's handling of Server updates to consider both the old and the new Server.
Signed-off-by: Alex Leong <alex@buoyant.io>
We introduced an endpoints controller that will be responsible for
managing EndpointSlices for services that select external workloads. As
a follow-up, we introduce the reconciler component of the controller,
which will be responsible for doing the diffing and the writes.
Additionally, the controller is wired up in the destination service's
main routine and will start if EndpointSlice support is enabled.
---------
Signed-off-by: Matei David <matei@buoyant.io>
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
Co-authored-by: Zahari Dichev <zaharidichev@gmail.com>
For mesh expansion, we need to register an ExternalWorkload's service
membership. Service memberships describe which Service objects an
ExternalWorkload is part of (i.e. which service can be used to route
traffic to an external endpoint).
Service membership will allow the control plane to discover
configuration associated with an external endpoint when performing
discovery on a service target.
To build these memberships, we introduce a new controller to the
destination service, responsible for watching Service and
ExternalWorkload objects, and for writing out EndpointSlice objects for
each Service that selects one or more external endpoints.
As a first step, we add a new externalworkload module and a new controller in that
module that watches services and workloads. In a follow-up change,
the ExternalEndpointManager will additionally perform
the necessary reconciliation by writing EndpointSlice objects.
Since Linkerd's control plane may run in HA, we also add a lease object
that will be used by the manager. When a lease is claimed, a flag is
turned on in the manager to let it know it may perform writes.
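One way to picture the lease-gated write flag is with client-go's generic leader-election helpers; this is only an illustrative sketch (the lease name, durations, and helper are made up), not the manager's actual implementation:
```go
package sketch

import (
	"context"
	"sync/atomic"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/leaderelection"
	"k8s.io/client-go/tools/leaderelection/resourcelock"
)

// runLeaseLoop claims a Lease and flips writesEnabled while the lease is held,
// so only the replica currently holding the lease writes EndpointSlices.
func runLeaseLoop(ctx context.Context, cs kubernetes.Interface, id string, writesEnabled *atomic.Bool) {
	lock := &resourcelock.LeaseLock{
		LeaseMeta: metav1.ObjectMeta{
			Name:      "linkerd-destination-endpoint-write", // illustrative lease name
			Namespace: "linkerd",
		},
		Client:     cs.CoordinationV1(),
		LockConfig: resourcelock.ResourceLockConfig{Identity: id},
	}

	leaderelection.RunOrDie(ctx, leaderelection.LeaderElectionConfig{
		Lock:            lock,
		LeaseDuration:   30 * time.Second,
		RenewDeadline:   10 * time.Second,
		RetryPeriod:     2 * time.Second,
		ReleaseOnCancel: true,
		Callbacks: leaderelection.LeaderCallbacks{
			OnStartedLeading: func(context.Context) { writesEnabled.Store(true) },
			OnStoppedLeading: func() { writesEnabled.Store(false) },
		},
	})
}
```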
A more compact list of changes:
* Add a new externalworkload module
* Add an EndpointsController in the module along with necessary mechanisms to watch resources.
* Add RBAC rules to the destination service:
  * Allow policy and destination to read ExternalWorkload objects
  * Allow destination to create / update / read Lease objects
---------
Signed-off-by: Matei David <matei@buoyant.io>
We introduced an ExternalWorkload CRD for mesh expansion. This change
follows up by adding bindings for Rust and Go code.
For Go code:
* We add a new schema and ExternalWorkload types
* We also update the code-gen script to generate informers
* We add a new informer type to our abstractions built on top of
client-go, including a function to check if a client has access to the
resource.
For Rust code:
* We add ExternalWorkload bindings to the policy controller.
---------
Signed-off-by: Matei David <matei@buoyant.io>
Unregister prom gauges when recycling cluster watcher
Fixes #11839
When `restartClusterWatcher` fails to connect to the target cluster
for whatever reason, the function gets called again 10s later, and tries
to register the same Prometheus metrics without unregistering them
first, which generates warnings.
The problem lies in `NewRemoteClusterServiceWatcher`, which instantiates
the remote kube-api client and registers the metrics, returning a nil
object if the client can't connect. `cleanupWorkers` at the beginning of
`restartClusterWatcher` won't unregister those metrics because of that
nil object.
To fix this, gauges are unregistered on error.
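A minimal sketch of that pattern, with illustrative names and types rather than the service mirror's actual code:
```go
package sketch

import (
	"fmt"

	"github.com/prometheus/client_golang/prometheus"
)

// clusterGauges bundles the per-cluster gauges so they can be registered and,
// on any construction error, unregistered again as a unit.
type clusterGauges struct {
	services prometheus.Gauge
}

func newClusterGauges(cluster string) *clusterGauges {
	cg := &clusterGauges{
		services: prometheus.NewGauge(prometheus.GaugeOpts{
			Name:        "service_cache_size", // illustrative metric name
			Help:        "Number of items in the client-go service cache",
			ConstLabels: prometheus.Labels{"cluster": cluster},
		}),
	}
	prometheus.MustRegister(cg.services)
	return cg
}

func (cg *clusterGauges) unregister() {
	prometheus.Unregister(cg.services)
}

// newWatcher sketches the error path: if the remote API client cannot be
// built, drop the gauges we just registered so the next retry can register
// them again without duplicate-registration warnings.
func newWatcher(cluster string, buildClient func() error) error {
	gauges := newClusterGauges(cluster)
	if err := buildClient(); err != nil {
		gauges.unregister()
		return fmt.Errorf("failed to build remote client for %s: %w", cluster, err)
	}
	return nil
}
```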
* Add ability to configure client-go's `QPS` and `Burst` settings
## Problem and Symptoms
When a very large number of proxies request identity in a short period of time (e.g. during large node scaling events), the identity controller will attempt to validate the tokens sent by the proxies at a rate surpassing client-go's default request rate threshold, triggering client-side throttling. This delays the proxies' initialization and can even fail their startup (after a 2m timeout). The identity controller surfaces this through log entries like this:
```
time="2023-11-08T19:50:45Z" level=error msg="error validating token for web.emojivoto.serviceaccount.identity.linkerd.cluster.local: client rate limiter Wait returned an error: rate: Wait(n=1) would exceed context deadline"
```
## Solution
Client-go's default `QPS` is 5 and `Burst` is 10. This PR exposes those settings as entries in `values.yaml`, with defaults of 100 and 200 respectively. Note this only applies to the identity controller, as it's the only controller performing direct requests to the `kube-apiserver` in a hot path. The other controllers mostly rely on informers, and direct calls are sporadic.
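For reference, applying these settings boils down to setting the corresponding fields on the `rest.Config` before building the clientset; a minimal sketch (the helper name is illustrative):
```go
package sketch

import (
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

// newClient applies the chart-provided QPS/Burst values to the rest.Config
// before constructing the clientset, raising the client-side rate limits.
func newClient(cfg *rest.Config, qps float32, burst int) (kubernetes.Interface, error) {
	cfg.QPS = qps     // client-go default is 5
	cfg.Burst = burst // client-go default is 10
	return kubernetes.NewForConfig(cfg)
}
```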
## Observability
The `QPS` and `Burst` settings used are exposed both in a log entry as soon as the controller starts, and in the new metric gauges `http_client_qps` and `http_client_burst`.
## Testing
You can use the following K6 script, which simulates 6k calls to the `Certify` service during one minute from emojivoto's web pod. Before running this you need to:
- Put the identity.proto and [all the other proto files](https://github.com/linkerd/linkerd2-proxy-api/tree/v0.11.0/proto) in the same directory.
- Edit the [checkRequest](https://github.com/linkerd/linkerd2/blob/edge-23.11.3/pkg/identity/service.go#L266) function and add logging statements to figure out the `token` and `csr` entries you can use here; they will be shown as soon as a web pod starts.
```javascript
import { Client, Stream } from 'k6/experimental/grpc';
import { sleep } from 'k6';
const client = new Client();
client.load(['.'], 'identity.proto');
// This always holds:
// req_num = (1 / req_duration ) * duration * VUs
// Given req_duration (0.5s) test duration (1m) and the target req_num (6k), we
// can solve for the required VUs:
// VUs = req_num * req_duration / duration
// VUs = 6000 * 0.5 / 60 = 50
export const options = {
scenarios: {
identity: {
executor: 'constant-vus',
vus: 50,
duration: '1m',
},
},
};
export default () => {
client.connect('localhost:8080', {
plaintext: true,
});
const stream = new Stream(client, 'io.linkerd.proxy.identity.Identity/Certify');
// Replace with your own token
let token = "ZXlKaGJHY2lPaUpTVXpJMU5pSXNJbXRwWkNJNkluQjBaV1pUZWtaNWQyVm5OMmxmTTBkV2VUTlhWSFpqTmxwSmJYRmtNMWRSVEhwNVNHWllhUzFaZDNNaWZRLmV5SmhkV1FpT2xzaWFXUmxiblJwZEhrdWJEVmtMbWx2SWwwc0ltVjRjQ0k2TVRjd01EWTRPVFk1TUN3aWFXRjBJam94TnpBd05qQXpNamt3TENKcGMzTWlPaUpvZEhSd2N6b3ZMMnQxWW1WeWJtVjBaWE11WkdWbVlYVnNkQzV6ZG1NdVkyeDFjM1JsY2k1c2IyTmhiQ0lzSW10MVltVnlibVYwWlhNdWFXOGlPbnNpYm1GdFpYTndZV05sSWpvaVpXMXZhbWwyYjNSdklpd2ljRzlrSWpwN0ltNWhiV1VpT2lKM1pXSXRPRFUxT1dJNU4yWTNZeTEwYldJNU5TSXNJblZwWkNJNklqaGlZbUV5WWpsbExXTXdOVGN0TkRnMk1TMWhNalZsTFRjelpEY3dOV1EzWmpoaU1TSjlMQ0p6WlhKMmFXTmxZV05qYjNWdWRDSTZleUp1WVcxbElqb2lkMlZpSWl3aWRXbGtJam9pWm1JelpUQXlNRE10TmpZMU55MDBOMk0xTFRoa09EUXRORGt6WXpBM1lXUTJaak0zSW4xOUxDSnVZbVlpT2pFM01EQTJNRE15T1RBc0luTjFZaUk2SW5ONWMzUmxiVHB6WlhKMmFXTmxZV05qYjNWdWREcGxiVzlxYVhadmRHODZkMlZpSW4wLnlwMzAzZVZkeHhpamxBOG1wVjFObGZKUDB3SC03RmpUQl9PcWJ3NTNPeGU1cnNTcDNNNk96VWR6OFdhYS1hcjNkVVhQR2x2QXRDRVU2RjJUN1lKUFoxVmxxOUFZZTNvV2YwOXUzOWRodUU1ZDhEX21JUl9rWDUxY193am9UcVlORHA5ZzZ4ZFJNcW9reGg3NE9GNXFjaEFhRGtENUJNZVZ6a25kUWZtVVZwME5BdTdDMTZ3UFZWSlFmNlVXRGtnYkI1SW9UQXpxSmcyWlpyNXBBY3F5enJ0WE1rRkhSWmdvYUxVam5sN1FwX0ljWm8yYzJWWk03T2QzRjIwcFZaVzJvejlOdGt3THZoSEhSMkc5WlNJQ3RHRjdhTkYwNVR5ZC1UeU1BVnZMYnM0ZFl1clRYaHNORjhQMVk4RmFuNjE4d0x6ZUVMOUkzS1BJLUctUXRUNHhWdw==";
// Replace with your own CSR
let csr = "MIIBWjCCAQECAQAwRjFEMEIGA1UEAxM7d2ViLmVtb2ppdm90by5zZXJ2aWNlYWNjb3VudC5pZGVudGl0eS5saW5rZXJkLmNsdXN0ZXIubG9jYWwwWTATBgcqhkjOPQIBBggqhkjOPQMBBwNCAATKjgVXu6F+WCda3Bbq2ue6m3z6OTMfQ4Vnmekmvirip/XGyi2HbzRzjARnIzGlG8wo4EfeYBtd2MBCb50kP8F8oFkwVwYJKoZIhvcNAQkOMUowSDBGBgNVHREEPzA9gjt3ZWIuZW1vaml2b3RvLnNlcnZpY2VhY2NvdW50LmlkZW50aXR5LmxpbmtlcmQuY2x1c3Rlci5sb2NhbDAKBggqhkjOPQQDAgNHADBEAiAM7aXY8MRs/EOhtPo4+PRHuiNOV+nsmNDv5lvtJt8T+QIgFP5JAq0iq7M6ShRNkRG99ZquJ3L3TtLWMNVTPvqvvUE=";
const data = {
identity: "web.emojivoto.serviceaccount.identity.linkerd.cluster.local",
token: token,
certificate_signing_request: csr,
};
stream.write(data);
// This request takes around 2ms, so this sleep will mostly determine its final duration
sleep(0.5);
};
```
This results in the following report:
```
scenarios: (100.00%) 1 scenario, 50 max VUs, 1m30s max duration (incl. graceful stop):
* identity: 50 looping VUs for 1m0s (gracefulStop: 30s)
data_received................: 6.3 MB 104 kB/s
data_sent....................: 9.4 MB 156 kB/s
grpc_req_duration............: avg=2.14ms min=873.93µs med=1.9ms max=12.89ms p(90)=3.13ms p(95)=3.86ms
grpc_streams.................: 6000 99.355331/s
grpc_streams_msgs_received...: 6000 99.355331/s
grpc_streams_msgs_sent.......: 6000 99.355331/s
iteration_duration...........: avg=503.16ms min=500.8ms med=502.64ms max=532.36ms p(90)=504.05ms p(95)=505.72ms
iterations...................: 6000 99.355331/s
vus..........................: 50 min=50 max=50
vus_max......................: 50 min=50 max=50
running (1m00.4s), 00/50 VUs, 6000 complete and 0 interrupted iterations
```
With the old defaults (QPS=5 and Burst=10), the latencies would be much higher and the number of complete requests much lower.
Followup to https://github.com/linkerd/linkerd2/pull/11334#issuecomment-1736093592
This extends the test introduced in #11334 to exercise upgrading a
Server associated with a pod's HostPort, and observing how the stream
updates the OpaqueProtocol field.
Helper functions were refactored a bit to allow retrieving the
l5dCRDClientSet used when building the fake API.
Similar to #11246, the destination controller was complaining about
trying to register duplicate metrics for the API cache gauges when a given
target cluster got re-linked. This change unregisters the gauges for the
target cluster when said cluster is removed.
Supersedes #11252
Whenever the service mirror's main loop was triggered again, the following warnings were generated:
```
time="2023-08-14T20:16:29Z" level=warning msg="failed to register Prometheus gauge Desc{fqName: \"service_cache_size\", help: \"Number of items in the client-go service cache\", constLabels: {cluster=\"remote\"}, variableLabels: []}: duplicate metrics collector registration attempted"
time="2023-08-14T20:16:29Z" level=warning msg="failed to register Prometheus gauge Desc{fqName: \"endpoints_cache_size\", help: \"Number of items in the client-go endpoints cache\", constLabels: {cluster=\"remote\"}, variableLabels: []}: duplicate metrics collector registration attempted"
```
To fix, this adds to the cluster watcher's `Stop()` method a directive to unregister the Prometheus cache metrics associated with the cluster's client API.
Adds support for remote discovery to the destination controller.
When the destination controller gets a `Get` request for a Service with the `multicluster.linkerd.io/remote-discovery` label, this is an indication that the destination controller should discover the endpoints for this service from a remote cluster. The destination controller will look for a remote cluster which has been linked to it (using the `linkerd multicluster link` command) with that name. It will look at the `multicluster.linkerd.io/remote-discovery` label for the service name to look up in that cluster. It then streams back the endpoint data for that remote service.
Since we now have multiple client-go informers for the same resource types (one for the local cluster and one for each linked remote cluster) we add a `cluster` label onto the prometheus metrics for the informers and EndpointWatchers to ensure that each of these components' metrics are correctly tracked and don't overwrite each other.
---------
Signed-off-by: Alex Leong <alex@buoyant.io>
There were a few small improvements we could still make to a recent
change that added a ClusterStore concept to the destination service.
This PR introduces those improvements:
* Sync dynamically created clients in separate goroutines
* Refactor metadata API creation
* Remove a redundant check in cluster_store_test
* Create a new runtime schema every time a fake metadata API client is
created, to avoid racy behaviour.
Signed-off-by: Matei David <matei@buoyant.io>
Currently, the control plane does not support indexing and discovering resources across cluster boundaries. In a multicluster set-up, it is desirable to have access to endpoint data (and by extension, any configuration associated with that endpoint) to support pod-to-pod communication. Linkerd's destination service should be extended to support reading discovery information from multiple sources (i.e. API Servers).
This change introduces an `EndpointsWatcher` cache. On creation, the cache registers a pair of event handlers:
* One handler to read `multicluster` secrets that embed kubeconfig files. For each such secret, the cache creates a corresponding `EndpointsWatcher` to read remote discovery information.
* Another handler to evict entries from the cache when a `multicluster` secret is deleted.
Additionally, a set of tests have been added that assert the behaviour of the cache when secrets are created and/or deleted. The cache will be used by the destination service to do look-ups for services that belong to another cluster, and that are running in a "remote discovery" mode.
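A rough sketch of the handler wiring described above, using plain client-go primitives (the `addCluster`/`removeCluster` callbacks stand in for the cache's real methods):
```go
package sketch

import (
	corev1 "k8s.io/api/core/v1"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/tools/cache"
)

// watchMirrorSecrets registers the two handlers: one that builds a remote
// watcher from a kubeconfig-bearing multicluster secret, and one that evicts
// it when the secret goes away.
func watchMirrorSecrets(factory informers.SharedInformerFactory, addCluster func(*corev1.Secret), removeCluster func(name string)) {
	secrets := factory.Core().V1().Secrets().Informer()
	secrets.AddEventHandler(cache.ResourceEventHandlerFuncs{
		AddFunc: func(obj interface{}) {
			if s, ok := obj.(*corev1.Secret); ok {
				addCluster(s) // builds an EndpointsWatcher from the embedded kubeconfig
			}
		},
		DeleteFunc: func(obj interface{}) {
			if s, ok := obj.(*corev1.Secret); ok {
				removeCluster(s.Name) // drops the corresponding cache entry
			}
		},
	})
}
```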
---------
Signed-off-by: Matei David <matei@buoyant.io>
Fixes #9986
After reviewing the k8s API calls in Destination, it was concluded we
could only swap out the calls to the Node and RS resources to use the
metadata API, as all the other resources (Endpoints, EndpointSlices,
Services, Pod, ServiceProfiles, Server) required fields other than those
found in their metadata section.
This also required completing the `NewFakeAPI` implementation by adding
the missing annotations and labels entries.
## Testing Memory Consumption
The gains here aren't as big as in #9650. In order to test this we need
to push hard and create 4000 RS:
``` bash
for i in {0..4000}; do kubectl create deployment test-pod-$i --image=nginx; done
```
In edge-23.2.1 the destination pod's memory consumption goes from 40Mi
to 160Mi after all the RS were created. With this change, it went from
37Mi to 140Mi.
* Use metadata API in the proxy and tap injectors
Part of #9485
This adds a new `MetadataAPI` similar to the current `k8s.API` hosting informers, but backed by k8s' `metadatainformer` shared informers, which retrieve only the objects' metadata, resulting in less memory consumption by its clients. Currently this is only implemented for the proxy and tap injectors. Usage by the destination controller will be implemented as a follow-up.
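For reference, the underlying client-go pattern looks roughly like this (a self-contained sketch, not the actual `MetadataAPI` code):
```go
package sketch

import (
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/client-go/metadata"
	"k8s.io/client-go/metadata/metadatainformer"
	"k8s.io/client-go/rest"
)

// replicaSetMeta shows the metadata-only informer pattern: objects come back
// as PartialObjectMetadata, so only labels, annotations, and owner references
// are kept in memory rather than full object specs.
func replicaSetMeta(cfg *rest.Config, namespace, name string) (*metav1.PartialObjectMetadata, error) {
	client, err := metadata.NewForConfig(cfg)
	if err != nil {
		return nil, err
	}

	factory := metadatainformer.NewSharedInformerFactory(client, 10*time.Minute)
	rsGVR := schema.GroupVersionResource{Group: "apps", Version: "v1", Resource: "replicasets"}
	lister := factory.ForResource(rsGVR).Lister()

	stop := make(chan struct{})
	defer close(stop)
	factory.Start(stop)
	factory.WaitForCacheSync(stop)

	obj, err := lister.ByNamespace(namespace).Get(name)
	if err != nil {
		return nil, err
	}
	return obj.(*metav1.PartialObjectMetadata), nil
}
```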
## Existing API enhancements
Shared objects and logic required by API and MetadataAPI have been moved to the new `k8s.go`, `api_resource.go` and `prometheus.go` files. That includes the `isValidRSParent()` function whose arg is now more generic.
## Unit tests
`/controller/k8s/api_test.go` now also instantiates a MetadataAPI, used in the augmented `TestGetObjects()` and `TestGetOwnerKindAndName()` tests. The `resources` struct was introduced to capture the common fields among tests and simplify `newMockAPI()`'s signature.
## Other Changes
The injector no longer watches for Pods. It only requires watching workloads that own resources (and also watching namespaces), so Pod is not required.
## Testing Memory Consumption
Install linkerd, inject emojivoto and check the injector memory consumption with `kubectl -n linkerd top pod linkerd-proxy-injector-xxx`. It'll start consuming about 16Mi. Then ramp up emojivoto's `voting` deployment replicas to 2000. After 5 minutes memory will stabilize around 32Mi using the current branch. Using the latest edge, it'll stabilize around 110Mi.
The controller's k8s client was using `admissionregistration/v1beta1`
for its MWC shared informer. `v1beta1` was removed in k8s 1.22, and `v1`
was introduced in k8s 1.16:
https://kubernetes.io/docs/reference/using-api/deprecation-guide/#v1-22
Modify the controller's k8s client to use `admissionregistration/v1` for
its MWC shared informer.
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
* Allow initializing a k8s namespace-scoped API
This allows reusing the `k8s.API` informers by other projects that don't
necessarily have cluster-wide permissions.
Fixes #8592
Increase the minimum supported Kubernetes version from 1.20 to 1.21. This allows us to drop support for batch/v1beta1/CronJob and discovery/v1beta1/EndpointSlices, instead using only v1 of those resources. This fixes deprecation warnings about these resources printed by the CLI.
Signed-off-by: Alex Leong <alex@buoyant.io>
Fixes #8624
When the proxy-injector encounters a resource with an owner ref, it calls `api.GetObjects` to fetch the owner. If the owner is a kind which is not supported by the proxy-injector, we will panic.
We add a condition so that we only attempt to fetch the owner resource if it is a kind we support.
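A minimal sketch of such a guard (the kind set is illustrative; the injector's actual supported kinds may differ):
```go
package sketch

// injectableOwnerKinds lists the owner kinds the injector knows how to fetch.
var injectableOwnerKinds = map[string]struct{}{
	"deployment":            {},
	"statefulset":           {},
	"daemonset":             {},
	"job":                   {},
	"cronjob":               {},
	"replicaset":            {},
	"replicationcontroller": {},
}

// ownerIsSupported reports whether the owner's kind should be looked up at
// all; unsupported kinds are simply skipped instead of causing a panic.
func ownerIsSupported(kind string) bool {
	_, ok := injectableOwnerKinds[kind]
	return ok
}
```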
Signed-off-by: Alex Leong <alex@buoyant.io>
We frequently compare data structures--sometimes very large data
structures--that are difficult to compare visually. This change replaces
uses of `reflect.DeepEqual` with `deep.Equal`. `go-test`'s `deep.Equal`
returns a diff of values that are not equal.
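For illustration, a minimal test using `go-test/deep` (the struct and values here are made up, and deliberately mismatched so the diff output fires):
```go
package sketch

import (
	"testing"

	"github.com/go-test/deep"
)

type addressSet struct {
	Addresses []string
	Labels    map[string]string
}

// Instead of a bare reflect.DeepEqual boolean, deep.Equal reports which
// fields differ, which is far easier to act on for large structures.
func TestAddressSet(t *testing.T) {
	got := addressSet{Addresses: []string{"10.0.0.1"}, Labels: map[string]string{"zone": "a"}}
	want := addressSet{Addresses: []string{"10.0.0.1"}, Labels: map[string]string{"zone": "b"}}

	if diff := deep.Equal(got, want); diff != nil {
		t.Error(diff) // reports something like "Labels[zone]: a != b"
	}
}
```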
Signed-off-by: Oliver Gould <ver@buoyant.io>
Closes #7826
This adds the `gosec` and `errcheck` lints to the `golangci` configuration. Most significant lints have been fixed by individual changes, but this enables them by default so that all future changes are caught ahead of time.
A significant number of these lints are excluded by the various `exclude-rules` added to `.golangci.yml`. These include operations on files that generally do not fail, such as `Copy`, `Flush`, or `Write`. We also choose to ignore most errors when cleaning up functions via the `defer` keyword.
Aside from those, several other rules have been added, all with comments explaining why it's okay to ignore the errors that they cover.
Finally, several smaller fixes have been made in the code where it seemed necessary to catch errors or at least log them.
Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
[gocritic][gc] helps to enforce some consistency and check for potential
errors. This change applies linting changes and enables gocritic via
golangci-lint.
[gc]: https://github.com/go-critic/go-critic
Signed-off-by: Oliver Gould <ver@buoyant.io>
Since Go 1.13, errors may "wrap" other errors. [`errorlint`][el] checks
that error formatting and inspection is wrapping-aware.
This change enables `errorlint` in golangci-lint and updates all error
handling code to pass the lint. Some comparisons in tests have been left
unchanged (using `//nolint:errorlint` comments).
[el]: https://github.com/polyfloyd/go-errorlint
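A small illustrative example of the kind of change this lint drives (the function and sentinel check are made up):
```go
package sketch

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
)

// loadConfig wraps the underlying error with %w so callers can still inspect
// it after the extra context is added.
func loadConfig(path string) ([]byte, error) {
	b, err := os.ReadFile(path)
	if err != nil {
		return nil, fmt.Errorf("failed to read config %s: %w", path, err)
	}
	return b, nil
}

func configExists(path string) bool {
	_, err := loadConfig(path)
	// errorlint flags a bare `err == fs.ErrNotExist` comparison here;
	// errors.Is unwraps the chain created by %w, so the check still works
	// through the wrapping.
	return !errors.Is(err, fs.ErrNotExist)
}
```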
Signed-off-by: Oliver Gould <ver@buoyant.io>
Now that SMI functionality is fully being moved into the
[linkerd-smi](www.github.com/linkerd/linkerd-smi) extension, we can
stop supporting its functionality by default.
This means that the `destination` component will stop reacting
to `TrafficSplit` objects. When `linkerd-smi` is installed,
it converts `TrafficSplit` objects into `ServiceProfiles`
that the destination component can understand, and reacts accordingly.
Also, whenever a `ServiceProfile` with traffic splitting is associated
with a service, the same information (i.e. splits and weights) is also
surfaced through the UI (in the new `services` tab) and the `viz` cmd.
So we are not really losing any UI functionality here.
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
This change ensures that if a Server exists with `proxyProtocol: opaque` that selects an endpoint backed by a pod, that destination requests for that pod reflect the fact that it handles opaque traffic.
Currently, the only way that opaque traffic is honored in the destination service is if the pod has the `config.linkerd.io/opaque-ports` annotation. With the introduction of Servers though, users can set `server.Spec.ProxyProtocol: opaque` to indicate that if a Server selects a pod, then traffic to that pod's `server.Spec.Port` should be opaque. Currently, the destination service does not take this into account.
There is an existing change up that _also_ adds this functionality; it takes a different approach by creating a policy server client for each endpoint that a destination has. For `Get` requests on a service, the number of clients scales with the number of endpoints that back that service.
This change fixes that issue by instead creating a Server watch in the endpoint watcher and sending updates through to the endpoint translator.
The two primary scenarios to consider are:
### A `Get` request for some service is streaming when a Server is created/updated/deleted
When a Server is created or updated, the endpoint watcher iterates through its endpoint watches (`servicePublisher` -> `portPublisher`) and if it selects any of those endpoints, the port publisher sends an update if the Server has marked that port as opaque.
When a Server is deleted, the endpoint watcher once again iterates through its endpoint watches and deletes the address set's `OpaquePodPorts` field—ensuring that updates have been cleared of Server overrides.
### A `Get` request for some service happens after a Server is created
When a `Get` request occurs (or new endpoints are added—they both take the same path), we must check if any of those endpoints are selected by some existing Server. If so, we have to take that into account when creating the address set.
This part of the change gives me a little concern as we first must get all the Servers on the cluster and then create a set of _all_ the pod-backed endpoints that they select in order to determine if any of these _new_ endpoints are selected.
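As a rough sketch of that check (using a trimmed-down stand-in for the Server type rather than the real policy API bindings):
```go
package sketch

import (
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/labels"
)

// server is a trimmed stand-in for policy.linkerd.io Server: a pod selector,
// a port, and the proxy protocol configured for that port.
type server struct {
	PodSelector   *metav1.LabelSelector
	Port          int32
	ProxyProtocol string
}

// opaquePortsFor walks every known Server and collects the ports that should
// be treated as opaque for this pod-backed endpoint. This is the check that
// has to run when a subscription starts (or new endpoints appear), since those
// endpoints may already be selected by an existing Server.
func opaquePortsFor(pod *corev1.Pod, servers []server) map[int32]struct{} {
	opaque := map[int32]struct{}{}
	for _, srv := range servers {
		if srv.ProxyProtocol != "opaque" {
			continue
		}
		sel, err := metav1.LabelSelectorAsSelector(srv.PodSelector)
		if err != nil {
			continue
		}
		if sel.Matches(labels.Set(pod.Labels)) {
			opaque[srv.Port] = struct{}{}
		}
	}
	return opaque
}
```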
## Testing
Right now this can be tested by starting up the destination service locally and running `Get` requests on a service that has endpoints selected by a Server
**app.yaml**
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod
  labels:
    app: pod
spec:
  containers:
    - name: app
      image: nginx
      ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: svc
spec:
  selector:
    app: pod
  ports:
    - name: http
      port: 80
---
apiVersion: policy.linkerd.io/v1alpha1
kind: Server
metadata:
  name: srv
  labels:
    policy: srv
spec:
  podSelector:
    matchLabels:
      app: pod
  port: 80
  proxyProtocol: HTTP/1
```
```bash
$ go run controller/script/destination-client/main.go -path svc.default.svc.cluster.local:80
```
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
Fixes #6733
As policy resources provide a grouping, statistics summaries should
also be allowed on these groupings, which are useful to the user. Their
being port-specific provides a great way to break down these metrics
further.
This PR adds support for the policy resources, i.e. `server` and `serverauthorization`,
in the `stat` command.
## Changes
This adds a new path in the `stat_summary.go` file to handle policy
objects. I tried to see if we could re-use some of the other paths,
but some of the labels seem to differ and hence a different path
had to be created. We can try to refactor and merge them though.
We support both request and TCP metrics for the `server` resource,
but only the former for `serverauthorization` resources, as that is
how the metrics are generated.
This also adds these policy objects into the `k8s` package to
make them known resources.
For both policy resources, `--from` doesn't work, as these
metrics are not exposed on the outbound side and there is no way to
query for the client workload from the inbound metrics. `--to`
is supported to get metrics specifically for a destination workload
(just like with a service).
## Testing
```bash
> curl -sL https://run.linkerd.io/emojivoto.yml | linkerd inject --proxy-log-level debug - | kubectl apply -f -
> kubectl apply -f 897de1a8d5/emojivoto-policy.yml
# Initial values
➜ ./bin/go-run cli viz stat srv -A -owide ~/work/linkerd2
NAMESPACE NAME UNAUTHORIZED SUCCESS RPS LATENCY_P50 LATENCY_P95 LATENCY_P99 TCP_CONN READ_BYTES/SEC WRITE_BYTES/SEC
emojivoto emoji-grpc 0.0rps 100.00% 1.8rps 1ms 1ms 3ms 1 188.6B/s 2072.9B/s
emojivoto prom 0.0rps - - - - - - - -
emojivoto voting-grpc 0.0rps 80.70% 0.9rps 1ms 2ms 3ms 1 91.4B/s 52.7B/s
emojivoto web-http 0.0rps 90.68% 2.0rps 2ms 10ms 28ms 1 153.7B/s 4509.4B/s
# After changing the `emoji-grpc` authz
➜ ./bin/go-run cli viz stat srv -A -owide ~/work/linkerd2
NAMESPACE NAME UNAUTHORIZED SUCCESS RPS LATENCY_P50 LATENCY_P95 LATENCY_P99 TCP_CONN READ_BYTES/SEC WRITE_BYTES/SEC
emojivoto emoji-grpc 0.3rps 100.00% 1.1rps 0ms 0ms 0ms 1 156.5B/s 1282.4B/s
emojivoto prom 0.0rps - - - - - - - -
emojivoto voting-grpc 0.0rps 87.88% 0.6rps 0ms 0ms 0ms 1 53.5B/s 31.5B/s
emojivoto web-http 0.0rps 61.18% 1.4rps 1ms 2ms 2ms 1 110.2B/s 2195.7B/s
# after changing the `web-http` authz
➜ ./bin/go-run cli viz stat srv -A -owide ~/work/linkerd2
NAMESPACE NAME UNAUTHORIZED SUCCESS RPS LATENCY_P50 LATENCY_P95 LATENCY_P99 TCP_CONN READ_BYTES/SEC WRITE_BYTES/SEC
emojivoto emoji-grpc 0.0rps - - - - - - - -
emojivoto prom 0.0rps - - - - - - - -
emojivoto voting-grpc 0.0rps - - - - - - - -
emojivoto web-http 1.0rps - - - - - - - -
> linkerd viz stat srv/emoji-grpc -n emojivoto -owide
NAME SUCCESS RPS LATENCY_P50 LATENCY_P95 LATENCY_P99 TCP_CONN READ_BYTES/SEC WRITE_BYTES/SEC
emoji-grpc 100.00% 2.0rps 1ms 1ms 1ms 1 199.9B/s 2208.0B/s
> linkerd viz stat srv/web-http -n emojivoto -owide
NAME SUCCESS RPS LATENCY_P50 LATENCY_P95 LATENCY_P99 TCP_CONN READ_BYTES/SEC WRITE_BYTES/SEC
web-http 94.02% 1.9rps 4ms 9ms 10ms 1 152.7B/s 4505.9B/s
> linkerd viz stat srv -n emojivoto -o wide
NAME MESHED SUCCESS RPS LATENCY_P50 LATENCY_P95 LATENCY_P99 TCP_CONN READ_BYTES/SEC WRITE_BYTES/SEC
emoji-grpc - 100.00% 2.0rps 1ms 1ms 1ms 1 201.6B/s 2209.8B/s
prom - - - - - - - - -
voting-grpc - 86.21% 1.0rps 1ms 1ms 1ms 1 98.3B/s 55.9B/s
web-http - 91.67% 2.0rps 3ms 8ms 10ms 1 157.7B/s 4600.3B/s
> linkerd viz stat serverauthorization/web-public -n emojivoto
NAME MESHED SUCCESS RPS LATENCY_P50 LATENCY_P95 LATENCY_P99
web-http - 89.83% 2.0rps 3ms 9ms 10ms
> linkerd viz stat saz -n emojivoto
NAME AUTHORIZATION MESHED SUCCESS RPS LATENCY_P50 LATENCY_P95 LATENCY_P99
emoji-grpc emoji-grpc - 100.00% 2.0rps 1ms 1ms 1ms
prom prom-prometheus - - - - - -
voting-grpc voting-grpc - 89.83% 1.0rps 1ms 1ms 1ms
web-http web-public - 94.96% 2.0rps 1ms 5ms 9ms
> linkerd viz stat saz/web-public -n emojivoto
NAME AUTHORIZATION MESHED SUCCESS RPS LATENCY_P50 LATENCY_P95 LATENCY_P99
web-http web-public - 90.00% 2.0rps 1ms 5ms 9ms
```
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
We were getting sporadic coverage differences on `controller/k8s/test_helper.go` and `pkg/healthcheck/healthcheck_test.go` on pushes unrelated to those files.
For the former, the problem was in tests in `controller/k8s/api_test.go` that compared slices of pods and services by sorting them. The `Sort` interface was implemented through the methods in `test_helper.go`. There is apparently some indeterminism in that sorting at the Go library level, in that the `Swap` method is not always called, which impacted the coverage report. The fix consists of comparing those slices item by item without needing to sort beforehand.
As for `healthcheck_test.go`, `validateControlPlanePods()` in `healthcheck.go` short-circuits on the first pod having all its containers ready. The unit tests iterate over maps, an iteration we know is not deterministic, so sometimes the short-circuiting meant the `!container.Ready` block was never covered, affecting the coverage report. This is fixed by adding a small new test that makes sure that block is covered.
The problem is that for parent objects that are not supported by Linkerd, we
cannot get any metrics. For example, using a Rollout will not report any
metrics at a higher level than the pod.
To fix, add validation for ReplicaSet owners: if the owner is a valid parent,
use the parent's Kind and Name, otherwise use the ReplicaSet's.
Tested using CLI/UI
Interim solution for #6429
Signed-off-by: Matei David <matei@buoyant.io>
Fixes #6354
We add Prometheus gauges which track the client-go cache size for each resource type. For example, the following metrics are added to the destination controller:
```
# HELP endpoint_cache_size Number of items in the client-go endpoint cache
# TYPE endpoint_cache_size gauge
endpoint_cache_size 21
# HELP job_cache_size Number of items in the client-go job cache
# TYPE job_cache_size gauge
job_cache_size 0
# HELP namespace_cache_size Number of items in the client-go namespace cache
# TYPE namespace_cache_size gauge
namespace_cache_size 8
# HELP node_cache_size Number of items in the client-go node cache
# TYPE node_cache_size gauge
node_cache_size 1
# HELP pod_cache_size Number of items in the client-go pod cache
# TYPE pod_cache_size gauge
pod_cache_size 23
# HELP replica_set_cache_size Number of items in the client-go replica_set cache
# TYPE replica_set_cache_size gauge
replica_set_cache_size 40
# HELP service_cache_size Number of items in the client-go service cache
# TYPE service_cache_size gauge
service_cache_size 18
# HELP service_profile_cache_size Number of items in the client-go service_profile cache
# TYPE service_profile_cache_size gauge
service_profile_cache_size 4
# HELP traffic_split_cache_size Number of items in the client-go traffic_split cache
# TYPE traffic_split_cache_size gauge
traffic_split_cache_size 0
```
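For reference, a gauge like the ones above can be backed directly by an informer's store; a minimal sketch (not the actual controller code):
```go
package sketch

import (
	"github.com/prometheus/client_golang/prometheus"
	"k8s.io/client-go/tools/cache"
)

// newCacheSizeGauge exposes the number of keys currently held by an informer's
// store, so a "<resource>_cache_size" style gauge stays up to date without any
// explicit bookkeeping.
func newCacheSizeGauge(resource string, informer cache.SharedIndexInformer) prometheus.GaugeFunc {
	g := prometheus.NewGaugeFunc(
		prometheus.GaugeOpts{
			Name: resource + "_cache_size",
			Help: "Number of items in the client-go " + resource + " cache",
		},
		func() float64 { return float64(len(informer.GetStore().ListKeys())) },
	)
	prometheus.MustRegister(g)
	return g
}
```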
Signed-off-by: Alex Leong <alex@buoyant.io>
Fixes #6272
The opaque-ports check is prone to failing with `context deadline exceeded`,
as there are numerous k8s API requests being performed.
This PR updates the pre-fetching logic to instead use
`controller/k8s`, which provides a wrapper around `pkg/k8s` with
caching by using shared informers underneath!
This commit includes the following changes:
- Update `checkMisconfiguredOpaquePortAnnotations` to use
`controllerk8s.KubeAPI` instead of `hc.kubeAPI`
- The `kubeAPI.Sync` fn also had to be updated, as it failed to check
whether the sp and ts shared informers are nil, which can be the
case here where they are not needed.
We had to use `controllerK8s.NewAPI` for the initialization
instead of `controllerk8s.InitializeAPI` in order to take in
`hc.kubeAPI` and support unit testing, etc., since `hc.kubeAPI`
is how we pass the fake resources in unit tests.
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
* spSharedInformers, tsSharedInformers may be nil; calling the method will cause a panic
Signed-off-by: Cookie Wang <wangchl01@inspur.com>
Fixes #4191 #4993
This bumps Kubernetes client-go to the latest v0.19.2 (We had to switch directly to 1.19 because of this issue). Bumping to v0.19.2 required upgrading to smi-sdk-go v0.4.1. This also depends on linkerd/stern#5
This consists of the following changes:
- Fix ./bin/update-codegen.sh by adding the template path to the gen commands, as it is needed after we moved to GOMOD.
- Bump all k8s related dependencies to v0.19.2
- Generate CRD types, client code using the latest k8s.io/code-generator
- Use context.Context as the first argument, in all code paths that touch the k8s client-go interface
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
This PR corrects misspellings identified by the [check-spelling action](https://github.com/marketplace/actions/check-spelling).
The misspellings have been reported at aaf440489e (commitcomment-41423663)
The action reports that the changes in this PR would make it happy: 5b82c6c5ca
Note: this PR does not include the action. If you're interested in running a spell check on every PR and push, that can be offered separately.
Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
Based on the [EndpointSlice PR](https://github.com/linkerd/linkerd2/pull/4663), this is just the k8s/api support for endpointslices to shorten the first PR.
* Adds CRD
* Adds functions that check whether the cluster has EndpointSlice access
* Adds discovery & endpointslice informers to api.
Signed-off-by: Matei David <matei.david.35@gmail.com>
When viewing the output of `linkerd stat` for services which do not have a selector (such as services created by the service-mirror, for example) the meshed count column shows the total number which exist, even though the service actually selects no pods at all.
We update the StatSummary implementation to account for services which have no selector.
Additionally, we update the logic of the `--unmeshed` flag. When the `--unmeshed` flag is not set, we typically skip rows for unmeshed resources because those resources would have no stats. This is not appropriate to do when the `--from` flag is also set because in this case, metrics are not collected on the target resource but are instead collected on the client-side. This means that stats can be present, even for unmeshed resources and these resources should still be displayed, even if the `--unmeshed` flag is not set.
Signed-off-by: Alex Leong <alex@buoyant.io>
Here we upgrade our dependencies on client-go to 0.17.4 and smi-sdk-go to 0.3.0. Since smi-sdk-go uses client-go 0.17.4, these upgrades must be performed simultaneously.
This also requires simultaneously upgrading our dependency on linkerd/stern to a SHA which also uses client-go 0.17.4. This keeps all of our transitive dependencies synchronized on one version of client-go.
This ALSO requires updating our codegen scripts to use the 0.17.4 version of code-generator and running it to generate 0.17.4 compatible generated code. I took this opportunity to update our code generation script to properly use the version of code-generator from `go.mod` rather than a hardcoded SHA.
Signed-off-by: Alex Leong <alex@buoyant.io>
This PR introduces a service mirroring component that is responsible for watching remote clusters and mirroring their services locally.
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
This PR adds support for CronJobs and ReplicaSets to `linkerd inject`, the web
dashboard and CLI. It adds a new Grafana dashboard for each kind of resource.
Closes #3614 Closes #3630 Closes #3584 Closes #3585
Signed-off-by: Sergio Castaño Arteaga tegioz@icloud.com
Signed-off-by: Cintia Sanchez Garcia cynthiasg@icloud.com
* If tap source IP matches many running pods then only show the IP
When an unmeshed source IP matched more than one running pod, tap was
showing the names of all those pods, even though they didn't necessarily
originate the connection. This could be reproduced when using a pod
network add-on such as Calico.
With this change, if a node matches, we return it; otherwise we proceed to look for a matching pod. If exactly one running pod matches, we return it. Otherwise we return just the IP.
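A rough sketch of that lookup order (the types are the standard core/v1 API objects; the function itself is illustrative, not tap's actual code):
```go
package sketch

import (
	corev1 "k8s.io/api/core/v1"
)

// labelForIP implements the lookup order described above: a matching node
// wins, then a single running pod owning the IP, otherwise just the raw IP.
func labelForIP(ip string, nodes []corev1.Node, pods []corev1.Pod) string {
	for _, n := range nodes {
		for _, addr := range n.Status.Addresses {
			if addr.Address == ip {
				return n.Name
			}
		}
	}

	var running []corev1.Pod
	for _, p := range pods {
		if p.Status.PodIP == ip && p.Status.Phase == corev1.PodRunning {
			running = append(running, p)
		}
	}
	if len(running) == 1 {
		return running[0].Name
	}
	// Zero or several candidates (e.g. with a CNI like Calico reusing IPs):
	// fall back to the IP rather than guessing.
	return ip
}
```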
Fixes #3103
The `linkerd upgrade --from-manifests` command supports reading the
manifest output via `linkerd install`. PR #3167 introduced a tap
APIService object into `linkerd install`, but the manifest-reading code
in fake.go was never updated to support this new object kind.
Update the fake clientset code to support APIService objects.
Fixes #3559
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
* Fix injector timeout under high load
Fixes #3358
When retrieving a pod owner, we were hitting the k8s API directly because
at injection time the informer might not have been informed about the
existence of the parent object.
Under a large number of injection requests this resulted in the k8s API
requests being throttled, the proxy-injector getting blocked and the
webhook requests timing out.
Now we'll hit the shared informer first, and hit the k8s API only when
the informer doesn't return anything. After a few injection requests for
the same owner, the informer should have been updated.
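A minimal sketch of that informer-first lookup, using today's client-go signatures and a Deployment owner as an example (names are illustrative):
```go
package sketch

import (
	"context"

	appsv1 "k8s.io/api/apps/v1"
	"k8s.io/apimachinery/pkg/api/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	appslisters "k8s.io/client-go/listers/apps/v1"
)

// getDeployment checks the shared informer's lister first and only falls back
// to a direct (throttled) API call when the object has not reached the cache
// yet.
func getDeployment(ctx context.Context, lister appslisters.DeploymentLister, cs kubernetes.Interface, ns, name string) (*appsv1.Deployment, error) {
	d, err := lister.Deployments(ns).Get(name)
	if err == nil {
		return d, nil
	}
	if !errors.IsNotFound(err) {
		return nil, err
	}
	// Cache miss: the owner may have been created very recently, so ask the
	// API server directly as a last resort.
	return cs.AppsV1().Deployments(ns).Get(ctx, name, metav1.GetOptions{})
}
```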
Testing:
Scaling an emoji deployment to 1000 replicas, and after waiting for a
couple of minutes:
Before:
```bash
# a portion of the pods doesn't get injected
$ kubectl -n emojivoto get po | grep ./1 | wc -l
109
kubectl -n kube-system logs -f kube-apiserver-minikube | grep
failing.*timeout
.... (lots of errors)
```
After:
```bash
# all the pods get injected
$ kubectl -n emojivoto get po | grep ./1 | wc -l
0
kubectl -n kube-system logs -f kube-apiserver-minikube | grep
failing.*timeout
```
Fixes #3356
Kubernetes 1.16 removes some API groups that were already deprecated. From the k8s blog
post (https://kubernetes.io/blog/2019/07/18/api-deprecations-in-1-16/):
```
- PodSecurityPolicy: will no longer be served from extensions/v1beta1 in
v1.16.
Migrate to the policy/v1beta1 API, available since v1.10. Existing
persisted data can be retrieved/updated via the policy/v1beta1 API.
- DaemonSet, Deployment, StatefulSet, and ReplicaSet: will no longer be
served from extensions/v1beta1, apps/v1beta1, or apps/v1beta2 in v1.16.
Migrate to the apps/v1 API, available since v1.9. Existing persisted
data can be retrieved/updated via the apps/v1 API.
```
Previous PRs had already made this change at the Helm templates level,
but we still needed to do it in the API calls and tests.
The integration tests ran fine for k8s 1.12 and 1.15. They fail on 1.16
because the upgrade integration test tries to install linkerd 2.5 which is not
compatible with 1.16.
Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
* Set custom cluster domain in GetServiceProfileFor
* Set custom cluster domain in tap server
Move fetching cluster domain for tap server to cmd main
* Handle fetching cluster domain errors separately
* Use custom cluster domain for traffic split adaptor
Signed-off-by: Armin Buerkle <armin.buerkle@alfatraining.de>
* Have the proxy-injector emit events upon injection/skipping injection
Fixes #3253
Have the proxy-injector emit an event whenever an injection happens, or
when injection is skipped for some reason (that reason is also added to
the proxy-injector logs). The event is associated with the parent workload
(it can't be associated with the pod because at this point the pod hasn't
been persisted).
The event recorder was setup at the `webhook/server.go` level and passed
to the proxy-injector's `Inject` function. The sp-validator thus also
has access to the event recorder, but for now it's not using it.
Related changes:
- Refactored `api.GetOwnerKindAndName()` to have it return a more
generic object.
- Refactored `report.Injectable()` to also have it return the reason why
a workload is not injectable.
Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
Similar to `kubectl --as`, this adds a global flag across all linkerd
subcommands which sets an `ImpersonationConfig` in the Kubernetes API config.
Signed-off-by: Andrew Seigner <siggy@buoyant.io>