mirror of https://github.com/linkerd/linkerd2.git
1 Commits
Author | SHA1 | Message | Date |
---|---|---|---|
|
a330d20aa0
|
stat_summary: support service metrics using `authority` label (#6514)
Currently, `viz stat` on services is pretty restricted because a service is not a pod-owner resource. This PR fixes that by querying the Prometheus metrics with the `direction="outbound", authority="svc"` labels. This means that for services, we can generate metrics from the *meshed* client side.

`StatsSummary` metrics on a service are further divided into two kinds:

### Service has no `ServiceProfiles.dstOverrides`

In this case, we just return the metrics by querying for `direction="outbound", authority="svc"`, along with any `--from` resources specified as client query labels. We also gate this path so that it fails for requests that have `--from` as a service or for `svc/* --to xyz`, as they are invalid, i.e. we can't render metrics with a service as the client.

### Service has `ServiceProfiles.dstOverrides`

Here, we follow a path similar to that of `TrafficSplit`, except that we use a `ServiceProfile` resource object instead.

_The TrafficSplit path will be removed or merged into the `Service` path in a separate PR for simplification._

## Testing

### Apply Traffic Splitting through `ServiceProfiles`

```bash
on ⛵ kind-kind linkerd2 on 🌱 taru [📦++1🤷] via 🐼 v1.16.5 took 1m11s
➜ k create ns linkerd-trafficsplit-test-sp    ~/work/linkerd2
namespace/linkerd-trafficsplit-test-sp created

on ⛵ kind-kind linkerd2 on 🌱 taru [📦++1🤷] via 🐼 v1.16.5
➜ ./bin/linkerd inject ./test/integration/trafficsplit/testdata/application.yaml | k -n linkerd-trafficsplit-test-sp apply -f -    ~/work/linkerd2

document missing "kind" field, skipped
deployment "backend" injected
service "backend-svc" skipped
deployment "failing" injected
service "failing-svc" skipped
deployment "slow-cooker" injected
service "slow-cooker" skipped

deployment.apps/backend created
service/backend-svc created
deployment.apps/failing created
service/failing-svc created
deployment.apps/slow-cooker created
service/slow-cooker created

on ⛵ kind-kind linkerd2 on 🌱 taru [📦++1🤷] via 🐼 v1.16.5
➜ k apply -f ./test/integration/trafficsplit/testdata/sp/updated-traffic-split-leaf-weights.yaml -n linkerd-trafficsplit-test-sp    ~/work/linkerd2
serviceprofile.linkerd.io/backend-svc.linkerd-trafficsplit-test-sp.svc.cluster.local created

on ⛵ kind-kind linkerd2 on 🌱 taru [📦++1🤷] via 🐼 v1.16.5
➜ k describe sp -n linkerd-trafficsplit-test-sp    ~/work/linkerd2
Name:         backend-svc.linkerd-trafficsplit-test-sp.svc.cluster.local
Namespace:    linkerd-trafficsplit-test-sp
Labels:       <none>
Annotations:  <none>
API Version:  linkerd.io/v1alpha2
Kind:         ServiceProfile
Metadata:
  Creation Timestamp:  2021-07-01T11:05:06Z
  Generation:          1
  Managed Fields:
    API Version:  linkerd.io/v1alpha2
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .:
          f:kubectl.kubernetes.io/last-applied-configuration:
      f:spec:
        .:
        f:dstOverrides:
    Manager:         kubectl-client-side-apply
    Operation:       Update
    Time:            2021-07-01T11:05:06Z
  Resource Version:  1398
  UID:               fce0a250-1396-4a14-9729-e19030048c7a
Spec:
  Dst Overrides:
    Authority:  backend-svc.linkerd-trafficsplit-test-sp.svc.cluster.local
    Weight:     500m
    Authority:  failing-svc.linkerd-trafficsplit-test-sp.svc.cluster.local:8081
    Weight:     500m
Events:         <none>
```

### CLI Output

```bash
on ⛵ kind-kind linkerd2 on 🌱 main [📦📝🤷] via 🐼 v1.16.6 via
➜ ./bin/go-run cli viz stat svc/backend-svc -n linkerd-trafficsplit-test-sp    ~/work/linkerd2
NAME                                                        APEX                                                        LEAF          WEIGHT   SUCCESS   RPS      LATENCY_P50   LATENCY_P95   LATENCY_P99
backend-svc.linkerd-trafficsplit-test-sp.svc.cluster.local  backend-svc.linkerd-trafficsplit-test-sp.svc.cluster.local  backend-svc   500m     100.00%   0.9rps   1ms           2ms           2ms
backend-svc.linkerd-trafficsplit-test-sp.svc.cluster.local  backend-svc.linkerd-trafficsplit-test-sp.svc.cluster.local  failing-svc   500m     0.00%     1.1rps   1ms           2ms           2ms

on ⛵ kind-kind linkerd2 on 🌱 main [📦📝🤷] via 🐼 v1.16.6 via took 2s
➜ ./bin/go-run cli viz stat svc/backend-svc -n linkerd-trafficsplit-test-sp --from deploy/slow-cooker    ~/work/linkerd2
NAME                                                        APEX                                                        LEAF          WEIGHT   SUCCESS   RPS      LATENCY_P50   LATENCY_P95   LATENCY_P99
backend-svc.linkerd-trafficsplit-test-sp.svc.cluster.local  backend-svc.linkerd-trafficsplit-test-sp.svc.cluster.local  backend-svc   500m     100.00%   0.4rps   1ms           2ms           2ms
backend-svc.linkerd-trafficsplit-test-sp.svc.cluster.local  backend-svc.linkerd-trafficsplit-test-sp.svc.cluster.local  failing-svc   500m     0.00%     0.6rps   1ms           2ms           2ms

on ⛵ kind-kind linkerd2 on 🌱 main [📦📝🤷] via 🐼 v1.16.6 via took 2s
➜ ./bin/go-run cli viz stat svc/backend-svc -n linkerd-trafficsplit-test-sp --from deploy/slow-cooker-1    ~/work/linkerd2
NAME                                                        APEX                                                        LEAF          WEIGHT   SUCCESS   RPS      LATENCY_P50   LATENCY_P95   LATENCY_P99
backend-svc.linkerd-trafficsplit-test-sp.svc.cluster.local  backend-svc.linkerd-trafficsplit-test-sp.svc.cluster.local  backend-svc   500m     100.00%   0.5rps   1ms           2ms           2ms
backend-svc.linkerd-trafficsplit-test-sp.svc.cluster.local  backend-svc.linkerd-trafficsplit-test-sp.svc.cluster.local  failing-svc   500m     0.00%     0.5rps   1ms           2ms           2ms

on ⛵ kind-kind linkerd2 on 🌱 main [📦📝🤷] via 🐼 v1.16.6 via
➜ ./bin/go-run cli viz stat svc/prometheus -n linkerd-viz    ~/work/linkerd2
StatSummary API error: service only supported as a target on 'from' queries, or as a destination on 'to' queries%

# With no `sp.dstOverrides`

on ⛵ kind-kind linkerd2 on 🌱 taru [📦📝🤷] via 🐼 v1.16.6 via took 10s
➜ k -n linkerd-trafficsplit-test-sp delete sp backend-svc.linkerd-trafficsplit-test-sp.svc.cluster.local    ~/work/linkerd2
serviceprofile.linkerd.io "backend-svc.linkerd-trafficsplit-test-sp.svc.cluster.local" deleted

on ⛵ kind-kind linkerd2 on 🌱 taru [📦📝🤷] via 🐼 v1.16.6 via
➜ ./bin/go-run cli viz stat svc/backend-svc -n linkerd-trafficsplit-test-sp    ~/work/linkerd2
NAME          MESHED   SUCCESS   RPS      LATENCY_P50   LATENCY_P95   LATENCY_P99
backend-svc   -        100.00%   1.2rps   1ms           2ms           2ms

on ⛵ kind-kind linkerd2 on 🌱 taru [📦📝🤷] via 🐼 v1.16.6 via
➜ ./bin/go-run cli viz stat svc/backend-svc -n linkerd-trafficsplit-test-sp --from deploy/slow-cooker-1 --from-namespace linkerd-trafficsplit-test-sp
NAME          MESHED   SUCCESS   RPS      LATENCY_P50   LATENCY_P95   LATENCY_P99
backend-svc   -        100.00%   0.6rps   1ms           2ms           2ms

on ⛵ kind-kind linkerd2 on 🌱 taru [📦📝🤷] via 🐼 v1.16.6 via
➜ ./bin/go-run cli viz stat svc/backend-svc -n linkerd-trafficsplit-test-sp --from deploy/slow-cooker --from-namespace linkerd-trafficsplit-test-sp
NAME          MESHED   SUCCESS   RPS      LATENCY_P50   LATENCY_P95   LATENCY_P99
backend-svc   -        100.00%   0.7rps   1ms           2ms           2ms

on ⛵ kind-kind linkerd2 on 🌱 taru [📦📝🤷] via 🐼 v1.16.6 via
➜ ./bin/go-run cli viz stat deploy/slow-cooker -n linkerd-trafficsplit-test-sp --to svc/backend-svc    ~/work/linkerd2
No traffic found.

on ⛵ kind-kind linkerd2 on 🌱 taru [📦📝🤷] via 🐼 v1.16.6 via
➜    ~/work/linkerd2
```

Note: _This means that we need documentation changes to let the user know that `viz stat` on a service reports client-side metrics and would be missing metrics from unmeshed clients._

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com> |
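
For readers skimming the commit message above, the label selection it describes (outbound direction plus the service authority, optional `--from` client labels, and a gate that rejects a service as the client) could be sketched roughly as follows. This is an illustration only, not the code from the PR: the function name `service_selector` is hypothetical, the client label names (`deployment`, `namespace`) are assumptions, and the exact-match authority value is a simplification (a real query may need to account for ports).

```python
from typing import Optional


def service_selector(
    service: str,
    namespace: str,
    from_kind: Optional[str] = None,
    from_name: Optional[str] = None,
    from_namespace: Optional[str] = None,
) -> str:
    """Build a Prometheus-style label selector for client-side service metrics.

    Hypothetical helper: names and label keys are assumptions for illustration.
    """
    # Gate: a service cannot be the client of a query, so reject it up front.
    if from_kind in ("svc", "service"):
        raise ValueError("a service cannot act as the client of a stat query")

    # Metrics are taken from the meshed client side: outbound traffic whose
    # authority is the target service (assumed fully qualified DNS name).
    labels = {
        "direction": "outbound",
        "authority": f"{service}.{namespace}.svc.cluster.local",
    }

    # Optional --from resource becomes extra client query labels.
    if from_kind is not None and from_name is not None:
        labels[from_kind] = from_name
        labels["namespace"] = from_namespace or namespace

    # Sort keys so the generated selector is deterministic.
    return "{" + ", ".join(f'{k}="{v}"' for k, v in sorted(labels.items())) + "}"
```

Calling `service_selector("backend-svc", "demo", "deployment", "slow-cooker")` under these assumptions yields a selector that restricts the outbound metrics to that one client deployment, while passing `from_kind="svc"` raises an error, mirroring the gating described in the message.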