* Have `GetOwnerKindAndName` be able to skip the cache
Refactored `GetOwnerKindAndName` so it can optionally skip the
shared informer cache and instead hit the k8s API directly.
Useful for the proxy injector, when the pod's replicaset got just
created and might not be in ready in the cache yet.
Fixes#2738
Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
Adds a check to Prometheus `edges` queries to verify that data for the requested
resource type exists. Previously, if Prometheus could not find request data for the
requested resource type, it would skip that label and still return data for
other labels in the `by` clause, leading to an incorrect response.
This change adds an endpoint to the public API to allow us to query Prometheus for edge data, in order to display identity information for connections between Linkerd proxies. This PR only includes changes to the controller and protobuf.
Private k8s clusters, such as the private GKE clusters offered by Google
Cloud, cannot be reached through the current API proxy method.
This commit uses the port forwarding feature already developed.
Also modify dashboard command to not fall back to ephemeral port.
Signed-off-by: Jack Price <jackprice@outlook.com>
The new proxy has changed its configuration as follows:
- `LISTENER` urls are now `LISTEN_ADDR` addresses;
- `CONTROL_URL` is now `DESTINATION_SVC_ADDR`;
- `*_NAMESPACE` vars are no longer needed;
- The `PROXY_ID` is now the `DESTINATION_CONTEXT`;
- The "metrics" port is now the "admin" port, since it serves more than
just metrics;
- A readiness probe now checks a dedicated /ready endpoint eagerly.
Identity injection is **NOT** configured by this branch.
## Problem
When an object has no previous route metrics, we do not generate a table for
that object.
The reasoning behind this was for reducing output of the following command:
```
$ linkerd routes deploy --to deploy/foo
```
For each deployment object, if it has no previous traffic to `deploy/foo`, then
a table would not be generated for it.
However, the behavior we see with that indicates there is an error even when a
Service Profile is installed:
```
$ linkerd routes deploy deploy/foo
Error: No Service Profiles found for selected resources
```
## Solution
Always generate a stat table for the queried resource object.
## Validation
I deployed [booksapp](https://github.com/buoyantIO/booksapp) with the `traffic`
deployment removed and Service Profiles installed.
Without the fix, `linkerd routes deploy/webapp` displays an error because there
has been no traffic to `deploy/webapp` without the `traffic` deployment.
With the fix, the following output is generated:
```
ROUTE SERVICE SUCCESS RPS LATENCY_P50 LATENCY_P95 LATENCY_P99
GET / webapp 0.00% 0.0rps 0ms 0ms 0ms
GET /authors/{id} webapp 0.00% 0.0rps 0ms 0ms 0ms
GET /books/{id} webapp 0.00% 0.0rps 0ms 0ms 0ms
POST /authors webapp 0.00% 0.0rps 0ms 0ms 0ms
POST /authors/{id}/delete webapp 0.00% 0.0rps 0ms 0ms 0ms
POST /authors/{id}/edit webapp 0.00% 0.0rps 0ms 0ms 0ms
POST /books webapp 0.00% 0.0rps 0ms 0ms 0ms
POST /books/{id}/delete webapp 0.00% 0.0rps 0ms 0ms 0ms
POST /books/{id}/edit webapp 0.00% 0.0rps 0ms 0ms 0ms
[DEFAULT] webapp 0.00% 0.0rps 0ms 0ms 0ms
```
Closes#2328
Signed-off-by: Kevin Leimkuhler <kevinl@buoyant.io>
linkerd/linkerd2#2428 modified SelfSubjectAccessReview behavior to no
longer paper-over failed ServiceProfile checks, assuming that
ServiceProfiles will be required going forward. There was a lingering
ServiceProfile check in the web's startup that started failing due to
this change, as the web component does not have (and should not need)
ServiceProfile access. The check was originally implemented to inform
the web component whether to expect "single namespace" mode or
ServiceProfile support.
Modify the web's initialization to always expect ServiceProfile support.
Also remove single namespace integration test
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
linkerd/linkerd2#2349 removed the `--single-namespace` flag, in favor of
runtime detection of cluster vs. namespace access, and also
ServiceProfile availability. This maintained control-plane support for
running in these two states.
This change requires control-plane components have cluster-wide
Kubernetes API access and ServiceProfile availability, and will error
out if not. Once #2349 merges, stage 1 install will be a requirement for
a successful stage 2 install.
Part of #2337
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
We were depending on an untagged version of prometheus/client_golang
from Feb 2018.
This bumps our dependency to v0.9.2, from Dec 2018.
Also, this is a prerequisite to #1488.
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
The control-plane components relied on a `--single-namespace` param,
passed from `linkerd install` into each individual component, to
determine which namespaces they were authorized to access, and whether
to support ServiceProfiles. This command-line flag was redundant given
the authorization rules encoded in the parent `linkerd install` output,
via [Cluster]Role[Binding]s.
Modify the control-plane components to query Kubernetes at startup to
determine which namespaces they are authorized to access, and whether
ServiceProfile support is available. This allows removal of the
`--single-namespace` flag on the components.
Also update `bin/test-cleanup` to cleanup the ServiceProfile CRD.
TODO:
- Remove `--single-namespace` flag on `linkerd install`, part of #2164
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
goconst finds repeated strings that could be replaced by a constant:
https://github.com/jgautheron/goconst
Part of #217
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
Adds a flag, tcp_stats to the StatSummary request, which queries prometheus for TCP stats.
This branch returns TCP stats at /api/tps-reports when this flag is true.
TCP stats are now displayed on the Resource Detail pages.
The current queried TCP stats are:
tcp_open_connections
tcp_read_bytes_total
tcp_write_bytes_total
Up until now, the proxy-api controller service has been the sole service
that the proxy communicates with, implementing the majoriry of the API
defined in the `linkerd2-proxy-api` repo. But this is about to change:
linkerd/linkerd2-proxy-api#25 introduces a new Identity service; and
this service must be served outside of the existing proxy-api service
in the linkerd-controller deployment (so that it may run under a
distinct service account).
With this change, the "proxy-api" name becomes less descriptive. It's no
longer "the service that serves the API for the proxy," it's "the
service that serves the Destination API to the proxy." Therefore, it
seems best to bite the bullet and rename this to be the "destination"
service (i.e. because it only serves the
`io.linkerd.proxy.destination.Destination` service).
Co-authored-by: Kevin Lingerfelt <kl@buoyant.io>
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
`golangci-lint` performs numerous checks on Go code, including golint,
ineffassign, govet, and gofmt.
This change modifies `bin/lint` to use `golangci-lint`, and replaces
usage of golint and govet.
Also perform a one-time gofmt cleanup:
- `gofmt -s -w controller/`
- `gofmt -s -w pkg/`
Part of #217
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
Fixes#2077
When looking up service profiles, Linkerd always looks for the service profile objects in the Linkerd control namespace. This is limiting because service owners who wish to create service profiles may not have write access to the Linkerd control namespace.
Instead, we have the control plane look for the service profile in both the client namespace (as read from the proxy's `proxy_id` field from the GetProfiles request and from the service's namespace. If a service profile exists in both namespaces, the client namespace takes priority. In this way, clients may override the behavior dictated by the service.
Signed-off-by: Alex Leong <alex@buoyant.io>
The Proxy API service lacked introspection of its internal state.
Introduce a new gRPC Discovery API, implemented by two servers:
1) Proxy API Server: returns a snapshot of discovery state
2) Public API Server: pass-through to the Proxy API Server
Also wire up a new `linkerd endpoints` command.
Fixes#2165
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
DaemonSet stats are not currently shown in the cli stat command, web ui
or grafana dashboard. This commit adds daemonset support for stat.
Update stat command's help message to reference daemonsets.
Update the public-api to support stats for daemonsets.
Add tests for stat summary and api.
Add daemonset get/list/watch permissions to the linkerd-controller
cluster role that's created using the install command.
Update golden expectation test files for install command
yaml manifest output.
Update web UI with daemonsets
Update navigation, overview and pages to list daemonsets and the pods
associated to them.
Add daemonset paths to server, and ui apps.
Add grafana dashboard for daemonsets; a clone of the deployment
dashboard.
Update dependencies and dockerfile hashes
Add DaemonSet support to tap and top commands
Fixes of #2006
Signed-off-by: Zak Knill <zrjknill@gmail.com>
Fixes#2119
When Linkerd is installed in single-namespace mode, the public-api container panics when it attempts to access watch service profiles.
In single-namespace mode, we no longer watch service profiles and return an informative error when the TopRoutes API is called.
Signed-off-by: Alex Leong <alex@buoyant.io>
* Introduce resource selector and deprecate namespace field for ListPods
* Changes from code review
* Properly deprecate the field
* Do not check for nil
* Fix the mockProm usage
* Protoc changes revert
* Changed from code review
Signed-off-by: Alena Varkockova <varkockova.a@gmail.com>
Fixes#1875
This change improves the `linkerd routes` command in a number of important ways:
* The restriction on the type of the `--to` argument is lifted and any resource type can now be used. Try `--to ns/books`, `--to po/webapp-ABCDEF`, `--to au/linkerd.io`, or even `--to svc`.
* All routes for the target will now be populated in the table, even if there are no Prometheus metrics for that route.
* [UNKNOWN] has been renamed to [DEFAULT]
* The `Service/Authority` column will now list `Service` in all cases except for when an authority target is explicitly requested.
```
$ linkerd routes deploy/traffic --to deploy/webapp
ROUTE SERVICE SUCCESS RPS LATENCY_P50 LATENCY_P95 LATENCY_P99
GET / webapp 100.00% 0.5rps 50ms 180ms 196ms
GET /authors/{id} webapp 100.00% 0.5rps 100ms 900ms 980ms
GET /books/{id} webapp 100.00% 0.9rps 38ms 93ms 99ms
POST /authors webapp 100.00% 0.5rps 35ms 48ms 50ms
POST /authors/{id}/delete webapp 100.00% 0.5rps 83ms 180ms 196ms
POST /authors/{id}/edit webapp 0.00% 0.0rps 0ms 0ms 0ms
POST /books webapp 45.16% 2.1rps 75ms 425ms 485ms
POST /books/{id}/delete webapp 100.00% 0.5rps 30ms 90ms 98ms
POST /books/{id}/edit webapp 56.00% 0.8rps 92ms 875ms 975ms
[DEFAULT] webapp 0.00% 0.0rps 0ms 0ms 0ms
```
This is all made possible by a shift in the way we handle the destination resource. When we get a request with a `ToResource`, we use the k8s API to find all Services which include at least one pod belonging to that resource. We then fetch all service profiles for those services and display the routes from those serivce profiles.
This shift in thinking also precipitates a change in the TopRoutes API where we no longer need special cases for `ToAll` (which can be specified by `--to au`) or `ToAuthority` (which can be specified by `--to au/<authority>`) and instead can use a `ToResource` to handle all cases.
Signed-off-by: Alex Leong <alex@buoyant.io>
Commit 1: Enable lint check for comments
Part of #217. Follow up from #1982 and #2018.
A subsequent commit will fix the ci failure.
Commit 2: Address all comment-related linter errors.
This change addresses all comment-related linter errors by doing the
following:
- Add comments to exported symbols
- Make some exported symbols private
- Recommend via TODOs that some exported symbols should should move or
be removed
This PR does not:
- Modify, move, or remove any code
- Modify existing comments
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
* Add parameter to stats API to skip retrieving Prometheus stats
Used by the dashboard to populate list of resources.
Fixes#1022
Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>
* Prometheus queries check results were being ignored
* Refactor verifyPromQueries() to also test when no prometheus queries
should be generated
* Add test for SkipStats=true
Includes adding ability to public.GenStatSummaryResponse to not generate
basicStats
* Fix previous test
We rework the routes command so that it can accept any Kubernetes resource, making it act much more similarly to the stat command.
Signed-off-by: Alex Leong <alex@buoyant.io>
Filtering by Kubernetes job was not supported. Also filtering by any unknown
type caused a panic.
Add filtering support by Kubernetes job, with special case mapping `job` to
`k8s_job`, to not conflict with Prometheus' job label.
Fix panic when unknown type specified as a `--from` or `--to` flag.
Fix `job` label from `linkerd-proxy` overwriting Prometheus `job` label at
collection time. This caused all metrics collected by proxy sidecars in
Kubernetes jobs to be collected into an incorrect Prometheus job, rather than
the expected `linkerd-proxy` Prometheus job.
Fix `unsupported resource type` tap error message incorrectly printing the
target resource rather than the destination.
Set `--controller-log-level debug` in `install_test.go` for easier debugging.
Expose `slow-cooker`'s metrics via a k8s service in the tap integration test, to
validate proxy requests with a job as destination.
Fixes#1872
Part of #627
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
Add a barebones ListServices endpoint, in support of autocomplete for services.
As we develop service profiles, this endpoint could probably be used to describe
more aspects of services (like, if there were some way to check whether a
service profile was enabled or not).
Accessible from the web UI via http://localhost:8084/api/services
The `linkerd` routes command only supports outbound metrics queries (i.e. ones with the `--from` flag). Inbound queries (i.e. ones without the `--from` flag) never return any metrics.
We update the proxy version and use the new canonicalized form for dst labels to gain support for inbound metrics as well.
Signed-off-by: Alex Leong <alex@buoyant.io>
Add a routes command which displays per-route stats for services that have service profiles defined.
This change has three parts:
* A new public-api RPC called `TopRoutes` which serves per-route stat data about a service
* An implementation of TopRoutes in the public-api service. This implementation reads per-route data from Prometheus. This is very similar to how the StatSummaries RPC and much of the code was able to be refactored and shared.
* A new CLI command called `routes` which displays the per-route data in a tabular or json format. This is very similar to the `stat` command and much of the code was able to be refactored and shared.
Note that as of the currently targeted proxy version, only outbound route stats are supported so the `--from` flag must be included in order to see data. This restriction will be lifted in an upcoming change once we add support for inbound route stats as well.
Signed-off-by: Alex Leong <alex@buoyant.io>
# Problem
When we add a `--from` query to `linkerd stat au` we get more rows than if we would have just run `linkerd stat au`.
Adding a `--from` causes an extra row to be added, and the named authority to be ignored (this is the result we would have expected when running `linkerd stat au -n emojivoto --from deploy/web`).
# Solution
Destination query labels are now appended to `labels` so that those labels can be filtered on.
# Validation
Tests have been updated to reflect the expected expected destination labels now appended in `--from` queries.
Fixes#1766
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
Added support for json output in `linkerd stat` through a new (-o|--output)=json option.
Fixes#1417
Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>
Updates to the Kubernetes utility code in `/controller/k8s` to support interacting with ServiceProfiles.
This makes use of the code generated client added in #1752
Signed-off-by: Alex Leong <alex@buoyant.io>
* Use ListPods always for data plane HC
* Missing changes in grpc_server.go
* Address review comments
* Read proxy version from spec
Signed-off-by: Alena Varkockova <varkockova.a@gmail.com>
The `reader.Read` method only reads as many bytes as are currently available from reader. When reading the 4 byte message length header, if not all 4 of those bytes are available, `Read` will only read the available bytes and return. This causes alignment issues when the message body is read and there are still unread header bytes in the reader. These bytes will appear at the beginning of the message body and cause a crash when the message is unmarshalled.
Use `io.ReadFull` to ensure that we read all 4 of the message length header bytes.
Fixes#1287
Signed-off-by: Alex Leong <alex@buoyant.io>
* Update ant to 3.7.2
* Add autocomplete of namespaces/resources to Tap in web ui
* Add form fields for authority/path/method/rps/scheme
* Add the ability to clear error messages to the error banner
* Add error listener to ws object
This PR begins to migrate Conduit to Linkerd2:
* The proxy has been completely removed from this repo, and is now located at
github.com/linkerd/linkerd2-proxy.
* A `Dockerfile-proxy` has been added to fetch the most-recently published proxy
binary from build.l5d.io.
* Proxy-specific protobuf bindings have been moved to
github.com/linkerd/linkerd2-proxy-api.
* All docker images now use the gcr.io/linkerd-io registry.
* `inject` now uses `LINKERD2_PROXY_` environment variables
* Go paths have been updated to reflect the new (future) repo location.
* Fix bug where we were using dst_authorities as a group by instead of authorities
* Add test to make sure we don't dst_authorities
Previously, we were only checking to make sure we didn't add
dst_authorities in the query labels in promDstQueryLabels but we
weren't checking the groupBy labels in promDstGroupByLabelNames -
this caused us to try to query for dst_authorities when a --from
query was sent. There are no dst_authorities, so there would be no
named results.
- Add Reason to the error data passed from the api
- Rewrite error logic in the UI to try to make it clearer
- Show 0/0 pods meshed instead of 0/0 pods meshed (N/A) if 0 pods are meshed
I realized that our stat summary expectation checker would only check the actual
proto responses against the expectations if the expectations were non-empty.
Problem
If we expected empty results and the api returned actual results, we never actually
check those results against the expectations.
The bug can be reproduced by replacing any nonzero metric we expect in
expectedResponse with expectedResponse: genEmptyResponse()
The tests on master will still pass.
Solution
Remove this line and ensure we get the expected number of stat tables.
- Return pod uptimes from the GetPods endpoint
- Adds filtering by namespace to api.GetPods
- Adds a --namespace filter to conduit get pods
- Adds pod uptimes to the controller component toolitps on the ServiceMesh page
- Moves the ServiceMesh page back to using /api/pods
Adds the ability to query by a new non-kubernetes resource type, "authorities",
in the StatSummary api.
This includes an extensive refactor of stat_summary.go to deal with non-kubernetes
resource types.
- Add documentation to Resource in the public api so we can use it for authority
- Handle non-k8s resource requests in the StatSummary endpoint
- Rewrite stat summary fetching and parsing to handle non-k8s resources
- keys stat summary metric handling by Resource instead of a generated string
- Adds authority to the CLI
- Adds /authorities to the Web UI
- Adds some more stat integration and unit tests
* Add controller admin servers and readiness probes
* Tweak readiness probes to be more sane
* Refactor based on review feedback
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
Problem
`conduit stat` would cause a panic for any resource that wasn't in the list
of StatAllResourceTypes
This bug was introduced by https://github.com/runconduit/conduit/pull/1088/files
Solution
Fix writeStatsToBuffer to not depend on what resources are in StatAllResourceTypes
Also adds a unit test and integration test for `conduit stat ns`
- It would be nice to display container errors in the UI. This PR gets the pod's container
statuses and returns them in the public api
- Also add a terminationMessagePolicy to conduit's inject so that we can capture the
proxy's error messages if it terminates
Previously, in conduit stat all we would just print the map of stat results, which
resulted in the order in which stats were displayed varying between prints.
Fix:
Define an array, k8s.StatAllResourceTypes and use the order in this array to print
the map; ensuring a consistent print order every time the command is run.
Both the conduit stat command and web UI are showing failed and completed pods.
This change filters out those pods before returning the result to the client.
Fixes#1010
Signed-off-by: Ivan Sim <ihcsim@gmail.com>
- Update the `response_total` prometheus query of the StatSummary endpoint to also
break queries out by a `meshed` label.
- Add a 'Secured' column to the web UI/CLI stat displays, which indicate the percentage of traffic
starting and ending in the mesh
This meshed label is used in the CLI/Web UI to display a column of the percentage of traffic that
starts/ends in the mesh. (Which is a proxy indicator for whether that traffic is 'secured' when we
add TLS by default for intra mesh requests).
The `meshed` label is not yet added anywhere, so until it is supplied by the proxy, all traffic will
show up as 0% secured in the web/CLI.
The StatSummary endpoint was dereferencing
StatSummaryRequest.Selector.Resource, causing a panic when it received
an empty request.
Fix StatSummary to use the nil-friendly
StatSummaryRequest.GetSelector().GetResource() methods, and add a test
to validate.
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
* Fix bug where we were dropping parts of the StatSummaryRequest
* Add tests for prometheus query strings and for failed cases
Problem
In #928 I rewrote the stat api to handle 'all' as a resource type. To query for all resource types,
we would copy the Resource, LabelSelector and TimeWindow of the original request, and then
go through all the resource types and set Resource.Type for each resource we wanted to get.
The bug is that while we copy over some fields of the original request, we didn't copy over all
of them - namely Resource.Name and the Outbound resource. So the Stat endpoint would
ignore any --to or --from flags, and would ignore requests for a specific named resource.
Solution
Copy over all fields from the request.
I've also added tests for this case. In this process I've refactored the stat_summary_test code
to make it a bit easier to read/use.
Allow the Stat endpoint in the public-api to accept requests for resourceType "all".
Currently, this queries Pods, Deployments, RCs and Services, but can be modified
to query other resources as well.
Both the CLI and web endpoints now work if you set resourceType to all.
e.g. `conduit stat all`
* Modify the Stat endpoint to also return the count of failed pods
* Add comments explaining pod count stats
* Rename total pod count to running pod count
This is to support the service mesh overview page, as I'd like to include an indicator of
failed pods there.
The `conduit tap` command is now deprecated.
Replace `conduit tap` with `connduit tapByResource`. Rename tapByResource
to tap. The underlying protobuf for tap remains, the tap gRPC endpoint now
returns Unimplemented.
Fixes#804
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
public-api and and tap were both using their own implementations of
the Kubernetes Informer/Lister APIs.
This change factors out all Informer/Lister usage into the Lister
module. This also introduces a new `Lister.GetObjects` method.
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
The TapByResource endpoint was previously a stub.
Implement end-to-end tapByResource functionality, with support for
specifying any kubernetes resource(s) as target and destination.
Fixes#803, #49
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
The `stat` command did not support `service` as a resource type.
This change adds `service` support to the `stat` command. Specifically:
- as a destination resource on `--to` commands
- as a target resource on `--from` commands
Fixes#805
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
The existing `tap` command is being deprecated.
Introduce a `tapByResource` cli command. It supports tapping a Kubernetes
resource or collection of resources, optionally filtered by outbound resources.
This command will eventually replace `tap`.
Part of #778
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
This changes the public api to have a new rpc type, `TapByResource`.
This api supersedes the Tap api. `TapByResource` is richer, more closely
reflecting the proxy's capabilities.
The proxy's Tap api is extended to select over destination labels,
corresponding with those returned by the Destination api.
Now both `Tap` and `TapByResource`'s responses may include destination
labels.
This change avoids breaking backwards compatibility by:
* introducing the new `TapByResource` rpc type, opting not to change Tap
* extending the proxy's Match type with a new, optional, `destination_label` field.
* `TapEvent` is extended with a new, optional, `destination_meta`.
* Expose pod stats in CLI, web UI, and Grafana
* Fix js api helpers test
* Add outbound traffic stats to pod dashboard
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
The public-api previously only permitted 4 hard-coded time windows:
10s, 1m, 10m, 1h. This was primarily a relic of the recently removed
telemetry system.
Modify the public-api to validate the time string, but allow for any
window size, which is then passed through to Prometheus.
Fixes#686
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
* Add namespace as a resource type in public-api
The cli and public-api only supported deployments as a resource type.
This change adds support for namespace as a resource type in the cli and
public-api. This also change includes:
- cli statsummary now prints `-`'s when objects are not in the mesh
- cli statsummary prints `No resources found.` when applicable
- removed `out-` from cli statsummary flags, and analagous proto changes
- switched public-api to use native prometheus label types
- misc error handling and logging fixes
Part of #627
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
* Refactor filter and groupby label formulation
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
* Rename stat_summary.go to stat.go in cli
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
* Update rbac privileges for namespace stats
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
Conduit was relying on apps/v1 to Deployment and ReplicaSet APIs.
apps/v1 is not available on Kubernetes 1.8. This prevented the
public-api from starting.
Switch Conduit to use apps/v1beta2. Also increase the Kubernetes API
cache sync timeout from 10 to 60 seconds, as it was taking 11 seconds on
a test cluster.
Fixes#761
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
* Remove the telemetry service
The telemetry service is no longer needed, now that prometheus scrapes
metrics directly from proxies, and the public-api talks directly to
prometheus. In this branch I'm removing the service itself as well as
all of the telemetry protobuf, and updating the conduit install command
to no longer install the service. I'm also removing the old version of
the stat command, which required the telemetry service, and renaming the
statsummary command to stat.
* Fix time window tests
* Remove deprecated controller scrape config
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
The Prometheus client sometimes returns NaN if a calculation is invalid,
such as histogram_quantile when no requests have occurred.
Add IsNaN check in the public-api and set output to zero.
Fixes#747
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
The ListPods endpoint's logic resides in the telemetry service, which is
going away.
Move ListPods logic into public-api, use new k8s informer APIs.
Fixes#694
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
The new StatSummary endpoint was only providing request volume and
successs rate information.
Add support for retrieving latency stats via StatSummary. Also make
all prometheus calls in parallel, and implement kubernetes test
fixtures.
Fixes#681
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
* Switch public API to use cached k8s resources
* Move shared informer code to separate goroutine
* Fix spelling issue
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
The success rate calculation relies on the `classification` label, but
was incorrectly specifying `fail` rather than `failure`.
Fix public api to specify `failure`. Also re-org public api tests for
easier Kubernetes and Prometheus mocking.
Signed-off-by: Andrew Seigner <siggy@buoyant.io>