* Protobuf changes:
- Moved `healthcheck.proto` back from viz to `proto/common` as it remains being used by the main `healthcheck.go` library (it was moved to viz by #5510).
- Extracted from `viz.proto` the IP-related types and put them in `/controller/gen/common/net` to be used by both the public and the viz APIs.
* Added chart templates for new viz linkerd-metrics-api pod
* Spin-off viz healthcheck:
- Created `viz/pkg/healthcheck/healthcheck.go` that wraps the original `pkg/healthcheck/healthcheck.go` while adding the `vizNamespace` and `vizAPIClient` fields which were removed from the core `healthcheck`. That way the core healthcheck doesn't have any dependencies on viz, and viz' healthcheck can now be used to retrieve viz api clients.
- The core and viz healthcheck libs are now abstracted out via the new `healthcheck.Runner` interface.
- Refactored the data plane checks so they don't rely on calling `ListPods`
- The checks in `viz/cmd/check.go` have been moved to `viz/pkg/healthcheck/healthcheck.go` as well, so `check.go`'s sole responsibility is dealing with command business. This command also now retrieves its viz api client through viz' healthcheck.
* Removed linkerd-controller dependency on Prometheus:
- Removed the `global.prometheusUrl` config in the core values.yml.
- Leave the Heartbeat's `-prometheus` flag hard-coded temporarily. TO-DO: have it automatically discover viz and pull Prometheus' endpoint (#5352).
* Moved observability gRPC from linkerd-controller to viz:
- Created a new gRPC server under `viz/metrics-api` moving prometheus-dependent functions out of the core gRPC server and into it (same thing for the accompaigning http server).
- Did the same for the `PublicAPIClient` (now called just `Client`) interface. The `VizAPIClient` interface disappears as it's enough to just rely on the viz `ApiClient` protobuf type.
- Moved the other files implementing the rest of the gRPC functions from `controller/api/public` to `viz/metrics-api` (`edge.go`, `stat_summary.go`, etc.).
- Also simplified some type names to avoid stuttering.
* Added linkerd-metrics-api bootstrap files. At the same time, we strip out of the public-api's `main.go` file the prometheus parameters and other no longer relevant bits.
* linkerd-web updates: it requires connecting with both the public-api and the viz api, so both addresses (and the viz namespace) are now provided as parameters to the container.
* CLI updates and other minor things:
- Changes to command files under `cli/cmd`:
- Updated `endpoints.go` according to new API interface name.
- Updated `version.go`, `dashboard` and `uninstall.go` to pull the viz namespace dynamically.
- Changes to command files under `viz/cmd`:
- `edges.go`, `routes.go`, `stat.go` and `top.go`: point to dependencies that were moved from public-api to viz.
- Other changes to have tests pass:
- Added `metrics-api` to list of docker images to build in actions workflows.
- In `bin/fmt` exclude protobuf generated files instead of entire directories because directories could contain both generated and non-generated code (case in point: `viz/metrics-api`).
* Add retry to 'tap API service is running' check
* mc check shouldn't err when viz is not available. Also properly set the log in multicluster/cmd/root.go so that it properly displays messages when --verbose is used
* Separate observability API
Closes#5312
This is a preliminary step towards moving all the observability API into `/viz`, by first moving its protobuf into `viz/metrics-api`. This should facilitate review as the go files are not moved yet, which will happen in a followup PR. There are no user-facing changes here.
- Moved `proto/common/healthcheck.proto` to `viz/metrics-api/proto/healthcheck.prot`
- Moved the contents of `proto/public.proto` to `viz/metrics-api/proto/viz.proto` except for the `Version` Stuff.
- Merged `proto/controller/tap.proto` into `viz/metrics-api/proto/viz.proto`
- `grpc_server.go` now temporarily exposes `PublicAPIServer` and `VizAPIServer` interfaces to separate both APIs. This will get properly split in a followup.
- The web server provides handlers for both interfaces.
- `cli/cmd/public_api.go` and `pkg/healthcheck/healthcheck.go` temporarily now have methods to access both APIs.
- Most of the CLI commands will use the Viz API, except for `version`.
The other changes in the go files are just changes in the imports to point to the new protobufs.
Other minor changes:
- Removed `git add controller/gen` from `bin/protoc-go.sh`
* Remove dependency of linkerd-config for most control plane components
This PR removes the dependency of `linkerd-config` into control
plane components by making all that information passed through CLI
flags. As most of these components require a couple of flags, passing
them as flags could be more helpful, as updations to the flags trigger a
rollout unlike a configMap update.
This does not update the proxy-injector as it needs a lot more data
and mounting `linkerd-config` is better.
Followup to #2990, which refactored `linkerd endpoints` to use the
`Destination.Get` API instead of the `Discovery.Endpoints` API, leaving
the Discovery with no implented methods. This PR removes all the Discovery
code leftovers.
Fixes#3499
### Summary
After the addition of the tap APIServer, all the logic related to tap in the public API no longer needs to be there. The servers and clients that are created but not used, as well as all the old testing infrastrucure related to tap can be removed.
This deprecates TapByResource and therefore required an update to the protobuf files with `bin/protoc-go.sh`. While the change to deprecate this method was extremely small, a lot of protobuf fils were updated in the process. These changes to the code and protobuf files should probably remain coupled since `TapByResource` is officially deprecated in the public API, but a majority of the additions/deletions are related to those files.
This draft passes `go test` as well as a local run of the integration tests.
Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
* Have `linkerd endpoints` use `Destination.Get`
Fixes#2885
We're refactoring `linkerd endpoints` so it hits
directly the `Destination.Get` endpoint, instead of relying on the
Discovery service.
For that, I've created a new `client.go` for Destination and added it to
the `APIClient` interface.
I've also added a `destinationClient` struct that mimics `tapClient`,
and whose common logic has been moved into `stream_client.go`.
Analogously, I added a `destinationServer` struct that mimics
`tapServer`.
Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
This change adds an endpoint to the public API to allow us to query Prometheus for edge data, in order to display identity information for connections between Linkerd proxies. This PR only includes changes to the controller and protobuf.
The Proxy API service lacked introspection of its internal state.
Introduce a new gRPC Discovery API, implemented by two servers:
1) Proxy API Server: returns a snapshot of discovery state
2) Public API Server: pass-through to the Proxy API Server
Also wire up a new `linkerd endpoints` command.
Fixes#2165
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
Add a barebones ListServices endpoint, in support of autocomplete for services.
As we develop service profiles, this endpoint could probably be used to describe
more aspects of services (like, if there were some way to check whether a
service profile was enabled or not).
Accessible from the web UI via http://localhost:8084/api/services
Add a routes command which displays per-route stats for services that have service profiles defined.
This change has three parts:
* A new public-api RPC called `TopRoutes` which serves per-route stat data about a service
* An implementation of TopRoutes in the public-api service. This implementation reads per-route data from Prometheus. This is very similar to how the StatSummaries RPC and much of the code was able to be refactored and shared.
* A new CLI command called `routes` which displays the per-route data in a tabular or json format. This is very similar to the `stat` command and much of the code was able to be refactored and shared.
Note that as of the currently targeted proxy version, only outbound route stats are supported so the `--from` flag must be included in order to see data. This restriction will be lifted in an upcoming change once we add support for inbound route stats as well.
Signed-off-by: Alex Leong <alex@buoyant.io>
This PR begins to migrate Conduit to Linkerd2:
* The proxy has been completely removed from this repo, and is now located at
github.com/linkerd/linkerd2-proxy.
* A `Dockerfile-proxy` has been added to fetch the most-recently published proxy
binary from build.l5d.io.
* Proxy-specific protobuf bindings have been moved to
github.com/linkerd/linkerd2-proxy-api.
* All docker images now use the gcr.io/linkerd-io registry.
* `inject` now uses `LINKERD2_PROXY_` environment variables
* Go paths have been updated to reflect the new (future) repo location.
- Return pod uptimes from the GetPods endpoint
- Adds filtering by namespace to api.GetPods
- Adds a --namespace filter to conduit get pods
- Adds pod uptimes to the controller component toolitps on the ServiceMesh page
- Moves the ServiceMesh page back to using /api/pods
The TapByResource endpoint was previously a stub.
Implement end-to-end tapByResource functionality, with support for
specifying any kubernetes resource(s) as target and destination.
Fixes#803, #49
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
This changes the public api to have a new rpc type, `TapByResource`.
This api supersedes the Tap api. `TapByResource` is richer, more closely
reflecting the proxy's capabilities.
The proxy's Tap api is extended to select over destination labels,
corresponding with those returned by the Destination api.
Now both `Tap` and `TapByResource`'s responses may include destination
labels.
This change avoids breaking backwards compatibility by:
* introducing the new `TapByResource` rpc type, opting not to change Tap
* extending the proxy's Match type with a new, optional, `destination_label` field.
* `TapEvent` is extended with a new, optional, `destination_meta`.
* Remove the telemetry service
The telemetry service is no longer needed, now that prometheus scrapes
metrics directly from proxies, and the public-api talks directly to
prometheus. In this branch I'm removing the service itself as well as
all of the telemetry protobuf, and updating the conduit install command
to no longer install the service. I'm also removing the old version of
the stat command, which required the telemetry service, and renaming the
statsummary command to stat.
* Fix time window tests
* Remove deprecated controller scrape config
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
* Define a new telemetry Stat API
Proposal definition for a new Stat API, for the purposes of satisfying the queries proposed in #627.
StatSummary will replace Stat once implemented and the original Stat deleted.
Follow-up from #315.
Now that the UIs don't report per-path metrics, we can remove the path label from Prometheus, the path aggregation and filtering options from the telemetry API, and the path field from the proxy report API.
I've modified the tests to no longer expect the removed fields, and manually verified that Conduit still works after making these changes.
Closes#265
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
We now create a new test HTTP server per test case instead of sharing it across them all.
This should solve the data races we have experienced on Travis.
Signed-off-by: Phil Calcado <phil@buoyant.io>
The conduit.io/* k8s labels and annotations we're redundant in some
cases, and not flexible enough in others.
This change modifies the labels in the following ways:
`conduit.io/plane: control` => `conduit.io/controller-component: web`
`conduit.io/controller: conduit` => `conduit.io/controller-ns: conduit`
`conduit.io/plane: data` => (remove, redundant with `conduit.io/controller-ns`)
It also centralizes all k8s labels and annotations into
pkg/k8s/labels.go, and adds tests for the install command.
Part of #201
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
Previously, running `$conduit tap` would return a `Unexpected EOF` error when the server wasn't available. This was due to a few problems with the way we were handling errors all the way down the tap server. This change fixes that and cleans some of the protobuf-over-HTTP code.
- first step towards #49
- closes#106