mirror of https://github.com/istio/istio.io.git
zh-translation:/docs/ops/common-problems/observability-issues/index.md (#5764)
* zh-translation:content/zh/docs/ops/common-problems/observability-issues/index.md
* rm tail space
* add & trans new content
* fix trans problems
parent 971a07661e
commit 1cd092da15
---
title: Observability Problems
description: Dealing with telemetry collection issues.
force_inline_toc: true
weight: 30
aliases:
    - /zh/docs/ops/troubleshooting/missing-traces
---

## Expected metrics are not being collected {#expected-metrics-are-not-being-collected}

The following procedure helps you diagnose problems where metrics you are expecting to see reported are not being collected.

The expected flow for metrics is:

1. Envoy reports attributes from requests asynchronously to Mixer in a batch.

1. Mixer translates the attributes into instances based on the operator-provided configuration.

1. Mixer hands the instances to Mixer adapters for processing and backend storage.

1. The backend storage systems record the metrics data.

The Mixer default installations include a Prometheus adapter and the configuration to generate a [default set of metric values](/zh/docs/reference/config/policy-and-telemetry/metrics/) and send them to the Prometheus adapter. The Prometheus adapter configuration enables a Prometheus instance to scrape Mixer for metrics.
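The scraping side of this pipeline is plain Prometheus configuration. As an illustrative sketch (the `istio-mesh` job name and the Mixer target address match the entry shown by the Prometheus configuration check later on this page; the scrape interval is an assumed value):

```yaml
# Illustrative scrape job for Mixer-generated metrics. Mixer serves the
# generated mesh metrics on port 42422 of the istio-mixer service, while
# its self-monitoring metrics are served separately on port 15014.
- job_name: 'istio-mesh'
  scrape_interval: 5s          # assumed value; tune for your environment
  static_configs:
  - targets: ['istio-mixer.istio-system:42422']
```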

If the Istio Dashboard or the Prometheus queries don't show the expected metrics, any step of the flow above may present an issue. The following sections provide instructions to troubleshoot each step.

### Verify the Istio CNI pods are running (if needed) {#verify-Istio-CNI-pods-are-running}

The Istio CNI plugin performs traffic redirection for Istio mesh pods during the network-setup phase of the Kubernetes pod lifecycle, removing the [`NET_ADMIN` capability requirement](/zh/docs/ops/setup/required-pod-capabilities/) for users deploying pods into the Istio mesh. The Istio CNI plugin replaces the functionality provided by the `istio-init` container.

1. Verify the `istio-cni-node` pods are running:

    {{< text bash >}}
    $ kubectl -n kube-system get pod -l k8s-app=istio-cni-node
    {{< /text >}}

1. If `PodSecurityPolicy` is enabled in your cluster, make sure the `istio-cni` service account can use a `PodSecurityPolicy` that allows the [`NET_ADMIN` capability](/zh/docs/ops/setup/required-pod-capabilities/).

### Verify Mixer is receiving Report calls {#verify-mixer-is-receiving-report-calls}

Mixer generates metrics to monitor its own behavior. The first step is to check these metrics:

1. Establish a connection to the Mixer self-monitoring endpoint for the `istio-telemetry` deployment. In Kubernetes environments, execute the following command:

    {{< text bash >}}
    $ kubectl -n istio-system port-forward <istio-telemetry pod> 15014 &
    {{< /text >}}

1. Verify successful report calls. On the Mixer self-monitoring endpoint (`http://localhost:15014/metrics`), search for `grpc_io_server_completed_rpcs`. You should see something like:

    {{< text plain >}}
    grpc_io_server_completed_rpcs{grpc_server_method="istio.mixer.v1.Mixer/Report",grpc_server_status="OK"} 2532
    {{< /text >}}

    If you do not see any data for `grpc_io_server_completed_rpcs` with a `grpc_server_method="istio.mixer.v1.Mixer/Report"`, then Envoy is not calling Mixer to report telemetry.

1. In this case, ensure you integrated the services properly into the mesh. You can achieve this task with either [automatic or manual sidecar injection](/zh/docs/setup/additional-setup/sidecar-injection/).

### Verify the Mixer rules exist {#verify-the-mixer-rules-exist}

In Kubernetes environments, issue the following command:

{{< text bash >}}
$ kubectl get rules --all-namespaces
...
istio-system   promtcpconnectionopen    4h
istio-system   tcpkubeattrgenrulerule   4h
{{< /text >}}

If the output shows no rules named `promhttp` or `promtcp`, then the Mixer configuration for sending metric instances to the Prometheus adapter is missing. You must supply the configuration for rules connecting the Mixer metric instances to a Prometheus handler.
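For illustration only, a rule connecting metric instances to a Prometheus handler has roughly the following shape. This is a hand-written sketch, not the shipped default: the instance name `requestcount` and the match expression are assumptions, and the real `promhttp` rule differs in detail.

```yaml
# Hypothetical sketch of a rule that dispatches metric instances to the
# Prometheus handler; names and the match expression are illustrative.
apiVersion: config.istio.io/v1alpha2
kind: rule
metadata:
  name: promhttp
  namespace: istio-system
spec:
  # Only dispatch for HTTP and gRPC traffic.
  match: context.protocol == "http" || context.protocol == "grpc"
  actions:
  - handler: prometheus
    instances:
    - requestcount
```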

For reference, please consult the [default rules for Prometheus]({{< github_file >}}/install/kubernetes/helm/istio/charts/mixer/templates/config.yaml).

### Verify the Prometheus handler configuration exists {#verify-the-Prometheus-handler-configuration-exists}

1. In Kubernetes environments, issue the following command:

    {{< text bash >}}
    $ kubectl get handlers.config.istio.io --all-namespaces
    ...
    istio-system   prometheus   4h
    {{< /text >}}

    If you're upgrading from Istio 1.1 or earlier, issue the following command instead:

    {{< text bash >}}
    $ kubectl get prometheuses.config.istio.io --all-namespaces
    ...
    istio-system   handler   13d
    {{< /text >}}

1. If the output shows no configured Prometheus handlers, you must reconfigure Mixer with the appropriate handler configuration.

For reference, please consult the [default handler configuration for Prometheus]({{< github_file >}}/install/kubernetes/helm/istio/charts/mixer/templates/config.yaml).
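As a rough illustration, a Prometheus handler has the following shape. This sketch is an assumption-labeled outline rather than the shipped default, which defines many more metrics and label names:

```yaml
# Hypothetical sketch of a Prometheus handler; the metric and label
# names below are illustrative placeholders.
apiVersion: config.istio.io/v1alpha2
kind: handler
metadata:
  name: prometheus
  namespace: istio-system
spec:
  compiledAdapter: prometheus
  params:
    metrics:
    - name: requests_total                 # name exposed to Prometheus
      instance_name: requestcount.instance.istio-system
      kind: COUNTER
      label_names:
      - destination_service
      - response_code
```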

### Verify Mixer metric instances configuration exists {#verify-mixer-metric-instances-configuration-exists}

1. In Kubernetes environments, issue the following command:

    {{< text bash >}}
    $ kubectl get instances -o custom-columns=NAME:.metadata.name,TEMPLATE:.spec.compiledTemplate --all-namespaces
    {{< /text >}}

    If you're upgrading from Istio 1.1 or earlier, issue the following command instead:

    {{< text bash >}}
    $ kubectl get metrics.config.istio.io --all-namespaces
    {{< /text >}}

1. If the output shows no configured metric instances, you must reconfigure Mixer with the appropriate instance configuration.

For reference, please consult the [default instances configuration for metrics]({{< github_file >}}/install/kubernetes/helm/istio/charts/mixer/templates/config.yaml).
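For orientation, a metric instance has roughly the following shape. Again, this is a hedged sketch rather than the shipped configuration; the dimension expressions are an illustrative subset:

```yaml
# Hypothetical sketch of a metric instance counting requests; the
# dimensions shown are a small illustrative subset.
apiVersion: config.istio.io/v1alpha2
kind: instance
metadata:
  name: requestcount
  namespace: istio-system
spec:
  compiledTemplate: metric
  params:
    value: "1"                             # count one per request
    dimensions:
      destination_service: destination.service.host | "unknown"
      response_code: response.code | 200
    monitored_resource_type: '"UNSPECIFIED"'
```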

### Verify there are no known configuration errors {#verify-there-are-no-known-configuration-errors}

1. To establish a connection to the `istio-telemetry` self-monitoring endpoint, set up a port-forward to the `istio-telemetry` self-monitoring port as described in [Verify Mixer is receiving Report calls](#verify-mixer-is-receiving-report-calls).

1. For each of the following metrics, verify that the most up-to-date value is 0:

    * `mixer_config_adapter_info_config_errors_total`

    ...

    * `mixer_handler_handler_build_failures_total`

    On the page showing the Mixer self-monitoring port, search for each of the metrics listed above. If everything is configured correctly, you should not find any values for those metrics.

    If any of those metrics have a value, confirm that the metric value with the largest configuration ID is 0. This verifies that Mixer has generated no errors in processing the most recent configuration as supplied.
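To scan for these counters without paging through the endpoint by hand, you can save the self-monitoring page to a file and filter it. A small sketch, where the saved sample below stands in for real output:

```shell
# In a real session you would first capture the page, e.g.:
#   curl -s http://localhost:15014/metrics > mixer-metrics.txt
# The here-document below is an illustrative stand-in for that output.
cat <<'EOF' > mixer-metrics.txt
mixer_config_adapter_info_config_errors_total{configID="1"} 0
mixer_handler_handler_build_failures_total{configID="1"} 0
grpc_io_server_completed_rpcs{grpc_server_status="OK"} 2532
EOF

# Print only the error counters; every value shown should be 0.
grep -E '^mixer_(config|handler)_.*(_errors_total|_failures_total)' mixer-metrics.txt
```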

### Verify Mixer is sending metric instances to the Prometheus adapter {#verify-Mixer-is-sending-metric-instances-to-the-Prometheus-adapter}

1. Establish a connection to the `istio-telemetry` self-monitoring endpoint. Set up a port-forward to the `istio-telemetry` self-monitoring port as described in [Verify Mixer is receiving Report calls](#verify-mixer-is-receiving-report-calls).

1. On the Mixer self-monitoring port, search for `mixer_runtime_dispatches_total`. The output should be similar to:

    {{< text plain >}}
    mixer_runtime_dispatches_total{adapter="prometheus",error="false",handler="prometheus.istio-system",meshFunction="metric"} 2532
    {{< /text >}}

1. Confirm that `mixer_runtime_dispatches_total` is present with the values:

    {{< text plain >}}
    adapter="prometheus"
    error="false"
    {{< /text >}}

    If you can't find recorded dispatches to the Prometheus adapter, there is likely a configuration issue. Please follow the steps above to ensure everything is configured properly.

    If the dispatches to the Prometheus adapter report errors, check the Mixer logs to determine the source of the error. The most likely cause is a configuration issue for the handler listed in `mixer_runtime_dispatches_total`.

1. Check the Mixer logs in a Kubernetes environment with:

    {{< text bash >}}
    $ kubectl -n istio-system logs <istio-telemetry pod> -c mixer
    {{< /text >}}

### Verify Prometheus configuration {#verify-Prometheus-configuration}

1. Connect to the Prometheus UI.

1. Verify you can successfully scrape Mixer through the UI.

1. In Kubernetes environments, set up port-forwarding with:

    {{< text bash >}}
    $ kubectl -n istio-system port-forward $(kubectl -n istio-system get pod -l app=prometheus -o jsonpath='{.items[0].metadata.name}') 9090:9090 &
    {{< /text >}}

1. Visit `http://localhost:9090/targets`

1. Confirm the target `istio-mesh` has a status of UP.

1. Visit `http://localhost:9090/config`

1. Confirm an entry exists similar to:

    {{< text plain >}}
    - job_name: 'istio-mesh'
      ...
      - targets: ['istio-mixer.istio-system:42422']
    {{< /text >}}

## No traces appearing in Zipkin when running Istio locally on Mac {#no-traces-appearing-in-Zipkin-when-running-Istio-locally-on-Mac}

Istio is installed and everything seems to be working, except there are no traces showing up in Zipkin when there should be.

This may be caused by a known [Docker issue](https://github.com/docker/for-mac/issues/1260) where the time inside containers may skew significantly from the time on the host machine. If this is the case, when you select a very long date range in Zipkin you will see the traces appearing as much as several days too early.

You can also confirm this problem by comparing the date inside a Docker container to outside:

{{< text bash >}}
$ docker run --entrypoint date gcr.io/istio-testing/ubuntu-16-04-slave:latest
...
$ date -u
Thu Jun 15 02:25:42 UTC 2017
{{< /text >}}

To fix the problem, you'll need to shut down and then restart Docker before reinstalling Istio.

## Missing Grafana output {#missing-Grafana-output}

If you're unable to get Grafana output when connecting from a local web client to a remotely hosted Istio instance, you should validate that the client and server date and time match.

The time of the web client (e.g. Chrome) affects the output from Grafana. A simple solution to this problem is to verify that a time synchronization service is running correctly within the Kubernetes cluster, and that the web client machine is also correctly using a time synchronization service. Some common time synchronization systems are NTP and Chrony. This is especially problematic in engineering labs with firewalls. In these scenarios, NTP may not be configured properly to point at the lab-based NTP services.
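One quick way to check for skew is to compare Unix timestamps from the client and the cluster. A minimal sketch; the `kubectl exec` target in the comment is a placeholder you would adapt to your deployment:

```shell
# Capture the web client's clock as a Unix epoch timestamp.
local_epoch=$(date -u +%s)

# On a real cluster, capture the server-side clock with something like:
#   remote_epoch=$(kubectl -n istio-system exec deploy/prometheus -- date -u +%s)
# Placeholder so the snippet runs standalone:
remote_epoch=$local_epoch

skew=$(( local_epoch - remote_epoch ))
echo "clock skew: ${skew}s"   # more than a few seconds can distort Grafana's view
```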

@ -11,7 +11,7 @@ keywords: [security,access-control,rbac,tcp,authorization]

This task assumes that you have already:

* Read [Authorization and authentication in Istio](/zh/docs/concepts/security/#authorization).

* Installed Istio in Kubernetes by following the [quick start](/zh/docs/setup/install/kubernetes/) instructions.
@ -9,8 +9,8 @@ keywords: [security,mutual-tls]

* You have completed the [authentication policy](/zh/docs/tasks/security/authn-policy/) task.
* You are familiar with using authentication policies to enable mutual TLS.
* Istio is running on Kubernetes with global mutual TLS enabled. You can follow the [Istio installation instructions](/zh/docs/setup/).
If you already have Istio installed, you can enable mutual TLS by adding or modifying authentication policies and destination rules, following the [enable mutual TLS for all services](/zh/docs/tasks/security/authn-policy/#globally-enabling-istio-mutual-tls) task.
* [httpbin]({{< github_tree >}}/samples/httpbin) and [sleep]({{< github_tree >}}/samples/sleep) are deployed in the `default` namespace, and both applications include an Envoy sidecar. For example, you can use [manual sidecar injection](/zh/docs/setup/additional-setup/sidecar-injection/#manual-sidecar-injection) to deploy the services with the following command:

    {{< text bash >}}