remove stackdriver, elastic task pages

clean up links and redirects
Karen Bradshaw 2020-10-28 14:30:33 -04:00
parent 8fd80c3e35
commit 06ad25e38d
5 changed files with 111 additions and 338 deletions

View File

@ -9,23 +9,22 @@ weight: 60
<!-- overview -->
Application logs can help you understand what is happening inside your application. The logs are particularly useful for debugging problems and monitoring cluster activity. Most modern applications have some kind of logging mechanism. Likewise, container engines are designed to support logging. The easiest and most adopted logging method for containerized applications is writing to standard output and standard error streams.
However, the native functionality provided by a container engine or runtime is usually not enough for a complete logging solution.
For example, you may want to access your application's logs if a container crashes, a pod gets evicted, or a node dies.
In a cluster, logs should have a separate storage and lifecycle independent of nodes, pods, or containers. This concept is called _cluster-level logging_.
<!-- body -->
Cluster-level logging architectures require a separate backend to store, analyze, and query logs. Kubernetes
does not provide a native storage solution for log data. Instead, there are many logging solutions that
integrate with Kubernetes. The following sections describe how to handle and store logs on nodes.
## Basic logging in Kubernetes
This example uses a `Pod` specification with a container
to write text to the standard output stream once per second.
{{< codenew file="debug/counter-pod.yaml" >}}
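For reference, the manifest referenced above defines roughly the following Pod (a sketch; the authoritative version is the linked `counter-pod.yaml` example file):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: counter
spec:
  containers:
  - name: count
    image: busybox
    # Print a counter and a timestamp to stdout once per second.
    args: [/bin/sh, -c, 'i=0; while true; do echo "$i: $(date)"; i=$((i+1)); sleep 1; done']
```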
@ -34,8 +33,10 @@ To run this pod, use the following command:
```shell
kubectl apply -f https://k8s.io/examples/debug/counter-pod.yaml
```
The output is:
```console
pod/counter created
```
@ -44,73 +45,73 @@ To fetch the logs, use the `kubectl logs` command, as follows:
```shell
kubectl logs counter
```
The output is:
```console
0: Mon Jan 1 00:00:00 UTC 2001
1: Mon Jan 1 00:00:01 UTC 2001
2: Mon Jan 1 00:00:02 UTC 2001
...
```
You can use `kubectl logs --previous` to retrieve logs from a previous instantiation of a container. If your pod has multiple containers, specify which container's logs you want to access by appending a container name to the command. See the [`kubectl logs` documentation](/docs/reference/generated/kubectl/kubectl-commands#logs) for more details.
## Logging at the node level
![Node level logging](/images/docs/user-guide/logging/logging-node-level.png)
A container engine handles and redirects any output generated to a containerized application's `stdout` and `stderr` streams.
For example, the Docker container engine redirects those two streams to [a logging driver](https://docs.docker.com/engine/admin/logging/overview), which is configured in Kubernetes to write to a file in JSON format.
{{< note >}}
The Docker JSON logging driver treats each line as a separate message. When using the Docker logging driver, there is no direct support for multi-line messages. You need to handle multi-line messages at the logging agent level or higher.
{{< /note >}}
By default, if a container restarts, the kubelet keeps one terminated container with its logs. If a pod is evicted from the node, all corresponding containers are also evicted, along with their logs.
An important consideration in node-level logging is implementing log rotation,
so that logs don't consume all available storage on the node. Kubernetes
is not responsible for rotating logs, but rather a deployment tool
should set up a solution to address that.
For example, in Kubernetes clusters deployed by the `kube-up.sh` script,
there is a [`logrotate`](https://linux.die.net/man/8/logrotate)
tool configured to run each hour. You can also set up a container runtime to
rotate an application's logs automatically.

As an example, you can find detailed information about how `kube-up.sh` sets
up logging for the COS image on GCP in the corresponding
[`configure-helper` script](https://github.com/kubernetes/kubernetes/blob/{{< param "githubbranch" >}}/cluster/gce/gci/configure-helper.sh).
When you run [`kubectl logs`](/docs/reference/generated/kubectl/kubectl-commands#logs) as in
the basic logging example, the kubelet on the node handles the request and
reads directly from the log file. The kubelet returns the content of the log file.
{{< note >}}
If an external system has performed the rotation,
only the contents of the latest log file will be available through
`kubectl logs`. For example, if there's a 10MB file, `logrotate` performs
the rotation and there are two files: one file that is 10MB in size and a second file that is empty.
`kubectl logs` returns the latest log file, which in this example is empty.
{{< /note >}}
[cosConfigureHelper]: https://github.com/kubernetes/kubernetes/blob/{{< param "githubbranch" >}}/cluster/gce/gci/configure-helper.sh
### System component logs
There are two types of system components: those that run in a container and those
that do not run in a container. For example:
* The Kubernetes scheduler and kube-proxy run in a container.
* The kubelet and container runtime do not run in containers.
On machines with systemd, the kubelet and container runtime write to journald. If
systemd is not present, the kubelet and container runtime write to `.log` files
in the `/var/log` directory. System components inside containers always write
to the `/var/log` directory, bypassing the default logging mechanism.
They use the [`klog`](https://github.com/kubernetes/klog)
logging library. You can find the conventions for logging severity for those
components in the [development docs on logging](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-instrumentation/logging.md).
Similar to the container logs, system component logs in the `/var/log`
directory should be rotated. In Kubernetes clusters brought up by
the `kube-up.sh` script, those logs are configured to be rotated by
the `logrotate` tool daily or once the size exceeds 100MB.
@ -129,13 +130,14 @@ While Kubernetes does not provide a native solution for cluster-level logging, t
You can implement cluster-level logging by including a _node-level logging agent_ on each node. The logging agent is a dedicated tool that exposes logs or pushes logs to a backend. Commonly, the logging agent is a container that has access to a directory with log files from all of the application containers on that node.
Because the logging agent must run on every node, it is recommended to run the agent
as a `DaemonSet`.
Node-level logging creates only one agent per node and doesn't require any changes to the applications running on the node.
Containers write stdout and stderr, but with no agreed format. A node-level agent collects these logs and forwards them for aggregation.
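As an illustration of this pattern, a minimal `DaemonSet` for a node-level agent might look like the sketch below. The image and agent are placeholders (any agent that can read the node's log directory, such as fluentd or fluent-bit, fits this shape); it is not a production configuration.

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-log-agent
  namespace: kube-system
spec:
  selector:
    matchLabels:
      name: node-log-agent
  template:
    metadata:
      labels:
        name: node-log-agent
    spec:
      containers:
      - name: agent
        image: fluent/fluentd:v1.11-debian-1  # placeholder agent image
        volumeMounts:
        - name: varlog
          mountPath: /var/log
          readOnly: true
      volumes:
      # Give the agent access to the node's log files.
      - name: varlog
        hostPath:
          path: /var/log
```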
### Using a sidecar container with the logging agent {#sidecar-container-with-logging-agent}
You can use a sidecar container in one of the following ways:
@ -146,28 +148,27 @@ You can use a sidecar container in one of the following ways:
![Sidecar container with a streaming container](/images/docs/user-guide/logging/logging-with-streaming-sidecar.png)
By having your sidecar containers write to their own `stdout` and `stderr`
streams, you can take advantage of the kubelet and the logging agent that
already run on each node. The sidecar containers read logs from a file, a socket,
or journald. Each sidecar container prints a log to its own `stdout` or `stderr` stream.
This approach allows you to separate several log streams from different
parts of your application, some of which can lack support
for writing to `stdout` or `stderr`. The logic behind redirecting logs
is minimal, so it's not a significant overhead. Additionally, because
`stdout` and `stderr` are handled by the kubelet, you can use built-in tools
like `kubectl logs`.
For example, a pod runs a single container, and the container
writes to two different log files using two different formats. Here's a
configuration file for the Pod:
{{< codenew file="admin/logging/two-files-counter-pod.yaml" >}}
It is not recommended to write log entries with different formats to the same log
stream, even if you managed to redirect both components to the `stdout` stream of
the container. Instead, you can create two sidecar containers. Each sidecar
container could tail a particular log file from a shared volume and then redirect
the logs to its own `stdout` stream.
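A sketch of such a pod, closely following the two-files counter example above: the `count` container writes two log files to a shared `emptyDir` volume, and two sidecars each tail one file to their own `stdout`.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: counter
spec:
  containers:
  - name: count
    image: busybox
    args:
    - /bin/sh
    - -c
    - >
      i=0;
      while true;
      do
        echo "$i: $(date)" >> /var/log/1.log;
        echo "$(date) INFO $i" >> /var/log/2.log;
        i=$((i+1));
        sleep 1;
      done
    volumeMounts:
    - name: varlog
      mountPath: /var/log
  # Sidecar: stream the first log file to its own stdout.
  - name: count-log-1
    image: busybox
    args: [/bin/sh, -c, 'tail -n+1 -f /var/log/1.log']
    volumeMounts:
    - name: varlog
      mountPath: /var/log
  # Sidecar: stream the second log file to its own stdout.
  - name: count-log-2
    image: busybox
    args: [/bin/sh, -c, 'tail -n+1 -f /var/log/2.log']
    volumeMounts:
    - name: varlog
      mountPath: /var/log
  volumes:
  - name: varlog
    emptyDir: {}
```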
@ -181,7 +182,10 @@ running the following commands:
```shell
kubectl logs counter count-log-1
```
The output is:
```console
0: Mon Jan 1 00:00:00 UTC 2001
1: Mon Jan 1 00:00:01 UTC 2001
2: Mon Jan 1 00:00:02 UTC 2001
@ -191,7 +195,10 @@ kubectl logs counter count-log-1
```shell
kubectl logs counter count-log-2
```
The output is:
```console
Mon Jan 1 00:00:00 UTC 2001 INFO 0
Mon Jan 1 00:00:01 UTC 2001 INFO 1
Mon Jan 1 00:00:02 UTC 2001 INFO 2
@ -202,16 +209,15 @@ The node-level agent installed in your cluster picks up those log streams
automatically without any further configuration. If you like, you can configure
the agent to parse log lines depending on the source container.
Note that despite low CPU and memory usage (on the order of a couple of millicores
for CPU and a few megabytes for memory), writing logs to a file and
then streaming them to `stdout` can double disk usage. If you have
an application that writes to a single file, it's recommended to set
`/dev/stdout` as the destination rather than implement the streaming sidecar
container approach.
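For instance, if the application insists on writing to a fixed file path, the container can link that path to `/dev/stdout` before starting the application. This is a sketch; the image name, binary, and log path are hypothetical.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: single-file-app
spec:
  containers:
  - name: app
    image: example.com/my-app:1.0   # hypothetical image
    command: ["/bin/sh", "-c"]
    # Point the application's log file at stdout, then start the application.
    args:
    - ln -sf /dev/stdout /var/log/app.log && exec /usr/local/bin/my-app
```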
Sidecar containers can also be used to rotate log files that cannot be
rotated by the application itself. An example of this approach is a small container running `logrotate` periodically.
However, it's recommended to use `stdout` and `stderr` directly and leave rotation
and retention policies to the kubelet.
@ -226,21 +232,17 @@ configured specifically to run with your application.
{{< note >}}
Using a logging agent in a sidecar container can lead
to significant resource consumption. Moreover, you won't be able to access
those logs using `kubectl logs` because they are not controlled
by the kubelet.
{{< /note >}}
Here are two configuration files that you can use to implement a sidecar container with a logging agent. The first file contains
a [`ConfigMap`](/docs/tasks/configure-pod-container/configure-pod-configmap/) to configure fluentd.
{{< codenew file="admin/logging/fluentd-sidecar-config.yaml" >}}
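The referenced ConfigMap configures fluentd to tail the two log files and forward them, roughly along these lines (a sketch; the `stdout` output plugin is only a placeholder for whatever backend your agent actually ships logs to):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluentd-config
data:
  fluentd.conf: |
    # Tail each of the two application log files under its own tag.
    <source>
      @type tail
      path /var/log/1.log
      pos_file /var/log/1.log.pos
      tag count.format1
      <parse>
        @type none
      </parse>
    </source>
    <source>
      @type tail
      path /var/log/2.log
      pos_file /var/log/2.log.pos
      tag count.format2
      <parse>
        @type none
      </parse>
    </source>
    # Placeholder output: replace with your logging backend's plugin.
    <match **>
      @type stdout
    </match>
```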
{{< note >}}
For information about configuring fluentd, see the [fluentd documentation](https://docs.fluentd.org/).
{{< /note >}}
The second file describes a pod that has a sidecar container running fluentd.
@ -248,18 +250,10 @@ The pod mounts a volume where fluentd can pick up its configuration data.
{{< codenew file="admin/logging/two-files-counter-pod-agent-sidecar.yaml" >}}
In the sample configurations, you can replace fluentd with any logging agent, reading from any source inside an application container.
### Exposing logs directly from the application
![Exposing logs directly from the application](/images/docs/user-guide/logging/logging-from-application.png)
Cluster-level logging that exposes or pushes logs directly from every application is outside the scope of Kubernetes.

View File

@ -1,94 +0,0 @@
---
reviewers:
- piosz
- x13n
content_type: concept
title: Events in Stackdriver
---
<!-- overview -->
Kubernetes events are objects that provide insight into what is happening
inside a cluster, such as what decisions were made by the scheduler or why some
pods were evicted from the node. You can read more about using events
for debugging your application in the
[Application Introspection and Debugging](/docs/tasks/debug-application-cluster/debug-application-introspection/)
section.
Since events are API objects, they are stored in the apiserver on the master node. To
avoid filling up the master's disk, a retention policy is enforced: events are
removed one hour after the last occurrence. To provide a longer history
and aggregation capabilities, a third-party solution should be installed
to capture events.
This article describes a solution that exports Kubernetes events to
Stackdriver Logging, where they can be processed and analyzed.
{{< note >}}
It is not guaranteed that all events happening in a cluster will be
exported to Stackdriver. One possible scenario when events will not be
exported is when the event exporter is not running (for example, during a restart or
upgrade). In most cases it's fine to use events for purposes like setting up
[metrics](https://cloud.google.com/logging/docs/logs-based-metrics/) and [alerts](https://cloud.google.com/logging/docs/logs-based-metrics/charts-and-alerts), but you should be aware
of the potential inaccuracy.
{{< /note >}}
<!-- body -->
## Deployment
### Google Kubernetes Engine
In Google Kubernetes Engine, if cloud logging is enabled, the event exporter
is deployed by default to clusters whose master runs version 1.7 or
higher. To prevent disturbing your workloads, the event exporter does not have
resources set and is in the BestEffort QoS class, which means that it will
be the first to be killed in the case of resource starvation. If you want
your events to be exported, make sure you have enough resources to accommodate
the event exporter pod. This may vary depending on the workload, but on
average, approximately 100MB of RAM and 100m of CPU are needed.
### Deploying to an Existing Cluster
Deploy event exporter to your cluster using the following command:
```shell
kubectl apply -f https://k8s.io/examples/debug/event-exporter.yaml
```
Since the event exporter accesses the Kubernetes API, it requires permissions to
do so. The following deployment is configured to work with RBAC
authorization. It sets up a service account and a cluster role binding
to allow the event exporter to read events. To make sure that the event exporter
pod is not evicted from the node, you can additionally set up resource
requests. As mentioned earlier, 100MB of RAM and 100m of CPU should be enough.
{{< codenew file="debug/event-exporter.yaml" >}}
## User Guide
Events are exported to the `GKE Cluster` resource in Stackdriver Logging.
You can find them by selecting an appropriate option from a drop-down menu
of available resources:
<img src="/images/docs/stackdriver-event-exporter-resource.png" alt="Events location in the Stackdriver Logging interface" width="500">
You can filter based on the event object fields using Stackdriver Logging
[filtering mechanism](https://cloud.google.com/logging/docs/view/advanced_filters).
For example, the following query will show events from the scheduler
about pods from deployment `nginx-deployment`:
```
resource.type="gke_cluster"
jsonPayload.kind="Event"
jsonPayload.source.component="default-scheduler"
jsonPayload.involvedObject.name:"nginx-deployment"
```
{{< figure src="/images/docs/stackdriver-event-exporter-filter.png" alt="Filtered events in the Stackdriver Logging interface" width="500" >}}

View File

@ -1,126 +0,0 @@
---
reviewers:
- piosz
- x13n
content_type: concept
title: Logging Using Elasticsearch and Kibana
---
<!-- overview -->
On the Google Compute Engine (GCE) platform, the default logging support targets
[Stackdriver Logging](https://cloud.google.com/logging/), which is described in detail
in the [Logging With Stackdriver Logging](/docs/tasks/debug-application-cluster/logging-stackdriver).
This article describes how to set up a cluster to ingest logs into
[Elasticsearch](https://www.elastic.co/products/elasticsearch) and view
them using [Kibana](https://www.elastic.co/products/kibana), as an alternative to
Stackdriver Logging when running on GCE.
{{< note >}}
You cannot automatically deploy Elasticsearch and Kibana in the Kubernetes cluster hosted on Google Kubernetes Engine. You have to deploy them manually.
{{< /note >}}
<!-- body -->
To use Elasticsearch and Kibana for cluster logging, you should set the
following environment variable as shown below when creating your cluster with
kube-up.sh:
```shell
KUBE_LOGGING_DESTINATION=elasticsearch
```
You should also ensure that `KUBE_ENABLE_NODE_LOGGING=true` (which is the default for the GCE platform).
Now, when you create a cluster, a message will indicate that the Fluentd log
collection daemons that run on each node will target Elasticsearch:
```shell
cluster/kube-up.sh
```
```
...
Project: kubernetes-satnam
Zone: us-central1-b
... calling kube-up
Project: kubernetes-satnam
Zone: us-central1-b
+++ Staging server tars to Google Storage: gs://kubernetes-staging-e6d0e81793/devel
+++ kubernetes-server-linux-amd64.tar.gz uploaded (sha1 = 6987c098277871b6d69623141276924ab687f89d)
+++ kubernetes-salt.tar.gz uploaded (sha1 = bdfc83ed6b60fa9e3bff9004b542cfc643464cd0)
Looking for already existing resources
Starting master and configuring firewalls
Created [https://www.googleapis.com/compute/v1/projects/kubernetes-satnam/zones/us-central1-b/disks/kubernetes-master-pd].
NAME ZONE SIZE_GB TYPE STATUS
kubernetes-master-pd us-central1-b 20 pd-ssd READY
Created [https://www.googleapis.com/compute/v1/projects/kubernetes-satnam/regions/us-central1/addresses/kubernetes-master-ip].
+++ Logging using Fluentd to elasticsearch
```
The per-node Fluentd pods, the Elasticsearch pods, and the Kibana pods should
all be running in the kube-system namespace soon after the cluster comes to
life.
```shell
kubectl get pods --namespace=kube-system
```
```
NAME READY STATUS RESTARTS AGE
elasticsearch-logging-v1-78nog 1/1 Running 0 2h
elasticsearch-logging-v1-nj2nb 1/1 Running 0 2h
fluentd-elasticsearch-kubernetes-node-5oq0 1/1 Running 0 2h
fluentd-elasticsearch-kubernetes-node-6896 1/1 Running 0 2h
fluentd-elasticsearch-kubernetes-node-l1ds 1/1 Running 0 2h
fluentd-elasticsearch-kubernetes-node-lz9j 1/1 Running 0 2h
kibana-logging-v1-bhpo8 1/1 Running 0 2h
kube-dns-v3-7r1l9 3/3 Running 0 2h
monitoring-heapster-v4-yl332 1/1 Running 1 2h
monitoring-influx-grafana-v1-o79xf 2/2 Running 0 2h
```
The `fluentd-elasticsearch` pods gather logs from each node and send them to
the `elasticsearch-logging` pods, which are part of a
[service](/docs/concepts/services-networking/service/) named `elasticsearch-logging`. These
Elasticsearch pods store the logs and expose them via a REST API.
The `kibana-logging` pod provides a web UI for reading the logs stored in
Elasticsearch, and is part of a service named `kibana-logging`.
The Elasticsearch and Kibana services are both in the `kube-system` namespace
and are not directly exposed via a publicly reachable IP address. To reach them,
follow the instructions for
[Accessing services running in a cluster](/docs/tasks/access-application-cluster/access-cluster/#accessing-services-running-on-the-cluster).
If you try accessing the `elasticsearch-logging` service in your browser, you'll
see a status page that looks something like this:
![Elasticsearch Status](/images/docs/es-browser.png)
You can now type Elasticsearch queries directly into the browser, if you'd
like. See [Elasticsearch's documentation](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-uri-request.html)
for more details on how to do so.
Alternatively, you can view your cluster's logs using Kibana (again using the
[instructions for accessing a service running in the cluster](/docs/tasks/access-application-cluster/access-cluster/#accessing-services-running-on-the-cluster)).
The first time you visit the Kibana URL you will be presented with a page that
asks you to configure your view of the ingested logs. Select the option for
timeseries values and select `@timestamp`. On the following page select the
`Discover` tab and then you should be able to see the ingested logs.
You can set the refresh interval to 5 seconds to have the logs
regularly refreshed.
Here is a typical view of ingested logs from the Kibana viewer:
![Kibana logs](/images/docs/kibana-logs.png)
## {{% heading "whatsnext" %}}
Kibana opens up all sorts of powerful options for exploring your logs! For some
ideas on how to dig into it, check out [Kibana's documentation](https://www.elastic.co/guide/en/kibana/current/discover.html).

View File

@ -21,39 +21,37 @@ and [PodAntiAffinity](/docs/concepts/scheduling-eviction/assign-pod-node/#affini
## {{% heading "prerequisites" %}}
Before starting this tutorial, you should be familiar with the following
Kubernetes concepts:
- [Pods](/docs/concepts/workloads/pods/)
- [Cluster DNS](/docs/concepts/services-networking/dns-pod-service/)
- [Headless Services](/docs/concepts/services-networking/service/#headless-services)
- [PersistentVolumes](/docs/concepts/storage/persistent-volumes/)
- [PersistentVolume Provisioning](https://github.com/kubernetes/examples/tree/{{< param "githubbranch" >}}/staging/persistent-volume-provisioning/)
- [StatefulSets](/docs/concepts/workloads/controllers/statefulset/)
- [PodDisruptionBudgets](/docs/concepts/workloads/pods/disruptions/#pod-disruption-budget)
- [PodAntiAffinity](/docs/concepts/scheduling-eviction/assign-pod-node/#affinity-and-anti-affinity)
- [kubectl CLI](/docs/reference/kubectl/kubectl/)
You must have a cluster with at least four nodes, and each node requires at least 2 CPUs and 4 GiB of memory. In this tutorial you will cordon and drain the cluster's nodes. **This means that the cluster will terminate and evict all Pods on its nodes, and the nodes will temporarily become unschedulable.** You should use a dedicated cluster for this tutorial, or you should ensure that the disruption you cause will not interfere with other tenants.
This tutorial assumes that you have configured your cluster to dynamically provision
PersistentVolumes. If your cluster is not configured to do so, you
will have to manually provision three 20 GiB volumes before starting this
tutorial.
## {{% heading "objectives" %}}
After this tutorial, you will know the following.
- How to deploy a ZooKeeper ensemble using StatefulSet.
- How to consistently configure the ensemble.
- How to spread the deployment of ZooKeeper servers in the ensemble.
- How to use PodDisruptionBudgets to ensure service availability during planned maintenance.
<!-- lessoncontent -->
### ZooKeeper
[Apache ZooKeeper](https://zookeeper.apache.org/doc/current/) is a
distributed, open-source coordination service for distributed applications.
@ -68,7 +66,7 @@ The ensemble uses the Zab protocol to elect a leader, and the ensemble cannot wr
ZooKeeper servers keep their entire state machine in memory, and write every mutation to a durable WAL (Write Ahead Log) on storage media. When a server crashes, it can recover its previous state by replaying the WAL. To prevent the WAL from growing without bound, ZooKeeper servers periodically snapshot their in-memory state to storage media. These snapshots can be loaded directly into memory, and all WAL entries that preceded the snapshot may be discarded.
## Creating a ZooKeeper ensemble
The manifest below contains a
[Headless Service](/docs/concepts/services-networking/service/#headless-services),
@ -127,7 +125,7 @@ zk-2 1/1 Running 0 40s
The StatefulSet controller creates three Pods, and each Pod has a container with
a [ZooKeeper](https://www-us.apache.org/dist/zookeeper/stable/) server.
### Facilitating leader election
Because there is no terminating algorithm for electing a leader in an anonymous network, Zab requires explicit membership configuration to perform leader election. Each server in the ensemble needs to have a unique identifier, all servers need to know the global set of identifiers, and each identifier needs to be associated with a network address.
@ -211,7 +209,7 @@ server.2=zk-1.zk-hs.default.svc.cluster.local:2888:3888
server.3=zk-2.zk-hs.default.svc.cluster.local:2888:3888
```
### Achieving consensus
Consensus protocols require that the identifiers of each participant be unique. No two participants in the Zab protocol should claim the same unique identifier. This is necessary to allow the processes in the system to agree on which processes have committed which data. If two Pods are launched with the same ordinal, two ZooKeeper servers would both identify themselves as the same server.
@ -260,7 +258,7 @@ server.3=zk-2.zk-hs.default.svc.cluster.local:2888:3888
When the servers use the Zab protocol to attempt to commit a value, they will either achieve consensus and commit the value (if leader election has succeeded and at least two of the Pods are Running and Ready), or they will fail to do so (if either of the conditions are not met). No state will arise where one server acknowledges a write on behalf of another.
### Sanity testing the ensemble
The most basic sanity test is to write data to one ZooKeeper server and
to read the data from another.
@ -270,6 +268,7 @@ The command below executes the `zkCli.sh` script to write `world` to the path `/
```shell
kubectl exec zk-0 zkCli.sh create /hello world
```
```
WATCHER::
@ -304,7 +303,7 @@ dataLength = 5
numChildren = 0
```
### Providing durable storage
As mentioned in the [ZooKeeper](#zookeeper) section,
ZooKeeper commits all entries to a durable WAL, and periodically writes snapshots
@ -445,8 +444,8 @@ The `volumeMounts` section of the `StatefulSet`'s container `template` mounts th
```shell
volumeMounts:
- name: datadir
  mountPath: /var/lib/zookeeper
```
When a Pod in the `zk` `StatefulSet` is (re)scheduled, it will always have the
@ -454,7 +453,7 @@ same `PersistentVolume` mounted to the ZooKeeper server's data directory.
Even when the Pods are rescheduled, all the writes made to the ZooKeeper
servers' WALs, and all their snapshots, remain durable.
## Ensuring consistent configuration
As noted in the [Facilitating Leader Election](#facilitating-leader-election) and
[Achieving Consensus](#achieving-consensus) sections, the servers in a
@ -469,6 +468,7 @@ Get the `zk` StatefulSet.
```shell
kubectl get sts zk -o yaml
```
```
command:
@ -497,7 +497,7 @@ command:
The command used to start the ZooKeeper servers passes the configuration as command line parameters. You can also use environment variables to pass configuration to the ensemble.
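For example, an ensemble setting could be injected through the container's `env` instead of a flag. The fragment below is purely illustrative; the variable names are hypothetical and the tutorial's start script does not necessarily read them.

```yaml
# Illustrative fragment only: passing ensemble configuration via environment variables.
containers:
- name: kubernetes-zookeeper
  image: k8s.gcr.io/kubernetes-zookeeper:1.0-3.4.10
  env:
  - name: ZK_HEAP_SIZE    # hypothetical variable name
    value: "512M"
  - name: ZK_TICK_TIME    # hypothetical variable name
    value: "2000"
```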
### Configuring logging
One of the files generated by the `zkGenConfig.sh` script controls ZooKeeper's logging.
ZooKeeper uses [Log4j](https://logging.apache.org/log4j/2.x/), and, by default,
@ -558,13 +558,11 @@ You can view application logs written to standard out or standard error using `k
2016-12-06 19:34:46,230 [myid:1] - INFO [Thread-1142:NIOServerCnxn@1008] - Closed socket connection for client /127.0.0.1:52768 (no session established for client)
```
Kubernetes integrates with many logging solutions. You can choose a logging solution
that best fits your cluster and applications. For cluster-level logging and aggregation,
consider deploying a [sidecar container](/docs/concepts/cluster-administration/logging#sidecar-container-with-logging-agent) to rotate and ship your logs.
### Configuring a non-privileged user
The best practices to allow an application to run as a privileged
user inside of a container are a matter of debate. If your organization requires
@ -612,7 +610,7 @@ Because the `fsGroup` field of the `securityContext` object is set to 1000, the
drwxr-sr-x 3 zookeeper zookeeper 4096 Dec 5 20:45 /var/lib/zookeeper/data
```
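In the `zk` StatefulSet this is achieved with a Pod `securityContext` along these lines (a sketch of the relevant fragment of the tutorial's manifest):

```yaml
securityContext:
  runAsUser: 1000   # run the ZooKeeper process as the zookeeper user
  fsGroup: 1000     # make mounted volumes group-owned by the zookeeper group
```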
## Managing the ZooKeeper process
The [ZooKeeper documentation](https://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_supervision)
mentions that "You will want to have a supervisory process that
@ -622,7 +620,7 @@ common pattern. When deploying an application in Kubernetes, rather than using
an external utility as a supervisory process, you should use Kubernetes as the
watchdog for your application.
### Updating the ensemble
The `zk` `StatefulSet` is configured to use the `RollingUpdate` update strategy.
@ -631,6 +629,7 @@ You can use `kubectl patch` to update the number of `cpus` allocated to the serv
```shell
kubectl patch sts zk --type='json' -p='[{"op": "replace", "path": "/spec/template/spec/containers/0/resources/requests/cpu", "value":"0.3"}]'
```
```
statefulset.apps/zk patched
```
@ -640,6 +639,7 @@ Use `kubectl rollout status` to watch the status of the update.
```shell
kubectl rollout status sts/zk
```
```
waiting for statefulset rolling update to complete 0 pods at revision zk-5db4499664...
Waiting for 1 pods to be ready...
@ -678,7 +678,7 @@ kubectl rollout undo sts/zk
statefulset.apps/zk rolled back
```
### Handling process failure
[Restart Policies](/docs/concepts/workloads/pods/pod-lifecycle/#restart-policy) control how
Kubernetes handles process failures for the entry point of the container in a Pod.
@ -731,7 +731,7 @@ that implements the application's business logic, the script must terminate with
child process. This ensures that Kubernetes will restart the application's
container when the process implementing the application's business logic fails.
### Testing for liveness
Configuring your application to restart failed processes is not enough to
keep a distributed system healthy. There are scenarios where
@ -795,7 +795,7 @@ zk-0 0/1 Running 1 1h
zk-0 1/1 Running 1 1h
```
### Testing for readiness
Readiness is not the same as liveness. If a process is alive, it is scheduled
and healthy. If a process is ready, it is able to process input. Liveness is
@ -824,7 +824,7 @@ Even though the liveness and readiness probes are identical, it is important
to specify both. This ensures that only healthy servers in the ZooKeeper
ensemble receive network traffic.
## Tolerating Node failure
ZooKeeper needs a quorum of servers to successfully commit mutations
to data. For a three server ensemble, two servers must be healthy for
@ -879,10 +879,10 @@ as `zk` in the domain defined by the `topologyKey`. The `topologyKey`
different rules, labels, and selectors, you can extend this technique to spread
your ensemble across physical, network, and power failure domains.
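The anti-affinity rule in the `zk` StatefulSet's Pod template looks roughly like this (a sketch of the relevant fragment of the tutorial's manifest):

```yaml
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchExpressions:
        - key: "app"
          operator: In
          values:
          - zk
      # At most one zk Pod per node: the domain is an individual node.
      topologyKey: "kubernetes.io/hostname"
```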
## Surviving maintenance
In this section you will cordon and drain nodes. If you are using this tutorial
on a shared cluster, be sure that this will not adversely affect other tenants.
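The drains below are kept safe by the `zk-pdb` PodDisruptionBudget created along with the ensemble; it tolerates at most one unavailable Pod at a time and looks roughly like this (sketch):

```yaml
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: zk-pdb
spec:
  selector:
    matchLabels:
      app: zk
  # Never allow more than one ZooKeeper Pod to be disrupted at a time.
  maxUnavailable: 1
```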
The previous section showed you how to spread your Pods across nodes to survive
unplanned node failures, but you also need to plan for temporary node failures
@ -1017,6 +1017,7 @@ Continue to watch the Pods of the stateful set, and drain the node on which
```shell
kubectl drain $(kubectl get pod zk-2 --template {{.spec.nodeName}}) --ignore-daemonsets --force --delete-local-data
```
```
node "kubernetes-node-i4c4" cordoned
@ -1059,6 +1060,7 @@ Use [`kubectl uncordon`](/docs/reference/generated/kubectl/kubectl-commands/#unc
```shell
kubectl uncordon kubernetes-node-pb41
```
```
node "kubernetes-node-pb41" uncordoned
```
@ -1068,6 +1070,7 @@ node "kubernetes-node-pb41" uncordoned
```shell
kubectl get pods -w -l app=zk
```
```
NAME READY STATUS RESTARTS AGE
zk-0 1/1 Running 2 1h
@ -1130,9 +1133,7 @@ You should always allocate additional capacity for critical services so that the
## {{% heading "cleanup" %}}
- Use `kubectl uncordon` to uncordon all the nodes in your cluster.
- You must delete the persistent storage media for the PersistentVolumes used in this tutorial.
  Follow the necessary steps, based on your environment, storage configuration,
  and provisioning method, to ensure that all storage is reclaimed.

View File

@ -373,9 +373,7 @@
/docs/user-guide/liveness/ /docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/ 301
/docs/user-guide/load-balancer/ /docs/tasks/access-application-cluster/create-external-load-balancer/ 301
/docs/user-guide/logging/ /docs/concepts/cluster-administration/logging/ 301
/docs/user-guide/logging/elasticsearch/ /docs/tasks/debug-application-cluster/logging-elasticsearch-kibana/ 301
/docs/user-guide/logging/overview/ /docs/concepts/cluster-administration/logging/ 301
/docs/user-guide/logging/stackdriver/ /docs/tasks/debug-application-cluster/logging-stackdriver/ 301
/docs/user-guide/managing-deployments/ /docs/concepts/cluster-administration/manage-deployment/ 301
/docs/user-guide/monitoring/ /docs/tasks/debug-application-cluster/resource-usage-monitoring/ 301
/docs/user-guide/namespaces/ /docs/concepts/overview/working-with-objects/namespaces/ 301