---
type: docs
title: "Production guidelines on Kubernetes"
linkTitle: "Production guidelines"
weight: 40000
description: "Best practices for deploying Dapr to a Kubernetes cluster in a production-ready configuration"
---

## Cluster and capacity requirements

Dapr support for Kubernetes is aligned with the [Kubernetes Version Skew Policy](https://kubernetes.io/releases/version-skew-policy/).

Use the following resource settings as a starting point. Requirements vary depending on cluster size, number of pods, and other factors. Perform individual testing to find the right values for your environment. In production, it's recommended not to add memory limits to the Dapr control plane components, to avoid `OOMKilled` pod statuses.

| Deployment  | CPU | Memory |
|-------------|-----|--------|
| **Operator** | Limit: 1, Request: 100m | Request: 100Mi |
| **Sidecar Injector** | Limit: 1, Request: 100m | Request: 30Mi |
| **Sentry** | Limit: 1, Request: 100m | Request: 30Mi |
| **Placement** | Limit: 1, Request: 250m | Request: 75Mi |

{{% alert title="Note" color="primary" %}}
For more information, refer to the Kubernetes documentation on [CPU and Memory resource units and their meaning](https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#resource-units-in-kubernetes).
{{% /alert %}}

### Helm

When installing Dapr using Helm, no default limit/request values are set. Each component has a `resources` option (for example, `dapr_dashboard.resources`), which you can use to tune the Dapr control plane to fit your environment.

The [Helm chart readme](https://github.com/dapr/dapr/blob/master/charts/dapr/README.md) has detailed information and examples.

For local/dev installations, you might want to skip configuring the `resources` options.

### Optional components

The following Dapr control plane deployments are optional:

- **Placement**: For using Dapr Actors
- **Sentry**: For mTLS for service-to-service invocation
- **Dashboard**: For an operational view of the cluster

## Sidecar resource settings

[Set the resource assignments for the Dapr sidecar using the supported annotations]({{< ref "arguments-annotations-overview.md" >}}). The specific annotations related to **resource constraints** are:

- `dapr.io/sidecar-cpu-limit`
- `dapr.io/sidecar-memory-limit`
- `dapr.io/sidecar-cpu-request`
- `dapr.io/sidecar-memory-request`

If not set, the Dapr sidecar runs without resource settings, which may lead to issues. For a production-ready setup, it's strongly recommended to configure these settings; an example Deployment using these annotations is shown at the end of this section.

Example settings for the Dapr sidecar in a production-ready setup:

| CPU | Memory |
|-----|--------|
| Limit: 300m, Request: 100m | Limit: 1000Mi, Request: 250Mi |

The CPU and memory limits above account for Dapr supporting a high number of I/O bound operations. Use a [monitoring tool]({{< ref observability >}}) to get a baseline for the sidecar (and app) containers and tune these settings based on those baselines.

For more details on configuring resources in Kubernetes, see the following Kubernetes guides:

- [Assign Memory Resources to Containers and Pods](https://kubernetes.io/docs/tasks/configure-pod-container/assign-memory-resource/)
- [Assign CPU Resources to Containers and Pods](https://kubernetes.io/docs/tasks/configure-pod-container/assign-cpu-resource/)

{{% alert title="Note" color="primary" %}}
Since Dapr is intended to do much of the I/O heavy lifting for your app, the resources given to Dapr drastically reduce the resource allocations for the application.
{{% /alert %}}
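For example, these annotations can be applied to an application's Deployment alongside the usual `dapr.io/enabled` and `dapr.io/app-id` annotations. The following is a minimal sketch using the values from the table above; the Deployment name, labels, and image are illustrative only, and the resource values should be tuned against your observed baselines:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nodeapp
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nodeapp
  template:
    metadata:
      labels:
        app: nodeapp
      annotations:
        dapr.io/enabled: "true"
        dapr.io/app-id: "nodeapp"
        # Sidecar resource constraints (starting points from the table above)
        dapr.io/sidecar-cpu-request: "100m"
        dapr.io/sidecar-cpu-limit: "300m"
        dapr.io/sidecar-memory-request: "250Mi"
        dapr.io/sidecar-memory-limit: "1000Mi"
    spec:
      containers:
      - name: node
        image: ghcr.io/example/nodeapp:latest   # illustrative image
```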
### Setting soft memory limits on Dapr sidecar

Set soft memory limits on the Dapr sidecar when you've set up memory limits. With soft memory limits, the sidecar garbage collector frees up memory once it exceeds the limit, instead of waiting for the heap to double in size since the last garbage collection run. Waiting is the default behavior of the [garbage collector](https://tip.golang.org/doc/gc-guide#Memory_limit) used in Go, and can lead to OOM Kill events.

For example, for an app with app-id `nodeapp` with memory limit set to 1000Mi, you can use the following in your pod annotations:

```yaml
  annotations:
    dapr.io/enabled: "true"
    dapr.io/app-id: "nodeapp"
    # our daprd memory settings
    dapr.io/sidecar-memory-limit: "1000Mi"   # your memory limit
    dapr.io/env: "GOMEMLIMIT=900MiB"         # 90% of your memory limit. Also notice the suffix "MiB" instead of "Mi"
```

In this example, the soft limit has been set to 90% to leave 5-10% for other services, [as recommended](https://tip.golang.org/doc/gc-guide#Memory_limit).

The `GOMEMLIMIT` environment variable [allows certain suffixes for the memory size: `B`, `KiB`, `MiB`, `GiB`, and `TiB`.](https://pkg.go.dev/runtime)

## High availability mode

When deploying Dapr in a production-ready configuration, it's best to deploy with a high availability (HA) configuration of the control plane. This creates three replicas of each control plane pod in the `dapr-system` namespace, allowing the Dapr control plane to retain three running instances and survive individual node failures and other outages.

For a new Dapr deployment, HA mode can be set with both:

- The [Dapr CLI]({{< ref "kubernetes-deploy.md#install-in-highly-available-mode" >}}), and
- [Helm charts]({{< ref "kubernetes-deploy.md#add-and-install-dapr-helm-chart" >}})

For an existing Dapr deployment, [you can enable HA mode in a few extra steps]({{< ref "#enable-high-availability-in-an-existing-dapr-deployment" >}}).

### Individual service HA Helm configuration

You can configure HA mode via Helm across all services by setting the `global.ha.enabled` flag to `true`. By default, `--set global.ha.enabled=true` is fully respected and cannot be overridden, making it impossible to simultaneously have either the placement or scheduler service as a single instance.

> **Note:** HA for scheduler and placement services is not the default setting.

To scale scheduler and placement to three instances independently of the `global.ha.enabled` flag, set `global.ha.enabled` to `false` and `dapr_scheduler.ha` and `dapr_placement.ha` to `true`. For example:

```bash
helm upgrade --install dapr dapr/dapr \
 --version={{% dapr-latest-version short="true" %}} \
 --namespace dapr-system \
 --create-namespace \
 --set global.ha.enabled=false \
 --set dapr_scheduler.ha=true \
 --set dapr_placement.ha=true \
 --wait
```
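If you prefer to keep these settings in a values file (as recommended in the Helm deployment section below), the same flags map to the following sketch, assuming the chart keys shown above:

```yaml
# values.yml -- HA for the scheduler and placement services only
global:
  ha:
    enabled: false
dapr_scheduler:
  ha: true
dapr_placement:
  ha: true
```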
## Setting cluster critical priority class name for control plane services

In some scenarios, nodes may have memory and/or CPU pressure and the Dapr control plane pods might get selected for eviction. To prevent this, you can set a critical priority class name for the Dapr control plane pods. This ensures that the Dapr control plane pods are not evicted unless all other pods with lower priority are evicted.

It's particularly important to protect the Dapr control plane components from eviction, especially the Scheduler service. When Schedulers are rescheduled or restarted, it can be highly disruptive to inflight jobs, potentially causing them to fire duplicate times. To prevent such disruptions, you should ensure the Dapr control plane components have a higher priority class than your application workloads.

Learn more about [Protecting Mission-Critical Pods](https://kubernetes.io/blog/2023/01/12/protect-mission-critical-pods-priorityclass/).

There are two built-in critical priority classes in Kubernetes:

- `system-cluster-critical`
- `system-node-critical` (highest priority)

It's recommended to set the `priorityClassName` to `system-cluster-critical` for the Dapr control plane pods. If you have your own custom priority classes for your applications, ensure they have a lower priority value than the one assigned to the Dapr control plane, to maintain system stability and prevent disruption of core Dapr services.

For a new Dapr control plane deployment, the `system-cluster-critical` priority class can be set via the Helm value `global.priorityClassName`. This priority class can be set with both the Dapr CLI and Helm charts, using the Helm `--set global.priorityClassName=system-cluster-critical` argument.

#### Dapr version < 1.14

For versions of Dapr below v1.14, it's recommended that you add a `ResourceQuota` to the Dapr control plane namespace. This prevents problems associated with scheduling pods [where the cluster may be configured](https://kubernetes.io/docs/concepts/policy/resource-quotas/#limit-priority-class-consumption-by-default) with limitations on which pods can be assigned high priority classes. For v1.14 onwards, the Helm chart adds this automatically.

If you have Dapr installed in namespace `dapr-system`, you can create a `ResourceQuota` with the following content:

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: dapr-system-critical-quota
  namespace: dapr-system
spec:
  scopeSelector:
    matchExpressions:
      - operator: In
        scopeName: PriorityClass
        values: [system-cluster-critical]
```

## Deploy Dapr with Helm

[Visit the full guide on deploying Dapr with Helm]({{< ref "kubernetes-deploy.md#install-with-helm" >}}).

### Parameters file

It's recommended to create a values file, instead of specifying parameters on the command line. Check the values file into source control so that you can track its changes.

[See a full list of available parameters and settings](https://github.com/dapr/dapr/blob/master/charts/dapr/README.md).

The following command runs three replicas of each control plane service in the `dapr-system` namespace.

```bash
# Add/update an official Dapr Helm repo.
helm repo add dapr https://dapr.github.io/helm-charts/
# or add/update a private Dapr Helm repo.
helm repo add dapr http://helm.custom-domain.com/dapr/dapr/ \
   --username=xxx --password=xxx
helm repo update

# See which chart versions are available
helm search repo dapr --devel --versions

# create a values file to store variables
touch values.yml
cat << EOF >> values.yml
global:
  ha:
    enabled: true
EOF

# run install/upgrade
helm install dapr dapr/dapr \
  --version={{% dapr-latest-version short="true" %}} \
  --namespace dapr-system \
  --create-namespace \
  --values values.yml \
  --wait

# verify the installation
kubectl get pods --namespace dapr-system
```

{{% alert title="Note" color="primary" %}}
The example above uses `helm install`. You can also run `helm upgrade --install` to dynamically determine whether to install or upgrade.
{{% /alert %}}
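The values file can also carry the other production settings discussed so far. The sketch below combines HA, the control plane priority class, and example control plane resource requests; it assumes each control plane component exposes a `resources` option as described in the Helm chart readme (for example, `dapr_operator.resources`), and the values shown are only the starting points from the capacity table above:

```yaml
# values.yml -- example production starting point (tune for your environment)
global:
  ha:
    enabled: true
  priorityClassName: system-cluster-critical
dapr_operator:
  resources:
    requests:
      cpu: 100m
      memory: 100Mi
    limits:
      cpu: 1         # no memory limit, per the guidance above
dapr_placement:
  resources:
    requests:
      cpu: 250m
      memory: 75Mi
    limits:
      cpu: 1
```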
The Dapr Helm chart automatically deploys with affinity for nodes with the label `kubernetes.io/os=linux`. You can deploy the Dapr control plane to Windows nodes. For more information, see [Deploying to a Hybrid Linux/Windows K8s Cluster]({{< ref "kubernetes-hybrid-clusters.md" >}}).

## Upgrade Dapr with Helm

Dapr supports zero-downtime upgrades using the following steps.

### Upgrade the CLI (recommended)

Upgrading the CLI is optional, but recommended.

1. [Download the latest version](https://github.com/dapr/cli/releases) of the CLI.
1. Verify the Dapr CLI is in your path.

### Upgrade the control plane

[Upgrade Dapr on a Kubernetes cluster]({{< ref "kubernetes-upgrade.md#helm" >}}).

### Update the data plane (sidecars)

Update pods that are running Dapr to pick up the new version of the Dapr runtime.

1. Issue a rollout restart command for any deployment that has the `dapr.io/enabled` annotation:

    ```bash
    kubectl rollout restart deploy/<DEPLOYMENT-NAME>
    ```

1. View a list of all your Dapr enabled deployments via either:

    - The [Dapr Dashboard](https://github.com/dapr/dashboard)
    - Running the following command using the Dapr CLI:

      ```bash
      dapr list -k

      APP ID     APP PORT  AGE  CREATED
      nodeapp    3000      16h  2020-07-29 17:16.22
      ```

### Enable high availability in an existing Dapr deployment

Enabling HA mode for an existing Dapr deployment requires two steps:

1. Delete the existing placement stateful set.

    ```bash
    kubectl delete statefulset.apps/dapr-placement-server -n dapr-system
    ```

    You delete the placement stateful set because, in HA mode, the placement service adds [Raft](https://raft.github.io/) for leader election. However, Kubernetes only allows limited fields in stateful sets to be patched, which causes the upgrade of the placement service to fail.

    Deletion of the existing placement stateful set is safe. The agents reconnect and re-register with the newly created placement service, which persists its table in Raft.

1. Issue the upgrade command.

    ```bash
    helm upgrade dapr ./charts/dapr -n dapr-system --set global.ha.enabled=true
    ```

## Recommended security configuration

When properly configured, Dapr ensures secure communication and can make your application more secure with a number of built-in features.

Verify your production-ready deployment includes the following settings:

1. **Mutual Authentication (mTLS)** is enabled. Dapr has mTLS on by default. [Learn more about how to bring your own certificates]({{< ref "mtls.md#bringing-your-own-certificates" >}}).

1. **App to Dapr API authentication** is enabled. This is the communication between your application and the Dapr sidecar. To secure the Dapr API from unauthorized application access, [enable Dapr's token-based authentication]({{< ref "api-token.md" >}}).

1. **Dapr to App API authentication** is enabled. This is the communication between Dapr and your application. [Let Dapr know that it is communicating with an authorized application using token authentication]({{< ref "app-api-token.md" >}}).

1. **Component secret data is configured in a secret store** and not hard-coded in the component YAML file. [Learn how to use secrets with Dapr components]({{< ref "component-secrets.md" >}}). See the example component manifest after this list.

1. The Dapr **control plane is installed on a dedicated namespace**, such as `dapr-system`.

1. Dapr supports and is enabled to **scope components for certain applications**. This is not a required practice. [Learn more about component scopes]({{< ref "component-scopes.md" >}}).
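As an illustration of the secret store and scoping recommendations above, the following is a sketch of a component manifest that pulls its secret from a secret store and is scoped to a single application. The component name, secret name, and app-id are hypothetical:

```yaml
apiVersion: dapr.io/v1alpha1
kind: Component
metadata:
  name: statestore
spec:
  type: state.redis
  version: v1
  metadata:
  - name: redisHost
    value: redis-master:6379
  - name: redisPassword
    # Secret data is referenced from a secret store, not hard-coded in this file
    secretKeyRef:
      name: redis-secret
      key: redis-password
auth:
  # Name of the secret store holding redis-secret (the built-in Kubernetes store here)
  secretStore: kubernetes
scopes:
# Only the application with this app-id can load and use this component
- nodeapp
```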
## Recommended Placement service configuration

The [Placement service]({{< ref "placement.md" >}}) is a component in Dapr, responsible for disseminating information about actor addresses to all Dapr sidecars via a placement table (more information on this can be found [here]({{< ref "actors-features-concepts.md#actor-placement-service" >}})).

When running in production, it's recommended to configure the Placement service with the following values (a combined values snippet follows the list):

1. **High availability**. Ensure the Placement service is highly available (three replicas) and can survive individual node failures. Helm chart value: `dapr_placement.ha=true`
2. **In-memory logs**. Use in-memory Raft log store for faster writes. The tradeoff is more placement table disseminations (and thus, network traffic) in an eventual Placement service pod failure. Helm chart value: `dapr_placement.cluster.forceInMemoryLog=true`
3. **No metadata endpoint**. Disable the unauthenticated `/placement/state` endpoint which exposes placement table information for the Placement service. Helm chart value: `dapr_placement.metadataEnabled=false`
4. **Timeouts**. Control the sensitivity of network connectivity between the Placement service and the sidecars using the below timeout values. Default values are set, but you can adjust these based on your network conditions.
   1. `dapr_placement.keepAliveTime` sets the interval at which the Placement service sends [keep alive](https://grpc.io/docs/guides/keepalive/) pings to Dapr sidecars on the gRPC stream to check if the connection is still alive. Lower values will lead to shorter actor rebalancing time in case of pod loss/restart, but higher network traffic during normal operation. Accepts values between `1s` and `10s`. Default is `2s`.
   2. `dapr_placement.keepAliveTimeout` sets the timeout period for Dapr sidecars to respond to the Placement service's [keep alive](https://grpc.io/docs/guides/keepalive/) pings before the Placement service closes the connection. Lower values will lead to shorter actor rebalancing time in case of pod loss/restart, but higher network traffic during normal operation. Accepts values between `1s` and `10s`. Default is `3s`.
   3. `dapr_placement.disseminateTimeout` sets the timeout period for dissemination to be delayed after actor membership change (usually related to pod restarts) to avoid excessive dissemination during multiple pod restarts. Higher values will reduce the frequency of dissemination, but delay the table dissemination. Accepts values between `1s` and `3s`. Default is `2s`.
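Collected as Helm values, the recommendations above translate to the following sketch, assuming the chart keys listed in the items above (the timeout values shown are the defaults; adjust them for your network conditions):

```yaml
# values.yml excerpt -- recommended Placement service settings
dapr_placement:
  ha: true
  cluster:
    forceInMemoryLog: true
  metadataEnabled: false
  keepAliveTime: 2s
  keepAliveTimeout: 3s
  disseminateTimeout: 2s
```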
## Service account tokens

By default, Kubernetes mounts a volume containing a [Service Account token](https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/) in each container. Applications can use this token, whose permissions vary depending on the configuration of the cluster and namespace, among other things, to perform API calls against the Kubernetes control plane.

When creating a new Pod (or a Deployment, StatefulSet, Job, etc), you can disable auto-mounting the Service Account token by setting `automountServiceAccountToken: false` in your pod's spec.

It's recommended that you consider deploying your apps with `automountServiceAccountToken: false` to improve the security posture of your pods, unless your apps depend on having a Service Account token. For example, you may need a Service Account token if:

- Your application needs to interact with the Kubernetes APIs.
- You are using Dapr components that interact with the Kubernetes APIs; for example, the [Kubernetes secret store]({{< ref "kubernetes-secret-store.md" >}}) or the [Kubernetes Events binding]({{< ref "kubernetes-binding.md" >}}).

Thus, Dapr does not set `automountServiceAccountToken: false` automatically for you. However, in all situations where the Service Account is not required by your solution, it's recommended that you set this option in the pod's spec.

{{% alert title="Note" color="primary" %}}
Initializing Dapr components using [component secrets]({{< ref "component-secrets.md" >}}) stored as Kubernetes secrets does **not** require a Service Account token, so you can still set `automountServiceAccountToken: false` in this case. Only calling the Kubernetes secret store at runtime, using the [Secrets management]({{< ref "secrets-overview.md" >}}) building block, is impacted.
{{% /alert %}}
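For example, a pod that doesn't need Kubernetes API access can opt out of the token as in the sketch below; the pod name and image are illustrative:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nodeapp
  annotations:
    dapr.io/enabled: "true"
    dapr.io/app-id: "nodeapp"
spec:
  # Don't mount a Service Account token into this pod's containers;
  # neither the app nor its Dapr components call the Kubernetes APIs.
  automountServiceAccountToken: false
  containers:
  - name: node
    image: ghcr.io/example/nodeapp:latest   # illustrative image
```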
## Tracing and metrics configuration

Tracing and metrics are enabled in Dapr by default. It's recommended that you set up distributed tracing and metrics for your applications and the Dapr control plane in production.

If you already have your own observability setup, you can disable tracing and metrics for Dapr.

### Tracing

[Configure a tracing backend for Dapr]({{< ref "setup-tracing.md" >}}).

### Metrics

For metrics, Dapr exposes a Prometheus endpoint listening on port 9090, which can be scraped by Prometheus.

[Set up Prometheus, Grafana, and other monitoring tools with Dapr]({{< ref "observability" >}}).

## Injector watchdog

The Dapr Operator service includes an **injector watchdog**, which can be used to detect and remediate situations where your application's pods may be deployed without the Dapr sidecar (the `daprd` container). For example, it can assist with recovering the applications after a total cluster failure.

The injector watchdog is disabled by default when running Dapr in Kubernetes mode. However, you should consider enabling it with the appropriate values for your specific situation.

Refer to the [Dapr operator service documentation]({{< ref operator >}}) for more details on the injector watchdog and how to enable it.

## Configure `seccompProfile` for sidecar containers

By default, the Dapr sidecar injector injects a sidecar without any `seccompProfile`. However, for the Dapr sidecar container to run successfully in a namespace with the [Restricted](https://kubernetes.io/docs/concepts/security/pod-security-standards/#restricted) profile, the sidecar container needs `securityContext.seccompProfile.Type` to not be `nil`.

Refer to [the Arguments and Annotations overview]({{< ref "arguments-annotations-overview.md" >}}) to set the appropriate `seccompProfile` on the sidecar container.

## Best Practices

Watch this video for a deep dive into the best practices for running Dapr in production with Kubernetes.

## Related links

- [Deploy Dapr on Kubernetes]({{< ref kubernetes-deploy.md >}})
- [Upgrade Dapr on Kubernetes]({{< ref kubernetes-upgrade.md >}})