Merge branch 'jobs_quickstart' of https://github.com/hhunter-ms/docs into jobs_quickstart

Hannah Hunter 2024-07-29 11:40:46 -04:00
commit b6d3968d29
6 changed files with 172 additions and 39 deletions

View File

@ -12,10 +12,13 @@ Now that you've learned what the [jobs building block]({{< ref jobs-overview.md
Include a diagram or image, if possible.
-->
## Set up the Scheduler service
{{% alert title="Warning" color="warning" %}}
By default, job data is not resilient to [Scheduler]({{< ref scheduler.md >}}) service restarts.
A persistent volume must be provided to Scheduler to ensure job data is not lost in either [Kubernetes]({{< ref kubernetes-persisting-scheduler.md >}}) or [Self-hosted]({{< ref self-hosted-persisting-scheduler.md >}}) mode.
{{% /alert %}}
When you run `dapr init` in either self-hosted mode or on Kubernetes, the Dapr scheduler service is started.
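As a quick sanity check in self-hosted mode (a sketch assuming Docker as the container runtime; the name filter is an assumption and may differ on your setup), you can confirm the scheduler container is running after init:
```bash
# Initialize Dapr; this also starts the Scheduler service
dapr init

# List running containers whose name mentions the scheduler (assumed naming)
docker ps --filter "name=scheduler"
```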
## Run the Dapr sidecar

View File

@ -11,11 +11,16 @@ Many applications require job scheduling, or the need to take an action in the f
Not only does the jobs API help you with scheduling jobs, but internally, Dapr uses the scheduler service to schedule actor reminders.
Jobs in Dapr consist of:
- [The jobs API building block]({{< ref jobs_api.md >}})
- [The Scheduler control plane service]({{< ref "concepts/dapr-services/scheduler.md" >}})
[See example scenarios.]({{< ref "#scenarios" >}})
{{% alert title="Warning" color="warning" %}}
By default, job data is not resilient to [Scheduler]({{< ref scheduler.md >}}) service restarts.
A persistent volume must be provided to Scheduler to ensure job data is not lost in either [Kubernetes]({{< ref kubernetes-persisting-scheduler.md >}}) or [Self-hosted]({{< ref self-hosted-persisting-scheduler.md >}}) mode.
{{% /alert %}}
<img src="/images/scheduler/scheduler-architecture.png" alt="Diagram showing the Scheduler control plane service and the jobs API">
## How it works

View File

@ -0,0 +1,55 @@
---
type: docs
title: "How-to: Persist Scheduler Jobs"
linkTitle: "How-to: Persist Scheduler Jobs"
weight: 50000
description: "Configure Scheduler to persist its database to make it resilient to restarts"
---
The [Scheduler]({{< ref scheduler.md >}}) service is responsible for writing jobs to its embedded database and scheduling them for execution.
By default, the Scheduler service database writes this data to an in-memory ephemeral tmpfs volume, meaning that **this data is not persisted across restarts**. Job data is lost whenever the Scheduler restarts.
To make the Scheduler data resilient to restarts, a persistent volume must be mounted to the Scheduler `StatefulSet`.
This persistent volume is backed by a real disk provided by the hosting cloud provider or Kubernetes infrastructure platform.
Disk size is determined by how many jobs are expected to be persisted at once; however, 64GB should be more than sufficient for most use cases.
Some Kubernetes providers recommend using a [CSI driver](https://kubernetes.io/docs/concepts/storage/volumes/#csi) to provision the underlying disks.
Below is a list of useful links to the relevant documentation for creating a persistent disk with the major cloud providers:
- [Google Cloud Persistent Disk](https://cloud.google.com/compute/docs/disks)
- [Amazon EBS Volumes](https://aws.amazon.com/blogs/storage/persistent-storage-for-kubernetes/)
- [Azure AKS Storage Options](https://learn.microsoft.com/azure/aks/concepts-storage)
- [Digital Ocean Block Storage](https://www.digitalocean.com/docs/kubernetes/how-to/add-volumes/)
- [VMware vSphere Storage](https://docs.vmware.com/VMware-vSphere/7.0/vmware-vsphere-with-tanzu/GUID-A19F6480-40DC-4343-A5A9-A5D3BFC0742E.html)
- [OpenShift Persistent Storage](https://docs.openshift.com/container-platform/4.6/storage/persistent_storage/persistent-storage-aws-efs.html)
- [Alibaba Cloud Disk Storage](https://www.alibabacloud.com/help/ack/ack-managed-and-ack-dedicated/user-guide/create-a-pvc)
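For illustration only, a minimal `StorageClass` backed by a CSI driver could be sketched as follows; `my-storage-class` matches the install command below, while the `ebs.csi.aws.com` provisioner is an AWS-specific assumption you should replace with your platform's driver:
```bash
kubectl apply -f - <<'EOF'
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: my-storage-class
provisioner: ebs.csi.aws.com        # assumption: AWS EBS CSI driver; swap in your provider's driver
volumeBindingMode: WaitForFirstConsumer
EOF
```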
Once the persistent volume class is available, you can install Dapr using the following command, with Scheduler configured to use the persistent volume class (replace `my-storage-class` with the name of the storage class):
{{% alert title="Note" color="primary" %}}
If Dapr is already installed, the control plane needs to be completely [uninstalled]({{< ref dapr-uninstall.md >}}) in order for the Scheduler `StatefulSet` to be recreated with the new persistent volume.
{{% /alert %}}
{{< tabs "Dapr CLI" "Helm" >}}
<!-- Dapr CLI -->
{{% codetab %}}
```bash
dapr init -k --set dapr_scheduler.cluster.storageClassName=my-storage-class
```
{{% /codetab %}}
<!-- Helm -->
{{% codetab %}}
```bash
helm upgrade --install dapr dapr/dapr \
--version={{% dapr-latest-version short="true" %}} \
--namespace dapr-system \
--create-namespace \
--set dapr_scheduler.cluster.storageClassName=my-storage-class \
--wait
```
{{% /codetab %}}
{{< /tabs >}}
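After the install completes, you can verify that Scheduler's persistent volume claim was created and bound (claim names are cluster-dependent; this simply lists claims in the Dapr namespace):
```bash
kubectl get pvc -n dapr-system
```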

View File

@ -0,0 +1,27 @@
---
type: docs
title: "How-to: Persist Scheduler Jobs"
linkTitle: "How-to: Persist Scheduler Jobs"
weight: 50000
description: "Configure Scheduler to persist its database to make it resilient to restarts"
---
The [Scheduler]({{< ref scheduler.md >}}) service is responsible for writing jobs to its embedded database and scheduling them for execution.
By default, the Scheduler service database writes this data to the local volume `dapr_scheduler`, meaning that **this data is persisted across restarts**.
The host file location for this local volume is typically located at either `/var/lib/docker/volumes/dapr_scheduler/_data` or `~/.local/share/containers/storage/volumes/dapr_scheduler/_data`, depending on your container runtime.
Note that if you are using Docker Desktop, this volume is located in the Docker Desktop VM's filesystem, which can be accessed using:
```bash
docker run -it --privileged --pid=host debian nsenter -t 1 -m -u -n -i sh
```
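Once inside that shell, the volume path mentioned above should be visible, for example:
```bash
# Inside the Docker Desktop VM; path taken from the description above
ls /var/lib/docker/volumes/dapr_scheduler/_data
```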
The Scheduler's persistent volume can be replaced with a custom volume, either one that already exists or one that Dapr creates for you.
{{% alert title="Note" color="primary" %}}
By default, `dapr init` creates a local persistent volume on your drive called `dapr_scheduler`. If Dapr is already installed, the control plane needs to be completely [uninstalled]({{< ref dapr-uninstall.md >}}) in order for the Scheduler container to be recreated with the new persistent volume.
{{% /alert %}}
```bash
dapr init --scheduler-volume my-scheduler-volume
```
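To confirm where that data lives on the host, inspect the volume (assuming Docker; the name matches whatever you passed to `--scheduler-volume`):
```bash
docker volume inspect my-scheduler-volume
```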

View File

@ -10,8 +10,16 @@ weight: 1300
The jobs API is currently in alpha.
{{% /alert %}}
{{% alert title="Warning" color="warning" %}}
By default, job data is not resilient to [Scheduler]({{< ref scheduler.md >}}) service restarts.
A persistent volume must be provided to Scheduler to ensure job data is not lost in either [Kubernetes]({{< ref kubernetes-persisting-scheduler.md >}}) or [Self-hosted]({{< ref self-hosted-persisting-scheduler.md >}}) mode.
{{% /alert %}}
With the jobs API, you can schedule jobs and tasks in the future.
> The HTTP APIs are intended for development and testing only. For production scenarios, the use of the SDKs is strongly
> recommended as they implement the gRPC APIs providing higher performance and capability than the HTTP APIs.
## Schedule a job
Schedule a job with a name.
@ -22,11 +30,39 @@ POST http://localhost:3500/v1.0-alpha1/jobs/<name>
### URL parameters
{{% alert title="Note" color="primary" %}}
At least one of `schedule` or `dueTime` must be provided, but they can also be provided together.
{{% /alert %}}
Parameter | Description
--------- | -----------
`name` | Name of the job you're scheduling
`data` | A protobuf message `@type`/`value` pair. `@type` must be of a [well-known type](https://protobuf.dev/reference/protobuf/google.protobuf). `value` is the serialized data.
`schedule` | An optional schedule at which the job is to be run. Details of the format are below.
`dueTime` | An optional time at which the job should be active, or the "one shot" time, if other scheduling type fields are not provided. Accepts a "point in time" string in the format of RFC3339, Go duration string (calculated from creation time), or non-repeating ISO8601.
`repeats` | An optional number of times in which the job should be triggered. If not set, the job runs indefinitely or until expiration.
`ttl` | An optional time to live or expiration of the job. Accepts a "point in time" string in the format of RFC3339, Go duration string (calculated from job creation time), or non-repeating ISO8601.
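For instance, a one-shot job can be scheduled with only a `dueTime` (a hedged sketch: the job name is illustrative, and this assumes `data` may be omitted when no payload is needed):
```bash
# One-shot job that fires 30 seconds after creation (Go duration form);
# an RFC3339 timestamp such as "2024-09-01T03:15:00Z" would also be accepted
curl -X POST http://localhost:3500/v1.0-alpha1/jobs/one-shot-example \
  -H "Content-Type: application/json" \
  -d '{"job": {"dueTime": "30s"}}'
```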
#### schedule
`schedule` accepts both systemd timer-style cron expressions and human-readable '@' prefixed period strings, as defined below.
Systemd timer-style cron accepts 6 fields:
seconds | minutes | hours | day of month | month | day of week
------- | ------- | ----- | ------------ | ----- | -----------
0-59 | 0-59 | 0-23 | 1-31 | 1-12/jan-dec | 0-7/sun-sat
"0 30 * * * *" - every hour on the half hour
"0 15 3 * * *" - every day at 03:15
Period string expressions:
Entry | Description | Equivalent To
----- | ----------- | -------------
@every <duration> | Run every <duration> (e.g. '@every 1h30m') | N/A
@yearly (or @annually) | Run once a year, midnight, Jan. 1st | 0 0 0 1 1 *
@monthly | Run once a month, midnight, first of month | 0 0 0 1 * *
@weekly | Run once a week, midnight on Sunday | 0 0 0 * * 0
@daily (or @midnight) | Run once a day, midnight | 0 0 0 * * *
@hourly | Run once an hour, beginning of hour | 0 0 * * * *
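As a usage sketch of the period strings, the following schedules a hypothetical `nightly-report` job to run at midnight for 30 triggers (the name and payload are illustrative):
```bash
curl -X POST http://localhost:3500/v1.0-alpha1/jobs/nightly-report \
  -H "Content-Type: application/json" \
  -d '{
        "job": {
          "data": {
            "@type": "type.googleapis.com/google.protobuf.StringValue",
            "value": "generate report"
          },
          "schedule": "@daily",
          "repeats": 30
        }
      }'
```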
### Request body
@ -34,8 +70,8 @@ Parameter | Description
{
  "job": {
    "data": {
      "@type": "type.googleapis.com/google.protobuf.StringValue",
      "value": "\"someData\""
    },
    "dueTime": "30s"
  }
@ -46,13 +82,13 @@ Parameter | Description
Code | Description
---- | -----------
`204` | Accepted
`400` | Request was malformed
`500` | Request formatted correctly, error in dapr code or Scheduler control plane service
### Response content
The following example curl command creates a job, naming the job `jobforjabba` and specifying the `schedule`, `repeats`, and `data`.
```bash
$ curl -X POST \
@ -61,9 +97,11 @@ $ curl -X POST \
  -d '{
        "job": {
          "data": {
            "@type": "type.googleapis.com/google.protobuf.StringValue",
            "value": "Running spice"
          },
          "schedule": "@every 1m",
          "repeats": 5
        }
      }'
```
@ -87,9 +125,9 @@ Parameter | Description
Code | Description
---- | -----------
`200` | Accepted
`400` | Request was malformed
`500` | Request formatted correctly; job doesn't exist, or error in dapr code or Scheduler control plane service
### Response content
@ -101,10 +139,12 @@ $ curl -X GET http://localhost:3500/v1.0-alpha1/jobs/jobforjabba -H "Content-Typ
```json
{
  "name": "jobforjabba",
  "schedule": "@every 1m",
  "repeats": 5,
  "data": {
    "@type": "type.googleapis.com/google.protobuf.StringValue",
    "value": "Running spice"
  }
}
```
@ -126,7 +166,7 @@ Parameter | Description
Code | Description
---- | -----------
`204` | Accepted
`400` | Request was malformed
`500` | Request formatted correctly, error in dapr code or Scheduler control plane service
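For completeness, a delete request against the same endpoint shape might look like this (reusing the `jobforjabba` name from the earlier examples):
```bash
curl -X DELETE \
  http://localhost:3500/v1.0-alpha1/jobs/jobforjabba \
  -H "Content-Type: application/json"
```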

View File

@ -45,6 +45,7 @@ dapr init [flags]
| N/A | DAPR_HELM_REPO_PASSWORD | A password for a private Helm chart | The password required to access the private Dapr Helm chart. If it can be accessed publicly, this env variable does not need to be set | |
| `--container-runtime` | | `docker` | Used to pass in a different container runtime other than Docker. Supported container runtimes are: `docker`, `podman` |
| `--dev` | | | Creates Redis and Zipkin deployments when run in Kubernetes. |
| `--scheduler-volume` | | | Self-hosted only. Optionally, you can specify a volume for the scheduler service data directory. By default, without this flag, scheduler data is not persisted and not resilient to restarts. |
### Examples
@ -57,6 +58,8 @@ dapr init [flags]
Install Dapr by pulling container images for Placement, Scheduler, Redis, and Zipkin. By default, these images are pulled from Docker Hub.
> By default, a `dapr_scheduler` local volume is created for the Scheduler service to be used as the database directory. The host file location for this volume is likely located at `/var/lib/docker/volumes/dapr_scheduler/_data` or `~/.local/share/containers/storage/volumes/dapr_scheduler/_data`, depending on your container runtime.
```bash
dapr init
```
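Pairing this with the `--scheduler-volume` flag documented above, initialization with a named volume (the name is illustrative) looks like:
```bash
dapr init --scheduler-volume my-scheduler-volume
```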