Merge branch 'issue_4005' of https://github.com/hhunter-ms/docs into issue_4005

Hannah Hunter 2024-02-06 13:36:39 -05:00
commit 3f7955315a
5 changed files with 231 additions and 32 deletions

@ -18,7 +18,9 @@ metadata:
  name: <NAME>
spec:
  type: state.azure.blobstorage
  version: v1
  # Supports v1 and v2. Users should always use v2 by default. There is no
  # migration path from v1 to v2; see `versioning` below.
  version: v2
  metadata:
  - name: accountName
    value: "[your_account_name]"
@ -32,21 +34,32 @@ spec:
The above example uses secrets as plain strings. It is recommended to use a secret store for the secrets as described [here]({{< ref component-secrets.md >}}).
{{% /alert %}}
## Versioning
Dapr has 2 versions of the Azure Blob Storage state store component: `v1` and `v2`. It is recommended to use `v2` for all new applications. `v1` is considered legacy and is preserved for compatibility with existing applications only.
In `v1`, a longstanding implementation issue was identified, where the [key prefix]({{< ref howto-share-state.md >}}) was incorrectly stripped by the component, essentially behaving as if `keyPrefix` was always set to `none`.
The updated `v2` of the component fixes the incorrect behavior and makes the state store correctly respect the `keyPrefix` property.
While `v1` and `v2` have the same metadata fields, they are otherwise incompatible, with no automatic data migration path for `v1` to `v2`.
If you are using `v1` of this component, you should continue to use `v1` until you create a new state store.
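
The guidance above can be sketched as two component definitions: one that stays on `v1` for data that already exists, and one that uses `v2` for a new state store. This is a minimal sketch, not a definitive configuration. The component names `existing-blob-store` and `new-blob-store` and the container placeholders are assumptions for illustration, and the `keyPrefix` value `name` is taken from the key prefix strategies described in the linked how-to; adjust it to your own strategy.

```yaml
# Existing application: the data was written by v1, so keep v1
apiVersion: dapr.io/v1alpha1
kind: Component
metadata:
  # Hypothetical name, for illustration only
  name: existing-blob-store
spec:
  type: state.azure.blobstorage
  version: v1
  metadata:
  - name: accountName
    value: "[your_account_name]"
  - name: accountKey
    value: "[your_account_key]"
  - name: containerName
    value: "[your_existing_container]"
---
# New state store: use v2, which respects keyPrefix
apiVersion: dapr.io/v1alpha1
kind: Component
metadata:
  # Hypothetical name, for illustration only
  name: new-blob-store
spec:
  type: state.azure.blobstorage
  version: v2
  metadata:
  - name: accountName
    value: "[your_account_name]"
  - name: accountKey
    value: "[your_account_key]"
  - name: containerName
    value: "[your_new_container]"
  # Assumed strategy for this sketch; see the key prefix how-to for the options
  - name: keyPrefix
    value: "name"
```

Pointing the `v2` component at a separate container keeps the two sets of data apart, since there is no automatic migration between versions.
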
## Spec metadata fields
| Field | Required | Details | Example |
| Field | Required | Details | Example |
|--------------------|:--------:|---------|---------|
| `accountName` | Y | The storage account name | `"mystorageaccount"`.
| `accountKey` | Y (unless using Microsoft Entra ID) | Primary or secondary storage key | `"key"`
| `containerName` | Y | The name of the container to be used for Dapr state. The container will be created for you if it doesn't exist | `"container"`
| `azureEnvironment` | N | Optional name for the Azure environment if using a different Azure cloud | `"AZUREPUBLICCLOUD"` (default value), `"AZURECHINACLOUD"`, `"AZUREUSGOVERNMENTCLOUD"`, `"AZUREGERMANCLOUD"`
| `accountName` | Y | The storage account name | `"mystorageaccount"` |
| `accountKey` | Y (unless using Microsoft Entra ID) | Primary or secondary storage key | `"key"` |
| `containerName` | Y | The name of the container to be used for Dapr state. The container will be created for you if it doesn't exist | `"container"` |
| `azureEnvironment` | N | Optional name for the Azure environment if using a different Azure cloud | `"AZUREPUBLICCLOUD"` (default value), `"AZURECHINACLOUD"`, `"AZUREUSGOVERNMENTCLOUD"` |
| `endpoint` | N | Optional custom endpoint URL. This is useful when using the [Azurite emulator](https://github.com/Azure/azurite) or when using custom domains for Azure Storage (although this is not officially supported). The endpoint must be the full base URL, including the protocol (`http://` or `https://`), the IP or FQDN, and optional port. | `"http://127.0.0.1:10000"` |
| `ContentType` | N | The blob's content type | `"text/plain"`
| `ContentMD5` | N | The blob's MD5 hash | `"vZGKbMRDAnMs4BIwlXaRvQ=="`
| `ContentEncoding` | N | The blob's content encoding | `"UTF-8"`
| `ContentLanguage` | N | The blob's content language | `"en-us"`
| `ContentDisposition` | N | The blob's content disposition. Conveys additional information about how to process the response payload | `"attachment"`
| `CacheControl` | N | The blob's cache control | `"no-cache"`
| `ContentType` | N | The blob's content type | `"text/plain"` |
| `ContentMD5` | N | The blob's MD5 hash | `"vZGKbMRDAnMs4BIwlXaRvQ=="` |
| `ContentEncoding` | N | The blob's content encoding | `"UTF-8"` |
| `ContentLanguage` | N | The blob's content language | `"en-us"` |
| `ContentDisposition` | N | The blob's content disposition. Conveys additional information about how to process the response payload | `"attachment"` |
| `CacheControl` | N | The blob's cache control | `"no-cache"` |
## Setup Azure Blob Storage

@ -1,13 +1,23 @@
---
type: docs
title: "PostgreSQL"
linkTitle: "PostgreSQL"
description: Detailed information on the PostgreSQL state store component
title: "PostgreSQL v1"
linkTitle: "PostgreSQL v1"
description: Detailed information on the PostgreSQL v1 state store component
aliases:
- "/operations/components/setup-state-store/supported-state-stores/setup-postgresql/"
- "/operations/components/setup-state-store/supported-state-stores/setup-postgres/"
- "/operations/components/setup-state-store/supported-state-stores/setup-postgresql-v1/"
- "/operations/components/setup-state-store/supported-state-stores/setup-postgres-v1/"
---
This component allows using PostgreSQL (Postgres) as state store for Dapr. See [this guide]({{< ref "howto-get-save-state.md#step-1-setup-a-state-store" >}}) on how to create and apply a state store configuration.
{{% alert title="Note" color="primary" %}}
Starting with Dapr 1.13, you can leverage the [PostgreSQL v2]({{< ref setup-postgresql-v2.md >}}) state store component, which contains some improvements to performance and reliability.
The v2 component is not compatible with v1, and data cannot be migrated between the two components. The v2 component does not offer support for state store query APIs.
There are no plans to deprecate the v1 component.
{{% /alert %}}
This component allows using PostgreSQL (Postgres) as a state store for Dapr, using the "v1" component. See [this guide]({{< ref "howto-get-save-state.md#step-1-setup-a-state-store" >}}) on how to create and apply a state store configuration.
```yaml
apiVersion: dapr.io/v1alpha1
@ -21,8 +31,8 @@ spec:
    # Connection string
    - name: connectionString
      value: "<CONNECTION STRING>"
    # Timeout for database operations, in seconds (optional)
    #- name: timeoutInSeconds
    # Timeout for database operations, as a Go duration or number of seconds (optional)
    #- name: timeout
    #  value: 20
    # Name of the table where to store the state (optional)
    #- name: tableName
@ -31,8 +41,8 @@ spec:
    #- name: metadataTableName
    #  value: "dapr_metadata"
    # Cleanup interval, as a Go duration or number of seconds, to remove expired rows (optional)
    #- name: cleanupIntervalInSeconds
    #  value: 3600
    #- name: cleanupInterval
    #  value: "1h"
    # Maximum number of connections pooled by this component (optional)
    #- name: maxConns
    #  value: 0
@ -59,7 +69,7 @@ The following metadata options are **required** to authenticate using a PostgreS
| Field | Required | Details | Example |
|--------|:--------:|---------|---------|
| `connectionString` | Y | The connection string for the PostgreSQL database. See the PostgreSQL [documentation on database connections](https://www.postgresql.org/docs/current/libpq-connect.html) for information on how to define a connection string. | `"host=localhost user=postgres password=example port=5432 connect_timeout=10 database=my_db"`
| `connectionString` | Y | The connection string for the PostgreSQL database. See the PostgreSQL [documentation on database connections](https://www.postgresql.org/docs/current/libpq-connect.html) for information on how to define a connection string. | `"host=localhost user=postgres password=example port=5432 connect_timeout=10 database=my_db"` |
### Authenticate using Microsoft Entra ID
@ -77,10 +87,10 @@ Authenticating with Microsoft Entra ID is supported with Azure Database for Post
| Field | Required | Details | Example |
|--------------------|:--------:|---------|---------|
| `timeoutInSeconds` | N | Timeout, in seconds, for all database operations. Defaults to `20` | `30`
| `tableName` | N | Name of the table where the data is stored. Defaults to `state`. Can optionally have the schema name as prefix, such as `public.state` | `"state"`, `"public.state"`
| `metadataTableName` | N | Name of the table Dapr uses to store a few metadata properties. Defaults to `dapr_metadata`. Can optionally have the schema name as prefix, such as `public.dapr_metadata` | `"dapr_metadata"`, `"public.dapr_metadata"`
| `cleanupIntervalInSeconds` | N | Interval, in seconds, to clean up rows with an expired TTL. Default: `3600` (i.e. 1 hour). Setting this to values <=0 disables the periodic cleanup. | `1800`, `-1`
| `timeout` | N | Timeout for operations on the database, as a [Go duration](https://pkg.go.dev/time#ParseDuration). Integers are interpreted as number of seconds. Defaults to `20s` | `"30s"`, `30` |
| `cleanupInterval` | N | Interval, as a Go duration or number of seconds, to clean up rows with an expired TTL. Default: `1h` (1 hour). Setting this to values <=0 disables the periodic cleanup. | `"30m"`, `1800`, `-1` |
| `maxConns` | N | Maximum number of connections pooled by this component. Set to 0 or lower to use the default value, which is the greater of 4 or the number of CPUs. | `"4"` |
| `connectionMaxIdleTime` | N | Max idle time before unused connections are automatically closed in the connection pool. By default, there's no value and this is left to the database driver to choose. | `"5m"` |
| `queryExecMode` | N | Controls the default mode for executing queries. By default Dapr uses the extended protocol and automatically prepares and caches prepared statements. However, this may be incompatible with proxies such as PgBouncer. In this case, it may be preferable to use `exec` or `simple_protocol`. | `"simple_protocol"` |
@ -100,8 +110,8 @@ Authenticating with Microsoft Entra ID is supported with Azure Database for Post
> This example does not describe a production configuration because it sets the password in plain text and the user name is left as the PostgreSQL default of "postgres".
2. Create a database for state data.
Either the default "postgres" database can be used, or create a new database for storing state data.
1. Create a database for state data.
Either the default "postgres" database can be used, or create a new database for storing state data.
To create a new database in PostgreSQL, run the following SQL command:
@ -121,10 +131,10 @@ This state store supports [Time-To-Live (TTL)]({{< ref state-store-ttl.md >}}) f
Because PostgreSQL doesn't have built-in support for TTLs, this is implemented in Dapr by adding a column in the state table indicating when the data is to be considered "expired". Records that are "expired" are not returned to the caller, even if they're still physically stored in the database. A background "garbage collector" periodically scans the state table for expired rows and deletes them.
The interval at which the deletion of expired records happens is set with the `cleanupIntervalInSeconds` metadata property, which defaults to 3600 seconds (that is, 1 hour).
You can set the deletion interval of expired records with the `cleanupInterval` metadata property, which defaults to 3600 seconds (that is, 1 hour).
- Longer intervals require less frequent scans for expired rows, but can require storing expired records for longer, potentially requiring more storage space. If you plan to store many records in your state table, with short TTLs, consider setting `cleanupIntervalInSeconds` to a smaller value, for example `300` (300 seconds, or 5 minutes).
- If you do not plan to use TTLs with Dapr and the PostgreSQL state store, you should consider setting `cleanupIntervalInSeconds` to a value <= 0 (e.g. `0` or `-1`) to disable the periodic cleanup and reduce the load on the database.
- Longer intervals require less frequent scans for expired rows, but can require storing expired records for longer, potentially requiring more storage space. If you plan to store many records in your state table, with short TTLs, consider setting `cleanupInterval` to a smaller value; for example, `5m` (5 minutes).
- If you do not plan to use TTLs with Dapr and the PostgreSQL state store, you should consider setting `cleanupInterval` to a value <= 0 (for example, `0` or `-1`) to disable the periodic cleanup and reduce the load on the database.
The column in the state table that stores the expiration date for records, `expiredate`, **does not have an index by default**, so each periodic cleanup must perform a full-table scan. If you have a table with a very large number of records, and only some of them use a TTL, you may find it useful to create an index on that column. Assuming that your state table name is `state` (the default), you can use this query:

@ -0,0 +1,165 @@
---
type: docs
title: "PostgreSQL"
linkTitle: "PostgreSQL"
description: Detailed information on the PostgreSQL state store component
aliases:
- "/operations/components/setup-state-store/supported-state-stores/setup-postgresql-v2/"
- "/operations/components/setup-state-store/supported-state-stores/setup-postgres-v2/"
---
{{% alert title="Note" color="primary" %}}
This is the v2 of the PostgreSQL state store component, which contains some improvements to performance and reliability. New applications are encouraged to use v2.
The PostgreSQL v2 state store component is not compatible with the [v1 component]({{< ref setup-postgresql-v1.md >}}), and data cannot be migrated between the two components. The v2 component does not offer support for state store query APIs.
There are no plans to deprecate the v1 component.
{{% /alert %}}
This component allows using PostgreSQL (Postgres) as a state store for Dapr, using the "v2" component. See [this guide]({{< ref "howto-get-save-state.md#step-1-setup-a-state-store" >}}) on how to create and apply a state store configuration.
```yaml
apiVersion: dapr.io/v1alpha1
kind: Component
metadata:
  name: <NAME>
spec:
  type: state.postgresql
  # Note: setting "version" to "v2" is required to use the v2 of the component
  version: v2
  metadata:
    # Connection string
    - name: connectionString
      value: "<CONNECTION STRING>"
    # Timeout for database operations, as a Go duration or number of seconds (optional)
    #- name: timeout
    #  value: 20
    # Prefix for the table where the data is stored (optional)
    #- name: tablePrefix
    #  value: ""
    # Name of the table where to store metadata used by Dapr (optional)
    #- name: metadataTableName
    #  value: "dapr_metadata"
    # Cleanup interval, as a Go duration or number of seconds, to remove expired rows (optional)
    #- name: cleanupInterval
    #  value: "1h"
    # Maximum number of connections pooled by this component (optional)
    #- name: maxConns
    #  value: 0
    # Max idle time for connections before they're closed (optional)
    #- name: connectionMaxIdleTime
    #  value: 0
    # Controls the default mode for executing queries. (optional)
    #- name: queryExecMode
    #  value: ""
    # Uncomment this if you wish to use PostgreSQL as a state store for actors (optional)
    #- name: actorStateStore
    #  value: "true"
```
{{% alert title="Warning" color="warning" %}}
The above example uses secrets as plain strings. It is recommended to use a secret store for the secrets as described [here]({{< ref component-secrets.md >}}).
{{% /alert %}}
## Spec metadata fields
### Authenticate using a connection string
The following metadata options are **required** to authenticate using a PostgreSQL connection string.
| Field | Required | Details | Example |
|--------|:--------:|---------|---------|
| `connectionString` | Y | The connection string for the PostgreSQL database. See the PostgreSQL [documentation on database connections](https://www.postgresql.org/docs/current/libpq-connect.html) for information on how to define a connection string. | `"host=localhost user=postgres password=example port=5432 connect_timeout=10 database=my_db"` |
### Authenticate using Microsoft Entra ID
Authenticating with Microsoft Entra ID is supported with Azure Database for PostgreSQL. All authentication methods supported by Dapr can be used, including client credentials ("service principal") and Managed Identity.
| Field | Required | Details | Example |
|--------|:--------:|---------|---------|
| `useAzureAD` | Y | Must be set to `true` to enable the component to retrieve access tokens from Microsoft Entra ID. | `"true"` |
| `connectionString` | Y | The connection string for the PostgreSQL database.<br>This must contain the user, which corresponds to the name of the user created inside PostgreSQL that maps to the Microsoft Entra ID identity. This is often the name of the corresponding principal (for example, the name of the Microsoft Entra ID application). This connection string should not contain any password. | `"host=mydb.postgres.database.azure.com user=myapplication port=5432 database=my_db sslmode=require"` |
| `azureTenantId` | N | ID of the Microsoft Entra ID tenant | `"cd4b2887-304c-…"` |
| `azureClientId` | N | Client ID (application ID) | `"c7dd251f-811f-…"` |
| `azureClientSecret` | N | Client secret (application password) | `"Ecy3X…"` |
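
As an illustration of the fields in the table above, the following is a minimal sketch of a v2 component that authenticates with Microsoft Entra ID. The connection string is the example value from the table; the commented-out credential placeholders are assumptions for a service principal and can stay commented out when the environment supplies the identity some other way (for example, Managed Identity).

```yaml
apiVersion: dapr.io/v1alpha1
kind: Component
metadata:
  name: <NAME>
spec:
  type: state.postgresql
  version: v2
  metadata:
    # Required to retrieve access tokens from Microsoft Entra ID
    - name: useAzureAD
      value: "true"
    # Contains the PostgreSQL user mapped to the identity, and no password
    - name: connectionString
      value: "host=mydb.postgres.database.azure.com user=myapplication port=5432 database=my_db sslmode=require"
    # Optional service principal credentials (placeholder values)
    #- name: azureTenantId
    #  value: "<TENANT ID>"
    #- name: azureClientId
    #  value: "<CLIENT ID>"
    #- name: azureClientSecret
    #  value: "<CLIENT SECRET>"
```

If you do set `azureClientSecret`, consider referencing it from a secret store rather than writing it as a plain string.
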
### Other metadata options
| Field | Required | Details | Example |
|--------------------|:--------:|---------|---------|
| `tablePrefix` | N | Prefix for the table where the data is stored. Can optionally have the schema name as prefix, such as `public.prefix_` | `"prefix_"`, `"public.prefix_"` |
| `metadataTableName` | N | Name of the table Dapr uses to store a few metadata properties. Defaults to `dapr_metadata`. Can optionally have the schema name as prefix, such as `public.dapr_metadata` | `"dapr_metadata"`, `"public.dapr_metadata"` |
| `timeout` | N | Timeout for operations on the database, as a [Go duration](https://pkg.go.dev/time#ParseDuration). Integers are interpreted as number of seconds. Defaults to `20s` | `"30s"`, `30` |
| `cleanupInterval` | N | Interval, as a Go duration or number of seconds, to clean up rows with an expired TTL. Default: `1h` (1 hour). Setting this to values <=0 disables the periodic cleanup. | `"30m"`, `1800`, `-1` |
| `maxConns` | N | Maximum number of connections pooled by this component. Set to 0 or lower to use the default value, which is the greater of 4 or the number of CPUs. | `"4"` |
| `connectionMaxIdleTime` | N | Max idle time before unused connections are automatically closed in the connection pool. By default, there's no value and this is left to the database driver to choose. | `"5m"` |
| `queryExecMode` | N | Controls the default mode for executing queries. By default Dapr uses the extended protocol and automatically prepares and caches prepared statements. However, this may be incompatible with proxies such as PgBouncer. In this case, it may be preferable to use `exec` or `simple_protocol`. | `"simple_protocol"` |
| `actorStateStore` | N | Consider this state store for actors. Defaults to `"false"` | `"true"`, `"false"` |
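
To show how several of these options combine, here is a sketch of a v2 component intended to sit behind a connection proxy such as PgBouncer. All values are the example values from the table above; whether they suit a given deployment is an assumption you should check against your own proxy and workload.

```yaml
apiVersion: dapr.io/v1alpha1
kind: Component
metadata:
  name: <NAME>
spec:
  type: state.postgresql
  version: v2
  metadata:
    - name: connectionString
      value: "<CONNECTION STRING>"
    # Avoid prepared statements, which may be incompatible with the proxy
    - name: queryExecMode
      value: "simple_protocol"
    # Cap the pool size instead of using the default
    - name: maxConns
      value: "4"
    # Close pooled connections that have been idle for 5 minutes
    - name: connectionMaxIdleTime
      value: "5m"
    # Go duration; the integer 30 would mean the same 30 seconds
    - name: timeout
      value: "30s"
    # State tables are created with this prefix in the "public" schema
    - name: tablePrefix
      value: "public.prefix_"
```
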
## Setup PostgreSQL
{{< tabs "Self-Hosted" >}}
{{% codetab %}}
1. Run an instance of PostgreSQL. You can run a local instance of PostgreSQL in Docker with the following command:
```bash
docker run -p 5432:5432 -e POSTGRES_PASSWORD=example postgres
```
> This example does not describe a production configuration because it sets the password in plain text and the user name is left as the PostgreSQL default of "postgres".
2. Create a database for state data.
Either the default "postgres" database can be used, or create a new database for storing state data.
To create a new database in PostgreSQL, run the following SQL command:
```sql
CREATE DATABASE my_dapr;
```
{{% /codetab %}}
{{% /tabs %}}
## Advanced
### Differences between v1 and v2
The PostgreSQL state store v2 was introduced in Dapr 1.13. The [pre-existing v1]({{< ref setup-postgresql-v1.md >}}) remains available and is not deprecated.
In the v2 component, the table schema has been changed significantly, with the goal of increasing performance and reliability. Most notably, the value stored by Dapr is now of type _BYTEA_, which allows faster queries and, in some cases, is more space-efficient than the previously-used _JSONB_ column.
However, due to this change, the v2 component does not support the [Dapr state store query APIs]({{< ref howto-state-query-api.md >}}).
Also, in the v2 component, ETags are now random UUIDs, which ensures better compatibility with other PostgreSQL-compatible databases, such as CockroachDB.
Because of these changes, v1 and v2 components are not able to read or write data from the same table. At this stage, it's also impossible to migrate data between the two versions of the component.
### Displaying the data in human-readable format
The PostgreSQL v2 component stores the state's value in the `value` column, which is of type _BYTEA_. Most PostgreSQL tools, including pgAdmin, consider the value as binary and do not display it in human-readable form by default.
If you want to inspect the value in the state store, and you know it's not binary (for example, JSON data), you can have the value displayed in human-readable form using a query like the following:
```sql
-- Replace "state" with the name of the state table in your environment
SELECT *, convert_from(value, 'utf-8') FROM state;
```
### TTLs and cleanups
This state store supports [Time-To-Live (TTL)]({{< ref state-store-ttl.md >}}) for records stored with Dapr. When storing data using Dapr, you can set the `ttlInSeconds` metadata property to indicate after how many seconds the data should be considered "expired".
Because PostgreSQL doesn't have built-in support for TTLs, this is implemented in Dapr by adding a column in the state table indicating when the data is to be considered "expired". Records that are "expired" are not returned to the caller, even if they're still physically stored in the database. A background "garbage collector" periodically scans the state table for expired rows and deletes them.
You can set the deletion interval of expired records with the `cleanupInterval` metadata property, which defaults to 3600 seconds (that is, 1 hour).
- Longer intervals require less frequent scans for expired rows, but can require storing expired records for longer, potentially requiring more storage space. If you plan to store many records in your state table, with short TTLs, consider setting `cleanupInterval` to a smaller value; for example, `5m` (5 minutes).
- If you do not plan to use TTLs with Dapr and the PostgreSQL state store, you should consider setting `cleanupInterval` to a value <= 0 (for example, `0` or `-1`) to disable the periodic cleanup and reduce the load on the database.
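
The two cases in the list above can be sketched as metadata settings on the v2 component. This is an illustrative fragment only: pick one of the two `cleanupInterval` entries, and treat `5m` as the example value from the text rather than a recommendation for every workload.

```yaml
apiVersion: dapr.io/v1alpha1
kind: Component
metadata:
  name: <NAME>
spec:
  type: state.postgresql
  version: v2
  metadata:
    - name: connectionString
      value: "<CONNECTION STRING>"
    # Many records with short TTLs: scan for expired rows every 5 minutes
    - name: cleanupInterval
      value: "5m"
    # Alternative when TTLs are not used: disable the periodic cleanup
    #- name: cleanupInterval
    #  value: "-1"
```
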
## Related links
- [Basic schema for a Dapr component]({{< ref component-schema >}})
- Read [this guide]({{< ref "howto-get-save-state.md#step-2-save-and-retrieve-a-single-state" >}}) for instructions on configuring state store components
- [State management building block]({{< ref state-management >}})

@ -1,8 +1,8 @@
- component: Azure Blob Storage
link: setup-azure-blobstorage
state: Stable
version: v1
since: "1.0"
version: v2
since: "1.13"
features:
crud: true
transactions: false

@ -141,8 +141,8 @@
etag: true
ttl: true
query: false
- component: PostgreSQL
link: setup-postgresql
- component: PostgreSQL v1
link: setup-postgresql-v1
state: Stable
version: v1
since: "1.0"
@ -152,6 +152,17 @@
etag: true
ttl: true
query: true
- component: PostgreSQL v2
link: setup-postgresql-v2
state: Stable
version: v2
since: "1.13"
features:
crud: true
transactions: true
etag: true
ttl: true
query: false
- component: Redis
link: setup-redis
state: Stable