# Buckets

The `Bucket` API defines a Source to produce an Artifact for objects from
storage solutions like Amazon S3, Google Cloud Storage buckets, or any other
solution with an S3 compatible API such as Minio, Alibaba Cloud OSS and others.

## Example

The following is an example of a Bucket. It creates a tarball (`.tar.gz`)
Artifact with the fetched objects from an object storage with an S3 compatible
API (e.g. [Minio](https://min.io)):

```yaml
---
apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: Bucket
metadata:
  name: minio-bucket
  namespace: default
spec:
  interval: 5m0s
  endpoint: minio.example.com
  insecure: true
  secretRef:
    name: minio-bucket-secret
  bucketName: example
---
apiVersion: v1
kind: Secret
metadata:
  name: minio-bucket-secret
  namespace: default
type: Opaque
stringData:
  accesskey: <access key>
  secretkey: <secret key>
```

In the above example:

- A Bucket named `minio-bucket` is created, indicated by the
  `.metadata.name` field.
- The source-controller checks the object storage bucket every five minutes,
  indicated by the `.spec.interval` field.
- It authenticates to the `minio.example.com` endpoint with the static
  credentials from the `minio-bucket-secret` Secret data, indicated by the
  `.spec.endpoint` and `.spec.secretRef.name` fields.
- A list of object keys and their [etags](https://en.wikipedia.org/wiki/HTTP_ETag)
  in the `.spec.bucketName` bucket is compiled, while filtering the keys using
  [default ignore rules](#default-exclusions).
- The digest (algorithm defaults to SHA256) of the list is used as Artifact
  revision, reported in-cluster in the `.status.artifact.revision` field.
- When the current Bucket revision differs from the latest calculated revision,
  all objects are fetched and archived.
- The new Artifact is reported in the `.status.artifact` field.

You can run this example by saving the manifest into `bucket.yaml`, and
changing the Bucket and Secret values to target a Minio instance you have
control over.

**Note:** For more advanced examples targeting e.g. Amazon S3 or GCP, see
[Provider](#provider).

1. Apply the resource on the cluster:

   ```sh
   kubectl apply -f bucket.yaml
   ```

2. Run `kubectl get buckets` to see the Bucket:

   ```console
   NAME           ENDPOINT            AGE   READY   STATUS
   minio-bucket   minio.example.com   34s   True    stored artifact for revision 'sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855'
   ```

3. Run `kubectl describe bucket minio-bucket` to see the [Artifact](#artifact)
   and [Conditions](#conditions) in the Bucket's Status:

   ```console
   ...
   Status:
     Artifact:
       Digest:            sha256:72aa638abb455ca5f9ef4825b949fd2de4d4be0a74895bf7ed2338622cd12686
       Last Update Time:  2022-02-01T23:43:38Z
       Path:              bucket/default/minio-bucket/e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855.tar.gz
       Revision:          sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
       Size:              38099
       URL:               http://source-controller.source-system.svc.cluster.local./bucket/default/minio-bucket/e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855.tar.gz
     Conditions:
       Last Transition Time:  2022-02-01T23:43:38Z
       Message:               stored artifact for revision 'sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855'
       Observed Generation:   1
       Reason:                Succeeded
       Status:                True
       Type:                  Ready
       Last Transition Time:  2022-02-01T23:43:38Z
       Message:               stored artifact for revision 'sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855'
       Observed Generation:   1
       Reason:                Succeeded
       Status:                True
       Type:                  ArtifactInStorage
     Observed Generation:     1
     URL:                     http://source-controller.source-system.svc.cluster.local./bucket/default/minio-bucket/latest.tar.gz
   Events:
     Type    Reason       Age   From               Message
     ----    ------       ----  ----               -------
     Normal  NewArtifact  82s   source-controller  stored artifact with 16 fetched files from 'example' bucket
   ```

## Writing a Bucket spec

As with all other Kubernetes config, a Bucket needs `apiVersion`, `kind`, and
`metadata` fields. The name of a Bucket object must be a valid
[DNS subdomain name](https://kubernetes.io/docs/concepts/overview/working-with-objects/names#dns-subdomain-names).

A Bucket also needs a
[`.spec` section](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md#spec-and-status).

### Provider

The `.spec.provider` field allows for specifying a Provider to enable provider
specific configurations, for example to communicate with a non-S3 compatible
API endpoint, or to change the authentication method.

Supported options are:

- [Generic](#generic)
- [AWS](#aws)
- [Azure](#azure)
- [GCP](#gcp)

If you do not specify `.spec.provider`, it defaults to `generic`.

#### Generic

When a Bucket's `.spec.provider` is set to `generic`, the controller will
attempt to communicate with the specified [Endpoint](#endpoint) using the
[Minio Client SDK](https://github.com/minio/minio-go), which can communicate
with any Amazon S3 compatible object storage (including
[GCS](https://cloud.google.com/storage/docs/interoperability),
[Wasabi](https://wasabi-support.zendesk.com/hc/en-us/articles/360002079671-How-do-I-use-Minio-Client-with-Wasabi-),
and many others).

The `generic` Provider _requires_ a [Secret reference](#secret-reference) to a
Secret with `.data.accesskey` and `.data.secretkey` values, used to
authenticate with static credentials.

The Provider allows for specifying a region the bucket is in using the
[`.spec.region` field](#region), if required by the [Endpoint](#endpoint).
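A matching Secret can, for example, be created imperatively with `kubectl`.
This is a sketch; the Secret name lines up with the `minio-credentials`
reference used in the example below, and the credential values are
placeholders:

```sh
kubectl create secret generic minio-credentials \
  --namespace=default \
  --from-literal=accesskey='<access key>' \
  --from-literal=secretkey='<secret key>'
```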
##### Generic example

```yaml
---
apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: Bucket
metadata:
  name: generic-insecure
  namespace: default
spec:
  provider: generic
  interval: 5m0s
  bucketName: podinfo
  endpoint: minio.minio.svc.cluster.local:9000
  timeout: 60s
  insecure: true
  secretRef:
    name: minio-credentials
---
apiVersion: v1
kind: Secret
metadata:
  name: minio-credentials
  namespace: default
type: Opaque
data:
  accesskey: <BASE64>
  secretkey: <BASE64>
```

#### AWS

When a Bucket's `.spec.provider` field is set to `aws`, the source-controller
will attempt to communicate with the specified [Endpoint](#endpoint) using the
[Minio Client SDK](https://github.com/minio/minio-go).

Without a [Secret reference](#secret-reference), authorization using
credentials retrieved from the AWS EC2 service is attempted by default. When
a reference is specified, it expects a Secret with `.data.accesskey` and
`.data.secretkey` values, used to authenticate with static credentials.

The Provider allows for specifying the
[Amazon AWS Region](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-regions-availability-zones.html#concepts-available-regions)
using the [`.spec.region` field](#region).

##### AWS EC2 example

**Note:** On EKS you have to create an [IAM role](#aws-iam-role-example) for
the source-controller service account that grants access to the bucket.

```yaml
---
apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: Bucket
metadata:
  name: aws
  namespace: default
spec:
  interval: 5m0s
  provider: aws
  bucketName: podinfo
  endpoint: s3.amazonaws.com
  region: us-east-1
  timeout: 30s
```

##### AWS IAM role example

Replace `<bucket-name>` with the specified `.spec.bucketName`.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::<bucket-name>/*"
    },
    {
      "Sid": "",
      "Effect": "Allow",
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::<bucket-name>"
    }
  ]
}
```
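One way to bind an IAM role with such a policy to the source-controller
service account on EKS is through [IAM Roles for Service Accounts](https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html)
via `eksctl`. This is a sketch only; the cluster name, account ID and policy
name are illustrative assumptions:

```sh
# Bind an IAM role carrying the above policy to the source-controller
# ServiceAccount in the flux-system namespace.
eksctl create iamserviceaccount \
  --name source-controller \
  --namespace flux-system \
  --cluster <cluster-name> \
  --attach-policy-arn arn:aws:iam::<account-id>:policy/<policy-name> \
  --override-existing-serviceaccounts \
  --approve
```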
##### AWS static auth example

```yaml
---
apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: Bucket
metadata:
  name: aws
  namespace: default
spec:
  interval: 5m0s
  provider: aws
  bucketName: podinfo
  endpoint: s3.amazonaws.com
  region: us-east-1
  secretRef:
    name: aws-credentials
---
apiVersion: v1
kind: Secret
metadata:
  name: aws-credentials
  namespace: default
type: Opaque
data:
  accesskey: <BASE64>
  secretkey: <BASE64>
```

#### Azure

When a Bucket's `.spec.provider` is set to `azure`, the source-controller will
attempt to communicate with the specified [Endpoint](#endpoint) using the
[Azure Blob Storage SDK for Go](https://github.com/Azure/azure-sdk-for-go/tree/main/sdk/storage/azblob).

Without a [Secret reference](#secret-reference), authentication using a chain
with:

- [Environment credentials](https://pkg.go.dev/github.com/Azure/azure-sdk-for-go/sdk/azidentity#EnvironmentCredential)
- [Workload Identity](https://pkg.go.dev/github.com/Azure/azure-sdk-for-go/sdk/azidentity@v1.3.0-beta.4#WorkloadIdentityCredential)
- [Managed Identity](https://pkg.go.dev/github.com/Azure/azure-sdk-for-go/sdk/azidentity#ManagedIdentityCredential)
  with the `AZURE_CLIENT_ID`
- Managed Identity with a system-assigned identity

is attempted by default. If no chain can be established, the bucket is assumed
to be publicly reachable.

When a reference is specified, it expects a Secret with one of the following
sets of `.data` fields:

- `tenantId`, `clientId` and `clientSecret` for authenticating a Service
  Principal with a secret.
- `tenantId`, `clientId` and `clientCertificate` (plus optionally
  `clientCertificatePassword` and/or `clientCertificateSendChain`) for
  authenticating a Service Principal with a certificate.
- `clientId` for authenticating using a Managed Identity.
- `accountKey` for authenticating using a
  [Shared Key](https://pkg.go.dev/github.com/Azure/azure-sdk-for-go/sdk/storage/azblob#SharedKeyCredential).
- `sasKey` for authenticating using a
  [SAS Token](https://docs.microsoft.com/en-us/azure/storage/common/storage-sas-overview).

For any Managed Identity and/or Azure Active Directory authentication method,
the base URL can be configured using `.data.authorityHost`. If not supplied,
[`AzurePublicCloud` is assumed](https://pkg.go.dev/github.com/Azure/azure-sdk-for-go/sdk/azidentity#AuthorityHost).

##### Azure example

```yaml
---
apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: Bucket
metadata:
  name: azure-public
  namespace: default
spec:
  interval: 5m0s
  provider: azure
  bucketName: podinfo
  endpoint: https://podinfoaccount.blob.core.windows.net
  timeout: 30s
```

##### Azure Service Principal Secret example

```yaml
---
apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: Bucket
metadata:
  name: azure-service-principal-secret
  namespace: default
spec:
  interval: 5m0s
  provider: azure
  bucketName: <bucket-name>
  endpoint: https://<account-name>.blob.core.windows.net
  secretRef:
    name: azure-sp-auth
---
apiVersion: v1
kind: Secret
metadata:
  name: azure-sp-auth
  namespace: default
type: Opaque
data:
  tenantId: <BASE64>
  clientId: <BASE64>
  clientSecret: <BASE64>
```

##### Azure Service Principal Certificate example

```yaml
---
apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: Bucket
metadata:
  name: azure-service-principal-cert
  namespace: default
spec:
  interval: 5m0s
  provider: azure
  bucketName: <bucket-name>
  endpoint: https://<account-name>.blob.core.windows.net
  secretRef:
    name: azure-sp-auth
---
apiVersion: v1
kind: Secret
metadata:
  name: azure-sp-auth
  namespace: default
type: Opaque
data:
  tenantId: <BASE64>
  clientId: <BASE64>
  clientCertificate: <BASE64>
  # Plus optionally
  clientCertificatePassword: <BASE64>
  clientCertificateSendChain: <BASE64> # either "1" or "true"
```

##### Azure Managed Identity with Client ID example

```yaml
---
apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: Bucket
metadata:
  name: azure-managed-identity
  namespace: default
spec:
  interval: 5m0s
  provider: azure
  bucketName: <bucket-name>
  endpoint: https://<account-name>.blob.core.windows.net
  secretRef:
    name: azure-smi-auth
---
apiVersion: v1
kind: Secret
metadata:
  name: azure-smi-auth
  namespace: default
type: Opaque
data:
  clientId: <BASE64>
```

##### Azure Blob Shared Key example

```yaml
---
apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: Bucket
metadata:
  name: azure-shared-key
  namespace: default
spec:
  interval: 5m0s
  provider: azure
  bucketName: <bucket-name>
  endpoint: https://<account-name>.blob.core.windows.net
  secretRef:
    name: azure-key
---
apiVersion: v1
kind: Secret
metadata:
  name: azure-key
  namespace: default
type: Opaque
data:
  accountKey: <BASE64>
```

##### Workload Identity

If you have [Workload Identity](https://azure.github.io/azure-workload-identity/docs/installation/managed-clusters.html)
set up on your cluster, you need to create an Azure Identity and give it
access to Azure Blob Storage.

```shell
export IDENTITY_NAME="blob-access"

az role assignment create --role "Storage Blob Data Reader" \
  --assignee-object-id "$(az identity show -n $IDENTITY_NAME -o tsv --query principalId -g $RESOURCE_GROUP)" \
  --scope "/subscriptions/<SUBSCRIPTION-ID>/resourceGroups/<RESOURCE_GROUP>/providers/Microsoft.Storage/storageAccounts/<account-name>/blobServices/default/containers/<container-name>"
```

Establish a federated identity between the Identity and the source-controller
ServiceAccount.
```shell
export SERVICE_ACCOUNT_ISSUER="$(az aks show --resource-group <RESOURCE_GROUP> --name <CLUSTER-NAME> --query "oidcIssuerProfile.issuerUrl" -otsv)"

az identity federated-credential create \
  --name "kubernetes-federated-credential" \
  --identity-name "${IDENTITY_NAME}" \
  --resource-group "${RESOURCE_GROUP}" \
  --issuer "${SERVICE_ACCOUNT_ISSUER}" \
  --subject "system:serviceaccount:flux-system:source-controller"
```

Add a patch to label and annotate the source-controller Deployment and
ServiceAccount correctly so that it can match an identity binding:

```yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - gotk-components.yaml
  - gotk-sync.yaml
patches:
  - patch: |-
      apiVersion: v1
      kind: ServiceAccount
      metadata:
        name: source-controller
        namespace: flux-system
        annotations:
          azure.workload.identity/client-id: <IDENTITY_CLIENT_ID>
        labels:
          azure.workload.identity/use: "true"
  - patch: |-
      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: source-controller
        namespace: flux-system
        labels:
          azure.workload.identity/use: "true"
      spec:
        template:
          metadata:
            labels:
              azure.workload.identity/use: "true"
```

If you have set up Workload Identity correctly and labeled the
source-controller Deployment and ServiceAccount, then you don't need to
reference a Secret. For more information, please see the
[documentation](https://azure.github.io/azure-workload-identity/docs/quick-start.html).

```yaml
apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: Bucket
metadata:
  name: azure-bucket
  namespace: flux-system
spec:
  interval: 5m0s
  provider: azure
  bucketName: testsas
  endpoint: https://testfluxsas.blob.core.windows.net
```

##### Deprecated: Managed Identity with AAD Pod Identity

If you are using [aad pod identity](https://azure.github.io/aad-pod-identity/docs),
you need to create an Azure Identity and give it access to Azure Blob Storage.

```sh
export IDENTITY_NAME="blob-access"

az role assignment create --role "Storage Blob Data Reader" \
  --assignee-object-id "$(az identity show -n $IDENTITY_NAME -o tsv --query principalId -g $RESOURCE_GROUP)" \
  --scope "/subscriptions/<SUBSCRIPTION-ID>/resourceGroups/$RESOURCE_GROUP/providers/Microsoft.Storage/storageAccounts/<account-name>/blobServices/default/containers/<container-name>"

export IDENTITY_CLIENT_ID="$(az identity show -n ${IDENTITY_NAME} -g ${RESOURCE_GROUP} -otsv --query clientId)"
export IDENTITY_RESOURCE_ID="$(az identity show -n ${IDENTITY_NAME} -otsv --query id)"
```

Create an AzureIdentity object that references the identity created above:

```yaml
---
apiVersion: aadpodidentity.k8s.io/v1
kind: AzureIdentity
metadata:
  name: <IDENTITY_NAME>  # source-controller label will match this name
  namespace: flux-system
spec:
  clientID: <IDENTITY_CLIENT_ID>
  resourceID: <IDENTITY_RESOURCE_ID>
  type: 0  # user-managed identity
```

Create an AzureIdentityBinding object that binds Pods with a specific selector
with the AzureIdentity created:

```yaml
apiVersion: "aadpodidentity.k8s.io/v1"
kind: AzureIdentityBinding
metadata:
  name: ${IDENTITY_NAME}-binding
spec:
  azureIdentity: ${IDENTITY_NAME}
  selector: ${IDENTITY_NAME}
```

Label the source-controller Deployment correctly so that it can match an
identity binding:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: source-controller
  namespace: flux-system
spec:
  template:
    metadata:
      labels:
        aadpodidbinding: ${IDENTITY_NAME}  # match the AzureIdentity name
```

If you have set up aad-pod-identity correctly and labeled the source-controller
Deployment, then you don't need to reference a Secret.
```yaml
apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: Bucket
metadata:
  name: azure-bucket
  namespace: flux-system
spec:
  interval: 5m0s
  provider: azure
  bucketName: testsas
  endpoint: https://testfluxsas.blob.core.windows.net
```

##### Azure Blob SAS Token example

```yaml
---
apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: Bucket
metadata:
  name: azure-sas-token
  namespace: default
spec:
  interval: 5m0s
  provider: azure
  bucketName: <bucket-name>
  endpoint: https://<account-name>.blob.core.windows.net
  secretRef:
    name: azure-key
---
apiVersion: v1
kind: Secret
metadata:
  name: azure-key
  namespace: default
type: Opaque
data:
  sasKey: <BASE64>
```

The `sasKey` only contains the SAS token, e.g.
`?sv=2020-08-0&ss=bfqt&srt=co&sp=rwdlacupitfx&se=2022-05-26T21:55:35Z&st=2022-05...`.
The leading question mark (`?`) is optional. The query values from the `sasKey`
data field in the Secret get merged with the ones in the `.spec.endpoint` of
the Bucket. If the same key is present in both of them, the value in the
`sasKey` takes precedence.

**Note:** The SAS token has an expiry date, and it must be updated before it
expires to allow Flux to continue to access Azure Storage. Both account-level
and container-level SAS tokens are allowed.

The minimum permissions for an account-level SAS token are:

- Allowed services: `Blob`
- Allowed resource types: `Container`, `Object`
- Allowed permissions: `Read`, `List`

The minimum permissions for a container-level SAS token are:

- Allowed permissions: `Read`, `List`

Refer to the
[Azure documentation](https://learn.microsoft.com/en-us/rest/api/storageservices/create-account-sas#blob-service)
for a full overview on permissions.

#### GCP

When a Bucket's `.spec.provider` is set to `gcp`, the source-controller will
attempt to communicate with the specified [Endpoint](#endpoint) using the
[Google Client SDK](https://github.com/googleapis/google-api-go-client).

Without a [Secret reference](#secret-reference), authorization using a
workload identity is attempted by default. The workload identity is obtained
using the `GOOGLE_APPLICATION_CREDENTIALS` environment variable, falling back
to the Google Application Credential file in the config directory.

When a reference is specified, it expects a Secret with a `.data.serviceaccount`
value with a GCP service account JSON file.

The Provider allows for specifying the
[Bucket location](https://cloud.google.com/storage/docs/locations) using the
[`.spec.region` field](#region).
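A Secret in the expected format can, for example, be created from a service
account JSON key file with `kubectl`. This is a sketch; the Secret name and
the file path are illustrative:

```sh
kubectl create secret generic gcp-service-account \
  --namespace=default \
  --from-file=serviceaccount=/path/to/gcp-service-account.json
```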
##### GCP example

```yaml
---
apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: Bucket
metadata:
  name: gcp-workload-identity
  namespace: default
spec:
  interval: 5m0s
  provider: gcp
  bucketName: podinfo
  endpoint: storage.googleapis.com
  region: us-east-1
  timeout: 30s
```

##### GCP static auth example

```yaml
---
apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: Bucket
metadata:
  name: gcp-secret
  namespace: default
spec:
  interval: 5m0s
  provider: gcp
  bucketName: <bucket-name>
  endpoint: storage.googleapis.com
  region: <region>
  secretRef:
    name: gcp-service-account
---
apiVersion: v1
kind: Secret
metadata:
  name: gcp-service-account
  namespace: default
type: Opaque
data:
  serviceaccount: <BASE64>
```

Where the (base64 decoded) value of `.data.serviceaccount` looks like this:

```json
{
  "type": "service_account",
  "project_id": "example",
  "private_key_id": "28qwgh3gdf5hj3gb5fj3gsu5yfgh34f45324568hy2",
  "private_key": "-----BEGIN PRIVATE KEY-----\nHwethgy123hugghhhbdcu6356dgyjhsvgvGFDHYgcdjbvcdhbsx63c\n76tgycfehuhVGTFYfw6t7ydgyVgydheyhuggycuhejwy6t35fthyuhegvcetf\nTFUHGTygghubhxe65ygt6tgyedgy326hucyvsuhbhcvcsjhcsjhcsvgdtHFCGi\nHcye6tyyg3gfyuhchcsbhygcijdbhyyTF66tuhcevuhdcbhuhhvftcuhbh3uh7t6y\nggvftUHbh6t5rfthhuGVRtfjhbfcrd5r67yuhuvgFTYjgvtfyghbfcdrhyjhbfctfdfyhvfg\ntgvggtfyghvft6tugvTF5r66tujhgvfrtyhhgfct6y7ytfr5ctvghbhhvtghhjvcttfycf\nffxfghjbvgcgyt67ujbgvctfyhVC7uhvgcyjvhhjvyujc\ncgghgvgcfhgg765454tcfthhgftyhhvvyvvffgfryyu77reredswfthhgfcftycfdrttfhf/\n-----END PRIVATE KEY-----\n",
  "client_email": "test@example.iam.gserviceaccount.com",
  "client_id": "32657634678762536746",
  "auth_uri": "https://accounts.google.com/o/oauth2/auth",
  "token_uri": "https://oauth2.googleapis.com/token",
  "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
  "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/test%40podinfo.iam.gserviceaccount.com"
}
```

### Interval

`.spec.interval` is a required field that specifies the interval at which the
object storage bucket must be consulted.

After successfully reconciling a Bucket object, the source-controller requeues
the object for inspection after the specified interval. The value must be in a
[Go recognized duration string format](https://pkg.go.dev/time#ParseDuration),
e.g. `10m0s` to look at the object storage bucket every 10 minutes.

If the `.metadata.generation` of a resource changes (due to e.g. the apply of
a change to the spec), this is handled instantly outside the interval window.

**Note:** The controller can be configured to apply a jitter to the interval in
order to distribute the load more evenly when multiple Bucket objects are set
up with the same interval. For more information, please refer to the
[source-controller configuration options](https://fluxcd.io/flux/components/source/options/).

### Endpoint

`.spec.endpoint` is a required field that specifies the HTTP/S object storage
endpoint to connect to and fetch objects from. Connecting to an (insecure)
HTTP endpoint requires enabling [`.spec.insecure`](#insecure).

Some endpoints require the specification of a [`.spec.region`](#region),
see [Provider](#provider) for more (provider specific) examples.

### Bucket name

`.spec.bucketName` is a required field that specifies which object storage
bucket on the [Endpoint](#endpoint) objects should be fetched from.

See [Provider](#provider) for more (provider specific) examples.

### Region

`.spec.region` is an optional field to specify the region a
[`.spec.bucketName`](#bucket-name) is located in.

See [Provider](#provider) for more (provider specific) examples.
### Insecure

`.spec.insecure` is an optional field to allow connecting to an insecure (HTTP)
[endpoint](#endpoint), if set to `true`. The default value is `false`,
denying insecure (HTTP) connections.

### Timeout

`.spec.timeout` is an optional field to specify a timeout for object storage
fetch operations. The value must be in a
[Go recognized duration string format](https://pkg.go.dev/time#ParseDuration),
e.g. `1m30s` for a timeout of one minute and thirty seconds. The default value
is `60s`.

### Secret reference

`.spec.secretRef.name` is an optional field to specify a name reference to a
Secret in the same namespace as the Bucket, containing authentication
credentials for the object storage. For some `.spec.provider` implementations
the presence of the field is required, see [Provider](#provider) for more
details and examples.

### Ignore

`.spec.ignore` is an optional field to specify rules in [the `.gitignore`
pattern format](https://git-scm.com/docs/gitignore#_pattern_format). Storage
objects whose keys match the defined rules are excluded while fetching.

When specified, `.spec.ignore` overrides the
[default exclusion list](#default-exclusions), and may overrule the
[`.sourceignore` file exclusions](#sourceignore-file). See
[excluding files](#excluding-files) for more information.

### Suspend

`.spec.suspend` is an optional field to suspend the reconciliation of a Bucket.
When set to `true`, the controller will stop reconciling the Bucket, and
changes to the resource or in the object storage bucket will not result in a
new Artifact. When the field is set to `false` or removed, it will resume.

For practical information, see
[suspending and resuming](#suspending-and-resuming).

## Working with Buckets

### Excluding files

By default, storage bucket objects which match the
[default exclusion rules](#default-exclusions) are excluded while fetching. It
is possible to overwrite and/or overrule the default exclusions using a file in
the bucket and/or an in-spec set of rules.

#### `.sourceignore` file

Excluding files is possible by adding a `.sourceignore` file in the root of the
object storage bucket. The `.sourceignore` file follows [the `.gitignore`
pattern format](https://git-scm.com/docs/gitignore#_pattern_format), and
pattern entries may overrule [default exclusions](#default-exclusions).
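For example, a `.sourceignore` file at the bucket root could contain the
following (the patterns are illustrative):

```
# exclude image assets
*.png
*.jpg
# exclude documentation
docs/
```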
#### Ignore spec

Another option is to define the exclusions within the Bucket spec, using the
[`.spec.ignore` field](#ignore). Specified rules override the
[default exclusion list](#default-exclusions), and may overrule `.sourceignore`
file exclusions.

```yaml
---
apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: Bucket
metadata:
  name: <bucket-name>
spec:
  ignore: |
    # exclude all
    /*
    # include deploy dir
    !/deploy
    # exclude file extensions from deploy dir
    /deploy/**/*.md
    /deploy/**/*.txt
```

### Triggering a reconcile

To manually tell the source-controller to reconcile a Bucket outside the
[specified interval window](#interval), a Bucket can be annotated with
`reconcile.fluxcd.io/requestedAt: <arbitrary value>`. Annotating the resource
queues the Bucket for reconciliation if the `<arbitrary value>` differs from
the last value the controller acted on, as reported in
[`.status.lastHandledReconcileAt`](#last-handled-reconcile-at).

Using `kubectl`:

```sh
kubectl annotate --field-manager=flux-client-side-apply --overwrite bucket/<bucket-name> reconcile.fluxcd.io/requestedAt="$(date +%s)"
```

Using `flux`:

```sh
flux reconcile source bucket <bucket-name>
```

### Waiting for `Ready`

When a change is applied, it is possible to wait for the Bucket to reach a
[ready state](#ready-bucket) using `kubectl`:

```sh
kubectl wait bucket/<bucket-name> --for=condition=ready --timeout=1m
```

### Suspending and resuming

When you find yourself in a situation where you temporarily want to pause the
reconciliation of a Bucket, you can suspend it using the
[`.spec.suspend` field](#suspend).

#### Suspend a Bucket

In your YAML declaration:

```yaml
---
apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: Bucket
metadata:
  name: <bucket-name>
spec:
  suspend: true
```

Using `kubectl`:

```sh
kubectl patch bucket <bucket-name> --field-manager=flux-client-side-apply -p '{"spec": {"suspend": true}}'
```

Using `flux`:

```sh
flux suspend source bucket <bucket-name>
```

**Note:** When a Bucket has an Artifact and is suspended, and this Artifact
later disappears from the storage due to e.g. the source-controller Pod being
evicted from a Node, this will not be reflected in the Bucket's Status until it
is resumed.

#### Resume a Bucket

In your YAML declaration, comment out (or remove) the field:

```yaml
---
apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: Bucket
metadata:
  name: <bucket-name>
spec:
  # suspend: true
```

**Note:** Setting the field value to `false` has the same effect as removing
it, but does not allow for "hot patching" using e.g. `kubectl` while practicing
GitOps; as the manually applied patch would be overwritten by the declared
state in Git.

Using `kubectl`:

```sh
kubectl patch bucket <bucket-name> --field-manager=flux-client-side-apply -p '{"spec": {"suspend": false}}'
```

Using `flux`:

```sh
flux resume source bucket <bucket-name>
```

### Debugging a Bucket

There are several ways to gather information about a Bucket for debugging
purposes.

#### Describe the Bucket

Describing a Bucket using `kubectl describe bucket <bucket-name>` displays the
latest recorded information for the resource in the `Status` and `Events`
sections:

```console
...
Status:
...
Conditions:
  Last Transition Time:  2022-02-02T13:26:55Z
  Message:               processing object: new generation 1 -> 2
  Observed Generation:   2
  Reason:                ProgressingWithRetry
  Status:                True
  Type:                  Reconciling
  Last Transition Time:  2022-02-02T13:26:55Z
  Message:               bucket 'my-new-bucket' does not exist
  Observed Generation:   2
  Reason:                BucketOperationFailed
  Status:                False
  Type:                  Ready
  Last Transition Time:  2022-02-02T13:26:55Z
  Message:               bucket 'my-new-bucket' does not exist
  Observed Generation:   2
  Reason:                BucketOperationFailed
  Status:                True
  Type:                  FetchFailed
Observed Generation:     1
URL:                     http://source-controller.source-system.svc.cluster.local./bucket/default/minio-bucket/latest.tar.gz
Events:
  Type     Reason                 Age                 From               Message
  ----     ------                 ----                ----               -------
  Warning  BucketOperationFailed  37s (x11 over 42s)  source-controller  bucket 'my-new-bucket' does not exist
```

#### Trace emitted Events

To view events for specific Bucket(s), `kubectl events` can be used in
combination with `--for` to list the Events for specific objects.
For example, running

```sh
kubectl events --for Bucket/<bucket-name>
```

lists

```console
LAST SEEN   TYPE      REASON                 OBJECT                 MESSAGE
2m30s       Normal    NewArtifact            bucket/<bucket-name>   fetched 16 files with revision from 'my-new-bucket'
36s         Normal    ArtifactUpToDate       bucket/<bucket-name>   artifact up-to-date with remote revision: 'sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855'
18s         Warning   BucketOperationFailed  bucket/<bucket-name>   bucket 'my-new-bucket' does not exist
```

Besides being reported in Events, the reconciliation errors are also logged by
the controller. The Flux CLI offers commands for filtering the logs for a
specific Bucket, e.g. `flux logs --level=error --kind=Bucket --name=<bucket-name>`.

## Bucket Status

### Artifact

The Bucket reports the latest synchronized state from the object storage
bucket as an Artifact object in the `.status.artifact` of the resource.

The Artifact file is a gzip compressed TAR archive (`.tar.gz`), and can be
retrieved in-cluster from the `.status.artifact.url` HTTP address.

#### Artifact example

```yaml
---
apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: Bucket
metadata:
  name: <bucket-name>
status:
  artifact:
    digest: sha256:cbec34947cc2f36dee8adcdd12ee62ca6a8a36699fc6e56f6220385ad5bd421a
    lastUpdateTime: "2022-01-28T10:30:30Z"
    path: bucket/<namespace>/<bucket-name>/c3ab8ff13720e8ad9047dd39466b3c8974e592c2fa383d4a3960714caef0c4f2.tar.gz
    revision: sha256:c3ab8ff13720e8ad9047dd39466b3c8974e592c2fa383d4a3960714caef0c4f2
    size: 38099
    url: http://source-controller.<namespace>.svc.cluster.local./bucket/<namespace>/<bucket-name>/c3ab8ff13720e8ad9047dd39466b3c8974e592c2fa383d4a3960714caef0c4f2.tar.gz
```

#### Default exclusions

The following files and extensions are excluded from the Artifact by default:

- Git files (`.git/, .gitignore, .gitmodules, .gitattributes`)
- File extensions (`.jpg, .jpeg, .gif, .png, .wmv, .flv, .tar.gz, .zip`)
- CI configs (`.github/, .circleci/, .travis.yml, .gitlab-ci.yml, appveyor.yml, .drone.yml, cloudbuild.yaml, codeship-services.yml, codeship-steps.yml`)
- CLI configs (`.goreleaser.yml, .sops.yaml`)
- Flux v1 config (`.flux.yaml`)

To define your own exclusion rules, see [excluding files](#excluding-files).

### Conditions

A Bucket enters various states during its lifecycle, reflected as
[Kubernetes Conditions][typical-status-properties].
It can be [reconciling](#reconciling-bucket) while fetching storage objects,
it can be [ready](#ready-bucket), or it can
[fail during reconciliation](#failed-bucket).

The Bucket API is compatible with the [kstatus specification][kstatus-spec],
and reports `Reconciling` and `Stalled` conditions where applicable to provide
better (timeout) support to solutions polling the Bucket to become `Ready`.
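For instance, tooling (or plain `kubectl`) can poll the `Ready` condition to
determine whether the Bucket has stabilized. A minimal sketch, assuming the
`minio-bucket` example from the start of this document:

```sh
# Print the status of the Ready condition ("True", "False",
# or empty while it is still unknown).
kubectl get bucket/minio-bucket -n default \
  -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}'
```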
#### Reconciling Bucket

The source-controller marks a Bucket as _reconciling_ when one of the following
is true:

- There is no current Artifact for the Bucket, or the reported Artifact is
  determined to have disappeared from the storage.
- The generation of the Bucket is newer than the
  [Observed Generation](#observed-generation).
- The newly calculated Artifact revision differs from the current Artifact.

When the Bucket is "reconciling", the `Ready` Condition status becomes
`Unknown` when the controller detects drift, and the controller adds a
Condition with the following attributes to the Bucket's `.status.conditions`:

- `type: Reconciling`
- `status: "True"`
- `reason: Progressing` | `reason: ProgressingWithRetry`

If the reconciling state is due to a new revision, an additional Condition is
added with the following attributes:

- `type: ArtifactOutdated`
- `status: "True"`
- `reason: NewRevision`

Both Conditions have a ["negative polarity"][typical-status-properties], and
are only present on the Bucket while their status value is `"True"`.

#### Ready Bucket

The source-controller marks a Bucket as _ready_ when it has the following
characteristics:

- The Bucket reports an [Artifact](#artifact).
- The reported Artifact exists in the controller's Artifact storage.
- The Bucket was able to communicate with the Bucket's object storage endpoint
  using the current spec.
- The revision of the reported Artifact is up-to-date with the latest
  calculated revision of the object storage bucket.

When the Bucket is "ready", the controller sets a Condition with the following
attributes in the Bucket's `.status.conditions`:

- `type: Ready`
- `status: "True"`
- `reason: Succeeded`

This `Ready` Condition will retain a status value of `"True"` until the Bucket
is marked as [reconciling](#reconciling-bucket), or e.g. a
[transient error](#failed-bucket) occurs due to a temporary network issue.

When the Bucket Artifact is archived in the controller's Artifact storage, the
controller sets a Condition with the following attributes in the Bucket's
`.status.conditions`:

- `type: ArtifactInStorage`
- `status: "True"`
- `reason: Succeeded`

This `ArtifactInStorage` Condition will retain a status value of `"True"`
until the Artifact in the storage no longer exists.

#### Failed Bucket

The source-controller may get stuck trying to produce an Artifact for a Bucket
without completing. This can occur due to some of the following factors:

- The object storage [Endpoint](#endpoint) is temporarily unavailable.
- The specified object storage bucket does not exist.
- The [Secret reference](#secret-reference) contains a reference to a
  non-existing Secret.
- The credentials in the referenced Secret are invalid.
- The Bucket spec contains a generic misconfiguration.
- A storage related failure when storing the artifact.

When this happens, the controller sets the `Ready` Condition status to `False`,
and adds a Condition with the following attributes to the Bucket's
`.status.conditions`:

- `type: FetchFailed` | `type: StorageOperationFailed`
- `status: "True"`
- `reason: AuthenticationFailed` | `reason: BucketOperationFailed`

This condition has a ["negative polarity"][typical-status-properties], and is
only present on the Bucket while the status value is `"True"`. There may be
more arbitrary values for the `reason` field to provide an accurate reason for
a condition.

While the Bucket has this Condition, the controller will continue to attempt
to produce an Artifact for the resource with an exponential backoff, until it
succeeds and the Bucket is marked as [ready](#ready-bucket).

Note that a Bucket can be [reconciling](#reconciling-bucket) while failing at
the same time, for example due to a newly introduced configuration issue in
the Bucket spec. When a reconciliation fails, the `Reconciling` Condition
reason would be `ProgressingWithRetry`.
When the reconciliation is performed again after the failure, the reason is
updated to `Progressing`.

### Observed Ignore

The source-controller reports an observed ignore in the Bucket's
`.status.observedIgnore`. The observed ignore is the latest `.spec.ignore`
value which resulted in a [ready state](#ready-bucket), or stalled due to an
error it cannot recover from without human intervention. The value is the same
as the [ignore in spec](#ignore). It indicates the ignore rules used in
building the current artifact in storage.

Example:

```yaml
status:
  ...
  observedIgnore: |
    hpa.yaml
    build
  ...
```

### Observed Generation

The source-controller reports an
[observed generation][typical-status-properties] in the Bucket's
`.status.observedGeneration`. The observed generation is the latest
`.metadata.generation` which resulted in either a [ready state](#ready-bucket),
or stalled due to an error it cannot recover from without human intervention.

### Last Handled Reconcile At

The source-controller reports the last `reconcile.fluxcd.io/requestedAt`
annotation value it acted on in the `.status.lastHandledReconcileAt` field.

For practical information about this field, see
[triggering a reconcile](#triggering-a-reconcile).

[typical-status-properties]: https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md#typical-status-properties
[kstatus-spec]: https://github.com/kubernetes-sigs/cli-utils/tree/master/pkg/kstatus
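For example, the field can be compared against the annotation to verify that a
manually requested reconciliation was handled. A minimal `kubectl` sketch,
with the Bucket name as a placeholder:

```sh
# Print the requested and the last handled reconcile values on
# separate lines; they match once the request has been acted on.
kubectl get bucket/<bucket-name> \
  -o jsonpath='{.metadata.annotations.reconcile\.fluxcd\.io/requestedAt}{"\n"}{.status.lastHandledReconcileAt}{"\n"}'
```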