libgit2 network operations are blocking and do not provide timeout nor context capabilities,
leading for several reports by users of the controllers hanging indefinitely.
By using managed transport, golang primitives such as http.Transport and net.Dial can be used
to ensure timeouts are enforced.
Co-Authored-by: Sunny <darkowlzz@protonmail.com>
Signed-off-by: Paulo Gomes <paulo.gomes@weave.works>
This ensures the Managed Identity authentication works with multiple
identities assigned to a single node.
Signed-off-by: Hidde Beydals <hello@hidde.co>
- `authorityHost` and `clientCertificateSendChain` can now be set where
applicable.
- AZ CLI fields have been removed.
- Fallback to `ChainedTokenCredential` with `EnvironmentCredential` and
`ManagedIdentityCredential` with defaults if no Secret is given.
Signed-off-by: Hidde Beydals <hello@hidde.co>
Based on recommendations from Microsoft, change the order valid
authentication options are taken into account. Mainly to ensure it works
as expected when multiple Managed Identities are bound on the same VM
node.
Signed-off-by: Hidde Beydals <hello@hidde.co>
This commit allows for a Secret to be configured with `tenantId`,
`clientId` and `clientCertificate` data fields (with optionally
`clientCertificatePassword`) to authenticate using TLS.
Signed-off-by: Hidde Beydals <hello@hidde.co>
This commit introduces an Azure Blob BucketProvider implementation,
capable of fetching from objects from public and private "container"
buckets.
The supported credential types are:
- ManagedIdentity with a `resourceId` Secret data field.
- ManagedIdentity with a `clientId` Secret data field.
- ClientSecret with `tenantId`, `clientId` and `clientSecret` Secret
data fields.
- SharedKey with `accountKey` Secret data field, the Account Name is
extracted from the endpoint URL specified on the object.
If no Secret is provided, the Bucket is assumed to be public.
Co-authored-by: Zhongcheng Lao <Zhongcheng.Lao@microsoft.com>
Signed-off-by: Hidde Beydals <hello@hidde.co>
This commit introduces a BucketProvider interface for fetch operations
against object storage provider buckets. Allowing for easier
introduction of new provider implementations.
The algorithm for conditionally downloading object files is the same,
whether you are using GCP storage or an S3/Minio-compatible
bucket. The only thing that differs is how the respective clients
handle enumerating through the objects in the bucket; by implementing
just that in each provider, I can have the select-and-fetch code in
once place.
The client implementations do now include safe-guards to ensure the
fetched object is the same as metadata has been collected for. In
addition, minor changes have been made to the object fetch operation
to take into account that:
- Etags can change between composition of index and actual fetch, in
which case the etag is now updated.
- Objects can disappear between composition of index and actual fetch,
in which case the item is removed from the index.
Lastly, the requirement for authentication has been removed (and not
referring to a Secret at all is thus allowed), to provide support
for e.g. public buckets.
Co-authored-by: Hidde Beydals <hello@hidde.co>
Co-authored by: Michael Bridgen <michael@weave.works>
Signed-off-by: pa250194 <pa250194@ncr.com>
- Introduce mock GCP Server to test the gcp bucket client against mocked
gcp server results.
- Add tests for reconcileGCPSource().
- Patch GCPClient.BucketExists() to return no error when the bucket
doesn't exists. This keeps the GCP client compatible with the minio
client.
Signed-off-by: Sunny <darkowlzz@protonmail.com>
NOTE: This should be amended with the previous commit which has
commented out tests.
Update reconcileSource() to work with the test case where no secret is
set. A minimal auth options is created and used for git checkout.
Update TestGitRepositoryReconciler_verifyCommitSignature() to use the
new git.Commit type.
Update TestGitRepositoryReconciler_reconcileSource_checkoutStrategy to
add skipForImplementation for branch commit test case.
Signed-off-by: Sunny <darkowlzz@protonmail.com>
If there is no configuration set for `init.defaultBranch`, it does not
return an error but an empty string. We now take this into account so
we do not overwrite the default, and make the default `master` to match
with libgit2 defaults.
In addition, some comments have been added to not get confused about
what commits we are checking against.
Signed-off-by: Hidde Beydals <hello@hidde.co>
In the recent update from libgit2 1.1.x to 1.3.x, something seems to
have changed upstream. Resulting in the clone of a branch ending up
with a semi-bare file system state (in other words: without any files
present in the directory).
This commit patches the clone behavior to set the `CheckoutForce`
strategy as `CheckoutOption`, which mitigates the issue.
In addition, test cases have been added to ensure we do not run into
this again by asserting the state of the branch after cloning.
Signed-off-by: Hidde Beydals <hello@hidde.co>
This adds a test to detect any regression in libgit2's ED25519 key
support. go-git supports ED25519 but not the current version of
libgit2 used in flux. The updates to libgit2 in v1.2.0 adds support
for ED25519. This test would help ensure the right version of libgit2
is used.
Signed-off-by: Sunny <darkowlzz@protonmail.com>
Signed-off-by: Paulo Gomes <paulo.gomes@weave.works>
Downstream breaking changes introduced since git2go@V31:
- git2go.ErrorCode was deprecated in favour of the native error type.
- FetchOptions no longer expects a pointer, but rather the actual value of git2go.FetchOptions.
Signed-off-by: Paulo Gomes <paulo.gomes@weave.works>
go-git: Include the commit message in the returned commit object.
libgit2: Set the URL in the checkout error.
Add new method Commit.ShortMessage() for returning short commit
message.
Signed-off-by: Sunny <darkowlzz@protonmail.com>
This configures ProxyOptions for all libgit2 Checkout functions when
cloning and configures the options based on current environment
settings using the git2go.ProxyTypeAuto option.
Refs: #131
Signed-off-by: Robert Clarke <rob@robertandrewclarke.com>
Co-authored-by: Aurélien GARNIER <aurelien.garnier@atos.net>
- Ensure the proper path is garbage collected for libgit2 repositories,
as the `Path` method on the repository object returns the `.git`
directory, and not the root path.
- Ensure the Helm test server does not get swapped during tests,
with as side-effect that no obsolete temporary directories remain.
Signed-off-by: Hidde Beydals <hello@hidde.co>
- Adds tests for the libgit2 remote callbacks
- Adds tests for CheckoutStrategyForImplementation with context timeout
and verify timeout is respected by both the git implementations.
Signed-off-by: Sunny <darkowlzz@protonmail.com>
Add SidebandProgressCallback to be able to cancel the network operation
before any transfer operation.
Add PushTransferProgressCallback to be able to cancel the push transfer
operation.
Signed-off-by: Sunny <darkowlzz@protonmail.com>
In transferProgressCallback(), if the received objects is equal to the
total objects, return early with OK.
Signed-off-by: Sunny <darkowlzz@protonmail.com>
With the information from the refactor still fresh in mind, I continue
to find new paths now I mentally tamed the git2go beast.
`libgit2` seems to assume that a transport will eventually tell by
itself that it has timed out. This also means that at present any
timeout configuration does not seem have an effect. It will continue
to transfer until the remote (or _something_ else) tells it is no
longer transfering.
This commit introduces a simple check (without tests) which was used
to confirm the theory in combination with the tests in
`pkg/git/strategy` (by setting it to a very low timeout and observing
it fail).
A future iteration should probably take the data given to the callback
into account to ensure it doesn't error out if the given data[1]
reports it has successfully received all objects. Another candidate
for this check may be `CompletionCallback`, but one should study the
C code (and likely some Go code as well) before this.
In addition, to ensure the same timeout is taken into account for push
operations, `PushTransferProgressCallback` may require a likewise
helper.
[1]: https://github.com/libgit2/git2go/blob/main/remote.go#L50-L58
Signed-off-by: Hidde Beydals <hello@hidde.co>
parseKnownHosts() uses golang.org/x/crypto/ssh's ParseKnownHosts() for
parsing known hosts. It returns EOF error when the input is not a host
public key, but a valid known_hosts content, like a comment line.
With this fix, lines causing EOF error are skipped and the parsing of
the known_hosts file continues. But invalid lines still cause parsing
failure.
Signed-off-by: Sunny <darkowlzz@protonmail.com>
Adds tests for git.CheckoutStrategy to check if both the git
implementations follow the same SemVer tag selection rules.
Signed-off-by: Sunny <darkowlzz@protonmail.com>
This changes the logic of `credentialsCallback` so that it takes the
`allowedTypes` passed on by `git2go` into account. Reason for this
change is because this prepares it to work with `v33`, but also
because it can provide better guidance when `libgit2` has been
compiled with a different configuration, which e.g. doesn't
allow for "in-memory SSH keys".
Because `AuthOptions#Identity` now gets validated by the callback
and go-git does its own validaiton, the check has been removed
from `Validate` (and now does a simple check if the fields are set).
Signed-off-by: Hidde Beydals <hello@hidde.co>
Adds tests for git.CheckoutStrategy to check if both the git
implementations work with all the authentication methods.
Signed-off-by: Sunny <darkowlzz@protonmail.com>
Main requirement for this is the image-automation-controller
depending on being able to get a working auth configuration.
Once the package is moved, we should add push logic to it,
so that the controller is able to use that instead.
Signed-off-by: Hidde Beydals <hello@hidde.co>
This commit changes the `gogit` behavior for commit checkouts,
now allowing one to reference to just a commit while omitting any
branch reference. Doing this creates an Artifact with a
`HEAD/<commit>` revision.
If both a `branch` and `commit` are defined, the commit is expected
to exist within the branch. This results in a more efficient clone
of just the target branch, and also makes this change backwards
compatible.
Fixes#407Fixes#315
Signed-off-by: Hidde Beydals <hello@hidde.co>
This commit refactors the previous `Commit` interface into a
standardised `Commit` struct. This object contains sufficient
information for referencing, observating and (PGP) verification.
- `libgit2` commit checkout does now return `HEAD/<SHA1>` as
the branch is not taken into account.
- `git2go` objects are now properly `Free`d everywhere
- `Verify` logic is tested.
Signed-off-by: Hidde Beydals <hello@hidde.co>
This commit moves the previous `AuthStrategy` wiring to a more generic
`AuthOptions`, breaking free from implementation specific details in
the `git` package.
Signed-off-by: Hidde Beydals <hello@hidde.co>
This commit ensures most of the `git2go` objects `Free` themselves from
the underlying C object.
Ensuring all objects are freed is not possible yet, due to the way
commits are wired in to facilitate verification later on. In a later
follow up, we should change this and e.g. validate as part of the
checkout process, and move the implementation specific authentication
configuration from `git` into `libgit2`.
Signed-off-by: Hidde Beydals <hello@hidde.co>
This commit is a follow up on 4dc3185c5f
and adds tests for the remaining checkout strategies, while
consolidating some of the logic.
The consolidated logic ensures that (SemVer) tag and commit checkouts
happen using the same "checkout detached HEAD" logic.
The branch checkout is left unmodified, and simply checks out at the
current HEAD of the given branch.
Signed-off-by: Hidde Beydals <hello@hidde.co>
In d0560e5 the SemVer implementations were aligned, and the logic was
simplified a bit (or so I thought). This did however result in the
introduction of a regression, as it failed to take "simple tags" into
account.
This commit ensures both are taken into account again, and ensures it
is now covered by a proper test.
Signed-off-by: Hidde Beydals <hello@hidde.co>
This commit updates `github.com/libgit2/git2go` to `v31.6.1` (with
`libgit2` `1.1.1`), and changes the container image build process so
that it makes use of `ghcr.io/hiddeco/golang-with-libgit2`.
This image provides a pre-build dynamic `libgit2` dependency linked
against OpenSSL and LibSSH2 (without gcrypt), and a set of cross-compile
build tools (see
[rationale](https://github.com/hiddeco/golang-with-libgit2#rationale) and
[usage](https://github.co/hiddeco/golang-with-libgit2#usage) for more
detailed information).
The linked set of dependency should solve most known issues around
unsupport private key types, but does not resolve the issues with ECDSA*
and ED25519 hostkeys yet. Solving this requires a newer version of
`libgit2` (`>=1.2.0`), which currently does not seem to work properly
with `git2go/v32`.
Some small changes have been made to the `libgit2` package to address
(future) deprecations.
Signed-off-by: Hidde Beydals <hello@hidde.co>
Added Support for Google Cloud Storage with Workload Identity as Source Provider. This enables the use of GCP without enabling S3 compatible access.
Signed-off-by: pa250194 <pa250194@ncr.com>
SetHeadDetached (git_repository_set_head_detached) only changes HEAD,
and does not actually checkout the files on disk. Use CheckoutHead with
the CheckoutForce Strategy to actually check the files out on disk.
Additionally add a test that validates the hash of a checked out file's
contents.
Previously, the hash of the desired tag was being reported as the
checked out revision by the GitRepository. However the wrong files were
checked out and an incorrect revision would be deployed by Flux.
Signed-off-by: Blake Burkhart <blake.burkhart@us.af.mil>
This commit makes the filtering applied during the archiving
configurable by introducing an optional `ArchiveFileFilter`
callback argument and a `SourceIgnoreFilter` implementation.
`SourceIgnoreFilter` filters out files matching
sourceignore.VCSPatterns and any of the provided patterns.
If an empty gitignore.Pattern slice is given, the matcher is set to
sourceignore.NewDefaultMatcher.
The `GitRepository` now loads the ignore patterns before archiving
the repository contents by calling `sourceignore.LoadIgnorePatterns`
and other helpers. The loading behavior is **breaking** as
`.sourceignore` files in the (subdirectories of the) repository are
now still taken into account if `spec.ignore` for a resource is
defined, overwriting is still possible by creating an overwriting
rule in the `spec.ignore` of the resource.
This change also makes it possible for the `BucketReconciler` to not
configure a callback at all and prevent looking for ignore
matches twice. To finalize the bucket refactor, a change to the
reconciler has been made to look for a `.sourceignore` file in
the root of the bucket to provide an additional way of configuring
(global) exclusions.
Signed-off-by: Hidde Beydals <hello@hidde.co>
In some circumstances (that are rather hard to reproduce), cloning
from a GitLab repo gets a multiline response as described in
https://github.com/fluxcd/image-automation-controller/pull/115.
This uses the same remedy as in that PR, by calling the funcs provided
by fluxcd/pkg/gitutil on any error returned by libgit2 or gogit clone
operations.
Signed-off-by: Michael Bridgen <mikeb@squaremobius.net>
The callback from libgit2 only provides a hostname (without the port),
but the `known_hosts` file indexes the public keys based on the full
host (e.g. `[localhost]:123` for a host behind a specific port).
As a result, it was unable to find the correct public key for the
hostname when it was added to the `known_hosts` file with the port.
To work around this, we add the user provided host that includes the
port to the `PublicKeyAuth` strategy, and use this to find the right
entry in the `known_hosts` file, after having validated that the
hostname provided to the callback matches the hostname of the host
provided by the user.
Signed-off-by: Hidde Beydals <hello@hidde.co>
We had a hardcoded assumption that the SSH user for a Git repository is
always "git". This is however not true in all scenarios, for example
when one is making use of Gerrit for team code collaboration, as users
there have their own username for (SSH) Git operations.
This commit changes the logic of the auth strategy helpers to:
1. Select the auth strategy based on the protocol of the parsed URL,
instead of a simple rely on a correct prefix.
2. Use the user information from the parsed URL to configure the user
for the public key authentication strategy, with a fallback to `git`
if none is defined.
Signed-off-by: Hidde Beydals <hello@hidde.co>
As this will result in a checkout failure when the default branch on the
remote is not `master`. Surfaced due to Contour switching from `master` to
`main` overnight.