Commit Graph

20 Commits

Author SHA1 Message Date
Sunny 7f7490ebf0
libgit2/managed: fix race issues in ssh transport
Race conditions in the ssh smart subtransport caused some goroutines to
panic, crashing the whole controller. This was most evident in
image-automation-controller CI runs. Panic recovery in the main goroutine
does not handle panics raised in other goroutines, so the existing panic
recovery code in libgit2 Checkout() methods wasn't able to handle them.

This change groups the fields in ssh smart subtransport that may be
accessed by multiple goroutines into a new struct with a mutex. Also
adds panic recovery in the created goroutine to handle any other
possible panics.

Signed-off-by: Sunny <darkowlzz@protonmail.com>
2022-06-03 01:45:18 +05:30
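
A minimal Go sketch of the pattern this commit message describes: shared fields grouped behind a mutex, plus a recover() inside the spawned goroutine. The names connState and runGuarded are hypothetical and are not taken from the actual change.

```go
package main

import (
	"fmt"
	"sync"
)

// connState groups the state that several goroutines may touch.
// The type and its field are hypothetical, chosen for illustration only.
type connState struct {
	mu      sync.Mutex
	lastErr error
}

func (s *connState) setErr(err error) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.lastErr = err
}

func (s *connState) err() error {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.lastErr
}

// runGuarded runs fn in a new goroutine. A recover() in the caller would not
// see a panic raised here, so the goroutine recovers on its own and records
// the panic as an error. The returned channel is closed when fn has finished.
func runGuarded(s *connState, fn func()) <-chan struct{} {
	done := make(chan struct{})
	go func() {
		defer close(done) // runs last, after the recovery below
		defer func() {
			if r := recover(); r != nil {
				s.setErr(fmt.Errorf("recovered from panic: %v", r))
			}
		}()
		fn()
	}()
	return done
}

func main() {
	s := &connState{}
	<-runGuarded(s, func() { panic("boom") })
	fmt.Println(s.err())
}
```

Without the deferred recover inside the goroutine, the panic would terminate the whole process regardless of any recovery code in the caller.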
Paulo Gomes 978148ea71
libgit2: enforce context timeout
Some scenarios could lead to a goroutine running indefinitely within managed ssh.
Previously, because the timeout applied to each of the two git operations
separately, the reconciliation could take twice the timeout set for the Flux object.

Signed-off-by: Paulo Gomes <paulo.gomes@weave.works>
2022-05-27 13:59:50 +01:00
Sanskar Jaiswal 94c50fa3a8 remove support for sha1 and md5 hashing for public keys
Signed-off-by: Sanskar Jaiswal <jaiswalsanskar078@gmail.com>
2022-05-27 14:01:23 +05:30
Sanskar Jaiswal 7d2bc64f47 fix panics on unmanaged http and proxy on managed http
Signed-off-by: Sanskar Jaiswal <jaiswalsanskar078@gmail.com>
2022-05-27 14:01:23 +05:30
Sanskar Jaiswal d4beacb6ad Remove dependency on libgit2 credentials callback
Injects transport and auth options at the transport level directly to
bypass the inbuilt credentials callback because of its several
shortcomings. Moves some of the pre-existing logic from the reconciler
to the checkout implementation.

Signed-off-by: Sanskar Jaiswal <jaiswalsanskar078@gmail.com>
2022-05-27 14:01:23 +05:30
Paulo Gomes ce92881013
libgit2: remove connection caching
Connection caching was a feature created to resolve
upstream issues raised from concurrent ssh connections.
Some scenarios were based on multiple key exchange
operations happening at the same time.

This PR removes the connection caching, and instead:
- Services Session.StdoutPipe() as soon as possible,
  as it is a known source of blocking SSH connections.
- Reuses the SSH connection within the same subtransport,
  eliminating the need for new handshakes when talking
  with the same server.
- Simplifies the entire transport logic for better
  maintainability.

Signed-off-by: Paulo Gomes <paulo.gomes@weave.works>
2022-05-13 11:09:02 +01:00
Paulo Gomes 4e3e62923b
git: Add git.HostKeyAlgos
Enables the setting of HostKey algorithms to be used from
a client perspective. This implementation supports go-git
and libgit2 when in ManagedTransport.

Signed-off-by: Paulo Gomes <paulo.gomes@weave.works>
2022-05-06 17:58:09 +01:00
Paulo Gomes 68eece48fb
libgit2: optimise mutex on cached connections
Previously the mutex.Lock was acquired before creating
a new connection. The lock would then hold until the
process was finished, and all network latency would be
absorbed by other goroutines trying to establish a new
connection.

Now the lock is acquired after the connection has been
created. The downside of this approach is that concurrent
goroutines may be trying to open a connection to the same
target. The loser in the race will then have to Close the
connection and use the winner's instead.

Signed-off-by: Paulo Gomes <paulo.gomes@weave.works>
2022-04-07 19:10:54 +01:00
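
A minimal Go sketch of the locking order described in this commit message: dial without holding the lock, then take the lock only to publish the result, with the loser of a race closing its own connection. The types conn and cache are hypothetical placeholders for the real SSH client and cache.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// conn is a hypothetical stand-in for an SSH client connection.
type conn struct {
	id     int
	closed bool
}

func (c *conn) Close() { c.closed = true }

type cache struct {
	mu    sync.Mutex
	conns map[string]*conn
}

// getOrDial performs the slow dial without holding the lock, so other
// goroutines are not blocked by network latency.
func (c *cache) getOrDial(key string, dial func() *conn) *conn {
	c.mu.Lock()
	existing, ok := c.conns[key]
	c.mu.Unlock()
	if ok {
		return existing
	}

	fresh := dial() // no lock held here

	c.mu.Lock()
	defer c.mu.Unlock()
	if winner, ok := c.conns[key]; ok {
		fresh.Close() // lost the race: discard ours, use the winner's
		return winner
	}
	c.conns[key] = fresh
	return fresh
}

func main() {
	c := &cache{conns: map[string]*conn{}}
	var next int32
	dial := func() *conn { return &conn{id: int(atomic.AddInt32(&next, 1))} }

	results := make([]*conn, 8)
	var wg sync.WaitGroup
	for i := range results {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			results[i] = c.getOrDial("ssh://example.com:22", dial)
		}(i)
	}
	wg.Wait()

	same := true
	for _, r := range results {
		if r != results[0] {
			same = false
		}
	}
	fmt.Println("all goroutines share one connection:", same)
}
```

The trade-off is the one the message names: several goroutines may dial the same target at once, and all but one of those connections are thrown away.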
Paulo Gomes b264a3513d
libgit2: refactor max length values into constants
Signed-off-by: Paulo Gomes <paulo.gomes@weave.works>
2022-04-07 19:10:50 +01:00
Paulo Gomes 3ab95a4bf0
libgit2: close discarded connections
Cached connections can be shared across concurrent
operations, and their disposal must take that into
account to avoid closing a connection that is stale for
one goroutine, but is still valid for another.

Signed-off-by: Paulo Gomes <paulo.gomes@weave.works>
2022-04-07 18:37:40 +01:00
Paulo Gomes add07745f3
libgit2: restrict smart creds to Type SSH Memory
Avoid asking for SSH credential in files, as they won't be
used. The cacheKeyAndConfig func already enforces this
behaviour.

Signed-off-by: Paulo Gomes <paulo.gomes@weave.works>
2022-04-07 18:37:39 +01:00
Paulo Gomes d86f0a280a
libgit2: validate URL max length
The major Git SaaS providers have repository URLs
for both HTTP and SSH that top out at around 250
characters in length.

The limits chosen were a lot higher to align with use
cases in which users may have on-premise servers with
long domain names and paths.

For SSH the validation is around path length only,
which is now limited to 4096 characters. That is
at the higher end of the path length range in Linux.

For HTTP the validation is around the full URL
provided by the caller.

Signed-off-by: Paulo Gomes <paulo.gomes@weave.works>
2022-04-07 18:37:38 +01:00
Paulo Gomes 54d0794d19
libgit2: handle the closing of stale connections
Internal and upstream calls to sshSmartSubtransport.Close()
when dealing with a stale connection may lead to misleading
errors.

Focus should instead be redirected to ensuring that Close()
releases resources and ensures that a new SubTransport can be
created, so new operations can succeed.

Signed-off-by: Paulo Gomes <paulo.gomes@weave.works>
2022-04-07 18:37:38 +01:00
Paulo Gomes 69c3f00172
libgit2: retry on stale connections
SSH servers that block the reuse of SSH connections for
multiple SSH sessions may lead to EOF when a new session
is being created.

This fixes the issue of long-running connections resulting
in EOF for GitLab servers.

Signed-off-by: Paulo Gomes <paulo.gomes@weave.works>
2022-04-07 18:37:37 +01:00
Sanskar Jaiswal 5c84ea7e96 add flag to allow configuration of SSH kex algos
Adds a flag `ssh-kex-algos` which configures the gogit and libgit2
managed clients to use the specified list of kex algos for ssh. If the
flag is not set, the default list in `golang.org/x/crypto/ssh` is used.

Signed-off-by: Sanskar Jaiswal <jaiswalsanskar078@gmail.com>
2022-04-07 16:43:15 +05:30
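
A minimal Go sketch of parsing such a flag with the standard library. Only the flag name `ssh-kex-algos` comes from the commit message; the comma-separated format, the parseKexAlgos helper, and the use of the stdlib flag package are assumptions made for illustration.

```go
package main

import (
	"flag"
	"fmt"
	"strings"
)

// parseKexAlgos reads --ssh-kex-algos as a comma-separated list.
// A nil result means the flag was not set and the SSH library's own
// default key exchange list should be kept.
func parseKexAlgos(args []string) ([]string, error) {
	fs := flag.NewFlagSet("controller", flag.ContinueOnError)
	raw := fs.String("ssh-kex-algos", "", "comma-separated list of SSH key exchange algorithms")
	if err := fs.Parse(args); err != nil {
		return nil, err
	}

	var algos []string
	for _, a := range strings.Split(*raw, ",") {
		if a = strings.TrimSpace(a); a != "" {
			algos = append(algos, a)
		}
	}
	return algos, nil
}

func main() {
	algos, err := parseKexAlgos([]string{"--ssh-kex-algos=curve25519-sha256,ecdh-sha2-nistp256"})
	fmt.Println(algos, err)
}
```

In a client built on `golang.org/x/crypto/ssh`, a non-empty result would typically be assigned to the `KeyExchanges` field of the client configuration.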
Paulo Gomes 36fcdeeb5e
libgit2: fix access to nil t.stdin and improve observability
All errors that were previously not handled are now logged through
traceLog, to further help during transport investigations.

Signed-off-by: Paulo Gomes <paulo.gomes@weave.works>
2022-03-30 14:44:41 +01:00
Paulo Gomes 92ad1f813b
Cache SSH connections
The underlying SSH connections are kept open and are reused
across several SSH sessions. This is due to upstream issues in
which concurrent/parallel SSH connections may lead to instability.

https://github.com/golang/go/issues/51926
https://github.com/golang/go/issues/27140
Signed-off-by: Paulo Gomes <paulo.gomes@weave.works>
2022-03-28 11:58:10 +01:00
Paulo Gomes 017707a71c
Improve managed transport observability
Signed-off-by: Paulo Gomes <paulo.gomes@weave.works>
2022-03-25 19:08:54 +00:00
Paulo Gomes 5091b69ad5
Force ssh.Dial timeout
Signed-off-by: Paulo Gomes <paulo.gomes@weave.works>
2022-03-25 19:08:52 +00:00
Paulo Gomes aa3288112e
Implement Managed Transport for libgit2
libgit2 network operations are blocking and provide neither timeout nor context capabilities,
leading to several reports by users of the controllers hanging indefinitely.

By using managed transport, golang primitives such as http.Transport and net.Dial can be used
to ensure timeouts are enforced.

Co-Authored-by: Sunny <darkowlzz@protonmail.com>
Signed-off-by: Paulo Gomes <paulo.gomes@weave.works>
2022-03-16 16:22:20 +05:30