Commit Graph

23282 Commits

Author SHA1 Message Date
Paul Holzinger a65aecd260
podman volume rm --force: fix ABBA deadlock
We cannot take the volume lock first and then the container locks. Other
code paths always have to lock the container first and then lock the
volumes, i.e. to mount/umount them. As such, locking the volume first can
always result in ABBA deadlocks.

To fix this, move the lock down after the container removal. The removal
code is racy regardless of the lock, as the volume lock on create is no
longer taken since commit 3cc9db8626 due to another deadlock there.

Fixes #23613

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-08-15 11:07:27 +02:00
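
The lock ordering described in this commit can be illustrated with a minimal sketch. The `container` and `volume` types and both functions below are hypothetical stand-ins, not libpod's code. They only show the rule: every path takes the container lock before the volume lock, and the removal path never holds the volume lock while waiting for a container lock.

```go
package main

import (
	"fmt"
	"sync"
)

// container and volume are hypothetical stand-ins for the real objects;
// each one carries its own lock.
type container struct {
	mu   sync.Mutex
	name string
}

type volume struct {
	mu      sync.Mutex
	name    string
	mounted int
}

// mountVolume follows the ordering used by the other code paths:
// container lock first, then volume lock.
func mountVolume(c *container, v *volume) {
	c.mu.Lock()
	defer c.mu.Unlock()
	v.mu.Lock()
	defer v.mu.Unlock()
	v.mounted++
}

// removeVolume handles the dependent containers first, each under its own
// container lock, and takes the volume lock only afterwards. Because it
// never holds the volume lock while waiting for a container lock, it cannot
// form an ABBA cycle with mountVolume.
func removeVolume(v *volume, users []*container) []string {
	removed := make([]string, 0, len(users))
	for _, c := range users {
		c.mu.Lock()
		removed = append(removed, c.name)
		c.mu.Unlock()
	}
	v.mu.Lock()
	defer v.mu.Unlock()
	v.mounted = 0
	return removed
}

func main() {
	c := &container{name: "ctr1"}
	v := &volume{name: "vol1"}
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(2)
		go func() { defer wg.Done(); mountVolume(c, v) }()
		go func() { defer wg.Done(); removeVolume(v, []*container{c}) }()
	}
	wg.Wait()
	fmt.Println("finished without deadlock")
}
```

If `removeVolume` instead took `v.mu` first and then each `c.mu`, a concurrent `mountVolume` holding `c.mu` and waiting for `v.mu` could block forever.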
Paul Holzinger b6beed9f76
test/system: fix network cleanup restart test
Now that on-failure exits right away the test is racy as the
RestartCount is not at the value we expect as the container is still
restarting in the background. As such add a timer based approach.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-08-15 11:07:27 +02:00
Paul Holzinger 30eb6b6aae
libpod: do not stop pod on init ctr exit
Init containers are meant to exit early before other containers are
started. Thus stopping the infra container in such a case is wrong.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-08-15 11:07:27 +02:00
Paul Holzinger 8a943311db
libpod: simplify WaitForExit()
The current code did several complicated state checks that simply do not
work properly on a fast restarting container. It used a special case for
--restart=always but forgot to take care of --restart=on-failure, which
always hung for 20s until it ran into the timeout.

The old logic also used to call CheckConmonRunning() but synced the
state before which means it may check a new conmon every time and thus
misses exits.

To fix this, the new code is much simpler: check the conmon pid, and if it
is no longer running, check the exit file and get the exit code.

This is related to #23473 but I am not sure if this fixes it because we
cannot reproduce.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-08-15 11:07:27 +02:00
openshift-merge-bot[bot] f456b53a0e
Merge pull request #23621 from inknos/podman-machine-fix-for-flake-23505
Fix known_hosts file clogging and remote host id
2024-08-15 08:48:22 +00:00
openshift-merge-bot[bot] 62b953b6c6
Merge pull request #23623 from edsantiago/nuke-buildtime-quay-check
CI: remove build-time quay check
2024-08-14 15:48:29 +00:00
Ed Santiago 5b6de98ee8 CI: remove build-time quay check
CI will fail if quay is down, but a build-time check does not
help us in any way. It just introduces another pain point
where we have to hit the Rerun button.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-08-14 08:17:45 -06:00
Nicola Sella 6b1c7de3d5 Fix known_hosts file clogging and remote host id
By adding the UserKnownHostsFile=/dev/null and CheckHostIP=no options to
the defaults, we prevent the user from adding the host key multiple times
and avoid flakes that can raise Remote Host Id change errors.

Resolves: https://github.com/containers/podman/issues/23505

Signed-off-by: Nicola Sella <nsella@redhat.com>
2024-08-14 15:53:11 +02:00
openshift-merge-bot[bot] 6638337453
Merge pull request #23603 from containers/renovate/github.com-docker-docker-27.x
Update module github.com/docker/docker to v27.1.2+incompatible
2024-08-14 11:57:27 +00:00
openshift-merge-bot[bot] f4c85cab32
Merge pull request #23608 from containers/renovate/docker.io-library-golang-1.x
Update docker.io/library/golang Docker tag to v1.23
2024-08-14 09:01:29 +00:00
openshift-merge-bot[bot] 2f8648277f
Merge pull request #23605 from containers/renovate/setuptools-72.x
Update dependency setuptools to ~=72.2.0
2024-08-14 08:58:43 +00:00
renovate[bot] c4cdb6defa
Update docker.io/library/golang Docker tag to v1.23
Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2024-08-13 22:03:53 +00:00
renovate[bot] 0d1c19248a
Update dependency setuptools to ~=72.2.0
Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2024-08-13 18:05:58 +00:00
openshift-merge-bot[bot] 17baab0bf5
Merge pull request #23561 from Luap99/test-pasta-port
test/system: pasta_test_do add explicit port check
2024-08-13 18:04:58 +00:00
renovate[bot] 9945736a3e
Update module github.com/docker/docker to v27.1.2+incompatible
Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2024-08-13 16:42:32 +00:00
openshift-merge-bot[bot] a4c6bef65f
Merge pull request #23592 from edsantiago/safename-080
CI: 080-pause.bats: make parallel-safe
2024-08-13 10:54:26 +00:00
openshift-merge-bot[bot] 1bf711e526
Merge pull request #23591 from edsantiago/safename-050
CI: 050-stop.bats: make parallel-safe
2024-08-13 10:51:42 +00:00
openshift-merge-bot[bot] d2208baf72
Merge pull request #23594 from edsantiago/safename-220
CI: healthcheck system test: make parallel-safe
2024-08-13 10:48:57 +00:00
openshift-merge-bot[bot] 936455d1a8
Merge pull request #23587 from rhatdan/errors
Additional potential race condition on os.Readdir
2024-08-13 10:04:59 +00:00
openshift-merge-bot[bot] d4ecd574f0
Merge pull request #23585 from ashley-cui/sshkeygen
pkg/machine: Read stderr from ssh-keygen correctly
2024-08-13 10:02:14 +00:00
openshift-merge-bot[bot] c3111c24c1
Merge pull request #23593 from cevich/fix_validate_renovate
[CI:ALL] Fix and validate renovate config
2024-08-12 19:08:03 +00:00
openshift-merge-bot[bot] bd53a11630
Merge pull request #23225 from edsantiago/no-more-ci-docs
pr-should-include-tests: no more CI:DOCS override
2024-08-12 18:46:02 +00:00
Ed Santiago 0d7e14fb83 healthcheck system check: reduce raciness
When will I learn not to dismiss something as "easy"?

Anyhow, this doesn't actually change anything parallel-wise
but it does reduce a race condition seen on heavily-loaded
slow systems, wherein a container goes into unhealthy before
we want it to. This version isn't perfect; I don't think
there's an ideal fix for this.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-08-12 12:24:37 -06:00
Ed Santiago 30ee9c0114 CI: healthcheck system test: make parallel-safe
Easy one, just replace "healthcheck_c"

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-08-12 12:23:54 -06:00
Chris Evich 8f191618e4
Validate renovate config in every PR
Signed-off-by: Chris Evich <cevich@redhat.com>
2024-08-12 14:10:28 -04:00
Ashley Cui 0177f74dc6 pkg/machine: Read stderr from ssh-keygen correctly
Read stderr from ssh-keygen before calling wait(), since cmd.Wait() closes cmd.StderrPipe() after it exits, causing a read-on-closed-pipe error.

Signed-off-by: Ashley Cui <acui@redhat.com>
2024-08-12 14:09:16 -04:00
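
The ordering this commit describes (read stderr to the end, then call `Wait`) can be sketched as follows. The helper name `runAndCaptureStderr` is hypothetical, and `sh` is used only as a stand-in for `ssh-keygen` so the example runs on any POSIX system.

```go
package main

import (
	"fmt"
	"io"
	"os/exec"
)

// runAndCaptureStderr starts the command, drains stderr completely, and only
// then calls Wait. Wait closes the pipe once the command has exited, so
// reading after Wait can fail with a read-on-closed-pipe error.
func runAndCaptureStderr(name string, args ...string) (string, error) {
	cmd := exec.Command(name, args...)
	stderr, err := cmd.StderrPipe()
	if err != nil {
		return "", err
	}
	if err := cmd.Start(); err != nil {
		return "", err
	}
	// Read to EOF before Wait.
	out, readErr := io.ReadAll(stderr)
	waitErr := cmd.Wait()
	if readErr != nil {
		return "", readErr
	}
	return string(out), waitErr
}

func main() {
	// Assumption: a POSIX shell is available as "sh".
	msg, err := runAndCaptureStderr("sh", "-c", "echo 'key generation failed' >&2; exit 1")
	fmt.Printf("stderr=%q err=%v\n", msg, err)
}
```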
Chris Evich e30b0978b8
Fix renovate config syntax error
Signed-off-by: Chris Evich <cevich@redhat.com>
2024-08-12 14:05:28 -04:00
Ed Santiago 36f9a04499 CI: 080-pause.bats: make parallel-safe
Only one test can be parallelized. Do so, and add a comment
to the other one explaining why it can't be.

Also, add some missing error-message checks.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-08-12 12:05:27 -06:00
Ed Santiago 6656a18c3f CI: 050-stop.bats: make parallel-safe
Very few changes needed, all of them simple.

It is impossible to parallelize this entire file, because "stop -a".
Add tags to tests that can be parallelized, and comments to those
that can't.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-08-12 12:00:09 -06:00
openshift-merge-bot[bot] 6738405d59
Merge pull request #23581 from Luap99/remote-ignore
remote: fix invalid --cidfile + --ignore
2024-08-12 16:13:30 +00:00
openshift-merge-bot[bot] 8f85a4da43
Merge pull request #23584 from rhatdan/error
Fix race condition when listing /dev
2024-08-12 15:48:25 +00:00
Daniel J Walsh 25d66d97d2
Additional potential race condition on os.Readdir
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2024-08-12 11:38:02 -04:00
openshift-merge-bot[bot] 4f2d98f228
Merge pull request #23564 from cevich/renovate_manage_requirements
[skip-ci] Maintain renovate configuration
2024-08-12 15:34:40 +00:00
Paul Holzinger 5ec413fac7
pkg/bindings/containers: handle ignore for stop
When the client gets a 404 back we know the container does not exist;
if ignore is set as well we should just ignore the error client side.

seen in #23554

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-08-12 17:12:25 +02:00
Paul Holzinger 6fce734f42
remote: fix invalid --cidfile + --ignore
When the cidfile does not exist and ignore is set the cli parser skips
the file without error and we call into the backend code without any
names at all. This should logically be a NOP but on remote it caused all
containers to be returned which caused podman stop to stop everything in
this case.

Fixes #23554

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-08-12 17:12:12 +02:00
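
The failure mode in this commit, where an empty name list is treated as "everything", can be illustrated with a small hypothetical selector. `selectContainers` is not a podman function. It only shows the intended behaviour: selecting all containers requires an explicit flag, and an empty name list is a NOP.

```go
package main

import "fmt"

// selectContainers is a hypothetical selector. Selecting everything must be
// requested explicitly via all; an empty names list selects nothing.
func selectContainers(existing, names []string, all bool) []string {
	if all {
		return append([]string(nil), existing...)
	}
	wanted := make(map[string]bool, len(names))
	for _, n := range names {
		wanted[n] = true
	}
	var out []string
	for _, e := range existing {
		if wanted[e] {
			out = append(out, e)
		}
	}
	return out
}

func main() {
	existing := []string{"web", "db"}
	// A skipped cidfile leaves the name list empty: nothing should be stopped.
	fmt.Println(len(selectContainers(existing, nil, false)))
	// Only an explicit "all" request returns every container.
	fmt.Println(len(selectContainers(existing, nil, true)))
}
```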
Chris Evich e111b6c0be
Update/simplify renovate config header comment
The previous comment included way too many details.  It also referenced
a docker-hub container image which is not accessible under all
circumstances.  Switch to the GitHub container registry and include
mention of the pre-commit hook that's available.

Signed-off-by: Chris Evich <cevich@redhat.com>
2024-08-12 11:08:12 -04:00
Chris Evich 6c0b8b64d4
Migrate renovate config to latest schema
The main change is a global "packageRules" config that encompasses all
rules instead of configuring them as options to a manager.

Signed-off-by: Chris Evich <cevich@redhat.com>
2024-08-12 11:08:11 -04:00
Daniel J Walsh d33abcdf10
Fix race condition when listing /dev
Also replace os.IsNotExist(err) with errors.Is(err, fs.ErrNotExist)

Fixes: https://github.com/containers/podman/issues/23582

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2024-08-12 10:28:01 -04:00
openshift-merge-bot[bot] 708d6c5e2b
Merge pull request #23449 from ygalblum/quadlet-override-service-name
Quadlet override service name
2024-08-12 13:56:48 +00:00
openshift-merge-bot[bot] 7acaf714ca
Merge pull request #23496 from rhatdan/manifest
Should not force conversion of manifest type to DockerV2ListMediaType
2024-08-12 13:36:30 +00:00
openshift-merge-bot[bot] 594f01315b
Merge pull request #23485 from cgwalters/doc-quadlet-exec-more
[ci:docs] docs/podman-systemd: Try to clarify `Exec=` more
2024-08-12 13:28:34 +00:00
openshift-merge-bot[bot] 6ef3a2347a
Merge pull request #23577 from Luap99/save-error
libpod: fix broken saveContainerError()
2024-08-12 13:22:42 +00:00
Colin Walters d26341332c docs/podman-systemd: Try to clarify `Exec=` more
In podman-systemd we are intersecting the worlds of containers
and systemd, and I had to stop and think to understand what
`Exec=` does.

I tried to clarify things more here.

I found it especially confusing because the example at the
very top of the file does:

```
Image=quay.io/fedora/fedora
Exec=sleep 10
```

But that only makes sense because the fedora base image
(being generic) doesn't define an `ENTRYPOINT`, just a `CMD`.

But IMO by far the most common usage for podman-systemd
is "app images" which conventionally should use `ENTRYPOINT`
in general. Maybe we should change the default example,
but I'm leaving that for a later followup.

(It perhaps would have been less confusing if this field
 had been called `Args=` to make clear it's quite different
 in practice from systemd `ExecStart=`)

Signed-off-by: Colin Walters <walters@verbum.org>
2024-08-12 09:03:57 -04:00
openshift-merge-bot[bot] 40df14012b
Merge pull request #23569 from emersion/patch-1
[CI:DOCS] readme: replace GPG with PGP
2024-08-12 12:55:14 +00:00
Paul Holzinger ecf88f17b6
libpod: reset state error on init
If we manage to init/start a container successfully we should unset any
previously stored state errors. Otherwise a user might be confused why
there is an error in the state about some old error even though the
container works/runs.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-08-12 14:30:48 +02:00
openshift-merge-bot[bot] 52fe111b51
Merge pull request #23562 from cevich/rm_docker_py_dupe
De-duplicate docker-py testing
2024-08-12 12:05:41 +00:00
Paul Holzinger 20f3e8909e
test/system: pasta_test_do add explicit port check
Do not rely on an arbitrary delay in order to ensure the port was bound
in the container. Instead this approach checks if the port is bound in
the netns and only then starts the client. This speeds up the entire
test file by 50% but more importantly in parallel testing it solves
hangs as the timeout there was unreliable.

Fixes #23471

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-08-12 13:46:56 +02:00
Paul Holzinger 78cb1e28cb
libpod: do not save expected stop errors in ctr state
If we try to stop a container that is not running or paused we get an
ErrCtrStateInvalid or ErrCtrStopped error. As podman stop is idempotent
this is not a user visible error at all so we should also never log it
in the container state.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-08-12 12:09:01 +02:00
Paul Holzinger f276d53532
libpod: fix broken saveContainerError()
We cannot unlock then lock again without syncing the state as this will
then save a potentially old state causing very bad things, such as
double netns cleanup issues.

The fix here is simple: move the saveContainerError() under the same
lock. The comment about the re-lock is just wrong. Not doing this under
the same lock would cause us to update the error after something else
changed the container already.

Most likely this was caused by a misunderstanding of how Go defers work.
Given they run Last In - First Out (LIFO) it is safe as long as our
defer function is after the defer unlock() call.

I think this issue is very bad and might have caused a variety of other
weird flakes. In fact I am confident that this fixes the double cleanup
errors.

Fixes #21569
Also fixes the netns removal ENOENT issues seen in #19721.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-08-12 11:19:47 +02:00
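
The LIFO point made in this commit can be checked with a minimal sketch. The `ctr` type and its `stop` method are hypothetical and only mirror the ordering described: the unlock is deferred first and the error-saving function is deferred after it, so the save runs first, while the lock is still held.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ctr is a hypothetical container with a lock, a saved error, and a log of
// the order in which the deferred functions ran.
type ctr struct {
	mu       sync.Mutex
	savedErr error
	log      []string
}

// stop registers the unlock first and the error-saving defer second.
// Deferred calls run last in, first out, so the error is saved before the
// lock is released.
func (c *ctr) stop(fail bool) (retErr error) {
	c.mu.Lock()
	defer func() {
		c.log = append(c.log, "unlock")
		c.mu.Unlock()
	}()
	defer func() {
		if retErr != nil {
			c.savedErr = retErr // still under c.mu
			c.log = append(c.log, "save error")
		}
	}()

	if fail {
		return errors.New("stop failed")
	}
	return nil
}

func main() {
	c := &ctr{}
	err := c.stop(true)
	fmt.Println(err, c.log)
}
```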
openshift-merge-bot[bot] 277e061878
Merge pull request #23498 from lelemka0/fix/quadlets/userLevelFilter
Quadlet: Fix `userLevelFilter` when `UnitDirAdmin` is a symlink
2024-08-11 13:43:34 +00:00