When we check whether source code was changed, also include header files.
There is only one header file currently, but that can change, and
changes in this file may break things, so make sure it is considered
source code so that all tests are triggered.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
It seems that if background tasks are queued in libpod's Runtime before the worker's channel is set up (e.g. in the refresh phase), they are never executed, but the workerGroup's counter is still incremented. This causes podman to hang when the imageEngine is shut down, since it waits for the workerGroup to be done.
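The failure mode can be reproduced in isolation. The following is a minimal sketch, not libpod's actual code: the `queue` type and its `setup`, `start`, `enqueue` and `drained` methods are hypothetical names chosen for illustration. It assumes the job is handed over by a send on the worker channel, so a job enqueued while the channel is still nil is counted but never delivered.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// queue is a hypothetical stand-in for a runtime with a background worker.
type queue struct {
	jobs chan func() // nil until setup() is called
	wg   sync.WaitGroup
}

// setup creates the worker channel.
func (q *queue) setup() { q.jobs = make(chan func(), 10) }

// start launches the worker that runs jobs and ticks the counter down.
func (q *queue) start() {
	go func() {
		for job := range q.jobs {
			job()
			q.wg.Done()
		}
	}()
}

// enqueue counts the job first and then hands it over asynchronously.
// The channel is captured at call time: if it is still nil, the send
// blocks forever and the job never runs, yet the counter stays raised.
func (q *queue) enqueue(job func()) {
	q.wg.Add(1)
	ch := q.jobs
	go func() { ch <- job }()
}

// drained reports whether all counted jobs finished within the timeout.
func (q *queue) drained(timeout time.Duration) bool {
	done := make(chan struct{})
	go func() {
		q.wg.Wait()
		close(done)
	}()
	select {
	case <-done:
		return true
	case <-time.After(timeout):
		return false
	}
}

// run enqueues one job before or after setup() and reports whether a
// shutdown that waits for the counter would complete.
func run(setupFirst bool) bool {
	q := &queue{}
	if setupFirst {
		q.setup()
	}
	q.enqueue(func() {})
	if !setupFirst {
		q.setup()
	}
	q.start()
	return q.drained(200 * time.Millisecond)
}

func main() {
	fmt.Println("enqueued before setup, drained:", run(false))
	fmt.Println("enqueued after setup, drained:", run(true))
}
```

In this sketch, creating the channel before anything can enqueue work is enough to keep the counter and the executed jobs in sync.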
Fixes containers/podman#22984
Signed-off-by: Farya Maerten <me@ltow.me>
We should never try to reexec when we are already root with
CAP_SYS_ADMIN. The code contained a bug when --cgroups=disabled is used
as it tried to perform a reexec even when it was not needed.
Fixes: 900e29549a ("libpod: do not move podman with --cgroups=disabled")
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
The code currently tries to avoid joining the userns from conmon
directly and instead joins only to read the pid file and then sends this
back to us so we can join the userns. From the comment this was done
because we could not read the pid file. However this is no longer true
as of commit 49eb5af301 and the file is now always owned by the real user.
This means we can just remove this special logic and join the namespace
directly there. A test has been added to check the rejoin logic with a
custom uidmapping.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
This directory contains important tools such as ginkgo; as such, updates
there should run through all testing and not skip anything.
Technically we do not need to run system tests as they don't use any
tool from there but that
a) might change in the future and
b) would make the only_if rules much more complicated if we try to
exclude it and
c) updates in test/tools are rare and/or automated so it does not cause
inconveniences to run all anyway
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
The chocolatey tool that was fetching us wix v3 can no longer be used to
fetch wix v4+ so we had to switch to dotnet to fetch the latest wix.
This commit builds the installer with wix v5.
wix v5 is installed via the `dotnet` tool in the windows image itself
at https://github.com/containers/automation_images/pull/354.
Going forward, the `dotnet` tool will also be used to build the installer.
In the process, the wix v3 files were converted to wix v4+ using `wix
convert`, followed by manual modifications along with a switch to wixproj
builds with dotnet.
The GitHub Action to upload windows installer now builds the installer
using winmake.ps1.
Contributions from Mario Loriedo:
- bundle setup update to wix5
- updates to build and release process scripts
Ref: https://github.com/lsm5/podman/pull/3
- small fixes to windows installer theme
Ref: https://github.com/lsm5/podman/pull/4
- Better win-installer sidebar logo
Ref: https://github.com/lsm5/podman/pull/5
Resolves: RUN-2055
Co-authored-by: Mario Loriedo <mario.loriedo@gmail.com>
Signed-off-by: Lokesh Mandvekar <lsm5@redhat.com>
In case of timeouts, actually log the command again and make sure to send
SIGABRT to the process, as Go will then print a useful stack trace where we
can see where things are hanging. It also kills the process, unlike the
default Eventually().Should(Exit()) call that leaves the process around.
The output will be captured by default in the log so we just see the
stack trace there.
And while at it, bump the timeout up to 10 mins. We are hitting hard
flakes in CI where machine init takes longer than 5 mins for unknown
reasons, but this seems to be good enough.
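The timeout handling described above could look roughly like the sketch below. It is not the test suite's actual helper: `runWithTimeout` and `shortTimeout` are hypothetical names, the code is Unix-only because it uses `syscall.SIGABRT`, and it assumes the child exits once it receives the signal (a Go binary additionally dumps its goroutine stacks to stderr).

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"syscall"
	"time"
)

// shortTimeout is an arbitrary value used for the demonstration only.
const shortTimeout = 200 * time.Millisecond

// runWithTimeout starts the command and waits for it to exit. On timeout
// it logs the command line again, sends SIGABRT and reaps the process so
// nothing is left running. It returns true if the timeout was hit.
func runWithTimeout(timeout time.Duration, name string, args ...string) (bool, error) {
	cmd := exec.Command(name, args...)
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr // a Go child writes its stack trace here
	if err := cmd.Start(); err != nil {
		return false, err
	}

	done := make(chan error, 1)
	go func() { done <- cmd.Wait() }()

	select {
	case err := <-done:
		return false, err
	case <-time.After(timeout):
		fmt.Fprintf(os.Stderr, "timed out after %s: %s\n", timeout, cmd.String())
		if err := cmd.Process.Signal(syscall.SIGABRT); err != nil {
			return true, err
		}
		<-done // wait for the aborted process to be reaped
		return true, nil
	}
}

func main() {
	timedOut, err := runWithTimeout(shortTimeout, "sleep", "5")
	fmt.Println("timed out:", timedOut, "err:", err)
}
```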
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
I am seeing a weird flake in my parallel system test PR. The issue is
that system units generated by podman systemd generate leave a container
in the Removing state behind.
As far as I can tell the problem seems to be that the cleanup process
is killed while it tries to remove the container from the db. Because
the cidfile was removed beforehand, the ExecStopPost=podman rm ... process
no longer had access to the cidfile and reported no error because it runs
with --ignore.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
- fix test name to reflect that it's not pasta-only
(followup from #21563)
- in one podman-update test run in OpenQA, defer assertion
failures so we can gather better data on regressions.
This would've been helpful in diagnosing bz2281805.
- add an error-message check to one test that needed it
(found by accident)
- add distro-integration test tag to a handful of new tests,
so they run in OpenQA. Found via 'git diff 33891e8 test/system'
and scanning for '^\+@test '. I only added tests that IMO
have some risk of interacting poorly with kernel or systemd
updates, e.g. quadlet, modules, tmpfs+noswap.
Signed-off-by: Ed Santiago <santiago@redhat.com>
As we want to get rid of the special titles, convert the existing skips
to the only_if condition. This makes it more readable as we do not need
to negate so much.
Then add similar conditions for all test tasks. This removes the need for
a special title such as CI:DOCS as the logic is smart enough to detect
docs-only changes, i.e. when no source code was changed.
Update the documentation for the new logic and no longer point
contributors to the CI:DOCS title as it is gone now.
There is a bunch of duplication in the rules as yaml doesn't allow us to
share only parts of a string. To prevent unwanted drift a test case in
contrib/cirrus/cirrus_yaml_test.py is added to ensure all conditions
follow the same base ruleset.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
The events code makes use of two channels, one for the events and one
for the resulting error. Then in the main file we have a loop reading
from both channels that should exit on the first error it gets.
However, in case the event channel is closed before the error channel
contains the error, it could cause an early exit as it looked like all
events were done. Commit c46884aa93 fixed that somewhat by checking for
an error in the error channel before exiting. This however was still
racy as it added a default case in the select, which means the channel
check is non-blocking. Thus the error might not yet have been sent into
the channel.
To fix this we should make it a blocking read to wait for the error in
the channel. Also the err != nil check can be removed as we either
return err or nil anyway.
And as a last step make sure the error channel is closed; that prevents us
from blocking forever in case the main select already processed the nil
error.
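The pattern described above can be sketched as follows. This is an illustration under assumptions rather than the actual events code: `produce`, `consume` and `errBoom` are hypothetical names, and the producer here always closes the event channel before it reports its final error, which is the ordering that triggered the race.

```go
package main

import (
	"errors"
	"fmt"
)

// errBoom is a sentinel error used only for the demonstration.
var errBoom = errors.New("boom")

// produce emits n events, closes the event channel, then reports the
// final result. Closing errs afterwards guarantees that a late reader
// never blocks forever.
func produce(n int, final error) (<-chan string, <-chan error) {
	events := make(chan string)
	errs := make(chan error, 1)
	go func() {
		defer close(errs)
		for i := 0; i < n; i++ {
			events <- fmt.Sprintf("event-%d", i)
		}
		close(events)
		errs <- final
	}()
	return events, errs
}

// consume reads both channels and exits on the first error.
func consume(events <-chan string, errs <-chan error) ([]string, error) {
	var seen []string
	errCase := errs // set to nil once a nil result was processed
	for {
		select {
		case ev, ok := <-events:
			if !ok {
				// Blocking read, no default case: wait until the
				// producer has reported. If the nil result was already
				// consumed below, the closed channel yields nil.
				return seen, <-errs
			}
			seen = append(seen, ev)
		case err := <-errCase:
			if err != nil {
				return seen, err
			}
			errCase = nil // stop selecting on the drained channel
		}
	}
}

func main() {
	evs, err := consume(produce(3, errBoom))
	fmt.Println(len(evs), err)
}
```

With a non-blocking check (a `default` case) in place of the blocking read, the consumer could return nil before the producer had sent its error.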
Fixes #23165
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Currently all podman machine rm errors in AfterEach were ignored.
This means some leaked and caused issues later on, see #22844.
To fix it, first rework the logic to only remove machines when needed at
the place where they are created using DeferCleanup(). However,
DeferCleanup() does not work well together with AfterEach() as
AfterEach() always runs before DeferCleanup(). As AfterEach() deletes the
dir, the podman machine rm call cannot be done afterwards.
As such, migrate all cleanup to use DeferCleanup() and, while I have to
touch this, fix the code to remove the per-file duplication and define
the setup/cleanup once in the global scope.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
On linux and macos the connections are stored under the home dir by
default so it is not a problem there, but on windows we first check
the APPDATA env and use this dir as config storage. This has the problem
that it is not cleaned up after each test; as such, connections might leak
into the following test, causing failures there.
Fixes #22844
Signed-off-by: Paul Holzinger <pholzing@redhat.com>