15 Commits

Author SHA1 Message Date
Debarshi Ray 1c616f04bf debug 2025-07-04 22:25:11 +02:00
Debarshi Ray fdce5e4f52 Isolate the host's XDG_RUNTIME_DIR from the system tests
XDG_RUNTIME_DIR is needed for two groups of reasons when Toolbx is used
rootless.

First, it's important for toolbox(1) itself to work rootless because it
needs to place several files:

  * The 'lock' file to synchronize Podman migrations.

  * The initialization stamp file to synchronize the container's entry
    point with the user-facing 'enter' and 'run' commands running on the
    host operating system.

  * The generated Container Device Interface specification.

These files need to be separate for the toolbox(1) processes run by the
system tests, those run by the user for 'normal' use, and concurrent
invocations of the tests.

Therefore, it's better to use a custom XDG_RUNTIME_DIR that's within the
sandbox offered by Bats [1].  The sandbox is clearly labelled as being
used by Bats, is unique for each invocation, and Bats takes care of
cleaning everything up once it has finished running.

Note that XDG_RUNTIME_DIR's Unix access mode MUST be 0700 [2].  For
example, Ubuntu 22.04 and 24.04 Desktop have a umask of 0002, and if an
access mode is not explicitly specified, XDG_RUNTIME_DIR will be created
with 0775.  That will cause dbus-daemon(1) to fail with:
  Unable to set up transient service directory: XDG_RUNTIME_DIR
      "/var/tmp/bats-run-4XQL6i/suite/xdg-runtime-dir" can be written by
      others (mode 040775)
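
A minimal sketch of how the directory can be created with an explicit
mode, so that the umask doesn't leak into it.  The directory name and
the fallback to mktemp(1) outside Bats are assumptions made for
illustration, not necessarily what the test suite does:

```shell
# Assumption: outside Bats, fall back to a throwaway directory so that the
# sketch can be run on its own.
suite_tmpdir="${BATS_SUITE_TMPDIR:-$(mktemp -d)}"

# Simulate the permissive umask mentioned above.
umask 0002

# With -m, mkdir(1) sets the mode explicitly instead of deriving it from
# the umask.
runtime_dir="$suite_tmpdir/xdg-runtime-dir"
mkdir -m 0700 "$runtime_dir"
export XDG_RUNTIME_DIR="$runtime_dir"

# Print the resulting access mode (GNU stat).
stat -c '%a' "$XDG_RUNTIME_DIR"
```

Without the -m option, the same mkdir(1) call under a umask of 0002
would leave the directory group-writable.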

Second, XDG_RUNTIME_DIR is used to propagate things like the user D-Bus,
Pipewire and Wayland sockets from the host to the container.  These
don't need to be separated.  However, if a custom XDG_RUNTIME_DIR is
used then those sockets that are used by the system tests, such as the
user D-Bus socket, have to be replicated.

Therefore, a custom D-Bus instance is run to offer the user D-Bus socket
with a configuration similar to that of the host OS.  The dbus-daemon(1)
implementation is used for the sake of simplicity.  It creates the
socket itself based on the configuration, unlike dbus-broker-launch(1)
where the socket must be separately created and passed to it by its
parent.

However, Podman can't use systemd as the cgroups manager with this D-Bus
instance, as the bus wasn't started by the user systemd instance.  So, a
custom containers.conf(5) is used to change the cgroups manager to
cgroupfs.  The only other options in the containers.conf(5) are those
that are common across Fedora 41 and 42, and Ubuntu 22.04 and 24.04.
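
A hedged sketch of what overriding the cgroups manager can look like.
The file location is hypothetical, and the real containers.conf(5) used
by the tests carries more options than the single one shown here:

```shell
# Assumption: outside Bats, fall back to a throwaway directory so that the
# sketch can be run on its own.
suite_tmpdir="${BATS_SUITE_TMPDIR:-$(mktemp -d)}"
containers_conf="$suite_tmpdir/containers.conf"

# Only the cgroups manager is shown; any other option is left out.
cat >"$containers_conf" <<'EOF'
[engine]
cgroup_manager = "cgroupfs"
EOF

# containers.conf(5) documents this variable as a way to point the
# container tools at a specific configuration file.
export CONTAINERS_CONF="$containers_conf"
```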

[1] https://bats-core.readthedocs.io/en/stable/writing-tests.html

[2] https://specifications.freedesktop.org/basedir-spec/latest/

https://github.com/containers/toolbox/pull/1652
2025-07-04 22:25:11 +02:00
Debarshi Ray cded2709e2 test/system: Remove redundant set-up
There's no need to repeatedly create the home directory and the
containers-storage.conf(5), because their life cycles are suite-wide.
It's enough to create them once as part of the suite-wide set-up hook
(i.e., setup_suite()).

It was only necessary to do it this way at a time when Bats < 1.7.0
didn't offer any hooks for suite-wide set-up and tear-down, and the
CONTAINERS_STORAGE_CONF and XDG_RUNTIME_DIR environment variables were
exported to the test cases from the _setup_environment() function.

The switch to using the suite-wide set-up hook offered by Bats >= 1.7.0
in commit 7a387dcc8b removed the need to do this set-up
repeatedly.  Currently, those two environment variables are set globally
outside the _setup_environment() function [1,2], which removes any
lingering doubts about this.

This should have been part of commit 7a387dcc8b.
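
The suite-wide set-up described above can be sketched roughly as
follows.  The paths, the TEST_HOME variable name and the contents of the
containers-storage.conf(5) are illustrative assumptions, not a copy of
the actual setup_suite.bash:

```shell
# Assumption: outside Bats, fall back to a throwaway directory so that the
# sketch can be run on its own.  Bats sets BATS_SUITE_TMPDIR itself.
: "${BATS_SUITE_TMPDIR:=$(mktemp -d)}"

# Set globally, outside any per-test function.
export CONTAINERS_STORAGE_CONF="$BATS_SUITE_TMPDIR/storage.conf"

# Bats >= 1.7.0 calls this once, before the first test of the suite.
setup_suite() {
  # Hypothetical variable; the home directory is created only once.
  TEST_HOME="$BATS_SUITE_TMPDIR/home"
  mkdir -p "$TEST_HOME"

  # Illustrative contents; the real file may differ.
  cat >"$CONTAINERS_STORAGE_CONF" <<EOF
[storage]
driver = "overlay"
rootless_storage_path = "$BATS_SUITE_TMPDIR/storage"
EOF
}

# Bats calls this once, after the last test.  Nothing needs to be removed
# here, because Bats cleans up BATS_SUITE_TMPDIR on its own.
teardown_suite() {
  :
}
```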

[1] Commit f4591718e4
    https://github.com/containers/toolbox/commit/f4591718e483bdf5
    https://github.com/containers/toolbox/pull/1672

[2] Commit ad3346f915
    https://github.com/containers/toolbox/commit/ad3346f9153dbf0f
    https://github.com/containers/toolbox/pull/1676

https://github.com/containers/toolbox/pull/1676
2025-07-04 11:37:34 +02:00
Debarshi Ray a5a0d5350f test/system: Remove redundant clean-up
The IMAGE_CACHE_DIR environment variable is defined as
"${BATS_SUITE_TMPDIR}/image-cache" [1].  Earlier, it used to be
"${BATS_RUN_TMPDIR}/image-cache".

There's no need to clean up anything inside BATS_RUN_TMPDIR or
BATS_SUITE_TMPDIR after the test suite has finished running, because
their life cycle is managed by Bats [2].

[1] Commit 3a549a6252
    https://github.com/containers/toolbox/commit/3a549a6252e990d6
    https://github.com/containers/toolbox/pull/1452

[2] https://bats-core.readthedocs.io/en/stable/writing-tests.html

Fallout from 9820550c82

https://github.com/containers/toolbox/pull/1645
2025-05-12 17:58:15 +02:00
Debarshi Ray d64682af0d test/system: Don't use XDG_CACHE_HOME or HOME for temporary files
The XDG_CACHE_HOME environment variable is supposed to default to
$HOME/.cache [1], just as it did in the test suite, and this location is
meant to be used as a cache for 'normal' use by the user.  Test suites
generally don't qualify as 'normal' use.

One expects that deleting the cache shouldn't affect 'normal' use other
than degrading performance.  However, deleting these temporary files
used by the test suite will cause actual breakage.  Even if the user
doesn't manually delete the cache, two concurrent invocations of the
test suite can do so or lead to other unexpected collisions, because the
paths are constant across multiple invocations.

Therefore, it's better to limit the scope of the test suite's temporary
files within the sandbox offered by Bats [2].  The sandbox is clearly
labelled as being used by Bats, is unique for each invocation, and Bats
takes care of cleaning everything up once it has finished running.
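
To illustrate why a per-invocation location avoids the collisions
described above, here is a small sketch.  It uses mktemp(1) as a
stand-in for the sandbox that Bats creates for each run; the directory
names are hypothetical:

```shell
# Stand-in for the unique directory that Bats creates per invocation.
new_run_tmpdir() {
  mktemp -d "${TMPDIR:-/tmp}/example-run-XXXXXX"
}

run_a="$(new_run_tmpdir)"
run_b="$(new_run_tmpdir)"

# Each invocation keeps its temporary files under its own directory.
image_cache_a="$run_a/image-cache"
image_cache_b="$run_b/image-cache"
mkdir "$image_cache_a" "$image_cache_b"

# Cleaning up after the first invocation ...
rm -rf "$run_a"

# ... leaves the second one untouched.
[ -d "$image_cache_b" ] && echo "second invocation unaffected"
```

With a constant path under $HOME/.cache, both invocations would have
shared the same directory, and the rm(1) above would have broken the
second one.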

Note that there's no need for the system-test-storage sub-directory
under BATS_SUITE_TMPDIR.  So it was left out.

[1] https://specifications.freedesktop.org/basedir-spec/latest/

[2] https://bats-core.readthedocs.io/en/stable/writing-tests.html

https://github.com/containers/toolbox/pull/1645
2025-05-12 17:41:52 +02:00
Debarshi Ray 987f5e2592 .zuul, playbooks, test/system: Optimize the CI on Fedora nodes
The test suite has expanded to 415 system tests.  These tests can be
very I/O intensive, because many of them copy OCI images from the test
suite's image cache directory to its local container/storage store,
create containers, and then delete everything to run the next test with
a clean slate.  This makes the system tests slow.

Unfortunately, Zuul's max-job-timeout setting defaults to an upper limit
of 3 hours or 10800 seconds for jobs [1], and this is what Software
Factory uses [2].  So, there comes a point beyond which the CI can't be
prevented from timing out by increasing the timeout.

One way of scaling past this maximum time limit is to run the tests in
parallel across multiple nodes.  This has been implemented by splitting
the system tests into different groups, which are run separately by
different nodes.

First, the tests were grouped into those that test commands and options
accepted by the toolbox(1) binary, and those that test the runtime
environment within the Toolbx containers.  The first group has more
tests, but runs faster, because many of them test error handling and
don't do much I/O.

The runtime environment tests take especially long on Fedora Rawhide
nodes, which are often slower than the stable Fedora nodes, possibly
because Rawhide uses Linux kernels that are built with debugging
enabled.  Therefore, this group of tests was further split for Rawhide
nodes by the Toolbx images they use.  Apart from reducing the number of
tests in each group, this should also reduce the amount of time spent in
downloading the images.

The split has been implemented with Bats' tagging system that is
available from Bats 1.8.0 [3].  Fortunately, commit 87eaeea6f0
already added a dependency on Bats >= 1.10.0.  So, there's nothing to
worry about.

At the moment, Bats doesn't expose the tags being used to run the test
suite to setup_suite() and teardown_suite() [4].  Therefore, the
TOOLBX_TEST_SYSTEM_TAGS environment variable was used to optimize the
contents of setup_suite().
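
A rough sketch of how such an environment variable can be consulted
from setup_suite().  The tag names, the comma-separated format of the
variable and the has_tag() helper are assumptions made for
illustration:

```shell
# In a *.bats file, tags are attached with special comments (Bats >= 1.8.0):
#
#   # bats file_tags=runtime-environment
#
# ... and a group is selected on the command line with:
#
#   $ bats --filter-tags runtime-environment ./test/system
#
# setup_suite() can't see --filter-tags, so the same value is assumed to
# be passed separately through TOOLBX_TEST_SYSTEM_TAGS.

# Hypothetical helper: succeeds if the given tag is listed in the variable.
has_tag() {
  local wanted="$1"
  local tag
  local -a tags

  IFS=',' read -r -a tags <<<"${TOOLBX_TEST_SYSTEM_TAGS:-}"
  for tag in "${tags[@]}"; do
    [ "$tag" = "$wanted" ] && return 0
  done

  return 1
}

setup_suite() {
  # Only do the expensive set-up when the selected group needs it.
  if has_tag "runtime-environment"; then
    echo "caching the images needed by the runtime environment tests"
  else
    echo "skipping the image cache"
  fi
}
```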

[1] https://zuul-ci.org/docs/zuul/latest/tenants.html

[2] Commit 83f28c52e4
    https://github.com/containers/toolbox/commit/83f28c52e47c2d44
    https://github.com/containers/toolbox/pull/1548

[3] https://bats-core.readthedocs.io/en/stable/writing-tests.html

[4] https://github.com/bats-core/bats-core/issues/1006

https://github.com/containers/toolbox/pull/1551
2024-09-28 01:28:58 +02:00
Debarshi Ray 18d47d1fee test/system: Replace the RHEL toolbox:8.9 image with toolbox:8.10
Red Hat Enterprise Linux 8.9 reached End of Life when RHEL 8.10 was
released on 22nd May 2024:
https://access.redhat.com/articles/3078
https://access.redhat.com/support/policy/updates/errata

For what it's worth, RHEL 8's full support phase ended on the 31st of
May 2024 and it will be in maintenance support, as RHEL 8.10, until the
corresponding day in 2029.

https://github.com/containers/toolbox/pull/1522
2024-07-31 18:03:23 +02:00
Debarshi Ray 6838e93471 test/system: Unbreak Podman's downstream Fedora CI
The paths to bats-assert and bats-support are broken if bats(1) is
invoked from any location other than the parent directory of the 'test'
directory.  For example, Podman's downstream Fedora CI invokes the
tests as:
  $ cd /path/to/toolbox/test/system
  $ bats .

... and it led to [1]:
  1..306
  # test suite: Set up
  # Missing dependencies
  # Forgot to run 'git submodule init' and 'git submodule update' ?
  # test suite: Tear down
  not ok 1 setup_suite
  # (from function `setup_suite' in test file ./setup_suite.bash, line 33)
  #   `return 1' failed
  # bats warning: Executed 1 instead of expected 306 tests
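
One way to make such paths independent of the current working directory
is to resolve them from the location of the file itself.  This is a
sketch of the general technique, with a hypothetical 'libs' directory,
and isn't necessarily how the fix was implemented:

```shell
# Resolve the directory containing this file, regardless of where bats(1)
# was invoked from.
this_dir="$(cd "$(dirname "${BASH_SOURCE[0]:-$0}")" && pwd)"

# Hypothetical layout: the helper libraries sit in 'libs' next to this file.
libs_dir="$this_dir/libs"

missing=0
for lib in bats-support bats-assert; do
  if [ ! -d "$libs_dir/$lib" ]; then
    echo "Missing $libs_dir/$lib" >&2
    missing=1
  fi
done
```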

Fallout from 2c09606603

[1] https://bugzilla.redhat.com/show_bug.cgi?id=2263968

https://github.com/containers/toolbox/pull/1448
2024-02-13 22:08:09 +01:00
Debarshi Ray dcec5680e5 test/system: Replace the RHEL toolbox:8.7 image with toolbox:8.9
Red Hat Enterprise Linux 8.7 reached End of Life on 31st May 2023:
https://access.redhat.com/articles/4038291
https://access.redhat.com/support/policy/updates/errata

Since the tests are intended for Toolbx, not the Red Hat infrastructure,
it will be better to use a newer image, because it will be closer to
what the users are seeing.

https://github.com/containers/toolbox/pull/1437
2024-01-24 22:45:18 +01:00
Debarshi Ray d528301b7f test/system: Silence SC2155
Otherwise https://www.shellcheck.net/ would complain:
  Line 624:
  local system_id="$(get_system_id)"
        ^-------^ SC2155 (warning): Declare and assign separately to
                  avoid masking return values.

See: https://www.shellcheck.net/wiki/SC2155
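
A small demonstration of what SC2155 warns about.  The get_system_id()
below is a hypothetical stand-in that always fails, so that the masking
becomes visible:

```shell
# Hypothetical stand-in that always fails.
get_system_id() {
  return 1
}

masked() {
  # The exit status seen afterwards is that of 'local', not of the
  # command substitution.
  local system_id="$(get_system_id)"
  echo "masked: $?"
}

unmasked() {
  # Declaring first and assigning separately preserves the exit status.
  local system_id
  system_id="$(get_system_id)"
  echo "unmasked: $?"
}

masked      # prints "masked: 0"
unmasked    # prints "unmasked: 1"
```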

https://github.com/containers/toolbox/pull/1370
2023-09-20 10:24:29 +02:00
Debarshi Ray fd7ca125fc test/system: Replace the shebangs with 'shell' directives
These files aren't marked as executable, and shouldn't be, because they
aren't meant to be standalone executable scripts.  They're meant to be
part of a test suite driven by Bats.  Therefore, it doesn't make sense
for them to have shebangs, because it gives the opposite impression.

The shebangs were actually being used by external tools like Coverity to
deduce the shell when running shellcheck(1).  Shellcheck's inline
'shell' directive is a more obvious way to achieve that.
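
For illustration, a file that is only meant to be sourced could start
like this.  The helper function is hypothetical, and the value of the
directive depends on the dialect that the file is written in:

```shell
# shellcheck shell=bash
#
# Deliberately no shebang: this file is meant to be sourced by the test
# suite, not executed on its own.  The directive above tells shellcheck(1)
# which shell dialect to assume.

# Hypothetical helper.
print_suite_banner() {
  echo "test suite: $1"
}
```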

https://github.com/containers/toolbox/pull/1363
2023-09-14 15:18:04 +02:00
Debarshi Ray 776562235a test/system: Fix the shebang
The setup_suite.bash file is meant to be written in Bash, and is not
supposed to have any Bats-specific syntax.  That's why it has the *.bash
suffix, not *.bats.  If Bats finds a setup_suite.bash file, when running
the test suite, it uses Bash's source(1) builtin to read the file.

This is a cosmetic change.  The Bats syntax is a superset of the Bash
syntax.  Therefore, it didn't make a difference to external tools like
Coverity that use the shebang to deduce the shell for shellcheck(1).
Secondly, setup_suite.bash isn't meant to be an executable script and,
hence, the shebang has no effect on how the file is used.  However, it's
still a commonly used hint about the contents of the file, and it's
better to be accurate than misleading.

A subsequent commit will replace the shebangs in the test suite with
ShellCheck's 'shell' directives.

Fallout from 7a387dcc8b

https://github.com/containers/toolbox/pull/1363
2023-09-14 15:16:35 +02:00
Debarshi Ray c846b6d844 test/system: Simplify the check for Fedora Rawhide
First, it's not a good idea to use awk(1) as a grep(1) replacement.
Unless one really needs the AWK programming language, it's better to
stick to grep(1) because it's simpler.

Secondly, it's better to look for a specific os-release(5) field instead
of looking for the occurrence of 'rawhide' anywhere in the file, because
it lowers the possibility of false positives.
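
A sketch of the difference, using sample files.  The helper name and
the os-release(5) field being matched are assumptions made for
illustration; the field used by the actual change may be different:

```shell
# Hypothetical helper.  The os-release(5) file is a parameter so that the
# function can be exercised against sample files.
is_fedora_rawhide() {
  local os_release="${1:-/etc/os-release}"

  # Assumption: this field and value identify Fedora Rawhide.
  grep --quiet '^REDHAT_BUGZILLA_PRODUCT_VERSION=rawhide$' "$os_release"
}

sample_dir="$(mktemp -d)"
printf '%s\n' 'ID=fedora' 'REDHAT_BUGZILLA_PRODUCT_VERSION=rawhide' \
    >"$sample_dir/rawhide"
printf '%s\n' 'ID=example' 'PRETTY_NAME="Example (not rawhide)"' \
    >"$sample_dir/other"

is_fedora_rawhide "$sample_dir/rawhide" && echo "rawhide: yes"

# Looking for 'rawhide' anywhere in this file would be a false positive.
is_fedora_rawhide "$sample_dir/other" || echo "other: no"
```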

https://github.com/containers/toolbox/pull/1336
2023-07-11 20:30:35 +02:00
Matthias Clasen 2c09606603 test/system: Clarify the use of Git submodules
We wasted some time trying to get the tests running locally, when all we
were missing were the 'git submodule ...' commands.

Add some more obvious hints about this possible stumbling block.

Note that Bats cautions against printing outside the @test, setup* or
teardown* functions [1].  In this case, doing so leads to the first line
of the error output going missing, when using the pretty formatter for
human consumption:

  $ bats --formatter pretty ./test/system
   ✗ setup_suite
     Forgot to run 'git submodule init' and 'git submodule update' ?
     bats warning: Executed 1 instead of expected 191 tests

  191 tests, 1 failure, 190 not run

[1] https://bats-core.readthedocs.io/en/stable/writing-tests.html

https://github.com/containers/toolbox/pull/1298

Signed-off-by: Matthias Clasen <mclasen@redhat.com>
2023-06-21 12:34:08 +02:00
Debarshi Ray 7a387dcc8b test/system: Simplify running a subset of the tests with Bats >= 1.7.0
The 000-setup.bats and 999-teardown.bats files were added [1] at a time
when Bats didn't offer any hooks for suite-wide setup and teardown.

That changed in Bats 1.7.0, which introduced the setup_suite and
teardown_suite hooks [2].  These hooks make it easier to run a subset of
the tests, which is a good thing.

In the past, to run a subset of the tests, one had to do:
  $ bats ./test/system/000-setup.bats ./test/system/002-help.bats \
      ./test/system/999-teardown.bats

Now, one only has to do:
  $ bats ./test/system/002-help.bats

Commit e22a82fec8 already added a dependency on Bats >= 1.7.0.
Therefore, it should be exploited wherever possible to simplify things.

[1] Commit 54a2ca1ead
    https://github.com/containers/toolbox/issues/751

[2] Bats commit fb467ec3f04e322a
    https://github.com/bats-core/bats-core/issues/39
    https://bats-core.readthedocs.io/en/stable/writing-tests.html

https://github.com/containers/toolbox/pull/1317
2023-06-21 09:07:29 +02:00