When 'toolbox run ulimit' is invoked, it boils down to running
"bash --login -c 'exec ulimit'" inside the container, which leads to
different outcomes on Fedora and Ubuntu hosts.
On Fedora:
$ bash --login -c 'exec ulimit'
unlimited
On Ubuntu:
$ bash --login -c 'exec ulimit'
bash: line 1: exec: ulimit: not found
This is because Bash's exec built-in cannot execute another built-in and
needs an external command; and Fedora ships a wrapper for the ulimit
built-in as an external command [1] to satisfy POSIX [2]:
$ cat /usr/bin/ulimit
#!/usr/bin/sh
builtin ulimit "$@"
... that Ubuntu doesn't.
Wrapping the 'ulimit' as 'bash -c ulimit' solves this problem because
then it becomes "bash --login -c 'exec bash -c ulimit'", and the exec
built-in can execute the external command bash(1).
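The difference can be reproduced without toolbox(1). The following is a
minimal sketch, not Toolbx's code: the helper names exec_works and
run_wrapped are hypothetical, and --login is left out to avoid reading
the profile files.

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeeds if 'exec' can run the given command, which
# requires an external program of that name on PATH.
exec_works() {
    bash -c 'exec "$@"' bash "$@" >/dev/null 2>&1
}

# Hypothetical helper: runs a command string the way the fix does, so that
# exec always receives the external program bash(1).
run_wrapped() {
    bash -c 'exec bash -c "$1"' bash "$1"
}

if exec_works ulimit; then
    echo "an external ulimit exists; 'exec ulimit' works here"
else
    echo "ulimit is only a built-in; 'exec ulimit' fails here"
fi

# Works in both cases, because the inner bash resolves the built-in.
run_wrapped 'ulimit'
```

Which branch is taken depends on whether the operating system ships an
external ulimit. The wrapped form doesn't depend on it.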
[1] Fedora bash commit 3b09d0d67fe7ff4c
Fedora bash commit a28ab9933095eaf6
https://src.fedoraproject.org/rpms/bash/c/3b09d0d67fe7ff4c
https://src.fedoraproject.org/rpms/bash/c/a28ab9933095eaf6
https://bugzilla.redhat.com/show_bug.cgi?id=1297166
[2] http://pubs.opengroup.org/onlinepubs/9699919799/utilities/toc.html
https://github.com/containers/toolbox/pull/1599
Signed-off-by: Penn Bauman <me@pennbauman.com>
It's simpler to check and set all the environment variables in one
place and in the same way, instead of having them scattered and using
different means.
https://github.com/containers/toolbox/pull/1676
The default location for a rootless user's storage.conf file [1], which
is overridden by the CONTAINERS_STORAGE_CONF environment variable [2],
is at $XDG_CONFIG_HOME/containers/storage.conf, and the default value of
the rootless_storage_path option is $XDG_DATA_HOME/containers/storage.
Earlier they were being set to different paths because the user's home
directory on the host operating system was not isolated from the system
tests.
Now that the user's home directory is isolated because the system tests
use a custom HOME, these configuration paths can be restored to their
default values relative to this custom HOME. This will make things more
understandable by making the directory hierarchy within Bats' sandbox
similar to the defaults.
The CONTAINERS_STORAGE_CONF environment variable was kept, instead of
being removed, to protect against cases where Bats might get invoked
with a different CONTAINERS_STORAGE_CONF set in the environment. The
other option was to unset the variable from the environment. Keeping it
was chosen to deviate as little from the status quo as possible.
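The default paths described above can be sketched as follows. This is
an illustration, not the code from the system tests: the function names
are hypothetical, and the fallbacks to $HOME/.config and
$HOME/.local/share are the XDG defaults.

```shell
#!/usr/bin/env bash

# Prints the default location of a rootless user's storage.conf, falling
# back to $HOME/.config if XDG_CONFIG_HOME is unset or empty.
default_storage_conf() {
    echo "${XDG_CONFIG_HOME:-$HOME/.config}/containers/storage.conf"
}

# Prints the default value of rootless_storage_path, falling back to
# $HOME/.local/share if XDG_DATA_HOME is unset or empty.
default_rootless_storage_path() {
    echo "${XDG_DATA_HOME:-$HOME/.local/share}/containers/storage"
}

# Sets the variable explicitly, instead of unsetting it, so that a value
# inherited from the caller's environment can't leak in.
use_default_storage_conf() {
    CONTAINERS_STORAGE_CONF="$(default_storage_conf)"
    export CONTAINERS_STORAGE_CONF
}
```

With a custom HOME and no XDG overrides, both paths end up below that
HOME.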
[1] https://github.com/containers/storage/blob/main/docs/containers-storage.conf.5.md
[2] https://docs.podman.io/en/latest/markdown/podman.1.html
https://github.com/containers/toolbox/pull/1673
The "${FOO}" notation for accessing the value of a variable is not
consistently used elsewhere and ShellCheck doesn't insist on it.
Therefore, it's better to use the simpler "$FOO" notation.
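As a small illustration, with FOO as a placeholder name, the braces
remain necessary only where the name would otherwise run into the
following characters:

```shell
#!/usr/bin/env bash

FOO=bar

# The simpler notation is enough when the name is clearly delimited.
echo "$FOO"
echo "$FOO/baz"

# Braces are still needed here, because "$FOO_baz" would refer to a
# different variable called FOO_baz.
echo "${FOO}_baz"
```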
https://github.com/containers/toolbox/pull/1672
Matching the name of the variable with the configuration option improves
clarity, and prefixing the name with TOOLBX prevents it from colliding
with environment variables used by other programs or projects.
https://github.com/containers/toolbox/pull/1672
Currently, it's impossible to create a Toolbx container with a different
home directory from the host while sitting inside a Toolbx container.
Preserving the HOME environment variable when forwarding to the host
will enable users to create containers with different home directories
for increased isolation while sitting inside a Toolbx container, and to
isolate the host's HOME from the system tests.
This was never supported by the POSIX shell implementation.
https://github.com/containers/toolbox/issues/1044
https://github.com/containers/toolbox/issues/1564
The POSIX shell implementation used to read and respect the HOME
environment variable. It broke when the Go implementation started using
user.Current() [1], because it uses getpwuid_r(3), which only uses the
GNU Name Service Switch's (or NSS') password database and ignores HOME.
This created a strange situation where the toolbox(1) binary ignored the
HOME environment variable, while the profile.d/toolbox.sh shell start-up
snippet and Podman read and respected it.
Restoring the HOME environment variable will enable users to create
Toolbx containers with different home directories from the host for
increased isolation, and to isolate the host's HOME from the system
tests.
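The mismatch can be observed from a shell. This is a minimal sketch,
assuming getent(1) is available; the function names are hypothetical,
and getent only stands in for the kind of lookup that getpwuid_r(3)
does.

```shell
#!/usr/bin/env bash

# What a program that respects the environment sees.
home_from_env() {
    echo "$HOME"
}

# What a lookup in the NSS password database returns. The sixth field of
# a passwd entry is the home directory. HOME is never consulted.
home_from_passwd() {
    getent passwd "$(id -u)" | cut -d: -f6
}

# With a custom HOME, the first answer changes and the second doesn't.
(
    HOME=/tmp/custom-home
    home_from_env
    home_from_passwd
)
```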
Fallout from e8d7f25e83
[1] https://pkg.go.dev/os/user#Current
https://github.com/containers/toolbox/issues/1044
https://github.com/containers/toolbox/issues/1564
Currently, 'go build' is failing on Fedora 42 Workstation:
$ meson compile -C builddir --verbose
...
/path/src/go-build-wrapper /path/src /path/builddir src/toolbox 0.1.1
cc /lib64/ld-linux-x86-64.so.2 false
go: updates to go.mod needed; to update it:
go mod tidy
ninja: build stopped: subcommand failed.
... with Go version:
$ go version
go version go1.24.3 linux/amd64
$ rpm -q golang
golang-1.24.3-2.fc42.x86_64
Strangely, the CI hasn't been failing on Fedora 42 with the same Go
version [1].
Starting from version 1.21.0, Go uses an explicit 0 micro version
instead of skipping it - compare Go 1.20 and 1.21.0 [2]. It looks like
recent versions of Go are pedantic about using the exact version
number.
[1] https://github.com/containers/toolbox/pull/1657
[2] https://github.com/golang/go/releases/tag/go1.20
https://github.com/golang/go/releases/tag/go1.21.0
https://github.com/containers/toolbox/pull/1659
This uses the same approach taken by Flatpak [1] to ensure that the
certificates from certificate authorities (or CAs) that are available
inside a Toolbx container are kept synchronized with the host operating
system. Any program that uses PKCS #11 to access CA certificates should
see the same ones both inside the container and on the host.
During every 'enter' and 'run' command, toolbox(1) ensures that an
instance of 'p11-kit server' is running on the host listening on a local
file system socket that's accessible to both the container and the host.
If an instance is already running, then a second one is not created.
The location of the socket is injected into the container through the
P11_KIT_SERVER_ADDRESS environment variable.
Just like Flatpak, the singleton 'p11-kit server' process is not
terminated when the last 'enter' or 'run' command exits.
The Toolbx container's entry point configures it to use the
p11-kit-client.so PKCS #11 module instead of the usual p11-kit-trust.so
module. This talks to the 'p11-kit server' instance running on the host
over the socket instead of reading the CA certificates that are present
inside the container.
However, unlike Flatpak, this doesn't use D-Bus to set up the
communication between the container and the host, because when invoked
as 'sudo toolbox ...' there's no user or session D-Bus instance
available for the root user.
This set-up is skipped if 'p11-kit server' can't be run on the host, or
if the /etc/pkcs11/modules directory for configuring PKCS #11 modules or
p11-kit-client.so are missing inside the container. None of these are
considered hard dependencies to accommodate size-constrained OSes like
Fedora CoreOS that might not have 'p11-kit server', and existing Toolbx
containers and old images that might not have p11-kit-client.so.
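The container-side part of this can be sketched as below. It is an
illustration under stated assumptions, not the entry point's code: the
function name is hypothetical, the name of the module file is an
assumption, and the root directory is passed in so that the sketch can
be tried against a scratch directory.

```shell
#!/usr/bin/env bash

# Hypothetical sketch. $1 is the root of the container's file system.
configure_p11_kit_client() {
    local root="$1"
    local modules_dir="$root/etc/pkcs11/modules"

    # Not a hard dependency: skip the set-up if the image was not
    # prepared for configuring PKCS #11 modules.
    if ! [ -d "$modules_dir" ]; then
        echo "skipping: $modules_dir is missing" >&2
        return 0
    fi

    # 'module:' is the p11-kit configuration key that names the shared
    # object to load. The file name used here is an assumption.
    echo "module: p11-kit-client.so" >"$modules_dir/p11-kit-trust.module"
}
```

For example, 'configure_p11_kit_client /' would write the file only if
/etc/pkcs11/modules already exists.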
The UBI-based toolbox images haven't yet been updated to contain
p11-kit-client.so. Until that happens, containers created from them
won't have access to the CA certificates from the host.
The CI needs to be run without 'p11-kit server' because the lingering
singleton process causes Bats to hang when tearing down the suite of
system tests [2]. To terminate the 'p11-kit server' instance run by the
system tests, it needs to be distinguishable from the instance run by
'normal' use of Toolbx by the user. One way to do this is to isolate
the host operating system's XDG_RUNTIME_DIR from the system tests.
Unfortunately, this is easier said than done [3]. So, this workaround
has to suffice until the problem is solved.
On the Ubuntu 22.04 CI nodes, it's not possible to remove the p11-kit
package that provides 'p11-kit server', because it leads to:
$ sudo dpkg --purge p11-kit
dpkg: dependency problems prevent removal of p11-kit:
adoptium-ca-certificates depends on p11-kit.
Therefore, as a workaround only the /usr/libexec/p11-kit/p11-kit-server
binary that provides the 'server' command is removed. The rest of the
p11-kit package is left untouched.
[1] Flatpak commit 66b2ff40f7caf3a7
https://github.com/flatpak/flatpak/commit/66b2ff40f7caf3a7
https://github.com/flatpak/flatpak/pull/1757
https://github.com/p11-glue/p11-kit/issues/68
[2] https://bats-core.readthedocs.io/en/stable/writing-tests.html
[3] https://github.com/containers/toolbox/pull/1652
https://github.com/containers/toolbox/issues/626
A subsequent commit will use this to give Toolbx containers access to
the certificates from certificate authorities on the host.
The ideal goal is to ensure that all supported Toolbx containers and
images have p11-kit-client.so in them. In practice, some of them never
will, either because it's an existing container or an older version of
an image that was already present in the local containers/storage image
store, or because the operating system is too old.
Therefore, there needs to be a way to check at runtime if a Toolbx
container has p11-kit-client.so or not.
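Such a check could look roughly like this. It is a hypothetical sketch,
not the code added by this commit: the function name and the list of
candidate directories are assumptions based on common library layouts.

```shell
#!/usr/bin/env bash

# Hypothetical sketch. Prints the path of p11-kit-client.so below the
# root given as $1, or returns non-zero if it can't be found.
find_p11_kit_client() {
    local root="$1"
    local dir

    # Assumed locations of PKCS #11 modules on different distributions.
    for dir in /usr/lib64/pkcs11 \
               /usr/lib/x86_64-linux-gnu/pkcs11 \
               /usr/lib/pkcs11; do
        if [ -f "$root$dir/p11-kit-client.so" ]; then
            echo "$root$dir/p11-kit-client.so"
            return 0
        fi
    done

    return 1
}
```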
https://github.com/containers/toolbox/issues/626
A subsequent commit will use this to give Toolbx containers access to
the certificates from certificate authorities on the host.
This changes the user-visible error message from:
$ toolbox --verbose list
...
DEBU Migrating to newer Podman: failed to create migration lock file
/run/user/1000/toolbox/migrate.lock: open
/run/user/1000/toolbox/migrate.lock: no such file or directory
Error: failed to create migration lock file
... to:
$ toolbox --verbose list
...
DEBU Migrating to newer Podman: failed to create lock file
/run/user/1000/toolbox/migrate.lock: open
/run/user/1000/toolbox/migrate.lock: no such file or directory
Error: failed to create lock file
Or, from:
$ toolbox --verbose list
...
DEBU Migrating to newer Podman: failed to acquire migration lock on
/run/user/1000/toolbox/migrate.lock: bad file descriptor
Error: failed to acquire migration lock
... to:
$ toolbox --verbose list
...
DEBU Migrating to newer Podman: failed to acquire lock on
/run/user/1000/toolbox/migrate.lock: bad file descriptor
Error: failed to acquire lock
This is admittedly less specific without the debug logs, but it's
probably alright because it's such an unlikely error.
https://github.com/containers/toolbox/issues/626
The system tests can be very I/O intensive, because many of them copy
OCI images from the test suite's image cache directory to its local
containers/storage store, create containers, and then delete everything
to run the next test with a clean slate. This makes them slow.
The runtime environment tests, which include the environment variable
tests, are particularly slow because they don't skip the I/O even when
testing error handling. This makes them a good target for
optimizations.
The environment variable tests query the values of different environment
variables from different containers without changing their state.
Therefore, a lot of disk I/O can be avoided by creating these containers
only once for all the tests.
This can reduce the time needed to run the environment variable tests
from almost 26 minutes to almost 9 minutes.
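The idea maps onto the setup_file and setup hooks offered by Bats. The
following is a minimal sketch, not the actual test code: the
create_container function is a hypothetical stand-in for an expensive
'toolbox create', and the container names are placeholders.

```shell
#!/usr/bin/env bash

# Hypothetical stand-in for an expensive 'toolbox create'. It only
# records that it was called, so the sketch runs without Podman.
create_container() {
    echo "$1" >>"$BATS_FILE_TMPDIR/created"
}

# Bats calls setup_file once per test file, before the first test. The
# containers shared by all the tests are created here.
setup_file() {
    create_container container-a
    create_container container-b
}

# Bats calls setup before every test. Nothing is recreated here, because
# the tests only read from the containers without changing their state.
setup() {
    :
}
```

However many tests the file has, each container is created only once.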
https://github.com/containers/toolbox/pull/1646
The XDG_CACHE_HOME environment variable is supposed to default to
$HOME/.cache [1], just as it did in the test suite, and this location is
meant to be used as a cache for 'normal' use by the user. Test suites
generally don't qualify as 'normal' use.
One expects that deleting the cache shouldn't affect 'normal' use other
than degrading performance. However, deleting these temporary files
used by the test suite will cause actual breakage. Even if the user
doesn't manually delete the cache, two concurrent invocations of the
test suite can do so or lead to other unexpected collisions, because the
paths are constant across multiple invocations.
Therefore, it's better to limit the scope of the test suite's temporary
files within the sandbox offered by Bats [2]. The sandbox is clearly
labelled as being used by Bats, is unique for each invocation, and Bats
takes care of cleaning everything up once it has finished running.
Note that there's no need for the system-test-storage sub-directory
under BATS_SUITE_TMPDIR. So it was left out.
[1] https://specifications.freedesktop.org/basedir-spec/latest/
[2] https://bats-core.readthedocs.io/en/stable/writing-tests.html
https://github.com/containers/toolbox/pull/1645
The p11-kit-modules package in Ubuntu provides p11-kit-client.so, but
the /etc/pkcs11/modules directory that's necessary to configure p11-kit
to use p11-kit-client.so is not created by any package.
It's better to ensure that the /etc/pkcs11/modules directory exists in
the image, instead of having the Toolbx container's entry point create
it at runtime, because it can be a confirmation that p11-kit was built
to read the module configuration from this location.
This should have been part of commit aa8507730d.
https://github.com/containers/toolbox/issues/626
The /etc/pkcs11 directory and /etc/pkcs11/pkcs11.conf.example file are
created by the p11-kit package in Arch Linux, and the libp11-kit package
provides p11-kit-client.so. However, the /etc/pkcs11/modules directory
that's necessary to configure p11-kit to use p11-kit-client.so is not
created by any package.
It's better to ensure that the /etc/pkcs11/modules directory exists in
the image, instead of having the Toolbx container's entry point create
it at runtime, because it can be a confirmation that p11-kit was built
to read the module configuration from this location.
This should have been part of commit 259de86c8f.
https://github.com/containers/toolbox/issues/626
For a while now, it hasn't been necessary to read the ID field from
os-release(5) outside this package, or the VARIANT_ID field anywhere at
all. Therefore, it's time to adjust the code to reflect this reality.
Fallout from 8caa7cd828
https://github.com/containers/toolbox/pull/1642
The system tests can be very I/O intensive, because many of them copy
OCI images from the test suite's image cache directory to its local
containers/storage store, create containers, and then delete everything
to run the next test with a clean slate. This makes them slow.
The runtime environment tests, which include the D-Bus tests, are
particularly slow because they don't skip the I/O even when testing
error handling. This makes them a good target for optimizations.
The D-Bus tests check if methods can be called across the user or
session and system D-Bus instances from different containers without
changing their state. Therefore, a lot of disk I/O can be avoided by
creating these containers only once for all the tests.
This can reduce the time needed to run the D-Bus tests from almost 10
minutes to almost 5 minutes.
https://github.com/containers/toolbox/pull/1641
This will reduce the size of the src/pkg/utils/utils.go file and make it
easier to specify which part of the code base is maintained by whom.
https://github.com/containers/toolbox/pull/1639
The system tests can be very I/O intensive, because many of them copy
OCI images from the test suite's image cache directory to its local
containers/storage store, create containers, and then delete everything
to run the next test with a clean slate. This makes them slow.
The runtime environment tests, which include the networking tests, are
particularly slow because they don't skip the I/O even when testing
error handling. This makes them a good target for optimizations.
The networking tests check the behaviour and configuration of the
network in different containers without changing their state.
Therefore, a lot of disk I/O can be avoided by creating these containers
only once for all the tests.
This can reduce the time needed to run the networking tests from almost
15 minutes to almost 6 minutes.
https://github.com/containers/toolbox/pull/1637
The libp11-kit package was added to the arch-toolbox image to ensure the
presence of p11-kit-client.so. Currently, the package is already pulled
in by various dependencies, like the gnutls and p11-kit packages.
Therefore, it doesn't increase the size of the base image, but serves as
a safeguard against any inadvertent changes.
A subsequent commit will use this to give Toolbx containers access to
the certificates from certificate authorities on the host. This commit
was kept separate from the changes to toolbox(1) to ensure that the
arch-toolbox image is ready before that happens.
https://github.com/containers/toolbox/issues/626
The system tests can be very I/O intensive, because many of them copy
OCI images from the test suite's image cache directory to its local
containers/storage store, create containers, and then delete everything
to run the next test with a clean slate. This makes them slow.
The runtime environment tests, which include the group and user tests,
are particularly slow because they don't skip the I/O even when testing
error handling. This makes them a good target for optimizations.
The group and user tests check the group and user configuration in
different containers without changing their state. Therefore, a lot of
disk I/O can be avoided by creating these containers only once for all
the tests.
This can reduce the time needed to run the group and user tests from
almost 22 minutes to almost 5 minutes.
https://github.com/containers/toolbox/pull/1635