Compare commits

582 Commits
1.16.1 ... main

Author SHA1 Message Date
Giuseppe Scrivano 411e56c5cb
Merge pull request #1825 from eriksjolund/linux-add-missing-crun_make_error
linux: add missing crun_make_error
2025-07-14 15:40:57 +02:00
Erik Sjölund 72b3502000
linux: add missing crun_make_error
Signed-off-by: Erik Sjölund <erik.sjolund@gmail.com>
2025-07-14 15:12:59 +02:00
Giuseppe Scrivano 57d0192d30
Merge pull request #1821 from kolyshkin/1795-followup
libcrun: inline can_skip into write_cgroup_resources_v2
2025-07-10 10:38:43 +02:00
Kir Kolyshkin 4e5375cbd0 libcrun: inline can_skip into write_cgroup_resources_v2
This reverts part of commit 832db004, as can_skip function is not used
from any other place, and the error logic separated into two functions
is cumbersome, hard to follow, and probably has bugs.

Reported-by: Erik Sjölund <erik.sjolund@gmail.com>
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-07-09 18:54:00 -07:00
Giuseppe Scrivano 976dca9d4d
Merge pull request #1795 from kolyshkin/bpf-program
cgroup, systemd: use BPFProgram=device if supported
2025-07-09 22:55:47 +02:00
Kir Kolyshkin ad9f90b7df tests: add test_bpf_devices
This is a basic test for the functionality added by a few previous
commits. It does not test that the device BPF program works as it should,
merely that it is installed.

Co-authored-by: Claude Sonnet 4
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-07-09 10:55:22 -07:00
Kir Kolyshkin 6f466dcc9b cgroup,systemd: do not install duplicated ebpf
When the systemd cgroup manager is used and the BPFProgram=device
property is added to the systemd unit to limit cgroup/container access
to devices, there is no need to install a second BPF program that
does the same thing.

Add bpf_dev_set to struct libcrun_cgroup_status to communicate that
systemd's BPFProgram was set, so that update_cgroup_resources won't
install another bpf program.

Among other benefits (such as performance), this makes it possible to
properly test the functionality introduced by the previous commit.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-07-09 10:55:22 -07:00
Giuseppe Scrivano edf6678892 cgroup,systemd: check for loaded bpf on update
Since systemd does not support updating BPFProgram property, we can't
really perform updates to device access list. Yet, there is a common
scenario (with podman) when crun update receives the same set of devices
as during the container creation. In such case, we can obtain the
old/original bpf program and compare it with the new one. If they match,
we can continue without doing anything.

If they don't, we still have to error out.

[@kolyshkin: commit message]

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-07-09 10:53:19 -07:00
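
The update check described in this commit comes down to comparing the program that is already loaded with the one that would be generated for the new device list. The following is a minimal Python sketch of that decision only. The function name, the exception type, and the use of raw bytes to represent a BPF program are assumptions made for illustration; crun itself is written in C, and how it obtains the loaded program is not shown here.

```python
class DeviceUpdateError(Exception):
    """Raised when the requested device rules differ from the loaded program."""


def check_device_update(loaded_program: bytes, new_program: bytes) -> None:
    """Accept an update only if it would not change the loaded BPF program.

    Hypothetical helper: both arguments stand for the raw instruction bytes
    of a device-access BPF program.
    """
    if loaded_program == new_program:
        # Same device list as at container creation: nothing to do.
        return
    # systemd cannot update the BPFProgram property, so a real change
    # has to be rejected.
    raise DeviceUpdateError("cannot update device rules managed by systemd")
```

With this shape, the common podman case (an update that repeats the creation-time device list) returns silently, and any genuine change surfaces as an error.
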
Kir Kolyshkin 227bd0f112 cgroup,systemd: use BPFProgram=device if supported
This adds initial support for using BPFProgram=device systemd unit
property to manage device access for a container.

It only works for root user in initial user namespace, and requires
cgroup v2 and systemd >= v249.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-07-09 10:34:09 -07:00
Kir Kolyshkin 832db00471 write_devices_resources_v2: refactor
1. Separate create_dev_bpf out of write_devices_resources_v2_internal.

2. Separate can_skip_devices out of write_devices_resources_v2.

Should not result in any change of logic. This is merely a preparation
for the next commit.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-07-08 11:13:28 -07:00
Giuseppe Scrivano e8b3952335
Merge pull request #1812 from kolyshkin/def-slice
cgroup,systemd: allow empty slice in cgroupsPath
2025-07-08 11:19:38 +02:00
Kir Kolyshkin 1dae52bae1 cgroup,systemd: allow empty slice in cgroupsPath
While it may not be properly documented in runtime-spec, runc's systemd
cgroup driver follows some rules when constructing the slice and scope
from the linux.cgroupsPath (see [1]).

One such rule is, when the slice is empty, it defaults to "system.slice",
unless we have cgroup v2 and a rootless container, in which case it
defaults to "user.slice". This is supported by runc and although it
might be questionable, it makes sense for crun to be compatible.

Add a test case.

Fixes: 1811.

[1]: https://github.com/opencontainers/runc/blob/main/docs/systemd.md#systemd-unit-name-and-placement

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-07-07 12:51:58 -07:00
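
The default-slice rule from this commit message can be sketched in a few lines of Python. This is an illustration under stated assumptions, not crun's implementation: the function name is hypothetical, the `slice:prefix:name` layout and the `prefix-name.scope` unit name follow the runc documentation linked as [1], and special cases such as a name that already ends in `.slice` are ignored.

```python
def expand_cgroups_path(cgroups_path: str, cgroup_v2: bool, rootless: bool):
    """Split "slice:prefix:name" and apply the default slice when it is empty.

    Hypothetical helper that returns a (slice, scope) pair.
    """
    parts = cgroups_path.split(":")
    if len(parts) != 3:
        raise ValueError(f"expected slice:prefix:name, got {cgroups_path!r}")
    slice_name, prefix, name = parts

    if not slice_name:
        # Empty slice: "user.slice" only for rootless containers on
        # cgroup v2, "system.slice" in every other case.
        slice_name = "user.slice" if (cgroup_v2 and rootless) else "system.slice"

    # Assumption taken from the runc docs: the scope is "<prefix>-<name>.scope".
    scope = f"{prefix}-{name}.scope" if prefix else f"{name}.scope"
    return slice_name, scope
```

For example, `expand_cgroups_path(":crun:abc", cgroup_v2=True, rootless=True)` yields `("user.slice", "crun-abc.scope")` in this sketch.
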
Giuseppe Scrivano d3b2b788da
Merge pull request #1819 from lsm5/disable-wasmedge-centos
RPM: Disable wasmedge for centos
2025-07-07 16:44:17 +02:00
Lokesh Mandvekar 5610c0cfeb
RPM/Packit: wasmedge support Fedora-only
Disable EPEL repos for CentOS Stream copr builds.

Signed-off-by: Lokesh Mandvekar <lsm5@redhat.com>
2025-07-07 09:00:48 -04:00
Lokesh Mandvekar 39bdaaefd8
Packit: disable propose_downstream for CentOS Stream
Signed-off-by: Lokesh Mandvekar <lsm5@redhat.com>
2025-07-07 08:57:21 -04:00
Lokesh Mandvekar c6dfc871cd
RPM: placeholder check to silence rpmlint
Signed-off-by: Lokesh Mandvekar <lsm5@redhat.com>
2025-07-07 08:56:09 -04:00
Giuseppe Scrivano 1c68e96d20
Merge pull request #1814 from giuseppe/NULL-check-free-functions
src: add NULL checks for free* functions
2025-07-07 11:59:08 +02:00
flouthoc 273ff2d7d7
Merge pull request #1817 from giuseppe/print-version-invalid-rundir
crun: print version even with invalid rundir
2025-07-04 08:08:10 -07:00
Giuseppe Scrivano 12b63d9954
Merge pull request #1815 from eriksjolund/remove-dead-code-after-exit
Remove dead code after exit
2025-07-04 17:00:04 +02:00
Giuseppe Scrivano 8d61001df2
crun: print version even with invalid rundir
Closes: https://github.com/containers/crun/issues/1816

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-07-04 16:34:58 +02:00
Erik Sjölund 7680511920
Remove dead code after exit
Signed-off-by: Erik Sjölund <erik.sjolund@gmail.com>
2025-07-04 15:14:53 +02:00
Giuseppe Scrivano 0b9aab0942
handler: add NULL check to handler_manager_free
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-07-04 14:13:18 +02:00
Giuseppe Scrivano a755e04371
utils: Add NULL pointer check to cleanup_close_vecp
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-07-04 14:09:35 +02:00
Giuseppe Scrivano 66d710ce67
linux: add NULL check to cleanup_free_init_statusp
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-07-04 14:09:01 +02:00
Giuseppe Scrivano ce47a21932
linux: Add NULL pointer checks to free_remount
add missing NULL pointer checks to be uniform across the code.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-07-04 14:08:54 +02:00
Giuseppe Scrivano 54d4b042cb
string_map: ignore empty map
Add null pointer check before accessing string_map structure.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-07-04 12:32:37 +02:00
Giuseppe Scrivano 357e3c6856
Merge pull request #1810 from giuseppe/limit-idmap-creation
linux: limit mounts creation outside of namespace
2025-07-03 12:37:56 +02:00
Giuseppe Scrivano f25352f9e8
linux: limit mounts creation outside of namespace
follow up for 4b7257d403.

Limit the usage of the mount creation outside of the container
initialization to mounts that have a specified mapping.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-07-03 09:29:40 +02:00
Giuseppe Scrivano b9a0fdeccf
Merge pull request #1767 from giuseppe/change-formula-cpu-shares-conversion
cgroup: change conversion from CPU shares to weight
2025-07-03 09:18:19 +02:00
Giuseppe Scrivano c3bfab059a
Merge pull request #1809 from pprkut/lua-cleanup
lua: clean up unused defines
2025-07-01 23:42:19 +02:00
Heinz Wiesinger d51df096f9 lua: clean up unused defines
Signed-off-by: Heinz Wiesinger <pprkut@liwjatan.org>
2025-07-01 22:41:34 +02:00
Giuseppe Scrivano c15c2b9769
Merge pull request #1807 from giuseppe/fix-idmap-without-userns
linux: fix regression with idmapped mounts
2025-07-01 22:36:57 +02:00
Daniel J Walsh 3dd26565eb
Merge pull request #1806 from giuseppe/fix-lua-build
lua: fix build errors
2025-06-30 16:25:01 -04:00
Giuseppe Scrivano 4b7257d403
linux: fix regression with idmapped mounts
support idmapped mounts also when there is no user namespace specified
for the container.

commit 4a27212af8 introduced the
regression.

Closes: https://github.com/containers/crun/issues/1803

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-06-30 18:49:06 +02:00
Giuseppe Scrivano 83f601d378
lua: fix build errors
Closes: https://github.com/containers/crun/issues/1804

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-06-30 18:07:56 +02:00
Giuseppe Scrivano d4024ae783
Merge pull request #1798 from giuseppe/tag-1.22
NEWS: tag 1.22
2025-06-27 10:58:16 +02:00
Giuseppe Scrivano 4de19b63a8
NEWS: tag 1.22
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-06-27 09:45:35 +02:00
flouthoc e922e42251
Merge pull request #1800 from giuseppe/fix-podman-ci
tests: install catatonit package
2025-06-26 09:46:20 -07:00
Giuseppe Scrivano aa082854f7
tests: install catatonit package
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-06-26 10:05:30 +02:00
Giuseppe Scrivano 72e5468e43
tests: improve cpu_weight_systemd coverage
The `test_resources_cpu_weight_systemd` function previously
tested the CPU shares update with a single value.

This change expands the test to cover boundary values.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-06-26 10:03:28 +02:00
Giuseppe Scrivano 4998c92849
cgroup: improve conversion from shares to weight
The OCI CPU shares (range [2-262144]) to cgroup v2
`cpu.weight` (range [1-10000]) conversion formula has been
updated to use a quadratic function so that min, max and default
values match.

Closes: https://github.com/containers/crun/issues/1721

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-06-26 10:03:28 +02:00
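
To make the requirement concrete: the minimum (2 shares to weight 1), the default (1024 shares to weight 100) and the maximum (262144 shares to weight 10000) must all map exactly. One quadratic that meets these three constraints works in log space: with l = log2(shares), require log10(weight) to be 0, 2 and 4 at l = 1, 10 and 18, which gives log10(weight) = (l^2 + 125*l - 126) / 612. The Python sketch below implements that curve. It is an illustration derived from the three anchor points; the function name and the rounding and clamping choices are assumptions, and the commit message does not confirm that crun uses exactly these coefficients.

```python
import math

MIN_SHARES, MAX_SHARES = 2, 262144
MIN_WEIGHT, MAX_WEIGHT = 1, 10000


def shares_to_weight(shares: int) -> int:
    """Map OCI CPU shares to a cgroup v2 cpu.weight value (illustrative)."""
    # Clamp out-of-range input; handling of an unset value (0) is left
    # to the caller in this sketch.
    if shares <= MIN_SHARES:
        return MIN_WEIGHT
    if shares >= MAX_SHARES:
        return MAX_WEIGHT

    l = math.log2(shares)
    # Quadratic in l through (1, 0), (10, 2) and (18, 4).
    exponent = (l * l + 125.0 * l - 126.0) / 612.0
    return math.ceil(10.0 ** exponent)
```

A linear mapping between the two ranges can match the endpoints but not the default as well, which is why a curve with three degrees of freedom is needed.
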
Giuseppe Scrivano aaeeefc378
tests: install gperf on alpine
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-06-26 10:03:28 +02:00
Giuseppe Scrivano eca0f54697
Merge pull request #1796 from msekletar/dev-console-symlink
libcrun: setup /dev/console as a symlink to pty instead of bind mount
2025-06-26 09:59:55 +02:00
Giuseppe Scrivano f9d2458da3
Merge pull request #1799 from l0rd/wsl-regression
Fix regression on Windows WSL
2025-06-26 09:59:24 +02:00
Michal Sekletar 405d2a2c5b libcrun: setup /dev/console as a symlink to pty instead of bind mount
Signed-off-by: Michal Sekletar <msekleta@redhat.com>
2025-06-25 17:27:03 +02:00
Mario Loriedo 1203dadc76 Fix regression on Windows WSL
Reverting the changes of 91732ac in file src/libcrun/cgroup-setup.c
Fixes https://github.com/containers/crun/issues/1797

Signed-off-by: Mario Loriedo <mario.loriedo@gmail.com>
2025-06-25 17:15:00 +02:00
Giuseppe Scrivano de57a13768
Merge pull request #1794 from bitoku/fix-cpumax
Fix incorrectly set cpu.max when quota is -1.
2025-06-23 15:13:08 +02:00
Ayato Tokubi 4db005a5d9 Fix incorrectly set cpu.max when quota is -1.
Signed-off-by: Ayato Tokubi <atokubi@redhat.com>
2025-06-20 04:31:05 +00:00
Giuseppe Scrivano 32aa142273
Merge pull request #1793 from jakecorrenti/krun-sigsegv
krun: check `kconf->handle{_sev}` exists before unloading
2025-06-19 17:08:29 +02:00
Jake Correnti f231b79afb krun: check `kconf->handle{_sev}` exists before unloading
In `libkrun_configure_flavor`, verify the `kconf->handle{_sev}` handle
exists before unloading.

Signed-off-by: Jake Correnti <jakecorrenti+github@proton.me>
2025-06-19 10:47:57 -04:00
Giuseppe Scrivano 3dd5fe3434
Merge pull request #1791 from eriksjolund/normalize-S_ISDIR-result
utils: normalize S_ISDIR() result to 0 or 1
2025-06-17 21:55:14 +02:00
Erik Sjölund 910eb16b36
utils: normalize S_ISDIR() result to 0 or 1
Signed-off-by: Erik Sjölund <erik.sjolund@gmail.com>
2025-06-17 19:03:04 +02:00
Giuseppe Scrivano 8ff0973196
Merge pull request #1790 from eriksjolund/add_get_errno
libcrun: add crun_error_get_errno
2025-06-17 17:59:54 +02:00
Erik Sjölund 91732ac0e6 libcrun: add crun_error_get_errno
Signed-off-by: Erik Sjölund <erik.sjolund@gmail.com>
2025-06-17 17:32:45 +02:00
Giuseppe Scrivano 80290775bb
Merge pull request #1789 from eriksjolund/simplify-libcrun_status_check_directories
status: simplify libcrun_status_check_directories
2025-06-17 14:51:19 +02:00
Giuseppe Scrivano 909bff05bc
Merge pull request #1788 from eriksjolund/status-add-cleanup_free
status: add cleanup_free
2025-06-17 14:45:48 +02:00
Erik Sjölund 7d618b7acf
status: simplify libcrun_status_check_directories
Remove redundant code.

Signed-off-by: Erik Sjölund <erik.sjolund@gmail.com>
2025-06-17 07:30:19 +02:00
Erik Sjölund b34f613e5b
status: add cleanup_free
Signed-off-by: Erik Sjölund <erik.sjolund@gmail.com>
2025-06-17 07:21:53 +02:00
Giuseppe Scrivano d89542c615
Merge pull request #1784 from giuseppe/revert-chroot-realpath
Revert "chroot_realpath: do not return non-existing paths"
2025-06-12 21:55:16 +02:00
Giuseppe Scrivano 7407bbc9a5
Revert "chroot_realpath: do not return non-existing paths"
This reverts commit 17135c1b25

It introduced a regression.

Closes: https://github.com/containers/crun/issues/1783

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-06-11 19:24:38 +02:00
flouthoc 073b9b8622
Merge pull request #1782 from giuseppe/update-containerd-to-2.1.1
test: bump containerd version
2025-06-11 08:03:31 -07:00
Giuseppe Scrivano 48855fc463
Merge pull request #1781 from giuseppe/harden-sprintf
src: drop usage of sprintf
2025-06-11 15:21:42 +02:00
Giuseppe Scrivano 1412f0a8e8
test: bump containerd version
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-06-11 14:25:26 +02:00
Giuseppe Scrivano 82b75fa465
cfg.mk: prohibit usage of sprintf
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-06-11 11:33:12 +02:00
Giuseppe Scrivano af163aa511
container: use snprintf instead of sprintf
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-06-11 11:33:12 +02:00
Giuseppe Scrivano 07eef8ac77
container: use xasprintf instead of sprintf
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-06-11 11:33:12 +02:00
Giuseppe Scrivano c3c1928053
error: replace sprintf with snprintf
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-06-11 11:33:12 +02:00
Giuseppe Scrivano 4666e880d2
cgroup: replace sprintf with snprintf
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-06-11 11:33:12 +02:00
Giuseppe Scrivano c212049d8f
seccomp: replace sprintf with snprintf
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-06-11 11:33:11 +02:00
Giuseppe Scrivano 9bb4e901e3
linux, utils: use snprintf instead of sprintf
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-06-11 11:33:11 +02:00
Giuseppe Scrivano 4353d55a98
status: use snprintf instead of sprintf
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-06-11 11:33:11 +02:00
Giuseppe Scrivano 271f7f5060
intelrdt: use snprintf instead of sprintf
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-06-11 11:33:11 +02:00
Giuseppe Scrivano fd118c153a
cgroup-setup: drop unused variable
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-06-11 11:33:11 +02:00
flouthoc 8c689a90c5
Merge pull request #1780 from giuseppe/configure-fix-configure-ac-typo
configure.ac: fix variable name
2025-06-10 09:08:23 -07:00
Giuseppe Scrivano 4dbe754958
configure.ac: fix variable name
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-06-10 16:55:19 +02:00
flouthoc 8fb459a193
Merge pull request #1779 from giuseppe/fix-potential-segfault-mount-diagnose
linux: ensure fstype is not NULL
2025-06-10 06:53:25 -07:00
flouthoc 3cf05aaf29
Merge pull request #1778 from giuseppe/tests-refactoring
tests: fix mounts tests and improve error messages
2025-06-10 06:50:21 -07:00
Giuseppe Scrivano a95034a443
linux: ensure fstype is not NULL
there is no enforcement that fstype is always defined.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-06-10 12:46:34 +02:00
Giuseppe Scrivano d462c1cf1b
tests: improve error messages in start tests
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-06-10 12:20:19 +02:00
Giuseppe Scrivano f2ae65db48
tests: improve error messages in seccomp tests
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-06-10 12:20:19 +02:00
Giuseppe Scrivano 6b3485f337
tests: improve error messages in rlimits tests
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-06-10 12:20:19 +02:00
Giuseppe Scrivano dffbeaddaa
tests: improve error messages in resources tests
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-06-10 12:20:19 +02:00
Giuseppe Scrivano 8f518ee211
tests: improve error messages in oci_features tests
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-06-10 12:20:18 +02:00
Giuseppe Scrivano 2bfc81bdaf
tests: improve error messages in limits tests
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-06-10 12:20:18 +02:00
Giuseppe Scrivano 9b164379ee
tests: improve error messages in hostname tests
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-06-10 12:20:18 +02:00
Giuseppe Scrivano 39b374d53a
tests: improve error messages in exec tests
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-06-10 12:20:18 +02:00
Giuseppe Scrivano dbd25dccea
tests: improve error messages in domainname tests
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-06-10 12:20:18 +02:00
Giuseppe Scrivano 57769eeee3
tests: improve error messages in devices tests
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-06-10 12:20:18 +02:00
Giuseppe Scrivano 77fc894fae
tests: improve error messages in exec tests
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-06-10 12:20:18 +02:00
Giuseppe Scrivano be391abb2a
tests: improve error messages in mounts tests
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-06-10 12:20:18 +02:00
Giuseppe Scrivano d212647d55
tests: improve error messages in capabilities tests
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-06-10 12:20:18 +02:00
Giuseppe Scrivano c3bac03c13
tests: improve error reporting
and add timing information to test output

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-06-10 12:20:18 +02:00
Giuseppe Scrivano f62dcc217b
tests: fix mount of tmpfs
it does not accept the "bind" option, that is only for bind mounts.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-06-10 12:20:04 +02:00
Giuseppe Scrivano d318fa1a55
tests: recreate tests root for each test
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-06-10 12:11:07 +02:00
Giuseppe Scrivano 3f3893515a
Merge pull request #1776 from charlie0129/fix-rootless-option-parsing
fix rootless option parsing
2025-06-07 15:36:46 +02:00
flouthoc fec3782d0b
Merge pull request #1762 from giuseppe/allow-mount-symlinks
Add `src-nofollow` & `dest-nofollow` mount options
2025-06-06 08:14:02 -07:00
Giuseppe Scrivano 5393ffda47
Merge pull request #1774 from eriksjolund/create-missing-errors
container: create missing errors
2025-06-06 11:55:35 +02:00
Charlie Chiang 21e860c73f fix rootless option parsing
Signed-off-by: Charlie Chiang <charlie_c_0129@outlook.com>
2025-06-06 14:58:14 +08:00
Erik Sjölund 7b82568080
container: create missing errors
Signed-off-by: Erik Sjölund <erik.sjolund@gmail.com>
2025-06-05 19:19:27 +02:00
Giuseppe Scrivano a84e65648f
Merge pull request #1773 from eriksjolund/fix-asprintf-free
python: reset pointer after asprintf failure
2025-06-05 10:08:31 +02:00
Erik Sjölund 64fbacabbe
python: reset pointer after asprintf failure
Signed-off-by: Erik Sjölund <erik.sjolund@gmail.com>
2025-06-05 07:14:21 +02:00
Giuseppe Scrivano b21bebee29
Merge pull request #1772 from eriksjolund/run_create_deduplicate_code
run, create: deduplicate code
2025-06-04 09:39:38 +02:00
Giuseppe Scrivano 2c67616ac8
Merge pull request #1771 from eriksjolund/prefer-NULL-arg
libcrun: prefer waitpid_ignore_stopped NULL argument
2025-06-03 20:23:22 +02:00
Erik Sjölund cfcb839a2a
run, create: deduplicate code
Signed-off-by: Erik Sjölund <erik.sjolund@gmail.com>
2025-06-03 18:13:26 +02:00
Giuseppe Scrivano 957a6518d6
linux: add `src-nofollow` & `dest-nofollow` options
Introduce `src-nofollow` and `dest-nofollow` bind mount options
for more precise control over symbolic link handling.

The `src-nofollow` option enables mounting the source symbolic link
itself, rather than its target.

The `dest-nofollow` option ensures that if the destination path is
a symbolic link, the mount operation replaces the symbolic link
itself, instead of dereferencing it and mounting to its target.

Closes: https://github.com/containers/crun/issues/1761

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-06-03 09:13:20 +02:00
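
As an illustration of how the two options might appear in a container configuration, the Python sketch below builds an OCI-style mount entry and derives the symlink policy from its option list. The option names come from this commit message. The paths, the helper name and the returned dictionary are hypothetical, and nothing here performs an actual mount.

```python
# A bind mount entry in the shape used by OCI runtime configs.
# The paths are placeholders chosen for the example.
mount = {
    "destination": "/etc/localtime",
    "type": "bind",
    "source": "/etc/localtime",
    "options": ["bind", "src-nofollow", "dest-nofollow"],
}


def symlink_policy(options):
    """Report whether source and destination symlinks would be followed.

    Hypothetical helper: following symlinks is the default, and each
    "*-nofollow" option switches it off for one side of the mount.
    """
    return {
        "follow_source": "src-nofollow" not in options,
        "follow_destination": "dest-nofollow" not in options,
    }


policy = symlink_policy(mount["options"])
```

Here `policy` reports that neither side is dereferenced, so the source link itself would be mounted over the destination link itself.
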
Erik Sjölund 044c89d289
run, create: align implementations
Signed-off-by: Erik Sjölund <erik.sjolund@gmail.com>
2025-06-02 07:53:00 +02:00
Erik Sjölund 0479ae73f2
libcrun: prefer waitpid_ignore_stopped NULL argument
If the status value is not needed, pass the argument
NULL.

Signed-off-by: Erik Sjölund <erik.sjolund@gmail.com>
2025-06-02 07:05:53 +02:00
Giuseppe Scrivano 88f45286f3
criu: reject unsupported 'src-nofollow' option
CRIU does not support the 'src-nofollow' option for bind mounts.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-06-01 00:24:19 +02:00
Giuseppe Scrivano 602d0e1453
linux: add argument nofollow to is_bind_mount
now is_bind_mount returns whether the src-nofollow option was
specified for the bind mount.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-06-01 00:24:19 +02:00
Giuseppe Scrivano ae0ecdd74f
mountflags: add `dest-nofollow` and `src-nofollow` options
This commit introduces the `dest-nofollow` and `src-nofollow` mount
options.  These new options allow for more precise control
over how symbolic links are handled during mount-related
operations, by enabling a "no-follow" behavior for symlinks
at the specified path.

The `mount_flags.c` file has been regenerated by gperf
to include these new options in the lookup table.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-06-01 00:24:19 +02:00
Giuseppe Scrivano af39d91354
linux: refactor mount fd handling in do_mounts
A local `source_mountfd` variable is introduced to store
the file descriptor for the current mount entry.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-06-01 00:24:19 +02:00
Giuseppe Scrivano c77a1807e6
linux: add `nofollow` option to `get_bind_mount`
This change enhances the `get_bind_mount` function by
introducing a `nofollow` boolean parameter.  When set to
true, this parameter ensures that the `AT_SYMLINK_NOFOLLOW`
flag is passed to the `syscall_open_tree` call.

This allows the caller to specify that if the source of
the bind mount is a symbolic link, the link itself should
be mounted, rather than the target it points to.

All existing call sites have been updated to pass `false`
for the new `nofollow` parameter, thus preserving the
previous behavior of following symlinks by default. This
addition provides finer control for future use cases
requiring specific handling of symbolic links during
bind mount creation.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-06-01 00:24:18 +02:00
Giuseppe Scrivano c8d042b34b
linux: remove duplicate close of rootfsfd
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-06-01 00:24:18 +02:00
Giuseppe Scrivano 91d202a2d5
linux: drop unuseful variable
just pass the value to the function call.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-06-01 00:24:07 +02:00
Giuseppe Scrivano c1671bd0b4
github: show the diff for the check job
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-06-01 00:24:07 +02:00
Giuseppe Scrivano 9489b9962b
Merge pull request #1769 from eriksjolund/improve-dlopen-error-message
src: improve dlopen error message
2025-05-30 20:52:00 +02:00
Erik Sjölund 6c24739d39
src: improve dlopen error message
Signed-off-by: Erik Sjölund <erik.sjolund@gmail.com>
2025-05-29 17:08:40 +02:00
flouthoc b1a71e7ba6
Merge pull request #1753 from giuseppe/fix-opening-root
linux: treats empty path to safe_openat as root
2025-05-27 06:54:36 -07:00
Giuseppe Scrivano 8622175a84
Merge pull request #1760 from lsm5/tmt-disable-centos-stream-10-x86_64
Packit/TMT: disable centos-stream-10-x86_64 tests
2025-05-26 11:07:07 +02:00
Lokesh Mandvekar 33602a5145
Packit/TMT: disable centos-stream-10-x86_64 tests
Ref: https://github.com/containers/crun/pull/1758#issuecomment-2901772392

Issue filed: https://github.com/containers/crun/issues/1759

Signed-off-by: Lokesh Mandvekar <lsm5@fedoraproject.org>
2025-05-22 15:26:31 -04:00
Giuseppe Scrivano b973373bf7
Merge pull request #1758 from kolyshkin/pr-1747-followups
Fixup "criu: support mounts where dest is a symlink"
2025-05-22 21:06:05 +02:00
Giuseppe Scrivano 735014ffca
Merge pull request #1754 from tylerfanelli/krun-vm-config-flavor
krun: Parse libkrun flavor indicated in `KRUN_VM_FILE`
2025-05-22 17:57:22 +02:00
Lokesh Mandvekar b1133e9549 TMT: include podman checkpoint system tests
`podman checkpoint...` tests depend on crun as well so they should be
included in the TMT system test runs. These are currently failing on
podman upstream PRs, for example: [0], [1] and [2].

[0]: https://artifacts.dev.testing-farm.io/fd5c08eb-2b0d-4ea8-9c33-ac9bf3447bd8/
[1]: https://artifacts.dev.testing-farm.io/e227ef00-c5c4-43e6-9785-0b5b5fcd2908/
[2]: https://artifacts.dev.testing-farm.io/c4ef6e7a-3b5d-4afc-91b9-dd477ef699e3/

Signed-off-by: Lokesh Mandvekar <lsm5@fedoraproject.org>
2025-05-21 20:04:02 -07:00
Kir Kolyshkin fddb30429d Revert "criu: rename a variable"
This reverts commit 55498c1f8a.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-05-21 20:01:35 -07:00
Kir Kolyshkin 97fd76bf53 Fixup "criu: support mounts where dest is a symlink"
Commit e0b01580 "criu: support mounts where dest is a symlink" brought
in a few issues.

First, the concatenation of status->bundle and status->rootfs does not
make any sense (and I'm not yet sure why it is used as CRIU root).
Revert to using status->rootfs directly.

Second, if chroot_resolve returns ENOENT, it is OK (not all paths are
visible from the host -- some are only inside the mount namespace).
In this case, use the destination as is.

This should fix podman checkpoint failures.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-05-21 20:00:38 -07:00
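
The second point (treating ENOENT as acceptable and keeping the destination unchanged) can be sketched as follows. This is a Python illustration only: `resolver` is a hypothetical stand-in for crun's C function `chroot_resolve`, and the helper name is invented for the example.

```python
def resolve_destination(destination, resolver):
    """Resolve a mount destination, tolerating paths missing on the host.

    `resolver` is a hypothetical callable that returns the resolved path
    or raises FileNotFoundError (ENOENT).
    """
    try:
        return resolver(destination)
    except FileNotFoundError:
        # Not every path is visible from the host; some exist only
        # inside the container's mount namespace. Use the path as is.
        return destination
```

Any other error raised by the resolver still propagates, so only the "path does not exist on the host" case is relaxed.
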
Tyler Fanelli 379524f0bf
krun: Determine flavor from VM config tree
To support other krun flavors in the future, parse which flavor (and by
extension, which handle) should be used before making any libkrun API
calls past libkrun_create_ctx. The correct handle to be used can be
determined before any other subsequent API calls.

Signed-off-by: Tyler Fanelli <tfanelli@redhat.com>
2025-05-21 22:15:33 -04:00
Tyler Fanelli 7d1d15d71a
krun: Configure VM with pre-parsed config tree
The krun VM config was parsed before the VM was configured. Use the
pre-parsed config tree already available instead of parsing it once
more.

Signed-off-by: Tyler Fanelli <tfanelli@redhat.com>
2025-05-21 21:48:26 -04:00
Tyler Fanelli 13fcca9e7f
krun: Add function to parse krun VM config
To determine the libkrun handle to use, the container may embed a field
"flavor" in the KRUN_VM_FILE. To prepare for this, pre-parse the krun VM
config when exec'ing.

Signed-off-by: Tyler Fanelli <tfanelli@redhat.com>
2025-05-21 21:48:26 -04:00
flouthoc c3c51ba06c
Merge pull request #1755 from giuseppe/add-more-tests
add new tests
2025-05-21 14:12:09 -07:00
Giuseppe Scrivano fbd8ea8f48
tests: add new tests to test_pid
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-05-21 17:30:38 +02:00
Giuseppe Scrivano 3241e2c757
tests: add new test_uid_gid.py tests
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-05-21 17:30:38 +02:00
Giuseppe Scrivano 4f1734073f
tests: add new test_devices.py tests
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-05-21 17:30:38 +02:00
Giuseppe Scrivano 0b8455ea67
tests: add ischar, isblock, isfifo commands to init
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-05-21 17:30:38 +02:00
Giuseppe Scrivano 9056ec3bda
tests: add openwronly command to init
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-05-21 17:30:38 +02:00
Giuseppe Scrivano 42e5bc61da
tests: report the correct exit status for ls
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-05-21 17:30:38 +02:00
flouthoc 2924a20a6d
Merge pull request #1750 from giuseppe/support-net-devices
linux: add support for net devices
2025-05-21 07:53:35 -07:00
Giuseppe Scrivano 07374bb665
crun: expose net devices feature
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-05-21 12:34:03 +02:00
Giuseppe Scrivano 006c7aa1aa
libcrun: advertise net devices support
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-05-21 12:34:03 +02:00
Giuseppe Scrivano ba0ec5a7a9
linux: add support for network devices
https://github.com/opencontainers/runtime-spec/pull/1271 added support
for moving existing network devices to the container network
namespace.

Closes: https://github.com/containers/crun/issues/1712

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-05-21 12:34:03 +02:00
Giuseppe Scrivano fe8f3277b6
tests: add ip command to init
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-05-21 12:34:03 +02:00
Giuseppe Scrivano 64a2e0e1c6
linux: Update rootfsfd when rootfs is replaced
When a mount operation replaces the container's root filesystem ("/"),
the existing `rootfsfd` becomes stale.  This file descriptor would
still point to the old, now overmounted root, potentially causing
subsequent filesystem operations within the container setup to fail
or target the incorrect filesystem.

Closes: https://github.com/containers/crun/issues/1752

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-05-20 21:43:20 +02:00
Giuseppe Scrivano 7de03e622f
linux: safe_openat reopens root
If an empty path is used, reopen the rootfs directly so that it can
grab a reference to the topmost mount, not the previously open file
descriptor.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-05-20 21:43:20 +02:00
Giuseppe Scrivano 2e210bdce6
linux: use rootfsfd directly from container data
This change ensures that the file descriptor for the
rootfs is always sourced directly from the container's
private data.  This avoids potential stale file descriptor
issues that could happen if a local variable were used and
the descriptor in the private data was updated elsewhere.

Should not introduce any behavior change.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-05-20 21:43:20 +02:00
Giuseppe Scrivano e9d159f727
linux: store rootfsfd under private data only
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-05-20 21:43:20 +02:00
Giuseppe Scrivano 953a8c49d6
utils: crun_safe_ensure_at opens empty paths
if `do_open` is used with an empty path, then reopen the `dirpath`.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-05-20 21:42:38 +02:00
Giuseppe Scrivano 372446d747
tests: fix unused variable
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-05-20 16:54:59 +02:00
Giuseppe Scrivano 9ddddfb05a
test: fix mount to test
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-05-20 16:54:59 +02:00
Giuseppe Scrivano 040cb2e706
linux: include errno check in UNLIKELY macro
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-05-20 16:54:59 +02:00
Giuseppe Scrivano b337c9d34f
libocispec: update from upstream
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-05-20 12:20:21 +02:00
flouthoc b81374d6ca
Merge pull request #1748 from giuseppe/support-for-schemata-intel-rdt
Support generic Intel RDT schemata
2025-05-19 07:08:15 -07:00
Giuseppe Scrivano 6946703c25
Merge pull request #1749 from kolyshkin/mount-nits
linux: nits to mounting code
2025-05-16 12:27:34 +02:00
Kir Kolyshkin 37dacae3a3 linux: do_mount: simplify
Should not result in any change in functionality.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-05-15 15:36:33 -07:00
Kir Kolyshkin 0f74f03fe8 linux: use ALL_PROPAGATIONS_NO_REC
No functional change.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-05-15 15:27:51 -07:00
Giuseppe Scrivano 7f686825c2
intelrdt: add support for generic schemata update
This patch enhances Intel RDT management by enabling updates
to the generic `schemata` file within a resctrl group.

Closes: https://github.com/containers/crun/issues/1746

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-05-15 14:59:42 +02:00
Giuseppe Scrivano ca40dac71a
ci: show git status
The `make check` target should not result in any modifications
to the source tree. Adding these `git` commands helps ensure that
the working directory remains clean after tests run.

`git status` is added for visibility into any changes, and
`git describe --dirty` will cause the job to fail if
the working directory is not clean, preventing accidental
uncommitted changes generated by the test suite.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-05-15 14:59:42 +02:00
Giuseppe Scrivano 2ad0b600af
libocispec: sync from upstream
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-05-15 11:41:22 +02:00
Giuseppe Scrivano 558419b235
utils: do not use stack for lens array
The str_join_array function previously declared an array `lens`
on the stack with a variable length derived from its `size`
parameter.

Play safe and do not use the stack for a VLA.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-05-15 11:41:22 +02:00
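As an illustration of the change described in 558419b235, the sketch below keeps the per-element lengths in a heap buffer rather than in a variable-length array on the stack. It is not crun's `str_join_array`: the name `join_strings` and its signature are assumptions made for this example only.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not crun's str_join_array): joins `size` strings
   with `sep`.  The lengths live on the heap, so a large `size` cannot
   overflow the stack the way a VLA could.
   Returns a malloc'ed string, or NULL on allocation failure.  */
char *
join_strings (const char *sep, const char *const *array, size_t size)
{
  size_t sep_len = strlen (sep);
  size_t total = 1; /* room for the trailing NUL */
  size_t *lens;
  char *result, *p;
  size_t i;

  /* calloc (0, ...) may legitimately return NULL, so ask for at least one slot.  */
  lens = calloc (size ? size : 1, sizeof (*lens));
  if (lens == NULL)
    return NULL;

  for (i = 0; i < size; i++)
    {
      lens[i] = strlen (array[i]);
      total += lens[i] + (i > 0 ? sep_len : 0);
    }

  result = malloc (total);
  if (result == NULL)
    {
      free (lens);
      return NULL;
    }

  p = result;
  for (i = 0; i < size; i++)
    {
      if (i > 0)
        {
          memcpy (p, sep, sep_len);
          p += sep_len;
        }
      memcpy (p, array[i], lens[i]);
      p += lens[i];
    }
  *p = '\0';

  free (lens);
  return result;
}
```

The caller owns the returned string and releases it with `free`.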
Giuseppe Scrivano 7143e24103
Merge pull request #1747 from kolyshkin/criu-3
criu: fix c/r for container with bind mounts where dest is a symlink
2025-05-14 15:06:44 +02:00
Kir Kolyshkin 174963dc6c criu: avoid malloc in prepare_restore_mounts
The code allocates the new string when it can easily be avoided.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-05-13 10:31:31 -07:00
Kir Kolyshkin e0b0158075 criu: support mounts where dest is a symlink
This is a rough equivalent of runc PR 3047 [1], fixing c/r for
the case when the bind mount destination is a symlink.

Found when running runc's integration test named
"checkpoint and restore (bind mount, destination is symlink)".

[1]: https://github.com/opencontainers/runc/pull/3047

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-05-12 14:59:50 -07:00
Kir Kolyshkin 55498c1f8a criu: rename a variable
This is a preparation for the next commit.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-05-12 14:59:50 -07:00
Kir Kolyshkin 880b7ee3cd criu: allow mount type to be NULL
In runtime-spec, mount's type property is optional and thus it can be
NULL (for bind mounts). This is accounted for in other parts of code
but not in c/r.

This fix prevents SIGSEGV on crun checkpoint/restore.

Found when running runc's integration test named
"checkpoint and restore (bind mount, destination is symlink)"

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-05-12 14:59:50 -07:00
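The NULL-type case fixed by 880b7ee3cd can be sketched as follows. The `struct mount_entry` type and the `mount_is_bind` function are hypothetical stand-ins for this example, not crun's data structures or its `is_bind_mount`; the point is only that the optional `type` field is checked before it is passed to `strcmp`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an OCI mount entry; only the fields used here.  */
struct mount_entry
{
  const char *type;            /* optional in runtime-spec, may be NULL */
  const char *const *options;  /* may be NULL when options_len is 0 */
  size_t options_len;
};

/* Returns true if the mount looks like a bind mount.  */
bool
mount_is_bind (const struct mount_entry *m)
{
  size_t i;

  /* Guard first: calling strcmp with a NULL pointer is undefined behavior
     and typically ends in SIGSEGV.  */
  if (m->type != NULL && strcmp (m->type, "bind") == 0)
    return true;

  for (i = 0; i < m->options_len; i++)
    if (strcmp (m->options[i], "bind") == 0
        || strcmp (m->options[i], "rbind") == 0)
      return true;

  return false;
}
```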
Kir Kolyshkin a9c1f02468 criu: reuse is_bind_mount
This simplifies the code a bit.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-05-12 14:59:47 -07:00
Giuseppe Scrivano 819c2a76af
Merge pull request #1740 from sgsaenger/patch-1
Fix python call to `libcrun_get_verbosity`
2025-05-12 11:46:39 +02:00
Sebastian Gsänger b4b290888b Fix python call to `libcrun_get_verbosity`
Fixes erroneous call with an unused argument to a 0-argument function.
Current call fails with GCC 15.1:
```
python/crun_python.c: In function 'get_verbosity':
python/crun_python.c:461:27: error: too many arguments to function 'libcrun_get_verbosity'; expected 0, have 1
  461 |   return PyLong_FromLong (libcrun_get_verbosity (verbosity));
      |                           ^~~~~~~~~~~~~~~~~~~~~  ~~~~~~~~~
In file included from /home/user/.cache/yay/crun-krun/src/crun/src/libcrun/container.h:24:
/home/user/.cache/yay/crun-krun/src/crun/src/libcrun/error.h:109:20: note: declared here
  109 | LIBCRUN_PUBLIC int libcrun_get_verbosity ();
      |                    ^~~~~~~~~~~~~~~~~~~~~
```

Signed-off-by: Sebastian Gsänger <8004308+sgsaenger@users.noreply.github.com>
2025-05-09 01:36:48 +02:00
Giuseppe Scrivano 16b7ae4fe5
Merge pull request #1742 from kolyshkin/criu-2
criu: validate --parent-path
2025-05-08 23:22:46 +02:00
Giuseppe Scrivano 9d48af53e7
Merge pull request #1743 from kolyshkin/1741-followup
Followups to #1741
2025-05-08 23:21:55 +02:00
Kir Kolyshkin c1e72c1090 tests/test_checkpoint_restore.py: fixup
Commit 0dceab0c did not define the work_dir for one of the cases.

Fixes: 0dceab0c ("tests: add --work-path to criu test")
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-05-08 13:43:00 -07:00
Kir Kolyshkin e5d2489dd2 criu restore: create --work-path
Commit 90ef9732 ("criu: create --work-path directory") forgot to add the
same logic upon restore. If the --work-path used during restore differs
from one that was used during checkpoint, criu fails.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-05-08 13:25:55 -07:00
Kir Kolyshkin 157a673d43 criu checkpoint: error message fixup
Fixes: 90ef973 ("criu: create --work-path directory")
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-05-08 13:25:30 -07:00
Kir Kolyshkin 91aa3683e8 criu: validate --parent-path
The --parent-path argument should be relative to the image path, or CRIU
will fail with a not very helpful error message, like ENOENT.

Add validation of --parent-path and helpful errors.

This is similar to [1] and fixes the runc test named
"checkpoint --pre-dump (bad --parent-path)".

[1]: https://github.com/opencontainers/runc/pull/2913

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-05-08 12:49:48 -07:00
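A minimal sketch of the kind of validation 91aa3683e8 describes, assuming the rule is "the parent path must be relative and must name a directory when resolved against the image path". The function name `validate_parent_path` and the exact checks are assumptions for illustration; the actual crun code may check more or report errors differently.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical validator: returns 0 if `parent_path` is usable, or a
   negative errno value that the caller can turn into a clear message.  */
int
validate_parent_path (const char *image_path, const char *parent_path)
{
  struct stat st;
  int dirfd, ret;

  /* An absolute path cannot be relative to the image directory.  */
  if (parent_path[0] == '/')
    return -EINVAL;

  dirfd = open (image_path, O_DIRECTORY | O_RDONLY | O_CLOEXEC);
  if (dirfd < 0)
    return -errno;

  /* Resolve parent_path relative to the image directory.  */
  ret = fstatat (dirfd, parent_path, &st, 0);
  if (ret < 0)
    ret = -errno;
  else if (! S_ISDIR (st.st_mode))
    ret = -ENOTDIR;

  close (dirfd);
  return ret;
}
```

Returning a distinct code for each failure lets the caller say which rule was violated instead of surfacing a bare ENOENT from CRIU.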
flouthoc db82220a11
Merge pull request #1741 from kolyshkin/criu
crun checkpoint/restore: create --work-path directory
2025-05-08 08:40:41 -07:00
Kir Kolyshkin 0dceab0c03 tests: add --work-path to criu test
This is mostly to check that --work-path works. Later we may also use it
to spit some diagnostics in case a test fails.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-05-08 08:08:09 -07:00
Kir Kolyshkin 90ef9732f8 criu: create --work-path directory
There is a discrepancy in behavior between how crun and runc
handle --work-path argument:
 - runc creates the directory on checkpoint and restore [1];
 - crun fails with ENOENT if the directory doesn't exist.

Make crun create the work-path directory, fixing the following error:

> crun checkpoint --work-path ./work-dir test_busybox (status=1):
> error opening CRIU work directory `./work-dir`: No such file or directory

Found when running runc's checkpoint.bats tests with crun.

[1]: https://github.com/opencontainers/runc/commit/a8d5fdf1

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-05-08 08:08:09 -07:00
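The behavior introduced by 90ef9732f8 (create the work directory instead of failing with ENOENT) can be approximated as below. `ensure_directory` is a hypothetical name, and this sketch only creates the last path component; whether crun also creates missing parents is not stated in the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: make sure `path` exists and is a directory.
   Returns 0 on success or a negative errno value.  */
int
ensure_directory (const char *path, mode_t mode)
{
  struct stat st;

  if (mkdir (path, mode) == 0)
    return 0;

  /* Anything other than "already exists" is a real failure.  */
  if (errno != EEXIST)
    return -errno;

  /* The path exists; accept it only if it is a directory.  */
  if (stat (path, &st) < 0)
    return -errno;

  return S_ISDIR (st.st_mode) ? 0 : -ENOTDIR;
}
```

Calling it before opening the work directory makes the operation idempotent: a second checkpoint or restore with the same path succeeds without extra handling.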
Giuseppe Scrivano 851254d54b
Merge pull request #1739 from giuseppe/update-nix-2025-5-7
nix: update nixpkgs
2025-05-08 09:01:15 +02:00
Giuseppe Scrivano 1fa8649277
nix: update nixpkgs
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-05-07 16:17:50 +02:00
Daniel J Walsh bec937219c
Merge pull request #1737 from giuseppe/improve-mount-failure-message
linux: improve cgroup2 mount error message
2025-05-06 06:40:37 -04:00
Giuseppe Scrivano b979642067
linux: improve cgroup2 mount error message
Closes: https://github.com/containers/crun/issues/1736

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-05-06 12:09:31 +02:00
Giuseppe Scrivano b679e7f765
linux: move unified_cgroup_path to private_data
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-05-06 12:09:30 +02:00
Giuseppe Scrivano 4fdf8afd8c
Merge pull request #1733 from promansk/fix-cli-help-text
crun: fix the binary name displayed in the Usage info for commands.
2025-05-05 22:48:02 +02:00
Giuseppe Scrivano 486c5d1665
Merge pull request #1734 from eriksjolund/getcwd-fix-errno
linux: use syscall getcwd return value to set error
2025-05-05 22:46:22 +02:00
Giuseppe Scrivano bf72edbf92
Merge pull request #1735 from eriksjolund/fix-getcwd-error-handling
Fix getcwd error handling
2025-05-05 22:45:05 +02:00
Erik Sjölund ffbfb6f355
Fix getcwd error handling
Handle getcwd failure.
Fix getcwd error message.

Signed-off-by: Erik Sjölund <erik.sjolund@gmail.com>
2025-05-05 20:33:01 +02:00
Erik Sjölund 8c0075b516
linux: use syscall getcwd return value to set error
Reference:
df67cb4c58/fs/d_path.c (L394-L411)

Signed-off-by: Erik Sjölund <erik.sjolund@gmail.com>
2025-05-05 19:34:06 +02:00
Piotr Romanski 99c7b959f7 crun: Fix the binary name in the Usage info
This patch solves the bug where the Usage info for
commands displays the command name instead of "crun"
as the crun binary name.
E.g. for `crun list --help` crun prints "Usage: list [OPTION...] list"
while it should print "Usage: crun [OPTION...] list".

This fix clones and modifies the `argc` and `argv`
for the second `argp_parse` execution - the `parse_opt`
of each command will now receive the binary name (crun)
instead of the command name in argv[0].
The change is transparent to commands, because
- the command name is extracted and used in the first
`argp_parse` execution in crun, and
- the argp_parse doesn't call the parse_opt with the argv[0]
by default

Signed-off-by: Piotr Romanski <piotrek.romanski@gmail.com>
2025-05-05 17:50:49 +02:00
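The argv handling described in 99c7b959f7 can be pictured with the following sketch, which builds a shallow copy of the argument vector with a different `argv[0]`. The helper name `clone_argv_with_name` is an assumption, and the real patch may copy or adjust `argc`/`argv` differently.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a NULL-terminated shallow copy of argv
   whose first element is `name`.  The original argv is left untouched.
   Only the returned array must be freed, not the strings it points to.  */
char **
clone_argv_with_name (const char *name, int argc, char **argv)
{
  char **copy;
  int i;

  if (argc < 1)
    return NULL;

  copy = malloc (((size_t) argc + 1) * sizeof (*copy));
  if (copy == NULL)
    return NULL;

  copy[0] = (char *) name; /* the binary name replaces the command name */
  for (i = 1; i < argc; i++)
    copy[i] = argv[i];
  copy[argc] = NULL;

  return copy;
}
```

The copy would then be handed to the second `argp_parse` call in place of the original vector, so the parser sees the binary name in `argv[0]`.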
Giuseppe Scrivano b812ced4ca
Merge pull request #1732 from lsm5/packit-revert-diff-scan-disable
Revert "Packit: Disable osh_diff_scan"
2025-05-01 19:36:53 +02:00
Lokesh Mandvekar 19b4c25485
Revert "Packit: Disable osh_diff_scan"
After checking further with Siteshwar, disabling this job in #1731
was a rather hasty decision. The Packit osh diff scan job hasn't caused any
actual noise upstream from false positives; on the contrary, it
has found actual issues.
Refs:
[0] https://github.com/containers/crun/pull/1708#issuecomment-2761459459
[1] https://github.com/containers/crun/pull/1689#issuecomment-2706391593
[2] https://github.com/containers/crun/pull/1695#issuecomment-2737525652

A false positive result in a Packit osh diff scan would pass with a
warning sign:
Ref: https://github.com/stratis-storage/stratisd/pull/3811/checks?check_run_id=41436634933

So not really any annoyance upstream.

Siteshwar will disable crun in the mass scans, so that should suffice.

This reverts commit dd8e1af5aa.

Signed-off-by: Lokesh Mandvekar <lsm5@fedoraproject.org>
2025-05-01 17:21:57 +05:30
flouthoc 1cdec7985d
Merge pull request #1731 from lsm5/osh-diff-scan-disable
Packit: Disable osh_diff_scan
2025-04-30 07:56:08 -07:00
Giuseppe Scrivano cd060bca0f
Merge pull request #1730 from KristinaHa26/riscv64
Revert "Disable criu support on riscv64"
2025-04-30 16:25:55 +02:00
Lokesh Mandvekar dd8e1af5aa
Packit: Disable osh_diff_scan
The differential scans run by Packit currently report false positives.

Fixes: #1729

Signed-off-by: Lokesh Mandvekar <lsm5@fedoraproject.org>
2025-04-30 18:28:16 +05:30
Kristina Hanicova 5ec1f2ab6b Revert "Disable criu support on riscv64"
This reverts commit 4ea62f2521.

Signed-off-by: Kristina Hanicova <khanicov@redhat.com>
2025-04-30 14:53:49 +02:00
Giuseppe Scrivano 71f9f917ba
Merge pull request #1726 from kolyshkin/bump-shellcheck
ci: improve shellcheck job
2025-04-28 10:04:33 +02:00
Giuseppe Scrivano 636520bf46
Merge pull request #1728 from kolyshkin/krun-fix
krun.1: regenerate
2025-04-28 09:45:21 +02:00
Kir Kolyshkin bdd62e1740 krun.1: regenerate
This fixes NAME section formatting for whatis parsing (fixed in
go-md2man v2.0.5, see https://github.com/cpuguy83/go-md2man/pull/124).

A similar change to crun(1) was (implicitly) made by commit b5a566bf.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-04-24 19:07:12 -07:00
Kir Kolyshkin 739a2bffa1 build-aux/release.sh: fix shellcheck warnings
Too many to mention, but most are:

> SC2086 (info): Double quote to prevent globbing and word splitting

Use bash arrays where it makes sense, and move the repeated stuff into
the BUILD_CMD.

PS I tried to make it more readable, but in some places with all the
added quotes it might actually become worse, so feel free to drop this
commit.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-04-24 16:46:36 -07:00
Kir Kolyshkin 5c14c0dc1b make shellcheck: add more files
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-04-24 16:46:36 -07:00
Kir Kolyshkin f1cf3b35f9 tests/run_all_tests.sh: fix shellcheck issues
These ones:
>
> In tests/run_all_tests.sh line 9:
> rm -f *.trs
>       ^-- SC2035 (info): Use ./*glob* or -- *glob* so names with dashes won't become options.
>
>
> In tests/run_all_tests.sh line 18:
>     ./tap-driver.sh --test-name $i --log-file $i.log --trs-file $i.trs ${COLOR} --enable-hard-errors yes --expect-failure no -- /usr/bin/python $i
>                                 ^-- SC2086 (info): Double quote to prevent globbing and word splitting.
>                                               ^-- SC2086 (info): Double quote to prevent globbing and word splitting.
>                                                                 ^-- SC2086 (info): Double quote to prevent globbing and word splitting.
>                                                                        ^------^ SC2086 (info): Double quote to prevent globbing and word splitting.
>                                                                                                                                                 ^-- SC2086 (info): Double quote to prevent globbing and word splitting.
>
> Did you mean:
>     ./tap-driver.sh --test-name "$i" --log-file "$i".log --trs-file "$i".trs "${COLOR}" --enable-hard-errors yes --expect-failure no -- /usr/bin/python "$i"
>
>
> In tests/run_all_tests.sh line 21:
> if grep FAIL *.trs; then
>              ^-- SC2035 (info): Use ./*glob* or -- *glob* so names with dashes won't become options.
>
> For more information:
>   https://www.shellcheck.net/wiki/SC2035 -- Use ./*glob* or -- *glob* so name...
>   https://www.shellcheck.net/wiki/SC2086 -- Double quote to prevent globbing ...
>

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-04-24 16:46:36 -07:00
Kir Kolyshkin d084c9b5e3 ci: bump shellcheck to v0.10.0
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-04-24 16:46:32 -07:00
Kir Kolyshkin 10e312d59b ci: improve shellcheck job
1. Use env directive instead of adding to $GITHUB_ENV.
2. Use bash herefile to feed sha256sum instead of pipe to grep.
3. Use ubuntu-latest.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-04-24 16:44:31 -07:00
Giuseppe Scrivano 69f1cece1c
Merge pull request #1727 from kolyshkin/man-fix
crun.1: fix "CPU controller" table rendering
2025-04-24 23:22:19 +02:00
flouthoc fb9a4cbb4d
Merge pull request #1725 from giuseppe/ci-install-awk
ci: Add awk dependency
2025-04-24 13:33:53 -07:00
Kir Kolyshkin 1e518be642 crun.1: fix "CPU controller" table rendering
There might be a bug in go-md2man (I haven't taken a look yet), but it
results in wrong typesetting in this table, which is rendered as:

       ┌─────────┬────────────────────────┬────────────────────────┬────────────────────────┐
       │ OCI (x) │ cgroup 2 value (y)     │ conversion             │ comment                │
       ├─────────┼────────────────────────┼────────────────────────┼────────────────────────┤
       │ shares  │ cpu.weight             │ y = (1 + ((x  -  2)  * │                        │
       │         │                        │ 9999) / 262142)        │                        │
       ├─────────┼────────────────────────┼────────────────────────┼────────────────────────┤
       │         │ convert           from │                        │                        │
       │         │ [2-262144]          to │                        │                        │
       │         │ [1-10000]              │                        │                        │
       ├─────────┼────────────────────────┼────────────────────────┼────────────────────────┤

Apparently, escaping the * fixes things, so it now looks like it's
supposed to:

       ┌─────────┬────────────────────┬─────────────────────────────────────┬────────────────────────┐
       │ OCI (x) │ cgroup 2 value (y) │ conversion                          │ comment                │
       ├─────────┼────────────────────┼─────────────────────────────────────┼────────────────────────┤
       │ shares  │ cpu.weight         │ y = (1 + ((x - 2) * 9999) / 262142) │ convert           from │
       │         │                    │                                     │ [2-262144]          to │
       │         │                    │                                     │ [1-10000]              │
       ├─────────┼────────────────────┼─────────────────────────────────────┼────────────────────────┤

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-04-24 10:59:44 -07:00
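The conversion shown in the table of 1e518be642, y = (1 + ((x - 2) * 9999) / 262142), can be worked through in code. The function name `shares_to_weight` is made up for this example, and the sketch assumes integer (truncating) division, which the table itself does not spell out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: maps OCI cpu shares in [2, 262144] to a cgroup v2
   cpu.weight in [1, 10000] using the formula from the table.
   64-bit arithmetic is used because 262142 * 9999 does not fit in 32 bits.  */
uint64_t
shares_to_weight (uint64_t shares)
{
  /* The formula is only defined on the documented input range.  */
  assert (shares >= 2 && shares <= 262144);

  return 1 + ((shares - 2) * 9999) / 262142;
}
```

With these assumptions the endpoints map as expected: 2 gives 1 and 262144 gives 10000, while 1024 gives 39.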
Giuseppe Scrivano e7f8dc33af
github: use ubuntu-latest for shellcheck
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-04-24 11:22:56 +02:00
Giuseppe Scrivano 10db1f7a89
utils: Mark base64 table as non-string data
Add the `__attribute__ ((__nonstring__))` to the base64
character lookup table.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-04-24 10:59:21 +02:00
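For context on 10db1f7a89: a 64-byte lookup table initialized from a 64-character literal has no room for the terminating NUL, which is valid C but is presumably what newer compilers warn about. The sketch below shows the attribute on such a table. The `NONSTRING` macro, the table name and `b64_encode_triplet` are illustrative assumptions, not crun's code.

```c
#include <assert.h>
#include <string.h>

/* Use the attribute only where the compiler knows it.  */
#if defined(__has_attribute)
# if __has_attribute(__nonstring__)
#  define NONSTRING __attribute__ ((__nonstring__))
# endif
#endif
#ifndef NONSTRING
# define NONSTRING
#endif

/* Exactly 64 characters and no trailing NUL: this is a byte table, not a
   C string, and the attribute documents that to the compiler.  */
static const char b64_table[64] NONSTRING
    = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Encodes three input bytes as four base64 characters (no padding logic).  */
void
b64_encode_triplet (const unsigned char in[3], char out[4])
{
  out[0] = b64_table[in[0] >> 2];
  out[1] = b64_table[((in[0] & 0x03) << 4) | (in[1] >> 4)];
  out[2] = b64_table[((in[1] & 0x0f) << 2) | (in[2] >> 6)];
  out[3] = b64_table[in[2] & 0x3f];
}
```

The table is only ever indexed, never passed to string functions, so the missing terminator is harmless.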
Giuseppe Scrivano 245d2edfe9
ci: Add awk dependency to test containers
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-04-24 10:59:21 +02:00
Giuseppe Scrivano 51958b39d7
ci: Add awk dependency
Add awk to the list of installed packages in the
build, test, and shellcheck jobs.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-04-24 10:59:21 +02:00
Giuseppe Scrivano 8b63d4a23c
Merge pull request #1722 from containers/dependabot/github_actions/uraimo/run-on-arch-action-3.0.1
build(deps): bump uraimo/run-on-arch-action from 3.0.0 to 3.0.1
2025-04-24 10:59:14 +02:00
Giuseppe Scrivano a5df9509d0
Merge pull request #1723 from lsm5/tmt-shellcheck
shellcheck: resolve warnings in TMT test script
2025-04-23 16:36:49 +02:00
Lokesh Mandvekar 9adca80681
Packit/TMT: run shellcheck tests on fedora envs
The ShellCheck tool currently doesn't exist on EPEL10, so we run them
only on Fedora and EL9 envs.
Ref: https://bugzilla.redhat.com/show_bug.cgi?id=2328756

Packit test failure notifications have also been modified to reflect
tests other than podman tests.

Signed-off-by: Lokesh Mandvekar <lsm5@fedoraproject.org>
2025-04-23 18:18:58 +05:30
Lokesh Mandvekar 6e1ef46472
ShellCheck: resolve warnings in TMT test script
TODO: Run shellcheck on test scripts before executing them. ShellCheck
is not available on epel10 yet, so we're kinda sorta blocked:
Ref: https://bugzilla.redhat.com/show_bug.cgi?id=2328756

Signed-off-by: Lokesh Mandvekar <lsm5@fedoraproject.org>
2025-04-23 17:26:08 +05:30
Giuseppe Scrivano 8dbf343906
Merge pull request #1684 from lsm5/tmt-simplify-plans
TMT: Simplify plans
2025-04-22 18:27:55 +02:00
Lokesh Mandvekar 7b910ea298
TMT: Simplify plans
The same set of tests are being run upstream and downstream.

Enablement of podman-next copr for bleeding-edge testing is handled by
plans/main.fmf and .packit.yaml automatically so there's no additional
need to specify `/upstream` and `/downstream`.

Also, no need for a redundant `/system-test` inside
test/tmt/podman/system-test.fmf.

Signed-off-by: Lokesh Mandvekar <lsm5@fedoraproject.org>
2025-04-22 16:49:25 +05:30
dependabot[bot] 9b395e9aec
build(deps): bump uraimo/run-on-arch-action from 3.0.0 to 3.0.1
Bumps [uraimo/run-on-arch-action](https://github.com/uraimo/run-on-arch-action) from 3.0.0 to 3.0.1.
- [Release notes](https://github.com/uraimo/run-on-arch-action/releases)
- [Commits](https://github.com/uraimo/run-on-arch-action/compare/v3.0.0...v3.0.1)

---
updated-dependencies:
- dependency-name: uraimo/run-on-arch-action
  dependency-version: 3.0.1
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-04-21 17:30:15 +00:00
Giuseppe Scrivano 628140d55b
Merge pull request #1720 from kolyshkin/unused-prepro
[nit] linux: remove unused preprocessor directives
2025-04-11 08:31:57 +02:00
Kir Kolyshkin 9f2604acdb linux: remove unused preprocessor directives
Since commit 21e0179b ("linux: fix definition of CLONE_NEWCGROUP") and
commit 684b2540 ("linux: fix definition of CLONE_NEWTIME") these two
defines can't ever be zero (unless defined to be 0 in the included
files, which is highly unlikely), so #if CLONE_NEWCGROUP and #if
CLONE_NEWTIME don't make much sense now.

Remove those, and simplify container_has_cgroupns.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-04-10 11:47:21 -07:00
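One way to read 9f2604acdb: once the flags are guaranteed to have their real, non-zero values, guards such as `#if CLONE_NEWCGROUP` are always true and can go. The sketch below assumes fallback definitions with the Linux UAPI values; the helper `wants_cgroupns` is hypothetical and is not crun's `container_has_cgroupns`.

```c
#include <assert.h>
#include <stdbool.h>
#include <linux/sched.h>

/* Fallbacks for old kernel headers.  These are the Linux UAPI values, so
   the macros are never 0 and "#if CLONE_NEWCGROUP" would always be true.  */
#ifndef CLONE_NEWCGROUP
# define CLONE_NEWCGROUP 0x02000000
#endif
#ifndef CLONE_NEWTIME
# define CLONE_NEWTIME 0x00000080
#endif

/* Hypothetical helper: no preprocessor guard is needed around the test.  */
bool
wants_cgroupns (unsigned long clone_flags)
{
  return (clone_flags & CLONE_NEWCGROUP) != 0;
}
```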
flouthoc d71846893e
Merge pull request #1719 from giuseppe/fix-time-test-centos-stream-9
linux: fix definition of CLONE_NEWTIME on Centos 9
2025-04-10 10:06:20 -07:00
Giuseppe Scrivano 21e0179b01
linux: fix definition of CLONE_NEWCGROUP
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-04-10 17:27:14 +02:00
Giuseppe Scrivano 684b2540a3
linux: fix definition of CLONE_NEWTIME
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-04-10 17:26:40 +02:00
Giuseppe Scrivano 3e9b1c4ace
linux: fix definition of CLONE_NEWTIME on Centos 9
Include <linux/sched.h> header to provide the proper definition of
CLONE_NEWTIME.

Closes: https://github.com/containers/crun/issues/1718

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-04-10 17:25:20 +02:00
Giuseppe Scrivano 9b6bd79b21
Merge pull request #1714 from giuseppe/more-uniform-err-messages
src: make error messages more uniform
2025-04-04 15:54:45 +02:00
Giuseppe Scrivano 5e48e2ad59
Merge pull request #1716 from slp/krun-stop-using-workdir
krun: stop using set_workdir
2025-04-04 14:40:57 +02:00
Sergio Lopez fc3ae8e21d krun: stop using set_workdir
Stop using set_workdir, which can be dangerous, and rely exclusively
on the config file for setting up the workdir in the container.

Fixes: #1691

Signed-off-by: Sergio Lopez <slp@redhat.com>
2025-04-04 08:17:31 -04:00
Giuseppe Scrivano dfb649b39f
linux: fix error leak from sync fd
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-04-04 13:52:53 +02:00
Giuseppe Scrivano 8cb44cd684
container: don't leak error
reuse the error returned from libcrun_container_load_from_file.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-04-04 13:52:53 +02:00
Giuseppe Scrivano 0d760a96b9
linux: fix error leak
crun_safe_create_and_open_ref_at already creates the error.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-04-04 13:52:53 +02:00
Giuseppe Scrivano 3649947bdd
linux: make error messages more uniform
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-04-04 13:52:53 +02:00
Giuseppe Scrivano 75e74bf7f4
container: make error messages more uniform
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-04-04 13:52:53 +02:00
Giuseppe Scrivano 9774d59ac6
utils: make error messages more uniform
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-04-04 13:52:52 +02:00
Giuseppe Scrivano b656f67f46
cgroup: fix quoting for file names in error messages
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-04-04 13:52:48 +02:00
Daniel J Walsh 7cd3518539
Merge pull request #1715 from giuseppe/fix-podman-ginkgo-location
tests, podman: fix ginkgo installation
2025-04-04 07:46:04 -04:00
Giuseppe Scrivano 6d4f538741
tests: configure additional IDs for the containers user
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-04-04 11:43:53 +02:00
Giuseppe Scrivano 3b2e2a13a2
tests, podman: fix ginkgo installation
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-04-04 11:16:14 +02:00
Giuseppe Scrivano 9c972c3755
Merge pull request #1713 from flouthoc/krun-error
krun: return `dlerror` string when library is not found
2025-04-03 17:29:17 +02:00
flouthoc c359fbd8e0
krun: return dlerror string when library is not found
Without this patch, krun returns an incomplete error that is difficult to debug:
```console
Error: OCI runtime error: /home/crun/krun: failed to open library `libkrun.so.1` and `libkrun-sev.so.1` for krun_config
```

after patch
```console
Error: /home/ar/work/crun/krun: failed to open library `libkrun.so.1` and `libkrun-sev.so.1` for krun_config: libkrun-sev.so.1: cannot open shared object file: No such file or directory: OCI runtime attempted to invoke a command that was not found
```

Signed-off-by: flouthoc <flouthoc.git@gmail.com>
2025-04-02 20:49:45 -07:00
flouthoc 918668454f
Merge pull request #1699 from giuseppe/add-mounts-command
crun: add mounts command
2025-04-02 07:57:30 -07:00
Giuseppe Scrivano 285574fe29
tests: add tests for "crun mounts"
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-04-02 14:20:03 +02:00
Giuseppe Scrivano b5a566bf01
crun: expose mounts command
add a new CLI command "crun mounts add|remove $CTR $FILE" to alter the
mounts of a running container.

The "crun mounts add" command adds the mounts specified in the $FILE
file to the mount namespace of the container process.

Differently, "crun mounts remove" can be used to remove a set of
mounts from the container mount namespace.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-04-02 14:20:02 +02:00
Giuseppe Scrivano 196ad5e418
container: add/rm mounts API for a running container
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-04-02 14:20:02 +02:00
Giuseppe Scrivano 4a27212af8
linux: move prepare_mount to its only caller
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-04-02 14:20:02 +02:00
Giuseppe Scrivano c1c5232d4d
linux: split code to new function
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-04-02 14:20:02 +02:00
Giuseppe Scrivano c7337717a2
linux: refactor code in a new function
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-04-02 14:20:02 +02:00
Giuseppe Scrivano 9acf13d6c1
tests: fix function signature
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-04-02 14:20:01 +02:00
Giuseppe Scrivano 052778356a
status: report better error on ENOENT
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-04-02 14:20:01 +02:00
Giuseppe Scrivano 74a8d44b54
Merge pull request #1711 from eriksjolund/fix_add_error_release
libcrun, krun: use existing error
2025-04-01 10:32:42 +02:00
Erik Sjölund c6197431b2
libcrun, krun: use existing error
Signed-off-by: Erik Sjölund <erik.sjolund@gmail.com>
2025-04-01 08:18:44 +02:00
Giuseppe Scrivano 9e59075e22
Merge pull request #1709 from sky-big/main
seccomp plugins and seccomp receivers cannot be declared at the same time
2025-03-31 12:37:15 +02:00
Xingwen Xu e229c12987 seccomp plugins and seccomp receivers cannot be declared at the same time
Signed-off-by: Xingwen Xu <sky_big@yeah.net>
Signed-off-by: sky-big <sky_big@yeah.net>
2025-03-29 13:06:53 +08:00
Giuseppe Scrivano db8977e949
Merge pull request #1710 from giuseppe/fix-compiler-warning-krun
src: remove unused variables
2025-03-28 15:34:45 +01:00
Giuseppe Scrivano 18af45153c
src: remove unused variables
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-03-28 15:07:09 +01:00
Daniel J Walsh d7938c95ee
Merge pull request #1708 from slp/krun-external-kernel
krun: implement support for external kernels
2025-03-28 09:56:10 -04:00
Giuseppe Scrivano 0f732e3187
Merge pull request #1704 from eriksjolund/getsubidrange-fix-return-value
utils: getsubidrange returns negative value on errors
2025-03-28 09:18:36 +01:00
flouthoc 788c46835c
Merge pull request #1707 from giuseppe/tag-1.21
NEWS: tag 1.21
2025-03-27 15:06:46 -07:00
Giuseppe Scrivano 10269840aa
NEWS: tag 1.21
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-03-27 21:56:45 +01:00
Sergio Lopez 84828c6551 krun: bump vcpu limit to 16
A couple of releases ago, libkrunfw bumped MAX_CPUS to 16, so do the same
here.

Signed-off-by: Sergio Lopez <slp@redhat.com>
2025-03-27 01:23:03 -04:00
Sergio Lopez 079f95d0ed krun: implement support for external kernels
libkrun has recently acquired the ability to load external kernels. Use
it to enable users to bundle kernel images in their container files.

Note: in our execution model, the kernel in the microVM is not (and has
never been) considered a trusted component. This is the reason why we
put the whole VMM inside the container and avoid using vhost or any
other features that require special privileges, beyond KVM.

Signed-off-by: Sergio Lopez <slp@redhat.com>
2025-03-27 01:23:03 -04:00
Sergio Lopez a7a178a1ba krun: consolidate configuration file definitions
Turn "/krun-sev.json" references into a global id, and document the
purpose of each file.

Signed-off-by: Sergio Lopez <slp@redhat.com>
2025-03-27 01:22:59 -04:00
Erik Sjölund 059445af99
utils: getsubidrange returns negative value on errors
Signed-off-by: Erik Sjölund <erik.sjolund@gmail.com>
2025-03-25 17:43:31 +01:00
Giuseppe Scrivano 08df5ec520
Merge pull request #1701 from giuseppe/set-home-unknown-id
utils: set_home_env returns negative value on errors
2025-03-25 10:19:02 +01:00
Giuseppe Scrivano a197cd6588
Merge pull request #1703 from kolyshkin/fix-1702
maybe_chown_std_streams: ignore EBADF
2025-03-25 10:17:34 +01:00
Kir Kolyshkin ff054fe719 maybe_chown_std_streams: ignore EBADF
In some cases, stdin/out/err might be legitimately closed; let's not
bail out early because of that.

Fixes: https://github.com/containers/crun/issues/1702
Fixes: cab3d524cc ("crun: chown std streams")
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-03-24 14:07:43 -07:00
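The idea behind ff054fe719 can be sketched as a wrapper that treats a closed descriptor as a non-error. `chown_stream` is a hypothetical name; the real `maybe_chown_std_streams` may tolerate other errno values as well.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: change ownership of an open stream.
   Returns 0 on success or when `fd` is not open (EBADF), and a negative
   errno value for any other failure.  */
int
chown_stream (int fd, uid_t uid, gid_t gid)
{
  if (fchown (fd, uid, gid) < 0)
    {
      /* A closed stdin/stdout/stderr is legitimate: nothing to chown.  */
      if (errno == EBADF)
        return 0;
      return -errno;
    }
  return 0;
}
```

A caller would loop over descriptors 0 to 2 and stop only on a negative return value.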
Giuseppe Scrivano 3adcc2c75a
utils: set_home_env returns negative value on errors
the caller checks that the return value is negative.

Closes: https://issues.redhat.com/browse/OCPBUGS-45016

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-03-24 13:57:01 +01:00
Giuseppe Scrivano 3632b0d432
Merge pull request #1700 from sky-big/main
terminal fd is not created in the container process if it is not necessary
2025-03-24 10:37:55 +01:00
Giuseppe Scrivano f4973d7aeb
tests: move cwd tests to the correct file
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-03-24 10:34:40 +01:00
Xingwen Xu 53f1c0bca5 if the container main process and console socket do not need a terminal, the terminal fd is not created
Signed-off-by: Xingwen Xu <sky_big@yeah.net>
2025-03-24 17:11:00 +08:00
Giuseppe Scrivano 1245daabe8
Merge pull request #1697 from sky-big/main
Console socket client is initialized only when the user declares terminal
2025-03-21 14:40:04 +01:00
Xingwen Xu 2f7c9b8847 console socket client is initialized only when the user declares terminal.
Signed-off-by: Xingwen Xu <sky_big@yeah.net>
2025-03-21 21:33:13 +08:00
Giuseppe Scrivano c0d19b4dc2
Merge pull request #1698 from eriksjolund/fix-krun-error-message
krun: fix error message
2025-03-21 09:52:53 +01:00
Erik Sjölund 132c793aa3
krun: fix error message
Fixes: https://github.com/containers/crun/issues/1696

Signed-off-by: Erik Sjölund <erik.sjolund@gmail.com>
2025-03-21 09:32:12 +01:00
Giuseppe Scrivano 4284382b54
Merge pull request #1695 from slp/krun-create-ctx
krun: create context after loading the library
2025-03-20 14:49:23 +01:00
Sergio Lopez 8675bafa8e krun: create context after loading the library
Newer versions of libkrun no longer link against libkrunfw (the
library bundling the kernel) and, instead, they load it dynamically
when creating the context.

This implies we need to call krun_create_ctx earlier, before
switching namespaces, or libkrun won't be able to find libkrunfw.

Signed-off-by: Sergio Lopez <slp@redhat.com>
2025-03-20 14:26:46 +01:00
Giuseppe Scrivano a980c89665
Merge pull request #1693 from haircommander/subcgroup-cpuset
cpuset: fix handling of absent subcgroup
2025-03-13 16:56:22 +01:00
Peter Hunt a5cb511de7 cpuset: fix handling of absent subcgroup
Signed-off-by: Peter Hunt <pehunt@redhat.com>
2025-03-12 13:57:23 -04:00
Giuseppe Scrivano e9a27669c4
Merge pull request #1689 from giuseppe/krun-stop-using-set-exec
krun: stop using krun_set_exec
2025-03-07 18:29:53 +01:00
Giuseppe Scrivano d2b824caae
krun: stop using krun_set_exec
so that the command line is taken directly from the config.json file.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-03-07 14:16:37 +01:00
Daniel J Walsh e67610c84f
Merge pull request #1686 from giuseppe/fix-mode-krun-config
krun: make krun config file world readable
2025-03-06 14:01:50 -05:00
Giuseppe Scrivano b09aa67d5e
krun: make krun config file world readable
the file must be readable by all IDs otherwise it is ignored when
running with a UID != 0.

Closes: https://github.com/containers/crun/issues/1685

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-03-06 14:14:24 +01:00
flouthoc e208daaca6
Merge pull request #1683 from giuseppe/ignore-SIGWINCH-without-tty
container: ignore SIGWINCH without tty
2025-03-05 07:54:00 -08:00
flouthoc 4d6eae2eb8
Merge pull request #1679 from giuseppe/criu-restore-same-cgroup
criu: migrate all processes to the correct cgroup
2025-03-04 10:12:26 -08:00
Giuseppe Scrivano 459595b614
container: ignore SIGWINCH without tty
if there is no tty configured for the container, ignore the SIGWINCH
signal instead of exiting with success.

Closes: https://github.com/containers/crun/issues/1442

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-03-04 11:15:30 +01:00
Giuseppe Scrivano 226d0c25d5
Merge pull request #1682 from containers/dependabot/github_actions/uraimo/run-on-arch-action-3.0.0
build(deps): bump uraimo/run-on-arch-action from 2.8.1 to 3.0.0
2025-03-03 19:24:06 +01:00
dependabot[bot] 107214b1b9
build(deps): bump uraimo/run-on-arch-action from 2.8.1 to 3.0.0
Bumps [uraimo/run-on-arch-action](https://github.com/uraimo/run-on-arch-action) from 2.8.1 to 3.0.0.
- [Release notes](https://github.com/uraimo/run-on-arch-action/releases)
- [Commits](https://github.com/uraimo/run-on-arch-action/compare/v2.8.1...v3.0.0)

---
updated-dependencies:
- dependency-name: uraimo/run-on-arch-action
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-03-03 17:51:53 +00:00
Giuseppe Scrivano c954b1b64e
criu: use a process to initialize the cgroup
use a proxy process to initialize the cgroup, which is especially
needed under systemd, and then restore the container in the already
created cgroup.

Closes: https://github.com/containers/crun/issues/1651

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-03-03 17:54:57 +01:00
Giuseppe Scrivano 959cc6c4c1
cgroup: extend function
now it works on cgroup v1, and can read the cgroup for any pid.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-03-03 17:54:57 +01:00
Giuseppe Scrivano e3866cc5eb
cgroup: fix ownership of dfd in read_pids_cgroup
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-03-03 17:54:53 +01:00
Giuseppe Scrivano e4d7ce8fc4
Merge pull request #1681 from eriksjolund/set-subsystem_path
linux: set subsystem_path before use in error
2025-02-27 21:17:22 +01:00
Erik Sjölund 0f16ced143
linux: set subsystem_path before use in error
Signed-off-by: Erik Sjölund <erik.sjolund@gmail.com>
2025-02-27 19:27:18 +01:00
Giuseppe Scrivano a1652fd638
Merge pull request #1680 from eriksjolund/remove-dead-code
linux: remove dead code
2025-02-25 08:53:07 +01:00
Erik Sjölund 6ed12c63ce
linux: remove dead code
Fixes: https://github.com/containers/crun/issues/1677

Signed-off-by: Erik Sjölund <erik.sjolund@gmail.com>
2025-02-25 07:51:18 +01:00
Giuseppe Scrivano 0fd6cd978e
Merge pull request #1678 from giuseppe/test-fail-on-make-check-sudo
ci: fail on "sudo make check"
2025-02-24 16:25:16 +01:00
Giuseppe Scrivano 6c049b8190
criu: hide feature if dlopen is not present
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-02-24 12:34:38 +01:00
Giuseppe Scrivano 73d0007922
tests: map all IDs into the user namespace
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-02-24 11:50:42 +01:00
Giuseppe Scrivano 0037d567c9
ci: fail on "sudo make check"
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-02-24 11:27:13 +01:00
Giuseppe Scrivano c760c2c080
Merge pull request #1654 from eriksjolund/fail-early
chroot_realpath: do not return non-existing paths
2025-02-24 09:17:05 +01:00
Giuseppe Scrivano a561918b67
Merge pull request #1676 from paravoid/testsuite-cgroup-controller-path
tests: fix test_resources_unified_invalid_controller()
2025-02-24 09:15:58 +01:00
Faidon Liambotis f82ead82c8 tests: fix test_resources_unified_invalid_controller()
Commit a4dcb9c (PR #1639) changed error message to show the absolute
path to cgroup.controllers when a controller is not available.

Change the test to also expect the new error message, while still
accepting the old one, as the new code may still conditionally return it.

Signed-off-by: Faidon Liambotis <paravoid@debian.org>
2025-02-22 13:14:15 +02:00
Daniel J Walsh 94b114fe53
Merge pull request #1668 from eriksjolund/fix-libkrun_unload
krun: fix libkrun_unload
2025-02-20 14:49:54 -05:00
Giuseppe Scrivano 6da2b1b36e
Merge pull request #1672 from giuseppe/improve-error-message-path-not-executable
utils: improve error message if path not executable
2025-02-19 16:33:24 +01:00
Giuseppe Scrivano 37213555bd
utils: improve error message if path not executable
improve the error returned when the specified path is not executable.

Return the reason why the lookup failed when the path is not a regular
file, or has no executable bit set.

Closes: https://github.com/containers/crun/issues/1671

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-02-18 15:10:25 +01:00
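A minimal sketch of the kind of check the commit above describes: report why a path cannot be executed when it is not a regular file or lacks the executable bit. The helper name `why_not_executable` is hypothetical, and plain POSIX `stat` and `access` are used here for illustration; crun's own lookup code may be structured differently.

```c
#include <errno.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper, not crun's actual function.
   Return NULL if PATH is a regular file that the caller may execute,
   otherwise a short human-readable reason.  */
const char *
why_not_executable (const char *path)
{
  struct stat st;

  /* The path does not exist or cannot be inspected.  */
  if (stat (path, &st) < 0)
    return strerror (errno);

  /* Directories, sockets, devices and so on are never executable programs.  */
  if (! S_ISREG (st.st_mode))
    return "not a regular file";

  /* A regular file, but the executable bit is missing for this user.  */
  if (access (path, X_OK) < 0)
    return strerror (errno);

  return NULL;
}
```

A caller could append the returned string to its error message instead of reporting a generic "executable not found".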
Giuseppe Scrivano 5f5454b593
utils: do not use hardcoded path buffer
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-02-18 15:10:25 +01:00
Giuseppe Scrivano 4948e45132
utils: check for eaccess existence
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-02-18 15:10:25 +01:00
Giuseppe Scrivano 410f0d5308
container: pass down executable path to custom handler
if a handler is specified, do not return an early error if the
executable path could not be found.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-02-18 15:10:25 +01:00
Giuseppe Scrivano bb56343cbb
utils: move error handling inside find_executable()
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-02-18 15:10:25 +01:00
Giuseppe Scrivano a77702c0ae
tests: do a shallow git clone for podman
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-02-18 15:10:25 +01:00
Giuseppe Scrivano f941be482d
error: silence compiler warning
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-02-18 15:10:17 +01:00
Giuseppe Scrivano b02becf3d1
Merge pull request #1674 from YaSuenag/pr/criu-static
Prevent dlopen() for CRIU in static link'ed binary
2025-02-18 12:54:09 +01:00
Yasumasa Suenaga b8b25ea1a6 Prevent dlopen() for CRIU in static link'ed binary
Signed-off-by: Yasumasa Suenaga <yasuenag@gmail.com>
2025-02-18 17:18:14 +09:00
Erik Sjölund 990b5f6839
krun: fix libkrun_unload
Signed-off-by: Erik Sjölund <erik.sjolund@gmail.com>
2025-02-17 08:43:06 +01:00
flouthoc 5ceb2a1845
Merge pull request #1666 from giuseppe/annotations-index
libcrun: use an hash map to lookup annotations
2025-02-14 09:15:58 -08:00
Giuseppe Scrivano 439b1e0d58
Merge pull request #1667 from eriksjolund/fix-krun-errno
krun: fix libkrun_exec return value
2025-02-14 10:10:01 +01:00
Giuseppe Scrivano 79b7e6b374
libcrun: use an hash map to lookup the key
Use a hash map to speed up the lookup of the string map.

If the array is too small, use a binary search on the sorted array.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-02-13 11:27:55 +01:00
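The commit above mentions falling back to a binary search on a sorted array when the map is small. A minimal sketch of that fallback only, using the standard `qsort` and `bsearch`; the `struct kv` type and the `kv_sort` and `kv_lookup` names are assumptions made for illustration and are not taken from libcrun.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical key/value pair; libcrun's real annotation struct may differ.  */
struct kv
{
  const char *key;
  const char *value;
};

/* Order two pairs by key, as required by qsort and bsearch.  */
int
compare_kv (const void *a, const void *b)
{
  const struct kv *ka = a;
  const struct kv *kb = b;
  return strcmp (ka->key, kb->key);
}

/* Sort the array once so that later lookups can use binary search.  */
void
kv_sort (struct kv *items, size_t n)
{
  qsort (items, n, sizeof (items[0]), compare_kv);
}

/* Return the value stored for KEY, or NULL if KEY is not present.
   ITEMS must already be sorted with kv_sort.  */
const char *
kv_lookup (const struct kv *items, size_t n, const char *key)
{
  struct kv needle = { key, NULL };
  const struct kv *found;

  found = bsearch (&needle, items, n, sizeof (items[0]), compare_kv);
  return found ? found->value : NULL;
}
```

For a handful of entries this avoids building a hash table while still keeping each lookup logarithmic in the number of keys.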
Giuseppe Scrivano 51fa411b0b
libcrun: move annotations handling to a separate struct
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-02-13 11:27:55 +01:00
Giuseppe Scrivano 9c97f5fe5f
Merge pull request #1669 from kolyshkin/readme-centos-9-10
README: add CentOS Stream 9 & 10, rm 8
2025-02-13 09:17:08 +01:00
Giuseppe Scrivano a9e76dc565
Merge pull request #1670 from eriksjolund/fix-dup-error
linux: fix dup error
2025-02-13 09:16:44 +01:00
Erik Sjölund 9abef0dadc
linux: fix dup error
Signed-off-by: Erik Sjölund <erik.sjolund@gmail.com>
2025-02-13 08:48:57 +01:00
Kir Kolyshkin 14d5baa4d9 README: add CentOS Stream 9 & 10, rm 8
Remove outdated RHEL/CentOS 8 build instructions.

Add RHEL/CentOS Stream 9 and 10.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-02-12 13:21:23 -08:00
Erik Sjölund e735b4bc73
krun: fix libkrun_exec return value
Fixes: https://github.com/containers/crun/issues/1660

Signed-off-by: Erik Sjölund <erik.sjolund@gmail.com>
2025-02-12 18:11:55 +01:00
Giuseppe Scrivano bb7c4fe2c4
Merge pull request #1659 from eriksjolund/update-error-handling
container, error: update error handling
2025-02-11 16:22:25 +01:00
Giuseppe Scrivano 0c9ff872a6
Merge pull request #1665 from eriksjolund/fix-PATH-lookup
utils: fix PATH lookup
2025-02-11 09:12:00 +01:00
Erik Sjölund 90a321c6db
container, error: update error handling
Signed-off-by: Erik Sjölund <erik.sjolund@gmail.com>
2025-02-11 08:32:54 +01:00
Erik Sjölund 620b91b5a0
utils: fix PATH lookup
Support filenames starting with a dot when
doing PATH lookup.

Fixes: https://github.com/containers/crun/issues/1664

Signed-off-by: Erik Sjölund <erik.sjolund@gmail.com>
2025-02-10 19:34:15 +01:00
Giuseppe Scrivano cc27fd6c1b
Merge pull request #1663 from lsm5/rpm-gating-config
Downstream Fedora: fix gating config
2025-02-10 16:08:39 +01:00
Lokesh Mandvekar 4847000937
Downstream Fedora: fix gating config
Doesn't affect upstream. The wrongly configured file blocked koji
builds.

Signed-off-by: Lokesh Mandvekar <lsm5@fedoraproject.org>
2025-02-10 20:09:42 +05:30
Giuseppe Scrivano 88a2ad543d
Merge pull request #1662 from giuseppe/fix-podman-ci-10-feb-25
tests: disable new test that does not use the runtime
2025-02-10 11:21:41 +01:00
Giuseppe Scrivano 7949fbc670
tests: disable new test that does not use the runtime
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-02-10 10:50:33 +01:00
Giuseppe Scrivano 50d5667f59
Merge pull request #1658 from eriksjolund/add-o-nofollow
utils: add O_NOFOLLOW
2025-02-10 10:47:02 +01:00
Giuseppe Scrivano e2dcd7ac84
Merge pull request #1661 from eriksjolund/add-missing-crun_error_release
utils: add missing crun_error_release()
2025-02-10 10:43:58 +01:00
Erik Sjölund 7f76fcd4a8
utils: add missing crun_error_release()
Signed-off-by: Erik Sjölund <erik.sjolund@gmail.com>
2025-02-08 10:14:02 +01:00
Erik Sjölund 6598c99ba2
utils: add O_NOFOLLOW
Signed-off-by: Erik Sjölund <erik.sjolund@gmail.com>
2025-02-07 08:10:28 +01:00
Giuseppe Scrivano 4e51077be5
Merge commit from fork
krun: fix CVE-2025-24965
2025-02-05 08:58:21 +01:00
Giuseppe Scrivano 9c9a76ac11
NEWS: tag 1.20
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-02-04 17:14:55 +01:00
Giuseppe Scrivano 0aec82c2b6
krun: fix CVE-2025-24965
make sure the opened .krun_config.json is below the rootfs directory
and we don't follow any symlink.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-02-04 17:14:55 +01:00
Giuseppe Scrivano 793188c27c
krun: initialize bool
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-02-04 17:14:55 +01:00
Giuseppe Scrivano ac956685f2
utils: add O_WRONLY to WRITE_FILE_DEFAULT_FLAGS
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-02-04 17:14:55 +01:00
Giuseppe Scrivano dcf4f78b8e
utils: drop rootfs_len from safe_openat function
this information is useful only on the fallback case, on kernels that
do not support openat2.

Simplify the code and retrieve the rootfs len only when needed.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-02-04 10:10:57 +01:00
Giuseppe Scrivano de33f0a8fd
utils: write_file_at_with_flags uses safe_write
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-02-04 10:10:57 +01:00
Giuseppe Scrivano c460b2537d
utils: safe_write uses size_t for the buffer length
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-02-04 10:10:57 +01:00
Giuseppe Scrivano 12778089e7
utils: drop function write_file_with_flags
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-02-04 09:28:16 +01:00
Giuseppe Scrivano 73d52bd6fc
Merge pull request #1656 from giuseppe/safe-partial-writes
utils: fix partial writes with write_file_at_with_flags
2025-02-03 16:32:58 +01:00
Giuseppe Scrivano 7a53b088b7
Merge pull request #1657 from eriksjolund/remove-unneeded-crun-error-release
linux, mono: remove unneeded crun_error_release()
2025-02-03 10:58:41 +01:00
Giuseppe Scrivano f7e0b659c0
Merge pull request #1655 from eriksjolund/reduce-memory-consumption-in-safe_readlinkat
utils: reduce memory consumption in safe_readlinkat
2025-02-03 10:58:30 +01:00
Giuseppe Scrivano f1ef3bb448
utils: move write_file* wrappers to utils.h
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-02-03 10:11:09 +01:00
Giuseppe Scrivano 3e2344b040
utils: fix partial writes with write_file_at_with_flags
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-02-03 10:04:20 +01:00
Giuseppe Scrivano 7930c13da0
krun: drop unused variable
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-02-03 09:58:44 +01:00
Erik Sjölund f7987aa978
linux, mono: remove unneeded crun_error_release()
Signed-off-by: Erik Sjölund <erik.sjolund@gmail.com>
2025-02-01 15:43:44 +01:00
Erik Sjölund b548479ce0
utils: reduce memory consumption in safe_readlinkat
Signed-off-by: Erik Sjölund <erik.sjolund@gmail.com>
2025-02-01 10:31:36 +01:00
Erik Sjölund 3b65317032
chroot_realpath: remove dead code
Signed-off-by: Erik Sjölund <erik.sjolund@gmail.com>
2025-01-31 19:27:50 +01:00
Erik Sjölund 17135c1b25
chroot_realpath: do not return non-existing paths
Do not return non-existing paths in chroot_realpath().
Rationale: failing early is good.

Signed-off-by: Erik Sjölund <erik.sjolund@gmail.com>
2025-01-31 19:27:35 +01:00
Giuseppe Scrivano a029af28c8
Merge pull request #1653 from eriksjolund/remove-dead-code
linux, utils: remove dead code crun_ensure_file*()
2025-01-31 09:00:09 +01:00
Giuseppe Scrivano c332fe4fd5
Merge pull request #1652 from eriksjolund/fix-error-after-read
container: fix error after read
2025-01-31 08:57:24 +01:00
Erik Sjölund 7ab1acd35b
container: fix error after read
Signed-off-by: Erik Sjölund <erik.sjolund@gmail.com>
2025-01-30 19:23:58 +01:00
Erik Sjölund 490d550204
linux, utils: remove dead code crun_ensure_file*()
Remove dead code.

Use 0620 instead of 0700 as mode
for the new file.

Signed-off-by: Erik Sjölund <erik.sjolund@gmail.com>
2025-01-30 19:16:03 +01:00
flouthoc 801d6e8ffa
Merge pull request #1647 from giuseppe/status-validate-container-id
status: validate container id
2025-01-29 08:04:50 -08:00
flouthoc cdc907b792
Merge pull request #1649 from giuseppe/sysctl-add-explanation
linux: improve error writing to net.ipv4.ping_group_range
2025-01-29 08:03:54 -08:00
Giuseppe Scrivano 432a66d994
status: validate container id
check that the container name does not contain any slash so that
operations in status.c won't risk accessing directories/files outside
of the run directory.

Besides, it is not a security issue because if a directory exists, crun
refuses to use it; and if it does not exist then it is created and
immediately deleted because the container creation fails.

Play safe and do not allow any slash in the container name.

Closes: https://github.com/containers/crun/issues/1646

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-01-29 12:26:20 +01:00
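The validation described above can be sketched as a single check for a slash in the name. The function name `container_id_is_valid` is hypothetical, and rejecting empty ids is an extra assumption not stated in the commit message.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper, not crun's actual function.
   An id is accepted only if it is non-empty (an assumption of this
   sketch) and contains no '/', so that a path built as
   "<run directory>/<id>" cannot point outside the run directory.  */
bool
container_id_is_valid (const char *id)
{
  if (id == NULL || id[0] == '\0')
    return false;

  return strchr (id, '/') == NULL;
}
```

With this check an id such as `../../etc` is refused before any directory is created or opened.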
Giuseppe Scrivano 73bcfabbb1
status: report errors from get_state_directory_status_file
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-01-29 12:26:20 +01:00
Giuseppe Scrivano 30d22ba31b
status: report errors from get_run_directory
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-01-29 12:26:20 +01:00
Giuseppe Scrivano 873db607fa
status: report errors from libcrun_get_state_directory
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-01-29 12:26:20 +01:00
Giuseppe Scrivano f5e7718ceb
linux: improve error writing to net.ipv4.ping_group_range
improve the error message when writing to the
/proc/sys/net/ipv4/ping_group_range file and the write fails with
EINVAL.  When running in a user namespace, it might mean the requested
groups are not mapped.

Closes: https://github.com/containers/crun/issues/1648

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-01-29 12:26:06 +01:00
Giuseppe Scrivano 22b53c8f99
Merge pull request #1650 from giuseppe/fix-ppcle64-ci
github: disable failing CI builds
2025-01-29 12:25:33 +01:00
Giuseppe Scrivano 5c35f278ad
tests: make python script executable
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-01-29 11:54:19 +01:00
Giuseppe Scrivano 5b51cca89a
github: disable aarch64, ppc64le and s390x build
the compiler crashes in random places.  Re-enable when it works again.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-01-29 11:05:22 +01:00
Giuseppe Scrivano c00c540937
Merge pull request #1643 from giuseppe/exec_always_setsid
exec: always call setsid
2025-01-28 19:20:48 +01:00
Giuseppe Scrivano 84d509926a
github: cat config.log on configure failures
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-01-28 09:04:46 +01:00
Giuseppe Scrivano 7aa2cd856b
github: add r/w permissions
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-01-28 09:04:46 +01:00
Giuseppe Scrivano 4f82309082
github: show apt-get output
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-01-28 09:04:46 +01:00
Giuseppe Scrivano 2d08f58646
exec: always call setsid
previously setsid was called only for interactive sessions.  Change it
to be always used for the "crun exec" process.

Closes: https://github.com/containers/crun/issues/1642

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-01-28 09:04:42 +01:00
Giuseppe Scrivano 71c93c6a9a
Merge pull request #1645 from giuseppe/reset-affinity-ignore-ENOSYS
scheduler: ignore ENOSYS when resetting affinity mask
2025-01-28 09:04:15 +01:00
Giuseppe Scrivano b788f338bc
scheduler: ignore ENOSYS when resetting affinity mask
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-01-27 21:27:18 +01:00
Giuseppe Scrivano fbcb80be34
Merge pull request #1644 from sohankunkerkar/fix-enival-err-report
src/linux: handle EINVAL during pidfd_open gracefully
2025-01-27 20:33:35 +01:00
Sohan Kunkerkar e292c9e946 src/linux: handle EINVAL during pidfd_open gracefully
Signed-off-by: Sohan Kunkerkar <sohank2602@gmail.com>
2025-01-27 13:20:11 -05:00
flouthoc a2d140d262
Merge pull request #1640 from lsm5/tmt-rhel-conditionals
TMT: account for environments on internal testing farm ranch
2025-01-21 08:08:00 -08:00
Lokesh Mandvekar 869804f487
TMT: account for environments on internal testing farm ranch
RHEL envs on the internal redhat testing farm ranch don't have any easy way
to install and enable the `epel-release` package.

Also, CentOS-Stream envs on the internal ranch have EPEL installed but
disabled.

This PR should account for both these envs. The tests on public ranch
should continue unaffected.

The packages required for testing have also been moved to the plan
preparation stage itself.

Signed-off-by: Lokesh Mandvekar <lsm5@fedoraproject.org>
2025-01-21 17:33:08 +05:30
Giuseppe Scrivano 8d17a9165f
Merge pull request #1631 from lsm5/tmt-centos-stream-followup
TMT: Replace adjust with prepare conditionals
2025-01-15 15:10:01 +01:00
Daniel J Walsh a6464392e2
Merge pull request #1639 from aarontomlin/main
cgroup: Show the absolute path to cgroup.controllers when a controller is not available
2025-01-15 06:39:01 -05:00
Lokesh Mandvekar 65484cb925
TMT: Replace `adjust` with `prepare` conditionals
Ensure all envs are setup correctly.

Signed-off-by: Lokesh Mandvekar <lsm5@fedoraproject.org>
2025-01-15 16:29:47 +05:30
Aaron Tomlin a4dcb9c6c4 cgroup: Show the absolute path to cgroup.controllers when a controller is not available
This patch will provide the absolute path to the cgroup.controllers file
in the error message if the specified controller is not available.
Hopefully this will help with troubleshooting efforts.

Signed-off-by: Aaron Tomlin <atomlin@atomlin.com>
2025-01-14 21:20:17 +00:00
Giuseppe Scrivano 283ffdf7dc
Merge pull request #1637 from giuseppe/revert-remove-tun-tap
Revert "cgroup: remove tun/tap from the default allow list"
2025-01-13 12:00:15 +01:00
Giuseppe Scrivano 179686b7e5
Revert "cgroup: remove tun/tap from the default allow list"
This reverts commit 1e3042427d.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-01-13 10:08:40 +01:00
flouthoc b4df4136cb
Merge pull request #1638 from giuseppe/fix-wasmedge-ci
test: use wasm32-wasip1 instead of wasm32-wasi
2025-01-10 14:17:10 -08:00
Giuseppe Scrivano 68e8d9abac
test: use wasm32-wasip1 instead of wasm32-wasi
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-01-10 14:15:22 +01:00
Giuseppe Scrivano 6f010b5f6b
Merge pull request #1628 from giuseppe/fix-criu-set-network-lock
criu: do not set network_lock if not specified
2025-01-08 10:01:42 +01:00
Giuseppe Scrivano fb1010ac39
Merge pull request #1633 from kwilczynski/fix/handle-mssing-user-when-setting-HOME-environment
utils: return error from set_home_env() if the user was not found
2025-01-08 09:10:50 +01:00
Krzysztof Wilczyński 25efd10ae9
Remove surplus ENOENT error check
Signed-off-by: Krzysztof Wilczyński <kwilczynski@redhat.com>
2025-01-08 01:00:17 +09:00
Krzysztof Wilczyński 99f2824f4c
utils: return error from set_home_env() if the user was not found
If a given user cannot be found within the container, for example,
because the specific UID does not exist, then return an error and
let the callers of the set_home_env() function handle the case of
the missing user appropriately.

Signed-off-by: Krzysztof Wilczyński <kwilczynski@redhat.com>
2025-01-08 01:00:17 +09:00
Giuseppe Scrivano 3158e4912b
criu: improve error handling for CRIU function calls
fix the function signature for criu_set_network_lock, and check errors
from libcriu.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-01-07 16:46:34 +01:00
Giuseppe Scrivano 3cd9c2c931
criu: do not set network_lock if not specified
commit c4f8c87a7a introduced the issue.

Closes: https://github.com/containers/crun/issues/1627

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-01-07 16:46:26 +01:00
Giuseppe Scrivano 0b63b44d00
Merge pull request #1634 from giuseppe/enable-userns-github
github: enable unprivileged userns
2025-01-07 16:46:06 +01:00
Giuseppe Scrivano a542ecc7e8
github: enable unprivileged userns
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-01-07 13:00:57 +01:00
Giuseppe Scrivano 38122ac91f
test: fix compiler warnings
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-01-07 12:49:40 +01:00
Giuseppe Scrivano bd4f773309
Merge pull request #1629 from lsm5/tmt-c9s-upstreaming
TMT: upstream tests from c9s
2024-12-30 12:51:45 +01:00
Lokesh Mandvekar ec5947ce45
TMT: Add sanity tests from c9s downstream
This commit upstreams sanity tests added to CentOS Stream 9
contributed by Alex Jia <ajia@redhat.com>
Ref: https://gitlab.com/redhat/centos-stream/rpms/crun/-/merge_requests/117

These tests will now run for all active Fedora and CentOS Stream.

The existing podman system tests have also been made more idiomatic.

The rpm gating config also has updated rules for rhel updates.

Signed-off-by: Lokesh Mandvekar <lsm5@fedoraproject.org>
2024-12-26 21:06:06 +05:30
Lokesh Mandvekar d08e304af3
Packit: Remove RHEL jobs
RHEL environments are frequently out of date and move much slower than
what's required for upstream. Also, RHEL jobs run on internal testing
farm so the logs are not publicly visible.

CentOS Stream works well enough upstream and the logs are also visible
publicly.

Signed-off-by: Lokesh Mandvekar <lsm5@fedoraproject.org>
2024-12-23 15:04:18 +05:30
Giuseppe Scrivano a48818a45a
Merge pull request #1626 from giuseppe/tag-1.19.1
NEWS: tag 1.19.1
2024-12-17 16:42:47 +01:00
Giuseppe Scrivano 3e32a70c93
NEWS: tag 1.19.1
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-12-17 16:33:09 +01:00
Giuseppe Scrivano 64deb6285a
Merge pull request #1624 from giuseppe/update-nix-13-dec-2024
nix: update packages list
2024-12-16 21:26:35 +01:00
Giuseppe Scrivano 3dbc7e765d
Merge pull request #1623 from giuseppe/fix-hang-on-unresponsive-tty
linux: fix a hang if there are no reads from the tty
2024-12-16 14:27:13 +01:00
Giuseppe Scrivano 8b972be9be
linux: fix a hang if there are no reads from the tty
avoid a "crun exec" hang when the other end of the terminal
stopped reading.

That happened because `copy_fd_to_fd` tried to write everything that
it has received from the source fd, so it would hang the current
process.  Prevent that using non blocking file descriptors and using
epoll to detect when the file descriptor is available for write.

Fixes: https://issues.redhat.com/browse/OCPBUGS-45632

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-12-16 13:58:27 +01:00
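A minimal sketch of the technique named in the commit above: make the destination file descriptor non-blocking so a write never stalls the process, and use epoll to learn when it can accept more data. The helper names are hypothetical and the sketch leaves out the buffering between source and destination that a real copy loop needs.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/epoll.h>
#include <sys/types.h>
#include <unistd.h>

/* Put FD in non-blocking mode.  Returns 0 on success, -1 on error.  */
int
set_nonblocking (int fd)
{
  int flags = fcntl (fd, F_GETFL);
  if (flags < 0)
    return -1;
  return fcntl (fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Write as much of BUF as FD accepts right now.  On a non-blocking FD
   this returns 0 instead of blocking when FD cannot take more data.
   Returns the number of bytes written, or -1 on other errors.  */
ssize_t
write_some (int fd, const char *buf, size_t len)
{
  ssize_t n;

  do
    n = write (fd, buf, len);
  while (n < 0 && errno == EINTR);

  if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
    return 0;
  return n;
}

/* Wait up to TIMEOUT_MS milliseconds for FD to report an event.
   Returns 1 if FD is ready (writable, or in an error or hang-up
   state), 0 on timeout, -1 on error.  */
int
wait_writable (int fd, int timeout_ms)
{
  struct epoll_event ev = { 0 };
  struct epoll_event out;
  int epfd, ret;

  epfd = epoll_create1 (EPOLL_CLOEXEC);
  if (epfd < 0)
    return -1;

  ev.events = EPOLLOUT;
  ev.data.fd = fd;
  if (epoll_ctl (epfd, EPOLL_CTL_ADD, fd, &ev) < 0)
    {
      close (epfd);
      return -1;
    }

  do
    ret = epoll_wait (epfd, &out, 1, timeout_ms);
  while (ret < 0 && errno == EINTR);

  close (epfd);
  return ret < 0 ? -1 : (ret > 0 ? 1 : 0);
}
```

A copy loop built on these helpers would keep unwritten bytes in a buffer and call `wait_writable` only when `write_some` returns 0, so a reader that stops consuming data can no longer hang the writer.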
Giuseppe Scrivano e50e47ca90
libcrun: add ring buffer implementation
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-12-16 13:58:26 +01:00
Giuseppe Scrivano 20ec098250
utils: extend epoll_helper to monitor writeable fds
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-12-16 09:29:56 +01:00
Giuseppe Scrivano 77a72bdfaa
utils: use bool for set_blocking_fd()
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-12-16 09:29:56 +01:00
Giuseppe Scrivano 5f9ca9eb57
utils: skip copy_file_range if not usable
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-12-16 09:29:56 +01:00
Giuseppe Scrivano e2380490aa
tests: adjust test to upstream code
commit 1832c1706e introduced the change.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-12-16 09:29:56 +01:00
Giuseppe Scrivano d79334864c
build-aux: use an init process for the nix container
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-12-13 23:32:06 +01:00
Giuseppe Scrivano 0ec1522ba9
nix: update packages list
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-12-13 11:37:27 +01:00
Giuseppe Scrivano 017b5fddcb
Merge pull request #1622 from xw19/small-contrib-sm
Add missing periods at the end of sentence
2024-12-12 20:40:31 +01:00
Sourav Moitra 9b01471809 Generated crun.1
Signed-off-by: Sourav Moitra <sourav.moitr@gmail.com>
2024-12-12 23:16:46 +05:30
Sourav Moitra d700d9db51 Add missing periods at the end of sentence
Signed-off-by: Sourav Moitra <sourav.moitr@gmail.com>
2024-12-12 22:03:12 +05:30
flouthoc 3ec6298abd
Merge pull request #1621 from giuseppe/drop-fallback-with-temporary-mount
linux: remove tmpmount workaround
2024-12-10 08:31:37 -08:00
Giuseppe Scrivano 1832c1706e
linux: remove tmpmount workaround
remove the workaround to mount a cgroup on top of another cgroup mount
using a temporary mount.  This has the disadvantage of leaking a mount on
the host, while the alternative one, which is the only mechanism used
now, does not since it uses a private tmpfs.

Fixes: https://issues.redhat.com/browse/RHEL-70694

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-12-10 16:26:42 +01:00
Giuseppe Scrivano ca39d7c43b
Merge pull request #1619 from jpalus/no-tests-normal-build
build: don't compile tests during normal build
2024-12-09 14:42:47 +01:00
Jan Palus 9e3615a485
ci: build tests_libcrun_fuzzer before fuzzing
Signed-off-by: Jan Palus <jpalus@fastmail.com>
2024-12-09 12:55:56 +01:00
Giuseppe Scrivano ae05b083e1
Merge pull request #1618 from jpalus/libtool-libcrun_testing
build: use libtool to create libcrun_testing
2024-12-09 09:01:27 +01:00
Jan Palus 6b2e6193a9
build: use libtool to create libcrun_testing
a static library is supposed to be an archive of object files. if libtool is
not used libcrun_testing.a includes raw libocispec.la file instead of
its object files.

Signed-off-by: Jan Palus <jpalus@fastmail.com>
2024-12-08 16:43:13 +01:00
Jan Palus 3c5292b270
build: don't compile tests during normal build
Signed-off-by: Jan Palus <jpalus@fastmail.com>
2024-12-08 16:42:29 +01:00
Giuseppe Scrivano b3f8bab53f
Merge pull request #1617 from giuseppe/tag-1.19
NEWS: tag 1.19
2024-12-06 13:05:37 +01:00
Giuseppe Scrivano db31c42ac4
NEWS: tag 1.19
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-12-06 12:33:01 +01:00
Giuseppe Scrivano 1f8dc4d192
Merge pull request #1613 from danishprakash/criu-network-lock
checkpoint/restore: allow passing network lock method to libcriu
2024-12-06 12:31:45 +01:00
Giuseppe Scrivano 1aac2ef2a8
Merge pull request #1616 from michalsieron/no-cgroup-v1-freezer
Handle case where cgroup v1 freezer is disabled
2024-12-06 11:39:03 +01:00
Danish Prakash c4f8c87a7a
checkpoint/restore: allow passing network lock method to libcriu
Netavark defaults to nftables but it seems that crun checkpoint doesn't
allow setting firewall driver used for network locking and unlocking by
criu. criu defaults to iptables and in situations where podman is
installed without iptables, criu tries to use iptable utils and fails.

Signed-off-by: Danish Prakash <contact@danishpraka.sh>
2024-12-06 11:34:19 +01:00
Michal Sieron 1942efc9c4 Handle case where cgroup v1 freezer is disabled
On cgroup v1 it is possible to disable freezer subsystem. In such case
freezer.state file won't be present. Due to the race condition handling
in libcrun_get_container_state_string, missing freezer.state would be
interpreted as cgroup being removed when check is being performed.

But as indicated earlier, that is not the case when it's cgroup v1 and
the freezer is disabled. Therefore introduce logic that checks for that
using type of the filesystem mounted under the freezer directory.

When freezer is disabled, container simply cannot be paused.

Fixes #1612

Signed-off-by: Michal Sieron <michalwsieron@gmail.com>
2024-12-06 10:53:52 +01:00
Giuseppe Scrivano b243185136
Merge pull request #1610 from usiegl00/wamr_support
wasm: add support for wamr (wasm-micro-runtime)
2024-12-06 10:40:50 +01:00
Giuseppe Scrivano 59c1f32f8b
Merge pull request #1592 from alexminder/main
error: 'CHAR_BIT' undeclared. fix compile failure with musl libc
2024-12-06 10:40:42 +01:00
Giuseppe Scrivano 38c717f8b3
Merge pull request #1609 from darktohka/bugfix/crun-build-makefile
build: Don't build cloned_binary as part of crun
2024-12-06 09:39:18 +01:00
Giuseppe Scrivano df606728f5
Merge pull request #1614 from giuseppe/systemd-no-override-devices
cgroup, systemd: do not override devices on update
2024-12-06 08:55:33 +01:00
Maciej b366a78574 wamr: revitalize wamr handler
Signed-off-by: usiegl00 <50933431+usiegl00@users.noreply.github.com>
2024-12-06 11:08:31 +09:00
Giuseppe Scrivano 2121950418
cgroup, systemd: do not override devices on update
if the resources configuration on update does not contain any
information on devices, do not change the current configuration.

Fixes: https://issues.redhat.com/browse/OCPBUGS-45394

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-12-05 21:26:13 +01:00
Alexander Miroshnichenko d15310734d error: 'CHAR_BIT' undeclared. fix compile failure with musl libc
cgroup-systemd.c:84:49: error: 'CHAR_BIT' undeclared (first use in this function)

Signed-off-by: Alexander Miroshnichenko <alex@millerson.name>
2024-12-05 11:28:43 +03:00
Derzsi Dániel 5d66b30967 build: Don't build cloned_binary as part of crun
Signed-off-by: Derzsi Dániel <daniel@tohka.us>
2024-11-28 12:45:52 +02:00
Daniel J Walsh 52ed5880c4
Merge pull request #1607 from giuseppe/exec-affinity
add support for exec cpu affinity
2024-11-26 11:43:04 -05:00
Giuseppe Scrivano fd69065d6a
test: add new test for exec-cpu-affinity
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-11-25 14:50:36 +01:00
Giuseppe Scrivano b941d6c5a1
linux: move reset cpu affinity to scheduler
and set it only when there is no exec cpu affinity mask specified.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-11-25 14:50:36 +01:00
Giuseppe Scrivano ef33259c77
linux: honor exec cpu affinity mask
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-11-25 14:50:36 +01:00
Giuseppe Scrivano 047b7485a5
src: move cpuset_string_to_bitmask to utils
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-11-25 14:50:36 +01:00
Giuseppe Scrivano 2c8088c4e9
libocispec: sync
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-11-25 14:50:36 +01:00
Giuseppe Scrivano 42b959b548
container: initialize max caps before accessing process block
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-11-25 14:50:36 +01:00
Giuseppe Scrivano 46bd62b117
cgroup: do not stop process on exec
the cpu mask is configured on the systemd scope, so there is no time
when the process joins the cgroup and runs on unexpected cpus

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-11-25 14:50:36 +01:00
Giuseppe Scrivano 19bbd8dae3
utils: silence compiler warning
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-11-25 14:50:35 +01:00
Daniel J Walsh 2b3faef7ee
Merge pull request #1606 from giuseppe/mount-api-clone-binary
src: use mount API to self-clone
2024-11-19 08:40:27 -05:00
Giuseppe Scrivano 8a0ee4b56a
src: use mount API to self-clone
if the new mount API is available, use it to grab a read-only
reference to the current executable.  The advantage is that there is
no need to leak a mount in the current mount namespace.

Closes: https://github.com/containers/crun/issues/1383
Closes: https://issues.redhat.com/browse/RHEL-67558

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-11-17 22:05:32 +01:00
flouthoc 83998014c5
Merge pull request #1605 from giuseppe/fix-static-analysis-reports
Fix static analysis reports
2024-11-15 08:49:02 -08:00
Giuseppe Scrivano 85d4db3d8b
crun: check for integer overflow
validate that the value returned by strtoll can fit into an integer.

Closes: https://github.com/containers/crun/issues/1604

Reported-by: Pavel Nekrasov <p.nekrasov@fobos-nt.ru>
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-11-14 18:49:52 +01:00
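
Note: the range check described in this commit can be sketched as follows. This is a minimal illustration, not the code from the commit; `parse_int_checked` is a hypothetical helper name.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper (not crun's API): parse a decimal string and make
   sure the strtoll result fits into an int before narrowing it.
   Returns 0 on success, -1 on malformed or out-of-range input.  */
static int
parse_int_checked (const char *s, int *out)
{
  char *end = NULL;
  long long v;

  errno = 0;
  v = strtoll (s, &end, 10);
  /* Reject overflow of long long, empty input, and trailing garbage.  */
  if (errno != 0 || end == s || *end != '\0')
    return -1;
  /* Reject values that do not fit into an int.  */
  if (v < INT_MIN || v > INT_MAX)
    return -1;

  *out = (int) v;
  return 0;
}
```

Without the second check, a value such as 4294967296 would be silently truncated when assigned to an `int`.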
Giuseppe Scrivano 10b2146e65
linux: add check before deref
check that mount_fds is not NULL before accessing it.  A similar check
exists later in the code.

Closes: https://github.com/containers/crun/issues/1603

Reported-by: Pavel Nekrasov <p.nekrasov@fobos-nt.ru>
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-11-14 15:11:35 +01:00
Giuseppe Scrivano 2525752d7b
cgroup: drop unuseful check
subsystem can't be NULL, so remove the check.

Closes: https://github.com/containers/crun/issues/1602

Reported-by: Pavel Nekrasov <p.nekrasov@fobos-nt.ru>
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-11-14 15:11:35 +01:00
Giuseppe Scrivano 1ae190b0cd
src: run make clang-format
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-11-14 15:11:35 +01:00
Giuseppe Scrivano 01830cb038
Merge pull request #1598 from giuseppe/fix-regression-podman
cgroup, systemd: fix first rule selection for systemd
2024-10-31 17:19:41 +01:00
Giuseppe Scrivano 00ab38af87
NEWS: tag 1.18.2
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-10-31 15:26:10 +01:00
Giuseppe Scrivano 5bc6b50ece
cgroup, systemd: fix first rule selection for systemd
The `find_first_rule_no_default` function was modified to also check
the simple case where there is only a default BLOCK ALL rule.

In addition, improve the function to skip to the first allow rule when
the default BLOCK ALL rule is implicit.

Closes: https://github.com/containers/crun/issues/1597

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-10-31 15:23:50 +01:00
Giuseppe Scrivano cba595650c
Merge pull request #1596 from giuseppe/tag-1.18.1
NEWS: tag 1.18.1
2024-10-30 11:45:37 +01:00
Giuseppe Scrivano c41f034fdb
NEWS: tag 1.18.1
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-10-29 21:14:01 +01:00
Giuseppe Scrivano f12b2dab88
Merge pull request #1595 from eriksjolund/check-for-truncation
utils: check for snprintf truncation
2024-10-29 21:13:52 +01:00
Daniel J Walsh 056a40764b
Merge pull request #1591 from giuseppe/fix-cgroup-devices
cgroup: ignore redundant deny dev cgroup rules
2024-10-29 14:30:58 -04:00
Erik Sjölund 6628d7a3d9 utils: check for snprintf truncation
Signed-off-by: Erik Sjölund <erik.sjolund@gmail.com>
2024-10-29 18:46:10 +01:00
Giuseppe Scrivano 7c4a3f9cbc
cgroup: skip DevicePolicy if all devices are allowed
Closes: https://github.com/containers/crun/issues/1589

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-10-29 12:41:23 +01:00
Giuseppe Scrivano 3b31243146
Merge pull request #1593 from giuseppe/deprecate-cgroup-v1
libcrun: deprecate cgroup v1
2024-10-29 11:38:10 +01:00
Giuseppe Scrivano ef60ec905f
libcrun: deprecate cgroup v1
we are two months away from 2025, cgroup v1 should not be used
anymore.

For now, add a warning when cgroup v1 is used.

Closes: https://github.com/containers/crun/issues/1149

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-10-28 12:18:04 +01:00
Giuseppe Scrivano 77e4233a2d
cgroup, systemd: ignore rules before a default deny one
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-10-28 10:29:11 +01:00
Giuseppe Scrivano 8a30a57a35
cgroup: ignore redundant deny dev cgroup rules
Closes: https://github.com/containers/crun/issues/1588

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-10-28 10:29:11 +01:00
Daniel J Walsh 0cf2e6cb97
Merge pull request #1590 from giuseppe/add-CONTRIBUTING.md
CONTRIBUTING.md: new file
2024-10-25 12:00:38 -04:00
Giuseppe Scrivano 369dd95be6
CONTRIBUTING.md: new file
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-10-25 16:40:02 +02:00
Giuseppe Scrivano 135f6aae92
Merge pull request #1587 from aconz2/uidgid-xstrdup
linux: copy map_file before tokenizing in uidgidmap_helper
2024-10-25 13:13:11 +02:00
Andrew Consroe 3647ecabff linux: copy map_file before tokenizing in uidgidmap_helper
libcrun_set_usernamespace passes uid_map/gid_map to uidgidmap_helper
which tokenizes it to pass as process args. But if the helper isn't available,
the fallback (when host_uid != 0) reuses this tokenized string and tries
writing it to /proc/pid/gid_map which fails with EINVAL

Signed-off-by: Andrew Consroe <aconz2@gmail.com>
2024-10-24 15:48:23 -05:00
Giuseppe Scrivano 2381713dd6
Merge pull request #1584 from giuseppe/tag-1.18
NEWS: tag 1.18
2024-10-22 14:23:07 +02:00
Giuseppe Scrivano 8656b25485
NEWS: tag 1.18
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-10-22 13:44:06 +02:00
Giuseppe Scrivano c98e705efc
Merge pull request #1582 from rst0git/add-lsm-options-to-man-page
crun.1.md: add lsm-profile and lsm-mount-context
2024-10-22 13:44:03 +02:00
Giuseppe Scrivano e0335d461a
Merge pull request #1583 from yselkowitz/patch-1
rpm: use embedded yajl in RHEL builds
2024-10-22 13:43:23 +02:00
Yaakov Selkowitz bf0a3516ba
rpm: use embedded yajl in RHEL builds
A standalone yajl library is unwanted in RHEL, and nothing else therein depends on it; based on c10s:

https://gitlab.com/redhat/centos-stream/rpms/crun/-/merge_requests/102

Signed-off-by: Yaakov Selkowitz <yselkowi@redhat.com>
2024-10-21 15:23:26 -04:00
Radostin Stoyanov 414612906f crun.1.md: add lsm-profile and lsm-mount-context
This pull request updates the man page with description for the
`--lsm-profile` and `--lsm-mount-context` options introduced in
https://github.com/containers/crun/pull/1578

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2024-10-21 16:43:34 +01:00
Giuseppe Scrivano 093ed74b2e
Merge pull request #1581 from rst0git/fix-lsm
criu: fix loading of lsm functions
2024-10-21 17:28:50 +02:00
Radostin Stoyanov ed64259318 criu: load lsm functions
Fixes https://github.com/containers/crun/pull/1578

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2024-10-21 16:05:29 +01:00
Giuseppe Scrivano 2d36664524
Merge pull request #1578 from rst0git/lsm
restore: add lsm-profile and lsm-mount-context options
2024-10-21 14:17:09 +02:00
Radostin Stoyanov ce89aa662b
restore: add lsm-mount-context option
The lsm-mount-context option allows specifying a new mount context to be
used during restore. For example, if a mountpoint has been checkpointed
with context like

	context="system_u:object_r:container_file_t:s0:c82,c137"

it is possible to change this context using

	--lsm-mount-context "system_u:object_r:container_file_t:s0:c204,c495"

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2024-10-21 14:15:07 +02:00
Radostin Stoyanov 9efd6a87de
restore: add lsm-profile option
By default, CRIU restores containers with the same SELinux process
labels used during checkpointing. However, when restoring multiple
copies of a container, this results in all containers using identical
SELinux labels, which is undesirable.

In addition, all containers in a Pod share the SELinux label of the
infrastructure container. To restore a new container into an existing
Pod, we need to specify the SELinux label to be used during restore.

This patch adds `--lsm-profile` option for the `crun restore` command
to enable this functionality, similar to runc [1].

[1] https://github.com/opencontainers/runc/pull/3005

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2024-10-21 14:15:07 +02:00
Giuseppe Scrivano bfdabce162
Merge pull request #1580 from giuseppe/update-run-on-arch
github: update run-on-arch-action
2024-10-21 13:39:20 +02:00
Giuseppe Scrivano aee13711b5
github: update run-on-arch-action
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-10-21 13:10:58 +02:00
flouthoc 4097051548
Merge pull request #1577 from giuseppe/support-misc-controller
cgroup: add support for the misc controller
2024-10-19 08:26:23 -07:00
flouthoc f7557b701d
Merge pull request #1579 from giuseppe/support-io-weight-device
cgroup, systemd: add support for IODeviceWeight
2024-10-19 08:25:51 -07:00
Giuseppe Scrivano c4a65aad3d
cgroup: split lines when writing raw unified files
writing to io.weight, at least, fails with EINVAL when a single write
contains multiple lines.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-10-18 17:50:40 +02:00
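
Note: a minimal sketch of the line-splitting idea, assuming a hypothetical helper `write_line_by_line` that issues one `write` per line. The real cgroup file handling in crun is not shown here.

```c
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: issue one write() per non-empty line of `data`
   instead of a single multi-line write.  Returns the number of writes
   performed, or -1 on the first failed or short write.  */
static int
write_line_by_line (int fd, const char *data)
{
  int writes = 0;

  while (*data != '\0')
    {
      size_t len = strcspn (data, "\n"); /* length of the current line */

      if (len > 0)
        {
          if (write (fd, data, len) != (ssize_t) len)
            return -1;
          writes++;
        }

      data += len;
      if (*data == '\n')
        data++; /* skip the separator */
    }

  return writes;
}
```

For an input such as "8:0 200\n8:16 300\n" this performs two separate writes, one per device entry.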
Giuseppe Scrivano dd7adb2265
cgroup: write_cgroup_file_or_alias uses write_cgroup_file
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-10-18 17:50:40 +02:00
Giuseppe Scrivano 22b018d029
cgroup: convert block_io devices to IODeviceWeight
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-10-18 17:50:39 +02:00
Giuseppe Scrivano c7745e9a9a
cgroup, systemd: add support for IODeviceWeight
If the "io.(bfq)weight" is specified in the format "$DEVICE $VALUE",
pass the information to systemd using IODeviceWeight instead of
IOWeight, that is only used for the default value.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-10-18 17:49:54 +02:00
Giuseppe Scrivano 8e3e693e75
cgroup: refactor handling of io.weight
it is a preparation for the next commit

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-10-18 17:49:51 +02:00
Giuseppe Scrivano 7d0e2cdbe9
cgroup: report errors if value contains not parsed data
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-10-17 22:23:39 +02:00
flouthoc e9c9294ade
Merge pull request #1574 from giuseppe/more-systemd-props
cgroup: add support for more systemd properties
2024-10-17 09:14:25 -07:00
Giuseppe Scrivano efae52ab32
cgroup: add support for the misc controller
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-10-17 09:29:10 +02:00
Giuseppe Scrivano d55194b293
cgroup systemd: ignore unsupported properties
introduce a mechanism to detect and register the properties systemd doesn't
support so that we don't attempt to set them next time.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-10-16 16:23:06 +02:00
Giuseppe Scrivano 500cf80292
cgroup, systemd: honor cpu.idle
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-10-16 11:43:34 +02:00
Giuseppe Scrivano 5f64da6a13
linux: pass down state_root to the cgroup handler
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-10-16 11:42:28 +02:00
Giuseppe Scrivano 80d9677bbc
cgroup, systemd: honor memory.zswap.max
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-10-16 11:42:23 +02:00
Giuseppe Scrivano 01fa4993d6
cgroup: specify devices rules to systemd
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-10-16 11:42:18 +02:00
Giuseppe Scrivano 667442e4ed
cgroup: move standard devs definition in a common place
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-10-16 11:42:18 +02:00
Giuseppe Scrivano 335d8cfb06
cgroup: specify TasksMax to systemd
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-10-14 14:07:13 +02:00
Giuseppe Scrivano f6d8373f6f
cgroup: specify MemorySwapMax to systemd
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-10-14 14:07:12 +02:00
Giuseppe Scrivano 1a04566d7d
cgroup: specify MemoryLow|MemoryHigh|MemoryMin to systemd
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-10-14 14:07:11 +02:00
Giuseppe Scrivano 8d90eb3a5d
cgroup: use macro to refactor common pattern
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-10-14 14:07:08 +02:00
Giuseppe Scrivano 53cd1c1c69
Merge pull request #1575 from sky-big/main
add duplicate namespace detection
2024-10-14 11:52:46 +02:00
Xingwen Xu 34061ab5c8 add duplicate namespace detection
Signed-off-by: Xingwen Xu <sky_big@yeah.net>
2024-10-14 16:28:35 +08:00
Giuseppe Scrivano b29ccd7e1d
cgroup: rename function
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-10-11 13:26:09 +02:00
Giuseppe Scrivano af034b911c
cgroup: special handle value "max"
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-10-11 13:26:09 +02:00
Giuseppe Scrivano d854bf0df0
Merge pull request #1571 from lsm5/packit-constrain-downstream-jobs
Packit: constrain koji and bodhi jobs to the fedora package
2024-10-10 14:02:07 +02:00
flouthoc db8572fd4f
Merge pull request #1573 from giuseppe/set-io-weight-on-systemd-owned-cgroup
cgroup: set io weight on systemd owned cgroup
2024-10-09 07:16:15 -07:00
Giuseppe Scrivano 2825a5791c
cgroup: set io weight on systemd owned cgroup
the io weight must be set on the systemd owned cgroup because its
value is a weight compared to the sibling cgroups so it is pointless
to specify it in the sub-cgroup.

Closes: https://github.com/containers/crun/issues/1572

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-10-09 15:08:31 +02:00
Giuseppe Scrivano 291dd730e2
Merge pull request #1570 from giuseppe/update-nixpkgs-7-oct-2024
nix: update list of packages
2024-10-09 09:22:07 +02:00
Lokesh Mandvekar 6cf5324bab
Packit: constrain koji and bodhi jobs to the fedora package
Without this constraint, packit creates duplicate jobs when running
downstream jobs. This does not affect upstream.

Signed-off-by: Lokesh Mandvekar <lsm5@fedoraproject.org>
2024-10-08 16:45:47 +05:30
Giuseppe Scrivano 7140aea169
nix: replace gitMinimal with git
criu already uses git

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-10-08 11:46:20 +02:00
Giuseppe Scrivano a3fcba9f2b
Merge pull request #1568 from michalsieron/no-user-namespaces
Fix running on kernel without user namespaces
2024-10-08 09:21:19 +02:00
Michal Sieron 27b5a2f607 Fix running on kernel without user namespaces
As long as container is not rootless, runc can run it on a Kernel
without enabled user namespaces, which makes sense to me. I think crun
should be able to do the same, however I hit some issues with it.

First I got
    error opening file `/proc/self/uid_map`: No such file or directory
and that's because when kernel doesn't have user namespaces,
the uid_map doesn't exist at all. Adding a check for file existence,
after the read_all_file call, solves the issue.

With that change in place I got an error about next missing file.
This time it was /proc/self/setgroups and it was coming from
can_setgroups call. Adding similar fix to the one for uid_map, solves
the issue.

With both changes in place I am now able to run a non-rootless
container on a kernel without user namespaces enabled.

Signed-off-by: Michal Sieron <michalwsieron@gmail.com>
2024-10-07 16:27:48 +02:00
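
Note: the existence check mentioned above can be sketched as below. This is an illustration only; `path_exists` is a hypothetical helper, and the commit itself places the check around crun's own file-reading helpers.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if `path` exists, 0 if it does not
   (ENOENT), and -1 on any other error.  */
static int
path_exists (const char *path)
{
  if (access (path, F_OK) == 0)
    return 1;
  if (errno == ENOENT)
    return 0;
  return -1;
}
```

A caller could treat a result of 0 for `/proc/self/uid_map` or `/proc/self/setgroups` as "user namespaces are not available" rather than as a fatal error.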
Giuseppe Scrivano b5ff44f2f5
nix: update list of packages
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-10-07 16:08:08 +02:00
Giuseppe Scrivano 3b40d77322
build: specify --extra-experimental-features to nix
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-10-07 16:08:08 +02:00
Giuseppe Scrivano da61687585
release.sh: update nix image
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-10-07 16:08:08 +02:00
Daniel J Walsh 7c194cb726
Merge pull request #1569 from michalsieron/features-no-seccomp
Fix segfault in `crun features`
2024-10-07 10:05:21 -04:00
Michał Sieroń dee824e660 Fix segfault in `crun features`
When crun is compiled after configuring with --disable-seccomp, running
`crun features` subcommand segfaults.

This happens, because crun tries to add `actions` and `operators` arrays
to the printed JSON, but those arrays are actually NULLs.
Their initialization is guarded by `#ifdef HAVE_SECCOMP` in
`libcrun_container_get_features`.

Guarding their addition to the JSON with a NULL check solves the issue.
I also guarded `libseccomp.version` annotation with similar check.

Signed-off-by: Michal Sieron <michalwsieron@gmail.com>
2024-10-07 14:18:07 +02:00
Daniel J Walsh 4b75c7cb57
Merge pull request #1565 from KristinaHa26/riscv64
Disable criu support on riscv64
2024-09-26 14:07:48 -04:00
David Abdurachmanov 4ea62f2521 Disable criu support on riscv64
criu is not ported to riscv64 arch.

Signed-off-by: David Abdurachmanov <davidlt@rivosinc.com>
Signed-off-by: Kristina Hanicova <khanicov@redhat.com>
2024-09-26 15:53:54 +02:00
Giuseppe Scrivano 4ab4ac0798
Merge pull request #1559 from lsm5/packit-mkpath
Packit: Create missing path components in files_to_sync
2024-09-10 14:08:10 +02:00
Lokesh Mandvekar 969fd2ed16
Packit: Create missing path components in files_to_sync
Packit's propose-downstream failed[0] for crun 1.17 because packit wasn't
able to create `tests/tmt` path structure in the downstream repo[1].
[0]: https://github.com/containers/crun/issues/1558
[1]: https://dashboard.packit.dev/jobs/propose-downstream/10800

Using `mkpath: true` will create missing path components if any.
Ref: https://packit.dev/docs/configuration#files_to_sync

Fixes: #1558

Signed-off-by: Lokesh Mandvekar <lsm5@fedoraproject.org>
2024-09-10 16:37:33 +05:30
Giuseppe Scrivano 3edf8d47ea
Merge pull request #1557 from giuseppe/tag-1.17
NEWS: tag 1.17
2024-09-09 15:05:12 +02:00
Giuseppe Scrivano 000fa0d4ee
NEWS: tag 1.17
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-09-09 14:45:19 +02:00
Giuseppe Scrivano 558fce5196
Merge pull request #1556 from lsm5/packit-update-fedora-targets
Packit: reorg Fedora targets
2024-09-06 15:51:01 +02:00
Giuseppe Scrivano b3ee99ded0
Merge pull request #1552 from saschagrunert/logs
Add debug logs for container creation
2024-09-06 14:51:10 +02:00
Lokesh Mandvekar e3b5a263a6
Packit: Reuse Fedora targets wherever possible
This commit uses 2 anchors, `fedora_copr_targets` and `fedora_targets`
for copr and downstream jobs respectively.

Signed-off-by: Lokesh Mandvekar <lsm5@fedoraproject.org>
2024-09-06 18:19:22 +05:30
Lokesh Mandvekar 556b808989
Packit: separate out ELN build jobs
ELN is neither Fedora nor RHEL, but more of a midstream between the two.
It's best to split it out as a separate package set in packit. This will
allow reusability of Fedora targets specified in copr_build jobs in
other jobs.

Signed-off-by: Lokesh Mandvekar <lsm5@fedoraproject.org>
2024-09-06 16:20:26 +05:30
Daniel J Walsh c00c467d34
Merge pull request #1555 from giuseppe/fix-double-message
error: do not write error twice to stderr
2024-09-06 06:23:46 -04:00
Daniel J Walsh c1226ebd84
Merge pull request #1554 from giuseppe/simplify-dup-user
container: remove manual dup operation
2024-09-06 06:22:03 -04:00
Sascha Grunert a5320ae3e7
Add debug logs for container creation
Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
2024-09-06 10:09:04 +02:00
flouthoc 4467dd9da1
Merge pull request #1553 from giuseppe/hooks-ignore-EPIPE
linux: ignore EPIPE for hooks
2024-09-05 14:45:06 -07:00
Giuseppe Scrivano 228ad7c86a
container: remove manual dup operation
but use the version generated by libocispec.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-09-05 23:18:13 +02:00
Giuseppe Scrivano 13ea475ccc
libocispec: sync from upstream
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-09-05 23:18:13 +02:00
Giuseppe Scrivano 3dbf152b7b
error: do not write error twice to stderr
it solves the following error:

$ crun --log-level foo run foo
2024-09-05T21:13:00.024172Z: unknown verbosity `foo` specified
2024-09-05T21:13:00.024273Z: unknown verbosity `foo` specified

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-09-05 23:11:49 +02:00
Giuseppe Scrivano 5e35dfe0d8
libcrun: vanity, color debug messages
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-09-05 23:08:13 +02:00
Giuseppe Scrivano 2c4db997a9
linux: ignore EPIPE for hooks
ignore EPIPE when writing to the hook stdin, as the hook process could
already have been terminated.

Closes: https://github.com/containers/crun/issues/1551

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-09-05 22:46:23 +02:00
Giuseppe Scrivano 00fde03433
Merge pull request #1533 from lsm5/rpm-wasm-symlink
RPM/Packit: Fix wasm conditionals, cleanup rpm spec, update packit config
2024-09-05 20:16:57 +02:00
Lokesh Mandvekar 7fcede6004
RPM/Packit: Fix wasm conditionals, cleanup rpm spec, update packit config
1. Wrongly mentioned conditionals were causing crun-wasm package
   creation on CentOS Stream 10 and ELN environments which don't have
   wasm support yet.

2. All environments that end up consuming rpm/crun.spec have rpmautospec
   enabled so we don't need these conditionals anymore. CentOS Stream 9
   should soon get rpmautospec support as well, and the current lack of it
   is not really a blocker to removal of these conditionals.

3. Simplify TMT tests

Signed-off-by: Lokesh Mandvekar <lsm5@fedoraproject.org>
2024-09-05 19:03:18 +05:30
Giuseppe Scrivano df78c1618d
Merge pull request #1549 from saschagrunert/unused-result
Fix warning around unused result on chdir("/")
2024-09-05 13:27:00 +02:00
Giuseppe Scrivano 0f556b7c31
build: force install symlinks
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-09-05 15:42:03 +05:30
Giuseppe Scrivano 841a0de133
Merge pull request #1517 from sohankunkerkar/fix-trailing-slash
src/libcrun: fix handling of device paths with trailing slashes
2024-09-04 15:37:06 +02:00
Giuseppe Scrivano dc993bd8c7
Merge pull request #1548 from saschagrunert/err-tty
Report executable not found errors after tty has been setup
2024-09-04 13:10:19 +02:00
Sascha Grunert 23d5e498ff
Fix warning around unused result on chdir("/")
This fixes the warning while ignoring the return result of `chdir`:

```
src/libcrun/linux.c:5769:14: warning: ignoring return value of 'chdir' declared with attribute 'warn_unused_result' [-Wunused-result]
 5769 |       (void) chdir ("/");
```

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
2024-09-04 11:25:22 +02:00
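
Note: with GCC, a plain `(void)` cast does not silence `warn_unused_result`, which is why the warning above appears. One common way to ignore the result explicitly is to store it first; the sketch below uses a hypothetical helper name and is not necessarily the form chosen in the commit.

```c
#include <unistd.h>

/* Hypothetical helper: change to "/" on a best-effort basis.  Storing
   the result in a variable satisfies warn_unused_result; the cast then
   marks the variable as intentionally unused.  */
static void
chdir_root_best_effort (void)
{
  int ret = chdir ("/");

  (void) ret;
}
```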
Giuseppe Scrivano 7d73c798c4
Merge pull request #1546 from aconz2/getpwuid_r-alloc
fix getpwuid_r error handling
2024-09-04 11:12:36 +02:00
Giuseppe Scrivano 3a400b1aeb
Merge pull request #1547 from saschagrunert/stderr
Only log to stderr if `--log` is not provided
2024-09-04 11:10:37 +02:00
Sascha Grunert 6bf9e7c3de
Report executable not found errors after tty has been setup
This allows the error (like others) to be passed through the tty instead of
being lost.

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
2024-09-04 10:41:55 +02:00
Sascha Grunert a295e70ad3
Only log to stderr if `--log` is not provided
In any other case we log to the dedicated log driver for consistency.
This avoids having another option like `--log-stderr`.

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
2024-09-04 10:31:51 +02:00
Giuseppe Scrivano c5c667913d
Merge pull request #1545 from giuseppe/fix-containerd-tests-3-9-2024
tests: bump containerd version
2024-09-04 09:19:58 +02:00
Andrew Consroe fb593fcaaf fix getpwuid_r error handling
getpwuid_r returns the error directly, not -1. In a situation with
no /etc/passwd for example, ret is ENOENT (2) and so the branch is
never taken, resulting in a runaway realloc loop

Signed-off-by: Andrew Consroe <aconz2@gmail.com>
2024-09-03 17:58:45 -05:00
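
Note: `getpwuid_r` returns 0 on success or an error number directly, and reports "no such user" as a 0 return with a NULL result. A minimal sketch of a lookup loop that only retries on ERANGE, assuming a hypothetical helper `lookup_uid` (not the code from the commit):

```c
#include <errno.h>
#include <pwd.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper: look up `uid`, growing the scratch buffer only
   when getpwuid_r asks for it.  On success returns 0, fills `pwd`, and
   stores the buffer backing its strings in `*buf_out` (caller frees).
   Otherwise returns a positive error number.  */
static int
lookup_uid (uid_t uid, struct passwd *pwd, char **buf_out)
{
  size_t len = 1024;
  char *buf = NULL;

  for (;;)
    {
      struct passwd *result = NULL;
      char *tmp = realloc (buf, len);
      int ret;

      if (tmp == NULL)
        {
          free (buf);
          return ENOMEM;
        }
      buf = tmp;

      /* The error is the return value itself, never -1.  */
      ret = getpwuid_r (uid, pwd, buf, len, &result);
      if (ret == ERANGE)
        {
          len *= 2; /* buffer too small: the only case worth retrying */
          continue;
        }
      if (ret != 0 || result == NULL)
        {
          free (buf);
          return ret != 0 ? ret : ENOENT; /* real error, or no entry */
        }

      *buf_out = buf;
      return 0;
    }
}
```

Any error other than ERANGE, such as a missing `/etc/passwd`, ends the loop immediately instead of triggering another reallocation.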
Giuseppe Scrivano d29fdaecdb
tests: bump containerd version
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-09-03 23:49:45 +02:00
Giuseppe Scrivano f36c2163c3
tests: bump ubuntu version
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-09-03 23:49:45 +02:00
Giuseppe Scrivano d065a5a7da
Revert "Add `--log-stderr` option"
This reverts commit b98e0ddd22.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-09-03 23:49:45 +02:00
Giuseppe Scrivano 0da87ba3ad
Merge pull request #1544 from eriksjolund/fix-recvfrom-error
linux: fix recvfrom error handling
2024-09-03 16:50:38 +02:00
Sohan Kunkerkar dc31069afe src/libcrun: fix handling of device paths with trailing slashes
This change handles device paths with trailing slashes correctly.

Signed-off-by: Sohan Kunkerkar <sohank2602@gmail.com>
2024-09-03 10:49:26 -04:00
Daniel J Walsh 04e09b0d6f
Merge pull request #1543 from saschagrunert/gids-len
Fix `additional_gids_size` on `process_user_dup`
2024-09-03 08:40:36 -04:00
Erik Sjölund ab64a5cb5e linux: fix recvfrom error handling
Fixes: https://github.com/containers/crun/issues/1531

Signed-off-by: Erik Sjölund <erik.sjolund@gmail.com>
2024-09-03 14:12:09 +02:00
Sascha Grunert a32d433a23
Fix `additional_gids_size` on `process_user_dup`
The size needs to be set as well otherwise we may break `additional_gids`.

Found in critest, like: https://github.com/cri-o/cri-o/actions/runs/10680190170/job/29601173374

```
Summarizing 3 Failures:
  [FAIL] [k8s.io] Security Context SupplementalGroupsPolicy when SupplementalGroupsPolicy=Strict [It] even if the container's primary UID belongs to some groups in the image, runtime should not add SupplementalGroups to them
  sigs.k8s.io/cri-tools/pkg/validate/security_context_linux.go:737
  [FAIL] [k8s.io] Security Context bucket [It] runtime should support SupplementalGroups
  sigs.k8s.io/cri-tools/pkg/validate/security_context_linux.go:309
  [FAIL] [k8s.io] Security Context SupplementalGroupsPolicy when SupplementalGroupsPolicy=Merge (Default) [It] if the container's primary UID belongs to some groups in the image, runtime should add SupplementalGroups to them
  sigs.k8s.io/cri-tools/pkg/validate/security_context_linux.go:669
```

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
2024-09-03 11:39:19 +02:00
Giuseppe Scrivano 28b47d6c7f
Merge pull request #1541 from saschagrunert/journal-id
Allow passing an ID to journald log driver
2024-09-03 11:31:42 +02:00
Giuseppe Scrivano 6ad2b30058
Merge pull request #1542 from saschagrunert/log-stderr
Add `--log-stderr` option
2024-09-03 11:20:00 +02:00
Sascha Grunert b98e0ddd22
Add `--log-stderr` option
This option was inherited before, but now we can set it explicitly to
allow/disallow logging to stderr when selecting a certain log driver.

For example on exec, we would need to log anything to stdio even
when the debug logs are enabled.

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
2024-09-03 10:42:31 +02:00
Sascha Grunert 544fe3f0f4
Allow passing an ID to journald log driver
The ID has been dropped before this patch, but we can use it to refer to a
container ID or anything else identifiable.

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
2024-09-03 09:06:00 +02:00
Giuseppe Scrivano 9bc06d02c8
Merge pull request #1538 from bduffany/fix-double-free
Fix double-free of process user in crun exec
2024-09-02 21:47:34 +02:00
Giuseppe Scrivano de7694720a
Merge pull request #1539 from saschagrunert/log-docs
Add log options documentation
2024-09-02 14:59:44 +02:00
Giuseppe Scrivano e9a05a8517
Merge pull request #1540 from saschagrunert/log-context
Log only after crun context has been setup
2024-09-02 14:57:05 +02:00
Sascha Grunert 6d92b28b70
Log only after crun context has been setup
We should log only after the context setup, otherwise we may use the
wrong format or target.

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
2024-09-02 12:24:31 +02:00
Sascha Grunert 29259e4b1b
Add log options documentation
Adding the default values to the CLI help as well as the `--log-level`
to the manpage.

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
2024-09-02 12:16:19 +02:00
Giuseppe Scrivano 7501195f1e
Merge pull request #1530 from saschagrunert/autopatchelfhook
Add autoPatchelfHook to static build
2024-09-02 09:16:42 +02:00
Giuseppe Scrivano a3e9a88284
Merge pull request #1527 from saschagrunert/log-level
Add `--log-level` option
2024-09-02 09:15:49 +02:00
Giuseppe Scrivano f134a7f1bc
Merge pull request #1535 from sohankunkerkar/fix-ci-test
[WIP] src/libcrun: fix error handling in libcrun_kill_linux
2024-09-02 09:13:56 +02:00
Brandon Duffany f72483a75e Fix double-free in crun exec
The process struct cleanup in crun_command_exec was doubly freeing the
process->user pointer, which was previously freed by the container struct
cleanup in libcrun_container_exec_with_options.

Signed-off-by: Brandon Duffany <brandon@buildbuddy.io>
2024-09-01 21:32:14 -04:00
Giuseppe Scrivano 7dba7fc11f
Merge pull request #1534 from sohankunkerkar/handle-setns-error
src/libcrun: improve error handling for the mnt namespace restoration
2024-08-30 07:45:37 +02:00
Sohan Kunkerkar e4b4a21018 src/libcrun: fix error handling in libcrun_kill_linux
Signed-off-by: Sohan Kunkerkar <sohank2602@gmail.com>
2024-08-29 11:44:08 -04:00
Giuseppe Scrivano f54d383e00
Merge pull request #1532 from sohankunkerkar/fix-error
src/libcrun: added custom error message for ESRCH case
2024-08-29 16:23:12 +02:00
Sohan Kunkerkar 83c13558f4 src/libcrun: improve error handling for the mnt namespace restoration
This patch addresses the issue where the "setns `mnt`: Bad file descriptor"
error can occur during mount namespace (CLONE_NEWNS) restoration. The error
arises when attempting to restore the mount namespace without properly
tracking whether it was changed.

Signed-off-by: Sohan Kunkerkar <sohank2602@gmail.com>
2024-08-28 13:04:44 -04:00
Sohan Kunkerkar 6fb1f08c5b src/libcrun: added custom error message for ESRCH case
Fixes https://github.com/containers/crun/issues/1510

Signed-off-by: Sohan Kunkerkar <sohank2602@gmail.com>
2024-08-28 06:54:10 -04:00
Giuseppe Scrivano 7e36280031
Merge pull request #1479 from eriksjolund/add-more-O_PATH
Add more O_PATH flags
2024-08-27 11:34:18 +02:00
Sascha Grunert 9f06d3c7dd
Add autoPatchelfHook to static build
This should fix the wrong dlopen search path mentioned in
https://github.com/cri-o/cri-o/issues/8518

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
2024-08-26 14:32:13 +02:00
Sascha Grunert 19b9893076
Add `--log-level` option
Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
2024-08-21 12:02:34 +02:00
Daniel J Walsh 80fa3db6c3
Merge pull request #1529 from giuseppe/test-fix-wasmedge
tests: fix wasmedge build
2024-08-21 05:50:07 -04:00
Giuseppe Scrivano fd7f50a837
tests: fix wasmedge build
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-08-21 08:23:44 +02:00
Giuseppe Scrivano 457d02acf7
Merge pull request #1528 from NilsIrl/patch-1
Remove libcrun_setup_terminal_ptmx
2024-08-20 19:03:45 +02:00
Nils 0380369527 Remove libcrun_setup_terminal_ptmx
Since https://github.com/containers/crun/pull/1214 it is a no-op and
does nothing more than call set_raw.

Signed-off-by: Nils <nils@nilsand.re>
2024-08-20 14:31:18 +00:00
Giuseppe Scrivano 10b3038c13
Merge pull request #1526 from sohankunkerkar/fix-defaultdependencies
src/libcrun: ensure DefaultDependencies respects CRI-O annotation
2024-08-14 22:12:13 +02:00
Giuseppe Scrivano d6bbc552c1
Merge pull request #1523 from giuseppe/configure-fix-condition
configure.ac: fix condition for wasm detection
2024-08-14 22:06:47 +02:00
Sohan Kunkerkar 1edf6d06e5 src/libcrun: ensure DefaultDependencies respects CRI-O annotation
The issue with DefaultDependencies being set to 'no' instead of 'yes'
was caused by the order of the code. The annotations were being processed
before setting the DefaultDependencies property, which allowed crun's
default behavior to override the CRI-O annotation.

Signed-off-by: Sohan Kunkerkar <sohank2602@gmail.com>
2024-08-14 15:30:41 -04:00
Giuseppe Scrivano 42b0b99358
configure.ac: fix condition for wasm detection
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-08-13 17:39:22 +02:00
Giuseppe Scrivano 26c7687b4a
Merge pull request #1520 from giuseppe/improve-make-parent-private
linux: attempt to make rootfs private too
2024-08-13 16:36:09 +02:00
Erik Sjölund bfa0640011 Add more O_PATH flags
Fixes: https://github.com/containers/crun/issues/1478

Signed-off-by: Erik Sjölund <erik.sjolund@gmail.com>
2024-08-04 12:47:01 +02:00
134 changed files with 9175 additions and 2502 deletions


@ -15,7 +15,7 @@ jobs:
- uses: actions/cache@v4
with:
path: .cache
key: nix-v1-2.12.0-${{ hashFiles('nix/nixpkgs.json') }}
key: nix-v1-2.24.9-${{ hashFiles('nix/nixpkgs.json') }}
- run: sudo apt-get update
@ -25,7 +25,7 @@ jobs:
set -ex
sudo mkdir -p .cache
sudo mv .cache /nix
if [[ -z $(ls -A /nix) ]]; then sudo docker run --rm --privileged -v /:/mnt nixos/nix:2.12.0 cp -rfT /nix /mnt/nix; fi
if [[ -z $(ls -A /nix) ]]; then sudo docker run --rm --privileged -v /:/mnt nixos/nix:2.24.9 cp -rfT /nix /mnt/nix; fi
sudo RUNTIME=docker SKIP_CHECKS=1 SKIP_GPG=1 build-aux/release.sh
sudo mv /nix .cache
sudo chown -Rf $(whoami) .cache


@ -6,18 +6,21 @@ jobs:
build_job:
runs-on: ubuntu-latest
name: Build on ${{ matrix.arch }}
permissions:
contents: read
packages: write
strategy:
matrix:
include:
- arch: armv7
distro: ubuntu_latest
- arch: aarch64
distro: ubuntu_latest
- arch: s390x
distro: ubuntu_latest
- arch: ppc64le
distro: ubuntu_latest
# - arch: aarch64
# distro: ubuntu_latest
# - arch: s390x
# distro: ubuntu_latest
# - arch: ppc64le
# distro: ubuntu_latest
- arch: riscv64
distro: ubuntu_latest
steps:
@ -26,7 +29,7 @@ jobs:
submodules: true
set-safe-directory: true
- uses: uraimo/run-on-arch-action@v2.7.2
- uses: uraimo/run-on-arch-action@v3.0.1
name: Build
id: build
with:
@ -36,13 +39,13 @@ jobs:
githubToken: ${{ github.token }}
install: |
apt-get update -q -y
apt-get install -q -y automake libtool autotools-dev libseccomp-dev git make libcap-dev cmake pkg-config gcc wget go-md2man libsystemd-dev gperf clang-format libyajl-dev libprotobuf-c-dev
apt-get update -y
apt-get install -y automake libtool autotools-dev libseccomp-dev git make libcap-dev cmake pkg-config gcc wget go-md2man libsystemd-dev gperf clang-format libyajl-dev libprotobuf-c-dev clang mawk
run: |
find $(pwd) -name '.git' -exec bash -c 'git config --global --add safe.directory ${0%/.git}' {} \;
./autogen.sh
./configure CFLAGS='-Wall -Werror'
./configure CFLAGS='-Wall -Werror' || cat config.log
make -j $(nproc) -C libocispec libocispec.la
make git-version.h
make -j $(nproc) libcrun.la
@ -91,7 +94,7 @@ jobs:
sudo add-apt-repository -y ppa:criu/ppa
# add-apt-repository runs apt-get update so we don't have to.
sudo apt-get install -q -y criu automake libtool autotools-dev libseccomp-dev git make libcap-dev cmake pkg-config gcc wget go-md2man libsystemd-dev gperf clang-format libyajl-dev containerd runc libasan6 libprotobuf-c-dev
sudo apt-get install -q -y criu automake libtool autotools-dev libseccomp-dev git make libcap-dev cmake pkg-config gcc wget go-md2man libsystemd-dev gperf clang-format libyajl-dev containerd runc libasan6 libprotobuf-c-dev mawk
- name: run autogen.sh
run: |
@ -107,16 +110,20 @@ jobs:
make -j $(nproc)
;;
check)
sudo sysctl -w kernel.apparmor_restrict_unprivileged_userns=0
./configure --disable-dl
make
make syntax-check
echo run tests as root
sudo make check ASAN_OPTIONS=detect_leaks=false || cat test-suite.log
sudo make check ASAN_OPTIONS=detect_leaks=false || (cat test-suite.log; exit 1)
echo run tests as rootless
make check ASAN_OPTIONS=detect_leaks=false || (cat test-suite.log; exit 1)
echo run tests as rootless in a user namespace
unshare -r make check ASAN_OPTIONS=detect_leaks=false || (cat test-suite.log; exit 1)
git status
git diff
# check that the working dir is clean
git describe --broken --dirty --all | grep -qv dirty
;;
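The switch from `|| cat test-suite.log` to `|| (cat test-suite.log; exit 1)` in the hunk above matters because the right-hand side of `||` decides the exit status of the whole list. A minimal sketch, assuming `false` as a stand-in for a failing `make check` and `echo` as a stand-in for `cat test-suite.log` (neither is the real command):

```bash
# `false` stands in for a failing `make check`; `echo` stands in for
# `cat test-suite.log`. Both are placeholders for illustration only.

# Old form: printing the log succeeds, so the failure is masked.
run_old() { false || echo "log contents"; }

# New form: the subshell prints the log and then exits non-zero.
run_new() { false || (echo "log contents"; exit 1); }

run_old >/dev/null && old_status=0 || old_status=$?
run_new >/dev/null && new_status=0 || new_status=$?

echo "old form exit status: $old_status"
echo "new form exit status: $new_status"
```

With the old form the CI step would report success even though the tests failed; the new form still prints the log but keeps the step red.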
@ -183,27 +190,27 @@ jobs:
esac
shellcheck:
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: vars
run: |
echo 'VERSION=v0.8.0' >> $GITHUB_ENV
echo 'BASEURL=https://github.com/koalaman/shellcheck/releases/download' >> $GITHUB_ENV
echo 'SHA256SUM=f4bce23c11c3919c1b20bcb0f206f6b44c44e26f2bc95f8aa708716095fa0651' >> $GITHUB_ENV
echo ~/bin >> $GITHUB_PATH
- name: install shellcheck
env:
VERSION: v0.10.0
BASEURL: https://github.com/koalaman/shellcheck/releases/download
SHA256: f35ae15a4677945428bdfe61ccc297490d89dd1e544cc06317102637638c6deb
run: |
mkdir ~/bin
curl -sSfL --retry 5 $BASEURL/$VERSION/shellcheck-$VERSION.linux.x86_64.tar.xz |
tar xfJ - -C ~/bin --strip 1 shellcheck-$VERSION/shellcheck
sha256sum ~/bin/shellcheck | grep -q $SHA256SUM
sha256sum --strict --check - <<<"$SHA256 *$HOME/bin/shellcheck"
# make sure to remove the old version
sudo rm -f /usr/bin/shellcheck
# Add ~/bin to $PATH.
echo ~/bin >> $GITHUB_PATH
- name: install dependencies
run: |
sudo apt-get update -q -y
sudo apt-get install -q -y automake libtool autotools-dev libseccomp-dev git make libcap-dev cmake pkg-config gcc wget go-md2man libsystemd-dev gperf clang-format libyajl-dev libprotobuf-c-dev
sudo apt-get install -q -y automake libtool autotools-dev libseccomp-dev git make libcap-dev cmake pkg-config gcc wget go-md2man libsystemd-dev gperf clang-format libyajl-dev libprotobuf-c-dev mawk
- uses: lumaxis/shellcheck-problem-matchers@v2
- name: shellcheck
run: |


@ -11,12 +11,15 @@ files_to_sync:
- src: plans/
dest: plans/
delete: true
mkpath: true
- src: tests/tmt/
dest: tests/tmt/
delete: true
mkpath: true
- src: .fmf/
dest: .fmf/
delete: true
mkpath: true
- .packit.yaml
packages:
@ -26,7 +29,7 @@ packages:
crun-centos:
pkg_tool: centpkg
specfile_path: rpm/crun.spec
crun-rhel:
crun-eln:
specfile_path: rpm/crun.spec
srpm_build_deps:
@ -46,9 +49,15 @@ jobs:
notifications: &copr_build_failure_notification
failure_comment:
message: "Ephemeral COPR build failed. @containers/packit-build please check."
targets:
targets: &fedora_copr_targets
- fedora-all-x86_64
- fedora-all-aarch64
- job: copr_build
trigger: pull_request
packages: [crun-eln]
notifications: *copr_build_failure_notification
targets:
- fedora-eln-x86_64
- fedora-eln-aarch64
@ -56,25 +65,11 @@ jobs:
trigger: pull_request
packages: [crun-centos]
notifications: *copr_build_failure_notification
targets:
# Need epel9 repos to fetch wasmedge build dependency
centos-stream-9-x86_64:
additional_repos:
- https://dl.fedoraproject.org/pub/epel/9/Everything/x86_64/
centos-stream-9-aarch64:
additional_repos:
- https://dl.fedoraproject.org/pub/epel/9/Everything/aarch64/
# TODO: build on CS10 with wasmedge once epel-10 is available
centos-stream-10-x86_64: {}
centos-stream-10-aarch64: {}
- job: copr_build
trigger: pull_request
packages: [crun-rhel]
notifications: *copr_build_failure_notification
targets:
- epel-9-x86_64
- epel-9-aarch64
targets: &centos_copr_targets
- centos-stream-9-x86_64
- centos-stream-9-aarch64
- centos-stream-10-x86_64
- centos-stream-10-aarch64
# Run on commit to main branch
- job: copr_build
@ -91,55 +86,55 @@ jobs:
- job: tests
trigger: pull_request
packages: [crun-fedora]
notifications: &podman_system_test_fail_notification
notifications: &test_failure_notification
failure_comment:
message: "podman system tests failed. @containers/packit-build please check."
targets:
- fedora-all-x86_64
- fedora-all-aarch64
message: "TMT tests failed. @containers/packit-build please check."
targets: *fedora_copr_targets
tf_extra_params:
environments:
- artifacts:
- type: repository-file
id: https://copr.fedorainfracloud.org/coprs/rhcontainerbot/podman-next/repo/fedora-$releasever/rhcontainerbot-podman-next-fedora-$releasever.repo
# Podman system tests for CentOS Stream
- job: tests
trigger: pull_request
packages: [crun-centos]
notifications: *podman_system_test_fail_notification
notifications: *test_failure_notification
# TODO: Re-enable centos-stream-10-x86_64 once criu issues are solved
# Ref: https://github.com/containers/crun/pull/1758#issuecomment-2901772392
# Issue filed: https://github.com/containers/crun/issues/1759
#targets: *centos_copr_targets
targets:
- centos-stream-9-x86_64
- centos-stream-9-aarch64
- centos-stream-10-x86_64
- centos-stream-10-aarch64
# Podman system tests for RHEL
- job: tests
trigger: pull_request
packages: [crun-rhel]
use_internal_tf: true
notifications: *podman_system_test_fail_notification
targets:
epel-9-x86_64:
distros: [RHEL-9.4.0-Nightly,RHEL-9-Nightly]
epel-9-aarch64:
distros: [RHEL-9.4.0-Nightly,RHEL-9-Nightly]
#TODO: Enable RHEL10 targets once epel-10 copr target is available
tf_extra_params:
environments:
- artifacts:
- type: repository-file
id: https://copr.fedorainfracloud.org/coprs/rhcontainerbot/podman-next/repo/centos-stream-$releasever/rhcontainerbot-podman-next-centos-stream-$releasever.repo
- job: propose_downstream
trigger: release
packages: [crun-fedora]
dist_git_branches:
dist_git_branches: &fedora_targets
- fedora-all
# Disabled until we're switching to Packit for CentOS Stream
- job: propose_downstream
trigger: release
trigger: ignore
packages: [crun-centos]
dist_git_branches:
- c10s
- job: koji_build
trigger: commit
dist_git_branches:
- fedora-all
packages: [crun-fedora]
dist_git_branches: *fedora_targets
- job: bodhi_update
trigger: commit
packages: [crun-fedora]
dist_git_branches:
- fedora-branched # rawhide updates are created automatically

148
CONTRIBUTING.md Normal file

@ -0,0 +1,148 @@
# Contributing to crun
Thanks for your interest in contributing!
## Note: Before writing a big patch
If you plan to contribute a large change, please get in touch *before*
submitting a pull request by e.g. filing an issue describing your proposed
change. This will help ensure alignment.
## Background knowledge
You will need to understand the C programming language.
## Development environment
At its core, crun is a low-dependency C library and CLI tool. You'll
need a Linux environment, which could be a container or a VM/physical system.
crun should be buildable on nearly any relatively modern Linux OS/distribution.
### Building and testing
crun uses [GNU Autotools](https://www.gnu.org/software/automake/manual/html_node/Autotools-Introduction.html) for its build system. Ensure that Autotools is installed and properly configured on your system.
#### Setup
To set up the build, run the following commands in the root directory:
```bash
./autogen.sh
./configure --prefix=/usr
make -j $(nproc)
```
## Testing
To run the crun test suite, you can use the following command:
```bash
make check
```
## Code linting
Be sure you've run
```
make clang-format
```
to reformat the code automatically.
## Submitting a patch
The podman project has some [generic useful guidance](https://github.com/containers/podman/blob/main/CONTRIBUTING.md#submitting-pull-requests);
like that project, a "Developer Certificate of Origin" is required.
### Sign your PRs
The sign-off is a line at the end of the explanation for the patch. Your
signature certifies that you wrote the patch or otherwise have the right to pass
it on as an open-source patch. The rules are simple: if you can certify
the below (from [developercertificate.org](https://developercertificate.org/)):
```
Developer Certificate of Origin
Version 1.1
Copyright (C) 2004, 2006 The Linux Foundation and its contributors.
660 York Street, Suite 102,
San Francisco, CA 94110 USA
Everyone is permitted to copy and distribute verbatim copies of this
license document, but changing it is not allowed.
Developer's Certificate of Origin 1.1
By making a contribution to this project, I certify that:
(a) The contribution was created in whole or in part by me and I
have the right to submit it under the open source license
indicated in the file; or
(b) The contribution is based upon previous work that, to the best
of my knowledge, is covered under an appropriate open source
license and I have the right under that license to submit that
work with modifications, whether created in whole or in part
by me, under the same open source license (unless I am
permitted to submit under a different license), as indicated
in the file; or
(c) The contribution was provided directly to me by some other
person who certified (a), (b) or (c) and I have not modified
it.
(d) I understand and agree that this project and the contribution
are public and that a record of the contribution (including all
personal information I submit with it, including my sign-off) is
maintained indefinitely and may be redistributed consistent with
this project or the open source license(s) involved.
```
Then you just add a line to every git commit message:
Signed-off-by: Joe Smith <joe.smith@email.com>
Use your real name (sorry, no pseudonyms or anonymous contributions.)
If you set your `user.name` and `user.email` git configs, you can sign your
commit automatically with `git commit -s`.
### Git commit style
Please look at `git log` and match the commit log style, which is very
similar to the
[Linux kernel](https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git).
1. Title
- Specify the context or category of the changes e.g. `lib` for library changes, `docs` for document changes, `bin/<command-name>` for command changes, etc.
- Begin the title with the first letter of the first word capitalized.
- Aim for less than 50 characters, otherwise 72 characters max.
- Do not end the title with a period.
- Use an [imperative tone](https://en.wikipedia.org/wiki/Imperative_mood).
2. Body
- Separate the body with a blank line after the title.
- Begin a paragraph with the first letter of the first word capitalized.
- Each paragraph should be formatted within 72 characters.
- Content should be about what was changed and why this change was made.
- If your commit fixes an issue, the commit message should end with `Closes: #<number>`.
Commit Message example:
```bash
<context>: Less than 50 characters for subject title
A paragraph of the body should be within 72 characters.
This paragraph is also less than 72 characters.
```
For more information see [How to Write a Git Commit Message](https://chris.beams.io/posts/git-commit/)
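The title rules above can be checked mechanically before committing. The following is a rough sketch only: `check_title` is a hypothetical helper, not part of crun's tooling, and it covers just three of the rules (the 72-character hard limit, no trailing period, and a `<context>: ` prefix).

```bash
# check_title is a hypothetical helper, not part of crun's tooling.
check_title() {
  title=$1
  # Reject titles longer than the 72-character hard limit.
  [ "${#title}" -le 72 ] || { echo "too long"; return 1; }
  # Reject titles that end with a period.
  case $title in
    *.) echo "ends with a period"; return 1 ;;
  esac
  # Require a "<context>: " prefix such as "linux: " or "cgroup: ".
  case $title in
    *": "*) ;;
    *) echo "missing context prefix"; return 1 ;;
  esac
  echo "ok"
}

check_title "linux: Add missing crun_make_error"
check_title "Fix the bug." || true
```

A check like this could be wired into a local `commit-msg` hook, but that wiring is left out here because the project does not describe one.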


@ -32,9 +32,11 @@ else
noinst_LTLIBRARIES = libcrun.la
endif
check_LIBRARIES = libcrun_testing.a
check_LTLIBRARIES = libcrun_testing.la
libcrun_SOURCES = src/libcrun/utils.c \
src/libcrun/string_map.c \
src/libcrun/ring_buffer.c \
src/libcrun/blake3/blake3.c \
src/libcrun/blake3/blake3_portable.c \
src/libcrun/cgroup-cgroupfs.c \
@ -57,6 +59,7 @@ libcrun_SOURCES = src/libcrun/utils.c \
src/libcrun/handlers/wasmedge.c \
src/libcrun/handlers/wasmer.c \
src/libcrun/handlers/wasmtime.c \
src/libcrun/handlers/wamr.c \
src/libcrun/intelrdt.c \
src/libcrun/io_priority.c \
src/libcrun/linux.c \
@ -66,6 +69,7 @@ libcrun_SOURCES = src/libcrun/utils.c \
src/libcrun/seccomp_notify.c \
src/libcrun/signals.c \
src/libcrun/status.c \
src/libcrun/net_device.c \
src/libcrun/terminal.c
if HAVE_EMBEDDED_YAJL
@ -83,9 +87,9 @@ libcrun_la_LIBADD = libocispec/libocispec.la $(FOUND_LIBS) $(maybe_libyajl.la)
libcrun_la_LDFLAGS = -Wl,--version-script=$(abs_top_srcdir)/libcrun.lds
# build a version with all the symbols visible for testing
libcrun_testing_a_SOURCES = $(libcrun_SOURCES)
libcrun_testing_a_CFLAGS = -I $(abs_top_builddir)/libocispec/src -I $(abs_top_srcdir)/libocispec/src -fvisibility=default
libcrun_testing_a_LIBADD = libocispec/libocispec.la $(maybe_libyajl.la)
libcrun_testing_la_SOURCES = $(libcrun_SOURCES)
libcrun_testing_la_CFLAGS = -I $(abs_top_builddir)/libocispec/src -I $(abs_top_srcdir)/libocispec/src -fvisibility=default
libcrun_testing_la_LIBADD = libocispec/libocispec.la $(maybe_libyajl.la)
if PYTHON_BINDINGS
pyexec_LTLIBRARIES = python_crun.la
@ -130,7 +134,7 @@ endif
crun_CFLAGS = -I $(abs_top_builddir)/libocispec/src -I $(abs_top_srcdir)/libocispec/src -D CRUN_LIBDIR="\"$(CRUN_LIBDIR)\""
crun_SOURCES = src/crun.c src/run.c src/delete.c src/kill.c src/pause.c src/unpause.c src/oci_features.c src/spec.c \
src/exec.c src/list.c src/create.c src/start.c src/state.c src/update.c src/ps.c \
src/checkpoint.c src/restore.c src/libcrun/cloned_binary.c
src/checkpoint.c src/restore.c src/mounts.c src/run_create.c
if DYNLOAD_LIBCRUN
crun_LDFLAGS = -Wl,--unresolved-symbols=ignore-all $(CRUN_LDFLAGS)
@ -141,8 +145,8 @@ endif
EXTRA_DIST = COPYING COPYING.libcrun README.md NEWS SECURITY.md rpm/crun.spec autogen.sh \
src/libcrun/blake3/blake3_impl.h src/libcrun/blake3/blake3.h \
src/crun.h src/list.h src/run.h src/delete.h src/kill.h src/pause.h src/unpause.h \
src/create.h src/start.h src/state.h src/exec.h src/oci_features.h src/spec.h src/update.h src/ps.h \
src/crun.h src/list.h src/run.h src/run_create.h src/delete.h src/kill.h src/pause.h src/unpause.h \
src/create.h src/start.h src/state.h src/exec.h src/oci_features.h src/spec.h src/update.h src/ps.h src/mounts.h \
src/checkpoint.h src/restore.h src/libcrun/seccomp_notify.h src/libcrun/seccomp_notify_plugin.h \
src/libcrun/container.h src/libcrun/seccomp.h src/libcrun/ebpf.h \
src/libcrun/cgroup.h src/libcrun/cgroup-cgroupfs.h \
@ -153,12 +157,13 @@ EXTRA_DIST = COPYING COPYING.libcrun README.md NEWS SECURITY.md rpm/crun.spec au
src/libcrun/handlers/handler-utils.h \
src/libcrun/linux.h src/libcrun/utils.h src/libcrun/error.h src/libcrun/criu.h \
src/libcrun/scheduler.h src/libcrun/status.h src/libcrun/terminal.h \
src/libcrun/mount_flags.h src/libcrun/intelrdt.h \
src/libcrun/mount_flags.h src/libcrun/intelrdt.h src/libcrun/ring_buffer.h src/libcrun/string_map.h \
src/libcrun/net_device.h \
crun.1.md crun.1 libcrun.lds \
krun.1.md krun.1 \
lua/luacrun.rockspec
UNIT_TESTS = tests/tests_libcrun_utils tests/tests_libcrun_errors tests/tests_libcrun_intelrdt
UNIT_TESTS = tests/tests_libcrun_utils tests/tests_libcrun_ring_buffer tests/tests_libcrun_errors tests/tests_libcrun_intelrdt
if ENABLE_CRUN
bin_PROGRAMS = crun
@ -167,9 +172,9 @@ else
noinst_PROGRAMS = crun
endif
noinst_PROGRAMS += tests/init $(UNIT_TESTS) tests/tests_libcrun_fuzzer
check_PROGRAMS = tests/init $(UNIT_TESTS) tests/tests_libcrun_fuzzer
TESTS_LDADD = libcrun_testing.a $(FOUND_LIBS) $(maybe_libyajl.la)
TESTS_LDADD = libcrun_testing.la $(FOUND_LIBS) $(maybe_libyajl.la)
tests_init_LDADD =
tests_init_LDFLAGS = -static-libgcc -all-static
@ -180,6 +185,11 @@ tests_tests_libcrun_utils_SOURCES = tests/tests_libcrun_utils.c
tests_tests_libcrun_utils_LDADD = $(TESTS_LDADD)
tests_tests_libcrun_utils_LDFLAGS = $(crun_LDFLAGS)
tests_tests_libcrun_ring_buffer_CFLAGS = -I $(abs_top_builddir)/libocispec/src -I $(abs_top_srcdir)/libocispec/src -I $(abs_top_builddir)/src -I $(abs_top_srcdir)/src
tests_tests_libcrun_ring_buffer_SOURCES = tests/tests_libcrun_ring_buffer.c
tests_tests_libcrun_ring_buffer_LDADD = $(TESTS_LDADD)
tests_tests_libcrun_ring_buffer_LDFLAGS = $(crun_LDFLAGS)
tests_tests_libcrun_intelrdt_CFLAGS = -I $(abs_top_builddir)/libocispec/src -I $(abs_top_srcdir)/libocispec/src -I $(abs_top_builddir)/src -I $(abs_top_srcdir)/src
tests_tests_libcrun_intelrdt_SOURCES = tests/tests_libcrun_intelrdt.c
tests_tests_libcrun_intelrdt_LDADD = $(TESTS_LDADD)
@ -223,7 +233,8 @@ PYTHON_TESTS = tests/test_capabilities.py \
tests/test_start.py \
tests/test_exec.py \
tests/test_seccomp.py \
tests/test_time.py
tests/test_time.py \
tests/test_bpf_devices.py
TESTS = $(PYTHON_TESTS) $(UNIT_TESTS)
@ -239,7 +250,7 @@ git-version.h:
fi
nixpkgs:
@nix run -f channel:nixpkgs-unstable nix-prefetch-git -- \
@nix --extra-experimental-features nix-command run -f channel:nixpkgs-unstable nix-prefetch-git -- \
--no-deepClone https://github.com/nixos/nixpkgs > nix/nixpkgs.json
dist-hook:
@ -273,10 +284,10 @@ endif HAVE_MD2MAN
install-exec-hook:
if ENABLE_KRUN
$(LN_S) crun$(EXEEXT) $(DESTDIR)$(bindir)/krun$(EXEEXT)
$(LN_S) -f crun$(EXEEXT) $(DESTDIR)$(bindir)/krun$(EXEEXT)
endif
if ENABLE_WASM
$(LN_S) crun$(EXEEXT) $(DESTDIR)$(bindir)/crun-wasm$(EXEEXT)
$(LN_S) -f crun$(EXEEXT) $(DESTDIR)$(bindir)/crun-wasm$(EXEEXT)
endif
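The `-f` added to the `$(LN_S)` calls above makes the install hook safe to run more than once. A small sketch of the difference, assuming `$(LN_S)` expands to `ln -s` (autoconf may pick a different command on some systems) and using a scratch directory in place of the real `$(DESTDIR)$(bindir)`:

```bash
# Scratch directory so the example does not touch a real bindir.
bindir=$(mktemp -d)
touch "$bindir/crun"

# First install: the link does not exist yet, so either form works.
ln -s -f crun "$bindir/krun"

# Second install over the existing link: plain `ln -s` fails,
# while `ln -s -f` replaces the link.
if ln -s crun "$bindir/krun" 2>/dev/null; then plain=ok; else plain=failed; fi
if ln -s -f crun "$bindir/krun"; then forced=ok; else forced=failed; fi

echo "plain: $plain, forced: $forced"
target=$(readlink "$bindir/krun")
rm -rf "$bindir"
```

Without `-f`, a second `make install` into the same prefix would stop at the existing `krun` or `crun-wasm` link.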
uninstall-hook:
@ -312,6 +323,6 @@ clang-format:
git ls-files src tests | grep -E "\\.[hc]" | grep -v "blake3\|chroot_realpath.c\|cloned_binary.c\|signals.c\|mount_flags.c" | xargs clang-format -style=file -i
shellcheck:
shellcheck tests/*/*.sh contrib/*.sh
shellcheck autogen.sh build-aux/release.sh tests/run_all_tests.sh tests/*/*.sh contrib/*.sh
.PHONY: coverity sync generate-rust-bindings generate-signals.c generate-mount_flags.c clang-format shellcheck

135
NEWS

@ -1,3 +1,138 @@
* crun-1.22
- crun: add a new command crun mounts to dynamically add or remove
mounts from a running container.
- linux: add support for moving existing network devices into the
container namespace as defined in the OCI specification.
- linux: add src-nofollow and dest-nofollow mount options for more
precise control over how symbolic links are handled.
- krun: implement support for external kernels, allowing users to
bundle a kernel image with the container.
- krun: the vCPU limit has been increased to 16.
- krun: add support for specifying the libkrun flavor via the
KRUN_VM_FILE.
- criu: fix checkpoint and restore for containers that have a bind
mount where the destination is a symbolic link.
- criu: automatically create the directory specified by --work-path if
it does not exist, improving compatibility with other runtimes.
- criu: re-enable support on the riscv64 architecture.
- cgroup: fix incorrect setting of cpu.max when the OCI quota is -1.
- hardening: replace all uses of the insecure sprintf function with
safer alternatives like snprintf to prevent buffer overflows.
- fix a regression that caused issues when dealing with paths that do
not exist and openat2 is not available.
- fix an issue where the file descriptor for the rootfs would become
stale if the rootfs was replaced by a mount.
- fix parsing of rootless options.
- fix a potential crash in krun by checking if library handles exist
before being unloaded.
- improve error messages for dlopen failures, making them more descriptive.
- cgroup: fix a regression on WSL when running with cgroup v1.
- libcrun: setup /dev/console as a symlink to pty instead of bind mount
when possible.
* crun-1.21
- criu: when running under systemd, use a proxy process to initialize
the cgroup so that all the container processes are restored in the
correct cgroup.
- set HOME to "/" if the specified user is not present in the /etc/passwd file.
- do not fail if any of stdin/stdout/stderr is closed.
- cgroup: fix handling of absent subcgroup when configuring cpuset on cgroup v1.
- ignore SIGWINCH when a tty is not used
- utils: improve error message if the specified command is not executable.
- fix PATH lookup. Support filenames starting with a dot.
- krun: create context after loading the library
- krun: stop using krun_set_exec but use the command line directly
from the OCI configuration file.
* crun-1.20
- krun: fix CVE-2025-24965. The .krun_config.json file could be
created outside of the container rootfs.
- cgroup: reverted the removal of `tun/tap` from the default allow
list, this was done in crun-1.5. The `tun/tap` device is now added
by default again.
- CRIU: do not set `network_lock` unless explicitly specified.
- status: disallow container names containing slashes in their name.
- linux: Improved error message when failing to set the
`net.ipv4.ping_group_range` sysctl.
- scheduler: Ignore `ENOSYS` errors when resetting the CPU affinity
mask.
- linux: return a better error message when `pidfd_open` fails with
`EINVAL`.
- cgroup: display the absolute path to `cgroup.controllers` when a
controller is unavailable.
- exec: always call setsid. Now processes created through `exec` get
the correct process group id.
* crun-1.19.1
- linux: fix a hang if there are no reads from the tty. Use non
blocking sockets to read and write from the tty so that the "crun
exec" process doesn't hang when the terminal is not consuming any
data.
- linux: remove the workaround needed to mount a cgroup on top of
another cgroup mount. The workaround had the disadvantage of
temporarily leaking a mount on the host. The alternative that is
currently used is to mount a temporary tmpfs between the two cgroup
mounts.
* crun-1.19
- wasm: add new handler wamr.
- criu: allow passing network lock method to libcriu.
- linux: honor exec cpu affinity mask.
- build: fix build with musl libc.
- crun: use mount API to self-clone.
- cgroup, systemd: do not override devices on update. If the "update" request has no
device block configured, do not reset the previous configuration.
- cgroup: handle case where cgroup v1 freezer is disabled. On systems without the
freezer controller, containers were mistakenly reported as paused.
- cgroup: do not stop process on exec. The cpu mask is configured on the systemd
scope, the previous workaround to stop the container until the cgroup is fully
configured is no longer needed.
* crun-1.18.2
- cgroup, systemd: fix a regression when a configuration file includes only one
default rule.
* crun-1.18.1
- cgroup: deprecate cgroup v1.
- cgroup: fix regression setting up the devices cgroup on cgroup v1.
- cgroup: fix regression and work again with the default Docker devices
configuration on systemd.
- linux: fix setting up user namespace when newuidmap/newgidmap are not available.
* crun-1.18
- cgroup: support running without a sub-cgroup with systemd. Use the
d-bus API to set the container limits on the systemd scope itself.
It allows running without a sub-cgroup when the systemd driver is
used, the run.oci.systemd.subgroup annotation controls it. For now,
a sub-cgroup is still created, but it might be changed in future.
- cgroup: add support for the misc controller.
- linux: fix running on kernel without user namespaces.
- criu, restore: add lsm-profile option.
- criu, restore: add lsm-mount-context option.
- linux: add duplicate namespace detection.
* crun-1.17
- Add `--log-level` option. It accepts `debug`, `warning` and `error`.
- Add debug logs for container creation.
- Fix double-free in crun exec code that could lead to a crash.
- Allow passing an ID to the journald log driver.
- Report "executable not found" errors after tty has been setup.
- Do not treat EPIPE from hooks as an error.
- Make sure `DefaultDependencies` is correctly set in the systemd scope.
- Improve the error message when the container process is not found.
- Improve error handling for the mnt namespace restoration.
- Fix error handling for `getpwuid_r`, `recvfrom` and `libcrun_kill_linux`.
- Fix handling of device paths with trailing slashes.
* crun-1.16.1
- fix a regression introduced by 1.16 where using 'rshared' rootfs


@ -56,31 +56,34 @@ These dependencies are required for the build:
### Fedora
```console
$ sudo dnf install -y make python git gcc automake autoconf libcap-devel \
systemd-devel yajl-devel libseccomp-devel pkg-config \
go-md2man glibc-static python3-libmount libtool
$ sudo dnf install -y \
autoconf automake gcc git-core glibc-static go-md2man \
libcap-devel libseccomp-devel libtool make pkg-config \
python python-libmount systemd-devel yajl-devel
```
### RHEL/CentOS 8
### RHEL/CentOS Stream 9
```console
$ sudo yum --enablerepo='*' --disablerepo='media-*' install -y make automake \
autoconf gettext \
libtool gcc libcap-devel systemd-devel yajl-devel \
glibc-static libseccomp-devel python36 git
$ sudo dnf config-manager --set-enabled crb
$ sudo dnf install -y \
autoconf automake gcc git-core glibc-static go-md2man \
libcap-devel libseccomp-devel libtool make pkg-config \
python python-libmount systemd-devel yajl-devel
```
go-md2man is not available on RHEL/CentOS 8, so if you'd like to build
the man page, you also need to manually install go-md2man. It can be
installed with:
### RHEL/CentOS Stream 10
```console
$ sudo yum --enablerepo='*' install -y golang
$ export GOPATH=$HOME/go
$ go get github.com/cpuguy83/go-md2man
$ export PATH=$PATH:$GOPATH/bin
$ sudo dnf config-manager --set-enabled crb
$ sudo dnf install -y \
autoconf automake gcc git-core glibc-static go-md2man \
libcap-devel libseccomp-devel libtool make pkg-config \
python python-libmount systemd-devel
```
NOTE that you need to add `--enable-embedded-yajl` to `./configure` flags below.
### Ubuntu
```console


@ -5,7 +5,7 @@ set -xeuo pipefail
SKIP_GPG=${SKIP_GPG:-}
SKIP_CHECKS=${SKIP_CHECKS:-}
NIX_IMAGE=${NIX_IMAGE:-nixos/nix:2.12.0}
NIX_IMAGE=${NIX_IMAGE:-nixos/nix:2.24.9}
test -e Makefile && make distclean
@ -13,54 +13,63 @@ test -e Makefile && make distclean
./configure
make -j $(nproc)
make -j "$(nproc)"
VERSION=$($(dirname $0)/git-version-gen --prefix "" .)
if test x$SKIP_CHECKS = x; then
grep $VERSION NEWS
VERSION="$("$(dirname "$0")/git-version-gen" --prefix "" .)"
if test "$SKIP_CHECKS" = ""; then
grep "$VERSION" NEWS
fi
OUTDIR=${OUTDIR:-release-$VERSION}
if test -e $OUTDIR; then
if test -e "$OUTDIR"; then
echo "the directory $OUTDIR already exists" >&2
exit 1
fi
mkdir -p $OUTDIR
mkdir -p "$OUTDIR"
rm -f crun-*.tar*
make dist-gzip
make ZSTD_OPT="--ultra -c22" dist-zstd
mv crun-*.tar.gz $OUTDIR
mv crun-*.tar.zst $OUTDIR
mv crun-*.tar.gz "$OUTDIR"
mv crun-*.tar.zst "$OUTDIR"
make distclean
RUNTIME=${RUNTIME:-podman}
RUNTIME_EXTRA_ARGS=${RUNTIME_EXTRA_ARGS:-}
read -r -a RUNTIME_EXTRA_ARGS <<< "${RUNTIME_EXTRA_ARGS:-}"
BUILD_CMD=(
"${RUNTIME:-podman}" run --init --rm
"${RUNTIME_EXTRA_ARGS[@]}"
--privileged
-v /nix:/nix -v "${PWD}:${PWD}"
-w "${PWD}"
"${NIX_IMAGE}"
nix
--extra-experimental-features nix-command
--print-build-logs
--option cores "$(nproc)"
--option max-jobs "$(nproc)"
build
--max-jobs auto
)
mkdir -p /nix
NIX_ARGS="--extra-experimental-features nix-command --print-build-logs --option cores $(nproc) --option max-jobs $(nproc)"
for ARCH in amd64 arm64 ppc64le riscv64 s390x; do
$RUNTIME run --rm $RUNTIME_EXTRA_ARGS --privileged -v /nix:/nix -v ${PWD}:${PWD} -w ${PWD} ${NIX_IMAGE} \
nix $NIX_ARGS build --max-jobs auto --file nix/default-${ARCH}.nix
cp ./result/bin/crun $OUTDIR/crun-$VERSION-linux-${ARCH}
"${BUILD_CMD[@]}" --file nix/default-${ARCH}.nix
cp ./result/bin/crun "$OUTDIR/crun-$VERSION-linux-${ARCH}"
rm -rf result
$RUNTIME run --rm $RUNTIME_EXTRA_ARGS --privileged -v /nix:/nix -v ${PWD}:${PWD} -w ${PWD} ${NIX_IMAGE} \
nix $NIX_ARGS build --max-jobs auto --file nix/default-${ARCH}.nix --arg enableSystemd false
cp ./result/bin/crun $OUTDIR/crun-$VERSION-linux-${ARCH}-disable-systemd
"${BUILD_CMD[@]}" --file nix/default-${ARCH}.nix --arg enableSystemd false
cp ./result/bin/crun "$OUTDIR/crun-$VERSION-linux-${ARCH}-disable-systemd"
rm -rf result
done
if test x$SKIP_GPG = x; then
for i in $OUTDIR/*; do
gpg2 -b --armour $i
if test "$SKIP_GPG" = ""; then
for i in "$OUTDIR"/*; do
gpg2 -b --armour "$i"
done
fi
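The script above moves from unquoted string variables to a Bash array (`BUILD_CMD`) and splits `RUNTIME_EXTRA_ARGS` with `read -r -a`. The sketch below shows why, under illustrative assumptions: `count_args` is a hypothetical helper, and the values of `RUNTIME_EXTRA_ARGS` and `workdir` are made up for the demonstration.

```bash
#!/usr/bin/env bash
# count_args is a hypothetical helper that reports how many arguments it got.
count_args() { echo "$#"; }

RUNTIME_EXTRA_ARGS="--network host"   # example value, not from the source
workdir="/tmp/my project"             # a path containing a space

# Old style: everything is one string and is re-split on whitespace,
# so the path is broken into two arguments.
cmd_string="run $RUNTIME_EXTRA_ARGS -w $workdir"
# shellcheck disable=SC2086
string_count=$(count_args $cmd_string)

# New style: split only the extra args, keep every other element intact.
read -r -a extra <<< "${RUNTIME_EXTRA_ARGS:-}"
cmd_array=(run "${extra[@]}" -w "$workdir")
array_count=$(count_args "${cmd_array[@]}")

echo "string: $string_count arguments, array: $array_count arguments"
```

The array form passes the path as a single argument, which is what the quoted `"${PWD}:${PWD}"` and `-w "${PWD}"` elements in `BUILD_CMD` rely on.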

8
cfg.mk

@ -17,6 +17,14 @@ local-checks-to-skip = \
sc_prohibit_always-defined_macros \
sc_prohibit_gnu_make_extensions
sc_prohibit_sprintf:
@prohibit='\<sprintf *\(' \
halt='do not use sprintf, use snprintf' \
$(_sc_search_regexp)
local-checks-available += $(sc_prohibit_sprintf)
#SHELL=bash -x
show-vc-list-except:
@$(VC_LIST_EXCEPT)
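The `prohibit` pattern in `sc_prohibit_sprintf` relies on the `\<` word-start anchor so that only bare `sprintf` calls are flagged. A quick way to try the pattern by hand, assuming GNU grep with extended regular expressions (the sample C lines are invented for the demonstration):

```bash
# Sample C lines; only the first should be flagged.
sample='sprintf (buf, "%d", x);
snprintf (buf, sizeof (buf), "%d", x);
vsprintf (buf, fmt, ap);'

# \< anchors at the start of a word, so vsprintf is not matched;
# " *" allows optional spaces before the opening parenthesis.
matches=$(printf '%s\n' "$sample" | grep -E '\<sprintf *\(')
echo "$matches"
```

Only the plain `sprintf (` line is printed; `snprintf` and `vsprintf` pass the check.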


@ -30,7 +30,7 @@ AC_CHECK_HEADERS([error.h linux/openat2.h stdatomic.h linux/ioprio.h])
AC_CHECK_TYPES([atomic_int], [], [], [[#include <stdatomic.h>]])
AC_CHECK_FUNCS(copy_file_range fgetxattr statx fgetpwent_r issetugid memfd_create)
AC_CHECK_FUNCS(eaccess hsearch_r copy_file_range fgetxattr statx fgetpwent_r issetugid memfd_create)
AC_ARG_ENABLE(crun,
AS_HELP_STRING([--enable-crun], [Include crun executable in installation (default: yes)]),
@ -49,7 +49,7 @@ AS_HELP_STRING([--enable-libcrun], [Include libcrun in installation (default: ye
case "${enableval}" in
yes) enable_libcrun=true ;;
no) enable_libcrun=false ;;
*) AC_MSG_ERROR(bad value ${enablevaal} for --enable-libcrun) ;;
*) AC_MSG_ERROR(bad value ${enableval} for --enable-libcrun) ;;
esac
],
[enable_libcrun=true])
@ -124,13 +124,17 @@ dnl include support for wasmedge (EXPERIMENTAL)
AC_ARG_WITH([wasmedge], AS_HELP_STRING([--with-wasmedge], [build with WasmEdge support]))
AS_IF([test "x$with_wasmedge" = "xyes"], AC_CHECK_HEADERS([wasmedge/wasmedge.h], AC_DEFINE([HAVE_WASMEDGE], 1, [Define if WasmEdge is available]), [AC_MSG_ERROR([*** Missing wasmedge headers])]))
dnl include support for wamr (EXPERIMENTAL)
AC_ARG_WITH([wamr], AS_HELP_STRING([--with-wamr], [build with WAMR support]))
AS_IF([test "x$with_wamr" = "xyes"], AC_CHECK_HEADERS([wasm_export.h], AC_DEFINE([HAVE_WAMR], 1, [Define if WAMR is available]), [AC_MSG_ERROR([*** Missing WAMR headers])]))
dnl include support for libkrun (EXPERIMENTAL)
AC_ARG_WITH([libkrun], AS_HELP_STRING([--with-libkrun], [build with libkrun support]))
AS_IF([test "x$with_libkrun" = "xyes"], AC_CHECK_HEADERS([libkrun.h], AC_DEFINE([HAVE_LIBKRUN], 1, [Define if libkrun is available]), [AC_MSG_ERROR([*** Missing libkrun headers])]))
AM_CONDITIONAL([ENABLE_KRUN], [test "x$with_libkrun" = xyes])
AM_CONDITIONAL([ENABLE_WASM], [test "x$with_wasmer" = xyes && test "x$with_wasmedge" = xyes && test "x$with_wasmtime" = xyes])
AM_CONDITIONAL([ENABLE_WASM], [test "x$with_wasmer" = xyes || test "x$with_wasmedge" = xyes || test "x$with_wasmtime" = xyes])
dnl include support for spin (EXPERIMENTAL)
AC_ARG_WITH([spin], AS_HELP_STRING([--with-spin], [build with spin support]))
@ -248,6 +252,8 @@ AS_IF([test "x$enable_criu" != "xno"], [
AC_MSG_NOTICE([CRIU version doesn't support join-ns API])])
PKG_CHECK_MODULES([CRIU_PRE_DUMP], [criu > 3.16.1], [have_criu_pre_dump="yes"], [have_criu_pre_dump="no"
AC_MSG_NOTICE([CRIU version doesn't support for pre-dumping])])
PKG_CHECK_MODULES([CRIU_NETWORK_LOCK_SKIP], [criu >= 3.19], [have_criu_network_lock_skip="yes"], [have_criu_network_lock_skip="no"
AC_MSG_NOTICE([CRIU version doesn't support CRIU_NETWORK_LOCK_SKIP])])
AS_IF([test "$have_criu" = "yes"], [
AC_DEFINE([HAVE_CRIU], 1, [Define if CRIU is available])
])
@ -257,8 +263,47 @@ AS_IF([test "x$enable_criu" != "xno"], [
AS_IF([test "$have_criu_pre_dump" = "yes"], [
AC_DEFINE([CRIU_PRE_DUMP_SUPPORT], 1, [Define if CRIU pre-dump support is available])
])
AS_IF([test "$have_criu_network_lock_skip" = "yes"], [
AC_DEFINE([CRIU_NETWORK_LOCK_SKIP_SUPPORT], 1, [Define if CRIU_NETWORK_LOCK_SKIP is available])
])
], [AC_MSG_NOTICE([CRIU support disabled per user request])])
AC_MSG_CHECKING([for log2])
AC_LINK_IFELSE([
AC_LANG_PROGRAM([
#include <math.h>
#include <stdlib.h>
], [
double result = log2 ((double) rand ());
return (int) result;
])
], [
# log2 works without -lm (musl libc)
AC_MSG_RESULT([yes])
AC_DEFINE([HAVE_LOG2], [1], [Define if log2 is available])
], [
# Try with -lm (glibc)
LIBS="$LIBS -lm"
AC_LINK_IFELSE([
AC_LANG_PROGRAM([
#include <math.h>
#include <stdlib.h>
], [
double result = log2 ((double) rand ());
return (int) result;
])
], [
# log2 works with -lm
AC_MSG_RESULT([yes (with -lm)])
AC_DEFINE([HAVE_LOG2], [1], [Define if log2 is available])
], [
# log2 not available - restore LIBS and fail
AC_MSG_RESULT([no])
AC_MSG_ERROR([*** log2 function is required but not found])
])
])
FOUND_LIBS=$LIBS
LIBS=""

276
crun.1

@ -1,28 +1,25 @@
'\" t
.nh
.TH crun 1 "User Commands"
.SH NAME
.PP
crun - a fast and lightweight OCI runtime
crun \- a fast and lightweight OCI runtime
.SH SYNOPSIS
.PP
crun [global options] command [command options] [arguments...]
.SH DESCRIPTION
.PP
crun is a command line program for running Linux containers that
follow the Open Container Initiative (OCI) format.
.SH COMMANDS
.PP
\fBcreate\fP
Create a container. The runtime detaches from the container process
once the container environment is created. It is necessary to
successively use \fB\fCstart\fR for starting the container.
successively use \fBstart\fR for starting the container.
.PP
\fBdelete\fP
@ -36,6 +33,14 @@ Exec a command in a running container.
\fBlist\fP
List known containers.
.PP
\fBmounts add\fP
Add mounts while the container is running. It requires two arguments: the container ID and a JSON file containing the mounts section of the OCI config file. Each mount listed there is added to the running container. The command is experimental and can be changed without notice.
.PP
\fBmounts remove\fP
Remove mounts while the container is running. It requires two arguments: the container ID and a JSON file containing the mounts section of the OCI config file. Only the destination attribute for each mount is used. The command is experimental and can be changed without notice.
.PP
\fBkill\fP
Send the specified signal to the container init process. If no signal
@ -76,15 +81,14 @@ Update container resource constraints.
.PP
\fBcheckpoint\fP
Checkpoint a running container using CRIU
Checkpoint a running container using CRIU.
.PP
\fBrestore\fP
Restore a container from a checkpoint
Restore a container from a checkpoint.
.SH STATE
.PP
By default, when running as root user, crun saves its state under the
\fB/run/crun\fP directory. As unprivileged user, instead the
\fIXDG_RUNTIME_DIR\fP environment variable is honored, and the directory
@ -93,7 +97,6 @@ overrides this setting.
.SH GLOBAL OPTIONS
.PP
\fB--debug\fP
Produce verbose output.
@ -110,8 +113,6 @@ It is specified in the form \fIBACKEND:SPECIFIER\fP\&.
.PP
These following backends are supported:
.RS
.IP \(bu 2
file:PATH
.IP \(bu 2
@ -119,8 +120,6 @@ journald:IDENTIFIER
.IP \(bu 2
syslog:IDENTIFIER
.RE
.PP
If no backend is specified, then \fIfile:\fP is used by default.
@ -129,9 +128,14 @@ If no backend is specified, then \fIfile:\fP is used by default.
Define the format of the log messages. It can either be \fBtext\fP, or
\fBjson\fP\&. The default is \fBtext\fP\&.
.PP
\fB--log-level\fP=\fILEVEL\fP
Define the log level. It can either be \fBdebug\fP, \fBwarning\fP or \fBerror\fP\&.
The default is \fBerror\fP\&.
.PP
\fB--no-pivot\fP
Use \fB\fCchroot(2)\fR instead of \fB\fCpivot_root(2)\fR when creating the container.
Use \fBchroot(2)\fR instead of \fBpivot_root(2)\fR when creating the container.
This option is not safe, and should be avoided.
.PP
@ -161,7 +165,6 @@ Print a short usage message.
Print program version
.SH CREATE OPTIONS
.PP
crun [global options] create [options] CONTAINER
.PP
@ -190,7 +193,6 @@ Additional number of FDs to pass into the container.
Path to the file that will contain the container process PID.
.SH RUN OPTIONS
.PP
crun [global options] run [options] CONTAINER
.PP
@ -223,7 +225,6 @@ Path to the file that will contain the container process PID.
Detach the container process from the current session.
.SH DELETE OPTIONS
.PP
crun [global options] delete [options] CONTAINER
.PP
@ -235,7 +236,6 @@ Delete the container even if it is still running.
Delete all the containers that satisfy the specified regex.
.SH EXEC OPTIONS
.PP
crun [global options] exec [options] CONTAINER CMD
.PP
@ -297,7 +297,6 @@ Allocate a pseudo TTY.
Specify the user in the form UID[:GID].
.SH LIST OPTIONS
.PP
crun [global options] list [options]
.PP
@ -305,7 +304,6 @@ crun [global options] list [options]
Show only the container ID.
.SH KILL OPTIONS
.PP
crun [global options] kill [options] CONTAINER SIGNAL
.PP
@ -317,16 +315,14 @@ Kill all the processes in the container.
Kill all the containers that satisfy the specified regex.
.SH PS OPTIONS
.PP
crun [global options] ps [options]
.PP
\fB--format\fP=\fIFORMAT\fP
Specify the output format. It must be either \fB\fCtable\fR or \fB\fCjson\fR\&.
By default \fB\fCtable\fR is used.
Specify the output format. It must be either \fBtable\fR or \fBjson\fR\&.
By default \fBtable\fR is used.
.SH SPEC OPTIONS
.PP
crun [global options] spec [options]
.PP
@ -338,7 +334,6 @@ Path to the root of the bundle dir (default ".").
Generate a config.json file that is usable by an unprivileged user.
.SH UPDATE OPTIONS
.PP
crun [global options] update [options] CONTAINER
.PP
@ -402,7 +397,6 @@ Maximum number of pids allowed in the container.
Path to the file containing the resources to update.
.SH CHECKPOINT OPTIONS
.PP
crun [global options] checkpoint [options] CONTAINER
.PP
@ -452,7 +446,6 @@ Specify which CRIU manage cgroup mode should be used. Permitted values are
\fBsoft\fP, \fBignore\fP, \fBfull\fP or \fBstrict\fP\&. Default is \fBsoft\fP\&.
.SH RESTORE OPTIONS
.PP
crun [global options] restore [options] CONTAINER
.PP
@ -492,10 +485,22 @@ Where to write the PID of the container
Specify which CRIU manage cgroup mode should be used. Permitted values are
\fBsoft\fP, \fBignore\fP, \fBfull\fP or \fBstrict\fP\&. Default is \fBsoft\fP\&.
.PP
\fB--lsm-profile\fP=\fITYPE\fP:\fINAME\fP
Specify an LSM profile to be used during restore.
\fITYPE\fP can be either \fBapparmor\fP or \fBselinux\fP\&.
.PP
\fB--lsm-mount-context\fP=\fIVALUE\fP
Specify a new LSM mount context to be used during restore.
This option replaces an existing mount context information
with the specified value. This is useful when restoring
a container into an existing Pod and selinux labels
need to be changed during restore.
.SH Extensions to OCI
.SH \fB\fCrun.oci.mount_context_type=context\fR
.PP
.SH \fBrun.oci.mount_context_type=context\fR
Set the mount context type on volumes mounted with SELinux labels.
.PP
@ -506,49 +511,43 @@ Valid context types are:
rootcontext
.PP
More information on how the context mount flags works see the \fB\fCmount(8)\fR man page.
More information on how the context mount flags works see the \fBmount(8)\fR man page.
.SH \fB\fCrun.oci.seccomp.receiver=PATH\fR
.PP
If the annotation \fB\fCrun.oci.seccomp.receiver=PATH\fR is specified, the
.SH \fBrun.oci.seccomp.receiver=PATH\fR
If the annotation \fBrun.oci.seccomp.receiver=PATH\fR is specified, the
seccomp listener is sent to the UNIX socket listening on the specified
path. It can also set with the \fB\fCRUN_OCI_SECCOMP_RECEIVER\fR environment variable.
path. It can also set with the \fBRUN_OCI_SECCOMP_RECEIVER\fR environment variable.
It is an experimental feature, and the annotation will be removed once
it is supported in the OCI runtime specs. It must be an absolute path.
.SH \fB\fCrun.oci.seccomp.plugins=PATH\fR
.PP
If the annotation \fB\fCrun.oci.seccomp.plugins=PLUGIN1[:PLUGIN2]...\fR is specified, the
.SH \fBrun.oci.seccomp.plugins=PATH\fR
If the annotation \fBrun.oci.seccomp.plugins=PLUGIN1[:PLUGIN2]...\fR is specified, the
seccomp listener fd is handled through the specified plugins. The
plugin must either be an absolute path or a file name that is looked
up by \fB\fCdlopen(3)\fR\&. More information on how the lookup is performed
are available on the \fB\fCld.so(8)\fR man page.
up by \fBdlopen(3)\fR\&. More information on how the lookup is performed
are available on the \fBld.so(8)\fR man page.
.SH \fB\fCrun.oci.seccomp_fail_unknown_syscall=1\fR
.PP
If the annotation \fB\fCrun.oci.seccomp_fail_unknown_syscall\fR is present, then crun
.SH \fBrun.oci.seccomp_fail_unknown_syscall=1\fR
If the annotation \fBrun.oci.seccomp_fail_unknown_syscall\fR is present, then crun
will fail when an unknown syscall is encountered in the seccomp configuration.
.SH \fB\fCrun.oci.seccomp_bpf_data=PATH\fR
.PP
If the annotation \fB\fCrun.oci.seccomp_bpf_data\fR is present, then crun
.SH \fBrun.oci.seccomp_bpf_data=PATH\fR
If the annotation \fBrun.oci.seccomp_bpf_data\fR is present, then crun
ignores the seccomp section in the OCI configuration file and use the specified data
as the raw data to the \fB\fCseccomp(SECCOMP_SET_MODE_FILTER)\fR syscall.
as the raw data to the \fBseccomp(SECCOMP_SET_MODE_FILTER)\fR syscall.
The data must be encoded in base64.
.PP
It is an experimental feature, and the annotation will be removed once
it is supported in the OCI runtime specs.
.SH \fB\fCrun.oci.keep_original_groups=1\fR
.PP
If the annotation \fB\fCrun.oci.keep_original_groups\fR is present, then crun
will skip the \fB\fCsetgroups\fR syscall that is used to either set the
.SH \fBrun.oci.keep_original_groups=1\fR
If the annotation \fBrun.oci.keep_original_groups\fR is present, then crun
will skip the \fBsetgroups\fR syscall that is used to either set the
additional groups specified in the OCI configuration, or to reset the
list of additional groups if none is specified.
.SH \fB\fCrun.oci.pidfd_receiver=PATH\fR
.PP
.SH \fBrun.oci.pidfd_receiver=PATH\fR
It is an experimental feature and will be removed once the feature is in the
OCI runtime specs.
@ -556,11 +555,10 @@ OCI runtime specs.
If present, specify the path to the UNIX socket that will receive the
pidfd for the container process.
.SH \fB\fCrun.oci.systemd.force_cgroup_v1=/PATH\fR
.PP
If the annotation \fB\fCrun.oci.systemd.force_cgroup_v1=/PATH\fR is present, then crun
will override the specified mount point \fB\fC/PATH\fR with a cgroup v1 mount
made of a single hierarchy \fB\fCnone,name=systemd\fR\&.
.SH \fBrun.oci.systemd.force_cgroup_v1=/PATH\fR
If the annotation \fBrun.oci.systemd.force_cgroup_v1=/PATH\fR is present, then crun
will override the specified mount point \fB/PATH\fR with a cgroup v1 mount
made of a single hierarchy \fBnone,name=systemd\fR\&.
It is useful to run on a cgroup v2 system containers using older
versions of systemd that lack support for cgroup v2.
@ -572,137 +570,113 @@ has to have permissions to this mountpoint.
.PP
For example, as root:
.PP
.RS
.nf
.EX
mkdir /sys/fs/cgroup/systemd
mount cgroup -t cgroup /sys/fs/cgroup/systemd -o none,name=systemd,xattr
chown -R the_user.the_user /sys/fs/cgroup/systemd
.EE
.fi
.RE
.SH \fB\fCrun.oci.systemd.subgroup=SUBGROUP\fR
.PP
.SH \fBrun.oci.systemd.subgroup=SUBGROUP\fR
Override the name for the systemd sub cgroup created under the systemd
scope, so the final cgroup will be like:
.PP
.RS
.nf
.EX
/sys/fs/cgroup/$PATH/$SUBGROUP
.fi
.RE
.EE
.PP
When it is set to the empty string, a sub cgroup is not created.
.PP
If not specified, it defaults to \fB\fCcontainer\fR on cgroup v2, and to \fB\fC""\fR
If not specified, it defaults to \fBcontainer\fR on cgroup v2, and to \fB""\fR
on cgroup v1.
.PP
e.g.
.PP
.RS
.nf
.EX
/sys/fs/cgroup//system.slice/foo-352700.scope/container
.EE
.fi
.RE
.SH \fB\fCrun.oci.delegate-cgroup=DELEGATED-CGROUP\fR
.PP
If the \fB\fCrun.oci.systemd.subgroup\fR annotation is specified, yet another
.SH \fBrun.oci.delegate-cgroup=DELEGATED-CGROUP\fR
If the \fBrun.oci.systemd.subgroup\fR annotation is specified, yet another
sub-cgroup is created and the container process is moved here.
.PP
If a cgroup namespace is used, the cgroup namespace is created before
moving the container to the delegated cgroup.
.PP
.RS
.nf
.EX
/sys/fs/cgroup/$PATH/$SUBGROUP/$DELEGATED-CGROUP
.fi
.RE
.EE
.PP
The runtime doesn't apply any limit to the \fB\fC$DELEGATED-CGROUP\fR
sub-cgroup, the runtime uses only \fB\fC$PATH/$SUBGROUP\fR\&.
The runtime doesn't apply any limit to the \fB$DELEGATED-CGROUP\fR
sub-cgroup, the runtime uses only \fB$PATH/$SUBGROUP\fR\&.
.PP
The container payload fully manages \fB\fC$DELEGATE-CGROUP\fR, the limits
applied to \fB\fC$PATH/$SUBGROUP\fR still applies to \fB\fC$DELEGATE-CGROUP\fR\&.
The container payload fully manages \fB$DELEGATE-CGROUP\fR, the limits
applied to \fB$PATH/$SUBGROUP\fR still applies to \fB$DELEGATE-CGROUP\fR\&.
.PP
Since cgroup delegation is not safe on cgroup v1, this option is
supported only on cgroup v2.
.SH \fB\fCrun.oci.hooks.stdout=FILE\fR
.PP
If the annotation \fB\fCrun.oci.hooks.stdout\fR is present, then crun
.SH \fBrun.oci.hooks.stdout=FILE\fR
If the annotation \fBrun.oci.hooks.stdout\fR is present, then crun
will open the specified file and use it as the stdout for the hook
processes. The file is opened in append mode and it is created if it
doesn't already exist.
.SH \fB\fCrun.oci.hooks.stderr=FILE\fR
.PP
If the annotation \fB\fCrun.oci.hooks.stderr\fR is present, then crun
.SH \fBrun.oci.hooks.stderr=FILE\fR
If the annotation \fBrun.oci.hooks.stderr\fR is present, then crun
will open the specified file and use it as the stderr for the hook
processes. The file is opened in append mode and it is created if it
doesn't already exist.
.SH \fB\fCrun.oci.handler=HANDLER\fR
.PP
.SH \fBrun.oci.handler=HANDLER\fR
It is an experimental feature.
.PP
If specified, run the specified handler for execing the container.
The only supported values are \fB\fCkrun\fR and \fB\fCwasm\fR\&.
.RS
The only supported values are \fBkrun\fR and \fBwasm\fR\&.
.IP \(bu 2
\fB\fCkrun\fR: When \fB\fCkrun\fR is specified, the \fB\fClibkrun.so\fR shared object is loaded
\fBkrun\fR: When \fBkrun\fR is specified, the \fBlibkrun.so\fR shared object is loaded
and it is used to launch the container using libkrun.
.IP \(bu 2
\fB\fCwasm\fR: If specified, run the wasm handler for container. Allows running wasm
workload natively. Accepts a \fB\fC\&.wasm\fR binary as input and if \fB\fC\&.wat\fR is
\fBwasm\fR: If specified, run the wasm handler for container. Allows running wasm
workload natively. Accepts a \fB\&.wasm\fR binary as input and if \fB\&.wat\fR is
provided it will be automatically compiled into a wasm module. Stdout of
wasm module is relayed back via crun.
.RE
.SH tmpcopyup mount options
.PP
If the \fB\fCtmpcopyup\fR option is specified for a tmpfs, then the path that
If the \fBtmpcopyup\fR option is specified for a tmpfs, then the path that
is shadowed by the tmpfs mount is recursively copied up to the tmpfs
itself.
.SH copy-symlink mount options
.PP
If the \fB\fCcopy-symlink\fR option is specified, if the source of a bind
If the \fBcopy-symlink\fR option is specified, if the source of a bind
mount is a symlink, the symlink is recreated at the specified
destination instead of attempting a mount that would resolve the
symlink itself. If the destination already exists and it is not a
symlink with the expected content, crun will return an error.
.SH dest-nofollow
When this option is specified for a bind mount, and the destination of
the bind mount is a symbolic link, \fBcrun\fR will mount the symbolic link
itself at the target destination.
.SH src-nofollow
When this option is specified for a bind mount, and the source of the
bind mount is a symbolic link, \fBcrun\fR will use the symlink itself
rather than the file or directory the symbolic link points to.
.SH r$FLAG mount options
.PP
If a \fB\fCr$FLAG\fR mount option is specified then the flag \fB\fC$FLAG\fR is set
If a \fBr$FLAG\fR mount option is specified then the flag \fB$FLAG\fR is set
recursively for each children mount.
.PP
These flags are supported:
.RS
.IP \(bu 2
"rro"
.IP \(bu 2
@ -746,21 +720,18 @@ These flags are supported:
.IP \(bu 2
"rnostrictatime"
.RE
.SH idmap mount options
.PP
If the \fB\fCidmap\fR option is specified then the mount is ID mapped using
If the \fBidmap\fR option is specified then the mount is ID mapped using
the container target user namespace. This is an experimental feature
and can change at any time without notice.
.PP
The \fB\fCidmap\fR option supports a custom mapping that can be different
The \fBidmap\fR option supports a custom mapping that can be different
than the user namespace used by the container.
.PP
The mapping can be specified after the \fB\fCidmap\fR option like:
\fB\fCidmap=uids=0-1-10#10-11-10;gids=0-100-10\fR\&.
The mapping can be specified after the \fBidmap\fR option like:
\fBidmap=uids=0-1-10#10-11-10;gids=0-100-10\fR\&.
.PP
For each triplet, the first value is the start of the backing
@ -768,16 +739,16 @@ file system IDs that are mapped to the second value on the host. The
length of this mapping is given in the third value.
.PP
Multiple ranges are separated with \fB\fC#\fR\&.
Multiple ranges are separated with \fB#\fR\&.
.PP
These values are written to the \fB\fC/proc/$PID/uid_map\fR and
\fB\fC/proc/$PID/gid_map\fR files to create the user namespace for the
These values are written to the \fB/proc/$PID/uid_map\fR and
\fB/proc/$PID/gid_map\fR files to create the user namespace for the
idmapped mount.
.PP
The only two options that are currently supported after \fB\fCidmap\fR are
\fB\fCuids\fR and \fB\fCgids\fR\&.
The only two options that are currently supported after \fBidmap\fR are
\fBuids\fR and \fBgids\fR\&.
.PP
When a custom mapping is specified, a new user namespace is created
@ -793,12 +764,9 @@ the mapping is changed to account for the relative position of the
container user in the container user namespace.
.PP
For example, the mapping: \fB\fCuids=@1-3-10\fR, given a configuration like
For example, the mapping: \fBuids=@1-3-10\fR, given a configuration like
.PP
.RS
.nf
.EX
"uidMappings": [
{
"containerID": 0,
@ -811,15 +779,13 @@ For example, the mapping: \fB\fCuids=@1-3-10\fR, given a configuration like
"size": 1000
}
]
.fi
.RE
.EE
.PP
will be converted to the absolute value \fB\fCuids=1-4-10\fR, where 4 is
calculated by adding 3 (container ID in the \fB\fCuids=\fR mapping) and 1
(\fB\fChostID - containerID\fR for the user namespace mapping where
\fB\fCcontainerID = 1\fR is found).
will be converted to the absolute value \fBuids=1-4-10\fR, where 4 is
calculated by adding 3 (container ID in the \fBuids=\fR mapping) and 1
(\fBhostID - containerID\fR for the user namespace mapping where
\fBcontainerID = 1\fR is found).
.PP
The current implementation doesn't take into account multiple
@ -828,16 +794,18 @@ mapping if it overlaps multiple ranges in the user namespace. In such
a case, there won't be any error reported.
.SH Automatically create user namespace
.PP
When running as user different than root, an user namespace is
automatically created even if it is not specified in the config file.
The current user is mapped to the ID 0 in the container, and any
additional id specified in the files \fB\fC/etc/subuid\fR and \fB\fC/etc/subgid\fR
additional id specified in the files \fB/etc/subuid\fR and \fB/etc/subgid\fR
is automatically added starting with ID 1.
.SH CGROUP v1
Support for cgroup v1 is deprecated and will be removed in a future release.
.SH CGROUP v2
.PP
\fBNote\fP: cgroup v2 does not yet support control of realtime processes and
the cpu controller can only be enabled when all RT processes are in the root
cgroup. This will make crun fail while running alongside RT processes.
@ -855,7 +823,7 @@ they are converted when needed from the cgroup v1 configuration.
allbox;
l l l l
l l l l .
\fB\fCOCI (x)\fR \fB\fCcgroup 2 value (y)\fR \fB\fCconversion\fR \fB\fCcomment\fR
\fBOCI (x)\fP \fBcgroup 2 value (y)\fP \fBconversion\fP \fBcomment\fP
limit memory.max y = x
swap memory.swap.max y = x - memory_limit T{
the swap limit on cgroup v1 includes the memory usage too
@ -868,7 +836,7 @@ reservation memory.low y = x
allbox;
l l l l
l l l l .
\fB\fCOCI (x)\fR \fB\fCcgroup 2 value (y)\fR \fB\fCconversion\fR \fB\fCcomment\fR
\fBOCI (x)\fP \fBcgroup 2 value (y)\fP \fBconversion\fP \fBcomment\fP
limit pids.max y = x
.TE
@ -877,9 +845,9 @@ limit pids.max y = x
allbox;
l l l l
l l l l .
\fB\fCOCI (x)\fR \fB\fCcgroup 2 value (y)\fR \fB\fCconversion\fR \fB\fCcomment\fR
\fBOCI (x)\fP \fBcgroup 2 value (y)\fP \fBconversion\fP \fBcomment\fP
shares cpu.weight T{
y = (1 + ((x - 2) * 9999) / 262142)
y=10^((log2(x)^2 + 125 * log2(x)) / 612.0 - 7.0 / 34.0)
T}
T{
convert from [2-262144] to [1-10000]
@ -897,7 +865,7 @@ T}
allbox;
l l l l
l l l l .
\fB\fCOCI (x)\fR \fB\fCcgroup 2 value (y)\fR \fB\fCconversion\fR \fB\fCcomment\fR
\fBOCI (x)\fP \fBcgroup 2 value (y)\fP \fBconversion\fP \fBcomment\fP
weight io.bfq.weight y = x
weight_device io.bfq.weight y = x
weight io.weight (fallback) y = 1 + (x-10)*9999/990 T{
@ -917,7 +885,7 @@ wiops io.max y=x
allbox;
l l l l
l l l l .
\fB\fCOCI (x)\fR \fB\fCcgroup 2 value (y)\fR \fB\fCconversion\fR \fB\fCcomment\fR
\fBOCI (x)\fP \fBcgroup 2 value (y)\fP \fBconversion\fP \fBcomment\fP
cpus cpuset.cpus y = x
mems cpuset.mems y = x
.TE
@ -927,6 +895,6 @@ mems cpuset.mems y = x
allbox;
l l l l
l l l l .
\fB\fCOCI (x)\fR \fB\fCcgroup 2 value (y)\fR \fB\fCconversion\fR \fB\fCcomment\fR
\fBOCI (x)\fP \fBcgroup 2 value (y)\fP \fBconversion\fP \fBcomment\fP
\&.limit_in_bytes hugetlb.\&.max y = x
.TE


@ -30,6 +30,12 @@ Exec a command in a running container.
**list**
List known containers.
**mounts add**
Add mounts while the container is running. It requires two arguments: the container ID and a JSON file containing the mounts section of the OCI config file. Each mount listed there is added to the running container. The command is experimental and can be changed without notice.
**mounts remove**
Remove mounts while the container is running. It requires two arguments: the container ID and a JSON file containing the mounts section of the OCI config file. Only the destination attribute for each mount is used. The command is experimental and can be changed without notice.
**kill**
Send the specified signal to the container init process. If no signal
is specified, SIGTERM is used.
@ -60,10 +66,10 @@ Resume the processes in the container.
Update container resource constraints.
**checkpoint**
Checkpoint a running container using CRIU
Checkpoint a running container using CRIU.
**restore**
Restore a container from a checkpoint
Restore a container from a checkpoint.
# STATE
By default, when running as root user, crun saves its state under the
@ -98,6 +104,10 @@ If no backend is specified, then *file:* is used by default.
Define the format of the log messages. It can either be **text**, or
**json**. The default is **text**.
**--log-level**=_LEVEL_
Define the log level. It can either be **debug**, **warning** or **error**.
The default is **error**.
**--no-pivot**
Use `chroot(2)` instead of `pivot_root(2)` when creating the container.
This option is not safe, and should be avoided.
@ -386,6 +396,17 @@ Where to write the PID of the container
Specify which CRIU manage cgroup mode should be used. Permitted values are
**soft**, **ignore**, **full** or **strict**. Default is **soft**.
**--lsm-profile**=_TYPE_:_NAME_
Specify an LSM profile to be used during restore.
_TYPE_ can be either **apparmor** or **selinux**.
**--lsm-mount-context**=_VALUE_
Specify a new LSM mount context to be used during restore.
This option replaces an existing mount context information
with the specified value. This is useful when restoring
a container into an existing Pod and selinux labels
need to be changed during restore.
# Extensions to OCI
## `run.oci.mount_context_type=context`
@ -550,6 +571,16 @@ destination instead of attempting a mount that would resolve the
symlink itself. If the destination already exists and it is not a
symlink with the expected content, crun will return an error.
## dest-nofollow
When this option is specified for a bind mount, and the destination of
the bind mount is a symbolic link, `crun` will mount the symbolic link
itself at the target destination.
## src-nofollow
When this option is specified for a bind mount, and the source of the
bind mount is a symbolic link, `crun` will use the symlink itself
rather than the file or directory the symbolic link points to.
## r$FLAG mount options
If a `r$FLAG` mount option is specified then the flag `$FLAG` is set
@ -649,6 +680,10 @@ The current user is mapped to the ID 0 in the container, and any
additional id specified in the files `/etc/subuid` and `/etc/subgid`
is automatically added starting with ID 1.
# CGROUP v1
Support for cgroup v1 is deprecated and will be removed in a future release.
# CGROUP v2
**Note**: cgroup v2 does not yet support control of realtime processes and
@ -677,11 +712,11 @@ they are converted when needed from the cgroup v1 configuration.
## CPU controller
| OCI (x) | cgroup 2 value (y) | conversion | comment |
|---|---|---|---|
| shares | cpu.weight | y = (1 + ((x - 2) * 9999) / 262142) | convert from [2-262144] to [1-10000]|
| period | cpu.max | y = x| period and quota are written together|
| quota | cpu.max | y = x| period and quota are written together|
| OCI (x) | cgroup 2 value (y) | conversion | comment |
|---------|--------------------|---------------------------------------------------------|---------------------------------------|
| shares | cpu.weight | y=10^((log2(x)^2 + 125 * log2(x)) / 612.0 - 7.0 / 34.0) | convert from [2-262144] to [1-10000] |
| period | cpu.max | y = x | period and quota are written together |
| quota | cpu.max | y = x | period and quota are written together |
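
The `shares` to `cpu.weight` formula in the table above can be checked numerically. The sketch below is illustrative only and is not crun's implementation: the function name `shares_to_weight`, the clamping of out-of-range inputs, and rounding to the nearest integer are assumptions made for this example. It shows that the curve maps the endpoints of [2-262144] onto [1-10000] and maps the cgroup v1 default of 1024 shares to the cgroup v2 default weight of 100.

```python
import math


def shares_to_weight(shares: int) -> int:
    """Map cgroup v1 cpu.shares (2..262144) to cgroup v2 cpu.weight (1..10000).

    Hypothetical helper: clamping and round-to-nearest are assumptions of
    this sketch, not behavior stated in the document.
    """
    # Keep the input inside the documented range before taking the logarithm.
    shares = min(max(shares, 2), 262144)
    l = math.log2(shares)
    # y = 10^((log2(x)^2 + 125 * log2(x)) / 612.0 - 7.0 / 34.0)
    exponent = (l * l + 125 * l) / 612.0 - 7.0 / 34.0
    weight = round(10 ** exponent)
    # Guard against floating-point drift at the range boundaries.
    return min(max(weight, 1), 10000)


if __name__ == "__main__":
    for shares in (2, 1024, 262144):
        print(shares, shares_to_weight(shares))
```

For comparison, the previous linear formula `1 + ((x - 2) * 9999) / 262142` maps 1024 shares to roughly 39, so default values did not line up across the two cgroup versions.
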
## blkio controller


@ -27,7 +27,7 @@ podman run -it -p 8080:8080 --name=wasm-example --platform=wasi/wasm32 micha
println!("{}", "This is from a main function from a wasm module");
}
```
* Compile to `wasm32-wasi` target using `wasm-pack` or any other relevant tool. We are going to be using `cargo build --target wasm32-wasi`
* Compile to `wasm32-wasip1` target using `wasm-pack` or any other relevant tool. We are going to be using `cargo build --target wasm32-wasip2`
* Create relevant image and use your container manager. But for this example we will be running directly using crun and plub config manually.
```console
$ crun run wasm-container


@ -1,8 +1,8 @@
# Running wasi workload natively on kubernetes using crun
Crun natively supports running wasm/wasi workload on using `wasmedge`, `wasmer` and `wasmtime`.
Each one of them (`wasmedge`, `wasmer` and `wasmtime`) comes with their own set of unique features.
For instance `wasmer` can compile your `.wat` on the fly. Similarly `wasmedge` has its own perks.
Crun natively supports running wasm/wasi workload on using `wasmedge`, `wasmer`, `wasmtime` and `wamr`.
Each one of them (`wasmedge`, `wasmer`, `wasmtime` and `wamr`) comes with their own set of unique features.
For instance `wasmer` can compile your `.wat` on the fly. Similarly `wasmedge` has its own perks. `wamr` has a layered JIT architecture which can tier up during runtime.
Crun can support only one of them at a time. Please build crun with whatever runtime suits you the best.
#### How does crun detects if is a wasm workload ?
@ -31,6 +31,7 @@ So spec generated by CRI implementation must contain annotation something like.
* Following features works completely out if the box once `cri-o` is using `crun` built with `wasm` support.
* Configure `cri-o` to use `crun` instead of `runc` by editing config at `/etc/crio/crio.conf` read more about it here https://docs.openshift.com/container-platform/3.11/crio/crio_runtime.html#configure-crio-use-crio-engine
* As of `cri-o` version `1.31` it defaults to `crun`, but the bundled `crun` may not have been built with `wasm` support.
* Restart `cri-o` by `sudo systemctl restart crio`
* `cri-o` automatically propagates pod annotations to container spec. So we don't need to do anything.

7
krun.1
View File

@ -2,18 +2,15 @@
.TH crun 1 "User Commands"
.SH NAME
.PP
krun - crun based OCI runtime using libkrun to run containerized programs in
krun \- crun based OCI runtime using libkrun to run containerized programs in
isolated KVM environments
.SH SYNOPSIS
.PP
krun [global options] command [command options] [arguments...]
.SH DESCRIPTION
.PP
krun is a sub package of the crun command line program for running Linux
containers that follow the Open Container Initiative (OCI) format. The krun
command is a symbolic link to the crun executable, that tells crun to run in
@ -34,10 +31,8 @@ containers outside of the krun VM is more difficult.
.SH COMMANDS
.PP
See crun.1 man page for the commands available to krun
.SH SEE ALSO
.PP
crun.1

@ -1 +1 @@
Subproject commit 7b27d0a0bb87fdd7ee46365994e450a58405004f
Subproject commit 68397329bc51a66c56938fc4111fac751d6fd3b0


@ -49,16 +49,8 @@ static const char *LUA_CRUN_TAG_CONTS_ITER = "crun-containers-iterator";
return luacrun_error (S, crun_err) + addret; \
}
#if __STDC_VERSION__ < 201112L
# define LUACRUN_NoRet
#elif __STDC_VERSION__ < 202300L
# define LUACRUN_NoRet _Noreturn
#else
# define LUACRUN_NoRet [[noreturn]]
#endif
extern LUACRUN_NoRet int lua_error (lua_State *L);
extern LUACRUN_NoRet int luaL_error (lua_State *L, const char *fmt, ...);
extern int lua_error (lua_State *L);
extern int luaL_error (lua_State *L, const char *fmt, ...);
/* Build the error string, push onto stack. */
LUA_API int
@ -82,7 +74,7 @@ luacrun_error (lua_State *S, libcrun_error_t *err)
return 1;
}
LUA_API LUACRUN_NoRet void
LUA_API void
luacrun_set_error (lua_State *S, libcrun_error_t *err)
{
luacrun_error (S, err);
@ -512,9 +504,10 @@ luacrun_ctx_status_container (lua_State *S)
cleanup_container libcrun_container_t *container = NULL;
cleanup_free char *dir = NULL;
dir = libcrun_get_state_directory (state_root, id);
if (dir == NULL)
ret = libcrun_get_state_directory (&dir, state_root, id, &crun_err);
if (UNLIKELY (ret < 0))
{
libcrun_error_release (&crun_err);
lua_pushnil (S);
lua_pushstring (S, "cannot get state directory");
return 2;
@ -526,6 +519,7 @@ luacrun_ctx_status_container (lua_State *S)
lua_pop (S, 1);
if (container == NULL)
{
libcrun_error_release (&crun_err);
lua_pushnil (S);
lua_pushstring (S, "error loading config.json");
return 2;
@ -836,6 +830,8 @@ luaopen_luacrun (lua_State *S)
lua_setfield (S, libtab_idx, "VERBOSITY_ERROR");
lua_pushinteger (S, LIBCRUN_VERBOSITY_WARNING);
lua_setfield (S, libtab_idx, "VERBOSITY_WARNING");
lua_pushinteger (S, LIBCRUN_VERBOSITY_DEBUG);
lua_setfield (S, libtab_idx, "VERBOSITY_DEBUG");
luacrun_setup_ctx_metatable (S);
luacrun_setup_cont_metatable (S);

@ -12,8 +12,9 @@ with pkgs; stdenv.mkDerivation {
outputs = [ "out" ];
nativeBuildInputs = with buildPackages; [
autoreconfHook
autoPatchelfHook
bash
gitMinimal
git
pkg-config
python3
which
@ -32,7 +33,7 @@ with pkgs; stdenv.mkDerivation {
] ++ lib.optionals enableCriu [ criu ];
configureFlags = [ "--enable-static" ] ++ lib.optional (!enableSystemd) [ "--disable-systemd" ];
prePatch = ''
export CFLAGS='-static -pthread'
export CFLAGS='-static -pthread -DSTATIC'
export LDFLAGS='-s -w -static-libgcc -static'
export EXTRA_LDFLAGS='-s -w -linkmode external -extldflags "-static -lm"'
export CRUN_LDFLAGS='-all-static'

@ -1,10 +1,10 @@
{
"url": "https://github.com/nixos/nixpkgs",
"rev": "69b095e77f0a6b6ff498093410fc85d18919d3ad",
"date": "2024-06-24T16:37:24+03:00",
"path": "/nix/store/2bxcxg9rb6xhfnmzkx8lhsy1c6hrvw1m-nixpkgs",
"sha256": "1km3laqpzhs669ml28rskqxy9v8924nvc5h1f1bw42si8j1a5kh4",
"hash": "sha256-BM6igkRRC8JXcAEWti0RCe3kO546I0FrMkbDf7Gio84=",
"rev": "66b41287a6d4088e07219991720702b1cc9a146b",
"date": "2025-05-07T16:10:03+02:00",
"path": "/nix/store/kl1l84aab70y4hzy1rn9sk7726q7yli3-nixpkgs",
"sha256": "05i46f27847ym9h53nf6n441lqg6nzyr34wm9f7apspzllay43g1",
"hash": "sha256-4Q3iFaX/6quOS5WTkf235mEaCLHG2VFgqv4QdIQzJBY=",
"fetchLFS": false,
"fetchSubmodules": false,
"deepClone": false,

@ -3,7 +3,27 @@ let
in
self: super:
{
criu = (static super.criu);
protobufc = super.protobufc.overrideAttrs (x: {
configureFlags = (x.configureFlags or [ ]) ++ [ "--enable-static" ];
});
libnl = super.libnl.overrideAttrs (x: {
configureFlags = (x.configureFlags or [ ]) ++ [ "--enable-static" ];
});
libnet = super.libnet.overrideAttrs (x: {
configureFlags = (x.configureFlags or [ ]) ++ [ "--enable-static" ];
});
criu = (static super.criu).overrideAttrs (x: {
buildInputs = (x.buildInputs or []) ++ [
super.protobuf
super.protobufc
super.libnl
super.libnet
];
NIX_LDFLAGS = "${x.NIX_LDFLAGS or ""} -lprotobuf-c";
buildPhase = ''
make lib
'';
});
gpgme = (static super.gpgme);
libassuan = (static super.libassuan);
libgpgerror = (static super.libgpgerror);

@ -1,20 +0,0 @@
discover:
how: fmf
execute:
how: tmt
/upstream:
summary: Run crun specific Podman system tests on upstream PRs
discover+:
filter: tag:upstream
adjust+:
enabled: false
when: initiator is not defined or initiator != packit
/downstream:
summary: Run crun specific Podman system tests on bodhi / errata and dist-git PRs
discover+:
filter: tag:downstream
adjust+:
enabled: false
when: initiator == packit

plans/main.fmf Normal file

@ -0,0 +1,40 @@
discover:
how: fmf
execute:
how: tmt
prepare:
- when: distro == centos-stream or distro == rhel
how: shell
script: |
dnf -y install https://dl.fedoraproject.org/pub/epel/epel-release-latest-$(rpm --eval '%{?rhel}').noarch.rpm
dnf -y config-manager --set-enabled epel
order: 10
- when: initiator == packit
how: shell
script: |
COPR_REPO_FILE="/etc/yum.repos.d/*podman-next*.repo"
if compgen -G $COPR_REPO_FILE > /dev/null; then
sed -i -n '/^priority=/!p;$apriority=1' $COPR_REPO_FILE
fi
dnf -y upgrade --allowerasing
order: 20
- how: install
package:
- bats
- crun
- podman-tests
/shellcheck:
discover+:
filter: 'tag:shellcheck'
enabled: true
adjust:
enabled: false
when: distro == centos-stream-10 or distro == rhel-10
prepare+:
- how: install
package: ShellCheck
/tests:
discover+:
filter: 'tag:podman | tag:sanity'

@ -55,6 +55,8 @@ set_error (libcrun_error_t *err)
ret = asprintf (&msg, "%s: %s", (*err)->msg, strerror ((*err)->status));
if (LIKELY (ret >= 0))
PyErr_SetString (PyExc_RuntimeError, msg);
else
msg = NULL;
}
libcrun_error_release (err);
@ -405,6 +407,8 @@ container_update (PyObject *self arg_unused, PyObject *args)
ret = asprintf (&msg, "cannot parse process: %s", parser_err);
if (LIKELY (ret >= 0))
PyErr_SetString (PyExc_RuntimeError, msg);
else
msg = NULL;
free (parser_err);
return NULL;
}
@ -450,15 +454,7 @@ container_spec (PyObject *self arg_unused, PyObject *args arg_unused)
static PyObject *
get_verbosity (PyObject *self arg_unused, PyObject *args)
{
libcrun_error_t err;
PyObject *ctx_obj = NULL;
libcrun_context_t *ctx;
int verbosity;
if (!PyArg_ParseTuple (args, "i", &verbosity))
return NULL;
return PyLong_FromLong (libcrun_get_verbosity (verbosity));
return PyLong_FromLong (libcrun_get_verbosity());
}
static PyObject *
@ -495,7 +491,7 @@ static PyMethodDef CrunMethods[] = {
{"make_context", (PyCFunction) make_context, METH_VARARGS | METH_KEYWORDS,
"Create a context object."},
{"set_verbosity", set_verbosity, METH_VARARGS, "Set the logging verbosity."},
{"get_verbosity", get_verbosity, METH_VARARGS, "Get the logging verbosity."},
{"get_verbosity", get_verbosity, METH_NOARGS, "Get the logging verbosity."},
{"spec", container_spec, METH_VARARGS,
"Generate a new configuration file."},
{NULL, NULL, 0, NULL}
@ -519,5 +515,6 @@ PyInit_python_crun (void)
return ret;
(void) PyModule_AddIntConstant (ret, "VERBOSITY_ERROR", LIBCRUN_VERBOSITY_ERROR);
(void) PyModule_AddIntConstant (ret, "VERBOSITY_WARNING", LIBCRUN_VERBOSITY_WARNING);
(void) PyModule_AddIntConstant (ret, "VERBOSITY_DEBUG", LIBCRUN_VERBOSITY_DEBUG);
return ret;
}

@ -1,27 +1,31 @@
%global krun_opts %{nil}
%global wasmedge_opts %{nil}
# krun and wasm[edge,time] support only on aarch64 and x86_64
%ifarch aarch64 || x86_64
%global wasm_support 1
%global yajl_opts %{nil}
%if %{defined copr_username}
%define copr_build 1
%endif
# Disable wasmedge on rhel 10 until EPEL10 is in place, otherwise it causes
# build issues on copr
%if %{defined fedora} || (%{defined %copr_build} && %{defined rhel} && 0%{?rhel} < 10)
# krun and wasm support only on aarch64 and x86_64
%ifarch aarch64 || x86_64
%if %{defined fedora}
# krun only exists on fedora
%global krun_support 1
%global krun_opts --with-libkrun
# Keep wasmedge enabled only on Fedora. It breaks a lot on EPEL.
%global wasm_support 1
%global wasmedge_support 1
%global wasmedge_opts --with-wasmedge
%endif
# krun only exists on fedora
%if %{defined fedora}
%global krun_support 1
%global krun_opts --with-libkrun
%endif
%if %{defined fedora} || (%{defined rhel} && 0%{?rhel} < 10)
%global system_yajl 1
%else
%global yajl_opts --enable-embedded-yajl
%endif
Summary: OCI runtime written in C
@ -55,7 +59,9 @@ BuildRequires: libcap-devel
BuildRequires: libkrun-devel
%endif
BuildRequires: systemd-devel
%if %{defined system_yajl}
BuildRequires: yajl-devel
%endif
BuildRequires: libseccomp-devel
BuildRequires: python3-libmount
BuildRequires: libtool
@ -87,12 +93,10 @@ krun is a symlink to the %{name} binary, with libkrun as an additional dependenc
%package wasm
Summary: %{name} with wasm support
Requires: %{name} = %{?epoch:%{epoch}:}%{version}-%{release}
# The hard dep on wasm-library is causing trouble in internal testing farm
# with RHEL.
# wasm packages are not present on RHEL yet and are currently a PITA to test
# Best to only include wasmedge as weak dep on rhel
%if %{defined fedora}
Requires: wasm-library
%else
Recommends: wasm-library
%endif
Recommends: wasmedge
@ -105,16 +109,15 @@ Recommends: wasmedge
%build
./autogen.sh
./configure --disable-silent-rules %{krun_opts} %{wasmedge_opts}
./configure --disable-silent-rules %{krun_opts} %{wasmedge_opts} %{yajl_opts}
%make_build
%install
%make_install prefix=%{_prefix}
rm -rf %{buildroot}%{_prefix}/lib*
%if %{defined wasm_support}
ln -s %{name} %{buildroot}%{_bindir}/%{name}-wasm
%endif
# Placeholder check to silence rpmlint
%check
%files
%license COPYING
@ -135,12 +138,4 @@ ln -s %{name} %{buildroot}%{_bindir}/%{name}-wasm
%endif
%changelog
%if %{defined autochangelog}
%autochangelog
%else
# NOTE: This changelog will be visible on CentOS 8 Stream builds
# Other envs are capable of handling autochangelog
* Tue Jun 13 2023 RH Container Bot <rhcontainerbot@fedoraproject.org>
- Placeholder changelog for envs that are not autochangelog-ready.
- Contact upstream if you need to report an issue with the build.
%endif

@ -1,7 +1,9 @@
--- !Policy
product_versions:
- fedora-*
decision_context: bodhi_update_push_stable
decision_contexts:
- bodhi_update_push_stable
- bodhi_update_push_testing
rules:
- !PassingTestCaseRule {test_case_name: fedora-ci.koji-build.tier0.functional}
@ -9,4 +11,5 @@ rules:
product_versions:
- rhel-*
decision_context: osci_compose_gate
rules: []
rules:
- !PassingTestCaseRule {test_case_name: osci.brew-build.tier0.functional}

@ -44,6 +44,7 @@ enum
OPTION_SHELL_JOB,
OPTION_EXT_UNIX_SK,
OPTION_FILE_LOCKS,
OPTION_NETWORK_LOCK_METHOD,
OPTION_PARENT_PATH,
OPTION_PRE_DUMP,
OPTION_MANAGE_CGROUPS_MODE,
@ -61,6 +62,7 @@ static struct argp_option options[]
{ "ext-unix-sk", OPTION_EXT_UNIX_SK, 0, 0, "allow external unix sockets", 0 },
{ "shell-job", OPTION_SHELL_JOB, 0, 0, "allow shell jobs", 0 },
{ "file-locks", OPTION_FILE_LOCKS, 0, 0, "allow file locks", 0 },
{ "network-lock", OPTION_NETWORK_LOCK_METHOD, 0, 0, "set network lock method", 0 },
#ifdef CRIU_PRE_DUMP_SUPPORT
{ "parent-path", OPTION_PARENT_PATH, "DIR", 0, "path for previous criu image files in pre-dump", 0 },
{ "pre-dump", OPTION_PRE_DUMP, 0, 0, "dump container's memory information only, leave the container running after this", 0 },
@ -72,6 +74,25 @@ static struct argp_option options[]
static char args_doc[] = "checkpoint CONTAINER";
int
crun_parse_network_lock_method (char *param arg_unused)
{
#if HAVE_CRIU && HAVE_DLOPEN
if (strcmp (param, "iptables") == 0)
return CRIU_NETWORK_LOCK_IPTABLES;
else if (strcmp (param, "nftables") == 0)
return CRIU_NETWORK_LOCK_NFTABLES;
# if CRIU_NETWORK_LOCK_SKIP_SUPPORT
else if (strcmp (param, "skip") == 0)
return CRIU_NETWORK_LOCK_SKIP;
# endif
else
libcrun_fail_with_error (0, "unknown network lock method specified");
#else
return 0;
#endif
}
int
crun_parse_manage_cgroups_mode (char *param arg_unused)
{
@ -139,6 +160,10 @@ parse_opt (int key, char *arg, struct argp_state *state)
cr_options.manage_cgroups_mode = crun_parse_manage_cgroups_mode (argp_mandatory_argument (arg, state));
break;
case OPTION_NETWORK_LOCK_METHOD:
cr_options.network_lock_method = crun_parse_network_lock_method (argp_mandatory_argument (arg, state));
break;
default:
return ARGP_ERR_UNKNOWN;
}
@ -174,7 +199,7 @@ crun_command_checkpoint (struct crun_global_arguments *global_args, int argc, ch
path = getcwd (NULL, 0);
if (UNLIKELY (path == NULL))
libcrun_fail_with_error (0, "realloc failed");
libcrun_fail_with_error (errno, "getcwd failed");
ret = asprintf (&cr_path, "%s/checkpoint", path);
if (UNLIKELY (ret < 0))

@ -21,6 +21,7 @@
#include "crun.h"
int crun_parse_manage_cgroups_mode (char *param);
int crun_parse_network_lock_method (char *param);
int crun_command_checkpoint (struct crun_global_arguments *global_args, int argc, char **argv, libcrun_error_t *error);
#endif

@ -17,16 +17,10 @@
*/
#include <config.h>
#include <stdio.h>
#include <stdlib.h>
#include <argp.h>
#include <string.h>
#include <unistd.h>
#include <errno.h>
#include "crun.h"
#include "libcrun/container.h"
#include "libcrun/utils.h"
#include "run_create.h"
enum
{
@ -80,7 +74,7 @@ parse_opt (int key, char *arg, struct argp_state *state)
break;
case OPTION_PRESERVE_FDS:
crun_context.preserve_fds = strtoul (argp_mandatory_argument (arg, state), NULL, 10);
crun_context.preserve_fds = parse_int_or_fail (argp_mandatory_argument (arg, state), "preserve-fds");
break;
case OPTION_NO_SUBREAPER:
@ -110,64 +104,14 @@ parse_opt (int key, char *arg, struct argp_state *state)
static struct argp run_argp = { options, parse_opt, args_doc, doc, NULL, NULL, NULL };
static unsigned int
get_options ()
{
return 0;
}
int
crun_command_create (struct crun_global_arguments *global_args, int argc, char **argv, libcrun_error_t *err)
{
int first_arg = 0, ret;
cleanup_container libcrun_container_t *container = NULL;
cleanup_free char *bundle_cleanup = NULL;
cleanup_free char *config_file_cleanup = NULL;
crun_context.preserve_fds = 0;
crun_context.listen_fds = 0;
argp_parse (&run_argp, argc, argv, ARGP_IN_ORDER, &first_arg, &crun_context);
crun_assert_n_args (argc - first_arg, 1, 1);
/* Make sure the config is an absolute path before changing the directory. */
if ((strcmp ("config.json", config_file) != 0))
{
if (config_file[0] != '/')
{
config_file_cleanup = realpath (config_file, NULL);
if (config_file_cleanup == NULL)
libcrun_fail_with_error (errno, "realpath `%s` failed", config_file);
config_file = config_file_cleanup;
}
}
/* Make sure the bundle is an absolute path. */
if (bundle == NULL)
bundle = bundle_cleanup = getcwd (NULL, 0);
else
{
if (bundle[0] != '/')
{
bundle_cleanup = realpath (bundle, NULL);
if (bundle_cleanup == NULL)
libcrun_fail_with_error (errno, "realpath `%s` failed", bundle);
bundle = bundle_cleanup;
}
if (chdir (bundle) < 0)
libcrun_fail_with_error (errno, "chdir `%s` failed", bundle);
}
ret = init_libcrun_context (&crun_context, argv[first_arg], global_args, err);
if (UNLIKELY (ret < 0))
return ret;
container = libcrun_container_load_from_file (config_file, err);
if (container == NULL)
libcrun_fail_with_error (0, "error loading config.json");
crun_context.bundle = bundle;
if (getenv ("LISTEN_FDS"))
{
crun_context.listen_fds = strtoll (getenv ("LISTEN_FDS"), NULL, 10);
crun_context.preserve_fds += crun_context.listen_fds;
}
return libcrun_container_create (&crun_context, container, 0, err);
return crun_run_create_internal (global_args, argc, argv, libcrun_container_create, get_options, &crun_context, &run_argp, &config_file, &bundle, err);
}

@ -22,6 +22,8 @@
#include <argp.h>
#include <string.h>
#include <libgen.h>
#include <errno.h>
#include <limits.h>
#ifdef HAVE_DLOPEN
# include <dlfcn.h>
@ -48,6 +50,7 @@
#include "oci_features.h"
#include "ps.h"
#include "checkpoint.h"
#include "mounts.h"
#include "restore.h"
static struct crun_global_arguments arguments;
@ -116,6 +119,9 @@ init_libcrun_context (libcrun_context_t *con, const char *id, struct crun_global
return ret;
}
libcrun_set_verbosity (glob->verbosity);
libcrun_debug ("Using debug verbosity");
if (con->bundle == NULL)
con->bundle = ".";
@ -142,6 +148,7 @@ enum
COMMAND_PS,
COMMAND_CHECKPOINT,
COMMAND_RESTORE,
COMMAND_MOUNTS,
};
struct commands_s commands[] = { { COMMAND_CREATE, "create", crun_command_create },
@ -162,6 +169,7 @@ struct commands_s commands[] = { { COMMAND_CREATE, "create", crun_command_create
{ COMMAND_CHECKPOINT, "checkpoint", crun_command_checkpoint },
{ COMMAND_RESTORE, "restore", crun_command_restore },
#endif
{ COMMAND_MOUNTS, "mounts", crun_command_mounts },
{
0,
} };
@ -175,6 +183,7 @@ static char doc[] = "\nCOMMANDS:\n"
"\texec - exec a command in a running container\n"
"\tfeatures - show the enabled features\n"
"\tlist - list known containers\n"
"\tmounts - add or remove mounts from a running container\n"
"\tkill - send a signal to the container init process\n"
"\tps - show the processes in the container\n"
#if HAVE_CRIU && HAVE_DLOPEN
@ -209,19 +218,21 @@ enum
OPTION_CGROUP_MANAGER,
OPTION_LOG,
OPTION_LOG_FORMAT,
OPTION_LOG_LEVEL,
OPTION_ROOT,
OPTION_ROOTLESS
};
const char *argp_program_bug_address = "https://github.com/containers/crun/issues";
static struct argp_option options[] = { { "debug", OPTION_DEBUG, 0, 0, "produce verbose output", 0 },
static struct argp_option options[] = { { "debug", OPTION_DEBUG, 0, 0, "produce verbose output, similar to --log-level=debug", 0 },
{ "cgroup-manager", OPTION_CGROUP_MANAGER, "MANAGER", 0, "cgroup manager", 0 },
{ "systemd-cgroup", OPTION_SYSTEMD_CGROUP, 0, 0, "use systemd cgroups", 0 },
{ "log", OPTION_LOG, "FILE", 0, NULL, 0 },
{ "log-format", OPTION_LOG_FORMAT, "FORMAT", 0, NULL, 0 },
{ "log", OPTION_LOG, "FILE", 0, "log destination: '[file:]PATH', 'journald:ID' or 'syslog:ID' (defaults to stderr)", 0 },
{ "log-format", OPTION_LOG_FORMAT, "FORMAT", 0, "log format: 'text' (default) or 'json'", 0 },
{ "log-level", OPTION_LOG_LEVEL, "LEVEL", 0, "log level to use: 'error' (default), 'warning' or 'debug'", 0 },
{ "root", OPTION_ROOT, "DIR", 0, NULL, 0 },
{ "rootless", OPTION_ROOT, "VALUE", 0, NULL, 0 },
{ "rootless", OPTION_ROOTLESS, "VALUE", 0, NULL, 0 },
{ "version", OPTION_VERSION, 0, 0, NULL, 0 },
// alias OPTION_VERSION_CAP with OPTION_VERSION
{ NULL, OPTION_VERSION_CAP, 0, OPTION_ALIAS, NULL, 0 },
@ -232,11 +243,21 @@ static struct argp_option options[] = { { "debug", OPTION_DEBUG, 0, 0, "produce
static void
print_version (FILE *stream, struct argp_state *state arg_unused)
{
cleanup_free char *rundir = libcrun_get_state_directory (arguments.root, NULL);
libcrun_error_t err = NULL;
cleanup_free char *rundir = NULL;
int ret;
fprintf (stream, "%s version %s\n", PACKAGE_NAME, PACKAGE_VERSION);
fprintf (stream, "commit: %s\n", GIT_VERSION);
fprintf (stream, "rundir: %s\n", rundir);
ret = libcrun_get_state_directory (&rundir, arguments.root, NULL, &err);
if (LIKELY (ret == 0))
fprintf (stream, "rundir: %s\n", rundir);
else
libcrun_error_release (&err);
fprintf (stream, "spec: 1.0.0\n");
#ifdef HAVE_SYSTEMD
fprintf (stream, "+SYSTEMD ");
#endif
@ -251,7 +272,7 @@ print_version (FILE *stream, struct argp_state *state arg_unused)
#ifdef HAVE_EBPF
fprintf (stream, "+EBPF ");
#endif
#ifdef HAVE_CRIU
#if HAVE_CRIU && HAVE_DLOPEN
fprintf (stream, "+CRIU ");
#endif
@ -268,7 +289,7 @@ parse_opt (int key, char *arg, struct argp_state *state)
switch (key)
{
case OPTION_DEBUG:
arguments.debug = true;
arguments.verbosity = LIBCRUN_VERBOSITY_DEBUG;
break;
case OPTION_CGROUP_MANAGER:
@ -307,6 +328,26 @@ parse_opt (int key, char *arg, struct argp_state *state)
arguments.log_format = argp_mandatory_argument (arg, state);
break;
case OPTION_LOG_LEVEL:
tmp = argp_mandatory_argument (arg, state);
if (strcmp (tmp, "error") == 0)
{
arguments.verbosity = LIBCRUN_VERBOSITY_ERROR;
}
else if (strcmp (tmp, "warning") == 0)
{
arguments.verbosity = LIBCRUN_VERBOSITY_WARNING;
}
else if (strcmp (tmp, "debug") == 0)
{
arguments.verbosity = LIBCRUN_VERBOSITY_DEBUG;
}
else
{
libcrun_fail_with_error (0, "unknown verbosity `%s` specified", arg);
}
break;
case OPTION_ROOT:
arguments.root = argp_mandatory_argument (arg, state);
break;
@ -348,6 +389,24 @@ argp_mandatory_argument (char *arg, struct argp_state *state)
return state->argv[state->next++];
}
int
parse_int_or_fail (const char *str, const char *kind)
{
char *endptr = NULL;
long long l;
errno = 0;
l = strtoll (str, &endptr, 10);
if (errno != 0)
libcrun_fail_with_error (errno, "invalid value for `%s`", kind);
if (endptr != NULL && *endptr != '\0')
libcrun_fail_with_error (EINVAL, "invalid value for `%s`", kind);
if (l < INT_MIN || l > INT_MAX)
libcrun_fail_with_error (ERANGE, "invalid value for `%s`", kind);
return (int) l;
}
static struct argp argp = { options, parse_opt, args_doc, doc, NULL, NULL, NULL };
int ensure_cloned_binary (void);
@ -367,6 +426,16 @@ fill_handler_from_argv0 (char *argv0, struct crun_global_arguments *args)
args->handler = b + 5;
}
static char **
copy_args (char **argv, int argc)
{
char **buff = xmalloc0 ((argc + 1) * sizeof (char *));
for (int i = 0; i < argc; i++)
buff[i] = argv[i];
return buff;
}
int
main (int argc, char **argv)
{
@ -384,22 +453,23 @@ main (int argc, char **argv)
}
/* Resolve all libcrun weak dependencies. */
if (dlopen ("libcrun.so", RTLD_GLOBAL | RTLD_DEEPBIND | RTLD_LAZY) == NULL)
error (EXIT_FAILURE, 0, "dlopen: %s", dlerror ());
error (EXIT_FAILURE, 0, "could not load `libcrun.so`: `%s`", dlerror ());
#endif
fill_handler_from_argv0 (argv[0], &arguments);
argp_parse (&argp, argc, argv, ARGP_IN_ORDER, &first_argument, &arguments);
command = get_command (argv[first_argument]);
if (command == NULL)
libcrun_fail_with_error (0, "unknown command %s", argv[first_argument]);
if (arguments.debug)
libcrun_set_verbosity (LIBCRUN_VERBOSITY_WARNING);
int command_argc = argc - first_argument;
cleanup_free char **command_argv = copy_args (argv + first_argument, command_argc);
command_argv[0] = argv[0];
ret = command->handler (&arguments, argc - first_argument, argv + first_argument, &err);
ret = command->handler (&arguments, command_argc, command_argv, &err);
if (ret && err)
libcrun_fail_with_error (err->status, "%s", err->msg);
return ret;
}
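
The last hunk above gives each subcommand handler a private copy of its argument vector and then overwrites slot 0 with the program name, so the process-wide `argv` is left untouched. A minimal sketch of that pattern follows. It is an illustration under stated assumptions, not crun's code: `copy_args_sketch` is a hypothetical name, and plain `calloc` with an explicit failure return stands in for crun's `xmalloc0`.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the copy_args helper shown in the diff.
   Makes a shallow copy of the first ARGC entries of ARGV and leaves one
   extra zeroed slot at the end as the terminator.  Only the pointer
   array is duplicated; the strings themselves are shared with ARGV.
   Returns NULL if the allocation fails (assumption: the caller checks). */
static char **
copy_args_sketch (char **argv, int argc)
{
  /* calloc zero-fills the array, so buff[argc] reads as a null pointer
     on POSIX systems without an explicit store. */
  char **buff = calloc ((size_t) argc + 1, sizeof (char *));
  if (buff == NULL)
    return NULL;

  for (int i = 0; i < argc; i++)
    buff[i] = argv[i];

  return buff;
}
```

Because the copy is shallow, the caller frees only the returned array and never the strings. Replacing `copy[0]` changes what the handler sees as its program name while the original `argv` entries stay as they were.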

@ -28,15 +28,16 @@ struct crun_global_arguments
const char *handler;
int argc;
int verbosity;
char **argv;
bool command;
bool debug;
bool option_systemd_cgroup;
bool option_force_no_cgroup;
};
char *argp_mandatory_argument (char *arg, struct argp_state *state);
int parse_int_or_fail (const char *str, const char *kind);
int init_libcrun_context (libcrun_context_t *con, const char *id, struct crun_global_arguments *glob,
libcrun_error_t *err);
void crun_assert_n_args (int n, int min, int max);

@ -23,6 +23,7 @@
#include <string.h>
#include <unistd.h>
#include <errno.h>
#include <limits.h>
#include "crun.h"
#include "libcrun/container.h"
@ -151,7 +152,7 @@ parse_opt (int key, char *arg, struct argp_state *state)
break;
case OPTION_PRESERVE_FDS:
exec_options.preserve_fds = strtoul (argp_mandatory_argument (arg, state), NULL, 10);
exec_options.preserve_fds = parse_int_or_fail (argp_mandatory_argument (arg, state), "preserve-fds");
break;
case OPTION_CGROUP:
@ -203,27 +204,34 @@ make_oci_process_user (const char *userspec)
{
runtime_spec_schema_config_schema_process_user *u;
char *endptr = NULL;
long long l;
if (userspec == NULL)
return NULL;
u = xmalloc0 (sizeof (runtime_spec_schema_config_schema_process_user));
errno = 0;
u->uid = strtol (userspec, &endptr, 10);
l = strtoll (userspec, &endptr, 10);
if (errno == ERANGE)
libcrun_fail_with_error (0, "invalid UID specified");
if (*endptr == '\0')
return u;
if (*endptr != ':')
libcrun_fail_with_error (0, "invalid USERSPEC specified");
if (l < INT_MIN || l > INT_MAX)
libcrun_fail_with_error (0, "invalid UID specified");
u->uid = (int) l;
errno = 0;
u->gid = strtol (endptr + 1, &endptr, 10);
l = strtoll (endptr + 1, &endptr, 10);
if (errno == ERANGE)
libcrun_fail_with_error (0, "invalid GID specified");
if (l < INT_MIN || l > INT_MAX)
libcrun_fail_with_error (0, "invalid GID specified");
if (*endptr != '\0')
libcrun_fail_with_error (0, "invalid USERSPEC specified");
u->gid = (int) l;
return u;
}
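
The hunk above parses a `UID[:GID]` string with `strtoll`, rejecting values that overflow or fall outside the `int` range before narrowing. The same checks can be sketched as a standalone function that reports failure through its return value instead of terminating the process. Everything named here (`struct user_ids`, `parse_id`, `parse_userspec`) is hypothetical and exists only for illustration; the real code fills a `runtime_spec_schema_config_schema_process_user` and calls `libcrun_fail_with_error`.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical result type for this sketch. */
struct user_ids
{
  int uid;
  int gid;
};

/* Parse one base-10 id starting at S.  On success store it in *OUT, set
   *END to the first unparsed character and return 0.  Return -1 when the
   value overflows long long or does not fit in an int. */
static int
parse_id (const char *s, char **end, int *out)
{
  long long l;

  errno = 0;
  l = strtoll (s, end, 10);
  if (errno == ERANGE || l < INT_MIN || l > INT_MAX)
    return -1;

  *out = (int) l;
  return 0;
}

/* Parse "UID" or "UID:GID".  Returns 0 on success, -1 on malformed input.
   The gid stays 0 when the spec has no ":GID" part. */
static int
parse_userspec (const char *spec, struct user_ids *u)
{
  char *end = NULL;

  u->uid = 0;
  u->gid = 0;

  if (parse_id (spec, &end, &u->uid) < 0)
    return -1;
  if (*end == '\0')
    return 0;                   /* only a UID was given */
  if (*end != ':')
    return -1;                  /* garbage after the UID */

  if (parse_id (end + 1, &end, &u->gid) < 0)
    return -1;
  if (*end != '\0')
    return -1;                  /* garbage after the GID */

  return 0;
}
```

Parsing into a `long long` first and narrowing only after the range check is what lets out-of-range input be rejected rather than silently truncated on assignment to an `int` field.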

@ -93,7 +93,7 @@ make_new_sibling_cgroup (char **out, const char *id, libcrun_error_t *err)
char *dir;
int ret;
ret = libcrun_get_current_unified_cgroup (&current_cgroup, false, err);
ret = libcrun_get_cgroup_process (0, &current_cgroup, false, err);
if (UNLIKELY (ret < 0))
return ret;

@ -21,6 +21,9 @@
#include "container.h"
#include "utils.h"
#include <stdint.h>
#include <math.h>
enum
{
CGROUP_MEMORY = 1 << 0,
@ -29,6 +32,7 @@ enum
CGROUP_CPUSET = 1 << 3,
CGROUP_PIDS = 1 << 4,
CGROUP_IO = 1 << 5,
CGROUP_MISC = 1 << 6,
};
struct libcrun_cgroup_status
@ -37,6 +41,8 @@ struct libcrun_cgroup_status
char *scope;
int manager;
bool bpf_dev_set;
};
struct libcrun_cgroup_manager
@ -47,7 +53,7 @@ struct libcrun_cgroup_manager
/* Destroy the cgroup and kill any process if needed. */
int (*destroy_cgroup) (struct libcrun_cgroup_status *cgroup_status, libcrun_error_t *err);
/* Additional resources configuration specific to this manager. */
int (*update_resources) (struct libcrun_cgroup_status *cgroup_status, runtime_spec_schema_config_linux_resources *resources, libcrun_error_t *err);
int (*update_resources) (struct libcrun_cgroup_status *cgroup_status, const char *state_root, runtime_spec_schema_config_linux_resources *resources, libcrun_error_t *err);
};
int move_process_to_cgroup (pid_t pid, const char *subsystem, const char *path, libcrun_error_t *err);
@ -55,7 +61,6 @@ int enter_cgroup_subsystem (pid_t pid, const char *subsystem, const char *path,
libcrun_error_t *err);
int enable_controllers (const char *path, libcrun_error_t *err);
int chown_cgroups (const char *path, uid_t uid, gid_t gid, libcrun_error_t *err);
int destroy_cgroup_path (const char *path, int mode, libcrun_error_t *err);
int cgroup_killall_path (const char *path, int signal, libcrun_error_t *err);
int libcrun_cgroup_read_pids_from_path (const char *path, bool recurse, pid_t **pids, libcrun_error_t *err);
@ -76,8 +81,22 @@ int libcrun_cgroup_pause_unpause_path (const char *cgroup_path, const bool pause
static inline uint64_t
convert_shares_to_weight (uint64_t shares)
{
/* convert linearly from 2-262144 to 1-10000. */
return (1 + ((shares - 2) * 9999) / 262142);
double l, exponent;
/* The value of 0 means "unset". */
if (shares == 0)
return 0;
if (shares <= 2)
return 1;
if (shares >= 262144)
return 10000;
l = log2 ((double) shares);
/* Quadratic function which fits min, max, and default. */
exponent = (l * l + 125 * l) / 612.0 - 7.0 / 34.0;
return (uint64_t) ceil (pow (10, exponent));
}
int initialize_cpuset_subsystem (const char *path, libcrun_error_t *err);
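
The comment in the new `convert_shares_to_weight` says the quadratic "fits min, max, and default". As a worked check in exact arithmetic, assuming the three anchor points are the cgroup v1 `cpu.shares` values 2, 1024 and 262144 mapped to the cgroup v2 `cpu.weight` values 1, 100 and 10000, write $l = \log_2(\text{shares})$:

```latex
\begin{align*}
e(l) &= \frac{l^2 + 125\,l}{612} - \frac{7}{34}
      = \frac{l^2 + 125\,l - 126}{612}
      = \frac{(l - 1)(l + 126)}{612}, \\
\text{weight} &= \left\lceil 10^{\,e(l)} \right\rceil, \\[4pt]
\text{shares} = 2       &:\quad l = 1,  \quad e = \frac{0 \cdot 127}{612} = 0
                         \;\Rightarrow\; 10^{0} = 1, \\
\text{shares} = 1024    &:\quad l = 10, \quad e = \frac{9 \cdot 136}{612} = \frac{1224}{612} = 2
                         \;\Rightarrow\; 10^{2} = 100, \\
\text{shares} = 262144  &:\quad l = 18, \quad e = \frac{17 \cdot 144}{612} = \frac{2448}{612} = 4
                         \;\Rightarrow\; 10^{4} = 10000.
\end{align*}
```

These are the results over the reals. The C code evaluates the same expression in `double`, and the explicit early returns for `shares <= 2` and `shares >= 262144` mean the two endpoints never depend on floating-point rounding.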

@ -38,6 +38,28 @@
#include <fcntl.h>
#include <libgen.h>
struct default_dev_s default_devices[] = {
{ 'c', -1, -1, "m" },
{ 'b', -1, -1, "m" },
{ 'c', 1, 3, "rwm" },
{ 'c', 1, 8, "rwm" },
{ 'c', 1, 7, "rwm" },
{ 'c', 5, 0, "rwm" },
{ 'c', 1, 5, "rwm" },
{ 'c', 1, 9, "rwm" },
{ 'c', 5, 1, "rwm" },
{ 'c', 136, -1, "rwm" },
{ 'c', 5, 2, "rwm" },
{ 'c', 10, 200, "rwm" },
{ 0, 0, 0, NULL }
};
struct default_dev_s *
get_default_devices ()
{
return default_devices;
}
static inline int
write_cgroup_file (int dirfd, const char *name, const void *data, size_t len, libcrun_error_t *err)
{
@ -49,11 +71,11 @@ write_cgroup_file_or_alias (int dirfd, const char *name, const char *alias, cons
{
int ret;
ret = write_file_at_with_flags (dirfd, O_WRONLY | O_CLOEXEC, 0, name, data, len, err);
ret = write_cgroup_file (dirfd, name, data, len, err);
if (UNLIKELY (alias != NULL && ret < 0 && crun_error_get_errno (err) == ENOENT))
{
crun_error_release (err);
ret = write_file_at_with_flags (dirfd, O_WRONLY | O_CLOEXEC, 0, alias, data, len, err);
ret = write_cgroup_file (dirfd, alias, data, len, err);
}
return ret;
}
@ -156,8 +178,18 @@ check_cgroup_v2_controller_available_wrapper (int ret, int cgroup_dirfd, const c
}
if (! found)
{
cleanup_free char *absolute_path = NULL;
libcrun_error_t tmp_err = NULL;
crun_error_release (err);
return crun_make_error (err, 0, "the requested cgroup controller `%s` is not available", key);
ret = get_realpath_to_file (cgroup_dirfd, "cgroup.controllers", &absolute_path, &tmp_err);
if (LIKELY (ret >= 0))
ret = crun_make_error (err, 0, "controller `%s` is not available under %s", key, absolute_path);
else
{
crun_error_release (&tmp_err);
ret = crun_make_error (err, 0, "the requested cgroup controller `%s` is not available", key);
}
}
}
return ret;
@ -175,6 +207,21 @@ write_file_and_check_controllers_at (bool cgroup2, int dirfd, const char *name,
return ret;
}
static int
open_file_and_check_controllers_at (bool cgroup2, int dirfd, const char *name, int flags, libcrun_error_t *err)
{
int ret;
ret = openat (dirfd, name, flags);
if (UNLIKELY (ret < 0))
{
ret = crun_make_error (err, errno, "open `%s`", name);
if (cgroup2)
return check_cgroup_v2_controller_available_wrapper (ret, dirfd, name, err);
}
return ret;
}
/* The parser generates different structs but they are really all the same. */
typedef runtime_spec_schema_defs_linux_block_io_device_throttle throttling_s;
@ -196,9 +243,11 @@ write_blkio_v1_resources_throttling (int dirfd, const char *name, throttling_s *
for (i = 0; i < throttling_len; i++)
{
int ret;
size_t len;
len = sprintf (fmt_buf, "%" PRIu64 ":%" PRIu64 " %" PRIu64 "\n", throttling[i]->major, throttling[i]->minor,
throttling[i]->rate);
int len;
len = snprintf (fmt_buf, sizeof (fmt_buf), "%" PRIu64 ":%" PRIu64 " %" PRIu64 "\n", throttling[i]->major, throttling[i]->minor,
throttling[i]->rate);
if (UNLIKELY (len >= (int) sizeof (fmt_buf)))
return crun_make_error (err, 0, "internal error: static buffer too small");
ret = TEMP_FAILURE_RETRY (write (fd, fmt_buf, len));
if (UNLIKELY (ret < 0))
@ -220,9 +269,11 @@ write_blkio_v2_resources_throttling (int fd, const char *name, throttling_s **th
for (i = 0; i < throttling_len; i++)
{
int ret;
size_t len;
len = sprintf (fmt_buf, "%" PRIu64 ":%" PRIu64 " %s=%" PRIu64 "\n", throttling[i]->major, throttling[i]->minor,
name, throttling[i]->rate);
int len;
len = snprintf (fmt_buf, sizeof (fmt_buf), "%" PRIu64 ":%" PRIu64 " %s=%" PRIu64 "\n", throttling[i]->major, throttling[i]->minor,
name, throttling[i]->rate);
if (UNLIKELY (len >= (int) sizeof (fmt_buf)))
return crun_make_error (err, 0, "internal error: static buffer too small");
ret = TEMP_FAILURE_RETRY (write (fd, fmt_buf, len));
if (UNLIKELY (ret < 0))
@ -236,14 +287,16 @@ write_blkio_resources (int dirfd, bool cgroup2, runtime_spec_schema_config_linux
libcrun_error_t *err)
{
char fmt_buf[128];
size_t len;
int len;
int ret;
if (blkio->weight)
{
uint32_t val = blkio->weight;
len = sprintf (fmt_buf, "%" PRIu32, val);
len = snprintf (fmt_buf, sizeof (fmt_buf), "%" PRIu32, val);
if (UNLIKELY (len >= (int) sizeof (fmt_buf)))
return crun_make_error (err, 0, "internal error: static buffer too small");
if (! cgroup2)
{
ret = write_cgroup_file_or_alias (dirfd, "blkio.weight", "blkio.bfq.weight", fmt_buf, len, err);
@ -262,7 +315,9 @@ write_blkio_resources (int dirfd, bool cgroup2, runtime_spec_schema_config_linux
/* convert linearly from [10-1000] to [1-10000] */
val = 1 + (val - 10) * 9999 / 990;
len = sprintf (fmt_buf, "%" PRIu32, val);
len = snprintf (fmt_buf, sizeof (fmt_buf), "%" PRIu32, val);
if (UNLIKELY (len >= (int) sizeof (fmt_buf)))
return crun_make_error (err, 0, "internal error: static buffer too small");
ret = write_cgroup_file (dirfd, "io.weight", fmt_buf, len, err);
}
@ -276,7 +331,11 @@ write_blkio_resources (int dirfd, bool cgroup2, runtime_spec_schema_config_linux
{
if (cgroup2)
return crun_make_error (err, 0, "cannot set leaf_weight with cgroupv2");
len = sprintf (fmt_buf, "%d", blkio->leaf_weight);
len = snprintf (fmt_buf, sizeof (fmt_buf), "%d", blkio->leaf_weight);
if (UNLIKELY (len >= (int) sizeof (fmt_buf)))
return crun_make_error (err, 0, "internal error: static buffer too small");
ret = write_cgroup_file (dirfd, "blkio.leaf_weight", fmt_buf, len, err);
if (UNLIKELY (ret < 0))
return ret;
@ -290,16 +349,19 @@ write_blkio_resources (int dirfd, bool cgroup2, runtime_spec_schema_config_linux
wfd = openat (dirfd, "io.bfq.weight", O_WRONLY | O_CLOEXEC);
if (UNLIKELY (wfd < 0))
return crun_make_error (err, errno, "open io.weight");
return crun_make_error (err, errno, "open `io.bfq.weight`");
for (i = 0; i < blkio->weight_device_len; i++)
{
uint32_t w = blkio->weight_device[i]->weight;
len = sprintf (fmt_buf, "%" PRIu64 ":%" PRIu64 " %i\n", blkio->weight_device[i]->major,
blkio->weight_device[i]->minor, w);
len = snprintf (fmt_buf, sizeof (fmt_buf), "%" PRIu64 ":%" PRIu64 " %i\n", blkio->weight_device[i]->major,
blkio->weight_device[i]->minor, w);
if (UNLIKELY (len >= (int) sizeof (fmt_buf)))
return crun_make_error (err, 0, "internal error: static buffer too small");
ret = TEMP_FAILURE_RETRY (write (wfd, fmt_buf, len));
if (UNLIKELY (ret < 0))
return crun_make_error (err, errno, "write io.weight");
return crun_make_error (err, errno, "write `io.bfq.weight`");
/* Ignore blkio->weight_device[i]->leaf_weight. */
}
@ -327,16 +389,22 @@ write_blkio_resources (int dirfd, bool cgroup2, runtime_spec_schema_config_linux
for (i = 0; i < blkio->weight_device_len; i++)
{
len = sprintf (fmt_buf, "%" PRIu64 ":%" PRIu64 " %" PRIu16 "\n", blkio->weight_device[i]->major,
blkio->weight_device[i]->minor, blkio->weight_device[i]->weight);
len = snprintf (fmt_buf, sizeof (fmt_buf), "%" PRIu64 ":%" PRIu64 " %" PRIu16 "\n", blkio->weight_device[i]->major,
blkio->weight_device[i]->minor, blkio->weight_device[i]->weight);
if (UNLIKELY (len >= (int) sizeof (fmt_buf)))
return crun_make_error (err, 0, "internal error: static buffer too small");
ret = TEMP_FAILURE_RETRY (write (w_device_fd, fmt_buf, len));
if (UNLIKELY (ret < 0))
return crun_make_error (err, errno, "write `%s`", weight_device_file_name);
if (w_leafdevice_fd >= 0)
{
len = sprintf (fmt_buf, "%" PRIu64 ":%" PRIu64 " %" PRIu16 "\n", blkio->weight_device[i]->major,
blkio->weight_device[i]->minor, blkio->weight_device[i]->leaf_weight);
len = snprintf (fmt_buf, sizeof (fmt_buf), "%" PRIu64 ":%" PRIu64 " %" PRIu16 "\n", blkio->weight_device[i]->major,
blkio->weight_device[i]->minor, blkio->weight_device[i]->leaf_weight);
if (UNLIKELY (len >= (int) sizeof (fmt_buf)))
return crun_make_error (err, 0, "internal error: static buffer too small");
ret = TEMP_FAILURE_RETRY (write (w_leafdevice_fd, fmt_buf, len));
if (UNLIKELY (ret < 0))
return crun_make_error (err, errno, "write `%s`", leaf_weight_device_file_name);
@ -410,11 +478,14 @@ write_network_resources (int dirfd_netclass, int dirfd_netprio, runtime_spec_sch
libcrun_error_t *err)
{
char fmt_buf[128];
size_t len;
int len;
int ret;
if (net->class_id)
{
len = sprintf (fmt_buf, "%d", net->class_id);
len = snprintf (fmt_buf, sizeof (fmt_buf), "%d", net->class_id);
if (UNLIKELY (len >= (int) sizeof (fmt_buf)))
return crun_make_error (err, 0, "internal error: static buffer too small");
ret = write_cgroup_file (dirfd_netclass, "net_cls.classid", fmt_buf, len, err);
if (UNLIKELY (ret < 0))
return ret;
@ -429,10 +500,13 @@ write_network_resources (int dirfd_netclass, int dirfd_netprio, runtime_spec_sch
for (i = 0; i < net->priorities_len; i++)
{
len = sprintf (fmt_buf, "%s %d\n", net->priorities[i]->name, net->priorities[i]->priority);
len = snprintf (fmt_buf, sizeof (fmt_buf), "%s %d\n", net->priorities[i]->name, net->priorities[i]->priority);
if (UNLIKELY (len >= (int) sizeof (fmt_buf)))
return crun_make_error (err, 0, "internal error: static buffer too small");
ret = TEMP_FAILURE_RETRY (write (fd, fmt_buf, len));
if (UNLIKELY (ret < 0))
return crun_make_error (err, errno, "write net_prio.ifpriomap");
return crun_make_error (err, errno, "write `net_prio.ifpriomap`");
}
}
@ -450,14 +524,16 @@ write_hugetlb_resources (int dirfd, bool cgroup2,
{
cleanup_free char *filename = NULL;
const char *suffix;
size_t len;
int len;
int ret;
suffix = cgroup2 ? "max" : "limit_in_bytes";
xasprintf (&filename, "hugetlb.%s.%s", htlb[i]->page_size, suffix);
len = sprintf (fmt_buf, "%" PRIu64, htlb[i]->limit);
len = snprintf (fmt_buf, sizeof (fmt_buf), "%" PRIu64, htlb[i]->limit);
if (UNLIKELY (len >= (int) sizeof (fmt_buf)))
return crun_make_error (err, 0, "internal error: static buffer too small");
ret = write_file_and_check_controllers_at (cgroup2, dirfd, filename, NULL, fmt_buf, len, err);
if (UNLIKELY (ret < 0))
return ret;
@ -469,22 +545,9 @@ static int
write_devices_resources_v1 (int dirfd, runtime_spec_schema_defs_linux_device_cgroup **devs, size_t devs_len,
libcrun_error_t *err)
{
size_t i, len;
size_t i;
int len;
int ret;
char *default_devices[] = {
"c *:* m",
"b *:* m",
"c 1:3 rwm",
"c 1:8 rwm",
"c 1:7 rwm",
"c 5:0 rwm",
"c 1:5 rwm",
"c 1:9 rwm",
"c 5:1 rwm",
"c 136:* rwm",
"c 5:2 rwm",
NULL
};
for (i = 0; i < devs_len; i++)
{
@ -503,31 +566,64 @@ write_devices_resources_v1 (int dirfd, runtime_spec_schema_defs_linux_device_cgr
char fmt_buf_major[16];
char fmt_buf_minor[16];
#define FMT_DEV(x, b) \
do \
{ \
if (x##_present) \
sprintf (b, "%" PRIi64, x); \
else \
strcpy (b, "*"); \
#define FMT_DEV(x, b) \
do \
{ \
if (x##_present) \
{ \
len = snprintf (b, sizeof (b), "%" PRIi64, x); \
if (UNLIKELY (len >= (int) sizeof (b))) \
return crun_make_error (err, 0, "internal error: static buffer too small"); \
} \
else \
strcpy (b, "*"); \
} while (0)
FMT_DEV (devs[i]->major, fmt_buf_major);
FMT_DEV (devs[i]->minor, fmt_buf_minor);
#undef FMT_DEV
len = snprintf (fmt_buf, FMT_BUF_LEN - 1, "%s %s:%s %s", devs[i]->type, fmt_buf_major, fmt_buf_minor,
len = snprintf (fmt_buf, FMT_BUF_LEN, "%s %s:%s %s", devs[i]->type, fmt_buf_major, fmt_buf_minor,
devs[i]->access);
/* Make sure it is still a NUL terminated string. */
fmt_buf[len] = '\0';
if (UNLIKELY (len >= FMT_BUF_LEN))
return crun_make_error (err, 0, "internal error: static buffer too small");
}
ret = write_cgroup_file (dirfd, file, fmt_buf, len, err);
if (UNLIKELY (ret < 0))
return ret;
}
for (i = 0; default_devices[i]; i++)
for (i = 0; default_devices[i].type; i++)
{
ret = write_cgroup_file (dirfd, "devices.allow", default_devices[i], strlen (default_devices[i]), err);
char fmt_buf_major[16];
char fmt_buf_minor[16];
char device[64];
int len;
#define FMT_DEV(x, b) \
do \
{ \
if (x != -1) \
{ \
len = snprintf (b, sizeof (b), "%d", x); \
if (UNLIKELY (len >= (int) sizeof (b))) \
return crun_make_error (err, 0, "internal error: static buffer too small"); \
} \
else \
strcpy (b, "*"); \
} while (0)
FMT_DEV (default_devices[i].major, fmt_buf_major);
FMT_DEV (default_devices[i].minor, fmt_buf_minor);
#undef FMT_DEV
len = snprintf (device, sizeof (device), "%c %s:%s %s", default_devices[i].type, fmt_buf_major, fmt_buf_minor,
default_devices[i].access);
if (UNLIKELY (len >= (int) sizeof (device)))
return crun_make_error (err, 0, "internal error: static buffer too small");
ret = write_cgroup_file (dirfd, "devices.allow", device, strlen (device), err);
if (UNLIKELY (ret < 0))
return ret;
}
@ -535,45 +631,25 @@ write_devices_resources_v1 (int dirfd, runtime_spec_schema_defs_linux_device_cgr
return 0;
}
static int
write_devices_resources_v2_internal (int dirfd, runtime_spec_schema_defs_linux_device_cgroup **devs, size_t devs_len,
libcrun_error_t *err)
struct bpf_program *
create_dev_bpf (runtime_spec_schema_defs_linux_device_cgroup **devs, size_t devs_len,
libcrun_error_t *err)
{
int i, ret;
cleanup_free struct bpf_program *program = NULL;
struct default_dev_s
{
char type;
int major;
int minor;
const char *access;
};
struct default_dev_s default_devices[] = {
{ 'c', -1, -1, "m" },
{ 'b', -1, -1, "m" },
{ 'c', 1, 3, "rwm" },
{ 'c', 1, 8, "rwm" },
{ 'c', 1, 7, "rwm" },
{ 'c', 5, 0, "rwm" },
{ 'c', 1, 5, "rwm" },
{ 'c', 1, 9, "rwm" },
{ 'c', 5, 1, "rwm" },
{ 'c', 136, -1, "rwm" },
{ 'c', 5, 2, "rwm" },
};
int i;
struct bpf_program *program;
program = bpf_program_new (2048);
program = bpf_program_init_dev (program, err);
if (UNLIKELY (program == NULL))
return -1;
return NULL;
for (i = (sizeof (default_devices) / sizeof (default_devices[0])) - 1; i >= 0; i--)
{
program = bpf_program_append_dev (program, default_devices[i].access, default_devices[i].type,
default_devices[i].major, default_devices[i].minor, true, err);
if (UNLIKELY (program == NULL))
return -1;
return NULL;
}
for (i = devs_len - 1; i >= 0; i--)
{
@ -589,18 +665,27 @@ write_devices_resources_v2_internal (int dirfd, runtime_spec_schema_defs_linux_d
program = bpf_program_append_dev (program, devs[i]->access, type, major, minor, devs[i]->allow, err);
if (UNLIKELY (program == NULL))
return -1;
return NULL;
}
program = bpf_program_complete_dev (program, err);
if (UNLIKELY (program == NULL))
return NULL;
return program;
}
static int
write_devices_resources_v2_internal (int dirfd, runtime_spec_schema_defs_linux_device_cgroup **devs, size_t devs_len,
libcrun_error_t *err)
{
cleanup_free struct bpf_program *program = NULL;
program = create_dev_bpf (devs, devs_len, err);
if (UNLIKELY (program == NULL))
return -1;
ret = libcrun_ebpf_load (program, dirfd, NULL, err);
if (ret < 0)
return ret;
return 0;
return libcrun_ebpf_load (program, dirfd, NULL, err);
}
static int
@ -683,26 +768,33 @@ write_devices_resources (int dirfd, bool cgroup2, runtime_spec_schema_defs_linux
/* use for cgroupv2 files with .min, .max, .low, or .high suffix */
static int
cg_itoa (char *buf, int64_t value, bool use_max)
cg_itoa (char *buf, size_t size, int64_t value, bool use_max, libcrun_error_t *err)
{
int len;
if (use_max && value < 0)
{
memcpy (buf, "max", 4);
return 3;
}
return sprintf (buf, "%" PRIi64, value);
len = snprintf (buf, size, "max");
else
len = snprintf (buf, size, "%" PRIi64, value);
if (UNLIKELY (len >= (int) size))
return crun_make_error (err, 0, "internal error: static buffer too small");
return len;
}
static int
write_memory (int dirfd, bool cgroup2, runtime_spec_schema_config_linux_resources_memory *memory, libcrun_error_t *err)
{
char limit_buf[32];
size_t limit_buf_len;
int limit_buf_len;
if (! memory->limit_present)
return 0;
limit_buf_len = cg_itoa (limit_buf, memory->limit, cgroup2);
limit_buf_len = cg_itoa (limit_buf, sizeof (limit_buf), memory->limit, cgroup2, err);
if (UNLIKELY (limit_buf_len < 0))
return limit_buf_len;
return write_cgroup_file (dirfd, cgroup2 ? "memory.max" : "memory.limit_in_bytes", limit_buf, limit_buf_len, err);
}
@ -714,7 +806,7 @@ write_memory_swap (int dirfd, bool cgroup2, runtime_spec_schema_config_linux_res
int ret;
int64_t swap;
char swap_buf[32];
size_t swap_buf_len;
int len;
const char *fname = cgroup2 ? "memory.swap.max" : "memory.memsw.limit_in_bytes";
if (! memory->swap_present)
@ -734,9 +826,11 @@ write_memory_swap (int dirfd, bool cgroup2, runtime_spec_schema_config_linux_res
swap -= memory->limit;
}
swap_buf_len = cg_itoa (swap_buf, swap, cgroup2);
len = cg_itoa (swap_buf, sizeof (swap_buf), swap, cgroup2, err);
if (UNLIKELY (len < 0))
return len;
ret = write_cgroup_file (dirfd, fname, swap_buf, swap_buf_len, err);
ret = write_cgroup_file (dirfd, fname, swap_buf, len, err);
if (ret >= 0)
return ret;
@ -754,7 +848,7 @@ static int
write_memory_resources (int dirfd, bool cgroup2, runtime_spec_schema_config_linux_resources_memory *memory,
libcrun_error_t *err)
{
size_t len;
int len;
int ret;
char fmt_buf[32];
bool memory_limits_written = false;
@ -829,7 +923,9 @@ write_memory_resources (int dirfd, bool cgroup2, runtime_spec_schema_config_linu
if (cgroup2)
return crun_make_error (err, 0, "cannot set kernel memory with cgroupv2");
len = sprintf (fmt_buf, "%" PRIu64, memory->kernel);
len = snprintf (fmt_buf, sizeof (fmt_buf), "%" PRIu64, memory->kernel);
if (UNLIKELY (len >= (int) sizeof (fmt_buf)))
return crun_make_error (err, 0, "internal error: static buffer too small");
ret = write_cgroup_file (dirfd, "memory.kmem.limit_in_bytes", fmt_buf, len, err);
if (UNLIKELY (ret < 0))
return ret;
@ -850,7 +946,9 @@ write_memory_resources (int dirfd, bool cgroup2, runtime_spec_schema_config_linu
if (memory->reservation_present)
{
len = sprintf (fmt_buf, "%" PRIu64, memory->reservation);
len = snprintf (fmt_buf, sizeof (fmt_buf), "%" PRIu64, memory->reservation);
if (UNLIKELY (len >= (int) sizeof (fmt_buf)))
return crun_make_error (err, 0, "internal error: static buffer too small");
ret = write_file_and_check_controllers_at (cgroup2, dirfd, cgroup2 ? "memory.low" : "memory.soft_limit_in_bytes",
NULL, fmt_buf, len, err);
if (UNLIKELY (ret < 0))
@ -870,7 +968,9 @@ write_memory_resources (int dirfd, bool cgroup2, runtime_spec_schema_config_linu
if (cgroup2)
return crun_make_error (err, 0, "cannot set kernel TCP with cgroupv2");
len = sprintf (fmt_buf, "%" PRIu64, memory->kernel_tcp);
len = snprintf (fmt_buf, sizeof (fmt_buf), "%" PRIu64, memory->kernel_tcp);
if (UNLIKELY (len >= (int) sizeof (fmt_buf)))
return crun_make_error (err, 0, "internal error: static buffer too small");
ret = write_cgroup_file (dirfd, "memory.kmem.tcp.limit_in_bytes", fmt_buf, len, err);
if (UNLIKELY (ret < 0))
return ret;
@ -880,7 +980,9 @@ write_memory_resources (int dirfd, bool cgroup2, runtime_spec_schema_config_linu
if (cgroup2)
return crun_make_error (err, 0, "cannot set memory swappiness with cgroupv2");
len = sprintf (fmt_buf, "%" PRIu64, memory->swappiness);
len = snprintf (fmt_buf, sizeof (fmt_buf), "%" PRIu64, memory->swappiness);
if (UNLIKELY (len >= (int) sizeof (fmt_buf)))
return crun_make_error (err, 0, "internal error: static buffer too small");
ret = write_cgroup_file (dirfd, "memory.swappiness", fmt_buf, len, err);
if (UNLIKELY (ret < 0))
return ret;
@ -894,12 +996,14 @@ write_cpu_burst (int cpu_dirfd, bool cgroup2, runtime_spec_schema_config_linux_r
libcrun_error_t *err)
{
char fmt_buf[32];
size_t len;
int len;
if (! cpu->burst_present)
return 0;
len = sprintf (fmt_buf, "%" PRIi64, cpu->burst);
len = snprintf (fmt_buf, sizeof (fmt_buf), "%" PRIi64, cpu->burst);
if (UNLIKELY (len >= (int) sizeof (fmt_buf)))
return crun_make_error (err, 0, "internal error: static buffer too small");
return write_cgroup_file (cpu_dirfd, cgroup2 ? "cpu.max.burst" : "cpu.cfs_burst_us", fmt_buf, len, err);
}
@ -910,10 +1014,13 @@ write_pids_resources (int dirfd, bool cgroup2, runtime_spec_schema_config_linux_
if (pids->limit)
{
char fmt_buf[32];
size_t len;
int len;
int ret;
len = cg_itoa (fmt_buf, pids->limit, true);
len = cg_itoa (fmt_buf, sizeof (fmt_buf), pids->limit, true, err);
if (UNLIKELY (len < 0))
return len;
ret = write_file_and_check_controllers_at (cgroup2, dirfd, "pids.max", NULL, fmt_buf, len, err);
if (UNLIKELY (ret < 0))
return ret;
@ -926,7 +1033,7 @@ static int
write_cpu_resources (int dirfd_cpu, bool cgroup2, runtime_spec_schema_config_linux_resources_cpu *cpu,
libcrun_error_t *err)
{
size_t len, period_len;
int len, period_len;
int ret;
char fmt_buf[64];
int64_t period = -1;
@ -940,7 +1047,9 @@ write_cpu_resources (int dirfd_cpu, bool cgroup2, runtime_spec_schema_config_lin
if (cgroup2)
val = convert_shares_to_weight (val);
len = sprintf (fmt_buf, "%u", val);
len = snprintf (fmt_buf, sizeof (fmt_buf), "%u", val);
if (UNLIKELY (len >= (int) sizeof (fmt_buf)))
return crun_make_error (err, 0, "internal error: static buffer too small");
ret = write_file_and_check_controllers_at (cgroup2, dirfd_cpu, cgroup2 ? "cpu.weight" : "cpu.shares",
NULL, fmt_buf, len, err);
@ -953,7 +1062,9 @@ write_cpu_resources (int dirfd_cpu, bool cgroup2, runtime_spec_schema_config_lin
period = cpu->period;
else
{
len = sprintf (fmt_buf, "%" PRIu64, cpu->period);
len = snprintf (fmt_buf, sizeof (fmt_buf), "%" PRIu64, cpu->period);
if (UNLIKELY (len >= (int) sizeof (fmt_buf)))
return crun_make_error (err, 0, "internal error: static buffer too small");
ret = write_cgroup_file (dirfd_cpu, "cpu.cfs_period_us", fmt_buf, len, err);
if (UNLIKELY (ret < 0))
{
@ -978,7 +1089,9 @@ write_cpu_resources (int dirfd_cpu, bool cgroup2, runtime_spec_schema_config_lin
quota = cpu->quota;
else
{
len = sprintf (fmt_buf, "%" PRIi64, cpu->quota);
len = snprintf (fmt_buf, sizeof (fmt_buf), "%" PRIi64, cpu->quota);
if (UNLIKELY (len >= (int) sizeof (fmt_buf)))
return crun_make_error (err, 0, "internal error: static buffer too small");
ret = write_cgroup_file (dirfd_cpu, "cpu.cfs_quota_us", fmt_buf, len, err);
if (UNLIKELY (ret < 0))
return ret;
@ -994,7 +1107,9 @@ write_cpu_resources (int dirfd_cpu, bool cgroup2, runtime_spec_schema_config_lin
{
if (cgroup2)
return crun_make_error (err, 0, "realtime period not supported on cgroupv2");
len = sprintf (fmt_buf, "%" PRIu64, cpu->realtime_period);
len = snprintf (fmt_buf, sizeof (fmt_buf), "%" PRIu64, cpu->realtime_period);
if (UNLIKELY (len >= (int) sizeof (fmt_buf)))
return crun_make_error (err, 0, "internal error: static buffer too small");
ret = write_cgroup_file (dirfd_cpu, "cpu.rt_period_us", fmt_buf, len, err);
if (UNLIKELY (ret < 0))
return ret;
@ -1003,14 +1118,18 @@ write_cpu_resources (int dirfd_cpu, bool cgroup2, runtime_spec_schema_config_lin
{
if (cgroup2)
return crun_make_error (err, 0, "realtime runtime not supported on cgroupv2");
len = sprintf (fmt_buf, "%" PRIu64, cpu->realtime_runtime);
len = snprintf (fmt_buf, sizeof (fmt_buf), "%" PRIu64, cpu->realtime_runtime);
if (UNLIKELY (len >= (int) sizeof (fmt_buf)))
return crun_make_error (err, 0, "internal error: static buffer too small");
ret = write_cgroup_file (dirfd_cpu, "cpu.rt_runtime_us", fmt_buf, len, err);
if (UNLIKELY (ret < 0))
return ret;
}
if (cpu->idle_present)
{
len = sprintf (fmt_buf, "%" PRIi64, cpu->idle);
len = snprintf (fmt_buf, sizeof (fmt_buf), "%" PRIi64, cpu->idle);
if (UNLIKELY (len >= (int) sizeof (fmt_buf)))
return crun_make_error (err, 0, "internal error: static buffer too small");
ret = write_cgroup_file (dirfd_cpu, "cpu.idle", fmt_buf, len, err);
if (UNLIKELY (ret < 0))
return ret;
@ -1021,9 +1140,13 @@ write_cpu_resources (int dirfd_cpu, bool cgroup2, runtime_spec_schema_config_lin
if (period < 0)
period = 100000;
if (quota < 0)
len = sprintf (fmt_buf, "max %" PRIi64, period);
len = snprintf (fmt_buf, sizeof (fmt_buf), "max %" PRIi64, period);
else
len = sprintf (fmt_buf, "%" PRIi64 " %" PRIi64, quota, period);
len = snprintf (fmt_buf, sizeof (fmt_buf), "%" PRIi64 " %" PRIi64, quota, period);
if (UNLIKELY (len >= (int) sizeof (fmt_buf)))
return crun_make_error (err, 0, "internal error: static buffer too small");
ret = write_file_and_check_controllers_at (cgroup2, dirfd_cpu, "cpu.max", NULL, fmt_buf, len, err);
if (UNLIKELY (ret < 0))
return ret;
@ -1073,7 +1196,7 @@ update_cgroup_v1_resources (runtime_spec_schema_config_linux_resources *resource
if (UNLIKELY (ret < 0))
return ret;
dirfd_blkio = open (path_to_blkio, O_DIRECTORY | O_RDONLY | O_CLOEXEC);
dirfd_blkio = open (path_to_blkio, O_DIRECTORY | O_PATH | O_CLOEXEC);
if (UNLIKELY (dirfd_blkio < 0))
return crun_make_error (err, errno, "open `%s`", path_to_blkio);
@ -1098,11 +1221,11 @@ update_cgroup_v1_resources (runtime_spec_schema_config_linux_resources *resource
if (UNLIKELY (ret < 0))
return ret;
dirfd_netclass = open (path_to_netclass, O_DIRECTORY | O_RDONLY | O_CLOEXEC);
dirfd_netclass = open (path_to_netclass, O_DIRECTORY | O_PATH | O_CLOEXEC);
if (UNLIKELY (dirfd_netclass < 0))
return crun_make_error (err, errno, "open `%s`", path_to_netclass);
dirfd_netprio = open (path_to_netprio, O_DIRECTORY | O_RDONLY | O_CLOEXEC);
dirfd_netprio = open (path_to_netprio, O_DIRECTORY | O_PATH | O_CLOEXEC);
if (UNLIKELY (dirfd_netprio < 0))
return crun_make_error (err, errno, "open `%s`", path_to_netprio);
@ -1119,7 +1242,7 @@ update_cgroup_v1_resources (runtime_spec_schema_config_linux_resources *resource
ret = append_paths (&path_to_htlb, err, CGROUP_ROOT "/hugetlb", path, NULL);
if (UNLIKELY (ret < 0))
return ret;
dirfd_htlb = open (path_to_htlb, O_DIRECTORY | O_RDONLY | O_CLOEXEC);
dirfd_htlb = open (path_to_htlb, O_DIRECTORY | O_PATH | O_CLOEXEC);
if (UNLIKELY (dirfd_htlb < 0))
return crun_make_error (err, errno, "open `%s`", path_to_htlb);
@ -1156,7 +1279,7 @@ update_cgroup_v1_resources (runtime_spec_schema_config_linux_resources *resource
if (UNLIKELY (ret < 0))
return ret;
dirfd_mem = open (path_to_mem, O_DIRECTORY | O_RDONLY | O_CLOEXEC);
dirfd_mem = open (path_to_mem, O_DIRECTORY | O_PATH | O_CLOEXEC);
if (UNLIKELY (dirfd_mem < 0))
return crun_make_error (err, errno, "open `%s`", path_to_mem);
@ -1174,7 +1297,7 @@ update_cgroup_v1_resources (runtime_spec_schema_config_linux_resources *resource
if (UNLIKELY (ret < 0))
return ret;
dirfd_pid = open (path_to_pid, O_DIRECTORY | O_RDONLY | O_CLOEXEC);
dirfd_pid = open (path_to_pid, O_DIRECTORY | O_PATH | O_CLOEXEC);
if (UNLIKELY (dirfd_pid < 0))
return crun_make_error (err, errno, "open `%s`", path_to_pid);
@ -1194,7 +1317,7 @@ update_cgroup_v1_resources (runtime_spec_schema_config_linux_resources *resource
if (UNLIKELY (ret < 0))
return ret;
dirfd_cpu = open (path_to_cpu, O_DIRECTORY | O_RDONLY | O_CLOEXEC);
dirfd_cpu = open (path_to_cpu, O_DIRECTORY | O_PATH | O_CLOEXEC);
if (UNLIKELY (dirfd_cpu < 0))
return crun_make_error (err, errno, "open `%s`", path_to_cpu);
ret = write_cpu_resources (dirfd_cpu, false, resources->cpu, err);
@ -1208,7 +1331,7 @@ update_cgroup_v1_resources (runtime_spec_schema_config_linux_resources *resource
if (UNLIKELY (ret < 0))
return ret;
dirfd_cpuset = open (path_to_cpuset, O_DIRECTORY | O_RDONLY | O_CLOEXEC);
dirfd_cpuset = open (path_to_cpuset, O_DIRECTORY | O_PATH | O_CLOEXEC);
if (UNLIKELY (dirfd_cpuset < 0))
return crun_make_error (err, errno, "open `%s`", path_to_cpuset);
@ -1231,23 +1354,36 @@ write_unified_resources (int cgroup_dirfd, runtime_spec_schema_config_linux_reso
for (i = 0; i < resources->unified->len; i++)
{
size_t len;
cleanup_close int fd = -1;
cleanup_free char *value = NULL;
char *saveptr = NULL;
char *line;
if (strchr (resources->unified->keys[i], '/'))
return crun_make_error (err, 0, "key `%s` must be a file name without any slash", resources->unified->keys[i]);
len = strlen (resources->unified->values[i]);
ret = write_file_and_check_controllers_at (true, cgroup_dirfd, resources->unified->keys[i],
NULL, resources->unified->values[i], len, err);
if (UNLIKELY (ret < 0))
return ret;
if (is_empty_string (resources->unified->values[i]))
continue;
value = xstrdup (resources->unified->values[i]);
fd = open_file_and_check_controllers_at (true, cgroup_dirfd, resources->unified->keys[i], O_WRONLY, err);
if (UNLIKELY (fd < 0))
return fd;
for (line = strtok_r (value, "\n", &saveptr); line; line = strtok_r (NULL, "\n", &saveptr))
{
ret = TEMP_FAILURE_RETRY (write (fd, line, strlen (line)));
if (UNLIKELY (ret < 0))
return crun_make_error (err, errno, "write to `%s`", resources->unified->keys[i]);
}
}
return 0;
}
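The reworked loop above writes a multi-line unified value one line at a time. A minimal sketch of just the splitting step follows; `split_lines` is a hypothetical helper for illustration, and it uses plain `strdup` where crun uses `xstrdup`:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not a crun function): splits VALUE on '\n' the
   same way the loop above does and stores up to MAX_LINES pointers into
   a private copy.  Returns the number of lines stored, or -1 if the copy
   cannot be allocated.  The caller frees *COPY.  */
static int
split_lines (const char *value, char **copy, char **lines, int max_lines)
{
  char *saveptr = NULL;
  char *line;
  int n = 0;

  assert (value != NULL && copy != NULL && lines != NULL);

  /* strtok_r modifies its input, hence the copy.  */
  *copy = strdup (value);
  if (*copy == NULL)
    return -1;

  /* Consecutive delimiters are collapsed, so empty lines are skipped.  */
  for (line = strtok_r (*copy, "\n", &saveptr); line && n < max_lines;
       line = strtok_r (NULL, "\n", &saveptr))
    lines[n++] = line;

  return n;
}
```

Because `strtok_r` treats runs of `\n` as a single delimiter, blank lines in the value never reach the `write` call.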
static int
update_cgroup_v2_resources (runtime_spec_schema_config_linux_resources *resources, const char *path, libcrun_error_t *err)
update_cgroup_v2_resources (runtime_spec_schema_config_linux_resources *resources, const char *path, bool need_bpf_dev, libcrun_error_t *err)
{
cleanup_free char *cgroup_path = NULL;
cleanup_close int cgroup_dirfd = -1;
@ -1264,7 +1400,7 @@ update_cgroup_v2_resources (runtime_spec_schema_config_linux_resources *resource
if (UNLIKELY (cgroup_dirfd < 0))
return crun_make_error (err, errno, "open `%s`", cgroup_path);
if (resources->devices_len)
if (need_bpf_dev && resources->devices_len)
{
ret = write_devices_resources (cgroup_dirfd, true, resources->devices, resources->devices_len, err);
if (UNLIKELY (ret < 0))
@ -1321,11 +1457,15 @@ update_cgroup_v2_resources (runtime_spec_schema_config_linux_resources *resource
int
update_cgroup_resources (const char *path,
const char *state_root,
runtime_spec_schema_config_linux_resources *resources,
bool need_bpf_dev,
libcrun_error_t *err)
{
int cgroup_mode;
(void) state_root;
cgroup_mode = libcrun_get_cgroup_mode (err);
if (UNLIKELY (cgroup_mode < 0))
return cgroup_mode;
@ -1355,7 +1495,7 @@ update_cgroup_resources (const char *path,
switch (cgroup_mode)
{
case CGROUP_MODE_UNIFIED:
return update_cgroup_v2_resources (resources, path, err);
return update_cgroup_v2_resources (resources, path, need_bpf_dev, err);
case CGROUP_MODE_LEGACY:
case CGROUP_MODE_HYBRID:
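Most hunks in this file apply the same pattern: replace `sprintf` with `snprintf` and treat a return value greater than or equal to the buffer size as an error. A minimal standalone sketch of that pattern, modeled on the reworked `cg_itoa`, is shown here; `format_limit` is a hypothetical name, and it returns -1 where crun reports the failure through `crun_make_error`:

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the reworked cg_itoa: returns the number of
   characters written, or -1 when the text does not fit in BUF.  */
static int
format_limit (char *buf, size_t size, int64_t value, bool use_max)
{
  int len;

  assert (buf != NULL);

  if (use_max && value < 0)
    len = snprintf (buf, size, "max");
  else
    len = snprintf (buf, size, "%" PRIi64, value);

  /* snprintf returns the length the full output needs, excluding the
     NUL terminator, so len >= size means the result was truncated.  */
  if (len < 0 || (size_t) len >= size)
    return -1;

  return len;
}
```

Returning the length as an `int` rather than `size_t` is what lets callers test for a negative result, which matches the `size_t len` to `int len` changes throughout the diff.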

@ -22,8 +22,22 @@
#include "cgroup.h"
#include <unistd.h>
struct default_dev_s
{
char type;
int major;
int minor;
const char *access;
};
struct default_dev_s *get_default_devices ();
int update_cgroup_resources (const char *path,
const char *state_root,
runtime_spec_schema_config_linux_resources *resources,
bool need_devices,
libcrun_error_t *err);
struct bpf_program *create_dev_bpf (runtime_spec_schema_defs_linux_device_cgroup **devs, size_t devs_len,
libcrun_error_t *err);
#endif
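The `default_dev_s` entries use -1 as a wildcard for the major or minor number, which the cgroup v1 path turns into `*` when it builds a `devices.allow` line. A minimal sketch of that formatting, assuming a hypothetical `struct dev_rule` and helper `format_dev_rule` that mirror the struct above but are not crun code:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical mirror of struct default_dev_s from the header above.  */
struct dev_rule
{
  char type;
  int major;
  int minor;
  const char *access;
};

/* Formats one number, using "*" for the -1 wildcard.  A 16-byte buffer
   is large enough for any int printed with %d.  */
static void
format_dev_number (char *buf, size_t size, int n)
{
  if (n == -1)
    strcpy (buf, "*");
  else
    snprintf (buf, size, "%d", n);
}

/* Builds a line such as "c 136:* rwm".  Returns its length, or -1 if it
   does not fit in BUF.  */
static int
format_dev_rule (char *buf, size_t size, const struct dev_rule *rule)
{
  char major[16];
  char minor[16];
  int len;

  assert (buf != NULL && rule != NULL && rule->access != NULL);

  format_dev_number (major, sizeof (major), rule->major);
  format_dev_number (minor, sizeof (minor), rule->minor);

  len = snprintf (buf, size, "%c %s:%s %s", rule->type, major, minor, rule->access);
  if (len < 0 || (size_t) len >= size)
    return -1;

  return len;
}
```

With this layout the same table can feed both the v1 text format and the eBPF program builder, which takes the numeric fields directly.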

@ -50,7 +50,7 @@ initialize_cpuset_subsystem_rec (char *path, size_t path_len, char *cpus, char *
bool has_cpus = false, has_mems = false;
int b_len;
dirfd = open (path, O_DIRECTORY | O_RDONLY | O_CLOEXEC);
dirfd = open (path, O_DIRECTORY | O_PATH | O_CLOEXEC);
if (UNLIKELY (dirfd < 0))
return crun_make_error (err, errno, "open `%s`", path);
@ -174,7 +174,7 @@ initialize_memory_subsystem (const char *path, libcrun_error_t *err)
cleanup_close int dirfd = -1;
int i;
dirfd = open (path, O_DIRECTORY | O_RDONLY | O_CLOEXEC);
dirfd = open (path, O_DIRECTORY | O_PATH | O_CLOEXEC);
if (UNLIKELY (dirfd < 0))
return crun_make_error (err, errno, "open `%s`", path);
@ -200,7 +200,7 @@ enter_cgroup_subsystem (pid_t pid, const char *subsystem, const char *path, bool
cleanup_free char *cgroup_path = NULL;
int ret;
ret = append_paths (&cgroup_path, err, CGROUP_ROOT, subsystem ? subsystem : "", path ? path : "", NULL);
ret = append_paths (&cgroup_path, err, CGROUP_ROOT, subsystem, path ? path : "", NULL);
if (UNLIKELY (ret < 0))
return ret;
@ -209,10 +209,10 @@ enter_cgroup_subsystem (pid_t pid, const char *subsystem, const char *path, bool
ret = crun_ensure_directory (cgroup_path, 0755, false, err);
if (UNLIKELY (ret < 0))
{
crun_error_release (err);
if (errno != EROFS)
return crun_make_error (err, errno, "creating cgroup directory `%s`", cgroup_path);
crun_error_release (err);
return 0;
}
@ -297,8 +297,13 @@ read_unified_cgroup_pid (pid_t pid, char **path, libcrun_error_t *err)
char cgroup_path[32];
char *from, *to;
cleanup_free char *content = NULL;
int len;
sprintf (cgroup_path, "/proc/%d/cgroup", pid);
len = snprintf (cgroup_path, sizeof (cgroup_path), "/proc/%d/cgroup", pid);
if (UNLIKELY (len >= (int) sizeof (cgroup_path)))
return crun_make_error (err, 0, "internal error: static buffer too small");
cgroup_path[len] = '\0';
ret = read_all_file (cgroup_path, &content, NULL, err);
if (UNLIKELY (ret < 0))
@ -326,14 +331,11 @@ enter_cgroup_v1 (pid_t pid, const char *path, bool create_if_missing, libcrun_er
bool entered_any = false;
size_t content_size;
char *controller;
char pid_str[16];
char *saveptr;
bool has_data;
int rootless;
int ret;
sprintf (pid_str, "%d", pid);
rootless = is_rootless (err);
if (UNLIKELY (rootless < 0))
return rootless;
@ -355,6 +357,7 @@ enter_cgroup_v1 (pid_t pid, const char *path, bool create_if_missing, libcrun_er
{
char subsystem_path[64];
char *subsystem;
size_t len;
if (has_prefix (controller, "name="))
controller += 5;
@ -366,7 +369,10 @@ enter_cgroup_v1 (pid_t pid, const char *path, bool create_if_missing, libcrun_er
if (strcmp (subsystem, "cpuacct,cpu") == 0)
subsystem = "cpu,cpuacct";
snprintf (subsystem_path, sizeof (subsystem_path), CGROUP_ROOT "/%s", subsystem);
len = snprintf (subsystem_path, sizeof (subsystem_path), CGROUP_ROOT "/%s", subsystem);
if (UNLIKELY (len >= (int) sizeof (subsystem_path)))
return crun_make_error (err, 0, "internal error: static buffer too small");
ret = crun_path_exists (subsystem_path, err);
if (UNLIKELY (ret < 0))
return ret;
@ -400,9 +406,12 @@ enter_cgroup_v2 (pid_t pid, pid_t init_pid, const char *path, bool create_if_mis
cleanup_free char *cgroup_path = NULL;
char pid_str[16];
int repeat;
int len;
int ret;
sprintf (pid_str, "%d", pid);
len = snprintf (pid_str, sizeof (pid_str), "%d", pid);
if (UNLIKELY (len >= (int) sizeof (pid_str)))
return crun_make_error (err, 0, "internal error: static buffer too small");
ret = append_paths (&cgroup_path, err, CGROUP_ROOT, path, NULL);
if (UNLIKELY (ret < 0))
@ -506,20 +515,5 @@ enter_cgroup (int cgroup_mode, pid_t pid, pid_t init_pid, const char *path,
if (UNLIKELY (ret < 0))
return ret;
}
/* Reset the inherited cpu affinity. Old kernels do that automatically, but
new kernels remember the affinity that was set before the cgroup move.
This is undesirable, because it inherits the systemd affinity when the container
should really move to the container space cpus.
The sched_setaffinity call will always return an error (EINVAL or ENODEV)
when used like this. This is expected and part of the backward compatibility.
See: https://issues.redhat.com/browse/OCPBUGS-15102 */
ret = sched_setaffinity (pid, 0, NULL);
if (LIKELY (ret < 0))
{
if (UNLIKELY (! ((errno == EINVAL) || (errno == ENODEV))))
return crun_make_error (err, errno, "failed to reset affinity");
}
return 0;
}

File diff suppressed because it is too large

@ -24,8 +24,6 @@
#ifdef HAVE_SYSTEMD
extern int parse_sd_array (char *s, char **out, char **next, libcrun_error_t *err);
extern int cpuset_string_to_bitmask (const char *str, char **out, size_t *out_size, libcrun_error_t *err);
extern char *get_cgroup_scope_path (const char *cgroup_path, const char *scope);
#endif

@ -130,7 +130,9 @@ move_process_to_cgroup (pid_t pid, const char *subsystem, const char *path, libc
if (UNLIKELY (ret < 0))
return ret;
sprintf (pid_str, "%d", pid);
ret = snprintf (pid_str, sizeof (pid_str), "%d", pid);
if (UNLIKELY (ret >= (int) sizeof (pid_str)))
return crun_make_error (err, 0, "internal error: static buffer too small");
ret = write_file (cgroup_path_procs, pid_str, strlen (pid_str), err);
if (UNLIKELY (ret < 0))
@ -161,35 +163,6 @@ move_process_to_cgroup (pid_t pid, const char *subsystem, const char *path, libc
return ret;
}
int
libcrun_get_current_unified_cgroup (char **path, bool absolute, libcrun_error_t *err)
{
cleanup_free char *content = NULL;
size_t content_size;
char *from, *to;
int ret;
ret = read_all_file (PROC_SELF_CGROUP, &content, &content_size, err);
if (UNLIKELY (ret < 0))
return ret;
from = strstr (content, "0::");
if (UNLIKELY (from == NULL))
return crun_make_error (err, 0, "cannot find cgroup2 for the current process");
from += 3;
to = strchr (from, '\n');
if (UNLIKELY (to == NULL))
return crun_make_error (err, 0, "cannot parse `%s`", PROC_SELF_CGROUP);
*to = '\0';
if (absolute)
return append_paths (path, err, CGROUP_ROOT, from, NULL);
*path = xstrdup (from);
return 0;
}
#ifndef CGROUP2_SUPER_MAGIC
# define CGROUP2_SUPER_MAGIC 0x63677270
#endif
@ -237,10 +210,65 @@ libcrun_get_cgroup_mode (libcrun_error_t *err)
return cgroup_mode;
}
int
libcrun_get_cgroup_process (pid_t pid, char **path, bool absolute, libcrun_error_t *err)
{
cleanup_free char *content = NULL;
char proc_cgroup_file[64];
char *cg_path = NULL;
size_t content_size;
char *controller;
char *saveptr;
int cgroup_mode;
bool has_data;
int ret;
cgroup_mode = libcrun_get_cgroup_mode (err);
if (UNLIKELY (cgroup_mode < 0))
return cgroup_mode;
if (pid == 0)
strcpy (proc_cgroup_file, PROC_SELF_CGROUP);
else
{
int len = snprintf (proc_cgroup_file, sizeof (proc_cgroup_file), "/proc/%d/cgroup", pid);
if (UNLIKELY (len >= (int) sizeof (proc_cgroup_file)))
return crun_make_error (err, 0, "internal error: static buffer too small");
}
ret = read_all_file (proc_cgroup_file, &content, &content_size, err);
if (UNLIKELY (ret < 0))
return ret;
for (has_data = read_proc_cgroup (content, &saveptr, NULL, &controller, &cg_path);
has_data;
has_data = read_proc_cgroup (NULL, &saveptr, NULL, &controller, &cg_path))
{
if (cgroup_mode == CGROUP_MODE_UNIFIED)
{
if (strcmp (controller, "") == 0 && strlen (cg_path) > 0)
goto found;
}
else
{
if (strcmp (controller, "memory"))
goto found;
}
}
return crun_make_error (err, 0, "cannot find cgroup for the process");
found:
if (absolute)
return append_paths (path, err, CGROUP_ROOT, cg_path, NULL);
*path = xstrdup (cg_path);
return 0;
}
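Each line of `/proc/<pid>/cgroup` has the form `hierarchy-ID:controller-list:path`, and the cgroup v2 entry is the one that starts with `0::`. As a minimal sketch of the unified-mode lookup, assuming a hypothetical helper `find_unified_path` that copies into a caller-supplied buffer (crun instead goes through `read_proc_cgroup` and `xstrdup`):

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper (not a crun function): scans the text of a
   /proc/<pid>/cgroup file for the "0::<path>" entry and copies <path>
   to OUT.  Returns the path length, or -1 if there is no such entry,
   the path is empty, or it does not fit in OUT.  */
static int
find_unified_path (const char *content, char *out, size_t out_size)
{
  const char *line = content;

  assert (content != NULL && out != NULL);

  while (line != NULL && *line != '\0')
    {
      const char *eol = strchr (line, '\n');
      size_t line_len = eol ? (size_t) (eol - line) : strlen (line);

      /* Match only at the start of a line so that an entry such as
         "10::..." is not mistaken for the v2 hierarchy.  */
      if (line_len > 3 && strncmp (line, "0::", 3) == 0)
        {
          size_t path_len = line_len - 3;

          if (path_len >= out_size)
            return -1;
          memcpy (out, line + 3, path_len);
          out[path_len] = '\0';
          return (int) path_len;
        }

      line = eol ? eol + 1 : NULL;
    }

  return -1;
}
```

The sketch covers only the unified case; the function in the diff also handles legacy and hybrid modes by looking at the controller field.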
static int
read_pids_cgroup (int dfd, bool recurse, pid_t **pids, size_t *n_pids, size_t *allocated, libcrun_error_t *err)
{
__attribute__ ((unused)) cleanup_close int clean_dfd = dfd;
cleanup_close int tasksfd = -1;
cleanup_free char *buffer = NULL;
char *saveptr = NULL;
@ -287,11 +315,11 @@ read_pids_cgroup (int dfd, bool recurse, pid_t **pids, size_t *n_pids, size_t *a
if (UNLIKELY (dir == NULL))
return crun_make_error (err, errno, "open cgroup sub-directory");
/* Now dir owns the dfd descriptor. */
clean_dfd = -1;
dfd = -1;
for (de = readdir (dir); de; de = readdir (dir))
{
int nfd;
cleanup_close int nfd = -1;
if (strcmp (de->d_name, ".") == 0 || strcmp (de->d_name, "..") == 0)
continue;
@ -346,22 +374,16 @@ rmdir_all_fd (int dfd)
size_t i, n_pids = 0, allocated = 0;
cleanup_close int child_dfd = -1;
int tmp;
int child_dfd_clone;
child_dfd = openat (dfd, name, O_DIRECTORY | O_CLOEXEC);
if (child_dfd < 0)
return child_dfd;
/* read_pids_cgroup takes ownership for the fd, so dup it. */
child_dfd_clone = dup (child_dfd);
if (LIKELY (child_dfd_clone >= 0))
ret = read_pids_cgroup (child_dfd, true, &pids, &n_pids, &allocated, &tmp_err);
if (UNLIKELY (ret < 0))
{
ret = read_pids_cgroup (child_dfd_clone, true, &pids, &n_pids, &allocated, &tmp_err);
if (UNLIKELY (ret < 0))
{
crun_error_release (&tmp_err);
continue;
}
crun_error_release (&tmp_err);
continue;
}
for (i = 0; i < n_pids; i++)
@ -394,8 +416,8 @@ int
libcrun_cgroup_read_pids_from_path (const char *path, bool recurse, pid_t **pids, libcrun_error_t *err)
{
cleanup_free char *cgroup_path = NULL;
cleanup_close int dirfd = -1;
size_t n_pids, allocated;
int dirfd;
int mode;
int ret;
@ -681,7 +703,7 @@ cgroup_killall_path (const char *path, int signal, libcrun_error_t *err)
if (UNLIKELY (ret < 0))
return ret;
ret = write_file_with_flags (kill_file, 0, "1", 1, err);
ret = write_file_at_with_flags (AT_FDCWD, 0, 0700, kill_file, "1", 1, err);
if (ret >= 0)
return 0;
@ -732,7 +754,7 @@ read_available_controllers (const char *path, libcrun_error_t *err)
ret = read_all_file (controllers, &buf, NULL, err);
if (UNLIKELY (ret < 0))
return crun_make_error (err, errno, "error reading from file `%s`", controllers);
return ret;
for (token = strtok_r (buf, " \n", &saveptr); token; token = strtok_r (NULL, " \n", &saveptr))
{
@ -748,6 +770,8 @@ read_available_controllers (const char *path, libcrun_error_t *err)
available |= CGROUP_PIDS;
else if (strcmp (token, "io") == 0)
available |= CGROUP_IO;
else if (strcmp (token, "misc") == 0)
available |= CGROUP_MISC;
}
return available;
}
@ -761,10 +785,11 @@ write_controller_file (const char *path, int controllers_to_enable, libcrun_erro
int ret;
controllers_len = xasprintf (
&controllers, "%s %s %s %s %s %s", (controllers_to_enable & CGROUP_CPU) ? "+cpu" : "",
&controllers, "%s %s %s %s %s %s %s", (controllers_to_enable & CGROUP_CPU) ? "+cpu" : "",
(controllers_to_enable & CGROUP_IO) ? "+io" : "", (controllers_to_enable & CGROUP_MEMORY) ? "+memory" : "",
(controllers_to_enable & CGROUP_PIDS) ? "+pids" : "", (controllers_to_enable & CGROUP_CPUSET) ? "+cpuset" : "",
(controllers_to_enable & CGROUP_HUGETLB) ? "+hugetlb" : "");
(controllers_to_enable & CGROUP_HUGETLB) ? "+hugetlb" : "",
(controllers_to_enable & CGROUP_MISC) ? "+misc" : "");
ret = append_paths (&subtree_control, err, CGROUP_ROOT, path, "cgroup.subtree_control", NULL);
if (UNLIKELY (ret < 0))
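
The hunk above adds `+misc` to the string written to `cgroup.subtree_control`. As a rough illustration of how a controller bitmask maps to that string, here is a minimal sketch. The `CTRL_*` values and `format_subtree_control` are hypothetical names for this example, not libcrun's `CGROUP_*` constants or API. Unlike the `xasprintf` call above, the sketch emits nothing at all for disabled controllers.

```c
#include <stdio.h>

/* Hypothetical bit values; libcrun defines its own CGROUP_* constants. */
enum
{
  CTRL_CPU = 1 << 0,
  CTRL_IO = 1 << 1,
  CTRL_MEMORY = 1 << 2,
  CTRL_PIDS = 1 << 3,
  CTRL_MISC = 1 << 4,
};

/* Write "+cpu +io ..." for every bit set in MASK.
   Returns the string length, or -1 if BUF is too small.  */
int
format_subtree_control (int mask, char *buf, size_t size)
{
  static const struct
  {
    int bit;
    const char *name;
  } table[] = {
    { CTRL_CPU, "cpu" },   { CTRL_IO, "io" },     { CTRL_MEMORY, "memory" },
    { CTRL_PIDS, "pids" }, { CTRL_MISC, "misc" },
  };
  size_t off = 0;
  size_t i;

  if (size == 0)
    return -1;
  buf[0] = '\0';

  for (i = 0; i < sizeof (table) / sizeof (table[0]); i++)
    {
      int n;

      if (! (mask & table[i].bit))
        continue;

      /* Separate entries with a single space, no leading space. */
      n = snprintf (buf + off, size - off, "%s+%s", off ? " " : "", table[i].name);
      if (n < 0 || (size_t) n >= size - off)
        return -1;
      off += (size_t) n;
    }
  return (int) off;
}
```

A caller would pass the mask of controllers to enable and write the resulting buffer to the `cgroup.subtree_control` file of the parent cgroup.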
@ -925,7 +950,7 @@ enable_controllers (const char *path, libcrun_error_t *err)
}
int
libcrun_move_process_to_cgroup (pid_t pid, pid_t init_pid, char *path, libcrun_error_t *err)
libcrun_move_process_to_cgroup (pid_t pid, pid_t init_pid, const char *path, bool create_if_missing, libcrun_error_t *err)
{
int cgroup_mode = libcrun_get_cgroup_mode (err);
if (UNLIKELY (cgroup_mode < 0))
@ -934,7 +959,7 @@ libcrun_move_process_to_cgroup (pid_t pid, pid_t init_pid, char *path, libcrun_e
if (path == NULL || *path == '\0')
return 0;
return enter_cgroup (cgroup_mode, pid, init_pid, path, false, err);
return enter_cgroup (cgroup_mode, pid, init_pid, path, create_if_missing, err);
}
int
@ -962,9 +987,82 @@ libcrun_get_cgroup_dirfd (struct libcrun_cgroup_status *status, const char *sub_
if (UNLIKELY (ret < 0))
return ret;
cgroupdirfd = open (path_to_cgroup, O_CLOEXEC | O_NOFOLLOW | O_DIRECTORY | O_RDONLY);
cgroupdirfd = open (path_to_cgroup, O_CLOEXEC | O_NOFOLLOW | O_DIRECTORY | O_PATH);
if (UNLIKELY (cgroupdirfd < 0))
return crun_make_error (err, errno, "open `%s`", path_to_cgroup);
return cgroupdirfd;
}
int
libcrun_migrate_all_pids_to_cgroup (pid_t init_pid, char *from, char *to, libcrun_error_t *err)
{
cleanup_free pid_t *pids = NULL;
cleanup_close int child_dfd = -1;
int cgroup_mode;
size_t from_len;
size_t i;
int ret;
cgroup_mode = libcrun_get_cgroup_mode (err);
if (cgroup_mode < 0)
return cgroup_mode;
ret = libcrun_cgroup_pause_unpause_path (from, true, err);
if (UNLIKELY (ret < 0))
return ret;
ret = libcrun_cgroup_read_pids_from_path (from, true, &pids, err);
if (UNLIKELY (ret < 0))
return ret;
from_len = strlen (from);
for (i = 0; pids && pids[i]; i++)
{
cleanup_free char *pid_path = NULL;
cleanup_free char *dest_cgroup = NULL;
ret = libcrun_get_cgroup_process (pids[i], &pid_path, false, err);
if (UNLIKELY (ret < 0))
return ret;
/* Make sure the pid is in the cgroup we are migrating from. */
if (! has_prefix (pid_path, from))
return crun_make_error (err, 0, "error migrating pid %d. It is not in the cgroup `%s`", pids[i], from);
/* Build the destination cgroup path, keeping the same hierarchy. */
xasprintf (&dest_cgroup, "%s%s", to, pid_path + from_len);
ret = enter_cgroup (cgroup_mode, pids[i], init_pid, dest_cgroup, false, err);
if (UNLIKELY (ret < 0))
return ret;
}
ret = libcrun_cgroup_pause_unpause_path (from, false, err);
if (UNLIKELY (ret < 0))
return ret;
return destroy_cgroup_path (from, cgroup_mode, err);
}
int
get_cgroup_dirfd_path (int dirfd, char **path, libcrun_error_t *err)
{
cleanup_free char *cgroup_path = NULL;
proc_fd_path_t fd_path;
ssize_t len;
get_proc_self_fd_path (fd_path, dirfd);
len = safe_readlinkat (AT_FDCWD, fd_path, &cgroup_path, 0, err);
if (UNLIKELY (len < 0))
return len;
if (has_prefix (cgroup_path, CGROUP_ROOT))
{
*path = xstrdup (cgroup_path + strlen (CGROUP_ROOT));
return 0;
}
return crun_make_error (err, 0, "invalid cgroup path `%s`", cgroup_path);
}
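
`libcrun_migrate_all_pids_to_cgroup` above rebuilds each destination path by replacing the `from` prefix with `to` and keeping the rest of the hierarchy. The following is a minimal standalone sketch of that path rewrite only. `remap_cgroup_path` is a hypothetical helper written for this example, and it uses plain `malloc`/`snprintf` instead of libcrun's `xasprintf` and `has_prefix`.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of libcrun.  Returns a newly allocated
   path where the leading FROM in PID_PATH is replaced with TO, or NULL
   if PID_PATH does not start with FROM or on allocation failure.
   The caller frees the result.  */
char *
remap_cgroup_path (const char *pid_path, const char *from, const char *to)
{
  size_t from_len = strlen (from);
  size_t len;
  char *dest;

  /* Same kind of plain prefix check as in the function above. */
  if (strncmp (pid_path, from, from_len) != 0)
    return NULL;

  len = strlen (to) + strlen (pid_path + from_len) + 1;
  dest = malloc (len);
  if (dest == NULL)
    return NULL;

  /* Keep whatever follows the FROM prefix, so sub-cgroups are preserved. */
  snprintf (dest, len, "%s%s", to, pid_path + from_len);
  return dest;
}
```

Because this is a plain string-prefix comparison, a sibling such as `/old/ctr2` also matches the prefix `/old/ctr`. Whether that matters depends on how the caller chooses `from`.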


@ -22,11 +22,11 @@
#include "cgroup.h"
#include <unistd.h>
int libcrun_move_process_to_cgroup (pid_t pid, pid_t init_pid, char *path, libcrun_error_t *err);
int libcrun_move_process_to_cgroup (pid_t pid, pid_t init_pid, const char *path, bool create_if_missing, libcrun_error_t *err);
int libcrun_cgroups_create_symlinks (int dirfd, libcrun_error_t *err);
int libcrun_get_current_unified_cgroup (char **path, bool absolute, libcrun_error_t *err);
int libcrun_get_cgroup_process (pid_t pid, char **path, bool absolute, libcrun_error_t *err);
int libcrun_get_cgroup_mode (libcrun_error_t *err);
@ -34,4 +34,10 @@ int libcrun_get_cgroup_dirfd (struct libcrun_cgroup_status *status, const char *
int maybe_make_cgroup_threaded (const char *path, libcrun_error_t *err);
int libcrun_migrate_all_pids_to_cgroup (pid_t init_pid, char *from, char *to, libcrun_error_t *err);
int destroy_cgroup_path (const char *path, int mode, libcrun_error_t *err);
int get_cgroup_dirfd_path (int dirfd, char **path, libcrun_error_t *err);
#endif


@ -37,6 +37,7 @@
#include <inttypes.h>
#include <time.h>
#include <linux/magic.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <fcntl.h>
@ -85,11 +86,11 @@ get_cgroup_manager (int manager, struct libcrun_cgroup_manager **out, libcrun_er
}
static const char *
find_delegate_cgroup (json_map_string_string *annotations)
find_delegate_cgroup (string_map *annotations)
{
const char *annotation;
annotation = find_annotation_map (annotations, "run.oci.delegate-cgroup");
annotation = find_string_map_value (annotations, "run.oci.delegate-cgroup");
if (annotation)
{
if (annotation[0] == '\0')
@ -100,37 +101,6 @@ find_delegate_cgroup (json_map_string_string *annotations)
return NULL;
}
static inline void
cleanup_sig_contp (void *p)
{
pid_t *pp = p;
if (*pp < 0)
return;
TEMP_FAILURE_RETRY (kill (*pp, SIGCONT));
}
static bool
must_stop_proc (runtime_spec_schema_config_linux_resources *resources)
{
size_t i;
if (resources == NULL)
return false;
if (resources->cpu && (resources->cpu->cpus || resources->cpu->mems))
return true;
if (resources->unified)
{
for (i = 0; i < resources->unified->len; i++)
if (has_prefix (resources->unified->keys[i], "cpuset."))
return true;
}
return false;
}
int
libcrun_cgroup_pause_unpause (struct libcrun_cgroup_status *status, const bool pause, libcrun_error_t *err)
{
@ -173,7 +143,41 @@ libcrun_cgroup_is_container_paused (struct libcrun_cgroup_status *status, bool *
ret = read_all_file (path, &content, NULL, err);
if (UNLIKELY (ret < 0))
return ret;
{
errno = crun_error_get_errno (err);
/* If the file is missing and we were checking for freezer.state
(so either cgroup v1 or hybrid), it may be the freezer is
simply disabled. In such case the container cannot be paused.
On cgroup v2 freezer is always there.
*/
if (errno != ENOENT || cgroup_mode == CGROUP_MODE_UNIFIED)
return ret;
/* Even with freezer disabled, its directory is still there. But
when it's disabled it has type tmpfs, while on systems with
freezer enabled, its type is cgroupfs. Use that to determine
whether freezer is enabled or not.
*/
struct statfs freezer_stat;
if (statfs (CGROUP_ROOT "/freezer", &freezer_stat))
{
crun_error_release (err);
return crun_make_error (err, errno, "error when using statfs on `%s`", CGROUP_ROOT "/freezer");
}
/* If the freezer is mounted as cgroupfs type, then missing
freezer.state file is an error and should be handled like before.
*/
if (freezer_stat.f_type == CGROUP_SUPER_MAGIC)
return ret;
/* When freezer dir is not mounted as cgroupfs, then it's
disabled, therefore container cannot be in paused state.
*/
crun_error_release (err);
*paused = false;
return 0;
}
*paused = strstr (content, state) != NULL;
return 0;
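
The comments in the hunk above describe telling a disabled cgroup v1 freezer apart from an enabled one by the filesystem type of its directory. A minimal sketch of that check follows, assuming a Linux system with `<linux/magic.h>` available. The names `fs_type_is_cgroup_v1` and `path_is_on_cgroupfs` are made up for this example and are not libcrun functions.

```c
#include <stdbool.h>
#include <sys/vfs.h>
#include <linux/magic.h>

/* Hypothetical helper: true if F_TYPE is the cgroup v1 filesystem magic. */
bool
fs_type_is_cgroup_v1 (long f_type)
{
  return f_type == CGROUP_SUPER_MAGIC;
}

/* Hypothetical helper: returns 1 if PATH lives on a cgroup v1 filesystem,
   0 if it lives on something else (for example tmpfs), and -1 if statfs
   fails, in which case errno is set by statfs.  */
int
path_is_on_cgroupfs (const char *path)
{
  struct statfs st;

  if (statfs (path, &st) < 0)
    return -1;

  return fs_type_is_cgroup_v1 (st.f_type) ? 1 : 0;
}
```

With helpers like these, a missing `freezer.state` under a directory for which `path_is_on_cgroupfs` returns 0 would be read as "freezer disabled, container not paused", while a return value of 1 would keep the original error.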
@ -206,6 +210,7 @@ libcrun_cgroup_destroy (struct libcrun_cgroup_status *cgroup_status, libcrun_err
int
libcrun_update_cgroup_resources (struct libcrun_cgroup_status *cgroup_status,
const char *state_root,
runtime_spec_schema_config_linux_resources *resources,
libcrun_error_t *err)
{
@ -218,11 +223,11 @@ libcrun_update_cgroup_resources (struct libcrun_cgroup_status *cgroup_status,
if (cgroup_manager->update_resources)
{
ret = cgroup_manager->update_resources (cgroup_status, resources, err);
ret = cgroup_manager->update_resources (cgroup_status, state_root, resources, err);
if (UNLIKELY (ret < 0))
return ret;
}
return update_cgroup_resources (cgroup_status->path, resources, err);
return update_cgroup_resources (cgroup_status->path, state_root, resources, ! cgroup_status->bpf_dev_set, err);
}
static int
@ -283,7 +288,6 @@ libcrun_cgroup_preenter (struct libcrun_cgroup_args *args, int *dirfd, libcrun_e
int
libcrun_cgroup_enter (struct libcrun_cgroup_args *args, struct libcrun_cgroup_status **out, libcrun_error_t *err)
{
__attribute__ ((unused)) pid_t sigcont_cleanup __attribute__ ((cleanup (cleanup_sig_contp))) = -1;
/* status will be filled by the cgroup manager. */
cleanup_cgroup_status struct libcrun_cgroup_status *status = xmalloc0 (sizeof *status);
struct libcrun_cgroup_manager *cgroup_manager;
@ -296,21 +300,6 @@ libcrun_cgroup_enter (struct libcrun_cgroup_args *args, struct libcrun_cgroup_st
if (UNLIKELY (cgroup_mode < 0))
return cgroup_mode;
/* If the cgroup configuration is limiting what CPUs/memory Nodes are available for the container,
then stop the container process during the cgroup configuration to avoid it being rescheduled on
a CPU that is not allowed. This extra step is required for setting up the sub cgroup with the
systemd driver. The alternative would be to temporarily setup the cpus/mems using d-bus.
*/
if (must_stop_proc (args->resources))
{
ret = TEMP_FAILURE_RETRY (kill (args->pid, SIGSTOP));
if (UNLIKELY (ret < 0))
return crun_make_error (err, errno, "cannot stop container process `%d` with SIGSTOP", args->pid);
/* Send SIGCONT as soon as the function exits. */
sigcont_cleanup = args->pid;
}
if (cgroup_mode == CGROUP_MODE_HYBRID)
{
/* We don't really support hybrid mode, so check that cgroups2 is not using any controller. */
@ -375,27 +364,11 @@ libcrun_cgroup_enter (struct libcrun_cgroup_args *args, struct libcrun_cgroup_st
if (args->resources)
{
ret = update_cgroup_resources (status->path, args->resources, err);
ret = update_cgroup_resources (status->path, args->state_root, args->resources, ! status->bpf_dev_set, err);
if (UNLIKELY (ret < 0))
return ret;
}
}
/* Reset the inherited cpu affinity. Old kernels do that automatically, but
new kernels remember the affinity that was set before the cgroup move.
This is undesirable, because it inherits the systemd affinity when the container
should really move to the container space cpus.
The sched_setaffinity call will always return an error (EINVAL or ENODEV)
when used like this. This is expected and part of the backward compatibility.
See: https://issues.redhat.com/browse/OCPBUGS-15102 */
ret = sched_setaffinity (args->pid, 0, NULL);
if (LIKELY (ret < 0))
{
if (UNLIKELY (! ((errno == EINVAL) || (errno == ENODEV))))
return crun_make_error (err, errno, "failed to reset affinity");
}
success:
*out = status;
status = NULL;


@ -19,6 +19,7 @@
#define CGROUP_H
#include "container.h"
#include "string_map.h"
#include <unistd.h>
#ifndef CGROUP_ROOT
@ -48,7 +49,7 @@ struct libcrun_cgroup_status;
struct libcrun_cgroup_args
{
runtime_spec_schema_config_linux_resources *resources;
json_map_string_string *annotations;
string_map *annotations;
const char *cgroup_path;
int manager;
pid_t pid;
@ -56,6 +57,8 @@ struct libcrun_cgroup_args
gid_t root_gid;
const char *id;
bool joined;
const char *state_root;
};
/* cgroup life-cycle management. */
@ -89,6 +92,7 @@ int libcrun_cgroup_has_oom (struct libcrun_cgroup_status *status, libcrun_error_
int libcrun_cgroup_read_pids (struct libcrun_cgroup_status *status, bool recurse, pid_t **pids, libcrun_error_t *err);
int libcrun_update_cgroup_resources (struct libcrun_cgroup_status *status,
const char *state_root,
runtime_spec_schema_config_linux_resources *resources,
libcrun_error_t *err);


@ -139,12 +139,8 @@ char *chroot_realpath(const char *chroot, const char *path, char resolved_path[]
return resolved_path;
}
/* EINVAL means the file exists but isn't a symlink. */
if (errno != EINVAL) {
/* Make sure it's null terminated. */
*new_path = '\0';
strcpy(resolved_path, got_path);
if (errno != EINVAL)
return NULL;
}
} else {
/* Note: readlink doesn't add the null byte. */
link_path[n] = '\0';


@ -59,6 +59,7 @@
#include <sys/syscall.h>
#include "utils.h"
#include "linux.h"
/* Use our own wrapper for memfd_create. */
#if !defined(SYS_memfd_create) && defined(__NR_memfd_create)
@ -366,6 +367,17 @@ static int seal_execfd(int *fd, int fdtype)
return -1;
}
static int try_bindfd_mount_api(void)
{
libcrun_error_t err;
int mountfd = get_bind_mount (-1, "/proc/self/exe", false, true, false, &err);
if (mountfd < 0) {
crun_error_release (&err);
return -1;
}
return mountfd;
}
static int try_bindfd(void)
{
mode_t mask;
@ -464,6 +476,13 @@ static int clone_binary(void)
* Before we resort to copying, let's try creating an ro-binfd in one shot
* by getting a handle for a read-only bind-mount of the execfd.
*/
execfd = try_bindfd_mount_api();
if (execfd >= 0) {
/* Transfer ownership to caller */
int ret_execfd = execfd;
execfd = -1;
return ret_execfd;
}
execfd = try_bindfd();
if (execfd >= 0) {
/* Transfer ownership to caller */

File diff suppressed because it is too large.


@ -22,6 +22,7 @@
#include <config.h>
#include <ocispec/runtime_spec_schema_config_schema.h>
#include "error.h"
#include "string_map.h"
enum handler_configure_phase
{
@ -88,6 +89,8 @@ struct libcrun_container_s
char *config_file;
char *config_file_content;
string_map *annotations;
void *private_data;
void (*cleanup_private_data) (void *private_data);
struct libcrun_context_s *context;
@ -136,6 +139,11 @@ struct intel_rdt_s
bool enabled;
};
struct net_devices_s
{
bool enabled;
};
struct mount_ext_info_s
{
struct idmap_info_s idmap;
@ -151,6 +159,7 @@ struct linux_info_s
struct selinux_info_s selinux;
struct mount_ext_info_s mount_ext;
struct intel_rdt_s intel_rdt;
struct net_devices_s net_devices;
};
struct annotations_info_s
@ -187,6 +196,9 @@ struct libcrun_checkpoint_restore_s
char *parent_path;
bool pre_dump;
int manage_cgroups_mode;
int network_lock_method;
char *lsm_profile;
char *lsm_mount_context;
};
typedef struct libcrun_checkpoint_restore_s libcrun_checkpoint_restore_t;
@ -265,6 +277,7 @@ struct libcrun_intel_rdt_update
{
const char *l3_cache_schema;
const char *mem_bw_schema;
char *const *schemata;
};
LIBCRUN_PUBLIC int libcrun_container_update_intel_rdt (libcrun_context_t *context, const char *id,
@ -289,6 +302,12 @@ LIBCRUN_PUBLIC int libcrun_container_read_pids (libcrun_context_t *context, cons
LIBCRUN_PUBLIC int libcrun_write_json_containers_list (libcrun_context_t *context, FILE *out, libcrun_error_t *err);
LIBCRUN_PUBLIC int libcrun_container_add_mounts_from_file (libcrun_context_t *context, const char *id, const char *file,
libcrun_error_t *err);
LIBCRUN_PUBLIC int libcrun_container_remove_mounts_from_file (libcrun_context_t *context, const char *id, const char *file,
libcrun_error_t *err);
// Not part of the public API, just a method in container.c we need to access from linux.c
void get_root_in_the_userns (runtime_spec_schema_config_schema *def, uid_t host_uid, gid_t host_gid,
uid_t *uid, gid_t *gid);


@ -36,7 +36,9 @@
# include "cgroup.h"
# include "cgroup-utils.h"
# include <dlfcn.h>
# ifndef STATIC
# include <dlfcn.h>
# endif
# define CRIU_CHECKPOINT_LOG_FILE "dump.log"
# define CRIU_RESTORE_LOG_FILE "restore.log"
@ -49,6 +51,9 @@
# define CLONE_NEWTIME 0x00000080 /* New time namespace */
# endif
/* Defined in chroot_realpath.c */
char *chroot_realpath (const char *chroot, const char *path, char resolved_path[]);
static const char *console_socket = NULL;
# define LIBCRIU_MIN_VERSION 31500
@ -79,16 +84,20 @@ struct libcriu_wrapper_s
void (*criu_set_leave_running) (bool leave_running);
void (*criu_set_manage_cgroups) (bool manage);
void (*criu_set_manage_cgroups_mode) (enum criu_cg_mode mode);
int (*criu_set_network_lock) (enum criu_network_lock_method method);
void (*criu_set_notify_cb) (int (*cb) (char *action, criu_notify_arg_t na));
void (*criu_set_orphan_pts_master) (bool orphan_pts_master);
void (*criu_set_images_dir_fd) (int fd);
int (*criu_set_parent_images) (const char *path);
void (*criu_set_pid) (int pid);
int (*criu_set_root) (const char *root);
int (*criu_add_cg_root) (const char *ctrl, const char *path);
void (*criu_set_shell_job) (bool shell_job);
void (*criu_set_tcp_established) (bool tcp_established);
void (*criu_set_track_mem) (bool track_mem);
void (*criu_set_work_dir_fd) (int fd);
int (*criu_set_lsm_profile) (const char *name);
int (*criu_set_lsm_mount_context) (const char *name);
};
static struct libcriu_wrapper_s *libcriu_wrapper;
@ -102,8 +111,10 @@ cleanup_wrapper (void *p)
if (*w == NULL)
return;
# ifndef STATIC
if ((*w)->handle)
dlclose ((*w)->handle);
# endif
free (*w);
libcriu_wrapper = NULL;
}
@ -115,20 +126,27 @@ load_wrapper (struct libcriu_wrapper_s **wrapper_out, libcrun_error_t *err)
{
cleanup_free struct libcriu_wrapper_s *wrapper = xmalloc0 (sizeof (*wrapper));
# define LOAD_CRIU_FUNCTION(X, ALLOW_NULL) \
do \
{ \
wrapper->X = dlsym (wrapper->handle, #X); \
if (! ALLOW_NULL && wrapper->X == NULL) \
{ \
dlclose (wrapper->handle); \
return crun_make_error (err, 0, "could not find symbol `%s` in `libcriu.so`", #X); \
} \
} while (0)
# ifdef STATIC
# define LOAD_CRIU_FUNCTION(X, ALLOW_NULL) \
wrapper->X = &X;
# else
# define LOAD_CRIU_FUNCTION(X, ALLOW_NULL) \
do \
{ \
wrapper->X = dlsym (wrapper->handle, #X); \
if (! ALLOW_NULL && wrapper->X == NULL) \
{ \
dlclose (wrapper->handle); \
return crun_make_error (err, 0, "could not find symbol `%s` in `libcriu.so`", #X); \
} \
} while (0)
# endif
# ifndef STATIC
wrapper->handle = dlopen ("libcriu.so.2", RTLD_NOW);
if (wrapper->handle == NULL)
return crun_make_error (err, 0, "could not load `libcriu.so.2`");
return crun_make_error (err, 0, "could not load `libcriu.so.2`: `%s`", dlerror ());
# endif
LOAD_CRIU_FUNCTION (criu_add_ext_mount, false);
LOAD_CRIU_FUNCTION (criu_add_external, false);
@ -161,15 +179,19 @@ load_wrapper (struct libcriu_wrapper_s **wrapper_out, libcrun_error_t *err)
LOAD_CRIU_FUNCTION (criu_set_log_level, false);
LOAD_CRIU_FUNCTION (criu_set_manage_cgroups, false);
LOAD_CRIU_FUNCTION (criu_set_manage_cgroups_mode, false);
LOAD_CRIU_FUNCTION (criu_set_network_lock, true);
LOAD_CRIU_FUNCTION (criu_set_notify_cb, false);
LOAD_CRIU_FUNCTION (criu_set_orphan_pts_master, false);
LOAD_CRIU_FUNCTION (criu_set_parent_images, false);
LOAD_CRIU_FUNCTION (criu_set_pid, false);
LOAD_CRIU_FUNCTION (criu_set_root, false);
LOAD_CRIU_FUNCTION (criu_add_cg_root, false);
LOAD_CRIU_FUNCTION (criu_set_shell_job, false);
LOAD_CRIU_FUNCTION (criu_set_tcp_established, false);
LOAD_CRIU_FUNCTION (criu_set_track_mem, false);
LOAD_CRIU_FUNCTION (criu_set_work_dir_fd, false);
LOAD_CRIU_FUNCTION (criu_set_lsm_profile, false);
LOAD_CRIU_FUNCTION (criu_set_lsm_mount_context, false);
libcriu_wrapper = *wrapper_out = wrapper;
wrapper = NULL;
@ -264,7 +286,8 @@ restore_cgroup_v1_mount (runtime_spec_schema_config_schema *def, libcrun_error_t
/* First check if there is actually a cgroup mount in the container. */
for (i = 0; i < def->mounts_len; i++)
{
if (strcmp (def->mounts[i]->type, "cgroup") == 0)
char *type = def->mounts[i]->type;
if (type && strcmp (type, "cgroup") == 0)
{
has_cgroup_mount = true;
break;
@ -313,7 +336,9 @@ restore_cgroup_v1_mount (runtime_spec_schema_config_schema *def, libcrun_error_t
if (UNLIKELY (ret < 0))
return ret;
libcriu_wrapper->criu_add_ext_mount (source, destination);
ret = libcriu_wrapper->criu_add_ext_mount (source, destination);
if (UNLIKELY (ret < 0))
return crun_make_error (err, -ret, "CRIU: failed adding external mount to `%s`", destination);
}
return 0;
@ -332,7 +357,8 @@ checkpoint_cgroup_v1_mount (runtime_spec_schema_config_schema *def, libcrun_erro
/* First check if there is actually a cgroup mount in the container. */
for (i = 0; i < def->mounts_len; i++)
{
if (strcmp (def->mounts[i]->type, "cgroup") == 0)
char *type = def->mounts[i]->type;
if (type && strcmp (type, "cgroup") == 0)
{
has_cgroup_mount = true;
break;
@ -376,7 +402,9 @@ checkpoint_cgroup_v1_mount (runtime_spec_schema_config_schema *def, libcrun_erro
if (UNLIKELY (ret < 0))
return ret;
libcriu_wrapper->criu_add_ext_mount (source_path, source_path);
ret = libcriu_wrapper->criu_add_ext_mount (source_path, source_path);
if (UNLIKELY (ret < 0))
return crun_make_error (err, -ret, "CRIU: failed adding external mount to `%s`", source_path);
}
return 0;
@ -455,6 +483,10 @@ libcrun_container_checkpoint_linux_criu (libcrun_container_status_t *status, lib
* crun to set it if the user has not selected a specific directory. */
if (cr_options->work_path != NULL)
{
ret = mkdir (cr_options->work_path, 0700);
if (UNLIKELY ((ret == -1) && (errno != EEXIST)))
return crun_make_error (err, errno, "error creating CRIU work directory `%s`", cr_options->work_path);
work_fd = open (cr_options->work_path, O_DIRECTORY | O_CLOEXEC);
if (UNLIKELY (work_fd == -1))
return crun_make_error (err, errno, "error opening CRIU work directory `%s`", cr_options->work_path);
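
The hunk above creates the CRIU work directory before opening it and tolerates an already existing path. A minimal sketch of that "create if missing" pattern is shown below. `ensure_work_dir` is a hypothetical name for this example, and errors are reported through `errno` rather than through `crun_make_error`.

```c
#include <errno.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: create PATH with mode 0700 unless it already exists.
   Returns 0 on success or if PATH exists, -1 otherwise (errno is set).  */
int
ensure_work_dir (const char *path)
{
  if (mkdir (path, 0700) == 0)
    return 0;

  /* EEXIST only says that something exists at PATH, not that it is a
     directory.  A later open with O_DIRECTORY, as in the code above,
     catches the non-directory case.  */
  return errno == EEXIST ? 0 : -1;
}
```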
@ -481,10 +513,19 @@ libcrun_container_checkpoint_linux_criu (libcrun_container_status_t *status, lib
if (UNLIKELY (criu_can_mem_track == -1))
return -1;
libcriu_wrapper->criu_set_track_mem (true);
/* The parent path needs to be a relative path. CRIU will fail
* if the path is not in the right format. Usually something like
* ../previous-dump */
libcriu_wrapper->criu_set_parent_images (cr_options->parent_path);
/* The parent path must be relative to image path (something like ../previous-dump).
CRIU will fail with an unclear error message if the path is not right.
*/
if (UNLIKELY (cr_options->parent_path[0] == '/'))
return crun_make_error (err, 0, "--parent-path must be relative");
int is_dir = crun_dir_p_at (image_fd, cr_options->parent_path, false, err);
if (UNLIKELY (is_dir <= 0))
return crun_make_error (err, is_dir < 0 ? errno : ENOTDIR, "invalid --parent-path");
ret = libcriu_wrapper->criu_set_parent_images (cr_options->parent_path);
if (UNLIKELY (ret != 0))
return crun_make_error (err, -ret, "error setting CRIU parent images path to `%s`", cr_options->parent_path);
}
if (cr_options->pre_dump)
@ -532,22 +573,37 @@ libcrun_container_checkpoint_linux_criu (libcrun_container_status_t *status, lib
if (cgroup_mode != CGROUP_MODE_UNIFIED)
{
ret = checkpoint_cgroup_v1_mount (def, err);
if (UNLIKELY (ret != 0))
return crun_make_error (err, 0, "error handling cgroup v1 mounts");
if (UNLIKELY (ret < 0))
return ret;
}
/* Tell CRIU about external bind mounts. */
for (i = 0; i < def->mounts_len; i++)
{
size_t j;
for (j = 0; j < def->mounts[i]->options_len; j++)
bool nofollow = false;
if (is_bind_mount (def->mounts[i], NULL, &nofollow))
{
if (strcmp (def->mounts[i]->options[j], "bind") == 0 || strcmp (def->mounts[i]->options[j], "rbind") == 0)
/* We need to resolve mount destination inside container's root for CRIU to handle. */
char buf[PATH_MAX];
const char *dest_in_root;
if (nofollow)
return crun_make_error (err, 0, "CRIU does not support `src-nofollow` for bind mounts");
dest_in_root = chroot_realpath (status->rootfs, def->mounts[i]->destination, buf);
if (UNLIKELY (dest_in_root == NULL))
{
libcriu_wrapper->criu_add_ext_mount (def->mounts[i]->destination, def->mounts[i]->destination);
break;
if (errno != ENOENT)
return crun_make_error (err, errno, "unable to resolve external bind mount `%s` under rootfs", def->mounts[i]->destination);
else
dest_in_root = def->mounts[i]->destination;
}
else
dest_in_root += strlen (status->rootfs);
ret = libcriu_wrapper->criu_add_ext_mount (dest_in_root, dest_in_root);
if (UNLIKELY (ret < 0))
return crun_make_error (err, -ret, "CRIU: failed adding external mount to `%s`", def->mounts[i]->destination);
}
}
@ -556,7 +612,11 @@ libcrun_container_checkpoint_linux_criu (libcrun_container_status_t *status, lib
struct stat statbuf;
ret = stat (def->linux->masked_paths[i], &statbuf);
if (ret == 0 && S_ISREG (statbuf.st_mode))
libcriu_wrapper->criu_add_ext_mount (def->linux->masked_paths[i], def->linux->masked_paths[i]);
{
ret = libcriu_wrapper->criu_add_ext_mount (def->linux->masked_paths[i], def->linux->masked_paths[i]);
if (UNLIKELY (ret < 0))
return crun_make_error (err, -ret, "CRIU: failed adding external mount to `%s`", def->linux->masked_paths[i]);
}
}
/* CRIU tries to checkpoint and restore all namespaces. However,
@ -591,7 +651,9 @@ libcrun_container_checkpoint_linux_criu (libcrun_container_status_t *status, lib
return crun_make_error (err, errno, "unable to stat(): `%s`", def->linux->namespaces[i]->path);
xasprintf (&external, "net[%ld]:" CRIU_EXT_NETNS, statbuf.st_ino);
libcriu_wrapper->criu_add_external (external);
ret = libcriu_wrapper->criu_add_external (external);
if (UNLIKELY (ret < 0))
return crun_make_error (err, -ret, "CRIU: failed adding external namespace `%s`", external);
}
if (value == CLONE_NEWPID && def->linux->namespaces[i]->path != NULL)
@ -604,7 +666,9 @@ libcrun_container_checkpoint_linux_criu (libcrun_container_status_t *status, lib
return crun_make_error (err, errno, "unable to stat(): `%s`", def->linux->namespaces[i]->path);
xasprintf (&external, "pid[%ld]:" CRIU_EXT_PIDNS, statbuf.st_ino);
libcriu_wrapper->criu_add_external (external);
ret = libcriu_wrapper->criu_add_external (external);
if (UNLIKELY (ret < 0))
return crun_make_error (err, -ret, "CRIU: failed adding external namespace `%s`", external);
}
}
@ -639,11 +703,19 @@ libcrun_container_checkpoint_linux_criu (libcrun_container_status_t *status, lib
libcriu_wrapper->criu_set_manage_cgroups_mode (CRIU_CG_MODE_SOFT);
else
libcriu_wrapper->criu_set_manage_cgroups_mode (cr_options->manage_cgroups_mode);
libcriu_wrapper->criu_set_manage_cgroups (true);
if (libcriu_wrapper->criu_set_network_lock && cr_options->network_lock_method > 0)
{
ret = libcriu_wrapper->criu_set_network_lock (cr_options->network_lock_method);
if (UNLIKELY (ret < 0))
return crun_make_error (err, 0, "CRIU: failed setting network lock");
}
ret = libcriu_wrapper->criu_dump ();
if (UNLIKELY (ret != 0))
return crun_make_error (err, 0,
return crun_make_error (err, ret < 0 ? -ret : 0,
"CRIU checkpointing failed %d. Please check CRIU logfile %s/%s",
ret, cr_options->work_path, CRIU_CHECKPOINT_LOG_FILE);
@ -661,22 +733,24 @@ prepare_restore_mounts (runtime_spec_schema_config_schema *def, char *root, libc
char *dest = def->mounts[i]->destination;
char *type = def->mounts[i]->type;
cleanup_close int root_fd = -1;
bool nofollow = false;
bool on_tmpfs = false;
int is_dir = 1;
size_t j;
/* cgroup restore should be handled by CRIU itself */
if (strcmp (type, "cgroup") == 0 || strcmp (type, "cgroup2") == 0)
if (type && (strcmp (type, "cgroup") == 0 || strcmp (type, "cgroup2") == 0))
continue;
/* Check if the mountpoint is on a tmpfs. CRIU restores
* all tmpfs. We do not need to recreate directories on a tmpfs. */
size_t dest_len = strlen (dest);
for (j = 0; j < def->mounts_len; j++)
{
cleanup_free char *dest_loop = NULL;
xasprintf (&dest_loop, "%s/", def->mounts[j]->destination);
if (strncmp (dest, dest_loop, strlen (dest_loop)) == 0 && strcmp (def->mounts[j]->type, "tmpfs") == 0)
if (def->mounts[j]->type == NULL || strcmp (def->mounts[j]->type, "tmpfs") != 0)
continue;
size_t mount_len = strlen (def->mounts[j]->destination);
if (mount_len < dest_len && dest[mount_len] == '/' && strncmp (dest, def->mounts[j]->destination, mount_len) == 0)
{
/* This is a mountpoint which is on a tmpfs.*/
on_tmpfs = true;
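
The rewritten condition above decides whether `dest` lies strictly below another mount's destination without allocating a temporary `"%s/"` string. The same comparison is isolated here as a small sketch. `is_strictly_below` is a hypothetical name used only for this example.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true if DEST is a path strictly below MOUNT_DEST,
   i.e. DEST starts with MOUNT_DEST and the next character is '/'.  */
bool
is_strictly_below (const char *dest, const char *mount_dest)
{
  size_t dest_len = strlen (dest);
  size_t mount_len = strlen (mount_dest);

  return mount_len < dest_len
         && dest[mount_len] == '/'
         && strncmp (dest, mount_dest, mount_len) == 0;
}
```

Checking the character right after the prefix is what keeps `/runtime` from being treated as a child of `/run`. The length comparison also guarantees that `dest[mount_len]` is inside the string.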
@ -688,16 +762,14 @@ prepare_restore_mounts (runtime_spec_schema_config_schema *def, char *root, libc
continue;
/* For bind mounts check if the source is a file or a directory. */
for (j = 0; j < def->mounts[i]->options_len; j++)
if (is_bind_mount (def->mounts[i], NULL, &nofollow))
{
const char *opt = def->mounts[i]->options[j];
if (strcmp (opt, "bind") == 0 || strcmp (opt, "rbind") == 0)
{
is_dir = crun_dir_p (def->mounts[i]->source, false, err);
if (UNLIKELY (is_dir < 0))
return is_dir;
break;
}
if (nofollow)
return crun_make_error (err, 0, "CRIU does not support `src-nofollow` for bind mounts");
is_dir = crun_dir_p (def->mounts[i]->source, false, err);
if (UNLIKELY (is_dir < 0))
return is_dir;
}
root_fd = open (root, O_RDONLY | O_CLOEXEC);
@ -708,7 +780,7 @@ prepare_restore_mounts (runtime_spec_schema_config_schema *def, char *root, libc
{
int ret;
ret = crun_safe_ensure_directory_at (root_fd, root, strlen (root), dest, 0755, err);
ret = crun_safe_ensure_directory_at (root_fd, root, dest, 0755, err);
if (UNLIKELY (ret < 0))
return ret;
}
@ -716,7 +788,7 @@ prepare_restore_mounts (runtime_spec_schema_config_schema *def, char *root, libc
{
int ret;
ret = crun_safe_ensure_file_at (root_fd, root, strlen (root), dest, 0755, err);
ret = crun_safe_ensure_file_at (root_fd, root, dest, 0755, err);
if (UNLIKELY (ret < 0))
return ret;
}
@ -814,6 +886,10 @@ libcrun_container_restore_linux_criu (libcrun_container_status_t *status, libcru
* crun to set it if the user has not selected a specific directory. */
if (cr_options->work_path != NULL)
{
ret = mkdir (cr_options->work_path, 0700);
if (UNLIKELY ((ret == -1) && (errno != EEXIST)))
return crun_make_error (err, errno, "error creating CRIU work directory `%s`", cr_options->work_path);
work_fd = open (cr_options->work_path, O_DIRECTORY | O_CLOEXEC);
if (UNLIKELY (work_fd == -1))
return crun_make_error (err, errno, "error opening CRIU work directory `%s`", cr_options->work_path);
@ -826,18 +902,46 @@ libcrun_container_restore_linux_criu (libcrun_container_status_t *status, libcru
cr_options->work_path = cr_options->image_path;
}
if (cr_options->lsm_profile != NULL)
{
ret = libcriu_wrapper->criu_set_lsm_profile (cr_options->lsm_profile);
if (UNLIKELY (ret != 0))
return crun_make_error (err, -ret, "error setting LSM profile to `%s`", cr_options->lsm_profile);
}
if (cr_options->lsm_mount_context != NULL)
{
ret = libcriu_wrapper->criu_set_lsm_mount_context (cr_options->lsm_mount_context);
if (UNLIKELY (ret != 0))
return crun_make_error (err, -ret, "error setting LSM mount context to `%s`", cr_options->lsm_mount_context);
}
/* Tell CRIU about external bind mounts. */
for (i = 0; i < def->mounts_len; i++)
{
size_t j;
for (j = 0; j < def->mounts[i]->options_len; j++)
bool nofollow = false;
if (is_bind_mount (def->mounts[i], NULL, &nofollow))
{
if (strcmp (def->mounts[i]->options[j], "bind") == 0 || strcmp (def->mounts[i]->options[j], "rbind") == 0)
/* We need to resolve mount destination inside container's root for CRIU to handle. */
char buf[PATH_MAX];
const char *dest_in_root;
if (nofollow)
return crun_make_error (err, 0, "CRIU does not support `src-nofollow` for bind mounts");
dest_in_root = chroot_realpath (status->rootfs, def->mounts[i]->destination, buf);
if (UNLIKELY (dest_in_root == NULL))
{
libcriu_wrapper->criu_add_ext_mount (def->mounts[i]->destination, def->mounts[i]->source);
break;
if (errno != ENOENT)
return crun_make_error (err, errno, "unable to resolve external bind mount destination `%s` under rootfs", def->mounts[i]->destination);
dest_in_root = def->mounts[i]->destination;
}
else
dest_in_root += strlen (status->rootfs);
ret = libcriu_wrapper->criu_add_ext_mount (dest_in_root, def->mounts[i]->source);
if (UNLIKELY (ret < 0))
return crun_make_error (err, -ret, "CRIU: failed adding external mount to `%s`", def->mounts[i]->source);
}
}
@ -846,7 +950,11 @@ libcrun_container_restore_linux_criu (libcrun_container_status_t *status, libcru
struct stat statbuf;
ret = stat (def->linux->masked_paths[i], &statbuf);
if (ret == 0 && S_ISREG (statbuf.st_mode))
libcriu_wrapper->criu_add_ext_mount (def->linux->masked_paths[i], "/dev/null");
{
ret = libcriu_wrapper->criu_add_ext_mount (def->linux->masked_paths[i], "/dev/null");
if (UNLIKELY (ret < 0))
return crun_make_error (err, -ret, "CRIU: failed adding external mount to `%s`", "/dev/null");
}
}
/* do realpath on root */
@ -884,7 +992,7 @@ libcrun_container_restore_linux_criu (libcrun_container_status_t *status, libcru
ret = libcriu_wrapper->criu_set_root (root);
if (UNLIKELY (ret != 0))
{
ret = crun_make_error (err, 0, "error setting CRIU root to `%s`", root);
ret = crun_make_error (err, -ret, "error setting CRIU root to `%s`", root);
goto out_umount;
}
@ -907,7 +1015,9 @@ libcrun_container_restore_linux_criu (libcrun_container_status_t *status, libcru
if (UNLIKELY (inherit_new_net_fd < 0))
return crun_make_error (err, errno, "unable to open(): `%s`", def->linux->namespaces[i]->path);
libcriu_wrapper->criu_add_inherit_fd (inherit_new_net_fd, CRIU_EXT_NETNS);
ret = libcriu_wrapper->criu_add_inherit_fd (inherit_new_net_fd, CRIU_EXT_NETNS);
if (UNLIKELY (ret < 0))
return crun_make_error (err, -ret, "CRIU: failed adding fd");
}
if (value == CLONE_NEWPID && def->linux->namespaces[i]->path != NULL)
@ -916,32 +1026,40 @@ libcrun_container_restore_linux_criu (libcrun_container_status_t *status, libcru
if (UNLIKELY (inherit_new_pid_fd < 0))
return crun_make_error (err, errno, "unable to open(): `%s`", def->linux->namespaces[i]->path);
libcriu_wrapper->criu_add_inherit_fd (inherit_new_pid_fd, CRIU_EXT_PIDNS);
ret = libcriu_wrapper->criu_add_inherit_fd (inherit_new_pid_fd, CRIU_EXT_PIDNS);
if (UNLIKELY (ret < 0))
return crun_make_error (err, -ret, "CRIU: failed adding fd");
}
# ifdef CRIU_JOIN_NS_SUPPORT
if (value == CLONE_NEWTIME && def->linux->namespaces[i]->path != NULL)
{
if (libcriu_wrapper->criu_join_ns_add != NULL)
libcriu_wrapper->criu_join_ns_add ("time", def->linux->namespaces[i]->path, NULL);
else
if (libcriu_wrapper->criu_join_ns_add == NULL)
return crun_make_error (err, 0, "shared time namespace restore is supported in CRIU >= 3.16.1");
ret = libcriu_wrapper->criu_join_ns_add ("time", def->linux->namespaces[i]->path, NULL);
if (UNLIKELY (ret < 0))
return crun_make_error (err, -ret, "CRIU: failed adding external namespace `%s`", def->linux->namespaces[i]->path);
}
if (value == CLONE_NEWIPC && def->linux->namespaces[i]->path != NULL)
{
if (libcriu_wrapper->criu_join_ns_add != NULL)
libcriu_wrapper->criu_join_ns_add ("ipc", def->linux->namespaces[i]->path, NULL);
else
if (libcriu_wrapper->criu_join_ns_add == NULL)
return crun_make_error (err, 0, "shared ipc namespace restore is supported in CRIU >= 3.16.1");
ret = libcriu_wrapper->criu_join_ns_add ("ipc", def->linux->namespaces[i]->path, NULL);
if (UNLIKELY (ret < 0))
return crun_make_error (err, -ret, "CRIU: failed adding external namespace `%s`", def->linux->namespaces[i]->path);
}
if (value == CLONE_NEWUTS && def->linux->namespaces[i]->path != NULL)
{
if (libcriu_wrapper->criu_join_ns_add != NULL)
libcriu_wrapper->criu_join_ns_add ("uts", def->linux->namespaces[i]->path, NULL);
else
if (libcriu_wrapper->criu_join_ns_add == NULL)
return crun_make_error (err, 0, "shared uts namespace restore is supported in CRIU >= 3.16.1");
ret = libcriu_wrapper->criu_join_ns_add ("uts", def->linux->namespaces[i]->path, NULL);
if (UNLIKELY (ret < 0))
return crun_make_error (err, -ret, "CRIU: failed adding external namespace `%s`", def->linux->namespaces[i]->path);
}
# endif
}
@ -960,16 +1078,37 @@ libcrun_container_restore_linux_criu (libcrun_container_status_t *status, libcru
libcriu_wrapper->criu_set_tcp_established (cr_options->tcp_established);
libcriu_wrapper->criu_set_file_locks (cr_options->file_locks);
libcriu_wrapper->criu_set_orphan_pts_master (true);
if (status->cgroup_path)
{
ret = libcriu_wrapper->criu_add_cg_root (NULL, status->cgroup_path);
if (UNLIKELY (ret != 0))
return crun_make_error (err, 0, "error setting CRIU cgroup root to `%s`", status->cgroup_path);
}
if (cr_options->manage_cgroups_mode == -1)
/* Defaulting to CRIU_CG_MODE_SOFT just as runc */
libcriu_wrapper->criu_set_manage_cgroups_mode (CRIU_CG_MODE_SOFT);
else
libcriu_wrapper->criu_set_manage_cgroups_mode (cr_options->manage_cgroups_mode);
libcriu_wrapper->criu_set_manage_cgroups (true);
if (libcriu_wrapper->criu_set_network_lock && cr_options->network_lock_method > 0)
{
ret = libcriu_wrapper->criu_set_network_lock (cr_options->network_lock_method);
if (UNLIKELY (ret < 0))
return crun_make_error (err, 0, "CRIU: failed setting network lock");
}
libcriu_wrapper->criu_set_log_level (4);
libcriu_wrapper->criu_set_log_file (CRIU_RESTORE_LOG_FILE);
ret = libcriu_wrapper->criu_restore_child ();
ret = libcriu_wrapper->criu_set_log_file (CRIU_RESTORE_LOG_FILE);
if (UNLIKELY (ret < 0))
return crun_make_error (err, -ret, "error setting CRIU log file to `%s`", CRIU_RESTORE_LOG_FILE);
/* criu_restore_child() returns the PID of the root process of the restored
* process tree. This PID will not be the same as status->pid if the container
* is running in a PID namespace, but on success it is always > 0. */
ret = libcriu_wrapper->criu_restore_child ();
if (UNLIKELY (ret <= 0))
{
ret = crun_make_error (err, 0,


@ -45,6 +45,9 @@ extern struct custom_handler_s handler_wasmedge;
#if HAVE_DLOPEN && HAVE_WASMER
extern struct custom_handler_s handler_wasmer;
#endif
#if HAVE_DLOPEN && HAVE_WAMR
extern struct custom_handler_s handler_wamr;
#endif
#if HAVE_DLOPEN && HAVE_MONO
extern struct custom_handler_s handler_mono;
#endif
@ -65,6 +68,9 @@ static struct custom_handler_s *static_handlers[] = {
#if HAVE_DLOPEN && HAVE_WASMTIME
&handler_wasmtime,
#endif
#if HAVE_DLOPEN && HAVE_WAMR
&handler_wamr,
#endif
#if HAVE_DLOPEN && HAVE_MONO
&handler_mono,
#endif
@ -115,6 +121,9 @@ handler_manager_free (struct custom_handler_manager_s *manager)
{
size_t i;
if (manager == NULL)
return;
for (i = 0; i < manager->handlers_len; i++)
{
#ifdef HAVE_DLOPEN


@ -61,32 +61,32 @@ struct bpf_program
#ifdef HAVE_EBPF
# define BPF_ALU32_IMM(OP, DST, IMM) \
((struct bpf_insn){ .code = BPF_ALU | BPF_OP (OP) | BPF_K, .dst_reg = DST, .src_reg = 0, .off = 0, .imm = IMM })
((struct bpf_insn) { .code = BPF_ALU | BPF_OP (OP) | BPF_K, .dst_reg = DST, .src_reg = 0, .off = 0, .imm = IMM })
# define BPF_LDX_MEM(SIZE, DST, SRC, OFF) \
((struct bpf_insn){ \
((struct bpf_insn) { \
.code = BPF_LDX | BPF_SIZE (SIZE) | BPF_MEM, .dst_reg = DST, .src_reg = SRC, .off = OFF, .imm = 0 })
# define BPF_MOV64_REG(DST, SRC) \
((struct bpf_insn){ .code = BPF_ALU64 | BPF_MOV | BPF_X, .dst_reg = DST, .src_reg = SRC, .off = 0, .imm = 0 })
((struct bpf_insn) { .code = BPF_ALU64 | BPF_MOV | BPF_X, .dst_reg = DST, .src_reg = SRC, .off = 0, .imm = 0 })
# define BPF_JMP_A(OFF) \
((struct bpf_insn){ .code = BPF_JMP | BPF_JA, .dst_reg = 0, .src_reg = 0, .off = OFF, .imm = 0 })
((struct bpf_insn) { .code = BPF_JMP | BPF_JA, .dst_reg = 0, .src_reg = 0, .off = OFF, .imm = 0 })
# define BPF_JMP_IMM(OP, DST, IMM, OFF) \
((struct bpf_insn){ .code = BPF_JMP | BPF_OP (OP) | BPF_K, .dst_reg = DST, .src_reg = 0, .off = OFF, .imm = IMM })
((struct bpf_insn) { .code = BPF_JMP | BPF_OP (OP) | BPF_K, .dst_reg = DST, .src_reg = 0, .off = OFF, .imm = IMM })
# define BPF_JMP_REG(OP, DST, SRC, OFF) \
((struct bpf_insn){ .code = BPF_JMP | BPF_OP (OP) | BPF_X, .dst_reg = DST, .src_reg = SRC, .off = OFF, .imm = 0 })
((struct bpf_insn) { .code = BPF_JMP | BPF_OP (OP) | BPF_X, .dst_reg = DST, .src_reg = SRC, .off = OFF, .imm = 0 })
# define BPF_MOV64_IMM(DST, IMM) \
((struct bpf_insn){ .code = BPF_ALU64 | BPF_MOV | BPF_K, .dst_reg = DST, .src_reg = 0, .off = 0, .imm = IMM })
((struct bpf_insn) { .code = BPF_ALU64 | BPF_MOV | BPF_K, .dst_reg = DST, .src_reg = 0, .off = 0, .imm = IMM })
# define BPF_MOV32_REG(DST, SRC) \
((struct bpf_insn){ .code = BPF_ALU | BPF_MOV | BPF_X, .dst_reg = DST, .src_reg = SRC, .off = 0, .imm = 0 })
((struct bpf_insn) { .code = BPF_ALU | BPF_MOV | BPF_X, .dst_reg = DST, .src_reg = SRC, .off = 0, .imm = 0 })
# define BPF_EXIT_INSN() \
((struct bpf_insn){ .code = BPF_JMP | BPF_EXIT, .dst_reg = 0, .src_reg = 0, .off = 0, .imm = 0 })
((struct bpf_insn) { .code = BPF_JMP | BPF_EXIT, .dst_reg = 0, .src_reg = 0, .off = 0, .imm = 0 })
#endif
#ifdef HAVE_EBPF
@ -527,9 +527,12 @@ libcrun_ebpf_load (struct bpf_program *program, int dirfd, const char *pin, libc
}
}
ret = ebpf_attach_program (fd, dirfd, err);
if (UNLIKELY (ret < 0))
return ret;
if (dirfd >= 0)
{
ret = ebpf_attach_program (fd, dirfd, err);
if (UNLIKELY (ret < 0))
return ret;
}
/* Optionally pin the program to the specified path. */
if (pin)
@ -547,3 +550,68 @@ libcrun_ebpf_load (struct bpf_program *program, int dirfd, const char *pin, libc
return 0;
#endif
}
int
libcrun_ebpf_read_program (struct bpf_program **program_ret, const char *path, libcrun_error_t *err)
{
#ifndef HAVE_EBPF
(void) program_ret;
(void) path;
return crun_make_error (err, 0, "eBPF not supported");
#else
cleanup_free struct bpf_program *program = NULL;
cleanup_free char *buffer = NULL;
cleanup_close int prog_fd = -1;
size_t buffer_size = 0;
struct bpf_prog_info info;
union bpf_attr attr;
int ret;
memset (&attr, 0, sizeof (attr));
attr.pathname = ptr_to_u64 (path);
prog_fd = bpf (BPF_OBJ_GET, &attr, sizeof (attr));
if (UNLIKELY (prog_fd < 0))
return crun_make_error (err, errno, "bpf get `%s`", path);
memset (&info, 0, sizeof (info));
memset (&attr, 0, sizeof (attr));
attr.info.bpf_fd = prog_fd;
attr.info.info = ptr_to_u64 (&info);
attr.info.info_len = sizeof (info);
ret = bpf (BPF_OBJ_GET_INFO_BY_FD, &attr, sizeof (attr));
if (UNLIKELY (ret < 0))
return crun_make_error (err, errno, "bpf get info `%s`", path);
buffer_size = info.xlated_prog_len;
buffer = xmalloc (buffer_size);
memset (&info, 0, sizeof (info));
info.xlated_prog_insns = ptr_to_u64 (buffer);
info.xlated_prog_len = buffer_size;
ret = bpf (BPF_OBJ_GET_INFO_BY_FD, &attr, sizeof (attr));
if (UNLIKELY (ret < 0))
return crun_make_error (err, errno, "bpf get info `%s`", path);
program = bpf_program_new (buffer_size);
program = bpf_program_append (program, buffer, buffer_size);
*program_ret = program;
program = NULL;
return 0;
#endif
}
bool
libcrun_ebpf_cmp_programs (struct bpf_program *program1, struct bpf_program *program2)
{
if (program1->used != program2->used)
return false;
return memcmp (program1->program, program2->program, program1->used) == 0;
}
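The comparison added at the end of this file is a length check followed by a byte-wise `memcmp`. The sketch below shows the same idea in isolation. `struct prog_sketch` is a hypothetical stand-in for crun's opaque `struct bpf_program`, and the NULL and zero-length guards are additions for illustration, not something the diff contains:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for crun's `struct bpf_program`; only the two
 * fields used by the comparison are modeled here. */
struct prog_sketch
{
  size_t used;                  /* number of valid bytes in `program` */
  const unsigned char *program; /* raw instruction bytes */
};

static bool
prog_sketch_equal (const struct prog_sketch *a, const struct prog_sketch *b)
{
  /* Two missing programs compare equal; one missing program does not. */
  if (a == NULL || b == NULL)
    return a == b;

  /* Different lengths can never match, so skip the byte comparison. */
  if (a->used != b->used)
    return false;

  /* Avoid calling memcmp() on possibly NULL buffers of length zero. */
  if (a->used == 0)
    return true;

  return memcmp (a->program, b->program, a->used) == 0;
}
```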


@ -27,6 +27,9 @@
#include <ocispec/runtime_spec_schema_config_schema.h>
#include "container.h"
#define SYS_FS_BPF "/sys/fs/bpf"
#define CRUN_BPF_DIR SYS_FS_BPF "/crun"
struct bpf_program;
struct bpf_program *bpf_program_new (size_t size);
@ -38,5 +41,7 @@ struct bpf_program *bpf_program_append_dev (struct bpf_program *program, const c
struct bpf_program *bpf_program_complete_dev (struct bpf_program *program, libcrun_error_t *err);
int libcrun_ebpf_load (struct bpf_program *program, int dirfd, const char *pin, libcrun_error_t *err);
int libcrun_ebpf_read_program (struct bpf_program **program, const char *path, libcrun_error_t *err);
bool libcrun_ebpf_cmp_programs (struct bpf_program *program1, struct bpf_program *program2);
#endif


@ -42,7 +42,6 @@ enum
};
static int log_format;
static bool log_also_to_stderr;
static int output_verbosity = LIBCRUN_VERBOSITY_ERROR;
int
@ -72,7 +71,10 @@ crun_error_wrap (libcrun_error_t *err, const char *fmt, ...)
int ret;
if (err == NULL || *err == NULL)
return 0;
{
// Internal error
return 0;
}
ret = -(*err)->status - 1;
@ -124,8 +126,11 @@ crun_error_write_warning_and_release (FILE *out, libcrun_error_t **err)
if (out == NULL)
out = stderr;
if (err == NULL || *err == NULL)
return;
if (err == NULL || *err == NULL || **err == NULL)
{
// Internal error
return;
}
ref = **err;
if (ref->status)
@ -157,13 +162,16 @@ typedef char timestamp_t[64];
static void
get_timestamp (timestamp_t *timestamp, const char *suffix)
{
const size_t buffer_size = sizeof (timestamp_t);
char *buffer = (char *) timestamp;
struct timeval tv;
size_t written;
struct tm now;
gettimeofday (&tv, NULL);
gmtime_r (&tv.tv_sec, &now);
strftime ((char *) timestamp, 64, "%Y-%m-%dT%H:%M:%S", &now);
sprintf (((char *) timestamp) + 19, ".%06lldZ%.8s", (long long int) tv.tv_usec, suffix);
written = strftime (buffer, buffer_size, "%Y-%m-%dT%H:%M:%S", &now);
snprintf (buffer + written, buffer_size - written, ".%06lldZ%.8s", (long long int) tv.tv_usec, suffix);
}
static void *
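The `get_timestamp` change above replaces a fixed offset (`+ 19`) and an unbounded `sprintf` with the length returned by `strftime` and a bounded `snprintf`. A minimal sketch of that approach follows. `format_timestamp` is a hypothetical helper that takes the time as arguments, instead of calling `gettimeofday`, so that its output is deterministic:

```c
#define _POSIX_C_SOURCE 200809L /* for gmtime_r() */
#include <stdio.h>
#include <time.h>

/* Hypothetical helper modeled on the patched get_timestamp(). */
static void
format_timestamp (char *buffer, size_t buffer_size, time_t sec, long usec, const char *suffix)
{
  struct tm now;
  size_t written;

  if (buffer_size == 0)
    return;

  gmtime_r (&sec, &now);

  /* strftime() returns the number of bytes written, excluding the
   * terminating NUL, or 0 if the result did not fit. */
  written = strftime (buffer, buffer_size, "%Y-%m-%dT%H:%M:%S", &now);

  /* Append the microseconds and at most 8 bytes of suffix; snprintf()
   * truncates rather than writing past the end of the buffer. */
  snprintf (buffer + written, buffer_size - written, ".%06ldZ%.8s", usec, suffix);
}
```

With a 64-byte buffer the full timestamp fits. With a smaller buffer the output is truncated but still NUL-terminated.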
@ -236,16 +244,16 @@ libcrun_init_logging (crun_output_handler *new_output_handler, void **new_output
case LOG_TYPE_JOURNALD:
*new_output_handler = log_write_to_journald;
*new_output_handler_arg = NULL;
*new_output_handler_arg = (void *) id;
break;
}
}
crun_set_output_handler (*new_output_handler, *new_output_handler_arg, log != NULL);
crun_set_output_handler (*new_output_handler, *new_output_handler_arg);
return 0;
}
void
log_write_to_stream (int errno_, const char *msg, bool warning, void *arg)
log_write_to_stream (int errno_, const char *msg, int verbosity, void *arg)
{
timestamp_t timestamp = {
0,
@ -257,8 +265,21 @@ log_write_to_stream (int errno_, const char *msg, bool warning, void *arg)
if (tty)
{
color_begin = warning ? "\x1b[1;33m" : "\x1b[1;31m";
color_end = "\x1b[0m";
switch (verbosity)
{
case LIBCRUN_VERBOSITY_DEBUG:
color_begin = "\x1b[1;34m";
color_end = "\x1b[0m";
break;
case LIBCRUN_VERBOSITY_WARNING:
color_begin = "\x1b[1;33m";
color_end = "\x1b[0m";
break;
case LIBCRUN_VERBOSITY_ERROR:
color_begin = "\x1b[1;31m";
color_end = "\x1b[0m";
break;
}
if (log_format == LOG_FORMAT_TEXT)
get_timestamp (&timestamp, ": ");
@ -271,31 +292,57 @@ log_write_to_stream (int errno_, const char *msg, bool warning, void *arg)
}
void
log_write_to_stderr (int errno_, const char *msg, bool warning, void *arg arg_unused)
log_write_to_stderr (int errno_, const char *msg, int verbosity, void *arg arg_unused)
{
log_write_to_stream (errno_, msg, warning, stderr);
log_write_to_stream (errno_, msg, verbosity, stderr);
}
void
log_write_to_syslog (int errno_, const char *msg, bool warning, void *arg arg_unused)
log_write_to_syslog (int errno_, const char *msg, int verbosity, void *arg arg_unused)
{
int priority = LOG_ERR;
switch (verbosity)
{
case LIBCRUN_VERBOSITY_DEBUG:
priority = LOG_DEBUG;
break;
case LIBCRUN_VERBOSITY_WARNING:
priority = LOG_WARNING;
break;
case LIBCRUN_VERBOSITY_ERROR:
priority = LOG_ERR;
break;
}
if (errno_ == 0)
syslog (warning ? LOG_WARNING : LOG_ERR, "%s", msg);
syslog (priority, "%s", msg);
else
syslog (warning ? LOG_WARNING : LOG_ERR, "%s: %s", msg, strerror (errno_));
syslog (priority, "%s: %s", msg, strerror (errno_));
}
void
log_write_to_journald (int errno_, const char *msg, bool warning, void *arg arg_unused)
log_write_to_journald (int errno_, const char *msg, int verbosity, void *arg arg_unused)
{
(void) errno_;
(void) msg;
(void) warning;
(void) verbosity;
#ifdef HAVE_SYSTEMD
int priority = LOG_ERR;
switch (verbosity)
{
case LIBCRUN_VERBOSITY_DEBUG:
priority = LOG_DEBUG;
break;
case LIBCRUN_VERBOSITY_WARNING:
priority = LOG_WARNING;
break;
case LIBCRUN_VERBOSITY_ERROR:
priority = LOG_ERR;
break;
}
if (errno_ == 0)
sd_journal_send ("PRIORITY=%d", warning ? LOG_WARNING : LOG_ERR, "MESSAGE=%s", msg, "ID=%s", arg, NULL);
sd_journal_send ("PRIORITY=%d", priority, "MESSAGE=%s", msg, "ID=%s", arg, NULL);
else
sd_journal_send ("PRIORITY=%d", warning ? LOG_WARNING : LOG_ERR, "MESSAGE=%s: %s", msg, strerror (errno_), "ID=%s",
sd_journal_send ("PRIORITY=%d", priority, "MESSAGE=%s: %s", msg, strerror (errno_), "ID=%s",
arg, NULL);
#endif
}
@ -316,17 +363,28 @@ libcrun_get_verbosity ()
}
void
crun_set_output_handler (crun_output_handler handler, void *arg, bool log_to_stderr)
crun_set_output_handler (crun_output_handler handler, void *arg)
{
output_handler = handler;
output_handler_arg = arg;
log_also_to_stderr = log_to_stderr;
}
static char *
make_json_error (const char *msg, int errno_, bool warning)
make_json_error (const char *msg, int errno_, int verbosity)
{
const char *level = warning ? "warning" : "error";
const char *level;
switch (verbosity)
{
case LIBCRUN_VERBOSITY_DEBUG:
level = "debug";
break;
case LIBCRUN_VERBOSITY_WARNING:
level = "warning";
break;
case LIBCRUN_VERBOSITY_ERROR:
level = "error";
break;
}
const unsigned char *buf = NULL;
yajl_gen gen = NULL;
char *ret = NULL;
@ -372,44 +430,53 @@ make_json_error (const char *msg, int errno_, bool warning)
}
static void
write_log (int errno_, bool warning, const char *msg, va_list args_list)
write_log (int errno_, int verbosity, const char *msg, va_list args_list)
{
int ret;
cleanup_free char *output = NULL;
cleanup_free char *json = NULL;
if (warning && output_verbosity < LIBCRUN_VERBOSITY_WARNING)
if (verbosity > output_verbosity)
return;
ret = vasprintf (&output, msg, args_list);
if (UNLIKELY (ret < 0))
OOM ();
if (log_also_to_stderr)
log_write_to_stderr (errno_, output, warning, NULL);
if (verbosity == LIBCRUN_VERBOSITY_ERROR && output_handler != log_write_to_stderr)
log_write_to_stderr (errno_, output, LIBCRUN_VERBOSITY_ERROR, NULL);
switch (log_format)
{
case LOG_FORMAT_TEXT:
output_handler (errno_, output, warning, output_handler_arg);
output_handler (errno_, output, verbosity, output_handler_arg);
break;
case LOG_FORMAT_JSON:
json = make_json_error (output, errno_, warning);
json = make_json_error (output, errno_, verbosity);
if (json)
output_handler (0, json, warning, output_handler_arg);
output_handler (0, json, verbosity, output_handler_arg);
else
output_handler (errno_, output, warning, output_handler_arg);
output_handler (errno_, output, verbosity, output_handler_arg);
break;
}
}
void
libcrun_debug (const char *msg, ...)
{
va_list args_list;
va_start (args_list, msg);
write_log (0, LIBCRUN_VERBOSITY_DEBUG, msg, args_list);
va_end (args_list);
}
void
libcrun_warning (const char *msg, ...)
{
va_list args_list;
va_start (args_list, msg);
write_log (0, true, msg, args_list);
write_log (0, LIBCRUN_VERBOSITY_WARNING, msg, args_list);
va_end (args_list);
}
@ -419,7 +486,7 @@ libcrun_error (int errno_, const char *msg, ...)
va_list args_list;
va_start (args_list, msg);
write_log (errno_, false, msg, args_list);
write_log (errno_, LIBCRUN_VERBOSITY_ERROR, msg, args_list);
va_end (args_list);
}
@ -428,7 +495,7 @@ libcrun_fail_with_error (int errno_, const char *msg, ...)
{
va_list args_list;
va_start (args_list, msg);
write_log (errno_, false, msg, args_list);
write_log (errno_, LIBCRUN_VERBOSITY_ERROR, msg, args_list);
va_end (args_list);
exit (EXIT_FAILURE);
}
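The logging changes in this file replace the `bool warning` flag with an integer verbosity that is mapped to a syslog priority, to a JSON level string, and to a filter of the form `verbosity > output_verbosity`. The sketch below collects those three mappings as small pure functions. The `SKETCH_*` names and the helper functions are hypothetical and only mirror the `LIBCRUN_VERBOSITY_*` values shown in the diff. The `default` branches are an addition for illustration, so that an unexpected value still yields a defined result:

```c
#include <stdbool.h>
#include <syslog.h>

/* Hypothetical mirror of the LIBCRUN_VERBOSITY_* enum from the diff. */
enum
{
  SKETCH_VERBOSITY_ERROR = 0,
  SKETCH_VERBOSITY_WARNING,
  SKETCH_VERBOSITY_DEBUG,
};

/* Map a verbosity to a syslog priority; unknown values count as errors. */
static int
verbosity_to_priority (int verbosity)
{
  switch (verbosity)
    {
    case SKETCH_VERBOSITY_DEBUG:
      return LOG_DEBUG;
    case SKETCH_VERBOSITY_WARNING:
      return LOG_WARNING;
    default:
      return LOG_ERR;
    }
}

/* Map a verbosity to the level string used in JSON output. */
static const char *
verbosity_to_level (int verbosity)
{
  switch (verbosity)
    {
    case SKETCH_VERBOSITY_DEBUG:
      return "debug";
    case SKETCH_VERBOSITY_WARNING:
      return "warning";
    default:
      return "error";
    }
}

/* A message is emitted only if it is not more verbose than the configured
 * output verbosity (the inverse of `verbosity > output_verbosity`). */
static bool
should_log (int verbosity, int output_verbosity)
{
  return verbosity <= output_verbosity;
}
```

Because `SKETCH_VERBOSITY_ERROR` is 0, errors pass the filter at every configured verbosity, while debug messages pass only when the configured verbosity is the debug level.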


@ -55,17 +55,17 @@ typedef struct libcrun_error_s *libcrun_error_t;
_exit (EXIT_FAILURE); \
} while (0)
typedef void (*crun_output_handler) (int errno_, const char *msg, bool warning, void *arg);
typedef void (*crun_output_handler) (int errno_, const char *msg, int verbosity, void *arg);
void crun_set_output_handler (crun_output_handler handler, void *arg, bool log_to_stderr);
void crun_set_output_handler (crun_output_handler handler, void *arg);
void log_write_to_journald (int errno_, const char *msg, bool warning, void *arg);
void log_write_to_journald (int errno_, const char *msg, int verbosity, void *arg);
void log_write_to_syslog (int errno_, const char *msg, bool warning, void *arg);
void log_write_to_syslog (int errno_, const char *msg, int verbosity, void *arg);
void log_write_to_stream (int errno_, const char *msg, bool warning, void *arg);
void log_write_to_stream (int errno_, const char *msg, int verbosity, void *arg);
void log_write_to_stderr (int errno_, const char *msg, bool warning, void *arg);
void log_write_to_stderr (int errno_, const char *msg, int verbosity, void *arg);
int crun_error_wrap (libcrun_error_t *err, const char *fmt, ...) __attribute__ ((format (printf, 2, 3)));
@ -75,6 +75,8 @@ int crun_error_release (libcrun_error_t *err);
void crun_error_write_warning_and_release (FILE *out, libcrun_error_t **err);
LIBCRUN_PUBLIC void libcrun_debug (const char *msg, ...) __attribute__ ((format (printf, 1, 2)));
LIBCRUN_PUBLIC void libcrun_warning (const char *msg, ...) __attribute__ ((format (printf, 1, 2)));
LIBCRUN_PUBLIC void libcrun_error (int errno_, const char *msg, ...) __attribute__ ((format (printf, 2, 3)));
@ -98,8 +100,9 @@ int yajl_error_to_crun_error (int yajl_status, libcrun_error_t *err);
enum
{
LIBCRUN_VERBOSITY_ERROR,
LIBCRUN_VERBOSITY_ERROR = 0,
LIBCRUN_VERBOSITY_WARNING,
LIBCRUN_VERBOSITY_DEBUG,
};
LIBCRUN_PUBLIC void libcrun_set_verbosity (int verbosity);


@ -40,67 +40,265 @@
# include <libkrun.h>
#endif
/* libkrun has a hard-limit of 8 vCPUs per microVM. */
#define LIBKRUN_MAX_VCPUS 8
/* libkrun has a hard-limit of 16 vCPUs per microVM. */
#define LIBKRUN_MAX_VCPUS 16
/* crun dumps the container configuration into this file, which will be read by
* libkrun to set up the environment for the workload inside the microVM.
*/
#define KRUN_CONFIG_FILE ".krun_config.json"
/* The presence of this file indicates this is a container intended to be run
* as a confidential workload inside a SEV-powered TEE.
*/
#define KRUN_SEV_FILE "/krun-sev.json"
/* This file contains configuration parameters for the microVM. crun needs to
* read and parse it, using the information obtained from it to configure
* libkrun as required.
*/
#define KRUN_VM_FILE "/.krun_vm.json"
#define KRUN_FLAVOR_SEV "sev"
struct krun_config
{
void *handle;
void *handle_sev;
bool sev;
int32_t ctx_id;
int32_t ctx_id_sev;
};
/* libkrun handler. */
#if HAVE_DLOPEN && HAVE_LIBKRUN
static int
libkrun_exec (void *cookie, libcrun_container_t *container, const char *pathname, char *const argv[])
static int32_t
libkrun_create_context (void *handle, libcrun_error_t *err)
{
runtime_spec_schema_config_schema *def = container->container_def;
int32_t (*krun_set_log_level) (uint32_t level);
int32_t (*krun_create_ctx) ();
int (*krun_start_enter) (uint32_t ctx_id);
int32_t (*krun_set_vm_config) (uint32_t ctx_id, uint8_t num_vcpus, uint32_t ram_mib);
int32_t (*krun_set_root) (uint32_t ctx_id, const char *root_path);
int32_t (*krun_set_root_disk) (uint32_t ctx_id, const char *disk_path);
int32_t (*krun_set_workdir) (uint32_t ctx_id, const char *workdir_path);
int32_t (*krun_set_exec) (uint32_t ctx_id, const char *exec_path, char *const argv[], char *const envp[]);
int32_t (*krun_set_tee_config_file) (uint32_t ctx_id, const char *file_path);
struct krun_config *kconf = (struct krun_config *) cookie;
void *handle;
uint32_t num_vcpus, ram_mib;
int32_t ctx_id, ret;
cpu_set_t set;
char *const envp[] = { 0 };
int32_t ctx_id;
if (access ("/krun-sev.json", F_OK) == 0)
krun_create_ctx = dlsym (handle, "krun_create_ctx");
if (krun_create_ctx == NULL)
return crun_make_error (err, 0, "could not find symbol in the krun library");
ctx_id = krun_create_ctx ();
if (UNLIKELY (ctx_id < 0))
return crun_make_error (err, -ctx_id, "could not create krun context");
return ctx_id;
}
static int
libkrun_configure_kernel (uint32_t ctx_id, void *handle, yajl_val *config_tree, libcrun_error_t *err)
{
int32_t (*krun_set_kernel) (uint32_t ctx_id, const char *kernel_path,
uint32_t kernel_format, const char *initrd_path, const char *kernel_cmdline);
const char *path_kernel_path[] = { "kernel_path", (const char *) 0 };
const char *path_kernel_format[] = { "kernel_format", (const char *) 0 };
const char *path_initrd_path[] = { "initrd_path", (const char *) 0 };
const char *path_kernel_cmdline[] = { "kernel_cmdline", (const char *) 0 };
yajl_val kernel_path = NULL;
yajl_val kernel_format = NULL;
yajl_val val_initrd_path = NULL;
yajl_val val_kernel_cmdline = NULL;
char *initrd_path = NULL;
char *kernel_cmdline = NULL;
int ret;
/* kernel_path and kernel_format must be present */
kernel_path = yajl_tree_get (*config_tree, path_kernel_path, yajl_t_string);
if (kernel_path == NULL || ! YAJL_IS_STRING (kernel_path))
return 0;
kernel_format = yajl_tree_get (*config_tree, path_kernel_format, yajl_t_number);
if (kernel_format == NULL || ! YAJL_IS_INTEGER (kernel_format))
return 0;
/* initrd and kernel_cmdline are optional */
val_initrd_path = yajl_tree_get (*config_tree, path_initrd_path, yajl_t_string);
if (val_initrd_path != NULL && YAJL_IS_STRING (val_initrd_path))
initrd_path = YAJL_GET_STRING (val_initrd_path);
val_kernel_cmdline = yajl_tree_get (*config_tree, path_kernel_cmdline, yajl_t_string);
if (val_kernel_cmdline != NULL && YAJL_IS_STRING (val_kernel_cmdline))
kernel_cmdline = YAJL_GET_STRING (val_kernel_cmdline);
krun_set_kernel = dlsym (handle, "krun_set_kernel");
if (krun_set_kernel == NULL)
return crun_make_error (err, 0, "could not find symbol in krun library");
ret = krun_set_kernel (ctx_id,
YAJL_GET_STRING (kernel_path),
YAJL_GET_INTEGER (kernel_format),
initrd_path, kernel_cmdline);
if (UNLIKELY (ret < 0))
return crun_make_error (err, -ret, "could not configure a krun external kernel");
return 0;
}
static int
libkrun_read_vm_config (yajl_val *config_tree, libcrun_error_t *err)
{
int ret;
cleanup_free char *config = NULL;
struct parser_context ctx = { 0, stderr };
if (access (KRUN_VM_FILE, F_OK) != 0)
return 0;
ret = read_all_file (KRUN_VM_FILE, &config, NULL, err);
if (UNLIKELY (ret < 0))
return ret;
ret = parse_json_file (config_tree, config, &ctx, err);
if (UNLIKELY (ret < 0))
return ret;
return 0;
}
static int
libkrun_configure_vm (uint32_t ctx_id, void *handle, bool *configured, yajl_val *config_tree, libcrun_error_t *err)
{
int32_t (*krun_set_vm_config) (uint32_t ctx_id, uint8_t num_vcpus, uint32_t ram_mib);
yajl_val cpus = NULL;
yajl_val ram_mib = NULL;
const char *path_cpus[] = { "cpus", (const char *) 0 };
const char *path_ram_mib[] = { "ram_mib", (const char *) 0 };
int ret;
if (*config_tree == NULL)
return 0;
/* Try to configure an external kernel. If the configuration file doesn't
* specify a kernel, libkrun automatically falls back to using libkrunfw,
* if the library is present and was loaded while creating the context.
*/
ret = libkrun_configure_kernel (ctx_id, handle, config_tree, err);
if (UNLIKELY (ret))
return ret;
cpus = yajl_tree_get (*config_tree, path_cpus, yajl_t_number);
ram_mib = yajl_tree_get (*config_tree, path_ram_mib, yajl_t_number);
/* Both cpus and ram_mib must be present at the same time */
if (cpus == NULL || ram_mib == NULL || ! YAJL_IS_INTEGER (cpus) || ! YAJL_IS_INTEGER (ram_mib))
return 0;
krun_set_vm_config = dlsym (handle, "krun_set_vm_config");
if (krun_set_vm_config == NULL)
return crun_make_error (err, 0, "could not find symbol in the krun library");
ret = krun_set_vm_config (ctx_id, YAJL_GET_INTEGER (cpus), YAJL_GET_INTEGER (ram_mib));
if (UNLIKELY (ret < 0))
return crun_make_error (err, -ret, "could not set krun vm configuration");
*configured = true;
return 0;
}
static int
libkrun_configure_flavor (void *cookie, yajl_val *config_tree, libcrun_error_t *err)
{
int ret, sev_indicated = 0;
const char *path_flavor[] = { "flavor", (const char *) 0 };
struct krun_config *kconf = (struct krun_config *) cookie;
yajl_val val_flavor = NULL;
char *flavor = NULL;
// Check whether the SEV flavor was indicated in the krun VM config.
val_flavor = yajl_tree_get (*config_tree, path_flavor, yajl_t_string);
if (val_flavor != NULL && YAJL_IS_STRING (val_flavor))
{
flavor = YAJL_GET_STRING (val_flavor);
// The SEV flavor will be used if the krun VM config indicates to use SEV
// within the "flavor" field.
sev_indicated |= strcmp (flavor, KRUN_FLAVOR_SEV) == 0;
}
// To maintain backward compatibility, also use the SEV flavor if the
// KRUN_SEV_FILE was found.
sev_indicated |= access (KRUN_SEV_FILE, F_OK) == 0;
if (sev_indicated)
{
if (kconf->handle_sev == NULL)
error (EXIT_FAILURE, 0, "the container requires libkrun-sev but it's not available");
handle = kconf->handle_sev;
if (kconf->handle != NULL)
{
// We no longer need the libkrun handle.
ret = dlclose (kconf->handle);
if (UNLIKELY (ret != 0))
return crun_make_error (err, 0, "could not unload handle: `%s`", dlerror ());
}
kconf->handle = kconf->handle_sev;
kconf->ctx_id = kconf->ctx_id_sev;
kconf->sev = true;
}
else
{
if (kconf->handle == NULL)
error (EXIT_FAILURE, 0, "the container requires libkrun but it's not available");
handle = kconf->handle;
if (kconf->handle_sev != NULL)
{
// We no longer need the libkrun-sev handle.
ret = dlclose (kconf->handle_sev);
if (UNLIKELY (ret != 0))
return crun_make_error (err, 0, "could not unload handle: `%s`", dlerror ());
}
kconf->sev = false;
}
return 0;
}
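The flavor selection above comes down to one boolean: SEV is used if the VM config names the `sev` flavor, or, for backward compatibility, if the SEV config file exists. A minimal sketch of just that decision follows. `sev_requested` is a hypothetical helper, and the file check is passed in as a flag rather than performed with `access ()`:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_FLAVOR_SEV "sev" /* mirrors KRUN_FLAVOR_SEV in the diff */

/* Hypothetical helper: `flavor` is the optional "flavor" string from the
 * VM config (NULL if absent); `sev_file_present` is the result of checking
 * for the legacy SEV config file. */
static bool
sev_requested (const char *flavor, bool sev_file_present)
{
  bool indicated = false;

  /* An explicit "sev" flavor in the VM config selects SEV. */
  if (flavor != NULL)
    indicated |= strcmp (flavor, SKETCH_FLAVOR_SEV) == 0;

  /* The legacy file also selects SEV, whatever the flavor says. */
  indicated |= sev_file_present;

  return indicated;
}
```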
static int
libkrun_exec (void *cookie, libcrun_container_t *container, const char *pathname, char *const argv[])
{
runtime_spec_schema_config_schema *def = container->container_def;
int32_t (*krun_set_log_level) (uint32_t level);
int (*krun_start_enter) (uint32_t ctx_id);
int32_t (*krun_set_vm_config) (uint32_t ctx_id, uint8_t num_vcpus, uint32_t ram_mib);
int32_t (*krun_set_root) (uint32_t ctx_id, const char *root_path);
int32_t (*krun_set_root_disk) (uint32_t ctx_id, const char *disk_path);
int32_t (*krun_set_tee_config_file) (uint32_t ctx_id, const char *file_path);
struct krun_config *kconf = (struct krun_config *) cookie;
void *handle;
uint32_t num_vcpus, ram_mib;
int32_t ctx_id, ret;
cpu_set_t set;
libcrun_error_t err;
bool configured = false;
yajl_val config_tree = NULL;
ret = libkrun_read_vm_config (&config_tree, &err);
if (UNLIKELY (ret < 0))
error (EXIT_FAILURE, -ret, "libkrun VM config exists, but unable to parse");
ret = libkrun_configure_flavor (cookie, &config_tree, &err);
if (UNLIKELY (ret < 0))
error (EXIT_FAILURE, -ret, "unable to configure libkrun flavor");
handle = kconf->handle;
ctx_id = kconf->ctx_id;
krun_set_log_level = dlsym (handle, "krun_set_log_level");
krun_create_ctx = dlsym (handle, "krun_create_ctx");
krun_start_enter = dlsym (handle, "krun_start_enter");
if (krun_set_log_level == NULL || krun_create_ctx == NULL
|| krun_start_enter == NULL)
error (EXIT_FAILURE, 0, "could not find symbol in `libkrun.so`");
if (krun_set_log_level == NULL || krun_start_enter == NULL)
error (EXIT_FAILURE, 0, "could not find symbol in the krun library");
/* Set log level to "error" */
krun_set_log_level (1);
ctx_id = krun_create_ctx ();
if (UNLIKELY (ctx_id < 0))
error (EXIT_FAILURE, -ctx_id, "could not create krun context");
if (kconf->sev)
{
krun_set_root_disk = dlsym (handle, "krun_set_root_disk");
@ -112,11 +310,34 @@ libkrun_exec (void *cookie, libcrun_container_t *container, const char *pathname
if (UNLIKELY (ret < 0))
error (EXIT_FAILURE, -ret, "could not set root disk");
ret = krun_set_tee_config_file (ctx_id, "/krun-sev.json");
ret = krun_set_tee_config_file (ctx_id, KRUN_SEV_FILE);
if (UNLIKELY (ret < 0))
error (EXIT_FAILURE, -ret, "could not set krun tee config file");
}
else
{
krun_set_root = dlsym (handle, "krun_set_root");
if (krun_set_root == NULL)
error (EXIT_FAILURE, 0, "could not find symbol in `libkrun.so`");
ret = krun_set_root (ctx_id, "/");
if (UNLIKELY (ret < 0))
error (EXIT_FAILURE, -ret, "could not set krun root");
}
ret = libkrun_configure_vm (ctx_id, handle, &configured, &config_tree, &err);
if (UNLIKELY (ret))
{
libcrun_error_t *tmp_err = &err;
libcrun_error_write_warning_and_release (NULL, &tmp_err);
error (EXIT_FAILURE, ret, "could not configure krun vm");
}
/* If we couldn't configure the microVM using KRUN_VM_FILE, fall back to the
* legacy configuration logic.
*/
if (! configured)
{
/* If sched_getaffinity fails, default to 1 vcpu. */
num_vcpus = 1;
@ -132,34 +353,19 @@ libkrun_exec (void *cookie, libcrun_container_t *container, const char *pathname
num_vcpus = MIN (CPU_COUNT (&set), LIBKRUN_MAX_VCPUS);
krun_set_vm_config = dlsym (handle, "krun_set_vm_config");
krun_set_root = dlsym (handle, "krun_set_root");
krun_set_workdir = dlsym (handle, "krun_set_workdir");
krun_set_exec = dlsym (handle, "krun_set_exec");
if (krun_set_vm_config == NULL || krun_set_root == NULL
|| krun_set_exec == NULL)
if (krun_set_vm_config == NULL)
error (EXIT_FAILURE, 0, "could not find symbol in `libkrun.so`");
ret = krun_set_vm_config (ctx_id, num_vcpus, ram_mib);
if (UNLIKELY (ret < 0))
error (EXIT_FAILURE, -ret, "could not set krun vm configuration");
ret = krun_set_root (ctx_id, "/");
if (UNLIKELY (ret < 0))
error (EXIT_FAILURE, -ret, "could not set krun root");
if (krun_set_workdir && def && def->process && def->process->cwd)
{
ret = krun_set_workdir (ctx_id, def->process->cwd);
if (UNLIKELY (ret < 0))
error (EXIT_FAILURE, -ret, "could not set krun working directory");
}
ret = krun_set_exec (ctx_id, pathname, &argv[1], &envp[0]);
if (UNLIKELY (ret < 0))
error (EXIT_FAILURE, -ret, "could not set krun executable");
}
return krun_start_enter (ctx_id);
yajl_tree_free (config_tree);
ret = krun_start_enter (ctx_id);
return -ret;
}
/* libkrun_create_kvm_device: explicitly adds kvm device. */
@ -176,8 +382,8 @@ libkrun_configure_container (void *cookie, enum handler_configure_phase phase,
cleanup_close int devfd = -1;
cleanup_close int rootfsfd_cleanup = -1;
runtime_spec_schema_config_schema *def = container->container_def;
bool create_sev = false;
bool is_user_ns;
bool create_sev;
if (rootfs == NULL)
rootfsfd = AT_FDCWD;
@ -193,11 +399,12 @@ libkrun_configure_container (void *cookie, enum handler_configure_phase phase,
cleanup_free char *origin_config_path = NULL;
cleanup_free char *state_dir = NULL;
cleanup_free char *config = NULL;
cleanup_close int fd = -1;
size_t config_size;
state_dir = libcrun_get_state_directory (context->state_root, context->id);
if (UNLIKELY (state_dir == NULL))
return crun_make_error (err, 0, "could not retrieve the state directory");
ret = libcrun_get_state_directory (&state_dir, context->state_root, context->id, err);
if (UNLIKELY (ret < 0))
return ret;
ret = append_paths (&origin_config_path, err, state_dir, "config.json", NULL);
if (UNLIKELY (ret < 0))
@ -207,7 +414,13 @@ libkrun_configure_container (void *cookie, enum handler_configure_phase phase,
if (UNLIKELY (ret < 0))
return ret;
ret = write_file_at (rootfsfd, ".krun_config.json", config, config_size, err);
/* CVE-2025-24965: the content below rootfs cannot be trusted because it is controlled by the user. We
must ensure the file is opened below the rootfs directory. */
fd = safe_openat (rootfsfd, rootfs, KRUN_CONFIG_FILE, WRITE_FILE_DEFAULT_FLAGS | O_NOFOLLOW, S_IRUSR | S_IRGRP | S_IROTH, err);
if (UNLIKELY (fd < 0))
return fd;
ret = safe_write (fd, KRUN_CONFIG_FILE, config, config_size, err);
if (UNLIKELY (ret < 0))
return ret;
}
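
Note on the CVE-2025-24965 hunk above: the change replaces a direct `write_file_at` with an open that refuses to follow symlinks, followed by a write on the resulting descriptor. The sketch below illustrates only the `O_NOFOLLOW` part with plain POSIX calls. `write_config_nofollow` is a hypothetical helper, not crun's `safe_openat` or `safe_write`, whose implementations are not shown in this diff. `O_NOFOLLOW` guards only the final path component, so this is not a substitute for a helper that resolves the whole path below the rootfs.

```c
#ifndef _GNU_SOURCE
# define _GNU_SOURCE
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (an assumption for illustration, not crun code):
   create or truncate `name` below `dirfd` and write `len` bytes to it,
   refusing to follow a symlink planted at `name`.
   Returns 0 on success or a negative errno value. */
static int
write_config_nofollow (int dirfd, const char *name, const char *data, size_t len)
{
  int fd;

  assert (name != NULL && data != NULL);

  /* 0444 mirrors S_IRUSR | S_IRGRP | S_IROTH from the hunk above. */
  fd = openat (dirfd, name, O_WRONLY | O_CREAT | O_TRUNC | O_NOFOLLOW | O_CLOEXEC, 0444);
  if (fd < 0)
    return -errno;

  while (len > 0)
    {
      ssize_t n = write (fd, data, len);
      if (n < 0)
        {
          int saved = errno;
          if (saved == EINTR)
            continue;
          close (fd);
          return -saved;
        }
      data += n;
      len -= (size_t) n;
    }

  return close (fd) < 0 ? -errno : 0;
}
```

On Linux, calling this with a `name` that is a symlink should fail with `-ELOOP` instead of writing through the link, which is the behaviour the `O_NOFOLLOW` flag in the hunk is there to provide.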
@ -232,7 +445,7 @@ libkrun_configure_container (void *cookie, enum handler_configure_phase phase,
}
}
devfd = openat (rootfsfd, "dev", O_RDONLY | O_DIRECTORY | O_CLOEXEC);
devfd = openat (rootfsfd, "dev", O_PATH | O_DIRECTORY | O_CLOEXEC);
if (UNLIKELY (devfd < 0))
return crun_make_error (err, errno, "open /dev directory in `%s`", rootfs);
@ -258,8 +471,8 @@ libkrun_configure_container (void *cookie, enum handler_configure_phase phase,
static int
libkrun_load (void **cookie, libcrun_error_t *err)
{
int32_t ret;
struct krun_config *kconf;
void *handle;
const char *libkrun_so = "libkrun.so.1";
const char *libkrun_sev_so = "libkrun-sev.so.1";
@ -273,11 +486,37 @@ libkrun_load (void **cookie, libcrun_error_t *err)
if (kconf->handle == NULL && kconf->handle_sev == NULL)
{
free (kconf);
return crun_make_error (err, 0, "failed to open `%s` and `%s` for krun_config", libkrun_so, libkrun_sev_so);
return crun_make_error (err, 0, "failed to open `%s` and `%s` for krun_config: %s", libkrun_so, libkrun_sev_so, dlerror ());
}
kconf->sev = false;
/* Newer versions of libkrun no longer link against libkrunfw and
instead they open it when creating the context. This implies
we need to call "krun_create_ctx" before switching namespaces
or it won't be able to find the library bundling the kernel. */
if (kconf->handle)
{
ret = libkrun_create_context (kconf->handle, err);
if (UNLIKELY (ret < 0))
{
free (kconf);
return ret;
}
kconf->ctx_id = ret;
}
if (kconf->handle_sev)
{
ret = libkrun_create_context (kconf->handle_sev, err);
if (UNLIKELY (ret < 0))
{
free (kconf);
return ret;
}
kconf->ctx_id_sev = ret;
}
*cookie = kconf;
return 0;
@ -288,11 +527,15 @@ libkrun_unload (void *cookie, libcrun_error_t *err)
{
int r;
if (cookie)
struct krun_config *kconf = (struct krun_config *) cookie;
if (kconf != NULL)
{
r = dlclose (cookie);
if (UNLIKELY (r < 0))
return crun_make_error (err, 0, "could not unload handle: `%s`", dlerror ());
if (kconf->handle != NULL)
{
r = dlclose (kconf->handle);
if (UNLIKELY (r != 0))
return crun_make_error (err, 0, "could not unload handle: `%s`", dlerror ());
}
}
return 0;
}


@ -100,7 +100,7 @@ mono_load (void **cookie, libcrun_error_t *err)
handle = dlopen ("libmono-native.so", RTLD_NOW);
if (handle == NULL)
return crun_make_error (err, 0, "could not load `libmono-2.0.so`: `%s`", dlerror ());
return crun_make_error (err, 0, "could not load `libmono-native.so`: `%s`", dlerror ());
*cookie = handle;
return 0;
@ -131,9 +131,6 @@ mono_configure_container (void *cookie arg_unused, enum handler_configure_phase
if (ret != 0)
return ret;
/* release any error if set since we are going to be returning from here */
crun_error_release (err);
return 0;
}

src/libcrun/handlers/wamr.c (new file)

@ -0,0 +1,247 @@
/*
* crun - OCI runtime written in C
*
* Copyright (C) 2017, 2018, 2019, 2020, 2021 Giuseppe Scrivano <giuseppe@scrivano.org>
* crun is free software; you can redistribute it and/or modify
* it under the terms of the GNU Lesser General Public License as published by
* the Free Software Foundation; either version 2.1 of the License, or
* (at your option) any later version.
*
* crun is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public License
* along with crun. If not, see <http://www.gnu.org/licenses/>.
*/
#define _GNU_SOURCE
#include <config.h>
#include "../custom-handler.h"
#include "../container.h"
#include "../utils.h"
#include "../linux.h"
#include "handler-utils.h"
#include <unistd.h>
#include <sys/stat.h>
#include <errno.h>
#include <sys/types.h>
#include <fcntl.h>
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#ifdef HAVE_DLOPEN
# include <dlfcn.h>
#endif
#ifdef HAVE_WAMR
# include <wasm_export.h>
#endif
#if HAVE_DLOPEN && HAVE_WAMR
static int
libwamr_load (void **cookie, libcrun_error_t *err)
{
void *handle;
handle = dlopen ("libiwasm.so", RTLD_NOW);
if (handle == NULL)
return crun_make_error (err, 0, "could not load `libiwasm.so`: `%s`", dlerror ());
*cookie = handle;
return 0;
}
static int
libwamr_unload (void *cookie, libcrun_error_t *err)
{
int r;
if (cookie)
{
r = dlclose (cookie);
if (UNLIKELY (r < 0))
return crun_make_error (err, 0, "could not unload handle: `%s`", dlerror ());
}
return 0;
}
static int
libwamr_exec (void *cookie, __attribute__ ((unused)) libcrun_container_t *container, const char *pathname, char *const argv[])
{
// load symbols from the shared library libiwasm.so
bool (*wasm_runtime_init) ();
RuntimeInitArgs init_args;
bool (*wasm_runtime_full_init) (RuntimeInitArgs *init_args);
wasm_module_t module;
wasm_module_t (*wasm_runtime_load) (uint8_t *buf, uint32_t size, char *error_buf, uint32_t error_buf_size);
wasm_module_inst_t module_inst;
wasm_module_inst_t (*wasm_runtime_instantiate) (const wasm_module_t module, uint32_t default_stack_size, uint32_t host_managed_heap_size, char *error_buf, uint32_t error_buf_size);
wasm_function_inst_t func;
wasm_function_inst_t (*wasm_runtime_lookup_function) (wasm_module_inst_t const module_inst, const char *name);
wasm_exec_env_t exec_env;
wasm_exec_env_t (*wasm_runtime_create_exec_env) (wasm_module_inst_t module_inst, uint32_t stack_size);
bool (*wasm_runtime_call_wasm) (wasm_exec_env_t exec_env, wasm_function_inst_t function, uint32_t argc, uint32_t argv[]);
const char *(*wasm_runtime_get_exception) (wasm_module_inst_t module_inst);
void (*wasm_runtime_set_exception) (wasm_module_inst_t module_inst, const char *exception);
void (*wasm_runtime_clear_exception) (wasm_module_inst_t module_inst);
void (*wasm_runtime_destroy_exec_env) (wasm_exec_env_t exec_env);
void (*wasm_runtime_deinstantiate) (wasm_module_inst_t module_inst);
void (*wasm_runtime_unload) (wasm_module_t module);
void (*wasm_runtime_destroy) ();
uint32_t (*wasm_runtime_get_wasi_exit_code) (wasm_module_inst_t module_inst);
bool (*wasm_application_execute_main) (wasm_module_inst_t module_inst, int32_t argc, char *argv[]);
void (*wasm_runtime_set_wasi_args) (wasm_module_t module, const char *dir_list[], uint32_t dir_count, const char *map_dir_list[], uint32_t map_dir_count, const char *env[], uint32_t env_count, char *argv[], int argc);
wasm_runtime_init = dlsym (cookie, "wasm_runtime_init");
wasm_runtime_full_init = dlsym (cookie, "wasm_runtime_full_init");
wasm_runtime_load = dlsym (cookie, "wasm_runtime_load");
wasm_runtime_instantiate = dlsym (cookie, "wasm_runtime_instantiate");
wasm_runtime_lookup_function = dlsym (cookie, "wasm_runtime_lookup_function");
wasm_runtime_create_exec_env = dlsym (cookie, "wasm_runtime_create_exec_env");
wasm_runtime_call_wasm = dlsym (cookie, "wasm_runtime_call_wasm");
wasm_runtime_get_exception = dlsym (cookie, "wasm_runtime_get_exception");
wasm_runtime_set_exception = dlsym (cookie, "wasm_runtime_set_exception");
wasm_runtime_clear_exception = dlsym (cookie, "wasm_runtime_clear_exception");
wasm_runtime_destroy_exec_env = dlsym (cookie, "wasm_runtime_destroy_exec_env");
wasm_runtime_deinstantiate = dlsym (cookie, "wasm_runtime_deinstantiate");
wasm_runtime_unload = dlsym (cookie, "wasm_runtime_unload");
wasm_runtime_destroy = dlsym (cookie, "wasm_runtime_destroy");
wasm_runtime_get_wasi_exit_code = dlsym (cookie, "wasm_runtime_get_wasi_exit_code");
wasm_application_execute_main = dlsym (cookie, "wasm_application_execute_main");
wasm_runtime_set_wasi_args = dlsym (cookie, "wasm_runtime_set_wasi_args");
if (wasm_runtime_init == NULL)
error (EXIT_FAILURE, 0, "could not find wasm_runtime_init symbol in `libiwasm.so`");
if (wasm_runtime_full_init == NULL)
error (EXIT_FAILURE, 0, "could not find wasm_runtime_full_init symbol in `libiwasm.so`");
if (wasm_runtime_load == NULL)
error (EXIT_FAILURE, 0, "could not find wasm_runtime_load symbol in `libiwasm.so`");
if (wasm_runtime_instantiate == NULL)
error (EXIT_FAILURE, 0, "could not find wasm_runtime_instantiate symbol in `libiwasm.so`");
if (wasm_runtime_lookup_function == NULL)
error (EXIT_FAILURE, 0, "could not find wasm_runtime_lookup_function symbol in `libiwasm.so`");
if (wasm_runtime_create_exec_env == NULL)
error (EXIT_FAILURE, 0, "could not find wasm_runtime_create_exec_env symbol in `libiwasm.so`");
if (wasm_runtime_call_wasm == NULL)
error (EXIT_FAILURE, 0, "could not find wasm_runtime_call_wasm symbol in `libiwasm.so`");
if (wasm_runtime_get_exception == NULL)
error (EXIT_FAILURE, 0, "could not find wasm_runtime_get_exception symbol in `libiwasm.so`");
if (wasm_runtime_set_exception == NULL)
error (EXIT_FAILURE, 0, "could not find wasm_runtime_set_exception symbol in `libiwasm.so`");
if (wasm_runtime_clear_exception == NULL)
error (EXIT_FAILURE, 0, "could not find wasm_runtime_clear_exception symbol in `libiwasm.so`");
if (wasm_runtime_destroy_exec_env == NULL)
error (EXIT_FAILURE, 0, "could not find wasm_runtime_destroy_exec_env symbol in `libiwasm.so`");
if (wasm_runtime_deinstantiate == NULL)
error (EXIT_FAILURE, 0, "could not find wasm_runtime_deinstantiate symbol in `libiwasm.so`");
if (wasm_runtime_unload == NULL)
error (EXIT_FAILURE, 0, "could not find wasm_runtime_unload symbol in `libiwasm.so`");
if (wasm_runtime_destroy == NULL)
error (EXIT_FAILURE, 0, "could not find wasm_runtime_destroy symbol in `libiwasm.so`");
if (wasm_runtime_get_wasi_exit_code == NULL)
error (EXIT_FAILURE, 0, "could not find wasm_runtime_get_wasi_exit_code symbol in `libiwasm.so`");
if (wasm_application_execute_main == NULL)
error (EXIT_FAILURE, 0, "could not find wasm_application_execute_main symbol in `libiwasm.so`");
if (wasm_runtime_set_wasi_args == NULL)
error (EXIT_FAILURE, 0, "could not find wasm_runtime_set_wasi_args symbol in `libiwasm.so`");
int ret;
const char *exception;
cleanup_free char *buffer = NULL;
char error_buf[128];
size_t buffer_size;
uint32_t stack_size = 8096, heap_size = 0;
libcrun_error_t tmp_err = NULL;
const char *wasi_proc_exit_exception = "wasi proc exit";
const char *wasi_addr_pool[2] = { "0.0.0.0/0", "::/0" };
const char *wasi_ns_lookup_pool[1] = { "*" };
const char *dirs[2] = { "/", "." };
char **container_env = container->container_def->process->env;
size_t env_count = container->container_def->process->env_len;
int arg_count = 0;
char *const *arg;
for (arg = argv; *arg != NULL; ++arg)
arg_count++;
// initialize the wasm runtime with the default configuration
if (! wasm_runtime_init ())
error (EXIT_FAILURE, 0, "Failed to initialize the wasm runtime");
// read WASM file into a memory buffer
ret = read_all_file (pathname, &buffer, &buffer_size, &tmp_err);
if (UNLIKELY (ret < 0))
{
crun_error_release (&tmp_err);
error (EXIT_FAILURE, 0, "Failed to read file");
}
if (UNLIKELY (buffer_size > UINT32_MAX))
error (EXIT_FAILURE, 0, "File size is too large");
// parse the WASM file from buffer and create a WASM module
module = wasm_runtime_load (buffer, buffer_size, error_buf, sizeof (error_buf));
if (! module)
error (EXIT_FAILURE, 0, "Failed to load WASM file");
// instantiate the WASI environment
wasm_runtime_set_wasi_args (module, dirs, 1, NULL, 0, (const char **) container_env, env_count, (char **) argv, arg_count);
// enable the WASI socket api
wasm_runtime_set_wasi_addr_pool (module, wasi_addr_pool, 2);
wasm_runtime_set_wasi_ns_lookup_pool (module, wasi_ns_lookup_pool, 1);
// create an instance of the WASM module (WASM linear memory is ready)
module_inst = wasm_runtime_instantiate (module, stack_size, heap_size, error_buf, sizeof (error_buf));
if (! module_inst)
error (EXIT_FAILURE, 0, "Failed to instantiate the WASM module");
// look up a WASM function by its name (the function signature can be NULL here)
func = wasm_runtime_lookup_function (module_inst, "_start");
if (! func)
error (EXIT_FAILURE, 0, "Failed to look up the WASM function");
// create an execution environment to execute the WASM functions
exec_env = wasm_runtime_create_exec_env (module_inst, stack_size);
if (! exec_env)
error (EXIT_FAILURE, 0, "Failed to create the execution environment");
// call the WASM function
ret = wasm_runtime_call_wasm (exec_env, func, 0, NULL);
if (ret)
wasm_runtime_set_exception (module_inst, wasi_proc_exit_exception);
exception = wasm_runtime_get_exception (module_inst);
if (! strstr (exception, wasi_proc_exit_exception))
error (EXIT_FAILURE, 0, "Failed to call the WASM function");
wasm_runtime_clear_exception (module_inst);
wasm_runtime_destroy_exec_env (exec_env);
wasm_runtime_deinstantiate (module_inst);
wasm_runtime_unload (module);
wasm_runtime_destroy ();
exit (EXIT_SUCCESS);
}
static int
libwamr_can_handle_container (libcrun_container_t *container, libcrun_error_t *err)
{
return wasm_can_handle_container (container, err);
}
struct custom_handler_s handler_wamr = {
.name = "wamr",
.alias = "wasm",
.feature_string = "WASM:wamr",
.load = libwamr_load,
.unload = libwamr_unload,
.run_func = libwamr_exec,
.can_handle_container = libwamr_can_handle_container,
};
#endif


@ -26,6 +26,8 @@
#include <sched.h>
#include <fcntl.h>
#include <sys/vfs.h>
#include <unistd.h>
#include <errno.h>
#define INTEL_RDT_MOUNT_POINT "/sys/fs/resctrl"
#define SCHEMATA_FILE "schemata"
@ -45,10 +47,21 @@ is_rdt_mounted (libcrun_error_t *err)
return sfs.f_type == RDTGROUP_SUPER_MAGIC;
}
static int
get_rdt_value (char **out, const char *l3_cache_schema, const char *mem_bw_schema)
int
get_rdt_value (char **out, const char *l3_cache_schema, const char *mem_bw_schema, char *const *schemata)
{
return xasprintf (out, "%s%s%s\n", l3_cache_schema ?: "", (l3_cache_schema && mem_bw_schema) ? "\n" : "", mem_bw_schema ?: "");
cleanup_free char *schemata_joined = NULL;
size_t schemata_size = 0;
while (schemata && schemata[schemata_size])
schemata_size++;
if (schemata_size > 0)
schemata_joined = str_join_array (0, schemata_size, schemata, "\n");
return xasprintf (out, "%s%s%s%s%s\n", l3_cache_schema ?: "",
(l3_cache_schema && mem_bw_schema) ? "\n" : "", mem_bw_schema ?: "",
((l3_cache_schema || mem_bw_schema) && schemata_joined) ? "\n" : "", schemata_joined ?: "");
}
struct key_value
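
Note on the `get_rdt_value` hunk above: the new format string emits the L3 schema, the memory-bandwidth schema, and the extra `schemata` entries, separated by newlines only between parts that are present, and always ends with a newline. The sketch below reproduces that joining rule with the standard library only. `join_rdt_schemata` is a hypothetical stand-in; it does not use crun's `xasprintf` or `str_join_array`, and the sample schema strings in the usage note are made up.

```c
#include <stdlib.h>
#include <string.h>

/* Append `s` to `out`, preceded by '\n' when it is not the first part. */
static void
append_part (char *out, size_t *len, size_t *parts, const char *s)
{
  size_t n = strlen (s);

  if (*parts > 0)
    out[(*len)++] = '\n';
  memcpy (out + *len, s, n);
  *len += n;
  (*parts)++;
}

/* Hypothetical stand-in for get_rdt_value: joins every non-NULL part with
   '\n' and terminates the result with '\n'.  `schemata` may be NULL or a
   NULL-terminated array.  Returns a malloc'ed string, or NULL on failure. */
static char *
join_rdt_schemata (const char *l3, const char *mb, char *const *schemata)
{
  size_t total = 2; /* trailing '\n' and NUL */
  size_t len = 0, parts = 0, i;
  char *out;

  if (l3)
    total += strlen (l3) + 1;
  if (mb)
    total += strlen (mb) + 1;
  for (i = 0; schemata && schemata[i]; i++)
    total += strlen (schemata[i]) + 1;

  out = malloc (total);
  if (out == NULL)
    return NULL;

  if (l3)
    append_part (out, &len, &parts, l3);
  if (mb)
    append_part (out, &len, &parts, mb);
  for (i = 0; schemata && schemata[i]; i++)
    append_part (out, &len, &parts, schemata[i]);

  out[len++] = '\n';
  out[len] = '\0';
  return out;
}
```

For example, `join_rdt_schemata ("L3:0=ff", NULL, extra)` with `extra` holding `"L2:0=f"` yields `"L3:0=ff\nL2:0=f\n"`; the caller frees the result.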
@ -189,12 +202,12 @@ validate_rdt_configuration (const char *name, const char *l3_cache_schema, const
}
static int
write_intelrdt_string (int fd, const char *file, const char *l3_cache_schema, const char *mem_bw_schema, libcrun_error_t *err)
write_intelrdt_string (int fd, const char *file, const char *l3_cache_schema, const char *mem_bw_schema, char *const *schemata, libcrun_error_t *err)
{
cleanup_free char *formatted = NULL;
int len, ret;
len = get_rdt_value (&formatted, l3_cache_schema, mem_bw_schema);
len = get_rdt_value (&formatted, l3_cache_schema, mem_bw_schema, schemata);
if (len < 0)
return crun_make_error (err, errno, "internal error get_rdt_value");
@ -237,6 +250,8 @@ resctl_create (const char *name, bool explicit_clos_id, bool *created, const cha
int exist;
int ret;
*created = false;
ret = is_rdt_mounted (err);
if (UNLIKELY (ret < 0))
return ret;
@ -269,9 +284,12 @@ resctl_create (const char *name, bool explicit_clos_id, bool *created, const cha
return validate_rdt_configuration (name, l3_cache_schema, mem_bw_schema, err);
/* At this point, assume it was created. */
ret = crun_ensure_directory (path, 0755, true, err);
if (UNLIKELY (ret < 0))
return ret;
*created = true;
return crun_ensure_directory (path, 0755, true, err);
return 0;
}
int
@ -286,21 +304,25 @@ resctl_move_task_to (const char *name, pid_t pid, libcrun_error_t *err)
if (UNLIKELY (ret < 0))
return ret;
len = sprintf (pid_str, "%d", pid);
len = snprintf (pid_str, sizeof (pid_str), "%d", pid);
if (UNLIKELY (len >= (int) sizeof (pid_str)))
return crun_make_error (err, 0, "internal error: static buffer too small");
return write_file (path, pid_str, len, err);
}
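
Note on the `sprintf` to `snprintf` change above: `snprintf` returns the length the full output would have had, so a return value greater than or equal to the buffer size signals truncation. A minimal sketch of that check follows; `format_pid` is a hypothetical helper, not a function from crun.

```c
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical helper: format `pid` as decimal into `buf`.
   Returns the string length, or -1 if `buf` is too small or
   snprintf reports an error. */
static int
format_pid (char *buf, size_t size, pid_t pid)
{
  int len = snprintf (buf, size, "%d", (int) pid);

  if (len < 0 || (size_t) len >= size)
    return -1;
  return len;
}
```

The returned length can then be passed straight to a write call, as the hunk does with `write_file (path, pid_str, len, err)`.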
int
resctl_update (const char *name, const char *l3_cache_schema, const char *mem_bw_schema, libcrun_error_t *err)
resctl_update (const char *name, const char *l3_cache_schema, const char *mem_bw_schema,
char *const *schemata, libcrun_error_t *err)
{
const char *actual_l3_cache_schema = l3_cache_schema;
cleanup_free char *cleaned_l3_cache_schema = NULL;
cleanup_free char *path = NULL;
cleanup_close int fd = -1;
int ret;
/* Nothing to do. */
if (l3_cache_schema == NULL && mem_bw_schema == NULL)
if (l3_cache_schema == NULL && mem_bw_schema == NULL && schemata == NULL)
return 0;
ret = append_paths (&path, err, INTEL_RDT_MOUNT_POINT, name, SCHEMATA_FILE, NULL);
@ -308,17 +330,16 @@ resctl_update (const char *name, const char *l3_cache_schema, const char *mem_bw
return ret;
if (l3_cache_schema && strstr (l3_cache_schema, "MB:"))
l3_cache_schema = cleaned_l3_cache_schema = intelrdt_clean_l3_cache_schema (l3_cache_schema);
{
cleaned_l3_cache_schema = intelrdt_clean_l3_cache_schema (l3_cache_schema);
actual_l3_cache_schema = cleaned_l3_cache_schema;
}
fd = open (path, O_WRONLY | O_CLOEXEC);
if (UNLIKELY (fd < 0))
return crun_make_error (err, errno, "open `%s`", path);
return crun_make_error (err, errno, "open `%s` for writing", path);
ret = write_intelrdt_string (fd, path, l3_cache_schema, mem_bw_schema, err);
if (UNLIKELY (ret < 0))
return ret;
return 0;
return write_intelrdt_string (fd, path, actual_l3_cache_schema, mem_bw_schema, schemata, err);
}
int


@ -26,7 +26,7 @@
int resctl_create (const char *name, bool explicit_clos_id, bool *created, const char *l3_cache_schema, const char *mem_bw_schema, libcrun_error_t *err);
int resctl_move_task_to (const char *name, pid_t pid, libcrun_error_t *err);
int resctl_update (const char *name, const char *l3_cache_schema, const char *mem_bw_schema, libcrun_error_t *err);
int resctl_update (const char *name, const char *l3_cache_schema, const char *mem_bw_schema, char *const *schemata, libcrun_error_t *err);
int resctl_destroy (const char *name, libcrun_error_t *err);
#endif

File diff suppressed because it is too large


@ -84,7 +84,9 @@ int libcrun_join_process (libcrun_context_t *context, libcrun_container_t *conta
libcrun_container_status_t *status, const char *cgroup, int detach,
runtime_spec_schema_config_schema_process *process, int *terminal_fd, libcrun_error_t *err);
int libcrun_linux_container_update (libcrun_container_status_t *status,
runtime_spec_schema_config_linux_resources *resources, libcrun_error_t *err);
const char *state_root,
runtime_spec_schema_config_linux_resources *resources,
libcrun_error_t *err);
int libcrun_create_keyring (const char *name, const char *label, libcrun_error_t *err);
int libcrun_container_pause_linux (libcrun_container_status_t *status, libcrun_error_t *err);
int libcrun_container_unpause_linux (libcrun_container_status_t *status, libcrun_error_t *err);
@ -139,10 +141,20 @@ const char *libcrun_get_intelrdt_name (const char *ctr_name, libcrun_container_t
int libcrun_apply_intelrdt (const char *ctr_name, libcrun_container_t *container, pid_t pid, int actions, libcrun_error_t *err);
int libcrun_move_network_devices (libcrun_container_t *container, pid_t pid, libcrun_error_t *err);
int libcrun_destroy_intelrdt (const char *name, libcrun_error_t *err);
int libcrun_update_intel_rdt (const char *ctr_name, libcrun_container_t *container, const char *l3_cache_schema, const char *mem_bw_schema, libcrun_error_t *err);
int libcrun_update_intel_rdt (const char *ctr_name, libcrun_container_t *container, const char *l3_cache_schema, const char *mem_bw_schema, char *const *schemata, libcrun_error_t *err);
int libcrun_safe_chdir (const char *path, libcrun_error_t *err);
int get_bind_mount (int dirfd, const char *src, bool recursive, bool rdonly, bool nofollow, libcrun_error_t *err);
bool is_bind_mount (runtime_spec_schema_defs_mount *mnt, bool *recursive, bool *src_nofollow);
int libcrun_make_runtime_mounts (libcrun_container_t *container, libcrun_container_status_t *status, runtime_spec_schema_defs_mount **mounts, size_t len, libcrun_error_t *err);
int libcrun_destroy_runtime_mounts (libcrun_container_t *container, libcrun_container_status_t *status, runtime_spec_schema_defs_mount **mounts, size_t len, libcrun_error_t *err);
#endif


@ -48,14 +48,14 @@
struct propagation_flags_s;
enum
{
TOTAL_KEYWORDS = 57,
TOTAL_KEYWORDS = 59,
MIN_WORD_LENGTH = 2,
MAX_WORD_LENGTH = 14,
MIN_HASH_VALUE = 2,
MAX_HASH_VALUE = 74
MAX_HASH_VALUE = 79
};
/* maximum key range = 73, duplicates = 0 */
/* maximum key range = 78, duplicates = 0 */
#ifdef __GNUC__
__inline
@ -69,32 +69,32 @@ hash (register const char *str, register size_t len)
{
static const unsigned char asso_values[] =
{
75, 75, 75, 75, 75, 75, 75, 75, 75, 75,
75, 75, 75, 75, 75, 75, 75, 75, 75, 75,
75, 75, 75, 75, 75, 75, 75, 75, 75, 75,
75, 75, 75, 75, 75, 75, 75, 75, 75, 75,
75, 75, 75, 75, 75, 75, 75, 75, 75, 75,
75, 75, 75, 75, 75, 75, 75, 75, 75, 75,
75, 75, 75, 75, 75, 75, 75, 75, 75, 75,
75, 75, 75, 75, 75, 75, 75, 75, 75, 75,
75, 75, 75, 75, 75, 75, 75, 75, 75, 75,
75, 75, 75, 75, 75, 75, 75, 8, 29, 12,
3, 21, 0, 75, 31, 0, 75, 75, 15, 10,
0, 4, 16, 75, 0, 19, 8, 17, 26, 0,
16, 26, 75, 75, 75, 75, 75, 75, 75, 75,
75, 75, 75, 75, 75, 75, 75, 75, 75, 75,
75, 75, 75, 75, 75, 75, 75, 75, 75, 75,
75, 75, 75, 75, 75, 75, 75, 75, 75, 75,
75, 75, 75, 75, 75, 75, 75, 75, 75, 75,
75, 75, 75, 75, 75, 75, 75, 75, 75, 75,
75, 75, 75, 75, 75, 75, 75, 75, 75, 75,
75, 75, 75, 75, 75, 75, 75, 75, 75, 75,
75, 75, 75, 75, 75, 75, 75, 75, 75, 75,
75, 75, 75, 75, 75, 75, 75, 75, 75, 75,
75, 75, 75, 75, 75, 75, 75, 75, 75, 75,
75, 75, 75, 75, 75, 75, 75, 75, 75, 75,
75, 75, 75, 75, 75, 75, 75, 75, 75, 75,
75, 75, 75, 75, 75, 75
80, 80, 80, 80, 80, 80, 80, 80, 80, 80,
80, 80, 80, 80, 80, 80, 80, 80, 80, 80,
80, 80, 80, 80, 80, 80, 80, 80, 80, 80,
80, 80, 80, 80, 80, 80, 80, 80, 80, 80,
80, 80, 80, 80, 80, 24, 80, 80, 80, 80,
80, 80, 80, 80, 80, 80, 80, 80, 80, 80,
80, 80, 80, 80, 80, 80, 80, 80, 80, 80,
80, 80, 80, 80, 80, 80, 80, 80, 80, 80,
80, 80, 80, 80, 80, 80, 80, 80, 80, 80,
80, 80, 80, 80, 80, 80, 80, 11, 35, 9,
3, 20, 3, 80, 35, 0, 80, 80, 17, 17,
0, 4, 21, 80, 0, 9, 4, 25, 24, 0,
23, 33, 80, 80, 80, 80, 80, 80, 80, 80,
80, 80, 80, 80, 80, 80, 80, 80, 80, 80,
80, 80, 80, 80, 80, 80, 80, 80, 80, 80,
80, 80, 80, 80, 80, 80, 80, 80, 80, 80,
80, 80, 80, 80, 80, 80, 80, 80, 80, 80,
80, 80, 80, 80, 80, 80, 80, 80, 80, 80,
80, 80, 80, 80, 80, 80, 80, 80, 80, 80,
80, 80, 80, 80, 80, 80, 80, 80, 80, 80,
80, 80, 80, 80, 80, 80, 80, 80, 80, 80,
80, 80, 80, 80, 80, 80, 80, 80, 80, 80,
80, 80, 80, 80, 80, 80, 80, 80, 80, 80,
80, 80, 80, 80, 80, 80, 80, 80, 80, 80,
80, 80, 80, 80, 80, 80, 80, 80, 80, 80,
80, 80, 80, 80, 80, 80
};
register unsigned int hval = len;
@ -138,100 +138,104 @@ static const struct propagation_flags_s wordlist[] =
{"nodiratime", 0, MS_NODIRATIME, 0},
#line 85 "src/libcrun/mount_flags.perf"
{"rnodiratime", 0, MS_NODIRATIME, OPTION_RECURSIVE},
#line 55 "src/libcrun/mount_flags.perf"
{"diratime", 1, MS_NODIRATIME, 0},
#line 83 "src/libcrun/mount_flags.perf"
{"rnoatime", 0, MS_NOATIME, OPTION_RECURSIVE},
#line 81 "src/libcrun/mount_flags.perf"
{"rnomand", 1, MS_MANDLOCK, OPTION_RECURSIVE},
#line 82 "src/libcrun/mount_flags.perf"
{"ratime", 1, MS_NOATIME, OPTION_RECURSIVE},
#line 80 "src/libcrun/mount_flags.perf"
{"rmand", 0, MS_MANDLOCK, OPTION_RECURSIVE},
#line 66 "src/libcrun/mount_flags.perf"
{"rprivate", 0, MS_REC|MS_PRIVATE, 0},
#line 51 "src/libcrun/mount_flags.perf"
{"mand", 0, MS_MANDLOCK, 0},
#line 91 "src/libcrun/mount_flags.perf"
{"idmap", 0, 0, OPTION_IDMAP},
#line 54 "src/libcrun/mount_flags.perf"
{"noatime", 0, MS_NOATIME, 0},
#line 52 "src/libcrun/mount_flags.perf"
{"nomand", 1, MS_MANDLOCK, 0},
#line 49 "src/libcrun/mount_flags.perf"
{"dirsync", 0, MS_DIRSYNC, 0},
#line 72 "src/libcrun/mount_flags.perf"
{"rnosuid", 0, MS_NOSUID, OPTION_RECURSIVE},
#line 53 "src/libcrun/mount_flags.perf"
{"atime", 1, MS_NOATIME, 0},
#line 76 "src/libcrun/mount_flags.perf"
{"rnoexec", 0, MS_NOEXEC, OPTION_RECURSIVE},
#line 44 "src/libcrun/mount_flags.perf"
{"nodev", 0, MS_NODEV, 0},
#line 38 "src/libcrun/mount_flags.perf"
{"rbind", 0, MS_REC|MS_BIND, 0},
#line 58 "src/libcrun/mount_flags.perf"
{"norelatime", 1, MS_RELATIME, 0},
#line 37 "src/libcrun/mount_flags.perf"
{"bind", 0, MS_BIND, 0},
#line 89 "src/libcrun/mount_flags.perf"
{"rnostrictatime", 1, MS_STRICTATIME, OPTION_RECURSIVE},
#line 82 "src/libcrun/mount_flags.perf"
{"ratime", 1, MS_NOATIME, OPTION_RECURSIVE},
#line 55 "src/libcrun/mount_flags.perf"
{"diratime", 1, MS_NODIRATIME, 0},
#line 83 "src/libcrun/mount_flags.perf"
{"rnoatime", 0, MS_NOATIME, OPTION_RECURSIVE},
#line 59 "src/libcrun/mount_flags.perf"
{"strictatime", 0, MS_STRICTATIME, 0},
#line 88 "src/libcrun/mount_flags.perf"
{"rstrictatime", 0, MS_STRICTATIME, OPTION_RECURSIVE},
#line 36 "src/libcrun/mount_flags.perf"
{"defaults", 0, 0, 0},
#line 71 "src/libcrun/mount_flags.perf"
{"rsuid", 1, MS_NOSUID, OPTION_RECURSIVE},
#line 50 "src/libcrun/mount_flags.perf"
{"remount", 0, MS_REMOUNT, 0},
#line 41 "src/libcrun/mount_flags.perf"
{"suid", 1, MS_NOSUID, 0},
#line 54 "src/libcrun/mount_flags.perf"
{"noatime", 0, MS_NOATIME, 0},
#line 89 "src/libcrun/mount_flags.perf"
{"rnostrictatime", 1, MS_STRICTATIME, OPTION_RECURSIVE},
#line 81 "src/libcrun/mount_flags.perf"
{"rnomand", 1, MS_MANDLOCK, OPTION_RECURSIVE},
#line 66 "src/libcrun/mount_flags.perf"
{"rprivate", 0, MS_REC|MS_PRIVATE, 0},
#line 60 "src/libcrun/mount_flags.perf"
{"nostrictatime", 1, MS_STRICTATIME, 0},
#line 86 "src/libcrun/mount_flags.perf"
{"rrelatime", 0, MS_RELATIME, OPTION_RECURSIVE},
#line 42 "src/libcrun/mount_flags.perf"
{"nosuid", 0, MS_NOSUID, 0},
#line 46 "src/libcrun/mount_flags.perf"
{"noexec", 0, MS_NOEXEC, 0},
#line 76 "src/libcrun/mount_flags.perf"
{"rnoexec", 0, MS_NOEXEC, OPTION_RECURSIVE},
#line 44 "src/libcrun/mount_flags.perf"
{"nodev", 0, MS_NODEV, 0},
#line 80 "src/libcrun/mount_flags.perf"
{"rmand", 0, MS_MANDLOCK, OPTION_RECURSIVE},
#line 58 "src/libcrun/mount_flags.perf"
{"norelatime", 1, MS_RELATIME, 0},
#line 51 "src/libcrun/mount_flags.perf"
{"mand", 0, MS_MANDLOCK, 0},
#line 91 "src/libcrun/mount_flags.perf"
{"idmap", 0, 0, OPTION_IDMAP},
#line 53 "src/libcrun/mount_flags.perf"
{"atime", 1, MS_NOATIME, 0},
#line 52 "src/libcrun/mount_flags.perf"
{"nomand", 1, MS_MANDLOCK, 0},
#line 71 "src/libcrun/mount_flags.perf"
{"rsuid", 1, MS_NOSUID, OPTION_RECURSIVE},
#line 38 "src/libcrun/mount_flags.perf"
{"rbind", 0, MS_REC|MS_BIND, 0},
#line 41 "src/libcrun/mount_flags.perf"
{"suid", 1, MS_NOSUID, 0},
#line 37 "src/libcrun/mount_flags.perf"
{"bind", 0, MS_BIND, 0},
#line 64 "src/libcrun/mount_flags.perf"
{"rslave", 0, MS_REC|MS_SLAVE, 0},
#line 65 "src/libcrun/mount_flags.perf"
{"private", 0, MS_PRIVATE, 0},
#line 42 "src/libcrun/mount_flags.perf"
{"nosuid", 0, MS_NOSUID, 0},
#line 36 "src/libcrun/mount_flags.perf"
{"defaults", 0, 0, 0},
#line 86 "src/libcrun/mount_flags.perf"
{"rrelatime", 0, MS_RELATIME, OPTION_RECURSIVE},
#line 77 "src/libcrun/mount_flags.perf"
{"rsync", 0, MS_SYNCHRONOUS, OPTION_RECURSIVE},
#line 57 "src/libcrun/mount_flags.perf"
{"relatime", 0, MS_RELATIME, 0},
#line 50 "src/libcrun/mount_flags.perf"
{"remount", 0, MS_REMOUNT, 0},
#line 94 "src/libcrun/mount_flags.perf"
{"dest-nofollow", 0, 0, OPTION_DEST_NOFOLLOW},
#line 43 "src/libcrun/mount_flags.perf"
{"dev", 1, MS_NODEV, 0},
#line 73 "src/libcrun/mount_flags.perf"
{"rdev", 1, MS_NODEV, OPTION_RECURSIVE},
#line 90 "src/libcrun/mount_flags.perf"
{"tmpcopyup", 0, 0, OPTION_TMPCOPYUP},
#line 67 "src/libcrun/mount_flags.perf"
{"unbindable", 0, MS_UNBINDABLE, 0},
#line 68 "src/libcrun/mount_flags.perf"
{"runbindable", 0, MS_REC|MS_UNBINDABLE, 0},
#line 65 "src/libcrun/mount_flags.perf"
{"private", 0, MS_PRIVATE, 0},
#line 46 "src/libcrun/mount_flags.perf"
{"noexec", 0, MS_NOEXEC, 0},
#line 93 "src/libcrun/mount_flags.perf"
{"src-nofollow", 0, 0, OPTION_SRC_NOFOLLOW},
#line 47 "src/libcrun/mount_flags.perf"
{"sync", 0, MS_SYNCHRONOUS, 0},
#line 57 "src/libcrun/mount_flags.perf"
{"relatime", 0, MS_RELATIME, 0},
#line 48 "src/libcrun/mount_flags.perf"
{"async", 1, MS_SYNCHRONOUS, 0},
#line 78 "src/libcrun/mount_flags.perf"
{"rasync", 1, MS_SYNCHRONOUS, OPTION_RECURSIVE},
#line 47 "src/libcrun/mount_flags.perf"
{"sync", 0, MS_SYNCHRONOUS, 0},
#line 75 "src/libcrun/mount_flags.perf"
{"rexec", 1, MS_NOEXEC, OPTION_RECURSIVE},
#line 90 "src/libcrun/mount_flags.perf"
{"tmpcopyup", 0, 0, OPTION_TMPCOPYUP},
#line 61 "src/libcrun/mount_flags.perf"
{"shared", 0, MS_SHARED, 0},
#line 62 "src/libcrun/mount_flags.perf"
{"rshared", 0, MS_REC|MS_SHARED, 0},
#line 92 "src/libcrun/mount_flags.perf"
{"copy-symlink", 0, 0, OPTION_COPY_SYMLINK},
#line 63 "src/libcrun/mount_flags.perf"
{"slave", 0, MS_SLAVE, 0},
#line 75 "src/libcrun/mount_flags.perf"
{"rexec", 1, MS_NOEXEC, OPTION_RECURSIVE},
#line 67 "src/libcrun/mount_flags.perf"
{"unbindable", 0, MS_UNBINDABLE, 0},
#line 68 "src/libcrun/mount_flags.perf"
{"runbindable", 0, MS_REC|MS_UNBINDABLE, 0},
#line 45 "src/libcrun/mount_flags.perf"
{"exec", 1, MS_NOEXEC, 0}
{"exec", 1, MS_NOEXEC, 0},
#line 92 "src/libcrun/mount_flags.perf"
{"copy-symlink", 0, 0, OPTION_COPY_SYMLINK}
};
const struct propagation_flags_s *
@ -373,22 +377,22 @@ libcrun_mount_flag_in_word_set (register const char *str, register size_t len)
case 48:
resword = &wordlist[41];
goto compare;
case 50:
case 49:
resword = &wordlist[42];
goto compare;
case 51:
case 50:
resword = &wordlist[43];
goto compare;
case 52:
case 51:
resword = &wordlist[44];
goto compare;
case 53:
case 52:
resword = &wordlist[45];
goto compare;
case 54:
case 53:
resword = &wordlist[46];
goto compare;
case 55:
case 54:
resword = &wordlist[47];
goto compare;
case 56:
@ -397,27 +401,33 @@ libcrun_mount_flag_in_word_set (register const char *str, register size_t len)
case 57:
resword = &wordlist[49];
goto compare;
case 59:
case 58:
resword = &wordlist[50];
goto compare;
case 61:
case 59:
resword = &wordlist[51];
goto compare;
case 62:
case 60:
resword = &wordlist[52];
goto compare;
case 63:
case 64:
resword = &wordlist[53];
goto compare;
case 68:
case 66:
resword = &wordlist[54];
goto compare;
case 71:
case 68:
resword = &wordlist[55];
goto compare;
case 72:
case 69:
resword = &wordlist[56];
goto compare;
case 74:
resword = &wordlist[57];
goto compare;
case 77:
resword = &wordlist[58];
goto compare;
}
return 0;
compare:
@ -431,7 +441,7 @@ libcrun_mount_flag_in_word_set (register const char *str, register size_t len)
}
return 0;
}
#line 93 "src/libcrun/mount_flags.perf"
#line 95 "src/libcrun/mount_flags.perf"
const struct propagation_flags_s *

View File

@ -25,6 +25,8 @@ enum
OPTION_RECURSIVE = (1 << 1),
OPTION_IDMAP = (1 << 2),
OPTION_COPY_SYMLINK = (1 << 3),
OPTION_SRC_NOFOLLOW = (1 << 4),
OPTION_DEST_NOFOLLOW = (1 << 5),
};
struct propagation_flags_s

View File

@ -90,6 +90,8 @@ rnostrictatime, 1, MS_STRICTATIME, OPTION_RECURSIVE
tmpcopyup, 0, 0, OPTION_TMPCOPYUP
idmap, 0, 0, OPTION_IDMAP
copy-symlink, 0, 0, OPTION_COPY_SYMLINK
src-nofollow, 0, 0, OPTION_SRC_NOFOLLOW
dest-nofollow, 0, 0, OPTION_DEST_NOFOLLOW
%%
const struct propagation_flags_s *

490
src/libcrun/net_device.c Normal file

@ -0,0 +1,490 @@
/*
* crun - OCI runtime written in C
*
* Copyright (C) 2025 Giuseppe Scrivano <giuseppe@scrivano.org>
* crun is free software; you can redistribute it and/or modify
* it under the terms of the GNU Lesser General Public License as published by
* the Free Software Foundation; either version 2.1 of the License, or
* (at your option) any later version.
*
* crun is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public License
* along with crun. If not, see <http://www.gnu.org/licenses/>.
*/
#define _GNU_SOURCE
#include <config.h>
#include "net_device.h"
#include "utils.h"
#include <sys/socket.h>
#include <errno.h>
#include <sched.h>
#include <sys/wait.h>
#include <sys/xattr.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>
#include <arpa/inet.h>
#include <net/if.h>
#if HAVE_STDATOMIC_H
# include <stdatomic.h>
# ifndef HAVE_ATOMIC_INT
# define atomic_uint volatile uint
# endif
#endif
struct ip_addr
{
struct ifaddrmsg ifa;
int rta_len;
char *rta;
};
struct nl_req
{
struct nlmsghdr nlh;
union
{
struct ifaddrmsg ifa;
struct ifinfomsg ifi;
};
};
static atomic_uint nl_seq_counter;
static uint32_t
get_next_seq ()
{
return (uint32_t) ++nl_seq_counter;
}
static uint32_t
reset_request (struct nl_req *req, int type, int flags, int msg_len)
{
uint32_t seq = get_next_seq ();
memset (req, 0, sizeof (*req));
req->nlh.nlmsg_type = type;
req->nlh.nlmsg_flags = flags;
req->nlh.nlmsg_len = NLMSG_LENGTH (msg_len);
req->nlh.nlmsg_seq = seq;
return seq;
}
static void
cleanup_ip_addrsp (void *p)
{
struct ip_addr **pp = (struct ip_addr **) p;
struct ip_addr *ip;
if (*pp == NULL)
return;
for (ip = *pp; ip->rta_len >= 0; ip++)
free (ip->rta);
free (*pp);
}
#define cleanup_ip_addrs __attribute__ ((cleanup (cleanup_ip_addrsp)))
static int
open_netlink_fd (libcrun_error_t *err)
{
cleanup_close int sock = -1;
struct sockaddr_nl local = {
.nl_family = AF_NETLINK
};
int fd;
sock = socket (AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
if (sock < 0)
return crun_make_error (err, errno, "netlink socket");
if (bind (sock, (struct sockaddr *) &local, sizeof (local)) < 0)
return crun_make_error (err, errno, "bind");
fd = sock;
sock = -1;
return fd;
}
static int
append_rtattr (struct nlmsghdr *n, size_t maxlen, int type, const void *data, size_t data_len, libcrun_error_t *err)
{
int len = RTA_LENGTH (data_len);
struct rtattr *rta;
if (NLMSG_ALIGN (n->nlmsg_len) + RTA_ALIGN (len) > maxlen)
return crun_make_error (err, E2BIG, "internal error: buffer too small");
rta = (struct rtattr *) (((char *) n) + NLMSG_ALIGN (n->nlmsg_len));
rta->rta_type = type;
rta->rta_len = len;
if (data_len)
memcpy (RTA_DATA (rta), data, data_len);
n->nlmsg_len = NLMSG_ALIGN (n->nlmsg_len) + RTA_ALIGN (len);
return 0;
}
static int
send_request (int sock, struct nl_req *req, libcrun_error_t *err)
{
int ret;
ret = TEMP_FAILURE_RETRY (send (sock, req, req->nlh.nlmsg_len, 0));
if (UNLIKELY (ret < 0))
return crun_make_error (err, errno, "send");
return 0;
}
/* if_nametoindex with an open netlink socket. */
static int
name_to_index (int sock, const char *ifname, char *buffer, size_t buffer_size, libcrun_error_t *err)
{
struct nlmsghdr *nlh;
struct nl_req *req;
uint32_t seq;
int index = 0;
ssize_t len;
int ret;
req = (struct nl_req *) buffer;
nlh = &req->nlh;
seq = reset_request (req, RTM_GETLINK, NLM_F_REQUEST, sizeof (struct ifinfomsg));
req->ifi.ifi_family = AF_UNSPEC;
ret = append_rtattr (nlh, buffer_size, IFLA_IFNAME, ifname, strlen (ifname) + 1, err);
if (UNLIKELY (ret < 0))
return ret;
ret = send_request (sock, req, err);
if (UNLIKELY (ret < 0))
return ret;
len = TEMP_FAILURE_RETRY (recv (sock, buffer, buffer_size, 0));
if (UNLIKELY (len < 0))
return crun_make_error (err, errno, "recv");
for (nlh = (struct nlmsghdr *) buffer; NLMSG_OK (nlh, (unsigned int) len); nlh = NLMSG_NEXT (nlh, len))
{
if (nlh->nlmsg_seq != seq)
continue;
if (nlh->nlmsg_type == NLMSG_ERROR)
{
struct nlmsgerr *err_data = (struct nlmsgerr *) NLMSG_DATA (nlh);
if (err_data->error == 0)
continue;
return crun_make_error (err, -err_data->error, "netlink error while looking for interface `%s`", ifname);
}
if (nlh->nlmsg_type == RTM_NEWLINK)
{
struct ifinfomsg *ifi = NLMSG_DATA (nlh);
index = ifi->ifi_index;
return index;
}
}
if (index == 0)
return crun_make_error (err, 0, "could not find device `%s`", ifname);
return index;
}
static int
wait_for_ack (int sock, uint32_t seq, char *recv_buffer, size_t recv_buffer_size, libcrun_error_t *err)
{
struct nlmsghdr *nlh_recv = (struct nlmsghdr *) recv_buffer;
ssize_t len;
do
{
len = TEMP_FAILURE_RETRY (recv (sock, recv_buffer, recv_buffer_size, 0));
if (len < 0)
return crun_make_error (err, errno, "recv");
} while (nlh_recv->nlmsg_seq != seq);
if (! NLMSG_OK (nlh_recv, len))
return crun_make_error (err, 0, "received invalid packet");
if (nlh_recv->nlmsg_type == NLMSG_ERROR)
{
struct nlmsgerr *err_data = (struct nlmsgerr *) NLMSG_DATA (nlh_recv);
if (err_data->error == 0)
return 0;
return crun_make_error (err, -err_data->error, "netlink error");
}
return crun_make_error (err, 0, "internal error: received unknown netlink packet type");
}
static void
copy_ip_addr (struct nlmsghdr *nlh, struct ip_addr *ip)
{
struct ifaddrmsg *ifa = (struct ifaddrmsg *) NLMSG_DATA (nlh);
memcpy (&ip->ifa, ifa, sizeof (struct ifaddrmsg));
ip->rta_len = IFA_PAYLOAD (nlh);
ip->rta = xmalloc (ip->rta_len);
memcpy (ip->rta, IFA_RTA (ifa), ip->rta_len);
}
static int
get_ip_addresses (int sock, uint32_t ifindex, struct ip_addr **out_ips, char *buffer, size_t buffer_size, libcrun_error_t *err)
{
struct nl_req *req = (struct nl_req *) buffer;
cleanup_ip_addrs struct ip_addr *ips = NULL;
size_t ips_len = 0;
int optval = 1;
uint32_t seq;
ssize_t len;
int ret;
#ifdef NETLINK_GET_STRICT_CHK
ret = setsockopt (sock, SOL_NETLINK, NETLINK_GET_STRICT_CHK, &optval, sizeof (optval));
if (ret < 0)
{
if (errno != ENOPROTOOPT)
return crun_make_error (err, errno, "setsockopt (NETLINK_GET_STRICT_CHK)");
/* NETLINK_GET_STRICT_CHK not supported by this kernel, continue without strict checking. */
}
#endif
seq = reset_request (req, RTM_GETADDR, NLM_F_DUMP | NLM_F_REQUEST, sizeof (struct ifaddrmsg));
req->ifa.ifa_family = AF_UNSPEC;
req->ifa.ifa_index = ifindex;
ret = send_request (sock, req, err);
if (UNLIKELY (ret < 0))
return ret;
while ((len = TEMP_FAILURE_RETRY (recv (sock, buffer, buffer_size, 0))) > 0)
{
struct nlmsghdr *nlh;
for (nlh = (struct nlmsghdr *) buffer; NLMSG_OK (nlh, len); nlh = NLMSG_NEXT (nlh, len))
{
struct ifaddrmsg *ifa;
if (nlh->nlmsg_seq != seq)
continue;
if (nlh->nlmsg_type == NLMSG_DONE)
{
*out_ips = ips;
ips = NULL;
return 0;
}
if (nlh->nlmsg_type == NLMSG_ERROR)
{
struct nlmsgerr *err_data = (struct nlmsgerr *) NLMSG_DATA (nlh);
if (err_data->error == 0)
continue;
return crun_make_error (err, -err_data->error, "netlink error reading ip addresses");
}
ifa = (struct ifaddrmsg *) NLMSG_DATA (nlh);
if (ifa->ifa_index != ifindex)
continue;
/* Copy only permanent, globally routable IP addresses. */
if (! (ifa->ifa_flags & IFA_F_PERMANENT) || (ifa->ifa_scope != RT_SCOPE_UNIVERSE))
continue;
/* Always append an empty struct. */
ips = xrealloc (ips, sizeof (struct ip_addr) * (++ips_len + 1));
/* Mark the end of the array. */
ips[ips_len].rta_len = -1;
copy_ip_addr (nlh, &ips[ips_len - 1]);
}
}
if (UNLIKELY (len < 0))
return crun_make_error (err, errno, "recv");
return 0;
}
static int
configure_ip_addresses (int sock, int ifindex, char *buffer, size_t buffer_size, const struct ip_addr *ips, libcrun_error_t *err)
{
struct nl_req *req = (struct nl_req *) buffer;
const struct ip_addr *ip;
int ret;
if (ips == NULL)
return 0;
for (ip = ips; ip->rta_len >= 0; ip++)
{
/* RTA_NEXT modifies the argument, so use a copy. */
int rta_len = ip->rta_len;
struct rtattr *rta;
uint32_t seq;
seq = reset_request (req, RTM_NEWADDR, NLM_F_REQUEST | NLM_F_CREATE | NLM_F_REPLACE | NLM_F_ACK, sizeof (struct ifaddrmsg));
memcpy (&req->ifa, &ip->ifa, sizeof (struct ifaddrmsg));
req->ifa.ifa_index = ifindex;
for (rta = (struct rtattr *) ip->rta; RTA_OK (rta, rta_len); rta = RTA_NEXT (rta, rta_len))
{
ret = append_rtattr (&(req->nlh), buffer_size, rta->rta_type, RTA_DATA (rta), RTA_PAYLOAD (rta), err);
if (UNLIKELY (ret < 0))
return ret;
}
ret = send_request (sock, req, err);
if (UNLIKELY (ret < 0))
return ret;
ret = wait_for_ack (sock, seq, buffer, buffer_size, err);
if (ret < 0)
return ret;
}
return 0;
}
static int
request_enable_interface_and_wait (int sock, char *buffer, size_t buffer_size, int index, libcrun_error_t *err)
{
struct nl_req *req = (struct nl_req *) buffer;
uint32_t seq;
int ret;
seq = reset_request (req, RTM_NEWLINK, NLM_F_REQUEST | NLM_F_ACK, sizeof (struct ifinfomsg));
req->ifi.ifi_family = AF_UNSPEC;
req->ifi.ifi_index = index;
req->ifi.ifi_flags = IFF_UP;
req->ifi.ifi_change = IFF_UP;
ret = send_request (sock, req, err);
if (UNLIKELY (ret < 0))
return ret;
return wait_for_ack (sock, seq, buffer, buffer_size, err);
}
static int
setup_network_device_in_ns_helper (char *buffer, size_t buffer_size, int netns_fd, const char *newifname,
struct ip_addr *ips, libcrun_error_t *err)
{
cleanup_close int sock_in_ns = -1;
int new_ifindex;
int ret;
ret = setns (netns_fd, CLONE_NEWNET);
if (UNLIKELY (ret < 0))
return crun_make_error (err, errno, "change network namespace");
sock_in_ns = open_netlink_fd (err);
if (sock_in_ns < 0)
return sock_in_ns;
/* We could ask for a specific index with IFLA_NEW_IFINDEX, and the kernel apparently tries to
reuse the existing one anyway, but asking for a specific index could cause conflicts if the
target network namespace already exists, so avoid doing it and look up the device again. */
new_ifindex = name_to_index (sock_in_ns, newifname, buffer, buffer_size, err);
if (UNLIKELY (new_ifindex < 0))
return new_ifindex;
ret = configure_ip_addresses (sock_in_ns, new_ifindex, buffer, buffer_size, ips, err);
if (ret < 0)
return ret;
return request_enable_interface_and_wait (sock_in_ns, buffer, buffer_size, new_ifindex, err);
}
static int
do_move_link_to_ns_and_wait (int sock, char *buffer, size_t buffer_size, int ifindex, int netns_fd, const char *newifname, libcrun_error_t *err)
{
struct nl_req *req = (struct nl_req *) buffer;
uint32_t seq;
int ret;
seq = reset_request (req, RTM_NEWLINK, NLM_F_REQUEST | NLM_F_ACK, sizeof (struct ifinfomsg));
req->ifi.ifi_family = AF_UNSPEC;
req->ifi.ifi_index = ifindex;
ret = append_rtattr (&req->nlh, buffer_size, IFLA_NET_NS_FD, &netns_fd, sizeof (netns_fd), err);
if (UNLIKELY (ret < 0))
return ret;
ret = append_rtattr (&req->nlh, buffer_size, IFLA_IFNAME, newifname, strlen (newifname) + 1, err);
if (UNLIKELY (ret < 0))
return ret;
ret = send_request (sock, req, err);
if (UNLIKELY (ret < 0))
return ret;
return wait_for_ack (sock, seq, buffer, buffer_size, err);
}
int
move_network_device (const char *ifname, const char *newifname, int netns_fd, libcrun_error_t *err)
{
const size_t buffer_size = 8192;
cleanup_ip_addrs struct ip_addr *ips = NULL;
cleanup_free char *buffer = xmalloc (buffer_size);
cleanup_close int sock = -1;
int wait_status;
int ifindex;
pid_t pid;
int ret;
sock = open_netlink_fd (err);
if (sock < 0)
return sock;
ifindex = name_to_index (sock, ifname, buffer, buffer_size, err);
if (UNLIKELY (ifindex < 0))
return ifindex;
ret = get_ip_addresses (sock, ifindex, &ips, buffer, buffer_size, err);
if (UNLIKELY (ret < 0))
return ret;
/* Move the device to the target network namespace. */
ret = do_move_link_to_ns_and_wait (sock, buffer, buffer_size, ifindex, netns_fd, newifname, err);
if (UNLIKELY (ret < 0))
return ret;
/* must be vfork to propagate the error from the child proc. */
pid = vfork ();
if (UNLIKELY (pid < 0))
return crun_make_error (err, errno, "vfork");
if (pid == 0)
{
ret = setup_network_device_in_ns_helper (buffer, buffer_size, netns_fd, newifname, ips, err);
if (UNLIKELY (ret < 0))
_exit (-ret);
_exit (0);
}
ret = waitpid_ignore_stopped (pid, &wait_status, 0);
if (UNLIKELY (ret < 0))
return crun_make_error (err, errno, "waitpid for exec child pid");
if (wait_status != 0)
return -get_process_exit_status (wait_status);
return 0;
}
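
Note on the netlink helpers in net_device.c: the request-building steps (reset the header, then append attributes with alignment and a bounds check) can be sketched in isolation as below. This is only an illustration under stated assumptions, not crun's code: `sketch_append_attr` and `sketch_build_getlink` are hypothetical names, errors are reduced to a plain return value instead of `libcrun_error_t`, and nothing is sent on a socket.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* Hypothetical helper, not part of crun: append one attribute to a netlink
   message, refusing to write past maxlen bytes.  Returns 0 or -1.  */
static int
sketch_append_attr (struct nlmsghdr *n, size_t maxlen, unsigned short type,
                    const void *data, size_t data_len)
{
  size_t len = RTA_LENGTH (data_len);         /* attribute header + payload */
  size_t offset = NLMSG_ALIGN (n->nlmsg_len); /* where the attribute starts */
  struct rtattr *rta;

  if (offset + RTA_ALIGN (len) > maxlen)
    return -1;

  rta = (struct rtattr *) ((char *) n + offset);
  rta->rta_type = type;
  rta->rta_len = (unsigned short) len;
  if (data_len)
    memcpy (RTA_DATA (rta), data, data_len);

  /* The message grows by the padded attribute size.  */
  n->nlmsg_len = (unsigned int) (offset + RTA_ALIGN (len));
  return 0;
}

/* Hypothetical helper: build an RTM_GETLINK request that looks up ifname.
   buf must be suitably aligned for struct nlmsghdr.  Returns the total
   message length, or 0 if the buffer is too small.  */
static size_t
sketch_build_getlink (void *buf, size_t buf_size, const char *ifname)
{
  struct nlmsghdr *nlh = buf;
  struct ifinfomsg *ifi;

  assert (ifname != NULL);
  if (buf_size < NLMSG_LENGTH (sizeof (struct ifinfomsg)))
    return 0;

  /* Zero everything so that padding bytes are well defined.  */
  memset (buf, 0, buf_size);
  nlh->nlmsg_type = RTM_GETLINK;
  nlh->nlmsg_flags = NLM_F_REQUEST;
  nlh->nlmsg_len = NLMSG_LENGTH (sizeof (struct ifinfomsg));

  ifi = NLMSG_DATA (nlh);
  ifi->ifi_family = AF_UNSPEC;

  if (sketch_append_attr (nlh, buf_size, IFLA_IFNAME, ifname, strlen (ifname) + 1) < 0)
    return 0;
  return nlh->nlmsg_len;
}
```

A caller would pass an aligned buffer (for example a union containing a `struct nlmsghdr`) and send the first `nlmsg_len` bytes on a `NETLINK_ROUTE` socket. Checking the bound before writing, as `append_rtattr` does, keeps an overlong interface name from overrunning the shared request buffer.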

28
src/libcrun/net_device.h Normal file

@ -0,0 +1,28 @@
/*
* crun - OCI runtime written in C
*
* Copyright (C) 2025 Giuseppe Scrivano <giuseppe@scrivano.org>
* crun is free software; you can redistribute it and/or modify
* it under the terms of the GNU Lesser General Public License as published by
* the Free Software Foundation; either version 2.1 of the License, or
* (at your option) any later version.
*
* crun is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public License
* along with crun. If not, see <http://www.gnu.org/licenses/>.
*/
#ifndef NET_DEVICE_H
#define NET_DEVICE_H
#include <config.h>
#include <ocispec/runtime_spec_schema_config_schema.h>
#include "error.h"
int move_network_device (const char *ifname, const char *newifname, int netns_fd, libcrun_error_t *err);
#endif

234
src/libcrun/ring_buffer.c Normal file
View File

@ -0,0 +1,234 @@
/*
* crun - OCI runtime written in C
*
* Copyright (C) 2024 Giuseppe Scrivano <giuseppe@scrivano.org>
* crun is free software; you can redistribute it and/or modify
* it under the terms of the GNU Lesser General Public License as published by
* the Free Software Foundation; either version 2.1 of the License, or
* (at your option) any later version.
*
* crun is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public License
* along with crun. If not, see <http://www.gnu.org/licenses/>.
*/
#define _GNU_SOURCE
#include <config.h>
#include <sys/uio.h>
#include "ring_buffer.h"
#include "utils.h"
struct ring_buffer
{
char *buffer;
size_t size;
size_t head;
size_t tail;
};
/*
* It returns up to two regions in `iov` that can be read from.
*/
static int
ring_buffer_get_read_iov (struct ring_buffer *rb, struct iovec *iov)
{
int iov_count = 0;
/* Buffer is empty. */
if (rb->head == rb->tail)
return 0;
/* Head before tail. There is only one region to read from, up to tail. */
if (rb->tail > rb->head)
{
iov[iov_count].iov_base = rb->buffer + rb->head;
iov[iov_count].iov_len = rb->tail - rb->head;
iov_count++;
}
/* Head after tail. There are two regions to read from, up to the
* end of the buffer and from the beginning of the buffer to tail. */
else
{
iov[iov_count].iov_base = rb->buffer + rb->head;
iov[iov_count].iov_len = rb->size - rb->head;
iov_count++;
if (rb->tail > 0)
{
iov[iov_count].iov_base = rb->buffer;
iov[iov_count].iov_len = rb->tail;
iov_count++;
}
}
return iov_count;
}
/*
* It returns up to two regions in `iov` that can be written to without overwriting
* existing data.
*/
static int
ring_buffer_get_write_iov (struct ring_buffer *rb, struct iovec *iov)
{
int iov_count = 0;
/* Buffer is full. */
if (rb->tail + 1 == rb->head)
return 0;
/* Tail before head. There is only one region to write to, up to head. */
if (rb->head > rb->tail + 1)
{
iov[iov_count].iov_base = rb->buffer + rb->tail;
iov[iov_count].iov_len = rb->head - rb->tail - 1;
iov_count++;
}
/* Tail after or equal to head. There are two regions to write to, up to the
* end of the buffer and from the beginning of the buffer to head. */
else
{
iov[iov_count].iov_base = rb->buffer + rb->tail;
iov[iov_count].iov_len = rb->size - rb->tail;
iov_count++;
if (rb->head > 1)
{
iov[iov_count].iov_base = rb->buffer;
iov[iov_count].iov_len = rb->head - 1;
iov_count++;
}
}
return iov_count;
}
/* manually advance the head after a successful read. */
static void
ring_buffer_advance_nocheck_head (struct ring_buffer *rb, size_t amount)
{
rb->head = (rb->head + amount) % rb->size;
}
/* manually advance the tail after a successful write. */
static void
ring_buffer_advance_nocheck_tail (struct ring_buffer *rb, size_t amount)
{
rb->tail = (rb->tail + amount) % rb->size;
}
size_t
ring_buffer_get_data_available (struct ring_buffer *rb)
{
if (rb->head <= rb->tail)
return rb->tail - rb->head;
return rb->size - rb->head + rb->tail;
}
size_t
ring_buffer_get_size (struct ring_buffer *rb)
{
return rb->size - 1;
}
size_t
ring_buffer_get_space_available (struct ring_buffer *rb)
{
return rb->size - ring_buffer_get_data_available (rb) - 1;
}
int
ring_buffer_read (struct ring_buffer *rb, int fd, bool *is_eagain, libcrun_error_t *err)
{
struct iovec iov[2];
int iov_count = 0;
ssize_t ret;
*is_eagain = false;
iov_count = ring_buffer_get_write_iov (rb, iov);
if (iov_count == 0)
{
*is_eagain = true;
return 0;
}
ret = readv (fd, iov, iov_count);
if (UNLIKELY (ret < 0))
{
if (errno == EIO)
return 0;
if (errno == EAGAIN || errno == EWOULDBLOCK)
{
*is_eagain = true;
return 0;
}
return crun_make_error (err, errno, "readv");
}
ring_buffer_advance_nocheck_tail (rb, ret);
return ret;
}
int
ring_buffer_write (struct ring_buffer *rb, int fd, bool *is_eagain, libcrun_error_t *err)
{
ssize_t ret;
struct iovec iov[2];
int iov_count = 0;
*is_eagain = false;
iov_count = ring_buffer_get_read_iov (rb, iov);
if (iov_count == 0)
{
*is_eagain = true;
return 0;
}
ret = writev (fd, iov, iov_count);
if (UNLIKELY (ret < 0))
{
if (errno == EIO)
return 0;
if (errno == EAGAIN || errno == EWOULDBLOCK)
{
*is_eagain = true;
return 0;
}
return crun_make_error (err, errno, "writev");
}
ring_buffer_advance_nocheck_head (rb, ret);
/* If the buffer is empty, reset the head and tail. */
if (rb->head == rb->tail)
{
rb->head = 0;
rb->tail = 0;
}
return ret;
}
struct ring_buffer *
ring_buffer_make (size_t size)
{
struct ring_buffer *rb = xmalloc (sizeof (struct ring_buffer));
/* The extra byte is used to distinguish between full and empty buffer. */
rb->size = size + 1;
rb->buffer = xmalloc (rb->size);
rb->head = 0;
rb->tail = 0;
return rb;
}
void
ring_buffer_free (struct ring_buffer *rb)
{
if (rb == NULL)
return;
free (rb->buffer);
free (rb);
}
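
Note on ring_buffer.c: the buffer allocates `size + 1` bytes and treats `head == tail` as empty, so one slot always stays unused and a full buffer is never confused with an empty one. A minimal in-memory sketch of that index arithmetic follows. It is an assumption-laden illustration, not the crun implementation: the `sketch_*` names and the fixed `SKETCH_RING_SIZE` are hypothetical, and bytes are copied one at a time instead of going through `readv`/`writev`.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_RING_SIZE 8 /* 7 usable bytes plus the one reserved slot */

struct sketch_ring
{
  char buffer[SKETCH_RING_SIZE];
  size_t head; /* next byte to consume */
  size_t tail; /* next free slot */
};

/* Same formula as ring_buffer_get_data_available.  */
static size_t
sketch_data_available (const struct sketch_ring *rb)
{
  if (rb->head <= rb->tail)
    return rb->tail - rb->head;
  return SKETCH_RING_SIZE - rb->head + rb->tail;
}

/* One slot is never written, so capacity is SKETCH_RING_SIZE - 1.  */
static size_t
sketch_space_available (const struct sketch_ring *rb)
{
  return SKETCH_RING_SIZE - sketch_data_available (rb) - 1;
}

/* Store up to n bytes; returns how many were stored.  */
static size_t
sketch_put (struct sketch_ring *rb, const char *src, size_t n)
{
  size_t stored = 0;

  assert (rb->head < SKETCH_RING_SIZE && rb->tail < SKETCH_RING_SIZE);
  while (stored < n && sketch_space_available (rb) > 0)
    {
      rb->buffer[rb->tail] = src[stored++];
      rb->tail = (rb->tail + 1) % SKETCH_RING_SIZE;
    }
  return stored;
}

/* Consume up to n bytes; returns how many were copied to dst.  */
static size_t
sketch_get (struct sketch_ring *rb, char *dst, size_t n)
{
  size_t taken = 0;

  while (taken < n && sketch_data_available (rb) > 0)
    {
      dst[taken++] = rb->buffer[rb->head];
      rb->head = (rb->head + 1) % SKETCH_RING_SIZE;
    }
  return taken;
}
```

Starting from a zero-initialized `struct sketch_ring`, a `sketch_put` of more bytes than fit stores only the usable capacity, and later puts wrap around the end of the array once `sketch_get` has freed space at the front.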

52
src/libcrun/ring_buffer.h Normal file

@ -0,0 +1,52 @@
/*
* crun - OCI runtime written in C
*
* Copyright (C) 2024 Giuseppe Scrivano <giuseppe@scrivano.org>
* crun is free software; you can redistribute it and/or modify
* it under the terms of the GNU Lesser General Public License as published by
* the Free Software Foundation; either version 2.1 of the License, or
* (at your option) any later version.
*
* crun is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public License
* along with crun. If not, see <http://www.gnu.org/licenses/>.
*/
#ifndef RING_BUFFER_H
#define RING_BUFFER_H
#include <config.h>
#include "error.h"
#include "utils.h"
struct ring_buffer;
size_t ring_buffer_get_data_available (struct ring_buffer *rb);
size_t ring_buffer_get_space_available (struct ring_buffer *rb);
size_t ring_buffer_get_size (struct ring_buffer *rb);
int ring_buffer_read (struct ring_buffer *rb, int fd, bool *is_eagain, libcrun_error_t *err);
int ring_buffer_write (struct ring_buffer *rb, int fd, bool *is_eagain, libcrun_error_t *err);
struct ring_buffer *ring_buffer_make (size_t size);
void ring_buffer_free (struct ring_buffer *rb);
#define cleanup_ring_buffer __attribute__ ((cleanup (cleanup_ring_bufferp)))
static inline void
cleanup_ring_bufferp (struct ring_buffer **p)
{
struct ring_buffer *rb = *p;
if (rb)
ring_buffer_free (rb);
}
#endif

View File

@ -78,6 +78,34 @@ syscall_sched_setattr (pid_t pid, struct sched_attr_s *attr, unsigned int flags)
#endif
}
int
libcrun_reset_cpu_affinity_mask (pid_t pid, libcrun_error_t *err)
{
int ret;
/* Reset the inherited cpu affinity. Old kernels do that automatically, but
new kernels remember the affinity that was set before the cgroup move.
This is undesirable, because it inherits the systemd affinity when the container
should really move to the container space cpus.
The sched_setaffinity call will always return an error (EINVAL or ENODEV)
when used like this. This is expected and part of the backward compatibility.
Ignore ENOSYS as well, as it might be blocked by seccomp.
See: https://issues.redhat.com/browse/OCPBUGS-15102 */
ret = sched_setaffinity (pid, 0, NULL);
if (LIKELY (ret < 0))
{
if (LIKELY (errno == EINVAL || errno == ENODEV || errno == ENOSYS))
return 0;
return crun_make_error (err, errno, "failed to reset affinity");
}
return 0;
}
int
libcrun_set_scheduler (pid_t pid, runtime_spec_schema_config_schema_process *process, libcrun_error_t *err)
{
@ -162,3 +190,42 @@ libcrun_set_scheduler (pid_t pid, runtime_spec_schema_config_schema_process *pro
return 0;
}
int
libcrun_set_cpu_affinity_from_string (pid_t pid, const char *str, libcrun_error_t *err)
{
cleanup_free char *bitmask = NULL;
int ret, saved_errno;
size_t bitmask_size;
cpu_set_t *cpuset;
size_t alloc_size;
size_t i;
if (is_empty_string (str))
return 0;
ret = cpuset_string_to_bitmask (str, &bitmask, &bitmask_size, err);
if (UNLIKELY (ret < 0))
return ret;
alloc_size = CPU_ALLOC_SIZE (bitmask_size * CHAR_BIT);
/* CPU_ALLOC takes a CPU count, not a byte size. */
cpuset = CPU_ALLOC (bitmask_size * CHAR_BIT);
if (UNLIKELY (cpuset == NULL))
OOM ();
CPU_ZERO_S (alloc_size, cpuset);
for (i = 0; i < bitmask_size * CHAR_BIT; i++)
{
if (bitmask[i / CHAR_BIT] & (1 << (i % CHAR_BIT)))
CPU_SET_S (i, alloc_size, cpuset);
}
ret = sched_setaffinity (pid, alloc_size, cpuset);
saved_errno = errno;
CPU_FREE (cpuset);
if (UNLIKELY (ret < 0))
return crun_make_error (err, saved_errno, "sched_setaffinity");
return 0;
}

View File

@ -24,4 +24,8 @@
int libcrun_set_scheduler (pid_t pid, runtime_spec_schema_config_schema_process *process, libcrun_error_t *err);
int libcrun_set_cpu_affinity_from_string (pid_t pid, const char *str, libcrun_error_t *err);
int libcrun_reset_cpu_affinity_mask (pid_t pid, libcrun_error_t *err);
#endif

View File

@ -443,7 +443,12 @@ calculate_seccomp_checksum (runtime_spec_schema_config_linux_seccomp *seccomp, u
blake3_hasher_finalize (&hasher, hash, sizeof (hash));
for (i = 0; i < 32; i++)
sprintf (&out[i * 2], "%02x", hash[i]);
{
size_t remaining = sizeof (seccomp_checksum_t) - i * 2;
ret = snprintf (&out[i * 2], remaining, "%02x", hash[i]);
if (UNLIKELY (ret != 2))
return crun_make_error (err, 0, "internal error: static buffer has wrong size");
}
out[64] = 0;
#undef PROCESS_STRING
@ -456,10 +461,11 @@ open_rundir_dirfd (const char *state_root, libcrun_error_t *err)
{
cleanup_free char *dir = NULL;
int dirfd;
int ret;
dir = libcrun_get_state_directory (state_root, NULL);
if (UNLIKELY (dir == NULL))
return crun_make_error (err, 0, "cannot get state directory");
ret = libcrun_get_state_directory (&dir, state_root, NULL, err);
if (UNLIKELY (ret < 0))
return ret;
dirfd = TEMP_FAILURE_RETRY (open (dir, O_PATH | O_DIRECTORY | O_CLOEXEC));
if (UNLIKELY (dirfd < 0))
@ -848,9 +854,9 @@ libcrun_copy_seccomp (struct libcrun_seccomp_gen_ctx_s *gen_ctx, const char *b64
if (UNLIKELY (consumed != (int) in_size))
return crun_make_error (err, 0, "invalid seccomp BPF data");
ret = safe_write (gen_ctx->fd, bpf_data, (ssize_t) size);
ret = safe_write (gen_ctx->fd, "seccomp fd", bpf_data, size, err);
if (UNLIKELY (ret < 0))
return crun_make_error (err, 0, "write to seccomp fd");
return ret;
return 0;
}
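
Note on the checksum hunk above: replacing `sprintf` with `snprintf` plus a check on the returned length turns a silent overflow into an explicit error. The pattern can be sketched on its own as follows. This is a hedged illustration only: `sketch_hex_encode` is a hypothetical helper, and it returns -1 where the crun code builds an error with `crun_make_error`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper, not part of crun: hex-encode n bytes from in into
   out, which has room for out_size bytes including the terminating NUL.
   Returns 0 on success, -1 if the output buffer is too small.  */
static int
sketch_hex_encode (const unsigned char *in, size_t n, char *out, size_t out_size)
{
  size_t i;

  assert (in != NULL && out != NULL);
  if (out_size < n * 2 + 1)
    return -1;

  for (i = 0; i < n; i++)
    {
      /* Tell snprintf how much room is left, and verify that it wrote
         exactly the two hex digits expected for one byte.  */
      size_t remaining = out_size - i * 2;
      int ret = snprintf (&out[i * 2], remaining, "%02x", in[i]);
      if (ret != 2)
        return -1;
    }
  out[n * 2] = '\0';
  return 0;
}
```

The up-front size check makes the per-byte check redundant in this sketch; it is kept because it mirrors the defensive style of the hunk, where the buffer size comes from a separate type definition.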

View File

@ -32,18 +32,34 @@
#define YAJL_STR(x) ((const unsigned char *) (x))
#define STEAL_POINTER(x, y) \
do \
{ \
*x = y; \
y = NULL; \
} while (0)
struct pid_stat
{
char state;
unsigned long long starttime;
};
static char *
get_run_directory (const char *state_root)
/* If ID is not NULL, then ensure that it does not contain any slash. */
static int
validate_id (const char *id, libcrun_error_t *err)
{
if (id && strchr (id, '/') != NULL)
return crun_make_error (err, 0, "invalid character `/` in the ID `%s`", id);
return 0;
}
static int
get_run_directory (char **out, const char *state_root, libcrun_error_t *err)
{
int ret;
char *root = NULL;
libcrun_error_t err = NULL;
cleanup_free char *root = NULL;
if (state_root)
root = xstrdup (state_root);
@ -52,57 +68,69 @@ get_run_directory (const char *state_root)
const char *runtime_dir = getenv ("XDG_RUNTIME_DIR");
if (runtime_dir && runtime_dir[0] != '\0')
{
ret = append_paths (&root, &err, runtime_dir, "crun", NULL);
ret = append_paths (&root, err, runtime_dir, "crun", NULL);
if (UNLIKELY (ret < 0))
{
crun_error_release (&err);
return NULL;
}
return ret;
}
}
if (root == NULL)
root = xstrdup ("/run/crun");
ret = crun_ensure_directory (root, 0700, false, &err);
ret = crun_ensure_directory (root, 0700, false, err);
if (UNLIKELY (ret < 0))
crun_error_release (&err);
return root;
return ret;
STEAL_POINTER (out, root);
return 0;
}
char *
libcrun_get_state_directory (const char *state_root, const char *id)
int
libcrun_get_state_directory (char **out, const char *state_root, const char *id, libcrun_error_t *err)
{
int ret;
char *path;
libcrun_error_t *err = NULL;
cleanup_free char *root = get_run_directory (state_root);
cleanup_free char *path = NULL;
cleanup_free char *root = NULL;
ret = validate_id (id, err);
if (UNLIKELY (ret < 0))
return ret;
ret = get_run_directory (&root, state_root, err);
if (UNLIKELY (ret < 0))
return ret;
ret = append_paths (&path, err, root, id, NULL);
if (UNLIKELY (ret < 0))
{
crun_error_release (err);
return NULL;
}
return ret;
return path;
STEAL_POINTER (out, path);
return 0;
}
static char *
get_state_directory_status_file (const char *state_root, const char *id)
static int
get_state_directory_status_file (char **out, const char *state_root, const char *id, libcrun_error_t *err)
{
cleanup_free char *root = get_run_directory (state_root);
libcrun_error_t *err = NULL;
char *path = NULL;
cleanup_free char *root = NULL;
cleanup_free char *path = NULL;
int ret;
ret = validate_id (id, err);
if (UNLIKELY (ret < 0))
return ret;
ret = get_run_directory (&root, state_root, err);
if (UNLIKELY (ret < 0))
return ret;
ret = append_paths (&path, err, root, id, "status", NULL);
if (UNLIKELY (ret < 0))
{
crun_error_release (err);
return NULL;
}
return ret;
return path;
STEAL_POINTER (out, path);
return 0;
}
static int
@ -114,7 +142,9 @@ read_pid_stat (pid_t pid, struct pid_stat *st, libcrun_error_t *err)
char *it, *s;
int i, ret;
sprintf (pid_stat_file, "/proc/%d/stat", pid);
ret = snprintf (pid_stat_file, sizeof (pid_stat_file), "/proc/%d/stat", pid);
if (UNLIKELY (ret >= (int) sizeof (pid_stat_file)))
return crun_make_error (err, 0, "internal error: static buffer too small");
fd = open (pid_stat_file, O_RDONLY | O_CLOEXEC);
if (fd < 0)
@ -173,8 +203,8 @@ libcrun_write_container_status (const char *state_root, const char *id, libcrun_
libcrun_error_t *err)
{
int r, ret;
cleanup_free char *file = get_state_directory_status_file (state_root, id);
cleanup_free char *file_tmp = NULL;
cleanup_free char *file = NULL;
size_t len;
cleanup_close int fd_write = -1;
const unsigned char *buf = NULL;
@ -182,6 +212,10 @@ libcrun_write_container_status (const char *state_root, const char *id, libcrun_
const char *tmp;
yajl_gen gen = NULL;
ret = get_state_directory_status_file (&file, state_root, id, err);
if (UNLIKELY (ret < 0))
return ret;
ret = read_pid_stat (status->pid, &st, err);
if (UNLIKELY (ret < 0))
return ret;
@ -314,11 +348,9 @@ libcrun_write_container_status (const char *state_root, const char *id, libcrun_
if (UNLIKELY (r != yajl_gen_status_ok))
goto yajl_error;
if (UNLIKELY (safe_write (fd_write, buf, (ssize_t) len) < 0))
{
ret = crun_make_error (err, errno, "cannot write status file");
goto exit;
}
ret = safe_write (fd_write, "status file", buf, len, err);
if (UNLIKELY (ret < 0))
goto exit;
close_and_reset (&fd_write);
@ -348,12 +380,37 @@ libcrun_read_container_status (libcrun_container_status_t *status, const char *s
cleanup_free char *buffer = NULL;
char err_buffer[256];
int ret;
cleanup_free char *file = get_state_directory_status_file (state_root, id);
cleanup_free char *file = NULL;
yajl_val tree, tmp;
ret = get_state_directory_status_file (&file, state_root, id, err);
if (UNLIKELY (ret < 0))
return ret;
ret = read_all_file (file, &buffer, NULL, err);
if (UNLIKELY (ret < 0))
return ret;
{
if (crun_error_get_errno (err) == ENOENT)
{
cleanup_free char *statedir = NULL;
libcrun_error_t tmp_err;
int tmp_ret;
tmp_ret = libcrun_get_state_directory (&statedir, state_root, id, &tmp_err);
if (UNLIKELY (tmp_ret < 0))
crun_error_release (&tmp_err);
else
{
tmp_ret = crun_path_exists (statedir, &tmp_err);
if (UNLIKELY (tmp_ret < 0))
crun_error_release (&tmp_err);
else if (tmp_ret == 0)
return crun_error_wrap (err, "container `%s` does not exist", id);
}
}
return ret;
}
tree = yajl_tree_parse (buffer, err_buffer, sizeof (err_buffer));
if (UNLIKELY (tree == NULL))
@ -438,17 +495,12 @@ int
libcrun_status_check_directories (const char *state_root, const char *id, libcrun_error_t *err)
{
cleanup_free char *dir = NULL;
cleanup_free char *run_directory = get_run_directory (state_root);
int ret;
ret = crun_ensure_directory (run_directory, 0700, false, err);
ret = libcrun_get_state_directory (&dir, state_root, id, err);
if (UNLIKELY (ret < 0))
return ret;
dir = libcrun_get_state_directory (state_root, id);
if (UNLIKELY (dir == NULL))
return crun_make_error (err, 0, "cannot get state directory");
ret = crun_path_exists (dir, err);
if (UNLIKELY (ret < 0))
return ret;
@ -528,11 +580,15 @@ libcrun_container_delete_status (const char *state_root, const char *id, libcrun
cleanup_close int dfd = -1;
cleanup_free char *dir = NULL;
dir = get_run_directory (state_root);
if (UNLIKELY (dir == NULL))
return crun_make_error (err, 0, "cannot get state directory");
ret = validate_id (id, err);
if (UNLIKELY (ret < 0))
return ret;
rundir_dfd = TEMP_FAILURE_RETRY (open (dir, O_DIRECTORY | O_RDONLY | O_CLOEXEC));
ret = get_run_directory (&dir, state_root, err);
if (UNLIKELY (ret < 0))
return ret;
rundir_dfd = TEMP_FAILURE_RETRY (open (dir, O_DIRECTORY | O_PATH | O_CLOEXEC));
if (UNLIKELY (rundir_dfd < 0))
return crun_make_error (err, errno, "cannot open run directory `%s`", dir);
@ -571,21 +627,27 @@ libcrun_free_container_status (libcrun_container_status_t *status)
}
int
libcrun_get_containers_list (libcrun_container_list_t **ret, const char *state_root, libcrun_error_t *err)
libcrun_get_containers_list (libcrun_container_list_t **out, const char *state_root, libcrun_error_t *err)
{
struct dirent *next;
cleanup_container_list libcrun_container_list_t *tmp = NULL;
cleanup_free char *path = get_run_directory (state_root);
cleanup_free char *root = NULL;
cleanup_dir DIR *dir = NULL;
int ret;
*ret = NULL;
dir = opendir (path);
*out = NULL;
ret = get_run_directory (&root, state_root, err);
if (UNLIKELY (ret < 0))
return ret;
dir = opendir (root);
if (UNLIKELY (dir == NULL))
return crun_make_error (err, errno, "cannot opendir `%s`", path);
return crun_make_error (err, errno, "cannot opendir `%s`", root);
for (next = readdir (dir); next; next = readdir (dir))
{
int r, exists;
int exists;
cleanup_free char *status_file = NULL;
libcrun_container_list_t *next_container;
@ -593,9 +655,9 @@ libcrun_get_containers_list (libcrun_container_list_t **ret, const char *state_r
if (next->d_name[0] == '.')
continue;
r = append_paths (&status_file, err, path, next->d_name, "status", NULL);
if (UNLIKELY (r < 0))
return r;
ret = append_paths (&status_file, err, root, next->d_name, "status", NULL);
if (UNLIKELY (ret < 0))
return ret;
exists = crun_path_exists (status_file, err);
if (exists < 0)
@ -614,8 +676,9 @@ libcrun_get_containers_list (libcrun_container_list_t **ret, const char *state_r
next_container->next = tmp;
tmp = next_container;
}
*ret = tmp;
tmp = NULL;
STEAL_POINTER (out, tmp);
return 0;
}
@ -676,14 +739,19 @@ libcrun_is_container_running (libcrun_container_status_t *status, libcrun_error_
int
libcrun_status_create_exec_fifo (const char *state_root, const char *id, libcrun_error_t *err)
{
cleanup_free char *state_dir = libcrun_get_state_directory (state_root, id);
cleanup_free char *state_dir = NULL;
cleanup_free char *fifo_path = NULL;
int ret, fd = -1;
ret = libcrun_get_state_directory (&state_dir, state_root, id, err);
if (UNLIKELY (ret < 0))
return ret;
ret = append_paths (&fifo_path, err, state_dir, "exec.fifo", NULL);
if (UNLIKELY (ret < 0))
return ret;
libcrun_debug ("Creating exec fifo: %s", fifo_path);
ret = mkfifo (fifo_path, 0600);
if (UNLIKELY (ret < 0))
return crun_make_error (err, errno, "mkfifo");
@ -698,7 +766,7 @@ libcrun_status_create_exec_fifo (const char *state_root, const char *id, libcrun
int
libcrun_status_write_exec_fifo (const char *state_root, const char *id, libcrun_error_t *err)
{
cleanup_free char *state_dir = libcrun_get_state_directory (state_root, id);
cleanup_free char *state_dir = NULL;
cleanup_free char *fifo_path = NULL;
char buffer[1] = {
0,
@ -706,6 +774,10 @@ libcrun_status_write_exec_fifo (const char *state_root, const char *id, libcrun_
cleanup_close int fd = -1;
int ret;
ret = libcrun_get_state_directory (&state_dir, state_root, id, err);
if (UNLIKELY (ret < 0))
return ret;
ret = append_paths (&fifo_path, err, state_dir, "exec.fifo", NULL);
if (UNLIKELY (ret < 0))
return ret;
@ -728,10 +800,14 @@ libcrun_status_write_exec_fifo (const char *state_root, const char *id, libcrun_
int
libcrun_status_has_read_exec_fifo (const char *state_root, const char *id, libcrun_error_t *err)
{
cleanup_free char *state_dir = libcrun_get_state_directory (state_root, id);
cleanup_free char *state_dir = NULL;
cleanup_free char *fifo_path = NULL;
int ret;
ret = libcrun_get_state_directory (&state_dir, state_root, id, err);
if (UNLIKELY (ret < 0))
return ret;
ret = append_paths (&fifo_path, err, state_dir, "exec.fifo", NULL);
if (UNLIKELY (ret < 0))
return ret;
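
Note on the pattern in the hunks above: helpers such as libcrun_get_state_directory and get_run_directory change from returning a `char *` (NULL on failure) to returning an `int` status, with the path handed back through an out-parameter and details recorded in `err`. A minimal sketch of that calling convention follows. The error struct, the function names, and the id check are hypothetical simplifications, not crun's actual implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified error type; crun's libcrun_error_t is richer. */
struct sketch_error
{
  int status;
  char msg[128];
};

/* Record an error and return -1 so callers can write `return make_error (...)`. */
int
sketch_make_error (struct sketch_error *err, int status, const char *msg)
{
  err->status = status;
  snprintf (err->msg, sizeof (err->msg), "%s", msg);
  return -1;
}

/* Returns 0 and stores a malloc'ed path in *out, or returns -1 and fills *err.
   The id validation here is an assumption made for illustration only. */
int
sketch_get_state_directory (char **out, const char *root, const char *id, struct sketch_error *err)
{
  size_t len;
  char *path;

  assert (out != NULL && root != NULL && err != NULL);
  *out = NULL;

  if (id == NULL || id[0] == '\0' || strchr (id, '/') != NULL)
    return sketch_make_error (err, EINVAL, "invalid container id");

  len = strlen (root) + 1 + strlen (id) + 1;
  path = malloc (len);
  if (path == NULL)
    return sketch_make_error (err, ENOMEM, "out of memory");

  snprintf (path, len, "%s/%s", root, id);
  *out = path;
  return 0;
}
```

With this shape, a caller checks `ret < 0` and propagates the error unchanged, which is the same `if (UNLIKELY (ret < 0)) return ret;` idiom the new lines in the diff use, and the reason for a failure is no longer collapsed into a bare NULL.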


@ -55,7 +55,7 @@ LIBCRUN_PUBLIC int libcrun_read_container_status (libcrun_container_status_t *st
const char *id, libcrun_error_t *err);
LIBCRUN_PUBLIC void libcrun_free_containers_list (libcrun_container_list_t *list);
LIBCRUN_PUBLIC int libcrun_is_container_running (libcrun_container_status_t *status, libcrun_error_t *err);
LIBCRUN_PUBLIC char *libcrun_get_state_directory (const char *state_root, const char *id);
LIBCRUN_PUBLIC int libcrun_get_state_directory (char **out, const char *state_root, const char *id, libcrun_error_t *err);
LIBCRUN_PUBLIC int libcrun_container_delete_status (const char *state_root, const char *id, libcrun_error_t *err);
LIBCRUN_PUBLIC int libcrun_get_containers_list (libcrun_container_list_t **ret, const char *state_root,
libcrun_error_t *err);

src/libcrun/string_map.c Normal file

@ -0,0 +1,171 @@
/*
* crun - OCI runtime written in C
*
* Copyright (C) 2025 Giuseppe Scrivano <giuseppe@scrivano.org>
* crun is free software; you can redistribute it and/or modify
* it under the terms of the GNU Lesser General Public License as published by
* the Free Software Foundation; either version 2.1 of the License, or
* (at your option) any later version.
*
* crun is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public License
* along with crun. If not, see <http://www.gnu.org/licenses/>.
*/
#define _GNU_SOURCE
#include <config.h>
#include <stdlib.h>
#include <errno.h>
#include "string_map.h"
#include "utils.h"
#include <ocispec/runtime_spec_schema_config_schema.h>
struct kv_s
{
char *key;
char *value;
};
struct string_map_s
{
size_t len;
struct kv_s *kvs;
#ifdef HAVE_HSEARCH_R
struct hsearch_data htab;
bool htab_initialized;
#endif
bool sorted;
};
static int
compare_kv (const void *a, const void *b)
{
return strcmp (((struct kv_s *) a)->key, ((struct kv_s *) b)->key);
}
const char *
find_string_map_value (string_map *map, const char *name)
{
struct kv_s *r, key;
if (map == NULL || map->len == 0)
return NULL;
#ifdef HAVE_HSEARCH_R
ENTRY e, *ep;
/* Do not bother with hash tables for small maps. */
if (map->len < 8)
goto fallback;
if (! map->htab_initialized)
{
size_t i;
if (hcreate_r (map->len, &map->htab) == 0)
goto fallback;
for (i = 0; i < map->len; i++)
{
e.key = (char *) map->kvs[i].key;
e.data = map->kvs[i].value;
if (hsearch_r (e, ENTER, &ep, &map->htab) == 0)
{
hdestroy_r (&map->htab);
goto fallback;
}
}
map->htab_initialized = true;
}
e.key = (char *) name;
if (hsearch_r (e, FIND, &ep, &map->htab) == 0)
return NULL;
return ep->data;
fallback:
#endif
if (! map->sorted)
{
qsort (map->kvs, map->len, sizeof (struct kv_s), compare_kv);
map->sorted = true;
}
key.key = (char *) name;
r = bsearch (&key, map->kvs, map->len, sizeof (struct kv_s), compare_kv);
return r ? r->value : NULL;
}
string_map *
make_string_map_from_json (json_map_string_string *jmap)
{
struct string_map_s *new_map = xmalloc0 (sizeof (struct string_map_s));
size_t i;
if (jmap == NULL)
return new_map;
new_map->len = jmap->len;
new_map->kvs = xmalloc0 (sizeof (struct kv_s) * (jmap->len + 1));
for (i = 0; i < jmap->len; i++)
{
new_map->kvs[i].key = xstrdup (jmap->keys[i]);
new_map->kvs[i].value = xstrdup (jmap->values[i]);
}
return new_map;
}
int
string_map_get_at (string_map *map, size_t index, const char **name, const char **value)
{
if (map == NULL || index >= map->len)
{
errno = ERANGE;
return -1;
}
*name = map->kvs[index].key;
*value = map->kvs[index].value;
return 0;
}
void
free_string_map (string_map *map)
{
size_t i;
if (map == NULL)
return;
#ifdef HAVE_HSEARCH_R
if (map->htab_initialized)
hdestroy_r (&map->htab);
#endif
for (i = 0; i < map->len; i++)
{
free (map->kvs[i].key);
free (map->kvs[i].value);
}
free (map->kvs);
free (map);
}
size_t
string_map_size (string_map *map)
{
return map->len;
}
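
Note on the fallback path in find_string_map_value: when the hash table is not used, the key/value array is sorted once on the first lookup and then searched with bsearch. The sketch below isolates that lazy sort-then-binary-search technique. The `pair` and `pair_map` types and the function name are hypothetical stand-ins for the types in the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the kv_s / string_map_s types in the diff. */
struct pair
{
  const char *key;
  const char *value;
};

struct pair_map
{
  struct pair *pairs;
  size_t len;
  bool sorted;
};

/* Order two pairs by key; used for both qsort and bsearch. */
static int
compare_pair (const void *a, const void *b)
{
  return strcmp (((const struct pair *) a)->key, ((const struct pair *) b)->key);
}

/* Sort in place on the first lookup, then binary-search on every call. */
const char *
pair_map_find (struct pair_map *map, const char *name)
{
  struct pair key = { name, NULL };
  struct pair *r;

  assert (name != NULL);
  if (map == NULL || map->len == 0)
    return NULL;

  if (! map->sorted)
    {
      qsort (map->pairs, map->len, sizeof (struct pair), compare_pair);
      map->sorted = true;
    }

  r = bsearch (&key, map->pairs, map->len, sizeof (struct pair), compare_pair);
  return r ? r->value : NULL;
}
```

One consequence of sorting in place: the order observed through an index-based accessor such as string_map_get_at can differ before and after the first lookup that takes this path.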

src/libcrun/string_map.h Normal file

@ -0,0 +1,46 @@
/*
* crun - OCI runtime written in C
*
* Copyright (C) 2025 Giuseppe Scrivano <giuseppe@scrivano.org>
* crun is free software; you can redistribute it and/or modify
* it under the terms of the GNU Lesser General Public License as published by
* the Free Software Foundation; either version 2.1 of the License, or
* (at your option) any later version.
*
* crun is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public License
* along with crun. If not, see <http://www.gnu.org/licenses/>.
*/
#ifndef STRING_MAP_H
#define STRING_MAP_H
#define _GNU_SOURCE
#include <config.h>
#ifdef HAVE_HSEARCH_R
# include <search.h>
#endif
#include <ocispec/runtime_spec_schema_config_schema.h>
#include "error.h"
struct string_map_s;
typedef struct string_map_s string_map;
const char *find_string_map_value (string_map *map, const char *name);
string_map *make_string_map_from_json (json_map_string_string *jmap);
void free_string_map (string_map *map);
int string_map_get_at (string_map *map, size_t index, const char **name, const char **value);
size_t string_map_size (string_map *map);
#endif


@ -61,8 +61,8 @@ libcrun_new_terminal (char **pty, libcrun_error_t *err)
return ret;
}
static int
set_raw (int fd, void **current_status, libcrun_error_t *err)
int
libcrun_set_raw (int fd, void **current_status, libcrun_error_t *err)
{
int ret;
struct termios termios;
@ -114,23 +114,6 @@ libcrun_set_stdio (char *pty, libcrun_error_t *err)
return 0;
}
int
libcrun_setup_terminal_ptmx (int fd, void **current_status, libcrun_error_t *err)
{
int ret;
struct termios termios;
ret = tcgetattr (fd, &termios);
if (UNLIKELY (ret < 0))
return crun_make_error (err, errno, "tcgetattr");
ret = tcsetattr (fd, TCSANOW, &termios);
if (UNLIKELY (ret < 0))
return crun_make_error (err, errno, "tcsetattr");
return set_raw (0, current_status, err);
}
void
cleanup_terminalp (void *p)
{


@ -29,7 +29,7 @@ int libcrun_new_terminal (char **pty, libcrun_error_t *err);
int libcrun_set_stdio (char *pty, libcrun_error_t *err);
int libcrun_setup_terminal_ptmx (int fd, void **current_status, libcrun_error_t *err);
int libcrun_set_raw (int fd, void **current_status, libcrun_error_t *err);
int libcrun_terminal_setup_size (int fd, unsigned short rows, unsigned short cols, libcrun_error_t *err);

File diff suppressed because it is too large


@ -27,6 +27,7 @@
#include <dirent.h>
#include <unistd.h>
#include <signal.h>
#include <fcntl.h>
#include <ocispec/runtime_spec_schema_config_schema.h>
#include <sys/wait.h>
#include "container.h"
@ -55,6 +56,8 @@
#define LIKELY(x) __builtin_expect ((x), 1)
#define UNLIKELY(x) __builtin_expect ((x), 0)
#define WRITE_FILE_DEFAULT_FLAGS (O_CLOEXEC | O_CREAT | O_TRUNC | O_WRONLY)
__attribute__ ((malloc)) static inline void *
xmalloc (size_t size)
{
@ -186,6 +189,9 @@ cleanup_close_vecp (int **p)
int *pp = *p;
int i;
if (pp == NULL)
return;
for (i = 0; pp[i] >= 0; i++)
TEMP_FAILURE_RETRY (close (pp[i]));
}
@ -225,6 +231,8 @@ xstrdup (const char *str)
return ret;
}
void consume_trailing_slashes (char *path);
static inline const char *
consume_slashes (const char *t)
{
@ -255,28 +263,30 @@ int xasprintf (char **str, const char *fmt, ...) __attribute__ ((format (printf,
int crun_path_exists (const char *path, libcrun_error_t *err);
int write_file_with_flags (const char *name, int flags, const void *data, size_t len, libcrun_error_t *err);
int write_file (const char *name, const void *data, size_t len, libcrun_error_t *err);
int write_file_at (int dirfd, const char *name, const void *data, size_t len, libcrun_error_t *err);
int write_file_at_with_flags (int dirfd, int flags, mode_t mode, const char *name, const void *data, size_t len, libcrun_error_t *err);
static inline int
write_file (const char *name, const void *data, size_t len, libcrun_error_t *err)
{
return write_file_at_with_flags (AT_FDCWD, WRITE_FILE_DEFAULT_FLAGS, 0700, name, data, len, err);
}
static inline int
write_file_at (int dirfd, const char *name, const void *data, size_t len, libcrun_error_t *err)
{
return write_file_at_with_flags (dirfd, WRITE_FILE_DEFAULT_FLAGS, 0700, name, data, len, err);
}
int crun_ensure_directory (const char *path, int mode, bool nofollow, libcrun_error_t *err);
int crun_ensure_file (const char *path, int mode, bool nofollow, libcrun_error_t *err);
int crun_ensure_directory_at (int dirfd, const char *path, int mode, bool nofollow, libcrun_error_t *err);
int crun_ensure_file_at (int dirfd, const char *path, int mode, bool nofollow, libcrun_error_t *err);
int crun_safe_create_and_open_ref_at (bool dir, int dirfd, const char *dirpath, const char *path, int mode, libcrun_error_t *err);
int crun_safe_create_and_open_ref_at (bool dir, int dirfd, const char *dirpath, size_t dirpath_len, const char *path, int mode, libcrun_error_t *err);
int crun_safe_ensure_directory_at (int dirfd, const char *dirpath, size_t dirpath_len, const char *path, int mode,
int crun_safe_ensure_directory_at (int dirfd, const char *dirpath, const char *path, int mode,
libcrun_error_t *err);
int crun_safe_ensure_file_at (int dirfd, const char *dirpath, size_t dirpath_len, const char *path, int mode,
int crun_safe_ensure_file_at (int dirfd, const char *dirpath, const char *path, int mode,
libcrun_error_t *err);
int crun_dir_p (const char *path, bool nofollow, libcrun_error_t *err);
@ -285,7 +295,7 @@ int crun_dir_p_at (int dirfd, const char *path, bool nofollow, libcrun_error_t *
int detach_process ();
int create_file_if_missing_at (int dirfd, const char *file, libcrun_error_t *err);
int create_file_if_missing_at (int dirfd, const char *file, mode_t mode, libcrun_error_t *err);
int check_running_in_user_namespace (libcrun_error_t *err);
@ -303,6 +313,8 @@ read_all_fd (int fd, const char *description, char **out, size_t *len, libcrun_e
return read_all_fd_with_size_hint (fd, description, out, len, 0, err);
}
int get_realpath_to_file (int dirfd, const char *path_name, char **absolute_path, libcrun_error_t *err);
int read_all_file (const char *path, char **out, size_t *len, libcrun_error_t *err);
int read_all_file_at (int dirfd, const char *path, char **out, size_t *len, libcrun_error_t *err);
@ -323,13 +335,13 @@ int receive_fd_from_socket_with_payload (int from, char *payload, size_t payload
int create_signalfd (sigset_t *mask, libcrun_error_t *err);
int epoll_helper (int *fds, int *levelfds, libcrun_error_t *err);
int epoll_helper (int *in_fds, int *in_levelfds, int *out_fds, int *out_levelfds, libcrun_error_t *err);
int copy_from_fd_to_fd (int src, int dst, int consume, libcrun_error_t *err);
int run_process (char **args, libcrun_error_t *err);
size_t format_default_id_mapping (char **ret, uid_t container_id, uid_t host_uid, uid_t host_id, int is_uid);
int format_default_id_mapping (char **out, uid_t container_id, uid_t host_uid, uid_t host_id, int is_uid, libcrun_error_t *err);
int run_process_with_stdin_timeout_envp (char *path, char **args, const char *cwd, int timeout, char **envp,
char *stdin, size_t stdin_len, int out_fd, int err_fd, libcrun_error_t *err);
@ -338,7 +350,7 @@ int mark_or_close_fds_ge_than (int n, bool close_now, libcrun_error_t *err);
void get_current_timestamp (char *out, size_t len);
int set_blocking_fd (int fd, int blocking, libcrun_error_t *err);
int set_blocking_fd (int fd, bool blocking, libcrun_error_t *err);
int parse_json_file (yajl_val *out, const char *jsondata, struct parser_context *ctx, libcrun_error_t *err);
@ -349,7 +361,7 @@ has_prefix (const char *str, const char *prefix)
return strlen (str) >= prefix_len && memcmp (str, prefix, prefix_len) == 0;
}
char *find_executable (const char *executable_path, const char *cwd);
int find_executable (char **exec_path, const char *executable_path, const char *cwd, libcrun_error_t *err);
int copy_recursive_fd_to_fd (int srcfd, int destfd, const char *srcname, const char *destname, libcrun_error_t *err);
@ -359,7 +371,6 @@ int libcrun_initialize_selinux (libcrun_error_t *err);
int libcrun_initialize_apparmor (libcrun_error_t *err);
const char *find_annotation_map (json_map_string_string *annotations, const char *name);
const char *find_annotation (libcrun_container_t *container, const char *name);
int get_file_type_at (int dirfd, mode_t *mode, bool nofollow, const char *path);
@ -370,10 +381,10 @@ int get_file_type_fd (int fd, mode_t *mode);
char *get_user_name (uid_t uid);
int safe_openat (int dirfd, const char *rootfs, size_t rootfs_len, const char *path, int flags, int mode,
int safe_openat (int dirfd, const char *rootfs, const char *path, int flags, int mode,
libcrun_error_t *err);
ssize_t safe_write (int fd, const void *buf, ssize_t count);
int safe_write (int fd, const char *fname, const void *buf, size_t count, libcrun_error_t *err);
int append_paths (char **out, libcrun_error_t *err, ...) __attribute__ ((sentinel));
@ -385,6 +396,8 @@ char *str_join_array (int offset, size_t size, char *const array[], const char *
ssize_t safe_readlinkat (int dfd, const char *name, char **buffer, ssize_t hint, libcrun_error_t *err);
char **read_dir_entries (const char *path, libcrun_error_t *err);
static inline bool
is_empty_string (const char *s)
{
@ -469,4 +482,39 @@ validate_options (unsigned int specified_options, unsigned int supported_options
return 0;
}
extern int cpuset_string_to_bitmask (const char *str, char **out, size_t *out_size, libcrun_error_t *err);
/*
* A channel_fd_pair takes care of copying data between two file descriptors.
* The two file descriptors are expected to be set to non-blocking mode.
* The channel_fd_pair will buffer data read from the input file descriptor and
* write it to the output file descriptor. If the output file descriptor is not
* ready to accept the data, the channel_fd_pair will buffer the data until it
* can be written.
*/
struct channel_fd_pair;
struct channel_fd_pair *channel_fd_pair_new (int in_fd, int out_fd, size_t size);
void channel_fd_pair_free (struct channel_fd_pair *channel);
/* Process the data in the channel_fd_pair. This function will read data from
* the input file descriptor and write it to the output file descriptor. If
* the output file descriptor is not ready to accept the data, the data will be
* buffered. If epollfd is provided, the in_fd and out_fd will be registered
* and unregistered as necessary.
*/
int channel_fd_pair_process (struct channel_fd_pair *channel, int epollfd, libcrun_error_t *err);
static inline void
cleanup_channel_fd_pairp (void *p)
{
struct channel_fd_pair **pp = (struct channel_fd_pair **) p;
if (*pp == NULL)
return;
channel_fd_pair_free (*pp);
}
#define cleanup_channel_fd_pair __attribute__ ((cleanup (cleanup_channel_fd_pairp)))
#endif
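
Note on the safe_write prototype change in this header: the old version returned an `ssize_t` byte count, while the new one returns an `int` status and reports failures through `err`, using `fname` for the message. The body of safe_write is not part of this diff, so the sketch below only assumes the usual shape of such a helper: loop until every byte is written, retrying on EINTR and on short writes. The name `write_all` is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: write all `count` bytes to `fd`.
   Returns 0 on success, -1 with errno set on failure. */
int
write_all (int fd, const void *buf, size_t count)
{
  const char *p = buf;

  assert (buf != NULL || count == 0);
  while (count > 0)
    {
      ssize_t w = write (fd, p, count);
      if (w < 0)
        {
          /* Interrupted by a signal before any data was written: retry. */
          if (errno == EINTR)
            continue;
          return -1;
        }
      /* A short write is not an error; advance and write the remainder. */
      p += w;
      count -= (size_t) w;
    }
  return 0;
}
```

Returning a plain status instead of a byte count lets callers treat a partial write as something the helper has already handled, which matches how the new call site only checks for a negative return.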

src/mounts.c Normal file

@ -0,0 +1,73 @@
/*
* crun - OCI runtime written in C
*
* Copyright (C) 2017, 2018, 2019 Giuseppe Scrivano <giuseppe@scrivano.org>
* crun is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* crun is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with crun. If not, see <http://www.gnu.org/licenses/>.
*/
#include <config.h>
#include <stdio.h>
#include <stdlib.h>
#include <argp.h>
#include <string.h>
#include <unistd.h>
#include <errno.h>
#include "crun.h"
#include "libcrun/container.h"
#include "libcrun/utils.h"
static char doc[] = "OCI runtime";
static libcrun_context_t crun_context;
static struct argp_option options[] = {
0,
};
static char args_doc[] = "mounts [add|remove] CONTAINER FILE";
static error_t
parse_opt (int key, char *arg arg_unused, struct argp_state *state arg_unused)
{
switch (key)
{
default:
return ARGP_ERR_UNKNOWN;
}
return 0;
}
static struct argp run_argp = { options, parse_opt, args_doc, doc, NULL, NULL, NULL };
int
crun_command_mounts (struct crun_global_arguments *global_args, int argc, char **argv, libcrun_error_t *err)
{
int first_arg = 0, ret;
argp_parse (&run_argp, argc, argv, ARGP_IN_ORDER, &first_arg, &crun_context);
crun_assert_n_args (argc - first_arg, 3, 3);
ret = init_libcrun_context (&crun_context, argv[first_arg], global_args, err);
if (UNLIKELY (ret < 0))
return ret;
if (strcmp (argv[first_arg], "add") == 0)
return libcrun_container_add_mounts_from_file (&crun_context, argv[first_arg + 1], argv[first_arg + 2], err);
else if (strcmp (argv[first_arg], "remove") == 0)
return libcrun_container_remove_mounts_from_file (&crun_context, argv[first_arg + 1], argv[first_arg + 2], err);
return crun_make_error (err, 0, "unknown command %s", argv[first_arg]);
}

src/mounts.h Normal file

@ -0,0 +1,25 @@
/*
* crun - OCI runtime written in C
*
* Copyright (C) 2017, 2018, 2019 Giuseppe Scrivano <giuseppe@scrivano.org>
* crun is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* crun is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with crun. If not, see <http://www.gnu.org/licenses/>.
*/
#ifndef MOUNTS_H
#define MOUNTS_H
#include "crun.h"
int crun_command_mounts (struct crun_global_arguments *global_args, int argc, char **argv, libcrun_error_t *err);
#endif


@ -142,8 +142,10 @@ crun_features_add_seccomp_info (yajl_gen json_gen, const struct linux_info_s *li
yajl_gen_map_open (json_gen);
add_bool_to_json (json_gen, "enabled", linux->seccomp.enabled);
add_array_to_json (json_gen, "actions", linux->seccomp.actions);
add_array_to_json (json_gen, "operators", linux->seccomp.operators);
if (linux->seccomp.actions)
add_array_to_json (json_gen, "actions", linux->seccomp.actions);
if (linux->seccomp.operators)
add_array_to_json (json_gen, "operators", linux->seccomp.operators);
yajl_gen_map_close (json_gen);
}
@ -193,6 +195,15 @@ crun_features_add_intel_rdt (yajl_gen json_gen, const struct linux_info_s *linux
yajl_gen_map_close (json_gen);
}
void
crun_features_add_net_devices (yajl_gen json_gen, const struct linux_info_s *linux)
{
yajl_gen_string (json_gen, (const unsigned char *) "netDevices", strlen ("netDevices"));
yajl_gen_map_open (json_gen);
add_bool_to_json (json_gen, "enabled", linux->net_devices.enabled);
yajl_gen_map_close (json_gen);
}
void
crun_features_add_linux_info (yajl_gen json_gen, const struct linux_info_s *linux)
{
@ -207,6 +218,7 @@ crun_features_add_linux_info (yajl_gen json_gen, const struct linux_info_s *linu
crun_features_add_selinux_info (json_gen, linux);
crun_features_add_mount_ext_info (json_gen, linux);
crun_features_add_intel_rdt (json_gen, linux);
crun_features_add_net_devices (json_gen, linux);
yajl_gen_map_close (json_gen);
}
@ -217,7 +229,8 @@ crun_features_add_annotations_info (yajl_gen json_gen, const struct annotations_
yajl_gen_string (json_gen, (const unsigned char *) "annotations", strlen ("annotations"));
yajl_gen_map_open (json_gen);
add_string_to_json (json_gen, "io.github.seccomp.libseccomp.version", annotation->io_github_seccomp_libseccomp_version);
if (! is_empty_string (annotation->io_github_seccomp_libseccomp_version))
add_string_to_json (json_gen, "io.github.seccomp.libseccomp.version", annotation->io_github_seccomp_libseccomp_version);
add_bool_str_to_json (json_gen, "org.opencontainers.runc.checkpoint.enabled", annotation->run_oci_crun_checkpoint_enabled);
add_bool_str_to_json (json_gen, "run.oci.crun.checkpoint.enabled", annotation->run_oci_crun_checkpoint_enabled);


@ -44,6 +44,9 @@ enum
OPTION_CONSOLE_SOCKET,
OPTION_FILE_LOCKS,
OPTION_MANAGE_CGROUPS_MODE,
OPTION_NETWORK_LOCK_METHOD,
OPTION_LSM_PROFILE,
OPTION_LSM_MOUNT_CONTEXT,
};
static char doc[] = "OCI runtime";
@ -67,6 +70,9 @@ static struct argp_option options[]
"path to a socket that will receive the ptmx end of the tty", 0 },
{ "file-locks", OPTION_FILE_LOCKS, 0, 0, "allow file locks", 0 },
{ "manage-cgroups-mode", OPTION_MANAGE_CGROUPS_MODE, "MODE", 0, "cgroups mode: 'soft' (default), 'ignore', 'full' and 'strict'", 0 },
{ "network-lock", OPTION_NETWORK_LOCK_METHOD, 0, 0, "set network lock method", 0 },
{ "lsm-profile", OPTION_LSM_PROFILE, "VALUE", 0, "Specify an LSM profile to be used during restore in the form of TYPE:NAME", 0 },
{ "lsm-mount-context", OPTION_LSM_MOUNT_CONTEXT, "VALUE", 0, "Specify an LSM mount context to be used during restore", 0 },
{
0,
} };
@ -125,6 +131,18 @@ parse_opt (int key, char *arg, struct argp_state *state)
cr_options.manage_cgroups_mode = crun_parse_manage_cgroups_mode (argp_mandatory_argument (arg, state));
break;
case OPTION_NETWORK_LOCK_METHOD:
cr_options.network_lock_method = crun_parse_network_lock_method (argp_mandatory_argument (arg, state));
break;
case OPTION_LSM_PROFILE:
cr_options.lsm_profile = argp_mandatory_argument (arg, state);
break;
case OPTION_LSM_MOUNT_CONTEXT:
cr_options.lsm_mount_context = argp_mandatory_argument (arg, state);
break;
default:
return ARGP_ERR_UNKNOWN;
}
@ -169,13 +187,15 @@ crun_command_restore (struct crun_global_arguments *global_args, int argc, char
if (UNLIKELY (ret < 0))
return ret;
cr_options.manage_cgroups_mode = -1;
if (cr_options.image_path == NULL)
{
cleanup_free char *path = NULL;
path = getcwd (NULL, 0);
if (UNLIKELY (path == NULL))
libcrun_fail_with_error (0, "realloc failed");
libcrun_fail_with_error (errno, "getcwd failed");
ret = asprintf (&cr_path, "%s/checkpoint", path);
if (UNLIKELY (ret < 0))


@ -17,16 +17,10 @@
*/
#include <config.h>
#include <stdio.h>
#include <stdlib.h>
#include <argp.h>
#include <string.h>
#include <unistd.h>
#include <errno.h>
#include "crun.h"
#include "libcrun/container.h"
#include "libcrun/utils.h"
#include "run_create.h"
static char doc[] = "OCI runtime";
@ -93,7 +87,7 @@ parse_opt (int key, char *arg, struct argp_state *state)
break;
case OPTION_PRESERVE_FDS:
crun_context.preserve_fds = strtoll (argp_mandatory_argument (arg, state), NULL, 10);
crun_context.preserve_fds = parse_int_or_fail (argp_mandatory_argument (arg, state), "preserve-fds");
break;
case OPTION_NO_SUBREAPER:
@ -123,63 +117,14 @@ parse_opt (int key, char *arg, struct argp_state *state)
static struct argp run_argp = { options, parse_opt, args_doc, doc, NULL, NULL, NULL };
static unsigned int
get_options ()
{
return keep ? LIBCRUN_RUN_OPTIONS_KEEP : 0;
}
int
crun_command_run (struct crun_global_arguments *global_args, int argc, char **argv, libcrun_error_t *err)
{
int first_arg = 0, ret;
cleanup_container libcrun_container_t *container = NULL;
cleanup_free char *bundle_cleanup = NULL;
cleanup_free char *config_file_cleanup = NULL;
crun_context.preserve_fds = 0;
crun_context.listen_fds = 0;
argp_parse (&run_argp, argc, argv, ARGP_IN_ORDER, &first_arg, &crun_context);
crun_assert_n_args (argc - first_arg, 1, 1);
/* Make sure the config is an absolute path before changing the directory. */
if ((strcmp ("config.json", config_file) != 0))
{
if (config_file[0] != '/')
{
config_file_cleanup = realpath (config_file, NULL);
if (config_file_cleanup == NULL)
libcrun_fail_with_error (errno, "realpath `%s` failed", config_file);
config_file = config_file_cleanup;
}
}
/* Make sure the bundle is an absolute path. */
if (bundle == NULL)
bundle = bundle_cleanup = getcwd (NULL, 0);
else
{
if (bundle[0] != '/')
{
bundle_cleanup = realpath (bundle, NULL);
if (bundle_cleanup == NULL)
libcrun_fail_with_error (errno, "realpath `%s` failed", bundle);
bundle = bundle_cleanup;
}
if (chdir (bundle) < 0)
libcrun_fail_with_error (errno, "chdir `%s` failed", bundle);
}
container = libcrun_container_load_from_file (config_file, err);
if (container == NULL)
return -1;
ret = init_libcrun_context (&crun_context, argv[first_arg], global_args, err);
if (UNLIKELY (ret < 0))
return ret;
crun_context.bundle = bundle;
if (getenv ("LISTEN_FDS"))
{
crun_context.listen_fds = strtoll (getenv ("LISTEN_FDS"), NULL, 10);
crun_context.preserve_fds += crun_context.listen_fds;
}
return libcrun_container_run (&crun_context, container, keep ? LIBCRUN_RUN_OPTIONS_KEEP : 0, err);
return crun_run_create_internal (global_args, argc, argv, libcrun_container_run, get_options, &crun_context, &run_argp, &config_file, &bundle, err);
}

src/run_create.c Normal file

@ -0,0 +1,103 @@
/*
* crun - OCI runtime written in C
*
* Copyright (C) 2017, 2018, 2019, 2020 Giuseppe Scrivano <giuseppe@scrivano.org>
* crun is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* crun is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with crun. If not, see <http://www.gnu.org/licenses/>.
*/
#include <config.h>
#include <stdio.h>
#include <stdlib.h>
#include <argp.h>
#include <string.h>
#include <unistd.h>
#include <errno.h>
#include "crun.h"
#include "run_create.h"
#include "libcrun/container.h"
#include "libcrun/utils.h"
int
crun_run_create_internal (struct crun_global_arguments *global_args, int argc, char **argv,
container_run_create_func_t container_run_create_func, get_options_func_t get_options_func,
libcrun_context_t *crun_context, struct argp *run_argp, const char **config_file_ptr,
const char **bundle_ptr, libcrun_error_t *err)
{
int first_arg = 0, ret;
cleanup_container libcrun_container_t *container = NULL;
cleanup_free char *bundle_cleanup = NULL;
cleanup_free char *config_file_cleanup = NULL;
crun_context->preserve_fds = 0;
crun_context->listen_fds = 0;
argp_parse (run_argp, argc, argv, ARGP_IN_ORDER, &first_arg, crun_context);
/* Get options after parsing the arguments. */
unsigned int options = get_options_func ();
const char *config_file = *config_file_ptr;
const char *bundle = *bundle_ptr;
crun_assert_n_args (argc - first_arg, 1, 1);
/* Make sure the config is an absolute path before changing the directory. */
if ((strcmp ("config.json", config_file) != 0))
{
if (config_file[0] != '/')
{
config_file_cleanup = realpath (config_file, NULL);
if (config_file_cleanup == NULL)
libcrun_fail_with_error (errno, "realpath `%s` failed", config_file);
config_file = config_file_cleanup;
}
}
/* Make sure the bundle is an absolute path. */
if (bundle == NULL)
{
bundle = bundle_cleanup = getcwd (NULL, 0);
if (UNLIKELY (bundle == NULL))
libcrun_fail_with_error (errno, "getcwd failed");
}
else
{
if (bundle[0] != '/')
{
bundle_cleanup = realpath (bundle, NULL);
if (bundle_cleanup == NULL)
libcrun_fail_with_error (errno, "realpath `%s` failed", bundle);
bundle = bundle_cleanup;
}
if (chdir (bundle) < 0)
libcrun_fail_with_error (errno, "chdir `%s` failed", bundle);
}
ret = init_libcrun_context (crun_context, argv[first_arg], global_args, err);
if (UNLIKELY (ret < 0))
return ret;
container = libcrun_container_load_from_file (config_file, err);
if (container == NULL)
return -1;
libcrun_debug ("Using bundle: %s", bundle);
crun_context->bundle = bundle;
if (getenv ("LISTEN_FDS"))
{
crun_context->listen_fds = parse_int_or_fail (getenv ("LISTEN_FDS"), "LISTEN_FDS");
crun_context->preserve_fds += crun_context->listen_fds;
}
return container_run_create_func (crun_context, container, options, err);
}
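
Review note: the ordering in `crun_run_create_internal` matters because a relative config path must be resolved against the original working directory, before the `chdir` into the bundle. The sketch below models that logic in Python for illustration only. `resolve_paths` is a hypothetical helper, not part of crun, and it differs from the C code in one respect: `os.path.realpath` does not fail on a missing path, while C `realpath` does.

```python
import os


def resolve_paths(config_file, bundle, cwd):
    """Return (config_file, bundle) roughly as the C code computes them.

    Hypothetical helper for illustration; not part of crun.
    """
    # The default name is left relative on purpose: it is later resolved
    # relative to the bundle directory.
    if config_file != "config.json" and not os.path.isabs(config_file):
        config_file = os.path.realpath(os.path.join(cwd, config_file))
    if bundle is None:
        # No bundle given: the current directory is the bundle.
        bundle = cwd
    elif not os.path.isabs(bundle):
        bundle = os.path.realpath(os.path.join(cwd, bundle))
    return config_file, bundle


if __name__ == "__main__":
    print(resolve_paths("custom.json", "bundles/demo", os.getcwd()))
```

Resolving the config path first means a value such as `custom.json` keeps pointing at the file the caller meant, even after the process has moved into the bundle directory.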

31
src/run_create.h Normal file

@ -0,0 +1,31 @@
/*
* crun - OCI runtime written in C
*
* Copyright (C) 2017, 2018, 2019 Giuseppe Scrivano <giuseppe@scrivano.org>
* crun is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* crun is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with crun. If not, see <http://www.gnu.org/licenses/>.
*/
#ifndef RUN_CREATE_H
#define RUN_CREATE_H
#include "crun.h"
typedef int (*container_run_create_func_t) (libcrun_context_t *, libcrun_container_t *, unsigned int, libcrun_error_t *);
typedef unsigned int (*get_options_func_t) ();
int crun_run_create_internal (struct crun_global_arguments *global_args, int argc, char **argv,
container_run_create_func_t container_run_create_func, get_options_func_t get_options_func,
libcrun_context_t *crun_context, struct argp *run_argp, const char **config_file,
const char **bundle, libcrun_error_t *err);
#endif


@ -1,7 +1,7 @@
FROM alpine
RUN apk add gcc automake autoconf libtool gettext pkgconf git make musl-dev \
python3 libcap-dev libseccomp-dev yajl-dev argp-standalone go-md2man
python3 libcap-dev libseccomp-dev yajl-dev argp-standalone go-md2man gperf
COPY run-tests.sh /usr/local/bin


@ -1,6 +1,6 @@
FROM fedora:latest
RUN yum install -y git protobuf-c protobuf-c-devel make clang-tools-extra clang python3-pip 'dnf-command(builddep)' && \
RUN dnf install -y awk git protobuf-c protobuf-c-devel make clang-tools-extra clang python3-pip 'dnf-command(builddep)' && \
dnf builddep -y crun && pip install scan-build
COPY run-tests.sh /usr/local/bin


@ -1,6 +1,6 @@
FROM fedora:latest
RUN dnf install -y git make clang-tools-extra 'dnf-command(builddep)' && dnf builddep -y crun
RUN dnf install -y awk git make clang-tools-extra 'dnf-command(builddep)' && dnf builddep -y crun
COPY run-tests.sh /usr/local/bin
ENTRYPOINT /usr/local/bin/run-tests.sh


@ -5,7 +5,7 @@ set -e -x
cd /crun
# Recent git complains that the directory is owned by someone else,
# which happens if we run it via a Dockefile with a volume mounted.
# which happens if we run it via a Dockerfile with a volume mounted.
git config --global --add safe.directory "$(pwd)"
./configure


@ -1,3 +1,3 @@
FROM fedora:latest
RUN yum install -y codespell && yum clean all -y
RUN dnf install -y awk codespell && dnf clean all -y


@ -1,23 +1,23 @@
FROM ubuntu:lunar
FROM ubuntu:noble
ENV GOPATH=/root/go
ENV PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/root/go/bin
RUN apt-get update \
&& apt-get -y upgrade \
&& apt-get install -y bash golang-1.19 libbtrfs-dev libnl-3-dev libnet1-dev \
&& apt-get install -y bash golang-1.22 libbtrfs-dev libnl-3-dev libnet1-dev \
protobuf-c-compiler libcap-dev libaio-dev \
curl libprotobuf-c-dev libprotobuf-dev socat libseccomp-dev \
pigz lsof make git gcc build-essential pkgconf libtool \
libsystemd-dev libcap-dev libyajl-dev \
go-md2man libtool autoconf python3 automake sudo \
&& update-alternatives --install /usr/bin/go go /usr/lib/go-1.19/bin/go 0 \
&& update-alternatives --install /usr/bin/go go /usr/lib/go-1.22/bin/go 0 \
&& mkdir -p /root/go/src/github.com/containerd \
&& chmod 755 /root \
&& (cd /root/go/src/github.com/containerd \
&& git clone https://github.com/containerd/containerd \
&& cd containerd \
&& git reset --hard v1.7.1 \
&& git reset --hard v2.1.1 \
&& make \
&& make binaries \
&& make install \


@ -18,4 +18,4 @@ ulimit -u unlimited
export PATH=$PATH:${PWD}/bin
make RUNC_FLAVOR=crun TEST_RUNTIME=io.containerd.runc.v2 TESTFLAGS="-timeout 120m" integration
make RUNC_FLAVOR=crun TEST_RUNTIME=io.containerd.runc.v2 TESTFLAGS="-timeout 120m -no-criu -test.v" integration


@ -3,11 +3,11 @@ FROM fedora:latest
ENV GOPATH=/root/go
ENV PATH=/usr/bin:/usr/sbin:/root/go/bin:/usr/local/bin::/usr/local/sbin
RUN yum install -y python git gcc automake autoconf libcap-devel \
RUN dnf install -y awk python git gcc automake autoconf libcap-devel \
systemd-devel yajl-devel libseccomp-devel go-md2man conntrack-tools which \
glibc-static python3-libmount libtool make podman xz nmap-ncat jq bats \
iproute openssl iputils socat criu-libs irqbalance && \
dnf install -y 'dnf-command(builddep)' && dnf builddep -y podman && \
dnf install -y awk 'dnf-command(builddep)' && dnf builddep -y podman && \
dnf remove -y golang && \
sudo dnf update -y && \
curl -LO https://dl.google.com/go/go1.19.4.linux-amd64.tar.gz && \


@ -1,6 +1,6 @@
FROM fedora:latest
RUN yum install -y golang python git automake autoconf libcap-devel \
RUN dnf install -y awk golang python git automake autoconf libcap-devel \
systemd-devel yajl-devel libseccomp-devel go-md2man \
glibc-static python3-libmount libtool make honggfuzz git


@ -17,6 +17,7 @@ git clean -fdx
./autogen.sh
./configure --enable-embedded-yajl HFUZZ_CC_UBSAN=1 HFUZZ_CC_ASAN=1 CC=hfuzz-clang CPPFLAGS="-D FUZZER" CFLAGS="-ggdb3 -fsanitize-coverage=trace-pc-guard,trace-cmp,trace-div,indirect-calls"
make -j "$(nproc)"
make -j "$(nproc)" tests/tests_libcrun_fuzzer
mkdir rootfs
mkdir random-data


@ -38,6 +38,12 @@
#include <sys/types.h>
#include <sched.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <net/if.h>
#ifdef HAVE_LINUX_IOPRIO_H
# include <linux/ioprio.h>
#endif
@ -245,9 +251,9 @@ cat (char *file)
}
static int
open_only (char *file)
open_only (char *file, int flags)
{
int fd = open (file, O_RDONLY);
int fd = open (file, flags);
if (fd >= 0)
{
close (fd);
@ -364,7 +370,7 @@ memhog (int megabytes)
while (1)
{
/* change one page each 0.1 seconds */
nanosleep ((const struct timespec[]){ { 0, 100000000L } }, NULL);
nanosleep ((const struct timespec[]) { { 0, 100000000L } }, NULL);
buf[pos] = 'c';
pos += sysconf (_SC_PAGESIZE);
if (pos > megabytes * 1024 * 1024)
@ -374,6 +380,99 @@ memhog (int megabytes)
return 0;
}
#define BUFFER_SIZE 8192
static void
dump_net_interface (const char *ifname)
{
struct sockaddr_nl sa;
struct nlmsghdr *nlh;
struct ifaddrmsg *ifa;
char *buffer;
int ifindex;
int sock;
ifindex = if_nametoindex (ifname);
if (ifindex == 0)
error (EXIT_FAILURE, errno, "if_nametoindex `%s`", ifname);
sock = socket (AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
if (sock < 0)
error (EXIT_FAILURE, errno, "socket");
buffer = malloc (BUFFER_SIZE);
if (buffer == NULL)
error (EXIT_FAILURE, errno, "malloc");
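/* Zero the address so nl_pid and nl_groups are not passed to bind() uninitialized. */
memset (&sa, 0, sizeof (sa));
sa.nl_family = AF_NETLINK;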
if (bind (sock, (struct sockaddr *) &sa, sizeof (sa)) < 0)
error (EXIT_FAILURE, errno, "bind");
nlh = (struct nlmsghdr *) buffer;
memset (buffer, 0, BUFFER_SIZE);
nlh->nlmsg_len = NLMSG_LENGTH (sizeof (struct ifaddrmsg));
nlh->nlmsg_type = RTM_GETADDR;
nlh->nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP;
nlh->nlmsg_seq = 1;
ifa = NLMSG_DATA (nlh);
ifa->ifa_family = AF_INET;
if (send (sock, nlh, nlh->nlmsg_len, 0) < 0)
error (EXIT_FAILURE, errno, "send");
while (1)
{
ssize_t len = recv (sock, buffer, BUFFER_SIZE, 0);
if (len < 0)
error (EXIT_FAILURE, errno, "recv");
for (nlh = (struct nlmsghdr *) buffer; NLMSG_OK (nlh, len); nlh = NLMSG_NEXT (nlh, len))
{
struct rtattr *rta[IFA_MAX + 1] = {};
struct rtattr *rta_it;
int rta_len;
if (nlh->nlmsg_type == NLMSG_DONE)
goto done;
if (nlh->nlmsg_type == NLMSG_ERROR)
error (EXIT_FAILURE, 0, "netlink error");
ifa = NLMSG_DATA (nlh);
if (ifa->ifa_index != ifindex)
continue;
rta_it = IFA_RTA (ifa);
rta_len = IFA_PAYLOAD (nlh);
while (RTA_OK (rta_it, rta_len))
{
if (rta_it->rta_type <= IFA_MAX)
rta[rta_it->rta_type] = rta_it;
rta_it = RTA_NEXT (rta_it, rta_len);
}
if (rta[IFA_LOCAL])
{
char addr[INET_ADDRSTRLEN];
inet_ntop (AF_INET, RTA_DATA (rta[IFA_LOCAL]), addr, sizeof (addr));
printf ("address: %s/%d\n", addr, ifa->ifa_prefixlen);
}
if (rta[IFA_BROADCAST])
{
char bcast[INET_ADDRSTRLEN];
inet_ntop (AF_INET, RTA_DATA (rta[IFA_BROADCAST]), bcast, sizeof (bcast));
printf ("broadcast: %s\n", bcast);
}
}
}
done:
close (sock);
free (buffer);
}
int
main (int argc, char **argv)
{
@ -442,7 +541,14 @@ main (int argc, char **argv)
{
if (argc < 3)
error (EXIT_FAILURE, 0, "'open' requires an argument");
return open_only (argv[2]);
return open_only (argv[2], O_RDONLY);
}
if (strcmp (argv[1], "openwronly") == 0)
{
if (argc < 3)
error (EXIT_FAILURE, 0, "'openwronly' requires an argument");
return open_only (argv[2], O_WRONLY);
}
if (strcmp (argv[1], "access") == 0)
@ -500,6 +606,45 @@ main (int argc, char **argv)
return 0;
}
if (strcmp (argv[1], "isfifo") == 0)
{
struct stat st;
if (argc < 3)
error (EXIT_FAILURE, 0, "'isfifo' requires a path argument");
if (stat (argv[2], &st) < 0)
error (EXIT_FAILURE, errno, "stat %s", argv[2]);
if (S_ISFIFO (st.st_mode))
exit (0);
else
exit (1);
}
if (strcmp (argv[1], "ischar") == 0)
{
struct stat st;
if (argc < 3)
error (EXIT_FAILURE, 0, "'ischar' requires a path argument");
if (stat (argv[2], &st) < 0)
error (EXIT_FAILURE, errno, "stat %s", argv[2]);
if (S_ISCHR (st.st_mode))
exit (0);
else
exit (1);
}
if (strcmp (argv[1], "isblock") == 0)
{
struct stat st;
if (argc < 3)
error (EXIT_FAILURE, 0, "'isblock' requires a path argument");
if (stat (argv[2], &st) < 0)
error (EXIT_FAILURE, errno, "stat %s", argv[2]);
if (S_ISBLK (st.st_mode))
exit (0);
else
exit (1);
}
if (strcmp (argv[1], "owner") == 0)
{
struct stat st;
@ -587,6 +732,14 @@ main (int argc, char **argv)
return 0;
}
if (strcmp (argv[1], "ip") == 0)
{
if (argc < 3)
error (EXIT_FAILURE, 0, "'ip' requires an argument");
dump_net_interface (argv[2]);
exit (EXIT_SUCCESS);
}
if (strcmp (argv[1], "write") == 0)
{
if (argc < 3)
@ -682,7 +835,11 @@ main (int argc, char **argv)
while (ret < 0 && errno == EINTR);
if (ret < 0)
return ret;
return status;
if (WIFEXITED (status))
return WEXITSTATUS (status);
if (WIFSIGNALED (status))
return 128 + WTERMSIG (status);
return EXIT_FAILURE;
}
return ls (argv[2]);
}
@ -690,6 +847,13 @@ main (int argc, char **argv)
if (strcmp (argv[1], "systemd-notify") == 0)
return sd_notify ();
if (strcmp (argv[1], "getpgrp") == 0)
{
pid_t pid = getpgrp ();
printf ("%d\n", pid);
return 0;
}
if (strcmp (argv[1], "check-feature") == 0)
{
if (argc < 3)
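
Review note: the hunk that replaces `return status;` stops returning the raw wait status and decodes it into an exit code instead, using the shell convention of 128 plus the signal number for a signalled child. A minimal Python sketch of the same mapping, for illustration only: `exit_code_from_wait_status` is a hypothetical name, and the literal 1 stands in for `EXIT_FAILURE`.

```python
import os


def exit_code_from_wait_status(status):
    """Map a raw wait status (as returned by waitpid) to an exit code."""
    if os.WIFEXITED(status):
        # Normal termination: report the child's own exit code.
        return os.WEXITSTATUS(status)
    if os.WIFSIGNALED(status):
        # Killed by a signal: shell-style 128 + signal number.
        return 128 + os.WTERMSIG(status)
    # Anything else (for example a stopped child) counts as a failure.
    return 1


if __name__ == "__main__":
    pid = os.fork()
    if pid == 0:
        os._exit(7)  # child: exit with a known code
    _, status = os.waitpid(pid, 0)
    print(exit_code_from_wait_status(status))
```

Returning the raw status would have truncated it to its low 8 bits at process exit, so a child that exited with a nonzero code could be reported as success.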


@ -3,7 +3,7 @@ FROM fedora:latest
ENV GOPATH=/root/go
ENV PATH=/usr/bin:/usr/sbin:/root/go/bin:/usr/local/bin::/usr/local/sbin
RUN yum install -y golang python git gcc automake autoconf libcap-devel \
RUN dnf install -y awk golang python git gcc automake autoconf libcap-devel \
systemd-devel yajl-devel libseccomp-devel libselinux-devel \
glibc-static python3-libmount libtool make go-md2man perl-Test2-Harness


@ -3,19 +3,20 @@ FROM fedora:latest
ENV GOPATH=/root/go
ENV PATH=/usr/bin:/usr/sbin:/root/go/bin:/usr/local/bin::/usr/local/sbin
RUN dnf install -y golang python git gcc automake autoconf libcap-devel \
systemd-devel yajl-devel libseccomp-devel go-md2man \
RUN dnf install -y awk golang python git gcc automake autoconf libcap-devel \
systemd-devel yajl-devel libseccomp-devel go-md2man catatonit \
glibc-static python3-libmount libtool make podman xz nmap-ncat procps-ng slirp4netns \
device-mapper-devel containernetworking-plugins 'dnf-command(builddep)' && \
dnf builddep -y podman && \
chmod 755 /root && \
git clone https://github.com/containers/podman /root/go/src/github.com/containers/podman && \
git clone --depth=1 https://github.com/containers/podman /root/go/src/github.com/containers/podman && \
cd /root/go/src/github.com/containers/podman && \
make .install.ginkgo install.catatonit && cp ./test/tools/build/ginkgo /usr/local/bin && \
make .install.ginkgo && cp ./bin/ginkgo /usr/local/bin && \
make
## Change default log driver to k8s-file for tests
RUN sed -i 's/journald/k8s-file/g' /usr/share/containers/containers.conf
RUN echo containers:200000:268435456 >> /etc/subuid && echo containers:200000:268435456 >> /etc/subgid
COPY run-tests.sh /usr/local/bin
WORKDIR /root/go/src/github.com/containers/podman
ENTRYPOINT /usr/local/bin/run-tests.sh


@ -47,6 +47,6 @@ export TMPDIR=/var/tmp
# - Podman run with specified static IPv6 has correct IP
# Does not work inside test environment.
ginkgo --focus='.*' --skip='.*(selinux|notify_socket|systemd|podman run exit 12*|podman run exit code on failure to exec|failed to start|search|trust|inspect|logs|generate|import|mounted rw|inherit host devices|play kube|cgroups=disabled|privileged CapEff|device-cgroup-rule|capabilities|network|pull from docker|--add-host|removes a pod with a container|prune removes a pod with a stopped container|overlay volume flag|prune unused images|podman images filter|image list filter|create --pull|podman ps json format|using journald for container|image tree|--pull|shared layers|child images|cached images|flag with multiple mounts|overlay and used as workdir|image_copy_tmp_dir|Podman run with specified static IPv6 has correct IP|authenticated push|pod create --share-parent|podman kill paused container|login and logout|podman top on privileged container|local registry with authorization|podman update container all options v2|push test|podman pull and run on split imagestore|Podman kube play|uidmapping and gidmapping|push with --add-compression).*' \
ginkgo --focus='.*' --skip='.*(selinux|notify_socket|systemd|podman run exit 12*|podman run exit code on failure to exec|failed to start|search|trust|inspect|logs|generate|import|mounted rw|inherit host devices|play kube|cgroups=disabled|privileged CapEff|device-cgroup-rule|capabilities|network|pull from docker|--add-host|removes a pod with a container|prune removes a pod with a stopped container|overlay volume flag|prune unused images|podman images filter|image list filter|create --pull|podman ps json format|using journald for container|image tree|--pull|shared layers|child images|cached images|flag with multiple mounts|overlay and used as workdir|image_copy_tmp_dir|Podman run with specified static IPv6 has correct IP|authenticated push|pod create --share-parent|podman kill paused container|login and logout|podman top on privileged container|local registry with authorization|podman update container all options v2|push test|podman pull and run on split imagestore|Podman kube play|uidmapping and gidmapping|push with --add-compression|enforces DiffID matching).*' \
-vv -tags "seccomp ostree selinux exclude_graphdriver_devicemapper" \
-timeout=50m -cover -flake-attempts 3 -progress -trace -no-color test/e2e/.


@ -6,18 +6,18 @@ OCI_RUNTIME=${OCI_RUNTIME:-/usr/bin/crun}
export INIT
export OCI_RUNTIME
rm -f *.trs
rm -f -- *.trs
COLOR=
COLOR="no"
if [ -t 1 ]; then
COLOR="--color-tests yes"
COLOR="yes"
fi
for i in test_*.py
do
./tap-driver.sh --test-name $i --log-file $i.log --trs-file $i.trs ${COLOR} --enable-hard-errors yes --expect-failure no -- /usr/bin/python $i
./tap-driver.sh --test-name "$i" --log-file "$i.log" --trs-file "$i.trs" --color-tests "${COLOR}" --enable-hard-errors yes --expect-failure no -- /usr/bin/python "$i"
done
if grep FAIL *.trs; then
if grep FAIL -- *.trs; then
exit 1
fi

140
tests/test_bpf_devices.py Normal file

@ -0,0 +1,140 @@
#!/bin/env python3
# crun - OCI runtime written in C
#
# Copyright (C) 2017, 2018, 2019 Giuseppe Scrivano <giuseppe@scrivano.org>
# crun is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# crun is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with crun. If not, see <http://www.gnu.org/licenses/>.
import subprocess
import sys
import json
import os
from tests_utils import *
def has_bpf_fs():
    """Check if BPF filesystem is mounted"""
    try:
        return os.path.exists("/sys/fs/bpf") and os.path.ismount("/sys/fs/bpf")
    except Exception:
        return False
def get_systemd_version():
    """Get systemd version number"""
    try:
        output = subprocess.check_output(["systemctl", "--version"], universal_newlines=True)
        # First line format: "systemd 250 (250.3-2-arch)"
        first_line = output.split('\n')[0]
        version_str = first_line.split()[1]
        return int(version_str)
    except Exception:
        return 0
def systemd_supports_bpf_program():
    """Check if systemd version supports BPFProgram property (>= 249)"""
    return get_systemd_version() >= 249
def check_bpf_prerequisites():
    """Check all prerequisites for BPF device tests. Returns 77 (skip) if not met, 0 if OK"""
    # Skip if not root
    if is_rootless():
        return 77
    # Skip if not cgroup v2
    if not is_cgroup_v2_unified():
        return 77
    # Skip if systemd not available
    if 'SYSTEMD' not in get_crun_feature_string():
        return 77
    # Skip if not running on systemd
    if not running_on_systemd():
        return 77
    # Skip if no BPF support
    if not has_bpf_fs():
        return 77
    # Skip if systemd doesn't support BPFProgram
    if not systemd_supports_bpf_program():
        return 77
    return 0
def test_bpf_devices_systemd():
    """Test BPF device handling with systemd: property set, file created, and cleanup"""
    ret = check_bpf_prerequisites()
    if ret != 0:
        return ret
    conf = base_config()
    conf['linux']['resources'] = {}
    add_all_namespaces(conf, cgroupns=True)
    conf['process']['args'] = ['/init', 'pause']
    cid = None
    bpf_path = None
    try:
        # Run container with systemd cgroup manager.
        _, cid = run_and_get_output(conf, command='run', detach=True, cgroup_manager="systemd")
        # Get systemd scope.
        state = run_crun_command(['state', cid])
        scope = json.loads(state)['systemd-scope']
        # Test 1: Check that BPFProgram property is set on the scope.
        output = subprocess.check_output(['systemctl', 'show', '-PBPFProgram', scope], close_fds=False).decode().strip()
        if output == "":
            sys.stderr.write("# BPFProgram property not found or empty\n")
            return -1
        # Should look like "device:/sys/fs/bpf/crun/crun-xxx_scope".
        if "device:/sys/fs/bpf/crun/" not in output:
            sys.stderr.write("# Bad BPFProgram property value: `%s`\n" % output)
            return -1
        # Test 2: Check that BPF program file was created.
        # Extract the path.
        bpf_path = output.split("device:", 1)[1]
        if not os.path.exists(bpf_path):
            sys.stderr.write("# BPF program file `%s` not found\n" % bpf_path)
            return -1
        # Test 3: Check that BPF program is cleaned up.
        # Delete the container.
        run_crun_command(["delete", "-f", cid])
        cid = None
        if os.path.exists(bpf_path):
            sys.stderr.write("# BPF program `%s` still exists after crun delete\n" % bpf_path)
            return -1
        return 0
    except Exception as e:
        sys.stderr.write("# Test failed with exception: %s\n" % str(e))
        return -1
    finally:
        if cid is not None:
            run_crun_command(["delete", "-f", cid])
    return 0
all_tests = {
    "bpf-devices-systemd": test_bpf_devices_systemd,
}
if __name__ == "__main__":
    tests_main(all_tests)
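
Review note: `get_systemd_version()` mixes the `systemctl` call with the parsing, so the parsing can only be exercised on a host that has systemd. A minimal sketch of a split, assuming the first output line has the form `systemd <number> ...` as in the comment in the test. `parse_systemd_version` is a hypothetical helper and is not part of this patch.

```python
def parse_systemd_version(output):
    """Return the major systemd version from `systemctl --version` output, or 0."""
    lines = output.splitlines()
    if not lines:
        return 0
    fields = lines[0].split()
    if len(fields) < 2 or fields[0] != "systemd":
        return 0
    # Keep only the leading ASCII digits, so a token with a suffix
    # (for example "256~rc1") still yields its numeric part.
    digits = ""
    for ch in fields[1]:
        if ch not in "0123456789":
            break
        digits += ch
    return int(digits) if digits else 0


if __name__ == "__main__":
    sample = "systemd 250 (250.3-2-arch)\n+PAM +AUDIT -SELINUX\n"
    print(parse_systemd_version(sample))
```

With this split, the `>= 249` threshold check can be tested against fixed strings, and unexpected output falls back to 0, which makes the test skip instead of fail.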

Some files were not shown because too many files have changed in this diff.