commit 788fdc685b introduced a race where the target process can die
before the child process opens the namespace files. Move the open
before the fork so that, if it fails, the parent process can attempt
to join a different container instead of failing.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
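The open-before-fork idea above can be sketched roughly as follows. This is only an illustration, not the code from the commit: `open_namespace` and `open_first_userns` are hypothetical helpers, and the list of candidate PIDs is assumed to come from the caller.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: open /proc/PID/ns/NS, return the fd or -1.  */
static int
open_namespace (pid_t pid, const char *ns)
{
  char path[64];
  int n = snprintf (path, sizeof (path), "/proc/%ld/ns/%s", (long) pid, ns);

  if (n < 0 || (size_t) n >= sizeof (path))
    return -1;
  return open (path, O_RDONLY);
}

/* Hypothetical helper: try each candidate in the parent, before any
   fork.  A target that died in the meantime only makes its open fail,
   so the next candidate is tried instead of aborting.  On success the
   index of the chosen candidate is stored in *chosen.  */
static int
open_first_userns (const pid_t *pids, size_t npids, size_t *chosen)
{
  size_t i;

  for (i = 0; i < npids; i++)
    {
      int fd = open_namespace (pids[i], "user");
      if (fd >= 0)
        {
          *chosen = i;
          return fd;
        }
    }
  return -1;
}
```

Once a descriptor is held, the namespace stays referenced even if the target process exits, so the child can later call setns(2) on it without racing.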
since we directly join the conmon user namespace, there is no need to
look up its parent user namespace, as we can safely assume it is the
init namespace.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
when /proc is mounted with hidepid=1 a process doesn't see processes
from the outer user namespace. This causes an issue reading the
cmdline from the parent process.
To address it, always read the command line from /proc/self instead of
using /proc/PARENT_PID.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
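As an illustration of reading the command line from /proc/self, here is a minimal sketch. The helper names `read_self_cmdline` and `count_args` are hypothetical and the buffer handling is an assumption, not the code from the commit.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read /proc/self/cmdline into buf.  Returns the
   number of bytes read, or -1 on error.  On success the buffer is
   always NUL-terminated; longer command lines are truncated.  */
static ssize_t
read_self_cmdline (char *buf, size_t size)
{
  size_t used = 0;
  int fd;

  if (size == 0)
    return -1;

  fd = open ("/proc/self/cmdline", O_RDONLY);
  if (fd < 0)
    return -1;

  while (used < size - 1)
    {
      ssize_t r = read (fd, buf + used, size - 1 - used);
      if (r < 0)
        {
          close (fd);
          return -1;
        }
      if (r == 0)
        break;
      used += (size_t) r;
    }
  close (fd);
  buf[used] = '\0';
  return (ssize_t) used;
}

/* Arguments in /proc/PID/cmdline are separated by NUL bytes, so the
   number of NUL bytes in the first len bytes is the argument count.  */
static int
count_args (const char *buf, size_t len)
{
  int argc = 0;
  size_t i;

  for (i = 0; i < len; i++)
    if (buf[i] == '\0')
      argc++;
  return argc;
}
```

/proc/self always refers to the calling process, so it stays readable regardless of which other PIDs the hidepid mount option hides.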
we need to store the pause process PID file so that it can be re-used
later.
commit e9dc212092 introduced this
regression.
Closes: https://github.com/containers/libpod/issues/5246
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
if there are more than FD_SETSIZE open fds passed down to the Podman
process, the initialization code could crash as it attempts to store
them into a fd_set. Use an array of fd_set structs, each of them
holding only FD_SETSIZE file descriptors.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
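A possible shape for the array-of-fd_set approach is sketched below. The `struct fd_sets` type and its helpers are hypothetical names used for illustration; the actual layout in the commit may differ.

```c
#include <stdlib.h>
#include <sys/select.h>

/* Hypothetical container: sets[i] tracks the descriptors in the range
   [i * FD_SETSIZE, (i + 1) * FD_SETSIZE).  */
struct fd_sets
{
  fd_set *sets;
  size_t nsets;
};

/* Allocate enough fd_set structs to hold descriptors up to max_fd.  */
static int
fd_sets_init (struct fd_sets *s, int max_fd)
{
  size_t i;

  if (max_fd < 0)
    return -1;
  s->nsets = (size_t) max_fd / FD_SETSIZE + 1;
  s->sets = malloc (s->nsets * sizeof (fd_set));
  if (s->sets == NULL)
    return -1;
  for (i = 0; i < s->nsets; i++)
    FD_ZERO (&s->sets[i]);
  return 0;
}

/* Record fd; out-of-range descriptors are silently ignored.  */
static void
fd_sets_add (struct fd_sets *s, int fd)
{
  if (fd < 0 || (size_t) fd / FD_SETSIZE >= s->nsets)
    return;
  /* Each fd_set only ever sees indexes below FD_SETSIZE.  */
  FD_SET (fd % FD_SETSIZE, &s->sets[fd / FD_SETSIZE]);
}

static int
fd_sets_contains (struct fd_sets *s, int fd)
{
  if (fd < 0 || (size_t) fd / FD_SETSIZE >= s->nsets)
    return 0;
  return FD_ISSET (fd % FD_SETSIZE, &s->sets[fd / FD_SETSIZE]) != 0;
}

static void
fd_sets_free (struct fd_sets *s)
{
  free (s->sets);
  s->sets = NULL;
  s->nsets = 0;
}
```

Passing a value of FD_SETSIZE or more to FD_SET is undefined behavior, which is why the descriptor is split into a set index and an offset inside that set.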
if the pause process doesn't exist and we try to join a conmon
namespace, make sure the conmon process still exists. Otherwise
re-create the user namespace.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
make sure the rootless env variables are also set when we directly
join the user+mount namespace without creating a new process.
This is required by pkg/unshare in containers/common.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
the renameat2 syscall might be defined in the C library but lack
support in the kernel.
In that case, fall back to open(O_CREAT)+rename, as is done on
systems lacking the definition of renameat2.
Closes: https://github.com/containers/libpod/issues/4570
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
if the pause process cannot be joined, remove the pause.pid while
keeping a lock on it, and try to recreate it.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
If we don't do this, we print WARN level messages that we should
not be printing by default.
Up one WARN message to ERROR so it still shows up by default.
Fixes: #4115
Fixes: #4012
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
detect if the current user namespace doesn't match the configuration
in the /etc/subuid and /etc/subgid files.
If there is a mismatch, raise a warning and suggest that the user
recreate the user namespace with "system migrate", which also restarts
the containers.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
unfortunately rootless won't work without cgo, as most of the
implementation is in C, but at least allow libpod to be built.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
do not attempt to join the rootless namespace if the process is
already running with euid == 0.
Closes: https://github.com/containers/libpod/issues/3463
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Currently the pause process blocks all signals that may cause its
termination, including SIGTERM. This behavior hangs init(1) during
system shutdown until the pause process gets SIGKILLed after some grace
period. To avoid this hang, SIGTERM is excluded from the list of
blocked signals.
Fixes: #3440
Signed-off-by: Danila Kiver <danila.kiver@mail.ru>
at least on Fedora 30 it creates the /run/user/UID directory for the
user logged in via ssh.
This needs to be done very early so that every other check when we
create the default configuration file will point to the correct
location.
Closes: https://github.com/containers/libpod/issues/3410
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
To avoid unnecessary warnings and errors in the future I'd like to
propose building all cgo related sources with `-Wall -Werror`. This
commit fixes some warnings which came up in `shm_lock.c`, too.
Signed-off-by: Sascha Grunert <sgrunert@suse.com>
The second argument of `execlp` should be of type `char *`, so we need
to add an additional argument there.
Signed-off-by: Sascha Grunert <sgrunert@suse.com>
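For reference, the `execlp` argument list starts with the file to look up, followed by `argv[0]`, any further arguments, and a terminating `(char *) NULL`. The sketch below illustrates the calling convention with the `true` utility; it is not the call changed by the commit, and `run_true` is a hypothetical helper.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run "true" from PATH in a child process and
   return its exit status, or -1 on error.  */
static int
run_true (void)
{
  int status;
  pid_t pid = fork ();

  if (pid < 0)
    return -1;
  if (pid == 0)
    {
      /* First the file to look up in PATH, then argv[0], then the
         terminator, which must be cast to a pointer type.  */
      execlp ("true", "true", (char *) NULL);
      _exit (127);
    }
  if (waitpid (pid, &status, 0) < 0)
    return -1;
  return WIFEXITED (status) ? WEXITSTATUS (status) : -1;
}
```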
on old kernels the NS_GET_PARENT ioctl is not available.
Handle the error code and immediately return the same fd. This should
be fine now that namespace resolution uses the conmon pid, so the
parent namespace lookup is just a safety measure.
Closes: https://github.com/containers/libpod/issues/2968
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
we are allowed to use only async-signal-safe functions between a fork
of a multithreaded application and the next execve. Since setenv(3) is
not async-signal-safe, block signals. We are already doing it when
creating a new namespace.
This is mostly a cleanup, since reexec_in_user_namespace_wait is used
only to join existing namespaces when we do not have a pause.pid
file.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
after we read from the pause PID file, NUL terminate the buffer to
avoid reading garbage from the stack.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
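A minimal sketch of the NUL-termination pattern follows. The helper name `read_pid_from_fd` and the buffer size are assumptions made for illustration.

```c
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: read a decimal PID from fd.  Returns the PID,
   or -1 if nothing could be read.  */
static long
read_pid_from_fd (int fd)
{
  char buf[32];
  /* Leave room for the terminator.  */
  ssize_t r = read (fd, buf, sizeof (buf) - 1);

  if (r <= 0)
    return -1;
  /* read(2) does not terminate the buffer; without this, strtol could
     keep parsing whatever bytes happen to follow on the stack.  */
  buf[r] = '\0';
  return strtol (buf, NULL, 10);
}
```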