Commit Graph

60 Commits

Author SHA1 Message Date
Qi Wang 34e82f81bd validate fds --preserve-fds
validate file descriptors passed from podman run and podman exec --preserve-fds.

Signed-off-by: Qi Wang <qiwan@redhat.com>
2020-08-04 15:09:17 -04:00
Giuseppe Scrivano 8df7ab24b0
rootless: system service joins immediately the namespaces
when there is a pause process running, let the "system service" podman
instance join immediately the existing namespaces.

Closes: https://github.com/containers/podman/issues/7180
Closes: https://github.com/containers/podman/issues/6660

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2020-08-03 22:08:17 +02:00
Giuseppe Scrivano d86ef45441
rootless: child exits immediately on userns errors
if the parent process failed to create the user namespace, let the
child exit immediately.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2020-07-30 21:46:04 +02:00
Daniel J Walsh 6979d140f1
Add podman image mount
There are many use cases where you want to just mount an image
without creating a container on it. For example you might want
to just examine the content in an image after you pull it for
security analysys.  Or you might want to just use the executables
on the image without running it in a container.

The image is mounted readonly since we do not want people changing
images.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2020-07-28 10:27:44 -04:00
Daniel J Walsh 8f7ed50cb2
Cleanup handling of podman mount/unmount
We should default to the user name unmount rather then the internal
name of umount.

Also User namespace was not being handled correctly. We want to inform
the user that if they do a mount when in rootless mode that they have
to be first in the podman unshare state.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2020-07-27 16:53:02 -04:00
Giuseppe Scrivano 89d4940a37
rootless: move ns open before fork
commit 788fdc685b introduced a race
where the target process dies before the child process opens the
namespace files.  Move the open before the fork so if it fails the
parent process can attempt to join a different container instead of
failing.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2020-04-29 11:44:24 +02:00
Giuseppe Scrivano 788fdc685b
rootless: move join namespace inside child process
open the namespace file descriptors inside of the child process.

Closes: https://github.com/containers/libpod/issues/5873

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2020-04-20 17:40:25 +02:00
Giuseppe Scrivano c33371fadb
rootless: use snprintf
use directly snprintf instead of strlen+strcpy.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2020-04-13 13:35:38 +02:00
Giuseppe Scrivano 1091440e5d
rootless: fix usage with hidepid=1
when /proc is mounted with hidepid=1 a process doesn't see processes
from the outer user namespace.  This causes an issue reading the
cmdline from the parent process.

To address it, always read the command line from /proc/self instead of
using /proc/PARENT_PID.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2020-03-19 11:18:23 +01:00
Giuseppe Scrivano d400f0b5b2
rootless: fix segfault when open fd >= FD_SETSIZE
if there are more than FD_SETSIZE open fds passed down to the Podman
process, the initialization code could crash as it attempts to store
them into a fd_set.  Use an array of fd_set structs, each of them
holding only FD_SETSIZE file descriptors.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2020-02-25 17:52:06 +01:00
Giuseppe Scrivano 1d9537e242 rootless: enable shortcut only for podman
disable joining automatically the user namespace if the process is not
podman.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2020-01-29 16:16:36 -06:00
Giuseppe Scrivano ab7744d3c1
rootless: set C variables also on shortcut
make sure the rootless env variables are set also when we are joining
directly the user+mount namespace without creating a new process.

It is required by pkg/unshare in containers/common.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2020-01-20 16:42:45 +01:00
Giuseppe Scrivano a94e625868
rootless: add fallback for renameat2 at runtime
the renameat2 syscall might be defined in the C library but lacking
support in the kernel.

In such case, let it fallback to open(O_CREAT)+rename as it does on
systems lacking the definition for renameat2.

Closes: https://github.com/containers/libpod/issues/4570

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2019-12-04 10:30:40 +01:00
Giuseppe Scrivano 0a8dcd7112
rootless: provide workaround for missing renameat2
on RHEL 7.7 renameat2 is not implemented for s390x, provide a
workaround.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1768519

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2019-11-06 15:27:46 +01:00
Giuseppe Scrivano a114e9059a
rootless: use SYS_renameat2 instead of __NR_renameat2
use the correct definition for the syscall number.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2019-11-06 13:41:15 +01:00
OpenShift Merge Robot 146719718e
Merge pull request #3782 from eriksjolund/fix_realloc_in_rootless_linux.c
Fix incorrect use of realloc()
2019-08-11 19:44:58 +02:00
Erik Sjölund 39ce3626e0
Adjust read count so that a newline can be added afterwards
Signed-off-by: Erik Sjölund <erik.sjolund@gmail.com>
2019-08-11 16:44:26 +02:00
Erik Sjölund 4d3cf9b576
Fix incorrect use of realloc()
Signed-off-by: Erik Sjölund <erik.sjolund@gmail.com>
2019-08-11 15:58:20 +02:00
Daniel J Walsh 44126969f1
Fix a couple of errors descovered by coverity
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2019-08-09 15:33:16 -04:00
Giuseppe Scrivano 4b176d4f45
rootless: do not join namespace if it has already euid == 0
do not attempt to join the rootless namespace if it is running already
with euid == 0.

Closes: https://github.com/containers/libpod/issues/3463

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2019-07-01 21:58:33 +02:00
Danila Kiver 7ea7754e4a Exclude SIGTERM from blocked signals for pause process.
Currently pause process blocks all signals which may cause its
termination, including SIGTERM. This behavior hangs init(1) during
system shutdown, until pause process gets SIGKILLed after some grace
period. To avoid this hanging, SIGTERM is excluded from list of blocked
signals.

Fixes #3440

Signed-off-by: Danila Kiver <danila.kiver@mail.ru>
2019-06-28 00:18:13 +03:00
OpenShift Merge Robot f446ccf0b0
Merge pull request #3379 from openSUSE/rootless-fix
Fix format specifiers in rootless_linux.c
2019-06-21 00:18:24 -07:00
OpenShift Merge Robot f65ddc0991
Merge pull request #3380 from openSUSE/asprintf-fix
Handle possible asprintf failure in rootless_linux.c
2019-06-20 12:30:27 -07:00
Sascha Grunert 6e318a01a0
Fix execvp uage in rootless_linux.c
The second argument of `execlp` should be of type `char *`, so we need
to add an additional argument there.

Signed-off-by: Sascha Grunert <sgrunert@suse.com>
2019-06-20 15:07:01 +02:00
Sascha Grunert fa1b0a2d89
Handle possible asprintf failure in rootless_linux.c
If `asprintf` fails we early exit now.

Signed-off-by: Sascha Grunert <sgrunert@suse.com>
2019-06-20 14:52:32 +02:00
Sascha Grunert 3cf3ccbd77
Fix format specifiers in rootless_linux.c
Format `%d` expects argument of type `int`, but the argument has a type
of `long int`.

Signed-off-by: Sascha Grunert <sgrunert@suse.com>
2019-06-20 12:03:04 +02:00
Giuseppe Scrivano 6b0e1a3091
rootless: block signals on re-exec
we are allowed to use only signal safe functions between a fork of a
multithreaded application and the next execve.  Since setenv(3) is not
signal safe, block signals.  We are already doing it for creating a
new namespace.

This is mostly a cleanup since reexec_in_user_namespace_wait is used
only only to join existing namespaces when we have not a pause.pid
file.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2019-06-03 14:25:10 +02:00
Giuseppe Scrivano 27e47cb6d0
rootless: use TEMP_FAILURE_RETRY macro
avoid checking for EINTR for every syscall that could block.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2019-05-31 22:05:25 +02:00
Giuseppe Scrivano b88dc3a41e
rootless: fix return type
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2019-05-31 22:05:25 +02:00
Giuseppe Scrivano 10983c363e
rootless: make sure the buffer is NUL terminated
after we read from the pause PID file, NUL terminate the buffer to
avoid reading garbage from the stack.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2019-05-31 22:05:24 +02:00
Giuseppe Scrivano ee11f3bce9
rootless: new function to join existing conmon processes
move the logic for joining existing namespaces down to the rootless
package.  In main_local we still retrieve the list of conmon pid files
and use it from the rootless package.

In addition, create a temporary user namespace for reading these
files, as the unprivileged user might not have enough privileges for
reading the conmon pid file, for example when running with a different
uidmap and root in the container is different than the rootless user.

Closes: https://github.com/containers/libpod/issues/3187

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2019-05-25 13:47:57 +02:00
Giuseppe Scrivano ce26aa701f
rootless: block signals for pause
block signals for the pause process, so it can't be killed by
mistake.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2019-05-25 13:46:32 +02:00
Giuseppe Scrivano 6df320c391
rootless: store also the original GID in the host
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2019-05-23 22:41:48 +02:00
Giuseppe Scrivano 562357ebb2
rootless: join namespace immediately when possible
add a shortcut for joining immediately the namespace so we don't need
to re-exec Podman.

With the pause process simplificaton, we can now attempt to join the
namespaces as soon as Podman starts (and before the Go runtime kicks
in), so that we don't need to re-exec and use just one process.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2019-05-17 20:48:24 +02:00
Giuseppe Scrivano 791d53a214
rootless: use a pause process
use a pause process to keep the user and mount namespace alive.

The pause process is created immediately on reload, and all successive
Podman processes will refer to it for joining the user&mount
namespace.

This solves all the race conditions we had on joining the correct
namespaces using the conmon processes.

As a fallback if the join fails for any reason (e.g. the pause process
was killed), then we try to join the running containers as we were
doing before.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2019-05-17 20:48:24 +02:00
Giuseppe Scrivano 0dfb61d806
rootless: not close more FDs than needed
we were previously closing as many FDs as they were open when we first
started Podman in the range (3-MAX-FD).  This would cause issues if
there were empty intervals, as these FDs are later on used by the
Golang runtime.  Store exactly what FDs were first open in a fd_set,
so that we can close exactly the FDs that were open at startup.

Closes: https://github.com/containers/libpod/issues/2964

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2019-04-18 14:32:46 +02:00
Giuseppe Scrivano 9e79530f8f
Revert "rootless: set controlling terminal for podman in the userns"
This reverts commit 531514e823.

Closes: https://github.com/containers/libpod/issues/2926

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2019-04-14 07:48:37 +02:00
Giuseppe Scrivano 531514e823
rootless: set controlling terminal for podman in the userns
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2019-04-12 18:20:28 +02:00
Giuseppe Scrivano 72382a12a7
rootless: use a single user namespace
simplify the rootless implementation to use a single user namespace
for all the running containers.

This makes the rootless implementation behave more like root Podman,
where each container is created in the host environment.

There are multiple advantages to it: 1) much simpler implementation as
there is only one namespace to join.  2) we can join namespaces owned
by different containers.  3) commands like ps won't be limited to what
container they can access as previously we either had access to the
storage from a new namespace or access to /proc when running from the
host.  4) rootless varlink works.  5) there are only two ways to enter
in a namespace, either by creating a new one if no containers are
running or joining the existing one from any container.

Containers created by older Podman versions must be restarted.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2019-04-01 15:32:58 +02:00
Giuseppe Scrivano ce0ca0d459
rootless: change env prefix
from _LIBPOD to _CONTAINERS.  The same change was done in buildah
unshare.

This is necessary for podman to detect we are running in a rootless
environment and work properly from a "buildah unshare" session.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2019-03-28 17:08:20 +01:00
Giuseppe Scrivano cc411dd98f
rootless: propagate errors from info
we use "podman info" to reconfigure the runtime after a reboot, but we
don't propagate the error message back if something goes wrong.

Closes: https://github.com/containers/libpod/issues/2584

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2019-03-08 19:42:20 +01:00
Giuseppe Scrivano ca5114faf9
rootless: fix clone syscall on s390 and cris archs
from the clone man page:

  On the cris and s390 architectures, the order of the first two
  arguments is reversed:

           long clone(void *child_stack, unsigned long flags,
                      int *ptid, int *ctid,
                      unsigned long newtls);

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1672714

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2019-03-05 17:27:43 +01:00
Giuseppe Scrivano 8984ba7461
rootless: force same cwd when re-execing
when joining an existing namespace, we were not maintaining the
current working directory, causing commands like export -o to fail
when they weren't referring to absolute paths.

Closes: https://github.com/containers/libpod/issues/2381

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2019-02-22 23:55:21 +01:00
Harald Hoyer 3f60dc02e3
Adjust LISTEN_PID for reexec in varlink mode
Because the varlink server honors the socket activation protocol,
LISTEN_PID has to be adjusted with the new PID.

https://varlink.org/FAQ.html#how-does-socket-activation-work
Signed-off-by: Harald Hoyer <harald@redhat.com>
2019-02-21 10:25:57 +01:00
Daniel J Walsh 0abb757425
Cleanup coverity scan issues
If realloc fails, then buffer will be leaked, this change frees up the buffer.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2019-01-15 17:09:15 -05:00
Giuseppe Scrivano f2e96b0934
rootless: add function to join user and mount namespace
Add the possibility to join directly the user and mount namespace
without looking up the parent of the user namespace.

We need this in order to be able the conmon process, as the mount
namespace is kept alive only there.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2018-12-21 09:46:05 +01:00
Giuseppe Scrivano 55c9b03baf
rootless: detect when user namespaces are not enabled
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2018-10-11 17:49:16 +02:00
Giuseppe Scrivano 2933c3b980
rootless: report more error messages from the startup phase
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2018-10-11 17:09:19 +02:00
Giuseppe Scrivano 48f6f9254d
rootless: fix an hang on older versions of setresuid/setresgid
the issue is caused by the Go Runtime that messes up with the process
signals, overriding SIGSETXID and SIGCANCEL which are used internally
by glibc.  They are used to inform all the threads to update their
stored uid/gid information.  This causes a hang on the set*id glibc
wrappers since the handler installed by glibc is never invoked.

Since we are running with only one thread, we don't really need to
update other threads or even the current thread as we are not using
getuid/getgid before the execvp.

Closes: https://github.com/containers/libpod/issues/1625

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2018-10-11 17:09:18 +02:00
Giuseppe Scrivano 807f6f8d8f rootless: check uid with Geteuid() instead of Getuid()
change the tests to use chroot to set a numeric UID/GID.

Go syscall.Credential doesn't change the effective UID/GID of the
process.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>

Closes: #1372
Approved by: mheon
2018-09-04 14:36:57 +00:00