we were adding a negative duration in podman events, causing inputs like
-5s to be correct and 5s to be incorrect.
fixes#11158
Signed-off-by: cdoern <cdoern@redhat.com>
Adds support for transferring data between systems and backing up systems.
Use cases: recover from disasters or move data between machines.
Signed-off-by: flouthoc <flouthoc.git@gmail.com>
This leverages conmon's ability to proxy the SD-NOTIFY socket.
This prevents locking caused by OCI runtime blocking, waiting for
SD-NOTIFY messages, and instead passes the messages directly up
to the host.
NOTE: Also re-enable the auto-update tests which has been disabled due
to flakiness. With this change, Podman properly integrates into
systemd.
Fixes: #7316
Signed-off-by: Joseph Gooch <mrwizard@dok.org>
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
When a network id is used to create a container we translate it to use the
name internally for the db. The network aliases are also stored with the
network name as key so we have to also translate them for the db.
Also removed some outdated skips from the e2e tests.
Fixes#11285
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
For docker compat include information about available volume, log and
network drivers which should be listed under the plugins key.
Fixes#11265
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Originally, Podman would unconditionally remove volumes from the
DB, even if they failed to be removed from the volume plugin;
this was a safety measure to ensure that `volume rm` can always
remove a volume from the database, even if the plugin is
misbehaving.
However, this is a significant deivation from Docker, which
refuses to remove if the plugin errors. These errors can be
legitimate configuration issues which the user should address
before the volume is removed, so Podman should also use this
behaviour.
Fixes#11214
Signed-off-by: Matthew Heon <mheon@redhat.com>
Dealing with os.Signal channels seems more like an art than science
since signals may get lost. os.Notify doesn't block on an unbuffered
channel, so users are expected to know what they're doing or hope for
the best.
In the recent past, I've seen a number of flakes and BZs on non-amd64
architectures where I was under the impression that signals may got
lost, for instance, during stop and exec.
[NO TESTS NEEDED] since this is art.
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
When a host uses systemd-resolved but not the resolved stub resolver the
following symlinks are created: `/etc/resolv.conf` ->
`/run/systemd/resolve/stub-resolv.conf` -> `/run/systemd/resolve/resolv.conf`.
Because the code uses filepath.EvalSymlinks we put the new resolv.conf
to `/run/systemd/resolve/resolv.conf` but the `/run/systemd/resolve/stub-resolv.conf`
link does not exists in the mount ns.
To fix this we will walk the symlinks manually until we reach the first
one under `/run` and use this for the resolv.conf file destination.
This fixes a regression which was introduced in e73d482990.
Fixes#11222
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
after the init containers pr merged, it was suggested to use `once`
instead of `oneshot` containers as it is more aligned with other
terminiology used similarily.
[NO TESTS NEEDED]
Signed-off-by: Brent Baude <bbaude@redhat.com>
Add the --userns flag to podman pod create and keep
track of the userns setting that pod was created with
so that all containers created within the pod will inherit
that userns setting.
Specifically we need to be able to launch a pod with
--userns=keep-id
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
Signed-off-by: Urvashi Mohnani <umohnani@redhat.com>
The slirp4netns path can be set in the config file or with
--network-cmd-path. Podman info should read the version information
correctly and not use PATH in this case. Also show the slirp4netns
version information to root users.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
this is the first pass at implementing init containers for podman pods.
init containersare made popular by k8s as a way to run setup for pods
before the pods standard containers run.
unlike k8s, we support two styles of init containers: always and
oneshot. always means the container stays in the pod and starts
whenever a pod is started. this does not apply to pods restarting.
oneshot means the container runs onetime when the pod starts and then is
removed.
Signed-off-by: Brent Baude <bbaude@redhat.com>
To match Docker's behavior, in the `--net=host` case, we need to
use the host's `/etc/hosts` file, unmodified (without adding an
entry for the container). We will still respect hosts from
`--add-host` but will not make any automatic changes.
Fortuntely, this is strictly a matter of removal and refactoring
as we already base our `/etc/hosts` on the host's version - just
need to remove the code that added entries when net=host was set.
Fixes#10319
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
podman info takes >20s on Gentoo, because equery is s..l..o..w.
qfile is much faster and, I suspect, present in most Gentoo
installations, so let's try it first.
And, because packageVersion() was scarily unmaintainable,
refactor it. Define a simple (string) list of packaging tools
to query (rpm, dpkg, ...) and iterate until we find one that
works.
IMPORTANT NOTE: the Debian (and, presumably, Ubuntu) query does not
include version number! There is no standard way on Debian to get
a package version from a file path, you can only do it via pipes
of chained commands, and I have no desire to implement that.
Signed-off-by: Ed Santiago <santiago@redhat.com>
The rootlessport forwarder requires a child IP to be set. This must be a
valid ip in the container network namespace. The problem is that after a
network disconnect and connect the eth0 ip changed. Therefore the
packages are dropped since the source ip does no longer exists in the
netns.
One solution is to set the child IP to 127.0.0.1, however this is a
security problem. [1]
To fix this we have to recreate the ports after network connect and
disconnect. To make this work the rootlessport process exposes a socket
where podman network connect/disconnect connect to and send to new child
IP to rootlessport. The rootlessport process will remove all ports and
recreate them with the new correct child IP.
Also bump rootlesskit to v0.14.3 to fix a race with RemovePort().
Fixes#10052
[1] https://nvd.nist.gov/vuln/detail/CVE-2021-20199
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Currently we override the SELinux labels specified by the user
if the container is runing a kata container or systemd container.
This PR fixes to use the label specified by the user.
Fixes: https://github.com/containers/podman/issues/11100
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
This adds support to checkpoint containers out of pods and restore
container into pods.
It is only possible to restore a container into a pod if it has been
checkpointed out of pod. It is also not possible to restore a non pod
container into a pod.
The main reason this does not work is the PID namespace. If a non pod
container is being restored in a pod with a shared PID namespace, at
least one process in the restored container uses PID 1 which is already
in use by the infrastructure container. If someone tries to restore
container from a pod with a shared PID namespace without a shared PID
namespace it will also fail because the resulting PID namespace will not
have a PID 1.
Signed-off-by: Adrian Reber <areber@redhat.com>
The upcoming commit to support checkpointing out of Pods requires CRIU
3.16. This changes the CRIU version check to support checking for
different versions.
Signed-off-by: Adrian Reber <areber@redhat.com>
Implement container to container copy. Previously data could only be
copied from/to the host.
Fixes: #7370
Co-authored-by: Mehul Arora <aroram18@mcmaster.ca>
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
Make sure podman network create reads all subnets from existing cni configs
and not only the first one.
Fixes#11032
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
compat containers/logs was missing actual usage of until query param.
This led me to implement the until param for libpod's container logs as well. Added e2e tests.
Signed-off-by: cdoern <cdoern@redhat.com>
The global flag will work in either location, and this flag just breaks
users expectations, and is basically a noop.
Also fix global storage-opt so that podman-remote can use it.
[NO TESTS NEEDED] Since it would be difficult to test in ci/cd.
Fixes: https://github.com/containers/podman/issues/10264
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
The `IgnorePlatform` options has been removed from the
`LookupImageOptions` in libimage to properly support multi-arch images.
Skip one buildah-bud test which requires updated CI images. This is
currently being done in github.com/containers/podman/pull/10829 but
we need to unblock merging common and buildah into podman.
[NO TESTS NEEDED]
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
Adds the new --infra-name command line argument allowing users to define
the name of the infra container
Issue #10794
Signed-off-by: José Guilherme Vanz <jvanz@jvanz.com>
added support for --pid flag. User can specify ns:file, pod, private, or host.
container returns an error since you cannot point the ns of the pods infra container
to a container outside of the pod.
Signed-off-by: cdoern <cdoern@redhat.com>
There was an race condition when calling `GetRootlessCNINetNs()`. It
created the rootless cni directory before it got locked. Therefore
another process could have called cleanup and removed this directory
before it was used resulting in errors. The lockfile got moved into the
XDG_RUNTIME_DIR directory to prevent a panic when the parent dir was
removed by cleanup.
Fixes#10930Fixes#10922
To make this even more robust `GetRootlessCNINetNs()` will now return
locked. This guarantees that we can run `Do()` after `GetRootlessCNINetNs()`
before another process could have called `Cleanup()` in between.
[NO TESTS NEEDED] CI is flaking, hopefully this will fix it.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Fix issue 10929 : `[Regression in 3.2.0] CNI-in-slirp4netns DNS gets broken when running a rootful container after running a rootless container`
When /etc/resolv.conf on the host is a symlink to /run/systemd/resolve/stub-resolv.conf,
we have to mount an empty filesystem on /run/systemd/resolve in the child namespace,
so as to isolate the directory from the host mount namespace.
Otherwise our bind-mount for /run/systemd/resolve/stub-resolv.conf is unmounted
when systemd-resolved unlinks and recreates /run/systemd/resolve/stub-resolv.conf on the host.
[NO TESTS NEEDED]
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
On EOF of STDIN, we need to perform a one-sided close of the
attach connection on the client side, to ensure that STDIN
finishing will also cause the exec session to terminate, instead
of hang.
Fixes#7360
Signed-off-by: Matthew Heon <mheon@redhat.com>
If mounting to existing directory the uid/gid should be preserved.
Primary uid/gid of container shouldn't be used.
Signed-off-by: Matej Vasek <mvasek@redhat.com>
We should not be exposing the store outside of Libpod. We want to
encapsulate it as an internal implementation detail - there's no
reason functions outside of Libpod should directly be
manipulating container storage. Convert the last use to invoke a
method on Libpod instead, and remove the function.
[NO TESTS NEEDED] as this is just a refactor.
Signed-off-by: Matthew Heon <mheon@redhat.com>
The rootless cni namespace needs a valid /etc/resolv.conf file. On some
distros is a symlink to somewhere under /run. Because the kernel will
follow the symlink before mounting, it is not possible to mount a file
at exactly /etc/resolv.conf. We have to ensure that the link target will
be available in the rootless cni mount ns.
Fixes#10855
Also fixed a bug in the /var/lib/cni directory lookup logic. It used
`filepath.Base` instead of `filepath.Dir` and thus looping infinitely.
Fixes#10857
[NO TESTS NEEDED]
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Add a new service reaper package. Podman currently does not reap all
child processes. The slirp4netns and rootlesskit processes are not
reaped. The is not a problem for local podman since the podman process
dies before the other processes and then init will reap them for us.
However with podman system service it is possible that the podman
process is still alive after slirp died. In this case podman has to reap
it or the slirp process will be a zombie until the service is stopped.
The service reaper will listen in an extra goroutine on SIGCHLD. Once it
receives this signal it will try to reap all pids that were added with
`AddPID()`. While I would like to just reap all children this is not
possible because many parts of the code use `os/exec` with `cmd.Wait()`.
If we reap before `cmd.Wait()` things can break, so reaping everything
is not an option.
[NO TESTS NEEDED]
Fixes#9777
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
First, make podman diff accept optionally a second argument. This allows
the user to specify a second image/container to compare the first with.
If it is not set the parent layer will be used as before.
Second, podman container diff should only use containers and podman
image diff should only use images. Previously, podman container diff
would use the image when both an image and container with this name
exists.
To make this work two new parameters have been added to the api. If they
are not used the previous behaviour is used. The same applies to the
bindings.
Fixes#10649
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Pull the trigger on the `pkg/registries` package which acted as a proxy
for `c/image/pkg/sysregistriesv2`. Callers should be using the packages
from c/image directly, if needed at all.
Also make use of libimage's SystemContext() method which returns a copy
of a system context, further reducing the risk of unintentionally
altering global data.
[NO TESTS NEEDED]
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
Create the /etc and /etc/mtab directories with the
correct ownership based on what the UID and GID is
for the container. This was causing issue when starting
the infra container with userns as the /etc directory
wasn't being created with the correct ownership.
Signed-off-by: Urvashi Mohnani <umohnani@redhat.com>
Added logic and handling for two new Podman pod create Flags.
--cpus specifies the total number of cores on which the pod can execute, this
is a combination of the period and quota for the CPU.
--cpuset-cpus is a string value which determines of these available cores,
how many we will truly execute on.
Signed-off-by: cdoern <cbdoer23@g.holycross.edu>
added Avg Cpu calculation and CPU up time to podman stats. Adding different feature sets in different PRs, CPU first.
resolves#9258
Signed-off-by: cdoern <cbdoer23@g.holycross.edu>
`syncContainer()` requires the container to be locked, otherwise we can
end up with undefined behavior.
[NO TESTS NEEDED]
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Podman does not need to watch the cni config directory. If a network is
not found in the cache, OCICNI will reload the networks anyway and thus
even podman system service should work as expected.
Also include a change to not mount a "new" /var by default in the
rootless cni ns, instead try to use /var/lib/cni first and then the
parent dir. This allows users to store cni configs under /var/... which
is the case for the CI compose test.
[NO TESTS NEEDED]
Fixes#10686
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Commit 84b55eec27 attempted to fix a race waiting for the container
died event. Previously, Podman slept for duration of the polling
frequence which I considerred to be a mistake. As it turns out, I was
mistaken since the file logger will, in fact, NOT read until EOF and
then stop logging but stop logging immediately _after_ it woke up.
[NO TESTS NEEDED] as the race condition cannot be hit reliably.
Fixes: #10675
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
Fix the suprious "Error: nil" messages. Also add some more context to
logged error messages which makes error sources more obvious.
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
Previously podman failed when run in an environment where 127.0.0.53 is
the only nameserver but systemd-resolved is not used directly.
In practice this happened when podman was run within an alpine container
that used the host's network and the host was running systemd-resolved.
This fix makes podman ignore a file not found error when reading /run/systemd/resolve/resolv.conf.
Closes#10733
[NO TESTS NEEDED]
Signed-off-by: Max Goltzsche <max.goltzsche@gmail.com>
Users are complaining about read/only /var/tmp failing
even if TMPDIR=/tmp is set.
This PR Fixes: https://github.com/containers/podman/issues/10698
[NO TESTS NEEDED] No way to test this.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
When starting a process with `podman exec -it` the terminal is resized
after the process is started. To fix this allow exec start to accept the
terminal height and width as parameter and let it resize right before
the process is started.
Fixes#10560
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
The container name should have the slirp interface ip set in /etc/hosts
and not the gateway ip. Commit c8dfcce6db introduced this regression.
Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1972073
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Network connect/disconnect has to call the cni plugins when the network
namespace is already configured. This is the case for `ContainerStateRunning`
and `ContainerStateCreated`. This is important otherwise the network is
not attached to this network namespace and libpod will throw errors like
`network inspection mismatch...` This problem happened when using
`docker-compose up` in attached mode.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Podman uses the volume option map to check if it has to mount the volume
or not when the container is started. Commit 28138dafcc added to uid
and gid options to this map, however when only uid/gid is set we cannot
mount this volume because there is no filesystem or device specified.
Make sure we do not try to mount the volume when only the uid/gid option
is set since this is a simple chown operation.
Also when a uid/gid is explicity set, do not chown the volume based on
the container user when the volume is used for the first time.
Fixes#10620
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
When making Exec Cleanup processes mandatory, I introduced a race
wherein attached exec sessions could be cleaned up and removed by
the cleanup process before the frontend had a chance to get their
exit code. Fortunately, we've dealt with this issue before in
containers, and the same solution can be applied here. I added an
event for an exec session's process exiting, `exec_died` (Docker
has an identical event, so this actually improves our
compatibility there) that includes the exit code of the exec
session. If the race happens and the exec session no longer
exists when we go to remove it, pick up exit code from the event
and exit cleanly.
Signed-off-by: Matthew Heon <mheon@redhat.com>
We were previously only doing this for detached exec. I don't
know why we did that, but I don't see any reason not to extend it
to all exec sessions - it guarantees that we will always clean up
exec sessions, even if the original `podman exec` process died.
[NO TESTS NEEDED] because I don't really know how to test this
one.
Signed-off-by: Matthew Heon <mheon@redhat.com>
Unfortunately --pre-checkpointing never worked as intended and recent
changes to runc have shown that it is broken.
To create a pre-checkpoint CRIU expects the paths between the
pre-checkpoints to be a relative path. If having a previous checkpoint
it needs the be referenced like this: --prev-images-dir ../parent
Unfortunately Podman was giving runc (and CRIU) an absolute path.
Unfortunately, again, until March 2021 CRIU silently ignored if
the path was not relative and switch back to normal checkpointing.
This has been now fixed in CRIU and runc and running pre-checkpoint
with the latest runc fails, because runc already sees that the path is
absolute and returns an error.
This commit fixes this by giving runc a relative path.
This commit also fixes a second pre-checkpointing error which was just
recently introduced.
So summarizing: pre-checkpointing never worked correctly because CRIU
ignored wrong parameters and recent changes broke it even more.
Now both errors should be fixed.
[NO TESTS NEEDED]
Signed-off-by: Adrian Reber <areber@redhat.com>
Signed-off-by: Adrian Reber <adrian@lisas.de>
when looking up the container cgroup, ignore named hierarchies since
containers running systemd as payload will create a sub-cgroup and
move themselves there.
Closes: https://github.com/containers/podman/issues/10602
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Checkpointed containers started with --privileged fail during restore
with:
Error: error creating container storage: ProcessLabel and Mountlabel must either not be specified or both specified
This commit fixes it by not setting the labels when restoring a
privileged container.
[NO TESTS NEEDED]
Signed-off-by: Adrian Reber <areber@redhat.com>
When 127.0.0.53 is the only nameserver in /etc/resolv.conf assume
systemd-resolved is used. This is better because /etc/resolv.conf does
not have to be symlinked to /run/systemd/resolve/stub-resolv.conf in
order to use systemd-resolved.
[NO TESTS NEEDED]
Fixes: #10570
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Fix a race in the k8s-file logs driver. When "following" the logs,
Podman will print the container's logs until the end. Previously,
Podman logged until the state transitioned into something non-running
which opened up a race with the container still running, possibly in
the "stopping" state.
To fix the race, log until we've seen the wait event for the specific
container. In that case, conmon will have finished writing all logs to
the file, and Podman will read it until EOF.
Further tweak the integration tests for testing `logs -f` on a running
container. Previously, the test only checked for one of two lines
stating that there was a race. Indeed the race was in using `run --rm`
where a log file may be removed before we could fully read it.
Fixes: #10596
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
The checkpoint archive compression was hardcoded to `archive.Gzip`.
There have been requests to make the used compression algorithm
selectable. There was especially the request to not compress the
checkpoint archive to be able to create faster checkpoints when not
compressing it.
This also changes the default from `gzip` to `zstd`. This change should
not break anything as the restore code path automatically handles
whatever compression the user provides during restore.
Signed-off-by: Adrian Reber <areber@redhat.com>