filter option accepts two filters.
- label
- until
label supports "label=value" or "label=key=value" format
until supports all golang compatible time/duration formats.
Signed-off-by: Kunal Kushwaha <kunal.kushwaha@gmail.com>
If the container is running and we need to get its netns and
can't, that is a serious bug deserving of errors.
If it's not running, that's not really a big deal. Log an error
and continue.
Signed-off-by: Matthew Heon <mheon@redhat.com>
When Libpod removes a container, there is the possibility that
removal will not fully succeed. The most notable problems are
storage issues, where the container cannot be removed from
c/storage.
When this occurs, we were faced with a choice. We can keep the
container in the state, appearing in `podman ps` and available for
other API operations, but likely unable to do any of them as it's
been partially removed. Or we can remove it very early and clean
up after it's already gone. We have, until now, used the second
approach.
The problem that arises is intermittent problems removing
storage. We end up removing a container, failing to remove its
storage, and ending up with a container permanently stuck in
c/storage that we can't remove with the normal Podman CLI, can't
use the name of, and generally can't interact with. A notable
cause is when Podman is hit by a SIGKILL midway through removal,
which can consistently cause `podman rm` to fail to remove
storage.
We now add a new state for containers that are in the process of
being removed, ContainerStateRemoving. We set this at the
beginning of the removal process. It notifies Podman that the
container cannot be used anymore, but preserves it in the DB
until it is fully removed. This will allow Remove to be run on
these containers again, which should successfully remove storage
if it fails.
Fixes#3906
Signed-off-by: Matthew Heon <mheon@redhat.com>
When restoring a container with user namespace, the user namespace is
created by the OCI runtime, and the network namespace is created after
the user namespace to ensure correct ownership.
In this case PostConfigureNetNS will be set and the value of
c.state.NetNS would be nil. Hence, the following error occurs:
$ sudo podman run --name cr \
--uidmap 0:1000:500 \
-d docker.io/library/alpine \
/bin/sh -c 'i=0; while true; do echo $i; i=$(expr $i + 1); sleep 1; done'
$ sudo podman container checkpoint cr
$ sudo podman container restore cr
...
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x30 pc=0x13a5e3c]
Signed-off-by: Radostin Stoyanov <rstoyanov1@gmail.com>
In conmon 2.0.3, we add another fifo to handle window resizing. This needs to be cleaned up for commands like restore, where the same path is used.
Signed-off-by: Peter Hunt <pehunt@redhat.com>
Rewrite the backend for displaying the history of an image to simplify
the code and be closer to docker's behaviour. Instead of driving
index-based heuristics, create a reverse mapping from top-layers to the
corresponding image IDs and lookup the layers on-demand. Also use the
uncompressed layer size to be closer to Docker's behaviour.
Note that intermediate images from local builds are not considered for
the ID lookups anymore.
Fixes: #3359
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
When running on a node with Cgroups v2, default to using `crun` instead
of `runc`. Note that this only impacts the hard-coded default config.
No user config will be over-written.
Fixes: #4463
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
Currently podman generate kube does not generate the correct RunAsUser and RunAsGroup
options in the yaml file. This patch fixes this.
This patch also make `podman play kube` use the RunAdUser and RunAsGroup options if
they are specified in the yaml file.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
`go get github.com/cri-o/ocicni@deac903fd99b6c52d781c9f42b8db3af7dcfd00a`
I had to fix compilation errors in libpod/networking_linux.go
---
ocicni.Networks has changed from string to the structure NetAttachment
with the member Name (the former string value) and the member Ifname
(optional).
I don't think we can make use of Ifname here, so I just map the array of
structures to array of strings - e.g. dropping Ifname.
---
The function GetPodNetworkStatus no longer returns Result but it returns
the wrapper structure NetResult which contains the former Result plus
NetAttachment (Network name and Interface name).
Again, I don't think we can make use of that information here, so I
just added `.Result` to fix the build.
---
Issue: #1136
Signed-off-by: Jakub Filak <jakub.filak@sap.com>
If user specifies --detach-keys="", this will disable the feature.
Adding define.DefaultDetachKeys to help screen to help identify detach keys.
Updated man pages with additonal information.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
When pulling an unqualified reference (e.g., `fedora`) make sure that
the reference is not using a non-docker transport to avoid iterating
over the search registries and trying to pull from them.
Fixes: #4434
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
If the kube.yaml specifieds the SELinux type or Level, we need the container
to be launched with the correct label.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
the pidWaitTimeout is already a Duration so do not multiply it again
by time.Millisecond.
Closes: https://github.com/containers/libpod/issues/4344
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Pull in changes to pkg/secrets/secrets.go that adds the
logic to disable fips mode if a pod/container has a
label set.
Signed-off-by: Urvashi Mohnani <umohnani@redhat.com>
change the default to -1, so that we can change the semantic of
"--tail 0" to not print any existing log line.
Closes: https://github.com/containers/libpod/issues/4396
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Refactor the `RuntimeConfig` along with related code from libpod into
libpod/config. Note that this is a first step of consolidating code
into more coherent packages to make the code more maintainable and less
prone to regressions on the long runs.
Some libpod definitions were moved to `libpod/define` to resolve
circular dependencies.
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
There were many situations that made exec act funky with input. pipes didn't work as expected, as well as sending input before the shell opened.
Thinking about it, it seemed as though the issues were because of how os.Stdin buffers (it doesn't). Dropping this input had some weird consequences.
Instead, read from os.Stdin as bufio.Reader, allowing the input to buffer before passing it to the container.
Signed-off-by: Peter Hunt <pehunt@redhat.com>
command.Start() just starts the command. That catches some
errors, but the nasty ones - bad options and similar - happen
when the command runs. Use CombinedOutput() instead - it waits
for the command to exit, and thus catches non-0 exit of the
`mount` command (invalid options, for example).
STDERR from the `mount` command is directly used, which isn't
necessarily the best, but we can't really get much more info on
what went wrong.
Fixes#4303
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
always create a new cgroup for conmon also when running as rootless.
We were previously creating one only when necessary, but that behaves
differently than root containers.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Currently podman play kube is not using the system default seccomp.json file.
This PR will use the default or override location for podman play.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
Generate an image's RepoDigests list using all applicable digests, and
refrain from outputting a digest in the tag column of the "images"
output.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
Be prepared to report multiple image digests for images which contain
multiple manifests but, because they continue to have the same set of
layers and the same configuration, are considered to be the same image.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
Add --override-arch and --override-os as hidden flags, in line with the
global flag names that skopeo uses, so that we can test behavior around
manifest lists without having to conditionalize more of it by arch.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
When an image can be opened as an ImageSource but not an Image, handle
the case where it's an image list all by itself, the case where it's an
image for a different architecture/OS combination, or the case where
it's both.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
Move to containers/image v5 and containers/buildah to v1.11.4.
Replace an equality check with a type assertion when checking for a
docker.ErrUnauthorizedForCredentials in `podman login`.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
read the slirp4netns stderr and propagate it in the error when the
process fails.
Replace: https://github.com/containers/libpod/pull/4338
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
We have a lot of checks for container state scattered throughout
libpod. Many of these need to ensure the container is in one of a
given set of states so an operation may safely proceed.
Previously there was no set way of doing this, so we'd use unique
boolean logic for each one. Introduce a helper to standardize
state checks.
Note that this is only intended to replace checks for multiple
states. A simple check for one state (ContainerStateRunning, for
example) should remain a straight equality, and not use this new
helper.
Signed-off-by: Matthew Heon <mheon@redhat.com>
when running in systemd mode on cgroups v1, make sure the
/sys/fs/cgroup/systemd/release_agent is masked otherwise the container
is able to modify it and execute scripts on the host.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
When you try and create a new volume with the name of a volume
that already exists, you presently get a thoroughly unhelpful
error from `mkdir` as the volume attempts to create the
directory it will be mounted at. An EEXIST out of mkdir is not
particularly helpful to Podman users - it doesn't explain that
the name is already taken by another volume.
The solution here is potentially racy as the runtime is not
locked, so someone else could take the name while we're still
getting things set up, but that's a narrow timing window, and we
will still return an error - just an error that's not as good as
this one.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
if the cgroup manager is set to systemd, detect if dbus is available,
otherwise fallback to --cgroup-manager=cgroupfs.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Everything else is a flag to mount, but "uid" and "gid" are not.
We need to parse them out of "o" and handle them separately.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
make sure the user overrides are stored in the configuration file when
first created.
Closes: https://github.com/containers/libpod/issues/2659
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Previously, when `podman run` encountered a volume mount without
separate source and destination (e.g. `-v /run`) we would assume
that both were the same - a bind mount of `/run` on the host to
`/run` in the container. However, this does not match Docker's
behavior - in Docker, this makes an anonymous named volume that
will be mounted at `/run`.
We already have (more limited) support for these anonymous
volumes in the form of image volumes. Extend this support to
allow it to be used with user-created volumes coming in from the
`-v` flag.
This change also affects how named volumes created by the
container but given names are treated by `podman run --rm` and
`podman rm -v`. Previously, they would be removed with the
container in these cases, but this did not match Docker's
behaviour. Docker only removed anonymous volumes. With this patch
we move to that model as well; `podman run -v testvol:/test` will
not have `testvol` survive the container being removed by `podman
rm -v`.
The sum total of these changes let us turn on volume removal in
`--rm` by default.
Fixes: #4276
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
When a container is created with a given OCI runtime, but then it
is uninstalled or removed from the configuration file, Libpod
presently reacts very poorly. The EvictContainer code can
potentially remove these containers, but we still can't see them
in `podman ps` (aside from the massive logrus.Errorf messages
they create).
Providing a minimal OCI runtime implementation for missing
runtimes allows us to behave better. We'll be able to retrieve
containers from the database, though we still pop up an error for
each missing runtime. For containers which are stopped, we can
remove them as normal.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
network statistics cannot be collected for rootless network devices with
the current implementation. for now, we return nil so that stats will
at least for users.
Fixes:#4268
Signed-off-by: baude <bbaude@redhat.com>
The json field is called `Image` while the go field is called `ImageID`,
tricking users into filtering for `Image` which ultimately results in an
error. Hence, rename the field to `Image` to align json and go.
To prevent podman users from regressing, rename `Image` to `ImageID` in
the specified filters. Add tests to prevent us from regressing. Note
that consumers of the go API that are using `ImageID` are regressing;
ultimately we consider it to be a bug fix.
Fixes: #4193
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
Also, ensure that we don't try to mount them without root - it
appears that it can somehow not error and report that mount was
successful when it clearly did not succeed, which can induce this
case.
We reuse the `--force` flag to indicate that a volume should be
removed even after unmount errors. It seems fairly natural to
expect that --force will remove a volume that is otherwise
presenting problems.
Finally, ignore EINVAL on unmount - if the mount point no longer
exists our job is done.
Fixes: #4247Fixes: #4248
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
In some cases, conmon can fail without writing logs. Change the wording
of the error message from
"error reading container (probably exited) json message"
to
"container create failed (no logs from conmon)"
to have a more helpful error message that is more consistent with other
errors at that stage of execution.
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
Previously, `podman checkport restore` with exported containers,
when told to create a new container based on the exported
checkpoint, would create a new container, with a new container
ID, but not reset CGroup path - which contained the ID of the
original container.
If this was done multiple times, the result was two containers
with the same cgroup paths. Operations on these containers would
this have a chance of crossing over to affect the other one; the
most notable was `podman rm` once it was changed to use the --all
flag when stopping the container; all processes in the cgroup,
including the ones in the other container, would be stopped.
Reset cgroups on restore to ensure that the path matches the ID
of the container actually being run.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
This is a horrible hack to work around issues with Fedora 31, but
other distros might need it to, so we'll move it upstream.
I do not recommend this functionality for general use, and the
manpages and other documentation will reflect this. But for some
upgrade cases, it will be the only thing that allows for a
working system.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
For future work, we need multiple implementations of the OCI
runtime, not just a Conmon-wrapped runtime matching the runc CLI.
As part of this, do some refactoring on the interface for exec
(move to a struct, not a massive list of arguments). Also, add
'all' support to Kill and Stop (supported by runc and used a bit
internally for removing containers).
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
when runc returns an error about not being v2 complient, catch the error
and logrus an actionable message for users.
Signed-off-by: baude <bbaude@redhat.com>
This requires updating all import paths throughout, and a matching
buildah update to interoperate.
I can't figure out the reason for go.mod tracking
github.com/containers/image v3.0.2+incompatible // indirect
((go mod graph) lists it as a direct dependency of libpod, but
(go list -json -m all) lists it as an indirect dependency),
but at least looking at the vendor subdirectory, it doesn't seem
to be actually used in the built binaries.
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
This ensures that containers that didn't require an evict will be
dealt with normally, and we only break out evict for containers
that refuse to be removed by normal means.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
fixes a segfault when slirp4netns is not installed and the slirp sync
pipe is not created.
Closes: https://github.com/containers/libpod/issues/4113
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
A true result from reexec.Init() isn't an error, but it indicates that
main() should exit with a success exit status.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
Add ability to evict a container when it becomes unusable. This may
happen when the host setup changes after a container creation, making it
impossible for that container to be used or removed.
Evicting a container is done using the `rm --force` command.
Signed-off-by: Marco Vedovati <mvedovati@suse.com>
CNI expects that a DELETE be run before re-creating container
networks. If a reboot occurs quickly enough that containers can't
stop and clean up, that DELETE never happens, and Podman
currently wipes the old network info and thinks the state has
been entirely cleared. Unfortunately, that may not be the case on
the CNI side. Some things - like IP address reservations - may
not have been cleared.
To solve this, manually re-run CNI Delete on refresh. If the
container has already been deleted this seems harmless. If not,
it should clear lingering state.
Fixes: #3759
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
In order to run Podman with VM-based runtimes unprivileged, the
network must be set up prior to the container creation. Therefore
this commit modifies Podman to run rootless containers by:
1. create a network namespace
2. pass the netns persistent mount path to the slirp4netns
to create the tap inferface
3. pass the netns path to the OCI spec, so the runtime can
enter the netns
Closes#2897
Signed-off-by: Gabi Beyer <gabrielle.n.beyer@intel.com>
If a user upgrades to a machine that defaults to a cgroups V2 machine
and has a libpod.conf file in their homedir that defaults to OCI Runtime runc,
then we want to change it one time to crun.
runc as of this point does not work on cgroupV2 systems. This patch will
eventually be removed but is needed until runc has support.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
If the HOME environment variable is not set, make sure it is set to
the configuration found in the container /etc/passwd file.
It was previously depending on a runc behavior that always set HOME
when it is not set. The OCI runtime specifications do not require
HOME to be set so move the logic to libpod.
Closes: https://github.com/debarshiray/toolbox/issues/266
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
We've been seeing a lot of issues (ref: #4061, but there are
others) where Podman hiccups on trying to start a container,
because some temporary files have been retained and Conmon will
not overwrite them.
If we're calling start() we can safely assume that we really want
those files gone so the container starts without error, so invoke
the cleanup routine. It's relatively cheap (four file removes) so
it shouldn't hurt us that much.
Also contains a small simplification to the removeConmonFiles
logic - we don't need to stat-then-remove when ignoring ENOENT is
fine.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
There were two problems with preserve fds.
libpod didn't open the fds before passing _OCI*PIPE to conmon. This caused libpod to talk on the preserved fds, rather than the pipes, with conmon talking on the pipes. This caused a hang.
Libpod also didn't convert an int to string correctly, so it would further fail.
Fix these and add a unit test to make sure we don't regress in the future
Note: this test will not pass on crun until crun supports --preserve-fds
Signed-off-by: Peter Hunt <pehunt@redhat.com>
if slirp4netns supports sandboxing, enable it.
It automatically creates a new mount namespace where slirp4netns will
run and have limited access to the host resources.
It needs slirp4netns 0.4.1.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
We should not be throwing errors because the operation we wanted
to perform is already done. Now, it is definitely strange that a
container is actually unmounted, but shows as mounted in the DB -
if this reoccurs in a way where we can investigate, it's worth
tearing into.
Fixes#4033
Signed-off-by: Matthew Heon <mheon@redhat.com>
If you are running a rootless container on cgroupV1
you can not pause the container. We need to report the proper error
if this happens.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
This change matches what is happening on the podman local side
and should eliminate a race condition.
Also exit commands on the server side should start to return to client.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
We have leaked the exit number codess all over the code, this patch
removes the numbers to constants.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
if we register the resize func too early, it attempts to read from the 'ctl' file before it exists. this causes the func to error, and the resize to not go through.
Fix this by registering resize func later for conmon. This, along with a conmon fix, will allow exec to know the terminal size at startup
Signed-off-by: Peter Hunt <pehunt@redhat.com>
when executing a healthcheck, we were not cleaning up after exec's use
of a socket. we now remove the socket file and ignore if for reason it
does not exist.
Fixes: #3962
Signed-off-by: baude <bbaude@redhat.com>
When --cgroupns=private is used we need to mount a new cgroup file
system so that it points to the correct namespace.
Needs: https://github.com/containers/crun/pull/88
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
when running in rootless mode and using systemd as cgroup manager
create automatically a systemd scope when the user doesn't own the
current cgroup.
This solves a couple of issues:
on cgroup v2 it is necessary that a process before it can moved to a
different cgroup tree must be in a directory owned by the unprivileged
user. This is not always true, e.g. when creating a session with su
-l.
Closes: https://github.com/containers/libpod/issues/3937
Also, for running systemd in a container it was before necessary to
specify "systemd-run --scope --user podman ...", now this is done
automatically as part of this PR.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
This will be used when we allow 'podman ps' to display info on
storage containers instead of Libpod containers.
Signed-off-by: Matthew Heon <mheon@redhat.com>
Lookup was written before volume states merged, but merged after,
and CI didn't catch the obvious failure here. Without a valid
state, we try to unmarshall into a null pointer, and 'volume rm'
is completely broken because of it.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
Podman is not the only user of containers/storage, and as such we
cannot rely on our database as the sole source of truth when
pruning images. If images do not show as in use from Podman's
perspective, but subsequently fail to remove because they are
being used by a container, they're probably being used by Buildah
or another c/storage client.
Since the images in question are in use, we shouldn't error on
failure to prune them - we weren't supposed to prune them in the
first place.
Fixes: #3983
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
This is mostly used with Systemd, which really wants to manage
CGroups itself when managing containers via unit file.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
This change adds the following annotation to every container created by
podman:
```json
"Annotations": {
"io.containers.manager": "libpod"
}
```
Target of this annotaions is to indicate which project in the containers
ecosystem is the major manager of a container when applications share
the same storage paths. This way projects can decide if they want to
manipulate the container or not. For example, since CRI-O and podman are
not using the same container library (libpod), CRI-O can skip podman
containers and provide the end user more useful information.
A corresponding end-to-end test has been adapted as well.
Relates to: https://github.com/cri-o/cri-o/pull/2761
Signed-off-by: Sascha Grunert <sgrunert@suse.com>
Previously, we only did this for volumes created at the same time
as the container. However, this is not correct behavior - Docker
does so for all named volumes, even those made with
'podman volume create' and mounted into a container later.
Fixes#3945
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
This isn't included in Docker, but seems handy enough.
Use the new API for 'volume rm' and 'volume inspect'.
Fixes#3891
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
We want to get podman info to tell us about the version of
the mount program to help us diagnose issues users are having.
Also if in rootless mode and slirp4netns is installed reveal package
info on slirp4netns.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
If c/storage paths are explicitly set to "" (the empty string) it
will use compiled-in defaults. However, it won't tell us this via
`storage.GetDefaultStoreOptions()` - we just get the empty string
(which can put our defaults, some of which are relative to
c/storage, in a bad spot).
Hardcode a sane default for cases like this. Furthermore, add
some sanity checks to paths, to ensure we don't use relative
paths for core parts of libpod.
Fixes#3952
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
When we fail to remove a container's SHM, that's an error, and we
need to report it as such. This may be part of our lingering
storage woes.
Also, remove MNT_DETACH. It may be another cause of the storage
removal failures.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
When volume options and the local volume driver are specified,
the volume is intended to be mounted using the 'mount' command.
Supported options will be used to volume the volume before the
first container using it starts, and unmount the volume after the
last container using it dies.
This should work for any local filesystem, though at present I've
only tested with tmpfs and btrfs.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
We need to be able to track the number of times a volume has been
mounted for tmpfs/nfs/etc volumes. As such, we need a mutable
state for volumes. Add one, with the expected update/save methods
in both states.
There is backwards compat here, in that older volumes without a
state will still be accepted.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>