Currently podman generate kube does not generate the correct RunAsUser and RunAsGroup
options in the yaml file. This patch fixes this.
This patch also make `podman play kube` use the RunAdUser and RunAsGroup options if
they are specified in the yaml file.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
If user specifies --detach-keys="", this will disable the feature.
Adding define.DefaultDetachKeys to help screen to help identify detach keys.
Updated man pages with additonal information.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
change the default on cgroups v2 and create a new cgroup namespace.
When a cgroup namespace is used, processes inside the namespace are
only able to see cgroup paths relative to the cgroup namespace root
and not have full visibility on all the cgroups present on the
system.
The previous behaviour is maintained on a cgroups v1 host, where a
cgroup namespace is not created by default.
Closes: https://github.com/containers/libpod/issues/4363
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
If the kube.yaml specifieds the SELinux type or Level, we need the container
to be launched with the correct label.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
change the default to -1, so that we can change the semantic of
"--tail 0" to not print any existing log line.
Closes: https://github.com/containers/libpod/issues/4396
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Refactor the `RuntimeConfig` along with related code from libpod into
libpod/config. Note that this is a first step of consolidating code
into more coherent packages to make the code more maintainable and less
prone to regressions on the long runs.
Some libpod definitions were moved to `libpod/define` to resolve
circular dependencies.
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
There were many situations that made exec act funky with input. pipes didn't work as expected, as well as sending input before the shell opened.
Thinking about it, it seemed as though the issues were because of how os.Stdin buffers (it doesn't). Dropping this input had some weird consequences.
Instead, read from os.Stdin as bufio.Reader, allowing the input to buffer before passing it to the container.
Signed-off-by: Peter Hunt <pehunt@redhat.com>
When starting a container by using its name as a reference, we should
print the name instead of the ID. We regressed on this behaviour
with commit b4124485ae which made it into Podman v1.6.2.
Kudos to openSUSE testing for catching it. To prevent future
regressions, extend the e2e tests to check the printed container
name/ID.
Reported-by: @sysrich
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
Use the github.com/seccomp/containers-golang library instead of the
docker package. The docker package has changed and silently broke
on F31.
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
Currently podman play kube is not using the system default seccomp.json file.
This PR will use the default or override location for podman play.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
Be prepared to report multiple image digests for images which contain
multiple manifests but, because they continue to have the same set of
layers and the same configuration, are considered to be the same image.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
Move to containers/image v5 and containers/buildah to v1.11.4.
Replace an equality check with a type assertion when checking for a
docker.ErrUnauthorizedForCredentials in `podman login`.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
when users create a new network and the dnsname plugin can be found by
podman, we will enable container name resolution on the new network.
there is an option to opt *out* as well.
tests cannot be added until we solve the packaging portion of the
dnsname plugin.
Signed-off-by: baude <bbaude@redhat.com>
In event of a container removal that is no longer in database, log a
warning instead of an error, as there is not any problem continuing
execution.
Resolves#4314
Signed-off-by: Tyler Ramer <tyaramer@gmail.com>
This matches Docker more closely, but retains the more important
protections of nosuid/nodev.
Fixes#4318
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
if the cgroup manager is set to systemd, detect if dbus is available,
otherwise fallback to --cgroup-manager=cgroupfs.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Everything else is a flag to mount, but "uid" and "gid" are not.
We need to parse them out of "o" and handle them separately.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
We need to use the new Inspect() endpoint instead of trying to
JSON the actual volume structs. Currently, the output seems
completely nonsensical; it seems like we're JSONing the struct
for the Varlink connection itself? This should restore sanity and
match the format of remote and local inspect on volumes.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
make sure the user overrides are stored in the configuration file when
first created.
Closes: https://github.com/containers/libpod/issues/2659
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
when creating a new networking, we should check existing networks for
their bridge names and make sure the proposed new name is not part of
this. reported by QE.
Signed-off-by: baude <bbaude@redhat.com>
Previously, when `podman run` encountered a volume mount without
separate source and destination (e.g. `-v /run`) we would assume
that both were the same - a bind mount of `/run` on the host to
`/run` in the container. However, this does not match Docker's
behavior - in Docker, this makes an anonymous named volume that
will be mounted at `/run`.
We already have (more limited) support for these anonymous
volumes in the form of image volumes. Extend this support to
allow it to be used with user-created volumes coming in from the
`-v` flag.
This change also affects how named volumes created by the
container but given names are treated by `podman run --rm` and
`podman rm -v`. Previously, they would be removed with the
container in these cases, but this did not match Docker's
behaviour. Docker only removed anonymous volumes. With this patch
we move to that model as well; `podman run -v testvol:/test` will
not have `testvol` survive the container being removed by `podman
rm -v`.
The sum total of these changes let us turn on volume removal in
`--rm` by default.
Fixes: #4276
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
Check to see if the container's start config includes the interactive
flag when determining to attach or ignore stdin stream.
This is in line with behavior of Docker CLI and engine
Signed-off-by: Tyler Ramer <tyaramer@gmail.com>
For non-Podman users of Libpod, we don't want to force the exit
command to use ARGV[0], which probably does not support a cleanup
command.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
even if the system is using cgroups v2, rootless is not able to setup
limits when the cgroup-manager is not systemd.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
For future work, we need multiple implementations of the OCI
runtime, not just a Conmon-wrapped runtime matching the runc CLI.
As part of this, do some refactoring on the interface for exec
(move to a struct, not a massive list of arguments). Also, add
'all' support to Kill and Stop (supported by runc and used a bit
internally for removing containers).
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
"init" is a quite common name for the command executed in a container
image and Podman ends up using the systemd mode also when not
required.
Be stricter on enabling the systemd mode and not enable it
automatically when the basename is "init" but expect the full path
"/usr/sbin/init".
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
if the pause process cannot be joined, remove the pause.pid while
keeping a lock on it, and try to recreate it.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
CRI-O defaults to 1024 for the maximum pids in a container. Podman
should have a similar limit. Once we have a containers.conf, we can
set the limit in this file, and have it easily customizable.
Currently the documentation says that -1 sets pids-limit=max, but -1 fails.
This patch allows -1, but also indicates that 0 also sets the max pids limit.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
This requires updating all import paths throughout, and a matching
buildah update to interoperate.
I can't figure out the reason for go.mod tracking
github.com/containers/image v3.0.2+incompatible // indirect
((go mod graph) lists it as a direct dependency of libpod, but
(go list -json -m all) lists it as an indirect dependency),
but at least looking at the vendor subdirectory, it doesn't seem
to be actually used in the built binaries.
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
if there are no resources specified, make sure the OCI resources block
is empty so that the OCI runtime won't complain.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
if running rootless do not print a warning message when podman cannot
rejoin the initial network namespace.
The first network namespace is owned by root on the host, a rootless
user cannot re-join it once it moves to a new network namespace.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
currently, podman import change do not support syntax like
- KEY val
- KEY ["val"]
This adds support for both of these syntax along with KEY=val
Signed-off-by: Kunal Kushwaha <kunal.kushwaha@gmail.com>
If we don't do this, we print WARN level messages that we should
not be printing by default.
Up one WARN message to ERROR so it still shows up by default.
Fixes: #4115Fixes: #4012
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
Add ability to evict a container when it becomes unusable. This may
happen when the host setup changes after a container creation, making it
impossible for that container to be used or removed.
Evicting a container is done using the `rm --force` command.
Signed-off-by: Marco Vedovati <mvedovati@suse.com>
When a named volume is mounted on any of the tmpfs filesystems
created by read-only tmpfs, it caused a conflict that was not
resolved prior to this.
Fixes BZ1755119
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
when running a container remotely, we should only be sending stdin when
running with --interactive; otherwise use nil.
Fixes: #4095
Signed-off-by: baude <bbaude@redhat.com>
To 'avoid unknown FS magic on "/run/user/1000/netns/...": 1021994'
make the network namespace bind-mount recursively shared, so the
mount is back-propogated to the host.
Signed-off-by: gabi beyer <gabrielle.n.beyer@intel.com>
In order to run Podman with VM-based runtimes unprivileged, the
network must be set up prior to the container creation. Therefore
this commit modifies Podman to run rootless containers by:
1. create a network namespace
2. pass the netns persistent mount path to the slirp4netns
to create the tap inferface
3. pass the netns path to the OCI spec, so the runtime can
enter the netns
Closes#2897
Signed-off-by: Gabi Beyer <gabrielle.n.beyer@intel.com>
We identify and resolve conflicts in paths using destination path
matches. We require exact matches, largely for performance
reasons (we use maps to efficiently access, keyed by
destination). This usually works fine, until you get mounts that
are targetted at /output and /output/ - the same path, but not
the same string.
Use filepath.Clean() aggressively to try and solve this.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
when using the remote client, users may need to specify a non-standard
port for ssh connections. we can do so on the command line and within
the remote-client configuration file.
Fixes: #3987
Signed-off-by: baude <bbaude@redhat.com>
Currently if a user specifies a --mount option, their is no way to tell SELinux
to relabel the mount point.
This patch addes the relabel=shared and relabel=private options.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
If you are running a rootless container on cgroupV1
you can not pause the container. We need to report the proper error
if this happens.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
This change matches what is happening on the podman local side
and should eliminate a race condition.
Also exit commands on the server side should start to return to client.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
We have leaked the exit number codess all over the code, this patch
removes the numbers to constants.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
when removing a podman network, we need to make sure we delete the
network interface if one was ever created (by running a container).
also, when removing networks, we check if any containers are using the
network. if they are, we error out unless the user provides a 'force'
option which will remove the containers in question.
Signed-off-by: baude <bbaude@redhat.com>
when running in rootless mode and using systemd as cgroup manager
create automatically a systemd scope when the user doesn't own the
current cgroup.
This solves a couple of issues:
on cgroup v2 it is necessary that a process before it can moved to a
different cgroup tree must be in a directory owned by the unprivileged
user. This is not always true, e.g. when creating a session with su
-l.
Closes: https://github.com/containers/libpod/issues/3937
Also, for running systemd in a container it was before necessary to
specify "systemd-run --scope --user podman ...", now this is done
automatically as part of this PR.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
This is mostly used with Systemd, which really wants to manage
CGroups itself when managing containers via unit file.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
This change adds the following annotation to every container created by
podman:
```json
"Annotations": {
"io.containers.manager": "libpod"
}
```
Target of this annotaions is to indicate which project in the containers
ecosystem is the major manager of a container when applications share
the same storage paths. This way projects can decide if they want to
manipulate the container or not. For example, since CRI-O and podman are
not using the same container library (libpod), CRI-O can skip podman
containers and provide the end user more useful information.
A corresponding end-to-end test has been adapted as well.
Relates to: https://github.com/cri-o/cri-o/pull/2761
Signed-off-by: Sascha Grunert <sgrunert@suse.com>
This isn't included in Docker, but seems handy enough.
Use the new API for 'volume rm' and 'volume inspect'.
Fixes#3891
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
Do not require 0755 permissons for the ~/.config directory but require
at least 0700 which should be sufficient. The current implementation
internally creates this directory with 0755 if it does not exist, but if the
directory already exists with different perissions the current code returns
an empty string.
Signed-off-by: Christian Felder <c.felder@fz-juelich.de>
We want to get podman info to tell us about the version of
the mount program to help us diagnose issues users are having.
Also if in rootless mode and slirp4netns is installed reveal package
info on slirp4netns.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
When we fail to remove a container's SHM, that's an error, and we
need to report it as such. This may be part of our lingering
storage woes.
Also, remove MNT_DETACH. It may be another cause of the storage
removal failures.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
When volume options and the local volume driver are specified,
the volume is intended to be mounted using the 'mount' command.
Supported options will be used to volume the volume before the
first container using it starts, and unmount the volume after the
last container using it dies.
This should work for any local filesystem, though at present I've
only tested with tmpfs and btrfs.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
detect if the current user namespace doesn't match the configuration
in the /etc/subuid and /etc/subgid files.
If there is a mismatch, raise a warning and suggest the user to
recreate the user namespace with "system migrate", that also restarts
the containers.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
provide an implementation for getDevices that skip unreadable
directories for the current user.
Based on the implementation from runc/libcontainer.
Closes: https://github.com/containers/libpod/issues/3919
Signed-off-by: Giuseppe Scrivano <giuseppe@scrivano.org>
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
when running in rootless mode, --device creates a bind mount from the
host instead of specifying the device in the OCI configuration. This
is required as an unprivileged user cannot use mknod, even when root
in a user namespace.
Closes: https://github.com/containers/libpod/issues/3905
Signed-off-by: Giuseppe Scrivano <giuseppe@scrivano.org>
when using an upper case image name for container commit, we observed
panics due to a channel closing early.
Fixes: #3897
Signed-off-by: baude <bbaude@redhat.com>
For read-only containers set to create tmpfs filesystems over
/run and other common destinations, we were incorrectly setting
mount options, resulting in duplicate mount options.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
If I mount, say, /usr/bin into my container - I expect to be able
to run the executables in that mount. Unconditionally applying
noexec would be a bad idea.
Before my patches to change mount options and allow exec/dev/suid
being set explicitly, we inferred the mount options from where on
the base system the mount originated, and the options it had
there. Implement the same functionality for the new option
handling.
There's a lot of performance left on the table here, but I don't
know that this is ever going to take enough time to make it worth
optimizing.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
We already process the options on all tmpfs filesystems during
final addition of mounts to the spec. We don't need to do it
before that in parseVolumes.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
Previously, we explicitly set noexec/nosuid/nodev on every mount,
with no ability to disable them. The 'mount' command on Linux
will accept their inverses without complaint, though - 'noexec'
is counteracted by 'exec', 'nosuid' by 'suid', etc. Add support
for passing these options at the command line to disable our
explicit forcing of security options.
This also cleans up mount option handling significantly. We are
still parsing options in more than one place, which isn't good,
but option parsing for bind and tmpfs mounts has been unified.
Fixes: #3819Fixes: #3803
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
when performing an image build over a varlink connection, we should
clean up tmp files that are a result of sending the file to the host and
untarring it for the build.
Fixes: #3869
Signed-off-by: baude <bbaude@redhat.com>
Support generating systemd unit files for a pod. Podman generates one
unit file for the pod including the PID file for the infra container's
conmon process and one unit file for each container (excluding the infra
container).
Note that this change implies refactorings in the `pkg/systemdgen` API.
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
Add the digestfile option to the push command so the digest can
be stored away in a file when requested by the user. Also have added
a debug statement to show the completion of the push.
Emulates Buildah's https://github.com/containers/buildah/pull/1799/files
Signed-off-by: TomSweeneyRedHat <tsweeney@redhat.com>
Drop the support for remote clients to generate systemd-service files.
The generated files are machine-dependent and hence relate only to the
a local machine. Furthermore, a proper service management when using
a remote-client is not possible as systemd has no access to a process.
Dropping the support will also reduce the risk of making users believe
that the generated services are usable in a remote scenario.
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
drop the pkg/firewall module and start using the firewall CNI plugin.
It requires an updated package for CNI plugins.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
podman stats does not work in rootless environments with cgroups V1.
Fix error message and document this fact.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
adding podman network and the subcommands inspect, list, and rm. the
inspect subcommand displays the raw cni network configuration. the list
subcommand displays a summary of the cni networks ala ps. and the rm
subcommand removes a cni network.
Signed-off-by: baude <bbaude@redhat.com>
Docker has unlimited tmpfs size where Podman had it set to 64mb. Should be standard between the two.
Remove noexec default
Signed-off-by: Ashley Cui <ashleycui16@gmail.com>
Even explicitly defined hooks directories may not exist under
some circumstances. It's not worth a hard-fail if we hit an
ENOENT in these cases.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
obtaining containerstats requires the use of cgroups. at present,
rootless users do not have privileges to create cgroups. add an error
message that catches this for the varlink endpoint and return a proper
error.
Fixes: #3749
Signed-off-by: baude <bbaude@redhat.com>
Requirement from https://github.com/containers/libpod/issues/3575#issuecomment-512238393
Added --pull for podman create and pull to match the newly added flag in docker CLI.
`missing`: default value, podman will pull the image if it does not exist in the local.
`always`: podman will always pull the image.
`never`: podman will never pull the image.
Signed-off-by: Qi Wang <qiwan@redhat.com>
rework an error path so that users can run the windows remote client.
also, create the basedir path for the podman-remote.conf file if it does
not exist already.
Signed-off-by: baude <bbaude@redhat.com>
If we call Container(), we expect the namespace to be prefixed with "container:".
Add this check, and refactor to use named const strings instead of string literals
Signed-off-by: Peter Hunt <pehunt@redhat.com>
The 'podman run --mount' flag previously allowed the 'ro' option
to be specified, but was missing the ability to set it to a bool
(as is allowed by docker). Add that. While we're at it, allow
setting 'rw' explicitly as well.
Fixes#2980
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
Sharing a UTS namespace means sharing the hostname. Fix situations where a container in a pod didn't properly share the hostname of the pod.
Signed-off-by: Peter Hunt <pehunt@redhat.com>
Previously, we use CreateConfig's Command to populate container
Command (which is used as CMD for Inspect and Commit).
Unfortunately, CreateConfig's Command is the container's full
command, including a prepend of Entrypoint - so we duplicate
Entrypoint for images that include it.
Maintain a separate UserCommand in CreateConfig that does not
include the entrypoint, and use that instead.
Fixes#3708
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
Drop errors to debug when trying to setup the runtimetmpdir. If the tool
can not setup a runtime dir, it will error out with a correct message
no need to put errors on the screen, when the tool actually succeeds.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
when using build, require a "more" connection to get logs.
when pulling a non-existent image, do not crash varlink connection.
Fixes: #3714Fixes: #3715
Signed-off-by: baude <bbaude@redhat.com>
Begin to separate the internal structures and frontend for
inspect on volumes. We can't rely on keeping internal data
structures for external presentation - separating presentation
and internal data format is good practice.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>