If you are running a rootless container on cgroupV1
you can not pause the container. We need to report the proper error
if this happens.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
This change matches what is happening on the podman local side
and should eliminate a race condition.
Also exit commands on the server side should start to return to client.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
We have leaked the exit number codess all over the code, this patch
removes the numbers to constants.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
when removing a podman network, we need to make sure we delete the
network interface if one was ever created (by running a container).
also, when removing networks, we check if any containers are using the
network. if they are, we error out unless the user provides a 'force'
option which will remove the containers in question.
Signed-off-by: baude <bbaude@redhat.com>
when running in rootless mode and using systemd as cgroup manager
create automatically a systemd scope when the user doesn't own the
current cgroup.
This solves a couple of issues:
on cgroup v2 it is necessary that a process before it can moved to a
different cgroup tree must be in a directory owned by the unprivileged
user. This is not always true, e.g. when creating a session with su
-l.
Closes: https://github.com/containers/libpod/issues/3937
Also, for running systemd in a container it was before necessary to
specify "systemd-run --scope --user podman ...", now this is done
automatically as part of this PR.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
This is mostly used with Systemd, which really wants to manage
CGroups itself when managing containers via unit file.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
This change adds the following annotation to every container created by
podman:
```json
"Annotations": {
"io.containers.manager": "libpod"
}
```
Target of this annotaions is to indicate which project in the containers
ecosystem is the major manager of a container when applications share
the same storage paths. This way projects can decide if they want to
manipulate the container or not. For example, since CRI-O and podman are
not using the same container library (libpod), CRI-O can skip podman
containers and provide the end user more useful information.
A corresponding end-to-end test has been adapted as well.
Relates to: https://github.com/cri-o/cri-o/pull/2761
Signed-off-by: Sascha Grunert <sgrunert@suse.com>
This isn't included in Docker, but seems handy enough.
Use the new API for 'volume rm' and 'volume inspect'.
Fixes#3891
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
Do not require 0755 permissons for the ~/.config directory but require
at least 0700 which should be sufficient. The current implementation
internally creates this directory with 0755 if it does not exist, but if the
directory already exists with different perissions the current code returns
an empty string.
Signed-off-by: Christian Felder <c.felder@fz-juelich.de>
We want to get podman info to tell us about the version of
the mount program to help us diagnose issues users are having.
Also if in rootless mode and slirp4netns is installed reveal package
info on slirp4netns.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
When we fail to remove a container's SHM, that's an error, and we
need to report it as such. This may be part of our lingering
storage woes.
Also, remove MNT_DETACH. It may be another cause of the storage
removal failures.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
When volume options and the local volume driver are specified,
the volume is intended to be mounted using the 'mount' command.
Supported options will be used to volume the volume before the
first container using it starts, and unmount the volume after the
last container using it dies.
This should work for any local filesystem, though at present I've
only tested with tmpfs and btrfs.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
detect if the current user namespace doesn't match the configuration
in the /etc/subuid and /etc/subgid files.
If there is a mismatch, raise a warning and suggest the user to
recreate the user namespace with "system migrate", that also restarts
the containers.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
provide an implementation for getDevices that skip unreadable
directories for the current user.
Based on the implementation from runc/libcontainer.
Closes: https://github.com/containers/libpod/issues/3919
Signed-off-by: Giuseppe Scrivano <giuseppe@scrivano.org>
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
when running in rootless mode, --device creates a bind mount from the
host instead of specifying the device in the OCI configuration. This
is required as an unprivileged user cannot use mknod, even when root
in a user namespace.
Closes: https://github.com/containers/libpod/issues/3905
Signed-off-by: Giuseppe Scrivano <giuseppe@scrivano.org>
when using an upper case image name for container commit, we observed
panics due to a channel closing early.
Fixes: #3897
Signed-off-by: baude <bbaude@redhat.com>
For read-only containers set to create tmpfs filesystems over
/run and other common destinations, we were incorrectly setting
mount options, resulting in duplicate mount options.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
If I mount, say, /usr/bin into my container - I expect to be able
to run the executables in that mount. Unconditionally applying
noexec would be a bad idea.
Before my patches to change mount options and allow exec/dev/suid
being set explicitly, we inferred the mount options from where on
the base system the mount originated, and the options it had
there. Implement the same functionality for the new option
handling.
There's a lot of performance left on the table here, but I don't
know that this is ever going to take enough time to make it worth
optimizing.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
We already process the options on all tmpfs filesystems during
final addition of mounts to the spec. We don't need to do it
before that in parseVolumes.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
Previously, we explicitly set noexec/nosuid/nodev on every mount,
with no ability to disable them. The 'mount' command on Linux
will accept their inverses without complaint, though - 'noexec'
is counteracted by 'exec', 'nosuid' by 'suid', etc. Add support
for passing these options at the command line to disable our
explicit forcing of security options.
This also cleans up mount option handling significantly. We are
still parsing options in more than one place, which isn't good,
but option parsing for bind and tmpfs mounts has been unified.
Fixes: #3819Fixes: #3803
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
when performing an image build over a varlink connection, we should
clean up tmp files that are a result of sending the file to the host and
untarring it for the build.
Fixes: #3869
Signed-off-by: baude <bbaude@redhat.com>
Support generating systemd unit files for a pod. Podman generates one
unit file for the pod including the PID file for the infra container's
conmon process and one unit file for each container (excluding the infra
container).
Note that this change implies refactorings in the `pkg/systemdgen` API.
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
Add the digestfile option to the push command so the digest can
be stored away in a file when requested by the user. Also have added
a debug statement to show the completion of the push.
Emulates Buildah's https://github.com/containers/buildah/pull/1799/files
Signed-off-by: TomSweeneyRedHat <tsweeney@redhat.com>
Drop the support for remote clients to generate systemd-service files.
The generated files are machine-dependent and hence relate only to the
a local machine. Furthermore, a proper service management when using
a remote-client is not possible as systemd has no access to a process.
Dropping the support will also reduce the risk of making users believe
that the generated services are usable in a remote scenario.
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
drop the pkg/firewall module and start using the firewall CNI plugin.
It requires an updated package for CNI plugins.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
podman stats does not work in rootless environments with cgroups V1.
Fix error message and document this fact.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
adding podman network and the subcommands inspect, list, and rm. the
inspect subcommand displays the raw cni network configuration. the list
subcommand displays a summary of the cni networks ala ps. and the rm
subcommand removes a cni network.
Signed-off-by: baude <bbaude@redhat.com>
Docker has unlimited tmpfs size where Podman had it set to 64mb. Should be standard between the two.
Remove noexec default
Signed-off-by: Ashley Cui <ashleycui16@gmail.com>
Even explicitly defined hooks directories may not exist under
some circumstances. It's not worth a hard-fail if we hit an
ENOENT in these cases.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
obtaining containerstats requires the use of cgroups. at present,
rootless users do not have privileges to create cgroups. add an error
message that catches this for the varlink endpoint and return a proper
error.
Fixes: #3749
Signed-off-by: baude <bbaude@redhat.com>
Requirement from https://github.com/containers/libpod/issues/3575#issuecomment-512238393
Added --pull for podman create and pull to match the newly added flag in docker CLI.
`missing`: default value, podman will pull the image if it does not exist in the local.
`always`: podman will always pull the image.
`never`: podman will never pull the image.
Signed-off-by: Qi Wang <qiwan@redhat.com>
rework an error path so that users can run the windows remote client.
also, create the basedir path for the podman-remote.conf file if it does
not exist already.
Signed-off-by: baude <bbaude@redhat.com>
If we call Container(), we expect the namespace to be prefixed with "container:".
Add this check, and refactor to use named const strings instead of string literals
Signed-off-by: Peter Hunt <pehunt@redhat.com>
The 'podman run --mount' flag previously allowed the 'ro' option
to be specified, but was missing the ability to set it to a bool
(as is allowed by docker). Add that. While we're at it, allow
setting 'rw' explicitly as well.
Fixes#2980
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
Sharing a UTS namespace means sharing the hostname. Fix situations where a container in a pod didn't properly share the hostname of the pod.
Signed-off-by: Peter Hunt <pehunt@redhat.com>
Previously, we use CreateConfig's Command to populate container
Command (which is used as CMD for Inspect and Commit).
Unfortunately, CreateConfig's Command is the container's full
command, including a prepend of Entrypoint - so we duplicate
Entrypoint for images that include it.
Maintain a separate UserCommand in CreateConfig that does not
include the entrypoint, and use that instead.
Fixes#3708
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
Drop errors to debug when trying to setup the runtimetmpdir. If the tool
can not setup a runtime dir, it will error out with a correct message
no need to put errors on the screen, when the tool actually succeeds.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
when using build, require a "more" connection to get logs.
when pulling a non-existent image, do not crash varlink connection.
Fixes: #3714Fixes: #3715
Signed-off-by: baude <bbaude@redhat.com>
Begin to separate the internal structures and frontend for
inspect on volumes. We can't rely on keeping internal data
structures for external presentation - separating presentation
and internal data format is good practice.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
If a container is restored multiple times from an exported checkpoint
with the help of '--import --name', the restore will fail if during
'podman run' a static container IP was set with '--ip'. The user can
tell the restore process to ignore the static IP with
'--ignore-static-ip'.
Signed-off-by: Adrian Reber <areber@redhat.com>
This enables programs and scripts wrapping the podman command to handle
'podman rm' and 'podman rmi' failures caused by paused or running
containers or due to images having other child images or dependent
containers. These errors are common enough that it makes sense to have
a more machine readable way of detecting them than parsing the standard
error output.
Signed-off-by: Ondrej Zoder <ozoder@redhat.com>
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
As we previously removed our exit code retrieval code to stop a
memory leak, we need a new way of doing this. Fortunately, events
is able to do the job for us.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
the exit file
If the container exit code needs to be retained, it cannot be retained
in tmpfs, because libpod runs in a memcg itself so it can't leave
traces with a daemon-less design.
This wasn't a memleak detectable by kmemleak for example. The kernel
never lost track of the memory and there was no erroneous refcounting
either. The reference count dependencies however are not easy to track
because when a refcount is increased, there's no way to tell who's
still holding the reference. In this case it was a single page of
tmpfs pagecache holding a refcount that kept pinned a whole hierarchy
of dying memcg, slab kmem, cgropups, unrechable kernfs nodes and the
respective dentries and inodes. Such a problem wouldn't happen if the
exit file was stored in a regular filesystem because the pagecache
could be reclaimed in such case under memory pressure. The tmpfs page
can be swapped out, but that's not enough to release the memcg with
CONFIG_MEMCG_SWAP_ENABLED=y.
No amount of more aggressive kernel slab shrinking could have solved
this. Not even assigning slab kmem of dying cgroups to alive cgroup
would fully solve this. The only way to free the memory of a dying
cgroup when a struct page still references it, would be to loop over
all "struct page" in the kernel to find which one is associated with
the dying cgroup which is a O(N) operation (where N is the number of
pages and can reach billions). Linking all the tmpfs pages to the
memcg would cost less during memcg offlining, but it would waste lots
of memory and CPU globally. So this can't be optimized in the kernel.
A cronjob running this command can act as workaround and will allow
all slab cache to be released, not just the single tmpfs pages.
rm -f /run/libpod/exits/*
This patch solved the memleak with a reproducer, booting with
cgroup.memory=nokmem and with selinux disabled. The reason memcg kmem
and selinux were disabled for testing of this fix, is because kmem
greatly decreases the kernel effectiveness in reusing partial slab
objects. cgroup.memory=nokmem is strongly recommended at least for
workstation usage. selinux needs to be further analyzed because it
causes further slab allocations.
The upstream podman commit used for testing is
1fe2965e4f (v1.4.4).
The upstream kernel commit used for testing is
f16fea666898dbdd7812ce94068c76da3e3fcf1e (v5.2-rc6).
Reported-by: Michele Baldessari <michele@redhat.com>
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
<Applied with small tweaks to comments>
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
In order to run Podman with VM-based runtimes unprivileged, the
network must be set up prior to the container creation. Therefore
this commit modifies Podman to run rootless containers by:
1. create a network namespace
2. pass the netns persistent mount path to the slirp4netns
to create the tap inferface
3. pass the netns path to the OCI spec, so the runtime can
enter the netns
Closes#2897
Signed-off-by: Gabi Beyer <gabrielle.n.beyer@intel.com>
Touch up a number of formating issues for XDG_RUNTIME_DIRS in a number
of man pages. Make use of the XDG_CONFIG_HOME environment variable
in a rootless environment if available, or set it if not.
Also added a number of links to the Rootless Podman config page and
added the location of the auth.json files to that doc.
Signed-off-by: TomSweeneyRedHat <tsweeney@redhat.com>
...to work for specific edge cases with a simpler solution.
Re-reads hooks directories after any changes are detected by the watchers.
Added monitoring test for adding a different invalid hook to primary directory.
Some issues with prior code:
- ReadDir would stop when it encounters an invalid hook, rather than registering an error but continuing to read the valid hook.
- Wouldn’t account for Rename and Chmod events.
- After doing a mv of the hooks file instead of rm, it would still think the hooks file is in the directory, but it has been moved to another location.
- If a hook file was renamed, it would register the renamed file as a separate hook and not delete the original, so it would then execute the hook twice - once for the renamed file, and once for the original name which it did not delete.
Signed-off-by: samc24 <sam.chaturvedi24@gmail.com>
The variables here are 64-bit on 64-bit builds, so the linter
recommends stripping them. Unfortunately, they're 32-bit on
32-bit builds, so stripping them breaks that. Readd with a nolint
to convince it to not break.
Signed-off-by: Matthew Heon <mheon@redhat.com>
There's no way to get the error if we successfully get an exit code (as it's just printed to stderr instead).
instead of relying on the error to be passed to podman, and edit based on the error code, process it on the varlink side instead
Also move error codes to define package
Signed-off-by: Peter Hunt <pehunt@redhat.com>
including changing -l to the container id
and separating a case of setting the env that remote can't handle
Signed-off-by: Peter Hunt <pehunt@redhat.com>
clean up some final linter issues and add a make target for
golangci-lint. in addition, begin running the tests are part of the
gating tasks in cirrus ci.
we cannot fully shift over to the new linter until we fix the image on
the openshift side. for short term, we will use both
Signed-off-by: baude <bbaude@redhat.com>
This includes:
Implement exec -i and fix some typos in description of -i docs
pass failed runtime status to caller
Add resize handling for a terminal connection
Customize exec systemd-cgroup slice
fix healthcheck
fix top
add --detach-keys
Implement podman-remote exec (jhonce)
* Cleanup some orphaned code (jhonce)
adapt remote exec for conmon exec (pehunt)
Fix healthcheck and exec to match docs
Introduce two new OCIRuntime errors to more comprehensively describe situations in which the runtime can error
Use these different errors in branching for exit code in healthcheck and exec
Set conmon to use new api version
Signed-off-by: Jhon Honce <jhonce@redhat.com>
Signed-off-by: Peter Hunt <pehunt@redhat.com>
When removing --all images prune images only attempt to remove read/write images,
ignore read/only images
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
The default apparmor profile is not stored on disk which causes
confusion when debugging the content of the profile. To solve this, we
now add an additional API which returns the profile as byte slice.
Signed-off-by: Sascha Grunert <sgrunert@suse.com>
allow a container to run in a new cgroup namespace.
When running in a new cgroup namespace, the current cgroup appears to
be the root, so that there is no way for the container to access
cgroups outside of its own subtree.
By default it uses --cgroup=host to keep the previous behavior.
To create a new namespace, --cgroup=private must be provided.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
We can infer no-new-privileges. For now, manually populate
seccomp (can't infer what file we sourced from) and
SELinux/Apparmor (hard to tell if they're enabled or not).
Signed-off-by: Matthew Heon <mheon@redhat.com>
When we first began writing Podman, we ran into a major issue
when implementing Inspect. Libpod deliberately does not tie its
internal data structures to Docker, and stores most information
about containers encoded within the OCI spec. However, Podman
must present a CLI compatible with Docker, which means it must
expose all the information in 'docker inspect' - most of which is
not contained in the OCI spec or libpod's Config struct.
Our solution at the time was the create artifact. We JSON'd the
complete CreateConfig (a parsed form of the CLI arguments to
'podman run') and stored it with the container, restoring it when
we needed to run commands that required the extra info.
Over the past month, I've been looking more at Inspect, and
refactored large portions of it into Libpod - generating them
from what we know about the OCI config and libpod's (now much
expanded, versus previously) container configuration. This path
comes close to completing the process, moving the last part of
inspect into libpod and removing the need for the create
artifact.
This improves libpod's compatability with non-Podman containers.
We no longer require an arbitrarily-formatted JSON blob to be
present to run inspect.
Fixes: #3500
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
Before, play kube wasn't properly setting the command. Fix this
Also, begin a dedicated test suite for play kube to catch regressions like this in the future
Signed-off-by: Peter Hunt <pehunt@redhat.com>
do not automatically enable the controllers for the last path
component. It is necessary as once there are enabled controllers in a
cgroup, it won't possible to add processes to it.
Fix conmon being moved to the correct cgroup path when using
--cgroup-manager cgroupfs.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
when the container is running in a user namespace, check if gid=5 is
available, otherwise drop the option gid=5 for /dev/pts.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
be sure to load all the existing handlers, so that they can also be
freed in addition to the handlers we treat differently.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
The newly added functionality to include the container's root
file-system changes into the checkpoint archive can now be explicitly
disabled. Either during checkpoint or during restore.
If a container changes a lot of files during its runtime it might be
more effective to migrated the root file-system changes in some other
way and to not needlessly increase the size of the checkpoint archive.
If a checkpoint archive does not contain the root file-system changes
information it will automatically be skipped. If the root file-system
changes are part of the checkpoint archive it is also possible to tell
Podman to ignore these changes.
Signed-off-by: Adrian Reber <areber@redhat.com>
One of the last limitations when migrating a container using Podman's
'podman container checkpoint --export=/path/to/archive.tar.gz' was
that it was necessary to manually handle changes to the container's root
file-system. The recommendation was to mount everything as --tmpfs where
the root file-system was changed.
This extends the checkpoint export functionality to also include all
changes to the root file-system in the checkpoint archive. The
checkpoint archive now includes a tarstream of the result from 'podman
diff'. This tarstream will be applied to the restored container before
restoring the container.
With this any container can now be migrated, even it there are changes
to the root file-system.
There was some discussion before implementing this to base the root
file-system migration on 'podman commit', but it seemed wrong to do
a 'podman commit' before the migration as that would change the parent
layer the restored container is referencing. Probably not really a
problem, but it would have meant that a migrated container will always
reference another storage top layer than it used to reference during
initial creation.
Signed-off-by: Adrian Reber <areber@redhat.com>
the commit and pull varlink endpoints were not working correctly when
'more' was not being specified.
Fixes: #3317Fixes: #3318Fixes: #3526
Signed-off-by: baude <bbaude@redhat.com>
drop the limitation of not supporting creating new cgroups v2 paths.
Every controller enabled /sys/fs/cgroup will be propagated down to the
created path. This won't work for rootless cgroupsv2, but it is not
an issue for now, as this code is used only by CRI-O.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
add a simple way to copy ulimit values from the host.
if --ulimit host is used then the current ulimits in place are copied
to the container.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
It is not correct to rely on specific location of the podman binary.
In most cases it is /usr/bin/podman, but sometimes is not (e.g. in
system tests). Use /proc/self/exe instead of hardcoded path.
Signed-off-by: Danila Kiver <danila.kiver@mail.ru>
By default, podman points PIDFile in generated unit file to non-existent
location. As a result, the unit file, generated by podman, is broken:
an attempt to start this unit without prior modification results in a crash,
because systemd can not find the pidfile of service's main process.
Fix the value of "PIDFile" and add a system test for this case.
Signed-off-by: Danila Kiver <danila.kiver@mail.ru>
unfortunately rootless won't work without cgo, as most of the
implementation is in C, but at least allow to build libpod.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
do not attempt to join the rootless namespace if it is running already
with euid == 0.
Closes: https://github.com/containers/libpod/issues/3463
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
some podman commands do not require the use of a container/image store.
in those cases, it is more effecient to not open the store, because that
results in having to also close the store which can be costly when the
system is under heavy write I/O loads.
Signed-off-by: baude <bbaude@redhat.com>
Currently pause process blocks all signals which may cause its
termination, including SIGTERM. This behavior hangs init(1) during
system shutdown, until pause process gets SIGKILLed after some grace
period. To avoid this hanging, SIGTERM is excluded from list of blocked
signals.
Fixes#3440
Signed-off-by: Danila Kiver <danila.kiver@mail.ru>
at least on Fedora 30 it creates the /run/user/UID directory for the
user logged in via ssh.
This needs to be done very early so that every other check when we
create the default configuration file will point to the correct
location.
Closes: https://github.com/containers/libpod/issues/3410
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
This is an initial implementation of cgroup v2 support for
pkg/cgroups. It currently works with crun, with this patch:
https://github.com/giuseppe/crun/pull/49).
It adds the pieces for:
- set PID limit to 1
- retrieve stats so that "podman stats" work.
the only missing part is the support for reading per
CPU stats (that is cpuacct.usage_percpu on cgroup v1), so for now it
always returns an empty result.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
provide a package for managing cgroups. This is not supposed to be a
complete implementation with all the features supported by cgroups,
but it is a minimal implementation designed around what libpod needs
and it is currently using.
For example, it is currently possible to Apply only the pids limit,
as it is used by libpod for stopping containers, any other Apply will
just fail.
The main goal here is to have a minimal library where we have full
control, so we can start playing with cgroup v2.
When the need arises, we can add more features.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
When Host IP is not set in podman-remote.conf, error is printed out.
When Username is not set in podman-remote.conf, default username is used.
Signed-off-by: Ashley Cui <ashleycui16@gmail.com>
the compilation demands of having libpod in main is a burden for the
remote client compilations. to combat this, we should move the use of
libpod structs, vars, constants, and functions into the adapter code
where it will only be compiled by the local client.
this should result in cleaner code organization and smaller binaries. it
should also help if we ever need to compile the remote client on
non-Linux operating systems natively (not cross-compiled).
Signed-off-by: baude <bbaude@redhat.com>
To avoid unnecessary warnings and errors in the future I'd like to
propose building all cgo related sources with `-Wall -Werror`. This
commit fixes some warnings which came up in `shm_lock.c`, too.
Signed-off-by: Sascha Grunert <sgrunert@suse.com>
when using podman-remote on windows, the bridge format must account for
how windows deals with escape quoting. in this case, it does not need
any.
also, reduced duplicated code around generating the bridge endpoint for
the unix and windows platforms.
Signed-off-by: baude <bbaude@redhat.com>
The second argument of `execlp` should be of type `char *`, so we need
to add an additional argument there.
Signed-off-by: Sascha Grunert <sgrunert@suse.com>
These are only used on OS X Docker, and ignored elsewhere - but
since they are ignored, they're guaranteed to be safe everywhere,
and people are using them.
Fixes: #3340
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
In c/image, this adds the the mirror-by-digest-only option to mirrors, and
moves the search order to an independent list.
A synchronized buildah update is necessary to deal with the c/image API change.
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
This flag switches to removing containers directly from c/storage
and is mostly used to remove orphan containers.
It's a superior solution to our former one, which attempted
removal from storage under certain circumstances and could, under
some conditions, not trigger.
Also contains the beginning of support for storage in `ps` but
wiring that in is going to be a much bigger pain.
Fixes#3329.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
We made changes earlier that empty storage options when setting
storage driver explicitly. Unfortunately, this breaks rootless
cleanup commands, as they lose the fuse-overlayfs mount program
path.
Fix this by passing along the storage options to the cleanup
process.
Also, fix --syslog, which was broken a while ago (probably when
we broke up main to add main_remote).
Fixes#3326
Signed-off-by: Matthew Heon <mheon@redhat.com>
on old kernels the ioctl NS_GET_PARENT is not available.
Handle the error code and immediately return the same fd. It should
be fine now that we use the namespace resolution using the conmon pid,
so the namespace parent resolution is just a safety measure.
Closes: https://github.com/containers/libpod/issues/2968
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
The option to restore a container from an external checkpoint archive
(podman container restore -i /tmp/checkpoint.tar.gz) restores a
container with the same name and same ID as id had before checkpointing.
This commit adds the option '--name,-n' to 'podman container restore'.
With this option the restored container gets the name specified after
'--name,-n' and a new ID. This way it is possible to restore one
container multiple times.
If a container is restored with a new name Podman will not try to
request the same IP address for the container as it had during
checkpointing. This implicitly assumes that if a container is restored
from a checkpoint archive with a different name, that it will be
restored multiple times and restoring a container multiple times with
the same IP address will fail as each IP address can only be used once.
Signed-off-by: Adrian Reber <areber@redhat.com>
If restoring a container from a checkpoint it was necessary that the
image the container is based was already available (podman pull).
This commit adds the image download to podman container restore if it
does not exist.
Signed-off-by: Adrian Reber <areber@redhat.com>
This commit adds an option to the checkpoint command to export a
checkpoint into a tar.gz file as well as importing a checkpoint tar.gz
file during restore. With all checkpoint artifacts in one file it is
possible to easily transfer a checkpoint and thus enabling container
migration in Podman. With the following steps it is possible to migrate
a running container from one system (source) to another (destination).
Source system:
* podman container checkpoint -l -e /tmp/checkpoint.tar.gz
* scp /tmp/checkpoint.tar.gz destination:/tmp
Destination system:
* podman pull 'container-image-as-on-source-system'
* podman container restore -i /tmp/checkpoint.tar.gz
The exported tar.gz file contains the checkpoint image as created by
CRIU and a few additional JSON files describing the state of the
checkpointed container.
Now the container is running on the destination system with the same
state just as during checkpointing. If the container is kept running
on the source system with the checkpoint flag '-R', the result will be
that the same container is running on two different hosts.
Signed-off-by: Adrian Reber <areber@redhat.com>
Let's put inspect structs where they're actually being used. We
originally made pkg/inspect to solve circular import issues.
There are no more circular import issues.
Image structs remain for now, I'm focusing on container inspect.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
we are allowed to use only signal safe functions between a fork of a
multithreaded application and the next execve. Since setenv(3) is not
signal safe, block signals. We are already doing it for creating a
new namespace.
This is mostly a cleanup since reexec_in_user_namespace_wait is used
only only to join existing namespaces when we have not a pause.pid
file.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
after we read from the pause PID file, NUL terminate the buffer to
avoid reading garbage from the stack.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
add the ability for the podman remote client to use a configuration file
which describes its connections. users can now define a connection the
configuration and then call it by name like:
podman-remote -c connection1
and the destination and user will be derived from the configuration
file. if no -c is provided, we look for a connection in the
configuration file designated as 'default'. If the configuration file
has only one connection, it will be deemed the 'default'.
Signed-off-by: baude <bbaude@redhat.com>
Although an upgraded call is requested, the server has to send at least
one reply (can be an error) and the client has to check the reply,
before assuming an upgraded connection.
Signed-off-by: Harald Hoyer <harald@redhat.com>
move the logic for joining existing namespaces down to the rootless
package. In main_local we still retrieve the list of conmon pid files
and use it from the rootless package.
In addition, create a temporary user namespace for reading these
files, as the unprivileged user might not have enough privileges for
reading the conmon pid file, for example when running with a different
uidmap and root in the container is different than the rootless user.
Closes: https://github.com/containers/libpod/issues/3187
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
it creates a namespace where the current UID:GID on the host is mapped
to the same UID:GID in the container.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
When we supercede low-priority mounts and volumes (image volumes,
and volumes sourced from --volumes-from) with higher-priority
ones (the --volume and --mount flags), we always replaced
lower-priority mounts of the same type (e.g. a user mount to
/tmp/test1 would supercede a volumes-from mount to the same
destination). However, we did not supercede the opposite type - a
named volume from image volumes at /tmp/test1 would be allowed to
remain and create a conflict, preventing container creation.
Solve this by destroying opposite types before merging (we can't
do it in the same loop, as then named volumes, which go second,
might trample changes made by mounts).
Fixes#3174
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
when doing any sort of attach to a container, a sigwinch is sent
followed by a resize event. this is fine for the local client but when
doing things over the varlink, the first sigwinch is wiped out by the
immediate resize event and is therefore lost. by making the channel
buffered, both events are processed after the varlink connection is
established.
Signed-off-by: baude <bbaude@redhat.com>
force the resources block to be empty instead of having default
values.
Regression introduced by 8e88461511
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Mark hidden all references to signature-policy
Default all uses of --authfile
Add --authfile support to podman run and podman create.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
Mainly add support for podman build using --overlay mounts.
Updates containers/image also adds better support for new registries.conf
file.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
add a shortcut for joining immediately the namespace so we don't need
to re-exec Podman.
With the pause process simplificaton, we can now attempt to join the
namespaces as soon as Podman starts (and before the Go runtime kicks
in), so that we don't need to re-exec and use just one process.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
use a pause process to keep the user and mount namespace alive.
The pause process is created immediately on reload, and all successive
Podman processes will refer to it for joining the user&mount
namespace.
This solves all the race conditions we had on joining the correct
namespaces using the conmon processes.
As a fallback if the join fails for any reason (e.g. the pause process
was killed), then we try to join the running containers as we were
doing before.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Refactor client code to break out building connection string from
making the connection.
Example:
client:
Connection: unix:/run/podman/io.podman
Connection Type: DirectConnection
.
:
Signed-off-by: Jhon Honce <jhonce@redhat.com>
Type varlinkapi.VarlinkCall currently only used as receiver for
RequiresUpgrade() future helpers could be added to this type.
RequiresUpgrade() verifies caller has given correct options to the call
for the given operation.
Signed-off-by: Jhon Honce <jhonce@redhat.com>
We were always raising an error when the rootless user attempted to
setup resources, but this is not the case anymore with cgroup v2.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
allow the user to define a remote host and remote username for their
remote podman sessions. this is then feed to the varlink "bridge" as
the ssh credentials and endpoint.
Signed-off-by: baude <bbaude@redhat.com>
We want to start supporting the registries.conf format.
Also start showing blocked registries in podman info
Fix sorting so all registries are listed together in podman info.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
The on-failure restart option supports restarting only a given
number of times. To do this, we need one additional field in the
DB to track restart count (which conveniently fills a field in
Inspect we weren't populating), plus some plumbing logic.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
Fallback to executing ps(1) in case we hit an unknown psgo descriptor.
This ensures backwards compatibility with docker-top, which was purely
ps(1) driven.
Also support comma-separated descriptors as input.
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
the podman generate systemd command will generate a systemd unit file
based on the attributes of an existing container and user inputs. the
command outputs the unit file to stdout for the user to copy or
redirect. it is enabled for the remote client as well.
users can set a restart policy as well as define a stop timeout
override for the container.
Signed-off-by: baude <bbaude@redhat.com>