--sdnotify container|conmon|ignore
With "conmon", we send the MAINPID, and clear the NOTIFY_SOCKET so the OCI
runtime doesn't pass it into the container. We also advertise "ready" when the
OCI runtime finishes to advertise the service as ready.
With "container", we send the MAINPID, and leave the NOTIFY_SOCKET so the OCI
runtime passes it into the container for initialization, and let the container advertise further metadata.
This is the default, which is closest to the behavior podman has done in the past.
The "ignore" option removes NOTIFY_SOCKET from the environment, so neither podman nor
any child processes will talk to systemd.
This removes the need for hardcoded CID and PID files in the command line, and
the PIDFile directive, as the pid is advertised directly through sd-notify.
Signed-off-by: Joseph Gooch <mrwizard@dok.org>
With the advent of Podman 2.0.0 we crossed the magical barrier of go
modules. While we were able to continue importing all packages inside
of the project, the project could not be vendored anymore from the
outside.
Move the go module to new major version and change all imports to
`github.com/containers/libpod/v2`. The renaming of the imports
was done via `gomove` [1].
[1] https://github.com/KSubedi/gomove
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
--tz flag sets timezone inside container
Can be set to IANA timezone as well as `local` to match host machine
Signed-off-by: Ashley Cui <acui@redhat.com>
move the chown for newly created volumes after the spec generation so
the correct UID/GID are known.
Closes: https://github.com/containers/libpod/issues/5698
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
When running under systemd there is no need to create yet another
cgroup for the container.
With conmon-delegated the current cgroup will be split in two sub
cgroups:
- supervisor
- container
The supervisor cgroup will hold conmon and the podman process, while
the container cgroup is used by the OCI runtime (using the cgroupfs
backend).
Closes: https://github.com/containers/libpod/issues/6400
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
We initially believed that implementing this required support for
restarting containers after reboot, but this is not the case.
The unless-stopped restart policy acts identically to the always
restart policy except in cases related to reboot (which we do not
support yet), but it does not require that support for us to
implement it.
Changes themselves are quite simple, we need a new restart policy
constant, we need to remove existing checks that block creation
of containers when unless-stopped was used, and we need to update
the manpages.
Fixes#6508
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
When the container uses journald logging, we don't want to
automatically use the same driver for its exec sessions. If we do
we will pollute the journal (particularly in the case of
healthchecks) with large amounts of undesired logs. Instead,
force exec sessions logs to file for now; we can add a log-driver
flag later (we'll probably want to add a `podman logs` command
that reads exec session logs at the same time).
As part of this, add support for the new 'none' logs driver in
Conmon. It will be the default log driver for exec sessions, and
can be optionally selected for containers.
Great thanks to Joe Gooch (mrwizard@dok.org) for adding support
to Conmon for a null log driver, and wiring it in here.
Fixes#6555
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
Add an `--infra-conmon-pidfile` flag to `podman-pod-create` to write the
infra container's conmon process ID to a specified path. Several
container sub-commands already support `--conmon-pidfile` which is
especially helpful to allow for systemd to access and track the conmon
processes. This allows for easily tracking the conmon process of a
pod's infra container.
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
Add a `CreateCommand` field to the pod config which includes the entire
`os.Args` at pod-creation. Similar to the already existing field in a
container config, we need this information to properly generate generic
systemd unit files for pods. It's a prerequisite to support the `--new`
flag for pods.
Also add the `CreateCommand` to the pod-inspect data, which can come in
handy for debugging, general inspection and certainly for the tests that
are added along with the other changes.
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
* Support the `X-Registry-Auth` http-request header.
* The content of the header is a base64 encoded JSON payload which can
either be a single auth config or a map of auth configs (user+pw or
token) with the corresponding registries being the keys. Vanilla
Docker, projectatomic Docker and the bindings are transparantly
supported.
* Add a hidden `--registries-conf` flag. Buildah exposes the same
flag, mostly for testing purposes.
* Do all credential parsing in the client (i.e., `cmd/podman`) pass
the username and password in the backend instead of unparsed
credentials.
* Add a `pkg/auth` which handles most of the heavy lifting.
* Go through the authentication-handling code of most commands, bindings
and endpoints. Migrate them to the new code and fix issues as seen.
A final evaluation and more tests is still required *after* this
change.
* The manifest-push endpoint is missing certain parameters and should
use the ABI function instead. Adding auth-support isn't really
possible without these parts working.
* The container commands and endpoints (i.e., create and run) have not
been changed yet. The APIs don't yet account for the authfile.
* Add authentication tests to `pkg/bindings`.
Fixes: #6384
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
By moving a couple of variables from libpod/libpod to libpod/libpod/define
I am able shrink the podman-remote-* executables by another megabyte.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
This one was a massive pain to track down.
The original symptom was an error message from rootless Podman
trying to make a container in a pod. I unfortunately did not look
at the error message closely enough to realize that the namespace
in question was the cgroup namespace (the reproducer pod was
explicitly set to only share the network namespace), else this
would have been quite a bit shorter.
I spent considerable effort trying to track down differences
between the inspect output of the two containers, and when that
failed I was forced to resort to diffing the OCI specs. That
finally proved fruitful, and I was able to determine what should
have been obvious all along: the container was joining the cgroup
namespace of the infra container when it really ought not to
have.
From there, I discovered a variable collision in pod config. The
UsePodCgroup variable means "create a parent cgroup for the pod
and join containers in the pod to it". Unfortunately, it is very
similar to UsePodUTS, UsePodNet, etc, which mean "the pod shares
this namespace", so an accessor was accidentally added for it
that indicated the pod shared the cgroup namespace when it really
did not. Once I realized that, it was a quick fix - add a bool to
the pod's configuration to indicate whether the cgroup ns was
shared (distinct from UsePodCgroup) and use that for the
accessor.
Also included are fixes for `podman inspect` and
`podman pod inspect` that fix them to actually display the state
of the cgroup namespace (for container inspect) and what
namespaces are shared (for pod inspect). Either of those would
have made tracking this down considerably quicker.
Fixes#6149
Signed-off-by: Matthew Heon <mheon@redhat.com>
enabled integration tests for volumes. there are two exceptions that still need work because of something not yet implemented.
also, add code to deal with the fact that containers conf appears to set a local volume driver where it used to be simply blank.
Signed-off-by: Brent Baude <bbaude@redhat.com>
Instead of getting mount options from /proc/self/mountinfo, which is
very costly to read/parse (and can even be unreliable), let's use
statfs(2) to figure out the flags we need.
[v2: move getting default options to pkg/util, make it linux-specific]
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Also adds some basic tests for these two. More tests are needed
but will have to wait for state to be finished.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
Add support to auto-update containers running in systemd units as
generated with `podman generate systemd --new`.
`podman auto-update` looks up containers with a specified
"io.containers.autoupdate" label (i.e., the auto-update policy).
If the label is present and set to "image", Podman reaches out to the
corresponding registry to check if the image has been updated. We
consider an image to be updated if the digest in the local storage is
different than the one of the remote image. If an image must be
updated, Podman pulls it down and restarts the container. Note that the
restarting sequence relies on systemd.
At container-creation time, Podman looks up the "PODMAN_SYSTEMD_UNIT"
environment variables and stores it verbatim in the container's label.
This variable is now set by all systemd units generated by
`podman-generate-systemd` and is set to `%n` (i.e., the name of systemd
unit starting the container). This data is then being used in the
auto-update sequence to instruct systemd (via DBUS) to restart the unit
and hence to restart the container.
Note that this implementation of auto-updates relies on systemd and
requires a fully-qualified image reference to be used to create the
container. This enforcement is necessary to know which image to
actually check and pull. If we used an image ID, we would not know
which image to check/pull anymore.
Fixes: #3575
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
Until now, we've been validating every part of container
configuration through the With... functions that set the options.
This if fine when we are just validating the options to an
individual function, but things get complicated once we need to
validate conflicts between different options. We don't know the
order in which things were passed, so we need the validation on
both of the potential options that can conflict, resulting in
significant code duplication. To solve this, add a validate()
function for containers, and use this to check whether everything
is in a good state.
We can probably move more into this function (there are other
parts of container creation that also do validation of a sort)
but this is a good start to simplifying our options.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
Before Libpod supported named volumes, we approximated image
volumes by bind-mounting in per-container temporary directories.
This was handled by Libpod, and had a corresponding database
entry to enable/disable it.
However, when we enabled named volumes, we completely rewrote the
old implementation; none of the old bind mount implementation
still exists, save one flag in the database. With nothing
remaining to use it, it has no further purpose.
Signed-off-by: Matthew Heon <mheon@redhat.com>
Enables most of the network-related functionality from
`podman run` in `podman pod create`. Custom CNI networks can be
specified, host networking is supported, DNS options can be
configured.
Also enables host networking in `podman play kube`.
Fixes#2808Fixes#3837Fixes#4432Fixes#4718Fixes#4770
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
This adds network-related options to the pod in the database. We
are going to add the CLI frontend in further patches.
In short, this should greatly improve the ability of pods to
configure networking, once the CLI parsing is added.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
In Podman 1.6.3, we added support for anonymous volumes - fixing
our old, broken support for named volumes that were created with
containers. Unfortunately, this reused the database field we used
for the old implementation, and toggled volume removal on for
`podman run --rm` - so now, we were removing *named* volumes
created with older versions of Podman.
We can't modify these old volumes in the DB, so the next-safest
thing to do is swap to a new field to indicate volumes should be
removed. Problem: Volumes created with 1.6.3 and up until this
lands, even anonymous volumes, will not be removed. However, this
is safer than removing too many volumes, as we were doing before.
Fixes#5009
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
it allows to disable cgroups creation only for the conmon process.
A new cgroup is created for the container payload.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
`gocritic` is a powerful linter that helps in preventing certain kinds
of errors as well as enforcing a coding style.
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
support a custom tag to add to each log for the container.
It is currently supported only by the journald backend.
Closes: https://github.com/containers/libpod/issues/3653
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Store the full command plus arguments of the process the container has
been created with. Expose this data as a `Config.CreateCommand` field
in the container-inspect data as well.
This information can be useful for debugging, as we can find out which
command has created the container, and, if being created via the Podman
CLI, we know exactly with which flags the container has been created
with.
The immediate motivation for this change is to use this information for
`podman-generate-systemd` to generate systemd-service files that allow
for creating new containers (in contrast to only starting existing
ones).
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
These only conflict when joining more than one network. We can
still set a single CNI network and set a static IP and/or static
MAC.
Fixes#4500
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
When Libpod removes a container, there is the possibility that
removal will not fully succeed. The most notable problems are
storage issues, where the container cannot be removed from
c/storage.
When this occurs, we were faced with a choice. We can keep the
container in the state, appearing in `podman ps` and available for
other API operations, but likely unable to do any of them as it's
been partially removed. Or we can remove it very early and clean
up after it's already gone. We have, until now, used the second
approach.
The problem that arises is intermittent problems removing
storage. We end up removing a container, failing to remove its
storage, and ending up with a container permanently stuck in
c/storage that we can't remove with the normal Podman CLI, can't
use the name of, and generally can't interact with. A notable
cause is when Podman is hit by a SIGKILL midway through removal,
which can consistently cause `podman rm` to fail to remove
storage.
We now add a new state for containers that are in the process of
being removed, ContainerStateRemoving. We set this at the
beginning of the removal process. It notifies Podman that the
container cannot be used anymore, but preserves it in the DB
until it is fully removed. This will allow Remove to be run on
these containers again, which should successfully remove storage
if it fails.
Fixes#3906
Signed-off-by: Matthew Heon <mheon@redhat.com>
Refactor the `RuntimeConfig` along with related code from libpod into
libpod/config. Note that this is a first step of consolidating code
into more coherent packages to make the code more maintainable and less
prone to regressions on the long runs.
Some libpod definitions were moved to `libpod/define` to resolve
circular dependencies.
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
Move to containers/image v5 and containers/buildah to v1.11.4.
Replace an equality check with a type assertion when checking for a
docker.ErrUnauthorizedForCredentials in `podman login`.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
Everything else is a flag to mount, but "uid" and "gid" are not.
We need to parse them out of "o" and handle them separately.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
This is a horrible hack to work around issues with Fedora 31, but
other distros might need it to, so we'll move it upstream.
I do not recommend this functionality for general use, and the
manpages and other documentation will reflect this. But for some
upgrade cases, it will be the only thing that allows for a
working system.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
This requires updating all import paths throughout, and a matching
buildah update to interoperate.
I can't figure out the reason for go.mod tracking
github.com/containers/image v3.0.2+incompatible // indirect
((go mod graph) lists it as a direct dependency of libpod, but
(go list -json -m all) lists it as an indirect dependency),
but at least looking at the vendor subdirectory, it doesn't seem
to be actually used in the built binaries.
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
This is mostly used with Systemd, which really wants to manage
CGroups itself when managing containers via unit file.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
If I mount, say, /usr/bin into my container - I expect to be able
to run the executables in that mount. Unconditionally applying
noexec would be a bad idea.
Before my patches to change mount options and allow exec/dev/suid
being set explicitly, we inferred the mount options from where on
the base system the mount originated, and the options it had
there. Implement the same functionality for the new option
handling.
There's a lot of performance left on the table here, but I don't
know that this is ever going to take enough time to make it worth
optimizing.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
Previously, we explicitly set noexec/nosuid/nodev on every mount,
with no ability to disable them. The 'mount' command on Linux
will accept their inverses without complaint, though - 'noexec'
is counteracted by 'exec', 'nosuid' by 'suid', etc. Add support
for passing these options at the command line to disable our
explicit forcing of security options.
This also cleans up mount option handling significantly. We are
still parsing options in more than one place, which isn't good,
but option parsing for bind and tmpfs mounts has been unified.
Fixes: #3819Fixes: #3803
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
add ability to not activate sd_notify when running under varlink as it
causes deadlocks and hangs.
Fixes: #3572
Signed-off-by: baude <bbaude@redhat.com>
Begin to separate the internal structures and frontend for
inspect on volumes. We can't rely on keeping internal data
structures for external presentation - separating presentation
and internal data format is good practice.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
some podman commands do not require the use of a container/image store.
in those cases, it is more effecient to not open the store, because that
results in having to also close the store which can be costly when the
system is under heavy write I/O loads.
Signed-off-by: baude <bbaude@redhat.com>
the compilation demands of having libpod in main is a burden for the
remote client compilations. to combat this, we should move the use of
libpod structs, vars, constants, and functions into the adapter code
where it will only be compiled by the local client.
this should result in cleaner code organization and smaller binaries. it
should also help if we ever need to compile the remote client on
non-Linux operating systems natively (not cross-compiled).
Signed-off-by: baude <bbaude@redhat.com>
We were formerly dumping spec.Mount structs, with no care as to
whether it was user-generated or not - a relic of the very early
days when we didn't know whether a user made a mount or not.
Now that we do, match our output to Docker's dedicated mount
struct.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
We merged #2950 with some nits still remaining, as Giuseppe was
going on PTO. This addresses those small requested changes.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
it is useful to migrate existing containers to a new version of
podman. Currently, it is needed to migrate rootless containers that
were created with podman <= 1.2 to a newer version which requires all
containers to be running in the same user namespace.
Closes: https://github.com/containers/libpod/issues/2935
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
We were never using it. It's actually a potentially quite sizable
field (very expensive to decode an array of structs!). Removing
it should do no harm.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
Docker's upstream name validation regex has two major differences
from ours that we pick up in this PR.
The first requires that the first character of a name is a letter
or number, not a special character.
The second allows periods in names.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
We accidentally patched this out trying to enable ns:/path/to/ns
This should restore the ability to configure nondefault CNI
networks with Podman, by ensuring that they request creation of a
network namespace.
Completely remove the WithNetNS() call when we do use an explicit
namespace from a path. We use that call to indicate that a netns
is going to be created - there should not be any question about
whether it actually does.
Fixes#2795
Signed-off-by: Matthew Heon <mheon@redhat.com>
Specifically, we want to be able to specify whether resolv.conf
and /etc/hosts will be create and bind-mounted into the
container.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
In lipod, we now log major events that occurr. These events
can be displayed using the `podman events` command. Each
event contains:
* Type (container, image, volume, pod...)
* Status (create, rm, stop, kill, ....)
* Timestamp in RFC3339Nano format
* Name (if applicable)
* Image (if applicable)
The format of the event and the varlink endpoint are to not
be considered stable until cockpit has done its enablement.
Signed-off-by: baude <bbaude@redhat.com>
allow to configure the path to the network-cmd binary, either via an
option flag --network-cmd-path or through the libpod.conf
configuration file.
This is currently used to customize the path to the slirp4netns
binary.
Closes: https://github.com/containers/libpod/issues/2506
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Add the ability to manually run a container's healthcheck command.
This is only the first phase of implementing the healthcheck.
Subsequent pull requests will deal with the exposing the results and
history of healthchecks as well as the scheduling.
Signed-off-by: baude <bbaude@redhat.com>
When removing volumes with rm --volumes we want to only remove
volumes that were created with the container. Volumes created
separately via 'podman volume create' should not be removed.
Also ensure that --rm implies volumes will be removed.
Fixes#2441
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
If this doesn't match, we end up not being able to access named
volumes mounted into containers, which is bad. Use the same
validation that we use for other critical paths to ensure this
one also matches.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
We want named volumes to be created in a subdirectory of the
c/storage graph root, the same as the libpod root directory is
now. As such, we need to adjust its location when the graph root
changes location.
Also, make a change to how we set the default. There's no need to
explicitly set it every time we initialize via an option - that
might conflict with WithStorageConfig setting it based on graph
root changes. Instead, just initialize it in the default config
like our other settings.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
If user specifies network namespace and the /etc/netns/XXX/resolv.conf
exists, we should use this rather then /etc/resolv.conf
Also fail cleaner if the user specifies an invalid Network Namespace.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
if --runtime is specified, then it has higher priority on the
runtime_path option, which was added for backward compatibility.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
The original intent behind the requirement was to ensure that, if
two SHM lock structs were open at the same time, we should not
make such a runtime available to the user, and should clean it up
instead.
It turns out that we don't even need to open a second SHM lock
struct - if we get an error mapping the first one due to a lock
count mismatch, we can just delete it, and it cleans itself up
when it errors. So there's no reason not to return a valid
runtime.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
We can't do renumbering after init - we need to open a
potentially invalid locks file (too many/too few locks), and then
potentially delete the old locks and make new ones.
We need to be in init to bypass the checks that would otherwise
make this impossible.
This leaves us with two choices: make RenumberLocks a separate
entrypoint from NewRuntime, duplicating a lot of configuration
load code (we need to know where the locks live, how many there
are, etc) - or modify NewRuntime to allow renumbering during it.
Previous experience says the first is not really a viable option
and produces massive code bloat, so the second it is.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
iFix builtin volumes to work with podman volume
Currently builtin volumes are not recored in podman volumes when
they are created automatically. This patch fixes this.
Remove container volumes when requested
Currently the --volume option on podman remove does nothing.
This will implement the changes needed to remove the volumes
if the user requests it.
When removing a volume make sure that no container uses the volume.
Signed-off-by: Daniel J Walsh dwalsh@redhat.com
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
we can define multiple OCI runtimes that can be chosen with
--runtime.
in libpod.conf is possible to specify them with:
[runtimes]
foo = [
"/usr/bin/foo",
"/usr/sbin/foo",
]
bar = [
"/usr/bin/foo",
"/usr/sbin/foo",
]
If the argument to --runtime is an absolute path then it is used
directly without any lookup in the configuration.
Closes: https://github.com/containers/libpod/issues/1750
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
This deprecates the libpod.conf variable of `runtime_path=`, and now has
`runtimes=`, like a map for naming the runtime, preparing for a
`--runtime` flag to `podman run` (i.e. runc, kata, etc.)
Reference: https://github.com/containers/libpod/issues/1750
Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>
If one of storage GraphRoot or RunRoot are specified, but the
other is not, c/storage will not use the default, and will throw
an error instead. Ensure that in cases where this would happen,
we populate the fields with the c/storage defaults ourselves.
Signed-off-by: Matthew Heon <mheon@redhat.com>
Add support for podman volume and its subcommands.
The commands supported are:
podman volume create
podman volume inspect
podman volume ls
podman volume rm
podman volume prune
This is a tool to manage volumes used by podman. For now it only handle
named volumes, but eventually it will handle all volumes used by podman.
Signed-off-by: umohnani8 <umohnani@redhat.com>