Commit Graph

315 Commits

Author SHA1 Message Date
Valentin Rothberg e966c86d98 container.conf: support attributed string slices
All `[]string`s in containers.conf have now been migrated to attributed
string slices which require some adjustments in Buildah and Podman.

[NO NEW TESTS NEEDED]

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-10-27 12:44:33 +02:00
Paul Holzinger bad25da92e
libpod: add !remote tag
This should never be pulled into the remote client.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2023-10-24 12:11:34 +02:00
Valentin Rothberg 6293ec2e2d fix handling of static/volume dir
The processing and setting of the static and volume directories was
scattered across the code base (including c/common) leading to subtle
errors that surfaced in #19938.

There were multiple issues that I try to summarize below:

 - c/common loaded the graphroot from c/storage to set the defaults for
   static and volume dir.  That ignored Podman's --root flag and
   surfaced in #19938 and other bugs.  c/common does not set the
   defaults anymore which gives Podman the ability to detect when the
   user/admin configured a custom directory (not empty value).

 - When parsing the CLI, Podman (ab)uses containers.conf structures to
   set the defaults but also to override them in case the user specified
   a flag.  The --root flag overrode the static dir which is wrong and
   broke a couple of use cases.  Now there is a dedicated field for in
   the "PodmanConfig" which also includes a containers.conf struct.

 - The defaults for static and volume dir and now being set correctly
   and adhere to --root.

 - The CONTAINERS_CONF_OVERRIDE env variable has not been passed to the
   cleanup process.  I believe that _all_ env variables should be passed
   to conmon to avoid such subtle bugs.

Overall I find that the code and logic is scattered and hard to
understand and follow.  I refrained from larger refactorings as I really
just want to get #19938 fixed and then go back to other priorities.

https://github.com/containers/common/pull/1659 broke three pkg/machine
tests.  Those have been commented out until getting fixed.

Fixes: #19938
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-09-25 14:14:30 +02:00
Daniel J Walsh 1e54539432
Add support for passing container stop timeout as -1 (infinite)
Compat api for containers/stop should take -1 value

Add support for `podman stop --time -1`
Add support for `podman restart --time -1`
Add support for `podman rm --time -1`
Add support for `podman pod stop --time -1`
Add support for `podman pod rm --time -1`
Add support for `podman volume rm --time -1`
Add support for `podman network rm --time -1`

Fixes: https://github.com/containers/podman/issues/17542

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2023-08-04 08:36:45 -04:00
Daniel J Walsh 22a8b68866
make /dev & /dev/shm read/only when --read-only --read-only-tmpfs=false
The intention of --read-only-tmpfs=fals when in --read-only mode was to
not allow any processes inside of the container to write content
anywhere, unless the caller also specified a volume or a tmpfs. Having
/dev and /dev/shm writable breaks this assumption.

Fixes: https://github.com/containers/podman/issues/12937

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2023-07-30 06:09:30 -04:00
Aditya R 3829fbd35a
podman: add support for splitting imagestore
Add support for `--imagestore` in podman which allows users to split the filesystem of containers vs image store, imagestore if configured will pull images in image storage instead of the graphRoot while keeping the other parts still in the originally configured graphRoot.

This is an implementation of
https://github.com/containers/storage/pull/1549 in podman.

Signed-off-by: Aditya R <arajan@redhat.com>
2023-06-17 08:51:08 +05:30
Paul Holzinger 34c258b419
libpod: fix timezone handling
The current way of bind mounting the host timezone file has problems.
Because /etc/localtime in the image may exist and is a symlink under
/usr/share/zoneinfo it will overwrite the targetfile. That confuses
timezone parses especially java where this approach does not work at
all. So we end up with an link which does not reflect the actual truth.

The better way is to just change the symlink in the image like it is
done on the host. However because not all images ship tzdata we cannot
rely on that either. So now we do both, when tzdata is installed then
use the symlink and if not we keep the current way of copying the host
timezone file in the container to /etc/localtime.

Also note that we need to rebuild the systemd image to include tzdata in
order to test this as our images do not contain the tzdata by default.

Fixes https://bugzilla.redhat.com/show_bug.cgi?id=2149876

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2023-06-01 11:04:13 +02:00
Valentin Rothberg 08b0d93ea3 kube play: exit-code propagation
Implement means for reflecting failed containers (i.e., those having
exited non-zero) to better integrate `kube play` with systemd.  The
idea is to have the main PID of `kube play` exit non-zero in a
configurable way such that systemd's restart policies can kick in.

When using the default sdnotify-notify policy, the service container
acts as the main PID to further reduce the resource footprint.  In that
case, before stopping the service container, Podman will lookup the exit
codes of all non-infra containers.  The service will then behave
according to the following three exit-code policies:

 - `none`: exit 0 and ignore containers (default)
 - `any`: exit non-zero if _any_ container did
 - `all`: exit non-zero if _all_ containers did

The upper values can be passed via a hidden `kube play
--service-exit-code-propagation` flag which can be used by tests and
later on by Quadlet.

In case Podman acts as the main PID (i.e., when at least one container
runs with an sdnotify-policy other than "ignore"), Podman will continue
to wait for the service container to exit and reflect its exit code.

Note that this commit also fixes a long-standing annoyance of the
service container exiting non-zero.  The underlying issue was that the
service container had been stopped with SIGKILL instead of SIGTERM and
hence exited non-zero.  Fixing that was a prerequisite for the exit-code
propagation to work but also improves the integration of `kube play`
with systemd and hence Quadlet with systemd.

Jira: issues.redhat.com/browse/RUN-1776
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-05-25 14:46:34 +02:00
Urvashi Mohnani edbeee5238 Add --restart flag to pod create
Add --restart flag to pod create to allow users to set the
restart policy for the pod, which applies to all the containers
in the pod. This reuses the restart policy already there for
containers and has the same restart policy options.
Add "never" to the restart policy options to match k8s syntax.
It is a synonym for "no" and does the exact same thing where the
containers are not restarted once exited.
Only the containers that have exited will be restarted based on the
restart policy, running containers will not be restarted when an exited
container is restarted in the same pod (same as is done in k8s).

Signed-off-by: Urvashi Mohnani <umohnani@redhat.com>
2023-05-02 10:29:58 -04:00
Daniel J Walsh ad8a96ab95
Support running nested SELinux container separation
Currently Podman prevents SELinux container separation,
when running within a container. This PR adds a new
--security-opt label=nested

When setting this option, Podman unmasks and mountsi
/sys/fs/selinux into the containers making /sys/fs/selinux
fully exposed. Secondly Podman sets the attribute
run.oci.mount_context_type=rootcontext

This attribute tells crun to mount volumes with rootcontext=MOUNTLABEL
as opposed to context=MOUNTLABEL.

With these two settings Podman inside the container is allowed to set
its own SELinux labels on tmpfs file systems mounted into its parents
container, while still being confined by SELinux. Thus you can have
nested SELinux labeling inside of a container.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2023-03-13 14:21:12 -04:00
Daniel J Walsh b5a99e0816
Must use mountlabel when creating builtin volumes
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2023-03-09 12:36:52 -05:00
Valentin Rothberg e77f370f86 sqlite: add a hidden --db-backend flag
Add a hidden flag to set the database backend and plumb it into
podman-info.  Further add a system test to make sure the flag and the
info output are working properly.

Note that the test may need to be changed once we settled on how
to test the sqlite backend in CI.

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-03-02 13:43:11 +01:00
OpenShift Merge Robot 02a77d27a2
Merge pull request #17450 from danishprakash/add-group-entry
create: add entry to /etc/group via `--group-entry`
2023-02-28 21:59:59 +01:00
dependabot[bot] e9942c61dd build(deps): bump github.com/container-orchestrated-devices/container-device-interface
Bumps [github.com/container-orchestrated-devices/container-device-interface](https://github.com/container-orchestrated-devices/container-device-interface) from 0.5.3 to 0.5.4.
- [Release notes](https://github.com/container-orchestrated-devices/container-device-interface/releases)
- [Commits](https://github.com/container-orchestrated-devices/container-device-interface/compare/v0.5.3...v0.5.4)

---
updated-dependencies:
- dependency-name: github.com/container-orchestrated-devices/container-device-interface
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

[NO NEW TESTS NEEDED]

Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-02-20 14:51:04 +01:00
danishprakash 828708bac2
create: add support for --group-entry
* add test
* update documentation

Signed-off-by: danishprakash <danish.prakash@suse.com>
2023-02-15 11:20:18 +05:30
danishprakash 0999991b20
add support for limiting tmpfs size for systemd-specific mnts
* add tests
* add documentation for --shm-size-systemd
* add support for both pod and standalone run

Signed-off-by: danishprakash <danish.prakash@suse.com>
2023-02-14 14:56:09 +05:30
Erik Sjölund a5ca732256 Fix typos
Software version used
https://github.com/crate-ci/typos/releases/tag/v1.13.10

The binary was downloaded from
https://github.com/crate-ci/typos/releases/download/v1.13.10/typos-v1.13.10-x86_64-unknown-linux-musl.tar.gz

Command that was run:

typos --write-changes docs cmd cni contrib dependencies docs hack libpod pkg utils

False positives were manually removed.
A few marshaling/existant typos were manually fixed.

Signed-off-by: Erik Sjölund <erik.sjolund@gmail.com>
2023-02-11 18:23:24 +01:00
Erik Sjölund 08e13867a9 Fix typos. Improve language.
Signed-off-by: Erik Sjölund <erik.sjolund@gmail.com>
2023-02-09 21:56:27 +01:00
Giuseppe Scrivano 2bb4c7cdde
libpod: support idmap for --rootfs
add a new option idmap to --rootfs that works in the same way as it
does for volumes.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2023-02-02 22:35:00 +01:00
Daniel J Walsh c2b36beb40
Use containers/storage/pkg/regexp in place of regexp
This is a cleaner solution and guarantees the variables
will be used before they are initialized.

[NO NEW TESTS NEEDED]

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2023-01-12 18:33:38 -05:00
Daniel J Walsh 758f20e20a
Compile regex on demand not in init
Every podman command is paying the price for this compile even when they
don't use the Regex, this will speed up start of podman by a little.

[NO NEW TESTS NEEDED] Existing tests should catch issues.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2023-01-11 14:38:51 -05:00
Daniel J Walsh 0c94f61852
Allow '/' to prefix container names to match Docker
Fixes: https://github.com/containers/podman/issues/16663

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2022-12-26 07:37:43 -05:00
Paul Holzinger 4fa65ad0dc
libpod: remove CNI word were no longer applicable
We should have done this much earlier, most of the times CNI networks
just mean networks so I changed this and also fixed some function
names. This should make it more clear what actually refers to CNI and
what is just general network backend stuff.

[NO NEW TESTS NEEDED]

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2022-12-16 14:20:14 +01:00
Valentin Rothberg dcbf7b4481 bump golangci-lint to v1.50.1
Also fix a number of duplicate words.  Yet disable the new `dupword`
linter as it displays too many false positives.

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2022-12-15 13:39:56 +01:00
Charlie Doern 95cc7e0527 add support for subpath in play kube for named volumes
subpath allows for only a subdirecty of a volumes data to be mounted in the container
add support for the named volume type sub path with others to follow.

resolves #12929

Signed-off-by: Charlie Doern <cbddoern@gmail.com>
2022-12-12 09:54:00 -05:00
Alexander Larsson 25d9af8f42 runtime: Handle the transient store options
This handles the transient store options from the container/storage
configuration in the runtime/engine.

Changes are:
 * Print transient store status in `podman info`
 * Print transient store status in runtime debug output
 * Add --transient-store argument to override config option
 * Propagate config state to conmon cleanup args so the callback podman
   gets the same config.

Note: This doesn't really change any behaviour yet (other than the changes
in containers/storage).

Signed-off-by: Alexander Larsson <alexl@redhat.com>
2022-12-05 18:09:21 +01:00
Matthew Heon d16129330d Add support for startup healthchecks
Startup healthchecks are similar to K8S startup probes, in that
they are a separate check from the regular healthcheck that runs
before it. If the startup healthcheck fails repeatedly, the
associated container is restarted.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2022-11-28 13:30:29 -05:00
Erik Sjölund a1b32866cc Fix language. Mostly spelling a -> an
Signed-off-by: Erik Sjölund <erik.sjolund@gmail.com>
2022-11-20 19:41:06 +01:00
Aditya R f6da2b0608
specgen,wasm: switch to crun-wasm wherever applicable
Whenever image has `arch` and `os` configured for `wasm/wasi` switch to
`crun-wasm` as extension if applicable, following will only work if
`podman` is using `crun` as default runtime.

[NO NEW TESTS NEEDED]
[NO TESTS NEEDED]
A test for this can be added when `crun-wasm` is part of CI.

Signed-off-by: Aditya R <arajan@redhat.com>
2022-11-14 11:36:34 +05:30
Alexander Larsson 734c435e01 Add podman volume create --ignore
This ignores the create request if the named volume already exists.
It is very useful when scripting stuff.

Signed-off-by: Alexander Larsson <alexl@redhat.com>
2022-10-24 17:30:31 +02:00
Valentin Rothberg aad29e759c health check: add on-failure actions
For systems that have extreme robustness requirements (edge devices,
particularly those in difficult to access environments), it is important
that applications continue running in all circumstances. When the
application fails, Podman must restart it automatically to provide this
robustness. Otherwise, these devices may require customer IT to
physically gain access to restart, which can be prohibitively difficult.

Add a new `--on-failure` flag that supports four actions:

- **none**: Take no action.

- **kill**: Kill the container.

- **restart**: Restart the container.  Do not combine the `restart`
               action with the `--restart` flag.  When running inside of
               a systemd unit, consider using the `kill` or `stop`
               action instead to make use of systemd's restart policy.

- **stop**: Stop the container.

To remain backwards compatible, **none** is the default action.

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2022-09-09 13:02:05 +02:00
Urvashi Mohnani 98169c20dd Add emptyDir volume support to kube play
When a kube yaml has a volume set as empty dir, podman
will create an anonymous volume with the empty dir name and
attach it to the containers running in the pod. When the pod
is removed, the empy dir volume created is also removed.

Add tests and docs for this as well.

Signed-off-by: Urvashi Mohnani <umohnani@redhat.com>
2022-08-30 10:34:45 -04:00
Matthew Heon 0f73935563 Add support for containers.conf volume timeouts
Also, do a general cleanup of all the timeout code. Changes
include:
- Convert from int to *uint where possible. Timeouts cannot be
  negative, hence the uint change; and a timeout of 0 is valid,
  so we need a new way to detect that the user set a timeout
  (hence, pointer).
- Change name in the database to avoid conflicts between new data
  type and old one. This will cause timeouts set with 4.2.0 to be
  lost, but considering nobody is using the feature at present
  (and the lack of validation means we could have invalid,
  negative timeouts in the DB) this feels safe.
- Ensure volume plugin timeouts can only be used with volumes
  created using a plugin. Timeouts on the local driver are
  nonsensical.
- Remove the existing test, as it did not use a volume plugin.
  Write a new test that does.

The actual plumbing of the containers.conf timeout in is one line
in volume_api.go; the remainder are the above-described cleanups.

Signed-off-by: Matthew Heon <mheon@redhat.com>
2022-08-23 15:42:00 -04:00
Valentin Rothberg 79e21b5b16 kube play: sd-notify integration
Integrate sd-notify policies into `kube play`.  The policies can be
configured for all contianers via the `io.containers.sdnotify`
annotation or for indidivual containers via the
`io.containers.sdnotify/$name` annotation.

The `kube play` process will wait for all containers to be ready by
waiting for the individual `READY=1` messages which are received via
the `pkg/systemd/notifyproxy` proxy mechanism.

Also update the simple "container" sd-notify test as it did not fully
test the expected behavior which became obvious when adding the new
tests.

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2022-08-10 21:12:39 +02:00
Valentin Rothberg 3fc126e152 libpod: allow the notify socket to be passed programatically
The notify socket can now either be specified via an environment
variable or programatically (where the env is ignored).  The
notify mode and the socket are now also displayed in `container inspect`
which comes in handy for debugging and allows for propper testing.

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2022-08-10 21:10:17 +02:00
Charlie Doern c00ea686fe resource limits for pods
added the following flags and handling for podman pod create

--memory-swap
--cpuset-mems
--device-read-bps
--device-write-bps
--blkio-weight
--blkio-weight-device
--cpu-shares

given the new backend for systemd in c/common, all of these can now be exposed to pod create.
most of the heavy lifting (nearly all) is done within c/common. However, some rewiring needed to be done here
as well!

Signed-off-by: Charlie Doern <cdoern@redhat.com>
2022-07-21 14:50:01 -04:00
Sascha Grunert 251d91699d
libpod: switch to golang native error wrapping
We now use the golang error wrapping format specifier `%w` instead of
the deprecated github.com/pkg/errors package.

[NO NEW TESTS NEEDED]

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
2022-07-05 16:06:32 +02:00
openshift-ci[bot] 96e72d90b8
Merge pull request #14449 from cdoern/podVolumes
podman volume create --opt=o=timeout...
2022-07-01 08:46:11 +00:00
Erik Sjölund aa4279ae15 Fix spelling "setup" -> "set up" and similar
* Replace "setup", "lookup", "cleanup", "backup" with
  "set up", "look up", "clean up", "back up"
  when used as verbs. Replace also variations of those.

* Improve language in a few places.

Signed-off-by: Erik Sjölund <erik.sjolund@gmail.com>
2022-06-22 18:39:21 +02:00
cdoern 7b3e43c1f6 podman volume create --opt=o=timeout...
add an option to configure the driver timeout when creating a volume.
The default is 5 seconds but this value is too small for some custom drivers.

Signed-off-by: cdoern <cdoern@redhat.com>
2022-06-09 16:44:21 -04:00
OpenShift Merge Robot fef40e2ad3
Merge pull request #14483 from jakecorrenti/restart-privelaged-containers-after-host-device-change
Privileged containers can now restart if the host devices change
2022-06-07 15:48:36 -04:00
Jake Correnti 8533ea0004 Privileged containers can now restart if the host devices change
If a privileged container is running, stops, and the devices on the host
change, such as a USB device is unplugged, then a container would no
longer start. Previously, the devices from the host were only being
added to the container once: when the container was created. Now, this
happens every time the container starts.

I did this by adding a boolean to the container config that indicates
whether to mount all of the devices or not, which can be set via an option.

During spec generation, if the `MountAllDevices` option is set in the
container config, all host devices are added to the container.

Additionally, a couple of functions from `pkg/specgen/generate/config_linux.go`
were moved into `pkg/util/utils_linux.go` as they were needed in
multiple packages.

Closes #13899

Signed-off-by: Jake Correnti <jcorrenti13@gmail.com>
2022-06-06 14:14:22 -04:00
Matthew Heon 259c79963f Improve robustness of `podman system reset`
Firstly, reset is now managed by the runtime itself as a part of
initialization. This ensures that it can be used even with
runtimes that would otherwise fail to be created - most notably,
when the user has changed a core path
(runroot/root/tmpdir/staticdir).

Secondly, we now attempt a best-effort removal even if the store
completely fails to be configured.

Third, we now hold the alive lock for the entire reset operation.
This ensures that no other Podman process can start while we are
running a system reset, and removes any possibility of a race
where a user tries to create containers or pull images while we
are trying to perform a reset.

[NO NEW TESTS NEEDED] we do not test reset last I checked.

Fixes #9075

Signed-off-by: Matthew Heon <mheon@redhat.com>
2022-06-03 12:54:08 -04:00
Daniel J Walsh 5d37d80ff9
Use containers/common/pkg/util.StringToSlice
[NO NEW TESTS NEEDED] Just code cleanup for better reuse

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2022-05-23 12:16:54 -04:00
Valentin Rothberg 840c120c21 play kube: service container
Add the notion of a "service container" to play kube.  A service
container is started before the pods in play kube and is (reverse)
linked to them.  The service container is stopped/removed *after*
all pods it is associated with are stopped/removed.

In other words, a service container tracks the entire life cycle
of a service started via `podman play kube`.  This is required to
enable `play kube` in a systemd unit file.

The service container is only used when the `--service-container`
flag is set on the CLI.  This flag has been marked as hidden as it
is not meant to be used outside the context of `play kube`.  It is
further not supported on the remote client.

The wiring with systemd will be done in a later commit.

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2022-05-12 10:51:13 +02:00
Valentin Rothberg 4eff0c8cf2 pod: add exit policies
Add the notion of an "exit policy" to a pod.  This policy controls the
behaviour when the last container of pod exits.  Initially, there are
two policies:

 - "continue" : the pod continues running. This is the default policy
                when creating a pod.

 - "stop" : stop the pod when the last container exits. This is the
            default behaviour for `play kube`.

In order to implement the deferred stop of a pod, add a worker queue to
the libpod runtime.  The queue will pick up work items and in this case
helps resolve dead locks that would otherwise occur if we attempted to
stop a pod during container cleanup.

Note that the default restart policy of `play kube` is "Always".  Hence,
in order to really solve #13464, the YAML files must set a custom
restart policy; the tests use "OnFailure".

Fixes: #13464
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2022-05-02 13:29:59 +02:00
Giuseppe Scrivano 91ead15283
volume: add new option -o o=noquota
add a new option to completely disable xfs quota usage for a volume.

xfs quota set on a volume, even just for tracking disk usage, can
cause weird errors if the volume is later re-used by a container with
a different quota projid.  More specifically, link(2) and rename(2)
might fail with EXDEV if the source file has a projid that is
different from the parent directory.

To prevent such kind of issues, the volume should be created
beforehand with `podman volume create -o o=noquota $ID`

Closes: https://github.com/containers/podman/issues/14049

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2022-04-28 13:29:01 +02:00
Paul Holzinger 51fbf3da9e
enable gocritic linter
The linter ensures a common code style.
- use switch/case instead of else if
- use if instead of switch/case for single case statement
- add space between comment and text
- detect the use of defer with os.Exit()
- use short form var += "..." instead of var = var + "..."
- detect problems with append()
```
newSlice := append(orgSlice, val)
```
  This could lead to nasty bugs because the orgSlice will be changed in
  place if it has enough capacity too hold the new elements. Thus we
  newSlice might not be a copy.

Of course most of the changes are just cosmetic and do not cause any
logic errors but I think it is a good idea to enforce a common style.
This should help maintainability.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2022-04-26 18:12:22 +02:00
Paul Holzinger 2a8e435671
enable staticcheck linter
Fix many problems reported by the staticcheck linter, including many
real bugs!

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2022-04-22 12:51:29 +02:00
OpenShift Merge Robot 8d3075e332
Merge pull request #13583 from rhatdan/ipc
Add support for ipc namespace modes "none, private, sharable"
2022-04-16 12:30:01 -04:00