The O_PATH flag is a recent addition to the open syscall and is not
present in darwin or in FreeBSD releases before 13.1. The constant is
not present in the FreeBSD version of x/sys/unix since that package
supports FreeBSD 12.3 and later.
[NO NEW TESTS NEEDED]
Signed-off-by: Doug Rabson <dfr@rabson.org>
This removes a use of state.NetNS which is a linux-specific field defined
in container_linux.go from the generic container_internal.go, allowing
that to build on non-linux platforms.
[NO NEW TESTS NEEDED]
Signed-off-by: Doug Rabson <dfr@rabson.org>
The notify socket can now either be specified via an environment
variable or programatically (where the env is ignored). The
notify mode and the socket are now also displayed in `container inspect`
which comes in handy for debugging and allows for propper testing.
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
bump github.com/container-orchestrated-devices/container-device-interface from 0.4.0 to 0.5.0
This requires that the cdi.Registry be instantiated with AutoRefresh disabled for CLI clients.
[NO NEW TESTS NEEDED]
Signed-off-by: Evan Lezar <elezar@nvidia.com>
* Correct spelling and typos.
* Improve language.
Co-authored-by: Ed Santiago <santiago@redhat.com>
Signed-off-by: Erik Sjölund <erik.sjolund@gmail.com>
We now use the golang error wrapping format specifier `%w` instead of
the deprecated github.com/pkg/errors package.
[NO NEW TESTS NEEDED]
Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
currently, setting any sort of resource limit in a pod does nothing. With the newly refactored creation process in c/common, podman ca now set resources at a pod level
meaning that resource related flags can now be exposed to podman pod create.
cgroupfs and systemd are both supported with varying completion. cgroupfs is a much simpler process and one that is virtually complete for all resource types, the flags now just need to be added. systemd on the other hand
has to be handeled via the dbus api meaning that the limits need to be passed as recognized properties to systemd. The properties added so far are the ones that podman pod create supports as well as `cpuset-mems` as this will
be the next flag I work on.
Signed-off-by: Charlie Doern <cdoern@redhat.com>
* Replace "setup", "lookup", "cleanup", "backup" with
"set up", "look up", "clean up", "back up"
when used as verbs. Replace also variations of those.
* Improve language in a few places.
Signed-off-by: Erik Sjölund <erik.sjolund@gmail.com>
The nolintlint linter does not deny the use of `//nolint`
Instead it allows us to enforce a common nolint style:
- force that a linter name must be specified
- do not add a space between `//` and `nolint`
- make sure nolint is only used when there is actually a problem
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
if /run is on a volume do not create the file /run/.containerenv as it
would leak outside of the container.
Closes: https://github.com/containers/podman/issues/14577
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Podman and Buildah should use the same code the generate the resolv.conf
file. This mostly moved the podman code into c/common and created a
better API for it so buildah can use it as well.
[NO NEW TESTS NEEDED] All existing tests should continue to pass.
Fixes#13599 (There is no way to test this in CI without breaking the
hosts resolv.conf)
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
If a privileged container is running, stops, and the devices on the host
change, such as a USB device is unplugged, then a container would no
longer start. Previously, the devices from the host were only being
added to the container once: when the container was created. Now, this
happens every time the container starts.
I did this by adding a boolean to the container config that indicates
whether to mount all of the devices or not, which can be set via an option.
During spec generation, if the `MountAllDevices` option is set in the
container config, all host devices are added to the container.
Additionally, a couple of functions from `pkg/specgen/generate/config_linux.go`
were moved into `pkg/util/utils_linux.go` as they were needed in
multiple packages.
Closes#13899
Signed-off-by: Jake Correnti <jcorrenti13@gmail.com>
Similar feature was added for named overlay volumes here: https://github.com/containers/podman/pull/12712
Following PR just mimics similar feature for anonymous volumes.
Often users want their anonymous overlayed volumes to be `non-volatile` in nature
that means that same `upper` dir can be re-used by one or more
containers but overall of nature of volumes still have to be overlay
so work done is still on a overlay not on the actual volume.
Following PR adds support for more advanced options i.e custom `workdir`
and `upperdir` for overlayed volumes. So that users can re-use `workdir`
and `upperdir` across new containers as well.
Usage
```console
podman run -it -v /some/path:/data:O,upperdir=/path/persistant/upper,workdir=/path/persistant/work alpine sh
```
Signed-off-by: Aditya R <arajan@redhat.com>
Most of these are no longer relevant, just drop the comments.
Most notable change: allow `podman kill` on paused containers.
Works just fine when I test it.
Signed-off-by: Matthew Heon <mheon@redhat.com>
When a container is run in the host network namespace we have to keep
the same resolv.conf content and not use the systemd-resolve detection
logic.
But also make sure we still allow --dns options.
Fixes#14055
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
The files /etc/hosts, /etc/hostname and /etc/resolv.conf should always
be owned by the root user in the container. This worked correct for
/etc/hostname and /etc/hosts but not for /etc/resolv.conf.
A container run with --userns keep-id would have the reolv.conf file
owned by the current container user which is wrong.
Consolidate some common code in a new helper function to make the code more
cleaner.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
The errcheck linter makes sure that errors are always check and not
ignored by accident. It spotted a lot of unchecked errors, mostly in the
tests but also some real problem in the code.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
The linter ensures a common code style.
- use switch/case instead of else if
- use if instead of switch/case for single case statement
- add space between comment and text
- detect the use of defer with os.Exit()
- use short form var += "..." instead of var = var + "..."
- detect problems with append()
```
newSlice := append(orgSlice, val)
```
This could lead to nasty bugs because the orgSlice will be changed in
place if it has enough capacity too hold the new elements. Thus we
newSlice might not be a copy.
Of course most of the changes are just cosmetic and do not cause any
logic errors but I think it is a good idea to enforce a common style.
This should help maintainability.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
golint, scopelint and interfacer are deprecated. golint is replaced by
revive. This linter is better because it will also check for our error
style: `error strings should not be capitalized or end with punctuation or a newline`
scopelint is replaced by exportloopref (already endabled)
interfacer has no replacement but I do not think this linter is
important.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Use the new logic from c/common to create the hosts file. This will help
to better allign the hosts files between buildah and podman.
Also this fixes several bugs:
- remove host entries when container is stopped and has a netNsCtr
- add entries for containers in a pod
- do not duplicate entries in the hosts file
- use the correct slirp ip when an userns is used
Features:
- configure host.containers.internal entry in containers.conf
- configure base hosts file in containers.conf
Fixes#12003Fixes#13224
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
This is an enhancement proposal for the checkpoint / restore feature of
Podman that enables container migration across multiple systems with
standard image distribution infrastructure.
A new option `--create-image <image>` has been added to the
`podman container checkpoint` command. This option tells Podman to
create a container image. This is a standard image with a single layer,
tar archive, that that contains all checkpoint files. This is similar to
the current approach with checkpoint `--export`/`--import`.
This image can be pushed to a container registry and pulled on a
different system. It can also be exported locally with `podman image
save` and inspected with `podman inspect`. Inspecting the image would
display additional information about the host and the versions of
Podman, criu, crun/runc, kernel, etc.
`podman container restore` has also been extended to support image
name or ID as input.
Suggested-by: Adrian Reber <areber@redhat.com>
Signed-off-by: Radostin Stoyanov <radostin@redhat.com>
It allows to customize the entry that is written to the `/etc/passwd`
file when --passwd is used.
Closes: https://github.com/containers/podman/issues/13185
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
systemd expects the container_uuid environment variable be set
when it is running in a container.
Fixes: https://github.com/containers/podman/issues/13187
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
While resolving `workdir` we mostly create a `workdir` when `stat`
fails with `ENOENT` or `ErrNotExist` however following cases are not
true when user explicitly specifies a `workdir` while `running` using
`--workdir` which tells `podman` to only use workdir if its exists on
the container. Following configuration is implicity set with other
`run` mechanism like `podman play kube`
Problem with explicit `--workdir` or similar implicit config in `podman play
kube` is that currently podman ignores the fact that workdir can also be
a `symlink` and actual `link` could be valid.
Hence following commit ensures that in such scenarios when a `workdir`
is not found and we cannot create a `workdir` podman must perform a
check to ensure that if `workdir` is a `symlink` and `link` is resolved
successfully and resolved link is present on the container then we
return as it is.
Docker performs a similar behviour.
Signed-off-by: Aditya R <arajan@redhat.com>
podman container clone takes the id of an existing continer and creates a specgen from the given container's config
recreating all proper namespaces and overriding spec options like resource limits and the container name if given in the cli options
this command utilizes the common function DefineCreateFlags meaning that we can funnel as many create options as we want
into clone over time allowing the user to clone with as much or as little of the original config as they want.
container clone takes a second argument which is a new name and a third argument which is an image name to use instead of the original container's
the current supported flags are:
--destroy (remove the original container)
--name (new ctr name)
--cpus (sets cpu period and quota)
--cpuset-cpus
--cpu-period
--cpu-rt-period
--cpu-rt-runtime
--cpu-shares
--cpuset-mems
--memory
--run
resolves#10875
Signed-off-by: cdoern <cdoern@redhat.com>
Signed-off-by: cdoern <cbdoer23@g.holycross.edu>
Signed-off-by: cdoern <cdoern@redhat.com>
The `podman network connect` and `podman network disconnect`
commands give containers access to different networks than the
ones they were created with; these networks can also have DNS
servers associated with them. Until now, however, we did not
modify resolv.conf as network membership changed.
With this PR, `podman network connect` will add any new
nameservers supported by the new network to the container's
/etc/resolv.conf, and `podman network disconnect` command will do
the opposite, removing the network's nameservers from
`/etc/resolv.conf`.
Fixes#9603
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
Append the podman dns seach domain to the host search domains when we
use the dnsname/aardvark server. Previously it would only use podman
seach domains and discard the host domains.
Fixes#13103
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
these mount flags are already used for the /dev/shm mount on the host,
but they are not set for the bind mount itself.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Often users want their overlayed volumes to be `non-volatile` in nature
that means that same `upper` dir can be re-used by one or more
containers but overall of nature of volumes still have to be `overlay`
so work done is still on a overlay not on the actual volume.
Following PR adds support for more advanced options i.e custom `workdir`
and `upperdir` for overlayed volumes. So that users can re-use `workdir`
and `upperdir` across new containers as well.
Usage
```console
$ podman run -it -v myvol:/data:O,upperdir=/path/persistant/upper,workdir=/path/persistant/work alpine sh
```
Signed-off-by: Aditya R <arajan@redhat.com>
This change updates the CDI API to commit 46367ec063fda9da931d050b308ccd768e824364
which addresses some inconistencies in the previous implementation.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
The libpod/network packages were moved to c/common so that buildah can
use it as well. To prevent duplication use it in podman as well and
remove it from here.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
For ip/macvlan networks we cannot use the gateway as address for this
hostname. In this case the gateway is normally not on the host so we
just try to use a local ip instead.
[NO NEW TESTS NEEDED] We cannot run macvlan networks in CI.
Fixes#11351
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Some containers require certain user account(s) to exist within the
container when they are run. This option will allow callers to add a
bunch of passwd entries from the host to the container even if the
entries are not in the local /etc/passwd file on the host.
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1935831
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
When Podman is running a container in private IPC mode (default), it
creates a bind mount for /dev/shm that is then attached to a tmpfs
folder on the host file system. However, checkpointing a container has
the side-effect of stopping that container and unmount the tmpfs used
for /dev/shm. As a result, after checkpoint all files stored in the
container's /dev/shm would be lost and the container might fail to
restore from checkpoint.
To address this problem, this patch creates a tar file with the
content of /dev/shm that is included in the container checkpoint and
used to restore the container.
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
This ensures that existing containers will still manage
`/etc/passwd` by default, as they have been doing until now. New
containers that explicitly set `false` will still have passwd
management disabled, but otherwise the code will run.
[NO NEW TESTS NEEDED] This will only be caught on upgrade and I
don't really know how to write update tests - and Ed is on PTO.
Signed-off-by: Matthew Heon <mheon@redhat.com>
added support for a new flag --passwd which, when false prohibits podman from creating entries in
/etc/passwd and /etc/groups allowing users to modify those files in the container entrypoint
resolves#11805
Signed-off-by: cdoern <cdoern@redhat.com>
Add first non localhost ipv4 of all host interfaces as destination
for host.contaners.internal for rootless containers.
Fixes: https://github.com/containers/podman/issues/12000
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
This adds the following information to the output of 'podman inspect':
* CheckpointedAt - time the container was checkpointed
Only set if the container has been checkpointed
* RestoredAt - time the container was restored
Only set if the container has been restored
* CheckpointLog - path to the checkpoint log file (CRIU's dump.log)
Only set if the log file exists (--keep)
* RestoreLog - path to the restore log file (CRIU's restore.log)
Only set if the log file exists (--keep)
* CheckpointPath - path to the actual (CRIU) checkpoint files
Only set if the checkpoint files exists (--keep)
* Restored - set to true if the container has been restored
Only set if the container has been restored
Signed-off-by: Adrian Reber <areber@redhat.com>
The new network db structure stores everything in the networks bucket.
Previously some network settings were not written the the network bucket
and only stored in the container config.
Instead of the old format which used the container ID as value in the
networks buckets we now use the PerNetworkoptions struct there.
To migrate existing users we use the state.GetNetworks() function. If it
fails to read the new format it will automatically migrate the old
config format to the new one. This is allows a flawless migration path.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
While trying to match permissions of target directory podman adds
extra `0111` which should not be needed if target path does not have
execute permission.
Signed-off-by: Aditya Rajan <arajan@redhat.com>
improve the heuristic to detect the scope that was created for the container.
This is necessary with systemd running as PID 1, since it moves itself
to a different sub-cgroup, thus stats would not account for other
processes in the same container.
Closes: https://github.com/containers/podman/issues/12400
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Some field names are confusing. Change them so that they make more sense
to the reader.
Since these fields are only in the main branch we can safely rename them
without worrying about backwards compatibility.
Note we have to change the field names in netavark too.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Podman adds a few environment variables by default, and
currently there is no way to get rid of them from your container.
This option will allow you to specify which defaults you don't
want.
--unsetenv-all will remove all default environment variables.
Default environment variables can come from podman builtin,
containers.conf or from the container image.
Fixes: https://github.com/containers/podman/issues/11836
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
Honor custom `target` if specified while running or creating containers
with secret `type=mount`.
Example:
`podman run -it --secret token,type=mount,target=TOKEN ubi8/ubi:latest
bash`
Signed-off-by: Aditya Rajan <arajan@redhat.com>
This adds the parameter '--print-stats' to 'podman container restore'.
With '--print-stats' Podman will measure how long Podman itself, the OCI
runtime and CRIU requires to restore a checkpoint and print out these
information. CRIU already creates process restore statistics which are
just read in addition to the added measurements. In contrast to just
printing out the ID of the restored container, Podman will now print
out JSON:
# podman container restore --latest --print-stats
{
"podman_restore_duration": 305871,
"container_statistics": [
{
"Id": "47b02e1d474b5d5fe917825e91ac653efa757c91e5a81a368d771a78f6b5ed20",
"runtime_restore_duration": 140614,
"criu_statistics": {
"forking_time": 5,
"restore_time": 67672,
"pages_restored": 14
}
}
]
}
The output contains 'podman_restore_duration' which contains the
number of microseconds Podman required to restore the checkpoint. The
output also includes 'runtime_restore_duration' which is the time
the runtime needed to restore that specific container. Each container
also includes 'criu_statistics' which displays the timing information
collected by CRIU.
Signed-off-by: Adrian Reber <areber@redhat.com>
This adds the parameter '--print-stats' to 'podman container checkpoint'.
With '--print-stats' Podman will measure how long Podman itself, the OCI
runtime and CRIU requires to create a checkpoint and print out these
information. CRIU already creates checkpointing statistics which are
just read in addition to the added measurements. In contrast to just
printing out the ID of the checkpointed container, Podman will now print
out JSON:
# podman container checkpoint --latest --print-stats
{
"podman_checkpoint_duration": 360749,
"container_statistics": [
{
"Id": "25244244bf2efbef30fb6857ddea8cb2e5489f07eb6659e20dda117f0c466808",
"runtime_checkpoint_duration": 177222,
"criu_statistics": {
"freezing_time": 100657,
"frozen_time": 60700,
"memdump_time": 8162,
"memwrite_time": 4224,
"pages_scanned": 20561,
"pages_written": 2129
}
}
]
}
The output contains 'podman_checkpoint_duration' which contains the
number of microseconds Podman required to create the checkpoint. The
output also includes 'runtime_checkpoint_duration' which is the time
the runtime needed to checkpoint that specific container. Each container
also includes 'criu_statistics' which displays the timing information
collected by CRIU.
Signed-off-by: Adrian Reber <areber@redhat.com>
A rootless container created with a custom userns and forwarded ports
did not work. I refactored the network setup to make the setup logic
more clear.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
There was the question about how long it takes to create a checkpoint.
CRIU already provides some statistics about how long it takes to create
a checkpoint and similar.
With this change the file 'stats-dump' is included in the checkpoint
archive and the tool checkpointctl can be used to display these
statistics:
./checkpointctl show -t /tmp/cp.tar --print-stats
Displaying container checkpoint data from /tmp/dump.tar
[...]
CRIU dump statistics
+---------------+-------------+--------------+---------------+---------------+---------------+
| FREEZING TIME | FROZEN TIME | MEMDUMP TIME | MEMWRITE TIME | PAGES SCANNED | PAGES WRITTEN |
+---------------+-------------+--------------+---------------+---------------+---------------+
| 105405 us | 1376964 us | 504399 us | 446571 us | 492153 | 88689 |
+---------------+-------------+--------------+---------------+---------------+---------------+
Signed-off-by: Adrian Reber <areber@redhat.com>
A restored container still had the state set to 'Checkpointed: true'
which seems wrong if it running again.
[NO NEW TESTS NEEDED]
Signed-off-by: Adrian Reber <areber@redhat.com>
This will change mount of /dev within container to noexec, making
containers slightly more secure.
[NO NEW TESTS NEEDED]
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
Checkpoint is blowing up when you use --log-driver=none
[NO NEW TESTS NEEDED] No way currently to test checkpoint restore.
Fixes: https://github.com/containers/podman/issues/11974
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
This matches what docker does. Also make sure the net aliases are also
shown when the container is stopped.
docker-compose uses this special alias entry to check if it is already
correctly connected to the network. [1]
Because we do not support static ips on network connect at the moment
calling disconnect && connect will loose the static ip.
Fixes#11748
[1] 0bea52b18d/compose/service.py (L663-L667)
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Following PR allows containers to create and mount overlays on top of
named volumes instead of mounting actual volumes via already documented `:O`.
Signed-off-by: Aditya Rajan <arajan@redhat.com>
Let the gvproxy dns server handle the host.containers.internal entry.
Support for this is already added to gvproxy. [1]
To make sure the container uses the dns response from gvproxy we should
not add host.containers.internal to /etc/hosts in this case.
[NO TESTS NEEDED] podman machine has no tests :/
Fixes#11642
[1] 1108ea4516
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
The check for net=none was wrong. It just assumed when we do not create
the netns but have one set that we use the none mode. This however also
applies to a container which joins the pod netns.
To correctly check for the none mode use `config.NetMode.IsNone()`.
Fixes#11596
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Make use of the new network interface in libpod.
This commit contains several breaking changes:
- podman network create only outputs the new network name and not file
path.
- podman network ls shows the network driver instead of the cni version
and plugins.
- podman network inspect outputs the new network struct and not the cni
conflist.
- The bindings and libpod api endpoints have been changed to use the new
network structure.
The container network status is stored in a new field in the state. The
status should be received with the new `c.getNetworkStatus`. This will
migrate the old status to the new format. Therefore old containers should
contine to work correctly in all cases even when network connect/
disconnect is used.
New features:
- podman network reload keeps the ip and mac for more than one network.
- podman container restore keeps the ip and mac for more than one
network.
- The network create compat endpoint can now use more than one ipam
config.
The man pages and the swagger doc are updated to reflect the latest
changes.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>