Add the new networks format to specgen. For api users cni_networks is
still supported to make migration easier however the static ip and mac
fields are removed.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Make sure we create new containers in the db with the correct structure.
Also remove some unneeded code for alias handling. We no longer need this
functions.
The specgen format has not been changed for now.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Create a custom writer which logs the netavark output to logrus. This
will log to the syslog when it is enabled.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
There is a problem with creating and storing the exit command when the
container was created. It only contains the options the container was
created with but NOT the options the container is started with. One
example would be a CNI network config. If I start a container once, then
change the cni config dir with `--cni-config-dir` ans start it a second
time it will start successfully. However the exit command still contains
the wrong `--cni-config-dir` because it was not updated.
To fix this we do not want to store the exit command at all. Instead we
create it every time the conmon process for the container is startet.
This guarantees us that the container cleanup process is startet with
the correct settings.
[NO NEW TESTS NEEDED]
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
To make testing easier we can overwrite the network backend with the
global `--network-backend` option.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Create a new mac address type which supports json marshal/unmarshal from
and to string. This change is backwards compatible with the previous
versions as the unmarshal method still accepts the old byte array or
base64 encoded string.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
The OCICNI port format has one big problem: It does not support ranges.
So if a users forwards a range of 1k ports with podman run -p 1001-2000
we have to store each of the thousand ports individually as array element.
This bloats the db and makes the JSON encoding and decoding much slower.
In many places we already use a better port struct type which supports
ranges, e.g. `pkg/specgen` or the new network interface.
Because of this we have to do many runtime conversions between the two
port formats. If everything uses the new format we can skip the runtime
conversions.
This commit adds logic to replace all occurrences of the old format
with the new one. The database will automatically migrate the ports
to new format when the container config is read for the first time
after the update.
The `ParsePortMapping` function is `pkg/specgen/generate` has been
reworked to better work with the new format. The new logic is able
to deduplicate the given ports. This is necessary the ensure we
store them efficiently in the DB. The new code should also be more
performant than the old one.
To prove that the code is fast enough I added go benchmarks. Parsing
1 million ports took less than 0.5 seconds on my laptop.
Benchmark normalize PortMappings in specgen:
Please note that the 1 million ports are actually 20x 50k ranges
because we cannot have bigger ranges than 65535 ports.
```
$ go test -bench=. -benchmem ./pkg/specgen/generate/
goos: linux
goarch: amd64
pkg: github.com/containers/podman/v3/pkg/specgen/generate
cpu: Intel(R) Core(TM) i7-10850H CPU @ 2.70GHz
BenchmarkParsePortMappingNoPorts-12 480821532 2.230 ns/op 0 B/op 0 allocs/op
BenchmarkParsePortMapping1-12 38972 30183 ns/op 131584 B/op 9 allocs/op
BenchmarkParsePortMapping100-12 18752 60688 ns/op 141088 B/op 315 allocs/op
BenchmarkParsePortMapping1k-12 3104 331719 ns/op 223840 B/op 3018 allocs/op
BenchmarkParsePortMapping10k-12 376 3122930 ns/op 1223650 B/op 30027 allocs/op
BenchmarkParsePortMapping1m-12 3 390869926 ns/op 124593840 B/op 4000624 allocs/op
BenchmarkParsePortMappingReverse100-12 18940 63414 ns/op 141088 B/op 315 allocs/op
BenchmarkParsePortMappingReverse1k-12 3015 362500 ns/op 223841 B/op 3018 allocs/op
BenchmarkParsePortMappingReverse10k-12 343 3318135 ns/op 1223650 B/op 30027 allocs/op
BenchmarkParsePortMappingReverse1m-12 3 403392469 ns/op 124593840 B/op 4000624 allocs/op
BenchmarkParsePortMappingRange1-12 37635 28756 ns/op 131584 B/op 9 allocs/op
BenchmarkParsePortMappingRange100-12 39604 28935 ns/op 131584 B/op 9 allocs/op
BenchmarkParsePortMappingRange1k-12 38384 29921 ns/op 131584 B/op 9 allocs/op
BenchmarkParsePortMappingRange10k-12 29479 40381 ns/op 131584 B/op 9 allocs/op
BenchmarkParsePortMappingRange1m-12 927 1279369 ns/op 143022 B/op 164 allocs/op
PASS
ok github.com/containers/podman/v3/pkg/specgen/generate 25.492s
```
Benchmark convert old port format to new one:
```
go test -bench=. -benchmem ./libpod/
goos: linux
goarch: amd64
pkg: github.com/containers/podman/v3/libpod
cpu: Intel(R) Core(TM) i7-10850H CPU @ 2.70GHz
Benchmark_ocicniPortsToNetTypesPortsNoPorts-12 663526126 1.663 ns/op 0 B/op 0 allocs/op
Benchmark_ocicniPortsToNetTypesPorts1-12 7858082 141.9 ns/op 72 B/op 2 allocs/op
Benchmark_ocicniPortsToNetTypesPorts10-12 2065347 571.0 ns/op 536 B/op 4 allocs/op
Benchmark_ocicniPortsToNetTypesPorts100-12 138478 8641 ns/op 4216 B/op 4 allocs/op
Benchmark_ocicniPortsToNetTypesPorts1k-12 9414 120964 ns/op 41080 B/op 4 allocs/op
Benchmark_ocicniPortsToNetTypesPorts10k-12 781 1490526 ns/op 401528 B/op 4 allocs/op
Benchmark_ocicniPortsToNetTypesPorts1m-12 4 250579010 ns/op 40001656 B/op 4 allocs/op
PASS
ok github.com/containers/podman/v3/libpod 11.727s
```
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
This will change mount of /dev within container to noexec, making
containers slightly more secure.
[NO NEW TESTS NEEDED]
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
it allows to pass the current std streams down to the container.
conmon support: https://github.com/containers/conmon/pull/289
[NO TESTS NEEDED] it needs a new conmon.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
added support for pod devices. The device gets added to the infra container and
recreated in all containers that join the pod.
This required a new container config item to keep track of the original device passed in by the user before
the path was parsed into the container device.
Signed-off-by: cdoern <cdoern@redhat.com>
We do not use the ocicni code anymore so let's get rid of it. Only the
port struct is used but we can copy this into libpod network types so
we can debloat the binary.
The next step is to remove the OCICNI port mapping form the container
config and use the better PortMapping struct everywhere.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Allows users to specify a readonly rootfs with :O, in exchange podman will create a writable overlay.
bump builah to v1.22.1-0.20210823173221-da2b428c56ce
[NO TESTS NEEDED]
Signed-off-by: flouthoc <flouthoc.git@gmail.com>
InfraContainer should go through the same creation process as regular containers. This change was from the cmd level
down, involving new container CLI opts and specgen creating functions. What now happens is that both container and pod
cli options are populated in cmd and used to create a podSpecgen and a containerSpecgen. The process then goes as follows
FillOutSpecGen (infra) -> MapSpec (podOpts -> infraOpts) -> PodCreate -> MakePod -> createPodOptions -> NewPod -> CompleteSpec (infra) -> MakeContainer -> NewContainer -> newContainer -> AddInfra (to pod state)
Signed-off-by: cdoern <cdoern@redhat.com>
Podman inspect has to show exposed ports to match docker. This requires
storing the exposed ports in the container config.
A exposed port is shown as `"80/tcp": null` while a forwarded port is
shown as `"80/tcp": [{"HostIp": "", "HostPort": "8080" }]`.
Also make sure to add the exposed ports to the new image when the
container is commited.
Fixes#10777
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Add the --userns flag to podman pod create and keep
track of the userns setting that pod was created with
so that all containers created within the pod will inherit
that userns setting.
Specifically we need to be able to launch a pod with
--userns=keep-id
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
Signed-off-by: Urvashi Mohnani <umohnani@redhat.com>
this is the first pass at implementing init containers for podman pods.
init containersare made popular by k8s as a way to run setup for pods
before the pods standard containers run.
unlike k8s, we support two styles of init containers: always and
oneshot. always means the container stays in the pod and starts
whenever a pod is started. this does not apply to pods restarting.
oneshot means the container runs onetime when the pod starts and then is
removed.
Signed-off-by: Brent Baude <bbaude@redhat.com>
Adds the new --infra-name command line argument allowing users to define
the name of the infra container
Issue #10794
Signed-off-by: José Guilherme Vanz <jvanz@jvanz.com>
added support for --pid flag. User can specify ns:file, pod, private, or host.
container returns an error since you cannot point the ns of the pods infra container
to a container outside of the pod.
Signed-off-by: cdoern <cdoern@redhat.com>
Added logic and handling for two new Podman pod create Flags.
--cpus specifies the total number of cores on which the pod can execute, this
is a combination of the period and quota for the CPU.
--cpuset-cpus is a string value which determines of these available cores,
how many we will truly execute on.
Signed-off-by: cdoern <cbdoer23@g.holycross.edu>
Users are complaining about read/only /var/tmp failing
even if TMPDIR=/tmp is set.
This PR Fixes: https://github.com/containers/podman/issues/10698
[NO TESTS NEEDED] No way to test this.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
Podman uses the volume option map to check if it has to mount the volume
or not when the container is started. Commit 28138dafcc added to uid
and gid options to this map, however when only uid/gid is set we cannot
mount this volume because there is no filesystem or device specified.
Make sure we do not try to mount the volume when only the uid/gid option
is set since this is a simple chown operation.
Also when a uid/gid is explicity set, do not chown the volume based on
the container user when the volume is used for the first time.
Fixes#10620
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Support UID, GID, Mode options for mount type secrets. Also, change
default secret permissions to 444 so all users can read secret.
Signed-off-by: Ashley Cui <acui@redhat.com>
Env var secrets are env vars that are set inside the container but not
commited to and image. Also support reading from env var when creating a
secret.
Signed-off-by: Ashley Cui <acui@redhat.com>
volatile containers are a storage optimization that disables *sync()
syscalls for the container rootfs.
If a container is created with --rm, then automatically set the
volatile storage flag as anyway the container won't persist after a
reboot or machine crash.
[NO TESTS NEEDED]
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
This option allows users to specify the maximum amount of time to run
before conmon sends the kill signal to the container.
Fixes: https://github.com/containers/podman/issues/6412
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
- Persist CDIDevices in container config
- Add e2e test
- Log HasDevice error and add additional condition for safety
Signed-off-by: Sebastian Jug <seb@stianj.ug>
We define in the man page that this overrides the default storage
options, but the code was appending to the existing options.
This PR also makes a change to allow users to specify --storage-opt="".
This will turn off all storage options.
https://github.com/containers/podman/issues/9852
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
As part of a fix for an earlier bug (#5698) we added the ability
for Podman to chown volumes to correctly match the user running
in the container, even in adverse circumstances (where we don't
know the right UID/GID until very late in the process). However,
we only did this for volumes created automatically by a
`podman run` or `podman create`. Volumes made by
`podman volume create` do not get this chown, so their
permissions may not be correct. I've looked, and I don't think
there's a good reason not to do this chwon for all volumes the
first time the container is started.
I would prefer to do this as part of volume copy-up, but I don't
think that's really possible (copy-up happens earlier in the
process and we don't have a spec). There is a small chance, as
things stand, that a copy-up happens for one container and then
a chown for a second, unrelated container, but the odds of this
are astronomically small (we'd need a very close race between two
starting containers).
Fixes#9608
Signed-off-by: Matthew Heon <mheon@redhat.com>
Some packages used by the remote client imported the libpod package.
This is not wanted because it adds unnecessary bloat to the client and
also causes problems with platform specific code(linux only), see #9710.
The solution is to move the used functions/variables into extra packages
which do not import libpod.
This change shrinks the remote client size more than 6MB compared to the
current master.
[NO TESTS NEEDED]
I have no idea how to test this properly but with #9710 the cross
compile should fail.
Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
if --storage-opt are specified on the CLI append them after what is
specified in the configuration files instead of overriding it.
Closes: https://github.com/containers/podman/issues/9657
[NO TESTS NEEDED]
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
We missed bumping the go module, so let's do it now :)
* Automated go code with github.com/sirkon/go-imports-rename
* Manually via `vgrep podman/v2` the rest
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
Implement podman secret create, inspect, ls, rm
Implement podman run/create --secret
Secrets are blobs of data that are sensitive.
Currently, the only secret driver supported is filedriver, which means creating a secret stores it in base64 unencrypted in a file.
After creating a secret, a user can use the --secret flag to expose the secret inside the container at /run/secrets/[secretname]
This secret will not be commited to an image on a podman commit
Signed-off-by: Ashley Cui <acui@redhat.com>
We need an extra field in the pod infra container config. We may
want to reevaluate that struct at some point, as storing network
modes as bools will rapidly become unsustainable, but that's a
discussion for another time. Otherwise, straightforward plumbing.
Fixes#9165
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
This implements support for mounting and unmounting volumes
backed by volume plugins. Support for actually retrieving
plugins requires a pull request to land in containers.conf and
then that to be vendored, and as such is not yet ready. Given
this, this code is only compile tested. However, the code for
everything past retrieving the plugin has been written - there is
support for creating, removing, mounting, and unmounting volumes,
which should allow full functionality once the c/common PR is
merged.
A major change is the signature of the MountPoint function for
volumes, which now, by necessity, returns an error. Named volumes
managed by a plugin do not have a mountpoint we control; instead,
it is managed entirely by the plugin. As such, we need to cache
the path in the DB, and calls to retrieve it now need to access
the DB (and may fail as such).
Notably absent is support for SELinux relabelling and chowning
these volumes. Given that we don't manage the mountpoint for
these volumes, I am extremely reluctant to try and modify it - we
could easily break the plugin trying to chown or relabel it.
Also, we had no less than *5* separate implementations of
inspecting a volume floating around in pkg/infra/abi and
pkg/api/handlers/libpod. And none of them used volume.Inspect(),
the only correct way of inspecting volumes. Remove them all and
consolidate to using the correct way. Compat API is likely still
doing things the wrong way, but that is an issue for another day.
Fixes#4304
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
`staticcheck` is a golang code analysis tool. https://staticcheck.io/
This commit fixes a lot of problems found in our code. Common problems are:
- unnecessary use of fmt.Sprintf
- duplicated imports with different names
- unnecessary check that a key exists before a delete call
There are still a lot of reported problems in the test files but I have
not looked at those.
Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
Systemd is now complaining or mentioning /var/run as a legacy directory.
It has been many years where /var/run is a symlink to /run on all
most distributions, make the change to the default.
Partial fix for https://github.com/containers/podman/issues/8369
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
The podman events aren't read until the given timestamp if the
timestamp is in the future. It just reads all events until now
and exits afterwards.
This does not make sense and does not match docker. The correct
behavior is to read all events until the given time is reached.
This fixes a bug where the wrong event log file path was used
when running first time with a new storage location.
Fixes#8694
This also fixes the events api endpoint which only exited when
an error occurred. Otherwise it just hung after reading all events.
Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
Docker provides extensibility through a plugin system, of which
several types are available. This provides an initial library API
for communicating with one type of plugins, volume plugins.
Volume plugins allow for an external service to create and manage
a volume on Podman's behalf.
This does not integrate the plugin system into Libpod or Podman
yet; that will come in subsequent pull requests.
Signed-off-by: Matthew Heon <mheon@redhat.com>
Most of the builtin golang functions like os.Stat and
os.Open report errors including the file system object
path. We should not wrap these errors and put the file path
in a second time, causing stuttering of errors when they
get presented to the user.
This patch tries to cleanup a bunch of these errors.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
Add a new "image" mount type to `--mount`. The source of the mount is
the name or ID of an image. The destination is the path inside the
container. Image mounts further support an optional `rw,readwrite`
parameter which if set to "true" will yield the mount writable inside
the container. Note that no changes are propagated to the image mount
on the host (which in any case is read only).
Mounts are overlay mounts. To support read-only overlay mounts, vendor
a non-release version of Buildah.
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
This adds the database backend for network aliases. Aliases are
additional names for a container that are used with the CNI
dnsname plugin - the container will be accessible by these names
in addition to its name. Aliases are allowed to change over time
as the container connects to and disconnects from networks.
Aliases are implemented as another bucket in the database to
register all aliases, plus two buckets for each container (one to
hold connected CNI networks, a second to hold its aliases). The
aliases are only unique per-network, to the global and
per-container aliases buckets have a sub-bucket for each CNI
network that has aliases, and the aliases are stored within that
sub-bucket. Aliases are formatted as alias (key) to container ID
(value) in both cases.
Three DB functions are defined for aliases: retrieving current
aliases for a given network, setting aliases for a given network,
and removing all aliases for a given network.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
Docker supports log-opt max_size and so does conmon (ALthough poorly).
Adding support for this allows users to at least make sure their containers
logs do not become a DOS vector.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
Currently infr-command and --infra-image commands are ignored
from the user. This PR instruments them and adds tests for
each combination.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
when joining an existing container user namespace, read the existing
mappings so the storage can be created with the correct ownership.
Closes: https://github.com/containers/podman/issues/7547
Signed-off-by: Giuseppe Scrivano <giuseppe@scrivano.org>
Most Libpod containers are made via `pkg/specgen/generate` which
includes code to generate an appropriate exit command which will
handle unmounting the container's storage, cleaning up the
container's network, etc. There is one notable exception: pod
infra containers, which are made entirely within Libpod and do
not touch pkg/specgen. As such, no cleanup process, network never
cleaned up, bad things can happen.
There is good news, though - it's not that difficult to add this,
and it's done in this PR. Generally speaking, we don't allow
passing options directly to the infra container at create time,
but we do (optionally) proxy a pre-approved set of options into
it when we create it. Add ExitCommand to these options, and set
it at time of pod creation using the same code we use to generate
exit commands for normal containers.
Fixes#7103
Signed-off-by: Matthew Heon <mheon@redhat.com>
A recent crun change stopped the creation of the container's
working directory if it does not exist. This is arguably correct
for user-specified directories, to protect against typos; it is
definitely not correct for image WORKDIR, where the image author
definitely intended for the directory to be used.
This makes Podman create the working directory and chown it to
container root, if it does not already exist, and only if it was
specified by an image, not the user.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
--umask sets the umask inside the container
Defaults to 0022
Co-authored-by: Daniel J Walsh <dwalsh@redhat.com>
Signed-off-by: Ashley Cui <acui@redhat.com>
In `podman inspect` output for containers and pods, we include
the command that was used to create the container. This is also
used by `podman generate systemd --new` to generate unit files.
With remote podman, the generated create commands were incorrect
since we sourced directly from os.Args on the server side, which
was guaranteed to be `podman system service` (or some variant
thereof). The solution is to pass the command along in the
Specgen or PodSpecgen, where we can source it from the client's
os.Args.
This will still be VERY iffy for mixed local/remote use (doing a
`podman --remote run ...` on a remote client then a
`podman generate systemd --new` on the server on the same
container will not work, because the `--remote` flag will slip
in) but at the very least the output of `podman inspect` will be
correct. We can look into properly handling `--remote` (parsing
it out would be a little iffy) in a future PR.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
If I enter a continer with --userns keep-id, my UID will be present
inside of the container, but most likely my user will not be defined.
This patch will take information about the user and stick it into the
container.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
--sdnotify container|conmon|ignore
With "conmon", we send the MAINPID, and clear the NOTIFY_SOCKET so the OCI
runtime doesn't pass it into the container. We also advertise "ready" when the
OCI runtime finishes to advertise the service as ready.
With "container", we send the MAINPID, and leave the NOTIFY_SOCKET so the OCI
runtime passes it into the container for initialization, and let the container advertise further metadata.
This is the default, which is closest to the behavior podman has done in the past.
The "ignore" option removes NOTIFY_SOCKET from the environment, so neither podman nor
any child processes will talk to systemd.
This removes the need for hardcoded CID and PID files in the command line, and
the PIDFile directive, as the pid is advertised directly through sd-notify.
Signed-off-by: Joseph Gooch <mrwizard@dok.org>
With the advent of Podman 2.0.0 we crossed the magical barrier of go
modules. While we were able to continue importing all packages inside
of the project, the project could not be vendored anymore from the
outside.
Move the go module to new major version and change all imports to
`github.com/containers/libpod/v2`. The renaming of the imports
was done via `gomove` [1].
[1] https://github.com/KSubedi/gomove
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
--tz flag sets timezone inside container
Can be set to IANA timezone as well as `local` to match host machine
Signed-off-by: Ashley Cui <acui@redhat.com>
move the chown for newly created volumes after the spec generation so
the correct UID/GID are known.
Closes: https://github.com/containers/libpod/issues/5698
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
When running under systemd there is no need to create yet another
cgroup for the container.
With conmon-delegated the current cgroup will be split in two sub
cgroups:
- supervisor
- container
The supervisor cgroup will hold conmon and the podman process, while
the container cgroup is used by the OCI runtime (using the cgroupfs
backend).
Closes: https://github.com/containers/libpod/issues/6400
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
We initially believed that implementing this required support for
restarting containers after reboot, but this is not the case.
The unless-stopped restart policy acts identically to the always
restart policy except in cases related to reboot (which we do not
support yet), but it does not require that support for us to
implement it.
Changes themselves are quite simple, we need a new restart policy
constant, we need to remove existing checks that block creation
of containers when unless-stopped was used, and we need to update
the manpages.
Fixes#6508
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
When the container uses journald logging, we don't want to
automatically use the same driver for its exec sessions. If we do
we will pollute the journal (particularly in the case of
healthchecks) with large amounts of undesired logs. Instead,
force exec sessions logs to file for now; we can add a log-driver
flag later (we'll probably want to add a `podman logs` command
that reads exec session logs at the same time).
As part of this, add support for the new 'none' logs driver in
Conmon. It will be the default log driver for exec sessions, and
can be optionally selected for containers.
Great thanks to Joe Gooch (mrwizard@dok.org) for adding support
to Conmon for a null log driver, and wiring it in here.
Fixes#6555
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
Add an `--infra-conmon-pidfile` flag to `podman-pod-create` to write the
infra container's conmon process ID to a specified path. Several
container sub-commands already support `--conmon-pidfile` which is
especially helpful to allow for systemd to access and track the conmon
processes. This allows for easily tracking the conmon process of a
pod's infra container.
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
Add a `CreateCommand` field to the pod config which includes the entire
`os.Args` at pod-creation. Similar to the already existing field in a
container config, we need this information to properly generate generic
systemd unit files for pods. It's a prerequisite to support the `--new`
flag for pods.
Also add the `CreateCommand` to the pod-inspect data, which can come in
handy for debugging, general inspection and certainly for the tests that
are added along with the other changes.
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
* Support the `X-Registry-Auth` http-request header.
* The content of the header is a base64 encoded JSON payload which can
either be a single auth config or a map of auth configs (user+pw or
token) with the corresponding registries being the keys. Vanilla
Docker, projectatomic Docker and the bindings are transparantly
supported.
* Add a hidden `--registries-conf` flag. Buildah exposes the same
flag, mostly for testing purposes.
* Do all credential parsing in the client (i.e., `cmd/podman`) pass
the username and password in the backend instead of unparsed
credentials.
* Add a `pkg/auth` which handles most of the heavy lifting.
* Go through the authentication-handling code of most commands, bindings
and endpoints. Migrate them to the new code and fix issues as seen.
A final evaluation and more tests is still required *after* this
change.
* The manifest-push endpoint is missing certain parameters and should
use the ABI function instead. Adding auth-support isn't really
possible without these parts working.
* The container commands and endpoints (i.e., create and run) have not
been changed yet. The APIs don't yet account for the authfile.
* Add authentication tests to `pkg/bindings`.
Fixes: #6384
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
By moving a couple of variables from libpod/libpod to libpod/libpod/define
I am able shrink the podman-remote-* executables by another megabyte.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
This one was a massive pain to track down.
The original symptom was an error message from rootless Podman
trying to make a container in a pod. I unfortunately did not look
at the error message closely enough to realize that the namespace
in question was the cgroup namespace (the reproducer pod was
explicitly set to only share the network namespace), else this
would have been quite a bit shorter.
I spent considerable effort trying to track down differences
between the inspect output of the two containers, and when that
failed I was forced to resort to diffing the OCI specs. That
finally proved fruitful, and I was able to determine what should
have been obvious all along: the container was joining the cgroup
namespace of the infra container when it really ought not to
have.
From there, I discovered a variable collision in pod config. The
UsePodCgroup variable means "create a parent cgroup for the pod
and join containers in the pod to it". Unfortunately, it is very
similar to UsePodUTS, UsePodNet, etc, which mean "the pod shares
this namespace", so an accessor was accidentally added for it
that indicated the pod shared the cgroup namespace when it really
did not. Once I realized that, it was a quick fix - add a bool to
the pod's configuration to indicate whether the cgroup ns was
shared (distinct from UsePodCgroup) and use that for the
accessor.
Also included are fixes for `podman inspect` and
`podman pod inspect` that fix them to actually display the state
of the cgroup namespace (for container inspect) and what
namespaces are shared (for pod inspect). Either of those would
have made tracking this down considerably quicker.
Fixes#6149
Signed-off-by: Matthew Heon <mheon@redhat.com>
enabled integration tests for volumes. there are two exceptions that still need work because of something not yet implemented.
also, add code to deal with the fact that containers conf appears to set a local volume driver where it used to be simply blank.
Signed-off-by: Brent Baude <bbaude@redhat.com>
Instead of getting mount options from /proc/self/mountinfo, which is
very costly to read/parse (and can even be unreliable), let's use
statfs(2) to figure out the flags we need.
[v2: move getting default options to pkg/util, make it linux-specific]
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Also adds some basic tests for these two. More tests are needed
but will have to wait for state to be finished.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
Add support to auto-update containers running in systemd units as
generated with `podman generate systemd --new`.
`podman auto-update` looks up containers with a specified
"io.containers.autoupdate" label (i.e., the auto-update policy).
If the label is present and set to "image", Podman reaches out to the
corresponding registry to check if the image has been updated. We
consider an image to be updated if the digest in the local storage is
different than the one of the remote image. If an image must be
updated, Podman pulls it down and restarts the container. Note that the
restarting sequence relies on systemd.
At container-creation time, Podman looks up the "PODMAN_SYSTEMD_UNIT"
environment variables and stores it verbatim in the container's label.
This variable is now set by all systemd units generated by
`podman-generate-systemd` and is set to `%n` (i.e., the name of systemd
unit starting the container). This data is then being used in the
auto-update sequence to instruct systemd (via DBUS) to restart the unit
and hence to restart the container.
Note that this implementation of auto-updates relies on systemd and
requires a fully-qualified image reference to be used to create the
container. This enforcement is necessary to know which image to
actually check and pull. If we used an image ID, we would not know
which image to check/pull anymore.
Fixes: #3575
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
Until now, we've been validating every part of container
configuration through the With... functions that set the options.
This if fine when we are just validating the options to an
individual function, but things get complicated once we need to
validate conflicts between different options. We don't know the
order in which things were passed, so we need the validation on
both of the potential options that can conflict, resulting in
significant code duplication. To solve this, add a validate()
function for containers, and use this to check whether everything
is in a good state.
We can probably move more into this function (there are other
parts of container creation that also do validation of a sort)
but this is a good start to simplifying our options.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
Before Libpod supported named volumes, we approximated image
volumes by bind-mounting in per-container temporary directories.
This was handled by Libpod, and had a corresponding database
entry to enable/disable it.
However, when we enabled named volumes, we completely rewrote the
old implementation; none of the old bind mount implementation
still exists, save one flag in the database. With nothing
remaining to use it, it has no further purpose.
Signed-off-by: Matthew Heon <mheon@redhat.com>
Enables most of the network-related functionality from
`podman run` in `podman pod create`. Custom CNI networks can be
specified, host networking is supported, DNS options can be
configured.
Also enables host networking in `podman play kube`.
Fixes#2808Fixes#3837Fixes#4432Fixes#4718Fixes#4770
Signed-off-by: Matthew Heon <matthew.heon@pm.me>