Commit Graph

2992 Commits

Author SHA1 Message Date
OpenShift Merge Robot 5432bb95f1
Merge pull request #12174 from fgimenez/fix-docker-networksettings-type-discrepancy
Introduces Address type to be used in secondary IPv4 and IPv6 inspect data structure
2021-11-19 13:57:13 +01:00
OpenShift Merge Robot 319d3fba6d
Merge pull request #12354 from Luap99/exit-command
Do not store the exit command in container config
2021-11-18 23:51:12 +01:00
OpenShift Merge Robot 348aafeb1b
Merge pull request #12348 from Luap99/rootless-netns
rootless netns, one netns per libpod tmp dir
2021-11-18 21:59:13 +01:00
Paul Holzinger 0dae50f1d3
Do not store the exit command in container config
There is a problem with creating and storing the exit command when the
container was created. It only contains the options the container was
created with but NOT the options the container is started with. One
example would be a CNI network config. If I start a container once, then
change the cni config dir with `--cni-config-dir` and start it a second
time it will start successfully. However the exit command still contains
the wrong `--cni-config-dir` because it was not updated.

To fix this we do not want to store the exit command at all. Instead we
create it every time the conmon process for the container is started.
This guarantees us that the container cleanup process is started with
the correct settings.
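
A minimal sketch of the idea, not the actual libpod code: the cleanup
command is derived from the options in effect at start time instead of
being read back from the stored container config. The type
`runtimeOptions`, the function `exitCommand` and the exact argument
layout are assumptions made for illustration; only `--cni-config-dir`
comes from the message above.

```go
package main

import "fmt"

// runtimeOptions is a hypothetical stand-in for the options the current
// podman invocation was started with.
type runtimeOptions struct {
	CNIConfigDir string
}

// exitCommand builds the cleanup command from the options in effect now,
// rather than from a copy saved when the container was created.
// The argument layout is illustrative only.
func exitCommand(opts runtimeOptions, containerID string) []string {
	cmd := []string{"podman"}
	if opts.CNIConfigDir != "" {
		cmd = append(cmd, "--cni-config-dir", opts.CNIConfigDir)
	}
	return append(cmd, "container", "cleanup", containerID)
}

func main() {
	// First start and second start use different config dirs; each start
	// gets a command that matches its own settings.
	fmt.Println(exitCommand(runtimeOptions{CNIConfigDir: "/etc/cni/net.d"}, "abc123"))
	fmt.Println(exitCommand(runtimeOptions{CNIConfigDir: "/tmp/other-cni"}, "abc123"))
}
```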

[NO NEW TESTS NEEDED]

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2021-11-18 20:28:03 +01:00
Radostin Stoyanov 6d23ea60d2
Add --file-locks checkpoint/restore option
CRIU supports checkpoint/restore of file locks. This feature is
required to checkpoint/restore containers running applications
such as MySQL.

Signed-off-by: Radostin Stoyanov <radostin@redhat.com>
2021-11-18 19:23:25 +00:00
Paul Holzinger 62d6b6bf74
rootless netns, one netns per libpod tmp dir
The netns cleanup code checks whether there are running containers; this
can fail if you run several libpod instances with different root/runroot.
To fix it we use one netns for each libpod instance. To prevent name
conflicts we use a hash from the static dir as part of the name.
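
The naming scheme can be sketched as follows. This is an illustration
under assumptions: the message only says "a hash from the static dir",
so the choice of SHA-256, the `rootless-netns-` prefix, the 10-character
truncation and the function name `netnsName` are all hypothetical.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// netnsName derives a per-instance netns name from the libpod static dir.
// Hash function, prefix and truncation length are assumptions.
func netnsName(staticDir string) string {
	sum := sha256.Sum256([]byte(staticDir))
	return "rootless-netns-" + hex.EncodeToString(sum[:])[:10]
}

func main() {
	// Two instances with different static dirs get different names,
	// while the same dir always maps to the same name.
	fmt.Println(netnsName("/home/user/.local/share/containers/storage/libpod"))
	fmt.Println(netnsName("/tmp/test-root/libpod"))
}
```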

Previously this worked because we would use the CNI files to check if
the netns was still in use, but this is no longer possible with netavark.

[NO NEW TESTS NEEDED]

Fixes #12306

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2021-11-18 17:34:06 +01:00
Federico Gimenez 2e5d3e8fb3 Introduce Address type to be used in secondary IPv4 and IPv6 inspect data
structure.

Resolves a discrepancy between the types used in inspect for docker and podman.
This causes a panic when using the docker client against podman when the
secondary IP fields in the `NetworkSettings` inspect field are populated.

Fixes containers#12165

Signed-off-by: Federico Gimenez <fgimenez@redhat.com>
2021-11-18 17:04:49 +01:00
Paul Holzinger 97c6403a1b
rename libpod nettypes fields
Some field names are confusing. Change them so that they make more sense
to the reader.
Since these fields are only in the main branch we can safely rename them
without worrying about backwards compatibility.
Note we have to change the field names in netavark too.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2021-11-16 19:19:04 +01:00
OpenShift Merge Robot f031bd23c6
Merge pull request #12100 from rhatdan/env
Add option --unsetenv to remove default environment variables
2021-11-16 16:27:34 +01:00
OpenShift Merge Robot 8430ffc72e
Merge pull request #12283 from Luap99/machine-ports
podman machine improve port forwarding
2021-11-16 14:53:40 +01:00
OpenShift Merge Robot be681ab518
Merge pull request #12294 from flouthoc/secret-mount-target
secret: honor custom `target=` for secrets with `type=mount` for ctr.
2021-11-16 01:45:27 +01:00
OpenShift Merge Robot 45d28c2219
Merge pull request #12285 from nalind/journal-follow-not-early
journald logs: keep reading until the journal's end
2021-11-15 22:09:29 +01:00
Daniel J Walsh 44d1618dd7
Add --unsetenv & --unsetenv-all to remove def environment variables
Podman adds a few environment variables by default, and
currently there is no way to get rid of them from your container.
This option will allow you to specify which defaults you don't
want.

--unsetenv-all will remove all default environment variables.

Default environment variables can come from podman builtin,
containers.conf or from the container image.

Fixes: https://github.com/containers/podman/issues/11836

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2021-11-15 15:10:12 -05:00
OpenShift Merge Robot 230f0b622e
Merge pull request #12130 from rhatdan/journal
Error logs --follow if events-backend != journald, event-logger=journald
2021-11-15 20:55:28 +01:00
Nalin Dahyabhai 63ef7135d9 journald logs: keep reading until the journal's end
When reading logs from the journal, keep going after the container
exits, in case it gets restarted.

Events logged to the journal via the normal paths don't include
CONTAINER_ID_FULL, so don't bother adding it to the "history" event we
use to force at least one entry for the container to show up in the log.

Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
2021-11-15 13:38:36 -05:00
Aditya Rajan 014cc4b9d9
secret: honor custom target for secrets with run
Honor custom `target` if specified while running or creating containers
with secret `type=mount`.

Example:
`podman run -it --secret token,type=mount,target=TOKEN ubi8/ubi:latest
bash`

Signed-off-by: Aditya Rajan <arajan@redhat.com>
2021-11-15 23:19:27 +05:30
Paul Holzinger 295d87bb0b
podman machine improve port forwarding
This commit adds port forwarding logic directly into podman. The
podman-machine cni plugin is no longer needed.

The following new features are supported:
 - works with cni, netavark and slirp4netns
 - ports can use the hostIP to bind instead of hard coding 0.0.0.0
 - gvproxy no longer listens on 0.0.0.0:7777 (requires a new gvproxy
   version)
 - support the udp protocol

With this we no longer need podman-machine-cni and should remove it from
the packaging. There is also a change to make sure we are backwards
compatible with old configs which include this plugin.

Fixes #11528
Fixes #11728

[NO NEW TESTS NEEDED] We have no podman machine test at the moment.
Please test this manually on your system.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2021-11-15 15:20:47 +01:00
Adrian Reber 80e56fa12b
Added optional container restore statistics
This adds the parameter '--print-stats' to 'podman container restore'.
With '--print-stats' Podman will measure how long Podman itself, the OCI
runtime and CRIU require to restore a checkpoint and print out this
information. CRIU already creates process restore statistics which are
just read in addition to the added measurements. In contrast to just
printing out the ID of the restored container, Podman will now print
out JSON:

 # podman container restore --latest --print-stats
 {
     "podman_restore_duration": 305871,
     "container_statistics": [
         {
             "Id": "47b02e1d474b5d5fe917825e91ac653efa757c91e5a81a368d771a78f6b5ed20",
             "runtime_restore_duration": 140614,
             "criu_statistics": {
                 "forking_time": 5,
                 "restore_time": 67672,
                 "pages_restored": 14
             }
         }
     ]
 }

The output contains 'podman_restore_duration' which contains the
number of microseconds Podman required to restore the checkpoint. The
output also includes 'runtime_restore_duration' which is the time
the runtime needed to restore that specific container. Each container
also includes 'criu_statistics' which displays the timing information
collected by CRIU.
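
For consumers of this output, a small decoding sketch may help. The JSON
keys are taken from the example above; the Go type names, the function
`parseRestoreStats` and the use of `int64` for every value are
assumptions, not Podman's own types.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Field tags mirror the keys shown in the example output; the struct
// names themselves are hypothetical.
type criuRestoreStats struct {
	ForkingTime   int64 `json:"forking_time"`
	RestoreTime   int64 `json:"restore_time"`
	PagesRestored int64 `json:"pages_restored"`
}

type containerRestoreStats struct {
	ID                     string           `json:"Id"`
	RuntimeRestoreDuration int64            `json:"runtime_restore_duration"`
	CRIUStatistics         criuRestoreStats `json:"criu_statistics"`
}

type restoreReport struct {
	PodmanRestoreDuration int64                   `json:"podman_restore_duration"`
	ContainerStatistics   []containerRestoreStats `json:"container_statistics"`
}

// sampleOutput is the example from the commit message.
const sampleOutput = `{
    "podman_restore_duration": 305871,
    "container_statistics": [
        {
            "Id": "47b02e1d474b5d5fe917825e91ac653efa757c91e5a81a368d771a78f6b5ed20",
            "runtime_restore_duration": 140614,
            "criu_statistics": {
                "forking_time": 5,
                "restore_time": 67672,
                "pages_restored": 14
            }
        }
    ]
}`

// parseRestoreStats decodes the JSON printed by '--print-stats'.
func parseRestoreStats(data []byte) (restoreReport, error) {
	var report restoreReport
	err := json.Unmarshal(data, &report)
	return report, err
}

func main() {
	report, err := parseRestoreStats([]byte(sampleOutput))
	if err != nil {
		panic(err)
	}
	for _, c := range report.ContainerStatistics {
		fmt.Println(c.ID, c.RuntimeRestoreDuration, c.CRIUStatistics.RestoreTime)
	}
}
```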

Signed-off-by: Adrian Reber <areber@redhat.com>
2021-11-15 11:50:25 +00:00
Adrian Reber 6202e8102b
Added optional container checkpointing statistics
This adds the parameter '--print-stats' to 'podman container checkpoint'.
With '--print-stats' Podman will measure how long Podman itself, the OCI
runtime and CRIU require to create a checkpoint and print out this
information. CRIU already creates checkpointing statistics which are
just read in addition to the added measurements. In contrast to just
printing out the ID of the checkpointed container, Podman will now print
out JSON:

 # podman container checkpoint --latest --print-stats
 {
     "podman_checkpoint_duration": 360749,
     "container_statistics": [
         {
             "Id": "25244244bf2efbef30fb6857ddea8cb2e5489f07eb6659e20dda117f0c466808",
             "runtime_checkpoint_duration": 177222,
             "criu_statistics": {
                 "freezing_time": 100657,
                 "frozen_time": 60700,
                 "memdump_time": 8162,
                 "memwrite_time": 4224,
                 "pages_scanned": 20561,
                 "pages_written": 2129
             }
         }
     ]
 }

The output contains 'podman_checkpoint_duration' which contains the
number of microseconds Podman required to create the checkpoint. The
output also includes 'runtime_checkpoint_duration' which is the time
the runtime needed to checkpoint that specific container. Each container
also includes 'criu_statistics' which displays the timing information
collected by CRIU.

Signed-off-by: Adrian Reber <areber@redhat.com>
2021-11-15 11:50:24 +00:00
Daniel J Walsh 062c887718
Error logs --follow if events-backend != journald, event-logger=journald
Fixes: https://github.com/containers/podman/issues/11255

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2021-11-13 07:11:09 -05:00
OpenShift Merge Robot d6d89fa79f
Merge pull request #12267 from giuseppe/safely-create-etc-mtab
libpod: create /etc/mtab safely
2021-11-11 20:47:42 +01:00
Paul Holzinger 3af19917a1
Add failing run test for netavark
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2021-11-11 17:50:10 +01:00
Paul Holzinger fe90a45e0d
Add flag to overwrite network backend from config
To make testing easier we can overwrite the network backend with the
global `--network-backend` option.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2021-11-11 17:30:27 +01:00
Giuseppe Scrivano 9f4d63f91b
libpod: create /etc/mtab safely
make sure the /etc/mtab symlink is created inside the rootfs when /etc
is a symlink.

Closes: https://github.com/containers/podman/issues/12189

[NO NEW TESTS NEEDED] there is already a test case

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2021-11-11 17:00:53 +01:00
Paul Holzinger 8041d44c93
Add network backend to podman info
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2021-11-11 16:49:46 +01:00
Paul Holzinger b2f7430b67
Add more netavark tests
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2021-11-11 16:49:28 +01:00
Paul Holzinger 1c88f741a7
select network backend based on config
You can change the network backend in containers.conf; supported
values are "cni" and "netavark".

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2021-11-11 16:26:47 +01:00
Paul Holzinger 3fe0c49174
Fix RUST_LOG envar for netavark
The rust netlink library is very verbose. It contains way too much debug
and trace logs. We can set `RUST_LOG=netavark=<level>` to make sure this
log level only applies to netavark and not the libraries.
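
As a rough sketch of how a caller might scope the log level this way,
the helper below appends the variable to a copy of an environment list.
The function name `netavarkEnv` and its parameters are hypothetical;
only the `RUST_LOG=netavark=<level>` form comes from the message above.

```go
package main

import "fmt"

// netavarkEnv returns a copy of base with RUST_LOG scoped to the
// netavark crate, so library crates keep their default verbosity.
func netavarkEnv(base []string, level string) []string {
	env := append([]string{}, base...)
	return append(env, "RUST_LOG=netavark="+level)
}

func main() {
	fmt.Println(netavarkEnv([]string{"PATH=/usr/bin"}, "debug"))
}
```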

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2021-11-11 16:25:52 +01:00
Paul Holzinger 4febe55769
netavark IPAM assignment
Add a new boltdb to handle IPAM assignment.

The db structure is the following:
Each network has its own bucket with the network name as bucket key.
Inside the network bucket there is an ID bucket which maps the container ID (key)
to a json array of ip addresses (value).
The network bucket also has a bucket for each subnet, the subnet is used as key.
Inside the subnet bucket an ip is used as key and the container ID as value.
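
The bucket layout described above can be modelled with nested maps. This
is an in-memory illustration only, not the boltdb code: the type names
and the `assign` method are hypothetical, and a plain string slice stands
in for the json array of ip addresses.

```go
package main

import "fmt"

// networkBucket mirrors one per-network bucket.
type networkBucket struct {
	// ids maps container ID -> assigned IPs (a json array in the db).
	ids map[string][]string
	// subnets maps subnet -> ip -> container ID.
	subnets map[string]map[string]string
}

// ipamDB maps network name -> network bucket.
type ipamDB map[string]*networkBucket

// assign records ip for containerID in both views and refuses an ip
// that is already taken in the given subnet.
func (db ipamDB) assign(network, subnet, ip, containerID string) error {
	nb, ok := db[network]
	if !ok {
		nb = &networkBucket{
			ids:     map[string][]string{},
			subnets: map[string]map[string]string{},
		}
		db[network] = nb
	}
	sb, ok := nb.subnets[subnet]
	if !ok {
		sb = map[string]string{}
		nb.subnets[subnet] = sb
	}
	if owner, used := sb[ip]; used {
		return fmt.Errorf("ip %s already assigned to %s", ip, owner)
	}
	sb[ip] = containerID
	nb.ids[containerID] = append(nb.ids[containerID], ip)
	return nil
}

func main() {
	db := ipamDB{}
	fmt.Println(db.assign("podman", "10.88.0.0/16", "10.88.0.2", "ctr1"))
	fmt.Println(db.assign("podman", "10.88.0.0/16", "10.88.0.2", "ctr2"))
}
```

Keeping both the ID view and the per-subnet view means a lookup by
container and a check for a free address are each a single key access.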

The db should be stored on a tmpfs to ensure we always have a clean
state after a reboot.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2021-11-11 16:25:19 +01:00
Paul Holzinger eaae294628
netavark network interface
Implement a new network interface for netavark.
For now only bridge networking is supported.
The interface can create/list/inspect/remove networks. For setup and
teardown netavark will be invoked.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2021-11-11 15:54:02 +01:00
Paul Holzinger 12c62b92ff
Make networking code reusable
To prevent code duplication when creating new network backends move
reusable code into a separate internal package.

This allows all network backends to use the same code as long as they
implement the new NetUtil interface.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2021-11-11 15:54:02 +01:00
Paul Holzinger 3690532b3b
network reload return error if we cannot reload ports
As rootless we have to reload the port mappings. If it fails we should
return an error instead of the warning.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2021-11-10 21:16:30 +01:00
Paul Holzinger 27de152b5a
network reload without ports should not reload ports
When run as rootless the podman network reload command tries to reload
the rootlessport ports because the childIP could have changed.
However if the container has no ports we should skip this instead of
printing a warning.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2021-11-10 21:16:08 +01:00
OpenShift Merge Robot 5437568fcd
Merge pull request #12227 from Luap99/net-setup
Fix rootless networking with userns and ports
2021-11-09 21:11:30 +01:00
OpenShift Merge Robot 43bd57c7fb
Merge pull request #12195 from boaz0/closes_11998
podman-generate-kube - remove empty structs from YAML
2021-11-09 19:46:31 +01:00
Paul Holzinger 216e2cb366
Fix rootless networking with userns and ports
A rootless container created with a custom userns and forwarded ports
did not work. I refactored the network setup to make the setup logic
more clear.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2021-11-09 15:58:57 +01:00
Ian Wienand 72cf389685 shm_lock: Handle ENOSPC better in AllocateSemaphore
When starting a container, libpod/runtime_pod_linux.go:NewPod calls
libpod/lock/lock.go:AllocateLock, which ends up in here.  If you exceed
num_locks, in response to a "podman run ..." you will see:

 Error: error allocating lock for new container: no space left on device

As noted inline, this error is technically true as it is talking about
the SHM area, but for anyone who has not dug into the source (i.e. me,
before a few hours ago :) your initial thought is going to be that
your disk is full.  I spent quite a bit of time trying to diagnose
what disk, partition, overlay, etc. was filling up before I realised
this was actually due to leaking from failing containers.

This overrides this case to give a more explicit message that
hopefully puts people on the right track to fixing this faster.  You
will now see:

 $ ./bin/podman run --rm -it fedora bash
 Error: error allocating lock for new container: allocation failed; exceeded num_locks (20)

[NO NEW TESTS NEEDED] (just changes an existing error message)

Signed-off-by: Ian Wienand <iwienand@redhat.com>
2021-11-09 18:34:21 +11:00
Valentin Rothberg 6444f24028 pod/container create: resolve conflicts of generated names
Address the TOCTOU when generating random names by having at most 10
attempts to assign a random name when creating a pod or container.

[NO TESTS NEEDED] since I do not know a way to force a conflict with
randomly generated names in a reasonable time frame.

Fixes: #11735
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
2021-11-08 13:33:30 +01:00
OpenShift Merge Robot 865653b661
Merge pull request #12184 from adrianreber/2021-11-05-stats-dump
Add 'stats-dump' file to exported checkpoint
2021-11-08 09:29:56 +01:00
Boaz Shuster f3fab1e17c podman-generate-kube - remove empty structs from YAML
[NO NEW TESTS NEEDED]

Signed-off-by: Boaz Shuster <boaz.shuster.github@gmail.com>
2021-11-07 16:33:38 +02:00
OpenShift Merge Robot abbd6c167e
Merge pull request #11890 from Luap99/ports
libpod: deduplicate ports in db
2021-11-06 10:39:16 +01:00
Paul Holzinger 02f67181a2
Fix swagger definition for the new mac address type
The new mac address type broke the api docs. While we could
successfully generate the swagger file it could not be viewed in a
browser.

The problem is that the swagger generation creates two type definitions
with the name `HardwareAddr` and this pointed back to itself. Thus the
render process was stuck in an endless loop. To fix this, manually
rename the new type to MacAddress and overwrite the types to string
because the json unmarshaller accepts the mac as string.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2021-11-05 19:25:40 +01:00
Adrian Reber 6b8fc3bd1d
Add 'stats-dump' file to exported checkpoint
There was the question about how long it takes to create a checkpoint.
CRIU already provides some statistics about how long it takes to create
a checkpoint and similar.

With this change the file 'stats-dump' is included in the checkpoint
archive and the tool checkpointctl can be used to display these
statistics:

./checkpointctl show -t /tmp/cp.tar --print-stats

Displaying container checkpoint data from /tmp/dump.tar

[...]
CRIU dump statistics
+---------------+-------------+--------------+---------------+---------------+---------------+
| FREEZING TIME | FROZEN TIME | MEMDUMP TIME | MEMWRITE TIME | PAGES SCANNED | PAGES WRITTEN |
+---------------+-------------+--------------+---------------+---------------+---------------+
| 105405 us     | 1376964 us  | 504399 us    | 446571 us     |        492153 |         88689 |
+---------------+-------------+--------------+---------------+---------------+---------------+

Signed-off-by: Adrian Reber <areber@redhat.com>
2021-11-05 16:15:00 +00:00
Paul Holzinger 7f433df7e7
rename rootless cni ns to rootless netns
Since we want to use the rootless cni ns also for netavark we should
pick a more generic name. The name is now "rootless network namespace"
or short "rootless netns".

The rename might cause some issues after the update but when
all containers are restarted or the host is rebooted it should work
correctly.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2021-11-05 15:44:37 +01:00
Paul Holzinger 58f8c3d743
mount full XDG_RUNTIME_DIR in rootless cni ns
We should mount the full runtime directory into the namespace instead of
just the netns dir. This allows more use cases.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2021-11-05 15:41:04 +01:00
Paul Holzinger 614c6f5970
Fix rootless cni netns cleanup logic
The check if cleanup is needed reads all containers and checks if there
are running containers with bridge networking. If we do not find any we
have to clean up the ns. However there was a problem with this because
the state is empty by default so the running check never worked.
Fortunately there was a second check which relies on the CNI files so we
still did cleanup anyway.

With netavark I noticed that this check is broken because the CNI files
were not present.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2021-11-05 00:20:10 +01:00
Paul Holzinger 001d48929d
MAC address json unmarshal should allow strings
Create a new mac address type which supports json marshal/unmarshal from
and to string. This change is backwards compatible with the previous
versions as the unmarshal method still accepts the old byte array or
base64 encoded string.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2021-11-03 15:30:16 +01:00
OpenShift Merge Robot 1305902ff4
Merge pull request #12127 from vrothberg/bz-2014149
volumes: be more tolerant and fix infinite loop
2021-10-29 13:30:29 +00:00
OpenShift Merge Robot d2147bada6
Merge pull request #12117 from adrianreber/2021-10-27-set-checkpointed-false-after-restore
Set Checkpointed state to false after restore
2021-10-28 18:06:25 +00:00
Valentin Rothberg c5f0a5d788 volumes: be more tolerant and fix infinite loop
Make Podman more tolerant when parsing image volumes during container
creation and further fix an infinite loop when checking them.

Consider `VOLUME ['/etc/foo', '/etc/bar']` in a Containerfile.  While
it looks correct to the human eye, the single quotes are wrong and yield
the two volumes to be `[/etc/foo,` and `/etc/bar]` in Podman and Docker.

When running the container, it'll create a directory `bar]` in `/etc`
and a directory `[` in `/` with two subdirectories `etc/foo,`.  This
behavior is surprising to me but is how Docker behaves.  We may improve on
that in the future.  Note that the correct syntax for volumes in
a Containerfile is `VOLUME /A /B /C` or `VOLUME ["/A", "/B", "/C"]`;
single quotes are not supported.
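
The effect can be reproduced with a simplified parser. This is a sketch
under assumptions, not the parser Podman or Docker use: it tries the
JSON array form first and otherwise splits on whitespace, with a crude
removal of single quotes standing in for shell-style word processing.
The function name `parseVolumeArgs` is hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// parseVolumeArgs returns the volume paths for the text after VOLUME.
func parseVolumeArgs(rest string) []string {
	// JSON array form: VOLUME ["/A", "/B", "/C"]
	var list []string
	if err := json.Unmarshal([]byte(rest), &list); err == nil {
		return list
	}
	// Fallback: VOLUME /A /B /C. Single quotes are not valid JSON, so a
	// single-quoted "array" lands here and keeps its brackets and comma.
	words := strings.Fields(rest)
	for i, w := range words {
		words[i] = strings.ReplaceAll(w, "'", "")
	}
	return words
}

func main() {
	fmt.Printf("%q\n", parseVolumeArgs(`["/etc/foo", "/etc/bar"]`))
	fmt.Printf("%q\n", parseVolumeArgs(`/etc/foo /etc/bar`))
	fmt.Printf("%q\n", parseVolumeArgs(`['/etc/foo', '/etc/bar']`))
}
```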

This change restores this behavior without breaking container creation
or ending up in an infinite loop.

BZ: https://bugzilla.redhat.com/show_bug.cgi?id=2014149
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
2021-10-28 16:37:33 +02:00