Commit Graph

220 Commits

Author SHA1 Message Date
Jason T. Greene 3b6ffcd290 Update to use new common machine API
Signed-off-by: Jason T. Greene <jason.greene@redhat.com>
2022-04-25 13:52:27 -05:00
OpenShift Merge Robot 09ef4f2e22
Merge pull request #13978 from Luap99/unparam
enable unparam linter
2022-04-25 13:43:57 -04:00
Paul Holzinger c7b16645af
enable unparam linter
The unparam linter is useful to detect unused function parameters and
return values.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2022-04-25 13:23:20 +02:00
Lokesh Mandvekar 44642bee87
libpod/networking_linux.go: switch to sha256 hashes
SHA-1 is prone to collisions.

This will likely break connectivity between old containers started
before update and containers started after update. It will also fail to
cleanup old netns. A reboot will fix this, so a reboot is recommended
after update.

[NO NEW TESTS NEEDED]

Signed-off-by: Lokesh Mandvekar <lsm5@fedoraproject.org>
2022-04-22 16:31:43 -04:00
Paul Holzinger cf1b0c1965
network dis-/connect: update /etc/hosts
When we connect or disconnect from a network we also have to update
/etc/hosts to ensure we only have valid entries in there.
This also fixes problems with docker-compose since this makes use of
network connect/disconnect.

Fixes #12533

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2022-04-22 13:05:53 +02:00
Paul Holzinger 1f1cf7bd40
rootless netns: move process to scope only with systemd
When you run podman on a non systemd system we should not try to move the
process under a new systemd scope.

[NO NEW TESTS NEEDED]

Ref #13703

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2022-03-30 13:02:41 +02:00
Daniel J Walsh 7680211ede
Remove error stutter
When podman gets an error it prints out "Error: " before
printing the error string.  If the error message starts with
error, we end up with

Error: error ...

This PR Removes all of these stutters.

logrus.Error() also prints out that this is an error, so no need for the
error stutter.

[NO NEW TESTS NEEDED]

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2022-03-25 21:47:04 -04:00
Valentin Rothberg 68b94338ba linter: enable makezero
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2022-03-22 13:04:35 +01:00
Valentin Rothberg ea08765f40 go fmt: use go 1.18 conditional-build syntax
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2022-03-18 09:11:53 +01:00
😎 Mostafa Emami 611b45c517 Inspect network info of a joined network namespace
Closes: https://github.com/containers/podman/issues/13150
Signed-off-by: 😎 Mostafa Emami <mustafaemami@gmail.com>
2022-03-08 11:00:36 +01:00
Matthew Heon 4a60319ecb Remove the runtime lock
This primarily served to protect us against shutting down the
Libpod runtime while operations (like creating a container) were
happening. However, it was very inconsistently implemented (a lot
of our longer-lived functions, like pulling images, just didn't
implement it at all...) and I'm not sure how much we really care
about this very-specific error case?

Removing it also removes a lot of potential deadlocks, which is
nice.

[NO NEW TESTS NEEDED]

Signed-off-by: Matthew Heon <mheon@redhat.com>
2022-02-22 11:05:26 -05:00
Matthew Heon 87cca4e5e3 Modify /etc/resolv.conf when connecting/disconnecting
The `podman network connect` and `podman network disconnect`
commands give containers access to different networks than the
ones they were created with; these networks can also have DNS
servers associated with them. Until now, however, we did not
modify resolv.conf as network membership changed.

With this PR, `podman network connect` will add any new
nameservers supported by the new network to the container's
/etc/resolv.conf, and `podman network disconnect` command will do
the opposite, removing the network's nameservers from
`/etc/resolv.conf`.

Fixes #9603

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2022-02-10 09:44:00 -05:00
Paul Holzinger 8d0fb0a4ed
move rootless netns slirp4netns process to systemd user.slice
When running podman inside systemd user units, it is possible that
systemd kills the rootless netns slirp4netns process because it was
started in the default unit cgroup. When the unit is stopped all
processes in that cgroup are killed. Since the slirp4netns process is
run once for all containers it should not be killed. To make sure
systemd will not kill the process we move it to the user.slice.

Fixes #13153

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2022-02-07 17:38:53 +01:00
Giuseppe Scrivano 865f0a1977
libpod: report slirp4netns network stats
by default slirp4netns uses the tap0 device.  When slirp4netns is
used, use that device by default instead of eth0.

Closes: https://github.com/containers/podman/issues/11695

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2022-02-01 16:23:58 +01:00
Paul Holzinger 2f371cb12c
container create: do not check for network dns support
We should not check if the network supports dns when we create a
container with network aliases. This could be the case for containers
created by docker-compose for example if the dnsname plugin is not
installed or the user uses a macvlan config where we do not support dns.

Fixes #12972

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2022-01-24 16:56:11 +01:00
Valentin Rothberg bd09b7aa79 bump go module to version 4
Automated for .go files via gomove [1]:
`gomove github.com/containers/podman/v3 github.com/containers/podman/v4`

Remaining files via vgrep [2]:
`vgrep github.com/containers/podman/v3`

[1] https://github.com/KSubedi/gomove
[2] https://github.com/vrothberg/vgrep

Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
2022-01-18 12:47:07 +01:00
Paul Holzinger 918890a4d6
use netns package from c/common
The netns package was moved to c/common so we should use this and remove
the package from podman.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2022-01-12 17:40:24 +01:00
Paul Holzinger 495884b319
use libnetwork from c/common
The libpod/network packages were moved to c/common so that buildah can
use it as well. To prevent duplication use it in podman as well and
remove it from here.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2022-01-12 17:07:30 +01:00
Paul Holzinger 4791595b5c
network connect allow ip, ipv6 and mac address
Network connect now supports setting a static ipv4, ipv6 and mac address
for the container network. The options are added to the cli and api.

Fixes #9883

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2021-12-14 15:23:39 +01:00
Paul Holzinger 5490be67b3
network db rewrite: migrate existing settings
The new network db structure stores everything in the networks bucket.
Previously some network settings were not written the the network bucket
and only stored in the container config.
Instead of the old format which used the container ID as value in the
networks buckets we now use the PerNetworkoptions struct there.

To migrate existing users we use the state.GetNetworks() function. If it
fails to read the new format it will automatically migrate the old
config format to the new one. This is allows a flawless migration path.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2021-12-14 15:23:20 +01:00
Paul Holzinger 3753347d62
rootless netns: resolve all path components for resolv.conf
We need to follow all symlinks in the /etc/resolv.conf path. Currently
we would only check the last file but it is possible that any directory
before that is also a link.

Unfortunately this code is very hard to maintain and not well tested. I
will try to come up with a unit test when I have more time. I think we
could utilize some for of chroot for this. For now we are stucked with
the default setup in the fedora/ubunutu test VMs.

[NO NEW TESTS NEEDED]

Fixes #12461

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2021-12-06 18:34:14 +01:00
Paul Holzinger 3ff47748de
Fix possible rootless netns cleanup race
rootlessNetNS.Cleanup() has an issue with how it detects if cleanup
is needed, reading the container state is not good ebough because
containers are first stopped and than cleanup will be called. So at one
time two containers could wait for cleanup but the second one will fail
because the first one triggered already the cleanup thus making rootless
netns unavailable for the second container resulting in an teardown
error. Instead of checking the container state we need to check the
netns state.

Secondly, podman unshare --rootless-netns should not do the cleanup.
This causes more issues than it is worth fixing. Users also might want
to use this to setup the namespace in a special way. If unshare also
cleans this up right away we cannot do this.

[NO NEW TESTS NEEDED]

Fixes #12459

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2021-12-01 19:19:44 +01:00
OpenShift Merge Robot 5432bb95f1
Merge pull request #12174 from fgimenez/fix-docker-networksettings-type-discrepancy
Introduces Address type to be used in secondary IPv4 and IPv6 inspect data structure
2021-11-19 13:57:13 +01:00
Paul Holzinger 62d6b6bf74
rootless netns, one netns per libpod tmp dir
The netns cleanup code is checking if there are running containers, this
can fail if you run several libpod instances with diffrent root/runroot.
To fix it we use one netns for each libpod instances. To prevent name
conflicts we use a hash from the static dir as part of the name.

Previously this worked because we would use the CNI files to check if
the netns was still in use. but this is no longer possible with netavark.

[NO NEW TESTS NEEDED]

Fixes #12306

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2021-11-18 17:34:06 +01:00
Federico Gimenez 2e5d3e8fb3 Introduce Address type to be used in secondary IPv4 and IPv6 inspect data
structure.

Resolves a discrepancy between the types used in inspect for docker and podman.
This causes a panic when using the docker client against podman when the
secondary IP fields in the `NetworkSettings` inspect field are populated.

Fixes containers#12165

Signed-off-by: Federico Gimenez <fgimenez@redhat.com>
2021-11-18 17:04:49 +01:00
Paul Holzinger 97c6403a1b
rename libpod nettypes fields
Some field names are confusing. Change them so that they make more sense
to the reader.
Since these fields are only in the main branch we can safely rename them
without worrying about backwards compatibility.
Note we have to change the field names in netavark too.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2021-11-16 19:19:04 +01:00
Paul Holzinger 295d87bb0b
podman machine improve port forwarding
This commits adds port forwarding logic directly into podman. The
podman-machine cni plugin is no longer needed.

The following new features are supported:
 - works with cni, netavark and slirp4netns
 - ports can use the hostIP to bind instead of hard coding 0.0.0.0
 - gvproxy no longer listens on 0.0.0.0:7777 (requires a new gvproxy
   version)
 - support the udp protocol

With this we no longer need podman-machine-cni and should remove it from
the packaging. There is also a change to make sure we are backwards
compatible with old config which include this plugin.

Fixes #11528
Fixes #11728

[NO NEW TESTS NEEDED] We have no podman machine test at the moment.
Please test this manually on your system.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2021-11-15 15:20:47 +01:00
Paul Holzinger 216e2cb366
Fix rootless networking with userns and ports
A rootless container created with a custom userns and forwarded ports
did not work. I refactored the network setup to make the setup logic
more clear.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2021-11-09 15:58:57 +01:00
OpenShift Merge Robot abbd6c167e
Merge pull request #11890 from Luap99/ports
libpod: deduplicate ports in db
2021-11-06 10:39:16 +01:00
Paul Holzinger 7f433df7e7
rename rootless cni ns to rootless netns
Since we want to use the rootless cni ns also for netavark we should
pick a more generic name. The name is now "rootless network namespace"
or short "rootless netns".

The rename might cause some issues after the update but when the
all containers are restarted or the host is rebooted it should work
correctly.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2021-11-05 15:44:37 +01:00
Paul Holzinger 58f8c3d743
mount full XDG_RUNTIME_DIR in rootless cni ns
We should mount the full runtime directory into the namespace instead of
just the netns dir. This allows more use cases.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2021-11-05 15:41:04 +01:00
Paul Holzinger 614c6f5970
Fix rootless cni netns cleanup logic
The check if cleanup is needed reads all container and checks if there
are running containers with bridge networking. If we do not find any we
have to cleanup the ns. However there was a problem with this because
the state is empty by default so the running check never worked.
Fortunately the was a second check which relies on the CNI files so we
still did cleanup anyway.

With netavark I noticed that this check is broken because the CNI files
were not present.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2021-11-05 00:20:10 +01:00
Paul Holzinger 0136a66a83
libpod: deduplicate ports in db
The OCICNI port format has one big problem: It does not support ranges.
So if a users forwards a range of 1k ports with podman run -p 1001-2000
we have to store each of the thousand ports individually as array element.
This bloats the db and makes the JSON encoding and decoding much slower.
In many places we already use a better port struct type which supports
ranges, e.g. `pkg/specgen` or the new network interface.

Because of this we have to do many runtime conversions between the two
port formats. If everything uses the new format we can skip the runtime
conversions.

This commit adds logic to replace all occurrences of the old format
with the new one. The database will automatically migrate the ports
to new format when the container config is read for the first time
after the update.

The `ParsePortMapping` function is `pkg/specgen/generate` has been
reworked to better work with the new format. The new logic is able
to deduplicate the given ports. This is necessary the ensure we
store them efficiently in the DB. The new code should also be more
performant than the old one.

To prove that the code is fast enough I added go benchmarks. Parsing
1 million ports took less than 0.5 seconds on my laptop.

Benchmark normalize PortMappings in specgen:
Please note that the 1 million ports are actually 20x 50k ranges
because we cannot have bigger ranges than 65535 ports.
```
$ go test -bench=. -benchmem  ./pkg/specgen/generate/
goos: linux
goarch: amd64
pkg: github.com/containers/podman/v3/pkg/specgen/generate
cpu: Intel(R) Core(TM) i7-10850H CPU @ 2.70GHz
BenchmarkParsePortMappingNoPorts-12             480821532                2.230 ns/op           0 B/op          0 allocs/op
BenchmarkParsePortMapping1-12                      38972             30183 ns/op          131584 B/op          9 allocs/op
BenchmarkParsePortMapping100-12                    18752             60688 ns/op          141088 B/op        315 allocs/op
BenchmarkParsePortMapping1k-12                      3104            331719 ns/op          223840 B/op       3018 allocs/op
BenchmarkParsePortMapping10k-12                      376           3122930 ns/op         1223650 B/op      30027 allocs/op
BenchmarkParsePortMapping1m-12                         3         390869926 ns/op        124593840 B/op   4000624 allocs/op
BenchmarkParsePortMappingReverse100-12             18940             63414 ns/op          141088 B/op        315 allocs/op
BenchmarkParsePortMappingReverse1k-12               3015            362500 ns/op          223841 B/op       3018 allocs/op
BenchmarkParsePortMappingReverse10k-12               343           3318135 ns/op         1223650 B/op      30027 allocs/op
BenchmarkParsePortMappingReverse1m-12                  3         403392469 ns/op        124593840 B/op   4000624 allocs/op
BenchmarkParsePortMappingRange1-12                 37635             28756 ns/op          131584 B/op          9 allocs/op
BenchmarkParsePortMappingRange100-12               39604             28935 ns/op          131584 B/op          9 allocs/op
BenchmarkParsePortMappingRange1k-12                38384             29921 ns/op          131584 B/op          9 allocs/op
BenchmarkParsePortMappingRange10k-12               29479             40381 ns/op          131584 B/op          9 allocs/op
BenchmarkParsePortMappingRange1m-12                  927           1279369 ns/op          143022 B/op        164 allocs/op
PASS
ok      github.com/containers/podman/v3/pkg/specgen/generate    25.492s
```

Benchmark convert old port format to new one:
```
go test -bench=. -benchmem  ./libpod/
goos: linux
goarch: amd64
pkg: github.com/containers/podman/v3/libpod
cpu: Intel(R) Core(TM) i7-10850H CPU @ 2.70GHz
Benchmark_ocicniPortsToNetTypesPortsNoPorts-12          663526126                1.663 ns/op           0 B/op          0 allocs/op
Benchmark_ocicniPortsToNetTypesPorts1-12                 7858082               141.9 ns/op            72 B/op          2 allocs/op
Benchmark_ocicniPortsToNetTypesPorts10-12                2065347               571.0 ns/op           536 B/op          4 allocs/op
Benchmark_ocicniPortsToNetTypesPorts100-12                138478              8641 ns/op            4216 B/op          4 allocs/op
Benchmark_ocicniPortsToNetTypesPorts1k-12                   9414            120964 ns/op           41080 B/op          4 allocs/op
Benchmark_ocicniPortsToNetTypesPorts10k-12                   781           1490526 ns/op          401528 B/op          4 allocs/op
Benchmark_ocicniPortsToNetTypesPorts1m-12                      4         250579010 ns/op        40001656 B/op          4 allocs/op
PASS
ok      github.com/containers/podman/v3/libpod  11.727s
```

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2021-10-27 18:59:56 +02:00
Paul Holzinger 1c8926285d
move network alias validation to container create
Podman 4.0 currently errors when you use network aliases for a network which
has dns disabled. Because the error happens on network setup this can
cause regression for old working containers. The network backend should not
validate this. Instead podman should check this at container create time
and also for network connect.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2021-09-28 13:40:27 +02:00
Paul Holzinger 05614ee139
always add short container id as net alias
This matches what docker does. Also make sure the net aliases are also
shown when the container is stopped.

docker-compose uses this special alias entry to check if it is already
correctly connected to the network. [1]
Because we do not support static ips on network connect at the moment
calling disconnect && connect will loose the static ip.

Fixes #11748

[1] 0bea52b18d/compose/service.py (L663-L667)

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2021-09-28 13:40:22 +02:00
Daniel J Walsh 1c4e6d8624
standardize logrus messages to upper case
Remove ERROR: Error stutter from logrus messages also.

[ NO TESTS NEEDED] This is just code cleanup.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2021-09-22 15:29:34 -04:00
Paul Holzinger b906b9d858
Drop OCICNI dependency
We do not use the ocicni code anymore so let's get rid of it. Only the
port struct is used but we can copy this into libpod network types so
we can debloat the binary.

The next step is to remove the OCICNI port mapping form the container
config and use the better PortMapping struct everywhere.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2021-09-15 20:00:28 +02:00
Paul Holzinger 85e8fbf7f3
Wire network interface into libpod
Make use of the new network interface in libpod.

This commit contains several breaking changes:
- podman network create only outputs the new network name and not file
  path.
- podman network ls shows the network driver instead of the cni version
  and plugins.
- podman network inspect outputs the new network struct and not the cni
  conflist.
- The bindings and libpod api endpoints have been changed to use the new
  network structure.

The container network status is stored in a new field in the state. The
status should be received with the new `c.getNetworkStatus`. This will
migrate the old status to the new format. Therefore old containers should
contine to work correctly in all cases even when network connect/
disconnect is used.

New features:
- podman network reload keeps the ip and mac for more than one network.
- podman container restore keeps the ip and mac for more than one
  network.
- The network create compat endpoint can now use more than one ipam
  config.

The man pages and the swagger doc are updated to reflect the latest
changes.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2021-09-15 20:00:20 +02:00
Paul Holzinger 6221f269a8
fix restart always with rootlessport
When a container is automatically restarted due its restart policy and
the container uses rootless cni networking with ports forwarded we have
to start a new rootlessport process since it exits with conmon.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2021-09-13 22:52:39 +02:00
Paul Holzinger 06f94dd09e
rootless cni: resolve absolute symlinks correctly
When /etc/resolv.conf is a symlink to an absolute path use it and not
join it the the previous path.

[NO TESTS NEEDED] This depends on the host layout.

Fixes #11358

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2021-08-30 16:55:26 +02:00
cdoern d28e85741f InfraContainer Rework
InfraContainer should go through the same creation process as regular containers. This change was from the cmd level
down, involving new container CLI opts and specgen creating functions. What now happens is that both container and pod
cli options are populated in cmd and used to create a podSpecgen and a containerSpecgen. The process then goes as follows

FillOutSpecGen (infra) -> MapSpec (podOpts -> infraOpts) -> PodCreate -> MakePod -> createPodOptions -> NewPod -> CompleteSpec (infra) -> MakeContainer -> NewContainer -> newContainer -> AddInfra (to pod state)

Signed-off-by: cdoern <cdoern@redhat.com>
2021-08-26 16:05:16 -04:00
Paul Holzinger 4b2dc48d0b
podman inspect show exposed ports
Podman inspect has to show exposed ports to match docker. This requires
storing the exposed ports in the container config.
A exposed port is shown as `"80/tcp": null` while a forwarded port is
shown as `"80/tcp": [{"HostIp": "", "HostPort": "8080" }]`.

Also make sure to add the exposed ports to the new image when the
container is commited.

Fixes #10777

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2021-08-24 15:44:26 +02:00
Paul Holzinger 2a8c414488
Fix rootless cni dns without systemd stub resolver
When a host uses systemd-resolved but not the resolved stub resolver the
following symlinks are created: `/etc/resolv.conf` ->
`/run/systemd/resolve/stub-resolv.conf` -> `/run/systemd/resolve/resolv.conf`.
Because the code uses filepath.EvalSymlinks we put the new resolv.conf
to `/run/systemd/resolve/resolv.conf` but the `/run/systemd/resolve/stub-resolv.conf`
link does not exists in the mount ns.
To fix this we will walk the symlinks manually until we reach the first
one under `/run` and use this for the resolv.conf file destination.

This fixes a regression which was introduced in e73d482990.

Fixes #11222

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2021-08-16 11:30:11 +02:00
Daniel J Walsh 404488a087
Run codespell to fix spelling
[NO TESTS NEEDED] Just fixing spelling.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2021-08-11 16:41:45 -04:00
Paul Holzinger e88d8dbeae
fix rootless port forwarding with network dis-/connect
The rootlessport forwarder requires a child IP to be set. This must be a
valid ip in the container network namespace. The problem is that after a
network disconnect and connect the eth0 ip changed. Therefore the
packages are dropped since the source ip does no longer exists in the
netns.
One solution is to set the child IP to 127.0.0.1, however this is a
security problem. [1]

To fix this we have to recreate the ports after network connect and
disconnect. To make this work the rootlessport process exposes a socket
where podman network connect/disconnect connect to and send to new child
IP to rootlessport. The rootlessport process will remove all ports and
recreate them with the new correct child IP.

Also bump rootlesskit to v0.14.3 to fix a race with RemovePort().

Fixes #10052

[1] https://nvd.nist.gov/vuln/detail/CVE-2021-20199

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2021-08-03 16:29:09 +02:00
OpenShift Merge Robot d24fc6b843
Merge pull request #10939 from Luap99/rootless-cni
Fix race conditions in rootless cni setup
2021-07-15 11:11:10 -04:00
Paul Holzinger 0007c98ddb
Fix race conditions in rootless cni setup
There was an race condition when calling `GetRootlessCNINetNs()`. It
created the rootless cni directory before it got locked. Therefore
another process could have called cleanup and removed this directory
before it was used resulting in errors. The lockfile got moved into the
XDG_RUNTIME_DIR directory to prevent a panic when the parent dir was
removed by cleanup.

Fixes #10930
Fixes #10922

To make this even more robust `GetRootlessCNINetNs()` will now return
locked. This guarantees that we can run `Do()` after `GetRootlessCNINetNs()`
before another process could have called `Cleanup()` in between.

[NO TESTS NEEDED] CI is flaking, hopefully this will fix it.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2021-07-15 14:33:56 +02:00
Akihiro Suda e73d482990
CNI-in-slirp4netns: fix bind-mount for /run/systemd/resolve/stub-resolv.conf
Fix issue 10929 : `[Regression in 3.2.0] CNI-in-slirp4netns DNS gets broken when running a rootful container after running a rootless container`

When /etc/resolv.conf on the host is a symlink to /run/systemd/resolve/stub-resolv.conf,
we have to mount an empty filesystem on /run/systemd/resolve in the child namespace,
so as to isolate the directory from the host mount namespace.

Otherwise our bind-mount for /run/systemd/resolve/stub-resolv.conf is unmounted
when systemd-resolved unlinks and recreates /run/systemd/resolve/stub-resolv.conf on the host.

[NO TESTS NEEDED]

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2021-07-15 17:25:09 +09:00
Paul Holzinger 2c7c679584
Make rootless-cni setup more robust
The rootless cni namespace needs a valid /etc/resolv.conf file. On some
distros is a symlink to somewhere under /run. Because the kernel will
follow the symlink before mounting, it is not possible to mount a file
at exactly /etc/resolv.conf. We have to ensure that the link target will
be available in the rootless cni mount ns.

Fixes #10855

Also fixed a bug in the /var/lib/cni directory lookup logic. It used
`filepath.Base` instead of `filepath.Dir` and thus looping infinitely.

Fixes #10857

[NO TESTS NEEDED]

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2021-07-06 18:40:03 +02:00
OpenShift Merge Robot e159eb892b
Merge pull request #10754 from Luap99/sync-lock
getContainerNetworkInfo: lock netNsCtr before sync
2021-06-23 04:25:44 -04:00
Paul Holzinger a84fa194b7 getContainerNetworkInfo: lock netNsCtr before sync
`syncContainer()` requires the container to be locked, otherwise we can
end up with undefined behavior.

[NO TESTS NEEDED]

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2021-06-22 16:51:21 +02:00
Paul Holzinger e014608539 Do not use inotify for OCICNI
Podman does not need to watch the cni config directory. If a network is
not found in the cache, OCICNI will reload the networks anyway and thus
even podman system service should work as expected.
Also include a change to not mount a "new" /var by default in the
rootless cni ns, instead try to use /var/lib/cni first and then the
parent dir. This allows users to store cni configs under /var/... which
is the case for the CI compose test.

[NO TESTS NEEDED]

Fixes #10686

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2021-06-22 16:00:47 +02:00
Paul Holzinger 44d9c453d3 Fix network connect race with docker-compose
Network connect/disconnect has to call the cni plugins when the network
namespace is already configured. This is the case for `ContainerStateRunning`
and `ContainerStateCreated`. This is important otherwise the network is
not attached to this network namespace and libpod will throw errors like
`network inspection mismatch...` This problem happened when using
`docker-compose up` in attached mode.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2021-06-11 16:00:12 +02:00
Brent Baude 7ef3981abe Enable port forwarding on host
Using the gvproxy application on the host, we can now port forward from
the machine vm on the host.  It requires that 'gvproxy' be installed in
an executable location.  gvproxy can be found in the
containers/gvisor-tap-vsock github repo.

[NO TESTS NEEDED]

Signed-off-by: Brent Baude <bbaude@redhat.com>
2021-06-01 10:13:18 -05:00
OpenShift Merge Robot 62a7d4b61e
Merge pull request #9972 from bblenard/issue-5651-hostname-for-container-gateway
Add host.containers.internal entry into container's etc/hosts
2021-05-17 10:45:23 -04:00
Baron Lenardson c8dfcce6db Add host.containers.internal entry into container's etc/hosts
This change adds the entry `host.containers.internal` to the `/etc/hosts`
file within a new containers filesystem. The ip address is determined by
the containers networking configuration and points to the gateway address
for the containers networking namespace.

Closes #5651

Signed-off-by: Baron Lenardson <lenardson.baron@gmail.com>
2021-05-17 08:21:22 -05:00
Paul Holzinger 4462113c5e podman network reload add rootless support
Allow podman network reload to be run as rootless user. While it is
unlikely that the iptable rules are flushed inside the rootless cni
namespace, it could still happen. Also fix podman network reload --all
to ignore errors when a container does not have the bridge network mode,
e.g. slirp4netns.

Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
2021-05-17 10:55:02 +02:00
Paul Holzinger f99b7a314b Fix rootlesskit port forwarder with custom slirp cidr
The source ip for the rootlesskit port forwarder was hardcoded to the
standard slirp4netns ip. This is incorrect since users can change the
subnet used by slirp4netns with `--network slirp4netns:cidr=10.5.0.0/24`.
The container interface ip is always the .100 in the subnet. Only when
the rootlesskit port forwarder child ip matches the container interface
ip the port forwarding will work.

Fixes #9828

Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
2021-04-23 11:12:49 +02:00
Paul Holzinger 0a39ad196c podman unshare: add --rootless-cni to join the ns
Add a new --rootless-cni option to podman unshare to also join the
rootless-cni network namespace. This is useful if you want to connect
to a rootless container via IP address. This is only possible from the
rootless-cni namespace and not from the host namespace. This option also
helps to debug problems in the rootless-cni namespace.

Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
2021-04-07 15:54:12 +02:00
Paul Holzinger f230214db1 rootless cni add /usr/sbin to PATH if not present
The CNI plugins need access to iptables in $PATH. On debian /usr/sbin
is not added to $PATH for rootless users. This will break rootless
cni completely. To prevent breaking existing users add /usr/sbin to
$PATH in podman if needed.

Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
2021-04-06 23:55:05 +02:00
Paul Holzinger 973807092d Use the slrip4netns dns in the rootless cni ns
If a user only has a local dns server in the resolv.conf file the dns
resolution will fail. Instead we create a new resolv.conf which will use
the slirp4netns dns.

Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
2021-04-01 17:27:03 +02:00
Paul Holzinger 6cd807e3b7 Cleanup the rootless cni namespace
Delte the network namespace and kill the slirp4netns process when it is
no longer needed.

Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
2021-04-01 17:27:03 +02:00
Paul Holzinger d7e003f362 Remove unused rootless-cni-infra container files
Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
2021-04-01 17:27:03 +02:00
Paul Holzinger db19224b6d Only use rootless RLK when the container has ports
Do not invoke the rootlesskit port forwarder when the container has no
ports.

Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
2021-04-01 17:27:03 +02:00
Paul Holzinger 294c90b05e Enable rootless network connect/disconnect
With the new rootless cni supporting network connect/disconnect is easy.
Combine common setps into extra functions to prevent code duplication.

Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
2021-04-01 17:27:03 +02:00
Paul Holzinger 94e67ba9a2 Move slirp4netns functions into an extra file
This should make maintenance easier.

Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
2021-04-01 17:27:03 +02:00
Paul Holzinger 00b2ec5e6f Add rootless support for cni and --uidmap
This is supported with the new rootless cni logic.

Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
2021-04-01 17:27:03 +02:00
Paul Holzinger 54b588c07d rootless cni without infra container
Instead of creating an extra container create a network and mount
namespace inside the podman user namespace. This ns is used to
for rootless cni operations.
This helps to align the rootless and rootful network code path.
If we run as rootless we just have to set up a extra net ns and
initialize slirp4netns in it. The ocicni lib will be called in
that net ns.

This design allows allows easier maintenance, no extra container
with pause processes, support for rootless cni with --uidmap
and possibly more.

The biggest problem is backwards compatibility. I don't think
live migration can be possible. If the user reboots or restart
all cni containers everything should work as expected again.
The user is left with the rootless-cni-infa container and image
but this can safely be removed.

To make the existing cni configs work we need execute the cni plugins
in a extra mount namespace. This ensures that we can safely mount over
/run and /var which have to be writeable for the cni plugins without
removing access to these files by the main podman process. One caveat
is that we need to keep the netns files at `XDG_RUNTIME_DIR/netns`
accessible.

`XDG_RUNTIME_DIR/rootless-cni/{run,var}` will be mounted to `/{run,var}`.
To ensure that we keep the netns directory we bind mount this relative
to the new root location, e.g. XDG_RUNTIME_DIR/rootless-cni/run/user/1000/netns
before we mount the run directory. The run directory is mounted recursive,
this makes the netns directory at the same path accessible as before.

This also allows iptables-legacy to work because /run/xtables.lock is
now writeable.

Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
2021-04-01 17:27:03 +02:00
Paul Holzinger c5f9819dac Silence podman network reload errors with iptables-nft
Make sure we do not display the expected error when using podman network
reload. This is already done for iptables-legacy however iptables-nft
creates a slightly different error message so check for this as well.
The error is logged at info level.

[NO TESTS NEEDED] The test VMs do not use iptables-nft so there is no
way to test this. It is already tested for iptables-legacy.

Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
2021-03-30 10:48:26 +02:00
Paul Holzinger aa0a57f095 Fix cni teardown errors
Make sure to pass the cni interface descriptions to cni teardowns.
Otherwise cni cannot find the correct cache files because the
interface name might not match the networks. This can only happen
when network disconnect was used.

Fixes #9602

Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
2021-03-04 11:43:59 +01:00
Paul Holzinger f152f9cf09 Network connect error if net mode is not bridge
Only the the network mode bridge supports cni networks.
Other network modes cannot use network connect/disconnect
so we should throw a error.

Fixes #9496

Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
2021-02-23 22:30:04 +01:00
Paul Holzinger 9d818be732 Fix podman network IDs handling
The libpod network logic knows about networks IDs but OCICNI
does not. We cannot pass the network ID to OCICNI. Instead we
need to make sure we only use network names internally. This
is also important for libpod since we also only store the
network names in the state. If we would add a ID there the
same networks could accidentally be added twice.

Fixes #9451

Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
2021-02-22 15:51:49 +01:00
Valentin Rothberg 5dded6fae7 bump go module to v3
We missed bumping the go module, so let's do it now :)

* Automated go code with github.com/sirkon/go-imports-rename
* Manually via `vgrep podman/v2` the rest

Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
2021-02-22 09:03:51 +01:00
Paul Holzinger 69ab67bf90 Enable golint linter
Use the golint linter and fix the reported problems.

[NO TESTS NEEDED]

Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
2021-02-11 23:01:49 +01:00
Paul Holzinger 5c6ab3075e Fix podman network disconnect wrong NetworkStatus number
The allocated `tmpNetworkStatus` must be allocated with the length 0.
Otherwise append would add new elements to the end of the slice and
not at the beginning of the allocated memory.

This caused inspect to fail since the number of networks did not
matched the number of network statuses.

Fixes #9234

Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
2021-02-04 19:41:30 +01:00
bitstrings 0959196807 Make slirp MTU configurable (network_cmd_options)
The mtu default value is currently forced to 65520.
This let the user control it using the config key network_cmd_options,
i.e.: network_cmd_options=["mtu=9000"]

Signed-off-by: bitstrings <pino.silvaggio@gmail.com>
2021-02-02 13:50:26 -05:00
baude 02ec5299f6 Add default net info in container inspect
when inspecting a container that is only connected to the default
network, we should populate the default network in the container inspect
information.

Fixes: #6618

Signed-off-by: baude <bbaude@redhat.com>

MH: Small fixes, added another test

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2021-01-26 16:00:06 -05:00
Giuseppe Scrivano 0ba1942f26
networking: lookup child IP in networks
if a CNI network is added to the container, use the IP address in that
network instead of hard-coding the slirp4netns default.

commit 5e65f0ba30 introduced this
regression.

Closes: https://github.com/containers/podman/issues/9065

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2021-01-23 18:28:56 +01:00
Giuseppe Scrivano ef654941d1
libpod: move slirp magic IPs to consts
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2021-01-22 08:08:27 +01:00
Giuseppe Scrivano 5e65f0ba30
rootlessport: set source IP to slirp4netns device
set the source IP to the slirp4netns address instead of 127.0.0.1 when
using rootlesskit.

Closes: https://github.com/containers/podman/issues/5138

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2021-01-22 08:08:26 +01:00
Daniel J Walsh d9ebbbfe5b
Switch references of /var/run -> /run
Systemd is now complaining or mentioning /var/run as a legacy directory.
It has been many years where /var/run is a symlink to /run on all
most distributions, make the change to the default.

Partial fix for https://github.com/containers/podman/issues/8369

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2021-01-07 05:37:24 -05:00
Anders F Björklund 25b7198441 The slirp4netns sandbox requires pivot_root
Disable the sandbox, when running on rootfs

Signed-off-by: Anders F Björklund <anders.f.bjorklund@gmail.com>
2020-12-29 18:03:49 +01:00
Josh Soref 4fa1fce930 Spelling
Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
2020-12-22 13:34:31 -05:00
Matthew Heon 46337b4708 Make `podman stats` slirp check more robust
Just checking for `rootless.IsRootless()` does not catch all the
cases where slirp4netns is in use - we actually allow it to be
used as root as well. Fortify the conditional here so we don't
fail in the root + slirp case.

Fixes #7883

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2020-12-08 09:59:00 -05:00
OpenShift Merge Robot 9b3a81a002
Merge pull request #8571 from Luap99/podman-network-reload
Implement pod-network-reload
2020-12-08 06:15:40 -05:00
Matthew Heon b0286d6b43 Implement pod-network-reload
This adds a new command, 'podman network reload', to reload the
networks of existing containers, forcing recreation of firewall
rules after e.g. `firewall-cmd --reload` wipes them out.

Under the hood, this works by calling CNI to tear down the
existing network, then recreate it using identical settings. We
request that CNI preserve the old IP and MAC address in most
cases (where the container only had 1 IP/MAC), but there will be
some downtime inherent to the teardown/bring-up approach. The
architecture of CNI doesn't really make doing this without
downtime easy (or maybe even possible...).

At present, this only works for root Podman, and only locally.
I don't think there is much of a point to adding remote support
(this is very much a local debugging command), but I think adding
rootless support (to kill/recreate slirp4netns) could be
valuable.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
2020-12-07 19:26:23 +01:00
Ashley Cui d6d3af9e8e Add ability to set system wide options for slirp4netns
Wire in containers.conf options for slirp

Signed-off-by: Ashley Cui <acui@redhat.com>
2020-12-04 13:37:22 -05:00
baude 7d43cc06dc network connect disconnect on non-running containers
a container can connect and disconnet to networks even when not in a
running state.

Signed-off-by: baude <bbaude@redhat.com>
2020-11-30 16:10:01 -06:00
baude 1ddb19bc8e update container status with new results
a bug was being caused by the fact that the container network results
were not being updated properly.

given that jhon is on PTO, this PR will replace #8362

Signed-off-by: baude <bbaude@redhat.com>
2020-11-23 15:20:39 -06:00
Jhon Honce 44da01f45c Refactor compat container create endpoint
* Make endpoint compatibile with docker-py network expectations
* Update specgen helper when called from compat endpoint
* Update godoc on types
* Add test for network/container create using docker-py method
* Add syslog logging when DEBUG=1 for tests

Fixes #8361

Signed-off-by: Jhon Honce <jhonce@redhat.com>
2020-11-23 15:20:39 -06:00
Matthew Heon ce775248ad Make c.networks() list include the default network
This makes things a lot more clear - if we are actually joining a
CNI network, we are guaranteed to get a non-zero length list of
networks.

We do, however, need to know if the network we are joining is the
default network for inspecting containers as it determines how we
populate the response struct. To handle this, add a bool to
indicate that the network listed was the default network, and
only the default network.

Signed-off-by: Matthew Heon <mheon@redhat.com>
2020-11-20 14:03:24 -05:00
baude a3e0b7d117 add network connect|disconnect compat endpoints
this enables the ability to connect and disconnect a container from a
given network. it is only for the compatibility layer. some code had to
be refactored to avoid circular imports.

additionally, tests are being deferred temporarily due to some
incompatibility/bug in either docker-py or our stack.

Signed-off-by: baude <bbaude@redhat.com>
2020-11-19 08:16:19 -06:00
baude d3e794bda3 add network connect|disconnect compat endpoints
this enables the ability to connect and disconnect a container from a
given network. it is only for the compatibility layer. some code had to
be refactored to avoid circular imports.

additionally, tests are being deferred temporarily due to some
incompatibility/bug in either docker-py or our stack.

Signed-off-by: baude <bbaude@redhat.com>
2020-11-17 14:22:39 -06:00
OpenShift Merge Robot a65ecc70c2
Merge pull request #8304 from rhatdan/error
Cleanup error reporting
2020-11-12 22:33:25 +01:00
Matthew Heon 8d56eb5342 Add support for network connect / disconnect to DB
Convert the existing network aliases set/remove code to network
connect and disconnect. We can no longer modify aliases for an
existing network, but we can add and remove entire networks. As
part of this, we need to add a new function to retrieve current
aliases the container is connected to (we had a table for this
as of the first aliases PR, but it was not externally exposed).

At the same time, remove all deconflicting logic for aliases.
Docker does absolutely no checks of this nature, and allows two
containers to have the same aliases, aliases that conflict with
container names, etc - it's just left to DNS to return all the
IP addresses, and presumably we round-robin from there? Most
tests for the existing code had to be removed because of this.

Convert all uses of the old container config.Networks field,
which previously included all networks in the container, to use
the new DB table. This ensures we actually get an up-to-date list
of in-use networks. Also, add network aliases to the output of
`podman inspect`.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2020-11-11 16:37:54 -05:00
Daniel J Walsh f3648b4ae8
Cleanup error reporting
The error message reported is overlay complicated and the added test does not
really help the user.

Currently the error looks like:

podman run -p 80:80 fedora echo hello
Error: failed to expose ports via rootlessport: "cannot expose privileged port 80, you might need to add "net.ipv4.ip_unprivileged_port_start=0" (currently 1024) to /etc/sysctl.conf, or choose a larger port number (>= 1024): listen tcp 0.0.0.0:80: bind: permission denied\n"

After this change

./bin/podman run -p 80:80 fedora echo hello
Error: cannot expose privileged port 80, you might need to add "net.ipv4.ip_unprivileged_port_start=0" (currently 1024) to /etc/sysctl.conf, or choose a larger port number (>= 1024): listen tcp 0.0.0.0:80: bind: permission denied

Control chars have been eliminated.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2020-11-11 13:11:15 -05:00
baude b7b5b6f8e3 network aliases for container creation
podman can now support adding network aliases when running containers
(--network-alias).  It requires an updated dnsname plugin as well as an
updated ocicni to work properly.

Signed-off-by: baude <bbaude@redhat.com>
2020-11-09 15:08:58 -06:00
Paul Holzinger 2704dfbb7a Fix dnsname when joining a different network namespace in a pod
When creating a container in a pod the podname was always set as
the dns entry. This is incorrect when the container is not part
of the pods network namespace. This happend both rootful and
rootless. To fix this check if we are part of the pods network
namespace and if not use the container name as dns entry.

Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
2020-10-30 18:53:55 +01:00
OpenShift Merge Robot 2cd2359a6d
Merge pull request #7772 from TomSweeneyRedHat/dev/tsweeney/splitn
Convert Split() calls with an equal sign to SplitN()
2020-10-21 21:00:16 -04:00
Matthew Heon c1b844ecc8 Retrieve network inspect info from dependency container
When a container either joins a pod that shares the network
namespace or uses `--net=container:` to share the network
namespace of another container, it does not have its own copy of
the CNI results used to generate `podman inspect` output. As
such, to inspect these containers, we should be going to the
container we share the namespace with for network info.

Fixes #8073

Signed-off-by: Matthew Heon <mheon@redhat.com>
2020-10-20 13:27:33 -04:00