Commit Graph

22 Commits

Author SHA1 Message Date
Daniel J Walsh ffbab30d7b
Run codespell to cleanup typos
[NO NEW TESTS NEEDED]

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2022-03-25 15:34:41 -04:00
Valentin Rothberg ea08765f40 go fmt: use go 1.18 conditional-build syntax
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2022-03-18 09:11:53 +01:00
Paul Holzinger 3bb046a5e3
slirp: fix setup on ipv6 disabled systems
When enable_ipv6=true is set for slirp4netns (default since podman v4),
we will try to set the accept sysctl. This sysctl will not exist on
systems that have ipv6 disabled. In this case we should not error and
just ignore the extra ipv6 setup.

Also the current logic to wait for the slirp4 setup was kinda broken, it
did not actually wait until the sysctl was set before starting slirp.
This should now be fixed by using two `sync.WaitGroup`s.

[NO NEW TESTS NEEDED]

Fixes #13388

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2022-03-14 15:19:54 +01:00
Valentin Rothberg bd09b7aa79 bump go module to version 4
Automated for .go files via gomove [1]:
`gomove github.com/containers/podman/v3 github.com/containers/podman/v4`

Remaining files via vgrep [2]:
`vgrep github.com/containers/podman/v3`

[1] https://github.com/KSubedi/gomove
[2] https://github.com/vrothberg/vgrep

Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
2022-01-18 12:47:07 +01:00
Paul Holzinger 495884b319
use libnetwork from c/common
The libpod/network packages were moved to c/common so that buildah can
use it as well. To prevent duplication use it in podman as well and
remove it from here.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2022-01-12 17:07:30 +01:00
Paul Holzinger 97c6403a1b
rename libpod nettypes fields
Some field names are confusing. Change them so that they make more sense
to the reader.
Since these fields are only in the main branch we can safely rename them
without worrying about backwards compatibility.
Note we have to change the field names in netavark too.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2021-11-16 19:19:04 +01:00
Paul Holzinger 295d87bb0b
podman machine improve port forwarding
This commits adds port forwarding logic directly into podman. The
podman-machine cni plugin is no longer needed.

The following new features are supported:
 - works with cni, netavark and slirp4netns
 - ports can use the hostIP to bind instead of hard coding 0.0.0.0
 - gvproxy no longer listens on 0.0.0.0:7777 (requires a new gvproxy
   version)
 - support the udp protocol

With this we no longer need podman-machine-cni and should remove it from
the packaging. There is also a change to make sure we are backwards
compatible with old config which include this plugin.

Fixes #11528
Fixes #11728

[NO NEW TESTS NEEDED] We have no podman machine test at the moment.
Please test this manually on your system.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2021-11-15 15:20:47 +01:00
Paul Holzinger 3690532b3b
network reload return error if we cannot reload ports
As rootless we have to reload the port mappings. If it fails we should
return an error instead of the warning.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2021-11-10 21:16:30 +01:00
Paul Holzinger 27de152b5a
network reload without ports should not reload ports
When run as rootless the podman network reload command tries to reload
the rootlessport ports because the childIP could have changed.
However if the containers has no ports we should skip this instead of
printing a warning.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2021-11-10 21:16:08 +01:00
Paul Holzinger 216e2cb366
Fix rootless networking with userns and ports
A rootless container created with a custom userns and forwarded ports
did not work. I refactored the network setup to make the setup logic
more clear.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2021-11-09 15:58:57 +01:00
Paul Holzinger 0136a66a83
libpod: deduplicate ports in db
The OCICNI port format has one big problem: It does not support ranges.
So if a users forwards a range of 1k ports with podman run -p 1001-2000
we have to store each of the thousand ports individually as array element.
This bloats the db and makes the JSON encoding and decoding much slower.
In many places we already use a better port struct type which supports
ranges, e.g. `pkg/specgen` or the new network interface.

Because of this we have to do many runtime conversions between the two
port formats. If everything uses the new format we can skip the runtime
conversions.

This commit adds logic to replace all occurrences of the old format
with the new one. The database will automatically migrate the ports
to new format when the container config is read for the first time
after the update.

The `ParsePortMapping` function is `pkg/specgen/generate` has been
reworked to better work with the new format. The new logic is able
to deduplicate the given ports. This is necessary the ensure we
store them efficiently in the DB. The new code should also be more
performant than the old one.

To prove that the code is fast enough I added go benchmarks. Parsing
1 million ports took less than 0.5 seconds on my laptop.

Benchmark normalize PortMappings in specgen:
Please note that the 1 million ports are actually 20x 50k ranges
because we cannot have bigger ranges than 65535 ports.
```
$ go test -bench=. -benchmem  ./pkg/specgen/generate/
goos: linux
goarch: amd64
pkg: github.com/containers/podman/v3/pkg/specgen/generate
cpu: Intel(R) Core(TM) i7-10850H CPU @ 2.70GHz
BenchmarkParsePortMappingNoPorts-12             480821532                2.230 ns/op           0 B/op          0 allocs/op
BenchmarkParsePortMapping1-12                      38972             30183 ns/op          131584 B/op          9 allocs/op
BenchmarkParsePortMapping100-12                    18752             60688 ns/op          141088 B/op        315 allocs/op
BenchmarkParsePortMapping1k-12                      3104            331719 ns/op          223840 B/op       3018 allocs/op
BenchmarkParsePortMapping10k-12                      376           3122930 ns/op         1223650 B/op      30027 allocs/op
BenchmarkParsePortMapping1m-12                         3         390869926 ns/op        124593840 B/op   4000624 allocs/op
BenchmarkParsePortMappingReverse100-12             18940             63414 ns/op          141088 B/op        315 allocs/op
BenchmarkParsePortMappingReverse1k-12               3015            362500 ns/op          223841 B/op       3018 allocs/op
BenchmarkParsePortMappingReverse10k-12               343           3318135 ns/op         1223650 B/op      30027 allocs/op
BenchmarkParsePortMappingReverse1m-12                  3         403392469 ns/op        124593840 B/op   4000624 allocs/op
BenchmarkParsePortMappingRange1-12                 37635             28756 ns/op          131584 B/op          9 allocs/op
BenchmarkParsePortMappingRange100-12               39604             28935 ns/op          131584 B/op          9 allocs/op
BenchmarkParsePortMappingRange1k-12                38384             29921 ns/op          131584 B/op          9 allocs/op
BenchmarkParsePortMappingRange10k-12               29479             40381 ns/op          131584 B/op          9 allocs/op
BenchmarkParsePortMappingRange1m-12                  927           1279369 ns/op          143022 B/op        164 allocs/op
PASS
ok      github.com/containers/podman/v3/pkg/specgen/generate    25.492s
```

Benchmark convert old port format to new one:
```
go test -bench=. -benchmem  ./libpod/
goos: linux
goarch: amd64
pkg: github.com/containers/podman/v3/libpod
cpu: Intel(R) Core(TM) i7-10850H CPU @ 2.70GHz
Benchmark_ocicniPortsToNetTypesPortsNoPorts-12          663526126                1.663 ns/op           0 B/op          0 allocs/op
Benchmark_ocicniPortsToNetTypesPorts1-12                 7858082               141.9 ns/op            72 B/op          2 allocs/op
Benchmark_ocicniPortsToNetTypesPorts10-12                2065347               571.0 ns/op           536 B/op          4 allocs/op
Benchmark_ocicniPortsToNetTypesPorts100-12                138478              8641 ns/op            4216 B/op          4 allocs/op
Benchmark_ocicniPortsToNetTypesPorts1k-12                   9414            120964 ns/op           41080 B/op          4 allocs/op
Benchmark_ocicniPortsToNetTypesPorts10k-12                   781           1490526 ns/op          401528 B/op          4 allocs/op
Benchmark_ocicniPortsToNetTypesPorts1m-12                      4         250579010 ns/op        40001656 B/op          4 allocs/op
PASS
ok      github.com/containers/podman/v3/libpod  11.727s
```

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2021-10-27 18:59:56 +02:00
Paul Holzinger 008075ce54
Slirp4netns with ipv6 set net.ipv6.conf.default.accept_dad=0
Duplicate Address Detection slows the ipv6 setup down for 1-2 seconds.
Since slirp4netns is run it is own namespace and not directly routed
we can skip this to make the ipv6 address immediately available.
We change the default to make sure the slirp tap interface gets the
correct value assigned so DAD is disabled for it.
Also make sure to change this value back to the original after slirp4netns
is ready in case users rely on this sysctl.

Fixes #11062

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2021-10-26 18:27:30 +02:00
Paul Holzinger 3ba69dccf7
rootlessport: reduce memory usage of the process
Don't use reexec for the rootlessport process, instead make it a
separate binary to reduce the memory usage. The problem with reexec is
that it will import all packages that podman uses and therefore loads a
lot of stuff into the heap. The rootlessport process however only needs
the rootlesskit library.
The memory usage is a concern since the rootlessport process will spawn
two process per container which has ports forwarded. The processes stay
until the container dies. On my laptop the current reexec version uses
47800 KB RSS. The new separate binary only uses 4540 KB RSS. This is
more than a 90% improvement.

The Makefile has been updated to compile the new binary and install it
to the libexec directory.

Fixes #10790

[NO TESTS NEEDED]

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2021-10-12 21:43:11 +02:00
Valentin Rothberg 5d6ea90e75 libpod: do not call (*container).Config()
Access the container's config field directly inside of libpod instead of
calling `Config()` which in turn creates expensive JSON deep copies.

Accessing the field directly drops memory consumption of a simple
`podman run --rm busybox true` from 1245kB to 410kB.

[NO TESTS NEEDED]

Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
2021-09-28 17:18:02 +02:00
Daniel J Walsh 1c4e6d8624
standardize logrus messages to upper case
Remove ERROR: Error stutter from logrus messages also.

[ NO TESTS NEEDED] This is just code cleanup.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2021-09-22 15:29:34 -04:00
Paul Holzinger 85e8fbf7f3
Wire network interface into libpod
Make use of the new network interface in libpod.

This commit contains several breaking changes:
- podman network create only outputs the new network name and not file
  path.
- podman network ls shows the network driver instead of the cni version
  and plugins.
- podman network inspect outputs the new network struct and not the cni
  conflist.
- The bindings and libpod api endpoints have been changed to use the new
  network structure.

The container network status is stored in a new field in the state. The
status should be received with the new `c.getNetworkStatus`. This will
migrate the old status to the new format. Therefore old containers should
contine to work correctly in all cases even when network connect/
disconnect is used.

New features:
- podman network reload keeps the ip and mac for more than one network.
- podman container restore keeps the ip and mac for more than one
  network.
- The network create compat endpoint can now use more than one ipam
  config.

The man pages and the swagger doc are updated to reflect the latest
changes.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2021-09-15 20:00:20 +02:00
Paul Holzinger abdedc31a2
rootlessport: allow socket paths with more than 108 chars
Creating the rootlessport socket can fail with `bind: invalid argument`
when the socket path is longer than 108 chars. This is the case for
users with a long runtime directory.
Since the kernel does not allow to use socket paths with more then 108
chars use a workaround to open the socket path.

[NO TESTS NEEDED]

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2021-09-01 16:14:40 +02:00
Paul Holzinger e88d8dbeae
fix rootless port forwarding with network dis-/connect
The rootlessport forwarder requires a child IP to be set. This must be a
valid ip in the container network namespace. The problem is that after a
network disconnect and connect the eth0 ip changed. Therefore the
packages are dropped since the source ip does no longer exists in the
netns.
One solution is to set the child IP to 127.0.0.1, however this is a
security problem. [1]

To fix this we have to recreate the ports after network connect and
disconnect. To make this work the rootlessport process exposes a socket
where podman network connect/disconnect connect to and send to new child
IP to rootlessport. The rootlessport process will remove all ports and
recreate them with the new correct child IP.

Also bump rootlesskit to v0.14.3 to fix a race with RemovePort().

Fixes #10052

[1] https://nvd.nist.gov/vuln/detail/CVE-2021-20199

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2021-08-03 16:29:09 +02:00
Paul Holzinger ed51e3f548
podman service reaper
Add a new service reaper package. Podman currently does not reap all
child processes. The slirp4netns and rootlesskit processes are not
reaped. The is not a problem for local podman since the podman process
dies before the other processes and then init will reap them for us.

However with podman system service it is possible that the podman
process is still alive after slirp died. In this case podman has to reap
it or the slirp process will be a zombie until the service is stopped.

The service reaper will listen in an extra goroutine on SIGCHLD. Once it
receives this signal it will try to reap all pids that were added with
`AddPID()`. While I would like to just reap all children this is not
possible because many parts of the code use `os/exec` with `cmd.Wait()`.
If we reap before `cmd.Wait()` things can break, so reaping everything
is not an option.

[NO TESTS NEEDED]

Fixes #9777

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2021-07-02 19:00:36 +02:00
Baron Lenardson c8dfcce6db Add host.containers.internal entry into container's etc/hosts
This change adds the entry `host.containers.internal` to the `/etc/hosts`
file within a new containers filesystem. The ip address is determined by
the containers networking configuration and points to the gateway address
for the containers networking namespace.

Closes #5651

Signed-off-by: Baron Lenardson <lenardson.baron@gmail.com>
2021-05-17 08:21:22 -05:00
Paul Holzinger f99b7a314b Fix rootlesskit port forwarder with custom slirp cidr
The source ip for the rootlesskit port forwarder was hardcoded to the
standard slirp4netns ip. This is incorrect since users can change the
subnet used by slirp4netns with `--network slirp4netns:cidr=10.5.0.0/24`.
The container interface ip is always the .100 in the subnet. Only when
the rootlesskit port forwarder child ip matches the container interface
ip the port forwarding will work.

Fixes #9828

Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
2021-04-23 11:12:49 +02:00
Paul Holzinger 94e67ba9a2 Move slirp4netns functions into an extra file
This should make maintenance easier.

Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
2021-04-01 17:27:03 +02:00