This will allow container processes to write to the CRIU socket that gets injected
into the container.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
When sharing a network namespace, containers should also share
resolv.conf and /etc/hosts in case a container process made
changes to either (for example, if I set up a VPN client in
container A and join container B to its network namespace, I
expect container B to use the DNS servers from A to ensure it can
see everything on the VPN).
Resolves: #1546
Signed-off-by: Matthew Heon <mheon@redhat.com>
Instead of forcing another user lookup when mounting image
volumes, just use the information we looked up when we started
generating the spec.
This may resolve #1817
Signed-off-by: Matthew Heon <mheon@redhat.com>
Using the default capabilities, we can determine which caps were
added and dropped. Now add them to the security context structure.
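A sketch of the comparison (helper and parameter names are hypothetical):

    // computeCapDelta reports which capabilities were added to and
    // dropped from the default set; inputs are names like "CAP_NET_ADMIN".
    func computeCapDelta(defaults, effective []string) (added, dropped []string) {
        defSet := make(map[string]bool, len(defaults))
        for _, c := range defaults {
            defSet[c] = true
        }
        effSet := make(map[string]bool, len(effective))
        for _, c := range effective {
            effSet[c] = true
            if !defSet[c] {
                added = append(added, c) // present now, absent from defaults
            }
        }
        for _, c := range defaults {
            if !effSet[c] {
                dropped = append(dropped, c) // in defaults, removed now
            }
        }
        return added, dropped
    }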
Signed-off-by: baude <bbaude@redhat.com>
If one of storage GraphRoot or RunRoot is specified, but the
other is not, c/storage will not use the default, and will throw
an error instead. Ensure that in cases where this would happen,
we populate the fields with the c/storage defaults ourselves.
Signed-off-by: Matthew Heon <mheon@redhat.com>
as with podman stop of containers, we should allow the user to specify
a timeout override when stopping pods; otherwise they have to wait
the full timeout specified during pod/container creation.
Signed-off-by: baude <bbaude@redhat.com>
i.e. actually reflect the environment variable and/or rootless mode
instead of always using the default path.
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
DockerRegistryOptions.DockerInsecureSkipTLSVerify as a types.OptionalBool
can now represent that value, so forceSecure is redundant.
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Following SystemContext.DockerInsecureSkipTLSVerify, make the
DockerRegistryOne also an OptionalBool, and update callers.
Explicitly document that --tls-verify=true and --tls-verify unset
have different behavior in those commands where the behavior changed
(or where it hasn't changed but the documentation needed updating).
Also make the --tls-verify man page sections a tiny bit more consistent
throughout.
This is a minimal fix, without changing the existing "--tls-verify=true"
paths nor existing manual insecure registry lookups.
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
containers inside pods need to make sure they get /etc/resolv.conf
and /etc/hosts bind mounted when a network is expected
Signed-off-by: baude <bbaude@redhat.com>
Previously not needed as it only worked inside of Batch(), but
now that it can be called anywhere we need to add mutual
exclusion on its config changes.
Signed-off-by: Matthew Heon <mheon@redhat.com>
With the changes made recently to ensure Podman does not hit the
OCI runtime as often to sync state, we can find ourselves in a
situation where the runtime's state does not match ours.
Add a --sync flag to podman rm to ensure we can still remove
containers when this happens.
Signed-off-by: Matthew Heon <mheon@redhat.com>
Add support for podman volume and its subcommands.
The commands supported are:
podman volume create
podman volume inspect
podman volume ls
podman volume rm
podman volume prune
This is a tool to manage volumes used by podman. For now it only handles
named volumes, but eventually it will handle all volumes used by podman.
Signed-off-by: umohnani8 <umohnani@redhat.com>
Allow the user to prune unused/unnamed images (the layer images from
building) via podman rmi --prune.
Allow the user to prune stopped/exited containers via podman rm --prune.
This should resolve #1910
Signed-off-by: baude <bbaude@redhat.com>
When an image config sets config.User [1] to a numeric group (like
1000:1000), but those values do not exist in the container's
/etc/group, libpod is currently breaking:
$ podman run --rm registry.svc.ci.openshift.org/ci-op-zvml7cd6/pipeline:installer --help
error creating temporary passwd file for container 228f6e9943d6f18b93c19644e9b619ec4d459a3e0eb31680e064eeedf6473678: unable to get gid 1000 from group file: no matching entries in group file
However, the OCI spec requires converters to copy numeric uid and gid
to the runtime config verbatim [2].
With this commit, I'm frontloading the "is groupspec an integer?"
check and only bothering with lookup.GetGroup when it was not.
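Roughly, the frontloaded check looks like this (a sketch, not the exact
libpod code):

    // Numeric groupspecs are copied verbatim, per the OCI image-spec
    // conversion rules; only symbolic names go through lookup.GetGroup.
    if gid, err := strconv.Atoi(groupspec); err == nil {
        execUser.Gid = gid // no /etc/group lookup, no mount needed
    } else {
        group, err := lookup.GetGroup(mountPoint, groupspec)
        if err != nil {
            return errors.Wrapf(err, "unable to look up group %q", groupspec)
        }
        execUser.Gid = group.Gid
    }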
I've also removed a few .Mounted checks, which are originally from
00d38cb3 (podman create/run need to load information from the image,
2017-12-18, #110). We don't need a mounted container filesystem to
translate integers. And when the lookup code needs to fall back to
the mounted root to translate names, it can handle erroring out
internally (and looking it over, it seems to do that already).
[1]: https://github.com/opencontainers/image-spec/blame/v1.0.1/config.md#L118-L123
[2]: https://github.com/opencontainers/image-spec/blame/v1.0.1/conversion.md#L70
Signed-off-by: W. Trevor King <wking@tremily.us>
We don't use CNI to configure networks for rootless containers,
so no need to set it up. It may also cause issues with inotify,
so disabling it resolves some potential problems.
Signed-off-by: Matthew Heon <mheon@redhat.com>
Instead of storing the runtime's file lock dir in the BoltDB
state, refer to the runtime inside the Bolt state instead, and
use the path stored in the runtime.
This is necessary since we moved DB initialization very far up in
runtime init, before the locks dir is properly initialized (and
it must happen before the locks dir can be created, as we use the
DB to retrieve the proper path for the locks dir now).
Signed-off-by: Matthew Heon <mheon@redhat.com>
Part of the motivation for 800eb863 (Hooks supports two directories,
process default and override, 2018-09-17, #1487) was [1]:
> We only use this for override. The reason this was caught is people
> are trying to get hooks to work with CoreOS. You are not allowed to
> write to /usr/share... on CoreOS, so they wanted podman to also look
> at /etc, where users and third parties can write.
But we'd also been disabling hooks completely for rootless users. And
even for root users, the override logic was tricky when folks actually
had content in both directories. For example, if you wanted to
disable a hook from the default directory, you'd have to add a no-op
hook to the override directory.
Also, the previous implementation failed to handle the case where
there were hooks defined in the override directory but the default
directory did not exist:
$ podman version
Version: 0.11.2-dev
Go Version: go1.10.3
Git Commit: "6df7409cb5a41c710164c42ed35e33b28f3f7214"
Built: Sun Dec 2 21:30:06 2018
OS/Arch: linux/amd64
$ ls -l /etc/containers/oci/hooks.d/test.json
-rw-r--r--. 1 root root 184 Dec 2 16:27 /etc/containers/oci/hooks.d/test.json
$ podman --log-level=debug run --rm docker.io/library/alpine echo 'successful container' 2>&1 | grep -i hook
time="2018-12-02T21:31:19-08:00" level=debug msg="reading hooks from /usr/share/containers/oci/hooks.d"
time="2018-12-02T21:31:19-08:00" level=warning msg="failed to load hooks: {}%!(EXTRA *os.PathError=open /usr/share/containers/oci/hooks.d: no such file or directory)"
With this commit:
$ podman --log-level=debug run --rm docker.io/library/alpine echo 'successful container' 2>&1 | grep -i hook
time="2018-12-02T21:33:07-08:00" level=debug msg="reading hooks from /usr/share/containers/oci/hooks.d"
time="2018-12-02T21:33:07-08:00" level=debug msg="reading hooks from /etc/containers/oci/hooks.d"
time="2018-12-02T21:33:07-08:00" level=debug msg="added hook /etc/containers/oci/hooks.d/test.json"
time="2018-12-02T21:33:07-08:00" level=debug msg="hook test.json matched; adding to stages [prestart]"
time="2018-12-02T21:33:07-08:00" level=warning msg="implicit hook directories are deprecated; set --hooks-dir="/etc/containers/oci/hooks.d" explicitly to continue to load hooks from this directory"
time="2018-12-02T21:33:07-08:00" level=error msg="container create failed: container_linux.go:336: starting container process caused "process_linux.go:399: container init caused \"process_linux.go:382: running prestart hook 0 caused \\\"error running hook: exit status 1, stdout: , stderr: oh, noes!\\\\n\\\"\""
(I'd set up the hook to error out). You can see that it's silently
ignoring the ENOENT for /usr/share/containers/oci/hooks.d and
continuing on to load hooks from /etc/containers/oci/hooks.d.
When it loads the hook, it also logs a warning-level message
suggesting that callers explicitly configure their hook directories.
That will help consumers migrate, so we can drop the implicit hook
directories in some future release. When folks *do* explicitly
configure hook directories (via the newly-public --hooks-dir and
hooks_dir options), we error out if they're missing:
$ podman --hooks-dir /does/not/exist run --rm docker.io/library/alpine echo 'successful container'
error setting up OCI Hooks: open /does/not/exist: no such file or directory
I've dropped the trailing "path" from the old, hidden --hooks-dir-path
and hooks_dir_path because I think "dir(ectory)" is already enough
context for "we expect a path argument". I consider this name change
non-breaking because the old forms were undocumented.
Coming back to rootless users, I've enabled hooks now. I expect they
were previously disabled because users had no way to avoid
/usr/share/containers/oci/hooks.d which might contain hooks that
required root permissions. But now rootless users will have to
explicitly configure hook directories, and since their default config
is from ~/.config/containers/libpod.conf, it's a misconfiguration if
it contains hooks_dir entries which point at directories with hooks
that require root access. We error out so they can fix their
libpod.conf.
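The resulting directory handling can be sketched as follows (simplified;
hookDirectories and loadHooks are hypothetical stand-ins for the real
hooks-package plumbing):

    for _, dir := range hookDirectories { // {path, explicit} pairs
        hooks, err := loadHooks(dir.path)
        if err != nil {
            if os.IsNotExist(err) && !dir.explicit {
                continue // missing implicit directories are skipped silently
            }
            return errors.Wrap(err, "error setting up OCI Hooks")
        }
        if len(hooks) > 0 && !dir.explicit {
            logrus.Warnf("implicit hook directories are deprecated; set --hooks-dir=%q explicitly to continue to load hooks from this directory", dir.path)
        }
        allHooks = append(allHooks, hooks...)
    }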
[1]: https://github.com/containers/libpod/pull/1487#discussion_r218149355
Signed-off-by: W. Trevor King <wking@tremily.us>
We don't need this for anything more than rootless work in Libpod
now, but Buildah still uses it as it was originally written, so
leave it intact as part of our API.
Signed-off-by: Matthew Heon <mheon@redhat.com>
When graphroot is set by the user, we should set libpod's static
directory to a subdirectory of that by default, to duplicate
previous behavior.
Signed-off-by: Matthew Heon <mheon@redhat.com>
Ensure that the directory where we will create the Podman db
exists prior to creating the database - otherwise creating the DB
will fail.
Signed-off-by: Matthew Heon <mheon@redhat.com>
When validating fields against the DB, report more verbosely the
name of the field being validated if it fails. Specifically, add
the name used in config files, so people will actually know what
to change if errors happen.
Signed-off-by: Matthew Heon <mheon@redhat.com>
Ensure we don't break the unit tests by creating a locks
directory (which, prior to the last commit, would be created by
BoltDB state init).
Signed-off-by: Matthew Heon <mheon@redhat.com>
We already create the locks directory as part of the libpod
runtime's init - no need to do it again as part of BoltDB's init.
Signed-off-by: Matthew Heon <mheon@redhat.com>
Previous commits ensured that we would use database-configured
paths if not explicitly overridden.
However, our runtime generation did unconditionally override
storage config, which made this useless.
Move rootless storage configuration setup to libpod, and change
storage setup so we only override if a setting is explicitly
set, so we can still override what we want.
Signed-off-by: Matthew Heon <mheon@redhat.com>
If the DB contains default paths, and the user has not explicitly
overridden them, use the paths in the DB over our own defaults.
The DB validates these paths, so it would error and prevent
operation if they did not match. As such, instead of erroring, we
can use the DB's paths instead of our own.
Signed-off-by: Matthew Heon <mheon@redhat.com>
To configure runtime fields from the database, we need to know
whether they were explicitly overwritten by the user (we don't
want to overwrite anything that was explicitly set). Store a
struct containing whether the variables we'll grab from the DB
were explicitly set by the user so we know what we can and can't
overwrite.
This determines whether libpod runtime and static dirs were set
via config file in a horribly hackish way (double TOML decode),
but I can't think of a better way, and it shouldn't be that
expensive as the libpod config is tiny.
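One way to implement that double decode (a sketch using BurntSushi/toml;
the struct and field names are illustrative):

    // The first decode fills the real config, on top of compiled-in
    // defaults. The second decode targets a pointer-only struct: any
    // field still nil afterwards was not present in the file.
    type tomlOptionsConfig struct {
        StaticDir *string `toml:"static_dir"`
        TmpDir    *string `toml:"tmp_dir"`
    }

    var explicit tomlOptionsConfig
    if _, err := toml.DecodeFile(configPath, &explicit); err != nil {
        return err
    }
    runtime.configuredFrom.staticDirSet = explicit.StaticDir != nil
    runtime.configuredFrom.tmpDirSet = explicit.TmpDir != nil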
Signed-off-by: Matthew Heon <mheon@redhat.com>
Previously, we implicitly validated runtime configuration against
what was stored in the database as part of database init. Make
this an explicit step, so we can call it after the database has
been initialized. This will allow us to retrieve paths from the
database and use them to overwrite our defaults if they differ.
Signed-off-by: Matthew Heon <mheon@redhat.com>
When we configure a runtime, we now will need to hit the DB early
on, so we can verify the paths we're going to use for c/storage are correct.
Signed-off-by: Matthew Heon <mheon@redhat.com>
When we create a Libpod database, we store a number of runtime
configuration fields in it. If we can retrieve those, we can use
them to configure the runtime to match the DB instead of inbuilt
defaults, helping to ensure that we don't error in cases where
our compiled-in defaults changed.
Signed-off-by: Matthew Heon <mheon@redhat.com>
Currently we are mounting /dev/shm from disk; it should be a tmpfs.
User namespaces support tmpfs mounts for non-root users, so this section of
code should work fine in both root and rootless mode.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
When running unit tests on newer golang versions, we observe failures with
some formatting types when not declared correctly.
Signed-off-by: baude <bbaude@redhat.com>
with https://github.com/opencontainers/runc/pull/1807 we moved the
systemd notify initialization from "create" to "start", so that the
OCI runtime doesn't hang while waiting on reading from the notify
socket. This means we also need to set the correct NOTIFY_SOCKET when
starting the container.
Closes: https://github.com/containers/libpod/issues/746
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
We now default to setting the "nodev" storage option. When running
privileged containers, we need to turn this off so the processes can
manipulate the image.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
prefer an fsnotify watcher to polling the file; we take advantage of
inotify on Linux and react more promptly to the PID file being
created.
If the watcher cannot be created, then fall back to the old polling
mechanism.
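A sketch of the watcher-with-fallback logic (simplified; waitPidFilePolling
stands in for the old polling loop):

    // Prefer inotify via fsnotify; fall back to polling when the
    // watcher cannot be created or the directory cannot be watched.
    watcher, err := fsnotify.NewWatcher()
    if err != nil {
        return waitPidFilePolling(pidPath, timeout)
    }
    defer watcher.Close()
    if err := watcher.Add(filepath.Dir(pidPath)); err != nil {
        return waitPidFilePolling(pidPath, timeout)
    }
    for {
        select {
        case event := <-watcher.Events:
            if event.Name == pidPath && event.Op&fsnotify.Create == fsnotify.Create {
                return nil // PID file created; react immediately
            }
        case <-time.After(timeout):
            return errors.New("timed out waiting for PID file")
        }
    }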
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
The conmon exit command runs inside of a namespace where the
process has uid=0. When it launches podman again for the cleanup,
podman does not detect rootless mode, since the uid is 0.
Export some more env variables to tell podman we are in rootless
mode.
Closes: https://github.com/containers/libpod/issues/1859
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
CRIU can checkpoint and restore processes/containers with established
TCP connections if the correct option is specified. To implement
checkpoint and restore with support for established TCP connections with
Podman this commit adds the necessary options to runc during checkpoint
and also tells conmon during restore to use 'runc restore' with
'--tcp-established'.
For this Podman feature to work a corresponding conmon change is
required.
Example:
$ podman run --tmpfs /tmp --name podman-criu-test -d docker://docker.io/yovfiatbeb/podman-criu-test
$ nc `podman inspect -l | jq -r '.[0].NetworkSettings.IPAddress'` 8080
GET /examples/servlets/servlet/HelloWorldExample
Connection: keep-alive
1
GET /examples/servlets/servlet/HelloWorldExample
Connection: keep-alive
2
$ # Using HTTP keep-alive, multiple requests are sent to the server in the container
$ # Different terminal:
$ podman container checkpoint -l
criu failed: type NOTIFY errno 0
$ # Looking at the log file would show errors because of established TCP connections
$ podman container checkpoint -l --tcp-established
$ # This works now and after the restore the same connection as above can be used for requests
$ podman container restore -l --tcp-established
The restore would fail without '--tcp-established' as the checkpoint image
contains established TCP connections.
Signed-off-by: Adrian Reber <areber@redhat.com>
This is basically the same change as
ff47a4c2d5 (Use a struct to pass options to Checkpoint())
just for the Restore() function. It is used to pass multiple restore
options to the API and down to conmon which is used to restore
containers. This is for the upcoming changes to support checkpointing
and restoring containers with '--tcp-established'.
Signed-off-by: Adrian Reber <areber@redhat.com>
My host system runs Fedora Silverblue 29 and I have NetworkManager's
`dns=dnsmasq` setting enabled, so my `/etc/resolv.conf` only has
`127.0.0.1`.
I also run my development podman containers with `--net=host`
for various reasons.
If we have a host network namespace, there's no reason not to just
use the host's nameserver configuration either.
This fixes e.g. accessing content on a VPN, and is also faster
since the container is using cached DNS.
I know this doesn't solve the bigger picture issue of localhost-DNS
conflicting with bridged networking, but that's far more involved,
probably requiring a DNS proxy in the container. This patch
makes my workflow a lot nicer and was easy to write.
Signed-off-by: Colin Walters <walters@verbum.org>
don't wait for the timeout to expire if the runtime process exited.
I've noticed podman hang on exit, keeping the container lock
taken even though the OCI runtime had already exited.
Additionally, it reduces the waiting time as we won't hit the 25
milliseconds waiting time in the worst case.
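The idea, sketched (names are illustrative): wait on the runtime process in
a goroutine and take whichever finishes first:

    done := make(chan error, 1)
    go func() {
        done <- cmd.Wait() // returns as soon as the runtime exits
    }()
    select {
    case err := <-done:
        return err // no need to sleep out the remaining interval
    case <-time.After(stopTimeout):
        // runtime still alive after the grace period; escalate
        return cmd.Process.Kill()
    }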
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Add an exists subcommand to podman container and podman image that allows
users to verify the existence of a container or image by ID or name. The return
code can be 0 (success), 1 (failed to find), or 125 (failed to work with runtime).
Issue #1845
Signed-off-by: baude <bbaude@redhat.com>
Set the root propagation based on the properties of volumes and default
mounts. To maintain compatibility, follow the semantics of Docker. If a
volume is shared, keep the root propagation shared which works for slave
and private volumes too. For slave volumes, it can either be shared or
rshared. Do not change the root propagation for private volumes and
stick with the default.
Fixes: #1834
Signed-off-by: Valentin Rothberg <vrothberg@suse.com>
CRIU supports leaving processes running after checkpointing:
-R|--leave-running leave tasks in running state after checkpoint
runc also supports leaving containers running after checkpointing:
--leave-running leave the process running after checkpointing
With this commit the support to leave a container running after
checkpointing is brought to Podman:
--leave-running, -R leave the container running after writing checkpoint to disk
Now it is possible to checkpoint a container at some point in time
without stopping the container. This can be used to roll back the
container to an earlier state:
$ podman run --tmpfs /tmp --name podman-criu-test -d docker://docker.io/yovfiatbeb/podman-criu-test
$ curl 10.88.64.253:8080/examples/servlets/servlet/HelloWorldExample
3
$ podman container checkpoint -R -l
$ curl 10.88.64.253:8080/examples/servlets/servlet/HelloWorldExample
4
$ curl 10.88.64.253:8080/examples/servlets/servlet/HelloWorldExample
5
$ podman stop -l
$ podman container restore -l
$ curl 10.88.64.253:8080/examples/servlets/servlet/HelloWorldExample
4
So after checkpointing the container kept running and was stopped after
some time. Restoring this container will restore the state right at the
checkpoint.
Signed-off-by: Adrian Reber <areber@redhat.com>
For upcoming changes to the Checkpoint() functions this commit switches
checkpoint options from a boolean to a struct, so that additional
options can be passed easily to Checkpoint() without changing the
function parameters all the time.
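The shape of the change, roughly (the exact field set shown here is
illustrative):

    // Before: Checkpoint(ctx context.Context, keep bool) error
    // After: new options become new struct fields, not new parameters.
    type ContainerCheckpointOptions struct {
        Keep           bool // retain checkpoint artifacts on disk
        TCPEstablished bool // added by the follow-up commits
        KeepRunning    bool
    }

    func (c *Container) Checkpoint(ctx context.Context, options ContainerCheckpointOptions) error {
        // ... checkpoint logic consults options instead of bool flags
        return nil
    }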
Signed-off-by: Adrian Reber <areber@redhat.com>
otherwise runc will default to the value used for creating the
container. Setting it explicitly overrides the default, and we
won't end up trying to use a terminal when one is not available.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1625876
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
we need to allow users to expose ports to the host for the purposes
of networking, like a webserver. the port exposure must be done at
the time the pod is created.
strictly speaking, the port exposure occurs on the infra container.
Signed-off-by: baude <bbaude@redhat.com>
scope out a new kube subcommand where we can add generate. you can now
generate kubernetes YAML that will allow you to run the container in a
kubernetes environment. The YAML description will always "wrap" a
container in a simple v1.Pod description.
Tests and further documentation will be added in additional PRs.
This function should be considered very much "under heavy development" at
this point.
Signed-off-by: baude <bbaude@redhat.com>
At scale, it appears that we sometimes hit the 1000ms timeout to create
the PID file when a container is created or executed.
Increasing the value to 60s should help when running a lot of containers
in a heavily-loaded environment.
Related: #1495
Fixes: #1816
Signed-off-by: Emilien Macchi <emilien@redhat.com>
/etc/resolv.conf and /etc/hosts should not be created and mounted when the
network is disabled.
We should not be calling the network setup and cleanup functions when it is
disabled either.
In doing this patch, I found that all of the bind mounts were particular to
Linux along with the generate functions, so I moved them to
container_internal_linux.go
Since we are checking if we are using a network namespace, we need to check
after the network namespace has been created in the spec.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
Redefining err with the := operator within a block makes that err variable
block-local.
Addressing lint:
libpod/oci.go:368:3:warning: ineffectual assignment to err (ineffassign)
Signed-off-by: Šimon Lukašík <slukasik@redhat.com>
Add a rootless field to the info data (e.g., `podman info`) to indicate
if the executing user is root or not. In most cases, this can be
guessed but now it is clear and may aid in debugging, reporting and
understanding certain issues.
Signed-off-by: Valentin Rothberg <vrothberg@suse.com>
We are seeing some issues where, when part of prepare() fails
(originally noticed due to a bad static IP), the other half does
not successfully clean up, and the state can be left in a bad
place (not knowing about an active SHM mount for example).
Signed-off-by: Matthew Heon <mheon@redhat.com>
This one is tricky. By using the := operator we made the err variable local
to the goroutine and distinct from the err variable in the surrounding
function. Thus the createContainer function always returned nil, even when
an error occurred in the goroutine.
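The bug pattern, distilled (doWork is a hypothetical stand-in):

    var err error
    done := make(chan struct{})
    go func() {
        // BUG: "err := doWork()" would declare a new, goroutine-local
        // err shadowing the outer one, so the surrounding function
        // would always return nil.
        err = doWork() // fix: plain assignment to the outer err
        close(done)
    }()
    <-done
    return err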
Signed-off-by: Šimon Lukašík <slukasik@redhat.com>
We now can remove a paused container by sending it a kill signal while it
is paused. We then unpause the container and it is immediately killed.
Also, reworked how the parallelWorker results are handled to provide a
more consistent approach to how each subcommand implements it. It also
fixes a bug where if one container errors, the error message is duplicated
when printed out.
Signed-off-by: baude <bbaude@redhat.com>
once we changed configureNetNS to return a result beyond an error,
we need to make sure that we use locals instead of ctr attributes
when determining networks.
Resolves: #1752
Signed-off-by: baude <bbaude@redhat.com>
Container images can be created without a passwd or group file. Currently,
if one of these containers is run with a --user flag, the container blows
up complaining about a missing /etc/passwd file.
We just need to check if the error on read is ENOENT and then allow the
read to return rather than fail.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
When we read the conmon error status file, if Atoi fails to parse
the string we read from the file as an int, print the string as
part of the error message so we know what might have gone wrong.
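A sketch of the improved error path:

    statusStr := strings.TrimSpace(string(data))
    statusCode, err := strconv.Atoi(statusStr)
    if err != nil {
        // Include the raw contents so a corrupt or truncated status
        // file is debuggable from the error message alone.
        return 0, errors.Wrapf(err, "error parsing exit status %q from conmon", statusStr)
    }
    return statusCode, nil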
Signed-off-by: Matthew Heon <mheon@redhat.com>
Instead of running a full sync after starting a container to pick
up its PID, grab it from Conmon instead.
Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
When we scan a container in runc and see that it no longer
exists, we already set ContainerStatusExited to indicate that it
no longer exists in runc. Now, also set an exit code and exit
time, so PS output will make some sense.
Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
When syncing container state, we normally call out to runc to see
the container's status. This does have significant performance
implications, though, and we've seen issues with large amounts of
runc processes being spawned.
This patch attempts to use stat calls on the container exit file
created by Conmon instead to sync state. This massively decreases
the cost of calling updateContainer (it has gone from an
almost-unconditional fork/exec of runc to a single stat call that
can be avoided in most states).
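Conceptually (a sketch; the real updateContainer logic covers more states,
and the exit-file path shown is illustrative):

    // If conmon wrote an exit file for the container, it has exited;
    // a single stat answers that without forking the OCI runtime.
    exitFile := filepath.Join(exitsDir, ctr.ID())
    if _, err := os.Stat(exitFile); err == nil {
        // read exit code and finish time from the file, mark Exited
    } else if !os.IsNotExist(err) {
        return errors.Wrapf(err, "error checking exit file for container %s", ctr.ID())
    }
    // ENOENT: no exit file yet, so the container is still running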
Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
After stopping containers, we run updateContainerStatus to sync
our state with runc (pick up exit code, for example). Then we
proceed to not save this to the database, requiring us to grab it
again on the next sync. This should remove the need to read the
exit file more than once.
Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
it is not writeable by non-root users so there is no point in having
access to it from a container.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
For pods using cgroupfs, we were seeing some error messages in CI
from an inability to remove the pod CGroup, which was traced down
to the conmon cgroup still being present as a child. Try to
remove these error messages and ensure successful CGroup deletion
by removing the conmon CGroup first, then the pod cgroup.
Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
If for any reason slirp4netns fails at startup, podman waits
indefinitely. Check every second if the process is still running so
that we avoid hanging.
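A sketch of the liveness probe (readyChan is a hypothetical readiness
signal; signal 0 only tests whether the PID still exists):

    for {
        select {
        case <-readyChan:
            return nil // slirp4netns is up
        case <-time.After(1 * time.Second):
            // kill -0: probe the process without actually signaling it
            if err := cmd.Process.Signal(syscall.Signal(0)); err != nil {
                return errors.New("slirp4netns failed to start")
            }
        }
    }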
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
It was setting the wrong variable (CamelCase)
in the wrong module ("main", not "libpod")...
Signed-off-by: Anders F Björklund <anders.f.bjorklund@gmail.com>
conmon creates a symlink to avoid using a too long UNIX path.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1641800
There is still one issue when the symlink path is the same length as
the attach socket parent directory, since conmon fails
to create the symlink, but that must be addressed in conmon first.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Only return `ErrCtrStateInvalid` errors when the mount counter is equal
to 1. Also fix the "can't unmount [...] last mount[..]" error, which
wasn't being returned when the error passed to `errors.Errorf()` is nil.
Fixes: #1695
Signed-off-by: Valentin Rothberg <vrothberg@suse.com>
for the purposes of performance and security, we use securejoin to construct
the root fs's path so that symlinks are what they appear to be and are not
pointing to something naughty.
then instead of chrooting to parse /etc/passwd|/etc/group, we now use the
runc user/group methods, which saves us quite a bit of time.
Signed-off-by: baude <bbaude@redhat.com>
We implement the securejoin method to make sure the paths to /etc/passwd and
/etc/group are not symlinks to something naughty or outside the container
image. And then instead of actually chrooting, we use the runc functions to
get information about a user. The net result is increased security and
a performance gain from 41ms to 100us.
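The combination looks roughly like this (a sketch; variable names are
illustrative):

    // Resolve /etc/passwd and /etc/group inside the rootfs so that
    // symlinks cannot escape the container image.
    passwdPath, err := securejoin.SecureJoin(mountPoint, "/etc/passwd")
    if err != nil {
        return nil, err
    }
    groupPath, err := securejoin.SecureJoin(mountPoint, "/etc/group")
    if err != nil {
        return nil, err
    }
    // runc's user package parses the files directly; no chroot needed.
    return user.GetExecUserPath(userSpec, &defaults, passwdPath, groupPath)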
Signed-off-by: baude <bbaude@redhat.com>
prepare() -- which consists of creating a network namespace and
mounting the container image -- is now run in parallel. This saves 25-40ms.
Signed-off-by: baude <bbaude@redhat.com>
prevent opening the same file twice, since we re-exec podman in
rootless mode. While at it, also solve a possible race between the
check for the file and writing to it. Another process could have
created the file in the meantime and we would just end up overwriting
it.
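The race-free pattern is an atomic create-if-absent with O_EXCL (a sketch):

    // O_EXCL makes creation atomic: if the re-exec'd podman (or any
    // other process) created the file first, we get EEXIST instead
    // of silently truncating and overwriting it.
    fd, err := os.OpenFile(path, os.O_WRONLY|os.O_CREATE|os.O_EXCL, 0644)
    if err != nil {
        if os.IsExist(err) {
            return nil // already written by someone else; nothing to do
        }
        return err
    }
    defer fd.Close()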
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
when reading the output from conmon using the JSON methods, it appears that
JSON marshalling is higher in pprof than it really is because the pipe is
"waiting" for a response. this gives us a clearer look at the real CPU/time
consumers.
Signed-off-by: baude <bbaude@redhat.com>
We probably won't be able to initialize a firewall plugin when we
are not running as root, so we shouldn't even try. Replace the
less-effective EUID check with the rootless package's better check
to make sure we don't accidentally set up the firewall in these
cases.
Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
The newly introduced CRIU version check is now used to make sure
checkpointing and restoring is only used if the CRIU version is new
enough.
Signed-off-by: Adrian Reber <areber@redhat.com>
I've seen a runc zombie process hanging around; it is caused by not
cleaning up the "$OCI status" process. Also adjust another location
that has the same issue.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
We already have functions for retrieving the container's CGroup
path, so use them instead of manually generating a path.
Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
We are still requiring oci-systemd-hook to be installed in order to run
systemd within a container. This patch properly mounts
/sys/fs/cgroup/systemd/libpod_parent/libpod-UUID on /sys/fs/cgroup/systemd inside of the container.
Since we need the UUID of the container, we needed to move Systemd to be a config option of the
container.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>