We are only using imageID on that branch, so it is
more consistent.
Should not change behavior; in callers, either
both are set or neither.
[NO NEW TESTS NEEDED]
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
This updates the container-device-interface dependency to v0.6.2 and renames the import to
tags.cncf.io/container-device-interface to make use of the new vanity URL.
[NO NEW TESTS NEEDED]
Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Signed-off-by: Evan Lezar <elezar@nvidia.com>
Docker allows passing -1 to indicate the maximum limit allowed for the
current process.
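Roughly, resolving such a value could look like the following sketch
using golang.org/x/sys/unix (not necessarily the exact Podman code):

    package limits

    import "golang.org/x/sys/unix"

    // resolveLimit maps Docker's -1 convention to the maximum (hard)
    // limit currently allowed for this process.
    func resolveLimit(resource int, value int64) (uint64, error) {
        if value != -1 {
            return uint64(value), nil
        }
        var rlim unix.Rlimit
        if err := unix.Getrlimit(resource, &rlim); err != nil {
            return 0, err
        }
        return rlim.Max, nil
    }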
Fixes: https://github.com/containers/podman/issues/19319
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
when --uts=host is provided, the expectation is to use the hostname
from the host, not the container name.
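A minimal sketch of the intended behavior (the helper and flag are
hypothetical, for illustration only):

    package uts

    import "os"

    // chooseHostname: with --uts=host the container must report the
    // host's hostname instead of the container name.
    func chooseHostname(utsHost bool, containerName string) (string, error) {
        if utsHost {
            return os.Hostname()
        }
        return containerName, nil
    }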
Closes: https://github.com/containers/podman/issues/20448
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
This solves `--security-opt unmask=ALL` still masking the path.
[NO NEW TESTS NEEDED] Can't easily test this as we do not have
access to it in CI.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
All `[]string`s in containers.conf have now been migrated to attributed
string slices, which requires some adjustments in Buildah and Podman.
[NO NEW TESTS NEEDED]
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
I don't really like this solution because it can't be undone by
`--security-opt unmask=all` but I don't see another way to make
this retroactive. We can potentially change things up to do this
the right way with 5.0 (actually have it in the list of masked
paths, as opposed to adding it at spec finalization as we do now).
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
Add podman farm build command that sends out builds to
nodes defined in the farm, builds the images on the farm
nodes, and pulls them back to the local machine to create
a manifest list.
Signed-off-by: Urvashi Mohnani <umohnani@redhat.com>
There is no need to carry these stub implementations that just error
anyway. The libpod package can only ever be used on linux and freebsd,
and the remote client should never import libpod directly.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
There is a potential race condition where we see a message about a
removed container; it could be caused by a container that is not
mounted, and this change should clarify which one is the cause.
Also, if the container does not exist, just warn the user instead of
reporting an error; there is not much the user can do about it.
Fixes: https://github.com/containers/podman/issues/19702
[NO NEW TESTS NEEDED]
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
commit 7ade972102 introduced the change
that caused an issue in crun since it forces the root user session
instead of the system one when DBUS_SESSION_BUS_ADDRESS is set.
I am addressing it in crun, but for the time being, let's also not
pass the variable down to conmon since the assumption is that when
running as root the containers must be created on the system bus.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
always clean up the exec session when the command specified to
"exec" is not found.
Closes: https://github.com/containers/podman/issues/20392
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Updated the error message to suggest that the user use the --replace
option to instruct Podman to replace the existing external container
with a newly created one.
Closes: #16759
Signed-off-by: Chetan Giradkar <cgiradka@redhat.com>
When trying to connect a container to a network and the connection
already exists, an error should only be raised if the container is
already running (or is in the `ContainerStateCreated` transition)
to mimic the behavior of Docker as described here:
https://github.com/containers/podman/pull/15516#issuecomment-1229265942
For running and connected containers 403 is returned, which fixes #20365
Signed-off-by: Philipp Fruck <dev@p-fruck.de>
When a userns and netns are used we need to let the runtime create the
netns, otherwise the netns is not owned by the right userns and thus
the capabilities would not be correct.
The current restart logic tries to reuse the netns, which is fine if no
userns is used. When one is used we set up a new netns (which is
correct) but forgot to clean up the old one. This resulted in leaked
network namespaces and, because no teardown was ever called, leaked
ipam assignments; thus a quickly restarting container will run out of
IP space very fast.
Fixes #18615
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Allow users to specify
podman-remote top $cid -eo "pid comm"
or
podman-remote top $cid -eo pid,comm
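Parsing that accepts both forms can be sketched like this (illustrative
only, not the actual Podman code):

    package top

    import "strings"

    // parseDescriptors accepts `-eo "pid comm"` as well as `-eo pid,comm`
    // by splitting every argument on commas and spaces alike.
    func parseDescriptors(args []string) []string {
        var out []string
        for _, arg := range args {
            fields := strings.FieldsFunc(arg, func(r rune) bool {
                return r == ',' || r == ' '
            })
            out = append(out, fields...)
        }
        return out
    }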
Fixes: https://github.com/containers/podman/issues/19176
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
new file: test/system/085-top.bats
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
This fixes a regression caused by commit 7e6e267329; unfortunately this
was not caught during review as, for some reason, it works fine
rootless and only fails as root.
Because we set the systemd log level to notice in order to hide the
unit started/stopped messages and prevent spamming the journal, systemd
now also ignores the events we write to journald, as we send them at
info level.
To fix this we simply send health_status events at notice level now. I
decided against sending all events at notice as I think info is fine
for them. Whether the notice level is right is of course debatable, but
given that it may contain the unhealthy message I think making this a
notice should be ok.
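A sketch of sending the event at notice level via
github.com/coreos/go-systemd/v22/journal (the journal field name here
is illustrative):

    package events

    import "github.com/coreos/go-systemd/v22/journal"

    // writeHealthEvent sends the health_status event at notice priority
    // so a unit running with LogLevelMax=notice does not filter it out.
    func writeHealthEvent(msg string) error {
        return journal.Send(msg, journal.PriNotice, map[string]string{
            "PODMAN_EVENT": "health_status", // illustrative field name
        })
    }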
The main reason this made it through testing is that we do not rely on
the systemd unit to fire healthchecks in the tests, as this is flaky.
There is one test where we rely on it though, and I added a check there
to make sure events are displayed correctly when triggered via systemd.
Fixes #20342
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
When containers are created with a named volume they can deadlock
because the create logic tried to lock all volumes in a loop. This is
fine if only a single container is ever created at any given time;
however, because multiple containers can be created at the same time,
they can deadlock on the volumes. This is because the order of the loop
is not stable; in fact, it is based on the order in which the volumes
were specified on the cli.
So if you create two containers at the same time, one with
`-v vol1:/dir1 -v vol2:/dir2` and the other with
`-v vol2:/dir2 -v vol1:/dir1`, then there is a chance of a deadlock.
Now one solution could be to order the volumes to prevent the issue,
but the reason for holding the lock is dubious. The goal was to prevent
the volume from being removed in the meantime. However, that could
still have happened before we acquired the lock, so it didn't protect
against that.
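For reference, the classic alternative would have been to take the
locks in a stable order; a sketch of that approach (not what this
commit does, since the locking is dropped entirely):

    package volumes

    import (
        "sort"
        "sync"
    )

    // lockAll sorts the volume names first so every caller acquires the
    // locks in the same order, which prevents the ABBA deadlock above.
    func lockAll(locks map[string]*sync.Mutex, names []string) {
        sorted := append([]string(nil), names...)
        sort.Strings(sorted)
        for _, name := range sorted {
            locks[name].Lock()
        }
    }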
Both boltdb and sqlite already prevent us from adding a container with
volumes that do not exist due to their internal consistency checks.
Sqlite even uses FOREIGN KEY relationships so the schema will prevent us
from doing anything wrong.
The create code currently first checks if the volume exists and if not
creates it. I have checked that the db will guarantee that this will not
work:
Boltdb: `no volume with name test2 found in database when adding container xxx: no such volume`
Sqlite: `adding container volume test2 to database: FOREIGN KEY constraint failed`
Keep in mind that this error is normally not seen; only if the volume
is removed between the volume-exists check and adding the container to
the db will this message be seen, which is an acceptable race and a
pre-existing condition anyway.
[NO NEW TESTS NEEDED] Race condition, hard to test in CI.
Fixes #20313
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Use sqlite as default but for upgrades it will still use boltdb to avoid
breaking anyone. This is done by checking if the boltdb file already
exists and if it does then we have to use it.
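The detection boils down to something like this sketch (assuming
bolt_state.db is the boltdb state file under the static dir):

    package db

    import (
        "os"
        "path/filepath"
    )

    // pickBackend defaults to sqlite but keeps boltdb when its database
    // file already exists from a previous installation.
    func pickBackend(staticDir string) string {
        if _, err := os.Stat(filepath.Join(staticDir, "bolt_state.db")); err == nil {
            return "boltdb"
        }
        return "sqlite"
    }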
I added an e2e test to check the new logic and removed the system test
for it. The problem with the system test is that we share the storage
dir there, so all following commands without --db-backend would try to
use boltdb, as a single --db-backend boltdb command will create the
file and then all following commands will use it because of the
backwards compat. In e2e tests each test uses its own --root so it is
not an issue there.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
when running as a service, the c.state.Mounted flag could get out of
sync if the container is cleaned up through the cleanup process.
To avoid this, always check if the mountpoint is really present before
skipping the mount.
[NO NEW TESTS NEEDED]
Closes: https://github.com/containers/podman/issues/17042
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Podman server logs are mostly full of healthcheck output, making them
hard to navigate. Hence, the healthcheck service now runs with
LogLevelMax=notice, which removes the normal output, including the
started/stopped messages from systemd itself.
Fixes #17856
Signed-off-by: Chetan Giradkar <cgiradka@redhat.com>
Add --rdt-class=COS to the create and run command to enable the
assignment of a container to a Class of Service (COS). The COS
represents a part of the cache based on the Cache Allocation Technology
(CAT) feature that is part of Intel's Resource Director Technology
(Intel RDT) feature set. By assigning a container to a COS, all PIDs
of the container only have access to the cache space defined for this
COS. The COS has to be pre-configured through the resctrl kernel
driver. The cat_l2 and cat_l3 flags in /proc/cpuinfo indicate CAT
support for cache levels 2 and 3, respectively.
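In OCI terms this maps to the runtime spec's IntelRdt section; a sketch
using the runtime-spec types:

    package rdt

    import spec "github.com/opencontainers/runtime-spec/specs-go"

    // setRdtClass assigns the container to a pre-configured Class of
    // Service by setting the OCI spec's IntelRdt.ClosID field.
    func setRdtClass(s *spec.Spec, cos string) {
        if s.Linux == nil {
            s.Linux = &spec.Linux{}
        }
        s.Linux.IntelRdt = &spec.LinuxIntelRdt{ClosID: cos}
    }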
Signed-off-by: Wolfgang Pross <wolfgang.pross@intel.com>
Pass the _entire_ environment to conmon instead of selectively enabling
only specific variables. The main reasoning is to make sure that conmon
and the podman-cleanup callback process operate in the exact same
environment as the initial podman process. Some configuration files
may be passed via environment variables. Podman not passing those down
to conmon has led to subtle and hard-to-debug issues in the past, so
passing everything down will avoid such issues in the future.
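The change amounts to handing conmon the full environment; a sketch:

    package oci

    import (
        "os"
        "os/exec"
    )

    // startConmon passes the entire environment of the podman process
    // down to conmon instead of a hand-picked subset of variables.
    func startConmon(conmonPath string, args []string) (*exec.Cmd, error) {
        cmd := exec.Command(conmonPath, args...)
        cmd.Env = os.Environ()
        if err := cmd.Start(); err != nil {
            return nil, err
        }
        return cmd, nil
    }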
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
The processing and setting of the static and volume directories was
scattered across the code base (including c/common) leading to subtle
errors that surfaced in #19938.
There were multiple issues that I try to summarize below:
- c/common loaded the graphroot from c/storage to set the defaults for
static and volume dir. That ignored Podman's --root flag and
surfaced in #19938 and other bugs. c/common does not set the
defaults anymore, which gives Podman the ability to detect when the
user/admin configured a custom directory (a non-empty value).
- When parsing the CLI, Podman (ab)uses containers.conf structures to
set the defaults but also to override them in case the user specified
a flag. The --root flag overrode the static dir, which is wrong and
broke a couple of use cases. Now there is a dedicated field in the
"PodmanConfig" which also includes a containers.conf struct.
- The defaults for static and volume dir are now being set correctly
and adhere to --root.
- The CONTAINERS_CONF_OVERRIDE env variable has not been passed to the
cleanup process. I believe that _all_ env variables should be passed
to conmon to avoid such subtle bugs.
Overall I find that the code and logic are scattered and hard to
understand and follow. I refrained from larger refactorings as I really
just want to get #19938 fixed and then go back to other priorities.
https://github.com/containers/common/pull/1659 broke three pkg/machine
tests. Those have been commented out until getting fixed.
Fixes: #19938
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
This is not really an error; if the anonymous volume is still in use,
it likely means it was transferred to another container with
--volumes-from. This is what the user wants, and it is not as if the
user can act on the logged error anyway. Once the last user of the
volume is removed, it will be removed correctly.
see https://github.com/containers/podman/pull/19637
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
The --syslog flag has not been passed to the cleanup process (i.e.,
conmon's exit args), complicating debugging quite a bit.
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
The network list compat API requires us to include all containers with
their ip addresses for the selected networks. Because we have no
network -> container mapping in the db, we have to go through all
containers every time. However, the old code did it in the most
ineffective way possible: it queried the containers from the db for
each individual network. That of course is extremely expensive. The
other expensive call is calling Inspect() on the container each time;
Inspect does far more than we need.
To fix this we first query containers only once for the API call, then
replace the inspect call with directly accessing the network status.
This will speed things up a lot!
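A simplified sketch of the one-pass approach (the types are stand-ins
for the libpod ones):

    package compat

    // ctr is a stand-in for a container with its cached network status:
    // network name -> IP address.
    type ctr struct {
        ID       string
        Networks map[string]string
    }

    // containersPerNetwork walks all containers exactly once and groups
    // them by the selected networks, instead of querying the db and
    // calling Inspect() once per network.
    func containersPerNetwork(ctrs []ctr, networks []string) map[string][]string {
        selected := make(map[string]bool, len(networks))
        for _, name := range networks {
            selected[name] = true
        }
        result := make(map[string][]string)
        for _, c := range ctrs {
            for name := range c.Networks {
                if selected[name] {
                    result[name] = append(result[name], c.ID)
                }
            }
        }
        return result
    }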
The reported scenario includes 100 containers and 25 networks.
Previously the API call took 1.5s; now it takes 24ms, which is more
than a 62x improvement. (tested with curl)
[NO NEW TESTS NEEDED] We have no timing tests.
Fixes #20035
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
First, that function claims to deep copy but then actually returns the
original state, so it does not work correctly. Given that there are no
users, just remove it instead of fixing it.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
commit 8b4a79a744 introduced
oom_score_adj clamping when the container oom_score_adj value is lower
than the current one in a rootless environment. Move the check to
init() time so it is performed every time the container starts and not
only when it is created. This is more robust in case the oom_score_adj
value is changed for the current user session.
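A sketch of the clamping as it could run at init() time:

    package oom

    import (
        "os"
        "strconv"
        "strings"
    )

    // clampOOMScoreAdj raises the requested value to the current one
    // when needed: a rootless process may not lower oom_score_adj below
    // the value it currently has.
    func clampOOMScoreAdj(requested int) (int, error) {
        data, err := os.ReadFile("/proc/self/oom_score_adj")
        if err != nil {
            return requested, err
        }
        current, err := strconv.Atoi(strings.TrimSpace(string(data)))
        if err != nil {
            return requested, err
        }
        if requested < current {
            return current, nil
        }
        return requested, nil
    }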
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
HC events were firing as part of the `exec` call, before it had
even been decided whether the HC succeeded or failed. As such,
the status was not going to be correct any time there was a
change (e.g. the first event after a container went healthy to
unhealthy would still read healthy). Move the event into the
actual Healthcheck function and throw it in a defer to make sure
it happens at the very end, after logs are written.
This ignores several conditions that did not log previously (the
container in question does not have a healthcheck, or an internal
failure that should not really happen).
Still not a perfect solution. This relies on the HC log being
written, when instead we could just get the status straight from
the function writing the event - so if we fail to write the log,
we can still report a bad status. But if the log wasn't written,
we're in bad shape regardless - `podman ps` would disagree with
the event written, for example.
Fixes #19237
Signed-off-by: Matt Heon <mheon@redhat.com>
Add support to kube play for the TerminationGracePeriodSeconds field
by mapping its value to podman's stopTimeout.
Add support to kube generate to generate TerminationGracePeriodSeconds
if stopTimeout is set for a container (will ignore podman's default).
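The mapping is essentially the following (a sketch with stand-in
types):

    package kube

    // specGenerator stands in for podman's container spec type.
    type specGenerator struct {
        StopTimeout *uint
    }

    // applyGracePeriod maps the pod spec's TerminationGracePeriodSeconds
    // onto the container's stop timeout.
    func applyGracePeriod(gracePeriod *int64, sg *specGenerator) {
        if gracePeriod != nil {
            t := uint(*gracePeriod)
            sg.StopTimeout = &t
        }
    }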
Signed-off-by: Urvashi Mohnani <umohnani@redhat.com>
When a container is created and it is part of a pod, we ensure the pod
cgroup exists so limits can be applied on the pod cgroup.
Closes: https://github.com/containers/podman/issues/19175
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
accept only the resources to be used by the pod, so that the function
can more easily be used by a subsequent patch.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
move the code that removes the pod cgroup to a separate function. This
is preparation for the next patch.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Under some circumstances podman tries to kill a container
using signal 37, for which unix.SignalName() returns "".
Not helpful. So, when that happens, show "(signal number)".
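A sketch of the fallback:

    package signals

    import (
        "fmt"

        "golang.org/x/sys/unix"
    )

    // signalLabel falls back to "(signal N)" when unix.SignalName has
    // no name for the signal (e.g. real-time signal 37).
    func signalLabel(sig unix.Signal) string {
        if name := unix.SignalName(sig); name != "" {
            return name
        }
        return fmt.Sprintf("(signal %d)", sig)
    }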
Signed-off-by: Ed Santiago <santiago@redhat.com>
Also add a new `StoppedByUser` field to the container-inspect state
which can be useful during debugging and is now also used in the
regression test. Note that I moved the `false` check one test up so
that we can compare against the previous Podman version, which should
just be stuck in the `wait $ctr` command since it will keep restarting.
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
When removing a container's dependency, getting an error that the
container has already been removed (ErrNoSuchCtr and
ErrCtrRemoved) should not be fatal. We wanted the container gone,
it's gone, no need to error out.
[NO NEW TESTS NEEDED] This is a race and thus hard to test for.
Fixes #18874
Signed-off-by: Matt Heon <mheon@redhat.com>
when the "kill" command fails, print the stderr from the OCI runtime
only after we check the container state.
It also simplifies the code since we don't have to hard code the error
messages we want to ignore.
Closes: https://github.com/containers/podman/issues/18452
[NO NEW TESTS NEEDED] it fixes a flake.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Add io.podman.annotations.infra.name annotation to kube play so
users can set the name of the infra container created.
When a pod is created with --infra-name set, the generated
kube yaml will have an infraName annotation set that will
be used when playing the generated yaml with podman.
Signed-off-by: Urvashi Mohnani <umohnani@redhat.com>
The attach API used to always return the Content-Type
`vnd.docker.raw-stream`, however docker api v1.42 added the
`vnd.docker.multiplexed-stream` type when no tty was used.
Follow suit and return the same header for docker api v1.42 and libpod
v4.7.0. This technically allows clients to make a small optimization as
they no longer need to inspect the container to see if they get a raw or
multiplexed stream.
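A sketch of the header selection:

    package attach

    import "net/http"

    // setAttachContentType picks the Content-Type the way docker API
    // v1.42+ does: raw for TTY sessions, multiplexed otherwise.
    func setAttachContentType(w http.ResponseWriter, tty bool) {
        contentType := "application/vnd.docker.raw-stream"
        if !tty {
            contentType = "application/vnd.docker.multiplexed-stream"
        }
        w.Header().Set("Content-Type", contentType)
    }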
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Users want to mount a tmpfs file system with secrets, and make
sure the secret is never saved into swap. They can do this either
by using a ramfs mount or by passing the `noswap` option to a tmpfs
mount.
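Assuming the option is spelled as a regular tmpfs mount option, usage
would look like:

    podman run --mount type=tmpfs,destination=/secrets,noswap IMAGE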
Fixes: https://github.com/containers/podman/issues/19659
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
When conmon is started it blocks and waits for us to signal it to
start via a pipe. This works, but when conmon exits before it reads the
start message, podman fails with `write child: broken pipe`. This error
is meaningless to podman users.
The real error is that conmon failed, so we should not return early if
we fail to send the start message to conmon. Instead, ignore the EPIPE
error case, as it is safe to assume that conmon died, and for other
errors make sure to kill conmon so that the following wait() call does
not hang forever. This also fixes the leaked conmon zombie processes,
as wait() was never called before.
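A sketch of the error handling around the start byte:

    package oci

    import (
        "errors"
        "os"
        "syscall"
    )

    // signalStart ignores EPIPE (conmon already died; the following
    // wait() reports the real failure) but kills conmon on any other
    // write error so that wait() cannot hang forever.
    func signalStart(startPipe *os.File, conmon *os.Process) error {
        if _, err := startPipe.Write([]byte{0}); err != nil {
            if !errors.Is(err, syscall.EPIPE) {
                _ = conmon.Kill()
                return err
            }
        }
        return nil
    }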
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
We need to actually check the output, not just exit codes. While doing
this it became clear that the first test was not checking what it
should, so I had to remove the quotes from the arg.
Also this check did not work with remote testing at all; we must set
the env and then restart the server, as the env for conmon must
obviously be set on the server.
Also we can only match the conmon error messages on the local client.
Lastly, this test requires the journald driver, but journald is not
usable in the container tests, so skip it there.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
This changes /run to /var/run for .containerenv and secrets in FreeBSD
containers for consistency with FreeBSD path conventions. Linux
containers running on FreeBSD hosts continue to use /run for
compatibility.
[NO NEW TESTS NEEDED]
Signed-off-by: Doug Rabson <dfr@rabson.org>
Sometimes there is no output displayed from the podman top command but
no error is shown either. Looking at the code I think the issue here is
that we do not wait for the output reader to end as it runs in a
different goroutine. Thus the last lines of output might be missing.
The fix is simply to wait for said goroutine to finish before returning.
While at it, also fix the missing scanner error check and return read
errors back to the caller.
[NO NEW TESTS NEEDED] It is a flake.
Fixes #19504
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
There is a problem where our tail code does not correctly handle
partial log lines. This can make podman logs --tail output incorrect
lines when k8s-file is used.
This manifests as a flake in CI because partial lines are only
sometimes written, basically whenever the output is flushed before a
newline is written.
Our code must not count partial lines, which was already done, but
importantly we must keep reading backwards until the next full (F)
line, because all partial (P) lines still must be added to that full
line. See the added tests for details on what the log file looks like.
While fixing this, I reworked the tail logic a bit: there is absolutely
no reason to read the lines in a separate goroutine just to pass them
back via a channel. We can do this in the same routine.
The logic is very simple: read the lines backwards, append them to the
result, and at the end invert the result slice, as tail must return the
lines in the correct order. This is more efficient than allocating two
different slices or prepending each line, as prepending would require a
new allocation for every line.
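The inversion at the end is a plain in-place reverse; a sketch:

    package logfile

    // reverse inverts the collected lines in place: they were appended
    // while reading the file backwards, so this restores log order
    // without allocating a second slice.
    func reverse(lines []string) {
        for i, j := 0, len(lines)-1; i < j; i, j = i+1, j-1 {
            lines[i], lines[j] = lines[j], lines[i]
        }
    }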
Lastly, the readFromLogFile() function wrote the lines back to the log
line channel in the same routine that read the log lines; this was bad
and caused a deadlock when the returned lines exceeded the channel
size. There is no reason to allocate a big channel; we can just write
the log lines in a different goroutine, in this case the main routine
where the logs are read anyway.
A new system test and unit tests have been added to check corner cases.
Fixes #19545
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Stop the timer when the function completes, because otherwise it will
cause a memory leak.
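The pattern being applied, roughly:

    package util

    import "time"

    // waitWithTimeout stops the timer on return; an unfired timer that
    // is never stopped keeps its resources alive until it expires.
    func waitWithTimeout(done <-chan struct{}, d time.Duration) bool {
        timer := time.NewTimer(d)
        defer timer.Stop()
        select {
        case <-done:
            return true
        case <-timer.C:
            return false
        }
    }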
[NO NEW TESTS NEEDED]
Signed-off-by: hang.jiang <hang.jiang@daocloud.io>