Commit Graph

13 Commits

OpenShift Merge Robot 3cae574ab2
Merge pull request #18507 from mheon/fix_rm_depends
Fix `podman rm -fa` with dependencies
2023-06-12 13:27:34 -04:00
Matthew Heon f1ecdca4b6 Ensure our mutexes handle recursive locking properly
We use shared-memory pthread mutexes to handle mutual exclusion
in Libpod. It turns out that these have configurable options for
how to handle a recursive lock (i.e., a thread trying to lock a
lock that the same thread had previously locked). The mutex can
either deadlock, or allow the duplicate lock without deadlocking.
Default behavior is, helpfully, unspecified, so if not explicitly
set there is no clear indication of which of these behaviors will
be seen. Unfortunately, today is the first I learned of this, so
our initial implementation did *not* explicitly set our preferred
behavior.

This turns out to be a major problem with a language like Golang,
where multiple goroutines can (and often do) use the same OS
thread. So we can have two goroutines trying to stop the same
container, and if the no-deadlock mutex behavior is in use, both
goroutines will successfully acquire the lock, because the C
library, not knowing about Go's lightweight threads, sees the
same OS thread trying to lock a mutex twice and allows it
without question.

It appears that, at least on Fedora/RHEL/Debian libc, the default
(unspecified) behavior of the locks is the non-deadlocking
version - so, effectively, our locks have been of questionable
utility within the same Podman process for the last four years.
This is somewhat concerning.

What's even more concerning is that the Golang-native sync.Mutex
that was also in use did nothing to prevent the duplicate locking
(I don't know if I like the implications of this).

Anyways, this resolves the major issue of our locks not working
correctly by explicitly setting the correct pthread mutex
behavior.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2023-06-07 14:09:12 -04:00
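
As an illustration of the fix described in the commit above, here is a minimal C sketch of initializing a process-shared pthread mutex with an explicitly chosen type, so the recursive-locking behavior is no longer left to the libc default. The function name, error-return convention, and the choice of `PTHREAD_MUTEX_NORMAL` are assumptions for illustration, not the actual `shm_lock.c` code.

```c
#include <pthread.h>

/* Sketch only: returns 0 on success, or a negative error code on failure. */
int init_shared_mutex(pthread_mutex_t *mutex) {
    pthread_mutexattr_t attr;
    int ret;

    ret = pthread_mutexattr_init(&attr);
    if (ret != 0)
        return -ret;

    /* The lock lives in shared memory, so it must be process-shared. */
    ret = pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
    if (ret != 0)
        goto out;

    /* Explicitly choose a mutex type instead of relying on
     * PTHREAD_MUTEX_DEFAULT, whose recursive-locking behavior is
     * unspecified. PTHREAD_MUTEX_NORMAL makes a second lock attempt
     * from the same OS thread block rather than silently succeed. */
    ret = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_NORMAL);
    if (ret != 0)
        goto out;

    ret = pthread_mutex_init(mutex, &attr);

out:
    pthread_mutexattr_destroy(&attr);
    return -ret;
}
```

If failing loudly is preferred over blocking, `PTHREAD_MUTEX_ERRORCHECK` would instead make the second lock attempt from the same thread return `EDEADLK`; either way the behavior is explicit rather than unspecified.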
Matt Heon 944673c883 Address review feedback and add manpage notes
The inspect format for `.LockNumber` needed to be documented.

Signed-off-by: Matt Heon <mheon@redhat.com>
2023-06-06 11:04:59 -04:00
Matt Heon 4fda7936c5 `system locks` now reports held locks
To debug a deadlock, we really want to know what lock is actually
locked, so we can figure out what is using that lock. This PR
adds support for this, using trylock to check whether each lock
on the system is free or in use. It will need to be run a few
times in quick succession to verify that a lock is not just
transiently held but actually stuck, but that's not a big deal.

Signed-off-by: Matt Heon <mheon@redhat.com>
2023-06-05 19:34:36 -04:00
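
A minimal sketch of the trylock-based probe the commit above describes, assuming plain pthread mutexes; the function name and return convention are illustrative, not the actual `shm_lock.c` code.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Sketch only: returns 0 and sets *held, or a negative error code on
 * unexpected failure. */
int check_lock_held(pthread_mutex_t *mutex, bool *held) {
    int ret = pthread_mutex_trylock(mutex);

    if (ret == 0) {
        /* We acquired the lock, so nobody else holds it; release it again. */
        pthread_mutex_unlock(mutex);
        *held = false;
        return 0;
    }

    if (ret == EBUSY) {
        /* Someone currently holds this lock. */
        *held = true;
        return 0;
    }

    return -ret;
}
```

Running this probe over every lock in the SHM segment, a few times in a row, is what separates a lock that is merely busy from one that is genuinely stuck.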
Matt Heon 1013696ad2 Add number of free locks to `podman info`
This is a nice quality-of-life change that should help to debug
situations where someone runs out of locks (usually when a bunch
of unused volumes accumulate).

Signed-off-by: Matt Heon <mheon@redhat.com>
2023-06-05 14:00:40 -04:00
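
A minimal sketch of how a free-lock count could be derived if allocation is tracked in per-group bitmaps; the struct layout, group size, and function name here are assumptions for illustration, not the actual `shm_lock.c` code.

```c
#include <stdint.h>

#define BITMAP_SIZE 32  /* locks tracked per bitmap word (assumption) */

typedef struct {
    uint32_t bitmap;    /* one bit per lock: set means allocated */
} lock_group_t;

/* Sketch only: total locks = num_groups * BITMAP_SIZE; free locks are
 * the unset bits across all groups. */
uint32_t count_free_locks(const lock_group_t *groups, uint32_t num_groups) {
    uint32_t free_locks = 0;

    for (uint32_t i = 0; i < num_groups; i++) {
        /* __builtin_popcount counts the allocated (set) bits in the word. */
        free_locks += BITMAP_SIZE - __builtin_popcount(groups[i].bitmap);
    }

    return free_locks;
}
```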
Erik Sjölund 08e13867a9 Fix typos. Improve language.
Signed-off-by: Erik Sjölund <erik.sjolund@gmail.com>
2023-02-09 21:56:27 +01:00
Dmitry Smirnov 8d928d525f codespell: spelling corrections
Signed-off-by: Dmitry Smirnov <onlyjob@member.fsf.org>
2019-11-13 08:15:00 +11:00
Sascha Grunert 4bfbc355de
Build cgo files with -Wall -Werror
To avoid unnecessary warnings and errors in the future, I'd like
to propose building all cgo-related sources with `-Wall -Werror`.
This commit also fixes some warnings that came up in `shm_lock.c`.

Signed-off-by: Sascha Grunert <sgrunert@suse.com>
2019-06-21 10:14:19 +02:00
Matthew Heon faae3a7065 When refreshing after a reboot, force lock allocation
After a reboot, when we refreshed Podman's state, we retrieved
the lock from the fresh SHM instance, but we did not mark it as
allocated, so nothing prevented it from being handed out to
other containers and pods.

Provide a method for marking locks as in-use, and use it when we
refresh Podman state after a reboot.

Fixes #2900

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-05-06 14:17:54 -04:00
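
A minimal sketch of the "mark this lock as in-use" operation described above, under the same assumed bitmap layout as the earlier sketch; the names and layout are illustrative, not the actual `shm_lock.c` code.

```c
#include <errno.h>
#include <stdint.h>

#define BITMAP_SIZE 32  /* locks tracked per bitmap word (assumption) */

typedef struct {
    uint32_t bitmap;    /* one bit per lock: set means allocated */
} lock_group_t;

/* Sketch only: mark lock sem_index as allocated so it cannot be handed
 * out again after a post-reboot refresh. Returns 0 on success, -EINVAL
 * if the index is out of range, -EBUSY if the lock is already taken. */
int allocate_given_lock(lock_group_t *groups, uint32_t num_groups,
                        uint32_t sem_index) {
    uint32_t group = sem_index / BITMAP_SIZE;
    uint32_t mask  = 1u << (sem_index % BITMAP_SIZE);

    if (group >= num_groups)
        return -EINVAL;

    if (groups[group].bitmap & mask)
        return -EBUSY;  /* already allocated: refuse to double-allocate */

    groups[group].bitmap |= mask;
    return 0;
}
```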
Matthew Heon f9c548219b Recreate SHM locks when renumbering on count mismatch
When we're renumbering locks, we're destroying all existing
allocations anyway, so destroying the old lock struct is not a
particularly big deal. Existing long-lived libpod instances will
continue to use the old locks, but that will be solved in a
follow-on.

Also, solve an issue with returning error values in the C code.
There were a few places where we returned errno when it was not
set, so make them return actual error codes.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-02-21 10:51:42 -05:00
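
A minimal sketch of the error-return fix mentioned above: when a failure is detected by the code itself rather than by a failing libc call, errno has not been set, so returning `-errno` would propagate a stale or zero value. Returning an explicit negative error code avoids that. The function name and checks shown are illustrative only, not the actual `shm_lock.c` code.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Sketch only: validate inputs before touching the SHM segment. */
int renumber_locks(void *shm, uint32_t num_locks) {
    if (shm == NULL)
        return -EINVAL;   /* explicit code: errno was never set here */

    if (num_locks == 0)
        return -EINVAL;   /* likewise, not -errno */

    /* ... reset all allocation bitmaps, then report success ... */
    return 0;
}
```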
Matthew Heon 7fdd20ae5a Add initial version of renumber backend
Renumber is a way of renumbering container locks after the number
of locks available has changed.

For now, renumber only works with containers.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-02-21 10:51:42 -05:00
Matthew Heon e73484c176 Move to POSIX mutexes for SHM locks
Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
2019-01-04 09:51:09 -05:00
Matthew Heon a21f21efa1 Refactor locks package to build on non-Linux
Move SHM-specific code into a subpackage. Within the main locks
package, move the manager to be linux-only and add a non-Linux
unsupported build file.

Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
2019-01-04 09:45:59 -05:00