This reverts commit d3d97a25e8.
This does not resolve the issues we expected it would, and has
some unexpected side effects with the upcoming exec rework.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
This reverts commit 4632b81c81.
We are reverting #5373 as well, which lays the foundation for
this commit, so it has to go as well.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
during container creation, if no network is provided, we need to add a default value so the container can be later started.
use apiv2 container creation for RunTopContainer instead of an exec to the system podman. RunTopContainer now also returns the container id and an error.
added a libpod commit endpoint.
also, changed the use of the connections and bindings slightly to make it more convenient to write tests.
Fixes: 5366
Signed-off-by: Brent Baude <bbaude@redhat.com>
Specifically to get:
https://github.com/containernetworking/cni/pull/735
6f29b0165883b2b52ccd4dcb937162ea4c86927b intercept netplugin std err
But also pulls in some interface name validation and a compatibility
fix for configurations that don't set a CNI version.
Signed-off-by: Dan Williams <dcbw@redhat.com>
Some users have small /var/tmp directories and need to be able to specify a different location
for temporary files, which includes more space.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
In some older systems we point the temporary directory to /run/user/1000 which leads podman system reset to clear unrelated files under XDG_RUNTIME_DIR. This patch only removes files created by podman if TmpDir is the same as the XDG_RUNTIME_DIR.
Signed-off-by: Qi Wang <qiwan@redhat.com>
this is a cosmetic change that makes sure podman returns a sane error code when conmon dies underneath it
Signed-off-by: Peter Hunt <pehunt@redhat.com>
Before, we were using -1 as a bogus value in podman to signify something went wrong when reading from a conmon pipe. However, conmon uses negative values to indicate the runtime failed, and return the runtime's exit code.
instead, we should use a bogus value that is actually bogus. Define that value in the define package as MinInt32 (-1<< 31 - 1), which is outside of the range of possible pids (-1 << 31)
Signed-off-by: Peter Hunt <pehunt@redhat.com>
Before, we were getting the exit code from the file, in which we waited an arbitrary amount of time (5 seconds) for the file, and segfaulted if we didn't find it. instead, we should be a bit more certain conmon has sent the exit code. Luckily, it sends the exit code along the sync pipe fd, so we can read it from there
Adapt the ExecContainer interface to pass along a channel to get the pid and exit code from conmon, to be able to read both from the pipe
Signed-off-by: Peter Hunt <pehunt@redhat.com>
This patch allows users to specify the list of capabilities required
to run their container image.
Setting a image/container label "io.containers.capabilities=setuid,setgid"
tells podman that the contained image should work fine with just these two
capabilties, instead of running with the default capabilities, podman will
launch the container with just these capabilties.
If the user or image specified capabilities that are not in the default set,
the container will print an error message and will continue to run with the
default capabilities.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
Until now, we've been validating every part of container
configuration through the With... functions that set the options.
This if fine when we are just validating the options to an
individual function, but things get complicated once we need to
validate conflicts between different options. We don't know the
order in which things were passed, so we need the validation on
both of the potential options that can conflict, resulting in
significant code duplication. To solve this, add a validate()
function for containers, and use this to check whether everything
is in a good state.
We can probably move more into this function (there are other
parts of container creation that also do validation of a sort)
but this is a good start to simplifying our options.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
Modify the pod inspect bindings to hold current pod status.
Includes test to validate on pod status and added test to check
no or few pods are pruned,if the pods are in exited state.
Signed-off-by: Sujil02 <sushah@redhat.com>
This corrects a regression from Podman 1.4.x where container exec
sessions inherited supplemental groups from the container, iff
the exec session did not specify a user.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
added the ability to wait on a condition (stopped, running, paused...) for a container. if a condition is not provided, wait will default to the stopped condition which uses the original wait code paths. if the condition is stopped, the container exit code will be returned.
also, correct a mux issue we discovered.
Signed-off-by: Brent Baude <bbaude@redhat.com>
add binding tests for volumes: inspect(get), create, remove, prune, and list
implement filters ability for volumes
Signed-off-by: Brent Baude <bbaude@redhat.com>
When inspecting containers, info on CNI networks added to the
container by name (e.g. --net=name1) should be displayed
separately from the configuration of the default network, in a
separate map called Networks.
This patch adds this separation, improving our Docker
compatibility and also adding the ability to see if a container
has more than one IPv4 and IPv6 address and more than one MAC
address.
Fixes#4907
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
We can easily tell if we're going to deadlock by comparing lock
IDs before actually taking the lock. Add a few checks for this in
common places where deadlocks might occur.
This does not yet cover pod operations, where detection is more
difficult (and costly) due to the number of locks being involved
being higher than 2.
Also, add some error wrapping on the Podman side, so we can tell
people to use `system renumber` when it occurs.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
Before Libpod supported named volumes, we approximated image
volumes by bind-mounting in per-container temporary directories.
This was handled by Libpod, and had a corresponding database
entry to enable/disable it.
However, when we enabled named volumes, we completely rewrote the
old implementation; none of the old bind mount implementation
still exists, save one flag in the database. With nothing
remaining to use it, it has no further purpose.
Signed-off-by: Matthew Heon <mheon@redhat.com>
in cases where the log file exceeds the available memory of a system, we had a bug that triggered an oom because the entire logfile was being read when the tail parameter was given. this reads in chunks and is more or less memory safe.
fixes: #5131
Signed-off-by: Brent Baude <bbaude@redhat.com>
Looks like a bit of a misunderstanding from early on.
Docker implements --filter=since=IMAGE. Podman implements 'after'
instead of 'since'. Add an equivalent case statement to handle
both, keeping 'after' because we have no way of knowing if it
is used in the field.
Update documentation ... and fix what looks like a complete
misinterpretation of what the code actually does: the man page
claimed that these were time fields, but I don't see any
possible incantation in which a time value works or could
work. Updated docs to reflect IMAGE usage. Also changed
nonworking '==' to single '='.
Added tests. [UPDATE: skip with broken podman-remote]
Fixes: #5040
Signed-off-by: Ed Santiago <santiago@redhat.com>
when using usernamespace, dnsname respondes from cni were not making it into the containers /etc/resolv.conf because of a timing issue. this corrects that behavior.
Fixes: #5256
Signed-off-by: Brent Baude <bbaude@redhat.com>
Enables most of the network-related functionality from
`podman run` in `podman pod create`. Custom CNI networks can be
specified, host networking is supported, DNS options can be
configured.
Also enables host networking in `podman play kube`.
Fixes#2808Fixes#3837Fixes#4432Fixes#4718Fixes#4770
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
Instead of manually merging the configs, use the built-in features of
TOMP to merge/extend the fields of a data type when encoding a file.
This erases the need for the merge code in libpod/config and also
addresses issues when merging booleans.
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
when using -d and port mapping, make sure the correct fd is injected
into conmon.
Move the pipe creation earlier as the fd must be known at the time we
create the container through conmon.
Closes: https://github.com/containers/libpod/issues/5167
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Add pkg/capabibilities to deal with capabilities. The code has been
copied from Docker (and attributed with the copyright) but changed
significantly to only do what we really need. The code has also been
simplified and will perform better due to removed redundancy.
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
If we attempt to label a volume and the file system
does not support labeling, then just warn. SELinux
may or may not work, on the volume.
There is no way to setup a private label on a newly
created volume without using the container mountlabel.
If we don't have a mount label at the time of creation of
the volume, the only option we have is to create a shared
label.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
When Docker performs a copy up, it first verifies that the volume
being copied into is empty; thus, for volumes that have been
modified elsewhere (e.g. manually copying into then), the copy up
will not be performed at all. Duplicate this behavior in Podman
by checking if the volume is empty before copying.
Furthermore, move setting copyup to false further up. This will
prevent a potential race where copy up could happen more than
once if Podman was killed after some files had been copied but
before the DB was updated.
This resolves CVE-2020-1726.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
... so that _all_ Image objects are created in a single place
that is easy to update.
Should not change behavior.
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Instead of the function updating image.InputName (the only reason for it
to need an image), have it return the updated value separately.
This will allow simplifying the constructors of Image further.
Should not change behavior.
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
All ways to create an Image{} have a non-nil .image field, and it
is never set to nil, so this is dead code.
Should not change behavior.
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
All code creating an Image by looking up a name now uses
Runtime.NewFromLocal.
Should not change behavior.
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
This is not _trivially_ safe because newImage.getLocalImage()
modifies newImage.ImageName, but we overwrite that value anyway.
So, this should not change behavior, and it will make future refactoring
easier to verify.
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
If there are no running containers - for example, if the pod was
just created - the cgroup in question may not exist (under
certain circumstances that we're not 100% sure about). However,
regardless, we don't need to set a PID limit, as nothing will be
making cleanup processes (no running conmon processes), so not
changing the cgroup is safe regardless.
Fixes#5072
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
This adds network-related options to the pod in the database. We
are going to add the CLI frontend in further patches.
In short, this should greatly improve the ability of pods to
configure networking, once the CLI parsing is added.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
The current Libpod pkg/spec has become a victim of the better
part of three years of development that tied it extremely closely
to the current Podman CLI. Defaults are spread across multiple
places, there is no easy way to produce a CreateConfig that will
actually produce a valid container, and the logic for generating
configs has sprawled across at least three packages.
This is an initial pass at a package that generates OCI specs
that will supersede large parts of the current pkg/spec. The
CreateConfig will still exist, but will effectively turn into a
parsed CLI. This will be compiled down into the new SpecGenerator
struct, which will generate the OCI spec and Libpod create
options.
The preferred integration point for plugging into Podman's Go API
to create containers will be the new CreateConfig, as it's less
tied to Podman's command line. CRI-O, for example, will likely
tie in here.
Signed-off-by: Matthew Heon <mheon@redhat.com>
This makes restart a bit slower for root containers, but it does
make it more consistent with `podman stop` and `podman start` on
a container. Importantly, `podman restart` will now recreate
firewall rules if they were somehow purged.
Fixes#5051
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
In Podman 1.6.3, we added support for anonymous volumes - fixing
our old, broken support for named volumes that were created with
containers. Unfortunately, this reused the database field we used
for the old implementation, and toggled volume removal on for
`podman run --rm` - so now, we were removing *named* volumes
created with older versions of Podman.
We can't modify these old volumes in the DB, so the next-safest
thing to do is swap to a new field to indicate volumes should be
removed. Problem: Volumes created with 1.6.3 and up until this
lands, even anonymous volumes, will not be removed. However, this
is safer than removing too many volumes, as we were doing before.
Fixes#5009
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
Get the layer's size whether it relates to the first history entry or
not. This fixes issues where the first entry would always be shown
to be of size 0.
Fixes: #4916
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
When a container specification has a pull policy, we should honor it when recreating the pods/containers from yaml. furthermore, ini kube, if a tag is :latest, then the always pull policy is automatically instituted.
Fixes: #4880
Signed-off-by: Brent Baude <bbaude@redhat.com>
when a docker image has a defined healthcheck, it should be displayed with inspect. this is only valid for docker images as oci images are not aware of healthchecks.
Fixes: #4799
Signed-off-by: Brent Baude <bbaude@redhat.com>
Add API review comments to correct documentation and endpoints. Also, add a libpode prune method to reduce code duplication. Only used right now for the API but when the remote client is wired, we will switch over there too.
Signed-off-by: Brent Baude <bbaude@redhat.com>
This is purely a display change - we weren't initializing the
default value to display for the CPUShares field, which defaults
to 1024.
Fixes#4822
Signed-off-by: Matthew Heon <mheon@redhat.com>
Detect whether we are running under systemd (if the INVOCATION_ID is
set). If Podman is running under a systemd service, we do not need to
create a cgroup for conmon.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
The new APIv2 branch provides an HTTP-based remote API to Podman.
The requirements of this are, unfortunately, incompatible with
the existing Attach API. For non-terminal attach, we need append
a header to what was copied from the container, to multiplex
STDOUT and STDERR; to do this with the old API, we'd need to copy
into an intermediate buffer first, to handle the headers.
To avoid this, provide a new API to handle all aspects of
terminal and non-terminal attach, including closing the hijacked
HTTP connection. This might be a bit too specific, but for now,
it seems to be the simplest approach.
At the same time, add a Resize endpoint. This needs to be a
separate endpoint, so our existing channel approach does not work
here.
I wanted to rework the rest of attach at the same time (some
parts of it, particularly how we start the Attach session and how
we do resizing, are (in my opinion) handled much better here.
That may still be on the table, but I wanted to avoid breaking
existing APIs in this already massive change.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
it allows to disable cgroups creation only for the conmon process.
A new cgroup is created for the container payload.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Move the top logic from pkg/adapter into the (*libpod.Container).Top().
This way, we drop the dependency from pkg/api on pkg/adapters and have
a clearer separation of concerns.
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
do not change the permissions mask for the rundir and the tmpdir when
running a container with a user namespace and the current user is
mapped inside the user namespace.
The change was introduced with
849548ffb8, that dropped the
intermediate mount namespace in favor of allowing root into the user
namespace to access these directories.
Closes: https://github.com/containers/libpod/issues/4846
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Include the unit tests (i.e., _test.go files) for linting to make the
tests more robust and enforce the linters' coding styles etc.
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
Our networking code bakes in a lot of assumptions about how
networking should work - that CNI is *always* used with root, and
that slirp4netns is *always* used only with rootless. These are
not safe assumptions. This fixes one particular issue, which
would cause CNI to also be run when slirp4netns was requested as
root.
Fixes: #4687
Signed-off-by: Matthew Heon <mheon@redhat.com>
`gocritic` is a powerful linter that helps in preventing certain kinds
of errors as well as enforcing a coding style.
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
Signed-off-by: Jhon Honce <jhonce@redhat.com>
Create service command
Use cd cmd/service && go build .
$ systemd-socket-activate -l 8081 cmd/service/service &
$ curl http://localhost:8081/v1.24/images/json
Signed-off-by: Jhon Honce <jhonce@redhat.com>
Correct Makefile
Signed-off-by: Jhon Honce <jhonce@redhat.com>
Two more stragglers
Signed-off-by: Jhon Honce <jhonce@redhat.com>
Report errors back as http headers
Signed-off-by: Jhon Honce <jhonce@redhat.com>
Split out handlers, updated output
Output aligned to docker structures
Signed-off-by: Jhon Honce <jhonce@redhat.com>
Refactored routing, added more endpoints and types
* Encapsulated all the routing information in the handler_* files.
* Added more serviceapi/types, including podman additions. See Info
Signed-off-by: Jhon Honce <jhonce@redhat.com>
Cleaned up code, implemented info content
* Move Content-Type check into serviceHandler
* Custom 404 handler showing the url, mostly for debugging
* Refactored images: better method names and explicit http codes
* Added content to /info
* Added podman fields to Info struct
* Added Container struct
Signed-off-by: Jhon Honce <jhonce@redhat.com>
Add a bunch of endpoints
containers: stop, pause, unpause, wait, rm
images: tag, rmi, create (pull only)
Signed-off-by: baude <bbaude@redhat.com>
Add even more handlers
* Add serviceapi/Error() to improve error handling
* Better support for API return payloads
* Renamed unimplemented to unsupported these are generic endpoints
we don't intend to ever support. Swarm broken out since it uses
different HTTP codes to signal that the node is not in a swarm.
* Added more types
* API Version broken out so it can be validated in the future
Signed-off-by: Jhon Honce <jhonce@redhat.com>
Refactor to introduce ServiceWriter
Signed-off-by: Jhon Honce <jhonce@redhat.com>
populate pods endpoints
/libpod/pods/..
exists, kill, pause, prune, restart, remove, start, stop, unpause
Signed-off-by: baude <bbaude@redhat.com>
Add components to Version, fix Error body
Signed-off-by: Jhon Honce <jhonce@redhat.com>
Add images pull output, fix swarm routes
* docker-py tests/integration/api_client_test.py pass 100%
* docker-py tests/integration/api_image_test.py pass 4/16
+ Test failures include services podman does not support
Signed-off-by: Jhon Honce <jhonce@redhat.com>
pods endpoint submission 2
add create and others; only top and stats is left.
Signed-off-by: baude <bbaude@redhat.com>
Update pull image to work from empty registry
Signed-off-by: Jhon Honce <jhonce@redhat.com>
pod create and container create
first pass at pod and container create. the container create does not
quite work yet but it is very close. pod create needs a partial
rewrite. also broken off the DELETE (rm/rmi) to specific handler funcs.
Signed-off-by: baude <bbaude@redhat.com>
Add docker-py demos, GET .../containers/json
* Update serviceapi/types to reflect libpod not podman
* Refactored removeImage() to provide non-streaming return
Signed-off-by: Jhon Honce <jhonce@redhat.com>
create container part2
finished minimal config needed for create container. started demo.py
for upcoming talk
Signed-off-by: baude <bbaude@redhat.com>
Stop server after honoring request
* Remove casting for method calls
* Improve WriteResponse()
* Update Container API type to match docker API
Signed-off-by: Jhon Honce <jhonce@redhat.com>
fix namespace assumptions
cleaned up namespace issues with libpod.
Signed-off-by: baude <bbaude@redhat.com>
wip
Signed-off-by: baude <bbaude@redhat.com>
Add sliding window when shutting down server
* Added a Timeout rather than closing down service on each call
* Added gorilla/schema dependency for Decode'ing query parameters
* Improved error handling
* Container logs returned and multiplexed for stdout and stderr
* .../containers/{name}/logs?stdout=True&stderr=True
* Container stats
* .../containers/{name}/stats
Signed-off-by: Jhon Honce <jhonce@redhat.com>
Improve error handling
* Add check for at least one std stream required for /containers/{id}/logs
* Add check for state in /containers/{id}/top
* Fill in more fields for /info
* Fixed error checking in service start code
Signed-off-by: Jhon Honce <jhonce@redhat.com>
get rest of image tests for pass
Signed-off-by: baude <bbaude@redhat.com>
linting our content
Signed-off-by: baude <bbaude@redhat.com>
more linting
Signed-off-by: baude <bbaude@redhat.com>
more linting
Signed-off-by: baude <bbaude@redhat.com>
pruning
Signed-off-by: baude <bbaude@redhat.com>
[CI:DOCS]apiv2 pods
migrate from using args in the url to using a json struct in body for
pod create.
Signed-off-by: baude <bbaude@redhat.com>
fix handler_images prune
prune's api changed slightly to deal with filters.
Signed-off-by: baude <bbaude@redhat.com>
[CI:DOCS]enabled base container create tests
enabling the base container create tests which allow us to get more into
the stop, kill, etc tests. many new tests now pass.
Signed-off-by: baude <bbaude@redhat.com>
serviceapi errors: append error message to API message
I dearly hope this is not breaking any other tests but debugging
"Internal Server Error" is not helpful to any user. In case, it
breaks tests, we can rever the commit - that's why it's a small one.
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
serviceAPI: add containers/prune endpoint
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
add `service` make target
Also remove the non-functional sub-Makefile.
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
add make targets for testing the service
* `sudo make run-service` for running the service.
* `DOCKERPY_TEST="tests/integration/api_container_test.py::ListContainersTest" \
make run-docker-py-tests`
for running a specific tests. Run all tests by leaving the env
variable empty.
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
Split handlers and server packages
The files were split to help contain bloat. The api/server package will
contain all code related to the functioning of the server while
api/handlers will have all the code related to implementing the end
points.
api/server/register_* will contain the methods for registering
endpoints. Additionally, they will have the comments for generating the
swagger spec file.
See api/handlers/version.go for a small example handler,
api/handlers/containers.go contains much more complex handlers.
Signed-off-by: Jhon Honce <jhonce@redhat.com>
[CI:DOCS]enabled more tests
Signed-off-by: baude <bbaude@redhat.com>
[CI:DOCS]libpod endpoints
small refactor for libpod inclusion and began adding endpoints.
Signed-off-by: baude <bbaude@redhat.com>
Implement /build and /events
* Include crypto libraries for future ssh work
Signed-off-by: Jhon Honce <jhonce@redhat.com>
[CI:DOCS]more image implementations
convert from using for to query structs among other changes including
new endpoints.
Signed-off-by: baude <bbaude@redhat.com>
[CI:DOCS]add bindings for golang
Signed-off-by: baude <bbaude@redhat.com>
[CI:DOCS]add volume endpoints for libpod
create, inspect, ls, prune, and rm
Signed-off-by: baude <bbaude@redhat.com>
[CI:DOCS]apiv2 healthcheck enablement
wire up container healthchecks for the api.
Signed-off-by: baude <bbaude@redhat.com>
[CI:DOCS]Add mount endpoints
via the api, allow ability to mount a container and list container
mounts.
Signed-off-by: baude <bbaude@redhat.com>
[CI:DOCS]Add search endpoint
add search endpoint with golang bindings
Signed-off-by: baude <bbaude@redhat.com>
[CI:DOCS]more apiv2 development
misc population of methods, etc
Signed-off-by: baude <bbaude@redhat.com>
rebase cleanup and epoch reset
Signed-off-by: baude <bbaude@redhat.com>
[CI:DOCS]add more network endpoints
also, add some initial error handling and convenience functions for
standard endpoints.
Signed-off-by: baude <bbaude@redhat.com>
[CI:DOCS]use helper funcs for bindings
use the methods developed to make writing bindings less duplicative and
easier to use.
Signed-off-by: baude <bbaude@redhat.com>
[CI:DOCS]add return info for prereview
begin to add return info and status codes for errors so that we can
review the apiv2
Signed-off-by: baude <bbaude@redhat.com>
[CI:DOCS]first pass at adding swagger docs for api
Signed-off-by: baude <bbaude@redhat.com>
support a custom tag to add to each log for the container.
It is currently supported only by the journald backend.
Closes: https://github.com/containers/libpod/issues/3653
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
In a largely anticlimatic solution to the saga of piped input from conmon, we come to this solution.
When we pass the Stdin stream to the exec.Command structure, it's immediately consumed and lost, instead of being consumed through CopyDetachable().
When we don't pass -i in, conmon is not told to create a masterfd_stdin, and won't pass anything to the container.
With both, we can do
echo hi | podman exec -til cat
and get the expected hi
Signed-off-by: Peter Hunt <pehunt@redhat.com>
Golint claims that image.Image stutters but renaming the type would be a
breaking change which isn't worth the consequences.
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
RootlessKit port forwarder has a lot of advantages over the slirp4netns port forwarder:
* Very high throughput.
Benchmark result on Travis: socat: 5.2 Gbps, slirp4netns: 8.3 Gbps, RootlessKit: 27.3 Gbps
(https://travis-ci.org/rootless-containers/rootlesskit/builds/597056377)
* Connections from the host are treated as 127.0.0.1 rather than 10.0.2.2 in the namespace.
No UDP issue (#4586)
* No tcp_rmem issue (#4537)
* Probably works with IPv6. Even if not, it is trivial to support IPv6. (#4311)
* Easily extensible for future support of SCTP
* Easily extensible for future support of `lxc-user-nic` SUID network
RootlessKit port forwarder has been already adopted as the default port forwarder by Rootless Docker/Moby,
and no issue has been reported AFAIK.
As the port forwarder is imported as a Go package, no `rootlesskit` binary is required for Podman.
Fix#4586
May-fix #4559Fix#4537
May-fix #4311
See https://github.com/rootless-containers/rootlesskit/blob/v0.7.0/pkg/port/builtin/builtin.go
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
The pod name does not appear when doing `podman ps -p`.
It is missing as the documentation says:
-p, --pod Print the ID and name of the pod the containers are associated with
The pod name is added in the ps output and checked in unit tests.
Closes#4703
Signed-off-by: NevilleC <neville.cain@qonto.eu>
Currently, if a user requests the size on a container (inspect --size -t container),
the SizeRw does not show up if the value is 0. It's because InspectContainerData is
defined as int64 and there is an omit when empty.
We do want to display it even if the value is empty. I have changed the type of SizeRw to be a pointer to an int64 instead of an int64. It will allow us todistinguish the empty value to the missing value.
I updated the test "podman inspect container with size" to ensure we check thatSizeRw is displayed correctly.
Closes#4744
Signed-off-by: NevilleC <neville.cain@qonto.eu>
when removing an image from storage, we should return a struct that
details what was untagged vs deleted. this replaces the simple
println's used previously and assists in API development.
Signed-off-by: baude <bbaude@redhat.com>
When a container is in a PID namespace, it is enought to send
the stop signal to the PID 1 of the namespace, only send signals
to all processes in the container when the container is not in
a pid namespace.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
When trying to reproduce #4704 I noticed that the named volumes
from the Postgres containers in the reproducer weren't being
removed by `podman pod rm -f` saying that the container they were
attached to was still in use. This was rather odd, considering
they were only in use by one container, and that container was in
the process of being removed with the pod.
After a bit of tracing, I realized that the cause is the ordering
of container removal when we remove a pod. Normally, it's done
in removeContainer() before volume removal (which is the last
thing in that function). However, when we are removing a pod, we
remove containers all at once, after removeContainer has already
finished - meaning the container still exists when we try to
remove its volumes, and thus the volume can't be removed.
Solution: collect a list of all named volumes in use by the pod,
and remove them all at once after every container in the pod is
gone. This ensures that there are no dependency issues.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
The earlier attempt to re-add config migration only worked with
user-specified configs (podman run --config). This version works
more in line with that we want - the first rootless config file
will be changed from runc to crun.
Verified on my system after an F31 migration - everything seems
to be working well.
Signed-off-by: Matthew Heon <mheon@redhat.com>
Store the full command plus arguments of the process the container has
been created with. Expose this data as a `Config.CreateCommand` field
in the container-inspect data as well.
This information can be useful for debugging, as we can find out which
command has created the container, and, if being created via the Podman
CLI, we know exactly with which flags the container has been created
with.
The immediate motivation for this change is to use this information for
`podman-generate-systemd` to generate systemd-service files that allow
for creating new containers (in contrast to only starting existing
ones).
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
During writing the tests I found it would be probably useful to have the
tag history part of the inspect data.
Signed-off-by: Sascha Grunert <sgrunert@suse.com>
Updates the podman info command to show registries from v1 config file
in the search table format.
Signed-off-by: José Guilherme Vanz <jvanz@jvanz.com>
We currently rely on exec sessions being removed from the state
by the Exec() API itself, on detecting the session stopping. This
is not a reliable method, though. The Podman frontend for exec
could be killed before the session ended, or another Podman
process could be holding the lock and prevent update (most
notable in `run --rm`, when a container with an active exec
session is stopped).
To resolve this, add a function to reap active exec sessions from
the state, and use it on cleanup (to clear sessions after the
container stops) and remove (to do the same when --rm is passed).
This is a bit more complicated than it ought to be because Kata
and company exist, and we can't guarantee the exec session has a
PID on the host, so we have to plumb this through to the OCI
runtime.
Fixes#4666
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
to make things more effecient for the api work we are doing, we should
process image filters internally (as opposed to in main). this allows
for better api responses and more closely affiliated functions.
Signed-off-by: baude <bbaude@redhat.com>
In the process, make everything in the config omitempty in TOML.
We're seeing issues (notably [1]) where, after rewriting
libpod.conf, fields that were not previously populated are
written - and, because they were not previously written, they are
included as empty. This is unfortunately different from not
included at all - it means that we need to assume the user
explicitly unset the value, and we can't use defaults. Setting
omitempty prevents us from writing things that should not be
written as they were not set originally.
[1] https://github.com/containers/libpod/issues/4210
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
When doing a checkpoint with --export the root file-system diff was not
working as expected. Instead of getting the changes from the running
container to the highest storage layer it got the changes from the
highest layer to that parent's layer. For a one layer container this
could mean that the complete root file-system is part of the checkpoint.
With this commit this changes to use the same functionality as 'podman
diff'. This actually enables to correctly diff the root file-system
including tracking deleted files.
This also removes the non-working helper functions from libpod/diff.go.
Signed-off-by: Adrian Reber <areber@redhat.com>
Return types had to change a bit for this, but since we can wrap
the old v1.ImageConfig, changes are overall not particularly bad.
At present, I believe this only works with commit, not import.
This matches how things were before we changed to the new parsing
so I think this is fine.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
It turns out we had two independent parsing impkementations for
Dockerfile instructions out of --change. My previous commit fixed
the one used in --change, but as I discovered to my dismay,
commit used a different implementation. Remove that and use the
new parsing implementation instead.
While we're at it, fix some bugs in the current commit code. The
addition of anonymous named volumes to Libpod recently means we
can now include those in the image config when committing. Some
changes (VOLUME, ENV, EXPOSE, LABEL) previously cleared the
config of the former image when used; Docker does not do this, so
I removed that behavior.
Still needs fixing: the new implementation does not support
ONBUILD, while the old one did.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
The code currently assumes that the container we delegate network
namespace to will never further delegate to another container, so
when looking up things like /etc/hosts and /etc/resolv.conf we
won't pull the correct files from the chained dependency. The
changes to resolve this are relatively simple - just need to keep
looking until we find a container without NetNsCtr set.
Fixes#4626
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
After a restart, pods and containers both run a refresh()
function to prepare to run after a reboot. Until now, volumes
have not had a similar function, because they had no per-boot
setup to perform.
Unfortunately, this was not noticed when in-memory locking was
introduced to volumes. The refresh() routine is, among other
things, responsible for ensuring that locks are reserved after a
reboot, ensuring they cannot be taken by a freshly-created
container, pod, or volume. If this reservation is not done, we
can end up with two objects using the same lock, potentially
needing to lock each other for some operations - classic recipe
for deadlocks.
Add a refresh() function to volumes to perform lock reservation
and ensure it is called as part of overall refresh().
Fixes#4605Fixes#4621
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
do not change the runtime error to be lowercase, but use a case
insensitive regex matching. In this way the original error from the
OCI runtime is reported back.
regression introduced by bc485bce47
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
This command will destroy all data created via podman.
It will remove containers, images, volumes, pods.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
Trying to checkpoint a container started with --rm works, but it makes
no sense as the container, including the checkpoint, will be deleted
after writing the checkpoint. This commit inhibits checkpointing
containers started with '--rm' unless '--export' is used. If the
checkpoint is exported it can easily be restored from the exported
checkpoint, even if '--rm' is used. To restore a container from a
checkpoint it is even necessary to manually run 'podman rm' if the
container is not started with '--rm'.
Signed-off-by: Adrian Reber <areber@redhat.com>
when parsing the OCI error, be sure to discard any other output that
is not matched. The full output is still printed with
--log-level=debug.
Closes: https://github.com/containers/libpod/issues/4574
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
We leverage the containers/storage image history tracking feature to
show the previously used image names when running:
`podman images --history`
Signed-off-by: Sascha Grunert <sgrunert@suse.com>
These only conflict when joining more than one network. We can
still set a single CNI network and set a static IP and/or static
MAC.
Fixes#4500
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
filter option accepts two filters.
- label
- until
label supports "label=value" or "label=key=value" format
until supports all golang compatible time/duration formats.
Signed-off-by: Kunal Kushwaha <kunal.kushwaha@gmail.com>
If the container is running and we need to get its netns and
can't, that is a serious bug deserving of errors.
If it's not running, that's not really a big deal. Log an error
and continue.
Signed-off-by: Matthew Heon <mheon@redhat.com>
When Libpod removes a container, there is the possibility that
removal will not fully succeed. The most notable problems are
storage issues, where the container cannot be removed from
c/storage.
When this occurs, we were faced with a choice. We can keep the
container in the state, appearing in `podman ps` and available for
other API operations, but likely unable to do any of them as it's
been partially removed. Or we can remove it very early and clean
up after it's already gone. We have, until now, used the second
approach.
The problem that arises is intermittent problems removing
storage. We end up removing a container, failing to remove its
storage, and ending up with a container permanently stuck in
c/storage that we can't remove with the normal Podman CLI, can't
use the name of, and generally can't interact with. A notable
cause is when Podman is hit by a SIGKILL midway through removal,
which can consistently cause `podman rm` to fail to remove
storage.
We now add a new state for containers that are in the process of
being removed, ContainerStateRemoving. We set this at the
beginning of the removal process. It notifies Podman that the
container cannot be used anymore, but preserves it in the DB
until it is fully removed. This will allow Remove to be run on
these containers again, which should successfully remove storage
if it fails.
Fixes#3906
Signed-off-by: Matthew Heon <mheon@redhat.com>
When restoring a container with user namespace, the user namespace is
created by the OCI runtime, and the network namespace is created after
the user namespace to ensure correct ownership.
In this case PostConfigureNetNS will be set and the value of
c.state.NetNS would be nil. Hence, the following error occurs:
$ sudo podman run --name cr \
--uidmap 0:1000:500 \
-d docker.io/library/alpine \
/bin/sh -c 'i=0; while true; do echo $i; i=$(expr $i + 1); sleep 1; done'
$ sudo podman container checkpoint cr
$ sudo podman container restore cr
...
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x30 pc=0x13a5e3c]
Signed-off-by: Radostin Stoyanov <rstoyanov1@gmail.com>
In conmon 2.0.3, we add another fifo to handle window resizing. This needs to be cleaned up for commands like restore, where the same path is used.
Signed-off-by: Peter Hunt <pehunt@redhat.com>
Rewrite the backend for displaying the history of an image to simplify
the code and be closer to docker's behaviour. Instead of driving
index-based heuristics, create a reverse mapping from top-layers to the
corresponding image IDs and lookup the layers on-demand. Also use the
uncompressed layer size to be closer to Docker's behaviour.
Note that intermediate images from local builds are not considered for
the ID lookups anymore.
Fixes: #3359
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
When running on a node with Cgroups v2, default to using `crun` instead
of `runc`. Note that this only impacts the hard-coded default config.
No user config will be over-written.
Fixes: #4463
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
Currently podman generate kube does not generate the correct RunAsUser and RunAsGroup
options in the yaml file. This patch fixes this.
This patch also make `podman play kube` use the RunAdUser and RunAsGroup options if
they are specified in the yaml file.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
`go get github.com/cri-o/ocicni@deac903fd99b6c52d781c9f42b8db3af7dcfd00a`
I had to fix compilation errors in libpod/networking_linux.go
---
ocicni.Networks has changed from string to the structure NetAttachment
with the member Name (the former string value) and the member Ifname
(optional).
I don't think we can make use of Ifname here, so I just map the array of
structures to array of strings - e.g. dropping Ifname.
---
The function GetPodNetworkStatus no longer returns Result but it returns
the wrapper structure NetResult which contains the former Result plus
NetAttachment (Network name and Interface name).
Again, I don't think we can make use of that information here, so I
just added `.Result` to fix the build.
---
Issue: #1136
Signed-off-by: Jakub Filak <jakub.filak@sap.com>
If user specifies --detach-keys="", this will disable the feature.
Adding define.DefaultDetachKeys to help screen to help identify detach keys.
Updated man pages with additonal information.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
When pulling an unqualified reference (e.g., `fedora`) make sure that
the reference is not using a non-docker transport to avoid iterating
over the search registries and trying to pull from them.
Fixes: #4434
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
If the kube.yaml specifieds the SELinux type or Level, we need the container
to be launched with the correct label.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
the pidWaitTimeout is already a Duration so do not multiply it again
by time.Millisecond.
Closes: https://github.com/containers/libpod/issues/4344
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Pull in changes to pkg/secrets/secrets.go that adds the
logic to disable fips mode if a pod/container has a
label set.
Signed-off-by: Urvashi Mohnani <umohnani@redhat.com>
change the default to -1, so that we can change the semantic of
"--tail 0" to not print any existing log line.
Closes: https://github.com/containers/libpod/issues/4396
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Refactor the `RuntimeConfig` along with related code from libpod into
libpod/config. Note that this is a first step of consolidating code
into more coherent packages to make the code more maintainable and less
prone to regressions on the long runs.
Some libpod definitions were moved to `libpod/define` to resolve
circular dependencies.
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
There were many situations that made exec act funky with input. pipes didn't work as expected, as well as sending input before the shell opened.
Thinking about it, it seemed as though the issues were because of how os.Stdin buffers (it doesn't). Dropping this input had some weird consequences.
Instead, read from os.Stdin as bufio.Reader, allowing the input to buffer before passing it to the container.
Signed-off-by: Peter Hunt <pehunt@redhat.com>
command.Start() just starts the command. That catches some
errors, but the nasty ones - bad options and similar - happen
when the command runs. Use CombinedOutput() instead - it waits
for the command to exit, and thus catches non-0 exit of the
`mount` command (invalid options, for example).
STDERR from the `mount` command is directly used, which isn't
necessarily the best, but we can't really get much more info on
what went wrong.
Fixes#4303
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
always create a new cgroup for conmon also when running as rootless.
We were previously creating one only when necessary, but that behaves
differently than root containers.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Currently podman play kube is not using the system default seccomp.json file.
This PR will use the default or override location for podman play.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
Generate an image's RepoDigests list using all applicable digests, and
refrain from outputting a digest in the tag column of the "images"
output.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
Be prepared to report multiple image digests for images which contain
multiple manifests but, because they continue to have the same set of
layers and the same configuration, are considered to be the same image.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
Add --override-arch and --override-os as hidden flags, in line with the
global flag names that skopeo uses, so that we can test behavior around
manifest lists without having to conditionalize more of it by arch.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
When an image can be opened as an ImageSource but not an Image, handle
the case where it's an image list all by itself, the case where it's an
image for a different architecture/OS combination, or the case where
it's both.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
Move to containers/image v5 and containers/buildah to v1.11.4.
Replace an equality check with a type assertion when checking for a
docker.ErrUnauthorizedForCredentials in `podman login`.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
read the slirp4netns stderr and propagate it in the error when the
process fails.
Replace: https://github.com/containers/libpod/pull/4338
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
We have a lot of checks for container state scattered throughout
libpod. Many of these need to ensure the container is in one of a
given set of states so an operation may safely proceed.
Previously there was no set way of doing this, so we'd use unique
boolean logic for each one. Introduce a helper to standardize
state checks.
Note that this is only intended to replace checks for multiple
states. A simple check for one state (ContainerStateRunning, for
example) should remain a straight equality, and not use this new
helper.
Signed-off-by: Matthew Heon <mheon@redhat.com>
when running in systemd mode on cgroups v1, make sure the
/sys/fs/cgroup/systemd/release_agent is masked otherwise the container
is able to modify it and execute scripts on the host.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
When you try and create a new volume with the name of a volume
that already exists, you presently get a thoroughly unhelpful
error from `mkdir` as the volume attempts to create the
directory it will be mounted at. An EEXIST out of mkdir is not
particularly helpful to Podman users - it doesn't explain that
the name is already taken by another volume.
The solution here is potentially racy as the runtime is not
locked, so someone else could take the name while we're still
getting things set up, but that's a narrow timing window, and we
will still return an error - just an error that's not as good as
this one.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
if the cgroup manager is set to systemd, detect if dbus is available,
otherwise fallback to --cgroup-manager=cgroupfs.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Everything else is a flag to mount, but "uid" and "gid" are not.
We need to parse them out of "o" and handle them separately.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
make sure the user overrides are stored in the configuration file when
first created.
Closes: https://github.com/containers/libpod/issues/2659
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Previously, when `podman run` encountered a volume mount without
separate source and destination (e.g. `-v /run`) we would assume
that both were the same - a bind mount of `/run` on the host to
`/run` in the container. However, this does not match Docker's
behavior - in Docker, this makes an anonymous named volume that
will be mounted at `/run`.
We already have (more limited) support for these anonymous
volumes in the form of image volumes. Extend this support to
allow it to be used with user-created volumes coming in from the
`-v` flag.
This change also affects how named volumes created by the
container but given names are treated by `podman run --rm` and
`podman rm -v`. Previously, they would be removed with the
container in these cases, but this did not match Docker's
behaviour. Docker only removed anonymous volumes. With this patch
we move to that model as well; `podman run -v testvol:/test` will
not have `testvol` survive the container being removed by `podman
rm -v`.
The sum total of these changes let us turn on volume removal in
`--rm` by default.
Fixes: #4276
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
When a container is created with a given OCI runtime, but then it
is uninstalled or removed from the configuration file, Libpod
presently reacts very poorly. The EvictContainer code can
potentially remove these containers, but we still can't see them
in `podman ps` (aside from the massive logrus.Errorf messages
they create).
Providing a minimal OCI runtime implementation for missing
runtimes allows us to behave better. We'll be able to retrieve
containers from the database, though we still pop up an error for
each missing runtime. For containers which are stopped, we can
remove them as normal.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
network statistics cannot be collected for rootless network devices with
the current implementation. for now, we return nil so that stats will
at least for users.
Fixes:#4268
Signed-off-by: baude <bbaude@redhat.com>
The json field is called `Image` while the go field is called `ImageID`,
tricking users into filtering for `Image` which ultimately results in an
error. Hence, rename the field to `Image` to align json and go.
To prevent podman users from regressing, rename `Image` to `ImageID` in
the specified filters. Add tests to prevent us from regressing. Note
that consumers of the go API that are using `ImageID` are regressing;
ultimately we consider it to be a bug fix.
Fixes: #4193
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
Also, ensure that we don't try to mount them without root - it
appears that it can somehow not error and report that mount was
successful when it clearly did not succeed, which can induce this
case.
We reuse the `--force` flag to indicate that a volume should be
removed even after unmount errors. It seems fairly natural to
expect that --force will remove a volume that is otherwise
presenting problems.
Finally, ignore EINVAL on unmount - if the mount point no longer
exists our job is done.
Fixes: #4247Fixes: #4248
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
In some cases, conmon can fail without writing logs. Change the wording
of the error message from
"error reading container (probably exited) json message"
to
"container create failed (no logs from conmon)"
to have a more helpful error message that is more consistent with other
errors at that stage of execution.
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
Previously, `podman checkport restore` with exported containers,
when told to create a new container based on the exported
checkpoint, would create a new container, with a new container
ID, but not reset CGroup path - which contained the ID of the
original container.
If this was done multiple times, the result was two containers
with the same cgroup paths. Operations on these containers would
this have a chance of crossing over to affect the other one; the
most notable was `podman rm` once it was changed to use the --all
flag when stopping the container; all processes in the cgroup,
including the ones in the other container, would be stopped.
Reset cgroups on restore to ensure that the path matches the ID
of the container actually being run.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
This is a horrible hack to work around issues with Fedora 31, but
other distros might need it to, so we'll move it upstream.
I do not recommend this functionality for general use, and the
manpages and other documentation will reflect this. But for some
upgrade cases, it will be the only thing that allows for a
working system.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
For future work, we need multiple implementations of the OCI
runtime, not just a Conmon-wrapped runtime matching the runc CLI.
As part of this, do some refactoring on the interface for exec
(move to a struct, not a massive list of arguments). Also, add
'all' support to Kill and Stop (supported by runc and used a bit
internally for removing containers).
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
when runc returns an error about not being v2 complient, catch the error
and logrus an actionable message for users.
Signed-off-by: baude <bbaude@redhat.com>
This requires updating all import paths throughout, and a matching
buildah update to interoperate.
I can't figure out the reason for go.mod tracking
github.com/containers/image v3.0.2+incompatible // indirect
((go mod graph) lists it as a direct dependency of libpod, but
(go list -json -m all) lists it as an indirect dependency),
but at least looking at the vendor subdirectory, it doesn't seem
to be actually used in the built binaries.
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
This ensures that containers that didn't require an evict will be
dealt with normally, and we only break out evict for containers
that refuse to be removed by normal means.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>