This patch adds pause/unpause to the command line, api, and drivers
for use on containers. This is implemented using the cgroups/freeze
utility in libcontainer and lxc freeze/unfreeze.
Co-Authored-By: Eric Windisch <ewindisch@docker.com>
Co-Authored-By: Chris Alfonso <calfonso@redhat.com>
Docker-DCO-1.1-Signed-off-by: Ian Main <imain@redhat.com> (github: imain)
This also makes sure that devices are pointers to avoid copies
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
We now have one place that keeps track of (most) devices that are allowed and created within the container. That place is pkg/libcontainer/devices/devices.go
This fixes several inconsistencies between which devices were created in the lxc backend and the native backend. It also fixes inconsistencies between wich devices were created and which were allowed. For example, /dev/full was being created but it was not allowed within the cgroup. It also declares the file modes and permissions of the default devices, rather than copying them from the host. This is in line with docker's philosphy of not being host dependent.
Docker-DCO-1.1-Signed-off-by: Timothy Hobbs <timothyhobbs@seznam.cz> (github: https://github.com/timthelion)
Add specific types for Required and Optional DeviceNodes
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
We need SETFCAP to be able to mark files as having caps, which is
heavily used by fedora.
See https://github.com/dotcloud/docker/issues/5928
We also need SETPCAP, for instance systemd needs this to set caps
on its childen.
Both of these are safe in the sense that they can never ever
result in a process with a capability not in the bounding set of the
container.
We also add NET_BIND_SERVICE caps, to be able to bind to ports lower
than 1024.
Docker-DCO-1.1-Signed-off-by: Alexander Larsson <alexl@redhat.com> (github: alexlarsson)
those that were specified in the config. This commit also explicitly
adds a set of capabilities that we were silently not dropping and were
assumed by the tests.
Docker-DCO-1.1-Signed-off-by: Victor Marmol <vmarmol@google.com> (github: vmarmol)
This updates systemd.Apply to match the fs backend by:
* Always join blockio controller (for stats)
* Support CpusetCpus
* Support MemorySwap
Also, it removes the generic UnitProperties in favour of a single
option to set the slice.
Docker-DCO-1.1-Signed-off-by: Alexander Larsson <alexl@redhat.com> (github: alexlarsson)
This also cleans up some of the left over restriction paths code from
before.
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
It has been pointed out that some files in /proc and /sys can be used
to break out of containers. However, if those filesystems are mounted
read-only, most of the known exploits are mitigated, since they rely
on writing some file in those filesystems.
This does not replace security modules (like SELinux or AppArmor), it
is just another layer of security. Likewise, it doesn't mean that the
other mitigations (shadowing parts of /proc or /sys with bind mounts)
are useless. Those measures are still useful. As such, the shadowing
of /proc/kcore is still enabled with both LXC and native drivers.
Special care has to be taken with /proc/1/attr, which still needs to
be mounted read-write in order to enable the AppArmor profile. It is
bind-mounted from a private read-write mount of procfs.
All that enforcement is done in dockerinit. The code doing the real
work is in libcontainer. The init function for the LXC driver calls
the function from libcontainer to avoid code duplication.
Docker-DCO-1.1-Signed-off-by: Jérôme Petazzoni <jerome@docker.com> (github: jpetazzo)