This can be allowed because it should only restrict more per the seccomp docs, and multiple apps use it today.
Signed-off-by: Jessica Frazelle <acidburn@docker.com>
Block kcmp, procees_vm_readv, process_vm_writev.
All these require CAP_PTRACE, and are only used for ptrace related
actions, so are not useful as we block ptrace.
Signed-off-by: Justin Cormack <justin.cormack@unikernel.com>
The bpf syscall can load code into the kernel which may
persist beyond container lifecycle. Requires CAP_SYS_ADMIN
already.
Signed-off-by: Justin Cormack <justin.cormack@unikernel.com>
These provide an in kernel virtual machine for x86 real mode on x86
used by one very early DOS emulator. Not required for any normal use.
Signed-off-by: Justin Cormack <justin.cormack@unikernel.com>
The stime syscall is a legacy syscall on some architectures
to set the clock, should be blocked as time is not namespaced.
Signed-off-by: Justin Cormack <justin.cormack@unikernel.com>
clock_adjtime is the new posix style version of adjtime allowing
a specific clock to be specified. Time is not namespaced, so do
not allow.
Signed-off-by: Justin Cormack <justin.cormack@unikernel.com>
This is a new version of init_module that takes a file descriptor
rather than a file name.
Signed-off-by: Justin Cormack <justin.cormack@unikernel.com>
The set_robust_list syscall sets the list of futexes which are
cleaned up on thread exit, and are needed to avoid mutexes
being held forever on thread exit.
See for example in Musl libc mutex handling:
http://git.musl-libc.org/cgit/musl/tree/src/thread/pthread_mutex_trylock.c#n22
Signed-off-by: Justin Cormack <justin.cormack@unikernel.com>
It's used for updating properties of one or more containers, we only
support resource configs for now. It can be extended in the future.
Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
- Make the API client library completely standalone.
- Move windows partition isolation detection to the client, so the
driver doesn't use external types.
Signed-off-by: David Calavera <david.calavera@gmail.com>
This restores the behavior that existed prior to #16235 for setting
OOMKilled, while retaining the additional benefits it introduced around
emitting the oom event.
This also adds a test for the most obvious OOM cases which would have
caught this regression.
Fixes#18510
Signed-off-by: Euan <euank@amazon.com>
Whether a shared/slave volume propagation will work or not also depends on
where source directory is mounted on and what are the propagation properties
of that mount point. For example, for shared volume mount to work, source
mount point should be shared. For slave volume mount to work, source mount
point should be either shared/slave.
This patch determines the mount point on which directory is mounted and
checks for desired minimum propagation properties of that mount point. It
errors out of configuration does not seem right.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Allow passing mount propagation option shared, slave, or private as volume
property.
For example.
docker run -ti -v /root/mnt-source:/root/mnt-dest:slave fedora bash
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Ubuntu 14.04 LTS is on apparmor 2.8.95.
This enables `ps` inside a container without causing
audit log entries on the host.
Signed-off-by: Joel Hansson <joel.hansson@ecraft.com>
It will Tar up contents of child directory onto tmpfs if mounted over
This patch will use the new PreMount and PostMount hooks to "tar"
up the contents of the base image on top of tmpfs mount points.
Signed-off-by: Dan Walsh <dwalsh@redhat.com>
libcontainer v0.0.4 introduces setting `/proc/self/oom_score_adj` to
better tune oom killing preferences for container process. This patch
simply integrates OomScoreAdj libcontainer's config option and adjust
the cli with this new option.
Signed-off-by: Antonio Murdaca <amurdaca@redhat.com>
Signed-off-by: Antonio Murdaca <runcom@redhat.com>
Adjust the docker-default profile for when the docker daemon is running in
AppArmor confinement. To enable 'docker kill' we need to allow the container
to receive kill signals from the daemon.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Our implementation of systemd cgroups is mixture of systemd api and
plain filesystem api. It's hard to keep it up to date with systemd and
it already contains some nasty bugs with new versions. Ideally it should
be replaced with some daemon flag which will allow to set parent systemd
slice.
Signed-off-by: Alexander Morozov <lk4d4@docker.com>
The LXC driver was deprecated in Docker 1.8.
Following the deprecation rules, we can remove a deprecated feature
after two major releases. LXC won't be supported anymore starting on Docker 1.10.
Signed-off-by: David Calavera <david.calavera@gmail.com>
This reverts commit d5cd032a86.
Commit caused issues on systems with case-insensitive filesystems.
Revert for now
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
- Move autogen/dockerversion to version
- Update autogen and "builds" to use this package and a build flag
Signed-off-by: Vincent Demeester <vincent@sbr.pm>
When running LXC dind (outer docker is started with native driver)
cgroup paths point to `/docker/CID` inside `/proc/self/mountinfo` but
these paths aren't mounted (root is wrong). This fix just discard the
cgroup dir from mountinfo and set it to root `/`.
This patch fixes/skip OOM LXC tests that were failing.
Fix#16520
Signed-off-by: Antonio Murdaca <runcom@linux.com>
Signed-off-by: Antonio Murdaca <amurdaca@redhat.com>
On LXC memory swap was only set to memory_limit*2 even if a value for
memory swap was provided. This patch fix this behavior to be the same
as the native driver and set correct memory swap in the template.
Also add a test specifically for LXC but w/o adding a new test
requirement.
Signed-off-by: Antonio Murdaca <runcom@linux.com>
Adds support for the daemon to handle user namespace maps as a
per-daemon setting.
Support for handling uid/gid mapping is added to the builder,
archive/unarchive packages and functions, all graphdrivers (except
Windows), and the test suite is updated to handle user namespace daemon
rootgraph changes.
Docker-DCO-1.1-Signed-off-by: Phil Estes <estesp@linux.vnet.ibm.com> (github: estesp)
This passes through the container hostname to HCS, which in Windows Server
2016 TP4 will set the container's hostname in the registry before starting
it. This will be silently ignored by TP3.
Signed-off-by: John Starks <jostarks@microsoft.com>
Although having a request ID available throughout the codebase is very
valuable, the impact of requiring a Context as an argument to every
function in the codepath of an API request, is too significant and was
not properly understood at the time of the review.
Furthermore, mixing API-layer code with non-API-layer code makes the
latter usable only by API-layer code (one that has a notion of Context).
This reverts commit de41640435, reversing
changes made to 7daeecd42d.
Signed-off-by: Tibor Vass <tibor@docker.com>
Conflicts:
api/server/container.go
builder/internals.go
daemon/container_unix.go
daemon/create.go