Commit Graph

211 Commits

Author SHA1 Message Date
Vincent Demeester c32f8bb36a Merge pull request #17704 from LK4D4/default_cgroupfs
Use fs cgroups by default
2015-11-19 14:01:13 +01:00
Alexander Morozov 419fd7449f Use fs cgroups by default
Our implementation of systemd cgroups is mixture of systemd api and
plain filesystem api. It's hard to keep it up to date with systemd and
it already contains some nasty bugs with new versions. Ideally it should
be replaced with some daemon flag which will allow to set parent systemd
slice.

Signed-off-by: Alexander Morozov <lk4d4@docker.com>
2015-11-18 16:16:13 -08:00
Alexander Morozov 9187656305 Merge pull request #16639 from mrunalp/skip_dev_setup
Skip /dev setup in container when it is bind mounted in
2015-11-17 11:36:00 -08:00
John Howard be2f53ece8 Windows: Fix native exec template
Signed-off-by: John Howard <jhoward@microsoft.com>
2015-10-31 11:39:19 -07:00
Dan Walsh 0c518b6ab2 Docker is calling cont.Destroy twice on success
Signed-off-by: Dan Walsh <dwalsh@redhat.com>
2015-10-19 14:53:55 -04:00
Antonio Murdaca cfcddefacd daemon: execdriver: lxc: fix cgroup paths
When running LXC dind (outer docker is started with native driver)
cgroup paths point to `/docker/CID` inside `/proc/self/mountinfo` but
these paths aren't mounted (root is wrong). This fix just discard the
cgroup dir from mountinfo and set it to root `/`.
This patch fixes/skip OOM LXC tests that were failing.
Fix #16520

Signed-off-by: Antonio Murdaca <runcom@linux.com>
Signed-off-by: Antonio Murdaca <amurdaca@redhat.com>
2015-10-13 14:46:59 -07:00
Phil Estes 442b45628e Add user namespace (mapping) support to the Docker engine
Adds support for the daemon to handle user namespace maps as a
per-daemon setting.

Support for handling uid/gid mapping is added to the builder,
archive/unarchive packages and functions, all graphdrivers (except
Windows), and the test suite is updated to handle user namespace daemon
rootgraph changes.

Docker-DCO-1.1-Signed-off-by: Phil Estes <estesp@linux.vnet.ibm.com> (github: estesp)
2015-10-09 17:47:37 -04:00
Deng Guangxing a1620084c5 remove useless function generateIfaceName()
generateIfaceName() is useless as libnetwork has done
the job.

Signed-off-by: Deng Guangxing <dengguangxing@huawei.com>
2015-10-09 18:07:54 +08:00
Tibor Vass b08f071e18 Revert "Merge pull request #16228 from duglin/ContextualizeEvents"
Although having a request ID available throughout the codebase is very
valuable, the impact of requiring a Context as an argument to every
function in the codepath of an API request, is too significant and was
not properly understood at the time of the review.

Furthermore, mixing API-layer code with non-API-layer code makes the
latter usable only by API-layer code (one that has a notion of Context).

This reverts commit de41640435, reversing
changes made to 7daeecd42d.

Signed-off-by: Tibor Vass <tibor@docker.com>

Conflicts:
	api/server/container.go
	builder/internals.go
	daemon/container_unix.go
	daemon/create.go
2015-09-29 14:26:51 -04:00
Mrunal Patel 4911b58862 Skip /dev setup in container when it is bind mounted in
We need to set the device array to nil to skip /dev setup in runc/libcontainer.
See c9d5850629

Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
2015-09-28 18:28:16 -04:00
Michael Crosby f6064cb42b Update CAP_ prefix for new spec format
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2015-09-24 15:00:30 -07:00
Doug Davis 26b1064967 Add context.RequestID to event stream
This PR adds a "request ID" to each event generated, the 'docker events'
stream now looks like this:

```
2015-09-10T15:02:50.000000000-07:00 [reqid: c01e3534ddca] de7c5d4ca927253cf4e978ee9c4545161e406e9b5a14617efb52c658b249174a: (from ubuntu) create
```
Note the `[reqID: c01e3534ddca]` part, that's new.

Each HTTP request will generate its own unique ID. So, if you do a
`docker build` you'll see a series of events all with the same reqID.
This allow for log processing tools to determine which events are all related
to the same http request.

I didn't propigate the context to all possible funcs in the daemon,
I decided to just do the ones that needed it in order to get the reqID
into the events. I'd like to have people review this direction first, and
if we're ok with it then I'll make sure we're consistent about when
we pass around the context - IOW, make sure that all funcs at the same level
have a context passed in even if they don't call the log funcs - this will
ensure we're consistent w/o passing it around for all calls unnecessarily.

ping @icecrime @calavera @crosbymichael

Signed-off-by: Doug Davis <dug@us.ibm.com>
2015-09-24 11:56:37 -07:00
Jess Frazelle 23750fb802 Merge pull request #15862 from calavera/share_shm_and_mqueue
Share shm and mqueue between containers.
2015-09-24 11:23:59 -07:00
Hu Keping f05bacbe50 Events for OOM needs to be shift to an earlier time
It's worth to warn user as soon as possilbe when OOM happend.

Signed-off-by: Hu Keping <hukeping@huawei.com>
2015-09-21 10:18:08 +08:00
Madhu Venugopal e148e763b8 Update native execdriver to exploit libcontainer hooks
Using @mavenugo's patch for enabling the libcontainer pre-start hook to
be used for network namespace initialization (correcting the conflict
with user namespaces); updated the boolean check to the more generic
SupportsHooks() name, and fixed the hook state function signature.

Signed-off-by: Madhu Venugopal <madhu@docker.com>
Docker-DCO-1.1-Signed-off-by: Phil Estes <estesp@linux.vnet.ibm.com> (github: estesp)
2015-09-16 12:51:14 -04:00
Mrunal Patel c8291f7107 Add support for sharing /dev/shm/ and /dev/mqueue between containers
This changeset creates /dev/shm and /dev/mqueue mounts for each container under
/var/lib/containers/<id>/ and bind mounts them into the container. When --ipc:container<id/name>
is used, then the /dev/shm and /dev/mqueue of the ipc container are used instead of creating
new ones for the container.

Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
Docker-DCO-1.1-Signed-off-by: Dan Walsh <dwalsh@redhat.com> (github: rhatdan)

(cherry picked from commit d88fe447df)
2015-09-11 14:02:11 -04:00
Hu Keping 40d3ce1063 Minor typo
Signed-off-by: Hu Keping <hukeping@huawei.com>
2015-09-10 14:13:15 +08:00
David Calavera 688dd8477e Revert "Add support for sharing /dev/shm/ and /dev/mqueue between containers"
This reverts commit d88fe447df.

Signed-off-by: David Calavera <david.calavera@gmail.com>
2015-08-26 05:23:00 -04:00
Jessie Frazelle 903cd2b9e3 Merge pull request #12159 from mrunalp/feature/ipc_share_dev
ipc: Share /dev/shm and /dev/mqueue when --ipc container:<id/name> is used
2015-08-24 17:55:03 -07:00
David Calavera 9bac520c12 Merge pull request #15571 from ewindisch/apparmor_denywproc
AppArmor: Deny w to /proc/* files
2015-08-24 11:03:41 +02:00
Mrunal Patel d88fe447df Add support for sharing /dev/shm/ and /dev/mqueue between containers
This changeset creates /dev/shm and /dev/mqueue mounts for each container under
/var/lib/containers/<id>/ and bind mounts them into the container. When --ipc:container<id/name>
is used, then the /dev/shm and /dev/mqueue of the ipc container are used instead of creating
new ones for the container.

Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
Docker-DCO-1.1-Signed-off-by: Dan Walsh <dwalsh@redhat.com> (github: rhatdan)
2015-08-19 12:36:52 -04:00
Eric Windisch 7342d59114 AppArmor: Deny w to /proc/* files
Introduce a write denial for files at the root of /proc.

This prohibits root users from performing a chmod of those
files. The rules for denials in proc are also cleaned up,
making the rules better match their targets.

Locally tested on:
- Ubuntu precise (12.04) with AppArmor 2.7
- Ubuntu trusty (14.04) with AppArmor 2.8.95

Signed-off-by: Eric Windisch <eric@windisch.us>
2015-08-13 15:39:25 -04:00
Tim Dettrick 03f65b3d0d Revert "Revert "Add docker exec run a command in privileged mode""
This reverts commit 40b71adee3.

Original commit (for which this is effectively a rebased version) is
72a500e9e5 and was provided by Lei Jitang
<leijitang@huawei.com>.

Signed-off-by: Tim Dettrick <t.dettrick@uq.edu.au>
2015-08-13 16:36:44 +10:00
Jessica Frazelle e542238f2a remove docker-unconfined profile we were not using it and it breaks apparmor on wheezy
Signed-off-by: Jessica Frazelle <acidburn@docker.com>
2015-08-06 16:51:01 -07:00
Jessica Frazelle ed248207d7 revert apparmor changes back to how it was in 1.7.1, but keep tests
Signed-off-by: Jessica Frazelle <acidburn@docker.com>
2015-08-06 12:49:25 -07:00
Tibor Vass 2d730c93b4 Merge pull request #15148 from hqhq/hq_golint_native
Add back golint for daemon/execdriver/native
2015-07-30 15:51:06 -04:00
Tibor Vass a687448c4d Merge pull request #15163 from crosbymichael/proc-ro
Don't mount /proc as ro
2015-07-30 15:12:29 -04:00
Eric Windisch f5c388b35a Only explicitly deny ptrace for container-originated procs
The 'deny ptrace' statement was supposed to only ignore
ptrace failures in the AUDIT log. However, ptrace was implicitly
allowed from unconfined processes (such as the docker daemon and
its integration tests) due to the abstractions/base include.

This rule narrows the definition such that it will only ignore
the failures originating inside of the container and will not
cause denials when the daemon or its tests ptrace inside processes.

Introduces positive and negative tests for ptrace /w apparmor.

Signed-off-by: Eric Windisch <eric@windisch.us>
2015-07-30 14:40:28 -04:00
Michael Crosby bfc51cf660 Don't mount /proc as ro
This caused a regression with LSM labeling.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2015-07-30 10:57:50 -07:00
Qiang Huang e34f562a77 Add back golint for daemon/execdriver/native
It's broken by #15099 Fix it.

Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
2015-07-30 08:58:54 +08:00
Eric Windisch 5832715052 Fix the proc integration test & include missing AA profile
Integration tests were failing due to proc filter behavior
changes with new apparmor policies.

Also include the missing docker-unconfined policy resolving
potential startup errors. This policy is complain-only so
it should behave identically to the standard unconfined policy,
but will not apply system path-based policies within containers.

Signed-off-by: Eric Windisch <eric@windisch.us>
2015-07-29 17:08:51 -04:00
Jessie Frazelle d7661cb48b Merge pull request #15099 from ewindisch/apparmor-restore-en
Restore AppArmor generation + fixes
2015-07-29 09:36:59 -07:00
Eric Windisch 3edc88f76d Restore AppArmor profile generation
Will attempt to load profiles automatically. If loading fails
but the profiles are already loaded, execution will continue.

A hard failure will only occur if Docker cannot load
the profiles *and* they have not already been loaded via
some other means.

Also introduces documentation for AppArmor.

Signed-off-by: Eric Windisch <eric@windisch.us>
2015-07-28 17:45:51 -04:00
Qiang Huang 3d17c3bb66 Fix golint warnings for daemon/execdriver/*
Addresses: #14756

Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
2015-07-28 08:43:22 +08:00
David Calavera 94ab0d312f Revert "Introduce a dedicated unconfined AA policy"
This reverts commit 87376c3add.

Signed-off-by: David Calavera <david.calavera@gmail.com>
2015-07-24 16:35:51 -07:00
David Calavera ac9fc03c74 Merge pull request #14855 from ewindisch/apparmor-unconfined
Introduce a dedicated unconfined AA policy
2015-07-23 10:21:51 -07:00
Eric Windisch 87376c3add Introduce a dedicated unconfined AA policy
By using the 'unconfined' policy for privileged
containers, we have inherited the host's apparmor
policies, which really make no sense in the
context of the container's filesystem.

For instance, policies written against
the paths of binaries such as '/usr/sbin/tcpdump'
can be easily circumvented by moving the binary
within the container filesystem.

Fixes GH#5490

Signed-off-by: Eric Windisch <eric@windisch.us>
2015-07-22 11:28:32 -04:00
Qiang Huang af3059855c Remove unused parameter in NewTtyConsole
It's introduced in
68ba5f0b69 (Execdriver implementation on new libcontainer API)

But I don't see reson why we need it.

Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
2015-07-22 10:32:31 +08:00
Alexander Morozov 6ae377ffa0 Remove unused TtyTerminal interface
It was used only by integration tests, which now gone.

Signed-off-by: Alexander Morozov <lk4d4@docker.com>
2015-07-21 09:56:28 -07:00
Eric Windisch 80d99236c1 Move AppArmor policy to contrib & deb packaging
The automatic installation of AppArmor policies prevents the
management of custom, site-specific apparmor policies for the
default container profile. Furthermore, this change will allow
a future policy for the engine itself to be written without demanding
the engine be able to arbitrarily create and manage AppArmor policies.

- Add deb package suggests for apparmor.
- Ubuntu postinst use aa-status & fix policy path
- Add the policies to the debian packages.
- Add apparmor tests for writing proc files
Additional restrictions against modifying files in proc
are enforced by AppArmor. Ensure that AppArmor is preventing
access to these files, not simply Docker's configuration of proc.
- Remove /proc/k?mem from AA policy
The path to mem and kmem are in /dev, not /proc
and cannot be restricted successfully through AppArmor.
The device cgroup will need to be sufficient here.
- Load contrib/apparmor during integration tests
Note that this is somewhat dirty because we
cannot restore the host to its original configuration.
However, it should be noted that prior to this patch
series, the Docker daemon itself was loading apparmor
policy from within the tests, so this is no dirtier or
uglier than the status-quo.

Signed-off-by: Eric Windisch <eric@windisch.us>
2015-07-21 11:05:53 -04:00
Alexander Morozov c86189d554 Update libcontainer
Replaced github.com/docker/libcontainer with
github.com/opencontainers/runc/libcontaier.
Also I moved AppArmor profile generation to docker.

Main idea of this update is to fix mounting cgroups inside containers.
After updating docker on CI we can even remove dind.

Signed-off-by: Alexander Morozov <lk4d4@docker.com>
2015-07-16 16:02:26 -07:00
Qiang Huang a7f5e1c4c3 Remove cgroup read-only flag when privileged
Fixes: #14543

It needs libcontainer fix from:
https://github.com/opencontainers/runc/pull/91

Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
2015-07-14 15:00:41 +08:00
Mrunal Patel e0d96fb3ef Adds support for specifying additional groups.
Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
2015-07-13 14:47:28 -04:00
Jessie Frazelle 703248da20 Merge pull request #13669 from ewindisch/readonly-proc
Make /proc, /sys, & /dev readonly for readonly containers
2015-07-10 15:32:13 -07:00
Qiang Huang f18fb5b3ef Add cgroup bind mount by default
Libcontainer already supported mount container's own cgroup into
container, with this patch, we can see container's own cgroup info
in container.

Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
2015-07-10 13:12:09 +08:00
Eric Windisch 5400d8873f Make /proc, /sys, /dev readonly for readonly containers
If a container is read-only, also set /proc, /sys,
& /dev to read-only. This should apply to both privileged and
unprivileged containers.

Note that when /dev is read-only, device files may still be
written to. This change will simply prevent the device paths
from being modified, or performing mknod of new devices within
the /dev path.

Tests are included for all cases. Also adds a test to ensure
that /dev/pts is always mounted read/write, even in the case of a
read-write rootfs. The kernel restricts writes here naturally and
bad things will happen if we mount it ro.

Signed-off-by: Eric Windisch <eric@windisch.us>
2015-07-02 19:08:00 +00:00
unclejack c1477db04f daemon: lower allocations
Signed-off-by: Cristian Staretu <cristian.staretu@gmail.com>
2015-06-30 01:45:31 +03:00
Phil Estes 9e9d227677 Initialize swappiness in libcontainer cgroups template
By default, the cgroup setting in libcontainer's configs.Cgroup for
memory swappiness will default to 0, which is a valid choice for memory
swappiness, but that means by default every container's memory
swappiness will be set to zero instead of the default 60, which is
probably not what users are expecting.

When the swappiness UI PR comes into Docker, there will be docker run
controls to set this per container, but for now we want to make sure
*not* to change the default, as well as work around an older kernel
issue that refuses to allow it to be set when cgroup hiearchies are in
use.

Docker-DCO-1.1-Signed-off-by: Phil Estes <estesp@linux.vnet.ibm.com> (github: estesp)
2015-06-18 19:27:04 -04:00
Alexander Morozov f1b59d64d2 Remove useless debug message
Signed-off-by: Alexander Morozov <lk4d4@docker.com>
2015-06-01 11:15:15 -07:00
Arnaud Porterie b50e780925 Merge pull request #13491 from jfrazelle/revert-exec-privileged
Revert "Add docker exec run a command in privileged mode"
2015-05-26 16:41:50 -07:00