Commit Graph

468 Commits

Author SHA1 Message Date
Sebastiaan van Stijn 967acd56c1 Merge pull request #18512 from euank/18510-fixOomKilled
Set OOMKilled state on any OOM event
2016-01-11 00:09:26 +01:00
Justin Cormack 13a9d4e899 Add i386 specific modify_ldt syscall to default seccomp filter
This syscall is used by Go on i386 binaries, although not by libc.

Signed-off-by: Justin Cormack <justin.cormack@unikernel.com>
2016-01-10 12:00:11 +00:00
Jess Frazelle 938d28e772 Merge pull request #19144 from LK4D4/fix_parent_systemd
Choose default-cgroup parent by cgroup driver
2016-01-07 10:24:51 -08:00
Alexander Morozov c1cd45d547 Choose default-cgroup parent by cgroup driver
It's "/docker" for cgroupfs and "system.slice" for systemd.

Fix #19140

Signed-off-by: Alexander Morozov <lk4d4@docker.com>
2016-01-07 08:56:26 -08:00
David Calavera 907407d0b2 Modify import paths to point to the new engine-api package.
Signed-off-by: David Calavera <david.calavera@gmail.com>
2016-01-06 19:48:59 -05:00
David Calavera 4ee3048fa8 Merge pull request #19110 from brahmaroutu/update_openc
update runc to the latest code base to fix gccgo builds
2016-01-06 15:09:11 -08:00
Srini Brahmaroutu 9982631707 update runc to the latest code base to fix gccgo build
Signed-off-by: Srini Brahmaroutu <srbrahma@us.ibm.com>
2016-01-06 00:02:56 +00:00
Justin Cormack 822c4f79ab
Allow the waitpid syscall
This version is sometimes used eg by glibc on x86

Signed-off-by: Justin Cormack <justin.cormack@unikernel.com>
2016-01-05 09:29:16 -08:00
Justin Cormack ca3ae72e43
Support compatible architectures with default seccomp rules
In the default seccomp rule, allow use of 32 bit syscalls on
64 bit architectures, so you can run x86 Linux images on x86_64
without disabling seccomp or using a custom rule.

Signed-off-by: Justin Cormack <justin.cormack@unikernel.com>
2016-01-05 09:28:42 -08:00
Justin Cormack d8e06d54cf
Allow sigreturn syscall
This is used on some 32 bit architectures, eg x86

Signed-off-by: Justin Cormack <justin.cormack@unikernel.com>
2016-01-04 16:11:59 -08:00
Justin Cormack 923609179b
Add _llseek syscall
This is the newer verion of lseek on many 32 bit platforms

Signed-off-by: Justin Cormack <justin.cormack@unikernel.com>
2016-01-04 11:55:28 -08:00
Justin Cormack d6a9c5abed
Do not allow obsolete syscalls
sysfs and ustat syscalls are marked obsolete.

Signed-off-by: Justin Cormack <justin.cormack@unikernel.com>
2016-01-04 11:55:28 -08:00
Justin Cormack c1b57fc1c9
Do not allow name_to_handle_at, as we have already blocked open_by_handle_at
Being able to obtain a file handle is no use as we cannot perform
any operation in it, and it may leak kernel state.

Signed-off-by: Justin Cormack <justin.cormack@unikernel.com>
2016-01-04 11:55:27 -08:00
Jessica Frazelle a1747b3cc8
add 32bit syscalls to whitelist
Signed-off-by: Jessica Frazelle <acidburn@docker.com>
2016-01-04 11:55:26 -08:00
Jessica Frazelle 17735c3c98
change seccomp blacklist to whitelist
Signed-off-by: Jessica Frazelle <acidburn@docker.com>
2016-01-04 11:55:21 -08:00
Lukas Waslowski 9a03967f0a Fix declarations of of execdriver/native.NewDriver to have the same signature.
This change is done so that driver_unsupported.go and driver_unsupported_nocgo.go
declare the same signature for NewDriver as driver.go.

Fixes #19032

Signed-off-by: Lukas Waslowski <cr7pt0gr4ph7@gmail.com>
2016-01-02 19:55:37 +01:00
Jess Frazelle abc695d9d5 Merge pull request #18974 from jfrazelle/remove-seccomp-from-seccomp-profile
remove seccomp from seccomp profile
2015-12-29 13:15:14 -08:00
Arnaud Porterie a81e438544 Merge pull request #18969 from justincormack/vm86
Block vm86 syscalls in default seccomp profile
2015-12-29 11:57:35 -08:00
Arnaud Porterie 2307f47fdd Merge pull request #18972 from justincormack/bpf
Block bpf syscall from default seccomp profile
2015-12-29 11:57:07 -08:00
Arnaud Porterie e01cab1cc5 Merge pull request #18971 from justincormack/ptrace
Block additional ptrace related syscalls in default seccomp profile
2015-12-29 11:56:51 -08:00
Jessica Frazelle b610fc226a
remove seccomp from seccomp profile
This can be allowed because it should only restrict more per the seccomp docs, and multiple apps use it today.

Signed-off-by: Jessica Frazelle <acidburn@docker.com>
2015-12-29 11:21:33 -08:00
Arnaud Porterie 94e0760868 Merge pull request #18947 from jfrazelle/fix-seccomp-unsupported
fix default profile where unsupported
2015-12-29 10:21:07 -08:00
Arnaud Porterie afdc4747dc Merge pull request #18953 from justincormack/robust_list
Allow use of robust list syscalls in default seccomp policy
2015-12-29 10:19:41 -08:00
Arnaud Porterie a32b06b067 Merge pull request #18956 from justincormack/umount
Block original umount syscall in default seccomp filter
2015-12-29 10:19:04 -08:00
Justin Cormack a0a8ca0ae0 Block additional ptrace related syscalls in default seccomp profile
Block kcmp, procees_vm_readv, process_vm_writev.
All these require CAP_PTRACE, and are only used for ptrace related
actions, so are not useful as we block ptrace.

Signed-off-by: Justin Cormack <justin.cormack@unikernel.com>
2015-12-29 18:17:28 +00:00
Arnaud Porterie ad8bce2ce4 Merge pull request #18959 from justincormack/finit_module
Deny finit_module in default seccomp profile
2015-12-29 10:12:50 -08:00
Arnaud Porterie 8ac3d083a8 Merge pull request #18961 from justincormack/clock_adjtime
Block clock_adjtime in default seccomp config
2015-12-29 10:08:45 -08:00
Justin Cormack 33568405f3 Block bpf syscall from default seccomp profile
The bpf syscall can load code into the kernel which may
persist beyond container lifecycle. Requires CAP_SYS_ADMIN
already.

Signed-off-by: Justin Cormack <justin.cormack@unikernel.com>
2015-12-29 17:28:30 +00:00
Justin Cormack 6c3ea7a511 Block vm86 syscalls in default seccomp profile
These provide an in kernel virtual machine for x86 real mode on x86
used by one very early DOS emulator. Not required for any normal use.

Signed-off-by: Justin Cormack <justin.cormack@unikernel.com>
2015-12-29 15:47:23 +00:00
Justin Cormack 6300a08be9 Block stime in default seccomp profile
The stime syscall is a legacy syscall on some architectures
to set the clock, should be blocked as time is not namespaced.

Signed-off-by: Justin Cormack <justin.cormack@unikernel.com>
2015-12-29 15:28:05 +00:00
Justin Cormack 0e5c43cdda Block clock_adjtime in default seccomp config
clock_adjtime is the new posix style version of adjtime allowing
a specific clock to be specified. Time is not namespaced, so do
not allow.

Signed-off-by: Justin Cormack <justin.cormack@unikernel.com>
2015-12-29 12:48:16 +00:00
Justin Cormack 0d5306a0b6 Deny finit_module in default seccomp profile
This is a new version of init_module that takes a file descriptor
rather than a file name.

Signed-off-by: Justin Cormack <justin.cormack@unikernel.com>
2015-12-29 12:31:33 +00:00
Justin Cormack 9be0d93cf7 Block original umount syscall in default seccomp filter
The original umount syscall without flags argument needs to
be blocked too.

Signed-off-by: Justin Cormack <justin.cormack@unikernel.com>
2015-12-29 11:57:16 +00:00
Justin Cormack 7b133e7235 Allow use of robust list syscalls
The set_robust_list syscall sets the list of futexes which are
cleaned up on thread exit, and are needed to avoid mutexes
being held forever on thread exit.

See for example in Musl libc mutex handling:
http://git.musl-libc.org/cgit/musl/tree/src/thread/pthread_mutex_trylock.c#n22

Signed-off-by: Justin Cormack <justin.cormack@unikernel.com>
2015-12-29 10:22:05 +00:00
Jessica Frazelle b4c14a0bb8
fix code comment
Signed-off-by: Jessica Frazelle <acidburn@docker.com>
2015-12-28 22:36:54 -08:00
Jessica Frazelle 94b45310f4
fix default profile where unsupported
Signed-off-by: Jessica Frazelle <acidburn@docker.com>
2015-12-28 20:42:15 -08:00
Jessica Frazelle 15674c5fb7
add docs and unconfined to run a container without the default seccomp profile
Signed-off-by: Jessica Frazelle <acidburn@docker.com>
2015-12-28 10:26:51 -08:00
Jessica Frazelle 947293a280
set default seccomp profile
Signed-off-by: Jessica Frazelle <acidburn@docker.com>
2015-12-28 10:18:47 -08:00
Qiang Huang 8799c4fc0f Implemet docker update command
It's used for updating properties of one or more containers, we only
support resource configs for now. It can be extended in the future.

Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
2015-12-28 19:19:26 +08:00
Daniel Nephin 83237aab2b Remove package pkg/ulimit, use go-units instead.
Signed-off-by: Daniel Nephin <dnephin@docker.com>
2015-12-23 13:27:58 -05:00
David Calavera 7ac4232e70 Move Config and HostConfig from runconfig to types/container.
- Make the API client library completely standalone.
- Move windows partition isolation detection to the client, so the
  driver doesn't use external types.

Signed-off-by: David Calavera <david.calavera@gmail.com>
2015-12-22 13:34:30 -05:00
David Calavera 056e744903 Replace usage of pkg/nat with go-connections/nat.
Signed-off-by: David Calavera <david.calavera@gmail.com>
2015-12-22 13:31:46 -05:00
Ma Shimiao 843084b08b Add support for blkio read/write iops device
Signed-off-by: Ma Shimiao <mashimiao.fnst@cn.fujitsu.com>
2015-12-21 09:14:49 +08:00
Jess Frazelle ff69b23dc0 Merge pull request #18395 from LK4D4/default_cgroup_is_not_daemon
Use /docker as cgroup parent instead of docker
2015-12-17 13:59:00 -08:00
Euan 0b5131444d Set OOMKilled state on any OOM event
This restores the behavior that existed prior to #16235 for setting
OOMKilled, while retaining the additional benefits it introduced around
emitting the oom event.

This also adds a test for the most obvious OOM cases which would have
caught this regression.

Fixes #18510

Signed-off-by: Euan <euank@amazon.com>
2015-12-15 19:27:57 +00:00
Brian Goff ce0b1841c8 Merge pull request #17034 from rhvgoyal/volume-propagation
Capability to specify per volume mount propagation mode
2015-12-15 12:14:41 -05:00
Alexander Morozov ecc3717cb1 Merge pull request #18612 from mrunalp/update_runc
Update runc/libcontainer to v0.0.6
2015-12-14 13:05:53 -08:00
Jess Frazelle c38aa60180 Merge pull request #18393 from qzio/apparmor/ptrace-ubuntu14
Enable ptrace in a container on apparmor below 2.9
2015-12-14 10:07:01 -08:00
Vivek Goyal d4b4ce2588 Check Propagation properties of source mount point
Whether a shared/slave volume propagation will work or not also depends on
where source directory is mounted on and what are the propagation properties
of that mount point. For example, for shared volume mount to work, source
mount point should be shared. For slave volume mount to work, source mount
point should be either shared/slave.

This patch determines the mount point on which directory is mounted and
checks for desired minimum propagation properties of that mount point. It
errors out of configuration does not seem right.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
2015-12-14 10:39:53 -05:00
Vivek Goyal a2dc4f79f2 Add capability to specify mount propagation per volume
Allow passing mount propagation option shared, slave, or private as volume
property.

For example.
docker run -ti -v /root/mnt-source:/root/mnt-dest:slave fedora bash

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
2015-12-14 10:39:53 -05:00