Signed-off-by: Stefan J. Wernli <swernli@microsoft.com>
Windows: add support for images stored in alternate location.
Signed-off-by: Stefan J. Wernli <swernli@microsoft.com>
Signed-off-by: Don Kjer <don.kjer@gmail.com>
Changing vendor/src/github.com/docker/libnetwork to match lindenlab/libnetwork custom-host-port-ranges-1.7 branch
Introduce a write denial for files at the root of /proc.
This prohibits root users from performing a chmod of those
files. The rules for denials in proc are also cleaned up,
making the rules better match their targets.
Locally tested on:
- Ubuntu precise (12.04) with AppArmor 2.7
- Ubuntu trusty (14.04) with AppArmor 2.8.95
Signed-off-by: Eric Windisch <eric@windisch.us>
This reverts commit 40b71adee3.
Original commit (for which this is effectively a rebased version) is
72a500e9e5 and was provided by Lei Jitang
<leijitang@huawei.com>.
Signed-off-by: Tim Dettrick <t.dettrick@uq.edu.au>
Commit e27c904 added a wrong and misleading comment
to GetMetadata(). Fix it using the wording from
commit 407a626 which introduced GetMetadata().
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
These functions are not part of the graphdriver.Driver
interface and should therefore be private.
Also, remove comments added by commit e27c904 as they are
* pretty obvious
* no longer required by golint
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Ploop graph driver provides its own ext4 filesystem to every
container. It so happens that ext4 root comes with lost+found
directory, causing failures from DriverTestCreateEmpty() and
DriverTestCreateBase() tests on ploop.
While I am not yet ready to submit ploop graph driver for review,
this change looks simple enough to push.
Note that filtering is done without any additional allocations,
as described in https://github.com/golang/go/wiki/SliceTricks.
[v2: added a comment about lost+found]
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Have network files mounted read-only when mounted using the -v
open and -v parameter has 'ro' passed.
Signed-off-by: Stefan Berger <stefanb@us.ibm.com>
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
It may happen that host system settings are changed while the daemon is running.
This will cause errors at libcontainer level when starting a container with a
particular hostConfig (e.g. hostConfig with memory swappiness but the memory
cgroup was umounted).
This patch adds an hostConfig check on container start to prevent the daemon
from even calling libcontainer with the wrong configuration as we're already
doing on container's creation).
Signed-off-by: Antonio Murdaca <runcom@linux.com>
(cherry picked from commit 0d2628cdf19783106ae8723f51fae0a7c7f361c6)
sysinfo struct was initialized at daemon startup to make sure
kernel configs such as device cgroup are present and error out if not.
The struct was embedded in daemon struct making impossible to detect
if some system config is changed at daemon runtime (i.e. someone
umount the memory cgroup). This leads to container's starts failure if
some config is changed at daemon runtime.
This patch moves sysinfo out of daemon and initilize and check it when
needed (daemon startup, containers creation, contaienrs startup for
now).
Signed-off-by: Antonio Murdaca <runcom@linux.com>
(cherry picked from commit 472b6f66e03f9a85fe8d23098dac6f55a87456d8)
Carried: #14015
If kernel is compiled with CONFIG_FAIR_GROUP_SCHED disabled cpu.shares
doesn't exist.
If kernel is compiled with CONFIG_CFQ_GROUP_IOSCHED disabled blkio.weight
doesn't exist.
If kernel is compiled with CONFIG_CPUSETS disabled cpuset won't be
supported.
We need to handle these conditions by checking sysinfo and verifying them.
Signed-off-by: Zefan Li <lizefan@huawei.com>
Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
After tailing a file, if the number of lines requested is > the number
of lines in the file, this would cause a json unmarshalling error to
occur when we later try to go follow the file.
So brute force set it to the end if any tailing occurred.
There is potential that there could be some missing log messages if logs
are being written very quickly, however I was not able to make this
happen even with `while true; do echo hello; done`, so this is probably
acceptable.
While testing this I also found a panic in LogWatcher.Close can be
called twice due to a race. Fix channel close to only close when there
has been no signal to the channel.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Some structures use int for sizes and UNIX timestamps. On some
platforms, int is 32 bits, so this can lead to the year 2038 issues and
overflows when dealing with large containers or layers.
Consistently use int64 to store sizes and UNIX timestamps in
api/types/types.go. Update related to code accordingly (i.e.
strconv.FormatInt instead of strconv.Itoa).
Use int64 in progressreader package to avoid integer overflow when
dealing with large quantities. Update related code accordingly.
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
Expand the godoc documentation for the graph package.
Centralize DefaultTag in the graphs/tag package instead of defining it
twice.
Remove some unnecessary "config" structs that are only used to pass
a few parameters to a function.
Simplify the GetParentsSize function - there's no reason for it to take
an accumulator argument.
Unexport some functions that aren't needed outside the package.
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
[pkg/archive] Update archive/copy path handling
- Remove unused TarOptions.Name field.
- Add new TarOptions.RebaseNames field.
- Update some of the logic around path dir/base splitting.
- Update some of the logic behind archive entry name rebasing.
[api/types] Add LinkTarget field to PathStat
[daemon] Fix stat, archive, extract of symlinks
These operations *should* resolve symlinks that are in the path but if the
resource itself is a symlink then it *should not* be resolved. This patch
puts this logic into a common function `resolvePath` which resolves symlinks
of the path's dir in scope of the container rootfs but does not resolve the
final element of the path. Now archive, extract, and stat operations will
return symlinks if the path is indeed a symlink.
[api/client] Update cp path hanling
[docs/reference/api] Update description of stat
Add the linkTarget field to the header of the archive endpoint.
Remove path field.
[integration-cli] Fix/Add cp symlink test cases
Copying a symlink should do just that: copy the symlink NOT
copy the target of the symlink. Also, the resulting file from
the copy should have the name of the symlink NOT the name of
the target file.
Copying to a symlink should copy to the symlink target and not
modify the symlink itself.
Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn)
TL;DR: check for IsExist(err) after a failed MkdirAll() is both
redundant and wrong -- so two reasons to remove it.
Quoting MkdirAll documentation:
> MkdirAll creates a directory named path, along with any necessary
> parents, and returns nil, or else returns an error. If path
> is already a directory, MkdirAll does nothing and returns nil.
This means two things:
1. If a directory to be created already exists, no error is returned.
2. If the error returned is IsExist (EEXIST), it means there exists
a non-directory with the same name as MkdirAll need to use for
directory. Example: we want to MkdirAll("a/b"), but file "a"
(or "a/b") already exists, so MkdirAll fails.
The above is a theory, based on quoted documentation and my UNIX
knowledge.
3. In practice, though, current MkdirAll implementation [1] returns
ENOTDIR in most of cases described in #2, with the exception when
there is a race between MkdirAll and someone else creating the
last component of MkdirAll argument as a file. In this very case
MkdirAll() will indeed return EEXIST.
Because of #1, IsExist check after MkdirAll is not needed.
Because of #2 and #3, ignoring IsExist error is just plain wrong,
as directory we require is not created. It's cleaner to report
the error now.
Note this error is all over the tree, I guess due to copy-paste,
or trying to follow the same usage pattern as for Mkdir(),
or some not quite correct examples on the Internet.
[v2: a separate aufs commit is merged into this one]
[1] https://github.com/golang/go/blob/f9ed2f75/src/os/path.go
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
The 'deny ptrace' statement was supposed to only ignore
ptrace failures in the AUDIT log. However, ptrace was implicitly
allowed from unconfined processes (such as the docker daemon and
its integration tests) due to the abstractions/base include.
This rule narrows the definition such that it will only ignore
the failures originating inside of the container and will not
cause denials when the daemon or its tests ptrace inside processes.
Introduces positive and negative tests for ptrace /w apparmor.
Signed-off-by: Eric Windisch <eric@windisch.us>
Integration tests were failing due to proc filter behavior
changes with new apparmor policies.
Also include the missing docker-unconfined policy resolving
potential startup errors. This policy is complain-only so
it should behave identically to the standard unconfined policy,
but will not apply system path-based policies within containers.
Signed-off-by: Eric Windisch <eric@windisch.us>
- downcase and privatize exported variables that were unused
- make accurate an error message
- added package comments
- remove unused var ReadLogsNotSupported
- enable linter
- some spelling corrections
Signed-off-by: Morgan Bauer <mbauer@us.ibm.com>
If I run two containers with the same network they share the same /etc/resolv.conf.
The current code changes the labels of the /etc/resolv.conf currently to the
private label which causes it to be unusable in the first container.
This patch changes the labels to a shared label if more then one container
will use the content.
Docker-DCO-1.1-Signed-off-by: Dan Walsh dwalsh@redhat.com (github: rhatdan)
Docker-DCO-1.1-Signed-off-by: Dan Walsh <dwalsh@redhat.com> (github: rhatdan)
Will attempt to load profiles automatically. If loading fails
but the profiles are already loaded, execution will continue.
A hard failure will only occur if Docker cannot load
the profiles *and* they have not already been loaded via
some other means.
Also introduces documentation for AppArmor.
Signed-off-by: Eric Windisch <eric@windisch.us>
The `ApplyDiff` function takes a tar archive stream that is
automagically decompressed later. This was causing a double
decompression, and when the layer was empty, that causes an early EOF.
Signed-off-by: Vincent Batts <vbatts@redhat.com>
Return an error when the container is stopped only in api versions
equal or greater than 1.20 (docker 1.8).
Signed-off-by: David Calavera <david.calavera@gmail.com>
daemon_test.go supposted to be unit test for daemon, so
don't see reason why we need another daemon_unit_test.go.
Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
In 1.6.2 we were decoding inspect API response into interface{}.
time.Time fields were JSON encoded as RFC3339Nano in the response
and when decoded into interface{} they were just strings so the inspect
template treated them as just strings.
From 1.7 we are decoding into types.ContainerJSON and when the template
gets executed it now gets a time.Time and it's formatted as
2015-07-22 05:02:38.091530369 +0000 UTC.
This patch brings back the old behavior by typing time.Time fields
as string so they gets formatted as they were encoded in JSON -- RCF3339Nano
Signed-off-by: Antonio Murdaca <runcom@linux.com>
* Add godoc documentation where it was missing
* Change identifier names that don't match Go style, such as INDEX_NAME
* Rename RegistryInfo to PingResult, which more accurately describes
what this structure is for. It also has the benefit of making the name
not stutter if used outside the package.
Updates #14756
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
There is no option validation for "journald" log-driver, so it makes no
sense to fail in that case.
Signed-off-by: Alexander Morozov <lk4d4@docker.com>
This patch creates a new cli package that allows to combine both client
and daemon commands (there is only one daemon command: docker daemon).
The `-d` and `--daemon` top-level flags are deprecated and a special
message is added to prompt the user to use `docker daemon`.
Providing top-level daemon-specific flags for client commands result
in an error message prompting the user to use `docker daemon`.
This patch does not break any old but correct usages.
This also makes `-d` and `--daemon` flags, as well as the `daemon`
command illegal in client-only binaries.
Signed-off-by: Tibor Vass <tibor@docker.com>
Prevent the docker daemon from mounting the created network files over
those provided by the user via -v command line option. This would otherwise
hide the one provide by the user.
The benefit of this is that a user can provide these network files using the
-v command line option and place them in a size-limited filesystem.
Signed-off-by: Stefan Berger <stefanb@us.ibm.com>
By using the 'unconfined' policy for privileged
containers, we have inherited the host's apparmor
policies, which really make no sense in the
context of the container's filesystem.
For instance, policies written against
the paths of binaries such as '/usr/sbin/tcpdump'
can be easily circumvented by moving the binary
within the container filesystem.
Fixes GH#5490
Signed-off-by: Eric Windisch <eric@windisch.us>
When we are creating a container, first we call into graph driver to take
snapshot of image and create root for container-init. Then we write some
files to it and call into graph driver again to create container root
from container-init as base.
Once we have written files to container-init root, we don't unmount it
before taking a snapshot of it. Looks like with XFS it leaves it in such
a state that when we mount the container root, it goes into log recovery
path.
Jul 22 10:24:54 vm2-f22 kernel: XFS (dm-6): Mounting V4 Filesystem
Jul 22 10:24:54 vm2-f22 kernel: XFS (dm-6): Starting recovery (logdev: internal)
Jul 22 10:24:54 vm2-f22 kernel: XFS (dm-6): Ending recovery (logdev: internal)
This should not be required. So let us unmount container-init before use
it as a base for container root and then XFS does not go into this
internal recovery path.
Somebody had raised this issue for ext4 sometime back and proposed the same
change. I had shot it down at that point of time. I think now time has
come for this change.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
It's introduced in
68ba5f0b69 (Execdriver implementation on new libcontainer API)
But I don't see reson why we need it.
Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
The following methods will deprecate the Copy method and introduce
two new, well-behaved methods for creating a tar archive of a resource
in a container and for extracting a tar archive into a directory in a
container.
Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn)
The automatic installation of AppArmor policies prevents the
management of custom, site-specific apparmor policies for the
default container profile. Furthermore, this change will allow
a future policy for the engine itself to be written without demanding
the engine be able to arbitrarily create and manage AppArmor policies.
- Add deb package suggests for apparmor.
- Ubuntu postinst use aa-status & fix policy path
- Add the policies to the debian packages.
- Add apparmor tests for writing proc files
Additional restrictions against modifying files in proc
are enforced by AppArmor. Ensure that AppArmor is preventing
access to these files, not simply Docker's configuration of proc.
- Remove /proc/k?mem from AA policy
The path to mem and kmem are in /dev, not /proc
and cannot be restricted successfully through AppArmor.
The device cgroup will need to be sufficient here.
- Load contrib/apparmor during integration tests
Note that this is somewhat dirty because we
cannot restore the host to its original configuration.
However, it should be noted that prior to this patch
series, the Docker daemon itself was loading apparmor
policy from within the tests, so this is no dirtier or
uglier than the status-quo.
Signed-off-by: Eric Windisch <eric@windisch.us>
There are several bug reports on this error happening, and error is
not helpful unless you read the code. Google brings up removing
the repositories.btrfs file.
Docker-DCO-1.1-Signed-off-by: Dan Walsh <dwalsh@redhat.com> (github: rhatdan)
The ZFS driver should raise proper errors when the ZFS utility is
missing or when there's no zfs partition active on the system. Raising the
proper errors make possible to silently ignore the ZFS storage
driver when no default storage driver is specified.
Previous to this commit it was no longer possible to start the
docker daemon in that way:
docker -d --storage-opt dm.loopdatasize=2GB
The above command resulted in an exit error because the ZFS driver
tried to use the storage options.
Signed-off-by: Flavio Castelli <fcastelli@suse.com>
Closes#14621
This one grew to be much more than I expected so here's the story... :-)
- when a bad port string (e.g. xxx80) is passed into container.create()
via the API it wasn't being checked until we tried to start the container.
- While starting the container we trid to parse 'xxx80' in nat.Int()
and would panic on the strconv.ParseUint(). We should (almost) never panic.
- In trying to remove the panic I decided to make it so that we, instead,
checked the string during the NewPort() constructor. This means that
I had to change all casts from 'string' to 'Port' to use NewPort() instead.
Which is a good thing anyway, people shouldn't assume they know the
internal format of types like that, in general.
- This meant I had to go and add error checks on all calls to NewPort().
To avoid changing the testcases too much I create newPortNoError() **JUST**
for the testcase uses where we know the port string is ok.
- After all of that I then went back and added a check during container.create()
to check the port string so we'll report the error as soon as we get the
data.
- If, somehow, the bad string does get into the metadata we will generate
an error during container.start() but I can't test for that because
the container.create() catches it now. But I did add a testcase for that.
Signed-off-by: Doug Davis <dug@us.ibm.com>
Current default basesize is 10G. Change it to 100G. Reason being that for
some people 10G is turning out to be too small and we don't have capabilities
to grow it dyamically.
This is just overcommitting and no real space is allocated till container
actually writes data. And this is no different then fs based graphdrivers
where virtual size of a container root is unlimited.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Replaced github.com/docker/libcontainer with
github.com/opencontainers/runc/libcontaier.
Also I moved AppArmor profile generation to docker.
Main idea of this update is to fix mounting cgroups inside containers.
After updating docker on CI we can even remove dind.
Signed-off-by: Alexander Morozov <lk4d4@docker.com>
- Refactor opts.ValidatePath and add an opts.ValidateDevice
ValidePath will now accept : containerPath:mode, hostPath:containerPath:mode
and hostPath:containerPath.
ValidateDevice will have the same behavior as current.
- Refactor opts.ValidateEnv, opts.ParseEnvFile
Environment variables will now be validated with the following
definition :
> Environment variables set by the user must have a name consisting
> solely of alphabetics, numerics, and underscores - the first of
> which must not be numeric.
Signed-off-by: Vincent Demeester <vincent@sbr.pm>
Memory swappiness option takes 0-100, and helps to tune swappiness
behavior per container.
For example, When a lower value of swappiness is chosen
the container will see minimum major faults. When no value is
specified for memory-swappiness in docker UI, it is inherited from
parent cgroup. (generally 60 unless it is changed).
Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
When a container is removed but it had an exec, that still hasn't been
GC'd per PR #14476, and someone tries to inspect the exec we should
return a 404, not a 500+container not running. Returning "..not running" is
not only misleading because it could lead people to think the container is
actually still around, but after 5 minutes the error will change to a 404
after the GC. This means that we're externalizing our internall soft-deletion/GC
logic which shouldn't be any of the end user's concern. They should get the
same results immediate or after 5 minutes.
Signed-off-by: Doug Davis <dug@us.ibm.com>
Libcontainer already supported mount container's own cgroup into
container, with this patch, we can see container's own cgroup info
in container.
Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
This takes the final removal for exec commands in two steps. The first
GC tick will mark the exec commands for removal and then the second tick
will remove the config from the daemon.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
This adds an event loop for running a GC cleanup for exec command
references that are on the daemon. These cannot be cleaned up
immediately because processes may need to get the exit status of the
exec command but it should not grow out of bounds. The loop is set to a
default 5 minute interval to perform cleanup.
It should be safe to perform this cleanup because unless the clients are
remembering the exec id of the process they launched they can query for
the status and see that it has exited. If they don't save the exec id
they will have to do an inspect on the container for all exec instances
and anything that is not live inside that container will not be returned
in the container inspect.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
This removes the exec config from the container after the command exits
so that dead exec commands are not displayed in the container inspect.
The commands are still kept on the daemon so that when you inspect the
exec command, not the container, you are still able to get it's exit
status.
This also changes the ProcessConfig to a pointer.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>