In 1.6.2 we were decoding inspect API response into interface{}.
time.Time fields were JSON encoded as RFC3339Nano in the response
and when decoded into interface{} they were just strings so the inspect
template treated them as just strings.
From 1.7 we are decoding into types.ContainerJSON and when the template
gets executed it now gets a time.Time and it's formatted as
2015-07-22 05:02:38.091530369 +0000 UTC.
This patch brings back the old behavior by typing time.Time fields
as string so they gets formatted as they were encoded in JSON -- RCF3339Nano
Signed-off-by: Antonio Murdaca <runcom@linux.com>
* Add godoc documentation where it was missing
* Change identifier names that don't match Go style, such as INDEX_NAME
* Rename RegistryInfo to PingResult, which more accurately describes
what this structure is for. It also has the benefit of making the name
not stutter if used outside the package.
Updates #14756
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
There is no option validation for "journald" log-driver, so it makes no
sense to fail in that case.
Signed-off-by: Alexander Morozov <lk4d4@docker.com>
This patch creates a new cli package that allows to combine both client
and daemon commands (there is only one daemon command: docker daemon).
The `-d` and `--daemon` top-level flags are deprecated and a special
message is added to prompt the user to use `docker daemon`.
Providing top-level daemon-specific flags for client commands result
in an error message prompting the user to use `docker daemon`.
This patch does not break any old but correct usages.
This also makes `-d` and `--daemon` flags, as well as the `daemon`
command illegal in client-only binaries.
Signed-off-by: Tibor Vass <tibor@docker.com>
Prevent the docker daemon from mounting the created network files over
those provided by the user via -v command line option. This would otherwise
hide the one provide by the user.
The benefit of this is that a user can provide these network files using the
-v command line option and place them in a size-limited filesystem.
Signed-off-by: Stefan Berger <stefanb@us.ibm.com>
By using the 'unconfined' policy for privileged
containers, we have inherited the host's apparmor
policies, which really make no sense in the
context of the container's filesystem.
For instance, policies written against
the paths of binaries such as '/usr/sbin/tcpdump'
can be easily circumvented by moving the binary
within the container filesystem.
Fixes GH#5490
Signed-off-by: Eric Windisch <eric@windisch.us>
When we are creating a container, first we call into graph driver to take
snapshot of image and create root for container-init. Then we write some
files to it and call into graph driver again to create container root
from container-init as base.
Once we have written files to container-init root, we don't unmount it
before taking a snapshot of it. Looks like with XFS it leaves it in such
a state that when we mount the container root, it goes into log recovery
path.
Jul 22 10:24:54 vm2-f22 kernel: XFS (dm-6): Mounting V4 Filesystem
Jul 22 10:24:54 vm2-f22 kernel: XFS (dm-6): Starting recovery (logdev: internal)
Jul 22 10:24:54 vm2-f22 kernel: XFS (dm-6): Ending recovery (logdev: internal)
This should not be required. So let us unmount container-init before use
it as a base for container root and then XFS does not go into this
internal recovery path.
Somebody had raised this issue for ext4 sometime back and proposed the same
change. I had shot it down at that point of time. I think now time has
come for this change.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
It's introduced in
68ba5f0b69 (Execdriver implementation on new libcontainer API)
But I don't see reson why we need it.
Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
The following methods will deprecate the Copy method and introduce
two new, well-behaved methods for creating a tar archive of a resource
in a container and for extracting a tar archive into a directory in a
container.
Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn)
The automatic installation of AppArmor policies prevents the
management of custom, site-specific apparmor policies for the
default container profile. Furthermore, this change will allow
a future policy for the engine itself to be written without demanding
the engine be able to arbitrarily create and manage AppArmor policies.
- Add deb package suggests for apparmor.
- Ubuntu postinst use aa-status & fix policy path
- Add the policies to the debian packages.
- Add apparmor tests for writing proc files
Additional restrictions against modifying files in proc
are enforced by AppArmor. Ensure that AppArmor is preventing
access to these files, not simply Docker's configuration of proc.
- Remove /proc/k?mem from AA policy
The path to mem and kmem are in /dev, not /proc
and cannot be restricted successfully through AppArmor.
The device cgroup will need to be sufficient here.
- Load contrib/apparmor during integration tests
Note that this is somewhat dirty because we
cannot restore the host to its original configuration.
However, it should be noted that prior to this patch
series, the Docker daemon itself was loading apparmor
policy from within the tests, so this is no dirtier or
uglier than the status-quo.
Signed-off-by: Eric Windisch <eric@windisch.us>
There are several bug reports on this error happening, and error is
not helpful unless you read the code. Google brings up removing
the repositories.btrfs file.
Docker-DCO-1.1-Signed-off-by: Dan Walsh <dwalsh@redhat.com> (github: rhatdan)
The ZFS driver should raise proper errors when the ZFS utility is
missing or when there's no zfs partition active on the system. Raising the
proper errors make possible to silently ignore the ZFS storage
driver when no default storage driver is specified.
Previous to this commit it was no longer possible to start the
docker daemon in that way:
docker -d --storage-opt dm.loopdatasize=2GB
The above command resulted in an exit error because the ZFS driver
tried to use the storage options.
Signed-off-by: Flavio Castelli <fcastelli@suse.com>
Closes#14621
This one grew to be much more than I expected so here's the story... :-)
- when a bad port string (e.g. xxx80) is passed into container.create()
via the API it wasn't being checked until we tried to start the container.
- While starting the container we trid to parse 'xxx80' in nat.Int()
and would panic on the strconv.ParseUint(). We should (almost) never panic.
- In trying to remove the panic I decided to make it so that we, instead,
checked the string during the NewPort() constructor. This means that
I had to change all casts from 'string' to 'Port' to use NewPort() instead.
Which is a good thing anyway, people shouldn't assume they know the
internal format of types like that, in general.
- This meant I had to go and add error checks on all calls to NewPort().
To avoid changing the testcases too much I create newPortNoError() **JUST**
for the testcase uses where we know the port string is ok.
- After all of that I then went back and added a check during container.create()
to check the port string so we'll report the error as soon as we get the
data.
- If, somehow, the bad string does get into the metadata we will generate
an error during container.start() but I can't test for that because
the container.create() catches it now. But I did add a testcase for that.
Signed-off-by: Doug Davis <dug@us.ibm.com>
Current default basesize is 10G. Change it to 100G. Reason being that for
some people 10G is turning out to be too small and we don't have capabilities
to grow it dyamically.
This is just overcommitting and no real space is allocated till container
actually writes data. And this is no different then fs based graphdrivers
where virtual size of a container root is unlimited.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Replaced github.com/docker/libcontainer with
github.com/opencontainers/runc/libcontaier.
Also I moved AppArmor profile generation to docker.
Main idea of this update is to fix mounting cgroups inside containers.
After updating docker on CI we can even remove dind.
Signed-off-by: Alexander Morozov <lk4d4@docker.com>
- Refactor opts.ValidatePath and add an opts.ValidateDevice
ValidePath will now accept : containerPath:mode, hostPath:containerPath:mode
and hostPath:containerPath.
ValidateDevice will have the same behavior as current.
- Refactor opts.ValidateEnv, opts.ParseEnvFile
Environment variables will now be validated with the following
definition :
> Environment variables set by the user must have a name consisting
> solely of alphabetics, numerics, and underscores - the first of
> which must not be numeric.
Signed-off-by: Vincent Demeester <vincent@sbr.pm>
Memory swappiness option takes 0-100, and helps to tune swappiness
behavior per container.
For example, When a lower value of swappiness is chosen
the container will see minimum major faults. When no value is
specified for memory-swappiness in docker UI, it is inherited from
parent cgroup. (generally 60 unless it is changed).
Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
When a container is removed but it had an exec, that still hasn't been
GC'd per PR #14476, and someone tries to inspect the exec we should
return a 404, not a 500+container not running. Returning "..not running" is
not only misleading because it could lead people to think the container is
actually still around, but after 5 minutes the error will change to a 404
after the GC. This means that we're externalizing our internall soft-deletion/GC
logic which shouldn't be any of the end user's concern. They should get the
same results immediate or after 5 minutes.
Signed-off-by: Doug Davis <dug@us.ibm.com>
Libcontainer already supported mount container's own cgroup into
container, with this patch, we can see container's own cgroup info
in container.
Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
This takes the final removal for exec commands in two steps. The first
GC tick will mark the exec commands for removal and then the second tick
will remove the config from the daemon.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
This adds an event loop for running a GC cleanup for exec command
references that are on the daemon. These cannot be cleaned up
immediately because processes may need to get the exit status of the
exec command but it should not grow out of bounds. The loop is set to a
default 5 minute interval to perform cleanup.
It should be safe to perform this cleanup because unless the clients are
remembering the exec id of the process they launched they can query for
the status and see that it has exited. If they don't save the exec id
they will have to do an inspect on the container for all exec instances
and anything that is not live inside that container will not be returned
in the container inspect.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
This removes the exec config from the container after the command exits
so that dead exec commands are not displayed in the container inspect.
The commands are still kept on the daemon so that when you inspect the
exec command, not the container, you are still able to get it's exit
status.
This also changes the ProcessConfig to a pointer.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
The ability to save and verify base device UUID (#13896) introduced a
situation where the initialization would panic when removing the device
returns EBUSY.
Functions `verifyBaseDeviceUUID` and `saveBaseDeviceUUID` now take the
lock on the `DeviceSet`, which solves the problem as `removeDevice`
assumes it owns the lock.
Signed-off-by: Arnaud Porterie <arnaud.porterie@docker.com>
Often it happens that docker is not able to shutdown/remove the thin
pool it created because some device has leaked into some mount name
space. That means device is in use and that means pool can't be removed.
Docker will leave pool as it is and exit. Later when user starts the
docker, it finds pool is already there and docker uses it. But docker
does not know it is same pool which is using the loop devices. Now
docker thinks loop devices are not being used. That means it does not
display the data correctly in "docker info", giving user wrong information.
This patch tries to detect if loop devices as created by docker are
being used for pool and fills in the right details in "docker info".
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
If a container is read-only, also set /proc, /sys,
& /dev to read-only. This should apply to both privileged and
unprivileged containers.
Note that when /dev is read-only, device files may still be
written to. This change will simply prevent the device paths
from being modified, or performing mknod of new devices within
the /dev path.
Tests are included for all cases. Also adds a test to ensure
that /dev/pts is always mounted read/write, even in the case of a
read-write rootfs. The kernel restricts writes here naturally and
bad things will happen if we mount it ro.
Signed-off-by: Eric Windisch <eric@windisch.us>
DeviceMapper must be explicitly selected because the Docker binary might not be linked to the right devmapper library.
With this change, Docker fails fast if the driver detection finds the devicemapper directory but the driver is not the default option.
The option `override_udev_sync_check` doesn't make sense anymore, since the user must be explicit to select devicemapper, so it's being removed.
Docker fails to use devicemapper only if Docker has been built statically unless the option was explicit.
Signed-off-by: David Calavera <david.calavera@gmail.com>
- Container networking statistics are no longer
retrievable from libcontainer after the introduction
of libnetwork. This change adds the missing code
for docker daemon to retireve the nw stats from
Endpoint.
Signed-off-by: Alessandro Boch <aboch@docker.com>
libnetwork host, none and bridge driver initialization is incorrectly
disabled if the daemon flag --bridge=none. The expected behavior of
setting --bridge as none is to disable the bridge driver alone and let
all other modes to be operational.
Signed-off-by: Madhu Venugopal <madhu@docker.com>
By convention /pkg is safe to use from outside the docker tree, for example
if you're building a docker orchestrator.
/nat currently doesn't have any dependencies outside of /pkg, so it seems
reasonable to move it there.
This rename was performed with:
```
gomvpkg -vcs_mv_cmd="git mv {{.Src}} {{.Dst}}" \
-from github.com/docker/docker/nat \
-to github.com/docker/docker/pkg/nat
```
Signed-off-by: Peter Waller <p@pwaller.net>
Export metadata for container and image in docker-inspect when overlay
graphdriver is in use. Right now it is done only for devicemapper graph
driver.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
When a container is started with `--net=host` with
a particular name and it is subsequently destroyed,
then all subsequent creations of the container with
the same name will fail. This is because in `--net=host`
the namespace is shared i.e the host namespace so
trying to destroy the host namespace by calling
`LeaveAll` will fail and the endpoint is left with
the dangling state. So the fix is, for this mode, do
not attempt to destroy the namespace but just cleanup
the endpoint state and return.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
With publish-service and default-network support, a container could be
connected to a user-defined network that is backed by any driver/plugin.
But if the user uses port mapping or expose commands, the expectation
for that container is to behave like existing bridge network.
Thanks to the Libnetwork's CNM model, containers can be connected
to the bridge network as a secondary network in addition to the
user-specified network.
Signed-off-by: Madhu Venugopal <madhu@docker.com>
- brings in vxlan based native multihost networking
- added a daemon flag required by libkv for dist kv operations
- moved the daemon flags to experimental
Signed-off-by: Madhu Venugopal <madhu@docker.com>
This commit makes use of the CNM model supported by LibNetwork and
provides an ability to let a container to publish a specified service.
Behind the scenes, if a service with the given name doesnt exist, it is
automatically created on appropriate network and attach the container.
Signed-off-by: Alessandro Boch <aboch@docker.com>
Signed-off-by: Madhu Venugopal <madhu@docker.com>
By default, the cgroup setting in libcontainer's configs.Cgroup for
memory swappiness will default to 0, which is a valid choice for memory
swappiness, but that means by default every container's memory
swappiness will be set to zero instead of the default 60, which is
probably not what users are expecting.
When the swappiness UI PR comes into Docker, there will be docker run
controls to set this per container, but for now we want to make sure
*not* to change the default, as well as work around an older kernel
issue that refuses to allow it to be set when cgroup hiearchies are in
use.
Docker-DCO-1.1-Signed-off-by: Phil Estes <estesp@linux.vnet.ibm.com> (github: estesp)
This commit also brings in the ability to specify a default network and its
corresponding driver as daemon flags. This helps in existing clients to
make use of newer networking features provided by libnetwork.
Signed-off-by: Madhu Venugopal <madhu@docker.com>
It is easy for one to use docker for a while, shut it down and restart
docker with different set of storage options for device mapper driver
which will effectively change the thin pool. That means any of the
metadata stored in /var/lib/docker/devicemapper/metadata/ is not valid
for the new pool and user will run into various kind of issues like
container not found in the pool etc.
Users think that their images or containers are lost but it might just
be the case of configuration issue. People might use wrong metadata
with wrong pool.
To detect such situations, save UUID of base image and once docker
starts later, query and compare the UUID of base image with the
stored one. If they don't match, fail the initialization with the
error that UUID failed to match.
That way user will be forced to cleanup /var/lib/docker/ directory
and start docker again.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
An inspection of the graph package showed this function to be way out of place.
It is only depended upon by the daemon code. The function prepares a top-level
readonly layer used to provide a consistent runtime environment for docker
images.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
If no Mtu value is provided to the docker daemon, get the mtu from the
default route's interface. If there is no default route, default to a
mtu of 1500.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Export image/container metadata stored in graph driver. Right now 3 fields
DeviceId, DeviceSize and DeviceName are being exported from devicemapper.
Other graph drivers can export fields as they see fit.
This data can be used to mount the thin device outside of docker and tools
can look into image/container and do some kind of inspection.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
This PR brings the vendored libnetwork code to
3be488927db8d719568917203deddd630a194564, which pulls in quite a few
fixes to support kvstore, windows daemon compilation fixes,
multi-network support for Bridge driver, etc...
Signed-off-by: Madhu Venugopal <madhu@docker.com>
This is breaking various setups where the host's rootfs is mount shared
correctly and breaks live migration with bind mounts.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
When the daemon is going down trigger immediate
garbage collection of libnetwork resources deleted
like namespace path since there will be no way to
remove them when the daemon restarts.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Move graph related functions in image to graph package.
Consolidating graph functionality is the first step in refactoring graph into an image store model.
Subsequent refactors will involve breaking up graph into multiple types with a strongly defined interface.
Signed-off-by: Derek McGowan <derek@mcgstyle.net> (github: dmcgowan)
container.config.NetworkDisabled is set for both daemon's
DisableNetwork and --networking=false case. Hence using
this flag instead to fix#13725.
There is an existing integration-test to catch this issue,
but it is working for the wrong reasons.
Signed-off-by: Madhu Venugopal <madhu@docker.com>
Move some calls to container.LogEvent down lower so that there's
less of a chance of them being missed. Also add a few more events
that appear to have been missed.
Added testcases for new events: commit, copy, resize, attach, rename, top
Signed-off-by: Doug Davis <dug@us.ibm.com>
Merge user specified devices correctly with default devices.
Otherwise the user specified devices end up without permissions.
Signed-off-by: David R. Jenni <david.r.jenni@gmail.com>
When using a scanner, log lines over 64K will crash the Copier with
bufio.ErrTooLong. Subsequently, the ioutils.bufReader will grow without
bound as the logs are no longer being flushed to disk.
Signed-off-by: Burke Libbey <burke.libbey@shopify.com>
daemon.Diff already implements mounting for naivegraphdriver and
aufs which does diffing on its owns does not need the container to be mounted.
So new filesystem driver should mount filesystems on their own if it is needed
to implement Diff(). This issue was reported by @kvasdopil while working on a
freebsd port, because freebsd does not allow mount an already mounted
filesystem. Also it saves some cycles for other operating systems as well.
Signed-off-by: Jörg Thalheim <joerg@higgsboson.tk>
I ran into a situation where I was trying:
`docker rmi busybox`
and it kept failing saying:
`could not find image: Prefix can't be empty`
While I have no idea how I got into this situation, it turns out this is
error message is from `daemon.canDeleteImage()`. In that func we loop over
all containers checking to see if they're using the image we're trying to
delete. In my case though, I had a container with no ImageID. So the code
would die tryig to find that image (hence the "Prefix can't be empty" err).
This would stop all processing despite the fact that the container we're
checking had nothing to do with 'busybox'.
My change logs the bad situation in the logs and then skips that container.
There's no reason to fail all `docker rmi ...` calls just because of one
bad container.
Will continue to try to figure out how I got a container w/o an ImageID
but as of now I have no idea, I didn't do anything but normal docker cli
commands.
Signed-off-by: Doug Davis <dug@us.ibm.com>
Fixes a regression from the volumes refactor where the vfs graphdriver
was setting labels for volumes to `s0` so that they can both be written
to by the container and shared with other containers.
When moving away from vfs this was never re-introduced.
Since this needs to happen regardless of volume driver, this is
implemented outside of the driver.
Fixes issue where `z` and `Z` labels are not set for bind-mounts.
Don't lock while creating volumes
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
UnmountVolumes used to also unmount 'specialMounts' but it no longer does after
a recent refactor of volumes. This patch corrects this behavior to include
unmounting of `networkMounts` which replaces `specialMounts` (now dead code).
Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn)
I'm fairly consistently seeing an error in
DockerSuite.TestContainerApiRestartNotimeoutParam:
docker_api_containers_test.go:969:
c.Assert(status, check.Equals, http.StatusNoContent)
... obtained int = 500
... expected int = 204
And in the daemon logs I see:
INFO[0003] Container 8cf77c20275586b36c5095613159cf73babf92ba42ed4a2954bd55dca6b08971 failed to exit within 0 seconds of SIGTERM - using the force
ERRO[0003] Handler for POST /containers/{name:.*}/restart returned error: Cannot restart container 8cf77c20275586b36c5095613159cf73babf92ba42ed4a2954bd55dca6b08971: [2] Container does not exist: container destroyed
ERRO[0003] HTTP Error err=Cannot restart container 8cf77c20275586b36c5095613159cf73babf92ba42ed4a2954bd55dca6b08971: [2] Container does not exist: container destroyed
statusCode=500
Note the "container destroyed" error message. This is being generatd by
the libcontainer code and bubbled up in container.Kill() as a result of the
call to `container.killPossiblyDeadProcess(9)` on line 439.
See the comment in the code, but what I think is going on is that because we
don't have any timeout on the Stop() call we immediate try to force things to
stop. And by the time we get into libcontainer code the process just finished
stopping due to the initial signal, so this secondary sig-9 fails due to the
container no longer running (ie. its 'destroyed').
Since we can't look for "container destroyed" to just ignore the error, because
some other driver might have different text, I opted to just ignore the error
and keep going - with the assumption that if it couldnt send a sig-9 to the
process then it MUST be because its already dead and not something else.
To reproduce this I just run:
curl -v -X POST http://127.0.0.1:2375/v1.19/containers/8cf77c20275586b36c5095613159cf73babf92ba42ed4a2954bd55dca6b08971/restart
a few times and then it fails with the HTTP 500.
Would like to hear some other ideas on to handle this since I'm not
thrilled with the proposed solution.
Signed-off-by: Doug Davis <dug@us.ibm.com>
* Don't AllocateNetwork when network is disabled
* Don't createNetwork in execdriver when network is disabled
Signed-off-by: Alexander Morozov <lk4d4@docker.com>
We should let user create container even if the container he wants
join is not running, that check should be done at start time.
In this case, the running check is done by getIpcContainer() when
we start container.
Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
Signed by all authors:
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Signed-off-by: Arnaud Porterie <arnaud.porterie@docker.com>
Signed-off-by: David Calavera <david.calavera@gmail.com>
Signed-off-by: Jeff Lindsay <progrium@gmail.com>
Signed-off-by: Alexander Morozov <lk4d4@docker.com>
Signed-off-by: Luke Marsden <luke@clusterhq.com>
Signed-off-by: David Calavera <david.calavera@gmail.com>
Sometimes container.cleanup() can be called from multiple paths
for the same container during error conditions from monitor and
regular startup path. So if the container network has been already
released do not try to release it again.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
The DOCKER_EXPERIMENTAL environment variable drives the activation of
the 'experimental' build tag.
Signed-off-by: Arnaud Porterie <arnaud.porterie@docker.com>