Commit Graph

91 Commits

Author SHA1 Message Date
Kenfe-Mickael Laventure 8ef01f724b Handle out-of-sync libcontainerd client on restore
Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
(cherry picked from commit 51f21a1674c60108f97878815046c69f769cee48)
Signed-off-by: Tibor Vass <tibor@docker.com>
2016-08-11 17:56:48 -07:00
Tonis Tiigi c473d14d45 libcontainerd: mark container exited after failed restart
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
(cherry picked from commit 9be0fb45c25e4d8d3cf0aa444da5ae41dd18f435)
Signed-off-by: Tibor Vass <tibor@docker.com>
2016-08-11 17:56:47 -07:00
Tonis Tiigi f6d388f5b1 libcontainerd: wait for restart after state change
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
(cherry picked from commit 495448b2903c1a765cc17dff05afebe16a466917)
Signed-off-by: Tibor Vass <tibor@docker.com>
2016-08-11 17:56:47 -07:00
Michael Crosby eddee8e932 Check if the container is running if no event
When there is no event for a container, the cause can be a crash
(e.g. a machine crash) that leaves the container state on the persistent
disk out of sync with what was in `/run`.

This situation creates an unkillable container in docker: containerd
does not see the container and it is not running, but docker thinks it
is, and there is no way to tell docker otherwise.

This fixes the issue by checking if containerd has the container running
if we do not have an event instead of just returning.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2016-07-28 09:22:51 -07:00
Lei Jitang c46db363d6 Fix daemon panic on restoring containers
Signed-off-by: Lei Jitang <leijitang@huawei.com>
(cherry picked from commit c75de8e33cc0db5236eef6146f2de06533b46aa8)
Signed-off-by: Tibor Vass <tibor@docker.com>
2016-07-27 14:51:42 -07:00
Kenfe-Mickael Laventure 47b7cf5ceb Fix missing unlock in libcontainerd.Restore()
This was preventing the "exit" event from being correctly processed during
the restore process without live-restore enabled.

Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
(cherry picked from commit ac068a1f9de2b20b145b5682cd514c1f6b1fac17)
Signed-off-by: Tibor Vass <tibor@docker.com>
2016-07-25 23:24:37 -07:00
Kenfe-Mickael Laventure 6142557cba Prepend libcontainerd log message with "libcontainerd:"
This will make it easier to pinpoint error messages in the daemon
logs.

Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
(cherry picked from commit 5231c5534679206e20672ca16bbee5c10d699319)
Signed-off-by: Tibor Vass <tibor@docker.com>
2016-07-25 23:21:14 -07:00
Kenfe-Mickael Laventure afc64c2d71 Update libcontainerd.AddProcess to accept a context
Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
(cherry picked from commit c02f82756e914081543bf05cb1815a48c02b1ebd)
Signed-off-by: Tibor Vass <tibor@docker.com>
2016-07-25 23:15:43 -07:00
Kenfe-Mickael Laventure b7687cc673 Do not rely on "live" event anymore
Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
(cherry picked from commit 64483c3bdaa1887b8b932e0564362fbbff025dc0)
Signed-off-by: Tibor Vass <tibor@docker.com>
2016-07-25 23:15:42 -07:00
Kenfe-Mickael Laventure 6c717a5744 Vendor in new containerd
This version introduces the following:
 - uses nanosecond timestamps for events
 - ensures events are sent once their effect is "live"

Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
(cherry picked from commit 29b2714580d085533c29807fa337c2b7a302abb6)
Signed-off-by: Tibor Vass <tibor@docker.com>
2016-07-25 23:15:42 -07:00
Tonis Tiigi 09b01499b7 Wait for the reader fifo opening to block
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
(cherry picked from commit 0b2023130e285a0207be9fda4b22e1419997c552)
Signed-off-by: Tibor Vass <tibor@docker.com>
2016-07-25 23:15:12 -07:00
Kenfe-Mickael Laventure ec03307eb2 Fix data race in libcontainerd
Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
(cherry picked from commit 8e9fbc8f5fc5759eb7f26ec998f227994ff6c642)
Signed-off-by: Tibor Vass <tibor@docker.com>
2016-07-25 23:15:10 -07:00
Michael Crosby 89986cbff4 Add --oom-score-adjust to daemon
This adds an `--oom-score-adjust` flag to the daemon so that the value
provided can be set for the docker daemon's process.  The default value
for the flag is -500.  This makes the docker daemon less likely to be
killed before containers are.  The default value for
processes is 0 with a min/max of -1000/1000.

-500 is a good middle ground because it is less than the default for
most processes and still not -1000 which basically means never kill this
process in an OOM condition on the host machine.  The only processes on
my machine that have a score less than -500 are dbus at -900 and sshd
and xfce (my window manager) at -1000.  I don't think docker should be
set lower, by default, than dbus or sshd so that is why I chose -500.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
(cherry picked from commit a894aec8d81de5484152a76d76b80809df9edd71)
Signed-off-by: Tibor Vass <tibor@docker.com>
2016-07-25 23:14:57 -07:00
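As a rough illustration of the `--oom-score-adjust` commit above: on Linux, a process's OOM score adjustment is set by writing an integer to `/proc/<pid>/oom_score_adj`. The sketch below checks a value against the -1000/1000 range quoted in the commit message and writes it to a caller-supplied path. The names `validateOOMScoreAdj` and `writeOOMScoreAdj` are hypothetical and are not taken from the daemon's code.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
)

// validateOOMScoreAdj reports an error if score is outside the
// -1000..1000 range quoted in the commit message.
func validateOOMScoreAdj(score int) error {
	if score < -1000 || score > 1000 {
		return fmt.Errorf("invalid oom score adjustment %d: must be between -1000 and 1000", score)
	}
	return nil
}

// writeOOMScoreAdj validates score and writes it to path. For a real
// process the path would be /proc/<pid>/oom_score_adj (Linux only);
// lowering the value usually requires privileges.
func writeOOMScoreAdj(path string, score int) error {
	if err := validateOOMScoreAdj(score); err != nil {
		return err
	}
	return os.WriteFile(path, []byte(strconv.Itoa(score)), 0o644)
}

func main() {
	// Write to a scratch file rather than /proc so the sketch runs unprivileged.
	path := filepath.Join(os.TempDir(), "oom_score_adj.example")
	defer os.Remove(path)

	if err := writeOOMScoreAdj(path, -500); err != nil {
		fmt.Println("error:", err)
		return
	}
	data, _ := os.ReadFile(path)
	fmt.Println(string(data))
}
```

Taking the path as a parameter keeps the validation testable without touching `/proc`.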
Alexander Morozov 926d66b50f all: fix usage of some variables
Signed-off-by: Alexander Morozov <lk4d4@docker.com>
(cherry picked from commit 57e14714ee85e67f59d8c22aed23dc875cf2e58c)
Signed-off-by: Tibor Vass <tibor@docker.com>
2016-07-25 23:13:48 -07:00
John Howard 0a861141fa Windows: Ensure VolumePath is not set for Hyper-V containers
Signed-off-by: John Howard <jhoward@microsoft.com>
(cherry picked from commit fd4f5c23650799a7e76e193614bf82454b375fe3)
Signed-off-by: Tibor Vass <tibor@docker.com>
2016-07-08 15:33:28 -07:00
Kenfe-Mickael Laventure 7b2e5216b8 Add support for multiple runtimes
Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
2016-06-14 07:47:31 -07:00
Michael Crosby 3020081e94 Merge pull request #23213 from crosbymichael/restore-option
Add --live-restore flag
2016-06-13 20:57:19 -07:00
Michael Crosby d705dab1b1 Add --live-restore flag
This flag enables full support of daemonless containers in docker.  It
ensures that docker does not stop containers on shutdown or restore and
properly reconnects to the container when restarted.

This is not the default because of backwards compat but should be the
desired outcome for people running containers in prod.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2016-06-13 19:16:26 -07:00
Sebastiaan van Stijn 50c7bcac1e Merge pull request #23443 from swernli/servicing-async
Updating call sequence for servicing Windows containers
2016-06-13 19:49:23 +02:00
Antonio Murdaca 44ccbb317c *: fix logrus.Warn[f]
Signed-off-by: Antonio Murdaca <runcom@redhat.com>
2016-06-11 19:42:38 +02:00
Stefan J. Wernli f2ad7be2c4 Updating call sequence for servicing Windows containers
This change adjusts the calling pattern for servicing containers to wait on the process instead of expecting start to block.  This is safer, as it avoids timeouts in the start code path for the potentially expensive update operation.

Signed-off-by: Stefan J. Wernli <swernli@microsoft.com>
2016-06-10 15:19:10 -07:00
Kenfe-Mickael Laventure 64a91ee74e Increase containerd start-timeout to 2 minutes
Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
(cherry picked from commit 4251e1e99e16ff7ff5557ee16e5bef26a14cd127)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2016-06-10 16:22:19 +02:00
Jannick Fahlbusch e3490cdcc0 Fix some typos
Signed-off-by: Jannick Fahlbusch <git@jf-projects.de>
2016-06-08 21:59:34 +02:00
Thomas Leonard b6c7becbfe Add support for user-defined healthchecks
This PR adds support for user-defined health-check probes for Docker
containers. It adds a `HEALTHCHECK` instruction to the Dockerfile syntax plus
some corresponding "docker run" options. It can be used with a restart policy
to automatically restart a container if the check fails.

The `HEALTHCHECK` instruction has two forms:

* `HEALTHCHECK [OPTIONS] CMD command` (check container health by running a command inside the container)
* `HEALTHCHECK NONE` (disable any healthcheck inherited from the base image)

The `HEALTHCHECK` instruction tells Docker how to test a container to check that
it is still working. This can detect cases such as a web server that is stuck in
an infinite loop and unable to handle new connections, even though the server
process is still running.

When a container has a healthcheck specified, it has a _health status_ in
addition to its normal status. This status is initially `starting`. Whenever a
health check passes, it becomes `healthy` (whatever state it was previously in).
After a certain number of consecutive failures, it becomes `unhealthy`.

The options that can appear before `CMD` are:

* `--interval=DURATION` (default: `30s`)
* `--timeout=DURATION` (default: `30s`)
* `--retries=N` (default: `1`)

The health check will first run **interval** seconds after the container is
started, and then again **interval** seconds after each previous check completes.

If a single run of the check takes longer than **timeout** seconds then the check
is considered to have failed.

It takes **retries** consecutive failures of the health check for the container
to be considered `unhealthy`.

There can only be one `HEALTHCHECK` instruction in a Dockerfile. If you list
more than one then only the last `HEALTHCHECK` will take effect.

The command after the `CMD` keyword can be either a shell command (e.g. `HEALTHCHECK
CMD /bin/check-running`) or an _exec_ array (as with other Dockerfile commands;
see e.g. `ENTRYPOINT` for details).

The command's exit status indicates the health status of the container.
The possible values are:

- 0: success - the container is healthy and ready for use
- 1: unhealthy - the container is not working correctly
- 2: starting - the container is not ready for use yet, but is working correctly
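
The status transitions described in this message can be sketched as a small state machine. This is a minimal sketch, not the daemon's implementation: the type `health` and method `record` are hypothetical names, and two behaviours are assumptions rather than statements from the message: an exit code of 2 while still `starting` leaves the state unchanged, and any exit code other than 0, 1 or 2 counts as a failure.

```go
package main

import "fmt"

const (
	statusStarting  = "starting"
	statusHealthy   = "healthy"
	statusUnhealthy = "unhealthy"
)

// health tracks a container's health status and its current run of
// consecutive probe failures.
type health struct {
	status        string
	failingStreak int
}

// newHealth returns the initial state: the status starts as "starting".
func newHealth() *health { return &health{status: statusStarting} }

// record applies one probe exit code. retries is the number of consecutive
// failures required before the status becomes "unhealthy".
func (h *health) record(exitCode, retries int) {
	switch {
	case exitCode == 0:
		// Any passing check makes the container healthy, whatever came before.
		h.status = statusHealthy
		h.failingStreak = 0
	case exitCode == 2 && h.status == statusStarting:
		// Assumption: "starting" during start-up leaves the state unchanged.
	default:
		// Exit 1, or exit 2 after leaving "starting", counts as a failure.
		// Assumption: other non-zero exit codes are failures as well.
		h.failingStreak++
		if h.failingStreak >= retries {
			h.status = statusUnhealthy
		}
	}
}

func main() {
	h := newHealth()
	for _, code := range []int{2, 0, 1, 1, 1} {
		h.record(code, 3)
		fmt.Println(code, h.status, h.failingStreak)
	}
}
```

A check that exceeds its timeout would be fed into `record` as a failure, in line with the timeout rule stated earlier in the message.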

If the probe returns 2 ("starting") when the container has already moved out of the
"starting" state then it is treated as "unhealthy" instead.

For example, to check every five minutes or so that a web-server is able to
serve the site's main page within three seconds:

    HEALTHCHECK --interval=5m --timeout=3s \
      CMD curl -f http://localhost/ || exit 1

To help debug failing probes, any output text (UTF-8 encoded) that the command writes
on stdout or stderr will be stored in the health status and can be queried with
`docker inspect`. Such output should be kept short (only the first 4096 bytes
are stored currently).
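
The storage limit mentioned above amounts to a simple byte-length cap. The sketch below is illustrative only; `limitOutput` and `maxOutputLen` are hypothetical names, and cutting at a byte boundary is an assumption about how the limit is applied.

```go
package main

import (
	"fmt"
	"strings"
)

// maxOutputLen is the 4096-byte limit quoted in the commit message.
const maxOutputLen = 4096

// limitOutput keeps at most the first maxOutputLen bytes of probe output.
// Note: a byte-based cut can split a multi-byte UTF-8 character.
func limitOutput(out []byte) []byte {
	if len(out) > maxOutputLen {
		return out[:maxOutputLen]
	}
	return out
}

func main() {
	long := []byte(strings.Repeat("x", 5000))
	fmt.Println(len(limitOutput(long)), len(limitOutput([]byte("ok"))))
}
```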

When the health status of a container changes, a `health_status` event is
generated with the new status. The health status is also displayed in the
`docker ps` output.

Signed-off-by: Thomas Leonard <thomas.leonard@docker.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2016-06-02 23:58:34 +02:00
Alexander Morozov cb36dddad1 Merge pull request #23148 from mlaventure/wait-for-containerd-before-restarting-it
Wait for containerd to die before restarting it
2016-06-01 10:35:31 -07:00
Brian Goff bcf0c8ca28 Merge pull request #23142 from Microsoft/ExtraCleanup
Windows: Remove a double free on hcs container handle
2016-06-01 11:09:06 -04:00
Kenfe-Mickael Laventure ce160b37e1 Wait for containerd to die before restarting it
Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
2016-06-01 07:45:03 -07:00
Vincent Demeester c7aba69cc1 Merge pull request #22989 from Microsoft/StartCleanup
Windows: Adding missing cleanup call when container start fails
2016-06-01 15:42:47 +02:00
Daniel Nephin 8b5e5c6195 Set --state-dir on containerd.
Signed-off-by: Daniel Nephin <dnephin@docker.com>
2016-05-31 11:48:05 -07:00
Darren Stahl c8454394f7 Windows: Remove a double free on hcs container handle
Signed-off-by: Darren Stahl <darst@microsoft.com>
2016-05-31 10:25:38 -07:00
Darren Stahl 054992e291 Windows: Adding missing cleanup call when container start fails
Signed-off-by: Darren Stahl <darst@microsoft.com>
2016-05-31 10:19:05 -07:00
Darren Stahl 717209c9ff Fix a leaked process handle of the first container to start on Windows
Signed-off-by: Darren Stahl <darst@microsoft.com>
2016-05-25 21:33:50 -07:00
John Starks 6508c015fe Windows: Use image version, not OS version for TTY fixup
A previous change added a TTY fixup for stdin on older Windows versions to
work around a Windows issue with backspace/delete behavior. This change
used the OS version to determine whether to activate the behavior.
However, the Windows bug is actually in the image, not the OS, so it
should have used the image's OS version.

This ensures that a Server TP5 container running on Windows 10 will have
reasonable console behavior.

Signed-off-by: John Starks <jostarks@microsoft.com>
2016-05-25 12:22:52 -07:00
John Howard c7ee503082 Merge pull request #22958 from Microsoft/hcs_rpc
Windows: Use the new HCS RPC API
2016-05-25 09:25:22 -07:00
Darren Stahl 959c1a52bf Change Docker to use the new HCS RPC API
Signed-off-by: Darren Stahl <darst@microsoft.com>
2016-05-24 16:36:51 -07:00
Vincent Demeester 86a7632d63 Merge pull request #22091 from amitkris/build_solaris
Get the Docker Engine to build clean on Solaris
2016-05-24 21:41:36 +02:00
Alexander Morozov d7dfe9103b Merge pull request #22541 from crosbymichael/graph-restore
Implement graph driver restore on reboot
2016-05-23 22:57:23 -07:00
Amit Krishnan 86d8758e2b Get the Docker Engine to build clean on Solaris
Signed-off-by: Amit Krishnan <krish.amit@gmail.com>
2016-05-23 16:37:12 -07:00
Michael Crosby 31e903b0e1 Remove restart test
This test is not applicable anymore now that containers are not stopped
when the daemon is restored.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2016-05-23 15:57:23 -07:00
Michael Crosby 009ee16bef Restore ref count
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2016-05-23 15:57:23 -07:00
John Starks f124829c9b Windows: Work around Windows BS/DEL behavior
In Windows containers in TP5, DEL is interpreted as the delete key, but
Linux generally interprets it as backspace. This prevents backspace from
working when using a Linux terminal or the native console terminal
emulation in Windows.

To work around this, translate DEL to BS in Windows containers stdin when
TTY is enabled. Do this only for builds that do not have the fix in
Windows itself.

Signed-off-by: John Starks <jostarks@microsoft.com>
2016-05-20 19:04:20 -07:00
Stefan J. Wernli a5b64f2847 Fixing Windows update logic.
Removing the call to Shutdown from within Signal in order to rely on waitExit handling the exit of the process.

Signed-off-by: Stefan J. Wernli <swernli@microsoft.com>
2016-05-12 17:45:53 -07:00
Alexander Morozov e811e9784f Merge pull request #22544 from Microsoft/jjh/terminate
Windows: Terminate on failed shutdown, fixes dockerd deadlock
2016-05-12 14:46:56 -07:00
John Howard feacb1205b Windows: Terminate on failed shutdown
Signed-off-by: John Howard <jhoward@microsoft.com>
2016-05-10 10:09:50 -07:00
Michael Crosby 6889c3276c Fix containerd proto for connection
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2016-05-09 15:17:10 -07:00
Sebastiaan van Stijn 9a9ebc7f85 Merge pull request #22046 from cpuguy83/containerd_stdio
Set containerd pdeathsig
2016-05-06 09:26:16 +02:00
Brian Goff d4559313d5 Set Pdeathsig for containerd on SIGKILL
Makes sure containerd exits (when started by docker) if docker gets
SIGKILL'd.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2016-05-02 14:23:38 -04:00
Doug Davis 1e44bba4af Remove extra \n on INFO log msg
Signed-off-by: Doug Davis <dug@us.ibm.com>
2016-04-27 05:19:40 -07:00
Stefan J. Wernli da92dad59f Adding servicing update to postRunProcessing for Windows containers.
This change enables the workflow of completing the installation of Windows OS updates in the container after it has finished running, via a special servicing container.

Signed-off-by: Stefan J. Wernli <swernli@microsoft.com>
2016-04-25 12:16:26 -07:00
Vincent Demeester 17d5c97c90 Merge pull request #22125 from crosbymichael/restart-timeout
Reset restart timeout if execution longer than 10s
2016-04-25 19:15:32 +02:00