In cases there are failures in task start, swarmkit might be trying to
restart the task again in the same node which might keep failing. This
creates a race where when a failed task is getting removed it might
remove the associated network while another task for the same service
or a different service but connected to the same network is proceeding
with starting the container knowing that the network is still
present. Fix this by reacting to `ErrNoSuchNetwork` error during
container start by trying to recreate the managed networks. If they
have been removed it will be recreated. If they are already present
nothing bad will happen.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
(cherry picked from commit 117cef5e9766d6ba228770c225e816c6afd16ff8)
Signed-off-by: Tibor Vass <tibor@docker.com>
Context cancellations were previously causing `Prepare` to fail
completely on re-entrant calls. To prevent this, we filtered out cancels
and deadline errors. While this allowed the service to proceed without
errors, it had the possibility of interrupting long pulls, causing the
pull to happen twice.
This PR forks the context of the pull to match the lifetime of
`Controller`, ensuring that for each task, the pull is only performed
once. It also ensures that multiple calls to `Prepare` are re-entrant,
ensuring that the pull resumes from its original position.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
(cherry picked from commit d8d71ad5b94d44a2778f2d8989424259cac94e9b)
Signed-off-by: Tibor Vass <tibor@docker.com>
Ensure that cancellation of a pull propagates rather than continuing to
container creation. This ensures that the `Prepare` method is properly
re-entrant.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
(cherry picked from commit d99c6b837ffd18ffe5bce801feb4936bf0edd2aa)
Signed-off-by: Tibor Vass <tibor@docker.com>
This warning appears in the course of normal use of swarm mode. Since
it's meant more as an internal TODO than something which should be
exposed to a user, remove the log message.
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
(cherry picked from commit 39c93cfb47783f2531ba44f062fd9a8b351d2224)
As described in our ROADMAP.md, introduce new Swarm management API
endpoints relying on swarmkit to deploy services. It currently vendors
docker/engine-api changes.
This PR is fully backward compatible (joining a Swarm is an optional
feature of the Engine, and existing commands are not impacted).
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
Signed-off-by: Victor Vieux <vieux@docker.com>
Signed-off-by: Daniel Nephin <dnephin@docker.com>
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Signed-off-by: Madhu Venugopal <madhu@docker.com>