Commit Graph

90 Commits

Author SHA1 Message Date
Alexandre Beslic ac6341066d Fix Consul Leader Election Failure on multi-server
This fixes an error with Consul which fails to see a new
swarm leader after a consul leader soft restart because
of session renewal and missing status update.

Signed-off-by: Alexandre Beslic <alexandre.beslic@gmail.com>
2016-04-07 12:23:47 -07:00
Victor Vieux 2906a670a3 Merge pull request #2060 from allencloud/fix-typos
fix typos
2016-03-31 15:50:05 -07:00
Sun Hongliang 07a47d54a4 fix typos
Signed-off-by: Sun Hongliang <allen.sun@daocloud.io>
2016-03-31 13:52:01 +08:00
Sun Hongliang e60b93aef3 make port 0 invalid in checkAddrFormat
Signed-off-by: Sun Hongliang <allen.sun@daocloud.io>
2016-03-30 19:47:42 +08:00
Victor Vieux ffba4054dc enable rescheduling watchdog only when primary
Signed-off-by: Victor Vieux <vieux@docker.com>
2016-03-24 06:34:05 -07:00
Sun Hongliang e2dbd70357 remove parameter which is not used in createDiscovery
Signed-off-by: Sun Hongliang <allen.sun@daocloud.io>
2016-03-19 23:12:09 +08:00
Xian Chaobo a91f1818c4 Merge pull request #1963 from allencloud/validate-duration-flag-positive
validate duration flags: --delay, --timeout, --replication-ttl
2016-03-18 08:55:24 +08:00
Sun Hongliang 5aa339ed38 validate duration flags: --delay, --timeout, --replication-ttl
Signed-off-by: Sun Hongliang <allen.sun@daocloud.io>
2016-03-16 10:09:57 +08:00
Nishant Totla 81b6fded58 Merge pull request #1971 from vieux/rescheduling_out
move rescheduling out of experimental
2016-03-15 12:00:42 -07:00
Victor Vieux b771a7cb14 Merge pull request #1973 from dongluochen/MaxThreadCount
Increase max thread count to 50k
2016-03-15 11:19:37 -07:00
Dong Chen 408f3cff8e Increase max thread count to 50k
Signed-off-by: Dong Chen <dongluo.chen@docker.com>
2016-03-14 18:36:48 -07:00
Victor Vieux 082f4b65af move rescheduling out of experimental
Signed-off-by: Victor Vieux <vieux@docker.com>
2016-03-14 17:18:36 -07:00
Sun Hongliang b9d0d8927f force to validate min and max refresh interval to be positive
Signed-off-by: Sun Hongliang <allen.sun@daocloud.io>
2016-03-15 00:29:08 +08:00
Dong Chen bf5c744a49 Check port range. Add more unit tests for address format.
Signed-off-by: Dong Chen <dongluo.chen@docker.com>
2016-03-02 11:01:48 -08:00
Dong Chen c73f17fb6e Support IPv6 address format.
Signed-off-by: Dong Chen <dongluo.chen@docker.com>
2016-03-02 10:04:03 -08:00
Alexandre Beslic 3f7a035855 extract leadership package in its own repository and update Godeps
Signed-off-by: Alexandre Beslic <alexandre.beslic@gmail.com>
2016-02-08 16:55:32 -08:00
Nishant Totla 3d7678389f Removing backspaces in /info output for new API version
Signed-off-by: Nishant Totla <nishanttotla@gmail.com>
2016-01-31 23:54:33 -08:00
Victor Vieux f3a1027bbe update docker info
add test

Signed-off-by: Victor Vieux <vieux@docker.com>
2016-01-27 01:07:52 -08:00
Victor Vieux 04fb48d27a support 1.10 events
Signed-off-by: Victor Vieux <vieux@docker.com>
2016-01-15 19:35:04 -08:00
Isabel Jimenez b297c1bd41 Enabling checkpoint failover in FrameworkInfo
Signed-off-by: Isabel Jimenez <contact@isabeljimenez.com>
2016-01-14 04:05:00 -05:00
Dong Chen 8cc9b6c284 Add swarm container create retry option.
Signed-off-by: Dong Chen <dongluo.chen@docker.com>
2016-01-13 16:06:38 -08:00
Victor Vieux 14bf4e08b3 add -experimental to enable rescheduling
Signed-off-by: Victor Vieux <vieux@docker.com>
2016-01-12 01:35:39 -08:00
Victor Vieux fedf7aa4cb use "docker/swarm/nodes"
Signed-off-by: Victor Vieux <vieux@docker.com>
2016-01-12 00:38:10 -08:00
Victor Vieux fc1e7bbca2 use docker/docker/pkg/discovery
Signed-off-by: Victor Vieux <vieux@docker.com>
2016-01-12 00:38:06 -08:00
Andrea Luzzardi 13f60212f5 Add support for container rescheduling on node failure.
Add rescheduling integration tests.

Signed-off-by: Andrea Luzzardi <aluzzardi@gmail.com>
2016-01-11 15:59:44 -08:00
Isabel Jimenez 5a529d4c4a Adding help for new flag offer_refuse_seconds and renaming
Signed-off-by: Isabel Jimenez <contact@isabeljimenez.com>
2016-01-06 15:50:30 -05:00
Alexandre Beslic 40f26856a5 Merge pull request #1410 from dongluochen/joinSpike
Add a random delay to avoid synchronized registration at swarm join. Simple fix for #1353
2016-01-04 12:28:17 -08:00
Victor Vieux 53cf899e31 Merge pull request #1517 from dongluochen/EngineFastFailure
Use failureCount as a secondary health indicator.
2015-12-17 16:35:29 -08:00
Alexandre Beslic 5e8998eb6d Fix Consul Lock TTL with store failure
If using the Lock TTL feature with Consul, the code
path in libkv is issuing a Put in the background through
the PeriodicRenewal call. The error is then eaten up and
ignored on the candidate loop. This would lead to the
candidate and followers being stuck in their candidate
loop. Consequence would be that they would not retry to
take the lock ending in a state with no Leader.

This patch restores an explicit error check instead of
wrongfully passing on the error to the channel before
giving it back to the caller.

Signed-off-by: Alexandre Beslic <abronan@docker.com>
2015-12-16 15:46:11 -08:00
Dong Chen d80a32b3df Explicitly deprecate --engine-refresh-retry.
Signed-off-by: Dong Chen <dongluo.chen@docker.com>
2015-12-15 19:13:03 -08:00
Dong Chen ec3b00c484 Reorganize engine failure detection procedure. Change engine option 'RefreshRetry' to 'FailureRetry'.
Signed-off-by: Dong Chen <dongluo.chen@docker.com>
2015-12-15 19:13:03 -08:00
Dong Chen 2c029f9795 Change '--joindelay' to '--delay' since it's a join option.
Signed-off-by: Dong Chen <dongluo.chen@docker.com>
2015-12-14 17:00:34 -08:00
Dong Chen db5c8aba7c Add a command line option for swarm join delay.
Signed-off-by: Dong Chen <dongluo.chen@docker.com>
2015-12-14 17:00:34 -08:00
Dong Chen 36ca8ff63f Add a random delay to avoid synchronized registration at swarm join.
Signed-off-by: Dong Chen <dongluo.chen@docker.com>
2015-12-14 17:00:34 -08:00
Victor Vieux 4aafe4aa7b Merge pull request #1492 from aluzzardi/profiling
Enable profiling over HTTP in debug mode
2015-12-08 12:43:00 -08:00
Alexandre Beslic f21efa4337 Increase default TTL and heartbeat value
Increases the default ttl and heartbeat value for discovery.
Because the node will still be listed for a long period on
`docker info`, there is now a Status to know if a node is
in the healthy or unhealthy state.

Signed-off-by: Alexandre Beslic <abronan@docker.com>
2015-12-04 17:11:33 -08:00
Andrea Luzzardi f1155ca431 Enable profiling over HTTP in debug mode
Signed-off-by: Andrea Luzzardi <aluzzardi@gmail.com>
2015-12-03 03:01:05 -08:00
Victor Vieux de6383c4dd Merge pull request #1448 from jimenez/timeout_default
Changing offers timeout default to prevent other frameworks starvation
2015-11-30 14:35:09 -08:00
Isabel Jimenez 484edd33cd Changing offers timeout default to prevent other frameworks starvation
Signed-off-by: Isabel Jimenez <contact@isabeljimenez.com>
2015-11-25 04:01:30 -05:00
Dong Chen 51d92d4b69 fix time duration in EngineOpts
Signed-off-by: Dong Chen <dongluo.chen@docker.com>
2015-11-02 16:13:50 -08:00
Dong Chen 68fbfe0cac change refresh retry count to IntFlag
Signed-off-by: Dong Chen <dongluo.chen@docker.com>
2015-11-02 14:42:30 -08:00
Dong Chen c9f3471dba add engine options for refresh interval
Signed-off-by: Dong Chen <dongluo.chen@docker.com>
2015-10-28 12:56:48 -07:00
Daniel Hiltgen 3661b6e63b Add TLS support for libkv
This adds TLS support into the KV store for swarm.  The manage, join,
and list commands all have a new CLI argument, matching the docker engine
discovery backend.  This required adding the tlsconfig utility
package from docker engine.

Here's an example showing re-use of the cluster certs for the KV store:

    swarm manage --tlsverify \
        --tlscacert /etc/docker/ssl/ca.pem \
        --tlscert /etc/docker/ssl/cert.pem \
        --tlskey /etc/docker/ssl/key.pem \
        --discovery-opt kv.cacertfile=/etc/docker/ssl/ca.pem \
        --discovery-opt kv.certfile=/etc/docker/ssl/cert.pem \
        --discovery-opt kv.keyfile=/etc/docker/ssl/key.pem \
        --advertise 192.168.122.47:3376 \
        etcd://192.168.122.47:2379

Signed-off-by: Daniel Hiltgen <daniel.hiltgen@docker.com>
2015-10-12 13:33:08 -07:00
Alexandre Beslic c74cf900ef Replace --leaderTTL flag by --replication-ttl
Fixes #1256

Signed-off-by: Alexandre Beslic <abronan@docker.com>
2015-10-02 08:42:14 -07:00
Alexandre Beslic ab8d1b489c add support for specifying the leader election lock ttl
Signed-off-by: Alexandre Beslic <abronan@docker.com>
2015-09-23 04:06:35 -07:00
Morgan Bauer 5c4b0a1765 remove deprecated unused flag
Signed-off-by: Morgan Bauer <mbauer@us.ibm.com>
2015-09-09 16:22:53 -07:00
Alexandre Beslic 6c1c83f7a3 Cleanup state folder with local file persistence (not used anymore)
Signed-off-by: Alexandre Beslic <abronan@docker.com>
2015-08-30 17:15:52 -07:00
Victor Vieux 28bc55ed6b improve usage for discovery
Signed-off-by: Victor Vieux <vieux@docker.com>
2015-08-01 16:53:07 -07:00
Alexandre Beslic c7513506be Fault tolerant Leader Election process, fixes leader information on docker info, fixes intermittent error on Consul session lock
Signed-off-by: Alexandre Beslic <abronan@docker.com>
2015-07-31 10:06:47 -07:00
Andrea Luzzardi f38c034499 Leader Election: Use same path prefix as discovery.
Fixes #1037

Signed-off-by: Andrea Luzzardi <aluzzardi@gmail.com>
2015-07-09 01:16:47 -07:00