Commit Graph

400 Commits

Author SHA1 Message Date
Isabel Jimenez a99ceeb9c1 Adding suicide logic for tasks so as to prevent false timeout for tasks having a long image pull
Signed-off-by: Isabel Jimenez <contact@isabeljimenez.com>
2016-01-14 13:37:23 -05:00
Victor Vieux 08839f62fa Merge pull request #1636 from jimenez/checkpoint_failover
Enabling checkpoint failover in FrameworkInfo
2016-01-14 10:18:09 -08:00
Victor Vieux d3e4ddb0f7 Merge pull request #1635 from jimenez/task_timeout_restructure
Removing Queue package and regrouping task logic
2016-01-14 10:18:03 -08:00
Isabel Jimenez b297c1bd41 Enabling checkpoint failover in FrameworkInfo
Signed-off-by: Isabel Jimenez <contact@isabeljimenez.com>
2016-01-14 04:05:00 -05:00
Isabel Jimenez fe8da8fe80 Removing Queue package and regrouping task logic
Signed-off-by: Isabel Jimenez <contact@isabeljimenez.com>
2016-01-14 03:52:11 -05:00
Dong Chen 8cc9b6c284 Add swarm container create retry option.
Signed-off-by: Dong Chen <dongluo.chen@docker.com>
2016-01-13 16:06:38 -08:00
Alexandre Beslic d21748699d Merge pull request #1565 from jimmyxian/fresh-image-when-commit
fresh image when receive commit event
2016-01-13 12:01:09 -08:00
Victor Vieux 985974854c Merge pull request #1630 from jimenez/driver_join
Adding observe async for driver abort/errors
2016-01-13 11:47:10 -08:00
Isabel Jimenez 9bfc28c291 Adding observe async for driver abort/errors
Signed-off-by: Isabel Jimenez <contact@isabeljimenez.com>
2016-01-13 04:46:45 -05:00
Alexandre Beslic 254e095f77 Merge pull request #1601 from vieux/docker_discovery
use docker/docker/pkg/discovery and update godeps
2016-01-12 17:06:51 -08:00
Victor Vieux 18b6435839 Merge pull request #1621 from jimenez/scheduler_driver
Restructuring mesos scheduler driver outside of Cluster
2016-01-12 17:02:35 -08:00
Isabel Jimenez 443d49167a Restructuring mesos scheduler driver outside of Cluster
Signed-off-by: Isabel Jimenez <contact@isabeljimenez.com>
2016-01-12 19:24:50 -05:00
Alexandre Beslic e1213384bc Merge pull request #1578 from aluzzardi/rescheduling
[experimental] Simple container rescheduling on node failure
2016-01-12 15:00:27 -08:00
Victor Vieux 14bf4e08b3 add -experimental to enable rescheduling
Signed-off-by: Victor Vieux <vieux@docker.com>
2016-01-12 01:35:39 -08:00
Victor Vieux 31ad0e047f update godeps
Signed-off-by: Victor Vieux <vieux@docker.com>
2016-01-12 00:38:09 -08:00
Victor Vieux fc1e7bbca2 use docker/docker/pkg/discovery
Signed-off-by: Victor Vieux <vieux@docker.com>
2016-01-12 00:38:06 -08:00
Victor Vieux a2018c177c improve eventHandlers locking
Signed-off-by: Victor Vieux <vieux@docker.com>
2016-01-11 17:23:48 -08:00
Dong Chen 8f384b1d40 Address review comments.
Signed-off-by: Dong Chen <dongluo.chen@docker.com>
2016-01-11 16:08:51 -08:00
Victor Vieux 78008f4d4a add doc
fix tests and keep swarm id
remove duplicate on node reconnect
explicit failure

Signed-off-by: Victor Vieux <vieux@docker.com>
2016-01-11 15:59:44 -08:00
Andrea Luzzardi 13f60212f5 Add support for container rescheduling on node failure.
Add rescheduling integration tests.

Signed-off-by: Andrea Luzzardi <aluzzardi@gmail.com>
2016-01-11 15:59:44 -08:00
Andrea Luzzardi 56941d02a8 cluster: Support multiple event handlers.
Signed-off-by: Andrea Luzzardi <aluzzardi@gmail.com>
2016-01-11 15:59:44 -08:00
Dong Chen cf664141b6 Scheduler prefers nodes without connection failures.
Signed-off-by: Dong Chen <dongluo.chen@docker.com>
2016-01-11 11:42:58 -08:00
Xian Chaobo 1fef59f738 fresh image when receive commit event
Signed-off-by: Xian Chaobo <xianchaobo@huawei.com>
2016-01-08 17:25:30 +08:00
Alexandre Beslic 8b173fd382 Merge pull request #1569 from dongluochen/nodeManagement
Improve node management.
2016-01-07 16:14:36 -08:00
Dong Chen 7e266f18ed Name constants.
Signed-off-by: Dong Chen <dongluo.chen@docker.com>
2016-01-07 15:55:12 -08:00
Xian Chaobo 3aa302d706 Merge pull request #1587 from vieux/do_not_save_image_aff
do not save image affinity on reschedule
2016-01-07 09:42:16 +08:00
Dongluo Chen b4a6ad2e56 Merge pull request #1585 from jimenez/klaus-jimenez-offer-refuse
Klaus jimenez offer refuse
2016-01-06 13:20:02 -08:00
Isabel Jimenez 5a529d4c4a Adding help for new flag offer_refuse_seconds and renaming
Signed-off-by: Isabel Jimenez <contact@isabeljimenez.com>
2016-01-06 15:50:30 -05:00
Dong Chen 58a0e1719d Update failureCount scenario and test cases.
Signed-off-by: Dong Chen <dongluo.chen@docker.com>
2016-01-06 10:33:51 -08:00
Dong Chen 9a1584d508 Update integration test. Reduce pending node validation sleep interval. Each pending node has its own validation interval according to failure count, so reducing the sleep interval does not increase validation frequency for unreachable nodes.
Signed-off-by: Dong Chen <dongluo.chen@docker.com>
2016-01-05 15:56:55 -08:00
Dong Chen 52a7616d99 Add integration test for state machine.
Signed-off-by: Dong Chen <dongluo.chen@docker.com>
2016-01-05 14:59:30 -08:00
Victor Vieux 2449a352ef add unit test
Signed-off-by: Victor Vieux <vieux@docker.com>
2016-01-05 10:31:47 -08:00
Victor Vieux 5daaecdaa1 do not save image affinity on reschedule
Signed-off-by: Victor Vieux <vieux@docker.com>
2016-01-05 10:29:45 -08:00
Klaus Ma cf78e799fd address review comments
Signed-off-by: Klaus Ma <klaus.ma@outlook.com>
2016-01-05 13:18:51 -05:00
Klaus Ma b68537cc20 correct code style & build error
Signed-off-by: Klaus Ma <klaus.ma@outlook.com>
2016-01-05 12:47:42 -05:00
Klaus Ma a23ce43337 Add MESOS_OFFER_REFUSE_SECONDS environment configuration
Signed-off-by: Klaus Ma <klaus.ma@outlook.com>
2016-01-05 12:47:42 -05:00
Victor Vieux 97f3767618 fix soft affinity reschedule
Signed-off-by: Victor Vieux <vieux@docker.com>
2016-01-05 04:58:36 -08:00
Dong Chen 995866d76c Improve node management.
1. Introduce pending state. Pending nodes need validation before moving to healthy state. Resolves the duplicate ID and dead node drop issues.
2. Expose error and last update time in docker info.
3. Use connect success/failure to drive state transition between healthy and unhealthy.

Signed-off-by: Dong Chen <dongluo.chen@docker.com>
2015-12-30 13:25:43 -08:00
Victor Vieux a2380a6c71 update godeps
Signed-off-by: Victor Vieux <vieux@docker.com>
2015-12-22 00:20:04 -08:00
Victor Vieux be0fce961f update code
Signed-off-by: Victor Vieux <vieux@docker.com>
2015-12-22 00:20:04 -08:00
Isabel Jimenez de0e67f571 Merge pull request #1554 from ezrasilvera/mesosFixLock
Change the scheduler lock in Mesos cluster
2015-12-18 10:20:29 -08:00
Ezra Silvera 219f7192d6 Change the scheduler lock in Mesos cluster
Signed-off-by: Ezra Silvera <ezra@il.ibm.com>
2015-12-17 18:20:57 +02:00
Dong Chen 02553d0727 Cover connection failure error reported by dockerclient and by proxy cases.
Signed-off-by: Dong Chen <dongluo.chen@docker.com>
2015-12-15 19:20:29 -08:00
Dong Chen 9bc6c35321 Use engine connection error to fail engine fast.
Signed-off-by: Dong Chen <dongluo.chen@docker.com>
2015-12-15 19:13:03 -08:00
Dong Chen ec3b00c484 Reorganize engine failure detection procedure. Change engine option 'RefreshRetry' to 'FailureRetry'.
Signed-off-by: Dong Chen <dongluo.chen@docker.com>
2015-12-15 19:13:03 -08:00
Dong Chen 4d24256c19 Use failureCount as a secondary health indicator.
Signed-off-by: Dong Chen <dongluo.chen@docker.com>
2015-12-15 19:13:03 -08:00
Victor Vieux cdd42a5c6b display all the containers that are part of a global network on inspect
update godeps

Signed-off-by: Victor Vieux <victorvieux@gmail.com>
2015-12-15 17:48:35 -08:00
Victor Vieux ed987b8d85 Merge pull request #1542 from jimenez/slave_to_agent
Name changing slave to agent
2015-12-14 13:57:31 -08:00
Isabel Jimenez 18cccc521c renaming files + change on tests
Signed-off-by: Isabel Jimenez <contact@isabeljimenez.com>
2015-12-14 16:20:38 -05:00
Victor Vieux 81bf5bc067 Merge pull request #1538 from vitan/patch-1
Typo
2015-12-14 13:18:26 -08:00