Marathon treats healthy tasks as unhealthy and kills them

I have deployed some services in Docker (via Mesos) using the Marathon framework, and sometimes Marathon kills running tasks.

The services use HTTP health checks (intervalSeconds = 30, maxConsecutiveFailures = 3, timeoutSeconds = 20).

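It happens at random. Sometimes I can watch a task turn red in the Marathon UI even though the same HTTP check works fine in the browser (so the service is healthy). Marathon then kills and restarts the service, which hurts overall system performance.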
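To compare what the browser sees with what the health check might see, here is a minimal sketch of how I understand the settings above: a check fails if it does not return a passing HTTP status within timeoutSeconds, and the task is killed after maxConsecutiveFailures failures in a row. The function names `probe` and `would_be_killed` are my own (hypothetical, not part of Marathon), and treating 200-399 as passing is my assumption about what Marathon accepts.

```python
import time
import urllib.error
import urllib.request

# Values taken from the health check settings in this question.
TIMEOUT_SECONDS = 20
MAX_CONSECUTIVE_FAILURES = 3


def probe(url, timeout=TIMEOUT_SECONDS):
    """Run one HTTP GET and return (passed, elapsed_seconds).

    Assumption: a status code in 200-399 counts as passing.
    """
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            passed = 200 <= resp.status < 400
    except (urllib.error.URLError, OSError):
        # Covers HTTP errors, refused connections, and timeouts.
        passed = False
    return passed, time.monotonic() - start


def would_be_killed(results, max_failures=MAX_CONSECUTIVE_FAILURES):
    """Return True if `results` contains `max_failures` failed checks in a row."""
    streak = 0
    for passed in results:
        streak = 0 if passed else streak + 1
        if streak >= max_failures:
            return True
    return False
```

Calling `probe` every 30 seconds from the host where Marathon runs (rather than from a browser) and logging the elapsed time should show whether responses are occasionally slower than 20 seconds or failing from that host only.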

Any suggestions would be appreciated.

Mesos (v0.22.1), Marathon (v0.9.0)

Logs:

    I1223 12:23:45.058763 32718 slave.cpp:1581] Asked to kill task prod-tracker-backend-processor.63dbfa9b-a965-11e5-a046-e24e30c7374f of framework 20150527-135958-3712123914-5050-2238-0000
    I1223 12:23:45.189750 32720 slave.cpp:2531] Handling status update TASK_KILLED (UUID: 09e76bce-f24c-4999-8933-270baf023c62) for task prod-tracker-backend-processor.63dbfa9b-a965-11e5-a046-e24e30c7374f of framework 20150527-135958-3712123914-5050-2238-0000 from executor(1)@10.132.66.219:33503
    I1223 12:23:45.214113 32718 docker.cpp:1009] Updated 'cpu.shares' to 102 at /sys/fs/cgroup/cpu/docker/9e0bc3b40ad9b37c4a0f6133ca1316c2addd2e2c5a7941e56a4e1770d7afd3a2 for container 9dad82a5-34e1-4bf9-a641-17129464a226
    W1223 12:23:45.214740 32718 docker.cpp:1021] Container 9dad82a5-34e1-4bf9-a641-17129464a226 does not appear to be a member of a cgroup where the 'memory' subsystem is mounted
    I1223 12:23:45.216114 32724 status_update_manager.cpp:317] Received status update TASK_KILLED (UUID: 09e76bce-f24c-4999-8933-270baf023c62) for task prod-tracker-backend-processor.63dbfa9b-a965-11e5-a046-e24e30c7374f of framework 20150527-135958-3712123914-5050-2238-0000
    I1223 12:23:45.216359 32724 status_update_manager.hpp:346] Checkpointing UPDATE for status update TASK_KILLED (UUID: 09e76bce-f24c-4999-8933-270baf023c62) for task prod-tracker-backend-processor.63dbfa9b-a965-11e5-a046-e24e30c7374f of framework 20150527-135958-3712123914-5050-2238-0000
    I1223 12:23:45.221278 32720 slave.cpp:2776] Forwarding the update TASK_KILLED (UUID: 09e76bce-f24c-4999-8933-270baf023c62) for task prod-tracker-backend-processor.63dbfa9b-a965-11e5-a046-e24e30c7374f of framework 20150527-135958-3712123914-5050-2238-0000 to master@10.132.8.65:5050
    I1223 12:23:45.222024 32720 slave.cpp:2709] Sending acknowledgement for status update TASK_KILLED (UUID: 09e76bce-f24c-4999-8933-270baf023c62) for task prod-tracker-backend-processor.63dbfa9b-a965-11e5-a046-e24e30c7374f of framework 20150527-135958-3712123914-5050-2238-0000 to executor(1)@10.132.66.219:33503
    I1223 12:23:45.233886 32725 status_update_manager.cpp:389] Received status update acknowledgement (UUID: 09e76bce-f24c-4999-8933-270baf023c62) for task prod-tracker-backend-processor.63dbfa9b-a965-11e5-a046-e24e30c7374f of framework 20150527-135958-3712123914-5050-2238-0000