commit | ae00468363b0006ecf1ae90ed3833bbe54820df5 | |
---|---|---|
author | Serge Bazanski <serge@monogon.tech> | Tue Apr 18 13:28:48 2023 +0200 |
committer | Serge Bazanski <serge@monogon.tech> | Wed Apr 19 13:55:01 2023 +0000 |
tree | 3dff4cdf264bed17e66f7aed2c8085b67738104d | |
parent | 86a714d6e81bb524dc59fda7baa63b45e7180489 | |
cloud/shepherd/equinix: implement recoverer

This implements basic recovery functionality for 'stuck' agents. The shepherd will notice machines with an agent that either never sent a heartbeat, or stopped sending heartbeats, and will remove their agent started tags and reboot the machine. Then, the main agent start logic should kick in again.

More complex recovery flows can be implemented later; this will do for now.

Change-Id: I2c1b0d0465e4e302cdecce950a041581c2dc8548
Reviewed-on: https://review.monogon.dev/c/monogon/+/1560
Tested-by: Jenkins CI
Reviewed-by: Tim Windelschmidt <tim@monogon.tech>
This is the main repository containing the source code for the Monogon Platform.
This is pre-release software - take a look, and check back later!
Our build environment is self-contained and requires only minimal host dependencies:
/dev/kvm (if you want to run tests).
Our docs assume that Bazelisk is available as bazel on your PATH.
Refer to SETUP.md for detailed instructions.
Build CLI and node image:
bazel build //metropolis/cli/dbg //:launch -c dbg
Launch an ephemeral test node:
bazel test //:launch -c dbg --test_output=streamed
Run a kubectl command while the test is running:
bazel-bin/metropolis/cli/dbg/dbg_/dbg kubectl describe node
Run full test suite:
bazel test -c dbg //...