metropolis/node/curator: restart listener on status changes

This change ensures that a Curator's gRPC listener restarts not only
when it switches from leader to follower (and vice versa), but also when
the Curator instance remains a leader/follower while the leadership
status changes in an important way.

Restarting the listener is important when the Curator is a re-elected
leader, as otherwise the listener will continue running with the old
leadership data which includes the old leadership election key/revision,
subsequently failing any leadership transactions when serving gRPC
requests.

Similarly, we need to restart the follower listener when the leader
changes to some other node, otherwise we end up serving old leadership
data in GetCurrentLeader.
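
As a rough illustration of the restart condition described above, the
decision could be sketched as follows. This is not the actual Curator
code: the leadership struct, its fields and needsRestart are
hypothetical names chosen for this example.

```go
package main

import "fmt"

// leadership is a hypothetical, simplified stand-in for the election
// status a listener is started with. It is not the real Curator type.
type leadership struct {
	isLeader bool   // true if this node currently holds leadership
	lockKey  string // election key, only meaningful for a leader
	lockRev  int64  // election revision, only meaningful for a leader
	leaderID string // ID of the current leader, used by followers
}

// needsRestart reports whether a listener started with prev has to be
// restarted in order to serve next.
func needsRestart(prev, next *leadership) bool {
	if prev == nil || next == nil {
		// No status on one side: restart unless both are missing.
		return prev != next
	}
	if prev.isLeader != next.isLeader {
		// Leader <-> follower switch.
		return true
	}
	if next.isLeader {
		// Still leader, but possibly re-elected with a new key/revision.
		return prev.lockKey != next.lockKey || prev.lockRev != next.lockRev
	}
	// Still follower, but possibly following a different leader.
	return prev.leaderID != next.leaderID
}

func main() {
	old := &leadership{isLeader: true, lockKey: "election/1", lockRev: 10}
	reelected := &leadership{isLeader: true, lockKey: "election/2", lockRev: 42}
	fmt.Println("re-elected leader:", needsRestart(old, reelected))
	fmt.Println("unchanged leader:", needsRestart(old, old))
}
```

Comparing the whole status, rather than only the leader/follower
role, is what covers the re-election and leader-change cases.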

An alternative fix would be to move the leadership status to an Event
value that gets updated and passed through to the listeners. This would
avoid somewhat unnecessary listener restarts on leadership changes, but
this will do for now.
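
For reference, the alternative could look roughly like the sketch
below. It is only an assumption of the general shape: the watched type,
its Get/Set methods and the status struct are hypothetical and do not
reflect the API of any existing Event value implementation.

```go
package main

import (
	"fmt"
	"sync"
)

// status is a hypothetical, simplified leadership status.
type status struct {
	leaderID string
	lockRev  int64
}

// watched is a hypothetical minimal observable value: readers get the
// current status plus a channel that is closed on the next update.
type watched struct {
	mu      sync.Mutex
	cur     status
	changed chan struct{}
}

func newWatched(initial status) *watched {
	return &watched{cur: initial, changed: make(chan struct{})}
}

// Set stores a new status and wakes up everyone waiting for a change.
func (w *watched) Set(s status) {
	w.mu.Lock()
	defer w.mu.Unlock()
	w.cur = s
	close(w.changed)
	w.changed = make(chan struct{})
}

// Get returns the current status and a channel closed on the next Set.
func (w *watched) Get() (status, <-chan struct{}) {
	w.mu.Lock()
	defer w.mu.Unlock()
	return w.cur, w.changed
}

func main() {
	w := newWatched(status{leaderID: "node-a", lockRev: 1})

	// A long-running listener would read the status per request instead
	// of being restarted with a fresh copy.
	s, ch := w.Get()
	fmt.Println("serving with leader", s.leaderID)

	w.Set(status{leaderID: "node-b", lockRev: 2})
	<-ch // Already closed by Set, so this does not block.

	s, _ = w.Get()
	fmt.Println("serving with leader", s.leaderID)
}
```

With this shape the listener keeps running across leadership changes
and only the status it reads is swapped out.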

This should fix issue #276 and perhaps some related flakiness. I'll try
to write a test for this on top of this CR, but this will require a new
kind of Curator test which we don't have yet.

Change-Id: I20ef147adeb0c976b904ef55dbadb9b461111e94
Reviewed-on: https://review.monogon.dev/c/monogon/+/2722
Tested-by: Jenkins CI
Reviewed-by: Lorenz Brun <lorenz@monogon.tech>
2 files changed
tree: 3a020de589b3931d069a29deebddbb47e1f41d7f
  1. .github/
  2. build/
  3. cloud/
  4. go/
  5. intellij/
  6. metropolis/
  7. net/
  8. third_party/
  9. tools/
  10. version/
  11. .bazelignore
  12. .bazelproject
  13. .bazelrc
  14. .bazelrc.sandboxroot
  15. .bazelversion
  16. .git-ignore-revs
  17. .gitignore
  18. BUILD.bazel
  19. CODING_STANDARDS.md
  20. go.mod
  21. go.sum
  22. LICENSE
  23. MODULE.bazel
  24. MODULE.bazel.lock
  25. README.md
  26. SETUP.md
  27. shell.nix
  28. WORKSPACE
README.md

Monogon Monorepo

This is the main repository containing the source code for the Monogon Platform.

This is pre-release software - take a look, and check back later!

Environment

Our build environment is self-contained and requires only minimal host dependencies:

  • A Linux machine or VM.
  • Bazelisk >= v1.15.0 (or a working Nix environment).
  • A reasonably recent kernel with user namespaces enabled.
  • Working KVM with access to /dev/kvm (if you want to run tests).

Our docs assume that Bazelisk is available as bazel on your PATH.

Refer to SETUP.md for detailed instructions.

Monogon OS

Run a single node demo cluster

Build CLI and node image:

bazel build //metropolis/cli/dbg //:launch --config dbg

Launch an ephemeral test node:

bazel test //:launch --config dbg --test_output=streamed

Run a kubectl command while the test is running:

bazel-bin/metropolis/cli/dbg/dbg_/dbg kubectl describe node

Test suite

Run full test suite:

bazel test --config dbg //...