m/n/c/network: fix panic when DHCP lease expires

The statusCallback of the network service previously accessed
new.AssignedIP without checking if new is nil, which caused a panic when
the DHCP lease expired. When the DHCP service was subsequently restarted
and obtained a new lease, CoreDNS was left without upstream servers
configured. The reason for this is that just before the panic, CoreDNS
was configured with an empty list of upstreams, but the lease field of
the DHCP service was not updated. When the lease callback was called
again with the new lease, old and new lease had the same DNS servers, so
CoreDNS was not configured to use the upstreams.

To fix the panic, this adds a check for a nil lease before accessing
AssignedIP. I checked whether all consumers of the ExternalAddress Status
can handle nil, and added a nil check in the statuspush worker. The
apiserver stops when the lease is lost, and starts again once it is
reacquired; I'm not sure if this is the intended behavior.

The DNS problem occurred because the old lease passed to the callback was
not the last lease that the callback had seen, and it then mistakenly
suppressed the update. In general, a callback cannot rely on the old
lease being the last lease that the callback has been called with. For
example, when a callback earlier in the Compose chain returns an error,
later callbacks are not called, so a callback may not see all lease
changes. Because the old lease parameter cannot be trusted, I removed
it. Callbacks which need the previous lease should keep track of it
themselves.
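
A rough sketch of a callback that tracks the previous state itself is
shown below. The names Lease, DNSServers and dnsUpdater are assumptions
made for illustration and do not reflect the real DHCP or DNS service
code; the applied slice only stands in for reconfiguring CoreDNS.

```go
package main

import (
	"fmt"
	"slices"
)

// Lease is a hypothetical stand-in for the DHCP lease type.
type Lease struct {
	DNSServers []string
}

// dnsUpdater remembers the upstream list it last applied, rather than
// trusting an "old lease" argument supplied by the caller.
type dnsUpdater struct {
	last    []string   // upstreams most recently applied by this callback
	applied [][]string // history of applied configurations, for demonstration
}

// callback is invoked with the current lease, or nil if the lease was lost.
func (u *dnsUpdater) callback(lease *Lease) {
	var servers []string
	if lease != nil {
		servers = lease.DNSServers
	}
	// Compare against what this callback itself applied last.
	if slices.Equal(u.last, servers) {
		return
	}
	u.last = slices.Clone(servers)
	u.applied = append(u.applied, u.last)
}

func main() {
	u := &dnsUpdater{}
	u.callback(&Lease{DNSServers: []string{"192.0.2.53"}}) // initial lease
	u.callback(nil)                                        // lease expired
	u.callback(&Lease{DNSServers: []string{"192.0.2.53"}}) // new lease, same servers
	fmt.Println(len(u.applied))
}
```

Because the empty upstream list applied on expiry is recorded in last,
the new lease is seen as a change even though its DNS servers match the
lease held before expiry.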

For manually testing lease expiry, I modified
metropolis/test/nanoswitch/nanoswitch.go like this:

+		start := time.Now()
 		server, err := server4.NewServer(link.Attrs().Name, &laddr, func(conn net.PacketConn, peer net.Addr, m *dhcpv4.DHCPv4) {
 			if m == nil {
 				return
 			}
+			if start.Add(50*time.Second).Before(time.Now()) && start.Add(90*time.Second).After(time.Now()) {
+				supervisor.Logger(ctx).Infof("Dropping DHCP packet")
+				return
+			}
 			reply, err := dhcpv4.NewReplyFromRequest(m)

Change-Id: Ifa0c039769c37ee53033ce013eed4f1af6f02142
Reviewed-on: https://review.monogon.dev/c/monogon/+/3214
Tested-by: Jenkins CI
Reviewed-by: Lorenz Brun <lorenz@monogon.tech>
7 files changed
tree: 9fe27c2e435dab9ab0927b94ba9ee48d2e2e31b5
README.md

Monogon Monorepo

This is the main repository containing the source code for the Monogon Platform.

This is pre-release software - take a look, and check back later! In the meantime, join us on Matrix (#monogon-os-community:matrix.org) or Discord.

Environment

Our build environment is self-contained and requires only minimal host dependencies:

  • A Linux machine or VM.
  • Bazelisk >= v1.15.0 (or a working Nix environment).
  • A reasonably recent kernel with user namespaces enabled.
  • Working KVM with access to /dev/kvm (if you want to run tests).

Our docs assume that Bazelisk is available as bazel on your PATH.

Refer to SETUP.md for detailed instructions.

Monogon OS

The source code lives in //metropolis (Metropolis is the codename of Monogon OS).

See the //metropolis/README.md for a developer quick start guide, or see the Monogon OS Handbook for user documentation.