supervisor: never give up

This fixes T756, in which supervised processes would end up with a
negative backoff value. The cause appears to be that the backoff
library's ExponentialBackOff has a default MaxElapsedTime of 15
minutes, after which it stops producing intervals and instead returns
the sentinel 'Stop', a negative duration (-1).

Test Plan: There's no easy way to test this automatically. The library
returns Stop not after a number of calls, but after a given amount of
time has elapsed. We don't want a test that waits 15 minutes, and we
don't have an easy way to mock time either. I did test this manually,
and I can no longer observe the negative backoffs after 15 minutes.

Bug: T756

X-Origin-Diff: phab/D619
GitOrigin-RevId: 49d8617bcf2c8b36127cb43acde8afb7cc35c99f
diff --git a/core/internal/common/supervisor/supervisor_node.go b/core/internal/common/supervisor/supervisor_node.go
index 32f9720..e2af62c 100644
--- a/core/internal/common/supervisor/supervisor_node.go
+++ b/core/internal/common/supervisor/supervisor_node.go
@@ -148,11 +148,16 @@
 // newNode creates a new node with a given parent. It does not register it with the parent (as that depends on group
 // placement).
 func newNode(name string, runnable Runnable, sup *supervisor, parent *node) *node {
+	// We use exponential backoff for failed runnables, but at some point we cap at a given backoff time.
+	// To achieve this, we set MaxElapsedTime to 0, which will cap the backoff at MaxInterval.
+	bo := backoff.NewExponentialBackOff()
+	bo.MaxElapsedTime = 0
+
 	n := &node{
 		name:     name,
 		runnable: runnable,
 
-		bo: backoff.NewExponentialBackOff(),
+		bo: bo,
 
 		sup:    sup,
 		parent: parent,