build/ci: move Dockerfile, document new CI

This moves the Builder image Dockerfile into //build/ci, makes some
small changes to make it usable as a Jenkins agent base, documents its
usage, and adds a script which builds that image and pushes it to an
external container registry.

We also remove the old Phabricator-based CI scripting.

Change-Id: I332608f7d7105f675104db3ee2d787b2412fcbe9
Reviewed-on: https://review.monogon.dev/c/monogon/+/28
Reviewed-by: Leopold Schabel <leo@nexantic.com>
diff --git a/build/Dockerfile b/build/ci/Dockerfile
similarity index 67%
rename from build/Dockerfile
rename to build/ci/Dockerfile
index 8935d1f..e319026 100644
--- a/build/Dockerfile
+++ b/build/ci/Dockerfile
@@ -1,8 +1,3 @@
-#
-# The CI only rebuilds this Dockerfile if its hash changes.
-# Do not reference any external files, since modifications to them won't trigger a rebuild.
-#
-
 FROM fedora:32
 
 RUN dnf -y upgrade && \
@@ -33,7 +28,16 @@
 	grpc-cli \
 	nc \
 	python-unversioned-command \
-	openssl-devel
+	openssl-devel \
+	java-11-openjdk
+
+# Create CI build user. This is not used by scripts/bin/bazel, but instead only
+# used by CI infrastructure to run build agents as.
+# The newly created user will have a UID of 500, and a corresponding CI group
+# of GID 500 will be created as well. This UID:GID pair's numeric values are
+# relied on by the CI infrastructure and must not change without coordination.
+RUN set -e -x ;\
+	useradd -u 500 -U -m -d /home/ci ci
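+
+# For illustration only (a hypothetical invocation, not defined in this image):
+# the CI infrastructure is expected to start agent containers with this
+# identity, along the lines of:
+#   podman run --user 500:500 <agent image derived from this one> ...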
 
 # Install Bazel binary
 RUN curl -o /usr/local/bin/bazel \
diff --git a/build/ci/README.md b/build/ci/README.md
new file mode 100644
index 0000000..45dcd3f
--- /dev/null
+++ b/build/ci/README.md
@@ -0,0 +1,64 @@
+Monogon CI
+==========
+
+Monogon has a work-in-progress continuous integration / testing pipeline.
+For historical reasons, some parts of this pipeline are defined in a
+separate, non-public repository managed by Monogon Labs.
+
+In the long term, all of the infrastructure code related to this will become
+public and part of the Monogon repository. In the meantime, this document
+should serve as a public reference that explains how that part works and how it
+integrates with `//build/ci/...` and the project as a whole.
+
+Builder Image & Container
+-------------------------
+
+`//build/ci/Dockerfile` describes a 'builder image'. This image contains a
+stable, Fedora-based build environment in which all Monogon components should
+be built. It currently has two uses:
+
+1. The build scripts at
+   `//scripts/{create_container.sh,destroy_container.sh,bin/bazel}`. These are
+   used by developers to run Bazel in a controlled environment while developing
+   Monogon code. The `create_container.sh` script builds the Builder image and
+   starts a Builder container. The `bin/bazel` wrapper script launches Bazel
+   inside it, and the `destroy_container.sh` script cleans everything up (see
+   the example session after this list).
+
+2. The Jenkins-based CI uses the Builder image as a base to run Jenkins agents.
+   A Monogon Labs developer runs `scripts/push_ci_image.sh`, which builds the
+   Builder image and pushes it to a container registry. Then, in another
+   repository, that image is used as the base on which a Jenkins agent is
+   overlaid, and the result is used to run all Jenkins actions.
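+
+For example, a typical developer session with the scripts from point 1 might
+look roughly like this (an illustrative sketch; the exact Bazel targets and
+flags depend on what you are working on):
+
+```bash
+# Build the Builder image and start a long-lived Builder container.
+scripts/create_container.sh
+
+# Run Bazel inside that container against the checked-out source tree.
+scripts/bin/bazel test //...
+
+# Tear the Builder container down when done.
+scripts/destroy_container.sh
+```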
+
+As Monogon evolves and gains better build hermeticity through Bazel toolchains,
+the need for a Builder image should diminish. Meanwhile, using the same image
+across development and CI machines keeps builds as reproducible as possible and
+gives us a baseline of build hermeticity.
+
+CI usage
+--------
+
+When a change is opened on https://review.monogon.dev/, it needs to either be
+owned by a 'trusted user' or be vouched for by one. This is because our current
+CI setup is not designed to protect against malicious changes that might
+attempt to take over the CI system, or change the CI scripts themselves to skip
+tests.
+
+Currently, all Monogon Labs employees (thus, the core Monogon development team)
+are marked as 'trusted users'. There is no formal process for community
+contributors to become part of this group, but we are more than happy to
+formalize such a process when needed, or appoint active community contributors
+to this group. Ideally, though, the CI system should be rebuilt to allow any
+external contributor to run CI in a secure and sandboxed fashion.
+
+CI implementation
+-----------------
+
+The CI system currently consists of a Jenkins instance running at
+https://jenkins.monogon.dev/. It runs against open changes whose Allow-Run-CI
+label has been evaluated to 'ok' by Gerrit Prolog rules, and executes the
+`//build/ci/jenkins-presubmit.groovy` script on them.
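+
+For reference, the removed Phabricator-era `scripts/run_ci.sh` ran roughly the
+following steps inside the Builder container; the Jenkins presubmit presumably
+covers similar ground, but the authoritative definition is
+`//build/ci/jenkins-presubmit.groovy`:
+
+```bash
+# Regenerate dependency and BUILD files, then make sure the working tree is
+# still clean (i.e. the change already includes any regenerated files).
+bazel run //:fietsje
+bazel run //:gazelle -- update
+[[ -z "$(git status --porcelain)" ]]
+
+# Run all tests, in both the default and the debug configuration.
+bazel test //...
+bazel test //... -c dbg
+```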
+
+Currently, the Jenkins instance is not publicly accessible, and thus neither
+are its CI logs. This will be fixed very soon.
diff --git a/scripts/create_container.sh b/scripts/create_container.sh
index 2c60e5d..9ddaf7a 100755
--- a/scripts/create_container.sh
+++ b/scripts/create_container.sh
@@ -16,7 +16,7 @@
 fi
 
 # Rebuild base image
-podman build -t monogon-builder build
+podman build -t monogon-builder build/ci
 
 # Keep this in sync with ci.sh:
 
diff --git a/scripts/gazelle.sh b/scripts/gazelle.sh
deleted file mode 100755
index 0f37ae0..0000000
--- a/scripts/gazelle.sh
+++ /dev/null
@@ -1,5 +0,0 @@
-#!/usr/bin/env bash
-# gazelle.sh regenerates BUILD.bazel files for Go source files.
-set -euo pipefail
-
-bazel run //:gazelle -- update
diff --git a/scripts/push_ci_image.sh b/scripts/push_ci_image.sh
new file mode 100755
index 0000000..3176f56
--- /dev/null
+++ b/scripts/push_ci_image.sh
@@ -0,0 +1,43 @@
+#!/usr/bin/env bash
+
+# This script can be run by Monogon Labs employees to push the Builder image
+# built from //build/ci/Dockerfile to a public container registry. That image
+# is then consumed by external, non-public infrastructure code as the basis
+# for running Jenkins CI agents.
+#
+# For more information, see //build/ci/README.md.
+
+set -euo pipefail
+
+main() {
+    if [[ "$HOME" == /user ]] && [[ -d /user ]] && [[ -d /home/ci ]]; then
+        echo "WARNING: likely running within Builder image instead of the host environment." >&2
+        echo "If this script was invoked using 'bazel run', please instead do:" >&2
+        echo "    \$ scripts/bin/bazel build //build/ci:push_ci_image" >&2
+        echo "    \$ bazel-bin/build/ci/push_ci_image" >&2
+        echo "This will build the script within the container but run it on the host." >&2
+    fi
+
+    local podman="$(command -v podman || true)"
+    if [[ -z "$podman" ]]; then
+        echo "'podman' must be available in the system PATH to build the image." >&2
+        exit 1
+    fi
+
+    local dockerfile="build/ci/Dockerfile"
+    if [[ ! -f "${dockerfile}" ]]; then
+        echo "Could not find ${dockerfile} - this script needs to be run from the root of the Monogon repository." >&2
+        exit 1
+    fi
+
+    local image="gcr.io/monogon-infra/monogon-builder:$(date +%s)"
+
+    echo "Building image ${image} from ${dockerfile}..."
+    "${podman}" build -t "${image}" - < "${dockerfile}"
+    echo "Pushing image ${image}..."
+    "${podman}" push "${image}"
+    echo "Done, new image is ${image}"
+}
+
+main "$@"
+
diff --git a/scripts/run_ci.sh b/scripts/run_ci.sh
deleted file mode 100755
index a8612f1..0000000
--- a/scripts/run_ci.sh
+++ /dev/null
@@ -1,110 +0,0 @@
-#!/usr/bin/env bash
-# This helper scripts executes all Bazel tests in our CI environment.
-# https://phab.monogon.dev/harbormaster/plan/2/
-set -euo pipefail
-
-DOCKERFILE_HASH=$(sha1sum build/Dockerfile | cut -c -8)
-
-BUILD_ID=$1;
-BUILD_PHID=$2;
-shift; shift;
-
-TAG=monogon-version-${DOCKERFILE_HASH}
-CONTAINER=monogon-build-${BUILD_ID}
-
-# We keep one set of Bazel build caches per working copy to avoid concurrency
-# issues (we cannot run multiple Bazel servers on a given _bazel_root).
-function getWorkingCopyID {
-  local pattern='/var/drydock/workingcopy-([0-9]+)/'
-  [[ "$(pwd)" =~ $pattern ]]
-  echo ${BASH_REMATCH[1]}
-}
-
-# Main Bazel cache, used as Bazel outputRoot/outputBase.
-CACHE_VOLUME=bazel-cache-$(getWorkingCopyID)
-# Secondary disk cache for Bazel, used to keep build data between configuration
-# switches (saving from spurious rebuilds when switchint from debug to
-# non-debug builds).
-SECONDARY_CACHE_VOLUME=bazel-secondary-cache-$(getWorkingCopyID)
-SECONDARY_CACHE_LOCATION="/user/.cache/bazel-secondary"
-# TODO(q3k): Neither the main nor secondary caches are garbage collected and
-# they will slowly fill up the disk of the CI builder.
-
-# The Go pkg cache is safe to use concurrently.
-GOPKG_VOLUME=gopkg-cache
-
-cat > ci.bazelrc <<EOF
-build --disk_cache=${SECONDARY_CACHE_LOCATION}
-EOF
-
-# We do our own image caching since the podman build step cache does
-# not work across different repository checkouts and is also easily
-# invalidated by multiple in-flight revisions with different Dockerfiles.
-if ! podman image inspect "$TAG" >/dev/null; then
-  echo "Could not find $TAG, building..."
-  podman build -t ${TAG} build
-fi
-
-function cleanup {
-  rc=$?
-  ! podman kill $CONTAINER
-  ! podman rm $CONTAINER --force
-  exit $rc
-}
-
-trap cleanup EXIT
-
-! podman kill $CONTAINER
-! podman rm $CONTAINER --force
-
-! podman volume create --opt o=nodev,exec ${CACHE_VOLUME}
-! podman volume create --opt o=nodev ${SECONDARY_CACHE_VOLUME}
-! podman volume create --opt o=nodev ${GOPKG_VOLUME}
-
-function bazel() {
-    podman run \
-        --rm \
-        --name $CONTAINER \
-        -v $(pwd):/work \
-        -v ${CACHE_VOLUME}:/user/.cache/bazel/_bazel_root \
-        -v ${SECONDARY_CACHE_VOLUME}:${SECONDARY_CACHE_LOCATION} \
-        -v ${GOPKG_VOLUME}:/user/go/pkg \
-        --privileged \
-        ${TAG} \
-        bazel "$@"
-}
-
-bazel run //:fietsje
-bazel run //:gazelle -- update
-
-if [[ ! -z "$(git status --porcelain)" ]]; then
-  echo "Unclean working directory after running gazelle and fietsje:"
-  git diff HEAD
-  cat <<EOF
-Please run:
-
-  $ bazel run //:fietsje
-  $ bazel run //:gazelle -- update
-
-in your local branch and add the resulting changes to this diff.
-EOF
-  exit 1
-fi
-
-bazel test //...
-bazel test //... -c dbg
-
-function conduit() {
-  # Get Phabricator host from Git origin
-  local pattern='ssh://(.+?):([0-9]+)'
-  [[ "$(git remote get-url origin)" =~ $pattern ]];
-  local host=${BASH_REMATCH[1]}
-  local port=${BASH_REMATCH[2]}
-
-  ssh "$host" -p "$port" conduit $@
-}
-
-# Report build results if we made it here successfully
-conduit harbormaster.sendmessage <<EOF
-{"params": "{\"buildTargetPHID\": \"${BUILD_PHID}\", \"type\": \"pass\"}"}
-EOF