# Monogon Operating System

A cluster operating system. Codename: metropolis. Linux kernel, stateless userland, API-driven management, high
integrity. Designed to run Kubernetes and other workload scheduling systems.

## Documentation

The canonical documentation for Monogon OS is the [Monogon OS Handbook](https://docs.monogon.dev/metropolis-v0.1/handbook/index.html).

## Developer Quick Start

Follow the setup instructions in the top-level README.md. We recommend using `nix-shell`, either via Nix installed on an existing distribution or on NixOS.

Start a test cluster by running:

```
$ bazel run //metropolis:launch-cluster
```

This builds all the required code and runs a fully userspace test cluster consisting of four QEMU VMs (three
Monogon OS nodes and one switch/router emulator). No root access on the host is required.

```
.--------. .--------. .--------.
| node 0 | | node 1 | | node 2 |
'--------' '--------' '--------'
    ^          ^          ^
    |          |          |     Virtual Ethernet (10.1.0.0/24)
    V          V          V
.-------------------------.
|       nanoswitch        |
|-------------------------|    .-------------------.
| Router, switch,         |--->| Internet via host |
| SOCKS proxy             |    '-------------------'
'-------------------------'
             ^
             | gRPC over SOCKS to nodes
        .----------.
        | metroctl |
        '----------'
```

The launch tool will output information on how to connect to the cluster:

```
Launch: Cluster running!
To access cluster use: metroctl --config /tmp/metroctl-3733429479 --proxy 127.0.0.1:42385 --endpoints 10.1.0.2 --endpoints 10.1.0.3 --endpoints 10.1.0.4
Or use this handy wrapper: /tmp/metroctl-3733429479/metroctl.sh
To access Kubernetes, use kubectl --context=launch-cluster
```
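When scripting against the launcher, the wrapper path can be extracted from that output instead of being copied by hand. The following is a minimal sketch and not part of the Monogon tooling: `wrapper_path` is a hypothetical helper, `launch.log` is a made-up file name, and the sketch assumes the launcher keeps printing the `Or use this handy wrapper:` line in the format shown above.

```shell
# Hypothetical helper: read launcher output on stdin and print the path of
# the metroctl wrapper script.
# Assumption: the output contains a line "Or use this handy wrapper: <path>".
wrapper_path() {
    # Drop everything up to and including the marker text and print the rest;
    # keep only the first match in case the line is repeated.
    sed -n 's/^.*Or use this handy wrapper: *//p' | head -n 1
}

# Usage sketch, with launcher output previously saved to launch.log:
#   wrapper="$(wrapper_path < launch.log)"
#   "$wrapper" node describe
```

If the launcher output format changes, the helper prints nothing, so callers should check for an empty result before using it.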

You can then use the metroctl wrapper to list the nodes as seen by the Monogon OS cluster control plane:

```
$ alias metroctl=/tmp/metroctl-3733429479/metroctl.sh
$ metroctl node describe
NODE ID                                      STATE  ADDRESS   HEALTH   ROLES                                 TPM  VERSION                        HEARTBEAT
metropolis-067651202d00b79fffe92df0001aabff  UP     10.1.0.4  HEALTHY                                        yes  v0.1.0-dev494.g0d8a8a4f.dirty  1s
metropolis-7ccd2437c50696ea9a9b6543dc163f84  UP     10.1.0.3  HEALTHY                                        yes  v0.1.0-dev494.g0d8a8a4f.dirty  3s
metropolis-ec101152c48c5f761534c1910cf66200  UP     10.1.0.2  HEALTHY  ConsensusMember,KubernetesController  yes  v0.1.0-dev494.g0d8a8a4f.dirty  3s
```

One node runs the Monogon OS control plane (ConsensusMember role) and the Kubernetes control plane
(KubernetesController role), but there are no Kubernetes worker nodes yet. Changing that is a single API call (or
metroctl invocation) away:

```
$ metroctl node add role KubernetesWorker metropolis-067651202d00b79fffe92df0001aabff
2024/02/12 17:42:33 Updated node metropolis-067651202d00b79fffe92df0001aabff.
$ metroctl node add role KubernetesWorker metropolis-7ccd2437c50696ea9a9b6543dc163f84
2024/02/12 17:42:36 Updated node metropolis-7ccd2437c50696ea9a9b6543dc163f84.
$ metroctl node describe
NODE ID                                      STATE  ADDRESS   HEALTH   ROLES                                 TPM  VERSION                        HEARTBEAT
metropolis-067651202d00b79fffe92df0001aabff  UP     10.1.0.4  HEALTHY  KubernetesWorker                      yes  v0.1.0-dev494.g0d8a8a4f.dirty  0s
metropolis-7ccd2437c50696ea9a9b6543dc163f84  UP     10.1.0.3  HEALTHY  KubernetesWorker                      yes  v0.1.0-dev494.g0d8a8a4f.dirty  3s
metropolis-ec101152c48c5f761534c1910cf66200  UP     10.1.0.2  HEALTHY  ConsensusMember,KubernetesController  yes  v0.1.0-dev494.g0d8a8a4f.dirty  3s
```
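On a larger test cluster, the per-node invocations can be driven from the node list rather than typed out one by one. The following is a minimal sketch under stated assumptions: `roleless_nodes` is a hypothetical helper, not a metroctl feature, and it assumes the `metroctl node describe` layout shown above, where roles are comma-separated without spaces, so a row without any role has exactly seven whitespace-separated fields.

```shell
# Hypothetical helper: read `metroctl node describe` output on stdin and
# print the IDs of nodes that have no role assigned.
# Assumption: a row with an empty ROLES column has exactly 7 fields
# (ID, STATE, ADDRESS, HEALTH, TPM, VERSION, HEARTBEAT); a row with roles has 8.
roleless_nodes() {
    # NR > 1 skips the header row.
    awk 'NR > 1 && NF == 7 { print $1 }'
}

# Usage sketch: give every role-less node the KubernetesWorker role.
#   metroctl node describe | roleless_nodes | while read -r id; do
#       metroctl node add role KubernetesWorker "$id"
#   done
```

Parsing human-readable tables is fragile; if any column value ever contains whitespace, the field count no longer identifies role-less rows.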

With the worker role assigned, both nodes now show up in Kubernetes as well:

```
$ kubectl --context=launch-cluster get nodes
NAME                                          STATUS   ROLES    AGE   VERSION
metropolis-067651202d00b79fffe92df0001aabff   Ready    <none>   15s   v1.24.2+mngn
metropolis-7ccd2437c50696ea9a9b6543dc163f84   Ready    <none>   13s   v1.24.2+mngn
$ kubectl --context=launch-cluster run -it --image=ubuntu:22.04 test -- bash
If you don't see a command prompt, try pressing enter.
root@test:/# uname -a
Linux test 6.1.69-metropolis #1 SMP PREEMPT_DYNAMIC Tue Jan 30 14:43:23 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
root@test:/#
```
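Scripts that depend on the workers may want to check how many nodes report `Ready` before scheduling anything. The following is a minimal sketch: `ready_count` is a hypothetical helper, and it assumes the default `kubectl get nodes` table layout shown above, with a header row and the status in the second column.

```shell
# Hypothetical helper: read `kubectl get nodes` output on stdin and print
# the number of nodes whose STATUS column is exactly "Ready".
ready_count() {
    # NR > 1 skips the header row; "n + 0" prints 0 when nothing matched.
    awk 'NR > 1 && $2 == "Ready" { n++ } END { print n + 0 }'
}

# Usage sketch:
#   kubectl --context=launch-cluster get nodes | ready_count
```

A node reported as `NotReady` or `Ready,SchedulingDisabled` is deliberately not counted, because the comparison is an exact match on the status field.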

With the test launch tooling, you can now start iterating on the codebase. Whether you're changing the
Linux kernel config or implementing a new RPC, testing your changes interactively is a single `bazel` command away.

## End-to-end tests

We have an end-to-end test suite that runs automatically on CI. Any new logic should be exercised there.

```
$ bazel run //metropolis/test/e2e:e2e_test
```

These tests operate on a fully virtualized cluster, just like the launch tooling, so expect them to take a while.