Diff - 12971d6c8031d06f497c81ae1ed2a5bee488e7d2^! - monogon

commit	12971d6c8031d06f497c81ae1ed2a5bee488e7d2
author	Serge Bazanski <serge@nexantic.com>	Tue Nov 17 12:12:58 2020 +0100
committer	Serge Bazanski <serge@nexantic.com>	Tue Nov 17 12:12:58 2020 +0100
tree	3332b72dda28e9c3d476aba0dd63d8465a3245f7
parent	b0272187ee577a94edb803b81413165b7c1a89ba

logtree: capture multiple lines in leveled log entries

This implements a solution to a disputed answer to the following
question:

    “What happens when someone calls Infof("foo\nbar")?”

Multiple answers immediately present themselves:

    a) Don't do anything; whatever consumes logs needs to expect that
       they might contain newlines.
    b) Yell/error/panic so that the programmer doesn't do this.
    c) Split the one Info call into multiple Info calls, one per line,
       somewhere in the logging path.

Following the argumentation for these, we establish the following
requirements for any solution:

    1) We want the programmer to be able to log multiple lines from a
       single Info call and have that not fail. This is especially
       important for reliability - we don't want an accidental codepath
       that suddenly starts printing %s-formatted user-controlled
       messages to start erroring out in production. This rules out b).
    2) We want to allow emitting multiple lines that will not be
       interleaved when viewing the log data. This rules out c).
    3) We want to prohibit log injection by malicious \n-containing
       payloads (in case of %s-formatted user-controlled content). This
       rules out a).
    4) If multiple lines are allowed in a leveled payload, the type
       system should support that, so that log consumers/tools will not
       forget to account for the possible newlines. This too rules out
       a).
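
To make requirement 3 concrete, here is a minimal sketch of how a formatter
that puts one prefix in front of the whole message (as option a would) lets
user-controlled content forge what looks like a second log entry. The
naiveFormat helper and the prefix layout are hypothetical, chosen only for
illustration; they are not logtree code.

```go
package main

import (
	"fmt"
	"strings"
)

// naiveFormat is a hypothetical formatter that prepends a single prefix to
// the whole message, regardless of embedded newlines.
func naiveFormat(prefix, msg string) string {
	return prefix + msg
}

func main() {
	// User-controlled content that embeds a newline and a fake prefix.
	user := "alice\nI0101 00:00:00 auth.go:1] login ok for root"
	out := naiveFormat("I0101 00:00:00 auth.go:42] ", fmt.Sprintf("login failed for %s", user))

	// A line-oriented consumer cannot tell the forged second line from a
	// genuine entry, since both start with a well-formed prefix.
	for _, line := range strings.Split(out, "\n") {
		fmt.Println(line)
	}
}
```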

With these in mind, we instead opt for a different solution: changing
the API of logtree and logging protos to contain multiple possible lines
per single log entry. This is a breaking change, but since all access to
logs is currently self-contained within the Monogon OS codebase, we can
afford this.

To support this change, we change the access API (at LogEntry and
LeveledPayload level) to contain two different methods for retrieving
the canonical representation of an entry:

    fn String() string

which returns a string with possible intra-string newlines (but no
trailing newlines), but with each newline-delimited chunk having the
canonical text representation prefix for this message. This prevents
newline injection into logs creating fake prefixes.
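
As a rough illustration of the String() behaviour described above, the
following sketch repeats the prefix on every line of a multi-line entry. The
entry type, newEntry helper, and prefix layout are assumptions made for this
example; the real logtree types carry more fields and may differ in detail.

```go
package main

import (
	"fmt"
	"strings"
)

// entry is a hypothetical stand-in for a leveled log entry.
type entry struct {
	prefix string   // text prefix; the layout here is assumed, not logtree's
	lines  []string // message split on newlines
}

// newEntry splits a formatted message into its constituent lines.
func newEntry(prefix, msg string) entry {
	return entry{prefix: prefix, lines: strings.Split(msg, "\n")}
}

// String joins the lines with newlines, repeating the prefix on each one,
// and emits no trailing newline.
func (e entry) String() string {
	out := make([]string, len(e.lines))
	for i, l := range e.lines {
		out[i] = e.prefix + l
	}
	return strings.Join(out, "\n")
}

func main() {
	e := newEntry("I1117 12:12:58 main.go:10] ", "foo\nbar")
	fmt.Println(e) // both "foo" and "bar" are printed with the prefix
}
```

Because every emitted line starts with the real prefix, a forged prefix
inside the payload always appears after a genuine one and cannot pass as a
separate entry.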

    fn Strings() (prefix string, lines []string)

which returns a common prefix for this entry (in its text
representation) and a set of lines that were contained in the original
log entry. This allows slightly smarter consuming code to make more
active decisions regarding the rendering of a multi-line entry, while
still providing a canonical text formatted representation of that log
entry.
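
A consumer that wants its own rendering could use Strings() along these
lines. This is only a sketch: the entry type and the renderIndented helper
are hypothetical, and the choice to indent continuation lines under the
first one is just one possible rendering decision.

```go
package main

import (
	"fmt"
	"strings"
)

// entry is a hypothetical stand-in for a leveled log entry.
type entry struct {
	prefix string
	lines  []string
}

// Strings returns the common prefix and the lines of the entry.
func (e entry) Strings() (prefix string, lines []string) {
	return e.prefix, e.lines
}

// renderIndented prints the prefix once and aligns continuation lines under
// the first line using spaces. It assumes an ASCII prefix, since the padding
// width is computed from the byte length.
func renderIndented(e entry) string {
	prefix, lines := e.Strings()
	pad := strings.Repeat(" ", len(prefix))
	var b strings.Builder
	for i, l := range lines {
		if i == 0 {
			b.WriteString(prefix)
		} else {
			b.WriteString("\n" + pad)
		}
		b.WriteString(l)
	}
	return b.String()
}

func main() {
	e := entry{prefix: "I1117 12:12:58 main.go:10] ", lines: []string{"foo", "bar"}}
	fmt.Println(renderIndented(e))
}
```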

These permit simple log access code that prints log data into a terminal
(or terminal-like view), like dbg, to continue using the String() call.
In fact, no changes had to be made to dbg for it to continue working,
even though the API underneath changed.

Naturally, raw logging entries continue to contain only a single line,
so no change is implemented in the LineBuffer API. The containing
LogEntry for raw log entries emits single-lined Strings() results and no
newline-containing strings in String() results.

Test Plan: Updated unit tests to cover this.

X-Origin-Diff: phab/D650
GitOrigin-RevId: 4e339a930c4cbefff91b289b07bc31f774745eca

diff --git a/core/proto/api/debug.proto b/core/proto/api/debug.proto
index b0bbb57..7a046ec 100644
--- a/core/proto/api/debug.proto
+++ b/core/proto/api/debug.proto

@@ -136,7 +136,7 @@
 
 message LogEntry {
     message Leveled {
-        string message = 1;
+        repeated string lines = 1;
         int64 timestamp = 2;
         LeveledLogSeverity severity = 3;
         string location = 4;