logtree: capture multiple lines in leveled log entries

This implements a solution to the following disputed question: "What happens when someone calls Infof("foo\nbar")?"

Multiple answers immediately present themselves:

 a) Don't do anything; whatever consumes logs needs to expect that they might contain newlines.
 b) Yell/error/panic so that the programmer doesn't do this.
 c) Split the one Info call into multiple Info calls, one per line, somewhere in the logging path.

Following the argumentation for these, we establish the following requirements for any solution:

 1) We want the programmer to be able to log multiple lines from a single Info call and have that not fail. This is especially important for reliability: we don't want a codepath that accidentally starts printing %s-formatted user-controlled messages to begin erroring out in production. This rules out b).
 2) We want to allow emitting multiple lines that will not be interleaved when viewing the log data. This rules out c).
 3) We want to prohibit log injection by malicious \n-containing payloads (in case of %s-formatted user-controlled content). This rules out a).
 4) If multiple lines are allowed in a leveled payload, the type system should support that, so that log consumers/tools will not forget to account for the possible newlines. This too rules out a).

With these in mind, we instead opt for a different solution: changing the API of logtree and the logging protos so that a single log entry can contain multiple lines. This is a breaking change, but since all access to logs is currently self-contained within the Monogon OS codebase, we can afford this.

To support this change, we change the access API (at the LogEntry and LeveledPayload level) to contain two different methods for retrieving the canonical representation of an entry:

 - func String() string, which returns a string with possible intra-string newlines (but no trailing newlines), in which each newline-delimited chunk carries the canonical text representation prefix for this message. This prevents newlines injected into logs from creating fake prefixes.
 - func Strings() (prefix string, lines []string), which returns a common prefix for this entry (in its text representation) and the set of lines that were contained in the original log entry. This allows slightly smarter consuming code to make more active decisions about how to render a multi-line entry, while still providing a canonical text-formatted representation of that log entry.

These permit simple log access code that prints log data into a terminal (or terminal-like view), like dbg, to continue using the String() call. In fact, no changes had to be made to dbg for it to continue working, even though the API underneath changed.

Naturally, raw logging entries continue to contain only a single line, so no change is implemented in the LineBuffer API. The containing LogEntry for raw log entries emits single-line Strings() results and no newline-containing strings in String() results.

Test Plan: Updated unit tests to cover this.

X-Origin-Diff: phab/D650
GitOrigin-RevId: 4e339a930c4cbefff91b289b07bc31f774745eca
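
The relationship between String() and Strings() described in the commit message can be illustrated with a minimal sketch. It is not the logtree implementation: the leveledEntry type, the newLeveledEntry helper and the prefix text are hypothetical stand-ins, and the real payload carries more metadata. The sketch only shows why repeating a logger-generated prefix on every line keeps an injected newline from forging a new entry.

```go
package main

import (
	"fmt"
	"strings"
)

// leveledEntry is a hypothetical stand-in for a leveled log payload.
// The prefix is built by the logger itself, never from user input.
type leveledEntry struct {
	prefix string
	lines  []string
}

// newLeveledEntry (hypothetical helper) drops trailing newlines and splits
// the message into one string per line, all kept inside a single entry.
func newLeveledEntry(prefix, msg string) leveledEntry {
	msg = strings.TrimRight(msg, "\n")
	return leveledEntry{prefix: prefix, lines: strings.Split(msg, "\n")}
}

// Strings returns the common prefix and the individual lines, leaving the
// rendering decision to the consumer.
func (e leveledEntry) Strings() (prefix string, lines []string) {
	return e.prefix, e.lines
}

// String renders every line with the same prefix, joined by newlines and
// without a trailing newline.
func (e leveledEntry) String() string {
	out := make([]string, len(e.lines))
	for i, l := range e.lines {
		out[i] = e.prefix + l
	}
	return strings.Join(out, "\n")
}

func main() {
	// A user-controlled value containing a newline and a forged prefix.
	user := "alice\nE1117 12:13:00 fake.go:1] forged entry"
	e := newLeveledEntry("I1117 12:12:58 auth.go:42] ", "login failed for "+user)
	fmt.Println(e.String())
	// Both output lines start with the genuine "I1117 12:12:58 auth.go:42] "
	// prefix, so the forged text is visibly part of the original entry.
}
```

A consumer that wants different rendering (for example, indenting continuation lines) would call Strings() and format the lines itself.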

commit: 12971d6c8031d06f497c81ae1ed2a5bee488e7d2 [log] [tgz]
author: Serge Bazanski <serge@nexantic.com> Tue Nov 17 12:12:58 2020 +0100
committer: Serge Bazanski <serge@nexantic.com> Tue Nov 17 12:12:58 2020 +0100
tree: 3332b72dda28e9c3d476aba0dd63d8465a3245f7
parent: b0272187ee577a94edb803b81413165b7c1a89ba [diff] [blame]
diff --git a/core/pkg/logtree/doc.go b/core/pkg/logtree/doc.go
index caef97a..ab3c537 100644
--- a/core/pkg/logtree/doc.go
+++ b/core/pkg/logtree/doc.go

@@ -16,10 +16,11 @@
 
 /*
 Package logtree implements a tree-shaped logger for debug events. It provides log publishers (ie. Go code) with a
-glog-like API, with loggers placed in a hierarchical structure defined by a dot-delimited path (called a DN, short for
-Distinguished Name).
+glog-like API and io.Writer API, with loggers placed in a hierarchical structure defined by a dot-delimited path
+(called a DN, short for Distinguished Name).
 
     tree.MustLeveledFor("foo.bar.baz").Warningf("Houston, we have a problem: %v", err)
+    fmt.Fprintf(tree.MustRawFor("foo.bar.baz"), "some\nunstructured\ndata\n")
 
 Logs in this context are unstructured, operational and developer-centric human readable text messages presented as lines
 of text to consumers, with some attached metadata. Logtree does not deal with 'structured' logs as some parts of the
@@ -69,7 +70,7 @@
 logs of the entire tree, just a single DN (like svc), or a subtree (like everything under listener, ie. messages emitted
 to listener.http and listener.grpc).
 
-Log Producer API
+Leveled Log Producer API
 
 As part of the glog-like logging API available to producers, the following metadata is attached to emitted logs in
 addition to the DN of the logger to which the log entry was emitted:
@@ -82,6 +83,18 @@
 node of the tree. For more information about the producer-facing logging API, see the documentation of the LeveledLogger
 interface, which is the main interface exposed to log producers.
 
+If the submitted message contains newlines, it will be split accordingly into a single log entry that contains multiple
+string lines. This allows for log producers to submit long, multi-line messages that are guaranteed to be non-interleaved
+with other entries, and allows for access API consumers to maintain semantic linking between multiple lines being emitted
+as a single atomic entry.
+
+Raw Log Producer API
+
+In addition to leveled, glog-like logging, LogTree supports 'raw logging'. This is implemented as an io.Writer that will
+split incoming bytes into newline-delimited lines, and log them into that logtree's DN. This mechanism is primarily
+intended to support storage of unstructured log data from external processes - for example binaries running with redirected
+stdout/stderr.
+
 Log Access API
 
 The Log Access API is mostly exposed via a single function on the LogTree struct: Read. It allows access to log entries
@@ -94,5 +107,10 @@
 Thus, log consumers should be aware that it is much better to stream and buffer logs specific to some long-standing
 logging request on their own, rather than repeatedly perform reads of a subtree backlog.
 
+The data returned from the log access API is a LogEntry, which itself can contain either a raw logging entry, or a leveled
+logging entry. Helper functions are available on LogEntry that allow canonical string representations to be returned, for
+easy use in consuming tools/interfaces. Alternatively, the consumer can itself access the internal raw/leveled entries and
+print them according to their own preferred format.
+
 */
 package logtree
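
The raw logging behaviour documented in the diff above (an io.Writer that splits incoming bytes into newline-delimited lines) can be sketched as follows. This is an illustration under stated assumptions, not the logtree LineBuffer: lineWriter and its emit callback are hypothetical names, and the sketch ignores any line-length limits or flushing of unterminated data that a real implementation may apply.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// lineWriter is a hypothetical io.Writer that buffers incoming bytes and
// calls emit once for every complete, newline-terminated line.
type lineWriter struct {
	buf  bytes.Buffer
	emit func(line string)
}

// Write appends p to the buffer and emits all complete lines. A partial
// line at the end stays buffered until a later Write completes it.
func (w *lineWriter) Write(p []byte) (int, error) {
	w.buf.Write(p)
	for {
		i := bytes.IndexByte(w.buf.Bytes(), '\n')
		if i < 0 {
			return len(p), nil
		}
		// Consume the line including its newline, then strip the newline.
		line := string(w.buf.Next(i + 1)[:i])
		w.emit(line)
	}
}

func main() {
	var got []string
	var w io.Writer = &lineWriter{emit: func(l string) { got = append(got, l) }}

	// Writes need not align with line boundaries.
	fmt.Fprint(w, "some\nunstruc")
	fmt.Fprint(w, "tured\ndata\n")

	fmt.Printf("%q\n", got) // prints ["some" "unstructured" "data"]
}
```

Because each emitted line is a separate raw entry, lines from different writers may interleave, which is the property the leveled multi-line entries are designed to avoid.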