Woodchipper has a four-stage processing pipeline for incoming messages:
- Reading: reads raw messages as strings from some source
- Parsing: converts raw messages into a standardized format
- Classification: converts standardized messages into human-readable chunks with rendering metadata
- Rendering: displays messages to the screen, possibly applying styles and providing interactive features
Each stage may have multiple implementations, selected either by the user (readers and renderers) or automatically (parsers and classifiers).
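The four stages can be pictured as a chain of small traits. The following is only a sketch: every type, trait, and function name in it (`Reader`, `Parser`, `Classifier`, `Renderer`, `run_pipeline`, and the simplified `Message` and `Chunk` structs) is hypothetical and chosen for illustration, not taken from Woodchipper's source.

```rust
// Hypothetical types and traits; Woodchipper's real definitions may differ.

/// Standardized message produced by the parsing stage.
#[derive(Debug, Clone, PartialEq)]
pub struct Message {
    pub text: String,
}

/// Human-readable piece of a message produced by the classification stage.
#[derive(Debug, Clone, PartialEq)]
pub struct Chunk {
    pub text: String,
}

pub trait Reader {
    /// Returns the next raw line, or `None` at end of input.
    fn next_line(&mut self) -> Option<String>;
}

pub trait Parser {
    /// Returns `None` if this parser does not support the line.
    fn parse(&self, raw: &str) -> Option<Message>;
}

pub trait Classifier {
    fn classify(&self, message: &Message) -> Vec<Chunk>;
}

pub trait Renderer {
    fn render(&mut self, chunks: &[Chunk]);
}

/// Drives every line through reading, parsing, classification, and rendering.
pub fn run_pipeline(
    reader: &mut dyn Reader,
    parsers: &[Box<dyn Parser>],
    classifiers: &[Box<dyn Classifier>],
    renderer: &mut dyn Renderer,
) {
    while let Some(raw) = reader.next_line() {
        // The first parser that accepts the line wins.
        let Some(message) = parsers.iter().find_map(|p| p.parse(&raw)) else {
            continue;
        };
        // Every classifier contributes chunks, in execution order.
        let chunks: Vec<Chunk> = classifiers
            .iter()
            .flat_map(|c| c.classify(&message))
            .collect();
        renderer.render(&chunks);
    }
}
```

In this sketch the reader and renderer are passed in explicitly (the user's choice), while parsers and classifiers are tried as lists, which mirrors the automatic selection described above.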
Readers fetch messages from some input source as text and pass them along for parsing. Input sources may be local (stdin, file, subprocess) or may fetch log messages via sockets or some API.
Existing implementations include:
- `stdin.rs`: reads lines from standard input / pipes
- `stdin_hack.rs`: reads lines from `/dev/stdin` to avoid conflicts with the interactive renderer on Unix
- `null.rs`: a dummy reader that prints an error and quits, used as a fallback if no other reader is available
- the Kubernetes reader fetches log messages from Kubernetes pods via the Kubernetes API
Readers run in a dedicated thread and send messages over a channel for further processing. If needed, they may accept arguments via the `Config` to, for example, set the Kubernetes namespace.
Rust's blocking IO means that reader threads cannot be reliably terminated at the user's request, so we can't necessarily expect readers to be capable of responding to an exit request. However, readers that require cleanup actions may use the optional exit request and response channels to listen for exit requests, perform cleanup, and notify the main thread that it's safe to terminate.
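The exit handshake could look roughly like the sketch below, built on `std::sync::mpsc`. The function name `spawn_reader`, its parameters, and the use of an in-memory `Vec<String>` as the input source are assumptions made for illustration; a real reader would block on IO instead of iterating over a vector.

```rust
use std::sync::mpsc::{Receiver, Sender, TryRecvError};
use std::thread;

/// Hypothetical reader: forwards lines until asked to exit, then cleans up
/// and acknowledges on the response channel.
pub fn spawn_reader(
    lines: Vec<String>,
    tx: Sender<String>,
    exit_req_rx: Receiver<()>,
    exit_resp_tx: Sender<()>,
) -> thread::JoinHandle<()> {
    thread::spawn(move || {
        for line in lines {
            // Poll for an exit request between messages. A reader stuck in a
            // blocking read could not do this, which is why exit support is
            // optional.
            match exit_req_rx.try_recv() {
                Ok(()) => break,
                // No request pending, or nobody will ever send one.
                Err(TryRecvError::Empty) | Err(TryRecvError::Disconnected) => {}
            }
            if tx.send(line).is_err() {
                break; // the receiving side has gone away
            }
        }
        // Cleanup would happen here (e.g. closing connections), after which
        // the main thread is told it is safe to terminate.
        let _ = exit_resp_tx.send(());
    })
}
```

The main thread keeps the sending half of the exit request channel and the receiving half of the response channel. Waiting on the response with a timeout avoids hanging on a reader that never notices the request.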
Rather than pushing just a raw message string over the channel, lines are instead wrapped in a `LogEntry`, allowing some additional metadata to be sent along the channel:
- `LogEntry::eof()` can be sent to notify renderers that the end of input has been reached
- `LogEntry::message()` is used to send normal messages. Optionally, a `ReaderMetadata` may be provided to pass along datatype hints if they're available at read-time, e.g. a source name if reading from multiple sources or a timestamp if tracked via the input API (e.g. Docker and Kubernetes).
- `LogEntry::internal()` is used to send internal messages to the user, as our own logging ability is restricted, particularly in the interactive renderer
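As a rough illustration of the wrapper, the sketch below models `LogEntry` as an enum with the three constructors named above. Only the constructor names come from the text; the enum layout, the constructor signatures, the `ReaderMetadata` fields, and the `drain` helper are assumptions and may not match the actual definitions.

```rust
use std::sync::mpsc::Receiver;

/// Hints captured at read time (fields are assumptions for this sketch).
#[derive(Debug, Clone, Default, PartialEq)]
pub struct ReaderMetadata {
    pub source: Option<String>,
    pub timestamp: Option<String>,
}

/// What travels over the reader channel instead of a bare `String`.
#[derive(Debug, Clone, PartialEq)]
pub enum LogEntry {
    Eof,
    Message { text: String, meta: Option<ReaderMetadata> },
    Internal(String),
}

impl LogEntry {
    pub fn eof() -> Self {
        LogEntry::Eof
    }

    pub fn message(text: &str, meta: Option<ReaderMetadata>) -> Self {
        LogEntry::Message { text: text.to_string(), meta }
    }

    pub fn internal(text: &str) -> Self {
        LogEntry::Internal(text.to_string())
    }
}

/// Hypothetical consumer: collects displayable lines until end of input.
pub fn drain(rx: &Receiver<LogEntry>) -> Vec<String> {
    let mut lines = Vec::new();
    for entry in rx.iter() {
        match entry {
            LogEntry::Eof => break,
            LogEntry::Message { text, meta } => {
                // Prefix the line with its source name when the reader knew it.
                match meta.and_then(|m| m.source) {
                    Some(name) => lines.push(format!("[{}] {}", name, text)),
                    None => lines.push(text),
                }
            }
            LogEntry::Internal(text) => lines.push(format!("(internal) {}", text)),
        }
    }
    lines
}
```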
Woodchipper parses lines independently to better support applications that output multiple formats (e.g. startup scripts, 3rd party libraries, or multiple separate Kubernetes containers). Parsers must quickly determine if messages are supported or hand them off to the next parser in the chain.
If the parser can parse the input message, it returns a normalized `Message` instance with as much metadata as it could extract.
Existing implementations include:
- `json.rs`: parses JSON log lines, i.e. lines like `{...}\n`. It specifically aims to support logrus-like JSON output formats, but various other field mappings are also supported. Prefers RFC 3339-style timestamps but falls back to `dtparse`. Unidentified fields are copied to the `metadata` field for use later in the pipeline.
- `plain.rs`: the fallback parser; renders the raw message, but opportunistically includes metadata if it can be identified. Where possible, timestamps are parsed out of messages using `dtparse`, with some simple checks to discard timestamps for common failure cases. Log levels are identified where possible.
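The hand-off between parsers can be sketched as a list of functions tried in order. Everything here is illustrative: `Message` is a simplified stand-in, the `parsed_by` field exists only to make the result visible, and `parse_json_like` performs just the cheap shape check rather than real JSON deserialization.

```rust
/// Simplified stand-in for the normalized message type.
#[derive(Debug, Clone, PartialEq)]
pub struct Message {
    pub text: String,
    /// Name of the parser that accepted the line (illustrative only).
    pub parsed_by: &'static str,
}

/// A parser either recognizes a line or declines with `None`.
pub type Parser = fn(&str) -> Option<Message>;

/// Stand-in for a JSON parser: only the quick shape check is shown. A real
/// implementation would deserialize the object and map its fields.
pub fn parse_json_like(raw: &str) -> Option<Message> {
    let trimmed = raw.trim();
    if trimmed.starts_with('{') && trimmed.ends_with('}') {
        Some(Message { text: trimmed.to_string(), parsed_by: "json" })
    } else {
        None // decline quickly so the next parser can try
    }
}

/// Fallback parser: accepts anything.
pub fn parse_plain(raw: &str) -> Option<Message> {
    Some(Message { text: raw.trim_end().to_string(), parsed_by: "plain" })
}

/// Tries each parser in order and returns the first successful result.
pub fn parse_line(chain: &[Parser], raw: &str) -> Option<Message> {
    chain.iter().find_map(|parse| parse(raw))
}
```

Because each line goes through the chain on its own, a stream that mixes JSON and plain-text lines is handled without any per-stream format detection.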
Parsers may refer to the reader's metadata to include or override their parsed
contextual info. For example, the plain parser prefers to use the reader's
timestamp rather than using the significantly slower and less accurate dtparse
free-form parser.
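One way to express this preference is to evaluate the slow parse lazily, so it runs only when the reader supplied nothing. The helper name `resolve_timestamp` and the use of `String` for timestamps are assumptions made to keep the sketch free of external crates.

```rust
/// Prefers the reader-supplied timestamp; the slow free-form parse is passed
/// as a closure so it runs only when no reader timestamp is available.
pub fn resolve_timestamp<F>(reader_timestamp: Option<String>, slow_parse: F) -> Option<String>
where
    F: FnOnce() -> Option<String>,
{
    reader_timestamp.or_else(slow_parse)
}
```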
Given a normalized `Message` instance, a classifier generates some number of `Chunks`. Classifiers are responsible for determining various rendering-specific attributes:
- the formatted text content
- the `kind`, used mainly for highlighting and aligning text segments
- the `slot`, used to place the segment within a screen region (left, center, right)
- the alignment of text within a chunk
- padding, wrapping, and line break hints
- the `weight`, used to hide less important chunks on smaller displays
At the moment, chunks are arranged based on the order in which classifiers are executed. Chunks may contain children to individually apply styles to different sub-sections of a text segment while avoiding improper line wrapping.
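A chunk with a subset of these attributes might be modeled as follows. The enum variants, field names, and the `prune` helper are assumptions for illustration; alignment, padding, and wrapping hints are omitted to keep the sketch short.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Kind {
    Date,
    Time,
    Level,
    Text,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Slot {
    Left,
    Center,
    Right,
}

/// Declared from least to most important so the derived `Ord` can compare them.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Weight {
    Low,
    Normal,
    High,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Chunk {
    pub text: String,
    pub kind: Kind,
    pub slot: Slot,
    pub weight: Weight,
    /// Sub-sections of the text segment that can be styled individually.
    pub children: Vec<Chunk>,
}

impl Chunk {
    pub fn new(text: &str, kind: Kind, slot: Slot, weight: Weight) -> Self {
        Chunk { text: text.to_string(), kind, slot, weight, children: Vec::new() }
    }
}

/// Drops chunks below `min_weight`, as a narrow display might, while
/// preserving the order in which the classifiers produced them.
pub fn prune(chunks: Vec<Chunk>, min_weight: Weight) -> Vec<Chunk> {
    chunks.into_iter().filter(|c| c.weight >= min_weight).collect()
}
```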
Classifiers may mark metadata fields as "consumed" by adding their keys to a shared `HashSet`, allowing later classifiers in the chain to skip them.
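The consumed-key mechanism can be sketched with two classifiers sharing one set. The function names, the use of plain strings in place of chunks, and the `BTreeMap` metadata container (chosen here only for deterministic ordering) are all assumptions.

```rust
use std::collections::{BTreeMap, HashSet};

/// Hypothetical level classifier: emits the level and marks its key consumed.
pub fn classify_level(
    meta: &BTreeMap<String, String>,
    consumed: &mut HashSet<String>,
) -> Vec<String> {
    match meta.get("level") {
        Some(level) => {
            consumed.insert("level".to_string());
            vec![level.to_uppercase()]
        }
        None => Vec::new(),
    }
}

/// Hypothetical catch-all classifier: emits `key=value` for every field that
/// no earlier classifier has consumed.
pub fn classify_metadata(
    meta: &BTreeMap<String, String>,
    consumed: &HashSet<String>,
) -> Vec<String> {
    meta.iter()
        .filter(|(key, _)| !consumed.contains(*key))
        .map(|(key, value)| format!("{}={}", key, value))
        .collect()
}
```

Running the catch-all classifier last means a field such as `level` is shown once, in its dedicated form, rather than again as a generic pair.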
Existing implementations include:
- `timestamp.rs`: formats timestamps into two chunks, allowing the lower-priority date chunk to be pruned while still displaying the time
- `level.rs`: adds the log level using its level-specific `kind`
- `text.rs`: adds force-wrapped chunks per line of input text, allowing strings with newlines to be displayed sensibly
- `logrus.rs`: extracts logrus's `file` field for display in the right column, trimming the path to the last few components
- `metadata.rs`: adds all unprocessed metadata fields to the message as `[key]=[value]` pairs
Existing renderer implementations include:
- `json.rs`: writes the normalized parsed messages back to standard output, discarding classifier results. Useful for normalizing log messages in scripting applications.
- `plain.rs`: writes classified messages to standard output with basic (whitespace-only) formatting, suitable for sharing. This renderer is automatically selected if output is piped. The interactive renderer will re-format messages using this renderer when copying to the clipboard.
- `styled.rs`: writes classified and styled output to standard output. If terminal width can be detected, lines will be wrapped and a right-side column may display contextual information. This output is less suitable for sharing as it contains ANSI escape characters and right-aligned text.
- the interactive renderer: a performant custom pager with interactive features, including text reflow, searching, filtering, and improved browsing.
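Renderer selection could be sketched as a small pure function. The text above states only that the plain renderer is chosen automatically when output is piped; the `RendererKind` enum, the rule that an explicit user choice always wins, and the interactive default for terminals are assumptions in this sketch.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum RendererKind {
    Json,
    Plain,
    Styled,
    Interactive,
}

/// Picks a renderer from an optional user request and whether standard
/// output is attached to a terminal.
pub fn select_renderer(requested: Option<RendererKind>, stdout_is_terminal: bool) -> RendererKind {
    match requested {
        // Assumption: an explicit user choice overrides automatic selection.
        Some(kind) => kind,
        // Assumption: the interactive renderer is the terminal default.
        None if stdout_is_terminal => RendererKind::Interactive,
        // Piped output falls back to the plain renderer.
        None => RendererKind::Plain,
    }
}
```

On Rust 1.70 or later, the second argument can be obtained with `std::io::IsTerminal`, for example `std::io::stdout().is_terminal()`.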